International Encyclopedia of the
SOCIAL SCIENCES
Associate Editors Heinz Eulau, Political Science Lloyd A. Fallers, Anthropology William H. Kruskal, Statistics Gardner Lindzey, Psychology Albert Rees, Economics Albert J. Reiss, Jr., Sociology Edward Shils, Social Thought
Special Editors Elinor G. Barber, Biographies John G. Darley, Applied Psychology Bert F. Hoselitz, Economic Development Clifford T. Morgan, Experimental Psychology Robert H. Strotz, Econometrics
Editorial Staff Marjorie A. Bassett, Economics P. G. Bock, Political Science Robert M. Coen, Econometrics J. M. B. Edwards, Sociology David S. Gochman, Psychology George Lowy, Bibliographies Judith M. Tanur, Statistics Judith M. Treistman, Anthropology
Alvin Johnson HONORARY EDITOR
W. Allen Wallis CHAIRMAN, EDITORIAL ADVISORY BOARD
DAVID L. SILLS EDITOR
VOLUME 10
The Macmillan Company & The Free Press
COPYRIGHT © 1968 BY CROWELL COLLIER AND MACMILLAN, INC.
ALL RIGHTS RESERVED. NO PART OF THIS BOOK MAY BE REPRODUCED OR TRANSMITTED IN ANY FORM OR BY ANY MEANS, ELECTRONIC OR MECHANICAL, INCLUDING PHOTOCOPYING, RECORDING, OR BY ANY INFORMATION STORAGE AND RETRIEVAL SYSTEM, WITHOUT PERMISSION IN WRITING FROM CROWELL COLLIER AND MACMILLAN, INC.
LIBRARY OF CONGRESS CATALOG NUMBER 68-10023
MANUFACTURED IN THE UNITED STATES OF AMERICA
MARRIAGE
i. FAMILY FORMATION (Robert F. Winch)
ii. COMPARATIVE ANALYSIS (Gloria A. Marshall)
iii. MARRIAGE ALLIANCE (Louis Dumont)
FAMILY FORMATION
The beginning of family formation may be either marriage or parenthood. It should not be concluded from the fact that sexual intercourse is a prerequisite for pregnancy that all peoples regard marriage or the establishing of a man-woman relationship as the first step in family formation. Indeed, according to Bohannan (1963, p. 73) the matricentric family, consisting of a woman and her children, is "both more nearly universal and more elementary than is the nuclear family," consisting of a marital couple plus any children they may have. In some societies it is thought proper that marriage should precede pregnancy, while in others the reverse sequence is regarded with favor; in the extreme case marriage is viewed as irrelevant to family formation. However, it seems safe to assert that in most societies the nuclear family is thought to be well launched only when both conditions are met. Cultures also vary according to whether they emphasize marital solidarity over lineal solidarity or vice versa. Societies with strongly developed extended family systems emphasize lineal solidarity over marital solidarity. In such societies family formation is scarcely a meaningful concept: since the marriage of a man and woman and the coming of their progeny represent the carrying on of a continuous line, these events may signal the establishing of a new household but not the formation of a new family.
In this article the topic of family formation will be treated with reference to the nuclear family. Marriage will therefore be considered as the focus of the process of family formation, and mate selection as one of its most problematic features.
Definitions. From the functional point of view, the family is the one social system that all societies look to for the replacement of their members. However, from the structural point of view, the word "family" is used to refer not only to the marital couple and their children but also to the larger kin group; accordingly, it will be necessary to draw some structural distinctions. The "extended family" includes a nuclear family plus lineal and collateral kinsmen; to the extent that a society emphasizes rights and obligations among kinsmen who are not in the same nuclear family, it is spoken of as having an "extended family system." On the other hand, a "nuclear family system" is said to exist in a society in which the rights and obligations among those in the larger kin group are given little emphasis relative to the claims among members of the same nuclear family. It should be emphasized that the term "family," whether it applies to a nuclear or an extended family, is not equivalent to the term "household," which denotes the aggregate of persons occupying a common dwelling unit, whether or not those persons are kinsmen. In Western societies the nuclear family is frequently also a household while the parental couples are in their younger and middle years and before their children attain adulthood. However, many other arrangements are possible, and some are institutionalized. For example, in South Africa it has been the practice for decades for the husband-father to be away from his nuclear family for years at a time. A common type of household among
Negroes in the Caribbean and in the United States consists of a working woman, her children, and her mother. In traditional China the ideal household included the nuclear family of the head of the household plus his unmarried daughters, his sons with their nuclear families, his sons' unmarried daughters, his sons' sons with their nuclear families, and so on through all living generations; in practice, however, not many Chinese families could afford households of such size.
Marriage may be defined as a culturally approved relationship of one man and one woman (monogamy), of one man and two or more women (polygyny), or of one woman and two or more men (polyandry), in which there is cultural endorsement of sexual intercourse between the marital partners of opposite sex and, generally, the expectation that children will be born of the relationship ("polygamy" is the term that subsumes both polygyny and polyandry). "Homogamy" refers to the marriage of persons of similar characteristics, which is also known as "assortative" or "assortive" mating; "heterogamy" is the marriage of persons of different characteristics; and "hypergamy" is a marriage in which the husband is of higher social status than the wife. The term "endogamy" refers to marriage between persons belonging to the same social group, whereas in "exogamy" the partners come from different groups.
Marriage and legitimacy. By definition marriage is a relationship within which sexual intercourse is legitimate. In general, a woman who cohabits with a man has a legitimate status in relation to that man only if she is known to be married to him. Common-law marriage (recognized in the United Kingdom and in the United States) and the consensual union (recognized in the Caribbean) are forms of man-woman relationship that carry less than full cultural approval and legitimacy.
Points of interest to American, as well as to English, courts in establishing whether or not a common-law marriage exists include: mutual agreement of the man and woman to take each other as husband and wife; cohabitation and presentation of themselves as a married couple to friends, neighbors, and the general public; and reputation, that is, the recognition by the community that the two are husband and wife. The Caribbean pattern of the consensual union differs from the common-law marriage of Anglo-Saxon countries in that the former is not a legally recognized marriage. Various writers have held that except for this lack of legal sanction the consensual union carries no social stigma and therefore is quite as acceptable among the people practicing it as is legal marriage. More recent analyses by Blake (1961) and by Goode (1960), however, have concluded that there is general recognition among Caribbean societies that consensual unions are less legitimate and hence less desirable than legal marriages. Goode argues that whereas legal marriage is recognized throughout Caribbean societies as the legitimate form, there is variation among the social strata of these societies in the degree of norm commitment, with the consequence that persons in the lower strata tend to be generally less committed to familial norms than persons in the upper strata. The more frequent occurrence of consensual unions among the lower social strata than among the upper is seen as a reflection of the class-linked variation in the degree of commitment to familial norms. [See CARIBBEAN SOCIETY.]
Legitimacy affects the offspring of the marriage as well as the spouses themselves. In asserting what he called the "principle of legitimacy," Malinowski (1929) stated that in all societies a socially recognized father has been regarded as indispensable to the child. A legal marriage, then, gives a woman a socially recognized husband and her children a socially recognized father. According to Zimmerman (1947), the penalties attached to illegitimacy vary directly with the power of the extended family; thus, the penalties are heavy in societies characterized by the extended-family system and light where the nuclear family prevails. From a sociological point of view, the significance of legitimacy is that it is a necessary condition for the family to carry out its function of position-conferring. In this sense, the critical meaning of bastardy is not that the child has low status but rather that he lacks any position and status in his society. [See ILLEGITIMACY.]
Variations in familial organization. Cultural expectations pertaining to marriage are affected by variations in familial organization.
In Western civilization it appears that the power of the family and the size of the effective kin group (i.e., of the familial structure) have varied inversely with the complexity of the society of which the effective kin group is a part. Zimmerman (1947), who extensively analyzed the civilizations of ancient Athens and Rome, reports that in the early stages of both of these civilizations (i.e., when both societies were relatively simple) there existed what he calls the "trustee" type of familial organization; whereas in their late (and, to Zimmerman, decadent) stages, Athens and Rome developed much more complex societies and simpler familial structures, which he describes as "atomistic." The kernel of Zimmerman's distinction lies in the locus of power. Where the trustee type of family exists, much power is located in the extended family. The head of the family, as the responsible center of familial authority, influences the behavior of the family members, and the extended family feels responsible for the behavior of its members. Where the atomistic type of family prevails, much power is located outside the kin group in specialized institutions. As the family loses power, its structure shifts from the extended family system to the nuclear family system. In the process of making this shift, according to Zimmerman, the divorce rate goes up and the birth rate goes down. Arguing that there are other lines of development than those of the West noted by Zimmerman, Goode (1963) holds, as we shall see below, that whether the divorce rate goes up as a society becomes more complex depends on the nature of the familial structure at the start of the process.
One way of formulating variation in the family's power and size is to speak of its functioning as a political unit. Moreover, the family may show variation in other kinds of functioning. In some settings the family is the basic economic unit that creates and distributes goods and services. In many settings it is the principal social unit responsible for socializing and educating the young. And in some settings, especially where ancestor worship is practiced, the family carries out the religious function. In general, as societies become more complex, specialized societal structures develop for the carrying out of these functions, with the result that the family loses some of its functions; indeed such a state of affairs is the meaning of societal complexity. Taking account of Asian and African as well as Western societies, Goode (1963) agrees that most family systems of the world are moving toward a small-family system based on the nuclear family.
Because the traits of non-Western family systems are so varied, however, he believes there will be marked differences in the direction of this change as the predicted convergence takes place. Thus, in African tribal societies where matrilineal systems are strong and divorce is common, Goode reasons that urbanization will be accompanied by a reduction in the conditions that have made divorce easy.
Mate selection
The functional emphasis in modern sociology leads the observer to anticipate that criteria for the choice of a mate will be related to the roles the mate is expected to enact and, perhaps, that the mate will be chosen by the incumbent of that social position most influenced by the quality of the mate's
performance. There is some evidence to support such a set of functional expectations, but of course the empirical world is always less tidy than the social scientist's model. The extended family system. In the extended family system it is common for members of the nuclear family to work in teams of kinsmen. Under this condition the mate-selective process is frequently a means of recruiting workers, and hence the members of the extended family have a lively interest in the work-related qualifications of a kinsman's prospective mate. Thus it is not unusual for responsible senior members of the extended family to select a son's spouse and to employ such familially relevant criteria as the industry and prospective fecundity of a potential daughter-in-law. For families of higher status, the standing of a girl's family becomes more important than her manual skills. Irrespective of status, however, the extended family system makes the procuring of a mate a matter of moment to a wide circle of kinsmen. It is consistent with this kind of family organization that mate selection should be a task calling for experienced perception and shrewd bargaining. Moreover, in order that their plans should not be thwarted by the passions of the young, the older people institute devices such as early marriage and efficient chaperonage (Goode 1959). On the other hand, where the extended family is not highly functional and where the nuclear family system prevails, it is frequently thought to be inappropriate for members of the extended kin group to exhibit lively interest in the marital choices of family members, and even the influence of parents is reduced. Under these conditions the criteria for mate selection are more likely to include attributes having primary appeal to the nubile pair—physical beauty, sexual attractiveness, and congeniality. The response to one or more of these attributes comes to be subsumed under the rubric of love. 
The diminution of relatives' influence in mate selection is not, of course, a categorical matter but rather one of degree. By their own religion, ethnicity, and social status, as well as by their own choice of location of residence and of schools, parents continue to influence their youngsters' choice of spouses. Traditional China provides an example of mate selection carried on by the family for familial purposes. When a son married, the preferred arrangement was for him to bring his bride into his parental home. The parents expected the bride to perform two important functions: to bear children, preferably sons, and to assist her mother-in-law in the performance of domestic chores. As the boy
was growing up, he looked to his parents to provide him with a wife. The parents expected the son to accept whatever bride they chose, and they condemned vigorously any disposition on the son's part to make his own marital selection, especially if the son tried to do so on the basis of love. It was generally agreed that young people of marriageable age were too inexperienced to have sound judgment in such an important undertaking. Since most of the bride's time was to be spent assisting her husband's mother, functional considerations dictated that the latter was the most interested party in the marriage; appropriately, therefore, she was usually the most active person in selecting her son's wife. Thus, arranged marriages were customary, and it was not unusual for a young man to meet his bride for the first time at the wedding ceremony. Traditional China made extensive use of the "go-between," or marriage broker. This occupation served two useful functions: marriage brokers made it their business to have extensive and detailed information about marriageable young people; and they made it possible for families to enter into and break off negotiations without loss of face (Hsu 1948; Lang 1946; Levy 1949). With industrialization came pressure for changes in Chinese family law. This was evident as early as the Boxer Rebellion at the beginning of the twentieth century, and new codes were promulgated in 1930 and 1931 (well before the communist revolution in China) that reflected Western standards —more emphasis on the nuclear family and less on the extended family, a reduction in male authority, and a closer approximation to legal equality of the sexes. However, the law retained a feature of Chinese filial piety: the obligations to one's parents superseded the obligations to one's children. In these matters the communist revolution has represented not so much a break with the past as a continuation of trends already under way (Yang 1959). 
Although reliable information on postrevolutionary China is still scanty, it appears that whereas the communist regime officially deplores both Western and traditional Chinese ways, love marriages are common, and the influence of the extended family is continuing to wane. The nuclear family system. As specialized social structures spring up, take over functions from the family, and become societally important and individually rewarding, the resulting reduction in the functional importance of the extended family removes incentives for maintaining an extended family system. At the same time there are four functions inherent in the nuclear family that come to the fore as being relevant in mate selection.
These functions are: providing emotional gratification in the marital and parental relationships; providing identity and a social status in the societal system to individuals who enter the family by birth, adoption, or marriage—a function to be known here as position-conferring; performing such tasks as cleaning, bringing in supplies, and disposing of waste products, which may be subsumed under maintenance of the household; and child rearing, especially with respect to the parental functions of nurturance and control. Of these four functions emotional gratification is most explicitly recognized in American culture as relevant to mate selection, and apparently this is so, to an increasing degree, in the middle-class subcultures of western Europe. There can be little doubt that convictions are widespread in the United States and western Europe that a couple should be "in love" before considering marriage and that legal codes are obsolete if they fail to provide for divorce on the ground of chronic marital conflict. Love as a mate-selective criterion invites idiosyncratic interpretation in the sense that, for instance, one man may be attracted to a demurely diffident girl whereas another finds the vivaciously extroverted girl irresistible. As a mate-selective criterion, position-conferring (especially when phrased as status-conferring) evokes ambivalent responses. In many middle-class settings a girl who is thought to have married for money rather than for love risks social condemnation (Indian culture, by contrast, has had the tradition that it is good for a girl to marry into a subcaste of higher standing than her own). If a girl marries for love plus status improvement, however, she is said to have married "well," and the durability of the Cinderella legend suggests that there is little novelty in this theme. 
The woman's social status depends so largely on her husband's occupational performance that, for her, mate selection is sometimes spoken of as a "mobility bet." Such evidence as exists on this matter for the United States indicates that most marriages are between persons of roughly equal social status. Although all four of the functions mentioned above are relevant to mate selection, a young couple considering marriage can usually check the suitability of each other only with respect to emotional gratification. This may have something to do with the emphasis given love as a criterion. In the premarital setting of early adulthood the other three functions can usually be no more than the focuses of guesswork. It is difficult for a young woman to foresee how a particular man will fare in the occupational sweepstakes and in being a
model for their sons. Predictions are similarly difficult for the young man with respect to how a woman will manage their house and mother their children. Where marriages are voluntary rather than arranged, there is need of some means for marriageable young men and women to meet and to select each other. The practice of dating is societally rational in the sense that it affords this opportunity. On the other hand, dating as a prelude to mate selection has been criticized on the ground that the leisure-time activities of dating fail to provide an adequate setting in which to test prospective spouses with respect to maritally relevant criteria, especially with respect to the functions of household maintenance and child rearing.
In sum, a reduction of functions in the extended family is accompanied by a reduction in the rights and obligations among extended kin that constitute the extended family system. This reduction in the significance of blood relationships shifts the emphasis from the extended family to the nuclear family. Marital solidarity replaces cognatic (both lineal and collateral) solidarity, and love becomes a criterion of mate selection.
Principles of preferential mating
Let us designate as "ego" a person of reference, that is, a person from whose point of view we shall consider certain relationships. All societies designate categories of persons whom ego may not marry, and frequently there are additional categories of persons whom it would be regrettable, but not totally forbidden, for ego to marry. Usually there are implicit, if not explicit, categories of persons whom it would be desirable for ego to marry. These negative and positive expectations can be subsumed under the "principle of incest avoidance" and the "principle of ethnocentrism."
We shall speak of the set of persons whom ego is permitted to marry in any given sociocultural setting as ego's field of eligible spouse candidates or, in shorter form, as ego's "field of eligibles." European social scientists use the term "isolate" to refer to the field of eligibles.
The principle of incest avoidance. Every society has a prohibition against incest, that is, against sexual relations between persons who are closely related. Although the precise relationships that are viewed as incestuous vary from one society to another, they regularly include the mother-son, the father-daughter, and the brother-sister relationships, that is, all heterosexual relationships within the nuclear family except, of course, the marital relationship. The principle of incest avoidance refers to the set of prohibitions existing in every culture to prevent ego from marrying someone too close to him in the kinship system. Just how the principle of incest avoidance works out varies from one setting to another. In traditional China it was prohibited for ego to marry anyone with the same surname, and in that populous land with few surnames this rule proscribed hundreds of thousands of otherwise eligible spouse candidates. In northern India there was a tradition that marriage was not possible with someone removed from ego by less than seven degrees on the father's side or less than five degrees on the mother's; a more common rule in India prohibits marriage between relatives linked to a common ancestor within five degrees on the father's side and three on the mother's (Goode 1963, p. 210). In some societies ego is encouraged to marry a cross-cousin (e.g., mother's brother's daughter) but prohibited from marrying a parallel cousin (e.g., mother's sister's daughter). Prior to 1793 it was illegal in Connecticut for ego to marry the sister of his deceased wife; but among the ancient Hebrews there was the custom of the levirate, by which a man was enjoined to marry the widow of his deceased brother if the brother had died without a son. The record shows a very few isolated cases where persons of opposite sex from the same nuclear family were permitted to marry. An example is the brother-sister marriage among the Ptolemies of ancient Egypt. Apparently the practice in these few exceptions functioned to keep power within ruling families. [See INCEST.]
Ethnocentrism and homogamy. Whereas the principle of incest avoidance prevents ego from marrying someone too close to him in the kinship system, the principle of ethnocentrism prevents his marrying someone too different from him with respect to a number of social characteristics. In other words, ethnocentrism is a force tending toward endogamous and homogamous marriages.
Sumner ([1906] 1959, chapter 1) used the term ethnocentrism to refer to the set of attitudes shared by members of a tribe or other social group to the effect that the members of that group and any others like them were seen as the center of the civilized world and had, therefore, the correct and desirable set of social characteristics. Thus, ethnocentric attitudes lead to the condemnation of outsiders to the degree that they are recognized as differing from one's group. The minimum degree of social distance on the Bogardus scale is indicated by an affirmative response to the query as to whether or not the respondent would be willing to accept a person with a specified characteristic to
close kinship by marriage. Traditionally, the castes of India have been endogamous, as have the subaltern categories of subcaste, section, and subsection. According to Kapadia ([1955] 1958, p. 118), these endogamous restrictions limited a Hindu's field of eligibles to 50 to 300 families. In 1949, however, the Hindu Marriages Validity Act stipulated that no marriage of Hindus could be invalidated because of caste or sect differences between the parties concerned. Expert opinion is divided as to the likelihood that caste endogamy will break down. In accordance with the principle of ethnocentrism there is evidence that in American society ego tends to select a spouse similar to himself with respect to race, religio-ethnic identification, socioeconomic status, and other social characteristics. In the United States the most conspicuously homogamous dimension of mate selection is race. Interracial marriages are still prohibited by law in a number of the Southern states; moreover, even where such laws do not exist, or where they have been repealed, there is little evidence of enthusiasm for such marriages. Various studies have shown the proportion of racially heterogamous marriages to be under one per cent. [See ASSIMILATION.] The second dimension of ethnocentric preference and prohibition is that of religio-ethnic identification, which includes cultural as well as religious elements. Classifying the 1957 population of the United States into the three major religious categories (Protestant, Catholic, and Jewish), the U.S. Bureau of the Census found that approximately 94 per cent of the married persons had spouses in the same religious categories as themselves. If religious endogamy had not been practiced, and if, therefore, matings had been entirely random with respect to religious affiliation, the proportion having spouses in the same religious category as themselves would have been about 56 per cent (Winch [1952] 1963, p. 331). 
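The random-mating baseline mentioned above can be illustrated with a short calculation. The sketch below assumes that husbands and wives are drawn independently from the same distribution over categories, so that the expected proportion of same-category couples is the sum of the squared category shares. The function name and the shares used are hypothetical and chosen only for illustration; they are not the census figures, and this is not necessarily the procedure used in the source cited.

```python
def expected_same_category_rate(shares):
    """Expected proportion of couples in the same category under random mating.

    Assumes both spouses are drawn independently from the same category
    distribution, so the rate is the sum of squared shares.
    """
    total = sum(shares.values())
    return sum((value / total) ** 2 for value in shares.values())


# Hypothetical category shares for illustration only (not census data).
hypothetical_shares = {"A": 0.5, "B": 0.3, "C": 0.2}
print(round(expected_same_category_rate(hypothetical_shares), 2))
```

Comparing an observed rate of same-category marriage with such a baseline indicates how far mate selection departs from randomness; the larger the dominant category, the higher the baseline itself.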
There is evidence that in heterogeneous communities the probability that ego will marry outside his religious category is greater when his category constitutes a small proportion of the community rather than a large proportion. If the religious category has a highly distinctive ethnic identity (e.g., Catholics who are Spanish-speaking in an English-speaking community), the probability of ego's marrying endogamously is increased. A third dimension of ethnocentric preference is that of socioeconomic status. Commonly used indexes of socioeconomic status are occupation, income, and number of years of schooling. Numerous studies have shown that people tend to select their spouses from their own socioeconomic strata
with respect to all three of these indexes (several are cited in Winch [1952] 1963, pp. 336-338). Other characteristics with respect to which people tend to mate homogamously are age, previous marital status, and location of residence. Systematic research supports the common observation that young people tend to select young mates and older people choose older spouses. No doubt it is partially because of this fact that there is a tendency for people to marry others who are like themselves with respect to previous marital status: divorced men tend to marry divorcees; single persons tend to marry those who have not previously been married; and widows and widowers tend to marry each other. Another common-sense observation that has been supported by research concerns residential propinquity: ego is more likely to marry someone living nearby than someone living far away (Winch [1952] 1963, pp. 322-324, 339-345). Since people are not randomly distributed through communities but rather tend to live near and to work with others of similar social characteristics, one would expect mate selection to be somewhat homogamous, whether or not there are any sanctions enforcing endogamy. Of course there are sanctions of varying degrees of intensity: for example, in American culture sanctions are quite intense with respect to race, less so with respect to religion and socioeconomic status, and virtually nonexistent with respect to residential propinquity. Homogamy may also be considered on a more psychological level. For example, there is evidence that spouses tend to resemble each other in level of intelligence, in values (e.g., religious and aesthetic), and in attitudes (e.g., toward birth control and toward communism). When spouses are tested by paper-and-pencil methods, they appear to resemble each other somewhat, but not greatly, with respect to traits of temperament and personality. 
However, data gathered by other methods, such as interviews and projective methods, lead to the contrary conclusion that, at least in such traits as dominance and dependence, spouses tend to be complementary rather than similar. At present this seeming paradox is unresolved, although the answer may be that the homogamy apparent in paper-and-pencil tests is an artifact resulting from the effort of people to represent themselves to be as attractive as possible, what is called the "social desirability" effect (Winch 1958; [1952] 1963, chapter 18).
Differentiation of sex roles
The simple fact that only women can bear children causes every society to recognize some differentiation between the behavior of men and of
women. Beyond the behavioral differences that are directly attributable to anatomy and physiology, however, cultures vary greatly in the degree to which they view human behavior as being properly sex differentiated. From a study of 224 societies, Murdock (1937) has found that men tend to engage in such active and mobile tasks as hunting, fishing, trapping, and lumbering, whereas women tend to specialize in more sedentary but equally important tasks, such as gathering fuel and fruits and cooking and preserving meat and fish. More generally, it is possible to conceptualize two criteria that distinguish masculine from feminine tasks. Tasks assigned to men usually require physical exertion and strength, or spatial mobility and absence from home for considerable periods of time, or both. By contrast, feminine activities are typically less demanding of great strength, although perhaps requiring a considerable output of energy, and will involve only a few hours at a time away from home. Analysis of these differences leads to the conclusion that the sharpness with which a culture distinguishes between masculine and feminine sex roles will be related to the importance it attaches to tasks requiring one or both of the two masculine task characteristics. Military activity is one obvious example that involves both of the masculine criteria; thus it is argued that a highly military-oriented culture will be one that draws a sharp distinction between properly masculine activities and those that are properly feminine. The converse inference is that to the degree that a society's important tasks do not call for either of the criteria distinguishing masculine activities, there will be no basis for developing highly differentiated sex roles. As nonhuman power has taken over most of the heavy tasks in the industrial societies, the proportion of the total labor force that is classified as "white collar" has greatly increased.
And white-collar occupations, especially those not requiring travel, can be carried on as well by women as by men. Thus, if Western cultures have been "feminized" over the past century or so, as some writers have claimed, the present analysis would interpret such a trend as a consequence of the increased use of nonhuman power.

Sex dominance in the marital dyad

What are the conditions that result in the dominance of one spouse over the other? The opportunity for dominance exists in a dyad when resources desired by one member are controlled by the other, that is, when one is dependent upon the other. Resources may be viewed broadly to include both material goods, such as food, and intangibles, such as a compliment.
Where no organizational feature exists to determine otherwise, it appears that men have usually dominated women. The reasons for this originate in the two criteria differentiating masculine from feminine pursuits and in their anatomical and physiological bases. A woman with small children has greater need of a man to take care of her than the man has need of her. His care may be viewed as a resource, and by granting or withholding that resource, the man can dominate the woman. This is a state of affairs that has been remarked by social scientists from Aristotle through E. A. Ross and Willard Waller and is perhaps best known to contemporary readers under the rubric of the "principle of least interest": that is, the person in a relationship who has the least to lose through the termination of the relationship is in a position to demand more from others and thus to dominate them in exchange for his continued participation. Aside from this situation of unilateral dependence, other possibilities are mutual interdependence, where the resources are not available to either one unless they cooperate, and mutual independence, where each has control over his own resources.

With respect to organizational features, W. G. Sumner and A. G. Keller have remarked that where the bridal couple lives has bearing on which is the dominant sex and therefore that matrilocal marriage is a condition favorable to the relative standing of women. In traditional China, the favored pattern was patrilocal, and a wife was expected to obey her husband; masculine dominance was mitigated, however, in the case of adoptive marriage. According to this pattern, a man having no son might seek a young man (who was usually of somewhat lower social rank) to take the older man's family name, marry the older man's daughter, and live matrilocally.

Studies of marriage

During the second quarter of the twentieth century there was a good deal of concern about the state of the family in the Western world.
There was evidence that divorce rates had risen, that the family had lost functions to other social structures, that the birth rate had fallen, that certain totalitarian regimes were trying to bring about the disintegration of the family, and that broken families were spawning delinquent children. Family disorganization was widely viewed as a social problem; probably for this reason, numerous studies were undertaken to discover the determinants, or at least some correlates, of what was variously called marital "adjustment," marital "happiness," and marital "success." Although these studies did not undertake to distinguish very sharply among the three terms just noted, it does seem useful to differentiate them as follows. There are two kinds of marital adjustment, one pertaining to the role and the other to the psyche of the performer. An actor is adjusted to a marital or any other kind of role to the degree that he knows the expectations that define the role and, under the appropriate conditions, can produce the behaviors expected. On the other hand, he is adjusted psychically to the degree that the energy he invests in the role performance is commensurate with the gratification derived from it. Marital "happiness" refers to the subjective response of the actor to marriage and thus is related to psychic adjustment; however, one can be psychically adjusted when both output of energy and input of gratification are low, whereas presumably happiness requires at least a moderately high level of gratification. The term marital "success" implies the existence of a goal of marriage, and whatever goals there may be (avoidance of divorce, procreation, personality development of the spouses) seem to be more clearly conceived by those who write about marriage than by the participants whose behavior the writers describe. Much of the research on marriage has been concerned with marital adjustment, that is, both with the aptitude to carry out the marital role and with the capacity to derive commensurate gratification from the performance. Kirkpatrick has surveyed a large number of studies and has reported the variables he finds that have correlated most consistently with what is here called marital adjustment. He has divided these variables into two sets: those that were clearly operating before the marriage and those that may or may not have been. Presumably the determinants of marital adjustment are more likely to come from the former set. Kirkpatrick ([1955] 1963, p.
389) presents the following premarital factors as having shown the strongest and most consistent association with high marital adjustment: happiness of parents' marriage; adequate length of acquaintance, courtship, and engagement; adequate sex information in childhood; personal happiness in childhood; approval of the marriage by parents and others; adjustment in engagement and normal motivation toward marriage; ethnic and religious similarity of the spouses; high social and educational status; maturity (marriage in the late twenties rather than in the teens or early twenties); similar chronological age of the spouses; and harmonious affection with parents during childhood. Factors that may have become operative during marriage, rather than before, and therefore are regarded as part of the complex of marital adjustment rather than among its determinants are early and adequate orgasm capacity, especially of the wife; confidence in the spouse's affection and satisfaction with degree of affection shown; equalitarian rather than patriarchal marital relations, with special reference to the role of husband; mental and physical health; and harmonious companionship based on common interests and accompanied by a favorable attitude toward the marriage and the spouse (Kirkpatrick [1955] 1963, p. 394).

ROBERT F. WINCH

[See also FAMILY; NUPTIALITY; and the biographies of BURGESS; MALINOWSKI; SUMNER; WALLER; WESTERMARCK.]

BIBLIOGRAPHY

BLAKE, JUDITH 1961 Family Structure in Jamaica: The Social Context of Reproduction. New York: Free Press.
BOHANNAN, PAUL 1963 Social Anthropology. New York: Holt.
GOODE, WILLIAM J. 1959 The Theoretical Importance of Love. American Sociological Review 24:38-47.
GOODE, WILLIAM J. 1960 Illegitimacy in the Caribbean Social Structure. American Sociological Review 25:21-30.
GOODE, WILLIAM J. 1963 World Revolution and Family Patterns. New York: Free Press.
HSU, FRANCIS L. K. 1948 Under the Ancestors' Shadow: Chinese Culture and Personality. New York: Columbia Univ. Press.
KAPADIA, KANAILAL M. (1955) 1958 Marriage and Family in India. 2d ed. Bombay: Oxford Univ. Press.
KIRKPATRICK, CLIFFORD (1955) 1963 The Family as Process and Institution. 2d ed. New York: Ronald Press.
LANG, OLGA 1946 Chinese Family and Society. New Haven: Yale Univ. Press.
LEVY, MARION J. 1949 The Family Revolution in Modern China. Cambridge, Mass.: Harvard Univ. Press.
MALINOWSKI, BRONISLAW (1929) 1962 Marriage. Pages 1-35 in Bronislaw Malinowski, Sex, Culture and Myth. New York: Harcourt.
MURDOCK, GEORGE P. 1937 Comparative Data on the Division of Labor by Sex. Social Forces 15:551-553.
SUMNER, WILLIAM GRAHAM (1906) 1959 Folkways: A Study of the Sociological Importance of Usages, Manners, Customs, Mores, and Morals. New York: Dover. A paperback edition was published in 1960 by New American Library.
WINCH, ROBERT F. (1952) 1963 The Modern Family. Rev. ed. New York: Holt.
WINCH, ROBERT F. 1958 Mate-selection: A Study of Complementary Needs. New York: Harper.
YANG, CH'ING-K'UN 1959 The Chinese Family in the Communist Revolution. Cambridge, Mass.: M.I.T. Press.
ZIMMERMAN, CARLE C. 1947 Family and Civilization. New York: Harper.

II COMPARATIVE ANALYSIS

Every society has rules governing the assumption of the conjugal roles of husband and wife; there are also discernible rights accruing to and obligations incumbent upon the individuals who assume
these roles. Marriage in all societies thus brings about a change in the jural status of the parties to the contract. Where marriage is defined by the state, it is possible to describe most of its jural entailments by reference to one or more legal codes adopted by that state. However, among many of the peoples studied by anthropologists, the jural tenets governing marriage cannot be ascertained by reference to codes laid down by a state and hence must be derived from the study of the recurrent patterns of behavior and of folk models that prescribe ideal behavior. Marriage entails not only a change in the jural status of the individuals who enter the roles of husband and wife but also a change in the lawful status of specifiable consanguineal kinsmen of the individual partners. In fact, it is the linkage of groups as well as of individuals that is crucial to the formulation of the difference between marriage and its social analogues. Only marriage creates (or maintains) affinal relationships between the kinsmen of individuals who claim the roles of husband and wife (see Fortes 1959, p. 209). Even where it is socially admissible for individuals to presume conjugal status (that is, where they may assume the husband-wife roles without their actions being legitimated according to prevailing jural rules), this presumption of status does not generate lawful relations of affinity between kinsmen of the "spouses" concerned. The importance of affinity to an understanding of marriage is made clear through a consideration of the nature of kinship. The social relations subsumed under the concept of kinship are of two fundamental types which, though referable to the biological processes of heterosexual mating and procreation, cannot be reduced to biology. Those social relationships based on parenthood and descent or, more precisely, on parenthood and filiation, are generally termed consanguineal relationships.
All persons related by socially defined direct or shared descent are consanguineal kinsmen (P. Bohannan 1963, chapter 4). These "blood relatives" are distinguished in all societies from affinal relatives, i.e., those whose kinship status is fundamentally grounded "in law." Human mating is everywhere socially regulated, and adult mating for the purpose of procreation is normally preceded by the creation of jurally derived kinship ties between the mating pair and between certain of their respective consanguineal relatives. The continuance of publicly acknowledged affinal kinship depends on adherence to prescriptions and proscriptions delimited by the particular society under consideration. Whereas many societies make no provision for the legal severance of consanguineal
kinship bonds, they all provide for the severance —"by law"—of those which are based "in law." Societies differ considerably with respect to the rules governing the way in which the roles of husband and wife should be assumed, with respect to the specific rights and obligations which accrue to persons in these roles, and with regard to the behavioral and jural attributes of the other affinal roles created by marriage. Nonetheless, most anthropologists have regarded the institution of marriage as a universal in human societies, and many have attempted to provide definitions of marriage sufficiently general to encompass its various manifestations. The fact that marriage is closely linked to parenthood has led many scholars, including Westermarck, Malinowski, and Radcliffe-Brown, to propose definitions of marriage which center on what Malinowski termed "the principle of legitimacy." Thus, Radcliffe-Brown writes: "Marriage is a social arrangement by which a child is given a legitimate position in the society, determined by parenthood in the social sense" (1950, p. 5). The general, though by no means universal, acceptance of this formulation is indicated by the fact that Notes and Queries on Anthropology defines marriage in an essentially similar, but by implication more limited, manner: "Marriage is a union between a man and a woman such that children born to the woman are the recognized legitimate offspring of both partners" (British Association for the Advancement of Science 1951, p. 110). Edmund R. Leach was among the first to argue that a definition of marriage in terms of legitimacy is too limited. In his opinion, any attempt at a universal definition of marriage is inevitably "vain," since the "institutions commonly classed as marriage are concerned with the allocation of a number of distinguishable classes of rights" (1961a, p. 107). 
Leach suggests that in most cases the institution of marriage serves to allocate rights to either or both spouses; in some cases it serves primarily to allocate rights to the husband and his wife's brothers. Despite Leach's arguments against a universal definition of marriage, his formulations stimulated two fresh attempts at universal definitions. Prince Peter of Denmark suggested that in light of Leach's propositions, marriage should be defined as "the socially recognized assumption by man and woman of the kinship status of husband and wife" (Peter, Prince of Denmark 1956). The task of the anthropologist would then be to ascertain and delineate the particular rights and obligations associated with these kinship roles in the particular societies being studied.
H. Fischer (1956) called this definition tautological, on the grounds that the Oxford and Webster dictionaries defined "husband" and "wife" respectively by phrases such as "a married man" and "a married woman." In a discussion of Nayar marriage, Gough agrees and reaffirms the heuristic value of a definition of marriage based on "the principle of legitimacy." In an attempt to overcome the difficulties inherent in any formulation which defines marriage as a union of "a man and a woman" and in an attempt to provide a substantive definition for the concept of legitimacy, Gough suggests that marriage be defined as "a relationship established between a woman and one or more other persons, which provides that a child born to the woman under circumstances not prohibited by the rules of the relationship is accorded full birth-status rights common to normal members of his society or social stratum" (1959, p. 32). Her effort to refine the older, more general "principle of legitimacy" definition has yielded one which, on close examination, is equally inadequate. Operating with such a definition, no investigator could classify as married any particular woman who had assumed the jurally recognized kinship role of wife but who had not borne children. Of course, the conditions under which a child would be accorded "full birth-status rights" could be elicited by the investigator. However, for any given case, the researcher would have to await the birth, or perhaps the conception, of a child before he could ascertain whether conditions entailed in the husband-wife relationship had been violated. Furthermore, Gough's definition implies that in any society each person having "full birth-status rights" is the child of a relationship which can be termed marriage. Among various peoples of the world, "full birth-status rights" accrue to persons born of relationships which are not recognized as marriage according to prevailing jural rules.
If a universal definition of marriage is to be formulated, it would seem that the one proposed by Prince Peter should serve as a model. Fischer's criticism of Prince Peter's definition may be disregarded, since dictionary definitions are usually unsatisfactory bases for discussions of roles. The roles of husband and wife must be defined in terms of the essential rights and obligations and the behavioral attributes entailed in them in any particular society. Gough and Fischer are justified in their concern that, confronted with different forms of mating, the anthropologist employing Prince Peter's definition would be unable to decide which institutions should be referred to as "marriage," as "concubinage," etc. However, if the statement were
modified so as to define marriage as the jurally valid and socially (or publicly) recognized assumption of the kinship roles of husband and wife, there would be few or no problems concerning the distinction between marriage and its socially recognized alternatives. Such a proviso emphasizes that the publicly acknowledged kinship roles created by marriage, as opposed to its alternatives, derive support from the juridico-political domain of the society. Of course, there may be more than one jurally valid way of assuming the roles of husband and wife, as is the case in some present-day African states which recognize marriages contracted according to one or more sets of "customary laws" as well as marriages contracted in accordance with legal codes based on European models. It would appear that the cross-cultural study of marriage must rest on the premise that all societies recognize kinship roles which are founded "in law" as well as those which are based ultimately on actual, assumed, or presumed genetic relationships. Fundamental to the understanding of the concept of "lawfully based" kinship is the fact that human mating is everywhere subject to socially derived regulations. While it is normally expected that marriage will lead to parenthood, the roles of husband and wife need not be defined by reference to children who will come to be regarded as legitimate offspring of individuals in these roles. The roles of husband and wife should be defined in terms of the rights and obligations which attach to them, and marriage must be defined as the lawfully or jurally recognized assumption of these roles.

Choice of spouses

In all societies, socially derived limitations are placed on the range of persons from among whom spouses may be chosen. Regulations which prescribe marriage outside a stipulated group are referred to as rules of exogamy. Kin groups such as lineages, or territorial groups such as bands or villages, may constitute exogamous units.
Societies possessing corporate unilineal descent groups usually prescribe that a person select as spouse someone from a descent group other than his own. In some cases the selection may be made from among persons within the descent group but outside specified degrees of relationship. Among the Gisu of east Africa, for example, it is the minimal patrilineage, comprising persons who trace their descent from an ancestor three to five generations removed from the oldest living generation, which constitutes the exogamous unit. Every society prohibits heterosexual mating between certain "close" consanguineal relatives. This
prohibition is referred to as the incest taboo, and ordinarily it proscribes mating between relatives who stand to each other in the relationships of mother and son, father and daughter, and brother and sister. In many societies the incest taboo is extended to various other kinsmen in the parental and filial generations. Among some royal or ruling groups, as in dynastic Egypt and in Polynesia, relatives ordinarily prohibited from mating may be preferred as marriage partners. The mating of close relatives is also permitted in some societies on specified ritual occasions. A rule of endogamy exists where the field of possible spouses is limited to persons within an individual's territorial and/or social group. The castes of traditional India are the most often cited example of endogamous groups. Other societies also prescribe marriage among persons of the same social stratum. Among the Swazi of south Africa, where lineage exogamy prevails and where royalty marries royalty, there are frequent subdivisions of the royal lineage so as to make possible otherwise prohibited marriages. A number of studies indicate that in the absence of explicit prescriptions, it is possible to discover endogamous tendencies within social or territorial groups of various size and scale. In addition to proscriptions associated with incest and exogamy, societies usually prohibit marriage between certain other categories of persons. In some instances slaves cannot marry freemen. Where age sets are a feature of social organization, as among the Nuer, a man may be prohibited from marrying the daughter of another man in his age set. Societies which prescribe that a spouse be chosen from among one or more designated categories of persons have been said to possess closed marriage systems. Those in which such prescriptions do not exist have been characterized as having open marriage systems.
The designation of a marriage system as "closed" is not meant to suggest total absence of choice in the process of mate selection. This point is illustrated by Klass (1966), who shows that in Bengal (and in other parts of India), while caste affiliation delimits the broad category of persons from which a spouse is chosen, a man who must choose husbands for his daughters or "wards" does so from within a relatively narrow selection of eligible males known to certain of his kinsmen. The most frequently cited closed marriage systems are found among the indigenous societies of Australia. Some of these societies, for example the Kariera, practice what anthropologists term "symmetrical cross-cousin marriage," wherein pairs of
local groups engage in the "simultaneous or nearly simultaneous exchange of women" (Leach 1961a, p. 59). The male members of the two groups concerned exchange their "sisters" for "wives." Ideally, a male ego marries his mother's brother's daughter, who may at the same time be his father's sister's daughter and the sister of his own sister's husband. Among the Murngin of Australia is found a type of asymmetrical cross-cousin marriage wherein marriage with the mother's brother's daughter is preferred and marriage with the father's sister's daughter is proscribed. In this society and others practicing matrilateral cross-cousin marriage, a localized descent group gives wives to one or more other such groups and receives wives from a different set of such groups. In Murngin society there are descent groups which are allied through ties of kinship and ritual. Moreover, each pair of such allied groups stands in balanced opposition to another similar pair with which it exchanges women on a nonexclusive basis. Since men do not marry within their own moiety, any male ego and his mother's brother are in opposite moieties. Ego's group receives wives from and gives prestations to his mother's brother's group. Ego's mother's brother's group receives wives from and gives prestations to the group with which ego's group is allied. This latter group is the one containing ego's mother's mother's brother, who, of course, stands in the relationship of mother's brother to ego's own mother's brother. It can be said, therefore, that in Murngin society the "mothers' brothers" stand in the relation of "wife givers" to their sisters' sons (see Leach 1961a, pp. 68-72). Claude Lévi-Strauss, Edmund R. Leach, Louis Dumont, and others have discussed the economic and political implications of this and other forms of "cousin marriage." Leach (1961a, pp.
54-104) has shown that where matrilateral cross-cousin marriage prevails, there exist permanent status differences between wife-giving and wife-receiving groups and has demonstrated that the marriage system is not insulated from other domains in the society. In fact, he argues that marriage alliance in such situations is but one of "many continuing relationships between paired local descent groups." Political and economic relationships are reflected in and sustained by the system of matrilateral cross-cousin marriage. In open marriage systems, the only group of persons unequivocally proscribed as marriage partners are those to whom the incest taboo is extended. There are no normative prescriptions relating to groups from which spouses should be chosen. Nonetheless, many studies indicate that demographic, ecological, and sociological factors enter into the choice of spouse. Age, residential propinquity, class, religion, ethnicity, education, and occupation have been isolated as important determinants in the choice of marital partners. Likewise, parents and peer groups are often instrumental in delimiting for each individual the field from which a spouse will be chosen.

The transfer of rights at marriage

Marriage involves the allocation of rights and obligations among the parties to the agreement. A number of anthropologists have attempted to classify the various rights which are known to be allocated at marriage in different societies. In discussing the jural element in marital and other kinship relations, Radcliffe-Brown (1950, p. 12) distinguishes personal rights (jus in personam) from possessive rights (jus in rem). A right in personam confers on an individual or a group the power to order the performance of certain duties by another individual or group. Rights in rem constitute claims on an object or person such that any encroachment on the object or person constitutes a violation of the "possessor's" rights. In most societies husbands and wives have personal rights in each other: either spouse may claim certain duties of the other. It is also common to find that a husband has "possessive" rights in relation to his wife. Her seduction, her abduction, or her murder would constitute a serious infringement of her husband's rights. In an important contribution to the literature on marriage, Laura Bohannan (1949) distinguishes two classes of rights in females which may be allocated at marriage. Rights in uxorem (rights in a woman as wife) are distinguished from rights in genetricem (rights in a woman as mother). In her discussion of Dahomean marriage, Bohannan shows that rights over a woman's sexual powers and certain of her domestic services were transferred from a woman's patrilineage to the man or woman who made the appropriate bridewealth payments.
In most of the "types" of Dahomean marriage, rights to any children a woman might bear during the course of her marriage were also transferred from a woman's patrilineage to that of her husband. Distinct classes of marriage payments were necessary to the transfer of each of these two classes of rights. However, in certain "types" of marriage, rights in genetricem were retained by the woman's natal patrilineage; this might occur in cases where a lineage was faced with a shortage of male heirs and one of the daughters of the lineage was given in marriage to
a man who agreed to make all the bridewealth payments except those which would have given him jural authority over children of the marriage. Moreover, the marriage of a woman of the royal lineage never involved the transfer of rights in genetricem to the lineage of her husband (L. Bohannan 1949). Even though it is usually rights in women which are in the forefront of marital negotiations, Leach has pointed out that marriages also serve to allocate rights in and over men (1961a, pp. 107-108). He suggests that a marriage may serve to do the following:
(1) To establish the legal father of a woman's children.
(2) To establish the legal mother of a man's children.
(3) To give the husband a monopoly of the wife's sexuality.
(4) To give the wife a monopoly of the husband's sexuality.
(5) To give the husband partial or monopolistic rights to the wife's domestic and other labor services.
(6) To give the wife partial or monopolistic rights to the husband's labor services.
(7) To give the husband partial or total rights over property belonging or potentially accruing to the wife.
(8) To give the wife partial or total rights over property belonging or potentially accruing to the husband.
(9) To establish a joint fund of property (a partnership) for the benefit of the children of the marriage.
(10) To establish a socially significant "relationship of affinity" between the husband and his wife's brothers.
Leach thus focuses attention on rights in and regarding children, sexuality, domestic and economic services, and property. In the last instance, he suggests that marriages may establish between groups of men mutual interdependencies which could entail any of the above rights as well as others of a political nature. Where there are corporate kin groups, the allocation of rights at marriage is usually effected by and between at least two such groups.
In the case of first marriages, it is usual that the groups into which the husband and wife were born are parties in this rearrangement of social relations. Where recruitment to the corporate kin groups is based on patrilineal descent, normally the rights over a woman's sexuality and procreative capacities that are held by her natal group are transferred
at marriage to the groom and his natal group. Thus, whereas prior to marriage any sexual offense against a woman constitutes a violation of rights held by her kin group, after marriage such an offense is an infringement of the groom's rights. Similarly, whereas children born to a woman outside marriage would fall under the jural authority of her natal kin group, those born after marriage are subject to the authority of, and have rights in, the groom's kin group. Total rights over the bride's domestic and economic services are seldom transferred from her natal group to her husband or his kin group. The woman herself, as an adult member of the society, may retain some control over the dispensing of these services. Often her kin group retains the right to call upon these services. Among the Yoruba of Nigeria, for example, rights in the bride's sexuality, rights over her procreative powers, and partial rights over her domestic services are acquired at marriage by the groom and his patrilineage. However, a woman maintains control over her economic powers and resources, and her natal lineage retains the right to call upon her domestic services in certain circumstances. She is called upon to buy and prepare food at times when deities associated with her lineage must be propitiated, and on the death of a member of her lineage, she is expected to be of service in various ways. This raises another point: in most societies possessing corporate patrilineages, a married woman does not usually relinquish all her rights in her natal lineage. She may retain some proprietary rights therein, and she usually remains under the religious protection of her lineage ancestors. Moreover, a woman's lineage may have the right to reclaim control over her sexual and procreative powers should there be a breach of the marital agreement on the part of her husband.
While these statements are generally true, there are some patrilineal societies in which a married woman becomes virtually "absorbed" into her husband's lineage. According to Gluckman (1950), a married woman among the Zulu of south Africa had virtually no rights outside her husband's lineage; once a woman was married, her natal lineage forfeited virtually all authority over her. Whatever rights are transferred to the husband or his lineage may be temporarily or permanently reallocated by him or his lineage. The most common example of this is the practice of "wife-lending" found among the Kipsigis and others. The fact that a man may permit another to have access to his wife's sexuality is proof of his monopoly over her sexual capacities. In some societies, a man
who is impotent may choose a sexual partner for his wife in order that she may bear children. Where this is so, the husband is the lawful father of the children, even though he is not the genitor. Where a female is permitted to assume the role of husband, she bestows her rights of sexual access to her wife on a man of her own or of her wife's choice. In matrilineal societies, rights over the procreative capacities of women are held in perpetuity by their kin groups while partial or total rights in their sexuality are transferred at marriage to their husbands. Customarily, the husbands also attain rights to the domestic services of their wives. Among the Bemba of east central Africa, for example, a husband has monopoly over his wife's sexuality, but the children of any marriage belong to their mother's matrilineage and are therefore under the jural authority of the adult males of that group. A wife keeps her husband's house and contributes her labor to his agricultural pursuits. Marriages and the exchanges of goods and/or services occasioned thereby are sometimes processual events extending over considerable periods of time. The rights and obligations entailed in the marriage may be allocated in serial fashion, the timing of their transfer being dependent on the transfer of the appropriate goods and services. In such cases, the exchange of goods and services may commence during the period of betrothal and continue even after the formal transfer of certain rights has taken place. Where goods and services are exchanged as part of the marriage procedure, certain of these may be regarded as necessary prestations without whose exchange a transfer of rights will not take place. Others are contingent prestations which, although part of the contract, are not essential to the exchange of jural authority and the assumption of marital rights and obligations. 
As Fortes says, they constitute the "means of winning and preserving the goodwill of those with the power to transfer marital rights" (1962, p. 10). The most general terms used to describe prestations entailed in the marital contract are those of bridewealth (or bride-price) and dowry. The former refers to gifts presented by the groom's kin group to that of the bride, and the latter describes gifts made by the bride's kin group to that of the groom. The dowry is the more familiar to Westerners, since for centuries it has been a part of the marriage contract in Europe. However, both bridewealth and dowry have been reported for various parts of the world. Throughout history, the transfer of rights at
marriage has been enshrined in ritual and ceremony. This is a correlate of the fact that marriage transactions are always "publicly" acknowledged. The ceremonies which take place in effect call forth "the public" to bear witness to the lawfulness of the transactions. The sanctions which emanate from the jural domain of the society are strengthened by the incorporation of rituals associated with the religious realm of the society. Concurrent marriages. The transfer of rights at marriage and the rituals associated with this transfer signify the assumption of new roles by the parties involved. In societies which permit polygyny or polyandry—marriages entailing a plurality of wives or of husbands, respectively—one of the partners to a marriage assumes the role of co-wife or co-husband along with the role of husband or wife. In polygynous marriages, the husband usually acquires the same categories of rights in each of his wives. In patrilineal societies, a man is the legitimate father of all his wives' children, even though his rights over the wives' sexuality may be assigned to or "usurped by" other men. The children of polygynous marriages may or may not have equal claims on their father's property. In any case, each wife considers herself the guardian of her children's rights within the family created by the polygynous marriage. Where polyandry is practiced, by definition a man does not have exclusive rights in his wife's sexuality. He may or may not have claims over the children which she bears him. Among the Sinhalese, rights over the wife's sexuality are partially vested in the first husband. The sexual rights of the other husbands are exercised with the consent of the first husband and the wife. A husband has claims over those of the wife's children whom he has fathered, and the children have legitimate claims over the property of their respective fathers. All the children have equal claims to the properties owned by their mother. 
Among the Nayar of south India, a ritual marriage ceremony, called the tali rite, bestowed upon a group of men of appropriate caste the right of access to a woman's sexuality. The completion of the tali rite marked a girl's transition to womanhood. Thereafter, when she attained appropriate age, she could begin to enter into relationships, termed sambandham unions, with a number of men, for whom she might bear children. Rights over a Nayar woman's procreative powers were retained by her matrilineage, which had jural authority over her children. Nonetheless, in order for a child to have "full birth-status rights" in his
mother's lineage, he had to have an acknowledged father. A man acknowledged the paternity of a child by bearing certain expenses associated with its delivery. This man could be any one of those with whom the mother had entered into a sambandham union. In cases of doubtful paternity, a woman's current "visiting husband" could be forced by an assembly of persons in the neighborhood to make the birth payments. "But if no man of appropriate rank could be cited as potential father, woman and child were expelled from their lineage and caste" (Gough 1959, p. 30). The levirate and the sororate. In many societies, an individual may assume the role of husband or wife in order to secure rights for a kinsman. Where the "true" levirate prevails, upon the death of a husband, it is the duty of one of his brothers to marry the widow, and any children born to the union are counted as the progeny of the deceased man. Certain of the "ghost marriages" found among the Nuer resemble the levirate. A man could marry a woman "to the name of" a brother who died childless, and the offspring of the union would be designated as children of the deceased. These practices differ from the custom of adelphic widow inheritance, wherein a man marries his deceased brother's widow and bears children who are counted as his own. Where the "true" sororate prevails, the husband of a barren woman marries her sister, and at least some of the children born to the union are counted as those of the childless wife. The term "sororate" is also used in reference to the custom whereby, upon the death of a wife, her kin supply a sister as wife for the widower. In the latter case, however, any children born to the woman are recognized as her own. Affinal relationships. The transfer of rights at marriage not only signals the couple's assumption of new conjugal roles but also serves to establish or perpetuate affinal relationships between consanguineal kinsmen of the spouses. 
Often associated with affinal roles are behavioral attributes commonly subsumed under the categories of "joking relationships" and "avoidance relationships." Radcliffe-Brown (1952, pp. 90-116) has argued that the respect implied in avoidance practices and the formalized disrespect demonstrated by joking relationships are expressions of alliance or consociation. The actors in roles characterized by joking or by avoidance have divergent interests which could generate conflict between them and thereby undermine the bases of their common interests. The institutionalization of avoidance and joking serves to minimize the chance of the development of openly hostile relations between the parties.
The most widespread of the avoidance practices are those which restrict contact between a husband or wife and the mother-in-law and/or father-in-law. Such restrictions on contact may also extend to actual or classificatory brothers or sisters of the father-in-law or mother-in-law. Among the patrilineal Swazi of south Africa, a wife is prohibited from coming into face-to-face contact with her husband's father and those of his male relatives of the same generation resident in the compound. A man behaves in similar fashion toward his mother-in-law, but the likelihood of such contact is minimized by their residence in different compounds and often in different villages. Joking relationships most commonly exist between a man or woman and affinal relatives of opposite sex in the spouse's generation. These relationships are characterized by the use of intimate names, the use of language otherwise considered lewd or abusive, and, in some cases, by indulgence in sexual play. Affinal relatives are often expected to give assistance to one another in times of exigency. In many societies where political functions are vested in roles defined primarily by kinship criteria, affinal relatives serve to minimize open conflict between their respective consanguineal kin groups. They might serve, as among the Tiv of Nigeria, as emissaries of peace in cases of latent or open conflict between two lineages. The linkage of individuals through marriage leads to the creation of new groups or, in Nadel's terminology, to the creation of new sets of bounded social relationships and thereby constitutes a phase in the developmental cycle of kin groups. As Radcliffe-Brown has pointed out, the eventual result of most marriages is that new sets of individuals are linked through common descendants. Ultimately, the fission of kin groups can often be traced to relations generated by marriage. 
This process is evident in many societies where lineages (or, for that matter, ramages) are a feature of social organization. When adult members of a lineage segment occupy a common residence along with their spouses and children, the process of incorporation of additional coresidents through marriage often eventually leads to the founding of households in other locations. In the course of time, the founders of such households and their descendants may come to form new lineage segments.

Postmarital residence

In some societies, spouses are expected to live together throughout the period of their marriage; in
others, they may be members of separate domestic groups and only visit each other's residences. The "residence rules" outlined by anthropologists refer to situations in which husbands and wives are members of the same domestic unit. Neolocal residence predominates when couples establish independent domestic units after marriage. Residence is characterized as virilocal when most couples in a society join a domestic group in which the husband resided prior to marriage or in which he rather than the wife has proprietary or other claims. Residence is called uxorilocal when couples join the domestic group to which the wife was attached prior to the marriage or in which the wife rather than the husband has claims. The above terms may be compounded with others to describe more precisely the nature of the domestic group joined by the couple. Thus, viripatrilocal residence refers to domicile in a domestic group whose core includes the groom's father. Uxorimatrilocal residence refers to domicile in a group whose core includes the bride's mother. The term avunculocal is used to describe residence in a group whose core includes the groom's mother's brother. Data collected by Goodenough (1956) and J. L. Fischer (1958) among the Nakanai of New Britain show that the classification of postmarital residence patterns is not as straightforward as some might assume. Their data also illustrate that there is no simple correlation between particular residence rules and particular rules for recruitment to descent groups. Goodenough shows that in this matrilineal society, a man takes his bride to live in the village in which his father resides. The couple lives there so long as the groom's father is alive, and they may remain after the father's death, particularly if the father is without sisters' sons who would be his jural heirs. More often, however, after the father's death, the couple moves to the residence of the husband's matrilineage, in which he has hereditary land rights. 
A man whose father is deceased takes his bride to live with the group which includes the man who acted as father-surrogate at the time of the marriage. Goodenough shows that even where ideal residence patterns suggest one or more prevailing modes of residence, the actual choices which couples make may depend on economic and other factors. Fischer, who also worked on the island of Truk and who arrived at a classification of residences significantly different from Goodenough's, has suggested that residence be elicited for individuals rather than for married couples. He suggests that every person in a household has a "kin sponsor"
and that his relationship to this sponsor most appropriately describes the residence pattern for that individual. While Fischer's suggestion has some merit for the classification of residence patterns for entire populations, attention cannot be shifted from the fact that in most societies the major spatial arrangements of individuals are associated with marriage. Moreover, the kinds of rearrangements which do occur have important implications for many kinds of social relations. It has been shown, for example, that the study of the developmental cycle of domestic groups touches on virtually all aspects of social structure and that postmarital residence patterns are crucial to the understanding of the development of domestic groups (Goody 1958).

Alternatives to marriage

Marriage is a process or event signifying the assumption of the roles of husband and wife in accordance with jural tenets prevalent in the society or stratum of society to which the parties belong. In contemporary societies, marriages are contracts which must be formally legitimized by the state. A state may provide that for purposes of inheritance, or for other specified purposes, persons who are not legally married to each other but who share a common domicile and who otherwise demonstrate a claim to conjugal status may be accorded some or all rights associated with legal marriage. Similarly, a state may choose to recognize marriages contracted according to rules formulated prior to its existence by some or all of the groups which constitute it. Such is the case in various parts of the world where formerly autonomous or semiautonomous political entities have come together to form modern nation-states. Unions other than lawful marriage are known to have existed in stateless societies as well as in states which did not make the legitimization of marriages their official concern. 
Yet it seems particularly characteristic of modern societies that there are individuals who, for various reasons, assume some or all of the obligations and rights associated with the roles of husband and wife without entering into legal marriage. Reference has already been made to the fact that one of the crucial ways in which such unions differ from marriage is that they do not create lawful kinship ties between consanguineal relatives of the couple. These "consensual unions" occur in different frequencies in different modern societies. In the United States, in the Caribbean, and in other areas where such unions occur with relatively high frequency among certain socioeconomic classes and/or ethnic groups, research has centered primarily on family organization, and consensual unions are often regarded as but one aspect of over-all "family instability." The result is that while many hypotheses have been offered to account for the matrifocal or matricentric family which, in some areas, is one structural correlate of consensual unions, few students have offered hypotheses which explicitly attempt to account for the origin and/or persistence of such unions. M. G. Smith (1962) has presented a wealth of statistical data in support of his hypothesis that specific mating patterns underlie the various forms of family organization in the Caribbean. He has demonstrated that the pattern of consensual mating underlies the matrifocal family in that area. However, he does not deal with the origin and persistence of the mating patterns themselves. Nevertheless, the data suggest that demographic and economic factors are important determinants of these patterns. For example, where the sex ratio is altered by the necessity that males migrate to find work, women often enter into extramarital unions with single or married men who remain behind. Such alliances may or may not entail coresidence. Consensual unions may constitute a stage in the development of domestic groups and as such are not so much alternatives as preludes to marriage. In parts of the Caribbean where great prestige is attached to church marriages followed by festivities requiring the outlay of large sums of money, couples often assume the roles of husband and wife by mutual consent until such time as they can afford a religious marriage ceremony. Thus, many couples establish a common domicile and bear children before they enter into matrimony "before the eyes of man and of God." This raises an important point. 
Even though, in most parts of the modern world, marriages may be contracted without religious ceremonies, historically marriage was the concern of religious institutions before it became the official concern of the state, and most religious doctrines still include prescriptions and proscriptions regarding marriage. Where the influence of religious tradition is particularly strong, civil marriages may be regarded as little more than alternatives to or complements of "true marriage." One of the consequences of this, as evidenced in parts of the Caribbean, is that couples enter into extramarital relationships until such time as they can finance the religious and convivial ceremonies as well as fulfill the legal requirements for marriage.
Marital stability and divorce

The ambiguities entailed in the concept of marital stability have been succinctly stated by David Schneider:

Stability may be defined in terms of the change of rules or expectations over time or in terms of the degree to which the rules or expectations are conformed to. Stable marriage may be defined as stable jural relations irrespective of conjugal relations, as stable conjugal and jural relations, or simply as stable conjugal relations. (1953, p. 56)

Thus, divorce, defined as the lawful dissolution of jural ties established at marriage, may occur relatively infrequently even though separation and other breaches in conjugal relations occur relatively frequently. In traditional Nuer society, the jural bonds established by marriage were stable; divorce, signified by the return of bridewealth, was rare. On the other hand, conjugal separation was relatively frequent. Max Gluckman (1950) was one of the first anthropologists to deal with the factors which contribute to the jural stability of marriage in preindustrial societies. His data on the Lozi and the Zulu led him to the hypothesis that the stability of jural relations established by marriage is correlated with the presence of patrilineages. He argued that where the "principle of father-right" prevailed, as among the Zulu, there was a complete and final transfer of women into their husbands' lineages (from which their children obtained their legal rights), and he suggested that this fact accounted for the virtual absence of divorce in such societies. In a reconsideration of the Gluckman hypothesis, Fallers pointed out that not all patrilineal societies provide for the absorption of women into their husbands' lineages. He suggests that where women retain rights in their natal patrilineages, patriliny contributes to the jural instability of marriage by dividing the loyalties of spouses. 
Fallers (1957) found among the Busoga a relatively high incidence of divorce, which he attributed in part to the fact that loyalties to natal lineages undermined the bonds established at marriage. Leach (1961a, pp. 114-123), Cohen (1961), and others have contributed to the discussions of marital stability begun by Gluckman, Schneider, and Fallers. However, there is yet to be undertaken the quantitative and comparative analyses required for a definitive statement on the determinants of stability in the jural aspects of marriage. Whether the aim is to isolate the determinants of differential rates of divorce within a single society or to account for the differences in the divorce rates
reported for various societies, care must be taken to insure that the data utilized are in fact representative of the populations discussed. Moreover, more attention must be given than has been in the past to the limits of the utility of numerical data, which, at best, can be considered reliable for relatively short time spans. The separation of spouses is usually taken as an index of instability in conjugal relations. However, it should be obvious that separation can only be taken as indicative of the disintegration of conjugal bonds when the coresidence of spouses is a societal norm. Even in these cases, separation does not always signal instability in conjugal relations. Among the Yoruba, it is common to find women living and working in one place while their husbands live and work in another. So long as these women are not known to have committed adultery, and so long as they fulfill certain responsibilities to their husbands and their husbands' lineages, their conjugal relations are not necessarily impaired. The distinction drawn by Schneider between stability in conjugal relations and stability in the jural aspects of marriage relations is useful in the analysis of marriage in contemporary societies. For example, it would be useful to make such a distinction in discussions of marriage patterns in the Caribbean and in the United States. As has been pointed out, among some of the lower-class populations in these areas, consensual mating is common. Not all the parties to consensual unions are persons who have never been legally married. In fact, where the economics of divorce are a deterrent to the lawful dissolution of marriage, consensual unions are often an alternative to divorce and remarriage. Hence, the jural relations established at marriage are often maintained even though conjugal relations are completely or partially severed. 
Most of the societies whose marriage systems are described in the anthropological literature are now incorporated into independent states. The very existence of these states signals changes in the rules regarding the establishment of marital contracts, since all contemporary states reserve the right to define what types of union constitute legal marriage. There is general agreement that the economic and demographic changes taking place in the "developing areas" are also effecting changes in traditional marriage systems. However, Goode (1963) has pointed out the difficulties involved in isolating cause-effect relationships between changes in a society's family patterns, including marriage, and
changes in its economic organization. Considerable refinement in research strategies is necessary before it will be possible to state with confidence the extent to which, the precise ways in which, and the specific points at which the spread of industrial technologies and the growth of cities impinge upon or serve to undermine traditional family structures and marriage patterns. Some of the studies of marriage found in the anthropological literature provide convenient points of departure for investigations of changes in the rules and behavior associated with marriage in different parts of the world. However, it is obvious that analyses of changing patterns of marriage require the collection of a larger body of quantifiable data than is available in most existent anthropological studies of marriage. Whereas most of the marriage systems described in the anthropological literature lent themselves to representation in terms of mechanical models, such models are becoming increasingly inadequate as representations of particular systems and as bases for comparative studies. The rules governing the establishment of marriage contracts, the factors influencing the choice of spouses, the rights and obligations entailed in conjugal roles, and the behavior of persons in these roles are sufficiently variable in any one system to require partial or total representation by means of statistical models. With the construction of such models, we can begin the assessment of the directions and rates of change in marriage systems and the isolation of the specific variables which account for these changes.

GLORIA A. MARSHALL

BIBLIOGRAPHY
BOHANNAN, LAURA 1949 Dahomean Marriage: A Revaluation. Africa 19:273-287.
BOHANNAN, PAUL 1963 Social Anthropology. New York: Holt.
BRITISH ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE 1951 Notes and Queries on Anthropology. 6th ed. London: Routledge. -> The first edition was published in 1874. The sixth edition was revised and rewritten by a committee of the Royal Anthropological Institute of Great Britain and Ireland.
CHRISTENSEN, HAROLD T. (editor) 1964 Handbook of Marriage and the Family. Chicago: Rand McNally.
CLARKE, EDITH 1957 My Mother Who Fathered Me: A Study of the Family in Three Selected Communities in Jamaica. London: Allen & Unwin.
COHEN, RONALD 1961 Marriage Instability Among the Kanuri of Northern Nigeria. American Anthropologist New Series 63:1231-1249.
EVANS-PRITCHARD, E. E. 1951 Kinship and Marriage Among the Nuer. Oxford Univ. Press.
FALLERS, L. A. 1957 Some Determinants of Marriage Stability in Busoga: A Reformulation of Gluckman's Thesis. Africa 27:106-123.
FISCHER, H. T. 1956 For a New Definition of Marriage. Man 56:87 only.
FISCHER, JOHN L. 1958 The Classification of Residence in Censuses. American Anthropologist New Series 60:508-517.
FORTES, MEYER 1959 Descent, Filiation and Affinity: A Rejoinder to Dr. Leach. Man 59:193-197, 206-212.
FORTES, MEYER (editor) 1962 Marriage in Tribal Societies. Cambridge Papers in Social Anthropology, No. 3. Cambridge Univ. Press.
GLUCKMAN, MAX 1950 Kinship and Marriage Among the Lozi of Northern Rhodesia and the Zulu of Natal. Pages 166-206 in A. R. Radcliffe-Brown and Daryll Forde (editors), African Systems of Kinship and Marriage. Oxford Univ. Press.
GOODE, WILLIAM J. 1963 World Revolution and Family Patterns. New York: Free Press.
GOODE, WILLIAM J. (editor) 1964 Readings on the Family and Society. Englewood Cliffs, N.J.: Prentice-Hall.
GOODENOUGH, WARD H. 1956 Residence Rules. Southwestern Journal of Anthropology 12:22-37.
GOODY, JACK R. 1956 A Comparative Approach to Incest and Adultery. British Journal of Sociology 7:286-305.
GOODY, JACK R. (editor) 1958 The Developmental Cycle in Domestic Groups. Cambridge Papers in Social Anthropology, No. 1. Cambridge Univ. Press.
GOUGH, E. KATHLEEN 1959 The Nayars and the Definition of Marriage. Journal of the Royal Anthropological Institute of Great Britain and Ireland 89:23-34.
KLASS, MORTON 1966 Marriage Rules in Bengal. American Anthropologist New Series 68:951-970.
LAWRENCE, WILLIAM; and MURDOCK, GEORGE P. 1949 Murngin Social Organization. American Anthropologist New Series 51:58-66.
LEACH, EDMUND R. 1961a Rethinking Anthropology. London School of Economics and Political Science, Monographs on Social Anthropology, No. 22. London: Athlone.
LEACH, EDMUND R. 1961b Asymmetric Marriage Rules, Status Difference, and Direct Reciprocity: Comments on an Alleged Fallacy. Southwestern Journal of Anthropology 17:343-351.
LÉVI-STRAUSS, CLAUDE 1949 Les structures élémentaires de la parenté. Paris: Presses Universitaires de France.
LOWIE, ROBERT 1933 Marriage. Volume 10, pages 146-154 in Encyclopaedia of the Social Sciences. New York: Macmillan.
MALINOWSKI, BRONISLAW (1929) 1962 Marriage. Pages 1-35 in Bronislaw Malinowski, Sex, Culture and Myth. New York: Harcourt.
MOGEY, JOHN (editor) (1962) 1963 Family and Marriage. Leiden (Netherlands): Brill. -> First published in Volume 3 of the International Journal of Comparative Sociology.
PETER, PRINCE OF DENMARK 1956 For a New Definition of Marriage. Man 56:48 only.
RADCLIFFE-BROWN, A. R. 1950 Introduction. Pages 1-85 in A. R. Radcliffe-Brown and Daryll Forde (editors), African Systems of Kinship and Marriage. Oxford Univ. Press.
RADCLIFFE-BROWN, A. R. 1951 Murngin Social Organization. American Anthropologist New Series 53:37-55.
RADCLIFFE-BROWN, A. R. (1952) 1961 Structure and Function in Primitive Society: Essays and Addresses. London: Cohen & West; New York: Free Press.
RADCLIFFE-BROWN, A. R.; and FORDE, DARYLL (editors) 1950 African Systems of Kinship and Marriage. Published for the International African Institute. Oxford Univ. Press.
SCHNEIDER, DAVID M. 1953 A Note on Bridewealth and the Stability of Marriage. Man 53:55-57. -> For the ensuing discussion on this topic, see the articles numbered 122, 223, and 279 in Volume 53 of Man, by E. E. Evans-Pritchard, Max Gluckman, and E. R. Leach, respectively; see also the articles numbered 96, 97, and 153 in Volume 54 of Man, by Max Gluckman, William Watson, and E. R. Leach, respectively.
SCHNEIDER, DAVID M. 1965 Some Muddles in the Models: Or, How the System Really Works. Pages 25-85 in Conference on New Approaches in Social Anthropology, 1963, Cambridge, The Relevance of Models for Social Anthropology. Edited by Michael Banton. Association of Social Anthropologists, Monograph No. 1. London: Tavistock.
SMITH, M. G. 1953 Secondary Marriage in Northern Nigeria. Africa 23:298-323.
SMITH, M. G. 1962 West Indian Family Structure. Seattle: Univ. of Washington Press.
WINCH, ROBERT; McGINNIS, ROBERT; and BARRINGER, HERBERT (editors) (1953) 1962 Selected Studies in Marriage and the Family. Rev. ed. New York: Holt.

III MARRIAGE ALLIANCE
All societies prohibit marriage with certain relatives, but some societies complement this prohibition by prescribing, or preferring, marriage with other relatives. In this way two kinds of cousins are sometimes distinguished, marriage being prohibited between those who are children of siblings of the same sex ("parallel cousins"), while it is prescribed between children of siblings of opposite sex ("cross-cousins"). This disposition is generally accompanied by exogamy. This article attempts to sum up recent developments in the theory of cross-cousin marriage.

Descent and alliance

The expression "marriage alliance," in which "alliance" refers to the repetition of intermarriage between larger or smaller groups, denotes what amounts to a special theory of kinship, a theory developed to deal with those types of kinship systems that embody positive marriage rules, though it also affords certain general theoretical insights regarding kinship. Two points may be noted at the outset: (1) The combination of the positive marriage rule with exogamy, or at the very least with a prohibition against marriage between parallel cousins, is essential to the type of system under description here; a preference for marriage with
the father's brother's daughter, as found among some Islamic peoples, is a quite different phenomenon. (2) The approach here presented is essentially common to several writers, though an element of personal interpretation is inevitable. In the initial stages of kinship studies, the reconstruction of fanciful marriage rules (or mating arrangements) as having supposedly existed in the past was widely used in order to explain seemingly strange ways of classifying relatives (kinship terminologies). This practice has brought discredit, in the eyes of some, to the study of both marriage rules and terminologies. In 1871 Lewis Henry Morgan made two assumptions: (1) terminology reflects behavior, and hence, (2) if a terminology cannot be understood from present behavior, it must be because the behavior it reflects belongs to the past. [See the biography of MORGAN, LEWIS HENRY.] Quite apart from the difficulty of reconstructing past behavior, anthropological thought in this matter is still ethnocentric. The underlying assumption is that all peoples entertain the same ideas about kinship; their classifying of relatives in different ways is, therefore, due to differences in behavior. Fully excusable in Morgan, such an assumption is less so today. W. H. R. Rivers recognized the link between an actual marriage rule (symmetrical cross-cousin marriage) and a certain type of terminology (often called "bifurcate merging"). For Rivers, the marriage rule was the cause, the terminology the effect, and he saw his task as explaining the marriage rule itself. [See the biography of RIVERS.] Once again, terminology reflects behavior, and again historical speculation is called in, this time to discover the "origin" of one item, which is in fact essentially a normative trait. In our time the different features of a kinship system are, in practice, often considered in isolation or are hierarchized according to what is assumed to be their degree of reality or determinativeness. 
This tendency, if not found in such crudity as in the past, still exerts considerable pressure even on the best minds, and that it constitutes a major obstacle to the understanding of certain kinship systems can be shown by the example of Australian kinship, a classical subject for kinship theory. In Australian section systems, descent is overstressed; the reasons that may elsewhere justify this emphasis are here misplaced, for it prejudices the consideration of other elements in the system. In writing about Australian kinship systems, authors vie with each other in stressing that in
Figure 1 [panels: KARIERA; ARANDA (North); AMBRYM (Balap)]
symmetrical cross-cousin marriage arrangements, double descent is always present or implied. This is unobjectionable in itself, but in the literature it is accompanied by a bias which makes itself obvious by repetition, whether it be in B. Z. Seligman's attempt to reduce the "type of marriage" to "forms of descent" (1928, p. 534), however strange the latter forms may appear, or in Radcliffe-Brown's overemphasis upon descent, or in Murdock's outbidding of Radcliffe-Brown in this respect. Radcliffe-Brown was not content with finding an underlying matrilineal exogamy in his classic Australian patrilineal systems and with seeing in what is now called "double descent" a widespread principle of Australian kinship. He claimed that his second kind of exogamous group actually "existed," whereas he had only inferred it (1931, pp. 39, 439); the point is insisted upon by Goody (1961, pp. 6 ff). It is perplexing later on to find Murdock opposing Radcliffe-Brown, while praising the same discovery in others; but the crux of the matter is that in Murdock's opinion Radcliffe-Brown had not gone far enough in stressing descent and descent groups, for Radcliffe-Brown had maintained, at another level, the primacy of individual relationships and marriage rules over the arrangement of groups (Murdock 1949, pp. 51 ff.). Actually, the hypothesis of underlying matrilineal exogamy among the Kariera and Aranda accounts for the allocation of alternate generations to different groups. Among them, the patrilineal group is conceived not as a unity over a continuous series of generations but as a duality made up of two alternate generation-sections, called by different names and following different marriage rules (the grandson falling back, so to speak, into the grand-
father's section). This is the simple, concrete sociological fact, widespread in Australia. If we take this for granted, together with intermarriage between the named sections, we can in each case draw a simple diagram of the whole tribe. In Figure 1 the sign [=] denotes intermarriage in both directions, the letters A, B, etc., represent patrilineal groups, and the numbers 1 and 2 are used for the two alternating generation-sections in each patrilineal group. The system of Ambrym (Balap) is easily represented in the same fashion (Deacon 1927). All three systems represent variations on the same theme, the number of patrilineal groups being respectively two, four, and three, the number of sections four, eight, and six. Each of the three systems may be conceptualized as forming a single whole through a regular chain of intermarriage and patrilineal descent. The differences in the arrangement follow necessarily from the numbers of groups (for details, see Dumont 1966). I do not pretend that a second unilineal principle cannot be said to underlie these systems, but only that the above is a simpler view of them. Let us now turn to the general theory that, like the above analysis, recognizes intermarriage as a basic element in those systems which possess a preferential or prescriptive marriage rule. Levi-Strauss We must neglect the scholars who had previously advanced the distinction and description of the types of cross-cousin marriage (e.g. Fortune 1933; Wouden 1935) and start with the general theory of Levi-Strauss. His monumental book Les structures elementaires de la parente (1949) goes far beyond our limits. Josselin de Jong (1952) has
provided an able summary of the book, while Leach (1961) and Needham (1960) have sympathetically, but sharply, criticized its detail. Our concern here is only with its leading ideas. From the present point of view, the work is first of all a comparative study of positive marriage rules, informed by a general theory of kinship. Preferential marriage rules and marriage prohibitions are accounted for within an integrated body of theory. The prohibition of incest is recognized as universal; it is seen as a basic condition of social life. A man cannot take in marriage the women who are his immediate kin; on the contrary, he has to abandon them as wives to others and to receive from others his wife or wives. Levi-Strauss considers this situation as a universal principle which lies beyond sociological explanation—and which implies an opposition between consanguinity and affinity as the cornerstone of kinship systems. He views marriage as predominantly a process of exchange (between one man and other men or between one domestic group and others), and he sees in positive marriage rules devices through which this exchange is directly regulated, giving rise to what he has called "elementary" structures. Let us note that a kinship system is viewed here, starting from its basis in the incest prohibition, as an entirety resting on an opposition and not as a mere collection of features in which one feature might, for a priori reasons, be considered to determine the others. Abstractly, a kinship system is taken as combining a number of features (descent, inheritance, residence, affinity), and an effort is made to characterize the whole by the relations that prevail between the different features. Thus, a system is called harmonic if all transmission between generations takes place in one and the same line, dysharmonic if some features are transmitted patrilineally, others matrilineally. 
The rule of cross-cousin marriage, where it exists, correlates with this. Theoretically three types may be distinguished: bilateral, matrilateral, and patrilateral. In bilateral cross-cousin marriage, the spouse is at the same time mother's brother's child and father's sister's child. Two intermarrying groups exchange women as wives and thus constitute a self-sufficient unit. Levi-Strauss has called this form "closed" or "restricted" exchange (echange restreint) and correlated it with dysharmonic transmission. In opposition to this type, he has stressed the quite different properties and implications of matrilateral cross-cousin marriage. This type had been less clearly recognized by previous writers, though he does not consider the Dutch literature on Indonesia in which the type had been characterized (e.g. Fischer 1935;
1936; Wouden 1935). In this type, a man marries his mother's brother's daughter; a given line B takes wives from a line A and gives wives to a line C, generation after generation. Intermarriage is thus asymmetrical, and if the society is conceived as a number of discrete groups giving and receiving women in marriage, the simplest system is that of a circle: at the end of the series, Z receives from Y and gives to A (called the "circulating connubium" by the Dutch scholars). This is what Levi-Strauss calls "generalized exchange." In opposition to the closed type, it requires at least three groups and may accommodate any number of groups. This type correlates with harmonic transmission, which may be either matrilineal or patrilineal. Here the identity of the intermarrying group emerges from the network of relationships, for one group is not closely dependent on any other single group, nor are two successive generations distinguished. Relatives belonging to different generations within the same group of affines are terminologically equated. Since intermarriage is directionally oriented—a group does not receive wives from the group to which it gives its daughters—there is a probability of difference of status between wife-givers and wife-takers. For a discussion of the further consequences, see Leach (1961, chapter 3; cf. Fischer 1935). The third type, the patrilateral, is only cursorily treated in Levi-Strauss's treatise; it appears there as a kind of abortive crossbreed between the first two types and is omitted here because it is somewhat controversial (Needham 1958b; Lane 1962). Some of the objections that have been leveled at Levi-Strauss's theory can be briefly mentioned. One, forestalled by Levi-Strauss, is that he argues exclusively about viripotestal societies; another is that his idea of marriage is naive, although this is beside the point, since he was actually concerned solely with the forms and implications of intergroup marriage. 
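The formal contrast drawn above between "restricted" and "generalized" exchange can be made concrete with a small sketch. It is offered only as an illustration under stated assumptions: the group labels, the function names, and the representation of an alliance as a (wife-giver, wife-taker) pair are all hypothetical and do not come from Levi-Strauss or from the ethnographic literature.

```python
# Illustrative sketch only: labels, function names, and the pair
# representation (wife-giver, wife-taker) are assumptions.

def restricted_exchange(a, b):
    """Two groups give wives to each other (the "closed" type)."""
    return {(a, b), (b, a)}

def generalized_exchange(groups):
    """Each group gives wives to the next; the last gives to the first."""
    if len(groups) < 3:
        raise ValueError("generalized exchange requires at least three groups")
    n = len(groups)
    return {(groups[i], groups[(i + 1) % n]) for i in range(n)}

def is_symmetric(pairs):
    """True if every wife-giver also takes wives from its wife-taker."""
    return all((taker, giver) in pairs for giver, taker in pairs)

def is_asymmetric(pairs):
    """True if no group receives wives from a group to which it gives."""
    return all((taker, giver) not in pairs for giver, taker in pairs)

closed = restricted_exchange("A", "B")
circle = generalized_exchange(["A", "B", "C"])
```

In this representation the closed type is symmetric with two groups, while the circle is asymmetric and closes only when the last group gives to the first. With two groups a "circle" would collapse into symmetric exchange, which is why the sketch rejects fewer than three.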
A more radical criticism can be directed at the fundamental character and explanatory value of "exchange" in Levi-Strauss's scheme (discussed in Wolfram 1956). To view the prohibition of incest as the basis for the opposition between consanguinity and affinity appears tautological to those who think of consanguinity itself as fundamental and self-explanatory or appears insufficient to those who would like a psychological explanation. Viewing marriage as an exchange may be questioned on two counts. First, it introduces an arbitrary analogy between women and chattels, women being supposed, for instance, to be universally the most prized of "valuables." Second, "exchange" here tends to be given so wide and indeterminate a meaning as to be practically devoid of content. While this is true of "indirect exchange" and even more so of "reciprocity," the notion of exchange is certainly useful within limits. In still another critique of Levi-Strauss, Homans and Schneider (1955) argue, in the last analysis, that to look at kinship systems as wholes having explanatory value in relation to their parts is to resort to "final causes." This critique has itself been carefully refuted by Needham (1962). Developments Since 1949 the Levi-Straussian theory has been tested and has undergone partial modifications and developments. To mention only the major themes, we have first the clear-cut distinction, advocated by Needham, between prescription and preference in marriage rules. He claims that prescription alone has "structural entailments" in the total social system, and that Levi-Strauss has dealt only with prescription or at any rate should have done so (Needham 1962). "Prescription" is here defined more as the characteristic of a system than as simply a marriage rule: it involves the combination of a rule prescribing some relatives and prohibiting others, a corresponding terminological distinction, and a sufficient degree of observation of the rule in practice (Needham 1958a, p. 75; 1958b, p. 212). The advisability of the distinction has been challenged by R. B. Lane (1962, p. 497). At first sight the distinction seems justified, and there is no objection to isolating a clear-cut type of "prescriptive alliance." That there is a danger of underestimating the importance of other types is apparent from the exacting criteria by which the author excludes the recognition of forms of patrilateral intermarriage as "prescriptive" in his sense (Needham 1958b). These latter forms, like preferential marriage in general, do have "structural entailments" of a kind, as we shall see. 
Moreover, the two forms are not easily distinguishable; the distinction, so presented, is more one of levels than of systems (for a recent clarification of this question, see Maybury-Lewis 1965). The main development has probably been a refinement of the concept of alliance and the substitution of a more structural for a more empirical notion. At the start the theory, although anchored in the notion of complementarity, was in large part concerned with the exchange or circulation of women between the major exogamous components of the society. To begin with, three authors have asserted that the units which may be said to exchange women are, in concrete cases, smaller than
the exogamous units. In 1951 Leach sternly insisted—with empirical, if somewhat dogmatic, good sense—that the agents arranging marriages are as a rule the males of the local descent groups, as distinct from the wider exogamous units and from the "descent lines" used in terminological diagrams and often unwittingly reified by the analyst into actual groups (see Leach 1961, p. 56; cf. Needham 1958a). Quite logically, Leach went on to criticize the assumption that a matrilateral marriage rule should necessarily result in the groups intermarrying "in a circle," an idea which Needham, on the other hand, tried to refine (1958a; 1962). A criticism from Berting and Philipsen may also be noted: to be meaningful, they suggest, the "marriage cycles" must be limited in number, and the people themselves must be aware of them (Needham 1961, p. 98). While such "alliance cycles" (Needham) do meaningfully exist in some cases, their existence does not exhaust the function or meaning of marriage alliance. On this all our authors agree, for Levi-Strauss (1962, p. 333) himself recently recognized—if my interpretation is correct—that "conscious rules" have emerged from recent research as more important than their results in terms of "exchange." Leach had pointed out that, in the absence of cycles, the basic relationship is "one of the many possible types of continuing relationship between paired local descent groups" (1961, p. 101). Elsewhere, while marriage alliance does not result in a system of exchange at the level of the group as a whole, it is an integral part of the system of categories and roles as conceived by the people studied (Dumont 1957, pp. 22, 34). Needham has gone furthest in submitting Levi-Straussian structuralism to criticism from the inside and in referring the "mediating" concepts of exchange and reciprocity back to that of (distinctive) opposition (1960, p. 103). 
The more fundamental "integration" is not that of groups but rather that of the categories as it occurs within the social mind: the marriage rule is part and parcel of this system of ideas. Like everything else, social relationships are defined by classification. Studying the "symbolic order" of the Purum and others, Needham (1958a) found that asymmetrical intermarriage, although it could not function with less than three intermarrying or "alliance groups," can be dualistically conceptualized (wife-givers and wife-takers) in accordance with an over-all dualist scheme. Here are found "structural entailments" different from the group arrangements on which attention had first focused. The expression "marriage alliance" thus covers both the general phenomenon of mental integration and the particular phenomenon of group integration. In its restricted field this truly structural theory alone transcends the bias inherent in our own culture. Such expressions as "cross-cousin marriage" are technically useful but basically misleading. Real understanding is reached when the marriage rule understood as marriage alliance is seen as giving affinity the diachronic dimension that we tend to associate only with descent and/or consanguinity. By this means we are able to transcend the limitations of thinking based upon our own society and make comparisons in terms of the basic concepts involved (consanguinity and affinity). Much remains to be done. Certainly the implications of marriage alliance for status, economy, and political organization (i.e., the physiology of the system) should be worked out (Leach 1961, chapter 3). But even regarding the morphology, our analyses are as yet imperfectly structural; we still take too much for granted in the study of terminologies. Before attempting ambitious reconstructions, the basis in comparative data must be strengthened and extended, and we must obtain a clearer view of the limits of the logical integration of features, or conversely, of the plasticity and tolerance of systems, which can in some cases go so far as to deny in effect the ideological primacy postulated above in principle. Louis DUMONT BIBLIOGRAPHY
DEACON, A. BERNARD 1927 The Regulation of Marriage in Ambrym. Journal of the Royal Anthropological Institute of Great Britain and Ireland 57:325-342.
DUMONT, LOUIS 1957 Hierarchy and Marriage Alliance in South Indian Kinship. London: Royal Anthropological Institute.
DUMONT, LOUIS 1966 Descent or Intermarriage? A Relational View of Australian Section Systems. Southwestern Journal of Anthropology 22:231-250.
FISCHER, H. T. 1935 De aanverwantschap bij enige volken van de Nederlands-Indische Archipel. Mensch en maatschappij (Amsterdam) 11:285-297, 365-378.
FISCHER, H. T. 1936 Het asymmetrisch cross-cousinhuwelijk in Nederlandsch Indie. Tijdschrift voor Indische taal-, land- en volkenkunde 76:359-372.
FORTUNE, R. F. 1933 A Note on Some Forms of Kinship Structure. Oceania 4:1-9.
GOODY, JACK R. 1961 The Classification of Double Descent Systems. Current Anthropology 2:3-26. -> Includes comments by 13 scholars on pages 13-21; see especially R. B. Lane's comments on page 16.
HOMANS, GEORGE C.; and SCHNEIDER, DAVID M. 1955 Marriage, Authority, and Final Causes: A Study of Unilateral Cross-cousin Marriage. Glencoe, Ill.: Free Press.
JOSSELIN DE JONG, JAN P. B. DE 1952 Levi-Strauss's Theory on Kinship and Marriage. Mededelingen van het Rijksmuseum voor Volkenkunde, No. 10. Leiden (Netherlands): Brill.
LANE, ROBERT B. 1962 Patrilateral Cross-cousin Marriage. Ethnology 1:467-499.
LEACH, EDMUND R. 1961 Rethinking Anthropology. London School of Economics and Political Science, Monographs on Social Anthropology, No. 22. London: Athlone.
LEVI-STRAUSS, CLAUDE 1949 Les structures elementaires de la parente. Paris: Presses Universitaires de France.
LEVI-STRAUSS, CLAUDE (1962) 1966 The Savage Mind. Univ. of Chicago Press. -> First published in French.
MAYBURY-LEWIS, DAVID H. P. 1965 Prescriptive Marriage Systems. Southwestern Journal of Anthropology 21:207-230.
MURDOCK, GEORGE P. 1949 Social Structure. New York: Macmillan. -> A paperback edition was published in 1965 by the Free Press.
NEEDHAM, RODNEY 1958a A Structural Analysis of Purum Society. American Anthropologist New Series 60:75-101.
NEEDHAM, RODNEY 1958b The Formal Analysis of Prescriptive Patrilateral Cross-cousin Marriage. Southwestern Journal of Anthropology 14:199-219.
NEEDHAM, RODNEY 1960 A Structural Analysis of Aimol Society. Bijdragen tot de taal-, land- en volkenkunde (The Hague) 116:81-108. -> Text is in Dutch and English.
NEEDHAM, RODNEY 1961 Notes on the Analysis of Asymmetric Alliance. Bijdragen tot de taal-, land- en volkenkunde (The Hague) 117:93-117.
NEEDHAM, RODNEY 1962 Structure and Sentiment: A Test Case in Social Anthropology. Univ. of Chicago Press.
RADCLIFFE-BROWN, A. R. 1931 The Social Organization of Australian Tribes. Oceania 1:34-63, 206-246, 322-341, 426-456.
SELIGMAN, BRENDA Z. 1928 Asymmetry in Descent, With Special Reference to Pentecost. Journal of the Royal Anthropological Institute of Great Britain and Ireland 58:533-558.
WOLFRAM, E. M. S. 1956 The Explanation of Prohibitions and Preferences of Marriage Between Kin. Ph.D. dissertation, Oxford Univ. -> See especially Chapter 8, "The Explanation of Incest and Marriage Regulations."
WOUDEN, F. A. E. VAN 1935 Sociale structuurtypen in de Groote Oost. Leiden (Netherlands): Ginsberg.
MARSH, GEORGE PERKINS George Perkins Marsh (1801-1882), an American geographer, is known today primarily as the founding father of the conservation movement. His contemporaries regarded him as the most comprehensive American scholar of the time. His enduring contributions to knowledge stemmed from an unusual combination of historical and ecological insights. As a social historian, Marsh broke new ground in treating the story of mankind as the history of the use and misuse of resources.
As an ecologist, he saw the history of nature commingled with that of man and traced the motives, the techniques, and the consequences of man's impact on the earth. Although he was not a trained naturalist, Marsh manifested an extraordinary awareness of the fragile interdependence of all aspects of nature, physical and biological, and of their multiform significance for mankind. The scion of a patrician family in frontier Vermont, Marsh graduated from Dartmouth College and practiced law in Burlington, meanwhile engaging in business—farming, lumbering, woolen manufacturing, railroad development, marble quarrying—and in politics. He served in the state legislature and for six years in Congress. As a reward for services to the Whig and Republican parties, he was appointed United States minister to Turkey from 1849 to 1853 and to Italy from 1861 to 1882, the latter an unequaled tenure of office. This diplomatic career gave him an opportunity to travel widely and made possible the leisure essential for his scholarly work. Marsh's first contributions were in linguistics. Attracted by new developments in the study of Teutonic languages, folklore, and cultural origins, he edited the first Icelandic grammar in English, delved into the history and literature of Old Norse and related tongues, and became the American promoter for C. C. Rafn's monumental Antiquitates americanae, a collection of Icelandic sagas bearing on the Iceland-Greenland-Vinland settlements. Marsh's historical studies of the English language were the standard texts during the 1860s, and he was in continual demand as editor, adviser, and critic of dictionaries, including the New English Dictionary. Familiarity with an extraordinary range of source materials in twenty languages made Marsh a first-rate, if not profoundly original, etymologist. His work as a scientific middleman was of more lasting significance. 
In the House of Representatives, Marsh helped to create, to staff, and to shape the Smithsonian Institution, guiding its early ventures in archeology and in natural science and guarding its research endowment against congressional incursions. He himself added animal specimens from Turkey, Egypt, and Palestine to the Smithsonian collections. On his return from Turkey he strongly urged the introduction of camels as beasts of burden in the American West—an enterprise initially successful but aborted by the Civil War. A utilitarian zeal directed and inspired Marsh's best work. As early as the 1840s he publicly advocated measures of physical improvement and
conservation: the establishment of nurseries for forestry research and the regulation of logging to prevent excessive and flashy runoff and consequent flooding and desiccation. Not until he returned to Italy in 1861, however, did Marsh bring together the materials he had long been collecting into a systematic analysis of man's manipulation of the natural environment. Published in 1864, Man and Nature showed how man differs from nature, how nature operates within itself, and what happens to woods and waters, mountains and deserts, when men clear, farm, dam, and build. In surveying man's impact, conscious and unconscious, Marsh did not overlook the improvements but stressed the accompanying and resulting damage. Technology had enabled man to derange natural balances and might ultimately, Marsh feared, reduce the surface of the earth "to such a condition of impoverished productiveness, of shattered surface, of climatic excess, as to threaten the depravation, barbarism, and perhaps even extinction of the species" ([1864] 1965, p. 44). Man and Nature was the first book to controvert the American myth of an inexhaustible earth. Its immediate impact was more moral than practical, but by 1907, when it had gone through three editions, the basic principles of Man and Nature were embodied in the national conservation program. In geography, Marsh is also important as an early and effective critic of environmental determinism, then popularized in the works of Arnold Guyot. How could man be viewed as the product of environment, when man himself had the power to alter that environment—and had in fact done so over most of the earth's surface? The mistakes man had made through ignorance or greed could, Marsh thought, in most cases be rectified by applying scientific principles of land management, by avoiding waste, and by public control of resources. DAVID LOWENTHAL [For discussion of the subsequent development of Marsh's ideas, see CONSERVATION.] WORKS BY MARSH
1856 The Camel: His Organization, Habits, and Uses, Considered With Reference to His Introduction Into the United States. Boston: Gould & Lincoln.
(1860a) 1885 Lectures on the English Language. Rev. ed. New York: Scribner.
1860b The Study of Nature. Christian Examiner 68:33-62. -> An unsigned article.
(1862) 1898 The Origin and History of the English Language, and of the Early Literature It Embodies. Rev. ed. New York: Scribner.
(1864) 1965 Man and Nature: Or, Physical Geography as Modified by Human Action. Edited by David Lowenthal. Cambridge, Mass.: Harvard Univ. Press. -> See the introduction to the 1965 edition by David Lowenthal. Revised editions were published between 1874 and 1907 as The Earth as Modified by Human Action.
WORKS ABOUT MARSH
KOOPMAN, HARRY L. 1892 Bibliography of George Perkins Marsh. Burlington, Vt.: Free Press Association.
LOWENTHAL, DAVID 1958 George Perkins Marsh: Versatile Vermonter. New York: Columbia Univ. Press.
MARSH, CAROLINE C. 1888 Life and Letters of George Perkins Marsh. Vol. 1. New York: Scribner.
MARSHALL, ALFRED Alfred Marshall (1842-1924) is one of the great names in the development of contemporary economic thought, and the book by which he is most widely known—Principles of Economics—is one of the high points in the literature of social science. His influence was enormous; so much so that the first 25 years of twentieth-century economics may be described as the "age of Marshall" and subsequent developments as extensions of and countermovements to his influence. Moreover, even when due allowance is made for the natural progress of economic science since Marshall's time, it is remarkable how much of the Marshallian framework remains. These well-known points require restatement because the positive effects of the Marshallian influence are questioned today as perhaps never before. One could agree with criticisms if they were merely objections to the view sometimes expressed that "it's all in Marshall," meaning that little or no progress has been made in economics since he wrote. It would indeed be deplorable if scientific ideas worked out almost one hundred years ago were still the last word. (An analogy with the positions of Marx and Freud is appropriate here.) However, much of the contemporary criticism goes deeper than this; it argues that the Marshallian tradition checked the development of economics by diverting attention from real issues (by which is primarily meant macrotheory) much as Ricardo was alleged to have done in an earlier generation. The merit of these criticisms will be examined carefully later in this article. Alfred Marshall was born in Clapham—then a leafy London suburb—in 1842. His father, John Marshall, held the respectable middle-class position of cashier in the Bank of England, and the family lived in modest comfort. Marshall's father was of a rather severe, evangelical frame of mind, almost a textbook example of what is loosely called Victorianism, and closely supervised his son's education. 
This paternal control and repression had a marked and lasting effect on Marshall; his
pronounced tendency toward hypochondria, his unwillingness to commit himself unequivocally in print without massive documented qualification, his fear of indolence and idleness, and his ultimate rejection of "pure pleasure" activities (such as mathematics) have their roots in the experiences of his early years. His education was planned as basically a preparation for ordination in the Anglican church. He was expected to go up to Oxford with a classics scholarship, which would lead to a fellowship and a church living. However, he rejected this plan—rebelling not against orthodox theology but against further study of the classics—and with funds borrowed from an uncle proceeded to St. John's, Cambridge, where he read mathematics. Marshall was one of the best mathematics students of his generation in England (in 1865 he was second wrangler in the tripos examination). This is an important point to bear in mind in evaluating his ambivalent attitude toward the use of mathematical methods in economics—in any event, his criticisms were not based on ignorance. Marshall came into economics with much more mathematics training than did Jevons or Walras. After graduation Marshall was elected to a fellowship in mathematics and gradually came under the influence of a group of philosopher-dons who were increasingly concerned with the social problems of industrial England. Marshall's interests centered initially on philosophy and ethics, which were then still at the frontier of social science, but worry about social conditions and the realization that poverty was at the root of many social evils led him into economics. Indeed, to Marshall the problem of poverty was not only central to the study of economics but its ultimate rationale. As he later wrote in the Principles, "the study of the causes of poverty is the study of the causes of the degradation of a large part of mankind" ([1890] 1961, vol. 1, p. 3). 
In 1877 he married Mary Paley, a former student of his and one of the first women to be educated at Cambridge. Upon his marriage he was forced to resign his fellowship. He was for a short while principal and professor of political economy at the then University College of Bristol, became a fellow at Balliol in 1883 (after the requirement of celibacy had been eliminated), and the following year returned to Cambridge, to the chair of political economy vacated by Henry Fawcett; there he reigned until his retirement in 1908, when he was succeeded by his star pupil, A. C. Pigou. Marshall's published output was not large, especially considering that he was active almost until the time of his death. Several books—The Pure
Theory of Foreign Trade and The Pure Theory of Domestic Values (1879a), Principles of Economics (1890), Industry and Trade (1919), Money, Credit & Commerce (1923), and The Economics of Industry (1879b), written jointly with Mary Marshall (which he tried to have withdrawn for complex personal reasons not bearing on its merit); a handful of articles, mainly reprinted in the Memorials of Alfred Marshall (1925), edited by Pigou; and a series of official memoranda and evidence before royal commissions (contained in Official Papers, a volume published in 1926) make up his total written contribution. Marshall's reluctance to commit himself to print—the Principles did not appear until he was 48—makes it difficult to assess his originality. Ideas first published in the 1890s, such as Marshall's statement of the theory of marginal utility, had been worked out and presented orally by him in the late 1860s, i.e., before the publication of the theory in the works of Jevons, Walras, and Menger. As J. M. Keynes put it in his famous obituary of Marshall, "The task of expounding the development of Marshall's economics is rendered difficult by the long intervals of time which generally separated the initial discovery and its oral communication to pupils from the final publication in a book to the world outside" (1924, p. 322). Intellectual background. Efforts to disentangle the various influences on Marshall's thinking as an economist are made difficult by his modesty—his desire to emphasize the continuity of thought—and also by his rather confused accounts of these influences. Marshall's first reading in economics was Ricardo and Mill; he described his early efforts as attempts to translate the ideas of these writers into differential equations. The most important single influence was surely Mill's Principles of Political Economy (1848), and a good way to get perspective on Marshall's contribution is to compare the two Principles. 
Also, what little mathematical economics then existed was open to Marshall, although it was not to most of his contemporaries. He clearly learned a lot from Cournot, especially about the use of continuous functions in economics. Thünen's Der isolierte Staat (1826-1863), with its hints of marginal productivity analysis, was also influential. German (Hegelian) philosophy and the historical school of economists are commonly mentioned as influencing him (Marshall studied in Germany for a year). However, it is difficult to see concrete evidence of these systems of thought in his work. There is no dialectic and no historicism, although in his concern with empirical investigation he was closer to the historical school than to the English classical school. His emphasis on the continuity of growth and his perpetual references to biology suggest the influence of social Darwinism, acquired through Herbert Spencer.
Methods
Much of the discussion of Marshallian economics deals with his methods of analysis. These methods are not particular hypotheses or models proposed by Marshall but, rather, represent ways of setting up a problem or partitioning it so that it can be solved. The central Marshallian method is usually termed "partial analysis" or "partial equilibrium analysis" and is often loosely referred to as the ceteris paribus approach. The Marshallian partial equilibrium approach is frequently contrasted with the method of general equilibrium associated with Léon Walras, and the contrast is usually considered unfavorable to Marshall. Indeed, this approach is sometimes regarded as one of the major weaknesses Marshall bequeathed to economic science. Since the question of partial versus general equilibrium has loomed so large in the literature, some discussion of the central issues is imperative. As Marshall realized, the general equilibrium approach is not de facto a fruitful approach to such practical problems as measuring the effect of an import duty on the price of a commodity or the effect of a fall in the final product price on the demand for a particular grade of labor. It is not very helpful to be told that "everything depends on everything else" and that a change in one parameter will have effects throughout an economic system. Partial analysis is a method by which an economy is partitioned so that the main effects of a parameter shift in a particular micromarket can be highlighted without considering the spillover into other markets; hence, this method also ignores the feedback effects from the spillover.
There are, of course, obvious dangers inherent in this method, but the answer lies, not in the general equilibrium approach, but in better specification of the partial model. [See ECONOMIC EQUILIBRIUM and the biography of WALRAS.]
Let us take a specific example to illustrate the use of the ceteris paribus approach of partial equilibrium analysis and the related concept of comparative statics. We can draw up a demand schedule for a commodity and show the amount demanded per unit of time as a decreasing function of the price of the good. The relationship is ceteris paribus, i.e., it assumes that other factors influencing demand (such as the price of substitutes) are given, as are factors such as incomes, tastes, and expectations. In a free market, if an equilibrium exists it will be where supply equals demand. If one of the ceteris paribus conditions is relaxed, the demand curve shifts, and the new partial equilibrium solution is then considered. This leads to the comparison of the two sets of equilibrium values of the variables under discussion. The method of comparing equilibrium solutions is called comparative statics because it does not permit the tracing of the time paths (between the two points of equilibrium) of the variables involved. [See STATICS AND DYNAMICS IN ECONOMICS.]
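As a rough numerical sketch of the comparative-statics comparison just described, the Python snippet below solves a linear demand and supply system before and after a shift in demand. The linear functional forms, the function name `equilibrium`, and every parameter value are illustrative assumptions and are not taken from Marshall.

```python
def equilibrium(a, b, c, d):
    """Market-clearing (price, quantity) for hypothetical linear curves.

    Demand: Qd = a - b * P   (a shifts with incomes, tastes, other prices)
    Supply: Qs = c + d * P
    """
    price = (a - c) / (b + d)      # set Qd equal to Qs and solve for P
    quantity = a - b * price       # read the quantity off the demand curve
    return price, quantity


# Ceteris paribus position: everything other than own price is held fixed.
before = equilibrium(a=100.0, b=2.0, c=10.0, d=1.0)

# Relax one ceteris paribus condition (say, incomes rise), which moves the
# demand intercept; supply conditions are unchanged.
after = equilibrium(a=130.0, b=2.0, c=10.0, d=1.0)

print("before shift:", before)
print("after shift: ", after)
```

The sketch only compares the two equilibrium positions. In keeping with the definition of comparative statics, it contains no adjustment rule and so says nothing about the time path between them.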
Marshall's ultimate objective was to develop a full-fledged theory of dynamic change and growth. In the preface to his Principles he wrote, "The main concern of economics is ... with human beings who are impelled, for good and evil, to change and progress. Fragmentary statical hypotheses are used as temporary auxiliaries to dynamical—or rather biological—conceptions: but the central idea of economics, even when its Foundations alone are under discussion, must be that of living force and movement" ([1890] 1961, vol. 1, p. xv). The immediate objective of Marshall's formal analysis was more limited: namely, the comparison of static equilibrium positions. Yet, even within this restrictive framework he was able, by his use of the time-period concept, to approximate dynamic analysis. His approach was to divide the adjustment, say, of price to changing demand or supply conditions into a series of adjustment periods. These periods should be regarded as measured by operational, not clock, time: the market period for one sector or industry may be (in terms of clock time) a longer one than the market period for another industry. The important consideration is which ceteris paribus assumptions are relaxed in successive periods. Marshall's time division is as follows: the market period, the short period, and the long period. The market period takes the production of the commodity in question as fixed, so that supply can vary only if sellers have a reserve price for their own product. The condition for equilibrium (for all time periods) is that the market be cleared, i.e., that demand equal supply. Short-run equilibrium considers supply to be partially adaptable, in the sense that increased production can occur but capital equipment and certain other overhead items are held constant.
In modern economics, analysis of this short period with partial adaptation is equivalent to an analysis of the law of variable proportions, although it is not certain that Marshall himself was precisely clear about the distinction between variable proportions and returns to scale. The Marshallian long period allows for optimal capital stock adjustment. The market is cleared within a framework in which supply can be considered to be fully adaptable because all factors (excluding entrepreneurship) have adjusted to the situation. It was by means of this differential adjustment of supply that Marshall restated, within the supply and demand framework, his theory of value. The classical emphasis on costs is now seen as a particular hypothesis: that in long-run adjustment there are constant returns to scale. Figure 1 illustrates this. SS is the fixed-stock supply curve (on the assumption of zero reserve price); S'S' the short-run supply curve and S"P the long-run supply curve. With demand at DD, the long-run price is OP. Now let demand rise to D'D'. Then long-run equilibrium price will again be OP, but price will pass through the stages OP" and OP', and quantity will increase. [See DEMAND AND SUPPLY.]
[Figure 1. Vertical axis: price; horizontal axis: quantity.]
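The sequence of market-period, short-period, and long-period prices described for Figure 1 can be traced in a small Python sketch. It assumes a linear demand curve, a linear short-run supply curve through the initial equilibrium, and a horizontal long-run supply curve (constant returns). The function names and all numbers are hypothetical and serve only to reproduce the qualitative pattern of the stages OP", OP', and OP.

```python
def market_period(intercept, slope, fixed_stock):
    """Fixed stock, zero reserve price: demand alone sets the price."""
    price = intercept - slope * fixed_stock
    return price, fixed_stock


def short_period(intercept, slope, base_price, base_quantity, supply_slope):
    """Output can expand along an upward-sloping short-run supply curve.

    Demand: P = intercept - slope * Q
    Supply: P = base_price + supply_slope * (Q - base_quantity)
    """
    quantity = (intercept - base_price + supply_slope * base_quantity) / (
        slope + supply_slope
    )
    price = intercept - slope * quantity
    return price, quantity


def long_period(intercept, slope, long_run_price):
    """Fully adaptable supply: price returns to the constant long-run cost."""
    quantity = (intercept - long_run_price) / slope
    return long_run_price, quantity


# Initial long-run equilibrium with demand DD (hypothetical numbers).
start_price, start_quantity = long_period(50.0, 1.0, 10.0)

# Demand rises to D'D' (the intercept moves from 50 to 62).
market = market_period(62.0, 1.0, start_quantity)                       # stage OP"
short = short_period(62.0, 1.0, start_price, start_quantity, 0.5)      # stage OP'
long_run = long_period(62.0, 1.0, 10.0)                                 # back to OP

for label, (p, q) in [("market", market), ("short", short), ("long", long_run)]:
    print(f"{label:>6} period: price={p:.1f}, quantity={q:.1f}")
```

Under these assumptions the price jumps, falls part of the way back, and finally returns to its original long-run level, while quantity rises at each stage.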
Marshall also hinted at the analysis of a fourth time period, in which factor supplies are allowed to adjust to changes in their underlying determinants. In the absence of innovation, in this period the economy reaches the full equilibrium solution for a stationary state. Marshall was cautious and basically skeptical about the use of mathematics and theoretical statistics in economics; for better or worse, he did not foresee the mushrooming of mathematical economics and econometrics. The Marshallian attitude— which became embodied in the Cambridge tradition and especially in the work of Pigou and Keynes— is seen in the Principles, where the mathematical statements are in footnotes and in a mathematical appendix. The grounds for minimizing the formal use of mathematics in the final presentation (although not in preparation) were twofold: first, the need to communicate; and second—and much more important—the fear that sets of equations necessarily omit or distort many relevant influences
and considerations. Marshall set out the matter squarely in a letter to A. L. Bowley dated February 27, 1906: [A] good mathematical theorem dealing with economic hypotheses was very unlikely to be good economics: and I went more and more on the rules— (1) Use mathematics as a shorthand language, rather than as an engine of inquiry. (2) Keep to them till you have done. (3) Translate into English. (4) Then illustrate by examples that are important in real life. (5) Burn the mathematics. (6) If you can't succeed in 4 burn 3. This last I did often. ([1925] 1956, p. 427) Marshall had grave doubts as to the reasonableness of the assumptions underpinning the then existing techniques of theoretical statistics (which meant, basically, regression analysis) when applied to social science data. He had no doubts, however, about the need to be steeped in the empirical facts of any situation under analysis. He always emphasized deep statistical and historical knowledge of the area being investigated and referred again and again to the complexity of economic problems and the naivete of simple hypotheses. Like Adam Smith, Marshall had a profound knowledge of the workings of economic systems. When asked, for example, by the Gold and Silver Commission of 1887/1888, "Do you speak with knowledge . . . of the working classes?" he replied (somewhat pompously but with all honesty), "I speak from personal observation ranging over many years, and a study of almost everything of importance that has been written on the subject" (Official Papers, p. 99). Marshall's oft-quoted definition of economics— "the study of man in the ordinary business of life" —was not an attempt to demarcate the discipline precisely from other social sciences. Marshall's basic view on the scope of economics is best expressed in the sentence "The less we trouble ourselves with scholastic inquiries as to whether a certain observation comes within the scope of economics, the better" ([1890] 1961, vol. 1, p. 27). 
He used the term "ordinary business of life" to emphasize the point that economics is not the study of the workings of a fictional economy populated by abstract economic men: it is concerned with the real world around us.
Contributions to theory
Marshall's central theoretical contribution was the working out of the rigorous economics of the stationary state. For Marshall this was not, of course, the ultimate end of economics; it was indeed but the preface. To point the way to the conclusion (the working out of a full-fledged growth model), Marshall interlarded his stationary-state framework with bits and pieces of the dynamic process. It is this mixture that makes Marshall's Principles such difficult reading for some.
Theory of demand. Marshall developed utility theory for two reasons: first, to place restrictions on demand functions; and second, to create what he hoped would be powerful tools of welfare economics. The Marshallian demand curve relates the demand for a commodity per unit of time to its own price. The relationship is ceteris paribus; in particular, other prices and incomes are assumed constant. There are certain ambiguities in this statement of inclusions within ceteris paribus, but for the moment these are set aside. Marshall's generalized "law of demand" states that the price of a good and the quantity demanded are inversely related. This restriction on demand functions is derived a priori from the form of the utility function that he postulated, which is laid out most clearly in the mathematical appendix to the Principles. He used an additive, cardinal utility function; this means that one may think of utility as being a measurable quantity (although in practice Marshall spoke of it as being only indirectly measurable, at the margin, by price) and also that the total utility that a consumer derives from his consumption of goods and services is the sum of the individual utilities derived from the consumption of each item in his budget. Symbolically, we have
$$U = U_1(X_1) + U_2(X_2) + \cdots + U_n(X_n),$$
where $U_i$ is the utility derived from the consumption of the $i$th commodity and $U$ is total utility. The basic restriction given by the additive nature of the function is that interrelationships between goods are excluded (all cross-partial derivatives are zero). Further, the law of diminishing marginal utility operates with respect to each good; this means that extra units consumed of a given commodity will increase total utility at a decreasing rate. Thus, the addition to total utility induced by the $n$th unit of a commodity will be less than the increase in utility induced by the $(n-1)$st unit. In terms of the function above,
$$\frac{\partial U}{\partial X_i} > 0, \qquad \frac{\partial^2 U}{\partial X_i^2} < 0.$$
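A small numerical check may make these properties concrete. The sketch below assumes a logarithmic form for each good's utility, which is one convenient additive function and not a form attributed to Marshall; the weights and the helper names `total_utility` and `marginal_utility` are likewise hypothetical.

```python
import math

# Hypothetical taste weights for two goods (an assumption for illustration).
WEIGHTS = (2.0, 1.0)


def total_utility(quantities):
    """Additive utility: the sum of separate utilities, one per good."""
    return sum(w * math.log(x) for w, x in zip(WEIGHTS, quantities))


def marginal_utility(quantities, i, step=1e-6):
    """Forward-difference estimate of the marginal utility of good i."""
    bumped = list(quantities)
    bumped[i] += step
    return (total_utility(bumped) - total_utility(quantities)) / step


# Marginal utility of good 0 is positive and falls as more of it is consumed.
mu_low = marginal_utility((1.0, 1.0), 0)
mu_high = marginal_utility((2.0, 1.0), 0)
print("MU of good 0 at X0=1:", round(mu_low, 4))
print("MU of good 0 at X0=2:", round(mu_high, 4))

# Additivity: changing the quantity of good 1 leaves the marginal utility
# of good 0 unchanged (zero cross-partial), up to numerical noise.
mu_other = marginal_utility((1.0, 5.0), 0)
print("MU of good 0 when X1=5:", round(mu_other, 4))
```

Any additive function with concave components would show the same pattern; the logarithm is used only because its derivatives are easy to verify by hand.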
It is assumed that the consumer seeks to maximize utility, given incomes and prices. The principle of substitution comes into full play here. By substituting at the margin, a consumer reaches his maximum utility point. Maximizing the utility function, subject to the budget constraint (i.e., that the quantities of all goods and services purchased multiplied by their respective prices equals total income), yields the well-known Marshallian first-order conditions for a maximum. These can be stated in the following equivalent terms:
$$\text{(1)}\quad \frac{MU_i}{MU_j} = \frac{P_i}{P_j}, \quad \text{for all } i \text{ and } j,$$
$$\text{(2)}\quad \frac{MU_1}{P_1} = \frac{MU_2}{P_2} = \cdots = \frac{MU_n}{P_n} = \lambda,$$
$$\text{(3)}\quad MU_i = P_i \lambda, \quad \text{for all } i.$$
Here $P_i$ represents the price of commodity $i$, and the constant term $\lambda$ represents the marginal utility of income. A fall in the price of commodity $i$ must lead to more of the commodity's being bought; this must be so to keep the equalities listed in eqs. (1) to (3). That formulation (as Marshall realized) avoids the implications of the income effects of a price change; this is the purpose of assuming constant marginal utility of income. [See UTILITY.]
Rigorously applied, the Marshallian assumptions appear to restrict elasticities of demand to unity, but it is clear from the body of the Principles that Marshall did not contemplate this restriction. Strictly speaking, a ceteris paribus demand curve requires that real income be held constant as price changes, so as to eliminate from the analysis the income effect of the price change. Holding money income constant is insufficient, since the real value of money income is its command over commodities and if commodity prices change, this changes also. Marshall solved this problem intuitively, by talking in terms of money income but postulating small changes in the prices of commodities that make up a small portion of the consumer's budget, so that the error involved in using money income is "of the second order of small quantities" ([1890] 1961, vol. 1, p. 132). Milton Friedman has since put the demand curve on a more satisfactory analytic footing. But for Marshall the object of demand theory was not just to place testable restrictions on demand functions; he also regarded the demand curve and the allied concept of consumer surplus as powerful tools of welfare economics. We shall consider this aspect after we have looked at Marshall's contribution to production theory and the theory of the firm.
Theory of production. Marshall spoke in terms of "real costs" when considering costs of production.
By "real" he meant ultimately the disutility of both the labor and the waiting involved in producing and bringing a commodity to market. The emphasis on real cost seems to contrast with the Austrian notion of opportunity cost, but in fact it
is easy to reconcile the two concepts. In any case, in spite of Marshall's emphasis, he rather too easily assumed the equality of real and money costs and proceeded with his analysis in terms of the latter. Central to his theory of cost and production is the principle of substitution, which works here the same way it does in his consumer theory. The entrepreneur substitutes at the margin until the total cost of a given output is at a minimum or, what is the same thing, until the output from a given set of inputs is maximized. [See COST.] Marshall was confused about the so-called laws of production and especially about the distinction between what has come to be called "variable proportions" and returns to scale; so, of course, was the whole profession until Viner's classic article of 1931. Marshall tended to compare decreasing returns with increasing returns, as though they were similar. Although he postulated that diminishing returns were historically connected with agriculture and with a situation in which the labor-capital input had grown relative to (fixed) land, he did not see the logical connection between the principle of substitution and the law of variable proportions. Increasing returns, looked at in an analytic manner, occur where increase in output is proportionately greater than the simultaneous increase of all inputs. In the course of his discussion of increasing returns, Marshall made the crucial distinction between internal and external economies, from which the whole notion of externality started. Internal economies are "those dependent on the resources of the individual houses of business" in an industry, while external economies are "dependent on the general development of the industry." Internal economies, where present, produce a falling long-run marginal cost curve for a firm, and hence threaten the stability of competition. 
Marshall realized this, and his "life cycle" theory of entrepreneurship was meant as a partial explanation of the survival of competition. External economies, on the other hand, are compatible with competition but raise serious welfare problems. [See EXTERNAL ECONOMIES AND DISECONOMIES.] Central to Marshall's discussion of the theory of the firm is the concept of the representative firm —a notion which is not only tenuous and vague but apparently unnecessary for Marshall's own purposes, as critics like Lionel Robbins were quick to point out. Marshall's definition of the representative firm gets us nowhere; it is only by specifying the problem with which he was trying to cope that we see the purpose of the concept. Two questions in particular worried Marshall. First, in the real world, firms clearly are capable of
expanding at falling marginal cost, yet industries do not become monopolized. Marshall's answer lies partially in the representative firm. The second, a closely related problem, concerns the estimate of the supply price of a product where industry output is taken as a given but the group of firms making up the industry are in a life cycle of birth, growth, decay, and death. According to Marshall's theory, the entrepreneurial life cycle prevents the continuing expansion of any one firm— an idea more appropriate to the days of small business than to those of the large corporation, where management is not dynastic. But apart from Marshall's exercise in social evolution, we still have the interesting problem, with disequilibrium at the firm level, of estimating supply price and, more generally, the industry supply curve. In contemporary economics the static solution to these problems, under perfect competition, is to sum the firms' marginal cost curves to obtain the industry supply curve. This supply price is equal to any firm's marginal cost; for Marshall, however, with his continual search for the dynamic solution, this answer was inadequate. A firm picked at random would not necessarily be typical in the sense that its costs would correctly reflect the sustainable degree of efficiency and level of economies for its aggregate output. It might be a firm about to disappear or one in the very early stages of growth. The answer, Marshall believed, was to identify a "typical" or representative firm. [See FIRM, THEORY OF THE.] What is the typical market structure in Marshall's world? Nowhere in his work do we find the perfectly elastic demand curve of the current textbook version of perfect competition. It is clearly not monopoly (for Marshall reserves this case for special treatment), but it is doubtful whether, as has recently been suggested, the typical Marshallian market can be interpreted as monopolistically competitive in Chamberlin's sense. 
In spite of Marshall's remark that in the short run firms may have to lower price to increase sales, his basic view is that price is a parameter in the typical firm's plans. Theory of distribution. Marshall's theory of distribution is outlined on two levels. On the assumption of fixed coefficients (such as Walras assumed in the first edition of the Elements), Marshall worked out his theory of joint demand. In this case there is no substitution within a given productive process; the principle of substitution is inoperative, and hence the marginal productivity theory is not applicable. To divide up the total product among the cooperating factors in the case of fixed proportions, Marshall used the law of derived demand: "The demand schedule for any factor of production
of a commodity can be derived from that of the commodity by subtracting from the demand price of each separate amount of the commodity the sum of the supply prices for corresponding amounts of the other factors" ([1890] 1961, vol. 1, p. 383). But at best this is a clumsy approach, and it did not represent Marshall's basic position. More generally he worked out a complete marginal productivity theory, and although he expressly denied that it was a theory of distribution, we must take this as typical Marshallian caution. His objections to regarding marginal productivity as a theory of wages were twofold: first, supply conditions must be included in the analysis; and second, to the extent to which labor and capital are in fixed proportions, it is not possible to identify the marginal product. The concept of quasi rent, which filled an important gap in classical analysis, is also important for Marshallian distribution theory. Rent theory explained the return to fixed land, but there was nothing in classical analysis to explain the return to capital equipment already in existence. Marshall used the term "quasi rent" to explain rewards to any factors in inelastic supply and specifically applied the analysis to capital equipment in the short run. [See RENT.]
Equilibrium conditions. We have already discussed Marshall's division of the problem of price determination into a series of different time-period equilibrium positions. The general rule is that the longer the period of adaptation allowed, the more responsive is supply to price changes. To the extent to which long-run supply is perfectly elastic, Marshall saw a correlation with the classical cost-of-production theory of value.
We have still to consider Marshall's conditions for market stability; these appear to differ significantly from those laid down by Walras and Hicks, which are commonly studied in elementary dynamics. Marshall's conditions for a micromarket to be stable are as follows: (a) for quantity smaller than equilibrium quantity, demand price must be greater than supply price; (b) for quantity larger than equilibrium quantity, demand price must be less than supply price. If a market that has "normal" demand and supply relationships, i.e., a downward-sloping demand curve and an upward-sloping supply curve, is stable in the Walras-Hicks sense, it is also stable in Marshall's sense. Divergences of interpretation occur, however, in other cases. Figure 2 shows a market where both stability conditions are satisfied. At price P1, which is above equilibrium, excess demand is negative (the Hicksian condition) and quantity bought is less than equilibrium, with demand price, OP1, greater than supply price, OP2. The market shown in Figure 3 is stable in terms of Marshall's conditions but unstable in terms of the Walras-Hicks conditions (e.g., for price above equilibrium it has excess demand). Which specification is correct cannot be determined a priori but is a matter for empirical investigation. It is clear that the two solutions assume different behavioral reactions of buyers and sellers. More important, perhaps, the notion of a supply curve has to be specified much more carefully. [See DEMAND AND SUPPLY and ECONOMIC EQUILIBRIUM.]
[Figures 2 and 3. Vertical axis: price; horizontal axis: quantity.]
Marshall also attempted to formalize and explain Mill's work on the conditions for equilibrium, and the stability of equilibrium, in foreign trade. He did this by using the techniques of offer curves. His work in this field was not formally published until 1923, when parts of it were appended to his Money, Credit & Commerce. However, it was circulated privately, through the efforts of Henry Sidgwick.
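Returning to the two stability criteria discussed above, the contrast can be sketched for linear curves in a few lines of Python. The sketch assumes that each curve is summarized by its slope dQ/dP (nonzero in both cases). The Walras-Hicks test then asks whether excess demand falls as price rises, and the Marshallian test asks whether the excess of demand price over supply price falls as quantity rises. The function names and slope values are hypothetical.

```python
def walras_hicks_stable(demand_dq_dp, supply_dq_dp):
    """Price adjusts: stable if excess demand (Qd - Qs) falls as price rises."""
    return demand_dq_dp - supply_dq_dp < 0


def marshall_stable(demand_dq_dp, supply_dq_dp):
    """Quantity adjusts: stable if demand price minus supply price falls as
    quantity rises. The slopes of the price-against-quantity curves are the
    reciprocals of dQ/dP, so both slopes are assumed to be nonzero."""
    return 1.0 / demand_dq_dp - 1.0 / supply_dq_dp < 0


# "Normal" market: downward-sloping demand, upward-sloping supply.
normal = (-2.0, 1.0)

# A market with a downward-sloping supply curve, chosen (as an assumption)
# so that a price above equilibrium produces excess demand.
backward = (-1.0, -2.0)

for name, (d, s) in [("normal", normal), ("backward supply", backward)]:
    print(
        f"{name}: Walras-Hicks stable={walras_hicks_stable(d, s)}, "
        f"Marshall stable={marshall_stable(d, s)}"
    )
```

In the second case the two criteria disagree, which is the kind of divergence the text attributes to Figure 3. Which adjustment rule describes actual behavior is, as noted above, an empirical matter.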
Welfare economics. Marshall's contributions to welfare economics, while suggestive in terms of contemporary thought, contain some of his most doubtful analysis. Here Marshall relaxed his customary caution in the face of complex situations, in an effort to promote certain policy measures. In general his welfare economics supported the classical view that a regime of free markets maximizes welfare (utility). Marshall called this the doctrine of maximum satisfaction; his demonstration consisted of showing that for each micromarket the sum of surpluses is maximized. A monopolized market involves a suboptimal position because the sum of surpluses is lessened. The surpluses summed include both consumer and producer surpluses, or rents. In Figure 4 the free market price is P0 with quantity Q0. The sum of consumer surplus and producer surplus is CAB. Let the market be monopolized and price and output be P1, Q1; total surplus is then reduced to BCDE. [See CONSUMER'S SURPLUS.]
[Figure 4. Vertical axis: price; horizontal axis: quantity.]
However, Marshall stated two important exceptions to the doctrine of maximum satisfaction and free competition. First, he considered it to be an empirical fact that although utility is an increasing function of a person's real income, the rate of increase diminishes. As Marshall put it bluntly in the mathematical appendix to the Principles, "Every increase in his means diminishes the marginal degree of utility of money to him" ([1890] 1961, vol. 1, p. 838). Thus, it followed that, all other circumstances being the same, a redistribution of income from rich to poor would increase total satisfaction. Much more important, perhaps, is Marshall's second ground for modifying the doctrine of maximum satisfaction. This is that satisfaction can be increased by taxing increasing-cost industries and subsidizing decreasing-cost industries. His analysis runs entirely in terms of consumer surplus, with all its weaknesses, and can easily be seen to depend on the slopes of the supply curves. However, the whole theory of externality and divergences between private and social benefits developed from Marshall's discussion, especially his exposition of the decreasing-cost case.
Monetary theory. Marshall is sometimes alleged to have neglected the monetary and, more generally, the aggregative framework within which his theory of value worked. This is a mistaken view. In his Principles Marshall is at pains to make clear that the core of that book presupposes a monetary framework, and he deals explicitly with this framework in other contributions. The two important sources for his views on money are Money, Credit & Commerce, written toward the end of his life, and, much more important, his Official Papers. This latter consists of a series of memoranda and evidence presented before royal commissions. Official Papers contains the core of Marshall's monetary theory. The most important elements of his contributions in this area are the following: the so-called Cambridge equation and his development of a credit cycle through disequilibrium between real and monetary interest rates. Marshall is often regarded as the founder of the Cambridge approach to monetary theory. In essence, this theory postulates a stable demand function for money, with real income (or wealth) as the prime argument in the function. Ceteris paribus, such an approach will give a proportionate relationship between changes in the supply of money and changes in the general level of prices. [See MONEY, article on QUANTITY THEORY.] This approach was formalized by Pigou (1917) in a famous article [see the biography of PIGOU] and elaborated by Keynes in his Tract on Monetary Reform (1923).
Marshall made it absolutely clear, however, that changes in the other factors (in the volume of activity and the demand for money) may well dominate the relationship, especially in periods of economic crisis. His other contribution to the field is the spelling out of a mechanism connecting real and money rates of interest, through which divergences between the two generate a credit cycle. The view, mentioned at the beginning of this article, that Marshall diverted economics from a proper consideration of macroeconomics is largely a result of Keynes's treatment of Marshall in The General Theory (1936). His treatment there contrasts widely with his assessment of Marshall's monetary economics in his famous obituary of Marshall. The reasons for this dramatic volte-face are complex and cannot be discussed here; all that can be said is that in retrospect the "Keynesian revolution" appears to be more an extension of the Marshallian tradition than an attempt to reverse it.
BERNARD CORRY
[For the historical context of Marshall's work, see the biographies of COURNOT; JEVONS; MENGER; MILL; RICARDO; THÜNEN; for discussion of the subsequent development of his ideas, see the biographies of KEYNES, JOHN MAYNARD; PIGOU; ROBERTSON.]
WORKS BY MARSHALL
(1879a) 1930 The Pure Theory of Foreign Trade and The Pure Theory of Domestic Values. Series of Reprints of Scarce Tracts in Economic and Political Science, No. 1. London School of Economics and Political Science. → Privately printed in 1879.
(1879b) 1889 MARSHALL, ALFRED; and MARSHALL, MARY P. Economics of Industry. London: Macmillan.
(1890) 1961 Principles of Economics. 9th ed. 2 vols. New York and London: Macmillan. → A variorum edition. The eighth edition is preferable for normal use.
1919 Industry and Trade: A Study of Industrial Technique and Business Organization, and of Their Influences on the Conditions of Various Classes and Nations. London: Macmillan.
(1923) 1960 Money, Credit & Commerce. New York: Kelley.
(1925) 1956 Memorials of Alfred Marshall. New York: Kelley. → Contains essays on Marshall by J. M. Keynes, F. Y. Edgeworth, C. R. Fay, E. A. Benians, and A. C. Pigou; selections from Marshall's writings; and a bibliography of his works prepared by J. M. Keynes.
1926 Official Papers. London: Macmillan. → Papers dated 1886-1903.
SUPPLEMENTARY BIBLIOGRAPHY
KEYNES, JOHN MAYNARD 1923 A Tract on Monetary Reform. London: Macmillan.
KEYNES, JOHN MAYNARD (1924) 1951 Alfred Marshall: 1842-1924. Pages 125-217 in John Maynard Keynes, Essays in Biography. New ed. → First published in Volume 34 of the Economic Journal. A paperback edition was published in 1963 by Norton.
KEYNES, JOHN MAYNARD 1936 The General Theory of Employment, Interest and Money. London: Macmillan. → A paperback edition was published in 1965 by Harcourt.
MILL, JOHN STUART (1848) 1961 Principles of Political Economy, With Some of Their Applications to Social Philosophy. 7th ed. Edited by W. J. Ashley. New York: Kelley.
PIGOU, A. C. (1917) 1951 The Value of Money. Pages 162-183 in American Economic Association, Readings in Monetary Theory. Philadelphia: Blakiston.
PIGOU, A. C. 1953 Alfred Marshall and Current Thought. London: Macmillan.
SCHUMPETER, JOSEPH A. (1941) 1965 Alfred Marshall, 1842-1924. Pages 91-109 in Joseph A. Schumpeter, Ten Great Economists From Marx to Keynes. New York: Oxford Univ. Press.
SCOTT, WILLIAM R. 1925 Alfred Marshall, 1842-1924. British Academy, London, Proceedings 11:446-457.
STIGLER, GEORGE J. 1941 Production and Distribution Theories: 1870 to 1895. New York: Macmillan. → See especially Chapter 4.
THÜNEN, JOHANN H. VON (1826-1863) 1930 Der isolierte Staat in Beziehung auf Landwirthschaft und Nationalökonomie. 3 vols. Jena (Germany): Fischer.
VINER, JACOB (1931) 1952 Cost Curves and Supply Curves. Pages 198-232 in American Economic Association, Readings in Price Theory. Chicago: Irwin. → First published in Volume 3 of the Zeitschrift für Nationalökonomie.
MARSILIUS OF PADUA
Marsilius of Padua (c. 1275-c. 1343) is known primarily as the author of the Defensor pacis, a bold antipapal tract dedicated to Emperor Louis IV of Bavaria in 1324 during his controversy with Pope John XXII. Although contemporary condemnations name John of Jandun as coauthor, internal evidence and incongruence with John's known political statements (Gewirth 1948) argue for its ascription to Marsilius alone. The Defensor is revolutionary in denying the clergy jurisdiction of any kind and in subordinating them completely to the state. The political concepts it brings to the support of these contentions are neither so "modern" nor so original as has sometimes been claimed. All are anticipated in some fashion somewhere in the long and complex medieval tradition. Yet, because Marsilius' argument leads him to special emphases, and because he richly develops, with ingenious use of Aristotle, ideas briefly enunciated by canonists and by other publicists, he is the first to present in elaborated theoretical form views that were to be fundamental to much of modern political thought. Even though the teaching of the Defensor is not free of ambiguity, no other medieval writing offers so vigorous or so complete a theory of popular sovereignty. Jurisdiction in state and church. Marsilius argued that clerical pretensions to rule destroy the state (civitas or regnum), which, with its specialization of economic, military, governmental, and priestly "parts," is indispensable to "living and to living well." The survival of the state depends on effective government by a single unified "ruling part" (pars principans), and since the proper function of the clergy is not government but teaching (Christ having forbidden them all coercion), clerical rule of any sort, over anyone, destroys the unity of the government and deprives men of "the sufficient life." In this world divine law is without
direct sanction. It is only by authority of the human legislator that infractions can be brought to judgment. All that is "truly law" on earth is the expression of the will of the legislator: the citizens of each community as a whole or their "weightier part" (valentior pars), the citizen body including all free adult males. The "weightier part" is weightier through "quality" as well as quantity of persons, but Marsilius argued that the more people who participate in legislation, the better the laws will be. Legislation by the multitude normally results in justice, because all but the singularly malicious or ignorant naturally wish to preserve the state and are able to discern the common benefit and to judge proposals. The legislator also establishes or selects the government or ruler. This may be one man or several but should in almost every instance be elective rather than hereditary. The legislator has the power to correct and even to depose the ruler. Although at times Marsilius seems to imply that the delegation of power to the ruler is all but absolute, his statements on correction and deposition are unequivocal. Denying the divine institution of the papacy and of the episcopacy, Marsilius provides that in a Christian community the legislator select from candidates those to be priests and bishops, appoint them to pastorates, and control their exercise of office. A general council, elected by the several legislators of the Christian world and composed of both clerics and learned laymen (the laymen voting if the clerics are not unanimous), determines the articles of faith. Only by decree of the human legislator do conciliar decisions become binding. For convenience the council or "the faithful human legislator without a superior" appoints a "head bishop" to act as president and executive secretary of the council. Custom alone argues the choice of the bishop of Rome.
Although early in the Defensor Marsilius expressed a preference for a plurality of sovereign states, the idea of a "primary" or "universal" "faithful human legislator" and "a ruler by its authority" suggests the empire, and in the Defensor minor of 1342, written at the court of Louis, the "human legislator" becomes the "Roman prince." Formation of Marsilius' views. In the molding of Marsilius' concepts the fifth book of Aristotle's Politics played a significant part. Also of importance in the genesis of his attitudes was his youth in Padua, a city sovereign de facto, republican in constitution, and often in conflict with the clergy. His being a physician and not a lawyer or theologian no doubt had much to do with the freshness
of his attack. Acquaintance with the work of French publicists seems probable from his years at the University of Paris, and familiarity with corporation law, from his position as rector of the university in 1313 (Lewis 1963, p. 564). Although he associated with Averroists, their influence on his ideas of church and state is impossible to ascertain. Influence. Papal condemnations, including the excommunication of Marsilius (and John of Jandun) in 1327, established the reputation of the Defensor for the next three centuries. A principal charge in papal attacks on Wycliffe, and later on Luther, was that they borrowed Marsilius' doctrine. In the circumstances, conciliarist writers used the Defensor with caution. It was first printed (in Basel in 1522) to serve the Protestant cause, and in 1535 Thomas Cromwell paid most of the cost of printing an English translation. Through Richard Hooker, who cited it and shared some of its doctrines, it probably had an influence on John Locke, and so, indirectly, played an important part in carrying ideas of popular sovereignty from the medieval to the modern world. JANE E. RUBY [For the historical context of Marsilius' work, see the biographies of AQUINAS; ARISTOTLE; GERSON.] WORKS BY MARSILIUS (1324) 1951-1956 Marsilius of Padua: The Defender of Peace. Translated and introduced by Alan Gewirth. 2 vols. New York: Columbia Univ. Press. -> Volume 1: Marsilius of Padua and Medieval Political Philosophy. Volume 2: The Defensor pacis. Volume 2 was written in 1324, and first printed in 1522 as Defensor pacis. (1342) 1922 The "Defensor minor" of Marsilius of Padua. Edited by C. Kenneth Brampton. Birmingham (England): Cornish. SUPPLEMENTARY BIBLIOGRAPHY GEWIRTH, ALAN 1948 John of Jandun and the Defensor pacis. Speculum 23:267-272. LAGARDE, GEORGES DE (1934) 1948 Marsile de Padoue, ou le premier théoricien de l'état laïque. 2d ed. Paris: Presses Universitaires de France. LEWIS, EWART 1963 The "Positivism" of Marsiglio of Padua.
Speculum 38:541-582. PREVITE-ORTON, CHARLES W. (1935) 1937 Marsilius of Padua. British Academy, London, Proceedings 21:137-183. SCHOLZ, RICHARD 1937 Marsilius von Padua und die Genesis des modernen Staatsbewusstseins. Historische Zeitschrift 156:88-103.
MARX, KARL
Karl Marx (1818-1883) was born in Trier in the Prussian Rhineland. His alienation from his family when he had scarcely passed adolescence foreshadowed the social isolation of his later years.
His father, a lawyer, was as concerned as he was impressed with his son's "demonic genius," as he called it, and feared that young Marx's passion for poetry and philosophy would consume him both physically and morally. The elder Marx and his wife were Jewish, but for social reasons they were converted to Christianity. The younger Marx's awareness of his ethnic background aroused in him a certain self-consciousness; this may have been one source of his sense of marginality, his ambivalence toward society, and eventually of his conflicting qualities—thinker and prophet, scientist and moralist. Although Marx received his doctorate in philosophy from the University of Jena at the age of 23, his association with the Young Hegelians, and with Bruno Bauer in particular, precluded his appointment to a university position in Germany; indeed, Bauer lost his own post at the university in Bonn as a result of questioning the historicity of the New Testament. Marx thus became a "degraded bourgeois," deprived of a stable source of income and dependent for his livelihood and that of his wife and children on the generosity of his lifelong friend, Friedrich Engels, the son of a wealthy cotton manufacturer. (Marx's wife, Jenny von Westphalen, was of noble parentage but had no dowry.) At the age of 25, Marx left Germany and, except for a brief stay in Cologne in 1848-1849, lived the rest of his life in exile: in Paris from 1843 to 1845, in Brussels from 1845 to 1848, and finally in London. As early as 1845 he renounced his Prussian citizenship, and since he failed to acquire British citizenship by naturalization, for the greater part of his life he was something of a pariah. Intellectual background. Marx's childhood and youth fall in that period of European history when the reactionary powers of the Holy Alliance were attempting to eradicate from post-Napoleonic Europe all traces of the French Revolution. 
There was, at the same time, a liberal movement in Germany that was making itself felt. The movement was given impetus by the July Revolution in France, and its chief representatives were the poets of the Junge Deutschland, among them Ludwig Börne and Heinrich Heine. In the late 1830s a further step toward radical criticism was made by the Young Hegelians, that group with which Marx became formally associated when he was studying law and philosophy at the University of Berlin. Although he was the youngest member of the Young Hegelians (who included, in addition to Bauer, such thinkers as Ludwig Feuerbach, Arnold Ruge, and Moses Hess), Marx inspired their confidence, respect, and even admiration. They saw in him a "new Hegel," or rather a powerful anti-Hegelian, who might successfully turn the dialectics of the master against his own conservative teachings in the fields of religion, politics, and law. Marx had already shown his determination to do so in his doctoral dissertation (1841), which dealt with the philosophical positions of Democritus and Epicurus; especially in the supplementary notes, he made his earliest attempt at a radical, albeit muted, criticism of Hegel, asserting, as Epicurus had argued against Democritus, that what is needed is a morally clear way of life rather than ideology or empty hypotheses. The intensive study of Spinoza, Leibniz, and Hume provided Marx with a spiritual armory for the elaboration of a positive conception of democracy that went far beyond the notions held at that time by radicals in Germany. It was from Spinoza rather than from Hegel that Marx learned to reconcile necessity and freedom. Therefore, when he undertook to destroy Hegel's metaphysics of "the State," Marx was well prepared to integrate a rational ethics with his own sociological and revolutionary doctrine. His early rejection of Hegel's political philosophy was unconditional and permanent; yet stripped of its "idealistic" content, Hegel's dialectic continued to influence Marx as a way of analyzing his subject matter, namely society. Marx's adherence to a radical view of democracy was also based on the study of such historical events as the revolutions in England, France, and America. From these historical studies he concluded that democracy must normally and inevitably culminate in communism, following a transitory stage of proletarian democracy (the "dictatorship of the proletariat"). After his conversion to communism Marx began his prolonged studies of economics; but while he was still developing from a liberal into a communist, he learned more from Spinoza and Feuerbach, Saint-Simon and Babeuf, Thomas Hamilton and Tocqueville, Weitling and Proudhon, Owen and Fourier, than from Smith or Ricardo.
Contributions to socialist thought. Although the epoch to which Marx belonged has its beginnings in the French Revolution, its historical dimensions coincide with those of the whole era of industrial and social revolutions and extend into our own time; hence the lasting appeal of a body of teachings that is by no means free from theoretical ambiguities. The originality of Marx's thought lies in his immense efforts to synthesize, in a critical way, the entire legacy of social knowledge since Aristotle. His purpose was to achieve a better understanding of the conditions of human development and with this understanding to accelerate the actual process by which mankind was moving toward an "association, in which the free development of each is the condition for the free development of all" (1848). The desired system would be a communist society based on rational planning, cooperative production, and equality of distribution and, most important, liberated from all forms of political and bureaucratic hierarchy. This dual commitment, to scholarly understanding and to political action, created constant difficulty for Marx. He was often aware that his intense passion for reading and studying interfered with his activity on behalf of the political movement with which he identified himself. In his scholarly work the exposition and analysis are frequently interrupted by partisan outbursts of irony and sarcasm, by bitter indictments of the capitalist class and the social system based upon its dominance. Political economy was only one of the social-scientific disciplines that Marx intended to explore and then subject to criticism; the others were law, morals, and politics. He intended to treat each of these disciplines (and perhaps others also) in "separate pamphlets." But the thoroughness with which he undertook his studies of the great economists and the delays in his scholarly work that arose from the need to make a living as a penny-a-liner prevented him from elaborating even one of these projects. Capital, subtitled "A Critique of Political Economy," although a work of enormous dimensions, is the fruit of only partially completed research. However, before the age of thirty, Marx produced a number of works which together provide a relatively adequate outline of his "materialist conception of history." Among these, the most important are The Holy Family (1845a), The German Ideology (in collaboration with Engels, 1845-1846), The Poverty of Philosophy (1847), and The Communist Manifesto (1848).
To these must be added an unfinished work, first published in 1932 with the title Economic and Philosophical Manuscripts of 1844 (see 1844a), which shows with particular clarity the connections between the various ideas Marx was later to elaborate in Capital. In these works, Marx sketched out his theory of society and history. He repudiated Hegelian and post-Hegelian speculative philosophy, and building on Feuerbach's anthropological naturalism, he developed instead a humanist ethics based on a strictly sociological approach to historical phenomena. Drawing also on French materialism and on British empiricism and classical economics, Marx's theory sought to explain all social phenomena in terms of their place and function in the complex systems of society and nature, without recourse to what he considered metaphysical explanations ("primary causes"). Clearly outlined in these early
writings, this eventually became a mature sociological conception of the making and development of human societies. At the beginning of A Contribution to the Critique of Political Economy (1859), Marx summed up in a dozen aphorisms the general results of the investigation he had undertaken in the 1840s and asserted that these results were the "guiding thread" of his further studies. Here are the beginning and the end of this justifiably celebrated and controversial passage: In the social production which men carry on they enter into definite relations that are indispensable and independent of their will; these relations of production correspond to a definite stage of development of their material powers of production. The sum total of these relations of production constitutes the economic structure of society—the real foundation, on which rise legal and political superstructures and to which correspond definite forms of social consciousness. The mode of production in material life determines the general character of the social, political and spiritual processes of life. It is not the consciousness of men that determines their existence, but, on the contrary, their social existence determines their consciousness. . . . In broad outlines we can designate the Asiatic, the ancient, the feudal, and the modern bourgeois methods of production as so many epochs in the progress of the economic formation of society. The bourgeois relations of production are the last antagonistic form of the social process of production . . . ; at the same time the production forces developing in the womb of bourgeois society create the material conditions for the solution of that antagonism. This social formation constitutes, therefore, the closing chapter of the prehistoric stage of human society. ([1859] 1913, pp. 
11-13) Marx's "materialistic method" is well exemplified by his treatment of the concept of "alienation," a spiritual concept in Hegel's philosophy that had already been modified in Feuerbach's anthropology. In the "Paris Manuscripts of 1844" (1844a), Marx conceived of alienation as a phenomenon related to the structure of those societies in which the producer is divorced from the means of production and in which "dead labor" (capital) dominates "living labor" (the worker). A systematic elaboration of the concept appears in Capital under the heading "fetishism of commodities and money." But the ethical germ of this conception can be found as early as 1844 in the two essays Marx published in the Deutsch-französische Jahrbücher: "On the Jewish Question" (1844b) and "Contribution to the Critique of Hegel's Philosophy of Right" (1844c). There Marx unequivocally rejected and condemned "the state" and "money," and he invested the proletariat with the "historical mission" of emancipating society as a whole. The identity of Marx's early
political views with the theoretical analysis in Capital is evident in the manner in which the argument of Capital is brought to a close. Describing the "historical tendency of capital accumulation," Marx quoted the prophetic statement in the Communist Manifesto: "What the bourgeoisie . . . produces, above all, are its own gravediggers. Its fall and the victory of the proletariat are equally inevitable." Similarly, in ending the preamble of his inaugural address to the International Working Men's Association (1864), Marx launched the same summons that ends the Manifesto: "Workingmen of all countries, unite!" Although this summons seems to contradict his assertion of the "historical necessity" of communism, in the very real unity of sociology and ethics the contradiction vanishes. The proletariat is enjoined to unite in order to transform society, and its recognition of the consequences of such unity for the achievement of its historical mission becomes part of the "historical necessity" of the process; by this recognition, the proletariat confirms the process. In accordance with the maxim, formulated in his "Theses on Feuerbach" (1845b), that man must prove the truth of his thinking in practice, Marx neglected his scientific work for long periods in order to participate in the class struggles of his time. He did so not without regret, for he considered his scholarly studies the most valuable form of participation in the social struggle. His more direct intervention was, of course, mainly literary in character—his several hundred articles in German, British, and American newspapers and journals; and the various addresses and manifestoes he wrote for the Working Men's International. 
Among his writings on the political events of his time are some unquestionable masterpieces of this genre: The Class Struggles in France (1850); The Eighteenth Brumaire of Louis Bonaparte (1852); Secret Diplomatic History of the Eighteenth Century (1856); Herr Vogt (1860); "Address" to the First International (1864); The Civil War in France (1871); the Critique of the Gotha Programme (1875). In every line he wrote, whether intended for publication or not, his ultimate singleness of purpose is clearly evident. This is particularly true of his magnum opus, Capital, whose scope transcends its outline of political economy as well as its critique of economics. At the same time that Marx defined the ultimate aim of the work as "[laying] bare the economic law of motion of modern society," he had in mind a thorough and systematic criticism of a type of society, namely capitalism. In spite of its truncated
character, Capital is monumental in its construction and grandiose in its purpose. It is in Capital (even more than in Marx's philosophical writings) and particularly in the posthumously published Grundrisse (1857-1858), that the serious student will find the key to Marx's dialectical method as it contrasts with the method of Hegel. Moreover, Capital, to a greater extent than Marx's political writings, reveals the reason for the celebrated "failure" of Marxian predictions: the reason lies not so much in the inadequacy of Marx's social and economic theory as in the expectations he based on it. However, in the last analysis these expectations rest on the individual search for perfection and liberty. Marx's influence. Marx's teachings have been expanded and diffused in two ways that are, in effect, opposed to each other. The first is "Marxism" as an ideology, i.e., a dogmatic systematization of Marx's ideas for political purposes, expressed as party doctrine or state religion, and disseminated by its supporters; the second form is a growing body of research and scholarly activity in various branches of the social sciences that has been illuminated by Marx's theoretical discoveries. When Marx himself noticed that his admirers were showing the first signs of "Marxism," he rebuked them unequivocally and asserted, as Engels reported in several letters (e.g., to Bernstein and Conrad Schmidt): "I am not a Marxist." However, he tolerated and even supported Engels' efforts to win acceptance for Capital in academic circles. Inadvertently, Engels thus became the first "Marxist" and the cofounder of the Marxist ideology, whose manifesto was Engels' Anti-Dühring (1878). Marx was thereafter acclaimed as the founder of the new science of socialism and was credited by Engels with two scientific discoveries: the materialistic concept of history and the theory of surplus value.
Engels' efforts to popularize Marx's ideas led to the schematization of some of Marx's basic propositions; he claimed to have extended Marx's methodological and critical approach, so that it embraced nature as well as history. With their followers, the distortion of Marx's thought went further still. While Marx considered his general theory to be a scientific method of investigating the transient nature of every economic system and placed his confidence in proletarian class consciousness as an agency of change, "Marxism," particularly in its Leninist version, has become a Party ideology. This transformation is reflected in the substitution of the coercive direction of political elites for the spontaneous activity and consciousness of the producing class; paradoxically, these "Marxist" elites have transformed Marx's theoretical propositions into norms of political action. The relevance of Marx's theories for the social sciences has been the subject of much fruitful debate. In a kind of osmotic process, Marx's theories have been incorporated into the social sciences at the same time that they have stimulated important countertheories. A significant event in this process was Sorel's critique of Durkheim (Sorel 1895), in which he praised the "materialist theory of sociology" according to which the various social systems (political, philosophical, religious) must be considered as interdependent and as having a common base; Sorel believed that what Marx assigned to sociology as its major subject for investigation was the underlying system of production and exchange and the conflict of classes. Marxist social science developed in Germany, stimulated by the work of Rudolf Stammler (1896), and it was in response to Stammler that Max Weber began his influential studies of the Marxian thesis concerning the relationship between the economy and other social institutions. In Italy Marxist theories were discussed in several universities under the leadership of Antonio Labriola, Giovanni Gentile, and Benedetto Croce, and in France such discussions were stimulated by Francois Simiand. Thomas G. Masaryk, while he was a university professor in Prague, produced a large work of analysis and criticism of Marx's sociological method and hypotheses (1898). The international character of the "debate with the ghost of Marx" may be further illustrated by the fact that in tsarist Russia numerous books and periodicals paid increasing attention to "scientific socialism" even before Plekhanov and Lenin appeared on the scene. In the United States the influence of Marx's ideas is evident in the writings of Albion W. Small, George H. Mead, Thorstein Veblen, and Joseph Schumpeter, among others.
Since World War I, Marx's theories have not only stimulated sociological work in general but have also given impetus to a new field of sociological inquiry, the sociology of knowledge, exemplified by the works of Max Scheler and Karl Mannheim. The process of incorporating Marx's ideas into the social sciences in Western countries contrasts vividly with the unsure attempts by "Marxist" regimes to invent and decree a "Marxist" sociology. The efforts of these regimes unwittingly confirm one of Marx's major hypotheses: that the dominant ideas of a society are those of its ruling class. MAXIMILIEN RUBEL
[See also COMMUNISM; ECONOMIC THOUGHT, article on SOCIALIST THOUGHT; MARXISM; MARXIST SOCIOLOGY; SOCIALISM; and the biographies of BERNSTEIN; DURKHEIM; ENGELS; HEGEL; HUME; LENIN; LUKACS; MANNHEIM; MASARYK; MEAD; PROUDHON; SAINT-SIMON; SCHELER; SCHUMPETER; SIMIAND; SMALL;
SOREL; SPINOZA; TOCQUEVILLE; VEBLEN; WEBER, MAX.] MARX'S WRITINGS WORKS BY MARX (1841) 1927-1929 Über die Differenz der demokritischen und epikureischen Naturphilosophie. Pages 3-144 in Karl Marx and Friedrich Engels, Historisch-kritische Gesamtausgabe: Werke, Schriften, Briefe. Section 1, Volume 1, part 1: Werke und Schriften bis 1844. Frankfurt am Main (Germany): Marx-Engels Verlag. -> Written in 1841, the text with some notes was first published posthumously in 1902. (1843) 1953 Kritik des hegelschen Staatsrechts. Pages 20-149 in Karl Marx, Die Frühschriften. Stuttgart (Germany): Kröner. (1844a) 1964 Economic and Philosophic Manuscripts of 1844. New York: International Publishers; London: Lawrence & Wishart. -> Written in 1844 but first published posthumously in German in 1932. Sometimes referred to as the "Paris Manuscripts of 1844." (1844b) 1963 On the Jewish Question. Pages 1-40 in Karl Marx, Early Writings. London: Watts. -> First published in Volume 1/2 of the Deutsch-französische Jahrbücher. (1844c) 1963 Contribution to the Critique of Hegel's Philosophy of Right: Introduction. Pages 41-59 in Karl Marx, Early Writings. London: Watts. -> First published in Volume 1/2 of the Deutsch-französische Jahrbücher. (1844d) 1963 Early Writings. Translated and edited by T. B. Bottomore. London: Watts. -> First published in German. Contains "On the Jewish Question"; "Contribution to the Critique of Hegel's Philosophy of Right"; and "Economic and Philosophic Manuscripts." (1845a) 1956 The Holy Family. Moscow: Foreign Languages Publishing House. -> First published as Die heilige Familie. (1845b) 1935 Theses on Feuerbach. Pages 73-75 in Friedrich Engels, Ludwig Feuerbach and the Outcome of Classical German Philosophy. New York: International Publishers. -> First published in German. (1845-1846) 1939 MARX, KARL; and ENGELS, FRIEDRICH The German Ideology. Parts 1 and 3. With an introduction by R. Pascal. New York: International Publishers.
-> Written in 1845-1846, the full text was first published in 1932 as Die deutsche Ideologie and republished by Dietz Verlag in 1953. (1847) 1963 The Poverty of Philosophy. With an introduction by Friedrich Engels. New York: International Publishers. -> First published as Misère de la philosophie. (1848) 1964 MARX, KARL; and ENGELS, FRIEDRICH The Communist Manifesto. New York: Washington Square Press. -> First published in German. (1849) 1962 Wage Labour and Capital. Volume 1, pages 74-97 in Karl Marx and Friedrich Engels, Selected Works. Moscow: Foreign Languages Publishing House. -> First published as "Lohnarbeit und Kapital" in the Neue Rheinische Zeitung. (1850) 1964 The Class Struggles in France: 1848-1850. New York: International Publishers. -> A series of
articles first published as "Die Klassenkämpfe in Frankreich 1848 bis 1850" in the Neue Rheinische Zeitung: Politisch-ökonomische Revue. (1852) 1964 The Eighteenth Brumaire of Louis Bonaparte. New York: International Publishers. -> First published in German. (1856) 1899 Secret Diplomatic History of the Eighteenth Century. Edited by Eleanor Marx Aveling. London: Sonnenschein. -> First published as "Revelations of the Diplomatic History of the Eighteenth Century" in the Sheffield Free Press. (1857-1858) 1953 Grundrisse der Kritik der politischen Ökonomie. Berlin: Dietz. -> Written in 1857-1858; first published posthumously by the Marx-Engels-Lenin Institute, Moscow, in 1939-1941. A partial English translation was published in 1965 as Pre-capitalist Economic Formations by International Publishers. (1857-1859) 1959 MARX, KARL; and ENGELS, FRIEDRICH The First Indian War of Independence: 1857-1859. Moscow: Foreign Languages Publishing House. -> A collection of articles written for the New York Daily Tribune. Also includes articles dated 1853 and notes from a manuscript of the 1870s. (1859) 1913 A Contribution to the Critique of Political Economy. Chicago: Kerr. -> First published as Zur Kritik der politischen Ökonomie. (1860) 1953 Herr Vogt. Berlin: Dietz. (1861-1863) 1952 Theories of Surplus Value: Selections. New York: International Publishers. -> A selection from the volumes first published between 1905 and 1910 as Theorien über den Mehrwert, edited by Karl Kautsky, taken from Karl Marx's preliminary manuscript written between 1861-1863 for a projected fourth volume of Capital. (1861-1866) 1961 MARX, KARL; and ENGELS, FRIEDRICH The Civil War in the United States. 3d (Centennial) ed. New York: International Publishers. -> A paperback edition was published in 1964 by Citadel Press. (1864) 1937 Address and Provisional Rules of the Working Men's International Association. Pages 27-44 in Founding of the First International: A Documentary Record.
New York: International Publishers. (1867-1879) 1925-1926 Capital: A Critique of Political Economy. 3 vols. Chicago: Kerr. -> Volume 1: The Process of Capitalist Production. Volume 2: The Process of Circulation of Capital. Volume 3: The Process of Capitalist Production as a Whole. The first volume was published in 1867. The manuscripts of Volumes 2 and 3 were written between 1867 and 1879. They were first published posthumously in German in 1885 and 1894. (1871) 1963 The Civil War in France. With an introduction by Friedrich Engels. Moscow: Foreign Languages Publishing House. -> First published in English. A paperback edition was published in 1964 by International Publishers. (1875) 1959 MARX, KARL; and ENGELS, FRIEDRICH Critique of the Gotha Programme. Moscow: Foreign Languages Publishing House. -> Written by Marx in 1875 as "Randglossen zum Programm der deutschen Arbeiterpartei." First published with notes by Engels in 1891. SELECTIONS FROM MARX'S WORKS
Die Frühschriften. Stuttgart (Germany): Kröner, 1953. Marx on China, 1853-1860: Articles From the New York Daily Tribune. With an introduction and notes by Dona Torr. London: Lawrence & Wishart, 1951. MARX, KARL; and ENGELS, FRIEDRICH Revolution in Spain. New York: International Publishers, 1939. -> A collection of articles first published in the New York Daily Tribune, Putnam's Magazine, the New American Encyclopedia, and Der Volksstaat. MARX, KARL; and ENGELS, FRIEDRICH The Russian Menace to Europe: A Collection of Articles, Speeches, Letters and News Dispatches. Edited by Paul W. Blackstock and Bert F. Hoselitz. Glencoe, Ill.: Free Press, 1952. -> Contains materials written between 1848-1894. MARX, KARL; and ENGELS, FRIEDRICH Karl Marx and Frederick Engels on Britain. Moscow: Foreign Languages Publishing House, 1953. -> Contains a collection of the most important writings of Marx and Engels, written between 1844-1895, dealing with England. MARX, KARL; and ENGELS, FRIEDRICH Karl Marx and Frederick Engels; Letters to Americans 1848-1895: A Selection. New York: International Publishers, 1953. MARX, KARL; and ENGELS, FRIEDRICH Karl Marx and Frederick Engels: Selected Correspondence. Moscow: Foreign Languages Publishing House, 1956. -> Contains material dated 1843-1895. MARX, KARL; and ENGELS, FRIEDRICH On Colonialism. Moscow: Foreign Languages Publishing House, 1960. -> Contains a collection of works by Marx and Engels written between 1850-1894. MARX, KARL; and ENGELS, FRIEDRICH Selected Works. 2 vols. Moscow: Foreign Languages Publishing House, 1962. Selected Writings in Sociology and Social Philosophy. 2d ed. Edited by T. B. Bottomore and M. Rubel, with a foreword by Erich Fromm. New York: McGraw-Hill, 1964. -> Contains works written by Marx between 1844-1875. COLLECTED WORKS
MARX, KARL; and ENGELS, FRIEDRICH Historisch-kritische Gesamtausgabe: Werke, Schriften, Briefe. 12 vols. Edited by David Rjazanov and V. Adoratskij, commissioned by the Marx-Engels Institute, Moscow. Frankfurt am Main, Berlin, and Moscow: Marx-Engels Verlag, 1927-1935.
MARX, KARL; and ENGELS, FRIEDRICH Karl Marx, Friedrich Engels: Werke. Vols. 1-. Berlin: Dietz, 1956-. -> Volumes 1-19, 22-31 of a contemplated 36-volume edition.

SUPPLEMENTARY BIBLIOGRAPHY
ADLER, MAX 1922 Die Staatsauffassung des Marxismus: Ein Beitrag zur Unterscheidung von soziologischen und juristischen Methoden. Marx-Studien, Vol. 4, part 2. Vienna: Wiener Volksbuchhandlung.
ADLER, MAX (1930-1932) 1964 Soziologie des Marxismus. 3 vols. Vienna: Europa. -> First published as Lehrbuch der materialistischen Geschichtsauffassung. Volume 1: Grundlegung der materialistischen Geschichtsauffassung. Volume 2: Natur und Gesellschaft. Volume 3: Die solidarische Gesellschaft.
Archiv für die Geschichte des Sozialismus und der Arbeiterbewegung. -> Published between 1910 and 1930.
BERLIN, ISAIAH (1939) 1963 Karl Marx: His Life and Environment. 3d ed. New York: Oxford Univ. Press.
BERNSTEIN, EDUARD (1899) 1909 Die Voraussetzungen des Sozialismus und die Aufgaben der Sozialdemokratie. Stuttgart (Germany): Dietz.
[BLECH, WILLIAM J.] 1939 Elements of Marxian Economic Theory and Its Criticism, by William J. Blake [pseud.]. New York: Cordon.
BUKHARIN, NIKOLAI I. (1921) 1965 Historical Materialism: A System of Sociology. Translated from the 3d Russian edition. New York: Russell. -> First published as Teoriia istoricheskogo materializma.
DRAPER, HAL 1962 Marx and the Dictatorship of the Proletariat. Institut de Science Économique Appliquée, Cahiers Fifth Series: Études de Marxologie 6:5-73.
DUNAYEVSKAYA, RAYA 1958 Marxism and Freedom From 1776 Until Today. New York: Bookman.
ENGELS, FRIEDRICH (1878) 1959 Anti-Dühring: Herr Eugen Dühring's Revolution in Science. 2d ed. Moscow: Foreign Languages Publishing House. -> First published as "Herrn Eugen Dührings Umwälzung der Wissenschaft" in a series of articles in Vorwärts (Leipzig). Translated from the 3d German edition of 1894.
ENGELS, FRIEDRICH (1892) 1925 Marx, Heinrich Karl. Volume 6, pages 496-500 in Handwörterbuch der Staatswissenschaften. 4th ed. Jena (Germany): Fischer.
FROMM, ERICH (editor) 1961 Marx's Concept of Man. New York: Ungar.
GURVITCH, GEORGES (1950) 1963 La sociologie de Karl Marx. Volume 2, pages 220-322 in La vocation actuelle de la sociologie. 2d ed., rev. Paris: Presses Universitaires de France.
HILFERDING, RUDOLF (1904) 1949 Böhm-Bawerk's Criticism of Marx. Pages 119-196 in Eugen Böhm-Bawerk, Karl Marx and the Close of His System. New York: Kelley. -> First published in German.
HIRSCH, HELMUT 1963 Marxiana judaica. Institut de Science Économique Appliquée, Cahiers Fifth Series: Études de Marxologie 7:5-22.
HODGES, DONALD C. 1965 Engels' Contribution to Marxism. Socialist Register 2:297-310.
HOOK, SIDNEY (1936) 1958 From Hegel to Marx: Studies in the Intellectual Development of Karl Marx. New York: Humanities. -> A paperback edition was published in 1962 by the University of Michigan Press.
KAMENKA, EUGENE 1962 The Ethical Foundations of Marxism. London: Routledge; New York: Praeger.
KAUTSKY, KARL (1906) 1918 Ethics and the Materialist Conception of History. Chicago: Kerr. -> First published in German.
KELSEN, HANS (1920) 1923 Sozialismus und Staat: Eine Untersuchung der politischen Theorie des Marxismus. 2d ed., enl. Leipzig: Hirschfeld.
KORSCH, KARL (1923) 1930 Marxismus und Philosophie. 2d ed. Leipzig: Hirschfeld.
KORSCH, KARL (1938) 1963 Karl Marx. New York: Russell.
LICHTHEIM, GEORGE (1961) 1964 Marxism: An Historical and Critical Study. 2d rev. ed. London: Routledge.
LUKACS, GYORGY (1919-1922) 1923 Geschichte und Klassenbewusstsein: Studien über marxistische Dialektik. Berlin: Malik.
MARCUSE, HERBERT (1941) 1955 Reason and Revolution: Hegel and the Rise of Social Theory. 2d ed. New York: Humanities; London: Routledge. -> A paperback edition was published in 1960 by Beacon.
Marxismusstudien. 4 vols. Evangelische Studiengemeinschaft, Schriften. 1954-1962 Tübingen (Germany): Mohr.
MASARYK, THOMAS G. (1898) 1964 Die philosophischen und soziologischen Grundlagen des Marxismus: Studien zur socialen Frage. Osnabrück (Germany): Zeller.
MATTICK, PAUL 1962 Marx and Keynes. Institut de Science Économique Appliquée, Cahiers Fifth Series: Études de Marxologie 5:113-212.
MAYER, HENRY 1960 Marx, Engels and the Politics of the Peasantry. Institut de Science Économique Appliquée, Cahiers Fifth Series: Études de Marxologie 3:91-152.
MEHRING, FRANZ (1918) 1948 Karl Marx: The Story of His Life. London: Allen & Unwin. -> First published in German. A paperback edition was published in 1962 by the University of Michigan Press.
NAVILLE, PIERRE 1957 De l'aliénation à la jouissance: La genèse de la sociologie du travail chez Marx et Engels. Paris: Rivière.
NIKOLAEVSKII, BORIS I.; and MAENCHEN-HELFEN, OTTO 1936 Karl Marx: Man and Fighter. Philadelphia: Lippincott.
OLLMAN, BERTELL 1967 Marx's Conception of Human Nature. Unpublished manuscript.
PAGE, CHARLES (1940) 1964 Class and American Sociology: From Ward to Ross. New York: Octagon Books.
PLAMENATZ, JOHN P. (1954) 1961 German Marxism and Russian Communism. 3d ed. London: Longmans.
PLEKHANOV, GEORGII V. (1895) 1947 In Defense of Materialism: The Development of the Monist View of History. London: Lawrence & Wishart. -> First published in Russian.
POPPER, KARL R. (1945) 1963 The Open Society and Its Enemies. 4th rev. ed. 2 vols. Princeton Univ. Press. -> Volume 1: The Spell of Plato. Volume 2: The High Tide of Prophecy: Hegel, Marx and the Aftermath.
RUBEL, MAXIMILIEN 1956 Bibliographie des oeuvres de Karl Marx: Avec en appendice un répertoire des oeuvres de Friedrich Engels. Paris: Rivière. -> A Supplement was published in 1960.
RUBEL, MAXIMILIEN 1957 Karl Marx: Essai de biographie intellectuelle. Paris: Rivière.
SCHUMPETER, JOSEPH A. (1942) 1950 Capitalism, Socialism, and Democracy. 3d ed. New York: Harper; London: Allen & Unwin. -> A paperback edition was published by Harper in 1962.
SOREL, GEORGES 1895 Les théories de M. Durkheim. Devenir social 1:1-26, 148-180.
STAMMLER, RUDOLF (1896) 1924 Wirtschaft und Recht nach der materialistischen Geschichtsauffassung: Eine sozialphilosophische Untersuchung. 5th ed. Berlin: de Gruyter.
WEBER, MAX (1907) 1922 R. Stammlers "Überwindung" der materialistischen Geschichtsauffassung. Pages 291-359 in Max Weber, Gesammelte Aufsätze zur Wissenschaftslehre. Tübingen (Germany): Mohr.
ZEITLIN, IRVING MORDECAI 1967 Marxism: A Re-examination. Princeton: Van Nostrand.
Zeitschrift für Sozialforschung. -> Published between 1932 and 1941. Title changed to Studies in Philosophy and Social Science with Volume 8, No. 3. It represented (until 1938) a serious attempt to develop a Marxian sociology in nondogmatic terms.
MARXISM

This article deals with the origins and development of the political doctrine of Karl Marx. Marxism is also discussed in ECONOMIC THOUGHT, article on SOCIALIST THOUGHT; MARXIST SOCIOLOGY; SOCIALISM; and in the biography of MARX. Contemporary political and economic aspects are discussed in COMMUNISM; COMMUNISM, ECONOMIC ORGANIZATION OF. Also related are PLANNING, ECONOMIC, article on EASTERN EUROPE; WORKERS. The biographies of BERNSTEIN; ENGELS; FANON; KAUTSKY; LANGE; LENIN; LUKACS; LUXEMBURG; MAN; MILLS; OSSOWSKI; and TROTSKY describe different intellectual developments after Marx. For the biographies of other socialist thinkers, see under SOCIALISM.

Like other schools of socialism that arose in the early nineteenth century, Marxism was a response to the economic and social hardships accompanying the growth of Western industrial capitalism. If in recent decades it has attracted most of its adherents in countries hardly touched by industrial capitalism, this is the result of a tortuous ideological history. The intellectual heritage from which Marxism drew its insights, attitudes, and concepts is a synthesis of many ideological currents of the early and middle nineteenth century. They include the basic assumptions of the democratic faith and the slogans of the French Revolution; indeed, Marxism asserts that this revolution was betrayed by the very class which made it and will only be fulfilled by the proletariat through socialism. Hence, Marxism also embraces the syndrome of attitudes associated with workers' protest movements and socialism. Further, Marxism embodies the empiricism or "materialism" of Bacon, Hobbes, and Helvetius. From Rousseau and the romantics it has taken a strongly ambivalent attitude toward past and present institutions, together with a strong commitment to historicism and its Hegelian form, dialectics. Finally, this mixture is seasoned with the anthropocentrism of Feuerbach, the economic doctrines of Smith and Ricardo, and the class-war theories of Michelet and other historians of the French Revolution.
As a syndrome of attitudes, Marxism might be described as a synthesis of radicalism, optimism, and a commitment to science: it is radical in criticizing contemporary social institutions and practices as stupid and inhuman; it is optimistic in expecting, eventually, the creation of a "good society" worthy of man's highest potentials; it is committed to science not only because it wishes to analyze society but also because it is convinced that the scientific investigation of the social forces active in the contemporary world will confirm both its radicalism and its optimism.

Doctrine

Marxism is a dialectical theory of human progress. It regards history as the development of man's effort to master the forces of nature and, hence, of
production ("economic interpretation of history"). Since all production is carried out within social organization, history is the succession of changes in social systems, the development of human relations geared to productive activity ("modes of production"), in which the economic system forms the "base" and all other relationships, institutions, activities, and idea systems are "superstructural." History is progress because man's ability to produce, his "forces of production," continually increases. It is regression because in perfecting the forces of production man creates a more and more complex and oppressive social organization, seemingly beyond human control (the "production relations"), the central feature of which is the division of society into classes. Classes are defined by their relations to the essential means of production: the ruling class is that group of men who own the means of production; those who are propertyless are forced to function as the laboring class. Like Rousseau, Marx was profoundly interested in exploring the inequalities of men because he shared Rousseau's belief that there can be no democracy as long as there are inequality and special interests. Progress, thus, is a mixed blessing. Nor is it unilinear, for in the history of man different elements of the complicated social system continually become dysfunctional to each other. In particular, the production relations, originally in tune with a given state of the forces of production, lag behind the latter and come to retard their further development. From a promoter of progress the ruling class turns into a useless parasite. But when the old production relations have turned into a dead shell, mankind assures the march of progress by remaking the social system in revolutionary violence, giving leadership to the class wielding the most advanced means of production.
According to the Marxist scheme of history, mankind has gone through three or four major modes of production since an initial golden age of primitive communism: ancient slave society, feudalism, and capitalism (to which Marx added, in some of his works, Asiatic society as a distinct mode of production). Capitalism, the last form of society torn by a class struggle, represents the peak of human development so far. On the one hand, it has created and amassed unprecedented wealth, which, if used rationally, could assure the material wellbeing of all mankind. Yet, by virtue of its own laws of operation, capitalism cannot utilize its means of production rationally but must match the accumulation of capital with the accumulation of misery and chaos. Again, while it has promoted constitutional government and the rights of man, the
formal rights and equalities of liberal regimes are vitiated by actual inequalities and ultimate dehumanization: formally free, man has been converted into a commodity, whose labor power, talents, and personality are for sale on the free market. The resolution of these contradictions will be produced by capitalism itself. Its own economic laws not only produce chaos and crisis but also narrow the social basis of capitalism by casting the mass of the population into the proletariat. At a crisis point the exploited will rise in revolution, expropriate the ruling class, replace commodity production with an economy based on national planning, and abolish all class divisions in society. Supporting the optimistic prognosis of revolutionary Marxism is the image of the proletariat as the "chosen people," who, because of their place in society, their state of organization, and their spontaneous grasp of reality ("class consciousness") can be expected to rise above all narrow interests, loyalties, and ideologies and liberate mankind forever from the curse of property and class.

Development

Marxist doctrine was spelled out concisely in the Communist Manifesto. This pamphlet was written shortly before the outbreak of the 1848 revolution, which Marx and Engels were confident would lead to the socialist revolution of the proletariat. The failure of 1848 forced them to explain what had prevented this act of deliverance. In subsequent political commentaries, they emphasized complicating factors left out of the more abstract analysis of capitalism: the role of precapitalist classes in European politics, especially the petty bourgeoisie; the baneful role of demoralized workers (Lumpenproletariat); the role of the state as an independent political force; and the role of nations. Many ideas contained in these political writings were never fully integrated with general Marxist theory.
Indeed, Marx's major work on the capitalist system remained a fragment; and he died before he had time to give a systematic presentation of such a central concept as social class. Another task made necessary by the failure of 1848 was to elaborate a political strategy for the proletarian movement. Here Marxism came to emphasize the differences between long-range and short-range objectives. Socialism was defined as the maximal goal, while the minimal goal was the liberation of capitalism from feudal and absolutist residues of a political, economic, or social nature. A more intermediate goal—the dictatorship of the proletariat—received scant mention in Marx's writings.
Problems of political strategy became more important because, after inauspicious beginnings, Marxist doctrine was in time accepted as the party ideology of the European labor movement. There is irony in this merger because the workers' movement which finally accepted Marxist ideology was in many ways different from the proletariat as Marx and Engels had described and idealized it in 1847-1848. It tended toward reformism, had faith in constitutional democracy, and, as the first mass party of modern history, became thoroughly bureaucratized. Ulam (1960) cogently argues that Marxism at that time was no longer a suitable ideology for the European labor movement. Hence, its adoption raises puzzling questions which cannot be pursued here. Suitable or not, the merger of the ideology with the movement was a turning point because, with it, Marxism became a formal ideology, a guide to thought and action, a holy writ and catechism. Henceforth, it tended toward doctrinal rigidity. The growing discrepancy between revolutionary theories and actual party policies lent Marxism a note of hypocrisy, while a widening gap between assumptions and reality had the effect of ideological blinders on those who wanted to use Marxism as a tool for comprehension. Moreover, once Marxism was accepted as party doctrine, its adherents, beginning with Engels, extended the doctrine into areas of inquiry to which Marx himself had not applied it. Marx had thought to encompass with his theory contemporary society and all of human history. Engels sought to integrate Darwinian theory and all natural science with Marxism and to raise Marxist theories to the level of a universal philosophy. For Marx the ultimate determinant of the course of history was man and his needs, but for Engels it came to be matter and its motion. Yet, it was the ideas of Engels which set the tone for orthodox Marxism of both the social democratic and communist persuasions. 
The sociopsychological dynamics behind both this extension of the doctrine and its transformation into a holy writ still need to be explored.

Conflicting interpretations

As soon as Marxism was accepted as the doctrine of the European labor movement, it became a matter of controversy among its followers, partly because the work of Marx and Engels had remained unfinished in many details, but even more because of social changes that had occurred. The ensuing debates dealt with issues of strategy, focusing on the problem of maturity, i.e., the task of defining the point at which a society might be
ripe for the proletarian revolution. Engels alluded to the problem by wondering about the paradox that the proletarian revolution might be impossible as long as it was necessary and unnecessary once it became feasible. Problems of organization and tactics also provided material for controversies. Discussions of these and many related issues are still going on within Marxism, even though the same questions are asked in changing circumstances. Of the controversies raging before World War I, the bitterest was unrelated to strategy or organization. It arose instead out of the growing unrealism of Marxist doctrine. The spread of economic prosperity and constitutional government belied Marxist prognoses about the intensification of crises and misery; and the revolutionary slogans of the Communist Manifesto sounded incongruous when uttered by the moderate leaders of the Second International. To resolve the discrepancy, the "revisionists" proposed a thoroughgoing change of Marxist doctrines so as to make them reflect current conditions, modern scientific insights, and social democratic aims and policies. Revisionism came close to being a repudiation of Marxist ideas; and it can be regarded as the first in a long series of steps away from Marx made by democratic socialists since the turn of the century. Their antagonists in the Second International insistently upheld the letter of Marxist doctrines, identifying loyalty to the writ of Marx with loyalty to the workers' movement. The method of bridging the gap between theory and reality was by denying its existence, meanwhile reinterpreting Marx's revolutionary theories so as to make them yield reformist counsel. For most Marxists committed to democratic socialism, this was an ideological rear-guard action, because the predominant trend in the socialist movement has been to follow the revisionists in their repudiation of Marx. 
While the revisionists sought to change theories in order to align them with reality and the orthodox denied the need for such realignment, a radical wing of the Marxist movement, which arose after the turn of the century, attempted to bridge the gap between theory and practice by leading the workers' movement back to the revolutionary orientation of the Communist Manifesto. This wing became the nucleus of communism; most of its leaders joined communist parties, if only for short periods. The radical Marxists asserted that despite profound changes in the capitalist world since the days of Marx—especially "imperialism" (the export of
capital and of capitalism to dependent countries overseas)—the basic contradictions of capitalism had remained; hence also the inevitability and necessity of the revolution Marx had predicted. Disagreements among the radicals arose over many issues, the most divisive one being the question of organization. One faction, of which Rosa Luxemburg was the outstanding spokesman, saw the roots of "reformism," i.e., democratic socialism, in the bureaucratization of the workers' movement, which they thought stifled revolutionary initiative and proletarian class consciousness. Against them, Lenin and his Bolsheviks believed that revolution making should be subject to rational management (bureaucracy), and they held, moreover, that by itself, spontaneously, the proletariat would not be able to attain class consciousness; hence their emphasis on leadership by an enlightened elite organized in bureaucratic fashion in the party, which should function as the general staff of the proletarian revolution. Leninist Marxism thus focuses on the task of manipulating the masses through leaders who have acquired insight into the politically necessary and possible by applying Marxist categories to the analysis of their society. New ideological problems were bound to arise when a Marxist party was founded in Russia toward the end of the nineteenth century, because most of the conditions to which Marxism originally had been a response were absent in that country. The very fact that Marxism could find acceptance among Russian revolutionaries is an interesting ideological development, which deserves explanation in another context. Russia's economic backwardness and repressive political system exacerbated problems of timing, leadership, organization, political alliances, and related issues, considerably straining the entire framework of Marxist concepts.
For instance, the notion that the bourgeoisie cannot fulfill the ideological promises of the bourgeois revolution was a central tenet of Marx and Engels. But Lenin's idea that the bourgeoisie will not carry out or even initiate "its own" revolution (which will instead have to be accomplished by the proletariat and its allies) requires very bold use of Marxist class terminology. In its mature form Bolshevism takes even greater liberties with the original Marxist conception when it assigns to colonial and other dependent nations a significant role in the hoped-for "proletarian" revolutions. Underdeveloped nations here assume some of the traits which Marx had attributed to the industrial workers of the West. But as a consequence, the "proletarian revolution" itself turns into something very different from what Marx had assumed
it to be. Originally thought of as the act of taking over the mature industrial establishment created (and mismanaged) by capitalism, it now can take over nothing but a backward economy and culture; this leads to the paradoxical conclusion that the proletarian state (a "superstructural" phenomenon) will have to begin constructing its own economic "base," making use of "capitalist" methods in doing so. In their controversies over policy and organization, the different factions of Russian Marxism emphasized those portions of the ideology which seemed to support their views, at the expense of other portions. Lenin and his Bolsheviks were so intent on reaching their goal, the proletarian revolution, that they amended Marx's revolutionary timetable beyond recognition; this led to the accusation that they were Blanquists rather than Marxists. The more cautious Mensheviks, in turn, seemed so concerned with following the timetable provided by Marx that Lenin accused them of postponing socialism and the revolution ad calendas Graecas and thus betraying the cause of the proletariat. The schism which split Russian Marxism into two hostile social-democratic parties was extended to the entire world-wide movement in the years from 1914 to 1920. Disagreements over the proper attitudes a Marxist should take toward the war and the Russian Revolution divided Marxism irreconcilably into socialists and communists, each creating their own international federation of parties. The most important issue between them was their difference of attitude toward the Bolshevik seizure of power and method of governing. The communists regarded their state as the pioneer of the international workers' movement. The socialists saw in it an ill-timed and irresponsible adventure which discredited the Marxist movement. In subsequent decades, the communist-socialist split hastened the process by which democratic socialism came to dissociate itself from Marxist ideology. [See SOCIALISM.] 
Marxism as the ideology of a ruling party

Once in power, the Russian communists sought to use Marxism as a guidebook for the further road toward socialism and for managing a proletarian dictatorship. The difficulties they encountered and the sketchiness of the hints Marx and Engels had provided for solving these problems led to new and sharp controversies among the communists themselves, in which the broadest spectrum of Marxist concepts was once again discussed from divergent points of view. With Stalin's rise to power
the debate was forcibly closed, and his own views were imposed as dogma over all communist parties. The substance of this dogma, based on Leninist principles, might be summarized as follows: Marxism is both scientific truth and ideology, the ideology of the working class. The Communist party alone possesses full scientific insight and expresses the true interests of the workers. Hence, only he who is loyal to the party can be loyal to the proletariat, in tune with the course of history, or capable of grasping the truth. Nothing can be true which contradicts the party. Further, official Soviet doctrine justifies the communist state, its policies and social structure, as a proletarian dictatorship and a true democracy engaged in "constructing socialism." Marxism here has turned into a theory of state defending, as virtually perfect, a regime violating all the libertarian and egalitarian values expressed by Marx. The success with which Marxist concepts have been used to fashion a conservative and authoritarian doctrine of this kind is a major achievement in ideology making. In the Soviet Union this has recently been supplemented by a program for the "transition to communism," which is an obvious attempt to tone down expectations. Calling for little significant change in present-day Soviet society, it signals the withering away of Utopia on this branch of Marxist ideology. Finally, contemporary communist ideology incorporates a program for the further spread of the "proletarian" revolution. But although the workers in the industrialized countries have not been written off formally, in effect the communist movement has now substituted the colonial and other dependent nations in the historic role which Marx attributed to the proletariat of the capitalist world. The function of the ideology in communist political systems has been the subject of much controversy.
Many scholars (echoing communist dogma) see Marxism-Leninism as the master plan guiding all communist thought, actions, and institutions. Others assert that it is no more than rationalization which easily adapts to any changes in policy. However contradictory, both theories have some plausibility but are easily refuted in their exaggerated forms. Marxist ideology, as amended by Lenin and his successors, did inspire the men who made communist revolutions and has influenced communist regimes and institutions even more directly than the ideas of Rousseau and Locke have shaped the institutions of the French and American republics. Although the ideology becomes primarily rationalization after the communist seizure of power, it
does remain the language of politics, meaning not only a code of communications for the political elite but also the conceptual frame of reference used for cognitive and ethical self-orientation. It thus determines both analysis and action, if only negatively, that is, as ideological blinders and as a brake ("bad conscience") on freedom of action. [See IDEOLOGY.] A doctrine which is meant to serve as a useful aid to cognition must be realistic and flexible, whereas a doctrine functioning as a communications code need only be rigid. These and other conflicting functions of the ideology strain it. Primarily, perhaps, the ideology functions as a legitimation device, implying not only an exercise in public relations to attain legitimacy among the citizens but, even more important, a continual attempt by the party leaders to convince themselves of their own legitimacy; more generally, it functions as an ideological exoskeleton for insecure bureaucrats in a vast and powerful administrative machine. The implication is that communist leaders in making doctrinal pronouncements speak more to themselves than to their citizenry, and least of all to the outside world. This phenomenon of self-encouragement or self-legitimation, observable in all societies, has been unduly neglected by contemporary communications theory.

Ideological strains

Since World War II, communist parties have come to power in close to a dozen countries of eastern Europe, Asia, and the Caribbean area. These various Marxist-Leninist regimes came to power by widely divergent methods and, once installed, faced very different tasks because of the great differences in the cultures, economic development, political traditions, and social structures of the countries concerned.
If to these variations in national interests and outlooks of the several communist regimes one adds the bitter memory of past disagreements and injuries, plus the normal political rivalries between groups and personalities within a large group of states, it is not astonishing that sharp conflicts arose in the communist camp, straining relations between the different regimes and within each communist party. Given the relations between politics and ideology within the communist movement, these disagreements sooner or later had to become doctrinal and thus turned into questions of fundamental principle. Hence, the dialogue between Tito and Stalin, between Khrushchev and Mao, and between revisionists and dogmatists in every communist party led to a discussion not only of basic problems in revolutionary strategy and socialist construction but also of the most fundamental concepts of Marxism-Leninism. The resulting differences within the communist world are today as deep as the schism between Mensheviks and Bolsheviks half a century ago. Moreover, the issues under discussion are similar to those which divided Russian Marxism at that time, even though the circumstances and the concrete reference points have changed. The unity of the communist movement is as irretrievably gone as was the unity of European Marxism at the time of the October Revolution. As a result, what 15 years ago seemed well-established dogma is now subject to doubt. Within some of the communist societies, party dogmas today are also being criticized in the name of science, while the practices of communist governments and the rhetoric which justifies them are challenged in the name of Marxist humanism. In short, official ideology is assailed from several directions. Two tendencies are likely to result from this multiple onslaught. Official ideological output may turn increasingly into empty, meaningless political oratory, ritually incanted on suitable ceremonial occasions but as removed from life as Sunday sermons and Independence Day speeches. At the same time, a genuine dialogue conducted between and within the several communist parties over a sufficient period of time might serve to reinvigorate Marxist ideology, especially in communist parties that have not yet come to power, even though it may also lead many former adherents to repudiate this ideology. [See the articles under COMMUNISM.]

Marxism in the noncommunist world

Although the communist movement (or movements, as one must now write) claims to be the legitimate heir of Marxist ideology, Marxism continues to exist, in the Western world, as a noncommunist ideology. To be sure, most socialist parties have gone far in severing their ties with Marxism.
Yet interest in it has increased in certain intellectual circles. Most of this is roundly critical and, especially in its recent intensified form, is a function of the cold war. The variety of points of view from which Marxism, or what is understood to be Marxism, has been criticized cannot be summarized here. But some of the recent interest has been sympathetic. This may have been stimulated by the collaboration of many diverse elements with communists and socialists during and shortly after World War II. In addition, political, economic, racial, and other difficulties that have beset the
Western world since the war have increased many intellectuals' awareness of defects in their social system. For anyone focusing his attention on negative aspects of contemporary social life, Marxism offers considerable attraction, principally because of two elements: one is the message of inevitable doom derived from the analysis of the capitalist economy; the other is the humanist ethic of Marxism, namely the emphasis on the evil features of a commercial civilization, the romantic anger at institutions and practices that degrade, oppress, or exploit some men, and the sanguine belief in the inherent goodness of mankind, which under favorable circumstances can and will free itself from inhibiting and corrupting institutions. In the last decade or two interest in this humanist philosophy of Marx and in the early writings in which it is expressed has increased rapidly. Finally, there is some increase in the interest social scientists have in Marx as a precursor of contemporary social science: some of his methodological contributions are only now receiving recognition.
ALFRED G. MEYER
BIBLIOGRAPHY
See the bibliographies following the articles on ENGELS; LENIN; and MARX for their major works.
BLOOM, SOLOMON F. 1941 The World of Nations: A Study of the National Implications in the Work of Karl Marx. New York: Columbia Univ. Press.
CHAMBRE, HENRI (1959) 1963 From Karl Marx to Mao Tse-tung: A Systematic Survey of Marxism-Leninism. New York: Kennedy. -> First published in French.
COLE, G. D. H. 1953-1960 A History of Socialist Thought. 5 vols. New York: St. Martins; London: Macmillan. -> Volume 1: Socialist Thought: The Forerunners 1789-1850, 1953. Volume 2: Marxism and Anarchism 1850-1890, 1954. Volume 3: Second International 1889-1914, 2 parts, 1956. Volume 4: Communism and Social Democracy 1914-1931, 2 parts, 1958. Volume 5: Socialism and Fascism 1931-1939, 1960.
DANIELS, ROBERT V. 1960 The Conscience of the Revolution: Communist Opposition in Soviet Russia. Russian Research Center Studies, No. 40. Cambridge, Mass.: Harvard Univ. Press.
FETSCHER, IRING (1956) 1959 Von Marx zur Sowjetideologie. 4th ed. Frankfurt am Main: Diesterweg.
Fundamentals of Marxism-Leninism. 2d ed., rev. (1959) 1963 Moscow: Foreign Languages Publishing House. -> First published as Osnovy marksizma-leninizma.
GAY, PETER 1952 The Dilemma of Democratic Socialism: Eduard Bernstein's Challenge to Marx. New York: Columbia Univ. Press. -> A paperback edition was published in 1962 by Collier.
GREGOR, A. JAMES 1965 A Survey of Marxism: Problems in the Philosophy and Theory of History. New York: Random House.
HAIMSON, LEOPOLD H. 1955 The Russian Marxists & the Origins of Bolshevism. Russian Research Center Studies, No. 19. Cambridge, Mass.: Harvard Univ. Press.
KOLARZ, WALTER (1959) 1964 Books on Communism: A Bibliography. 2d ed. New York: Oxford Univ. Press.
LABEDZ, LEOPOLD (editor) 1962 Revisionism: Essays on the History of Marxist Ideas. New York: Praeger.
LEHMBRUCH, GERHARD (1956) 1958 Kleiner Wegweiser zum Studium der Sowjetideologie. Bonn: Gesamtdeutscher Verlag. -> A revised and enlarged edition of H. Gollwitzer and G. Lehmbruch's Kleiner Wegweiser zum Studium des Marxismus-Leninismus.
LICHTHEIM, GEORGE (1961) 1964 Marxism: An Historical and Critical Study. 2d ed., rev. London: Routledge. -> A paperback edition was published by Praeger in 1965.
MARCUSE, HERBERT (1941) 1955 Reason and Revolution: Hegel and the Rise of Social Theory. 2d ed. New York: Humanities; London: Routledge. -> A paperback edition was published in 1960 by Beacon.
MARCUSE, HERBERT 1958 Soviet Marxism: A Critical Analysis. New York: Columbia Univ. Press. -> A paperback edition was published in 1961 by Vintage.
MEYER, ALFRED G. 1957 Leninism. Russian Research Center Studies, No. 26. Cambridge, Mass.: Harvard Univ. Press. -> A paperback edition was published in 1962 by Praeger.
MITRANY, DAVID 1951 Marx Against the Peasant: A Study in Social Dogmatism. Chapel Hill: Univ. of North Carolina Press.
PLAMENATZ, JOHN P. (1954) 1961 German Marxism and Russian Communism. New York: Longmans.
RAMM, THILO 1955 Die grossen Sozialisten als Rechts- und Sozialphilosophen. Stuttgart: Fischer.
ROSENBERG, ARTHUR (1932) 1939 A History of Bolshevism: From Marx to the First Five Years' Plan. London and New York: Oxford Univ. Press. -> First published in German.
RUBEL, MAXIMILIEN 1956 Bibliographie des oeuvres de Karl Marx: Avec en appendice un repertoire des oeuvres de Friedrich Engels. Paris: Riviere. -> A 74-page supplement was added in 1960.
SCHORSKE, CARL E. 1955 German Social Democracy, 1905-1917: The Development of the Great Schism. Cambridge, Mass.: Harvard Univ. Press.
SCHWARTZ, BENJAMIN I. 1951 Chinese Communism and the Rise of Mao. Cambridge, Mass.: Harvard Univ. Press.
TIMASHEFF, NICHOLAS S. 1946 The Great Retreat: The Growth and Decline of Communism in Russia. New York: Dutton.
TUCKER, ROBERT C. 1961 Philosophy and Myth in Karl Marx. Cambridge Univ. Press.
ULAM, ADAM B. 1960 The Unfinished Revolution: An Essay on the Sources of Influence of Marxism and Communism. New York: Random House.
WETTER, GUSTAVO A. (1948) 1959 Dialectical Materialism: A Historical and Systematic Survey of Philosophy in the Soviet Union. New York: Praeger. -> First published as Il materialismo dialettico sovietico.
MARXIST SOCIOLOGY Karl Marx introduced into the social sciences of his day a new method of inquiry, new concepts, and a number of bold hypotheses to explain the rise, development, and decline of particular forms
of society; all of which came to exercise, in the later decades of the nineteenth century, a profound and extensive influence upon the writing of history, political science, and sociology. Marx was also a man of action, a revolutionary, whose political creed stood in a complex and uneasy relationship to his scientific investigations, and his followers, the Marxists of various hues, have tended toward one or the other limit of his ideas, to doctrinal exposition, or to the furtherance of a science of society. Marxist sociology has been one of the principal battlefields in this conflict between objective science and political commitment. Marx's contributions On the side of scientific method, Marx made two important contributions. One was to adopt, and to maintain consistently in his work, a view of human societies as wholes or systems in which social groups, institutions, beliefs, and doctrines are interrelated and have to be studied in their interrelations rather than treated in isolation, as in the conventional separate histories of politics, law, religion, or thought. The second contribution was the view of societies as inherently mutable systems, in which changes are produced largely by internal contradictions and conflicts, and the assumption that such changes, if observed in a large number of instances, will show a sufficient degree of regularity to allow the formulation of general statements about their causes and consequences. Historical materialism. Marx's ideas, which played an essential part in the formation of modern sociology, had been adumbrated in the works of earlier thinkers as diverse in other respects as Hegel, Saint-Simon, and Adam Ferguson, all of whom greatly influenced Marx; and they resemble in some aspects the ideas which Comte and Spencer propounded in their attempts to lay the foundations of sociology. 
But Marx elaborated his conception of the nature of society, and of the appropriate means to study it, in a more precise, and above all more empirical, fashion than did his predecessors. He introduced an entirely new element by attributing to the characteristics of the economic system and to the derived relations between social classes a predominant influence in determining the structure of each society. It was this feature of Marx's method, to be known subsequently by the somewhat misleading term "historical materialism," which was widely accepted by later sociologists as offering a more promising starting point for exact and realistic investigations of the causes of social change than could be found in such notions as the three stages of man's intellectual development (Comte) or the process of superorganic evolution (Spencer).
Social class and social conflict. Marx's theories followed to a great extent from the above methodological conceptions, which he referred to as the "guiding thread" in his studies (1859, preface). The significance of the economic system of society was elaborated in a theory which traced the formation of the principal social groups (the classes) to the forms of ownership of the means of production and the forms of labor of nonowners. The idea of social change resulting from internal conflicts was developed in a theory of class struggles which made social classes the principal, if not the sole, agents of political activity; and this conception in turn led to the distinction between ruling and oppressed classes and to a distinctive theory of the state. The conviction that social changes display a regular pattern led Marx to construct, in broad outline, a historical sequence of the main types of society, proceeding from the simple, undifferentiated society of "primitive communism" to the complex class society of modern capitalism; and he sketched an explanation of the great historical transformations which demolished old forms of society and created new ones in terms of economic changes which he regarded as general and constant in their operation. Although this theoretical scheme was intended to have a universal character, Marx actually employed it in a partial manner. His own researches were limited almost entirely to the nineteenth-century capitalist societies, and he gave only fragmentary accounts of the other types of society, in brief allusions in Capital, in newspaper articles and correspondence, and in manuscripts which were published after his death (see especially 1857-1858). Furthermore, some of his most important theoretical ideas were derived immediately from the observation of modern societies, and they fit closely only these particular societies.
His theory of social classes applies in the main to the formation and development of the modern bourgeoisie and proletariat; it is not so helpful when applied to the phenomena of a caste system. Clearly, the theory of social conflict originated in an interpretation of the French Revolution, the materials for which had been prepared by earlier historians, and it was developed further by observation of the class struggles which accompanied the growth of the labor movement in western Europe. The concept of ideology, similarly, originated in Marx's criticism of some contemporary social doctrines—utilitarianism, the "critical philosophy" of the Young Hegelians, political economy in some
of its aspects—which he regarded as concealing or distorting the real relationships between men and the actual social conflicts in the European societies of his time (Marx & Engels 1845-1846). It is not a concept which Marx brought, or tried to bring, within the framework of a general sociological theory of knowledge. This intense preoccupation with the origins and development of industrial capitalism is, indeed, a feature of Marx's theories which helps to account for the interest which they still excite. It has enabled Marxists to represent his thought as a modern philosophy that is closely linked with the progress of science and industry, and it has enabled sociologists to discover in it the elements of a theory of industrialization and economic growth.
Marx's influence in the nineteenth century
Marx's scientific writings were not widely noticed or criticized during his lifetime, and he became known principally as the author of a political doctrine expounded in the Communist Manifesto in 1848 and as one of the animators of the International Working Men's Association. Furthermore, the early expounders of his ideas, other than Friedrich Engels, were themselves political leaders of the growing working class movement in Europe (men such as August Bebel, Karl Kautsky, and Eduard Bernstein in Germany; Jules Guesde and Paul Lafargue in France) rather than scholars. Only in the late 1880s did Marx's theories begin to claim the serious attention of academic social scientists. The first major work of sociology to recognize his importance and to display the influence of his thought was Ferdinand Tonnies' Community and Society (1887). In this book Tonnies expounded his distinction between two forms of society, "community" (Gemeinschaft) and "association" (Gesellschaft), which has become one of the classic themes of sociology.
His debt to Marx is indicated by the importance which he assigned to the system of production in determining these different forms of society and by the character of his analysis of modern capitalism. Much later Tonnies published an excellent short study of Marx's life and work (1921), in which he examined more fully the nature and limitations of Marx's contribution to sociology. A more general recognition by the German academic world of Marx's importance as a sociological thinker became apparent in the 1890s with the publication of a long essay by Werner Sombart on Marx's theory of modern capitalism, books by Rudolf Stammler and Thomas G. Masaryk on the methodological foundations of his theories, and
numerous discussions in scholarly journals. At this time Marx's work also began to be discussed by eminent scholars in other European countries: in Italy by Antonio Labriola (1895-1896), Benedetto Croce (in several essays which are collected in Croce 1900), Giovanni Gentile, and Vilfredo Pareto; and in France by Georges Sorel, who expounded Marx's theories in a number of articles from 1894 onward and published in his journal, Le devenir social, some notable essays on Marx by European scholars as well as his own reviews, from a Marxist standpoint, of the work of contemporary sociologists. Marx's sociology also figured prominently in the contributions to the first international congress of sociology held in 1894. Divergence of Marxism and sociology By the beginning of the twentieth century, therefore, Marx had been generally accepted as the author of a profound and original system of sociology, yet in the following period the influence of Marxism upon sociology diminished rather than increased. Many of the writers who had first drawn attention to the importance of Marx's theories— among them Croce, Sorel, and Pareto—now became severe critics of Marxist thought and advanced new social and historical theories which, however much they might owe to the initial shock which Marx's ideas had produced, were conceived in an entirely different fashion. On the other hand, a number of influential Marxist thinkers came to regard more critically the claims of sociology as a positive science and to insist more strongly upon the character of Marxism as a revolutionary social philosophy. In the early 1900s only the small but distinguished group of thinkers who became known later as the Austro-Marxists were engaged in an attempt to set forth and develop the sociological elements in Marx's thought. 
Max Adler (1925), a philosopher deeply influenced by Neo-Kantianism, represented Marx as having established the epistemological foundations of social science, as Kant had done for the natural sciences; he saw in Marxism a sociological system of causal explanation. Another member of the group, Karl Renner (1904), produced what is still the outstanding Marxist contribution to the sociology of law, a study of the effect of economic forces and social changes upon the working of modern legal institutions. The writings of the Austro-Marxists, however, did not arrest the growing divergence between Marxism and sociology, which appears most clearly in the contrast between the work of Max Weber and Pareto in sociology and the fresh expositions of
Marxist thought by Karl Korsch and György Lukacs.
Sociology: Weber and Pareto. Marxism was unquestionably one of the strongest influences upon the work of Max Weber, much of which is devoted either to testing, in a particular context, some part of Marx's theories, or to reassessing in a more general way his concepts and methods. In the first of these directions, Weber's best-known study is that on the origins of modern capitalism (1904-1905), which is intended to show that a body of religious ideas (the Protestant ethic) played a vital part in the development of European capitalism, alongside the economic changes and the rise of a new class, through the inculcation of new attitudes toward wealth, science, and work. From this first revision of Marx's economic interpretation of history, Weber went on to examine on a wider scale the social influence of religious ideas, to amend and supplement the Marxist theory of classes, to outline a radically different theory of political power, and to suggest an interpretation of modern European history as a movement, not toward socialism but, rather, toward greater bureaucratic regulation. In the sphere of methodology, Weber's preoccupation with historical materialism is evident in his discussion (1907) of a book by Stammler and especially in an editorial in the Archiv für Sozialwissenschaft und Sozialpolitik in 1904, in which he observed that while the materialist conception of history should be rejected as a comprehensive Weltanschauung, the interpretation of historical events from the aspect of their economic conditioning or relevance may be accepted as a useful methodological principle, above all in the study of modern societies. The impression made by Marxist ideas is equally clear in the earlier writings of Pareto, who singled out, as Marx's chief contribution to sociology, the theory of class conflict (1902-1903).
This provided the basis for Pareto's own later elaboration of the idea of the struggle between elites for political power, which became the vital element in an interpretation of history directly opposed to that of Marx. Pareto replaced the idea of the progressive development of class systems by a cyclical theory of the rise and fall of elites, and concentrated attention upon the conditions of social equilibrium rather than the causes of social change. Marxist philosophy. Both Weber and Pareto aspired, though in different ways and with varying success, to establish sociology as an objective social science. Korsch and Lukacs, on the other hand, questioned the possibility, and also the value, of
such an objective science, and they expounded Marxism as a philosophy of society which approaches every problem from the point of view of the working class. Korsch, in Marxismus und Philosophie (1923), began by criticizing those thinkers who had regarded Marxism either as a set of methodological rules or as a system of universal causal laws, that is, as a general sociology in the positivist sense. According to him, Marxism includes both empirical and philosophical elements, but the latter are those which distinguish it clearly from other social theories. It is empirical in the sense that it deals with real social movements in modern society and is not in flagrant contradiction with actual events; it is philosophical in the sense that it interprets the facts by means of a conception of history as a process which will terminate in a "classless society." Because of this vision of the future which it contains, it is above all a theory of social revolution which expresses the outlook, and reflects the practical social activity, of a revolutionary class. In similar fashion Lukacs argued, in several of the essays collected in Geschichte und Klassenbewusstsein (1919-1922), that Marxism is not to be regarded as an objective interpretation of man's social history (still less as a scientific theory of social evolution) but as an interpretation, from the standpoint of the revolutionary working class, of the historical origins and development of capitalist society. Both writers insisted upon the opposition between Marxism and sociology. For them, Marxism is essentially a theory of history concerned with unique sequences of events and taking account both of objective conditions and of subjective human strivings.
Sociology, on the other hand, by its ambition to establish general social laws, in the first place turns man into an object and discounts the subjective aspects of human action and, second, substitutes for the view of society as a historical process the conception of an unvarying system of social relationships which is to be discovered in every form of society. This idea of Marxism did not find favor with the orthodox Marxist-Leninists, whose opinions were authoritatively expressed at that time through the Third Communist International. However, there were few scholars among the orthodox who attempted to set out an alternative version or to meet the sociological criticisms of Marxism on their own ground. The most important of them was undoubtedly Nikolai Bukharin, whose exposition of historical materialism (1921) is noteworthy for the serious attention which it gives to the difficulties arising from the claim that Marxism is at the same
time an objective social science and the doctrine of a particular social class, and for its discussions of some of the more important criticisms of Marx. During the early 1900s, the intellectual and political influence of Marxism and the discussions of Marx's sociological theories were largely confined to the continental European countries. In Britain, Marxism made little impact upon sociology, either then or later. The influence of Marxism was greater in the early development of American sociology, but it was soon overshadowed. Thorstein Veblen is the most notable of those who turned to Marx as a source of powerful and radical ideas, which he then developed in his own fashion in theories of the influence of technology upon social structure (1899) and of the rise to power of the engineers (1921). Albion W. Small, who assigned to Marx a place as the Galileo of the social sciences, also played a large part in introducing Marxist ideas and was himself strongly influenced by Marx in working out his theories of social conflict. Marxist influence since the 1930s In the period from the early 1930s to the present day, the lines of thought distinguished above have continued and have been enriched by new studies. A number of Marxist writers have upheld the opposition between Marxism and sociology, and they have found new evidence for their views in Marx's early manuscripts, which began to be published in 1932. Thus, Korsch expounded his ideas more fully, but in the same form, in a study of Marx (1938) that was contributed to a series on the great sociologists. A few years later Herbert Marcuse (1941), in a study of the relations between Marx and Hegel, represented Marx's thought as the culminating achievement of the Hegelian dialectical method, as a "critical philosophy" of society which Marcuse contrasted with the positive philosophy and sociology of Comte. 
The same general view of the nature of Marx's thought, inspired in this case by Lukacs, is to be found in the work of Lucien Goldmann on the methods of the social sciences (1959) and on the social context and the literary expression of Jansenism in France (1955); and it has recently been expounded at length by Jean-Paul Sartre (1960), who argued that sociology, as an empirical discipline, either stands opposed to, or must be comprehended within, Marxism, which alone makes possible an understanding of the historically changing totality of social life. Mainstream of sociology. In the mainstream of sociological thought, many writers continued to turn to Marx's work as a source of specific ideas and problems which they could develop along new
lines. One of the most important ideas which was thus reassessed was that of ideology. It had already attracted the attention of Marxist writers at the end of the nineteenth century, and Franz Mehring's Die Lessing-Legende (1893) is the first major attempt to make use of Marx's theories in the interpretation of literary styles. But it was not until a quarter of a century later that Marxist literary criticism revealed its full scope in the work of Lukacs, beginning with the publication of Die Theorie des Romans (1920) and continuing with his studies of nineteenth-century European realism (1935-1939) and of the historical novel (1947). Another Marxist writer who was greatly preoccupied with problems of ideology in a broader sense is Antonio Gramsci, much of whose work was done during his imprisonment by the Italian fascist government and has become generally known and influential only since the 1950s. Gramsci was especially concerned with the nature of the cultural dominance exercised by a ruling class, to which he attributed much greater importance than other Marxists had done, and, on the other hand, with the means by which the working class in a capitalist society might resist bourgeois cultural influences while developing its own forms of expression in literature, art, and thought. The notion of "social hegemony" which he introduced was meant to emphasize the interdependence of economic, political, and cultural elements in class conflicts; and his studies of the role of intellectuals, of the educational system, and of other aspects of culture (Gramsci 1949), inspired by this idea, were highly original contributions to the discussion of the old Marxist problem of the relations between "base" and "superstructure" in social life. 
The notion of ideology also provided the central theme in the work of Karl Mannheim, who envisioned his task as the elaboration of a general sociology of knowledge from Marx's one-sided criticism of bourgeois ideologies, as a means of understanding the ideological and political conflicts of the twentieth century. Mannheim's writings were symptomatic of a deep concern with the problems of ideology which has lasted until the present time and has produced a number of notable works, from the brilliant critical study by Ernst Grünwald (1934) to the historical survey, dealing at length with Marx and Nietzsche, by Hans Barth (1945).
Theories of class structure. Mannheim was exceptional in attributing such overwhelming importance to the problems of ideology, and it is through other concepts (particularly those of class and conflict) that Marx has had his chief influence upon modern sociology. All the major theories of class structure, from those of Max Weber, Joseph Schumpeter, and Theodor Geiger up to those of the present day, have begun from Marx's formulation of the question and have been more or less strongly influenced in their conclusions by Marx's own results. Among recent writers, few have been prepared to abandon entirely Marx's model of the class system; but most of them have introduced modifications and have questioned Marx's explanations and predictions. Raymond Aron (1950; 1964), C. Wright Mills, and Ralf Dahrendorf (1959) reject, as inconsistent with the evidence, the constant association between economic ownership and political power which is a basic postulate of Marx's theory, and they draw attention in particular to the alternative bases of political power in societies where private ownership of industrial wealth is nonexistent. Marshall (1934-1962), Lockwood (1958), Lockwood and Goldthorpe (1963), and Mills (1951) examine the changing composition of the main social classes during the past century, especially the changes in the position of the middle classes, and show how these changes affect the relations between classes in a manner of which Marx's theory takes no account.
Class conflict. Most recent sociologists have criticized Marx's theory of class conflict, especially that part of it which asserts the inevitability of working-class revolutions in capitalist societies and the eventual cessation of conflict in a society without classes. The critics, such as Dahrendorf (1959) and Aron (1964), argue that the growing differentiation of functions and the increasing separation between the economic, political, and other spheres in the advanced industrial societies have removed the basis for the coalescence of industrial, political, and ideological conflicts in massive class struggles, and that revolutionary movements have in fact disappeared from these societies.
At the same time, they assert that some forms of conflict are unavoidable in any large and complex society and that a society without intergroup conflict, such as Marx envisaged, is sociologically impossible. The work of these writers shows, however, the extent to which the Marxist theory of conflict has influenced recent sociology; it has restored to the center of attention the problems of conflicting interests and values and of the strains produced by social change, which had been neglected in those theories, previously in the ascendant, that were chiefly concerned with consensus, integration, and social order.
Communist countries. It might have been anticipated that with the spread of Marxism as a political creed in eastern Europe and Asia after 1945, there would be some revival of Marxist sociology in the countries concerned. However, so far this has not taken place. The later years of Stalin's rule were not propitious for any kind of sociology, and Soviet Marxism became increasingly occupied with adapting conventional formulas to political circumstances rather than with developing philosophical or sociological arguments; it was even less concerned with encouraging empirical investigations into Soviet society. Since Stalin's death there has been a resurgence of sociological research in communist countries, but it is not in any obvious respect inspired by Marxist ideas. Much of the research is concerned with problems which are to be found in all industrialized, or rapidly industrializing, societies (technological change, productivity, urban growth, delinquency, education, and leisure), and it is carried out by the same methods that are used elsewhere. Only in a few instances, where there is some significant difference in the institutional setting of the problems, as in studies of the workers' councils in Yugoslavia, does the Marxist theoretical system appear to have any importance in shaping the investigations. In the sphere of theoretical sociology, the contributions from communist countries have been few, and they have often revealed the difficulty of maintaining the Marxist system intact. A good example may be found in one of the most distinguished of these contributions: the last work published by the Polish sociologist Stanislaw Ossowski (1957), in which a profound reappraisal of Marx's theory of class leads to conclusions which do not differ widely from those reached by sociologists elsewhere.
Ossowski recognizes that substantial changes have occurred in the class structure of capitalist countries, and he observes, in particular, that in all the modern industrial societies the political authorities increasingly determine the system of social stratification, rather than being determined by it, as a rigorous Marxist view would maintain. He also considers and criticizes the arguments which have been put forward, on opposite sides, for regarding both the United States and the Soviet Union as "classless societies." Perhaps his most important contribution, however, is to distinguish the various conceptions of class which were incorporated into Marx's theory, to establish the tentative character of Marx's synthesis, and to show its potentialities for further development so long as it is not accepted as dogma. Ossowski's book may be seen, to some extent, as the harbinger of a more creative period of Marxist thought in communist countries.
Defining Marxist sociology

The record of the encounter between Marxism and sociology since the 1880s shows plainly that while they are distinct, and even opposed, they have never ceased to have a powerful influence upon each other. Marxism is more than a system of sociology; it is a philosophy of man and society, as well as a political doctrine. Sociology, as it has mainly developed in the present century, is an attempt to describe impartially, to measure exactly, and to connect by means of scientific generalizations the diverse phenomena of social life. Even if it be held that a "philosophical anthropology" underlies every major system of theoretical sociology, as Karl Löwith does in his illuminating comparison of Weber and Marx (1932), Marxism still retains a distinctive character; no other body of social thought has become, in this way, the unique doctrine of a political movement and finally the orthodoxy of a ruling party. No other theory, therefore, has been so liable to end in dogmatic assertion and estrangement from social science.

Between Marxism and sociology, the place of Marxist sociology is variable and uncertain. In one sense, Marxist sociology could be regarded as the sociology of those thinkers (for example, Nikolai Bukharin and Max Adler) who, on other grounds, are Marxist in their general philosophical or political outlook. It would then be of the same kind as any other school of sociology (let us say Thomist or Hindu sociology) which is based directly upon a philosophical world view. But it would still be affected by, and would have to respond to, the findings of empirical social research; and at some stage Marxists would be led to consider, as has happened in recent years, whether in fact there can be a separate Marxist sociology any more than there can be a separate Marxist physics.
In a broader sense, however, Marxist sociology might be regarded as including the work of all those thinkers who attach prime importance, in the investigation and explanation of social events, to the role of economic interests, relations between classes, and intergroup conflicts, without necessarily agreeing with the particular conclusions that Marx himself reached. But this category may seem too broad, since it would include those, from Weber and Pareto up to the recent sociologists discussed above, who have acknowledged Marx's outstanding importance as a thinker and have turned to his work for concepts and hypotheses, but who have revised or rejected so much of his system that it would be eccentric to refer to them as Marxists.
Lastly, Marxist sociology may be treated as a methodology, as a persistent critique of the aims and methods of the social sciences. In this form it has undoubtedly been prominent and important, as the writings of Lukács, Marcuse, and Sartre bear witness; but here it becomes not so much Marxist sociology as Marxist "anti-sociology."

T. B. BOTTOMORE

[See also ALIENATION; KNOWLEDGE, SOCIOLOGY OF; LEISURE; MARXISM; STRATIFICATION, SOCIAL, articles on SOCIAL CLASS and THE STRUCTURE OF STRATIFICATION SYSTEMS; and the biographies of CROCE; LENIN; LUKÁCS; LUXEMBURG; MANNHEIM; MARX; MILLS; OSSOWSKI; PARETO; SOMBART; TÖNNIES; TROTSKY; WEBER, MAX.]

BIBLIOGRAPHY

ADLER, MAX 1925 Kant und der Marxismus. Berlin: Laub.
ARON, RAYMOND 1950 Social Structure and the Ruling Class. British Journal of Sociology 1:1-16, 126-143.
ARON, RAYMOND 1960 Classe sociale, classe politique, classe dirigeante. European Journal of Sociology 1:260-281.
ARON, RAYMOND 1964 La lutte de classes. Paris: Gallimard.
BARTH, HANS 1945 Wahrheit und Ideologie. Zurich: Manesse.
BUKHARIN, NIKOLAI I. (1921) 1926 Historical Materialism: A System of Sociology. London: Allen & Unwin. -> First published as Teoriia istoricheskogo materializma.
CROCE, BENEDETTO (1900) 1922 Historical Materialism and the Economics of Karl Marx. London: Allen & Unwin; New York: Macmillan. -> First published as Materialismo storico ed economia marxistica.
DAHRENDORF, RALF 1959 Class and Class Conflict in an Industrial Society. Stanford Univ. Press. -> A greatly revised and expanded edition of a book first published in German in 1957.
FROMM, ERICH (editor) 1961 Marx's Concept of Man. New York: Ungar.
GOLDMANN, LUCIEN 1952 Sciences humaines et philosophie. Paris: Presses Universitaires de France.
GOLDMANN, LUCIEN 1955 Le dieu caché. Paris: Gallimard.
GOLDMANN, LUCIEN 1959 Recherches dialectiques. Paris: Gallimard.
GRAMSCI, ANTONIO (1919-1937) 1959 The Modern Prince, and Other Writings. New York: International Publishers.
GRAMSCI, ANTONIO 1949 Gli intellettuali e l'organizzazione della cultura. Turin: Einaudi.
GRÜNWALD, ERNST 1934 Das Problem der Soziologie des Wissens. Vienna and Leipzig: Braumüller.
KORSCH, KARL (1923) 1930 Marxismus und Philosophie. 2d ed. Leipzig: Hirschfeld.
KORSCH, KARL 1938 Karl Marx. London: Chapman.
LABRIOLA, ANTONIO (1895-1896) 1908 Essays on the Materialistic Conception of History. Chicago: Kerr. -> First published in Italian.
LICHTHEIM, GEORGE 1961 Marxism: An Historical and Critical Study. New York: Praeger.
LOCKWOOD, DAVID 1958 The Blackcoated Worker. London: Allen & Unwin.
LOCKWOOD, DAVID; and GOLDTHORPE, J. H. 1963 Affluence and the British Class Structure. Sociological Review 11:133-163.
LÖWITH, KARL (1932) 1960 Max Weber und Karl Marx. Pages 1-67 in Karl Löwith, Gesammelte Abhandlungen zur Kritik der geschichtlichen Existenz. Stuttgart: Kohlhammer.
LUKÁCS, GYÖRGY (1919-1922) 1923 Geschichte und Klassenbewusstsein: Studien über marxistische Dialektik. Berlin: Malik.
LUKÁCS, GYÖRGY (1920) 1963 Die Theorie des Romans: Ein geschichtsphilosophischer Versuch über die Formen der grossen Epik. 2d ed., enl. Neuwied am Rhein (Germany): Luchterhand.
LUKÁCS, GYÖRGY (1935-1939) 1964 Studies in European Realism. New York: Grosset & Dunlap. -> Contains essays first published in Hungarian and German. First published in English in 1950.
LUKÁCS, GYÖRGY (1947) 1965 The Historical Novel. New York: Humanities. -> First published in book form in Hungarian as A történelmi regény. Parts 1 and 2 first appeared in 1937 in volumes 7, 9, and 12 of Literaturnyi kritik.
MARCUSE, HERBERT (1941) 1955 Reason and Revolution: Hegel and the Rise of Social Theory. 2d ed. London: Routledge. -> A paperback edition was published in 1960 by Beacon.
MARCUSE, HERBERT 1958 Soviet Marxism: A Critical Analysis. New York: Columbia Univ. Press. -> A paperback edition was published in 1961 by Vintage.
MARSHALL, T. H. (1934-1962) 1964 Class, Citizenship, and Social Development: Essays. Garden City, N.Y.: Doubleday. -> A collection of articles and lectures first published in England in 1963 under the title Sociology at the Crossroads and Other Essays. A paperback edition was published in 1965.
MARX, KARL (1844) 1963 Early Writings. Translated and edited by T. B. Bottomore. London: Watts.
MARX, KARL (1844-1875) 1964 Selected Writings in Sociology and Social Philosophy. 2d ed. Edited by T. B. Bottomore and M. Rubel with a foreword by Erich Fromm. New York: McGraw-Hill.
MARX, KARL (1845) 1956 The Holy Family. Moscow: Foreign Languages Publishing House. -> First published as Die heilige Familie.
MARX, KARL (1857-1858) 1953 Grundrisse der Kritik der politischen Ökonomie. Berlin: Dietz. -> Written in 1857-1858. First published posthumously by the Marx-Engels-Lenin Institute, Moscow, in 1939-1941. A partial English translation was published as Precapitalist Economic Formations in 1965 by International Publishers.
MARX, KARL (1859) 1913 A Contribution to the Critique of Political Economy. Chicago: Kerr. -> First published as Zur Kritik der politischen Ökonomie.
MARX, KARL; and ENGELS, FRIEDRICH (1845-1846) 1939 The German Ideology. Parts 1 and 3. With an introduction by R. Pascal. New York: International Publishers. -> Written in 1845-1846; first published in German in 1932.
MEHRING, FRANZ (1893) 1953 Die Lessing-Legende: Zur Geschichte und Kritik des preussischen Despotismus und der klassischen Literatur. Berlin: Dietz.
MILLS, C. WRIGHT 1951 White Collar: The American Middle Classes. New York: Oxford Univ. Press. -> A paperback edition was published in 1956.
OSSOWSKI, STANISLAW (1957) 1963 Class Structure in the Social Consciousness. London: Routledge; New York: Free Press. -> First published as Struktura klasowa w społecznej świadomości.
PARETO, VILFREDO (1902-1903) 1965 Les systèmes socialistes. 3d ed. Paris: Droz. -> Constitutes Volume 5 of Pareto's Oeuvres complètes.
RENNER, KARL (1904) 1949 The Institutions of Private Law and Their Social Functions. London: Routledge. -> First published in German in Marx-Studien under the pseudonym J. Karner.
SARTRE, JEAN-PAUL 1960 Critique de la raison dialectique, précédé de Question de méthode. Paris: Gallimard. -> An English translation of the prefatory essay, "Question de méthode," was published in 1963 by Knopf as Search for a Method.
TÖNNIES, FERDINAND (1887) 1957 Community and Society (Gemeinschaft und Gesellschaft). Translated and edited by Charles P. Loomis. East Lansing: Michigan State Univ. Press. -> First published in German. A paperback edition was published in 1963 by Harper.
TÖNNIES, FERDINAND 1921 Marx: Leben und Lehre. Jena (Germany): Lichtenstein.
VEBLEN, THORSTEIN (1899) 1953 The Theory of the Leisure Class: An Economic Study of Institutions. Rev. ed. New York: New American Library. -> A paperback edition was published in 1959.
VEBLEN, THORSTEIN 1921 The Engineers and the Price System. New York: Huebsch.
WEBER, MAX (1904-1905) 1930 The Protestant Ethic and the Spirit of Capitalism. Translated by Talcott Parsons, with a foreword by R. H. Tawney. London: Allen & Unwin; New York: Scribner. -> See especially pages 35-92. First published in German. The 1930 edition has been reprinted frequently.
WEBER, MAX (1904-1917) 1949 The Methodology of the Social Sciences. Glencoe, Ill.: Free Press. -> First published in German. See especially pages 50-113, "Objectivity in Social Science and Social Policy."
WEBER, MAX (1907) 1922 R. Stammlers "Überwindung" der materialistischen Geschichtsauffassung. Pages 291-359 in Max Weber, Gesammelte Aufsätze zur Wissenschaftslehre. Tübingen (Germany): Mohr.
WIATR, JERZY J. 1964 Political Sociology in Eastern Europe: A Trend Report and Bibliography. Current Sociology 13, no. 2.
MASARYK, THOMAS G.

Thomas G. Masaryk (1850-1937), Czechoslovakian statesman and social theorist, was born in Hodonin on the Moravian-Slovakian border. His father, a Slovak, was a coachman on one of the imperial estates; his mother came from a small Moravian town. He studied first at the Gymnasium in Brno, but after a conflict with the Roman Catholic church he left there and attended the Gymnasium in Vienna. Later, at the University of Vienna he wrote his dissertation, "Das Wesen der Seele bei Plato" (1876), under the philosopher Franz Brentano. In 1877 he studied with Gustav Theodor Fechner at the University of Leipzig, but Masaryk
was influenced less by Fechner than by Charlotte Garrigue, a music student whom he met at Leipzig and later married. She came from a well-to-do American Unitarian family, whose religious faith had been shaped by Theodore Parker, and her religious views helped Masaryk define his own. Through her he also gained an understanding of English philosophy and American society: Locke, Hume, Mill, and Spencer influenced his thought, and American institutions guided his social and political aspirations. Masaryk acknowledged the extent of his debt to his wife by taking Garrigue as his middle name. In 1879 Masaryk became Privatdozent of philosophy at the University of Vienna. His interests centered on sociology and on Czech political life. His first important book was Der Selbstmord als sociale Massenerscheinung der modernen Civilisation (1881). In this work he attempted to deal with suicide as a social phenomenon and to support his conclusions statistically. According to Masaryk, Europe was then in a period of high suicide rates. This could be directly attributed to a decline of monotheistic religion and thus was the "fruit of progress, of education, of civilization" (p. 146). When the Czech university was established at Prague in 1882, Masaryk was called there as extraordinary professor of philosophy. He held his post for 32 years. During those years he helped to form the political and moral ideas of a significant part of the Czech and South Slav intelligentsia, although before 1914 he was not popular among the Czechs. His opposition to the Catholic church, on the one hand, and to Marxism, on the other, his pronounced Westernism, and his "realistic" moderation in political and national demands prevented him from exercising a wider influence. While a professor in Prague, Masaryk published his Versuch einer konkreten Logik: Klarifikation und Organisation der Wissenschaften (1885), his last scholarly work.
From then on, most of his work was more directly concerned with moral and political education. He founded and edited several journals, among them Athenaeum and Naše doba ("Our Epoch"), and a political weekly, Čas ("Time"). He was active politically as a member of the Austrian Reichsrat, representing first the Young Czech party, from 1891 to 1893, and later, from 1907 to 1914, the tiny Progressive party (more generally known as the Realist party), which he had founded. Masaryk's political activities reflected the themes of his writings published during these years, in which he discussed Czech nationalism and deplored the deficiencies of the Marxist approach to basic social problems (1895; 1896; 1898). He knew
Russia from both studies and travels and wrote two volumes interpreting Russian culture, history, and religion (1913). He planned but never completed a third volume, on Dostoevski, whom he regarded as the key author for an understanding of Russia and whose philosophy and outlook he totally rejected. Masaryk's own nationalism revived and extended the ideas underlying the work of the first modern Czech historian, František Palacký. Masaryk saw the Czech national awakening in the nineteenth century as a continuation of the Hussite reformation of the fifteenth. And the movement of the Bohemian Brethren, in his view, encompassed the highest aspirations of the Czech nation and of the whole of mankind. This vague, moral nationalism was sharply criticized by professional Czech historians like Jaroslav Goll and Josef Pekař. In 1914 Masaryk left Prague, and during World War I he became the leading propagandist urging the establishment of independent small nations (1918) and the identification of Czech national traditions with those of Western democracy. He made London the center of his activities and in 1917-1918 visited Russia and the United States. It was his plan to create a Czechoslovakia in which Czechs and Slovaks would be united on the strength of ethnic principles and Germans and Magyars would be included on the basis of historical principles; his views prevailed at the Versailles peace conference. He was elected the first president of Czechoslovakia in 1918 and was continuously reelected until he resigned in 1935 for reasons of ill health. Until 1948 he was revered as the "father of his country," but because of his strong anti-Bolshevik stand the post-1948 communist regime has been sharply critical of him.

HANS KOHN

WORKS BY MASARYK
1876 Das Wesen der Seele bei Plato. Ph.D. dissertation, Univ. of Vienna.
1881 Der Selbstmord als sociale Massenerscheinung der modernen Civilisation. Vienna: Konegen.
(1885) 1887 Versuch einer konkreten Logik: Klarifikation und Organisation der Wissenschaften. Vienna: Konegen. -> First published as Základové konkrétní logiky.
(1895) 1935 Česká otázka: Snahy a tužby národního obrození (The Czech Question: Efforts and Aspirations Towards the National Rebirth). 4th ed. Prague: Čin.
(1896) 1920 Karel Havlíček: Snahy a tužby politického probuzení (Karel Havlíček: Efforts and Aspirations Towards the Political Awakening). 3d ed. Prague: Laichter.
(1898) 1935 Otázka sociální: Základy marxismu sociologické a filosofické (The Social Question: The Sociological and Philosophical Foundations of Marxism). 3d ed. Prague: Čin.
(1913) 1955 The Spirit of Russia: Studies in History, Literature and Philosophy. Rev. & enl. ed., 2 vols. New York: Macmillan. -> First published in German.
1918 The New Europe: The Slav Standpoint. London: Eyre & Spottiswoode.
(1925) 1927 The Making of a State: Memories and Observations, 1914-1918. New York: Stokes; London: Allen & Unwin. -> First published as Světová revoluce za války a ve válce, 1914-1918.
(1931-1933) 1944 Masaryk on Thought and Life: Conversations With Karel Čapek. New York: Macmillan. -> First published as Hovory s T. G. Masarykem.

SUPPLEMENTARY BIBLIOGRAPHY

Festschrift Th. G. Masaryk zum 80. Geburtstage. 2 vols. 1930. Bonn: Cohen.
LUDWIG, EMIL (1935) 1936 Defender of Democracy: Masaryk of Czechoslovakia. New York: Robert McBride. -> First published as Gespräche mit Masaryk: Denker und Staatsmann.
NEJEDLÝ, ZDENĚK (1930-1935) 1949-1950 T. G. Masaryk. 2d ed., 2 vols. Prague: Orbis.
SETON-WATSON, ROBERT W. 1943 Masaryk in England. New York: Macmillan.
MASS BEHAVIOR See COLLECTIVE BEHAVIOR and MASS PHENOMENA.
MASS COMMUNICATION See COMMUNICATION, MASS.
MASS CULTURE See COMMUNICATION, MASS; MASS SOCIETY.
MASS MEDIA See COMMUNICATION, MASS.

MASS PHENOMENA

The term "mass phenomena" as it is used in this article is intended to cover the same range of behavior as that denoted by two frequently used similar expressions: "collective behavior" and "mass behavior." Under this general rubric a number of more specific terms have commonly been employed to refer to the five major subtypes of mass phenomena: (1) apathy, (2) panic, (3) mob, (4) craze, and (5) social movement. These major subtypes are in turn divisible into still finer subclassifications, denoted by a miscellany of terms which have come to be used conventionally to describe local varieties of unique forms. Thus some mob situations are commonly labeled "riots," as in "race riots," and certain others are referred to as "lynchings"; some social movements are called "revitalization movements," and these are still further classified by local terms, such as "cargo cults" (Oceania), "nativistic movements" (American Indians), etc. The nomenclature for mass phenomena is so vast and so intricately related to varying criteria that there is no reason to review or attempt to rationalize it in detail here. It should be noted that there are also other terms, like "mass hysteria," which crosscut this natural historian's nomenclature. These terms draw attention to certain psychological or social attributes which several types of mass phenomena have in common and which they sometimes share with behavior that is not included under the category of mass phenomena.

Definition. Although there seems to be an intuitive recognition by most observers that all of these forms of human behavior have something in common which justifies treating them as a unit, the efforts to define this commonality are not always in agreement. Smelser has provided a definition in his book, Theory of Collective Behavior: "we define collective behavior as mobilization on the basis of a belief which redefines social action" ([1962] 1963, p. 8). Brown (1954) in his discussion of the varieties of mass phenomena suggests a set of dimensions for their classification (size, frequency and regularity of congregation, frequency and regularity of polarization of attention, and continuity of identification of individuals with the group), but he provides no definition. Any definition must not only state common features of the various referents of the term but must also state common features which nonmembers of the class do not possess. Certainly the term "mass phenomenon" cannot be taken to refer to all attributes of a large group of people (however "large" may be defined); otherwise, we should have to include culture, acculturation, culture change, population growth, voting behavior, rumor circulation, and many other social and cultural attributes and processes under the rubric of mass phenomena.
Neither can we comfortably include all situations in which a large group of people is collected in one place, or in which many people are simultaneously the target of communication: not all audience or crowd behavior will fall within the intuitive boundaries of the concept. Nor can we be satisfied with a definition that emphasizes a single psychological or social process—such as fear or mobilization—without regard to the size (or "mass") of the group involved. If we keep these considerations in mind, it would appear that Smelser's definition is nearly adequate to our needs. But its emphasis on mobilization makes it difficult to include the disaster syndrome (including shock, apathy, disorientation, and the very opposite of mobilization of a
group for action). Thus I suggest the following, somewhat less analytical, definition: "Mass phenomenon" signifies that class of social event in which a large number of people at the same time behave in a way which constitutes a notable interruption of their routine, socially sanctioned role behavior. The varieties of mass phenomena. Let us now consider how the characteristics recognized by the above definition are manifested in the subclasses of mass phenomena to which I have referred. Apathy and the disaster syndrome. The disaster syndrome occurs after a major catastrophe, usually physical, has destroyed important features of a group's natural and/or cultural environment, frequently with severe casualties. In this case the interruption of routine behavior implies the virtual cessation of any kind of adaptive behavior. Initially, the surviving population appears to be in a state of shock: people are passive, emotionally numb, relatively insensitive to physical pain, apathetic, disoriented, unable to understand the magnitude of the disaster, and unresponsive beyond minimal survival action; little mutual aid is undertaken, and remedial action is often trivial. After minutes or hours—and perhaps longer—as aid enters from outside the disaster area, the survivors become less apathetic and enter into a suggestible stage in which, under leadership, they can begin to engage in rescue, repair, and other useful activities. Eventually the syndrome moves into a euphoric stage of mutual aid in reconstruction, and at last it tapers off into the culturally standardized routine. [See also DISASTERS. For discussion of the disaster syndrome see Wallace 1956a.] Panic. Panic occurs when a group is subjected to an overwhelming and imminent threat to which escape appears to be the only effective response, and when escape routes are perceived to be inadequate to accommodate all of the group before the impact of the threatened event. 
In such a situation, the group structure disintegrates into an "every-man-for-himself" state of anarchy: individual escape tactics are apt to be chosen impulsively, with little foresight and with restricted attention to the real environment; jamming is likely to occur at the exits from the situation, with attendant injury, loss of life, and increased slowing of the escape flow. The interruption here is twofold: first, the abandonment of group structure by individuals (even though the group itself may have had a plan for handling the problem by reducing the threat or by orderly escape); and second, the severe constriction of perceptual and cognitive functions of individuals under the stress of fear.
Mob. The mob is an angry group which attacks and attempts to injure or destroy an object (usually a person or persons or some item of material culture identified with some human being or group). It differs from a military or police force insofar as the members of the mob are not performing socially sanctioned roles and insofar as the attack is not undertaken as an implementation of a rational policy concerted by the mob's members (although, to be sure, there may be a leader who, unknown to the rank and file, is exciting and directing the mob, carrying out a policy of his own or of some other group). The interruption of routine behavior here is the abandonment of the socially sanctioned roles of peaceable, law-abiding private citizens and the assumption of primitive judgmental and punitive roles which are carried out with minimal concern for justice (as locally defined) or for long-term consequences. Craze. The craze is a short-lived rush, by many persons, to worship, to touch, or to acquire some object (human or material) or characteristic of value. In its milder forms it may be referred to as a "fad"—a clothing style, a type of haircut, a dance; in more extreme expression it may be termed a "craze" (proper)—such as the adulation of a popular singer by thousands of screaming fans, a kind of financial investment, or a rush to settle new lands or to exploit unclaimed mineral resources. The interruption here is the abandonment of previous objects of interest and the substitution for them, by many people at the same time, of a standardized object. [See FASHION.] Social movement. The social movement (or revitalization movement), whether religious or political and whether revolutionary or reformative, is by definition an organized effort to induce the members of a community to abandon certain customs or practices and to adopt different ones. 
The participants in the movement, starting with the prophet or leader, then his disciples, and eventually at least some followers, do in fact change their ways and the distribution of their energies. In every social movement, therefore, there is an interruption of a routine and the substitution of a new pattern of behavior, rationalized by reference to an ideology. The aim of the movement, of course, may be to accomplish a much more extensive interruption and to institute a much more pervasive new system than ever is accomplished. [See SOCIAL MOVEMENTS.] Phenomena not considered. It may be pointed out that in the above discussion we have left out certain phenomena which are included in some treatments. Thus, for instance, we have not treated
crowds per se as examples of mass phenomena, because many crowds are engaged in perfectly routine, standardized activities. Thus, for instance, we do not treat as a mass phenomenon the audience at sports events, at theaters, and at religious ceremonials because it is an organized group interacting with another group (the performers) in a patterned, culturally institutionalized way, even in cases (as at political rallies, voodoo rituals, or Holy Roller types of religious revival) where the behavior is excited or hysterical in a technical sense. Similarly, crowds on arteries of transportation, in markets, or in military units, however poorly or well organized, are not treated as mass phenomena, because the behavior involved is perfectly explicable and predictable from a knowledge of the culture. Nor do we consider the gross characteristics and social activities of vast aggregates —like "the masses," "the consumer," "the proletariat," "the Negro," or "the Southern white"— as mass phenomena in themselves. The reader may, however, note that while the behaviors which have been treated as mass phenomena do have in common the feature of interruption of routine, they range from the nonpurposeful, inactive, maladaptive extreme of the disaster syndrome, through grades of increasingly purposeful, active, adaptive behavior, to the social movement, which is eminently purposeful, active, and adaptive. In the next section we shall take up the question of explanation, not only for the mass phenomenon in general, but for the occurrence of its varieties. Explanations of mass phenomena. Explanations —and predictions—of mass phenomena usually invoke a mixture of psychological and sociological principles. 
Sometimes efforts are made to provide purely sociological explanations; these efforts, however, are generally justified by pointing out the deficiencies of early psychological theories that postulated "herd instincts," "the group mind," and the atavistic vulnerabilities of civilized men. Although such appeals to supposed universal psychological tendencies are fruitless as guides for research, the "pure" sociological approach merely reintroduces psychology through the back door via definitions of "social" concepts in terms of sentiments, goals, values, needs, and so forth. It seems wisest to make use explicitly of both psychological and sociological variables, evaluating the utility of each by more or less operational criteria. On the most generic level, the following conditions seem to be required for the occurrence of any mass phenomenon: (1) a certain type of information must be presented to the members of
the target group, approximately simultaneously; (2) the type of information which is presented must describe a difference between the individual's present situation and that which either has obtained in the past or very probably will obtain in the future; (3) the difference must be sufficient to constitute a dramatic gain, or loss, of important values (such as life, health, or self-respect); (4) the present or future loss must be perceived as avoidable, or the future gain as achievable, if something is done. The resultant action is, in fact, the behavior described as the mass phenomenon. Further conditions need to be specified before it is possible to predict what that action will be. Important classes of such conditions are, first, the precise nature of the threat, disaster, or future gain; and, second, the existing cultural system of the target group: its goals, its fears, its ontological beliefs, its social organization, and its modal personality structure. Apathy and the disaster syndrome. In a major disaster or in a situation of threat from which no escape route can at the moment be perceived, the mass action is, in effect, no action, or apathy: the only relief from the awareness of an unchangeable contrast between good past and bad present, or good present and bad future, is denial or withdrawal from awareness of reality as thus defined. The disaster syndrome, the cultural situation of universal demoralized individual behavior resulting from anomie, and the fictional social condition of an impending world's end described in Nevil Shute's popular novel On the Beach are examples of the apathetic response. It is worth noting that panic does not occur under these conditions. Panic. Where the catastrophe has not yet occurred and there is still a possibility (but a narrow and diminishing one) of escape, panic occurs instead of apathy.
The action involved is frequently precipitate physical flight, but other activities may represent the appropriate mode of escape: the selling of property, as in a financial crash or in a neighborhood threatened with invasion by an unwanted social group; or the hoarding of food in anticipation of shortages. Mob. Where a catastrophe may occur but is not imminent, and where its likelihood is believed to be increased by the actions or inaction of some other person or group who is not believed to be responsive to the fears of the threatened group, a likely response is mob action. Mob action is also likely where an important goal is believed to be achievable but blocked by the action or inaction of some nonresponsive social group. The critical factor here seems to be the group's belief that the
disparity between present and future (whichever way the balance lies: good present and bad future, or bad present and good future) can be resolved only if some weak and evil persons are injured or destroyed. In the mob situation, furthermore, it is possible for the group to find relief for their feelings of guilt in scapegoating, that is, attributing to others those faults, often irrelevant to the precipitating issue, whose recognition in themselves would cause the members of the mob to feel further discomfort. [See PREJUDICE.] Craze. In the craze, the members of the group seem to be driven by both a hope for some desirable thing and a fear of being left behind while others enjoy themselves. In a sense the craze is a "positive" panic: the urgency of the situation lies not in the imminence of danger, escape from which becomes less likely with each passing moment, but in the availability of a benefit, access to which may be reduced in the near future. Social movement. The social (or revitalization) movement is the most positive, most organized, and most deliberate of the mass phenomena. In polar contrast to apathy, the participants in such a movement must maintain an effective social organization over considerable periods of time. The social movement defines the present as a transfer point between an undesirable past and a glorious future. It mounts a carefully calculated campaign, by a mixture of religious and political procedures, to transform society from an evil to a good condition. In the social movement, the character of the existing culture is closely relevant to what happens. Prevailing beliefs about the mechanisms of change are apt to determine the form of the movement, but not to precipitate it.
Thus, among Jews the belief in a Messiah, and among Muslims the belief in the Mahdi, have heavily colored the movements that have occurred among these peoples; the Melanesian "myth dream" of their ancestors returning with cargo and the Christian concept of the millennium have shaped many of the movements in their respective parts of the world. [See MILLENARISM; NATIVISM AND REVIVALISM.]
It should finally be pointed out that mass phenomena of different types can follow one another in sequence in a given group. Thus, an apathetic phase following the awareness of disaster may be succeeded by a revitalization movement; a rioting mob may be swept by panic; an enthusiastic meeting of participants in a craze may turn into a riot if the object of the craze is withheld; and so on. It follows from the definition and from the general statement of conditions that a mass phenomenon, once established, can readily be transmuted in form as the nature of the information given to the group is varied.
ANTHONY F. C. WALLACE
[Directly related is the entry COLLECTIVE BEHAVIOR. Other relevant material may be found in GROUPS; INTERACTION; SOCIAL PSYCHOLOGY.]
BIBLIOGRAPHY
The bibliography on mass phenomena is extensive and diffuse. Smelser 1962 contains the most useful general bibliography on the topics considered in this article, except for the subject of apathy, which is not treated. The National Academy of Sciences-National Research Council has published a series of monographs on disaster behavior, including the apathetic reaction, and maintains a large card catalogue of works on disaster. See also the bibliographies of COLLECTIVE BEHAVIOR and DISASTERS.
BLUMER, HERBERT 1957 Collective Behavior. Pages 127-158 in Joseph B. Gittler (editor), Review of Sociology: Analysis of a Decade. New York: Wiley.
BROWN, ROGER W. 1954 Mass Phenomena. Volume 2, pages 833-876 in Gardner Lindzey (editor), Handbook of Social Psychology. Cambridge, Mass.: Addison-Wesley.
FESTINGER, LEON; RIECKEN, HENRY W.; and SCHACHTER, STANLEY 1956 When Prophecy Fails. Minneapolis: Univ. of Minnesota Press.
QUARANTELLI, ENRICO 1954 The Nature and Conditions of Panic. American Journal of Sociology 60:267-275.
SMELSER, NEIL J. (1962) 1963 Theory of Collective Behavior. London: Routledge; New York: Free Press.
WALLACE, ANTHONY F. C. 1956a Tornado in Worcester. Washington: National Research Council.
WALLACE, ANTHONY F. C. 1956b Revitalization Movements. American Anthropologist New Series 58:264-281.
MASS SOCIETY
"Mass society" is best understood as a term denoting a model of certain kinds of relationships that may come to dominate a society or part of a society. Terms like "mass production" and "mass communication" refer to activities that are intended to affect very large numbers of people who are seen, for these purposes, as more or less undifferentiated units of an aggregate or "mass." Similarly, a "mass society" is one in which many or most of the major institutions are organized to deal with people in the aggregate and in which similarities between the attitudes and behavior of individuals tend to be viewed as more important than differences. Societies or institutions organized in this way are said to have a "mass character," and the life of individuals in such societies is said to be governed primarily by "mass relations."
The structure of mass society
Large populations do not by themselves produce mass relations, although mass relations are less
likely among small populations. In the past, large societies were divided into many segments with relatively clear boundaries separating each segment from the other. Even though a society contained thousands of villages, all of them much alike, it was not a mass society because human relations centered on the village and supported the integrity of the village as a social unit. Unlike the village-based society, the mass society does not help to sustain spontaneously evolving and durable social units. "Mass" in its simplest sense means an aggregate of people without distinction of groups or individuals. In mass production, for example, workers are organized according to the logic of specialization and control rather than as members of social groups or as distinct persons, and production is geared to a market of similarly undifferentiated people. Mass production, of course, involves a highly structured mass, by virtue of the division of labor and administrative organization, and it is therefore to be distinguished from the unstructured mass represented, for example, by the aggregate of unemployed workers. Moreover, some industries have more of a mass character than others: the assembly-line system of automobile factories is much more conducive to the emergence of the mass than is the craft-based system of printing (Blauner 1964). Nevertheless, the mass character of the market is a decisive factor in the organization of most manufacturing industries. It is not so much the large size of the population as it is the large scale of activities that favors mass relations. Where the scale of activity is very great, it is more likely that the social relations which individuals bring with them or develop will be easily ignored or transformed by the dictates of technical efficiency or effective control. Thus, mass relations are likely to emerge where large-scale activities predominate, as in nationwide organizations, markets, audiences, and electorates. The decline of community. 
Large-scale activities favor the emergence of the mass because they tend to develop at the expense of communal relations. The local community comes to provide for fewer of its members' needs and therefore cannot maintain their allegiance. The rural community no longer is isolated and self-sufficient. As it becomes dependent on the city, and particularly on national markets and organizations, the rural community loses its significance and cohesion. The city does not develop the communal life that was formerly provided by the rural community. The individual who migrates to the city does not enter the community as a whole, nor is he likely to enter a subcommunity of the city. The urban subcommunity loses its coherence as a result of the increasing
scale and specialization of common activities. Instead of affiliation with a community, the urban resident frequently experiences considerable social isolation and personal anonymity. Ethnic and religious groups also tend to lose their coherence as their members are drawn into large-scale organizations and arenas. Individuals derive less of their social identity, style of life, and social values from their ethnic and religious background. As ethnic cultures come in contact with mass culture, they cease to preserve their unique qualities. Religious groups tend to de-emphasize their theological and liturgical differences. The particular religious affiliation loses its significance for both religious and secular beliefs and conduct. Even if people continue to associate primarily with coreligionists, this has little influence on the quality of their lives or on the manner of their participation in the larger society. Like local, ethnic, and religious communities, class-based communities tend to lose their importance and coherence where the whole population is incorporated into large-scale activities. Social classes weaken as sources of distinctive values, styles of life, and social identity; and they increasingly resemble one another in the beliefs, values, and interests of their members. Class distinctions are leveled, and class boundaries are blurred. Class consciousness and class solidarity dissolve into mass consciousness and mass solidarity. The lower classes are increasingly brought into arenas of communication, politics, and consumption previously limited to the higher classes. Class differences in opportunities and modes of participation that remain are no longer believed to be desirable or permanent. Common symbols of the good life and rights and obligations replace class-differentiated concepts. Classes remain as categories of people who differentially share in common ways of life rather than as self-conscious groups with distinctive ways of life.
Status strivings and anxieties abound, but this testifies to the ambiguity of status where fixed social hierarchies no longer exist. The ascendance of organization. Mass organizations replace communal groups as the characteristic units of society. Mass organizations are large and formal, but some large and formal organizations exhibit more of a mass character than do others. The additional features that constitute a mass character include a membership that is structured primarily by administrative devices rather than through social relations, and, correlatively, activity that is mobilized from the center rather than generated through various groups within the organization (Selznick 1952). Mass organizations do not build on the primary relations of members,
nor do they support and facilitate primary relations among members. The result is a relatively unmediated and depersonalized relationship between the membership and the organization. Where the organization seeks a highly active membership, as in certain kinds of mass parties, intense identifications with the organization may be created. Most mass organizations do not seek a mobilized membership, however, and do not possess the symbols or other resources for mobilization. Instead, they are content with passive support from their members, who in turn acquire little social identity from the organization. Solidarity tends to be weak under these conditions, and symbolic or personal gratifications correspondingly slight. Unlike membership in communities, membership in mass organizations tends to be a fragile bond because relations are impersonal and leveled. This weakness is indicated by high rates of mobility of members, as they respond to opportunities for greater benefits and to new interests elsewhere. As mass organizations replace communities, so do "mass arenas" displace local arenas. Mass arenas, including national markets and electorates, are spheres of activity common to all sections of the population. Like mass organizations, mass arenas are managed from the center rather than structured through social relations. They are managed primarily through the mass media of communication, since only in this way can an entire population be presented simultaneously with the same objects of attention. People participate in mass arenas by selecting from among the alternatives presented through the mass media. Since the alternatives are standardized in order to reach the entire population simultaneously and since they are directed to individuals as undifferentiated members of the society, participation transcends the individual's social relations (Blumer 1939). Mass equalitarianism. Pervading all kinds of mass relations is a common normative orientation of equalitarianism. 
All members of mass society are equally valued as voters, buyers, and spectators. Numerical superiority therefore tends to be the decisive criterion of success. In the political realm this means the number of votes; in the economic realm it is the number of sales; and in the cultural realm it is the size of the audience. Mass equalitarianism is strengthened by the attenuation of the social bases of inequality, notably membership in ethnic and religious groups and especially in social classes. In contrast to the equalitarianism of small numbers, as in friendships, mass equalitarianism emphasizes the similarities of individuals rather than the uniqueness of persons.
Mass equalitarianism is also linked to the bureaucratization of organization. Mass organization simultaneously encourages the bureaucratic centralization of governing powers and the leveling of social differences among the governed (Weber 1906-1924). The incorporation of all sections of the population into large-scale activities summons centralized organization for coordination and control. Mass bureaucracies favor the leveling of social differences in the interest of efficiency. By treating everyone alike, according to functionally rational rules and procedures, mass bureaucracies foster equalitarianism. However, bureaucratic recruitment on the basis of professional competence raises new hierarchies. To be sure, careers open to talent are in greater harmony with equalitarian beliefs than is selection according to family and property. But professional elites are nevertheless elites and thereby introduce new social distinctions. This is a source of strain in modern society; in the political realm, for example, there is a tension between planning by experts and participation by mass electorates [see ELITES]. Mass equalitarianism is expressed in the populist character of mass society. Whatever is believed to express the popular will or to meet the most widely shared expectations is considered legitimate (Shils 1956). Political regimes strive to be popular regimes, whether they are dictatorial or constitutional. While this popular legitimation of authority centers in the polity, it pervades all kinds of social institutions. Populism places a premium on the capacity of leaders to create and placate popular opinion. Those who are effective in mobilizing large numbers of people have great power, and this generally means the leaders of mass parties. The mass leader seeks to embody and reflect popular desires; masses, not elites, are the ultimate sources of legitimation in mass society.
This leads elites to make themselves readily accessible to popular pressures: that is, they are forced to be responsive not only to periodic expressions of public opinion through regular channels such as elections, but also to momentary and ad hoc representations of whatever is claimed to be popular. Leaders, of course, do not seek merely to respond to mass opinion. They also try to control it. Since they lack firm bases of independent authority, their control tends to take the form of manipulation and mobilization rather than command. The very presence of large numbers of only loosely organized and committed people summons efforts of leaders to manipulate and mobilize them. For if elites are highly accessible to mass pressures, so are masses readily available for mobilization by elites. People are receptive to direct appeals from remote elites, because they are poorly attached to proximate symbols and relationships and increasingly caught up in distant events and activities (Kornhauser 1959).
Mass movements. As mass society develops, there is a growing cleavage between those who continue to be integrated in local groups and those who have already been incorporated into mass relations. In part this is a difference between the old classes and the new classes—craftsmen versus industrial workers, independent entrepreneurs versus industrial managers, free professionals versus members of professional staffs, and so forth. Increasingly isolated from the larger society, members of the declining classes readily come to believe that they are the victims of it. More generally, the locally attached, in their resentment of the ascendancy of big cities, big government, big business, and big labor, become receptive to the appeals of mass movements directed against the forces of mass society. Then there is the growing number of people who have been detached from communal relations but who are not, or not yet, incorporated into mass relations. It is likely to include, among others, new migrants to the cities, new workers in the factories, and, generally, the younger and newly mobile members of the society. In the absence of strong group ties, they are less constrained and more restless than those who continue to be rooted in communal groups or those who have been fully incorporated into mass relations. These poorly attached and unintegrated people are readily available for activistic modes of intervention in political life and for participation in mass movements that promise them full membership in the national society. Thus, modern mass movements are characteristically composed of people who either seek entry into mass society or seek to reverse the processes of mass society. Like mass organizations, mass movements do not build on existing social relations but instead construct direct ties between participants and leaders.
When a mass society has successfully incorporated most sections of the population into its central institutions, mass movements may become less widespread. In a highly developed mass society, mass participation is institutionalized in the form of mass organizations, especially mass parties, but also mass unions and similar associations, universal suffrage, extensive publicity of political men and events, and the official symbolism of popular government [see SOCIAL MOVEMENTS].
Criticism of mass society
Early critics. The concept of mass society had its major intellectual origin in the nineteenth-century criticism of the revolutionary changes in European (and especially French) society. Many thinkers believed that the decisive social tendency was the change from aristocratic to democratic society. It was not simply that a shift occurred in the class composition of governing groups. More fundamental was the shift that these thinkers perceived in the bases of social order. Formerly, standards of value and conduct had been assumed to exist as part of a natural order of society; in democratic society, by contrast, the arbitrary will and opinion of the masses were replacing established standards. Early representatives of this kind of social criticism of the democratization of society were Catholic thinkers like Joseph de Maistre and the vicomte de Bonald. Following the ascendancy of portions of the middle classes, marked by such events as the accession in 1830 of Louis Philippe, the bourgeois king, in France and the passage of the 1832 Reform Act in England, liberal thinkers adopted mass society ideas, not to defend the old order but to assess the strengths and weaknesses of the new order. Thus, Tocqueville (1835) moved from a fairly hopeful analysis of the possibilities of preserving standards in a democratic society (in light of his examination of America) to a more pessimistic view of the matter following the 1848 revolution in France. Even so influential a liberal thinker as J. S. Mill found himself in wide agreement with Tocqueville's more pessimistic diagnosis of democratic culture. Burckhardt and Nietzsche, among many other late nineteenth-century romantic thinkers, sought to interpret changes in European society as the erosion of culture. Ortega (1930) later formulated a highly popular version of this view. This aristocratic criticism of the development of nineteenth-century society profoundly influenced democratic criticism of the development of twentieth-century society.
Where the first centered on the intellectual defense of elite values against the rise of mass participation, the second developed as a defense of democratic values against the rise of totalitarianism. The defensive posture of the aristocratic thinkers was adopted by democratic thinkers who, having won the nineteenth-century war of ideas and institutions, now sought to preserve their gains against the totalitarian challenge. Thus, such students of totalitarianism as Lederer (1940), Mannheim (1935), Fromm (1941), Neumann (1942), Arendt (1951), and Kornhauser (1959) see in the fragmentation of society the opportunity for new forms of domination based on the mobilization of large populations.
Two kinds of analysis closer to the social sciences have contributed significantly to the development of the idea of mass society during the past century. One is the effort to distinguish between traditional and modern societies, a line of analysis that has become a central theoretical perspective of sociology. An early formulation of this perspective was Maine's distinction between societies dominated by status relations of kinship and those dominated by contract relations of individuals. Tönnies (1887), in his highly influential analysis of Gemeinschaft and Gesellschaft, elaborated Maine's thesis. Further evolution of this line of analysis is to be found in Durkheim's theory of social solidarity and anomie (1893; 1897) and in Max Weber's treatment of traditional and bureaucratic authority (Weber 1906-1924). What made this kind of sociological theory relevant to the idea of mass society was its analysis of the atomization and depersonalization of social organization resulting from modernization. This became a central thesis of urban sociology, as in the writings of Simmel (1902-1903), Park (1916-1939), and Wirth (1933-1953). [See COMMUNITY-SOCIETY CONTINUA.] The development of mass psychology provided still another source of ideas about mass society (Reiwald 1949). Gustave Le Bon, Scipio Sighele, and Gabriel Tarde were leading students of mass behavior at the turn of the century. In their analysis of the heightened suggestibility and manipulability of people no longer constrained by communal ties and traditional authorities, these theorists contributed to the social psychology of mass society. This line of analysis was given a more sociological and less polemical cast by American students of what came to be called "collective behavior" (Blumer 1939) [see COLLECTIVE BEHAVIOR]. Many of these themes from sociology and social psychology were drawn together in Mannheim's critical analysis (1935) of the effects of the "fundamental democratization" and "growing interdependence" of society.
A common perspective unites these theories and makes them part of the history of the idea of mass society. It is a view of modern society as containing certain fundamental pathological tendencies, which are believed to inhere in its development. The theory of mass society adds to such concepts as "democratic society," "urban society," and "industrial society" an emphasis on the socially disintegrative effects of democratization, urbanization, and industrialization. Foremost among these effects are the decline of community and authority and the spread of pseudo-community and pseudoauthority. Pseudo-community. During the 1920s and 1930s, a number of American sociologists reported on
various aspects of modern life and generally stressed the anonymity and atomization of persons in contemporary society. Following World War II, this portrait of modern life was subject to considerable criticism on the grounds that primary relations are much in evidence in the factory, the army, and other allegedly impersonal organizations. Kinship networks, neighborhood bonds, and local activity were observed in a number of urban and suburban settings, and primary-group mediation of mass communications was shown to prevail over completely atomized audiences. Such observations suggest that the decline of community is at most relative to the condition of premodern society (Greer 1958). There is much more to the problem of community than the question of the mere presence or absence of personal attachments and communal bonds. Students of mass society assert that the functions of primary groups are weakened under conditions of modern society, not that primary groups are absent. The decreasing role of primary relations in the social organization of mass society and their increasing isolation from the larger society weaken them as sources of meaning and support for the individual in the larger society. Moreover, they are more easily broken because they receive less support from the institutional framework of society. Both weaknesses stem from the attenuation of the links between primary relations and the major functional areas of society. The isolation of primary relations creates the need for more inclusive bonds of solidarity and gives rise to a search for new forms of community. The barriers to community thrown up by the mass character of society heighten receptivity to the appeals of pseudo-community. This hypothesis has been applied to otherwise widely diverse social contexts. The German middle-class youth movement at the beginning of the century made the "return to Gemeinschaft" its cardinal article of faith.
The Nazi movement subsequently inscribed the "folk community" on its ideological banner and won many adherents on the strength of this appeal. The totalitarian mass movement is only the most dramatic and extreme case of pseudo-community. Much more mundane cases have been examined in the context of American life. For example, campaigns of mass persuasion exploit themes of community and personalization (Merton 1946), and programs of "human relations" in industry exploit unfulfilled needs for social bonds and participation in the interest of greater worker efficiency (Mayo 1933). These ideologies and programs simulate but do not create community, and consequently they
make people more available for manipulation and mobilization. They exploit a general dilemma facing the individual in mass society: either he demands highly personalized meaning from the mass enterprise and suffers frustration, or he withholds commitment to it and suffers loss of identity. "Social alienation," "false personalization," "enforced privatization," and similar notions found in the writings on mass society point, however unsteadily, to the pathology of community in modern society (Nisbet 1953; Riesman 1950). This concern with the quality of social relations—the fabrication of symbols and relations, the exploitation of unfulfilled needs for personal response, and related matters—marks the perspective of mass analysis. As a perspective, it invites attention to the various distorted forms and expressions of the search for community, to the social conditions that promote them, and to the consequences for individuals and institutions that flow from them. Pseudo-authority. The decline of authority accompanies the decline of community. For the loosening of the various cohesive groupings that make up a society is at the same time the dissolution of the authority of these groups over the individual. Traditional standards and customary authorities anchored in kinship, church, and community are replaced by bureaucratic systems of legal and political control. The rationalization of authority liberates the individual from the often harsh and always close constraints of the cohesive group; however, it also removes the direction and support supplied by such a group but not by the large and impersonal bureaucracy. Many students of such relatively democratic political societies as those of England and the United States have been quick to criticize this conception of the bureaucratic society on the grounds that it fails to see the pluralist character of these societies, especially the dispersion of power and authority among diverse and independent social groups.
Thus, interest-group activity and influence are found to be extensive on all levels of government; power and authority in many local communities are observed to be widely distributed among competing groups; and party loyalties are reported to possess considerable stability. Such findings appear to contradict notions of contemporary society as a condition of social and political atomization (Bell 1960). Mass theorists question, however, whether these observations in fact confirm a continuing vitality of social pluralism, or whether the pluralist group structure is not itself subject to the forces of mass society. Social pluralism undoubtedly receives a
certain impetus from the elaborate functional differentiation of the large-scale industrial society: the proliferation of specialized occupational groups is the principal case in point. But since these associations are specialized, tend to be nationwide, and often do not incorporate the work group or other social relations of the individual, they are transformed into mass organizations (Nisbet 1953; Selznick 1952). Moreover, the interests underlying the formation of diverse organizations tend to be creatures of a complex and specialized economy that splits the person from the role, so that they possess only a limited capacity to elicit broad or deep personal commitment. If the interests are not very substantial and distinct, the organized representation of interests that constitutes a pluralist system will be correspondingly insubstantial and amorphous. The affluent society and welfare state also reduce the urgency of these interests. New sources of disaffection appear in the mass society, but they are not readily articulated and mitigated by means of pluralist organization and bargaining. Whether or not mass theorists are correct in this particular argument, the general principle is clear: "the differentiation . . . which disintegrates is very different from that which brings vital forces together" (Durkheim [1893] 1960, p. 353). The functional differentiation of structures may increase efficiency, but it simultaneously may disintegrate what was a viable social entity without creating the conditions for the formation of new social entities. A great deal depends upon whether a given social function can sustain a social identity or whether the function is so specialized or otherwise limited that it cannot provide sufficient meaning to summon commitment. As organizations become very large, specialized, and removed from the network of social relations of their members, they lose their authoritative character.
The modern trade union, for example, appears less capable of providing meaning and identity to its members than occupational associations of former times, so that its power may seem more arbitrary and less authoritative. When similar processes occur in the church, profession, corporation, and other secondary groups, the society begins to lose its pluralism and experiences a general dissolution of authority. The decline of authoritative standards and leadership creates anxiety and insecurity; feelings of aimlessness and lack of social direction become widespread. Such a state of anomie generates the quest for new authority and heightens receptivity to pseudo-authority. As in the case of the search
for community, mass analysts try to identify the symptoms and consequences of inappropriate and inauthentic responses to genuine needs for authoritative standards and direction. The rise of charismatic leadership testifies to this need. But of greater significance is the quality of this leadership—whether it is the carrier of new values or merely the popularity of a demagogue or celebrity. Where mass media of communication and the techniques of manipulation and mobilization are highly developed, it hardly suffices to say that popular enthusiasm is sufficient to demonstrate a charismatic relationship. The conditions of mass society facilitate the fabrication of charisma in the absence of value commitment on the part of either leaders or masses. More generally, whenever the claim to authority is based substantially on the manipulation of symbols rather than on the invoking of standards, one may speak of pseudo-authority. What concerns mass analysts are situations in which there is a marked discrepancy between the symbols and the substance of authority. The claim that public opinion is authoritative under conditions of modern mass democracy is a case in point. Where public opinion becomes a slogan for whatever is believed to be popular, rather than a process and product of public deliberation and discussion, it is a form of pseudo-democracy. This is a powerful tendency in mass society because of the difficulties of making and eliciting personal responses in mass arenas and bureaucratic institutions. The ease of mass manipulation and the difficulty of public deliberation favor the symbols of democracy without the substance, especially where the symbols are widely stereotyped in terms that do not invite close scrutiny or comparison with actual experience (Selznick 1952). The most extreme manifestation of manipulated and mobilized opinion is found in totalitarian systems. 
The unanimous elections, the staged demonstrations, and the mass indoctrination programs reveal the possibilities of pseudo-democracy. Totalitarianism itself is greatly facilitated by the existence or creation of masses of people who are not attached to independent social groups. Indeed, the study of totalitarianism is instructive because it shows how the effort to mobilize a whole population actually requires the destruction of bonds of authority and community and their replacement by ideological organizations. However, the ultimate reliance of totalitarian regimes on the use of force testifies to the limits of this strategy of mobilization. Moreover, mass conditions do not by themselves produce totalitarianism. The existence of
modern technology plus the availability of large numbers of socially unintegrated people make totalitarianism possible, but a number of other conditions must be present to prepare the way for totalitarianism.
Theories of mass society are sometimes said to be prophecies of despair (Bell 1960; Shils 1962). But they need not be so construed. That the mass analyst tends to be a pathologist of contemporary society in no way denies the existence in that society of creative and value-sustaining social forces. Properly incorporated into social science, the concepts of mass society invite analysis of the conditions under which mass processes are strong or weak. Thus, mass analysis may take on new significance in alerting students of non-Western societies to certain pathologies of social development. Perhaps more important for social thought than any particular proposition of mass society is the concern this perspective represents for assessing the quality of culture and social institutions. If social science is to pursue this kind of inquiry, however, it will have to renew its communication with the humanities. For if the idea of mass society has greatly influenced social science, its formulation and development have been to a considerable extent the work of philosophy, history, and literature.
WILLIAM KORNHAUSER
[See also COMMUNICATION, MASS; DEMOCRACY; TOTALITARIANISM; and the biographies of BURCKHARDT; DURKHEIM; MAINE; MANNHEIM; MAYO; ORTEGA Y GASSET; TOCQUEVILLE; TONNIES; WEBER, MAX.]
BIBLIOGRAPHY
ARENDT, HANNAH (1951) 1958 The Origins of Totalitarianism. 2d ed., enl. New York: Meridian.
BELL, DANIEL (1960) 1962 The End of Ideology: On the Exhaustion of Political Ideas in the Fifties. 2d ed., rev. New York: Collier.
BLAUNER, ROBERT 1964 Alienation and Freedom: The Factory Worker and His Industry. Univ. of Chicago Press.
BLUMER, HERBERT (1939) 1951 Collective Behavior. Pages 167-222 in Alfred M. Lee (editor), New Outline of the Principles of Sociology. 2d ed., rev. New York: Barnes & Noble.
DURKHEIM, EMILE (1893) 1960 The Division of Labor in Society. Glencoe, Ill.: Free Press. → First published as De la division du travail social.
DURKHEIM, EMILE (1897) 1951 Suicide: A Study in Sociology. Glencoe, Ill.: Free Press. → First published in French.
FROMM, ERICH (1941) 1960 Escape From Freedom. New York: Holt.
GREER, SCOTT (1958) 1964 Individual Participation in Mass Society. Pages 329-342 in Roland Young (editor), Approaches to the Study of Politics: Twenty-two Contemporary Essays Exploring the Nature of Politics and Methods by Which It Can Be Studied. Evanston, Ill.: Northwestern Univ. Press.
KORNHAUSER, WILLIAM 1959 The Politics of Mass Society. Glencoe, Ill.: Free Press.
LE BON, GUSTAVE (1895) 1947 The Crowd. New York: Macmillan. → First published as Psychologie des foules.
LEDERER, EMIL 1940 State of the Masses: The Threat of the Classless Society. New York: Norton.
MANNHEIM, KARL (1935) 1940 Man and Society in an Age of Reconstruction: Studies in Modern Social Structure. New York: Harcourt. → First published as Mensch und Gesellschaft im Zeitalter des Umbaus.
MAYO, ELTON (1933) 1946 The Human Problems of an Industrial Civilization. 2d ed. Boston: Harvard Univ., Graduate School of Business Administration, Division of Research. → A paperback edition was published in 1960 by Viking.
MERTON, ROBERT K. 1946 Mass Persuasion: The Social Psychology of a War Bond Drive. New York: Harper.
NEUMANN, SIGMUND (1942) 1965 Permanent Revolution: Totalitarianism in the Age of International Civil War. 2d ed. New York: Praeger. → First published as Permanent Revolution: The Total State in a World at War.
NISBET, ROBERT A. 1953 The Quest for Community: A Study in the Ethics of Order and Freedom. New York: Oxford Univ. Press.
ORTEGA Y GASSET, JOSE (1930) 1961 The Revolt of the Masses. London: Allen & Unwin. → First published in Spanish.
PARK, ROBERT E. (1916-1939) 1952 Human Communities: The City and Human Ecology. Collected Papers, Vol. 2. Glencoe, Ill.: Free Press.
REIWALD, PAUL 1949 De l'esprit des masses. Neuchatel and Paris: Delachaux & Niestle.
RIESMAN, DAVID 1950 The Lonely Crowd: A Study of the Changing American Character. New Haven: Yale Univ. Press. → An abridged paperback edition was published in 1960.
SELZNICK, PHILIP (1952) 1960 The Organizational Weapon: A Study of Bolshevik Strategy and Tactics. Glencoe, Ill.: Free Press.
SHILS, EDWARD 1956 The Torment of Secrecy: The Background and Consequences of American Security Policies. Glencoe, Ill.: Free Press.
SHILS, EDWARD 1962 The Theory of Mass Society. Diogenes 39:45-66.
SIMMEL, GEORG (1902-1903) 1950 The Metropolis and Mental Life. Pages 409-424 in Georg Simmel, The Sociology of Georg Simmel. Edited and translated by Kurt H. Wolff. Glencoe, Ill.: Free Press. → First published in German.
TOCQUEVILLE, ALEXIS DE (1835) 1945 Democracy in America. 2 vols. New York: Knopf. → First published in French. A paperback edition was published in 1961 by Vintage and by Schocken.
TONNIES, FERDINAND (1887) 1957 Community and Society (Gemeinschaft und Gesellschaft). Translated and edited by Charles P. Loomis. East Lansing: Michigan State Univ. Press. → First published in German. A paperback edition was published in 1963 by Harper.
WEBER, MAX (1906-1924) 1946 From Max Weber: Essays in Sociology. Translated and edited by Hans H. Gerth and C. Wright Mills. New York: Oxford Univ. Press.
WIRTH, LOUIS (1933-1953) 1956 Community Life and Social Policy: Selected Papers. Univ. of Chicago Press.
MATHEMATICAL STATISTICS See STATISTICS.
MATHEMATICS
The history of mathematics, and to some extent its content, can be thought of as involving three major phases. Ancient mathematics, covering the period from the earliest written records through the first few centuries A.D., culminated in Euclidean geometry, the elementary theory of numbers, and ordinary algebra. Equally important, this phase saw the evolution and partial clarification of axiomatic systems and deductive proofs. The next major phase, classical mathematics, began more than 1,000 years later, with the Cartesian fusion of geometry and algebra and the use of limiting processes in the calculus. From these evolved, during the eighteenth and nineteenth centuries, the several aspects of classical analysis. Other contributions of this phase include non-Euclidean geometries, the beginnings of probability theory, vector spaces and matrix theory, and a deeper development of the theory of numbers. About a hundred years ago the third and most abstract and demanding phase, known as modern mathematics, began to evolve and become separate from the classical period. This phase has been concerned with the isolation of several recurrent structures of analysis worthy of independent study (these include abstract algebraic systems such as groups, rings, and fields; topological spaces; symbolic logic; and functional analysis, for example Hilbert and Banach spaces) and with various fusions of these systems (for example, algebraic geometry and topological groups). The rate of growth of mathematics has been so great that today most mathematicians are familiar in detail with the major developments of only a few branches of the subject. Our purpose is to give some hint of these topics. The reader interested in a somewhat more detailed treatment will find the best single source to be Mathematics: Its Content, Methods, and Meaning, the translation of a Russian work (Akademiia Nauk S.S.S.R. 1956).
Other general works are Courant and Robbins (1941), Friedman (1966), and Newman (1956). More specific references are given where appropriate. We do not here discuss probability, mathematical statistics, or computation, even though they are especially important mathematical disciplines for the social sciences, because they are covered in separate articles in the encyclopedia.
Ancient mathematics
The history of ancient mathematics divides naturally into three periods. In the first period, the pre-Hellenic age, the beginnings of systematic mathematics took place in ancient Egypt and in Mesopotamia. Contrary to much popular opinion, the mathematical developments in Mesopotamia were deeper and more substantial than those in Egypt. The Babylonians developed elementary arithmetic and algebra, particularly the computational aspects of algebra, to a surprising degree. For example, they were able to solve the general quadratic equation, $ax^2 + bx + c = 0$. An authoritative and readable account of Babylonian mathematics as well as of Greek mathematics is presented by Neugebauer (1951). The second period of ancient mathematics was the early Greek, or Hellenic, age. The fundamentally new step taken by the Greeks was to introduce the concept of a mathematical proof. These developments began around 600 B.C. with Thales, Pythagoras, and others, and reached their high points a little more than a century later in the work of Eudoxus, who is responsible for the theory of proportions, which in antiquity held the place now held by the modern theory of real numbers. The third period is the Hellenistic age, which extended from the third century B.C. to the sixth century A.D. The early part of this period, sometimes called the golden age of ancient mathematics, encompassed Euclid's Elements (about 300 B.C.), which is the most important textbook ever written in mathematics, the work on conics by Apollonius (about 250 B.C.), and above all the extensive and profound work of Archimedes on metric geometry and mathematical physics (Archimedes died in 212 B.C.). The second most important systematic treatise of ancient mathematics, after Euclid's Elements, is Ptolemy's Almagest (about A.D. 150). Ptolemy systematized and extended Greek mathematical astronomy and its mathematical methods.
The mathematical sophistication of Archimedes and the richness of applied mathematics evidenced by the Almagest were not equaled until the latter part of the seventeenth century.
Classical analysis
The intertwined and rapid growth of mathematics and physics during the seventeenth, eighteenth, and nineteenth centuries centered in a major way on what is now called classical analysis: the calculus of Newton and Leibniz, differential and integral equations and the special functions that are their solutions, infinite series and products, functions of a complex variable, extremum problems, and the theory of transforms. At the basis of all this are two major ideas, function and limit.
The first evolved slowly, beginning with the correspondence, established in the Cartesian fusion of the two best-developed areas of ancient mathematics, between algebraic expressions and simple geometric curves and surfaces, until we now have the present, very simple definition of the term "function." A set f of points in the plane (ordered pairs of numbers) of the form $(x, y)$ is called a function if at most one y is associated with each x. If $(x, y)$ is a member of f, it is customary to write $y = f(x)$; x is sometimes called the independent variable and y the dependent variable, but no causal meaning should be read into this terminology. The notion and notation may be generalized to more than one independent variable; if g is a set of ordered triples $(x, y, z)$ with at most one z associated with each pair $(x, y)$, then $z = g(x, y)$ is called a function of two arguments. Since the most general notion of function can relate any two sets of objects, not just sets of numbers, it is sometimes desirable to emphasize the numerical character of the function. Then f is said to be a real-valued function of a real variable; here the term "real" refers to real numbers (in contrast to complex numbers, which will be discussed later). Although a real-valued function has been defined as a set of ordered pairs of numbers, $(x, y)$, where the domain of x is an unspecified set of numbers, the subsequent discussion of functions is mostly confined to the familiar case in which the domain of x is an interval of numbers. Even when the discussion applies more generally, it is helpful to keep the interval case in mind.
A desire to understand limits was apparent in Greek mathematics, but a correct definition of the concept eluded the Greeks.
A fully satisfactory definition, which was not evolved until the nineteenth century (by Augustin Louis Cauchy), is the following: b is the limit of f at a if and only if for every positive number $\varepsilon$ there is a positive number $\delta$ such that, when the absolute value of $x - a$ is less than $\delta$ and greater than 0 (that is, $0 < |x - a| < \delta$), the absolute value of $f(x) - b$ is less than $\varepsilon$ (that is, $|f(x) - b| < \varepsilon$). In other words, b is the limit of f at a if x can be chosen sufficiently close to a (but not equal to a) to force $f(x)$ to be as close to b as desired. Symbolically, this is written $\lim_{x \to a} f(x) = b$. The limit of f at a may exist even though $f(a)$ is not defined; moreover, when $f(a)$ is defined, b may or may not equal $f(a)$. If it does (that is, if $f(x)$ is "near" $f(a)$ whenever x is "near" a), then f is said to be continuous at a. If f is continuous at each a in an interval, f is said to be continuous over that interval.
The calculus. The calculus defines two new concepts, the derivative and the integral, in terms of function and limit. They and their surprising relationship serve as the basis of the rest of mathematical analysis.
The derivative. The first definition arises as the answer to the question "Given a function f, what is its slope (or, equivalently, its direction or rate of change) at any point x?" For example, suppose that $y = f(x)$ represents the distance, y, that a particle has moved in x units of time; then what is the rate of change of distance, that is, the instantaneous velocity, at time x? If h is a short period of time, then an approximate answer is the distance traversed between x and $x + h$, that is, $f(x + h) - f(x)$, divided by the time, h, taken to travel that distance (see Figure 1). The approximation is better the smaller the value of h, which suggests the definition of the rate of change of f at x as the limit of this ratio as h approaches 0, that is,
$$\lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$
This limit, if it exists, is denoted by $f'(x)$ (or by $df(x)/dx$ or by $dy/dx$) and is called the derivative of f at x. If $f'(x)$ exists, then f can be shown to be continuous at x, but the converse is not true in general.
[Figure 1. Approximation of the derivative of f(x)]
One of the earliest and most important applications in the social sciences of the concept of a derivative has been to the mathematics of marginal concepts in economics. For example, let x represent output, $C(x)$ the cost of output x, and $R(x)$ the revenue derived from output x; then $C'(x)$ and
$R'(x)$ (or $dC(x)/dx$ and $dR(x)/dx$) are the marginal cost and marginal revenue, respectively. Marginal utility, marginal rate of substitution, and other marginal concepts are defined in a similar fashion. Many of the fundamental assumptions of economic theory receive precise formulation in terms of these marginal concepts.
The integral. The second concept in the calculus arises as the answer to the question "What is the area between the graph of a function f and the line $y = 0$ (the horizontal axis, or abscissa, of the coordinate system) over the interval from a to b?" (Regions below the abscissa are treated as negative areas to be subtracted from the positive ones above the abscissa; see Figure 2.) The solution, which will not be stated precisely, involves the following steps: the abscissa is partitioned into a finite number of intervals; using the height of the function at some value within each interval, the function is approximated by the resulting step function; the area under the step function is calculated as the sum of the areas of the rectangles of which it is composed; and, finally, the limit of this sum is calculated as the widths of the intervals approach zero (and, therefore, as their number approaches infinity). When this limit exists, it is called the Riemann integral of f from a to b and is symbolized as $\int_a^b f(x)\,dx$. It can be shown that the Riemann integral exists if f is continuous over the interval; it also exists for some discontinuous functions. For more advanced work, the concept of the length of an interval is generalized to the concept of the Lebesgue measure of a set, and the Riemann integral is generalized to the Lebesgue integral. Roughly, the vertical columns used to approximate the area in the Riemann integral are replaced in the Lebesgue integral by horizontal slabs.
[Figure 2. Approximation of the integral of f(x)]
Although the interpretation of the integral as an extension of the elementary concept of area is important, even more important is its relation (called the fundamental theorem of the calculus) to the derivative: Consider $F(x) = \int_a^x f(u)\,du$ as a function of the upper limit, x, of the interval over which the integral is computed; it can then be proved that the derivative of this function, $F'(x)$, exists and is equal to $f(x)$. Put another way, the rate of change at x of the area generated by f is equal to the value of f at x; or put still another way, the operation of taking the derivative undoes the operation of integration. This fact plays a crucial role in the solution of many problems of classical applied mathematics that are formulated in terms of derivatives of functions. Introductions to the calculus and elementary parts of analysis are Apostol (1961-1962) and Bartle (1964).
Implicit definitions of functions. An algebraic equation such as $2x^2 - 5x - 3 = 0$ implicitly defines two numbers (namely, the two values of x, 3 and $-\tfrac{1}{2}$) for which the equality holds. Other algebraic equations implicitly define sets of numbers for which they hold. A functional equation is an equality stated in terms of an unknown function; it implicitly defines those functions (as in the algebraic case, there may be more than one) that render the equality true.
Ordinary differential equations. Suppose it is postulated that the amount of interest (that is, the rate of change of money at time t) is proportional to (that is, is a constant fraction, k, of) the amount, $f(t)$, of money that has been saved. (This is the case of continuous compound interest.) Then f satisfies the equation $f'(t) = kf(t)$. This is a simple example of an ordinary differential equation, the solution of which is any function having the property that its derivative is k times the function. The solutions are $f(t) = f(0)\exp(kt)$, where $f(0)$ denotes the initial amount of money at time $t = 0$.
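A minimal numerical sketch of the compound-interest example, under stated assumptions: the Python fragment below approximates the solution of $f'(t) = kf(t)$ by Euler's method, a standard step-by-step scheme chosen here only for its simplicity. With n steps the scheme amounts to compounding interest n times, giving $f(0)(1 + kt/n)^n$, which approaches $f(0)\exp(kt)$ as n grows. The function names and the sample figures (initial amount 100, k = 0.05, ten time units) are hypothetical and serve only as illustration.

```python
import math


def euler_growth(initial, k, t_end, steps):
    """Approximate f(t_end) for f'(t) = k * f(t), f(0) = initial, by Euler's method."""
    dt = t_end / steps
    value = initial
    for _ in range(steps):
        # One Euler step: f(t + dt) is approximated by f(t) + f'(t) * dt.
        value += k * value * dt
    return value


def exact_growth(initial, k, t):
    """Closed-form solution f(t) = f(0) * exp(k * t)."""
    return initial * math.exp(k * t)


if __name__ == "__main__":
    # Illustrative (assumed) values: amount 100, rate k = 0.05, horizon t = 10.
    for steps in (10, 1000, 100000):
        print(steps, euler_growth(100.0, 0.05, 10.0, steps))
    print("exact", exact_growth(100.0, 0.05, 10.0))
```

Increasing the number of steps moves the approximation toward the exponential solution from below, which mirrors the passage from periodic to continuous compounding.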
Another simple economic example is the differential equation that arises from the assumption that marginal cost always equals average cost (that is, $dC(x)/dx = C(x)/x$), which has the solution that average cost is constant, that is, that $C(x) = kx$ for some constant, k. Some laws of classical physics are formulated as second-order, linear, ordinary differential equations of the form
$$f''(t) + P(t)f'(t) + Q(t)f(t) = R(t),$$
where $f''$ is the derivative of $f'$ ($f''$ is called the second derivative of f) and P, Q, and R are given functions. If, for example, f denotes distance, then this differential equation asserts that at each time t, a linear relation holds among distance, velocity,
and acceleration. A vast literature is concerned with the solutions to this class of equations for different restrictions on P, Q, and R; most of the famous special functions used in physics (Bessel, hypergeometric, Hankel, gamma, and so on) are solutions to such differential equations (see Coddington 1961).
Partial differential equations. Many physical problems require differential equations a good deal more complicated than those just mentioned. For example, suppose that there is a flow of heat along one dimension, x. Let $f(x,t)$ denote the temperature at position x at time t. With t fixed, one can find the rate of change (the derivative) of temperature with changes in x; denote this by $\partial f(x,t)/\partial x$ and its second derivative with respect to x by $\partial^2 f(x,t)/\partial x^2$. These are called partial derivatives. Similarly, holding x fixed, the derivative with respect to t is denoted by $\partial f(x,t)/\partial t$. According to classical physics, temperature changes due to conduction in a homogeneous one-dimensional medium satisfy the following partial differential equation:
$$\frac{\partial f(x,t)}{\partial t} = \frac{k}{\rho\sigma}\,\frac{\partial^2 f(x,t)}{\partial x^2},$$
where k is the thermal conductivity, $\rho$ the density, and $\sigma$ the specific heat of the medium. Problems involving two or more independent variables (usually, time and some or all of the three space coordinates), such as fluid flow, heat dissipation, elasticity, and electromagnetism, lead to partial differential equations. Their solution is often very complex and requires the specification of the unknown function along a boundary of the space. This requirement is called a boundary condition. (See Akademiia Nauk S.S.S.R. [1956] 1964, chapter 6.)
Integral equations. Some physical problems lead to integral equations. In one type, functions g and K of one and two variables, respectively, and a constant, $\lambda$, are given, and the problem is to find those functions, f, for which
$$f(x) = g(x) + \lambda \int_a^b K(x,y)\,f(y)\,dy.$$
This equation is called Fredholm's linear integral equation or the inhomogeneous linear integral equation. Basically, it asserts that the value of some quantity f at a point x is equal to an impressed value, $g(x)$, plus a weighted average of its value at all other points. Integral equations arise in empirical contexts for which it is postulated that the value of a function at a point depends on the behavior of the function over a large region of its domain. Thus, in the example just considered the value of f at x depends on the integrand $K(x,y)f(y)$ integrated over the interval $(a,b)$. There is a large body of literature dealing with the solution of various types of integral equations, especially those of interest in physics and probability theory.
Functional equations. Although both differential and integral equations (and mixtures of the two, called integrodifferential equations) are examples of functional equations, that term is often restricted to equations that involve only the unknown function, not its derivatives or integrals. A simple, well-known example is $f(xy) = f(x) + f(y)$, which implicitly defines those functions that transform multiplication into addition. If f is required to be continuous, then the solutions are $K\int_1^x dz/z$, where K is a positive constant; this integral is called the natural logarithm. The choice of K is usually referred to as the selection of the base of the logarithm.
Difference equations are functional equations of special importance in the social sciences. They arise both in the study of discrete stochastic processes (in learning theory, for example) and as discrete analogues of differential equations. Here the unknown function is defined only on the integers (or, equivalently, on any equidistant set of points), not on all of the real numbers, and so the function is written $f_n = f(n)$, where n is an integer. The equation states a relation among values of the unknown function for several successive integers. For example, the second-order, linear difference equation, which is the analogue of the second-order, linear, ordinary differential equation described above, is of the form
$$f_{n+2} + P_n f_{n+1} + Q_n f_n = R_n,$$
where $P_n$, $Q_n$, and $R_n$ are given.
In some probabilistic models of the learning process it is postulated (or derived from more primitive assumptions) that the probability of a particular response on trial $n + 1$, denoted by $p_{n+1}$, is some function of $p_n$ and of the actual events that occurred on trial n.
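The simplest such assumption is the linear one, that is, $p_{n+1} = \alpha p_n + \beta$, where $\alpha$ and $\beta$ are parameters that depend upon the events that actually occur. If there is a run of trials during which the same events occur, so that $\alpha$ and $\beta$ are constant, then the solution to the above first-order, linear difference equation is
$$p_n = \alpha^{n-1} p_1 + \frac{1 - \alpha^{n-1}}{1 - \alpha}\,\beta \qquad (\alpha \neq 1),$$
where $p_1$ is the probability on the first trial of the run.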
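The linear model $p_{n+1} = \alpha p_n + \beta$ with constant parameters can be checked numerically. The sketch below, a hedged illustration in Python, applies the recurrence a given number of times to a starting value and compares the result with the closed form $\alpha^m p + \beta(1 - \alpha^m)/(1 - \alpha)$ for m applications (for $\alpha \neq 1$). The function names and the sample parameter values are hypothetical, as is the choice to count applications of the recurrence instead of trial numbers.

```python
def iterate_linear(p_start, alpha, beta, steps):
    """Apply p -> alpha * p + beta to p_start the given number of times."""
    p = p_start
    for _ in range(steps):
        p = alpha * p + beta
    return p


def closed_form(p_start, alpha, beta, steps):
    """Closed-form value after `steps` applications of p -> alpha * p + beta."""
    if alpha == 1:
        # Degenerate case: each step simply adds beta.
        return p_start + steps * beta
    return alpha ** steps * p_start + beta * (1 - alpha ** steps) / (1 - alpha)


if __name__ == "__main__":
    # Assumed illustrative values: alpha = 0.9 and beta = 0.08, so that the
    # limiting value beta / (1 - alpha) is approximately 0.8.
    for steps in (0, 1, 5, 25, 100):
        print(steps,
              iterate_linear(0.2, 0.9, 0.08, steps),
              closed_form(0.2, 0.9, 0.08, steps))
```

When $|\alpha| < 1$ the iterates approach $\beta/(1 - \alpha)$ regardless of the starting value, which is the asymptotic response probability under a fixed run of events.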
When different events occur on different trials, the equation to be solved becomes considerably more complex. An introduction to difference equations is Goldberg (1958).
Given a functional equation, in the most general sense, the answer to the question of whether a solution exists is not usually obvious. Exhibiting a solution, of course, answers the question affirmatively, but often the existence of a solution can be proved before one is found. Such a result is known as an existence theorem. If a solution exists, it is also not usually obvious whether it is unique and, if it is not unique, how two different solutions relate to one another. A statement of the nature of the nonuniqueness of the solutions is known, somewhat inappropriately, as a uniqueness theorem. Some rather general existence and uniqueness theorems are available for differential and integral equations, but in less well understood cases considerable care is needed to discover just how restrictive the equation is. A general work on functional equations is Aczel (1966).
Three other areas of classical analysis. Three other branches of classical analysis will be briefly discussed.
Extremum problems. For what values of its argument does a function assume its maximum or its minimum value? This type of problem arises in theoretical and applied physics and in the social sciences. In its simplest form, a real-valued function f is defined over some interval of the real numbers, and the problem is to find those $x_0$ for which $f(x_0)$ is a maximum or a minimum. If f is differentiable and if $x_0$ is not one of the end points of the interval, a necessary condition is that $f'(x_0) = 0$; moreover, $x_0$ is a local maximum if $f''(x_0) < 0$ and a local minimum if $f''(x_0) > 0$. (These statements should be intuitively clear for graphs of simple functions.) From these results it is easy to find, for example, which rectangle has the maximum area when the perimeter is held constant: it is the square whose sides are each equal to a quarter of the perimeter.
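The rectangle example can be worked through numerically as a sketch. If one side has length x and the perimeter is fixed, the area is $A(x) = x(\text{perimeter}/2 - x)$; the Python fragment below locates the point where a finite-difference estimate of $A'(x)$ vanishes and checks that the estimated second derivative there is negative. The helper names, the use of central differences, and the bisection search are assumptions made for illustration; an exact calculation gives $A'(x) = \text{perimeter}/2 - 2x$ and $A''(x) = -2$.

```python
def area(x, perimeter):
    """Area of a rectangle with one side x and the given fixed perimeter."""
    return x * (perimeter / 2 - x)


def derivative(f, x, h=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + h) - f(x - h)) / (2 * h)


def second_derivative(f, x, h=1e-4):
    """Central-difference estimate of f''(x)."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / (h * h)


def bisect_root(g, lo, hi, tol=1e-10):
    """Find a root of g in [lo, hi], assuming g changes sign on the interval."""
    g_lo = g(lo)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        g_mid = g(mid)
        if (g_lo > 0) == (g_mid > 0):
            lo, g_lo = mid, g_mid
        else:
            hi = mid
    return (lo + hi) / 2


def best_side(perimeter):
    """Side length at which the estimated derivative of the area is zero."""
    f = lambda x: area(x, perimeter)
    return bisect_root(lambda x: derivative(f, x), 0.0, perimeter / 2)


if __name__ == "__main__":
    p = 20.0  # assumed illustrative perimeter
    x0 = best_side(p)
    print(x0, p / 4, second_derivative(lambda x: area(x, p), x0))
```

The side found by the search agrees with a quarter of the perimeter, and the negative second derivative confirms that the stationary point is a maximum.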
A much more difficult and interesting problem, the subject of the calculus of variations, is to find which function (or functions) f of a given family of functions causes a given function F of f (known as a functional) to assume its maximum or minimum value. For example, let f be a continuous function that passes through two fixed points in the plane, and let $F(f)$ be the surface area of the body that is generated by rotating f about the abscissa. A question that may be asked is "For which f (or f's) is $F(f)$ a minimum?" A major tool in the solution of this problem is a second-order, ordinary differential equation, known as Euler's equation, that f must necessarily satisfy (just as the solution $x_0$ to the simpler problem necessarily satisfies $f'(x_0) = 0$). (See Akademiia Nauk S.S.S.R. [1956] 1964, chapter 8.)
Within the past twenty years new classes of extremum problems have been posed and partially solved; they are mainly of concern in the social sciences, and they go under the names of linear, nonlinear, and dynamic programming. An example of a linear programming problem is the following diet problem. Each of several foodstuffs, $f_1, f_2, \ldots, f_k$, contains known amounts of various nutritional components, such as vitamins and proteins. Let $f_{ij}$ be the amount of component j in food $f_i$, $j = 1, 2, \ldots, n$, and let $a_j$ be the minimum amount of component j acceptable in the diet. If $x_i$ is the amount of food $f_i$ in the diet, the diet will be acceptable only if the following n inequalities are fulfilled:
$$x_1 f_{1j} + x_2 f_{2j} + \cdots + x_k f_{kj} \geq a_j, \qquad j = 1, 2, \ldots, n.$$
If $p_i$ denotes the price of food $f_i$, the problem is to choose the $x_i$ so as to minimize the cost, $x_1 p_1 + x_2 p_2 + \cdots + x_k p_k$, while fulfilling the above linear inequalities. [See PROGRAMMING.]
Functions of a complex variable. One of the most beautiful subfields of analysis is the theory of functions of a complex variable, which was developed in the nineteenth century, starting with the work of Cauchy. It has been significant in the growth of several two-dimensional, continuous physical theories, including parts of electromagnetism, hydrodynamics, and acoustics, but so far its applications in the social sciences have been mainly restricted to mathematical statistics, as in the concept of the characteristic function of a probability distribution. A complex number, z, is of the form $z = x + iy$, where x and y are real numbers and $i = \sqrt{-1}$. Sums and products are defined in such a way that the resulting arithmetic reduces to that of the ordinary numbers when $y = 0$. Because a point $(x,y)$ in the plane can be (usefully) identified with the complex number $x + iy$, functions from the plane into the plane can be interpreted as complex-valued functions of a complex variable. If the derivative of such a function exists at all points of a region, derivatives of all orders exist and the function can be expressed as a convergent power series of the form $a_0 + a_1 z + a_2 z^2 + \cdots$ for some circle of z's within that region. It is clear from this result that the mere supposition that the derivative exists is a much stronger condition for complex-valued functions than for ordinary numerical functions. Such functions, which are called analytic, are very strongly constrained (among other things, specifying an analytic function over a small region determines it completely), and this fact has been
effectively exploited to solve many two-dimensional problems of theoretical and practical interest. Interestingly, the theory cannot be neatly generalized beyond two dimensions. An introductory work on functions of a complex variable is Cartan (1961).
Integral transforms. Suppose that f is any continuous, real-valued function defined over an interval from a to b and that K is a fixed, continuous, real-valued function of two variables, the first of which is also on the interval from a to b; then $I(f, y) = \int_a^b K(x,y)\,f(x)\,dx$ is called an integral transform of f. If K satisfies certain restrictions, knowing I is equivalent to knowing f. Nevertheless, if K is carefully chosen, I may have convenient properties not possessed by f. For example, if $a = 0$, $b = \infty$, and $K(x,y) = e^{-xy}$, then I, which is then known as the Laplace transform and which is closely related to the moment-generating function of statistics, has the property that it converts certain integrals (convolutions) of two functions into multiplications of their transforms. In statistics such a convolution represents the distribution of the sum of two independent random variables. Another well-known and important example is the Fourier transform, which is used widely in statistics, and to a lesser extent in probabilistic models of behavior, to obtain a probability distribution from its characteristic function.
Theory of numbers
Despite several intellectual crises that led mathematicians to introduce new types of numbers into mathematics, it was not until about a hundred years ago that numbers were treated as being something other than intuitively understood. The natural numbers, $1, 2, 3, \ldots$, and their ratios, the positive rationals, are ancient concepts. The Greeks first noted their incompleteness when they showed that they are inadequate to represent $\sqrt{2}$, the length of the diagonal of a square whose side is of length 1.
Certain irrational numbers had to be added, and later 0, negative numbers, and complex numbers were added so that certain classes of equations would all have solutions. To clarify this patchwork and to understand the uniqueness of the additions, nineteenth-century mathematicians undertook the axiomatization of various aspects of the number system. Perhaps the most subtle step was the definition of irrational numbers in terms of sets of rational numbers (roughly, the set of all rationals less than the irrational to be defined). The axiomatization of numbers is not really the mainstream of the "theory of numbers." When one sees a book or course with that title, it usually refers to the study of properties of the natural numbers, mainly the prime numbers. Recall that
an integer greater than 1 is prime if it is divisible only by 1 and itself; the first few primes are 2, 3, 5, 7, 11, and 13. In addition to the many results that can be proved directly (some of which were known to the ancients), such as that every integer greater than 1 can be represented uniquely as the product of powers of primes and that there are infinitely many primes, other results have depended upon the application of deep results from analysis. For example, parts of the theory of functions of a complex variable were used to show that the number of primes not larger than $n$ divided by the number $n/\ln n$, where $\ln n$ is the natural logarithm of $n$, that is, $\int_1^n dx/x$, is a ratio that approaches 1 as $n$ becomes large. Not only has this work greatly increased the depth of understanding of integers, but it has fed back into analysis and was one of the factors leading to the development of parts of contemporary abstract algebra.

Many applications of mathematics (for example, in statistics) involve counting the number of distinct events or objects that satisfy certain conditions; often these counting problems are quite difficult. Theorems providing explicit formulas or recursion schemes are called combinatorial theorems. One of the earliest important examples was the binomial theorem for the expansion of $(a + b)^n$, which is now part of every elementary algebra course. [See PROBABILITY, article on FORMAL PROBABILITY.] A general introduction to the theory of numbers is Ore (1948).

Algebra

Classically, algebra was the theory of solving equations expressed in terms of the four arithmetical operations: addition, subtraction, multiplication, and division. The linear and quadratic equations of elementary algebra are familiar examples. Historically, the expression of mathematical problems in the form of equations, using letters to stand for the unknown numbers, was a major step in clarifying and simplifying the mathematical nature of many kinds of problems.
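As an aside on the number-theoretic results described above, the statement about the density of primes can be explored numerically. The sketch below is an illustration added here, not a method taken from the works cited: the names `primes_up_to` and `prime_density_ratio` are hypothetical, and the sieve of Eratosthenes is simply one convenient way to count primes.

```python
import math


def primes_up_to(n):
    """Return all primes <= n, found with the sieve of Eratosthenes."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(n) + 1):
        if is_prime[p]:
            # Every multiple of p from p*p upward is composite.
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return [k for k, flag in enumerate(is_prime) if flag]


def prime_density_ratio(n):
    """Number of primes <= n, divided by n / ln n."""
    return len(primes_up_to(n)) / (n / math.log(n))


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        print(n, round(prime_density_ratio(n), 4))
```

For the values of n tried here the ratio stays above 1 and drifts downward only slowly, which is consistent with the limit statement but is in no sense a proof of it.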
Perhaps the most important consequence of the introduction of letters and the use of equations was the extension of routine methods of calculation to quite complicated settings. The introduction of algebraic equations probably ranks in importance in the history of ideas with the earlier invention, probably first by the Babylonians, of the place-value system of notation for numbers; such a system was needed to develop simple algorithms for performing arithmetical computations. The general theory of algebraic equations, the elementary parts of which are studied in high
school, has a long and distinguished history in mathematics. The proof by Niels Henrik Abel in 1824 that solutions of an algebraic equation of degree five or greater, where the degree is the highest exponent of any term in the equation, cannot in general be expressed in terms of radicals (that is, expressions built from the coefficients by the arithmetical operations and the extraction of roots) was one of the most important mathematical results of the first half of the nineteenth century. Another result of basic importance is the fundamental theorem of algebra, which was first proved in the eighteenth century but which was proved rigorously only in the last half of the nineteenth century. This theorem asserts that every algebraic equation has at least one root that is a real or a complex number. Also of great significance were the proofs that not all numbers are roots of algebraic equations; numbers that are not such roots are called transcendental numbers. The most famous proofs of this sort are Charles Hermite's (in 1873) that $e$ is transcendental and F. Lindemann's (in 1882) that $\pi$ is transcendental.

Orderings. Much of the work in algebra during the present century has been devoted to generalized mathematical systems that are characterized not in terms of the four fundamental arithmetical operations but in terms of generalizations of these operations and of the familiar ordering relations of "less than" and "greater than." In a number of the social sciences the theory of binary relations has received extensive application. From an algebraic standpoint a binary relation structure may be characterized as consisting of a set $A$ and a set $R$ of ordered pairs $(x, y)$, where $x$ and $y$ are both elements of $A$. Such an $R$ is called a binary relation on $A$.
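For a finite set, a binary relation can be written down literally as a set of ordered pairs, which makes the ordering properties defined in the text that follows mechanical to test. The following is a minimal sketch under that finiteness assumption; the function names and the two example relations (`divides` and `at_most`) are illustrative choices, not taken from the article.

```python
from itertools import product


def is_reflexive(A, R):
    """xRx for every x in A."""
    return all((x, x) in R for x in A)


def is_antisymmetric(R):
    """If xRy and yRx, then x = y."""
    return all(x == y for (x, y) in R if (y, x) in R)


def is_transitive(R):
    """If xRy and yRz, then xRz."""
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)


def is_connected(A, R):
    """For any two distinct x, y in A, either xRy or yRx."""
    return all((x, y) in R or (y, x) in R
               for x, y in product(A, A) if x != y)


def is_partial_ordering(A, R):
    return is_reflexive(A, R) and is_antisymmetric(R) and is_transitive(R)


A = {1, 2, 3, 4, 6, 12}
# "x divides y" and "x is at most y", each as a set of ordered pairs on A.
divides = {(x, y) for x, y in product(A, A) if y % x == 0}
at_most = {(x, y) for x, y in product(A, A) if x <= y}

if __name__ == "__main__":
    print(is_partial_ordering(A, divides), is_connected(A, divides))
    print(is_partial_ordering(A, at_most), is_connected(A, at_most))
```

In this example divisibility is a partial ordering that is not connected (neither 4 nor 6 divides the other), whereas the usual numerical order is both.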
A relation $R$ is said to be a partial ordering of $A$ when it is reflexive, antisymmetric, and transitive, that is, when it satisfies the following three properties: reflexive: for every $x$ in $A$, $xRx$; antisymmetric: for every $x$ and $y$ in $A$, if $xRy$ and $yRx$, then $x = y$; transitive: for every $x$, $y$, and $z$ in $A$, if $xRy$ and $yRz$, then $xRz$. If $R$ is also connected in $A$ (that is, if for any two elements $x$ and $y$ in $A$ with $x \neq y$, either $xRy$ or $yRx$) then $R$ is said to be a complete or simple ordering or, sometimes, a linear ordering of $A$. The concept of a complete ordering is a direct abstraction of the order properties of "$\leq$" with respect to the real numbers. A familiar use of the concept of an ordering relation is in utility theory, particularly in the classical theory of demand in economics, in which it is assumed that each individual has an ordering relation over the set of commodity bundles or, more generally, over the set of alternatives with which he is presented. The general concept of ordering relations also has far-ranging applications
in the theory of measurement within psychology and sociology, and more general binary relations have been extensively applied in anthropology in the study of kinship systems. Partial orderings can be extended in another direction by imposing additional conditions to obtain lattices, which have also been used in the social sciences. In a different direction, but still within the framework of binary relations, is the theory of graphs, in which no restrictions are placed on the binary relation, $R$. Applications of graph theory have been made to social-psychological and sociological problems, especially to provide a mathematical method for representing various kinds of relationships between persons.

Groups, rings, and fields. Another direction of generalization of classical algebra has been to what are called groups, rings, and fields. A group is a set $A$ together with a binary operation, $\circ$, satisfying the following axioms. First, the operation $\circ$ is associative, that is, for $x$, $y$, and $z$ in $A$, $x \circ (y \circ z) = (x \circ y) \circ z$. Second, there is an element $e$, called the identity, of the set $A$ such that for every $x$ in $A$, $x \circ e = e \circ x = x$. And, finally, for each element $x$ of $A$ there is an inverse element $x^{-1}$ such that $x \circ x^{-1} = e$. It is obvious that if $A$ is taken as the set of integers, $\circ$ as the operation of addition, $e$ as the number 0, and the inverse of $x$ as the negative of $x$, then the set of integers is a group under the binary operation of addition. The theory of groups has had profound ramifications in other parts of mathematics and in the sciences, ranging from the theory of algebraic equations to geometry and physics. The reason for the fundamental importance of group theory is perhaps best summarized by stating that a group is the appropriate way to formulate the very important concept of symmetry.
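The group axioms just stated can be checked exhaustively when the set is finite. Because the integers under addition form an infinite set, the sketch below uses addition modulo 4 on {0, 1, 2, 3} as a finite stand-in; that choice, and the names `find_identity` and `is_group`, are assumptions made for illustration only.

```python
from itertools import product


def find_identity(A, op):
    """Return e with x op e == e op x == x for all x in A, or None."""
    for e in A:
        if all(op(x, e) == x and op(e, x) == x for x in A):
            return e
    return None


def is_group(A, op):
    """Exhaustively check the group axioms for a finite set A."""
    # A binary operation on A must return elements of A (closure).
    if any(op(x, y) not in A for x, y in product(A, A)):
        return False
    # Associativity: x o (y o z) == (x o y) o z.
    if any(op(x, op(y, z)) != op(op(x, y), z)
           for x, y, z in product(A, A, A)):
        return False
    # Identity element (compared with None because 0 is a valid identity).
    e = find_identity(A, op)
    if e is None:
        return False
    # Inverses; both x o y == e and y o x == e are required here.
    return all(any(op(x, y) == e and op(y, x) == e for y in A) for x in A)


def add_mod4(x, y):
    return (x + y) % 4


def mul_mod4(x, y):
    return (x * y) % 4


Z4 = {0, 1, 2, 3}

if __name__ == "__main__":
    print(is_group(Z4, add_mod4))  # addition modulo 4
    print(is_group(Z4, mul_mod4))  # multiplication modulo 4
```

Under addition modulo 4 every axiom holds, with 0 as the identity; under multiplication modulo 4 the element 0 has no inverse, so the same set fails to be a group.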
In the range of applications of group theory just mentioned, the underlying thread is the concept of symmetry, whether it is in the symmetry of the roots of an equation or the symmetry properties of the fundamental particles of physics. As a simple example, consider the finite group of rotations 90°, 180°, 270°, and 360°. A square does not change its apparent orientation under such a rotation about its center, but an equilateral triangle does. This group of rotations is the symmetry group of rotations for a square but not, of course, for an equilateral triangle. Although the methods and results of group theory have not yet had special applications of depth in the social sciences, they are important to many of the general mathematical results that have been applied.

The theories of rings and fields represent rather direct generalizations of arithmetical properties of the number system. The theory of groups is fundamentally a generalization of the concept of a single binary operation, such as addition or multiplication, whereas rings and fields are algebraic systems that have two fundamental operations. The most familiar example of a field or of a ring is the set of rational numbers or of real numbers with respect to the operations of addition and multiplication.

Boolean algebras. Algebraic aspects of the theory of sets have been studied under the heading of Boolean algebras. The concept of an algebra of sets, that is, a collection of sets closed under union and complementation, is fundamental in the modern theory of probability, where events are interpreted as sets of possible outcomes and numerical probabilities are assigned to events. [See PROBABILITY, article on FORMAL PROBABILITY.]

Isomorphism and homomorphism. It should be mentioned that certain very general mathematical concepts find their most natural definition and application in modern algebra. One of the most important concepts is that of the isomorphism of two mathematical systems. An isomorphism is a one-to-one mapping of a system $A$ onto a system $B$ in which the operations and relations of $A$ are preserved under the mapping and have the same structure as the operations and relations of system $B$. If the mapping is not one-to-one but the operations and relations are preserved, then it is called a homomorphism. A well-known application of the concept of isomorphism in the social sciences is in theories of fundamental measurement in which one shows that an appropriate algebra of empirical operations is isomorphic to some numerical algebra. It is this isomorphism that permits the direct application of computational methods to the results of measurement. Introductory works on algebra, both for this and for the next section, are Birkhoff and MacLane (1941) and Mostow, Sampson, and Meyer (1963).

Vector spaces and matrix algebra

Linear algebra is one of the most important generalizations of classical elementary algebra.
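Before turning to matrices, the distinction drawn above between an isomorphism and a homomorphism can be illustrated on a small sample of integers. This is only a sketch: the helper names are hypothetical, and a check over a finite sample can refute the preservation of an operation but cannot prove it for all integers.

```python
def preserves_operation(f, op_a, op_b, sample):
    """True if f(x op_a y) == f(x) op_b f(y) for all x, y in the sample."""
    return all(f(op_a(x, y)) == op_b(f(x), f(y))
               for x in sample for y in sample)


def is_one_to_one(f, sample):
    """True if no two sample points share an image under f."""
    images = [f(x) for x in sample]
    return len(set(images)) == len(images)


def add(x, y):
    return x + y


def add_mod3(x, y):
    return (x + y) % 3


def residue(x):
    # Maps the integers under addition onto {0, 1, 2} under addition mod 3.
    return x % 3


def double(x):
    # Maps the integers under addition onto the even integers under addition.
    return 2 * x


def square(x):
    return x * x


sample = range(-6, 7)

if __name__ == "__main__":
    for f, target_op in ((residue, add_mod3), (double, add), (square, add)):
        print(f.__name__,
              preserves_operation(f, add, target_op, sample),
              is_one_to_one(f, sample))
```

On this sample, `residue` preserves addition without being one-to-one (the behavior of a homomorphism), `double` preserves addition and is one-to-one (the behavior of an isomorphism onto the even integers), and `square` does not preserve addition at all.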
The objects to which the operations of addition and multiplication are applied are now matrices, vectors of an $n$-dimensional space, and linear transformations (an $n \times n$ matrix is a particular representation of a linear transformation in $n$-dimensional space). More particularly, linear algebra arises as a generalization of the linear equations so familiar in elementary algebra, and historically one of the most important tasks of linear algebra has been to find solutions of systems of linear equations. As many research workers in the social sciences know,
the numerical solution of linear equations can be an extremely laborious and difficult affair when the number of equations is large. The set of coefficients of a system of linear equations gives rise to the concept of a rectangular array of numbers, which is precisely what a matrix is. An algebra of matrices in terms of addition and multiplication may be constructed; the distinguishing feature of this algebra, as compared with the algebra of the real numbers, is that multiplication is not commutative (that is, $AB$ is not usually equal to $BA$) and the product of two nonzero matrices can be zero. The intuitive geometric concept of a vector may be represented by a column or row of $n$ numbers, and an algebra of vectors, which bears a close resemblance to the algebra of numbers, may be constructed. Simple (linear) transformations of vectors, such as rotations and stretches of the coordinate system in space, can be interpreted as multiplication by matrices. The interaction between the geometrical intuitions about $n$-dimensional space and the algebraic techniques of calculation provided by linear algebra and the theory of matrices has made them powerful tools in the application of mathematics to many parts of science. These applications have been particularly prominent in statistics (for example, in factor analysis), as well as in economics, where it is often useful to treat $n$-dimensional bundles of commodities as vectors.

Topology and abstract spaces

Intuitively, a topological transformation of a geometrical figure or object is a deformation that introduces neither breaks nor fusions in the object. Put more exactly, a topological transformation is one that is one-to-one, is continuous, and has a continuous inverse.
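Looking back at the matrix algebra of the preceding section, the two features singled out there (noncommutative multiplication and nonzero matrices whose product is zero) can be seen with 2 x 2 examples. The sketch below uses plain lists of rows rather than any matrix library; the helper `matmul` and the particular matrices are illustrative assumptions.

```python
def matmul(A, B):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


A = [[0, 1],
     [0, 0]]
B = [[1, 0],
     [0, 0]]

AB = matmul(A, B)  # the zero matrix, although A and B are both nonzero
BA = matmul(B, A)  # equal to A, hence different from AB

# A quarter-turn rotation and a stretch along the first axis,
# each applied to a column vector by matrix multiplication.
rotate_90 = [[0, -1],
             [1, 0]]
stretch_x = [[2, 0],
             [0, 1]]
rotated = matmul(rotate_90, [[1], [0]])
stretched = matmul(stretch_x, [[1], [1]])

if __name__ == "__main__":
    print(AB, BA)
    print(rotated, stretched)
```

The rotation carries the vector (1, 0) to (0, 1), and the stretch carries (1, 1) to (2, 1), which is the sense in which such transformations are "multiplication by matrices."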
If one starts with a circle (perhaps the best example of a simple closed curve), one can deform it topologically into an ellipse or into the shape of a crescent, but one cannot deform it topologically into a figure eight, for example, because then two distinct points of the circle are fused as the intersection point of the eight. Also, one cannot deform it into a straight line segment, because to do so would introduce a break in the circle. Many familiar qualitative geometrical properties are topological invariants in the sense that they are not altered (are invariant) under topological transformations. Examples are the property of being inside or outside a closed figure in the plane; the property of a surface being closed, such as the surface of a sphere or an ellipsoid; or the property of the dimension of an object. For example, the surface of a sphere cannot be
topologically transformed into a one-dimensional curve or a three-dimensional sphere. We shall not attempt here to give an exact definition of continuity as it is used in topology; we simply remark that it is a reasonable generalization of the concept of continuity used in analysis. Topological methods and results have far-reaching applications in many branches of mathematics, but as yet the methods themselves have not been directly applied in those parts of the social sciences concerned extensively with empirical data. The most direct applications have been in economics, where topological fixed-point theorems have been of great importance in investigating the conditions guaranteeing the existence of a stable equilibrium in a competitive economy. The classical example of a fixed-point theorem, first proved by L. E. J. Brouwer at the beginning of this century, states that for every continuous mapping of an $n$-dimensional solid sphere (a ball) into itself there is always at least one point that maps into itself, that is, remains fixed. Familiar examples of such mappings are rotations in two or three dimensions for which the center of the rotation is the fixed point of the transformation.

Topological space. As a typical example of abstraction in modern mathematics, the initial concept of a topological transformation of familiar geometrical figures has led to the general abstract notion of a topological space. Roughly speaking, a topological space consists of a set, $X$, and a family, $\mathcal{O}$, of subsets of $X$, called open sets, for which the following four conditions are satisfied: the empty set is in $\mathcal{O}$; $X$ is in $\mathcal{O}$; the union of arbitrarily many sets each of which is in $\mathcal{O}$ is also in $\mathcal{O}$; and the intersection of any finite number of sets from $\mathcal{O}$ is also in $\mathcal{O}$. The concept of an open set is a generalization of the notion of an open interval of real numbers (an interval that does not include its end points).
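When $X$ is finite, the four conditions for a topological space can be tested directly. The following is a minimal sketch under that finiteness assumption; the function name `is_topology` and the example families are hypothetical. For a finite family, closure under pairwise unions and intersections is enough, since arbitrary unions and finite intersections then follow by repetition.

```python
from itertools import combinations


def is_topology(X, family):
    """Check the four open-set conditions for a finite set X."""
    X = frozenset(X)
    opens = {frozenset(s) for s in family}
    # Conditions 1 and 2: the empty set and X itself are open.
    if frozenset() not in opens or X not in opens:
        return False
    # Every open set must be a subset of X.
    if any(not s <= X for s in opens):
        return False
    # Conditions 3 and 4, reduced to pairs because the family is finite.
    for U, V in combinations(opens, 2):
        if (U | V) not in opens or (U & V) not in opens:
            return False
    return True


X = {1, 2, 3}
nested = [set(), {1}, {1, 2}, {1, 2, 3}]
not_closed = [set(), {1}, {2}, {1, 2, 3}]  # lacks the union {1, 2}
trivial = [set(), {1, 2, 3}]

if __name__ == "__main__":
    print(is_topology(X, nested))
    print(is_topology(X, not_closed))
    print(is_topology(X, trivial))
```

The family `not_closed` fails only because the union of {1} and {2} is missing, which shows how restrictive the union condition already is on a three-point set.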
For example, the natural topology of the real line is the family of open intervals together with the sets that are formed from arbitrary unions and finite intersections of open intervals. Generally speaking, the notion of open set is used to express the idea of continuity. The important thing about a continuous function is that it does not jumble neighboring points too much, and this requirement may be expressed by requiring of a topological transformation that open sets be mapped into open sets and that the inverse of an open set be an open set. Metric space. Other kinds of abstract spaces have come into prominence in the development of topology. Perhaps the most important is the concept of a metric space. A set, X, together with a
distance function, $d$, that maps pairs of points into real numbers is called a metric space if $d$ satisfies the following conditions: $d(x, y) = 0$ if and only if $x = y$, that is, the distance between $x$ and $y$ is 0 if and only if $x$ and $y$ are the same point; $d(x, y) \geq 0$, which asserts that distance is a nonnegative real number; $d(x, y) = d(y, x)$, that is, distance is symmetric; and, finally, $d(x, y) + d(y, z) \geq d(x, z)$, which is known as the triangle inequality. The concept of a metric space has had important applications in many parts of mathematics and is a fundamental concept in modern mathematics. It has been applied in recent work in scaling theory in psychology and sociology, particularly to the problems of multidimensional scaling, and also in certain areas of mathematical economics [see SCALING]. It is clear that the notion of a metric space generalizes, in a very natural way, the concept of distance in Euclidean space. A typical metric problem raised in the social sciences is this: Given data in the form of "distances" among a finite set of points, what is the Euclidean space of smallest dimension within which the points can be embedded so that these distances equal the Euclidean or some other preassigned metric of that space? Recently this problem has been effectively generalized by permitting certain transformations of the "distances" that preserve their metric property. Little has yet been done about embeddings in non-Euclidean spaces. An introductory work on topology is Hocking and Young (1961).

Foundations

As was remarked above, the concept of a rigorous mathematical proof originated in ancient Greek mathematics. The modern formal axiomatic method, characteristic of twentieth-century mathematical research and one of the most important topics to be clarified in modern research on foundations of mathematics, is conceptually very close to the approach followed in Euclid's Elements.
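As a brief aside that returns to the four metric-space conditions stated above, those conditions can be tested on any finite set of points, which is the situation in scaling problems where the data are "distances" among finitely many objects. The sketch below is illustrative only: the names `is_metric`, `city_block`, and `squared_euclidean`, and the sample points, are assumptions introduced here.

```python
from itertools import product


def is_metric(points, d):
    """Check the metric conditions for d on a finite collection of points."""
    for x, y in product(points, repeat=2):
        if d(x, y) < 0:                  # distances are nonnegative
            return False
        if (d(x, y) == 0) != (x == y):   # zero exactly when x and y coincide
            return False
        if d(x, y) != d(y, x):           # symmetry
            return False
    # Triangle inequality: d(x, y) + d(y, z) >= d(x, z).
    return all(d(x, y) + d(y, z) >= d(x, z)
               for x, y, z in product(points, repeat=3))


def city_block(p, q):
    return abs(p[0] - q[0]) + abs(p[1] - q[1])


def squared_euclidean(p, q):
    return (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2


points = [(0, 0), (1, 0), (2, 0), (0, 3)]

if __name__ == "__main__":
    print(is_metric(points, city_block))
    print(is_metric(points, squared_euclidean))
```

The city-block distance satisfies all four conditions on these points, whereas squared Euclidean distance violates the triangle inequality: from (0, 0) to (2, 0) it gives 4, but going through (1, 0) gives 1 + 1 = 2.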
The main difference is that the primitive concepts of the theory are now treated as undefined or meaningless. All that is assumed about them must be formally expressed in the axioms. In contrast, in the Elements primitive concepts such as those of point and line are given an interpretation or meaning from the very beginning. This modern conception originated with David Hilbert, who provided the first complete, modern axiomatization of geometry in 1899. It is customary to say that the concepts of the theory are implicitly defined by the axioms. What is not recognized often enough is that the collection of axioms together explicitly defines the theory embodied in the concepts. Thus, in slightly more exact phrasing, the axioms of Euclidean geometry define the theory of Euclidean geometry by defining the phrase "is a model of Euclidean geometry." In the same fashion, the axioms of group theory define the theory of groups by specifying what kinds of objects are called groups or, in other words, what kinds of objects are models of the theory of groups (here we are using the term "model" in the logical or mathematical sense).

A more particular aim of foundational research has been to provide a set of axioms that would serve as a basis for the main body of mathematics. At least three major positions on the foundations of mathematics have been enunciated in the twentieth century; they differ in their conception of the nature of mathematical objects.

Intuitionism. Intuitionism holds that in the most fundamental sense mathematical objects are themselves thoughts or ideas. The intuitionist holds that one can never be certain that he has correctly expressed the mathematics when it is formalized as a mathematical theory. As part of this thesis, the classical logic of Aristotle, in particular the law of excluded middle, has been challenged by Brouwer and other intuitionists because it permits the derivation of purely existential, nonconstructive statements about mathematical objects. In particular the validity of classical reductio ad absurdum proofs depends upon this logical law. Although intuitionists express themselves in a way which suggests a psychological analysis of mathematics, it should be emphasized that their conception of mathematical objects as thoughts has not been seriously explored by any intuitionists from the standpoint of scientific psychology.

Platonism. A second view of mathematics, the Platonistic one, is that mathematical objects are abstract objects that exist independently of human thought or activity.
Those who hold that set theory or logic itself provides an appropriate foundation for mathematics (adherents of logicism) usually adopt some form of Platonism in their basic attitude. From the standpoint of working mathematics, set theory, and thus Platonism, has been the most influential conception of mathematics in this century. Set theory itself originated in the late nineteenth century with the revolutionary work of Georg Cantor. Its foundations were called into question by Bertrand Russell's discovery of a simple paradox which arises in considering the set of all objects that are not members of themselves. If it is supposed that to every property there corresponds the set of objects having this property, then a contradiction within classical logic may easily be derived by considering the set whose members are those and only those sets that are not members of themselves. An apparently satisfactory foundation for set theory, which avoids this and related paradoxes, was formulated in 1908 by Ernst Zermelo, and with suitable technical extensions it provides a satisfactory basis for most of the mathematics published in this century.

Formalism. The third influential position on the foundation of mathematics, called formalism, was developed by Hilbert and others. This view is that the primary mathematical objects are the symbols in which mathematics is written. This carries to the extreme the development of the axiomatic method begun by the Greeks. Under the formalist account the interpretation and use of mathematics must then be given from outside pure mathematics. From a psychological or behavioral standpoint, there is much that is appealing about formalism, but again little effort has yet been made to relate the detailed results and methods of formalism to theoretical or experimental work in scientific psychology.

Relevance of research on foundations. In view of the high degree of agreement about the validity of most published pieces of mathematics, the skeptical social scientist may question the real relevance of these varying views about the foundations of mathematics to working mathematics itself. There is a highly invariant content of mathematics recognized by almost all mathematicians, including those concerned with the foundations of mathematics, and this invariant content is essentially untouched by radically different philosophical views about the nature of mathematical objects. A reasonable conjecture is that future research in the foundations of mathematics will attempt to capture this invariant content by concentrating on the character of mathematical thinking rather than on the nature of mathematical objects.
One other important aspect of foundational research in the twentieth century is the fundamental work on mathematical logic, in particular the attempt by Gottlob Frege, A. N. Whitehead, Bertrand Russell, and others to reduce all of mathematics to purely logical assumptions. These efforts have led to great clarification of the nature of mathematics itself and to vastly increased standards of precision in talking about mathematical proofs and the structure of mathematical systems. Of major importance were the deep results of Kurt Gödel (1931) on the logical limitations of any formal system rich enough to express elementary number theory. His results show that any such formal system must be essentially incomplete in the sense that not all true sentences of the theory can be proved as theorems. An introductory work on foundations is Kneebone (1963).

Mathematics applied to social sciences

Applications of mathematics to specific social science problems are described, and detailed references are given, elsewhere in this encyclopedia. That material is not repeated here; several reasonably general references are Allen (1938), Coleman (1964), Kemeny and Snell (1962), Luce (1964), Luce, Bush, and Galanter (1963-1965), Samuelson (1947). Suffice it to say that these applications involve only fragments of the whole of mathematics, and they have not been as successful as those in the physical sciences. The reasons are many, among them these: the effort so far expended is much less; the basic empirical concepts and variables have not been isolated and purified to the same degree; mathematics grew up with and was to some extent molded by the needs of physics, and so it may very well be less suited to social science problems if these problems are of a basically different character from those of physics; a typical social science problem appears to involve more variables than one is accustomed to handling in physics; and, finally, social scientists are generally not extensively trained in mathematics.

A social scientist who attempts to formulate and solve a scientific problem in mathematical terms is often disappointed with the mathematics he can find. This may happen simply because a mathematical system appropriate to his problem does not seem to have been invented, or, as is more common, the definite and often quite complex mathematical system that he happens to want to understand in depth has not been investigated in any detail.
In this century especially, mathematicians have tended to focus on very general classes of systems, and the theorems concern properties that are true of all or of large subclasses of them; however, these results do not usually provide much detailed information about any particular member of the class. As an example, the axioms of group theory are not categorical; that is, two groups need not be isomorphic. Therefore, theorems about groups in general tell one little about the specific properties of a particular group. But this is what is of interest when a particular group is used to represent an empirical structure, as in modern particle physics.
When this happens, it is necessary for the applied mathematician to carry out considerable mathematical analysis to achieve the understanding he needs to answer scientifically interesting questions. We have already discussed two parts of mathematics in which highly specific systems have been explored in depth: classical analysis and matrix algebra. A primary motivation for this detailed work was the needs of physical science. In fortunate instances, a problem may be formulated in terms of one of these systems, in which case specific results can sometimes be extracted from the existing literature. Examples where this has been done are in the application of matrix algebra to factor analysis and of Markov chains (a part of probability theory) to several areas, including learning, social interaction, and social structure [see FACTOR ANALYSIS; MARKOV CHAINS]. Theory as detailed as this, however, is not typical of contemporary mathematics. We have in mind such active areas as associative and nonassociative algebras, homological algebra, group theory, topological groups, algebraic topology, rings, manifolds, and functional analysis. The generality of contemporary mathematics can be seductive in that it invites sophistic treatments of scientific problems. It is often not difficult to find some general branch of mathematics within which to cast a specific social or behavioral problem without, however, actually capturing in detail the various constraints of the problem. Without these constraints few explicit results and predictions can be proved. Nevertheless, the real emptiness of such endeavors can be shrouded for the unwary in the impressive symbolism and ringing terms of whatever mathematics it is that is not being seriously used. If the growth of the social sciences parallels at all that of the physical sciences, they will study in detail various systems, which, although of peripheral mathematical interest, are of substantive interest. 
Indeed, some examples already exist, including these: (1) Just as classes of maximum and minimum problems have been formulated and solved in the physical sciences, other classes have arisen in the social sciences, such as linear, nonlinear, and dynamic programming, game theory, and statistical decision theory. (2) Various mathematical structures that may correspond to (parts of) empirical structures have been investigated, for example, aspects of the theory of relations and the closely related theory of graphs, matrix algebra, and concatenation algebras, which arose in the study of grammar and syntax. (3) Underlying the
success of much physical theory is the fact that many variables can be represented numerically. The theories that account for this in physics are not suitable for the social sciences, but alternative possibilities are under active development, particularly in terms of theories of fundamental and derived measurement. The mathematics is reasonably involved, although for the most part the proofs are self-contained. (4) Although the theory of stochastic processes is a well-developed part of probability theory, a number of the processes that have found applications in the social sciences had not previously been studied by probabilists; their properties have been partially worked out in the social science literature. Among the most prominent examples are the nonstationary processes that have arisen in learning theory. Some of these postulate that on each trial one of several operators $Q_i$ transforms a response probability into the corresponding probability on the next trial. Two special cases have been most adequately studied. One assumes that the $Q_i$ are linear operators and the other assumes that the operators commute with one another, that is, $Q_iQ_j = Q_jQ_i$. [See LEARNING.]

As increasing use is made of mathematics in the social sciences, one may anticipate the investigation of very specific mathematical systems and, ultimately, the isolation of interesting abstract properties from these systems for further study and generalization as pure mathematics.

R. DUNCAN LUCE AND PATRICK SUPPES

BIBLIOGRAPHY
ACZEL, J. 1966 Lectures on Functional Equations and Their Applications. New York: Academic Press. AKADEMIIA NAUK S.S.S.R., MATEMATICHESKII INSTITUT (1956) 1964 Mathematics: Its Content, Methods, and Meaning. Edited by A. D. Aleksandrov, A. N. Kolmogorov, and M. A. Laurent'ev. 3 vols. Cambridge, Mass.: M.I.T. Press. -» First published in Russian. ALLEN, R. G. D. (1938)1962 Mathematical Analysis for Economists. London: Macmillan. APOSTOL, TOM M. 1961-1962 Calculus. 2 vols. New York: Blaisdell. BARTLE, ROBERT G. 1964 The Elements of Real Analysis. New York: Wiley. BIRKHOFF, GARRETT; and MACLANE, SAUNDERS (1941) 1965 A Survey of Modern Algebra. 3d ed. New York: Macmillan. CARTAN, HENRI (1961) 1963 Elementary Theory of Analytic Functions of One or Several Complex Variables. Reading, Mass.: Addison-Wesley. -» First published in French. CODDINGTON, EARL A. (1961) 1964 An Introduction to Ordinary Differential Equations. Englewood Cliffs, N.J.: Prentice-Hall. COLEMAN, JAMES S. 1964 Introduction to Mathematical Sociology. New York: Free Press.
COURANT, RICHARD; and ROBBINS, HERBERT (1941) 1961 What Is Mathematics? An Elementary Approach to Ideas and Methods. Oxford Univ. Press.
FRIEDMAN, BERNARD 1966 What Are Mathematicians Doing? Science 154:357-362.
GÖDEL, KURT (1931) 1965 On Formally Undecidable Propositions of the Principia mathematica and Related Systems. I. Pages 4-38 in Martin Davis (editor), The Undecidable: Basic Papers on Undecidable Propositions, Unsolvable Problems and Computable Functions. Hewlett, N.Y.: Raven. → First published in German in Volume 38 of the Monatshefte für Mathematik und Physik.
GOLDBERG, SAMUEL 1958 Introduction to Difference Equations: With Illustrative Examples From Economics, Psychology, and Sociology. New York: Wiley. → A paperback edition was published in 1961.
HOCKING, JOHN G.; and YOUNG, GAIL S. 1961 Topology. Reading, Mass.: Addison-Wesley.
KEMENY, JOHN G.; and SNELL, J. LAURIE 1962 Mathematical Models in the Social Sciences. Boston: Ginn.
KNEEBONE, G. T. 1963 Mathematical Logic and the Foundations of Mathematics: An Introductory Survey. New York: Van Nostrand.
LUCE, R. DUNCAN 1964 The Mathematics Used in Mathematical Psychology. American Mathematical Monthly 71:364-378.
LUCE, R. DUNCAN; BUSH, ROBERT R.; and GALANTER, EUGENE (editors) 1963-1965 Handbook of Mathematical Psychology. 3 vols. New York: Wiley.
MOSTOW, GEORGE; SAMPSON, JOSEPH H.; and MEYER, JEAN-PIERRE 1963 Fundamental Structures of Algebra. New York: McGraw-Hill.
NEUGEBAUER, OTTO (1951) 1957 The Exact Sciences in Antiquity. 2d ed. Providence, R.I.: Brown Univ. Press.
NEWMAN, JAMES R. (editor) 1956 The World of Mathematics. 4 vols. New York: Simon & Schuster.
ORE, ØYSTEIN 1948 Number Theory and Its History. New York: McGraw-Hill.
SAMUELSON, PAUL A. (1947) 1958 Foundations of Economic Analysis. Harvard Economic Studies, Vol. 80. Cambridge, Mass.: Harvard Univ. Press.
MAURRAS, CHARLES
Charles Marie Photius Maurras (1868-1952), French man of letters, was born in Martigues, near Marseille. He entered public life supporting both Frédéric Mistral's Félibrige and Jean Moréas, with whom he joined in 1891 in founding the École Romane, a literary movement designed to defend "a common ideal of Romanity." He became the apostle of integral nationalism, having coined the term in 1900. Reacting against the dominant relativism and eclecticism of his time, Maurras set out from skeptical and agnostic premises to find some solid basis for thought, style, and action in historical realities which, he argued, having worked in the past, might be expected to work again. He rediscovered the classical ideals of order, hierarchy, and discipline
and insisted that they alone provide an escape from nihilism into the positive realm of "organizing empiricism," a method of solving current problems in terms of past experience. Transferred from literary to sociopolitical grounds, Maurras's empirisme organisateur turned him against what he considered the dissolvent and anarchic qualities of liberal individualism which had triumphed in the French Revolution. He thought France was in a state of decadence and attributed this to its abandonment of traditions identified with the old regime, campaigning against Protestants, Jews, and metics, all those alien agents of change and corruption to whom the revolution had given free rein in France. When, in the late 1890s, the scandals that periodically shook the Third Republic culminated in the Dreyfus affair, Maurras set out to elaborate a doctrine which might spark a reaction against the existing disorder and provide the basis of a national revival. Based upon penetrating if frequently unhistorical criticism of the republic and parliament, his critique asserted the necessity of a return to the historical sources of French intellectual and political success: the classical tradition of the seventeenth century and the monarchy. True patriotism, conscious of these conditions of national prosperity and greatness, demanded, he believed, a return to the stability and continuity which only hereditary monarchy could provide. This was the program of integral nationalism to which he soon converted the founders of the Ligue d'Action Française, a young, pragmatic, and patriotic movement dedicated to France's political and intellectual regeneration. Henceforth, the story of Maurras was that of the Action Française; he became its moving spirit, and most of his writings were published in the review (1899-1914) and the newspaper (1908-1944) of that name.
Despite his insistence that politics must take precedence over everything else (Politique d'abord!), the Action Française was less a political than a didactic and literary movement. The doctrine it taught combined traditionalism, regionalism, and corporatism and elaborated the picture of a society that was free of democratic sham, individualistic anarchy, and the struggles of political parties and that was ruled in a stable way by a monarch and by an elite of talent and birth who would consider only the interests of the nation, not those of particular interest groups. The negative aspects of Maurrasist doctrine were more convincing than its program, and its criticism of the republic and its institutions provided rich ammunition for all other critics. When written into legislation at Vichy
(particularly in 1940-1941), Maurras's views proved anachronistic and unworkable. Nevertheless, the Action Française provided an intellectual structure to which the French right could refer; Maurras's doctrine synthesized the ideas of nineteenth-century conservatives from Bonald to La Tour du Pin and influenced several generations of France's middle and upper classes. Its newspaper never ceased to warn against Germany, against a Red peril (not so much a peril of social revolution as of national disunity), against the popular front of 1936, and, thereafter, against war and warmongers, at a time of national division and unpreparedness to which its own campaigns had contributed a good deal.
A steadfast supporter of Philippe Pétain after 1940, though as steadfastly anticollaborationist (his ideas inspired much of Vichy's nationalist isolationism), Maurras was condemned in 1945 to life imprisonment. In prison, as out, his pugnacity and the stream of his writings never ceased. When his sentence was commuted in 1952 to forced residence in a private clinic, Maurras publicly thanked President Vincent Auriol, congratulating him for finally granting him the freedom that was his due and suggesting the expiatory execution of the minister "responsible for the excesses committed at the Liberation." He died a few months later, still bellicose but reconciled with the church he had abandoned as a youth.
Maurras's destructive effect on the democratic and parliamentary ideology has been immense, his constructive influence slight. Yet his ideas affected the nationalists of all Latin nations, and there are strong traces of Maurrasism in Salazar's Portugal and de Gaulle's France. The Action Française has been and still is well represented in the Académie Française (in 1964, by three of its leaders), and many still respect its ideas in the breach if not in the observance. Still, the movement Maurras led is now but a memory and a sect.
Responsible for much of its success between the wars, Maurras also bears the responsibility for its eventual failure. His intellectual elitism made for overemphasis of the written word, and his authoritarianism brought him into conflict with the Roman Catholic church he professed to admire (not for its Christianity but for its enduring power) and with the royalty he professed to serve (less out of personal loyalty than for theoretical reasons). The deafness which rid him at an early age of faith in God or nature grew steadily worse, isolating him and encouraging his pessimistic and intolerant dogmatism. Younger followers deserted, were excommunicated, or simply drifted away. But this very dogmatism gave him
the strength needed to repeat tirelessly and to elaborate endlessly ideas which have left their mark on France and on the Latin world.
EUGEN WEBER
[See also NATIONALISM.]
WORKS BY MAURRAS
1931 Au signe de Flore. Paris: Oeuvres Représentatives.
1950 Le Mont de Saturne. Paris: Quatre Jeudis.
Oeuvres capitales. 4 vols. Paris: Flammarion, 1954.
SUPPLEMENTARY BIBLIOGRAPHY
BUTHMAN, WILLIAM 1939 The Rise of Integral Nationalism in France. New York: Columbia Univ. Press.
DIMIER, LOUIS 1926 Vingt ans d'Action Française. Paris: Librairie Nouvelle.
JOSEPH, ROGER; and FORGES, JEAN 1953 Biblio-iconographie générale de Charles Maurras. 2 vols. Paris: Roanne.
MASSIS, HENRI 1961 Maurras et notre temps. Paris: Plon.
NOLTE, ERNST 1963 Der Faschismus in seiner Epoche. Munich: Piper.
ROUDIEZ, LEON 1957 Maurras jusqu'à l'Action Française. Paris: Bonne.
TANNENBAUM, EDWARD R. 1962 The Action Française. New York: Wiley.
WEBER, EUGEN 1962 Action Française. Stanford Univ. Press.
WRIGHT, GORDON (1960) 1962 France in Modern Times. London: Murray; Chicago: Rand McNally.
MAUSS, MARCEL
Marcel Mauss (1872-1950), French sociologist, was born in Épinal (Vosges) in Lorraine, where he grew up within a close-knit, pious, and orthodox Jewish family. Émile Durkheim was his uncle. By the age of 18 Mauss had reacted against the Jewish faith; he was never a religious man. He studied philosophy under Durkheim's supervision at Bordeaux; Durkheim took endless trouble in guiding his nephew's studies and even chose subjects for his own lectures that would be most useful to Mauss. Thus Mauss was initially a philosopher (like most of the early Durkheimians), and his conception of philosophy was influenced above all by Durkheim himself, for whom he always retained the utmost admiration; by Kant, to whose work Durkheim introduced him; and by two philosophers at Bordeaux, O. Hamelin, a rationalist, and the more empirically minded A. Espinas, then concerned with the collective origin of arts, customs, and technology, subjects about which Mauss was later to write. The philosophical atmosphere was Neo-Kantian. Mauss placed third in the national agrégation competition of 1895 and decided to devote himself to research.
He studied the history of religion at the École Pratique des Hautes Études under Louis Finot, Sylvain Lévi, Auguste Carrière, and A. Meillet in the Section des Sciences Historiques et Philologiques and under Alfred Foucher, Israël Lévi, and Léon Marillier in the Section des Sciences Religieuses. Meillet and Lévi, together with Célestin Bouglé, were among his closest friends. In 1897-1898 he made a study tour to Leiden, Breda, and Oxford, where he worked with Tylor. He then studied Sanskrit and Indian texts and, as Foucher's assistant from 1900 to 1902, taught the history of the religion and philosophy of pre-Buddhist India, in 1901 succeeding Marillier to the chair in the history of the religion of "noncivilized" peoples, which he occupied for the rest of his career. He taught, in addition, at the Collège de France from 1930 to 1939. In 1925 he helped to found, and then became joint director of, the Institut d'Ethnologie de l'Université de Paris, which, by virtue of the instruction it provided and the publications it sponsored, contributed considerably to the development of field work by younger anthropologists. Mauss lectured at the Institut on ethnography until 1939, encouraging field workers to "take trouble to be exact, complete" and to have a sense for "facts and the relations between them," for "proportions and connexions" (1947, p. 5). (Apart from a brief voyage to Morocco, Mauss himself did no field work.)
Mauss worked very closely with Durkheim. In addition to their major joint work, "De quelques formes primitives de classification" (Durkheim & Mauss 1903), he compiled statistical tables for Durkheim's study of suicide, and they collaborated in writing reviews. It was, on the whole, Mauss who, with his greater sense of the concrete, had an eye for the illuminating fact, while the theoretical interpretation generally originated with Durkheim. The notion of "total social facts," commonly attributed to Mauss (Lévi-Strauss [1950] 1960, pp.
xxiv ff.), was, according to Georges Davy (1958), born of their collaboration arising from the study of some documents of Boas and was subsequently applied by Mauss. It is indicative of the cooperative nature of the work done by the brilliant young scholars whom Durkheim had assembled around the journal Année sociologique (published in 12 volumes between 1898 and 1913) that almost all of Mauss's major work in this period was written in collaboration: with Hubert he published "Essai sur la nature et la fonction du sacrifice" in 1899, "Esquisse d'une théorie générale de la magie" in 1904, and "Introduction à l'analyse de quelques phénomènes religieux" in 1908; with Beuchat he
published "Essai sur les variations saisonnières des sociétés eskimos" in 1906; and with Fauconnet he published an important encyclopedia article on sociology in 1901. Mauss also took a major part in editing the Année from the time of its foundation, directing the religious sociology section with Hubert, collaborating on several other sections, and contributing a vast number of reviews and notes, to which he rightly attached great importance. World War I tragically decimated the Année sociologique group, and, after Durkheim's own early death, Mauss inherited the leadership of the group. He twice revived the journal (in the 1920s and in the 1930s) and devoted much of his time to editing posthumously published works by Durkheim, R. Hertz, Hubert, and others, thereby reducing his own output.
Mauss's most important post-World War I writings may be divided into two broad categories. First, there are the major ethnological studies: the great Essai sur le don of 1925, "Effet physique chez l'individu de l'idée de mort suggérée par la collectivité (Australie, Nouvelle-Zélande)" of 1926, "Les techniques du corps" of 1936, and "Une catégorie de l'esprit humain: La notion de personne, celle de 'moi'" of 1938. Second, there are writings of a methodological and programmatic character on the social sciences: "Rapports réels et pratiques de la psychologie et de la sociologie," which was a presidential address to the Société de Psychologie in 1924, "Divisions et proportions des divisions de la sociologie" of 1927, and "Fragment d'un plan de sociologie générale descriptive" of 1934. In addition, Mauss published other brief studies on a variety of subjects, among them the origins of the notion of money, the Melanesian potlatch, contract among the Thracians, joking relationships, the "Legend of Abraham," forms of civilization, social cohesion in polysegmentary societies, technology, the problem of nationality, and the sociology of Bolshevism.
Mauss also led an active political life. Like Durkheim, he supported Dreyfus and Zola, and he was a leading member of the Dreyfusard Groupe des Étudiants Collectivistes. He was closely associated with the socialist leaders, in 1904 helping them to found L'humanité, to which he contributed, taking part in strikes and supporting socialist candidates in elections. He was also much involved in the "popular universities" and the cooperative movements. The evolutionary, pluralist, and liberal quality of his socialism, akin to that of Jaurès, can be seen in the "Conclusions" to The Gift ([1925] 1954, pp. 63-81), where he stressed both the loss in terms of the quality of human relationships that
occurs when exchange becomes purely economic and the need to restore the older themes of "freedom and obligation in the gift, of generosity and self-interest in giving" (ibid., p. 66).
Although Mauss is chiefly known as an ethnologist and historian of religion, he was in fact a polymath, one of the last encyclopedic minds, and had an extraordinary range of ethnographic and linguistic knowledge (his pupils said "Mauss knows everything"). Lévy-Bruhl (1951, p. 4) described his conversation and lectures as full of "new and fruitful ideas of which others made theses and books." His career was brutally ended by the German occupation, which for a second time deprived him of friends and colleagues and affected the balance of his mind. He never completed projected books on money, prayer (but see Mauss 1909), and the nation (the manuscripts of which were probably destroyed), and he never synthesized his many-sided and scattered work.
Contributions to theory. Mauss's theoretical contributions derive mainly from his concrete application and refinement of Durkheim's precept, "The essential thing is to unite not many facts, but facts at once typical and well-studied," as well as the precept laid down in the article written with Fauconnet (which was a sort of Durkheimian charter) that the sociologist must connect "collective representations" (i.e., collective ways of acting and thinking) with features of the social structure or with one another (1901, p. 172). Thus the study of the Eskimos explores the relations between morphological factors, on the one hand, and legal and moral systems, domestic economy, and religious life on the other. Mauss related the crowded conditions in which the Eskimos lived in the winter to the development among them of a real community of ideas, to "a strong religious and moral unity of mind," which he contrasted with the social atomization, the extreme "moral and religious impoverishment" that accompanied the dispersal in summer ([1906] 1960, p. 470).
Similarly, the classic study with Durkheim of primitive classification attempts to find the origin of classifications (such as space, time, hierarchy, number, class, etc.) in the social structure by establishing formal correspondences between social and symbolic classifications among Australian aborigines, among the Zuni, and in traditional China: thus "even ideas so abstract as those of time and space are, at each point in their history, closely connected with the corresponding social organization" (Durkheim & Mauss [1903] 1963, p. 88). The elucidation of these formal correspondences is of considerable theoretical interest and suggestiveness, however
questionable may be the causal chain that is postulated, the causal role given to affectivity, and the differentiation of cognitive operations from the content of thought (see Needham 1963). It was the first sociological study of classification and opened up the question, still immensely fruitful, of the relationship between symbolic classification and social structure. Other examples of Mauss's practice of Durkheimian precepts are the study of magic, analyzed as a social phenomenon and defined as "every rite which does not form part of an organised cult" being "private, secret, mysterious and tending at the margin towards the forbidden rite" (Hubert & Mauss [1904] 1960, p. 16); the study of sacrifice, analyzed as "a means of communication between the sacred and profane worlds, through the mediation of a victim, that is, of a thing that in the course of the ceremony is destroyed" (Hubert & Mauss [1899] 1964, p. 97); the study of the concept of the self, offering no more than a sketch of "the series of forms which this concept has assumed in the life of men in societies, according to their systems of law, their religions, their customs, their social structures and their modes of thought" (Mauss [1938] 1960, p. 335); and the studies of the social determinants of mourning rites (Mauss 1921), of the lust to die, and of uses of the body. It is, however, The Gift that must rank as Mauss's masterpiece. It is the supreme example of the study of "total social facts," being concerned with a limited range of social phenomena seen as a totality, with "wholes, with systems in their entirety" ([1925] 1954, p. 77), namely, "prestations," or systems of exchange, which are "in theory voluntary, disinterested and spontaneous, but are in fact obligatory and interested" (ibid., p. 1). 
He focused on a comparative study of forms of contract and exchange in Polynesia, Melanesia, and northwest America, with supplementary reference to evidence from early Roman, Hindu, and Germanic literature. The central hypotheses of the study are that "the archaic form of exchange," with its three obligations of giving, receiving, and repaying, is an aspect of almost all societies (and should be resurrected in our own), that it maintains and strengthens social bonds (cooperative, competitive, and antagonistic), and that by studying it concretely in its totality in the societies chosen, "we have been able to see their essence, their operation and their living aspect, and to catch the fleeting moment when the society and its members take emotional stock of themselves and their situation as regards others" (ibid., pp. 77-78). Gift exchange is revealed as at once religious, legal, moral, economic, aesthetic, morphological, and mythological in significance:
the obligations it involves are symbolically expressed in myth and imagery and take the form of an interest in the objects exchanged, but these objects "are never completely separated from the men who exchange them; the communion and alliance they establish are well-nigh indissoluble. The lasting influence of the objects exchanged is a direct expression of the manner in which subgroups within segmentary societies of an archaic type are constantly embroiled with and feel themselves in debt to each other" (ibid., p. 31). Apart from its considerable ethnographic interest, The Gift was the first systematic and comparative study of gift exchange and the first elaboration of the relation between patterns of exchange and the social structure. In general, it may be said that Mauss's theoretical contributions result from putting Durkheimian sociology to work, de-emphasizing its least acceptable features (the latent mysticism of the group, the crowd psychology, the identification of historical origin and analytical simplicity) and demonstrating its considerable explanatory power.
Influence. Mauss's influence is particularly difficult to measure because of his deep involvement in collaborative work with Durkheim and others. He was the Durkheimians' ethnographic adviser, and his part in the studies of magic, social morphology (1906), and primitive classification was of crucial importance in the development of Durkheim's own sociology of religion and knowledge. One may likewise assert, but not measure, his influence on other Durkheimians (such as Marcel Granet) and upon those who came under their collective influence, including historians (such as Lucien Febvre and Marc Bloch) and psychologists (such as Charles Blondel). He had a direct influence, however, on French ethnology, inspiring such figures as A. Métraux, M. Leenhardt, M. Griaule, G. Dumézil, R. Bastide, and L. Dumont.
He has been a major influence on Lévi-Strauss, who has written about him in terms which overstate Mauss's theoretical divergence from Durkheim (Lévi-Strauss 1945; 1950). Lévi-Strauss values above all Mauss's method, best illustrated in The Gift, of treating a total social fact as a symbolic system to be deciphered. Lévi-Strauss sees this approach as "inaugurating a new era for the social sciences" ([1950] 1960, p. xxxv), for it may be generalized to the whole of social life. Thus social life may be understood as a system of transactions between groups and between individuals, the rationale of which can be established by techniques analogous to those of structural linguistics. According to Lévi-Strauss, it is the great misfortune of modern ethnology that Mauss did not exploit his
discovery; he himself applied it in his theory of the exchange basis of cross-cousin marriage, which, he maintains, shows that in the field of kinship "the analogy with language, so strongly affirmed by Mauss, has permitted the discovery of precise rules, according to which there are formed, in any society whatever, cycles of reciprocity, whose mechanical laws are thenceforth known, permitting the use of deductive reasoning and offering the promise of a vast science of communication of which anthropology will be a part" ([1950] 1960, p. xxxvi). Leacock (1954) sees Mauss's work as more old-fashioned, condemning particularly its sociologism and evolutionism.
The Gift is Mauss's best-known work outside France, and indeed it is the only one that has made any impact in the United States; its theoretical suggestiveness seems by no means spent. Also influential have been the seminal studies of magic, sacrifice, and, increasingly, primitive classification. Mauss's influence is especially hard to identify in these areas because his work has entered into the common theoretical inheritance, often operating through the medium of colleagues and disciples. He appears particularly to have influenced the following anthropologists: A. R. Radcliffe-Brown, B. Malinowski (both of whom in different ways distorted his somewhat refined Durkheimianism), E. E. Evans-Pritchard, R. Firth, M. J. Herskovits, W. Lloyd Warner, and R. Redfield, among others. More generally, his influence is especially apparent in the anthropological work emanating from Oxford (via Evans-Pritchard) and in the work of the Leiden school (especially F. D. E. van Ossenbruggen and J. P. B. de Josselin de Jong). But the rich possibilities of his work have still to be fully exploited.
STEVEN LUKES
[See also ETHNOGRAPHY; EXCHANGE AND DISPLAY; MAGIC; MYTH AND SYMBOL; RITUAL; SOCIAL STRUCTURE; and the biographies of BLOCH; DURKHEIM; FEBVRE; GRANET; HERSKOVITS; MALINOWSKI; MÉTRAUX; POLANYI; RADCLIFFE-BROWN; REDFIELD.]
WORKS BY MAUSS
(1899) 1964 HUBERT, HENRI; and MAUSS, MARCEL Sacrifice: Its Nature and Function. Univ. of Chicago Press. → First published as "Essai sur la nature et la fonction du sacrifice" in Volume 2 of Année sociologique.
(1899-1905) 1909 HUBERT, HENRI; and MAUSS, MARCEL Mélanges d'histoire des religions. Paris: Alcan. → A collection of previously published articles. See especially the preface.
1901 FAUCONNET, PAUL; and MAUSS, MARCEL Sociologie. Volume 30, pages 165-176 in La grande encyclopédie: Inventaire raisonné des sciences, des lettres et des arts. . . . Paris: Société Anonyme de La Grande Encyclopédie.
(1903) 1963 DURKHEIM, ÉMILE; and MAUSS, MARCEL Primitive Classification. Translated and edited with an introduction by Rodney Needham. Univ. of Chicago Press. → First published as "De quelques formes primitives de classification" in Volume 6 of Année sociologique.
(1904) 1960 HUBERT, HENRI; and MAUSS, MARCEL Esquisse d'une théorie générale de la magie. Pages 1-141 in Marcel Mauss, Sociologie et anthropologie. 2d ed. Paris: Presses Universitaires de France. → First published in Volume 7 of Année sociologique.
(1906) 1960 Essai sur les variations saisonnières des sociétés eskimos: Étude de morphologie sociale. Pages 389-477 in Marcel Mauss, Sociologie et anthropologie. 2d ed. Paris: Presses Universitaires de France. → With the collaboration of H. Beuchat. First published in Volume 9 of Année sociologique.
1908 HUBERT, HENRI; and MAUSS, MARCEL Introduction à l'analyse de quelques phénomènes religieux. Revue de l'histoire des religions 58:163-203.
1909 La prière. I: Les origines. Unpublished manuscript. → The beginning of a larger work; distributed privately.
1921 L'expression obligatoire des sentiments: Rituels oraux funéraires australiens. Journal de psychologie 18:425-434.
(1924) 1960 Rapports réels et pratiques de la psychologie et de la sociologie. Pages 281-310 in Marcel Mauss, Sociologie et anthropologie. 2d ed. Paris: Presses Universitaires de France. → First published in Journal de psychologie normale et pathologique.
(1925) 1954 The Gift: Forms and Functions of Exchange in Archaic Societies. Glencoe, Ill.: Free Press. → First published as Essai sur le don: Forme et raison de l'échange dans les sociétés archaïques.
(1926) 1960 Effet physique chez l'individu de l'idée de mort suggérée par la collectivité (Australie, Nouvelle-Zélande). Pages 311-330 in Marcel Mauss, Sociologie et anthropologie. 2d ed. Paris: Presses Universitaires de France. → First published in Journal de psychologie normale et pathologique.
1927 Divisions et proportions des divisions de la sociologie. Année sociologique New Series [1924-1925]:98-173.
1934 Fragment d'un plan de sociologie générale descriptive. Annales sociologiques Series A 1:1-56.
(1936) 1960 Les techniques du corps. Pages 363-386 in Marcel Mauss, Sociologie et anthropologie. 2d ed. Paris: Presses Universitaires de France. → First published in Journal de psychologie.
(1938) 1960 Une catégorie de l'esprit humain: La notion de personne, celle de "moi." Pages 331-362 in Marcel Mauss, Sociologie et anthropologie. 2d ed. Paris: Presses Universitaires de France. → First published (in French) in Volume 68 of the Journal of the Royal Anthropological Institute of Great Britain and Ireland.
1947 Manuel d'ethnographie. Paris: Payot. → Based on a course given annually from 1926 to 1939 at the Institut d'Ethnologie de l'Université de Paris.
Sociologie et anthropologie. 2d ed. Paris: Presses Universitaires de France, 1960. → A collection of essays first published between 1904 and 1938.
SUPPLEMENTARY BIBLIOGRAPHY
DAVY, GEORGES 1958 In Memoriam: Émile Durkheim. Année sociologique 3d Series [1957-1958]:vii-x.
GUGLER, JOSEF 1961 Die neuere französische Soziologie: Ansätze zu einer Standortbestimmung der Soziologie. Neuwied (Germany): Luchterhand.
GUGLER, JOSEF 1964 Bibliographie de Marcel Mauss. Homme 64:105-112. → The most complete bibliography of Mauss's publications (excluding those in socialist journals and the numerous notes and reviews in the Année sociologique and the Notes critiques-Sciences sociales). Includes not only his writings but also summaries of his comments at meetings of academic societies and congresses.
LEACOCK, SETH 1954 Ethnological Theory of Marcel Mauss. American Anthropologist New Series 56:58-73. → Selected bibliography appended.
LÉVI-STRAUSS, CLAUDE 1945 French Sociology. Pages 503-537 in Georges Gurvitch and Wilbert E. Moore (editors), Twentieth Century Sociology. New York: Philosophical Library.
LÉVI-STRAUSS, CLAUDE (1950) 1960 Introduction à l'oeuvre de Marcel Mauss. In Marcel Mauss, Sociologie et anthropologie. 2d ed. Paris: Presses Universitaires de France. → The most important study of Mauss to date. Contains a selected bibliography.
LÉVY-BRUHL, H. 1951 In Memoriam: Marcel Mauss. Année sociologique 3d Series [1948-1949]:1-4.
MERLEAU-PONTY, MAURICE (1953) 1960 De Mauss à Claude Lévi-Strauss. Pages 145-169 in Maurice Merleau-Ponty, Éloge de la philosophie, et autres essais. Paris: Gallimard.
NEEDHAM, RODNEY 1963 Introduction. In Émile Durkheim and Marcel Mauss, Primitive Classification. Univ. of Chicago Press.
MAXIMUM LIKELIHOOD
See ESTIMATION.
MAYO, ELTON
While the published writings of Elton Mayo (1880-1949) now seem to be mainly of historical interest, he personally had an enormous influence in the development of industrial sociology and psychology and in the stimulation of men who have made major contributions to research and theory. Mayo was particularly influenced by the writings of the psychologist Pierre Janet. He combined an interest in psychoneuroses and what he termed "obsessive thinking," derived from his study of Janet, with the approach to culture and social structure of the social anthropologists Bronislaw Malinowski and A. R. Radcliffe-Brown. In research methods he adapted the interviewing methods of clinical psychologists to the field methods of the anthropologists and brought them to bear on studies of industrial organizations.
Mayo was born in Adelaide, Australia, the second of seven children. He came from a family of professional men. In the process of finding his vocation, Mayo ranged widely in space and experience: from medical student to newspaperman to laborer to businessman, from Scotland to west Africa and back to Australia. From the printing business, he turned to the study of psychology at Adelaide University. A psychiatric treatment program he and a collaborator organized toward the end of World War I to deal with soldiers suffering from shell shock led to his appointment in 1919 to the newly established chair of philosophy at the University of Queensland. Rockefeller and Carnegie foundation grants brought him to the United States and supported his first research in human relations in industry, which he began while at the University of Pennsylvania. The site of his first research in this field was a textile mill.
The most productive period of his life began in 1926, when he accepted a position at Harvard University's Graduate School of Business Administration. In association with Lawrence J. Henderson, an eminent biological chemist and devotee of Pareto, Mayo organized a research team to study the psychological and social problems of industrial workers. The aim from the beginning was to follow these problems wherever they led, without regard to customary disciplinary boundaries.
In 1927 Mayo launched the now famous Western Electric research program. He worked particularly with Fritz J. Roethlisberger, William J. Dickson, and T. North Whitehead, and it was they who produced the principal research reports of the studies carried on at the Hawthorne Works in Chicago. As director of the program, Mayo had the task of handling the diplomacy involved in making such an unprecedented research effort acceptable within a company, and he also made important contributions to the design of the research program and to the interpretation of the results (see Management and the Worker by Roethlisberger & Dickson 1939). While Mayo was primarily interested in problems of individual adjustment, he recognized the necessity of examining such individual problems in the context of organizations and social structure. He was instrumental in bringing W. Lloyd Warner to Harvard and worked closely with him in launching the Yankee City study.
At the same time, Warner became consultant to the Western Electric research program and there stimulated the analysis of problems of group and organizational structure. In all his writings Mayo was concerned with two basic ideas, one dealing with the nature of society, the other dealing with the problems of individuals. He argued that the industrial revolution had destroyed traditional society in which people responded to each other in terms of established routines. The breakdown in these traditional understandings had led to widespread conflict in industry and society. The traditions of old could not
be re-established, and, therefore, the only solution must be to build an adaptive society in which an administrative elite, trained in social understandings and skills, would resolve human as well as technical problems. He saw workers suffering from a form of anomie, the failure to find a satisfactory place for themselves in the world of work, with a consequent involvement in obsessive reveries in which they brooded unproductively over their problems. For dealing with these problems of obsessive thinking, he had great faith in the therapeutic relationship in which the individual is encouraged to talk out his problems freely to an interested listener. Although Mayo directed the Western Electric research program, the principal research fruits of that program bear little relation to Mayo's ideas about social integration, obsessive thinking, and psychotherapy. To be sure, Management and the Worker does devote chapters to the personnel-counseling program, a direct outgrowth of Mayo's ideas, but that program was later abandoned by the company and never served as a model for other companies. Furthermore, few research men today consider a personnel-counseling program of much importance in dealing with problems of human adjustment in industry. The principal fruits of the Western Electric studies are found in those parts of Management and the Worker which deal with informal relations among workers, with worker-management relations, and with the methods for gathering systematic observational and interviewing data upon behavior in organizations. These contributions provided the foundation for the very rapid development of research on organizational behavior in the two decades following publication of that book in 1939. Mayo, as the father of research on the human problems of industry, also became the principal target for attack.
Critics argued that there was no place in his philosophy for conflict, that he sought to achieve organizational harmony through the subordination of individual and group interests by the administrative elite, and that he did not understand the role of unions in a free society. Mayo's supporters replied that he had no illusions about the possibility of establishing perfect harmony in any industrial society. He simply observed that there is so much destructive conflict that it is well to seek better ways of handling human problems. While Mayo has been charged with being antiunion, it might be more accurate to say that he was simply indifferent to unions. In his most productive period of work with Western Electric, the company had only a weak company union. Although unions had become a prominent part of the industrial scene long before Mayo's death, he did not think they fundamentally altered those human problems of industry that interested him, and he never integrated unions into his thinking about industry. Mayo was not a systematic theoretician. He had a wide-ranging mind and creative social abilities. Few men have contributed as much as he to the establishment of new fields of social research and teaching. He was a behavioral scientist long before the term became popular.
WILLIAM F. WHYTE
[See also GROUPS, article on THE STUDY OF GROUPS; INDUSTRIAL RELATIONS; ORGANIZATIONS, article on THEORIES OF ORGANIZATIONS; WORKERS; and
the biographies of HENDERSON; JANET; MALINOWSKI; RADCLIFFE-BROWN.]
WORKS BY MAYO
(1933) 1946 The Human Problems of an Industrial Civilization. 2d ed. Boston: Harvard Univ., Graduate School of Business Administration. -> A paperback edition was published in 1960 by Viking.
1945 The Social Problems of an Industrial Civilization. Boston: Harvard Univ., Graduate School of Business Administration.
1947 The Political Problem of Industrial Civilization. Boston: Harvard Univ., Graduate School of Business Administration.
1948 Some Notes on the Psychology of Pierre Janet. Cambridge, Mass.: Harvard Univ. Press.
SUPPLEMENTARY BIBLIOGRAPHY
BENDIX, REINHARD; and FISHER, LLOYD H. 1949 The Perspectives of Elton Mayo. Review of Economics and Statistics 31:312-319.
HOMANS, GEORGE C. 1949 Some Corrections. Review of Economics and Statistics 31:319-321.
ROETHLISBERGER, FRITZ J.; and DICKSON, WILLIAM J. (1939) 1961 Management and the Worker: An Account of a Research Program Conducted by the Western Electric Company, Hawthorne Works, Chicago. Cambridge, Mass.: Harvard Univ. Press. -> A paperback edition was published in 1964 by Wiley.
URWICK, LYNDALL F. 1960 The Life and Work of Elton Mayo. London: Urwick.
WARNER, W. LLOYD et al. 1941-1959 Yankee City Series. 5 vols. New Haven, Conn.: Yale Univ. Press.
MEAD, GEORGE HERBERT The work of George Herbert Mead (1863-1931), one of the leading figures in pragmatism, has had a profound impact on the development of American social science. Despite the lavish praise of Dewey and Whitehead, most philosophers tended to neglect him, because his ideas were not readily accessible during his lifetime. He was reluctant to set down in writing views that were still being
formed; he published no books, and many of his articles dealt with education, psychology, and sociology. Communicating most effectively in oral discourse, Mead developed his thoughts in extemporaneous lectures at the University of Chicago, where he taught from 1893 to the time of his death. Although his style was involved and labored and even his admirers acknowledged difficulties in deciphering his sentences, the classes were well-attended; and his influence upon colleagues and students, especially in sociology and social psychology, is readily discernible in their writings. Four posthumous volumes have been pieced together by devoted students from stenographic notes of his lectures, fragmentary manuscripts, and tentative drafts. Pragmatism represents an attempt to reformulate conceptions of man and his place in the universe in terms of the revolutionary implications of scientific method and of evolutionary theory. Mead viewed evolution as the process of meeting and solving problems and scientific method as the evolutionary process grown self-conscious. The characteristics of various species develop as organisms come to terms with life conditions, and Mead wanted to account for the emergent properties of man: thinking in abstractions, self-consciousness, and purposive and moral conduct. He contended that these attributes rest on the development of language, a form of social interaction that evolves among human beings as they meet the exigencies of living in groups. Thus, Mead's central hypothesis made social psychology basic to his philosophical work. His approach was behavioristic, although not in the narrow sense of John B. Watson: man is to be studied in terms of his deeds, covert as well as overt. Since, however, each person is involved in a succession of joint enterprises with others, his acts can best be regarded as segments of larger transactions. Social psychology is the study of regularities in individual behavior that develop from participating in groups.
Mead also stressed the temporal dimension, the extension of individual and group activities over time.
Analysis of the "act"
Society is an ongoing process and consists of social acts. By social act Mead meant a transaction involving two or more persons among whom there is a division of labor. The contributions of various individuals are coordinated to achieve objectives that bring gratifications of some sort to each. Unlike the instinctive cooperation found among social insects, concerted action among human beings is characterized by a high degree of flexibility. The
participants build up a social act as they continuously adjust to one another and to the demands of the developing situation. Should there be drastic environmental change, entirely novel patterns may emerge. Such concurrence among separate and independently motivated individuals is made possible by role taking, the ability of each to visualize his own performance from the standpoint of the others. Each person is able to comprehend the entire transaction, locate himself within it, and regulate his own contributions to fit into the larger pattern. Coordination depends, then, on the self-control of each actor. In highly institutionalized transactions, collaboration is facilitated insofar as the participants share a common perspective; each of them takes the role of a generalized other. The execution of a social act is a communicative process; transactions of all kinds develop in the reciprocating adjustments of the participants. Mutual orientation is built up and maintained in a continuous interchange of gestures. A gesture is any perceptible sound or movement which indicates to a second party the inner experiences or intentions of the first; any act may become a gesture when an observer responds to it in terms of what it represents. Speech, which consists of vocal gestures, is of special importance. Since a speaker is able to hear his own remarks in much the same manner as his audience, the establishment of mutual understanding becomes easier. A gesture that has the same meaning for two or more people is a significant symbol, and language consists of such conventional sounds. Those who are associated in common activities eventually develop a universe of discourse, which facilitates their subsequent collaboration. Although each deed is a fragment of a larger social act, it is also an episode in the life of an individual.
Mead's basic unit of analysis is the act, which is initiated by some want and is directed toward its satisfaction through the use of suitable elements of the environment. All behavior can be broken down, for purposes of analysis, into a series of acts. Each act has a history; it is constructed as an organism makes a succession of adjustments to conditions (external and internal) that are undergoing constant change. Overt behavior is usually only the final phase of an act; in most cases it is preceded by a number of preparatory adjustments, including various subjective experiences. An act is teleological; it is not a mere sequence of passing events but an organized whole directed toward an end. To study such processes Mead proposed the concepts of impulse, perception, manipulation, and consummation. An impulse is a disturbance, any lack of adjustment between the organism and its milieu: pique over an imagined slight, hunger pangs, or concern over a difficult task to be faced. Consummation is the elimination of the disturbance. In his study of motivation, then, Mead developed an approach resembling some of the more recent tension-reduction models. His scheme was comprehensive, and his concepts made it possible to show the relationship between organic needs, external stimulation, conscious intent, and overt movements. Between the terminal points of the act lie perception and manipulation; it is through these processes that various features of the environment become involved in the act. An organism is in continuous interaction with its milieu, and activity is redirected in response to a succession of sensory cues. Perception is selective: not everything in the environment is noticed. An object is something that is essential to the completion of the act, and a person is sensitized to whatever will enable him to carry out activity that is already under way. Both perception and manipulation rest upon hypotheses. An object is approached in terms of expectations: a person anticipates what would happen if he were to move forward and touch it. For this reason Mead referred to perception as a "collapsed" or "telescoped" act. What is perceived depends in part upon what is anticipated; these hypotheses are then tested and confirmed in manipulation, the handling of objects as tools. The hypotheses upon which perception and manipulation rest are derived from the meanings of objects. For pragmatists meaning is primarily a property of behavior and only secondarily a property of the objects themselves. Meanings are stable relationships between an organism and a class of objects, defined in the manner in which the latter are characteristically handled. Physical attributes are important because they set limitations upon what can be done.
Most meanings are subject to social control in that the anticipated reactions of other people place additional restrictions on usage. Approaching sacred objects without sufficient deference, for example, elicits outraged protests. Such expectations are incorporated into the organization of the act. Members of each species select out of their environment objects that are essential to their survival and organize responses toward them; the world view of human beings is necessarily anthropomorphic and social. But pragmatism is not solipsistic. Reality is objectified through activity, but the orientations which support such activity are subject to reality testing. Hypotheses that turn out to be unreliable are rejected, and objects are redefined. Once an act is under way, it generally proceeds
to consummation. One of the major contentions of pragmatists is that thinking is a form of behavior that occurs when activity is interrupted. The interference may arise from an external barrier, a disability of the organism, or an absence of necessary objects. When an act is blocked, a number of secondary adjustments take place, including emergency mobilization (emotion) and conscious reflection; and through these processes a delayed act may eventually be completed. Any impulse that is not immediately consummated is transformed into an image, which serves as the basis for reflection. Images are acts that fail to issue in overt behavior, acts that are innervated but not carried out. Each image may be regarded as a plan of action, one possible way of completing the interrupted act. A perplexed individual experiences a succession of images, and reflective thought is an imaginative rehearsal, a comparison and evaluation of alternative routes to consummation. Mentality may be regarded as the ability to anticipate the consequences of projected lines of action and to respond to them prior to commitment to overt action. Thinking, then, is problem-solving activity in which trial and error takes place in the imagination. Once a person has mastered a language, images and objects may be designated by symbols; alternative plans of action are labeled, and their consequences are examined verbally. Consciousness is inner discourse, subvocal linguistic communication. While thinking is a private experience, it takes place through significant symbols; it is therefore behavior organized from the standpoint of a generalized other. The use of language transforms the effective environment to which human beings adjust. By using words, one can manipulate meanings outside the contexts in which they have developed and even make up more complex meanings. Foresight and planning are greatly facilitated.
With symbols one can isolate certain experiences and hold on to them, pick out other relevant meanings, or emphasize a particular image while rejecting others. Language also makes possible the formulation of complex plans, broad schemes in which diverse and even antagonistic tendencies may be coordinated and a sequence of operations performed. Mind for Mead was internal symbolic communication, and it is this type of cognitive activity that computer engineers are now attempting to reconstruct. Modern decision theory describes regularities in the selection process.
Analysis of the "self"
Mead is best known for his theory of the self. The self is not one's body but a perceptual object. Since most acts are components of larger transactions, the actors are interdependent; the impulses of one cannot be consummated without the cooperation of his associates. Each participant therefore becomes concerned over the possible reactions of the others to himself, for he cannot afford to do anything that will jeopardize their support. Each person forms an object of himself through role taking, by reviewing his intended conduct from the standpoint of those with whom he is involved in a common venture. Mead's discussion was rendered unnecessarily difficult by his use of the term "self" to designate three different referents: (1) the perceptual object formed of oneself in a particular historical context, (2) the process of self-control, and (3) one's personality. Voluntary conduct is constructed in a sequence of adjustments in which a person responds to himself as well as to the rest of his perceptual field. To study this process Mead proposed the concepts of the "I" and the "Me"; these terms refer not to agents but to phases of activity. The "Me" is the object one forms of oneself from a conventional standpoint, and the "I" is the reaction of the unique individual to the historical situation as he perceives it. Typical inclinations to react differ from person to person; in fact, the succession of "I's" constitutes the basis of individuality. In speaking of behavior as being built up in the interaction of the "I" and the "Me," Mead was stressing the seriated character of human conduct. If an impulse (I) is not immediately carried out, it is transformed into an image (Me), which in turn elicits another reaction (I). For example, if a man believes that his wife is disparaging his efforts (Me), he may want to beat her (I). As he refrains from striking, he imagines himself administering the beating (Me). Since he views himself from the standpoint of a group in which wife beaters are condemned, he reacts with disgust (I), and this inhibits one route to consummation.
Frustrated and hurt (Me), he reacts with determination to demonstrate his competence through a superlative performance (I). Thus, an individual's line of conduct is constructed as he adjusts to a succession of organic states, perceptual objects, images, and anticipated reactions of other people. Self-control is part of the ongoing social current; each person adjusts in advance to the situation in which he is involved and thereby facilitates cooperation. In this process self-consciousness provides the basis for corrective measures. For Mead, as for Norbert Wiener, autonomy depends upon feedback; without it one becomes a creature of impulse or is subject to drift or external control. A crucial feature of feedback in self-control is that the object is formed from a standpoint shared with other people. The
fact that all participants control themselves from the same perspective (generalized other) makes concerted action possible. A human being is not born with a mind and self-discipline; these capacities develop gradually as the child comes to terms with the demands of group life. The meanings of objects and gestures are products of experience; appropriate ways of handling things and of speaking are shaped largely by the consistent responses of elders, who provide instructions, serve as models, and reinforce the accepted modes of conduct. Mead emphasized two especially important contexts for socialization. In play young children assume specific roles and imitate individuals they know: mother, postman, salesclerk. In so doing they begin to appreciate the perspectives of others. By repeating such role taking the child is able to build up an orientation toward himself as an object of a certain sort. But effective self-control develops only in the game, or in any other enterprise that requires teamwork. In games the responses of others are organized, and activity proceeds according to rules. The contributions expected of each player are standardized into impersonal roles. Furthermore, successful participation requires the ability to assume multiple positions vis-à-vis oneself: one must be ready to take the role of any other player. Through repeated participation in such transactions, the child learns to adopt a point of view that is shared by all other participants (generalized other), a perspective that transcends that of particular individuals. Although Mead saw human beings as inextricably involved in groups, he stressed the importance of individuality. Each person, although a product of society, retains his distinctiveness, for he incorporates the generalized other from a unique standpoint. As one develops the capacity for conscious communication, one achieves greater independence from others and greater discreteness as an individual.
For each person self-realization is attained through the consummation of a distinct set of impulses; what brings fulfillment to one individual will not necessarily satisfy another. Furthermore, each person has a unique impact upon his community. Even when he is complying with conventional norms, he does so in his own style. The contributions of a genius are often striking and extensive and therefore more readily discernible, but everywhere allowances have to be made for the idiosyncrasies of the less talented. Thus, through self-assertion, each individual alters somewhat the social pattern in which he participates. As a social philosopher Mead had a deep bias toward amelioration through understanding. The
son of a Congregationalist minister in Ohio, he may have been influenced by the climate of opinion of his community, a station of the Underground Railroad and locale of the first college to admit women. He believed that progress takes place through the constant meeting and solving of problems. Social institutions, like everything else in nature, are continually evolving, but men can direct this process through intelligent action. Scientific method provides the most efficient way of solving problems and should be used to facilitate human adaptation. The ideal society is one in which there is maximum participation by all members, one in which each person understands all the others and still retains his individuality. This ideal, while imperfectly realized, is constantly being approximated. Mead believed that history is on the side of progress and that eventually a brotherhood of man will emerge. Since pragmatism is an application of scientific method to philosophical problems, it is not surprising that Mead's position is so much like the developing outlook of the natural and social sciences. Mead was a thinker who was ahead of his time. His views on matter, space, time, and relativity are similar to those of modern theoretical physics; and his discussion of meaning resembles P. W. Bridgman's work on operational definitions. Many of the ideas Mead developed at the turn of the century are now widely accepted in social psychology: the selective and seriated character of perception, cognition through linguistic symbols, role enactment, decision processes, autonomy through feedback, personal identity, reference groups, and socialization through participation. Because of the congruence of Mead's views with current trends, it seems likely that increasing attention will be directed to his work. Many implications of his position still remain to be explored.
TAMOTSU SHIBUTANI
[For the historical context of Mead's work, see INTERACTION, article on SOCIAL INTERACTION; SOCIOLOGY, article on THE DEVELOPMENT OF SOCIOLOGICAL THOUGHT; and the biographies of DARWIN; DEWEY; HEGEL; JAMES; MARX; PARK; PEIRCE; SMITH, ADAM; SULLIVAN; THOMAS. For discussion of the subsequent development of Mead's ideas, see COMMUNICATION; DEVIANT BEHAVIOR; KNOWLEDGE, SOCIOLOGY OF; ROLE, article on SOCIOLOGICAL ASPECTS; SELF CONCEPT; SEMANTICS AND SEMIOTICS; SYSTEMS ANALYSIS, article on SOCIAL SYSTEMS; and the biographies of ANGELL; BECKER; BURGESS; COOLEY; FOLLETT; MERRIAM; MEYER; WALLER.]
WORKS BY MEAD
(1932) 1959 The Philosophy of the Present. La Salle, Ill.: Open Court.
1934 Mind, Self and Society From the Standpoint of a Social Behaviorist. Edited by Charles W. Morris. Univ. of Chicago Press. -> Contains a complete bibliography.
1936 Movements of Thought in the Nineteenth Century. Univ. of Chicago Press.
1938 The Philosophy of the Act. Univ. of Chicago Press.
Selected Writings. Edited with an introduction by Andrew J. Reck. Indianapolis, Ind.: Bobbs-Merrill, 1964.
SUPPLEMENTARY BIBLIOGRAPHY
BLUMER, HERBERT 1966 Sociological Implications of the Thought of George Herbert Mead. American Journal of Sociology 71:534-544, 547-548.
CLAYTON, ALFRED S. 1943 Emergent Mind and Education: A Study of George H. Mead's Bio-social Behaviorism From an Educational Point of View. New York: Columbia Univ., Teachers College.
KOLB, WILLIAM L. 1944 A Critical Evaluation of Mead's "I" and "Me" Concepts. Social Forces 22:291-296.
LAGUNA, GRACE A. DE 1946 Communication, the Act, and the Object, With Reference to Mead. Journal of Philosophy 43:225-238.
LEE, GRACE C. 1945 George Herbert Mead: Philosopher of the Social Individual. New York: King's Crown Press.
NATANSON, MAURICE 1956 The Social Dynamics of George H. Mead. Washington: Public Affairs Press. -> Contains an extensive list of secondary sources.
STRONG, SAMUEL M. 1939 A Note on George H. Mead's The Philosophy of the Act. American Journal of Sociology 45:71-76.
MEAN VALUES See STATISTICS, DESCRIPTIVE, article on LOCATION AND DISPERSION.
MEASUREMENT See CONTENT ANALYSIS; ECONOMETRICS; EVALUATION RESEARCH; PANEL STUDIES; PSYCHOMETRICS; SCALING; SOCIOMETRY; STATISTICS; STATISTICS, DESCRIPTIVE, article on LOCATION AND DISPERSION; STRATIFICATION, SOCIAL, article on THE MEASUREMENT OF SOCIAL CLASS; SURVEY ANALYSIS.
MECHANISMS See DEFENSE MECHANISMS and FUNCTIONAL ANALYSIS.
MEDIATION See DIPLOMACY; LABOR RELATIONS; INTERNATIONAL CONFLICT RESOLUTION; NEGOTIATION.
MEDICAL CARE
i. ETHNOMEDICINE Charles C. Hughes
ii. SOCIAL ASPECTS William A. Glaser
iii. ECONOMIC ASPECTS Rashi Fein
ETHNOMEDICINE
Judging from paleopathological evidence, diseases of one kind or another have always afflicted
man. Indeed, given the nature of life and the nature of disease, it could not be otherwise; for disease is but an expression of man's dynamic relationship with his environment. And even as there has always been sickness, accident, deformity, and anxiety to trouble man, so, too, has there been an organized, purposeful response by society to such threats. In all human groups, no matter how small or technologically primitive, there exists a body of belief about the nature of disease, its causation and cure, and its relations to other aspects of group life. There also exist therapeutic and preventive practices, many of which are empirically efficacious by standards of modern medicine, although often not for the reasons advanced by folk belief. The variability of societies and cultural systems impedes easy generalization about the nature of "primitive" or "folk" medicine (cf. Ackerknecht 1942b), but one common characteristic is its close integration with other institutions of the society. Religion, medicine, and morality are frequently found together in the behavioral act or event, and "folk medicine" becomes "social medicine" to an extent not found in industrialized societies. The term "ethnomedicine" will be used to refer to those beliefs and practices relating to disease which are the products of indigenous cultural development and are not explicitly derived from the conceptual framework of modern medicine. In this light, most of the following discussion focuses on the non-Western, nonindustrialized societies of the world, although it is clear that in "modern" societies as well there exist beliefs and practices relating to disease and its treatment which are based on magical or religious conceptions rather than on those of scientific medicine. Theory of disease. Man, everywhere, devises or divines causes for the significant events in his life. The afflictions which beset body and mind are explained in both naturalistic and supernaturalistic terms.
A cut finger, a broken limb, a snake bite, a fever, the halting speech and wandering mind of senility—all may be regarded as sometime hazards in life. To explain such events there is always some conceptual framework founded in common-sense empiricism. But often a wound does not heal, a sickness does not respond to treatment, and the normally expected and predictable does not happen. In such cases another order of explanation is employed, one which attempts to come to terms with the more basic meaning of the event in metaphysical perspective. For most non-Western societies this transcendent explanation for the occurrence of disease tends to figure more pervasively in the total body of medical lore and practice than does
the empirical framework. One reason is the greater incapacitation and mortality from disease in the underdeveloped world than in highly industrialized societies. In addition, this reality is coupled with the comparative inadequacy of ethnomedical techniques and knowledge for dealing with these common threats to the existence of the group and the person. Widespread throughout the world are five basic categories of events or situations which, in folk etiology, are believed responsible for illness: (1) sorcery; (2) breach of taboo; (3) intrusion of a disease object; (4) intrusion of a disease-causing spirit; and (5) loss of soul (Clements 1932). Not every society recognizes all five categories; indeed, many groups are selective in the emphasis placed upon one or a combination of causes. For example, the Eskimo most frequently trace the origin of diseases to soul loss and breach of taboo, while the malevolence of sorcerers or witches is especially emphasized in many African cultures. Usually, however, these categories are more useful in analytically characterizing the etiological beliefs of a particular group than in describing the content of an entire belief system. This would be the case where a disease is believed to be caused by the intrusion of an object which contains a spirit, and it is the latter to which primary causative influence is attributed (e.g., Hallowell 1935). The Greeks were not alone in viewing disease as a manifestation of disharmony in man's over-all relation to the universe. "Health" is rarely, if ever, a narrowly restricted conception having its locus only in the well-being of the individual body.
In discussing conceptions of illness among a west African people, Price-Williams gives a modern illustration:

In common with a great many other people, Tiv do not regard "illness" or "disease" as a completely separate category distinct from misfortunes to compound and farm, from relationship between kin, and from more complicated matters relating to the control of land. But it would be completely erroneous to say that Tiv are not able, in a cognitive sense, to recognize disease. As Bohannan has said: "The concept of a disease is not foreign to the Tiv: mumps, smallpox, . . . yaws and gonorrhoea are all common and each has a name." What is meant is that disease is seldom viewed in isolation. (1962, p. 125)

Such a notion is widely found, as in certain American Indian groups, where bodily or mental affliction is often viewed as an indicator of moral transgression, in thought or in deed, against the norms of society. Indeed, man is frequently thought of as continuous with both the social and nonsocial
aspects of his environment, and what happens in his surroundings affects his bodily well-being. Not only a person's own actions, therefore, but also those of kinsmen or neighbors can cause sickness. Such an etiological conception has obvious implications for treatment. For example, if the curative technique includes magically based dietary restrictions, they may apply to all members of the patient's family; a breach of the restriction by any of these people will undermine the patient's health. Similarly, as among the Thonga of South Africa, sexual relations between any of the inhabitants of the patient's village can aggravate his condition, and some Eskimo groups feel that the patient's family should do no work during the period of convalescence for fear of giving offense to the spirit causing the sickness. The belief that by his own actions a man can influence the state of his fellow's health also has malevolent implications, as in the practice of witchcraft and sorcery. Frequently this may be an important factor deciding the success or failure of attempts at introducing new medical programs in underdeveloped societies. Cassel (1955) cites as an illustration the Zulu, who believe that only sorcerers and witches have the ability to transmit disease, particularly diseases which show themselves in symptoms normally associated with pulmonary tuberculosis. Progress in one community's acceptance of a Western-styled health program was brought to an abrupt halt when a physician tried to introduce the medical concept of contagion. He traced the course of tuberculosis through a family, showing how one person had been the original source of the disease in the group and had therefore been the agent responsible for sickness in all the others.
The cautious cooperation of the family elder immediately turned into a hostile rejection, which was assuaged only after the doctor had retracted his apparent accusation that the daughter of the family was a witch. A theory of disease implies a theory of normality. Yet the "normal" is in no way easy to define for all times and places. Aside from questions of a "statistical" versus a "functional" basis for normality, there is the cultural definition. Afflictions common enough in a group to be endemic, though they be clinical deformities, may often be accepted simply as part of man's natural condition. Ackerknecht (1946), for example, has noted that the Thonga believe intestinal worms, with which they are pervasively affected, to be necessary for digestion; the Mano, also of Africa, feel that primary and secondary yaws are so common that they say, "That is no sickness; everybody has that." North Amazonian Indians, among whom dyschromic spirochetosis is prevalent, accept its endemicity to such an extent that its victims are thought to be normal, and individuals who have not had the disfiguring disease are said to be looked upon as pathological and consequently unable to contract marriage. It is culture, not nature, that defines disease, although it is usually culture and nature which foster disease. Recent behavioral science research has attempted to go beyond a "phenomenological" orientation in investigation of cultural theories of disease and has sought analytic categories which would relate particular emphases in a theory of disease to other aspects of social and cultural life. A striking example of this type of investigation is the work of John Whiting and Irvin Child (1953). Using a wide sample from the ethnographic "laboratory," these investigators found high correlations between certain aspects of child-rearing practices and dominant themes in etiological beliefs related to disease, more particularly, between the hypothesized degree of anxiety generated during the socialization process (the degree of "negative fixation") and a theory of disease which reflects these anxieties. Thus, harsh weaning is highly associated with oral explanations for onset of disease: these would include consumption of food, drink, or poison by the sick individual or oral activity on the part of others, such as incantations and spells. Societies in which independence training is characteristically fraught with emotional hazards tend also to have "dependence" explanations of illness. These include soul loss or spirit possession.
"Aggression explanations" for disease are highly associated with societies in which training for handling aggressive impulses leaves a residue of unresolved anxiety, and they are expressed in theories which ascribe the cause of a disease to the patient's disobedience or aggression toward spirits, to aggressive wishes on the part of the patient or another person, to introjection of poison other than by mouth, or to harm by magical weapons or objects.

Theory and practice of treatment. Therapeutic practices in ethnomedicine address themselves to both supernatural and empirical theories of disease causation. Ackerknecht has said that primitive medicine is "magic medicine" (1942b); certainly much of it is, and, insofar as supernatural causes are involved, therapeutic regimes are based on countervailing supernatural powers or events. Thus, the powerful shaman or healer attempts to recover the soul lost or stolen by a human or supernatural agent. The intrusion of a disease object or disease-causing spirit is treated by extraction or exorcism,
and diseases which come as punishment for breach of taboo are usually treated by divination or confession of the infraction. Forgiveness and re-establishment of harmony with the moral and supernatural order are thus important outcomes of the therapy. In folk medicine, however, there is more to treatment than magical or religious ritualization, however effective this may be psychosocially in providing emotional catharsis and reassurance. All human groups have a pharmacopoeia and at least rudimentary medical techniques; some groups, indeed, are exceptional in their exploitation of the environment for medicinal purposes and in the degree of their diagnostic and surgical skills (Ackerknecht 1942b; Sigerist 1951; Laughlin 1963). The trephining done by the Inca, Masai surgery, the anatomical knowledge of the Aleut and the Eskimo, and the extensive drug repertory of west African tribes are familiar examples. In addition to trephining, numerous other types of surgery and bonesetting are found, as well as massages, bloodletting, dry cupping, bathing, inoculation, and cauterization. It has been estimated that from 25 to 50 per cent of the non-Western pharmacopoeia is empirically effective. In fact, our knowledge of the therapeutic efficacy of a large number of modern drugs is derived from the experience of primitive peoples: opium, hashish, hemp, coca, cinchona, eucalyptus, sarsaparilla, acacia, kousso, copaiba, guaiac, jalap, podophyllin, quassia, and many of the tranquilizers and psychotomimetics now used in psychiatric therapy and research. A great part of the task of folk medicine, however, and especially of preventive medicine, is borne by cultural practices which, although oriented to different social purposes, have important functional implications for health. 
Thus, notable hygienic purposes are served by many religious and magical practices, such as avoidance of the house in which a death has occurred, theories of contagious "bad body humors" which necessitate daily bathing, distinctions of "hot" and "cold" food and water which require boiling or cooking, hiding of fecal and other bodily waste through fear of their use by sorcerers or witches, and numerous others. Other cultural practices inadvertently relevant to health have a more general ecological basis. These may include customs regarding cosmetics and clothing or house styles and settlement patterns. Regardless of their value to the archeologist, the middens of ancient sedentary communities have rather baleful implications for the public health of the times. Changing economic incentives and circumstances which disrupt the adaptation of a cultural activity to its environment frequently
create health hazards. May (1960) provides a striking illustration of the intersection between cultural and ecological factors in North Vietnam. Dwellers on the plains lived in low, squat houses in which they sheltered their cattle on one side and did their cooking on the other. When these rice growers moved into the hills they constructed houses according to the same general plan. In the hills, however, the incidence of malaria among them became so high as to limit further such movement, despite governmental encouragement. The people themselves ascribed the calamity to the ill will of the hill deities. In fact, however, the incidence of malaria was low among the indigenous hill people, who constructed their houses on stilts, sheltered their animals underneath, and did their cooking inside the house. Several factors were apparently instrumental in the latter case in keeping down the spread of malaria from the mosquito vector found in the hills; flight ceiling of Anopheles minimus is restricted to about ten feet above the ground, and, despite its preference for human blood, the presence of animals underneath the house and of smoke inside the house (where the cooking was done) created an unrealized protection for human inhabitants. The study of folk medicine has important theoretical implications for the persistent question of a "magical" versus a "scientific" orientation among non-Western peoples. Erasmus (1952), utilizing data from South American Indian populations, contends that the inductive epistemological framework of folk medicine is essentially similar in structure to that of modern scientific medicine, but that the latter differs chiefly in its amenity to generalization and degree of predictive success. 
In folk medicine the chances of "natural" recovery are in favor of predictive successes, but, more often than in modern medicine, the theoretical propositions lying behind such predictions are merely coincidentally rather than functionally related to the phenomenon in question. Thus, the recooking, before eating, of food that has been left standing overnight is done on the basis of the need to dispel the dangerous quality of "coldness" in the food, but in fact such recooking destroys enterotoxin-producing staphylococci. The possible implications for a sociology of knowledge are apparent: so long as any activity or set of activities produces a sufficiently high proportion of predictive successes, there will be little elaboration or alteration of the conceptual framework orienting the activity. Cognitive frameworks relating to disease are instruments in the total process of adaptation; they change, evolve, and respond when their viability and acceptability are
challenged. Only when folk etiology fails too often and in too many areas to give pragmatic and especially psychodynamic satisfaction does it yield to other frameworks, autochthonous or adopted from outside.

Disease, medicine, and culture. Some knowledge of diseases, their classification and etiology, is part of all cultural systems (e.g., Lieban 1962; Rubel 1960; Price-Williams 1962). Investigators have analyzed disease categories in an attempt to understand the structure of the conceptual world of different peoples. The use of componential analysis, the investigation of semantic interrelationships of terms, has been applied to words for sicknesses (cf. Frake 1961). Aside from illustrating the extent to which concern with disease is elaborated in a folk nosology, such work also emphasizes a more general point: an effective cultural response to disease requires patterned discrimination and categorization of disease symptoms, even if treatment is based largely on methods of trial and error. Diagnostic categories, however crude, serve the purpose of directing sustained attention and reflection to the appearance of disease syndromes, thus providing empirical data for inferences about the probable effectiveness of one type of treatment or another. Undoubtedly this constitutes a kind of inadvertent experimental approach (see, e.g., Laughlin 1963; Erasmus 1952). Theories of disease generally have major relevance to the moral order, that is, to the control of man's behavior in society. Disease is frequently seen as a warning sign, a visitation from punishing agents for a broken taboo, a hostile impulse, or an aberrant urge to depart from the approved way. In a series of classical papers, Hallowell (e.g., 1963) has analyzed the function of anxieties over sickness among the Ojibwa Indians of North America, and other investigators have looked at the same problem in different cultural settings (e.g., Lieban 1962).
Sickness is often interpreted as the supernaturals' way of indicating an act or intention of socially disruptive behavior. Especially in societies lacking strong centralized sociopolitical institutions, the occurrence and imminence of disease, with the belief that it represents punishment for aberrant, dissocial impulse or action, can be functionally important in maintaining group cohesion and restraining disruptive tendencies. The therapeutic practices attendant upon occurrence of disease may also have socially cohesive results. Although such therapy may often be medically effective, it may serve ancillary functions in the total organization of the society. Typically, the curative session (and often the diagnostic occasion as well) involves not only the patient and the
healer but also the patient's family and neighbors. Often the therapy involves confession by the patient, and under such conditions the confession may well relieve him of diffuse as well as focused anxieties and guilt. When followed by concrete expiatory acts, it may also give him a chance to participate in his own treatment through action. (It is doubtful whether such curative rites do more than provide temporary symptom relief, but the same can be said of much modern psychotherapy.) At the same time group cohesion is often enhanced, for such confessions dramatize fundamental social values by illustrating the harm that can come from social deviance. They provide a setting in which all participants are enveloped in the aura of forgiveness and, through stress on the protection afforded by adherence to group values, assurance of good health. In short, the therapeutic context is usually explicitly a social context, and during the course of the therapy the reciprocal psychosocial involvement of the patient with his fellows is ritually underscored. As noted above, if therapeutic directives for behavior are issued, they frequently apply to the group, or selected members of the group, as well as to the patient; and if successful recovery is as dependent upon good thoughts as upon effective techniques (as frequently happens), then the assembled company must be devoid of ill wishes and hostilities toward not only the patient but also each other. The curative rite may thus serve in multiple ways as an occasion for reintegration of the group around common social values. The practice of folk medicine is variously institutionalized. In all societies some rudimentary medical knowledge is an aspect of enculturation, but beyond this general protection there is always a specialist. Sometimes the specialist's role is a full-time activity, but more frequently it is combined with other principal roles appropriate for the practitioner.
In some societies there are more complex social arrangements than the simple dyadic relationship between healer and patient. Even as the kin and covillagers of the patient may be explicitly involved in the curative process, so too there may be a society of healers or several societies of healers devoted to diagnosis and cure of various diseases. In west Africa there are found, for example, specialized associations of healers of smallpox or snake bite; each association possesses its own rules of qualification, initiation, and procedure (Harley 1941).

Folk medicine in change. Folk medicine does not easily change under the impact of sustained contact with the industrialized world, or even as a result of deliberate attempts to introduce new conceptions of disease and hygiene. Paul (1955), Foster (1962), and others have documented the variety of factors that may impede or altogether prevent the successful introduction of a modern health program, even of so simple an innovation as the boiling of drinking water. Such factors include ecological considerations, as well as functional efficiency in domestic tasks, the social structure, the status and prestige of the innovator, and the perception of threat or advantage to the recipient. The proper role of the healer may be differently defined; in India, for example, the medical practitioner must assure the patient of recovery, and any admission of uncertainty (even couched in the form of probability) is not allowed. Rudimentary physical testing may be impossible or difficult in a non-Western context. In societies where blood, for example, is thought of as a nonregenerative substance, to take samples for testing is tantamount to inflicting deliberate harm. It has been found to be easier to introduce behavioral changes than changes in belief about the nature of illness, its cause, and prevention. Domestic hygiene and community health may be bettered by the public health worker who influences a change in habits while not disturbing the underlying belief system. One reason for this has been mentioned above: belief systems, particularly those centered on critical areas of social value, serve more than a single cognitive function. Because they interrelate with religious and magical systems, as well as with the moral order of the society, they impart a deeper sense of resignation and acceptance of events than does an alien concept treating of a germ theory of illness. The value system of a culture provides a more satisfying answer to the question, "Why did I and not my neighbor get sick?" than does an explanation phrased in terms of communicability of a disease, thresholds of resistance, host, agent, and environment.
Yet in many instances modern medicine does get accepted; and one of the reasons is its demonstrably greater effectiveness in the treatment and prevention of many diseases. But even such acceptance is often compromised by the existence of alternative diagnostic and therapeutic frameworks: one relating to those diseases for which it is felt modern medicine is more effective, and the second relating to diseases conceived to be unamenable to modern medical treatment. The first is often applied to sicknesses introduced by the Europeans (such as tuberculosis, measles, smallpox, and others of a communicable nature), while in the second group are traditionally endemic diseases and, especially, ailments having a large component of psychological or psychophysiological involvement.
But in the extremity of fear for a patient's life even such distinctions as these are often disregarded, and the ill person may be taken to a modern medical facility after indigenous healers have done their best, taken either to be cured or left to die. Every hospital, and not just those in non-Western, "underdeveloped" countries, will admit patients brought too late for the course of disease to be halted even by the most advanced techniques of scientific medicine. Disease being an unequivocal threat to life, adaptive responses are many and sometimes override ingrained belief, either of folk medicine on the one hand or modern medicine on the other. In this light, given the avowedly limited role of scientific medicine in society, together with the inevitability of disease, elements of folk medicine will no doubt everywhere persist, even as they do in Europe and the United States, so long as there is uncertainty of outcome or technical ineffectiveness in alleviating pain, prolonging life, and guaranteeing cure.

CHARLES C. HUGHES

[See also HEALTH and ILLNESS. Directly related are the entries ANTHROPOLOGY, article on APPLIED ANTHROPOLOGY; MAGIC; POLLUTION; RELIGION.]

BIBLIOGRAPHY

ACKERKNECHT, ERWIN H. 1942a Primitive Medicine and Culture Pattern. Bulletin of the History of Medicine 12:545-574.
ACKERKNECHT, ERWIN H. 1942b Problems of Primitive Medicine. Bulletin of the History of Medicine 11:503-521.
ACKERKNECHT, ERWIN H. 1946 Natural Diseases and Rational Treatment in Primitive Medicine. Bulletin of the History of Medicine 19:457-497.
ACKERKNECHT, ERWIN H. 1965 History and Geography of the Most Important Diseases. New York: Hafner.
CASSEL, JOHN 1955 A Comprehensive Health Program Among the South African Zulus. Pages 15-41 in Benjamin D. Paul (editor), Health, Culture, and Community. New York: Russell Sage Foundation.
CLEMENTS, FORREST E. 1932 Primitive Concepts of Disease. California, University of, Publications in American Archaeology and Ethnology 32, no. 2:185-252.
DUBOS, RENE (1959) 1961 Mirage of Health: Utopias, Progress, and Biological Change. New York: Harper.
DUBOS, RENE 1965 Man Adapting. New Haven: Yale Univ. Press.
ERASMUS, CHARLES J. 1952 Changing Folk Beliefs and the Relativity of Empirical Knowledge. Southwestern Journal of Anthropology 8:411-428.
FOSTER, GEORGE M. 1962 Traditional Cultures, and the Impact of Technological Change. New York: Harper.
FRAKE, CHARLES O. 1961 The Diagnosis of Disease Among the Subanun of Mindanao. American Anthropologist New Series 63:113-132.
HALLOWELL, A. IRVING 1935 Primitive Concepts of Disease. American Anthropologist New Series 37:365-368.
HALLOWELL, A. IRVING 1963 Ojibwa World View and Disease. Pages 258-315 in Iago Galdston (editor), Man's Image in Medicine and Anthropology. New York Academy of Medicine, Institute of Social and Historical Medicine, Monograph No. 4. New York: International Universities Press.
HARLEY, GEORGE W. 1941 Native African Medicine: With Special Reference to Its Practice in the Mano Tribe of Liberia. Cambridge, Mass.: Harvard Univ. Press.
HUGHES, CHARLES C. 1963 Public Health in Non-literate Societies. Pages 157-233 in Iago Galdston (editor), Man's Image in Medicine and Anthropology. New York Academy of Medicine, Institute of Social and Historical Medicine, Monograph No. 4. New York: International Universities Press.
KUHN, THOMAS S. 1962 The Structure of Scientific Revolutions. Univ. of Chicago Press.
LAUGHLIN, WILLIAM S. 1963 Primitive Theory of Medicine: Empirical Knowledge. Pages 116-140 in Iago Galdston (editor), Man's Image in Medicine and Anthropology. New York Academy of Medicine, Institute of Social and Historical Medicine, Monograph No. 4. New York: International Universities Press.
LIEBAN, RICHARD W. 1962 The Dangerous Ingkantos: Illness and Social Control in a Philippine Community. American Anthropologist New Series 64:306-312.
MAY, JACQUES M. 1960 The Ecology of Human Disease. New York Academy of Sciences, Annals 84:789-794. -> A paper delivered at a conference on culture, society, and health held and sponsored by the New York Academy of Sciences and the Research Institute for the Study of Man.
PAUL, BENJAMIN D. (editor) 1955 Health, Culture, and Community: Case Studies of Public Reaction to Health Programs. New York: Russell Sage Foundation.
PRICE-WILLIAMS, D. R. 1962 A Case Study of Ideas Concerning Disease Among the Tiv. Africa 32:123-131.
RUBEL, ARTHUR J. 1960 Concepts of Disease in Mexican-American Culture. American Anthropologist New Series 62:795-814.
SIGERIST, HENRY E. 1951 A History of Medicine. Volume 1: Primitive and Archaic Medicine. New York: Oxford Univ. Press.
WHITING, JOHN W. M.; and CHILD, IRVIN L. 1953 Child Training and Personality: A Cross-cultural Study. New Haven: Yale Univ. Press. -> A paperback edition was published in 1962.

II SOCIAL ASPECTS
Medical care is the application of scientific knowledge and technique to solving the physical and emotional problems of man. To a physician, medical care denotes the body of diagnostic and therapeutic theory and procedure developed to understand, cure, and prevent diseases. A social scientist, however, defines medical care as a system of social institutions in a larger social structure. Since medical care is given by specialized personnel, it presents to the social scientist several problems in the sociology of professions. Since much medical care is given in organized settings, it can also be
studied from the viewpoint of the sociology of formal organizations.

The demand for medical care

Medical institutions cannot originate without a market. Anthropological evidence suggests that recognition of physical and emotional problems by potential patients is universal among the world's populations: whether he is a Western city dweller or a peasant in an underdeveloped country, the human being is aware of discomfort and an inability to perform his normal social roles. However, the decision to take practical action and the choice of remedies vary widely according to the social system and according to the individual's statuses in each social system.

Religion and science. The meaning of illness and the practical action deemed appropriate in a society derive from the prevailing bodies of religious and scientific thought. In many underdeveloped countries, widespread beliefs attribute injuries and illness to alienation from divine forces or from social obligations. For remedies appropriate to the imputed causes, large numbers of people may rely on personal rituals or on the guidance of priests and folk practitioners. Where Western medical institutions have been imported into such societies by governments or missionaries, they may be used by only the small minority that is urban and Westernized (Williams & Scharff 1960, p. 18). Extensive public use of medical care depends on widespread acceptance of certain doctrines about God and man: human life on earth should be preserved; the alleviation of physical discomfort does not contradict divine intent but may even serve God's will; practical action to save life is not inconsistent with divine purpose or church obligations. These doctrines have been prominent in Christianity, classical Islam, and Judaism. For these reasons, public utilization of therapeutic and preventive medicine has been widespread in the ancient Middle East and in the modern West.
As Western religions spread in Asia, Africa, and Latin America, or as contrary traditional religions are redefined to tolerate or encourage the alleviation of discomfort by practical secular action, public utilization of scientific medical institutions increases. In many societies, the prevailing religious and scientific beliefs define illnesses differently and, thus, teach a range of responses. For example, externally visible injuries and infections may be attributed to mundane events, while internal physical and mental malfunctions are believed to originate supernaturally. Since direct physical remedies are considered appropriate for the former and magical or propitiatory methods are alone believed efficacious for the latter, the population may bring its surgical and infectious problems to the Western-style doctor and hospital, while retaining clergymen and folk practitioners for internal and mental illnesses. The referral of particular categories of people is often a function of religious belief. For example, folk religion in some Arab and Asian countries assumes that the evil eye and divine wrath fall particularly heavily upon babies and children. Therefore, pediatric illnesses may be thought irremediable; few children are brought for medical care; and pediatrics remains an undeveloped specialty in medicine (Cameron 1960; Williams & Scharff 1960, pp. 18, 32, 34).

Other social determinants. The demand for medical care varies with other broad characteristics of society, such as the general levels of urbanization, literacy, prosperity, and industrialization. Poverty breeds disease and neglect, and consequently the world's numerous underdeveloped countries have potentially huge numbers of patients (Brockington 1958, part 2). Illiteracy prevents many citizens from learning where to find medical care, and services are often inaccessible in the rural areas, where much of the population lives. However, as information spreads about the availability and efficacy of medical installations, public use increases. As urbanization rises, so does the number of patients; in the cities of underdeveloped countries, vast throngs of ill persons tax the outpatient clinics and inpatient wards of public hospitals (McGibony 1961, pp. 51-52; Mazen 1961, pp. 40, 72, 273-275). Within each country, social class and literacy appear to correlate with the use of medical care. The lower and poorer social classes may have the greatest number of physical and mental problems. But the higher classes make greater use of medical services because of their greater understanding of medical services, their social sophistication, and their ability to pay (Kadushin 1964).
The problem varies by social class: for example, in the developed countries, coronary disease and ulcers are found more often among the upper class, infections and tuberculosis more often among the lower class (Freeman et al. 1963, chapter 14 passim; Susser & Watson 1962, chapter 3).

Family structure. The demand for medical care is affected in several ways by family structure. In the Muslim countries and in some others, religion and custom decree that women are to work and live solely within the family circle. Therefore, women hesitate to visit the hospital and particularly to become inpatients, since this would involve
immodest contact with male strangers. More men than women use medical services in the Muslim countries, but the opposite is true elsewhere, particularly in the West (compare Mazen 1961, pp. 213-214 and Fehler 1961, p. 397). Medical specialties involving female modesty, particularly obstetrics and gynecology, are developed far more in the West than in the countries with sheltered women. In preindustrial societies, the breadwinner is indispensable to provide current income, and the mother cannot easily be spared from her family duties. Thus, young adults may postpone medical care, particularly if inpatient hospitalization is likely, and medical services are used predominantly by children, the elderly, and severely ill young adults. However, in some underdeveloped countries, babies and young children occupy such a subordinate position in the family that they are brought for medical care much less often than adults (Stever 1961, p. 335). In general, the utilization of medical services is higher in the industrial countries, not only because family nursing is deemed less efficacious than expert hospital or clinic treatment, but also because employers expect efficient performance and because urban living conditions make home care difficult. The modernization of underdeveloped countries includes the establishment of Western-style medical institutions and the propagation of Western health norms, and consequently the foregoing cross-national differentials in the utilization of medical care will diminish.

Medical practice

Since every society has the functional problems of explaining illness to the sick and their families and of guiding practical action, each society develops a body of medical theory from its basic ideas about the universe, life, and man. These theories vary in their empirical utility and in their implications for remedial treatment.
Some of the principal civilizations—notably, European, Islamic, Indian, and Chinese—derived considerable and still influential systems of medical theory from their religions and ontologies (Castiglioni 1927). Each of the principal medical traditions became institutionalized in the form of a theoretical literature, recognized practitioners, and educational techniques for transmitting knowledge and remedial skill from one generation of practitioners to the next. Much of the training has been didactic, but some—particularly in the Greek-Islamic-Western tradition—has made the neophyte an apprentice or observer of a practitioner at work. Western medicine. Transmitted to Europe by certain medieval Italian universities, Greek-Islamic
medicine ultimately proved the most productive in empirical knowledge and in therapeutic success. Several characteristics of European thought and society seem to have been crucial conditions for its growth. First, much of Western scientific thought has been inductive and empiricist, and some of this thinking contributed to medicine. Thus, although deductive and metaphysical theory was not absent from Western medicine, and although the speculative system builders vigorously fought new trends, simultaneously there developed a medical tradition of observation, experimentation, and critical verification of facts and therapies. Western medical knowledge and remedies—particularly in recent centuries—have been under continuous revision in the light of new evidence, to a degree found in no other medical tradition. Furthermore, the empiricist spirit of Western medicine has resulted in a tradition of medical education in hospital wards and experimental laboratories, to a degree found nowhere else (Shryock [1936] 1947, especially chapters 1-11). Second, religions in much of the world insist on preserving the complete body of the deceased person, usually as a condition for his heavenly salvation. But Christianity distinguishes more clearly between soul and body, and it pictures the salvation of the soul while the body rots. Therefore, Christianity has been able to tolerate the widespread use of autopsies, while most other religions discourage them, and much of the physiological and therapeutic knowledge of Western medicine has resulted from post-mortem examinations. Potentially productive medical traditions in other societies, such as Islam, have been stunted because the dominant religions forbade autopsies. A third reason for the greater success of the Western medical tradition has been the West's technical inventiveness. This approach industrialized the West's economy.
But even before the industrial revolution, the West's gadget-consciousness introduced into its medical research and practice extremely fruitful diagnostic and therapeutic devices, such as the microscope and thermometer. During the nineteenth and twentieth centuries, the industrial revolution has enabled Western medicine to employ even more complex devices, such as the X ray or the anesthetic equipment essential for thoracic surgery, which could not be invented by the world's numerous preindustrial societies. Finally, the West has had a long tradition of bureaucratic organization. In this respect, of course, the West has not been unique; but its distinctive achievement was to apply such methods to the financing, staffing, and dissemination of medical care. Nationwide systems of public health regulation, hospitals, medical education, and health insurance evolved under the auspices of churches, governments, and voluntary associations. All the foregoing variables—cumulative scientific knowledge, technology, and extensive formal organization—have combined in recent centuries to produce a highly developed system of medical care in the West and, increasingly, through cultural diffusion, in other societies.

Doctor and patient

The individual doctor renders medical care to the individual patient in a variety of settings: the patient's home, the doctor's office, hospitals, or polyclinics. The relationship between doctor and patient in the West is thought of by social scientists as a special case in the sociology of professional-client relationships. American sociologists' thinking about the doctor-patient confrontation has been greatly influenced by a theoretical model suggested by Talcott Parsons (Parsons 1958; Parsons & Fox 1952). The sick person, according to this model, is temporarily exempt from his normal social roles but is expected to perform certain well-defined patient roles. The doctor specializes in diagnosing and solving the patient's problems in accordance with the social norms of his profession, and he has the social responsibility of controlling the patient for the good of society and of the patient himself. The patient is expected to obey the doctor and strive for recovery in accordance with the expectations of each stage of his treatment. Gradually the dependence of patient upon doctor diminishes, and the patient resumes his normal family and economic roles. Parsons' analysis pictures an intelligent and ambitious patient and presupposes the achievement-oriented social system of the West. Research needs to be done in order to determine its applicability in non-Western societies with different values and with large proletariats possessing less ambition and more pessimistic medical prognoses.
[See PROFESSIONS.] Occasionally social scientists have observed the relationships between doctor and patient in hospitals and private offices, and they have attempted to trace the clinical consequences of the social structures governing this two-person interaction. For example, the class differences between doctor and patient have been found to affect the success of their clinical relationship: since the less educated patient is less able to communicate with the doctor in the latter's own vocabulary, he is asked to give fewer reports, and he receives fewer explanations and fewer instructions for home care than do patients of higher social classes (Freeman et al. 1963, chapter 11 passim). Another example of research on the social conditions governing relations between doctor and patient is Burling's finding that the patient's confidence in the doctor increases if the patient is not socially isolated but enjoys close personal relations with family members and with other patients (Burling et al. 1956, chapter 3). Many of the clinical decisions of the doctor have been found to depend on his own social relations within the medical profession and within the larger, lay community. For example, the doctor's ethnicity, family origins, and prior educational career—as well as his professional skill—affect the types of colleague networks and hospitals that he will join, and these contacts will determine the quality of his facilities, the skill of his consultants, and the affluence of his practice (Hall 1946). Adoption of new drugs—and possibly other innovations—by the individual doctor occurs earlier and more often if the doctor is closely related to a network of colleagues, and the adoption pattern for the network is led by certain doctors with a proclivity for adopting new things generally (Coleman et al. 1959).

The hospital

In many societies, all patients are treated in their homes by family members, with occasional visits by the priest or medicine man or with occasional visits by the patient to the medicine man. Among a few preindustrial peoples—such as the early Hebrews and some contemporary African tribes—the medicine man maintains beds where he can watch and treat patients in proximity to his own medical supplies or religious shrines. However, only a few civilizations have produced hospitals in large number and as important centers of medical care. Certain conditions in the social structure seem essential for the existence of an extensive and important hospital system.
The society must have skill in creating and running organizations. The religious and other social beliefs must legitimize the diversion of substantial resources for the treatment and maintenance of sick strangers. Sick people and their families must accept the idea of living away from home under the control of strangers. Finally, the society must produce and train enough people qualified to work in hospitals and must motivate them to care for sick strangers. Extensive hospital programs occurred in the past under the ancient Hebrews, in ancient India during a period of synthesis between Buddhism and Hinduism, and in medieval Islam. Religion was a
powerful motive in all three societies, since it taught that human life should be saved, that illness was a remediable blight upon life, and that scientific knowledge could properly be applied in combating disease. Resources were readily available for hospitals, and the Hebrew and Muslim religions preached that private charity for the poor and sick was a necessary condition for salvation. Islam and India were ruled by governments which recognized that large cities could not be run efficiently without therapeutic and custodial centers. Furthermore, both Islam and India, during the periods of their hospital programs, had experience in creating networks of formal organizations—in Islam as a result of ecclesiastic work and in both countries as a result of powerful and active governments. However, the hospital programs soon declined in all three societies because the necessary social conditions were incomplete. Hospitals require dedicated staffs, and none of these religions preached the duty or desirability of lifetime careers of caring for sick strangers. In fact, the caste taboos of Brahmanical Hinduism discouraged a large number of employees from working in the same place and from giving all necessary care to the general public. None of these countries had monastic traditions, and thus, the church could not run hospitals when the state was conquered by foreigners—in the Hebrew and Muslim cases—or when the state lost interest in running hospitals—as in India and the unconquered parts of Islam. The role of Christianity. Christianity has been far more conducive to the development of hospitals than other world religions. Like the others that encouraged limited hospital programs, Christianity preached the value of human life and the desirability of preserving it through applied science. But above all, Christianity provided the doctrinal and organizational basis for hospital staffs.
Humble and charitable care of the sick and poor was commended as one of the principal paths to salvation. Instead of sheltering women and thus discouraging nursing careers—as did Islam—Christianity encouraged unmarried women to do charitable work among the public and even to live away from home if necessary. Not only did Christianity legitimize hospital employment by laywomen; the Catholic church organized brotherhoods of monks and orders of nuns who nursed hospital inpatients as their apostolic mission. Employing bureaucratic procedures learned in large part from the government of the Roman Empire, the church organized hospitals, financed them through its extensive fund-raising machinery, and maintained them. Thus, Christianity's ecclesiastic structure produced
a continuity in hospital affairs regardless of the fluctuations in state policy. For many centuries, European hospitals were run by the church or by associations of laymen affiliated with the church, and they were staffed by nuns. For much of their history they were custodial institutions, where sick and dying people were maintained and given religious guidance. Greek-Islamic therapy was introduced when some of the medieval nursing monks attended southern Italian medical schools. Thereafter, hospital systems created and maintained by the church also became the workshops of doctors and centers of medical education. European hospitals became secularized as the function of religion changed in Western social structure, as medical science became more successful and influential, and as lay governments became more powerful and more responsive to their citizens' demands for social welfare. In the late Middle Ages the church forbade monks to practice surgery and restricted their other medical work; lay physicians were welcomed into the hospitals and soon commanded all their medical practices (Rosen 1963). Secularization of hospitals. Between the Reformation and the present day, many church-affiliated hospitals have been taken over by governments, and many new hospitals have been created by governments or by secular owners. Even after these transfers to government, Catholic nuns and Protestant deaconesses continued to work in hospitals; however, a new occupation of trained lay nursing arose in the late nineteenth century, and these nurses have been taking over the numerous jobs that cannot be filled by the now contracting religious orders.
Church-affiliated hospitals staffed by nuns can still be found in countries where the church must seek new converts and retain the loyalty of its own communicants, such as in non-Christian societies and in the Western countries with mixed religions (for example, Germany, Holland, Switzerland, and the United States). As Western medical care spreads throughout the world, so does the Western hospital in its present secularized form. However, a serious obstacle is the recruitment of enough lay graduate nurses in non-Christian societies. Christian values seem to be an important recruitment motivation for lay nurses, just as they have been for nursing nuns. It is evident from unpublished research by the present author that underdeveloped countries without large Christian minorities have great difficulty in staffing their hospitals. The rise of scientific medicine. Since the hospital is organized to protect and treat sick people,
its goals, structure, and functions depend on the current state of medical science. As a result of the great medical advances of recent centuries, the Western hospital has been transformed. Previously, the hospital was a charitable establishment to care for the helpless and protect society from the infected. In order to perform such work, only a small staff of supervisors and unskilled attendants was needed; only a few doctors visited, to give quick treatments or to perform occasional clinical experiments. Since these were custodial institutions for the poor and since the wealthy could afford home visits by doctors, the higher classes were treated and housed in their homes. Because infection and pain were common in hospitals, much of the public avoided surgery and feared hospitalization. Modern medicine learned to classify diseases and distinguish among the treatments for each; the hospital needed to separate patients according to disease for distinctive treatment by category, and therefore, modern hospitals became departmentalized. The understanding of sterile technique and the introduction of antiseptics reduced cross-infections, made surgery and obstetrics safe, and increased therapeutic success and public confidence. Surgery also became more successful and popular because of the introduction of anesthetics. Surgical facilities grew in hospitals, and surgeons gained prestige and wealth. The wards became places for continuing treatment and were recognized as places of potential but preventable cross-infections. Scientific nursing education was introduced, closer controls were instituted over lower nursing ranks, and nursing staffs were greatly increased in size. The introduction of laboratories, operating theaters, and many ancillary services increased the size of paramedical staffs and raised the cost of hospitals. Since medical specialists needed to treat their patients by means of these facilities, members of higher social classes were hospitalized. 
The physical appearance of existing hospitals was improved, and private hospitals originated. Since specialty training required acquaintance with the advanced techniques found in hospitals, large and disciplined medical staffs were created, and many hospitals added educational and research goals to their mission of patient care. To administer such large and complex organizations, there developed new occupations specializing in hospital management; to supply the hospitals there arose new and immensely profitable pharmaceutical and equipment industries. Hospital organization. Within individual countries—and particularly in the United States—social scientists have studied interpersonal processes within hospitals, using the same concepts that they apply to any other organizations. For example, American sociologists have long been interested in latent social processes and in the entire set of functions and dysfunctions resulting from certain institutional arrangements. Thus, American medical sociologists have not only studied the manifest hospital structure and its successes but have pointed out the presence of latent processes with paradoxical outcomes. The hospital, they say, is manifestly dedicated to the provision of means for treating patients successfully. However, every organization requires an administrative structure to arrange its resources economically and to control deviant behavior. Thus, therapeutic and administrative structures exist simultaneously in the hospital, each with its own priorities and personnel. Emphasizing one set of goals (such as administrative order) is dysfunctional for maximization of the other structure's goals (such as patient care), and conflicts occur between the two lines of authority (such as the lay administrators and the doctors). Several studies of mental hospitals note the dilemma of combining organizational imperatives and therapy: the custodial structure necessary to control patients and maintain order is dysfunctional for mental care. [See ORGANIZATIONS, article on ORGANIZATIONAL GOALS; see also Freeman et al. 1963, chapter 10, and Reader 1959.] A specialist in administrative medicine might primarily study the formal organization of hospitals, but sociologists—whether observing factories, schools, or hospitals—search for the informal social structures that may substantially deviate from the formal chain of command and that may powerfully influence the system in action. Some American medical sociologists have been participant-observers in hospital wards.
They have described the informal social system among patients, and they have identified how this informal ward society alternately raises and lowers the morale of individual patients and alternately supports and disturbs each patient's relationships with the hospital staff (e.g., Fox 1959). From such research in mental hospitals has come the advice that the individual patient's recovery depends not only on treatment received during the hours reserved for formal therapy but also on harmonious relationships involving all patients and staff members throughout the day. The successful "therapeutic communities" are said to have democratic decision making, full communication of ideas and grievances, stable and rewarding careers among staff members, a proper balance between freedom and control, and an ideal combination of self-reliance and group support (Stanton 1954).
Quality of medical care

From the viewpoint of the social sciences, medical care is a strategic area for studying the institutionalization of social values and of applied scientific technique. However, from the viewpoint of the public and of an increasing number of medical practitioners, the proper task of the social sciences is to evaluate and improve the quality of care. Therefore, many medical organizations have retained social scientists to gather information and give advice. Some of the resultant studies deal with the training of doctors and nurses: the formal and informal organization of the curriculum, of the student community, and of the hospital are found to promote student learning in some ways and to inhibit it in others; and this information has been used by curriculum planners and administrators in the improvement of their instruction. Determinants of good care. Several studies have attempted to locate elements in the social organization of medical care that affect the quality of treatment. Usually this research involves collaboration by physicians and social scientists: the former identify the actions constituting "good medical care," while the social scientists contribute knowledge about research techniques and ideas about the possible determinants of good care. For example, O. L. Peterson and his associates (1956) observed the work of general practitioners and concluded that class standing in medical school, length of preparatory education, exposure to refresher courses, and other experiences are strongly related to several indicators of good care. Contrary to common beliefs, involvement in medical society affairs, prestigious hospital appointments, hours of work, and certain other variables were found to have little relationship to good care in this sample. In an ambitious study of ten hospitals, Basil S. Georgopoulos and Floyd C.
Mann (1962, chapters 5 and 8) discovered that certain social attributes of the hospital correlated more strongly with quality of care than did other commonly credited determinants, such as the volume of material facilities, size of budget, number of beds, and ratio of personnel to beds. The organizational characteristics associated with good care were: high coordination throughout the hospitals, harmonious relations among departments and between doctors and nurses, lower absenteeism among graduate nurses, and efficient but congenial management. In another organizational study of hospitals, Melvin Seeman and John W. Evans (1961) found that the informal social organization of the ward affects the clinical performance of interns and nurses: where
power inequalities and other forms of social distance are greater, efficiency and skill are lower. Some social research methods have been used to gather expert opinions about the nature and sources of good care. For example, Milton C. Maloney and his associates (1960) used survey techniques to get a sample of expert opinion about the quality of care: they asked doctors about the care the doctors obtained for themselves and for their families. These expert judges preferred treatment by full-time specialists in metropolitan medical centers. Doubtless applied social research will become an increasingly recognized fact of medical care in all countries. Modern medicine is devoting more attention to the "stress diseases" and other medical conditions that seem related to social roles. The organizational problems of medical care are rising because of increased costs, scarcities of personnel and of services, and because of the demands by social planners for greater efficiency and economy in medical administration. Teamwork among clinicians, administrators, and social scientists may consequently become commonly accepted as the means to a common goal of better quality care.

WILLIAM A. GLASER

[Directly related are the entries HEALTH; MEDICAL PERSONNEL; PUBLIC HEALTH. Other relevant material may be found in MENTAL DISORDERS, TREATMENT OF; MENTAL HEALTH; PSYCHIATRY.]

BIBLIOGRAPHY

Interest in social determinants of medical care first entered the literature of the field through the writings of several leading medical historians (e.g., Sigerist 1960; Shryock 1936) and through research in epidemiology and public health (see the bibliographies for the articles on these two topics). Sociologists, social psychologists, and other social scientists began to write about medical care only in recent decades.
Much of their work has been designed to provide practical information for psychiatric hospitals, the curriculum committees of medical and nursing schools, epidemiologists, and other medical practitioners with practical problems requiring knowledge of the facts. However, some research has been designed by social scientists to test hypotheses derived from their own disciplines. The voluminous American literature is summarized in Freeman et al. 1963, and some of the principal studies are reprinted in Jaco 1958. Research by social scientists is also well advanced in Great Britain, where a group of specialists in "social administration" provide facts and advice about all aspects of social policy, including health (Susser & Watson 1962). Similar work is beginning elsewhere: for example, Konig & Tonnesmann 1958 summarizes the research in Germany. Bibliographies of the American and European literature are provided by Freidson 1961/1962 and Pearsall 1963. Much of the social research on the quality of care is summarized by Anderson & Altman 1962. Several writers on public health and administrative medicine have summarized the health problems and forms of medical organization in the world (e.g., Brockington 1958). As yet, no social scientist has written a general survey of the variations in the social organization of medical care throughout the world. As social scientists have written more about medical care, their ideas have become more widely accepted in the writings of clinicians, and particularly by psychiatrists. An increasing number of writings by physicians, nurses, and medical administrators about the proper management of medical organizations and about the proper treatment of patients incorporate concepts and generalizations about the effects of the patient's family settings, the community's culture, the social relationships between patient and clinician, the personality traits of persons experiencing stress and social isolation, and so on (e.g., Balint 1957).

ANDERSON, ALICE J.; and ALTMAN, ISIDORE 1962 Methodology in Evaluating the Quality of Medical Care: An Annotated Selected Bibliography, 1955-1961. Univ. of Pittsburgh Press.
BALINT, MICHAEL 1957 The Doctor, His Patient and the Illness. London: Pitman Medical Publishers.
BROCKINGTON, C. FRASER 1958 World Health. Harmondsworth (England): Penguin. -> Includes an account of the first ten years of the World Health Organization.
BURLING, TEMPLE; LENTZ, EDITH M.; and WILSON, ROBERT N. 1956 The Give and Take in Hospitals: A Study of Human Organization in Hospitals. New York: Putnam.
CAMERON, ALICK 1960 Folk-lore as a Medical Problem Among Arab Refugees. Practitioner 185:347-353.
CASTIGLIONI, ARTURO (1927) 1958 A History of Medicine. 2d ed. New York: Knopf. -> First published as Storia della medicina.
COLEMAN, JAMES; MENZEL, HERBERT; and KATZ, ELIHU 1959 Social Processes in Physicians' Adoption of a New Drug. Journal of Chronic Diseases 9:1-19.
FEHLER, J. 1961 Verweildauer im allgemeinen Krankenhaus. Krankenhaus (Stuttgart, Germany) 53:397-403.
FOX, RENEE C. 1959 Experiment Perilous. Glencoe, Ill.: Free Press.
FREEMAN, HOWARD E.; LEVINE, SOL; and REEDER, LEO G. (editors) 1963 Handbook of Medical Sociology. Englewood Cliffs, N.J.: Prentice-Hall.
FREIDSON, ELIOT 1961/1962 The Sociology of Medicine: A Trend Report and Bibliography. Current Sociology 10/11:123-192.
GEORGOPOULOS, BASIL S.; and MANN, FLOYD C. 1962 The Community General Hospital. New York: Macmillan.
HALL, OSWALD 1946 The Informal Organization of the Medical Profession. Canadian Journal of Economics and Political Science 12:30-44.
JACO, E. GARTLY (editor) 1958 Patients, Physicians and Illness: Sourcebook in Behavioral Science and Medicine. Glencoe, Ill.: Free Press.
KADUSHIN, CHARLES 1964 Social Class and the Experience of Ill Health. Sociological Inquiry 34, no. 1:67-80.
KONIG, RENE; and TONNESMANN, MARGARET (editors) 1958 Probleme der Medizin-Soziologie. Cologne (Germany): Westdeutscher Verlag.
McGIBONY, JOHN R. 1961 Health Care in India: Its Patterns and Problems. Hospitals 35, no. 10:40-44; no. 11:47-52.
MALONEY, MILTON C.; TRUSSELL, RAY E.; and ELINSON, JACK 1960 Physicians Choose Medical Care: A Sociometric Approach to Quality Appraisal. American Journal of Public Health 50:1678-1686.
MAZEN, AHMED KAMEL 1961 Development of the Medical Care Program of the Egyptian Region of the United Arab Republic. Ph.D. dissertation, Stanford Univ.
PARSONS, TALCOTT 1958 Definitions of Health and Illness in the Light of American Values and Social Structure. Pages 165-187 in E. Gartly Jaco (editor), Patients, Physicians and Illness: Sourcebook in Behavioral Science and Medicine. Glencoe, Ill.: Free Press.
PARSONS, TALCOTT; and FOX, RENEE (1952) 1958 Illness, Therapy, and the Modern Urban Family. Pages 234-245 in E. Gartly Jaco (editor), Patients, Physicians and Illness: Sourcebook in Behavioral Science and Medicine. Glencoe, Ill.: Free Press. -> First published in Volume 8 of the Journal of Social Issues.
PEARSALL, MARION 1963 Medical Behavioral Science: A Selected Bibliography of Cultural Anthropology, Social Psychology, and Sociology in Medicine. Lexington: Univ. of Kentucky Press.
PETERSON, OSLER L. et al. 1956 An Analytical Study of North Carolina General Practice: 1953-1954. Journal of Medical Education 31, no. 12, part 2.
READER, GEORGE G. 1959 Medical Sociology With Particular Reference to the Study of Hospitals. Volume 2, pages 139-152 in World Congress of Sociology, Fourth, Transactions. London: International Sociological Association.
ROSEN, GEORGE 1963 The Hospital: Historical Sociology of a Community Institution. Pages 1-36 in Eliot Freidson (editor), The Hospital in Modern Society. New York: Free Press.
SEEMAN, MELVIN; and EVANS, JOHN W. 1961 Stratification and Hospital Care. American Sociological Review 26:67-80, 193-204.
SHRYOCK, RICHARD H. (1936) 1947 The Development of Modern Medicine. New York: Knopf.
SIGERIST, HENRY E. 1960 On the Sociology of Medicine. New York: MD Publications.
STANTON, ALFRED H. 1954 The Mental Hospital: A Study of Institutional Participation in Psychiatric Illness and Treatment. New York: Basic Books.
STERN, BERNHARD J. 1941 Society and Medical Progress. Princeton Univ. Press.
STEVER, ROBERT C. 1961 Medical Impressions From India and Nepal. Journal of Medical Education 36:330-337.
SUSSER, MERVYN W.; and WATSON, WILLIAM 1962 Sociology in Medicine. Oxford Univ. Press.
WILLIAMS, CICELY D.; and SCHARFF, J. W. 1960 An Experiment in Health Work in Trengganu, Malaya. Beirut: American Univ., School of Public Health.

III ECONOMIC ASPECTS
The economic aspects of medical care encompass a broad and diverse area. It can include discussion of medical care utilization by various income groups, licensing arrangements, incomes of practitioners, fee structures, and so forth. This discussion will be limited to three topics: (1) some of the special characteristics that distinguish medical care from other goods and services; (2) the various alternative mechanisms for financing medical care services utilized at the present time;
(3) general economic problems related to alternative patterns of financing.

Special characteristics

Medical care, although generally viewed as a consumption commodity, has come to have a special status among the wide variety of such goods and services. This status affects its organization, the amount of resources it commands, and the patterns used in its financing. Social-psychological factors. The view that medical care is somehow "different" is mainly an outgrowth of what may be called social-psychological factors. Medical care is felt to be (and obviously often is) intimately related to health and to life itself. The public associates medical care with dramatic lifesaving procedures, with the treatment of potentially fatal illness, and with the alleviation of suffering and pain. Since life is not considered a luxury commodity, those things which are believed to be associated with its preservation, e.g., medical care, have come to be considered a "human right." As a consequence, it is felt that the amount of medical care received—that the right to life or the prevention of pain—should not depend on the individual's income and purchasing power. Actual practice, of course, is often at wide variance with beliefs and declarations in this area. Unpredictability of need. A number of significant economic distinctions between medical care and most other goods also impel the public to the view that the financing and organization of medical care should be treated differently from the financing of other goods and services. Perhaps of primary importance is the fact that illness, and consequently the need for care, is unpredictable for the individual, although it is predictable for the group. The individual, therefore, finds it impossible to anticipate the frequency of medical care, the amount he will need, and the costs that will be associated with such care.
Furthermore, the economic burden imposed may be extreme: the illness may be severe and the economic consequences catastrophic. Not knowing the costs, he cannot save the appropriate amount in advance of an illness of unknown severity that may occur at some unknown time and with unknown frequency. This matter assumes increased significance, since real limitations exist on the possibility of postponing medical care purchases. Although postpayment over a period of time would be a possible remedy for part of the problem, difficulties with that device are many: (1) illness may affect both short-run and long-run earning power and thus make such postpayment difficult, perhaps impossible; (2) the service rendered cannot be repossessed if there is a failure to meet payments; (3) the individual may feel that he has little or nothing to show for the medical care received, i.e., he is no better off than before he became ill and is simply back to the pre-illness state, to which he feels he might have returned even without the care; (4) treatment may involve discomfiture and pain, neither of which is certain to make one feel well disposed to those who rendered the treatment and thus to meeting periodic payments to the provider of care; (5) the individual does not consider that he is continuing to derive benefits from the service rendered in the past, as contrasted with the continuing pleasures derived from the purchase of consumer durables and other goods.

Externalities. A second characteristic of medical care services not shared by most other commodities involves externalities. These are present when benefits (or costs) accrue to others because the individual takes a particular action. The purchase of some types of medical services clearly involves such externalities. This is most evident in the case of communicable disease. All of us derive benefits from the immunization of others and from the prevention and treatment of the communicable diseases that affect them. The benefits to society, therefore, exceed the individual consumer's benefits. Conversely, there are costs to individuals when disease is not controlled in some other part of the population. The matter of externalities can be viewed even more broadly if we include satisfaction as well as dollar benefits in our consideration. If others are crippled, sick, or disabled, and if we are made uncomfortable by knowing about or seeing these conditions, then the benefits to society of rehabilitation or cure exceed the benefits accruing to the individual who is rehabilitated or cured. The increase in social satisfaction should, therefore, also be included in the calculus.
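The externality argument can be put in symbols. The notation below is an assumption introduced only for illustration and does not appear in the original: $q$ is the quantity of a service such as immunization, $b(q)$ the marginal benefit to the purchaser, $e(q) > 0$ the marginal benefit accruing to others, and $c$ a constant marginal cost. Assuming further that total marginal benefit $b(q) + e(q)$ declines as $q$ rises, a rough sketch of the reasoning is:

```latex
\begin{align*}
\text{private choice:}\quad & b(q_p) = c, \\
\text{social optimum:}\quad & b(q_s) + e(q_s) = c, \\
\text{since } e(q) > 0:\quad & b(q_p) + e(q_p) > c = b(q_s) + e(q_s)
  \;\Longrightarrow\; q_p < q_s .
\end{align*}
```

Under these assumptions the quantity chosen privately, $q_p$, falls short of the quantity $q_s$ that would be chosen if benefits to others were counted.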
Economic theory has shown that where positive externalities (external benefits) exist, the individual underinvests in the commodity, from society's standpoint. Thus the private sector underinvests, insofar as there is insufficient philanthropy or absence of compulsion. This characteristic becomes one of the bases for public intervention in the health sector and is one of the reasons that government is often the major financing agent for certain types of health expenditures. [See EXTERNAL ECONOMIES AND DISECONOMIES; WELFARE ECONOMICS.]

Investment. Yet another special feature of medical care is implied by the use, in the preceding paragraph, of the term "invest." Medical care is in part a consumption commodity, but also in part
an investment good. The purchase of medical care today increases the level of health and thus raises productivity in the future. Such sacrifice of current consumption in order to raise future output is the essence of all investments. Thus, the term "investment" need not be applied solely to expenditures on material capital. Accordingly, the analysis of public policy toward medical care can be cast in an investment framework. Quality assessment problems. An additional significant distinction between medical care and other services is the consumer's relative inability to judge the quality of the product he is purchasing. This matter has even greater importance when it is recognized that sins of omission or commission on the part of the physician or persons working under his direction may have serious and nonreversible consequences. Even ex post evaluation of quality by the consumer is difficult, and the satisfaction or dissatisfaction with the medical care provided may, therefore, be based on various extraneous factors. Since medical care may involve more serious matters than are involved in the purchase of other consumer goods, and since the search to find quality care is more difficult than with other goods, some measure of protection is afforded the consumer by licensing arrangements. In this manner authorities "guarantee" some presumed minimum level of competence. The economic issues in such licensing arrangements are many, since, depending on the standards set and the responsibility of those who provide licenses, supply may be artificially and unnecessarily restricted, with consequent increase in prices for services and in incomes of practitioners not sufficiently compensated for by increase in quality of care. [See LICENSING, OCCUPATIONAL.] Alternative financing mechanisms The special characteristics of medical care have played a role in inducing governments to participate actively in the medical care sector. 
The form that such participation has taken has differed greatly and is a product of institutional and ideological constraints. Nevertheless, the role of government in the health sector and in financing health care is acknowledged in all countries. This role often takes the form of government participation in funding or distributing medical care for specific categories in the population. These categories of persons may be defined by specified health characteristics (e.g., the blind, the disabled), by age characteristics (e.g., the aged, children), by economic status (e.g., the indigent, the medically indigent), or by other special characteristics that define
a particular population group (e.g., veterans, migrant workers). On occasion, combinations of these characteristics are utilized to determine eligibility, and individuals must then meet more than one criterion. Government participation and support, of course, need not be confined to categories of persons and in some countries covers the total resident population. Voluntary insurance (United States). Voluntary health insurance has developed as one method of financing designed to meet the problem of unpredictability of illness and the need for medical care. The insurance principle lends itself to application in the health area, although it is relatively difficult to apply to certain health expenses and to certain population groups. This type of financing is most highly developed in the United States, where hospital insurance and surgical-medical insurance are provided both by nonprofit plans and by commercial insurance companies. The expansion of private voluntary health insurance in the United States began in the 1930s with the development of Blue Cross plans for hospital coverage (with backing by the American Hospital Association) and Blue Shield plans for surgical-medical insurance (with backing by the organized medical profession). These plans are nonprofit and serve local areas (usually entire states), although national coordination is provided. Commercial insurance companies entered the field in the late 1930s, largely through group coverage and utilizing arrangements for reimbursement of charges up to specified amounts rather than for provision of specified services (as in the nonprofit hospitalization plans). Commercial carriers competed successfully with the Blue Cross and Blue Shield plans. In part this was because the "Blues" used community rating: the rates charged for each of the various contracts offered were the same for all groups in a community. 
Commercial carriers used experience rating, wherein the rates charged any single group varied with the experience of that group. As a consequence, low-risk groups and individuals often found it advantageous to deal with a commercial carrier where premiums reflected their own experience and utilization of services rather than the average experience, including that of the high risks, in a community. The greater the number of low risks who left the "Blues," the higher the insurance rates rose for those remaining. Thus those who previously found themselves on "the margin" now found it advantageous to shift to commercial carriers. This mechanism could, in theory, continue indefinitely. It is one of the important considerations in the social insurance area and is often used to argue
for a compulsory element in social insurance. As a result of the force of this process, experience rating is now widely used even by those plans which originally rejected it on philosophic grounds. With experience rating, high-risk groups (e.g., the aged) find that the rates charged are higher than would be the case under community rating and that it becomes more difficult to participate in the plan. This problem, as well as a number of other conditions, led the United States to institute, effective in 1966, a public system of health insurance, as part of its social insurance system, providing for limited financing of some medical care for persons 65 years of age and older. The pressure for such a system of insurance was also heightened by the fact that the aged often are not members of groups eligible for group insurance coverage. Insurance rates for individuals are significantly higher than group rates (and not only because the employer frequently pays a portion of the premium under group coverage). The high medical expenses of the aged and their relatively low financial resources compound the problems of finance. [See AGING, article on ECONOMIC ASPECTS.] It is estimated that in 1965 about 80 per cent of the civilian population in the United States had some insurance protection against the costs of basic hospital care, about 75 per cent had some surgical expense protection, and almost 60 per cent were eligible for some additional coverage for in-hospital physician visits. Although the benefits met about 70 per cent of the total cost of hospital care, only about one-third of total consumer expenditures for personal health care were met by insurance. 
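The process by which community rating loses its low-risk members can be traced step by step. The following is a minimal sketch, not a description of any actual plan: the group costs, the `loading` parameter, and the function name `community_rating_spiral` are all hypothetical. It assumes equal-sized groups, a community rate equal to the average expected cost of the groups still enrolled, and a commercial carrier that quotes each group its own expected cost plus a loading.

```python
def community_rating_spiral(expected_costs, loading=0.0):
    """Trace the community-rated premium as low-risk groups leave.

    expected_costs: expected annual cost per member for each
                    (equal-sized, hypothetical) group.
    loading: amount a commercial carrier adds to a group's own expected
             cost when quoting an experience-rated premium.
    Returns a list of (premium, remaining_costs) tuples, one per round.
    """
    remaining = sorted(expected_costs)
    history = []
    while remaining:
        premium = sum(remaining) / len(remaining)  # same rate for every group
        history.append((premium, list(remaining)))
        # A group leaves if its experience-rated quote beats the community rate.
        stayers = [c for c in remaining if c + loading >= premium]
        if len(stayers) == len(remaining):
            break  # no remaining group gains by leaving
        remaining = stayers
    return history

for premium, groups in community_rating_spiral([40, 60, 80, 100, 220], loading=5):
    print(f"premium {premium:6.1f}  groups still enrolled: {groups}")
```

In this illustration the community rate rises from 100 to 160 to 220 as successive low-risk groups depart. With a finite set of groups the sketch stops once only the highest-risk group remains.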
In the United States, health expenditures are financed largely by the private sector: of the $37,000 million of national health expenditures in 1964 (5.8 per cent of the gross national product), 74 per cent came from the private sector and only 26 per cent from the government—equally divided between the federal government on the one hand and state and local governments on the other. About 50 per cent of all personal health service expenditures were paid for by the recipient of the services (or his family), while third parties (government, insurers, etc.) paid for the remainder. The problem of high-risk groups under experience rating in a voluntary insurance plan can be solved by a compulsory insurance program or direct government provision or financing of services. However, the traditional arguments that are used by those who favor approaches involving more government participation go beyond the solution of the difficulties that some groups and individuals face under experience rating. They include matters such as ease and economic efficiency of administration,
comprehensiveness of care, coverage of the total population (including those groups that voluntary insurance finds it difficult or inefficient to reach or whose economic status is so low as to inhibit purchase of insurance), more ready control of inflation in medical care costs, increased equity in financing of care, and more possibility of increasing emphasis on certain aspects of medical services, e.g., preventive care. The success with which any of these objectives would be met would depend upon the particular characteristics of the health insurance or financing plan. The social security characteristic of a plan does not in itself guarantee comprehensiveness of coverage, quality of service, control of inflation in medical care prices, equity in financing, and so forth. Just as the voluntary private health insurance system can take many different forms and thus meet different problems, raise different issues, and resolve basic questions in different ways and with varying degrees of success, so too can each alternative basic system for financing or providing care.

National Health Service (United Kingdom). One type of approach to the financing and provision of medical care services is that employed in the United Kingdom under the National Health Service. Medical care services (rather than reimbursements) are provided to all residents. The major costs of the program are financed out of general revenues, although small weekly contributions are paid by workers and employers. Some cost sharing by the patient for medicines and selected other items is also provided for. All residents are eligible for such services as general and specialized care, dental care, and hospitalization. The services are provided by doctors and druggists who are under contract to the National Health Service and by public hospitals owned by the central government. The physician receives a payment for each person on his rolls.
This capitation method of payment of physicians encourages patients to have a continuing relationship with a particular general practitioner. The capitation method of payment, as contrasted with a payment for services rendered, is considered not only to be a good financial arrangement but also to represent a basic philosophical approach to medicine which is believed to have advantages that go beyond ease of administration. The United Kingdom thus has a pattern of financing and organizing medical care substantially different from that of the United States, or from that which the latter will employ for those 65 years of age and over. Perhaps 85 per cent of all health expenditures in the United Kingdom are paid for by third parties, a much larger percentage than in the United States. The amount of resources allocated
to health services in the United Kingdom (4.2 per cent of a lower GNP) also differs from that in the United States (as indicated previously, 5.8 per cent of GNP). It is not clear, however, what part of these, and other, intercountry differences is attributable to differences in relative price levels (i.e., the relative costs of health services as compared with other goods and services in each country), differences in efficiency in the health sector, differences in levels of health and "needs," differences in quality of care, and differences in real resources devoted to health care. But it is clear that there do exist wide differences in the percentage of GNP devoted to health services in various countries. The significance and implications of these differences have yet to be thoroughly examined, but they do not seem to be related to the prevailing patterns or sources of finance or to the significance of the government sector in the health area.

Basic alternatives. The three main patterns of nonmarket provision of medical benefits in the various countries are (1) provision of direct services through facilities owned and operated by the government or by a social insurance fund; (2) services provided to patients, with the provider of services being paid directly by the government or fund; (3) reimbursement of fees paid directly by patients to providers of the services. As will be discussed below, the need for rationing of services still remains under the various arrangements, and patients are, therefore, often called upon to share the costs of the services rendered. While a number of countries provide comprehensive medical services to all residents (although patients may pay part of the cost of the service provided), and an additional number of countries provide various selected therapeutic services, the particular benefits available and the mechanisms by which care is financed, provided, and organized differ greatly.
In some countries medical services are provided to parts of the population under social insurance arrangements; in others, benefits are provided through membership in sickness funds to which employees, government, and (in some countries) employers contribute and in which membership is compulsory for certain categories of persons. Particular financing mechanisms are not necessarily tied to particular organizational arrangements, and thus many different combinations exist. Economic problems Need for rationing. Whatever the organizational pattern for distributing and financing medical care, the absence of a perfectly operating market for medical care presents certain difficulties. Although medical care is often viewed as a
basic human right, few scientific standards exist to define the amount of care that an individual "needs" or should utilize (i.e., the health benefits associated with given amounts of care). However great the public dissatisfaction with the medical care received, it is a fact that some individuals tend to utilize more services than they require or are believed to require. Furthermore, changes in the price of medical care services do tend to change levels of utilization. Declines in price are associated with increases in usage. Although an increase in utilization is often one of the purposes of government in creating systems that reduce price or provide insured health benefits, allocation problems remain. If society were to allocate sufficient resources to medical care so that no choice need be made between care for the individual who is dying but can be saved and the hypochondriac who wants reassurance (that is, if we were prepared to meet all increases in, and levels of, utilization), then rationing of medical resources would not be required. But such a situation is neither conceivable nor, given competing demands on resources, desirable. Thus rationing becomes imperative, and since society is willing to make some interpersonal utility comparisons regarding who needs care, the rationing scheme must incorporate the public's judgments. Nevertheless, since there is little scientific agreement on desirable utilization rates (or on the relationship between utilization rates and levels of health), it is difficult to agree upon, set, and control such rates and to develop medically "optimal" rationing devices. In the absence of control of usage, budgetary, physical, and personnel shortages often appear, particularly because consumers often demand more care than the authorities have estimated would be the case. 
Attempts to balance resource limitations (including budgetary considerations) against the private decisions to visit physicians, to enter hospitals, and to utilize medical care in quantities determined by the individual (the balancing of "scarcity" and "human right") have always been necessary and have seldom been easy to achieve. Were the health sector treated as most other services are generally treated, this particular problem would not exist. If consumers, out of their own budgets and without public or private subsidization, chose to "overutilize" services (judged by some scientific standard), this would be fully acceptable. The price mechanism would serve as a rationing device and, however "foolish" some might feel these expenditures to be, the consumer would be sovereign. Since government plays a role in the health
field, even if in many cases it is limited to support for construction of physical resources (e.g., hospitals), or to training of personnel (e.g., supporting education), or to financing of certain types of care (e.g., for needy or aged persons), the government must be involved in the development of rationing devices to replace market forces. The same need exists, of course, when private physicians offer charity care. Controlling utilization. The problem of "overutilization" (which can be viewed as the problem of scarcity) arises in all medical care financing mechanisms. The mechanisms of control when there is governmental involvement or control by a voluntary insurance carrier may, of course, differ. It is true that for many services waiting time, inconvenience, and even loss of income serve as deterrents to use. Even so, the use of "free" (but not costless) services may exceed the necessary, or "need," level. Therefore, additional monetary deterrents are often used, in the attempt to control overutilization. In particular, they are used to influence the consumer's choice between different types of care, e.g., office calls rather than house calls, home care rather than hospital care. Payment of some percentage of the costs (e.g., of hospital care) or in the form of some fixed charge (e.g., for a house call) is often instituted in order to induce the consumer to ration the care he seeks and in order to lead him toward those types of care which utilize less of society's economic resources. Such charges are also often made for prescription drugs. Since the use of such drugs is governed largely by the physician rather than by the patient, these charges may be instituted primarily as a device to cut the costs financed by the program rather than in order to change consumer behavior (although they do this as well). 
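The two monetary deterrents just described, a percentage of the cost and a fixed charge, can be combined in one simple rule. The sketch below is purely illustrative: the function name `patient_payment`, the prices, the 20 per cent rate, and the fixed charge are hypothetical figures chosen for the example, not values from any actual program.

```python
def patient_payment(charge, coinsurance_rate=0.0, fixed_charge=0.0):
    """Amount the patient pays under a simple cost-sharing rule.

    The patient pays a fixed charge plus a percentage of the bill,
    but never more than the bill itself. All parameters are hypothetical.
    """
    return min(charge, fixed_charge + coinsurance_rate * charge)

# Hypothetical prices: a house call costs more than an office call,
# and the plan adds a fixed charge to steer patients toward office calls.
office_call, house_call = 5.0, 10.0
for label, charge, fixed in (("office call", office_call, 0.0),
                             ("house call", house_call, 2.0)):
    paid = patient_payment(charge, coinsurance_rate=0.2, fixed_charge=fixed)
    print(f"{label}: patient pays {paid:.2f}, plan pays {charge - paid:.2f}")
```

Under these assumed figures the patient pays 1.00 for the office call and 4.00 for the house call, so the more resource-intensive service carries the larger out-of-pocket cost.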
The difficulty, sometimes unrecognized, with these procedures is that the money deterrent has a differential impact on individuals, depending greatly upon their "taste" and need for medical care, the accessibility of medical care, their total income, and the prices of other goods and services. What may be a small deterrent for some (perhaps the "optimal" deterrent from the point of view of the third party that finances the service or from the point of view of society) may represent a great deterrent for others and a trivial deterrent for still others. Income and price considerations thus reenter the medical market place, although with less impact than would be the case in the absence of third-party payments. Furthermore, it is clear that the conflict recurs between encouragement to use services in the "correct" quantities and rationing of
care. No single level of fixed charges applying to all members of the population and designed to deter unnecessary use, but not to inhibit necessary use, can be the correct level for each individual. The operational consequences of the fixed charge will vary with a number of institutional considerations. It cannot be presumed that the consequences are necessarily severe (if maintaining accessibility is the primary goal) or are necessarily minor (if cutting utilization is the primary goal). The fixed charge in third-party payments does illustrate the fact that it is difficult to define that part of medical care which society considers a luxury and that part which is considered a need. With scarce resources a trade-off between goals is required, and it is difficult to determine.

Future developments in the provision and financing of medical care will be related to developments in medicine itself and to the acquisition of additional knowledge concerning the relationship between medical care and health and between health and productivity. The growing ability of medicine to prevent illness and to care for and rehabilitate the sick will bring increased pressures on the health sector. In most nations the role of government in health is likely to increase. In part this will take the form of intervention in the financing of services via social insurance or other arrangements. The financial responsibility, as well as the special characteristics, of medical care outlined earlier will result in government's assuming an increasing responsibility in the determination of the total amount of manpower and resources devoted to health and medical care and to the allocation of resources within the health sector itself. The meeting of this responsibility will involve an increased reliance on new evaluative techniques.
In recent years, partly as a result of the pressures for health expenditures and for planning of medical care in the less developed countries, and partly as a result of the refinement of benefit-cost analysis, the analysis of medical care has acquired an increasingly analytic economic content. In this approach, health and medical care are viewed in an economic context, and medical care programs are evaluated in relation to their cost and their potential impact on productivity through lower levels of morbidity and mortality. The economic value of man as a producer lies at the heart of the analysis, and the costs of various diseases are calculated in terms of the direct costs of treatment and medical care and the indirect costs to the economy of loss of productive power. Decisions concerning the allocation of resources
to the health sector will, of course, not be based solely on the measurement of pecuniary costs and benefits. Nevertheless, such measurements are of assistance. It can, therefore, be anticipated that as understanding of the methodology of benefit-cost analysis grows, it will be increasingly used in governmental evaluation, planning, and budgeting processes.
RASHI FEIN

BIBLIOGRAPHY
The World Health Organization (Geneva) periodically issues publications on the level of health and on the financing and organization of health services in particular countries.
KLARMAN, HERBERT E. 1965 The Economics of Health. New York: Columbia Univ. Press.
Social Security Bulletin. → Published since 1938. Regularly contains articles on the economic aspects of medical care.
U.S. DEPARTMENT OF HEALTH, EDUCATION, AND WELFARE 1964 Social Security Programs Throughout the World: 1964. Washington: Government Printing Office.
WINSLOW, CHARLES E. A. 1951 The Cost of Sickness and the Price of Health. Geneva: World Health Organization.
MEDICAL PERSONNEL
i. PHYSICIANS    Eliot Freidson
ii. PARAMEDICAL PERSONNEL    Eliot Freidson
PHYSICIANS
The physician is the most prominent among members of the generally recognized professions. He is seen by the public as possessing a higher standard than any other professional, and by the sociologist as the virtual prototype of his kind. While it would be a great mistake to confound what is peculiar to medicine with what is characteristic of professions in general, the study of physicians does offer the sociologist the opportunity to test both the truth and the utility of various orientations toward the concept of a profession. One orientation sees a profession as an aggregate of people finding identity in sharing values and skills absorbed during the course of intensive training through which they all have passed in order to become professionals. In this view the professional is primarily a particular kind of person; one determines whether or not an individual "is" a professional by determining whether or not he has internalized certain given professional values. One explains a "bad" professional by reference to his inferior education, his defective character, or
similar variables. In short, one explains the behavior of members of a profession by reference to individual attributes and experiences bearing on conformity to a given set of norms. Another orientation sees the profession as a group of workers joined together on the most general level by virtue of sharing a particular position in society and by common participation in a given division of labor. More specifically, the behavior of the profession is interpreted by reference to the way in which its work life is organized and the pressures toward conformity or deviance implicit in that organization. Here, the general assumption is that one defines a professional by his status, irrespective of the norms to which he subscribes, and explains his behavior by reference to the work structure in which he participates. One difficulty in assessing the virtues of each of these orientations is the fact that there has been little attempt at testing them by sustained and detailed analysis of any single profession. And there has been little of the comprehensive comparative analysis that must be the ultimate goal of the sociology of medicine. Furthermore, because one of the marked characteristics of established professions is their relative freedom from lay intervention, from the conventional discipline exercised by industrial employers, and from the detailed directives of crafts unions, both organization and structure have been difficult to perceive. Professional organization is usually taken to be synonymous with the formal professional association, and the actual organization of work or practice has gone largely unnoticed. This article, by attempting a detailed analysis of the medical profession and by focusing particularly on the way the performance of medical work is controlled, will try to clarify both the sociological characteristics of the medical profession and some of the issues germane to the sociology of the professions. (For a more extended analysis, see Freidson 1966.) 
Because of the paucity of systematic empirical studies from other countries, it is regrettably necessary to run the risk of parochialism and concentrate on medicine in the United States. Medicine and the state The foundation on which the analysis of a profession must be based is its relationship to the ultimate source of power and authority in modern society—the state. In the case of medicine, much, though by no means all, of the profession's strength is based on legally supported monopoly over practice. This monopoly operates through a system of licensing that bears on the privilege to hospitalize
patients and the right to prescribe drugs and order laboratory procedures that are otherwise virtually inaccessible to the layman. It is the state that grants this monopoly, the exact form of which varies widely throughout the world. In the United States the profession, through its private associations, has very largely been given the right to determine how political and legal power bearing on medicine shall be exercised (see especially Hyde et al. 1954). In such countries as Great Britain, where the state has set up a national health system, representatives of the independent and private professional associations sit on both policy-making and administrative boards and negotiate with the state on various issues influencing practice (Stevens 1966). In the national health system of the Soviet Union there is no really private or independent representation of the profession that can negotiate with the state, although advisory and administrative councils do include physicians (Field 1967). Clearly, the economic and political autonomy of the medical profession varies from country to country. What seems invariant, however, is its technological or scientific autonomy, for everywhere the profession appears to be left fairly free to develop its special area of knowledge and to determine what are "scientifically acceptable" practices. In national state health systems, although laymen do serve in policy-making and administrative positions, physicians tend to be administrative heads of practicing units and to be responsible for the determination of technical standards of equipment, procedures, and performance. Thus, while the profession may not be everywhere free to control the terms of its work, it is free to control the content of its work. Similarly, it is free to control the technical instruction of its recruits. 
Medical training The medical profession, quite as much as most sociologists, considers medical education to be the major single factor determining the performance of the practicing professional. By the content of his education the student is "socialized" to become a physician. The assumption is that in the course of such an education a new kind of person is created. Medical education in the United States is perhaps unparalleled by any other conventional professional training in its duration, its detail, and its rigidity. Medical school lasts four years after undergraduate college, followed by a fifth year of supervised practice in an accredited teaching hospital (the internship), and even more years for those seeking certification as specialists. It would seem
reasonable to think that such intensive exposure in fact molds the student into a particular kind of person. The Columbia University study of medical education (Merton et al. 1957) sought to demonstrate that the student, in the course of his training, develops a conception of himself as a doctor, absorbs the knowledge he needs in order to be secure enough to deal with patients without too much anxiety, and attains the capacity to cope with basic uncertainty in clinical practice. Nonetheless, the students' perspective on their educational experience differs from that of their instructors. Indeed, one may expect there to be some kind of conflict between students and faculty by the very nature of their different roles. It is the unique contribution of the University of Kansas study (Becker et al. 1961) that it demonstrated the clash of perspectives in medical school and showed that the differences in orientation leading to "restriction of production" are not limited to industrial organizations. The consequences of this for the educational process were followed up in some detail. But the study also emphasized that the existence of this clash of orientations did not mean that there was nothing in common between the performance of students and the demands of faculty. It was discovered that two dominant values, held by the faculty, were adopted by the students and used by them to guide their learning experience and select their careers. These were the values of medical responsibility and of clinical experience (ibid., chapters 12 and 13). The value of medical responsibility refers to the traditional ideals of medicine, according to which the physician holds the life of a patient in his hands. It is the personal responsibility held by the physician working directly with a patient that requires him to take the blame for bad results.
In the Kansas study, it was found that this value was impressed on the student by frequent faculty lectures about the way in which mistakes of omission or commission endanger the patient's life. Furthermore, faculty members frequently asked students how they would handle an emergency so as to avoid serious consequences to the patient. The value also featured in the organization of the training hospital, where the hierarchy of medical staff was ordered by differential access to medical responsibility, so that the unlicensed student was restricted to routine work having little relationship to life-or-death issues and the highest-status person was free to carry out the most complicated and dangerous procedures. Clinical experience refers to first-hand contact with patients and disease. Such contact is the
ultimate justification for deciding to use one procedure for a treatment rather than another, and the experience so gained is valued because it provides a basis for therapeutic choice that is believed to be superior not only to the abstract considerations posed in textbooks but even to general, scientifically verified knowledge. It was observed in the course of the Kansas study that argument from experience was unanswerable except by the same type of argument from someone with greater experience. These two values, Becker and his colleagues argued, order the choices the student makes from the range of experience offered by the medical school. These choices limit and direct his efforts in ways not anticipated or approved by the faculty. One of the student's most difficult problems is to select from the enormous mass of facts presented to him the information he is really to learn, for he cannot learn all of it. The idea of clinical experience, it was suggested, guides his selection of facts and information, leading him to discount basic science and focus on classes in which instructors give practical information not found in books— information that adds to his store of vicarious experience. By the same token, he struggles for personal clinical experience in his training, seeking opportunities for expanding it and deprecating routines he has already mastered. Similarly, he seeks tasks in which medical responsibility is apparent—reflecting some risk or danger—and avoids those in which it is not. And, finally, he responds positively to some patients as cases presenting him with valued responsibility and experience, and to others negatively as cases that take up a great deal of time without any valuable recompense (Becker et al. 1961, chapters 14 and 16). The evidence seemed to show that choice of career also hinges on how far specialties provide the opportunity for medical responsibility and clinical experience. 
Thus, a desirable specialty is one which offers a wide variety of experience and in which responsibility is symbolized by the possibility of killing or disabling patients in the course of making a mistake. Internal medicine, general surgery, and pediatrics are therefore the most popular specialties, although the potentially "mechanical" character of surgery and the necessity of liking to work with children in pediatrics qualify their desirability for some students. At the other extreme, specialties like dermatology and allergy are unpopular because they are thought to involve little danger (and therefore little responsibility) or little variety (ibid., chapter 20). National surveys of medical students in the United States have accumulated a fair amount of data on specialty choice, most of which are compatible with this interpretation of the values underlying the choices of the majority of students. In the case of those choosing the less popular specialties—the best-investigated of which is public health (Coker et al. 1966)—specification of more detailed patterns of values is of course necessary. Empirical types of practice Unmentioned in the course of the discussion of medical training is one element of great relevance to performance: the technical knowledge and skill learned by the student. Here, it is necessary to say only that the student does in fact gain command over a great deal of knowledge and skill; what we must dwell on is the fact that this knowledge and skill is not necessarily retained or used after graduation from medical school. While in a modern industrial country like the United States all physicians share the same basic technical education, they do not all practice in the same way. In the few systematic studies of medical practice that have been made, the association found between medical education and subsequent performance was at best very weak. While the available evidence is scanty and poor, it points to variation in the organization of practice—that is, in the organized setting in which the professional works—as a more important influence than medical education on variation in performance. (For a summary of this material, see Freidson 1963; for the parallel case of the lawyer, see Carlin 1966.) Indeed, the analysis of work organization or practice is a critical problem for the sociology of the professions. The central issue in the analysis of work is how performance is controlled. This issue constitutes a special problem for the analysis of professional work because professions, unlike other occupations, have successfully gained freedom from control by outsiders. Indeed, a profession is said to control its own performance.
This is a rather unusual arrangement, worth understanding both in itself, as one type of control, and in its bearing on how, in our complex world, freedom and autonomy can be joined with responsibility. Let us examine the various organized practices in which medical work takes place to see how control over performance can be exercised. (For a more extensive examination, see Freidson 1963.) One type of practice that is frequently held up as the ideal by professionals is one in which the individual is an entrepreneur, free to do what his own conscience and knowledge dictate. This is so-
called solo practice. While pure forms of solo practice are quite rare—invoking it as a norm reflects the individualistic ideology of the profession more than it reflects reality—we might ask what conditions must be met to assure that individuals practicing entirely on their own conform to professional standards. Assurance of adequate performance on the part of solo practitioners seems to require exceedingly careful recruitment policies and extraordinarily effective educational procedures. In essence, the practitioner must be able to resist all temptations to ethical or technical lapses by virtue of his inner resources alone, resources which must also motivate him to continue to keep up-to-date. In solo practice the burden of control rests solely on individual motivation and capacity. Much more common than solo practice in the United States today is practice involving a loose network of interdependent practitioners who refer cases to each other—an informal organization that has been described as a "colleague network" (Hall 1946). Backed by a stable clientele relatively loyal to them, the practitioners in such a network control access to that clientele and thus access to work on the part of new young practitioners. In the rather well-organized case he studied, Hall showed how an "inner fraternity" of practitioners controlled access to practice settings and desirable patients and how, through the mechanism of sponsorship, newcomers were obliged both to take on minor tasks and to turn to their sponsors for consultation. While it may be doubted that professional services in large cities can be wholly dominated by any single informal fraternity, the sociometric studies of Coleman and his colleagues (1966) suggest that there are systematic and persistent patterns of interrelationships among practitioners even in so loosely organized a system as exists in the United States. 
These patterns of interaction suggest two of the most important prerequisites for control of the practitioner's performance by colleagues rather than by clients: by referring patients to each other, each practitioner has the opportunity to observe some of the other's performance; by being economically and technically interdependent, each practitioner has some leverage to influence the other's performance. Finally, one may mention the less primitive structures of practice that are characteristic of some European countries and are represented in the United States by large group practices and university clinics. These are essentially bureaucratic organizations, although the variations in actual administrative detail are countless. We may point to one logically distinct type of bureaucracy that has
received some theoretical attention in the literature because of its systematic deviation from the classical rational-legal model of bureaucracy. It has been called professional bureaucracy, and it has been characterized as a form of organization in which the hierarchy of professional practitioners is set apart from the hierarchy of the administration itself or (as in many European countries) a form of organization in which all important positions of organizational authority are filled by professionals. In both cases, professional work is free from the exercise of the authority of nonprofessionals even though the working professionals are technically subordinates in a bureaucratic system and lack the freedom of the entrepreneur. The exact theoretical importance of such a logical construct and the degree to which it mirrors enough of reality to be useful are by no means clear, but by pointing to the bureaucratic elements of organization it does indicate that here, more than in other forms of practice, physicians are in a position of interdependence that implies opportunities to observe and to exercise influence over one another's performance. Of all the types of practice reviewed, the bureaucratic type provides the best opportunity for professional self-regulation. Indeed, this is the type exemplified by high-prestige academic institutions in the United States and elsewhere. Analytical types of practice Thus far, it has been suggested that there is a range of practice organizations, from purely individual practice to bureaucratically organized practice. To understand how colleagues or clients may gain access to observe and influence performance, it is useful to distinguish those features of practice which determine both the source and the content of control. In this way it becomes possible to analyze the differential significance in the division of labor of various forms of specialization.
The lay client's perspective on the service he seeks differs from that of the colleague group of professionals: this may be taken as axiomatic. Let us therefore distinguish practices by the degree to which they are amenable to lay or colleague control. It is clear that two types of medical practice form the logical extremes of the medical division of labor. At one extreme is practice wholly dependent on lay choice for its existence: it may be called client-dependent practice. Such a practice survives by using its own resources to attract and satisfy a lay clientele. Since the client uses lay standards in deciding that he needs professional services and in evaluating the services he gets, the practice must conform to lay standards in order to be patronized. Furthermore,
when wholly dependent on client choice, the practice cannot be observed by colleagues, nor is its survival dependent on their cooperation. In consequence, all the pressure on the practice is toward conforming to lay rather than professional standards. At the other extreme is colleague-dependent practice. It does not attract its own lay clientele but, rather, obtains clients through the referrals of other colleagues. Thus, in order to survive it must honor the prejudices of colleagues, and so is likely to conform more to professional than to lay standards. How closely do actual practices conform to these logical types? The logical extreme of client-dependent practice does not seem fully applicable to any professional practice, although the "independent" solo neighborhood or village general practitioner comes close to it. Also close are specialists who must attract a clientele directly and do not have to make everyday use of hospital facilities—for example, in urban areas in particular, some internists, pediatricians, ophthalmologists, and gynecologists. In these instances lay standards may be expected to have some force. Empirical examples of the logical extreme of colleague-dependent practices are easier to find in modern medicine. Specialists like pathologists and radiologists, for instance, are almost completely dependent upon colleague referrals and therefore have little need for such client-oriented techniques as a good bedside manner. Here, we should expect considerably greater pressure to honor colleague rather than lay or patient standards. This typology is based on the division of labor within the profession and is therefore applicable to analysis of the control of performance of individual practitioners in any kind of organized practice, from solo to bureaucratic.
It might be pointed out, however, that in bureaucratically organized practice it is frequently the organization as a unit, not individual practitioners, that attracts a clientele and that all practitioners in the organization are therefore dependent on it for their work. Insofar as the organization is of the "professional" type discussed above, this means that dependence on it is actually dependence on the colleagues running it. Encouragement to meet professional standards of performance will therefore be considerably stronger than encouragement to meet lay standards. And insofar as work is at once more visible and amenable to control in such an organization than in less well-organized forms of practice, it is here that we should expect to find the highest professional standards. Indeed, it is the general opinion of teachers of medicine in the United
States, Great Britain, and elsewhere that this is the case, although adequate evidence to test this opinion has not yet been gathered. All else being equal, then, we may hypothesize that colleague-dependent practices, in which the physician's performance is observable to and his work dependent on colleagues, will also be most likely to conform to professional standards. Insofar as bureaucratic practice is colleague-dependent, the same conclusion may be drawn for it. But this conclusion masks several assumptions the truth of which is not self-evident: first, that colleagues will exert control over performance; second, that the mechanisms of control used by colleagues are effective; third, that standards are homogeneous throughout the profession. The remainder of this article will explore these assumptions. Professional regulation Variation in the organization of medical practice bears on such necessary conditions for the exercise of professional regulation as the observability of performance to colleagues and the structural vulnerability of the practitioner to control by colleagues. However, while observability and dependence are necessary conditions for the effective exercise of supervision, they are not sufficient. What is needed in addition is the willingness to exercise supervision and exert effective influence over performance. What slender evidence there is suggests that rather less influence over performance is exercised than the organization of practice actually allows, and that the little regulation which does exist has properties that establish and maintain organized differences in performance standards. The basic property of the system of control that seems to exist in the United States is its reliance upon what Carr-Saunders and Wilson, speaking of British medicine, call the "boycott"—that is, the refusal by individual practitioners to enter into a referral or collaborative relationship with those of whom they do not approve (1933, p.
403; see also Hyde et al. 1954). This device does not control the boycotted person's behavior so much as it pushes him outside the boundaries of observability and influence, to practice as he wishes in the company of those with similar standards. There seems to be a certain reluctance to exert active influence over another's performance—a reluctance that results in avoiding him rather than in seeking to change him. There is, unfortunately, little systematically collected empirical information bearing on the process of supervision and control among physicians. A study by Freidson and Rhea (1963; 1965) of a
large academically oriented group practice in the United States indicated that while performance was visible along the axes dictated by the interdependence of specialties within the over-all division of labor, each practitioner tended to keep his complaints about others to himself, so that what he could observe of others' performance in the division of labor was not transmitted to other colleagues. Since bits of information were scattered piecemeal through the colleague group, no really organized control of performance could be initiated unless a man behaved so outrageously as to personally offend everyone. Furthermore, attempts at control were largely individual and hortatory, and there were no control devices intermediate between remonstration and outright ejection from the organization (the latter being the structural equivalent of the boycott). While the physicians studied were aware of the looseness of supervision and control in this ostensibly well-organized practice, they were inclined to feel it adequate and appropriate for ordinary circumstances. Another American study (Goss 1961; 1963) is particularly instructive because it was done in a setting into which supervision was built. There were clear bureaucratic as well as professional supervisory responsibilities allocated through hierarchical ranks. The superior physician in the hierarchy had the right and perhaps even the obligation to review case records and evaluate case management. Furthermore, he had the right to give advice to subordinates about the way a case should be handled, even when advice was not solicited. However, even though the supervisor was officially responsible for the care given to patients in his unit and therefore had the formal right to order that certain procedures be followed for a case, he very rarely gave such orders. Instead, he gave advice, which incurred no obligation to obedience. 
The only obligation the subordinate had was to consider the advice in the light of his personal experience with and responsibility for the case. So long as he could justify his management of the case by reference to medical knowledge and his clinical experience, and so long as it was he who took personal responsibility for the outcome, he could reject the advice of his superior. In short, even here, where supervisory inspection of performance was routine, the exercise of control over performance was quite loose and permissive. If this is so in hierarchically organized practice settings, it should be even more the case in the informal, small-scale community practices that are far more common in medicine. Thus, we may say that the medical profession, which has gained freedom
from regulation by others, regulates itself in ways whose effectiveness is not self-evident. The analytical problem here is to understand what contributes to shaping this peculiar process of regulation and to point up its structural consequences. Professional values Obviously, when a social structure permits certain kinds of behavior but that behavior does not occur, we must explore the situation further to explain why it does not. Our first question might be why, in such a loose system, the physician does not routinely abuse his privilege. Here, the internalization of general professional values postulated by Parsons (1951, chapter 10) seems a plausible explanation. Parsons defines the professional as someone who is supposed to be recruited and licensed on the basis of his technical competence rather than his ascribed social characteristics; to use generally accepted scientific standards in his work rather than particularistic ones; to restrict his work activity to areas in which he is technically competent; to avoid emotional involvement and to cultivate objectivity in his work; and to put his client's interests before his own [see PROFESSIONS]. These normative expectations are intended by Parsons to apply to all professions, not only to medicine, since he treats the medical practitioner as the archetype of the professional. But it may be objected that the same expectations are applicable to all technical service occupations, not only to professions. Plumbers, too, are supposed to be recruited and licensed on the basis of achievement, to employ universalistic standards in their work, to be functionally specific and affectively neutral. And while plumbers are expected to make enough money from their work to gain a decent income (just as are physicians), they are not expected to do this by cheating the customer. Thus, such values constitute only the most general foundation of conscientiousness in occupational practice.
Our second question, however, may be more specific to the medical profession. Why, if it is so conscientious, does it not exercise more regulation over its members' performance? Such an extraordinarily loose regulatory structure has been explained by Carr-Saunders and Wilson (1933, pp. 399-400) and by Parsons (1951, pp. 470-471) by reference to the character of professional work. Instead of a set routine, medicine requires the exercise of complex judgment; instead of caution, the taking of risks. Therefore, regulation can only be loose. But of all the old established professions medicine is the one most based on fairly precise and detailed scientific knowledge. Indeed, the practice of medicine involves considerably less uncertainty than many other technical occupations. As the use of the doctrine of res ipsa loquitur in American courts implies, there are some very clear rights and wrongs in medicine, even if there are also some uncertainties, and these rights and wrongs have not brought forth any formal regulatory mechanisms from the profession as such (as opposed to concrete organizations like teaching hospitals, in which regulatory mechanisms do exist). Without denying that there is a degree of uncertainty, we must conclude that the precision possible in much of modern medicine and the trivial routine of much of everyday medical practice call into question the adequacy of explaining the peculiarly loose structure of controls to be found in the profession by reference to the character of its work. However, it may be that the peculiar nature of the work of the practicing professional encourages a characteristic sense of uncertainty that reflects considerably more special values than those described by Parsons. One such value is that of independence, or autonomy, which is significant for physicians in countries as different as Finland and the United States. Insofar as this value refers to social and economic independence, it reflects the entrepreneurial and individualistic ideology of the bourgeoisie, who are the prime source from which physicians are recruited in virtually all industrial countries. Insofar as the value refers to technical or professional independence—that is, the freedom to practice one's craft without interference, advice, or regulation by others—it seems more closely related to a state of mind encouraged by the character of professional work. The aim of the practitioner is not knowledge but action, and while successful action is the aim, the tendency is to assume that any action at all is better than none.
Furthermore, to take action requires faith in oneself and even a will to believe in whatever one does instead of maintaining a skeptical detachment. Dealing with individual and concrete cases, the practitioner is inclined to emphasize indeterminacy rather than lawful regularity and to be radically pragmatic, relying more on the results he associates with his own actions than on theory. These seem to be the orientations that contribute to the emphasis on clinical experience mentioned earlier in connection with medical education. Given that the work of the medical practitioner is with individuals and that it is believed to be based on individual clinical experience, it follows that responsibility for the work can be perceived only as individual and personal. In assuming that
responsibility, the practitioner does gain gratitude for success, but he also gains reproach for failure. Given the risk of blame, he evidences a certain sensitivity and defensiveness in the face of any outsider's evaluation of his performance. This defensiveness is manifested in imputing more uncertainty to the work than in fact exists and in insisting on using his own personal, clinical experience as the ultimate criterion for evaluating his own performance. Thus, collective responsibility for regulation is diminished, and the inclination to rely on individual responsibility and personal experience is augmented. These, of course, are merely suggestions of the complex task that has still to be done in picking out and interrelating strands of what might be called the ideology of the practicing professional. Instead of dwelling on values of such generality that they have doubtful analytical utility for understanding the quality of professional self-regulation, sociologists should determine the specific values attached to different types of professional work. Such an approach would supply one of the critical elements for an adequate explanation of the peculiarities of professional organization. Informal organization of the profession Even without detailed information it seems possible to suggest that the notion of informal organization serves as a vital link between the formal structure given to the American medical profession by the national, state, and county medical associations (see Hyde et al. 1954) and professional performance in the concrete setting of medical practice. By focusing on the characteristic way in which practitioners assert control over each other's performance, one can delineate the relation of one local practice to another and the loose groupings of practices that both carve up a community and extend outside its boundaries. 
When these informal groupings and the mechanisms of control they express are seen to be intertwined with the formal structure of the profession as a whole, much more of the character of the profession can be understood than by reference to formal associations and codes alone. Recalling that ordinarily the ultimate mechanism of control is the personal boycott, we can begin to indicate the informal structure of the profession by following out the implications of the boycott's operation on the interrelations of practitioners. Let us assume that individual practitioners are free to select the work they will undertake and to choose the colleagues with whom they will work in the division of labor. In this situation, control of professional standards is exercised largely by
willingness to work with one man and to exclude another. But since exclusion as such neither changes a man nor prevents him from working, one may assume that he will eventually find a circle whose standards are such that he is not excluded. There is thus a tendency for the control process to develop a stable set of colleague networks or fraternities. Each network, by the nature of its creation, is fairly homogeneous within itself. Its members share about the same professional standards, participate in each other's work, and participate in, if not dominate, the particular organizations and practices in which they work. But while each colleague network is likely to be fairly homogeneous, many differences are likely to exist between networks by virtue of the process of selection and rejection that differentiates them into separate networks. Thus there is not only likely to be little interaction between many contiguous networks, but also marked differences in technical and normative standards and in the practices and institutions in which the members of each network participate. In a structure of this nature there is comparatively little opportunity for those in one network to be very much aware of the existence of other standards; and even when awareness may exist, there is little leverage by which one network could influence another because each has severed connections with the other and is independent of the other. Since it is a segregating process that leads to and maintains such networks, and since the individual's behavior is less regulated by such a process than classified and assigned to a self-maintaining collectivity of like people, we can see how within a single profession, even one quite free of lay interference, organized variations in professional performance can occur.
While there are certainly social links between adjacent fraternities in the form of practitioners with connections in both, it does not seem to require a very large city to find individual practitioners who know nothing about each other. The characteristic control mechanism of professional regulation, then, paradoxically operates to place offenders beyond the control of those who disapprove of their performance. Moreover, the informal organization of internally homogeneous colleague networks segregated from interaction with each other sustains, if not reinforces, the differences in standards between networks. Apart from civil suit, which is a nonprofessional source of control over practice, and regulatory devices established in the limited milieu of teaching hospitals, all that is left to concerned members of the profession in the United States is exhortation and,
it is hoped, instruction by means of articles in professional journals that may or may not be read and that, if read, may or may not be influential on behavior. What has been suggested, in short, is that the disjunctive process of social control characterizing the concrete, everyday practice of American physicians creates an informal structure of relatively segregated, small circles of practitioners, the extremes of which are so isolated from each other that the conditions necessary for each influencing the other's behavior are missing. Furthermore, the mechanism of control that produces and sustains this situation is no aberration; rather, it is characteristic of the profession, an outcome of its organization and of the way it sees itself and its work. The consequence is that a single profession can contain within itself, and even encourage, markedly different ethical and technical standards of performance, limited in a very superficial way by the minimal standards imposed by selective recruitment, a basic core of training required for licensing, and the writings of the leaders of the medical profession.

Tasks of a sociology of medicine

The problems of analysis described in this article are not unique to the sociology of medicine but affect the sociology of professions in general. If they can be solved for medicine, we will have taken a long step toward solving them for all professions. The central problem here, as in the study of society in general, is social control. The problem is particularly important for the professions because by definition they are free of the controls common to most occupations. In addressing the problem of control, it was necessary to assess the role of the state and of politico-legal institutions, the manifest and latent functions of professional education, the organization of work, the control processes operating in work, and the norms or values that bear on the exercise of social control in work.
The outcome of that analysis was the suggestion that a fragmented structure underlies the serene façade of unity and homogeneity implied by the notion of a single profession joined by common values and a community of identity. To extend, correct, and refine such a trial analysis of the organization of medical work is one of the prime tasks of a sociology of medicine. In the course of extending it, one would be led quite naturally into a more detailed examination of another major problem of analysis: the client-practitioner relationship. This problem, too, might be seen as one of control. The practitioner wants the client to seek him out for professionally appropriate reasons,
without visiting quacks and without untoward delay. He wants the client to accept his recommendations and follow them scrupulously. In seeking compliance on the part of his client, the professional cannot always rely on his influence as an expert. The character of this influence and of practitioner-client interaction has barely been explored in other than psychological terms, and poses a challenge both to the taxonomy of types of social influence and to the conceptualization of social interaction. Finally, we may mention a problem of analysis that has not yet received much attention: the role of the professional in creating and defining his own work. In the case of medicine (more than of law or religion) this analytical problem has been confounded by the reification of "scientific knowledge," a viewpoint in which disease is taken to exist independently of human action and the physician is regarded as merely a diagnostician and therapist of what is objectively "there." However, disease that exists independently of human awareness and action is irrelevant to the sociologist, while biologically nonexistent "disease" in which people believe is quite relevant. What is sociologically relevant is a social definition of disease or any other kind of deviance, not the biological fact or fancy. If this premise be adopted, then it follows that physicians are responsible for the social creation of disease in the course of "discovery" and diagnosis [see HEALTH]. It would follow, further, that in medical practice the social organization of work biases the way in which diseases are created and shapes the way in which patients are managed and even created by diagnosis. Thus, a major task of the sociology of medicine is to study the causes and consequences of physicians' conceptions of disease, showing how disease as a social object is created or formed by medical institutions (Scheff 1966).
If this task should be performed as something independent of the conventional study of the process of scientific discovery, and with different premises, we will have come a long way toward understanding the social institutions of medicine as one of the modern professions.

ELIOT FREIDSON

[See also HEALTH; ILLNESS; MEDICAL CARE; PROFESSIONS; PUBLIC HEALTH; SCIENCE, article on SCIENTIFIC COMMUNICATION; and the biography of HENDERSON.]

BIBLIOGRAPHY

BECKER, HOWARD S. et al. 1961 Boys in White: Student Culture in Medical School. Univ. of Chicago Press.
CARLIN, JEROME 1966 Lawyers' Ethics: A Survey of the New York City Bar. New York: Russell Sage Foundation.
MEDICAL PERSONNEL: Paramedical Personnel
CARR-SAUNDERS, ALEXANDER; and WILSON, P. A. (1933) 1964 The Professions. London: Cass.
COKER, ROBERT E. JR. et al. 1966 Medical Careers in Public Health. Milbank Memorial Fund Quarterly 44:143-258.
COLEMAN, JAMES S. et al. 1966 Medical Innovation: A Diffusion Study. Indianapolis, Ind.: Bobbs-Merrill.
FIELD, MARK G. 1967 Soviet Socialized Medicine: An Introduction. New York: Free Press.
FREIDSON, ELIOT 1961/1962 The Sociology of Medicine: A Trend Report and Bibliography. Current Sociology 10/11:123-192.
FREIDSON, ELIOT 1963 The Organization of Medical Practice. Pages 299-319 in Howard E. Freeman, Sol Levine, and Leo G. Reeder, Handbook of Medical Sociology. Englewood Cliffs, N.J.: Prentice-Hall.
FREIDSON, ELIOT 1966 The Sociology of Medicine: A Structural Approach. Unpublished manuscript.
FREIDSON, ELIOT; and RHEA, BUFORD 1963 Processes of Control in a Company of Equals. Social Problems 11:119-131.
FREIDSON, ELIOT; and RHEA, BUFORD 1965 Knowledge and Judgment in Professional Evaluations. Administrative Science Quarterly 10:107-124.
GOSS, MARY E. W. 1961 Influence and Authority Among Physicians in an Out-patient Clinic. American Sociological Review 26:39-50.
GOSS, MARY E. W. 1963 Patterns of Bureaucracy Among Hospital Staff Physicians. Pages 170-194 in Eliot Freidson (editor), The Hospital in Modern Society. New York: Free Press.
HALL, OSWALD 1946 The Informal Organization of the Medical Profession. Canadian Journal of Economics and Political Science 12:30-44.
HYDE, DAVID R. et al. 1954 The American Medical Association: Power, Purpose, and Politics in Organized Medicine. Yale Law Journal 63:938-1022.
MERTON, ROBERT K. et al. (editors) 1957 The Student-Physician. Cambridge, Mass.: Harvard Univ. Press.
PARSONS, TALCOTT 1951 The Social System. Glencoe, Ill.: Free Press.
SCHEFF, THOMAS J. 1966 Typification in the Diagnostic Practices of Rehabilitation Agencies. Pages 139-147 in Marvin B. Sussman (editor), Sociology and Rehabilitation. Washington: American Sociological Association.
STEVENS, ROSEMARY 1966 Medical Practice in Modern England: The Impact of Specialization and State Medicine. New Haven: Yale Univ. Press.

II PARAMEDICAL PERSONNEL
The term "paramedical" refers to occupations whose work is both organized around tasks of healing and ultimately controlled by the authority of physicians. Ultimate control by medical authority is manifested in a number of ways. First, much of the technical knowledge learned by paramedical workers during the course of their training and used during the course of their work tends to be discovered, enlarged upon, and approved by physicians. Second, the tasks performed by paramedical workers tend to assist, rather than directly replace, the focal tasks of diagnosis and treatment. Third,
paramedical workers tend to be subordinate in that their work tends to be performed at the request or "order" of, and is often supervised by, physicians. Finally, the prestige assigned to paramedical occupations by the general public tends to be less than that assigned to physicians. The paramedical occupations may be distinguished from established professions by their relative lack of autonomy, responsibility, authority, and prestige. However, the fact that they are by definition organized around an established profession and in varying degrees partake of some, but never all, of the elements of professionalism allows us to distinguish them from many other occupations and, indeed, argue that they represent a sociologically distinct form of occupational organization. Furthermore, it may be noted that paramedical occupations are not adequately distinguished by reference to their health-related tasks. Occupations usually called paramedical do participate in a functional division of labor, but what is distinct about this division of labor is that it is ordered by the authority of a prime profession. Other occupations which may actually perform some of the same technical tasks but which stand in a different relationship to the dominant profession (as does, for example, a herbalist compared to a pharmacist) are not called paramedical, but rather quack or irregular. Differences between the paramedical and the quack do not necessarily arise from the actual tasks each performs, but rather from the relations each has to the dominant profession. Thus, the paramedical worker is less easily distinguished technologically, by the relation of his work to that of others, than sociologically, by his relation to medical authority. However, distinct as it is, this "paraprofessional" pattern is not common. 
For example, while there is a fairly elaborate division of labor revolving around law, it would not be appropriate to use the term "paralegal" for bailiffs, accountants, clerks, real estate brokers, and bankers in the same way that we use "paramedical" for nurses and laboratory technicians. Nor does the prefix seem properly employed to designate the division of labor connected with any other established profession. Medicine alone seems to have imposed such definite order on the occupations surrounding it. We can but guess the reasons for this, citing such variables as the comparative specificity and technical complexity of the tasks involved, which can only be exercised in a medical setting. Whatever the reason, this mode of organizing a division of labor is taxonomically distinct, and if it is true that labor is in general being "professionalized," the paramedical model may become more widespread in the future. Both practically and conceptually it is worth close study. How did it develop? What are its present characteristics?

Development of the division of labor

A division of labor in the task of diagnosing and treating human ills has always existed in one form or another in every human society. There have always been diagnosticians, herbalists, midwives, and nurses, even if only on a part-time, amateur basis. However, the distinctive division of labor labeled paramedical, which is to say, one organized around the authority of the medical profession, is relatively new and is complex only in the highly industrialized societies of the world where the modern medical profession arose. Even in these countries it varies a great deal in the completeness of its integration around and control by the medical profession. Unfortunately, there are few adequate cross-national comparisons of the organization of health workers to provide even the basic descriptive information necessary for analysis, and so much of the indication of the types and sources of variation must be based on scattered bits of information (for one comparison, see Glaser 1966). In Europe a distinctly paramedical division of labor, organized around the authority of a medical man, had begun to emerge at least by the time of the development in cities of the corporate guild and the university. The city provided the population density necessary for the support of a variety of full-time specialists, thereby allowing true occupations to arise. The guild provided the health-related occupations with a workable type of organization through which a distinct occupational identity, visible to officials and public alike, could slowly be established and through which they could press for exclusive rights to that identity and the work it involved.
However, the right to have something of a monopoly of title and function (that is, to be licensed) and to control in a fairly strict way access into and progress through the occupational career was obtained from the state. Thus, the occupation gained organization, but it also became subject to assignment, by a political process, to a relatively well defined official position in a larger division of labor, a position that could involve enforced subordination to members of quite another guild. The significance of the university in this situation is that occupations trained in one had a stronger claim, by virtue of their aura of learning and science, to a superordinate position in the occupational structure. University training gave physicians and surgeons a strong political position for persuading the state to subordinate to them such competitors as apothecaries, grocers, and barbers, and to allow them to prosecute irregular practitioners. This could be so even when it was doubtful that the actual knowledge and skill of the average university-trained practitioner in those days equipped him to practice any better than his self-taught or apprenticed competitor. With the development of the university and the guild in European cities, then, there arose a rudimentary structure of full-time health workers, organized, at least in part, under the supervision of physicians and surgeons. For centuries this organization was highly unstable, weakened from within by undisciplined competition and from without by the persistence of a great variety of irregular practitioners (see King 1958; Turner 1959). As is the case with the health services in the nonindustrial countries of today, the medical division of labor was fairly stable only in those parts of cities where a well-to-do gentry was likely to patronize it. In the city slums and in the countryside the poor and the peasantry persisted in relying on their own folk remedies, their own, largely part-time practitioners, and, on occasion, itinerant irregulars; the first two being part of their own culture, the last exploiting the naivete of that culture. There were in essence two health systems: the dominant one was rooted in the peasant culture, while the other, which was available only to a minority, owed its greater prestige to its origins in the learned traditions of Western civilization. Before the latter could become at once stable and universal, the former had to be destroyed or at least severely restricted. Not until the twentieth century in Europe and North America did anything emerge resembling a stable and extensive division of labor dominated by physicians, that is to say, a genuine paramedical division of labor.
In the nonindustrial countries of the world today, such a structure does not yet exist to any great degree.

Modern developments. The prime prerequisite for the development of a stable and extensive division of labor that is distinctively paramedical seems to be the eradication of great qualitative differences in culture and education among the major social strata of a society. This seems to be so because health services are used mostly on a voluntary basis. People choose to use one health service rather than another, and if only one organized service is available, they can choose not to use it and rely on their own informal resources instead. In this sense, while the application of political power can drive out of practice all but officially licensed workers, it cannot make people use them. It seems no mere coincidence that the irregular health services declined greatly in industrialized countries about the same time that the institution of compulsory universal education arose. Contributory to this process, but by no means enough by itself (as experience in nonindustrial countries today indicates), was the rise of scientific medicine, which was capable for the first time in history of alleviating many complaints and symptoms consistently and predictably. By the twentieth century the medical profession was at last able to establish a secure mandate to provide a central health service. In England the rural general practitioner had been drawn into regular medical ranks. In Russia the feldsher had been in part replaced by and in part subordinated to the physician. In the United States the many different kinds and qualities of practitioners, all democratically calling themselves doctor, had been reduced to some uniformity. Control over the focal tasks of diagnosis and prescription was thereby secured (Sigerist 1935), and by virtue of its major role as arbiter in the application of new scientific discoveries, the profession could order around itself the proliferating new technical personnel. Some historical specialties, such as dentistry, survived fairly independently of the paramedical division of labor. Others, such as pharmacy and optometry, were not fully integrated into the paramedical division of labor, remaining at least partially independent of it. Still others, such as bonesetting and, in the United States, midwifery, were taken over by the physician himself, laymen and amateurs being driven out of practice. Others, the most prominent of which is nursing, maintained an ancient function while being brought firmly under medical control.
And finally, with some few exceptions, such new specialties as laboratory technology, which arose with the new medical science and technology inside the walls of the hospital and medical school, developed unequivocally as part of an established paramedical division of labor. Today's paramedical division of labor is therefore a specifically historical construction, with some functionally related occupations falling inside it and some outside it; not all of the very old or the very new occupations fall inside it. The source of whatever order may be found in this division of labor seems to lie in the character of the relationships to be found between medicine, other occupations, and prospective lay clientele—central to which, perhaps, is the possibility of functional autonomy. Relations with the medical profession. The interoccupational relations of paramedical workers
can be seen clearly only as part of a larger, evolving structure that embraces physicians, health workers who are not part of the paramedical division of labor, and the institutions in which medical and nonmedical health services are provided. One of the major variables mediating interoccupational relations in the health services seems to be functional autonomy—the degree to which work can be carried on independently of organizational or medical supervision and to which it can be sustained by attracting its clientele independently of organizational referral or referral by other occupations, including physicians. On the whole, the more autonomous the occupation and the greater the overlap of its work with that of physicians, the greater is the potential for conflict, legal or otherwise. Such conflict is to be seen between chiropractors and physicians in the United States, homeopaths and physicians in the Soviet Union, and "native" practitioners and physicians in virtually all nonindustrial countries. The most interesting conflicts, however, occur within the paramedical division of labor during the course of the growth of new occupations capable of attaining functional autonomy. In the United States, where the movement toward professional status is strong and extensive and there are not enough physicians to perform all the traditional functions demanded of them, such conflict is common; it focuses on the question of whether or not nonphysicians are to be allowed to offer health services independently of medical supervision. The outcome has been, in such increasingly successful cases as that of the clinical psychologist, virtual independence in practice, limited only by the legal inability to prescribe drugs. Impelled by the force of professionalization, the growth of new techniques and new occupations to practice them seems to be giving a new shape to the paramedical division of labor. 
Some years ago it could be visualized quite simply as a pyramid, with the physician at the apex. However, in the present-day United States the pyramid seems to be changing into a less clear-cut structure, at the top of which is a plateau along which are ranged physicians as well as other relatively autonomous, but consulting and cooperating, new professionals.

Recruitment and training

Obviously the paramedical division of labor is a stratified system, the occupations of which are in varying degrees integrated around the work of the physician. All occupations in the system are given less prestige than the physician by the society at large. It follows that the socioeconomic status of those recruited into all paramedical occupations is
likely to be lower than the status of those recruited into medicine itself. Furthermore, there is a hierarchy of prestige and authority among the ranks of paramedical workers; nurses, for example, are higher than attendants and technicians. This hierarchy is also likely to be reflected in the socioeconomic backgrounds of the workers. In the grossest comparison between physicians and paramedical workers, the latter are to a disproportionate degree women and from the less valued ethnic, racial, and religious groups. With the special exception of sex, those differences in background and personal characteristics are also likely to be ranged in an order corresponding to the general hierarchy of prestige and authority.

Variability of training. Training follows a variable pattern, its order roughly paralleling the prestige, independence, and imputed responsibility of the work (see Wardwell 1963). Patterns of training range from the one extreme of professional schools associated with universities, requiring a full higher education before several years of training, to the other extreme of brief, informal, on-the-job training. Between these extremes are many types of training, varied according to the length of study, the formality and abstractness of the curriculum, and the type of institutional arrangement, such as attendance at hospital training schools or proprietary technical schools, apprenticeships in various institutions, and the like. In the United States, where the university is a considerably less clearly defined institution than elsewhere, more paramedical education with professional trappings is to be found. In Europe technical training schools quite separate from the university are more likely to exist, for the education of even the high-prestige, more independent paramedical occupations.
The paramedical ranks tend to be ordered by the length and type of training required by the occupation: the longer the training and the more formal and the closer to the university it is, the higher is the occupation's position in the division of labor. It follows from this that the higher the position, the greater must be the investment of time and energy in training, the less casual can be the recruitment, and the greater must be the commitment to the occupation. Recruitment to the many low-skill positions in the paramedical division of labor seems, by and large, to be a simple function of the demand for unskilled service workers willing to do unpleasant work. Recruitment to the higher-skill positions, however, is considerably more problematic, especially in those occupations traditionally filled by women.

The position of women. Nursing is a fairly well-documented example of problems of recruitment and training in the paramedical occupations (see Corwin & Taves 1963). The problem in nursing is not that of attracting people to undergo training as such, for quite a few women begin training; it lies in recruiting women who will stay in training and subsequently pursue a lifetime career in the occupation. The essential difficulty here is that women are likely to be torn between the commitment to work and the commitment to marriage and family. This conflict has been discerned in nursing students and seems to be closely related to school dropouts and subsequent job turnover. Leaders of nursing in the United States have attempted to contend with the problem by emphasizing the professional qualities of the occupation, presumably hoping to create a stronger "professional" commitment to work that might outweigh family considerations (Strauss 1966). The problem, however, seems to be inherent in the position of women in the labor force and does not seem soluble by professionalization. Even in the case of that most professional of professions, namely medicine, only a modest proportion of women in the United States who are qualified to practice medicine actually do so. One might therefore suspect that a more likely solution for a social system such as that of the United States would be found in changing the organization of the job so as to accommodate it to the demands of marriage and family. In European countries the position of women in the medical and paramedical labor force is quite different, apparently because of national differences in the occupational roles of women that make a professional career highly desirable among women of the haute bourgeoisie, small but significant differences in the class system, and, finally, the level of industrialization and the general standard of living. The last consideration brings up another aspect of recruitment and training in the paramedical division of labor.
Clear evidence is lacking, but general opinion seems to be that it is becoming more and more difficult to recruit people to the paramedical jobs that require considerable investment of time and money in technical training. If this is so, it might be understood as a symptom of a larger process of advanced industrialization. Emphasis on professionalism. In the earlier stages of industrialization, the health services constituted a major and conspicuous source of social and economic mobility to which those willing and able to invest in specialized training could aspire. However, in later stages the demand for skilled technical services has developed markedly
in other segments of the economy, thereby providing a considerably wider universe of opportunity than that which existed earlier. As older, fairly closely organized systems, requiring relatively extensive investment in training but offering relatively inflexible career lines, the health services, medical as well as paramedical, seem handicapped in competing for a limited pool of potential workers. Part of the pervasive emphasis on professionalism within the paramedical division of labor in the United States seems to be an attempt to increase the attractiveness of the work and thereby aid in recruiting the best possible workers. However, the emphasis on professionalism is likely to be strong only during the course of training, which is where the leaders of the occupation are most likely to be influential. Inasmuch as professionalism tends to emphasize intellectual and technical skill, there is the danger of dissatisfying students whose motives for entering the occupation are not so much intellectual as humanitarian, a danger that has been observed in nursing schools. Furthermore, inasmuch as professionalism tends to emphasize the dignity and autonomy of the worker, it is likely that upon leaving school and entering the everyday institutions of work, which are generally not controlled by the leaders of the occupation, the erstwhile student, who has been imbued with professionalism, is in for what has been called "reality shock." If the student's indoctrination has been thorough, his relations with other occupations in the paramedical hierarchy are likely to be somewhat difficult and personally disillusioning.

Paramedical personnel in the hospital

It has been implied that the greatest opportunity for developing functional autonomy seems to exist for those occupations that can operate outside the walls of such medically organized institutions as clinics and hospitals.
The nursing profession, whose leaders in the United States have with great energy sought to establish unique skills and fully professional status, seems fated nonetheless to remain subject to the doctor's orders, in part because a nurse's work is largely carried out in the hospital. In this, however, the nurse is not unique: the largest part of the paramedical division of labor grew up within such organizations and may be expected to persist and proliferate within them in the future. It is for this reason that once we leave the broad societal level of analysis of the paramedical division of labor to undertake the analysis of everyday work, we find ourselves
in the community agency, the clinic, and, most extensively studied of all, the hospital. All hospitals are complex organizations coordinating a number of tasks and forming the focus for a number of distinct, usually overlapping goals. Given the fact that hospitals are fairly stable and spatially fixed, it is no accident that the paramedical occupations working within them have been studied far more than those working outside in the community at large, where the bulk of health services are actually provided. Thus, we have a severely limited view of paramedical as well as medical work. Among studies of hospital personnel, the nurse in the general hospital is the most frequent subject, and the attendant or aide in the mental hospital ranks second. We have little systematic empirical information about virtually all other paramedical workers. Handicapped as we are, the nurse and attendant between them do present us with a view of the range of workers, from the most professional to the least. By reviewing their respective positions, we can obtain some hints about the kinds of analytical problems posed by the work of paramedical personnel. The nursing profession. It is difficult to speak of nursing as a single occupation, because the training and work situations of nursing are so variable. Training in the United States can vary from a three-year hospital-nursing-school program to a four-year college program and even to programs leading to the doctorate (Davis et al. 1966). On the job, nurses in some American and European hospitals are preoccupied with bedside patient care and virtually all housekeeping tasks; however, in larger American teaching hospitals nurses are characteristically engaged in supervising the lesser personnel who give bedside care and do the housekeeping. 
Furthermore, there are major differences in the organization of hospitals in which nurses work: in most hospitals throughout the world the medical staff constitutes the only significant hierarchy, but in some of the larger American hospitals the medical hierarchy is paralleled by that of a nonmedical administrative staff. In the latter case the nurse's traditional subordination to the physician becomes complicated by subordination to another hierarchy. The two lines of authority may make quite different, even opposing, demands on her, thereby introducing into her work more strain than has existed traditionally (see Croog 1963). However, the problem of two lines of authority in hospitals has been overemphasized, particularly in light of the fact that the development of an administrative hierarchy provides the nurse with a
better opportunity for mobility than exists when a medical hierarchy alone is present. The possibility of moving up into an administrative hierarchy is common for many occupations, including medicine, but it seems particularly significant for paraprofessional occupations. By their nature such occupations are technically subordinate: success within the occupation does not remove that subordination, and movement into the superordinate occupation is not usually possible. Only by forsaking the particularistic skills of the occupation and moving into administrative positions can that subordination be escaped. While administrative positions may in fact not be superordinate to professional staff positions, they may at least run parallel to the professional positions and attain equality with them. We can therefore understand why it is that nurses who are preoccupied with attaining a fully independent status attempt to pass over as "dirty work" the skills of bedside care (i.e., what was once called nursing) to lesser workers and to specialize in administrative work (see Hughes 1958). Recalling the problems involved in recruiting students who can become committed to nursing as a career and the attempt to create such commitment by emphasizing professionalism, we are led into an interesting dilemma: if women become committed to nursing by becoming "professionalized," their commitment makes them prone to forsake the work for which they were recruited in the first place. That dilemma, however, is more characteristic of nursing in the United States than elsewhere, reflecting a national emphasis on social mobility and professionalization. Furthermore, it refers to one of the better-established paramedical occupations and more particularly to those members of the occupation in the United States who have been trained in and work in the high-prestige, academically oriented institutions. 
As such, it is hardly representative of the total range of paramedical occupations and their dilemmas. The cross-national comparison presented by William Glaser (1963) suggests that the more common problems of paramedical occupations are not really represented by American nursing studies. What is needed most is insight into the less trained, less mobile occupations. Unfortunately, about all we have to provide us with this insight are studies of attendants and aides in American mental hospitals. The attendant. The essential problem posed by the hospital attendant, and presumably by other relatively untrained personnel in similar positions in the division of labor, is his failure to satisfy
the expectations of his professional supervisors. This difficulty may be the more important because the attendant is in the most continuous and intimate contact with the patient and therefore may in fact have greater influence on the patient than the supervising professionals. Thus, his "custodial" orientation to his patients is deplored, and a more "therapeutic" orientation is expected. The cause of the attendant's deficiencies seems to stem from at least two sources. First, his job is, in the most immediate sense, one of keeping order—minimizing dirt, destruction or waste of property, and personal injury and allowing housekeeping, therapeutic, and other services to be carried out on a predictable and efficient schedule. In the nature of the case this is a custodial responsibility, requiring something of a custodial attitude. If health institutions are to be run relatively economically, such an attitude on the part of those responsible for the hour-by-hour care of resident patients seems necessary and inevitable. It is the second element that is more variable— the way in which the attendant perceives his patients, their illness, and his relationship to them. Almost by definition, as a paramedical worker without formal training, the attendant is likely to adopt a view similar to that of the layman. The problem is not lay attitudes as such but which lay attitudes the attendant adopts. A great many studies of American state mental hospitals suggest that attendants adopt an attitude of punitiveness and contempt toward patients and of antagonism toward the expectations of the professional staff. Part of this attitude, as noted already, stems from the job the attendant has to do, as well as from the feeling that the more remote professional staff does not really understand how difficult it is to keep order or even how to keep order. 
Another part, however, seems to reflect more than anything else the average "unenlightened" American layman's conception of the mentally ill (see Strauss et al. 1964). Attendants from other cultures may have entirely different conceptions of the mentally ill and behave quite differently, as Caudill's analysis of the tsukisoi in Japan (1961) and Parsons' discussion of a Neapolitan hospital (1959) suggest. Even in the United States, when the illness involved is not as stigmatized as mental illness, lay attitudes of unskilled aides can be supportive rather than punitive, sympathetic rather than hostile. In this sense, precisely what is "unprofessional" about such lower-order workers can be as much a virtue as a vice. Indeed, that same less-professional character
enables the paramedical worker to accomplish what the professional cannot; that is, the paramedical worker can draw into treatment patients who would otherwise be evasive and hostile to organized health services. Many studies from around the world, particularly those summarized by Simmons (1958), indicate that patients of humbler origins than those of physicians feel more comfortable dealing with such paramedical workers as nurses, feldshers, and midwives, who are closer to their own class and culture. Furthermore, lower-status patients seem more easily "educated" by paramedical personnel than by physicians, not only because they can enter into rapport more easily but also because they are more prone to "speak the same language" and to adjust themselves to the patient's expectations. This lesser social distance from patients seems to be particularly critical in circumstances where the contact between patient and health worker is voluntary and casual, rather than forced and desperate, and where status differences are quite marked, linguistically, culturally, and socially. Indeed, it appears that it is the need on the part of lower-status patients for consultants who are more nearly equal to them and who operate in a manner compatible with their culture that modern irregular practitioners have risen to serve. To the extent that paramedical personnel become professionalized, they may lose their advantage in dealing with lower-status patients. However, to the extent that the paramedical worker's success with those patients is predicated on lay attitudes, his relations with supervising professionals are certain to be problematic. This is one of the major dilemmas of paramedical work. ELIOT FREIDSON [See also MENTAL DISORDERS, TREATMENT OF, article on THE THERAPEUTIC COMMUNITY.] BIBLIOGRAPHY CAUDILL, WILLIAM 1961 Around the Clock Patient Care in Japanese Psychiatric Hospitals: The Role of the tsukisoi. American Sociological Review 26:204-214. 
CORWIN, RONALD G.; and TAVES, MARVIN J. 1963 Nursing and Other Health Professions. Pages 187-212 in Howard Freeman et al., Handbook of Medical Sociology. Englewood Cliffs, N.J.: Prentice-Hall. → A review of many American studies of nursing. CROOG, SIDNEY H. 1963 Interpersonal Relations in Medical Settings. Pages 241-271 in Howard Freeman et al., Handbook of Medical Sociology. Englewood Cliffs, N.J.: Prentice-Hall. → A review of studies of interoccupational relations in American hospitals. DAVIS, FRED et al. 1966 Problems and Issues in Collegiate Nursing Education. Pages 138-175 in Fred Davis
(editor), The Nursing Profession: Five Sociological Essays. New York: Wiley. FREIDSON, ELIOT 1961/1962 The Sociology of Medicine: A Trend Report and Bibliography. Current Sociology 10/11:123-192. → Contains a brief review of the field and a fully annotated and classified international bibliography. GLASER, WILLIAM A. 1963 American and Foreign Hospitals: Some Sociological Comparisons. Pages 37-72 in Eliot Freidson (editor), The Hospital in Modern Society. New York: Free Press. → A sketch of the different international settings in which paramedical personnel work. GLASER, WILLIAM A. 1966 Nursing Leadership and Policy: Some Cross-national Comparisons. Pages 1-59 in Fred Davis (editor), The Nursing Profession: Five Sociological Essays. New York: Wiley. HUGHES, EVERETT C. 1958 Men and Their Work. Glencoe, Ill.: Free Press. → Seminal essays on the study of occupations, many referring to paramedical and medical workers. KING, LESTER S. 1958 The Medical World of the Eighteenth Century. Univ. of Chicago Press. → Contains a few excellent essays on interoccupational relations in English medicine of the sixteenth through the eighteenth centuries. PARSONS, A. 1959 Some Comparative Observations on Ward Social Structure: Southern Italy, England and the United States. Ospedale psichiatrico 2:3-23. SIGERIST, HENRY E. 1935 The History of Medical Licensure. Journal of the American Medical Association 104:1057-1060. SIMMONS, OZZIE G. 1958 Social Status and Public Health. Pamphlet No. 13. New York: Social Science Research Council. STRAUSS, ANSELM 1966 The Structure and Ideology of American Nursing: An Interpretation. Pages 60-180 in Fred Davis (editor), The Nursing Profession: Five Sociological Essays. New York: Wiley. STRAUSS, ANSELM et al. 1964 Psychiatric Ideologies and Institutions. New York: Free Press. TURNER, ERNEST S. 1959 Call the Doctor. New York: St. Martin's. 
→ A social history of medicine in England, somewhat popular, but containing more data on practice and practitioners than conventional academic studies. WARDWELL, WALTER I. 1963 Limited, Marginal and Quasi-practitioners. Pages 213-239 in Howard Freeman et al., Handbook of Medical Sociology. Englewood Cliffs, N.J.: Prentice-Hall. → A review of American materials on pharmacists, dentists, podiatrists, optometrists, clinical psychologists, osteopaths, chiropractors, and others.
MEDICAL PSYCHOLOGY See CLINICAL PSYCHOLOGY. MEDICAL SOCIOLOGY
See EPIDEMIOLOGY; HEALTH; ILLNESS; MEDICAL CARE; MEDICAL PERSONNEL; MENTAL DISORDERS, TREATMENT OF, article on THE THERAPEUTIC COMMUNITY; MENTAL HEALTH, article on THE CONCEPT; PUBLIC HEALTH. Related material may be found under OCCUPATIONS AND CAREERS; PROFESSIONS.
MEINECKE, FRIEDRICH Friedrich Meinecke (1862-1954) was the most important German historian to follow Ranke and Burckhardt. He developed Dilthey's concept of history of ideas; he followed the philosophy of historicism, first outlined by Ernst Troeltsch and Benedetto Croce, to its logical conclusion; and finally, he achieved a synthesis of historical thought and political action by becoming one of the moral leaders in Germany's return to democracy after 1945. Sources of thought. Meinecke was born in the town of Salzwedel in Prussian Saxony but was brought up in Berlin in solid, middle-class surroundings. As a student, he was stirred by the personality of Bismarck and influenced by the sense of discipline and courage found in the Prussian state. But Meinecke was impressed by the classical humanism of German literature and music, poetry, and philosophy as well as by the spirit of Potsdam. After leaving the Gymnasium, Meinecke entered the University of Berlin, determined to become a historian. There he was initiated into the techniques of historical methodology which Leopold von Ranke and his school had perfected. Meinecke accepted not only their methods but also their general frame of reference; i.e., that the proper subject of study for the historian is conflict among the great powers. He attended the lectures of Johann G. Droysen and Wilhelm Dilthey; Heinrich von Sybel and Heinrich von Treitschke directed his scholarly pursuits. A speech defect from which he suffered throughout his life made Meinecke choose the career of secluded archivist rather than academic teacher, and in "this dusty trade" he felt at home for many years. Among his fellow archivists was one of the masters of institutional and comparative history, Otto Hintze, who exercised considerable influence on Meinecke. Although Meinecke was shy and withdrawn by nature, his special gifts were soon recognized. 
In 1893 he was asked to become editor of the Historische Zeitschrift, Germany's most important historical review. History of ideas. In these early years of apprenticeship, Meinecke was already concerned with the world of political ideas. He formulated the task of the historian in this manner: "Ideas, carried and transformed by living personalities, [constitute] the canvas of historical life" (Erlebtes . . . , p. 117). This sentence represents the core of Meinecke's Ideengeschichte. He first put this conviction to the test when he wrote the biography
(1896-1899) of Hermann von Boyen, the Prussian minister of war who, in 1814, introduced military conscription. It was a pioneering attempt to make the arid facts of military history a part of the history of ideas. Meinecke's biography established Boyen's niche in history and Meinecke's own reputation as one of the most promising talents in Germany's academic world. His appointment as professor of modern European history at Strasbourg in 1901 was evidence of his immediate recognition. There, in the southwestern corner of Germany, Meinecke encountered some of the best minds then active in that country: Max Weber, Ernst Troeltsch, Heinrich Rickert, and many others, and they made him aware of the limitations of his earlier perspectives. One of Meinecke's characteristics was his never-failing capacity for growth; in Strasbourg, and after 1908 in Freiburg, he shed much of his Prussian parochialism. For more than a decade Meinecke remained fascinated by the problems and paradoxes of German history, especially the years from 1789 to 1848. In his next work, Weltbürgertum und Nationalstaat (1908), he endeavored to show how cosmopolitanism and nationalism had become deeply intertwined in the complex development of nineteenth-century Germany. He showed how both elements could be found in the ideas of Fichte, Novalis, Schlegel, Hegel, and Ranke, and how early German nationalism was made up of politically inconsistent cultural components. As Meinecke saw it, the universalistic tendencies of German thinkers were put to the test in 1848, and the revolution failed because of the incapacity of many Germans to come to grips with the realities of power politics. In this perspective, Bismarck and what he stood for became essential to the German quest for national unity. Hegel, Ranke, and Bismarck were the great liberators who freed the German mind from its romantic mists and created a realistic attitude toward the state. 
Meinecke thus traced the philosophical and literary origins of the ideology of the nation-state and went beyond the traditional borderlines of political history. His works were soon recognized as masterpieces in a new field, the biography of ideas or, as it were, of two ideas. He was at his best when analyzing the major polarities in Western thought: order and freedom, nationalism and universalism, power and ethics, "is" and "ought," uniqueness and recurrence. He became the historian of political ideas par excellence, founding a new school of historical thought, and developing a unique style—
subtle, sensitive, and highly expressive of the countless variations and mutations which political ideas produce as they develop. In Freiburg, Meinecke moved into the political arena for the first time. Abandoning the conservative leanings of his earlier years, he joined the right wing of the liberal party, the National Liberals. His goal was the widening of the foundation of the nation-state to include the ever-growing masses of industrial labor. His initial attempt was timorous and lacking in energy; he approved of representative government, not as an end in itself, but as a means to an end, that of enabling Germany to play her role as a world power. The nature and justification of power. In 1913 Meinecke accepted an appointment at the University of Berlin. At the outbreak of World War I he was at first uncritically committed to Germany's imperialistic aspirations. Only slowly did it dawn on him that this conflict harbored consequences surpassing by far the significance of previous engagements between feuding European nations. As the horizon around Germany grew darker, however, Meinecke's perceptions became more piercing. His keen political analysis and counsel, in turn, began to be sought after by the more thoughtful statesmen of Germany; Richard von Kühlmann, secretary of state in the Foreign Office, and Theobald von Bethmann-Hollweg, the hapless chancellor, discussed with Meinecke the unsolved problems of Germany's domestic situation and the chances for a negotiated peace. He began to work for a peace by compromise and without territorial gains for any of the great powers. He also bent his efforts toward more equitable political representation for the working class. But although his advice was heard in many quarters, it was little heeded. 
More important than Meinecke's remedies for specific problems, however, was his emergent comprehension that the nation-state in which he had so strongly believed was no longer a sufficient answer to the political exigencies of the twentieth century. New questions crowded his mind: what was power? what was Germany's relationship to the rest of the Western world? what lay behind the great conflict that seemed to be splitting the Occident? Meinecke could not accept the Marxist-Leninist interpretation of the crisis, since world revolution and the revolt of the masses appeared to him as the predominant threats to Western civilization. On the other hand, by the time the war ended he realized that the old, aristocratic Germany was doomed. The downfall of imperial Germany in 1918 filled him with grief but not with despair. He
accepted the Weimar Republic as a necessity and was ready to work for a new democratic Germany. Meinecke's doubts about the nature and justification of power, aroused by World War I, were crystallized in his book Machiavellism (1924). Meinecke admitted that it was the extreme manifestations of power politics during World War I that had opened his eyes to the dangers of politics divorced from any ethical code. The Treaty of Versailles only served to deepen the lesson; it led him into a historical investigation of theories of the nature and function of power in human life, beginning with Machiavelli, through Bodin and Rohan, to Frederick the Great, Hegel, Ranke, and Treitschke. There are those who consider the book to be a history of Machiavellianism; others view it as an attempt to surmount the teachings of Machiavelli. Neither of these interpretations hits the mark; more nearly, the book is Machiavellianism considered with a guilty conscience. Meinecke could not subscribe to Burckhardt's and Acton's thoroughgoing condemnation of power; neither could he any longer assent to the idolatry of power found in Hegel and Treitschke. The result is a dichotomy, a separation of ethics and power that defies reconciliation: the creed of the statesman, said Meinecke, must embody both the interest of the state and the fundamental moral principles of mankind. Historicism. There is a mood of philosophical reflection in Machiavellism which foreshadows an even more complex enterprise: a study of the genesis of historical thought. In Berlin, Meinecke lived in close contact with Troeltsch, who considered the historical outlook in its most comprehensive sense as characteristic of the twentieth century. In 1922 Troeltsch published Der Historismus und seine Probleme. After Troeltsch's death in 1923, his work on this subject was continued by Meinecke. 
However, Meinecke's perspective was somewhat narrower: as a historian he was more interested in the origins of historical thought than in its significance for the future of Western civilization. In 1936 he published Die Entstehung des Historismus ("The Origins of Historicism"). It is the third of his significant contributions to the history of ideas, completed when Meinecke was well past his seventieth year but dating back to very early reflections on the element of individuality, or uniqueness, in historical life. German historians had long been hostile to positivistic attempts to reduce human development to "scientific laws"; such attempts, they contended, violate two of the most precious elements in history: spontaneity and uniqueness.
Meinecke shared this interpretation of human life and traced it from the late seventeenth century to the twentieth. He defined "historicism" in the following manner: "the essence of historicism consists in replacing a general and abstract contemplation of human affairs by an individual one" (eine individualisierende Betrachtung) ([1936] 1959, p. 2). He did not hesitate to call this concentration on uniqueness the highest achievement in the contemplation of things human. This was an extreme stand, denying both the sociological ideal-type (as Max Weber conceived it) and the ethical norm of universal validity. Die Entstehung des Historismus begins with an analysis of Shaftesbury, Leibniz, and Vico; it moves into an evaluation of the historiography of the Enlightenment, with special emphasis on Voltaire, Montesquieu, Hume, and Gibbon. From English preromanticism it switches to Möser, Winckelmann, Herder, and Goethe, and it ends with an epilogue on Ranke. Critics have pointed out, with justice, that this history of historicism ends at the moment when the movement really came into its own and that it describes its growth but not its flowering. Likewise, the problem of relativism (inherent in the idea of uniqueness) versus absolute and perennial values is stated by Meinecke but by no means elucidated or solved. Nevertheless, the book marks an important advance in the long discussion among historians, social scientists, and philosophers of the proper subject and the meaning of history. The German catastrophe. Meinecke might have tried to answer some of these questions more conclusively had it not been for the general conditions of his time. When this book on historicism was published, Hitler had triumphed in Germany. Meinecke had fought with courage against the rise of National Socialism, both in the press and from his chair at the University of Berlin. Some of his close associates were ousted and silenced. 
Many of Meinecke's students were obliged to flee the country, and in 1935 Meinecke relinquished the editorship of the Historische Zeitschrift. But he was perhaps most oppressed by the foreboding of a second world war. His correspondence clearly reveals that he was one of the few German scholars who never compromised with the authorities and who had the courage to state frankly in his letters what he was not allowed to say in public. An indefatigable worker, Meinecke spent these years working on his memoirs; they hold a certain charm but do not rank with his contributions to
intellectual history. The war did not spare him: he suffered the same privations, the hunger, and bombings, as millions of others. Finally he fled from Berlin, shortly before it fell to the Russians. Once more his mind turned to the enigma of German history, especially to the questions that have puzzled so many observers: how could the advent of Hitler be explained? and further, to what extent was Germany responsible for the greatest retrogression in European civilization since the days of the Black Death? Meinecke's answers were given in a small book, The German Catastrophe (1946). He began his analysis with the statement that National Socialism must be understood against the background of our entire Western civilization, against the conflict between the old society and the new industrial masses. And he did not spare those forces which had once elicited his praise: the Prussian state and the German bourgeoisie. The Prussian state, he wrote, had permeated the nation with its militaristic attitude; the bourgeoisie had closed its mind to democratic forms of government, which alone could have brought a reconciliation between itself and the working class. He accounted for the success of a demoniac figure like Hitler by indicating the German social interests which had tried to manipulate the "revolution of nihilism" only to become its victims. This new approach, an attempt to combine intellectual and social history, is also apparent in other essays that Meinecke wrote after 1945, especially in his comparison of Burckhardt and Ranke (1948a) and in his appraisal of the revolution of 1848 (1948b). They reveal, if nothing else, an indomitable will to continue the task of the historian in a world changed beyond recognition from the well-grounded security into which Meinecke had been born. 
When a large part of the student body revolted against the oppression of the communist-controlled University of Berlin, it found in Meinecke the leader to head an independent institution—the Free University of Berlin. To have heralded and ushered in so momentous an action is surely one of Meinecke's titles to lasting fame. His contributions to modern historiography have proved surprisingly durable and have reached well beyond the confines of Germany. They have been emulated, corrected, and improved in Austria, Italy, and, more especially, in the United States, where some of his students carry on his work. GERHARD MASUR
[See also HISTORY; NATIONALISM; POWER; and the biographies of BURCKHARDT; CROCE; DILTHEY; HEGEL; HINTZE; MACHIAVELLI; RANKE; TREITSCHKE; TROELTSCH.] WORKS BY MEINECKE 1896-1899 Das Leben des Generalfeldmarschalls Hermann von Boyen. 2 vols. Stuttgart: Cotta. (1908) 1962 Weltbürgertum und Nationalstaat: Studien zur Genesis des deutschen Nationalstaates. Edited with an introduction by Hans Herzfeld. Munich: Oldenbourg. (1924) 1962 Machiavellism: The Doctrine of Raison d'Etat and Its Place in Modern History. New York: Praeger. → Originally published as Die Idee der Staatsräson in der neueren Geschichte. Contains a general introduction to Friedrich Meinecke's work by Werner Stark. (1936) 1959 Werke. Volume 3: Die Entstehung des Historismus. Munich: Oldenbourg. → The translation of the extract in the text was provided by Gerhard Masur. (1946) 1950 The German Catastrophe: Reflections and Recollections. Cambridge, Mass.: Harvard Univ. Press. → First published in German. A paperback edition was published in 1963 by Beacon. (1948a) 1954 Ranke and Burckhardt. Pages 141-156 in Hans Kohn (editor), German History: Some New German Views. Boston: Beacon. → First published in German. (1948b) 1951 Year 1848 in German History: Reflections on a Centenary. Pages 668-686 in Herman Ausubel (editor), Making of Modern Europe. Volume 2: Waterloo to the Atomic Age. New York: Dryden. → First published as "1848: Eine Säkularbetrachtung." Erlebtes: 1862-1919. Stuttgart: Koehler, 1964. → The translation of the extract in the text was provided by Gerhard Masur. Werke. 6 vols. Munich: Oldenbourg, 1957-1962. → Volume 1: Die Idee der Staatsräson in der neueren Geschichte. Volume 2: Politische Schriften und Reden. Volume 3: Die Entstehung des Historismus. Volume 4: Zur Theorie und Philosophie der Geschichte. Volume 5: Weltbürgertum und Nationalstaat: Studien zur Genesis des deutschen Nationalstaates. Volume 6: Ausgewählter Briefwechsel. SUPPLEMENTARY BIBLIOGRAPHY
STERLING, RICHARD W. 1958 Ethics in a World of Power: The Political Ideas of Friedrich Meinecke. Princeton Univ. Press. → Contains a bibliography of Friedrich Meinecke's writings and books and articles about him.
MEMORY See FORGETTING and LEARNING.
MENGER, CARL Carl Menger (1840-1921), economic theorist and founder of the Austrian school of marginal analysis, was both the most influential and the least read of the major figures who gave economic theory the shape it preserved from about 1885 to
1935. There is little doubt that it was his immediate disciples who cast microeconomic theory into the form which, in its essentials, it still retains. Of the three founders of modern utility analysis, he alone not only based his work on a long tradition and presented the outlines of his theory in a form which for some time could not be bettered, but also succeeded in creating a school which continued to develop his ideas. Menger exerted a widespread influence, mainly through his avowed disciples in many countries, despite the fact that his two main books were not reprinted for 50 years or translated into English for 79 years. His work also had an effect on the only important rival school of the period, the neoclassical Cambridge tradition. At an early stage, Alfred Marshall, founder of the Cambridge school, had evidently studied Menger's work much more assiduously than is suggested by the few references to Menger (most of which were dropped from later editions) in Marshall's Principles. (Marshall's personal copy of Menger's Grundsätze, with a detailed marginal commentary in Marshall's hand, is preserved in the Marshall Library at Cambridge.) Menger was born in Neu Sandec, Galicia (then in the Austrian part of Poland), the descendant of a professional family that had earned the prefix "von" (Menger himself dropped it in early adulthood). In the well-stocked library of his father, a practicing lawyer, Menger and his two brothers became acquainted early with the literature on social and economic questions; one brother, Anton Menger, was a legal philosopher and historian of socialist doctrine. Menger studied law at the universities of Vienna and Prague and finally took his doctorate at the University of Cracow in 1867. Apparently he had done some journalistic work in Vienna and Lemberg before taking the degree, and afterward he entered the press section of the prime minister's office in Vienna, a position which was frequently a springboard to high public office. 
In that position, apparently as a result of having to write market reports, Menger developed an interest in price theory. The recent publication of his annotations to Rau's Grundsätze der Volkswirthschaftslehre ([1870] 1963) suggests that it was mainly his critical analysis of this textbook exposition of classical doctrine that led Menger, from 1867 on, to develop his own value theory. In his extensive reading, Menger must have found ample material in the early nineteenth-century German and French economic literature on which to build a fully developed utility analysis. (The utility tradition was not as strongly preserved in the English literature.) It now
appears that the literature on which he was able to draw included also the work of an Austrian economist, J. Kudler (whose textbook, Die Grundlehren der Volkswirthschaft 1846, he had probably used at the university), and one work by Cournot. Menger's sources, however, did not include the work of the author who had the most completely anticipated him, Gossen's Entwickelung der Gesetze des menschlichen Verkehrs . . . , published in 1854. The results of Menger's studies appeared in his Principles of Economics (1871), the work on which his fame mainly rests. Described as the "first, general part" of an intended comprehensive work on economic theory, it remained his sole major publication in this field during his lifetime. In somewhat copious but always clear language, it provided a much more thorough account of the relations between utility, value, and price than is found in any of the works of Jevons and Walras, who at about the same time laid the foundation of the "marginal revolution" in economics. The book gained for Menger first a lectureship and, in 1873, the position of extraordinary professor at the University of Vienna. For some years he published nothing more, apparently because of his appointment in 1876 as tutor to the 18-year-old crown prince of Austria, the ill-fated Archduke Rudolf. For two years Menger accompanied Rudolf on extensive travels through Germany, France, and the British Isles. He seems to have assisted the crown prince in the composition of a pamphlet (anonymously published in 1878) which attempted a critical examination of the role played by the higher Austrian aristocracy. The pamphlet caused some stir when in 1906, 17 years after the death of the archduke, his authorship was discovered. The real beginning of Menger's long and very effective career as a teacher came with his appointment in 1879 to a full professorship at the University of Vienna. 
During the next 24 years he expended most of his energy on his general lectures to law students (which he appears to have rewritten every year), and he was particularly attentive to those few students who voluntarily chose economics as their field of special work. His teaching was interrupted only twice by bursts of literary activity.
The first of these was connected with his second major book, Problems of Economics and Sociology (1883). Here he undertook to vindicate the importance of theory in the social sciences. This was an effort that seemed necessary to him in view of the complete indifference or even hostility which most of his German colleagues, influenced by the antitheoretical attitude of the "younger historical school" in economics, had shown toward his
attempt in the Principles to reconstruct economic theory. To understand the aim of the Problems and the nature of the great controversy to which it gave rise, it is necessary to appreciate the character of the school against which it was directed. The "younger historical school" is somewhat misnamed: unlike von Savigny and the older historical school of jurisprudence, or even Roscher and the "older historical school" in economics, this "younger" school was not interested in history as the study of unique events but regarded historical study as the empirical approach to an eventual theoretical explanation of social institutions. Through the study of historical development it hoped to arrive at the laws of development of social wholes, from which, in turn, could be deduced the historical necessities governing each phase of this development. This was the sort of positivist-empiricist approach which was later adopted by American institutionalists (differing from similar, more recent efforts only in that it made little use of statistical technique), and which is better described (as by Popper) as historicism. It was against this use of history as a means of discovering empirical laws that Menger undertook to defend what he considered to be the proper function of theory—reconstructing the structure of social wholes from their parts by the procedure called methodological individualism by Schumpeter, or the "compositive method" by Menger himself. It is essentially what today is called microtheory. Menger was greatly interested in history and the genesis of institutions, and he was anxious mainly to emphasize the different nature of the task of theory and the task of history proper and to prevent a confusion of their methods. The distinction, as he elaborated it, considerably influenced the later work of Rickert and Max Weber. 
Perhaps the most important part of his discussion was the clear recognition, first, that the object of all social theory is the tracing of what are now usually called the unintended consequences of individual actions (Menger's term was the unbeabsichtigte Resultante), and, second, that in this effort the genetic and the functional aspects could not be separated ([1883] 1963, pp. 163, 180, 182, 188). In expounding and illustrating this view he went far beyond the limit of economics and dealt particularly with the genesis of law. The nature of the dispute has often been confused by the fact that Menger, in arguing against what he regarded as the dominant pseudohistorical school in economics, maintained ideas which had reached him through the historical school in law.
These ideas can be traced back to Mandeville, David Hume, and the later eighteenth-century Scottish philosophers, although the degree to which Menger was directly acquainted with these eighteenth-century sources is not clear. It is worth noting that Menger always had a great interest in the history of economic theory and used it with much didactic skill in his lectures as an introduction to the problems of modern economic theory.
The Problems was unfavorably and condescendingly reviewed by Gustav Schmoller, the head of the younger historical school of economists; Menger replied to Schmoller's criticism in a passionate brochure, Die Irrthümer des Historismus in der deutschen Nationalökonomie (1884). This was the beginning of the celebrated Methodenstreit (dispute on method). Emotions ran high; younger men on both sides joined in; and the dispute produced a cleavage between German and Austrian economics, traces of which were to be felt for decades. In a number of articles during the following few years Menger dealt mainly with problems arising out of the dispute, except for his only other contribution to pure economic theory, the article "Zur Theorie des Kapitals," published in 1888 (see Collected Works, vol. 3, pp. 133-183).
Menger emerged a second time from his academic seclusion in 1892 to join the discussion on the reform of the Austrian currency. His active participation in discussions of policy was foreshadowed in the very same year by his article on money (see Collected Works, vol. 4, pp. 1-2) for the new German encyclopedia of political science. The article was itself a substantial treatise, which devoted much space to the evolution of money but also emphasized the factors determining the amount of money held by individuals, and which laid the foundations for a theory of the value of money on which later Austrian economists, such as Wieser, Von Mises, and Weiss, were able to build.
No less important, however, are his memorandum and his oral evidence to the Austrian currency commission and various articles which he published in 1892 and in the next few years. But while such special occasions led Menger to literary production, his teaching appears to have precluded progress on the great treatise which he hoped would replace his first work. Therefore, in 1903, he prematurely resigned his professorship in order to devote himself entirely to this task. But although he continued to work on it during the remaining 18 years of his life and at one stage seems to have come close to his goal, he continued his efforts after his powers had begun to fail, with the result that he left nothing that was readily
publishable at his death. His son included part of the manuscript material in a second edition of the Grundsätze, which appeared in 1923. But the publication of more of the manuscript material has proved to be a very difficult task which so far has not been accomplished.
Menger built up over the years one of the greatest private libraries in the field of social science, which in 1911 he estimated at something like 25,000 volumes. The sections dealing with the social sciences and anthropology were sold after his death to the Commercial University (now Hitotsubashi University) in Tokyo, which published a catalogue of it in two parts, one in 1926 and the other in 1955. In an assessment of Menger's influence it should be noted that his ideas were introduced into anthropology by Richard Thurnwald, one of his students.
FRIEDRICH A. VON HAYEK
[For the historical context of Menger's work, see ECONOMIC THOUGHT, articles on THE HISTORICAL SCHOOL, THE AUSTRIAN SCHOOL, and THE INSTITUTIONAL SCHOOL; and the biographies of COURNOT; GOSSEN; JEVONS; SCHMOLLER; WALRAS. For discussion of the subsequent development of Menger's ideas, see UTILITY; and the biographies of THURNWALD; VON MISES, LUDWIG; WEBER, MAX; WIESER.]
WORKS BY MENGER
(1870) 1963 Carl Mengers erster Entwurf zu seinem Hauptwerk Grundsätze geschrieben als Anmerkungen zu den Grundsätzen der Volkswirthschaftslehre von Karl Heinrich Rau. With an Introduction by Yuzo Yamada. Tokyo: Bibliothek der Hitotsubashi Universität. -> Written in 1870 and published posthumously.
(1871) 1950 Principles of Economics: First General Part. Edited by James Dingwall and Bert F. Hoselitz, with an Introduction by Frank H. Knight. Glencoe, Ill.: Free Press. -> First published as Grundsätze der Volkswirthschaftslehre. The second complete German edition was published in 1923.
(1883) 1963 Problems of Economics and Sociology. Edited with an introduction by Louis Schneider. Urbana: Univ. of Illinois Press. -> First published as Untersuchungen über die Methode der Socialwissenschaften und der politischen Oekonomie insbesondere.
1884 Die Irrthümer des Historismus in der deutschen Nationalökonomie. Vienna: Hölder.
1892 Beiträge zur Währungsfrage in Oesterreich-Ungarn. Jena (Germany): Fischer.
Carl Mengers Zusätze zu Grundzüge der Volkswirthschaftslehre. With an introduction by Emil Kauder. Tokyo: Bibliothek der Hitotsubashi Universität, 1961. -> Published posthumously.
The Collected Works of Carl Menger. 4 vols. Series of Reprints of Scarce Tracts in Economic and Political Science, No. 17-20. London School of Economics and Political Science, 1933-1936. -> Volume 1: Grundsätze der Volkswirthschaftslehre (1871) 1934. Volume 2: Untersuchungen über die Methode der Socialwissenschaften . . . , (1883) 1933. Volume 3: Kleinere Schriften zur Methode und Geschichte der Volkswirthschaftslehre (1884-1915) 1935. Volume 4: Schriften über Geldtheorie und Währungspolitik . . . (1889-1893) 1936. Contains a biographical introduction by von Hayek in Volume 1, and a complete list of Menger's known writings in Volume 4.
SUPPLEMENTARY BIBLIOGRAPHY
ANTONELLI, ETIENNE 1953 Léon Walras et Carl Menger à travers leur correspondance. Économie appliquée 6:269-287.
BLOCH, HENRI S. 1937 La théorie des besoins de Carl Menger. Paris: Librairie Générale de Droit et de Jurisprudence.
BLOCH, HENRI S. 1940 Carl Menger: The Founder of the Austrian School. Journal of Political Economy 48:428-433.
FEILBOGEN, S. 1911 L'école autrichienne d'économie politique. Journal des économistes Sixth Series 31:50-57, 214-230, 375-388.
HOWEY, RICHARD S. 1960 The Rise of the Marginal Utility School: 1870-1889. Lawrence: Univ. of Kansas Press.
KAUDER, EMIL 1953 The Retarded Acceptance of the Marginal Utility Theory. Quarterly Journal of Economics 67:564-575.
KAUDER, EMIL 1957 Intellectual and Political Roots of the Older Austrian School. Zeitschrift für Nationalökonomie 17:411-425.
KAUDER, EMIL 1959 Menger and His Library. Keizai kenkyu (Economic Review), Hitotsubashi University 10:58-64.
KAUDER, EMIL 1961 Freedom and Economic Theory: Second Research Report on Menger's Unpublished Paper. Hitotsubashi Journal of Economics 2:67-82.
KAUDER, EMIL 1962 Aus Mengers nachgelassenen Papieren. Weltwirtschaftliches Archiv 89:1-28.
SCHUMPETER, JOSEPH A. (1921) 1960 Carl Menger: 1840-1921. Pages 80-90 in Joseph A. Schumpeter, Ten Great Economists, From Marx to Keynes. New York: Oxford Univ. Press. -> First published in German in Volume 1 of Zeitschrift für Volkswirtschaft und Sozialpolitik, New Series.
STIGLER, GEORGE J. 1941 Production and Distribution Theories: The Formative Period. New York: Macmillan. -> See especially pages 134-157 on "Carl Menger."
WEISS, FRANZ X. 1924 Zur zweiten Auflage von Carl Mengers Grundsätzen. Zeitschrift für Volkswirtschaft und Sozialpolitik New Series 4:134-154.
WIESER, FRIEDRICH 1923 Karl Menger. Volume 1, pages 84-92 in Neue österreichische Biographie: 1815-1918. Vienna: Wiener Drucke.
YEAGER, LELAND B. 1954 The Methodology of Henry George and Carl Menger. American Journal of Economics and Sociology 13:233-238.
MENTAL ABILITY
See INTELLIGENCE AND INTELLIGENCE TESTING.
MENTAL DISORDERS
General considerations underlying the study of mental disorders are discussed in the articles under this heading. Concepts of direct relevance are also discussed in ANXIETY; DEFENSE MECHANISMS; STRESS. General categories of mental disorders are reviewed in the articles NEUROSIS; PSYCHOSIS; specific disorders are discussed in CHARACTER DISORDERS; DEPRESSIVE DISORDERS; HYSTERIA; OBSESSIVE-COMPULSIVE DISORDERS; PARANOID REACTIONS; PHOBIAS; PSYCHOPATHIC PERSONALITY; PSYCHOSOMATIC ILLNESS; SCHIZOPHRENIA. Methods of assessing mental disorders are discussed in ELECTROENCEPHALOGRAPHY; INTERVIEWING, article on PERSONALITY APPRAISAL; PERSONALITY MEASUREMENT; PROJECTIVE METHODS. The treatment of mental disorders is discussed under CLINICAL PSYCHOLOGY; COUNSELING PSYCHOLOGY; INTERVIEWING, article on THERAPEUTIC INTERVIEWING; MENTAL DISORDERS, TREATMENT OF; PSYCHIATRY; PSYCHOANALYSIS. Social problems that can be considered as aspects of mental disorders are discussed in DRINKING AND ALCOHOLISM; DRUGS; SEXUAL BEHAVIOR, articles on HOMOSEXUALITY and SEXUAL DEVIATION; SUICIDE.
i. GENETIC ASPECTS                  Eliot Slater
ii. ORGANIC ASPECTS                 Joseph M. Wepman
iii. BIOLOGICAL ASPECTS             Joel Elkes
iv. EPIDEMIOLOGY                    Ernest Gruenberg
v. CHILDHOOD MENTAL DISORDERS       Britton K. Ruebush
vi. EXPERIMENTAL STUDY              George Talland
GENETIC ASPECTS
People who are related to one another by blood tend to resemble one another in, among other things, their mental make-up and their liability to mental illness. Both genetic and environmental factors may play a part in this resemblance.
The most widely accepted view of the nature of the interaction between heredity and environment has been called the diathesis-stress theory (Rosenthal 1963). In its application to mental illness this view suggests that the susceptibility to mental illness, insofar as it is genetically based, varies along a continuum ranging from high to low extremes, most people clustering about an average of moderate susceptibility. Environmental stresses, also, vary from severe to slight. Accordingly, when a mental breakdown occurs, a combination of both factors is involved; thus, we should expect a high rate of breakdown among normal individuals subjected to severe stress, and a high rate also among very susceptible people placed under even mild stress. Such a quantitative relationship has been shown to hold in fact, e.g., in the neurotic illnesses of combat troops (Symonds 1943). Whether psychotic illnesses follow the same general law is a matter more difficult to decide.
Personality deviations and neurotic illness
Twin studies. An important part of the work in the field of personality deviations and neurotic illness has been studies of twins. One-egg, or monozygotic (MZ), twins, whose entire genetic equipment is identical, are of the same sex and are very much alike in physical characteristics. Two-egg, or dizygotic (DZ), twins ordinarily resemble each other no more than any pair of brothers or sisters and are as likely to be of opposite sexes as of the same sex. As a rule only the same-sexed DZ twin pairs are taken by investigators for comparison with MZ pairs. Twin pairs are said to be concordant when it is found that the twin of a proband (index case) with a particular deviation also shows the same anomaly. Concordance rates are usually given as percentages. Genetic hypotheses lead one to suppose that concordance rates should be much higher in MZ than in DZ pairs and that variability within MZ pairs should be smaller than within DZ pairs. Table 1 lists the percentage of concordances found for a variety of conditions, in which the statistics are based on at least thirty pairs.
Criminality and delinquency. Much effort has been put into the investigation of criminality and behavior disorder. The results of Rosanoff in juvenile delinquency are noteworthy (Rosanoff et al. 1934; 1941). There are high rates of concordance both in MZ and in DZ twin pairs, and little difference between them. This suggests that the similarity in the twins' behavior is due to common factors in their environment. It is at least possible that each of the twins constitutes part of the stress factor for his twin partner. The same suggestion arises from the observations of behavior disorder and neurotic traits in school children. Shields found that in these children the degree of neurotic reaction was more noticeably related to environmental factors than to genetic constitution; the hereditary factor showed itself in the type of reaction (1954). [See CRIMINOLOGY; DELINQUENCY; PSYCHOPATHIC PERSONALITY.]
Adult neuroses. Concordance rates are lower in neurotic adults, both in the MZ and in the DZ pairs. The greatly increased variability within both kinds of pairs can be put down to the much wider range of experience, and wider variety of stresses, to which adults are subjected. Male homosexuality. At the opposite extreme are Kallmann's findings of 100 per cent concordance in MZ pairs, as against 12 per cent concordance in DZ pairs, for male homosexuality. This would suggest that in these cases the genetic factors account for the greater part of the variance. Some caution in interpretation is needed, however. The importance that the genetic contribution acquires here may be due to the fact that there was a predominance of the more constitutional type of homosexual in the sample studied by Kallmann and his team. [See SEXUAL BEHAVIOR, articles on SEXUAL DEVIATION and HOMOSEXUALITY.] Studies of relatives. Attempts to estimate the importance of hereditary factors in causing neurotic illness have been made by investigating the frequency of such illnesses among the relatives of neurotic patients. Findings have varied greatly from observer to observer. One of the early workers, Brown (1942), started by investigating the first-degree relatives of patients who had been diagnosed as suffering from obsessional neurosis, anxiety neurosis, and hysteria, as well as those of a control group. Among the relatives of all groups he found individuals suffering from obsessional neurosis, anxiety neurosis, and hysteria, as well as from other personality deviations of a kind not easily named and classified. There were three significant findings: the relatives of the control group had much less psychiatric abnormality than the relatives of any of the other three groups; all three diagnoses were represented among the relatives of
Table 1. Concordance rates for personality deviation and behavior disorders in monozygotic and same-sexed dizygotic twins

                                                                   Number     PER CENT OF CONCORDANCE
Deviation                                                          of pairs   MZ     DZ     Investigator(s)
Male homosexuality                                                    63      100    12     Kallmann 1952
Adult crime                                                          216       68    35     Lange 1929; Rosanoff et al. 1934; Kranz 1936; Stumpfl 1936; Borgstroem 1939
Juvenile delinquency                                                  67       85    75     Rosanoff et al. 1934; 1941
Childhood behavior disorder                                          107       87    43     Rosanoff et al. 1934; 1941
Behavior disorder or marked neurotic traits in school children        41       74    50     Shields 1954
Neurosis, psychopathic personality                                    37       25    14     Slater 1953
Alcoholic addiction                                                   82       65    30     Kaij 1960

Source: Shields & Slater 1960, p. 327, table 8.6.
all three patient groups; among those relatives who were classifiable under the three named diagnoses, it was found that there was a tendency for them to be in the same diagnostic category as the related patient. For example, of the nine obsessional relatives discovered, seven were related to the obsessional patients.
This finding suggests a certain degree of specificity, which is best seen in the investigations which have been made on the relatives of obsessional patients (Luxenburger 1930; Lewis 1936; Rüdin 1953). Among the 100 parents of 50 obsessional patients Lewis found that 37 showed pronounced obsessional traits in one form or another; 21 per cent of the 206 siblings also showed obsessional traits. [See OBSESSIVE-COMPULSIVE DISORDERS.]
At the opposite pole are the findings in the relatives of patients diagnosed as suffering from "hysteria." The best family study was made by Ljungberg (1957), who found that among the fathers, brothers, and sons of hysterics, 2 per cent, 3 per cent, and 5 per cent respectively were themselves hysterics; and among the mothers, sisters, and daughters, 7 per cent, 6 per cent, and 7 per cent respectively. His observations also suggested that hysterical symptoms were not necessarily associated with hysterical personalities, and of the 363 hysterics whose personality structures were analyzed, 55 per cent were found to be nondeviant.
Similar conclusions were reached by the writer (Slater 1961) from a study of 24 pairs of twins, 12 MZ and 12 DZ, in which the proband had been diagnosed as suffering from hysteria. In none of these pairs was there a co-twin who had ever been diagnosed as suffering from hysteria, though abnormalities of personality and psychiatric illness were common. Among the relatives of these pairs the incidence of hysteria was even lower than in Ljungberg's material, and the anomalies found to be most noticeably in excess were manic-depressive and endogenous affective psychoses.
[See HYSTERIA.]
It seems probable that environmental factors are more important than genetic ones in determining whether or not a man breaks down with a neurotic illness. But it would seem that genetic factors influence the predisposition to such breakdown, help to determine whether the personality is a stable one, and, in the event of breakdown, have a large effect on the type of symptoms which are likely to be shown.
Manic-depressive illness
The risk of affective psychoses in the first-degree relatives of manic-depressive patients is shown in Table 2. It will be seen that there is much variation in the data reported by different observers. Nevertheless, these risk figures are all very much higher than would be expected of a sample taken from the general population, in which the incidence level is probably of the order of 0.4 per cent. Approximately 15 per cent of the first-degree relatives of manic-depressives may themselves have affective disorders of the same generic group.

Table 2. Per cent of manic-depressive disorders among relatives of manic-depressive patients

                         NUMBER OF   NATURE OF RELATIONSHIP
Investigator             PROBANDS    Parents   Siblings   Children
Banse 1929                   80        10.8      18.1        -
Röll and Entres 1936         83        13.0       -         10.7
Slater 1938                 138        15.5       -         15.2
Strömgren 1938               77         7.5       -         10.7
Sjögren 1948                 45         7.0       3.6        -
Kallmann 1950                75        23.4      23.0        -
Stenstedt 1952              216         7.4      12.3        9.4
Fonseca 1959                 60        22.8      18.9       21.7

Single-gene explanation. One explanation suggests that there is a single dominant autosomal gene which predisposes to a spontaneous variation in mood. Most people with such a tendency are likely to remain healthy throughout their lives, though their cyclothymic temperament may be clearly recognizable both to themselves and to their families. If such cyclothymic individuals are subjected to stresses, the spontaneous swing of mood into depression or elation may go so far and last so long that medical treatment becomes necessary; and then the patient may be stigmatized as a "manic-depressive." The genetic hypothesis proposes, in fact, to account for only a part of the causation. Environmental factors and threshold effects must also be playing a part. The theory requires that 50 per cent of the first-degree relatives of manic-depressives should be gene carriers; if only 15 per cent of the relatives show themselves as such, this can be put down to low penetrance of the gene. [See DEPRESSIVE DISORDERS.]
Polygenic explanation. A single-gene hypothesis, however, is not the only possibility; polygenic inheritance is an alternative explanation. Edwards (1960; 1963) has drawn attention to the fact that, in the case of common conditions, it is not easy to distinguish between the expected consequences of a single gene with diminished penetrance and of multifactorial inheritance with a threshold effect.
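The arithmetic behind the two hypotheses discussed in this section can be made explicit. What follows is only a minimal sketch in Python and is not drawn from the studies cited. It assumes that the penetrance implied by the single-gene hypothesis is simply the observed risk among first-degree relatives divided by the expected proportion of gene carriers, and it uses the square-root approximation that this section attributes to Edwards for the multifactorial case. The function names are hypothetical.

```python
import math


def implied_penetrance(observed_risk: float, carrier_fraction: float = 0.5) -> float:
    """Rough penetrance implied by a single dominant gene.

    Assumption (not stated as a formula in the article): penetrance is the
    observed risk in first-degree relatives divided by the proportion of
    those relatives expected to carry the gene.
    """
    if not 0 < carrier_fraction <= 1:
        raise ValueError("carrier_fraction must lie in (0, 1]")
    return observed_risk / carrier_fraction


def edwards_relative_risk(population_frequency: float) -> float:
    """Square-root approximation for the risk in first-degree relatives
    under multifactorial inheritance with a threshold."""
    if not 0 <= population_frequency <= 1:
        raise ValueError("population_frequency must lie in [0, 1]")
    return math.sqrt(population_frequency)


if __name__ == "__main__":
    # Figures quoted in the article: about 15 per cent of first-degree
    # relatives affected, 50 per cent expected carriers, and a population
    # frequency of about 0.004.
    print(f"implied penetrance: {implied_penetrance(0.15):.2f}")
    print(f"multifactorial expectation: {edwards_relative_risk(0.004):.3f}")
```

With the figures quoted in the text, the sketch gives an implied penetrance of about 0.30 and a multifactorial expectation of about 6.3 per cent, which lies within the range of 6 to 7 per cent given in the article. Neither calculation allows for age correction or sampling error.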
Assuming that the predisposition to a condition, such as schizophrenia or diabetes, is quantitatively graded, with a normal distribution, Edwards suggests that when p is the frequency of the disorder, the incidence of the disorder in the first-degree relatives of persons suffering from the disorder will be approximately √p. If the frequency of manic-depressive illness in the general population is approximately 0.004, then the frequency of manic-depressive illness in the first-degree relatives of manic-depressives should be about 6 per cent to 7 per cent (√0.004 ≈ 0.063). The observations are about double that figure, but it is not possible to say that observation and expectation, on the multifactorial hypothesis, are irreconcilable.
Schizophrenia
Extensive investigations of the hereditary factor in schizophrenia have been made by Kallmann and his associates in the New York State Psychiatric Institute and Hospital (Columbia University). A synopsis of the results obtained is given in Table 3. These risk figures should be compared with the estimated risk of schizophrenia for a member of the general population of 0.9 per cent. Environmental effects show up: association with a schizophrenic proband in the same home virtually doubles the risk of schizophrenia for step-sibs and spouses; and there is a greater risk for the nonseparated MZ co-twin of a schizophrenic than for an MZ co-twin who has lived apart from the proband for five years or more. However, the table also shows the risk of schizophrenia running up steeply with increasingly close degrees of blood relationship. In view of these figures it is difficult and, the writer feels, unrealistic to dispute the conclusion that genetic factors play a significant role in the causation of schizophrenia.
The risk figures published by Kallmann are somewhat higher than those obtained by other workers but are in the main reconcilable with them. The writer, for instance (Slater 1953), found a risk of

Table 3. Per cent of risk of schizophrenia among relatives of schizophrenics

Nature of relationship                            Per cent risk
Not related by blood, step-sibs                        1.8
Not related by blood, spouses                          2.1
First cousins                                          2.6
Nephews and nieces                                     3.9
Grandchildren                                          4.3
Half-sibs                                              7.1
Parents                                                9.2
Full-sibs                                             14.2
Dizygotic co-twins                                    14.5
DZ co-twins of same sex                               17.6
Children with one schizophrenic parent                16.4
Children with two schizophrenic parents               68.1
Monozygotic co-twins                                  86.2
MZ co-twins living apart for at least 5 years         77.6
MZ co-twins not so separated                          91.5

Source: Adapted from Kallmann 1950.
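The comparison that the article asks for, between the risks in Table 3 and the general-population risk of 0.9 per cent, can be sketched as a simple ratio. The Python below is an illustration only: the values are transcribed from Table 3, while the names `TABLE_3_RISKS`, `risk_ratio`, and `RATIOS` are hypothetical, and the ratio is a crude descriptive measure and not a statistic reported by Kallmann.

```python
# Per cent risk of schizophrenia by relationship, transcribed from Table 3.
TABLE_3_RISKS = {
    "Not related by blood, step-sibs": 1.8,
    "Not related by blood, spouses": 2.1,
    "First cousins": 2.6,
    "Nephews and nieces": 3.9,
    "Grandchildren": 4.3,
    "Half-sibs": 7.1,
    "Parents": 9.2,
    "Full-sibs": 14.2,
    "Dizygotic co-twins": 14.5,
    "DZ co-twins of same sex": 17.6,
    "Children with one schizophrenic parent": 16.4,
    "Children with two schizophrenic parents": 68.1,
    "Monozygotic co-twins": 86.2,
    "MZ co-twins living apart for at least 5 years": 77.6,
    "MZ co-twins not so separated": 91.5,
}

# General-population risk quoted in the article, in per cent.
GENERAL_POPULATION_RISK = 0.9


def risk_ratio(risk_percent: float,
               baseline_percent: float = GENERAL_POPULATION_RISK) -> float:
    """Ratio of a relative's risk to the general-population risk."""
    if baseline_percent <= 0:
        raise ValueError("baseline_percent must be positive")
    return risk_percent / baseline_percent


RATIOS = {name: risk_ratio(risk) for name, risk in TABLE_3_RISKS.items()}

if __name__ == "__main__":
    for name, ratio in RATIOS.items():
        print(f"{name:<48} {ratio:6.1f} x baseline")
```

On these figures the ratio for step-sibs is about 2, in line with the remark that sharing a home with a schizophrenic proband virtually doubles the risk, and it rises to roughly 16 for full sibs and 96 for monozygotic co-twins.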
76 per cent for the MZ co-twin and 14 per cent for the DZ co-twin, in a sample of 41 MZ and 115 DZ pairs. However, it is noteworthy that Essen-Möller (1941) in Sweden found larger differences within MZ pairs than were found by other workers, and investigations in Norway and Finland show tendencies in the same direction. Thus Kringlen (1966) collected 50 MZ and 94 same-sexed DZ pairs, with concordance rates of 38 per cent and 14 per cent; and Tienari (1963) found that none of the 16 male MZ pairs he studied were concordant. It is possible that what is diagnosed as schizophrenia in Scandinavia is not quite the same as schizophrenia in Germany, Britain, the United States, and Japan.
That there may be peculiar features about the gene distributions in northern lands is also suggested by the work of Böök (1953). In a remote part of Sweden north of the Arctic circle, in a population of farmers and lumbermen, he found a high incidence of schizophrenia, in a predominantly catatonic form; the incidence of schizophrenia in the relatives of probands, however, was also high, suggesting an intermediate gene with 20 per cent penetrance in the heterozygote. The prevalence of different genetic predisposing factors for schizophrenia in different parts of the world is a possibility which deserves investigation.
The findings of Tienari, by themselves, are anomalous and should not be taken as throwing doubt on the results obtained by others; they should be considered in relation to those of other observers, summarized in Table 4. In this table the figures relating to schizophrenia are derived from work in Germany, the United States, Sweden, and England. To these should be added the results obtained by the Japanese workers Kurihara (1959) and Inouye (1961). Inouye found concordance for schizophrenia in 60 per cent of 55 MZ pairs and in 12 per cent (two pairs) of the DZ pairs.
Kurihara also found that 29 of 45 MZ pairs were concordant for schizophrenic symptomatology, but none of the 9 DZ pairs were. The twin work on schizophrenia has been analyzed and discussed critically by Rosenthal in a number of papers (1962a; 1962b; 1963). He concludes that concordance rates have been artificially inflated by the sampling methods employed. Cases have been largely taken from standing populations and include an unrepresentative proportion of severe cases; if genetic factors are connected with reduced chances of remission (which has yet to be shown), this sampling would obviously bias the results. Clearly, sampling from consecutive admissions, or better still from birth registers, would be an improvement. Rosenthal criticizes standards

Table 4. Concordance rates for mental disorders in monozygotic and same-sexed dizygotic twins

                       Number       Pairs      PER CENT OF CONCORDANCE   Relative
Type of disorder       of studies   reported   MZ        DZ              increase*
Mental defective           5          569      94        65                .83
Child and juvenile         5          209      87        53                .73
Schizophrenic              5          728      69        13                .63
Affective                  6          184      70        28                .58
Epileptic                  7          214      54         7                .50
Criminal                   6          231      66        32                .50
Neurotic                   5          103      43        23                .26
Senile                     3           56      44        27                .24

* The relative increase in concordance associated with identity of genetical constitution, between theoretical limits of 1 and 0: (MZ − DZ)/(100 − DZ).
Source: Adapted from Essen-Möller 1963.
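The index defined in the footnote to Table 4 can be computed directly from a pair of concordance percentages. The following Python is a minimal sketch of that formula, (MZ − DZ)/(100 − DZ); the function name `relative_increase` is hypothetical, and the check on the input is an added assumption, since the ratio is undefined when DZ concordance is 100 per cent.

```python
def relative_increase(mz_percent: float, dz_percent: float) -> float:
    """Relative increase in concordance: (MZ - DZ) / (100 - DZ).

    Both arguments are concordance rates in per cent. The index is 0 when
    MZ and DZ pairs are equally concordant and 1 when MZ concordance is
    complete.
    """
    if not (0 <= mz_percent <= 100 and 0 <= dz_percent < 100):
        raise ValueError("need 0 <= MZ <= 100 and 0 <= DZ < 100")
    return (mz_percent - dz_percent) / (100 - dz_percent)


if __name__ == "__main__":
    # MZ and DZ percentages taken from the first and sixth rows of Table 4.
    print(f"{relative_increase(94, 65):.2f}")
    print(f"{relative_increase(66, 32):.2f}")
```

Because the percentages printed in the table are themselves rounded, values recomputed in this way can differ from the printed relative increase by about 0.01 in some rows.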
of diagnosis, both of zygosity and clinical classification, and considers that these diagnoses should be made independently of one another by different observers. Not all recorded work is equally open to such criticism; it is, for instance, a considerable safeguard to publish protocols in full, as the writer did, to make them available for rediagnosis by the reader. [See SCHIZOPHRENIA.]
Etiological theories. Rosenthal classifies the etiological theories of schizophrenia into (1) monogenic-biochemical, (2) diathesis-stress, and (3) life-experience theories, his own views inclining to a theory of the second type. This is no place for the discussion of the difficult problems involved, but the writer inclines to a theory of the first type. It can be shown (Slater 1958) that a monogenic theory fits fairly well with the empirically obtained figures for the frequency of schizophrenia in the siblings of schizophrenics, in the children of one schizophrenic parent, and in the children of parents both of whom had schizophrenic illnesses. These data can be reconciled with a gene of intermediate type, manifesting itself in all homozygotes but in only 26 per cent of heterozygotes; all but 3 per cent of schizophrenics would be heterozygous for the gene. This hypothesis clearly involves a massive environmental contribution in the causation of manifest illness and therefore differs from theories of type (2) only in supposing that the element of specificity in determining the type of psychosis is provided by the genetic constitution. One may expect that decisive support or refutation of type (1) theories will depend on biochemical investigations.
Presenile and senile dementias
Inheritance in Huntington's chorea is dependent on an autosomal dominant gene. Its incidence in the sexes is about equal. The age of onset, according
to Panse (1942), extends from early childhood to the late sixties, with a mean at age 36; Wendt and his colleagues estimate the mean age of onset at 44 (1960). These estimates mean that most of the children of gene carriers are born before the parent has developed the disease. Elimination of the pathogenic gene by processes of natural selection is, accordingly, very slow. The disease is slowly progressive and fatal, with a mean duration of 13 years (Wendt et al. 1960).
Unusual forms of presenile dementia are Pick's disease and Alzheimer's disease; both have a genetic basis. Compared with Huntington's chorea, they occur later in life, with a mean age of onset of 55, and with about seven years as the mean duration. Although the conditions are distinct pathologically, they are often difficult to distinguish clinically. According to Sjögren (Sjögren et al. 1952), the genetic factor in Pick's disease is most probably a dominant major gene, its manifestation subject to modifying genes; in Alzheimer's disease Sjögren thinks multifactorial inheritance more probable.
The problem of genetic determination in senile dementia is even more difficult. The most generally accepted opinion in the past has been that senile dementia is but one aspect of senescence and that specific genetic causation is improbable. However, this established viewpoint has been challenged by the work of Larsson, Sjögren, and Jacobson (1963). In a large study in Stockholm they found that senile dementia was not correlated with senescence. The relatives of patients suffering from senile dementia were not more than normally subject to other conditions, although their risk of senile dementia itself was increased; no instances of Pick's or Alzheimer's disease were found among them. There did not appear to be special factors for longevity whose presence or absence was connected with senile dementia.
No evidence could be found of environmental factors of a sociomedical kind playing a part in determining the onset of senile dementia. Furthermore, there was no secular change in the incidence of the disease. Particularly in favor of an explanation in terms of a single gene rather than multifactorial inheritance were the variations in geographical distribution and the absence of intermediate states between senile dementia and normal aging in the siblings and children of the probands. [See AGING.] The morbidity risk for senile dementia was found to be greatly increased among the relatives of the patients. These researchers consider that the best working hypothesis for the explanation of their findings is that of an autosomal major dominant gene. This gene would be subject to diminished penetrance, the manifestation rate increasing with age. Only a minority even of the gene carriers would ever develop senile dementia; and since the calculated gene frequency was 0.12, the great bulk of the population would be immune.

ELIOT SLATER

[Directly related are the entries GENETICS; PSYCHOLOGY, article on CONSTITUTIONAL PSYCHOLOGY. Other relevant material may be found in SCHIZOPHRENIA.]

BIBLIOGRAPHY
BANSE, J. 1929 Zum Problem der Erbprognosebestimmung: Die Erkrankungsaussichten der Vettern und Basen von Manisch-Depressiven. Zeitschrift für die gesamte Neurologie und Psychiatrie 119:576-612.
BÖÖK, J. A. 1953 A Genetic and Neuropsychiatric Investigation of a North-Swedish Population: I. Psychoses. Acta genetica et statistica medica (Basel) 4:1-100.
BORGSTROEM, C. A. 1939 Eine Serie von kriminellen Zwillingen. Archiv für Rassen- und Gesellschaftsbiologie 33:334-343.
BROWN, FELIX W. 1942 Heredity in the Psychoneuroses. Royal Society of Medicine, Proceedings 35:785-790.
EDWARDS, J. H. 1960 The Simulation of Mendelism. Acta genetica et statistica medica (Basel) 10:63-70.
EDWARDS, J. H. 1963 The Genetic Basis of Common Disease. American Journal of Medicine 34:627-638.
ESSEN-MÖLLER, ERIK 1941 Psychiatrische Untersuchungen an einer Serie von Zwillingen. Acta psychiatrica et neurologica scandinavica Supplement 23.
ESSEN-MÖLLER, ERIK 1963 Twin Research and Psychiatry. Acta psychiatrica scandinavica 39, fasc. 1:65-77.
FONSECA, ANTONIO F. DA 1959 Análise heredo-clínica das perturbações afectivas: Estudo de 60 pares de gémeos e seus consanguíneos. Universidad de Porto (Portugal): Faculdade de Medicina.
INOUYE, EIJI 1961 Similarity and Dissimilarity of Schizophrenia in Twins. Volume 1, pages 524-530 in World Congress of Psychiatry, Third, Montreal, Proceedings. Univ. of Toronto Press.
KAIJ, LENNART 1960 Alcoholism in Twins: Studies on the Etiology and Sequels of Abuse of Alcohol. Stockholm: Almqvist & Wiksell.
KALLMANN, FRANZ J. 1950 The Genetics of Psychoses: An Analysis of 1,232 Twin Index Families. American Journal of Human Genetics 2:385-390.
KALLMANN, FRANZ J. 1952 Comparative Twin Studies on the Genetic Aspects of Male Homosexuality. Journal of Nervous and Mental Disease 115:283-298.
KRANZ, HEINRICH 1936 Lebensschicksale krimineller Zwillinge. Berlin: Springer.
KRINGLEN, EINAR 1966 Schizophrenia in Twins: An Epidemiological-Clinical Study. Psychiatry 29:172-184.
KURIHARA, M. 1959 A Study of Schizophrenia by Twin Method. Psychiatria et neurologia japonica (Seishin shinkeigaku zasshi) 61:1721-1741. → Text in Japanese; title and summary in English.
LANGE, JOHANNES (1929) 1931 Crime as Destiny: A Study of Criminal Twins. London: Allen & Unwin. → First published as Verbrechen als Schicksal.
LARSSON, TAGE; SJÖGREN, TORSTEN; and JACOBSON, GEORGE 1963 Senile Dementia: A Clinical Sociomedical and Genetic Study. Copenhagen: Munksgaard.
LEWIS, AUBREY 1936 Problems of Obsessional Illness. Royal Society of Medicine, Proceedings 29:325-336.
LJUNGBERG, L. 1957 Hysteria: A Clinical, Prognostic and Genetic Study. Acta psychiatrica et neurologica scandinavica Supplement 112.
LUXENBURGER, H. 1930 Psychiatrisch-neurologische Zwillingspathologie. Zeitschrift für die gesamte Neurologie und Psychiatrie 56:145-180.
PANSE, FRIEDRICH 1942 Die Erbchorea: Eine klinisch-genetische Studie. Leipzig: Thieme.
RÖLL, A.; and ENTRES, J. L. 1936 Zum Problem der Erbprognosebestimmung: Die Erkrankungsaussichten der Neffen und Nichten von Manisch-Depressiven. Zeitschrift für die gesamte Neurologie und Psychiatrie 156:169-202.
ROSANOFF, AARON J.; HANDY, L. M.; and PLESSET, I. R. 1941 The Etiology of Child Behavior Difficulties, Juvenile Delinquency and Adult Criminality, With Special Reference to Their Occurrence in Twins. Sacramento: California State Printing Office.
ROSANOFF, AARON J.; HANDY, L. M.; and ROSANOFF, I. A. 1934 Criminality and Delinquency in Twins. Journal of Criminal Law and Criminology 24:923-934.
ROSENTHAL, DAVID 1962a Familial Concordance by Sex With Respect to Schizophrenia. Psychological Bulletin 59:401-421.
ROSENTHAL, DAVID 1962b Problems of Sampling and Diagnosis in the Major Twin Studies of Schizophrenia. Journal of Psychiatric Research 1:16-34.
ROSENTHAL, DAVID (editor) 1963 The Genain Quadruplets: A Case Study and Theoretical Analysis of Heredity and Environment in Schizophrenia. New York: Basic Books.
RÜDIN, EDITH 1953 Ein Beitrag zur Frage der Zwangskrankheit, insbesondere ihrer hereditären Beziehungen. Archiv für Psychiatrie und Nervenkrankheiten 191:14-54.
SHIELDS, JAMES 1954 Personality Differences and Neurotic Traits in Normal Twin Schoolchildren: A Study in Psychiatric Genetics. Eugenics Review 45:213-245.
SHIELDS, JAMES; and SLATER, ELIOT (1960) 1961 Heredity and Psychological Abnormality. Pages 298-343 in Hans J. Eysenck (editor), Handbook of Abnormal Psychology: An Experimental Approach. New York: Basic Books.
SJÖGREN, TORSTEN 1948 Genetic-Statistical and Psychiatric Investigations of a West Swedish Population. Acta psychiatrica et neurologica scandinavica Supplement 52.
SJÖGREN, TORSTEN; SJÖGREN, HÅKON; and LINDGREN, ÅKE G. H. 1952 Morbus Alzheimer and Morbus Pick: A Genetic, Clinical and Patho-anatomical Study. Acta psychiatrica et neurologica scandinavica Supplement 82.
SLATER, ELIOT 1938 Zur Erbpathologie des manisch-depressiven Irreseins: Die Eltern und Kinder von Manisch-Depressiven. Zeitschrift für die gesamte Neurologie und Psychiatrie 163:1-47.
SLATER, ELIOT 1953 Psychotic and Neurotic Illnesses in Twins. Medical Research Council Special Report No. 278. London: H.M. Stationery Office.
SLATER, ELIOT 1958 The Monogenic Theory of Schizophrenia. Acta genetica et statistica medica (Basel) 8:50-56.
SLATER, ELIOT 1961 The Thirty-fifth Maudsley Lecture: Hysteria 311. Journal of Mental Science 107:359-381.
STENSTEDT, ÅKE 1952 A Study in Manic-Depressive Psychosis. Acta psychiatrica et neurologica scandinavica Supplement 79.
STRÖMGREN, ERIK 1938 Beiträge zur psychiatrischen Erblehre. Acta psychiatrica et neurologica scandinavica Supplement 19.
STUMPFL, FRIEDRICH 1936 Die Ursprünge des Verbrechens dargestellt am Lebenslauf von Zwillingen. Leipzig: Thieme.
SYMONDS, C. P. 1943 The Human Response to Flying Stress. British Medical Journal [1943]: 703-706, 740-744. → Lecture 1: "Neurosis in Flying Personnel." Lecture 2: "The Foundations of Confidence."
TIENARI, P. 1963 Psychiatric Illnesses in Identical Twins. Acta psychiatrica scandinavica 39 (Supplement 171). → The entire issue is devoted to Tienari's study.
WENDT, G. G.; LANDZETTEL, I.; and SOLTH, K. 1960 Krankheitsdauer und Lebenserwartung bei der Huntingtonschen Chorea. Archiv für Psychiatrie und Nervenkrankheiten 201:298-312.

II ORGANIC ASPECTS
Traditionally, description of the organic syndromes, in relation to their effect upon the central nervous system and behavior generally, has included such generalizing conditions as cerebral arteriosclerosis, acute and chronic alcoholism, presenile and senile conditions, degenerative neural diseases, and developmental mental deficiencies, as well as conditions producing more focal disorders such as head injuries, brain tumor extirpations, and cerebrovascular accidents (the boxer who is punch-drunk or the stroke patient). The trend in recent literature, however, confirms what is a notably progressive shift in emphasis away from strictly clinical or case description, as exemplified by texts in psychiatry and neurology, toward more varied studies, which consider psychodiagnostic and psycholinguistic phenomena in the context of clinical observations. With these important changes in the study of organic pathology has also come a voluminous growth in the literature.

Within this literature (see Wepman 1961) can be found the continuing traditional interest of the neurologist and psychiatrist, as seen, for example, in the recently published American Handbook of Psychiatry (Arieti 1959). However, the great breadth of the field can be noted in the work of experimental psychologists using both animal and human subjects; in the recent neurophysiologic research on such central nervous system areas as the reticular formation, the limbic lobes, and the association tracts; in the dramatic electrode stimulation studies of cortical functions carried on during neurosurgery; in the elaboration of new psychodiagnostic signs by psychometric researchers; and in the growing group of neurophysiological theories related to brain function. Additional examples, from the more applied fields, are rehabilitation for brain-impaired persons and the not inconsiderable addition of research on language disabilities.

As can be seen, the literature of neuropsychology is rich and varied. Within it, the reader will find, however, that there are more unknowns than knowns, more questions than answers, more theories than facts. While a great deal has been written about the behavior of the brain-impaired, very few facts have been demonstrated. For example, it is generally accepted that in thought, emotional control, and intellection, the role of the central nervous system is crucial. Yet the precise manner in which the nervous system and the human brain work to fulfill this function is undetermined and vague. A leading neurophysiologist once described the human brain as a "black box" that must function in certain well-prescribed ways in order to produce all that it does in human behavior but unfortunately is not available for viewing.

Nevertheless, the great concentration of attention by so many trained observers, the extensive studies of clinical cases, the host of theories, the wide and growing application of differential diagnostic techniques, and the increasing activities of both language therapists and psychotherapists treating patients suffering from brain impairment have all produced a considerable body of knowledge available for better understanding of organic mental disorders. The following is an attempt to describe some of the more prominent behavioral phenomena commonly associated with neural impairment and includes major sources of data and references in order to facilitate and encourage further investigation by the reader.

Common behavioral manifestations

Alteration in thought, personality changes, and changes in language comprehension and usage take on many forms following cortical insult.
By and large, however, most investigators agree that individual patterns of behavior in the brain-impaired differ from those of the unimpaired more in degree than in type. Table 1 presents the reader with a list of the most common indications of brain impairment.

Table 1. Behavioral symptoms of the brain-impaired

CLINICALLY OBSERVED SYMPTOMS
Memory loss (especially immediate memory)
Reduced association of ideas
Perseveration of thought and language
Feelings of inadequacy
Egocentricity
Hyperirritability
Overfatiguability
Euphoria
Catastrophic overreaction
Reduced initiative
Lack of spontaneity
Impulsive behavior
Regressive behavior
Fluctuating ability
Situational or fixed anxiety

PSYCHODIAGNOSTICALLY ELICITED SYMPTOMS
Poor attention and concentration
Memory loss
Abstract-concrete imbalance
Poor ability to organize and preplan
Difficulty in forming generalizations
Inability to categorize
Lowered general intelligence
Psychomotor retardation
Perplexity (questioning one's own ability)
Psychological impotence (recognition of errors without the ability to alter responses)
Egocentricity
Anxiety
Specific modality disabilities in learning
Body-image distortion
Spiral afterimage reactions
Delayed response patterns

LANGUAGE DISTURBANCES
Aphasia
Agnosia
Apraxia
Dysarthria

While space limitations preclude extensive comment on the relative merit of individual signs of organicity as indicators of organic psychopathology, some generalizations about them are warranted. Many of these signs appear as the result of a disordering process producing abnormal behavior; others are the product of the retained ability to function of what is left of the nervous system; and still others derive from the altered self-concept of the people involved (their reactions to their impairment). At any rate, these signs, while not necessarily pathognomonic of the brain-injured, are rather useful indicators of the likelihood that a cortical impairment has occurred.

Research comparing known brain-impaired subjects with normal individuals and subjects with thought disorders (for example, schizophrenics) has consistently shown that pathological conditions, while frequently unlike normal states, are most often like each other (Chapman 1960). In fact, the argument has been advanced that the very presence of so many so-called organic signs in schizophrenia may indicate that this psychiatric condition is indeed an organic psychosis. Thus, behavioral symptoms can be viewed as frequently occurring telltale indicators of pathology but cannot themselves be viewed as differentially diagnostic signs of organic pathology.

It should also be pointed out that most of the signs used as organic indicators have not been found by research studies to occur in every organic condition. In part this is true, the present writer feels, because few of the studies have been really comparable. Research populations differ in so many ways (for example, age of subjects and location and extent of lesion or impairment) that the signs said to be associated with brain pathology in one study may be completely lacking in an equally well attested study of another group of brain-injured patients. A final source of inconsistency is related to subjects with a known brain injury but who differ in behavior from other such subjects even though neuropathies may be alike, the location of damage is thought to be the same, and the amount or size of the lesion is thought to be roughly equal. The generalization must be made, therefore, that brain-impaired patients differ from one another not only in the size and location of their structural defect but also in the behavioral aftereffects of similar neurological impairment. Certain commonalities do exist, however, and it is to these that students of brain pathology turn in attempting to gain a greater understanding of cortical function and dysfunction.

The remainder of this article will be concerned with a more particular discussion of the behavioral signs appearing in Table 1 in relation to organic disorder, with emphasis upon the literature and major sources of data. Shortage of space unfortunately limits consideration to only the most prominent and generally elicited signs appearing in each of the three categories arbitrarily designated as clinically observed behavior, psychodiagnostically elicited behavior, and psycholinguistic disturbance patterns.

Clinically observed behavior. Many of the signs enumerated in the category of clinically observed behavior can be related to certain basic processes that seem to underlie the signs themselves.

Lack of control over behavior. Most notable among these processes is the concept of "control." Since the central nervous system is generally considered a major determinant in the control of behavior as the individual seeks to adapt to his environment, loss of memory and similar behavior may be best understood, in this context, as a loss of selective control over recall of events from the past. Further, memory loss in organic conditions is said to be most severe in the area of recall of very recent events, for example, when the individual must select (control) the appropriate impressions from a variety of recent ones. Related to the
recall aspect of memory loss is the loss of capacity to attend, concentrate, or exert simple control over one's ideas.

Impulsiveness, emotionality, and rigidity. Similarly, many clinicians have observed that the brain-impaired are often impulsive and emotionally labile. They react catastrophically to noncatastrophic events, or they become easily frustrated. At the other extreme, the brain-impaired often display tendencies toward rigidity or repetition in thought and language, find it difficult to shift from one idea to another, and are frequently inflexible in behavior. In each case then, these signs point to the patient's inability to regulate his behavior in a flexible fashion. Lack of adequate inhibition and sufficient control brings about these overt behavioral patterns, which have been termed symptoms of brain damage. With adequate control, the unimpaired behave with intent, behave appropriately, and behave purposively. The impaired, having lost control, show abnormalities of behavior in all three areas. (Note: The writer feels it is unnecessary to discuss clinically observed symptoms more extensively here, as they are generally self-explanatory; however, more detailed consideration may be found in the references cited in the bibliography.)

Psychodiagnostically elicited symptoms. The loss of perceptual ability is a basic process that frequently affects the brain-impaired. This loss of function is especially noticeable in the capacity to learn through specific sense modalities. Consequently, it is most often described in relation to problems of children with known neurological deficiencies. But it seems equally true in all adults with brain damage, where perceptual deficiencies have been noted as being modality-bound rather than generalized.
Auditory perception, for example, may be affected by brain impairment or by failure of certain portions of the nervous system to develop, while other perceptual abilities may remain intact (Wepman 1960). Some children, it has been noted, fail to develop language at the time expected of them (that is, by two or three years of age). This is not because of any lack of general intelligence or because of deafness or emotional instability but rather because they are unable to learn in situations that involve the auditory pathway. Other children, who do speak adequately, have been observed having difficulty in learning to read at the expected age. Again, the cause may be nothing other than their inability to use the visual modality in learning, despite the measurable adequacy of their visual acuity. [See HEARING; VISION.]
These auditory and visual agnosias in children are paralleled by similar disturbances of transmission of input stimuli following cortical insult in adults. Perceptual learning disability along specific modality lines has become the center of attention for many students of behavior in brain-impaired children and adults.

As Table 1 implies, many signs of organic brain impairment can be elicited by psychometric and projective assessment. In some instances these are found to duplicate the clinically observed signs. But in others, they are observed only when the subject is under the stress and scrutiny of formal testing.

Verbal and nonverbal functioning. The use of psychological tests as a means of isolating behavioral indicators of brain impairment continues to be a source of considerable research. Babcock (1941) noted in her studies of the brain-injured that certain intellectual faculties seem better retained after trauma than others. She then proposed to study subjects and evaluate their product in terms of these better retained areas (verbal behavior) as compared with the abilities that are less well retained (visually stimulated abstraction ability). A similar differentiation was used in the Shipley-Hartford Scale (Shipley 1940) and the Hunt-Minnesota Test for Brain Impairment (Hunt 1943).

Some support for this notion also seemed to follow from the characteristics of the tests used. Vocabulary retention, or verbal ability, as measured by most tests, is statistically well equated with general intelligence, is known to have a high test-retest reliability, and is generally a stable measure. In contrast, the tasks of visual abstraction are less stable, have less reliability over time, and are therefore more likely to be susceptible to change as a result of organic conditions affecting the nervous system.
Wechsler followed the concept of retained as opposed to sensitive functions, not in a comparison of verbal versus nonverbal behavior, but in empirically determined responses to various subtests in his intelligence scale. From the results of his studies of an aging population, he devised a deterioration index (1955). By and large, this approach has fallen into disfavor among psychologists, however, since confirming research for the deficiencies found by the original author has been lacking (Kass 1949). Those who argue against the use of the verbal versus the nonverbal differential, or, as the believers in perceptual disabilities put it, "the differential between aurally stimulated verbal behavior and visually stimulated nonverbal behavior,"
seem to this writer to be correct in their conclusion, but for the wrong reasons. In none of the research reported on the many indexes of deterioration was any attempt made to isolate the location of the organic condition, even to the rather loose degree of determining the hemisphere affected. Yet many language studies (Wepman 1951; Ettlinger et al. 1956) and the work of such researchers as Reitan and Reed (1962), Milner (1954), Bauer and Wepman (1955), and others have reported each of the two hemispheres differentially responsible for various intellectual tasks. [See INTELLIGENCE AND INTELLIGENCE TESTING.]

In conclusion, therefore, it would be expected that the verbal-nonverbal paradigm would be successful. When damage has occurred to the left brain (the apparent site of integrations necessary for verbal behavior) it would seem that subjects should do less well with verbal than with nonverbal material. When, on the other hand, notable damage occurs in the right hemisphere, which, according to many studies, is the locus of control and integration for nonverbal thought, the expectancy would then be that nonverbal material would present greater difficulty. To the degree that such a distinction can be made in psychological tests (and it can be in many of them), the tendency for the effects of brain damage to be depicted seems high. Re-examining known organically impaired subjects from this viewpoint shows this differentiation to be a meaningful one. The degree to which this approach can be used in differential diagnosis where brain damage is suspected but not certain, however, still needs research verification.

The qualitative approach for the delineation of personality disorganization, using projective tests as the source of data, has proven of value in the hands of many psychologists (Aita et al. 1947; Baker 1956; Hughes 1948; Piotrowski 1937).
Unfortunately, most of the results reported are the subjective interpretations of individual examiners and are barely or not at all confirmed in replicated studies. Where such studies have been attempted, few of the signs found diagnostic by one examiner have been elicited by others. For the purposes of demonstrating some of the psychodynamic processes affected in many brain-injured subjects, however, the different projective techniques have proven of considerable value. Yet even here it seems important to point out the individual variability in behavioral reactions. Since brain-injured subjects differ so markedly, for example, in their responses to their traumas, or with respect to their self-conceptions, it has been found to be of little value
to look for commonality of reaction in personality change. Abstract and concrete functioning. Goldstein, perhaps more than any other student of brain disorders, has postulated both specific disabilities and changes in basic attitudes and thought processes as a result of organic conditions. He pointed out that ". . . even in cases of circumscribed cortical damage, the disturbances are scarcely ever confined to a single field of performance. In such intricate syndromes, we deal not only with a simple combination of disparate disturbances but also with more or less unitary, basic change that affects different fields homologously and expresses itself through different symptoms . . ." ([1934] 1939, pp. 15-16). Of all the symptomatology noted in the study of the organic psychopathologies, none has had greater impact or perhaps stirred greater controversy than Goldstein's concept of the shift from the abstractive to the concrete mode of thought (Goldstein & Scheerer 1941). With his co-workers, he postulated both theory and methods for determining the loss of categorical, abstracting behavior. The concept, by and large, has received greater acceptance than the methods. Today almost every clinician studying the behavior of the brain-impaired patient concedes the correctness of Goldstein's observation that organic brain disease impairs abstract functioning. The many Goldstein tests, however, have been less successfully used by other examiners. Changes in intellectual performance. Finally, some attention should be directed toward the concept of changes in general intellectual level as a consequence of brain impairment. Essentially, there are two types of change that have been widely noted. In some patients, over-all intellectual level seems grossly affected, as in conditions that to a certain degree affect the nervous system in its entirety. 
A good example is the deterioration that accompanies progressive cerebral disease and that is demonstrated by progressive generalized intellectual decline. In other patients, the condition is localized and affects intellectual ability within fairly circumscribed areas. It can be shown in most cases of brain disorder that one or the other of these two forms of intellectual loss occurs. In certain types of localized damage that affect only specific functions, as in the limited agnosias and apraxias affecting language, it may be held that no intellectual loss need be predicated. However, even here a loss must be considered to occur, since the deficiency in those functions immediately affected makes adaptation to the environment more difficult and more circumscribed. Thus, even very minor
brain damage that functionally affects only the capacity to read or write and that might not affect the individual's capacity to think or perform on intelligence tests would still have its deleterious effect upon the totality of behavior and, consequently, upon intelligence. It would make the patient a less efficient organism and would thus make adaptation to life a more complex task for him.

Generalized deterioration, whether progressive or not, rarely permits alterations in behavior through therapy and rehabilitation. On the other hand, focalized injuries that produce limitations of intellectual ability can frequently be offset by proper training and therapeutic rehabilitation (Harlow 1953).

Organic language disturbances. Language function and dysfunction have also become the focus of attention of many researchers in recent years. Loss of the ability to comprehend and use language has an extensive literature of its own. Aphasia, a loss of ability to utilize verbal symbols, is a relatively common aftereffect of brain damage, especially when that damage affects the left cortical hemisphere.

There have been widely different approaches to the study of the language syndromes. Schuell and Jenkins (1959) have postulated that language is a unitary process that may be lost in varying degrees. They have concluded from their studies that the language deficit may be measured along a continuum of severity of dysfunction. This would include difficulties with a variety of individual tasks, such as the abilities to speak, read, and write. Wepman and Jones (1961), on the other hand, contend that language is divisible into a series of different linguistic processes and that each process may be differentially affected. Five types of symbolic loss of language (five types of aphasia) have been described in this research: global, jargon, pragmatic, semantic, and syntactic.
Partial support for this viewpoint is seen in the insightful work of Jakobson (Jakobson & Halle 1956), who related certain observed aphasic disturbances, noted above as semantic and syntactic aphasia, to two basic linguistic processes. He pointed out that there are two basic types of aphasia, differentiated according to whether the deficiency is in the selection and substitution of words or in their combination and contexture. Further support for this linguistic differentiation also comes from other sources. The research of Goodglass and his co-workers on agrammatism and paragrammatism (Goodglass & Hunt 1958; Goodglass & Mayer 1958) confirmed the Jakobson distinction. The work of Luriia (1947; 1958) in the Soviet Union goes far beyond the linguistic approach, identifying the process changes with neural constructs and specific localization of damage. Also included in his work is a description of five very similar types of language disturbance (1958).

The conception of aphasia as representing disorders along a continuum of severity and the psycholinguistic classification of aphasic types as discussed above are fairly recent developments. In contrast, traditional neurological literature treats language disturbances as receptive or expressive disorders, closely related to specific neurological substrata. Indeed, a fair proportion of the literature on the whole field of brain damage is given over to discussions of the question of localization of function in the nervous system.

Harlow's review of literature dealing with the higher functions of the nervous system (1953) is devoted solely to the difference of opinion concerning localization. Discussing Hebb's brilliant Organization of Behavior (1949), the Hixon Symposium on Cerebral Mechanisms in Behavior (Jeffress 1951), and Fulton's Functional Localization in Relation to Frontal Lobotomy (1949), as well as research devoted to specific architectonic divisions of the cortex, Harlow concludes in part that "the data would appear to be almost overwhelmingly opposed to any theory of cortical localization of intellectual function which is anatomically precise or temporarily static" (1953, p. 512). Writers in the field vary from a position of extreme belief in punctate localization to the opposite view of equipotentiality or mass action. Somewhere between these two polar viewpoints rest the opinions of most present-day exponents of the role of the cortex in relation to behavior.

Recent reports of psychodiagnosticians studying the behavior of patients with known brain damage give some credence to a gross type of localization.
Reitan and his co-workers (Reitan & Reed 1962), as mentioned earlier, have been able to show that subjects with left brain damage, when tested by such scaled instruments as the Wechsler Adult Intelligence Scale, show a greater deficit in performing verbal tasks, and less deficit, if any, in performing nonverbal tasks. The opposite findings were reported on patients with known right brain damage; that is, they did better on verbal tasks than on nonverbal ones. This general finding bears out what has previously been said about language disorders following brain injury; that is, symbolic verbal behavior is found to be disturbed only in left brain-injured subjects (Wepman 1951).
MENTAL DISORDERS: Organic Aspects
It is the viewpoint of most students of brain function that while there is a type of localization of function within the nervous system, the end product, which is an individual's behavior, is the result not of the functioning of any localized section or subsection but of the integrated nervous system as a whole. For example, Penfield and Rasmussen (1950), by their ingenious placement of electrodes during neurosurgery, have demonstrated that while aphasic arrest does occur more frequently when Broca's area (the third prefrontal convolution of the left cortex) is stimulated electrically, a similar arrest of language occurs when widely scattered areas of the parietal and even the temporal lobes are stimulated. From such studies it would appear that while a high concentration of cells in Broca's area may be responsible for a type of word-finding ability, other areas subserve the same function but in a less concentrated way. Behavior, it is held, is a far too complex process to be conceptualized as the product of any localized area of the brain. It is much more reasonably thought of as the integration of perceptual, conceptual, mnemonic, and motor processes, with factors of motivation, emotion, and the processes of feedback playing their various roles.
Organic mental disorders can thus be best described as the results of a variety of conditions: disease processes, traumas, agenesis, deteriorations, etc. These conditions in turn produce alterations in the consequent behavioral patterns of the brain-impaired. Certain specific signs of alteration are more commonly seen in the behavior of the impaired than in the unimpaired. These signs are recognizably not pathognomonic of the neural disorder. Yet, by their consistency of occurrence, they are often the best indicators available of brain damage in those in whom pathological behavior is noted. Many of these indications are admittedly seen only clinically and rarely verified by research.
Others are elicited through more organized and scientific psychological and linguistic studies.
JOSEPH M. WEPMAN
[Other relevant material may be found in LANGUAGE, article on SPEECH PATHOLOGY; NERVOUS SYSTEM; SCHIZOPHRENIA.]
BIBLIOGRAPHY
AITA, JOHN A.; REITAN, RALPH M.; and RUTH, JANE M. 1947 Rorschach's Test as a Diagnostic Aid in Brain Injury. American Journal of Psychiatry 103:770-779.
ARIETI, SILVANO (editor) 1959 American Handbook of Psychiatry. 2 vols. New York: Basic Books.
BABCOCK, HARRIET 1941 The Level-efficiency Theory of Intelligence. Journal of Psychology 11:261-270.
BAKER, GERTRUDE 1956 Diagnosis of Organic Brain Damage in the Adult. Pages 318-428 in Bruno Klopfer (editor), Developments in the Rorschach Technique. Volume 2: Fields of Application. New York: World Book.
BAUER, ROBERT; and WEPMAN, JOSEPH M. 1955 Lateralization of Cerebral Functions. Journal of Speech and Hearing Disorders 20:171-177.
CHAPMAN, LOREN J. 1960 Confusion of Figurative and Literal Usages of Words by Schizophrenics and Brain Damaged Patients. Journal of Abnormal and Social Psychology 60:412-416.
ETTLINGER, GEORGE; JACKSON, C. V.; and ZANGWILL, O. L. 1956 Cerebral Dominance in Sinistrals. Brain 79:569-588.
FULTON, JOHN F. 1949 Functional Localization in Relation to Frontal Lobotomy. New York: Oxford Univ. Press.
GOLDSTEIN, KURT [1934] 1939 The Organism. New York: American Book. → First published in German.
GOLDSTEIN, KURT; and SCHEERER, MARTIN 1941 Abstract and Concrete Behavior: An Experimental Study With Special Tests. Psychological Monographs 53, no. 2.
GOODGLASS, H.; and HUNT, J. 1958 Grammatical Complexity and Aphasic Speech. Word 14:197-207.
GOODGLASS, H.; and MAYER, J. 1958 Agrammatism in Aphasia. Journal of Speech and Hearing Disorders 23:99-111.
HALSTEAD, WARD C. 1947 Brain and Intelligence. Univ. of Chicago Press.
HARLOW, HARRY 1953 Higher Functions of the Nervous System. Annual Review of Physiology 15:493-514.
HEBB, DONALD O. 1949 The Organization of Behavior. New York: Wiley.
HUGHES, ROBERT M. 1948 Rorschach Signs for the Diagnosis of Organic Pathology. Rorschach Research Exchange 12:165-167. → Now called Journal of Projective Techniques.
HUNT, HOWARD F. 1943 A Practical Clinical Test for Organic Brain Damage. Journal of Applied Psychology 27:375-386.
JAKOBSON, ROMAN; and HALLE, MORRIS 1956 Fundamentals of Language. The Hague: Mouton.
JEFFRESS, LLOYD A. (editor) 1951 Cerebral Mechanisms in Behavior: The Hixon Symposium. New York: Wiley.
KASS, WALTER 1949 Wechsler's Mental Deterioration Index in the Diagnosis of Organic Brain Disease. Kansas Academy of Science, Transactions 52:66-70.
LASHLEY, KARL S. 1929 Brain Mechanisms and Intelligence: A Quantitative Study of Injuries to the Brain. Univ. of Chicago Press.
LURIIA, ALEKSANDR R. 1947 Travmaticheskaia afaziia (Traumatic Aphasia). Moscow: Academy of Medical Science.
LURIIA, ALEKSANDR R. 1958 Brain Disorders and Language Analysis. Language and Speech 1:14-34.
MILNER, BRENDA 1954 Intellectual Functions of the Temporal Lobes. Psychological Bulletin 51:42-62.
PENFIELD, WILDER; and RASMUSSEN, THEODORE 1950 The Cerebral Cortex of Man. New York: Macmillan.
PIOTROWSKI, ZYGMUNT A. 1937 The Rorschach Ink Blot Method in Organic Disturbances of the Central Nervous System. Journal of Nervous and Mental Diseases 86:525-537.
REITAN, RALPH M.; and REED, HOMER B. 1962 Consistencies in Wechsler-Bellevue Mean Values in Brain-damaged Groups. Perceptual and Motor Skills 15:119-121.
SCHUELL, HILDRED; and JENKINS, J. J. 1959 The Nature of Language Deficit in Aphasia. Psychological Review 66:45-67.
SHIPLEY, WALTER C. 1940 A Self-administering Scale for Measuring Intellectual Impairment and Deterioration. Journal of Psychology 9:371-377.
TEUBER, HANS L. 1950 Neuropsychology. Pages 30-52 in Recent Advances in Diagnostic Psychological Testing. Springfield, Ill.: Thomas.
WECHSLER, DAVID 1955 Wechsler Adult Intelligence Scale (WAIS). New York: Psychological Corp.
WEPMAN, JOSEPH M. 1951 Recovery From Aphasia. New York: Ronald Press.
WEPMAN, JOSEPH M. 1960 Auditory Discrimination, Speech and Reading. Elementary School Journal 60:325-333.
WEPMAN, JOSEPH M. 1961 A Selected Bibliography on Brain Impairment, Aphasia, and Organic Psychodiagnosis. Chicago: Language Research Associates. → Lists over a thousand items collected from 220 recent American journals and books.
WEPMAN, J. M.; and JONES, L. V. 1961 The Language Modalities Test for Aphasia. Chicago: Education-Industry Service.
MENTAL DISORDERS: Biological Aspects
III BIOLOGICAL ASPECTS
In 1884 the founding father of neurochemistry, J. W. L. Thudichum, wrote: Many forms of insanity are unquestionably the external manifestations of the effects upon the brain substance of poisons fermented within the body, [in the same way that] mental aberrations accompanying chronic alcoholic intoxication are the accumulated effects of a relatively simple poison fermented out of the body. These poisons we shall, I have no doubt, be able to isolate after we know the normal chemistry to its uttermost detail. And then will come in their turn the crowning discoveries to which all our efforts must ultimately be directed, namely, the discoveries of the antidotes to the poisons, and to the fermenting causes and processes which produce them. (1884, p. xiii)
Thudichum anticipated trends which we would regard as very modern in our day. Two premises are explicitly stated. An aberrant biological product (or "a product fermented in the body") leads to aberrant behavior; and knowledge of normal chemistry "to the uttermost detail" is an essential prerequisite to the isolation of such a product. Thudichum's own crowning achievement was an analysis of the brain in terms of its chemical building blocks, the so-called lipoproteins, which are complexes formed between fatty bodies (lipids) and proteins. This work still stands as a classic. Yet this approach (what one might call a clockwork approach) reflects the hopes and limitations of an age. In the Victorian era, governed by the rule, the clock, and the kilogram, an understanding of the chemical machinery was regarded as a reasonable basis for the understanding of mental disorder and thus, by inference, of mental order. Attitudes have great viability; similar approaches (with somewhat better reason) are with us to this day. They aim at an understanding of the chemistry of the strange detector we carry in our skull. But of environment, forever playing upon this detector, and so often disrupting and distorting it, chemistry will tell us nothing. Nor can chemistry, in its old and classic form, tell us much of how environment is transcribed and coded by nervous tissue. Yet in man the brain in some strange way internalizes and stores environment; it models it in sight and sound and uses these models as predictors; it transmits information from generation to generation by a symbolic (nongenetic) process, which is a radically new departure in evolution. Neurochemistry thus poses problems very different from those posed by classical chemistry or even by modern physical chemistry. It demands a revision of attitudes, and perhaps like no other field in biology is forcing a confrontation of biological process with the emerging concepts of system theory. In short, any attempt at understanding the brain as a chemically mediated organ of information forces an encounter between somatic transaction and symbolic process. This field one can, justifiably, term psychobiology.
Principles of regional brain organization
We may be a long way from understanding the nature of the chemical control processes which enable the brain to function as an integrating, "feeling," information-storing, predicting, and computing organ, but we have also come some way since Thudichum. Thudichum's analysis involved the analysis of the brain as a single organ. However, the brain, unlike the liver, is not a homogeneous organ; and in terms of anatomical arrangement, cell population, and chemistry it shows a regional economy which reflects the course of its own evolution.
The distinctive attribute of the human and primate brain is the size and structure of its cortex. Its contribution to the analysis of secondary signaling (that is, symbolic) systems has been extensively studied by the Pavlovian school. Yet work since the 1940s has also emphasized the role of some developmentally older subcortical centers buried deep in the brain and the relationship of these structures to the cerebral cortex. The brain centers in question are the hypothalamus, the reticular activating system, the rhinencephalic ("smell brain") formation (also known as the limbic system), and the caudate
and lentiform masses (the corpus striatum). Each of these systems has received its share of extensive review; and accumulated experience, using a variety of techniques with each, has steadily emphasized three separate, though related, trends. The first is the discrete neuroanatomic and cellular suborganization of these systems; the second, the interconnectedness between the systems themselves and between the systems and relatively distant elements at high cortical and spinal levels; and the third, the reciprocal, complementary, yet mutually exclusive relationship which some patterns represented in these systems bear to each other. These findings are relevant when considered in relation to the regional chemistry of these structures. [See NERVOUS SYSTEM, article on STRUCTURE AND FUNCTION OF THE BRAIN.]
There is little doubt of the anatomical heterogeneity and cytological differentiation of the hypothalamus, where small areas measuring hardly more than a few millimeters in diameter control the central representation of the autonomic nervous system and the appetitive drive systems (fight, flight, hunger, thirst, sex). Similarly, more recent studies have emphasized the remarkable anatomical and cytoarchitectonic differentiation within the reticular activating system, governing wakefulness, sleep, and focused attention. Elements in the structure of the limbic system have similar differentiation. Each of these systems thus encompasses a mosaic of subsystems which in a manner only poorly understood at present are fitted into one another. This understanding, however, is being steadily enhanced by mounting knowledge of the anatomical connections and electrophysiological properties. These connections are in both directions between the hypothalamus and the reticular formation, the limbic system and the hypothalamus, and between these structures and the cortex. [See ATTENTION; SLEEP.] The term "limbic system midbrain circuit" is entirely apt (Nauta 1958). In a way still poorly understood, the limbic system would appear to be an intermediate between discrete analysis of diverse signals at high levels and the discharge of a limited genetically coded stereotyped response known as "affective" behavior. It appears to participate in and to modulate both. There is a tremendous convergence of information in this area. There are structural counterparts of this convergence; in the olfactory bulb, for example (which in man can be regarded as a homologue of the hippocampus), the messages of about 50 million receptors are reduced to 150 thousand in mitral cells and finally reach the brain through
the axons of a mere 45,000 pyramidal cells. These cells and their branchings thus have the structural features which make for an extraordinary funneling and filtering of information (Green 1964). It is, incidentally, in these areas too that a rich, tessellated, terrazzo-like apposition of shared membranes, a virtual mosaic, is found. These junctional areas are also highly localized electrical generators. It is usually found convenient to speak of the reticular formation and the hippocampal amygdaloid cell assemblies as areas that control sleep, wakefulness, and the states between wakefulness and sleep. Yet there is evidence that these areas are pre-eminent in their relation to patterns of emotional expression of autonomic functioning and, also, that interference in these areas may be important for the process of "recall" and possibly for the process of registering the memory trace. Seen dimly, then, and in broadest silhouette, the functions covered by the terms "consciousness," "affect," and at any rate "recall" thus appear to be subserved by congruent or at least intimately connected systems (Elkes 1966). That we are "aware," that we are "responsive," that we are "appropriate," that we can plan a piece of behavior and be sequent rather than random, confused, and "in-con-sequent" in its execution may well be due to some elements and processes vested by evolution in these remarkable cell groups. Somehow, they appear to have the ability to build short-term representational systems of temporarily related events and to use them in the construction of appropriate responses. These models, these tiny maps in time, may be intercellular or cellular; at this stage we do not know. We may thus regard the brain as a mosaic of what have appropriately been called "biased homeostats" (Pribram & Kruger 1954). The bias, handled by genetically coded "Yes-No" drive systems, keeps changing constantly in the light of ongoing events. 
To allow adequate comparison of events, slow-fading traces must be available to set up such transient comparisons. It is possible that the so-called after discharges (slow-fading electrical discharges) for which the cells, particularly of the hippocampus, are noteworthy provide a medium for the establishment of such traces. In a manner which is only just beginning to be understood, the coincident is detected and the concurrent arranged into an appropriate action that is consequent. Whereas internal perception and comparison may be multiple, action and preparation for action are essentially serial: it requires a rigorous regulation of construction of events at the time; it demands apprehension of the redundant and, above all, a selective inhibition of those elements which are
judged irrelevant; a construction of subsets which carry meaning by ignoring but also can carry coded traces of what they ignore. Inhibition is thus the agent of structure in the central nervous system. This is borne out by all that we know of reciprocal inhibition in the spinal cord and all that is being learned of the organization of sensory processes. All evidence coming from visual, auditory, and the somatosensory fields points to the operation of highly patterned inhibitory processes reciprocating with the excitation. Delay of the immediate response (that is, the reduction of the immediate reactivity) is merely the giant child of inhibition, a vast and pervasive function woven by evolution into the nervous tissue as a device for judging relevance and for structuring time. Time, indeed, would appear to be the main axis around which the nervous system constructs its model of reality. It does so by judging what is relevant in time and "banking" what is irrelevant. The higher nervous system is remarkable for its ability to ignore; accurate adaptive performance is attained by ignoring all that is adjudged irrelevant. Somehow, then, in such transactions the simultaneous has to be changed to the successive, and global or random apprehension likewise has to be transformed into rank-ordered sequential responses. It has been observed by Pribram and Kruger (1954) that it may be the role of the limbic system to provide the context in which drive stimuli are reinforcing and then to reverse the context-content relationship between the drive stimuli and reinforcing events. Thus affective connotation becomes a label and a gating device. Affect gates the emergence of memory traces into preconscious knowing and conscious action. The events and the internal representation may be highly varied and complex; the affective response patterns, however, are finite. Seen very broadly, the anatomy of the neuraxes reflects these requirements.
For we are dealing here with two great vertical systems and one, so to speak, horizontal system. First, there is the so-called specific afferent-efferent system which in particular preserves information coming from receptors in a point-to-point representation; second, there is the core of the midline structures comprising the massive facilitatory and inhibitory structures and characterized by a great variety of cells which serve as a mixing pool for each sense modality. Connected with both systems are the elements of the limbic system in which the indications for somewhat longer traces of activity (the so-called "after discharges") are located and which makes this information available for reference, for comparison, and for labeling. It is possible that the caudate and the striatal nuclei may have somewhat similar properties. In man the vast neocortical mantle provides a tremendous reservoir of cells for the storage of recorded traces of events and for their use in the shaping of new events, either symbolic or actual, i.e., those expressed in action. It is also well to recall that wherever we look in the central nervous system, extracellular space is scarce; nonneuronal (so-called) glial structures and membranes predominate. In fact (although this may prove to be an extravagant generalization), we may look upon the central nervous tissue as an array of growing, polymerizing protein fibers and a mosaic of lipid-protein membranes. Studies have shown the power, the growth, and the specificity of connection formation in the peripheral nervous system. There is also evidence of the presence of nerve growth factors promoting the growth of nerve fibers (Levi-Montalcini & Angeletti 1961). It would seem that, scaled down and compressed to an enormously faster time scale, we may be dealing in the central nervous system with the growth and evolution of macromolecular forms, which by their interaction determine what we know as symbolic form. The junctional sites between cells, and particularly the enormous and highly ordered membranes existing in the central nervous system, may play a part in the initial construction of trace models capable of acting as recognizers, and hence as organizers and ultimately as decision points: organizers, that is, of coincidences, of sequences of pattern, that is, patterns in time. [See TIME, article on PSYCHOLOGICAL ASPECTS.]
Central neurohumoral substances
It may very well be asked why one should emphasize structural features in the context of a statement on the biological background of mental disorder. The partial answer is that the anatomical and functional attributes have some suggestive chemical correlates.
For it is precisely in the areas just mentioned (the hypothalamus, the midline gray area, and elements of the limbic system, the amygdala, the hippocampus, the reticular activating system, and certain layers of the cortex) that one repeatedly encounters a number of small molecules which appear to have evolved for a role in the organization of control systems in the central nervous system. These molecules are acetylcholine, and possibly other choline esters; gamma aminobutyric acid, a derivative of glutamic acid; histamine, also found in the skin and released in injury of all tissues; catecholamines such as dopamine, norepinephrine, and epinephrine, mediators of
sympathetic system responses at peripheral effector sites; and indoles such as serotonin. All these molecules show a regional gradient in their distribution. Histamine, for example, is found principally in the midline diencephalic structures; norepinephrine is present in the hypothalamus and in the periventricular gray matter; serotonin is present in both diencephalic and limbic structures. Two further features should be emphasized concerning these substances. First, they are representative members of families of compounds. As chemical mapping proceeds, related members of these various subgroups, their precursors, and their products are identified. Second, the metabolic pathways of each of these compounds (within and without the brain) are steadily being defined more clearly. New and elegant techniques are capable of demonstrating these materials in situ in the brain. These studies show that the intercellular and pericellular economies of these substances are organized very precisely. The molecules or their precursors are transported into cells by the energy-yielding system; once synthesized, they are transported and packaged into granular particles or "vesicles" inside cells; they are carefully stored in equilibrium at these sites and are released in response to electrical stimulation by mechanisms still poorly understood. Moreover, all work so far on how the psychoactive drugs act (be they tranquilizers, stimulants, or the so-called hallucinogenic compounds) suggests that all these substances interfere in varying ways and to varying degrees with the uptake, storage, and release of the catecholamines (such as epinephrine and dopamine), indolic substances (such as serotonin), and possibly histamine in the areas that have been mentioned. The precise action profiles of these drugs are different, and it is upon such subtle differences in action that variation in therapeutic effect may well depend. This, then, puts an end to the simple "clock model" of an earlier day.
For we are not only dealing with families of compounds, but we are also considering multiple binding sites of organelles exquisitely sensitive to local subcellular conditions. The uneven distribution of these chemicals at neuronal decision points, the trigger (or selective suppressor) function of certain elements, and the anatomically imposed economy in terms of convergence and occlusion all suggest that we are dealing with transaction sites at highly localized subcellular levels which need not necessarily be reflected in gross over-all shifts. These sites are evidently unevenly distributed in nonhomogeneous cell populations. Their state at any one time depends exquisitely upon the short-term history of preceding or
coincidental events. It is impossible to think in terms of a mechanistic spatial localization, i.e., in terms of points. Rather, one is forced to think in terms of convergence, coincidence, stochastic processes, and probabilities of interaction in time. It was earlier suggested that inhibition is the agent of structure in the nervous system and that the silence in the central nervous system is, so to speak, informed silence. The chemical computer we carry in our skulls apparently writes its chemical text of experience in proteins; and some small molecules appear to play a key part in the transcription or readout. Much work on the biology of mental disorder centers on the identification of the normal molecules mentioned above and of their deviant metabolites.
Chemical aspects of mental development
Chemical factors certainly operate in the development of the nervous system. Some of these are general; others are more special. Of the general factors, oxygen supply is one and hormones are another. Cerebral ischemia, i.e., restriction of blood supply, even for a short time in the developing animal, causes marked forms of mental deficiency. Yet in the adult animal, oxygen supply and fuel consumption are not necessarily related to mental functioning: they may be, but they need not be. A major advance in methodology (Kety & Schmidt 1948) has made it possible to make exact measurements of blood flow, oxygen consumption, and glucose consumption in the conscious human brain, in a variety of functional states in the normal brain and in various forms of mental disorder. These studies have shown quite clearly that in normal man the major substrate for oxidation is glucose and that the rate of energy utilization of the human brain is on the order of a mere 20 watts, eloquent testimony to the efficiency and miniaturization of the brain-computer, weighing about three pounds.
These same studies also showed that general anesthesia reduces cerebral oxygen consumption to about 40 per cent; in contrast, normal sleep did not show such reduced oxygen consumption. Similarly, in studies of schizophrenia and of the effects of LSD-25 and other hallucinogenic drugs there were no changes in over-all total oxygen consumption. First and last, the brain is an organ of information. To be sure, energy is needed to keep the living computer going, to synthesize the building blocks, particularly proteins and lipoproteins essential for its development. However, information storage and retrieval are evidently low-energy processes.
Another equally general factor concerned in intellectual development and mental functioning is presented by a number of hormones, of which thyroxin can serve as a useful example. Hypothyroidism due to an iodine deficiency leads to a mental deficiency syndrome known as cretinism. Hyperthyroidism (an excess of thyroxin) leads to striking hyperirritability and various signs of overactivity of the autonomic nervous system. There is much experimental evidence (Sokoloff & Kaufman 1959) that thyroxin may exert its action on the brain through influencing protein synthesis. It is relevant that although thyroxin does not stimulate protein synthesis in the mature brain, it does significantly do so in the newborn brain. The main structural defect in experimentally induced cretinism (produced by thyroid deficiency) is a deficiency in the proliferation of fine nerve fibrils (dendrites) in the immature cerebral cortex. Here again, a structural feature stresses the importance of connectivity between neurones (the so-called neuropile) rather than mere number of cells.
The pathology of phenylpyruvic oligophrenia (phenylketonuria, PKU) may serve as a useful example of the way in which a specific and genetically determined metabolic error (a so-called biochemical lesion) may profoundly affect the development of higher nervous function. In 1934 A. Z. Fölling observed in the course of an investigation of mentally defective children that some of them excreted phenylpyruvic acid in the urine and that there appeared to be a relation between the anomaly and the imbecility. This was the first demonstration of a metabolic error definitely associated with a form of mental defect and also with physical characteristics (blond hair and blue eyes). In phenylketonuria the subject is unable to oxidize a normally occurring essential amino acid (phenylalanine) to tyrosine at a normal rate.
Because of this inability, phenylalanine accumulates in the tissues, rises in the blood stream, and spills over in the urine. This error also mobilizes other metabolic routes which normally play little part in the metabolism of these amino acids. In 1953 it was definitely shown that enzymic extracts prepared from the livers of phenylketonuric patients failed to further the oxidation of phenylalanine to tyrosine (Jervis 1953; Wallace, Moldave & Meister 1957). Extracts of normal livers do so quite readily. Furthermore, the use of radioactively labeled phenylalanine showed that the phenylketonuric can convert this phenylalanine to tyrosine at only a fraction of the normal rate. The most striking result of this deficient oxidation mechanism is the appearance in the urine of phenylpyruvic acid, a
compound which is readily detected through the greenish color it acquires when it reacts with ferric chloride. This provides a ready screening test for the deficiency in the newborn. The striking decreased pigmentation seen in phenylketonuria— blue eyes and blond hair—may be due to the decreased formation of melanin pigment, through inhibition of an enzyme known as tyrosinase. The abnormal metabolites of phenylalanine also apparently interfere with the synthesis of catecholamines important for brain functions. This may account for the lowered level of epinephrine and norepinephrine in the plasma of the phenylketonuric. Whether excess of phenylalanine or one of its breakdown products or an interference with some of the biosynthetic processes of catecholamines accounts for the mental defect remains uncertain. However, the concept opens up an inviting area for producing experimental phenylketonuria in the laboratory by means of "loading" the system with phenylalanine and also suggests a way of treating or preventing the disorder by withdrawing the offending amino acid from the diet. The first attempt to relieve phenylketonuria by such dietary means followed only four years after the discovery of the syndrome (Penrose & Quastel 1937). A new approach to the problem was introduced in 1951, when, for the first time in the field of mental deficiency, diets specifically low in one amino acid (namely, phenylalanine) were introduced (Woolf & Vulliamy 1951). The so-called synthetic phenylalanine-low diets have now been used with varying success in a number of studies. They result in a sharp lowering of urinary phenylpyruvic acid and plasma phenylalanine. The patients so treated show a striking reduction in seizures and spasticity and increased responsiveness in motor development. There is also a darkening of the hair. Reversal to full phenylalanine natural diet leads to relapse. 
It has also become unambiguously clear that treatment must be introduced as early as possible and that improvement falls off sharply if this regimen is introduced beyond the stage of infancy. The developing nervous system is a vulnerable one. [See MENTAL RETARDATION.]
These studies are mentioned because in a sense they represent a prototype of approach which is now being applied, with some modest success, in other studies of the biological basis of mental disorder. The steps are as follows: An empirical finding (a deviant metabolite in urine) leads to a suspicion of a metabolic defect. The natural history of the disorder suggests a genetic basis for this defect. A biochemical lesion (a specific biochemical defect) is defined. Since the metabolic pathways are interrelated, this single defect leads to consequences only indirectly related to the primary defect, yet very pertinent to the total pathology. The correction of the defect by reducing the metabolic load forms the basis of therapy. An animal model for this disorder is developed, and, finally, the fit of the model in terms of detection and prevention is tested. However, there is still a large no man's land between evidence and inference. The effects of the deviant metabolites on the development of the nervous system, and particularly those areas concerned with perception, coordination, control of motor activity, and maturation of intellectual function, remain largely unknown.
Schizophrenic disorders
The facts cited above point to the complexity of the field of disorder when it is seen in terms of available biochemical facts and biochemical hypotheses alone. These complexities are compounded many times over in an attempt to relate known biochemical facts to the group of disorders known as the schizophrenias. It is by now generally accepted that we may be dealing, in this group, with a number of very different disorders, sharing a general symptomatology but quite possibly in need of a radical regrouping. The role of genetic factors is reviewed elsewhere [see MENTAL DISORDERS, article on GENETIC ASPECTS]. Careful genetic psychosocial studies of twins, on a national and international scale, are now proceeding, and such studies (particularly of families in which twins are discordant for schizophrenia) may contribute some of the facts which are needed to separate, on a conceptual basis, nature from nurture. Equally, careful longitudinal studies, prospective rather than retrospective, are needed (starting, preferably, in early infancy) to establish the role of genetic "givens" in the autonomic reactivity patterns which have been claimed to be deviant among schizophrenics.
These suggest an instability of hormonal control and diminished compensatory physiological responses: peripheral vasoconstriction, capillary abnormalities in the nail bed, and abnormal pupil responses have been implicated as such signals. Yet the one measure which reliably distinguishes schizophrenic populations from normal is their state of readiness in the face of oncoming stimuli (Rodnick & Shakow 1940). This anticipatory set or "set index" suggests that in schizophrenia one may be dealing with a disorder of the attention process. How much this disorder represents the collusion between a genetically determined instability in homeostatic control and a defensive homeostatic withdrawal from stressful stimulus situations and thus, ultimately, a learned pattern of adaptation (enhanced, for example, by the double message structure found in schizophrenogenic families) still remains a matter of conjecture. [See ATTENTION; REACTION TIME; SCHIZOPHRENIA; STRESS.]

Nor does the difficulty of relating biochemical variables to clinical states end there. Even when the data are from the observation of schizophrenics in a hospital ward, there are a number of sources of errors which have seriously affected investigation (Kety 1959). These include long hospitalization, diet (including dietary iodine and protein deficiency), various therapeutic maneuvers (including medication), and the actual circumstances (stressful or otherwise) accompanying the drawing of the biochemical sample. Also, as in all other fields of psychobiology, subjective bias has cast a pall over many painstaking studies. All these reservations notwithstanding, there are, however, some findings largely attributable to the striking advances in present-day methodology. These advances are essentially three in number. First, the refinement of protein fractionation procedures (derived from the needs of blood transfusion and of plasma substitutes); second, the advance of microfluorometric techniques for the detection of very small quantities of catecholamines, indole derivatives, and their metabolites; and third, the development of radioactive tracer techniques and particularly the advent of the liquid scintillation counter, which makes it possible to follow a particular compound through a metabolic maze. As always, it is a moot question whether technical methods or intuitive insights are the more powerful propellants of science. Evidently, in our age they are inextricably connected.
In 1957 it was first reported that a serum fraction obtained from schizophrenic patients, when injected into carefully selected nonschizophrenic prisoner volunteers, led to the development of symptoms of thought disorder, autism, depersonalization, paranoid ideas, hallucinations, and catatonic stupor which were likened to schizophrenia (Heath et al. 1958). Attempts to replicate this finding by injection of material prepared according to similar instructions, however, were not successful (see Conference on Neuropharmacology 1959). This finding, however, seems to this writer less pertinent than the various lines of investigation which were stimulated by the finding. A number of groups have now independently obtained evidence which suggests at least the possibility that
an abnormal protein may be present to a greater extent in the blood of schizophrenics than in normals and that this substance may be capable of producing behavioral metabolic changes in some animal tests. There is also evidence that there is an antigenic abnormality in the pooled serum of chronically ill schizophrenic patients (Haddad & Rabe 1963) and that plasma from schizophrenic patients affects learning and retention of learning in the rat (Bishop 1963). Similarly, serum of schizophrenic patients has been shown to affect cortical (electrically evoked) responses of animals (German 1963). However, it is of more than suggestive significance that in these various studies plasma derived from normal individuals put under stress produced somewhat similar responses. It is therefore possible that one may be dealing here with a small molecular constituent liberated during stress and attached to one of the plasma fractions; the constituent may not be a unique characteristic of schizophrenia. Another approach to the problem is the characterization of various serum protein fractions, rather than total proteins, in terms of electrophoretic and immunochemical properties. There is evidence from double-blind studies that there may be abnormal protein fractions in a considerable proportion of schizophrenic patients (Fessel & Grunbaum 1961). Cognate with the above approach are the findings of another group (Frohman et al. 1960), who reported that when red blood cells of chickens are incubated with plasma or plasma fractions of some schizophrenic patients, there is an increase, compared with normal controls, of the lactate-pyruvate ratio. This finding, however, still awaits full confirmation, for the difference (in the lactate-pyruvate ratio) is seen only when the subjects have been engaged in moderate exercise before the blood samples are drawn.
It may be that these serum factors, while responsive to stress in normals, may be greatly increased in schizophrenics subjected to stress. The serum factor apparently influences the stability of the red cell membrane, the rupture of which may alone account for the changes in the lactate-pyruvate ratio. Broadly, then, the conclusion at this stage is that there is evidence of a plasma protein abnormality in schizophrenia capable of producing measurable behavioral, immunological, electrophysiological, and biochemical responses in suitable test preparations; and that this abnormal constituent may in fact contain a small molecular substance released by, or related to, physiological stress. There are, however, a number of other small molecules which are increasingly being implicated
in the search for a biochemical factor (the so-called psychotoxic factor) in schizophrenia. As early as 1952 Osmond and Smythies pointed out that there is a close chemical similarity between the drug mescaline, derived from a Mexican cactus plant, and epinephrine and its precursor dopamine, both of which are usually found in the brain. Mescaline is a methylated derivative of dopamine. The mental changes produced by mescaline bear some resemblance to those seen in schizophrenia. The same paper concluded that "it is extremely probable that the final stage in the biogenesis of epinephrine is a transmethylation of norepinephrine, the methyl groups arising from methionine or choline" (a well-known methyl donor). It is just possible that a defective transmethylation of norepinephrine might lead to methylation of one or both of its hydroxyl groups instead of its amino group. This defective methylation could thus give rise to a mescaline-like toxic substance—Thudichum's "internally fermented psychotoxic." There is no denying the attractiveness of this hypothesis, for it relates the metabolism of epinephrine, a hormone liberated during stress, to the pathology of a condition in which stress tolerance and responsiveness to stress are markedly altered or reduced. This suggestion that there may, in schizophrenia, be a disturbance of the transmethylation process is supported by the fact that a number of drugs (such as dimethyltryptamine, DMT) producing profound mental changes in man are in fact methylated congeners of normal body metabolites. On the basis of such findings it was indeed suggested by Hoffer and his colleagues (1957) that substances which would compete for methyl groups and act as methyl acceptors could competitively inhibit the abnormal process. The vitamins niacin and niacinamide are such substances, and some beneficial results following the administration of large doses of these vitamins in schizophrenia have been reported (Hoffer et al. 1957). 
These findings still await confirmation. A more direct approach to the problem was to administer large doses of L-methionine, a powerful methylating agent, and a number of other amino acids to chronic schizophrenic patients (Pollin et al. 1961). The changes seen in some patients following the administration of this material were striking; there was a brief and sharp intensification of the psychotic symptoms. This finding has since been confirmed by three other groups and suggests that methylation of aromatic compounds may indeed lead to substances which greatly affect brain function. Another piece of evidence along the same line is the reported occurrence in the urine of schizophrenic
patients of a substance, 3,4-dimethoxyphenylethylamine (Friedhoff & Van Winkle 1962), suggested in 1952 by Osmond and Smythies as possibly an abnormally methylated and toxic metabolite. This compound is indeed the dimethyl derivative of dopamine, the precursor of epinephrine, and in structure is closely related to mescaline (the trimethyl derivative of this substance). It is only fair to say, however, that the finding of this compound in urine is still subject to confirmation. The presence of the compound may be related to dietary factors and there is, so far, only preliminary evidence that the substance identified in the urine is indeed produced in the body. Once again, then, one can but say that we are at the beginning; yet the pieces are showing some fit.

Affective disorders

The relation of the midbrain structures and certain elements of the limbic system to the regulation of the visceromotor and affective states has already been mentioned. It is also clear that catecholamines and indoles play a dominant role in these highly specialized and all-pervasive regulatory centers. The past few years have seen increasing evidence to suggest a possible link between affective disorders (i.e., depression or elation of mood) and changes in the metabolism of catecholamines in the central nervous system. Most of the evidence so far is inferential, yet the advent of pharmacological agents which strikingly affect mood by interfering with the storage, release, and disposition of catecholamines in the central nervous system and at peripheral sites has added an important segment to the body of evidence. [See DEPRESSIVE DISORDERS.] Quite early it was reported, in a carefully controlled metabolic study, that clinical manifestations of periodic catatonic excitement and stupor were correlated with a change in the nitrogenous constituents of the urine (Gjessing 1938; Gjessing et al. 1958).
Longitudinal studies have shown that the urinary excretion of norepinephrine is increased in the manic phase and decreased in the retarded depression phase in manic-depressive ("cyclic") patients (Strom-Olsen & Weil-Malherbe 1958). Yet urinary epinephrine represents only a small fraction of the total metabolites of epinephrine. However, it is now possible to study and identify most other breakdown products of epinephrine and to draw up a full balance sheet of catecholamine metabolites in man. Such studies of urinary metabolites in depressed patients and normal controls suggest as a reasonable hypothesis that "some, if not all, depressions are associated
with an absolute or relative deficiency of catecholamines . . . at functionally important receptor sites in the brain" (Schildkraut 1965, p. 509). The major inferential evidence so far derives not from physiological studies in the natural untreated states but from the results of pharmacological intervention. In this respect, three groups of drugs are of particular import, namely, reserpine (a major tranquilizer exerting its effect by depleting serotonin and norepinephrine sites in the brain and peripheral sites); the monoamine oxidase inhibitors, which are powerful antidepressants and inhibit the destruction of naturally occurring amines (epinephrine, serotonin, and the like) by oxidation; and various imipramine-like compounds which, in a way not yet clearly understood, interfere with the local economy of catecholamines at intracerebral sites. Reserpine has been found to induce severe depression in patients, yet whether reserpine-induced depression is a valid pharmacological model of the naturally occurring clinical state remains to be seen. In animals, reserpine induces sedation, which is associated with a decrease in the brain levels of norepinephrine, dopamine, and serotonin. The level of sedation correlates well with the depletion of catecholamines in the brain and, furthermore, shows a rise in level of catecholamines with a return of normal motor behavior. Furthermore, the administration of dihydroxy-phenylalanine and dopamine promptly reverses the reserpine-induced sedation in animals and restores normal behavior and norepinephrine level, while an administration of the corresponding serotonin precursor (5-hydroxy-tryptophan) does not. In man, dopamine has been reported to counteract the sedating effect of reserpine. Thus catecholamine depletion (i.e., depletion of dopamine, epinephrine, and norepinephrine) may be of major importance in reserpine-induced sedation in animals; reserpine-induced depression in man may possibly have a similar basis.
The gross picture is reversed for the antidepressive agents. Administration of monoamine oxidase inhibitors, such as iproniazid, and other clinically effective antidepressives both produced behavioral excitation in animals and correlated well with elevated levels of brain norepinephrine. The behavioral stimulation is less related to an elevation of brain serotonin. The mood-elevating properties of amphetamine (benzedrine) are well known. It may be that amphetamine acts by releasing active norepinephrine from its anchoring sites. Furthermore, the "rebound" effect following amphetamine intoxication (which shows clinically in depression) is reflected in a temporary depletion of norepinephrine stores available for continued release. Imipramine, another antidepressant, does not inhibit the enzymes involved in metabolic reactivation of norepinephrine and may interfere with the uptake of norepinephrine into peripheral tissue. There is also some evidence that imipramine inhibits norepinephrine uptake in the brain. These findings are indicative of the kind of subcellular organization that one is compelled to consider.

Little is known as yet of the physiological (rather than pharmacological) factors which govern the storage and release of substances at these subcellular sites. One would naturally suspect ionic shifts, although evidence in this respect so far is very circumstantial. The metallic ion lithium is used in the treatment of manic behavior. Evidence of the role of hormones in these transactions is also at present hardly more than suggestive. There is, however, a respectable body of evidence recently reviewed (Ramey & Goldstein 1957) which indicates a close physiological interaction between the adrenocortical steroids and the catecholamines. The two groups of stress hormones would seem to operate physiologically as one functional unit, the steroids maintaining the integrity and responsiveness of tissues in the process of reacting to norepinephrine. Indeed, many actions of epinephrine, norepinephrine, and dopamine are not elicitable in the absence of steroids. Equally, and perhaps more significantly, many actions attributable to the steroids may in fact be more accurately ascribed to the catecholamines. There is some clinical data available on the relation of plasma hydrocortisone levels to mood.
There is, for example, a significant linear relationship between an increase in anxiety, anger, and plasma hydrocortisone levels; moreover, and perhaps less expected, severe depression is accompanied by elevated plasma hydrocortisone levels; and deeply retarded underactive patients show hormone levels higher than those of less depressed groups. Depression would thus appear to be an active stress response; the retarded tearless state, the final phase of an active, adaptive process. This phase is largely inhibitory, in contrast with the excitatory component commonly seen in the so-called agitated depressive syndromes. It would be idle to speculate at present on the possible effects of various degrees of adrenocortical mobilization on the binding and releasing of epinephrine and norepinephrine in trigger areas of the brain. It may even be that the reverse is true and that the differential proportion of amines in the brain may affect the degree of adrenocortical mobilization. In closing, however, it would be safe to surmise
that affective behavioral responses may have peculiar chemical topology in some nodal areas of the brain and that some small molecules, particularly catecholamines and indoles, may by virtue of differential storage, release, and disposition determine the particular pattern of response which is selected by the organism in the light of adaptive need. The affective responses are very simple and are relatively stereotyped. The symbols which produce them are infinitely more complex. We do not know what factors go into determining the responsiveness of the affective apparatus during an early developmental period, i.e., how much depends on certain genetic "givens" (enzyme proteins); how much is learned (incorporated into the plastic biological system during an early developmental period). Yet even in this area there is some suggestive evidence. For example, animals reared in isolation show a different pattern in the metabolism of actively labeled dopamine from animals which are forced to interact (Welch & Welch 1965). Could, one might ask, the coldness and unresponsiveness of the sociopath have a biochemical basis? Could such an understanding yield, in time, appropriate additive therapies? Could learning be modified by chemical means, even in the adult? Could the affective disorders yield to specific chemotherapy? Is there a rational basis for a so-called biological theory of the schizophrenias? More important, could the slowly evolving principles of psychobiology be applied equally to states of mental well-being and ill-being? We know little about the physiology of these states, yet it is clear that whatever direction neurobiology takes, it cannot go it alone. It deals with the machinery and not with information; information is symbolic and social. Whatever hopes there may be for the brain sciences, they remain inseparable from education and social change. 
Human development, human learning, human communication, human social field dependence remain the heart of the study of mental health. In these, the neurobiologist must take his modest place.

JOEL ELKES

[Directly related are the entries DRINKING AND ALCOHOLISM, article on PSYCHOLOGICAL ASPECTS; DRUGS, articles on PSYCHOPHARMACOLOGY and DRUG ADDICTION: ORGANIC AND PSYCHOLOGICAL ASPECTS; EVOLUTION, article on EVOLUTION AND BEHAVIOR; GENETICS, article on GENETICS AND BEHAVIOR; LEARNING, article on NEUROPHYSIOLOGICAL ASPECTS; MENTAL DISORDERS, TREATMENT OF, article on SOMATIC TREATMENT. Other relevant material may be found in ANXIETY; DEPRESSIVE DISORDERS; DRIVES, article on PHYSIOLOGICAL DRIVES; EMOTION; HOMEOSTASIS; INFANCY, article on THE EFFECTS OF EARLY EXPERIENCE; MENTAL RETARDATION; NERVOUS SYSTEM; SCHIZOPHRENIA; SENSES; STRESS.]

BIBLIOGRAPHY
BISHOP, M. P. 1963 Effects of Plasma From Schizophrenic Subjects Upon Learning and Retention in the Rat. Pages 77-91 in Robert G. Heath (editor), Serological Fractions in Schizophrenia: A Research Symposium. New York: Harper.
CONFERENCE ON NEUROPHARMACOLOGY, FOURTH, SEPTEMBER 25-27, 1957, PRINCETON, N.J. 1959 Neuropharmacology: Transactions. Edited by Harold A. Abramson. New York: Josiah Macy, Jr. Foundation.
ELKES, J. 1966 Psychoactive Drugs: Some Problems and Approaches. Pages 4-21 in P. Solomon (editor), Psychiatric Drugs. New York: Grune.
FESSEL, W. J.; and GRUNBAUM, B. W. 1961 Electrophoretic and Analytical Ultra-centrifuge Studies in Sera of Psychotic Patients: Elevation of Gamma Globulins and Macroglobulins, and Splitting of Alpha Globulins. Annals of Internal Medicine 54:1134-1145.
FØLLING, A. 1934 Excretion of Phenylpyruvic Acid in Urine as Metabolic Anomaly in Connection With Imbecility. Nordisk medicinsk tidskrift (Stockholm) 8:1054-1059.
FRIEDHOFF, A. J.; and VAN WINKLE, E. 1962 The Characteristics of an Amine Found in the Urine of Schizophrenic Patients. Journal of Nervous and Mental Disease 135:550-555.
FROHMAN, CHARLES E. et al. 1960 Further Evidence of a Plasma Factor in Schizophrenia. A.M.A. Archives of General Psychiatry 2:263-267.
GERMAN, G. A. 1963 Effects of Serum From Schizophrenics on Evoked Cortical Potentials in the Rat. British Journal of Psychiatry 109:616-623.
GJESSING, L.; BERNHARDSEN, A.; and FRØSHAUG, H. 1958 Investigation of Amino Acids in a Periodic Catatonic Patient. Journal of Mental Science 104:188-200.
GJESSING, R. 1938 Disturbances of Somatic Functions in Catatonia With a Periodic Course, and Their Compensation. Journal of Mental Science 84:608-621.
GREEN, J. D. 1964 The Hippocampus. Physiological Reviews 44:561-608.
HADDAD, R. K.; and RABE, AUSMA 1963 An Antigenic Abnormality in the Serum of Chronically Ill Schizophrenic Patients. Pages 151-157 in Robert G. Heath (editor), Serological Fractions in Schizophrenia: A Research Symposium. New York: Harper.
HEATH, R. G. et al. 1958 Behavioral Changes in Nonpsychotic Volunteers Following the Administration of Taraxein, the Substance Obtained From Serum of Schizophrenic Patients. American Journal of Psychiatry 114:919-920.
HOFFER, A. et al. 1957 Treatment of Schizophrenia With Nicotinic Acid and Nicotinamide. Journal of Clinical and Experimental Psychopathology 18:131-158.
JERVIS, G. A. 1953 Phenylpyruvic Oligophrenia Deficiency of Phenylalanine-oxidizing System. Society for Experimental Biology and Medicine, Proceedings 82:514-515.
KETY, SEYMOUR S. 1959 Biochemical Theories of Schizophrenia. Science New Series 129:1528-1532, 1590-1596.
KETY, SEYMOUR S. 1960 Measurement of Local Blood Flow by the Exchange of an Inert, Diffusable Substance. Volume 8, pages 228-236 in Methods in Medical Research. Edited by H. D. Bruner. Chicago: Year Book Publishers.
KETY, SEYMOUR S.; and SCHMIDT, C. F. 1948 Nitrous Oxide Method for the Quantitative Determination of Cerebral Blood Flow in Man: Theory, Procedure and Normal Values. Journal of Clinical Investigation 27:476-483.
LEVI-MONTALCINI, RITA; and ANGELETTI, PIETRO U. 1961 Biological Properties of a Nerve-growth Promoting Protein and Its Antiserum. Pages 362-377 in International Neurochemical Symposium, Fourth, Varenna, Italy, 1960, Regional Neurochemistry; the Regional Chemistry, Physiology, and Pharmacology of the Nervous System: Proceedings. Edited by Seymour S. Kety and Joel Elkes. New York: Pergamon.
NAUTA, W. J. 1958 Hippocampal Projections and Related Neural Pathways to the Mid-brain in the Cat. Brain 81:319-340.
OSMOND, H.; and SMYTHIES, J. 1952 Schizophrenia: A New Approach. Journal of Mental Science 98:309-315.
PENROSE, LIONEL; and QUASTEL, JUDA H. 1937 Metabolic Studies in Phenylketonuria. Biochemical Journal 31:266-274.
PERSKY, H. et al. 1958 Relation of Emotional Responses and Changes in Plasma Hydrocortisone Level After Stressful Interview. A.M.A. Archives of Neurology and Psychiatry 79:434-447.
POLLIN, WILLIAM; CARDON, PHILIPPE V. JR.; and KETY, SEYMOUR S. 1961 Effects of Amino Acid Feedings in Schizophrenic Patients Treated With Iproniazid. Science New Series 133:104-105.
PRIBRAM, K. H.; and KRUGER, L. 1954 Functions of the "Olfactory Brain." New York Academy of Sciences, Annals 58:109-138.
RAMEY, E. R.; and GOLDSTEIN, M. S. 1957 The Adrenal Cortex and the Sympathetic Nervous System. Physiological Reviews 37:155-195.
RODNICK, E. H.; and SHAKOW, D. 1940 Set in the Schizophrenic as Measured by Composite Reaction Time Index. American Journal of Psychiatry 97:214-225.
SCHILDKRAUT, JOSEPH J. 1965 The Catecholamine Hypothesis of Affective Disorders: A Review of Supporting Evidence. American Journal of Psychiatry 122:509-522.
SOKOLOFF, LOUIS; and KAUFMAN, SEYMOUR 1959 Effects of Thyroxine on Amino Acid Incorporation Into Protein. Science New Series 129:569-570.
STROM-OLSEN, R.; and WEIL-MALHERBE, H. 1958 Humoral Changes in Manic-Depressive Psychosis With Particular Reference to the Excretion of Catechol Amines in Urine. Journal of Mental Science 104:696-704.
THUDICHUM, JOHN W. L. 1884 A Treatise on the Chemical Constitution of the Brain. London: Balliere.
WALLACE, H. W.; MOLDAVE, K.; and MEISTER, A. 1957 Studies on Conversion of Phenylalanine to Tyrosine in Phenylpyruvic Oligophrenia. Society for Experimental Biology and Medicine, Proceedings 94:632-633.
WELCH, BRUCE L.; and WELCH, ANN MARIE 1965 An Effect of Aggregation Upon the Metabolism of Dopamine-1-H3. Pages 201-206 in Symposium on Binding Sites of Brain Biogenic Amines, Galesburg, Ill., 1963, Biogenic Amines. Progress in Brain Research, Vol. 8. Amsterdam: Elsevier.
WOOLF, L. I.; and VULLIAMY, D. G. 1951 Phenylketonuria With Study of Effect Upon It of Glutamic Acid. Archives of Disease in Childhood 26:487-494.

IV
EPIDEMIOLOGY
"Epidemiology" refers to the science which studies "the mass phenomena of disease" (Greenwood 1935) by determining the distribution of conditions or diseases and the factors which determine these distributions (Lilienfeld 1959); that is, it is "the study of the distribution and determinants of disease prevalence in man" (MacMahon et al. 1960). The analysis that epidemiology makes of these findings results in a "medical ecology" (Gordon 1952). Epidemiology relates observed distributions of disorders to the environments in which people live—the physical, biological, and social environments. "Mental disorder" is used in this article to refer to the full range of psychic conditions identified by psychiatrists or competent social authorities as abnormal or needing improvement. This is a broader range than would be used in planning or conducting any single inquiry but permits consideration, where necessary, of studies of delinquency, criminality, military desertion, and group fads (such as fish swallowing), as well as any conventional psychiatric diagnostic category or an individual symptom or special syndrome recognized in psychiatry. 
Uses of epidemiology

Seven uses of epidemiology can be distinguished (Morris 1957): (1) knowledge regarding historical trends helps to distinguish disorders that are on the increase from those that are disappearing; (2) community diagnosis of the size, location, and distribution of a condition aids in planning health programs for the community; (3) from accumulated records of the ages at which individuals contract a disease, individual risks can be calculated (a basic tool in calculating insurance premiums), and knowledge of contingency risks aids in estimating the effects of host factors in determining the distribution of cases; (4) knowledge of the attributes of cases not in treatment enlarges the clinical picture by making our concept of a disorder less dependent on the clinician's limited perspective on cases; (5) occasionally, new syndromes are identified because clinically dissimilar
cases are found to arise from a particular common background or because clinically similar cases are found to arise in two or more distinct sets of circumstances; (6) the working of health services can be studied in terms of their successes and failures, their selection of cases for treatment, and their deleterious effects on the people they seek to serve; and (7) in the search for causes of disorders, data on the factors associated with the distribution of a disorder supplement laboratory and clinical data in the elucidation of causal mechanisms—at times the crucial breakthrough in our understanding of the way in which a condition is caused is made by epidemiological inquiry (this occurred with cholera, pellagra, and lung cancer). Historical trends. Historical trends are important but difficult to study. The Group for the Advancement of Psychiatry reviewed psychiatric knowledge recently (1961). The use of old data gathered for another purpose is sometimes tried. It is difficult to identify two groups at two different points in time which can be considered different time samples of the same population. If the questions are asked broadly enough and if the population being considered has some sort of continuing dimensions, an approximation of trends can sometimes be established. Two studies are noteworthy because they are particularly well done. Goldhamer and Marshall (1949), in a superb study of the admission of psychotics to mental hospitals in Massachusetts during a hundred-year period, found evidence to contradict the widely held view that schizophrenia is becoming more common. In spite of the work's excellence, the implications of the findings remain uncertain, mainly because the data depend entirely on records of cases admitted to mental hospitals and because the population "sample" was taken from an area (Massachusetts) that was the first in North America to be industrialized and was subject to gross emigration and immigration during the time period observed. 
With the hope of showing that an inverse relationship between intelligence and fertility was leading to a decrement in the national average intelligence quotient (IQ), a survey of Scottish intelligence tested "all" 11-year-olds on one day in 1930 and repeated almost the same procedure on one day in 1949. The findings were negative, according to the publications (Scottish Council for Research in Education 1953), but reinterpretation of the differences between the methods used in the two surveys suggests that the prevalence of very low scores among Scottish 11-year-olds may actually have decreased (Gruenberg 1964).
Community diagnosis. Many studies of particular communities have been carried out for diagnostic purposes; some outstanding examples are mentioned later in this article, in the section on case-finding methods. Since community diagnosis, by its nature, is done for a particular community, it is no more reasonable to borrow the diagnosis of one community and apply it to another than to borrow a neighbor's X ray because he had a similar cough. The general picture will be more or less the same, but the details which differentiate one community from the other may be crucial. Techniques for making rapid and relevant surveys of communities to aid mental health planning are sadly lacking. A few demographic facts are often used (as in the American Psychiatric Association's consultations) to infer what the findings would be. From the health service's point of view, enumerations of cases that do not distinguish preventable or curable conditions from nonpreventable or noncurable ones are of little value. One learns only that the problem is small, big, very big, or enormous, depending on how one defines the problem. The American Public Health Association's Mental Disorders (1962) is a milestone because it indicates the conditions for which it is currently important to be able to enumerate cases. As the technology of treatment and prevention grows, this list will grow. This analysis also indicates that the social-breakdown syndrome is important in evaluating the benefits of a comprehensive community mental health service (e.g., Gruenberg 1966). Similarly, future studies for community diagnosis, as well as for analysis of health services, will require techniques for counting cases of conditions for which something can be done.

The calculation of individual risks. The risk of a child's being Mongoloid is dependent on its mother's age at the time of birth, but not its father's (Penrose 1949).
If a woman has German measles while pregnant, there is an increased risk of her child's being brain-damaged; this increased risk is highest if the infection occurs during the third month of pregnancy (Hill et al. 1958). If a young child is removed from his parental home for months, there is an increased risk of nightmares, bed-wetting, and some other neurotic symptoms; if the mother leaves the child's home for several months, the increased risk is much less or nonexistent (Douglas & Blomfield 1958, pp. 112-113). Calculating individual risks is frequently helpful; it should not be confused with measuring the incidence of a condition (number of new cases arising during a unit of time in a defined population, divided by the number of people in the defined population), or with measuring the prevalence (number of cases present at one point in time in a defined population, divided by the number of persons in that population at that point in time).
Enlarging the clinical picture. The clinical picture of a condition is almost automatically enlarged by follow-up studies. Follow-up studies of persons identified at about the age of 12 or 13 as mentally retarded are needed now because many cross-sectional prevalence studies have shown an age distribution of cases indicating a rapid fall in prevalence after age 14 that is incompatible with the definition of the condition, which includes the concept of a fixed, permanent state. [See MENTAL RETARDATION.]
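The definitions of incidence and prevalence given above, which the text distinguishes from individual risk, are simple ratios. The following minimal Python sketch illustrates them; the function names and the community figures are hypothetical and are not drawn from any study cited in this article.

```python
def incidence_rate(new_cases: int, population: int) -> float:
    """New cases arising during a unit of time, divided by the
    number of people in the defined population."""
    if population <= 0:
        raise ValueError("population must be positive")
    return new_cases / population


def point_prevalence(cases_present: int, population_at_time: int) -> float:
    """Cases present at one point in time, divided by the number of
    persons in the population at that same point in time."""
    if population_at_time <= 0:
        raise ValueError("population must be positive")
    return cases_present / population_at_time


# Hypothetical community of 50,000 persons (illustrative figures only).
incidence = incidence_rate(new_cases=125, population=50_000)
prevalence = point_prevalence(cases_present=900, population_at_time=50_000)

print(f"incidence per 1,000 per year: {incidence * 1000:.1f}")
print(f"prevalence per 1,000: {prevalence * 1000:.1f}")
```

In this made-up example the prevalence is several times the annual incidence, as would be expected for a long-lasting condition. Neither figure says anything about the risk attached to a particular individual, which depends on characteristics such as those listed under the calculation of individual risks.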
The Medical Research Council Unit for Research on the Epidemiology of Psychiatric Illness provides psychiatric care in a general hospital ward in Edinburgh to which are routinely brought, regardless of severity, all self-poisoning cases in the city of Edinburgh. This comprehensive experience has made the clinicians aware of self-poisoning cases with intent at self-destruction by people who do not exhibit evidence of any psychiatric disorder; they find that the severity of need for psychiatric attention has no relation to the probability that the self-poisoning would lead to death. Thus, looking at all cases of self-poisoning is beginning to provide a different picture of the range of clinical findings.
Syndrome identification. The study of population distributions of cases resulted in separating typhus ("jail fever") from typhoid (water- and food-borne). Psychiatry has not made progress this way. Yet, certain diagnostic categories have been set up in terms of the age or personal characteristics or situations of the cases (e.g., involutional melancholia, combat fatigue, dementia praecox, senile dementia); such a classification short-circuits epidemiological inquiry. The course of disorders has been a key criterion in characterizing manic-depressive psychoses, dementia praecox, and mental retardation. Patients discourteous enough to violate the rules have simply had their diagnoses changed. Such practices hinder progress. "Puerperal psychoses" have been in and out of fashion, but pregnancy has not yet been shown to be associated with psychoses (Pugh et al. 1963). Involutional melancholia is confined to certain age levels and is out of favor at present. These illustrations indicate the special problems of classification in psychiatry, where entities like the Ganser syndrome in prisoners and combat fatigue in soldiers gain currency in each generation.
The social-breakdown syndrome is a new sociogenic entity. Its identification arose from observations that can be loosely termed clinical epidemiology (Pickles 1939). A community served by a single mental hospital that undergoes radical reform of patient care and breaks down barriers between hospital and community services stops producing new cases of severely deteriorated, helpless, vegetating individuals. Similar reforms elsewhere lead to similar observations. As a result of these observations, clinicians with a broad (population) perspective change their views of the disorders. Disturbances in capacities to fill social roles come to be seen as extrinsic to the mental disorders of the patients and of secondary importance in comparison with the society's customs and attitudes regarding mental disorders which result in rejections and degrading behavior toward the ill. This sequence of observations leads to a reformulation of clinical syndromes that puts together manifestations previously thought to be due to several different disorders and that attributes these manifestations to a series of particular social events likely to occur in the presence of any of these disorders. Like any other conclusion derived from unsystematic observations, it may gain currency because it fits the prevailing preconceptions of many people, without being further established. Unless systematic evidence is obtained, its validity remains untested. The new sociogenic social-breakdown syndrome is best examined as a promising hypothesis. In order to test the sociogenic hypothesis, the syndrome must be defined operationally and case-finding techniques developed. These must be kept separate from the specification of the social conditions suspected of favoring the syndrome's appearance.
Recent investigations have shown that these are soluble problems and confirm the hypothesis in large part; further investigation will also make clearer which mental disorders make people particularly susceptible to the noxious social forces [Gruenberg 1966; see also PSYCHIATRY, article on SOCIAL PSYCHIATRY]. Working of health services. Much of the data gathered on the distribution of mental disorders can be studied to help us understand how hospitals and clinics work. This can be a productive approach to data gathered for other uses. When a population is surveyed for cases similar to those that have gone to a hospital, many unhospitalized cases are found. This suggests that social forces control admissions; if so, these social forces may account
partly, or entirely, for variations in admission rates. This proved to be a productive hypothesis in studying the distribution of admissions for the elderly (Mental Health Research Unit 1959-1960) and could be used to interpret many findings (e.g., Faris and Dunham 1939; Goldhamer and Marshall 1949). Readmission rates can also be looked at this way. Freeman and Simmons (1963) found that the types of homes to which mental hospitals released patients did not affect the probability of return to the hospitals; from this they concluded that reasons for rehospitalization are unrelated to the social environment. However, the same data can be approached by starting with the assumption that a hospital staff releases only certain patients and, in deciding which to release, takes the family situation into account; if this is true, the data can be used to argue that the observed lack of difference only shows that the various staffs are equally competent in evaluating the suitability of different types of homes for their patients. Thus, the plan to see whether variations in home living arrangements affect the probability of rehospitalization by studying a cohort of patients released from hospitals to varying home situations is irrelevant to the question. The plan is suitable for an evaluation of the hospitals' release policies with regard to different types of home situations. Obviously, volunteer subjects cannot be assumed to be randomly selected subjects; it is just as important, but sometimes more difficult, to perceive that subjects have been selected by someone else or by some agency. It is necessary to realize that when a bureaucracy selects subjects, the data reflect the behavior of the bureaucracy. The same principle applies to analysis of first mental-hospital admission rates.
Preventive trials. The most important studies of health services are those which are carried out in the form of a preventive trial when the health services seek to prevent a particular disorder.
The population given the service is selected so that there is a comparable control population to whom service is not given (Pasamanick et al. 1964). The study then becomes a crucial experiment for the health service; it can also be a crucial experiment, if all goes well, for testing an etiological hypothesis. The preventive trial is often thought to represent the last stage of inquiry and to be justified only when much other evidence has been accumulated. But when the preventive procedure advocated is believed to be harmless and generally thought to be desirable and when its supply cannot satisfy all demands, the preventive trial is justified
even if prior evidence is very weak. It is wrong to assume that preventive trials are inherently more expensive, more dangerous, or more difficult than passive studies; such research can be easier and cheaper. However, creating the opportunity for preventive trials and ascertaining that the design of the study is being adhered to throughout pose difficult problems.
The search for causes. Each use described above generates data with implications for causes. Outstanding today are the efforts to identify the mechanisms which lead to what Pasamanick has called the "continuity of fetal damage," ranging from death to mild brain damage followed by reading disabilities, impulsive behavior disorders, and other syndromes. Fetal damage has been linked to rubella, to the effects of poverty, and to early complications of pregnancy. Several lines of evidence suggest its linkage to very mild or moderate malnutrition of the mother during the first few months of gestation (MacMahon & Sowa 1961). The hypothesis of schizophrenogenic mothering is being pursued (Lidz & Fleck 1960). Current work is clarifying the nature of the hypothesis and may do more [see SCHIZOPHRENIA]. Maternal deprivation, as Bowlby (Bowlby et al. 1956) named a hypothetically pathogenic experience, has been investigated a number of times (e.g., Douglas & Blomfield 1958). These studies, like those of Pasamanick, not only look for causes but relate dissimilar clinical syndromes to one set of causes. Many investigations are required before the relationships become clarified and before the credibility of the hypothesis can be appraised. Bowlby's own study has weakened the initial hypothesis. Hunt (1965) has recently pointed to some further weaknesses in the hypothesis [see INFANCY, article on THE EFFECTS OF EARLY EXPERIENCE].
Down's syndrome (Mongolism, trisomy-21) is not only a major cause of serious mental handicap, but the new and growing knowledge regarding the associated chromosomal abnormality supports the notion that some particular cause or group of causes must be at work and that other chromosomal abnormalities may have the same causes. It is complicated to investigate these conditions, and social phenomena have not yet been implicated. Too often efforts to find the determinants of the distribution of a mental disorder are launched as one-shot investigations, without the researchers' knowing what the distribution of cases is. This procedure assumes that if the distribution is found to be that predicted in the hypothesis, then the hypothesis has been proved, and if not, then it has been disproved. Such a course is not absolutely
doomed to failure, but it is not likely to lead to the gathering of data that can rule out alternative explanations of the observed distributions.
Tools and techniques
Case finding. Hospital and state school records, clinic outpatient records, records of private practitioners, and key-informant methods have all been used repeatedly in case finding. Hospital records have been used to study first-admission rates in relation to various hypotheses. Faris and Dunham (1939) pioneered the analysis of first-admission rates in terms of the modern social ecology of a city. Using the census-tract classification of Chicago, based on the University of Chicago sociology department's methods of characterizing urban land use, they allocated the first admissions to mental hospitals to the census tract of origin. They predicted correctly the now well-established fact that the first-admission rates for schizophrenia are highest in the central, deteriorated section of the city and decline with the distance of neighborhoods from the center. This was a startlingly successful fusion of Durkheim's ideas, urban sociology, and a hypothesis about the social origins of schizophrenic syndromes. Even more startling were the failure of manic-depressive psychoses to fall into such a pattern and the existence of a different pattern for organic psychoses. These different patterns tend to confirm the importance of the psychiatric diagnosis in spite of the skepticism of many psychiatrists regarding the objective nature of their diagnoses. These findings have been confirmed in other cities. A literature has developed seeking to account for the high first-admission rates of schizophrenics from the city's center. The straightforward notion that the depersonalized, socially isolated part of the city favored the development of schizophrenic disorders has not been universally accepted.
Attempts to demonstrate that the disproportionate concentration of schizophrenic cases coming to hospitals from the central part of the city is due to consequences of schizophrenic disorders rather than the pathogenic nature of these neighborhoods can be thought of as studies of the "drift hypothesis." This states that schizophrenics tend to drift into the rooming-house areas of cities at a higher rate than do other people, producing the concentration of cases found there. Sanua (1963) points to the main studies of this hypothesis and to the conflicting evidence, citing Morrison's (1959) finding that patients had a lower social-class status than their fathers, whose social-class distribution was similar to that of the
general population. The inference that schizophrenia causes downward social mobility does not necessarily follow from this observation; since occupational levels tend to rise with age, schizophrenics may accumulate in lower occupational categories because they fail to climb the occupational ladder as fast as other young people. If downward mobility (or failure to rise with one's generation) is really at the root of the phenomenon, then the etiological theory advanced by Faris and Dunham does not hold. Their observations would remain, however, and require explanation. A more recent study, Social Class and Mental Illness, by Hollingshead and Redlich (1958), found that in New Haven the prevalence of treated cases was related to social class. The authors asserted that their data show a gradient of prevalence rates which falls from a high rate in the lowest social class to a low rate in the upper social classes. Miller and Mishler (1959), however, state (correctly, I believe) that the New Haven data show only a very high rate for the lowest social class and that the observed figures in other classes do not demonstrate a gradient of rates. The difficulties of interpreting such studies are compounded by the reliance of the studies on clinical records of cases in treatment and by the social factors affecting hospital-utilization patterns. To obviate the weaknesses of relying solely on clinical records, a method centering on a structured household-interview questionnaire derived in part from the Cornell neuropsychiatric inventory has been developed; this method has not yet been systematically calibrated (Macmillan 1959), but it holds out a promising potential. Its first important use for case finding on a large scale was by Stouffer (Stouffer et al. 1949) in World War II. It was one of the methods used in the first large-scale metropolitan survey, the Midtown Manhattan Study (Langner & Michael 1963).
One population of over two thousand has been personally interviewed by psychiatrists at both the beginning and the end of a decade (Essen-Möller 1956). Participant-observers with a psychiatric background have been useful (e.g., Eaton 1955). Cases of severe deterioration (severe social-breakdown syndrome) have been found by means of a semistructured interview conducted by specially trained public health nurses, psychiatric social workers, and graduate behavioral-science students. Another method involves interviewers' filling out structured questionnaires, together with answers to open-ended questions, after they have been trained to complete the protocols following open-ended interviews; these protocols are then evaluated by psychiatrists, who categorize the individuals (Mental Health Research Unit 1959-1960). A method similar in principle was used in the following years in the age groups under 60 in Stirling County and in midtown Manhattan (A. Leighton 1959; Hughes et al. 1960; D. Leighton et al. 1963; Langner & Michael 1963). Cases derived from nonmedical-service-agency records were identified by Lemkau and associates (1941-1942) in the Eastern Health District in Baltimore and in the survey of the mentally retarded in Onondaga County, New York State, by the Mental Health Research Unit staff, under the direction of the program director (Goodman et al. 1956). All of these are useful devices. In practice all are complicated to use and none are easily applied with precision and accuracy. None of them have been adequately calibrated against a standardized ultimate criterion. Such a criterion requires a set of explicit objective criteria for identifying a case, a standardized method of observation, and an estimate of observer variability (Cochrane et al. 1951). The methods of household sampling of populations, highly developed by social scientists, have been used in a number of studies; their most extensive use currently is in the continuous National Health Survey (U.S. National Health Survey 1958).
Sorting populations. The use of statistical techniques in planning and interpreting studies has developed to a very high level, borrowing from general statistical theory, agricultural research, genetic research, social science, and economics. Epidemiologic inquiries have contributed devices for age standardization and for adjustment of data, which have in turn been used by those fields (Hill 1937). The study of case aggregations in neighborhoods, households, and families has developed a whole series of techniques. Some of these depend on the concept of primary-case (or proband) rates and secondary-case rates.
In the study of mental disorders this has been most frequently associated with genetic hypotheses, and in these instances the concept of lifelong expectation of manifesting the disorder has been developed. In this field the formulas of Wilhelm Weinberg and the mathematician G. H. Hardy (1908) and of Stern (1943) are particularly appropriate. They depend, however, on estimates of differential death and migration rates, whose validity is hard to judge from existing data. The study of familial and household aggregations is not confined to genetic hypotheses, however, as is shown in a review of the literature
(Gruenberg 1950) and by Bleuler's review of studies on schizophrenia (1955); a review of the knowledge regarding group disorders brings together concepts from psychiatry, social psychiatry, and social psychology (Gruenberg 1957). Populations have commonly been characterized by age and sex and time ever since the Hippocratic writings on times, places, and persons. Variations in incidence and prevalence rates by age and sex are frequently interpreted as though age and sex were causative factors. Generally they are not mechanisms by which disorders are produced but convenient ways of classifying populations which develop disorders at different rates. The suspected mechanisms which are distributed according to age and sex also need to be looked into, as was done recently by Langner and Michael (1963). Categorizing the population according to previous illnesses is a frequent but not highly standardized procedure. Except for body typing, the categorizing of populations according to physical characteristics has not been used in inquiries regarding mental disorders. Studies attempting to link genetic characteristics to certain disorders have focused on blood types. It is to be expected that in the next decade other physical characteristics will prove relevant to mental disorder epidemiology. Categorizing populations by their social characteristics is almost standard nowadays in epidemiological inquiry in all fields. The social environment has usually been regarded as important by epidemiologists, and this recognition has been increased greatly by advances in methods of characterizing socioeconomic status and other social variables. The complexity of socioeconomic classifications in a rapidly changing society has made this type of categorizing hard to standardize; attention will probably become more focused on specific characteristics of individuals, some of which are incorporated into indices of socioeconomic status. 
One study showed that hospital-admission rates for elderly persons were unrelated to economic levels of neighborhoods but were closely related to the frequency of multiple-family dwellings in neighborhoods; yet no correlations with socioeconomic status (a composite index) could be found [see Gruenberg 1953; see also AGING].
The conduct of studies. The list of references makes it clear that contributions to the epidemiology of mental disorder have been made by workers with various professional backgrounds; there is no reason to expect this to be altered in the future. Technical expertness in such work does not develop from any one course of professional training but is acquired by experience in conducting and interpreting the findings of investigations. Too many investigations have been conducted with the hope of testing a hypothesis about the determinants of the distribution of a disorder in a single investigation. The opportunity for refining a hypothesis and going back with experienced investigators to the same population rarely occurs. But such steps can be expected to yield larger returns in knowledge than a proliferation of single investigations. These defects in the social organization of research are being rectified by the creation of permanent laboratories for conducting investigations. The first such laboratory was created at Syracuse, New York, in 1950, on a pilot basis, by the New York State Department of Mental Hygiene and was made permanent in 1955. Since then the Medical Research Council of Great Britain has created one in the University of Edinburgh department of psychiatry; the Danish government has set up one at Aarhus; and the Swedish government, one at Lund. The U.S. National Institute of Mental Health finances one at the Columbia University department of psychiatry. These improvements in social support should lead to a more rapid exploitation of advances as they are made and thus should speed up the acquisition of knowledge. But it is to be expected that a larger number of contributions to our understanding of mental-disorder epidemiology will continue to come from workers who are not labeled epidemiologists and who will in many instances not regard their work as particularly relevant to mental health. The epidemiologist will continue to be interested in getting answers to his questions and will, hopefully, judge new contributions on their merits rather than on their author's school of thought or previous conditions of servitude (i.e., degree sequences). In the context of this encyclopedia it may be well to point out that all disorders, whatever the causes, have distributions that reflect social factors.
This universal proposition follows from the simple fact that all classes of causes have such social distributions. For example, Goldberger (1964) showed that pellagra is due to a nutritional deficiency by studying its incidence in different social groups in southern mill towns. MacMahon and Koller (1957) showed that the higher leukemia death rate among whites, as compared to nonwhites, may well be due to more exposure to diagnostic radiation among Jews (who apparently use medical specialists at a higher rate). Gelfand and his associates (1957) showed that naturally acquired immunity to polio viruses occurs at early ages most frequently in
low-income groups. Böök (1961) reviewed accumulations of genes in particular linguistic and social groupings. Roueché (1954) describes how poisons can spread through socially isolated parts of a population. Sometimes the social forces are considered the main factors (as in "diseases of poverty" and in "diseases of affluence," such as coronary heart disease) and sometimes the intermediate variables that help unravel the chain (as in pellagra). But in understanding every distribution of disorders, knowledge of social forces plays some role.
ERNEST GRUENBERG
[Directly related are the entries EPIDEMIOLOGY and PUBLIC HEALTH. Other relevant material may be found in POPULATION; PSYCHIATRY; SAMPLE SURVEYS.]
BIBLIOGRAPHY
AMERICAN PUBLIC HEALTH ASSOCIATION, TECHNICAL DEVELOPMENT BOARD, PROGRAM AREA COMMITTEE ON MENTAL HEALTH 1962 Mental Disorders: A Guide to Control Methods. New York: The Association.
BLEULER, M. 1955 Research and Changes in Concepts in the Study of Schizophrenia: 1941-1950. Isaac Ray Medical Library, Bulletin [1955]: 1-132.
BÖÖK, JAN A. 1961 Genetical Etiology in Mental Illness. Pages 14-45 in Milbank Memorial Fund, Causes of Mental Disorders: A Review of Epidemiological Knowledge, 1959. New York: The Fund.
BOWLBY, JOHN et al. 1956 The Effects of Mother-Child Separation: A Follow-up Study. British Journal of Medical Psychology 29:211-247.
COCHRANE, A. L.; CHAPMAN, P. J.; and OLDHAM, P. D. 1951 Observers' Errors in Taking Medical Histories. Lancet IB: 1007-1009.
DOUGLAS, JAMES W. B.; and BLOMFIELD, J. M. 1958 Children Under Five. London: Allen & Unwin.
EATON, JOSEPH W. 1955 Culture and Mental Disorders: A Comparative Study of the Hutterites and Other Populations. Glencoe, Ill.: Free Press.
ESSEN-MÖLLER, ERIK 1956 Individual Traits and Morbidity in a Swedish Rural Population. Acta psychiatrica et neurologica scandinavica Supplement 100.
FARIS, ROBERT E. L.; and DUNHAM, H. WARREN (1939) 1960 Mental Disorders in Urban Areas: An Ecological Study of Schizophrenia and Other Psychoses. New York: Hafner.
FREEMAN, HOWARD E.; and SIMMONS, OZZIE G. 1963 The Mental Patient Comes Home. New York and London: Wiley.
GELFAND, HENRY M. et al. 1957 Studies on the Development of Natural Immunity to Poliomyelitis in Louisiana. American Journal of Hygiene 65:367-385.
GOLDBERGER, JOSEPH 1964 Goldberger on Pellagra. Edited by Milton Terris. Baton Rouge: Louisiana State Univ. Press. → Reprint of 17 papers.
GOLDHAMER, HERBERT; and MARSHALL, ANDREW W. (1949) 1953 Psychosis and Civilization. Glencoe, Ill.: Free Press. → First published as The Frequency of Mental Disease: Long-term Trends and Present Status.
GOODMAN, M. B. et al. 1956 A Prevalence Study of Mental Retardation in a Metropolitan Area. American Journal of Public Health 46:702-707.
GORDON, JOHN E. 1952 The Twentieth Century—Yesterday, Today, and Tomorrow (1920). Pages 114-167 in Franklin H. Top (editor), The History of American Epidemiology. St. Louis, Mo.: Mosby.
GREENWOOD, MAJOR 1935 Epidemic and Crowd Diseases. London: Williams & Norgate.
GROUP FOR THE ADVANCEMENT OF PSYCHIATRY, COMMITTEE ON PREVENTIVE PSYCHIATRY 1961 Problems of Estimating Changes in Frequency of Mental Disorders. New York: The Group.
GRUENBERG, ERNEST M. 1950 Review of Available Material on Patterns of Occurrence of Mental Disorders: Major Disorders. Pages 176-196 in Milbank Memorial Fund, Epidemiology of Mental Disorder. New York: The Fund.
GRUENBERG, ERNEST M. 1953 Community Conditions and Psychoses of the Elderly. American Journal of Psychiatry 110:888-896.
GRUENBERG, ERNEST M. 1957 Socially Shared Psychopathology. Pages 201-229 in Alexander H. Leighton, John A. Clausen, and Robert N. Wilson (editors), Explorations in Social Psychiatry. New York: Basic Books.
GRUENBERG, ERNEST M. 1964 Epidemiology. Pages 259-306 in Harvey A. Stevens and Rick Heber (editors), Mental Retardation. Univ. of Chicago Press.
GRUENBERG, ERNEST M. (editor) 1966 Evaluating the Effectiveness of Mental Health Services. Milbank Memorial Fund Quarterly 44, no. 1, part 2.
HARDY, G. H. 1908 Mendelian Proportions in a Mixed Population. Science 28:49-50.
HILL, A. BRADFORD (1937) 1961 Principles of Medical Statistics. 7th ed. London: Lancet.
HILL, A. BRADFORD et al. 1958 Virus Diseases in Pregnancy and Congenital Defects. British Journal of Preventive and Social Medicine 12:1-7.
HOLLINGSHEAD, AUGUST B.; and REDLICH, FREDERICK C. 1958 Social Class and Mental Illness: A Community Study. New York: Wiley.
HUGHES, CHARLES et al. 1960 People of Cove and Woodlot: Communities From the Viewpoint of Social Psychiatry. The Stirling County Study of Psychiatric Disorder and Sociocultural Environment, Vol. 2. New York: Basic Books.
HUNT, J. McV. 1965 Traditional Personality Theory in the Light of Recent Evidence. American Scientist 53:80-96.
LANGNER, THOMAS S.; and MICHAEL, STANLEY T. 1963 Life Stress and Mental Health. Volume 2 of The Midtown Manhattan Study. New York: Free Press.
LEIGHTON, ALEXANDER H. 1959 My Name Is Legion: Foundations for a Theory of Man in Relation to Culture. The Stirling County Study of Psychiatric Disorder and Sociocultural Environment, Vol. 1. New York: Basic Books.
LEIGHTON, DOROTHEA et al. 1963 The Character of Danger. The Stirling County Study of Psychiatric Disorder and Sociocultural Environment, Vol. 3. New York: Basic Books.
LEMKAU, PAUL; TIETZE, CHRISTOPHER; and COOPER, MARCIA 1941-1942 Mental-hygiene Problems in an Urban District. Mental Hygiene 25:624-646; 26:100-119, 275-288.
LIDZ, THEODORE; and FLECK, STEPHEN 1960 Schizophrenia, and Human Integration, and the Role of the Family. Pages 323-345 in Don D. Jackson (editor), The Etiology of Schizophrenia. New York: Basic Books.
LILIENFELD, ABRAHAM M. 1959 A Methodological Problem in Testing a Recessive Genetic Hypothesis in Human Disease. American Journal of Public Health 49:199-204.
MACMAHON, BRIAN; and KOLLER, ERNEST K. 1957 Ethnic Differences in the Incidence of Leukemia. Blood: The Journal of Hematology 12:1-10.
MACMAHON, BRIAN; PUGH, THOMAS F.; and IPSEN, JOHANNES 1960 Epidemiologic Methods. Boston: Little.
MACMAHON, BRIAN; and SOWA, JAMES M. 1961 Physical Damage to the Fetus. Pages 51-110 in Milbank Memorial Fund, Causes of Mental Disorders: A Review of Epidemiological Knowledge, 1959. New York: The Fund.
MACMILLAN, ALLISTER M. 1959 A Survey Technique for Estimating the Prevalence of Psychoneurotic and Related Types of Disorders in Communities. Pages 203-218 in American Psychiatric Association, Epidemiology of Mental Disorder. Washington: The Association.
MENTAL HEALTH RESEARCH UNIT, NEW YORK STATE DEPARTMENT OF MENTAL HYGIENE, SYRACUSE, NEW YORK 1959-1960 A Mental Health Survey of Older People. Parts 1-3. Psychiatric Quarterly Supplement 33:45-99, 252-300; 34:34-75.
MILLER, S. M.; and MISHLER, E. G. 1959 Social Class, Mental Illness, and American Psychiatry: An Expository Review. Milbank Memorial Fund Quarterly 37:174-199.
MORRIS, JEREMY N. (1957) 1965 Uses of Epidemiology. 2d ed. Baltimore: Williams & Wilkins.
MORRISON, S. L. 1959 Principles and Methods of Epidemiological Research and Their Application to Psychiatric Illness. Journal of Mental Science 105:999-1011.
PASAMANICK, BENJAMIN et al. 1964 Home vs. Hospital Care for Schizophrenics. Journal of the American Medical Association 187:177-181.
PENROSE, LIONEL S. (1949) 1963 The Biology of Mental Defect. 3d ed. London: Sidgwick & Jackson.
PICKLES, WILLIAM N. 1939 Epidemiology in Country Practice. Bristol (England): Wright; Baltimore: Williams & Wilkins.
PUGH, THOMAS F. et al. 1963 Rates of Mental Disease Related to Childbearing. New England Journal of Medicine 268:1224-1228.
ROUECHÉ, BERTON 1954 Eleven Blue Men. Boston: Little.
SANUA, VICTOR D. 1963 The Etiology and Epidemiology of Mental Illness and Problems of Methodology, With Special Emphasis on Schizophrenia. Mental Hygiene 47:607-621.
SCOTTISH COUNCIL FOR RESEARCH IN EDUCATION, MENTAL SURVEY COMMITTEE 1953 Social Implications of the 1947 Scottish Mental Survey. Univ. of London Press.
STERN, CURT 1943 The Hardy-Weinberg Law. Science 97:137-138.
STOUFFER, SAMUEL A. et al. 1949 The American Soldier. Studies in Social Psychology in World War II. Vols. 1 and 2. Princeton Univ. Press. → Volume 1: Adjustment During Army Life. Volume 2: Combat and Its Aftermath.
U.S. NATIONAL HEALTH SURVEY 1958 Health Statistics, Series A. Volume I. Washington: Government Printing Office.
CHILDHOOD MENTAL DISORDERS
Early conceptions of mental disorders in children are reflected in several clinical papers on child-rearing practices from the sixteenth, seventeenth, and eighteenth centuries (see Kessen 1965). These precursors of present ideas about causality failed to receive widespread acceptance and scientific interest, however, until recently. Even Kraepelin's influential Psychiatrie, first published in 1893, did not discuss mental disorders in children in any of its many editions. Scientific interest in deviant mental processes in children has flowered in the twentieth century, stimulated in large part by theories of mental functioning advanced by Freud, Piaget, Watson, and others, which emphasized developmental and dynamic factors as well as notions of environmental causality and learning. Further, interest in mental disorders in children has increased with the resurgence of scientific interest in children's behavior in general in the twentieth century, particularly since the late 1940s.
Resources for the diagnosis and treatment of mental disorders in children have multiplied in the United States and in Europe since the 1950s. In the United States, the establishment of the National Institute of Mental Health and the National Institute of Child Health and Human Development and the allocation of federal funds for the development of plans for organizing and providing comprehensive mental health services in local communities have stimulated the growth of relevant programs of service, professional training, and research. A federally supported conference of professional organizations and interested agencies was held in 1964 to plan for and ensure the provision of diagnosis and treatment of mental disorders in children under federally sponsored community health programs (American Psychiatric Association 1964).
Definition, predisposition, precipitation. The definition of disorders of mental functioning in children varies depending upon one's conception of mental functioning in general. 
Most professionals (including psychologists, psychiatrists, and social workers) with special training in the diagnosis and treatment of mental disorders in children currently emphasize a multidimensional approach. This approach encompasses psychosomatic considerations; developmental capacities and vulnerabilities; constitutional and genetic factors; the internal personality system, including cognitive, perceptual, and affective mechanisms and the fluidity and plasticity of the child's personality characteristics; and psychosocial considerations, including parent-child relationships, family interactions, and sociocultural influences. This approach also stresses the need to assess and capitalize upon the healthy and adaptive facets of the child's personality. A conceptual framework which encompasses all of the above aspects of the child's functioning inclusively and systematically has not been developed. However, the Committee on Child Psychiatry of the Group for the Advancement of Psychiatry has prepared (in unpublished form) what appears to be the best available approximation of such a framework (Committee on Child Psychiatry 1965). The present discussion of mental disorders in children leans heavily upon its work.
Mental disorder can be defined as a failure in the child's attempt to maintain an adaptive equilibrium between physiological, psychological, and interpersonal systems; there is a close relationship between physical and psychological factors. In the child with certain constitutional or experiential predispositions, disordered mental functioning may be precipitated and sustained by physical, psychological, or social stimuli. Biochemical stimuli or stressful insufficiencies of chemical substances may disrupt the child's physiological equilibrium as well as associated perceptual, cognitive, and emotional systems. In addition, the child's restitutive efforts involve an interaction between physiological, psychological, and social systems.
Stimuli of a psychological nature may also precipitate mental disorder. Such stimuli include conscious or unconscious thoughts and feelings which arouse anxiety in the child because of their association with stressful past or present experiences. They involve the cognitive system, in that they are represented in symbolic form in memory and thinking, as well as the perceptual apparatus. 
Thoughts, memories, and feelings triggered by anxiety-arousing psychological stimuli lead to the child's employment of psychological defenses or compensating behaviors as he seeks to avoid a breakdown in adaptation or equilibrium. Stressful social stimuli in the child's environment may also precipitate disorder. Such stimuli include the loss, or threat of loss, of close interpersonal relationships and the frustration of basic needs resulting from disturbances in relationships within the family.
The nature and severity of the mental disorder precipitated by physical, psychological, or social stimuli, or by interactions among them, are contingent upon the stressfulness of the stimuli as well
as upon genetic and constitutional factors (for example, temperament, body build), previous experience, and the developmental level of the child. Stressful stimuli initially disruptive of one of these systems may, and often do, have repercussions in other systems. Thus mental disorders in children may be precipitated by disruptions of a physical, social, or psychological nature, but the child's defensive efforts or attempts at restoration of ego functions or compensatory behavior almost always involve all three systems. For example, a stressful thought or fantasy may lead to emotional conflict which triggers "signal anxiety." This signal, warning of possible breakdown in adaptation or mental equilibrium, leads to the establishment of psychological defenses or adaptive interpersonal behaviors. If emotional conflict and anxiety are severe or become chronic, physiological concomitants may develop. Such physiological symptoms are reversible if the underlying emotional conflicts are resolved; chronic unresolved conflict and anxiety may lead, however, to serious strain and even breakdown of weakened physiological systems or organs. [See ANXIETY; CONFLICT, article on PSYCHOLOGICAL ASPECTS; STRESS.] In the first several months of life the immaturity of the physical and mental systems results in the infant's responding in a global manner to lack of gratification of his needs or to stressful stimuli. The very young infant shows little differentiation of emotional response; he tends to respond to stressful stimuli with general symptoms of distress (for example, crying, global motor activity). However, there is increasing evidence of systematic individual differences in response even at this early level of development (Murphy 1962). 
Individual differences in stress thresholds, modes of expressing distress, and secondary reactions to distress—including the number of organ systems involved and the intensity of their involvement—may be prognostic in infants of a predisposition to mental disorder in later childhood (Korner 1964). In children predisposed by stressful experiences and by personality structure, continued emotional conflict may lead to the chronic use of psychological defenses and maladaptive social behaviors known as symptoms of mental disorder. Particular constellations of such symptoms define the different types of mental disorder in children. In general, these clusters of clinical symptoms can be assigned to one of three general divisions: psychoneuroses, personality or character disorders, or psychoses. Some of the physical or psychological symptoms in each cluster may result from the child's attempt to maintain equilibrium or adaptation in the face
of current stressful events or to compensate for the disturbance, to make restitution, or to obtain gratification of physical, psychological, or interpersonal needs. Other symptoms in each clinical picture may define the new equilibrium resulting from the child's adaptive and compensating efforts. They may include partial restriction or malfunctioning of perceptual, cognitive, social, or physiological functions, rigid reliance upon defense mechanisms such as repression and denial, or a more severe breakdown in adaptive efforts. Anna Freud (1965), Erik Erikson (1950), and many others have emphasized that in addition to the physical and personality characteristics and the experiences of the child, it is important to consider both his developmental level and "lines" or continuities of development. Thus, the notion of a state of psychological equilibrium in the child is only a relative one, with such states stable and definable only at cross-sectional points in time during his growth. These "states" interact systematically, and lines of development such as that from "dependency to emotional self-reliance and adult object relationships" may be observed (A. Freud 1965). Interpersonal or psychological events may have stressful effects upon a relatively immature mental apparatus, whereas the same stimuli may be handled without disruptive anxiety by an older child or adult. In addition, the child may defend against anxiety resulting from such stimuli through regression to a more immature level of adaptation. Or, such stimuli may reinforce fixations of personality development which partially preclude further development in one or more areas of functioning (for example, learning). Thus the child's developmental level partially determines his capacities and modes of coping with potentially stressful stimuli of a physical, social, or psychological nature and partially determines the type and severity of mental disorder. 
For example, certain intrauterine viral infections during early pregnancy are more likely to produce congenital anomalies than such infections occurring later in gestation. In addition, a prolonged lack of adequate mothering is particularly damaging to infants in the second half of the first year. The research of Spitz (1946), Bowlby (1960a; 1960b), Provence and Lipton (1962), Piaget (Flavell 1963), Escalona and Heider (1959), and Heider (1960), as well as recent animal research by ethologists (Ostow 1959), suggests that serious and perhaps irreversible defects in social and intellectual development may follow if the child fails to receive sufficient interpersonal or perceptual stimulation during appropriate developmental phases, including those of early infancy. Such defects may predispose the child to one or another type of mental disorder, but knowledge of the specific nature of such relationships depends upon future research. [See INFANCY, article on THE EFFECTS OF EARLY EXPERIENCE.]
Freud's concept of emotional conflict as amplified by Anna Freud, Hartmann, Erikson, and others is central in contemporary theories of mental disorder in children, particularly with respect to the development of psychoneuroses. As the infant or preschool child develops relationships with people and as his mental functions become more complex, he may show a variety of reactive disturbances in behavior, such as temper tantrums, aggressive behavior, and crying. These disturbances are reactive to environmental stress, primarily conflicts with parents over the control (socialization) of basic sexual and aggressive drives, and are the early precursors of internalized emotional conflict. In the early years, these disturbances are generally transient and reversible in response to positive changes in the environment, although as a result of continued disturbances in parent-child relationships, they may become chronic and fixed. Reactive disorders are also seen in the older child; the symptoms may include anxiety, unrealistic fears, shyness, feelings of inadequacy or loneliness, disturbances in attentional processes and learning, and inappropriate social behavior. If the child's emotional conflicts are especially anxiety-arousing and unresolved, they later lose their conscious nature and are repressed and internalized. The conflicts are then inaccessible to further attempts at resolution, including new efforts made possible by the development of more powerful and complex cognitive and other ego functions: the capacities for greater independence in the gratification of needs, a longer attention span, improved reality testing, increased ability to think abstractly as well as concretely, and further differentiation and integration of perceptual and motor functions—together with further differentiation of the emotions, more effective repression of anxiety and other affects, and the internalization and symbolic representation of conflict. 
Because of their relative inaccessibility to conscious problem-solving efforts, emotional conflicts in the child may become self-perpetuating and lead to the establishment of chronically maladaptive behavior learned in early periods of development. Depending upon his experiences and his own developmental capacities, however, the child may instead resolve conflicts established earlier, or new conflicts facing him in a crisis situation, and thereby achieve a higher level of differentiation and structure of personality. If such resolution and mastery are prevented, temporary regression, long-term arrest in cognitive or emotional function and development, or decompensation or adaptive failure may occur. Thus, unconscious conflicts, with associated alternating experiences of anxiety and employment of maladaptive psychological defenses, may become firmly established components of the personality, leading to structural rigidity and brittleness and presenting the model of psychoneurosis in the child.
Psychoneurotic disorders. A variety of symptoms and patterns of symptoms may be observed in the psychoneurotic mental disorders of children. These symptomatic reactions to neurotic conflict may fluctuate and change with changes in development and socialization. The psychoneuroses are not characterized by extreme personality disorganization or grossly distorted reality testing. Although psychoneurotic symptoms of internalized emotional conflict may be observed in children as young as three or four years, fully structured neurotic disorders are not ordinarily seen in children until the early school-age period. The typical childhood neurosis develops in a youngster who has already evolved a conscience or superego, who has achieved an internalization of conflict and the use of a variety of defense mechanisms, including repression of affects from consciousness, and who manifests symptoms symbolically relevant to the underlying conflicts. Several relatively independent types of psychoneurotic disturbances in children have been defined, including anxiety, phobic, conversion, dissociative, obsessive-compulsive, and depressive disorders. Detailed descriptions of these disorders are given elsewhere [see ANXIETY; DEPRESSIVE DISORDERS;
OBSESSIVE-COMPULSIVE DISORDERS;
PHOBIAS; see also Committee on Child Psychiatry 1965]. The well-known cases of "Frankie" (Bornstein 1949) and "Little Hans" (Freud 1909) are illustrative of the phobic type of neurotic disorder in which the defense mechanism of displacement is prominent. Here the child unconsciously displaces the meaning or content of the underlying emotional conflict onto an object or situation in his environment which is symbolically relevant. The fears derived from the internalized conflict are experienced in a distorted and irrational manner. The child avoids stimuli which reactivate or intensify his displaced conflict, and he often projects his sexual, hostile, and other unacceptable feelings onto the external feared objects such as animals, dirt, elevators, or situations such as school. Mild
and transient fears, fearful reactions to stressful experiences (reactive disorders), and common developmental crises involving separation anxiety in children should be carefully distinguished from phobic psychoneurotic disorders, with their internalized and structured nature. [See DEFENSE MECHANISMS.]
Remission of symptomatic behaviors without treatment may be observed in some of the milder psychoneurotic disorders as the child masters the developmental tasks and crises of later stages (Nagera 1966). However, such disorders ordinarily require psychotherapeutic intervention with the child and his family. The prognosis for response to treatment is good. Treatment requires assessment of the balance of external and internal forces involved in the disorder (Haworth 1964). Depending upon the balance of such forces, therapeutic intervention may be made primarily through manipulation of the environment, or the therapist may focus upon achieving intrapsychic changes in the child and other family members, leading to the resolution of interpersonal conflict, the relieving of neurotic symptoms, the promoting of further development in the mental apparatus, and the learning of more adequate social responses and modes of coping with stressful stimuli (Kessler 1966).
Personality disorders. The personality disorders differ from psychoneurotic disorders in children in the following ways. In personality disorders, chronic (fixed) pathological trends and traits are prominent, and they are ego-syntonic; that is, they are not perceived by the child as anxiety-arousing or as a source of distress. Most of the personality disorders in children appear to involve strong fixations and/or disturbances in earlier psychological and psychosexual development, related to crises and conflicts involving wishes for dependency and autonomy, the handling of sexual and aggressive impulses and behaviors, and sex role identification. 
The types of personality disorder which have been defined (Committee on Child Psychiatry 1965) include the anxious personality, compulsive personality, hysterical personality, overly dependent personality, oppositional personality, overly inhibited personality, overly independent personality, isolated personality, distrustful personality, personality with discharge disorder (impulse-ridden or neurotic types), and sociosyntonic personality disorder. Discharge disorders. The discharge-disorder category includes many children who are classified in other systems as delinquent, acting-out, psychopathic, or sociopathic. Children in this general
category tend to act out directly their feelings and impulses in an antisocial and often highly destructive manner. Two subcategories have been defined which distinguish between an impulse-ridden group and a group in whose discharge disorder neurotic conflict plays an important role. There is a central tendency in both of these subgroups to discharge rather than delay or inhibit antisocial impulses, but the sources of the tendency to discharge differ. The child with an impulse-ridden personality shows low frustration tolerance and difficulty in controlling or channeling sexual and aggressive impulses; his interpersonal relationships tend to be shallow, he experiences little anxiety or guilt, and there is considerable deficiency in conscience development and in the development of flexible and complex defense mechanisms. He tends to have a history of extreme emotional deprivation. The neurotic personality disorder subgroup, on the other hand, has achieved a more complex level of personality development. The child in this subgroup has developed the capacity to internalize conflict, and his antisocial behavior tends to be reactive to such conflicts and to have unconscious symbolic significance in relation to them. These children experience some anxiety and guilt, and their interpersonal relationships, while ambivalent, are warmer and more meaningful than those of children with impulse-ridden personality [see DELINQUENCY, article on PSYCHOLOGICAL ASPECTS; PSYCHOPATHIC PERSONALITY].
Developmental deviations. It is important to distinguish the diagnoses of psychoneurosis and personality disorder from developmental deviations. Some behaviors (for example, sexual deviations), often part of a neurotic or personality disorder, may also be related primarily to delay, acceleration, or unevenness in development and should be classified as developmental deviations rather than as neuroses or personality disorders.
Psychotic disorders. 
Psychotic mental disorders in children are characterized by marked and pervasive deviations from mental functioning normal for the child's age. In general, these disorders usually include chronic and severe impairment or deterioration of emotional relationships, preoccupation with inanimate objects, failure to develop speech or loss of speech for purposes of communication, bizarre behavior and unusual motility patterns, extreme mood swings and intensity of affective experience and expression (as in sudden temper outbursts, panic, etc.), and failure to develop or loss of a sense of individual identity. There are generally severe disturbances in perceptual and cognitive development and functioning, but with
onset in later childhood certain areas of intellectual functioning and achievement may be adequate or better. Some of the symptoms seen in childhood psychosis—such as some of the disorders in thinking, affect, perception, motility, speech, object relations, and reality testing—represent efforts at restitution or compensation for the psychotic process. Autism and symbiotic psychosis. Childhood psychoses do not tend to crystallize into as many varieties or subtypes as is the case with adult psychoses (Kessler 1966). Two major subtypes have been defined in early childhood: infantile autism and symbiotic or interactional psychotic disorder. Age of onset in infantile autism is the first few months of life as the infant fails to develop a normal emotional attachment to a mother figure. He remains emotionally aloof, speech development is delayed or absent, feeding and sleeping problems and stereotyped motor and motility patterns are prominent, and the child responds to relatively slight changes in his environment with intense outbursts of anger or anxiety. Some intellectual functions are intact, but their use is impaired by the defective reality testing and lack of communication. Age of onset of symbiotic psychosis is after the first year or two of life. The child develops a normal emotional attachment to his mother but fails to achieve separation and individuation. Intense and prolonged dependency upon the mother (or between mother and child) is prominent in the early history. The disorder is usually precipitated by some real or fantasied threat to the mother-child relationship. Symptoms include marked and severe separation anxiety, clinging (sometimes indiscriminately), regression (for example, giving up speech, loss of bowel control), gradual withdrawal from object relations, autistic behavior, and distortions in perceptual and cognitive functioning. Childhood schizophrenia. 
Childhood schizophrenia or "schizophreniform" psychotic disorder occurs in middle childhood (ages 6 through 12 or 13). The disorder may be of gradual or relatively acute onset. Where onset of the disorder is gradual, the development of neurotic symptoms is followed by regression to use of the primitive defenses of marked denial and projection. Low frustration tolerance, hypochondriacal tendencies, and inappropriate outbursts of temper or panic are often observed, and these are followed by withdrawal, increasing involvement in private fantasy, emotional aloofness, disorders in thinking and perception, autistic behavior, and a breakdown in reality testing (e.g., Goldfarb 1961). The prognosis is more
favorable if the psychosis is an acute reaction to a developmental crisis. Few adultlike hallucinations are experienced by psychotic children until ages 9 or 10 at least. However, bizarre motor behavior (for example, whirling), self-mutilation and suicidal attempts, and inappropriate mood swings are seen with some frequency in these cases. Occasionally, some of the symptoms more characteristic of adult psychoses are seen in children. These include ideas of reference, somatic delusions, catatonic behavior, and paranoia. [See SCHIZOPHRENIA.]
Parent-child relations. In diagnosing neurosis, personality disorder, or psychosis in children, the healthy, positive responses and capacities as well as psychopathological trends should be assessed, along with the positive and negative physical, social, and psychological determinants of the child's behavior (A. Freud 1965). Current views of the diagnosis and treatment of mental disorders in children stress the importance of parent-child relationships, family processes, and sociocultural influences, while remaining cognizant of contributing and predisposing genetic and constitutional factors. Disturbances in parent-child relationships have been implicated in a variety of children's mental disorders, including unusual fluctuations in mood, psychoneuroses, certain psychotic disorders, and "antisocial" personality disorders, as well as disorders in which both physical and psychological functions are disturbed, such as marasmus or failure to thrive in infants, ulcerative colitis, asthma, and disturbances in perceptual, cognitive, and sensory-motor functions related to structural changes in the central nervous system. The connection between type of mental disorder and specific characteristics of parent-child relationships is complex and not clearly understood. Research on certain personality factors characteristic of parents of psychotic and neurotic children (such as that of Sarason et al. 
1960) shows that parents of children with high anxiety differ in certain personality traits from parents of children with low anxiety, and the "superego lacunae" shown by Johnson and Szurek (1952) in the parents of some kinds of antisocial children indicate the probability of some specificity in the relationship between parent-child interaction variables and type and severity of mental disorder.
In addition to the quality of the parent-child relationship, other family variables may predispose and contribute to the development of mental disorders in children (e.g., Ross 1964). For example, loss of a family member, lack of family cohesiveness, distorted and neurotic communication patterns, deviant role functions, conflicting value orientations, or poor integration with the community may serve as stressful stimuli which upset the functioning of the family and lead to a reactive disorder, developmental deviation, neurosis, personality disorder, or psychosis in the child.
One of the characteristics of the healthy family is the capacity to respond adaptively to crisis. Serious illness, economic losses, death of a parent, removal to a new community, and other such stressful events tend to be disruptive of family equilibrium and established modes of functioning and relating. Such disruptions may be temporary and through mastery lead to a higher level of functioning, or they may continue and result in family disintegration or pathology in one or more family members. There is some indication of a specific relationship between type of family disruption and type of mental disorder in the child. For example, certain types of delinquent acting-out tend to occur in families with little cohesiveness and faulty disciplinary practices (e.g., Bandura & Walters 1959; McCord et al. 1961). Certain patterns seem characteristic in families of children suffering from schizophrenia or certain types of autism (e.g., Lidz & Fleck 1960). However, as is the case with other variables which have been implicated to some degree in mental disorders of children, the establishment of clear-cut relations between family variables and individual child disorders depends upon further research.
Sociocultural variables have also been implicated in the etiology of children's mental disorders. Variations in child-rearing practices and attitudes have been noted in different ethnic and social-class groups and cultures, as have incidence and type of disorder and attitudes and responses to treatment (e.g., Whiting 1963; Clinard 1957). 
However, the social and psychological mechanisms mediating the relationship between sociocultural variables and mental disorder are still poorly understood. Potentially stressful events for families and individuals include movement from rural to urban areas; shifts in socioeconomic conditions and traditional customs, attitudes, and interpersonal functions; and the acculturation of primitive or previously relatively isolated cultures. Children who experience such events without adequate preparation are particularly vulnerable to mental disorder (Kessler 1966). The economic, religious, and educational status of the family also seems related to the manner in
which the other family members react to mental disorder in a child, just as stereotypes of these groups tend to affect the treatment plans and services offered by agencies and clinicians.
BRITTON K. RUEBUSH
[Directly related are the entries MENTAL RETARDATION; PSYCHIATRY, article on CHILD PSYCHIATRY. Other relevant material may be found in DEVELOPMENTAL PSYCHOLOGY; INFANCY.]
BIBLIOGRAPHY
AMERICAN PSYCHIATRIC ASSOCIATION 1964 Planning Psychiatric Services for Children in the Community Mental Health Program. Washington: The Association.
BANDURA, ALBERT; and WALTERS, RICHARD H. 1959 Adolescent Aggression. New York: Ronald Press.
BORNSTEIN, BERTA 1949 The Analysis of a Phobic Child: Some Problems of Theory and Technique in Child Analysis. Psychoanalytic Study of the Child 3/4:181-226.
BOWLBY, JOHN 1960a Separation Anxiety. International Journal of Psycho-analysis 41:89-113.
BOWLBY, JOHN 1960b Grief and Mourning in Infancy and Early Childhood. Psychoanalytic Study of the Child 15:9-52.
CHESS, STELLA 1959 An Introduction to Child Psychiatry. New York: Grune.
CLINARD, MARSHALL B. (1957) 1963 Sociology of Deviant Behavior. Rev. ed. New York: Holt.
COMMITTEE ON CHILD PSYCHIATRY, GROUP FOR THE ADVANCEMENT OF PSYCHIATRY 1965 A Proposed Classification of Psychological Disorders in Childhood. Unpublished manuscript.
CRAMER, JOSEPH B. 1959 Common Neuroses of Childhood. Volume 1, pages 797-815 in American Handbook of Psychiatry. Edited by Silvano Arieti. New York: Basic Books.
ERIKSON, ERIK H. (1950) 1964 Childhood and Society. 2d ed., rev. & enl. New York: Norton.
ESCALONA, SIBYLLE; and HEIDER, GRACE 1959 Prediction and Outcome. New York: Basic Books.
FLAVELL, JOHN H. 1963 The Developmental Psychology of Jean Piaget. Princeton, N.J.: Van Nostrand.
FREUD, ANNA 1965 Normality and Pathology in Childhood: Assessment of Development. New York: International Universities Press.
FREUD, SIGMUND (1909) 1955 Analysis of a Phobia in a Five-year-old Boy. Volume 10, pages 3-149 in Sigmund Freud, The Standard Edition of the Complete Psychological Works of Sigmund Freud. London: Hogarth; New York: Macmillan. → First published in German.
GOLDFARB, WILLIAM 1961 Childhood Schizophrenia. Cambridge, Mass.: Harvard Univ. Press.
HAWORTH, MARY R. (editor) 1964 Child Psychotherapy: Practice and Theory. New York: Basic Books.
HEIDER, GRACE 1960 Vulnerability in Infants. Menninger Clinic, Bulletin 24:104-114.
JOHNSON, ADELAIDE M.; and SZUREK, S. A. 1952 The Genesis of Antisocial Acting Out in Children and Adults. Psychoanalytic Quarterly 21:323-343.
KESSEN, WILLIAM 1965 The Child. New York: Wiley.
KESSLER, JANE W. 1966 Psychopathology of Childhood. Englewood Cliffs, N.J.: Prentice-Hall.
KORNER, ANNELIESE F. 1964 Some Hypotheses Regarding the Significance of Individual Differences at Birth for Later Development. Psychoanalytic Study of the Child 19:58-72.
LIDZ, THEODORE; and FLECK, STEPHEN 1960 Schizophrenia, Human Integration, and the Role of the Family. Pages 323-345 in Don Jackson (editor), The Etiology of Schizophrenia. New York: Basic Books.
McCORD, WILLIAM; McCORD, JOAN; and HOWARD, ALAN 1961 Familial Correlates of Aggression in Nondelinquent Male Children. Journal of Abnormal and Social Psychology 62:79-93.
MURPHY, LOIS 1962 The Widening World of Childhood: Paths Toward Mastery. New York: Basic Books.
NAGERA, H. 1966 Early Childhood Disturbances, the Infantile Neurosis, and the Adult Disturbances. New York: International Universities Press.
OSTOW, MORTIMER 1959 The Biological Basis of Human Behavior. Volume 1, pages 58-87 in Silvano Arieti (editor), American Handbook of Psychiatry. New York: Basic Books.
PROVENCE, SALLY; and LIPTON, ROSE 1962 Infants in Institutions. New York: International Universities Press.
ROSS, ALAN O. 1964 The Exceptional Child in the Family. New York: Grune.
SARASON, SEYMOUR et al. 1960 Anxiety in Elementary School Children. New York: Wiley.
SPITZ, RENE 1946 Anaclitic Depression. Psychoanalytic Study of the Child 2:313-342.
WHITING, BEATRICE B. (editor) 1963 Six Cultures: Studies of Child Rearing. New York: Wiley.
VI EXPERIMENTAL STUDY
MENTAL DISORDERS: Experimental Study
Mental disorders vastly extend the normal range and variety of human behavior available for psychological study. They present familiar patterns in exaggerated form or in unusual combinations, incompletely developed, or disorganized. Although these derangements are often accompanied by seemingly arbitrary and unique manifestations, there is enough regularity in them to allow for prediction from one occasion to another and for generalization within groups of persons. Experiments help to determine lawful relations in this area, as they do in other fields. Since mental derangements are often accompanied by considerable discomfort or distress, they call first for caretaking and administrative action which, if feasible, includes treatment and rehabilitation. In addition, mental disorders provide a source of information about the mechanisms of normal, as well as impaired, function. We are apt to take for granted our ability to execute the many intricate operations necessary for the coordinated and complete performance of even quite simple movements and mental processes. Only when these operations break down, become inefficient, or fail to develop do we recognize the contribution of one or another mechanism to normal functioning. Any systematic examination of mental disorders involves the identification of the mechanisms or processes that are damaged in function. From this follows the determination of their part, typically in interaction with other mechanisms or processes, in the behavior disorder or symptoms. Such an analysis, often performed implicitly and schematically, constitutes the diagnosis. When it is explicit and precise, when it allows for prediction (i.e., prognosis) and is followed up by observations on the patient's progress, it can contribute to knowledge of mental function in general, as well as of psychopathology in particular. Experimental techniques are better suited to the study of disorders in cognitive and motor function than to the study of disorders in emotions, but, with some ingenuity, these techniques have been made to serve in the exploration of affective and motivational anomalies as well. Experiments can serve to test a specific hypothesis or merely to verify the fact that some type of behavior does occur. More particularly, they are employed to determine the conditions under which such behavior occurs and the factors that influence its magnitude or frequency. Experimental techniques have been used not only to explore phenomena relevant to mental disorders but also to bring about such derangements (by drugs, fatigue, etc.) and to treat others that have emerged in the course of development or as a result of accident. Thus, experimentation is appropriate at every stage of the evolution and remission of mental disorders. Its special virtue lies in its capacity to refine clinical observations, to evaluate their accuracy and delineate the boundaries of their validity, and to sort out the several sources that contribute to the effects under investigation.
An experiment may be necessary, for example in diagnosis, to decide whether a patient's sensory or motor function has been completely lost or is available in extreme emergencies, as happens with hysterical conversion symptoms. Experimental procedures are also widely used to determine the extent of an incapacity. These may include procedures such as perimetry of visual fields and determination of dark adaptation, or performance tasks, such as memorizing pairs of nonsense words and sorting designs. Psychological tests used in diagnosis also have their origin in experimental procedure and retain some of its features—the objective tests more so than the projective. For example, certain aspects of the test situation are always standardized, e.g., the content and phrasing of the questions, the instructions given, the perceptual information presented, the method of recording the patient's responses, the format of the protocol taken for the record. In other respects clinical tests may not satisfy the requirements of control and reproducibility that distinguish the experimental from other methods of observation. Also, clinical test scores are evaluated against pre-established population norms, an advantage over the experimental method in determining the baseline and the extent of the deviation that may have significance for psychopathology. In standardized tests, however, a significant clue could be missed if it emerges, not against the background of average performance, but in the unique pattern of the particular patient's function and dysfunction. Here the flexibility of the experimental method, its resources of manipulation and control, and, above all, its rationale of specifying functions by operations offer the investigator a considerable gain. [See PERSONALITY MEASUREMENT; PROJECTIVE METHODS.] Although this article is concerned with the psychological study of mental disorders, it should be remembered that these disorders furnish subjects for experimental investigation by other disciplines, e.g., pathology, biochemistry, and neurophysiology. Even within psychology, the spectrum of experimental techniques and problem areas covers a wide range, from physiology to the social sciences. Experiments in autonomic and endocrine function, for instance, belong within its boundaries and have been of especial interest to students of unconscious psychic processes. At the other end, experiments in role playing, in group processes, and in the restructuring of social systems, such as hospital wards, have been undertaken jointly or alternately by psychologists and social scientists.
The middle ground is marked by experiments in psychophysical measurement and perceptual judgment, by the several conditioning techniques, and by the immense variety of performance tasks—from simple motor responses to the solution of syllogistic problems, from the retention of nonsense syllables over a few minutes to the enduring acquisition of a new skill. [See PSYCHIATRY, article on SOCIAL PSYCHIATRY.] As in other areas of research, in the study of mental disorders the experimental approach implies that as many as possible of the conditions that bear upon the subject under investigation are held constant, while the others are controlled by the design of the research. Only a few, and preferably only one, of the conditions is allowed to vary, and that one is varied systematically, so that its effect on the outcome can be evaluated, whenever possible in some quantified terms. The purpose of an experiment is not just to discover whether an
anticipated result can be discerned but also to give an estimate of its magnitude and of the probability that it could be reproduced under similar circumstances. In this methodological sense, "experiment" means something very different from the improvised trial or casual exploration to which the name is also applied. Ventures of that type are not unknown in the study of mental disorders, especially in the study of treatment and rehabilitation programs. They may be successful and even valuable in suggesting hypotheses or in demolishing an established misconception, but they are not experiments in the sense that their results can be attributed to a specific variable in the total situation. Certain guesses at the hidden meaning of a patient's obscure statement are experimental in this improvisational sense. Even though they may solve a riddle, there is no certainty that they hold the only solution or, indeed, that the riddle has been posed. Like other intuitive observations, such stabs at interpretation or intervention can supply hypotheses for subsequent testing. Historical aspect. The experimental method was introduced into the study of mental disorders soon after it appeared in general psychology. Kraepelin, the great system builder in clinical psychiatry, had worked with Wundt, and in due course he founded an experimental psychological laboratory at his research institute in Munich. Its program was outlined as far back as 1894, while Kraepelin was in Heidelberg, and appeared in print as the first in a series of reports under the title Psychologische Arbeiten (Kraepelin 1896). Periodic publications on research conducted by Kraepelin himself, and, under Kraepelin's editorship, on work done by his associates and students, followed during the first quarter of the twentieth century and constitute a record of eight volumes [see KRAEPELIN]. 
Kraepelin credited Gabriele Buccola, an Italian, with the first psychophysical experiments on patients with mental disorders and also noted the beginnings of such research in Russia and the United States. Another neuropsychiatrist whose prolific experimental activity started before the turn of the century was Paul Ranschburg, a Hungarian, most noted for his research in memory disturbances. By 1902 Ranschburg had established a permanent laboratory in Budapest, with government support, for the study of abnormal mental function. In America, Shepard Ivory Franz set up the first laboratory dedicated to the experimental study of behavior in mental patients, at the McLean Hospital in Belmont, Massachusetts, in 1904. The apparatus came from Leipzig, and Franz's first experiments were concerned with the association areas in the brain, aphasia, memory defects, and especially with the physiology of manic-depressive psychosis. Psychological research was typically introduced into mental hospitals via physiology, with experiments on fatigue, speed of reaction, and time judgment. Significantly, Franz's laboratory was established for research in pathological physiology. When in 1907 he transferred his activities to the Government Hospital for the Insane in Washington, D.C., F. L. Wells succeeded him at McLean, heading a laboratory in pathological psychology. Wells was indeed less interested than Franz in cerebral localization, and his work in clinical psychology is best remembered for experiments in reaction time, the design of psychometric tests, and the pioneer use of psychogalvanic measurement. Kraepelin's experimental program encompassed all the principal approaches to mental disorders. He studied normal function, in order to establish baselines, isolate distinct mental processes, and perfect techniques for their measurement. The same processes were also investigated in patients with various mental diseases and with mental disturbances brought on by drugs, fatigue, or sleep deprivation. The ultimate goal was to gain a clearer understanding of mental illness in order to determine its etiology and render its treatment more effective and more amenable to evaluation. Kraepelin's methodological goals have proved understandably uncongenial to the currently dominant school of psychiatry. One clue to his unpopularity is his definition of psychology as a branch of physiology, i.e., physiology of the mind. It is a definition that today few, even among his admirers, would accept, although the concepts underlying it inspired his research.
The definition implied that mental function can be measured; that deranged mental function differs from the normal in some quantifiable properties; that operations allowing for the exact assessment of the speed, the regularity, or the frequency of certain incidents in performance can serve as diagnostic and prognostic devices. This view entailed neither an atomistic model of personality nor a denial of psychogenic etiology. Kraepelin unequivocally declared that the physician concerned with mental disorders aims at forming a total picture of his patient and that the vast majority of these disorders originate from inside—genetic disposition being one of the prime determining factors. Experimental psychologists of the present day, while paying homage to his pioneer work, nonetheless adopt a critical attitude toward Kraepelin's contribution, for two reasons. One is the structuralist theory he inherited from Wundt, which has proved to be of limited value—especially Wundt's concept of apperception, which played an important part in the theory of Kraepelin's school. Second, Kraepelin himself and several of his associates—although not all, Ernst Grünthal being a notable exception—worked with extremely small samples of subjects. Typically, the investigator himself, not only dubiously representative of the general population but also far from naive in regard to the anticipated effects of the experimental treatment, was the only subject. Kraepelin and an associate's report of impaired performance attributed to a daily dose of alcohol no larger than two quarts of beer is the envy of later experimentalists who, like the author, have been unsuccessful in demonstrating impairment with much larger doses of alcohol [see DRINKING AND ALCOHOLISM, article on PSYCHOLOGICAL ASPECTS]. Theory and operationalism. While the program expounded by Kraepelin has been adapted by experimentalists to other psychological theories and to the requirements of representative sampling and stricter control, laboratory research on problems related to mental disorders has branched in other directions as well. Hughlings Jackson's hierarchic model of dissolution of function (1884) has stimulated innumerable experiments, especially with neurological patients, and Pavlov's theory of conditioning has been applied to mental disorders of every kind. Whether tied to a theoretical position or concerned only with the problem in hand, experimental studies of neurological patients have enriched, refined, and at times corrected our notions about sensory and motor function and higher mental processes. Since the purpose of most investigators has been to isolate the damaged function from those that remain intact or to arrive at a differential determination of deficit, the general approach to neurological disorders has been operational.
The experimenter defines the function that seems impaired, selects operations that involve the function, and sets the patient experimental tasks that test these operations. As a rule, the more closely defined these operations, the more informative they are to the investigator. This scale of value may, however, be reversed by his commitment to a theory couched in such global constructs that each and any derangement in function serves as an illustration of the general principle. A classical instance of that view, advanced by a clinician who was also a notable experimenter, is Goldstein's principle of abstraction. Goldstein conceived of abstraction as a composite function, central to which is the ability to categorize concrete instances. Its impairment is cited to account for disorders in speech and in thought, for the psychopathology associated with lesions in diverse areas of the brain, and for disorders that occur without known cerebral damage, such as schizophrenia. [See GOLDSTEIN.] Werner's developmental principle was another such general concept, which inspired experimental attacks on mental derangements as diverse as aphasia, the psychoses, and mental deficiency. Implicit in Werner's principle is the notion that seemingly instantaneous events in perception and thought in fact evolve by steps over a microscopic time scale. The evolution of these processes is usually too rapid to be open to inspection, but with appropriate experimental techniques it can be demonstrated in some instances of disturbed mental function. This was done, for example, by Klaus Conrad, a clinical psychiatrist and ingenious experimenter, who derived quite specific neuropsychological formulations from the broad and formal gestalt laws of integration and differentiation. In the aphasias, amnesias, and other cognitive derangements, experiments with the tachistoscope and memory drum (laboratory instruments for the regular, and very short, presentation of visual displays) merely extended the routine neurological examination and psychological testing. Students of the psychoses and psychoneuroses, on the other hand, could not always admit the relevance of experimental techniques. The psychoanalytic school, which in one version or another has undoubtedly been the most influential force in psychiatry and its ancillary professions in the United States, relies on analogy rather than on operational constructs. This applies even more emphatically to the existentialist and other metaphysical doctrines of psychiatry. In these theories explanation is more often by retrodiction than by prediction; the antecedent events are reconstructed without an exact weighting of the factors that contributed to the outcome or an analysis of their interaction.
[See PSYCHOLOGY, article on EXISTENTIAL PSYCHOLOGY.] Animal research. To be sure, certain key concepts in psychoanalytic theory, such as conflict, have been subjected to laboratory experiments. The effects of stress interviews, frustration, pain, and vexation have been studied with human subjects, but for obvious reasons the threat to health and human dignity had to be well below the level at which lasting mental disorders are likely to develop. Most of the laboratory studies, therefore, have been done with infrahuman species. A powerful stimulus to such studies was the almost undivided sway of learning theory in American experimental psychology following the behaviorist
revolution. If mental disorders, or so many of them, are not diseases but, rather, clusters of maladaptive habits and drives, persistently faulty perceptions or thoughts, and stunted social skills, then the laws of learning and conditioning may explain the emergence of symptoms and lead to successful methods of treatment. Attempts to reconcile the clinical formulations of Freud with an experimentally based academic theory appealed strongly to Hull and his school, whose learning theory shared with psychoanalysis a hedonic principle of motivation. It is, perhaps, disappointing that experiments concerning the development of defense mechanisms have been largely equivocal (see Sears 1943; Miller 1944), although this is hardly surprising in view of the fact that most of them employed rats for subjects. [See CONFLICT, article on PSYCHOLOGICAL ASPECTS; LEARNING THEORY; STRESS; and the biography of HULL.] While there is unquestionable elegance in the graphic representation of gradients and the equilibrium point between a rat's approach to food and avoidance of shock and there is quite live evidence of regression, fixation, and other maladaptive habits in the experimentally induced neuroses of cats, dogs, pigs, and sheep trapped in a physical or psychological harness (Liddell 1944), the leap from the laboratory to the human family setting is still a long one. It has been quite spectacularly shortened by Harlow's experiments with monkeys (1958; 1962). The affectionate display directed by infant monkeys toward their surrogate mothers—cloth dummies—was not only comparable with their responses to the natural mothers but also uncannily reminiscent of the behavior of human infants. Since these observations were made in a laboratory, several influences on the growth of affectional responses could be controlled and systematically varied. Furthermore, the situation seemed more likely to produce a lasting mental disturbance similar to those that afflict man than did the traditional procedure of inducing experimental neurosis in social isolation. As the monkeys grew up, some of them indeed developed marked personality disorders. Rather surprisingly, however, this outcome could not be traced to the artificial surroundings of the laboratory or to the substitution of a cloth dummy for the mother or, indeed, to her replacement by surrogates who rejected the clinging infant with a violent mechanical thrust or a concentrated blast of air. The infant experience that seemed chiefly to account for the adult monkey's psychopathology was deprivation of contact with other infants. [See AFFECTION; INFANCY.]
Mental disorders and brain function. The problems of generalizing from infrahuman animals to man will be briefly considered below. At this juncture another problem demands attention: the relationship of mental disorders to disturbed brain function. The conception that behavior disorders develop as a result of infantile deprivation or other stressful life experiences makes no explicit assumptions about alterations in the organism by which such effects could be mediated over time. It is, of course, tacitly assumed that learning, whether it results in greater efficiency or is maladaptive, involves some enduring organic changes, more particularly in the central nervous system. What these changes are is unknown, partly because of the inadequacy of current neurological techniques and partly because of the inadequacy of neuropsychological concepts. These two factors are as closely related as are technology and theory in other domains. After many false attributions and an occasional correct guess in past centuries, it is now pretty universally agreed that the organ of the mind is the brain. The traditional metaphysical controversies about the relationship of body and mind have not been so completely resolved. They may be dismissed as irrelevant to the problem of mental disorders, but there survives a prejudice from the days when body and mind were regarded as two distinct substances, coordinate but not coequal in value or power. In the chain of being, man occupied a position intermediate between that of inanimate matter and the soulless animals, with whom he shared the bodily attributes, and that of the several classes of spiritual beings, whom he resembled by virtue of his mental faculties. His mind received credit for very extensive powers over the body, while control in the reverse direction was thought to be more limited in scope and also to be a likely avenue to sin. 
Moral judgments have changed, but clinicians are still apt to regard brain damage as but one of several causes of mental disorder and quite readily attribute disturbances in visible and tangible behavior to psychic processes that, at least by implication, take place outside the bodily dimensions. The implication is made by the rough division between organic and functional disorders that has been established in the usage of psychiatrists and neurologists, as well as of clinical psychologists. "Organic" refers to brain damage, and that term is understood to stand for such structural impairment of brain tissue as can be observed directly or by means of currently available neurological techniques, e.g., electroencephalography, pneumography, angiography, etc. "Functional," by elimination, does not refer to organic or brain function. A classificatory principle as incompatible with present-day scientific notions as this is can be defended only on pragmatic grounds. It may, indeed, help with the prescription of treatment, but it introduces an unnecessary division into the research on mental disorders. Localization of brain function. On the one hand, investigations of patients with brain lesions tend to be focused exclusively on the topographical localization of the damaged function. Preoccupation with specialized centers or areas discourages the search for functional systems in the brain or, for that matter, in behavior and experience. On the other hand, psychological concepts have been derived without reference to systems and processes within the organism, and it is hardly astonishing that they do not always fit exactly the neurologist's model. Derangement of function, especially when it can be determined independently by neurological techniques and by experimental studies of behavior, furnishes the most revealing clues about mechanisms and systems common to neurology and psychology. The concepts and patterns that will be meaningful in both disciplines need not correspond to those presently current in either. For example, if future research confirms the author's experimental and clinical observations about the close association of an extremely severe memory disorder with a general lack of spontaneity in the patient's behavior—although no comparable deficit in reasoning or intelligence is found—these two seemingly unrelated psychological functions could be properly subsumed under a common concept.
Moreover, if evidence accumulates that this combined deficit occurs with lesions in a certain fairly well defined subcortical system of the brain, such a concept should be meaningful in the language of neural as well as of behavioral processes. Although the accomplishment of experimental studies aimed at this objective is but modest to date, the advance has been considerable (a good share of that achievement being the result of investigations of disordered mental function; see Boring 1929; Flugel 1933; Hebb 1949; 1958) since G. Spurzheim, who ranks next to Franz Joseph Gall as the founder of phrenology, drew his spuriously detailed map of some thirty-odd brain areas that subserved as many mental faculties, with mirrorlike duplication in the two hemispheres. More reliable evidence for the functional subdivision of the brain came first from Paul Broca's clinical
observations and shortly afterward from Luigi Rolando's experiments with electrical stimulation. Pierre Flourens, another pioneer in experimental research in this area, however, arrived at different conclusions in his studies using the extirpation technique. Phrenology survives today in the mosaic theory of the brain. An opposite point of view was forcefully advanced by Goldstein, whose concept of abstraction has already been mentioned, and, for a while, by Lashley, who derived the laws of mass action and equipotentiality from ablation experiments. These laws imply that the size, rather than the site, of brain damage determines the disturbance in behavior and that—outside the sensory and motor areas—any part of the cortex can potentially subserve any learned behavior. [See BROCA; FLOURENS; GALL; LASHLEY.] Laboratory studies of human patients have led to formulations intermediate between the two poles. Teuber, who conducted extensive and carefully designed experiments on a group of patients with combat injuries, the majority with gunshot wounds in the brain, demonstrated that these lesions result in both specific and general impairment of perceptual function (Teuber & Liebert 1958). Halstead reached a somewhat different, intermediate position, from studies of patients with lesions in the various lobes of the cortex (1947). Statistical analysis of his experimental results showed that no intellectual function is uniquely dependent on a single region of the brain but that these areas differ in their importance for one or another function. Experimental techniques have been extensively used to test hypotheses about cerebral dominance and the division of function between the hemispheres. In this context one again meets spokesmen of extreme as well as intermediate positions. At one pole is the opinion that man has two brains, that one hemisphere all but completely duplicates the function of the other. 
Indeed, experiments have repeatedly shown that the surgical removal of a diseased hemisphere or its decay through atrophy may cause little discernible loss of intellectual ability. Opponents of this view attribute a high degree of specialization to each half. Speech, for example, has been associated with the left hemisphere since Broca's time, and, correspondingly, control over manipulative skills has been attributed to the right hemisphere. The proposition that cerebral dominance is thus manifested and that it is reversed in left-handed persons has been tested by many experimenters. Zangwill's survey (1960) of these studies reached the conclusion that the speech area is not invariably contralateral to the dominant
hand. Nor is representation exactly identical in the two hemispheres when (as, for example, in sensory function) the rule of contralaterality obtains. Semmes and her associates (1960) have demonstrated this for somatosensory function, while experiments determining the two-point discrimination and pressure thresholds were used in Teuber's study of disabled veterans (Teuber & Liebert 1958). Pathological damage to the brain that corresponds exactly to a functional impairment is the exception and so far has not been found typical of mental disorders. Lesions in the visual or auditory cortex offer the closest examples. Students of mental illness have been especially interested in the frontal lobe, which seems to play an important part in such emotional disturbances as depression and distress over pain. A functional severance of the prefrontal cortex from the central nervous system has been successfully applied in treating these complaints. Experimental studies, however, have been inconclusive in assigning a particular function or ability to the frontal lobe. Advocates of the mosaic theory of brain function have found more encouragement in the temporal lobe. Penfield, the neurosurgeon, exposed the temporal cortex in several patients while removing a focus of epileptic discharge (1954). Applying electric stimulation to clearly marked points, he was able to elicit repeatedly—although not always—the same sensation or evoke recall of the identical episode from the past. These recollections appeared to be as sensorially sharp as the original experience, even though the patient was fully aware of his immediate surroundings in the operating room. 
At first it seemed that the site where the organism stores its discrete memories had been discovered in the temporal lobe, and although later interpretations have been more cautious, they do not diminish the significance of these experimental findings, especially for the demonstration that experiences and memories are classified in the nervous system according to abstract principles. This deeply hidden surface of the cortex and the adjacent subcortical regions are undoubtedly implicated in some of the processes of memory. Milner's reports with Penfield (Milner & Penfield 1955) and with Scoville (Scoville & Milner 1957), as well as other investigators' reports on patients who had lesions from surgery or disease in those areas, have noted grave defects in memory for recent events. Memory for remote events—like those which emerged under stimulation of the temporal cortex—is less consistently and less severely affected. Newer techniques in brain surgery, such as electrocoagulation, have widened the area of experimental investigation of neuropsychological processes. Experiments performed with chronically implanted electrodes allow for the simultaneous recording of neuroelectric activity at deep brain centers and of overt behavior, as well as for electric stimulation. The subject in such investigations, however, is always a sick person, whose function is often influenced by drugs as well as the disease. Alternatively, the subject of an investigation may be a healthy monkey or cat, animals whose brains as well as behavioral repertoire have a good deal in common with man, although not quite as much as would satisfy many students of mental disorders. [See NERVOUS SYSTEM, article on BRAIN STIMULATION.] Clinical research. There are well-established anatomical differences between the brain of man and that of even the highest subhuman primate; there is also reason to believe that some structures or neurophysiological systems are identical but serve different functions at the two different phylogenetic levels. Whether the accomplishments of the rat or the ape in learning and problem solving furnish an informative analogy for man's feats in forming unique memories and operating with symbols remains a debatable issue. Many experiments tracing the relationships between derangements in behavior and in the brain can be carried out only with human patients, because the psychological defect is entirely in the use of language. The variety of aphasias distinguished in clinical observation and confirmed by experimental tests are proof not so much of an ingenuity in classificatory exercises as of the scientific endeavor to expand our knowledge of significant lawful relationships. [See LANGUAGE, article on SPEECH PATHOLOGY; PERCEPTION, article on SPEECH PERCEPTION.] Kinsbourne and Warrington's investigation of six patients with a reading disability (1962) is an example of the contribution experimental procedures can make in a clinical situation.
These patients were right-handed, and they were known to have brain lesions in the right (i.e., minor) hemisphere. Paralexic errors, which their defects first seemed to exemplify, are regarded as aphasic in origin and are therefore attributed to damage in the dominant hemisphere. It appeared significant that the errors observed in these patients also differed from those commonly made in reading, in that they tended to occur with the first letters of a word. The investigators designed experiments to test the hypothesis that the reading errors arose from a perceptual derangement. By means of a
tachistoscope, they exposed to the patients brief glimpses of whole and fragmented words and geometrical figures. It became apparent that the patients had an abnormal field of perception, in which fine discriminations were restricted to the right side, although gross operations, such as judging the length of words, were unaffected. Unaware of their disability, the patients attempted to complete what they could read of a word, always in a leftward direction. Experimental checks ruled out the possibility that the disability arose from faulty fixation or defective eye movements, and the investigators therefore attributed it to an abnormal distribution of visual attention—to an unconscious neglect of space. They also recognized that their findings may have implications for reading disabilities that originate from a failure to execute the normal operation of forward completion. [See READING DISABILITIES; VISION, article on EYE MOVEMENTS.]
Multidisciplinary contributions. Experiments following the trail between mental disturbances and cerebral dysfunction, in either direction, have the advantage of a clearly defined objective. The pursuit itself, however, may become exceedingly complex and involve—over and above the neurological and psychological considerations—such diverse disciplines as biochemistry, genetics, social science, and history (the patient's life history, as well as the history of the disease process). These disciplines have indeed appeared more promising for research in schizophrenia than have the neural sciences. Experimental studies of schizophrenia in the biochemistry laboratory, although varied, allow for a broad, threefold classification.
Their goal is (1) to determine the chemical causes of the mental disorder; or (2) to assess metabolic, and particularly endocrine, function in patients, at different stages of their illness, under induced stress, special experimental conditions, or in the standard hospital setting; or (3) to form a part of a pharmacological treatment program. These experiments often include investigations of physiological and/or psychological function as well. Experimental studies of schizophrenia, manic-depressive psychosis, and the neuroses are typically confined to the verification of clinical observations. The major exception to this trend has been research stimulated by theories that do not distinguish between functional disturbances attributed to brain damage and those due to psychological factors. Followers of Hughlings Jackson and Head, of Pavlov and the gestalt school, of Goldstein, Werner, and Schilder, have exerted an influence in that
direction. More recently, models built around the activating properties of a subcortical neural system have served as a comprehensive conceptual framework for disordered mental function. Theories that transcend the boundaries of established brain damage and psychogenic dysfunction employ loose and not fully operational constructs. Goldstein's principle of abstraction is an example, and, indeed, several experimenters have produced partial evidence against it. Some of these support Cameron's proposition (1947) that distractibility is an important source of concrete thinking in schizophrenia and that in several instances of mental disorder abstraction is manifested but is masked by the use of inappropriate, overinclusive, bizarre concepts, especially when they concern the social context. Of course, the conclusions of these experiments are limited by the extent to which their operations represent abstractness and concreteness in thinking. Uncertainty and disagreement about adequate correspondence between behavior under laboratory control and hypothetical mechanisms or processes—in personality dynamics or neural function alike—present the most difficult problem to the experimental psychologist investigating mental disorders.
Laboratory techniques. Experimental reports about the slow responses, sluggish work rate, and straying attention of depressed or schizophrenic patients are unlikely to arouse much controversy. In these instances, behavior observed in the laboratory does not conflict with clinical impressions, although inferences drawn from the two sets of data to underlying mechanisms may clash, whether they concern hypothetical psychic mechanisms, such as defenses against impulses, or hypothetical neural mechanisms, such as cortical arousal. Experimental techniques have also been used as a subsidiary procedure to clinical interviewing, e.g., the measuring of motor and autonomic responses in tests of word association or perception.
If a patient seems to be unusually disturbed or ill at ease when discussing—or keeping silent about—certain topics, such suggestive impressions can be confirmed by the relatively exact measurement of his psychogalvanic response or hand movements. Evidence thus obtained may help in delineating and exploring his areas of conflict. This technique—popularly known as the lie detector—was systematically explored for the purpose of psychological research by Luriia in Moscow in the 1920s (1932). Other investigators, in the United States as well as in the Soviet Union, have further developed this technique, following clues from the patient's autonomic and skeletal behavior and measuring the latency, rhythm, and amplitude of his responses in structured interviews or during his performance of an experimental task. Laboratory techniques have also been used in the study of the regularity of grossly distorted perceptions in patients with mental disorders. Estimates of the patient's body or of its parts; of the size and distance of objects, especially when the judgment involves the perceptual constancies; and of his dependence on external cues for accurate assessment of the vertical dimension have added to our knowledge of the effects of mental disease or of particular diagnostic types, but have neither promised nor succeeded in getting at the roots of these disorders. Malmo and his associates (1951) have questioned the wisdom of resorting to tests of perception, and especially of concept formation, in order to establish characteristic differences between normal persons and neurotic or psychotic patients. In an experiment demanding difficult perceptual judgment under time pressure, it was demonstrated that groups representing these three classes barely differed in their accuracy. They did differ, though, in the regularity and duration of the motor response by which they indicated their judgment (this was simply pressing a button with the right thumb). They also differed in the frequency and magnitude of the synchronous response with the left hand and in the motor activity of their left hand between responses. The experimenters presented these results in support of the thesis that disproportionate motor disturbance is typical of patients with mental disorder under any stressful situation, and not only when the stress is specific to their emotional problem. This line of reasoning is very congenial to Eysenck (1947; 1961), who has undertaken the most ambitious and extensive experimental research in mental disorders, with the purpose of establishing a reliable psychiatric nosology.
Experimental studies have played an important part in this program and have contributed data to the factor analyses from which three personality dimensions were derived. One of these represents introversion-extroversion and accounts for certain individual differences in normal, healthy persons, as well as for the two types of disorders into which Eysenck groups the neuroses. The other dimensions represent the magnitude of a patient's neurotic and psychotic disturbance. The three dimensions are orthogonal to each other (i.e., independent), so that the extent of a patient's psychotic derangement is unrelated to the magnitude or type of his
neurotic abnormality. Most of Eysenck's earlier experiments demanded performance of some task, but more recently he has preferred laboratory methods that call for judgments on perceptual illusions or aftereffects. Some of these procedures, such as assessing the afterimage of a rotating spiral, have been widely used by clinical psychologists for diagnostic as well as research purposes. Induction and treatment. Experimental techniques have also been used for the induction and treatment of mental disorders. While the development of a lasting experimental neurosis in an animal may be a perfectly justified venture, with human subjects the derangement of mental function can be considered only if it is a reversible process of short duration. Various psychotomimetic drugs have been used for this purpose, producing effects that have been reported as pleasant by some, disagreeable by others, and weird by most persons undergoing the experience. Perception and thought processes are distorted in fairly predictable fashion, but it remains a matter of debate whether the abnormal effects thus induced are the same as, similar to, or different in character from those of, for example, schizophrenia. Sleep deprivation, extreme fatigue, anoxia, starvation, heat and cold, excessive sensory stimulation (e.g., continuous noise) and its opposite, sensory and social isolation, have been among the experimental devices used to induce transitory mental disorders. [See DRUGS; PERCEPTION, article on PERCEPTUAL DEPRIVATION.] The application of experimental techniques to the treatment of mental disorders has received its strongest impetus from theories of conditioning. The therapeutic goal is to retrain the patient by progressive weakening of the maladaptive habit or symptom or by reinforcement of an adaptive response. 
From isolated experiments with laboratory techniques to cure enuresis or hysterical tics or paralysis, or to reach autistic or mentally defective children, there has now developed a recognized practice of behavior therapy. Its methods—desensitization, satiation, counterconditioning, reciprocal inhibition, operant and avoidance conditioning—were formulated and first tested with animals. Now they are applied by clinicians, whose relationship to the patient may not be very different from the psychotherapist's. Also, like the latter, the behavior therapist can combine psychological treatment with pharmacological treatment. [See LEARNING, articles on CLASSICAL CONDITIONING, INSTRUMENTAL LEARNING, and AVOIDANCE LEARNING; LEARNING THEORY; MENTAL DISORDERS, TREATMENT OF, article on BEHAVIOR THERAPY.] Drug therapy and electroshock treatment have
many of the features of a research experiment. The agents administered to bring about certain effects, i.e., improvement in the patient's condition, are under control. The amount given, the duration of treatment, and the avenue by which the agent is administered can be varied, within limits, and the manifestation of side effects, as well as of the principal outcome, can be evaluated and related to the input variables. Opportunities to explore these relationships have been thoroughly exploited by experimenters. [See ELECTROCONVULSIVE SHOCK; MENTAL DISORDERS, TREATMENT OF, article on SOMATIC TREATMENT.]
The use of experiments in evaluating process and outcome in psychotherapy is far more limited. In individual and group therapy alike, too many of the relevant influences are outside the scope of controlled manipulation. Hypnotherapy offers more attractive possibilities; indeed, all research related to hypnosis seems to be relevant to an understanding of mental derangements and, in the light of Orne's findings, to an accurate assessment of experimental findings in psychology. Orne's experiments (1959) have shown that behavior under hypnosis depends very largely on the current notions about hypnotic effects and, also, that subjects volunteering for psychological experiments tend to have very definite ideas about how they are expected to behave in the laboratory, ideas which may exert a considerable influence on what they do or accomplish there. [See HYPNOSIS.]
Experiments in human behavior, whether they reflect normal or disordered mental function, pose certain problems that are of little or no concern to experimenters in other biological sciences. The patient or control subject does not ever merely react to stimuli. The best the experimenter can achieve is to elicit, with his instructions and setting, an unprejudiced cooperation and to rely on observations that allow the least possible latitude for subjective interpretation. In the course of his work he may make new and significant clinical observations or discover lawful relationships that explain some phenomena of the mental disorders. His special contribution, however, is the testing of such observations and the definition of clinical terms by operations that are reproducible and open to inspection by all who will take the trouble to look.
GEORGE TALLAND
[Directly related are the entries EXPERIMENTAL DESIGN; PSYCHOANALYSIS, article on EXPERIMENTAL STUDIES. Other relevant material may be found in
ANXIETY; DEPRESSIVE DISORDERS; DRUGS; FATIGUE; MENTAL DISORDERS, TREATMENT OF; NERVOUS SYSTEM; SCHIZOPHRENIA; SLEEP; STRESS; and in the biographies of GOLDSTEIN; KRAEPELIN; LASHLEY.]
BIBLIOGRAPHY
BORING, EDWIN G. (1929) 1950 A History of Experimental Psychology. 2d ed. New York: Appleton. → See especially pages 50-60, "Phrenology and the Mind-Body Problem," and pages 61-79, "Physiology of the Brain: 1800-1870."
CAMERON, NORMAN A. 1947 The Psychology of Behavior Disorders: A Biosocial Interpretation. Boston: Houghton Mifflin.
CONRAD, KLAUS 1954 New Problems of Aphasia. Brain 77:491-509.
CONRAD, KLAUS 1960 Die Gestaltanalyse in der psychiatrischen Forschung. Nervenarzt 31:267-273.
EYSENCK, HANS J. 1947 Dimensions of Personality. London: Routledge.
EYSENCK, HANS J. (editor) 1961 Handbook of Abnormal Psychology. New York: Basic Books.
FLUGEL, JOHN C. (1933) 1964 A Hundred Years of Psychology: 1833-1933. With an additional part, 1933-1963, by Donald J. West. New York: Basic Books.
GOLDSTEIN, KURT (1934) 1939 The Organism: A Holistic Approach to Biology Derived From Pathological Data in Man. New York: American Book. → First published as Der Aufbau des Organismus.
GOLDSTEIN, KURT 1942 Aftereffects of Brain Injuries in War, Their Evaluation and Treatment: The Application of Psychologic Methods in the Clinic. New York: Grune.
HALSTEAD, WARD C. 1947 Brain and Intelligence. Univ. of Chicago Press.
HARLOW, HARRY F. 1958 The Nature of Love. American Psychologist 13:673-685.
HARLOW, HARRY F. 1962 The Heterosexual Affectional System in Monkeys. American Psychologist 17:1-9.
HEAD, HENRY et al. 1920 Studies in Neurology. 2 vols. London: Hodder & Stoughton. → Consists mainly of papers published in Brain between 1905 and 1918. See especially Volume 2, pages 533-800, "The Brain."
HEBB, DONALD O. 1949 The Organization of Behavior: A Neuropsychological Theory. New York: Wiley.
HEBB, DONALD O. 1958 A Textbook of Psychology. Philadelphia & London: Saunders.
JACKSON, J. HUGHLINGS (1884) 1958 Evolution and Dissolution of the Nervous System. Volume 2, pages 45-75 in J. Hughlings Jackson, Selected Writings . . . . Edited by James Taylor. New York: Basic Books. → First published in 1884 in Lancet.
KING, HENRY E. 1954 Psychomotor Aspects of Mental Disease. Cambridge, Mass.: Harvard Univ. Press.
KINSBOURNE, M.; and WARRINGTON, ELIZABETH K. 1962 A Variety of Reading Disability Associated With Right Hemisphere Lesions. Journal of Neurology, Neurosurgery, and Psychiatry 25:339-344.
KRAEPELIN, E. 1896 Der psychologische Versuch in der Psychiatrie. Volume 1, pages 1-91 in E. Kraepelin, Psychologische Arbeiten. Leipzig: Engelmann.
KRECH, DAVID 1962 Cortical Localization of Function. Pages 31-72 in Leo Postman (editor), Psychology in the Making. New York: Knopf.
LASHLEY, KARL S. 1929 Brain Mechanisms and Intelligence: A Quantitative Study of Injuries to the Brain. Univ. of Chicago Press.
LIDDELL, H. S. 1944 Conditioned Reflex Method and Experimental Neurosis. Volume 1, pages 389-412 in Joseph McV. Hunt (editor), Personality and the Behavior Disorders: A Handbook Based on Experimental and Clinical Research. New York: Ronald.
LURIIA, ALEKSANDR R. 1932 The Nature of Human Conflicts; or, Emotion, Conflict and Will: An Objective Study of Disorganization and Control of Human Behaviour. New York: Liveright.
MAGOUN, HORACE W. (1958) 1963 The Waking Brain. 2d ed. Springfield, Ill.: Thomas.
MALMO, ROBERT B. et al. 1951 Motor Control in Psychiatric Patients Under Experimental Stress. Journal of Abnormal and Social Psychology 46:539-547.
MILLER, NEAL E. 1944 Experimental Studies of Conflict. Volume 1, pages 431-465 in Joseph McV. Hunt (editor), Personality and the Behavior Disorders: A Handbook Based on Experimental and Clinical Research. New York: Ronald.
MILNER, BRENDA; and PENFIELD, WILDER 1955 The Effect of Hippocampal Lesions on Recent Memory. American Neurological Association, Transactions 80:42-48.
ORNE, MARTIN T. 1959 The Nature of Hypnosis: Artifact and Essence. Journal of Abnormal and Social Psychology 58:277-299.
OSGOOD, CHARLES E. (1953) 1959 Method and Theory in Experimental Psychology. New York: Oxford Univ. Press.
PENFIELD, WILDER 1954 Studies of the Cerebral Cortex of Man: A Review and an Interpretation. Pages 284-309 in Council for International Organizations of Medical Sciences, Brain Mechanisms and Consciousness. Edited by J. F. Delafresnaye. Oxford: Blackwell.
RANSCHBURG, PAUL 1939 Les bases somatiques de la mémoire. Pages 513-531 in Centenaire de Th. Ribot: Jubilé de la psychologie scientifique française, 1839-1939. Agen (France): Imprimerie Moderne.
SCHILDER, PAUL 1942 Mind: Perception and Thought in Their Constructive Aspects. New York: Columbia Univ. Press.
SCOVILLE, WILLIAM B.; and MILNER, BRENDA 1957 Loss of Recent Memory After Bilateral Hippocampal Lesion. Journal of Neurology, Neurosurgery, and Psychiatry 20:11-21.
SEARS, ROBERT R. 1943 Survey of Objective Studies of Psychoanalytic Concepts. Bulletin No. 51. New York: Social Science Research Council.
SEMMES, JOSEPHINE et al. 1960 Somatosensory Changes After Penetrating Brain Wounds in Man. Cambridge, Mass.: Harvard Univ. Press.
TALLAND, GEORGE A. 1965 Deranged Memory. New York: Academic Press.
TEUBER, HANS L. 1964 The Riddle of Frontal-lobe Function in Man. Pages 410-444 in Symposium on the Frontal Granular Cortex and Behavior, Pennsylvania State University, 1962, The Frontal Granular Cortex and Behavior. Edited by J. M. Warren and K. Akert. New York: McGraw-Hill.
TEUBER, HANS L.; and LIEBERT, ROBERT S. 1958 Specific and General Effects of Brain Injury in Man. Archives of Neurology and Psychiatry 80:403-407.
WEISENBURG, THEODORE; and MCBRIDE, KATHARINE 1935 Aphasia: A Clinical and Psychological Study. New York: Commonwealth Fund.
WERNER, HEINZ (1926) 1957 Comparative Psychology of Mental Development. Rev. ed. New York: International Universities Press. → First published in German.
ZANGWILL, O. L. 1960 Cerebral Dominance and Its Relation to Psychological Function. Edinburgh: Oliver & Boyd.
MENTAL DISORDERS, TREATMENT OF
i. PSYCHOLOGICAL TREATMENT    Kenneth M. Colby
ii. CLIENT-CENTERED COUNSELING    John Butler
iii. GROUP PSYCHOTHERAPY    Jerome D. Frank
iv. BEHAVIOR THERAPY    Joseph Wolpe
v. SOMATIC TREATMENT    Heinz E. Lehmann
vi. THE THERAPEUTIC COMMUNITY    Robert N. Rapoport
PSYCHOLOGICAL TREATMENT
Within the context of this article, the term "psychological treatment" means psychotherapy and "mental disorder" means mental distress. Psychotherapy consists of a group of communicative methods for exchanging semantic information with the aim of relieving mental distress. Mental distress consists of behavior patterns subjectively experienced as painful and judged by subjective and objective observers to be inappropriate to a context. Although there are now several psychotherapeutic approaches in Western culture, only a few can be considered as seriously developed alternatives whose methods continue to be evaluated and improved through systematic study. These approaches can be subdivided into three schools—psychoanalytic-psychodynamic, learning theory, and client-centered. Although many similarities and differences can be found among these schools, depending upon how one compares them, there is agreement regarding a number of essential components in mental distress and its treatment by psychotherapy (Ford & Urban 1963). Modern psychotherapy was derived mainly from the efforts of Josef Breuer and Sigmund Freud, toward the end of the nineteenth century, to systematize a "talking cure" from the hypnotic techniques of the time. From this beginning several methods have evolved, none showing a clear-cut superiority over the others. They share many presuppositions regarding the nature of man and a delineation of individual psychotherapy as a private two-person relationship limited to talking and listening, the intent of which is to relieve the mental suffering of a patient in an enduring way. This private and intimate relationship, peculiar to Western man at this time, is notable as much for what it does not contain as for what it does. For example, it is a regularly repeated human communion that is unaccompanied by food and drink. Presuppositions are vaguely held and seldom
examined beliefs. Beliefs concerning the nature of man underlie the articulated suppositions of psychotherapy. This Menschanschauung, as it might be called, presupposes that man's suffering is an outcome of his experience, that mental suffering should be relieved, that man has some degree of freedom of choice and decision, that he can control himself to some extent, that he can be changed by experience, that one man can help another to change, and so on. A complete inventory of such presuppositions has never been attempted, and perhaps because of the tacit nature of such beliefs, no inventory could be complete. It is of obvious importance that these beliefs are held by both therapist and patient. The more clearly held beliefs of therapists make up the specifiable suppositions and assumptions of psychotherapy theory. A therapist operates with a theory of the pathology (Greek, pathos, suffering) of mental processes and a theory regarding techniques that can bring about beneficial change in them. Here the term "theory" refers to a rough framework of notions expressed in a language containing everyday and special terms. A therapist's theories do not represent formal systematized bodies of tested and established hypotheses, such as those found in some natural sciences. This is not so much due to the youth of the field as to its nature. Therapy is not a science but a practical healing art. Practical arts consist of techniques for achieving ends valued as good. Procedures and rules for achieving ends can be aided by basic scientific knowledge that increases our understanding of the subject matter or augments the power of techniques. The effective utilization of techniques remains in the hands of a skilled artisan whose work represents the conduct of an artistic rather than a scientific activity.
Theories of mental distress
A therapist's theories begin with conceptual notions about the subject matter to which his techniques will be applied. Mental suffering involves a set of conditions judged to be qualitatively or quantitatively inappropriate to a context. This judgment is made both by an internal observer, a patient, and by an external observer, a therapist, both of whom hold beliefs about ideal or desirable types of behavior for various contexts. The judgments that something is out of order are arrived at not by consulting experimental or statistical evidence, but by comparison of the patient's behavior with ideal types. This comparative method uses a concept of desirable behavior that represents an idealization,
a useful fiction, and not the extreme of an observable range. The chief empirical indications of mental distress, some or all of which are evident to both observers, are negative affects, thought distortions, and constrictions. Common negative affects, subjectively experienced and reportable as intense and not in keeping with an external situation, are anxiety, anger, depression, shame, and guilt. For example, a person may experience great anxiety in a classroom where there is no evident threat. Or he may become repeatedly enraged at frustrations that he judges to be trivial. Or he may enter a prolonged depression over the death of a loved one and even feel, inexplicably, guilt over the loss. Thought distortions have a great range of severity and variety of content. Common are beliefs that one is inferior, that one deserves admiration, that a disaster is about to occur, that one is being looked at or talked about, that people are dangerous, that one's body is defective, and that the opposite sex is hostile. These beliefs are often accompanied by the patient's own judgment that they are unwarranted or unjustified to this degree. Yet this judgment seems powerless to correct the thought distortion. Constrictions involve limitations of feelings or behavior required by and congruent with contexts. These limitations include avoidance of the opposite sex, sexual impotence or frigidity, inability to enjoy life, and an incapacity to experience either joy or grief. Such constrictions have far-reaching consequences and lead to repetitions of old patterns in novel situations requiring discriminations and new behaviors. There is great variation in how much distress an individual can stand. Most applicants for therapy have experienced enduring distress of more than mild severity which has not disappeared in the course of time. 
A patient seeks expert help for the negative affects, thought distortions, or constrictions that trouble him, and it is these phenomena from which a therapist attempts to release a patient by modifying the processes that generate them. For more than five thousand years there have been attempts to classify symptoms, descriptions, and behavior patterns into disease categories (Menninger et al. 1963). All these efforts have failed to produce reliable categories. The growing modern view is that we are not dealing with disease entities in the medical sense but with states of experiencing that require conceptualizations different from those found in traditional medicine. Today, various schools of psychotherapy have
reached moderate agreement on which elements are essential in descriptions of mental distress. Theories of the underlying pathological processes also agree insofar as they consider mental conflict, anxiety and other negative affect processes, and the ontogenesis of distress in parent-child relations to be crucial variables. The current uncertainty and disputes center on the problem of determining the best techniques for relieving distress and producing change. Although a crude theory of distress exists, we lack a theory both of mental change and of how change comes from external social influence. Hence there exists a profusion of techniques derived from clinical experience, but they lack a satisfactory theoretical underpinning.
Theories and techniques of therapy
Techniques of therapy are purely semantic, involving a communicative exchange of meaningful information. Treatment procedures are limited to conversations of various types, and there is a limit to what a therapist and patient can do with any purely semantic technique. These limitations are set by the nature of the therapist-patient relations, by what can take place in talking and listening, and by the topics chosen to be talked about. The relation between therapist and patient represents a working collaboration guided by a contract with stated and implied terms, usually involving payment of a fee for the therapist's services. Although the relation becomes intimate and emotionally arousing, it remains extremely one-sided, with the patient doing most of the talking. Disclosure is exchanged for confidentiality and neutral interest. The therapist's skills in listening and talking involve a general attitude of benevolent acceptance and specific acts of eliciting, focusing, clarifying, reflecting, and interpreting those relevant topics initiated by the patient.
Criteria of relevance vary somewhat among therapy schools, but again there are limitations imposed by the regularities of deep human concerns—e.g., relations to significant other persons and the self to the self—and by what can in fact be said about them by therapists. There exists a difficulty between and within schools of therapy in examining the facts of therapeutic conversation. When discussing therapeutic approaches, each school uses its own notions and language. But it is an open secret among cognoscenti that this talking about therapy is highly unrelated to the talking that takes place in therapy. Official discussions about therapy tend to call up school allegiances and personal commitments. With an increasing use of tape recordings, movies, and television, we are in a position to observe what
therapists actually do rather than relying on what they say they do. As already emphasized, theories regarding processes of mental change are not as developed as theories of mental distress. Every therapy school could use a theoretically justified set of principles for achieving change. The technical rules and principles currently used have come from long clinical experience and common-sense knowledge about human behavior. As an example of the latter, if you want a person to tell you about his inner painful thoughts, do not frighten him. This simple but effective principle is used by all schools. For its skillful applications, a therapist must be able to accept and control himself when stirred by feelings that can lead one person to attempt to frighten another. Most of the rules for conducting therapeutic conversations are of this simple type. Some schools try to justify their techniques by appealing to fanciful metatheories or to animal experiments having no relevance to human behavior. But thus far technical rules are entirely empirical and justified only by clinical experience. This does not mean they are all wrong or can be easily dismissed, but because they lack a theoretical basis, it is difficult to sort out which techniques are truly effective and which can be dispensed with. For the time being, then, techniques must be learned through the oral tradition of apprenticeships in which empirical knowledge is passed on from the more experienced to the less experienced in the course of studying representative examples of clinical problems. And like the skills of all practical arts, they are performed by some people better than by others. Until theories of change are worked out, a therapist must rely, in his practice, on simple rules and on a tacit knowledge that comes with clinical experience. The type of guides he needs most are decision rules that tell him what to say and when and how to say it in order to achieve his short-range and long-range goals.
It is important to distinguish techniques from goals. Much of the therapy literature is clear about goals (although what the language refers to may be obscured observationally), but it remains quite opaque as to how these ends are to be achieved technically. When one discusses the details of the therapist's utterances, all therapists say much the same things. And it is these utterances to which a patient responds, not the theory of the therapist's school. It is also noteworthy that statements about goals often refer to the way the patient should be rather than to what a therapist should do to help him become this way. It is easy enough to state a goal of therapy as
an enduring relief of the mental distress a patient suffers. With such relief comes a positive gain in the form of a new ability to enjoy experiences. All schools, in addition, seek goals or subgoals assumed to be necessary to achieve the over-all end of enduring relief. They are variously termed self-realization, personality growth, self-knowledge, full-functioning, enlightenment, etc. Here differing therapy approaches seem to be saying equivalent things about ideal types of mental functioning. Utopian or not, they play a part in the presuppositions and suppositions regarding therapy's ends. Statements about goals represent assertions of value systems striving to achieve the good (London 1964). A therapist's values regarding "the good" and "the right" in human conduct inevitably clash with other value viewpoints. Each therapist-patient pair must work out this difficult problem of values in a manner that provides some consonance with the values of the patient and the community in which he wants to live. It is this problem of values about what constitutes proper conduct, bound up with the ends of therapy, that contributes great difficulties in answering questions about the effectiveness of psychotherapy. How the techniques of psychotherapy actually work to affect a patient remains mysterious. Changes observable to both subjective and objective observers take place in the form of reduction of negative affect, correction of thought distortions, liberation from constrictions, or a combination of these. Processes that generate such changes are difficult to understand and facilitate in our present state of knowledge. For far-reaching and enduring mental changes to occur, we assume some sort of scrutiny and revision in belief systems must take place in which information is used to counteract and correct other information.
This process is facilitated by the haven and support provided by a benevolent therapist, who functions as an ideal friend or parent and who occasionally can offer alternative views to be tried out by the patient first in safe thought-experiments and later in actual behavior. This is about as much as we think we know of mental change, an area in which dependable knowledge is very difficult to obtain.
Trends
Historically, the treatment of mental distress has involved the use of physical methods, such as drugs and electric shock, and semantic methods, such as individual psychotherapy (Walker 1957). Hospital treatment has mainly utilized the former methods, whereas outpatient clinics and private practice have relied mainly on the latter.
No great change in the techniques of semantic methods has gained much acceptance in the past few years. In classical psychoanalytic treatment a couch, free association, and several visits per week for two to three years, or more, are still used. In less-intensive therapies a patient is seen face-to-face for several weeks or months. One trend in the field of psychotherapy concerns the matter of personnel and training. Since the early part of the twentieth century, the official (i.e., sanctioned by an accrediting organization) practice of psychotherapy has been by psychiatrists. But with the growing numbers of trained clinical psychologists and social workers, increasingly more therapeutic work is being carried on by individuals who do not have medical degrees. This trend is consistent with modern views regarding the nature of the "disease." If persons who seek therapy are not viewed as suffering from diseases in the medical sense, then it does not require medically trained personnel to deal with them. Medicine must either redefine "disease" to include human behavior patterns or concede some of its traditional territory to the nonmedical therapist. Since territorial dominance is what it is in male animals, the dispute will be a long one. This issue will become crucial as more and more people seek therapy. The growth of therapy is not so much associated with the prestige of science as it is with a growing change in attitude that facilitates criticism of socially acquired beliefs. Therapy provides a sanctioned way of participating in social criticism within the microcosm of the self, of repudiating acquired beliefs, and of liberating the self. With our increasing prosperity and with a rapid lessening in shame over seeking help, it is estimated that one out of seven persons will eventually apply for therapy. Medicine, psychology, and social work alone will not be able to meet the demand of millions of people.
It is obvious that other manpower resources must be trained (Schofield 1964). Whether there has been any progress in the field of therapy over the past several years is an open question, partly because "progress" is such an evaluative term. Through clinical experience therapists have slowly learned much about what is not considered progress, and this represents a gain of information. It is agreed that we need more powerful and efficient methods as well as more knowledge about the differential application of semantic techniques. Furthermore, because patients tend to select therapists from schools about which they feel more comfortable, the various therapy approaches may be dealing with different classes of
problems and patients, making outcome comparisons across schools worthless. When a practical art tries to improve its methods, it often turns to science for help. Psychotherapy, relying purely on semantic techniques, turns to the behavioral sciences of psychology, sociology, ethology, etc. As yet there has been no great help for the therapist from these areas, but the hope is that scientific research can contribute to a therapist's knowledge in order to make therapy more effective. Research A historical example of mutually benefiting relations between science and practical art can be found in Louis Pasteur's contribution to wine making. Although the process of fermentation was not well understood, wine had been made for thousands of years, and the results had been unpredictable. At the request of wine makers, Pasteur undertook a systematic study of the process of fermentation and discovered the role of bacteria. With this understanding it became possible to control fermentation by regulating the activity of bacteria. Nowadays the making of a great wine still requires intuitive art, but the making of a predictably sound wine is rather straightforward. Like a wine maker, a psychotherapist follows a set of rules for achieving his goal—the relief of mental distress. As mentioned, these rules come from a body of clinical knowledge accumulated through the empirical experience of thousands of practitioners over many years. Why does a therapist believe in these rules when so few (if any) of them rely on scientific knowledge? One must here consider the nature of scientific, clinical, and common-sense knowledge. Scientific knowledge consists of reliable data and tested and confirmed (i.e., not disproved) hypotheses. Depending heavily on measurement and replication, it is precise and highly plausible in the face of the evidence. But it is also limited, lacking in scope and full of errors as history has demonstrated. 
Clinical knowledge stems from the slow accumulation of data and rough conceptions deriving from the astute observations and powerful intuition of generations of practitioners. Consensus develops through trial and error, and clinicians gradually come to agreement about the suitability of a technique. Common-sense knowledge consists of everyday observations and inferences at a low degree of refinement; it is often fallible and dubitable, but since it is not entirely unevaluated knowledge, it is indispensable. Refined common-sense knowledge becomes scientific knowledge, which then becomes part of common sense
again. If we had no scientific or clinical knowledge, we would still be able to manage human affairs about as well as we do today, using only commonsense knowledge of human behavior. A person has powerful aids of introspection and empathy in thinking and feeling about the behavior of other persons. Scientific research in the problems of therapy should be able to cast light on some of the difficulties in the art. Ideally, one would like to have explanations of everything regarding mental distress and its relief. But this is not likely, nor is it even necessary for the art to improve. Not everything in therapy is a major problem. Only certain aspects merit a scientific study, and only certain questions deserve the labor required to attain a satisfactory answer. Is psychotherapy effective? A useful and apparently simple question to ask and answer would be, "Is psychotherapy effective?" This has turned out to be such a difficult question for research to answer that we now must consider the question unanswerable when posed in this form. Thousands of therapists by now have treated millions of patients. Some patients report they are better, a few that they are worse, and some say they are the same. Therapists believe they help a majority of their patients. Therapists continue to be trained and to practice, and patients continue to seek therapy. There seems to be no widespread doubt that therapy is helpful, or at least that it is in some cases. But there is no satisfactory statistical evidence as yet that therapy benefits a population of patients. Are all these people, therapists and patients, unwittingly deceiving themselves and one another? The issue is reduced to statistical evidence versus clinical knowledge with its elements of common sense. The failure of statistical evidence to demonstrate a phenomenon may reflect the weaknesses in our current tools of demonstration. 
Also, a failure to reject the null hypothesis (which is what statistics attempts) does not establish the null hypothesis. On the other hand, therapists should realize better than anyone the weaknesses and uncertainty of clinical and common-sense knowledge. The question of therapy effectiveness should be rephrased, because what the terms "therapy" and "effectiveness" refer to has never been operationally explicit. On the one hand, the term "therapy" does not refer to a homogeneous set of events. Unless the therapy can be observed by others, there is no guarantee that a therapist is doing what he should be doing and no estimate of how competently he is doing it. On the other hand, the term
"effectiveness" also initiates a snarl, because patients enter therapy with varying severity of mental distress and what is judged improvement for one patient may not be judged improvement for another. Furthermore, therapeutic goals contain values about desirable behavior, and unless judges share similar value systems, it is impossible for them to agree on whether the result of therapy was good or right. Certainly every therapist has had at least one experience in which the outcome was judged favorable by himself, other clinicians, the patient, and others who know the patient. Such an experience carries the high conviction that therapy can benefit individual cases, and if it can happen to one patient, it should be able to happen to others. But to how many others in a population and what population? And perhaps it would have happened anyway "spontaneously." There is often mention of spontaneous remission in the literature, but as yet no one has presented any evidence that such a phenomenon exists. Candid therapists admit they do not benefit all patients and wish that those who are helped could be helped more. The issue of effectiveness remains unsettled, but therapists are convinced that therapy has the potential to relieve mental distress. What is really needed is an improvement in methods to make therapy not only more powerful but more efficient. Resistance and transference. If research is to help a practical art, it should address itself to crucial difficulties in that art. A crucial difficulty in all therapy involves a process known as "resistance." This term was derived from nineteenth-century electrodynamics, whose terms Freud used metaphorically in conceptualizing mental processes in terms of a flow of current through a circuit. The term refers to those hindrances a patient presents to explorations, scrutiny, and change.
Clinical theory explains this phenomenon on the ground that a patient, although suffering distress, has achieved a mental state that is almost tolerable in many respects. Because the patient views any change in this state as a threat of even greater suffering, attempts to change are warded off and the state is defended for a long time. It is this fear of change and of being hurt that therapists believe to be a major factor in limiting the efficiency and effectiveness of therapy. Greater knowledge is needed about this process and its relation to "transference," i.e., the feelings and beliefs a patient develops about his therapist. Technical rules for dealing with transference and resistance may still be empirical, but they should be based on a greater
understanding of what we are dealing with. Animal ethology and experimental psychology have already begun to indicate much about social bonds and social influence, especially between adults and their offspring (Scott 1962). Most research in therapy thus far has concentrated on the therapy situation itself, studying it directly as it exists in nature or studying experimental analogues. Naturalistic attempts to find common denominators among therapeutic approaches have not led us very far, because the comparisons have been too superficial, and experimental attempts to duplicate the therapy situation have not brought about anything new. All this is ordinary research clearing up aspects of existing paradigms (Colby 1964). Sooner or later a new paradigm will appear, and extraordinary research will begin using surprisingly different presuppositions and suppositions. It is between the crevices of a Menschanschauung that new paradigms are discovered. One attempts to forecast the future by extrapolating present trends and by predicting those discoveries or inventions needed to fulfill human wishes. The main trend in the profession of psychotherapy presently concerns the development of a therapist who is not a medical practitioner. With the admission that current training systems cannot meet the increasing social demand, a new type of therapist will emerge trained in the best way that can be agreed upon by psychiatry, clinical psychology, and social work. All kinds of impediments will be raised by organization officials, but the need is clear and reasonable men will eventually yield to it. The second forecast involves discoveries and inventions needed by therapists who wish to improve their methods. The need for a theory of mental change has already been emphasized. This will be a fresh theory, not an amalgamation of current theories. For years there has been a demand for some sort of rapprochement between learning theory and psychoanalytic theory.
A satisfactory combination seems unlikely as long as learning theory does not concern itself with such higher mental processes as symbol manipulation or with the fact that people think, talk meaningfully, and have awareness. Also, unless psychoanalytic theory develops novel concepts, no further contributions can be expected from it. The sorts of discoveries needed are those that can be provided by basic behavioral science or by a genius in the field of clinical observations and inference. Psychotherapy, as we know it now, will change markedly if vigorously and boldly worked on. The inventions needed are recording apparatuses providing rapid information retrieval, voice-recognizing devices, automated analyses of natural language, and computerized training devices for the learning of therapy. There is also the interesting question of whether a future computer might do as well as, if not better than, a person in providing individualized therapeutic conversation for certain classes of problems (Colby et al. 1966). If a computer will be able to treat with semantic techniques thousands of patients an hour, this would be one answer to the problems of (a) the countable hundreds of thousands of hospitalized patients who never have an opportunity to talk with a therapist and (b) the uncounted millions of patients who could benefit prophylactically or remedially from therapeutic conversation.
KENNETH M. COLBY
[See also CLINICAL PSYCHOLOGY. Other relevant material may be found in ANXIETY; INTERVIEWING, article on THERAPEUTIC INTERVIEWING; PSYCHIATRY; PSYCHOANALYSIS, article on THERAPEUTIC METHODS; STRESS.]
BIBLIOGRAPHY
COLBY, KENNETH M. 1964 Psychotherapeutic Processes. Annual Review of Psychology 15:347-370.
COLBY, KENNETH M.; WATT, JAMES B.; and GILBERT, JOHN P. 1966 A Computer Method of Psychotherapy: Preliminary Communication. Journal of Nervous and Mental Disease 142:148-152.
FORD, DONALD H.; and URBAN, HUGH B. 1963 Systems of Psychotherapy: A Comparative Study. New York: Wiley.
LONDON, PERRY 1964 The Modes and Morals of Psychotherapy. New York: Holt.
MENNINGER, KARL; MAYMAN, MARTIN; and PRUYSER, PAUL 1963 The Vital Balance: The Life Processes in Mental Health and Illness. New York: Viking.
SCHOFIELD, WILLIAM 1964 Psychotherapy: The Purchase of Friendship. Englewood Cliffs, N.J.: Prentice-Hall.
SCOTT, J. P. 1962 Critical Periods in Behavioral Development. Science New Series 138:949-958.
WALKER, NIGEL (1957) 1963 A Short History of Psychotherapy in Theory and Practice. New York: Noonday Press.
II CLIENT-CENTERED COUNSELING
Client-centered counseling and psychotherapy as a distinctive point of view and as a radical departure from current practices can be dated rather precisely to December 1940, when Carl R. Rogers, its leading exponent, presented a paper at the University of Minnesota on the attitude and orientation of the counselor. The paper later became the second chapter of his controversial book, Counseling and Psychotherapy (1942). The controversy engendered by this book centered as much upon what the counselor or psychotherapist was not to do in the psychotherapeutic situation as upon what he was to do. According to Rogers, he was not to guide or to reassure or support; he was not to interpret and was not to use an entire armamentarium of what were labeled "directive" standard techniques. Psychotherapeutic interventions, particularly interpretive explanations, were categorized as dangerous. It was recommended instead that the therapist stress what were called nondirective techniques, responding directly to the present, expressed attitudes of the client (reflection of feeling), and that the therapist convey his unequivocal respect for and acceptance of the client as he presented himself in the immediate present. Rogers postulated that when the therapist demonstrates acceptance and permissiveness and shows understanding of the client's expressed attitudes and feelings, a process of personal change in the client would occur, in which the following stages could be observed: release of expression, achievement of insight, and development of capacities for making choices and of acting on the choices made. The main task of the therapist was to allow the stages to evolve, to facilitate a natural and inherent sequence, not to set the sequence into motion. Rogers also recognized that the therapist had his own propensities to become emotionally involved with his client in ways which resulted in directiveness and that, therefore, the therapist should work at circumscribing these propensities in himself.
He stressed the complete abdication of power in the therapeutic relationship, in contrast with current and standard techniques, in a manner which seemed to many to strike directly at the heart of the current practice of psychotherapy in medicine, in social work, in nonmedical settings, and in vocational psychology as well. Pained and angry responses from the ranks of these helping professions were immediate, intense, and long-sustained.
Personal change in psychotherapy
Counseling and Psychotherapy was almost entirely theory-free and empirical in tone, and intentionally so. In 1942, Rogers was a clinical professor formulating his clinical experience for the benefit of clinical students; like many clinicians he was somewhat scornful of current psychological theories, regarding them as sparse and simplistic compared with the richness and complexity of the
clients with whom he worked. The storm of controversy ensuing upon the publication of Counseling and Psychotherapy stimulated a flow of research and theoretical development by Rogers, his associates, and their students which has not yet abated. The development of theory and research in all areas until approximately 1956 was summarized and integrated by Rogers (1959, pp. 184-252). The approach to understanding personality, psychotherapy, and interpersonal relationships is entirely phenomenological. Technique is minimized and the necessary and sufficient conditions for inducing psychotherapeutic personality change are stated to be the following:
1. Client and therapist are in contact.
2. The client is in a state of incongruence: there is a discrepancy between his perceived self and his actual experience. He is vulnerable or anxious.
3. The therapist is congruent in the relationship with his client: his perceptions of this relationship are accurate symbolizations of the actual experience.
4. The therapist is experiencing unconditional favorable regard toward his client.
5. The therapist is experiencing an empathic understanding of the client's internal frame of reference.
6. The client perceives, at least to a minimal degree, the unconditional favorable regard of the therapist for him as well as the empathic understanding of the therapist.
It is noticeable that no techniques and no behavior prescriptions appear in this account. Everything is couched in terms of the experience of the client and of the therapist. Nonetheless, most responses of client-centered therapists continue to be reflections of feeling based on their perception of the internal frame of reference of the client.
The behavior of client-centered therapists follows from this premise: The probability that the client will perceive the therapist as prizing (positively valuing) and understanding him is maximized when the therapist manages to convey his prizing attitude of unconditional favorable regard and when the therapist communicates his empathic understanding to the client in a consistent way. Basic concepts. The theory of personality presented is also phenomenological and shows the influence of gestalt theory. The only motive postulated in the theoretical system is the actualizing tendency: the inherent tendency of the organism to develop all of its capacities in ways serving to maintain or enhance the organism. The actualizing tendency reflects in large part the tendency to develop autonomy and to lessen heteronomy, or
control by external forces. The actualizing tendency is a property of the total organism. The self concept is the consistent conceptual gestalt (organization) derived from the perceptions of the "I" or the "me" that are developed in interaction with significant others. The ideal self concept denotes the self concept to which the individual aspires. The self-actualizing tendency is a subsystem of the basic organismic actualizing tendency and is a consequence of the development of the self concept. Self-actualization is the actualization of that portion of the experience of the organism which is symbolized in the self concept. When self-experience and the remainder of the experience of the organism are congruent, then the actualizing tendency remains relatively unified. If self concept and experience are incongruent, then self-actualization and actualization tendencies are incongruent. In this case, the individual is maladjusted: his self concept reflects a conflict between self-actualizing motives and actualizing motives. [See SELF CONCEPT.]
The self concept does not direct the organism; indeed, the self concept derives from the actualizing tendency and is but one aspect of the tendency of the organism to react and behave so as to maintain and enhance itself. Motives or needs such as the need for favorable recognition from others and the need for self-esteem arise out of the organism's experiences in relation to interpersonal transactions and their vicissitudes. In a broad sense, "experience," in Rogers' view, is the organism's receiving the impact of sensory or physiological events happening at the moment; experience is what happens to the organism, including what happens within it. However, in a more restricted meaning "to experience" for Rogers also denotes the accurate symbolization in awareness of the sensory or physiological events. The theory presented by Rogers, although containing many propositions, is basically simple. It concerns the development and self-development of the organism, the accurate symbolization in awareness of experience, and the perception of threat, with consequent defenses and effects upon interpersonal behavior. The development of an accurate self concept is held to be a basic capacity of the organism. An inaccurately symbolized self concept emerges because the individual, in the course of development, begins to have a need for favorable regard from others and an analogous and consequent need for favorable self-regard but perceives himself as being only conditionally prized or loved by others. He incorporates this conditional prizing into his
self concept and subsequently evaluates experiences on the basis of conditional prizings instead of in terms of the basic actualizing tendency. Perception of unconditional prizing by others leads, on the other hand, to satisfaction of the needs for favorable regard and self-regard in a way that is congruent with the basic actualization tendency. Development under optimal conditions of unconditional prizing leads to a person who is fully functioning, open to experience, and psychologically adjusted. Some comments on the theory are in order. As stated before, it is relatively simple, having neither the comprehensiveness, say, of psychoanalytic theory nor the seemingly rigorous and elegant simplicity of behavior theory. Not too much is said about motivation, and the defense mechanisms, such as repression, denial, and reaction formation, are taken for granted: they have been discussed and investigated elsewhere. The theory was developed to account for what other theories neglected and to stress a view of human nature and experience not currently dominant in Western culture, namely, that the individual inherently actualizes and self-actualizes, is personal and subjective, is not at the mercy of individual drives, and has inherent capacities for realistic adaptation and unrestricted experiencing. He is often conditionally valued by significant others early in life and often responds by evaluating his self in terms of these conditional prizings. The individual has a history that results in various degrees of congruence between the actualization and the self-actualization tendencies: the lower degrees of congruence are maladjustive; the higher degrees approximate the fully functioning, fully experiencing individual, who evaluates autonomously and is personally creative. [See DEFENSE MECHANISMS.] Clinical and empirical foundations. Despite its phenomenological language, Rogers' theory has the virtue of being close to and being derived from clinical observation. 
When the client is in the psychotherapeutic situation, when he is unconditionally prized and is well understood by the therapist as he presents himself, he does change his self concepts, he does react more openly, he does abandon maladaptive strategies, maneuvers, and symptoms in relation to the therapist. He usually does not develop a "transference neurosis," and his expressive style changes. The organization and use of language changes, and the individual comes to act differently with others than before. Observations of such changes led to the theory. What is not observationally based is the theoretical prediction of the increase in congruence between self concepts and experience. What is observed is that self concepts change and that the individual expresses himself as being in some ways more like the person he wants to be. However, the congruence of self concept and ideal concept is suspect because it is observed that some individuals claim to have congruent selves and ideals when obviously such is not the case; i.e., interpersonal behavior is not consistent with the claim, and worse, other persons, such as paranoid individuals, seem to have congruent self concepts and ideal concepts but are clearly psychologically maladjusted; thus the recourse by Rogers to the discrepancy between self concept and experience and between self-actualization and the actualization tendency, even though such discrepancies are not actually observed in the psychotherapeutic interaction. In general, theoretical statements by writers within the client-centered orientation, with the exception perhaps of Raimy (1943) and Snygg and Combs (1949), have the same clinical-empirical and action-oriented flavor as those of Rogers. Complex behavior is considered. Little attempt is made to provide careful definitions in the sense in which terms such as stimulus, response, drive, and response generalization are carefully defined in, say, behavior theory because the primary referents can be pointed to in recordings, motion pictures, etc. This discriminability of primary referents is conceived to be an advantage. Considerable difficulty has been encountered by behavior theorists in defining such terms as stimulus, response, and response generalization unambiguously even for simple situations, and when they are applied as behavior therapy, the specification of reinforcing stimuli and of response generalization has been so vague that it seems safe to say that for some of the better-known studies it would be easy to obtain quite different results using the same reinforcing stimuli described by the investigators. The therapy process.
Rogers' specifications of the necessary and sufficient conditions of personality change in psychotherapy are to a large extent understood by client-centered therapists in terms of the predominant conduct of therapists: unconditional prizing behavior, mostly expressed nonverbally, and reflections of feeling as communications of manifest themes that occurred in the client's communications. The communicative behavior of clients includes the nonverbal, gestural, expressive components of communication which are linguistic in nature and which serve to modify the meaning of symbols and signs. What happens in the psychotherapeutic hour when the therapist conducts himself in the manner mentioned above? Numerous studies, most of
them cited by Rogers (1959), have been addressed to this question. While these studies are satisfactory in a certain sense, they do not really describe the events at all well. Hence, a naturalistic description will be attempted. The communicative behavior of the client usually, in contrast with psychotherapies in which the psychotherapist's responses are interventive, shows thematic unity. Thematic unity is also evident in the sequence of responses. Along with this, the voice qualities change, and language usage also changes in such a way that the client appears to be more expressive and integrated in his communicative behavior, to be using a richer and more figurative language. When the therapist accurately symbolizes the themes in a client response, he is in actuality amplifying and developing them through his own language and voice qualities. Often these responses of the therapist are met by the client with "Yes, yes," "That's it exactly," or "Exactly," spoken with considerable emphasis. The theme voiced by the therapist (but voiced by the client immediately before) is then elaborated and developed with considerably more differentiation in both language and voice, creating an impression of supple and spontaneous flexibility. This is true even for quite disturbed clients communicating initially in a passive way and with the passive voice, and using such phrases as "This comes to mind." As the therapist's language becomes richer and more apt, as he makes more use of his voice, as his expressive gestures become more explicit, the more thematic is the development, the more figurative is the speech, and the more closely knit, congruent, and spontaneous become the interchanges of the client and the therapist.
This behavioral process, so difficult to describe but so denotable in audio and photographic reproductions of psychotherapy interviews, at its best has the kind of literary quality one might ascribe to the recitations of the ancient Greek bards, which were worked out anew in each encounter with an audience. Published reports of client-centered therapy are, unfortunately, poor representations of the process described.
Later theoretical development
Theory development since 1956 has largely concentrated upon developing the phenomenological perspective (Shlien 1962), with much stress on the experiencing process in relation to personality change (Gendlin 1964). Gendlin (1962) has written a philosophical treatise on personal experiencing that stresses the relation of experiencing to the creation of meaning. This work developed out of his training as a philosopher and his extensive encounters with client-centered psychotherapy. In extending his concept of experiencing to the understanding of concepts and values (1963; 1964), psychotherapy (1961), and personality change (1964), Gendlin has concluded that changes may occur in psychotherapy even before concepts have been attained that accurately represent feelings referred to by the client. This occurs because the therapist's responses themselves may lead to symbolic completions, closures, and elaborated themata even when the client does not perceive the therapist as prizingly understanding. In emphasizing the influence of the therapist in promoting closure, Gendlin is in effect changing Rogers' conditions for personality change. Dissatisfied with the postulational character of the actualization tendency, Butler and Rice (1963) have proposed that adient motivation, the need for experience, is the primitive base for the actualization and self-actualization tendencies. On the basis of studies of preference for complexity or novelty, of stimulus deprivation and of neurophysiological processes, they propose that adience is rewarded by thinking processes as well as by environmental transactions. The self-actualizing, fully functioning person autonomously creates experience for himself, and he can autonomously reinforce and extinguish behavior (learn) without moving a muscle. [See STIMULATION DRIVES.] With respect to psychotherapy, Butler and Rice maintain that a stimulating, expressive communicative style on the part of the psychotherapist enriches the experience of the client, focuses associations by reflections of feeling, and leads to symbolic completions and thema development. A "difficult" client with a poor prognosis may lower the responsive participation of the therapist, thus creating an experientially impoverished environment matching his inner experiencing, with the consequence that enrichment of experience may not ensue.
Clients with poor prognosis are just those who, in the therapeutic interaction, are likely to create the conditions leading to lack of progress and no constructive personality change. Butler and Rice maintain that the therapist who can sustain a participative, stimulating, and responsive expressive style is likely to induce progress in therapy even when such progress seems improbable. Evidence supporting their hypotheses has been presented by Wagstaff (see Butler et al. 1963) and Rice (1965).
Research
Research on client-centered psychotherapy has been summarized or cited extensively in Rogers (1959; 1960), Seeman (1956; 1965), Butler
(1958), and Grummon (1965). Attention here will be centered on the proposition that the changes noted in psychotherapy are due to the psychotherapeutic encounter rather than to spontaneous remission or other extrapsychotherapeutic agents. All work cited has been discussed in Rogers (1959), except when cited by year. Early studies tended to be confined to content analyses of transcripts of therapy interviews. These studies showed that, to a considerable extent, therapists did, indeed, consistently employ the techniques they claimed to use (Porter, Snyder, Seeman, Strom). Furthermore, for clients it was demonstrated that there was a change in the proportion of responses indicating insight, self-exploration, and integration (Porter, Snyder, Curran, Stock, Hoffman); that there were decreasing proportions of distress and discomfort responses and increasing proportions of favorable responses to self (Raimy, Assum, & Levy; Kauffman & Raimy; Zimmerman); that decreasing self-exploration was exhibited when therapists complied with requests for guidance, information, and support (Bergman 1951); that there occurred an increasing acceptance of self and others (Sheerer); and that there was an increased correspondence between ideal self and self concept for cases evaluated as successful, whereas this did not occur in cases evaluated as unsuccessful (Aidman 1951; Bowman 1951). In later studies, control techniques and measuring devices suggested by theory and research results were employed. Self-ideal relations were found to increase during the course of psychotherapy for the client group as a whole, the increase being greater for the group of clients judged to be definitely improved in terms of both therapist ratings and Thematic Apperception Test (TAT) ratings (Butler & Haigh 1954). The therapist's judgments were made independently of the TAT ratings and of the tested self-ideal relations. 
There was a significant increase in the variability of the self-ideal correlations at the end of therapy and at the end of a follow-up period, indicating that self-acceptance was decreasing for some clients and increasing for others. The majority of the changes reflected increasing self-ideal correspondence, however. Butler showed that 11 of the clients serving as their own controls, to whom tests were administered 60 days prior to therapy, immediately before therapy, and 60 days or less after therapy began, changed their self-descriptions significantly more during the in-therapy period than during the no-therapy period (Butler 1964a). Another control feature is the rating of clients
on a scale measuring maturity of behavior (Rogers 1954). Friends of the clients who were not informed that the clients were in therapy and who knew nothing of the research, made these ratings. In general, mean ratings on these scores did not change significantly between pretherapy, therapy termination, and follow-up testing. However, when the clients were stratified on the basis of therapist rating on a nine-point scale of success, the mean increase on maturity scores from pretherapy to therapy termination was statistically significant for clients whose ratings were in the 7-9 range, while there was a statistically insignificant decrease for clients rated in the 1-5 range. For the period between pretherapy and follow-up testings there was a significant increase in average maturity ratings for clients in the 7-9 range and a significant decrease in average maturity ratings for clients in the 1-5 range. The findings are remarkable because the groups used were very small and the therapist rated clients solely on the basis of interview behavior, while the lay observers presumably did not know their friends were in psychotherapy and knew nothing of the research. Comparable ratings on normal controls showed no mean change in score between testing periods. In a later study of many of the same clients, Butler (1964b) related self-reports, ratings by independent lay observers, and ratings by therapists. His results indicate that the vantage points of clients, therapists, and observers provide similar information, although the judgments are made on different bases. While this particular group was small, cross validation with another group yielded the same conclusion about the effects of psychotherapy on self-description. Cartwright and Vogel (1960) reported on a group of clients for whom they individually matched periods of waiting for therapy with in-therapy testing points. The wait periods varied from 4 to 24 weeks.
They found statistically significant differences in the variability of self-descriptions indicative of adjustment (highly related to self-ideal correspondence) during the treatment period over and above those obtaining in the waiting period. Psychotherapy was held to account for the increase in the variability of self-description scores. In another study reported by Butler (1964a), the self-ideal correlations of clients with good and poor prognoses were compared with those of clients with good and poor prognoses who were not receiving psychotherapy. The treatment group received ten weeks or less of psychotherapy, whereas the control group received no psychotherapy for a ten-week period. Analysis of covariance of the self-ideal correlations revealed that those of the treatment group changed more than those of the no-treatment control group and that the majority of the changes were in the direction of increased correspondence of self concepts and ideal concepts. When clients are matched on prognosis and are randomly assigned, one can infer from these findings that self-acceptance does change in, and as a result of, client-centered psychotherapy. One can also infer that changes in self-acceptance are related in some way, not necessarily linearly, to changes in maturity of interpersonal behavior as seen by lay observers, to psychodynamic changes as reflected in indexes derived from projective tests, and to changes in personal integration and level of adjustment as perceived by therapists. No single study provides perfect control, but the progressive character of the results and the relationships of the measures lend considerable weight to the hypothesis that self-acceptance changes as a result of client-centered psychotherapy and that other changes, particularly maturity of interpersonal behavior, are associated with self-acceptance and the process of psychotherapy. A particularly interesting study was conducted by Bills (1950). After a 30-day control period in which none of 18 third-graders who were retarded readers received play therapy, eight received client-centered play therapy and ten received no treatment. An analysis of covariance showed a statistically significant difference in gain in reading score for the treated group compared with the untreated group. Bills's study bears on the question of what kinds of behavior are affected by psychotherapy, adding reading to the list of self-regarding behavior, interpersonal behavior, and projective behavior. [See READING DISABILITIES.]
Research in client-centered psychotherapy since Rogers completed his survey (1959) has centered largely upon the conditions of psychotherapy as provided by the psychotherapist and upon therapist characteristics. Wagstaff has found three factors of expressive style in client verbal behavior, two of which are related to various criteria of outcome of psychotherapy (Butler et al. 1962; 1963); and Rice, analyzing responses of the therapists of the clients studied by Wagstaff, has found three factors of therapist vocal and lexical style, two of which are also related to various outcome criteria (1965). These studies support the hypothesis of Butler and Rice, alluded to earlier, that clients with poor prognoses deleteriously affect the responsiveness of their therapists. Duncan (1965), studying a variety of discrete paralinguistic behaviors in both client and therapist, found significant relationships, on the one
hand, between one aspect of therapy "process" (patterns of voice quality) and the therapist's judgments of the process, and, on the other hand, between this process and client test performance both before therapy and after 20 interviews. Gaylin (1965) devised a Rorschach function score designed to measure psychological health. Obtaining this score for pretherapy and post-twentieth-interview Rorschachs, he found that those clients with high ratings of improvement by their therapists exhibited improved scores; those with low ratings, poorer scores. Gaylin's function score also correlated significantly with paralinguistic factors studied by Duncan. Truax and Carkhuff (1963), working with Rogers, have presented evidence to show that when therapists dealing with hospitalized patients provided high levels of warmth, empathy, and congruence, patients improved; when they did not, patients became worse. Client-centered psychotherapists hypothesize that the person is motivated largely by actualizing and self-actualizing tendencies which result in favorable personality change under proper interpersonal conditions initiated by the therapist. The results of studies of the psychotherapeutic situation and its effects strongly support these hypotheses. These studies also show that changes observed in psychotherapy are reflected in interpersonal relationships and in favorable and enduring changes in the structure of self concepts. In addition, the techniques and qualities of client-centered therapists significantly affect performances in other types of situations. Although there are a few studies suggesting that client-centered psychotherapy compares favorably with other approaches (e.g., Shlien et al. 1962), it would be premature to claim that client-centered psychotherapy is more efficacious than other psychotherapies. This is due in part to the lack of systematic research on personal change in psychotherapy.
Furthermore, different approaches to psychotherapeutic treatment, such as behavior therapy, apparently have goals somewhat different from those stated for client-centered psychotherapy. These circumstances render systematic comparisons difficult, if not impossible, at the present stage of development of research on psychotherapy. Currently, an opinion on the relative efficacy of various forms of psychotherapy must be regarded as just that and no more.
JOHN BUTLER
[Directly related are the entries CLINICAL PSYCHOLOGY; COUNSELING PSYCHOLOGY; IDENTITY, PSYCHOSOCIAL; SELF CONCEPT. Other relevant material may be found in GESTALT THEORY; PERSONALITY, article on PERSONALITY DEVELOPMENT; PERSONALITY: CONTEMPORARY VIEWPOINTS, article on A UNIQUE AND OPEN SYSTEM; PHENOMENOLOGY; PSYCHOLOGY, article on EXISTENTIAL PSYCHOLOGY; SYMPATHY AND EMPATHY; THINKING, article on COGNITIVE ORGANIZATION AND PROCESSES.]
BIBLIOGRAPHY
AIDMAN, TED 1951 An Objective Study of the Changing Relationship Between the Present Self and Wanted Self-picture as Expressed by the Client in Client-centered Therapy. Ph.D. dissertation, Univ. of Chicago.
AXLINE, VIRGINIA M. 1947 Play Therapy: The Inner Dynamics of Childhood. Boston: Houghton Mifflin.
BARRINGTON, BYRON 1961 Prediction From Counselor Behavior of Client Perception and of Case Outcome. Journal of Counseling Psychology 8:37-42.
BERGMAN, DANIEL V. 1951 Counseling Method and Client Responses. Journal of Consulting Psychology 15:216-224.
BILLS, ROBERT E. 1950 Non-directive Play Therapy With Retarded Readers. Journal of Consulting Psychology 14:140-149.
BOWMAN, PAUL H. 1951 A Study of the Consistency of Current, Wish and Proper Self-concepts as a Measure of Therapeutic Progress. Ph.D. dissertation, Univ. of Chicago.
BUTLER, JOHN M. 1952 The Interaction of Client and Therapist. Journal of Abnormal and Social Psychology 47:366-378.
BUTLER, JOHN M. 1958 Client-centered Counseling and Psychotherapy. Volume 3, pages 93-106 in Progress in Clinical Psychology. Edited by Daniel Brower and Lawrence E. Abt. New York: Grune.
BUTLER, JOHN M. 1964a Self-acceptance as a Measure of Outcome of Psychotherapy. Unpublished manuscript. -> Paper delivered at the First International Congress of Social Psychiatry.
BUTLER, JOHN M. 1964b Self Concept Change in Psychotherapy. Acta psychologica 23:119 only. -> Volume 23 contains the Proceedings of the Seventeenth International Congress of Psychology held in Washington in 1963.
BUTLER, JOHN M.; and HAIGH, GERARD V. 1954 Changes in the Relation Between Self-concepts and Ideal Concepts Consequent Upon Client-centered Counseling. Pages 55-75 in Carl R. Rogers and Rosalind F. Dymond (editors), Psychotherapy and Personality Change: Co-ordinated Research Studies in the Client-centered Approach. Univ. of Chicago Press.
BUTLER, JOHN M.; and RICE, LAURA N. 1963 Adience, Self-actualization and Drive Theory. Pages 79-110 in Joseph M. Wepman and Ralph W. Heine (editors), Concepts of Personality. Chicago: Aldine.
BUTLER, JOHN M.; RICE, LAURA N.; and WAGSTAFF, ALICE K. 1962 On the Naturalistic Definition of Variables: An Analogue of Clinical Analysis. Volume 2, pages 178-205 in Conference on Research in Psychotherapy, Research in Psychotherapy. Edited by Lester Luborsky and Hans Strupp. Washington: American Psychological Association.
BUTLER, JOHN M.; RICE, LAURA N.; and WAGSTAFF, ALICE K. 1963 Quantitative Naturalistic Research: An Introduction to Naturalistic Observation and Investigation. Englewood Cliffs, N.J.: Prentice-Hall.
CARTWRIGHT, DESMOND 1957 Annotated Bibliography of Research and Theory Construction in Client-centered Therapy. Journal of Counseling Psychology 4:82-100.
CARTWRIGHT, ROSALIND D.; and VOGEL, JOHN 1960 A Comparison of Changes in Psychoneurotic Patients During Matched Periods of Therapy and No Therapy. Journal of Consulting Psychology 24:121-127.
DUNCAN, STARKEY D. JR. 1965 Paralinguistic Behaviors in Client-Therapist Communication in Psychotherapy. Ph.D. dissertation, Univ. of Chicago.
GAYLIN, N. L. 1965 Psychotherapy and Psychological Health: A Rorschach Structure and Function Analysis. Ph.D. dissertation, Univ. of Chicago.
GENDLIN, EUGENE 1961 Experiencing: A Variable in the Process of Therapeutic Change. American Journal of Psychotherapy 15:233-245.
GENDLIN, EUGENE 1962 Experiencing and the Creation of Meaning: A Philosophical and Psychological Approach to the Subjective. New York: Free Press.
GENDLIN, EUGENE 1963 Experiencing and the Nature of Concepts. Christian Scholar 46:245-255.
GENDLIN, EUGENE 1964 A Theory of Personality Change. Pages 100-148 in Symposium on Personality Change, University of Texas, Personality Change. Edited by Philip Worchel and Donn Byrne. New York: Wiley.
GENDLIN, EUGENE 1965 Values and the Process of Experiencing. Unpublished manuscript.
GRUMMON, DONALD L. 1965 Client-centered Therapy. Pages 30-90 in Buford Stefflre (editor), Theories of Counseling. New York: McGraw-Hill.
RAIMY, VICTOR C. 1943 The Self-concept as a Factor in Counseling and Personality Organization. Ph.D. dissertation, Ohio State Univ.
RICE, LAURA N. 1965 Therapist's Style of Participation and Case Outcome. Journal of Consulting Psychology 29:155-160.
ROGERS, CARL R. 1942 Counseling and Psychotherapy: Newer Concepts in Practice. Boston: Houghton Mifflin. -> See especially pages 19-47, "Old and New Viewpoints in Counseling and Psychotherapy."
ROGERS, CARL R. 1954 Changes in the Maturity of Behavior as Related to Therapy. Pages 215-237 in Carl R. Rogers and Rosalind F. Dymond (editors), Psychotherapy and Personality Change: Co-ordinated Research Studies in the Client-centered Approach. Univ. of Chicago Press.
ROGERS, CARL R. 1959 A Theory of Therapy, Personality, and Interpersonal Relationships, as Developed in the Client-centered Framework. Volume 3, pages 184-256 in Sigmund Koch (editor), Psychology: A Study of a Science. New York: McGraw-Hill.
ROGERS, CARL R. 1960 Significant Trends in the Client-centered Orientation. Volume 4, pages 85-99 in Progress in Clinical Psychology. Edited by Lawrence E. Abt and Bernard F. Riess. New York: Grune.
ROGERS, CARL R. 1961a On Becoming a Person: A Therapist's View of Psychotherapy. Boston: Houghton Mifflin.
ROGERS, CARL R. 1961b A Theory of Psychotherapy With Schizophrenics and a Proposal for Its Empirical Investigation. Pages 3-19 in J. G. Dawson, H. K. Stone, and N. P. Dellis (editors), Psychotherapy With Schizophrenics: A Reappraisal. Baton Rouge: Louisiana State Univ. Press.
ROGERS, CARL R.; and DYMOND, ROSALIND F. (editors) 1954 Psychotherapy and Personality Change: Co-ordinated Research Studies in the Client-centered Approach. Univ. of Chicago Press.
ROGERS, CARL R.; and KINGET, G. MARIAN 1960 Psychotherapie en menselijke verhoudingen: Theorie en praktijk van de non-directieve therapie. Utrecht (Netherlands): Spectrum. -> A French translation was published in Louvain by Presses Universitaires de France in 1962.
SEEMAN, JULIUS 1956 Client-centered Therapy. Volume 2, pages 98-113 in Progress in Clinical Psychology. Edited by Daniel Brower and Lawrence E. Abt. New York: Grune.
SEEMAN, JULIUS 1965 Perspectives in Client-centered Therapy. Pages 1215-1229 in Benjamin B. Wolman (editor), Handbook of Clinical Psychology. New York: McGraw-Hill.
SHLIEN, JOHN M. 1961 A Client-centered Approach to Schizophrenia: First Approximation. Pages 285-317 in Arthur Burton (editor), Psychotherapy of the Psychoses. New York: Basic Books.
SHLIEN, JOHN M. 1962 Toward What Level of Abstraction in Criteria? Pages 142-154 in Conference in Research in Psychotherapy 1961, Research in Psychotherapy. Washington: American Psychological Association.
SHLIEN, JOHN M.; MOSAK, HAROLD H.; and DREIKURS, RUDOLF 1962 Effect of Time-limits: A Comparison of Two Psychotherapies. Journal of Counseling Psychology 9:31-34.
SNYGG, DONALD; and COMBS, ARTHUR W. 1949 Individual Behavior: A New Frame of Reference for Psychology. New York: Harper. -> A revised edition was published in 1959.
TRUAX, CHARLES B.; and CARKHUFF, ROBERT R. 1963 For Better or Worse: The Process of Psychotherapeutic Personality Change. Pages 118-157 in Academic Society on Clinical Psychology, Montreal, 1963, Recent Advances in the Study of Behaviour Change: Proceedings of the Academic Assembly on Clinical Psychology. . . . Montreal: McGill Univ. Press.
III GROUP PSYCHOTHERAPY
Group psychotherapies are based on the recognition that, with proper guidance, certain types of persons with psychiatric disorders can help each other. In all forms of group therapy, patients and a therapist repeatedly meet to conduct certain activities within the framework of a special group structure and code. Their emotionally charged interactions with the leader and with each other may help to correct their faulty communication behavior and their distorted perceptions of themselves and others, leading to improved social and personal functioning and to relief of psychic distress. Group healing methods are as old as individual ones. From earliest times, sufferers have sought relief through group activities at religious shrines, and many continue to do so. Group therapies began to emerge as recognized and legitimate forms of psychotherapy, however, only in the 1920s.
Many early practitioners exploited the instructional and inspirational potentialities of groups in a purely empirical way; but two pioneers, Trigant Burrow and J. L. Moreno, offered theoretical rationales that, although not in the mainstream of psychiatric thought, had considerable influence. According to Burrow (Riese & Syz 1963), mental disorder was a disturbance in communication, created largely by a person's "privately cherished and secretly guarded" image of himself; the aim of group therapy was to enable him to express himself as he really was by exposing the socially determined basis of his self-image. Moreno (1959) stressed the freeing of spontaneity through encouraging the patient to act out his problems, with the aid of other patients as well as of therapists, in the presence of a vicariously participating audience. In the 1930s psychoanalysts began to experiment with group therapy based on psychoanalytic theory. During World War II, psychotherapists in the armed forces were forced to resort to group methods to handle the enormous load of patients. These methods proved so successful that they spread with almost explosive rapidity. Many modifications were introduced and applied to an ever increasing variety of psychiatric conditions in many different settings. By the 1950s, group therapy in the United States had assumed the dimensions of a movement and had two professional associations, each with a journal devoted to promulgating it. The wide popularity of group therapies may be partly due to the fact that they offer a type of intimacy characteristic of the family and other primary groups. The urbanization and mobility of modern life have reduced opportunities for such relationships, and the shallow, transient, competitive sociability of residential development, office, and club is not an adequate substitute.
Characteristics of group therapy
Therapeutic groups are conducted in outpatient clinics, private offices, social agencies, mental hospitals, and correctional institutions. Leaders are characteristically psychiatrists, psychologists, psychiatric social workers, or ministers. Some groups are conducted by their own members, without professional guidance. Most forms have a single leader, often with an observer to record what occurs; but some have cotherapists, usually a man and a woman, who try to take different functional roles, such as "father" and "mother." Composition of therapy groups. Most therapy groups consist of from 7 to 25 strangers selected according to a principle such as age, institutional
residence, or diagnostic category. Examples are groups composed of children, adolescents, mature adults, or the aged; of alcoholics, psychotics, or neurotics; or of persons whose only common feature is residence in the same mental hospital or correctional institution. Increasing efforts are being made to group patients within these broad categories in such a way as to maximize their communication potential. It has been noted that groups tend to elicit certain group roles in predisposed members. For example, one repeatedly finds monopolists, nonparticipants, therapist's assistants, members who try to hold the stage by constantly complaining, and others who try to dominate by moralizing (Rosenthal et al. 1954). This raises the possibility of balancing groups by selecting prospective members with regard to their predilections for different group roles. Observation of patients' actual group behavior seems to be a more reliable way of determining this than individual interviews and psychological tests. To this end, assignment of patients to therapeutic groups may be based on their behavior in a diagnostic group, to which all patients are briefly assigned, where this is administratively possible. A recent trend toward treatment of family groups as a unit is based on the view that the member officially labeled the patient is in reality the victim of a disturbed communication network in which other family members are also involved (Bell 1961; Satir 1964). This approach seems especially promising when the patient is chronologically or psychologically immature, as in the case of a child, an adolescent, or a schizophrenic. Therapists meet privately with patients before the first group meeting to determine their suitability for inclusion and to prepare them for the group; they meet again, later, to evaluate the patients' readiness for discharge. 
The extent of private patient-therapist contacts at these and other times varies widely, depending on the therapist's conceptualization of treatment; but it is generally agreed that such meetings must be limited if they are not to drain important material from the group sessions. This limitation also holds for meetings of patients between formal sessions, since such informal meetings create opportunities for antitherapeutic as well as therapeutic encounters. With family groups and married couples, such meetings are of course unavoidable; and they seldom can be completely prevented in groups of strangers. Extragroup meetings foster growth of group cohesiveness and give members opportunities to interact away from the inhibiting presence of the therapist,
which may be advantageous. On the other hand, by diminishing the members' "social incognito," they may inhibit candid expression of feeling in the group and may foster "acting-out" of personal problems through, for example, exploitative or anxiety-relieving sexual behavior, thereby removing the problems from the helpful scrutiny of the group. Some therapists deal with this problem by prescribing meetings in their absence, so that these become part of treatment; all try to set the ground rule that there be no secrets from the group. The knowledge that anything occurring in an extragroup encounter may be reported to the group usually has an inhibitory effect on antitherapeutic activities. Leader-centeredness or group-centeredness. Methods of group therapy can be ordered with reference to their degree of leader-centeredness or group-centeredness. Since group members are chosen by the therapist and initially expect help only from him, all groups begin as leader-centered. Throughout the duration of some groups the therapist continues to be seen as the sole therapeutic agent, and the group as merely the arena in which members interact with him and each other. Group-centered approaches attribute considerable therapeutic effects to properties of the group itself. Some groups have no official leader. In others, the leader encourages patients to rely increasingly on each other and deliberately tries to foster a group code and group attributes, such as cohesiveness, that have therapeutic potential. Actually, in therapy groups as in all others, leader behavior, member behavior, and group processes continuously interact. For example, a controlled study of group therapy with hospitalized patients found that intrapersonal exploration by the patients was associated with certain aspects of the therapist's style of leadership and with certain properties of the group itself (Truax 1961). Degree of activity structure.
Groups can also be roughly classified in terms of the extent to which their activities are organized. Some, such as Alcoholics Anonymous, therapeutic social clubs, and Recovery, Incorporated (Wechsler 1960), rely on tightly structured, prescribed activities; others, often termed interview or free-interaction groups, create an ambiguous situation and place responsibility for what occurs on the members. In general, the more structured the group, the larger its size can be. Free-interaction groups. To illustrate the range of group therapies, three divergent types may be briefly described. Free-interaction groups typically consist of up to eight adult outpatients and a professional leader. These groups seek to create a code and a climate that foster development of greater self-reliance, spontaneity, and maturity in the members. They encourage free expression of feeling and discussion of personal problems, relying primarily on the shared experiences of the participants to help each find better solutions to his own problems. The responsibility for choice of topic and conduct of the meeting lies largely with the members. The therapist creates and maintains the ground rules and therapeutic atmosphere, facilitates members' interactions, and clarifies the meanings of their behavior (Foulkes & Anthony 1957; Mullan & Rosenbaum 1962). Alcoholics Anonymous. Alcoholics Anonymous is a self-selected, group-oriented organization based on the single criterion of self-confessed alcoholism. Meetings are conducted by the members in a highly structured fashion, and consist chiefly of testimonials about how wretched they were when they drank and how much better they are since they have stopped. Other prescribed activities include making restitution to persons they have harmed and being available to alcoholics who ask for help. The considerable therapeutic effect of these groups lies in the unique degree of support and mutual understanding that alcoholics can give each other. Therapeutic social clubs. Therapeutic social clubs, used chiefly for hospitalized patients or those making the transition back to the community, are run along parliamentary lines, and plan and conduct projects financed by dues. The therapist selects the members and attends all meetings, but remains in the background. The central purpose of these clubs is to combat the vicious circle of impaired social skills, withdrawal, and further social impairment by helping members to improve their social abilities (Bierer 1944).
Results of group therapies
Evaluation of the results of group therapies, as of all other forms of psychotherapy, is hampered by the absence of a satisfactory classification of psychiatric disorders and by inadequate criteria of improvement, but certain clinical impressions are sufficiently widespread to warrant mention. Because of the tensions created by early meetings, especially in unstructured, group-centered approaches, the drop-out rate is higher than in individual psychotherapy, unless the therapist makes special efforts to maintain the patient's commitment to treatment. Particularly prone to leave are patients with such socially unacceptable problems as sexual deviations; those needing strong support from an authority figure; the excessively shy, sensitive, or suspicious; and those with high dominance but low popularity (Taylor 1961). About two-thirds of those who remain in treatment improve, as in individual psychotherapy. Group therapy may be especially helpful to patients who are inadequately socialized, including those who express their personal problems in somatic symptoms rather than words, schizophrenics, and sociopaths. Certain obsessional patients, whose verbal and conceptual skills act as defenses against experiencing emotions in analytic-type therapies, may profit from the strong emotional reactions triggered by group processes. Group treatment may aid families and married couples whose communications have become frozen in self-perpetuating, self-aggravating patterns, and who have become so busy defending themselves that they no longer "hear" each other. As they repeatedly display their pathological interaction patterns in a setting that offers support and encourages self-examination, each family member may come to understand how he contributes to the problems of the others and learn to modify his behavior.
Therapy groups and group dynamics
Although controlled experimentation with therapy groups obviously is very difficult, they provide a source for hypotheses concerning all small-group functioning; and some data obtained from experimental studies of small groups may cast light on the phenomena of therapy groups. The following discussion reviews some possible relationships between the two fields that afford areas for research (Kelman 1963).
Distinctive features. Most therapy groups represent subcultures that are demarcated from the culture of the community at large in certain important respects. One is the ground rule that what is said or done in a group meeting is confidential with respect to the outside world. In contrast to other types of groups, admission is secured by confession of failure in some aspects of living.
Status within the group is related to skill in playing the role of patient, as defined by the group code, and to demonstration of clinical improvement. Another distinguishing feature of most therapy groups is that members are expected to express their feelings about themselves, persons outside the group, other group members, and the leader candidly and freely. At the same time, acting on feelings is interdicted or carefully controlled, as in psychodrama. Finally, therapy groups demand that patients in conflict keep in communication. Such a group code maximizes opportunities for
learning and modification of attitudes and behavior. The protected atmosphere encourages patients to express their real feelings, uninhibited by the norms of ordinary social intercourse. Encouragement to verbalize feelings helps patients to differentiate them. Since the group is tolerant and there is little carry-over into daily life, penalties for failure are mitigated, thus encouraging freedom of experimentation. In daily life, antagonists customarily stop communicating, thereby leaving their mutual distortions unchanged. Maintenance of communication despite conflict encourages verbalization, enables each antagonist to gain fuller understanding of the other's position and his own, and helps each to learn to stand his ground despite opposition. Member-leader and member-member interactions. All forms of psychotherapy support patients' self-esteem, arouse them emotionally, and offer them new cognitions. These features give them courage to examine and modify their habitual attitudes, supply the motive power for doing so, and guide their efforts, thereby enabling them to correct maladaptive attitudes and behavior and to progress in self-development. Therapeutic groups have certain potential advantages with respect to these goals. Successful therapy groups overcome members' demoralizing sense of isolation by enabling them to discover that others have similar problems. Furthermore, in contrast to private treatment, in which all help flows from therapist to patient, members of therapy groups find that they can help each other. This counteracts the damage to self-esteem resulting from having been derogated by family and friends. An important aspect of both the supportive and the influencing power of therapy groups lies in the cohesiveness successful ones develop, growing out of members' discovery of common problems, experience of mutual helpfulness, and a history of shared crises and triumphs. 
This is manifested by therapy groups' reluctance to disband and their resistance to the admission of new members. The danger that cohesiveness will produce pressure on members toward artificial conformity of behavior is reduced by the fact that the group task is to help each member develop in accordance with his own inner needs, so that the group norms encourage diversity. Therapy groups arouse members emotionally in ways not available to individual therapy. One is rivalry for the leader's attention and approval, which, incidentally, seems to be more acute when the leader and members are of different sexes.
The central initial position of the therapist is illustrated by the finding that in a given group those patients who experience a "better" relationship with him relative to other patients show more improvement and are less likely to drop out than are those who experience a "worse" one, regardless of the absolute goodness of the relationship (Parloff 1961). The protective atmosphere of therapy groups and their norm of open expression of feelings facilitate expressions of anger toward the therapist. However, since members depend on him for help, prolonged, unanimous condemnation of him cannot occur. Whether a phase of scapegoating the leader is a necessary step in the development of group cohesiveness, as some believe, remains a question for research. Members also arouse a wide range of hostile and friendly feelings in each other, based on more or less unconscious distortions as well as genuine differences or similarities in background, life experience, and values. In addition, many patients seem to benefit from vicarious emotional participation in problems of others. From the cognitive standpoint, members also serve as models for each other; as sources of feedback, the value of which is increased by the fact that it is less distorted by the rules of social intercourse than are reactions from friends and acquaintances; and as representatives of attitudes existing outside the group. Acceptance by other members carries more weight than acceptance by the therapist, because they are viewed as being more like ordinary people. Because group members represent the outside world, transfer of insights obtained through group experiences to daily life is easier than it is in private psychotherapy. Commitment to the group and awareness that one will report back to it help to sustain changes in attitude. 
On the other hand, the necessity of constantly dealing with the reactions of other members may hamper progress in patients who need to withdraw into reverie or fantasy or to subject their problems to leisurely scrutiny.
Group development and group issues. Well-established therapy groups differ from new ones in many ways, including greater freedom of expression among members and a greater tendency for topics to carry over from one session to the next; but whether therapy groups exhibit regularities of development similar to problem-solving groups remains open despite some experimental evidence in support of this possibility (Psathas 1960). The developmental process in therapy groups can be viewed from the standpoint of the progression of group preoccupations, or issues influencing the members at more or less unconscious levels. It has been suggested, for example, that initial meetings of therapy groups are dominated by three antitherapeutic "basic assumptions": dependency, fight-flight, and pairing, and that group progress can be judged by members' success in overcoming the obstacles these "basic assumptions" present to achievement of the therapeutic goal of increased self-realization (Bion 1961). Another theory conceptualizes group progress in terms of the successive emergence and resolution of "focal group conflicts." A well-nigh universal example of this in early meetings of free-interaction groups is the conflict between the desire to achieve therapeutic gain by becoming committed to the group and exposing one's feelings to it and the fear that by so doing one is exposing oneself to rejection and ridicule (Whitaker & Lieberman 1964). Viewed in a larger perspective, group therapies exploit the universal human tendency to validate subjective experiences by comparing them with experiences of other persons who are perceived as similar. The standards, structure, and processes of therapy groups facilitate these comparisons and help members to correct the distortions thus brought to light. Since each group member deviates in a different way but shares attitudes consistent with the social norms of the community, the attitudes and values of the group as a whole tend to foster improved social adjustment of each member. The advantages and limitations of group psychotherapy as compared with private methods of psychotherapy require further exploration, but it seems probable that the potentialities of group approaches have not yet been fully realized.
JEROME D. FRANK
[Other relevant material may be found in GROUPS and SOCIOMETRY.]
BIBLIOGRAPHY
BELL, JOHN E. 1961 Family Group Therapy: Methods for Psychological Treatment of Older Children, Adolescents, and Their Parents. U.S. Public Health Service Monograph. Publication No. 826. Washington: Government Printing Office.
BIERER, JOSHUA 1944 A New Form of Group Psychotherapy. Mental Health (London) 5:23-26.
BION, WILFRED R. 1961 Experiences in Groups, and Other Papers. New York: Basic Books. -> Seven of these papers were published in Human Relations from 1948 to 1951.
CORSINI, RAYMOND J. 1957 Methods of Group Psychotherapy. New York: McGraw-Hill.
FOULKES, SIEGMUND H.; and ANTHONY, E. J. 1957 Group Psychotherapy: The Psycho-analytic Approach. Baltimore: Penguin.
KELMAN, HERBERT C. 1963 The Role of the Group in the Induction of Therapeutic Change. International Journal of Group Psychotherapy 13:399-451. -> Includes discussion by Saul Scheidlinger.
MORENO, JACOB L. 1959 Psychodrama. Volume 2, pages 1375-1396 in American Handbook of Psychiatry. Edited by Silvano Arieti. New York: Basic Books.
MULLAN, HUGH; and ROSENBAUM, MAX 1962 Group Psychotherapy: Theory and Practice. New York: Free Press.
PARLOFF, MORRIS B. 1961 Therapist-Patient Relationships and Outcome of Psychotherapy. Journal of Consulting Psychology 25:29-38.
POWDERMAKER, FLORENCE B.; and FRANK, J. D. 1953 Group Psychotherapy: Studies in Methodology of Research and Therapy. Cambridge, Mass.: Harvard Univ. Press.
PSATHAS, G. 1960 Phase Movement and Equilibrium Tendencies in Interaction Process in Psychotherapy Groups. Sociometry 23:177-194.
RIESE, W.; and SYZ, H. 1963 Phyloanalysis: Theoretical and Practical Considerations on Burrow's Groupanalytic and Socio-therapeutic Method. Acta Psychotherapeutica et Psychosomatica: International Journal of Psychotherapy and Psychosomatics (Basel) 11 (Supplement): 5-88. -> Part 1, "Phyloanalysis (Burrow)—Its Historical and Philosophical Implications," by W. Riese, is on pages 5-36. Part 2, "Reflections on Group- or Phylo-analysis," by H. Syz, is on pages 37-88.
ROSENTHAL, DAVID; FRANK, J. D.; and NASH, E. H. 1954 The Self-righteous Moralist in Early Meetings of Therapeutic Groups. Psychiatry 17:215-223.
SATIR, VIRGINIA 1964 Conjoint Family Therapy. Palo Alto, Calif.: Science and Behavior Books.
SLAVSON, SAMUEL R. (editor) 1956 The Fields of Group Psychotherapy. New York: International Universities Press.
TAYLOR, FREDERICK K. 1961 The Analysis of Therapeutic Groups. Oxford Univ. Press.
TRUAX, CHARLES B. 1961 The Process of Group Psychotherapy: Relationship Between Hypothesized Therapeutic Conditions and Intrapersonal Exploration. Psychological Monographs 75, no. 7.
WECHSLER, HENRY 1960 The Self-Help Organization in the Mental Health Field: Recovery, Inc.; A Case Study. Journal of Nervous and Mental Disease 130:297-314.
WHITAKER, DOROTHY STOCK; and LIEBERMAN, MORTON A. 1964 Psychotherapy Through the Group Process. New York: Atherton.
IV BEHAVIOR THERAPY
The term "behavior therapy" was introduced by A. A. Lazarus in 1958 and popularized by H. J. Eysenck (1960). It refers to psychotherapeutic methods that are directly based on experimentally established principles of learning. Although "behavior therapy" is broadly synonymous with "conditioning therapy" and with "behavioristic psychotherapy," it more specifically denotes the methods
that have developed from learning theory since the 1940s. While the principles of learning upon which behavior therapy is based have stemmed mainly from the work of Clark L. Hull (1943)—who in many respects united the lines of study begun by Ivan P. Pavlov in Russia and by Edward L. Thorndike and John B. Watson in the United States—a distinctive group of techniques based on B. F. Skinner's operant conditioning paradigm (1938) has been emerging in recent years. Experimental neuroses, first produced in Pavlov's laboratories, provided the primary data from which Hullian learning theory evolved behavior therapy. Ironically, there could have been no such evolution in the Soviet Union because of the pervasive acceptance there of Pavlov's view that a neurosis is due to the establishment of a chronic pathological focus in the central nervous system. Among those who in the 1920s and 1930s tried to apply principles of learning to clinical problems, foremost mention must be made of Mary Cover Jones (1924), who was the first deliberately to invoke the counterconditioning method that dominates present-day behavior therapy (and whose work moldered in the dust for most of the ensuing quarter of a century). She treated children's phobias by having the patient eat in the presence of a feared object. At first, the object was at a distance. Then, as his anxiety diminished, the patient was placed closer and closer to it. Guthrie (1935) realized the wide applicability of this principle, stating that the rule for overcoming an undesired response is to control the situation so that the cue to the undesired response is present while "other behavior prevails." Dunlap (1932) originated the technique of negative practice, in which the extinction mechanism is used to overcome unadaptive motor habits like tics through instigating their repeated evocation in the absence of reinforcement. 
Approaching experimental neuroses from the standpoint of modern learning theory, Wolpe (1952; 1958) demonstrated that the behavior observed in neurotic states had all the attributes of learned behavior. The manifestations of anxiety and agitation were similar in detail to the behavior originally evoked in the situations of conflict or noxious stimulation that were used to precipitate the neurosis; the neurotic responses were conditioned to and remained under the control of stimuli present at the time of causation; and neurotic responses of smaller intensity could be evoked, in accordance with the principle of primary stimulus generalization, by other stimuli similar to those to which the neurotic reaction had been directly attached. The most marked and constant neurotic responses were autonomic responses typical of anxiety. These failed to undergo extinction no matter how often or for how long the animal was exposed to the experimental situation, but they could consistently be removed if the animal could be induced to eat in the presence of anxiety-evoking stimuli. Since the animal's eating was inhibited if anxiety was strong, food had to be offered first in the presence of generalized stimuli that aroused anxiety weakly; and then reciprocally, the eating would inhibit the anxiety, and repeated feedings would diminish it to zero. The same treatment in successively more "severe" situations eventually enabled the animal to eat without anxiety in the cage where the neurosis had been induced.
The methods of behavior therapy
These therapeutic experiments suggested the generalization that the reciprocal inhibition mechanism is the basis of the psychotherapeutic effects obtained by counterconditioning methods, so that if any response that inhibits anxiety can be made to occur in the presence of anxiety-evoking stimuli, it will on each occasion to some extent weaken the conditioned connection between these stimuli and the anxiety responses. This idea was subsequently widely applied in the treatment of human neuroses. Not only eating but a considerable number of other responses in human beings are incompatible with anxiety and thus lend themselves to therapeutic application. The use of some of these responses is briefly described below, followed by a short account of some methods employing different learning mechanisms. (For further details of many techniques of this type, see Wolpe 1958; Eysenck 1960; 1964; Wolpe & Lazarus 1966.)
Counterconditioning methods. The group of counterconditioning (reciprocal inhibition) methods is applied mainly, but by no means entirely, to the elimination of unadaptive anxiety-response habits such as fear of crowds, of praise, or of criticism.
Such habits are the crux of most neuroses, and when they are overcome, treatment of "defenses against anxiety" and other secondary processes becomes irrelevant.
Assertive responses. Assertive responses are used to countercondition neurotic fears aroused in interpersonal interchanges. The term "assertive" is employed here a good deal more broadly than in common parlance and includes not only responses of a more or less aggressive nature but also others expressing affection, liking, admiration, and revulsion—almost any feeling other than anxiety (Salter 1949; Wolpe 1958). Aggressive kinds of assertion are, however, very commonly required. For example, there are many patients whom unjust criticism renders hurt and helpless. The therapist applauds the anger and resentment that they inevitably feel in the situations they inadequately handle and gives detailed instructions for the appropriate expression of these feelings. Such expression reciprocally inhibits the anxiety, and repetition of such expression brings about a cumulative conditioned inhibition of anxiety.
Sexual responses. Sexual responses are employed to overcome habits of anxiety inappropriately evoked in sexual situations. For example, the male patient usually complains of impotence or premature ejaculation, both of which are generally due to anxiety interfering with the predominantly parasympathetic responses that subserve penile erection. The emotional components of the sexual response (sexual feelings) usually remain adequate in the patient so afflicted. The therapist, having ascertained at what stage in the sexual approach anxiety begins to be experienced, instructs the patient (who must have secured the cooperation of his sexual partner) to take his sexual approach no further than this stage of minimal anxiety on repeated occasions—until the anxiety has decreased to zero. He is then directed to go on to the next stage in the same way. Advances continue to be made step by step until normal intercourse is achieved, usually from three to six weeks after the start of therapy. Although the principle is simple, the detailed tactics must always be adjusted to the individual case (see Wolpe 1958; Wolpe & Lazarus 1966).
Desensitization and muscle relaxation. Relaxation, long a popular prescription for nervous disturbances, first achieved scientific respectability through the work of Edmund Jacobson (1938), who showed its autonomic effects to inhibit those effects characteristic of anxiety.
Jacobson treated neurotic patients by giving them very extensive training in relaxation and then instructing them to relax at all times all muscles not in use (differential relaxation). A similar program promulgated by Schultz (Schultz & Luthe 1950) has been widely adopted in Europe. It would seem that when improvement occurs, it is because persistent relaxation provides the possibility of reciprocal inhibition of anxiety aroused by stimuli that appear in the course of daily life. Systematic desensitization, one method of using deep muscle relaxation to decondition neurotic anxiety, is much more economical of time and effort and affords detailed control of the therapeutic process. Training in deep muscle relaxation
occupies only part of each of about six sessions. The greater part of these sessions is devoted to the construction of anxiety hierarchies. If a particular patient is neurotically anxious about high places and about being rejected, situations relating to each of these areas are listed in descending order of intensity of anxiety reaction, each list constituting a hierarchy. In the actual desensitization procedure, the patient is made to relax as deeply as possible, and then the least disturbing scene from one of his hierarchies is presented to his imagination for a few seconds. Presentations are repeated until he no longer has any disturbance, and the same procedure is followed all the way up the hierarchy. Almost invariably there is transfer of this effect when the patient is exposed to the real situation. In individuals who are not disturbed upon imagining situations that disturb them in reality, desensitization requires the exploitation of real stimuli, being then called "desensitization in vivo." Other modes of desensitization. Other inhibitors of anxiety may also be employed therapeutically in a systematic way. An anxiety-inhibiting effect is produced by the emotions spontaneously aroused in some patients by the therapeutic situation itself (see below). In behavior therapy this has been mainly used for desensitization in vivo. For example, in cases of anxiety in social situations characterized by tremor of the hand while lifting a teacup, patient and therapist repeatedly raise first an empty glass and then a progressively fuller one, until all signs of shaking disappear at each stage; and later they repeat the sequence before an audience. Lazarus and Abramovitz (1962) have reported the desensitization of children's phobias by the use of what they call "emotive imagery." The patient is made to expose himself in imagination to phobic stimuli of increasing intensity in contexts of pleasant emotional excitement. 
Recently, use has been made of the observation that anxiety can be inhibited through cutaneous stimulation by nonaversive galvanic shocks. The mechanism of this effect may well turn out to depend on afferent collateral inhibition (Eccles 1957), as may also that of the technique of inhibiting anxiety through the arousal of a dominating motor response evoked by mild electric current (Wolpe 1954; 1958).
Avoidance conditioning. Avoidance (aversive) conditioning is the application of the reciprocal-inhibition principle to the overcoming of responses other than anxiety. It is employed largely to treat obsessional behavior. The agents commonly used
have been strong faradic stimulation of the forearm and drug-induced nausea, either of which must be administered in an appropriate time relation to the stimulus to which avoidance conditioning is desired. Avoidance conditioning has been effectively used in cases of obsessional thinking, compulsive acts, fetishism, and homosexuality. It has been least successful in homosexuality, which is often based on neurotic interpersonal anxiety and in such cases should be treated by deconditioning the anxiety (Stevenson & Wolpe 1960). Avoidance conditioning has also been applied with limited success in the treatment of addiction, especially alcoholism. [See LEARNING, article on AVOIDANCE LEARNING.]
Experimental extinction. Techniques based on the extinction mechanism—the breaking of habits through repeated performance of the relevant act without reinforcement—were introduced by Dunlap (1932) under the name "negative practice" and in recent years have again been employed occasionally in the treatment of such motor habits as tics. In the course of large numbers of forced evocations of the undesired movement, spontaneous evocations of it are progressively lessened. Certain therapeutic measures have given the appearance of applying the extinction principle to the elimination of emotional reactions (e.g., Malleson 1959). The patient is exposed to anxiety-arousing stimuli, either in reality or in imagination, at the greatest possible strength. In some cases this leads to the decline and ultimate elimination of the anxiety response habit, but more often it does not. It is very doubtful that such improvement is really due to experimental extinction; and a form of inhibition has been suggested as the possible mechanism (Teplov et al. 1956). Both clinically and experimentally, the elicitation of a high-intensity anxiety response ordinarily tends to increase the habit strength of that response.
Positive reconditioning.
While the overcoming of unadaptive autonomic response habits is usually the central task of behavior therapy, very frequently there is also a need to form adaptive motor habits. Such conditioning is often part and parcel of the measures employed to break down the anxiety habit, as, for example, in the case of assertive training. But motor habits often need to be changed even where anxiety is not involved. For instance, a man who has repeatedly spoiled courtships by overeager behavior might be taught to "play it cool." If the new behavior is successful, it naturally tends to replace the old. In enuresis nocturna, waking is conditioned to the imminence of urination, and this makes possible the subsequent conditioned inhibition of urination during sleep (Eysenck 1960, p. 377). In recent years Skinnerian operant conditioning techniques have been used to remove and replace undesirable habits. Anorexia nervosa has been successfully treated by providing social rewards— such as the use of a radio or permission to receive company—contingent upon the patient's eating, while withdrawing these rewards when the patient fails to eat (Bachrach et al. 1965). Several varieties of psychotic behavior have been treated on the same principle (e.g., Ayllon 1963), bringing about major and lasting changes in chronic schizophrenic patients, some of whom have been continuously hospitalized for decades.
The results of behavior therapy
The most distinctive feature of behavior therapy is that it enables the therapist to plan therapeutic strategy and control its details, in contrast to merely setting a framework for transactions with the patient and hoping that beneficial effects will emerge. The behavior therapist can specify the reactions to be overcome and the means to be employed in overcoming them, and he can often state the quantitative relations to be expected between defined therapeutic operations and amount of habit change (Wolpe 1963).
Statistical data. Two fairly extensive studies (Wolpe 1958; Lazarus 1963) have evaluated the results of behavior therapy in terms of R. P. Knight's five criteria: symptomatic improvement, increased productiveness, improved adjustment and pleasure in sex, improved interpersonal relationships, and the ability to handle ordinary psychological conflicts and reasonable reality stresses (Knight 1941). From these reports it appears that over 80 per cent of unselected neurotic patients exposed to the available techniques either recover or improve markedly. These results must be compared with the 60 per cent "cured" or "greatly improved" among the completely analyzed patients studied by the Central Fact-Finding Committee of the American Psychoanalytic Association.
While the psychoanalyzed patients were treated an average of four times a week for three to four years—i.e., about seven hundred sessions—the average course of behavior therapy covers about thirty sessions (Wolpe & Lazarus 1966, p. 156). A fairly constant number of neurotic patients (about 40 per cent) improve markedly with therapies other than behavior therapy. It is suggested that these nonspecific improvements are due to emotional responses in the therapeutic situation
that reciprocally inhibit the anxiety responses evoked by verbal stimuli during interviews. Such nonspecific effects presumably also account for part of the favorable results of behavior therapy.
Depth of the effects of behavior therapy. It is sometimes stated as a criticism of behavior therapy that it does not attempt to deal with the "basic dynamic conflict" that is alleged to underlie neurosis. This would be an important objection if a neurosis really had such a conflict as its basis. But there are facts that are hard to reconcile with this idea. For example, a corollary of such an objection would claim that unless the dynamic conflict is resolved, relapse or symptom substitution will sooner or later occur. But a survey (Wolpe 1961) of the results of follow-up studies on neuroses successfully treated by a variety of methods not concerned with the "dynamic conflict" revealed only a 1.6 per cent incidence of relapse or symptom substitution. Weighing the evidence, it seems reasonably certain that neuroses can be considered to be nothing but habits and that therefore a therapy able to break these habits must be considered fundamental.
JOSEPH WOLPE
[See also LEARNING, articles on CLASSICAL CONDITIONING, INSTRUMENTAL LEARNING, REINFORCEMENT. Other relevant material may be found in ANXIETY; CLINICAL PSYCHOLOGY; CONFLICT, article on PSYCHOLOGICAL ASPECTS; NEUROSIS; PSYCHIATRY; and in the biographies of GUTHRIE; HULL; PAVLOV; THORNDIKE; WATSON.]
BIBLIOGRAPHY
AYLLON, T. 1963 Intensive Treatment of Psychotic Behaviour by Stimulus Satiation and Food Reinforcement. Behaviour Research and Therapy 1:53-61.
BACHRACH, A. J.; ERWIN, W. J.; and MOHR, J. P. 1965 The Control of Eating Behavior in an Anorexic by Operant Conditioning Techniques. Pages 153-163 in Leonard P. Ullmann and L. Krasner (editors), Case Studies in Behavior Modification. New York: Holt.
DUNLAP, KNIGHT 1932 Habits. New York: Liveright.
ECCLES, JOHN C. 1957 The Physiology of Nerve Cells. Baltimore: Johns Hopkins Press.
EYSENCK, HANS J. (editor) 1960 Behaviour Therapy and the Neuroses: Readings in Modern Methods of Treatment Derived From Learning Theory. Oxford: Pergamon.
EYSENCK, HANS J. 1964 Experiments in Behaviour Therapy. New York: Macmillan.
GUTHRIE, EDWIN R. (1935) 1952 The Psychology of Learning. Rev. ed. New York: Harper.
HULL, CLARK L. 1943 Principles of Behavior. New York: Appleton.
JACOBSON, EDMUND 1938 Progressive Relaxation. Univ. of Chicago Press.
JONES, MARY C. (1924) 1960 A Laboratory Study of Fear: The Case of Peter. Pages 45-51 in Hans J. Eysenck (editor), Behaviour Therapy and the Neuroses: Readings in Modern Methods of Treatment Derived From Learning Theory. Oxford: Pergamon.
KNIGHT, R. P. 1941 Evaluation of the Results of Psychoanalytic Therapy. American Journal of Psychiatry 98:434-446.
LAZARUS, ARNOLD A. 1963 The Results of Behaviour Therapy in 126 Cases of Severe Neuroses. Behaviour Research and Therapy 1:69-79.
LAZARUS, ARNOLD A.; and ABRAMOVITZ, ARNOLD 1962 The Use of "Emotive Imagery" in the Treatment of Children's Phobias. Journal of Mental Science 108:191-195.
MALLESON, NICOLAS 1959 Panic and Phobia: A Possible Method of Treatment. Lancet [1959], no. 1:225-227.
SALTER, ANDREW (1949) 1961 Conditioned Reflex Therapy. 2d ed. New York: Capricorn Books. -> A paperback edition was published in 1961 by Putnam.
SCHULTZ, JOHANNES H.; and LUTHE, W. (1950) 1959 Autogenic Training. New York: Grune. -> First published in German.
SKINNER, B. F. 1938 The Behavior of Organisms. New York: Appleton.
STEVENSON, IAN; and WOLPE, JOSEPH 1960 Recovery From Sexual Deviations Through Overcoming Nonsexual Neurotic Responses. American Journal of Psychiatry 116:737-742.
TEPLOV, BORIS M. et al. (1956) 1964 Pavlov's Typology: Recent Theoretical and Experimental Developments From the Laboratory of B. M. Teplov, Institute of Psychology, Moscow. Compiled, edited, and translated by J. A. Gray, with an editorial introduction by H. J. Eysenck. Oxford: Pergamon. -> First published in Russian.
WOLPE, JOSEPH 1952 Experimental Neuroses as Learned Behavior. British Journal of Psychology 43:243-268.
WOLPE, JOSEPH 1954 Reciprocal Inhibition as the Main Basis of Psychotherapeutic Effects. A.M.A. Archives of Neurology and Psychiatry 75:205-226.
WOLPE, JOSEPH 1958 Psychotherapy by Reciprocal Inhibition. Stanford Univ. Press.
WOLPE, JOSEPH 1961 The Prognosis in Unpsychoanalyzed Recovery From Neurosis. American Journal of Psychiatry 118:35-39.
WOLPE, JOSEPH 1963 Quantitative Relationships in the Systematic Desensitization of Phobias. American Journal of Psychiatry 119:1062-1068.
WOLPE, JOSEPH; and LAZARUS, A. A. 1966 Behavior Therapy Techniques. Oxford: Pergamon.
SOMATIC TREATMENT
Somatic treatment comprises all therapeutic procedures which are based primarily on physical means of influencing the human organism. The agents employed may be mechanical, electromagnetic, or chemical in nature, but they are all characterized by their potential to change the energy balance within the physiochemical system of cerebral dynamics. Defined negatively, somatic treatment of mental disorders may be said to be essentially independent of social and psychological factors, and it would be expected to be generally effective regardless of individual differences in personality structure, in personal interactions, and in transactional processes inherent in the treatment situation.
MENTAL DISORDERS, TREATMENT: Somatic
Organic and functional mental disorders. Mental disorders are traditionally divided into two categories: the organic and the functional. Organic mental disorders are characterized by the presence of demonstrable morphological or metabolic abnormalities, which are necessary factors for the establishment of their clinical and pathological diagnosis. Mental disorders for which physical cerebral pathology cannot be demonstrated are defined as functional in nature. Since it is in the organic mental disorders that somatic pathology has been clearly established, it would appear plausible to expect here the best results of somatic treatments. However, the most significant progress with somatic therapies has so far been made by psychiatry in the field of functional mental disorders, and only since the mid-1930s. The most spectacular exception to this statement was the discovery of malaria treatment for general paresis of the insane, or dementia paralytica, an inflammatory brain disease caused by syphilis. The discoverer of this therapy, Wagner-Jauregg, received the Nobel Prize in 1927, thus becoming the first psychiatrist to be so honored. However, because of the discovery of penicillin as the specific cure for syphilis, general paresis is no longer a significant mental disorder in many parts of the world. It remains a fact that the major breakthroughs in the somatic treatment of mental disorders have occurred only fairly recently and in the field of functional psychoses, namely in schizophrenia and in manic-depressive psychosis and other depressions. These therapeutic advances have been achieved through shock therapies and still more recently through pharmacotherapy. Throughout the history of psychiatry there have been those who have predicted that some day scientists will discover the physical substrate of all mental disorders.
At that time, the argument goes, we might be able to approach all psychiatric treatment with the same scientific detachment that characterizes a surgeon performing an appendectomy or a physician treating a case of pneumonia with modern antibiotics. This hope of finding some kind of "magic bullet" for every mental disorder is not likely ever to be fulfilled. First of all, it is by no means certain that physical substrates or lesions will be discovered for every psychological disorder. Even more important, however, is the well-established fact that in the realm of behavior, physical and psychological factors are so closely interwoven that a mental disorder (which is essentially a disorder of behavioral manifestations) will seldom respond to somatic therapy alone, without any consideration of psychological and interpersonal factors, even if its primary cause is clearly a physical one.
Are somatic treatments cures? The action of few of the major successfully employed somatic therapies is clearly understood, and none of the therapies is a specific cure. This is not surprising, since these treatment methods are most effective in the functional psychoses, and this class of mental disorders is characterized by the fact that no definite physical or psychological cause has consistently been established by the many investigators who have searched intensively for the final common physical path of these disorders for nearly a century. A truly curative treatment, however, can be undertaken only if the cause of an illness is known. Otherwise, even the most successful treatment of a disease (for instance, insulin therapy of diabetes mellitus) can only be symptomatic, supportive, or compensatory in nature.
A comprehensive approach
Psychiatry is fundamentally a pragmatic science. Its raison d'être and essential goal are the improvement or cure of mentally disordered patients. While psychiatry has developed major theoretical frameworks of its own, e.g., psychoanalysis, and has assimilated others from behavioral sciences for its own use, e.g., learning theory, most of the major advances in psychiatric somatic therapy originated in empirical observations, and the underlying mechanisms through which these treatments became effective were usually inadequately or even erroneously understood. The present methodological situation in the treatment of mental disorders is characterized by a highly dynamic state of flux.
Two extreme positions are no longer clearly defined: that of those who "believe" only in the psychodynamic approach to and resolution of the problems posed by mental disorders and regard a physical approach to mental disorders as methodologically naive and grossly inappropriate, and that of those who consider any other than a clearly somatic orientation and therapeutic approach to mental disorders as unscientific and doomed to failure. Today it is generally accepted that all psychodynamic processes depend on a neurophysiological substrate; consequently, psychoanalysts have in recent years shown much interest in neurophysiological research and the pharmacotherapeutic approach to mental disorders. On the other hand, even in their laboratory experiments, behavior researchers are now clearly acknowledging the important role of individual personality differences, nonquantifiable psychodynamic factors, and interpersonal transactions.
Therapeutic revolution in psychiatry. During the decade between 1950 and 1960 the therapeutic and administrative approach and the social attitudes toward the mentally ill underwent changes of such magnitude that one would be justified in speaking of a quiet revolution. In the United States there was a spectacular decrease of mental hospital populations. In the eight years between 1956 and 1964, i.e., since systematic drug therapy became widely established, there was a decrease of 54,000 patients instead of an anticipated increase of 82,000 patients confined in mental hospitals. One of the by-products of this decrease in mental illness and suffering is a probable saving of more than one billion dollars ("What Tranquilizers . . ." 1964). Thousands of mental patients who only 15 years ago would have remained hospitalized for months, years, or often indefinitely, are now functioning in the community as the result of two new developments: (1) modern drug therapy for mental disorders, for which there has been no precedent, and (2) a more progressive and tolerant attitude of mental hospital administrations coupled with increased social acceptance of the former mental patient. The second development is not entirely new, but in the past, if such liberal attitudes emerged they eventually disappeared because there were no effective physical treatments supporting them.
Historical treatment methods.
During the more than two thousand years that elapsed between the time the somatic nature of mental disorders was first proclaimed by Alcmaeon and Hippocrates and the time it was reasserted by Wilhelm Griesinger at the beginning of the nineteenth century, medicine applied innumerable somatic treatment procedures and remedies, none of which survived because none was ever systematically explored and tested for its efficacy under controlled conditions (Zilboorg 1941; Haisch 1959; Kalinowsky & Hoch 1961). Bloodletting, purging, and induced vomiting were therapeutic mainstays in the treatment of mental disorders for many centuries. Physical threats, restraint, solitary confinement in the dark, whipping, periodic submersion under water, and violent spinning of the mental patient on specially constructed revolving chairs were all frequently applied. Many of these procedures, particularly when applied to severely excited patients, resulted, of course, in rapid "symptomatic improvement," because the patients fainted or became utterly exhausted and remained quiet for some time. Some of these uncritically employed treatment methods have sometimes been referred to as forerunners of modern shock therapies: for instance, the sudden pouring of ice water over the naked patient, burning of the scalp with scalding water, or the sudden plunging of the unsuspecting patient into a lake from a room with a trap-door device. However, the shock induced by these methods was primarily psychological. In contrast to this, modern somatic shock therapy is based on the induction of physiological shock. A certain semantic confusion exists if no clear distinction is made between the biological and the experiential aspects of shock. The old treatments aimed at causing surprise and fear in the patient, while modern shock therapy tends to avoid conscious distress of the patient and aims at the production of a specific state of physiological stress. Countless substances were prescribed as remedies for mental disorders, involving not only a great variety of the basic elements such as mercury, phosphorus, copper, iron, etc., but also chemical compounds, for instance, salts of silver, iodine, or lead. A host of organic chemical compounds was employed, most of which were derived from plants. In ancient times and during the Middle Ages the helleborus plant was thought to possess special powers for the treatment of mental illness. Other treatments, which involved the drinking of the blood of a recently beheaded criminal or concoctions and distillates made from toads, snails, and salamanders, as well as the wearing of precious metals, crystals, and gems, were based on ideas and principles developed by magic and later elaborated on in the symbolic systems of alchemy. Even in prehistoric times, surgical trepanations of the skull were performed, as a number of archeological findings prove.
It is likely that the opening of the skull was not always undertaken because of increased intracranial pressure—the modern indication for such surgery—but more often to provide an escape for the evil spirits which were thought to possess the brain of an insane person. Scientific rationale and evaluation. At first glance one might conclude that not much that is new has been added to the modern repertoire of somatic treatment methods in psychiatry, since basic patterns of shock therapy, chemotherapy, and even psychosurgery were traced out many centuries ago. However, it must be remembered that the number of possible physical treatment modalities at our disposal is limited and that the value of a therapeutic procedure does not lie in its incidental
application but in the fact that its indication is based on a well-established rationale and that the results of the treatment have been assessed by scientific methods. Evidence of favorable results of any specific treatment procedure must be provided through objectively controlled and statistically evaluated clinical experiments. A rationale for psychiatric treatment was not seriously considered in scientific terms until the nineteenth century, and evaluation of treatment results based on controlled and statistically processed observations came into being only in the twentieth century. The scientific groundwork for a successful therapeutic attack on mental disorders was laid at the turn of the twentieth century. This groundwork is founded on three major achievements: (1) the introduction of a clinically valid and useful classification of mental diseases by Kraepelin; (2) the discovery of the principles of a consistent and comprehensive psychodynamic theory by Freud; (3) the progress made by many researchers in bacteriology, cellular pathology, neurophysiology, and chemistry. In the first decade of the twentieth century it was shown that the spirochete, which is the causative agent in syphilis, is present in the brains of patients with general paresis. Soon after the syphilitic etiology for this mental disorder became firmly established, August von Wassermann discovered a practical serological procedure which made it possible to prove objectively the presence or absence of syphilis. Until Wassermann's test became available, many patients, particularly those afflicted with dementia due to the effects of chronic alcoholism on the brain, had been misdiagnosed as suffering from general paresis. Malaria treatment—first breakthrough. In 1917 Wagner-Jauregg, at the University of Vienna, announced results of tests using deliberately induced malaria fever as a treatment for nine patients diagnosed as suffering from general paresis. 
Six of these patients improved greatly and three of them were cured. Until then, general paresis had been an incurable disease which invariably led to complete dementia and a miserable death. For thirty years Wagner-Jauregg had thought about this kind of treatment, ever since he had observed that the course of a psychosis was often favorably influenced by intercurrent infectious diseases. However, this general observation and even Wagner-Jauregg's idea of imitating this "experiment of nature" and inoculating paretic patients with tertian malaria could not have assumed the status of a scientific procedure until he had, at his disposal, a reliable method—namely, Wassermann's serological test—
which enabled him to make an objective diagnosis and select a homogeneous sample of patients for his experiment. Had he tried the experiment twenty years earlier, he might have inadvertently chosen a group of patients whose diagnosis in 50 per cent of the cases was alcoholic dementia and only in the other 50 per cent, general paresis. Under those conditions, to draw valid conclusions about the efficacy of his malaria treatment in cases of general paresis he would have had to employ a much larger sample, and this would have proved difficult, since he was using an untried, somewhat hazardous procedure (Wagner-Jauregg 1946). A variety of unsuccessful treatments. Around 1920 a group of clinicians conceived of focal infections in tonsils, teeth, and the intestines as the cause of many diseases, including the functional psychoses, and for a few years a great deal of unnecessary surgery was performed with the idea of removing the infectious foci. However, the theory was soon disproved, and this kind of surgical treatment was shown to be valueless or even harmful. A number of other therapeutic efforts were aimed at duplicating the spectacular results of the malaria treatment of general paresis, and, in several places, particularly in Europe, fever was induced artificially through injections of sulfur, typhoid vaccine, or foreign protein in patients suffering from various functional psychoses. Other attempts in the same direction, namely, to produce a systemic irritation and thereby a general mobilization of biological defenses, consisted of producing large blisters on the skin through the use of vesicantia, of making sterile abscesses in the muscles through the injection of turpentine oil, and of creating an aseptic meningitis by means of horse-serum injections into the spinal canal. Although short-term improvements in psychotic states were often observed in response to some of the procedures, no lasting remissions in the major psychoses could be achieved by any of them. 
The advances made in endocrinology suggested the therapeutic use of various hormones. Again, this approach did not prove to be fruitful. Most frequently employed in these therapeutic trials were the male sex hormone, testosterone, and the female sex hormones, estrogen and progesterone. However, some promising results in this field were obtained with the hormone of the thyroid gland in the treatment of mental and emotional disorders secondary to hypothyroidism.
Castration. Castration was used in the nineteenth century and earlier because of the mistaken belief that such surgery on the genital organs would prove beneficial in certain psychiatric disorders, particularly those with hysterical manifestations. In modern times, castration of recidivist male sex offenders is a legal, therapeutically employed procedure in some countries, particularly in Scandinavia. A recent review of results of this procedure in this particular group of psychiatric patients has shown that the treatment is often effective, but since it has so many drawbacks of a moral, psychological, and medical nature (not the least being its irreversibility), it is not likely to become a widely accepted procedure (Tappan 1951).
Pharmacotherapy and psychopharmacology
Around 1930 new interest was aroused in the use of various drugs in mental disorders. Loevenhart and others (1929) reported interesting experiments with injections of small doses of potassium cyanate and with the inhalation of carbon dioxide in stuporous patients. Patients who had been mute and motionless for weeks would suddenly, under the influence of these drugs, begin to talk and move about. Within a short time, however, they would invariably relapse into their previous stuporous condition. These therapeutic procedures were hardly more than provocative laboratory experiments. Their mechanism of action was not well understood, and their therapeutic action could not be sustained. New synthetic drugs which had become available at that time were given widespread application in psychiatric disorders. Benzedrine, one of the early representatives of the group of amphetamine compounds, produced marked stimulation of the central nervous system and had certain euphorizing effects (Myerson 1936). However, the early hopes that this drug might prove to be a specific agent counteracting depression were not fulfilled. Therapeutic experiments with photosensitization of depressed patients through the administration of hematoporphyrine, a hemoglobin derivative, seemed to give promising results at first but eventually proved to be disappointing.
Amobarbital, a barbiturate, when given intravenously, was shown to produce the same tantalizing effect of relieving stupor states temporarily as did potassium cyanate injections and carbon dioxide inhalations. Nitrogen metabolism. Gjessing, in a series of beautifully designed and very carefully controlled experiments conducted at a Norwegian mental hospital, demonstrated that a certain type of schizophrenia, which he named "periodic catatonia," was characterized by recurrent attacks of stupor or excitement and was associated with a defective regulation of nitrogen metabolism (Gjessing et al. 1958). Patients afflicted with it either accumulated
or lost nitrogen beyond the normally permitted biological limits, and at the critical points of change in the nitrogen balance of the body, psychotic episodes would occur. Gjessing showed that a small amount of thyroxin would enable these patients to maintain their nitrogen balance within normal limits and thus remain free from psychotic attacks. His brilliant work was greeted with great enthusiasm as another milestone on the road toward effective and scientifically grounded somatic treatment in psychiatry. Unfortunately, its practical importance is limited, since the mental disorder for which this treatment is indicated is comparatively rare.
Pellagra and phenylketonuria. Two more important therapies must be mentioned, both the outcome of systematic, scientific research. One has led to the almost complete disappearance of a mental disease that formerly was fairly frequent in certain parts of the world, while the other one has opened up exciting vistas for a practical therapeutic and preventive approach to mental deficiency. In 1938 Elvehjem demonstrated that the so-called "black tongue" in dogs was a deficiency disease caused by a lack of nicotinic acid, a component of the vitamin B complex, in the food (see Woolley et al. 1938). Soon afterward, the first cases of pellagra in humans, which seemed to be closely related to the black-tongue disease, were successfully treated with nicotinic acid. Pellagra is a disease which produces manifestations in the skin, intestines, and the brain. In the past, many patients suffering from pellagra psychosis could be found in mental hospitals in certain parts of the world, for instance, in the south of the United States, where nutritional conditions were particularly bad among the lower classes. Today one rarely sees a patient with psychosis due to pellagra, and in the few instances where such a diagnosis is made, the condition readily responds to treatment with nicotinic acid.
The modern antibiotic treatment of general paresis with penicillin (which has superseded Wagner-Jauregg's original malaria treatment) and the supplementary vitamin therapy of pellagra psychosis with nicotinic acid are the only two truly curative treatments of mental disorders at our disposal today. A recently developed treatment that has opened fresh possibilities for an attack on mental deficiency is really a preventive one. It consists of the reduction of a certain amino acid, phenylalanine, in the food intake of infants and young children in whom the diagnosis of phenylketonuria has been made. In 1934 Følling showed that a certain small
group of mental retardates was characterized by the excretion of an abnormal metabolite, namely, phenylpyruvic acid, in the urine. Later it was shown that these patients were afflicted with an "inborn error of metabolism," and, lacking certain essential enzymes (in particular, phenylalanine hydroxylase), were unable to metabolize phenylalanine, which is a component of normal food intake, to tyrosine. Because of this metabolic inadequacy, another metabolic product, phenylpyruvic acid, accumulates in their system. It seems that the increased blood level of phenylalanine is highly toxic for the developing brain. By carefully eliminating most phenylalanine from the food of a patient in whom the diagnosis has been made early enough (that is, before the toxic excess of phenylalanine can exercise its damaging influence on the developing brain, up until the fourth or fifth year of life), it is possible to prevent or at least reduce the intellectual damage inflicted upon individuals who carry this genetic error of metabolism [Ragsdale & Koch 1964; "Mental Deficiency . . ." 1961; see MENTAL RETARDATION].
Disulfiram in alcoholism. When it was noted accidentally that people who had been exposed to a certain chemical (disulfiram) reacted during a period of several hours with considerable discomfort, flushing, nausea, palpitation, and vertigo to any amount of alcohol they consumed afterward, this substance was soon introduced into the therapeutic armamentarium of psychiatrists for treatment of alcoholism. The action of disulfiram consists in the blocking of an enzyme which is essential for the metabolic breakdown of products of alcohol in the body. The resulting accumulation of acetaldehyde in the body causes unpleasant toxic effects if any alcohol is taken while the enzyme action is blocked by disulfiram. This drug can serve as a self-imposed chemical restraint for the problem drinker. As long as he takes it, he knows he cannot drink alcohol without rapidly producing an alarming reaction.
If the patient is motivated well enough to take his medication regularly, this treatment can be a valuable aid in the comprehensive treatment program required for the psychiatric management of alcoholics [see DRINKING AND ALCOHOLISM]. Psychotropic drugs and neuroleptic effects. The latest chapter of somatic therapy in psychiatric disorders began in the early 1950s with the introduction of a new class of drugs. These drugs are designated as psychotropic or psychoactive substances, because their principal action is manifested in the realm of human behavior and experience. A whole new scientific discipline has
developed under the name of psychopharmacology. Its principal task is the study of psychoactive drugs [Lehmann 1963; see DRUGS]. Although psychoactive drugs are as old as civilization (alcohol, caffeine, and opium fall into this category), the new type of psychoactive drugs, first systematically applied in 1951, was characterized by a particular quality which has been termed "neuroleptic" by the French psychiatrists Jean Delay and Pierre Deniker, who were pioneers in proving the value of pharmacotherapy in psychiatry. A neuroleptic drug is a substance which produces distinct neurological effects in addition to its psychotropic action. The neuroleptic action may manifest itself in various ways, but it appears most frequently in the form of extrapyramidal symptoms. Extrapyramidal symptoms may occur as drug-induced Parkinsonism, which is characterized by muscular rigidity, a masklike face, tremor, and a shuffling gait, or sometimes they may occur as severe muscular dystonia or akathisia, a term denoting motor restlessness, which makes it impossible for the patient to sit or stand still. Such a neuroleptic quality had never before been observed with any psychoactive drugs. In fact, there had been no experimental way of producing extrapyramidal symptoms consistently by pharmacological means. However, the neuroleptic action is only a side effect of the new drugs, whose principal action is their marked therapeutic effect on such psychotic symptoms as hallucinations, delusions, autistic thought-disorder, and psychotic stupor and withdrawal. For this reason, these drugs have often been referred to as antipsychotic or, more recently, as psychotostatic in their action. Up until a few years ago none of the clinically applied psychoactive drugs (mainly hypnotics, sedatives, and stimulants) had been effective in reducing specifically psychotic manifestations.
In addition to their neuroleptic and their antipsychotic action some of the new drugs also possess an unusual sedative action which is characterized by their pronounced effect on psychomotor tension and agitation, without inducing any clouding of consciousness or impairment of cognitive processes. Until recently, clouding of consciousness and impairment of judgment had been almost synonymous with the notion of powerful sedation (Lehmann 1961; 1966). Although the name "tranquilizer" was given to these new drugs and became a popular label for them soon after they appeared on the clinical scene, when generally applied it may sometimes be a misnomer, since a number of the new drugs
which possess neuroleptic and antipsychotic properties may not tranquilize, but, instead, exercise a mild stimulant action.
Rauwolfia and phenothiazine derivatives. The first two substances which were clinically employed and systematically studied in the treatment of acute manic and schizophrenic psychoses were the rauwolfia and the phenothiazine derivatives. Rauwolfia derivatives are related to the principal active ingredient of the plant Rauwolfia serpentina, which was used for centuries in India for the treatment of mental disorders. However, it was given in doses which today we would consider inadequate. The phenothiazine derivatives, on the other hand, are synthetic products of a systematic search by pharmacologists for certain compounds with pronounced effects on the central nervous system. The first rauwolfia derivative studied extensively in psychiatric patients was reserpine, and the first phenothiazine derivative was chlorpromazine. In the few years since their introduction into psychiatry a tremendous number of clinical and experimental observations has been reported, and a great number of other rauwolfia and phenothiazine derivatives with similar properties have been developed by the pharmaceutical industry. It has become evident that for clinical purposes the phenothiazine derivatives present the advantages of being more reliable and producing fewer undesirable side effects than the rauwolfia derivatives. The latter, however, still play an important role as standard drugs for certain psychopharmacological experiments.
Evaluation of neuroleptics. Many of these new drugs have the particular tranquilizing effect which has been described above, and they do not, like most other sedatives, lead to addiction, even in predisposed individuals.
All of the clinically applied neuroleptics counteract specific psychotic manifestations in acute mental breakdowns, and they were soon found to be effective therapeutic agents even in chronic psychotic states. A considerable number of regressed, chronic schizophrenics, some of whom had vegetated for ten or twenty years as hopeless human derelicts in the back wards of mental hospitals, responded to treatment with the new antipsychotic drugs, although they had previously failed to show any favorable response to repeated courses of insulin-coma and electroconvulsive treatment. The atmosphere of mental hospitals all over the world has changed rapidly since the new drugs were introduced, since treatment with phenothiazine derivatives often renders an acutely psychotic individual rational and cooperative within a matter
of hours or days instead of weeks and months, as had been the rule prior to the drug era. As psychiatrists learned to employ the new tranquilizers it became possible to reduce violent and destructive behavior to a minimum. The construction of new hospitals has been profoundly influenced by these therapeutic developments in that facilities for seclusion and restraint are no longer considered to be essential features of every mental hospital. Perhaps the most important function of the new drugs is their role in the maintenance treatment of psychotic patients in remission. It is now possible to maintain a psychotic patient who has been rendered symptom-free through the use of antipsychotic drugs indefinitely in this compensated, and for all practical purposes, recovered state, provided the patient is carefully observed and continues to take antipsychotic medication regularly and in adequate doses. There are as yet no objective methods to determine which patients may eventually be able to discontinue maintenance medication and which will have to remain on it indefinitely. There are four therapeutic functions for neuroleptic drugs with antipsychotic or psychotostatic activity. The drugs may be used as: (1) symptomatic sedatives; (2) therapeutic agents in acute psychotic conditions; (3) therapeutic agents in chronic psychotic conditions; (4) maintenance agents in former psychotic patients in remission. Older somatic treatments had been known to provide sedation effectively (e.g., the barbiturates or scopolamine), and insulin-coma or electroconvulsive treatment was effective in acute psychotic conditions. However, there had been no therapeutic procedures which could promise any real hope for chronic schizophrenic patients, and there had never been any drug that could maintain former psychotic patients symptom-free. At least 70 per cent of all patients suffering acute schizophrenic breakdowns respond favorably to modern pharmacotherapy. 
Pharmacotherapy is simpler than, and at least as effective as, insulin-coma therapy and consequently has replaced the latter in most psychiatric clinics today. Drug treatment of functional psychoses is neither merely symptomatic nor capable of curing the mental disease. The function of such treatment has been characterized as compensatory in nature. In this respect it resembles such therapeutic procedures as insulin treatment for diabetes or anticonvulsant therapy for epilepsy: as long as the treatment is administered the patient's symptoms will remain in abeyance. The drugs seem to counteract and neutralize the behavioral effects of a somatic substrate in psychotic conditions without, however, being able to eliminate this physical substrate. Today many different phenothiazine derivatives are used in the treatment of psychotic conditions, in particular in the therapy of schizophrenia. One may easily become confused by the many generic names and the innumerable trade names under which these drugs appear on the international markets. Common to most of them is the phenothiazine nucleus; their differences depend on the chemical structure of the side chain attached to this nucleus. Carefully controlled observations have shown that the therapeutic effectiveness of the great majority of phenothiazine derivatives appears to be roughly equal. The derivatives differ, however, in the dose which is required to produce therapeutic effects and also in the side effects which accompany their administration. Recently, another chemical class with neuroleptic and antipsychotic properties, the butyrophenones, has come under intensive study. There seems to be little doubt that equally or more effective new psychotostatic drugs will be developed in the future.
Other psychoactive drugs. While the mainstream of therapeutic activity has been going in the direction of treating psychotic manifestations, a number of new chemical substances with other interesting psychoactive properties have also been developed since 1955. These new drugs can be considered under three headings: (1) minor tranquilizers, (2) antidepressants, (3) psychotomimetics.
Minor tranquilizers. Minor tranquilizers are sedatives which do not possess antipsychotic properties. In other words, they can sedate a tense and excited patient but they cannot counteract such specific psychotic manifestations in the cognitive and perceptual field as delusions, thought disorder, and hallucinations. Drugs which do have antipsychotic effects (for instance, phenothiazine or butyrophenone derivatives) are sometimes referred to as major tranquilizers.
While a considerable number of new minor tranquilizers have appeared on the market, there is, so far, no convincing evidence that these new substances have accomplished anything that is essentially different from the achievement of the older well-known sedatives. Minor tranquilizers, which are useful adjuncts to the treatment of anxiety and emotional tension, are also characterized by the fact that they usually induce drowsiness and postural ataxia, exert an anticonvulsant action, and may lead to habituation and psychological addiction. In contrast, major tranquilizers do not induce postural ataxia, and only rarely do they induce persistent drowsiness. Major tranquilizers also tend to lower the brain's
convulsive threshold and they do not lead to habituation and addiction (Berger & Ludwig 1964). Antidepressants. The early hopes that the new stimulants (such as the amphetamines), which had been introduced in the 1930s, would be useful in the treatment of severe depressive states were not fulfilled. A depressed person is frequently in a state of heightened arousal, and giving him stimulants might only increase his anxiety and agitation without affecting the fundamental symptom of all depression—namely, the depressive mood. To fill the gap that existed in the pharmacotherapy of mental disorders characterized by depression, another type of psychoactive substance was developed a few years after the discovery of the psychotostatic drugs (major tranquilizers)—the antidepressant drugs (Lehmann 1965). These may be divided into two major groups: (1) the mono-amine-oxidase inhibitors, (2) antidepressants with no mono-amine-oxidase inhibiting activity—also referred to as tricyclic antidepressants. Mono-amine-oxidase is an enzyme which degrades the so-called neurohormones, noradrenalin and serotonin. There is indirect clinical and experimental evidence that the distribution and balance of these neurohormones in the brain is significantly related to emotional states. It has been observed clinically that chemical substances which inhibit mono-amine-oxidase and thereby allow noradrenalin and serotonin to build up in the brain may successfully reduce the duration of a depression from several months or years to a period of three or four weeks. While this observation has provided a pharmacological model in the systematic search for new antidepressant drugs, the mono-amine-oxidase inhibitor model certainly does not account in full for the physical substrate of depressive states, since a number of other substances with no enzyme inhibiting activity but a close chemical resemblance to the phenothiazines have proved to be equally effective in the treatment of severe depression.
Our knowledge of the specific indications for each type of antidepressant—for instance, which should be prescribed for reactive depressions and which for endogenous, agitated, or retarded depressive states —is still incomplete. Nevertheless, antidepressant drug therapy represents a considerable step forward in psychiatric treatment, and antidepressant drugs are effective in about 60 per cent of all depressive conditions. This reduces the number of patients who otherwise would have to be given electroconvulsive therapy. It takes from one to three weeks for antidepressant drugs to manifest their therapeutic action, and they
are, therefore, slower acting than the antipsychotic drugs. Some, but not all, antidepressants have stimulating effects. The mono-amine-oxidase-inhibiting drugs tend to produce mild euphoria, while the tricyclic antidepressants that do not inhibit mono-amine-oxidase merely eliminate depressive symptoms without inducing euphoria. Like many other psychoactive drugs, antidepressants frequently produce side effects; the mono-amine-oxidase inhibitors are particularly prone to do this. It is interesting to note that all effective antidepressants may potentiate psychotic symptoms—for instance, hallucinations and delusions—and may sometimes even induce a toxic psychotic state. Recently, the distinction between antipsychotic and antidepressant drugs, which at first appeared to be quite clear-cut, has lost some of its sharpness. There are depressed patients, particularly the anxious depressed, who respond to major tranquilizers, and there are schizophrenic psychotics who respond favorably to antidepressants. Patients who respond in this different manner cannot yet be clearly distinguished in advance from the patients who show average response tendencies. Psychotomimetics. Psychotomimetics are drugs which experimentally induce states of psychotic disintegration accompanied by thought-disorder and perceptual disturbances. Some representatives of this class of drugs have been known for a considerable time—for instance, marijuana, an ingredient of the hemp plant, and mescaline, the active component of the Mexican peyote cactus (Unger 1963). Other psychotomimetic substances have been developed more recently—for instance, lysergic acid diethylamide, a synthetic ergot compound, and psilocybin, which is derived from Mexican mushrooms. These strange substances have received a great deal of public attention in recent years and have stirred up considerable controversy.
Preliminary clinical trials with some of the psychotomimetics, or hallucinogens as they are sometimes called, suggest that they may prove to be useful in the treatment of alcoholism and character disorders. However, these results will have to be confirmed, and a methodology for systematic therapy with these substances still needs to be developed. In the meantime, these substances provide interesting tools for the experimental study of the psychotic process, since it is possible to induce "model psychoses" in volunteer subjects, who will, for a period of hours, undergo experiences which seem to be closely related to the experiential world of the schizophrenic. Unfortunately, these chemicals, which, like any powerful drug, carry a dangerous physical potential, have become a rallying issue of almost political importance for a small but vociferous group of intellectuals who claim the right to administer these drugs to themselves and to others according to their own, nonmedical judgment. It appears that under the influence of such substances, states of ecstatic exaltation can be fairly easily induced. Viewed in an objective clinical and psychopharmacological perspective, the development, understanding, and administration of psychotomimetic drugs must be judged to be still in the experimental stages.
Prolonged sleep treatment
In 1922 Klaesi introduced prolonged sleep therapy into the therapeutic armamentarium of the psychiatrist (Klaesi 1922). This treatment method, which consisted of keeping patients asleep with the help of hypnotic drugs throughout most of the day for a period of several weeks, has held its place in the treatment of certain psychiatric conditions. Originally introduced for the treatment of schizophrenia, depressions, and manic states, it is now mostly given to patients suffering from long-standing psychoneuroses and psychosomatic conditions. It is particularly popular with psychiatrists in Soviet Russia, where Pavlov's teachings determine the basic theoretical approach to psychiatry. In Pavlov's conceptual framework, prolonged sleep is viewed as a form of protective inhibition that can successfully counteract the pathological excitation of the higher central nervous processes that manifest themselves as symptoms of pathological behavior. A variety of hypnotic and sedative drugs are used for sleep therapy. They are given at regular intervals either by mouth or by injection. This therapy does not produce any unpleasant experiences for the patient, but it requires careful, continuous nursing care in order to avoid circulatory collapse or other untoward effects in the patient, who has to be kept inactive for long periods of time.
[See SLEEP.] The use of sleep-producing drugs in psychiatric patients may be variously determined according to the different theoretic orientations of the therapist. While the Russian psychiatrists think in terms of physiological protective inhibition, a psychiatrist from the West may be more interested in the psychodynamic aspects of sleep therapy—e.g., disinhibition, abreaction, or anaclitic dependency. Another method of inducing therapeutic sleep makes use of a rather weak continuous electrical current that is passed through the brain and produces relaxation and sleep which may be sustained
for several hours without eliciting pain, shock, or convulsions. As with drug-induced prolonged sleep, Russian psychiatrists invoke the concept of protective inhibition for this kind of electrically induced alteration of consciousness, and the therapeutic procedure is more widely employed in countries in the Russian sphere of influence than in Western countries. However, neither its technical application nor its therapeutic effects are as reliable as electroconvulsive treatments. Recent claims for success with electrically induced sleep are made mainly for neurotic states and for a variety of psychosomatic disorders (see Clapp & Loomis 1950; Giljarowskii et al. 1956; Wageneder & Hafner 1965).
Shock therapies
A new era of somatic therapy in mental disorders began around 1935 with the discovery of two types of physiological shock therapy: (1) the hypoglycemic coma treatment, or insulin-shock therapy, developed by Sakel, a young German psychiatrist; and (2) the convulsive therapy, or pentylenetetrazol-shock treatment, developed by Meduna, a psychiatrist at the University of Budapest. Sakel started his experiments two years prior to Meduna's (Kalinowsky & Hoch 1961; Sakel 1958; Meduna 1935). Insulin-coma treatment. The principle of insulin-shock treatment lies in the production of the deep coma that results from a severe lowering of the blood-sugar level following an injection of insulin. The brain depends for its metabolism mainly on carbohydrates. A reduction of the blood-sugar supply, therefore, lowers brain metabolism relatively more than that of any other organ. Sakel had been treating patients addicted to morphine with small doses of insulin as a relief measure during their withdrawal from the narcotic; he observed that sometimes they had inadvertently slipped into a deep coma and after having been aroused from it had appeared much improved.
He was an imaginative man who, without much other justification, generalized from these specific observations to the bold hypothesis that schizophrenic patients would benefit from hypoglycemic coma therapy. He was also a courageous man, for, at that time, a coma produced by an overdose of insulin was still considered to be a dangerous complication. At any rate, in 1933 Sakel presented his first promising report with insulin-coma therapy in one hundred schizophrenic patients from the University Clinic of Vienna. His findings were soon confirmed in other European countries, and the treatment was rapidly adopted all over the world. It constituted
the first effective major therapeutic advancement in the management of schizophrenia (Sakel 1958). Insulin-coma therapy of schizophrenia is most effective in patients who have been sick for not more than six months to a year. After this time it rapidly loses its effectiveness. Patients have to be treated in specially equipped hospital units for several months. They receive an injection of insulin while in the fasting state early in the morning. Within one to two hours they fall asleep and eventually go into a deep coma from which, at the end of three or four hours, they are aroused by an injection of glucose into the blood stream or by the infusion of a sugar solution into the stomach. A recent refinement of technique consists in the administration of glucagon, a substance which rapidly mobilizes available carbohydrate stores in the body and, therefore, reduces the need for large amounts of sugar to be administered through intravenous injection or stomach tube. The amount of insulin required for each patient to induce coma differs considerably and might change from day to day. The medical and nursing staffs administering insulin-coma treatment have to be well trained and experienced. Usually not more than ten or twenty patients can be treated on any one day. All these factors make the treatment a rather expensive procedure which requires considerable vigilance on the part of the staff and is by no means entirely without risk. Nevertheless, in competent hands, insulin treatment produces 70 per cent to 75 per cent remissions in acute schizophrenia, and it was the favored treatment for this disease until the advent of pharmacotherapy. Subcoma insulin treatment. A modification of insulin-coma therapy consists of the administration of smaller doses of insulin, which will produce a state of lowered blood-sugar level (hypoglycemia) without, however, bringing about a deep coma. The patients are rendered sleepy, and a state of deep relaxation is produced. 
At the termination of each treatment—which lasts from two to three hours with the patient lying quietly in a darkened room—the patient feels hungry but relaxed. A course of subcoma insulin treatment usually lasts for several weeks. Indications for this type of therapy are conditions of neurotic anxiety and increased tension. Patients often respond to the treatment with improved sleep, gain of weight, a feeling of increased well-being, and a reduced need for sedatives. Drug-induced convulsive treatment. Insulin-produced hypoglycemia sometimes leads to convulsions. It had been observed that patients often
appeared to improve more rapidly after they had had a convulsion. Meduna theorized that there existed a biological antagonism between epilepsy and schizophrenia because these two diseases do not often occur simultaneously in the same patients. He had transfused schizophrenic patients with the blood of an epileptic and vice versa, hoping to see an improvement in the patients treated in this manner. When he obtained no results, he conceived of the idea of producing epileptiform convulsions in schizophrenic patients by injecting them intramuscularly with camphor in oil. Later he used the synthetically produced drug pentylenetetrazol, instead. This drug is water soluble and can be injected into a vein. When this is done the patient experiences for a short time an agonizing state of apprehension and panic and then, within a minute or two, loses consciousness and has a typical epileptiform convulsion of the grand mal type. The treatment is given every other day until a series of 10 to 25 or more convulsions has been produced. With his first group of schizophrenics treated with pentylenetetrazol, Meduna could report almost 90 per cent remissions. However, a considerable number of patients relapsed within a few weeks, and the over-all results of pentylenetetrazol-convulsive treatment in schizophrenia were not quite as favorable as those obtained with insulin-coma therapy. Often the two treatments could be combined for best results (Meduna 1935). It did not take long before it became evident that convulsive therapy was highly effective in depressive states—in fact, more so than in schizophrenia—and in the treatment of severe depression convulsive therapy still plays an important role. Electroconvulsive treatment (ECT).
In 1938 Cerletti and Bini in Rome treated a patient with an improved modification of the convulsive treatment method, namely with convulsions produced through the application of an electrical current instead of the administration of convulsant drugs (Cerletti 1950). This method soon supplanted the pentylenetetrazol treatment, since it was simpler and, above all, did not produce the unpleasant subjective effects of the drug. A patient who receives electroshock therapy may remain comfortably in his own bed. Two electrodes are applied to the temples, and even if no anesthetic is given the patient never feels any pain, since the passage of the current (about 100 volts for 0.3 to 0.5 seconds) causes immediate loss of consciousness. After regaining consciousness the patient might remain rather confused for thirty minutes to an hour. A depressed person usually begins to show definite improvement after the first four or five treatments, but additional treatments are required to prevent a relapse of depressive symptoms. Six to twelve treatments—administered over a period of two to four weeks—are the usual number for a course of ECT in acute depressions. If electroshock therapy is given to schizophrenic patients, twenty or more treatments are usually administered. After three to five electrically induced convulsions, one observes a change in the patient's electroencephalogram in the direction of a general slowing of the electrical brain rhythms. At about the same time the patient develops an acute amnestic syndrome. He shows impairment of recent memory, and after a large number of treatments—or if treatments are given at closely spaced intervals, for instance, every day or several times a day—the patient might become greatly confused. Sometimes such confusion is deliberately induced in the course of "regressive" or "depatterning" shock therapy, in the hope that the complete shattering of the patient's mental processes will be followed by a reconstitution and reintegration of his mental functioning but with a selective permanent destruction of the more recently acquired pathological manifestations [Cameron et al. 1962; see ELECTROCONVULSIVE SHOCK]. Within two to four weeks after discontinuation of electroshock therapy, memory functions and electroencephalographic indices tend to return to their normal levels. Although the human organism is able to tolerate convulsions amazingly well at any age, in elderly persons there is some danger of either producing or triggering off permanent impairment of memory due to physical brain changes. The aged are, of course, already predisposed to such organic brain damage. In younger persons there seems to be little or no danger of any permanent brain damage due to electroconvulsive treatment.
Several modifications of the standard treatment method have been proposed, mostly with the intention of reducing confusion and amnesia following ECT. One of the most promising is the unilateral application of the current, which seems to produce less memory impairment than the standard method (Cannicott 1962). It has been demonstrated that the induction of paroxysmal seizure discharges in the brain is the essential factor in electroconvulsive therapy. All muscular, autonomic, and metabolic responses which can be observed during and after the convulsions seem to be secondary and carry little or no therapeutic value.
The violent muscular contractions which accompany the seizure discharges of the brain can easily lead to fractures of the vertebrae or of other bones in the trunk or the extremities, and for this reason electroconvulsive therapy today is almost always preceded by the administration of a muscle relaxant—either curare or one of its analogues, or more frequently, succinylcholine—which is injected into a vein together with a short-acting barbiturate to produce a superficial anesthesia as well as muscular relaxation. Immediately following these injections artificial respiration is established for a few minutes; the electrical shock is administered; and the convulsion ensues. However, suppressed by the muscle relaxant, the convulsion consists of hardly more than a flickering of the eyelids or a movement of the toes, although the electrical and physiological effects on the brain are not essentially altered. Electroshock therapy, which should be referred to as electroconvulsive therapy (ECT), is today the most reliable treatment of severe depressive states. It produces favorable results in about 90 per cent of the cases, while treatment with antidepressant drugs is effective only in about 60 per cent of depressive conditions. Electroconvulsive treatment is most successful in involutional melancholia, an endogenous depressive condition usually associated with agitation and occurring in patients of the involutional age, i.e., between forty and sixty years. Electroconvulsive treatment is less effective when marked anxiety or many hypochondriacal symptoms are associated with the depressive state. While ECT is still the most reliable treatment for severe depressive states and is also often remarkably effective in states of acute psychotic excitement or schizophrenic disintegration, it should be clearly understood that this type of therapy does not seem to influence the long-term development of a psychopathological process. 
Various studies have shown that ECT does not prevent depressive or psychotic relapses; nor does it seem to reduce the number of such relapses or prolong the normal intervals in recurrent mental diseases. The main value of this dramatic therapy lies in the fact that it leads to a rapid disappearance or amelioration of depressive or psychotic symptoms and that it may reduce the duration of a severe depression from nine to twelve months to three to six weeks. Although most manic-depressive episodes terminate eventually in spontaneous recovery, the ever-present danger of suicide in depressed patients can be eliminated with ECT, and the value of saving a patient a great deal of unnecessary suffering and
restoring him early to useful functioning in the community is, of course, sufficient justification for employing this kind of therapy whenever it is indicated as the most suitable therapeutic approach. If the mental disease is not of a periodic nature, as in manic-depressive psychosis, but tends to be chronic, as, for instance, schizophrenia, then ECT is much more limited in its value, because although regular repeated maintenance treatment with ECT is possible, it is more hazardous and not as practical as the continuous suppression of psychotic symptoms with pharmacotherapy.
Psychosurgery
In 1936, almost simultaneously with the introduction of the shock therapies, another dramatic somatic therapy was introduced into psychiatry by the Portuguese neuropsychiatrist Antonio Egas Moniz—the surgical procedure of prefrontal lobotomy. Basing his work on accumulated data from neurophysiological and experimental psychological research, Moniz theorized that the frontal lobes were essentially related to the higher mental processes, which were necessary components of normal cerebral and behavioral functioning. In particular, certain processes of abstraction, inhibition, and time projection had been thought to find their representation in the frontal and prefrontal lobes. Moniz severed the connections between the frontal lobes and the thalamus through small cuts of the fiber tracts responsible for these connections. He showed that his operation was often followed by improvement, particularly in early schizophrenia but also in chronic states of depression or in obsessive-compulsive ruminations that had not yielded to any other treatment. [See NERVOUS SYSTEM, article on STRUCTURE AND FUNCTION OF THE BRAIN; OBSESSIVE-COMPULSIVE DISORDERS.] This operation is not followed by any deterioration of intellectual performance, but it leads to a marked reduction in the intensity of the person's emotional involvement, imagination, and creative productivity.
If an insufficient amount of brain substance is cut there will be no adequate relief of symptoms. If exactly the right amount of brain substance is destroyed, the therapeutic result might enable the patient to function better after the operation than he ever functioned before. If the surgeon destroys too much brain substance, the patient might develop into an apathetic slob or an irresponsible psychopath, being left either without ambition or without any consideration for others. Unfortunately, it is not possible to calculate precisely where and how much to cut. However, a
great deal of progress has been made in perfecting this particular neurosurgical procedure, so that the probability of a good therapeutic response is heightened and the probability of personality deterioration is minimized. Most of the results with this kind of therapy were collected in the United States prior to 1955 and particularly in Britain, where the interest in prefrontal lobotomies still persists and is probably higher than anywhere else. On the North American continent interest in psychosurgery has sharply declined since the introduction of modern drug therapy for mental disorders. There are probably two principal reasons for the disinclination of most American psychiatrists to submit their patients to a prefrontal lobotomy or to any other type of surgical interference with the brain: (1) Since neurons cannot regenerate, any artificially produced morphological changes in the brain—in particular any loss of substance—would be irreversible; (2) Several careful follow-up studies have brought out evidence that marked personality changes supervened eventually in almost all lobotomized patients, even if the therapeutic gain outweighed the personality deficit due to the operation (Greenblatt et al. 1950; Petrie 1952; Rylander 1939). One general consequence of a prefrontal lobotomy is a certain loss of subtleness and complexity in the patient's personality; he usually loses some of his creative imagination; he dreams less; and his dreams tend to become much simpler in structure; and on the whole he becomes less sensitive. However, for intractable conditions of chronic depression or for chronic obsessive-compulsive psychoneurosis which has been refractory to any other type of therapy, prefrontal lobotomy would still have to be seriously considered as a last resort, which, nevertheless, may often yield surprisingly good results.
Mechanisms of action
The rationale for a therapeutic procedure is easily understood if the etiological factors of the condition to be treated are known and if the treatment is specifically aimed at eliminating these etiological factors. This would apply to the modern penicillin treatment of general paresis, which kills the Treponema pallidum and thus eliminates the causal factor of syphilis, which in turn is responsible for the psychosis. But the modus operandi of Wagner-Jauregg's malaria treatment of general paresis was not so easily understood because it constituted an unspecific attack on the disease, and a number of theories have been proposed to explain its effect. It was thought, for instance, that
the physical hyperthermia factor produced by malaria might kill the Treponema pallidum, and it was also put forward that malaria therapy was effective because it mobilized general biological defenses in an unspecific manner, thus enabling the organism to deal successfully in its own way with the brain disease. Many theories have been offered to explain the therapeutic action of insulin-coma treatment, ranging from a conception of reduced cerebral metabolism, which is equivalent to partial anoxia, to the idea that the artificially induced regression of the patient to an infantile level and the nursing care he is receiving during the coma treatment from doctor and nurse constitute a re-enactment of the patient's infantile situation without the traumatic factors which supposedly prevailed originally in his infancy. Theories to explain the effects of electroconvulsive treatment also involve physical and psychodynamic concepts and range from neurophysiological, biochemical, and endocrinological hypotheses to the proposal that the induced amnesia is responsible for the "unlearning" of recently acquired pathological behavior patterns and also to the suggestion that the patient experiences symbolic death and resurrection under the magic influence of the doctor administering the treatment. There is no generally accepted theory for the action of modern psychopharmacological agents. It has already been mentioned that the therapeutic action of one group of antidepressants, namely the mono-amine-oxidase inhibitors, has been explained on the basis of competitive affinity of the drug and hormones to essential receptor sites on an enzyme. The neuroleptic major tranquilizers seem to act on the brain's reticular activating system, the diencephalon, and the extrapyramidal system and possibly produce their effects through an influence on synaptic transmission in the subcortical structures of the brain.
Minor tranquilizers apparently affect more prominently the cerebral cortex and parts of the limbic system in the subcortical structures. The action of psychotomimetics still poses a very puzzling phenomenon: these drugs may lead to desynchronization and imbalance between the transmission of impulses through the primary sensory paths and their processing through the associative systems of the brain.
Treatment practices in clinical psychiatry
A general principle may be formulated, stating that somatic treatment is the primary therapeutic approach to the psychoses, with psychotherapy playing an auxiliary role, and that psychotherapy
is the primary therapeutic approach to the psychoneuroses, with somatic treatment serving as an adjuvant (Kline & Lehmann 1962). This would mean that the treatment of psychoses is less complex and demands less personal skill and experience from the therapist than the treatment of psychoneuroses. The recently developed pharmacotherapy of schizophrenia, manic states, and depressive conditions makes it now possible for a nonspecialist physician to treat a psychotic patient successfully outside a mental hospital. This is of particular significance for the organization of psychiatric services in underdeveloped countries, where the impossibility of constructing mental hospitals and finding enough well-trained specialists has prevented the build-up of such services until now. That modern pharmacotherapy can indeed provide the nucleus for a successful psychiatric service organization in a country where such services were nonexistent before has been demonstrated in Haiti, where a well-functioning psychiatric clinic with preventive, therapeutic, and rehabilitative functions has recently been set up within a short time and at minimum expense of financial means and trained manpower. Such shortcuts have become possible only since rapidly effective and self-administered therapeutic agents that can also maintain the patient symptom-free following the acute treatment phase have been made available in the form of antipsychotic and antidepressant medication. Ready availability, ease, and continuity of administration are the main advantages of drug treatment over insulin-coma and electroconvulsive therapy. Insulin-coma therapy, although still a valuable treatment for schizophrenic patients who do not respond to drug therapy, is no longer given at most psychiatric treatment centers because the necessary, specially equipped insulin-treatment units have been dissolved.
Electroconvulsive treatment is simple to administer but also requires some special apparatus and technical skill on the part of the therapist; besides, it cannot be given continuously because of its cumulative effects on a patient's memory. However, it is still widely used in the treatment of acute or chronic manic or schizophrenic psychoses, particularly when they do not respond to drug therapy. Prefrontal lobotomy (sometimes also referred to as leucotomy) is rarely performed today except in cases manifesting depressive or obsessive-compulsive symptoms which have failed to respond to any other treatment. Pharmacotherapy has in many instances replaced these older methods of somatic treatment. Different national cultures seem to have shaped
different therapeutic attitudes, and in certain countries preferences and dislikes for particular modes of treatment are clearly evident. In a very broad sense it may be stated that psychiatric orientation on the North American continent is characterized by a much heavier bias toward a psychodynamic (more specifically, Freudian psychoanalytic) approach to theory and practice than psychiatry in other parts of the world. European schools of psychiatry place more emphasis on genetic-constitutional and physicochemical factors in their speculation on the etiology of mental disorders as well as in their clinical approach to them. In the Soviet Union, psychosurgery, in the form of prefrontal lobotomy, has been officially ruled out as a permissible treatment method. In Britain, on the other hand, there is still greater interest in this type of therapy than in most other countries. In German-speaking countries, where the influence of an existentialist orientation is fairly strong, electroconvulsive therapy seems to be disliked, possibly because of the radically disrupting effect such treatment has on the continuity of a patient's orientation toward his problems and his experience of the world in which he lives. In countries where psychiatric services must be created de novo, the practical advantages and the comprehensive effectiveness of modern pharmacotherapy have made it the favorite choice of somatic treatment for mental disorders. It should again be recalled that the undoubted effectiveness of modern somatic treatment applies only to psychotic disorders, in particular to those of functional origin. Physical methods, including drug therapy, are of very limited value for the therapeutic management of psychoneuroses and character disorders. In accordance with modern concepts one may view mental disorders as the resultants of multifactorial functions and forces. 
The complex factors interacting with each other may be categorized under the headings: (1) genetic factors: constitutional personality matrix, idiosyncratic and, within wide limits, independent of time; (2) situational factors: idiosyncratic and time-dependent; (3) physicochemical factors: general and independent of time. Somatic treatment of mental disorders is aimed only at the physicochemical sector of the human behavioral complex. The indirect repercussions of physical treatment may, however, bring about a change of balance in the entire organism and thus result in therapeutic effects which seem to go far beyond simple physical changes.

HEINZ E. LEHMANN
MENTAL DISORDERS, TREATMENT: Therapeutic Community

[Other relevant material may be found in DEPRESSIVE DISORDERS; DRUGS; ELECTROCONVULSIVE SHOCK; NERVOUS SYSTEM, article on STRUCTURE AND FUNCTION OF THE BRAIN; NEUROSIS; PSYCHIATRY; PSYCHOSIS; SCHIZOPHRENIA.]

BIBLIOGRAPHY

BERGER, F. M.; and LUDWIG, B. J. 1964 Meprobamate and Related Compounds. Volume 1, page 103 in Maxwell Gordon (editor), Psychopharmacological Agents. New York: Academic Press.
CAMERON, D. E.; LOHRENZ, J. G.; and HANDCOCK, K. A. 1962 The Depatterning Treatment of Schizophrenia. Comprehensive Psychiatry 3:65-76.
CANNICOTT, S. M. 1962 Unilateral Electro-convulsive Therapy. Postgraduate Medical Journal 38:451-459.
CERLETTI, UGO 1950 Old and New Information About Electroshock. American Journal of Psychiatry 107:87-94.
CLAPP, JOHN S.; and LOOMIS, EARL A. 1950 Continuous Sleep Treatment: Observations on the Use of Prolonged, Deep, Continuous Narcosis in Mental Disorders. American Journal of Psychiatry 106:821-829.
FØLLING, A. 1934 Excretion of Phenylpyruvic Acid in Urine as Metabolic Anomaly in Connection with Imbecility. Nordisk medicinsk tidskrift (Stockholm) 8:1054-1059.
GILJAROWSKII, VASILII et al. 1956 Elektroschlaf. Berlin: VEB Verlag Volk und Gesundheit.
GJESSING, L.; BERNHARDSEN, A.; and FRØSHAUG, H. 1958 Investigation of Amino Acids in a Periodic Catatonic Patient. Journal of Mental Science 104:188-200.
GREENBLATT, MILTON; ARNOT, ROBERT; and SOLOMON, HARRY C. (editors) 1950 Studies in Lobotomy. New York: Grune.
HAISCH, ERICH 1959 Irrenpflege in alter Zeit. Ciba-Zeitschrift 8, no. 95:3142 only.
KALINOWSKY, LOTHAR B.; and HOCH, PAUL H. 1961 Somatic Treatments in Psychiatry. New York: Grune & Stratton.
KLAESI, J. 1922 Über die therapeutische Anwendung der "Dauernarkose" mittels Somnifen bei Schizophrenen. Zeitschrift für die gesamte Neurologie und Psychiatrie 74:557-592.
KLINE, NATHAN S.; and LEHMANN, H. E. 1962 Handbook of Psychiatric Treatment in Medical Practice. Philadelphia: Saunders.
LEHMANN, H. E. 1961 New Drugs in Psychiatric Therapy. Journal of the Canadian Medical Association 85:1145-1151.
LEHMANN, H. E. 1963 Psychopharmacology: A Discussion of Current Problems. Ohio State Medical Journal 59:1091-1097.
LEHMANN, H. E. 1965 The Pharmacotherapy of the Depressive Syndrome. Journal of the Canadian Medical Association 92:821-828.
LEHMANN, H. E. 1966 Pharmacotherapy of Schizophrenia. Pages 388-411 in American Psychopathological Association, Psychopathology of Schizophrenia. Edited by Paul Hoch and Joseph Zubin. New York: Grune.
LOEVENHART, ARTHUR S.; LORENZ, WILLIAM F.; and WATERS, RALPH M. 1929 Cerebral Stimulation. Journal of the American Medical Association 92:880-882.
MEDUNA, L. J. 1935 Versuche über die biologische Beeinflussung des Ablaufes der Schizophrenie. Part 1: Campher- und Cardiazolkrämpfe. Zeitschrift für die gesamte Neurologie und Psychiatrie 152:235-262.
Mental Deficiency and Phenylketonuria. 1961 Journal of the American Medical Association 178:838 only.
MYERSON, ABRAHAM 1936 Effect of Benzedrine Sulphate on Mood and Fatigue in Normal and Neurotic Persons. Archives of Neurology and Psychiatry 36:816-822.
PETRIE, ASENATH 1952 Personality and the Frontal Lobes: An Investigation of the Psychological Effects of Different Types of Leucotomy. Philadelphia: Blakiston.
RAGSDALE, N.; and KOCH, R. 1964 Phenylketonuria Detection and Therapy. American Journal of Nursing 64, no. 1:90-96.
RYLANDER, GÖSTA 1939 Personality Changes After Operations on the Frontal Lobes. Acta psychiatrica et neurologica Supplement 20.
SAKEL, MANFRED 1958 Schizophrenia. New York: Philosophical Library.
TAPPAN, PAUL W. 1951 Treatment of the Sex Offender in Denmark. American Journal of Psychiatry 108:241-249.
UNGER, SANFORD M. 1963 Mescaline, LSD, Psilocybin and Personality Change: A Review. Psychiatry 26:111-125.
WAGENEDER, F. H.; and HAFNER, H. 1965 Elektroheilschlaf: Eine neue Therapieform. Anaesthesist 14:126-129.
WAGNER-JAUREGG, JULIUS VON 1946 The History of the Malaria Treatment of General Paralysis. American Journal of Psychiatry 102:577-582. A translation by Walter L. Bruetsch of an earlier manuscript by Wagner-Jauregg.
What Tranquilizers Have Done. 1964 Time April 24:43-44.
WOOLLEY, D. W. et al. 1938 Anti-black Tongue Activity of Various Pyridine Derivatives. Journal of Biological Chemistry 124:715-723.
ZILBOORG, GREGORY 1941 A History of Medical Psychology. New York: Norton.

VI THE THERAPEUTIC COMMUNITY
The term "therapeutic community" designates a method of treatment which attempts to use a hospital's social environment as an integral part of the treatment approach. It belongs to the general approach often referred to as "milieu therapy." As such, it is related to the field of social psychiatry, which attempts to incorporate sociocultural perspectives into psychiatry. The first use of the term "therapeutic community" was by Thomas Main (1946) in his description of the experimental units developed during World War II in England for the treatment of various kinds of wartime psychological casualties. The felicitous phrase, mentioned almost casually in Main's report, was picked up and made into a self-conscious, organizing concept in the work of Maxwell Jones (1952), with whose name the concept is generally associated. Although the idea of the therapeutic community was forged in the exigencies of war, it was developed to its fullest elaboration in its postwar applications. In essence, it represents a critique of prior psychiatric theories and practices in that it advocates radical changes in psychiatric hospitals to make them therapeutic rather than merely custodial or even psychologically damaging, and it is based on a new combination of ideas that have emerged from psychoanalysis and the social sciences. Accordingly, it is important to consider the complex of ideas associated with the term "therapeutic community" in the context of an appreciation of its precursors both in and out of psychiatry in Europe and America. There are two sets of precursors: those against which the therapeutic community idea represented a protest and those from which it drew its own creative elements.

Characteristics of therapeutic communities

There have been numerous attempts to find a quintessential definition of the therapeutic community idea as it has evolved and been applied in various settings (e.g., Jones 1952; Jones & Rapoport 1955; Rapoport et al. 1961; Research Conference . . . 1960; Schwartz et al. 1964). A number of key ideas are distinguishable that differentiate the therapeutic community approach from other forms of milieu therapy or administrative psychiatry. Some or all of the elements described below are employed in units designated as therapeutic communities. In any given setting, these elements may be part of a focal treatment method, they may be secondary elements with other methods more focally applied, or they may be a part of the general background or atmosphere against which various methods may be applied. The holistic view. Conceptualizing the organization in system terms, according to a "holistic" view of the entire hospital or unit, is an integral part of most therapeutic community ventures. As the concept suggests, the hospital or unit is seen as a "community," of more or less closed corporate character. 
The emphasis is on understanding how behavior is affected by and affects the over-all functioning of the hospital as a social system. This element of the therapeutic community idea is taken to contrast with the tendency of the old-fashioned mental hospital (with its associated theories emphasizing constitutional determination and therapeutic pessimism) to regard individual patients in depersonalized, asocial, atomistic terms. By conceptualizing the hospital as a special type of community in which the patient can be helped, therapeutic community practitioners seek, at the very least, to sustain a social definition of the nature of the patients' conditions and potentialities. They
hold that this social definition is a prerequisite of therapeutic effectiveness. The new community in the hospital is seen as belatedly providing a "good" substitute for the earlier pathogenic environment in which the patients' socially unadaptive tendencies developed. Identification with the hospital community is seen as a steppingstone to identification with society at large. Permissiveness. In contrast with the restrictive environment of the old-fashioned mental hospital, which was prisonlike or even punitive in character, the therapeutic community allows patients to express themselves relatively freely, even if this means the enactment of behavior that would be morally repugnant in ordinary settings. The degree to which this "acting out" can be tolerated is, of course, limited by several factors: the capacity of the institution to endure disruptive behavior, the exercise by the staff of their responsibilities for the care and protection of their patients, and the degree to which the expressive behavior may be taken to be in the interests of therapy. But some degree of permissiveness has become a hallmark of the therapeutic community. Underlying the application of permissiveness is the theoretical viewpoint that psychiatric disorder is the manifestation of maladaptive responses to earlier situations which have formed a covert, often unconscious framework for the individual's contemporary behavior. Success in attempting to change the latter is seen as depending, to some extent, on uncovering the former. Permissive standards of patient role prescription thus create a deliberate, rather than a careless, set of ambiguities in the structuring of hospital role relationships. The hospital social structure can, in this context, be seen to act somewhat as a "social screen" onto which patients project their covert behavioral tendencies; these tendencies can then be subjected to group analysis. 
In addition, involvement of patients in one another's treatment responsibilities helps the staff to allow more permissiveness by providing a more diffuse base of social controls. Increasing patient participation. In old-fashioned mental hospitals the patients were treated as objects, led passive lives removed from ordinary social activities, and had little or no voice in the conduct of the affairs of the institution in which they lived. In therapeutic communities, by contrast, patient participation is encouraged. This trend seems to have several components: "democratization," which implies the increase of patient participation in policy decisions relating to the general administration of the organization (for example, by patient government); "egalitarianism," which
implies a reduction of status differentiation between staff and patients, with an emphasis on sharing of the facilities and resources of the institution (usually accompanied by the use of familiar terms of address, the forgoing of titles, uniforms, etc.); and "harnessing of patients for therapy" through attempts to use their intimate knowledge of one another, their communications, and insight potentials (for example, by the emphasis on group therapy sessions as a major treatment method). Broadening the base of therapy. In the therapeutic community a broader range of activities, relationships, and qualities of the patient's environment are considered relevant to the course of his treatment. Conventional medical thinking has always defined treatment in terms of what the doctor does: giving an injection, applying electrical shock therapy, or even subjecting the patient to "analysis" in a therapeutic "hour." Proponents of the therapeutic community, however, recognize that many experiences, relationships, and characteristics of the patient's life in the hospital can have a critical effect on his treatment. Activities previously thought of as purely diversionary or recreational have come to be seen as part of a program of treatment, and subordinate ranks of hospital personnel have become important links in the human communications network through which treatment is provided. Likewise, patient roles which had been seen as relatively unimportant or even troublesome, such as the leader of an informal patient clique, have become parts of the organizational and interactional process which therapeutic community practitioners seek to harness for therapy. Rehabilitation. Another feature of therapeutic communities is their orientation toward patient rehabilitation, which is based on an optimistic view of therapeutic potentialities. 
Therapeutic communities seek to reproduce within the hospital a microcosm of the ordinary world of the patient so as to enhance the possibilities for rehearsing social roles while still in the hospital. Thus there is an emphasis on training or retraining individuals to take social roles outside the hospital, not by forcing conformity to ordinary role requirements but, rather, by providing the opportunity for learning what kinds of problems interfere with the individual's capacity to perform acceptably in these roles. This emphasis is seen in the development of "realistic" workshops in the hospital and in the tendency to confront patients continually with others' perceptions of their behavior. This optimistic view of rehabilitation is reflected in what has come to be known as a "therapeutic atmosphere," which to some extent consists of a
series of new attitudes toward the patient and the hospital. These attitudes emphasize, for example, the positive elements in personality rather than the sick parts and thinking from the outset in terms of pathways back to a normal existence in the community rather than in terms of a long and hopeless removal in the artificial and impersonal life of the mental hospital. To some extent this "atmosphere" seems to have been made up of a set of attitudes held by the psychiatric leader: a sense of innovative change, high valuation of the work and people involved in it, optimism, and a feeling for the social significance of the therapeutic enterprise. There has been an impression by some observers (for example, see World Health Organization, Expert Committee on Mental Health 1953) that this charismatic quality of the leader contributes a major share to the success of therapeutic communities. To the extent that sophisticated practitioners of the therapeutic community approach have been aware of the importance of this sense of innovative change (akin to the "Hawthorne effect" in industry), they have sought to develop a continuing sense of challenge. They have looked to positive elements of the therapeutic community concept on which to build after many of the inequities of the old hospital system have given way to widespread reform.

History of the concept

Ingredients of the therapeutic community concept within psychiatry stem from the reformist stance taken over a century ago, when, in keeping with the trends emanating from the French Revolution, mental patients in country after country were brought under the benign aegis of medicine. Philippe Pinel in France, William Tuke in England, Vincenzo Chiarugi in Italy, Johann Reil in Germany, and Benjamin Rush in the United States were leaders among scores of hospital superintendents who attempted to redefine the ailments of mental patients. 
With the development of "moral treatment," physicians sought to treat psychologically disturbed individuals with compassionate understanding and close attention to their personal needs, thoughts, and feelings. The custodial system. The "moral treatment" approach declined in the latter part of the nineteenth century, to be replaced by a custodial, incarcerative system accompanied by deep-seated attitudes of therapeutic pessimism. The reasons for the change included the influx into urban areas of individuals who had few if any ties and who were disturbed and disfranchised through their experiences with urbanization and industrialization. Hospitals were overloaded, their patients were not integrated into the local communities, and as costs mounted, the tendency was to remove them to the outskirts of major urban concentrations. Patients were held there with prisonlike restraints for their own and society's security. Concurrently, a theory of psychopathology developed which attributed the more serious mental disorders (which were thought to be on the increase) to brain lesions, the cure of which was yet to be discovered by medicine. The fact that the moral treatment proponents had not developed a theory of etiology and therapy to underpin their efforts left them without a rationale to support a concerted program of care in the face of the new influx of intractable patients and competing etiological theories. Their effectiveness was based on their norms of personal conduct as genteel members of the relatively small, intimate communities of their times. Reformist movements. Thus a combination of factors, ranging from fiscal to theoretical, led to the build-up of large custodial mental hospitals, which were stocked with chronically disturbed, neglected mental patients and ill-trained, pessimistic staffs. The therapeutic community approach found its immediate impetus in the reaction against this situation. The reformers were motivated in part by the ancient injunction Primum non nocere; as Florence Nightingale put it, "It may seem a strange principle to enumerate as the very first requirement in a hospital that it should do the sick no harm." The forces of institutionalization and neglect, as much as the innate pathology of the individual patients, were increasingly seen as having played a part in bringing them to their predicament. As compared with their predecessors, the new reformist movements were better equipped by modern theory and research methods to implement new approaches to mental patient care. The new theoretical orientation stressed the growth potentialities of the mentally ill. 
One of the fortunate by-products of the Great Depression was a new recognition that individuals could not entirely control the social forces affecting their lives. This recognition negated the view that casualties of the social process were constitutionally defective. World War II led to the mobilization of new talents and energies, accompanied by a revised sense of federal responsibility for the care of the nation's casualties. The changes have been so great in the period following World War II, particularly in the fields of social psychiatry, that they have been termed by such partisans as Moreno (1934) and Dreikurs (1955) the "third revolution" in psychiatry, the first having been that associated with the early reformers and the second with the psychoanalytic movement.
Of particular relevance to the therapeutic community approach were the efforts of August Aichhorn in Austria, Jacob Moreno and Harry Stack Sullivan in America, Ernst Simmel in Germany, and Wilfred Bion and others associated with the Tavistock Clinic and Institute in England. Aichhorn's work (1925) was influential in applying the psychoanalytic conception of permissiveness to the administration of an adolescent treatment institution. His work has been carried on and developed in the United States by such men as Bettelheim (1950) and Redl (Redl & Wineman 1952); Moreno stressed the importance of "psychodrama," or the use of role playing for both diagnostic and therapeutic purposes. Harry Stack Sullivan's Interpersonal Theory of Psychiatry (1953) was a pioneering effort to revise psychoanalytic theory so that it would take sociocultural processes into account in therapy, and while he did not directly influence the early therapeutic community innovations, his work was important in the development of parallel efforts such as those of Rioch and Stanton (1959), Stanton and Schwartz (1954), and Artiss (1962) and in laying some of the groundwork for American acceptance of the therapeutic community idea. Ernst Simmel (1929) noted that the transference relationship, so vital to psychoanalytic therapy, could be developed in a hospital setting with reference to social roles and, therefore, be displaceable to some extent from one individual incumbent in the role to others. The Tavistock group, influenced particularly by Bion (1961), developed many of the notions of group dynamics that informed the efforts of Maxwell Jones and his colleagues with therapeutic community experiments. A prior effort that resembled the therapeutic community method and provided ingredients for its subsequent development as a well-formulated approach was the "total push" method of Abraham Myerson (1939). 
This was an eclectic approach which optimistically sought to harness whatever resources and methods were at hand to reorient the staff's activity into a more holistic attack on the problems of psychiatric rehabilitation. Ideology of the therapeutic community. It is interesting to consider the question of why such a movement, with its Utopian emphasis on the healing power of the community, should have developed at this point in history. It may be conjectured that the movement to establish therapeutic communities is essentially a reaction against the anomic by-products of rapid social change attendant on increasing industrialization and urbanization. It represents an attempt to restore what Edward Sapir referred to as a "genuine culture," at least within
the limited and more manageable sphere of the mental hospital world. The fact that the segregated mental hospital system displayed a remarkable degree of cultural lag and at the same time was associated with an idealistic, science-minded group of professional practitioners, gave unusual leverage for rapid implementation of this program once the conditions were favorable. The wartime mobilization of effort seems to have galvanized the profession to action that was directed toward reducing the cultural lag. The scientific ingredients of the new method were at hand, and the ideological emphasis on the corrective power of the tight-knit, intensively interacting community seems to be explainable in terms of a reaction against the anomic effects of modern society.

Social science and the therapeutic community

Social science has affected the development of therapeutic communities both indirectly, through the interest of innovating psychiatrists in social science concepts such as "culture" and "social structure," and directly, through the participation of social scientists in research on hospitals using this method of treatment. The Rapoport study. Although there were several prior social science studies on mental hospitals of various kinds (see Belknap 1956; Dunham & Weinberg 1960; Rowland 1938; and Henry 1954 on the old-fashioned mental hospital; Stanton & Schwartz 1954 and Caudill 1958 on psychoanalytically oriented hospitals), the first social science study of a hospital styling itself a therapeutic community was the study by Robert N. Rapoport and his associates (1961) of the British unit under Maxwell Jones. The latter had advocated setting up such a unit, as a result of his wartime experiences with soldiers suffering from war neuroses and from the difficulties of returning to normal social life after such deprivations as prolonged prison camp internment. 
When the unit became established, it gained a wide reputation as the first fully developed therapeutic community. In seeking to repeat their wartime successes with intensive resocialization methods, Jones and his staff developed a unit for the treatment of a variety of patients with problems of social maladjustment. They reported that their extension of the therapeutic community method was effective. Indeed, they advocated its application not only to the treatment of "psychopathic" personality disorders but also to the treatment of all sorts of behavioral adjustment problems, including those of imprisoned criminal offenders. Since the method was essentially one of harnessing the social processes of
institutional community life, the collaboration of a social scientist was sought. Rapoport and his colleagues, following Caudill (1958) in viewing the hospital as a form of small society, observed that the culture of this society emphasized four principal themes: permissiveness, democratization, communalism, and rehabilitation (through reality confrontation). Analysis of the program of activities and prescriptions for social roles in relation to these themes led to certain conclusions of a structural-functional nature. For example, one structural recommendation focused on the importance of incorporating into the formal ideology a conceptual distinction between "treatment" (measures aimed at improving the organization of the individual personality structure) and "rehabilitation" (measures aimed at improving the individual's adjustment to his social role relationships). This distinction was shown to be important in avoiding certain potential conflicts between overt and covert role prescriptions, among ideological themes in their spheres of possible contradiction, and between intrahospital and external norms for social behavior. One of Rapoport's functional recommendations was that treatment should concentrate on the hitherto relatively unrecognized process of "oscillations" in the state of over-all organization and functioning of the system. It was pointed out that all social systems are subject to variations in their state of organization and that systems with the properties of the unit are subject to particularly great swings in the state of "collective disturbance" due to their permissiveness, the properties of their patients, and their emphasis on maximizing intercommunication. The tendency in the unit, as among practitioners generally, was to view these swings toward states of great collective disturbance with alarm and to attempt to avoid them wherever possible. 
However, Rapoport found evidence to support the view that the oscillatory process could be therapeutically very useful if appropriately harnessed. In the stage of social reorganization following the critical turning point of maximum disorganization, patients were observed to involve themselves more meaningfully in the constructive social processes and thereby to learn modes of social adaptation which could serve them in their subsequent relationships. The technique of managing these processes in the interests of therapy was shown to involve an interest in and alertness to their special properties; thus the Rapoport study recommended an avoidance of such pitfalls as "collusive anxiety" (premature intervention and imposition of authoritative staff social controls)
and "collusive denial" (lack of recognition of the state of disorganization and consequently the failure to intervene at the social-psychologically critical point). In addition, an analysis was made of the "careers" of a cohort of patients, and several conclusions were drawn, principal among which were the following. There was a relationship between the patient's acceptance of unit culture (as measured by change in profile scores on the ideological themes) and his perceived improvement in the unit (as measured by the patient, by the physician, and by the nursing staff); moreover, acceptance of unit cultural norms (and thus manifestation of clinical improvement) was far less likely to occur among patients who left the unit in less than six months than among those who stayed at least six months. The persistence of improvement in social functioning in the community for a year following discharge was more likely to be seen among certain types of "improved" patients than among others. Married patients did better than unmarried, and patients whose dominant personality defenses did not involve aggressive behavior did well in the permissive atmosphere of the unit. However, the fact that patients suffer setbacks in relationships outside the unit following discharge points up the importance of being alert to cultural discontinuities between treatment unit and community; in therapeutic communities the sense of contrast with the segregated mental hospitals may tend to obscure the contrast between such communities and their surrounding cultural context. In conclusion, Rapoport and his colleagues listed 30 principles for the formation of therapeutic communities, attempting to formulate them in sufficiently flexible terms to make them adaptable to a wide range of therapeutic contexts (R. N. Rapoport et al. 1961). Holistic and segmental studies. 
In considering the place of the Rapoport study of the therapeutic community among social science studies of the mental hospital generally, a useful distinction might be made between holistic studies and segmental studies. The study by Belknap (1956) and some other earlier studies of the state mental hospital, as well as Goffman's study (1961), may be thought of as cases at the custodial extreme of the continuum (described by Greenblatt et al. 1955) which ranges from custodial to therapeutic care; the Rapoport study is a holistic analysis of a case at the therapeutic extreme. Brown and Wing (1962) present a study of three hospitals representing three points along the continuum and provide further evidence to support the contention that changes in
the over-all organization of the hospital are reflected in changes in patients' behavior. Segmental studies would be those which focus on part processes. Even the Stanton and Schwartz classic study, The Mental Hospital (1954), is essentially a collection of segmental analyses of part processes, most notable among which is the demonstration of a relationship between covert disagreement among staff members and clinical excitation of patients. Gilbert and Levinson (1956) are particularly concerned with the relationship between the espousal of a custodial ideology and certain personality types, notably the "authoritarian personality." The other form of segmental research is seen in the replication studies, such as that of Carstairs and Heron (1957), who studied a British mental hospital, using the same measures as Gilbert and Levinson (1956) and their colleagues; they found that in Britain, as in the United States, higher-status staff members are more likely than lower-status members to have a low "custodial ideology" score. Working with a greater cultural contrast, Stein and Oetting (1964) found that the culturally prescribed role of the physician in Latin America countervails this tendency for liberalism to be easier for higher-status, relatively disengaged people. The role of the psychiatrist in Latin America is still oriented to the more authoritarian norms of physician conduct (reminiscent of the earlier period in Europe and the United States), and therefore psychiatrists' scores on the "custodial ideology" measures were less differentiated from those of lower-echelon staff members than was found to be true in England and America. Another type of segmental study involves focal concentration on process. 
Many of the processes indicated in the earlier holistic studies (e.g., Caudill's "linked open systems," or "transactions," Rapoport's "oscillations," and Stanton and Schwartz's "covert disagreements") have become the focuses for subsequent studies seeking to replicate, refute, or extend their relevance into other contexts. Some segmental process studies seek further development of these earlier insights, particularly of the dynamics of inducing changes in hospital structures. For example, a study by Isabel Menzies (1960) seeks to elucidate a specific type of psychological barrier to the accomplishment of social changes. She concentrates her attention on the deep intrapersonal functions served by the conventional role prescriptions, such as those of the nursing role, and the built-in resistances of participants in the change process that work against their conscious wishes for change and modernization. Agnew and Hsu
(1960-1961) approach the problem of understanding and overcoming resistances to change by using social structure as a point of departure; they suggest that the "democratization" theme of therapeutic communities may be most applicable to the phase of steady functioning following the institutionalization of the new system. In order to break through the rigidities of the older system, a measure of authoritative behavior may succeed where the democratic mode would be rejected.
Clinical applications. Social science contributions of a more indirect kind can be seen in the clinical reports by psychiatrists on attempts to apply the therapeutic community idea to other contexts and in the course of so doing to evaluate and modify it to suit the circumstances (see Wilmer 1958; Scher 1958; Stainbrook 1955; Clark 1964). To some extent, the attention given to the therapeutic community idea in psychiatry has diminished as a consequence of the great development and immediate successes of new pharmacological treatments, even while the former was gaining general, if ancillary, acceptance. However, there is some reason to believe that interest in the therapeutic community dimensions of treatment will once again receive prominence as research reveals the limitations of a simplistic pharmacological approach (Klerman 1960).
Lessons of the therapeutic community
There are several ways in which the development of the therapeutic community emphasis in psychiatry may be seen as relevant to social science. Milieu therapists have provided opportunities for social scientists to observe the intimacies of an important form of institutional life that might otherwise have been inaccessible. Thus, the field of hospital studies and the entire field of comparative institutions have been enriched. Furthermore, the therapeutic community investigations have contributed to the already active trends in social science toward interdisciplinary collaboration. 
Epitomized in the work of Stanton and Schwartz (with their demonstration of the connection between structured conflict in the environment and emotional upset in the individual) and of Caudill (with his conception of "linked open systems"), research in the milieu therapy-oriented hospital demands interdisciplinary approaches. The subsequent work by Robert N. Rapoport and his colleagues (1961) in the more labile environment of an innovating, experimental therapeutic community provided opportunities to examine unusually fluid social systems. This research has fed into the general trend toward developing more processual modes of social science analysis and thus has become associated with numerous other approaches, such as general systems theories and the crisis theories. The experimental therapeutic communities were useful for such analyses because of their positive orientation toward flexibility and change and their relatively unstable functioning due to their tendency to de-emphasize authority hierarchies, to permit disruptive behavior by patients, and to encourage expressive communication. The resulting phenomenon, described in the mental hospital literature as "collective upsets," tends to be particularly notable in the therapeutic community. The "oscillatory tendency," as Rapoport termed it, was observed to have a discernible periodicity, to be affected by specific organizational events, and thus to have properties in common with other systems, as described, for example, in cybernetics. The oscillatory process was also observed to engender therapeutic potentials if properly harnessed, particularly in its phase of social reorganization. This interest in harnessing the energies that become available at critical turning points is shared by those social scientists who have been studying the process of critical role transitions (Rhona Rapoport 1963). The importance of ritual at times of transition in primitive societies has long been recognized by sociologists and anthropologists, notably Arnold van Gennep. In the more complex situations of modern secular society, the mechanisms used to cope with these status transitions are of a more deliberate, rational kind, aiming at adaptation to changing situations as well as accommodating existing needs and expectations. The processes of oscillation within a complex organizational framework can, in this sense, be seen as resembling the pattern of alternation between periods of stable functioning and critical transition followed by reorganization that characterizes the life cycle. 
From the viewpoint of the more analytic or fragmentary approaches, the quasi-experimental situation represented by the therapeutic community approach, particularly in its innovating stages, has been an attraction that has only begun to yield the kinds of results of which it is potentially capable.
Contemporary issues
As the therapeutic community concept has gained wider acceptance, the range of issues confronting social scientists in relation to research in this field has changed somewhat. There is still much to be desired by way of sheer evaluation of the effectiveness of the method; however, the types of research concern seem to be shifting. Rather
than asking what the therapeutic community is and how well it works, the questions are being posed more in terms of what aspects of the approach are most relevant for what types of persons under what conditions, including conditions of concurrent use of other forms of therapy. Furthermore, the possibility of the initial efficacy of the method as a novelty stimulus—akin to the medical "placebo effect" or the "Hawthorne effect" as recognized in industrial research—has relevance not only for evaluation of the method but also for the type of interest which it has for social scientists. Many of the early social scientists were interested in it as an innovating experiment with some of the characteristics of a Utopian reformist movement. However, as the method has gained acceptance and has become to some extent routinized within the psychiatric profession, its appeal for social scientists has changed. The emphasis has shifted, to some extent, from the more macrosociological or holistic, anthropological type of approach to the more structured, quasi-experimental approaches that are more characteristic of the social psychologist. However, the holistic researcher still has scope for analyzing the range of problems associated with application of the concept in different subcultural and structural situations—in large state mental hospitals, prisons, delinquent groups, depressed slum neighborhoods, schools, and industrial work groups. The concept can also be applied to different national and cultural settings, and to functional processes related to the persistence of innovations. In the context of these new and contrasting over-all situations, analyses will be fruitful on both the holistic level and on the level of part processes. 
Such issues as optimal size of the hospital unit, degree of social differentiation, type of authority structure, degree of interlinkage of subsystems, and flexibility versus fixity of value hierarchy can be tested in various contexts in relation to therapeutic effectiveness. The issues involved in doing systematic evaluative research in this field have hardly been broached and present a major challenge. On the side of implications for social theory, one can only note a great hiatus in work already done. Goffman's linking of the old-fashioned mental hospital to the larger class of "total institutions" (1961) is the most creative effort available in the hospital research field, but it relates not to therapeutic communities but, rather, to the polarity against which they are reactions. Therapeutic communities of the future will probably turn out to be far more differentiated and can therefore be expected to provide materials for understanding many kinds of dynamic processes. It would seem that their contribution to social science might be expected to lie in two spheres: first, the reciprocal relationship between personality and social structure, and second, the relationship between stability and structure on the one hand, and fluidity and change on the other, in the functioning of institutions designed to "process" a continuous flow of people while the organization maintains continuity and reliable functioning. These are challenges faced by social scientists in increasingly numerous fields of investigation, and the degree to which the therapeutic community will be a fruitful arena for investigation of these issues will depend on a complex of many factors other than the intrinsic interest which it presents.
ROBERT N. RAPOPORT
BIBLIOGRAPHY
AGNEW, PAUL C.; and HSU, FRANCIS L. K. 1960-1961 Introducing Change in a Mental Hospital. Human Organization 19:195-198.
AICHHORN, AUGUST (1925) 1935 Wayward Youth. New York: Viking. -> First published as Verwahrloste Jugend.
ARTISS, KENNETH 1962 Milieu Therapy in Schizophrenia. New York: Grune & Stratton.
BELKNAP, IVAN 1956 Human Problems of a State Mental Hospital. New York: McGraw-Hill.
BETTELHEIM, BRUNO 1950 Love Is Not Enough: The Treatment of Emotionally Disturbed Children. Glencoe, Ill.: Free Press.
BIERER, JOSHUA 1960 Past, Present and Future. International Journal of Social Psychiatry 6, no. 1/2:165-173.
BION, WILFRED 1961 Experiences in Groups, and Other Papers. New York: Basic Books.
BROWN, ESTHER L. 1961-1964 Newer Dimensions of Patient Care. 3 vols. New York: Russell Sage Foundation. -> Volume 1: The Use of the Physical and Social Environment of the General Hospital for Therapeutic Purposes. Volume 2: Improving Staff Motivation and Competence in the General Hospital. Volume 3: Patients as People.
BROWN, G. W.; and WING, J. K. 1962 A Comparative Clinical and Social Survey of Three Mental Hospitals. Pages 145-168 in Paul Halmos (editor), Sociology and Medicine. Sociological Review Monograph No. 5. Univ. of Keele (England).
CARSTAIRS, G. M.; and HERON, ALASTAIR 1957 The Social Environment of Mental Patients: A Measure of Staff Attitudes. Pages 218-230 in Milton Greenblatt et al. (editors), The Patient and the Mental Hospital. Glencoe, Ill.: Free Press.
CAUDILL, WILLIAM 1958 The Psychiatric Hospital as a Small Society. Cambridge, Mass.: Harvard Univ. Press.
CLARK, DAVID 1964 Administrative Therapy: The Role of the Doctor in the Therapeutic Community. London: Tavistock.
CONFERENCE ON COMMUNITY MENTAL HEALTH RESEARCH, THIRD, WASHINGTON UNIVERSITY, ST. LOUIS, 1961 1964 The Psychiatric Hospital as a Social System: Proceedings. Edited by Albert F. Wessen. Springfield, Ill.: Thomas.
DREIKURS, RUDOLF 1955 Group Psychotherapy and the Third Revolution in Psychiatry. International Journal of Social Psychiatry 1, no. 3:23-32.
DUNHAM, H. WARREN; and WEINBERG, S. KIRSON 1960 The Culture of the State Mental Hospital. Detroit, Mich.: Wayne State Univ. Press.
GILBERT, DORIS; and LEVINSON, DANIEL 1956 Ideology, Personality and Institutional Policy in the Mental Hospital. Journal of Abnormal and Social Psychology 53:263-271.
GOFFMAN, ERVING (1961) 1962 Asylums: Essays on the Social Situation of Mental Patients and Other Inmates. Chicago: Aldine.
GREENBLATT, MILTON; LEVINSON, DANIEL; and WILLIAMS, RICHARD (editors) 1957 The Patient and the Mental Hospital: Contributions of Research in the Science of Social Behavior. Glencoe, Ill.: Free Press.
GREENBLATT, MILTON; YORK, RICHARD H.; and BROWN, ESTHER L. 1955 From Custodial to Therapeutic Patient Care in Mental Hospitals. New York: Russell Sage Foundation.
HAMBURG, DAVID A. 1958 Therapeutic Hospital Environments: Experience in a General Hospital and Problems for Research. Pages 479-491 in Symposium on Preventive and Social Psychiatry, Walter Reed Army Institute of Research, April 1957. Washington: Government Printing Office.
HENRY, JULES 1954 The Formal Social Structure of a Psychiatric Hospital. Psychiatry 17:139-151.
JONES, MAXWELL (1952) 1953 The Therapeutic Community: A New Treatment Method in Psychiatry. New York: Basic Books. -> First published as Social Psychiatry: A Study of Therapeutic Communities.
JONES, MAXWELL; and RAPOPORT, ROBERT N. 1955 Administrative and Social Psychiatry. Lancet [1955], no. 2:386-388.
KLERMAN, GERALD L. 1960 Staff Attitudes, Decision-making, and the Use of Drug Therapy in the Mental Hospital. Pages 191-214 in Research Conference on the Therapeutic Community, Manhattan State Hospital, Ward's Island, N.Y., 1959, Proceedings of the Conference. Edited by Herman Denber. Springfield, Ill.: Thomas. -> Includes 2 pages of discussion.
LEVINSON, DANIEL; and GALLAGHER, EUGENE 1964 Patienthood in a Psychiatric Hospital: An Analysis of Role, Personality, and Social Structure. Boston: Houghton Mifflin.
MAIN, T. F. 1946 The Hospital as a Therapeutic Institution. Menninger Clinic, Bulletin 10:66-70.
MENZIES, ISABEL 1960 A Case-study in the Functioning of Social Systems as a Defense Against Anxiety. Human Relations 13:95-122.
MORENO, JACOB L. (1934) 1953 Who Shall Survive? Foundations of Sociometry, Group Psychotherapy and Sociodrama. Rev. & enl. ed. Beacon, N.Y.: Beacon House.
MYERSON, ABRAHAM 1939 Theory and Principles of the "Total Push" Method in the Treatment of Chronic Schizophrenia. American Journal of Psychiatry 95:1197-1204.
RAPOPORT, RHONA 1963 Normal Crises, Family Structure and Mental Health. Family Process 2:68-80.
RAPOPORT, ROBERT N. et al. 1961 Community as Doctor: New Perspectives on a Therapeutic Community. Springfield, Ill.: Thomas.
RAPOPORT, ROBERT N.; and RAPOPORT, RHONA 1957 Democratization and Authority in a Therapeutic Community. Behavioral Science 2:128-133.
RAPOPORT, ROBERT N.; and RAPOPORT, RHONA 1959 Permissiveness and Treatment in a Therapeutic Community. Psychiatry 22:57-64.
REDL, FRITZ; and WINEMAN, DAVID (1951) 1964 Children Who Hate. New York: Free Press.
REDL, FRITZ; and WINEMAN, DAVID 1952 Controls From Within. Glencoe, Ill.: Free Press.
RESEARCH CONFERENCE ON THERAPEUTIC COMMUNITY, MANHATTAN STATE HOSPITAL, WARD'S ISLAND, N.Y., 1959 1960 Proceedings of the Conference. Edited by Herman Denber. Springfield, Ill.: Thomas.
RIOCH, DAVID; and STANTON, ALFRED 1959 Milieu Therapy. Psychiatry 22:65-72.
ROWLAND, HOWARD 1938 Interaction Processes in the State Mental Hospital. Psychiatry 1:323-337.
SCHER, JORDAN M. 1958 The Structured Ward: Research Method and Hypothesis in a Total Treatment Setting for Schizophrenia. American Journal of Orthopsychiatry 28:291-299.
SCHWARTZ, MORRIS S. et al. 1964 Social Approaches to Mental Patient Care. New York: Columbia Univ. Press.
SIMMEL, ERNST 1929 Psycho-analytic Treatment in a Sanatorium. International Journal of Psycho-analysis 10:70-89.
SIVADON, PAUL 1958 Technics of Sociotherapy. Pages 457-464 in Symposium on Preventive and Social Psychiatry, Walter Reed Army Institute of Research, April 1957. Washington: Government Printing Office.
STAINBROOK, EDWARD 1955 The Hospital as a Therapeutic Community. Neuropsychiatry 3:69-87.
STANTON, ALFRED H.; and SCHWARTZ, MORRIS S. 1954 The Mental Hospital: A Study of Institutional Participation in Psychiatric Illness and Treatment. New York: Basic Books.
STEIN, WILLIAM; and OETTING, E. R. 1964 Humanism and Custodialism in a Peruvian Mental Hospital. Human Organization 23:278-282.
SULLIVAN, HARRY STACK 1953 The Interpersonal Theory of Psychiatry. Edited by Helen Swick Perry and Mary Ladd Gawel. New York: Norton.
WILMER, HARRY A. 1958 Social Psychiatry in Action: A Therapeutic Community. Springfield, Ill.: Thomas.
WORLD HEALTH ORGANIZATION, EXPERT COMMITTEE ON MENTAL HEALTH 1953 Report, Third. Technical Report Series, No. 73. Geneva: World Health Organization.
MENTAL HEALTH
i. THE CONCEPT Morris S. Schwartz and Charlotte Green Schwartz
ii. SOCIAL CLASS AND PERSONAL ADJUSTMENT William H. Sewell
THE CONCEPT
The meaning of the term "mental health" is ambiguous; not only is it difficult to agree on its general application, but even in a single context it may be used in many different ways. This lack of agreement will probably continue because the term has been adopted for a variety of purposes. One conclusion, however, can be reached: "mental health" is not a precise term but an intuitively apprehended idea that is striving for scientific status while also serving as an ideological label.
Problems of definition
The word "mental" usually implies something more than the purely cerebral functioning of a person; it also stands for his emotional-affective states, the relationships he establishes with others, and a quite general quality that might be called his equilibrium in his sociocultural context. Similarly, "health" refers to more than physical health: it also connotes the individual's intrapsychic balance, the fit of his psychic structure with the external environment, and his social functioning. It is not surprising that the combination of two such terms produces an elastic and ambiguous concept. Another ambiguity attends this phrase. In common usage "mental health" often means both psychological well-being and mental illness. Definitions obviously vary with the perspective of the definers, the point of reference used, and the values considered important. Thus, the psychoanalytic perspective focuses on the intrapsychic life of the individual. Freud defined mental health in his programmatic statement: "Where id was, there shall ego be" (1932, p. 112). Here the value is awareness of unconscious motivations and self-control based upon these insights. The interpersonal frame of reference, on the other hand, is more concerned with the functioning of individuals in interpersonal situations. Sullivan identifies a person's drive toward mental health as those "processes which tend to improve his efficiency as a human being, his satisfactions, and his success in living" (1954, p. 106) and places major value on effective and efficient social functioning. The social relatedness perspective is exemplified by Fromm, who focuses on the individual's relationship with the larger social environment. 
The mentally healthy person is the productive and unalienated person; the person who relates himself to the world lovingly, and who uses his reason to grasp reality objectively; who experiences himself as a unique individual entity, and at the same time feels one with his fellow man; who is not subject to irrational authority, and accepts willingly the rational authority of conscience and reason; who is in the process of being born as long as he is alive, and considers the gift of life the most precious chance he has. ([1955] 1959, p. 275)
Here the values are humanism, individualism, freedom, and rationality. The most comprehensive and definitive summary of the multiplicity of criteria used in defining mental health is that of Jahoda (1958). She rules out certain criteria as unsuitable because they are unsatisfactory for research purposes. "Absence of disease," for instance, is rejected as a criterion, not only because of the difficulty in circumscribing disease but also because common usage of the term "mental health" now includes something more than the mere absence of a negative value. "Statistical normality" is also considered unsuitable on the grounds that the term is unspecific, bare of content, and fails to come to grips with the question. Finally, "happiness" and "well-being" are ruled out because they involve external circumstances as well as individual functioning. Jahoda then summarizes what are to her the acceptable sets of criteria in current use. These are attitudes toward the self, which include accessibility of the self to consciousness, correctness of the self concept, feelings about the self concept (self-acceptance), and a sense of identity; growth, development, and self-actualization, which include conceptions of self, motivational processes, and investment in living; integration, which refers to the balance of psychic forces in the individual, a unifying outlook on life, and resistance to stress; autonomy, which refers to the decision-making process, regulation from within, and independent action; undistorted perception of reality, including empathy or social sensitivity; environmental mastery, including the ability to love, adequacy in interpersonal relations, efficiency in meeting situation requirements, capacity for adaptation and adjustment, efficiency in problem solving, and adequacy in love, work, and play. Since Jahoda's statement is a summary and not an attempt to integrate the criteria currently used in defining or identifying mental health, various difficulties (many recognized and discussed by her) attend her presentation. 
The criteria are overlapping, and the relationship between criteria is not spelled out (for example, the degree to which they are independent). Moreover, no method is indicated for identifying satisfactory indexes for the criteria, thus making it impossible to measure the degree of a particular criterion or even to discover its presence or absence. Ambiguities and different levels of specificity characterize the different criteria, and the impact of the social situation and the relevance of the society as context criterion are largely ignored. Jahoda does not attempt a solution for these difficulties. She simply recognizes the impossibility of arriving at a "correct" definition and of attaining a consensus, because values underlie the definitions proposed and because the concept is used for different purposes. Jahoda's analysis of mental health as a concept deals mainly with the problems it poses for the empirical researcher: whether, and if so how far, the various criteria can be integrated into one criterion or a set of criteria; the kinds of criteria that are required by different definitions; whether and how one might distinguish between "optimal" and "maximal" mental health; and operationalizing the definitions used. She deals minimally with the approach that the student of society would take: the meaning of this concept in society, its various functions, the ways in which it constitutes and expresses societal values, and the nature of the kinds of social environments that influence a person's psychological well-being. Nevertheless, her work represents the best summary of the current major definitions and the controversy connected with them.
Aspects of the mental health controversy
Discussions of the concept of mental health naturally reflect the interests of the principal groups involved in the mental health movement. One of the leading issues is whether "mental health" and "mental illness" should be conceptualized on the same continuum or on different continua that cut across each other. The conventional medical view holds that mental health is the absence of mental illness, that both terms represent the extreme ends of the same continuum, and that the difference between the two states is one of degree. A contrary view is that mental health is qualitatively different from mental illness and that a person can be both mentally healthy and mentally ill at the same time. Jahoda, as an advocate of the concept of "positive mental health," maintains that the absence of certain qualities does not imply the presence of others. 
For example, the absence of hallucinations does not imply the presence of accurate self-appraisal; conversely, the presence of creativity does not exclude the presence of severe anxiety. But if mental health and mental illness are placed on different continua, then it becomes necessary to specify their relationship. For this reason, Conrad (1952) has suggested that "negative health," or the absence of pathology, be used as an interstitial term. A related issue is whether mental health is to be seen as a relatively constant and enduring function of personality or as a momentary function of person and situation. For instance, Klein (1960) distinguishes "soundness" from "well-being": the former refers to the level of integration of the general, more enduring personality structure, and the latter to the individual's current state of equilibrium. This distinction may be a useful way of identifying two different kinds of mental health. There also are differences of opinion on whether the concept of mental health is ever value-free. Some authors, medically oriented professionals, view psychological health as analogous to physical health, which, they maintain, can be evaluated by objective medical standards, without regard to the patient's sociocultural context. Another view maintaining that mental health is a value-free concept equates it with the statistically normal: mentally healthy behavior is that which is considered average or conventional behavior for a particular population. Here, good mental health is evaluated in terms of adjustment to and acceptance of current societal norms. Clearly, these criteria are not value-free. Indeed, many students of the field maintain that criteria of mental health cannot be established in complete independence from the particular values and ideology of the society or group in which they are formulated and applied. According to this view, the study of definitions of mental health becomes a branch of the sociology of knowledge. But such an approach, although sociologically meaningful, cannot settle the question of which criteria are the most useful for therapy and mental health research. Some of those who maintain that all definitions of mental health are culture-bound hold that multiple criteria should be used, depending upon the values cherished by each society or subculture. Thus, criteria for mental health in the lower classes may have to be different from those for the middle classes, and those for citizens of Japan would have to differ from those for India or the United States. The issue here is that of the relation of the mental health of a person to the nature of the society in which he lives. 
Although this issue is rarely discussed, its clarification and resolution are critical in identifying the field of interrelated variables that are relevant to the study of mental health. What is needed is nothing less than a complete theory of the relation between the individual and society. Other students of the field hold that the criteria for mental health, though value-laden, can transcend situational or cultural boundaries and that an area of general value consensus can be arrived at. For example, M. B. Smith has suggested that universal criteria for mental health might be "identified with the stability, resilience, and viability— in a word the system properties—of these external and internal subsystems of personality" (1959,
pp. 680-681). Similarly, Fromm (1955) insists that criteria for mental health must be based on some concept of a universal human nature rather than on the values of particular cultures or societies. In summary, mental health can be viewed either as an ideal-type concept or as an empirical construct referring to a state that actually occurs. In the former view, mental health is an ideal to be striven for but never fully attained; it serves, however, as a standard against which to measure any particular individual. In the latter view, mental health is realistically attainable, though there is much dispute about the frequency with which it is encountered.
Mental health as a movement and a profession
The emergence of the concept of mental health is closely related to the growth of the mental hygiene movement in the United States and to the development of psychotherapeutic practice and personality research. As an explanatory construct, "mental health" emerged out of the concern with "mental hygiene" that gained its first adherents at the beginning of the twentieth century. Originally, this social movement focused on improving the wretched conditions in mental hospitals and providing better care and treatment for the mentally ill wherever they might be. In the 1920s interest shifted to promoting "mental hygiene" and establishing child-guidance clinics. The term "mental health" began to replace "mental hygiene" in the 1930s, and by the late 1940s it assumed an independent status with a growing and enthusiastic social movement operating in its name. This shift in terms signified the beginning of the era of concern with the prevention of mental disorders rather than merely care and treatment and the broadening of focus to include all forms of social and psychological maladjustment rather than just the severely emotionally disturbed or psychotic. The movement began to promote "positive" mental health as a goal distinct from the elimination of mental illness. 
The popularity of mental health as a desired value in the United States is in part related to its advocacy by those in the mental health movement and in part to the growth of psychoanalytic theory and acceptance of psychotherapeutic practice in the past several decades. The orthodox psychoanalytic viewpoint that mental health is a property of individuals and a function of intrapsychic development and dynamics is still dominant. It maintains that an individual acquires good mental health as a consequence of fortunate early socialization; psy-
choanalysis or some other form of psychotherapy is a corrective for unfortunate early development. Thus, the individual remains the unit of analysis, and psychological health is seen as a function of the individual's unique, private intrapsychic development and life history. Subsequently, the unit of analysis was extended to include the patterning of an individual's interpersonal relations. Recently, another view of mental health was put forward by the proponents of social psychiatry [see PSYCHIATRY, article on SOCIAL PSYCHIATRY]. Only a few authors, such as Fromm (1955) and Frank (1948), take a comprehensive view of mental health as a function of the total society—its dominant ideologies, assumptions, norms, values, institutions, and general style of life. Such a perspective is largely ignored or considered irrelevant by the great majority of ideologists, practitioners, and researchers in the field of mental health. Ideologists, practitioners, researchers. Action in the name of mental health has occasioned the development of three distinct groups whose membership may overlap but whose interests and functions are separable: they can be called the ideologists, the practitioners, and the researchers. The ideologists are primarily interested in promoting psychological well-being as a value and in encouraging action to prevent and eliminate mental illness. Well-developed mental health organizations, both private and public, now exist in the United States at the national, state, and local levels. In 1960 the National Association for Mental Health reported that, in addition to the state mental health associations, there were some eight hundred affiliated local mental health associations in 42 states, with a total registered membership and volunteer participation exceeding one million persons (Ridenour 1963). 
In addition, a network of federal governmental agencies, led by the National Institute of Mental Health (NIMH), spent a vast sum for research, training, education, demonstration, and the building and development of treatment facilities (during the fiscal year 1964/1965 the NIMH alone spent over $200 million). The NIMH also maintains links with the privately sponsored branches of the mental health movement. In addition to the federal government, each state and many cities and counties have a department of mental health or a mental health officer. Private and governmental agencies often join with practitioners to educate the public about mental illness and health, to urge persons to become concerned about their own and others' psychological health, and to collect funds for research.
The importance of the mental health movement has enhanced the prestige and power of its practitioners, who range from psychoanalysts to marriage counselors. They have gradually widened their sphere of operation and now function in institutions such as schools, courts, and industry. Although many of their activities are undertaken in the name of mental health, little work is directed toward mental health as distinct from mental illness. Primarily, their concern is treatment; secondarily, it is research; it is only minimally prevention. The interests of researchers in mental health span the entire range of human behavior from circumscribed biochemical problems to existential problems of living. Despite the increasing number of research projects over the past decade, etiological problems remain unsolved and the field awaits conceptual clarification.
Mental health and American values
The mental health concept is related to current and traditional American values in three ways. First, it reflects and embodies many of these values; second, it functions to preserve certain of them; finally, it is a highly valued end in itself. In fact, mental health has become so esteemed that in some circles it has taken on the characteristics of a secular religion. In the twentieth century, human health is prized as it has been in no other. In the United States, in particular, we have moved from valuing sheer physical health to cherishing the psychological well-being of the total person. In pursuing these goals, we have relied on medicine, psychology, and social science to produce more valid knowledge and techniques with which to serve this value. Science and medicine, in turn, are values that are used to promote psychological health as a social and ethical goal.
Thus, the importance of health, the faith in science and medicine, the reliance on technology to produce means for the ends declared desirable by experts, and the development of professional skill and specialization as attributes of the technology all combine to maintain and reinforce mental health as a value. The high degree of acceptance of this value also seems related to its congruence with the Protestant ethic. Kingsley Davis (1938) has suggested that the mental health movement took over the Protestant ethic as a system of conscious preachment and unconscious premises and that it bases itself upon much the same values. But we suggest that the movement has done more than take over the Protestant ethic: it has dressed it up in a modern scientific cloak. Thus it serves as a new ideology that
recommends, in nonreligious, quasi-scientific terms, a way of dealing with personal troubles and anxieties without the necessity for becoming involved in broader social issues or societal reconstruction. In any case, its popularity among middle-class, college-educated Americans cannot be denied. For some ideologists of the movement, "mental health" has become a mystique and a secular religion. Dicks, for example, proposes that it be conceived of as a new value in our world that is "comparable to the notions of 'finding God,' 'salvation,' 'perfection' or 'progress' which have inspired various eras of our history, as master-values which at the same time implied a way of life. . . . Some of the attributes of a secular priesthood or therapeutae are attached to us, and it is questionable whether we ought to divest ourselves of them even if the community would let us" (1950, pp. 3-4). Thus, for the mental health enthusiast, "mental health" becomes the standard for evaluating human behavior. Further, the mental health idea implies a new conception of moral and social progress in the form of self-correctability, self-perfectibility, inner growth, personal fulfillment, and inward and outward harmony, or the like. We are told that in the same way that we have achieved physical comfort, through the instrumental application of knowledge and understanding, we can achieve psychological mastery over the self. This idea of progress embodies a new conception of success. No longer is it sufficient to measure achievement in tangible coin; we are persuaded to evaluate ourselves in terms of self-development and maturity. But there are no clear guidelines as to the means of reaching this goal or even to knowing that one has reached it. Orientations toward mental health. Orientations toward mental health as a desirable objective, as a subject matter, and as a field of work, knowledge, and inquiry oscillate between two poles.
On the one hand, mental health is seen as a restricted and circumscribed "state of being" and as the subject matter of a field of work that is a specialty among other specialties. The individual or his immediate social environment is the unit for analysis, attempted control, and change. On the other hand, mental health is seen as the sum total of the individual personality, and the field of work associated with it is a superordinate, all-inclusive science of man. In the more restricted orientation, the acquisition of mental health is viewed as a technical problem that is to be solved under the direction and leadership of experts. Mental health technology is seen as being contained in and developed and
transmitted by practitioners who claim special skills and expertise and who are legitimated by the society as the vehicle for the ethical application of knowledge about mental health. Operational techniques and procedures are established, and frames of reference and explanatory theories are developed and fiercely adhered to. In general this orientation stresses the separateness of persons and encourages them to seek inner tranquility and self-actualization on a private basis; psychological well-being is seen as a function of personality dynamics, which, in turn, are supposed to be primarily a function of early experience and only secondarily of later interpersonal relations. [For an approach that stresses primarily social factors, see PSYCHIATRY, article on SOCIAL PSYCHIATRY.] By contrast, those who take an all-encompassing view of mental health phenomena claim as their province the entire range of human thought and behavior; they believe that the human panorama is to be interpreted within the mental health framework rather than vice versa. These contrasting orientations have different advantages and disadvantages in achieving mental health objectives. The psychotherapeutic orientation is far more specific about the nature of the phenomena to be affected, be they biochemical, individual, or social; it therefore affords greater opportunity for intervention and control. However, by restricting the variables to be dealt with, it may neglect significant and, perhaps, crucial phenomena. By contrast, the broader orientation opens up greater possibilities of discovering the various interconnections between the variables involved. However, its very diffuseness and scope make it a poor guide for scientific research or social action. The functions of mental health ideology. The mental health ideology and movement function, in general, whether deliberately or inadvertently, to preserve and enhance certain values in American society. 
Outstanding among these is the humanistic value that emphasizes the importance of the individual as well as his development and fulfillment. Thus, the mental health movement contributes to and reinforces certain aspects of American democratic ideals and also promotes a form of "inwardness" by emphasizing introspection and self-awareness. By focusing on changing the individual rather than the society, the mental health movement directs effort away from social reconstruction and thereby functions to preserve the status quo and those middle-class values that are an intrinsic part of it. This is not to deny that some practitioners use the mental health idea as a vehicle for achieving social reform; but they are interested
only in specific social changes which they hope to effect in the name of mental health, such as changes in child-rearing practices in the family or in the ways in which students are handled in public school. For the ideologists, the conception provides a Weltanschauung of self-betterment to which they can devote themselves at a time when sociopolitical ideologies are unfashionable in the United States. Thus mental health is put forward as the panacea for all social problems and for the wholesale improvement of mankind. For the practitioner, on the other hand, the concept of mental health usually serves as a goal, albeit an ambiguous one, against which he can measure the current functioning of his patients and toward which he can direct his and their efforts; it is an implicit or explicit standard against which he measures the success and failure of his efforts and those of his colleagues.
Problems for the future
Despite the expansion of the mental health movement and the prestige of the professionals involved with it, little is known about how to achieve mental health. Moreover, the mechanisms for applying this meager knowledge and effecting the ends sought are extremely inadequate. Of the many issues that need resolution, three are central. The first is the necessity for conceptual extension beyond the individual intrapsychic life, interpersonal relations, and limited social contexts. For no matter how sophisticated, discerning, or scientific is our understanding of human beings as individuals, this framework is insufficient for understanding mental health, which also needs to be seen as a function of social roles, institutions, and communities. The second problem concerns this very scope of the mental health conception, which, because it involves a number of aspects of human living, demands an integration of the biochemical, psychological, social, and philosophical disciplines that is not yet in sight.
The third problem involves the difficulties in intervention, implementation, and control that would remain even if conceptual expansion and the integration of relevant disciplines were achieved. Even if mental health can be achieved by rational planning, how much planning of this kind is desirable? Would it not threaten other cherished values, or have consequences that we cannot now foresee? From one perspective, the problem of mental health is identical with the eternal question of how to lead the good life. Perhaps this is not subject matter for academic disciplines, whether they be expanded or integrated, but rather
an emergent from the human condition, in its infinite complexity, only a part of which can be planned for. Perhaps we need to raise the issue of how much mental health can be achieved by science and planning. It may be that the ultimate goal of positive mental health for all will continue to elude us as one of our persistent human limitations.
MORRIS S. SCHWARTZ AND CHARLOTTE GREEN SCHWARTZ
[Directly related are the entries on HEALTH; ILLNESS; LIFE CYCLE; MENTAL DISORDERS, TREATMENT OF, article on THE THERAPEUTIC COMMUNITY; PSYCHIATRY, article on SOCIAL PSYCHIATRY; PSYCHOANALYSIS. Other material relevant to the concept of mental health may be found in MENTAL DISORDERS; and in the biographies of FREUD; RANK; REICH; SULLIVAN.]
BIBLIOGRAPHY
CAPLAN, GERALD 1964 Principles of Preventive Psychiatry. New York: Basic Books.
CLAUSEN, JOHN A. 1956 Sociology and the Field of Mental Health. New York: Russell Sage Foundation.
CONRAD, DOROTHY C. 1952 Toward a More Productive Concept of Mental Health. Mental Hygiene 36:456-473.
DAVIS, KINGSLEY 1938 Mental Hygiene and the Class Structure. Psychiatry 1:55-65.
DICKS, HENRY V. 1950 In Search of Our Proper Ethic. British Journal of Medical Psychology 23:1-14.
EATON, JOSEPH W. 1951 The Assessment of Mental Health. American Journal of Psychiatry 108:81-90.
Educational Practices. 1960 Pages 111-170 in Pennsylvania Mental Health, Incorporated, Mental Health Education: A Critique. Philadelphia: The Corporation.
FELIX, ROBERT H. 1957 Evolution of Community Mental Health Concepts. American Journal of Psychiatry 113:673-679.
FRANK, LAWRENCE K. 1948 Society as the Patient: Essays on Culture and Personality. New Brunswick, N.J.: Rutgers Univ. Press.
FRANK, LAWRENCE K. 1953 The Promotion of Mental Health. American Academy of Political and Social Science, Annals 286:167-174.
FREUD, SIGMUND (1932) 1965 New Introductory Lectures on Psycho-analysis. New York: Norton. → First published as Neue Folge der Vorlesungen zur Einführung in die Psychoanalyse.
FROMM, ERICH 1947 Man for Himself: An Inquiry Into the Psychology of Ethics. New York: Holt.
FROMM, ERICH (1955) 1959 The Sane Society. New York: Holt.
GINSBURG, SOL W. 1955 The Mental Health Movement: Its Theoretical Assumptions. Pages 1-29 in Ruth Kotinsky and Helen Witmer (editors), Community Programs for Mental Health: Theory-Practice Evaluation. Cambridge, Mass.: Harvard Univ. Press.
GURSSLIN, O. R.; HUNT, R. G.; and ROACH, J. L. 1959-1960 Social Class and the Mental Health Movement. Social Problems 7:210-218.
HARTMANN, HEINZ 1939 Psychoanalysis and the Concept of Health. International Journal of Psycho-analysis 20:308-321.
JAHODA, MARIE 1955 Toward a Social Psychology of Mental Health. Pages 296-322 in Ruth Kotinsky and Helen Witmer (editors), Community Programs for Mental Health: Theory-Practice Evaluation. Cambridge, Mass.: Harvard Univ. Press.
JAHODA, MARIE 1958 Current Concepts of Positive Mental Health. Joint Commission on Mental Illness and Health, Monograph Series, No. 1. New York: Basic Books.
JAHODA, MARIE 1963 Mental Health. Volume 3, pages 1067-1079 in Encyclopedia of Mental Health. New York: Watts.
KLEIN, DONALD C. 1960 Some Concepts Concerning the Mental Health of the Individual. Journal of Consulting Psychology 24:288-293.
LEUBA, CLARENCE 1960 The Mental Health Concept. American Psychologist 15:554-555.
LEWIS, AUBREY 1953 Health as a Social Concept. British Journal of Sociology 4:109-124.
MASLOW, ABRAHAM H. 1962 Toward a Psychology of Being. Princeton, N.J.: Van Nostrand.
THE MIDTOWN MANHATTAN STUDY 1962 Mental Health in the Metropolis: The Midtown Manhattan Study, by Leo Srole et al. Vol. 1. New York: McGraw-Hill.
NUNNALLY, JUM C. JR. 1961 Popular Conceptions of Mental Health: Their Development and Change. New York: Holt.
OFFER, DANIEL; and SABSHIN, MELVIN 1966 Normality: Theoretical and Clinical Concepts of Mental Health. New York: Basic Books.
OPLER, MARVIN K. (editor) 1959 Culture and Mental Health: Cross-cultural Studies. New York: Macmillan.
REDLICH, F. C. 1952 The Concept of Normality. American Journal of Psychotherapy 6:551-569.
RIDENOUR, NINA 1963 The Mental Health Movement. Volume 3, pages 1091-1102 in Encyclopedia of Mental Health. New York: Watts.
RÜMKE, H. C. 1955 Solved and Unsolved Problems in Mental Health. Mental Hygiene 39:178-195.
SCOTT, WILLIAM A. 1958 Research Definitions of Mental Health and Mental Illness. Psychological Bulletin 55:29-45.
SEELEY, JOHN R. 1955 Social Values, the Mental Health Movement, and Mental Health. Pages 599-612 in Arnold Rose (editor), Mental Health and Mental Disorder. New York: Norton.
SMITH, M. BREWSTER 1950 Optima of Mental Health. Psychiatry 13:503-510.
SMITH, M. BREWSTER 1959 Research Strategies Toward a Conception of Positive Mental Health. American Psychologist 14:673-681.
SMITH, M. BREWSTER 1961 "Mental Health" Reconsidered: A Special Case of the Problem of Values in Psychology. American Psychologist 16:299-306.
SULLIVAN, HARRY STACK 1954 The Psychiatric Interview. Edited by Helen Swick Perry and Mary Ladd Gawel. New York: Norton. → Published posthumously.
WEGROCKI, HENRY J. (1948) 1953 A Critique of Cultural and Statistical Concepts of Abnormality. Pages 691-701 in Clyde Kluckhohn and Henry A. Murray (editors), Personality in Nature, Society, and Culture. 2d ed., rev. & enl. New York: Knopf.
WORLD FEDERATION FOR MENTAL HEALTH, SCIENTIFIC COMMITTEE 1962 Identity: Mental Health and Value Systems. Edited by Kenneth Soddy. London: Tavistock.
MENTAL HEALTH: Social Class and Personal Adjustment ii
SOCIAL CLASS AND PERSONAL ADJUSTMENT
If personality is seen as referring to the relatively enduring needs, motives, attitudes, values, belief systems, and self-conceptions that characterize the behavior of the individual, there is good reason to expect a substantial relationship between social class (one's position in the stratification structure) and personality. The basis for expecting such a relationship rests on widely accepted assumptions regarding man and society. Human personality is to a large extent a product of the social learning experiences that the individual undergoes in the sociocultural environment in which he lives. Moreover, there seems to be almost complete agreement among social scientists that the early experiences of the individual are of critical importance in personality development and in later adjustment, although there is considerable disagreement as to the dynamics of the relationship between early experience and later personality. It is also generally accepted that personality continues to develop throughout the life cycle (although probably at a less rapid rate than in childhood) in response to learning experiences and environmental pressures which the person encounters in the performance of his social roles. Finally, it is readily apparent that one of the most pervasive aspects of the social structures impinging on the individual throughout his life cycle is the stratification system of his society. This last observation is true not only because all societies have a system of stratification in which the members are differentiated into strata of unequal status but also because of the unique function of the family as a status ascription and socialization agency. Because in all societies the child is accorded the same status as his parents, the family of origin serves as the main link between the child and society. 
Since the family is the major agency charged with the early socialization of the child, its position in the stratification structure will to a large extent determine the social learning influences to which the child will be subjected during the most formative periods of his life. Moreover, the family's position in the stratification structure will greatly affect the child's choice of associates outside of the family, which in turn will go far in determining the social opportunities he will encounter throughout his life. Thus the stratification system may be seen as one of the most important and continuous social contexts in which the individual's developmental history takes place; certainly, one's position in it should have a substantial bearing on his personality. This is not to say, however, that personality is wholly determined by social class. The possible influences on personality development to which the individual is subjected are many and varied and are by no means all class-linked. The two principal sources of research evidence on the relationship between social class and personality are studies of social class and the socialization of the child and studies of social class and mental illness. In the past 25 years many studies in both of these areas have appeared. Fortunately, reviews of much of this literature are available (Bronfenbrenner 1958; Dunham 1961; Sewell 1962; Mishler & Scotch 1963) so that only major trends and more recent developments are covered here.
Social class and socialization
In one of the early studies of social class and personality, Davis and Dollard (1940) attempted to show how the social structure influences the nature of the learning process by which Negro children are trained to take on the behavior appropriate to their position in the social stratification system of the southern United States. The authors trace the process by which the child learns and acquires from his parents, his family's social clique, his peers, and his interactions with white adults the needs, motives, cognitions, attitudes, values, and behavior patterns of the class subculture of which he is a member. These results were based mainly on informal observational procedures and, consequently, are suggestive rather than definitive; but they stimulated many subsequent studies of social class and child rearing. Perhaps best known is the study by Davis and Havighurst (1946) of middle-class and lower-class Negro and white children in Chicago. Using interviewing procedures, they found that the social class differences were much greater than the race differences and clearly indicated that middle-class mothers were more restrictive than lower-class mothers in the critical early training of the child.
For instance, middle-class mothers were more likely to bottle-feed, follow a strict nursing schedule, restrict the sucking period, wean earlier and more abruptly, and begin and complete toilet training earlier than lower-class mothers. They also followed stricter regimens in other areas and expected their children to assume responsibilities earlier. These differences in early feeding and toilet training were widely interpreted by psychoanalytically oriented writers as evidence that middle-class child-training practices were baneful to middle-class children and were likely to produce maladjusted adults. Subsequent and more carefully
designed studies of social class differences in child-rearing practices have failed to confirm the findings of the Chicago study. In fact, on many points, the results of later studies (see, for instance, Sears, Maccoby & Levin 1957) have contradicted those of the Chicago study, particularly on toilet-training and infant-feeding practices, and have shown that lower-class mothers are more restrictive and punitive in relation to basic needs than middle-class mothers. Urie Bronfenbrenner (1958), on the basis of a detailed examination of data from a number of studies covering a 25-year period, concluded that lower-class mothers have probably become more restrictive in infant feeding and toilet training since World War II, while middle-class mothers have become more permissive, with the result that the gap between them has tended to close. However, throughout this period, middle-class mothers have been consistently more permissive toward the child's expressed needs and wishes, less likely to use physical punishment, and more accepting and equalitarian in dealing with the child than have lower-class mothers. Thus, it would appear that there is little evidence from these studies to support the view that the lower-class child undergoes socialization experiences that are more favorable to his later personality than does the middle-class child; if anything, the evidence points in the opposite direction. Possibly as a result of these findings, and because empirical research has cast doubt on the importance of toilet training and infant-feeding practices for later personality (Sewell 1952), recent studies of social class and personality development have tended to place less emphasis on infant training and more stress on parent-child relationships extending into childhood and adolescence. Several studies illustrating this trend may be briefly mentioned.
Kohn (1959a; 1959b) finds that middle-class parents emphasize internalized standards of conduct, including honesty and self-control, while working-class parents stress respectability, obedience, neatness, and cleanliness. Middle-class parents tend to respond to misbehavior in terms of the child's intent and to take into account his motives and feelings, while lower-class parents focus on the child's actions and respond in accordance with the seriousness of the act. Moreover, there is evidence that middle-class parents are less authoritarian in their relations with their adolescent children than lower-class parents but have higher expectations of them (Elder 1962). Rosen (1961) finds that not only do middle-class junior-high-school boys have higher achievement motives and values than lower-class boys, but that middle-class parents put more pressure on them to succeed, teach them to believe
in success, and create conditions in which success is possible. Studies of lower-class adolescent boys, on the other hand, testify to the influence of peer groups and of the lower-class culture of the community, especially in socialization to delinquent roles (Miller 1958). Still other studies have shown that middle-class adolescents are trained to defer their gratifications and lower-class youths to satisfy their current needs (Schneider & Lysgaard 1953). Finally, many other studies show that middle-class parents, in comparison with lower-class parents, place more stress on values which result in high levels of aspiration and achievement in the educational and occupational spheres (Kahl 1953). Another quite different recent emphasis in socialization research has been renewed interest in cognitive development. Studies thus far reported indicate that lower-class children suffer from cognitive deficits that may seriously impede their later adjustments to school and adult roles (Deutsch 1963; Hess & Shipman 1965). Much more needs to be done to discover the full range of class differences in socialization practices and especially to determine their effects on personality development and adjustment in the various classes. Studies, not reviewed here, relating socioeconomic status to scores on personality tests indicate a low but positive correlation between social class and the personality adjustment of the child (Sewell 1962, pp. 348-349). Some good work on socialization and social class is being done, but much more is needed using better samples, a wider range of socialization practices, and better data-gathering and data-analysis techniques.
Social class and mental illness
The largest body of evidence on the relation of social class to personality comes from the findings of a number of studies of social aspects of mental illness.
One of the most important of these is the study by Faris and Dunham (1939), who found, among other things, an inverse association between socioeconomic characteristics of Chicago census tracts and first admission rates for schizophrenia. Since the publication of this research, similar studies of American, European, and Asian cities have essentially replicated these results (Dunham 1961, pp. 274-290). Ecological studies of this kind have been criticized because of bias arising from socioeconomic selection in first admissions to mental hospitals; the possibility that mentally ill persons have drifted from the better into the poorer areas of the city after the onset of their illness; and reliance on purely ecological correlations. Studies (Clark 1948; Ødegaard 1956) based on the association between occupation or income and admission
rates for psychoses, especially schizophrenia, generally confirm the results of the ecological studies, but are also subject to the criticism that admission rates to mental hospitals tend to be selective of lower-class persons. Hollingshead and Redlich (1958), in their study of social class and mental illness in New Haven, improved on the earlier studies by obtaining detailed classifications of all cases in treatment with a psychiatrist or under the care of a psychiatric clinic or mental hospital, by carefully assessing individual socioeconomic status, by taking a city-wide control sample of normal persons for comparative purposes, and by computing rates for treated cases of various types of mental illness by class status. Most of their findings are for treated prevalence and therefore understate the total prevalence of mental illness in the community, but they clearly indicate that the lower classes have much higher rates for psychiatric illness, especially for psychoses. Other evidence collected by Hollingshead and Redlich indicates that diagnosis and treatment favor the higher social classes, with the consequence that members of the lower social classes tend to be diagnosed more readily as psychotics, to receive less individually oriented treatment, and to remain in custodial care for much longer periods of time. Because this piling-up of cases might explain the higher treated prevalence rates of the lower classes, incidence rates (based on the number of patients who entered treatment during the interval of observation) were computed. Again the lowest social class had the highest rates, although the differences between the other classes were no longer as marked. Moreover, while there was no relationship between social class and the incidence of neuroses, the inverse relationship of class membership and psychoses remained, with the rate for the lowest class being twice that for the next highest class and almost three times as high as for the two highest classes.
This finding is particularly impressive because it confirms the results of the earlier ecological and correlational studies. But even the study just described is seriously defective because it is based only on treated cases. Evidence has been mounting for some time that the prevalence and incidence of mental illness in the community are much greater than the treated rates because many cases are either not treated or are handled by others than psychiatrists, mental health clinics, and mental hospitals. This is apparently true even for quite serious forms of mental illness. Recently, attempts have been made to obtain more satisfactory evidence concerning total prevalence of mental illness by means of sample
surveys in which clinical examinations or symptoms inventories are used to determine mental health status. Obviously, the magnitude of the rate will depend on the inventories and the cutting points used in determining who is and who is not mentally ill. The results of the Midtown Manhattan Study (1962; 1963), based on a large probability sample of adults, are especially informative in that a consistent inverse relationship is found between socioeconomic status and poor mental health and a direct relationship between status and absence of significant symptoms of mental pathology. Of all of the many variables tested, socioeconomic status was the one most clearly associated with mental health. Moreover, this relationship held whether parental socioeconomic status or the person's own socioeconomic status was taken as the status measure, and it persisted when age and sex were controlled. The finding of an inverse relation between socioeconomic status of parents and impaired mental health is particularly significant because it indicates that successively lower parental status carries for the child progressively greater likelihood of inadequate personality adjustment in adulthood. The finding that one's current socioeconomic status is even more closely related to one's mental health suggests that the effects of low socioeconomic status are probably cumulative in that the vulnerable personalities developed by some low-status children prevent their upward mobility and destine them to the further burdens and stresses that low socioeconomic status adults typically encounter in the United States. Moreover, lower-class persons tend toward socially disturbing psychotic adaptations that further complicate their adjustment to an already stressful environment, while higher-status persons tend to respond to stress with mild neurotic responses that are socially more adaptive. 
Thus, the cumulative effects of unfavorable childhood and adult experiences on the lower-class person may result in a higher degree of vulnerability not only to mental illness but also to the development of more serious psychiatric symptoms. Another important finding of the Midtown Manhattan Study is that those who are downwardly mobile present more symptoms of mental disturbance than those who are nonmobile, with those who are upwardly mobile having the fewest symptoms of all. Evidence indicates that downward mobility is associated with the character disorders, or personality-trait disturbances, while upward mobility tends to be associated with neurotic behavior. These findings confirm the conclusions of earlier studies based on clinical observations (Hollingshead & Redlich 1958). The task of unraveling
cause and effect in this area is indeed challenging and demands further research; whereas mobility may result in some types of psychiatric illnesses, it is also likely that certain personality characteristics (including psychiatric symptoms) may help determine who rises or falls in the stratification system (Dunham et al. 1966). The one finding from the studies of social class and mental illness which comes through most clearly is that the lowest social class has the highest incidence and prevalence of major psychiatric illness. The explanations offered for this finding vary considerably, but they may be conveniently subsumed under three general notions. First is the claim that class variations in rates of mental illness are due to the way in which a social system functions over time to sort and sift persons with certain personality characteristics or vulnerabilities into social class positions. Second, it is argued that differences in the extent and nature of environmental stress in the various classes account for differences in rates. Finally, some authors argue that class differences in socialization, especially early socialization, are responsible for differing rates of mental illness among the various social classes. As we have seen in our examination of the research evidence so far available, it is clear that no one of these explanations has ever been subjected to anything approaching a scientifically adequate test. It may be concluded that there are good theoretical reasons for expecting an association between social class and personality development and adjustment. However, studies to date do not indicate a sizable relationship but suggest that lower-class status is associated with socialization experiences that foster the development of needs, motives, attitudes, belief systems, self-conceptions, cognitive modes, and styles of coping with stress which result in personality maladjustment.
Much more needs to be known about the socialization experiences that members of the various classes undergo, particularly how these affect personality systems. Finally, more systematic and theoretically informed studies of the role of social class in the etiology of mental illness are greatly needed.

WILLIAM H. SEWELL

[See also ACHIEVEMENT MOTIVATION; LIFE CYCLE; MENTAL DISORDERS, article on EPIDEMIOLOGY; PERSONALITY; PERSONALITY MEASUREMENT; PSYCHIATRY, article on SOCIAL PSYCHIATRY; SOCIAL MOBILITY; SOCIALIZATION; STRATIFICATION, SOCIAL.]

BIBLIOGRAPHY

BRONFENBRENNER, URIE 1958 Socialization and Social Class Through Time and Space. Pages 400-425 in Society for the Psychological Study of Social Issues, Readings in Social Psychology. 3d ed. New York: Holt.
CLARK, ROBERT E. 1948 The Relationship of Schizophrenia to Occupational Income and Occupational Prestige. American Sociological Review 13:325-330.
DAVIS, ALLISON; and DOLLARD, JOHN (1940) 1953 Children of Bondage: The Personality Development of Negro Youth in the Urban South. Prepared for the American Youth Commission. Washington: American Council on Education.
DAVIS, ALLISON; and HAVIGHURST, ROBERT J. 1946 Social Class and Color Differences in Child Rearing. American Sociological Review 11:698-710.
DEUTSCH, MARTIN 1963 The Disadvantaged Child and the Learning Process. Pages 163-179 in Work Conference on Curriculum and Teaching in Depressed Urban Areas, Columbia University, 1962, Education in Depressed Areas. Edited by Harry A. Passow. New York: Columbia Univ., Teachers College.
DUNHAM, H. WARREN 1961 Social Structures and Mental Disorders: Competing Hypotheses of Explanation. Milbank Memorial Fund Quarterly 39:259-311.
DUNHAM, H. WARREN et al. 1966 A Research Note on Diagnosed Mental Illness and Social Class. American Sociological Review 31:223-227.
ELDER, GLEN H. 1962 Adolescent Achievement and Mobility Aspirations. Chapel Hill: Univ. of North Carolina, Institute for Research in Social Sciences.
FARIS, ROBERT E. L.; and DUNHAM, H. WARREN (1939) 1960 Mental Disorders in Urban Areas: An Ecological Study of Schizophrenia and Other Psychoses. New York: Hafner.
HESS, ROBERT D.; and SHIPMAN, VIRGINIA C. 1965 Early Experience and the Socialization of Cognitive Modes in Children. Child Development 36:869-886.
HOLLINGSHEAD, AUGUST B.; and REDLICH, FREDRICK C. 1958 Social Class and Mental Illness: A Community Study. New York: Wiley.
KAHL, JOSEPH A. 1953 Educational and Occupational Aspirations of "Common Man" Boys. Harvard Educational Review 23, no. 3:186-203.
KOHN, MELVIN L. 1959a Social Class and the Exercise of Parental Authority. American Sociological Review 24:352-366.
KOHN, MELVIN L. 1959b Social Class and Parental Values. American Journal of Sociology 64:337-351.
THE MIDTOWN MANHATTAN STUDY 1962 Mental Health in the Metropolis: The Midtown Manhattan Study, by Leo Srole et al. Vol. 1. New York: McGraw-Hill.
THE MIDTOWN MANHATTAN STUDY 1963 Life Stress and Mental Health: The Midtown Manhattan Study, by Thomas S. Langner and Stanley T. Michael. Vol. 2. New York: Free Press.
MILLER, WALTER B. 1958 Lower Class Culture as a Generating Milieu of Gang Delinquency. Journal of Social Issues 14, no. 3:5-19.
MISHLER, ELLIOT G.; and SCOTCH, NORMAN A. 1963 Sociocultural Factors in the Epidemiology of Schizophrenia. Psychiatry 26:315-351.
ØDEGAARD, Ø. 1956 The Incidence of Psychoses in Various Occupations. International Journal of Social Psychiatry 2:85-104.
ROSEN, BERNARD C. 1961 Family Structure and Achievement Motivation. American Sociological Review 26:574-585.
SCHNEIDER, LOUIS; and LYSGAARD, SVERRE 1953 The Deferred Gratification Pattern: A Preliminary Study. American Sociological Review 18:142-149.
SEARS, ROBERT R.; MACCOBY, E. E.; and LEVIN, H. 1957 Patterns of Child Rearing. Evanston, Ill.: Row, Peterson.
SEWELL, WILLIAM H. 1952 Infant Training and the Personality of the Child. American Journal of Sociology 58:150-159.
SEWELL, WILLIAM H. 1962 Social Class and Childhood Personality. Sociometry 24:340-356.
MENTAL HOSPITALS See MENTAL DISORDERS, TREATMENT OF, article on THE THERAPEUTIC COMMUNITY.
MENTAL ILLNESS See ILLNESS; MENTAL DISORDERS; MENTAL HEALTH; PSYCHOSOMATIC ILLNESS.
MENTAL RETARDATION

Mental retardation is a problem of serious social concern. In view of the large number of persons considered to be mentally retarded, such concern is certainly justified. Using the conventional criterion of 3 per cent of the population, the U.S. President's Panel on Mental Retardation (1963) estimated that almost 5.5 million children and adults in the United States are mentally retarded. The criterion for mental retardation established in the "Manual on Terminology and Classification in Mental Retardation" (Heber 1959) and adopted by the American Association on Mental Deficiency as well as the Biometrics Branch, National Institute of Mental Health, is that all those at least one standard deviation below the population mean intelligence quotient (IQ) are considered retarded. If one accepts this criterion, and many do not, there are almost 30 million mental retardates in the United States. If the more conservative estimate is employed, mental retardation is twice as prevalent as blindness, polio, cerebral palsy, and rheumatic heart conditions combined (Doll 1962). The typical textbook pictures the distribution of intelligence as normal or Gaussian in nature, with approximately the lower 3 per cent of the distribution encompassing the mentally retarded. A common class of persons is thus constructed, a class defined by intelligence-test scores below 70. This schema has misled many laymen and students and has subtly influenced the approach of experienced workers in the area. For if one fails to appreciate the arbitrary nature of the cutoff point of 70, it is but a short step to the formulation that all those falling below this point compose a homogeneous class of "subnormals." Since the conceptual distance between "subnormal" and "abnormal," with
its age-old connotation of disease and defect, is minimal, the final step is to regard retardates as a homogeneous group of defective organisms, immutably different from those persons possessing a higher IQ. The view that mental retardates represent a homogeneous group is seen in numerous research studies where comparisons between retardates and normals are made on the basis of IQ classification alone. The view that mental retardates, as a group, are "different" is most vividly encountered in comparative studies where mental retardates are conceptualized as occupying a position on the phylogenetic scale somewhere between monkeys and children of average intellect. It is of some interest to note that people deficient in respect to intelligence-test performance are usually not called "mental deficients" but rather are commonly referred to as "mental defectives." The defect orientation to mental retardation originally emphasized the notion of moral defect and stemmed anywhere from the belief that retardates were possessed by a variety of devils to the empirical evidence of their exhibiting an inordinately high incidence of socially unacceptable behaviors, such as crime and illegitimacy. More recently, the notion of defect has referred to defects in either physical or cognitive structures. This defect approach has a certain unquestionably valid component. There is a sizable group of retardates who suffer from a variety of known physical defects. Mental retardation may be due to such factors as a dominant gene (as in epiloia), a single recessive gene (as in gargoylism, phenylketonuria, amaurotic idiocy), infections (such as congenital syphilis, encephalitis, rubella in the mother), chromosomal defects (as in mongolism), toxic conditions (such as radiation in utero, lead poisoning, and Rh incompatibility), and cerebral trauma. 
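As a rough arithmetic check on the two prevalence figures cited at the opening of this article (almost 5.5 million under the 3 per cent convention, almost 30 million under the one-standard-deviation criterion), the following Python sketch may be useful. It assumes that IQ is normally distributed with a mean of 100 and a standard deviation of 15; the article states neither value, and some scales use a different standard deviation. The population base is likewise not given in the text and is simply the figure implied by 5.5 million being 3 per cent of the total. All variable names are illustrative.

```python
from statistics import NormalDist

# Assumed IQ scale: mean 100, SD 15 (an assumption; the article gives no SD).
iq = NormalDist(mu=100, sigma=15)

# Population base implied by the article: 5.5 million is 3 per cent of it.
implied_population_millions = 5.5 / 0.03

# Share of a normal distribution falling below each cutoff.
share_below_70 = iq.cdf(70)  # two SDs below the mean
share_below_85 = iq.cdf(85)  # one SD below the mean

# Head count under the one-SD criterion, in millions.
count_below_85_millions = share_below_85 * implied_population_millions

print(f"below 70: {share_below_70:.1%}")
print(f"below 85: {share_below_85:.1%}")
print(f"one-SD criterion: about {count_below_85_millions:.0f} million")
```

Under these assumptions the share below 70 comes out near 2.3 per cent, somewhat under the conventional 3 per cent, while the share below 85 is close to 16 per cent, or about 29 million persons, which agrees with the "almost 30 million" cited. The roughly fivefold gap between the two counts follows from nothing more than the choice of cutoff.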
For a complete listing of the many types of mental retardation the reader is referred to the "Manual on Terminology and Classification in Mental Retardation" (Heber 1959). The diverse etiologies noted above have one factor in common: in every instance examination reveals an abnormal physiological process, that is, there are specific or related defects in physiological functioning. Such persons are abnormal in the orthodox sense, since they suffer from a known disease defect. However, in addition to this group, which forms a minority of all retardates, there is the group labeled "familial," or more recently "undifferentiated," which comprises approximately 75 per cent of all retardates. This group presents the greatest mystery and has been the object of the
most heated disputes in the area of mental retardation. The diagnosis of familial retardation is made when an examination does not reveal the physiological manifestations noted above and when retardation exists among parents, siblings, or other relatives. As will be seen in a later section, several theoreticians have extended the defect notion to this type of retardate. On the basis of differences in performance between retardates and normals on some experimental task rather than on physiological evidence, they have advanced the view that all retardates suffer from some specifiable defect over and above their general intellectual retardation. However, these theoreticians differ as to the specific nature of the defect. The experimental paradigm employed to demonstrate such defects involves equating groups of normals and retardates on Mental Age (MA), thus roughly controlling for general intellectual level, and demonstrating differences in performance between the two groups on some experimental task. This more general defect approach thus lends support to the conceptualization of the mentally retarded as a homogeneous group of physiologically defective persons. Some order can be brought to the area of mental retardation if a distinction is maintained between physiologically defective retardates, with known etiologies, and familial retardates, whose etiology is unknown. For the most part, work with physically defective retardates involves investigation into the exact nature of the underlying physiological processes, with prevention or amelioration of the physical and intellectual symptoms as the goals. Jervis (1959) has suggested that such "pathological" mental deficiency is primarily in the domain of medical sciences, whereas familial retardation represents a problem to be solved by behavioral scientists, including educators and behavioral geneticists.
Diagnostic and incidence studies of these two types of retardates have disclosed two striking differences. The retardate having an extremely low IQ (below 40) is almost invariably of the defective type. (This does not mean that one cannot find defective retardates at every level of retardation. In fact, brain-damaged individuals may be found at every point along the IQ continuum.) Familial retardates, on the other hand, are almost invariably mildly retarded, usually with IQs above 50. The defect position emphasizes the innate, if not immutable, difference between retardates and normals.

The problem of definition. The decision of whether a person is considered retarded is often based not upon his intellectual characteristics but upon legal and occupational factors as well as his
general level of social adjustment. The matter has been put most succinctly by Maher who stated:

What constitutes mentally retarded behavior depends to a large extent upon the society which happens to be making the judgment. An individual who does not create a problem for others in his social environment and who manages to become self-supporting is usually not defined as mentally retarded no matter what his test IQ may be. Mental retardation is primarily a socially defined phenomenon, and it is in large part meaningless to speak of mental retardation without this criterion in mind. (1963, p. 238)

This emphasis on social factors in defining mental retardation may lead to more confusion than clarity as indicated by the discrepancies found among various incidence and survey studies. The data of Table 1 would indicate that the incidence fluctuates not only across age categories but also according to the locality. If mental retardation is defined strictly in terms of IQ, and assuming a certain constancy of IQ score, we would expect no difference in the incidence of mental retardation at different ages. The standardization data of the Wechsler-Bellevue Scale confirm this expectation.

Table 1 - Percentage of persons classified as mentally retarded

                                          LOCALITY
AGE        England and Wales   Baltimore, Maryland   Onondaga County (Syracuse), New York
Under 5          0.12                 0.07                        0.45
5-9              1.55                 1.18                        3.94
10-14            2.65                 4.36                        7.76
15-19            1.08                 3.02                        4.49

Source: Jervis 1959, p. 1290.

The incidence figures reported in Table 1 are understandable if one realizes that they reflect diagnoses based on some combination of IQ and the success of the individual in meeting social demands. For example, the extremely low incidence under 5 years of age may reflect the minimal social demands made on young children. The highest incidence obtained at the 10-14 age level occurs when the child is faced with school and more demanding intellectual tasks. It is probably in this age range that the relationship between IQ scores and meeting societal expectancies (i.e., successful school performance) is greatest. Stated somewhat differently, it is probably in this age range where the use of either the IQ or the child's success in meeting social demands would result in his being classified as mentally retarded. A test-score orientation to mental retardation results in the view that approximately 3 per cent of the population is mentally retarded. The social competence viewpoint,
however, results in a much smaller incidence. Data obtained through surveying representative samples of large populations or the entire population of certain limited regions in England and Scandinavia indicate that about 1 per cent or less of adults are classified as mentally retarded (e.g., Fremming 1947). Armed with the information that a person's social adequacy has much to do with whether or not he is considered retarded, we begin to get some inkling of the arbitrariness involved in such a classification.

The nature of intelligence

Whether mental retardation is defined by an intelligence-test score or by the person's social competence, which many claim reflects his intelligence, the essential aspect of mental retardation is lower intelligence than that displayed by the modal member of an appropriate reference group. There is little agreement, however, when the question precipitated by this statement is raised, namely, "What is the nature of intelligence?" We cannot avoid this question by invoking the very unsatisfying cliche that "intelligence is what an intelligence test measures," since it is perfectly apparent that the test constructor must have some definition of intelligence in mind, either explicitly or implicitly, before he can select test items. However, some consensus can probably be found for the view that intelligence is a hypothetical construct which has as its ultimate referent the cognitive processes of the individual. Given this, we are still faced with the unresolved issue of whether intelligence represents some single cognitive process which permeates every intelligence test or nontest behavior or whether it represents a great variety of relatively discrete cognitive processes which can be sampled and then summated to yield some indication of the amount of intelligence a person possesses [see INTELLIGENCE AND INTELLIGENCE TESTING].
In either case, the more important questions involve an understanding of exactly how such cognitive processes develop over the life span and exactly how innate and environmental factors interact to influence such development. Approached in this way, the problem of defining intelligence becomes one with the problem of the nature of cognition and its development. Cognitive versus psychometric approach. It follows that if we are to understand the nature of intelligence, we must consult those workers intent on investigating the nature and development of cognitive processes (e.g., thought, memory, concept formation, and reasoning) rather than focus
on the work of test constructors and psychometricians. There has been little cross-fertilization between these two groups, which have approached the investigation of intellective functioning quite differently. The former group of investigators utilizes a variety of techniques and, through extremely detailed analyses, attempts to tease out the intricacies of man's cognitive functioning. These theorists have tried to evolve a theory of human cognition and its development. If intelligence tests had been developed by this group, psychology might have avoided the perplexing state of affairs encountered in trying to define intelligence. Tests devised by such a group would, by necessity, be indicators of the formal features of the cognitive structure at various times in the life cycle. Recently, workers within this framework (e.g., Laurendeau & Pinard 1963) have taken an interest in the problem of assessment. Although the task of providing an acceptable theory of the development of cognition is far from finished, Laurendeau and Pinard have been able to take the first step toward the construction of an intelligence test based on the formal features of cognition that were isolated by Piaget [see DEVELOPMENTAL PSYCHOLOGY, article on A THEORY OF DEVELOPMENT].
From a historical point of view, the practical demands of society for a test which would measure intellectual functioning meant that intelligence became the province of the second group, namely, the testers and psychometricians. Furthermore, for a variety of reasons, American thinking was not receptive to the approach taken by the cognitive theorists. The practical and empirical nature of the work of the testers can be seen in the efforts of Alfred Binet, whose intent was not to investigate the nature of intelligence but rather to discover those test items which would discriminate between successful and unsuccessful school performance. As has been pointed out, Binet viewed his empirically selected tests as a social screening device rather than as a theoretical interpretation of the nature of intelligence. The success of Binet's tests in predicting school achievement led workers to the belief that such tests were inextricably bound to man's intelligence or cognitive functioning. For the psychometricians, it then became clear that the nature of intelligence could be understood by examining the nature of the tests that were employed to measure it. By discovering the correlations obtained between subtests within a given battery or across different tests, it was felt that the structure of intellect would be revealed. Despite the statistical rigor involved, no very satisfactory
theory of intelligence has come out of the correlational or factor-analytic methods. There is, in fact, little agreement among workers even with regard to the one constant theoretical issue throughout this body of work, namely, the global versus the specific nature of intelligent behavior. It should be emphasized that the weakness of current theories of intelligence has led to a conceptual impasse in the area of mental retardation. If there is no satisfactory theory of intelligence, then the essential aspect of mental retardation must escape us and we must be content with superficial statistical and social approaches to this complex problem. We do not necessarily have to await a completed theory of intelligence, however, to cut through much of the complexity, disputation, and confusion encountered in the area of mental retardation. Some clarification appears possible through the simple process of reorienting or restructuring our approach to intellectual retardation. A rather sizable step forward is taken if our commitment to a simple test approach is abandoned in favor of a concern with cognitive processes. The process-content distinction. The plea is not that we abandon tests, for every cognitive theorist must eventually employ tests, as defined in the broadest sense. The plea is that workers in the field turn their attention from the superficial content of tests (i.e., the right or wrong answer) and come to grips with the problem of the cognitive structures and processes that give rise to content. It is this distinction between structure and content that has for too long escaped most workers in the area of mental retardation. Conventional tests are viewed by process-oriented cognitive theorists as too analytic and artificial in character and as measuring an end product and not a process (Laurendeau & Pinard 1963, p. 481).
In general, however, Piaget's approach, with its developmental and normative emphasis, has had very little appeal to workers in the area of mental retardation, since such workers are committed to the study of individual differences. In this context, the test-constructing efforts of Laurendeau and Pinard appear very promising, since these followers of Piaget have formulated the stages of cognitive development in terms of the nature of the cognitive operations achieved, thus emphasizing the nature of the cognitive structure and its accompanying processes. In their work we thus see a bridge between a truly cognitive approach to intelligence and the need in the area of mental retardation for an instrument with which to make individual comparisons. The focus on the content of test behaviors has
been carried over to many nontest behaviors, and often an insufficient distinction is made between intelligence and intelligent behavior. It is the author's view that intelligence must refer to the formal characteristics of the cognitive structure and the processes that accompany it, whereas intelligent behavior should refer to the content of behavior in respect to the appropriateness (often defined in a relatively arbitrary manner) with which an organism carries out an act. (See Maher 1963, for another discussion of this distinction between intelligence and intelligent behavior.) For example, in a sheltered workshop the author recently encountered a retardate working with a surprisingly complex piece of machinery. His ability to use this machinery defied current knowledge concerning the capabilities of the retarded. The director of the workshop explained that the retardate had been taught to operate the machine through a shaping process not unlike that employed by B. F. Skinner in training pigeons to play Ping-pong. It was also learned that the retardate could handle the machine quite adequately provided its position was not changed. To emphasize this point, the machine was rotated on its axis approximately 90°; the retardate then became somewhat agitated and was no longer able to operate the equipment. (Piaget has also commented that remarkable intellectual feats performed by children on some task or other cannot be repeated following relatively minor alterations in the task stimuli.) In terms of the finished product, the retardate was behaving just as intelligently as an operator with a normal IQ. However, this accomplishment does not indicate that the retardate has become normal in intelligence. It is obvious that he is using a much more primitive cognitive process to achieve his intelligent behavior than is the normal person. 
This example should also make it clear that process analyses demand that the investigator make a more careful analysis of the content than that provided by a superficial "product" or criterion of correctness. Social competence and mental retardation. The content approach is expressed also in the social competence definition of mental retardation. What are the intellectual demands of social competence? We do not know, and very little effort is made to discover what they might be. In the area of mental retardation, social competence usually means the ability to maintain oneself without too frequent contact with state schools, state hospitals, welfare agencies, and police officers. Though social competence defined in this way reflects certain cognitive abilities, it may also reflect a variety of factors reminiscent of nonintellectual aspects of intelligence-test performance. We refer here to factors such as luck, social values, attitudes toward other people, and emotional needs that are relatively independent of intellectual level. Thus, present intelligence tests may predict social competence better than an ideal intelligence test because of the overlap of nonintellectual variables which influence both intelligence-test scores and social competence. Social competence does not inevitably reflect normal intellectual functioning any more than its absence in the emotionally unstable, the criminal, or the social misfit reflects intellectual subnormality. Social competence is much too heterogeneous a phenomenon and reflects too many nonintellectual factors to be of great value in understanding mental retardation. The basic problem is that the concept of social competence is so value laden, and its definition so vague, that it has little heuristic utility. Windle (1962) has pointed out that the social competence definition of mental retardation is applicable only to institutionalized populations, whereas quite different definitional criteria must be employed with noninstitutionalized retardates. The only clear and acceptable operational definition of social competence would appear to be related to whether the individual has managed to function outside an institutional setting. Even Heber (1962), who has made the strongest case for employing social competence, has admitted that objective measures of adaptive behavior are presently unavailable. He has also stated that the present ambiguity of the social competence construct is such that in practice intelligence-test performance must remain "the most important and heavily weighted of the criteria used." There is a further problem with the social competence construct related to a fallacy which has permeated much of our thinking concerning the retarded.
We have somehow come to believe that it is impossible for anyone who is "truly" retarded to meet the complex demands of our society. The bulk of retardates who have MAs in the 9-12 range (remembering that an MA of 16 is the upper limit for an individual of average IQ), have the intellectual wherewithal to meet the minimal demands of our society. This becomes immediately apparent if one raises the question of how much intellectual ability is required to arise in the morning, dress oneself, catch a bus or walk to a single location, perform some undemanding sort of labor, and return home. Indeed in the 1920s and 1930s it was discovered that there were no less than 118 occupations in our society suitable for individuals having MAs from 5 to 12. As late as 1956 it was noted that 54 per cent of jobs require no schooling beyond the elementary level (Whitney 1956).
Another major aspect of social competence is the ability of the individual to abide by the values of the society, that is, obey laws, and so on. While the incidence of crime among the retarded is higher than among the nonretarded, this increase of incidence is not very great, especially if one controls for social class. Here again it is an error to view obedience to the law as somehow beyond the ability of the retarded. One simply has to apply the concept of the stages of moral development as investigated by Jean Piaget (1932): fairly young children are capable of a morality based on absolutism, that is, the rules inhere in the very fabric of existence and are not to be broken under any circumstances. Individuals who never achieve a higher stage of moral development are certainly not developmentally adequate, but neither are they likely to break many laws. In order to make social competence a useful indicator of cognitive functioning, we must thus abandon some simplistic notion of social competence in favor of a variety of continua theoretically based upon the cognitive demands of the social requirements involved. Such indexes could then be considered independent indicators of intellectual functioning. Empirical efforts of this sort may be seen in the Vineland Social Maturity Scale and the Worcester Scale of Social Attainment. A more theoretical effort may be found in the work of Phillips and Zigler (1964), where both intelligence test scores and conventional social competence indexes are combined into an index of developmental or maturational level.
The nature-nurture issue
Attention is now turned to the role of cognitive capacity in mental retardation. Maher (1963) believes that the concept of capacity has considerable heuristic value for workers in the area of mental retardation. Intellectual capacity means something akin to Hebb's (1949) intelligence A, that is, an innate potential for the development of intellectual functions. 
Those who have argued that the intellectual capacity notion is a relatively useless one (e.g., Ferguson 1956) appear to be invariably committed to an environmentalistic or learning orientation. Maher's position is that the capacity concept has value as related to "the differences between individuals in rate of acquisition of responses under similar learning conditions. Such a concept necessarily implies the existence of structural differences between individuals and is incompatible with a psychology of the empty organism . . ." (1963, p. 250). It is in this last sentence that we see the theoretical value of the capacity concept, since it forces us to conceptualize individuals as biological organisms innately differing in respect to the potential manifestation of a multitude of traits. Thus the concept of capacity is intimately related to the biological concept of the genotype.
There has been an interesting effort to make intelligence, and thus mental retardation, a matter of acquired skills and transfer phenomena in the classical learning theory sense (see Ferguson 1956). Although Ferguson appears to abhor a biological concept of intelligence, he nevertheless falls back upon it in dealing with those aspects of early learning which do not reflect transfer effects. In addition, his treatment of transfer as a uniformly manifested phenomenon overlooks differences in ability to transfer from one task to another which may very well be a reflection of biological capacity. Given such an orientation, we can derive the optimistic view that complete control over learning experiences would do away with individual differences and, thus, mental retardation, at least of the nondefective variety. Such a view, though appealing, flies in the face of what has been observed. Herculean efforts of teaching and training have not resulted in marked change in the intellectual level of most retardates. The environmentalist, while acknowledging the importance of biological capacity, treats human behavior as the outgrowth of an infinite number of experiences. It is of interest to note that one environmentally oriented theorist (McCandless 1964) has argued that although heredity and environment interact in the production of intelligent behavior, we need only concern ourselves with environment, since "we can do something about environment." 
This approach implies that the manipulations of environments are expected to have constant results and, furthermore, ignores the obvious possibility that children with particular capacities will need specific environmental events in order to maximize their cognitive development. The one group that has seriously considered the nature of the interaction between genotype and experiences in producing certain behaviors (phenotypes) has been the behavior geneticists. Employing infrahuman subjects, these investigators have presented evidence that the effects of particular experiences and the behaviors to which they give rise depend upon the biological nature of the organism (e.g., Hirsch 1963). The attempt to determine the proportion of variance attributable to heredity or to environment is full of difficulties (H. Jones 1946). Despite the shortcomings of the nature-nurture work on intelligence, it is still possible to derive certain con-
clusions. (For more complete reviews of this work the reader is referred to H. Jones 1946; McCandless 1964.) Studies of parent-child and of sibling resemblances in intelligence, a variety of twin studies, and studies on children in foster homes have made it clear that inherited intellectual endowment is a much more important factor in intelligence than those who are environmentally oriented would have us believe. At the same time one must not forget the importance of environmental factors to manifest intelligence. The role of environment is evident even in extreme cases where a known gene defect is the cause of mental retardation. In the case of genetically determined phenylketonuria, subnormal intelligence occurs only in an environment which provides phenylalanine in the diet of the affected individual. A specific change in the environment (i.e., withholding phenylalanine from the diet) will prevent the occurrence of subnormal intelligence. An issue in the nature—nurture controversy of special pertinence to mental retardation concerns the degree to which the environment may produce individual differences in intelligence in contrast to affecting the absolute achievement level of man. It is one thing to assert that the environment may play a role in determining the range of individual differences found among men. It is another thing to assert that environmental events can cause the individual with a normal intellectual endowment to become retarded or, for that matter, shift the entire range of intelligence in such a way that no individual would display that degree of intellectual impairment that we now label retarded. Different environments (e.g., rural versus urban, racial-cultural, social class) are associated with differences in intelligence. To what extent such differences reflect environmental as opposed to inherited factors remains an open issue. 
The position of the majority seems to be nurture oriented, and the argument advanced is that it is the social class or cultural environment which produces retardation. To state the matter more simply, the hereditarian asserts that one is in a lower socioeconomic class because one is less intelligent, whereas the environmentalist asserts that one is less intelligent because one is in the lower socioeconomic class. More specifically, mental retardation is sometimes seen as a major consequence of social deprivation. Such a view assumes that children are capable of "normal" intellectual functioning if we but expose them to enough "cultural enrichment." Environmental factors and IQ changes. A matter of considerable import in testing the above
hypotheses is the magnitude of change that could be effected as a result of changes in the environment. Many investigators have been relatively pessimistic in their conclusions. McClearn (1962) has pointed out that the magnitude of the difference in IQs attributable to environmental factors, though statistically significant, has been so minute as to be practically negligible. In support of the environmentalistic point of view, however, instances can be found where rather marked improvements in IQ have been reported following some type of environmental manipulation. The reader is referred to the review by McCandless (1964), whose statement is perhaps one of the strongest in favor of the environmentalistic position. Other studies have indicated that when a geographic area is subjected to social improvement, such as better schools and improved communication, there is a tendency for the IQs of all the inhabitants to improve. Wheeler's study of Tennessee mountain children (1942) is of considerable interest. Testing over three thousand subjects in 1930, he found that IQs progressively declined from a mean of 95 at age 6 to a mean of 74 at age 16. Testing a new sample ten years later, he found a mean increase in IQ of approximately 10 points at every age level. However, the steady decline with age, from a mean of 103 at age 6 to a mean of 80 at age 16, was again discovered, despite the general increase. There has been a certain inconsistency in studies that have attempted to relate IQ changes to environmental factors. In some instances, significant correlations have been found between various subjective ratings of the "goodness" of the environment and increase in IQ (e.g., Thorpe 1946). But in other instances no environmental correlates could be found to account for changes in the IQ (e.g., H. Jones 1946). Jones has given some especially striking case histories of children who have manifested marked changes in IQ without the apparent involvement of environmental factors. 
A continuing problem has been the failure to designate just what constitutes a good environment for optimal intellectual development. Little has been added to the implicit view that the American middle-class home represents some sort of standard. A related matter, of course, is the problem of defining cultural or social deprivation. The social deprivation concept has been loosely applied to certain events in early childhood which are characterized as antecedent to certain social behaviors. There is little agreement about either the early events or the resultant behaviors. Major
dimensions of childhood deprivations that have been suggested are social isolation, cruelty and neglect, institutional upbringing, adverse childrearing practices, and separation experiences across a wide range of severity. Even factors such as these need much further definition and clarification. The view that, given a fairly standard environment, it is extremely difficult to improve the quality of cognitive functioning is consistent with the bulk of findings resulting from efforts to improve children's performance on Piaget-type tasks. Of course, familial retardates do not come from what we consider standard environments. Even with these children there is considerable evidence that no great intellectual improvement is produced through environmental manipulation, and this holds true for a variety of techniques. The reader is referred to E. E. Doll's excellent history of mental retardation (1962) for evidence on this point. Binet, with his concept of "mental orthopedics," and Jean M. L. Itard, with his great faith in the possibility of improving the quality of intellect, were responsible for the philosophy underlying the early work with retardates in this country. After several years of employing a variety of techniques, many of which are today being rediscovered, it became apparent that this optimism was unwarranted. In the early days, training schools in this country were just what the name implies. They became custodial institutions only when it became apparent that many retardates could not be trained to a level that would make them self-sustaining in the society at large. A reaction appears to have set in at this time, and the view that we could do nothing for retardates except provide them with a comfortable domicile became dominant. There is much for contemporary workers to learn from this marked swing in attitude toward the retarded. It suggests that undue optimism is dangerous, since it breeds undue pessimism. 
The conclusion that may be reached concerning the relevance of the heredity and environment controversy for mental retardation has been well stated by Penrose, who after a lifetime of work with the retarded wrote:
The most important work carried out in the field of training defectives is unspectacular. It is not highly technical but requires unlimited patience, good will and common sense. The reward is to be expected not so much in scholastic improvement of the patient as in his personal adjustment to social life. Occupations are found for patients of all grades so that they can take part as fully and usefully as possible in human affairs. This process, which has been termed socialization, contributes greatly to the happiness not only of the patients themselves but also of those who are responsible for their care. ([1949] 1963, p. 282)
It is perhaps within this area of socialization that we can do a great deal to enhance the everyday effectiveness of the retarded. Personality and character traits were discovered to be more influenced by environment than was intellectual level (e.g., Leahy 1935). Such findings bolster the argument that there are many modifiable factors which are important in the determination of social adjustment. It is not rare to encounter individuals with the same intellectual make-up demonstrating quite disparate social adjustments. Perhaps the question is not how to improve the cognitive functioning of familial retardates but rather how to maximize the adjustment of such individuals, whatever their intellectual capacity may be. That considerable change in performance can result from the manipulation of nonintellective (i.e., motivational) factors will be made clear in subsequent passages.
A two-group conception. Hirsch has asserted that we will make little headway in understanding individual differences in intelligence and many other traits unless we incorporate into our thinking the fact that to a large degree such differences reflect the inherent biological properties of man. As Hirsch has noted, we can no longer make the "gratuitous uniformity assumption that all genetic combinations are equally plastic and respond in like fashion to environmental influences . . . ," and he added that "without an appreciation of the genotypic structure of populations, the behavioral sciences have no basis for distinguishing individual differences that are attributable to differences whatsoever where there is a common history" (1963, p. 1442). Work in population genetics appears capable of bringing considerable order to the area of mental retardation. 
We need simply to accept the generally recognized fact that the gene pool of any population is such that there will always be variations in the behavioral or phenotypic expression of virtually every measurable trait or characteristic of man. From the polygenic model advanced by geneticists, we would deduce that the distribution of intelligence would be a symmetric bell-shaped curve, which is characteristic of such a large number of distributions that we have come to refer to it as the normal curve. This theoretical distribution is a fairly good approximation of what is actually encountered in the observed distribution of intelligence. In the polygenic model of intelligence (see Hirsch 1963; Penrose 1949), the genetic foundation of intelligence is not viewed as
dependent upon a single gene. Rather, intelligence is viewed as the result of a number of discrete genetic units. (This is not to assert, however, that single-gene effects are never to be encountered in mental retardation. As noted earlier, certain relatively rare types of mental retardation are the products of such simple genetic effects.) A variety of specific polygenic models have been advanced that generate theoretical distributions of intelligence that are congruent with observed distributions (e.g., Burt & Howard 1956). Again caution is in order. An environmentalistic model positing five environmental factors acting additively would also generate an approximation to a normal curve. However, such a model appears much less capable of encompassing the raw data encountered in investigations of intelligence. An aspect of polygenic models of special interest for the area of mental retardation is that they generate IQ distributions ranging approximately from 50 to 150. Since an IQ of approximately 50 appears to be the lower limit for familial retardates, it has been concluded (e.g., Burt & Howard 1956; Penrose 1949) that the etiology of this form of retardation reflects the same factors that determine "normal" intelligence. Approached in this way, the familial retardate can be seen as normal, where "normal" is defined as representing an integral part of the distribution of intelligence that we would expect from the normal manifestations of the genetic pool in our population. Within such a framework, it is possible to refer to the familial retardate as less intelligent but it would make no sense to say that he is abnormal. He is just as integral a part of the normal distribution as are the 3 per cent of the population that we view as superior or that more numerous group of individuals that we consider to be average. 
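The additive polygenic model described above can be made concrete with a small calculation. The Python sketch below is illustrative only. It assumes ten independent allele positions (for instance, five loci with two alleles each), each contributing a fixed 10 IQ points with probability one-half, added to a floor of 50. None of these parameter values is taken from Burt and Howard or from Penrose; they are hypothetical values chosen so that the resulting binomial distribution spans 50 to 150 with a mean of 100.

```python
from math import comb, sqrt


def polygenic_iq_distribution(n_alleles=10, base=50.0, step=10.0, p=0.5):
    """Exact distribution of IQ under a purely additive toy model.

    Assumptions (hypothetical, for illustration only):
      n_alleles -- number of independent allele positions
      base      -- IQ when no "plus" alleles are present
      step      -- IQ points added by each "plus" allele
      p         -- probability that a given position carries a "plus" allele
    Returns a dict mapping IQ value -> probability (a binomial distribution).
    """
    dist = {}
    for k in range(n_alleles + 1):
        prob = comb(n_alleles, k) * p**k * (1 - p) ** (n_alleles - k)
        dist[base + step * k] = prob
    return dist


def mean_and_sd(dist):
    """Mean and standard deviation of a discrete distribution."""
    mean = sum(iq * pr for iq, pr in dist.items())
    var = sum(pr * (iq - mean) ** 2 for iq, pr in dist.items())
    return mean, sqrt(var)


if __name__ == "__main__":
    dist = polygenic_iq_distribution()
    mean, sd = mean_and_sd(dist)
    print(f"mean = {mean:.1f}, sd = {sd:.2f}")
    # Crude text histogram: the bars rise and fall symmetrically about 100.
    for iq, pr in sorted(dist.items()):
        print(f"{iq:5.0f}  {pr:.4f}  {'#' * round(pr * 200)}")
```

Under these assumptions the distribution is symmetric about 100 with a standard deviation of roughly 15.8, and the extreme scores of 50 and 150 each occur with probability 1/1024. A model with more allele positions and a smaller step would approximate the normal curve more closely.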
The two-group conception of mental retardation calls attention to the fact that the second group of retardates, those who have known physiological defects, represents a distribution of intelligence with a mean which is considerably lower than that of the familial retardates. Such children, for the most part, fall outside the range of normal intelligence, that is, they have an IQ below 50, although there are certain exceptions; brain-damaged children with IQs as high as 150 have been found. Thus the empirical distribution of intelligence may best be represented by two curves. Considerable clarity could be brought to the area of mental retardation if we were to do away with the practice of conceptualizing the intelligence distribution as a single continuous normal curve. The more appropriate representation is to depict the intelligence of
the bulk of the population, including the familial retarded, as a normal distribution having a mean IQ of 100 with lower and upper limits of approximately 50 and 150. Superimposed on this curve would be a second nearly normal distribution having a mean IQ of approximately 35 and a range from 0 to 70. The first curve would represent the polygenic distribution of intelligence; the second would represent all those individuals whose intellectual functioning reflected factors other than the normal polygenic expression (i.e., those retardates for whom there is an identifiable physiological defect). This two-group approach to the problem of mental retardation has been supported by Penrose (1949) among others. The very nature of the empirical distribution of IQs below the mean, especially in the 0-50 range (see Penrose 1949), seems to demand such an approach. This distribution is exactly what we would expect if we combined the two distributions discussed above, as is the general practice. This two-group approach is of particular significance to the issue of mental retardation, since it calls for a reappraisal of the entire concept of normality. Hirsch has pointed out that such a concept, as presently employed, is of little value: Implicit in our use of "normal" is reference to some region of a distribution arbitrarily designated as not extreme—for example, the median 50 percent, 95 percent, or 99 percent. We choose such a region for every trait. Among n mathematically independent traits—for example, traits dependent on n different chromosomes—the probability that a randomly selected individual will be normal for all n traits is the value for the size of that region raised to the nth power. Where "normal" is the median 50 percent and n = 10, on the average only one individual out of 1024 will be normal (for ten traits). (1963, p. 
1437) Thus, if we consider the whole person with his many variable physiological and psychological systems, it would be extremely rare to find an individual we would consider normal. Indeed, if we were to find him, his very normality would be considered abnormal in the sense that he represented a rare event. In the area of mental retardation the concept of abnormal should be confined to those cases with known physical defects wherever these cases may be found in the distribution of intelligence. A two-group approach makes the problem of the etiology of the familial retarded just as assailable as the problem of etiology in pathological retardation. In respect to the etiology of familial retardates, McClearn has stated that "these individuals undoubtedly represent the lower tail of the distribu-
tion generated by assortment of the polygenes underlying 'normal' intelligence, and should no more be considered abnormal than those whose intelligences are an equal distance above the mean" (1962, p. 186). Once we adopt the position that the familial mental retardate is not defective or pathological but is essentially a normal individual of low intelligence, then the problem of familial retardation becomes part of the general problem of developmental psychology. In terms of cognitive development, the familial retardate would then be viewed as progressing from one intellectual stage to the next in the same sequence as is encountered in other children. He would, of course, progress from stage to stage at a slower rate than other children, and the final stage that he achieves would be lower than that achieved by the more intelligent members of the population. In terms of cognitive functioning alone, the familial retardate with a chronological age (CA) of 10 and an MA of 7 would be conceptualized as being cognitively similar, that is, at the same developmental level as a child with a CA of 7 and an IQ of 100. (The reader must remember that the MA, which is invariably based on the IQ, can be considered only a very rough indicator of the cognitive or developmental level; however, to date, it represents the most adequate measure available.) To say that two such hypothetical children are cognitively similar is not to assert that they will necessarily behave exactly the same on the intellective and nonintellective tasks with which society confronts them. If nothing else, the retardate is three years older, and if the performance involves lifting weights, or a task that he has encountered much more frequently than the 7-year-old normal child, we would expect the retardate to be superior. Furthermore, performance on even cognitive tasks reflects the wide variety of factors that are the product of the past history of the child rather than his cognitive ability alone. 
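The comparison of the two hypothetical children rests on the arithmetic linking MA, CA, and IQ. A minimal sketch follows, assuming the classical ratio definition IQ = 100 × MA / CA, which the text uses implicitly but does not state. The function names are hypothetical, and the adult CA ceiling of 16 is an assumption suggested by the earlier remark that an MA of 16 is the upper limit for an individual of average IQ.

```python
ADULT_CA_CEILING = 16.0  # assumed ceiling on chronological age in the ratio


def ratio_iq(mental_age, chronological_age, ca_ceiling=ADULT_CA_CEILING):
    """Ratio IQ = 100 * MA / CA, with CA capped at ca_ceiling (assumption)."""
    ca = min(chronological_age, ca_ceiling)
    return 100.0 * mental_age / ca


def mental_age(iq, chronological_age, ca_ceiling=ADULT_CA_CEILING):
    """Invert the ratio: MA = IQ * CA / 100, with CA capped at ca_ceiling."""
    ca = min(chronological_age, ca_ceiling)
    return iq * ca / 100.0


if __name__ == "__main__":
    # The familial retardate of the example: CA 10, MA 7.
    print(ratio_iq(mental_age=7, chronological_age=10))
    # The comparison child: CA 7, IQ 100.
    print(mental_age(iq=100, chronological_age=7))
```

On this definition the retardate with a CA of 10 and an MA of 7 has an IQ of 70, while the child with a CA of 7 and an IQ of 100 also has an MA of 7; the two are matched on MA while differing in IQ and CA, which is the sense of "cognitively similar" used here.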
To the extent that these two children have different histories, have experienced different environments, and have developed different values and motives, we would expect differences in performance. It is no great mystery that a group of children with IQs of 70 and a group with IQs of 100 matched on chronological age differ on a variety of tasks. These children are at different developmental levels, and such differences are exactly what a developmentalist would expect. The mystery is the repeated demonstration that even when groups are matched on MA, the retardate does less well, or at least behaves differently, than the MA-matched "normal" child. Two distinctly different explanations for this phenomenon have been advanced. One view is that these differences reflect a variety of experiential or motivational differences. The second position is that the familial retardate is really not a normal individual developing at a slower rate but is rather an inherently different or abnormal type of organism who, at every level of development, is suffering from some defect in his physiological or cognitive structure. These hypothesized defects are then viewed as producing differences in behavior even in those instances where the MA is equated. In the next section we shall consider the defect orientation, and the motivational position will be discussed further in the final section.
The defect and difference orientation
This section deals with those theoretical and empirical efforts that have advanced the view that all retardates, including those conventionally diagnosed as familial, suffer from some specifiable defect. These efforts are in opposition to the view that the familial retardate suffers from nothing more than a slower and more limited rate of cognitive development. The evidence typically offered by the difference, or defect, theorist is that even when groups of normals and retardates are matched on MA, which grossly controls for differences in the rate of development, the two groups behave differently. This difference in behavior is advanced as proof of the existence of some physiological or cognitive defect which itself is responsible for the slower rate of development. Where the hypothesized defect is an explicitly physiological one, it would appear to be a simple matter to obtain direct validation for the defect's existence. Such evidence would come from biochemical and physiological analyses as well as from pathological studies of familial retardates. A number of such studies have, of course, been carried out. 
Although there is an occasional report of some physical anomaly, the bulk of the evidence has indicated that the familial retardate does not suffer from any gross physiological defects. Indeed, if such evidence were readily available, the defect theorist would give up his reliance on the more ambiguous data provided by studies examining molar behavior. The failure to find direct evidence for the existence of a physiological defect in the familial retarded has not deterred—and probably should not deter—theorists from postulating such defects. In spite of the negative physiological evidence, such workers as Spitz (1963) maintain that all retardates, including familials, are physically defec-
tive and that our failure to discover defects in the familial retarded is due to the relatively primitive nature of our contemporary diagnostic techniques. It is perfectly legitimate for these workers to assert that, although presently not observable, the physical defect that causes familial retardates to behave differently from normals of the same MA will some day be seen. These theorists operate very much like the physicists of a not-too-distant era who asserted that the electron existed, even though it was not directly observable. Analogously, defect theorists in the area of mental retardation validate the existence of a defect by first asserting that its existence should manifest itself in particular phenomena, that is, in particular behaviors of the retarded. They then devise experiments in which, if the predicted behavior is observed, the existence of the hypothesized defect is confirmed. This approach is legitimate and has become increasingly popular. The majority of theories in the area of mental retardation are basically defect theories. It should be noted that these theories differ among themselves. One difference involves the theoretician's effort to relate the postulated defect to some specific physiological structure. The theoretical language of some defect positions is explicitly physiological, that of others is nonphysiological, while that of others has remained extremely vague. Such differences are related to the specific nature of the defect postulated. 
Particular defects that have been attributed to the retarded include the relative impermeability of the boundaries between regions in the cognitive structure (Kounin 1941; Lewin 1926-1933); primary and secondary rigidity caused by subcortical and cortical malformations, respectively (Goldstein 1943); inadequate neural satiation related to brain modifiability or cortical conductivity (Spitz 1963); malfunctioning disinhibitory mechanisms (Siegel & Foshee 1960); improper development of the verbal system resulting in a dissociation between verbal and motor systems (Luria 1963; O'Connor & Hermelin 1959); and the relative brevity in the persistence of the stimulus trace (Ellis 1963).
Luria and verbal mediation theory. Some of the more influential of the defect positions will be examined here, turning first to the position of the Russian investigator, A. R. Luria, whose work has influenced investigators in England and the United States. In respect to Russian efforts it should be noted that, given the political philosophy of the U.S.S.R., workers in the area of mental retardation have no alternative but to accept a defect position. As in the United States, the Russians divide the retarded into three groups, although they use the older terms "idiot," "imbecile," and "debile." However, the generic term for mental retardation is "oligophrenia." The practice, followed in this article, of distinguishing between the group of approximately 25 per cent of retardates having known organic impairments and that larger group having unknown etiologies is simply not permitted by Soviet investigators. Although this section of the article is directed at illuminating differences of opinion concerning this larger group, there is a general consensus in the United States that this type of retardate, which we conventionally classify as familial, is the product of complex genetic determinants and cultural deprivation. In contrast, as the subcommittee of the President's Panel which recently visited the U.S.S.R. has noted (see "Mental Retardation in the Soviet Union" 1964), Soviet philosophy does not accept the view that mental retardation is determined by genetic factors, nor is cultural causation considered a possible explanation. Thus workers in this area attribute all grades of mental retardation to central-nervous-system damage, suggesting that it occurs initially during the intrauterine period or during early childhood and then results in a disturbance of the child's subsequent mental development. It is clear, then, that in the Soviet Union the diagnosis of mental retardation necessarily involves the specification of a defect in some neurophysiological system, and it is noteworthy that professionals, including researchers and teachers, working with the retarded are called "defectologists." Knowledgeable visitors (see "Mental Retardation in the Soviet Union" 1964) have pointed out that, given such an approach, diagnosticians will go to great lengths to "discover" some slight indication of possible organicity. 
However, in observing the pupils in the Soviet schools for the debile, it was apparent that those in attendance were primarily retardates who would be diagnosed as familial in the United States. With rare exceptions, these were the children of unskilled workers and in some instances were actually the children of the graduates of such schools. Consistent with the over-all Russian philosophy, general intelligence tests have been banned since 1936 by the Communist party because such tests are considered to be methods which discriminate against the peasants and the working class in favor of the culturally advantaged. Diagnosis in mental retardation is made by neurologists and psychophysiologists, who rely heavily on gross
pathological signs in the case of the severely retarded and minor physical defects, minute examinations of electroencephalograph (EEG) patterns, and certain qualitative (nonstandardized) tests of perception, conditioning, and concept formation (with special emphasis on the identification of specific types of language disorders) in the case of the more mildly retarded. Luria's efforts and the difficulties they pose for non-Russian workers can only be understood in terms of such an orientation toward mental retardation. In his work on verbal mediation, Luria has demonstrated that the behavior of retardates resembles that of chronologically younger normal children in that the verbal instructions do not result in the smooth regulation of motor behavior. His findings clearly indicate that on all his tasks requiring verbal mediation, the retarded subjects have considerable difficulty. In light of these behavioral data, Luria has inferred that the major defect in the retarded child involves both an underdevelopment or a general "inertness" of the verbal system and a dissociation of this system from the motor or action system. The general effect of this dissociation, vaguely conceptualized as a disturbance in normal cortical activity, is that a verbal response cannot serve as an adequate regulator of voluntary behavior. Unfortunately, it is impossible to utilize Luria's data to throw any light on the issue of whether the cognitive processes of retardates, typically diagnosed as familial, differ from those of normal children of the same MA. As noted earlier, a Russian defectologist would not accept this as a legitimate question. Since there is no concern with the IQ, there is no way to determine the MAs of the retardates and normals compared in Luria's work. Furthermore, the etiological question of whether his retarded subjects are of the physiologically impaired or the familial-cultural type remains unanswered. However, in light of Luria's discussion of "profound atrophic changes . 
. . expressed in the underdevelopment of the complex neuron structures of the first and third strata of the cortex" and his classification of these retardates as imbeciles rather than debiles, it would appear that these subjects probably suffer from gross physiological impairment. It must be concluded, then, that these data have extremely limited relevance to the issue of whether those retardates whom we conventionally classify as familial suffer from some physiological defect. We must therefore look to English and American workers for more adequate tests of the basic proposition that all retardates, including the familial, differ from normals in the degree to which they employ verbal cues in regulating voluntary behavior. For example, O'Connor and Hermelin (1959) have found no significant difference between normals and retardates in the number of trials required to learn a size discrimination. However, the finding that retardates required significantly fewer trials to learn a reversal was interpreted as supporting Luria's position. O'Connor and Hermelin reasoned that on the original learning task the normal child employs both motor and verbal mediational responses in his learning, while the retarded child relies primarily on the motor response. When the reversal is introduced, the normal child must unlearn both the original motor and verbal responses. The retardate, having to unlearn only the motor response, would thus be expected to learn the reversal problem more easily. The findings of O'Connor and Hermelin are troublesome in light of their inconsistency with earlier studies (e.g., Stevenson & Zigler 1957) in which mental retardates were not found to be superior on a discrimination reversal task. In an effort to resolve this discrepancy, Balla and Zigler (1964) ran a reversal-learning study involving several different reversal tasks and different types of retardates at different MA levels. This study provided no support for the Luria position. Milgram and Furth (1963) compared retarded and normal children of the same MA on a series of concept tasks assumed to vary in the degree to which language might facilitate performance. Their findings were consistent with expectations derived from Luria's position. However, in an experiment comparing retardates and normals of the same MA on their ability to employ verbal mediators, Rieber (1964) obtained findings that were inconsistent with those that would be derived from Luria's theory.
It thus appears that the evidence mustered to date in support of Luria's position remains equivocal.

Spitz and cortical satiation theory. Another major defect position is that of Herman Spitz (1963), who has extended the Köhler-Wallach cortical satiation theory to the area of mental retardation. Spitz has argued that all retardates suffer from inadequate neural satiation which is related to brain modifiability or cortical conductivity, and has tested this position by comparing normals and retardates of the same CA. Again it should be noted that no direct physiological evidence has been presented to indicate that familial retardates suffer from inadequate neural or cortical functioning. Furthermore, there is direct physiological evidence (Lashley et al. 1951) which calls into question the validity of the entire Köhler-Wallach position [see GESTALT THEORY]. As in the case of the earlier gestalt workers, Spitz has primarily employed perceptual tasks to test his position. His extensive program of research has now been summarized (Spitz 1963), and any complete review would be beyond the scope of this paper. Spitz's most convincing evidence has been obtained on those perceptual tasks (e.g., figural aftereffects and Necker cube reversals) that are thought to be sensitive to hypothesized cortical satiation effects. The heuristic value of Spitz's position can be seen in his recent efforts to extend his postulates beyond the visual perception area and to employ them to generate specific predictions concerning the phenomena of learning, transposition, generalization, and problem solving. Spitz has noted a number of studies in these various areas which lend credence to his basic position. He has also been quite explicit in noting the limitations of his view. He has pointed out that, contrary to his theory, cortical satiation as measured by his perceptual indexes does not "in general correlate with IQ, but rather only differentiates the average performance of two distinct groups." The extensive overlap between normals and retardates on his tests of satiation led him to conclude that "the satiation variable must be only a very small one in the total complex of intelligent behavior." Spitz has also been appropriately concerned with the fact that the test-retest reliability of the scores of his retardates is not impressive. Furthermore, he has noted that the lack of any correlation of individual scores across certain of his satiation tasks is troublesome for his position. Across modalities and even in the same modality, correlations have been moderate or nonexistent.
In addition to these concerns, Spitz has been sensitive to the issue of how accurately the subject's response, often a verbal report, reflects the perceptual response being investigated. (See Spivack 1963 for a discussion of this problem in respect to research with the retarded.) Adding to these difficulties is the fact that several investigators have now discovered that responses to cognitive and perceptual tasks are influenced by a variety of motivational factors (e.g., Zigler & deLabry 1962; Zigler & Unell 1962). In addition certain aspects of Spitz's work have come in for criticism on the grounds that his findings are inconsistent with those of other investigators.
Spivack (1963) has voiced this concern in a review of research on perceptual processes in the retarded, noting that certain of Spitz's findings "are in marked contrast to the findings of others." Of more importance to the central question of this section is the conclusion that Spitz's data throw little light on the issue of whether familial retardates are inherently different from normals of the same MA. Taking a stand reminiscent of the Russian position, Spitz has argued that the distinction between familial and organic retardates is misleading. In Spitz's view, all retardates suffer from brain damage in the broader sense, and he has argued (see Garrison 1966) that retardates be conceptualized as belonging to a common class. Therefore his work has been characterized by a relative lack of concern with the problem of etiology, and we have little way of assessing whether the differences he reports are a product of gross organic pathology or may actually reflect the cortical phenomena that Spitz postulates. That one finds differences between normals and retardates matched on CA is not very surprising, since we are dealing with groups who are at different developmental levels (as defined by MA). One would be tempted to say that Spitz's work has little relevance to the central issue of this section except for the fact that he has been quite explicit in his view that the differences he obtains are not developmental phenomena but reflect a physical deficit that should manifest itself even in comparisons with MA-matched normals. The Lewin-Kounin formulation. The final defect position that we shall discuss is that of Lewin (1926-1933) and Kounin (1941). This position is different from the other defect views in that the defect postulated is one in the cognitive structure rather than the physical structure of the retardate. 
The Lewin-Kounin formulation has had considerable impact not only on our conceptualization of the retarded but also on the treatment and training practices that have been employed over the years. (For a more complete historical review and critique of the Lewin-Kounin formulation, the reader is referred to Zigler 1962.) In Lewin's general theory the individual is treated as a dynamic system with differences among individuals derivable from a diversity of (1) structure of the total system, (2) material and state of the system, or (3) meaningful content of the system. The first two of these factors play the most important role in Lewin's theory of retardation. Lewin viewed the retarded child as having a less-differentiated cognitive structure, that is, having fewer regions or cells, than a normal child of the
same CA. Thus, in terms of structure, the retarded child resembles a normal younger child. In relation to the material and state of the system, Lewin stated that even though a retarded child corresponded in degree of differentiation to a normal younger child, these children were not to be regarded as entirely similar. He considered "the major dynamic difference between a feebleminded and normal child of the same degree of differentiation to consist in a greater stiffness, a smaller capacity for dynamic rearrangement in the psychical systems of the former." (Degree of differentiation was later operationally defined as MA.) Although Lewin undoubtedly felt that lack of differentiation could lead to rigid behaviors (e.g., pedantry, fixation, stereotypy, inelasticity, perseveration), he was quite clear that this lack of differentiation was not what he meant by rigidity. To Lewin, lack of differentiation referred to the number of regions within the total system, while rigidity was defined in terms of the fluidity between regions. (By rigidity, Lewin was referring to the nature of the boundary between cells in the cognitive structure.) It follows from Lewin's theory that an individual whose system is characterized by either lack of differentiation or rigidity, or both, is more likely to emit behaviors commonly referred to as rigid. The failure to draw a clear distinction between the meaning of rigidity as he employed it and rigid behaviors as such appears to be a major factor leading to the subsequent controversy in the area. The clearest experimental support for the position that familial retarded individuals are more rigid than normal individuals having the same degree of differentiation is contained in the work of Kounin (1941; 1948). Kounin, building upon Lewin's work, advanced the view that rigidity is a positive, monotonic function of CA. 
Again, it is imperative to note that by rigidity Kounin, like Lewin, referred to "that property of a functional boundary which prevents communication between neighboring regions" and not to rigid behaviors as such. Kounin (1941) offered the findings of five experiments in support of his theory. In these experiments he employed three groups, older familial retarded individuals, younger familial retarded individuals, and normals. Noting the inadequacies of Lewin's own experimental efforts, Kounin instituted certain experimental controls. He defined the degree of differentiation as the MA of an individual and controlled for this factor by equating the three groups on MA. He also attempted to reduce what he later referred to as "motivational factors
(such as low success expectation and hesitance to enter unfamiliar regions) that might produce those very types of behavior that are sometimes lumped together in the pseudo-descriptive category of behavioral rigidity" (Kounin 1948). To control for these factors, Kounin attempted to make each subject feel confident and secure in the experimental tasks by having them engage in each of the activities prior to the experiment proper. As Kounin predicted, the three groups differed in certain instruction-initiated tasks (e.g., drawing cats until satiated and then drawing bugs until satiated, lowering a lever in order to release marbles and then raising the lever to release marbles). As predicted from the Lewin-Kounin formulation, the normals showed the greatest amount of transfer effects from task to task, the younger retarded a lesser amount of transfer, and the older retarded the least amount of transfer. That is, on the drawing task the retarded individuals drew longer on the second task following satiation of the first task than did normals, and the older retardates longer than the younger. On the lever-pressing task, the greatest number of errors, that is, lowering rather than raising the lever on part two, were made by the normals, the least number by the older retarded, with the younger retarded falling between these two groups. One should note that on this last task the lesser rigidity, as defined by Lewin and Kounin, of the normals results in a higher incidence of a behavioral response often characterized as rigid (i.e., perseverative responses). Furthermore, this lack of influence of one region upon another in the performance of the retarded would only be predicted in those cases where the retarded individual is "psychologically" placed into a new region by employing an instructional procedure.
In those instances where the individual must, on his own, move from one region to another, the Lewin-Kounin formulation would predict that such movement would be more difficult for the retarded than for the normal individual. This prediction was also confirmed by Kounin in his concept-switching experiment in which the child was asked first to sort a deck of cards, which could be sorted either on the basis of color or form, and then to put the cards together some other way. Here the normals evidenced the least difficulty in shifting, the older retarded the most difficulty, and the younger retarded group again fell between the other two groups. Thus, when a movement to a new region is self-initiated, it is the retarded who evidence the higher incidence of perseverative responses. The Lewin-Kounin theory of rigidity is a conceptually demanding one in that it sometimes predicts a higher and sometimes a lower incidence of "rigid" behaviors in retarded as compared to normal individuals. However, the fact that it generates specific predictions as to when one or the other state of affairs will obtain is a tribute to the theory. Kounin thus offered impressive experimental support for the view that, with MA held constant, the older or more retarded (or both) an individual is, the more will his behaviors be characterized by dynamic rigidity, that is, greater rigidity in the boundaries between regions. This model and its experimental support were so impressive that until fairly recently very few further experimental tests were attempted. However, recent explicit tests of the model (Balla & Zigler 1964; Stevenson & Zigler 1957; Zigler & Unell 1962) have failed to provide support for it. Much evidence now indicates that the differences found by Kounin were not a product of the inherent rigidity of retardates, but rather reflected a number of motivational differences between normal children and institutionalized retardates of the same MA. These motivational factors will be discussed in the final section.

Motivational and emotional factors

A recurring theme in the present article has been the importance of a variety of nonintellective factors as determinants of the level at which the retarded functions. We shall never comprehend the behavior of the retarded if we assume that every behavior he manifests is the immutable product of his low intelligence. Furthermore, we must go beyond the overly simplistic theories that have been advanced, such as the view that all retardates manifest a highly similar pattern of behavior which is determined by their common defect. Indeed, a striking feature encountered when groups of retardates are observed is the variety of behavior patterns displayed. Clearly, we are not dealing with a homogeneous group of simple organisms.
Once we concern ourselves with the total behavior of the retarded child, we find him an extremely complex psychological system. To the extent that his behavior deviates from the norms associated with his MA, he is even more difficult to understand than the normal individual. It is unfortunate that so little work emanating from a personality point of view has been done with the retarded. Some progress has been made, however, and much of the recent work supports the view that it is not necessary to employ constructs other than those used to account for the behavior of normal individuals in explaining the
behavior of the familial retarded. It appears that many of the reported differences between retardates and normals of the same MA are a result of motivational and emotional differences which reflect differences in environmental histories and are not a function of innate deficiencies. That personality factors are as important in the retardate's adjustment as are intellective factors has been noted (e.g., Penrose 1949; see also Windle 1962 for an especially comprehensive review of the importance of nonintellective factors in the prognosis of mental retardation). Many of the early workers in this country felt that the difference between social adequacy and inadequacy in that large group of borderline retardates was a matter of personality and character rather than intelligence. A number of studies have confirmed this view (see Windle 1962). Perhaps the best of these is the comprehensive study by Weaver (1946) of the adjustment of 8,000 retardates inducted into the U.S. Army, most of whom had IQs below 75. Of the total group, 56 per cent of the males and 62 per cent of the females made a satisfactory adjustment to military life. The median IQs of the successful and unsuccessful groups were 72 and 68 respectively. Weaver concluded that "personality factors far overshadowed the factor of intelligence in the adjustment of the retarded to military service." This tendency to overemphasize the importance of the intellect in adjustment has been made clear by Windle (1962). On the basis of a survey, he found that most institutions presume that intelligence is the critical factor in adjustment after release. Windle goes on to point out that the vast majority of studies (over 20) on outcome "after release from institutions have reported no relation between intellectual level and later adjustment." 
In examining this literature we find that the factors which led to poor social adjustment include anxiety, jealousy, overdependency, poor self-evaluation, hostility, hyperactivity, and failure to follow orders even when requests were well within the range of intellectual competence. It is hardly surprising that retardates evidence such difficulties in light of their atypical social histories. The specific atypical features of their socialization histories and the extent to which they are atypical may vary from child to child. Two sets of parents who are themselves familially retarded may provide quite different socialization histories for their children. At one extreme we may find a familially retarded child who grows up in an abysmal home environment and who is ultimately institutionalized, not because of lack of intelligence but
because his own home represents such a poor environment. That many borderline retardates are institutionalized for just such reasons has been confirmed by Kaplun (1935) in a study of 642 high-grade retardates; Zigler's recent finding (1961) that a positive relationship exists between the institutionalized familial retardate's IQ and the amount of preinstitutional deprivation he experienced provides further support for this claim. This latter finding does not indicate that social deprivation produces greater intelligence but rather that our institutions contain borderline retardates who would not be institutionalized except for their extremely poor home environments. At the other extreme, the familially retarded set of parents of an institutionalized child may have provided him with a relatively normal home, even though it might differ in certain important respects (e.g., values, goals, and attitudes) from the typical home in which the families are of average or superior intelligence. In the first example the child not only experiences a quite different socialization history while still living with his parents, but he also differs from the child in the second situation to the extent that institutionalization affects his personality structure. Given the penchant of many investigators for comparing institutionalized retardates with children of average intellect who live at home, the factor of institutionalization becomes an extremely important one. One cannot help but wonder how many differences discovered in such comparisons reflect some cognitive aspect of mental retardation as opposed to the effects of institutionalization, the factors that led to the child's institutionalization, or some complex interaction between these factors and institutionalization. To add even more complexity, the socialization histories of both institutionalized and noninstitutionalized familial retardates differ markedly from the history of the brain-damaged retardates.
The brain damaged do not show the same gross differences in the frequency of good versus poor environments as do familials. In the face of such complexity, we need not consider the problem unassailable, nor need we assert that each retarded child is unique and that it is therefore impossible for us to isolate the ontogenesis of those factors which we feel are important in influencing the retardate's level of functioning. Once we conceptualize the retardate as occupying a position on a continuum of normality, we can allow our knowledge of normal development to give direction to our efforts. This does not mean that we ignore the importance of the lowered intelligence per se, since personality traits and behavior patterns do not develop in a vacuum. However, in some instances the personality characteristics of the retarded will reflect environmental factors that have little or nothing to do with intellectual endowment. For example, many of the effects of institutionalization may be constant regardless of the person's intelligence level. In other instances, we must think in terms of an interaction; that is, given his lowered intellectual ability, a person will have certain experiences and develop certain behavior patterns differing from those of a person with greater intellectual endowment. An obvious example is the greater amount of failure which the retardate typically experiences. But again what must be emphasized is that the behavior pattern developed by the retardate as a result of such a history of failure will not differ in kind or ontogenesis from those developed by an individual of normal intellect who, by some environmental circumstance, also experiences an inordinate amount of failure. By the same token, if the retardate can somehow be guaranteed a more typical history of success, we would expect his behavior to be more normal, independent of his intellectual level. Within this framework, the author will discuss the personality factors which have been known to influence the performance of the retarded. Caution is needed in evaluating the role of motivational and emotional factors in the performance of the retarded. Performance on a task is most appropriately conceptualized as a function of two types of factors, intellective (i.e., cognitive) and nonintellective (i.e., motivational). The contribution of each factor will vary with the nature of the task. Motivational factors will more readily influence a perseveration task (e.g., how long a retardate will continue to put marbles into a box) than they will a discrimination-learning or concept-formation task.
It has been demonstrated that the performance of retardates on tasks of the latter type is also influenced by motivational factors (Butterfield & Zigler 1965a; Zigler & deLabry 1962), but this should not be interpreted as evidence that basic intellectual capacity has been changed. Rather, these demonstrations suggest ways in which one may help the mentally retarded to utilize their intellectual capacity optimally. As such, they should not be viewed as manipulations which can make the retarded "normal" in their intellectual functioning.

Anxiety. Considerable evidence has now been collected indicating the importance of anxiety on performance for a wide variety of tasks (Taylor 1956). The attenuating effects of anxiety on performance appear to be a function of both the task-irrelevant defensive responses employed by the person to alleviate his anxiety and the drive features of anxiety itself. The drive approach to anxiety (Taylor 1956), which has received considerable confirmation, conceptualizes high anxiety as beneficial on extremely nondemanding tasks (e.g., classical eyelid conditioning) but detrimental on complex tasks where a variety of responses are available to the person. The higher anxiety level of retardates, as compared to normals, has now been noted by several investigators (e.g., Garfield 1963) who have either demonstrated or suggested that the heightened anxiety level of retardates could well have produced certain of the differences between retardates and MA-control normals reported in the literature. Work with retardates that has either focused on anxiety or raised the anxiety issue in a post hoc manner is of considerable value "in that it applies concepts and techniques to the study of retarded individuals, which for the most part had not been applied or seen as relevant for this group" (Garfield 1963, p. 594). [See ANXIETY.] The facts that anxiety level affects the performance of retardates much as that of normals and that retardates might have higher levels of anxiety than normals tell us little about the ontogenesis of anxiety in retardates. To understand their atypical anxiety levels, we must examine the relatively atypical experiences of the retarded, as well as a variety of other motivational states which influence their performance.

Social deprivation. It has now become increasingly clear that our understanding of the performance of the institutionalized familial retardates will be enhanced if we consider the inordinate amount of preinstitutional social deprivation they have experienced (Clarke & Clarke 1954; Kaplun 1935; Zigler 1961).
A series of recent studies (Green & Zigler 1962; Zigler 1961; Zigler & Williams 1963) has indicated that one result of such early deprivation is a heightened motivation to interact with a supportive adult. (In the process of conducting these studies, a social deprivation scale was constructed which promises to bring some added objectivity to the social deprivation concept.) These studies suggest that, given this heightened motivation, retardates exhibit considerable compliance with instructions when the effect of such compliance is to increase or maintain the social interaction with the adult. Compliance is apparently reduced in those instances where it leads to terminating the interaction. It now appears that the perseveration so frequently noted in the behavior of the retarded is
primarily a function of this motivational factor rather than the inherent cognitive rigidity suggested by Lewin (1926-1933) and Kounin (1941). Evidence on this latter point comes from findings indicating that (1) the degree of perseveration is directly related to the degree of preinstitutional deprivation experienced (Zigler 1961) and (2) institutionalized children of normal intellect are just as perseverative as institutionalized retardates, while noninstitutionalized retardates are no more perseverative than noninstitutionalized children of normal intellect (Green & Zigler 1962). The finding that institutionalization (or the social history factors leading to institutionalization) is the crucial factor in determining the child's response to social reinforcement on a simple task has also been found by Stevenson and Fahel (1961). The heightened motivation to interact with an adult, stemming from a history of social deprivation, would appear to be consistent with the often-made observation of certain behaviors in the retarded, such as seeking attention and wishing for affection (Doll 1962). It is impossible to place too much emphasis on the role of overdependency in the institutional familial retarded and on the socialization histories that give rise to such overdependency. Given some minimal intellectual level, the shift from dependence to independence is perhaps the single most important factor necessary for the retardate to become a self-sustaining member of our society. It appears that the institutionalized retardate must satisfy certain affectional needs before he can cope with problems in a manner characterized by individuals whose affectional needs have been relatively satiated. These affectional needs can best be viewed as ones which often interfere with certain problem-solving activities. 
Because the retardate is highly motivated to satisfy such needs through maximizing interpersonal contact, he is relatively unconcerned with the specific solution to these problems. Of course the two goals will not always be incompatible, but in many instances they will be. Some evidence that this attenuating aspect of retarded behavior can be overcome has been presented by McKinney and Keele (1963), who found improvement in a variety of behaviors in the mentally retarded following an experience of increased mothering. Zigler and Williams (1963) have provided some evidence on the interaction between preinstitutional social deprivation and institutionalization in influencing the child's motivation for social interaction and support. It was found that although institutionalization generally increased this motivation, it was increased much more for children
coming from relatively nondeprived homes than for those coming from more socially deprived backgrounds. Change in IQ scores. An unexpected finding of the Zigler and Williams study was that a general decrease in the IQs of retardates was discovered between the administration of two IQ tests, the first of which occurred at the time of admission five years prior to this follow-up study. This change in IQ, discovered in the context of a study employing the amount of preinstitutional social deprivation as an independent variable, is reminiscent of a finding by Clarke and Clarke (1954). These investigators found that changes in the IQs of retardates following institutionalization were related to their preinstitutional histories. They discovered that children coming from extremely poor homes showed an increase in IQ which was not observed in children coming from relatively good homes. Zigler and Williams, however, found that the magnitude of the IQ change in their subjects was not significantly related to preinstitutional deprivation. Although this finding appears inconsistent with that of Clarke and Clarke, it should be noted that some support for a relationship was suggested, since the only subjects in the Zigler and Williams study who evidenced an increase in IQ were the highly deprived group. The failure of Zigler and Williams to replicate the findings of Clarke and Clarke may be due to two factors: the subjects used by Clarke and Clarke were older and had been institutionalized at a later age than the retardates employed by Zigler and Williams and the IQ changes reported by Clarke and Clarke took place during two years of institutionalization, while the IQ changes reported in the Zigler and Williams study were based on five years of institutionalization. This latter factor becomes increasingly important in view of E. C. 
Jones and Carr-Saunders' finding (1927) that normal institutionalized children show an increase in IQ early in institutionalization and then a decrease in IQ with longer institutionalization. The work of Clarke and Clarke, Jones and Carr-Saunders, and others, dealing with changes in IQ following institutionalization has given central importance to the degree of intellectual stimulation provided by the institution in contrast to that provided by the original home. This orientation suggests that it is the actual intellectual potential of the person which is altered. The Zigler and Williams study, however, suggests that the change in IQ reflects a change in the child's motivation for social interaction. That is, as social deprivation, resulting from increased length of institutionalization, increases, the desire to interact with the adult experimenter increases. Thus, for the deprived child the desire to be correct must compete in the testing situation with the desire to increase the amount of social interaction. This argument would appear to provide the conceptual framework for Clarke and Clarke's finding that highly deprived subjects evidence an increase in IQ with relatively short institutionalization, while the less deprived subjects demonstrate no greater increase than a test-retest control group. One would further expect that with continued institutionalization all children would exhibit a decrease in IQ, the phenomenon found by Jones and Carr-Saunders (1927) and one that appears in the Zigler and Williams study. Direct support for this view comes from the finding in the Zigler and Williams study of a positive relationship between the magnitude of the decrease in IQ and the child's motivation for social interaction. It should be noted that the Jones and Carr-Saunders (1927) study involved institutionalized children of approximately average intellect, thus indicating that the dynamics under discussion here are the same for both normal and retarded children. Furthermore, the position advanced here is quite consistent with the findings for normal children obtained by Barrett and Koch (1930); these investigators found that the greatest increase in IQ was obtained by children who showed the greatest improvement in their personality traits or by children who evidenced a marked change in the nature of their relationship with the examiner. Conversely, what must be emphasized in respect to lowered IQs is not the lowered test scores per se but rather that the factors which attenuate these test scores will, in all probability, reduce the adequacy of many problem-solving behaviors performed in a social situation.
Although there is considerable observational and experimental evidence that social deprivation results in a heightened motivation to interact with a supportive adult, it appears to have other effects as well. Again, the nature of these effects is suggested in those observations of the retarded that have emphasized their fearfulness, wariness or avoidance of strangers, or their suspicion and mistrust. The experimental work done by Zigler and his associates on the behavior of the institutionalized retarded has indicated that social deprivation results in both a heightened motivation to interact with supportive adults (positive-reaction tendency) and a reluctance and wariness to do so (negative-reaction tendency). That both of these tendencies are influenced by the quality of past relationships with adults and are amenable to experimental manipulations has been demonstrated in recent studies employing both normal and retarded children (e.g., Shallenberger & Zigler 1961). However, little work has been done on the range of behaviors that might be influenced by the negative-reaction tendency.
Failure and performance. Another factor frequently mentioned as a determinant in the performance of the retarded is their high expectancy of failure (Cromwell 1963). This failure expectancy has been viewed as an outgrowth of a lifetime characterized by frequent confrontations with tasks with which the retarded are intellectually ill-equipped to deal. That failure experiences and the failure expectancies to which they give rise affect a wide variety of behaviors in the intellectually normal has now been amply documented. Of special interest to workers in the area of mental retardation is Lantz's finding (1945) that a relatively simple failure experience prevented children from profiting by practice which ordinarily leads to improvement on intelligence-test scores [see ACHIEVEMENT MOTIVATION]. The results of experimental work employing the success-failure dimensions with retardates are still somewhat inconsistent. The work of Cromwell and his students (reviewed in Cromwell 1963) has lent support to the general proposition that retardates have a higher expectancy of failure than do normals. This results in a style of problem solving for the retardate which causes him to be much more motivated to avoid failure than to achieve success. However, the inconsistent research findings suggest that this fairly simple proposition is in need of some further refinement. One investigator found that retardates performed better following success and poorer following failure as compared to a control group. 
Another investigator (Heber 1957) found that the performances of normals and retardates were equally enhanced following both a failure and a success condition, although in the success condition the performance of retardates was enhanced more than that of normals. Conversely, Kass and Stevenson (1961) found that success enhanced the performance of normals more than that of retardates. Another study also found that failure had a general enhancing effect for both normals and retardates but that failure enhanced the performance of normals more than that of retardates (Gardner 1958). In a recent study by Butterfield and Zigler (1965a), one factor which may have produced this type of inconsistency was isolated. These investigators found that both normal and retarded children reacted differentially to success and failure experiences as a
function of their responsivity to adults, that is, their desire to gain an adult's support and approval. The nature of the difference between normals and retardates in their reaction to success or failure experiences appeared to be determined by this desire for approval. Among high-responsive subjects, failure, as compared to success, attenuated the performance of retarded subjects while improving the performance of normal subjects. Among low-responsive subjects, failure, as compared to success, attenuated the performance of normals while improving the performance of retardates. Debilitating effects of prolonged failure on the performance of the retarded have been found by Zeaman and House (1962). These investigators discovered that following such failure, retardates were unable to solve a simple problem, although they had previously been able to do so. Assuming a failure set in retardates, Stevenson and Zigler (1958) confirmed the prediction that retardates would be more willing to "settle for" a lower degree of success than would normal children of the same MA. The fear of failure in the mentally retarded also appears to be an important factor in differences that have been found between normals' and retardates' achievement motivation (Jordan & DeCharms 1959). Recent studies (Green & Zigler 1962; Turnure & Zigler 1964) have indicated that the high incidence of failure experienced by retardates generates a cognitive style of problem solving characterized by outer-directedness. That is, the retarded child comes to distrust his own solutions to problems and therefore seeks guides to action in the immediate environment. This outer-directedness may explain the great suggestibility so frequently attributed to the retarded child. 
Evidence has now been presented indicating that, compared to normals of the same MA, the retarded child is more sensitive to verbal cues given by an adult, is more imitative of the behaviors of both adults and peers, and engages in more visual scanning. Furthermore, certain findings (Green & Zigler 1962) suggest that the noninstitutionalized retardate is more outer-directed in his problem solving than the institutionalized retardate. This makes considerable sense if one remembers that the noninstitutionalized retardate does not reside in an environment adjusted to his intellectual shortcomings and should therefore experience more failure than the institutionalized retardate. Turnure and Zigler (1964) have suggested that the distractibility so frequently encountered in the retarded reflects, in part, this outer-directed style of problem solving. This interpretation is of particular interest, since distractibility has often been viewed as a neurophysiologically determined characteristic of the retarded rather than the reflection of a style of problem solving emanating from the particular experiential histories of such children. Work on the outer-directedness of the retarded also appears related to the locus of control work done by Cromwell and his associates (Cromwell 1963). These investigators found that retardates, as compared to normals, manifest an external locus of control, that is, they attribute certain events caused by their own behavior to outside forces over which they have little control. (This internal-control versus external-control dimension has been employed by Cromwell to bring some further order to the inconsistent findings in the success-failure literature.)
The reinforcer hierarchy. Another nonintellective factor important in understanding the behavior of the retarded is the retardate's motivation under various types of incentives. That performance by normals and retardates on a variety of tasks is influenced by the nature of the incentive is certainly well documented. The social deprivation work discussed earlier in this section indicates that retardates have an extremely high motivation for attention, praise, and encouragement. Several investigators (e.g., Cromwell 1963; Zigler 1963) have suggested that in normal development the effectiveness of attention and praise as reinforcers diminishes with maturity and is replaced by the reinforcement inherent in the information that one is correct. This latter type of reinforcer appears to serve primarily as a cue for the administration of self-reinforcement [see LEARNING, article on REINFORCEMENT]. Zigler and his associates (Zigler 1962; Zigler & deLabry 1962; Zigler & Unell 1962) have argued that a variety of experiential factors in the history of the retarded cause them to be less motivated to be correct for the sake of correctness than normals of the same MA. 
Stated somewhat differently, these investigators have argued that the positions of various reinforcers in the reinforcer hierarchies of normal and retarded children of the same MA differ. To date, the experimental work of this group has centered on the reinforcement which inheres in being correct. It is this reinforcer that is the most frequently dispensed, immediate incentive in most real-life tasks. Furthermore, it is a frequently used incentive in many experimental cognitive and perceptual tasks on which retardates and normals are compared, and it also seems to be the most important incentive in the typical test situation. When such an incentive is employed in experimental studies, one wonders how many of the differences
found are attributable to differences in capacity between retardates and normals rather than to differences in performance which result from the different values that such incentives might have for the two types of subject. Clearest support for the view that the retardate is much less motivated to be correct than is the middle-class child, so typically used in comparisons with the retarded, is contained in a study by Zigler and deLabry (1962). These investigators tested middle-class, lower-class, and retarded children equated on MA on a concept-switching task (Kounin 1941) under two conditions of reinforcement. In the first condition, similar to that employed by Kounin, the only reinforcement dispensed was the information that the child was correct. In the second condition, the child was rewarded with a toy of his choice if he switched from one concept to another. In the "correct" condition these investigators found, as Kounin did, that retardates were poorer in their concept switching than were middle-class children. That this was not a simple matter of cognitive rigidity was indicated by the finding that lower-class children equated with the middle-class children on MA were also inferior to the middle-class children. In the toy condition this inferiority disappeared, and retarded and lower-class children performed as well as the middle-class children. This study highlights an assumption that has been noted as erroneous by many educators; namely, that the lower-class child and the retarded child are motivated by the same incentives that motivate the typical middle-class child. An intriguing avenue of further research is the degree to which the position of various reinforcers in the hierarchy can be manipulated.
General effects of institutionalization. No discussion of motivational factors in the performance of the retarded would be complete without some mention of the effects of institutionalization. 
The institutionalization variable has probably contaminated more research in the area of mental retardation than any other single variable. Given our general lack of knowledge concerning the effects of institutionalization on human behavior, the extent of this contamination cannot be determined. That the effects of institutionalization on the behavior of retardates are considerable has been suggested by several investigators (e.g., McCandless 1964; Windle 1962). In view of the general consensus concerning the importance of institutionalization, it is amazing that more work has not been done to investigate its effects on retarded children. Some fairly clear findings with retardates have demonstrated that institutionalization causes a
decrement in the quality of language behavior (Lyle 1959), reduces the level of abstraction on vocabulary tests (Badt 1958), interferes with the ability to conceptualize an emotional continuum (Iscoe & McCann 1965), and increases the child's orientation toward punishment (Abel 1941). These studies, though suggestive, have shed little light on the specific aspects of institutionalization which affect such behaviors or on the exact nature of the process through which behaviors are affected. Whether the deficiencies in the behavior of the institutionalized retardate are motivational in nature or reflect an actual change in intellectual capacity is still an open question. Evidence that institutions for the retarded differ in their effects on behavior has recently been reported by Butterfield and Zigler (1965b). It was found that children residing in a cold, restrictive institution showed a higher motivation for adult support and approval than children residing in an institution having a warm, accepting social climate. These investigators are presently conducting a cross-institutional longitudinal study of six state schools for the retarded in an effort to isolate the institutional factors and psychological processes underlying such effects. Much of this work on motivational and emotional factors in the performance of the retarded is very recent. The research conducted on several of the factors discussed in this section is more suggestive than definitive. It is clear, however, that these factors are extremely important in determining the retardate's general level of functioning. Furthermore, these factors seem much more open to environmental manipulation than do the cognitive processes discussed earlier. 
An increase in knowledge concerning motivational and emotional factors and their ontogenesis and manipulation holds considerable promise for alleviating much of the social ineffectiveness displayed by that sizable group of persons who must function at a relatively low intellectual level.
EDWARD ZIGLER
[Directly related are the entries ACHIEVEMENT TESTING; INTELLIGENCE AND INTELLIGENCE TESTING; other relevant material may be found in CREATIVITY, article on GENIUS AND ABILITY; INFANCY, article on THE EFFECTS OF EARLY EXPERIENCE; SYSTEMS ANALYSIS, article on PSYCHOLOGICAL SYSTEMS; THINKING; and in the biographies of BINET and MONTESSORI.]
BIBLIOGRAPHY
ABEL, THEODORA M. 1941 Moral Judgments Among Subnormals. Journal of Abnormal and Social Psychology 36:378-392.
BADT, MARGIT 1958 Levels of Abstraction in Vocabulary Definitions of Mentally Retarded School Children. American Journal of Mental Deficiency 63:241-246.
BALLA, DAVID; and ZIGLER, EDWARD 1964 Discrimination and Switching Learning in Normal, Familial Retarded, and Organic Retarded Children. Journal of Abnormal and Social Psychology 69:664-669.
BARRETT, HELEN E.; and KOCH, HELEN L. 1930 The Effect of Nursery-school Training Upon the Mental Test Performance of a Group of Orphanage Children. Journal of Genetic Psychology 37:102-122.
BURT, CYRIL; and HOWARD, MARGARET 1956 The Multifactorial Theory of Inheritance and Its Application to Intelligence. British Journal of Statistical Psychology 9:95-130.
BUTTERFIELD, EARL C.; and ZIGLER, EDWARD 1965a The Effects of Success and Failure on the Discrimination Learning of Normal and Retarded Children. Journal of Abnormal Psychology 70:25-31.
BUTTERFIELD, EARL C.; and ZIGLER, EDWARD 1965b The Influence of Differing Institutional Social Climates on the Effectiveness of Social Reinforcement in the Mentally Retarded. American Journal of Mental Deficiency 70:48-56.
CLARKE, A. D.; and CLARKE, A. M. 1954 Cognitive Changes in the Feebleminded. British Journal of Psychology 45:173-179.
CROMWELL, RUE L. 1963 A Social Learning Approach to Mental Retardation. Pages 41-91 in Norman R. Ellis (editor), Handbook of Mental Deficiency: Psychological Theory and Research. New York: McGraw-Hill.
DOLL, EUGENE E. 1962 A Historical Survey of Research and Management of Mental Retardation in the United States. Pages 21-68 in E. Philip Trapp and Philip Himelstein (editors), Readings on the Exceptional Child. New York: Appleton.
ELLIS, NORMAN R. 1963 The Stimulus Trace and Behavioral Inadequacy. Pages 134-158 in Norman R. Ellis (editor), Handbook of Mental Deficiency: Psychological Theory and Research. New York: McGraw-Hill.
FERGUSON, GEORGE A. 1956 On Transfer and the Abilities of Man. Canadian Journal of Psychology 10:121-131.
FREMMING, KURT H. (1947) 1951 The Expectation of Mental Infirmity in a Sample of the Danish Population: (Based on a Biographical Investigation of 5,500 Persons Born in the Years 1883-1887). London: Eugenics Society. → First published in Danish.
GARDNER, WILLIAM I. 1958 Reactions of Intellectually Normal and Retarded Boys After Experimentally Induced Failure: A Social Learning Theory Interpretation. Ann Arbor, Mich.: University Microfilms.
GARFIELD, SOL L. 1963 Abnormal Behavior and Mental Deficiency. Pages 574-601 in Norman R. Ellis (editor), Handbook of Mental Deficiency: Psychological Theory and Research. New York: McGraw-Hill.
GARRISON, M. (editor) 1966 Cognitive Models and Development in Mental Retardation. American Journal of Mental Deficiency 70, no. 4 (Monograph Supplement).
GOLDSTEIN, KURT 1943 Concerning Rigidity. Character and Personality 11:209-226.
GREEN, CALVIN; and ZIGLER, EDWARD 1962 Social Deprivation and the Performance of Retarded and Normal Children on a Satiation Type Task. Child Development 33:499-508.
HEBB, DONALD O. 1949 The Organization of Behavior: A Neuropsychological Theory. New York: Wiley.
HEBER, RICK F. 1957 Expectancy and Expectancy Changes in Normal and Mentally Retarded Boys. Ann Arbor, Mich.: University Microfilms.
HEBER, RICK F. 1959 A Manual of Terminology and Classification in Mental Retardation. American Journal of Mental Deficiency 64, no. 2 (Monograph Supplement).
HEBER, RICK F. 1962 Mental Retardation: Concept and Classification. Pages 69-81 in E. Philip Trapp and Philip Himelstein (editors), Readings on the Exceptional Child. New York: Appleton.
HIRSCH, JERRY 1963 Behavior Genetics and Individuality Understood. Science 142:1436-1442.
ISCOE, IRA; and MCCANN, BRIAN 1965 Perception of an Emotional Continuum by Older and Younger Mental Retardates. Journal of Personality and Social Psychology 1:383-385.
JERVIS, GEORGE A. 1959 The Mental Deficiencies. Volume 2, pages 1289-1313 in American Handbook of Psychiatry. Edited by Silvano Arieti. New York: Basic Books. → See especially Table 1 on page 1290.
JONES, E. CARADOG; and CARR-SAUNDERS, A. M. 1927 The Relation Between Intelligence and Social Status Among Orphan Children. British Journal of Psychology 17:343-364.
JONES, HAROLD E. (1946) 1954 The Environment and Mental Development. Pages 631-696 in Leonard Carmichael (editor), Manual of Child Psychology. New York: Wiley.
JORDAN, THOMAS E.; and DECHARMS, RICHARD 1959 The Achievement Motive in Normal and Mentally Retarded Children. American Journal of Mental Deficiency 64:457-466.
KAPLUN, DAVID 1935 The High-grade Moron: A Study of Institutional Admissions Over a Ten Year Period. Journal of Psychoasthenics 40:69-91. → Now called the American Journal of Mental Deficiency.
KASS, NORMAN; and STEVENSON, HAROLD W. 1961 The Effect of Pretraining Reinforcement Conditions on Learning by Normal and Retarded Children. American Journal of Mental Deficiency 66:76-80.
KOUNIN, JACOB S. 1941 Experimental Studies of Rigidity. Character and Personality 9:251-282. → Part 1: The Measurement of Rigidity in Normal and Feebleminded Persons. Part 2: The Explanatory Power of the Concept of Rigidity as Applied to Feeble-mindedness.
KOUNIN, JACOB S. 1948 The Meaning of Rigidity: A Reply to Heinz Werner. Psychological Review 55:157-166.
LANTZ, BEATRICE 1945 Some Dynamic Aspects of Success and Failure. Psychological Monographs 59, no. 1; Serial no. 271.
LASHLEY, K. S.; CHOW, K. L.; and SEMMES, JOSEPHINE 1951 An Examination of the Electrical Field Theory of Cerebral Integration. Psychological Review 58:123-136.
LAURENDEAU, MONIQUE; and PINARD, ADRIEN 1963 Causal Thinking in the Child: A Genetic and Experimental Approach. New York: International Universities Press.
LEAHY, ALICE M. 1935 Nature-Nurture and Intelligence. Genetic Psychology Monographs 17:236-308.
LEWIN, KURT (1926-1933) 1935 A Dynamic Theory of Personality: Selected Papers. New York: McGraw-Hill.
LURIA, A. R. 1963 Psychological Studies of Mental Deficiency in the Soviet Union. Pages 353-387 in Norman R. Ellis (editor), Handbook of Mental Deficiency: Psychological Theory and Research. New York: McGraw-Hill.
LYLE, J. G. 1959 The Effect of an Institutional Environment Upon the Verbal Development in Imbecile Children. 1: Verbal Intelligence. Journal of Mental Deficiency Research 3:122-128.
MCCANDLESS, BOYD R. 1964 Relation of Environmental Factors to Intellectual Functioning. Pages 175-213 in Harvey A. Stevens and Rick F. Heber (editors), Mental Retardation: A Review of Research. Univ. of Chicago Press.
MCCLEARN, GERALD E. 1962 The Inheritance of Behavior. Pages 144-252 in Leo Postman (editor), Psychology in the Making. New York: Knopf.
MCKINNEY, JOHN P.; and KEELE, TINA 1963 Effects of Increased Mothering on the Behavior of Severely Retarded Boys. American Journal of Mental Deficiency 67:556-562.
MAHER, BRENDAN A. 1963 Intelligence and Brain Damage. Pages 224-252 in Norman R. Ellis (editor), Handbook of Mental Deficiency: Psychological Theory and Research. New York: McGraw-Hill.
Mental Retardation in the Soviet Union. 1964 Canada's Mental Health Supplement no. 42.
MILGRAM, NORMAN A.; and FURTH, HANS G. 1963 The Influence of Language on Concept Attainment in Educable Retarded Children. American Journal of Mental Deficiency 67:733-739.
O'CONNOR, N.; and HERMELIN, B. 1959 Discrimination and Reversal Learning in Imbeciles. Journal of Abnormal and Social Psychology 59:409-413.
PENROSE, LIONEL S. (1949) 1963 The Biology of Mental Defect. 3d ed. London: Sidgwick & Jackson.
PHILLIPS, LESLIE; and ZIGLER, EDWARD 1964 Role Orientation, the Action-Thought Dimension, and Outcome in Psychiatric Disorder. Journal of Abnormal and Social Psychology 68:381-389.
PIAGET, JEAN (1932) 1948 The Moral Judgment of the Child. Glencoe, Ill.: Free Press. → First published in French.
RIEBER, MORTON 1964 Verbal Mediation in Normal and Retarded Children. American Journal of Mental Deficiency 68:634-641.
SHALLENBERGER, PATRICIA; and ZIGLER, EDWARD 1961 Rigidity, Negative Reaction Tendencies, and Cosatiation Effects in Normal and Feebleminded Children. Journal of Abnormal and Social Psychology 63:20-26.
SIEGEL, PAUL S.; and FOSHEE, JAMES G. 1960 Molar Variability in the Mentally Defective. Journal of Abnormal and Social Psychology 61:141-143.
SPITZ, HERMAN H. 1963 Field Theory in Mental Deficiency. Pages 11-40 in Norman R. Ellis (editor), Handbook of Mental Deficiency: Psychological Theory and Research. New York: McGraw-Hill.
SPIVACK, GEORGE 1963 Perceptual Processes. Pages 480-511 in Norman R. Ellis (editor), Handbook of Mental Deficiency: Psychological Theory and Research. New York: McGraw-Hill.
STEVENSON, HAROLD W.; and FAHEL, LEILA 1961 The Effect of Social Reinforcement on the Performance of Institutionalized and Noninstitutionalized Normal and Feebleminded Children. Journal of Personality 29:136-147.
STEVENSON, HAROLD W.; and ZIGLER, EDWARD 1957 Discrimination Learning and Rigidity in Normal and Feebleminded Individuals. Journal of Personality 25:699-711.
STEVENSON, HAROLD W.; and ZIGLER, EDWARD 1958 Probability Learning in Children. Journal of Experimental Psychology 56:185-192.
TAYLOR, JANET A. (1956) 1963 Drive Theory and Manifest Anxiety. Pages 205-222 in Martha T. Mednick and Sarnoff A. Mednick (editors), Research in Personality. New York: Holt. → First published in the Psychological Bulletin.
THORPE, LOUIS P. (1946) 1955 Child Psychology and Development. 2d ed. New York: Ronald Press.
TURNURE, JAMES; and ZIGLER, EDWARD 1964 Outer-directedness in the Problem Solving of Normal and Retarded Children. Journal of Abnormal and Social Psychology 69:427-436.
U.S. PRESIDENT'S PANEL ON MENTAL RETARDATION 1963 A Proposed Program for National Action to Combat Mental Retardation. Washington: Government Printing Office.
WEAVER, THOMAS R. 1946 The Incidence of Maladjustment Among Mental Defectives in Military Environment. American Journal of Mental Deficiency 51:238-246.
WHEELER, LESTER R. 1942 A Comparative Study of the Intelligence of East Tennessee Mountain Children. Journal of Educational Psychology 33:321-334.
WHITNEY, E. ARTHUR 1956 Mental Deficiency: 1955. American Journal of Mental Deficiency 60:676-683.
WINDLE, CHARLES 1962 Prognosis of Mental Subnormals. American Journal of Mental Deficiency 66, no. 5 (Monograph Supplement).
ZEAMAN, DAVID; and HOUSE, BETTY J. 1962 Approach and Avoidance in the Discrimination Learning of Retardates. Child Development 33:355-372.
ZIGLER, EDWARD 1961 Social Deprivation and Rigidity in the Performance of Feebleminded Children. Journal of Abnormal and Social Psychology 62:413-421.
ZIGLER, EDWARD 1962 Rigidity in the Feebleminded. Pages 141-162 in E. Philip Trapp and Philip Himelstein (editors), Readings on the Exceptional Child. New York: Appleton.
ZIGLER, EDWARD 1963 Social Reinforcement, Environmental Conditions and the Child. American Journal of Orthopsychiatry 33:614-623.
ZIGLER, EDWARD; and DELABRY, JACQUES 1962 Concept-switching in Middle-class, Lower-class, and Retarded Children. Journal of Abnormal and Social Psychology 65:267-273.
ZIGLER, EDWARD; and UNELL, EARL 1962 Concept-switching in Normal and Feebleminded Children as a Function of Reinforcement. American Journal of Mental Deficiency 66:651-657.
ZIGLER, EDWARD; and WILLIAMS, JOANNA 1963 Institutionalization and the Effectiveness of Social Reinforcement: A Three Year Follow-up Study. Journal of Abnormal and Social Psychology 66:197-205.
MENTAL TESTING
See INTELLIGENCE AND INTELLIGENCE TESTING.
MERCANTILISM
See under ECONOMIC THOUGHT.
MERCIER DE LA RIVIÈRE
Pierre Paul Mercier de la Rivière (1720-1793), French physiocrat, was born at Saumur (Indre et Loire), the son of a président trésorier of France. After studying law, at the age of 27 he became a member of the parlement of Paris and remained there for 12 years. He then served for several years as intendant of Martinique. Recalled as a result of policy differences with respect to free trade (he had admitted English ships to the island, in violation of the pacte colonial), he returned to the parlement in 1764 and in semiretirement wrote his masterpiece, L'ordre naturel et essentiel des sociétés politiques (1767).
The basic ideas of the physiocrats, and especially of Quesnay, their leader, are to be found in Mercier's work. In his Ordre naturel, Mercier built on the ideas presented by Quesnay in his "Despotisme de la Chine." Mercier stressed the political aspects of physiocracy rather than the agricultural ideas of the school. For him, the law of property, which is based on the physical order of nature, is unique and universal, underlying all other laws. It is a law that may be directly apprehended by all men, and one that governs the essential order of society. The proper character of political institutions derives from this basic importance of the law of property. The sovereign, according to Mercier, is by definition co-owner of the fixed wealth of the society; in the eyes of his subjects he is no more than a large proprietor who has no privileges at the expense of others but, rather, is linked to his subjects by a common interest in maximizing the value of common property. The proportion of fixed wealth that is to be used as public revenue, or taxes, is thus determined by the natural law of property, provided that government takes the form of personal and legal despotism (as opposed to arbitrary despotism). Since under personal and legal despotism the despot embodies that fundamental unity of society which is based on the law of property, it is in the direct interest of such a despot to keep taxes within the limits of the portion required by the law of property. Mercier proposed that the sovereign have both executive and legislative powers, for if legislative power is in the hands of a representative assembly, there necessarily arise parties with irreconcilable private interests. However, Mercier did advocate an independent judiciary, with the right to register laws, a suspensive veto of laws, and control of their constitutionality. In contrast with the law of property, which provides the security requisite to liberty, the laws enacted by the sovereign create no fundamental rights: they assure only the fulfillment of private contracts.
Thus Mercier envisioned a political order that is in harmony with nature. Every man becomes an instrument of the welfare of his fellows; no one can profit or become rich at the expense of others. Luxury, "that monster," will disappear, and peace will be established among nations. The system of government that Mercier envisioned requires a mature public opinion as a final check on the consistency of the sovereign's conduct with the law of property. Mercier's De l'instruction publique (1775) described the system of national education necessary to raise public opinion to the appropriate level of maturity. To be sure, the "nation" that Mercier wished to educate was made up only of landowners and farmers; as a physiocrat, he considered commerce and industry as unproductive and unworthy of participation.
It is hard to assess the influence that Mercier had, since it merges with that of the entire school of Quesnay. Mercier is undoubtedly at least the most widely read of Quesnay's disciples, because his works have been most accessible to the general public. Mercier did not get along with Catherine the Great when he visited Russia at her invitation; however, the king of Sweden commissioned, and presumably profited from, Mercier's work on public education.
FRÉDÉRIC MAURO
[For the historical context of Mercier's work, see ECONOMIC THOUGHT, article on PHYSIOCRATIC THOUGHT; and the biography of QUESNAY.]
WORKS BY MERCIER
(1767) 1910 L'ordre naturel et essentiel des societes politiques. Paris: Geuthner. 1770 L'interet general de I'etat . . . . Amsterdam and Paris: Desaint. 1775 De /'instruction publique: Ou, considerations morales et politiques sur la necessite, la nature et la source de cette instruction, ouvrage demande pour le roi de Suede. Stockholm and Paris: Didot. 1789 Essais sur les maximes et loix fondamentales de la monarchic francoise . . . . Paris: Vallat-La-Chapelle. 1790 Palladium de la constitution politique: Ou, regeneration morale de la France . . . . Paris: Baudouin. SUPPLEMENTARY BIBLIOGRAPHY
JOUBLEAU, F. 1858-1859 Notice sur P.-P. Lemercier de la Riviere. Academie des Sciences Morales et Politiques, Paris, Seances et travaux 46:439-455; 47:121-150, 249-265.
LARIVIERE, CHARLES DE (1897) 1909 Mercier de la Riviere a Saint-Petersbourg en 1767. Pages 71-132 in Charles de Lariviere, La France et la Russie au XVIIIe siecle: Etudes d'histoire et de litterature franco-russe. Paris: Soudier.
RICHNER, EDMUND 1931 Le Mercier de la Riviere: Ein Führer der physiokratischen Bewegung in Frankreich. Zurich: Girsberger.
SILBERSTEIN, LOTTE 1928 Lemercier de la Riviere und seine politischen Ideen. Berlin: Eberling.
WEULERSSE, GEORGES 1910 Le mouvement physiocratique en France de 1756 a 1770. 2 vols. Paris: Alcan.
WEULERSSE, GEORGES 1950 La physiocratie sous les ministeres de Turgot et de Necker (1774-1781). Paris: Presses Universitaires de France.
WEULERSSE, GEORGES 1959 La physiocratie a la fin du regne de Louis XV: 1770-1774. Paris: Presses Universitaires de France.
MERGERS
A merger is the combination into a single business enterprise of two or more previously independent enterprises. The combination may take a number of forms. Among these is the outright purchase of the assets of one company by another for cash or for the stock or debt of the acquiring company. A holding company may be created, with the stock of the combining companies exchanged for that of the parent company. The stock of the merging companies may be held in trust, though this has been generally superseded by corporate arrangements. Combinations have been effected by long-term lease. The legal and financial forms the merger takes are governed largely by tax, corporate charter, and other legal provisions that introduce unique elements in each case. While such factors are of some influence in shaping the broad pattern of mergers, they are probably much less important than underlying forces of economic change and competition. Mergers represent a formal, as against an informal, form of combination. Independent and competing enterprises may pursue a common course of action by various arrangements falling short of outright merger. These range from conscious parallelism of action to contractual agreements governing prices, production, conditions of sale, marketing, and other major business policies. Many of these less formal arrangements may achieve the same purpose as a merger in organizing an industry. However, their effects are likely to be less permanent, enforceability is less absolute, and they are vulnerable to the charge of conspiracy. Probably only a small fraction of all mergers, certainly of recent mergers, have had the reduction or elimination of competition as a principal motive. Other reasons for merging are to achieve a more efficient integration of successive stages in production and marketing, to diversify into new products and markets, to take advantage of a favorable investment opportunity, to minimize taxes when liquidating a business, and to acquire a talented person or a promising patent. In the case of many mergers of very small enterprises, the purpose may be simply to gain the benefits of specialization in management, one partner to the merger becoming responsible for production, the other for marketing and sales. The great variety of objectives in mergers suggests that convenient generalizations about their underlying causes may not be easy to find. Certain patterns in mergers have been observed, however, which offer some guides to promising lines of inquiry. The following discussion must be confined largely to the United States because of the lack of statistical information about other countries.
Merger movements. One outstanding characteristic of mergers in the United States is the highly episodic nature of their occurrence. In three periods (1898 through 1902, 1926 through 1930, and 1957 through 1961) industrial mergers occurred on so extensive a scale that they are best described as waves or movements (see Figure 1). This tendency of a fundamental form of enterprise expansion to show vast and widely separated peaks of activity has probably interested students more than the examination of individual mergers. The first recorded merger movement of major proportions occurred as the United States entered the twentieth century, its peak years being 1898 through 1902. For a number of industries it represented the formal consolidation of companies that had already achieved a certain degree of policy coordination through agreements to avoid active competition, agreements that had shown distressing tendencies to break down. For a few important companies it represented merely a change in legal form, from trust to holding company, of an earlier merger.
Most importantly, however, it involved the consolidation of companies in a large number of previously dispersed industries into single companies in which control was tightly centralized. It transformed many industries formerly characterized by many small and medium-size firms into those in which one or a few large corporations occupied dominant positions. During the first merger wave such industrial giants as U.S. Steel, American Tobacco, International Harvester, DuPont, Anaconda Copper, Corn Products, American Smelting and Refining, Otis Elevator, Allis-Chalmers, and American Sugar Refining were created. The second large movement took place in the last half of the 1920s, its peak years being 1926 through 1930. To some degree it represented consolidation in the important new industries that had appeared since the first merger wave. It also reflected attempts in some industries to restore the levels of concentration achieved three decades earlier by firms whose leading positions had been eroded in the meantime. Among the prominent companies created by merger in this period were National Steel, National Dairy Products, United Aircraft, Owens-Illinois, and Caterpillar Tractor.
[Figure 1. Firm disappearances by merger, United States, annually, 1895-1961. Logarithmic scale. Chart labels: turn-of-the-century movement; late 1920s movement; latest movement. The two series are not directly comparable and are presented on the same chart only to provide historical perspective. Sources: for 1895-1920, Nelson 1959, pp. 152-153; for 1919-1939, Thorp 1941, pp. 231-234; for 1940-1954, U.S. Federal Trade Commission 1955, p. 33; for 1955-1961, U.S. Federal Trade Commission data.]
As Figure 1 indicates, the third large movement was probably underway in the early 1960s. There was a short merger revival immediately following World War II, which was confined mainly to the two years 1946 and 1947. However, merger activity did not return to sustained high levels until the mid-1950s. The pattern of recent mergers has been more varied, with product diversification and tax minimization playing a more prominent role than in the earlier movements (Butters et al. 1951). The current revival of merger activity, while large, is not as large as the earlier merger movements. In the five turn-of-the-century years, 1898 through 1902, at least 2,700 firms disappeared into the manufacturing and mining mergers reported in the financial press, and reporting was not as
comprehensive as it has since become. In the peak years of the late 1920s, 1926 through 1930, mergers claimed 4,800 firms. By contrast, in the five years 1957 through 1961 there were about 2,900 disappearances. Since the number as well as the size of industrial firms has grown considerably in the past six decades, the most recent levels of merger activity are relatively lower than suggested by the absolute comparisons.
Mergers and competition. The three merger movements have had varied effects on the intensity and form of competition in markets. The turn-of-the-century movement, as indicated above, succeeded in consolidating thousands of small and medium-size companies into relatively few large ones. In literally dozens of cases the merged firm attained a dominant share of its industry. The avowed goal of many mergers was monopolistic control of a market, and this goal was realized in many instances. The effect was to transform competition from that among many firms into that in which one firm was large enough, through force of size, to maintain orderly, and profitable, conditions in a market. The result was a major reduction in the
amount of competition as it had been known in the nineteenth century. The merger wave of the late 1920s, superimposed on an industrial structure still showing the effects of its giant predecessor wave, had a necessarily different effect on competition. Some observers have speculated that its pattern was influenced by judicial interpretation. Antitrust policy, while generally permissive toward mergers in this period, may have made the largest companies in an industry less eager to engage in mergers which would markedly increase their leadership of an industry [see ANTITRUST LEGISLATION]. Such action might be considered predatory and lay the company open to the charge of being a "bad trust." Since well-behaved colossi were generally free from antitrust attention, an industry's largest company might think twice before taking action which might jeopardize its reputation. The second-, third-, or fourth-ranking companies might have felt less restricted. This may have accelerated the formation of industries in which the leadership was shared by several large firms. However, until direct evidence on the number and size of mergers in this period becomes available, such an interpretation must remain largely speculative. The pattern of several-firm leadership of industry, or oligopoly, is now characteristic of many of our leading industries [see OLIGOPOLY]. It is still a much-debated question whether the oligopolistic industry is competitive enough, though there is more general agreement that the existence of several big firms in an industry signifies more active competition than that of one huge, dominant firm. Certainly the merger wave of the 1920s produced an increase in industrial concentration, if by this is meant simply the centralizing of the control of markets into a smaller relative number of enterprises.
It remains to be established, however, whether the more common result was the substitution of oligopoly for industries having only one clear leader or its substitution for decentralized industries having many firms and the more classical variety of competition. The most recent merger revival has been even more complex in its effect on competition. Unlike the two earlier merger waves, the goal of most recent mergers has not been the union of two or more firms producing the same product at the same stage of fabrication. This type, the horizontal merger, has the immediate effect of reducing the number of independent firms selling the product in question. A study by the Federal Trade Commission suggests that horizontal mergers may be accounting for about two-fifths of recent merger activity (U.S.
Federal Trade Commission 1955, chapter 3). They amounted to about three-fourths of the turn-of-the-century merger activity (Nelson 1959, p. 103) and probably much more than half of that of the 1920s. The over-all effect on competition of recent horizontal mergers has therefore been considerably less than that of the horizontal mergers of three and six decades earlier. It would be difficult to see how it could be greater. Given the existing levels of concentration, established in no small part in the earlier movements, the recent merger movement could only maintain or slightly increase this level. To change it greatly would mean the creation of monopolies or near-monopolies in many industries, and this is clearly contrary to public policy. The strengthening of major oligopolies by merger has recently received greater discouragement from antitrust authorities. In 1958 the courts ruled against the proposed merger of the Bethlehem and Youngstown steel companies, whose effect would have been to make the second largest steel company, Bethlehem, more nearly equal to the largest company, U.S. Steel. Bethlehem, with 15 per cent of the industry, would have had 19 per cent had it acquired Youngstown, thus bringing it closer to U.S. Steel's share of 30 per cent. A comprehensive study by the National Industrial Conference Board has characterized recent antitrust orientation as follows: The Board's study finds that, contrary to popular impression, enforcement has focussed, not on the size of the acquiring or acquired company, as such, but on market effects. Indeed, only 2% of the acquisitions recorded for 1958 and 1959 for the 300 largest manufacturing corporations had resulted in a merger case by March, 1960. . . .
up to the present time, a merger has been most vulnerable if the acquiring corporation's sales and assets exceed $10 million and it is one of the first companies in its field, if the acquired unit is also one of the major organizations in its field, and if a high percentage of the output of the products or services in question is concentrated in relatively few companies. Vulnerability is greatest if the two companies operate in the same field. (Bock 1960, pp. 9-10; italics added) Approximately one-fifth of recent merger activity has involved the combining of firms at successive stages in the production and distribution of a particular product—the vertical merger. The effect of a vertical merger on competition is not likely to be great, unless it provides the merged company with a stranglehold on one of the stages of production. If this is the result, then the relevant combination is fundamentally a horizontal one. The relatively crude evidence available suggests that it is unlikely
that the latest vertical mergers have produced any appreciable diminution in competition. Finally, about two-fifths of recent mergers have had the diversification of products or production processes as a goal. In the first movement, by contrast, there were virtually no mergers for diversification. Diversification objectives cover a broad range. Some companies hope to achieve economies in marketing through production of a broader line of products. Others seek economies in production by acquiring products capable of being produced by the company's existing production facilities. Still others diversify simply to avoid having their fortunes dependent on a single product or industry. The competitive effects of a diversified merger are not likely to be great. If the merged products did not directly compete with one another before the merger, the fact that they are now produced by the same company can have only indirect effects on competition in their separate markets. The above general review of the competitive consequences of the three merger movements should amply illustrate that we continue to possess only the crudest notions of the effect of mergers on competition. Considerable work remains to be done in classifying mergers; the simple horizontal-vertical-diversified taxonomy is only a beginning. Beyond this, however, lie challenging conceptual problems for which the tools of economic analysis are appropriate. Many problems are related to the appropriate economic sector in which to measure competition and have relevance to other factors in the structure and performance of industries. Many, however, have their basis in the unique characteristics of the merger and the role that considerations of competitive factors play in the merger decision.
Mergers and business cycles. Although merger history has been dominated by the three large waves, mergers have occurred in measurable amount in every year.
One of the periods of lowest activity observed in the twentieth century was the decade from 1905 through 1915, following the huge turn-of-the-century wave. Even during this period there were important mergers. In 1908 General Motors was formed, and in 1911 what is presently International Business Machines. Both were consolidations of several prominent firms in their fields. The great depression of the 1930s saw mergers at probably their lowest ebb; yet even then there were some important chemical and electronics mergers. The historical pattern of merger activity is notable, therefore, not only for its great waves but also for the presence of cycles in mergers. The cycles
are most pronounced as part of the great movements, but they also may be observed during the two-to-three-decade intervals of lowered activity. For the six-decade period from 1895 through 1954, 12 clear merger cycles have been identified. During the same period the National Bureau of Economic Research identified 14 cycles in general economic activity, and the merger cycle conformed to the general business cycle in 11 of the 12 cases. When it did not conform, the cycle in general business was either very short, very mild, or both (Nelson 1959, chapter 5). The cyclical responsiveness of mergers to business activity raises a number of questions. One set relates to the role of mergers as a form of business investment. The balancing of the cost of the firms to be acquired with the discounted expected value of the future earnings of the merged firm involves the same kind of calculation required when deciding to organize a new business or build another plant. Like private investment, merger activity has been shown to respond in a positive and sensitive fashion to the business cycle. Both merger activity and private investment in new plant and equipment reach their highest points before the peak in general business activity. Both seem to bear a fairly close relationship to movements in stock prices, which suggests that an important factor is the possibility of financing the purchase of either a new plant or another company under conditions favorable to the issuance of new equity securities. Firms expanding by merger, as in other forms of firm growth, frequently turn to public sources for the needed extra funds. The issuance of new securities is most necessary when the acquired firm is purchased for cash; however, when the purchase is made by exchange of stock, new securities may be issued to increase working capital.
Even when there is only the exchange of stock, the organizers of the merger are likely to be sensitive to the recent trend of the stock market, because ratios of exchange are often based on the relative market prices of the two securities. Although both are generally responsive to stock price rises and other manifestations of economic expansions, merger activity and plant and equipment investment are not wholly synchronous. One examination has found that merger activity reaches its peak earlier in a cyclical expansion than do contracts for industrial and commercial construction and orders for manufacturers' durable equipment (Nelson 1966, p. 58). This pattern of timing refutes the theory that businesses turn to mergers only after other profitable forms of investment have been exhausted.
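The investment comparison described above, in which the cost of the firm to be acquired is set against the discounted expected value of future earnings, can be made concrete with a small numerical sketch. The Python below is an illustration only: the function names, the 10 per cent discount rate, and all dollar figures are hypothetical assumptions and do not come from the studies cited.

```python
def present_value(earnings, rate):
    """Discount a stream of expected annual earnings to the present.

    earnings[0] is assumed to be received one year from now.
    """
    return sum(e / (1.0 + rate) ** (year + 1) for year, e in enumerate(earnings))


def net_gain(cost, earnings, rate):
    """Discounted expected earnings minus the outlay made today."""
    return present_value(earnings, rate) - cost


# Hypothetical figures, in millions of dollars (assumptions for illustration).
RATE = 0.10
acquisition = net_gain(cost=100.0, earnings=[30.0] * 5, rate=RATE)
new_plant = net_gain(cost=100.0, earnings=[0.0] + [35.0] * 4, rate=RATE)

print(f"acquisition net gain: {acquisition:.1f}")
print(f"new plant net gain:   {new_plant:.1f}")
```

The same calculation serves for either form of investment; only the assumed outlay and earnings stream differ.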
What might explain the earlier peak in mergers than in internal expansion? One hypothesis might be that, although nothing in the basic decision to invest in merger or new plant would predict when in an expansion each is more likely to occur, delays in the construction of new plants may result in merger plans coming to fruition sooner than plant-building programs. Indeed, delays encountered in plant-building may encourage firms to accelerate merger efforts in order to achieve growth targets on schedule.
Mergers by large firms. In the histories of a majority of the largest industrial corporations there has been at least one merger important enough to have had a significant effect on the subsequent rate and direction of company growth (Nelson 1959, p. 4). Some mergers have made companies leaders in their traditional industries, while others have created diversified enterprises. Some have given large companies commanding leads in expanding new industries, while others have consolidated into fewer firms the control of stationary or declining industries. One could claim that the attrition of time has nullified the effects of at least the earlier of these mergers, so that the structure and performance of these companies is now little different from what it would have been had no mergers taken place. This may have been true, almost by definition, of the least successful mergers, but it strains credibility to argue that the forces molding industry structures have been so pervasive as to make it generally true. Assuredly the structure of the aluminum industry from 1900 through 1940 would have been different had not early patent mergers succeeded in making Aluminum Company of America the only company in the field at that time. A problem that continues to need solution is that of measuring with some precision the part that mergers have played in the growth of firms. Recent studies provide some idea of its magnitude, and evidently it has not been small.
The most comprehensive examination made to date has been that of J. Fred Weston. Investigating 74 large industrial firms, he found that, at a minimum, 22.6 per cent of the 1900-1948 growth of these companies could be assigned to merger (1953, p. 14). Weston regards this as a minimum estimate because he explicitly counted as growth by merger only the addition of the acquired firm at the time of the merger. This assumes that the part of the now-enlarged company representing the new acquisition did not continue to grow after the merger, and so contributed nothing to the postmerger growth of the combined firm. While necessary in setting a lower limit
to the range of estimates, the assumption leads, among other things, to the unlikely inference that mergers are organized by pessimistic men. It is difficult to measure the growth-enhancing effects of any major restructuring of an enterprise, and that produced by a merger is no exception. One approach to measurement might be to compare observed growth with that predicted by simple models of firm growth. One such model might assume that, for the merger to have been neutral in its effect on growth, the acquired firm would have grown at the same rate as its industry. Alternatively, one might assume that merging firms (acquiring and acquired) each would have grown at the same rate as the merged firm. Perhaps most plausible would be the assumption that the merging firms would have grown at the same rate as other similar firms that did not merge. Other models could be developed, and much could be learned of the effects of mergers in the process of developing and testing them. None has yet been used on a comprehensive scale to measure the merger component in large-firm growth. The interest of many students has been in the role of mergers in producing high concentration in major industries, for this is where the implications for antitrust policy have greatest relevance [see INDUSTRIAL CONCENTRATION]. To study the role of mergers in concentration, one must focus on the firm's share of its industry and on horizontal mergers that directly affect this share. For 25 of his 74 companies Weston presented measures of mergers' effects on industry shares. In this formulation he assumed that, in the absence of merger, the acquiring firm would have maintained its premerger share of the industry, that is, it would have been able to grow as fast as the industry. This assumption probably assigns too little of postmerger growth to the acquired firm, for enhancement of growth potential must be a primary reason for acquiring a rival firm. 
Despite this bias toward understatement, the calculated contribution to growth was impressive. He found that, under the above assumption, most of the companies' increase in market share was assignable to merger. Indeed, for 11 of the 25 companies, mergers accounted for more than their increase in market share, that is, the companies witnessed a decline in the shares of markets that mergers had initially given them. These findings led Stigler, in his review of the Weston study, to conclude: "He [Weston] lends support to the opinion that merger has been the basic method by which individual firms have acquired high shares in major industries in the United States" (1956, p. 40).
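The two accounting conventions attributed to Weston above can be sketched as simple arithmetic. The Python below is a hedged reading of the text, not Weston's own procedure: the function names are hypothetical, the inputs are invented, and the rule that each acquisition adds exactly the acquired firm's size or market share at the time of merger is the simplifying assumption the passage describes.

```python
def merger_growth_lower_bound(initial_size, final_size, acquired_sizes):
    """Fraction of total growth credited to merger, counting each acquired
    firm only at its size on the date of acquisition (a lower bound)."""
    total_growth = final_size - initial_size
    if total_growth <= 0:
        raise ValueError("the firm must have grown over the period")
    return sum(acquired_sizes) / total_growth


def share_decomposition(initial_share, final_share, acquired_shares):
    """Split a change in market share (percentage points) into the part
    added by mergers and the residual left to internal change.

    Assumes that, absent merger, the acquirer would have held its
    premerger share, so each merger adds the acquired firm's share.
    """
    total_change = final_share - initial_share
    merger_gain = sum(acquired_shares)
    internal_change = total_change - merger_gain
    return total_change, merger_gain, internal_change


# Hypothetical firm: assets grow from 100 to 500; acquisitions of 50 and 40.
growth_fraction = merger_growth_lower_bound(100.0, 500.0, [50.0, 40.0])

# Hypothetical firm: share rises from 10 to 25 points; acquisitions add 12 and 8.
total, by_merger, internal = share_decomposition(10.0, 25.0, [12.0, 8.0])

print(f"lower-bound merger fraction of growth: {growth_fraction:.3f}")
print(f"share change {total:+.1f}, by merger {by_merger:+.1f}, internal {internal:+.1f}")
```

In the second hypothetical case the mergers add more share than the firm ends up gaining, so the internal residual is negative; this is the situation the text reports for 11 of Weston's 25 companies.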
International comparisons. Though crude and incomplete, merger statistics for the United States are incomparably more abundant than those for other countries. The only comprehensive time series on mergers known to the writer is that for Great Britain during the large wave of amalgamations it, too, experienced at the turn of the century (Macrosty 1907). Reasons for the paucity of merger statistics in other countries are not hard to find. Absence of antitrust laws, especially those directed at mergers, means that government agencies have had no need to collect data on which to base policy. Also, cartelization rather than amalgamation seems to have been the more common form of combination in European industry, and merger activity may not have been sufficiently large or widespread to engage the interest of economists. With the acceleration of European economic integration, mergers may begin to receive more attention. The creation of a tariff-free market as large as or larger than that of the United States may compel major changes in the firm-structures of industries, and it seems probable that mergers should emerge as important instruments for effecting any such change. The experience of the United States suggests that availability of data provides no sure guarantee that public policy toward mergers will always be enlightened. However, there are some indications that lags in assembling merger data have made the evolution of policy more tortuous, erratic, and probably less successful. An early beginning to the assembling of data on Common Market mergers could aid considerably in the development of appropriate public policies toward mergers; policies, one hastens to add, that may be expected to depart in significant respects from the United States example.
RALPH L. NELSON
BIBLIOGRAPHY
Current data on the number, size, and industry of mergers are available from the Office of Information of the Federal Trade Commission. Comprehensive lists of mergers may be found in National Industrial Conference Board, Conference Board Record (see Bock 1960, pages 107-119 for a discussion of these data).
BOCK, BETTY 1960 Mergers and Markets: An Economic Analysis of Case Law. Studies in Business Economics, No. 69. New York: National Industrial Conference Board. → See especially pages 107-119, "Data on Merging Companies."
BUTTERS, JOHN K.; LINTNER, JOHN; and CARY, WILLIAM L. 1951 Effects of Taxation: Corporate Mergers. Boston: Harvard Univ., Graduate School of Business Administration, Division of Research.
MACROSTY, HENRY W. 1907 The Trust Movement in British Industry. London: Longmans.
MARKHAM, JESSE W. (1955) 1966 Survey of the Evidence and Findings on Mergers. Pages 141-182 in Universities-National Bureau Committee for Economic Research, Business Concentration and Price Policy: A Conference. National Bureau of Economic Research, Special Conference Series, No. 5. Princeton Univ. Press.
NATIONAL INDUSTRIAL CONFERENCE BOARD Conference Board Record. → Published since 1944. Previously published under the titles Conference Board Business Record and Conference Board Business Management Record.
NELSON, RALPH L. 1959 Merger Movements in American Industry: 1895-1956. National Bureau of Economic Research, General Series, No. 66. Princeton Univ. Press.
NELSON, RALPH L. 1966 Business Cycle Factors in the Choice Between Internal and External Growth. Pages 52-70 in William W. Alberts and Joel E. Segall (editors), The Corporate Merger. Univ. of Chicago Press.
STIGLER, GEORGE J. 1956 The Statistics of Monopoly and Merger. Journal of Political Economy 64:33-40.
THORP, WILLARD L. 1941 The Structure of Industry. U.S. Temporary National Economic Committee, Investigation of Concentration of Economic Power, Monograph No. 27. Washington: Government Printing Office. → See especially pages 227-234, "The Merger Movement."
U.S. FEDERAL TRADE COMMISSION 1955 Report on Corporate Mergers and Acquisitions: May 1955. Washington: Government Printing Office.
WESTON, J. FRED 1953 The Role of Mergers in the Growth of Large Firms. Berkeley: Univ. of California Press.
MERRIAM, CHARLES E.
The life of Charles Edward Merriam, Jr. (1874-1953), American political scientist, represents and reflects many of the changes which have taken place not only in the field in which he achieved his reputation but also in American society in the twentieth century. His generation was perhaps the first to experience with enthusiasm the headlong rush of history and of industrial technology, which so depressed Henry Adams, and to return to the relentless faith in social study characteristic of eighteenth-century democratic thought. Merriam was determined to retain for America in the twentieth century the vision that Alexis de Tocqueville had had in the nineteenth: that the political course of Western society was set irrevocably toward ever more democratic government and that the United States could lead the way. To this vision Merriam added his own conviction that the observable weaknesses in modern government are the result of too little rather than too much democracy. His belief that the sources of such weaknesses can be found by an examination of the actual workings of politics, and that the methods of such examination have to be scientific, formed the basis of his approach to
politics and of his efforts to reorganize political science. His commitment to democracy and to scientific method gave impetus to his lifelong efforts to bring scientific knowledge to the service of government, and his conception of scientific method facilitated the development of interrelationships among the social sciences. Merriam was born in Hopkinton, Iowa, the second son of the local postmaster, who was also keeper of the general store. His mother, Margaret Campbell Kirkwood Merriam, was a devout Scottish Presbyterian who had been educated in Scotland to be a schoolteacher, although family responsibilities and chronic ill health prevented her from teaching. The Merriam family was deeply involved in Iowa politics; Merriam's father, however, never had the political career he hoped for. Like his father and his elder brother, Merriam was educated first at Lenox College in Hopkinton. His father planned a legal career for him, preparatory to a life in politics, but a brief period at the law school of Iowa State University convinced him that legal training lacked a proper concern for ethics, and he rebelled. He decided to study political economy and social science at Columbia University, then a rapidly growing center of American social science. At Columbia he was influenced by William A. Dunning, John W. Burgess, and E. R. A. Seligman, among many others. The introduction to the modern historical and comparative method that Merriam received at Columbia took much of the edge off his later pilgrimage to Germany to hear Otto von Gierke and Hugo Preuss.
Conception of political theory
Merriam's acceptance in 1900 of a position as docent in political science at the University of Chicago began his long career at that university. His doctoral thesis, History of the Theory of Sovereignty Since Rousseau, was published in 1900, but it was the publication, in 1903, of A History of American Political Theories that first established him in his profession.
Dedicated to Dunning, the book follows Dunning's pseudo-biological methods of historical classification and description, grouping writers in orderly, if somewhat stilted, fashion by period and major concern. But it is the first work in which the practicing politicians of the colonial, early federal, and pre-Civil War periods are classified as "theorists." Merriam was also among the first to call attention to the fact that John Locke had exercised a stronger influence over American political history than had Rousseau. The work may now seem dated and quite static, but much of later discussion of American political thought is based
on the analysis it contains. Like its sequel, American Political Ideas (1920), it demonstrates Merriam's particular interest in broadening the definition of political theory to include not only the more traditionally recognized theorists whose writings were already part of the canon of political thought, but also the practitioners of politics, whose actions and intentions permanently affect the life of the community even though they may have given little attention to the formulation of doctrine. Indeed, for Merriam political theory came to embrace the study of society itself, as is shown in the memorial volume for Dunning, A History of Political Theories, Recent Times (1924), that he and H. E. Barnes edited: the volume includes, in systematic arrangement, essays in philosophy, sociology, psychology, and anthropology, all of which fields Merriam considered directly relevant to political theory. Involvement in politics With his background of family involvement in Iowa politics, Merriam could scarcely have avoided a similar interest in Chicago. To be sure, his conception of involvement in the political life of the city hardly coincided with that of President William Rainey Harper of the University of Chicago and of successive university administrators. Harper preferred to exert influence on the community through adult education, while Merriam saw city politics as a suitable area for applying new technical skills to the operations of government. Merriam's first opportunity for direct involvement in Chicago politics came in 1905, when the City Club of Chicago asked him to do a study of the city's municipal revenues. The success of the report, particularly among the club's membership of prominent local businessmen, led to Merriam's appointment by the mayor in 1907 as secretary of the Chicago Harbor Commission, whose purpose was to study the city's water transportation facilities (part of an effort to make the new Chicago city plan effective). 
The work succeeded not only in bringing Merriam to public attention but also in acquainting him with some of the city's most complex problems of business policy and political obfuscation. The work raised issues of land use, public utilities, private enrichment at public expense, and graft. Chicago's unusually high consciousness of its physical layout and its growing determination to make use of its remarkable lake frontage gave Merriam a rich education in some of the newly developing problems related to urban planning. As a result of his investigations, Merriam and his supporters were able to secure his nomination as alderman in the city's first primary and his election to the City Council in 1909. He promptly introduced an ordinance for a commission on city expenditures, becoming chairman of the commission upon its creation. By 1910, the commission had so successfully exposed fraud in Chicago city purchasing that it achieved a national reputation among reform groups interested in the reorganization of financing in local government. Merriam's work also came to the attention of Julius Rosenwald, who was already noted for his philanthropies but had hitherto avoided involvement in politics. Rosenwald financed the commission after the City Council angrily stopped its funds. He also backed Merriam's unsuccessful campaign for mayor in the 1911 election. Although Merriam ran on the Republican ticket, his identification as a progressive and a reformer alienated party regulars, who preferred the risk of Democratic victory to the possibility of party repudiation of their control of local politics. Although Merriam was active in the formation of the national Progressive party in 1912, his unwillingness to support it after the election, even though he continued to respect and support its aims, was typical of the growing group of "realists" among the reformers. They had come to look upon a party as having a complex social base as well as a political one, and therefore as less amenable than some reformers had hoped to modification by such political methods as the initiative, referendum, recall, and direct primary elections. Merriam had published his Primary Elections in 1908. Unlike so many of the studies of structural reform, the book called attention to the fact that structural reorganization by itself is not enough, that politics ultimately depends upon which groups of citizens are interested, or willing to be made interested, in the outcome of political events.
Merriam's political activities, couched as they were in the imagery of the scholar-politician made popular by the successful candidacy of Woodrow Wilson, brought him also a national reputation. His desk became an informal clearinghouse of information for groups interested in the new methods of reform in local politics: primaries, budget and accounting systems, commissions of investigation and management, and the like. Re-elected to the Chicago City Council in 1913, he served until 1917, meanwhile continuing his teaching at the university. His career came to exemplify the new pragmatic voice of the academy, dedicated not only to the historical understanding of political structure but also to the discovery of useful methods for improving the conduct of politics. World War I took Merriam to Italy, where he
served briefly as the American high commissioner of public information, an office used by the Wilson administration to circumvent the more traditional diplomatic service. The position gave Merriam a sharp awareness of the problems involved in international exchanges of information, a field scarcely touched upon by Americans. Several of the postwar projects in which he was interested—most notably a series on civic education in various countries and a study of international reporting in American news media—were products of his months in Italy. Organization of research His return to Chicago politics after the war was unsuccessful; as an internationalist, he was swimming against the tide. The postwar period marked his ascendancy in the academic profession, an ascendancy which was nonetheless paralleled by an increasing sense of political frustration. His influence within the American Political Science Association was at its peak, and he led the movement for more research in politics and for closer relations with other disciplines, particularly psychology. He became president of the association in 1924. The founding of the Social Science Research Council in the same year was the culmination of his efforts to encourage greater interaction among the various fields. During this period he also did his most successful graduate teaching, and the students from this period, among them V. O. Key, Jr., and H. D. Lasswell, have been among his most influential. Through his friendship with young Beardsley Ruml, Merriam had an influence on the Rockefeller Foundation, and Ruml's striking ability to give organizational reality to Merriam's ideas was the source of much of Merriam's effectiveness during the period. 
The Rockefeller Foundation financed a committee on local community research at the University of Chicago; a faculty board headed by Merriam and Leonard White used the funds to finance research projects by students and colleagues, often in fields far removed from local community study. The founding of the Public Administration Clearing House in 1931 fulfilled another dream of Merriam's: the bringing together of the research and reform organizations directly involved in professional work in public service. It also brought Louis Brownlow to Chicago, thereby establishing a working friendship which proved enormously influential to both men. Yet the frustration of these years is also clear. Merriam's concern with the nature of leadership and the psychology of voting behavior was a response in part to his disappointment with the course taken by the Republican party after World War I. The years from 1920 to 1928 saw the tacit repudiation by successive administrations of most of the ideals and programs of the progressives. Only Secretary of Commerce Herbert Hoover seemed concerned with these ideals, and his election to the presidency raised some hope for a return to them. At the University of Chicago, successive presidents frustrated Merriam's efforts to finance research; they saw these efforts as a threat to their own more traditional fund-raising needs. Merriam was tempted, in 1923, to accept a chair at Columbia and again, in 1927, to take a post with the Rockefeller Foundation in Paris, but each time he ultimately decided to remain at Chicago. Conception of a science of politics New Aspects of Politics was published in 1925. More obviously characteristic of his method of work than many of his other books, New Aspects is a collection of papers written and revised between 1920 and 1924. The papers were, in turn, built on notes of his comments at meetings and conferences of social scientists. This method of gradual accretion, accumulation, and revision was the one most often employed by Merriam, but it was usually obscured by the final revision. Other such books are Chicago (1929) and Four American Party Leaders (1926), the latter, again characteristically, paralleling work done by his students. More than any of his other writings, however, New Aspects reveals the hortatory Merriam, suggesting directions for future investigation and pointing out to colleagues and students the possibilities inherent in a science of politics that was one of the new sciences of society. The essays also indicate his opposition to deterministic theories of history and politics, not only Darwinism and the economic theories of Charles A.
Beard but also the behavioristic determinism in the very psychology, sociology, and anthropology whose methods he urged upon his colleagues. It was not the principles and predictions of these sciences which appealed to Merriam but the usefulness of their methods for the enrichment of the science of politics. Yet in spite of his rejection of deterministic views of history, he nonetheless depended upon a kind of "tendential" history that moves in trends and directions which are observable without being prescriptive. His attitude toward history is clearly related to the concepts of process then current in the pragmatic philosophies of John Dewey, George Herbert Mead, and, perhaps most of all, T. V. Smith. While the essays seem to describe new
directions of change, these directions are in effect consistent with the traditions of American government and the trends of American politics. Merriam seems to have seen his own role in very classical terms: to provide a modern basis for the kind of "whole man" theories of politics that had marked the history of political theory. Theories of the state, such as those of Hobbes and Locke and of many of their predecessors, had been based on investigations under way in psychology and physics. By Merriam's day, psychology and physics, like all of the natural sciences, had changed far more radically since the eighteenth century than had theories of the democratic state. This lag meant that democracy seemed increasingly destined to bear the brunt of the critical disillusion produced by the more recent scientific investigations of the nature of man. Merriam sought to provide a basis for restating a theory of the democratic state which would be consistent both with the traditions of democratic theory and with the revolutions in scientific doctrine, aware all the while that no modern theorist could ever again claim the universal knowledge which had made possible the comprehensive ambitions of classical theory. Such an endeavor now required a social science community, ambitious for the same ends and willing to be tolerant of a multiplicity of approaches. In Political Power (1934a) he sought to apply to American democracy European ideas about the sociological and psychological factors underlying political organization. European, and particularly German, theories of power analysis were given a specifically American setting and generalized in Merriam's characteristic fashion. Hitler's rise to power, like the ambitions of the Kaiser, shocked American scholars: Merriam had a deep respect for the quality of nineteenth-century German scholarship and sought to reconcile its traditional commitments with current events in Germany and in his own country. 
National planning By the 1930s, Merriam was once again in a position to exert political influence. He was a member of President Hoover's Research Committee on Social Trends, and the report of that committee, published in 1933, introduced Merriam's influence into the New Deal. The report had recommended the establishment of a high-level governmental agency for planning; and the appointment within the Department of the Interior of a national resources committee in 1933 brought Merriam a direct and influential role in the Roosevelt administration; it was a continuing role, since, in 1939,
the committee became the National Resources Planning Board, with Merriam still a member. In its own day the National Resources Planning Board was better-known to those who criticized it than to those who used the information it produced. Although more than two decades have passed since its demise in 1943, its place in the history of the New Deal has yet to be determined. Over seventy major and minor reports on subjects ranging from land and water resources to labor, industry, education, and science, to mention only the most obvious categories, are largely unknown to (or ignored by) historians, despite the fact that they represent perhaps the best example extant of the transformation of turn-of-the-century progressivism into the professionalized government and social science of the post-New Deal generation. President Roosevelt often used these reports as the basis for proposals to Congress and as a means of testing public response to far-reaching experimental programs. Roosevelt also considered giving the reports wider public circulation to stimulate public interest in government, but the necessity of keeping the board out of politics made any scheme difficult to realize. Merriam was also a member in 1936 and 1937 of the President's Committee on Administrative Management, the so-called Brownlow Committee. His work on the report of the committee gave both practical structure and theoretical base to his concepts of national planning and the relation of national planning to executive organization. The last decade of his life was spent in what might be called active retirement. He continued to influence policy in the department of political science at Chicago, and the loose intellectual community which had come to be known as the Chicago school was maintained. 
He spent a year, 1948-1949, on President Truman's Loyalty Review Board, and he undertook various lecture obligations, among them the Walgreen series at the University of Chicago (twice during this period) and a series on public administration at the Maxwell School of Syracuse University, in 1947. He intended these lectures to serve as first drafts for several books: an autobiography, a study of government and the economic order, and a work on politics and administration. They remained among the manuscripts left unfinished at his death. Political theory and political behavior Merriam's official retirement from the University of Chicago came in 1940. During the 1930s his writings reflected his gradual return to the problem he had considered fundamental earlier in his
career but which had been overshadowed for a time by his interest in the study of political behavior, namely, the relation between political theory and democratic government. While Merriam's reputation remains bound to his work in political behavior, his fundamental interest throughout his life was theory. To be sure, theory, for him, needed ultimately to be based upon behavior, and behavior had been neglected by nineteenth-century students in the field: the bringing together of theory and behavior gave Merriam's work the appearance of shifts in focus—from theory to political behavior and back again to theory. His own experience in politics had led him to the observation and analysis of political action, using new methods and concepts imported from fields outside of politics. From the beginning it was his aim to bring new materials to the study of politics and to make it consistent with his ideals of political behavior; and in his later years he sought to fulfill this aim in his theoretical writings, an aim culminating in Systematic Politics (1945). The title reflects what could be called the paradox of Merriam's intellectual life: that he viewed politics as systematic and scientific but could find successful elaboration of its organization only in descriptive statements of political experience, his own and others', rather than in the structure of political theory itself. Though committed to bridging what he felt to be the gap between theory and practice, his best formulations of theory were virtually indistinguishable from practical examples. His book Chicago: A More Intimate View of Urban Politics (1929) is the best example. In Systematic Politics he attempted explicitly to separate theory from practice, thereby extending theory. However, only to those who knew the practical politics in which it had properly been imbedded could the book reveal much; to those who did not, it seemed a bit antiquated. 
For the post-1945 political scientist, Systematic Politics seems either unsystematic or unpolitical, depending upon whether the critic is committed to the older sense of system which Merriam had sought to revise or to the newer sense of politics which Merriam had sought to create. To assess the career of Merriam apart from the times in which he lived is apt to involve some rather complex distortions. His writings do not constitute a corpus of the importance ordinarily associated with the great in any field. Yet he deserves the accolade as few of his generation do. Merriam was in many ways a publicist of the persuasion of Walter Lippmann, Herbert Croly, and Walter Weyl, but instead of trying to give specialized knowledge of political science wide circulation, as they did, he
sought to transcend the academic disciplines for their common benefit, to keep social scientists mindful not only of one another's increasingly specialized problems but also of the broad public responsibility which, as citizens, they shared. Merriam is often called the father of behavioral study in politics, but he did not always relish recognizing his offspring, and his offspring in turn often looked upon him askance. Behavioral study emerged from World War II with a revised canon of method, often wholeheartedly committed to quantification (which Merriam had always viewed with much suspicion) and deeply influenced by the rapid development of new machinery for the collection and analysis of data. The war, too, had dampened reform ardor, as World War I had done. A shocking confrontation with reality had created a generation which, to Merriam, often seemed cynical and mechanistic. He had urged science upon them; but they were using science to question the very principles from which he himself had derived the necessity of scientific method. His own interest in behavioral study was rooted in the conviction that the arena of politics is the proper source of information and generalization about politics and political reform. All of the newly developed social sciences should be brought to bear on the re-examination of old generalizations about politics, the destruction of demonstrably useless ones, and the construction of new ones whose utility would continue to be tested by experience. But it should always be recognized that the social sciences serve rather than control the process of democratic politics. In the continuing relationship between political science and practical politics, the political scientist will always question the adequacy of the politician's knowledge, while the politician will question the validity of the "science" offered to him.
Merriam dealt with these reservations by subjecting science to politics and by basing politics on his unshakable belief in democracy. Democratic government, whatever the details of its form, was for him the only government ultimately consistent with the nature of man. He avoided the question of whether or not this principle can be determined behaviorally, convinced as he was that observable weaknesses in the operation of democratic governments were the result of the still-existing nondemocratic elements, not of the essential nature of democracy. The accomplishments of Merriam's career rest as much on the insights to which he directed the attention of others as on the work which can be directly attributed to him. He used his optimism as a device for encouraging investigation and his
entrepreneurial energies as a means of making that investigation possible. Through his efforts others were enabled to explore frontiers which he himself could see only dimly and to penetrate barriers which he himself could not reach. Much of his reputation must ultimately depend on the roads he marked and the maps he drew. More confident of the end than others were apt to be, and far more certain of the rightness of the direction, he pointed the way. BARRY D. KARL [For the historical context of Merriam's work, see POLICY SCIENCES; POLITICAL BEHAVIOR; POLITICAL SCIENCE; and the biographies of BEARD; DEWEY; MEAD. For discussion of the subsequent development of Merriam's ideas, see the biography of KEY.] BIBLIOGRAPHY
A bibliography of Merriam's writings through 1941 can be found in White 1942. Studies of aspects of his work can be found in the highly critical Crick 1959 and in Karl 1963. The Merriam papers at the University of Chicago contain a significant amount of unpublished material and constitute an extraordinarily rich source of information on the period during which he lived. WORKS BY MERRIAM
1900 History of the Theory of Sovereignty Since Rousseau. New York: Columbia Univ. Press.
1903 A History of American Political Theories. New York: Macmillan.
1906 Report of an Investigation of the Municipal Revenues of Chicago. City Club of Chicago.
(1908) 1928 MERRIAM, CHARLES E.; and OVERACKER, LOUISE Primary Elections: A Study of the History and Tendencies of Primary Election Legislation. Rev. ed. Univ. of Chicago Press.
1920 American Political Ideas: Studies in the Development of American Political Thought, 1865-1917. New York: Macmillan.
(1922) 1949 MERRIAM, CHARLES E.; and GOSNELL, HAROLD F. The American Party System: An Introduction to the Study of Political Parties in the United States. 4th ed. New York: Macmillan.
1924 MERRIAM, CHARLES E.; and BARNES, HARRY E. (editors) A History of Political Theories, Recent Times: Essays on Contemporary Developments in Political Theory. New York: Macmillan.
1924 MERRIAM, CHARLES E.; and GOSNELL, HAROLD F. Non-voting: Causes and Methods of Control. Univ. of Chicago Press.
(1925) 1931 New Aspects of Politics. 2d ed. Univ. of Chicago Press.
1926 Four American Party Leaders. New York: Macmillan.
1929 Chicago: A More Intimate View of Urban Politics. New York: Macmillan.
1931a The Making of Citizens: A Comparative Study of Methods of Civic Training. Univ. of Chicago Press.
1931b The Written Constitution and the Unwritten Attitude. New York: Smith.
1934a Political Power: Its Composition and Incidence. New York: McGraw-Hill.
1934b Civic Education in the United States. Report of the Commission on the Social Studies, American Historical Association, Part 6. New York: Scribner.
1936 The Role of Politics in Social Change. New York Univ. Press.
1939a The New Democracy and the New Despotism. New York: McGraw-Hill.
1939b Prologue to Politics. Univ. of Chicago Press.
1941a On the Agenda of Democracy. Cambridge, Mass.: Harvard Univ. Press.
1941b What Is Democracy? Univ. of Chicago Press.
(1945) 1962 Systematic Politics. Univ. of Chicago Press.
1963 MERRIAM, CHARLES E.; PARRATT, SPENCER D.; and LEPAWSKY, ALBERT The Government of the Metropolitan Region of Chicago. Univ. of Chicago Press.
SUPPLEMENTARY BIBLIOGRAPHY
CRICK, BERNARD 1959 The American Science of Politics: Its Origins and Conditions. Berkeley: Univ. of California Press.
KARL, BARRY D. 1963 Executive Reorganization and Reform in the New Deal: The Genesis of Administrative Management, 1900-1939. Cambridge, Mass.: Harvard Univ. Press. → See especially pages 37-81, "Charles Edward Merriam: Politics, Planning, and the Academy."
The Limits of Behavioralism in Political Science: A Symposium. Edited by James C. Charlesworth. 1962 Philadelphia: American Academy of Political and Social Science.
RANNEY, AUSTIN (editor) 1962 Essays on the Behavioral Study of Politics. Urbana: Univ. of Illinois Press.
WHITE, LEONARD D. (editor) 1942 The Future of Government in the United States: Essays in Honor of Charles E. Merriam. Univ. of Chicago Press. → A bibliography of Charles E. Merriam's writings, complete through 1941, appears on pages 269-274.
MESMER, FRANZ ANTON Franz Anton Mesmer (1734-1815) was the originator of the doctrine of animal magnetism, later called mesmerism. Son of a gamekeeper on the estate of a bishop, Mesmer studied divinity first at Dillingen and then at Ingolstadt, where he acquired the degree of doctor of philosophy. He next went to Vienna to study law, and there he appears to have obtained a second doctoral degree. He received his third and final doctoral degree, in medicine, in 1766. Having become by then a man of independent means, Mesmer for a time followed his inclination to be a dilettante in a variety of scientific fields and was especially active as a patron of the musical arts. Himself a versatile musician, he was a friend of Gluck and a patron of young Mozart. What evidence there is shows Mesmer to have been a sensitive, sincere, well-educated man, of superior intelligence and possessed of an inquisitive and intuitive spirit, of imagination and enthusiasm, and of a genuine love for his fellow men.
Granted that he may have had unduly strong and erroneous convictions, that he may have been somewhat mystical and at times flamboyant, there is nevertheless little basis for believing that he was ever a charlatan or a quack. Universal fluid and animal magnetism. The origins of Mesmer's notions of animal magnetism are often said to go back to his 1766 doctoral thesis in medicine, "De planetarum influxu," which was concerned with the influence of the planets on the human body. In it he attempted to apply the writings of Newton and Descartes to older ideas, propounded by such men as van Helmont, Paracelsus, and Wirtig. His thesis was that there is a universal fluid permeating all things; it is in a perpetual state of flux and reflux and serves as a medium through which all coexisting objects continuously interact. In particular, it is through this fluid that the planets influence human beings. As might be expected, Mesmer did have something to say in his thesis with regard to the medical aspects of this influence, but he did not then appear to have developed the notion of "animal magnetism." During the summer of 1774, however, he was given an opportunity to witness a remarkable cure effected through the application of magnets. Intrigued, he himself began to experiment on a few patients, with some remarkable successes. In his efforts to find the true basis of these cures, whose source, he hypothesized, must lie in something other than the scientifically known physical properties of magnets, he returned to some of his earlier ideas about the universal fluid. Strongly impressed by the success of Johann Gassner, a popular healer of the day, in obtaining cures solely through the touch of the hands, Mesmer arrived at the notion that the universal fluid manifests itself in living organisms, particularly man, in a way quite analogous to the manner in which physical magnetism manifests itself in natural magnets. 
According to this analogy, there are like and unlike animal magnetic poles, which can be transmitted (or induced), changed, destroyed, and reinforced. Health depends upon a proper distribution and balance of such poles or, in other words, upon a proper distribution or concentration of the vital fluid. Mesmer attributed to physical magnets powerful animal magnetic properties, parallel to their physical properties, which enable them to affect the distribution of animal magnetism in other objects, particularly human beings. Moreover, he believed that some human beings are like physical magnets, in that they are powerful sources of animal magnetism, and can influence objects and humans. Since illness is the result of an inadequate distribution or a lack of
animal magnetism, a cure can be produced by altering the inadequate distribution through use of a powerful source of animal magnetism. This, in essence, is the doctrine of animal magnetism as Mesmer propounded and applied it. Mesmerism. However, in the hands of Mesmer's students and their students, at least three elements soon entered into the picture to distort the doctrine into "mesmerism." It is difficult to tell whether these were independent factors or whether each one followed more or less as a consequence of preceding ones. In any event, a critical departure was the discovery of "artificial somnambulism," a rather spectacular "nervous" condition that was presumably brought about by the use of animal magnetism and that produced in many individuals all sorts of unusual and often paranormal faculties. Another change was the increasing tendency to equate animal magnetism with the universal fluid discussed in Mesmer's thesis, rather than to consider it only one manifestation of that fluid. Last, animal magnetism became increasingly associated with various physical and paraphysical forces, so that, for instance, it was used to explain table tilting and turning during spiritualistic seances. In fact, in the hands of the mesmerists animal magnetism became a multifaceted biophysical entity which could account for just about anything. For Mesmer, animal magnetism was and remained a biophysical agency belonging to the Newtonian scheme of things, of interest primarily as a way to understand illnesses and a way to cure them rationally. For the mesmerists, animal magnetism became an occult agent, used primarily to bring about the somnambulistic state and other spectacular and extramedical effects. Mesmer himself noted this unfortunate development during the course of his life but was unable to stem its progress. Charges of malpractice.
Mesmer's successful but unconventional use of magnets was unacceptable to the relatively small and select group of practicing Viennese physicians, and in 1778 he was forced to leave Vienna, following what now appear to have been poorly founded accusations of malpractice. From Vienna he moved to Paris, where he enjoyed great popularity as a practitioner for several years but again met with increasing opposition and hostility on the part of the medical profession, which labeled him an impostor and a charlatan. The final blow was dealt Mesmer when, in 1784, a commission, of which Benjamin Franklin was a member, was appointed by the French government to investigate Mesmer's claims and concluded that animal magnetism did not exist. The commission did recognize that Mesmer had
effected numerous cures, but it preferred to ascribe them to as yet unknown physiological causes. It is worth noting that the commission itself never directly accused Mesmer of charlatanism; this accusation came from less well-informed professional men of his time. Forced to leave Paris by the attacks of the medical profession, Mesmer moved to Switzerland, where he lived out the remainder of his days as a medical doctor. Assessment. Today we know, of course, that Mesmer's animal magnetism is not a scientifically valid concept; but in the light of what constituted science, especially medical science, in his day, it probably seemed to have some validity. We cannot overlook the fact that Mesmer did observe the apparent cure of illnesses by some unknown principle or agent. He tried to find an explanation for these cures that was compatible with the best general scientific writings of his time, such as those of Descartes and Newton. Seen in retrospect, Mesmer appears more a somewhat tragic figure, a product and a victim of his time, than a villain. He lived in an era of widespread superstition, gullibility, and relative ignorance even among the upper classes, and of complete illiteracy among the common people. Black magic was still a thing to be feared, and wise men spoke seriously of the influence of the planets, advocated the intensive use of leeching, bleeding, and poultices as quasi-universal remedies, and talked learnedly of the circulation of the phlegm. Yet significant strides were being made in the direction of modern science: Newton died just before Mesmer was born, and such men as Lavoisier, Gay-Lussac, Gauss, and Ampère were Mesmer's contemporaries. Most unfortunately for Mesmer, however, the comte de Saint Germain and Cagliostro also were his contemporaries. These charlatans succeeded in linking their names to the practice of mesmerism, thus bringing it into their own suspect orbit of intrigue and infamy.
Finally, rather unwisely, Mesmer left much of the conduct of his affairs in the hands of students and friends who, however well-meaning they were, may have done him more harm than good through their overenthusiasm and personal inadequacies.
ANDRE M. WEITZENHOFFER
[See also HYPNOSIS.]
WORKS BY MESMER
(1779) 1957 Memoir of F. A. Mesmer, Doctor of Medicine, on His Discoveries: 1779. Mount Vernon, N.Y.: Eden. -> First published as Memoire sur la decouverte du magnetisme animal.
1781 Precis historique des faits relatifs au magnetisme animal jusques en avril 1781. London.
WORKS ABOUT MESMER
BERTRAND, ALEXANDRE 1826 Du magnetisme animal en France et des jugements qu'en ont portes les societes savantes . . . Paris: Bailliere.
GOLDSMITH, MARGARET L. 1934 Franz Anton Mesmer: A History of Mesmerism. Garden City, N.Y.: Doubleday.
ZWEIG, STEFAN (1931) 1932 Mental Healers: Franz Anton Mesmer, Mary Baker Eddy, Sigmund Freud. New York: Viking. -> First published as Die Heilung durch den Geist: Mesmer, Mary Baker Eddy, Freud.
MESSIANIC MOVEMENTS See MASS PHENOMENA; MILLENARISM; NATIVISM AND REVIVALISM; SECTS AND CULTS; SOCIAL MOVEMENTS.
METHODENSTREIT See ECONOMIC THOUGHT, article on THE AUSTRIAN SCHOOL; and the biographies of MENGER and SCHMOLLER.
METRAUX, ALFRED Alfred Metraux (1902-1963) was a pioneer in South American ethnohistory, a student of African culture in the New World, and a specialist in the field of race relations. He was also instrumental in promoting the role of the social sciences in the United Nations and its specialized agencies. Born in Lausanne, Switzerland, Metraux spent most of his childhood in Argentina, where his Swiss father was a well-known surgeon practicing in the city of Mendoza. He received his secondary and university education in Europe, studying at the Gymnasium in Lausanne, and in Paris at the Ecole Nationale des Chartes, the Ecole Nationale des Langues Orientales, the Ecole Pratique des Hautes Etudes, and finally the Sorbonne, from which he received a doctoral degree in 1928. He also studied briefly in Goteborg, Sweden. Among his teachers were Marcel Mauss, Paul Rivet, and Erland Nordenskiold. Metraux also acknowledged the influence of Father John M. Cooper of the Catholic University of America, Washington, D.C., with whom he corresponded for many years. It was Cooper who introduced him to the American school of cultural anthropology, and Metraux was to combine the best of both the European and the American traditions of historical anthropology in his work. His professional career was equally cosmopolitan. He was the first director, from 1928 to 1934, of the Institute of Ethnology at the University of Tucuman in Argentina. In 1934/1935, he led a French expedition to Easter Island. From 1936 to 1938 he was a fellow of the Bishop Museum in
Honolulu, and the following year he became the Bishop Museum visiting professor at Yale University. In 1939 he returned for a year to Argentina and Bolivia for field research as a fellow of the Guggenheim Foundation. Then he went back to Yale, where he worked with South American data on the Cross Cultural Survey (now Human Relations Area Files). In 1941 he joined the staff of the Bureau of American Ethnology of the Smithsonian Institution, and there he played an important role from 1941 to 1945 by editing and writing for the monumental Handbook of South American Indians (Steward 1946-1959). In addition, Metraux taught at the University of California at Berkeley, the Escuela Nacional de Antropologia e Historia of Mexico, the Colegio Nacional de Mexico, the Facultad Latino-Americana de Ciencias Sociales in Santiago, Chile, and the Ecole Pratique des Hautes Etudes in Paris. From 1946 until his retirement in 1962, Metraux served in various capacities for the United Nations and for UNESCO. As a representative of UNESCO he took part in the Hylean Amazon Project in 1947/1948 and in the Marbial Valley (Haiti) anthropological survey in 1949/1950. In cooperation with personnel from the International Labour Office, he studied the internal migrations of the Aymara- and Quechua-speaking Indians of Bolivia in 1954. He was primarily responsible for the publication by UNESCO, between 1950 and 1958, of a series of pamphlets, monographs, and books on the concept of race and on race and minority relations. As a staff member of the department of social science of UNESCO he was constantly in touch with social science research throughout the world. Metraux contributed most to the social sciences in the field of ethnohistory. Perhaps no other writer contributed more pages to the Handbook of South American Indians. Most of these contributions are derived from documentary sources and are models of judicious historical reconstruction. 
His two books on the Tupinamba (1928a; 1928b) are classics in the ethnohistory of the South American Indian. In these two books, he drew upon a wide range of sixteenth- to eighteenth-century sources, written in French, Portuguese, and Spanish, to present a coherent picture of the material and socioreligious life of the extinct coastal tribes of Brazil known generically as the Tupinamba. His books are a contribution not only to South American ethnography but also to Brazilian history. The Tupinamba were the first Brazilian Indians encountered by the Portuguese upon their arrival in the New World. Their language, Tupi-Guarani, became the lingua franca for missionaries, and their names for the flora,
fauna, and topographical features became part of the Portuguese language as spoken in Brazil. From these coastal Indians, the Europeans learned to adapt to the New World environment, and much of the Indians' religious belief and mythology became a part of Brazilian folklore. Metraux was also a sensitive and indefatigable field researcher among primitive and peasant societies of Latin America. He published numerous monographs and articles reporting upon his field research over a period of 25 years (see Wagley 1964 for his complete bibliography). His major field research was carried out in the Argentine Chaco and in Haiti. One piece of work in the Chaco stands out: his study of the mythology of the Toba and Pilaga Indians (1946), in which he made use of his vast knowledge of South American ethnology to draw parallels in theme and plot between Chaco mythology and that of the Andean region. Making a Living in the Marbial Valley, Haiti (1951) is a careful and detailed ecological study stressing the effects of minifundia (overparcelization), soil erosion, and overpopulation on the peasant society of one Haitian valley. In this report, such aspects of social life as cooperative work groups, marriage and household groups, and religion and religious organizations are shown in relation to the ecological adjustment. He also wrote often for the public at large, both books and articles in a variety of journals. Most of his popular writing was originally in French, later translated into English. It was always solidly based upon his own bibliographical and field research; this is also true of his books on Easter Island (1940; 1941). In these books he disagreed with the theories both of American Indian and of Asian origin of the Easter Island stone sculpture. He took the view that the Easter Islanders are both physically and culturally Polynesian and that their art forms are likewise of local origin.
Similarly, his book on Haitian voodoo (1958) is based upon many field trips to Haiti as well as on written sources. It is a study of the persistence of African fetish cults and African belief in Haiti and the relationship of this African religion, derived mainly from the Dahomeans of west Africa, to Catholicism. He gave an objective picture of voodoo as an orderly and complex religious system rather than a wild set of heathen orgies, as it had often been described [see CARIBBEAN SOCIETY].
CHARLES WAGLEY
[See also HISTORY, article on ETHNOHISTORY; and the biographies of COOPER; MAUSS; NORDENSKIOLD; RIVET.]
WORKS BY METRAUX
1928a La civilisation materielle des tribus Tupi-Guarani. Paris: Geuthner.
1928b La religion des Tupinamba et ses rapports avec celle des autres tribus Tupi-Guarani. Paris: Leroux.
1940 Ethnology of Easter Island. Bernice P. Bishop Museum, Bulletin No. 160. Honolulu (Hawaii): The Museum.
(1941) 1957 Easter Island: A Stone-age Civilization of the Pacific. New York: Oxford Univ. Press. -> First published as L'île de Pâques.
1946 Myths of the Toba and Pilagá Indians of the Gran Chaco. American Folklore Society, Memoirs, Volume 40. Philadelphia: The Society.
1951 Making a Living in the Marbial Valley, Haiti. Paris: UNESCO.
(1958) 1959 Voodoo in Haiti. New York: Oxford Univ. Press. -> First published as Le vaudou haïtien.
SUPPLEMENTARY BIBLIOGRAPHY
STEWARD, JULIAN H. (editor) (1946-1959) 1963 Handbook of South American Indians. 7 vols. U.S. Bureau of American Ethnology, Bulletin No. 143. New York: Cooper Square.
WAGLEY, CHARLES 1964 Alfred Metraux, 1902-1963. American Anthropologist New Series 66:603-613.
METROPOLITAN GOVERNMENT See under CITY.
MEYER, ADOLF Adolf Meyer (1866-1950) was the dominant figure in American psychiatry during the first four decades of this century. He was a major force in molding psychiatry into its current form, but his teachings have become so solidly incorporated into American psychiatric theory and practice that the sweep and depth of his influence are often overlooked. He gave American psychiatry its pluralistic and instrumental orientation; its holistic approach to human problems; its conceptualization of psychiatric disorders, including schizophrenia, as reaction patterns rather than discrete disease entities; its concern with the psychotherapy of the psychoses. His contributions have been eclipsed, but not displaced, by those of Freud and by the ascendancy of psychoanalysis. Meyer was born in Niederweningen, near Zurich, Switzerland, and emigrated to the United States soon after receiving his doctoral degree in 1892. He filled, in succession, the positions of neuropathologist at Kankakee State Hospital in Illinois, clinical director at Worcester State Hospital in Massachusetts, and chief of the Pathological Institute of New York's state mental hospitals. He increasingly became convinced that the essential pathology of mental disorders is to be found in the person and not in the brain cells. When the Johns
Hopkins Medical School decided to establish a department of psychiatry in 1908, Meyer was the obvious and unanimous choice for the new professorship. He remained at Hopkins until his retirement in 1941, by which time he had long been recognized as the dean of American psychiatrists. The cultural setting may have determined the orientation of psychiatry in the United States. In a country in which people were undergoing rapid acculturation, the importance of environmental influences upon personality change was more apparent than in Europe. Even though Meyer was a Swiss, he was particularly suited by birth and training to introduce a characteristically American pragmatic, pluralistic, and instrumental approach into psychiatry. He was born into a family that considered itself the spiritual heir of Kleinjogg (Jakob Gujer), a folk philosopher who had practiced and taught an "instrumental" approach to farming and communal living, combating the superstitions and the confining traditional usages of the farmer. Meyer had gained an exceptional grounding in neuroanatomy and neuropathology under Constantin von Monakow and Auguste Forel and, while studying abroad, was attracted by Thomas H. Huxley's evolutionary and ecological approach to biology and by Hughlings Jackson's concepts of the integration of the nervous system. Soon after his arrival in the United States he came under the influence of those men who had shaped the American philosophical and sociological tradition—Charles Peirce, William James, John Dewey, G. H. Mead, and C. H. Cooley. Meyer fused these various influences into a new conceptualization of human behavior, which he termed psychobiology, or ergasiology. 
He recognized that the Jacksonian concepts of the evolution and integration of the nervous system needed to be extended to include the highest level of integration through mentation: what man thinks affects his functioning down through the cellular and biochemical level, but, conversely, his thinking and feeling can be affected by the functioning of the organism at all levels of integration. Psychobiology offered an approach to the mind-body problem that obviated the need for the unsatisfactory mind-brain parallelism that had directed scientific attention to the study of the brain, to the neglect of the processes of living. Meyer made a number of fundamental contributions to neuroanatomy and neuropathology, including the discovery of the temporal-lobe detour of the optic radiations, termed "Meyer's loop," and his studies of central neuritis and aphasia; and he introduced the construction of plasticine models
into the teaching of neuroanatomy. However, he increasingly directed his attention to problems of the essentially human aspects of behavioral integration. Although Meyer welcomed the development of psychoanalysis and particularly its emphasis upon early childhood experiences and upon the role of symbolization, he considered the focus upon instinctual vicissitudes and unconscious motivations as unduly limited and neglectful of the total person. He increasingly opposed the premature oversystematizations in Freud's theorizing. Meyer insisted upon studying the problems of human adaptation and integration in their total complexity. Meyer's conception of psychiatric disorders as types of reaction patterns that are exaggerations of, aberrations from, or substitutions for, more normal and workable ways of living profoundly influenced the course of psychiatry. He turned away from psychiatry's efforts to become part of the mainstream of medical science by discovering some unknown biological or neuroanatomical basis for insanity, and chose instead to examine how people's ways of living and thinking can go astray. Of particular moment was his extension of this reaction-pattern concept to schizophrenia, as outlined in his 1906 paper "Fundamental Conceptions of Dementia Praecox" (Collected Papers, vol. 2, pp. 432-437), which emphasized that schizophrenia can result from deterioration of habit patterns, including habits of thinking. At the time, he stood almost alone in considering that schizophrenia may be a disorder of the personality rather than of the brain or its metabolism. His dynamic concept of schizophrenia also led to his insistence that patients suffering from schizophrenic reactions are amenable to psychotherapy and resocializing measures. At the Henry Phipps Psychiatric Clinic of the Johns Hopkins Hospital, which opened in 1914, Meyer developed the first significant teaching and research psychiatric hospital that was an integral part of a medical school.
It provided the model for medical school teaching and residency training in psychiatry for the next quarter century. A large proportion of the outstanding psychiatric teachers and investigators in the United States and Great Britain trained under Meyer, spreading his orientation throughout the English-speaking world. Meyer was a man of broad vision, and his energy was sufficient to turn vision into reality. When he retired, psychiatry was on the verge of the vast expansion that followed World War II. Meyer had guided and nurtured it through its immaturity, propounding a psychiatry that had roots in both
the biological and behavioral sciences, countering premature theoretical closures by his insistence upon a holistic and pluralistic approach, and fostering a psychotherapeutic approach to the psychoses.
THEODORE LIDZ
[For the historical context of Meyer's work, see the biographies of COOLEY; DEWEY; JAMES; MEAD; PEIRCE. For discussion of the subsequent development of his ideas, see MENTAL DISORDERS, article on BIOLOGICAL ASPECTS; PSYCHIATRY; SCHIZOPHRENIA.]
WORKS BY MEYER
Collected Papers. 4 vols. Edited by Eunice Winters. Baltimore: Johns Hopkins Press, 1950-1952. -> Volume 1: Neurology. Volume 2: Psychiatry. Volume 3: Medical Teaching. Volume 4: Mental Hygiene.
The Commonsense Psychiatry of Dr. Adolf Meyer. Edited by Alfred Lief. New York: McGraw-Hill, 1948. -> Consists of 52 selected papers.
WORKS ABOUT MEYER
BLEULER, M. 1962 Early Swiss Sources of Adolf Meyer's Concepts. American Journal of Psychiatry 119:193-196.
CAMPBELL, C. MACFIE 1937 Adolf Meyer. Archives of Neurology and Psychiatry 37:715-731.
EBAUGH, FRANKLIN G. 1937 Adolf Meyer: The Teacher. Archives of Neurology and Psychiatry 37:732-741.
MICHELS, ROBERT Robert Michels (1876-1936) belongs to that generation of European sociologists which tried to apply the insights of the founders of sociology to the understanding of twentieth-century Western society. Michels' standing in sociology is assured by his brilliant monograph Political Parties (1911a), in which he formulated the problem of oligarchical tendencies in organizations. Like Schumpeter, Geiger, Mannheim, Lukacs, de Man, and Ortega, he grappled with the problems of democracy, socialism, revolution, class conflict, trade unionism, mass society, nationalism, and imperialism, and with the role of intellectuals and of elites. He dealt more extensively than did these contemporaries of his with the politics of the working class, and he studied some topics that interested them little, such as eugenics, feminism, sex, and morality. More passionately committed than they were, he found himself deeply involved in the ideological and national conflicts of his time, and his work probably suffered from this involvement. Michels' background was cosmopolitan: he was born in Cologne, into a bourgeois-patrician family with a German-French-Belgian background. He attended the Gymnasium in Berlin and, after serving
in the army, studied in England and at the Sorbonne. He then went to Munich, where he attended lectures by the economist Lujo Brentano, and in 1897 he studied in Leipzig with Erich Brandenburg, Karl Lamprecht, and others. The following year he went to the University of Halle, studying with Michael Conrad and Hans Vaihinger and with Theodor Lindner, whose daughter he later married; in 1900 he completed his dissertation in history. Until World War I he was in close touch with the intellectual and political worlds of Belgium and France. Although he studied in England and taught in the United States, his interest in and understanding of the Anglo-Saxon world remained limited; his outlook was that of a continental European. As a young Dozent at the University of Marburg, he became a socialist and participated in the Social Democratic party congresses of 1903, 1904, and 1905. He left the party in 1907 but attended the Stuttgart congress of that year as a delegate of the Italian Socialist party (he had become a libero docente at the University of Turin). A few months later he also resigned from the Italian Socialist party. Because of his socialist views, it was impossible for Michels to qualify for a position at a German university. Max Weber strongly deplored this stand on the part of the German universities and showed a great deal of personal interest in the young Michels. He admitted him to what he called the salon des refuses in Heidelberg, and in 1913 he asked Michels to become coeditor of the Archiv fur Sozialwissenschaft und Sozialpolitik. Weber surely made a profound impression on Michels. It was partly Weber's influence and partly the security gained from his academic position at Turin that led Michels to shift from short and somewhat journalistic writings to more substantial publications in scholarly journals. 
Michels described his own political evolution in an autobiographical essay, "Eine syndikalistisch gerichtete Unterstromung im deutschen Sozialismus" (1932a); he also recorded his perception of the external events of the first ten years of the twentieth century, in the introduction to the Italian edition of Political Parties. His political views were influenced by Arturo Labriola and Enrico Leoni, whom he met in 1902, and by the French syndicalists Georges Sorel, Hubert Lagardelle, Edouard Berth, Paul Delesalle, and Victor Griffuelhes, with whom he became increasingly friendly after 1904. Disturbed by the state of the labor movement, Michels was attracted to the idea of revitalizing it by fusing the ideas of Marx, Proudhon, and Pareto. In particular he deplored the way calculations of
parliamentary advantage dominated party life and led to the abandonment of every vigorous idea and every energetic course of action. The contrast between the revolutionary statements that were made by the Social Democratic party in general and by August Bebel in particular and the cautious policy they actually pursued was brought home to him by the failure of the Ruhr strike in 1905 (Weber was also disturbed by this contrast). In a long and well-documented article (1907) Michels analyzed the socialist ideological position, particularly with respect to pacifism and the general strike to avert war; by a neat juxtaposing of texts, he made plain the extent to which radical statements and actual policy diverged. Michels' involvement in German politics provided him with insights for his critique of the Social Democratic party and of the trade unions. It is clear from his autobiography that his descriptions of the demagogic orator, of party congresses, of the characteristics required for leadership, and, especially, of the intellectual in politics and of the class renegade are largely based on his own experiences. But his involvement was not motivated solely by intellectual concerns; it was related to his love for passion, for action, for youth, for principle irrespective of consequences, and for symbolic gestures. Indeed, his early political stance—his intellectual evolution toward a voluntaristic outlook—was the basis of his later affinity with fascism. His political life seems discontinuous and inconsistent if Political Parties is read only as the work of a disappointed democrat or a disillusioned regular member of the Social Democratic party. In fact, the life of a syndicalist Michels makes more sense than that of a purely Marxist-socialist Michels. Michels refused to support Germany in World War I, and this led to what must have been a painful break with Weber. In 1914 he moved to Basel, where he became professor of economics.
In 1926 he taught a course in political sociology at the University of Rome. The following year he was a visiting professor in the United States, and after that he became professor of economics at the University of Perugia. He died in 1936 in Rome. His life was that of a romantic, a frustrated politician, a patriot of an adopted country, and a scholar; it reflected as have few others the conflicts of loyalty and the intellectual ambivalences of the first decades of the twentieth century.
"Political Parties"
In the years between 1906, when he first published in Weber's Archiv, and 1910, when Political Parties was completed, Michels' life contained elements that have often produced classic works: a deeply felt and probably painful personal experience—his involvement with the revolutionary cause and its interference with his academic career—and the impact of major intellectual figures, particularly Max Weber and Mosca. (Michels had become friendly with Mosca in Turin.) The starting point of Michels' classic study of political parties is the hypothesis that in organizations committed to the realization of democratic values there inevitably arise strong oligarchic tendencies, which present a serious if not insuperable obstacle to the realization of those values. "It is organization which gives birth to the domination of the elected over the electors, of the mandatories over the mandators, of the delegates over the delegators. Who says organization says oligarchy" ([1911a] 1962, p. 15). Thus Michels summed up his famous "iron law of oligarchy."
The nature of leadership. Michels was dissatisfied with "psychological" (i.e., motivational) explanations of the oligarchic tendencies in organizations. His whole analysis emphasized the constraints derived from organizational needs—the growth of the organization, the need to make rapid decisions, the difficulties of communicating with the members, the growth and complexity of the tasks, the division of labor, the need for full-time activity—and from the consequent processes of selection of leadership and development of knowledge and skills. These processes, in turn, lead to the emergence of stable leaders, whose professionalization, combined with their consciousness of their own worth, leads to oligarchy. The important point is that the leaders' deviation from norms they themselves accept is not the result of their motivation. The fact that conformity to certain norms may indirectly lead to deviation from other norms accepted by the same person has, of course, been emphasized by social scientists since Marx.
Michels studied the special case of men who, despite their commitment to democracy, often acted in ways not conforming to their values because of the demands of organization and other factors of political life. While Michels often referred to the "psychological predispositions" of both the masses and the leaders, he saw these predispositions as fundamentally serving to reinforce or, occasionally, to weaken the organizational factors, even though at times they also seemed to him to function independently. Significantly, when he presented his theory schematically in a chart (1911a, p. 382), he did not stress the manipulative or illegitimate actions of the leaders (which he discussed at length elsewhere in Political Parties)
but concentrated instead on the factors influencing the active and effective participation of the members in decision making.
Leaders and followers. In organizations with formally democratic constitutions—such as the German working-class parties, which Michels examined closely—elections (and to a lesser extent referenda) determine who shall act in the name of the members. Elections presumably also assure the accountability of the leaders to the members. Michels' concern in a large part of Political Parties is with the way the leaders take advantage of the incompetence and emotionality of their followers to hold on to power and become a de facto oligarchy. When they establish such an oligarchy, they are no longer willing to submit their power to free electoral confirmation. In his later writings (1927a; 1933a) Michels made a virtue of what initially he had seen only as an iron law; he was carried away by his preference for decisive leadership and an elite unhampered by the "numerical maximum, mortal enemy to all freedom of program and thought" (1927a, p. 765). By this time he saw no difference between elected representatives and charismatic leaders to whom the mass voluntarily sacrifices its will in conscious admiration and veneration (1928a, p. 291). Furthermore, he believed that "leaders never give up their power to the 'mass' but only to other, new leaders" ("die Führer weichen niemals der 'Masse' sondern immer nur anderen, neuen, Führern"; 1928a, p. 291). He did not seem to realize that it does make a difference whether leaders are displaced by elections, in which the majority decides who shall lead, or by death or violent revolution. Furthermore, de facto oligarchy is not necessarily identical with de jure oligarchy, or dictatorship.
The fact that de facto oligarchs need to manipulate their followers in the ways that Michels, Mosca, and Pareto described certainly makes "oligarchic" or corrupt democracies like Italy in the Giolittian period, from 1900 to 1914, very different from dictatorships like Italy under Mussolini. The inability of Michels to work out in his later writings a clear conception of the new elitist parties—the Fascists and the Bolsheviks—and his tendency to see them only as manifestations of the same general tendency to oligarchy are partly a result of this confusion. Michels was not satisfied with electoral accountability as a criterion of democracy; in fact, he considered this de jure aspect insignificant compared to the de facto circumstances that affect the electoral process. He therefore constantly returned to another dimension: the degree of responsiveness
of (stable) leadership to the expectations and desires of the constituency. Presumably, if democratic leaders do not respond to the expectations and desires of their constituents, they will be defeated at the polls. Also, according to democratic theory, the wishes of the constituency will coincide with its interests, and democracy is the best way of assuring the satisfaction of those interests. Much of Political Parties, however, argues that leaders are responsive, not to the desires or interests of their constituents, but to the interests of the organization or to their own interests. (Michels noted perceptively that this identification is often unconscious.) This lack of responsiveness does not result, according to Michels, from a divergence between the interests of the leaders and those of their constituents but from the apathy and ignorance of the constituents—demonstrating what he called the incompetence of the masses—and from the general unwillingness of the leaders to overcome this passivity. (Only when new leaders challenge the old, raising real or spurious issues, are any attempts made to mobilize and inform the constituency.) In his discussion of the responsiveness of the leadership to its followers, Michels was only dimly aware of what Carl Friedrich has called the "rule of anticipated reactions": when leaders have neither the time nor the technical means to ascertain the wishes of their constituency, or when those wishes have not crystallized, the leaders are generally guided by some sense of what their constituents' desires might be. This capacity to anticipate is characteristic of any leadership, but especially of democratic leadership. Another dimension constantly present in Michels' analysis, as in all discussions of oligarchy and democracy, is the nature of the responsibility of the leaders to their followers: are leaders responsible only to their constituency, or are they responsible also to the larger whole of which their constituency is a part? 
are they responsible to the party membership or to the electorate? The problem of responsibility to a larger unit—the society as a whole—rather than to a particular constituency becomes especially acute when a party is in power, rather than in the opposition; this was a problem socialist leaders had not faced at the time Political Parties was written. Party ideology and party policy. Michels was much concerned with two questions (which are somewhat confused in his brilliant chapter "The Conservative Basis of Organization"): can a revolutionary party follow a revolutionary policy? and can a democratic party follow a democratic policy? He felt that the answer to the first question is
clearly negative if a revolutionary party hopes to achieve its goals by obtaining an electoral majority. To the question about democratic parties, his answer was less clear-cut. While he did assert that "within certain narrow limits, the Democratic Party, even when subject to oligarchic control, can doubtless act upon the state in the democratic sense" ([1911a] 1962, p. 333), he also tended to argue that if democratic parties are not internally democratic, then democracy is impossible. Lipset (1962) and Sartori (1960) have pointed out, however, that competition between parties makes the politically "organized" minority (within each party) dependent at times and to a degree on the "nonorganized" majority. This competition assures the citizen of a degree of participation and power. Michels conceived of an ideal party as a purely ideological group, open only to those who share the goals of the founding members and identify their interests with the original conception of the interests of the group. According to Michels, the sole cause of deviation from party ideology is oligarchy. It is in the nature of oligarchy to sacrifice ideological purity to the methodical organization of the masses for electoral victory.
By such methods, not merely does the party sacrifice its political virginity, by entering into promiscuous relationships with the most heterogeneous political elements, relationships which in many cases have disastrous and enduring consequences, but it exposes itself in addition to the risk of losing its essential character as a party. The term "party" presupposes that among the individual components of the party there should exist a harmonious direction of wills towards identical objectives and practical aims. Where this is lacking, the party becomes mere "organization." ([1911a] 1962, p. 341)
The contrast between the party as an ideologically pure expression of interests (Michels had in mind primarily class interests) and the reality of modern mass parties is in many ways similar to that between sect and church in the sociology of religion of Ernst Troeltsch. Just as Troeltsch identified primitive Christianity with a sectlike conception, so did Michels identify the early socialist party with an ideal party. Michels' ideal party is elitist—a group sharing a commitment to an ideological understanding of class interest. His conception did not encompass either organizations with specific substantive goals or modern mass parties. Lipset and Sartori have noted that the more specific the substantive goal of an organization, the more difficult it may be to find commitment to a procedural goal, such as
democracy. The narrower the substantive goals, the less likely it is that the members have either the need or the time to participate and influence policy. This would explain why there is less democracy in trade unions than in parties. As for mass democratic parties, Michels believed that since these are "open" and do not require a declaration of faith in party principles, they cannot be "true" parties. He accurately pinpointed the open quality of modern parties as one of their basic characteristics, but his indignation at some of the consequences of this phenomenon prevented him from analyzing it and seeing that it is inevitable, given a society that has moved from closed and established status groups to voluntary, open organizations that compete for power in a system of universal suffrage. Characteristics of oligarchy. Michels used the term "oligarchy" or "oligarchic tendency" to cover several aspects of political behavior that are conceptually quite distinct and that may or may not coexist in organizations, parties, or trade unions: (1) the emergence of leadership; (2) the emergence of professional leadership, and its stabilization; (3) the formation of a bureaucracy, that is, an appointed, regularly paid staff with distinct duties; (4) the centralization of authority; (5) the displacement of goals, particularly the shift from ultimate goals (e.g., achieving a socialist society) to instrumental goals (i.e., perpetuating the organization): with the growth of "conservative tendencies" in revolutionary parties, the survival of the organization takes precedence over the revolution itself, and increased emphasis is placed on satisfying the immediate needs of the members, through such activities as collective bargaining or participation in municipal government ("reformism"); and the addition of goals (e.g., ameliorating the condition of the working class); (6) increased ideological rigidity—conservatism, in the sense of adherence to policies and ideas that have been
rendered obsolete by changed circumstances, and intolerance toward attempts to revise such policies or ideas; (7) the growing difference between the interests and/or points of view of the leaders and of the members, and the precedence of the leaders' interests over those of the members; (8) the decrease in the members' opportunities to participate in policy decisions, even when they are willing to participate; (9) the co-optation of emergent opposition leaders by the existing leadership; (10) the "omnibus" tendency of parties, the shift from appeals to the membership to appeals to the electorate and from appeals to a class electorate to
appeals to a broader electorate—such shifts may produce a more moderate program, while opposition as a matter of principle is replaced by competition with other parties, and disloyal opposition to the social and political system is replaced by loyal opposition and even by participation in governing. While the first nine of these characteristics may be found in very different types of organizations, the tenth is valid only for revolutionary parties or for organizations in a "democratically competitive political system" (and perhaps particularly in its parliamentary variety). If, as the list suggests, the label "oligarchic tendencies" is used to cover so many different things, it becomes quite meaningless. Such critics as Cassinelli (1953) and Dahl (1958) therefore have tried to define the meaning of "oligarchy"—or of related concepts, like "ruling class"—in more precise and operational terms. They have also endeavored to show that some of the processes may be independent, rather than closely linked. Finally, they feel it is essential to clarify which of these tendencies are inherently incompatible with democracy and which can coexist with it. Many factors favoring oligarchy seem to be especially characteristic of working-class organizations. In such organizations it is difficult for leaders to return to manual work in the factory after assuming leadership that implies a middle-class position; also, the workers' lack of education, their limited access to information, their frequent apathy, together with their predisposition to authoritarian attitudes, all contribute to the development of oligarchy. Evidence of such "oligarchic" tendencies in organizations with equalitarian, democratic, even revolutionary, ideology is nevertheless not proof of the validity of the iron law of oligarchy in all organizations.
Other works

In 1908 Michels published Il proletariato e la borghesia nel movimento socialista italiano (1908a), a work which he hoped would contribute to what is today called political sociology. Its content is similar to that of recent books in this field: discussions of the social composition of the Socialist party in parliament, of delegates to party congresses, of candidates in municipal elections, and of local party organizations, followed by an ecological analysis of electoral participation and of Socialist strength in the electorate. Michels was particularly original in his attempt to sketch what we would now call the "political culture" of Italy
in general and, specifically, that of the working class, making constant comparisons between Italian and German society. His combination of structural and psychological perspectives in this part of the book still stands as an example to sociologists who analyze the relationships between classes and the political manifestations of these relationships. The last part of the book is devoted to syndicalist currents and to comparisons between Italy and France. The discussion of the role of intellectuals in political parties, particularly in working-class parties, is linked with an analysis of the Italian occupational and academic structure. In the same year that Political Parties was published, there also appeared Michels' Die Grenzen der Geschlechtsmoral (1911b). Such subjects as feminism, the female worker, sexual morality in different societies, and birth control had always interested him; many years later, in 1928, he published a book, Sittlichkeit in Ziffern? (1928b), presenting the available statistical data on various aspects of sex and family life and social deviance. L'imperialismo italiano (1912) is concerned with the violent upheavals that the Tripoli war of 1911-1912 (in which Italy took several coastal cities and towns from the Turks) produced in Italian values. (The war led to a crisis in Michels' life, which he described in the introduction.) In the book, he dealt with the suffering caused by war, the moral impact of war propaganda, and the sacrifice of long-held values to rhetorical appeals, as well as with the failure to see the similarity between the motives of the Arabs defending their homeland and those of the fighters for the Risorgimento. He interpreted Italian imperialism partly in politicopsychological terms but mainly as resulting from demographic pressure and from the social and cultural loss due to overseas migration. Thus, he asserted that the imperialismo della povera gente was qualitatively different from the imperialism of other nations.
In the 1920s and early 1930s he produced a number of books and many articles, dealing with nationalism, Italian socialism and fascism, elites and social mobility, the role of intellectuals, the history of the social sciences, and other subjects not so closely related to his central interests. "Psychologie der antikapitalistischen Massenbewegungen" (1925a) remains one of the most interesting and well-documented systematic treatments of working-class protest. He often returned to the problem of oligarchy and democracy (1927a; 1928a; 1933a) but added little to the original formulation of 1911; his writings merely became
more antidemocratic in tone as a result of his tendency to see the new totalitarian parties as confirmation of his iron law of oligarchy. Assessment of Michels' contributions The chief basis of Michels' work, in addition to his personal experience, was secondary sources and contemporary accounts from magazines and newspapers. In many cases he did not analyze these secondary materials in a systematic way, and he never made an effort to develop a new methodology. His lack of sustained effort in the collecting of data and his reliance on scattered, piecemeal information gathered by others deprive his works of the unity characteristic of good monographs and of great books based on a single theme, although they often contain much valuable information. Originality. At the time that he wrote, Michels' work was not unique; a number of his contemporaries and immediate predecessors had formulated some of the same ideas and expressed similar sentiments. In footnotes, dedications, and later writings, he acknowledged the influence of, and intellectual affinity with, Mosca, Ostrogorskii, and Bryce; these men, in turn, acknowledged the coincidence of views. Like his immediate predecessors—Sorel, Pareto, Mosca, and Weber—Michels challenged the prevalent democratic and socialist climate of opinion. But his basic confrontation was with Marx. Thus, he wrote: . . . the defects of Marxism are patent directly as we enter the practical domains of administration and public law, without speaking of errors in the psychological field and even in more elementary spheres. . . . The problem of socialism is not merely a problem in economics. In other words, socialism does not seek merely to determine to what extent it is possible to realize a distribution of wealth which shall be at once just and economically productive. Socialism is also an administrative problem, a problem of democracy, and this not in the technical and administrative sphere alone, but also in the sphere of psychology. 
([1911a] 1962, pp. 349, 350)
In underlining the sociopsychological aspects of political behavior—the problem of power and its abuse; the susceptibility of the masses, particularly of the lower classes, to charismatic appeals; and the importance of organization and its constraints in fostering oligarchic and dictatorial tendencies—Michels was trying to break away from the vulgar Marxists, who saw the economic structure (of capitalism) only as a restriction of freedom. Viability of democracy. Michels' views on democracy have been the subject of much discussion.
Mosca (1912) presented Michels as an ademocratic theorist; others have gone further, arguing that his later writings and his attitude toward fascism provide evidence that he was actually antidemocratic; and finally, some have attributed to him a positive appraisal of the compatibility of organization and democracy. Such very different interpretations suggest two possibilities that are not mutually exclusive. First, Michels had, in fact, no clear-cut assessment of the viability of democracy, although his more ambitious formulations of the iron law of oligarchy suggest that his view was predominantly negative. Second, the commentators do not have in mind the same problems Michels did. In common with many intellectuals, Michels tended to define democracy in terms of what he considered favorable to the interests of the people, and he concluded that if the people express preferences for, support, or acquiesce in policies not compatible with their interests, this must be the result of oligarchic manipulations. To some he has therefore appeared as a disappointed democrat whose disillusionment ultimately led him to adopt an ademocratic and even antidemocratic elitist stance (see May 1965). Studies inspired by Michels Michels' theories have inspired considerable empirical research intended to support, specify, or challenge them. Much of that research has been done in Anglo-Saxon countries or by scholars trained there. The recent interest in general theories of organization, concerned with such processes as bureaucratization, goal displacement, and cooptation—as distinct from a concern with specific groups, like parties, unions, pressure groups, government agencies, and corporations—has also contributed to a renewed interest in Michels. In this theoretical development Union Democracy, a study of the International Typographical Union (I.T.U.) by Lipset, Trow, and Coleman (1956), has had a central place. Alone among American unions the I.T.U. 
has had, for over fifty years, a functioning two-party system at the national level. Its history seems to show that Michels' iron law of oligarchy is not, after all, universal: if even one exception can be found, there may be others. The study seeks to explain how the I.T.U. has maintained a system of democratic self-government. By examining the processes that have maintained democracy in a small society, the authors hoped to illuminate the relevant processes in the larger society. Although Union Democracy represents an important challenge to Michels' ideas, a large number
of studies of secular and religious organizations have confirmed Michels' theories, discovering processes similar to those he described. Thus, Harrison has shown (1959) that the American Baptist Convention, which is committed to the independence of its local churches and to the advisory nature of its larger organization, clearly manifests oligarchic tendencies. The growth of the organization and the increased complexity of tasks, the lack of means to ascertain the sentiments of the members on issues not directly relevant to them, specialization in pastoral work, the indifference of many of those concerned with organizational goals—all strengthen the leadership and increase its independence. The recent literature on political parties has also been influenced by Michels' early classic. The basic books by Maurice Duverger (1951) and Sigmund Neumann (1956) have summarized and taken issue with it, while stressing its pathbreaking character. Monographic studies by Robert T. McKenzie (1955), Renate Mayntz (1959), and Samuel J. Eldersveld (1964) have been explicitly directed to the questions Michels raised. Although Eldersveld, on the basis of careful research, challenged Michels' thesis, research on political parties generally confirms Michels' observation that they are less oligarchic than single-purpose organizations concerned with more technical problems.

JUAN J. LINZ

[See also DEMOCRACY; ELITES; OLIGARCHY; PARTIES, POLITICAL. Other relevant material may be found in LEADERSHIP; ORGANIZATIONS; POLITICAL SOCIOLOGY; SOCIAL MOVEMENTS; VOLUNTARY ASSOCIATIONS; and in the biographies of BRYCE; MOSCA; OSTROGORSKII; PARETO; SOREL; WEBER, MAX.]

WORKS BY MICHELS

1905-1906 Proletariat und Bourgeoisie in der sozialistischen Bewegung Italiens: Studien zu einer Klassen- und Berufsanalyse des Sozialismus in Italien. Archiv für Sozialwissenschaft und Sozialpolitik 21:347-416; 22:80-125, 424-466, 664-726.
1906 Die deutsche Sozialdemokratie: I. Parteimitgliedschaft und soziale Zusammensetzung. Archiv für Sozialwissenschaft und Sozialpolitik 23:471-556.
1907 Die deutsche Sozialdemokratie im internationalen Verbande: Eine kritische Untersuchung. Archiv für Sozialwissenschaft und Sozialpolitik 25:148-231.
1908a Il proletariato e la borghesia nel movimento socialista italiano: Saggio di scienza sociografico-politica. Turin: Bocca.
1908b Die oligarchischen Tendenzen der Gesellschaft: Ein Beitrag zum Problem der Demokratie. Archiv für Sozialwissenschaft und Sozialpolitik 27:73-135.
1908c Le syndicalisme et le socialisme en Allemagne. Pages 21-28 in Syndicalisme et socialisme. Edited by Hubert Lagardelle. Paris: Rivière.
(1911a) 1962 Political Parties: A Sociological Study of the Oligarchical Tendencies of Modern Democracy. With an introduction by Seymour M. Lipset. New York: Free Press. -> First published as Zur Soziologie des Parteiwesens in der modernen Demokratie. The 1911 German edition and the Italian translations include a graphic schema of Michels' theory, not included in the English editions.
1911b Die Grenzen der Geschlechtsmoral: Prolegomena, Gedanken und Untersuchungen. Grünwald: Frauenverlag.
(1912) 1914 L'imperialismo italiano: Studi politico-demografici. Rev. & enl. ed. Milan: Società Editrice Libraria. -> First published in German.
1914 Probleme der Sozialphilosophie. Leipzig: Teubner.
1922 La teoria di C. Marx sulla miseria crescente e le sue origini: Contributo alla storia delle dottrine economiche. Turin: Bocca.
1924a Elemente zur Soziologie in Italien. Kölner Vierteljahrshefte für Soziologie 3:219-249.
1924b Lavoro e razza. Milan: Vallardi.
1925a Psychologie der antikapitalistischen Massenbewegungen. Section 9, part 1, pages 241-359 in Grundriss der Sozialökonomik. Tübingen: Mohr.
1925b Nachtrag zu Robert Michels' Aufsatz: Elemente zur Soziologie in Italien. Kölner Vierteljahrshefte für Soziologie 4:331 only.
1925c Sozialismus in Italien: Intellektuelle Strömungen. Munich: Meyer & Jessen.
1925d Sozialismus und Fascismus in Italien. Munich: Meyer & Jessen.
1926 Soziologie als Gesellschaftswissenschaft. Lebendige Wissenschaft, Vol. 4. Leipzig: Kröner.
(1927a) 1949 The Sociological Character of Political Parties. Pages 134-155 in Robert Michels, First Lectures in Political Sociology. Minneapolis: Univ. of Minnesota Press. -> First published in Volume 21 of the American Political Science Review.
1927b Bedeutende Männer: Charakterologische Studien. Leipzig: Quelle & Meyer. -> Biographical studies of Bebel, de Amicis, Lombroso, Schmoller, Max Weber, Pareto, Sombart, W. Müller—seven of whom Michels knew personally.
(1927-1936) 1949 First Lectures in Political Sociology. Translated with an introduction by Alfred de Grazia. Minneapolis: Univ. of Minnesota Press. -> Includes a translation of Corso di sociologia politica.
1928a Grundsätzliches zum Problem der Demokratie. Zeitschrift für Politik 17:289-295.
1928b Sittlichkeit in Ziffern? Kritik der Moralstatistik. Munich: Duncker & Humblot.
1928c Die Verelendungstheorie: Studien und Untersuchungen zur internationalen Dogmengeschichte der Volkswirtschaft. Leipzig: Kröner. -> An excellent scholarly study of the historical development of the idea of immiserization.
1929 Der Patriotismus: Prolegomena zu seiner soziologischen Analyse. Munich: Duncker & Humblot.
1930a Authority. Volume 2, pages 319-321 in Encyclopaedia of the Social Sciences. New York: Macmillan.
1930b Italien von heute: Politische und wirtschaftliche Kulturgeschichte von 1860 bis 1930. Zurich: Füssli.
(1931) 1949 Patriotism. Pages 156-166 in Robert Michels, First Lectures in Political Sociology. Minneapolis: Univ. of Minnesota Press. -> First published in Handwörterbuch der Soziologie.
1932a Eine syndikalistisch gerichtete Unterströmung im deutschen Sozialismus (1903-1907). Pages 343-364 in Festschrift für Carl Grünberg zum 70. Geburtstag. Leipzig: Hirschfeld.
1932b Intellectuals. Volume 8, pages 118-126 in Encyclopaedia of the Social Sciences. New York: Macmillan.
1933a Studi sulla democrazia e sull'autorità. Perugia, Università, Facoltà Fascista di Scienze Politiche, Collana di Studi Fascisti, 24-25. Florence: "La Nuova Italia."
1933b Historisch-kritische Untersuchungen zum politischen Verhalten der Intellektuellen. Schmollers Jahrbuch für Gesetzgebung, Verwaltung und Volkswirtschaft im Deutschen Reiche 57:807-884.
1934 Umschichtungen in den herrschenden Klassen nach dem Kriege. Stuttgart: Kohlhammer.

SUPPLEMENTARY BIBLIOGRAPHY
BUKHARIN, NIKOLAI I. (1921) 1965 Historical Materialism: A System of Sociology. Translated from the 3d Russian edition. New York: Russell. -> First published as Teoriia istoricheskogo materializma.
BURNHAM, JAMES (1943) 1963 The Machiavellians, Defenders of Freedom. Chicago: Regnery.
CASSINELLI, C. W. 1953 The Law of Oligarchy. American Political Science Review 47:773-784.
DAHL, ROBERT A. 1958 Critique of the Ruling Elite Model. American Political Science Review 52:463-469.
DUVERGER, MAURICE (1951) 1962 Political Parties: Their Organization and Activity in the Modern State. 2d English ed., rev. New York: Wiley; London: Methuen. -> First published in French.
ELDERSVELD, SAMUEL J. 1964 Political Parties: A Behavioral Analysis. Chicago: Rand McNally.
GOULDNER, ALVIN W. 1955 Metaphysical Pathos and the Theory of Bureaucracy. American Political Science Review 49:496-507.
HARRISON, PAUL M. 1959 Authority and Power in the Free Church Tradition: A Social Case Study of the American Baptist Convention. Princeton Univ. Press.
LINZ, JUAN J. 1966 Michels e il suo contributo alla sociologia politica. Pages vii-cxiii in Robert Michels, La sociologia del partito politico nella democrazia moderna. Bologna: Carlino.
LIPSET, SEYMOUR M. (1954) 1960 The Political Process in Trade-unions. Pages 357-397 in Seymour M. Lipset, Political Man: The Social Bases of Politics. Garden City, N.Y.: Doubleday.
LIPSET, SEYMOUR M. 1962 Introduction. Pages 15-39 in Robert Michels, Political Parties: A Sociological Study of the Oligarchical Tendencies of Modern Democracy. New York: Free Press.
LIPSET, SEYMOUR M. 1964 The Biography of a Research Project: Union Democracy. Pages 96-120 in Phillip E. Hammond (editor), Sociologists at Work: Essays on the Craft of Social Research. New York: Basic Books.
LIPSET, SEYMOUR M.; TROW, MARTIN A.; and COLEMAN, JAMES S. 1956 Union Democracy: The Internal Politics of the International Typographical Union. Glencoe, Ill.: Free Press. -> A paperback edition was published in 1962 by Doubleday.
MCKENZIE, ROBERT T. (1955) 1963 British Political Parties: The Distribution of Power Within the Conservative and Labour Parties. 2d ed. New York: St. Martin's.
MAY, JOHN D. 1965 Democracy, Organization, Michels. American Political Science Review 59:417-429. -> One of the most important studies of Michels' work.
MAYNTZ, RENATE 1959 Parteigruppen in der Grossstadt: Untersuchungen in einem Berliner Kreisverband der CDU. Cologne: Westdeutscher Verlag.
MOMMSEN, WOLFGANG 1959 Max Weber und die deutsche Politik, 1890-1920. Tübingen: Mohr.
MOSCA, GAETANO (1912) 1949 La sociologia del partito politico nella democrazia moderna. Pages 26-36 in Gaetano Mosca, Partiti e sindacati nella crisi del regime parlamentare. Bari: Laterza. -> A review of Michels' Political Parties. First published in Volume 1 of Il pensiero moderno.
NEUMANN, SIGMUND (editor) 1956 Modern Political Parties: Approaches to Comparative Politics. Univ. of Chicago Press.
NIPPERDEY, THOMAS 1961 Die Organisation der deutschen Parteien vor 1918. Düsseldorf: Droste.
PERUGIA, UNIVERSITÀ, FACOLTÀ DI GIURISPRUDENZA 1937 Studi in memoria di Roberto Michels. Annali, Vol. 49. Padua: CEDAM. -> Contains a bibliography of Michels' writings on pages 37-76.
RITTER, GERHARD A. 1959 Die Arbeiterbewegung im Wilhelminischen Reich: Die Sozialdemokratische Partei und die freien Gewerkschaften, 1890-1900. Berlin (West Berlin) Freie Universität, Friedrich Meinecke Institut, Studien zur Europäischen Geschichte, No. 3. Berlin-Dahlem: Colloquium Verlag.
ROTH, GUENTHER 1963 The Social Democrats in Imperial Germany: A Study in Working-class Isolation and National Integration. Totowa, N.J.: Bedminster Press.
SARTORI, GIOVANNI 1960 Democrazia, burocrazia e oligarchia nei partiti. Rassegna di sociologia 1:119-136.
SARTORI, GIOVANNI (1962) 1965 Democratic Theory. New York: Praeger. -> Based on the author's translation of his Democrazia e definizione (1957).
SCHORSKE, CARL E. 1955 German Social Democracy, 1905-1917: The Development of the Great Schism. Harvard Historical Studies, Vol. 65. Cambridge, Mass.: Harvard Univ. Press.
SCHUMPETER, JOSEPH A. (1942) 1950 Capitalism, Socialism, and Democracy. 3d ed. New York: Harper; London: Allen & Unwin. -> A paperback edition was published by Harper in 1962.
WEBER, MAX 1905 Bemerkungen im Anschluss an den vorstehenden Aufsatz. Archiv für Sozialwissenschaft und Sozialpolitik 20:550-553. -> Comments on the article "Die soziale Zusammensetzung der sozialdemokratischen Wählerschaft Deutschlands," by R. Blank.
MIDDLE AMERICAN SOCIETY Middle America is a cultural and geographical region comprising Mexico, Central America, and Panama. Central America is composed of Guatemala, British Honduras (or Belize), Honduras, El Salvador, Nicaragua, and Costa Rica. Because of its size and many problems shared with the other countries, Panama is sometimes considered a part of Central America, although it was formerly a part of Colombia. The term Mesoamerica, coined by Paul Kirchhoff (1943), refers to a cultural subregion within Middle America, specifically the areas occupied by the pre-Columbian high cultures
[see URBAN REVOLUTION, article on EARLY CIVILIZATIONS OF THE NEW WORLD]. Mesoamerica was bordered on the south by a line that runs approximately from the Gulf of Nicoya (Costa Rica) to the mouth of the Motagua River (Guatemala). To the north, it included the areas set off by the northern borders of the Mexican states of Veracruz (including adjacent San Luis Potosi), Queretaro, Guanajuato, Jalisco, Durango, and the southern portions of Chihuahua and Sonora. Today Mesoamerica is a region of distinctive Indian populations surrounded by national civil populations. Culturally, the northern border of Middle America could be extended to include those portions of the southwestern United States that were formerly part of Mexico and that still have a significantly large and growing Spanish-speaking population. The processes of economic and social development are presently affecting each inhabitant of Middle America, whether he cuts coupons in Mexico City's plush pedregal or cuts sugar cane in the Pacific lowlands of Central America. Inextricably woven into the process of development are the very high population growth rate and a social structure dominated by a concern for power but technologically still heavily primitive and agrarian. It is this challenge of primitive technology, on the one hand, and the growing concern for control of one's fellow man, on the other, that is shaping the development process in Middle America today. The heavily mercantilist export pattern that dominated the nineteenth-century political economy has continued, although industrialization and the national exploitation of mineral resources have made marked advances in Mexico.

Cultural components

Indians and mestizos. The contemporary Middle American population is predominantly mestizo, a mixture of Spanish and American Indian. (In much of this region, as elsewhere in Latin America, the term is also used to refer to the parallel cultural mixture.)
In southern Mexico, Guatemala, and adjacent Honduras and El Salvador, the term Ladino is used to refer to non-Indian peoples, especially to those of Spanish-American culture. The concept of the plural society, often used in describing the societies of the Caribbean, is less applicable in Middle America, where its usage tends to obscure the continuing processes of integration and acculturation. In contrast to Guyana, where two distinctive ethnic groups contend for political dominance, Middle America is Spanish-American, and existing ethnic enclaves may be
seen as a cultural pluralism that is gradually undergoing social integration. The term "ladinoization" (ladinizacion) (Adams 1957) has been applied to this cultural process in those areas where ladino is in use; elsewhere, "mestizo-ization" (mestizaje) is the more general term for the acculturation process. The Indians now form a very minor percentage of the total population of all countries except Guatemala. "Indian," in the sense used here, refers to that sector of the population that retains the use of an Indian language (whether bilingually or monolingually) and specific forms of social organization that are considered by the Spanish-American population to be Indian. However, many of the specific customs that differentiate Indians from non-Indians today are of Spanish colonial origin, not indigenous origin—religious organizations such as the cofradia; many elements in the men's costume, such as short trousers and split-side trousers; representational dances of the Christians and the Moors; much of the paraphernalia of the church and its rituals, etc. Similarly, many cultural traits of Indian origin are now common to the way of life of the national population—the basic diet of corn tortillas, which is supplemented by beans, yucca, and squash; basic agricultural tools such as the digging stick; and rural constructions such as ranches, which are made of adobe, poles, and wattle-and-daub and have grass or palm roofs. Traits of Indian origin have tended to survive in matters pertaining to direct adaptation to the habitat, whereas traits of Spanish origin have dominated in matters pertaining to social organization and ideas and values. The decline of the Indian component of the total population has been steady since the conquest; however, three major aboriginal patterns may be distinguished, each with a separate history regarding relations with the non-Indian population.
Numerically and culturally the most important of the three patterns has pertained to the sedentary agriculturalist population of Mesoamerica. This population was organized under native states, and at the time of the conquest some of these provided an already "domesticated" population which the Spanish empire was able to take over and harness to its needs. The process of adaptation to the requirements of Spanish colonial activity, however, led to a continuing and sometimes precipitous decline of the Indian population lasting until about the middle of the eighteenth century. At that time, the combination of forced labor, wars, and disease had completed its work. It is reasonable to suppose that the two and one-half centuries of conquest
had led to a severe natural selection of the Indian population and that the survivors formed a population genetically different from the one that had originally encountered the Spaniards. The Spanish empire itself became so weak that colonial segments were perforce acting with more local autonomy; the Indian population increased from that time on. By the time of independence in 1821, the Spanish, Creole, and mestizo populations were still outnumbered by the Indians, but they were increasing more rapidly. The ratio of non-Indians to Indians showed a relative decline of the Indian population in spite of its absolute growth, but this also meant that the Indian was increasingly brought into face-to-face contact with the growing mestizo population. This process, combined with government action of the nineteenth century designed to reduce the organizational strength of the Indian population, caused a series of sharp acculturation situations that continue into the present. The aggressive retention of "Indian" cultural traits in the nineteenth century was in part a response to efforts to disorganize the Indians so as to convert them into a more controllable labor force. The Mexican reform laws and the later efforts of Justo Rufino Barrios in Guatemala sought to break the control that the church held over extensive lands as well as what might be termed the "eminent domain" over communal lands jealously guarded by the Indians. These efforts did succeed in challenging the church to a considerable degree and in shattering some Indian enclaves, but they also led many Indian communities to adopt a defensive posture with respect to the efforts of both government and private entrepreneurs to obtain the lands. The over-all cultural result, however, is that while the Indian population continues to increase in number, sectors of it that have been subject to special economic disintegration or political penetration rapidly acculturate and cease, in a cultural sense, to be Indian.
This evidently has happened in El Salvador, where in the early 1930s an abortive revolution among the Indians was put down with such violence that the Indians over a wide region gave up their Indian costumes and tried to become ladinos in order to avoid further reprisals. The Mexican revolution caused severe acculturation in a number of instances, and Mexican government action against the Yucatan Indians and the Yaqui brought about similar shifts. The sudden progressive liberalism of the Arevalo-Arbenz period in Guatemala, from 1946 to 1954, caused a series of acculturative changes that continued thereafter.
The rapid decline of the Mexican Indian culture in recent years, coupled with growing nationalism, has led the national government to try to preserve certain of the native craft industries. Here again, the actual products in many instances are of colonial Spanish origin, but in the contemporary world they are marketed as "Indian." The Folk Art Museum of Mexico has achieved considerable success in providing a broader market for these industries, thereby strengthening their role in the local economy. Many of the Indian towns are becoming less and less distinctive as the demands and needs of the national scene impinge on the local organizational structure. Although the tribes and lesser states to the south of Mesoamerica acculturated early to the efforts of the Spanish, in Costa Rica, Panama, and Honduras, where the population was much more sparse, the conquest and colonization led to severe losses in the aboriginal population, and, in many instances, flight into refuge areas. The contemporary Indians of Panama, especially the Cuna and Guaymi, are in fact the descendants of a number of distinctive groups, the remnants of which fled to avoid the pressures of European colonial rule. So far as can be determined, the surviving Jicaque, Sumu, Paya, and Mosquito of Honduras and the Nicaraguan Atlantic coast have somewhat similar histories. The last mentioned have long been racially mixed with Negroes, although their culture remains distinctively Indian. To the north of Mesoamerica, the Spanish mission system brought under control various Indian populations, but as in the south, the Indian populations were sparse, and the effects of colonial control led to a much more severe destruction of the native cultures. By the nineteenth century, pressures from the north created by the expanding Anglo-American agricultural population increasingly forced bands of horse-riding "barbarians" into northern Mexico. 
Mexican efforts to control these bands, even with the loss of Texas and the greater southwest to the United States, were generally weak, since the northern frontier area was only sparsely settled by Mexicans far removed from the more central concerns of the government. Culturally, the proximity of northern Mexico to growing centers of the United States led to an economic orientation toward the north, which tended by the early twentieth century to differentiate both the culture and society of northern Mexico from those in the center and south, and the continued interchange of population across the border has contributed to a growth of the Latin American population in the southwestern part of the United States. Northern Mexico, earlier a dry and sparsely
populated area, today is one of the fastest-growing major regions in the republic, and one in which the Mexicans are achieving pronounced success in regional economic development. In the general picture of ethnic distributions in Middle America today, the national mestizo, or Latin American, population is predominant in all but a few areas. Central Mexico, sectors of southern Mexico and Yucatan, and adjacent Guatemala contain the largest remaining Indian populations. Small enclaves are to be found scattered in western Mexico and in the remaining Central American countries. Other ethnic groups. In addition to the Indian and mestizo populations, there are a number of other ethnic components that should be mentioned, either because of their past importance or because they remain today as ethnic enclaves. Negro slaves were brought into Mexico and various parts of Central America during the colonial period in an attempt to provide labor to substitute for the declining Indian population. In general, the settlement of these imported peoples was localized, and evidence of their presence survives today in a Negroid physical component in a few communities. Just prior to the end of the colonial period, the British shipped an entire population of "Black Caribs" from the Lesser Antilles to the northern coast of Honduras. Communities of these distinctive peoples today dot the littoral from British Honduras to southern Nicaragua. In spite of their name, they retain an Arawakan language and a culture that is more characteristic of the Antilles than the mainland. During the building of the Isthmian railway in Panama, the construction of the canal, and the two world wars, many English-speaking Antillean Negro laborers came to Panama and today form an important sector of the urban population. Their language and culture contrast with the Spanish-American culture of the mainland Negroid population.
During the nineteenth and twentieth centuries, Chinese have settled in many areas along the west coast of the New World; in the early days they were imported as labor, and in more recent years they have immigrated for independent motives. This population is usually occupied in commerce in larger towns and cities. In recent years Chinese have mixed increasingly with the mestizo population. European populations of more recent origin include the English Bay Islanders off Honduras, the English of British Honduras, and the English, Spanish, German, and North American agricultural and commercial entrepreneurs who have been investing in the countries of Middle America during the past 150 years. In some instances, these peoples
have maintained strong connections with their homelands, and even though technically native-born nationals, they are considered by many mestizos to be still essentially foreign. Indian culture change. The surviving Indian population of Middle America is being acculturated, although the process is uneven. It is hastened by economic and political events and slowed by the continuing social, commercial, or geographic isolation of many of the groups. Acculturation occurs as the individual removes himself from the context of the Indian community by migrating to the city or to a plantation as a laborer; it occurs also as entire communities undergo pressures leading to a breakup of the older social structure. Thus the decline of the religious and political system of the highland Indians has been due primarily to political pressure from the government and an inability of the older ritual organization to hold the community together. The formation of sindicatos and gremiales (labor unions and organized interest groups) and cooperatives that include Indians has led to a new patterning of relationships that is increasingly pertinent to the way of life in some regions. A few community organizations have resisted the pressures for change. The campesinos (countrymen) of Jalapa, Guatemala, persist in regarding themselves as "Indian" although they have no distinctive Indian costume and speak only Spanish. In this case, the basis for cohesion lies in the campesinos' attempt to maintain their communal lands inviolate from the encroachments of neighboring ladinos. Generally speaking, however, poverty and political weakness characterize the Indians. This political inferiority is somewhat overcome when the Indian is economically successful. In a few places, such as Quezaltenango (Guatemala) and Tehuantepec (Mexico), a distinctive Indian middle class has emerged, composed of merchants who were long dominant in the market and are now among the major retailers in the city as well.
This group has a heightened interest in forms of Indian organization that could prove politically viable within the nation. Racial differences make almost no difference in the acculturation process. An individual is socially classified principally by his conduct. Indian physical appearance is significant only in terms of being one of a number of traits that may identify an individual as being "Indian." When behavior changes, however, physical appearance is taken as indicative of genetic history, not current social allocation. The same holds true concerning people of Negroid extraction in Panama. Although an Anglo-Caribbean background gives them some cultural differences, when they assimilate into the Spanish-speaking
Panamanian population their skin color carries no further social or cultural meaning.
Recent historical background
The termination of the Spanish empire in Middle America occurred in 1821. Mexico was formed of what had formerly been the viceroyalty of New Spain, but it also included Chiapas, the southern area that had been part of the captaincy-general of Guatemala. The Central American federation included the area from Guatemala through Costa Rica; and Panama was considered part of Gran Colombia. Politically, the federation disintegrated into its component parts in 1838. Although attempts have been made to reconstitute a Central American union, nothing has come of it to date. Independence did not bring social or economic revolution to Middle America. Colonial mercantilism evolved naturally (and, in fact, had already done so through illicit trade) into an agrarian mercantilism whereby the Middle American countries gained their principal national income from the export of hacienda-produced crops, the extraction of forest and other products growing wild in the habitat, and from the continuing though much reduced mining. The governments alternated between a continuing conservatism, wherein the church played a strong stabilizing part and controlled a significant sector of the state territory, and the emergence of nineteenth-century liberalism, directed toward trying to exploit resources in order to foster economic growth. The collapse of the Central American federation marked the end of liberalism in Guatemala until the third quarter of the century, but liberalism continued in the other Central American states and in Mexico. The private hacienda flourished during the nineteenth century; the hacendado essentially ran his private regional state when under the conservatives and received the overt support of the government when under the liberals. The condition of the mass of the population was basically of little concern to both conservatives and liberals.
The last half of the century saw the ascendancy of liberalism, with the government of Porfirio Diaz in Mexico and that of Justo Rufino Barrios in Guatemala. The latter was killed in 1885, in an unsuccessful attempt to reconstitute the Central American union by force, but Diaz continued to lead Mexico until the revolution in 1910. Middle America, like the rest of Latin America, constituted a part of the world's hinterland at the time of the industrial revolution in northern Europe and North America. Industrialization in Mexico was encouraged primarily by foreigners, since it was
they who had direct contact and experience with the process. By 1880, 400 factories in Mexico employed 80,000 laborers, and the mining population had risen to 70,000. Heavy industry was underway by the turn of the century; there were steel plants in Monterrey, and oil was extracted under foreign concessions. Although there were a number of attempts, some fairly active, to establish labor organizations during the latter part of the nineteenth century, it was not until the revolution that they became significant. Although by 1900 some industry had begun, the predominant pattern of life in Middle America continued to be a rural one. The few nascent industries in a sense constituted an extension of the northern industrial effort into the Middle American region. Industrialization was not growing up within Middle America but rather being thrust upon it in a relatively advanced form. This led to a secondary development, wherein the technology was basically borrowed and the society had to make major readjustments to incorporate it. For example, the exploration for and extraction of oil were achieved with a well-developed technology, so that the labor force did not grow with the complexity of the industry but rather had to be trained specifically to handle certain kinds of tasks. The same was true for the steel plants. The development of labor organizations essentially followed European, and to some degree, North American counterparts: the organizations placed emphasis on anarchist philosophies, but they were unable to organize effectively much beyond the level of mutual benefit societies. The entire process of industrialization involved an essentially rural laboring population. Differing from an urban proletariat, the laboring force moved to industry from the country and in many instances moved back to the country. 
Although the life of the workers in the countryside was extremely harsh, and perhaps even harsher in the mining sectors, conditions in the cities were not so much better that they attracted a great number of people. There was no enclosure movement to force people off the land; indeed, the growing haciendas had to compete for the available labor, and the rural population was needed in the rural area. The developing industries found that they had to adapt to the customs of the rural population rather than the reverse. The regional hegemony that provided haciendas with control over their labor was not so easily achieved by the new industries. Involving as it does the adaptation of a population to different and complex cultural forms, the process of creating an organized and urbanized work force has proved to be the basic difficulty
for the economic development of the Middle American countries, just as it has proved to be elsewhere in Latin America. The fact that the countries had been politically independent for over half a century did not have the same significance as in Europe and North America, where the societies had been evolving as an intrinsic part of the growing demands of the technological and economic changes and advances. The very success of Diaz' economic development program can be measured to some extent by the degree of degradation to which the Mexican campesino was increasingly subjected. Mexico continued to be marked by strong regional ties and controls exercised as much by regional bosses as through derivative power of the central government. The Mexican revolution erupted in 1910 and shook the country for almost a decade. It is pertinent for understanding Middle America and its role in recent world history to recognize that this revolution was the first major successful social revolution to reflect the incredibly complex conflicts that were engendered by the industrial revolution. The fact that this occurred in a society controlled by an agrarian (although industrializing) oligarchy, not by the industrialists themselves, made the contrast between the conditions of the laboring population and those of the upper sector of the society especially visible. In the years that followed the initial outbreak, the Mexican revolution succeeded in eliminating much of the agrarian oligarchy, and it turned over much of the land to the hacienda laborers and neighboring communities of campesinos. It established a new set of community lands called after the earlier form, ejido, and, by retaining title, placed the recipients of this land under direct governmental control. The small subsistence landholder has not been supplanted, however, and the population expansion has brought increasing agrarian problems in its wake. 
None of the other Middle American countries has successfully undertaken such revolutionary reforms. The 1952 agrarian reform law of Guatemala lasted for two years and was then rescinded, and most of the expropriated lands were returned to their original owners. The role of foreign countries in this process must be noted, especially that of the United States. During the nineteenth century, Middle America and the Caribbean continued to be objects of competition and some conflict between various European countries and the United States. The French intervention in Mexico came, significantly, after the United States' war with Mexico. In Central America, the British continued their territorial claims, which survive today in the colonial residue
of British Honduras. U.S. policy in the Caribbean was an attempt to exclude European interests, and following the war with Spain at the end of the century, the United States began a program of "superpaternalism," developing complete economic control of Cuba, retaining Puerto Rico as U.S. territory, and attempting to control the external financial affairs of Haiti and Nicaragua. At the turn of the century the United States had developed its interests in the banana industry in a number of countries, and, following the separation of Panama from Colombia, the Panama Canal was built. The weakness of the Central American governments was reflected in the continuing action of the U.S. Marines in Nicaragua that led to the establishment of the Somoza dictatorship in the 1930s; the strong influence of the banana companies (and growing coffee interests) in the governmental affairs of Honduras, Guatemala, and Costa Rica; and the maintenance of U.S. control over the Panama Canal. Mexico's concern for its own internal affairs and its desire to gain control of dominant foreign economic interests and nationalize its oil resources led that country to pay scant attention to the nations to the south, leaving the United States as the only major foreign power. Today, Mexico is finally capitalizing on its economic successes and in fact has challenged Guatemalan claims to the colony of British Honduras. In a very real sense, the Mexican revolution culminated in the expropriation of foreign-held lands and interests under Cardenas, head of state from 1934 to 1940, and in the great increase in agrarian reform activity. World War II brought to the Central American countries the same opportunity and cause for nationalistic development that occurred in Africa and Asia. During the period 1946-1954 Guatemala attempted to shed the controls exercised by the United States and to initiate strong national self-identification.
This period in Guatemala was followed by a gradual shift toward the slower development that was occurring in Costa Rica and El Salvador after these two countries had ousted their dictators. Panama also has made a significant effort to provide national controls exclusive of foreign influence; Honduras and Nicaragua have shown somewhat less interest. Among themselves, the five countries of Central America (Panama being excluded) have formed a "common market" organization that is scheduled, should things occur more or less as planned, to create a single economic entity of the entire region. Significantly, this union was initially encouraged by the Economic Commission for Latin America,
an organization generally free of U.S. influences, although subsequently the United States has supported it with technical and economic aid.
Social organization
Although Middle America had several urban centers prior to the Spanish conquest and industry has become an integral part of the Mexican economy and a growing sector in the smaller Central American countries, the greater portion of the Middle American population is still agrarian in orientation, and much of the growing urban population has immediate rural origins. During the nineteenth century agrarian mercantilism created and perpetuated an upper class that lived on the land and a small service, commercial, and administrative class in the cities and towns. This pattern has continued. The middle-income group has expanded as the major cities have grown. The countryside and towns have an upper socioeconomic sector that is allied with the middle-income sector of the cities. The lower sector forms a continuous population from countryside to city, although there tends to be a sharp cultural differentiation between city dwellers of the second and later generations and those who have themselves made the move. Since so many migrants are recent, the poorer sectors of the city have a distinctly rural flavor. Guatemala City had no slums in 1950, but today it has a variety of shack towns. Mexico City has been the recipient of such migrants for some decades. Many are swallowed up in the older sections of the cities, moving in with earlier-arrived relatives until such time as they can be on their own. All the growing major Middle American cities are faced with the problem of rural migrants for whom there is neither enough work nor sufficient housing. For the rural migrants, the major destinations other than the cities are frontier areas such as the Atlantic regions of Panama, Costa Rica, Nicaragua, Honduras, and Guatemala, and the northern and southern states of Mexico.
Mexican migration to the United States over the past years has been large, although recent legislation has led to a severe restriction in numbers. These population movements are creating broad kinship networks over great areas, extending from North American cities to the rural areas of Mexico. The household remains the basic kin unit, and migration to rural and urban areas establishes a chain of such households. Many Middle American rural communities have colonies in the major cities, and these serve to indoctrinate the newcomer, helping him to adjust to the exigencies of urban life. Advice from friends and relatives similarly leads
people to seek out a better life in the frontier areas. This kind of broad geographical network has been long established among wealthier peoples, but among the poor it is fairly new. It has not led to a disintegration of kinship as a major relational system, but many specific obligations and responsibilities have changed. In the middle-income groups, women are increasingly emancipating themselves from the pattern of male dominance and the traditional restrictions that limited the range of their contact to female relatives and the nuclear family. This older pattern has not entirely disappeared, but it has always been less effective among the poorer people, especially in the cities, where there is a high proportion of households headed by women who have to earn a living and support their children without the aid of a regularly employed man. Under these circumstances, when women carry the major economic responsibilities of the family, there is little room for the older restrictions. Urban organization. The plan of Middle American towns and cities generally follows a rectangular grid pattern, with one or more plazas. The major Catholic church and the municipal government building are found on the central or oldest plaza. In former years the houses of families in the upper social strata were generally located in or near this center. These houses were built with patios, the number of patios being some measure of the wealth and prominence of the family. The major cities today have grown far beyond this pattern; the automobile and bus have made possible middle-income and wealthy residential suburbs; projects have been developed for middle-income white-collar workers; and severe slums have grown up along the margins and in crevices of unoccupied land in almost all sections of the city. Workshops and house industries may be found in the older sections of the cities, but they are also scattered in the growing middle-income and lower-income districts.
The traditional centralism of Latin American countries is evident in the fact that the capital city in every Middle American country, with the possible exception of Honduras, is also the major commercial center, and it is five to ten times as large as the next largest center in the country. In Mexico some provincial centers have been growing more rapidly than the national capital. Regional industrial centers are of major importance, as are some regional educational institutions. In the Central American countries the national capital still dominates, although there is significant growth of some provincial centers, such as Escuintla in Guatemala and San Pedro Sula in Honduras.
Metropolitan Mexico City is one of the major cities of the world. Guadalajara, Monterrey, Puebla, Ciudad Juarez, Guatemala City, Panama, San Salvador, and Managua each exceed 200,000 in population. Industrial development in Mexico has been able to absorb a significant number of the rural and provincial immigrants, but there are still very many who lack basic skills and can enter the industrial labor force only in a completely unskilled capacity. This is also true in the smaller cities. The basically rural orientation of the migrants makes it difficult to organize them into syndicates or unions, although in Mexico such organizations have become large over the years and have played an important part in the consolidation of the single Mexican political party. The organization of labor in Middle America dates back to the last century, but it has achieved strength mainly in Mexico, where well over two million men are involved. There are fewer than fifty thousand organized workers in each of the other countries, and of these, only in Costa Rica and El Salvador do unions have affiliations with an international organization. Unions have been under severe government control, and in Mexico especially the right to strike and the arbitration of demands are determined by what the government considers to be the national welfare. In many industrial and commercial firms there is still a strong paternalistic relationship between management and labor, but government action and the establishment of unions have done much to reduce this in Mexico and Guatemala. While in Mexico and Guatemala government has attempted to balance industrial developments with the welfare of the general population, elsewhere private investment interests continue to be a stronger influence. In Panama and El Salvador the controlling oligarchies may not participate directly in political administration, but they effectively dominate the economic systems of the two countries.
In Nicaragua, the Somoza family has, since the 1930s, exercised a political control that has enabled it to obtain ownership or interest in many of the enterprises of the country. In Guatemala, Honduras, and Costa Rica those who possess the wealth of the country exercise considerable influence, but they do not seem to have the same degree of political control as their counterparts in Panama, El Salvador, and Nicaragua. Agrarian structure. Only in Mexico has industrialization advanced to a point where it has assumed a significant role in the national economy; industry and commerce have each equaled or bettered the agrarian portion of the gross domestic
product. In the other countries, manufacturing is less than half as large as the agrarian production, and commerce seldom reaches even a third. Furthermore, the greater part of the national wealth in these countries is still dependent upon agrarian products, of which there is generally only one or a few export crops. In 1960 coffee was the principal agricultural export in four of the countries (Costa Rica, El Salvador, Guatemala, Nicaragua) and was the second most important export in all the rest. Bananas are the major export of Honduras and Panama and the second most important in Costa Rica and Guatemala. Cotton occupies first place in Mexico and second in Nicaragua. This emphasis on one main crop in the Central American countries is a continuation of the nineteenth-century agrarian dependence, and much in the social structure reflects a similar continuation from that period. There has been a serious attempt to experiment with other crops that might require new forms of social organization, but except in Mexico it has had relatively little effect on the over-all picture. In all countries of Middle America over half the economically active population is involved in agriculture, and in all but Mexico and Panama approximately one-half or more of all wage earners are so occupied. In Mexico, approximately one-third of the wage earners are in agriculture, and in Panama, because of the ready availability of free land, only 13 per cent are so employed. Subsistence agriculture, which emphasizes corn and beans (except in Panama, where rice is especially important), is still the basis of life for a large proportion of the population. The subsistence agriculturalist still depends primarily on a few basic tools, such as the digging stick, the hoe, or the plow for planting, and the axe and the machete for clearing land, the last also being used for general work and occasionally as a weapon. 
The seasonal round of activity characteristically allows time for many subsistence farmers to earn wages as seasonal laborers on plantations or coffee fincas; there is also a period just before the first harvest that is known as the "hunger period." Those who do not migrate seasonally usually have some crop or craft that provides a cash income, and in many regions, whole communities will specialize in the production of squash, onions, wheat, vegetables, fruits, flowers, or some other marketable produce for the cities. Scientific large-scale agriculture is increasing, but it has yet to achieve any degree of stabilization. Heavy dependence on a single crop makes the enterprise subject to the whims of the world
market, and the use of pesticides, fertilizers, and herbicides is only beginning. The most successful advance in this area has been the Mexican irrigation projects: the Rio Grande, the Fuerte, the Papaloapan, the Tepalcatepec, and the Grijalva-Usumacinta. Developments in agriculture are bringing economists, engineers, agronomists, and other technicians into important professional positions in Mexico, and they are assuming a new kind of control within the social structure. While Mexico alone has taken the major steps toward industrialization of agriculture, it too has extensive areas populated by small-scale subsistence agriculturalists, and the number of these people is increasing. Although agrarian reform was instituted shortly after the Mexican revolution and well over 115 million acres of land have been reallocated to small holders, there are still some very large holdings. Cline (1962, p. 220) estimated that in 1950, fewer than one thousand landholders still received 35 per cent of the total agricultural income. None of the other Middle American countries has been able to carry through any significant land reform. Guatemala's attempt during the Arbenz regime, which lasted from 1950 to 1954, ended when Arbenz was exiled and has been replaced by rural colonization projects. The contemporary agrarian structure is under three forms of pressure. First, the increase in rural population materially reduces the amount of land available per capita. Second, there is a demand for large-scale agricultural enterprises, since industrialization can best develop this way. Third, popular political efforts are aimed at reforming the entire tenure structure. To this, a fourth may be added, in Mexico: a gradual inflation that is making small-scale production increasingly noncompetitive at the national level. Rural social organization.
The lower sector of the agrarian social structure is composed of peasant farmers, small-scale subsistence agriculturalists (whether renting or owning land), and a variety of regular and seasonal agricultural wage laborers; the upper sector includes both corporately controlled agricultural entrepreneurs and individuals. The classic latifundia and minifundia are still in existence today, although the economic structure behind them has tended to change their appearance. Population increase in the rural areas has exacerbated the minifundia problem (i.e., the progressive diminution of land size due to equal inheritance in a population expanding through natural increase) and has led to an increase in demand for seasonal work. The latifundia is becoming more profit-oriented. The nineteenth-
century-style hacienda, the private regional domain of its owners, has disappeared from much of the area. This development has occurred in part because of the laws protecting labor, but even more because of the appearance of younger aggressive agrarian entrepreneurs, who in some instances have developed large-scale plantations, often converted from the older haciendas and utilizing new crops. Others rent land and produce fast-growing crops that promise at the moment to be profitable on the world market. The speculator shows essentially no interest in the care of the land or the welfare of the laborer, and usually disappears, either wealthy or bankrupt, as soon as the market becomes unfavorable. The basic landholding system is still private, with the exception of the ejido system in Mexico. This system, however, involves only about a quarter of the Mexican farming population. Community landholdings are found in scattered areas. They may be lands dating back to old grants, but probably just as common are those which have been more recently created out of land purchased by a community. Community land as such, however, must not be confused with the Mexican ejido. The former is controlled by the members of a community (not necessarily all members), whereas ejido lands are controlled by the federal government. The two are structurally different, just as the profit-oriented plantation is structurally distinct from the disappearing hacienda. Except in very poor communities (and communities composed almost entirely of Indians) there tends to be a fairly evident local upper socioeconomic sector. Indian communities, especially those that have retained a corporate quality through collective interest in community lands or a strong local religious-political-administrative system, can seldom be differentiated into strata, although differences of wealth and prestige do exist and play an important role in the operation of the Indian community organization. 
In mestizo and ladino communities and those Indian communities with a significant non-Indian population, there is often a group of families that traditionally have taken responsibility for government and public leadership, and they usually have control of major local resources or manage them for those who do. Rural community organization varies with many factors, but among the more important are whether the communities are composed primarily of landowning farmers or rural laborers; whether or not they are expanding rapidly; whether they are old or of fairly recent origin; and whether or not the farmers have rights to community land of some
kind. Old communities of independent farmers with collective rights to the land tend to be fairly exclusive and maintain fairly close internal ties insofar as the land can support them. As the population grows and opportunities for wage labor become increasingly necessary for survival, there is a tendency to slough off population, sending people to plantations, the cities, or frontier areas. People in new communities, even when there is community land, tend to identify with the nation, neutralizing an otherwise strong attachment to the village. Where there is no basis for corporate community involvement, such as land held in common, the orientation of each of the members will be toward his own self-interest. From the point of view of national growth, rural organizations not based on the community are more important. Among these are the campesino federations of Mexico and Guatemala, the rural labor unions, and cooperatives. The Confederación Nacional Campesina is the largest Mexican rural "interest" group and is fundamentally composed of all the ejidatarios. A similar attempt to organize peasants and wage laborers in Guatemala was made during the Arbenz period, but it collapsed with his downfall. Recently it has been reinitiated under Christian-Democratic efforts. Rural labor unions, though underdeveloped, are usually organized on specific farms. Access to judicial action in protection of labor's rights has been established in all countries, although the bias of the courts varies considerably. Recent years have seen a sudden proliferation of cooperatives, primarily of a producer type in handicrafts, and increased marketing of extractive products and farm produce. In Mexico the government has succeeded in obtaining and retaining a fairly strong degree of control over rural organizations; in the rest of Middle America landholders have a stronger lien on the government's interests and have been fearful of such organizations. 
As a result, they have been fewer in number and less successful.
Power and prestige structure
The liberal power structure of nineteenth-century Middle America had an agrarian base oriented toward the export of a few basic crops and depending on northern industrial centers for necessary tools and the luxuries enjoyed by the small upper sector. At the head of government there was in each country a strong individual who tried variously to develop the country or his own fortunes through encouraging the expansion of exportation. This was particularly true in Mexico
and Guatemala; less so elsewhere. In Nicaragua, power was thin and essentially balanced between the colonial cities of Leon and Granada; in Honduras the north coast was often beyond the effective control of the government of Tegucigalpa. The center of population in Costa Rica, the meseta central, and the population of Panama growing up around the transit zone were both effectively separated from all other centers by miles of uninhabited land. Local leaders, because of distance from the capital, often exercised more power than the central government could bring to bear. Economically, some regions were all but independent from the capital. In a country as small as Guatemala, coffee production in the western part of the country developed a regional society, which included many Germans in the administrative sector. The port of Champerico was used as the area's point of entry and exit. A similar society in the north, Alta Verapaz, exported directly through the Rio Dulce. The capital city had hardly any contact with people or products from either of these societies. Similarly, northern Mexico, Yucatan, and other regions were in many respects relatively independent of what was occurring in the country at large, and sometimes they had closer relations with foreign centers than with their own country's capital. In the provincial areas, political control was generally held by large landholders, not necessarily having extraordinary wealth themselves, but relatively much wealthier than the rural population surrounding them. They either belonged, or considered themselves to belong, to a society that reached into urban parts of the world, and their homes usually were combinations of the special riches of their region and clothing, pianos, and other articles of varying luxury imported from Europe or the United States. Some areas were specifically under the control of members of the clergy, and others were still under the corporate control of organized Indian villages. 
The liberal governments were concerned with strengthening the central government, thereby weakening the local and regional power centers. Strengthening of the central governments has come about through a number of means. Federal or national troops or police frequently were established to roam the country and keep order, helping the regional power holders but at the same time making them increasingly dependent upon central-government action. The church was neutralized by Barrios in Guatemala in the 1870s, and in Mexico it was effectively restricted in the nineteenth century and later most specifically by the
expropriations following the revolution. Debt peonage was abolished in Guatemala, and a vagrancy law was substituted, thereby bringing the labor force under government control rather than control by local landholders. Alliances with foreign interests provided the government with an income independent of the rest of the country. While this substituted one kind of regional control for another, it placed governments in a position where they did have more funds. The professionalization of the military started in Mexico in the nineteenth century and in the Central American countries proceeded gradually, providing government with a more dependable military arm. In Mexico, however, military leaders had grown so powerful during the revolution that the immediate problem was how to extract the power from the military without destroying it as a necessary government arm. This was accomplished over the decades following the revolution, and now the Mexican military is essentially fragmented into a great number of small units; officers are rotated so that the possibility of collusion among them is severely reduced; and the early revolutionary generals are now replaced by younger officers. However, the trend toward a more powerful military, evident elsewhere, could occur in Mexico. More important than specific changes in the controls exercised by the Mexican government have been the significant gains in economic development that have established new bases of power within the country. In addition, the entire revolutionary situation created new political and governmental tasks and responsibilities and has led to an expanding bureaucracy and party organization, thereby providing thousands of positions by which individuals may participate in economic and political action. This may be thought of as an expansion of the base of power within the area and, as such, an increase in the amount of power that is available for people to manipulate. 
In the Central American countries specific relations within the power structure are more varied. There have been no revolutions to completely displace the older landowning oligarchy, but in part this has been the case because such an oligarchy was not always dominant. In the early twentieth century, Nicaragua, Honduras, and Costa Rica had a fairly large proportion of small and medium farmers; concentration on export crops—first coffee, then cotton and other field products—has led to the accumulation of lands in relatively fewer hands. At the same time, however, the economic development that has occurred in the wake of cash cropping has provided the same expansion of the power base that has occurred in Mexico, and the
sector of the society that operates in this framework has concomitantly expanded. This broadening of the power structure has produced what many have considered to be an emerging middle class, middle sector, or middle mass. Considering economic measurements, i.e., the income gradient of the population, and the growing white-collar sector, these terms are applicable. This development, however, has been accompanied by an unfortunate tendency to attribute the origin of all new things to the members of this middle class: they are presumed to possess the ideology of nationalism; they are expected to produce industrial and commercial entrepreneurs; and they are thought to be the potential source of a stable political society in the Western tradition. While there is something to recommend these suggestions, they have become intellectual blinders to the fact that changing power bases have also served to perpetuate features of the nineteenth-century oligarchic structure. The vast gulf between the ways of life of the older upper and lower classes produced a system of prestige behavior that identifies certain forms of behavior with the upper sector and certain others with the lower. Manual labor, or earning a living with one's hands, is generally regarded as the mark of a person of low prestige; work, in this sense, is not regarded as a goal of those individuals who aspire to a better way of life. Socially, this idea has led to a continuation of an apparent dichotomy between those who are satisfied with work as a means of employment of time and those who reject it. The idea is no longer clearly congruent with differences in income, because there are craftsmen and farmers with moderate holdings who, through the organization of workshops and employment of labor, have succeeded in achieving considerably more wealth, both in land and in cash, than the average white-collar worker. 
Also, while the per capita income of the countries has increased, there is no evidence of a significantly wider distribution of wealth among members of the lower classes. White-collar workers do not generally enjoy a large income but regard it as important that they not be marked as manual workers. The means to social mobility do not stem from the power that accrues from wealth but rather from the ability to control the behavior of others in a variety of ways. This, in turn, has perpetuated a series of what have been termed "vertical" relationships in the society. Through kinship, friendship, reciprocal help, influence, using one's position to achieve a slightly better one, and so on, there continues to be a strong set of interdependencies that relate people
high in the power structure downward, in separate lines, to many people lower in the structure. Since power always works two ways, these are the very lines that are used by people lower down to better their positions. In spite of new wealth and the growth of an economic middle class, Middle American society is still recognizably divided into two prestige sectors, many characteristics of which are derived from the past. While the power base has expanded, there seems to have been little to change the orientation toward work and wealth that has characterized the lower sector or the upper sector's orientation toward power manipulation in order to achieve a better position in the prestige system. The orientation toward work in the lower sector is not necessarily manifested by a devotion to work. Among wage laborers it is often quite the reverse; the economic development process has not yet materially raised their standard of living, and they have recognized that work does not mean upward mobility. Work is seen as a necessary means to survival, measured by cash income. Working for additional income is not seen as crucial, since the amount of income such work makes available will have no material effect on the person's access to power. By the time a person may in fact achieve an income that will permit him such access, he has learned that money is still not enough: power is exercised through a variety of exchange devices and vertical relationships. The lower sector may be seen, then, as a survival sector, whereas the upper sector, having mastered survival, seeks political position, use of luxury goods (although not necessarily their ownership), ownership of land (the continuation of the old symbol of power), and leisure. The power structure of the countries of Middle America has not fundamentally changed in shape, although there have been important shifts in emphasis. 
It has conserved certain features that used to be thought peculiar only to agrarian states. Many of these features appear to be viable in a situation of economic development and, as such, are responsible for many of the characteristics that have been called "Latin American" by those who have equated progress with the ways of life of Europe and the United States.
Government and political organization
The formal administrative structures of Mexico and the Central American countries differ in details, but they are basically the same in manner of operation. The Central American countries are divided into departments or provinces, the administration of each of which tends to be nominal, since most power is vested in the national executive and the congress. The minimal unit of territorial and administrative organization within the national structure is the municipio (distrito in Panama, canton in Costa Rica). There may be several small towns within a municipality, but they are completely subordinate to the capital, which gives its name to the entire territory. All the Middle American countries are constitutional republics and have legal systems based on civil law. Only Mexico is technically a federal republic. Even in Mexico, however, the president has the power to remove any state government that is felt to be unequal to its tasks, and so in effect the central government exercises considerable power over the states. The mode of election of the congresses and their specific breadth of responsibility and authority vary. Congresses are generally responsible for major legislation, usually supporting the executive policies. Situations such as have occurred in Brazil, Chile, and Argentina, where the congress may oppose the president for extended periods, have not been common in Middle America. Overt dictatorship began to decline following World War II, when some dictators resigned, as in Guatemala, El Salvador, and Honduras, and others were killed, as in the case of Somoza in Nicaragua. But violence has accompanied both constitutional and nonconstitutional rulers, with the recent exception of Mexico. Since World War II the military has played a central part in establishing or removing the executive in Guatemala, El Salvador, Honduras, Nicaragua, and Panama (where the national police function as an army). Following the 1948 revolution in Costa Rica the president disbanded the army entirely. The judiciary operates at the local level through justices of the peace, who are in some instances also the local administrative officers. The courts of each country are based on civil law rather than constitutional law. 
There are special courts, such as labor courts, in some countries where the government felt that the regular courts would tend to operate unfairly with respect to a particular sector of the society. Although all the countries of Middle America have political parties, these are by no means similar. Both Honduras and Nicaragua have a two-party system, Nicaragua's dating from the nineteenth century. For the past thirty years, however, Nicaragua has been controlled by the Somoza family. Mexico has had basically a single party for the past 35 years, but the party is so organized that it has sectors representing most of the major groups in the country: ejido families, rural unions, industrial and labor unions, civil servants, cooperatives, small proprietors, merchants, professionals, etc. While there have been opposition parties, none has been able to win sufficient support or success to retain a consistent front of opposition to the main party. There has been some continuity to party organization in the other countries, but for the most part, parties have arisen and been especially important at the time of elections. The major political process of recent years has been that referred to as "politicization," the recognition of the state as the ultimate authority and the recognition of legitimacy of certain governmental processes. Elections in all countries have become regular events, even though in all but Mexico and Panama they have been interrupted by coups d'etat that have displaced the duly elected officials or put off elections that apparently were not going to be favorable to the army. Since university education in Middle America is still available for relatively few people, it is in the secondary schools that the real politicization takes place. In the schools there are usually strongly nationalistic opinions, and the students are generally willing to turn out, or be turned out, for demonstrations of a political nature. University students also become involved in these activities, and frequently the police and army are called upon to restrain demonstrators. The party organization in Mexico has done much to facilitate the political participation of a wide segment of the population, as have the political events of recent years in Guatemala, Costa Rica, and Panama. 
Nevertheless, a large portion of the populations, especially the rural sectors, still do not participate in the political process because structures are not developed to keep an electorate interested and governments tend to inhibit popular participation, being afraid of a popular swing to support of either a demagogue or a leftist. The major ideologies that operate in most countries involve some combination of nationalism (sometimes openly coupled with anti-U.S. positions, sometimes merely subtly so), promotion of economic development, and promotion of democratic and constitutional procedures (governments recently taken over by the military always immediately profess plans to return to these procedures). In general, all governments recognize, at least on paper, the need for social development and the responsibility the state has in this development. Incumbent governments and the military in the Central American countries have dealt with opposi-
tion either by stopping an election if the opposition looked too promising or by outlawing the opposition's participation in the constitutional procedures. The Roman Catholic church has played an important role in politics. It was restricted by the revolutionary government of Mexico, and only in the most recent years has it gradually become again a political influence of some importance. The Barrios regime set the number of priests to be permitted in Guatemala. As a result, Guatemala has the lowest ratio of priests to Catholic population of any country in the Western hemisphere, and during the Arbenz period the church's hostility did not deter much of the population from supporting the government. The efforts of Protestant missionaries have increased in recent years, and although they have not been marked with overwhelming success, they have stimulated the Catholic church to improve the quality of its own clergy. The role of Catholicism has remained one of participation by the clergy in political action rather than involvement with political ideology. The influence of foreign powers in Middle America has a long history, but in recent years the United States has been most in evidence. The United States has retained the control over the Panama Canal and has exercised strong influence over a number of the governments, especially those of Guatemala and Nicaragua. Concerned that the incumbent Arbenz government was being taken over by "communists" the United States provided funds, equipment, and administrative aid in the "liberation" of Guatemala by Castillo Armas in 1954. The Central American republics have increasingly been integrated into a Central American common market, and steps are taken annually to promote this development. This does not include either Panama or Mexico. 
Such a development does not mean that there will be a serious move toward Central American political union, since the conditions for this eventuality are not politically attractive for the external relations of the small countries, nor would it necessarily resolve their internal political problems. Mexico is a member of the Latin American Free Trade Association and therefore is not involved in the Central American Common Market. The smaller union has been undertaken with the view of providing a more viable entity that might ultimately participate in the Free Trade Association.
Research on Middle America
In the nineteenth century, most research in what today would be regarded as social science
was natural-history reporting. Much of it was of high quality, such as the work of Alexander von Humboldt, John L. Stephens, Ephraim George Squier, A. P. Maudsley, and others. Otto Stoll, Eduard Seler, Walter Lehmann, Franz Blom, and Franz Termer have maintained a long tradition of Germanic scholarly work. French interest has concentrated more in Mexico, stemming in part from France's political interests during the period of Maximilian. Spanish scholarship, as well as that of the Middle Americans themselves, has until recent years focused on the colonial period. U.S. interest, while scattered through the nineteenth century, was marked by the major studies of Hubert H. Bancroft on Mexico and America. It matured into specific disciplinary concentrations in economics and anthropology early in the present century. The indigenismo movement stimulated work in Mexico, principally under Manuel Gamio, Jose Vasconcelos, Moises Saenz, and others. In Latin America this movement had little effect beyond the boundaries of Mexico, with the exception of the Andes region and Brazil. Interests stemming from philosophical traditions marked sociological studies until the period of World War II, when more empirical research began. Political science, marked by national and international biases, may be said to have started about the same time. Besides the national libraries and archives of the region, other important research collections for the area are those at the University of California at Berkeley; the University of Texas in Austin; the Middle American Research Institute of Tulane University in New Orleans; the Library of Congress; the Instituto Ibero-Americano of Berlin; and the libraries of Seville, Barcelona, and Madrid. There are still many important private collections. 
Research needs in Middle America are of two major types: (1) those concerned with preserving records of now disappearing entities; and (2) those emerging from concern with problems attendant on contemporary society. Among the first are ethnographic studies of gradually (and in some instances rapidly) disappearing indigenous cultures, especially in Mexico and Guatemala, as well as studies of ways of life that are being crowded out by population expansion and urbanization. The second includes the entire range of issues in economics, sociology, and political science having to do with the continuing adjustment of societies to the rapidly changing modern world. Contemporary understanding of Middle America is still fettered by a Euro-American intellectual inheritance that often obscures elements in the empirical situation. The advancing technology of the social sciences
is gradually being applied to Middle American studies. What is most lacking is the development of new concepts for use in studying an increasingly complex evolutionary picture.
RICHARD N. ADAMS
BIBLIOGRAPHY
ADAMS, RICHARD N. 1957 Cultural Surveys of Panama-Nicaragua-Guatemala-El Salvador-Honduras. Pan American Sanitary Bureau, Scientific Publication No. 33. Washington: The Bureau.
AGUIRRE BELTRÁN, GONZALO 1946 La población negra de México, 1519-1810: Estudio etnohistórico. Mexico City: Ediciones Fuente Cultural.
ARRIOLA, JORGE LUIS (editor) 1956 Integración social en Guatemala. Guatemala City: Seminario de Integración Social Guatemalteca.
CLINE, HOWARD F. 1962 Mexico: Revolution to Evolution, 1940-1960. New York: Oxford Univ. Press.
COSÍO VILLEGAS, DANIEL 1955-1965 Historia moderna de México. Vols. 1-8. Mexico City: Editorial Hermes.
Handbook of Middle American Indians. Edited by Robert Wauchope. Vol. 1-. 1964-. Austin: Univ. of Texas Press. → Four volumes have been published to date.
KIRCHHOFF, PAUL (1943) 1952 Mesoamerica: Its Geographic Limits, Ethnic Composition and Cultural Characteristics. Pages 17-30 in Sol Tax et al., Heritage of Conquest: The Ethnology of Middle America. Glencoe, Ill.: Free Press. → First published in German in Volume 1 of Acta americana.
LEWIS, OSCAR 1961 The Children of Sanchez: Autobiography of a Mexican Family. New York: Random House.
México, cincuenta años de revolución. 1963 4 vols. in 1. Mexico City: Fondo de Cultura Económica.
MIRÓ, CARMEN A. 1964 The Population of Latin America. Demography 1, no. 1:15-41.
PARKER, FRANKLIN D. 1964 The Central American Republics. New York: Oxford Univ. Press.
SCOTT, ROBERT E. (1959) 1964 Mexican Government in Transition. Rev. ed. Urbana: Univ. of Illinois Press.
SILVERT, K. H. 1954 A Study in Government: Guatemala. Tulane University, Middle American Research Institute, Publication No. 21. New Orleans, La.: The Institute.
Statistical Abstract of Latin America. → Published since 1960 by the Center of Latin American Studies of the University of California.
TAX, SOL et al. 1952 Heritage of Conquest: The Ethnology of Middle America. Glencoe, Ill.: Free Press.
UNITED NATIONS 1964 The Economic Development of Latin America in the Post-war Period. New York: United Nations.
WEST, ROBERT C.; and AUGELLI, JOHN P. 1966 Middle America: Its Lands and Peoples. Englewood Cliffs, N.J.: Prentice-Hall.
WHETTEN, NATHAN L. 1948 Rural Mexico. Univ. of Chicago Press.
WHETTEN, NATHAN L. 1961 Guatemala: The Land and the People. New Haven: Yale Univ. Press.
WOLF, ERIC R. 1959 Sons of the Shaking Earth. Univ. of Chicago Press.
MIDDLE EASTERN SOCIETY See NEAR EASTERN SOCIETY.
MIGRATION
i. SOCIAL ASPECTS    William Petersen
ii. ECONOMIC ASPECTS    Brinley Thomas
SOCIAL ASPECTS
In its most general sense "migration" is ordinarily defined as the relatively permanent movement of persons over a significant distance. But this definition, or any paraphrase of it, merely begins to delimit the subject, for the exact meaning of the most important terms ("permanent," "significant") is still to be specified. A person who goes to another country and remains there for the rest of his life, we say, is a migrant; and one who pays a two-hour visit to the nearest town is not. Between these two extremes lies a bewildering array of intermediate instances, which can only partly be distinguished by more or less arbitrary criteria (Lacroix 1949).
Permanence of movement. What should be the minimum duration of stay that differentiates a migration from a visit? With respect to international migration, the recommendation of the United Nations (and the practice of a number of countries) is to define removal for one year or more as "permanent," and thus as migration, while a stay for a shorter period is classified as a visit. Note that the data reflect not behavior but statements about future behavior; and persons have been known to lie to immigration officials or to change their minds. This kind of ambiguity often makes it difficult to interpret migration statistics. For example, according to a critical analysis of United States immigration statistics (Kuznets & Rubin 1954, table 7), during the height of the mass immigration of 1890-1910 about forty per cent of the foreign-born returned. Conclusions from uncorrected immigration data, therefore, are likely to be grossly inaccurate. Remigrants, those who leave their country of origin for a period and then return to it, ordinarily differ from the emigrants who remain abroad, but not necessarily according to any consistent pattern. Particularly during an economic depression, some immigrants left the new country when they lost their jobs (Berthoff 1953, p. 73). 
Sometimes, on the contrary, it was the relatively successful that returned, either to find wives (Borrie 1954) or to retire. The rise of nationalism in the old country often attracted back some of the incompletely assimilated migrants (Saloutos 1956). A small percentage of the subsidized emigrants to Australia and Canada have returned to Britain, and
the especially careful studies made of these groups are largely inconclusive (Appleyard 1962a; 1962b; Richmond 1966). Almost by definition, remigrants are less able to acculturate to their new environment than immigrants who remain there, but few valid generalizations can be added to that truism. The ambiguity pertains also to the definition of internal migrants (Hamilton 1961; Taeuber 1961). Particularly in a federal country like the United States, "permanence" of movement is defined, in effect, by laws stipulating the meaning of domicile with respect to marriage and divorce, suffrage, and other prerogatives reserved to "bona fide residents." Surveys by the U.S. Bureau of the Census show that each year approximately one person in five moves to a new residence (compare Wilber 1963). However, according to a detailed analysis of one particular community (Goldstein 1954), a sizable proportion of this large percentage is made up of persons who move more than once during a year and who are atypical also in other ways. A study of repeated migration in Denmark suggests that the phenomenon is not restricted to any one country (Goldstein 1964). More generally, when one speaks of migratory birds, or migrant laborers, or nomads, the connotation is not of a permanent move from one area to another, but rather of a permanently migratory way of life, which often means a cyclical movement within a more or less definite area. Nomads (the word derives from the Greek for 'pasturing') typically follow their herds back and forth over a region delimited either by natural boundaries or by neighbors sufficiently powerful to repel incursions. Similarly, agricultural laborers often move with the growing season, and shepherds (in what is termed transhumance) alternate between high mountain pastureland in the summer and lowlands in the winter. Commutation, the daily "journey to work" (Liepmann 1944), constitutes a similar cycle within a smaller compass. 
One must not accept the common notion that such a separation of place of residence from place of work is peculiar to modern industrial societies. Many of the burghers of ancient Athens, fourteenth-century London, and preindustrial cities generally were part-time agriculturists (Petersen 1961, pp. 348-353). In many presently underdeveloped countries, particularly India and Pakistan, a peasant who migrates to the city often leaves his family in the village, to which he therefore returns periodically. In Africa south of the Sahara the temporary separation of male industrial or mine workers from village life has been institutionalized into the standard pattern (Mitchell 1961). In sum, whether short-term removals should be included in migration depends on the purpose of the statistics being collected. Thus, no particular specification of the duration of stay suits all purposes, and each analyst has to adapt the available data to his needs as best he can. "Significant" distances. The meaning of migration also varies according to how a "significant" distance is defined. The word derives from the Latin migrare, to change one's residence, but by current definitions it means rather to change one's community. A person who moves from one home to another in the same neighborhood, and who therefore retains the same social framework, is not deemed a migrant. If we regard a nation as a community, then by this criterion all international movements are included under the rubric "migration." Partly because of this rationale, partly because the two sets of statistics are separately collected, the distinction between international and internal migration sets the framework of most analyses. It is worth emphasizing, therefore, that in a general discussion of the phenomenon the distinction is more or less irrelevant. Not only do some types of migration fall outside of this dichotomy (prehistoric wanderings, for instance), but some of the most important and interesting characteristics of migrants apply whether or not they cross an international border (labor mobility, urbanization, migratory selection, acculturation, etc.). Moreover, there are often greater cultural differences within the boundaries of a nation than between nations. In practice, geographical distance is generally taken as a rough measure of whether the migrant crosses into another community. Thus, the U.S.
Bureau of the Census divides the mobile population between "movers," who have changed their residence within a single county, and "migrants," who have crossed a county line, and it subdivides the latter category according to whether they move within a single state, to an adjacent state, or to a nonadjacent state. This kind of classification passes over the fact that a farmer who moves to a town in the same county probably changes his way of life more than one who crosses the nation but remains a farmer. To take a more striking example, the tens of thousands of refugees who fled from East to West Berlin have traversed the most significant boundary line in Europe while remaining within the confines of a single city.

Models of migration

It is reasonable to suppose that the number of migrants within any area homogeneous with respect to all the other factors that affect the propensity
to migrate will be inversely related to the distance covered. One can express this relation in an equation, as follows: $M = aX/D^{b}$, where M stands for the number of migrants, D for the distance over the shortest transportation route, and X for any other factor that is thought to be relevant; a and b are constants, usually set at unity. In one version of this equation, the so-called $P_1P_2/D$ hypothesis, the populations of the end points of the movement are taken as the X factor (Zipf 1949). Another variation is the familiar proposition that the number of persons going a given distance is directly proportional to the number of employment opportunities at that distance and inversely proportional to the number of intervening opportunities (Stouffer 1940). When "opportunities" were defined operationally as the number of in-migrants, the hypothesis could be validated in a number of instances. According to a detailed comparison of the two, Stouffer's formulation is better than Zipf's, since, in effect, measuring opportunities corrects the total population figures for the amount of unemployment in the two areas (Anderson 1955). (For other models, see Lövgren 1956; Thomlinson 1961; Heide 1963; Tabah & Cataldi 1963.) A proposition about migration between only two points is too simplistic a unit, however, to be a useful building block for more elaborate theories. These have in general started from other premises. An important example is the three-volume study Population Redistribution and Economic Growth: United States, 1870-1950 (Kuznets 1957-1964), in which the available data concerning the regional distribution of the developing national economy and data concerning internal migration are combined into a unified analysis of the interaction between the two. This kind of analysis is not limited to migration within a single country. A shorter work in the same broad perspective analyzes the post-1945 migration to Switzerland (Mayer 1966).
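The distance relation stated earlier in this section, and its two variants, can be made concrete with a small numerical sketch. The Python functions below are an illustration only, not a reconstruction of how Zipf or Stouffer computed their results: the function names, the proportionality constant `k`, and every figure used are hypothetical, and the constants a and b default to unity as the text says is usual.

```python
"""Illustrative sketch of the distance relations described in the text.

All names and figures here are hypothetical; they only show the arithmetic.
"""


def distance_model(x: float, distance: float, a: float = 1.0, b: float = 1.0) -> float:
    """Return M = a * X / D**b for a relevant factor X and a distance D."""
    if distance <= 0:
        raise ValueError("distance must be positive")
    return a * x / distance ** b


def zipf_flow(p1: float, p2: float, distance: float) -> float:
    """P1*P2/D variant: X is the product of the two end-point populations."""
    return distance_model(p1 * p2, distance)


def stouffer_flow(opportunities: float, intervening: float, k: float = 1.0) -> float:
    """Intervening-opportunities variant.

    The flow is taken as directly proportional to the opportunities at a
    given distance and inversely proportional to the intervening
    opportunities. The constant k is an assumption added for illustration.
    """
    if intervening <= 0:
        raise ValueError("intervening opportunities must be positive")
    return k * opportunities / intervening


if __name__ == "__main__":
    # Hypothetical end-point populations and distances.
    near = zipf_flow(200, 300, 60)
    far = zipf_flow(200, 300, 120)
    print("P1*P2/D at distance 60: ", near)
    print("P1*P2/D at distance 120:", far)
    # Hypothetical counts of opportunities.
    print("Intervening-opportunities index:", stouffer_flow(50, 200))
```

With a and b at unity, doubling the distance halves the expected flow, which is the inverse relation the text describes. The intervening-opportunities function makes no reference to distance at all, which is the point of that formulation: distance matters only through the opportunities that lie along the way.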
According to several studies of the transatlantic movement, if conditions in the home country build up a propensity to emigrate, the volume, direction, and timing of the movement are set largely by the business cycle in the receiving country (e.g., D. S. Thomas 1941). A later work, however, challenged this interpretation and placed more emphasis on the unity of "the Atlantic economy" and the importance of "push" factors (B. Thomas 1954). Noneconomic motives. In most of the supposedly general models of migration, it is presumed that movement is generated mainly by economic forces. This may not always be a reasonable postulate. Whether the correlation between business
cycles and migratory movements is positive or negative, for example, sometimes depends only on how broadly the study is conceived. While the rise of Europe's urban-industrial civilization brought a great increase in population and thus a pressure to emigrate, it also resulted in a general rise in the level of aspiration. Young men who were better off than their fathers were nonetheless dissatisfied, and many sought to better themselves overseas. Thus, it may be true to say that for certain periods the dominant motivation of emigrants from particular countries was economic, even though these countries had, by and large, far better conditions than those from which very few persons left. This paradox is not limited to economic factors: religious oppression, or the infringement of political liberty, was often a motive for European emigration, but before the rise of modern totalitarianism those who left came predominantly from precisely those countries least marked by such stigmata. An increasing propensity to emigrate spread east and south from northwest Europe, together with democratic institutions and religious tolerance. The anomaly that those who emigrated "because" of persecution tended to come from countries where there was less of it than elsewhere can be analyzed only by separating personal motivation from social causation. According to a recent survey of British emigrants' motives, they tended to rationalize their general feeling of insecurity and inadequacy into more specific economic factors (Appleyard 1964). At least in the United States, internal migration is also less motivated by economic factors than is usually assumed. At one time, the U.S. Bureau of the Census asked a sample of migrants who had moved during one year why they had moved ("Postwar Migration" . . . 1947; compare "Reasons for Moving . . ." 1966, which showed similar responses). Only 22.6 per cent said it was to take a job or to look for work. 
Family migration constituted 61.7 per cent (i.e., moving with the head of the family or to join him, moving because of a change in marital status); 6.4 per cent said it was because of housing problems. Health, climate, education, and miscellaneous motives accounted for the remaining respondents. This conclusion has been generally validated by the few other studies made of internal migrants' motivation (e.g., Rossi 1955).

Migratory selection

That migration is both related to economic trends and yet not, in any simple sense, caused by them, should not occasion any surprise. The same
is true of many other complex social phenomena. It would be no contribution to substitute for purely economic causes a list of other "factors," ranging from the spirit of adventure to the development of transportation facilities; nor would it be a great improvement to divide such a list between circumstances at home that repel and those abroad that attract, that is, between "push" and "pull" factors. Given a sedentary population and an inducement to leave home, typically some persons go and some stay behind. Push and pull factors, in short, do not exert their force equally. The self-selection by which migrants differentiate themselves from the sedentary population is called migratory selection (or, by some authors, selective migration). An analysis of this process can afford a better understanding of why a migration takes place. It is a valuable extension of the Stouffer-Zipf generalization, for example, to go beyond the counting of heads and differentiate among the types of migrants. It is not sufficient, even in an analysis restricted to economically motivated migrations, to posit job opportunities in general: potential migrants with specific skills go to places where there are openings specifically for them. Thus, among white migrants within the United States, those seeking higher-status positions generally have to move greater distances than those with lower levels of skill. And some job-seeking migrants are also strongly motivated by noneconomic factors: among American Negroes important reasons for moving have been to get out of the rural South (hence the high rate of urbanization), and preferably to get out of the South altogether (hence the shift to the North and West). Negroes, therefore, move greater distances than would be expected from the level of skill in the jobs that they typically seek (Rose 1958; compare Stub 1962; Taeuber & Taeuber 1965). 
It is possible to analyze migratory selection by a number of demographic and social characteristics in addition to occupation and race; and although the conclusions from different studies vary widely, some tentative generalizations are possible (D. S. Thomas 1938; Petersen 1961, pp. 592-603). In both internal and international movements adolescents and young adults predominate; for not only do the young adapt more easily, but since they are close to the beginning of their working life, they can more readily take advantage of new opportunities. It is feasible, therefore, to analyze migration by cohorts (Eldridge 1964). One can argue a priori that either the less or the more intelligent will tend to migrate: since the
more intelligent will have succeeded at home, the less intelligent will seek their fortunes elsewhere; on the other hand, the more intelligent will respond first to any stimulus to migrate, while the duller will remain behind. Various studies have seemed to validate one or the other of these propositions. It is possible to reconcile the contradiction by postulating that a selection by intelligence is in fact one by actual or potential occupational level (Hofstee 1952; compare Lee 1966). Thus, since urban occupations are generally more demanding, rural-urban movements typically select the more intelligent. This is not true, however, of agriculturists who move to manual jobs in the cities, such as Negroes in the United States (D. S. Thomas 1938, pp. 111-121), or in general of migrants who make no substantial change in vocational level. Effects on populations. For the two areas concerned, migratory selection determines the significance of the movement almost as much as the number of migrants. Consider the ramifications of what can be taken as the most fundamental question in migration theory: If X persons leave country A and migrate to country B, what changes take place in the size of the two populations (Petersen 1955, chapter 9)? The common-sense answer, that country A is decreased and country B is increased by X, is true only in the short run. If the typically young migrants have their children in their new country, its fertility rate may go up, while that of their native country goes down. Since the remaining population of country A will then be older on the average, its death rate may go up, while that of country B goes down. In short, after a generation the transfer of X persons will in fact amount to X plus a certain proportion based on the migration's effects on the population structures, and rates of population growth, of the two countries.
At a third level of analysis, however, this increment, and indeed X itself, may be canceled out. For Malthus, thus, emigration was a slight palliative, a partial and temporary expedient, with no permanent effect on population size (Malthus 1798, book 3, chapter 7). This is likely to be true of any country where the mortality of infants and children is high (so that emigration would reduce the mortality slightly), or where marriages and conceptions are put off because of economic pressure (so that a lesser pressure, the consequence of emigration, would result in a higher fertility). If one includes such indirect effects, the change in the population of the immigration country is also difficult to estimate. Immigration to the United States, for example, accelerated urbanization and
industrialization; and these changes, in turn, increased the upward social mobility of the native population and thus tended to accelerate the secular decline of the birth rate. In sum, even the simplest question (How many persons migrated?) cannot be fully answered merely by counting heads. Unlike mortality and fertility, migration has no biological dimension: it cannot be analyzed, even in preliminary terms, independently of its cultural context. Accordingly, there are no "laws" of migration in the sense of universal generalizations; the highest level of abstraction possible is the contrast of various types of migrants (Heberle 1955).

Migration typologies

In a study of migrants to Aberdeen (that is, of movement within the single country of Scotland over only a few years) it was found useful to classify respondents into a number of types. These included professionals seeking careers, young persons seeking education, workers taking specific jobs, casual workers looking for employment, former commuters moving for greater convenience, family migrants joining heads of families, and return migrants (Illsley et al. 1963, pp. 238-240). The conclusions to be drawn differed for these various classes. If this is so for movements within a relatively homogeneous area, then it is manifestly the case for migration in its most general terms, encompassing the whole world and all of human history. In constructing a general typology, one should begin by choosing the criteria by which the types are to be distinguished (Petersen 1964, pp. 271-290). Perhaps the most fundamental is the distinction between innovating migrants, who move in order to achieve the new, and conservative migrants, who move in response to a change in their circumstances, hoping by migrating to retain their way of life in another locus. Within each of these two broad classes, one can distinguish types of migration according to the force impelling the movement.
An ecological push results in what might be termed a primitive migration—not a wandering of primitive tribes as such, but one dependent on a people's inability to cope with natural forces. When the activating agent is the state or some equivalent institution, the movement is forced or impelled migration, depending on whether the prospective migrants retain some power to decide whether to leave or not. A movement of adventurous pioneers, deviant religious or political groups, or similar individually motivated persons can aptly be termed
free migration. Its importance is not in its size, which is never large, but in the example it sets for others. If the ensuing flow develops into a broad stream, an established pattern for whole social classes, an example of collective behavior, we speak of mass migration, similar to what has been termed "chain migration" (MacDonald & MacDonald 1964). Then individual motivations become correspondingly less important—indeed, the individuals involved may not be able to give a rational account of their decision to migrate. The motives they ascribe are likely to be trivial or, more probably, the generalities that they think are expected (Hansen 1940a, pp. 77-78). Uses of typological method. The value of such a typology is in its utility: Does it help in solving analytical problems? The typology suggests, first of all, that migratory selection ranges along a continuum, from total migration at one extreme (food gatherers or nomads) to total nonmigration at the other. Intermediate instances, moreover, cannot be arranged along a single dimension. Sometimes it is the age or the sex or the occupation of the potential migrant that is relevant, but if an ethnic or social minority leaves to escape persecution or is shipped off to concentration camps, then the only pertinent characteristic is how the state defines "Jew" or "kulak," for example. For more than a century various governments, concerned about the depopulation (real or supposed) of villages, have sought measures to counteract it. It would increase understanding of the process merely to ask whether this is a conservative or an innovating migration: Do these agriculturists want better conditions within their present way of life, or do they move to cities for the sake of urban amenities? Perhaps the most useful distinction in the typology is that between mass migration and all other types, for it emphasizes the fact that the nineteenth-century exodus from Europe does not constitute the whole of the phenomenon. 
When this type of migration declined after World War I, largely because of new political limitations imposed by both emigration and immigration countries, this was very often interpreted as marking the end of significant human migration altogether (e.g., Forsyth 1942). It was rather, in large part, a change to neomercantilist migration, in which the welfare of the national state becomes the main criterion for judging whether the movement is desirable and in which state agencies foster or impede, force or prevent, the migration. The "natural" right of the passportless person to move about has been supplanted by the "natural" right of the state to control that movement (Petersen 1955, chapter 1).
In the present age of total wars and totalitarian regimes, political motivations have set not only "Europe on the move" (Kulischer 1948) but also, partly as reverberations of European influences, much of the rest of the world. To take a notable example, the partition of British India into the nations of India and Pakistan was accompanied by one of the largest migrations in human history, in part induced by terrorists on both sides, in part arranged under state auspices. Many analysts prefer to omit this type of movement from their purview. In a United Nations publication, for example, international migration is defined as "the noncoerced migrations, which constitute the great majority of all migratory movements in normal times, and which are closely related to economic and social factors. . . . Specifically, 'migration' excludes population transfers, . . . deportations, refugee movements, and the movements of 'displaced persons'" (United Nations 1953, p. 98). That an international body which includes some of the states most responsible for forced migrations should exclude them from its demographic analyses is understandable; but there is no reason why independent scholars should accept this arbitrary and misleading definition. The number of refugees in the world today depends of course on how that term is defined. Data on "refugees" are compiled mainly by the various agencies set up to aid them, and the resultant totals considerably understate the number of persons who have migrated because of political stress and sought refuge elsewhere. The major world-wide agency, the office of the U.N. High Commissioner for Refugees, has a narrowly restricted prime mandate: to assist persons who do not want to return to their country because of actual or feared racial, religious, or political persecution; and it may also extend its "good offices" to certain other limited categories. 
This definition does not include several numerically important classes of uprooted peoples: (1) those who have fled from local political disturbances but remain within the boundaries of the same state; (2) those who are forcibly moved about within the boundaries of a single state (see, for example, Conquest 1960); (3) those who have been forced to "return" to what is now defined as "their" country, after having lived "abroad" sometimes for generations. Thousands of refugees remain as hard-core cases from World War I, the Spanish Civil War, and World War II. It has been estimated, probably conservatively, that about forty million persons became refugees in the dozen years following 1945 (Rees 1957); the implication of the figure can be better grasped when it is recalled that the usual
estimate for the total migration from all of Europe from 1800 to 1950 is only one and a half times as large, that is, sixty million.

WILLIAM PETERSEN

[See also POPULATION; REFUGEES; and the biographies of KULISCHER; WILLCOX.]

BIBLIOGRAPHY
ANDERSON, THEODORE R. 1955 Intermetropolitan Migration: A Comparison of the Hypotheses of Zipf and Stouffer. American Sociological Review 20:287-291.
APPLEYARD, R. T. 1962a The Return Movement of United Kingdom Migrants From Australia. Population Studies 15:214-225.
APPLEYARD, R. T. 1962b Determinants of Return Migration: A Socio-economic Study of United Kingdom Migrants Who Returned From Australia. Economic Record 38:352-368.
APPLEYARD, R. T. 1964 British Emigration to Australia. London: Weidenfeld & Nicolson.
BERTHOFF, ROWLAND T. 1953 British Immigrants in Industrial America: 1790-1950. Cambridge, Mass.: Harvard Univ. Press.
BORRIE, WILFRID D. 1954 Italians and Germans in Australia: A Study of Assimilation. Melbourne: Cheshire.
CANADA, DEPARTMENT OF CITIZENSHIP AND IMMIGRATION 1961 Citizenship, Immigration, and Ethnic Groups in Canada: A Bibliography of Research, 1920-1958. Ottawa: Queen's Printer.
CONQUEST, ROBERT 1960 The Soviet Deportation of Nationalities. New York: St. Martin's.
ELDRIDGE, HOPE T. 1964 A Cohort Approach to the Analysis of Migration Differentials. Demography 1:212-219.
FORSYTH, WILLIAM D. 1942 The Myth of Open Spaces: Australian, British and World Trends of Population and Migration. Melbourne Univ. Press.
GOLDSTEIN, SIDNEY 1954 Repeated Migration as a Factor in High Mobility Rates. American Sociological Review 19:536-541.
GOLDSTEIN, SIDNEY 1961 The Norristown Study: An Experiment in Interdisciplinary Research Training. Philadelphia: Univ. of Pennsylvania Press.
GOLDSTEIN, SIDNEY 1964 The Extent of Repeated Migration: An Analysis Based on the Danish Population Register. Journal of the American Statistical Association 59:1121-1132.
HAMILTON, C. HORACE 1961 Some Problems of Method in Internal Migration Research. Population Index 27:297-307.
HANSEN, MARCUS L. 1940a The Immigrant in American History. Cambridge, Mass.: Harvard Univ. Press. → A paperback edition was published in 1964 by Harper.
HANSEN, MARCUS L. 1940b The Atlantic Migration: 1607-1860. Cambridge, Mass.: Harvard Univ. Press. → A paperback edition was published in 1961 by Harper.
HASKETT, RICHARD C. 1956 An Introductory Bibliography for the History of American Immigration: 1607-1955. Pages 85-295 in Stanley J. Tracy (editor), A Report on World Population Migrations. Washington: George Washington Univ.
HEBERLE, RUDOLF 1955 Theorie der Wanderungen. Schmollers Jahrbuch für Gesetzgebung, Verwaltung und Volkswirtschaft im Deutschen Reiche 75:1-23.
HEIDE, H. TER 1963 Migration Models and Their Significance for Population Forecasts. Milbank Memorial Fund Quarterly 41:56-76.
HOFSTEE, E. W. 1952 Some Remarks on Selective Migration. Research Group for European Migration Problems, Publications, No. 7. The Hague: Nijhoff.
HUTCHINSON, EDWARD P. 1956 Immigrants and Their Children: 1850-1950. New York: Wiley.
ILLSLEY, RAYMOND; FINLAYSON, ANGELA; and THOMPSON, BARBARA 1963 The Motivation and Characteristics of Internal Migrants: A Socio-medical Study of Young Migrants in Scotland. Milbank Memorial Fund Quarterly 41:115-143, 217-248.
INTERNATIONAL ECONOMIC ASSOCIATION 1958 Economics of International Migration. Edited by Brinley Thomas. New York: St. Martin's; London: Macmillan.
INTERNATIONAL LABOR OFFICE 1928-1929 Migration Laws and Treaties. Vol. 1-3. Studies and Reports, Series O, No. 3. Geneva: The Office.
INTERNATIONAL LABOR OFFICE 1959 International Migration: 1945-1957. Studies and Reports, New Series, No. 54. Geneva: The Office.
KIRK, DUDLEY 1946 Europe's Population in the Interwar Years. Geneva: League of Nations.
KULISCHER, EUGENE M. 1948 Europe on the Move: War and Population Changes, 1917-1947. New York: Columbia Univ. Press.
KUZNETS, SIMON (editor) 1957-1964 Population Redistribution and Economic Growth: United States, 1870-1950. Memoirs, Nos. 45, 51, 61. Philadelphia: American Philosophical Society.
KUZNETS, SIMON; and RUBIN, ERNEST 1954 Immigration and the Foreign Born. National Bureau of Economic Research, Occasional Paper No. 46. New York: The Bureau.
LACROIX, MAX 1949 Problems of Collection and Comparison of Migration Statistics. Pages 71-105 in Milbank Memorial Fund, Problems in the Collection and Comparability of International Statistics. New York: The Fund.
LAVELL, C. B.; and SCHMIDT, WILSON E. 1956 An Annotated Bibliography on the Demographic, Economic and Sociological Aspects of Immigration. Pages 296-449 in Stanley J. Tracy (editor), A Report on World Population Migrations. Washington: George Washington Univ.
LEE, EVERETT S. 1966 A Theory of Migration. Demography 3:47-57.
LEE, EVERETT S.; and LEE, ANNE S. 1960 Internal Migration Statistics for the United States. Journal of the American Statistical Association 55:664-697.
LIEPMANN, KATE K. (1944) 1945 The Journey to Work: Its Significance for Industrial and Community Life. London: Routledge.
LÖVGREN, ESSE 1956 The Geographical Mobility of Labour: A Study of Migrations. Geografiska annaler 38:344-394.
MACDONALD, JOHN S.; and MACDONALD, LEATRICE D. 1964 Chain Migration, Ethnic Neighborhood Formation, and Social Networks. Milbank Memorial Fund Quarterly 42:82-97.
MALTHUS, THOMAS R. (1798) 1958 An Essay on Population. 2 vols. New York: Dutton. → First published as An Essay on the Principle of Population. A paperback edition was published in 1963 by Irwin.
MAYER, KURT B. 1966 The Impact of Postwar Immigration on the Demographic and Social Structure of Switzerland. Demography 3:68-89.
MILBANK MEMORIAL FUND 1947 Postwar Problems of Migration. New York: The Fund.
MILBANK MEMORIAL FUND 1958 Selected Studies of Migration Since World War II. New York: The Fund.
MITCHELL, J. CLYDE 1961 Wage Labour and African Population Movements in Central Africa. Pages 193-248 in K. M. Barbour and R. M. Prothero (editors), Essays on African Population. London: Routledge.
NATIONAL BUREAU OF ECONOMIC RESEARCH 1929-1931 International Migrations. Edited by Walter F. Willcox. 2 vols. New York: The Bureau. → Volume 1: Statistics, compiled on behalf of the International Labour Office, Geneva, with introduction and notes by Imre Ferenczi. Volume 2: Interpretations, by a group of scholars in different countries.
PETERSEN, WILLIAM 1955 Planned Migration: The Social Determinants of the Dutch-Canadian Movement. Berkeley: Univ. of California Press.
PETERSEN, WILLIAM 1961 Population. New York: Macmillan.
PETERSEN, WILLIAM 1964 The Politics of Population. Garden City, N.Y.: Doubleday.
Postwar Migration and Its Causes in the United States: August, 1945, to October, 1946. 1947 U.S. Bureau of the Census, Current Population Reports, Series P-20, No. 4.
Reasons for Moving: March 1962 to March 1963. 1966 U.S. Bureau of the Census, Current Population Reports, Series P-20, No. 154.
REES, ELFAN 1957 Century of the Homeless Man. International Conciliation 515:193-254.
RICHMOND, A. H. 1966 Demographic and Family Characteristics of British Immigrants Returning From Canada. International Migration 4:21-26.
ROSE, ARNOLD M. 1958 Distance of Migration and Socioeconomic Status of Migrants. American Sociological Review 23:420-423.
ROSENFIELD, HARRY N. 1957 Historical Research as a Tool for Immigration Policy. American Jewish Historical Society, Publications 46:341-365.
ROSSI, PETER H. 1955 Why Families Move: A Study in the Social Psychology of Urban Residential Mobility. Glencoe, Ill.: Free Press.
SALOUTOS, THEODORE 1956 They Remember America: The Story of the Repatriated Greek-Americans. Berkeley: Univ. of California Press.
SHIMM, MELVIN G. (editor) 1956 Immigration. Law and Contemporary Problems 21, no. 2.
SHRYOCK, HENRY S. 1964 Population Mobility Within the United States. Univ. of Chicago, Community and Family Study Center.
STOUFFER, SAMUEL A. 1940 Intervening Opportunities: A Theory Relating Mobility and Distance. American Sociological Review 5:845-867.
STUB, HOLGER R. 1962 The Occupational Characteristics of Migrants to Duluth: A Retest of Rose's Hypothesis. American Sociological Review 27:87-90.
TABAH, LEON; and CATALDI, ALBERTO 1963 Effets d'une immigration dans quelques populations modèles. Population 18:683-696.
TAEUBER, KARL E. 1961 Duration-of-residence Analysis of Internal Migration in the United States. Milbank Memorial Fund Quarterly 39:116-131.
TAEUBER, KARL E.; and TAEUBER, ALMA F. 1965 The Changing Character of Negro Migration. American Journal of Sociology 70:429-441.
THOMAS, BRINLEY 1954 Migration and Economic Growth: A Study of Great Britain and the Atlantic Economy. Cambridge Univ. Press.
THOMAS, BRINLEY 1961 International Migration and Economic Development: A Trend Report and Bibliography. Paris: UNESCO.
THOMAS, DOROTHY S. 1938 Research Memorandum on Migration Differentials. New York: Social Science Research Council.
THOMAS, DOROTHY S. 1941 Social and Economic Aspects of Swedish Population Movements: 1750-1933. New York: Macmillan.
THOMLINSON, RALPH 1961 A Model for Migration Analysis. Journal of the American Statistical Association 56:675-686.
TRACY, STANLEY J. (editor) 1956 A Report on World Population Migrations. Washington: George Washington Univ.
UNITED NATIONS, DEPARTMENT OF ECONOMIC AND SOCIAL AFFAIRS 1955 Analytical Bibliography of International Migration Statistics, Selected Countries: 1925-1950. U.N. Bureau of Social Affairs, Population Studies, No. 24. New York: United Nations.
UNITED NATIONS, DEPARTMENT OF SOCIAL AFFAIRS 1949 Problems of Migration Statistics. Population Studies, No. 5. New York: United Nations.
UNITED NATIONS, DEPARTMENT OF SOCIAL AFFAIRS 1953 The Determinants and Consequences of Population Trends: A Summary of the Findings of Studies on the Relationships Between Population Changes and Economic and Social Conditions. Population Studies, No. 17. New York: United Nations.
U.S. CONGRESS, HOUSE, COMMITTEE ON THE JUDICIARY 1950 The Displaced Persons Analytical Bibliography. House Report No. 1687. Washington: Government Printing Office.
WILBER, GEORGE L. 1963 Migration Expectancy in the United States. Journal of the American Statistical Association 58:444-453.
WILBER, GEORGE L.; and ROGERS, TOMMY W. 1965 Internal Migration in the United States, 1958 to 1964: A List of References. Sociology and Rural Life Series, No. 15. Unpublished manuscript, Mississippi State Univ., Agricultural Experiment Station.
ZIPF, GEORGE K. 1949 Human Behavior and the Principle of Least Effort: An Introduction to Human Ecology. Reading, Mass.: Addison-Wesley. → See especially "The Factor of Distance," pages 386-409.
ZUBRZYCKI, JERZY 1960 Immigrants in Australia: A Demographic Survey Based Upon the 1954 Census. 2 vols. Melbourne Univ. Press. → Volume 2 is a statistical supplement.

II ECONOMIC ASPECTS
This article is mainly concerned with international migration, which may be defined, in the strict sense, as a permanent movement of people, of their own free will, from one sovereign country to another. Transfers of this kind, however, account for only a small part of the redistribution of world population in the last three centuries. A comprehensive view of international migration must therefore include forced as well as free movements, and temporary as well as permanent movements. It is also necessary to distinguish between intercontinental and intracontinental transfers.

Free and forced movements

Up to the beginning of the nineteenth century there were hardly any statistical records of international migration. Nevertheless, it is possible to indicate the main orders of magnitude. The first great Atlantic migration was the traffic in African slaves. It is estimated that over ten million slaves were transported to America between 1619 and 1776 and that 3.4 million of them went to the English colonies in America. The trade was initiated by the Portuguese and the Dutch. Britain also took part, beginning in the 1660s, mainly through the Royal African Company, which existed from 1672 to 1752. The plantation economies producing sugar in the West Indies, tobacco in Virginia, and rice and indigo in South Carolina entailed a growing demand for slave labor from Africa. In the West Indies white immigration was replaced by black on one island after another, until slaves constituted about four-fifths of the population.

In contrast, the white migration across the Atlantic in the seventeenth and eighteenth centuries was comparatively small. In the seventeenth century about 250,000 left the British Isles for the New World, and in the eighteenth century the outflow was perhaps 1,500,000, of whom about half a million were Ulster Presbyterians. An indication of the volume of Spanish emigration to the New World is given by the fact that 150,000 were recorded as having embarked at the port of Seville between 1509 and 1740, but this is a serious underestimate. The only other prominent group of trans-Atlantic migrants up to 1800 was the 200,000 Germans estimated to have left for America.
The nineteenth century was the great age of mass migration from Europe across the Atlantic of people who went of their own free will, and most of what we know about the economic and social determinants and consequences of international migration is based on the experience of that remarkable period. Between 1846 and 1932 about 52 million people left Europe for oversea destinations. When this redistribution was over, one-eleventh of the population of the world were people of European origin living outside Europe.

One of the most somber features of our time is that while there has been a sharp decline in free international mobility, the relative scale of forced transfers is reminiscent of the eighteenth century. The world picture has been dominated by movements of refugees, as shown by the following rough estimates. The partition of India and Pakistan led to the expulsion of more than 18 million people from their homes. Other outstanding examples of inflows of refugees after World War II are West Germany, 12 million; Japan, 6.3 million; South Korea, 4 million; Hong Kong, 1.3 million; Israel, 1 million; Arab refugees from Palestine, 1 million. About 1.5 million refugees were settled overseas by the International Refugee Organisation and the Intergovernmental Committee for European Migration [see REFUGEES, article on WORLD PROBLEMS]. This is only a partial account, for it is impossible to give estimates of the considerable forced movements which have taken place between countries controlled by the Soviet Union and China. Even this partial estimate of the international transfer of "political" migrants gives a total of 45 million for the ten years beginning in 1945. It is a sobering thought that the number of people expelled from one country to another in the decade after World War II was equal to the entire oversea emigration from Europe in the century ending in 1913.

Temporary and permanent movements

Of the large number of passengers who enter or leave any country in a given period, only a proportion are genuine migrants. Persons to be counted as migrants are those who move from one country to a permanent residence in another, and, in accordance with United Nations standards, the criterion generally adopted is a declared intention to stay in the receiving country for more than one year. Owing to the wide variety of methods used in different countries, it is not easy to obtain accurate statistics. It must be recognized, however, that much of the international mobility of labor which is of economic significance is of a temporary nature. This is particularly true on the European continent, where daily or weekly or seasonal movements over national frontiers occur on a considerable scale.
For example, permanent immigration into Switzerland rose steadily from 1946 to 1957 and amounted to 690,000 workers for the period as a whole; during the same period there were 942,000 seasonal workers and 275,000 frontier workers. The net immigration for the period was, however, only 250,000; this demonstrates that a number of the aliens admitted as "permanent" immigrants (a great many being women) are in fact in the temporary category. On the American continent a similar case may be seen in the seasonal traffic across the border between Mexico and the United States.
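The Swiss figures lend themselves to a rough arithmetic check. The sketch below assumes, purely for illustration, that the net figure of 250,000 can be set directly against the 690,000 workers admitted as "permanent" immigrants; the text does not state that the two series are defined on the same basis, and the function and variable names are hypothetical.

```python
# Figures quoted in the text for Switzerland, 1946-1957.
PERMANENT_ADMISSIONS = 690_000   # workers admitted as "permanent" immigrants
SEASONAL_WORKERS = 942_000
FRONTIER_WORKERS = 275_000
NET_IMMIGRATION = 250_000


def implied_departures(admissions: int, net: int) -> int:
    """Departures implied if net = admissions - departures (an assumption)."""
    return admissions - net


departures = implied_departures(PERMANENT_ADMISSIONS, NET_IMMIGRATION)
share_leaving = departures / PERMANENT_ADMISSIONS
temporary_total = SEASONAL_WORKERS + FRONTIER_WORKERS

print(f"Implied departures: {departures:,}")
print(f"Share of 'permanent' admissions offset by departures: {share_leaving:.0%}")
print(f"Seasonal plus frontier workers: {temporary_total:,}")
```

On these assumptions, nearly two-thirds of the "permanent" admissions were offset by departures, which is the point the passage makes about their largely temporary character.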
Intercontinental and intracontinental movements

The distinction between intercontinental and intracontinental movements is illuminating when we are dealing with the phases in the migration of Europeans. With the exception of the movement of Russians into Asia, the story of the outpouring of Europe's population is largely one of oversea settlement, and it is noteworthy that the migrations which have had the deepest and most enduring effects have been those which were transoceanic and intercontinental. When migrants cross an ocean, there are strict limits to what they can take with them; their traditions, ideals, techniques, and material belongings, when applied in a distant and strange environment, yield a pattern of life quite different from the one they left behind. There is something irrevocable about crossing an ocean. The political, economic, and racial configuration of the United States today is very much the outcome of three transoceanic migrations: the Pilgrim Fathers and their successors, the slaves from Africa, and the European masses in the nineteenth century.

When we consider shifts of population in Asia, we find that intercontinental movements are not significant, for a journey by sea has often meant merely a transfer from one part of the same continent to another. The abolition of slavery in the British colonies in the 1830s saw the beginning of an outflow from the Far East to the countries of America, Oceania, and Africa; and throughout the nineteenth century the recruitment of indentured
laborers from India, China, and Japan was a characteristic feature. This species of intercontinental migration came to an end in the 1920s, and its place was taken by interregional movements which have had important demographic and social effects. The chief suppliers of intracontinental migrants have been China, India, Pakistan, Japan, and Korea; the main recipients have been Malaya, Ceylon, Burma, Indonesia, Thailand, Vietnam, Laos, Cambodia, British Borneo, the Philippines, and Manchuria. In most of these countries the immigrants have been mainly Chinese and Indians.

In the Far East internal migration has been more important than transfers across national boundaries. For example, in Japan in the period 1920-1940 the net exodus from rural areas to urban areas amounted to 17.5 million persons. This was more than the entire increase in the population of Japan in this period, and it was ten times greater than the net emigration from the country.

Table 1 - Intercontinental migration, selected countries and periods (in thousands)

EMIGRATION
Country of emigration     Period covered    Number of emigrants
Austria and Hungary       1846-1932                  5,196
Belgium                   1846-1932                    193
British India             1846-1932                  1,194
British Isles             1846-1932                 18,020
Denmark                   1846-1932                    387
Finland                   1871-1932                    371
France                    1846-1932                    519
Germany                   1846-1932                  4,889
Holland                   1846-1932                    224
Italy                     1846-1932                 10,092
Japan                     1846-1932                    518
Norway                    1846-1932                    854
Poland                    1920-1932                    642
Portugal                  1846-1932                  1,805
Russia                    1846-1924                  2,253
Spain                     1846-1932                  4,653
Sweden                    1846-1932                  1,203
Switzerland               1846-1932                    332

IMMIGRATION
Country of immigration    Period covered    Number of immigrants
Argentina                 1856-1932                  6,405
Australia                 1861-1932                  2,913
Brazil                    1821-1932                  4,431
British West Indies       1836-1932                  1,587
Canada                    1821-1932                  5,206
Cuba                      1901-1932                    857
Hawaii                    1911-1931                    216
Mauritius                 1836-1932                    573
Mexico                    1911-1931                    226
New Zealand               1851-1932                    594
South Africa              1881-1932                    852
United States             1821-1932                 34,244
Uruguay                   1836-1932                    713

Source: Adapted from Carr-Saunders 1936, p. 49.

The measurement of migration

Countries did not begin to keep records of genuine international migration until the big modern movements had passed their peak. For most of the nineteenth century the available statistics were byproducts of acts or regulations introduced to achieve some other purpose. For example, in the United States and the United Kingdom, records of passenger movements were the results of acts passed to regulate shipping. In Britain, statistics began to be furnished under an act in 1803. But it was not until over a century later, in 1912, that the British Board of Trade decided to adopt a statistical classification which defined a migrant as a passenger who declares that he has lived for a year or more in one country and intends to remain for a year or more in another. Statistical tests have shown that, despite their obvious deficiencies, ". . . the Board of Trade statistics of aggregate net passenger movement are a surprisingly good measure of the course of total net emigration from the United Kingdom in the period ending in 1912" (Thomas 1954, p. 52). This would not hold good for recent times, because air travel has become important and Britain persists in not including air migrants in her statistics.

The primary data used by various countries to measure international migration can be grouped under six headings: those yielded by controls at ports, by transport contracts, by population registers, by control at land frontiers, by passports, and by coupons detached from certain documents. In North America, South America, Asia, and Africa the usual practice has been to base the records on controls at frontiers and ports; in Europe, however, countries have adopted one or another of the six systems. Each government has tended to organize its migration statistics in accordance with its own particular policy objectives, without any regard to the need for international comparability. The result has been a bewildering variety of definitions and classifications. Valuable attempts have been made by the International Labor Office to point the way toward a common pattern (see, for example, International Labor Office 1932; 1952). In recent years the problem of improving migration statistics has been thoroughly explored by the United Nations.
An inquiry in 1950 showed that only 16 out of 45 countries classified emigrants by country of future residence or destination and only 17 classified immigrants by country of last residence or origin, while only 16 distinguished between continental and intercontinental immigrants and 10 between continental and intercontinental emigrants. For the demographer it was disconcerting that only 16 countries gave information on the marital status of migrants, and in nine of these countries this information was not combined with age grouping.

As a result of the work of the Economic and Social Council of the United Nations, most of the known statistics have been set out in two comprehensive monographs (United Nations, Bureau of Social Affairs 1953; 1958). These surveys cover 33 countries and include tables classifying migrants by occupation or industry, state of dependency, possession or nonpossession of a contract for employment, and, for the United States, Israel, and South Africa, the amount of money which immigrants bring in with them (for additional detailed information, see United Nations, Department of Economic and Social Affairs 1955).

Since progress in improving official sources on migration is inevitably a slow process, it is all the more necessary to check imperfect time series in the light of the more accurate population census data at decennial intervals. Thus, Kuznets and Rubin (1954) have compared the annual record of immigration into the United States over a long period with the estimates obtained from the census figures on resident foreign-born. Similarly, Keyfitz (1950) has drawn up a population balance sheet for Canada for the century 1851-1950, including the best possible estimates of immigration and emigration in each decade. Some indications of the scale of intercontinental population movements from the middle of the nineteenth century to the middle of the twentieth can be found in tables 1-3.

Table 2 - World population, by continent, in 1946 and 1957, and balance of intercontinental migration in the intervening period (in millions)

Continent or          Population    Population    Total population    Net
major region          1946          1957          increase            migration
Africa                   185.0         225.0            +40.0            +0.5
Americas                 300.0         381.0            +81.0            +4.4
Asia                   1,302.0       1,556.0           +254.0            -0.5
Europe                   379.0         414.0            +35.0            -5.4
Oceania                   11.8          15.4             +3.6            +1.0
U.S.S.R.                 175.0         204.0            +29.0             *

* Not available.
Source: Adapted from International Labor Office 1959, p. 304.

Table 3 - Population growth and net migration for selected countries, 1946-1957 (in thousands)

                                                     ESTIMATED NET MIGRATION
                 1946          Natural      Absolute    Per cent of        Per cent of
Country          population    increase     number      1946 population    natural increase
Canada              12,620        3,305      +1,000           +7.9               +30
United States      141,390       27,735      +2,200           +1.6                +8
Jamaica              1,300          348         -90           -6.9               -26
Mexico              23,180        8,689        -250           -1.1                -3
Argentina           15,650        3,358        +800           +5.1               +24
Brazil              47,310       16,000        +450           +1.0                +3
Uruguay              2,280          320        +110           +4.8               +34
Venezuela            4,390        1,812        +330           +7.5               +18
Australia            7,460        1,391        +930          +12.5               +67
New Zealand          1,660          351        +145           +8.7               +41

Source: Adapted from International Labor Office 1959, pp. 308, 312.
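The two percentage columns of Table 3 follow directly from the other three columns. As a minimal sketch (the function name is illustrative and not taken from the source; the figures are those printed in the table, in thousands), the calculation can be reproduced as follows:

```python
# Rows of Table 3, in thousands:
# (country, 1946 population, natural increase 1946-1957, estimated net migration)
TABLE_3 = [
    ("Canada", 12_620, 3_305, 1_000),
    ("United States", 141_390, 27_735, 2_200),
    ("Jamaica", 1_300, 348, -90),
    ("Mexico", 23_180, 8_689, -250),
    ("Argentina", 15_650, 3_358, 800),
    ("Brazil", 47_310, 16_000, 450),
    ("Uruguay", 2_280, 320, 110),
    ("Venezuela", 4_390, 1_812, 330),
    ("Australia", 7_460, 1_391, 930),
    ("New Zealand", 1_660, 351, 145),
]


def migration_shares(population, natural_increase, net_migration):
    """Return net migration as a percentage of (a) the base-year population
    and (b) natural increase over the period."""
    pct_of_population = 100 * net_migration / population
    pct_of_natural_increase = 100 * net_migration / natural_increase
    return pct_of_population, pct_of_natural_increase


# Print one line per country: share of 1946 population, share of natural increase.
for country, population, natural_increase, net_migration in TABLE_3:
    pct_pop, pct_nat = migration_shares(population, natural_increase, net_migration)
    print(f"{country:<14} {pct_pop:+6.1f} {pct_nat:+5.0f}")
```

Rounded to the precision used in the table, the results agree with the printed columns. Australia stands out: net migration of 930 thousand is about 12.5 per cent of the 1946 population and about 67 per cent of natural increase.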
Economic determinants of migration

The period 1840-1924 was in several respects unique in the history of migration. The evolution of the Atlantic economy in that era necessitated a considerable movement of population and capital from the Old World, which was relatively well endowed with these factors, to the New World, where they were relatively scarce. Over 45 million people crossed the ocean; the average rate of growth of population in each decade of the nineteenth century was 29 per cent in the United States, 34 per cent in Argentina, and 8 per cent in Europe. The world's chief provider of capital was Great Britain; of her foreign investments of over $17,000 million in 1913, nearly 70 per cent were located in North America, South America, and Oceania.

The motives which led these millions of people to leave their homelands were infinitely varied; a lengthy catalogue of them would be full of human interest but would not provide an interpretation of the phenomenon. At certain times and in certain places the operative force was political oppression or religious persecution or eviction by tyrannical landlords or the threat of starvation or evasion of military service or the love of adventure or the lure of gold or the attraction of a new country with limitless opportunities. But what has to be explained is why the individual decisions of millions of people resulted in four major upswings with intervening downswings, with an average interval of 15 to 20 years from peak to peak, in oversea emigration from Europe.

The emigration upswings took place in 1845-1854, 1863-1873, 1881-1888, and 1903-1913. There can be no possible doubt about the explanation of the first of these: its inception had nothing to do with demand conditions in the United States. Calamity struck in Ireland in 1845, when the potato crop failed and a terrible famine followed; and as if this were not enough, the landlords added to the horrors by violently evicting thousands of peasants from their homes.
In another part of Europe, southwest Germany, in the years 1848-1854 a severe crisis in the rural areas (in addition to the prevailing political unrest) brought population pressure to a head, and the only solution was emigration. In that period, of the 2,796,000 European immigrants who landed in the United States, no less than 80 per cent came from Ireland and Germany—1,283,000 from Ireland and 939,000 from Germany. This was essentially a Malthusian evacuation; both its timing and its magnitude were determined by exogenous driving forces in two stricken areas of Europe. The explanation of the subsequent fluctuations
in migration lies in a complicated process of interaction between the economies of the Old World and those of the countries of new settlement overseas. It is significant that when the receiving countries, notably the United States, Canada, and Australia, were absorbing immigrants on a large scale, they were also experiencing a long upswing in capital construction (such as railroads and housing), which is sensitive to population growth. When this cycle entered its downward phase, with both migration and capital imports dwindling, there was simultaneously an upsurge in capital construction in the United Kingdom, where the rural surplus was now absorbed in urban areas at home. This capital construction was financed by loanable funds which were no longer attracted abroad (Thomas 1954). Thus, there was an inverse relation between long swings in population-sensitive capital formation in the United Kingdom and in the United States, and in the United Kingdom there was an inverse relation between external migration and internal migration. The mechanism of this inverse long swing between the United Kingdom and the United States can be seen most clearly in the period 1870-1913; it arose because a substantial part of total capital formation was sensitive to the rate of population growth and the rate of population growth was determined by the net migration balance. There are some grounds for thinking that the propensity to emigrate was in some way related to a cycle of births in Europe that caused a periodic recurrence of swollen numbers in the emigration age groups (Thomas 1954). In the interwar period some of the basic trends of the pre-1913 era were reversed. The United States had become the world's leading exporter of capital, and the Immigration Restriction Act of 1924 virtually closed the doors to further immigration except on a very modest scale.
From the turn of the century, British settlement in Canada, Australia, New Zealand, and South Africa had been expanding, and British emigration to the Empire considered as a proportion of British emigration to the United States, which had been only 43 per cent in 1881-1900, had risen to 245 per cent by 1911-1913. After World War I Britain embarked on a substantial program of Empire settlement, and in the decade 1922-1931, 400,000 emigrants received financial support to enable them to settle in the overseas dominions. However, this outflow did not survive the world depression, which had such severe consequences that it actually reversed the world currents of migration. In 1932, 11 European countries of emigration received a net inward balance of 102,000 persons, and Argentina, Australia, New Zealand, the United States, and Uruguay together had a net outflow of 65,000 (International Labor Office 1932). The international migration picture had become a perverse caricature of its former self.

Changes since World War II. The factors determining the volume, quantity, and direction of international migration since World War II are quite different from what they were in the nineteenth century. Profound structural changes have taken place in the Atlantic economy, and they have had far-reaching effects on the pattern of world migration. It is necessary to distinguish between the years immediately after the war and the period beginning in 1952. The correspondence between international flows of people and of private capital, which was the outstanding feature of the nineteenth century, disappeared. In its place there emerged in the years 1945-1952 an international circular flow based on the immense net transfer of $33,800 million of public capital (government loans and grants) from the United States, $22,800 million of which went to Europe. This recovery program made it possible for the exhausted countries of Europe, particularly the United Kingdom, France, the Netherlands, and Belgium, to resume exports of people and capital to the oversea territories with which they had a special relationship. It facilitated the revival of migration and mobility of capital within the British Commonwealth and strengthened the purchasing power of the less developed parts of the world. Perhaps the most significant effect of the recovery program was its contribution to basic capital formation in western Europe. This contribution was the prelude to the remarkable upsurge in economic growth in the 1950s [see FOREIGN AID]. After 1952 a new balance of economic forces emerged.
American economic aid to Europe ended, but the volume of military aid rose considerably and the amount of private investment by American firms in Europe greatly increased, until in 1959 for the first time the flow of new American funds for direct investment was larger in western Europe than in either Canada or Latin America. Rapid economic growth in western Europe has meant a remarkable increase in intracontinental migration and a continued decline in emigration to oversea countries. Net immigration into the European Economic Community amounted to 288,000 in 1960 and 421,000 in 1961. The latter figure was particularly large because of the repatriation of French and Belgian nationals from Africa. Looking at the separate countries, we find that in 1961
West Germany had a net inflow of 421,000 and France 150,000, whereas Italy had a net outflow of 164,000. Over the five years 1956-1960 the net outward migration from Italy to other European countries was at an annual average of 83,000, and emigration from Italy to oversea countries fell from 111,000 in 1956 to 48,000 in 1960. Intracontinental migration is increasing at the expense of the traditional European outflow to countries such as those of Latin America, and this is basically determined by the economic resurgence of western Europe and the consequent change in the economic balance within the Atlantic economy. New determinants are also operating on other continents. The complex problems of migration on the African continent have been explored in a comprehensive study by the International Labor Office (1958). The spread of industrial development draws Africans long distances from their rural homes, but they are often prevented from becoming members of a settled work force. They find themselves suspended with the maximum of insecurity between the village to which they are attached and the harsh conditions of the industrial labor market. This can have disastrous social consequences: the countryside is denuded of a large part of its labor, family life is broken up, the social structure disintegrates, and there is neither economic nor social security. The General Conference of the International Labor Office in 1955 adopted a recommendation on the protection of migrants in underdeveloped countries which is clearly relevant to Africa. Most underdeveloped countries are detrimentally affected by current trends in the migration of qualified personnel. In the nineteenth century, skilled manpower tended to accompany capital flows from advanced to less developed countries: in the modern world the bulk of private investment is a circulatory process within the rich sector, and there is a suction of skilled labor from the poorer countries into the more advanced. 
Of the 9,245 immigrant engineers admitted to the United States in 1953-1956, 50 per cent came from Europe, 25 per cent from Canada, and 22 per cent from "other countries," which included a number in a low state of economic development. The same was true of natural scientists ([U.S.] National Science Foundation 1958). Unskilled labor is often complementary to skilled, and when there is a decline in the immigration of the latter into a developing country, as has occurred, e.g., in Latin America, the scope for the absorption of unskilled immigrants is automatically curtailed. Some of the international movements of relatively scarce human capital tend
to widen the disparity between rates of economic growth in rich and poor countries. Such transfers could be on such a scale that some underdeveloped countries could never begin the process of growth. There seems to be a conflict here between the principle of freedom to migrate and the goal of reducing inequality. However, in assessing the economic effects of the migration of a factor of production, the relevant criterion is not marginal private productivity, but marginal social productivity. Judged by this criterion, some of the international migration of skill in the world today is perverse.

Impact on receiving countries

It used to be argued that immigration into the United States had not added to the American population because its effect had been counterbalanced by an induced decline in the fertility of native-born Americans (see, for instance, Walker 1891). This proposition has been disproved. Modern demographic analysis has demonstrated that the immigrations of the last century had hardly any net effect on the rate of natural increase of the native-born population in the receiving countries. It is estimated that in France between 1801 and 1936 net immigration was 3,960,000 and contributed only a third of the growth of population within the 1936 boundaries of the country during that period. The white population of the United States in 1790 was about 3.2 million; their descendants living in the United States in 1920 have been estimated at 41 million, and in the same year the number of descendants or survivors of immigrants since 1790 came to about 53 million. In that period net immigration into the United States was about 26.5 million (United Nations, Department of Social Affairs 1953, p. 139).

One of the most striking examples of mass immigration in recent times is that into West Germany, which absorbed 12 million immigrants in 12 years after the war. Although this influx increased the population of West Germany by one-third, it failed to make up for the gaps in the demographic structure caused by war losses, since the immigrant population had suffered the same kinds of losses in the same age groups (International Labor Office 1959, pp. 27-28). In recent years there has been a dramatic change of trend in the United Kingdom, a traditional country of emigration. Since 1958 there has been an appreciable net inward movement of migrants. This reversal of trend was caused by the substantial inflow of colored Commonwealth citizens, mainly from the West Indies, India, and Pakistan.
This led the government to introduce legislation to regulate Commonwealth immigration; the Commonwealth Immigrants Act came into force on July 1, 1962. Although the total number of colored people living in the United Kingdom is only about 1 per cent of the population, the newcomers have tended to cluster in certain places, and this has given rise to difficult social problems. This is part of a wider phenomenon; compare, for instance, the situation of the Puerto Ricans in the United States or the north Africans in France. Where poor, overpopulated countries have special relationships with advanced countries, it is natural that an overflow of population will take place so long as channels of mobility remain open.

Impact on sending countries

Mass emigration can in certain circumstances turn into a self-reinforcing process with profound long-run effects on the sending country. This arises from one of the most significant of migration differentials, the fact that the incidence of migration is particularly heavy in the age group 15-30. Given a substantial initial outflow, the sequel can be as follows. There is a decline in the marriage rate and consequently a fall in the size of the age group 0-5; but, since the rate of emigration among children is relatively low, the 0-5 age group becomes a relatively large group ten years later. It is a paradox that countries which are heavy losers through emigration seem to be liberally endowed with teen-agers. The process is self-perpetuating because, after a lag of about 15 years from the original thinning out of the 15-30 age group, the number entering the high emigration age group of 15-20 is relatively high in relation to the total population. The influence of mass emigration on age composition in the sending country can be observed most strikingly in the case of Ireland, from which the total outflow from 1850 to 1911 was 4,191,000.
By 1951, 30 per cent of the population were aged 45 and over, as compared with 16 per cent in 1841, and 11 per cent were 65 and over, as compared with 3 per cent in 1841. The experiences of Ireland, Sweden, and Scotland show that substantial emigration tends to reduce the marriage rate, but we cannot be certain about the long-run effects on fertility. There can be little doubt that prolonged emigration helps to explain why there are so many spinsters in Ireland; however, the women who do get married have relatively large numbers of children. As to the death rate, the tendency is for the loss of good lives from the 15-35 age group
through emigration to raise the average death rate in that group.

Assimilation

It is broadly agreed that assimilation is best regarded as a mutual process of integration. An American sociologist has well said that ". . . the United States has not assimilated the newcomer nor absorbed him. Our immigrant stock and our so-called 'native' stock have each integrated with the other. . . . It will be apparent that this concept of integration rests upon a belief in the importance of cultural differentiation within a framework of social unity. It recognizes the right of groups and individuals to be different so long as the differences do not lead to domination or disunity" (Borrie 1959, pp. 93-94). One of the difficulties of group settlement, e.g., in Latin America, is that there arises a conflict between the immediate interests of the migrant in his group and the long-run objective of cultural integration. Much benefit can be derived from the provision of instruction to the migrant before he has left his own country. The Intergovernmental Committee for European Migration has evolved effective means of selecting, educating, and pretraining European migrants. [See REFUGEES, article on ADJUSTMENT AND ASSIMILATION.]

Experts who have studied migration in Asia stress the importance of assimilation as a necessary condition of the diffusion of skills. In colonial regimes there was a tendency for skilled immigrants to remain a class apart, and they often sought to maintain the relative scarcity of their skill. The great need in Asian countries now is for the importation of skilled personnel who will assimilate easily and thereby facilitate the rapid spread of technical knowledge.

International migration no longer plays the role in economic growth that it did in the nineteenth century.
Legislative restrictions, the changes in the economic determinants, and the population upsurge in different parts of the world have all tended to reduce the scale of movement. Countries which have been receiving a relatively large influx of migrants since World War II, e.g., Australia, will soon find that the rate of entry into the working population from the swollen lower age groups will make immigration on the old scale unnecessary. The international circulation of skilled manpower has become relatively more important. Intercontinental migrations have lost most of their significance and have been replaced by intracontinental movements.
Much more interdisciplinary research is needed into the problems of adjustment of immigrants, particularly where they are ethnically different from the population of the host country; the interaction between external and internal migration; the relation between immigration and the incidence of mental health; the determinants of the rate of increase of immigrant groups in multiracial societies; and the economic and social consequences of the changing pattern of the international circulation of skilled manpower.
BRINLEY THOMAS
[Directly related are the entries CAPITAL, HUMAN; REFUGEES. Other relevant material may be found in ASSIMILATION; POPULATION, article on POPULATION DISTRIBUTION; and in the biographies of GINI and KULISCHER.]
BIBLIOGRAPHY
BORRIE, WILFRID D. et al. 1959 The Cultural Integration of Immigrants: A Survey Based Upon the Papers and Proceedings of the UNESCO Conference Held in Havana, April 1956 . . . With Case Studies. Paris: UNESCO.
CARR-SAUNDERS, ALEXANDER M. 1936 World Population: Past Growth and Present Trends. Oxford: Clarendon.
EISENSTADT, SHMUEL N. (1954) 1955 The Absorption of Immigrants: Comparative Study Based Mainly on the Jewish Community in Palestine and the State of Israel. Glencoe, Ill.: Free Press.
GINI, CORRADO 1946 Los efectos demográficos de las migraciones internacionales. Revista internacional de sociología (Madrid) 4:351-388.
INTERNATIONAL ECONOMIC ASSOCIATION 1958 Economics of International Migration: Proceedings of a Conference. Edited by Brinley Thomas. New York: St. Martin's; London: Macmillan.
INTERNATIONAL LABOR OFFICE 1932 Statistics of Migration: Definitions-Methods-Classifications. Geneva: The Office.
INTERNATIONAL LABOR OFFICE 1952 International Classification of Occupations for Migration and Employment Placement. 2 vols. Geneva: The Office. -> Volume 1: Occupational Tables, Codes and Definitions. Volume 2: Tables of Occupational Comparability: Major Occupational Groups 1 Through 6.
INTERNATIONAL LABOR OFFICE 1958 African Labour Survey. International Labour Organization, Studies and Reports, New Series, No. 48. Geneva: The Office.
INTERNATIONAL LABOR OFFICE 1959 International Migration: 1945-1957. International Labour Organization, Studies and Reports, New Series, No. 54. Geneva: The Office.
KEYFITZ, NATHAN 1950 The Growth of Canadian Population. Population Studies 4:47-63.
KUZNETS, SIMON; and RUBIN, ERNEST 1954 Immigration and the Foreign Born. National Bureau of Economic Research, Occasional Paper No. 46. New York: The Bureau.
LASKER, B. 1945 Asia on the Move: Population Pressure, Migration, and Resettlement in Eastern Asia Under the Influence of Want and War. New York: Holt.
MILBANK MEMORIAL FUND 1958 Selected Studies of Migration Since World War II. New York: The Fund.
NATIONAL BUREAU OF ECONOMIC RESEARCH 1929-1931 International Migrations. Edited by Walter F. Willcox. 2 vols. New York: The Bureau. -> Volume 1: Statistics, compiled on behalf of the International Labour Office, Geneva, with Introduction and notes by Imre Ferenczi. Volume 2: Interpretations, by a group of scholars in different countries.
TAFT, D.; and ROBBINS, R. 1955 International Migrations. New York: Ronald.
THOMAS, BRINLEY 1954 Migration and Economic Growth: A Study of Great Britain and the Atlantic Economy. National Institute of Economic and Social Research, Economic and Social Studies, No. 12. Cambridge Univ. Press.
THOMAS, BRINLEY 1961 International Migration and Economic Development: A Trend Report and Bibliography. Paris: UNESCO.
UNITED NATIONS, BUREAU OF SOCIAL AFFAIRS 1953 Sex and Age of International Migrants: Statistics for 1918-1947. Population Studies, No. 11. New York: United Nations.
UNITED NATIONS, BUREAU OF SOCIAL AFFAIRS 1958 Economic Characteristics of International Migrants: Statistics for Selected Countries, 1918-1954. Population Studies, No. 12. New York: United Nations.
UNITED NATIONS, DEPARTMENT OF ECONOMIC AND SOCIAL AFFAIRS 1951-1952 International Migration in the Far East During Recent Times. Population Bulletin 1:13-30; 2:27-58.
UNITED NATIONS, DEPARTMENT OF ECONOMIC AND SOCIAL AFFAIRS 1955 Analytical Bibliography of International Migration Statistics, Selected Countries: 1925-1950. Population Studies, No. 24. New York: United Nations.
UNITED NATIONS, DEPARTMENT OF SOCIAL AFFAIRS 1953 The Determinants and Consequences of Population Trends: A Summary of the Findings of Studies on the Relationships Between Population Changes and Economic and Social Conditions. Population Studies, No. 17. New York: United Nations.
U.S. IMMIGRATION COMMISSION, 1907-1910 1911 Abstracts of Reports of the Immigration Commission: With Conclusions and Recommendations and Views of the Minority. 61st Congress, 3d Session, Senate Document 747, Vol. 1. Washington: Government Printing Office.
[U.S.] NATIONAL SCIENCE FOUNDATION 1958 Immigration of Professional Workers to the United States, 1953-1956. Scientific Manpower Bulletin No. 8.
WALKER, FRANCIS A. 1891 Immigration and Degradation. Forum 11:634-644.
MILITARISM
Militarism is a doctrine or system that values war and accords primacy in state and society to the armed forces. It exalts a function, the application of violence, and an institutional structure, the military establishment. It implies both a policy orientation and a power relationship. Although militarists have used violence to silence domestic critics, their ideology rationalizes its use
primarily in foreign affairs. War is held to be a divine commandment or an experience that ennobles by developing courage, patriotism, honor, unity, and discipline. Militarists seek to universalize such values by precept, symbol, and ceremony. A fully militarized society also confers a privileged position on warriors. In the extreme case, possible only in centralized polities, the armed forces unilaterally determine the nature of basic institutions, the choice of regimes, the rights and duties of citizens, and the share of national resources allocated to military functions. In a less extreme but more common case military leaders exercise great power as partners or agents of other social groups rather than as relatively autonomous forces. Ideal-type militarism was approximated in Japan from 1931 to 1945 and in Germany during the later stages of World War I.
Advocates of militarism. Militarists cannot be identified with military or uniformed personnel. In modern European history officers were often restrained in foreign policy and circumspect in domestic politics. Indeed, Huntington (1957, pp. 69-71) contends that such attributes define the genuinely professional officer. The identification of militarists with men in uniform also fails because civilian militarists have long cherished war and warriors. D'Annunzio, Barrès, Carlyle, Theodore Roosevelt, and Treitschke stand as examples. Because the policy and institutional aspects of militarism can be separated, it is quite possible to emphasize one without the other. For example, though the Nazis regimented Germany to facilitate foreign conquests, they also destroyed the vaunted autonomy of the professional army. Conversely, though officer-aristocrats of the nineteenth century exalted the army, they were not always bellicose. Their "enemies" were often fellow citizens, liberals, or socialists, against whom they sometimes made common cause with foreign officer-aristocrats. 
Such inward-looking militarism has also existed in Latin America.
Specialization of function. In ordinary usage "militarism" has a derogatory meaning. Like legalism or clericalism it suggests excess: a lack of proportion in policy or, when exhibited by warriors, a disregard for appropriate professional bounds. Since such disregard appears central to the concept, it is not rewarding to apply the term to societies in which professional bounds are invisible, that is, where roles are so fused that a distinctive military calling does not exist. Institutional militarism assumes a minimum differentiation of military from political, economic, and religious roles. At the same time it assumes that the differentiation is incomplete or that it has been challenged and stands in danger. The negative nuance in the term also implies that disregard for the bounds carries a penalty. The penalty is technical incompetence. Historically, for example, some European militarists failed to note the obsolescence of cavalry because they were less concerned with professional questions of maneuver and firepower than with a social question: the class status symbolized by cavalry. Japanese militarists exceeded their jurisdiction as strategic planners and, insisting on making their own diplomatic determination, embarked on a conflict that ultimately proved disastrous to their own forces. In both cases militarists actually jeopardized security ends by their inability to select means appropriate to their defense. This inability is a function of inordinate power and forsaken expertise. In a nonmilitarized society the warrior is an agent and a specialist. In a militarized society he is a principal and a generalist. He feels competent to deal with the whole sweep of public policy, foreign and domestic. But in the long run so unconfined a jurisdiction impairs his professional military expertise without developing in the warrior the skills necessary to match experienced civilians in political bargaining or economic management.
Antimilitarism of the middle class. The term "militarism" appears to have been used first by middle-class liberals in nineteenth-century Europe. Possibly they sensed that militarism was incompatible with the specialization of functions demanded in an industrial era. Clearly they realized that standing armies had become citadels of aristocratic privilege. The values of officer-aristocrats conflicted with those of the middle class. 
The officers prized hierarchy, feudal honor, absolutism, prodigality, and organic unity; the middle class spoke of equality, material gain, parliamentary rule, thrift, and individualism. Inevitably, military institutions were suspect to such liberals as Locke, Ferrero, Voltaire, Jefferson, and Kant. Later they became equally suspect to Leninists, whose doctrine defined war as a disease of capitalism in its final, imperialist phase. Given the problems that concerned them, it was natural that European critics of militarism rarely speculated about the danger that military expertise might be inadequately represented in the foreign policy process. Nor, despite examples of revolutionary armies in the Americas and France, did they theorize about the causes or consequences of military liberalism. Overgeneralizing from selected aspects of Western history, they merely identified the armed forces with unbridled power, social reaction, and war. Wherever representative democracy advanced, therefore, military power was curbed. Today, in the industrialized West such power tends to be confined to defense policy, the field of greatest professional concern to warriors. This generalization also holds for the Soviet Union and other totalitarian states, except possibly during crises of succession.
New approaches to militarism
Recent social research on militarism has dealt less with Europe and Japan than with southeast Asia, the Near East, and Latin America. The change in focus has had a perceptible impact on the term itself. It is no longer identified so closely with bellicose foreign policy. In these vast regions social revolution replaces war as the significant form of violence. To be sure, the term is still identified with the primacy of the military establishment. But analysis of ideological views has been succeeded by interest in the actual degree of social and political power held by armed forces in different states. Efforts have been made to distinguish these levels and to analyze the variables that determine them. More sophistication is also displayed about the economic and social policies of military regimes.
The spectrum of power positions. The political power of armed forces reaches its nadir in countries which dispense with them altogether; Iceland and Costa Rica are modern examples. Next come societies in which the military establishment is dominated by authoritarian regimes: traditional autocracies or dictatorships supported by parties of mass mobilization. In stable constitutional democracies the armed forces, like other segments of the bureaucracy, press their claims through prescribed channels and procedures but ultimately comply with decisions made by civilian superiors. Militarism begins when the armed forces accompany their advice with a threat of sanctions if the advice is unheeded. 
Finer (1962) has summarized the high-handed techniques sometimes used: threats to resign, to withdraw support, to announce disagreements publicly, to demonstrate disdain for the regime, to refuse to execute its orders, or to rise up in arms. Whenever such blackmail succeeds, the armed forces in fact begin to rule covertly, either by exercising a veto or by substituting policies and personnel of their own choice for those of the de jure government. From this point it is a short step to more extreme measures which take special advantage of a disciplined following, a superior communications net, and heavy weapons.
These measures include the manipulation or delay of elections, the deployment of troops to intimidate opponents or to seize key points, and the arrest or assassination of politicians. In military interventions the armed forces may act alone or in coalition with civilians. They may take the initiative or respond, more or less eagerly, to pleas from politicians. These are important distinctions. It is equally important to distinguish sporadic from chronic intervention and brief from prolonged military rule. Military juntas frequently assume power with the statement that they will serve only as "caretakers" until a legitimate civilian regime can be installed. But such pledges are not always honored, and what appears to be withdrawal is sometimes merely a tactical retreat to a position from which covert rule can be attempted. On the other hand the examples of Atatürk, Franco, de Gaulle, and Ayub indicate that military leaders who serve as heads of government for long periods may in effect transform themselves into civilian politicians and thereby legitimize their rule.
Demands for military leadership. The degree of political power possessed by armed forces is determined first of all by the effective demand for military leadership. Response to such demand has been termed "reactive militarism" (Janowitz 1964, p. 16). Demand for military leadership varies with the intensity of foreign or domestic social conflict. Prussia's history illustrates that armies become influential in countries which experience foreign pressure consistently. The theory of the "garrison state" (Lasswell 1941) elaborates this insight and applies it to industrial nations. It assumes that if such countries face continuing security crises they will subordinate all else to defense preparations and their social systems will be controlled rigorously by a combination of military and civilian leaders. 
Although this thesis has not been disproved, it is evident that the vitality of the political process can affect the outcome significantly. For example, military leaders have played a prominent role in American policy councils during the cold war, but at no time has military rule been likely. Soviet and British experiences in and after World War II also suggest that prolonged security crises need not generate military regimes where civilian government rests on a firm consensus. It is equally clear that military leaders can acquire great power in the absence of significant foreign pressure. Latin America and Tokugawa Japan provide examples. In the new states of Africa and Asia military primacy is more frequently a function of internal than of external security crises. In most cases the
latter make themselves felt only indirectly. For example, military aid programs, initiated in response to global security crises, increase the political power of armed forces in recipient states by reducing their dependence on local political leaders and by stimulating more rapid modernization of the military bureaucracy than of the civil service.
Demands for military leadership can be generated by domestic events under three analytically distinct conditions, which can be understood with the aid of insights derived from the general theories of Hobbes, Marx, and Pareto respectively. In the first instance, once society becomes disorganized to the point of anarchy, military force seems essential to the restoration of public order, especially at the local level. In the second situation, a privileged social group, aided by control of the state, oppresses subordinate groups. If the latter cannot seek redress of grievances peacefully, their protests may take violent forms ranging from individual acts of terrorism to the organization of revolutionary armies. Since the dominant strata cannot rely on overarching loyalties to hold the community together, they in turn must call upon police and army to compel obedience. In the third situation, although the fundamental basis of the social order is not threatened, elites are at odds over such issues as corruption, constitutional procedure, or foreign policy, and one or another faction eventually turns to the military for help.
Countervailing forces. The strength of competing forces and institutions also affects the social and political power of the military establishment. Competition may, of course, emanate from within the armed forces themselves, but little is yet known about the conditions under which military influence in society is increased or decreased by factional rivalry. The effect of external forces seems clearer. 
In such stratified societies as Iran or Morocco, for example, the armed forces have long been regulated by autocratic rulers; in India they were controlled by colonial civil servants. Military primacy is more likely if the legitimacy of such traditional controls is challenged before influence can be acquired by other civilians. Military primacy varies inversely with the number and strength of private associations, with the integrity and skill of civilian bureaucrats, and with the prestige of political parties and legislatures. Especially in new states, if other public institutions are tarnished by corruption or if they are supported only by privileged social strata or only in the capital city, military leaders often assume the functions of mobilizing and representing social interests.
The strength of countervailing forces is sometimes weakened by disputes over legitimate authority. During civil strife or in crises of succession it is often unclear where rightful control over the armed forces is lodged and their discretionary power inevitably rises. Legislatures cannot always make good their claim to control military budgets or war ministers. Prime ministers and war ministers do not always possess full authority over senior command and staff officers. In some cases war ministers are ineffective agents of political authority because they, too, are professional officers, with strong ties to their military colleagues. This was often the case in Germany and France before 1914, in Japan through World War II, and in Argentina and Brazil in more recent times.
Finer has related the levels of military intervention to the demands for military leadership that are generated by domestic conditions or by what he terms "the order of political culture" (1962, p. 139). In mature political cultures the armed forces engage only in prescribed modes of influence. In developed but not fully mature cultures militarism emerges in the form of sporadic and usually covert intervention, the limits of which are set in part by countervailing social forces and institutions. In countries with low or minimal political culture military intervention is proportionately easier, more open, more frequent, and more enduring. However, a succession of military coups may also serve as a substitute for revolutionary warfare in propelling countries of low political culture toward political maturity.
National traditions. Military primacy is also a function of particular national traditions. Where leaders of the armed forces have formed part of a respected ruling class or where they have come to be respected as national heroes, saviors, or symbols, they recruit a disproportionate share of talented and ambitious men. Governments are readier to grant these men the money and materials they desire. 
Citizens are also readier to tolerate their assumption of political and social leadership. Under such circumstances the armed forces develop corporate pride and confidence that translate into still greater influence. For such reasons the military in imperial Germany and Japan possessed greater social and political power than their counterparts in imperial China, nineteenth-century America, or twentieth-century Africa.
Values and internal policies. Given identical opportunities to acquire political primacy, not all military leaders are equally disposed to seize them. Readiness to do so depends on self-images and values, some of which in turn shape the internal policies of military regimes.
The most important inhibiting value is a commitment to the principle of civilian supremacy. As noted earlier, Huntington (1957) defines this as part of the professional military ethic, but Finer (1962, pp. 24-30) contends that it is an independent factor. The principle itself requires military leaders to serve all lawful regimes faithfully. Acceptance of such a commitment induces them to resist substantial temptations to intervene in politics; rejection makes intervention likely on little provocation. The strength of the commitment differs from country to country, from individual to individual, and probably from one branch of service to another. Several kinds of values predispose military leaders to seek political power. From the praetorian guards of Rome to the condottieri of Italy and the janissaries of Turkey, blackmail and coups d'etat have been attempted for personal aggrandizement, special privileges, and material benefits. These essentially personal motives or values are still important in many countries. Institutional values are important wherever military careers provide a major outlet for ruling classes or upward-mobile persons; they are less important where good opportunities exist in business, learned professions, and other careers. Different institutional interests help explain the special affinity of navies for foreign politics and the special affinity of armies for internal politics. Although such interests may be either defensive or offensive, it is not always easy to distinguish the one from the other. A desire to preserve institutional criteria for promotion of personnel can shade off into a demand for autonomy so complete that the armed forces become "a state within the state." A desire to protect corporate values can induce an army to demand an enormous proportion of the nation's manpower and income. Public or political values are nowadays most influential in prompting military men to seek political primacy. 
Such values involve conceptions of national security, aggressive as well as defensive; conceptions of nation building in the face of centrifugal forces; conceptions of efficiency and austerity in the face of corruption; conceptions of the rule of law in the face of arbitrary government; conceptions of social unity in the face of factional politics; and conceptions of social and economic justice (conservative, liberal, and eclectic). A historical relationship has existed between armies and conservative regimes not only in western Europe but also in the traditional autocracies of Afro-Asia and under many Latin American dictatorships. Armed forces often assist in maintaining an existing system of social stratification and privilege. But it is impossible to ignore the tradition of military liberalism in the American and French revolutions, in the Napoleonic armies, among some Prussian military reformers, among the military republicans of Spain in 1820, and among the Latin American followers and heirs of Bolivar. It is equally impossible to dismiss as conservative the military elites of the underdeveloped world today. In the new states of Africa and Asia military conservatives in the classical European sense tend to be exceptions, in part because few such countries experienced feudal institutions and in part because colonial administrators monopolized high-status positions. Also, armies in emerging nations are heterogeneous. Countries as different in other respects as Brazil and Iraq are alike in having politically influential officers who represent both oligarchic and radical forces. The former are interested in fiscal responsibility, order, and legitimacy. The latter are more interested in welfare programs, tax and land reform, and the mobilization of youth, women, and peasants. Still other officers hold positions that can only be described as technocratic or pragmatic. Some change their ideologies in response to events. Many Latin American officers, for example, support modernization efforts up to the point at which the stratification system seems threatened and then recoil in alarm. On the other hand, in most of the new states of Afro-Asia there is little principled opposition to public enterprise, and the military regimes of Nasser in Egypt and Ne Win in Burma actually pledged major transformations of the social order.
Orientation of military leaders. Social origins and connections, age, and foreign influences are among the factors that shape the economic and social orientation of military leaders. When officership is primarily an ascriptive calling, it attracts the wellborn or the rich. 
However, when the entire military profession is open to men of talent on a competitive basis, men of the middle and lower middle class are more likely to enter it. A decision to recruit on the basis of merit has strengthened revolutionary elements in the officer corps in times and places as different as France in 1792 and Egypt in 1936. Moreover, popular military organizations, such as militia or national guards, are frequently more liberal than small professional armies. In eighteenth-century and nineteenth-century Europe such relatively technical branches as artillery and engineers tended to attract middle-class liberals, while cavalry and infantry remained citadels of the aristocracy. Similarly, in the developing
nations after World War II officers whose work had a relatively large technical component tended to be less conservative than their colleagues. In these countries, also, officers in the grade of colonel or lower tended to be more radical than their more senior colleagues. Finally, economic and social orientations of military personnel are affected by foreign cultural impacts. In the armies of Latin America oligarchic tendencies were strengthened first by Iberian influences and later by fascist military advisers. In the nineteenth-century Turkish army, on the other hand, contact with France led to a diffusion of liberal ideas stemming from the Enlightenment and the French Revolution. Although it is extremely difficult to generalize, it is also probable that the net effect of post-1945 military aid programs, Soviet as well as Western, was to strengthen modernizing tendencies in the armed forces of developing nations.
LAURENCE I. RADWAY
[See also CIVIL-MILITARY RELATIONS; MILITARY; MILITARY POLICY. Other relevant material may be found in MODERNIZATION; NATIONAL SECURITY; PACIFISM; WAR.]
BIBLIOGRAPHY
ANDRZEJEWSKI, STANISLAW 1954 Military Organization and Society. London: Routledge; New York: Humanities.
ERICKSON, JOHN 1962 The Soviet High Command: A Military-Political History, 1918-1941. New York: St. Martin's.
FINER, SAMUEL E. 1962 The Man on Horseback: The Role of the Military in Politics. New York: Praeger.
GIRARDET, RAOUL 1953 La société militaire dans la France contemporaine: 1815-1939. Paris: Plon.
HACKETT, ROGER F. 1964 The Military. A. Japan. Pages 328-351 in Conference on Political Modernization in Japan and Turkey, Gould House, 1962, Political Modernization in Japan and Turkey. Edited by Robert E. Ward and Dankwart A. Rustow. Princeton Univ. Press.
HUNTINGTON, SAMUEL P. 1957 The Soldier and the State: The Theory and Politics of Civil-Military Relations. Cambridge, Mass.: Harvard Univ. Press.
JANOWITZ, MORRIS 1964 The Military in the Political Development of New Nations: An Essay in Comparative Analysis. Univ. of Chicago Press.
LASSWELL, HAROLD D. 1941 The Garrison State. American Journal of Sociology 46:455-468.
LIEUWEN, EDWIN (1960) 1961 Arms and Politics in Latin America. Rev. ed. Published for the Council on Foreign Relations. New York: Praeger.
MAXON, YALE C. 1957 Control of Japanese Foreign Policy: A Study of Civil-Military Rivalry, 1930-1954. Berkeley: Univ. of California Press.
MILLIS, WALTER; MANSFIELD, HARVEY C.; and STEIN, HAROLD 1958 Arms and the State: Civil-Military Elements in National Policy. New York: Twentieth Century Fund.
RITTER, GERHARD 1954-1964 Staatskunst und Kriegshandwerk: Das Problem des "Militarismus" in Deutschland. 3 vols. Munich: Oldenbourg. -> Volume 1: Die Altpreussische Tradition: 1740-1890. Volume 2: Die Hauptmächte Europas und das Wilhelminische Reich: 1890-1914. Volume 3: Die Tragödie der Staatskunst: Bethmann-Hollweg als Kriegskanzler, 1914-1917.
SPEIER, HANS 1952 Social Order and the Risks of War. New York: Stewart.
VAGTS, ALFRED (1937) 1960 A History of Militarism: Civilian and Military. Rev. ed. London: Hollis & Carter.
MILITARY
As a sociological category, the term "military" implies an acceptance of organized violence as a legitimate means for realizing social objectives. Military organizations, it follows, are structures for the coordination of activities meant to ensure victory on the battlefield. In modern times these structures have increasingly taken the form of permanent establishments maintained in peacetime for the eventuality of armed conflict and managed by a professional military. Accordingly, the military professional is an officer who pursues a lifetime occupational career of service in the armed forces, where, to qualify as a professional, he must acquire the expertise necessary to help manage the permanent military establishment during periods of peace and to take part in the direction of military operations if war should break out. Career commitment and expertise, the hallmarks of any professional, set the professional military officer apart from those other personnel in the armed services who are merely carrying out a contractual or obligatory tour of duty or for whom officer status primarily represents, as it often did in former times, an honorific pastime into which military skill enters only as a secondary consideration.
Social organization and armed force
Throughout most of history the right to employ violence has been derived from membership in a special community or in one of its status groups. While societies everywhere have always regarded outsiders as legitimate targets for violence, societies whose internal relations are based on physical dominance of one group by another allow for fewer of the fine distinctions between brute force and other bases of social and political power. Thus the Roman legions both served imperial ambitions and became the major domestic political force. Indeed, war lords of all epochs have considered their armies a form of private property and have used
them to secure their tax base and to extend its boundaries. Under such conditions, the internal organization of the armed forces closely reflects the distribution of power within the society at large. In general, the more pervasive the prospect of violence against external or internal enemies of the regime, the more similar are the military and the civilian value hierarchies.
The concept of the military as a permanent establishment maintained solely in support of foreign policy objectives presupposes the development of a civil society based on consensus. In such a society, the armed forces are called upon to cope with domestic disorder only in extraordinary circumstances, this task being relegated largely to civilian police forces. However, the incapacity of party governments to resolve vexing internal problems, including an inability to mobilize the "home front" in support of national goals, has on many occasions led the military to do more than provide coercive power for use against external enemies. Their role in this regard has been especially important in those newly emerging nations whose civil institutions and sense of national identity have not yet had sufficient time to develop.
Professionalization of the military, with rank and authority granted on the basis of demonstrated competence rather than status, cannot evolve until the problem of military management has become separated from and subordinated to the more general problem of governing a society. Even so successful an innovator of strategy as Frederick II of Prussia, because he wanted to ensure the personal loyalty of his officers, insisted that they be drawn exclusively from the ranks of the aristocracy. Using this kind of power base inside the officer corps, the postfeudal nobility of many a European country was able to prolong its waning influence.
It did so by preserving within the military certain archaic sentiments, ceremonial practices, and ideological beliefs that supported the social superiority of officers, and then proclaiming this superiority valid for the society as a whole (Vagts 1937). Militarism of this sort, because it hindered rather than helped the growth of expertise, was a major impediment to the professionalization called for by advances in military technology. The possibility that certain strata of the society might use their monopoly of armed force to gain a disproportionate share of the values available within a society helps to explain why the right to bear arms has so frequently been declared one of the inalienable rights of a free citizenry. Such a right, when it becomes the prevailing military doctrine, may stand in the way of military efficiency.
In France the governments of the Third Republic, intent on curbing any possible political ambitions of military leaders, insisted on a short-term conscript force, and by this manpower policy they deprived the French army of the opportunity to develop a highly trained force. The doctrine of the "nation in arms" in this French version helped seal that country's fate when it had to confront better-trained and numerically superior German forces in two world wars.
Under modern conditions, lasting victory in war certainly can no longer (if it ever could) be achieved primarily through the sheer weight of a hastily assembled mass of manpower acting either under the command of their social superiors or of a very small professional cadre. Furthermore, a decided advantage now goes to the belligerent with the industrial and scientific base for developing more powerful weapon systems and with a labor force containing sufficient skilled personnel to maintain, repair, and replenish the products of military technology during hostilities.
As research, development, technical maintenance, and the organization of logistic support become more important elements in strategic planning, military managers are forced to pay more attention to the implications of economic, social, and political policy for the state of military preparedness. Hence, the events to which they must be responsive have increasingly more to do with scientific-technical capabilities and sociopolitical forces, and proportionately less to do with direct encounters on a clearly delimited battlefield. The traditional wall of separation between strategy (the explicit domain of the professional military) and policy (the explicit responsibility of civil government) comes to be breached at many points. This shift in perspective has been reinforced by the growing emphasis on deterrence, rather than counterviolence, as the major strategic goal of the nuclear powers. But similar tendencies are evident in the industrially backward nations, whose military leaders recognize that they must cr
weapon system before it can be operationally tested in battle. Thus there exist, within the military establishment, installations whose express function is to create and develop new and unorthodox concepts and procedures, including the application of computerized simulation techniques to the solution of strategic problems. In some ways, therefore, the military establishment begins to conform more to the pattern of a laboratory for testing the concepts and "hardware" underlying a new system, and relatively less to the pattern of a striking force whose permanent components are designed primarily to provide a basic framework for expansion in case of need.
Here, again, the traditional boundaries between the civilian expert and the military professional tend to become blurred, in particular as the influence of the civilian expert ceases to be confined to the design and development of basic military "hardware" but begins to extend to matters related to its application. Civilian specialists have carried major responsibility for the introduction into the military of new managerial methods and training devices based on scientific evaluation rather than on traditional concepts. Perhaps the most significant development, however, is to be found in the number of civilians, either in the direct employ of the military or under contract, who have come to play major innovating roles that extend to the domain of strategy, which is traditionally a military preserve.
Military organization
The impact of modern technology notwithstanding, all military organizations continue to operate within a context of considerable uncertainty. The authority structure, work routines, and conceptions of discipline in the armed forces must be geared to the ever-present possibility, no matter how remote, that every member of the organization, whatever his job classification, may be called upon to perform his normal duties under battle conditions.
Many specifically military practices explicitly express, or have as a latent point of reference, a concern about the capacity to make an adequate response under stress. The anxiety-reducing function of many routines is especially evident in the persistence of archaic ceremonial practices that have no apparent functional utility. These probably help instill confidence that, in the event of a crisis, officers and men at all levels of the organization will conduct themselves in a predictable manner.
Active warfare, moreover, is a highly seasonal occurrence that alternates with more or less prolonged periods of peace. As in the past, the military man must indulge in a certain amount of romanticism to justify his continuing dedication to the martial arts when no apparent need for them exists. Acclamation of selfless service to one's country as an ennobling ideal for all, emphasis on the manly virtues, and the sense of corporate eliteness implicit in these ideals have been basic ingredients of military esprit de corps. Thus, the tenacity with which European armies resisted motorizing their horse cavalry, even after its inutility in war had been fully demonstrated, derived not only from an aristocratic tradition, symbolized by the officer and his mount, but probably drew added vigor from a reluctance to countenance the replacement of heroic men by mechanized components. By the same token, the massive resistance of military traditionalists to the formation of a separate air arm, to the introduction of the aircraft carrier and the atom-powered submarine as strategic naval weapons, to the replacement of manned bombers by missiles, and so forth, contains elements of a defensiveness that seems to be characteristic of the military, although it reflects rigidities and vested interests of a kind likely to develop in any large and complex organization.
But military doctrines, in particular, are codifications of experiences gained in the past, experiences that are forever being reanalyzed. Since doctrinal modifications in time of peace cannot immediately be tested against new experiences, the remote advantages of change must inevitably be balanced against the confusion and uncertainty that attend reorientation of any sort. This organizational dilemma has been especially aggravated by acceleration of obsolescence. In this process, the reduction of "lag time" (the interval between the time a system becomes operationally feasible and its full acceptance by officers) becomes as important as the "lead time" between the drawing board and the operational stage.
The concern about remaining up-to-date creates a real danger of innovation for its own sake rather than as a rational adaptation to changed circumstances. In turning toward science as a source of new ideas, the military may, under the guise of modernity, be searching for the same kind of procedural solutions that it once embraced because they were traditional. Reliance by the armed services in their internal management on highly rationalized procedures and computerized systems diverts some of the uncertainties inherent in the possibility of military failure into a quest for internal order. Scientific innovation, especially when the assumptions behind its adoption are not constantly tested by experience, can degenerate into an obsession with the latest gadgetry, as divorced from reality as the prescientific forms of ceremonialism. Similarly, techniques as unorthodox, from a military point of view, as political warfare and counterinsurgency do not necessarily encourage objective evaluation of the limitations that political and social forces impose on the value of a strictly military success. To some enthusiasts within the military these techniques often appear merely as more effective alternatives to annihilation.
The military profession
Another effect of technological change has been to undermine the military profession's insularity, once the almost inevitable consequence of faraway missions, assignments at isolated posts, or duties on board ship, all of which tended to deprive officers of social contact outside their narrow professional world. Many tasks with which military personnel nowadays must cope as a matter of routine are only indirectly related to combat. Modern technology has so transformed the conditions of wartime service that to maintain a single soldier in combat takes many more men than it did when the martial arts were at a more primitive level. It follows that the most rapidly expanding military job categories are generally those involving scientific, technical, and administrative skills, categories for which there are near equivalents in the civilian economy. As a result, the experience gained during military service acquires transfer value for a subsequent career in civilian life, where these same skills are likewise in demand. In order to retain skilled personnel in military service beyond their obligatory tour, the armed forces must try to offer inducements comparable to those in alternative civilian employment.
Recruitment of officers.
The traditional, ascriptive pattern of recruitment, especially the time-honored practices of giving preference to sons of officers in the selection of applicants to officer schools, and of fostering among candidates and junior officers a unique professional culture, was calculated to discourage all those not highly motivated toward an officer career. But higher skill requirements have more recently led to a wider search for talent and have opened new opportunities for social ascent to many ambitious young men of relatively modest origin. Sons of enlisted men, once likely to have been disqualified from officership on purely social grounds, are no longer at this great disadvantage. In France, their proportion among new officers has nearly tripled since World War II (Girardet 1964, pp. 38 ff.). There has been a similar broadening of the officer recruitment base, mostly under the impact of technology, in nearly every country. The proportion of the officer corps
recruited from aristocratic and plutocratic elements has dropped off even more abruptly as political purges (especially in the Soviet Union, Germany, and many Latin American countries) have forced the separation or premature retirement of officers too closely identified with discredited political regimes.
Despite this general trend toward more representative recruitment, there are still many sons of officers who follow their fathers in choosing a military career and so to some extent maintain the social continuity of the occupation. For example, as opportunities for advancement were sharply curtailed in the contracted German army of the 1920s, the proportion of new officers who were sons of officers increased considerably. In the United States during the two decades after World War II, the proportion of "second generation" officers entering West Point remained at a nearly constant level of somewhat above 20 per cent (Janowitz 1964, p. 135). The significance of this occupational continuity is debatable; as in any occupation, the amount of intergenerational mobility depends in part on changes within the entire occupational structure. It may be noted that in 1950 about 20 per cent of practicing lawyers were sons of lawyers, law having the highest amount of intergenerational continuity of any occupation in the United States (Warkov 1965, p. 43).
But graduates of the major military schools, such as West Point, Sandhurst, and St. Cyr, have had, as a rule, stronger commitments to a military career and, partly for that reason, have contributed disproportionately to the higher officer ranks and leadership positions. Anticipatory socialization early in their family life, together with the experiences and contacts made in the academy, gives these officers a competitive advantage over others recruited directly from civilian life.
Hence a hard core of traditionally military families, where they exist, probably exerts greater influence than is indicated by gross figures on occupational continuity. Even in the United States, where such families have not been especially conspicuous, intergenerational continuity of occupation among top military executives seems to have been greater than among their civilian counterparts in the federal government (Warner et al. 1963, chapter 2). A significant ambiguity results from the fact that the officer corps is both a profession requiring the acquisition of certain skills and a corporate body through whose rank hierarchy each officer must advance. Many officers who have acquired educational and other professional skills of use to the military are not professional military men. In fact,
increases in officer allocations in recent years reflect in large measure the need for men qualified to take responsibility for complex equipment and to perform certain technical and administrative functions. While some old-fashioned armies, in order to provide positions for sons of the privileged classes, have had unusually high allocations of officers, in modern armies the increases have been greatest in branches with the most advanced technology and at levels of responsibility—usually intermediate ones—where experienced men with technical qualifications must be promoted in order to be retained. However, the authority of many officer specialists is severely limited, and in some instances their specific designation precludes advancement into positions of major responsibility. Also, the frequency with which officers in many specialties transfer out of the armed service into civilian employment indicates a primary commitment to a specialty that takes precedence over any commitment to the military. Hierarchy of command. The implications of the diversification of skills and specialties extend beyond the character of officership as a profession. Diversification also affects the internal authority structure of military organization. Traditionally, military authority has been both hierarchical and collegial. On the one hand, military discipline prescribes unquestioning compliance with orders passed down through an unambiguous line-of-command authority, with only the details of implementation left to the discretion of individual commanding officers. On the other hand, military discipline means more than automatic compliance: it subsumes the imperative, binding on every officer, to inspire one's subordinates by personal example and to cultivate among all officers a strong respect for professional norms. The presence of specialists injects the element of technical knowledge into these authority relationships. 
One source of strain stems from the fact that many unit commanders, even at the lower echelons, lack the technical knowledge necessary to direct all the diverse components for which they shoulder formal responsibility. Nor do they have officers on their staffs with sufficient knowledge. If such knowledge is not available at the level of the unit to which an individual officer is assigned, he has a strong incentive, when difficulties of a purely technical nature arise, to solicit information and advice directly from technical specialists attached to higher staffs. This enables him to resolve many routine difficulties while avoiding formal command channels and without involving his commanding officer in the details of every operational
problem. However, commanding officers who tolerate such informal trouble-shooting activities, which clearly deviate from prescribed procedures, run the risk of teaching disrespect for the chain of command. In fact, they may inadvertently be discouraging their officers from keeping them fully informed on all matters under their own command.
Another source of strain is that many functions and policies come under the central administration of a staff from which detailed directives emanate. These directives often leave little leeway to a local commander and may actually usurp some of his traditional prerogatives. Staff officers, by definition, have no command authority in their own name, but only as delegated. Yet relatively junior officers can, and sometimes do, informally exercise a considerable amount of de facto authority, simply by virtue of the esoteric wisdom with which their position on the higher staff endows them. The hypertrophy of this kind of staff authority was reached by the German army in World War I, where general staff officers, in control of their own network of communication, came to issue orders that at times completely countermanded directives from commanding generals whom they formally served only as staff advisers. This development was evidently fostered by the past practice of favoring members of royal houses for command positions, the consequences of which staff intervention was intended to redress. Nevertheless, central direction, even when it accords with the best technical principles, tends toward the creation of a dual system of authority and inevitably generates some anxiety among unit commanders about what authority they actually have. The desire of commanding officers to retain firm personal control even over matters that are centrally directed can be seen in the frequency with which they use whatever discretion they have in implementing a centrally issued directive in such a manner as to subvert its intended purpose.
Strain also arises from the need to coordinate the activities of lower-echelon individuals or units that are components of different hierarchies. When the recognition of a work relationship does not in itself induce spontaneous collaboration, there can be considerable concern over limits of competence and of formal authority. Another version of this problem exists in the staffs of supernational forces, where the separate military hierarchies represented are associated with disparities of national power. Staff cooperation in NATO headquarters is said to have suffered considerably from the capacity of some officers to compensate for any lack of rank or formal authority by making use of the power of
the nation they represented. Certainly, U.S. advisors in Vietnam were often able to use their country's control over certain weapons to gain compliance with their decisions, even when they were clearly outranked by their Vietnamese counterparts. Still a third and somewhat analogous version of this conflict, based on ideological disparity, has occurred between professional military officers and political commissars. Where the latter represent the political regime at the unit level, and are therefore assured of outside political support, they are often in a position to countermand certain orders of their nominal superiors if they wish to do so.
Combat
No authority structure can by itself ensure spontaneous cooperation under battle conditions, where confusion is inevitable and improvisation and unorthodox solutions are frequently called for. The makeshift character of front-line living arrangements inevitably gives rise to serious deviations from procedures learned in the training camp. A great deal has in fact been written on the displacement of motives to the immediate group: the "comrades" or "buddies" with whom each soldier shares the experience (Grinker & Spiegel 1945; Stouffer et al. 1949, vol. 2; Mandelbaum 1952; Janowitz 1964, pp. 195-224). This sense of solidarity, which in some ways extends to all combat men, usually engenders strongly deprecatory attitudes toward those echelons of lesser risk from which most regulations emanate. To the degree, therefore, that individual and interpersonal motives become determining factors, military units in combat tend to assume some of the characteristics of a primitive mass formation. The capacity of this formation to absorb stress is highly contingent on the strength of shared sentiments. Where organizational authority does not enjoy legitimacy, strong sentiments of this sort can facilitate the rapid spread of deviant tendencies.
However, the prevalence of a sense of generalized obligation lends legitimacy to punitive discipline when it is invoked as a last resort.
The unavoidable presence of physical risk is a major source of disruption in combat units. Detailed investigations of the behavior of ground combat soldiers have convincingly documented the reluctance of many riflemen to discharge their weapons against a visible enemy target: during any single encounter, only a minority were found to have fired, irrespective of the chances the men had to engage the enemy (Marshall 1947; for the sources of the following remarks on reaction to combat stress, see Janowitz 1959, chapter 4; Lang 1965a). Evidently success in such encounters does not depend on the performance of every individual. Indeed, among U.S. interceptor pilots serving in Korea, only a small minority of aces accounted for an overwhelming majority of all enemy planes shot down, and most fliers were not even credited with a single plane. Air superiority was nevertheless maintained. Containment of deviance. The old-fashioned practice of severely disciplining some men for "cowardice" to deter others from failing in their duty under fire at best promotes token compliance when opportunities for escape are lacking. As a means of instilling the motivation necessary for superior performance, it is hardly effective. Yet, under conditions of modern warfare, much depends on the initiative displayed by individuals operating in small units relatively removed from the influence of formal control. The problem under these conditions is how to contain deviance within certain tolerable limits so that it does not disrupt organizational effectiveness. Even in the normal engagement many men will not measure up to par. Since enemy fire causes casualties, rates of desertion, dereliction from duty, psychoneurotic breakdown, and other forms of deviance invariably begin to rise either after a prolonged stretch of uninterrupted combat or after an engagement in which a unit suffers particularly heavy losses. In these circumstances, any break in efficiency has cumulative implications because it tends to impair the motivations and efficiency of other men.
The major role in the control of deviance is increasingly being assigned to the medical specialist. In acknowledging anxiety in battle to be a natural and normal reaction, military psychiatry in general, but American and British military psychiatry in particular, has gone a long way toward treating its disruptive effects on behavior as primarily a medical and only secondarily a disciplinary problem.
Although the literature provides regrettably few studies that permit reliable historical or cross-national comparisons, prevailing psychiatric theories certainly suggest that the reliance on rigid disciplinary controls would produce more lasting mental damage, with chances for ultimate recovery much diminished. In World War I, the number of severely impaired shell shock cases was certainly greater than in World War II, with its more enlightened practices of military medicine. Japanese soldiers in World War II, subject to the most unyielding discipline and supposedly indifferent to their own survival, seemed especially prone to severe attacks of hysteria. The possibility of culturally patterned reactions expressing differences in national character, especially tolerance for
anxiety, cannot, of course, be ruled out. Yet the apparent severity of the reactions among Japanese troops may have been provoked by the strong sanctions against the open expression of anxiety in any form. Organizational correlates of breakdown. The availability of a legitimate medical evacuation channel has important implications for organizational behavior. Thus, some psychiatrists have pointed out that a collective belief among American troops in World War II in an objective "breaking point" beyond which a person could not go on may actually have contributed to an increase in the number of psychiatric breakdown cases who requested evacuation because of a typical symptomatology. Neuropsychiatric breakdown was far less frequent among British troops in North Africa, who, unlike the Americans, had no expectations of being permanently repatriated until the end of the war, but whose combat was interrupted by frequent periods of rest. Similarly, there were no neuropsychiatric casualties among South Korean troops until after their integration with American forces, when these same evacuation channels became available to them. Yet they had previously exhibited many other kinds of ineffectiveness.
The point is that evacuation statistics reflect not only psychiatric malaise but also a complex decision process. For the soldier who has had enough, the use of the evacuation channel with the approval of a psychiatrist offers a legitimate alternative to self-mutilation, letting himself be taken prisoner, temporary desertion, and other forms of escape. Thus, during the rapid retreat by U.S. troops from their advanced positions on the Yalu River during the winter of 1950/1951, a period of evident stress, the neuropsychiatric casualty rate exhibited a marked decline. Soldiers could not rely on evacuation, for all medical facilities were severely overtaxed. Even when ready to give up, they had a strong incentive to remain with their unit simply to avoid being captured or killed.
Similarly, desertion, which had been a major problem in Europe with major cities nearby, was practically nonexistent among American troops engaged in island-hopping operations against Japan. Although comparable data from other nations are not available, it is clear that the American combat soldier in World War II was inclined to take a rather lenient view of temporary desertion, consistent with his generally tolerant attitude toward a man who was suffering from symptoms of fear which he had made a genuine effort to control. One distinguishing characteristic of men who became
neuropsychiatric casualties was their tendency, on the average, not to entertain favorable attitudes toward the less legitimate forms of escape provided by unauthorized absence from the battlefield. Conversely, many men guilty of desertion in combat left their units only after they had on one or more previous occasions been refused medical evacuation.
There are indeed indications that the two forms of escape are in some respects interchangeable and also that the decision on whether a man is entitled to medical evacuation or should be returned to his unit is not only medical but nearly always involves judgments based on organizational criteria. The effect any disposition may have on the morale of the remaining men can rarely be kept from intruding into such decisions. If the tactical situation permits, a man's prior record of good performance can earn him evacuation for symptoms that might send another man back to the front. Particularly, officers and noncommissioned officers who carry responsibility for other men, whose safety might be jeopardized by their continued presence, are more readily evacuated (the technical reports on which the foregoing remarks are based can be found summarized in Janowitz 1959; Lang 1965a).
Units in combat are undergoing a constant process of attrition and replenishment, as evidenced by the continuous turnover in personnel. But the maintenance of logistic and organizational support is probably more important for maintaining the effectiveness of larger units than is keeping a particular man in battle, especially if he is suffering from evident symptoms of stress. Viewed in this context, military psychiatry as practiced nowadays reflects the same shift in orientation toward warfare that is often noted in connection with strategic planning: the long-term conservation and management of national resources and talent has become a more important military asset than victory in almost any local encounter.
Again the picture of the whole world as a potential battlefield and of the possible involvement of whole populations is reflected in practices that reach into the lower units. Understanding of combat goals is clearly essential to understanding the military and its organizational practices. Yet the battlefield itself is undergoing change, and the specific missions assigned to the military are changing with it. The new forms of warfare, including ideological war and nuclear deterrence, lead to new priorities in the mobilization of men and resources. Hence,
both the relationship between the armed forces and society and the internal structure of the military will undergo further change.
KURT LANG
[Directly related are the entries CIVIL-MILITARY RELATIONS; INTERNMENT AND CUSTODY; MILITARISM; MILITARY POLICY; MILITARY PSYCHOLOGY; NATIONAL SECURITY; WAR. Other relevant material may be found in ECONOMICS OF DEFENSE; INTELLIGENCE, POLITICAL AND MILITARY; MILITARY LAW; MILITARY POWER POTENTIAL; SCIENCE, article on SCIENCE-GOVERNMENT RELATIONS; STRATEGY; and in the biographies of CLAUSEWITZ; DOUHET; MAHAN.]
BIBLIOGRAPHY
ANDRZEJEWSKI, STANISLAW 1954 Military Organization and Society. London: Routledge.
DEMETER, KARL (1962) 1965 The German Officer-corps in Society and State, 1650-1945. New York: Praeger. -> First published in German.
FOOT, MICHAEL R. D. 1961 Men in Uniform: Military Manpower in Modern Industrial Society. New York: Praeger.
GIRARDET, RAOUL 1953 La société militaire dans la France contemporaine: 1815-1939. Paris: Plon.
GIRARDET, RAOUL 1964 La crise militaire française, 1945-1962: Aspects sociologiques et idéologiques. Paris: Colin.
GRINKER, ROY R.; and SPIEGEL, JOHN P. 1945 Men Under Stress. Philadelphia: Blakiston. -> A paperback edition was published in 1963 by McGraw-Hill.
GUTTERIDGE, WILLIAM F. 1965 Military Institutions and Power in the New States. New York: Praeger.
JANOWITZ, MORRIS (1959) 1965 Sociology and the Military Establishment. Rev. ed. New York: Russell Sage Foundation.
JANOWITZ, MORRIS 1960 The Professional Soldier: A Social and Political Portrait. Glencoe, Ill.: Free Press. -> A paperback edition was published in 1965.
JANOWITZ, MORRIS (editor) 1964 The New Military: Changing Patterns of Organization. New York: Russell Sage Foundation.
JOHNSON, JOHN J. (editor) 1962 The Role of the Military in Underdeveloped Countries. Princeton Univ. Press. -> Papers presented at a conference sponsored by the RAND Corporation at Santa Monica, Calif., in August 1959.
LANG, KURT 1965a Military Organizations. Pages 838-878 in James G. March (editor), Handbook of Organizations. Chicago: Rand McNally.
LANG, KURT 1965b Military Sociology: A Trend Report and Bibliography. Current Sociology 13, no. 1.
MANDELBAUM, DAVID G. 1952 Soldier Groups and Negro Soldiers. Berkeley: Univ. of California Press.
MARSHALL, SAMUEL L. A. 1947 Men Against Fire: The Problem of Battle Command in Future War. Washington: Infantry Journal.
MILLIS, WALTER; MANSFIELD, HARVEY C.; and STEIN, HAROLD 1958 Arms and the State: Civil-Military Elements in National Policy. New York: Twentieth Century Fund.
STOUFFER, SAMUEL A. et al. 1949 The American Soldier. Studies in Social Psychology in World War II, Vols. 1 and 2. Princeton Univ. Press. -> Volume 1: Adjustment During Army Life. Volume 2: Combat and Its Aftermath.
VAGTS, ALFRED (1937) 1960 A History of Militarism: Civilian and Military. Rev. ed. London: Hollis & Carter.
WARKOV, SEYMOUR 1965 Lawyers in the Making. Chicago: Aldine.
WARNER, W. LLOYD et al. 1963 The American Federal Executive: A Study of the Social and Personal Characteristics of the Civilian and Military Leaders of the United States Federal Government. New Haven: Yale Univ. Press.
MILITARY LAW

Military law, in the sense of a distinctive body of law relating to the armed forces and their activities, is probably as old as law and war themselves, which is to say about as old as organized human polities. The term embraces the codes that govern the members of a nation's armed forces (military justice), the relationship of the military to the civilian community (martial law or military government), and the conduct of belligerents in time of war (the law of war). In all of these areas the military, independently of civilian magistrates, may exercise some degree of jurisdiction, conferred by domestic legislation or international law or a combination of the two.

Military justice

It may be surmised that pre-Roman military codes amounted to little more than the complete subjection of the soldier to the will of the commander, the deprivation of whatever right to due process he might otherwise have had as a citizen. This was certainly the basic principle of Roman military law; the field commander and his delegates were empowered to inflict any punishment for any offense, military or civilian. The common military offenses were in general akin to those proscribed by modern codes: desertion, cowardice, insubordination, and the like. As was the case with most military codes until comparatively recent times, punishment was swift, severe, and often brutal. Corporal and capital punishment were very freely inflicted (in part, no doubt, because imprisonment was not available as an alternative), and resort was occasionally had to decimation and other punitive measures that have long been obsolete. Other Roman military penalties, such as ignominious expulsion, reduction in rank, and forfeiture of pay, are still standard features of military justice.

Medieval military justice was as simple and crude as medieval tactics and logistics: Richard I's
Ordinance of 1190, intended to deter theft and fighting among his Crusaders, is probably a representative specimen. It provided, inter alia: "Whoever shall slay a man on shipboard, he shall be bound to the dead man and thrown into the sea. If he shall slay him on land, he shall be bound to the dead man and buried in the earth." Procedural provisions were entirely lacking; the court-martial, as a distinct tribunal, had not yet evolved. It probably traces its ancestry to the court of chivalry of the later Middle Ages.

In late medieval and Renaissance times, there appeared in western Europe more elaborate and sophisticated military codes, in part derived from Roman precedents and in part based on the laws and customs of the Franks and other German nations. The best-known of these is the Constitutio Criminalis Carolina, promulgated by the Emperor Charles V in 1532, which served as a model for a number of other European codes. One of these was the Articles of War of Gustavus Adolphus, dated 1621, which was translated into English shortly before the English Civil War and may fairly be described as the direct ancestor of modern British and American articles of war, to which the following description is chiefly directed. The military codes of many other nations are, however, derived from, or strongly influenced by, the Anglo-American model, and even those which are based on other sources and traditions have many points of generic resemblance.

The basic reasons for the existence of a separate system of military justice may be summarized as (1) the need for swift and summary machinery for the maintenance of discipline; (2) the fact that the adjudication of military crimes may require military expertise by the court; and (3) the fact that the armed forces may be stationed abroad, outside the jurisdiction of their country's civil courts.
The English Articles of War, from Richard I's Ordinance to James II's detailed Articles of 1688, were wholly exercises of the crown's prerogative, having no parliamentary sanction and, in time of peace, no lawful application in domestic territory. Military crimes, military punishments, and military courts had no place in the common law; in Macaulay's words, "A soldier, therefore, by knocking down his colonel, incurred only the ordinary penalties of assault and battery, and by refusing to obey orders, by sleeping on guard, or by deserting his colours, incurred no penalty at all" ([1849-1861] 1953, vol. 1, p. 222). Given the necessity of a standing army, such a situation was intolerable. Parliament dealt with it by the Mutiny Act of 1689, which permitted courts-martial to punish mutiny, sedition, or desertion by death or
such lesser penalty as the court might adjudge. For nearly two centuries, the Articles of War, applicable only to troops stationed abroad, and the Mutiny Act, annually re-enacted, existed side by side. Not until 1881 were the two jurisdictions fused.

No such dichotomy exists in the history of the American Articles of War, which have always been statutory. Congress enacted the original articles, largely borrowed from the British, in 1775. Since the adoption of the constitution their enactment has been an exercise of Congress' power (art. I, sec. 8) "to make Rules for the Government and Regulation of the land and naval Forces." There have been several revisions, mostly with war in prospect or its lessons in retrospect. The principal ones are those of 1776, 1786, 1806, 1874, 1916, and 1920. In 1950 the Articles of War and the Articles for the Government of the Navy (basically similar, although differing in a number of details) were superseded by the Uniform Code of Military Justice, which applies alike to the army, navy, and air force. Like its predecessors, the Uniform Code specifies the persons who are amenable to military jurisdiction, defines offenses, prescribes punishments, and establishes trial and appellate procedure for courts-martial.

Jurisdiction over persons. Courts-martial of the United States, like those of most nations, exercise criminal jurisdiction primarily over members of the armed forces on active duty, including cadets and midshipmen, in both peace and war. The U.S. code followed previous articles in subjecting to military law civilians accompanying or serving with the armed forces, such as dependents of military personnel and civilian employees, in time of war or outside the United States.
Similarly the code attempted to deal with a serious problem that developed during and after World War II: it provided for the court-martial of discharged servicemen for serious offenses committed while the accused was subject to military law, offenses which could not be tried in an American court (for example, a murder committed in a foreign country). In a series of major decisions between 1955 and 1960, however, the Supreme Court held that Congress could not constitutionally subject to military jurisdiction either an honorably discharged serviceman who had severed all connection with the military or, in time of peace, any civilian. Whether such jurisdiction may be exercised in time of war and whether it is constitutional in regard to certain categories of quasi civilians (such as retired regulars, some reservists, and dishonorably discharged prisoners in military custody) must, in the
light of these decisions, be regarded as open to question. In time of war, courts-martial are empowered to try "any person" for aiding the enemy or espionage; the same constitutional question might, however, be raised if the act took place in the United States and particularly if the accused were an American citizen.

Military jurisdiction is not exclusive. If an offense committed by a soldier is denounced by both the code and state or federal law, he may be tried by either a court-martial or a civilian court. Moreover, since the constitutional prohibition against double jeopardy bans only a second trial by the same sovereign for the same offense, he may be tried by both a court-martial and a state court or (if the act embraces two distinct offenses, for example, the civilian offense of assault and the military offense of striking a superior officer) by both a federal court and a court-martial. As a general rule, however, the policy of the military, in peacetime and within the United States, is to leave to civilian justice a soldier who commits a civilian offense that does not directly affect the military.

Soldiers stationed in friendly foreign countries are likewise subject to the jurisdiction of both courts-martial and the local civilian courts, although in the absence of agreement between the two sovereigns the jurisdiction of the host country is primary. In practice, the matter has been regulated by so-called status-of-forces agreements, which usually give American courts-martial primary jurisdiction over purely military offenses and other offenses not involving citizens, property, or other interests of the host nation.

Jurisdiction over offenses.
Courts-martial have, of course, long had jurisdiction over the traditional military offenses, such as desertion, absence without leave, mutiny, insubordination, disobedience, misbehavior before the enemy, drunkenness on duty, and a few whose interest is largely antiquarian, such as improper use of a parole or countersign, forcing a safeguard, and dueling. In 1916 American courts-martial were given jurisdiction in both war and peace over virtually all the common law crimes except the capital offenses of rape and murder committed in the United States in peacetime. The Uniform Code made even these offenses triable by court-martial. But court-martial jurisdiction of offenses is in reality still broader, for the "general article," which has no real analogue in civilian penal codes, covers "crimes and offenses not capital," which means acts that are not specifically covered by other articles of the code but are made criminal by other federal laws. Still more broadly, it denounces "disorders and neglects
to the prejudice of good order and discipline in the armed forces" and "conduct of a nature to bring discredit upon the armed forces." The military authorities have traditionally been accorded broad, although not unlimited, discretion in giving content to these vague phrases; in practice almost any violation of the law of a state or foreign country can be fitted into one or the other category or, in the case of officers, cadets, and midshipmen, into "conduct unbecoming an officer and a gentleman."

Punishments. Courts-martial may impose a number of distinctively military punishments, such as dishonorable or bad-conduct discharge, reduction in rank or grade, forfeiture of pay, and reprimand. They may also impose sentences of death, imprisonment, and fine. In time of peace the death penalty is limited to mutiny, sedition, murder, and rape. In wartime, since the maintenance of discipline by deterrence is still the prime object of military justice, death may be inflicted for a number of military offenses, of which the chief are desertion, assaulting or willfully disobeying a superior officer, misbehavior before the enemy, aiding the enemy, and espionage; for the last of these, the death sentence is mandatory. (It should be remarked, however, that during and since World War II there appears to have been only one case, involving a second desertion, in which an American soldier was actually executed for a military offense.) These punishments may be, and commonly are, combined in a single sentence, for example, dishonorable discharge, forfeiture of pay, and confinement at hard labor for a term of years. General courts-martial may impose any authorized penalty. Special courts-martial are in substance limited to bad-conduct discharges and six months' imprisonment, and summary courts, to a month's confinement, plus corresponding forfeiture of pay.
Although a court-martial may not inflict the death sentence unless explicitly authorized by the code itself, Congress has placed all lesser sentences within the court's discretion; in strict theory a soldier might be sentenced to life imprisonment for addressing rude language to his sergeant. The president, however, is empowered to prescribe maximum punishments for those offenses for which Congress has set no mandatory punishment, i.e., all save espionage (death) and premeditated or felony murder (death or life imprisonment). The "Table of Maximum Punishments," contained in the Manual for Courts-Martial, places limits that approximate civilian norms on the penalties for specific crimes. Corporal punishments, notably flogging and branding, which were a distinctive feature of military justice until well into the nineteenth century, have long been expressly prohibited, as has "any other cruel or unusual punishment."

The military authorities maintain their own prison system, which in the case of the army comprises post stockades (for minor offenders) and disciplinary barracks, for those military convicts who are judged capable of rehabilitation and ultimate restoration to duty; others serve their confinement in federal reformatories and penitentiaries. If the authorities conclude that salvage is feasible, the execution of a punitive discharge will normally be suspended and ultimately, if the prisoner behaves himself, remitted.

Procedure. The Uniform Code regulates in detail the appointment, composition, jurisdiction, procedure, and appellate review of courts-martial. Authority to convene a general court-martial is normally given only to a commander of a division, a task force, or a comparable unit; a special court-martial may be convened by the commander of a regiment or of a naval vessel; and summary courts-martial, by commanders of detached companies. (Authority to convene a general or special court includes authority to convene the inferior types.) A general court consists of not fewer than five members plus a law officer; a special court, of not fewer than three members; and a summary court, of a single officer. Prior to the enactment of the Uniform Code, members of a court-martial were required to be commissioned officers, but an accused enlisted man may now request that a minimum of one-third of the members be enlisted men, a privilege rarely exercised in practice.

Traditionally the procedure of military courts has been swifter and more summary than that of civilian criminal courts; according to Macaulay again, "a summary jurisdiction of terrible extent must, in camps, be entrusted to rude tribunals composed of men of the sword" ([1849-1861] 1953, vol. 2, p. 414).
This splendid rhetoric is not applicable to the Uniform Code, whose protection of the rights of the accused is so extensive that some critics fear that it would be unworkable in time of war. Although the procedure of a court-martial differs in many respects from that of a civil court, an accused has, at least in a general court-martial, virtually all the substantial protections that he would have in a civil proceeding and even some that he might not have in many civil courts. The functions of the law officer (who must be a lawyer and is usually a member of the judge advocate general's corps, hence a specialist in military law) are roughly analogous to the functions of a trial judge; and the functions of the court-martial members, to those of the jury; the principal difference is that the members not only determine guilt or innocence but also assess the sentence. The rules of evidence are approximately the same as in a criminal trial in a federal court. The Uniform Code, like the bill of rights, prohibits compulsory self-incrimination, double jeopardy, and cruel or unusual punishments; the charges must be investigated before the accused is arraigned; he must be apprised of the charges against him; he is entitled to counsel of his choice and to compulsory process to obtain defense witnesses. His counsel must be a qualified lawyer, at least in a trial before a general court-martial, and pressure upon courts by commanders is forbidden, although it is not always easy to eradicate such influence. The principal differences are that the military accused is not entitled to bail and that he may be convicted and sentenced by vote of two-thirds of the members, although unanimity is required for a death sentence.

Appellate review under the code is probably more extensive than is common in civilian practice. All findings and sentences (other than acquittals) must be approved by the authority who convened the court-martial, after review of the record for legal sufficiency by his staff judge advocate. The convening authority may order a rehearing, or order the charges dismissed, or modify or remit the sentence in the exercise of clemency. Sentences that, as approved, include death, a punitive discharge, or confinement for a year or more are referred to a board of review (whose members must be lawyers) in the office of the judge advocate general of the service concerned. Death sentences or sentences involving general or flag officers require presidential approval; the dismissal of a commissioned officer requires the approval of the secretary or an assistant secretary of the department.
At the top of the military appellate structure is the Court of Military Appeals, whose creation was the major innovation of the Uniform Code. Considering that the armed forces include some two million men, most of them in the age brackets in which crime is most frequent, it is perhaps the most important court of criminal appeal in the United States. Its three judges, who must be lawyers, are appointed by the president "from civilian life" and have no connection with the military establishment. They must review all death sentences and any cases certified to them by the judge advocate general of any of the services and may grant review of any case passed upon by a board of review. Their jurisdiction is otherwise essentially like that of a civilian appellate court, including power to set aside any conviction in which they find errors of law or insufficient evidence. Since its inception, the court has proved to be no rubber stamp. It has developed and applied a concept of "military due process," derived from both the code and the constitution and much influenced by the Supreme Court's holdings on constitutional due process. Experience so far indicates that a court-martial in which there has been substantial unfairness is unlikely to survive review by the Court of Military Appeals, although it should be recalled that its powers of review are in effect limited to fairly serious cases, those involving punitive discharges or confinement for a year or more. It is probable that many inferior courts-martial are still somewhat summary in operation, as are many inferior civilian criminal courts.

Although there can be no direct appeal to a civilian court from the decisions of the military reviewing authorities, there may be collateral review of the court-martial's jurisdiction by such proceedings as petition for a writ of habeas corpus in a federal district court or suit for pay in the Court of Claims.

Military nonjudicial or "disciplinary" punishments may be imposed by commanding officers without trial for offenses not deemed sufficiently serious to require reference to a court-martial. Depending on the rank of the commander and the offender, such punishments may include forfeitures, reduction in grade (of enlisted men), and comparatively short periods of additional fatigues or confinement. While the accused can always demand trial by court-martial, he is usually well advised to accept company punishment in lieu thereof. Since punishments equivalent to (and in some cases slightly greater than) those within the competence of a summary court-martial may be imposed under the code's article on disciplinary punishment, the summary court has become more or less obsolete.
Martial law

The term "martial law" describes the exercise of military force to preserve order and insure the public safety in domestic territory in a time of emergency, when the civilian authorities are unable to deal with the situation. In one form or another, under such names as "state of siege" or "state of emergency," the concept is found in every country. In some countries it is almost the normal type of government. In Anglo-American law, its only proper purpose is to restore order with a view to the restoration of civilian government, and the degree to which the military may properly assume
governmental functions depends entirely on the needs of the situation. In its mildest form martial law may amount to no more than the employment of troops, in aid of and under the direction of the civil authorities, to supplement the regular police in the control of riots and other public disorders and the enforcement of the law, as was done in connection with integration of the schools in Arkansas and Mississippi. At the other extreme, if the emergency is great enough, such as actual or imminent invasion, the military authorities may assume all the functions of government, including the legislative and judicial. In such a situation statutes and even the constitution may be suspended and replaced by ordinances of the military commander, and the civilian courts superseded by military tribunals. Such courts, although they bear a generic resemblance to courts-martial, are not bound to follow the same procedure, but may employ whatever rules are called for by the needs of the emergency. The best-known example of such a situation in recent American history is the declaration of martial law in Hawaii immediately after the Japanese attack on Pearl Harbor. Martial law is nowhere explicitly mentioned in the constitution but is simply an inherent attribute of sovereignty, the right of every government to take whatever steps are necessary for its own preservation. As such it is a part, although an extraordinary part, of the common law. Although the constitution does not explicitly either authorize or limit the executive's invocation of martial law, it is now well established that there are constitutional checks upon the exercise of this power. To the extent that the measures of martial law encroach upon the citizen's rights under state and federal constitutions, the civil courts have jurisdiction to determine whether the measures taken are in fact commensurate with the emergency and to annul them to the extent that they are more drastic than the court deems requisite. 
Although the courts are usually disposed to give considerable weight to the executive's judgment of the crisis, there are numerous cases in which they have found martial law measures to be unjustified. Most such cases have involved the governors of states (some of whom have been tempted to use martial law whenever a political goal could not be achieved by lawful methods), but the Supreme Court has on occasion applied the same test to the exercise of war power by the president and Congress. One famous instance is Ex parte Milligan (71 U.S. 2, 1867), decided shortly after the Civil War, in which the Supreme Court freed a Copperhead leader who had been sentenced to death by a military commission
in Union territory at a time when the civil courts were open and functioning normally. Another is Ex parte Endo (323 U.S. 283, 1944), in which the court, having previously upheld most of the restrictive measures applied to American citizens of Japanese descent in World War II, finally concluded that certain relocation measures, involving drastic interference with normal constitutional rights, could not be justified by military need.

Military government

Military government is a belligerent's exercise, through its armed forces, of governmental powers in the conquered and occupied territory of another nation. It presupposes actual, effective physical control of the territory in question and ability to impose on its inhabitants the will of the military commander. International law recognizes the occupant's right to govern the conquered territory, but it also imposes restrictions on the exercise of this right. The principal limitations are embodied in the Hague Convention of 1907 and the 1949 Geneva Convention Relative to the Protection of Civilian Persons in Time of War. Essentially, the occupant is required to preserve public order and to respect the laws in force "unless absolutely prevented," a phrase of ambiguous content, which occupying powers have traditionally construed extremely broadly. The persons and property of inhabitants of the occupied territory are accorded some basic protections. Such aspects of former military occupations as pillage and the taking of hostages are forbidden. While the occupant may impose taxes and levy contributions for the needs of the occupying forces, he is bound also to defray the usual costs of government in the occupied territory. He may promulgate such ordinances as are reasonably necessary to protect his forces and govern the territory and may try violations thereof in his military government courts.
Such trials must meet elemental standards of fairness, such as giving the accused the right to counsel and an interpreter and the right to call witnesses. Military government courts may supersede the indigenous courts and, to the extent authorized by the military governor, exercise criminal and civil jurisdiction over all persons in the occupied territory, including citizens of the occupying power and even members of its forces. In practice, matters which do not affect the interests of the occupant, its forces, or its nationals are usually left to the local courts—if they are open and can be trusted not to discriminate against persons friendly to the occupant. The difficulty lies in the enforcement of the
rules designed to protect the inhabitants of occupied areas. It is common knowledge that in World War II, German, Japanese, and Russian occupations of conquered territory were marked by atrocities on an enormous scale. Such violations can, of course, be punished as are other violations of the law of war, discussed below, but for the duration of hostilities the degree of the occupant's compliance with the rules of international law regarding belligerent occupation must depend on his conscience and his estimate of the likelihood of retribution.

In normal circumstances an occupation endures until it is ended by the expulsion of the occupying forces, the annexation of the territory by a victorious belligerent, or the conclusion of a treaty of peace. The abnormally prolonged occupation of Berlin after World War II results simply from the inability of the victorious allies to agree on a peace treaty or any other method of bringing the occupation to an end.

Upon the termination of the occupation and the return of the legitimate sovereign there are always difficult questions concerning the extent to which the restored courts and other authorities of the occupied nation should treat as valid the executive, legislative, and judicial acts of the occupant, the so-called problem of postliminium. In principle, those acts of the occupant which were within its lawful powers ought to be accorded as full force and effect by the returning sovereign as they would be if done under its own authority. In practice this has usually been done only with respect to routine governmental acts, such as the collection of normal taxes and the punishment of common crimes.
Where an occupant has inflicted punishments, such as fines and imprisonment, for acts directed against the occupying forces, the returning sovereign can hardly be expected to give them effect, unless it has by treaty obligated itself to do so, as the Federal Republic of Germany did when the Western allies terminated their occupation of its territory.

Historically the powers of the United States as a military occupant have been exercised by the president as commander-in-chief of the armed forces, usually through the highest military commander in the occupied territory as the military governor thereof, but occasionally (as in Germany after 1949) through civilian officials. Congress has never attempted to control the president's discretion, and it is doubtful that it could constitutionally do so. Also uncertain is the extent, if any, to which the bill of rights and other parts of the constitution apply to the acts of American military government in foreign territory. The Supreme
Court held in 1900 that the constitution had no application to a criminal trial in occupied Cuba, even though the accused was an American citizen. This must still be regarded as the orthodox view, although in recent years some members of the court have suggested that the powers of American military authorities in occupied territory are subject to the more basic provisions of the bill of rights.

It may be that the legal problems of military government are likely to be less important in the future than they have been in the past. They can be circumvented by the establishment of a friendly, or even a puppet, indigenous government and the recognition of that government as the legitimate sovereign. What would otherwise be occupation forces thus become merely visiting forces in the territory of a friendly power. The Soviet Union, although it did not invent the technique, brought it to a high degree of perfection after World War II, and the lesson has not been lost on other powers.

The law of war

The law of war comprises that branch of international law which governs the rights and obligations of belligerents. Its basic object is to protect combatants and noncombatants from unnecessary suffering and to safeguard the fundamental human rights of the victims of war, such as prisoners of war, the wounded and sick, and civilians, including the inhabitants of occupied territory. Its basic problem is to reconcile that policy with military necessity.

Belligerents in ancient times seem to have recognized few, if any, rules; the attitude of the Greeks and Romans is accurately summarized in Cicero's maxim, Silent enim leges inter arma ("When men fight, laws have no voice"). Customary limitations upon the conqueror's freedom to massacre, enslave, and loot began to evolve in the Middle Ages and to take coherent form in the sixteenth and seventeenth centuries, as exemplified by the writings of Hugo Grotius.
The first really important and influential attempt to codify the rules recognized by the consensus of civilized nations was probably Francis Lieber's "Instructions for the Government of the Armies of the United States in the Field," drafted at President Lincoln's request and promulgated as a general order of the War Department in 1863. Lieber's "Instructions" were followed by a number of similar unilateral declarations by Italy, Russia, France, and other countries. The Brussels Conference of 1874 marked the beginning of an effort to embody the customary laws of war in international treaties, an effort which
ultimately led to the Hague conventions of 1899 and 1907. The Hague conventions, still a basic source for the laws of war, have been supplemented by a number of other major treaties to which practically all of the great powers have acceded (sometimes with particular reservations). The most important of these are the Geneva Conventions of 1949 relating to the wounded and sick, prisoners of war, and the protection of civilians in time of war. The law of war as established by custom and treaty, to the extent that it is observed and enforced, is calculated to mitigate the hardships of conventional war. Enforcement, however, is entirely unilateral. There is as yet no permanent international tribunal with jurisdiction to punish war crimes. The Nuremberg and Tokyo international tribunals, which tried violations of the laws of war committed by the German and Japanese defendants, were created ad hoc. If a belligerent is guilty of war crimes, the nation aggrieved may resort to protest and ultimately to reprisals in kind. The law of war may also be enforced by the punishment of captured violators. The Geneva Conventions of 1949 obligate each signatory to search out and try in its own courts persons guilty of "grave breaches" thereof, a provision which is essentially a declaration of the prior customary law. In theory, jurisdiction to try war crimes is universal—that is, any nation having physical custody of a war criminal may try and punish him. In practice, nations have not typically displayed much diligence or zeal in trying war criminals of their own nationality. Neither have they typically been concerned to punish violations which did not directly affect their own interests. 
The post-World War II trials of some German war criminals in the courts of the Federal Republic of Germany, although many criminals have gone unpunished and others have received sentences hardly commensurate with the enormity of the crimes, probably constitute the outstanding example of a nation punishing its own war criminals. In most of the cases in which war criminals have actually been tried and punished, the trials have taken place in the courts of a victorious enemy. Typically such jurisdiction is exercised by military tribunals; this has always been the practice of the United States. After World War II the United States tried before military commissions—tribunals akin to courts-martial but not bound by such rigid rules of procedure and evidence—many Germans and Japanese charged with offenses against prisoners of war or civilian nationals of the United States and its allies. The jurisdiction of military commissions to try such
offenses was upheld by the Supreme Court in Ex parte Quirin (317 U.S. 1, 1942), the case of eight saboteurs landed on Long Island and in Florida by German submarines in 1942, and In re Yamashita (327 U.S. 1, 1946), which concerned the trial of the commanding general of Japanese forces in the Philippines. Under the Geneva Conventions of 1949, persons accused of war crimes are, however, afforded basic guarantees of a fair trial.
The laws of war as they now stand are open to the criticism that they do not deal with the realities of present-day wars. The use of nuclear weapons is not prohibited, and the Geneva Conventions could hardly do much to soften the impact of a hydrogen bomb. On the other hand, nonnuclear belligerency at the present time is likely to take the form of subversion, insurrection, and guerrilla warfare, covertly instigated and supported by nations that are not officially belligerents. The laws of war, being part of international law, are difficult to apply to domestic insurrection, short of a full-fledged civil war in which the parties are accorded belligerent status by other nations. Guerrillas, who wear no uniforms, do not carry arms openly, and usually do not obey the laws of war themselves, are not regarded as entitled to the benefit of those laws. In theory at least, every such guerrilla is an unlawful combatant, a war criminal, if not simply a violator of the laws of the recognized sovereign of the territory. However, article 3 of each of the Geneva Conventions of 1949 does apply to "armed conflict not of an international character occurring in the territory of one of the High Contracting Parties," and it does prescribe minimum human rights for persons involved in such insurrections by prohibiting, for example, murder, torture or other cruel treatment, and execution without fair trial.
JOSEPH W. BISHOP, JR.
[See also INTERNATIONAL LAW; LEGAL SYSTEMS; MILITARY.]
BIBLIOGRAPHY
BALDWIN, GORDON B. 1959 A New Look at the Law of War.
Military Law Review 4:1-38.
BISHOP, JOSEPH W. JR. 1961 Civilian Judges and Military Justice: Collateral Review of Court-martial Convictions. Columbia Law Review 61:40-71.
FAIRMAN, CHARLES (1930) 1943 The Law of Martial Rule. 2d ed. Chicago: Callaghan.
FAIRMAN, CHARLES 1946 The Supreme Court on Military Jurisdiction: Martial Rule in Hawaii and the Yamashita Case. Harvard Law Review 59:833-882.
LAWYERS CO-OPERATIVE PUBLISHING COMPANY 1951 Military Jurisprudence: Cases and Materials. Rochester, N.Y.: The Company.
MACAULAY, THOMAS B. (1849-1861) 1953 History of England From the Accession of James II. 4 vols. New York: Dutton.
McDOUGAL, MYRES S.; and FELICIANO, FLORENTINO P. 1961 Law and Minimum World Public Order: The Legal Regulation of International Coercion. New Haven: Yale Univ. Press.
SPAIGHT, JAMES M. 1911 War Rights on Land. London: Macmillan.
U.S. DEPARTMENT OF DEFENSE 1951 Manual for Courts-martial: 1951. Washington: Government Printing Office.
U.S. DEPARTMENT OF THE ARMY 1956 The Law of Land Warfare. Washington: Government Printing Office.
U.S. DEPARTMENT OF THE ARMY 1957 Treaties Governing Land Warfare. Washington: Government Printing Office.
VON GLAHN, GERHARD 1957 The Occupation of Enemy Territory: A Commentary on the Law and Practice of Belligerent Occupation. Minneapolis: Univ. of Minnesota Press.
WIENER, FREDERICK B. 1958 Courts-martial and the Bill of Rights: The Original Practice. Harvard Law Review 72:1-49, 266-304.
WINTHROP, WILLIAM W. (1886) 1920 Military Law and Precedents. Rev. ed. Washington: Government Printing Office. → First published in two volumes as Military Law.
MILITARY POLICY
Military policy consists of those activities of a government which are primarily concerned with its armed forces. Military policy is thus defined in terms of its scope rather than its purpose. In Western nations reference is frequently made to "defense policy" and "national security policy." These terms define policy in terms of purpose—"defense" or "national security"—rather than in terms of scope. For this reason, they are less useful for research. In some states, defense and/or national security may not be the principal purpose of military policy: the armed forces may be designed for aggression rather than for defense or internal security and economic development, or they may be used to minimize the burden on the domestic economy rather than to maximize national security. In modern states, moreover, the scope of defense policy and national security policy is much broader than the scope of military policy. Diplomacy, economic mobilization, economic warfare, foreign economic assistance, political warfare, intelligence, and propaganda—all may be directed toward national security objectives, but they are not military policy. Military policy is, instead, a narrower field of governmental activity comparable to agricultural policy, labor policy, education policy, or tax policy. Military policy differs from most other substantive policy areas in that it straddles the line
between domestic policy and foreign policy. Domestic policy consists of those activities of a government which affect significantly the allocation of values among groups within its society; foreign policy consists of those activities of a government which affect significantly the allocation of values between it and other governments. Particular substantive policies may affect the allocation of values both within a society and between societies, but typically they have their primary impact in one field or the other. Foreign economic assistance affects the domestic allocation but has its primary impact on the international allocation. Agricultural subsidies affect the international allocation but have their primary impact domestically. Military policy, however, drastically affects the allocation of values both within the society and between societies.
The scope of military policy
Military policy can be divided generally into two broad categories: strategy and structure. Strategy concerns the units and uses of force; it is military policy viewed from the foreign-policy perspective. Strategy itself involves two broad types of issues. Program issues deal with the strength of the military forces, their composition and readiness, and the number, type, and rate of development of their weapons. Use issues deal with the deployment, commitment, and employment of military forces, and are manifested in alliances, war plans, declarations of war, force movements, and the like. A strategic concept identifies a particular need and implicitly or explicitly prescribes policies on the uses, strengths, and weapons of the armed services. The structural side of military policy is its domestic component and deals with the acquisition and organization of the resources which are drawn from society and which go into the units and uses of force.
Structural policy subsumes: budgetary policy, concerning the size and distribution of funds made available to the armed services; manpower policy, concerning the procurement, retention, pay, and working conditions of members of the armed services; procurement policy, concerning the acquisition and distribution of supplies to the military forces; and organizational policy, concerning the methods and forms by which the military forces are organized and administered. Analytically these various elements of military policy are distinct. In actual practice, of course, any major decision in military policy involves a combination of many of them. The terms in which the decision is initially defined, however, often reflect the purposes which it is designed to realize. Decisions designed primarily to influence the international environment are formulated initially in strategic terms and must then be translated into structural policies. Conversely, decisions primarily designed to affect the allocation of domestic values are usually first formulated in structural terms, and their strategic implications are calculated later. For example, a decision to reduce the military budget is likely to be prompted by a concern with domestic factors: fear of the inflationary effects of high military spending, the desire to expand domestic welfare programs, concern over the undue influence of the military on the economy and in society, or the desire to balance the budget and reduce taxes. The reduction in military spending will require the elimination of some programs and forces and the reshaping of others. These changes will have implications for war plans and deployments, and they may make it difficult or impossible for the state to maintain its existing commitments in world politics. Hence, alliances may have to be negotiated, and what began as an effort to achieve certain domestic economic goals through the mediation of military policy comes to have a major impact on foreign relations. Conversely, a change in foreign policy, such as the assumption of a commitment to aid another state, may require increases and changes in military forces and programs and war plans which will eventually have their impact on military manpower and material procurement, which, in turn, may significantly redistribute goods and services in the domestic economy. In theory the elements of domestic policy, foreign policy, and military policy should be congruent. In actual practice, of course, they never are. The purposes and goals of policies are always changing and always conflicting. If there are not major conflicts of purpose, however, the policies can be said to be in equilibrium.
Periods of disequilibrium are typically periods following major changes in the domestic or international environments of a state. In some instances, changes in one environment and concomitant changes in foreign policy or domestic policy may not be transmitted into changes in military policy. In this event, the different elements may continue to operate at cross purposes for a long period of time. In some circumstances, such as France in the 1930s, such a disjunction between military policy and foreign policy may lead to disaster.
The elements of military policy
Strategy—programs and forces. Strategic programs deal with the over-all size, the weapons,
and the composition of the military forces. The key issues normally concern the "mix" or relative strength and importance of different types of forces: land, sea, and air forces; offensive and defensive forces; active and reserve forces. After World War II, in the United States and in other major powers these traditional categories began to lose their significance. Increasingly, American military policy was concerned with the allocation of resources among strategic deterrent forces, continental defense forces, and general purpose or limited-war forces. Superimposed on these issues was the broader issue of the relative stress which should be placed on nuclear forces and conventional forces. In general, it is possible to analyze a country's policies on strategic programs in terms of the concepts of "strategic pluralism" and "strategic monism" (see Huntington 1954). A pluralistic strategy, such as that followed by the United States between 1950 and 1953 and after 1961, involves the maintenance of a variety of military forces so as to be able to make graduated and appropriate responses to different types of aggression and to serve a variety of foreign-policy objectives. Strategic pluralism in programs is usually accompanied by higher military budgets and a more restrained and "defensive" foreign policy. Strategic monism, on the other hand, involves primary emphasis on one particular type of military force (for example, strategic nuclear forces) which is well designed to serve certain foreign-policy objectives but not others. Consequently, strategic monism usually means a lower level of military spending, but it also requires a more active and positive foreign policy that attempts to prevent through diplomacy the appearance of challenges which the military forces of the state are not equipped to handle. Major changes in a country's strategic programs occur only rarely and usually in combination with major changes in its foreign policy and domestic environments.
Examples of such changes are the German decision in the late nineteenth century to create a sizable navy, the American decision in the 1950s to create a military system for the defense of North America against nuclear attack, and the Chinese decision in the mid-1950s to develop a nuclear-weapons capability. At a lower level, technological developments may lead to the innovation of new weapons without changes in major strategic purposes. Thus, between 1955 and 1965 the United States substantially changed the principal element in the weapons "mix" of its strategic deterrent forces from long-range bombers to intercontinental missiles. Weapons innovation is
a continuing concern for all nations involved in prolonged international rivalries or arms races. In such situations the weaker power almost always has an interest in introducing new weapons, unless it thinks that such innovation will provoke the stronger power to an immediate attack. The stronger power, on the other hand, has a vested interest in the current level of weapons but must be prepared either to respond quickly to weapons innovation by the weaker power or to take the initiative in such innovation and thus in effect hasten the obsolescence of its existing superior weapons system.
Strategy—uses of force. A government can use or plan to use its military forces against another government in three ways (see Huntington 1961a, pp. 430-431). First, it can take the initiative in using force to secure some foreign-policy objective, such as the acquisition of territory or economic concessions. Second, it can use force responsively to counter or to reply to the initial use of force by another government. Third, it can use force as a deterrent in an effort to convince another government that it should not take some action. Any particular military action may, of course, serve all three purposes, but generally the strategy of a government gives primacy to one use. Have-not powers typically take the initiative in using force; status quo powers usually act responsively or deterrently. During much of its history the United States either took the initiative in the use of force, as in 1812 and 1898, or responded to the use of force by other governments, as in World War I. Since World War II, however, American military forces have been used primarily for deterrent purposes, although where deterrence has failed they have been used responsively (as in Korea in 1950) and in some cases initially (as in the Dominican Republic in 1965). The uses of force are often analyzed in terms of the spectrum of violence.
At one extreme is all-out war with thermonuclear weapons. At the other extreme are the terrorism and subversion of "sublimited" war and "wars of national liberation." In between are such forms of violence as guerrilla warfare, limited conventional war (restricted in geographical area, weapons, targets, or goals), general conventional war, tactical nuclear war, and limited nuclear war. At one time American thinking on war tended to hold that all war must be all-out war for total victory. During the twenty years after World War II, however, the American government became accustomed to the graduated use of force and came to accept the Clausewitzian dictum that political goals must determine
the nature and the extent of the use of force in war as well as in peace. A recurring issue in military policy concerns the extent to which the use of force at any one level in the spectrum of violence is likely to escalate to higher levels in the spectrum. Escalation is most likely when one side appears about to suffer a total defeat at the lower level of violence. Communist China intervened in the Korean War in the fall of 1950 when North Korea was almost totally defeated; the United States expanded its military action in Vietnam in the winter of 1964-1965 when it seemed probable that the Saigon government would be defeated. The most important step in the escalation of a conflict would, of course, be the shift from conventional to nuclear weapons. While other forms of escalation may be gradual and difficult to identify clearly, this shift would be a dramatic qualitative change in the nature of the conflict. A preventive war occurs when a government initiates hostilities because it is convinced that war is inevitable later and that it would then be fought under less favorable conditions than it would be if initiated immediately. A pre-emptive attack is an attack designed to forestall or to blunt an enemy attack already in the process of preparation and launching. Thus, a country could plan to launch a preventive attack against another country and at the same time be the target of a pre-emptive attack from its enemy. After World War II, the development of nuclear weapons and of high-speed, long-range delivery capabilities brought to the fore new issues in the use of military force by the major powers. The crucial factors were the relative size and vulnerability of each side's strategic force. If both sides have relatively vulnerable strategic forces, in a crisis each will be under considerable pressure to launch a "first strike" and to destroy the other side's strategic force before it can attack. This is a situation of maximum instability.
If one or both sides have relatively invulnerable strategic forces (as a result of concealment, dispersion, mobility, or sheer numbers), the incentive to launch a first strike is much less. The "balance of terror" is thus more stable when each side is capable of absorbing a first strike and then responding with an attack capable of dealing the other side unacceptable damage. These strategic issues are frequently formulated in terms of a choice between a counterforce and countervalue strategy. In a counterforce strategy the enemy's military forces are the principal target of the nuclear strike; in a countervalue (or countercity) strategy its population centers are the principal target.
The deterrence of military action may also be achieved through various combinations of diplomatic and military means. A defensive military alliance is a classic means of communicating the intention of one state to use force to protect another state. The North Atlantic Treaty Organization (NATO) is probably the most notable example of such an alliance in the mid-twentieth century. Deterrent intentions may also be communicated through formal or informal statements by government officials and by the deployment and maneuvering of military forces. The success of such deterrent moves depends upon the ability of the deterring state (a) to identify clearly the action which it wishes to deter; (b) to convince the potential military actor of its intention to respond if the identified action occurs; and (c) to suggest to the potential actor that a response will impose unacceptable costs upon the actor.
Structure—manpower. At its broadest level, manpower policy involves the nature of the military manpower procurement system. Four principal systems have been used by modern states: (1) a citizen militia (Israel, Switzerland), in which all those qualified serve a large part of their lives in the citizen reserve forces, which form the bulk of the country's military strength; (2) universal military service (France, the Soviet Union), in which all qualified men serve a short period (usually two or three years) in the active forces and then a longer period in the reserves; (3) volunteer service (Great Britain after 1960, Canada), in which efforts are made to recruit long-service professionals for the active forces; and (4) selective service (United States, Federal Republic of Germany), in which compulsory service typically for a two-year period is required of certain classes of young men, but liberal exemptions and deferments are granted, so that service is far from universal.
In addition to choice among these general systems, manpower policy concerns the recruitment, retention, pay, working conditions, promotion, education, training, and retirement of both officers and enlisted men. Among the more frequently debated issues of manpower policy are the following: To what extent should officers be recruited from special military schools, from civilian schools and colleges, or from enlisted ranks? What should be the criteria for promotion of officers—seniority, command ability, intellectual qualities? What types of education and training should officers and enlisted men receive during their military service? To what extent should military pay equal the pay for comparable work in civilian life?
Structure—procurement. Procurement and materiel policies concern the methods by which the military services acquire weapons and installations. A recurring issue is to what extent the military establishment should itself produce weapons within government-owned arsenals and to what extent it should procure them from private companies. If private procurement is preferred, to what extent is it desirable or possible to rely on competitive bidding as against negotiated bids? What policies should be followed with respect to maintaining production lines in existence on a stand-by basis: should a broad or a narrow "mobilization base" be maintained? Closely related to procurement are policies on the research and development of new weapons. To what extent should new weapons be developed in response to a previously determined "military requirement" or need for such a weapon? Or to what extent should scientists and technicians be encouraged to pursue broad-gauged research along the most promising scientific lines, with the expectation that these advances may lead to new weapons for which new appropriate uses could be found? A related issue concerns the extent to which weapons innovation is furthered by competition among several concerns or laboratories, each following a somewhat different path in attempting to develop a weapon to meet a military need. Or would weapons development be just as rapid and less costly if this duplication of effort were avoided and, at an early stage in the research, all resources were concentrated upon one approach to the problem?
Structure—organization. The key issues of military organization involve command and control, on the one hand, and mission and purpose, on the other. Both issues come together in the problem of "unification," or centralization, versus decentralization, a problem which continually troubled the major powers after World War II.
In most countries the tendency was toward more and more centralized direction and control over the military forces. In the United States and Great Britain the previously separate services were brought within the framework of a single cabinet-level department. At the same time, in the United States in particular, the services were increasingly relegated to supporting rather than to combat roles. The principal combat units of the military establishment came to consist of the functional commands, which typically include forces from two or more services. These commands, such as the North American Air Defense Command, the European Command, the Strike Command, and the Strategic
Air Command, more directly reflected the missions and purposes of the military establishment than did the services organized simply in terms of the element in which they operated (land, sea, air, or sea-land). The civilian leaders of the services thus suffered a decline in prestige and power. The military leaders of the services avoided this decline in some degree through their participation in the central military staff organization (in the United States, the Joint Chiefs of Staff). Recurring in all countries is the issue of having a single military chief of staff or a chiefs of staff committee, and the problem of dividing responsibilities between the central military staff and the service staffs. The relations between the civilian secretary in charge of the military establishment and his civilian associates and subordinates, on the one hand, and the central military staff organization, on the other, raise other major organizational issues.
Structure—the budget. There are two major types of budgetary issues: substantive and procedural. Substantive issues concern the allocation of funds among different programs, weapons, and services. They are thus directly linked to program and force decisions on strategic programs. It is quite possible that decisions on strategic programs may be made only in the context of the budgetary process. In this event, no practical distinction exists between the strategic decision and the structural one. In other instances, however, decisions may be made to develop or to maintain certain types of forces and strategic programs, quite apart from the decisions on how much money should be devoted to those forces. Just as a decision to maintain a certain number of divisions may imply certain manpower decisions, so also it may imply certain budgetary decisions. Neither the manpower nor the budgetary decisions, however, are guaranteed as a result of the force-level decision.
Indeed, they may be made by a substantially different group of people at a different time and with significantly different perspectives and priorities. Procedural issues concern the nature and structure of the budget and the budgetary process. A major change in American military policy under the Kennedy administration was the reorganization of the budget in terms of major purposes or functions (that is, the output categories of the military establishment) rather than simply in terms of organizational categories (the services) or input categories (for example, men, hardware, soft goods). As a result of this change, it became possible to evaluate the costs of the major strategic programs. Since the budget is the one place where
military activities are expressed in a common denominator that makes them comparable to civilian activities, it is also the place where, typically, the civilian leaders of the defense establishment play a major role. Civilian control often is identified with budgetary control.
The making of military policy
In almost all countries the executive branch plays a decisive role in the formulation of military policy. In constitutional democracies, the predominance of the executive is particularly marked with respect to strategy, less so with respect to structure. In the United States, Congress has constitutional authority to determine the size and composition of the armed forces and to declare war. In actuality, however, after World War II, the effective decisions both on strategic programs and on the uses of force have been made by the president acting through and in consultation with the National Security Council, the civilian leadership of the State and Defense departments, the Joint Chiefs of Staff, and, at times, selected congressional leaders. Congressional groups can exert pressure on the executive decision makers, but they are seldom in a position to make decisions on strategy themselves. In effect, like Bagehot's queen, they have "the right to be consulted, the right to encourage, the right to warn." On the structural side of military policy, on the other hand, Congress retains an important role in the decision-making process. Typically, the executive presents its recommendations to Congress on legislation dealing with manpower, personnel, organization, pay, procurement, and reserve forces. Congress and its Armed Services Committees usually amend and at times even reject these executive proposals. With respect to the budget, Congress rarely makes any significant reductions in executive requests for funds.
In many cases, Congress appropriates more money than the president requested for such particularly favored programs as the National Guard, bomber and missile procurement, and the Marine Corps. The president, however, can, and on occasion does, refuse to spend these extra funds. In general, Congress tends to be much more sympathetic to the requests of the military services than the top civilian leadership of the executive branch. In parliamentary democracies, the legislature typically has much less control over military policy than does the United States Congress. Strategic and structural policies are determined by the cabinet in consultation with civil servants and
military chiefs and are more or less automatically ratified by the legislature. In Great Britain the principal public debate on military policy occurs in connection with the approval of the defense estimates by Parliament. This is normally preceded by the issuance of a white paper which sets forth the government's over-all defense policies and strategy. At times in Great Britain, the presence of retired military officers in the House of Lords stimulates informed and caustic debates on military policy in that chamber. Strategic programs and structural issues typically have fairly restricted publics; in some cases almost no groups outside of official government agencies play a significant role in the policy-making process. In the United States public opinion at large has been very favorably disposed toward the maintenance of large military forces; in other constitutional democracies the pattern of opinion is less clear, and the perception of serious conflicts between increased military spending and increased spending for social welfare programs often produces a more hostile attitude toward the former. In contrast to strategic programs, decisions on the use of force have obvious implications for the entire society. Consequently, in constitutional democracies public opinion has a much greater influence on such issues. In particular, efforts to initiate the use of force or to carry on the prolonged use of force for limited objectives overseas may arouse substantial opposition among broad groups in the population. This was true in France with respect to the Indochinese and Algerian wars, in Great Britain over Suez, and in the United States with respect to the Korean and Vietnamese conflicts. In such situations, a government may be caught between the realities of foreign politics and the pressures of domestic politics and thus be severely restricted in its ability to carry out a consistent policy. 
In totalitarian states military policy is typically a major and continuing preoccupation of the top political leadership. In Nazi Germany the principal strategic decisions were made by Hitler and his immediate associates, frequently against the advice of, and over the opposition of, the principal professional military chiefs. In communist states the top leaders of the party shape military policy; often they have had considerable experience themselves in the conduct of military operations. In all totalitarian states ideological considerations are a major influence on military policy; these frequently run counter to the judgments of the professional military; and consequently the tension between "political" and "military" approaches is frequently
more intense than it is in constitutional states. Ideologically oriented political leaders are often ready to pursue more expansionist or "adventurist" military policies than their more conservative professional military men are willing to support. Totalitarian political leaders also typically have more varied and more forceful means of asserting their authority over their military forces than do the political leaders of constitutional states. In both modern constitutional states and in totalitarian states the dominant consideration in military policy is typically the need of the state in relation to other states. However, in many societies in Asia, Africa, and Latin America, domestic considerations and needs have a much more important influence on strategy and structure. Often the military forces play a key role in domestic politics; the size of the armed forces reflects their domestic political strength more than their external military function. Indeed, in many such states the armies, although large in terms of the governmental budget, seldom, if ever, engage in external warfare. Another important influence on military policy in these states comes from the more-developed countries which furnish military assistance and advice. In many cases these influences and the desire to appear "advanced" may lead a small and backward state to adopt a military policy more appropriate for a large and industrialized power. External support for the military forces of a state, of course, also tends to make those forces less dependent upon the political system of their own country and thus may well encourage tendencies toward "praetorianism." SAMUEL P. HUNTINGTON [See also CIVIL-MILITARY RELATIONS; FOREIGN POLICY; MILITARISM; MILITARY; NATIONAL SECURITY; PUBLIC POLICY; STRATEGY. Other relevant material may be found under INTERNATIONAL RELATIONS; WAR.] BIBLIOGRAPHY
BERNARDO, C. JOSEPH; and BACON, EUGENE H. 1955 American Military Policy: Its Development Since 1775. Harrisburg, Pa.: Military Service Publishing Co.
BULL, HEDLEY (1961) 1965 The Control of the Arms Race: Disarmament and Arms Control in the Missile Age. 2d ed. New York: Praeger.
GARTHOFF, RAYMOND L. (1958) 1962 Soviet Strategy in the Nuclear Age. Rev. ed. New York: Praeger.
HALPERIN, MORTON H. 1963 Limited War in the Nuclear Age. New York: Wiley.
HAMMOND, PAUL Y. 1961 Organizing for Defense: The American Military Establishment in the Twentieth Century. Princeton Univ. Press.
HITCH, CHARLES J.; and McKEAN, R. N. (1960) 1961 The Economics of Defense in the Nuclear Age. Cambridge, Mass.: Harvard Univ. Press.
HUNTINGTON, SAMUEL P. 1954 Radicalism and Conservatism in National Defense Policy. Journal of International Affairs 7:206-222.
HUNTINGTON, SAMUEL P. 1961a The Common Defense: Strategic Programs in National Politics. New York: Columbia Univ. Press. → A paperback edition was published in 1966.
HUNTINGTON, SAMUEL P. 1961b Equilibrium and Disequilibrium in American Military Policy. Political Science Quarterly 76:481-502.
KAHN, HERMAN (1960) 1961 On Thermonuclear War. 2d ed. Princeton Univ. Press.
KISSINGER, HENRY A. 1957 Nuclear Weapons and Foreign Policy. New York: Harper.
LEVINE, ROBERT A. 1963 The Arms Debate. Cambridge, Mass.: Harvard Univ. Press.
MILLIS, WALTER 1956 Arms and Men: A Study in American Military History. New York: Putnam. → A paperback edition was published in 1958 by New American Library.
MILLIS, WALTER; MANSFIELD, HARVEY C.; and STEIN, HAROLD 1958 Arms and the State: Civil-Military Elements in National Policy. New York: Twentieth Century Fund.
RIES, JOHN C. 1964 The Management of Defense: Organization and Control of the U.S. Armed Services. Baltimore: Johns Hopkins Press.
SCHELLING, THOMAS C.; and HALPERIN, MORTON H. 1961 Strategy and Arms Control. New York: Twentieth Century Fund.
SCHILLING, WARNER R.; HAMMOND, P. Y.; and SNYDER, G. H. 1962 Strategy, Politics and Defense Budgets. New York: Columbia Univ. Press.
STEIN, HAROLD (editor) 1963 American Civil-Military Decisions: A Book of Case Studies. University: Univ. of Alabama Press.
STERN, FREDERICK M. 1957 The Citizen Army: Key to Defense in the Atomic Age. New York: St. Martin's.
U.S. LIBRARY OF CONGRESS, LEGISLATIVE REFERENCE SERVICE 1957 United States Defense Policies Since World War II. 85th Congress, 1st Session, House Document 100. Washington: Government Printing Office.
U.S. LIBRARY OF CONGRESS, LEGISLATIVE REFERENCE SERVICE 1958- United States Defense Policies, 1945-1956. Washington: Government Printing Office. → Kept up-to-date by annual supplements.
WOLFE, THOMAS W. 1964 Soviet Strategy at the Crossroads. Cambridge, Mass.: Harvard Univ. Press.
MILITARY POWER POTENTIAL

Military power potential consists in the resources that a nation-state can mobilize against other nation-states for purposes of military deterrence, defense, and war. This definition (which makes the term approximately synonymous with "defense potential" but renders it broader than the term "war potential") follows a narrow definition of national power. More broadly conceived, national power in interstate relations is the ability of nation-states to produce desired effects in the behavior of other nation-states. However, a wide variety of conditions and means, noncoercive as well as coercive, may be available to a nation-state to produce such effects. Indeed, since nation A may behave in certain ways toward nation B because it "respects" or "admires" nation B, such respect or admiration may be said to be part of B's power, broadly defined, over A. This inclusive perspective is useful, since it keeps us from neglecting or ignoring various factors affecting the behavior of nation-states toward others. However, as we know from the study of interpersonal relationships, a particular kind of behavior may result from different conditions or combinations of conditions and, for analytical, predictive, and manipulative purposes, we may be interested in distinguishing among particular conditions and their effects. One such condition, of prime importance in interstate relations, is power more narrowly defined as military power. This is in fact the most widely accepted definition of power in interstate relations. It is defined as the ability to affect the behavior of other nation-states through the actual or threatened exertion of force. As used in this article, then, national power is the ability to coerce other nations, and coercion refers to physical constraint rather than to such other means of pressure as economic reprisal. National military power is, of course, relative. It pertains to a relationship between states, the military power of state A being great or small in relation to the military power of state B or of several other states. The importance of military power in shaping the behavior of nation-states toward one another is also relative to the importance of other means of generating desired results. Thus, the importance of national military power will vary between state actors and, over time, within the entire international system of action.
The main conditions immediately accounting for this variability are: (1) the pattern of distribution of military power in the international system; (2) the values at stake in international conflicts; (3) the "costs"—in terms of such values as economic resources, personal self-direction, moral standards, and reputation—of producing and employing military power; and (4) the comparative availability and effectiveness of other means for resolving international conflict, and the "costs" of using these alternative means. In studying the exertion and counterexertion of power, one approach is to concentrate on the results or on the means accounting for, or capable of accounting for, particular results. When we concentrate on the results, and perhaps equate power with particular effects, we are concerned with far
more complex relationships than when we compare the particular power instruments available to different nation-states. Thus, state A may have twice as much military power to exert against state B or C as either can exert against A. Faced with the same threat and demand from A, B may give in and C may fight; and faced with the same threat, B may give in to one demand, but fight when confronted with a different demand. Conversely, A may use its military power against B or C in support of some demands but not in support of others. Such problems involve choices of action determined by the various advantages and disadvantages attached by state actors to the exertion or counterexertion of power. In the following we are concerned not with the whole range and scope of power but with the availability of one particular instrument of power, namely, military power. Military power, however, may produce desired results without being exerted. State A may act in certain ways because it does not wish to risk the use of B's military power against itself. It is this silent effectiveness of national power that appears to be pervasive, even though it is hard to trace in terms of cause and effect and hence is easily ignored. In the real world, a nation's military power may be conditioned by its relations with friendly or allied states from which it may receive supplies, training for personnel, technological assistance, and financial aid, as well as military support in time of war. For the sake of simplicity, these international (and possibly supranational) phenomena will be disregarded in this discussion. We will assume that national power is self-generated.
If national military power is the ability of one nation to coerce other nations through the employment of military means, or to resist such coercion by other states, then this power may be said to have, at any one time, two components: first, mobilized military capabilities ready for immediate operational commitment; and, second, additional power potential, which is a nation's ability to produce further military capabilities. To be employed militarily, power potential must first be mobilized (that is, transformed into, or used in, the production of operational military force), although the very act of initiating the mobilization of potential is a demonstration of a nation's intentions and may of itself act as a coercive threat or counterthreat. The process of mobilizing military potential takes time, and the resources involved will add to operational military capability only to the extent that there actually is opportunity for their mobilization. Manpower and industrial resources may obviously contribute to a nation's military potential, and most writers have in fact focused entirely or largely on economic potential when using the concept of war potential. In modern nations, however, a vast range of national resources is adaptable to the generation of military forces and it is, furthermore, possible to indicate certain administrative resources and political conditions that affect the magnitude of a nation's military potential and the speed with which it can be mobilized. Since most authors refer to "war potential" rather than to "military power potential," they are usually concerned only with the part of a nation's power potential that has not yet been mobilized in the production of military forces and is still available for mobilization in pursuit of an arms race or in the event of war. However, in the nuclear age, as we shall see, the concept of power, or military, potential is of greater interest than that of war potential. Military power potential is the total ability of a nation to produce military power. Part of it is mobilized at any one time; part of it is unmobilized.

History of the concept

The concept of "war potential" did not gain appreciable currency until after World War II. However, thinking about the reality encompassed by the concept has been familiar in the literature of western Europe ever since the beginning of the mercantilist school. The mercantilist writers (for example, Charles Davenant, Josiah Tucker, and Jean Baptiste Colbert) were profoundly preoccupied with studying the bases (particularly the financial, commercial, industrial, and population resources) on which rested a nation's ability to prepare for war and to conduct it (Heckscher 1931). Mercantilist discussions of the "sinews of war" dealt with military power potential in the light of mercantilist perspectives.
Despite the repudiation of mercantilist notions by the classical economists, there was a line of noted writers (for example, Adam Smith, Alexander Hamilton, Friedrich List, Friedrich Engels) who transmitted this preoccupation from the eighteenth to the twentieth century. It is not surprising that the mercantilists were the first thinkers who undertook, although with inadequate analytical tools, the systematic study of military potential. They were themselves products of two momentous and interconnected events in the history of mankind: the emergence of the nation-state, which monopolized the organization of military power, and the beginnings of the industrial revolution, which was to have enormous effects on the nature and bases of military power. The quickening evolution of pure modern science, the flourishing of applied science and technological innovation, and the rapid economic growth associated with the agricultural and, especially, the industrial revolution had even more revolutionary effects on the generation, form, and distribution of military power. These effects did not exhaust themselves in the forging of new weapons, such as constantly improved firearms; or even in such new devices as the steam engine and electrical apparatus, which were adapted to direct military use, lent themselves readily to the production of armaments, and vastly increased the mobility of heavily equipped military forces. More basic was the swift increase in the productivity of labor. In the preindustrial age, when communities were basically agrarian and farmers were unable, on the average, to produce much more than they consumed, the portion of resources available for military purposes as well as for other nonfarm activities was severely limited. Large military forces could be supported only temporarily, usually at the expense of plunder, capital depletion, and malnutrition or starvation. The Roman Empire began to decline when it attempted to place too heavy a burden of nonagricultural activities, including military ones, on a peasantry that, compared with today's, was poor in productive resources. What happened in modern times, and especially after 1800, was that farmers became able to feed an increasing number of nonfarm families, which were thus released to commerce, industry, government, and the military. Pigou ([1921] 1941, pp.
42-43) calculated that, by the beginning of the twentieth century, industrial nations were able to divert about one-half of their productive capacity to the conduct of war by means of augmented production, reduced personal consumption, reduced investment in new capital, and depletion of existing capital. Their ability to prepare for war had increased similarly. The first push of modernization began in small local areas. It gave western Europe an enormous advantage in military capability, enabling that comparatively small region not only to inflict huge damage on itself in internecine warfare but also to conquer vast colonial empires overseas. The process of modernization was weak and halting at first, gathered rapid momentum during the second half of the nineteenth century, and, although still unevenly distributed over the globe, reached a stupendous acceleration after World War II. The monopoly of coercion at the level of the
modern nation-state brought about the "nationalization" of war. Advancing science, technology, and economic production resulted in what may be called the "industrialization" of war. As war became nationalized and progressively industrialized, the concept of war, and warfare itself, tended to become "total." Although preindustrial and indeed preagrarian societies have been known to conduct wars of extermination, total war was widely recognized as a new development only after World War I. Thus, Oualid (Inter-parliamentary Union 1931, pp. 120-121) stated that modern war "implies the utilization of all the forces of the national collectivity: human, material, economic and moral." And General Douhet ([1921] 1942, p. 5) observed that "the prevailing forms of social organization" had given modern warfare "a character of national totality—that is, the entire population and all the resources of a nation are sucked into the maw of war." During World War I, the tendency for war to become "total" had expressed itself chiefly in the extension of the naval blockade to shipments of food and nearly all other civilian goods; in World War II, it led to the massive area bombing of industries and cities, causing huge losses of civilian lives and assets. Making war "total" meant directing hostilities not only against the opponent's armed forces but also against his entire military potential.

Relevance in the nuclear age

Between World War I and World War II, military power potential was usually discussed in terms of "war potential," that is, the ability of nations to mobilize resources after war had broken out. It was normal for nations to maintain military establishments in peacetime far smaller than those they could establish in time of war. Following the outbreak of hostilities, and in fact during the diplomatic crisis preceding it, there was usually time to mobilize potential power.
Germany, for instance, in 1939 produced only 20 per cent of the volume of combat munitions that she produced in 1944 (Knorr 1956, p. 59); similarly, it took several years for the military production of the United States to reach a peak in World War II. World War I and World War II were classical wars of attrition, won by the coalitions that possessed superior potential in manpower and economic resources. This outcome does not, of course, prove that war potential was decisive; the outcome of war is determined by other factors as well, such as the quality of generalship, strategic surprise, morale, and geography. One may surmise, however, that war potential became more important,
in relation to other determinants, as warfare became industrialized and that very large asymmetries in these other conditions were then required to overcome a substantial difference in war potential, especially modern, industrial war potential. It is therefore not surprising that, in World War II, military fortunes turned with the balance of combat supplies (Knorr 1956, pp. 33-34). The emergence of nuclear bombs, and of aircraft and rockets of high speed and vast range, has caused many analysts to regard the concept of war potential as obsolete. The new arms technology means that military destruction on an unprecedented and previously unimaginable scale could be visited on the interior of any country within days or even hours of the outbreak of war. A few penetrating weapons alone could wreak enormous damage, and there are no known or foreseeable defenses capable of preventing this. The only sure method of preventing a nuclear weapon from penetrating is to destroy it, by means of a first or preemptive strike, before it has taken off from its base. If all-out war occurred under these conditions, the decisive blows, it is assumed, would be struck within a matter of days. The preatomic age saw many cases in which an antagonist lacked sufficient time for mobilizing potential because he was quickly overrun by his opponent. Large-scale nuclear war, however, would permit no time at all to mobilize "war potential," and the constituents of this potential would be mostly destroyed or disorganized in the first waves of attack. The main object of defense in this situation is to deter attack by threatening retaliation; to do this requires forces fully mobilized before the outbreak of hostilities. As long as these conditions of military technology prevail, "war potential," as distinct from "power potential," will be of less significance, although it remains important in the power relations of nonnuclear nations and in protracted nonnuclear disputes involving nuclear powers.
(The Korean War required the United States to mobilize some margin of its war potential; and, in the early 1950s, the United States increased its capability for waging limited nonnuclear war.) However, the concept of military, as distinct from war, potential suffers no loss of significance whatever, since it refers to the capacity of nations to produce military power—nuclear and nonnuclear—whether in peace or war. The relative military power that nations are able to muster for deterrence and defense and their capacities for innovating in, and exploiting, a highly dynamic technology and for entering and sustaining arms races are basically matters of military potential.

Components of power potential

Nations have few, if any, properties (geographic, demographic, economic, political, social, or cultural) that do not directly or indirectly affect their ability to produce military forces. These factors are too numerous to permit detailed enumeration and too heterogeneous and interdependent to encourage exhaustive and meaningful classification. Analysts of power potential have therefore focused on what seem to them to be the important determinants and have described these in terms of broad categories, such as population and industrial resources, that have intricate structures and are capable of further differentiation and comparison. The military sector may be regarded as a subsystem of the national society, receiving manpower, services, and other resources as inputs and producing from them the outputs (trained personnel, equipment, supplies, organization, doctrine, strategy, and deployment) that constitute the nation's military power. Since virtually all nation-states maintain military establishments at all times, a varying portion of their total power potential is mobilized at any one time; and, in the case of the major nuclear powers at least, this portion has risen compared with previous peacetime periods. Clearly, the unmobilized portion, as well as the total, of a nation's ability to generate military power is subject to change. It is possible to distinguish three major determinants of military power potential: economic capacity (in the past usually called "economic war potential"), military motivation, and administrative competence. The quantity and quality of the nation's manpower and other resources that are suitable as inputs for the military sector represent its economic military potential. The portion of this potential which, in time of peace or war, a nation is prepared to divert to the military sector depends upon military motivation. The efficiency with which the inputs are transformed into outputs depends on the nation's administrative military potential.

Economic capacity. Production for both civilian and military use depends upon the quantity, composition, and quality of available factors of production. These are the nation's labor force; land and other natural resources; material capital in such forms as farms, factories, railroads, and inventories; monetary capital in the form of net claims on foreigners; and organizations in which factors are combined for economic production. How large a volume of goods and services a nation
produces depends, additionally, on the rate of factor employment (some resources may be idle voluntarily or involuntarily) and on resource productivity (that is, the productivity of labor, land, capital, and management). Productivity has a bearing on military potential, since the larger the volume of output per capita, the more resources can be diverted, if the community so chooses, to the military sector without causing intolerable hardship to the civilian sector. A nation's resources, including its economic defense potential, tend to grow with the numbers in the labor force, additions to real capital, technological innovation, and improvements in productive human skills. Increases of the real gross national product (GNP), in the aggregate and per capita, are a rough index of such changes, although the benefits of rising labor productivity may be claimed, at least in part, in the form of increments to leisure rather than to goods and services. A country with a rapidly expanding national product, in the aggregate and per capita, tends to have an increasing economic potential for military purposes. However, the magnitude of this economic potential is also conditioned by the composition and quality of resources. In time of peace, and especially in time of crisis and war, the value of potential is derived largely from the speed with which it can be mobilized. Therefore, whatever the size of the national product, the closer resources are to the forms required by the military sector, and the more flexible the economic system is in permitting a rapid shift of productive factors from civilian to military production, the larger a nation's economic defense potential will be. For instance, the larger the proportion of males in the age bracket preferred for military service and the greater the proportion of such males with previous military training, the more the labor force contributes to military potential.
Similarly, the closer the products for the civilian sector are to the many products required by the military sector, the greater the nation's potential for military production. Output of energy in various forms and means of transportation are of course basic in the modern industrial age. Modern arms technology has made the military sector a voracious consumer of extremely complicated instruments and hence has put a high premium on such industries as electronics and on sophisticated metallurgical, chemical, and engineering production. Basic energy sources apart, a rich local base of raw materials has, on the other hand, become relatively less important, partly because modern technology has a startling capacity for creating new materials and for substituting
one material or even one end product for another. Moreover, in view of the emphasis on military preparedness in peacetime and the improbability of large-scale and prolonged wars of attrition, productive nations can import many of the materials with which they are not endowed. With the acceleration of technological advance, notably in military technology, the most precious resource, especially among the leading military powers, is the capacity for endless scientific discovery and technological innovation and hence the number, skill, and genius of a nation's scientists and engineers. As suggested, military potential depends not only on how close a nation's composition of total output is to the mixture required by the military sector but also on the rapidity with which output composition can be changed. This flexibility rests in large part on the mobility and versatility of labor and of other factors of production. It may be observed further that, assuming any given composition of national production, this mobility varies with the growth of the real GNP and with the stage of economic development. Rapid economic growth indicates a high rate of investment and of innovation, including product changes. It occurs in a society capable of initiating and absorbing frequent change. The stage of economic development is significant because the more advanced an economy, the richer it is in the kinds of skills and the variety of productive facilities that are conducive to rapid and continuous changes in output. The quantity and quality of military potential are, of course, relative to the varying military needs and ambitions of the country. Relatively few countries have the potential resources for developing, on their own account, sophisticated nuclear weapons and vehicles for their delivery. Indeed, some of these nations do not intend to become nuclear powers and hence have no military use for nuclear energy production. 
Although these countries may possess a substantial potential for producing "conventional," that is, nonnuclear military forces, no single country can rival the United States and the Soviet Union in military power and military potential. One might, in fact, conjecture that there is now a greater diversity in the magnitude and quality of military economic potential than prevailed in the past. Differences in sheer size of population and territory have always existed, and so have differences in the state of the arts applicable to military activities. But one suspects that the qualitative differences between, for example, Afghanistan and the Soviet Union or between Haiti and the United States, are greater than formerly existed between comparable pairs
of countries. These differences probably reflect, on the one hand, crucial differences in economic and technological development. On the other hand, they may also reflect a new implication of differences in the size of national outputs, namely, that certain very sophisticated lines of production (such as nuclear energy or nuclear weapons) demand a scale of effort that only large and highly developed countries can afford.

Military motivation. A nation may possess a large economic potential for military activities but refrain from mobilizing it for this purpose. Large economic potential for military power does not necessarily equal large military power. Power potential, therefore, depends on what may be called military motivation, that is, upon the will among members of society to supply men and other resources to the military subsystem. The production and employment of military power are organized, collective actions. Power potential is action potential, and action results from motivation. Since the production and use of military power are organized by government, it is through the political process that a society's military motivation is aggregated. However, that motivation expresses itself not only in determining the stream of men and resources made available to the military sector; it also conditions those other aspects of personal behavior that are relevant to the output of power and are often referred to as "morale." Broadly speaking, the military potential of a society depends upon the capacity of its members to forgo the satisfaction of wants and preferences (whether they are concerned with safety, income, consumption, status, leisure, respect, self-direction, or other values) that compete with the demands of the military subsystem (Knorr 1956, p. 67).
This does not mean that the individual or group pursuit of such values necessarily conflicts with the production and use of national military power, for some individuals may gain in status, respect, income, and so forth from serving the military sector. Zero war potential would obtain if all the members of a nation were intensely dedicated to goals and preferences that, in every way, prevented military power from being produced or used and that they were completely unwilling to neglect even temporarily. Military motivation is a concept beset with difficulties. Its systematic examination awaits further progress in relevant social science research. There is the additional difficulty that the willingness of a society to undertake a military effort obviously depends on the nature of international conflicts and of available choices for resolving them, and
on the way in which these conflicts and choices are perceived. However, societies also have a certain underlying disposition or readiness to accept military postures and actions, and the personal sacrifices they demand. Historians frequently refer to societies as "warlike" or "peaceful," and political leaders often speak of certain nations as highly likely or unlikely to fight—images obviously based on the record of past national behavior. We are concerned here with particular patterns of values, or cultural standards, that affect a society's attitudes toward international violence and national preparation for it. Since these standards are more or less internalized during the process of socialization—although they are no doubt susceptible to the impact of adult experience and learning—it is assumed that a society's underlying mode of readiness to react to military threats or opportunities usually undergoes only gradual change. This subject likewise stands in need of further conceptual and empirical research before important questions can find more than an intuitive answer. Thus, the dichotomy between "warlike" and "peaceful" is obviously far too simple. A "peaceful" nation may be basically and abidingly averse to the use of military power, or, although strongly preferring the peaceful resolution of international conflicts, it may reveal a high military motivation once it feels sufficiently provoked. In the latter case, a high threshold of provocation must be crossed before that society's underlying willingness to authorize military action is released—in other words, before its military motivation potential is mobilized. The literature also lacks plausible hypotheses on the relation between military motivation potential and different political and social systems and, therefore, on whether the military motivation of nations tends to vary with forms of government. Two observations may be made. 
First, the willingness of a society to forgo the satisfaction of civilian wants and preferences competing with the demands of the military sector differs among groups that may also differ in formal and informal political influence. Second, whatever the structure of politics, political influence between government and governed is reciprocal, although the balance of influence may vary greatly from system to system. An authoritarian or totalitarian government (or elite) that is itself endowed with a strong military motivation and exerts a powerful influence on the relevant motivation of the rest of the population will tend to lend a high military motivation potential to the country involved and presumably will find it easy to mobilize this potential. In contrast, a democratic government that reflects or
confronts a basically low military motivation in the electorate can do little to increase, and perhaps even to mobilize, the military potential of the nation. However, the historical record contains many democratic nations with a high potential and many authoritarian countries with a low one. The possible patterns are many and complex and do not encourage generalizations on the comparative power potential of nations with different forms of government. From this point of view, the military motivation potential of societies is an empirical question, which can be answered on the basis of empirical research.

Administrative competence. Whatever the military economic potential of a society and whatever its military motivation potential, its output of military power may be relatively large or small depending on the efficiency with which the military subsystem employs the resources supplied to it. The military sector uses inputs in the form of manpower, funds, industrial capacity, and research physicists in order to produce various military forces and supplies, military doctrine and strategy, and all the elements that, in the aggregate, constitute the nation's power. In the modern age, the different production tasks are extraordinarily numerous and complex, involving difficult choices to be made in the face of various uncertainties. Moreover, the production processes are complicated and lengthy, requiring highly coordinated action and involving a great deal of time, often many years, for their completion. The efficiency problem is encountered on many levels. 
The industrial enterprises used for manufacturing military hardware may be more or less efficient in their employment of scarce resources; so may be the military services that must transform recruits into competent soldiers and officers; so is the planning and implementation of military research and development (which loom ever larger in the military power equation of modern nations); and so are the crucial choices of present and future weapons systems and strategies, intricately phased out and in over time as technology and strategic requirements change. Efficiency is a matter not only of choosing the right military end products and components, and of producing them at the least cost in economic and bureaucratic resources, but also of the speed with which decisions are made and implemented. Until a few decades ago, military technology, and with it strategy and doctrine, changed only slowly. By the 1960s the pace of change had become explosive and the administration of the military sector far less able to rely on lessons from past experience. Since major weapon systems take years to develop and produce, a time lag of even one year may have an important bearing on national military power. Administrative military potential resides essentially in the efficiency of several bureaucracies, especially in business, in the armed services, in special research organizations, and in various government departments. In these structures, competence depends on the conceptual realism with which tasks are understood, the intelligence and training of the directing and staff personnel, the efficiency with which information is procured and used, the efficiency with which many necessarily decentralized operations are coordinated, the quality of the analytical techniques and instruments employed for identifying and evaluating choices in problem solving, and all the other conditions that make for good and prompt decisions and their efficient and prompt execution.

Power potential in a changing world

As long as there is military power, power potential will, of necessity, remain important. However, the implications of power potential and, still more, the nature and relative weights of its constituents are notably sensitive to changes in the international system and to changes in technology and economic productivity. Industrialization and the accelerating progress of arms technology have had a revolutionary impact on the nature of national military power and power potential. As economic development and scientific and technological progress continue, the demands of the military sector on societies will tend to undergo further revision, and power potential will rise or fall depending on the society's ability to meet these changing demands for particular inputs of resources. The future may also bring more or less radical changes in the properties of the international system and its parts. 
Thus, the consolidation of nation-states into larger units may have far-reaching effects on military power relationships, including comparative power potentials. Similarly, the establishment of various kinds of arms control and disarmament could affect the significance of power potential. It is interesting that the concept of war potential figured importantly in the exploration of disarmament during the 1920s and early 1930s (Inter-parliamentary Union 1931, chapters 3-5), because war potential was expected to become a more decisive factor in the international power balance as states limited their mobilized forces. Similarly, the military potential of nations would change in the future if nuclear weapons were
abolished or drastically reduced in number. If general and complete disarmament occurred, and the disarmed world turned out to be highly unstable politically, rearmament or its threat might become a major political factor, and military potential would remain a major element in international relations.

The study of power potential

The importance of military potential in interstate power relations and in the calculations of statesmen stands in odd contrast to the paucity of rigorous social science literature on the subject. There are good reasons for this contrast. Although the basic concept of power potential is clear enough, it refers to a reality so complex and embracing so much of the political, social, economic, and cultural life of nations that it resists detailed definition and, on the whole and in many particulars, defies measurement. Who can measure the military motivation potential of a society, or its military administrative competence, or even its military economic potential? The first two are customarily regarded as "imponderables," and the important qualitative aspects of the last are not measured by the GNP or similar indices. Resistance to quantification is of course a common property of many social phenomena. It is lowest in the demographic and economic areas, where a great many data, although of greatly varying quality, are readily available. The structure of populations and labor forces can be roughly compared, and so can the capacity of many industries. It is nevertheless impossible, at least as yet, to compute comparable aggregate values for the economic war potential of nations. Comparing military motivation potentials is much more problematical. Comparative defense budgets, terms of military service, and similar data are indicators of some interest. But it is patently difficult to infer motivation from behavior, and all we may be able to appraise in these instances is current rather than potential motivation. 
To estimate administrative military potential is yet more difficult, for to evaluate the outputs of the military sector, short of war, is even harder than to evaluate the inputs it receives. But in this context, as in many others, "difficult" questions are not necessarily hopeless operations. Ingenious effort may yield worthwhile, although imperfect, results. The fact is that statesmen, government officials, and military men are continuously engaged in comparing the military power, including the power potential, of nations. If they
did not do so, they would be unable to perform their tasks. Their comparisons may be extremely crude and largely represent intuitive judgment. Sound intuition, however, is anchored in observation, however haphazard, fragmentary, or impressionistic. Surely, whatever is done haphazardly and impressionistically can, in principle, be done more rigorously, or at least be greatly assisted, by means of empirical analysis and the formation and testing of hypotheses. The fact that this effort is lacking does not argue its impracticability. It is probable that the social sciences, especially as they progress further in analytical capability, could achieve a great deal more in this neglected area of application.

KLAUS KNORR

[See also ECONOMIC WARFARE; INTERNATIONAL POLITICS; STRATEGY. Other relevant material may be found in INTERNATIONAL RELATIONS; MILITARY; POWER; WAR.]

BIBLIOGRAPHY

ARON, RAYMOND 1962 Paix et guerre entre les nations. Paris: Calmann-Lévy.
BEATON, LEONARD; and MADDOX, JOHN R. 1962 The Spread of Nuclear Weapons. New York: Praeger.
BENOIT, EMILE; and BOULDING, KENNETH E. (editors) 1963 Disarmament and the Economy. New York: Harper.
DOUHET, GIULIO (1921) 1942 The Command of the Air. New York: Coward-McCann. -> First published in Italian.
EARLE, EDWARD MEAD (editor) 1943 Makers of Modern Strategy: Military Thought From Machiavelli to Hitler. Princeton Univ. Press.
HECKSCHER, ELI F. (1931) 1955 Mercantilism. 2 vols., rev. ed. New York: Macmillan. -> First published in Swedish.
HITCH, CHARLES J. 1941 America's Economic Strength. Oxford Univ. Press.
HITCH, CHARLES J.; and McKEAN, R. N. 1960 The Economics of Defense in the Nuclear Age. Cambridge, Mass.: Harvard Univ. Press.
HUNTINGTON, SAMUEL P. (editor) 1962 Changing Patterns of Military Politics. New York: Free Press.
INTER-PARLIAMENTARY UNION 1931 What Would Be the Character of a New War? London: King. -> A collection of essays.
KNORR, KLAUS 1956 The War Potential of Nations. Princeton Univ. Press.
KNORR, KLAUS 1957 The Concept of Economic Potential for War. World Politics 10:49-62.
KNORR, KLAUS 1966 On the Uses of Military Power in the Nuclear Age. Princeton Univ. Press.
PIGOU, ARTHUR C. (1921) 1941 The Political Economy of War. New & rev. ed. New York: Macmillan.
SCHLESINGER, JAMES R. 1960 The Political Economy of National Security: A Study of the Economic Aspects of the Contemporary Power Struggle. New York: Praeger.
SILBERNER, EDMUND 1946 The Problem of War in Nineteenth Century Economic Thought. Princeton Univ. Press.
SOKOLOVSKII, VASILII D. (editor) (1962) 1963 Military Strategy: Soviet Doctrine and Concepts. Introduction by Raymond L. Garthoff. New York: Praeger. -> First published in Russian.
STEINMETZ, S. RUDOLF (1907) 1929 Soziologie des Krieges. Leipzig: Barth. -> The 1929 publication is an enlarged and revised version of an earlier work published in 1907 under the title Die Philosophie des Krieges.
MILITARY PSYCHOLOGY
The application of psychology to military problems began in World War I, was revived in World War II, and has continued since then as a part of military research and development and in certain military staff activities. It has had a major impact on military personnel procedures and on the design and use of military weapons, vehicles, and other equipment, as well as considerable effect on military training and on life-support activities. It is coming to have an important effect on what is known in the military as psychological or special operations: on all those military activities, that is, in which the knowledge of other customs and cultures is important. Finally, military psychology has been a major influence on psychology itself. During two generations, psychology's leaders served in the military, and a significant fraction of psychologists today are supported financially by the military. Military applications and activities tended, at least until the early 1960s, to associate psychology with the natural sciences and engineering rather than with the social sciences. An understanding of military psychology may be facilitated by noting the nature of modern military operations and modern military duties. Even in peacetime, military operations are greatly varied. Practically the whole range of nonmilitary activities is represented, particularly if comparisons are in terms of kinds of activities rather than in terms of details. As Janowitz (1960, p. 65) has pointed out, modern military jobs, more often than not, are noncombat jobs. Increasingly they are specialized, technical jobs; in some instances they require knowledge from the very frontiers of science. Even in peacetime, however, the atmosphere is often one of crisis, danger, and stress. Operations are often conducted in environments that are exotic both physically and culturally, and involve complex, expensive, technically advanced, rapidly obsolescing
equipment used by men whose terms of enlistment are short. Readiness to fight can be tested, and practice in the use of some weapons can be obtained, only in a simulation of war. Thus it can readily be seen that the primary limit to the usefulness of psychology in military operations is the limit of psychology's knowledge of human behavior. It can quickly be appreciated, as Geldard (1953) has suggested, that it is difficult to think of an area of psychology which might not prove useful to the military. And Melton's definition of military psychology (1957) must be accepted: military psychology is coextensive with psychology and is defined primarily by the context of application. The categories of information which are most helpful in understanding the work of military psychologists are the following: (1) the military problems for which solutions are needed; (2) the products or techniques to be developed or applied; (3) the military organization desiring help, i.e., the "client"; (4) the psychological organization offering help; (5) the relevant psychological concepts and theories; (6) the nature of the interdisciplinary team within which psychologists will probably work; (7) the place of the work in the research and development cycle; and (8) the impact on psychology more generally. These points will be considered in turn. The first four (military problems, psychological products, military organizations with problems, and psychological organizations) can most profitably be considered together and in historical development. Except as specifically stated, the discussion will concern military psychology in the United States, since all types of development have taken place in this country and on a more organized basis than elsewhere.

World War I. As a significant activity, military psychology began in World War I. In several of the warring nations, psychologists used their professional talent to assist the military. 
The selection and classification of recruits and specialists by means of mental tests and the development of an over-all personnel system were of primary concern. Robert M. Yerkes, president of the American Psychological Association, led in organizing a number of committees for war service. Most of these served under the auspices of the National Research Council. These committees were concerned with examination of recruits, acoustic problems, education and special training, incapacity, military training, emotional stability, motivation, recreation, special aptitudes, aviation, visual problems, psychological literature, tests for deception, a course in psychology for the Student Army Training Corps, and propaganda. (The most complete bibliographical source covering these and other World War I activities is Ferguson 1962.)

The Army Alpha Test. The outstanding accomplishment of the committees was the creation of the Army Alpha Test, a group-administered mental test given as a part of the medical examination to all recruits. A large number of the nation's leading academic and scientifically oriented psychologists served in the Sanitary Corps, administering Army Alpha. Results had been analyzed for more than 1.7 million men by the war's end. Although the test was used to eliminate the unfit, to select for special duty, and to balance units in ability, its primary significance for today is its fascinating demonstration of the extent and significance of individual differences. Yerkes (1921) summarized the analyses. Laymen, military men, scientists, and even the psychologists themselves seem to have been greatly stimulated by the dramatic differences shown, by the possibilities of measuring individual differences, and by the potential impact on traditional ways of dealing with large numbers of people. [See INTELLIGENCE AND INTELLIGENCE TESTING.]
The Army's personnel system. Simultaneously with the development of the Army Alpha Test, Walter D. Scott, with the assistance of Walter V. Bingham, led a number of industrial psychologists and personnel men in a program which ultimately resulted in a modern personnel system for the United States Army (U.S. War Department 1919; Ferguson 1962). Scott, working independently of the committees described above, first developed a rating scale for selection and promotion of officers. This was enthusiastically received, and its success led to the creation, under Scott and Bingham, of the Committee on the Classification of Personnel in the Office of the Adjutant General. This committee guided the development of a complete personnel system for the army, including the analyses of civilian occupations and military jobs, the trade tests, the questionnaires and record forms, and the tables of organization, without which an army of specialists could not be created and maintained on a large scale. Late in the war, the committee was militarized and became the nucleus of the Personnel Branch of the Army General Staff. After the war, many of the individuals involved became prominent in industrial relations and scientific management. [See APTITUDE TESTING and the biography of BINGHAM.]
Between the world wars. Military psychology disappeared between the two world wars. At the close of World War I, the Advisory Committee on the Problems of Military Psychology was established in the Division of Anthropology and Psychology of the National Research Council. The division itself had just been created; in large part its existence was due to the success of military psychology. At the initial organization meeting of the division, Yerkes' list of those present showed that 11 of 22 individuals had a military title, and military problems were high on the list of items for which the division was needed. Nevertheless, the military psychology committee met with complete indifference to its task and finally went out of existence after years of inactivity.

World War II: personnel operations. One of the first activities to reappear in U.S. military psychology was the identification of similarities between civilian and military jobs, a function performed for the army in 1940 by the U.S. Employment Service. The army soon established a general personnel research and development program in the Adjutant General's Office. The Bureau of Naval Personnel later followed suit. (The navy's personnel program in World War II is described in U.S. Bureau of Naval Personnel 1947.) The army air forces and the navy established in their medical services special, very large psychological units to screen and classify flying officers, especially pilots. (See U.S. Army Air Forces 1947 for a description of work in this organization.) As in World War I, the academically and scientifically minded psychologists tended to move into mental testing while the industrial psychologists moved into other aspects of the personnel system. Differing from the procedure in World War I, the military establishment rapidly created its own units. 
The National Research Council, through its Emergency Committee in Psychology (Dallenbach 1946), guided and assisted these personnel developments and the other developments to be described below. [See INDUSTRIAL RELATIONS, article on INDUSTRIAL AND BUSINESS PSYCHOLOGY.]
Contract research. In World War II, extensive use was made of contracts with academic and industrial institutions to support civilian research and development for military use. The Office of Scientific Research and Development (Baxter 1946) organized natural scientists and engineers through contracts with its National Defense Research Committee, and biologists and medical men through its Committee on Medical Research. The contractors often included psychologists in their interdisciplinary teams concerned with equipment and operating procedures for such military interests as night operations, underwater sound, communications, and stereoscopic range finding. Free from the organizational restraints which were placed on psychologists within the military establishment, the psychologists under contract contributed from the earliest days of the war not only to selection of special kinds of personnel but also to training and to the design and use of equipment.

Training. Research on the selection of personnel led generally to an interest in training because selection occurred before training, and the success of selection procedures was evaluated in terms of the success of the selected personnel in training for duty. The pattern of work established by Dean Brimhall and J. G. Jenkins in the prewar National Research Council Committee on the Selection and Training of [Civilian] Aircraft Pilots had a very considerable effect on the psychologists in military units, and even more on those of the National Defense Research Committee. Achievement and proficiency tests for use in training were developed and applied, industrial training methods were carried over to the military situation, and training devices were developed and evaluated.

Human engineering. The first step in research and development for selection and training is to analyze the jobs of the personnel concerned. This step draws attention to the impact of equipment design on personnel requirements, to the design of the displays and controls used by men. Closely associated are efforts to design efficient operating procedures. Out of these activities grew the field now known as human engineering. In 1942 and 1943 the Applied Psychology Panel of the National Defense Research Committee was created to exploit any psychological approach to the military problems created by the advance of science and engineering. (Bray 1948 describes its work.) One of its projects on gunsights, under the direction of William E. Kappauf and Franklin V. 
Taylor, was absorbed at the war's end into the Naval Research Laboratory, ever since a leader in human engineering. Its other projects helped to establish the pattern for the human engineering research done by the psychology branch of the aero-medical laboratory at Wright Field, established late in the war under Paul M. Fitts. In England a comparable series of developments occurred. Under the leadership of F. C. Bartlett and Kenneth Craik, and with the support of the Medical Research Council, attention soon turned to operating procedures and equipment design. According to Bartlett (1957), out of this work grew the Unit for Research in Applied Psychology at
Cambridge and much of the present-day respect for scientific psychology in Britain.

Life support. British work, particularly that on night vision during the Battle of Britain, was brought to the attention of the highest political authorities in the United States and was significant in the growth of physiological psychology in U.S. military medical laboratories. Since the war, this type of research has continued and now is part of the field known as life support. Included in the field today are psychophysiological studies on the effects of special and extreme environments, on sensory problems, on drugs, stress, fatigue, vigilance, and so on. [See ATTENTION; FATIGUE; STRESS.]

Attitudes and motivation. World War II provided the occasion for the first major organized program of military social psychology, a program on attitudes and motivation. Within the Information and Education Division of the U.S. Army, many leading psychologists and sociologists applied the recently developed techniques of attitude and opinion study to a host of topics. The range of the army studies may be illustrated by these examples: the reasons for soldiers' failure to use atabrine regularly in the Pacific, preferences for winter clothing, reactions to the military way of life, the probable number of neuropsychiatric casualties in particular units, and the probable cost (which turned out to be correct within a few percentage points) of the GI Bill of Rights. The examples also illustrate that this type of work leads into sensitive areas, that it is likely to be bound to a specific time and place, and that the information gathered is of interest in varying degree to any given military client. Thus the usefulness of the work depends greatly on calling it to the attention of the "right" user, one who is in the right place at the right time and is able and willing to put the information to use. As Stouffer (Social Science Research Council 1949-1950, vol. 1, p. 
48) suggests, the army never found a systematic way to use this kind of information.

After World War II. Contrary to the experience after World War I, military psychology continued to be important after World War II. This resulted from the association during the war between psychology and the natural and biological sciences and from the outburst of military research and development following the atomic bomb and other scientific successes, particularly those of the Office of Scientific Research and Development. In consequence, psychological research, development, and application have continued in relation to military personnel operations, training, equipment design
and use, and life support. The exception is the field of attitudes and motivation; in this instance basic research has received some continuing support, but development and application nearly died away after the Korean War. In the early 1960s, accelerating rapidly under the Kennedy administration, military social science began to develop again under the stimulus of the need for better communications with other peoples. The postwar activities have been much influenced by organizational developments in the period, in particular by the rise in importance of the military laboratories, by the appearance of the nonprofit corporation for military research and development, by the organization and function of the Office of Naval Research, and by developments in the Office of the Secretary of Defense. The impact of each of these on military psychology will be considered. The rise of military laboratories. By comparison with the prewar period, the postwar period has witnessed an enormous increase in the resources available for scientific laboratories and related institutions within the military establishment itself. Psychology has shared in this support. Personnel operations and psychological research and development units, laboratory-like in nature if not always in name, continue in each service. These have been given relatively large funds, facilities, professional positions, technician support, and access to "human guinea pigs." Research and development in military training likewise have found generous support through military laboratories since World War n. Human engineering units have appeared within a wider and wider range of military laboratories concerned with various types of equipment. The funding and the over-all control of these laboratories are normally a part of the more general scientific research and development activities of the services. Nonprofit contract laboratories. 
A significant feature of the postwar period has been the creation of nonprofit corporations of a laboratory-like character to conduct military research under contract to some branch of the military establishment. For psychology and the social sciences the most significant of these have been the RAND Corporation and the System Development Corporation, both of Santa Monica, California; the Human Resources Research Office (HumRRO) of George Washington University; and the Special Operations Research Organization (SORO) of American University. RAND was created by the air force to conduct long-range research and development in any of the sciences. It has provided continuous support for basic social science research for military purposes and has also been influential in the appearance of the system concept in military psychology. The significance of this concept is elaborated in Gagne (1962). The RAND Corporation and the air force created the System Development Corporation to provide realistic synthetic training and exercise of the air defense system under attack. The System Development Corporation expanded into the world's largest employer of psychologists, using them chiefly in research and development relevant to the use of computers in training and in complex weapon systems. Both RAND and the System Development Corporation are contributing heavily to one of the most active modern fields of research, information processing and decision making in command and control activities. In the 1960s HumRRO came to be the major psychological organization concerned with research and development in relation to training. Organized by the army, it has established small, laboratory-like activities at a number of army training centers. HumRRO's approach to training is described by Crawford (1962). SORO is the army's main organization for research and development in the field of human interaction and communication across cultural boundaries. Windle and Vallance (1964) describe the content of its work and the military activities to which it is relevant.

Office of Naval Research. One of the most significant developments of the postwar period was military support of basic research in all scientific fields, including psychology. The U.S. Office of Naval Research (ONR) was the primary arm of the military establishment in this respect. ONR set a pattern which was widely followed in military laboratories and the other services. Darley (1957) describes its character. ONR operated through research contracts with civilian institutions, following the method used by the National Defense Research Committee during the war. ONR acted as the prototype of the National Science Foundation, now the federal government's principal agency for the support of basic research. 
ONR supported university research, and the research it supported was unrestricted. It accepted research plans from the country's leading scientists, rather than proposing research to them, and has been a major factor in U.S. military support for psychological research in other nations. In recent years ONR has become somewhat more selective than it was originally in choosing to support those scientists who wish to do research related ultimately to navy interests, but it still imposes no direction or restrictions on those it supports. Within psychology, ONR has emphasized
support for psychologists concerned with psychophysiology, psychometrics, learning, engineering psychology, and group behavior. In recent years the modeling of individual behavior and of social processes has become a major interest. The Air Force Office of Scientific Research, set up in the mid-1950s on the pattern of ONR, has given continuing support to long-range social science research, as well as to psychological research. The army's research program has been more goal-oriented than those of the navy and air force.

Office of the Secretary of Defense. When, in 1947, the army, navy, and air force were brought together under the new Office of the Secretary of Defense, the Research and Development Board was established to review, evaluate, and plan for science and engineering in the military establishment as a whole. It established committees of consultants who served to integrate, or "couple," the military and scientific communities. One such was the Committee on Human Resources. Its panels on human engineering and psychophysiology, on personnel and training, on manpower, and on human relations and morale reflect the range of topics which were covered in military research in the postwar period. A report by Lyle H. Lanier, executive director of the Committee on Human Resources, gives the best available picture of military psychology in 1949. Although the Research and Development Board was later abolished, there remains a planning and review officer for psychology and the social sciences under the director of defense research and engineering in the Office of the Secretary of Defense. Under the auspices of this officer, a number of consultants recently analyzed needs and opportunities for long-range research in military psychology and social science (see, for instance, Bray 1962). Some of the recommended programs have been activated by the Advanced Research Projects Agency in the Office of the Secretary of Defense. 
The Office of the Secretary of Defense is significant not only because it is the prime center of power over all military research and development but also because it is a natural client for social science research oriented to military operations as an element in foreign policy. This office also represents the United States in NATO military psychology activities. Conclusions. It seems likely that every type of military activity and every type of weapon is receiving some psychological attention. Effort has understandably been concentrated on topics for which traditional military solutions are not available. Short-term enlistments, like rapid mobilization, have led to attention to initial personnel processing
and training. Rapid change and increasing complexity of equipment have led to interest in human engineering. On the other hand, the inadequacy of traditional techniques and the proven superiority of new ones are not enough. Products must be easily used or embedded in an organizational structure. Melton (1957) develops two related points. First, to change a person's behavior by psychological methods often requires that a commander or instructor change his own behavior. Second, the values of new products of natural science and engineering are readily appreciated, in contrast with the products of psychology. The significance of an increase in speed or power is evident. The advantages of improved maintenance work, which might produce an equivalent change in military effectiveness at less cost, are nevertheless hard to demonstrate. No doubt, the lack of evident advantage in many products is responsible for the continuation of the practice in military psychology of conducting "demonstration experiments." To the extent—and it is usually considerable, or the demonstration would not be attempted—that the results of these experiments can be forecast, they are a waste of time, money, and talent. In the United States at least, products which conceptually and empirically are already known to be worth their cost must sometimes be demonstrated repeatedly in formal experiments to be accepted as superior to traditional methods. The need for such demonstration experiments should be reduced as psychology becomes more technological. With improved scientific skill behind its concepts and products, demonstrations will no doubt give way to tests not of validity but of the degree to which the product meets the specifications laid down for it in advance of its development on the basis of sound scientific opinion. 
However, little progress can be expected in this respect unless the organizational relations of psychology and the other social sciences are also sound: the client organization and its relations to the research organization are significant. Since the start of World War II, the personnel psychologist and the personnel and medical staffs have enjoyed a continuing and highly integrated relationship. Comparably, human engineering has an exceptionally close relation to its client; the user of human engineering information is the scientist or design engineer who incorporates the psychologist's product into the design of a weapon or other piece of hardware. Thus the routine users are professional people whose backgrounds are appropriate for the evaluation of the psychological
contribution. Morgan's Human Engineering Guide to Equipment Design (1963) illustrates the extent to which human engineering information is adapted for use by other engineers. For training the situation is different. In this instance the client is likely to be a professional military man with long experience and with a tradition of successful use of common-sense methods behind him. Probably for this reason, psychological work on training and training devices has not received very stable support in the navy and air force and has been very dependent on demonstration experiments in the army. Social-psychological contributions appear to be even more affected by the presence or absence of an informed client and by the institutional relationships. The very successful work on attitudes and motivation in World War II expanded in the postwar period to include programs on military government, strategic planning and intelligence, and psychological warfare. By 1953 and 1954, however, this type of work was disappearing within the military establishment, although some relatively basic research studies continued under contract. There was little interest in the military and among natural scientists and engineers in defending this type of study against congressional attacks during a period in which economy in government was stressed. Since 1960, with the revival of interest in civil defense, limited war, guerrilla war, disarmament, military aid, and nation building, behavioral science research relevant to these topics has also revived (Windle & Vallance 1964). Definite clients for this work now seem to exist in the Defense Department's civil defense and international security activities and the army's Special Forces. Concepts and theories. From what has been said, it should be clear that the concepts and theories of concern to military psychology are those of psychology in general.
As Hill (1955) suggests in his description of the content of military psychology at the end of the Korean War, contributions to theory are sent for publication to the scientific journals and do not appear in military reports. It may be noted, however, that psychologists and their theories have an influence in the military far beyond that directly involved in the more obvious products. Concepts of individual differences, cultural differences, human error as determined by equipment design, and the need for motivation and reinforcement of learning have effects far beyond particular applications in aptitude tests, in military aid and guerrilla warfare, in aircraft altimeters, or in teaching machines. The very presence of significant numbers of psychologists in the military setting helps to induce the "sleeper effect" by which new concepts come to affect behavior even though the concepts are not explicitly formulated. Perhaps the most important item in this relation is the psychologist's confidence that understanding of human behavior can be improved through the scientific approach. Interdisciplinary relations. While specialization within psychology and within other disciplines has progressed, so too has the need to establish interdisciplinary teams to deal comprehensively with problems of applied science. Interestingly, the various specialists within military psychology seem to form stronger alliances with nonpsychologists than with fellow psychologists. Psychometricians seem to ally themselves with statisticians and not with job analysts. The latter tend to associate more closely with the management specialists than with training psychologists. And the human engineers, despite their current adherence to a doctrine of the importance of the complete system, rather definitely reject efforts to ally them with any but hardware engineers. These tendencies have probably contributed to the fact that no true profession of military psychology has emerged with special textbooks and graduate training of its own. Such unity as there is in military psychology today comes from common training in research methods rather than from training in application. Place in research and development cycle. The role of the military in supporting basic research and the persistence of the demonstration experiment have been mentioned above. Recently, there has been a growth in the numbers of psychologists contributing to weapon development within defense industry.
The growth has been directly spurred by the adoption, originally by the air force, of military specifications that require, first, that all new equipment receive adequate human engineering attention and, second, that the personnel requirements of new equipment and the ways of meeting these requirements be spelled out in detail. It is by no means clear that objective standards exist for the enforcement of these specifications. Enforcement depends heavily on the quality of psychologists available to inspect and test this new equipment. Impact on psychology. Since World War II, the military establishment has consistently been a major source of funds for psychological research and development. The writer knows of no accurate estimate of the amounts involved, but they have certainly been large by any standard. Although, in the United States at least, the military establishment is becoming proportionately less of an influence in this respect, because of the rise of other government agencies to support research, it is still of the utmost importance to psychology that defense support be wisely administered. Financial support is one way of suggesting the impact of military psychology on psychology more generally. A related type of estimation is based on a calculation of the number of psychologists involved. In 1957, Melton counted over seven hundred psychologist employees of the Department of Defense and its major nonprofit laboratory contractors. He estimated that some 5 per cent of the members of the American Psychological Association were working directly or indirectly for the department. There is reason to suppose that this figure was conservative. It is more difficult to evaluate the impact on the quality of the development of psychology than on the quantity. There can be little doubt that military use and support of psychology was for many years a major force in associating psychology with the natural and biological, rather than the social, sciences. Psychology emerged from World War I with a position in the National Research Council. In World War II it was an organized activity in the Office of Scientific Research and Development. It emerged from World War II as part of the Office of Naval Research and the Research and Development Board. Military psychologists today are still primarily organized under technical and engineering activities. Windle and Vallance (1964) suggest that a major shift within military psychology is now taking place and that social science aspects will be more prominent in the future. Certainly the association of psychology with social science in the higher levels of the military has been a factor in the reorganization of the Division of Anthropology and Psychology of the National Research Council into the Division of Behavioral Sciences.
It seems probable, however, that such a development is better interpreted as a reflection of the need to incorporate social scientists into interdisciplinary teams of other scientists than as a simple association of psychology with social science. Bartlett (1957, p. 49) suggests that military applications have been a major factor in establishing psychology as a science in Great Britain. He says that it took four crises—World War I, World War II, and the aftermaths of those wars—to convince the British that "unaided common sense observation is not by itself a good enough guide" to human behavior and that experiment is needed. Issues of far-reaching importance may be raised about military psychology. Some psychologists react so strongly to the horrors of war as to wish no
involvement at all of their profession with the military. Many fear bureaucratic control. Others anticipate difficulties from authoritarianism, the sometimes unsatisfactory professional status of civil servants in military laboratories, the necessity to work as part of an organized research team, the subordination of research to application, and restrictions on the range of scientific activity of individual military psychologists. There is no question that all of these fears are justified. Like most fears, however, little is to be gained by simple withdrawal from the source. Major opportunities are likely to be lost by doing so. Psychologists can help to meet their responsibilities with respect to the horrors of war, for example, by finding a way for their work to contribute to the military desire to take social, political, and ethical criteria into account, along with destructiveness, as components of a cost-effectiveness formula in choosing weapons and methods of war. The other fears are by no means peculiar to the military. The problems involved have repeatedly been faced and dealt with by psychologists in the military service, just as they have been met in other fields. The problems are not insoluble, and they do not inevitably arise.
CHARLES W. BRAY
[Directly related are the entries INTERNATIONAL RELATIONS; MILITARY. Other relevant material may be found in CONFLICT; ENGINEERING PSYCHOLOGY; INDUSTRIAL RELATIONS; LEARNING, article on ACQUISITION OF SKILL; PSYCHOLOGY, article on APPLIED PSYCHOLOGY; SIMULATION; SPACE, OUTER; WAR; and in the biography of YERKES.]
BIBLIOGRAPHY
BARTLETT, FREDERIC C. 1957 Some Recent Developments of Psychology in Great Britain. Istanbul: Baha Matbaasi.
BAXTER, JAMES P. 1946 Scientists Against Time. Boston: Little.
BRAY, CHARLES W. 1948 Psychology and Military Proficiency: A History of the Applied Psychology Panel of the National Defense Research Committee. Princeton Univ. Press.
BRAY, CHARLES W. 1962 Toward a Technology of Human Behavior for Defense Use. American Psychologist 17:527-541.
CRAWFORD, MEREDITH P. 1962 Research and Development for Specific Training Programs. Pages 309-324 in Robert Gagne (editor), Psychological Principles in System Development. New York: Holt.
DALLENBACH, KARL M. 1946 The Emergency Committee in Psychology, National Research Council. American Journal of Psychology 59:496-582.
DARLEY, JOHN G. 1957 Psychology and the Office of Naval Research: A Decade of Development. American Psychologist 12:305-323.
FERGUSON, LEONARD W. 1962 The Heritage of Industrial Psychology. Hartford, Conn.: Finlay.
MILL, JOHN STUART: Political Contributions
GAGNE, ROBERT M. (editor) 1962 Psychological Principles in System Development. New York: Holt.
GELDARD, FRANK A. 1953 Military Psychology: Science or Technology? American Journal of Psychology 66:335-348.
HILL, CHARLES W. 1955 Military Psychology. Pages 437-467 in Abraham A. Roback (editor), Present-day Psychology. New York: Philosophical Library.
JANOWITZ, MORRIS 1960 The Professional Soldier: A Social and Political Portrait. Glencoe, Ill.: Free Press.
LANIER, LYLE H. 1949 The Psychological and Social Sciences in the National Military Establishment. American Psychologist 4:127-147.
MELTON, ARTHUR W. 1957 Military Psychology in the United States of America. American Psychologist 12:740-746.
MORGAN, CLIFFORD T. et al. (editors) 1963 Human Engineering Guide to Equipment Design. New York: McGraw-Hill.
SOCIAL SCIENCE RESEARCH COUNCIL 1949-1950 Studies in Social Psychology in World War II. Vols. 1-4. Princeton Univ. Press. → Volume 1: The American Soldier: Adjustment During Army Life, by S. A. Stouffer et al. Volume 2: The American Soldier: Combat and Its Aftermath, by S. A. Stouffer et al. Volume 3: Experiments on Mass Communication, by Carl I. Hovland et al. Volume 4: Measurement and Prediction, by S. A. Stouffer et al.
U.S. ADJUTANT-GENERAL'S OFFICE 1919 The Personnel System of the United States Army. 2 vols. Washington: Government Printing Office.
U.S. ARMY AIR FORCES 1947 Aviation Psychology Program: Research Reports. Volumes 1-19. Washington: Government Printing Office.
U.S. BUREAU OF NAVAL PERSONNEL 1947 Personnel Research and Test Development in the Bureau of Naval Personnel. Edited by Dewey B. Stuit. Princeton Univ. Press.
WINDLE, CHARLES, and VALLANCE, T. R. 1964 The Future of Military Psychology: Paramilitary Psychology. American Psychologist 19:119-129.
YERKES, ROBERT M. (editor) 1921 Psychological Examining in the United States Army. National Academy of Sciences, Memoirs, Vol. 15. Washington: Government Printing Office.
i. POLITICAL CONTRIBUTIONS ii. ECONOMIC CONTRIBUTIONS
John C. Rees V. W. Bladen
POLITICAL CONTRIBUTIONS
John Stuart Mill (1806-1873) was born in London, the eldest son of James Mill, a leading disciple and friend of Jeremy Bentham. In his Autobiography (1873) the younger Mill described the remarkable education he received from his father, beginning Greek at the age of three and Latin at eight. At 15, massively instructed in a wide range of subjects, including economics, history, philosophy, and even some branches of natural science, he first read Bentham and emerged with a unifying conception of things and a sense of purpose in life. In 1823 he followed his father into the service of the East India Company and remained with the company until he retired in 1858. For some years Mill vigorously promoted the Benthamite cause by speech and pen, but during a period of serious mental depression that started in 1826, he became convinced that there were serious weaknesses in his inherited opinions. At the same time he was subjected to new influences "which enlarged my early narrow creed," among them the ideas of Wordsworth, Coleridge, Carlyle, Goethe, the Saint-Simonians, and Comte. In these crucial years he came to value poetry and art, both for themselves and as a means of cultivating the feelings and character, and he developed a fuller conception of happiness as involving the rich and varied growth of personality. His conception of social and political affairs also underwent a change: he came to appreciate the Saint-Simonian division of history into organic and critical periods; to see that political institutions must be related to the state of society; and to accept the important role an intellectual elite can play in shaping and making coherent the attitudes and beliefs of a society in a stage of transition. It was at this time too that his fears about the growth of mass conformity and its stifling effect on individual freedom took firm root. In the decade beginning in 1831 Mill published several articles containing clear signs of his changed outlook; notable among them are the series of articles entitled "The Spirit of the Age" (1831), the essay "Civilization" (1836), and his studies of Bentham (1838) and Coleridge (1840a). His judgment on Bentham is especially interesting, manifesting as it does some of the vital differences that were to distinguish Mill from his educators.
He praised Bentham's contribution to the philosophy of law and his work for the reform of legal institutions; he greatly admired his methodological principle of breaking up wholes into their parts and abstractions into things; but he rejected a conception of man which, he claimed, has no room for the pursuit of spiritual perfection as an end in itself. Moreover, Bentham's theory of government, he argued, ignores the dangers arising from a despotic public opinion and the importance of establishing checks on the will of the majority. Mill's new attitude toward these two related matters was strongly confirmed by a careful reading of Tocqueville's Democracy in America, and he wrote
lengthy reviews of the two parts of Tocqueville's work when they appeared (1835; 1840b). Meanwhile, Mill had met Harriet Taylor, the wife of a London businessman, and there soon began what he called "the most valuable friendship of my life." They were married in 1851, two years after Mr. Taylor's death. Mill's extremely high estimate of his wife's abilities and of her contribution to his own writings has generally been regarded with skepticism, although quite recently, through works by Hayek (1951) and Packe (1954), there has been a reaction in her favor. However, it must be emphasized that the claims of Hayek and Packe for Mrs. Mill have been strongly contested. Mill's first major work, A System of Logic, was published in 1843 and ran to several editions, as did the Principles of Political Economy, after it appeared in 1848. With these two works Mill's reputation as an outstanding thinker of his day was firmly established. The later editions of the Political Economy show a more pronounced sympathy for socialism and for the claims of the working class than Mill's early opinions would have permitted, and it is probably here that Mrs. Mill's influence is most generally allowed, when it is admitted at all. On Liberty (1859) came out in the year after Mrs. Mill's death, and Mill insisted that it was a joint product. Mill now spent much of each year in France, where his stepdaughter, Helen Taylor, managed a small house at Avignon, near her mother's grave. His main work on political institutions, Considerations on Representative Government, appeared in 1861, and in the same year he wrote for Fraser's Magazine a set of essays on moral philosophy (1861b) which came out as a book, Utilitarianism, in 1863. The most notable of his remaining works are Auguste Comte and Positivism (1865) and The Subjection of Women (1869). From 1865 to 1868 Mill represented Westminster in Parliament. He died at Avignon in 1873.
His Autobiography, edited by Helen Taylor, was published later in the same year. Mill's social and political thought can usefully be approached in terms of four major concerns: (1) the problem of method in the social sciences; (2) his elucidation of the principle of utility; (3) the freedom of the individual; and (4) his theory of representative government. All four are related, and the interdependence among the last three, at least, has long been recognized. Method in the social sciences. In his Essay on Government (1820) James Mill had tried to demonstrate the necessity for representative government by arguing from the postulate that men's actions always conform to what they take to be
their interests and that men's interests in turn can be analyzed in terms of pain and pleasure. Accordingly, a representative assembly should have sufficient power to check the rulers, who, like all other men, are concerned only with advancing their own interests, yet who will thus be made accountable to a body whose interests are identical with those of the whole community. This identity of interests between the representative assembly and the community is possible if the franchise is extended. John Stuart Mill and his circle of young utilitarian radicals initially regarded James Mill's essay as a masterpiece; yet when the new influences began streaming in upon the younger Mill, he began to have doubts which were considerably increased by Macaulay's famous attack on James Mill's essay in the Edinburgh Review (1829). But he became convinced that the various types of reasoning employed by his father and by Macaulay were both wrong, and he was thus led to his own conclusions about the proper methods of study in social matters, later published in Book 6 of A System of Logic (1843). Mill denied that the actions of rulers can adequately be explained in terms of their interests. Such an explanation leaves out factors like a sense of duty, philanthropy, and the traditional attitudes of a community, as well as group or class sentiment and inherited standards of behavior among rulers themselves. The force of these traditional standards may override the personal interests of the rulers. Moreover, Mill believed, accountability to the governed is not the only way of ensuring an identity of interest between rulers and ruled, since to some extent their interests in fact coincide: it is in the interest of both, for example, that law and order be maintained. Nevertheless, the selfish interests of rulers do play an important, if by no means exclusive, part in shaping their conduct, and constitutional checks are therefore necessary. 
Where James Mill and Bentham had gone wrong, according to the younger Mill, was in supposing that social phenomena depend on one causal factor or law of human nature, with others producing only trivial effects. In fact, the several aspects of human nature contribute to determining social phenomena, and none of these aspects is negligible. Mill believed that a science of society is possible. Its model should be astronomy, even though the science of society would never achieve the kind of precision in its predictive powers that astronomy has. James Mill's error was to adopt the deductive method of geometry; social science must rest on the laws of individual psychology which are discoverable by direct observation and experiment,
and unless generalizations about social phenomena can be connected with, and shown to be derived from, these inductive laws, they cannot be regarded as having a scientific basis. John Stuart Mill set great store by "ethology" (his term for knowledge of the formation of individual, group, and national character), whose laws are derived from those of psychology by deducing what sort of character will be produced, given the laws of mind and a specific set of circumstances. But psychological and ethological laws do not suffice to explain sociological phenomena, since the special circumstances of the society in which a particular phenomenon occurs must be taken into account. The propositions of sociology are therefore only crude, i.e., related to tendencies. The main aim of sociology must be to discover empirical generalizations about social development, generalizations that do not have the status of laws but that nevertheless can be related to the laws of human nature. Mill thought that an appreciation of the enormous importance of the state of intellectual knowledge as an agent of social change and as the chief cause of social progress might contribute to the discovery of such sociological "laws." Mill's belief in the importance of knowledge explains his concern to ensure the existence of an active intellectual elite in an age of mass pressures. In his view the state of knowledge is the product of a small minority, and progress will give way to "Chinese stationariness" unless society secures to its potential innovators the means for their creative role; and among these means the first requirement is the freedom of the individual. Not that freedom is merely an instrumental value for Mill, but it is fundamental even as such. Utility. The principle of utility, as Mill expounded it in Utilitarianism (chapter 2), "holds that actions are right in proportion as they tend to promote happiness, wrong as they tend to produce the reverse of happiness." 
By "happiness" Mill meant pleasure and the absence of pain. "Pleasure and freedom from pain," he argued, "are the only things desirable as ends," and all desirable things are desirable "either for the pleasure inherent in themselves, or as means to the promotion of pleasure and the prevention of pain." On the evidence of this passage alone, Mill appears to be expounding the orthodox Benthamite creed. But it is well known that later in the same chapter he went on to maintain that the quality of pleasure is no less important than its quantity. Indeed, he insisted that the pleasure derived from the higher faculties is more valuable than any other sort and could even be said to have an "intrinsic superiority." Mill's elucidation
of the principle of utility is clearly inspired by, and intelligible only by reference to, an ideal of human development that he had earlier in his life explicitly contrasted with Bentham's narrow and constricting conception of man, with its failure to recognize adequately the role of such powerful factors as a sense of honor and a sense of personal dignity. Without ever retracting his affirmation that happiness is the sole desirable end, he so described its constituent elements that they reflected his own scale of values. Prominent in that scale was the Greek ideal of self-development, individual spontaneity, mental cultivation, and the importance of men "for ever stimulating each other to increased exercise of their higher faculties" (On Liberty, chapter iv). One of Bentham's teachings that Mill never abandoned was that appeals to "the moral sense" or "right reason" merely serve to enthrone sentiment as its own reason and are incapable of providing a real solution to moral problems. Such appeals play the same sort of role in moral argument as reliance on intuition does in knowledge of truths in mathematics. Mill rejected the claim that truths of this kind can be known independently of observation and experience and was keen to demonstrate the falsity of this claim, since it seemed to him to support prejudices in favor of outdated institutions that have no backing in reason and rely on the alleged validity of intuition. Mill argued that if the principle of utility replaces "the moral sense," moral questions become amenable to rational consideration and the principle of utility itself supplies a tangible if not foolproof criterion for deciding moral issues. Mill shared Bentham's conviction that moral values and the feeling of moral obligation can become purely secular phenomena, however much they may have owed to religion in the past. 
Every society, he contended, derives its cohesion from a common set of beliefs and values which have, until recent times, been supplied by supernatural religion. With the decline of the religious sanction, however, a secular vision of life must become the source of the necessary integrating beliefs and values. Mill did not conceal his hope that an elevated brand of utilitarianism, such as he sketched in his posthumously published essay, "Utility of Religion," would take the place of religion. He looked forward to a time when men would come to feel it their duty to serve humanity at large, when society would strive to cultivate in all its members a profound sense of unity with each other and a deep concern for the general good. While these are, to be sure, earthly goals, the conception and mode
of life involved may well merit the name of religion, and Mill was sure that it was a better sort of religion than the supernatural one that was widely thought to have an exclusive right to the title. It was above all Comte who convinced him of the need for, and feasibility of, a "religion of humanity." While Mill thought that such a religion of humanity could secure a hold over men's minds, he did fear that it might militate against freedom and individuality. Freedom of the individual. Freedom of speech and publication are prominent among the conditions of good government in Benthamite political thought, and some of Mill's earliest journalistic efforts were based on this view. By the time he came to write On Liberty, his emphasis had changed: what had become central was the fear that society would become increasingly hostile to the full and varied expression of individual character. For his watchword Mill now took Wilhelm von Humboldt's assertion of the absolute importance of the rich and diverse development of the human personality, thereby provoking the charge that he (Mill) had abandoned the principle of utility. However, he took care to say in his introductory chapter that his ultimate standard for judging all ethical questions was still utility; but, he insisted, "it must be utility in the largest sense, grounded on the permanent interests of a man as a progressive being." It was Mill's realization that popular government is no guarantee of freedom that gave much of the driving force to On Liberty. Tocqueville's account of democracy in America strengthened Mill's misgivings about the Benthamite assumption that to identify the interests of rulers and ruled is a necessary and sufficient condition of good government.
Even a government based on the will of the people can exercise tyranny, and more than that, the informal pressures of society can become oppressive, especially in England, where, in contrast with France, the weight of public opinion was heavier than that of the law. Mill believed that the restrictions imposed on individuals, whether by law or by opinion, ought to be based on some recognized principle rather than on the preferences and prejudices of powerful sections of the public, and he set himself the task of formulating such a principle and of illustrating how it would work. He described his principle in a number of different ways. At first he permitted social control only if it serves "to prevent harm to others" or to deter a person from inflicting "evil" on someone else; and here the line of division is between conduct which "concerns others," for which a person is answerable should it result in "harm," and conduct
"which merely concerns himself," over which society has no jurisdiction at all. But later Mill talked about infringing "the interests" or "the rights" of others; and at other times he referred to the violation of "a distinct and assignable obligation" or a "perceptible hurt" to an "assignable individual." This variety of definitions of the sphere of liberty gives rise to complex problems of interpretation but should not obscure Mill's intention to make the area of freedom as large as possible and his clear recognition of the need for some restraint, both as a condition of social life of any sort and as a safeguard of freedom itself. Nor did Mill recommend indifference to conduct that falls short of accepted standards of private morality, even when it does not actually violate the interests of others; yet we should only try to persuade someone to give up his self-regarding vices, not to coerce him. On Liberty is probably best known for the eloquent justification of liberty of thought and discussion contained in its second chapter. Mill contended that freedom of expression is no less necessary when an honest government is backed by the people than when the government is corrupt or despotic; and small minorities—even a single dissenter—have as much right to express their views as do large or overwhelming majorities. His case, argued at length, rests on the claim that to suppress an opinion is wrong, whether or not that opinion is true. For if it is true, we are robbed of the truth, and if it is false, we are denied that fuller understanding of the truth which comes from its conflict with error. And when, as often happens, the prevailing view is part truth and part error, we shall know the whole truth only by allowing free circulation of contesting opinions. Mill's argument here is strictly utilitarian, in terms of the social benefits to be derived from a policy of freedom and access to truth. 
In his plea for individuality, however, there is an appeal to the idea of intrinsic goodness which he combined with instrumental arguments. The free development of individuality is indeed socially advantageous; it makes for improvement, progress, and variety in ways of living. But it means also that men may choose to live their own lives in their own distinctive ways, and Mill insisted that a man's own mode of "laying out his existence" is best simply because it is his own mode. Moreover, it is only by cultivating individuality that we can become well-developed human beings, and "what more or better can be said of any condition of human affairs than that it brings human beings themselves nearer to the best thing they can be?" Mill therefore believed in liberty both as a good in itself and as a means to happiness and progress: for him the ideas of happiness and progress were thoroughly infused with his conception of a freely choosing human agent. It has often been said in criticism of Mill that in his zeal for liberty and his opposition to the extension of state interference, he attached too little importance to justice and welfare and failed to realize that these values can be promoted by government action without serious danger to freedom. It may not be possible to dismiss such a charge entirely, but in Mill's defense one can point to those passages of the Principles of Political Economy (especially book 2, chapters 1 and 2; book 4, chapters 6 and 7), where he showed himself to be fully aware of the injustices involved in the existing system of private property. One should also mention his fair-minded account of socialism and communism, his enthusiasm for the cooperative movement, and his idea of "the stationary state," in which there would be no more "trampling, crushing, elbowing, and treading on each other's heels, which form the existing type of social life" and where "while no one is poor, no one desires to be richer, nor has any reason to fear being thrust back, by the efforts of others to push themselves forward" (book 4, chapter 6, paragraph 2). He looked forward to the ultimate victory of socialism over the private property system, but it was to be a socialism which respected individuality. For the foreseeable future, the main task was so to improve the system of private property as to ensure that everyone shared in its benefits, and the measures on which Mill chiefly relied to achieve this end were a limitation on the inheritance of property, the restriction of the growth of population, and a great increase in the quantity and quality of education.
Representative government.
In his major work on political institutions, Considerations on Representative Government, the decline of individuality and the growing power of mass opinions are major reasons for Mill's advocacy of a number of reforms to protect minorities and to ensure that the influence exerted by educated minds on government is greater than that to which their numerical strength entitles them. But it is a wide-ranging book, and its interest lies as much in the discussion of general principles as in the particular recommendations regarding the ballot, proportional representation, and plural voting, not to mention the treatment of local government, federalism, and nationality. If Mill's treatise has not stood the test of time as well as, say, Aristotle's Politics or Tocqueville's Democracy in America, nevertheless there is still much to admire; as when, for example, he asserts
that institutions need to be adapted to the place where they have to work (his dealings with India had an important influence here) or that a despotic regime may not only help stabilize a society but may even prepare its people for the exercise of the responsibilities of a free electorate. Mill put heavy emphasis on a people's being properly equipped to assume these responsibilities; for representative government as he conceived it is the best possible form of government because, among other things, its very operation requires such activities of its citizens as are likely to increase both the desire and the capacity to make it work more effectively. One of its greatest virtues is that it puts power in the hands of those whose needs are sure to be considered only when they can voice them and whose rights and interests are sure of protection only when they can stand up for them. In saying this, Mill was surely stating an important part of the case for liberal democracy as it would commonly be made in the contemporary world.
JOHN C. REES
[For the historical context of Mill's political thought, see DEMOCRACY; FREEDOM; LIBERALISM; REPRESENTATION; UTILITARIANISM; and the biographies of BENTHAM; COMTE; SAINT-SIMON.]
BIBLIOGRAPHY
The bibliography for this article is combined with the bibliography of the article that follows.
II ECONOMIC CONTRIBUTIONS
The essence of John Stuart Mill's economics is found in his Principles of Political Economy, published in 1848, and the best introduction to the Principles is Mill's Autobiography (1873). Here he described the strictly Ricardian economics taught him by his father, James Mill, and his later economic studies with a group of young men at George Grote's house. He also related the effect that Coleridge, Maurice and Sterling, Saint-Simon and Comte, Carlyle, and finally Harriet Taylor had in modifying his Ricardian Benthamite ideas. Highlighting the role that Harriet Taylor played in the writing of the Principles, he said that the chapter "On the Probable Futurity of the Labouring Classes" was "entirely due to her" (1873, p. 208). Insofar, at least, as the Principles were intended by Mill to be "more than a mere exposition of the abstract doctrine of Political Economy" (1848, p. xcii), the Autobiography does much to explain them. Harold Laski, for one, realized that there was more to Mill's Principles than technical economics:
"The modern economist may use a technique more refined than that of Mill: he rarely conveys the same sense of generous insight into his material" (see Laski in Mill [1873] 1958, p. xix). Indeed, economists now answer with greater precision and certainty many of the questions that Mill asked, but there are many other questions that they have ceased to ask because, dissatisfied as they may be with Mill's answers, they see no better way of approaching them. Yet some of these questions are more important than those economists now deal with, and even Mill's answers would appear better if modern economists truly appreciated the questions he was in fact asking. In particular, he has been misinterpreted because it has been supposed that he was answering the questions posed by the neoclassical school of the later nineteenth century. Yet theirs was an economics of equilibrium; his was an economics of growth and development. Method. Mill had discussed the problems of method in the essay "On the Definition of Political Economy; and on the Method of Investigation Proper to It," published in the Westminster Review in 1836. This is an excellent statement of the value, character, and limitation of pure, abstract theory. In Book 4 of A System of Logic (1843) he discussed the problems of method in the social sciences generally: while still arguing the deductive character of political economy, he stressed the importance of the "inverse deductive or historical method." In the Principles, Mill decided to follow the example of Adam Smith, whose work "associates the principles with their applications" ([1848] 1965, p. xci). This approach, he saw, "implies a much wider range of ideas and of topics, than are included in Political Economy, considered as a branch of abstract speculation," for there are no practical questions which can be decided "on economical premises alone" (ibid.).
Mill recognized that competition is limited in the real world (in part by custom), so that the results of analysis of a competitive model must be treated as "truths only in the rough" (ibid., p. 422). He did not seem to notice that his doubts about the universality of self-interest raised doubts about the validity of any analysis based on the concept of the economic man. This economic man was defined as a "being who desires to possess wealth" (1844, p. 137), but Mill in the Principles indulged in some fine preaching against the obsessive pursuit of wealth: "it is only in the backward countries of the world that increased production is still an important object" ([1848] 1965, p. 755). Much of the interest in the Principles resides in its discussion of values:
policy can be determined only after a choice of ends, and problems arise out of a conflict of ends. What Mill did not notice, and what is still often ignored, is that the prediction of behavior depends on an understanding of the values held by society. Values are part of the data of the "science" of economics as well as a basis for the practical art. Production. However much Mill the preacher might doubt the importance of increasing production, Mill the economist was realistic enough to devote Book 1 of the Principles to the causes of productivity and of increasing productivity. Modern economists in developing countries, advanced or backward, would do well to study this book. Not least important is his concern with human resources and investment in people. Proper understanding of the book requires recognition that the problems he discussed are those of growth and development. For instance, the continued distinction between productive and unproductive labor is related to his concern for the liquidation of the primitive sector of the economy, in which menial servants are maintained in idleness on a more or less feudal basis, and for the development of industry, the advanced sector. Similarly, the propositions about capital, which have caused so much controversy ("the demand for commodities is not demand for labour" [(1848) 1965, p. 78]), make sense only in the context of the development of industry at the expense of the preindustrial sector. Population. The problems of population control crop up throughout the Principles. The possibility of "restraint" is the issue: "general improvement in intellectual and moral culture" or a rise in the "habitual standard of comfortable living" is necessary if an improvement in productivity is not to have as a consequence "a more numerous, but not a happier people" (ibid., p. 159). 
Mill discussed the race between productivity and population further: he appeared less afraid of the effect of "communism" on population growth than was Malthus, but his advocacy of repression by public opinion of "this or any other culpable self-indulgence" (ibid., p. 206) sounds more like Orwell's bad dream of 1984 than the sentiments of the author of the essay On Liberty. He recurred to the problem in his chapters on wages, where he effectively argued that what is needed is a dramatic improvement: "a system of measures which shall (as the Revolution did in France) extinguish extreme poverty for one whole generation" (ibid., p. 374). Further discussion of the problem is found in Book 4, Chapter 3. All of this has a new relevance as economists become involved in the problems of the newly developing countries.
Distribution. Mill made a great point of distinguishing between the laws of production and the laws of distribution. The former, he said, "partake of the character of physical truths. . . . It is not so with the Distribution of Wealth. That is a matter of human institution solely" (ibid., p. 199). Book 2, "Distribution," is, therefore, first concerned with the institution of property and with systems of socialism. Mill recognized that the "rules . . . are what the opinions and feelings of the ruling portion of the community make them." But these opinions and feelings are not "a matter of chance" (ibid., p. 200); and how the chosen institutions work is as little arbitrary and "as much a subject for scientific enquiry as any of the physical laws of nature" (ibid., p. 21). Although he insisted on the distinction between the laws of production and the laws of distribution, he in fact showed the importance of security and pecuniary incentive for productivity, in ideal forms of socialism and in actual institutions of peasant proprietorship and metayage. His interest in cooperatives (Book 4, Chapter 7) is partly based on the expectation of "a vast stimulus to productive energies" (ibid., p. 792). The chapters on wages, profits, and rent are not without interest in the context of development, but they are unsatisfactory in the context of equilibrium analysis. His argument that distribution is not affected by exchange (Book 3, Chapter 16) is now hard to accept: he ignored the pricing process in the theory of distribution, and his successors were too readily content with his static solution. Yet Mill, in Book 2 and in Book 4, had some brilliant insights into the dynamics and the probable direction of change. Exchange. Mill was injudicious in claiming that "there is nothing in the laws of Value which remains for the present or any future writer to clear up; the theory of the subject is complete" (ibid., p. 456). 
Nevertheless, Book 3, "Exchange," is the most modern of the five books. The general theory of demand and supply is clearly stated. In this book are chapters on money, monetary theory and monetary policy, and international trade. Schumpeter in his History of Economic Analysis (1954, p. 689) has said that the chapters on money contain some of Mill's best work; and the chapters on international trade are described by Viner (1937, p. 535) as Mill's "chief claim to originality in the field of economics." Viner's favorable judgment refers to Mill's performance in the sphere of static analysis; in the context of growth and development Mill's discussion of "indirect benefits of commerce" is also noteworthy. "The opening of a foreign trade . . . sometimes works a sort of industrial revolution in a country whose resources were previously undeveloped for want of energy and ambition in the people" ([1848] 1965, pp. 593-594). But Mill had political effects in mind too: "The great extent and rapid increase of international trade . . . is the great permanent security for the uninterrupted progress of the ideas, the institutions, and the character of the human race" (ibid., p. 594). Progress. Book 4, "Influence of the Progress of Society on Production and Distribution," contains the chapters on the dynamics of distribution referred to above and rated by Alfred Marshall as a short but profound study of the causes that govern the distribution of the national dividend; it also contains two important chapters involving social values. "Of the Stationary State" (Book 4, Chapter 6) ends with a magnificent plea for the preservation of natural beauty which may well have inspired Gissing's novel Demos. "On the Probable Futurity of the Labouring Classes" (Book 4, Chapter 7) contains a brilliant discussion of the "two conflicting theories respecting the social position desirable for manual labourers," the "theory of dependence and protection," and the theory of "self-dependence." The part played by industrialization in developing such self-dependence, thus providing the basis for democracy, had been stressed by Adam Smith and Malthus. The functions of government. Book 5, "On the Influence of Government," in addition to six chapters on taxation, contains five chapters on the functions of government. The agenda of government changes with changes in the nature of the economy and with changes in the character (particularly the honesty and efficiency) of the government. We should not expect the English prescription for 1848 to be satisfactory for contemporary England, but Mill's discussion of the functions of government is not just material for the economic historian.
He raised questions that still demand answers; and he reminds us that the appropriate answers depend on much more than economic effects, that liberty and democracy are at issue. The plea for "privacy" in the last chapter should not be ignored: it seemed to him necessary to develop "powerful defences, in order to maintain that originality of mind and individuality of character, which are the only source of any real progress" (ibid., p. 940). Strong as was his plea in Book 1 for security of property, he also argued in Book 2 that the rights of property are not absolute, and in Book 5 he argued for considerable restriction on the rights of inheritance and bequest. He noted with approval the endowment of charitable foundations in the United States and commented that a man would
make a similar bequest in England "at the risk of being declared insane by a jury after his death" (ibid., p. 226). The discussion of the economic importance of "limited liability" and of sound laws relating to insolvency (Book 5, Chapter 9) reminds us of the importance of examining some of the institutions we take for granted. The discussion of protection for infant industry (Book 5, Chapter 10) is still relevant; "the superiority of one country over another in a branch of production, often arises only from having begun it sooner" (ibid., p. 918). Finally, attention is directed to education: public provision is defended but monopoly denounced (ibid., pp. 949-950). He made a plea for support of research and scholarship, particularly for support of university professorships: "the greatest advances which have been made in the various sciences, both moral and physical, have originated with those who were public teachers of them" (ibid., p. 969). This is a generous tribute from the servant of the East India Company who was developing the economics of the stockbroker Ricardo; but then Adam Smith and T. R. Malthus were professors.
V. W. BLADEN
[For the historical background of Mill's economic thought, see the biography of RICARDO.]
WORKS BY MILL
ECONOMIC WORKS
(1822) 1936 Two Letters on the Measure of Value, Contributed to the Traveller (London) in December, 1822. Reprint of Economic Tracts, No. 16. Baltimore: Johns Hopkins Press.
(1836) 1948 On the Definition of Political Economy; and on the Method of Investigation Proper to It. Pages 120-164 in John Stuart Mill, Essays on Some Unsettled Questions of Political Economy. London School of Economics and Political Science.
(1844) 1948 Essays on Some Unsettled Questions of Political Economy. London School of Economics and Political Science, Series of Reprints of Scarce Works on Political Economy, No. 7. London School of Economics and Political Science. -> Five essays, of which the fifth was previously published in 1836.
(1848) 1965 Principles of Political Economy, With Some of Their Applications to Social Philosophy. 2 vols. Edited by J. M. Robson. Collected Works, Vols. 2-3. Univ. of Toronto Press. -> This edition collates numerous earlier editions. The two volumes are paginated continuously.
POLITICAL AND OTHER WORKS
(1831) 1942 The Spirit of the Age. Introductory essay by Friedrich A. von Hayek. Univ. of Chicago Press. -> Five articles first published in the Examiner.
(1835) 1962 Tocqueville on Democracy in America (Vol. I). Pages 187-229 in John Stuart Mill, Essays on Politics and Culture. Garden City, N.Y.: Doubleday. -> First published in Volume 21 of the Westminster Review.
(1836) 1962 Civilization. Pages 51-84 in John Stuart Mill, Essays on Politics and Culture. Garden City, N.Y.: Doubleday. -> First published in Volume 25 of the Westminster Review.
(1838) 1962 Bentham. Pages 85-131 in John Stuart Mill, Essays on Politics and Culture. Garden City, N.Y.: Doubleday. -> First published in Volume 29 of the Westminster Review.
(1840a) 1962 Coleridge. Pages 132-186 in John Stuart Mill, Essays on Politics and Culture. Garden City, N.Y.: Doubleday. -> First published in Volume 33 of the Westminster Review.
(1840b) 1962 Tocqueville on Democracy in America (Vol. II). Pages 230-287 in John Stuart Mill, Essays on Politics and Culture. Garden City, N.Y.: Doubleday. -> First published in Volume 72 of the Edinburgh Review.
(1843) 1961 A System of Logic, Ratiocinative and Inductive: Being a Connected View of the Principles of Evidence and the Methods of Scientific Investigation. London: Longmans.
(1859) 1963 On Liberty. Indianapolis, Ind.: Bobbs-Merrill.
(1861a) 1962 Considerations on Representative Government. Chicago: Regnery. -> A reprint of the original edition.
(1861b) 1957 Utilitarianism. Indianapolis, Ind.: Bobbs-Merrill. -> First published in three parts in Volume 64 of Fraser's Magazine.
(1865) 1961 Auguste Comte and Positivism. Ann Arbor: Univ. of Michigan Press. -> First published in two parts in Volume 83 of the Westminster Review.
(1869) 1911 The Subjection of Women. London and New York: Longmans.
(1873) 1958 Autobiography. With an appendix of hitherto unpublished speeches and a preface by Harold J. Laski. Oxford Univ. Press. -> Published posthumously. There have been several editions of the Autobiography, including one in 1944 from the original manuscript in the Columbia University Library, published by Columbia University Press, and The Early Draft . . . , published in 1961 by the University of Illinois Press.
(1874) 1958 Utility of Religion. Pages 45-80 in John Stuart Mill, Nature and Utility of Religion. New York: Liberal Arts Press. -> Written between 1850 and 1858. Published posthumously.
COLLECTED WORKS
Bibliography of the Published Writings of John Stuart Mill. Edited from his manuscript, with corrections and notes, by Ney MacMinn, J. R. Hainds, and James McNab McCrimmon. Northwestern University Studies in the Humanities, No. 12. Evanston, Ill.: Northwestern Univ., 1945.
Collected Works. Univ. of Toronto Press, 1963—. -> A projected multivolume publication.
Essays on Politics and Culture. Edited and with an introduction by Gertrude Himmelfarb. Garden City, N.Y.: Doubleday, 1962. -> These essays were originally published between 1831 and 1874.
SUPPLEMENTARY BIBLIOGRAPHY
CANNAN, EDWIN (1893) 1953 A History of the Theories of Production and Distribution in English Political Economy, From 1776 to 1848. 3d ed. London and New York: Staples.
HALEVY, ELIE (1901-1904) 1952 The Growth of Philosophic Radicalism. New ed. London: Faber. -> First published in French.
HAYEK, FRIEDRICH A. VON (editor) 1951 John Stuart Mill and Harriet Taylor: Their Correspondence [i.e. Friendship] and Subsequent Marriage. Univ. of Chicago Press. -> An errata slip indicates the correct title.
MACAULAY, THOMAS B. (1829) 1898 Mill on Government. Volume 7, pages 327-371 in Thomas B. Macaulay, The Works of Lord Macaulay. Albany ed. London: Longmans.
MILL, JAMES (1820) 1955 Essay on Government. New York: Liberal Arts Press.
MYINT, HLA 1948 Theories of Economic Welfare. Cambridge, Mass.: Harvard Univ. Press.
MYINT, HLA 1958 The "Classical Theory" of International Trade and Underdeveloped Countries. Economic Journal 68:317-337.
PACKE, MICHAEL ST. JOHN 1954 The Life of John Stuart Mill. London: Secker & Warburg.
SCHUMPETER, JOSEPH A. (1954) 1960 History of Economic Analysis. Edited by E. B. Schumpeter. New York: Oxford Univ. Press.
STEPHEN, LESLIE (1900) 1950 The English Utilitarians. London School of Economics and Political Science, Series of Reprints of Scarce Works on Political Economy, Nos. 9-11. 3 vols. London School of Economics and Political Science; Gloucester, Mass.: Smith. -> A sequel to the author's History of English Thought in the Eighteenth Century. A detailed study of Bentham and the two Mills.
TAYLOR, OVERTON H. 1960 A History of Economic Thought: Social Ideals and Economic Theories From Quesnay to Keynes. New York: McGraw-Hill.
VINER, JACOB 1937 Studies in the Theory of International Trade. New York: Harper.
MILLAR, JOHN
John Millar (1735-1801), professor of civil law at the University of Glasgow from 1761 to 1801 and author of The Origin of the Distinction of Ranks and of An Historical View of the English Government, was born in the manse of the Kirk o' Shotts, Lanarkshire, Scotland. He was educated by an uncle and in the grammar school at Hamilton, where his father had by then been transferred. At the early age of 11 he entered Glasgow College, intended by his father for the Christian ministry. Five years later, he attended the first course of lectures given at Glasgow by Adam Smith, who soon discovered his promising qualities and later recommended him for a tutorship to the family of the distinguished jurist Lord Kames. At the age of 25, just "passed advocate," he was appointed, on the recommendations of Smith and Kames, to the crown chair of civil law at Glasgow—a post he filled until his death. Here he lectured regularly on civil or Roman law (following the institutes and pandects of Justinian), on what he called "public law" or "the principles of government," and in alternate years on Scots and later also on English law.
The brilliance of his lectures soon attracted students from far and wide and gave great luster to a chair that had, at times, become almost defunct. Among his students were many who later came to hold places of the highest distinction at the bar, on the bench, in legal scholarship, in Parliament, and in the royal councils at Westminster. His approach to the law was characterized by a comparative and historical, and in a sense sociological, orientation, by a lively attempt to reveal the relation of both law in general and the specific provisions of the law to the realities of everyday human experience, and by an unflagging effort to make law a genuine university subject rather than merely to furnish materials for the manual of the practitioner (Lehmann 1960, p. 48). This was true of his lectures on civil and municipal law as well as on public law or government. Millar was, however, more than a teacher of law: he was first of all a teacher of youth. In line with the democratic, pragmatic, and broadly national aims of Scottish higher education in general, he always tried to make knowledge a vital thing in the lives of his students and a challenge to both their intellectual curiosity and their sense of moral responsibility in the affairs of the political community. "No individual, indeed, ever did more," Francis Jeffrey observed in the Edinburgh Review, "to break down the old and unfortunate distinction between the wisdom of the academician and the wisdom of the man of the world" (1806, p. 87). Jeffrey considered the informality of Millar's lectures—his "academic undress," as it were, his afterclass and domestic hearthside discussions with the more promising of his students—to be an educational innovation of the highest order. John Rae, in the Life of Adam Smith, saw not indeed in the master but in his pupil, Millar, "the most effective and influential apostle of Liberalism in Scotland in that age" (1895, pp. 53-54). 
In 1771, ten years after he began to lecture, Millar published his Observations Concerning the Distinction of Ranks in Society, entitled in the third and fourth editions The Origin of the Distinction of Ranks, with the revealing subtitle An Inquiry Into the Circumstances Which Give Rise to Influence and Authority, in the Different Members of Society. Sixteen years later (1787) he published An Historical View of the English Government. Both books brought him wide acclaim. The former is essentially a historico-sociological analysis of social and political institutions from an evolutionary standpoint, focusing particularly on the relative position and the relations of the sexes in the various stages of civilizational development,
on the rise of chieftaincy and of monarchical government, and on the "changes produced in the government of a people by their progress in arts, and in polished manners" ([1771] 1960, p. 284). The latter book, somewhat more mature in scholarship and more purely historical in orientation, was dedicated to Charles Fox, whom Millar greatly admired. Millar himself viewed it as a "constitutional history of England," tracing its development from early Saxon institutions to the Norman Conquest; then, with the development of feudalism, to the end of the Tudor period; then to the Stuart period, with its struggles over the royal prerogative, and ending with the Revolution Settlement of 1688, which he considered the highest point in the development of British liberties; and, finally, to his own time. His own period he viewed as characterized by the growth of "the secret influence of the crown" and thus by a re-encroachment of the royal prerogative upon the legislative branch of the government, which was dangerous to established liberties. At the same time he saw that "rapid improvements of arts and manufactures . . . produced a degree of wealth and affluence, which diffused a feeling of independence and a high spirit of liberty, through the great body of the people" ([1787] 1818, vol. 4, p. 100). This work was often cited by James Wilson, a principal draftsman of the American constitution, in his lectures on law in 1790-1791, at what was later to become the University of Pennsylvania; it was one of the textbooks used by the elder Mill in the rigorous education of the young John Stuart Mill, who greatly preferred it to Hallam's Constitutional History of England. Both of Millar's major works are characterized by a pervasive attempt to trace causes and effects in historical phenomena and by a strong emphasis upon the influence that economic factors have in shaping social and political institutions.
Because of this stress on economic factors, some have seen in Millar's work a marked anticipation of Marx's historical materialism. Perhaps it would be fairer to see in it, with A. L. Macfie of Glasgow (1961), both a further development of the thought of Adam Smith, with differences in emphasis, and an important bridge between eighteenth and nineteenth century social thinking in general.
WILLIAM C. LEHMANN
WORKS BY MILLAR
(1771) 1960 The Origin of the Distinction of Ranks. Pages 165-322 in William C. Lehmann, John Millar of Glasgow: His Life, Thought and His Contributions to Sociological Analysis. Cambridge Univ. Press. -> First published as Observations Concerning the Distinction of Ranks in Society.
(1787) 1818 An Historical View of the English Government, From the Settlement of the Saxons in Britain to the Revolution in 1688: To Which Are Subjoined Some Dissertations Connected With the History of the Government From the Revolution to the Present Time. 4 vols. 4th ed. London: Mawman.
1796 Letters of Crito, on the Causes, Objects, and Consequences, of the Present War. Edinburgh: Johnstone.
SUPPLEMENTARY BIBLIOGRAPHY
CRAIG, JOHN 1806 Account of the Life and Writings of John Millar, Esq. Pages i-cxxxiv in John Millar, The Origin of the Distinction of Ranks: Or, an Inquiry Into the Circumstances Which Give Rise to Influence and Authority, in the Different Members of Society. 4th ed. Edinburgh: Blackwood.
FORBES, DUNCAN 1954 "Scientific" Whiggism: Adam Smith and John Millar. Cambridge Journal 7:643-670.
JEFFREY, FRANCIS 1806 [Review of] The Origin of the Distinction of Ranks: Or, an Inquiry Into the Circumstances Which Give Rise to Influence and Authority, in the Different Members of Society, by John Millar. Edinburgh Review 9:83-92.
LEHMANN, WILLIAM C. 1960 John Millar of Glasgow: His Life, Thought and His Contributions to Sociological Analysis. Cambridge Univ. Press. -> Includes a bibliography of Millar's works on pages 417-418.
MACFIE, A. L. 1961 John Millar: A Bridge Between Adam Smith and Nineteenth Century Social Thinkers? Scottish Journal of Political Economy 8:200-210.
RAE, JOHN 1895 The Life of Adam Smith. London: Macmillan.
MILLENARISM
The Latin term millennium and its Greek equivalent, chilias, literally mean a period of a thousand years. According to the millenarian tradition, which is based on Jewish apocalyptic literature and the Revelations of St. John, Christ will reappear in the guise of a warrior, vanquish the devil, and hold him prisoner. He will then build the Kingdom of God and reign in person for a thousand years. Those saints who remained steadfast and gave their lives for their faith shall be raised from the dead and serve as his royal priesthood. At the end of this period Satan will be let loose again for a short while and will be finally destroyed. The victory will be followed by the general resurrection of the dead, the last judgment, and final redemption. The term "millenarian" (or "chiliastic") is now used not in its specific and limited historical sense but typologically, to characterize religious movements that expect imminent, total, ultimate, this-worldly, collective salvation. Used thus, the term applies to a wide range of movements. The millenarian tradition developed originally in Persian Zoroastrianism, and above all in Judaism, whence it was transmitted to Christianity and Islam (Mowinckel 1951). Millenarism has its roots
in the messianic hopes and visions of the later days of prophetic Judaism (Klausner 1909). Belief in final redemption and expectation of the Messiah became firmly established tenets of Judaism. Messianism was a living force in Jewish history and gave rise to numerous popular movements. Many of these movements were of only local and passing importance, yet some of them had a very widespread appeal and left a lasting imprint. The most noteworthy of these movements are the Judean Desert Sect, which is a crucial link between Judaism and Christianity (The Scroll of the War . . .) and the seventeenth-century Sabbatean movement, which spread in most countries of the Diaspora and continued to exert considerable influence on Jewish communities even after its downfall (Scholem 1957). Christianity derived its initial elan from radical millenarism. It is by its very name a form of messianism: the term Christos, or Christ, is a Greek translation of the Hebrew term mashiah. The most important aspects of the development of the messianic doctrine in Christianity are mythologization of the figure of the Messiah, universalization of the concept of redemption, and elaboration of the "suffering servant" motif. Jesus is conceived to be the incarnation of God and not just a God-ordained representative of the divine. In Christianity the conception of the golden age becomes transnational and metapolitical. The image of the Messiah as a king, warrior, or judge does not disappear, but it is overshadowed by the image of the suffering Messiah who redeems humanity by his tribulations and cruel death. Millenarism was preserved in the Western church and was part of orthodoxy till the end of the fourth century. The change in the political position of the church, the penetration of Greek ideas, and the influence of Augustine led to its downfall. In the Concilium of Ephesus in 431 millenarism was denounced as error and fantasy and barred from official theology. 
The most important among the numerous heretical millenarian movements which developed within the aegis of the Catholic church during the Middle Ages are the movements which emerged during the crusades, the movements inspired by the ideas of Joachim de Fioris, who lived in the twelfth century, and the Cathari, or Universalists, who were prevalent in the south of France and went by the name of Albigenses and Waldenses (Cohn 1957). Among the movements of the Reformation we find the Taborites, who were the extreme wing of the Hussites (Werner 1960), the Adamites, the Moravians, the followers of Münzer who joined the
revolt of the German peasants against their landlords and made an attempt to establish the Kingdom of God on earth, and most important of all, the Anabaptists (Smithson 1935). In England we find the Fifth Monarchy Men, who during the days of Cromwell established the Parliament of Saints. The development in the Protestant church parallels that of the Greek Orthodox church and the Roman Catholic church. The German and Swiss reformers believed at first that final redemption was imminent, but this millenarian expectation was abandoned. Nevertheless, millenarism found its way into the churches of the Reformation through the influence of apocalyptic mysticism and Anabaptism. This influence is most noticeable in the reformed sects and in Pietism, but it left its mark on the Lutheran church as well. [See CHRISTIANITY.] In Islam the millenarian tradition has developed under the name of Mahdism. The idea of final redemption was alien to Muhammad and his original followers. It was conceived only under Jewish and Christian influences during the civil wars and religious controversies attending the rise of the dynasty of the Ommaiades during the second half of the seventh century. The subsequent development of the caliphate and the decline of Muslim piety and power evoked a belief in the golden age of Islam and a longing for its restitution. A belief emerged that when injustice reached its acme the Mahdi (i.e., the rightly directed one), who was identified as a descendant of the prophet or as Isa (i.e., Jesus), would restore ancient glory and open a reign of abundance and justice. The theory of Mahdism has not been generally accepted in the Sunna and is not a fundamental dogma of orthodox Islam. It has, however, become a central idea of the Shi'ites, who have remained faithful to Ali, the son-in-law of Muhammad, and his descendants (Donaldson 1933). 
Millenarian Shi'ite sects were often constituted according to the belief in a particular descendant of Ali who was expected to emerge out of hiding as the deliverer. The most important Mahdi movements were those started by al-Mahdi Ubaidallah, who founded the dynasty of the Fatimites, Mohammed ibn-Tumart, a Berber of north Africa, and Mohammed Ahmed ibn Seyyid Abdullah, the Mahdi of Sudan (Holt 1958). [See ISLAM.] The contact between primitive and modern societies and the processes of cultural interpenetration and assimilation have given rise to many millenarian movements in all developing countries. Prominent among these movements are the Ghost Dance movement of the North American Indians (Mooney 1896), the messianic movements of South
America (Metraux 1941; Ribeiro 1962), the "cargo" cults of the South Pacific (Worsley 1957; Burridge 1961; Lawrence 1964), and the numerous millenarian movements in Africa (Sundkler 1948; Balandier 1955; Price & Shepperson 1958). There have been a number of important millenarian movements in modern societies as well. Most prominent among them are the sectlike Christadelphians (Wilson 1961), who expect a world-wide theocracy with Jerusalem at its center, the radical and proselytizing Jehovah's Witnesses (Stroup 1945; Pike 1954), and the Seventh Day Adventists (Froom 1946-1954). There is a strong millenarian element in Mormonism (O'Dea 1957). The extremist millenarian Black Muslim movement, which has developed among the American Negroes, views itself as anchored in the tradition of Islam (Essien-Udom 1962). Each of the movements listed above has its unique, irreducible particularity and distinctiveness, yet they all manifest a set of common characteristics. Although the major recurrent themes appear in different constellations and there is considerable intratype variation, the basic pattern is reproduced in each of them.
Characteristics of millenarian movements
We have defined millenarism as the quest for total, imminent, ultimate, this-worldly, collective salvation. The terms of this definition require elucidation (see Mühlmann 1961; Sierksma 1961; Y. Talmon 1962; Thrupp 1962; Lanternari 1960). The millenarian conception of salvation is total in the sense that the new dispensation will bring about not mere improvement but a complete transformation and perfection itself. Millenarian movements also view the impending redemption as ultimate and irrevocable. Time is conceived of as a process that leads to a final future. Millenarism is a merger between a historical and a nonhistorical conception of time. Salvation is viewed as imminent. The millennium is close at hand, and the believers live in tense expectation and preparation for it.
Millenarism assumes that history has its predetermined, underlying plan, which is being carried to its completion, and that this predestined denouement is due in the near future. The millennial view of salvation is, in addition, revolutionary and catastrophic. Millenarism is dominated by a sense of deepening crisis that can be resolved only by ultimate salvation. Another important element of millenarism is its terrestrial, this-worldly orientation. Its view of the divine is transcendent and immanent at the same time. The heavenly city is to appear on earth. Thus,
the notion of perfect time is accompanied by the notion of perfect space. Yet another major characteristic of millenarism is its collective orientation. Salvation is to be enjoyed by the faithful as a group. The aim of millenarism is not only the salvation of individual souls but the erection of a heavenly city for the chosen people, or the elect. The millenarian message may be directed to an already existing group, or it may call for the formation of a new group. Directly related to the collective orientation is the basic dualism of millenarism. A fundamental division separates the followers from nonfollowers. History is viewed as a struggle between saints and satans or, to use the terms coined by the millenarian Judean Desert Sect, as the "war between the sons of light and the sons of darkness." Millenarian movements tend to be ecstatic. In most movements the ritual involves wild and often frenzied emotional display. In many millenarian movements we encounter hysterical and paranoid phenomena, mass possession, trances, and fantasies; in others, ecstatic dance figures prominently. Closely related to these phenomena are the antinomian tendencies, which appear in many guises. In some movements the antinomian element is moderate and mild, in others explicit and radical. Many millenarian movements deliberately break accepted taboos and overthrow hallowed norms. Sexual aberrations and excesses and unbridled expressions of aggression are very common. Sometimes aggression is turned inward; the members may destroy their own property and even commit mass suicide. The clearest example of the antinomian element inherent in millenarism is the doctrine of the "holiness of sin" developed by the Sabbatean movement after the apostasy of the Messiah (Scholem 1957). The majority of millenarian movements are messianic (see Kroef 1952; Metraux 1941; Banton 1963). Salvation is brought about by a redeemer, who is a mediator between the human and the divine.
Another important mediator between the divine and the movement is the leader. Leadership tends to be charismatic. The intense and total commitment required by millenarism is summoned forth by leaders who are considered to be set apart from ordinary men and endowed with supernatural power. Often there is also not just one charismatic leader but a multiple leadership. First, we find in a number of instances a division of leadership between the inspired prophet and the organizer who is concerned with practical matters. A second less prevalent line of bifurcation is the differentiation between the internal leader, who operates within
the movement, and the external leader, who represents it in its relations with the outside world. Organizationally, millenarian movements vary from the amorphous and ephemeral movement, with a cohesive core of leaders and ardent believers and a large ill-defined body of followers, to the fairly stable, segregated, and exclusive sectlike group. The organizational form of a more or less ephemeral movement is, however, more typical. This is no doubt closely related to the nature of the millenarian message. The promise of an imminent and total redemption awakes grandiose hopes and sweeps a large number of followers into the movement. However, its source of strength is also its source of weakness: by promising an imminent delivery and often even fixing a definite date, it brings about its own downfall. When the appointed day or period passes without any spectacular happenings or without the right apocalyptic events, the movement faces a serious crisis that often disrupts it or even breaks it up completely. The crisis of nonmaterialization of the millennium is a severe one, but it need not always lead to disruption. In some cases the failures of prophecy have not caused disaffection or immediate disintegration (Festinger et al. 1956). Indeed, there are many cases of persistent recrudescence in spite of repeated failure. Radical millenarism became an ever-present though periodically dormant force in Andalusia for more than seventy years during the nineteenth century (Hobsbawm 1959). It suffered reversal after reversal, yet flared up repeatedly. The recurrent revival of the movement follows an almost cyclical pattern; the millennial outbursts follow one another at approximately ten-year intervals. Similar, though not as cyclical, patterns of disruption and revival can be found in studies of the medieval period, as well as in the literature on Melanesia and Africa. Sometimes there is a hidden continuity between the different phases of the movement. 
When the millenarian movement suffers a reverse, it goes under cover. It remains underground until it sees a better chance for its struggle, repeatedly hiding or going out into the open, but retaining its radical millenarism. It should be stressed, however, that often there is hardly any direct connection between what may seem to be recurrent phases of the self-same movement. Continuation of similar conditions often breeds similar yet independent reactions. In many cases there is no direct influence or any continuity of either tradition or personnel between successive movements (Guiart & Worsley 1958). An alternative reaction to nonactualization is the switch from a short-range, radical millenarism to a
long-range and more or less attenuated version of it. When the future becomes past and there is no fulfillment, the Endzeit is moved into the past and integrated into the present as a new Urzeit. Final redemption is either postponed to a more distant future or spiritualized. Thus, the millenarian dynamism solidifies into a new institutionalized religion. The histories of Zoroastrianism and of Christianity provide the best examples of this developmental pattern. The institutionalization of the Bahai movement and the gradual attenuation of its initial millenarism is another case in point (Berger 1957).
Dimensions of differentiation
So far we have underlined the main common characteristics of millenarian movements and only hinted at internal differentiation. Comparative analysis of millenarian movements is at its inception, and attempts to construct a systematic typology are partial and not very satisfactory (for example, see Mair 1959; Smith et al. 1959; Köbben 1960; Shepperson 1962; Wilson 1963). The following seem to be the major dimensions of intratype differentiation from the religious and sociological points of view. (1) Millenarism combines a historical and a mythical time conception. The consciousness of time as a linear process of change, as a sequence of once-and-for-all events of unique character and particularity, is intertwined with the consciousness of time as cyclical and endlessly repetitive. Millenarism is in most cases posthistorical, in the sense that it is an outcome of a breakdown of historical consciousness, a flight from history to a mythical Endzeit. The historical perspective does not disappear. It is usually retained in an elaborate temporal scheme, in which a semihistorical or historical epoch ranges between the Urzeit and Endzeit. It should be noted, however, that in quite a number of cases the millennial conception is postmythical rather than posthistorical.
The breakdown of the world view anchored in the metahistorical beginning leads to the displacement of the Urzeit and to its projection into the metahistorical future. There is in this type of millenarism a vague notion of time as duration and change, as well as recognition of a short semihistorical interim period, but the cyclical paradigmatic time conception of myth predominates. (2) Millenarism combines the notion of perfect time with the notion of perfect space. The major emphasis may be on the notion of perfect time, in which case location in a specific place is subsidiary or in certain cases even nonexistent. The spatial element may, however, be crucial. The Jewish conception of redemption is clearly localized: the return to the Promised Land and the rebuilding of Zion are an integral part of it. (3) The millenarian process is two-phased: redemption is preceded by a premillennial catastrophe. The major emphasis may be on the preparatory struggle; in this case the tribulations of the period of breakdown are described in elaborate detail, and the fear of doom and hatred of the adversary are more prominent than hope and love. On the other hand, the dominant emphasis may be on redemption, and the catastrophe may be viewed as just a short prelude to eternal bliss. While the majority of millenarian movements combine catastrophe and redemption, in a number of cases one appears without the other. (4) Millenarism usually involves messianism, but the two do not necessarily coincide. Expectation of a human-divine savior is not always accompanied by expectation of total and final redemption. Conversely, expectation of the millennium does not always involve the mediation of a messiah. Redemption is in certain cases brought about directly by the divine. (5) Millenarism involves both inclusion and exclusion: there are always God's people within and the ungodly without. The divinely appointed group may be singled out on an ascriptive and particularist basis. Only those who belong (to the race, the ethnic group, the nation) will be redeemed and enjoy the new, happy life. The basis of selection may also be elective and universalist. The message is directed to the whole of mankind; everyone who will repent and who qualifies religiously and morally will be saved. The main emphasis may be either exclusive or inclusive. (6) While expectation of imminent redemption is a constitutive element in millenarism, there is a certain range of variation in this respect. There are movements that are swept by a very strong sense of the immediacy and urgency of redemption. They set a very close date for the coming of the millennium or expect it any day.
Other movements view the millennium as approaching and close at hand, yet not immediate. (7) Millenarism is a future-oriented religious ideology. However, while its attitude to the present is outrightly and radically negative, there is considerable variation with respect to its orientation to the past. There are millenarian movements that are predominantly restorative. Their aim is a revival and revitalization of the indigenous culture, and their view of the future is largely traditional. Far more common, however, are predominantly innovative movements (Linton 1943). There is a
strong antitraditional component in millenarism. Essentially millenarism is a bridge between past and future. There are many antitraditional elements in predominantly restorative movements. Some of the traditional myths and practices become symbols of the old order and acquire a new meaning and an exaggerated significance that they never enjoyed before. There is an ongoing process of selection and reinterpretation. To turn to the other pole, even the most antitraditional version of millenarism is, in fact, a synthesis of the external and the indigenous, of the new and the old. The strong antipast orientation of the innovative movements is mitigated when the millennium is envisaged as a return to a mythical golden age. Inasmuch as the millennium is regarded as "paradise regained," those elements of tradition that are viewed as embedded in it become also components of the new order. By establishing a connection between the metahistorical Urzeit and the metahistorical Endzeit, the millenarian movement can be radically change-oriented yet incorporate traditional elements in its view of the final future. Millenarian movements are thus both restorative and innovative. Classification of a given movement from the point of view of this dimension involves careful weighing of traditional versus nontraditional elements. (8) Millenarism usually evokes extreme dedication and fervor. In the majority of cases this fervor is accompanied by abandonment of self-control and expressed in enthusiastic ritual, violent motion, and antinomian acts. However, in a minority of cases we encounter the direct opposite: religious fervor manifests itself in excessive self-discipline, stringent observation of rules, and extreme asceticism. The Black Muslims, for instance, insist on strict order and decorum; they prohibit any excess and any expression of religious enthusiasm. 
(9) Another important dimension of differentiation is the definition of the role of the movement in bringing about the advent. There are many variations in this respect. Movements range from the fairly passive and nonviolent, on the one hand, to the extremely activist and aggressive, on the other. There are certain elements in the millenarian ideology that work against an outrightly active definition of the role of the follower. Salvation is preordained and inevitable. Thus, the followers are not makers of the revolution; they expect it to be brought about miraculously from above. Ultimately, initiative and actual power to bring about change rest with divine powers. All millenarian movements share a fundamental vagueness about the actual way in which the new order will be
brought about, expecting it to happen somehow by divine intervention. It should be noted, however, that there is a strong militant ingredient in the millenarian ideology that more often than not outweighs the passive and pacifist elements in it. The assurance of operating in accordance with the predetermined divine plan and the passionate confidence in ultimate triumph may encourage heightened activity rather than passivity. Since the millennial view of redemption is both transcendent and terrestrial, paving the way for this redemption is usually not confined to the employment of ritual measures. Joining the movements affects participation and activity in the secular sphere as well. Total rejection of the social order leads in many cases to radical withdrawal and noncooperation. Cessation of economic activity, political nonparticipation, conscientious objection with regard to service in the army, strict segregation, and wholesale migration are frequent concomitants of millenarism. An alternative and equally prevalent reaction is active revolt. Radical negation of the social order engenders, in many cases, open aggression and violence. Preparation for the future struggle often entails the introduction of military training for all members or the setting up of a selective secret military organization. Münzer's Elect, Joseph Smith's Apostolic Corps, and The Fruit of Islam organized by the Black Muslims are cases in point. There are numerous cases of eruption of violence: members of millenarian movements have swept over the country, devastating, burning, and massacring on their way. We also encounter many cases of planned and concerted assaults on the established authorities. Movements that have an essentially ritual and passive conception of their role are often pushed to active revolt by the inner dynamics of their millenarian position and as a result of persecution by the authorities.
Conditions of development
What are the conditions that account for the emergence and continuance of millenarian movements, and in which social groups are they anchored? By and large the data support the hypothesis that millenarism is the religion of deprived groups: the lower social strata and oppressed and persecuted minorities (Mannheim 1929-1931). It is usually engendered by severe and protracted suffering. At the root of it we often find multiple deprivation, that is, the combined effect of poverty, low status, and powerlessness. The effect of multiple deprivation accounts for the prominence of members of pariah groups and pariah people
among the promulgators and followers of millenarism (Weber 1920-1921; Troeltsch 1912; Mühlmann 1961). The low status of such groups derives from their despised ethnic origin and cultural tradition and from their limitation to menial and degrading occupations. Being at the bottom on so many counts, they are attracted to the myth of the elect and to the fantasy of reversal of roles, which are important elements in the millenarian ideology. Millenarism flares up, in many cases, as a reaction to cumulative deterioration of life conditions and as a result of awareness of prospects for further decline in the future. We note also the precipitating effect of sudden and dramatic crises that aggravate endemic deprivation and at the same time symbolize and highlight it. Many of the outbursts of millenarism have taken place against a background of disaster: plagues, devastating fires, recurrent long droughts, economic slumps that caused widespread unemployment and poverty, and calamitous wars. Deprivation, frustration, and isolation. The hypothesis of acute multiple deprivation provides an important clue. Yet, as it stands, it does not fully account for the emergence and development of millenarism and requires considerable modification and amplification. First, it should be noted that the predisposing factor is, in quite a number of cases, not severe hardship but a markedly uneven relation between expectations and the means of their satisfaction (Aberle 1962). In many cases it is predominantly the inability to fulfill traditional expectations. In medieval Europe millenarism affected mainly people who were cut off from the traditional order and were unable to satisfy wants instilled in them by it.
The insidious onslaught of the developing capitalistic order on a backward and isolated peasant economy created the same basic difficulty in Spain and Italy centuries later, although there it affected not only people who were cut off from the rural community but also the rural community itself (Hobsbawm 1959). We encounter the same type of frustration in primitive societies as well, but there it increasingly becomes not so much a problem of the lack of means to supply traditional wants as the development of a set of new expectations. The encounter with modern societies engenders enormously inflated expectations, without a concomitant and adequate development of institutional means for their fulfillment. This discrepancy creates a void that is often bridged by millenarian hope. That frustration may be much more important than actual hardship becomes evident when
we consider the fact that millenarian unrest in certain parts of New Guinea was not caused by any direct contact with the white men. Although there were hardly any changes in the status quo, indirect contacts and impact by hearsay brought about changed expectations and acute frustration. It should be stressed that in many cases millenarian outbursts were caused not by a deterioration of conditions but by a limited amelioration that raised new hopes and new expectations but left them largely unfulfilled. The incongruity between ends and means is not the only source of frustration. Much of the deep dissatisfaction stems from incongruities and difficulties in the realm of regulation of ends. Rapid social change and encounters with radically different systems of values result in more or less severe cultural disintegration and disorientation. The impinging cultural influences penetrate into the traditional setting and undermine the effectiveness of traditional norms as guides of action. Even central traditional values cease to be self-evident and sacred. Inasmuch as these traditional values are internalized and are an integral part of personal identity, the disintegration of the traditional system results in serious self-alienation. When the alien culture is that of a more prestigious upper class or that of a colonial ruling class, it is often (willingly or unwillingly, consciously or unconsciously) acknowledged as superior. This engenders a nagging feeling of inferiority and even self-hatred. The effect of the incongruity between the indigenous and external influences is aggravated by the discrepancies between the values and policies of different external agencies. In most colonial countries there is constant conflict between the government, the traders, and the missions, as well as open and often bitter rivalry between the different missions.
There are, in addition, inner contradictions and inconsistencies between different elements of religious doctrine and a split between religious ideals and reality. Since conflicting claims tend to neutralize and annul each other, the impinging influences weaken and destroy the traditional system without substituting a new system of values. Millenarism is often born out of the search for a tolerably coherent system of values, a new cultural identity, and a regained sense of dignity and self-respect (see Werblowsky 1965; Burridge 1961). Another important factor operative in the emergence of millenarism is social isolation brought about by the disruption of traditional group ties. Analysis of the medieval material indicates that
millenarism did not appeal much to people who were firmly embedded in well-integrated kinship groupings and effectively organized and protected in cohesive local communities. The people most exposed to the new pressures and therefore more prone to millenarian heresy were the malintegrated and isolated who could find no assured and recognized place in cohesive primary groups. Comparative historical analysis has underlined the important contribution of migrant groups and itinerant workers to the development and spread of millenarism. The strains of transition. It is significant that millenarism occurs mainly in periods of transition. Millenarian movements in primitive societies provide the clearest proof of this hypothesis. Millenarism usually does not appear in areas largely untouched by modernization, and it appears only rarely in areas in which modernization has reached an advanced stage. It occurs mainly during the intermediate stages. This has given rise to the hypothesis that millenarism in primitive societies is a "half-way" or "quarter-way" phenomenon (Belshaw 1950). While it is difficult to specify exactly at which point along the line millenarism begins or ceases to be feasible, the basic hypothesis that views it as a concomitant of transition is corroborated in other settings as well. In modern societies we find that those who have undergone the double transition of intercountry and intracountry migration and are both new immigrants and new urbanites are particularly prone to millenarism. Millenarian movements have proliferated during the transition between the premodern and the modern way of life in rural Spain and Italy. Millenarian outbursts abounded toward the end of the Middle Ages and the beginning of modern times. The Judaeo-Christian formulation of millenarism developed during the stormy period that preceded the destruction of the Second Temple. The frustration, disorientation, and disruption engendered by these upheavals are the crux of the matter.
Millenarism and political helplessness. Even the combination of such factors as deprivation, frustration, and isolation does not supply us with an adequate answer to our question. The most important contribution of recent studies of millenarism to this analysis lies in their insistence that millenarism is essentially a prepolitical, nonpolitical and postpolitical phenomenon (Worsley 1957). Among primitive societies it appears mainly in so-called stateless segmentary societies, which have rudimentary political institutions or lack any specialized political institutions altogether [see STATELESS SOCIETY]. When it appears in societies
with fairly developed or well-developed political institutions, it appeals mainly to strata that are politically passive and have no experience of political organization and no access to political power. Instances of such "nonpolitical" strata in societies with a more or less developed political structure are the peasants in feudal societies, the peasants in isolated and backward areas in modern societies, marginal and politically passive elements in the working class, recent immigrants, and malintegrated and politically inarticulate minority groups. Sometimes millenarism is "postpolitical," appearing after the downfall of a fairly developed political system. The collapse of an entire political system by a crushing defeat and the shattering of tribal or national hopes have sometimes led to widespread millenarism. It is the sense of blockage—the lack of effective organization, the absence of regular institutionalized ways of voicing their grievances and pressing their claims—that pushes such groups to a millenarian solution. Not being able to cope with their difficulties through concerted political action, they turn to millenarism. Millenarism is born out of great distress coupled with political helplessness. The effect of the various predisposing economic and social factors is further clarified when we examine more closely the sources of recruitment to millenarian movements. The hypothesis that millenarism is a religious ideology of lower strata is based on an assumption that it is a concomitant of social and economic differentiation and is a manifestation of class society. Examination of the data indicates that this is true in most but not all cases. Millenarism is not confined to stratified societies. In quite a number of cases, it is the reaction of a largely undifferentiated primitive society to the unsettling impact of social change. Primitive societies undergo only gradual, almost imperceptible, social change. 
The dominant time dimension is the mythical past; life in the present is experienced as a repetition of the paradigmatic events of the Urzeit. The idea of Endzeit is either nonexistent or marginal. Swift and radical change disrupts this repetitive rhythm and transforms life conditions. The cosmic and social orders can no longer be grounded in the mythical beginning, and so the major emphasis shifts to the mythical future. The image of the future age of bliss may be largely an extrapolated replica of the former image of a past golden age. It may, on the other hand, be change-oriented and partly independent of this image of the mythical past. The main predisposing factor in such cases is the loss of anchorage in the life-giving myth of the Urzeit, and this loss affects society as a whole. Millenarism of this type is rooted in the
dilemma of stability and disruptive change and not so much in a polarization of underprivileged and overprivileged strata. This problem of breakdown of continuity is of central importance also in the emergence of "posthistorical" millenarism. When we center our attention on stratified societies, we find that underprivileged groups predominate but do not have a complete monopoly. At one time or another millenarism has found support in all levels of society. There is, for instance, a distinctly middle-class element in British millenarism. It is true that such groups as those which built their hopes on Mother Ann Lee of Manchester were usually of humbler origin and that, from the days of Wesley through the initial period of the Salvation Army to the present-day frequenters of Kingdom Halls, the poor were in the majority. However, in most movements we find members, and especially leaders, of middle-class origin. There is even one distinctly middle-class movement: there were few, if any, underprivileged elements in the affluent "Irvingite" Catholic Apostolic church that developed in the middle of the nineteenth century in England (Shaw 1946; Taylor 1958). It is also significant that adherents of millenarian movements are not always the worst off among the underprivileged. Those members of the deprived group who are somewhat better off are often better able to take stock of their situation, to react, and reorganize. The upper strata of a minority group or the indigenous aristocracy of a colonial country may identify with the dominant group in the society. They may, on the other hand, identify with their own membership group and want to share its destiny. An indiscriminate invidious evaluation of all members of the underprivileged group and the existence of an insurmountable barrier between it and the dominant group strengthen the solidarity of the underprivileged group and blur internal status differentiation. 
The tendency of members of the upper strata of deprived groups to join and lead millenarian protest movements is enhanced if their traditional status is threatened and bypassed. Many studies underline the prominence of members of a frustrated secondary elite among the leaders of millenarian movements (see, for instance, Cohn 1957; Katz 1961). Many of the leaders of the medieval movements were members of the lower clergy who, for one reason or another, decided to turn their backs on the church; Thomas Münzer is the most famous example of such men.
Religious predispositions to millenarism. So far we have dealt mainly with the economic and social factors. That the combination of all the predisposing factors will actually lead to millenarism and not result in the development of other types of religious ideology is conditioned also by the type of religious beliefs that are prevalent in a society. The yearning for an earthly paradise and for final salvation is very widespread, and millenarian elements appear in most religions. It should be stressed, however, that certain types of religions are more conducive to millenarism than others. Clearly, religions in which history has no meaning whatsoever and religions which have a cyclical repetitive conception of time are not conducive to millenarism (Eliade 1949). Apocalyptic eschatology is essentially alien to religions of a philosophical and mystical cast that turn the eye of the believer toward eternity, where there is no movement and no process. This is certainly the case with some nature and cosmic religions that view the universe in terms of ever-recurring cycles of rise and decline. Another important factor operative in this sphere is a "this-worldly" emphasis. Religions with a radical, otherworldly orientation that put all the emphasis on the hereafter or on a purely spiritual and totally nonterrestrial salvation do not give rise to the vision of the Kingdom of God on earth. The myth of Kalki as an incarnation of Visnu in a period of abundance, as well as the doctrine of the future Buddha whose advent will bring a golden age, proves that even such basically nonmillenarian religions as Hinduism and Buddhism are not devoid of millenarian conceptions. It should be noted, however, that there is hardly any millenarian tradition in Hinduism and that it has not occupied an important place in Buddhism. It is mainly world views that are based on a notion of divine will working through history toward a preordained end which provide an overall scheme conducive to millenarism.
The majority of millenarian movements have appeared in countries that have had direct or indirect contact with the Judaeo-Christian messianic traditions. The Christian missions have been the most important agency for the worldwide diffusion of millenarism. Several fundamentalist sects and millenarian movements have played a particularly important role in this process. The Kitawala movement, which is an African offshoot of the Jehovah's Witnesses, is a case in point (Cunnison 1958). It should be noted, however, that millenarism has also appeared in cases where the main contact was with less apocalyptic versions of Christianity. In such cases millenarism is reinstated to a central position by a process of selection and reinterpretation. We should take into consideration the autochthonous religious concepts as well. Some primitive mythologies contain beliefs that are conducive to
millenarism, such as the expectation of the future return of the culture hero or the idea of the return of all the dead as a prelude to a millennial era. It should be stressed, however, that these themes appeared in a rather embryonic form in primitive mythology and did not occupy a particularly important position. They were developed, reinterpreted, and elaborated into full-fledged millenarian conceptions only under the impact of new situations and after contact with Christianity or Islam. The pre-existing primitive conceptions affected the development of millenarism in yet another way. The prevalence of millenarism in Melanesia and the importance of expectations of cargo in this view of the millennium are, it would seem, due to the almost exclusive emphasis that the indigenous religion puts on ritual activity oriented to the acquisition of material goods. The most important ideological starting point of millenarism may be a new importation; it may be the native tradition that exists of old; and, in a number of cases, it seems to be predominantly a largely independent reaction to the pressure of circumstances. Availability of pre-existing millenarian precepts and patterns facilitates the development of a full-fledged millenarian ideology and the organization of a millenarian movement. Such millenarian precepts may be dormant for a long time until activated by suitable circumstances and by crisis. The readily found millenarian representations are invested with the particularity and immediacy necessary to convert them into an effective ideology that serves as a basis for collective action. The Sabbatean movement. Comparative research underlines the close correspondence and interdependence between millenarism and economic and social conditions. At the same time, it indicates the potency and partial independence of the religious factor. 
The Sabbatean movement (so named after Sabbatai Zevi, a Jewish mystic of Smyrna, who in 1648 proclaimed himself Messiah) supplies us with clear proof of the inadequacy of a reductionist interpretation. In this respect the movement is a crucial case (Scholem 1941; 1957). It was preceded by two waves of unprecedented massacres and persecutions in Poland. Many thousands of Jews were slaughtered, and many more fled before the sword. Hundreds of communities were completely destroyed. Since the messianic movement erupted shortly after the massacres, it was assumed that it was a direct reaction to them. Examination of the differential appeal of messianism in different countries reveals, however, that the Sabbatean movement was not at its strongest in communities that bore the full
brunt of the disaster and was just as powerful, and in certain cases more powerful, in countries in which the Jews lived in comparative peace. The calamity contributed to the emergence of the movement by emphasizing the fundamental precariousness of Jewish existence and by enhancing the consciousness of exile, yet in and by itself it cannot account for the development and differential impact of the movement. Moreover, it is significant that messianism spread in prosperous and expanding communities just as in destitute and declining ones. Intracommunity differentiation affected recruitment more than intercommunity differentiation. Part of the established elite distrusted and rejected Sabbatai Zevi as Messiah, and the secondary elite was more active than the primary one. It should be noted, however, that the majority of the elite and upper strata joined the movement and were as enthusiastic as the mass of the people. We find among the adherents members of all strata of society, ranging from wealthy merchants, who offered to donate their entire fortune to the Messiah, to the poorest of the poor. The predominant predisposing factor that accounts for the deep and lasting impact and for the almost universal appeal of Sabbatai Zevi in all countries of the Diaspora was the very wide spread of the doctrines of Isaac Luria, the great Kabbalist teacher who died nearly a century before the Sabbatean movement reached its height. The aim of Luria and his followers was the restitution of cosmic harmony through the earthly medium of a spiritually elevated Judaism. Their doctrines laid far greater stress on the inner aspects of redemption than on its outward historical and political aspects; however, since they viewed liberation from the yoke of servitude and exile as a by-product of spiritual salvation and since they saw the coming of the Messiah as imminent, they engendered tense messianic expectations.
To the large circles of Lurianic devotees, the coming of Sabbatai Zevi was an actualization of the promise and prediction of the Kabbala; indeed, Sabbatai chose to proclaim himself Messiah in the year that the Kabbalists had calculated as the year of salvation. The antinomian deviations of Sabbateanism were anchored in the nontraditional elements in the mystical conception of redemption. The inner dynamics of the movement, and especially its transformation during its later phases, are unintelligible without a detailed and full analysis of the precepts and symbols of the Lurianic Kabbala. [See JUDAISM.] Causal analysis of millenarism. In concluding this causal analysis, it should be emphasized that the various predisposing factors are interrelated.
There is a low correlation between any one of them and the emergence of millenarism. It is only if we examine their intricate interplay and their combined effect that the results are more satisfactory. Moreover, to suggest that most millenarian movements arise in situations that have certain identifiable features in common is not to suggest that wherever such situations exist millenarian movements must inevitably arise. Inherent openness and indeterminacy remain even after we have considered all the major determinants. Examination of cases of occurrence, near-occurrence, and nonoccurrence, under basically similar conditions as far as degree of strain and structural and cultural conduciveness are concerned, indicates the considerable importance of historical accidents. Availability or nonavailability of leaders with strong suggestive powers, as well as occurrence or nonoccurrence of precipitating crises, affects the chances of the movement to emerge and develop. The variation in the reaction of the authorities to the movement's efforts to mobilize support is another important factor. Persistent and effective repression by the authorities may prevent the emergence of the movement or defeat and quench it soon after it appears. On the other hand, increased responsiveness and flexibility on the part of the authorities may open avenues of reform and thereby deflect the movement from its purpose. It is mainly when the authorities are not only unresponsive and inflexible but also somewhat ineffective, or at least permit some relaxation of control, that the millenarian movement has a chance to emerge and spread.
Functional analysis of millenarism
What are the consequences of millenarism? How does it serve the needs of the followers, and what does it contribute to the strata and societies in which it appears? We find two main, diametrically opposed, interpretations in the literature.
The first approach underlines the negative functions of millenarism and considers it as a dangerous collective madness (see, for instance, Cohn 1957). According to this viewpoint, millenarism is a paranoid fantasy, an outlet for extreme anxiety, and a delusion of despair. The megalomaniac view of oneself as wholly good and abominably persecuted, the attribution of demonic power to the adversary, the inability to accept the ineluctable limitations of human existence, as well as the excessive emotionality, the antinomian rituals, and the destructive activities, are all diagnosed as symptoms of mental illness. The millenarian ideology is considered as disruptive and destructive both from the
point of view of the movement and from that of the over-all society. The second approach rejects this negative evaluation of millenarism and underlines its positive functions (for the clearest expression of this viewpoint, see Worsley 1957). According to this view, the highly emotional and aggressive behavior is related to the revolutionary nature of the movement that strives to overthrow the old order and establish a new one. The severing of strong ties and the rejection of internalized norms demand an enormous effort and engender a deep sense of guilt, which causes much of the hysteria and the aggression. Many of the antinomian manifestations represent a deliberate overthrow of the accepted norms, not in order to throw morality overboard but in order to create a new brotherhood and a new morality. The "paranoid" manifestations are seen as stemming primarily from the contradictions inherent in the situation in which such movements appear and from the difficulties inherent in their revolutionary task rather than from the psychological aberrations of individual followers. If we take into consideration the social conditions and the cultural milieus that gave rise to these manifestations, they cease to be bizarre and fantastic and become fully understandable reactions. The promillenarian viewpoint emphasizes its underlying realism and its inherent, though hidden, rationality. This viewpoint considers millenarism to be integrative on all levels. First, the millenarian ideology supplies the believers with invaluable safeguards and supports. The predominant element in millenarism is inner certainty and hope, not despair. Adherents are assured of "being in on history." They are in the know and are working on the winning side. The movement fosters a new collective identity and engenders a feeling of belonging and a sense of purpose.
The promise that many of the first shall be last and the last first (Matthew 19.30) transforms inferiority into superiority and fosters self-confidence and a sense of ethical righteousness. The division of humanity into saints and devils enables the followers to focus and express their aggression and affirm the solidarity and integrity of their group. Vibrant expectation, pride, and hope lift them out of their apathy and bring about inner regeneration and rehabilitation. The positive functions of millenarism become even more evident on the social level. Millenarism is an emancipating, activating, and unifying force in hitherto stagnant, politically passive, and segregated groups. In recent and contemporary history
it has served as a precursor of political awakening and as a forerunner of political organization. Millenarism has played an important role in overcoming divisions and in joining previously isolated or even hostile groups together. The revolutionary nature of millenarism makes it a very potent agent of change. It demands a fundamental transformation and not just improvement and reform. The radical versions of millenarism incite followers to active anticipation of the advent and even to active revolt. It invests their struggle with the aura of a final cosmic drama and interprets present difficulties as signs of the beginning of the end. Every small success is viewed as proof of invincibility and as a portent of future triumph. Millenarism arouses truly great hopes and therefore can make equally great demands on its followers. By promising complete salvation, it is able to liberate formerly untapped energies and generate a supreme effort without which no major break with the existing order can be achieved. Thus millenarism helps to bring about a breakthrough to the future, and its special efficacy lies in its power to bridge future and past (Wallace 1956). Religion and politics. While bridging the gap between future and past, millenarism also connects religion and politics. Operating in societies or in strata completely dominated by religion, millenarism couches its political message in the familiar and powerful language and images of traditional religion, employing and revitalizing its age-old symbols. In such milieus recruitment to new political goals is often possible only when expressed in religious terms. In many cases it is also the only means of establishing cooperation between leaders and followers. Millenarism provides an important mechanism of recruitment of new leaders. It opens up new avenues of ascent and develops a set of new statuses. 
Although some of the new leaders derive their authority from their central or marginal position in the traditional order, more often than not their authority stems at least in part from their comparatively superior knowledge and greater experience in nontraditional spheres of activity and has no traditional legitimation. Millenarism helps these leaders to establish their authority. Millenarism is, according to this view, a connecting link between prepolitical and political movements; it facilitates the passage from premodern religious revolt to a full-fledged revolutionary movement. The process of transition from the one kind of movement to the other can actually be traced in both primitive and recent premodern movements. There are two main distinct avenues of transition. In some cases the movements gradually change
their nature, slowly becoming less ritualized and more secular in emphasis. They start to pay much more attention to purely political and economic goals, attach far more importance to strategy and tactics, and organize more effectively. Yet they do not sever their ties with their millenarian tradition, and they continue to derive much of their revolutionary zeal from its promise of final salvation.
Positive and negative evaluations. Assessment of the outcome of millenarism clearly reflects value premises. The two viewpoints on this matter stem at least in part from different ideological stands. The antimillenarian stream of research is gradualist and reformist, while the promillenarian stream is revolutionary and favors radical change. It should be noted, in addition, that the two viewpoints have emerged out of research in different historical and social settings. The positive evaluation grew mainly out of the research into those millenarian movements that were developed by rising groups at the upsurge of their efforts of emancipation. Such research deals mainly with movements that were precursors and concomitants of secular revolutionary action (see Tuveson 1949). These movements engender active change and leave their mark on the whole of society. Millenarism has, in fact, played an important role in all national and social liberation movements in premodern and modern Europe. It has also preceded and permeated many incipient nationalist and socialist movements in developing countries (Bastide 1961; Mühlmann 1961). The negative evaluation of millenarism is based mainly on the study of movements developed by doomed or declining groups. Such movements have served as alternatives to, rather than as precursors or as concomitants of, secular collective action, and they have had few lasting social consequences. For example, most medieval millenarian movements were ephemeral outbursts.
Since they had little chance to change the massive structure of medieval society, most of these revolutionary revivals "short-circuited" and disappeared. Material on the American Indians suggests that radical millenarism has played a limited and largely disruptive role in their history. Any movement with a revolutionary potential was quickly suppressed, leaving an aftermath of disillusion and disorganization. The task of rehabilitating and integrating the Indians was performed mainly by reformist cults oriented to peaceful accommodation to the white society (Voget 1956; Barnett 1957). Most millenarian movements in modern society are radically antipolitical. They conduct a violent campaign against secular movements and enjoin their
members to keep away from them. In these cases religious and secular revolutionism are mutually exclusive, and they compete rather than mutually reinforce complementary solutions. [See NATIVISM AND REVIVALISM.]
The outcome of any millenarian movement depends on the historical circumstances, on the type of society, and on the nature of the group in which it occurs. Of crucial importance are the degree of differentiation of the society, the characteristics of the religious and political spheres, the position of the millenarian group in the changing balance of power, and the group's chances to promote its goals through political action.
Religious and secular revolutionism
The basic similarities and interconnections between religious and secular revolutionism are a major theme in most recent studies of millenarism, irrespective of their ideological position. First we note the typological affinity between these two kinds of movements. Secular revolutionary movements differ greatly from other types of secular political movements and have, in a certain sense, a semireligious character. Their world view is total and all-embracing. It purports to solve basic problems of meaning and to trace and interpret the unfolding of world history. The revolutionary ideology is a matter of ultimate concern and utmost seriousness; it demands from the followers unquestioning faith and unconditional loyalty. It is therefore all-pervasive and defines every aspect of life. Much like the great religious movements of the past, secular revolutionism has deeply stirred large masses of people, evoking intense fervor and dedication to its cause. Second, we find the similarity of predisposing factors. Like millenarism, secular revolutionism is brought about by a combination of deprivation, frustration, disorientation, and disintegration of primary groups. Last but not least are the dynamic interconnections between the two types of revolutionism. I have already mentioned that millenarism is often a precursor and concomitant of secular revolutionism. The most important feature of millenarism seems to be its composite, "intermediate" nature.
It combines components which are seemingly mutually exclusive: it is historical as well as mythical, religious as well as political, and, most significant, it is future-oriented as well as past-oriented. It is precisely this combination of a radical revolutionary position with traditionalism that accounts for the widespread appeal of millenarism and turns it into such a potent agent of change.
YONINA TALMON
MILLENARISM [Directly related are the entries NATIVISM AND REVIVALISM; SECTS AND CULTS. Other relevant material may be found in COLLECTIVE BEHAVIOR; MASS PHENOMENA; RELIGIOUS ORGANIZATION; HE VOLUTION; SOCIAL MOVEMENTS; and in the biographies of BUBER; KLUCKHOHN; MANNHEIM; TROELTSCH; WEBER, MAX.] BIBLIOGRAPHY ABERLE, DAVID F. 1962 A Note on Relative Deprivation Theory as Applied to Millenarian and Other Cult Movements. Pages 209-214 in Sylvia L. Thrupp (editor), Millennial Dreams in Action: Essays in Comparative Study. Comparative Studies in Society and History, Supplement No. 2. The Hague: Mouton. BALANDIER, GEORGES (1955) 1963 Sociologie actuelle de I'Afrique noire: Dynamique sociale en Afrique centrale. 2d ed., rev. & enl. Paris: Presses Universitaires de France. BANTON, MICHAEL 1963 African Prophets. Race 5, no. 2: 42-55. BARNETT, HOMER G. 1957 Indian Shakers: A Messianic Cult of the Pacific Northwest. Carbondale: Southern Illinois Univ. Press. BASTIDE, ROGER 1961 Messianisme et developpement economique et social. Cahiers internationaux de sociologie 31:3-14. BELSHAW, CYRIL S. 1950 The Significance of Modern Cults in Melanesian Development. Australian Outlook 22:116-125. BENZ, ERNST (editor) 1965 Messianische Kirchen, Sekten und Bewegungen im heutigen Afrika. Leiden (Netherlands): Brill. BERGER, PETER L. 1957 Motif messianique et processus social dans le Bahai'sme. Archives de sociologie des religions 2, no. 4:93-107. BUBER, MARTIN (1932) 1936 Konigtum Gottes. 2d ed.; enl. Berlin: Schocken. BURRIDGE, KENELM 1961 Mambu: A Melanesian Millennium. London: Methuen. COHN, NORMAN (1957) 1961 The Pursuit of the Millennium: Revolutionary Messianism in Medieval and Reformation Europe and Its Bearing on Modern Totalitarian Movements. 2d ed. New York: Harper. CUNNISON, IAN 1958 Jehovah's Witnesses at Work: Expansion in Central Africa. The Times: British Colonies Review 29, no. 1. DESROCHE, HENRI 1963 Les messianismes et la categorie de 1'echec. 
Cahiers internationaux de sociologie 10:61-84. DONALDSON, DWIGHT M. 1933 The Shi'ite Religion: A History of Islam in Persia and Irak. London: Luzac. ELIADE, MIRCEA (1949) 1954 Myth of the Eternal Return. New York: Pantheon. -» First published in French. A paperback edition was published in 1959 by Harper as Cosmos and History: The Myth of the Eternal Return. ESSIEN-UDOM, ESSIEN U. 1962 Black Nationalism: A Search for an Identity in America. Univ. of Chicago Press. H> A paperback edition was published in 1964 by Dell. FESTINGER, LEON; RIECKEN, H. W.; and SCHACHTER, STANLEY 1956 When Prophecy Fails. Minneapolis: Univ. of Minnesota Press. FIRTH, RAYMOND W. 1955 The Theory of "Cargo" Cults: A Note on Tikopia. Man 55:130-132. FROOM, LE ROY E. 1946-1954 The Prophetic Faith of Our Fathers: The Historical Development of Prophetic
361
Interpretation. Vols. 1, 3, 4. Washington: Review & Herald. GUIART, JEAN; and WORSLEY, PETER 1958 La repartition des mouvements millenaristes en Melanesie. Archives de sociologie des religions 3, no. 5:38-46. HOBSBAWM, ERIC (1959)1963 Primitive Rebels: Studies in Archaic Forms of Social Movement in the 19th and 20th Centuries. 2d ed. New York: Praeger. HOLT, PETER M. 1958 The Mahdist State in the Sudan, 1881-1898: A Study of Its Origins, Development and Overthrow. Oxford: Clarendon. HURWITZ, SIEGMUND 1958 Die Gestalt des sterbenden Messiahs: Religions-psychologische Aspekte der judischen Apokalyptik. Zurich: Rascher. KATZ, JACOB 1961 Tradition and Crisis: Jewish Society at the End of the Middle Ages. New York: Free Press. KLAUSNER, JOSEPH (1909) 1955 The Messianic Idea in Israel: From Its Beginning to the Completion of the Mishnah. 3d ed. New York: Macmillan. -> First published in Hebrew. KOBBEN, A. J. F. 1960 Prophetic Movements as an Expression of Social Protest. International Archives of Ethnography 49:117-164. KROEF, JUSTUS M. VAN DER 1952 The Messiah in Indonesia and Melanesia. Scientific Monthly 75:161-165. KROEF, JUSTUS M. VAN DER 1962 Messianic Movements in the Celebes, Sumatra and Borneo. Pages 80-121 in Sylvia L. Thrupp (editor), Millennial Dreams in Action: Essays in Comparative Study. Comparative Studies in Society and History, Supplement No. 2. The Hague: Mouton. LANTERNARI, VITTORIO (1960) 1963 The Religions of the Oppressed: A Study of Modern Messianic Cults. New York: Knopf. ~» First published as Movimenti religiosi di libertd e di salvezza del popoli oppressi. LAWRENCE, PETER 1964 Road Belong Cargo. Manchester Univ. Press. LINTON, RALPH 1943 Nativistic Movements. American Anthropologist New Series 45:230-240. MACRAE, DONALD G. 1961 Ideology and Society: Papers in Sociology and Politics. New York: Free Press. -» See especially pages 181-198, "The Bolshevik Ideology." MAIR, L. P. 1959 Independent Religious Movements in Three Continents. 
Comparative Studies in Society and History 1:113-136. MANNHEIM, KARL (1929-1931) 1954 Ideology and Utopia: An Introduction to the Sociology of Knowledge. New York: Harcourt; London: Routledge. → A paperback edition was published in 1955 by Harcourt. A translation of Ideologie und Utopie (1929); Part 5 is a translation of the article "Wissenssoziologie" (1931). MÉTRAUX, ALFRED (1941) 1957 Les messies de l'Amérique du Sud. Archives de sociologie des religions 2, no. 4:108-112. MOONEY, JAMES 1896 The Ghost-dance Religion and the Sioux Outbreak of 1890. Part 2, pages 641-1110 in U.S. Bureau of American Ethnology, Fourteenth Annual Report, 1892-1893. Washington: Smithsonian Institution. → An abridged edition was published in 1965 by the University of Chicago Press. MOWINCKEL, SIGMUND (1951) 1956 He That Cometh. Oxford: Blackwell. → First published in Norwegian. MÜHLMANN, WILHELM E. 1961 Chiliasmus und Nativismus: Studien zur Psychologie, Soziologie und historischen Kasuistik der Umsturzbewegungen. Berlin: Reimer.
O'DEA, THOMAS F. 1957 The Mormons. Univ. of Chicago Press. PIKE, EDGAR R. 1954 Jehovah's Witnesses: Who They Are, What They Teach, What They Do. London: Watts. PRICE, THOMAS; and SHEPPERSON, GEORGE 1958 Independent African: John Chilembwe and the Origins, Setting and Significance of the Nyasaland Native Rising of 1915. Edinburgh Univ. Press. RIBEIRO, RENÉ 1962 Brazilian Messianic Movements. Pages 55-69 in Sylvia L. Thrupp (editor), Millennial Dreams in Action: Essays in Comparative Study. Comparative Studies in Society and History, Supplement No. 2. The Hague: Mouton. SCHOLEM, GERSHOM G. (1941) 1961 Major Trends in Jewish Mysticism. 3d rev. ed. New York: Schocken. SCHOLEM, GERSHOM G. 1957 Shabtai Tsevi ve-hatnuah hashabtait (Sabbatai Zevi and the Sabbatean Movement). 2 vols. Tel Aviv: Am Oved. The Scroll of the War of the Sons of Light Against the Sons of Darkness. Edited by Yigael Yadin. Oxford Univ. Press, 1962. SHAW, PLATO E. 1946 The Catholic Apostolic Church, Sometimes Called Irvingite: A Historical Study. New York: King's Crown Press. SHEPPERSON, GEORGE 1962 The Comparative Study of Millenarian Movements. Pages 44-52 in Sylvia L. Thrupp (editor), Millennial Dreams in Action: Essays in Comparative Study. Comparative Studies in Society and History, Supplement No. 2. The Hague: Mouton. SIERKSMA, FOKKE 1961 Een nieuwe hemel en een nieuwe aarde: Messianistische en eschatologische bewegingen en voorstellingen bij primitieve volken. The Hague: Mouton. SMITH, MARIAN W.; WALLACE, ANTHONY F. C.; and VOGET, FRED W. 1959 Towards a Classification of Cult Movements. Man 59:8-12, 25-28. SMITHSON, ROBERT J. 1935 The Anabaptists: Their Contribution to Our Protestant Heritage. London: Clarke. STROUP, HERBERT H. 1945 The Jehovah's Witnesses. New York: Columbia Univ. Press. SUNDKLER, BENGT G. M. (1948) 1964 Bantu Prophets in South Africa. 2d ed. Published for the International African Institute. Oxford Univ. Press. TALMON, JACOB L. 
(1952) 1965 The Rise of Totalitarian Democracy. 2d ed. New York: Praeger. → The first British edition was entitled The Origins of Totalitarian Democracy. TALMON, JACOB L. (1960) 1961 Political Messianism: The Romantic Phase. New York: Praeger. TALMON, YONINA 1962 Pursuit of the Millennium: The Relation Between Religious and Social Change. Archives européennes de sociologie 3:125-148. TAYLOR, GORDON R. 1958 The Angel Makers. London: Heinemann. THRUPP, SYLVIA L. (editor) 1962 Millennial Dreams in Action: Essays in Comparative Study. Comparative Studies in Society and History, Supplement No. 2. The Hague: Mouton. TROELTSCH, ERNST (1912) 1931 The Social Teaching of the Christian Churches. 2 vols. New York: Macmillan. → First published as Die Soziallehren der christlichen Kirchen und Gruppen. A paperback edition was published in 1960 by Harper. TUVESON, ERNEST L. 1949 Millennium and Utopia: A Study in the Background of the Idea of Progress. Berkeley: Univ. of California Press.
VOGET, FRED W. 1956 The American Indian in Transition: Reformation and Accommodation. American Anthropologist New Series 58:249-263. WALLACE, ANTHONY F. C. 1956 Revitalization Movements. American Anthropologist New Series 58:264-281. WEBER, MAX (1920-1921) 1922-1923 Gesammelte Aufsätze zur Religionssoziologie. 2d ed. 3 vols. Tübingen (Germany): Mohr. WERBLOWSKY, ZWI R. J. 1965 A New Heaven and a New Earth: Considering Primitive Messianisms. History of Religions 5:164-172. WERNER, ERNST 1960 Popular Ideologies in Late Medieval Europe: Taborite Chiliasm and Its Antecedents. Comparative Studies in Society and History 2:344-363. WILSON, BRYAN R. 1961 Sects and Society: A Sociological Study of the Elim Tabernacle, Christian Science, and Christadelphians. Berkeley: Univ. of California Press. WILSON, BRYAN R. 1963 Millennialism in Comparative Perspective. Comparative Studies in Society and History 6:93-114. WORSLEY, PETER 1957 The Trumpet Shall Sound: A Study of "Cargo" Cults in Melanesia. London: MacGibbon & Kee.
MILLS, C. WRIGHT C. Wright Mills (1916-1962) was at his death professor of sociology at Columbia University and one of the most controversial figures in American social science. He considered himself and was considered by his peers something of a rebel against the social science "establishment," and he attracted both admirers and critics for this role. Shortly after his death, a series of essays, The New Sociology, was published in his honor. A central theme of these essays was the notion that Mills exemplified that spirit of social concern which he himself saw as the fundamental duty of the modern intellectual, in particular the social scientist—a duty, be it said, which he felt was not fulfilled by the majority of contemporary American social scientists (Horowitz 1964). His writings represented an attempt to open up paths of inquiry and analysis that would enable men to combat what he called the "main drift" of modern society to "rationality without reason," that is, the use of rational means in the service of substantively irrational ends. He found Marx and Weber to be the most helpful classical theorists, but he wanted to go "beyond" both of them to a new comparative world sociology that would seek to understand our time in terms of its historical specificity and by so doing renew the possibility of achieving human freedom. He thus set himself a large task, requiring research on the whole canvas of human (and
particularly modern) history, but he died before he could present a full synthesis of his ideas. He saw the present as a transition from the modern age to a postmodern period which he called the Fourth Epoch. If throughout his work there is a current of ultimate hope, it is equally suffused with pessimism about the more immediate future. He spoke of the "moral uneasiness of our time," a consequence throughout the Western world (including the Soviet Union) of what he called the "higher immorality," immorality encrusted in the structures and norms of the society, which he saw as particularly prevalent in the United States. The basic problem of this era was that, unlike the eighteenth and nineteenth centuries, rationality no longer produced freedom, and since the two central ideologies which were developed in the modern West, liberalism and Marxism, assumed that it did, they no longer sufficed to explain and thus to control social change. Liberalism, being more heavily dependent on this assumption, was, he said, now irrelevant, and Marxism was inadequate. What was even more unsettling to Mills was the "default" or "defeat" of the free intellectuals, especially deplorable at a time when the power of the intellectual had become potentially very great. His emphasis on the role of the intellectuals, on their failure, derived from his basic assumption that there is a great difference between the range of action possible to what he called "elites" and the range of action possible to the "masses." Men make their own history, but some are freer to do so than others. If the relatively free intellectuals fail to assert their moral leadership, other members of the elite, less qualified and less disinterested, will inevitably do so in their stead. This is in fact what had happened, according to Mills. 
This failure is indicated by the nature of the problems studied by social scientists, and even more by the inadequate theory and methodology that underlie their work, an inadequacy he attributed to their deliberate abdication of social responsibility. Social theory, to be usable for Mills, had to deal in categories whose level of abstraction was not so high as to deprive them of all historical content or relevance. It should involve the search for causes of specific historical sequences and thereby explain shifts in the importance of and relations between the various "institutional orders" (politics, economics, the military, religion, and kinship). Mills took a strong stand against "principled monism or pluralism" and stated that the simple view of economic determinism must be "elaborated" by political and military determinism.
But more than theory was involved. Mills felt that the way in which the theory is used—the methodology of social research—is central to the results. He was not opposed to empirical research (indeed, he conducted a considerable amount of it), but he was against "abstracted empiricism," to which he contrasted the ideal of "craftsmanship." Craftsmanship is at once an ethos and an ideal which is only possible in a "properly developing society" but which also brings such a society into being. While Mills constantly called for such a conception of the role of the intellectual, he preferred to exemplify the skill rather than give an operational definition of it. It is perhaps as a result of this lack of definition that discussion of Mills's criticisms of his colleagues sometimes resembles a theological debate. Mills's intellectual fathers in macrosociological theory were clearly Marx and Weber, as he himself acknowledged, and Freud and Mead in social psychology. It is sometimes said that he was the heir of Veblen. But while he called Veblen "the best social scientist America has produced," he was clearly critical of him, even in the introduction he wrote to The Theory of the Leisure Class (see Mills 1953). Mills called Veblen's views "oversimple" and "inadequate" and found the substance of his work less useful than the style. It is indeed in style and populist bias that Mills most resembles Veblen. In his own research, he was more concerned with restating and advancing the Marx-Weber tradition than the Freud-Mead one. He accepted what he considered to be Weber's two most important revisions of Marx—the broadening of the concept of economic determinism to a wider social determinism and the "sophisticating" of the idea of class by the addition of the category of status or prestige. 
Mills thought that Marx's major political expectation about advanced capitalist societies—the progressive role of the proletariat—had "collapsed," and he railed against a "labor metaphysic," a faith in the progressive role of the working class (1960a), although an early monograph of his, The New Men of Power (1948), may be thought to exhibit this very view. The shift in focus and methodology of Mills's empirical work over his life reflected his increasing discomfort with his peers in American sociology. The New Men of Power and The Puerto Rican Journey (Mills et al. 1950) rely in large part on survey data, especially the latter. They were both done under the aegis of the Bureau of Applied Social Research of Columbia University and under the methodological influence of Paul
Lazarsfeld. Nonetheless, even in these works Mills used the data to deal with problems of social change of the larger society, the United States; this was a feature of all his books, whatever their particular problems. In White Collar (1951), interview data became minor and government statistical data more important; he explicitly sought to locate the problems of the individual (in this case, the "new middle class") within the trends of the epoch, thus illustrating a methodological orientation he was later to insist upon in The Sociological Imagination (1959). The Power Elite (1956) represented a further evolution of this trend. The problem here was to explain the over-all power structure of the United States, not the role of out-groups that are relatively more accessible to being studied (labor leaders, migrants, white-collar workers). In this task, Mills asserted, national surveys are useless, and he relied upon "reasoning together." The data were largely historical, and the objective of the research was to explain the "moral uneasiness of our time." In the three books that followed, The Causes of World War Three (1958), The Sociological Imagination (1959), and Listen, Yankee (1960b), Mills had moved one stage further. There was no question here of survey methods. There was even little question, as there still was in The Power Elite, of the systematic collection of data or the use of a research design and a research organization. These three books were historical interpretations—of the contemporary world system, of the evolution of the social sciences in the United States, of social revolution in Cuba—in the form of polemical essays. By then, Mills seemed to feel that methodological rigor was a trap which would prevent him or other scholars from dealing with significant problems. Thus, despite his critical view of Marxian theory, he grew more and more interested in Marxism as a "method of work," as his last published volume, The Marxists (1962), indicates. 
This was undoubtedly largely because he grew more and more unhappy with what he regarded as the ideological uses other scholars made of the Weberian critique—to defend an established order. And he came to fear the emphasis on science less as an illusion than as a diversion. Mills ended as he began, a moralist preaching to his peers, the community of social scientists, throughout the world but especially in the United States. While he continued to accept the fundamentals of the Weberian modifications of Marx, he refused to accept Weber's "pessimistic world of a classic liberal." He thought the dominant apolitical or "value-free" bias of contemporary American sociology was an ideological mask, hiding value preferences which he did not share. In a basic sense, he was a Utopian reformer. He thought that knowledge properly used could bring about the good society, and that if the good society was not yet here, it was primarily the fault of men of knowledge.
IMMANUEL WALLERSTEIN
[See also ASSIMILATION; ELITES; KNOWLEDGE, SOCIOLOGY OF; LEADERSHIP, article on SOCIOLOGICAL ASPECTS; MARXIST SOCIOLOGY; POLITICAL SOCIOLOGY; POWER; SOCIAL PROBLEMS; and the biographies of FREUD; MARX; MEAD; VEBLEN; WEBER, MAX.]
WORKS BY MILLS
1948 The New Men of Power: America's Labor Leaders. New York: Harcourt. 1950 MILLS, C. WRIGHT; SENIOR, C.; and GOLDSEN, R. K. The Puerto Rican Journey: New York's Newest Migrants. New York: Harper. 1951 White Collar: The American Middle Classes. New York: Oxford Univ. Press. → A paperback edition was published in 1956. 1953 Introduction. In Thorstein Veblen, The Theory of the Leisure Class: An Economic Study of Institutions. New York: New American Library. 1953 GERTH, HANS; and MILLS, C. WRIGHT Character and Social Structure: The Psychology of Social Institutions. New York: Harcourt. 1956 The Power Elite. New York: Oxford Univ. Press. 1958 The Causes of World War Three. New York: Simon & Schuster. 1959 The Sociological Imagination. New York: Oxford Univ. Press. 1960a MILLS, C. WRIGHT (editor) Images of Man: The Classic Tradition in Sociological Thinking. New York: Braziller. 1960b Listen, Yankee: The Revolution in Cuba. New York: McGraw-Hill. 1962 The Marxists. New York: Dell. Power, Politics and People: The Collected Essays of C. Wright Mills. Edited and with an introduction by Irving Louis Horowitz. New York: Oxford Univ. Press, 1963.
SUPPLEMENTARY BIBLIOGRAPHY
APTHEKER, HERBERT 1960 The World of C. Wright Mills. New York: Marzani & Munsell. HOROWITZ, IRVING LOUIS (editor) 1964 The New Sociology: Essays in Social Science and Social Theory, in Honor of C. Wright Mills. New York: Oxford Univ. Press. WEBER, MAX (1906-1924) 1946 From Max Weber: Essays in Sociology. Translated and edited by Hans H. Gerth and C. Wright Mills. New York: Oxford Univ. Press.
MINISTRY See RELIGIOUS SPECIALISTS.
MINNESOTA MULTIPHASIC PERSONALITY INVENTORY See under PERSONALITY MEASUREMENT.
MINORITIES
Contemporary sociologists generally define a minority as a group of people—differentiated from others in the same society by race, nationality, religion, or language—who both think of themselves as a differentiated group and are thought of by the others as a differentiated group with negative connotations. Further, they are relatively lacking in power and hence are subjected to certain exclusions, discriminations, and other differential treatment. The important elements in this definition are a set of attitudes—those of group identification from within the group and those of prejudice from without—and a set of behaviors—those of self-segregation from within the group and those of discrimination and exclusion from without. Among those who do not study minority groups, the common tendency is to take the word "minority" literally and simply to say that a minority is a small group of people who live in the midst of a larger group. At least two defects make this simple definition useless. First, groups are not "naturally" or "inevitably" differentiated: cultures (either of the minority or the majority, or—usually—both) must define them as differentiated before they are so. People of different races, nationalities, religions, or languages can live among one another for generations, amalgamating and assimilating or not doing so, without differentiating themselves. Like everything else that is social, minority groups must be socially defined as minority groups, which entails a set of attitudes and behaviors. Second, relative numbers in and out of the group have not been found to be definitionally important. Sociologically speaking, it makes no sense to say that Negroes are not a minority group in those few counties of Mississippi, Alabama, and South Carolina where they constitute a numerical majority of the population, but that they are a minority group in the rest of the South. 
Likewise, even though the Bantus constitute around 80 per cent of the population of South Africa, sociologists have defined them as a minority group because they occupy a subordinate position. Many nations have no single "majority group" in terms of numbers. Thus it is necessary either to counterpose a "minority" to a "dominant" group, in terms of power, or to abandon the term "minority" altogether and call it a "subordinate" group. Origins of national minorities. The origin of the term "national minorities" can be traced to Europe, where it was applied to various national groups who were identified with particular territories by virtue of long residence in them but who
had lost their sovereignty over these territories to some more numerous people of a different nationality. In some cases the minority groups ceased altogether to occupy their original territories and were dispersed throughout the nation of which they were now subjects. More often they stayed in the same place but in a subordinate position, since the dominant political and economic institutions were now run mainly for the benefit of the larger national group. The latter usually enacted laws to regulate the political existence of the minorities; for instance, they might have to send their own community leaders to the national assembly instead of being able to vote individually for candidates in a national election. Even the areas in which they could live or the occupations they could pursue might be determined by law; at the least, the dominant nationality regarded them with suspicion, as the Czechs were regarded under the Austro-Hungarian Empire. Changing social definitions. A minority need not be a traditional group with a long-standing group identification. It can arise as a result of changing social definitions in a process of economic or political differentiation. The increasing saliency of a certain occupation, for example, can set apart the people who practice that occupation, if occupations are more or less hereditary in the society, and cause them to be considered a minority group. Language or religious variations in a society can be considered unimportant for thousands of years, but a series of political events can so sharpen the religious or linguistic distinctions that the followers of one variation who happen to be without much power in the society are thereafter considered a minority. These processes can be illustrated by developments in India. The Marwaris, allegedly originating in Rajasthan, were until the late eighteenth century merely another occupational caste among the thousands of castes that make up India. 
They were moneylenders and small merchants, who were of no greater importance in the social structure than any other occupational caste until the rise of capitalism gave a great new importance to their economic functions. The new economic salience of the hereditary occupation created a salience for the people who practiced the occupation and made them into a despised, feared, and envied minority. The process was aided by the increased geographic dispersion of the group caused by a broader demand for their occupational services. Language differentiation based on geographic dispersal has been going on in India since time immemorial within two great language stocks, the
Dravidian of southern India and the Indo-Aryan of northern India. The differentiation of Dravidian into Tamil, Telugu, Malayalam, and Kannada and several dozen lesser languages was not marked by definite historical events any more than was the differentiation of Latin into Italian, French, Spanish, and Rumanian. The modern development of political boundaries, which occurred at first under the British for administrative convenience and, after 1948, under the independent government of India, made language a salient basis of differentiation because the political boundaries were drawn as closely as possible to language boundary lines. Thus, it has been largely within the past few decades that language has become one of the most distinctive marks of a minority in India, and the basis of considerable group conflict. Minority groups in the United States In the United States, the term "minority groups" can be applied only in an extended sense. All citizens of the United States belong legally to a single American nationality; there are no laws that regulate the political status of any group of citizens according to their or their ancestors' national origin. Moreover, there is no single nationality group in the United States that either forms a numerical majority or enjoys a de facto political dominance; this state of affairs has existed at least since 1830. This is not to say, however, that discrimination and prejudice are unknown in the United States, but that, since there is no one "majority group" with a special claim to American nationality, the handicaps faced by American "minority groups" cannot be explained in terms of their national origin as such. The crucial factor would appear to be the degree to which any group has been allowed to become assimilated into the mainstream of American life and to enjoy the same opportunities as the majority of Americans. 
Most immigrant nationality groups suffered some discrimination during their early years in the country but were later assimilated. Those groups that were not allowed to assimilate—notably, the Negroes—have continued to be objects of prejudice for most of their fellow citizens, and in this sense they constitute "minorities," even though the number of Negroes far exceeds that of many a group that does, indeed, have a common national origin outside the United States but now thinks of itself as "American." This does not mean that members of assimilating groups in the majority completely lose all their memories of ancestry; they may pass along to successive generations selected aspects of traditional
culture—often of a ceremonial nature—and at the very least they pass along knowledge of the name of the ancestral homeland. [See ASSIMILATION.] Racial minorities. Racial groups are distinguished from each other by their possession of certain physical features inherited as the result of endogamy over a long period. Few races, however, are biologically pure, nor do most people use strictly biological criteria in deciding that a person belongs to one racial group rather than another. Thus, in the United States, a Negro is defined as someone of whom it is known that at least one of his ancestors was a Negro; the definition will hold even if, to all appearances, the individual is a "white." Moreover, although the principal racial minorities of the United States—the American Indians, the Chinese, the Filipinos, the Negroes, and the Japanese—all have members with some Caucasoid ancestry, they are still regarded as "nonwhite." The dominant white majority generally chooses to overlook the fact that they, too, are not "pure," since many whom they accept as white have some Negroid or Mongoloid ancestry. Nationality groups. The principal nationality groups in the United States came originally from Europe and, in spite of some admixture from other races, can plausibly regard themselves as having a common racial ancestry. It is not race, therefore, but culture—and the history of each culture—that provides the most salient distinctions between them. Immigrants of the second and third generations generally adopt English as their major or only language and assimilate their values and manners —at least in the more socially visible aspects of their behavior—to those of the majority. There are thus no permanent physical reminders of their ancestors' minority status, and they are not usually regarded as belonging to a minority group. 
The major exceptions to this are those groups—such as the Scandinavians of Wisconsin and Minnesota—that have remained in isolated rural areas, having little contact with the dominant American culture and therefore being under no pressure to assimilate themselves. Their status as national minorities is not the result of discrimination and prejudice on the part of the majority but of deliberate choice or sheer lack of opportunity. It cannot be said, however, that they suffer from their minority status, since they enjoy the full privileges of American citizenship and are not compelled to maintain their traditional way of life or to inhabit any particular territory. Finally, a new type of nationality minority is being created by immigration of victims of political persecution; the best example is the Cubans, concentrated mainly in south Florida and New York City, who plan to return to their home country after an expected future political revolution there.
Language minorities. Some groups in the United States speak a language other than English, although they are not recent immigrants; indeed, they have continued to speak their own language over many generations. They are therefore best designated as "language minorities"; although they tend to have other distinctive cultural traits, it is principally their language that sets them apart from the majority of the population. The outstanding example of such a minority is the Spanish-speaking people who live in the sparsely populated rural areas of New Mexico and southern Colorado. Their position is similar to that of some European national minorities, since most of their ancestors were originally Mexican citizens whose territories were incorporated into the United States after the Mexican War of 1846-1848. They have been able to maintain a distinctive way of life because they are both isolated and poor; this same isolation tends to protect them from the discriminatory attitudes of the dominant, English-speaking population, who have not, on the whole, found it necessary to impose any legal or political disabilities upon them.
Religious minorities. Discrimination on grounds of religion, although expressly forbidden by the Constitution, has long been practiced in the United States with varying severity against a large number of groups. Chief among these groups are the Jews, the Muslims, Christians of the Eastern Orthodox church, and various Protestant and Orthodox sects. Roman Catholics, too, although their total number in the United States, according to some estimates, was more than forty million in 1960, share some of the disadvantages of minority-group status, though to a decreasing extent. 
One special feature of membership in a religious minority is that it can be acquired voluntarily, regardless of racial or national origin, though most members, of course, are following the religion of their parents. The position of the Jews is unlike that of other religious minorities because there are more Jews than there are active believers in the Jewish religion. Indeed, it is likely that in the United States believers and nonbelievers are about equal in number, although most of the latter would undoubtedly regard themselves as Jews nonetheless. This raises the question of whether there is any single objective basis for classifying them as Jews. One criterion can be ruled out completely: there is no such thing as a Jewish race, as should be obvious from the endless variety of racial, national, and linguistic characteristics to be found among Jews. It therefore seems best to describe them, for summary purposes, as recent descendants of persons known to have followed the Jewish faith.
Minorities in other parts of the world
Outside the United States, racial minorities are found predominantly where race is considered important in the culture. This is mainly in Africa, where whites, Negroes, and immigrants from India variously consider themselves or each other as minorities. The Ainus are a racial minority in Japan, but the other group that is subjected to discrimination in that country—the eta—are to be considered a caste minority (some authors would prefer not to call castes "minorities" when they are of the same race, religion, nationality, and language as the majority group). To some extent, native Indians are considered minorities in parts of South America. But in a country like India, where race is not considered important, racial differences are not the basis for the formation of minority groups (religion and language are). Nationality differences continue to provide the source of minorities throughout Europe (including the Soviet Union, which extends into Asia). Some of these are in the process of disappearing as distinctive minorities because of assimilation, such as the Scots, Irish, and Welsh in the British Isles. Some are of very ancient origin, and their minority status has not changed appreciably in centuries, such as the Basques in Spain and the Greeks in Turkey, who also use a language different from that of the majority in their respective countries. Other minorities are being newly created by virtue of recent migrations for economic reasons, such as the Italian minority in Sweden. Sometimes political refugees form a new nationality minority, such as the Poles in Great Britain and the Balts in Sweden. 
Some retain their status as minorities through language differences or through international conflict, such as the German-speaking, Austrian-backed Tyrolese in the Italian province of Alto Adige. Mainly outside Europe, some nationality minorities seem to maintain their distinction through political differences with the majority, such as the Karens of Burma. Language is often closely associated with nationality, as we have seen. But there are some linguistic minorities which seem to owe their origin to differences of social class rather than of nationality; notable examples are the Swedish-speaking Finns and the German-speaking people of eastern Europe. Perhaps the contemporary nation with the most salient language minorities is India. When
the states of India were divided mainly along linguistic boundaries, only the 14 languages spoken by the largest numbers of people could be assigned a state. As the states quickly assumed political importance and language became socially identified as the main basis for their differentiation, those who spoke languages other than the dominant one of their state became minorities. Such minorities included people who spoke one of the hundreds of "little" languages of India, for whom there was no state at all, including most of the "scheduled tribes," as the British administrators called the small "primitive" groups living outside the mainstream of Indian life. They also included those who spoke one of the major languages but were not residing in the state where their language was dominant. As language became a national issue in independent India, the language minorities usually became the objects of prejudice and discrimination. Religious differences are still a prime source of minorities, although in Europe perhaps not as much as in past centuries. Perhaps the most destructive conflict of the post-World War II period has been the one between Muslims and Hindus in India, and a most bitter, though small-scale, conflict has been that between the Muslims and Jews in Palestine. Protestant minorities have been subject to a good deal of discrimination in Catholic Spain and parts of South America. Catholics feel themselves to be a minority in several countries where Protestants form a majority, although the prejudice or discrimination directed at them is no longer as strong as it once was. The Jews, who have been the most persecuted minority in modern times, are still the subject of considerable prejudice and discrimination in several countries of Europe, particularly in the Soviet bloc.
Religious minorities also include the Christians in Muslim countries, pagans and atheists in Christian countries, the Hutterites and Doukhobors in Canada, the minor religious groups of the Indian subcontinent, and several others.

The functioning of minorities in society

A minority's position involves exclusion or assignment to a lower status in one or more of four areas of life: the economic, the political, the legal, and the social-associational. That is, a minority will be assigned to lower-ranking occupations or to lower-compensated positions within each occupation; it will be prevented from exercising the full political privileges held by majority citizens; it will not be given equal status with the majority in the application of law or justice; or it will be partially or completely excluded from both the formal and the informal associations found among the majority. Not infrequently, the minority also voluntarily excludes itself partially or completely from participation in these areas of life, partly as a means of maintaining traditional cultural differences. Accompanying the objective subordination and segregation of the minorities are usually to be found some subjective attitudes of mutual hostility, although these may sometimes be publicly denied and camouflaged. Majority-minority relations invariably involve some conflict, although this may take varied forms and operate on different levels. There seem to be three types of attitudes of hostility or prejudice with which the dominant group regards the minority and with which the minority may attempt to counter the dominant group. The complex etiologies of each of these, which differ somewhat from society to society, cannot be analyzed here. The first is an attitude in which power is the main element: the dominant group wishes to exploit the minority for economic, political, or sexual purposes, or for prestige, and the minority group seeks to escape this exploitation. While the achievement of ascendancy in terms of one or more of these scarce values may be brutal (including enslavement of the minority), it is seldom personal, nor does it, except accidentally, result in the death of a minority person. The second attitude is ideological: the dominant group believes that it has a monopoly on the "truth" (as may the minority group also). The achievement of ascendancy by one ideological group over the other results in drastic efforts to convert the minority to the dominant group's version of the "truth"; failing that, it banishes the minority by exile or death. The third attitude is racist: the dominant group believes itself to be biologically superior to the minority group, and it stereotypes the minority in terms of negatively valued characteristics.
(The minority may have the same attitude toward the dominant group, but since it lacks power, this has few or no behavioral consequences.) Different social systems of conflict accompany these three different attitudes of hostility. For example, the caste system is generally associated only with the racist attitude; this system prohibits mobility across group lines and equal-status relationships and requires endogamy, systematized displays of inferiority by the minority, and occupational division of labor. Racism also has a pathological form which insists on the physical extermination of the minority race because it is alleged to threaten the "purity" of the dominant race. Where power seems to be the main ingredient in the conflict between dominant and minority groups, there
is one form or another of exploitation: for example, there may be slavery, piracy, tribute, suzerainty over the minority's political or military institutions, differential remuneration for work, or seizure of the minority group's women for sexual purposes. Where the ideological element seems to be the main factor in the hostility of the dominant group toward the minority, the majority group generally offers the minority the alternatives of conversion or extermination. Ideological conflict is at once the most brutal and the most generous toward the minority, depending on whether or not the minority will accede completely to the beliefs of the dominant group. In the contemporary world, the religious minorities of India and Palestine offer examples of ideological conflict, as do the political minorities of some communist countries. Power conflict is most evident in dominant-minority relations in northern and central Africa and in South America. Racism is today most frequent in South Africa and in the United States, although it is apparently still strong in Germany and in eastern Europe.

Role of minorities in social change. From the preceding discussion, it will readily be understood that the different roles of minorities in the society will affect their impact on general social change. In general, the existence of minorities in a society offers a constant stimulus and a constant irritant that for several reasons provoke social change. Minorities are often carriers of a culture different from that of the dominant group, and the contact and clash of cultures have long been hypothesized as sources of social change. Even when minorities carry no traditional alien culture, their partial exclusion from the general society serves as a basis for the development of some deviant culture. In addition, apart from their cultural differences, minorities are sources of social dissatisfaction and social unrest, which are conditions for social change.
As conflict groups, minorities tend to upset the status quo; they require the dominant group to readjust to them regularly, and sometimes they are able to make coalitions with other minorities within the society or with outside societies in order to change the balance of power. Minorities will often join reform or revolutionary factions or parties among the dominant group, since often the best chance for improving their lot within the existing society is offered by a turnover of elites. Some minorities probably include a disproportionate number of inventive and otherwise creative individuals, because their alienation from the society in which they are forced to live without full participation gives such individuals a perspective that is not possible for the more fully integrated; the "marginal man" between two subsocieties has been identified by some sociologists as one type of "creative man" (Stonequist 1937). If necessity be the mother of invention (which it probably usually is not), minority members are more often beset by necessity than are dominant group members. At least in the limited area of seeking expedients to improve their unhappy lot, minority members are influenced by this creative aspect of necessity. These general sources of social changes created by the existence of a minority in a society are probably best seen in that situation where power considerations by the dominant group maintain the existence of the minority. Where power and material exploitation are not involved, the dominant group is often either generous or unconcerned about letting the minority group go its own way, and that may often create the stimuli for social change. For example, while the powerful dominant group in the society is bent on accumulating wealth or retaining political ascendancy, the weak minority group can concentrate on acquiring knowledge and wisdom, which in the long run become stimulants of social change. The ideational tolerance often practiced by power-controlling groups sometimes results in their own destruction; for example, the historian Edward Gibbon held this to be true about the relations between Romans and Christians in the later stages of the Roman Empire. On the other hand, where ideological or racist considerations maintain the existence of a minority in a society, there is less freedom for it to create conditions that are conducive to social change. Ideational deviation, whether cultural or individual, is not tolerated where it becomes open and obvious, and racists must constantly prove the incapacity of the minority group by squelching all evidence of creativity whenever it threatens to appear among minority members.
Where the dominant group is either racist or believes it holds a monopoly on truth, it is likely to regulate closely the education, cultural expression, and other innovative tendencies of the minority group, thus severely inhibiting the minority as a source of social change. Yet, under these circumstances, the minority group becomes schooled in subtlety and ingeniousness and may stimulate change where it is least expected: the songs, humor, and folk tales of the Negro slaves in the nineteenth-century American South can be seen in retrospect to have had a leavening effect on the white society, and the Jews in medieval Europe—repressed as they were—invented a merchant capitalism which eventually was accepted by the whole society (Sombart 1911).
It should not be assumed that the existence of a minority in a society operates solely to create social change. Dominant-minority relations often inhibit change. They tend to make the dominant group rigid in maintaining the status quo. The existence of an exploited or repressed minority makes even the most powerful dominant group fearful, and fear can discourage all forms of social change. Dominant-minority relations are usually wasteful and inefficient—they waste the time and energy of the dominant group in maintaining the repression, and they prevent the minority group from producing at its maximum potential—and this waste of material and intellectual resources restricts creative social change. Research on minorities Research on minorities can be considered as having taken place within two frameworks. One is the framework of the ethnologist, who is concerned with describing the culture of a specific society where the society is the minority group. Whereas the usual ethnological study is of a geographically separated society, the ethnological study of a minority group has to consider its subject group living in physical proximity to one or more other groups. The institutions, the customs, and the daily life of the minority groups are considered under this approach. The second framework is that of the sociologist, who concentrates on the relationship between minority and majority groups, not on the distinctive cultural characteristics of either group, except insofar as they are pertinent to understanding the relationship. The relationship between majority and minority is analyzed in terms of general processes, such as conflict, accommodation, and assimilation. A variation on this approach has been one which treats minority-majority relations as social problems, with special emphasis on aspects (e.g., Brown & Roucek 1937), causes (e.g., Hughes & Hughes 1952), and results of discrimination and prejudice (e.g., Myrdal 1944). 
There are other variations in the literature. J. H. Franklin (1947) has analyzed the American Negro problem as a historian, M. R. Konvitz (1946) has examined the position of the alien under American law, and Gordon Allport (1954) has analyzed majority-minority relations in terms of the psychological concept of prejudice. Yet these authors also treat their subject matter as social problems. Practically all monographic studies are limited to a single majority-minority group situation, as is Ruth Glass's study of the West Indians in London (1960). However, other empirical studies attempt to draw together the findings of a number of monographs: for example, A. H. Richmond's work The Colour Problem (1955) or Charles Wagley and Marvin Harris' more ethnologic Minorities in the New World (1958). The largest number of empirical studies, usually published as articles in the professional social science journals, are highly specialized studies of the history, demography, economic status, political and legal rights, educational attainments, or other achievements of specific minorities. In addition, there are studies of prejudice, group identification, social change, or other such broad concepts, which are, however, usually based on very narrow and limited samples of the population. Much of the literature soon becomes irrelevant, partly because it is limited to description and partly because of the value orientations guiding the research; these orientations are seldom made explicit and thus cannot be readily taken into account by the reader. There is a need for research using analytic concepts, thus permitting nomothetic rather than purely empirical generalizations. There is also a need for research that considers the dynamics of social change affecting and affected by minorities and majority-minority relations. The rapid changes in intergroup relations in contemporary life, occurring under a great diversity of cultural conditions, permit unrivaled opportunities for sociologists wishing to study the dynamic principles involved in all social change.

ARNOLD M. ROSE

[See also ANTI-SEMITISM; ASSIMILATION; CONSTITUTIONAL LAW, article on CIVIL RIGHTS; ETHNIC GROUPS; PREJUDICE; RACE; RACE RELATIONS; SECTS AND CULTS.]

BIBLIOGRAPHY

ALLPORT, GORDON W. 1954 The Nature of Prejudice. Reading, Mass.: Addison-Wesley. -> An abridged paperback edition was published in 1958 by Doubleday. BROWN, FRANCIS J.; and ROUCEK, JOSEPH S. (editors) (1937) 1952 One America: The History, Contributions and Present Problems of Our Racial and National Minorities. 3d ed. New York: Prentice-Hall. BURMA, JOHN H.
1954 Spanish-speaking Groups in the United States. Durham, N.C.: Duke Univ. Press. CLARK, KENNETH B. 1965 Dark Ghetto: Dilemmas of Social Power. New York: Harper. CLAUDE, INIS L. 1955 National Minorities: An International Problem. Cambridge, Mass.: Harvard Univ. Press. CONFERENCE ON RACE RELATIONS IN WORLD PERSPECTIVE, HONOLULU, 1954 1955 Race Relations in World Perspective. Edited by Andrew W. Lind. Honolulu: Univ. of Hawaii Press. DRAKE, ST. CLAIR; and CAYTON, HORACE R. (1945) 1962 Black Metropolis: A Study of Negro Life in a Northern City. 2 vols., rev. & enl. New York: Harcourt.
FINKELSTEIN, LOUIS (editor) (1949) 1960 The Jews: Their History, Culture and Religion. 3d ed., 2 vols. New York: Harper. FRANKLIN, JOHN H. (1947) 1956 From Slavery to Freedom: A History of American Negroes. 2d ed., rev. & enl. New York: Knopf. FREEDMAN, MAURICE (editor) 1955 A Minority in Britain: Social Studies of the Anglo-Jewish Community. London: Vallentine. GLASS, RUTH (1960) 1961 London's Newcomers: The West Indian Migrants. Center for Urban Studies, University College, London, Report No. 1. Cambridge, Mass.: Harvard Univ. Press. -> First published as Newcomers: The West Indians in London. Chapters 2 and 3 analyze geographical distribution and discrimination in housing. HUGHES, EVERETT C.; and HUGHES, HELEN M. 1952 Where Peoples Meet: Racial and Ethnic Frontiers. Glencoe, Ill.: Free Press. KANE, JOHN J. 1955 Catholic-Protestant Conflicts in America. Chicago: Regnery. KONVITZ, MILTON R. 1946 The Alien and the Asiatic in American Law. Ithaca, N.Y.: Cornell Univ. Press. LINCOLN, CHARLES E. 1961 The Black Muslims in America. Boston: Beacon. MYRDAL, GUNNAR (1944) 1962 An American Dilemma: The Negro Problem and Modern Democracy. New York: Harper. -> A paperback edition was published in 1964 by McGraw-Hill. RICHMOND, ANTHONY H. (1955) 1961 The Colour Problem. Rev. ed. Baltimore: Penguin. ROSE, ARNOLD M.; and ROSE, CAROLINE B. (editors) 1965 Minority Problems. New York: Harper. SCHERMERHORN, R. A. 1964 Toward a General Theory of Minority Groups. Phylon 25:238-246. SHIBUTANI, TAMOTSU; and KWAN, K. M. 1965 Ethnic Stratification: A Comparative Approach. New York: Macmillan. SIMPSON, GEORGE E.; and YINGER, J. MILTON (1953) 1965 Racial and Cultural Minorities: An Analysis of Prejudice and Discrimination. 3d ed. New York: Harper. SOMBART, WERNER (1911) 1913 The Jews and Modern Capitalism. London: Allen & Unwin. -> First published as Die Juden und das Wirtschaftsleben. A paperback edition was published in 1962 by Collier. STONEQUIST, EVERETT V.
(1937) 1961 The Marginal Man. New York: Russell. SULKOWSKI, JOZEF 1944 The Problem of International Protection of National Minorities: Past Experience as a Basis for Future Solution. New York: Privately published. TAEUBER, KARL E.; and TAEUBER, ALMA F. 1965 Negroes in Cities: Residential Segregation and Neighborhood Change. Chicago: Aldine. VANDER ZANDEN, JAMES W. (1963) 1966 American Minority Relations: The Sociology of Race and Ethnic Groups. 2d ed. New York: Ronald Press. WAGLEY, CHARLES; and HARRIS, MARVIN 1958 Minorities in the New World. New York: Columbia Univ. Press. WILLIAMS, ROBIN M. JR. 1964 Strangers Next Door: Ethnic Relations in American Communities. Englewood Cliffs, N.J.: Prentice-Hall.
MISES, LUDWIG VON See VON MISES, LUDWIG.
MISES, RICHARD VON See VON MISES, RICHARD.

MISSELDEN, EDWARD

Edward Misselden (fl. 1608-1654) was both an English merchant and a comparatively enlightened mercantilist. As a mercantilist, he made an important contribution to the development of the idea of the balance of trade as an analytical and measurable concept. Concern over the state of England's foreign trade moved Parliament in 1622 to appoint a standing commission on trade, and this event stimulated Misselden to write his tract Free Trade: Or, the Meanes to Make Trade Florish (1622). The book attributed the alleged decay of English foreign trade to excessive imports, to the export of bullion by the East India Company, and to the defective enforcement of the regulations of the cloth trade. Misselden contended that the loss of bullion was partly due to the undervaluation of English coin. Consequently he proposed that its denomination be raised, in the hope that the outflow of bullion would be checked and that, in "the plenty of money," trade would be quickened and exports increased. He conceded that too much bullion in the form of plate would cause scarcity of money; nevertheless, for a nation to have plate was considered preferable to turning it into coin and sending it out of the kingdom because of its undervaluation. He realized that landlords and creditors would suffer losses if the denomination were raised and advocated that they be protected by a provision that would make contracts negotiated before the raising of the currency payable at the value of the money when the contracts were made. Like other mercantilists, he did not regard higher prices as an evil if they are accompanied by at least an equal increase in money, stocks, employment, or incomes. In effect, this was his answer to the possible objection that raising the denomination of English coin would result in an increase in commodity prices.
Misselden's second book, The Circle of Commerce: Or, the Ballance of Trade, was written as a reply to Gerard Malynes—"a dastardly combatant"—who accused him of overlooking the role of foreign exchanges as the chief cause of England's distress. In this book, Misselden appears to have used in print for the first time the phrase "balance of trade," describing it as "an excellent and politique Invention, to shew us the difference of waight in the Commerce of one Kingdome with another"
(1623, p. 116). The notion of balance was well known; it was the actual measurement of trade in the absence of periodic trade statistics that he regarded as a novel idea. "The first End of our Ballance of trade," he wrote, "is to shew us the state thereof" (1623, gloss on p. 133). Indeed, by multiplying the customs revenue by 20 (a tariff of 5 per cent), Misselden attempted to measure England's trade balance for the year 1621; he found it in deficit and, using mercantilist arguments, warned of the impoverishment of the people. Recognizing that international balance does not consist of commodity exports and imports alone, Misselden added such items as re-exports, profits from fisheries, and freight earnings to commodity statistics in computing the balance. Accordingly, he denied Malynes' accusation that the East India Company had contributed to England's shortage of money by exporting "Reals of Plate" to the East Indies, asserting that England not only would benefit from increased employment and freight earnings related to these exports but also would, in the net, derive more bullion from the import of Indian commodities and their re-export to "all parts of the World" than it sent to India to purchase them (1623, p. 35). In contrast to Malynes, who allegedly held the view that the relative value of internationally traded commodities depends upon the value of the exchanges, Misselden argued that the market value of the exchanges is itself dependent upon the relative demand and supply of the respective foreign currencies and, in turn, upon the relative demand and supply of commodities of the respective countries. Misselden did believe that there is a natural rate of exchange that can be determined by melting down metallic money into its pure form. This "fineness" of money is the "center"—or, in modern terminology, the mint par—"whereunto all Exchanges have their naturall propension" (1623, p. 97). 
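The customs-based estimate described above lends itself to a short worked illustration. The sketch below is offered only under stated assumptions: the function names and all figures are hypothetical and are not taken from Misselden's tract. It shows simply that, at a uniform duty of 5 per cent, the value of trade is the customs revenue multiplied by 20, and that the balance is the difference between estimated exports and estimated imports.

```python
# Illustrative sketch only: names and figures are hypothetical,
# not Misselden's own data.

def estimate_trade_value(customs_revenue, tariff_percent=5):
    """Infer the value of goods traded from the duty collected on them.

    Assumes one uniform ad valorem duty; at 5 per cent this is the
    same as multiplying the customs revenue by 20.
    """
    return customs_revenue * 100 / tariff_percent


def trade_balance(export_customs, import_customs, tariff_percent=5):
    """Estimated exports minus estimated imports (negative = deficit)."""
    exports = estimate_trade_value(export_customs, tariff_percent)
    imports = estimate_trade_value(import_customs, tariff_percent)
    return exports - imports


# Hypothetical customs receipts, in pounds.
balance = trade_balance(export_customs=100, import_customs=120)
print(balance)  # a negative figure indicates a deficit
```

A fuller reckoning in Misselden's spirit would also add such items as re-exports, profits from fisheries, and freight earnings before striking the balance.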
In the last analysis, Misselden observed, the exportation or importation of bullion is to be explained by the general abundance or scarcity of commodities, and with some exaggeration, he accused Malynes of having stated the argument backwards. This controversy probably represents the first time in English history that a question of economic policy produced a war of tracts which exerted an immediate and traceable influence on government policy. The new views of Misselden and Thomas Mun, an official of the East India Company whom he knew and cited with approval, won a definite victory over the older views of Malynes and Milles. This is not to suggest that Misselden's writings
on economics were the product of objective scholarship; while they represented an advance over those of earlier pamphleteers, his contributions to the balance-of-trade doctrine and to the theory of exchanges were steeped in pressure-group politics and in particular circumstances. His exaggerated emphasis on national objectives, combined with his failure to disclose his own private trade connections, lends support to those who see English mercantilism as a body of international trade doctrine primarily concerned with the importance of England's having an excess of exports over imports—and developed for the most part by merchants pleading for special interests. Although there is a sharp and irreconcilable conflict between economic universalism and mercantilism, Misselden appears not to have been aware of any such conflict. He argued that England had to increase its exports and decrease its imports to achieve and maintain a positive balance of trade; otherwise its trade would decay and its bullion be lost. He failed to mention, however, that the continuous accumulation of the precious metals by one nation cannot but harm the economic interests of other nations. He simply asserted the natural harmony of private and public interests (1623, p. 17). His failure to understand the fundamental elements of a self-regulating mechanism of adjustment is explained by his confusion about interests, as well as by his generally limited conceptualization. There is always danger in describing the thought of an earlier period by means of terms or concepts that have later taken on different meanings. Thus, when Misselden advocated greater economic freedom or "free trade" he was certainly not defending economic freedom in principle or in general but only specific forms and degrees of economic freedom. Similarly, when he objected to governmental regulation he was not objecting to it in principle but only to specific forms or degrees of regulation. 
The achievement of formulating a general doctrine of economic freedom, as distinguished from a selective advocacy of specific freedoms, belongs to the physiocrats and to Adam Smith. They, however, do appear to have built up their general case from earlier specific arguments for particular freedoms. Although never entirely original, Misselden, by advocating certain particular freedoms, made a genuine contribution to doctrinal advance. He pointed out elements in the physical and social order of nature that tend to produce a self-operating and beneficial pattern of human behavior; asserted, however naively, the harmony of private and public interests; stressed the role of invisibles
and re-exports in the trade balance; achieved considerable generalization in his discussion of exchanges; and devoted particular attention to the rising role of the large-scale merchant and the important contribution he could make to society if left largely to his own devices. All these concepts served as antecedents in the development of the general doctrine of economic free trade.

JOHN M. LETICHE

[See also ECONOMIC THOUGHT, article on MERCANTILIST THOUGHT; INTERNATIONAL MONETARY ECONOMICS, article on BALANCE OF PAYMENTS.]

WORKS BY MISSELDEN

1622 Free Trade: Or, the Meanes to Make Trade Florish. London: Waterson.
1623 The Circle of Commerce: Or, the Ballance of Trade, in Defence of Free Trade . . . London: Dawson.

SUPPLEMENTARY BIBLIOGRAPHY
FRIIS, ASTRID 1927 Alderman Cockayne's Project and the Cloth Trade: The Commercial Policy of England in Its Main Aspects; 1603-1625. Copenhagen: Munksgaard. GARDINER, SAMUEL R. (1883-1884) 1896-1901 History of England From the Accession of James I to the Outbreak of the Civil War: 1603-1642. 10 vols. London: Longmans. HECKSCHER, ELI F. (1931) 1955 Mercantilism. 2 vols., rev. ed. London: Allen & Unwin; New York: Macmillan. -> First published in Swedish. HEWINS, WILLIAM A. S. 1892 English Trade and Finance Chiefly in the Seventeenth Century. London: Methuen. HEWINS, WILLIAM A. S. (1894) 1953 Edward Misselden. Volume 13, pages 498-499 in Dictionary of National Biography. Oxford Univ. Press. JOHNSON, EDGAR A. J. 1937 Predecessors of Adam Smith: The Growth of British Economic Thought. Englewood Cliffs, N.J.: Prentice-Hall. -> See especially pages 57-69 on "Misselden, the Critic." LETICHE, JOHN M. 1959 Balance of Payments and Economic Growth. New York: Harper. SCHUMPETER, JOSEPH A. (1954) 1960 History of Economic Analysis. Edited by E. B. Schumpeter. New York: Oxford Univ. Press. SUPPLE, BARRY E. 1959 Commercial Crisis and Change in England, 1600-1642: A Study in the Instability of a Mercantile Economy. Cambridge Univ. Press. SUVIRANTA, BRUNO 1923 The Theory of the Balance of Trade in England: A Study in Mercantilism. Helsinki: Suomalaisen Kirjallisuuden Seura. VINER, JACOB 1937 Studies in the Theory of International Trade. New York: Harper. VINER, JACOB 1961 The Intellectual History of Laissez Faire. Univ. of Chicago Law School.
MITCHELL, WESLEY C.

Wesley Clair Mitchell (1874-1948), American economist, was born and raised in Illinois in modest and occasionally straitened circumstances. Entering the University of Chicago in 1892, he was soon attracted by the evolutionary view of the development of human thought and social institutions as advanced by John Dewey and Thorstein Veblen. His other teachers in economics influenced him less, but J. Laurence Laughlin introduced him to problems in monetary theory and policy and also helped to sharpen his critical sense. Mitchell's extensive early writing was devoted mainly to the role of money and its connection with price "revolutions" during the Civil War. He completed a doctoral dissertation on this subject in 1899 and later expanded it into a book entitled A History of the Greenbacks (1903), a major and in many respects still authoritative work on the monetary upheavals of 1862-1865. In 1908 he published Gold, Prices, and Wages Under the Greenback Standard, which carries the analysis of the History further and skillfully organizes a massive statistical investigation of the period 1862-1878. From his studies of the greenbacks Mitchell derived some of the ideas that were to guide his economic thought. One of the earliest of these ideas was that economics, as one of the sciences concerned with the realities of human behavior and social institutions, must be grounded in observation and measurement. Statistics provide the principal means to ensure a cumulative growth of tested quantitative knowledge, and they are essential for testing hypotheses as well as for suggesting new ones. Mitchell's lifelong interest in the collection and improvement of economic data led him to make several influential studies, one of the best examples of which is The Making and Using of Index Numbers (a monograph published by the U.S. Bureau of Labor Statistics in 1915 and reissued as late as 1938).
Such endeavors did much to increase and improve the output of statistical agencies in several areas, notably the preparation of indexes of commodity and security prices, indexes of production, measures of national income and its subdivisions, and measures of financial and other monetary transactions. Another of Mitchell's guiding concepts was that "the use of money and the pecuniary way of thinking it begets is a most important factor in the modern situation." Throughout his working life, Mitchell conceived his central task to be the acquisition and the transmission of knowledge about how the "money economy" works—that is, how it did and does evolve and how it influences men's ideas and behavior. Aiming ultimately at a tested dynamic theory of economic change, Mitchell was often critical of
"orthodox" doctrines which dealt with essentially static problems (for example, relative prices, resource allocation, and income distribution, seen as interconnected and determined in equilibrium). He insisted that deductive reasoning must seek support from tests against observed facts; that logical determinacy and consistency of a theoretical system are not sufficient tests of its "truth"; that theorists are susceptible to, and should guard against, unconscious ideological preconceptions; and that methodologically convenient postulates may be misleading when indiscriminately applied. He thought, for example, that the "logic of pecuniary institutions" (the quest for profits or "making money") fits tolerably well the activities of modern businessmen but not those of housewives who are engaged in the "backward art of spending money" on articles of family consumption. Occasionally, Mitchell's distrust of theoretical model-building seems too general and categorical; but his writings also clearly recognize the value of traditional theory and the complementarity of theoretical and empirical studies. From 1903 to 1913 Mitchell taught at the University of California and engaged in particularly productive research. He completed his studies of the greenback episode, turned to a more general analysis of the evolution of the money system, discovered that this task required him to investigate the as yet scarcely explored "recurring readjustments of prices," and embarked upon a detailed study of this subject, which soon grew into a comprehensive account of cyclical movements in diverse economic variables. In 1913 his Business Cycles appeared, a large treatise which became an acknowledged precursor and guide for cyclical (and other quantitative) studies in economics for years to come. In particular, Part 3 of this work, reprinted several times between 1941 and 1959, remains one of the best analytical accounts of the processes that give rise to business cycles.
Theory of business cycles. Mitchell found the then current theories of business cycles suggestive but fragmentary and uncertain; since none of them was clearly superior, he proposed to test them generally by studying the facts instead of making any one theory a separate object of research. He always conceived of the business cycle in evolutionary and comprehensive terms, and he believed that the study of cycles should encompass all the major parts and aspects of the money economy and should try to explain how they interact and participate in the process of cumulative change. Business cycles, he observed, appear only in those times and places where economic life has come to be organized on
the basis of making and spending money incomes. They are not mere random disturbances that jolt the economy temporarily out of its equilibrium; instead, they are widely diffused, cumulative fluctuations generated by the modern economic system itself. Although not periodic, they recur in an unceasing round. Each cycle differs from its predecessors in many particulars, both because it is influenced by unique historical events and because the economy's structure and reactions change gradually over time. But the study of business annals and historical statistics leads to the conclusion that cycles do have sufficiently numerous and important features in common to constitute a distinct class of related phenomena, somewhat analogous to a biological genus of many species. Thus conceived, there is no single cause, no simple—or even complex but complete and immutable—explanation of business cycles. Mitchell's "analytic description" does identify as the central issue the dependence of tides in business activity upon the prospects of profits or (in times of crisis) the quest for solvency. It emphasizes, therefore, such factors as the prices that enter into business receipts and expenses, the volume of sales effected at prevailing margins of profit, and businessmen's cash and credit requirements. But to explain business cycles it is not enough to know what each of these "factors of chief significance" is or does during such cycles; the "harder half of the battle" is to follow the complex interactions of these factors from one cycle stage to the next. Thus, prosperity "cumulates" by spreading over enterprises, industries, regions, and types of economic activities in ever widening circles. Increases in prices accompany increases in the physical volume of business. But gradually, stresses also start to multiply. In many areas, costs begin to encroach upon selling prices. 
The rapid rise in prices of raw materials and goods for resale, the advance in interest rates on bank loans, the lowered efficiency of labor, the reactivation of older and poorer equipment, the increase in "incidental wastes of management"—all these contribute to the rise in costs. At the same time, the advance in selling prices is held back by contractual obligations, custom, and public regulation. In some industries, completion of new investment projects brings about increases in capacity outputs that cannot be sold at the high prices asked. The decline in their reserves makes the banks reluctant to expand loans further. A stringency in capital and money markets develops. These factors bring about postponements of investment projects by those who find current financing and construction costs abnormally high and who
expect that they will soon decline. Prosperity, then, "breeds" a crisis or gives way to a contraction, which again cumulates and spreads over the economy for a time. Even as the decline of activity continues, however, forces of recovery gradually gain strength. Marginal enterprises are reorganized or recapitalized, bad debts are charged off, and inventories are reduced, so that business can again be conducted more efficiently. Less efficient employees are weeded out, and those that remain are constrained to work harder to protect their jobs. The reserve position of banks improves, and costs of borrowing decline. Raw-material prices also fall, and a general realignment of costs and selling prices comes about. Meanwhile the demand for goods revives slowly, aided by the continuing growth of population, by the need to replace long-used durables and to replenish depleted inventories, and by improved investment opportunities (also, perhaps, by new tastes, products, and techniques, or some other favorable outside events). This is "how depression breeds prosperity" and a new cycle begins. Mitchell attempted to reduce the intricacies of his subject, but he was even more anxious not to oversimplify it; his account of business cycles is, accordingly, complex. Even a brief and necessarily rough sketch of that analysis, however, must not fail to underscore that Mitchell viewed the business cycle as a process whose pervasiveness and continuity are due mainly to institutional responses of the economic system to a variety of unpredictable changes. The lags in these responses are of strategic significance in the dynamics of the cycle. Thus, the essential feature of the cumulative movements of expansion and contraction is that expenditures often are induced by, and lag behind, receipts and that selling prices lag behind buying prices. Also important are the lags of investment expenditures and deliveries behind investment decisions. 
Orders for new plant and equipment start expanding early during a business revival, when capacity reserves are still ample. Not knowing how much business their competitors are booking, contractors often sell to investors more construction than can readily be executed within the contract time. This is discovered later when efforts to get labor and materials needed to perform the work drive up the prices of these factors. Timely deliveries then become costly, some of the earlier contracts turn out to be less profitable than expected, and much higher prices are asked when new contracts are negotiated. The high costs of construction cause postponement of many new investment projects.
In his concept of a self-generating cycle and in his recognition of the need for empirical research, Mitchell had a forerunner in Clement Juglar. Writings akin to Mitchell's in spirit and approach, notably those by Aftalion and Spiethoff, appeared at about the same time as Business Cycles or in the following decade. But Mitchell's research was unique in its broad scope and continuity, and it proved particularly influential. Teaching and research. The study of business cycles remained Mitchell's central scientific concern for the rest of his life, but he frequently also gave attention to other interests and duties. In 1913 he went to Columbia University, where he taught until his retirement in 1944 (except for 1919-1922, when he joined Alvin Johnson and others in organizing the New School for Social Research). His lectures attracted large audiences; one set, "Types of Economic Theory," was transcribed stenographically and issued in mimeographed form. This selective historical survey of the thought and activities of major economists from Adam Smith to Veblen and Commons gave rather scant attention to technical economic theory; instead, it emphasized the dependence of ideas on the main economic and social issues of the time, and also, conversely, the influence of ideas on the course of events. In several essays on individual men and doctrines—dealing with, among others, Ricardo, Bentham, and Wieser (see [1912-1936] 1937, chapters 10-12)—Mitchell showed also the interdependence of economic thought and political and moral philosophy. In 1920 Mitchell assumed direction of the National Bureau of Economic Research in New York City, an organization for "exact and impartial investigations," whose founding was in a large measure due to his initiative. He served as director until 1945 and continued to be an active member of the Bureau's research staff until his death. He was author or coauthor of numerous National Bureau publications. 
Equally important was his role in inspiring and helping to carry out many other studies of the National Bureau. These included work on such diverse topics as seasonal fluctuations (Simon Kuznets), interest rates (Frederick Macaulay), production trends (Arthur F. Burns), national income (Kuznets), cyclical movements in transportation (Thor Hultgren), inventories (Moses Abramovitz), prices (Frederick Mills), consumption (Ruth Mack), business-cycle indicators (Geoffrey Moore), and others. In particular, the National Bureau's continuing series, Studies in Business Cycles, owes its existence to Mitchell's initiative, contributions, and guidance.
Resurvey of the field. Business Cycles: The Problem and Its Setting (1927) was the first book Mitchell wrote after he undertook a "resurvey of the field" to improve the analysis and to subject to extensive new tests the findings of his 1913 treatise. Although large, this book corresponds in scope only to the first of the three main parts of its predecessor. The statistical basis of the 1913 book covered four countries but was restricted to the brief period from 1890 to 1911 and to annual data, which often obscure cyclical movements. Furthermore, many economic processes were but poorly, or not at all, represented in the then available materials. By 1927 these deficiencies could be substantially reduced as new data accumulated at a fast pace. The 1927 study relies on a substantial collection of American and English "indexes of business conditions," which are monthly or quarterly and cover various periods between 1850 and 1925. It also uses reports by contemporary observers on each year's business in 17 countries; the oldest of these "business annals," compiled by Willard L. Thorp, goes back to 1790. Drawing on a much larger mass of evidence than was previously accessible, Mitchell found no reason to alter his basic views on the nature of business cycles and the methods that are appropriate for their study. Of particular interest to the general economist is Chapter 2 of The Problem and Its Setting, which provides a comprehensive survey of the evolution and major features of the modern "economic organization," with particular reference to business cycles. One section of the book offers an important theoretical contribution by showing that the different elements in the equation of exchange must have differing dates if the equation is to be valid. The lags of deliveries and payments behind price agreements, which are reflected in these dating differences, are highly important for the short periods relevant for business-cycle analysis. 
Over longer periods, such as a year or more, the difference in dating can be disregarded. In the long run, the quantity of money was seen by Mitchell to be the major determinant of the price level, while in short periods its role is usually passive. This is because in contractions and moderate expansions the effective limit on business transactions is set by demand, whereas in "intense booms" a higher limit is reached, which is set by the monetary and banking systems. This accounts for the strong influence of changes in money supply over long periods of time. [See MONEY, article On QUANTITY THEORY.]
In the "tentative working plans" sketched at the end of the 1927 volume, the question, How do
business cycles run their course? was put ahead of the question, What causes business cycles? It is necessary to answer the former question before one can see what the latter means and "in what sense it can be answered." But Mitchell found that the task of providing this answer required a much larger apparatus of research than he had at first expected: compilation of numerous time series to represent a variety of pertinent processes; development of new techniques of isolating and analyzing cyclical movements; and application of these methods to the accumulated data. Under Mitchell's direction, all these operations proceeded simultaneously at the National Bureau, each influencing the others. Use of cyclical measures. The progress of Mitchell's researches can be traced through several brief preliminary reports, notably in the Bureau's Annual Reports and the Bulletin series, nos. 31 (Mitchell 1929), 57 (Mitchell & Burns 1935), and 69 (Mitchell & Burns 1938). The last of the above, "Statistical Indicators of Cyclical Revivals," deserves particular notice, since it paved the way for the more recent research on uses of cyclical measures in the analysis of current business conditions and in short-term forecasting. A full account of their methods of analyzing cyclical behavior was presented by Burns and Mitchell in Measuring Business Cycles (1946). The working definition of business cycles used in Mitchell's 1927 volume is here adopted, with some modifications, as a tool of research. The book draws on the results of a systematic analysis of over 1,000 series, most of them monthly, for the United States, Great Britain, Germany, and France. The clustering of turning points in representative samples of these series, together with selected measures of aggregate economic activity, helps to determine the chronologies of general business expansions and contractions in the four countries. 
Two sets of measures for each series are computed, describing its behavior during the phases of the general business cycle and during its own "specific cycles," respectively; the differences between the two sets reflect the extent to which the turning dates in an individual series deviate from the turning dates in aggregate economic activity. These deviations also measure the cyclical leads or lags of different economic processes. In addition, measures of duration and amplitude of specific cycles are introduced, as well as "conformity indexes" that reflect the degree of directional agreement between the movements of a given series and those prevailing in the economy at large. Averages of these measures are struck for all the cycles covered by a series in order to
bring out the features that are typical; but deviations from these averages are also presented in order to study how the cycles vary in duration, intensity, etc. The last chapters of the volume contain tests of several hypotheses on the effects of secular trends, long-wave movements, and critical historic events upon business cycles. Some apparently systematic changes in the cyclical behavior of particular variables are indicated, but they seem to be rather weak. A tentative inference, qualified by acknowledged limitations of the data and methods used, is that these changes do not alter the typical course of business cycles appreciably. Death prevented Mitchell from completing a final treatise in which he planned to give a comprehensive account of business cycles and their causes. His posthumous What Happens During Business Cycles: A Progress Report (1951) is only a fragment of this project. More than two-thirds of its text is given to a systematic analysis of the variety of cyclical attributes of individual economic processes. This part is preceded by a brief exposition entitled "Aims, Methods, and Materials" and is followed by the unfinished part called "The Consensus of Cyclical Behavior." The latter indicates the synthesis toward which Mitchell's work was directed. It aims at showing "how [the] measures of cyclical behavior in various parts of the economy fit together, and what composite picture they give of business cycles" (1951, p. 255). A central theme of the Progress Report is that "both the similarities and the differences (among the cyclical patterns of a substantial sample of representative series) are explicable on the assumption that economic activities are functionally related to one another in the numberless direct and indirect ways suggested in fancy by the equations of Walras and the analyses of Marshall" (p. 112). 
The specific cycles of each activity depend in part on "factors peculiar to the activity itself" and in part on "those congeries of specific cycles in other activities which we call business cycles" (p. 113). An overwhelming majority of economic series fluctuate in sympathy with the cyclical phases in aggregate activity: only about one-tenth show "irregular" timing, that is, no systematic relation in time to business cycles. Countercyclical processes are not only far less numerous but also typically less regular than those that respond positively. But some conforming series tend to lead and others to lag by different intervals. Hence, business cycles consist not only of roughly synchronous expansions followed by roughly synchronous contractions in activities: "they consist also of numerous
contractions while expansion is dominant, and numerous expansions while contraction is dominant" (p. 79). Mitchell not only put these observations in numerical form but also identified many of the leaders, coinciders, and laggers in the typical round of cyclical developments. While business cycles provided the focus for Mitchell's studies, they were conceived by him as nothing less than the "economic process in motion," or the characteristic course of the money economy in a late stage of development. In this view, to understand business cycles is to understand also the structure and functioning of the economy. The interrelated movements of numerous economic variables reflect economic behavior in its current institutional setting, and their objective, quantitative analysis is the way to reach an understanding of that behavior. This view secured for Mitchell an outstanding place in the evolution of modern economics as a quantitative and empirical science. The theoretical thinking of our times may not seem strongly affected by the general institutionalist challenge of several decades ago, in which Mitchell participated. But Mitchell's own scholarly, empirical approach left a strong imprint on the growth and orientation of economic research. This applies to his critical attitude toward purely normative and deductive economics; his insistence on the need to improve and systematize observation of the economic manifestations of human behavior; and his belief that cumulation of tested knowledge will ensure that economics will have an important place in the deliberations of policy makers. Mitchell had strong humanitarian sympathies and democratic convictions as well as a vivid awareness of the shortcomings of the contemporary social order. 
He believed that advances in economics and other social sciences can help to reduce such defects of the economic system as recurrence of depressions and unemployment, inequality of opportunity, concentration of power, and insufficient economic security. He recognized that "in the countries that have given wide scope to private initiative . . . , the masses of mankind attained a higher degree of material comfort and a larger measure of liberty than . . . under any other form of organization that mankind has tried out in practice" ([1912-1936] 1937, p. 94). But he also believed that government has social responsibilities which it should meet in the most practicable way. The "nation's full intelligence" should be organized "to deal seriously with social problems before they have produced national emergencies" (pp. 100, 131). National planning, thus conceived as a broad
and continuous effort, would encounter many difficulties and doubtless have its share of failures; nevertheless, it would be preferable to the ad hoc "piecemeal planning," which Mitchell saw as "our common method of attempting to use the powers of government" (p. 130). Although he treated scientific work as his prime commitment, Mitchell gave much of his time to public affairs. During World War I he was chief of the Price Section of the War Industries Board. Later, he served by presidential appointment on the Research Committee on Social Trends, 1929-1933, and the National Planning Board, 1933, and he prepared a report for the President's Committee on the Cost of Living, 1944.
VICTOR ZARNOWITZ
[See also BUSINESS CYCLES; ECONOMIC THOUGHT, article on THE INSTITUTIONAL SCHOOL; INDEX NUMBERS.]
WORKS BY MITCHELL
1903 A History of the Greenbacks, With Special Reference to the Economic Consequences of Their Issue: 1862-1865. Univ. of Chicago Press.
1908 Gold, Prices, and Wages Under the Greenback Standard. University of California Publications in Economics, Vol. 1. Berkeley: Univ. of California Press.
(1912-1936) 1937 The Backward Art of Spending Money, and Other Essays. New York: McGraw-Hill.
1913 Business Cycles. Berkeley: Univ. of California Press. -> Part 3 was reprinted by the University of California Press in 1959 as Business Cycles and Their Causes.
(1915) 1938 The Making and Using of Index Numbers. 3d ed. U.S. Bureau of Labor Statistics, Bulletin No. 656. Washington: Government Printing Office.
1927 Business Cycles: The Problem and Its Setting. National Bureau of Economic Research, Publications, No. 10. New York: The Bureau.
1929 Testing Business Cycles. National Bureau of Economic Research, Bulletin 31.
1935 MITCHELL, WESLEY C.; and BURNS, ARTHUR F. The National Bureau's Measures of Cyclical Behavior. National Bureau of Economic Research, Bulletin 57.
1938 MITCHELL, WESLEY C.; and BURNS, ARTHUR F. Statistical Indicators of Cyclical Revivals. National Bureau of Economic Research, Bulletin 69.
1946 BURNS, ARTHUR F.; and MITCHELL, WESLEY C. Measuring Business Cycles. National Bureau of Economic Research, Studies in Business Cycles, No. 2. New York: The Bureau.
1951 What Happens During Business Cycles: A Progress Report. National Bureau of Economic Research, Studies in Business Cycles, No. 5. New York: The Bureau. -> Published posthumously.
SUPPLEMENTARY BIBLIOGRAPHY
BURNS, ARTHUR F. 1951 Mitchell on What Happens During Business Cycles. Pages 3-14 in Conference on Business Cycles, New York, 1949, Conference on Business Cycles. New York: National Bureau of Economic Research. -> Jacob Marschak's "Comment" and a reply by Arthur F. Burns appear on pages 14-33.
BURNS, ARTHUR F. (editor) 1952 Wesley Clair Mitchell: The Economic Scientist. National Bureau of Economic Research, Publications, No. 53. New York: The Bureau. -> Contains a list of Mitchell's publications.
DORFMAN, JOSEPH 1949 The Economic Mind in American Civilization. Vol. 3. New York: Viking. -> See especially pages 455-473, on Mitchell.
HANSEN, ALVIN H. (1951) 1964 Business Cycles and National Income. Expanded ed. New York: Norton. -> See especially pages 394-410, on Mitchell's work.
MOBILITY
See LABOR FORCE, article on MARKETS AND MOBILITY; MIGRATION; SOCIAL MOBILITY.
MODELS, MATHEMATICAL
Although mathematical models are applied in many areas of the social sciences, this article is limited to mathematical models of individual behavior. For applications of mathematical models in econometrics, see ECONOMETRIC MODELS, AGGREGATE. Other articles discussing modeling in general include CYBERNETICS, PROBABILITY, SCALING, SIMULATION, and SIMULTANEOUS EQUATION ESTIMATION. Specific models are discussed in various articles dealing with substantive topics.
Theories of behavior that have been developed and presented verbally, such as those of Hull or Tolman or Freud, have attempted to describe and predict behavior under any and all circumstances. Mathematical models of individual behavior, by contrast, have been much less ambitious: their goal has been a precise description of the data obtained from restricted classes of behavioral experiments concerned with simple and discrimination learning; with detection, recognition, and discrimination of simple physical stimuli; with the patterns of preference exhibited among outcomes; and so on. Models that embody very specific mathematical assumptions, which are at best approximations applicable to highly limited situations, have been analyzed exhaustively and applied to every conceivable aspect of available data. From this work broader classes of models, based on weaker assumptions and thus providing more general predictions, have evolved in the past few years. The successes of the special models have stimulated, and their failures have demanded, these generalizations. The number and variety of experiments to which these mathematical models have been applied have also grown, but not as rapidly as the catalogue of models. Most of the models so far developed are restricted to experiments having discrete trials. Each trial is composed of three types of events: the
presentation of a stimulus configuration selected by the experimenter from a limited set of possible presentations; the subject's selection of a response from a specified set of possible responses; and the experimenter's feedback of information, rewards, and punishments to the subject. Primarily because the response set is fixed and feedback is used, these are called choice experiments (Bush et al. 1963). Most psychophysical and preference experiments, as well as many learning experiments, are of this type. Among the exceptions are the experiments without trials—e.g., vigilance experiments and the operant conditioning methods of Skinner. Currently, models for these experiments are beginning to be developed. Measures. With attention confined to choice experiments, three broad classes of variables necessarily arise—those concerned with stimuli, with responses, and with outcomes. The response variables are, of course, assumed to depend upon the (experimentally) independent stimuli and upon the outcome variables, and each model is nothing more or less than an explicit conjecture about the nature of this dependency. Usually such conjectures are stated in terms of some measures, often numerical ones, that are associated with the variables. Three quite different types of measures are used: physical, probabilistic, and psychological. The first two are objective and descriptive; they can be introduced and used without reference to any psychological theory, and so they are especially popular with atheoretical experimentalists, even though the choice of a measure usually reflects a theoretical attitude about what is and is not psychologically relevant. Although we often use physical measures to characterize the events for which probabilities are defined, this is only a labeling function which makes little or no use of the powerful mathematical structure embodied in many physical measures. 
The psychological measures are constructs within some specifiable psychological theory, and their calculation in terms of observables is possible only within the terms of that theory. Examples of each type of measure should clarify the meaning. Physical measures. In experimental reports, the stimuli and outcomes are usually described in terms of standard physical measures: intensity, frequency, size, weight, time, chemical composition, amount, etc. Certain standard response measures are physical. The most ubiquitous is response latency (or reaction time), and it has received the attention of some mathematical theorists (McGill 1963). In addition, force of response, magnitude of displacement, speed of running, etc., can sometimes be recorded. Each of these is unique to certain experimental realizations, and so they have not been much studied by theorists.
Probability measures. The stimulus presentations, the responses, and the outcomes can each be thought of as a sequence of selections of elements from known sets of elements, i.e., as a schedule over trials. It is not usual to work with the specific schedules that have occurred but, rather, with the probability rules that were used to generate them. For the stimulus presentations and the outcomes, the rules are selected by the experimenter, and so there is no question about what they are. Not only are the rules not known for the responses, but even their general form is not certain. Each response theory is, in fact, a hypothesis about the form of these rules, and certain relative frequencies of responses are used to estimate the postulated conditional response probabilities. Often the schedules for stimulus presentations are simple random ones in the sense that the probability of a stimulus' being presented is independent of the trial number and of the previous history of the experiment; but sometimes more complex contingent schedules are used in which various conditional probabilities must be specified. Most outcome schedules are to some degree contingent, usually on the immediately preceding presentation and response, but sometimes the dependencies reach further back into the past. Again, conditional probabilities are the measures used to summarize the schedule. [See PROBABILITY.] Psychological measures. Most psychological models attempt to state how either a physical measure or a probability measure of the response depends upon measures of the experimental independent variables, but in addition they usually include unknown free parameters—that is, numerical constants whose values are specified neither by the experimental conditions nor by independent measurements on the subject. 
Such parameters must, therefore, be estimated from the data that have been collected to test the adequacy of the theory, which thereby reduces to some degree the stringency of the test. It is quite common for current psychological models to involve only probability measures and unknown numerical parameters, but not any physical measures. When the numerical parameters are estimated from different sets of data obtained by varying some independent variables under the experimenter's control, it is often found that the parameters vary with some variables and not with others. In other words, the parameters are actually functions of some of the experimental variables, and so they can be, and often are, viewed
as psychological measures (relative to the model within which they appear) of the variables that affect them. Theories are sometimes then provided for this dependence, although so far this has been the exception rather than the rule. The theory of signal detectability, for example, involves two parameters: the magnitude, d', of the psychological difference between two stimuli; and a response criterion, c, which depends upon the outcomes and the presentation schedule. Theories for the dependence of d' and c upon physical measures have been suggested (Luce 1963; Swets 1964). Most learning theories for experiments with only one presentation simply involve the conditional outcome probabilities and one or more free parameters. Little is known about the dependence of these parameters upon experimentally manipulable variables. In certain scaling theories, numerical parameters are assigned to the response alternatives and are interpreted as measures of response strength (Luce & Galanter 1963). In some models these parameters are factored into two terms, one of which is assumed to measure the contribution of the stimulus to response strength and the other of which is the contribution due to the outcome structure. The phrasing of psychological models in terms only of probability measures and parameters (psychological measures) has proved to be an effective research strategy. Nonetheless, it appears important to devise theories that relate psychological measures to the physical and probability measures that describe the experiments. The most extensive mathematical models of this type can be found in audition and vision (Hurvich et al. 1965; Zwislocki 1965). The various theories of utility are, in part, attempts to relate the psychological measure called utility to physical measures of outcomes, such as amounts of money, and probability measures of their schedules, such as probabilities governing gambles (Luce & Suppes 1965). 
Although the utilities of outcomes must clearly be related to learning parameters, little is known about this relation. [See GAMBLING; GAME THEORY; UTILITY.]
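As an illustration of how the two parameters of the theory of signal detectability mentioned above might be estimated from probability measures, the following minimal Python sketch computes d' and a criterion c from a hit rate and a false-alarm rate. It assumes the common equal-variance Gaussian form of the theory and the convention c = -(z_hit + z_fa)/2; neither assumption is stated in this article, and the function and argument names are hypothetical.

```python
from statistics import NormalDist


def detectability(hit_rate, false_alarm_rate):
    """Estimate (d_prime, criterion) from two response proportions.

    Assumes an equal-variance Gaussian model (an assumption of this
    sketch, not of the article). Both rates must lie strictly between
    0 and 1, otherwise inv_cdf raises statistics.StatisticsError.
    """
    z = NormalDist().inv_cdf          # inverse of the standard normal CDF
    z_hit = z(hit_rate)               # z-transformed P("yes" | signal)
    z_fa = z(false_alarm_rate)        # z-transformed P("yes" | noise)
    d_prime = z_hit - z_fa            # psychological difference between stimuli
    criterion = -0.5 * (z_hit + z_fa) # response bias under the assumed convention
    return d_prime, criterion


if __name__ == "__main__":
    d, c = detectability(0.84, 0.16)
    print(f"d' = {d:.3f}, c = {c:.3f}")
```

Under these assumptions, changing the outcomes or the presentation schedule would be expected to move c while leaving d' roughly fixed, which is the sense in which the two parameters are treated as separate psychological measures.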
The nature of the models. The construction of a mathematical model involves decisions on at least two levels. There is, first, the over-all perspective about what is and is not important and about the best way to secure the relevant facts. Usually this is little discussed in the presentation of a model, mainly because it is so difficult to make the discussion coherent and convincing. Nonetheless, this is what we shall attempt to deal with in this section. In the following section we turn to the
second level of decision: the specific assumptions made. Probability vs. determinism. One of the most basic decisions is whether to treat the behavior as if it arises from some sort of probabilistic mechanism, in which case detailed, exact predictions are not possible, or whether to treat it as deterministic, in which case each specific response is susceptible to exact prediction. If the latter decision is made, one is forced to provide some account of the observed inconsistencies of responses before it is possible to test the adequacy of the model. Usually one falls back on either the idea of errors of measurement or on the idea of systematic changes with time (or experience), but in practice it has not been easy to make effective use of either idea, and most workers have been content to develop probability models. It should be pointed out that, as far as the model is concerned, it is immaterial whether the model builder believes the behavior to be inherently probabilistic, or its determinants to be too complex to give a detailed analysis, or that there are uncontrolled factors which lead to experimental errors. Static vs. dynamic models. A second decision is whether the model shall be dynamic or static. (We use these terms in the way they are used in physics; static models characterize systems which do not change with time or systems which have reached equilibrium in time, whereas dynamic models are concerned with time changes.) Some dynamic models, especially those for learning, state how conditional response probabilities change with experience. Usually these models are not very helpful in telling us what would happen if, for example, we substituted a different but closely related set of response alternatives or outcomes. In static models the constraints embodied in the model concern the relations among response probabilities in several different, but related, choice situations. The utility models for the study of preference are typical of this class. 
The main characteristic of the existing dynamic models is that the probabilities are functions of a discrete time parameter. Such processes are called stochastic, and they can be thought of as generating branching processes through the fanning out of new possibilities on each trial (Snell 1965). Each individual in an experiment traces out one path of the over-all tree, and we attempt to infer from a small but, it is hoped, typical sample of these paths something about the probabilities that supposedly underlie the process. Usually, if enough time is allowed to pass, such a process settles down (becomes asymptotic) in a statistical sense. This
is one way to arrive at a static model; and when we state a static model, we implicitly assume that it describes (approximately) the asymptotic behavior of the (unknown) dynamic process governing the organisms.
Psychological vs. mathematical assumptions. Another distinction is that between psychological and formal mathematical assumptions. This is by no means a sharp one, if for no other reason than that the psychological assumptions of a mathematical model are ultimately cast in formal terms and that psychological rationales can always be evolved for formal axioms. Roughly, however, the distinction is between a structure built up from elementary principles and a postulated constraint concerning observable behavior. Perhaps the simplest example of the latter is the axiom of transitivity of preferences: if a is preferred to b and b is preferred to c, then a will be preferred to c. This is not usually derived from more basic psychological postulates but, rather, is simply asserted on the grounds that it is (approximately) true in fact. A somewhat more complex, but essentially similar, example is the so-called choice axiom, which postulates how choice probabilities change when the set of possible choices is either reduced or augmented (Luce 1959). Again, no rationale was originally given except plausibility; later, psychological mechanisms were proposed from which it derives as a consequence. The most familiar example of a mathematical model which is generally viewed as more psychological and less formal is stimulus sampling theory. In this theory it is supposed that an organism is exposed to a set of stimulus "elements" from which one or more are sampled on a trial and that these elements may become "conditioned" to the performed response, depending upon the outcome that follows the response (Atkinson & Estes 1963).
The concepts of sampling and conditioning are interpreted as elementary psychological processes from which the observed properties of the choice behavior are to be derived. Lying somewhere between the two extremes just cited are, for example, the linear operator learning models (Bush & Mosteller 1955; Sternberg 1963). The trial-by-trial changes in response probabilities are assumed to be linear, mainly because of certain formal considerations; the choice of the limit points of the operators in specific applications is, however, usually based upon psychological considerations; and the resulting mathematical structure is not evaluated directly but, rather, in terms of its ability to account for the observed choice behavior as summarized in such observables as the mean learning curve, the
sequential dependencies among responses, and the like. Recurrent theoretical themes. Beyond a doubt, the most recurrent theme in models is independence. Indeed, one can fairly doubt whether a serious theory exists if it does not include statements to the effect that certain measures which contribute to the response are in some way independent of other measures which contribute to the same response. Of course, independence assumes different mathematical forms and therefore has different names, depending upon the problem, but one should not lose sight of the common underlying intuition which, in a sense, may be simply equivalent to what we mean when we say that a model helps to simplify and to provide understanding of some behavior. Statistical independence. In quite a few models simple statistical independence is invoked. For example, two chance events, A and B, are said to be independent when the conditional probability of A, given B, is equal to the unconditional probability of A; equivalently, the probability of the joint event AB is the product of the separate probabilities of A and B. A very simple substantive use of this notion is contained in the choice axiom which says, in effect, that altering the membership of a choice set does not affect the relative probabilities of choice of two alternatives (Luce 1959). More complex notions of independence are invoked whenever the behavior is assumed to be described by a stochastic process. Each such process states that some, but not all, of the past is relevant in understanding the future: some probabilities are independent of some earlier events. For example, in the "operator models" of learning, it is assumed that the process is "path independent" in the sense that it is sufficient to know the existing choice probability and what has happened on that trial in order to calculate the choice probability on the next trial (Bush & Mosteller 1955). 
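The independence expressed by the choice axiom, mentioned above, can be made concrete with a small sketch. It relies on the ratio-scale representation that follows from the axiom, in which each alternative carries a positive response strength and the probability of choosing it from a set is its share of the total strength in that set. The strength values and function name below are hypothetical.

```python
def choice_probabilities(strengths, choice_set):
    """Probability of choosing each alternative from choice_set.

    Each probability is the alternative's response strength divided by the
    total strength of the alternatives actually offered.
    """
    total = sum(strengths[a] for a in choice_set)
    return {a: strengths[a] / total for a in choice_set}


# Hypothetical response strengths for four alternatives.
strengths = {"a": 4.0, "b": 2.0, "c": 1.0, "d": 1.0}

full = choice_probabilities(strengths, ["a", "b", "c", "d"])
reduced = choice_probabilities(strengths, ["a", "b"])

# The individual probabilities change when the set is reduced,
# but the relative probability of a to b does not.
ratio_full = full["a"] / full["b"]
ratio_reduced = reduced["a"] / reduced["b"]
print(full["a"], reduced["a"])
print(ratio_full, ratio_reduced)
```

Removing alternatives c and d raises the probability of choosing a, yet the ratio of the probabilities of a and b is the same in both sets, which is the invariance the axiom asserts.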
In the "Markovian" learning models, the organism is always in one of a finite number of states which control the choice probabilities, and the probabilities of transition from one state to another are independent of time, i.e., trials (Atkinson & Estes 1963). Again, the major assumption of the model is a rather strong one about independence of past history. [See MARKOV CHAINS.]
Additivity and linearity. Still another form of independence is known as additivity. If $r$ is a response measure that depends upon two different variables assuming values in sets $A_1$ and $A_2$, then we say that the measure is additive (over the independent variables) if there exist numerical measures $r_1$ on $A_1$ and $r_2$ on $A_2$ such that for $x_1$ in $A_1$ and $x_2$ in $A_2$, $r(x_1, x_2) = r_1(x_1) + r_2(x_2)$. This assumption for particular experimental measures $r$ is frequently postulated in the models of analysis of variance as well as derived from certain theories of fundamental measurement. A special case of additivity known as linearity is very important. Here there is but one variable (that is, $A_1 = A_2 = A$); any two values of that variable, $x$ and $x'$ in $A$, combine through some physical operation to form a third value of that variable, denoted $x * x'$; and there is a single measure $r$ on $A$ (that is, $r_1 = r_2 = r$) such that $r(x * x') = r(x) + r(x')$. Such a requirement captures the superposition principle and leads to models of a very simple sort. These linear models have played an especially important role in the study of learning, where it is postulated that the choice probability on one trial, $p_n$, can be expressed linearly in terms of the probability, $p_{n-1}$, on the preceding trial. Other models also postulate linear transformations, but not necessarily on the response probability itself. In the "beta" model, the quantity $p_n/(1 - p_n)$ is assumed to be transformed linearly; this quantity is interpreted as a measure of response strength (Luce 1959).
Commutativity. The "beta" model exhibits another property that is of considerable importance, namely, commutativity. The essence of commutativity is that the order in which the operators are applied does not matter; that is, if A and B are operators, then the composite operator AB (apply B first and then A) is the same as the operator BA. Again, there is a notion of independence: independence of the order of application. It is an extremely powerful property that permits one to derive a considerable number of properties of the resulting process; however, it is generally viewed with suspicion, since it requires the distant past to have exactly the same effect as the recent past. A commutative model fails to forget gradually.
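The contrast between linear operators and the operators of the "beta" model can be checked numerically. The sketch below assumes a linear operator of the form p' = (1 - theta) p + theta * limit and a "beta" operator that multiplies the response strength p/(1 - p) by a constant; the parameter values and names are illustrative only.

```python
def linear_operator(theta, limit):
    """Operator that moves p a fraction theta of the way toward its limit point."""
    def apply(p):
        return (1.0 - theta) * p + theta * limit
    return apply


def beta_operator(beta):
    """Operator that multiplies the response strength p / (1 - p) by beta."""
    def apply(p):
        new_strength = beta * p / (1.0 - p)
        return new_strength / (1.0 + new_strength)  # back to a probability
    return apply


# Hypothetical parameter values.
reward = linear_operator(theta=0.2, limit=1.0)
nonreward = linear_operator(theta=0.1, limit=0.0)
strengthen = beta_operator(2.0)
weaken = beta_operator(0.8)

p = 0.5
# Linear operators with different limit points: the order matters.
print(reward(nonreward(p)), nonreward(reward(p)))
# Beta operators: the order does not matter.
print(strengthen(weaken(p)), weaken(strengthen(p)))
```

With these values the two orders of the linear operators give different probabilities, whereas the two orders of the "beta" operators agree, since multiplying the response strength by two constants is the same in either order. Linear operators that share a limit point also commute, so the failure is a general, not a universal, feature of the linear models.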
Nature of the predictions. As would be expected, models are used to make a variety of predictions. Perhaps the most general sorts of predictions involve broad classes of models. For example, probabilistic reinforcement schedules for a certain class of distance-diminishing models, i.e., ones that require the behavior of two subjects to become increasingly similar when they are identically reinforced, can be shown to be ergodic, which means that these models exhibit the asymptotic properties that are commonly taken for granted. A second example is the combining-of-classes theorem, which asserts that if the theoretical descriptions of behavior are to be independent of the grouping of
responses into classes, then only the linear learning models are appropriate. At a somewhat more detailed level, but still encompassing several different models, are predictions such as the mean learning curve, response operating characteristics, and stochastic transitivity of successive choices among pairs of alternatives. Sometimes it is not realized that conceptually quite different models, which make some radically different predictions, may nonetheless agree completely on other features of the data, often on ones that are ordinarily reported in experimental studies. Perhaps the best example of this phenomenon arises in the analysis of experiments in which subjects learn arbitrary associations between verbal stimuli and responses. A linear incremental model, of the sort described above, predicts exactly the same mean learning curve as does a model that postulates that the arbitrary association is acquired on an all-or-none basis. On the face of it, this result seems paradoxical. It is not, because in the latter model, different subjects acquire the association on different trials, and averaging over subjects thereby leads to a smooth mean curve that happens to be identical with the one predicted by the linear model. Actually, a wide variety of models predict the same mean learning curve for many probabilistic schedules of reinforcement, and so one must turn to finer-grained features of the data to distinguish among the models. Among these differential predictions are the distribution of runs of the same response, the expected number of such runs, the variance of the number of successes in a fixed block of trials, the mean number of total errors, the mean trial of last error, etc. [See STATISTICAL IDENTIFIABILITY. ]
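The agreement of the two models on the mean learning curve, and their disagreement on a finer-grained statistic, can be sketched as follows. The sketch assumes the simplest versions of each model: in the linear model the error probability is multiplied by (1 - theta) on every trial; in the all-or-none model an unconditioned subject errs with a constant guessing-error probability and becomes conditioned after each trial with probability c. All names and parameter values are hypothetical.

```python
def linear_error_curve(q1, theta, trials):
    """Error probability on trials 1..trials under the linear model."""
    curve, q = [], q1
    for _ in range(trials):
        curve.append(q)
        q = (1.0 - theta) * q  # the same operator is applied on every trial
    return curve


def all_or_none_error_curve(guess_error, c, trials):
    """Mean error probability on trials 1..trials under the all-or-none model."""
    curve, p_unconditioned = [], 1.0
    for _ in range(trials):
        curve.append(p_unconditioned * guess_error)  # errors occur only by guessing
        p_unconditioned *= 1.0 - c                   # conditioning occurs with prob. c
    return curve


def linear_repeat_error(q1, theta, n):
    """P(error on trial n+1 | error on trial n): linear-model errors are independent."""
    return q1 * (1.0 - theta) ** n


def all_or_none_repeat_error(guess_error, c):
    """P(error on trial n+1 | error on trial n): the same for every n."""
    return (1.0 - c) * guess_error


linear = linear_error_curve(0.5, 0.25, 10)
all_or_none = all_or_none_error_curve(0.5, 0.25, 10)
print(linear == all_or_none or max(abs(a - b) for a, b in zip(linear, all_or_none)))
print(linear_repeat_error(0.5, 0.25, 5), all_or_none_repeat_error(0.5, 0.25))
```

With theta equal to c and the initial error probability equal to the guessing-error probability, the two mean curves coincide trial by trial. The probability of an error following an error, however, declines over trials in the linear model and is constant in the all-or-none model, which is the kind of differential prediction that separates them.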
The classical topic of individual differences raises issues of a different sort. For the kinds of predictions discussed above it is customary to pool individual data and to analyze them as if they were entirely homogeneous. Often, in treating learning data this way, it is argued that the structural conditions of the experiment are sufficiently more important determinants of behavior than are individual differences so that the latter may be ignored without serious distortion. For many experiments to which models have been applied with considerable success, simple tests of this hypothesis of homogeneity are not easily made. For example, when a group of 30 or 40 subjects is run on 12 to 15 paired-associate items, it is not useful to analyze each subject item because of the large relative variability which accompanies a small number of observations. On the other hand, in some psychophysical experiments in which each subject is run for thousands of trials under constant conditions of presentation and reinforcement, it is possible to treat in detail the data of individuals. The final justification for using group data, on the assumption of identical subjects, is the fact that for ergodic processes, which most models are, the predictions for data averaged over subjects are the same as those for the data of an individual averaged over trials. Another issue, which relates to group versus individual data, is parameter invariance. One way of asking if a group of individuals is homogeneous is to ask whether, within sampling error, the parameters for individuals are identical. Thus far, however, more experimental attention has been devoted to the question of parameter invariance for sets of group data collected under different experimental conditions. For instance, the parameters of most learning models should be independent of the particular reinforcement schedule adopted by the experimenter. Although in many cases a reasonable degree of parameter invariance has been obtained for different schedules, it is fair to say that the results have not been wholly satisfactory. For a detailed discussion of the topics of this section, see Sternberg (1963) and Atkinson and Estes (1963).
Model testing. Most of the mathematical models used to analyze psychological data require that at least one parameter, and often more, be estimated from the data before the adequacy of the model can be evaluated. In principle, it might be desirable to use maximum-likelihood methods for estimation. Perhaps the central difficulty which prevents our using such estimators is that the observable random variables, such as the presentation, response, and outcome random variables, form chains of infinite order. This means that their probabilities on any trial depend on what actually happened in all preceding trials.
When that is so, it is almost always impractical to obtain a useful maximum-likelihood estimator of a parameter. In the face of such difficulties, less desirable methods of estimation have perforce been used. Theoretical expressions showing the dependency on the unknown parameter of, for example, the mean number of total errors, the mean trial of first success, and the mean number of runs, have been equated to data statistics to estimate the parameters. The classical methods of moments and of least squares have sometimes been applied successfully. And, in certain cases, maximum-likelihood estimators can be approximated by pseudo-maximum-likelihood
ones that use only a limited portion of the immediate past. For processes that are approximately stationary, a small part of the past sometimes provides a very good approximation to the full chain of infinite order, and then pseudo-maximum-likelihood estimates can be good approximations to the exact ones. Because of mathematical complexities in applying even these simplified techniques, Monte Carlo and other numerical methods are frequently used. [See ESTIMATION.] Once the parameters have been estimated, the number of predictions that can be derived is, in principle, enormous: the values of the parameters of the model, together with the initial conditions and the outcome schedule, uniquely determine the probability of all possible combinations of events. In a sense, the investigator is faced with a plethora of riches, and his problem is to decide what predictions are the most significant from the standpoint of providing telling tests of a model. In more classical statistical terms, what can be said about the goodness of fit of the model? Just as with estimation, it might be desirable to evaluate goodness of fit by a likelihood ratio test. But, a fortiori, this is not practical when maximum-likelihood estimators themselves are not feasible. Rather, a combination of minimum chi-square techniques for both estimation and testing goodness of fit has come to be widely used in recent years. No single statistic, however, serves as a satisfactory over-all evaluation of a model, and so the report usually summarizes its successes and failures on a rather extensive list of measures of fit. A model is never rejected outright because it does not fit a particular set of data, but it may disappear from the scene or be rejected in favor of another model that fits the data more adequately.
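The practice, described earlier in this section, of equating a theoretical expression to a data statistic can be shown in miniature. The sketch assumes a model in which the error probability on trial n is q1 (1 - theta)^(n-1), so that the expected number of total errors is q1 / theta; it also assumes that q1 is known in advance, for instance as a guessing-error rate. The error counts are invented for illustration.

```python
def expected_total_errors(q1, theta):
    """Theoretical mean number of total errors: the sum of q1 * (1 - theta)**(n - 1)."""
    return q1 / theta


def estimate_theta(total_errors_per_subject, q1):
    """Estimate theta by equating the observed mean total errors to q1 / theta."""
    observed_mean = sum(total_errors_per_subject) / len(total_errors_per_subject)
    return q1 / observed_mean


# Invented data: total errors made by each of eight subjects on one item.
observed_errors = [1, 3, 2, 2, 0, 4, 2, 2]
theta_hat = estimate_theta(observed_errors, q1=0.5)
print(theta_hat)
print(expected_total_errors(0.5, theta_hat))
```

An estimate obtained in this way reproduces the statistic used to obtain it, so the mean number of total errors cannot then serve as a test of the model; other statistics, such as run distributions or the trial of last error, must carry that burden.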
Thus, the classical statistical procedure of accepting or rejecting a hypothesis—or model—is in fact seldom directly invoked in research on mathematical models; rather, the strong and weak points of the model are brought out, and new models are sought that do not have the discovered weaknesses. [See GOODNESS OF FIT; more detail on these topics can be found in Bush 1963]. Impact on psychology. Although the study of mathematical models has come to be a subject in its own right within psychology, it is also pertinent to ask in what ways their development has had an impact on general experimental psychology. For one, it has almost certainly raised the standards of systematic experimentation: the application of a model to data prompts a number of detailed questions frequently ignored in the past. A
model permits one to squeeze more information out of the data than is done by the classical technique of comparing experimental and control groups and rejecting the null hypothesis whenever the difference between the two groups is sufficiently large. A successful test of a mathematical model often requires much larger experiments than has been customary. It is no longer unusual for a quantitative experiment to consist of 100,000 responses and an equal number of outcomes. In addition to these methodological effects on experimentation and on data analysis, there have been substantive ones. Of these we mention a few of the more salient ones. Probability matching. A well-known finding, which dates back to Humphreys (1939), is that of probability matching. If either one of two responses is rewarded on each trial, then in many situations organisms tend to respond with probabilities equal to the reward probabilities rather than to choose the more often rewarded response almost all of the time. Since Humphreys' original experiment, many similar ones have been performed on both human and animal subjects to discover the extent and nature of the phenomenon, and a great deal of effort has been expended on theoretical analyses of the results. Estes (1964) has given an extensive review of both the experimental and the theoretical literature. Perhaps the most important contribution of mathematical models to this problem was to provide sets of simple general assumptions about behavior which, coupled with the specification of the experimenter's schedule of outcomes, predict probability matching. As noted above, investigators have not been content with just predicting the mean asymptotic values but have dealt in detail with the relation between predicted and observed conditional expectations, run distributions, variances, etc. 
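How a simple model comes to predict probability matching can be traced in a short sketch. It assumes a linear model with a single learning rate theta and a schedule in which the first response is reinforced with a fixed probability on every trial regardless of what the subject does; under those assumptions the expected choice probability obeys E[p(n+1)] = (1 - theta) E[p(n)] + theta * reward_prob. The names and values are illustrative.

```python
def expected_choice_curve(p0, theta, reward_prob, trials):
    """Expected probability of the first response on trials 1..trials.

    On each trial the probability moves toward 1 when the first response is
    reinforced (probability reward_prob) and toward 0 otherwise; averaging
    over the two outcomes gives the recursion used below.
    """
    curve, p = [], p0
    for _ in range(trials):
        curve.append(p)
        p = (1.0 - theta) * p + theta * reward_prob
    return curve


curve = expected_choice_curve(p0=0.5, theta=0.1, reward_prob=0.7, trials=500)
print(curve[0], curve[-1])
```

The expected curve rises from its starting value and levels off at the reward probability itself, not at 1, which is what choosing the more often rewarded response on every trial would require.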
Although this experimental paradigm for probability learning did not originate in mathematical psychology, its thorough exploration and the resulting interpretations of the learning process have been strongly promoted by the many predictions made possible by models for this paradigm. The all-or-none model. A second substantive issue to which a number of investigators have addressed mathematical models is whether or not simple learning is of an all-or-none character. As noted earlier, the linear model assumes learning to be incremental in the sense that whenever a stimulus is presented, a response made, and an outcome given, the association reinforced by the outcome is thereby made somewhat more likely to
occur. In contrast, the simple all-or-none model postulates that the subject is either completely conditioned to make the correct response, or he is not so conditioned. No intermediate states exist, and until the correct conditioning association is established on an all-or-none basis, his responses are determined by a constant guessing probability. This means that learning curves for individual subjects are flat until conditioning occurs, at which point they exhibit a strong discontinuity. The problem of discriminating the two models must be approached with some care since, for instance, the mean learning curve obtained by averaging data over subjects, or over subjects and a list of items as well, is much the same for the two models. On the other hand, analyses of such statistics as the variance of total errors, the probability of an error before the last error, and the distribution of last errors exhibit sharp differences between the models. For paired-associates learning, the all-or-none model is definitely more adequate than the linear incremental model (Atkinson & Estes 1963). Of course, the issue of all-or-none versus incremental learning is not special to mathematical psychology; however, the application of formal models has raised detailed questions of data analysis and posed additional theoretical problems not raised, let alone answered, by previous approaches to the problem.
Reward and punishment. The classic psychological question of the relative effects of reward and punishment (or nonreward) has also arisen in work on models, and it has been partially answered. In some models, such as the linear one, there are two rate parameters, one of which represents the effect of reward on a single trial and the other of which represents the effect of nonreward. Their estimated values provide comparable measures of the effects of these two events for those data from which they are estimated.
For example, Bush and Mosteller (1955) found that a trial on which a dog avoided shock (reward) in an avoidance training experiment produced about the same change in response probabilities as three trials of nonavoidance (punishment). No general law has emerged, however. The relative effects of reward and nonreward seem to vary from one experiment to another and to depend on a number of experimental variables. When using a model to estimate the relative effects of different events, the results must be interpreted with some care. The measures are meaningful only in terms of the model in which they are defined. A different model with corresponding reward and nonreward parameters may lead to the opposite conclusion. Thus, one must decide which model best accounts for the data and use it for measuring the relative effects of the two events. Very delicate issues of parameter estimation arise, and examples exist where opposite conclusions have been drawn, depending on the estimators used. The alternative is to devise more nonparametric methods of inference which make weaker assumptions about the learning process. A detailed discussion of these problems is given by Sternberg (1963, pp. 109-116). [See LEARNING, article on REINFORCEMENT.]
Homogenizing a group. If one wishes to obtain a homogeneous group of subjects after a particular experimental treatment, should all subjects be run for a fixed number of trials, or should each subject be run until he meets a specific performance criterion? Typically it is assumed by those who use such a criterion that individual subjects differ; that, for example, some are fast learners and some are slow. It is further assumed that all subjects will achieve the same performance level if each is run to a criterion such as ten successive successes. Now it is clear that for identical subjects, it is simpler to run them all for the same number of trials and perhaps use a group performance criterion. It is, however, less obvious whether it would be better to do this than to run each to a criterion. An analysis of stochastic learning models has shown that running each of identical subjects to a criterion introduces appreciable variance in the terminal performance levels. One can study individual differences only in terms of a model and assumptions about the distributions of the model parameters. When this is done, it becomes evident that very large individual differences must exist to justify using the criterion method of homogenizing a group of subjects.
Psychophysics. The final example is selected from psychophysics. With the advent of signal detection theory it became increasingly apparent that the classical methods for measuring sensory thresholds are inherently ambiguous, that they depend not only, as they are supposed to, on sensitivity but also on response biases (Luce 1963; Swets 1964). Consider a detection experiment in which the stimulus is presented only on a proportion $\pi$ of the trials. Let $p(Y \mid s)$ and $p(Y \mid n)$ be the probabilities of a "Yes" response to the stimulus and to no stimulus respectively.
If the experiment is run several times with different values of $\pi$ between 0 and 1, then $p(Y \mid n)$, as well as $p(Y \mid s)$, which is a classical threshold measure, varies systematically
from 0 to 1. The data points appear to fall on a smooth, convex curve, which shows the relation, for the subject, between correct responses to stimuli and incorrect responses to no-stimulus trials (false alarms). Its curvature, in effect, characterizes the subject's sensitivity, and the location of the data point along the curve represents the amount of bias, i.e., his over-all tendency to say "Yes," which varies with $\pi$, with the payoffs used, and with instructions. Several conceptually different theories, which are currently being tested, account for such curves; it is clear that any new theory will be seriously entertained only if it admits to some such partition of the response behavior into sensory and bias components. This point of view is, of course, applicable to any two-stimulus, two-response experiment, and often it alters significantly the qualitative interpretation of data. [See ATTENTION; PSYCHOPHYSICS.]
Although one cannot be certain about what will happen next in the application of mathematical models to problems of individual behavior, certain trends seem clear. (1) The ties that have been established between mathematical theorists and experimentalists appear firm and productive; they probably will be strengthened. (2) The general level of mathematical sophistication in psychology can be expected to increase in response to the increasing numbers of experimental studies that stem from mathematical theories. (3) The major applications will continue to center around well-defined psychological issues for which there are accepted experimental paradigms and a considerable body of data. One relatively untapped area is operant (instrumental) conditioning. (4) Along with models for explicit paradigms, abstract principles (axioms) of behavior that have wide potential applicability are being isolated and refined, and attempts are being made to explore general qualitative properties of whole classes of models.
(5) Even though the most successful models to date are probabilistic, the analysis of symbolic and conceptual processes seems better handled by other mathematical techniques, and so more nonprobabilistic models can be anticipated.
ROBERT R. BUSH, R. DUNCAN LUCE, AND PATRICK SUPPES
[See also DECISION MAKING, article on PSYCHOLOGICAL ASPECTS; SIMULATION, article on INDIVIDUAL BEHAVIOR. Other relevant material may be found in ATTENTION; LEARNING; MATHEMATICS; PROBABILITY; PSYCHOMETRICS; PSYCHOPHYSICS; SCALING.]
BIBLIOGRAPHY
ATKINSON, RICHARD C.; and ESTES, WILLIAM K. 1963 Stimulus Sampling Theory. Volume 2, pages 121-268 in R. Duncan Luce, Robert R. Bush, and Eugene Galanter (editors), Handbook of Mathematical Psychology. New York: Wiley.
BUSH, ROBERT R. 1963 Estimation and Evaluation. Volume 1, pages 429-469 in R. Duncan Luce, Robert R. Bush, and Eugene Galanter (editors), Handbook of Mathematical Psychology. New York: Wiley.
BUSH, ROBERT R.; GALANTER, EUGENE; and LUCE, R. DUNCAN 1963 Characterization and Classification of Choice Experiments. Volume 1, pages 77-102 in R. Duncan Luce, Robert R. Bush, and Eugene Galanter (editors), Handbook of Mathematical Psychology. New York: Wiley.
BUSH, ROBERT R.; and MOSTELLER, FREDERICK 1955 Stochastic Models for Learning. New York: Wiley.
ESTES, WILLIAM K. 1964 Probability Learning. Pages 89-128 in Symposium on the Psychology of Human Learning, University of Michigan, 1962, Categories of Human Learning. Edited by Arthur W. Melton. New York: Academic Press.
HUMPHREYS, LLOYD G. 1939 Acquisition and Extinction of Verbal Expectations in a Situation Analogous to Conditioning. Journal of Experimental Psychology 25: 294-301.
HURVICH, LEO M.; JAMESON, DOROTHEA; and KRANTZ, DAVID H. 1965 Theoretical Treatments of Selected Visual Problems. Volume 3, pages 99-160 in R. Duncan Luce, Robert R. Bush, and Eugene Galanter (editors), Handbook of Mathematical Psychology. New York: Wiley.
LUCE, R. DUNCAN 1959 Individual Choice Behavior. New York: Wiley.
LUCE, R. DUNCAN 1963 Detection and Recognition. Volume 1, pages 103-190 in R. Duncan Luce, Robert R. Bush, and Eugene Galanter (editors), Handbook of Mathematical Psychology. New York: Wiley.
LUCE, R. DUNCAN; and GALANTER, EUGENE 1963 Psychophysical Scaling. Volume 1, pages 245-308 in R. Duncan Luce, Robert R. Bush, and Eugene Galanter (editors), Handbook of Mathematical Psychology. New York: Wiley.
LUCE, R. DUNCAN; and SUPPES, PATRICK 1965 Preference, Utility, and Subjective Probability. Volume 3, pages 249-410 in R. Duncan Luce, Robert R. Bush, and Eugene Galanter (editors), Handbook of Mathematical Psychology. New York: Wiley.
McGILL, WILLIAM J. 1963 Stochastic Latency Mechanisms. Volume 1, pages 309-360 in R. Duncan Luce, Robert R. Bush, and Eugene Galanter (editors), Handbook of Mathematical Psychology. New York: Wiley.
SNELL, J. LAURIE 1965 Stochastic Processes. Volume 3, pages 411-486 in R. Duncan Luce, Robert R. Bush, and Eugene Galanter (editors), Handbook of Mathematical Psychology. New York: Wiley.
STERNBERG, SAUL 1963 Stochastic Learning Theory. Volume 2, pages 1-120 in R. Duncan Luce, Robert R. Bush, and Eugene Galanter (editors), Handbook of Mathematical Psychology. New York: Wiley.
SWETS, JOHN A. (editor) 1964 Signal Detection and Recognition by Human Observers: Contemporary Readings. New York: Wiley.
ZWISLOCKI, JOZEF 1965 Analysis of Some Auditory Characteristics. Volume 3, pages 1-98 in R. Duncan Luce, Robert R. Bush, and Eugene Galanter (editors), Handbook of Mathematical Psychology. New York: Wiley.
MODERNIZATION The articles under this heading deal with general social and political problems of modernizing societies. More specialized aspects are treated in AGRICULTURE, article on SOCIAL ORGANIZATION; COMMUNITY-SOCIETY CONTINUA; RURAL SOCIETY. For other relevant material see INDUSTRIALIZATION; POLITICS, COMPARATIVE; SOCIAL CHANGE; and the detailed guide under ECONOMIC GROWTH. i. SOCIAL ASPECTS ii. POLITICAL ASPECTS iii. THE BOURGEOISIE IN MODERNIZING SOCIETIES
Daniel Lerner James S. Coleman Ronald P. Dore
SOCIAL ASPECTS
Modernization is the current term for an old process—the process of social change whereby less developed societies acquire characteristics common to more developed societies. The process is activated by international, or intersocietal, communication. As Karl Marx noted over a century ago in the preface to Das Kapital: "The country that is more developed industrially only shows, to the less developed, the image of its own future." We need a new name for the old process because the characteristics associated with more developed and less developed societies and the modes of communication between them have become in our day very different from what they used to be. During the era of imperialism, "images," or pictures, of the future were transmitted mainly to colonial peoples by their colonizers. Accordingly, one spoke of India as Anglicized and of Indochina as Gallicized. As the long generations of colonization made evident certain important similarities among imperialist regimes, regardless of national origins, these parochial terms were abandoned, and one spoke of Europeanization. World War II, which witnessed the constriction of European empires and the diffusion of American presence, again enlarged the vocabulary, and one spoke, often resentfully, of the Americanization of Europe. But when one spoke of the rest of the world, the term was "Westernization." The postwar years soon made plain, however, that even this larger term was too parochial to comprehend the communication mode that had spread regularly patterned social change so swiftly and so widely as to require a global referent. In response to this need, the new term "modernization" evolved. It enabled one to speak concisely of those similarities of achievement observed in all modernized societies—whether Western, as in Europe and North America, or non-Western, as in the Soviet Union and Japan—as well as of those similarities of aspiration observed in all modernizing societies regardless of their location and traditions.

The hard core of observed similarities was economic. It was along the continuum of economic performance that societies could most readily and unambiguously be aligned, compared, and rated. An important step was taken when development economists reached the consensus that their subject matter was, in the words of W. Arthur Lewis, "the growth of output per head of population" (1955, p. 9). This simple operational definition specified simultaneously the aspirational continuum of economic development and the comparative measure of achievement levels along this continuum. In so doing, it focused the analysis of economic development and anchored the more comprehensive analysis of modernization as a societal process.

Modernization, therefore, is the process of social change in which development is the economic component. Modernization produces the societal environment in which rising output per head is effectively incorporated. For effective incorporation, the heads that produce (and consume) rising output must understand and accept the new rules of the game deeply enough to improve their own productive behavior and to diffuse it throughout their society. As Harold D. Lasswell (1965) has forcefully reminded us, this transformation in perceiving and achieving wealth-oriented behavior entails nothing less than the ultimate reshaping and resharing of all social values, such as power, respect, rectitude, affection, well-being, skill, and enlightenment.
This view of continuous and increasing interaction between economic and noneconomic factors in development produced a second step forward, namely, systematic efforts to conceptualize modernization as the contemporary mode of social change that is both general in validity and global in scope.

Criteria of modernity

Although no single theoretical formulation as yet commands consensus among social scientists, there has been steady convergence among scholars on certain key points concerning modernization. There appears to be general agreement, for example, that economic decisions on investment criteria and resource allocation must take close account of
such noneconomic factors as population growth, urbanization rates, family structure, the socialization of youth, education, and the mass media. Indeed, the contemporary association of modernization with comprehensive social planning has obliged scholars to seek some consensus on the common characteristics of modern societies. There appears to be a large area of agreement, despite conceptual and terminological differences of more or less importance, that among the salient characteristics (operational values) of modernity are (1) a degree of self-sustaining growth in the economy—or at least growth sufficient to increase both production and consumption regularly; (2) a measure of public participation in the polity—or at least democratic representation in defining and choosing policy alternatives; (3) a diffusion of secular-rational norms in the culture—understood approximately in Weberian-Parsonian terms; (4) an increment of mobility in the society—understood as personal freedom of physical, social, and psychic movement; and (5) a corresponding transformation in the modal personality that equips individuals to function effectively in a social order that operates according to the foregoing characteristics—the personality transformation involving as a minimum an increment of self-things seeking, termed "striving" by Cantril (1966) and "need-achievement" by McClelland (1961), and an increment of self-others seeking, termed "other-direction" by Riesman (1950) and "empathy" by Lerner (1958a).

Pictures of the future

Every nation that regards itself as more developed now transmits pictures of itself to those less developed societies that figure in its own policy planning. All the once-imperial nations of western Europe are involved—Britain, France, Belgium, the Netherlands, and even Portugal. Modernization has spread beyond the obsolete confines of Europe's once-imperial nations to the Soviet Union and Communist China, to Japan, and even to Israel.
The United States, which Andre Siegfried (1927) judged to be presiding at a general reorganization of ways of living throughout the world, has for many years been spending between three and four billion dollars of its national income on modernization abroad. Every nation that is less developed, but regards itself as developing, receives the pictures transmitted by these more developed societies and decides, as a matter of high priority for its own policy planning, which of them constitutes the preferred picture of its own future. This decision is the crucial turn in the direction of modernization; whatever its particular configuration, it spells the passing of traditional society and defines the policy planning of social change. The decision is rarely clear-cut. Hence, the ensuing policy often is ambivalent, and the planning often works at cross purposes. Nevertheless, much of the world is now engaged in an unprecedented process of social change that seeks to govern itself by rational policy planning. The less developed societies want to achieve in years the modernization that more developed societies attained over centuries of haphazard, or at least unplanned, development. But we do not have available the evaluated experience needed to provide rational guidance for such unprecedented efforts to induce comprehensive social change. This is why modernization—the twentieth century's distinctive mode of accelerating social change by rational planning—presents to social scientists so great a challenge and so important an opportunity. For modernization, as we have seen, presents a very complex matrix of experience to be evaluated. It is one thing to summarize the common characteristics of modern societies. It is quite another thing to plan the rational transfer of these "items" from more developed to less developed societies—for each such transfer from the sender involves a deep transformation in the receiver. There exists no rational formula for the transfer of institutions. Modernization operates rather through a transformation of institutions (Lerner 1964) that can only be accomplished by the transformation of individuals—the painfully complex process which W. H. Auden epitomized as "a change of heart."
Complexities of modernization

The complexities of modernization puzzle social scientists, who are indispensable for rational planning, because such complexities bring together varieties of institutional and individual behavior that have in the past been studied in very different ways under the specialized division of labor in the social sciences. The variation in the level of knowledge and the "state of the art" in the different social sciences has been so large that a major effort of reintegration is required to deal with the model of social change presented by the matrix of modernization. This "boomerang effect" upon the social sciences produced by their efforts to deal with modernization is relevant in two ways. First, in seeking to account for variations in the responses of less developed societies to the picture of their own future presented by more developed societies, scholars have felt obliged to restudy the
modernization paths of the more developed societies. Thus, W. Arthur Lewis (1955), building upon prior work on the conditions of economic progress by Colin Clark (1940) and others, has produced a theory of economic growth that measures less developed as well as more developed societies on the same continuum of aspiration and metric of achievement. David C. McClelland (1961), building upon prior work in the psychology of "achievement—aspiration ratios" since William James, has produced a synthetic construct of the achievement motive applicable to all recorded history. Seymour M. Lipset (1963), building upon prior work in sociology on the processes of social change since Karl Marx and Max Weber, has rewritten the history of the United States as "the first new nation." Walt W. Rostow (1960), reviving the latterly quiescent but newly relevant disciplines of economic history and political economy, has formulated a general theory of modernization that ranges all societies of the world along the stages of a single continuum of "self-sustaining growth." These important efforts to conceptualize modernization have become, inevitably, objects of controversy in the modernized world of specialized scholars. However, the critique and correction of detailed relationships in these synthetic models, which is the proper business of scholarship, does not seem to have impaired either their conceptual validity or their policy utility. They have already enabled contemporary thinkers to recognize that economic development is a high priority objective of every modernizing society—the prime mover, when indeed it is not the only motivation, for modernization. Moreover, and this is the crux of the matter, the attainment of "self-sustaining growth" involves far more than purely economic processes of production and consumption. It involves the institutional disposition of the full resources of a society; in particular, its human resources. 
For an economy to sustain growth by its own autonomous operation, it must be effectively geared to the skills and values of the people who make it work. On this view, a society capable of operating an economy of "self-sustaining growth" is ipso facto a modernized society (Hagen 1962). The apparent circularity of this statement is eliminated when one specifies the minimum conditions required to make a society capable of operating an economy of self-sustaining growth. Although no consensus has yet been reached on the full matrix of modernization, which requires explicit specification of interrelations and sequences among the components, a fair measure of agreement has been achieved on the identification and
conceptualization of the components themselves. This has been the second large gain to accrue from recent attempts by social scientists to reintegrate their specialized ideas and tools in order to deal effectively with a general model of modernization (Millikan & Blackmer 1961). All models of modernization that aim at generality have dealt in some way with the economic-development variables that affect rising output per head directly and visibly, such as industrialization, urbanization, national income, and per capita income. In their quest for a model sufficiently general to subsume the move from "rising output per head" to "self-sustaining growth," sociologists have added to these variables an enlightenment variable measured in terms of schooling, literacy, and media exposure; political scientists have added a power variable measured in terms of participation, party membership, and voting; psychologists have added a cross-cutting variable of personality (usually postulated as an explanatory variable for which other variables serve as behavioral indices) measured in terms of authoritarianism, empathy, and need achievement. Anthropologists have enriched the general model by obliging it to account for local-temporal variants—those "diverse cultures" which, in Kluckhohn's words (1959), shape the behavioral variations underlying our "common humanity."

The "Western model" reconsidered

The convergence of disciplined perspectives upon a general model of modernization has diffused among scholars the recognition that, in our time, social change has become the distinctive component of virtually every social system. There remain in the world today few "traditional" social systems that operate with low rates of change over long time periods. Most societies are in some phase of transition. These are social systems operating with high (and usually accelerating) rates of change over short (and usually decreasing) periods of time.
It is this phenomenon which in our time documents the "acceleration of history" that for previous generations was merely an interesting speculation by philosophers of history. Acceleration, now an essential component that must be incorporated in the research designs of all empirical students of social change, has obliged us to reconsider as well the operational mode of social systems that are already "modern" on current indices of modernization. This reconsideration of modern Western societies has occasioned considerable reorganization of their societal theories and policies. Such reconsideration, having modified the evaluation of historical paths from the past to the present, now shapes new ways of estimating policy paths from the present to the future. There exist few theoretical constructions of future states of the world that are based on present changes in social systems. Lasswell (1965) has outlined the dangers of a "garrison-prison state" that attend policies designed to make any nation more powerful than all others; Rostow (1960) has sketched the attractions of a "mass-consumption society" for peoples who now demand more comfort and fun than peoples dared to dream of in all previous history. These theoretical constructions are strong because they account for the ambivalent behavior of all "transitional" societies and the vigorous behavior of most "modern" societies. These theoretical constructions are strong as well because they show that modern societies are better able to cope with perceived needs for change than less developed, transitional societies. The obvious examples are their concern with the population explosion and the expanding metropolis. These are the demographic and ecological variables that index fundamental mechanisms for the Want-Get ratios which govern "dynamic equilibrium" in any society. Modern Western societies have brought these two variables under policy control more rapidly and efficiently than any transitional society has been able to do. The reason is that modern societies restudy and reappraise themselves continuously with an eye to their future. Hence, it is no accident that contraceptives came into widespread use in modern Western societies a full century ago to prevent an unmanageable population explosion. Nor is it accidental that "the pill," invented in the Western societies, is still more widely used by Westerners than by the transitional peoples to whom it has been offered, virtually free of charge, since the 1950s.
Modern societies, founding their societal policies on data-based estimations of the future, are readier to perceive the dangers of overpopulation and to take steps to prevent them (Spengler & Duncan 1956). So it has been also with the dangers of overurbanization. The acceleration of history has produced everywhere, as a major manifestation, an accelerated movement of people from the village to the city (California, University of ... 1959). The outcome has been the spread of slums in every modernizing society. But almost from the moment these slums appeared, social scientists in the Western world began to study them in empirical and policy terms. Over a century ago, Frederic Le Play (1855) described the situation of the urban poor
in France and elsewhere in Europe; Charles Booth (Booth et al. 1889-1891) and Jane Addams (Hull-House . . . 1895) did the same for England and the United States, respectively. Their studies led to social diagnosis, social legislation, and, finally, social programs aimed at improvement. The institutionalization of urban policy in modern society is now visible in American "urban renewal," British "new towns," and French "aménagement du territoire." Few such programs have been made effectively operative in the modernizing societies of the transitional world today (Hoselitz 1960).

Transformation, not transfer

The widespread failure of transitional societies to incorporate modernizing institutions of sufficient amplitude and durability has occasioned reconsideration of the theory and practice of social change under conditions of extreme acceleration. Among the conclusions that have emerged (many of them reminders of lessons brought by anthropologists from their early encounters with traditional societies a century and more ago) is this reciprocal proposition: Traditional societies can respond effectively to internally generated demands for institutional change articulated over a relatively long period, but they are typically incapable of rapid institutional changes to meet externally induced demands. Such externally induced demands occur whenever a less developed society receives a picture of its own future from a more developed society. Since the start of international development programs in 1949 (with the Point IV program of the United States instituted by President Truman), we have understood that the transmission of such pictures is likely to constitute an intrusion into the less developed, traditional society.
Only more recently, by way of hard and often unrewarding experience, have we concluded that such intrusions regularly are, and usually must be, disruptive in transitional societies—these being traditional societies that manifest an urgent will-to-change but are unable to incorporate rapidly an efficacious way-to-change. The disruptive effect, which is produced by the imbalance between the will and the way to modernize, emerges as a key problem of induced and accelerated social change. Consider again the problem of overurbanization. The newly reviving civilizations of the East have always had more people living in their capital cities than could be productively employed. Hence, over many centuries there developed the institution (or at least the vocational jurisdiction) of begging. So ancient and venerable is this institution that its routinized practice is sanctified in the
holy books of most Eastern, and particularly Middle Eastern, religions. The practice of begging and the duty of charity are sanctified alike in the Mosaic code of the Jews, the Koranic verses of the Muslims, and the New Testament of the Christians. Yet, under intrusion from the antislum and antipoverty ideology of the modern West, modernizers in the Eastern world have grown ashamed of this venerable institution and have sought to transform it. Many Western travelers have witnessed, at the doors of the Nile hotels in Cairo and at the gates of the Taj Mahal in Agra, the often brutal consequences of the modernizing proscription of begging inflicted upon people who know no other trade. But the modernizing Eastern leaders, while speeding the obsolescence of begging, have not yet incorporated an efficient institutional replacement to relieve the urban poor, whose members swell at accelerating rates from year to year (Lerner 1962). The great cities of the transitional world often have become massive impediments to orderly social change rather than productive centers of modernization. In much of Latin America, vast lands are deserted while the people are crushed into the megalopolis—for example, half of all Cubans live in and around Havana, half of all Uruguayans live around Montevideo, and about 80 per cent of the Venezuelan population lives on the 10 per cent of land located between Caracas and Maracaibo. In the transitional societies of Asia, which produce far less wealth than those of Latin America, the consequences of overurbanization are even more disruptive. No traveler in Cairo or Calcutta will forget the sights, sounds, and smells of debilitated peoples who perform no productive functions for themselves or their environment. These millions of hapless people who consume (however little) without producing are the psychic displaced persons of modernization—they have come to consider themselves useless for anything beyond survival and reproduction. 
Their futility is an expression of the disruptive imbalance, for their minuscule benefits are gained only at the disproportionately great costs to their society which overurbanization imposes upon all development efforts. That the problem of overurbanization remains unresolved is the measure of our failure to develop a comprehensive theory and practice of modernization. This proposition is circular in one sense: since the urban explosion is systemic with the population explosion and the literacy explosion, true resolution of any one explosion will help resolve the others. These explosions are systemic in the sense that they derive from a common source, converge on a common demand, and produce a common failure to satisfy the demand. The common source
is empathy; the common demand is well-being; the common failure is poverty. These terms denote the failures that explain why we are passing from a putative "revolution of rising expectations" (Staley 1954), which shaped the theory and practice of planned social change after World War II, to an incipient "revolution of rising frustration" (Conference . . . 1963, pp. 330-333) that may reshape our thinking in the future.

Empathy—mechanism of transformation

Empathy is the psychic mechanism that enables a person to put himself in another person's situation—to identify himself with a role, time, or place different from his own. Among the range of psychic mechanisms that supply imagination, empathy is distinctively the one that nourishes "upward mobility." For what greater stimulus is there to imagine oneself in another person's situation if not that his situation is "better" (in some sense) than one's own? The power to imagine oneself in a better situation rests upon the psychic mechanism of empathy. The mechanism may or may not be innate, but it can certainly be trained to operate more efficiently in people with a desire to better themselves. Since World War II such training has been supplied by the mass media of print, film, and radio. The mass media, which we call the "mobility multiplier" for this reason, accelerate the training in psychic mobility that enables people to imagine themselves in situations other than their own—and hence, since the alternatives invariably represent better situations, accelerate training in upward mobility. The global spread of empathy has thus diffused a new demand for well-being among peoples who, over all previous centuries, had never even been exposed to the idea that well-being was theirs to demand. Wants have always been with the poor, and expectations have risen or fallen with the richness of the harvest or the goodness of the king, but demand is something new in the lives of poor peoples.
It involves nothing less than a new sense of oneself, that is, the transformation of one's identifications that is accomplished by empathy and accelerated by the multiplier effect of the mass media. But the newly diffused sense of demand, which articulates and aggregates the age-old wants and needs of the poor, imposes a new condition upon the management of societies: that ways must be found to satisfy demand if a society is to maintain itself in a relatively durable state of equilibrium—or, more precisely, in a tolerable state of disequilibrium. The new condition is imposed by the systemic quality of the new demand: its widespread distribution throughout the social system entails a comprehensive institutional response. Economic theory has taught generations of analysts in modernized societies that equilibrium can be maintained only in the measure that widespread and persistent demand is balanced by adequate supply. It is the failure of transitional societies to increase supply at a sufficient rate to balance accelerating demand that accentuates the new meaning of poverty as a key to the unsolved problems of modernization. Poverty, which was once accepted as an honorable estate (as in the Biblical theme of the "eye of the needle"), is now rejected as an abject condition unworthy of human acceptance. Poverty is now seen as the self-sealing mechanism of a vicious circle that deprives people of the means to obtain enough of the good things of life. As Hans W. Singer has succinctly summarized the situation, its core is "the dominant vicious circle of low production—no surpluses for economic investment—no tools and equipment—low standards of production. An underdeveloped country is poor because it has no industry; and it has no industry because it is poor" (1949, p. 5). Economists agree that the root problem is that poor people in poor countries do not earn enough to raise their essential consumption (wants) and still have something left over to save (that is, invest). This is attributed to the series of "explosions"—population, urbanization, literacy—that consume all gains in production as soon as they are made, and often more rapidly than they are made. It is the worsening situation of the poor as compared to the rich countries that, as Gunnar Myrdal (1956) has shown, defeats the planning of modernization in our time. Despite large outlays of funds and skills for international development, the poor lands and peoples are continuously getting poorer relative to the rich lands and peoples.
The latter have incorporated the individual and institutional mechanisms that make growth self-sustaining and thereby underwrite the stability of modern societies at high and rising levels of output and income. By the same token, transitional societies, which have not been able to incorporate the mechanisms needed for self-sustaining growth, tend to grow relatively poorer and less stable. Recognition that the relative situation of transitional societies is worsening, despite their high expectations and despite substantial contributions of international aid, has stimulated new research and reflection on the mechanisms of self-sustaining growth. It has long been clear that surplus product for economic investment is necessary. What was not clear, until very recently, is that an external input of investment does not necessarily ignite the
motor of modernization and almost never suffices to keep it running. It appears to be essential that the modernizing society, if its growth is to be self-sustaining, should incorporate internal means of generating the surpluses needed for investment. This apparently simple extension of thinking about economic development entails wide and deep consequences for the social theory and practice of modernization. For a transitional society to generate surpluses internally, it must work a profound transformation into its individual and institutional patterns of traditional behavior (Shannon 1957). This is why we no longer speak of a "transfer of institutions" from more developed to less developed societies. Such transfer rarely occurs in fact. When it does, as in those transitional societies that have transferred electoral institutions based on universal suffrage from more developed societies, the effects have been not only intrusive and disruptive but often positively dysfunctional for societal modernization. The indispensable lesson taught by failures to transfer institutions is that modernization must be systemic if it is to be durable. It must involve indigenous people in behavioral transformations so manifold and profound that a new and coherent way of life comes into operation. Institutions cannot be transferred; they must be transformed. Lifeways cannot be adopted; they must be adapted. Adaptive capacity, the most distinctive feature of societies that are genuinely modernized, is what enables them to develop more rapidly than the transitional societies they are aiding out of their own large surplus product. While such handouts alleviate hardship and encourage hope in some transitional societies, these societies do not modernize effectively until they develop an indigenous capacity for accelerating and sustaining growth. 
Among the requirements are indigenous surpluses that enable people to break out of the vicious circle of poverty and into the self-sustaining cycle of growth. The incorporation of an adaptive capacity of this magnitude can occur only in societies that diffuse widely among their peoples the lifeways and institutions of mobility, empathy, and participation. Mobility is the initial mechanism: people must be ready, willing, and able to move from where they are and what they are. Physical and social mobility have always interacted closely in the societies now regarded as modernized. Horace Greeley's maxim "Go West, young man, go West" told aspiring Americans a century ago: If you want to move up, young man, move out! Among the millions of aspiring young men in the transitional world today, many are heeding
some local variant of this advice—usually delivered by "pictures" from the mass media (Schramm 1964). Those who want to move up are, in rapidly swelling numbers, moving out. Physical mobility has become a characteristic of world society in our time. But social mobility has not kept pace. This is so in part because, as we have seen, poor lands have a built-in tendency to stay poor. This tendency is built in by the persistence of traditional lifeways among the peoples of the poor lands. When they move out, they are not adequately prepared to meet the other requirements for moving up. They are, in particular, unprepared to make efficient use of the empathic mechanism that shapes psychic mobility—the personality reagent that catalyzes the interaction between physical and social mobility. In a word, they lack a sufficient dose of empathy (Lerner 1958b). The inadequate diffusion of empathy—-psychic mobility—is a major source of failure for development programs. People everywhere have been moving out with the expectation of moving up, and everywhere they are being disappointed. Social mobility simply does not coincide with physical mobility often enough to produce widespread satisfaction. On the contrary, if we are facing an incipient "revolution of rising frustration," it is because the newly mobile peoples of the transitional world have not found—and, more critically, have not learned to produce from their own resources— adequate satisfactions for their accelerated expectations. The disruptive imbalance that weighs most heavily on traditional societies in our time is the imbalance between what people have been taught to want and what they have learned to get. The Want-Get ratio We refer to the disruptive imbalance, which is the global source of rising frustrations, as the Want-Get ratio. Adapting an ingenious formula of William James, this can be expressed as follows: Want Get Frustration rises in the measure that the numerator of Want exceeds the denominator of Get. 
In traditional societies, frustration remained fairly constant and at a relatively low level because wants (at least in the form of articulated demands) were relatively few and unchanging. Frustration is accelerating in transitional societies because articulated wants are increasing, diversifying, and spreading at very rapid and erratic rates. This way of putting the matter links the process of modernization directly to the problem of economic development. Economic development is critical because continuing and deepening poverty signals the failure to meet the accelerating demand for well-being (articulated Want) generated by the erratic diffusion of empathy that has accompanied increasing mobility. The economic model is also useful because it teaches us to analyze psychosocial states more exactly by submitting them to the metrics of supply and demand. For what has disrupted transitional societies so deeply that stable governance—and rational planning for growth—cannot become self-sustaining is precisely the worsening ratio between supply and demand. Transitional peoples are accelerating their manifold demands beyond the supply capacity of their institutions and resources, including the capacity of individuals with increasing demands to adapt their personal behavior in such ways as to increase supplies. Public opinion—empathy to participation A modern society depends so crucially upon its human resources because it must be, first and foremost, a participant society. This does not mean that all people must participate continuously in all societal activities, since it is unlikely that any society could survive this degree of interaction. It does mean that enough people must participate continuously in each major institution to make these institutions viable, adaptable, and durable. This optimum level of interaction between individuals and institutions can be sustained only when it produces outcomes that are reciprocally rewarding. Institutions cannot endure persistently excessive demands upon their capacity by their individual participants; nor will individuals continue to participate in institutions that consistently frustrate their wants (Shannon 1958).
Perhaps the most significant and subtle instance of self-sustaining interaction between individuals and institutions in modern society is public opinion—a distinctive interaction that is not found in traditional societies (Speier 1950). The evolution of public opinion in the modern West can be traced from the eighteenth century, when the institutions of free public education and inexpensive mass media (the so-called penny press, for example) began to expand in response to the direct demands of peoples who had gained a fair measure of empathy and literacy over preceding generations. This initiated a growth cycle of public enlightenment throughout the modern West, from which evolved the distinctive societal process known as public opinion—a process which, in turn, lubricates the self-sustaining mechanisms of a participant society.
For the adaptive capacity of a participant society resides in its institutionalized modes for registering, regulating, and responding to individual demands as well as their articulated and aggregated expression through manifold channels of collective demand. [See PUBLIC OPINION; see also Almond & Coleman 1960, chapter 1.] Public demand is something new. Earlier societies were able to survive the sporadic expression of individual and collective demands, particularly during periods of affluence, when their institutions were able to make relatively satisfactory responses to such demands. But no society preceding those of the nineteenth-century West was able to incorporate public demand as a mechanism that continuously interacts with public policy on the shaping and sharing of all societal values—power as well as wealth, enlightenment as well as deference. Indeed, only the twentieth-century West has begun to develop the crude model of a polity in which public demand—in the institutionalized form of public opinion—participates as a matter of course in the making of public policy. Public opinion has become the institutionalized expression of individual and collective demands because it has incorporated a significant measure of self-regulation. It avoids the persistent expression of demands that cannot be satisfied by existing institutions operating upon available resources— the outbursts of excessive demand that recurrently eventuate in the riots, rebellions, and revolutions of nonparticipant societies (Johnson 1962). It is responsible and self-regulating because it is based upon public enlightenment, which informs people about the current condition of public institutions and resources and thereby acts as a constraint upon their individual and collective demands. 
The effect of enlightenment, in this sense, is that people reconsider their felt private demands in the light of known public constraints and emerge with equilibrated (or otherwise balanced) opinions on the issues before them. It is this internal balancing of the Want-Get ratio by individuals that, ideally, corrects disruptive imbalances in their institutions and makes their society self-sustaining. Next steps The ideal type of a participant society does not yet exist in the modern West and may never be realized perfectly anywhere. As public-opinion polls have shown time and again, citizens are often ignorant of their past, voters are often ambivalent about their future, and consumers are often confused in making their present choices. The reliance of public opinion upon such routinized institutions
of enlightenment as public schools and mass media tends to routinize its creative articulation and aggregation. Those alienated intellectuals from retrograde societies of the West who point to marginal black-marketing in austerity Britain and peripheral cheating on income taxes in the United States ignore the principal fact, which is that these skin rashes upon participant societies have not been permitted to become cancerous (Shils 1958). Public opinion in these countries, despite its putative ignorance and ambivalence, has judged that black-marketing and tax-cheating are "immoral"—that their cost to public welfare exceeds their benefit to private interests and, therefore, that adaptive institutions are needed to correct their potentially disruptive imbalance. Such institutions include public sanctions against private malfeasance. This process is the key to the "self-sustaining" capacity of a participant society. Public opinion, despite its flaws, can be counted on to perform its system-sustaining functions in an environment that supplies satisfactions and explains frustrations of individual and collective demands. It is this subtle and continuous reciprocity between popular demand and public supply (or nonsupply)—between what people want and what they get (or fail to get)—that animates and sustains the participant society. Since this model represents the greatest advance in recorded history toward the ages-old ideal of social democracy, we must learn to assay its gains and measure its costs (Lerner & Schramm 1966). These are tasks that lie ahead for social scientists concerned with modernization. Procedures for assaying gains and measuring costs must be perfected. As a precondition, our concepts of what constitutes a cost or a gain must be articulated in sufficiently explicit fashion to guide our measures.
While our tasks as social scientists are important, they cannot count for much without the efforts of those social planners and decision makers who change the lifeways of transitional societies. For the modernized lands have learned to develop, however crudely, a participant society with self-sustaining growth capability. This is not yet the case in the lands that are seeking to modernize. The modernized societies must now perfect the model for their own purposes—which include the "transfer" of the model in such fashion that it can be "transformed" by the modernizing societies. The modernizing societies must learn how transferred institutions may be transformed, how adopted lifeways may be adapted. As the modernized succeed in learning their own lesson, they will be better
equipped to teach the lesson to the modernizing. For the contemporary world has become interactive in the sense that all nations and peoples now are continuously exposed to each other. Modernization, now occurring on an interactive global scale, will point the way to a future modernity in the measure that advanced and backward, developed and underdeveloped, societies arrive at an understanding of what they have in common. This achievement of consensus on the values of a commonwealth of human dignity will provide the ultimate motor of modernization—for those who think they are, as for those who wish to be, modern.

DANIEL LERNER

[Directly related are the entries ACHIEVEMENT MOTIVATION; ECONOMIC GROWTH; POVERTY.]

BIBLIOGRAPHY

ALMOND, GABRIEL A.; and COLEMAN, JAMES S. (editors) 1960 The Politics of the Developing Areas. Princeton Univ. Press.
BOOTH, CHARLES et al. (1889-1891) 1902-1903 Life and Labour of the People in London. 17 vols. London: Macmillan.
CALIFORNIA, UNIVERSITY OF, INSTITUTE OF INTERNATIONAL STUDIES, INTERNATIONAL URBAN RESEARCH 1959 The World's Metropolitan Areas, by Suzanne R. Angelucci et al. Berkeley: Univ. of California Press.
CANTRIL, HADLEY 1966 The Pattern of Human Concerns. New Brunswick, N.J.: Rutgers Univ. Press.
CLARK, COLIN (1940) 1957 The Conditions of Economic Progress. 3d ed., rev. London: Macmillan.
CONFERENCE ON COMMUNICATION AND POLITICAL DEVELOPMENT, DOBBS FERRY, N.Y., 1961 1963 Communications and Political Development. Edited by Lucian W. Pye. Princeton Univ. Press.
HAGEN, EVERETT E. 1962 On the Theory of Social Change. Homewood, Ill.: Dorsey.
HOSELITZ, BERT F. 1960 Sociological Aspects of Economic Growth. Glencoe, Ill.: Free Press.
Hull-House Maps and Papers: A Presentation of Nationalities and Wages in a Congested District of Chicago, Together With Comments and Essays on Problems Growing Out of the Social Conditions. 1895 New York: Crowell.
JOHNSON, JOHN J. (editor) 1962 The Role of the Military in Underdeveloped Countries. Princeton Univ. Press. → Papers of a conference sponsored by the RAND Corporation at Santa Monica, Calif., in August 1959.
KLUCKHOHN, CLYDE (1959) 1964 Common Humanity and Diverse Cultures. Pages 245-284 in Daniel Lerner (editor), The Human Meaning of the Social Sciences. New York: Meridian.
LASSWELL, HAROLD D. 1965 The Policy Sciences of Development. World Politics 17:286-309.
LE PLAY, FREDERIC (1855) 1877-1879 Les ouvriers européens. 2d ed. 6 vols. Tours (France): Mame.
LERNER, DANIEL 1958a The Passing of Traditional Society: Modernizing the Middle East. Glencoe, Ill.: Free Press. → A paperback edition was published in 1964.
LERNER, DANIEL (editor) 1958b Attitude Research in Modernizing Areas. Public Opinion Quarterly Special Issue 22, no. 3.
LERNER, DANIEL 1962 The Reviving Civilizations. Pages 307-322 in Conference on Science, Philosophy, and Religion in Their Relation to the Democratic Way of Life, Fifteenth, New York, 1956, The Ethic of Power: The Interplay of Religion, Philosophy, and Politics. Edited by Harold D. Lasswell and Harlan Cleveland. New York: The Conference.
LERNER, DANIEL 1964 The Transformation of Institutions. Pages 3-26 in William B. Hamilton (editor), The Transfer of Institutions. Durham, N.C.: Duke Univ. Press.
LERNER, DANIEL; and SCHRAMM, WILBUR (editors) 1966 Communication and Change in the Developing Countries. Honolulu: East-West Center Press.
LEWIS, W. ARTHUR 1955 The Theory of Economic Growth. Homewood, Ill.: Irwin.
LIPSET, SEYMOUR M. 1963 The First New Nation: The United States in Historical and Comparative Perspective. New York: Basic Books.
MCCLELLAND, DAVID C. 1961 The Achieving Society. Princeton, N.J.: Van Nostrand.
MILLIKAN, MAX F.; and BLACKMER, DONALD L. M. (editors) 1961 The Emerging Nations: Their Growth and United States Policy. A Study from the Center for International Studies, Massachusetts Institute of Technology. Boston: Little.
MYRDAL, GUNNAR (1956) 1957 Rich Lands and Poor: The Road to World Prosperity. Rev. ed. New York: Harper. → First published as Development and Underdevelopment.
RIESMAN, DAVID 1950 The Lonely Crowd: A Study of the Changing American Character. New Haven: Yale Univ. Press. → An abridged paperback edition was published in 1960.
ROSTOW, WALT W. (1960) 1963 The Stages of Economic Growth: A Non-Communist Manifesto. Cambridge Univ. Press.
SCHRAMM, WILBUR L. 1964 Mass Media and National Development: The Role of Information in the Developing Countries. Stanford Univ. Press.
SHANNON, LYLE W. (editor) 1957 Underdeveloped Areas: A Book of Readings and Research. New York: Harper.
SHANNON, LYLE W. 1958 Is Level of Development Related to Capacity for Self-government? American Journal of Economics and Sociology 17:367-381.
SHILS, EDWARD 1958 Intellectuals, Public Opinion, and Economic Development. World Politics 10:232-255.
SIEGFRIED, ANDRE (1927) 1928 America Comes of Age: A French Analysis. New York: Harcourt. → First published in French.
SINGER, HANS W. 1949 Economic Progress in Underdeveloped Countries. Social Research 16:1-11.
SPEIER, HANS (1950) 1952 The Historical Development of Public Opinion. Pages 323-338 in Hans Speier, Social Order and the Risks of War: Papers in Political Sociology. New York: Stewart.
SPENGLER, JOSEPH J.; and DUNCAN, OTIS DUDLEY (editors) 1956 Population Theory and Policy: Selected Readings. Glencoe, Ill.: Free Press.
STALEY, EUGENE (1954) 1961 The Future of Underdeveloped Countries: Political Implications of Economic Development. Rev. ed. Published for the Council on Foreign Relations. New York: Harper.
II POLITICAL ASPECTS
The political aspects of modernization refer to the ensemble of structural and cultural changes in the political systems of modernizing societies. As an analytically separable subsystem of society the political system comprises all of those activities, processes, institutions, and beliefs concerned with the making and execution of authoritative policy and the pursuit and attainment of collective goals. Political structure consists of the patterning and interrelationship of political roles and processes; political culture is the complex of prevailing attitudes, beliefs, and values concerning the political system. The over-all process of modernization refers to changes in all institutional spheres of a society resulting from man's expanding knowledge of and control over his environment (Black 1962). Political modernization refers to those processes of differentiation of political structure and secularization of political culture which enhance the capability—the effectiveness and efficiency of performance—of a society's political system (Almond & Powell 1966). Political modernization can be viewed from a historical, a typological, and an evolutionary perspective. Historical political modernization refers to the totality of changes in political structure and culture which characteristically have affected or have been affected by those major transformative processes of modernization (secularization; commercialization; industrialization; accelerated social mobility; restratification; increased material standards of living; diffusion of literacy, education, and mass media; national unification; and the expansion of popular involvement and participation) which were first launched in western Europe in the sixteenth century and which subsequently have spread, unevenly and incompletely, throughout the world. Typological political modernization refers to the process of transmutation of a premodern "traditional" polity into a posttraditional "modern" polity. 
(Since concrete polities are only more or less modern, the term "modern polity" is used here to refer to those polities which in the 1960s are typologically the most modern.) Evolutionary political modernization refers to that open-ended increase in the capacity of political man to develop structures to cope with or resolve problems, to absorb and adapt to continuous change, and to strive purposively and creatively for the attainment of new societal goals. From the historical and typological perspectives, political modernization is a
process of development toward some image of a modern polity. From the evolutionary perspective, the growth process is interminable and the end state of affairs indeterminate. Theoretical approaches Efforts to depict the complex characteristics of a modern polity have tended to take three forms: descriptive trait lists, single-dimension reductionism, and ideal-type continua. Several studies have combined all three approaches in variant ways (e.g., Almond & Coleman 1960). The trait list approach usually identifies the major structural and cultural features generic to those contemporary polities regarded as modern by the observer (Almond & Coleman 1960; Black 1962; Eisenstadt 1964a; Kornhauser 1964; Conference on Communication . . . 1963; Conference on Political Modernization . . . 1964). These efforts have been criticized for being temporally and culturally bounded, for being excessively multidimensional, and for including some traits which vary independently of one another (Holt & Turner 1966). The reductionist approach focuses upon a single antecedent factor, explanatory variable, correlate, or determinant as the prime index or most distinguishing feature of modernization and, by implication, of political modernity. Single characteristics which have been highlighted include the concepts of capacity or capability (Brzezinski 1956; Holt & Turner 1966; Almond 1965), differentiation (Riggs 1963), institutionalization (Huntington 1965), national integration (Binder 1962), participation (Lerner 1958), populism (Fallers 1963), political culture (Almond & Verba 1963), psychological traits (Lerner 1958; Doob 1960; Conference on Communication . . . 1963), social mobilization (Deutsch 1961), and socioeconomic correlates (Lipset 1960; Coleman 1960; Cutright 1963).
These reductive efforts do not imply a denial of multivariate causation; rather, they reflect either the timeless quest for a comprehensive single concept of modernity or simply the desire to illuminate a previously neglected or underemphasized variable. The ideal-type approach is either explicit or implicit in most conceptualizations of both a modern political system and the process of political modernization. Descriptive trait lists of a generically modern polity tend unavoidably to be ideal-typical; indeed, the very notion of a "modern polity" implies an ideal-typical "traditional polity" as a polar opposite, as well as a "transitional polity" as an intervening type on a continuum of political modernization. Inspired by the original simple dichotomies of Maine (status-contract) and Tönnies (Gemeinschaft-Gesellschaft), and more directly by the pattern variables developed by Talcott Parsons, more complex ideal-typical dichotomous schemata of variable multidimensionality have been suggested for the study of comparative politics (Sutton 1959; Almond & Coleman 1960) and comparative administration (Riggs 1957). The essential differences between these schemata and the ideal-typical trait lists are that the attributes of the former are more logically interrelated in a unified construct and are specified for the two polar opposites (e.g., agrarian-industrial; traditional-modern). According to these schemata, the orientations governing the interactions characteristic of a traditional polity are predominantly ascriptive, particularistic, and diffuse; those of a modern polity are predominantly achievement-oriented, universalistic, and specific. Political modernization is viewed as the process of movement from the traditional pole to the modern pole of the continuum. The three-stage (traditional-transitional-modern) approach to political modernization is vulnerable to at least three criticisms. First, like all such models used in the study of change in the social sciences, it tends to convey a false image of the traditional pole of the modernization continuum (Moore 1963). The static, sacred, undifferentiated character of chronologically traditional polities tends to be exaggerated. Many historically "traditional" political systems in fact had typologically "modern" structures, attributes, and orientations and vice versa. Indeed, the political structures of all empirical societies—historical and contemporaneous—are mixed; the degree of their modernity is determined by whichever tendency predominates within the mix.
Although this fact is acknowledged, even stressed, by most users of the three-stage approach, the tendency to confuse, or at least to slur, the differences between historical and typological political modernity is a common fallacy in much of the literature. Second, this approach suggests that the movement between the two poles of traditionality and modernity is and must be irreversible, directional, and unilinear. It does not allow for political "breakdowns" in modernization (Eisenstadt 1964b), for "negative" political development or "prismatic" arrest (Riggs 1964), or for political "decay" (Huntington 1965). Third, at the modern end of the continuum, the three-stage model reinforces the image that the modernization process terminates, a notion implicit in the ordinary "present-day" meaning of the concept of modernity. It suggests the completion of a once-and-for-all
and once-to-a-system process of transmutation. By implying that there cannot be any further or continuous modernization, it rules out the concept of evolutionary political modernization. The evolutionary perspective emancipates the concept of political modernization from both its temporal (1500 to the present day) and its cultural and areal (Western world) constraints. It overcomes the implication of termination inherent in the idea of a "modern" polity and avoids the notion of "postmodern" political development. Although it does not becloud the fact that historically the major thrust in political modernization has occurred in the core area of western Europe in its postmedieval period, it allows us to reach back to the beginning of man's organized existence and to encompass the full range of structural diversity in man's experience in governing himself. Moreover, by viewing political modernization as an ongoing and continuous process, it encourages comparative trend analysis. Such a redefinition of the concept ties in with the revival and extension of evolutionary theory in cultural anthropology (Sahlins & Service 1960) and in sociology (Parsons 1964). Indeed, the distinction made by Sahlins between specific evolution (the historically continuous adaptation of particular societies to their environments) and general evolution (over-all, but discontinuous, development in human organization as manifested in the passage from lesser to greater capacity and all-round adaptability and from lower to higher levels of integration) is particularly crucial. [See EVOLUTION, article on CULTURAL EVOLUTION.]
This dual character of the evolutionary perspective makes it possible to refer, on the one hand, to the specific process of political modernization (i.e., the acquisition of typologically modern traits and capabilities) in particular concrete societies, which through specialization and adaptation may in time cease modernizing, and, on the other hand, to general political modernization, as manifested in the successive acquisition by politically organized man of enhanced and new capacity to seek, change, and attain his goals. It is, in short, a perspective that allows us to conceptualize political modernization, political development, and political growth as synonymous. Characteristics In the growing body of literature on modernization and development, the major characteristics most often associated with the concept of a modern polity and the process of political modernization can be roughly grouped under three major headings: (1) differentiation, as the dominant
empirical trend in the historic evolution of modern society; (2) equality, as the central ethos and ethical imperative pervading the operative ideals of all aspects of modern life; and (3) capacity, as the constantly increasing adaptive and creative potentialities possessed by man for the manipulation of his environment. The political modernization process can be viewed as an interminable contrapuntal interplay among the process of differentiation, the imperatives and realizations of equality, and the integrative, adaptive, and creative capacity of a political system. In these terms, political modernization is the progressive acquisition of a consciously sought, and qualitatively new and enhanced, political capacity as manifested in (1) the effective institutionalization of (a) new patterns of integration and penetration regulating and containing the tensions and conflicts produced by the processes of differentiation, and of (b) new patterns of participation and resource distribution adequately responsive to the demands generated by the imperatives of equality; and (2) the continuous flexibility to set and achieve new goals. The process of differentiation. Differentiation refers to the process of progressive separation and specialization of roles, institutional spheres, and associations in the development of political systems. It includes such "evolutionary universals" (Parsons 1964) as social stratification and the separation of occupational roles from kinship and domestic life, the separation of an integrated system of universalistic legal norms from religion, the separation of religion and ideology, and differentiation between administrative structure and public political competition. It implies greater functional specialization, structural complexity and interdependence, and heightened effectiveness of political organization in both administrative and political spheres. [See POLITICAL ANTHROPOLOGY, article on POLITICAL ORGANIZATION.] The ethos of equality. 
Equality is the ethos of modernity; the quest for it and its realization are at the core of the politics of modernization. It includes the notion of universal adult citizenship (equality in distributive claims and participant rights and duties), the prevalence of universalistic legal norms in the government's relations with the citizenry (equality in legal privileges and deprivations), and the predominance of achievement criteria (the psychic equality of opportunity to be unequal) in recruitment and allocation to political and administrative roles. Even though these attributes of equality are only imperfectly realized in the most modern polities, they continue to operate as the central standards and imperatives by which
modernization is measured and political legitimacy established. Popular participation or involvement in the political system, either symbolically or determinatively, is a central theme in most definitions of political modernization. [See EQUALITY.] The growth of political capacity. The acquisition of enhanced political administrative capacity is the third major feature of political modernization. It is characterized by an increase in scope of polity functions, in the scale of the political community, in the efficacy of the implementation of political and administrative decisions, in the penetrative power of central governmental institutions, and in the comprehensiveness of the aggregation of interests by political associations. Institutionalization of political organization and procedures, the development of problem-solving capabilities, centralization, and the ability to sustain continuously new types of political demands and organizations are among the varying ways in which the concept of capacity has been made central to the definition of political modernization and development. The sources of increased capacity include both differentiation (secularization, functional specialization, greater structural interdependence, motivation generated by status hierarchization) and equality (liberation of human energy and talent, universalism, achievement, rationalization, and civic identity and obligation); yet the tensions and divisiveness of differentiation and the demands of egalitarianism also constitute the main challenge to the capacity of a polity. The fact that the three aspects of modernization may sometimes conflict rather than reinforce one another explains why their contrapuntal interplay is central to any discussion of the modernization process. Democracy and nation-state. Two other characteristics commonly attributed to a modern polity are democracy and the nation-state. 
In some instances, Western democratic institutions are used explicitly as the empirical referents for a model of political modernity; in others such an identification is only implicit. This infusion of the concept with an allegedly culture-bound element limits the utility of the concept as a cross-cultural analytical tool. Recognition of this fact has stimulated efforts to identify and specify traits generic to those political systems generally recognized as the most modern in the contemporary world. Ethnocentrism aside, however, there are those who defend a democratic component in any model of political modernization either on ethical grounds or because it demonstrably enhances the integrative and adaptive capacity and the flexibility of a political system. This latter rationale is the basis for identifying the
"democratic association" as an "evolutionary universal," a major threshold in political modernization. [See DEMOCRACY.] The nation-state is the second major controversial component in definitions of political modernization. Most studies assume that the nation-state is the essential, if not the natural, framework of political modernization. According to Black (1962), the essential effect of modernization has been the creation of national states. Indeed, "nation building" is commonly viewed as either a crucial dimension of or as a synonym for political modernization. Historically, of course, the centralized nation-state (existing or emerging) has been the empirical unit of modernization in all its aspects. Moreover, unlike democratic institutions, it is a form of political organization that has in fact become universalized. It has also been legitimated by prevailing norms of international law and organization. Therefore, it is the most convenient and logical unit for analysis in studies both of historical and of contemporary modernization. From the evolutionary and typological perspective, however, it need not and should not be included as a requisite component of political modernization. [See NATION.] Patterns and variables The autonomy of the polity. A prominent theme in various strands of Western social and political thought is the dependence of the polity upon other institutional spheres.
This tradition of looking at political behavior and institutions as deriving from more fundamental social, economic, or psychological factors has been fortified in the social sciences—and particularly in political science—by a variety of other influences: behavioralism, systems theory, and structural-functionalism; the prominence in our image of modernization in the Western world of the laissez-faire period, in contrast with the earlier polity-dominant statist period; the pronounced economic determinism in America's post-World War II foreign aid policy, conditioned as it has been by the presumably successful democratization of West Germany and Japan; and several studies which suggest a positive correlation between political, social, and economic aspects of development (Lipset 1960; Coleman 1960; Cutright 1963; Banks & Textor 1963). In line with the continual rediscovery of lost or neglected variables in the pendulumlike evolution of the social sciences, the reaction against this polity-as-the-dependent-variable tradition is now in full swing (Montgomery & Siffin 1965; Spiro 1966). The need for a critical re-examination of the
degree of autonomy and primacy of the polity in the modernization process has been given added impetus as a result of retrospective analysis of historical modernization in the West (Black 1966); the dominance of the political sphere in the modernization of the Soviet Union and mainland China, as well as Japan, Turkey, and Mexico (Black 1962; Tsou 1963; Conference on Political Modernization . . . 1964; Eisenstadt 1964b); and the pre-eminence of the political factor in the modernization of the developing countries. In virtually all of these instances political leadership and centralized political organization have been dominant and causal, rather than derivative. There are also historical instances where substantial changes have taken place in the political sphere without correspondingly significant changes in the social and economic spheres, and vice versa (see Paige in Montgomery & Siffin 1965), thus underlining the autonomy of the polity in the modernization process. The American experience, as Huntington (1966) emphasized, demonstrates conclusively that some institutions and some aspects of a society may become highly modern while other institutions and other aspects retain much of their traditional form and substance.
Patterns of modernization. The varying patterns in the relationship between the polity and the society (polity dominance, polity dependence, and polity autonomy) are only one aspect of the extraordinary diversity which has characterized the process of political modernization throughout history. There presumably is no single universal process, no uniform sequential pattern or common structural arrangement. The modernizing experience of each country is sui generis. According to one view, the only generalization one can make is that late modernizers ". . . 
will not follow the sequence of their predecessors, but will insist on changing it around or on skipping entirely some stages as well as some 'preconditions'" (Hirschman 1962). Nevertheless, certain patterns have been suggested. One typology (Black 1966), based on three criteria (the ascendance and consolidation of modernizing leadership, economic and social transformation, and the integration of society), identifies seven patterns of political modernization: (1) Britain and France, the early modernizers and models for later modernizers; (2) the United States, Canada, Australia, and New Zealand, the offshoots of Britain and France in the New World; (3) the other societies of continental Europe, in which the consolidation of modernizing leadership occurred after the French Revolution; (4) the independent countries of Latin America; (5) societies
that modernized without direct outside intervention but under the influence of early modernizers (Russia, Japan, China, Iran, Turkey, Afghanistan, Ethiopia, and Thailand); and (6) and (7) former and residual colonial territories differentiated according to the existence of precolonial institutions adaptable to modern conditions. Using different criteria, another schema (Eisenstadt 1964k) distinguishes six clusters (constitutional democracies; totalitarian states; indigenous revolutionary regimes such as Turkey and Mexico; dictatorships in eastern Europe, the Middle East, and Latin America; authoritarian regimes in Spain and Portugal; and the postcolonial new states). A third schema (Huntington 1966), primarily concerned with the earlier phases of political modernization in Europe and America, distinguishes three patterns (continental European, British, and American) according to three criteria (rationalized authority, differentiated political structure, and mass political participation). Actual or ideal-typical patterns of political modernization in late-modernizing new states as a special category have also been suggested (Shils 1959-1960; Apter 1965). Variables affecting modernization. Among the many variables which can affect—and which historically have decisively affected—the course of political modernization, four seem to be particularly crucial: (1) the traditional political structure and culture, (2) the historical timing of the modernization thrust, (3) the character and orientation of political leadership, and (4) the sequence in which major system-development problems or "crises" generic to the political modernization process are encountered. Tradition. Traditional institutions and values have an extraordinary resilience and persistence. "[The] form a modern society takes is the result of the interaction of its historically formed traditions with the universalizing effects of modernization" (Black 1962). 
For example, if prior to the modernization leap a national state, a centralized government, and a dominant value system supportive of innovation and change already exist, there can be a "reinforcing dualism" (Conference on Political Modernization . . . 1964) between the traditional system and the modernizing process. Timing. The timing of the modernizing "takeoff" is also crucial in many ways: it determines the significance of an array of other variables, such as the international environment, the range of modernizing models available for emulation, the political manipulability or obstructiveness of tradition, the degree of social and political mobilization of the population and the resultant demand
load upon the polity, and the opportunities for modernizing short-cuts available to late starters favored by the so-called Law of Evolutionary Potential (Sahlins & Service 1960). Leadership. The nature of a modernizing political leadership largely determines the extent to which tradition is harnessed to modernization if it is supportive or neutralized if it is obstructive. It also determines the degree to which the disadvantages of timing are minimized and the opportunities are exploited. Individual political leaders and political elites have been the prime movers in political modernization. The rate and direction of that process, as well as the political structures and culture which emerge, reflect in large measure the values and goal orientations of the leadership; its adaptive and creative capacities; and its reaction to the modernization crises it confronts. Crises. The experience of the most highly developed contemporary polities has led to the identification of several critical "system-development problems" or "crises" which every modernizing polity encounters at least once and must cope with or surmount if it is to continue to modernize (Conference on Political Modernization . . . 1964; Black 1966; Almond & Powell 1966; Pye 1966). 
Although formulations vary, the following six problems illuminate this way of conceptualizing the political modernization process: (1) national identity, the transfer of ultimate loyalty and commitment from primordial groups to the larger national political system; (2) political legitimacy, the legitimation of modernizing elites and the authority structure of the new state; (3) penetration, the centralization of power, the establishment of a "determinate human source of final authority" transcending pre-existing subnational authority systems (Huntington 1966), the bridging of discontinuities in political communication, and the effectuation of policies throughout the society by the central institutions of government; (4) participation, the development of symbolic or participatory institutions and a political infrastructure to organize and channel the characteristically modern mass demand for a share in the decision-making process; (5) integration, the organization of a coherent political process and pattern of interacting relationships for the making of public policy and the pursuit and achievement of societal goals; and (6) distribution, the effective use of government power to bring about economic growth, mobilize resources, and distribute goods, services, and values in response to mass demands and expectations. The modernization of a political system is measured
by the extent to which it has developed the capabilities (symbolic, regulative, responsive, extractive, and distributive) to cope with these generic system-development problems (Almond 1965; Pennock 1966). It is argued not only that these capabilities are logically related but also that they suggest an order of development, that is, the development of one type of capability requires the development of another (e.g., increasing the extractive capability implies an increase in the regulative capability). Indeed, this approach could be the first step in the direction of a theory of political modernization, if the structural and cultural characteristics of political systems can be related to the ways in which these systems have confronted and coped with the crises common to all of them (Almond & Powell 1966). Systematic comparative historical studies of political modernization in Western polities are increasingly feasible as a consequence of the development of data archives and the use of electronic computers in processing historical information (Rokkan 1966). One promising initial focus would be upon the growth in political participation: in most countries of the West the requisite political statistics are available as far back as the French Revolution. This rediscovery of the legitimacy and theoretical potentiality of the historical dimension in political research and in diachronic analysis has been one of the unintended consequences of the postwar concern with the modernization of the developing countries. Continued systematic study of the evolution of the latter, together with the retrospective analysis of the political modernization of older polities, should significantly enhance our capacity not only to generalize about the past but also to suggest probabilities regarding the future.
JAMES S. COLEMAN
[See also GOVERNMENT; POLITICAL ANTHROPOLOGY; POLITICAL CULTURE; POLITICS, COMPARATIVE; SOCIETAL ANALYSIS.]
BIBLIOGRAPHY
ALMOND, GABRIEL A. 1965 A Developmental Approach to Political Systems. World Politics 17:183-214.
ALMOND, GABRIEL A.; and COLEMAN, JAMES S. (editors) 1960 The Politics of the Developing Areas. Princeton Univ. Press.
ALMOND, GABRIEL A.; and POWELL, G. BINGHAM JR. 1966 Comparative Politics: A Developmental Approach. Boston: Little.
ALMOND, GABRIEL A.; and VERBA, SIDNEY (1963) 1965 The Civic Culture: Political Attitudes and Democracy in Five Nations. Boston: Little.
APTER, DAVID E. (1955) 1963 Ghana in Transition. Rev. ed. New York: Atheneum. → First published as The Gold Coast in Transition.
APTER, DAVID E. 1960 The Role of Traditionalism in the Political Modernization of Ghana and Uganda. World Politics 13:45-68.
APTER, DAVID E. 1961 The Political Kingdom of Uganda: A Study in Bureaucratic Nationalism. Princeton Univ. Press.
APTER, DAVID E. 1965 The Politics of Modernization. Univ. of Chicago Press.
BANKS, ARTHUR; and TEXTOR, ROBERT 1963 A Cross-polity Survey. Cambridge, Mass.: M.I.T. Press.
BARKER, ERNEST (1937) 1944 The Development of Public Services in Western Europe: 1660-1930. New York and London: Oxford Univ. Press. → First published as "The Development of Administration, Taxation, Social Services and Education" in Edward Eyre (editor), European Civilization, Its Origin and Development.
BENDIX, REINHARD 1961 Social Stratification and the Political Community. Archives européennes de sociologie 1:181-210.
BINDER, LEONARD (1962) 1964 Iran: Political Development in a Changing Society. Published under the auspices of the Near Eastern Center, University of California. Berkeley and Los Angeles: Univ. of California Press.
BLACK, CYRIL E. 1962 Political Modernization in Russia and China. Pages 3-18 in International Conference on Sino-Soviet Bloc Affairs, 3d, Lake Kawaguchi, 1960, Unity and Contradiction: Major Aspects of Sino-Soviet Relations. Edited by Kurt London. New York: Praeger.
BLACK, CYRIL E. 1966 The Dynamics of Modernization: A Study in Comparative History. New York: Harper.
BROOKINGS INSTITUTION, WASHINGTON, D.C. 1962 Development of the Emerging Countries: An Agenda for Research. Washington: The Institution.
BRZEZINSKI, ZBIGNIEW 1956 The Politics of Underdevelopment. World Politics 9:55-75.
CHICAGO, UNIVERSITY OF, COMMITTEE FOR THE COMPARATIVE STUDY OF THE NEW NATIONS 1963 Old Societies and New States: The Quest for Modernity in Asia and Africa. Edited by Clifford Geertz. New York: Free Press.
COLEMAN, JAMES S. 1960 The Political Systems of the Developing Areas. Pages 532-576 in Gabriel A. Almond and James S. 
Coleman (editors), The Politics of the Developing Areas. Princeton Univ. Press.
COLEMAN, JAMES S. (editor) 1965 Education and Political Development. Princeton Univ. Press.
CONFERENCE ON COMMUNICATION AND POLITICAL DEVELOPMENT, DOBBS FERRY, N.Y., 1961 1963 Communications and Political Development. Edited by Lucian W. Pye. Princeton Univ. Press.
CONFERENCE ON POLITICAL MODERNIZATION IN JAPAN AND TURKEY, GOULD HOUSE, 1962 1964 Political Modernization in Japan and Turkey. Edited by Robert E. Ward and Dankwart A. Rustow. Princeton Univ. Press.
CUTRIGHT, PHILLIPS 1963 National Political Development: Measurement and Analysis. American Sociological Review 28:253-264.
DAVIES, JAMES C. 1962 Toward a Theory of Revolution. American Sociological Review 27:5-19.
DEUTSCH, KARL W. 1953a The Growth of Nations: Some Recurrent Patterns of Political and Social Integration. World Politics 5:168-195.
DEUTSCH, KARL W. (1953b) 1966 Nationalism and Social Communication: An Inquiry Into the Foundations
of Nationality. 2d ed. Cambridge, Mass.: M.I.T. Press; New York: Wiley.
DEUTSCH, KARL W. 1961 Social Mobilization and Political Development. American Political Science Review 55:493-514.
DEUTSCH, KARL W. 1963 The Nerves of Government: Models of Political Communication and Control. New York: Free Press.
DEUTSCH, KARL W. et al. 1957 Political Community and the North Atlantic Area: International Organization in the Light of Historical Experience. Princeton Univ. Press.
DOOB, LEONARD W. 1960 Becoming More Civilized: A Psychological Exploration. New Haven: Yale Univ. Press.
EASTON, DAVID 1965 A Framework for Political Analysis. Englewood Cliffs, N.J.: Prentice-Hall.
EISENSTADT, SHMUEL N. 1958 Bureaucracy and Bureaucratization: A Trend Report and Bibliography. Current Sociology 7:99-164.
EISENSTADT, SHMUEL N. 1961 Essays on Sociological Aspects of Political and Economic Development. The Hague: Mouton.
EISENSTADT, SHMUEL N. 1963a The Political Systems of Empires. New York: Free Press.
EISENSTADT, SHMUEL N. 1963b Modernization: Growth and Diversity. Bloomington: Indiana Univ., Department of Government.
EISENSTADT, SHMUEL N. 1964a Political Modernization: Some Comparative Notes. International Journal of Comparative Sociology 5:3-24.
EISENSTADT, SHMUEL N. 1964b Breakdowns of Modernization. Economic Development and Cultural Change 12:345-367.
EMERSON, RUPERT 1960a From Empire to Nation: The Rise to Self-assertion of Asian and African Peoples. Cambridge, Mass.: Harvard Univ. Press. → A paperback edition was published in 1962 by Beacon.
EMERSON, RUPERT 1960b Nationalism and Political Development. Journal of Politics 22:3-28.
FALLERS, LLOYD A. 1963 Equality, Modernity, and Democracy in the New States. Pages 158-219 in Chicago, University of, Committee for the Comparative Study of New Nations, Old Societies and New States: The Quest for Modernity in Asia and Africa. Edited by Clifford Geertz. New York: Free Press.
GINSBERG, MORRIS 1961 Essays in Sociology and Social Philosophy. 
Volume 3: Evolution and Progress. London: Heinemann.
HAGEN, EVERETT E. 1962 On the Theory of Social Change: How Economic Growth Begins. Homewood, Ill.: Dorsey.
HARRIS, DALE B. (editor) 1957 The Concept of Development: An Issue in the Study of Human Behavior. Minneapolis: Univ. of Minnesota Press.
HIRSCHMAN, ALBERT O. 1962 Comments on "A Framework for Analyzing Economic and Political Change." Pages 39-44 in Brookings Institution, Development of the Emerging Countries: An Agenda for Research. Washington: The Institution.
HOLT, ROBERT T.; and TURNER, JOHN E. 1966 The Political Basis of Economic Development: An Exploration in Comparative Political Analysis. New York: Van Nostrand.
HUNTINGTON, SAMUEL P. 1965 Political Development and Political Decay. World Politics 17:386-430.
HUNTINGTON, SAMUEL P. 1966 Political Modernization: America vs. Europe. World Politics 18:378-414.
KILSON, MARTIN 1963 African Political Change and the Modernisation Process. Journal of Modern African Studies 1:425-440.
KORNHAUSER, WILLIAM 1959 The Politics of Mass Society. Glencoe, Ill.: Free Press.
KORNHAUSER, WILLIAM 1964 Rebellion and Political Development. Pages 142-156 in Harry Eckstein (editor), Internal War: Problems and Approaches. New York: Free Press.
LAPALOMBARA, JOSEPH G. (editor) 1963 Bureaucracy and Political Development. Studies in Political Development, No. 2. Princeton Univ. Press.
LAPALOMBARA, JOSEPH G.; and WEINER, MYRON (editors) 1966 Political Parties and Political Development. Princeton Univ. Press.
LERNER, DANIEL 1958 The Passing of Traditional Society: Modernizing the Middle East. Glencoe, Ill.: Free Press. → A paperback edition was published in 1964.
LEVY, MARION J. 1966 Modernization and the Structure of Society: A Setting for International Affairs. 2 vols. Princeton Univ. Press.
LIPSET, SEYMOUR M. 1960 Political Man: The Social Bases of Politics. Garden City, N.Y.: Doubleday. → A paperback edition was published in 1963.
LIPSET, SEYMOUR M. 1963 The First New Nation: The United States in Historical and Comparative Perspective. New York: Basic Books.
MARSHALL, T. H. (1934-1962) 1964 Class, Citizenship, and Social Development: Essays. Garden City, N.Y.: Doubleday. → A collection of articles and lectures first published in England in 1963 as Sociology at the Crossroads and Other Essays. A paperback edition was published in 1965.
MILLIKAN, MAX F.; and BLACKMER, DONALD L. M. (editors) 1961 The Emerging Nations: Their Growth and United States Policy. A Study from the Center for International Studies, Massachusetts Institute of Technology. Boston: Little.
MONTGOMERY, JOHN D.; and SIFFIN, WILLIAM (editors) 1965 Politics, Administration and Change: Approaches to Development. New York: McGraw-Hill. → See especially the articles "The Rediscovery of Politics," by Glenn D. 
Paige, and "Political Development: Approaches to Theory and Strategy," by Alfred Diamant.
MOORE, WILBERT E. 1963 Social Change. Englewood Cliffs, N.J.: Prentice-Hall.
NEUFELD, MAURICE F. 1965 Poor Countries and Authoritarian Rule. Ithaca: New York State School of Industrial and Labor Relations.
ORGANSKI, A. F. K. 1965 The Stages of Political Development. New York: Knopf.
PACKENHAM, ROBERT A. 1964 Approaches to the Study of Political Development. World Politics 17:108-120.
PARSONS, TALCOTT 1964 Evolutionary Universals in Society. American Sociological Review 29:339-357.
PENNOCK, J. ROLAND 1966 Political Development, Political Systems, and Political Goods. World Politics 18:415-434.
PYE, LUCIAN W. 1962 Politics, Personality, and Nation Building: Burma's Search for Identity. New Haven: Yale Univ. Press.
PYE, LUCIAN W. 1966 Aspects of Political Development: An Analytic Study. Boston: Little.
PYE, LUCIAN W.; and VERBA, SIDNEY (editors) 1965 Political Culture and Political Development. Princeton Univ. Press.
RIGGS, FRED W. (1957) 1959 Agraria and Industria: Toward a Typology of Comparative Administration. Pages 23-116 in William J. Siffin (editor), Toward the Comparative Study of Public Administration. Bloomington: Indiana Univ. Press.
RIGGS, FRED W. 1963 Bureaucrats and Political Development: A Paradoxical View. Pages 120-167 in Joseph G. LaPalombara (editor), Bureaucracy and Political Development. Princeton Univ. Press.
RIGGS, FRED W. 1964 Administration in Developing Countries: The Theory of Prismatic Society. Boston: Houghton Mifflin.
ROKKAN, STEIN 1966 Electoral Mobilization, Party Competition and National Integration. Pages 241-265 in Joseph G. LaPalombara and Myron Weiner (editors), Political Parties and Political Development. Princeton Univ. Press.
ROSE, ARNOLD M. 1958 The Institutions of Advanced Societies. Minneapolis: Univ. of Minnesota Press.
SAHLINS, MARSHALL D.; and SERVICE, ELMAN R. (editors) 1960 Evolution and Culture. Ann Arbor: Univ. of Michigan Press.
SHILS, EDWARD (1959-1960) 1962 Political Development in the New States. The Hague: Mouton.
SMELSER, NEIL J. 1964 Toward a Theory of Modernization. Pages 258-274 in Amitai Etzioni and Eva Etzioni (editors), Social Change: Sources, Patterns and Consequences. New York: Basic Books.
SPIRO, HERBERT J. (editor) 1966 Africa: The Primacy of Politics. New York: Random House.
SUTTON, FRANCIS X. 1959 Representation and the Nature of Political Systems. Comparative Studies in Society and History 2:1-10.
TSOU, TANG 1963 America's Failure in China, 1941-1950. Univ. of Chicago Press.
WARD, ROBERT E. 1963 Political Modernization and Political Culture in Japan. World Politics 15:569-596.
WORMUTH, FRANCIS D. 1949 The Origins of Modern Constitutionalism. New York: Harper.
III THE BOURGEOISIE IN MODERNIZING SOCIETIES
Just as no pattern of late-developing industrialization is likely to repeat at all closely the history of those European nations where modern economic growth first began, so no non-Western country is likely to have a bourgeoisie identical with those groups whose self-conscious sense of collective identity first gave rise to the term. In many countries there have been, and are, recognizable groups which, by virtue of their intermediate position between a power-holding ascriptive upper class and a large peasant or wage-working mass, might be granted the minimum qualification of middleness required for the label "bourgeoisie." They are likely, however, to differ from the Western model by virtue of a variety of factors: the upper class from which they are differentiated may be not a landed nobility of military origins but a bureaucratic oligarchy, a theocratic court, or a colonial elite; within the middle class professionals or public servants may outnumber or outinfluence businessmen; the
businessmen themselves may be more predominantly members of a corporation salariat than entrepreneurs; the state rather than the private firm may play the major role in industrialization; and the intellectuals of the middle class may function more as the importers and interpreters of ideologies than as their creators.
Japan
Japan, as the first non-Western country to achieve a high level of industrialization, provides an instructive illustration of some of these differences [see JAPANESE SOCIETY]. In Japan the drive to industrialize and the creation of a bourgeoisie were a consequence of a political revolution: the centralization of state power in the Meiji Restoration of 1868. In Europe, by contrast, causation ran mostly the other way: political change reflected the changed class relations resulting from economic growth. Even before 1868 Japan already had a small merchant class which dominated interregional trade and controlled a good deal of handicraft production, but the majority of the crucially innovative entrepreneurs of the early stages of industrialization were drawn not from this old commercial middle class, but from the samurai, the warrior class that made up some 5-6 per cent of the population (Hirschmeier 1964). The samurai had been the retainers of some 280 feudal lords, living in these lords' castle-towns and staffing their fief administrations, drawing rice stipends for their services. They were already, therefore, more bureaucratic than gentrylike, more urban than rural, more group-oriented than individualistic (Hall 1962; Smith 1966). 
This is one reason for the early bureaucratization of business organization in Japan, although there are other factors involved too: late-developing industrialization requires less individual inventiveness and more institutionalized learning, which enhanced the importance placed on formal educational qualifications (Smith 1960), and a good deal of the initial phases of industrialization was undertaken directly by the state. By the end of the nineteenth century the business elite of Japan was located in a few large corporations, and by the 1950s the majority of business leaders had spent the whole of their careers as salaried officials rather than as independent entrepreneurs (Dore 1966). Politically the role of the middle class reflected a crucial difference between the upper class of late nineteenth-century and early twentieth-century Japan and that of European countries in the early stages of industrialization, namely, that those who
held prestige and power in Japanese society did not hold a large proportion of their property in land and did not have personal interests in the protection of agriculture even at the expense of manufacturing industry. The Meiji land settlement had removed the feudal lords and their retainers from their fiefs and compensated them for their feudal revenues in government bonds. A few hundred of these feudal families were incorporated into a titled aristocracy, most of whom invested their large stocks of compensation bonds in banking and in industry. They formed a conspicuously consuming rentier class whose life revolved around the imperial court and which, though at the pinnacle of the prestige hierarchy, was almost completely divorced from participation in either business or political life. And insofar as they were able or concerned to insure the protection of their interests, those interests coincided with those of the industrial middle class, lying pre-eminently in the fostering of industrial growth. Power rested initially with the bureaucracy and the army, both recruited from the samurai class. Both were therefore staffed by men who held little property but their commutation bonds, little source of income besides their salary, little power but what they enjoyed by virtue of their office. Both rapidly developed procedures for recruitment by academic examination which strengthened their sense of being elites that deserved to rule. Because it was a bureaucracy selected for scholastic merit and because there were no restrictions based on heredity to bar promotion into the upper power-holding elite (except for the barrier between the administrative and clerical grades), middle functionaries could always aspire to be top functionaries and were consequently not an important source of middle-class consciousness: they produced no John Stuart Mill. The professions also offered a different constellation from that of postfeudal Europe. 
Japanese feudalism did not dissolve gradually through the legal formalization of the customary property rights of subordinate members of the feudal hierarchy and the transformation of those rights into marketable assets. Consequently, there was no development of a powerful group of independent lawyers, necessarily led by their professional interests to be keenly concerned with politics. The law faculty dominated the new universities, but the pick of its graduates was drawn into the bureaucracy or into the ranks of the state prosecutors and judges. The civilian lawyer, consequently, was typically a failed bureaucrat who carried little prestige and influence —the more so since the habit of formal litigation
in civil matters was less developed than in Christian, Hindu, or Muslim societies. The principal independent professions, therefore, were those of medicine, teaching, and journalism; the first was naturally apolitical, the second only slightly less so and in any case dominated by the professors of state universities closely allied to the bureaucracy. Journalism was the one profession of political importance, and its practitioners played a strong supporting role in the opposition political movement that developed. For there was in the nineteenth century a kind of "middle-class" challenge to the military and bureaucratic elites strong enough to force the grant of a constitution, the gradual extension of the franchise, and an expansion of the powers of political parties (Ike 1950). This challenge came not chiefly from a rising business class but, first, from other ex-samurai who had failed to get a footing in the bureaucracy and, second, from the landlords —men of peasant origin and mostly still farmers themselves, sometimes also brewers, timber merchants or proprietors of small food-processing establishments—whose land taxes provided the bulk of government revenue and who demanded some say in its use. Thus, the dichotomies that roughly held for Europe—rural, aristocratic, agricultural, and power-holding versus urban, middle-class, industrial-commercial, professional, and power-demanding—did not apply to Japan. In the initial stages of constitutional rule, indeed, the landlords largely dominated the opposition political parties, which were allowed to criticize but not to control the bureaucracy (Scalapino 1953). Later, urban commercial and industrial interests gained an increasing hold on the political parties and, pari passu, there was an increasing interpenetration of the bureaucracy and the parties—bureaucrats giving up office to become politicians and the parties influencing appointments in the bureaucracy—with a consequent sharing of power between the two. 
By the time, in the early 1920s, that industrial concentration, universal education, and imported ideologies had combined to produce a working-class consciousness and the beginnings of a self-styled proletarian political movement, there was no longer any meaningful sense in which the forces that rallied to contain the threat from the working class could be divided into patrician and bourgeois or even into the power-elite and the power-hungry. By then the effective power struggle was between groups that can only be defined in terms of occupation and outlook, not in terms of class or status group—between, on the one hand, the army and its allies in the bureaucracy, the
parties, and the business world and, on the other, the bulk of the bureaucracy, politicians, and businessmen, whose interests were represented in the civilian cabinet. However, if there was, by this time, no meaningful distinction between upper and middle to be drawn between the groups that dominated the country politically, it is possible to distinguish from them another group—in European terms the lower middle class—that was already of some importance politically, particularly inasmuch as its support helped the army to win the struggle for control in the 1930s. These were the shopkeepers and small businessmen, together with the products of the secondary grades of education who became primary and secondary school teachers and the clerical workers in government and private business. These, unlike the dominant metropolitan groups, were men with their roots in local communities, the natural opinion leaders of those communities —Japan's pseudo-intellectuals as one Japanese social scientist has dubbed them (Maruyama 1963) —leaders of reservists' associations, organizers of patriotic charities, and so on. They had enough causes for personal resentment against the dominant groups (there was only one educational ladder of success and they were the ones who had risen only halfway up it), and they were provided with enough moral grounds for expressing their resentment by the luxury and corruption of the business classes, the arrogance of the bureaucrats, the unpatriotic concern for sectional interests of the politicians, and the un-Japanese cosmopolitanism of them all. The cosmopolitanism was a factor of some importance. In Europe and America national consciousness developed its real strength in the bourgeoisie, very often in partial reaction against the cosmopolitanism of the aristocracy (especially, for instance, in the countries culturally on the fringe of Europe, such as Russia and America). 
In Japan it was the new professional bureaucratic and business classes themselves that were cosmopolitan. Since educational selection played so big a role in their recruitment, a university education was the chief thing that distinguished them from the rest of the population. A high proportion of them had been to foreign universities, if only for a year or two. But even the culture of the Japanese universities, which provided the bulk of their education, was not something evolved out of Japanese traditions; it was an importation: overtly so in the technological fields, less obviously but more consequentially so in the humanities and social sciences, where knowledge and values could not be separated. A high proportion of the textbooks students read were translations or in foreign languages; their admired philosophers, physicists, jurists, novelists were German, French, and American; and they learned at the university not only about their subjects but also about Beethoven, whisky, animal protection societies, silk handkerchiefs, impressionism, waltzing, romantic love, anarchism, and all the other overt deviations from Japanese mores that the army (home-bred in much more quickly and wholly Japanized military academies) and the lower-middle groups (who had failed to get to a university) could denounce as cosmopolitan decadence (Bennett et al. 1958). Nevertheless, even if an uncertain sense of cultural identity did prevent these new middle-upper groups from being the bearers of nationalism as opposed to cosmopolitanism, they were just as effective as their European counterparts in promoting nationalism at the expense of regional parochialism, and for the similar reason that they were geographically and socially mobile. Their mobility may have been different from that of the traveling late-medieval merchant (mobility from provincial home to metropolitan school, from one provincial bureaucratic appointment to another), but it had the same effect, of weakening local loyalties, which in Japan centered on the Tokugawa fiefs. Equally, there were similarities between the cultural and ethical values of these groups and those of their European counterparts, determined by the very fact that their education was European in inspiration. The process was different, even if the end results were somewhat the same. Thus, for instance, a secular world view came with Western science, rather than giving birth to it. Like the steam engine, the Enlightenment did not have to be invented again.
The effectiveness of this diffusion through intellectual channels depended, however, on the extent to which the traditional culture was receptive (the ground was already prepared, for instance, for greater stress on achievement at the expense of ascriptive criteria by changes within the samurai class [Smith 1966]) and on the extent to which other structural changes supported the new values. The latter point may be illustrated by the contrast between two different elements of the Protestant ethic stereotype: "inner-worldly asceticism" and individualism. The first was a marked characteristic of the new Japanese middle classes, although the underlying mechanisms were different: the Japanese was trained to sacrifice immediate for future gratifications through the discipline of a strenuously purposeful education rather than
by the practice of thrift; the dream of success was sanctified by the approval not of God, but of the ancestors, the guardians of the family whose honor the success would adorn. By contrast, the other central feature of the Protestant middle-class ethic— individualism—was much less marked in Japan. The traditions of the family and the fief as institutions demanding a total loyalty and Japan's religious history (Buddhism did not have Christianity's emphasis on conviction, conscience, and principle, and on separating truth from error) are a partial explanation (Benedict 1946). Just as important were the early emphasis on occupational selection through education and the early bureaucratization of business. Even in the initial stages of industrialization most Japanese spent most of their lives firmly embedded either in a traditional community or in a large organization, and both school and firm (for labor mobility was low above the manual worker level) laid claims to a lifetime's loyalty. It was remarked in 1940 that in the sense of a flowering of individual creativity, the Chinese —who for a century had Western education but no stable development of modern corporate organizations—were more thoroughly "Westernized" than the Japanese (Hu 1940). The priority of the firm, corporation, or government department as a focus of "belonging" is a reason not only for the lack of individuation but also for the lack of consciousness of class membership. Japan's society was more segmented than stratified. Thus, the tendency present in all advanced industrial societies for the lines of cleavage between status and class groups to become blurred into a gradual declension of income and prestige has been very marked in postwar Japan. Consequently, one may say that the middle class has disappeared or, alternatively, that it has swallowed up an ever larger segment of the nation. 
The defeat and the elimination of the army as a significant political and cultural force ended polarization around the issues of cosmopolitanism or tradition. Power became more widely diffused among economic, political, and administrative groups in a pluralistic system. The corporation with its lifetime employment pattern came to embrace even larger groups of the population. Occupational selection became even more rigidly dependent on educational qualification. And rapid economic growth brought the same homogenizing effects (mass production of the consumer durables of prestige significance, wider diffusion of mass media of deeper impact, etc.) as in other societies. If one were to draw a major dividing line through the Japanese population today it would probably be to distinguish a lower class of poorer farmers and less skilled manual workers (perhaps 40 to 50 per cent of the population) from the remainder, who, though graded in income, differ little in standards (as opposed to levels) of living, in leisure tastes, reading habits, dress, accent, or aspirations. It is the broadening of this latter group and the relative homogeneity of its aspirations that have so intensified the pressure on the educational system that the intensely competitive entrance examinations for "the good schools" have become the events around which the whole lives of many families revolve (Vogel 1963; Plath 1964).
Sub-Saharan Africa
The ex-colonial countries of sub-Saharan Africa in some senses show the closest parallels to the Japanese pattern. It is rarely possible, for instance, to draw a meaningful distinction between an upper and a middle class, even where, as in the precolonial period, there was relatively marked social differentiation between the mass of the population and a chiefly stratum with dominant control over land and some degree of hereditary continuity [see AFRICAN SOCIETY]. Even in British Africa, where the power of the chiefs over land was strengthened by a system of indirect rule, they rarely achieved a sufficiently radical differentiation in culture and style of life to be termed a "landed aristocracy." (This is true even of Buganda, which is the nearest to an exception to this generalization [see Apter 1961; Fallers 1964].) And in those areas where the traditional political structure has survived into the postindependence period (as in Uganda and northern Nigeria), the chiefs or emirs tend to retain only local influence in a position of subordination to the new national or federal elites. When chiefs or the sons of chiefs themselves belong to this new political elite as civil servants, soldiers, or politicians, they most commonly do so by virtue of their education or professional training rather than by virtue of their heredity.
(Again, even in Uganda there was a tendency for Gandan politicians to move into the national Ugandan majority party and away from a Bugandan identification [Lee 1965].) The real upper class of these societies had been, of course, the group of white officials, teachers, and settlers. In the preindependence period the educated members of the professions who led the independence movements and now form the new national elites could well have been described as a "rising middle class." (The term "African bourgeoisie" was chosen by a sociologist to describe the South African counterparts of these professionals, precisely "to emphasise in terms of social change and prospective power their role at the apex of subordination" [Kuper 1965, p. 8].) With independence and the removal of the colonial power the "apex of subordination" became the apex of the society. Those who were trained for teaching, the civil service, the army, the press, medicine, or the law, and chiefs' sons educated to be better chiefs (men whose chief common characteristic was a shared experience of postprimary education in European, not African, traditions) have formed the relatively unstratified political elite of the new states. As in Meiji Japan, political struggles, when they are not still largely struggles between "primordial" local or ethnic groups, are usually struggles between the "ins" and the "outs," factional groups not distinguished by different social origins or a different constellation of material interests. ("When a developing country has two Ph.D.'s, one will be prime minister and the other in exile.") These educated professional groups as yet have few rivals for political or cultural influence, not least because the field of commerce was, and to a lesser extent remains, dominated by non-African immigrant communities: Asians and Arabs in east Africa, Levantines and Europeans in the west. Although in the colonial period these groups (particularly Asians in Kenya or Uganda) were sometimes more successful than Africans in pressing for political participation at an earlier stage, independence has seen a marked decline in their influence. They have become marginal minorities, culturally distinct and socially separated from Africans of similar income levels, tolerated for their essential services to the economy but (with a few exceptions, such as in Senegal) generally the object of discriminatory credit and fiscal policies designed to nurture an African trading class to replace them.
Insofar as they translate their wealth into higher education for their sons, however, they may maintain their position by entry into the administrative and professional class itself, though without changing the nature of that class or the pattern of its dominance. Meanwhile, in only a few areas, such as Ghana and western Nigeria, have there emerged African traders of sufficient substance to acquire a notable share of political influence and thereby dilute the predominantly professional character of the elite (Hunter 1962).
India
Among ex-colonial countries, India stands out as being somewhat different because the colonial power took over a more developed literate civilization, because the land settlement created a landlord class of a modern rather than of a tribal-chieftainly kind, because a sizable business class (mostly commercial but partly industrial) had developed before independence, and because Western education and the devolution of power began so much earlier. There was even something similar to the European division between an upper and a middle class. On the one hand were the civil servants of the higher ranks, mostly drawn from the wealthiest (especially landed) families and identified with the colonial status quo; on the other were professional groups, such as physicians and, especially, lawyers, who were active in the independence movement (Misra 1961). Postindependence India, however, is an even more striking example than Japan of the numerical and cultural predominance of the educationally qualified administrative and professional groups over mercantile and industrial elements within the occupations traditionally defined as bourgeois. The reasons are fairly clear: because state enterprise plays an even more dominant part in industrialization and because the bureaucracy has also expanded to accord with the dominant mid-twentieth-century notions of the role of the state in welfare, educational, and cultural, as well as in economic matters, notions that have evolved in response to the economic development of the advanced industrial societies but are now generally accepted also in societies at a much lower level of industrial development. The diffusion of European middle-class values chiefly through the educational process is even more marked in India than it was in Japan, if only because British universities played such a large part in training the administrative elite and Indian universities were much more overtly modeled on the British.
Even if they free themselves from the British model, Indian universities are bound to be modern institutions with a predominantly "Western" content and an entirely "Western" orientation in the techniques of research and scholarship. The problem of national identity—of accepting the alien origins of one's culture and at the same time accepting one's Indianness, of believing in the superiority of the imported alien to the traditional native values and ideologies and yet retaining one's national pride—this pull between tradition and modernity is acute (Shils 1961). It is especially so for intellectuals, but it has not, as in Japan in the 1930s, been brought to a point of traumatic exacerbation. In Japan the conflict with the West heightened the strain. India does not feel this strain so acutely, partly because of the syncretic
tradition of Hindu culture, partly because a modus vivendi between the traditional and the modern cultures has had time to become established, and finally because India and the West are not in political conflict. (In China, where the political hostility has been brought to a crucial pitch, the initial solution for intellectuals lay in the fact that by adopting one type of nineteenth-century European middle-class ideology, Marxism, they had a firm basis for confident denunciation of other types embodied in the modern American enemy. Later, as tensions developed in relations with other Marxist countries it became necessary to Sinicize Marxism itself.)
Middle East and Latin America
Other countries, by contrast, notably in the Middle East and in Latin America, show patterns much more similar to the European than to either the Japanese or the ex-colonial type, particularly inasmuch as they have, or until recently had, a recognizable landed upper class to provide a defining boundary for the "middleness" of the new professional and business groups. The political development of Egypt during this century can be interpreted, for instance, in terms familiar to European history. The Wafd, beginning in the 1920s as an alliance of the landlords and the business elites against the ruling house and its foreign protectors, became increasingly penetrated by the new white collar, professional, and small business groups in the 1930s. Finally, after it was allowed to share in power, it was increasingly weakened by tension between party oligarchy and rank and file to the benefit of the rival Muslim Brotherhood, a party whose center of gravity was at an even more popular level and which combined demands for social and economic reform with a reaffirmation of Islam (thus providing a solution to the late-developer's cultural dilemma by being modern while remaining traditional, accepting foreign models while remaining nationalist).
Finally, the military revolution reconcentrated power in the hands of men predominantly of middle-class origin, though without government ever having been effectively exercised by men who were consciously representative of middle-class interests [see NEAR EASTERN SOCIETY; see also Vatikiotis 1961]. Similar processes may be observed in many Latin American countries, where again the "emergence of the middle sectors" has been looked on as the most hopeful source of political and economic progress [see LATIN AMERICAN POLITICAL THOUGHT; see also Johnson 1958]. In the Middle East and Latin America there has occurred a progressive
fusion of the old upper and the newer middle groups. The successful lawyer, banker, or civil servant buys himself a prestige estate (to be used, also, for recreation, tax write-off, and sometimes genuinely agricultural profit-making purposes). The landlord invests in metropolitan business and puts his sons through professional training. The process of fusion is not unlike that, for example, in England, though with one crucial difference. In the English fusion the aristocracy absorbed bourgeois values in good measure: some becoming progressive commercial farmers, some of their sons going into trade, their schools developing the moral seriousness of purpose of the empire builder, and so on. In late-developing countries, by contrast, the aristocratic values are likely to emerge as the dominant ones for the following reasons: (1) professionals and the salariat predominate in the middle groups (expansion of government far beyond that of Europe at a similar stage of development; emergence of the modern corporation full-blown as the initiator of economic development; predominance in the business world of the foreign firm, whose managers and top technicians are outside the indigenous culture, society, and polity); (2) occupational aspirations are consequently concentrated on official or professional, rather than business or technological careers; (3) the universities (as opposed, for example, to the churches) play a dominant role in the formation of common values (the administrative and corporation salariat have above all to be qualified; education, like government, is more easily expanded to developed-country levels than is industry); (4) universities, designed primarily to train administrators, emphasize law, philosophy, the humanities, neglecting science, engineering, commerce; in other words they provide an education of a traditional consumption-oriented kind, reinforcing aristocratic values at the expense of those parts of the European (or Japanese) middle-class
ethic—production-orientation, emphasis on diligence, objective standards of merit, dominance of nature, etc.—which were most important for economic development. Even the business world remains frozen in attitudes characteristic of the mercantilist period in Europe: the acquisition of wealth depends on political patronage and aristocratic connections; it is a function of power to get licenses and permits which are a source of money in themselves, requiring no production effort (Cochran 1959; van der Kroef 1956). This does not prevent the new middle groups from effectively promoting new values and behavior patterns in some fields. In the Middle East, for instance, it is the middle, not the upper groups that
have been the pioneers in feminine education and the general emancipation of women (Berger 1958). In Latin America new literary movements have chiefly been promoted by and for the new middle groups (Ellison 1964). Similarly, if the ethos of the new middle groups seems often directly inimical to economic development, they may still play a crucial role in promoting it insofar as middle-class intellectuals inspire, and men of middle-class origins in armies or revolutionary parties carry through, revolutionary political changes that destroy the existing correlation of power with wealth (both traditional landed and modern urban) to create a new regime that radically alters the dominant ethos, gives honor to the engineer, the manager, and the chemist, and uses state power to mobilize the resources necessary for developmental investment. To be sure, the gentlemanly antiscientific bias of the predominant culture may be such that even revolutionary regimes genuinely intent on economic development find it difficult to will the means to their economic ends. It was, for instance, some years after 1958 that the Cuban government recovered from a belief that revolutionary elan was all that was needed to build a new Jerusalem and began to recognize the essential importance of technological and economic skills (Dumont 1964). Eventually the lesson is likely to be learned, however, and the exemplary effect of a few revolutionary regimes, together with the influence exerted by the expanding international organizations concerned with development, tend to promote in other countries as well the transformation of values that leads from aristocracy to technocracy. 
Whether that technocracy will be effective in its role of economic development, however (whether it can, as a salariat, show the same dedication as the legendary Puritan entrepreneur to the cause of growing two blades of grass where only one grew before), depends on whether it can find sufficient strength of motivation either in nationalism or in some reformist ideology to provide a substitute for the entrepreneur's more self-interested concerns (Gellner 1965).
RONALD P. DORE
[See also BUREAUCRACY; STRATIFICATION, SOCIAL.]
BIBLIOGRAPHY
APTER, DAVID E. 1961 The Political Kingdom of Uganda: A Study in Bureaucratic Nationalism. Princeton Univ. Press.
BENEDICT, RUTH 1946 The Chrysanthemum and the Sword: Patterns of Japanese Culture. Boston: Houghton Mifflin.
BENNETT, JOHN W.; PASSIN, HERBERT; and MCKNIGHT, ROBERT K. 1958 In Search of Identity: The Japanese Overseas Scholar in America and Japan. Minneapolis: Univ. of Minnesota Press.
BERGER, MORROE 1958 The Middle Class in the Arab World. Pages 67-71 in Walter Z. Laqueur (editor), The Middle East in Transition. London: Routledge.
CHANG, CHUNG-LI 1962 The Income of the Chinese Gentry. Seattle: Univ. of Washington Press.
COCHRAN, THOMAS C. 1959 The Puerto Rican Businessman: A Study in Cultural Change. Philadelphia: Univ. of Pennsylvania Press.
DORE, RONALD P. 1966 Individuation, Mobility and Equality in Modern Japan. Unpublished manuscript.
DUMONT, RENÉ 1964 Cuba: Socialisme et développement. Paris: Éditions du Seuil.
ELLISON, FRED P. 1964 The Writer. Pages 79-100 in John J. Johnson (editor), Continuity and Change in Latin America. Stanford Univ. Press.
FALLERS, LLOYD A. (editor) 1964 The King's Men: Leadership and Status in Buganda on the Eve of Independence. Oxford Univ. Press.
FEI, HSIAO-T'UNG (1947-1948) 1953 China's Gentry: Essays in Rural-Urban Relations. Univ. of Chicago Press.
GELLNER, ERNEST 1965 Thought and Change. Univ. of Chicago Press.
HALL, JOHN W. 1962 Feudalism in Japan: A Reassessment. Comparative Studies in Society and History 5:15-51.
HIRSCHMEIER, JOHANNES 1964 Origins of Entrepreneurship in Meiji Japan. Cambridge, Mass.: Harvard Univ. Press.
HO, PING-TI 1954 The Salt Merchants of Yang-chou: A Study of Commercial Capitalism in Eighteenth-century China. Harvard Journal of Asiatic Studies 17:130-168.
HODGKIN, THOMAS 1956 The African Middle Class. Corona 8:85-88.
HU, SHIH 1940 The Modernisation of China and Japan: A Comparative Study in Cultural Conflict. Pages 243-251 in Caroline F. Ware (editor), The Cultural Approach to History. New York: Columbia Univ. Press.
HUNTER, GUY 1962 The New Societies of Tropical Africa: A Selective Study. London: Oxford Univ. Press.
IKE, NOBUTAKA 1950 The Beginnings of Political Democracy in Japan. Baltimore: Johns Hopkins Press.
INOKI, MASAMICHI 1964 The Civil Bureaucracy: Japan. Pages 283-300 in Conference on Political Modernization in Japan and Turkey, Gould House, 1962, Political Modernization in Japan and Turkey. Edited by Robert E. Ward and D. A. Rustow. Princeton Univ. Press.
JOHNSON, JOHN J. 1958 Political Change in Latin America: The Emergence of the Middle Sectors. Stanford Univ. Press.
KUPER, LEO 1965 An African Bourgeoisie: Race, Class, and Politics in South Africa. New Haven: Yale Univ. Press.
LEE, J. M. 1965 Buganda's Position in Federal Uganda. Journal of Commonwealth Political Studies 3:165-181.
MARUYAMA, MASAO 1963 Thought and Behaviour in Modern Japanese Politics. London and New York: Oxford Univ. Press.
MISRA, BANKEY B. 1961 The Indian Middle Classes: Their Growth in Modern Times. Oxford Univ. Press.
NORMAN, E. HERBERT 1940 Japan's Emergence as a Modern State: Political and Economic Problems of the Meiji Period. New York: Institute of Pacific Relations, International Secretariat.
PLATH, DAVID W. 1964 The After Hours: Modern Japan and the Search for Enjoyment. Berkeley: Univ. of California Press.
SAFRAN, NADAV 1961 Egypt in Search of Political Community: An Analysis of the Intellectual and Political Evolution of Egypt, 1804-1952. Cambridge, Mass.: Harvard Univ. Press.
SCALAPINO, ROBERT A. (1953) 1962 Democracy and the Party Movement in Prewar Japan: The Failure of the First Attempt. Berkeley: Univ. of California Press.
SHILS, EDWARD 1961 The Intellectual Between Tradition and Modernity: The Indian Situation. The Hague: Mouton.
SMITH, THOMAS C. 1960 Landlords' Sons in the Business Elite. Economic Development and Cultural Change 9, no. 1, part 2:93-107.
SMITH, THOMAS C. 1961 Japan's Aristocratic Revolution. Yale Review New Series 50:370-383.
SMITH, THOMAS C. 1966 Merit as Ideology in Tokugawa Japan. Unpublished manuscript.
TIRYAKIAN, EDWARD A. 1959 Occupational Satisfaction and Aspiration in an Underdeveloped Country: The Philippines. Economic Development and Cultural Change 7, no. 4:431-444.
VAN DER KROEF, JUSTUS M. 1956 Economic Development in Indonesia: Some Social and Cultural Impediments. Economic Development and Cultural Change 4, no. 2:116-133.
VATIKIOTIS, PANAYIOTIS J. 1961 The Egyptian Army in Politics: Pattern for New Nations? Bloomington: Indiana Univ. Press.
VOGEL, EZRA F. 1963 Japan's New Middle Class: The Salary Man and His Family in a Tokyo Suburb. Berkeley: Univ. of California Press.
WEINER, MYRON 1957 Party Politics in India: The Development of a Multi-party System. Princeton Univ. Press.
MOHAMMEDANISM
See ISLAM.
MOIVRE, ABRAHAM DE
Abraham de Moivre (1667-1754) was born of French Protestant parents named Moivre. (He was also known as Demoivre; as part of the return address of a letter to Johann Bernoulli he himself wrote his name as deMoivre.) He studied mathematics and physics in Paris under Ozanam, and emigrated to England when he was 21 to escape religious persecution (Walker 1934). Although de Moivre was a mathematical genius of outstanding analytical power and was in contact by correspondence and in person (at the Royal Society) with many of the leading mathematicians of the day, he never succeeded in obtaining a university appointment. Instead, he had to live by tutoring noblemen's sons and by advising gamblers and speculators who dealt in annuities, which were a popular form of investment in the first half of the eighteenth century (Walford 1871). This misfortune for de Moivre is posterity's gain, for the
problems he met in his consulting practice and his successful solution of them provided the material for his two great textbooks. In fact, during his last years de Moivre must have relied heavily on the sales of the later editions of his book on annuity calculations. De Moivre's practical text on probability first appeared in 1718 as a translation and revision of his Latin article of 1711. It was dedicated to Isaac Newton, who is accorded the author's thanks for his writings and conversations. In its final form, published in 1756, this book is notable for its original treatment of the following topics, all of which play a central role in the modern theory of probability:
(1) The general laws of addition (David & Barton 1962, chapter 2) and multiplication of probabilities (Montucla [1758] 1802, part 5, book 1, chapter 39);
(2) The binomial distribution law (Cantor 1898, chapter 96);
(3) Probability-generating functions (Seal 1949a);
(4) Difference equations involving probabilities and their solution by means of recurring series (Czuber 1900);
(5) New and general solutions of problems on the duration of play, or "gambler's ruin" (Todhunter 1865, chapter 9);
(6) The limiting form of the binomial term $\binom{n}{x} p^{x}(1-p)^{n-x}$, $x = 0, 1, 2, \ldots, n$; $0 < p < 1$, when (a) $n \to \infty$ with $np$ remaining finite, and (b) $n \to \infty$ and $np \to \infty$. In case (a) only the term with $x = 0$ was considered (David 1962). In case (b) the result $\{2\pi np(1-p)\}^{-1/2}\exp[-(x-np)^{2}/\{2np(1-p)\}]$, namely, the ordinate of the normal distribution, was obtained explicitly (in different notation).
Included in this book were the trigonometrical theorem that goes by de Moivre's name and his approximation to the logarithm of a factorial, which was improved by Stirling's discovery in the same year that the value of the series contained therein was $\tfrac{1}{2}\log 2\pi$. Some of the mathematical derivations of this probability text were published for a wider circle of readers in Latin (1730). De Moivre's other textbook laid the foundations of the mathematics of life contingencies (Saar 1923). Although the first edition sold slowly, Thomas Simpson's plagiaristic text of 1742 spurred
de Moivre to a complete revision published in the following year (Young 1908). The success of this edition is indicated by the two further editions, with minor changes, that followed within nine years. In 1756 a final, thoroughly revised edition was printed as the last section of the third edition of The Doctrine of Chances. In an appendix, reference is made to a paper published in 1755 by James Dodson, the father of scientific life insurance (Ogborn 1962) and possibly the "friend" who edited the posthumous edition. The originality of this life contingency textbook is attested by its inclusion of the following:
(a) The recursion formula for calculating a life annuity at age x, given that at age x + 1 (though it is doubtful whether the author envisaged the calculation of the whole set of annuity values by starting at the "oldest age" [Young 1908]);
(b) General relations for survivorship and reversionary annuities in terms of single and joint life annuities;
(c) Use of the calculus to obtain the value of a continuous annuity-certain;
(d) A law of mortality, namely, that of uniform decrements in the number of survivors, which was in substantial agreement with the Breslau table, published by his friend Halley in 1693; and as a result:
(e) Easily computable values of single and joint life annuities for limited terms or for life;
(f) Expressions for the computation of complex survivorship probabilities;
(g) The value of a life annuity with a proportionate payment in the year of death.
All these results originated with de Moivre himself. The earliest published works of de Moivre in the 1690s were influenced by Newton's method of fluxions and theory of series (Cantor 1898, chapter 86), and his interest in probability dates only from the first edition of Montmort's Essay (1708).
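Items (a), (d), and (e) of the list above lend themselves to a small numerical illustration. The sketch below is not taken from de Moivre's text. It assumes modern actuarial conventions: payments of 1 at the end of each year survived, a limiting age of 86 for the law of uniform decrements, and interest at 5 per cent, all chosen here for illustration. The function names are hypothetical. It builds the annuity backward from the oldest age by the recursion a(x) = v p(x) [1 + a(x + 1)], and checks the result against a direct summation and against the closed form that the uniform-decrement assumption makes possible.

```python
def survivors(age, omega=86):
    """Survivors at `age` under uniform decrements: one death per year up to omega."""
    return max(omega - age, 0)


def annuity_by_recursion(age, rate=0.05, omega=86):
    """Life annuity at `age`, built backward from the oldest age.

    Applies a(x) = v * p(x) * (1 + a(x + 1)), where p(x) is the probability
    of surviving from age x to age x + 1 and v is the one-year discount factor.
    """
    v = 1 / (1 + rate)
    value = 0.0  # nothing is payable at the limiting age (assumed convention)
    for x in range(omega - 1, age - 1, -1):
        p_x = survivors(x + 1, omega) / survivors(x, omega)
        value = v * p_x * (1 + value)
    return value


def annuity_by_summation(age, rate=0.05, omega=86):
    """Same annuity as a direct sum of discounted survival probabilities."""
    v = 1 / (1 + rate)
    alive = survivors(age, omega)
    return sum(v ** t * survivors(age + t, omega) / alive
               for t in range(1, omega - age + 1))


def annuity_closed_form(age, rate=0.05, omega=86):
    """Closed form under uniform decrements, with n = omega - age years remaining."""
    n = omega - age
    if n <= 0:
        return 0.0
    v = 1 / (1 + rate)
    certain = (1 - v ** n) / rate  # annuity-certain for n years
    return (1 - (1 + rate) * certain / n) / rate


if __name__ == "__main__":
    for age in (30, 50, 70):
        print(age, round(annuity_by_recursion(age), 4),
              round(annuity_closed_form(age), 4))
```

The three functions should agree to rounding error. The closed form shows why a law of uniform decrements made life annuity values "easily computable": the life annuity reduces to a single annuity-certain.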
Perhaps his most important contribution, first printed in 1733 as a supplement to the Miscellanea analytica, was his improvement of the wide limits obtained by Bernoulli (1713) in his statement of the law of large numbers. For this purpose de Moivre utilized the result mentioned in (6) to obtain the sum of the binomial probabilities from $x = np - t$ to $x = np + t$ with $p = \tfrac{1}{2}$ and $t = k\sqrt{n}/2$, $k = 1, 2, 3$, by approximate quadrature (by ordinate summation and the three-eighths rule) of the normal ordinates. While this constitutes the first tabulation of the normal areas at one, two, and three standard deviations from the mean, there is no evidence that de Moivre thought in terms of a continuous probability distribution (Seal 1954; 1957).
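The relation between the binomial sums and the normal areas described above can be checked numerically. The sketch below is a modern illustration, not a reproduction of de Moivre's quadrature: it sums exact binomial terms and compares them with the normal area obtained from the error function. The choice n = 3600 and all function names are assumptions made here for illustration.

```python
from math import ceil, erf, exp, floor, lgamma, log, pi, sqrt


def binomial_term(n, x, p):
    """Binomial probability C(n, x) p^x (1-p)^(n-x), computed through logarithms."""
    log_term = (lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
                + x * log(p) + (n - x) * log(1 - p))
    return exp(log_term)


def normal_ordinate(n, x, p):
    """Limiting form {2 pi n p (1-p)}^(-1/2) exp[-(x - np)^2 / {2 n p (1-p)}]."""
    variance = n * p * (1 - p)
    return exp(-(x - n * p) ** 2 / (2 * variance)) / sqrt(2 * pi * variance)


def binomial_within(n, k, p=0.5):
    """Sum of binomial terms from np - t to np + t, with t = k standard deviations."""
    t = k * sqrt(n * p * (1 - p))
    lo = max(0, ceil(n * p - t))
    hi = min(n, floor(n * p + t))
    return sum(binomial_term(n, x, p) for x in range(lo, hi + 1))


def normal_area(k):
    """Area under the normal curve within k standard deviations of the mean."""
    return erf(k / sqrt(2))


if __name__ == "__main__":
    n = 3600  # illustrative choice; for p = 1/2 the standard deviation is 30
    for k in (1, 2, 3):
        print(k, round(binomial_within(n, k), 4), round(normal_area(k), 4))
```

Because the exact sum includes both end terms, it can be expected to lie slightly above the continuous area for finite n; the gap shrinks as n grows.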
Nevertheless, this work is clearly the basis for the subsequent demonstration by Laplace (1812, pp. 275-284) that the binomial tends to the normal when n is large (Pearson 1924). It may be added that de Moivre was a pure mathematician little interested in the practical applications of his theory. Although he wrote on life contingencies, only in the final Appendix of the posthumous edition of 1756 is there a brief reference to mortality data later than those of the Breslau table. Actually, the early and middle years of the eighteenth century saw the publication of several collections of mortality statistics that would have been fertile ground for the application of de Moivre's improved version of Bernoulli's theorem (Seal 1949b). These data had led to a widespread belief in the divine regularity of demographic ratios, and a few paragraphs in the 1738 and 1756 editions of the Doctrine refer to a connection between this belief and Bernoulli's theorem. Unfortunately, the topic was not pursued by de Moivre or his contemporaries (Westergaard 1932, chapters 7, 10) and cannot be regarded as indicating that de Moivre was interested in theology (Walker 1929) or that he influenced the demographers of the eighteenth and early nineteenth centuries (Pearson 1926).
HILARY L. SEAL
[For the historical context of de Moivre's work, see the article on the BERNOULLI FAMILY. For discussion of the subsequent development of de Moivre's ideas, see DISTRIBUTIONS, STATISTICAL; LIFE TABLES; PROBABILITY; and the biography of LAPLACE.]
WORKS BY DE MOIVRE
In many reference works, de Moivre is alphabetized under D; however, in line with the cataloguing practice of major libraries, we have listed him under M. Consistent with this, we have used a lower-case "d" for the particle.
1711 De mensura sortis seu, de probabilitate eventuum in ludis a casu fortuito pendentibus. Royal Society of London, Philosophical Transactions 27:213-264. → Reprinted by Kraus (New York) in 1963.
(1718) 1756 The Doctrine of Chances: Or, a Method of Calculating the Probabilities of Events in Play. 3d ed. London: Millar.
(1725) 1752 Annuities Upon Lives: Or, the Valuation of Annuities Upon Any Number of Lives, as Also, of Reversions. 4th ed. London: Millar. → The Appendix concerns the expectations of life and the probabilities of survivorship.
1730 Miscellanea analytica de seriebus et quadraturis. . . . London: Tonson & Watts.
SUPPLEMENTARY BIBLIOGRAPHY
BERNOULLI, JAKOB (1713) 1899 Wahrscheinlichkeitsrechnung: (Ars conjectandi). 2 vols. Leipzig: Engelmann. → First published posthumously in Latin.
CANTOR, MORITZ 1898 Vorlesungen über Geschichte der Mathematik. Volume 3: Von 1668-1758. Leipzig: Teubner.
CZUBER, EMANUEL (1900) 1906 Calcul des probabilités. Section 1, volume 4, pages 1-46 in Encyclopédie des sciences mathématiques. Paris: Gauthier-Villars. → First published in German in Encyklopädie der mathematischen Wissenschaften.
DAVID, F. N. 1962 Games, Gods and Gambling: The Origins and History of Probability and Statistical Ideas From the Earliest Times to the Newtonian Era. New York: Hafner.
DAVID, F. N.; and BARTON, D. E. 1962 Combinatorial Chance. New York: Hafner.
LAPLACE, PIERRE SIMON DE (1812) 1820 Théorie analytique des probabilités. 3d ed., rev. Paris: Courcier.
[MONTMORT, PIERRE REMOND DE] (1708) 1713 Essay d'analyse sur les jeux de hazard. 2d ed. Paris: Quillau. → First published anonymously.
MONTUCLA, JEAN E. (1758) 1802 Histoire des mathématiques dans laquelle on rend compte de leurs progrès depuis leur origine jusqu'à nos jours. . . . Paris: Agasse.
OGBORN, MAURICE EDWARD 1962 Equitable Assurances: The Story of Life Assurance in the Experience of the Equitable Life Assurance Society, 1762-1962. London: Allen & Unwin.
PEARSON, KARL 1924 Historical Note on the Origin of the Normal Curve of Errors. Biometrika 16:402-404.
PEARSON, KARL 1926 Abraham de Moivre. Nature 117:551-552.
SAAR, J. DU 1923 De beteekenis van De Moivre's werk over lijfrenten voor de ontwikkeling van de verzekeringswetenschap. Verzekerings archief 4:28-45.
SEAL, H. L. 1949a The Historical Development of the Use of Generating Functions in Probability Theory. Vereinigung schweizerischer Versicherungsmathematiker, Mitteilungen 49:209-228.
SEAL, H. L. 1949b Mortality Data and the Binomial Probability Law. Skandinavisk aktuarietidskrift 32:188-216.
SEAL, HILARY L. 1954 A Budget of Paradoxes. Journal of the Institute of Actuaries Students' Society 13:60-65.
SEAL, HILARY L. 1957 A Correction. Journal of the Institute of Actuaries Students' Society 14:210-211.
SIMPSON, THOMAS (1742) 1775 The Doctrine of Annuities and Reversions, Deduced From General and Evident Principles. . . . 2d ed. London: Printed for J. Nourse.
TODHUNTER, ISAAC (1865) 1949 A History of the Mathematical Theory of Probability From the Time of Pascal to That of Laplace. New York: Chelsea.
WALFORD, CORNELIUS 1871 The Insurance Cyclopaedia. Volume 1. London: Layton. → See especially pages 98-169 on "Annuities."
WALKER, HELEN M. 1929 Studies in the History of Statistical Method: With Special Reference to Certain Educational Problems. Baltimore: Williams & Wilkins.
WALKER, HELEN M. 1934 Abraham de Moivre. Scripta mathematica 2:316-333.
WESTERGAARD, HARALD L. 1932 Contributions to the History of Statistics. London: King.
YOUNG, T. E. 1908 Historical Notes Relating to the Discovery of the Formula $a_x = v p_x (1 + a_{x+1})$: And to the Introduction of the Calculus in the Solution of Actuarial Problems. Journal of the Institute of Actuaries 42:188-205.
MONARCHY
The term "monarchy" has been used in both a broad and a narrow sense. The broad sense is found in the writings of the Ancients, especially Herodotus and the poets, where it denotes simply the rule of one man (or woman), whether good or bad, legitimate or unlawful, wise or incompetent. Plato and Aristotle introduced distinctions that narrowed the term by restricting it to rule by one good person; Plato defined the good by reference to law, and Aristotle did so by reference to happiness. In the modern West, another kind of narrowing has occurred in response to historical developments, especially feudalism. Here monarchy designates a particular type of one-person rule, characterized by legitimate blood descent, no matter how limited the extent of the governing functions; indeed, the term may even refer to regimes in which the monarch has no governing functions at all, as in Great Britain and the Scandinavian kingdoms. Monocracy. Since Western historical associations cannot be applied to one-person rule in other cultures, comparative politics stands in need of a generic concept similar to the original Greek meaning of monarchy, a concept that would cover primitive kingship, Oriental despotism, tyranny, dictatorship, and the Western kind of monarchy. The term "monocracy," or monocratic rule, first suggested by Max Weber, has been coming into use in recent years. When anthropologists discuss monocratic rule, they usually mean one-person rule among primitives, which prevails, or prevailed before the European conquest, in Polynesia, Africa, and parts of America as well as Asia. The economic, political, judicial, and priestly functions of monocratic rulers differ widely within the same culture area. These rulers are generally regarded as of divine origin, and their acts are invested with divine qualities. They are supposed to possess mana and in their persons are frequently taboo; to touch them constitutes treason. 
The power of such a ruler is typically related in a magical way to successful crops and wars: there can be little doubt that military leadership is often at the heart of his power. When the rule of this kind of king-priest was extended over large territories, especially in the ancient Orient, it was accompanied by the development of a bureaucracy. Such a bureaucracy may and often does combine priestly and administrative functions. In Egypt, China, and elsewhere, such extended bureaucratic monocracy was often associated with an official doctrine, such as Confucianism, the mastery of which served as the principle for selecting participants in the regime.
The succession of empires from the Egyptian to the Persian shows how widespread was this form of governmental organization within very diverse culture patterns. Indeed, the extent and durability of such regimes suggest that monocratic rule is the usual form of governing extensive territorial domains. This fact may be related to a persuasive general proposition concerning the appearance of monocratic rule in a variety of social contexts: it appears whenever a group is engaged in a serious struggle for survival. The threat to survival may be internal or external: wars, floods, insurrections, and the crises of industrial society have all been "causes" of the appearance of one kind of monocracy or another. Ancient monarchy. Primitive government in the historical perspective seems also to have been predominantly monocratic. In Greece, for example, chiefs of divine descent appear to have performed the key functions of military leader, high priest, and judge. But eventually the nobility secured, or perhaps recaptured, an effective share of government. These fluid situations are reflected in such poets as Homer and Pindar. Modern researches in anthropology, archeology, and prehistory have shown that the Greek situation exemplifies fairly universal and recurrent conditions of early government. It is worthy of note, however, that matriarchal or patriarchal monarchy occurs under conditions that call more for magical and arbitral abilities than for military prowess. Tyranny. A marginal form of monocracy, rarely referred to as monarchy, is tyranny. It made its appearance in Greece, and elsewhere, when class warfare between the nobility and the plebs caused political order to dissolve into civil war and anarchy. According to Aristotle, tyranny is the least stable of all forms of government. The Romans sought to forestall such developments by institutionalizing tyranny in the form of dictatorship. Both forms have, of course, reappeared in more recent times. 
In Greece, aristocracies, democracies, and tyrannies were challenged by monarchy. Having successfully withstood the onslaught of the Persian kings, at least in Greece proper, the Greeks were overwhelmed by the Macedonian rulers. Philip and his brilliant son Alexander set the stage for a proliferation of dynasties, which had been traditional in Macedonia. These dynasties dominated Greece and Asia Minor during the Hellenistic age until their conquest by Rome. Although deeply imbued with traditional antimonarchical sentiment nurtured by a triumphant aristocracy, Rome eventually became a monarchy of radically autocratic propensity. After an extended period of transition, during which republican trappings were deliberately cultivated by Augustus and his successors, the Roman Empire emerged as a full-fledged monarchy. It remained troubled by problems of succession throughout its long history, however, because the notion of legitimizing a ruler by blood descent remained unacceptable for many generations. Historians and political philosophers have speculated on why Roman republicanism should have been superseded by monarchical autocracy. In the works of writers from Machiavelli to Montesquieu, Gibbon, and Mommsen, to mention only the most famous, the explanations for this transformation were, variously, the decline in morals, in religion, and in traditional manners, the extension of Rome's sway and the corrupting influence of Oriental ways, and even the personal defects of Sulla, Pompey, and Caesar. Actually, the emergence, or rather re-emergence, of monarchy in Rome occurred in response to the same forces that characteristically fashion monarchical government: civil dissension, breakdown of public order, and serious foreign setbacks and threats arising on Rome's far-flung frontiers. The continued pressure of outside enemies and internal dissensions operated in the direction of monocratic, and indeed autocratic, rule; when, in the third century, Diocletian openly proclaimed such rule, he was merely stating officially what had long been a fact. Religious legitimation of monarchy. It has been said that the change in the status of the emperor reflected a fundamental transformation in all conceptions of life. This may be true, but the impending Christianization of the empire presumably had an even more profound significance. As against the pagan preoccupation with affairs of this world, the Christian emphasis on the life hereafter became the dominant interest. "Render unto Caesar that which is Caesar's" may be taken as the key symbolic utterance of a basic indifference toward politics.
Soon the church was to claim complete autonomy, at least in the West, which spelled the end of monarchy in the priestly tradition; monarchy hereafter appeared as the secular arm of the one God, whose primary representative on earth was the monarchical head of the church. It has rightly been said that the Roman Catholic church preserved and developed the great tradition of Roman law. This heritage had a profound impact upon the development of monarchy in the West. The crucial feature of the emerging monarchical pattern was, at first, not the doctrine of legibus solutus, but above all the emphasis on law as expressed in the principle Quod placet principem,
legis habet vigorem. Reinforced by the monotheistic conception of the deity in the Old Testament, which stressed the law-giving aspect, the monarch became primarily the dispenser of justice in the legal sense. This kind of monarch was epitomized in the symbolic figure of St. Louis sitting under an oak tree expounding the law. Such a monarch was a far cry from the omnipotent Oriental ruler, surrounded by pomp and circumstance. A monarch, confronted by ecclesiastical authorities ever ready to remind him of the natural and spiritual limits of the law and to back up their reminder with excommunication and the release of the ruler's subjects from their allegiance, needed to reinforce his position as an individual by the legitimation of monarchy as an institution. Such legitimation was provided by the hallowing of blood descent. Absolutism and constitutional monarchy. There has been a great deal of learned controversy with regard to the details of the intertwining of Germanic and Roman traditions in the evolving of Western forms of political order. But there can be little question that a real amalgamation took place. The decisive event in this process was the crowning of Charlemagne by the pope in the year 800. This event, the result of ecclesiastical initiative, decisively shaped Western monarchy, especially in France, Germany, Austria, Bohemia, and Poland. After Charlemagne, and in contrast to the caesaropapism of the Eastern Empire, which preserved the older pattern of monarchy and bequeathed it to Russia, Western monarchy was torn between the conception of the Holy Roman Empire and the folkways of Germanic kingship. The latter remained strong in England, Spain, and Scandinavia. In these countries the nobility successfully claimed a share in ruling and thereby provided the restraint that the church sought to exercise in the empire. Nobility and clergy joined in shaping constitutional forms of monarchy, especially in England and Spain.
These were "mixed" governments, rather than "pure" forms. But before they could become universal, most of Europe went through a phase of absolute monarchy. Absolutism, especially as practiced in France, at times turned into despotism. Absolutist regimes were at once the creators and the expressions of national unification. They did away with feudal impediments to economic growth and fostered national churches challenging the ultramontane bonds of Catholicism. What Protestantism accomplished by the complete break with Rome, Gallicanism provided in an indirect way: an ecclesiastical authority closely bound up with secular rule. The absolute power thus placed
in the hands of the monarch "corrupted" men and regimes and eventually engendered the violent reaction of revolution. In due course, absolute monarchy was overthrown, never to reappear in the European West; it was replaced by various constitutionalist forms, which were inspired by the example of England but which rarely if ever achieved the stability that tradition lent to the British crown. The French Charte constitutionnelle is typical in that it contains a rather doctrinaire system of separation of powers, even though the theory underlying it was distrusted. Constitutional monarchies varied considerably in regard to the scope they allowed the monarchical element; the scope was gradually reduced, in stages punctuated by the revolutions of 1830 and 1848. In addition, a new form of monarchy made its appearance with the rise of Napoleon Bonaparte. Although at the outset he was more a dictator than a monarch, Bonaparte insisted upon acquiring the trappings of traditional monarchy and clearly hoped to found a dynasty. Although a more absolute ruler than the monarchs he emulated, he too acknowledged the persistent Western preoccupation with law in putting through his great codification. Still, his methods served to discredit absolutism, and the autocracy of the Russian tsars did nothing to rehabilitate it. Indeed liberalism, like the Enlightenment before it, rapidly undermined the bases of monarchical legitimacy. The decline of monarchy. The extent of the corrosion of monarchy was laid bare by World War I. Its revolutionary sequels swept away the monarchy in Germany, Austria-Hungary, and Russia, a process later completed by the disappearance of monarchy in Spain, Italy, and Turkey. Only a few, largely ceremonial monarchs remain in Europe. The decline of monarchy is a world-wide trend. It has toppled in China and is in rapid retreat in Japan, India, the rest of Asia, and most of Africa. It has never been able to establish much of a foothold in America.
If it is remembered that in Britain the monarch has long ceased to be in control of the government, one might venture the proposition that traditional monarchy, legitimized in terms of blood descent and ecclesiastical unction, is becoming extinct. The rise of monocracy. Nothing of the kind can be said for monarchy in the sense of the monocratic rule of one man. This type of government is actually on the increase all over the world, not only in totalitarian dictatorships but in military and even constitutional regimes. The rising importance of executive power, linked as it is to the increasing complexity of the decisions required in a technological age, enhances the monocratic thrust inherent in bureaucratic structures. Not only men like Stalin, Hitler, Mao, Tito, and Gomulka but also Kemal Atatürk, Ayub Khan, Nasser, Nkrumah, and in a sense even de Gaulle and some of the more recent Latin American dictators, are the new monarchs in the original Greek meaning of the term. Rulers like the kings of Morocco and of Saudi Arabia are in fact becoming monocrats. The legitimacy of these rulers (as well as of their succession) varies. In the totalitarian states their legitimacy is based upon the party and its ideology; in other countries it rests upon military achievement and support; in still others it is linked to a broad plebiscitary appeal; and in all of them such rule is further legitimized by a rising standard of living and the furtherance of economic development. Nor is there any end to this trend in sight; rather the opposite. While hereditary monarchy is finished (even the movements trying to resuscitate it, like the Action Française, are dead or moribund), plebiscitary monarchy, as first instituted by Napoleon, seems destined to spread during the remainder of the twentieth century.
CARL J. FRIEDRICH
[See also AUTOCRACY; DICTATORSHIP; EXECUTIVE, POLITICAL; KINGSHIP; LEGITIMACY; SOVEREIGNTY.]
BIBLIOGRAPHY
BARKER, ERNEST 1923 The Conception of Empire. Pages 45-89 in Cyril Bailey (editor), The Legacy of Rome. Oxford: Clarendon.
BRYCE, JAMES (1864) 1956 The Holy Roman Empire. New ed., rev. & enl. London: Macmillan.
COULBORN, RUSHTON (editor) 1956 Feudalism in History. Princeton Univ. Press.
EISENSTADT, SHMUEL N. 1963 The Political Systems of Empires. New York: Free Press.
FIGGIS, JOHN N. (1896) 1922 The Divine Right of Kings. 2d ed. Cambridge Univ. Press. → First published as The Theory of the Divine Right of Kings. A paperback edition was published in 1965 by Harper.
FRANKFORT, HENRI 1948 Kingship and the Gods: A Study of Ancient Near Eastern Religion as the Integration of Society and Nature. Univ.
of Chicago Press.
FRIEDRICH, CARL J. (1937) 1950 Constitutional Government and Democracy: Theory and Practice in Europe and America. Rev. ed. Boston: Ginn. → First published as Constitutional Government and Politics: Nature and Development.
FRIEDRICH, CARL J. 1963 Man and His Government. New York: McGraw-Hill. → See especially Chapter 10.
GIERKE, OTTO VON (1868-1913) 1954 Das deutsche Genossenschaftsrecht. 4 vols. Graz (Austria): Akademische Druck- und Verlagsanstalt. → Volume 1: Rechtsgeschichte der deutschen Genossenschaft. Volume 2: Geschichte des deutschen Körperschaftsbegriffs. Volume 3: Die Staats- und Korporationslehre des Altertums und des Mittelalters und ihre Aufnahme in Deutschland. Volume 4: Die Staats- und Korporationslehre der Neuzeit.
HOCART, ARTHUR M. 1927 Kingship. Oxford Univ. Press.
HOOKE, SAMUEL H. (editor) 1958 Myth, Ritual, and Kingship. Oxford: Clarendon.
KERN, FRITZ (1914) 1939 Kingship and Law in the Middle Ages. Oxford: Blackwell. → First published as Gottesgnadentum und Widerstandsrecht im früheren Mittelalter.
KOEBNER, RICHARD (1961) 1965 Empire. New York: Grosset & Dunlap.
LOEWENSTEIN, KARL 1952 Die Monarchie im modernen Staat. Frankfurt am Main (Germany): Metzner.
MAIR, LUCY P. (1962) 1964 Primitive Government. Baltimore: Penguin.
MAURRAS, CHARLES (1909) 1928 Enquête sur la monarchie. New ed. Versailles (France): Bibliothèque des Oeuvres Politiques.
MOMMSEN, THEODOR (1871) 1887-1888 Römisches Staatsrecht. 3d ed., 3 vols. Leipzig: Hirzel. → See especially Volume 2, Part 2.
NICOLSON, HAROLD G. 1962 Kings, Courts and Monarchy. New York: Simon & Schuster.
PETRIE, CHARLES A. 1952 Monarchy in the Twentieth Century. London: Dakers.
PINE, LESLIE G. 1958 The Twilight of Monarchy. London: Burke.
ROSTOVTSEV, MIKHAIL I. (1926) 1963 The Social and Economic History of the Roman Empire. New ed., 2 vols. Oxford: Clarendon.
SYME, RONALD (1939) 1960 The Roman Revolution. Oxford Univ. Press.
WEBER, MAX (1922) 1956 Wirtschaft und Gesellschaft. 4th ed., 2 vols. Tübingen (Germany): Mohr. → See especially Chapter 3 of Volume 1, Part 1.
WITTFOGEL, KARL A. 1957 Oriental Despotism: A Comparative Study of Total Power. New Haven, Conn.: Yale Univ. Press. → A paperback edition was published in 1963.
WOLFF-WINDEGG, PHILIPP 1958 Die Gekrönten: Sinn und Sinnbilder des Königtums. Stuttgart (Germany): Klett.
MONASTICISM "Monasticism" is derived from the Greek word for "alone." Words like the Latin monachus ("monk") were first used to describe men who lived alone—hermits, solitaries who lived apart for the sake of God or a prayerful life. By a simple extension of meaning the word was applied to communities of monks (or of nuns) who retired within enclosures to separate themselves from other men for the purpose of seeking quiet for simple devotion and contemplation. Monasteries are groups of men or women pursuing a religious ideal in retirement from society. The religious ideal pursued may differ between one religion and another. But in all the higher religions, examples are found of men or women retiring from society to contemplate truth and strive for purity of heart. The strains and noise of the world are believed to prevent the soul from concentrating upon the good: it must draw apart to direct its attention and eschew every distraction.
A universally accepted condition of this withdrawal has been celibacy, freeing the individual from the distractions of physical passions and the ties of family life. Another has been poverty, freeing the soul from concern for material possessions. When the withdrawal is to a community rather than to a hermitage, obedience to a superior is considered an important exercise in destroying self-will. In Roman Catholic monasticism the monk or nun takes a threefold vow of chastity, poverty, and obedience. In other religions there are rarely vows, but the threefold intention is almost universal. The origin of the monastery was connected with the belief that the world is evil: existence is a burden, and the soul must be delivered from matter. The soul and body were believed to be opposed: the body must be mortified that the soul may find its true self, its "salvation," "perfection," "deliverance," "redemption." The most ancient forms of this doctrine are those found among the Hindus; the most ancient monasteries known appeared in the early years of Hinduism, when groups gathered to share a life of mortification and Vedic studies. Monasticism has flourished above all in Buddhism, for Gautama Buddha took the deliverance doctrine of Hinduism, spiritualized it, and thereby made withdrawal the only discipline that would lead to that state of perfection which was Nirvana. For Buddhists, monasticism is not a heightened form of the religious way of life, as it is for Catholic Christians; it is the religious life. At different times, monasteries have dominated religion, civilization, and culture in those countries where almost all the people profess Buddhism: Burma, Thailand, Tibet. In the three religions with an interest in the Old Testament (Judaism, Christianity, Islam), monasticism has played a less dominant role. The God of Genesis is a living God, a ruler and a father, who created the world and saw that it was good. For none of these faiths is the body evil.
The God of Mount Sinai demands a moral people and a moral society. His servants shall seek to secure a world free of injustice and oppression. Individuals and groups may be permitted to retire from society, but this never becomes a universal ideal. Moreover the monastic life is almost always associated with some form—however rudimentary or however advanced—of mysticism. In Hinduism and Buddhism the soul which mortifies the passions and directs its prayer may pass into union with the absolute good of the universe, possessing it and possessed by it. In the Old Testament, God is high and lifted up, transcendent and other. A Jewish soul cannot seek union with Jehovah, for
the very conception appears blasphemous to it; the created being does not raise itself to equality with its creator. Therefore, although monastic groups are found in all three religious traditions, all three also contain strands of thought that are antithetical to monasticism in the Hindu or Buddhist sense: the divine creation of the body; the salvation of society, as well as of the individual spirit; the faith in a transcendent God and its corollary, the distrust of any uncontrolled search for mystical unity. Judaism and Islam have been less friendly to monasticism than has Christianity. Muhammad declared that there are no monks in Islam and made no mention of them in the Qur'an (Koran). Despite certain anticipations of monasticism among the Jews contemporary with Christ, there were no Christian monks, properly speaking, for two centuries after Christ's death. Monasticism is not inherent in Christianity, and monasteries have never been as integral to the practice of Christianity as they are to the practice of Buddhism. Partly for this reason Christianity has also produced many critics of monasticism. Even at the end of the fourth century, when monastic ideals were rapidly spreading throughout the Christian church, the Latin writer Vigilantius denounced the solitary life as a cowardly abandonment of responsibility. During the sixteenth-century Reformation half the western Christian church repudiated the monastic ideal; few examples of it are found among Protestant churches. Yet, the contemplative and hermit tradition was well represented in western Christendom, although the Benedictine Rule remained dominant. The most highly respected Western order of the quasi-hermit tradition is the Carthusians, founded by St. Bruno in 1084 at the Grande Chartreuse, in southeast France. The mystical element in religion, however, is more diffused than those mysticisms that are dependent upon a division between body and soul. 
All three of the Old Testament religions came into touch with mystical doctrine and were affected. In Syria and Persia, Jewish, Christian, and Muslim doctrines of salvation were akin to those of Hinduism or Buddhism. The recently discovered Dead Sea Scrolls have proved the existence of a Jewish monastic community in the Dead Sea valley about the time of Christ. Within Judaism, the Essenes practiced a form of monastic life, with community of goods, silence, celibacy, poverty. Philo of Alexandria described a community in Egypt, the Therapeutae, whose way of life was so like that of Christian monks that for centuries Christian writers believed them to be Christians. Islam was
not, on the whole, friendly to mysticism or to enclosed monasteries. But as Sufism developed in Persia it acquired strongly mystical doctrines, and monastic groups began to be founded. Christianity assimilated monasticism into its system of religious life more readily than did Judaism or Islam. It was preaching its gospel in a Greek world where Platonic philosophy and religious dualism combined to welcome a doctrine of salvation through withdrawal from society. People who were educated in Greek thought and were later converted to the church sought to interpret Christian theology in accordance with their earlier philosophy. From Platonic philosophy, Christian teachers adopted the language used in speaking about the contemplation of supreme truth or the unity of the soul with the Divinity and in this sense interpreted the New Testament's statements concerning unceasing prayer. In the middle of the third century St. Anthony led a retreat into the Egyptian desert, and only a century later the movement was the strongest religious force in Christendom. The anarchic conditions of secular society helped it to remain so for six centuries. The monks of the Eastern church looked back to St. Basil of Caesarea (who died in 379) as their chief organizer; the monks of the Western church to St. Benedict of Nursia, who founded the house of Monte Cassino, north of Naples, in the sixth century. In the West, the Benedictine Rule was an elementary framework to which other rules and customs were added. The ideal of life was simplicity, not excessive austerity, with seven or eight short services for worship at fixed points in the day and with time allotted to work in the fields and to spiritual reading. The orders descended from the Benedictine varied greatly in their customs. The first Christian monasteries were communities of laymen with a priest or two to celebrate the sacraments. As centuries passed, it was expected that all fully professed monks would be ordained. 
Slowly the forms of worship became more elaborate until they were the main work of the monk—especially among the Cluniacs, whose mother house, Cluny, in Burgundy, was founded in 910. The abbey church at Cluny was the grandest of any monastery in Europe. In reaction against these elaborations the Cistercians—called so after their mother house, of Citeaux, Burgundy—sought to recall the monks to the simplicity of the Benedictine Rule. They restored the obligation to work in the fields and founded their houses in remote wildernesses, where they brought new land under cultivation or grazing. In the Eastern church, there was no coherent
"Rule" of St. Basil similar to the Rule of St. Benedict. Monasticism in Russia and Greece always remained more individualistic in its ethos, nearer to the hermit tradition of Syria and Egypt, with a rich liturgical and contemplative tradition but more remote from society and less influential. Eastern Orthodoxy created a unique monastic republic on the peninsula of Mount Athos in northern Greece, where from the eighth or ninth century a great complex of communities and hermitages began to develop. No woman is yet allowed to set foot upon the peninsula. Nuns are much more numerous in Christianity than in any other religion. The nuns of Buddhism are comparatively few. Every variety of Christian monasticism has made provision for women as well as men, and in modern Roman Catholicism nuns have greatly outnumbered monks. The characteristic government of a monastery springs out of a personal relationship between a holy man and his disciple. A hermit goes into retirement to seek his salvation. He becomes known for his sanctity and moral wisdom. A disciple asks leave to sit at his feet or serve him in his cell, to advance his own salvation. More disciples come, and a group forms around a wise man. They partake of common meals and common worship and have simple rules. Most Hindu monasteries remained loose in organization and hardly passed beyond the stage of having a sage or saint and a few disciples. In many primitive Christian and modern Buddhist monasteries the government remained a loose administration by a group of "elders." Even when the constitution is highly organized, it never quite loses the flavor imparted by its remote origins; the relation is one of a novice to the director of his soul, the experienced elder imparting moral knowledge to the young in years or young in religious experience.
Moreover, the disciple grows in grace by conquering his self-will, both by instant obedience to the commands of his director—even when he does not understand the reason—and by accepting without resentment punishment which looks like injustice. Therefore, the moral nature of this relationship has led to very authoritarian forms of constitution; abbots are superiors with absolute authority, except so far as they are limited by civil law or by a rule of life accepted by them at their entry as novices. Some Buddhist communities grew so large that more elaborate forms of organization became necessary. In a country where every male must be a monk for part of his life, a monastery might rise to become a celibate township of ten thousand souls. However, as in Hindu monasteries, there was
flexibility, and a man might enter or leave the monastery without blame. The most elaborate forms of organization are found in Christian monastic orders, where very early in Christian monastic practice the idea of stability became morally important. It was recognized that a man might try his experience in various directions. But it was also believed that the spiritual life demanded a long course of obedience and a continuity within the same brotherhood. The existence of vows and the demand for stability presented Christian thinkers with deeper constitutional problems. The characteristic Christian monastery, the Benedictine, had an elected abbot who possessed a permanent authority limited only by the provisions of the Rule of St. Benedict. Later medieval orders, such as the Cistercians and the Dominicans, experimented further in forms of organization, while retaining the absolute duty of obedience to the superior; the several houses of the order were placed under a single governing body in which each was represented. All such societies used sanctions to preserve discipline, varying according to the country and century—flogging, confinement, deprivation of food, temporary exclusion from the social and especially from the religious meetings of the community, and, in the last resort, expulsion. The first monks were holy men who sought retirement. They were devoted to poverty. But in both Christian and Buddhist monasticism, the poverty of the individual was compatible with the wealth of the community. Monasteries which began as societies of the poor sometimes ended as rich and powerful corporations. Buddhist and Hindu monks were expected to live off charity. The begging bowl was almost indispensable as equipment, and the round for alms was an almost indispensable part of daily devotion. It was a devotional exercise to receive such alms with humility and tranquillity of spirit, eschewing worldly satisfaction if the alms were given and resentment if they were refused. 
In Christianity the Franciscan friar, especially of the stricter or spiritual group, expected God to provide in the same way. But Christianity possesses a stronger doctrine, that earthly vocations are God-given. Christian monks always believed that they should work for their living on simple tasks that did not distract devotion. Their characteristic work has been agriculture, but basket making, mat making, education, the copying of manuscripts, scholarship, and other forms of work have been accepted as suitable for Christian monks. Buddhist monks have likewise engaged in scholarship, education, agriculture, and the copying of manuscripts, but have not usually considered the earning of a livelihood to be a necessary element in religious devotion. Sanctity attracted gifts. Rich novices might make over their funds to the community, although the practice has obvious dangers and monastic rules tried to regulate or even prevent it. Childless widows or widowers left money or lands; pious kings gave endowments; noblemen found in monasteries a worthy object for their alms. An accepted work was the maintenance of shrines, and pilgrims cast their offerings freely. In some countries monasteries have thus acquired over the centuries an astonishingly large proportion of the national land area. In western Europe between the eighth and eleventh centuries, in Russia between the thirteenth and fifteenth centuries, in Tibet between the seventeenth and twentieth centuries, the whole current of popular piety so flowed toward the monastic ideal that the monasteries came to be a major state institution. The climax of the process was attained in modern Tibet, where the monks formed something like a fifth of the population and where the government of the state was for three centuries controlled by the chief abbot, the Dalai Lama. In certain other countries, the abbots have held temporal prominence in the state. In Ceylon, the abbots were the secular judges and the king's cabinet. In medieval England and other European countries the abbots sat of right in the parliament. Tibet is an example of how the social influence of monasteries has been strongest in countries where nearly all the population was, or is, Buddhist. The explanation probably lies in the doctrinal difference from Christianity. In the Christian churches the monastic way of life has always appeared as only one way to heaven among others, although seen by many as the surest way.
The dogma that its practice was necessary to salvation appeared in a few early Christian sects but was always rejected as heretical and incompatible with the Bible. Buddhism, in contrast, has held that some institutionalized withdrawal is indispensable to the perfect life. In countries like Tibet and Burma almost the entire male population entered monasteries for a shorter or longer time (at least for three months) and wore the yellow robes. Before the Europeans opened schools in Burma and before the Chinese conquered Tibet, the monks were the only schoolmasters, and every educated layman was familiar with the life and devotion of the monastery. The absence in Buddhism of irrevocable vows, the freedom to leave a monastery without blame, made this possible. In western Europe of the early Middle Ages the monasteries
made important contributions to such education as existed, but in no Christian country were they for any length of time the sole instrument of education. Monks have sometimes been of momentous importance in preserving and transmitting the cultural heritage of a people. From the seventh to the tenth centuries, Christian monasteries helped to preserve the libraries and knowledge inherited from Greco-Roman civilization. But monasticism has never possessed a social drive or consciousness and has acted as a main channel or focus of culture only by accident or as a by-product of other activities and in special (often anarchic) social circumstances. Nevertheless, the most notable of all monastic contributions to learning came from the Benedictine congregation of St. Maur, in France, during the late seventeenth and early eighteenth centuries, where a group of eminent scholars headed by Jean Mabillon laid the foundations for the modern critical study of historical sources. On the edge of the monastic groups proper, brotherhoods have existed which accepted various monastic obligations, although their members lived in the world. In Catholicism the Jesuits accepted the threefold vows of poverty, chastity, and obedience but were nevertheless secular priests living in the world and engaged in pastoral care or in education; in modern times this sort of example led older and originally contemplative orders, like the Benedictine, to accept responsibility for education or pastoral duties outside the monastery. In Tibet there were warrior-monks. Medieval Catholicism had its orders of crusading knights, like the Templars and Hospitalers, who took vows but were devoted to the defense of Christendom by force of arms. Such orders could achieve political authority, just as the Hospitalers ruled Malta and the Teutonic Knights founded the state which later became the duchy of Prussia. 
Under the Turkish empire the Baktashiyah were a similar military and quasi-monastic order connected with the Janissaries. In Islam—outside the mystical tradition of the Sufis—a majority of the "monasteries" have been of this quasi-monastic type of religious brotherhood with special duties in the world. A modern religious brotherhood of this kind, the Senusi, was founded as late as 1837 and came, in time, to achieve political control of part of the Sudan and nearly all the eastern Sahara. The concern of Christian doctrine for this world and its society meant that the monastic ideal took forms very different from the original societies, which were directed to individual salvation and contemplative prayer. The most celebrated of these novel forms is found in the friars, founded by
St. Francis of Assisi and by St. Dominic at the beginning of the thirteenth century. Francis called for poverty and simplicity as a protest against the elaborate and powerful church of his day; Dominic wished to protect the church by confuting heretics. But both orders of friars which resulted from their initiative were devoted to saving the souls of others as well as their own souls. They are an important example of how monastic groups could become agents for evangelism and the propagation of the faith. In modern times some secular governments, such as Mexico, Russia, and China, have confiscated much or all monastic property, whether Christian or Buddhist, as useless to the state and have secularized their inmates. Even in a still formally Christian country like Greece, where most of the inhabitants profess the Orthodox religion, there has been a spectacular decline, even on Mount Athos, in the number and the reputation of the monks. But as it is impossible to conceive of Buddhism without Buddhist monks, so the course of centuries has made it impossible to conceive of Catholic Christianity without varieties of Christian monks. Quiet withdrawal in the face of eternity appears to meet a need of the highest aspirations of the human conscience.
W. O. CHADWICK
[See also RELIGIOUS SPECIALISTS. A guide to other relevant material may be found under RELIGION.]
BIBLIOGRAPHY
BUTLER, EDWARD C. (1919) 1962 Benedictine Monachism: Studies in Benedictine Life and Rule. 2d ed. New York: Barnes & Noble.
CABROL, FERNAND 1916 Monasticism. Volume 8, pages 781-797 in Encyclopaedia of Religion and Ethics. Edited by James Hastings. New York: Scribner.
CHADWICK, OWEN (1950) 1967 John Cassian: A Study in Primitive Monasticism. 2d ed. Cambridge Univ. Press.
COULTON, GEORGE G. 1923-1950 Five Centuries of Religion. 4 vols. Cambridge Univ. Press.
FARQUHAR, JOHN N. (editor) 1916-1938 The Religious Life of India. 13 vols. London: Milford.
HEIMBUCHER, MAX (1896-1897) 1933-1934 Die Orden und Kongregationen der katholischen Kirche. 3d ed., rev. & enl. 2 vols. Paderborn (Germany): Schöningh.
KNOWLES, DAVID (1940) 1963 The Monastic Order in England: A History of Its Development From the Times of St. Dunstan to the Fourth Lateran Council, 940-1216. 2d ed. Cambridge Univ. Press.
KNOWLES, DAVID 1948-1959 The Religious Orders in England. 3 vols. Cambridge Univ. Press.
WADDELL, LAURENCE A. (1895) 1958 The Buddhism of Tibet: Or, Lamaism, With Its Mystic Cults, Symbolism and Mythology, and in Its Relation to Indian Buddhism. 2d ed. Cambridge: Heffer.
WARD, CHARLES H. S. (1934) 1947-1952 Buddhism. Rev. ed. 2 vols. London: Epworth. -> First published as Outline of Buddhism. Volume 1: Hinayana. Volume 2: Mahayana.
WORKMAN, HERBERT B. (1913) 1927 The Evolution of the Monastic Ideal From the Earliest Times Down to the Coming of the Friars: A Second Chapter in the History of Christian Renunciation. London: Epworth. -> A paperback edition was published in 1962 by Beacon.
MONETARY POLICY
In its broadest sense, monetary policy includes all actions of governments, central banks, and other public authorities that influence the quantity of money and bank credit. It therefore embraces policies relating to such things as choice of the nation's monetary standard; determination of the value of the monetary unit in terms of a metal or foreign currencies; determination of the types and amounts of the government's own monetary issues; establishment of a central banking system and determination of its powers and rules for its operation; and policies concerning the establishment and regulation of commercial banks and other related financial institutions. A few even extend the meaning of monetary policy to include official actions affecting not only the quantity of money but also its rate of expenditure, thus embracing government tax, expenditure, lending, and debt management policies. It has become customary, however, to define monetary policy in a more restricted sense and to exclude from it choices relating to the broad legal and institutional framework of the monetary and banking system. This narrower concept will be employed here. Monetary policy in this sense refers to regulation of the supply of money and bank credit for the promotion of selected objectives. Elements of monetary policy. Like all economic policies, monetary policy has three interrelated elements: selection of objectives, implementation, and at least an implicit theory of the relationships between actions and effects. All three elements present problems of choice and are continuing subjects of controversy. Monetary policy can be directed toward achieving many different objectives.
For example, the supply of money can be regulated to provide the government with cheap or even costless funds, to maintain interest rates at some selected level, to regulate the exchange rate on the nation's currency, to protect the nation's gold and other international reserves, to stabilize domestic price levels, to promote continuously high levels of employment, and so on. Such multiple objectives are unlikely to be fully compatible at all times. Rational policy
making therefore requires identification of the various objectives, analysis of the extent to which they are or can be made compatible, and choices from among those that conflict with one another. A later section will stress changes in the objectives of monetary policy and some of the problems of reconciling them. The role played by monetary policy in promoting selected economic objectives depends greatly on the nature of the economic system and on attitudes toward the use of other methods of regulation. This role is usually secondary in economies characterized by government operation of most economic enterprises and government control of resource allocation, distribution of output, and prices of inputs and outputs. Even in these economies monetary policy is not trivial. An excessive supply of money can create excessive demand and inflationary pressures, which are evidenced in black markets, hoarding, and bare shelves. On the other hand, a deficient supply of money can impede the flow of production and trade. Yet the major function of monetary policy in such economies is that of passive accommodation, that is, to provide the amount of money needed to facilitate the operation of other government controls; it is not to serve as a prime regulator. Monetary policy usually plays a more positive regulatory role in economic systems that rely heavily on market forces to organize and direct processes of production and distribution. In such economies, decisions of business firms relating to rates of output, amounts of labor employed, rates of capital formation, and so on, are strongly influenced by relationships between costs and actual and prospective demands for output. If aggregate demands are deficient, firms will not find it profitable to employ all available labor, to utilize fully existing capacity, or to purchase all the new capital goods that could be produced. On the other hand, excessive aggregate demands for output are inflationary. 
A major function of monetary policy, therefore, is to regulate the behavior of aggregate demand for output in order to elicit a more favorable performance by the economy. This function is shared with fiscal policy in many countries and in many different combinations or "mixes." Although the deliberate use of fiscal policy for this purpose has increased considerably in recent decades, monetary policy continues to be a major instrument. Primary responsibility for administering monetary policies is usually entrusted to central banks, although there are varying degrees of government control of central banks and their policies. Central banks regulate the money supply and influence the
supply of credit in two principal separate but closely related capacities: as controllers of their own issues of money and as regulators of the amount of money created by commercial banks. Both are important, but their relative importance depends in part on the stage of financial development of the country and on the types of money employed. In countries where bank deposits have not yet come to be widely used, notes issued by the central bank often constitute a major part of the money supply. In such cases the central bank may regulate the money supply largely by controlling directly its own note issues. However, in countries that have reached a later stage of financial development, central bank notes constitute a smaller part of the money supply; deposits at commercial banks are the major component, and the actions of commercial banks directly account for a large part of the fluctuations of the money supply. In such countries, the central bank is primarily a regulator of the commercial banks, although control of its own money creation remains important and is a part of the process. The terms "monetary policy" and "credit policy" are often used interchangeably or with only slightly different shades of meaning. This has come about primarily because in most modern systems the creation and destruction of money by central and commercial banks are so closely intertwined with their expansion and contraction of credit. They typically create and issue money (currency and deposits) by making loans or purchasing securities, usually debt obligations. Thus, one side of the transaction is the issue of money; the other is the provision of funds to borrowers or sellers of securities, which tends to lower interest rates. Central and commercial banks typically withdraw money (currency and deposits) by decreasing their outstanding loans or by selling securities, usually debt obligations. 
Thus there is both a decrease in the supply of money and a decrease in the funds available to borrowers and to purchasers of the securities sold by the banks, which tends to increase interest rates. Those who speak of monetary policy tend to focus on the behavior of the stock of money, while those who speak of credit policy tend to focus on the quantity of loan funds available from the central and commercial banks. Such differences in focus need not lead to differences in either analysis or conclusions. Yet they sometimes do. Those who focus on the stock of money are more likely to stress "real balance effects" on both consumption and investment spending, while those who focus on credit are likely to put more stress on the direct
effects on interest rates, the availability of funds, and investment. Monetary theory has made considerable progress in reconciling and integrating these approaches, but much remains to be done. The third element in monetary policy is at least an implicit theory of the relationships between actions and effects. If its actions are to promote its objectives, the monetary authority needs some theory as to the nature, direction, magnitude, and timing of the responses. The relevant responses are numerous and on several levels. For example, they include the response of the supply of money and credit; the response of aggregate demand for output; and the responses of real output, employment, and prices. There are still disagreements among both economists and central bankers on many of these theoretical and empirical issues, and these disagreements underlie many continuing controversies over the proper nature and scope of monetary policy. Some of these will be treated in a later section. Evolution of objectives. Monetary policy, in the modern sense of deliberate and continuous management of the money supply to promote selected social and economic objectives, is largely a product of the twentieth century, especially the decades since World War I. In the earlier period, when most countries were on either a gold or a bimetallic standard, the primary and overriding objective of monetary policy was to maintain redeemability of the nation's money in the primary metal, both domestically and internationally. A decline of the nation's metallic reserves to dangerously low levels, or any other threat to redeemability, became a signal for monetary and credit restriction, whatever might be its other economic effects. When redeemability seemed secure, monetary policy was used to promote other objectives: to deal with panics, crises, and other credit stringencies and even to expand money somewhat when business was depressed.
But such intervention was sporadic rather than continuous and its purposes limited rather than ambitious. The international gold standard of the pre-1914 period was not purely automatic, but it was managed only marginally. Many forces have contributed to the change and growth of monetary policy since World War I. One set of forces includes the breakdown of the international gold standard and other changes and crises in monetary systems: inflation during and following World War I and the long period of suspension of gold redeemability in most countries, the changed and insecure nature of the gold and gold exchange standards re-established in the 1920s, the renewed breakdown of gold standards during the
great depression of the 1930s, and world-wide inflation during and following World War II. All these had profound effects on attitudes toward monetary policy. Both countries that had too little gold and those that had too much shifted to the view that the state of their gold reserves was no longer an adequate guide to policy and that new objectives and guides should be developed. Monetary actions became increasingly less sporadic and limited and more continuous and ambitious in scope. The objectives of monetary policy have also been powerfully influenced by changes in attitudes concerning the responsibilities of central banks and governments for the performance of the economy. The 1920s witnessed growing demands that some central agency reduce instability of price levels and business activity. These demands were strengthened immeasurably by the economic catastrophe of the 1930s and by fears that World War II would be followed by another world-wide depression. Within a few years after that war the governments of almost all Western nations had formally assumed responsibility for promoting continuously high levels of employment and output. And within a few more years almost all of these governments had signified their intentions to promote economic growth. Monetary policy is required, in some cases by government and in others by the force of public opinion and pressure, to contribute to such objectives. Although often phrased in different terms, it is now common for monetary authorities to state four major or basic objectives of monetary policy: (1) continuously high levels of employment and output, (2) the highest sustainable rate of economic growth, (3) relatively stable domestic price levels, and (4) maintenance of a stable exchange rate for the nation's currency and protection of its international reserve position.
In some countries monetary policy is also influenced by other considerations, such as a desire to maintain low interest rates to facilitate government finance or other favored types of economic activity. Conflicts of objectives. Some of the most basic problems of monetary policy relate to the compatibility of such multiple objectives. Can all these be achieved simultaneously and to an acceptable degree even if a nation has precise control of the behavior of aggregate demand for output? Of course, the answer depends in part on the ambitiousness of the goals; perfection in all respects is hardly to be expected. The answer also depends to an important extent on the responses of output, employment, money wage rates, and prices to changes in aggregate demand. The most favorable case is that in which the
supply of output is completely elastic at existing price levels up to the point of "full employment" and capacity output. In such cases, increases of demand would elicit only increases in output until the economy reached its maximum capacity to produce. Price inflation would appear only when demand became excessive relative to productive capacity. Problems of reconciling objectives relating to output, employment, and price level stability arise, however, when the supply of output does not respond in such a favorable manner to increases of demand, that is, when prices rise before the economy has neared its capacity to produce. Even in the face of considerable amounts of unemployment, average money wage rates may rise faster than average output per man-hour, thereby tending to raise costs of production. And for this, or other reasons, business firms may raise the prices of their products even though considerable amounts of excess capacity persist. Under such conditions it may be impossible to achieve all objectives, to acceptable degrees, solely by controlling aggregate demand. Levels of demand sufficient to elicit "full employment" and capacity output may bring inflation, while levels of demand low enough to assure stability of price levels may leave large amounts of unemployment and unused capacity. Because of such difficulties, many economists and other observers have come to believe that objectives relating to output, employment, and price levels can be reconciled satisfactorily only if regulation of aggregate demand through monetary and fiscal policies is supplemented by measures designed to elicit more favorable responses by the economy. These measures are of several types, which can only be listed here: (1) reform of wage-making processes in order to avoid inflationary increases of money wage rates, (2) decrease of monopoly power in industry, and (3) increase of regional and occupational mobility of labor.
The above discussion related to possible conflicts among a nation's multiple domestic objectives. One, or more, of these domestic objectives may also conflict with the nation's international objectives of maintaining a stable exchange rate for its currency and of protecting its international reserve position. Fortunately, domestic and international objectives do not always conflict. For example, a nation may have a deficit in its balance of payments primarily because of excessive domestic demands and rising prices. In such cases, restrictive monetary policies may be appropriate for both domestic and international reasons. On the other hand, a nation may have a surplus in its balance of payments primarily
because of unemployment and depressed output and incomes at home, which depress its demands for imports. In this case an expansionary monetary policy will promote both its domestic and international objectives. Cases do arise, however, in which domestic objectives and the objectives of maintaining stable exchange rates and a balance in international payments come into conflict. For example, a nation may have a large and persistent surplus in its balance of payments while demands for its output are so large as to bring actual or threatened inflation. An expansionary monetary policy, aimed at reducing the surplus in its balance of payments, would increase inflationary pressures at home; while a restrictive policy, aimed at inhibiting domestic inflation, would continue, and perhaps even increase, the surplus in its balance of payments. A nation faced with this situation may be compelled to sacrifice its domestic objective of preventing inflation or to increase the exchange rate on its currency in order to decrease the value of its exports relative to its imports. Considered by most countries to be even more serious is the situation in which there is a large and persistent deficit in the balance of payments combined with actual or threatened excess unemployment at home. Employing expansionary monetary and fiscal policies to increase domestic demand and eradicate excess unemployment would tend to widen the deficit in the nation's balance of payments and to drain away its international reserves. But employing restrictive policies to eradicate the deficit in its balance of payments would increase unemployment at home. The nation may be forced to sacrifice its domestic objectives relating to employment, output, and growth or to lower the exchange rate on its currency. Because of such conflicts, many economists have become critical of arrangements under which exchange rates remain fixed over long periods of time. 
They see little merit in stable exchange rates as such and would alter them whenever they conflict with important economic objectives. However, their prescriptions vary widely. For example, some favor stability of exchange rates most of the time with adjustments only in case of "fundamental disequilibrium." Others favor continuously flexible exchange rates, with or without official intervention to influence their behavior. The entire field of exchange rate policy remains highly controversial. [See INTERNATIONAL MONETARY ECONOMICS, article on EXCHANGE RATES.] Monetary policy and aggregate demand. The preceding sections dealt with some of the problems that would be encountered in promoting multiple economic objectives simultaneously, even if the monetary authority possessed precise control over the behavior of aggregate demand for output. But it is unsafe to assume without analysis that the monetary authority, or even the monetary authority together with the fiscal authorities, can control aggregate demand precisely. The monetary authority has no direct control over aggregate demand for output or over any of its major components, such as demands for consumption, for investment or capital formation, for government use, or for export. Its powers are largely confined to regulation of the supply of money and credit. Even at this level its controls may lack precision. Presumably the central bank can accurately control its own creation and destruction of money; but its control of the creation and destruction of money and credit by the commercial banking system, exercised largely through its control over the reserve position of the banks, may be less accurate. And even if the monetary authority has precise control of the money supply, aggregate demand for output may not respond in a uniform or precisely predictable manner; the income velocity, or rate of expenditure, of money may fluctuate. Thus there are many links in the chain of causation from central bank action to the reaction of aggregate demand and many possibilities of slippage. The effectiveness of monetary policy as a regulator of aggregate demand does not depend on the existence of some fixed relationship between the supply of money and aggregate demand. It requires only that changes in the money supply influence aggregate demand in the desired direction and in a predictable way and that the monetary authority have power to change the money supply to the extent required to offset adverse variations in the income velocity of money.
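The role of income velocity in this argument can be illustrated with the familiar equation of exchange. The article itself does not state this identity; the symbols M, V, P, Y, and the target growth rate g* are introduced here purely for illustration, and the growth-rate form is only an approximation for small changes.

```latex
% Illustrative sketch only; the symbols below are assumptions, not the article's notation.
% M = money supply, V = income velocity of money,
% P = price level, Y = real output, so that PY is aggregate money demand for output.
\[
  M V = P Y
\]
% Approximate growth-rate form, valid for small proportional changes:
\[
  \frac{\Delta M}{M} + \frac{\Delta V}{V} \approx \frac{\Delta (PY)}{PY}
\]
% To keep aggregate demand growing at a target rate g^{*} while velocity changes,
% the money supply would have to change by roughly
\[
  \frac{\Delta M}{M} \approx g^{*} - \frac{\Delta V}{V}
\]
```

Under this sketch, if velocity fell by 2 percent while the target growth of aggregate demand were 4 percent, the money supply would have to grow by about 6 percent. No fixed value of V is required, only that its movements be foreseeable enough to offset.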
However, the possibility of control of aggregate demand does suffer to the extent that changes in money supply fail to affect aggregate demand, that the power of the monetary authority to change the money supply is limited, and that the relationship between the money supply and aggregate demand is unpredictable. Few economists doubt the ability of monetary policy, in the absence of strong cyclical forces, to regulate effectively the secular behavior of both the money supply and aggregate demand for output. Secular changes in the velocity of money are usually gradual and can be allowed for in determining the appropriate rate of change of the money supply. There is much less agreement, however, concerning the effectiveness of monetary policy alone for offsetting cyclical forces and stabilizing aggregate demand over the various phases of the business cycle. Monetary policy meets its most severe test in dealing with the strong forces that cause recessions or depressions. Consider the extreme case in which an economy has slipped into a severe depression with widespread unemployment and unused capacity. Under such conditions businessmen are likely to view the future pessimistically and to see few opportunities for investment in capital facilities that promise favorable rates of return. Their demand functions for output to be used for capital formation may be so low that only extremely low interest rates, perhaps rates approaching zero, would induce them to invest enough to lift the economy back toward full-employment levels. But monetary policy may be incapable of depressing interest rates, and especially long-term rates, to such low levels. The monetary authority may encounter difficulties in increasing the money supply under such conditions because the banks prefer to hold excess reserves rather than lend and take risks. Interest rates, and especially long-term rates, may fall only sluggishly, even in the face of large increases in the money supply. One reason for this is the fear of default by borrowers under depression conditions. John Maynard Keynes suggested another reason, his famous "liquidity trap." He argued that there was some long-term rate of interest, not far below that previously prevailing, that the public considered "normal," in the sense that it would again prevail. No one would hold securities at lower yields because of fear of capital losses when interest rates returned to their normal levels. Below this normal rate the public would increase its holdings of money balances indefinitely rather than lend at a lower rate.
Thus monetary policy may be incapable of lowering interest rates enough to offset the decline of investment demand functions, and recovery may be delayed until something increases the expected profitability of private investment or until the government adopts expansionary fiscal policies. In how many cases would a well-conceived and well-executed monetary policy prove incapable of dealing with depressive forces? On this there is still lack of agreement among economists. Some have argued that experience during the great depression proved the ineffectiveness of monetary policy. This experience is hardly relevant to the present question, however, because the monetary policies of that period were hardly exemplary. To protect gold standards or for other reasons, many countries actually followed deflationary monetary policies for a considerable period. Expansionary
policies were in many cases initiated only after a long delay, during which excess capacity had become widespread, expectations had deteriorated, and the entire financial system had come under serious strain. It may well be that in this and other recessions an ambitious expansionary monetary policy introduced promptly after the downturn would have proved effective in arresting the decline of aggregate demand. However, many economists —including some who are optimists about the effectiveness of monetary policy—believe that monetary policy alone may not be potent enough to offset strong depressive forces and that expansionary fiscal policies should also be employed under such conditions. It is generally conceded that well-conceived monetary policies can be more effective in restricting increases in aggregate demand during the prosperity phases of business cycles. However, such prosperity periods are usually characterized by increases in aggregate demand relative to the money supply. This increase in the income velocity of money, or "economizing of money balances relative to expenditures," reflects several forces that usually accompany prosperity—greater optimism on the part of both households and business firms concerning their future receipts of income, which decreases the amounts of money held against contingencies; more profitable opportunities for investing idle balances held by business firms; and rising interest rates. Theorists have tended to stress, perhaps to overstress, the role played by rising interest rates. The rise of investment demand during prosperity tends to raise interest rates, and the rise of rates is accentuated by a restrictive monetary policy. In turn, the availability of higher yields on other assets induces both business firms and households to economize their holdings of money balances that yield no interest. 
Such increases of velocity—induced in part, but only in part, by restrictive monetary policy—do constitute a slippage in the operation of monetary policy. This does not mean that monetary policy is rendered ineffective; it means only that larger restrictive actions are required to achieve any specified amount of restriction of aggregate demand. Of course, the monetary authority may be unable or unwilling to restrict money to the required extent. For example, it may be inhibited by inadequacy of the control instruments currently at its disposal, fear that further restriction would precipitate a recession, dislike of high interest rates, or charges that credit restriction discriminates against both new and small business firms. However, these are not limitations on the capability of monetary
policy to restrict aggregate demand. They are only considerations affecting the willingness of the monetary authority to use its powers of restriction. Lags in monetary policy. The effectiveness of monetary policy as a countercyclical instrument depends heavily on the quickness of policy action and the quickness of response of the economy. Ideally, policy actions would be taken as soon as adverse developments appeared, or even in anticipation of such developments; and there would be an immediate and full response of aggregate demand and of such policy objectives as employment and output. Under such ideal conditions a high degree of stability might be maintained continuously. In practice, of course, such ideal performance is not realized. Economists have long recognized three lags in monetary policy: (1) the recognition lag—the interval between the time when a need for action develops and the time the need is recognized; (2) the administrative lag—the interval between recognition and the actual policy action; and (3) the operational lag—the interval between policy action and the time that the policy objectives, such as output and employment, respond fully. Both the length and significance of these lags depend heavily on the reliability of economic forecasting. If developments could be reliably forecast well in advance, the first two lags could be eliminated and actions could be taken soon enough to allow for the operational lag. But when economic forecasting is unreliable the monetary authority is likely to wait until a development appears before taking action to deal with it. In such cases the length of the operational lag becomes highly important for countercyclical policy. Those who favor flexible countercyclical monetary policies implicitly assume that the operational lag is rather short, that all or most of the effects of a monetary action will be achieved within a few months or a year. [See PREDICTION AND FORECASTING, ECONOMIC.] 
This view has been challenged by some economists, notably by Milton Friedman. These economists contend that the responses to a given monetary action are distributed over time and that the full effects are realized only after a lag of considerably more than a year. Because of this, monetary actions taken to counter cyclical fluctuations may actually produce, or at least accentuate, these fluctuations. For example, expansionary policy actions taken to counter recession may have little effect for several months and then achieve their full expansionary effects on aggregate demand only when the economy is in its next boom phase. And actions taken to restrict aggregate demand during a boom
may in fact precipitate and accentuate an ensuing depression. For this and other reasons, members of this school oppose flexible countercyclical monetary policies. They believe that a greater degree of stability will be achieved by a monetary policy aimed at a steady growth of the money supply, regardless of cyclical conditions. This growth should be at an annual rate approximating the growth rate of real gross national product.
This whole question, which is obviously crucial for countercyclical monetary policy, remains unresolved and controversial. Friedman's theoretical and statistical arguments have been strongly challenged but not wholly refuted. Much more research is needed on both the magnitude and timing of responses to monetary policy actions. The same applies to the various types of fiscal policy actions.
Monetary and fiscal policies. Nations face complex problems in determining the relative roles to be played by monetary policies and by the various types of government expenditure and tax policies in promoting the economic objectives described earlier. Only a few of the considerations determining these relative roles can be mentioned here. One is, of course, the whole set of cultural, institutional, and political conditions determining the actual availability of these policy instruments. For example, in some countries it is in fact acceptable to use government tax and expenditure policies in a timely and flexible manner. Other governments are not yet in this position. Still others may find it possible to reduce taxes or increase expenditures to support aggregate demand but not to restrict it by fiscal measures. There can also be comparable differences in the actual availability of monetary policy instruments. Also relevant are judgments concerning the relative effectiveness of monetary and fiscal policies in achieving some desired behavior of aggregate demand.
For example, an expansionary fiscal policy may be judged to be necessary to promote quick recovery from depression conditions but to be no more effective than monetary policy in restricting increases of demand.
The optimum mix of monetary and fiscal policies also depends in part on the nature of economic objectives and on their relative priorities. Suppose that it is possible to achieve some selected level of aggregate demand with various combinations of monetary and fiscal policies: with, say, some restrictive fiscal policy and some expansionary monetary policy, or with some expansionary fiscal policy and some restrictive monetary policy. This level of aggregate demand can reflect various combinations of consumption and capital formation. If the objective is only to achieve some selected level of total output and employment, without regard to the distribution of output between consumption and capital formation, many different combinations of monetary and fiscal policies may be equally acceptable. But this may cease to be true if promotion of economic growth through a higher rate of capital formation is also an objective. For this purpose a restrictive fiscal policy and an easy monetary policy may be most appropriate. Large taxes relative to government expenditures for current purposes can be used to force the nation to consume a smaller part, and to save a larger part, of its total income; and an easy monetary policy, instituted to lower interest rates, can encourage the use of savings for capital formation.
A somewhat different case is that in which a nation wishes to raise aggregate demand for its output while it faces an undesired deficit in its balance of payments. Both expansionary fiscal policies and expansionary monetary policies tend to increase the deficit in the balance of payments to the extent that they succeed in raising aggregate demand, which in turn increases imports. But an expansionary monetary policy, which lowers interest rates, will also tend to increase capital outflows or at least to reduce capital inflows. In such a situation, an optimum policy mix may require more expansionary fiscal policies to raise domestic demand, together with a less expansionary monetary policy to support interest rates and attract capital inflows or at least to retard capital outflows. These are but a few of the many considerations that determine the relative roles of monetary and fiscal policies.
These relative roles have changed markedly in recent decades and are likely to continue to change with changes in the nature and relative priorities of economic objectives, with changes in attitudes toward the flexible use of fiscal policies for stabilization purposes, and with changes in our knowledge concerning the magnitudes and timing of responses to various types of both monetary and fiscal actions.
LESTER V. CHANDLER
[See also FISCAL POLICY and MONEY.]
BIBLIOGRAPHY
COMMISSION ON MONEY AND CREDIT 1961 Money and Credit: Their Influence on Jobs, Prices and Growth. Englewood Cliffs, N.J.: Prentice-Hall.
CULBERTSON, J. M. 1960 Friedman on the Lag in Effect of Monetary Policy. Journal of Political Economy 68:617-621.
CULBERTSON, J. M. 1961 The Lag in Effect of Monetary Policy: Reply. Journal of Political Economy 69:467-477.
FRIEDMAN, MILTON 1961 The Lag in Effect of Monetary Policy. Journal of Political Economy 69:447-466.
GREAT BRITAIN, COMMITTEE ON THE WORKING OF THE MONETARY SYSTEM 1959 Report. Papers by Command, Cmnd. 827. London: H.M. Stationery Office. Known as the Radcliffe Report.
SCAMMELL, W. M. (1957) 1962 International Monetary Policy. 2d ed. London: Macmillan; New York: St. Martin's.
YEAGER, LELAND B. (editor) 1962 In Search of a Monetary Constitution. Cambridge, Mass.: Harvard Univ. Press.
MONETARY REFORM See under MONEY.
MONEY Monetary theory is discussed in the first two articles under this entry and in LIQUIDITY PREFERENCE and INTEREST. For monetary policy and institutions, see the last article under this entry and BANKING; BANKING, CENTRAL; CREDIT; FINANCIAL INTERMEDIARIES; and MONETARY POLICY. Related material is covered in INFLATION AND DEFLATION. For the international aspects of money, see INTERNATIONAL MONETARY ECONOMICS.
i. GENERAL    Albert Gailord Hart
ii. QUANTITY THEORY    Milton Friedman
iii. VELOCITY OF CIRCULATION    Richard T. Selden
iv. MONETARY REFORM    Fred H. Klopstock
GENERAL
The term "money" has accumulated such a wealth of connotations and variant uses that it is perhaps more serviceable as an adjective than as a noun. The most useful definition of the term as a noun seems to be this: an extremely liquid asset, measured in a standard unit of account and capable with certainty of discharging debts expressed in that unit. As applied to the United States at the present time, this definition includes in money the circulating stock of metallic small change, Federal Reserve Notes and other paper currency, and also the stock of commercial bank deposits with checking privileges. Since the definition just proposed makes "moneyness" a matter of degree (because of the relativity inherent in the terms "extremely liquid" and "with certainty"), it may be construed either to include or to exclude from the stock of money in the United States such liquid claims as certificates of deposit issued by commercial banks, Treasury bills, savings deposits, and "shares" in savings and loan associations. Wherever the frontier is drawn between money and what the International Monetary Fund in its compilations calls "quasi money," we must always expect to find some types of quasi money which rank almost at the monetary extreme of the relativistic scales of "extremely liquid" and "capable with certainty of discharging debts."
The proposed definition implies differences in the list of things which constitute money, both between different societies and through time within a given society. In rare cases where there are unusually sharp cleavages in attitudes and expectations within a given society, it may even imply different moneys for different groups within that society. The underlying concepts "unit of account," "debt," and "liquid asset" obviously have to be interpreted to put content into the proposed definition of money.
Unit of account means a unit (such as the U.S. dollar) which by convention (whether with or without supporting pressure from government) is accepted in a society to value commodities and services sold, to compute costs, to reckon wealth, and to state debts. Such units have been used to facilitate thinking about economic matters through much of human history. In Biblical annals, for example, we find as early as Genesis 23 an account of Abraham buying a field "for the full price . . . [of] four hundred shekels of silver according to the weights current among the merchants"; and the New Testament pictures Jesus as taking it for granted that the unsophisticated people to whom he addressed his parables could readily think in monetary units. As may be seen from the fact that the more venerable units of account (shekels, pounds, and the like) correspond to units of weight, most societies until recently have thought of their units of account as expressing the value of a stated weight of gold or silver; but since paper money came into general use in the nineteenth century, units of account have become more and more abstract.
It should be noted that a society at a given time may be using one or more units of account.
A debt is an obligation on the part of one economic unit (person, firm, or government body) to another, expressed in a standard unit of account. For the debtor, a debt is negative wealth, but negative wealth expressed in the unit of account, whereas other types of negative wealth (such as a contract to deliver 1,000 bushels of wheat next month) are expressed in physical units. Since every debt obligation is two-sided, the obverse of each debt payable is a claim receivable, which constitutes an asset (wealth) for the creditor. Money in most present-day societies consists chiefly of claims upon debtors who are central governments, central or commercial banks, or other credit institutions.
For an asset to be liquid, it must be either money or else something which quickly and with a high degree of certainty can be converted into a known amount of money. Since "quickly" and "with a high degree of certainty" are both relative expressions, assets can evidently be more or less liquid. The concept of a "liquid asset," furthermore, is subjective; it is defined from the point of view of the holder (the creditor, if the asset is a debt claim). In rating assets as more or less liquid, therefore, we should not ask whether all units of such an asset could in fact be converted into money, but whether each unit can be so converted in the opinion of its holder. This view admits of a shift of opinion through time. Such a shift may be quite sudden, with holders today viewing as illiquid assets which a short time ago they viewed as highly liquid. It seems clear that there were such sudden shifts in the liquidity attributed to deposits in individual banks during the great epidemic of bank failures in the United States in 1930-1933. There may also have been such a sudden shift in the liquidity attributed to government securities at the time of the "monetary accord" of 1951, which terminated the rigid support of government securities prices by the Federal Reserve.
Most liquid assets in the United States consist of short-term claims upon the national government, upon the Federal Reserve banks, upon commercial banks, or upon "nonbank credit institutions," particularly mutual savings banks, savings and loan associations, and, in the view of some analysts, also credit unions and life insurance companies. For some holders and upon some occasions, liquid assets may include inventories of commodities, short-term claims upon firms which are not credit institutions, longer-term government securities, and even listed stocks.
For holders in small countries, a large part of the stock of liquid assets may consist of claims upon banks, government bodies, etc., outside the country in question. Such use of foreign claims may or may not involve the use of foreign units of account in domestic dealings and calculations. The meaning of money may be illuminated further by reference to two other terms, not used in the proposed definition. Legal tender is that which is established by governmental rules as a satisfactory medium for settling debts in case of dispute. Anything that is legal tender must be money; but often (as with checking deposits in the United States) large parts of the money stock may be excluded from legal tender. But one might paraphrase the definition of money as "that which is by custom treated like legal tender." A monetary standard may be defined as a fixed
relation between the unit of account and the standard commodity. Such a standard is, in the inspired definition of D. H. Robertson, an arrangement by which "a country keeps the value of its monetary unit and the value of a defined weight of gold [or other standard commodity] at an equality with each other" (1922, p. 134). In a "full" gold or silver standard, such as existed in many countries before World War I, this equality of value was maintained through the free convertibility of monetary metal, metal coins, and paper money. Such an arrangement based on gold (or possibly on a bimetallic standard with both gold and silver coins of full weight) was regarded as normal for a developed industrial economy. With the disappearance of gold coins, the restriction in most centers of dealings in monetary gold bars to "official" dealers, and the fading of the tradition of permanence in monetary arrangements, the standard has been modified. The United States today may be described as operating a "limited, provisional, gold-bullion standard," and similar descriptions would apply to the other major industrial countries. Many countries choose to treat the currency units of other countries as their "standard commodities" and may be described as on a "sterling-exchange standard" or a "dollar-exchange standard." [See INTERNATIONAL MONETARY ECONOMICS, article on INTERNATIONAL MONETARY ORGANIZATION.]
It should be noted that many economists prefer to define money more informally than is proposed above: simply as "that which constitutes means of payment." This is an easy and useful way to convey a correct general impression. But it is hard to give a precise meaning to "means of payment." Strictly, the immediate means of payment for most goods and services sold is the establishment of "book credit": the buyer recognizes a debt to the seller for merchandise supplied or for services rendered. (In the broad sweep of economic operations, "cash and carry" transactions are exceptional.) But no economist finds it convenient to regard book credit as the primary form of money. To adopt the means-of-payment definition without regarding book credit as money involves casuistry to the effect that goods are not "really" paid for until the book credit in question has been settled by check or by a transfer of paper currency. Another often-proposed simple definition of money is "that which a seller will accept from a buyer whose credit standing is unknown." This may be a very useful formulation in countries where payments are normally made by Giro (a payments system used in parts of Europe). But it has the defect of ruling out checking deposits as a form of money—a defect which is fatal for analysis of the
economy of the United States and a number of other advanced countries. Analytical role of money in economics. One of the key problems of present-day economics is the role of money and other liquid assets in the structure of economic decisions—particularly in the decisions of firms and households to save and to invest in durable real assets, such as factories, machinery, houses, and vehicles. Broadly speaking, the funds available to a firm or household for investment within a stated period consist of its saving during the period (taking saving gross, to include depreciation charges and the like), plus its net borrowing, plus any reduction it may make in its holdings of liquid assets. In any stated situation, there is usually something to be gained for the firm or household by investing more, something to be gained by reducing rather than increasing debt, and also something to be gained (in the form of increased consumption, or of increased distribution of a firm's profits to its owners) by saving less. Given the size of current income, the more ample the stock of liquid assets, the more it is possible to realize all these benefits simultaneously. The scarcer the liquid assets, the more it is necessary to choose to forgo one benefit in order to reap another. Thus, adequacy of liquid assets in the possession of a firm or household is viewed as an incentive to invest, while inadequacy of liquid assets is viewed as an incentive to save and to curtail investment. It is plain from experience that a "spendthrift" response to the possession of money and other liquid assets—that is, a course of management which outspends receipts so heavily as to bring liquid holdings down rapidly to a crisis point—is very rare. In most societies, the firms, households, and governments which account for the bulk of wealth holdings and of economic operations feel a need to maintain substantial holdings of money and other liquid assets. 
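The accounting of funds available for investment described above can be sketched as follows. This is only an illustration under stated assumptions: the class name, field names, and figures are hypothetical, and saving is taken gross, as in the text.

```python
from dataclasses import dataclass


@dataclass
class PeriodAccounts:
    """Hypothetical accounts of one firm or household for one period."""
    gross_saving: float          # saving including depreciation charges and the like
    net_borrowing: float         # new borrowing minus repayments
    liquid_assets_start: float   # liquid holdings at the start of the period
    liquid_assets_end: float     # liquid holdings at the end of the period

    def funds_for_investment(self) -> float:
        # A reduction in liquid holdings adds to the funds available.
        drawdown = self.liquid_assets_start - self.liquid_assets_end
        return self.gross_saving + self.net_borrowing + drawdown


# Assumed figures: saving 50, net borrowing 20, liquid assets run down from 100 to 90.
accounts = PeriodAccounts(50.0, 20.0, 100.0, 90.0)
print(accounts.funds_for_investment())
```

With ample liquid assets the drawdown term can finance more investment without cutting consumption or adding debt. With scarce liquid assets that term is small, and one of the other benefits must be forgone.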
If we measure money by the sum of hand-to-hand currency plus checking deposits (which, as we saw at the outset, is a measurement which conforms fairly well to the proposed definition), the private sector of the U.S. economy in recent years has held a money stock roughly equivalent to a quarter-year's gross national product and, in addition, has held other liquid assets equivalent roughly to a half-year's product. Motives for holding money. Monetary economists have developed an interesting array of hypotheses about the motives for holding money. Prior to the great depression of the 1930s, emphasis was placed primarily on the transactions motive— the need to hold a stock of money so as to smooth out the irregularities of inflow and outflow and to
carry the holder past a foreseen trough in his money holdings. During the 1930s, under the leadership of John Maynard Keynes, emphasis shifted to the speculative motive—the benefit of holding money while one waits for an expected fall in the price of some alternative asset one may be interested in buying. Some such element in monetary theory was clearly needed to interpret the sharp fall during the 1930s of the "velocity of circulation of money"—the ratio of money payments to money stock—which would have to remain fairly constant if the transactions motive were dominant. Without abandoning either of these previously emphasized motives, monetary economists in recent years have put increasing emphasis on the precautionary motive—the benefit of holding money to mitigate uncertainty. An attractive explanation of the benefit derived from keeping a margin of safety in one's money holdings is the principle of linkage of risk. If a firm or household lacks such a margin, an unexpected unfavorable development is likely to create a crisis that will bring on further unfavorable events. But if the adverse effect of the first event can be taken in stride, the linkage of risk is weakened, and the further unfavorable events may be averted. Except for very short-term aspects of the transactions motive, all these motives for holding money can be served at least moderately well by holding some type of nonmonetary liquid asset. Money as ordinarily defined consists of elements (paper currency and checking deposits) which yield no money income, while nonmonetary liquid assets do yield such income. Hence, it pays the holder to substitute other liquid assets for money up to the point at which the next remaining unit of money has net advantages equal to the interest income forgone. 
A practical consequence of this fact is that the financial institutions whose liabilities constitute the public's nonmonetary liquid assets have an incentive to design the claims they offer so as to present attractive combinations of liquidity and income. The working of this incentive to narrow the qualitative gap between money and money substitutes may be seen in the rapid development during the 1960s of "certificates of deposit" and "capital notes," which are offered on the open market by commercial banks. A theoretical consequence of the same fact is that it becomes interesting to view the demand for money in opportunity-cost terms. Developing a Keynesian insight, many monetary economists center their analysis on a liquidity-preference function, which treats the stock of money the public will choose to hold as an inverse function of the interest rate which could be earned on alternative uses of funds [see LIQUIDITY PREFERENCE].
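A liquidity-preference function of the kind just mentioned can be sketched in a few lines. The functional form, the parameter names `k` and `sensitivity`, and the numbers are assumptions chosen only to show an inverse relation between money holdings and the interest rate; the article does not specify any particular form.

```python
def money_demand(income: float, interest_rate: float,
                 k: float = 0.25, sensitivity: float = 10.0) -> float:
    """Hypothetical money demand: falls as the rate on alternative assets rises.

    k is the share of income held as money when the rate is zero (assumed).
    sensitivity scales the response to the interest rate (assumed).
    """
    if interest_rate < 0:
        raise ValueError("interest_rate must be non-negative in this sketch")
    return k * income / (1.0 + sensitivity * interest_rate)


# Desired money holdings at three assumed interest rates, income held at 1000.
for rate in (0.02, 0.04, 0.08):
    print(rate, round(money_demand(1000.0, rate), 1))
```

Each rise in the rate raises the interest income forgone by holding money, so desired holdings fall. Any other decreasing function would illustrate the same point.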
Creation of money. The processes which bring stocks of money into being, and which distribute them among various holders, are best seen in terms of transactions among various sectors of a country's economy. In the first stage of analysis, it is convenient to recognize two sectors only. The first is called the "nonbank public" and is made up of households, firms other than banks, and local governments; it is through effects upon incentives in this sector that monetary influences on saving and investment are supposed to work. The second sector is the "money-generating sector," made up of the national government, the central bank (in the United States, the Federal Reserve System), and the commercial banks (those which include among their liabilities deposits that can be used in payment by check or Giro order). The stock of money constitutes an asset for the nonbank public and a liability for the money-generating sector.
A simple way to view the processes which generate money is to think of the flow of checks and its effect on the holdings of the nonbank public. Any check which is drawn by one member of the nonbank public and is payable to another member has a net effect of zero upon the total money stock. The payee enlarges his holding of money when he deposits the check, but the drawer's account is necessarily reduced by an identical amount, so the total is unchanged. (Transitory nominal changes may arise from variations in the "float" of checks which have been drawn and not yet debited, since there is often a spread of several days between the dates on which withdrawals and deposits are entered in bank records and in the checkbooks of depositors.) But net effects on the money stock are not zero when the checks cross the boundary between the two sectors. For example, when a government employee deposits his paycheck, it will not be debited against the account of any other member of the nonbank public, so that the transaction is money-increasing.
In the other direction, if a business firm draws a check to repay a bank loan, this check is not deposited in the account of any other member of the nonbank public, so that the transaction is money-decreasing. Transactions in both directions across the frontier between the nonbank public and the money-generating sector go on continuously, and the net change in the money stock depends on the net difference between the money-increasing flow and the money-decreasing flow. It should be noted in passing that the situation is complicated by the presence of liabilities of the money-generating sector which are not treated as part of the money stock. The government employee, for example, might have deposited his paycheck in
a savings account at his bank instead of in his checking account. To deal with this complication, one may, like Milton Friedman (see Friedman & Schwartz 1963), adopt a working definition of money which includes as money the time deposits of commercial banks. Or one may (like the International Monetary Fund) adopt a concept of "quasi money," changes in which are viewed as an alternative use of "potential money" generated by net payments from the money-generating sector to the nonbank public.
Transactions in either direction between the two sectors may be on income account or on wealth-transfer account. The government paycheck referred to above is an income-account transaction; so is a check to pay for current products of the private sector which are bought for government use, or a dividend check to a bank stockholder. In the other direction, checks to pay taxes or to pay interest on bank loans may be regarded as income-account payments from the nonbank public to the money-generating sector. Wealth-transfer transactions may be represented by checks drawn by members of the nonbank public to pay for their subscriptions to newly issued government securities or, in the other direction, by checks drawn to pay for open-market purchases of government securities by the Federal Reserve from nonbank sellers. With minor exceptions, income-account transactions which affect the stock of money are transactions that figure in the social accounts among the receipts and expenditures of the central government and come into the domain of fiscal policy, while wealth-transfer transactions which affect the stock of money are bank-loan or government-debt transactions that clearly lie within the domain of monetary policy.
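The two-sector bookkeeping described above can be sketched in code. The sketch assumes that every check received by the nonbank public is deposited in a checking account, so quasi money and float are ignored; the sector labels, the function name, and the amounts are hypothetical.

```python
NONBANK = "nonbank_public"
MONEY_GEN = "money_generating_sector"


def money_stock_change(checks):
    """Net change in the money stock from (drawer_sector, payee_sector, amount) checks."""
    change = 0.0
    for drawer, payee, amount in checks:
        if drawer == MONEY_GEN and payee == NONBANK:
            change += amount   # crosses the boundary outward: money-increasing
        elif drawer == NONBANK and payee == MONEY_GEN:
            change -= amount   # crosses the boundary inward: money-decreasing
        # A check drawn and deposited within one sector leaves the total unchanged.
    return change


example_checks = [
    (MONEY_GEN, NONBANK, 500.0),  # government paycheck deposited by an employee
    (NONBANK, NONBANK, 200.0),    # one firm pays another: no net effect
    (NONBANK, MONEY_GEN, 300.0),  # firm repays a bank loan
    (NONBANK, MONEY_GEN, 100.0),  # household pays taxes
]
print(money_stock_change(example_checks))
```

Only the three checks that cross the boundary between the sectors matter for the total. The distinction between income-account and wealth-transfer transactions is deliberately left out, since in this simple accounting it does not alter the arithmetic.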
A basic point of dispute between economists who think largely in terms of fiscal policy and those who are sometimes called "monetary monists" is whether the effect of an increment of money stock will be different according to whether it originates in an income-account or in a wealth-transfer-account transaction.

Theories of the supply of money. Theories of the supply of money center upon wealth-transfer transactions carried on by commercial banks. Income-account transactions of the government are seen as by-products of fiscal policy, and wealth-transfer transactions by the treasury and central bank are viewed in terms of policy decisions rather than of the more or less impersonal response mechanisms attributed to the banking subsector. Particularly in the United States, with its wide dispersion of activity among unit banks, one must view the creation of money by bank activity as a mass phenomenon directed by incentives and restrictions, rather than as a simple decision of high policy like, for example, a cut in federal income tax rates. For this part of our analysis, we must look inside the "money-generating sector" and distinguish the commercial banks from the "bank-reserve-generating subsector" made up of the national government and the central bank. Commercial banks have a continuous incentive to carry out money-increasing transactions (that is, to expand their loans and investments) because their income arises as interest on these assets. Furthermore, there is ordinarily an available supply of such assets: for loans, a "fringe of unsatisfied borrowers"; for investments, a mass of bonds suitable for bank ownership that are held by the nonbank public and can be bought on the open market. Banks are free to respond to this incentive only insofar as they have a margin over reserve requirements in their holdings of reserve balances at Federal Reserve banks (plus their vault cash) or as they are willing to borrow reserves by discount at the Federal Reserve banks in the face of various deterrents [see BANKING, CENTRAL]. The central bank is able to facilitate the expansion of bank assets, and thus of the money stock, or to apply pressure toward contraction. The Federal Reserve System has authority within wide limits to vary legal reserve requirements. Furthermore, the total mass of reserves can be increased by Federal Reserve open-market purchases of government securities or decreased by open-market sales. The deterrents to discounting can be altered by varying the official discount rate or by official "moral suasion." True, there are certain forces outside central bank control which change the bank-reserve position, notably changes in the flow of international payments which expand or contract the reserve funds of commercial banks as well as the international-liquidity position of the country as a whole, and flows of hand-to-hand currency in and out of circulation.
But on the whole, these forces can be offset or reinforced by measures at the disposal of the central bank. For any individual commercial bank, the limits on its expansion of loans and investments within any short period are determined by its initial reserve position (the excess over requirements of the reserves it holds, plus the amount it is willing to borrow), plus the amount of additional deposits it can attract within the period, less the reserves required against those additional deposits. But if we shift our attention from the individual bank to the commercial banking system as a whole, the limits on expansion become less restrictive than one might think at first glance, because for the system as a whole the amount of additional deposits which can be "attracted" will be almost the same as the amount "created" by transactions which increase earning assets for the banks as a whole. (The only major difference between the amount created and the amount that can be attracted is the part of any increment in the public's total holding of money, hand-to-hand money plus commercial bank deposits, which the public will insist on taking in hand-to-hand currency.) The system as a whole can go on expanding so long as there are still commercial banks which have excess reserves or are willing to increase their discounts at the Federal Reserve. According to the assumptions one makes about the division of the increment into hand-to-hand money, checking ("demand") deposits, and time deposits, the unused lending power implied by a given amount of initial excess reserves may be calculated as anywhere between three and five times the initial excess reserves. For any other class of credit institutions, the limits on expansion of earning assets are more like those for the individual commercial bank than like those for commercial banks as a whole. An acceleration of mortgage lending by savings and loan associations, for example, does very little to increase the amount of funds which savers hold at such associations. If these associations obtain a million dollars of excess reserves (for example, through discounts at a Federal Home Loan bank), the additional amount they can lend is increased by almost exactly a million dollars. Thus, the initiative in nonbank credit expansion comes largely from savers who decide to entrust their funds to these institutions, although of course the institutions have some scope for making themselves attractive to savers. To the extent that society reorganizes itself to make more use of such financial intermediaries, liquid assets expand relative to activity.
For example, suppose that a group of savers have been in the habit of using their flow of saving from year to year to erect apartment houses and rent flats to newly married couples. If these savers now decide to place their funds with savings-and-loan associations, which in turn lend to newly married couples who buy new houses, the amount of real activity in housing investment may be unaltered. But the savers (who have acquired savings-and-loan "shares" redeemable on short notice instead of the ownership of apartment houses) will have their assets in more liquid form, while the liquidity of the new homeowners will not be more impaired by the prospect of paying amortization and interest on their mortgages than it would have been by the prospect of paying corresponding apartment rent. Accordingly, the economy will be more liquid at the
end of a period in which savings-and-loan mortgage financing is substituted for direct investment by savers in new buildings. The argument is similar for other kinds of credit-institution expansion. [See FINANCIAL INTERMEDIARIES.]

The economic impact of money. Monetary economics offers a wide range of competing views about the impact of monetary forces on prices and economic activity. In good part, differences of view relate to the interpretation of somewhat ambiguous historical and statistical evidence. In principle, the adherents of each of today's monetary schools admit the conceivability of a world in which the other schools' favorite channel of monetary influence would be of the highest importance, but each school tends to argue that realistically and quantitatively its favorite channel of influence is the most important by a decisive margin. Furthermore, the different schools disagree sharply at the level of monetary policy. Hence, a correct impression can probably be given by contrasting several distinct types of theory, disregarding the minor concessions made by each school to the others. To clear the ground, we may examine briefly several discredited theories, held in the past by influential economists but without professional support today. A common element of these discredited theories, which today's monetary economists are at one in repudiating, is the view (stated explicitly by many of the older theorists and implied by the others) that the real volume of economic activity is governed entirely by nonmonetary forces, that the role of monetary analysis is solely to explain changes in the "purchasing power of money" (that is, the reciprocal of some broad index number of prices). Each of the discredited theories has in addition at least one other major element that today's monetary economists all find unacceptable.
Statist theories viewed the value of money as determined by an act of will on the part of government, whereas observation suggests that changes in the price level ordinarily occur against the will of government. Commodity theories viewed the value of money as transferred from commodity markets for gold and silver, which could be interpreted by means of a supply-and-demand analysis essentially similar to that applicable to iron or cotton. In view of the increasingly abstract character of money and of peculiarities of the gold market which stem precisely from the monetary role of gold, it seems more reasonable to describe the commodity aspect of gold as dominated by the monetary aspect. The classical quantity theory of money (as it flourished before 1929) took the velocity of circulation as a constant. Today all schools, including the modern quantity theorists, regard velocity as a variable
whose behavior must be explained by monetary theory. While each of the foregoing theories must be regarded as discredited as a general theory of money, present-day economists make much use of special-purpose theories which contain important elements of these older theories. For example, there is some affinity with statist theories in the widely used models which assume price levels to be constant or which take some key element of the price structure (for example, the level of wages) as a policy variable. In considering the probable long-run effects of suggestions that international monetary relations should be "reformed" along lines closer to the traditional gold standard, today's economists are not such purists as to refuse to take into account the costs of producing gold, as well as the speculative attitudes of private and governmental holders of gold. In taking these factors into account, they use market analysis techniques similar to those used for other durable commodities. The classical quantity theory can still be applied with confidence to situations in which changes in the money stock are of enormous magnitude. If, as sometimes happens, a country's money stock is multiplied by 10 or by 100 within a few years, monetary economists predict a change in the price level of the same order of magnitude— although many monetary economists would not be much surprised if some tenfold increases in the money stock were accompanied by fivefold increases in prices and others by twentyfold increases. Present-day schools of monetary economics may be sorted out fairly well by their preferences in devising models that explain the general course of economic activity and prices in a market economy. At one extreme stands the "modern quantity theory" school, typified by Milton Friedman. 
It pictures changes in the stock of money as the dominant force in any explanation of the course of money payments and draws the policy inference that the sovereign prescription for steady growth without inflation is to engineer a steady growth rate for the money stock about equal to the growth of the economy's productive potential. In this theory, velocity is treated neither as a constant nor as an exogenous variable but, rather, as endogenous to the system of interrelations used in the theory. Nevertheless, the forces that govern velocity are not pictured as lending themselves to any sort of policy intervention which might usefully supplement the regulation of the quantity of money. At the other extreme from quantity-theory models stand models that analyze the behavior of economic activity and the price level without including any variable that corresponds to the stock of money. It
would be hard to name any economists who would make it a matter of principle to go to this extreme. But the stress, in teaching and in popularized statements about economic policy, on investment as an exogenous variable, and on the determination of activity by investment (mediated by a "propensity to consume"), is so heavy that this extreme view is likely to be taken as the sum of academic wisdom about macroeconomics by a large proportion of those who have been exposed to economic pedagogy or advice. The associated view of economic policy is that fiscal policy is all-sufficient and monetary policy is inconsequential. Much more representative of professional opinion as the academic monetary economists would like it to be understood is what may be called the interest rate school. On the theoretical side, the models typical of this view present "the" rate of interest as a major influence on investment and, through investment, on economic activity. In policy terms, this school treats the interest rate as the monetary influence on activity par excellence and does not concern itself with any direct influence of the stock of money on activity. (In relation to fiscal policy, the position of this school is likely to be eclectic, looking to an interaction of interest rate policy with such fiscal-policy variables as public expenditure and tax rates.) In the analytical models of this school, a peripheral liquidity-preference function expresses a relation between the money stock and the interest rate. The policy implication drawn may be that the interest rate can be regulated through the stock of money, or that if an appropriate rate of interest is adopted, the stock of money can be allowed to adapt itself to this rate without disturbing other aspects of the economy. 
Some variants of the interest-rate approach pay a good deal of attention to possible changes within the structure of interest rates: for example, possibilities of relative changes between the interest rates on home mortgages and those on foreign funds invested in treasury bills in New York. This view merges into another position, which in principle is quite distinct from the interest-rate school: that of the credit-availability school. The credit-availability doctrine is implicit in many official statements by the monetary authorities of the United States and other countries and has been usefully made explicit by Robert Roosa (1951). This view is that various types of investment may be powerfully influenced by the amount of funds the credit machinery makes available to finance home construction, inventory holding, exports, etc. Relative movements of interest rates may be useful indicators of the forces at work but will not themselves be the effective variable. The stock of money
as such does not figure in explanations of activity along these lines, but changes in the stock of money will be by-products of transactions called for to carry out appropriate financing. The size of the monetary expansion that accompanies a given course of economic activity and prices, in this view, may vary substantially according to the financing of the economic activity. Despite the lively controversy among schools, it is hard to see their views as philosophically irreconcilable. "Pure" models of one or another of the types just sketched illuminate the implications of various hypotheses, can help guide the search for evidence, and may offer useful special-purpose models for work on economic diagnosis and economic policy. But advocacy of any of these views as all-sufficient can be seriously misleading. This is particularly true, in the judgment of the author, of the "monetary monism" shown by advocates of the modern quantity theory approach and of some variants of the interest-structure approach, advocates who try to explain the flow of payments and economic activity without reference to such variables as taxes, accelerator effects of activity upon investment, changes in the impact of the "rest of the world," and so forth. A certain healthy eclecticism, with willingness to be guided by the evidence in the choice of theoretical simplifications, would seem appropriate in the present stage of monetary economics.

ALBERT GAILORD HART

BIBLIOGRAPHY
FRIEDMAN, MILTON; and SCHWARTZ, ANNA J. 1963 A Monetary History of the United States: 1867-1960. National Bureau of Economic Research, Studies in Business Cycles, No. 12. Princeton Univ. Press.
KEYNES, JOHN MAYNARD (1930) 1958-1960 A Treatise on Money. 2 vols. London: Macmillan. → Volume 1: The Pure Theory of Money. Volume 2: The Applied Theory of Money.
KEYNES, JOHN MAYNARD 1936 The General Theory of Employment, Interest and Money. London: Macmillan. → A paperback edition was published in 1965 by Harcourt.
NUSSBAUM, ARTHUR (1939) 1950 Money in the Law. 2d ed. Chicago: Foundation Press.
PATINKIN, DON (1956) 1965 Money, Interest, and Prices: An Integration of Monetary and Value Theory. 2d ed. New York: Harper.
ROBERTSON, D. H. (1922) 1959 Money. Rev. ed. Univ. of Chicago Press.
ROOSA, ROBERT V. 1951 Interest Rates and the Central Bank. Pages 270-295 in Money, Trade and Economic Growth: In Honor of John Henry Williams. New York: Macmillan.

II QUANTITY THEORY
Since men first began to write systematically about economic matters they have devoted special attention to the wide movements in the general level of prices that have intermittently occurred. Two alternative explanations have usually been offered. One has attributed the changes in prices to changes in the quantity of money. The other has attributed the changes in prices to war or to profiteers or to rises in wages or to some other special circumstance of the particular time and place and has regarded any accompanying change in the quantity of money as a common consequence of the same special circumstance. The first explanation has generally been referred to as the quantity theory of money, although that designation conceals the variety of forms the explanation has taken, the different levels of sophistication on which it has been developed, and the wide range of the claims that have been made for its applicability.

The broad outlines of the quantity theory of money were fully developed by the eighteenth century. The contemporary economist can still read David Hume's essay "Of Money" (1752) with pleasure and profit and find few if any errors of commission. Reasonably satisfactory attempts at mathematical formulation have been traced back to the eighteenth century (see the references in Marget 1938). And certainly the mathematical formulation given by Simon Newcomb, the eminent astronomer, in 1886 is entirely modern, excepting only the particular symbols used. Knut Wicksell published a highly sophisticated analysis in 1898 that, because it was written in German, had less influence than its excellence justified. The two formulations of the quantity theory that have most influenced modern thinking both date from the end of the nineteenth century (although the dates of their publication are later): Irving Fisher's transactions version (1911) and the Cambridge cash-balances version, attributed to Alfred Marshall (1923) and Arthur C. Pigou (1917).
After some introductory remarks, this article discusses these two versions and then examines the Keynesian attack on the quantity theory, the post-Keynesian reformulation, empirical evidence bearing on the quantity theory, and finally some policy implications of the quantity theory. In its most rigid and unqualified form the quantity theory asserts strict proportionality between the quantity of what is regarded as money and the level of prices. Hardly anyone has held the theory in that form, although statements capable of being so interpreted have often been made in the heat of argument or for expository simplicity. Virtually every quantity theorist has recognized that changes in the quantity of money that correspond to changes in the volume of trade or of output have no tendency
to produce changes in prices. Nearly as many have recognized also that changes in the willingness of the community to hold money can occur for a variety of reasons and can introduce disparities between changes in the quantity of money per unit of trade or of output and changes in prices. What quantity theorists have held in common is the belief that these qualifications are of secondary importance for substantial changes in either prices or the quantity of money, so that the one will not in fact occur without the other. The quantity theory in all its versions rests on a distinction between the nominal quantity of money and the real quantity of money. The nominal quantity of money is the quantity expressed in whatever units are used to designate money— talents, shekels, pounds, francs, lire, drachmas, dollars, and so on. The real quantity of money is the quantity expressed in terms of the volume of goods and services that the money will purchase. There is no unique way to express the real quantity of money. One way of expressing it, one that is widely used, is in terms of some specified standard basket of goods and services. That is what is implicitly done when the real quantity of money is calculated by dividing the nominal quantity by a price index. The standard basket is then the basket whose components are used as weights in computing the price index—generally the basket purchased by some reference group in a base year. Another way of expressing the real quantity of money is in terms of the time duration of the flows of goods and services the money could purchase. For a household, for example, the real quantity of money can be expressed in terms of the number of weeks of the household's average level of consumption that it could finance with its money balances or, alternatively, in terms of the number of weeks of its average income to which its money balances are equal. 
For a business enterprise, the real quantity of money it holds can be expressed in terms of the number of weeks of its average purchases or of its average sales or of its average expenditures on final productive services (net value added) to which its money balances are equal. For the community as a whole, the real quantity of money can be expressed in terms of the number of weeks of aggregate transactions of the community or aggregate net output of the community to which it is equal. For the community, attention has generally centered not on the real quantity of money but on a velocity of circulation—which can be regarded as the reciprocal of a particular expression of the real quantity of money. The ratio, for example, of the aggregate annual transactions of a community
to its stock of money is termed the "transactions velocity of circulation of money," since it gives the number of times the stock of money would have to "turn over" in a year to accomplish all transactions; similarly, the ratio of annual income to the stock of money is termed "income velocity." In every case the calculation of the real quantity of money or of velocity is made at the set of prices prevailing at the date to which the calculation refers. These prices are the bridge between the nominal and the real quantity of money. The quantity theory takes for granted that what ultimately matters to holders of money is the real quantity rather than the nominal quantity of money they hold and that there is some fairly definite real quantity of money that people wish to hold under any given circumstances. Suppose the nominal quantity that people hold happens to correspond at current prices to a real quantity larger than that which they wish to hold. Individuals will then seek to dispose of what they regard as their excess money balances; they will try to pay out a larger sum for the purchase of securities, goods, and services, for the repayment of debts, and as gifts than they are receiving from the corresponding sources. However, one man's expenditures are another's receipts. One man can reduce his nominal money balances only by persuading someone else to increase his. The community as a whole cannot in general spend more than it receives. The community's attempt to do so will nonetheless have important effects. If prices and income are free to change, the attempt to spend more will raise the nominal volume of expenditures and receipts, which will lead to a bidding up of prices and perhaps also to an increase in output. If prices are fixed by custom or by government edict, the attempt to spend more either will be matched by an increase in goods and services or will produce "shortages" and "queues." 
These in turn will raise the effective prices and are likely sooner or later to force changes in official prices. The initial excess of money balances will therefore tend to be eliminated, even though there is no change in the nominal quantity, by either a reduction in the real quantity held through price rises or an increase in the real quantity desired through output increases. Conversely, if nominal balances happen to correspond to a smaller real quantity at current prices than people wish to hold, people will seek to spend less than they are receiving. They cannot in the aggregate do so. But their attempt will in the process lower nominal expenditures and receipts, driving down prices or output and either raising the real balances held or lowering the real balances desired.
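The link between nominal balances, real balances, and velocity described above can be made concrete with a small numerical sketch. Everything here is an illustrative assumption rather than part of the article: the function names, the figures, and in particular the simplification that output is held fixed, so that the whole adjustment falls on prices (the text notes that output may also respond).

```python
WEEKS_PER_YEAR = 52


def real_balances(nominal_money, price_index):
    """Real quantity of money in units of the standard basket."""
    return nominal_money / price_index


def weeks_of_income(nominal_money, annual_nominal_income):
    """Real quantity of money expressed as weeks of average income."""
    return nominal_money / (annual_nominal_income / WEEKS_PER_YEAR)


def income_velocity(nominal_money, annual_nominal_income):
    """Ratio of annual income to the money stock."""
    return annual_nominal_income / nominal_money


def price_level_restoring_balance(nominal_money, real_output, desired_weeks):
    """Price level at which holders have exactly the weeks of income they want.

    Assumes real output is fixed, so prices carry the whole adjustment.
    """
    return WEEKS_PER_YEAR * nominal_money / (desired_weeks * real_output)


# Hypothetical community: money stock 100, real output 400, price level 1.
money, output, price = 100.0, 400.0, 1.0
weeks = weeks_of_income(money, price * output)
velocity = income_velocity(money, price * output)

# Double the nominal stock while desired real balances stay at 13 weeks.
new_price = price_level_restoring_balance(2 * money, output, desired_weeks=13)
print(weeks, velocity, new_price)
```

In this sketch the community holds about 13 weeks of income as money, income velocity is 4, and the two are reciprocals up to the factor of 52 weeks per year. When the nominal stock doubles and desired real balances are unchanged, the price level that removes the excess is twice the original one.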
It is clear from this discussion that changes in prices and nominal income can be produced either by changes in the real balances that people wish to hold or by changes in the nominal balances available for them to hold. Indeed it is a tautology, summarized in the famous quantity equation (to which we shall return), that all changes in nominal income can be attributed to one or the other, just as a change in the price of any good can always be attributed to a change in either demand or supply. The quantity theory is not, however, this tautology. It is, rather, the empirical generalization that changes in desired real balances (in the demand for money) tend to proceed slowly and gradually or to be the result of events set in train by prior changes in supply, whereas, in contrast, substantial changes in the supply of nominal balances can and frequently do occur independently of any changes in demand. The conclusion is that substantial changes in prices or nominal income are almost invariably the result of changes in the nominal supply of money. Variants of the quantity theory of money are distinguished by the variables that are regarded as most important in determining the real quantity of money that people desire to hold and by the analysis of the process whereby any discrepancy between actual and desired real balances works itself out. The chief issues that have occasioned controversy and conflict are perhaps the definition of money, the importance of transactions motives versus asset motives in the holding of money, the importance of substitution between money and other assets expressed in nominal terms as compared with substitution between money and real goods and services, and the speed and character of the dynamic process of adjustment. We shall have occasion to comment on these below.

Fisher's transactions approach

The quantity equation in transactions form.
Every payment made by one economic unit in an economy (household, business enterprise, or governmental organization) to another can be regarded as the product of a price and a quantity: wage per week times number of weeks, price of a good times number of units of the good, dividend per share times number of shares, and so on. The total volume of transactions during a period of time can thus be regarded as equal to the sum of a large number of such products, say $\sum_i p_i t_i$, where $p_i$ is the price and $t_i$ the quantity for the $i$th transaction. Let P be a suitably chosen average of the prices, and let T be a suitably chosen aggregate of the quantities. We then have

(1) Total volume of transactions = PT = $\sum_i p_i t_i$.

The total volume of transactions can also be viewed in terms of the medium of exchange used to effectuate them. Let M be the total quantity of money in the economy and V the average number of times each unit of money is used to effectuate a transaction during the year (the transactions velocity). We then have

(2) Total volume of transactions = MV,

or, putting (1) and (2) together, the famous quantity equation

(3) MV = PT.

Each side of this equation can be broken into subcategories: the right-hand side into different categories of transactions and the left-hand side into payments in different form. Fisher and later writers emphasized in particular the subdivision of the left-hand side into two categories of payments, those effected by the transfer of hand-to-hand currency (including coin) and those effected by the transfer of deposits. Let M stand solely for the volume of currency and V for the velocity of currency, M' for the volume of deposits and V' for the velocity of deposits. We then can write

(4) MV + M'V' = PT.
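Equations (1) through (4) can be checked on a toy set of payments. The sketch below is purely illustrative: the transactions, money stocks, and deposit payments are invented figures, and the helper names are hypothetical. It treats each velocity as the number that makes its equation hold, given the payments and the relevant stock.

```python
def total_transactions(transactions):
    """Sum of price times quantity over all payments, as in equation (1)."""
    return sum(price * quantity for price, quantity in transactions)


def velocity(total_payments, money_stock):
    """Average number of times each unit of money is used in the period."""
    return total_payments / money_stock


# Hypothetical payments as (price, quantity) pairs for one year.
transactions = [(2.0, 50), (10.0, 30), (5.0, 20)]
pt = total_transactions(transactions)

# Equations (2) and (3): with a total money stock M, V follows from PT.
money = 125.0
v = velocity(pt, money)

# Equation (4): split money into currency (M) and deposits (M').
currency, deposits = 25.0, 100.0
deposit_payments = 400.0  # M'V', assumed known from bank debit records
v_deposits = velocity(deposit_payments, deposits)
v_currency = velocity(pt - deposit_payments, currency)

print(pt, v, v_currency, v_deposits)
```

Here the payments total 500, so a money stock of 125 implies a transactions velocity of 4. In the split version, the deposit velocity comes directly from the assumed debit figure, while the currency velocity is obtained only as a residual from the total.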
One reason for the emphasis on this division was the persistent dispute about whether the term "money" should include only currency or deposits as well; this dispute was at the center of the banking school-currency school controversy that raged in England in the nineteenth century. Another reason was the direct availability of figures on M'V' from bank records of clearings or of debits to accounts so that it was and is possible to calculate V' in a way that it is not possible to calculate V. As they stand, equations (3) and (4) are identities: The V of (3) or the V and V' of (4) are defined as the numbers having the property that they render the equations correct. If P changes from one time period to the next, then so must one or more of the other terms in the equations. That is an arithmetic necessity, not an economic proposition. The identities are useful for economic analysis because they offer a useful classification of the factors at work, a classification into categories each of which contains factors largely independent of those in the other categories.

The categories in the quantity equation. Consider the fourfold classification in equation (3).

Transactions. The physical volume of transactions is denoted by T. It is determined by the resources available to the economy, the efficiency with which they are used, the degree of integration or disintegration of the economy (which determines
the number of transactions involved in the production and sale of final goods), and so on. These are the basic physical and operational characteristics of the economy. All quantity theorists, at least since Hume, have recognized that changes in the stock of money may have transitional effects on T. However, they have generally regarded the average level of T and long-run changes in T as largely independent of the quantity of money, although not of the existence of a money economy. Price level. The price level, which is the object of investigation, is denoted by P. It has generally been regarded as the resultant of other forces rather than as itself having any important element of autonomy. Cost-push or profit-push theories of inflation treat it as being to some extent independently determined. Under a regime of widespread government price fixing, it clearly does have some measure of autonomy. Stock of money. The stock of money in nominal units is denoted by M. Its precise definition, as noted before, has been the subject of much controversy. The transactions approach makes it seem natural to define money in terms of its function as a medium of exchange and to include only those means of payments generally acceptable in discharge of debts. Under a gold standard, specie was regarded as money par excellence, and questions were raised about extending the definition to include paper money and then demand deposits transferable by check. Today these would generally be included in the definitions, but there is much controversy about the treatment of other deposits, such as time deposits and savings deposits. On transactions lines, it is argued that such deposits cannot be used to discharge debts without first being converted into either currency or demand deposits. One answer to this argument is that it is also true of some items that all are willing to regard as money. For example, in the United States, $10,000 is the largest denomination of currency. 
Such a currency note can be used to effectuate few transactions without first being converted into smaller denominations. No issue of principle is involved. However M is defined, equation (3) remains valid, provided V is appropriately defined. The issue is one of the usefulness of one or another definition: what definition of M will have the empirical property of rendering the forces determining the other symbols in the equation as nearly independent as possible of those determining M? Whatever the precise definition of M, the factors determining it depend critically on the monetary system and are largely independent of the forces determining T. Two main cases should be distinguished: a commodity standard, of which a gold
standard is the most important historical example, and a fiduciary standard. Under a gold standard the amount of money in the gold standard world is determined by the total existing amount of gold, the fraction used as money, and the institutional arrangements determining the superstructure of claims to gold, in the form of currency or deposits, that can be erected on any given stock of gold. Changes in the amount of money depend on costs of producing various quantities of gold, the demand for gold for nonmonetary purposes, and the financial arrangements for issuing fiduciary claims to gold. For any one country the situation is somewhat different: the quantity of money is a dependent rather than an independent variable. It must be whatever quantity is consistent with levels of prices and incomes that will maintain balance in its international payments. Gold inflows or outflows tend to keep it at that quantity. Under a fiduciary standard the amount of money is ultimately under the control of the monetary authorities. In practice these authorities have always been governmental agencies. Although they have had the power to control the stock of money, they frequently have not stated their objectives in these terms but have let the stock of money be whatever was consistent with some alternative objective (e.g., given exchange rates or given interest rates). Under either the gold or the fiduciary standard the factors determining M are connected only loosely, if at all, with those we have considered as affecting directly either P or T. It is precisely this clearly perceived independence of the factors determining the quantity of money that has rendered the quantity theory so attractive to economists. Velocity of circulation. We now come to V, the velocity of circulation. This is the core of the quantity theory. 
It is determined by whatever factors affect, on the one hand, the amount of money people want to hold and, on the other, their ability to make their actual money balances equal their desired balances. The transactions approach makes it natural to emphasize payment practices: the frequency with which people are paid, the irregularity of receipts and payments, and so on. However, such payment practices themselves seem to be largely explained by the willingness of people to hold money. For example, during periods of rapid inflation, when it is costly to hold money, pay periods consistently tend to become more frequent. It is convenient to postpone a fuller consideration of the factors determining velocity until we discuss the post-Keynesian formulation in terms of the demand for money. Here it suffices to point out that
Fisher and other earlier quantity theorists explicitly recognized that velocity would be affected by, among other factors, the rate of interest and also the rate of change of prices. They recognized that both high rates of interest and rapidly rising prices would give people an incentive to economize on money balances and so tend to raise velocity and that low rates of interest and falling prices would have the opposite effect. They were never guilty of the crude fallacy, with which critics have often charged them, of regarding velocity as something of a natural constant. The quantity equation in income form. One difficulty with equations (3) and (4) is that the magnitudes designated "transactions" and the associated "general price level" proved conceptually ambiguous and difficult to measure with available data. Despite the large amount of empirical work done on these equations, notably by Fisher and Carl Snyder, these ambiguities and deficiencies of data have never been satisfactorily resolved. Should capital transfers, such as purchases and sales of real estate and securities, be included? What about gifts? Money-changing transactions? What are the relevant price and quantity in these transactions? As noted before, the data on volume of transactions have been satisfactory only for transactions effected by check. For these, debits to bank accounts (or bank clearings) provide a statistically reliable total, although even then there are problems involved in separating out money-changing transactions. Average deposits give a statistically reliable estimate of M', so that estimates of V' can be and are readily calculated for frequent time intervals and for many different geographical areas. However, even for check transactions, there is no satisfactory way to break down the other side of the equation into price and quantity components.
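The calculation of deposit velocity described above amounts to a single division, which the following sketch illustrates. The function name and the figures are hypothetical, chosen only for arithmetic convenience; they are not drawn from any bank records.

```python
def deposit_velocity(total_debits, average_deposits):
    """Return V' = M'V' / M' for one period.

    total_debits:     value of debits to deposit accounts over the period,
                      taken here as the estimate of M'V'.
    average_deposits: average deposit balance over the same period,
                      taken here as the estimate of M'.
    """
    if average_deposits <= 0:
        raise ValueError("average deposits must be positive")
    return total_debits / average_deposits


# Hypothetical figures, both in the same currency units.
annual_debits = 2400.0
average_deposits = 100.0
v_prime = deposit_velocity(annual_debits, average_deposits)
print(v_prime)  # number of times the average deposit turned over in the year
```

The sketch yields only V'; as the text notes, nothing in these records allows the product M'V' to be split into its price and quantity components.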
With the development of national or social accounting, which has stressed income transactions rather than gross transactions and which has explicitly and satisfactorily dealt with the conceptual and statistical problems of distinguishing between changes in prices and changes in quantities, there has been a tendency to express the quantity equation in terms of income rather than of transactions. Let Y be money national income, P the price index implicit in estimating national income at constant prices, and y national income in constant prices, so that (5)
Y = Py.
Let M represent, as before, the stock of money, but define V as the average number of times per year that the money stock is used in making income transactions (that is, payments for final productive
services) rather than all transactions. We then can write the quantity equation in income form as (6)
MV = Py.
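A minimal sketch of equation (6) read as an identity: given M, P, and y, measured income velocity is simply the number that makes the equation hold. All figures below are hypothetical, and the function name is an illustrative assumption.

```python
def income_velocity(money_stock, price_index, real_income):
    """Measured income velocity V = P*y / M, rearranged from MV = Py."""
    if money_stock <= 0:
        raise ValueError("money stock must be positive")
    return price_index * real_income / money_stock


# Hypothetical values: M in currency units, P an index with base 1.0,
# y real income per year in base-period currency units.
M, P, y = 250.0, 1.2, 500.0
V = income_velocity(M, P, y)

# The identity holds by construction, whatever the numbers are.
assert abs(M * V - P * y) < 1e-9

# Arithmetic, not economics: if M doubles while V and y are held fixed,
# the equation can balance only if P doubles as well.
P_new = (2 * M) * V / y
assert abs(P_new - 2 * P) < 1e-9
print(V, P_new)
```

Whether V and y would in fact stay fixed when M changes is an empirical question that the identity itself cannot answer.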
Although the symbols P and V are used both in eqs. (5) and (6) and in eqs. (1) through (4), they stand for different concepts in each group. Equation (6) is both conceptually and empirically more satisfactory than equation (3). Nonetheless, the earlier discussion of the fourfold classification implicit in the quantity equation applies, except for changes that are nearly self-evident, such as the very different relevance for y than for T of the degree of integration or disintegration of industry. Equation (6) is also closer in conception to the Cambridge approach, to which we now turn. The Cambridge cash-balances approach The essential feature of a money economy is that it enables the act of purchase to be separated from the act of sale. An individual who has something to exchange need not seek out the double coincidence—someone who both wants what he has and offers in exchange what he wants. He need only find someone who wants what he has, sell it to him for general purchasing power, and then find someone else who has what he wants and buy it with general purchasing power. In order for the act of purchase to be separated from the act of sale, there must be something which can serve as a temporary abode of purchasing power in the interim. It is this aspect of money which is emphasized in the cash-balances approach. How much money will people or enterprises want to hold for this purpose? As a first approximation we may suppose that the amount one wants to hold bears some relation to one's income, since that determines the volume of purchases and sales in which one is engaged. We then add up the cash balances held by all holders of money in the community and express the total as a fraction of their total income. We can then write (7)
M = kPy,
where M, P, and y are defined as in equation (6) and k is the ratio of the money stock to income. We can regard k either as a constant so calculated as to make (7) an identity, or as the "desired" ratio, so that M is the "desired" amount of money, which need not be equal to the actual amount. In either case, k is clearly equal numerically to the reciprocal of the V of equation (6), the V in one case being interpreted as measured velocity and in the other as desired velocity. Formally the Cambridge equation (7) is simply a transformation of Fisher's equation (6). Most
writers who have used one of the two approaches regarded them in this way and tended to cover much the same ground. Yet to a far greater extent than is reflected in the writings of the early expositors, the two approaches stress different aspects of money, make different definitions of money seem natural, and lead to emphasis being placed on different variables and analytical techniques. Consider the definition of money. The transactions approach makes it natural to define money in terms of whatever serves as the medium of exchange in discharging obligations. By stressing the function of money as a temporary abode of purchasing power the cash-balances approach makes it seem entirely appropriate to include also such stores of value as demand and time deposits not transferable by check, although it clearly does not require their inclusion. Similarly, the transactions approach leads to stress being placed on such variables as payments practices, the financial and economic arrangements for effecting transactions, and the speed of communication and transportation as it affects the time required to make a payment—essentially, that is, to emphasis on the mechanical aspects of the payments process. The cash-balances approach, on the other hand, leads to stress being placed on variables affecting the usefulness of money as an asset: the costs and returns from holding money instead of other assets, the uncertainty of the future, and so on. Stress on the first set of variables led most early writers—both those using the Fisher equation and those using the Cambridge equation—to predict that velocity would increase over time as a result of technological improvements in transportation and communication, which would facilitate the payments process. In fact, velocity has shown no tendency to rise over time. 
If anything it has rather tended to decline in economically progressive countries along with rises in real income, although this tendency is less pronounced when money is defined narrowly than when it is defined to include some deposits not transferable by check. The tendency for velocity to decline, along with the very size of money balances (equal in 1960 in the United States to about one month's income for currency outside banks alone, to nearly five months' income for currency plus adjusted demand deposits, and to about seven months' income for currency and all deposits at commercial banks) has contributed to a shift of emphasis from the function of money as a medium of exchange to its function as a temporary abode of purchasing power. Finally, with regard to analytical techniques, the cash-balances approach fits in much more readily
with the general Marshallian demand-supply apparatus than the transactions approach does. Equation (7) can be regarded as a demand function for money, with P and y on the right-hand side being two of the variables on which demand for money depends, and with k symbolizing all the other variables, so that k is to be regarded not as a numerical constant but as itself a function of still other variables. For completion the analysis requires another equation showing the supply of money as a function of other variables. The price level is then the resultant of the interaction of the demand and supply functions. From this point of view the quantity theory of money as embodied in equation (7) is a theory of the demand for money, not a theory of the price level or of money income. The Keynesian attack The Keynesian income-expenditure analysis developed in the General Theory of Employment, Interest and Money (1936) offered an alternative approach to the interpretation of changes in money income that emphasized the relation between money income and investment or autonomous expenditures rather than the relation between money income and the stock of money. The success of the Keynesian revolution in economic thought led to a temporary eclipse of the quantity theory of money and to perhaps an all-time low in the amount of economic research and writing devoted to monetary theory and analysis, narrowly interpreted. It became a widely accepted view that money does not matter, or, at any rate, that it does not matter very much, and that policy and theory alike should concentrate on investment, government fiscal policy, and the relation between consumer expenditures and income. Keynes did not, of course, deny the validity of the quantity equation. What he did was something very different. 
He argued that under conditions of underemployment equilibrium the V in equation (6) and the k in equation (7) were highly unstable and would, for the most part, passively adapt to whatever changes independently occurred in money income or the stock of money. Hence, under such conditions these equations, although entirely valid, were largely useless for policy or prediction. Moreover, he regarded such conditions as prevailing much, if not most, of the time. Keynes reached this conclusion by giving a highly specific form to equation (7). The quantity of money demanded, he argued, could be treated as if it were divided into two parts, one part, M1, "held to satisfy the transactions- and precautionary-motives," the other, M2, "held to satisfy the speculative-motive" (1936, p. 199). He regarded M1 as
a roughly constant fraction of income. He regarded the demand for M2 as arising from "uncertainty as to the future course of the rate of interest" and the amount demanded as depending on the relation between current rates of interest and the rates of interest expected to prevail in the future. (Keynes, of course, emphasized that there was a whole complex of interest rates. However, for simplicity, he spoke in terms of "the rate of interest," usually meaning by that the rate on long-term securities that were fixed in nominal value and that involved minimal risks of default, for example, government bonds.) In a "given state of expectations," the higher the current rate of interest, the lower would be the (real) amount of money people would want to hold for speculative motives for two reasons: first, the greater would be the cost in terms of current earnings sacrificed by holding money instead of securities, and, second, the more likely it would be that interest rates would fall, and hence bond prices rise, and so the greater would be the cost in terms of capital gains sacrificed by holding money instead of securities. Although expectations are given great prominence in developing the liquidity function expressing the demand for M2, they do not enter explicitly into that function. For the most part, Keynes and his followers in practice treated the amount of M2 demanded simply as a function of the current interest rate, the emphasis on expectations serving only as a reason for their attribution of instability to the liquidity function. Except for somewhat different language, the analysis up to this point differs from that of earlier quantity theorists, such as Fisher, only by its subtle analysis of the role of expectations about future interest rates and its greater emphasis on current interest rates and by restricting more narrowly the variables explicitly considered as affecting the amount of money demanded.
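The two-part demand just described can be sketched as follows. The functional forms and parameter values are assumptions made purely for illustration: M1 is taken as a fixed fraction k1 of income, and M2 as a hypothetical decreasing function a/r of the current interest rate. Neither the form a/r nor the numbers come from Keynes.

```python
def money_demand(income, interest_rate, k1=0.25, a=2.0):
    """Two-part demand for money, M = M1 + M2 (illustrative forms only).

    M1 = k1 * income        transactions and precautionary balances,
                            a roughly constant fraction of income.
    M2 = a / interest_rate  speculative balances; the form a/r is a
                            hypothetical stand-in for a function that
                            falls as the current interest rate rises.
    """
    if interest_rate <= 0:
        raise ValueError("interest rate must be positive in this sketch")
    m1 = k1 * income
    m2 = a / interest_rate
    return m1, m2


# Same income, two different current interest rates (hypothetical values).
m1_high, m2_high = money_demand(income=400.0, interest_rate=0.08)
m1_low, m2_low = money_demand(income=400.0, interest_rate=0.04)

assert m1_high == m1_low   # M1 does not respond to the interest rate
assert m2_low > m2_high    # a lower rate means larger speculative holdings
print(m1_high + m2_high, m1_low + m2_low)
```

Expectations appear nowhere in the function, which mirrors the observation above that in practice M2 was treated simply as a function of the current interest rate.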
Keynes's special twist concerned the empirical form of the liquidity-preference function at the low interest rates that he believed would prevail under conditions of underemployment equilibrium. Let the interest rate fall sufficiently low, he argued, and money and bonds would become perfect substitutes for one another; liquidity preference, as he put it, would become absolute. The liquidity-preference function, expressing the quantity of M2 demanded as a function of the rate of interest, would become horizontal at some low but finite rate of interest. Under such circumstances, he held, if the amount of money is increased by whatever means, the holders of money might seek to convert the additional cash balances into bonds. This would, however, tend to lower the rate of return
on bonds. Even the slightest lowering would, he argued, lead holders of money to desist from trying to convert it into bonds. The result would simply be that people would be willing to hold the increased quantity of money; k would be higher and V lower. Conversely, if the amount of money were decreased, holders of bonds would seek to convert them into money, but this would tend to raise the rate of interest, and even the slightest rise would reconcile them to holding the bonds instead of the money. Or, again, suppose there is an increase in money income for whatever reason. That will require an increase in M1, which can come out of M2 without any further effects. Conversely, any decline in M1 can be added to M2 without any further effects. The conclusion is that under circumstances of absolute liquidity preference income can change without a change in M and M can change without a change in income. The holders of money are in metastable equilibrium, like a tumbler on its side on a flat surface; they will be satisfied with whatever the amount of money happens to be. Keynes regarded absolute liquidity preference as a strictly "limiting case" of which, though it "might become practically important in future," he knew "of no example . . . hitherto" (1936, p. 207). But, since he regarded interest rates as frequently being not far above the level at which liquidity preference would become absolute, he treated velocity as if in practice its behavior frequently approximated that which would prevail in this limiting case. Keynes's disciples went much farther than Keynes himself. They were readier than he was to accept absolute liquidity preference as the actual state of affairs. More important, many argued that when liquidity preference was not absolute, changes in the quantity of money would affect only the interest rate on bonds and that changes in this interest rate in turn would have little further effect.
They argued that both consumption expenditures and investment expenditures were nearly completely insensitive to changes in interest rates, so that a change in M would merely be offset by an opposite and compensatory change in V (or a change in the same direction in k), leaving P and y almost completely unaffected. In essence their argument consists in asserting that only paper securities are substitutes for money balances and that real assets never are (see Tobin 1961). The issues raised for the quantity theory by the Keynesian analysis are clearly empirical rather than theoretical. Is it a fact that the quantity of money demanded is a function primarily of current income and of the rate of interest on fixed-money-value securities? Is it a fact that the amount demanded is highly elastic with respect to the rate
of interest on such securities at a low but finite rate of interest? Is it a fact that expenditures are highly inelastic with respect to such a rate of interest? Or, to put the issue in an equivalent but more readily observable form, is it a fact that velocity is a highly unstable and unpredictable magnitude that generally varies in a direction opposite to that of the quantity of money? The post-Keynesian reformulation Experience with monetary policy after World War II very quickly produced a renewed interest in money and a renewed belief that money matters. Under the influence of Keynesian ideas, country after country followed an easy-money policy designed to keep interest rates low in order to stimulate, if only slightly, the investment regarded as needed to offset the shortage of demand that was universally feared. The result was an intensification of the strong inflationary pressure inherited from the war, a pressure that was brought under control only when countries undertook so-called orthodox measures to restrain the growth in the stock of money, as in Italy, beginning in August 1947, in Germany in June 1948, in the United States in March 1951, in Great Britain in November 1951, and in France in January 1960. The effect of experience was reinforced by developments in economic theory, especially by the explicit analysis of the so-called real-balance effect as a channel through which changes in prices and in the quantity of money could affect income, even when investment and consumption were insensitive to changes in interest rates or when absolute liquidity preference prevented changes in interest rates (see Haberler 1937; Tobin 1947; Pigou 1943; 1947; Patinkin 1948). Many economists continue to use Keynesian analysis but have revised their empirical presumptions. They grant that liquidity preference is not absolute and that investment does have a sizable elasticity with respect to interest rates.
They continue, however, to regard analysis in terms of the quantity equation as less useful and meaningful than analysis in terms of autonomous expenditures and the multiplier, with monetary changes being taken into account as one factor among many that can affect these magnitudes. The postwar period has also seen a return to analysis in terms of the quantity equation accompanied by a reformulation of the quantity theory that has been strongly affected by the Keynesian analysis of liquidity preference (Johnson 1962). The reformulation emphasizes the role of money as an asset and hence treats the demand for money as part of capital or wealth theory, concerned with
the composition of the balance sheet or portfolio of assets. From this point of view, it is important to distinguish between ultimate wealth-holders, to whom money is one form in which they choose to hold their wealth, and enterprises, to whom money is a producer's good like machinery or inventories. Demand by ultimate wealth-holders. For ultimate wealth-holders the demand for money, in real terms, may be expected to be a function of the following variables.
(a) Total wealth. This is the analogue of the budget constraint in the usual theory of consumer choice. It is the total that must be divided among various forms of assets. In practice, estimates of total wealth are seldom available. Instead, income may serve as an index of wealth. However, it should be recognized that income as measured by statisticians may be a defective index of wealth because it is subject to erratic year-to-year fluctuations and that a longer term concept, like the concept of permanent income developed in connection with the theory of consumption, may be more useful (Friedman 1957; 1959, p. 7; Meltzer 1963; Brunner & Meltzer 1963). The emphasis on income as a surrogate for wealth, rather than as a measure of the "work" to be done by money, is conceptually perhaps the basic difference between the reformulation and the earlier versions of quantity theory.
(b) The division of wealth between human and nonhuman forms. The major asset of most wealth-holders is their personal earning capacity, but the conversion of human into nonhuman wealth or the reverse is subject to narrow limits because of institutional constraints. It can be done by using current earnings to purchase nonhuman wealth or by using nonhuman wealth to finance the acquisition of skills but not by purchase or sale and to only a limited extent by borrowing on the collateral of earning power. Hence, the fraction of total wealth that is in the form of nonhuman wealth may be an additional important variable.
(c) The expected rates of return on money and other assets. This is the analogue of the prices of a commodity and its substitutes and complements in the usual theory of consumer demand. The nominal rate of return on money may be zero, as it generally is on currency, or negative, as it sometimes is on demand deposits subject to net service charges, or positive, as it sometimes is on demand deposits on which interest is paid and generally is on time deposits. The nominal rate of return on other assets consists of two parts; first, any currently paid yield or cost, such as interest on bonds, dividends on equities, and storage costs on physical
assets, and, second, changes in their nominal prices. The second part will, of course, be especially important under conditions of inflation or deflation.
(d) Other variables determining the utility attached to the services rendered by money relative to those rendered by other assets; in Keynesian terminology, determining the value attached to liquidity proper. One such variable may be one already considered, namely, real wealth or income, since the services rendered by money may in principle be regarded by wealth-holders as a "necessity," like bread, the consumption of which increases less than in proportion to any increase in income, or as a "luxury," like recreation, the consumption of which increases more than in proportion to any increase in income. Another variable, one that is likely to be important empirically, is the degree of economic stability expected to prevail in the future. Wealth-holders are likely to attach considerably more value to liquidity when they expect economic conditions to be unstable than when they expect them to be highly stable. This variable is likely to be difficult to express quantitatively even though the direction of change may be clear from qualitative information. For example, the outbreak of war clearly produces expectations of instability, which is one reason why war is often accompanied by a notable increase in real balances, that is, a notable decline in velocity.
We can symbolize this analysis in terms of the following demand function for money for an individual wealth-holder:
(8)
M/P = f(y, w; r_m, r_b, r_e, (1/P)(dP/dt); u),
where M, P, and y have the same meaning as in equation (7) except that they relate to a single wealth-holder; w is the fraction of wealth in nonhuman form (or, alternatively, the fraction of income derived from property); r_m is the expected rate of return on money; r_b is the expected rate of return on fixed-value securities, including expected changes in their prices; r_e is the expected rate of return on equities, including expected changes in their prices; (1/P)(dP/dt)
equities, and different kinds of physical assets from one another. The usual problems of aggregation arise in passing from equation (8) to a corresponding equation for the economy as a whole; in particular, they arise from the possibility that the amount of money demanded may depend on the distribution of such variables as y and w and not merely on their aggregate or average value. If we neglect these distributional effects, (8) can be regarded as applying to the community as a whole, with M and y referring to per capita money holdings and per capita real income, respectively, and w to the fraction of aggregate wealth in nonhuman form. The major problems that arise in practice in applying (8) are the precise definitions of y and w, the estimation of expected rates of return as contrasted with actual rates of return, and the quantitative specification of the variables designated by u. Demand by business enterprises. Business enterprises are not subject to a constraint comparable to that imposed by the total wealth of the ultimate wealth-holder. The total amount of capital embodied in productive assets, including money, is a variable that can be determined by the enterprise to maximize returns, since it can acquire additional capital through the capital market. Hence, there is no reason on this ground to include total wealth, or y as a surrogate for total wealth, as a variable in their demand function for money. It may, however, be desirable to include a somewhat similar variable defining the "scale" of the enterprise on different grounds, namely, as an index of the productive value of different quantities of money to the enterprise. This is more nearly in line with the earlier transactions approach emphasizing the "work" to be done by money. It is by no means clear what the appropriate variable is: total transactions, net value added, net income, total capital in nonmoney form, or net worth.
The lack of availability of data has meant that much less empirical work has been done on the business demand for money than on an aggregate demand curve encompassing both ultimate wealth-holders and business enterprises. As a result there are as yet only faint indications about the best variable to use. The division of wealth between human and nonhuman form has no special relevance to business enterprises, since they are likely to buy the services of both forms on the market. Rates of return on money and on alternative assets are, of course, highly relevant to business enterprises. These rates determine the net cost to them of holding the money balances. However, the particular rates that are relevant may be quite different from those that are relevant for ultimate wealth-holders. For example, rates charged by banks on loans are of minor importance for wealth-holders yet may be extremely important for businesses, since bank loans may be a way in which they can acquire the capital embodied in money balances. The counterpart for business enterprises of the variable u in (8) is the set of variables other than scale affecting the productivity of money balances. At least one of these, namely, expectations about economic stability, is likely to be common to business enterprises and ultimate wealth-holders. With these interpretations of the variables, equation (8), with w excluded, can be regarded as symbolizing the business demand for money and, as it stands, symbolizing aggregate demand for money, although with even more serious qualifications about the ambiguities introduced by aggregation. The process of adjustment. Emphasis on the role of money as a component of wealth is important because of the variables to which it directs attention. It is important also for its implications about the process of adjustment to a difference between actual and desired stocks of money. Any such discrepancy is a disturbance in a balance sheet. As such it can be corrected in either of two ways: by a rearrangement of assets and liabilities through purchase, sale, borrowing, and lending or by the use of current flows of income and expenditure to add to or subtract from some assets and liabilities. The Keynesian liquidity-preference analysis stressed the first and, in its most rigid form, only one specific rearrangement: that between money and bonds. The earlier quantity theory stressed the second to the almost complete exclusion of the first. The reformulation enforces consideration of both. The process of adjustment is important in particular for its implications about the time that readjustment may be expected to take.
Balance-sheet adjustments can in general be expected to take considerable time, especially when they take the form of adjustments through alterations in flows and especially when they concern the money balance, M, whose function is precisely that of serving as a temporary abode of purchasing power, thereby permitting purchases to be separated from sales. It is plausible that any widespread disturbance in money balances—through, say, an unanticipated increase or decrease in the quantity of money by the actions of monetary authorities—will initially be met by an attempted readjustment of assets and liabilities through purchase or sale. But such attempted readjustments will alter the prices of assets and liabilities, leading to the spread of
the adjustment from one asset or liability to another. Such changes in prices will also alter the relative prices of capital items and the services they yield and so establish incentives to alter flows of receipts and expenditures. If the monetary change has altered the total nominal value of wealth, not simply its composition, this will introduce an additional reason to change flows. The effect of any monetary disturbance will thus spread in ever-widening ripples, and some of its most important effects may not be manifest for many months after the initial disturbance.

Empirical evidence

Empirical evidence about the relation between changes in the quantity of money and in prices, although it was sufficiently extensive to produce a widespread belief in the quantity theory, has seldom been systematically collated and organized. Until modern times, money was mostly metallic: copper, brass, silver, gold. The most notable changes in its nominal quantity under such circumstances were produced by sweating and clipping, by governmental edicts changing the nominal values attached to specified physical quantities of the metal, or by great discoveries of new sources of specie. Economic history is replete with examples of the first two and their coincidence with corresponding changes in nominal prices (see Cipolla 1956; Feavearyear 1931). The most important example of the third is the great specie discoveries in the New World in the sixteenth century. The association between this increase in the quantity of money and the price revolution of the sixteenth and seventeenth centuries has been well documented (see Hamilton 1934). The nineteenth and early twentieth centuries offer another striking example, despite the much greater development of deposit money and paper money. The gold discoveries in Australia and the United States in the 1840s were followed by substantial price rises in the 1850s.
When the rate of growth of the gold stock slowed down, and especially when country after country shifted from silver to gold (Germany in 1871-1873, the Latin Monetary Union in 1873, the Netherlands in 1875-1876) or returned to gold (the United States in 1879), world prices in terms of gold fell slowly but fairly steadily for about three decades. New gold discoveries in the 1880s and 1890s, powerfully reinforced by the development of improved methods of mining and refining, particularly the development of commercially feasible methods of using the cyanide process to extract gold from low-grade ore, reversed the trend. The world gold stock started to grow at a much more rapid rate, and no additional
important countries shifted to gold, so there was no increase in demand from this source. The price trend also reversed itself. From the mid-1890s to 1914, world prices in terms of gold rose by 25 to 50 per cent, depending on the index used.

Evidence from great inflations. The most dramatic evidence about the role of the quantity of money comes from periods of great monetary disturbances, and among these the most striking are the periods of extremely rapid price rise, such as the hyperinflations after World War I in Germany, Austria, and Russia, those after World War II in Hungary and Greece, and the rapid rises, if not hyperinflations, in many South American and some other countries both before and after World War II. These twentieth-century episodes have been rather more systematically studied than earlier ones. The studies demonstrate almost conclusively the critical role of changes in the quantity of money (the most important study is Cagan 1956). These studies also enable us to sketch with considerable accuracy a rather typical profile of an inflation that follows a period of fairly stable prices. The inflation often has its start in a period of war, but it need not. What is important is that something, generally the financing of extraordinary governmental expenditures, produces a much more rapid rate of growth of the money stock. Prices start to rise, but at a slower pace than the money stock, so that for a time the real stock of money increases. The reason for this is twofold. First, it takes time for people to readjust their money balances. Second, initially there is a general expectation that what goes up will come down, that the rise in prices is temporary and will be followed by a decline. Such expectations make money seem to be a desirable form in which to hold assets, and therefore they lead to an increase in desired money balances in real terms. As prices continue to rise, expectations are revised. People come to expect prices to continue to rise.
Desired balances decline. People also take more active measures to eliminate the discrepancy between actual and desired balances. The result is that prices start to rise faster than the stock of money, and real balances start to decline (that is, velocity starts to rise). How far this process continues depends on the rate of rise in the stock of money. If it remains fairly stable, real balances settle down to a level that is lower than the initial level but roughly constant—for a constant expected rate of rise in prices there will be a roughly constant level of desired real balances; in this case, prices ultimately rise at the same rate as the stock of money. A decline in the rate of rise in the stock of money is followed by a decline in the rate of
rise in prices, and this is followed in turn by an increase in actual and desired real balances as people readjust their expectations; the converse also holds. The result is that once the process is in full swing, changes in real balances follow, with a lag, changes in the rate of change of the stock of money. The lag reflects the fact that people apparently base their expectations of future rates of price change on an average of experience over the preceding several years, the period of averaging being shorter the more rapid the inflation. In the extreme cases, those which have degenerated into hyperinflation and a complete breakdown of the medium of exchange, rates of price change have been so high and real balances have been driven down so low as to lead to the widespread introduction of substitute moneys, usually foreign currencies. At that point completely new monetary systems have had to be introduced. A similar phenomenon has occurred when inflation has been effectively suppressed by price controls, so that there is a substantial gap between the prices that would prevail in the absence of controls and the legally permitted prices. This gap prevents money from functioning as an effective medium of exchange and also leads to the introduction of substitute moneys, sometimes rather bizarre ones like the cigarettes and cognac used in post-World War II Germany.

Evidence from the United States. Recent studies of the monetary history of the United States provide an especially full documentation of monetary relations (see especially Friedman & Schwartz
1963a). Some of the salient findings may be summarized briefly.

(a) The real stock of money, expressed in terms of months of income, has risen from about 3½ months' income at the end of the Civil War in 1865 to over 7 months' income by 1960; that is, velocity has fallen (money is defined as currency held by the public plus all adjusted deposits in commercial banks, income is defined as net national product). One interpretation of this trend is that the rise in real balances reflects the contemporaneous rise in real income per capita. From the end of World War II to almost 1960, velocity rose rather than fell. It is not yet clear whether this was a temporary interruption or a change of trend.

(b) If allowance is made for the trend in velocity, there has been a very close connection between the stock of money per unit of output and prices. This is brought out clearly by Figure 1, which, to eliminate short-period fluctuations, plots the average stock of money per unit of output and average prices in successive reference-cycle phases.

(c) In the course of business cycles the stock of money has slowed up its rate of growth well before the date designated by National Bureau of Economic Research reference-cycle dates as the peak of the cycle and has increased its rate of growth well before the trough. In mild contractions these decelerations have generally produced not an absolute decline in the stock of money but only a lower rate of growth. Every severe contraction has been accompanied by an absolute decline in the stock of money, and the severity of the contraction has been in roughly the same order as the size of the decline in the stock of money. Although changes in the rate of growth of the stock of money have to some extent reflected the contemporaneous course of business, on many occasions they have quite clearly been the result of independent forces, such as the deliberate decisions of monetary authorities. The clearest examples are probably the wartime increases and the decreases from 1920 to 1921, 1929 to 1933, and 1937 to 1938.

Figure 1. Implicit prices and stock of money per unit of output, reference-cycle phase averages, 1870-1961 (see note a). Axis label: Stock of money per unit of output.
a. A phase is the trough-to-peak or peak-to-trough interval between reference-cycle turning points. (For a discussion of reference cycles, see Moore 1961.) Phase averages are computed by weighting initial and terminal years each at one-half and intervening years at unity. The trend lines are computed regressions based on phase-average values, 1882-1961.
b. The index of implicit prices is based on 1929 = 100. For the underlying figures, see Friedman and Schwartz (1963a, chart 62).
c. Stock of money per unit of output is the ratio of the money stock to real income, expressed as an index. For the underlying figures, see Friedman and Schwartz (1963a, table A-1, col. 8, and source notes to chart 62).

(d) Velocity as usually measured has tended to rise during business expansions and decline during business contractions. One explanation offered is that this pattern reflects the use of measured income in computing velocity rather than a longer term concept, such as permanent income (Friedman 1959). Another explanation offered is that it reflects the effect of interest rates.

(e) It is agreed that velocity is related to interest rates, higher interest rates being associated with higher velocity, and conversely, but there is wide disagreement about the magnitude and significance of the relation. One view is that changes in interest rates are either the primary or a major source of all cyclical and secular changes in velocity (Latane 1954; 1960; Brunner & Meltzer 1963). Another view is that changes in interest rates have been a minor factor, much less important than changes in real per capita income for secular changes in velocity and much less important than differences between measured and permanent income for cyclical changes (Friedman 1959; Friedman & Schwartz 1963a).

Evidence from underdeveloped countries. A few scattered figures for some of the less developed countries may help to indicate the broad range of applicability of the quantity theory of money.

Real balances of currency. In less developed countries, currency is often a more meaningful total than currency plus deposits for two reasons.
One is that deposits are often used to a very limited extent and by highly selected groups in the population. The other is that governmental monetary intervention is more frequent and more important with respect to deposits, so that an erratic element is introduced into the conditions of supply of deposits. Table 1 gives estimates for a recent year of the stock of currency expressed in number of weeks of personal disposable income for less developed countries and, for comparison, for the United States. These figures are subject to very wide margins of error, particularly because of the unreliability of income estimates for the less developed countries. It is, therefore, all the more
Table 1. International comparison of real balances

Country          Year         Number of weeks of personal disposable income held in currency
India            1958-1959    6.9
Greece           1960         6.3
Yugoslavia       1960         6.2
Turkey           1961         5.2
Israel           1961         4.4
United States    1960         4.3
striking that for countries for which methods of economic organization vary so greatly and for which real income per capita must vary over a range of something well in excess of 20 to 1, real balances vary over a range of decidedly less than 2 to 1. And much of that variation is readily explained by different degrees of financial development: deposits are least widely used in India, Greece, and Yugoslavia, most widely used in Israel and the United States, and used to an intermediate extent in Turkey. Clearly, money-holding propensities have a great degree of uniformity under a wide range of circumstances.

Changes in quantity of money and in prices. If data like those in Table 1 are of questionable accuracy, year-to-year data are even more dubious for the underdeveloped countries. A recent study that was confined to the Middle East shows a variety of relations. In Egypt and Turkey the data for wholesale prices show the kind of close relationship between money supply and price changes that other experiences would lead one to expect. For the other countries the relation is loose or nonexistent (Penrose 1962). Rises in output may explain some part of the discrepancy. Much more likely explanations are the following: (a) The inclusion of rapidly expanding deposits whose significance is questionable. Currency figures alone show much less of a discrepancy. (b) Major defects in the price indexes. The countries have sought to suppress price increases, often have legal prices that are honored more in the breach than in the observance, and calculate price indexes in ways that understate the actual price rise. It is highly likely that revised and improved figures will remove much of the apparent discrepancy.

Stability of velocity and the multiplier. 
As pointed out above, the challenge to the quantity theory offered by Keynes rested entirely on differences in empirical presumptions, which can be summarized in terms of the stability attributed to the velocity of circulation, on the one hand, and the Keynesian multiplier (the ratio of changes in income to changes in autonomous expenditures), on the other.
A systematic comparison of the relative stability of velocity and the multiplier has been made for the United States from 1896 to 1958 (Friedman & Meiselman 1964a; 1964b; 1965). The results are striking: velocity is consistently more stable than the multiplier. These results have been challenged by other writers (Hester 1964; Ando & Modigliani 1965; DePrano & Mayer 1965), showing that this question is still far from settled.

Policy implications

On a very general level the implications of the quantity theory for economic policy are straightforward and clear. On a more precise and detailed level they are not. Acceptance of the quantity theory clearly means that the stock of money is a key variable in policies directed at the control of the level of prices or of money income. Inflation can be prevented if and only if the stock of money per unit of output can be kept from increasing appreciably. Deflation can be prevented if and only if the stock of money per unit of output can be kept from decreasing appreciably. This implication is by no means a trivial one. Monetary authorities have more frequently than not taken conditions in the credit market (rates of interest, availability of loans, and so on) as criteria of policy and have paid little or no attention to the stock of money per se. This emphasis on credit as opposed to monetary policy accounts both for the great depression in the United States from 1929 to 1933, when the Federal Reserve System allowed the stock of money to decline by one-third, and for many of the post-World War II inflations. The quantity theory has no such clear implication, even on this general level, about policies concerned with the growth of real income. Both inflation and deflation have proved consistent with growth, stagnation, or decline. Passing from these general and vague statements to specific prescriptions for policy is difficult. 
It is tempting to conclude from the close average relation between changes in the stock of money and changes in money income that control over the stock of money can be used as a precision instrument for offsetting other forces making for instability in money income. Unfortunately there are many slips between this cup and this lip. One slip is that a very close relationship on the average is consistent with much variation in the individual instance. A high correlation between changes relative to trend in the stock of money and in money income over many business cycles (involving, say, an average increase of 2 per cent in money income for every 1 per cent increase in money) is entirely consistent with the corresponding ratio varying in individual years or over single cycles from zero or a negative number to, say, 4 or 5. But for policy in a particular cycle, what is important is the relation in that cycle, not the relation on the average.

A second slip is the length of time it takes for changes in the stock of money to have their effect; this is one of the reasons for the variability that constitutes the first slip. A change in the stock of money today will have most of its effects some months from now, perhaps on the average as much as 12 to 15 months from now. A policy of using monetary changes to offset other forces making for instability therefore requires an ability to forecast a considerable time in advance what those forces will be, an ability that has so far been conspicuous by its absence. Moreover, the time it takes for monetary changes to be effective undoubtedly varies rather considerably. Hence it would also be necessary to forecast how long the lag would be in the specific instance. These two slips mean that monetary changes intended to be stabilizing may in fact be destabilizing; they may introduce a random and erratic influence into economic affairs. It is a sobering thought that both the stock of money and economic activity displayed greater instability in the first two peacetime decades after the establishment of the Federal Reserve System (1919 to 1939) than in any other pair of decades in the whole of United States history. The blind, quasi-automatic forces that controlled monetary matters in earlier decades produced a higher degree of stability than a system specifically established to promote monetary and economic stability. The greater stability of prices and employment since the end of World War II may be a sign that we have learned how to avoid the mistakes of the interwar decades, but it is much too soon to have any confidence in that comfortable conclusion.
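The first slip, a close relation on the average combined with wide variation in the individual instance, can be illustrated with a small numerical sketch. The cycle-by-cycle ratios below are hypothetical values chosen only so that their average is 2 while individual values range from a negative number to well above 4; they are not estimates from any study.

```python
import statistics

# Hypothetical ratios (per cent change in money income per 1 per cent
# change in money) for seven individual cycles. Invented for illustration.
cycle_ratios = [-0.5, 0.0, 1.5, 2.0, 2.5, 4.0, 4.5]

average_ratio = statistics.mean(cycle_ratios)
lowest, highest = min(cycle_ratios), max(cycle_ratios)

# A policy maker relying on the average would expect a response of 2,
# yet the response in any one cycle could fall anywhere in this range.
print(f"average ratio: {average_ratio}")
print(f"range in individual cycles: {lowest} to {highest}")
```

The average alone says nothing about which of these outcomes a particular cycle will produce, which is the point of the passage above.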
Other slips have to do with the indirect effects of methods used to control the stock of money; with possible conflicts between the objective of stable prices and such other objectives as stable exchange rates, stable employment at a high level, and low interest rates on government borrowing; and with the possible desire to use inflation as a means of imposing a tax on money balances. One negative implication of the quantity theory, implicit in the above, is worth spelling out because of the continued widespread acceptance of the belief that fiscal policy is the key to control of the level of money income. The quantity theory implies
that the effect of government deficits or surpluses depends critically on how they are financed. If a deficit is financed by borrowing from the public without an increase in the quantity of money, the direct expansionary effect of the excess of government spending over receipts will be offset to some extent, and possibly to a very great extent, by the indirect contractionary effect of the transfer of funds to the government through borrowing. Furthermore, the deficit will primarily affect income only while it lasts; a cessation of the deficit will mean a cessation of its effects. If a deficit is financed by printing money, there will be no offset, and the enlarged stock of money will continue to exert an effect after the deficit is terminated. What matters most is the behavior of the stock of money, and government deficits are expansionary primarily if they serve as the means of increasing the stock of money; other means of increasing the stock of money will have closely similar effects.

MILTON FRIEDMAN

[See also LIQUIDITY PREFERENCE; MONETARY POLICY.]

BIBLIOGRAPHY
ANDO, ALBERT; and MODIGLIANI, FRANCO 1965 The Relative Stability of Monetary Velocity and the Investment Multiplier. American Economic Review 55:693-728, 786-790.
BRUNNER, KARL; and MELTZER, ALLAN H. 1963 Predicting Velocity: Implications for Theory and Policy. Journal of Finance 18:319-354.
CAGAN, PHILLIP 1956 The Monetary Dynamics of Hyperinflation. Pages 25-117 in Milton Friedman (editor), Studies in the Quantity Theory of Money. Univ. of Chicago Press.
CIPOLLA, CARLO M. 1956 Money, Prices, and Civilization in the Mediterranean World, Fifth to Seventeenth Century. Princeton Univ. Press.
DEPRANO, MICHAEL E.; and MAYER, THOMAS 1965 Tests of the Relative Importance of Autonomous Expenditures and Money. American Economic Review 55:729-752.
FEAVEARYEAR, ALBERT E. (1931) 1963 The Pound Sterling: A History of English Money. 2d ed. Oxford: Clarendon.
FISHER, IRVING (1911) 1920 The Purchasing Power of Money: Its Determination and Relation to Credit, Interest and Crises. New ed., rev. New York: Macmillan.
FRIEDMAN, MILTON 1957 A Theory of the Consumption Function. National Bureau of Economic Research, General Series, No. 63. Princeton Univ. Press.
FRIEDMAN, MILTON 1959 The Demand for Money: Some Theoretical and Empirical Results. Journal of Political Economy 67:327-351.
FRIEDMAN, MILTON; and MEISELMAN, DAVID 1964a The Relative Stability of Monetary Velocity and the Investment Multiplier in the United States, 1897-1958. Pages 165-268 in Stabilization Policies: A Series of Research Studies Prepared for the Commission on Money and Credit. Englewood Cliffs, N.J.: Prentice-Hall.
FRIEDMAN, MILTON; and MEISELMAN, DAVID 1964b Reply to Donald Hester. Review of Economics and Statistics 46:369-377. → Includes a rejoinder by Donald D. Hester.
FRIEDMAN, MILTON; and MEISELMAN, DAVID 1965 Reply to Ando and Modigliani and to DePrano and Mayer. American Economic Review 55:753-785. → Contains a rejoinder by Ando and Modigliani on pages 786-790 and by DePrano and Mayer on pages 791-792.
FRIEDMAN, MILTON; and SCHWARTZ, ANNA J. 1963a A Monetary History of the United States, 1867-1960. National Bureau of Economic Research, Studies in Business Cycles, No. 12. Princeton Univ. Press.
FRIEDMAN, MILTON; and SCHWARTZ, ANNA J. 1963b Money and Business Cycles. Review of Economics and Statistics 45, no. 1, pt. 2:32-64.
HABERLER, GOTTFRIED (1937) 1958 Prosperity and Depression: A Theoretical Analysis of Cyclical Movements. 4th ed., rev. & enl. Harvard Economic Studies, Vol. 105. Cambridge, Mass.: Harvard Univ. Press; London: Allen & Unwin.
HAMILTON, EARL J. (1934) 1965 American Treasure and the Price Revolution in Spain, 1501-1650. Harvard Economic Studies, Vol. 43. New York: Octagon.
HESTER, DONALD D. 1964 Keynes and the Quantity Theory: A Comment on the Friedman-Meiselman CMC Paper. Review of Economics and Statistics 46:364-368.
HUME, DAVID 1752 Of Money. Discourse III. Pages 41-59 in David Hume, Political Discourses. Edinburgh: Fleming.
JOHNSON, H. G. 1962 Monetary Theory and Policy. American Economic Review 52:335-384.
KEYNES, JOHN MAYNARD 1936 The General Theory of Employment, Interest and Money. London: Macmillan. → A paperback edition was published in 1965 by Harcourt.
LATANE, HENRY A. 1954 Cash Balances and the Interest Rate: A Pragmatic Approach. Review of Economics and Statistics 36:456-460.
LATANE, HENRY A. 1960 Income Velocity and Interest Rates: A Pragmatic Approach. Review of Economics and Statistics 42:445-449.
MARGET, ARTHUR W. 1938 The Theory of Prices: A Reexamination of the Central Problems of Monetary Theory. Vol. 1. New York: Prentice-Hall.
MARSHALL, ALFRED (1923) 1960 Money, Credit & Commerce. New York: Kelley.
MELTZER, ALLAN H. 1963 Demand for Money: The Evidence From the Time Series. Journal of Political Economy 71:219-246.
MOORE, GEOFFREY H. (editor) 1961 Business Cycle Indicators. 2 vols. National Bureau of Economic Research, Studies in Business Cycles, No. 10. New York: The Bureau. → Volume 1: Contributions to the Analysis of Current Business Conditions. Volume 2: Basic Data on Cyclical Indicators.
NEWCOMB, SIMON 1886 Principles of Political Economy. New York: Harper.
PATINKIN, DON (1948) 1951 Price Flexibility and Full Employment. Pages 252-283 in American Economic Association, Readings in Monetary Theory. Homewood, Ill.: Irwin. → First published in Volume 38 of the American Economic Review.
PATINKIN, DON (1956) 1965 Money, Interest, and Prices: An Integration of Monetary and Value Theory. 2d ed. New York: Harper.
PENROSE, EDITH T. 1962 Money, Prices, and Economic Expansion in the Middle East, 1952-1960. Rivista internazionale di scienze economiche e commerciali 9:401-427.
PIGOU, A. C. (1917) 1951 The Value of Money. Pages 162-183 in American Economic Association, Readings in Monetary Theory. Philadelphia: Blakiston.
PIGOU, A. C. 1943 The Classical Stationary State. Economic Journal 53:343-351.
PIGOU, A. C. (1947) 1951 Economic Progress in a Stable Environment. Pages 241-251 in American Economic Association, Readings in Monetary Theory. Homewood, Ill.: Irwin. → First published in Volume 14 of Economica New Series.
TOBIN, JAMES (1947) 1960 Money Wage Rates and Employment. Pages 572-587 in Seymour Harris (editor), The New Economics: Keynes' Influence on Theory and Public Policy. London: Dobson.
TOBIN, JAMES 1961 Money, Capital and Other Stores of Value. American Economic Review 51, no. 2:26-37.
WICKSELL, KNUT (1898) 1936 Interest and Prices (Geldzins und Güterpreise). With an introduction by Bertil Ohlin. London: Macmillan. → First published in German.

III VELOCITY OF CIRCULATION
At least since the time of William Petty the velocity of circulation of money (known also as the rate of turnover, rate of use, frequency of use, rapidity of circulation, or efficiency) has been recognized as an important dimension of monetary analysis. A given quantity of money can finance any volume of spending, depending on how frequently, on the average, each unit is used. Moreover, a change in the quantity of money will alter aggregate demand for goods and services only if it is not offset by an opposite change in velocity. An understanding of the factors governing velocity obviously is crucial to the formulation of effective monetary policy. Nevertheless, the velocity concept has been surrounded by controversies throughout its long history. The concept found greatest acceptance during the opening decades of the twentieth century, particularly in the United States, through the influence of Irving Fisher (1911). During the 1930s and 1940s it was abandoned by most economists in favor of the new conceptual framework fashioned by J. M. Keynes [see LIQUIDITY PREFERENCE]. More recently, however, the older concept has been finding its way into monetary literature again.

The recent revival of interest in monetary velocity reflects a number of developments. It has become evident in the post-World War II period that the major behavior relations proposed in the New Economics are not as dependable as many Keynesian enthusiasts had hoped they would be. Meanwhile, velocity analysis has been improved significantly. The concept of velocity has been refined in various ways, and it has been integrated at last into the main body of economic theory. In addition, the statistical resources for study of velocity have been extended greatly. This combination of conceptual breakthroughs and improved statistics has been accompanied by a number of attempts to explain particular velocity movements over time or cross-sectional differences, or to fashion general theories of velocity. The early history of thought relating to velocity has been traced quite fully elsewhere, particularly by Holtrop (1929) and Marget (1938). This article is confined to a review of fundamentals, along with a discussion of more recent developments in velocity theory.

Types of velocities

The concept of the velocity of circulation of money is clearly and easily defined in general terms. It is the average number of times that each unit of money is spent during any time period. From the equation of exchange, MV = PT,
where M is the average stock of money in existence during the period, P the average price of items purchased, and T the number of items purchased, it is evident that velocity is the volume of spending per unit of money: V = PT/M.
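As a minimal sketch of the arithmetic in the equation of exchange, the following computes V = PT/M for one period. The function name and the figures for P, T, and M are hypothetical, chosen only to make the calculation concrete.

```python
def transactions_velocity(avg_price: float, num_items: float,
                          avg_money_stock: float) -> float:
    """Return V = PT / M, the volume of spending per unit of money."""
    if avg_money_stock <= 0:
        raise ValueError("average money stock must be positive")
    return avg_price * num_items / avg_money_stock


# Hypothetical figures for a single period (illustration only).
price = 2.0    # P: average price of items purchased
items = 300.0  # T: number of items purchased
money = 150.0  # M: average stock of money during the period

velocity = transactions_velocity(price, items, money)

# The equation of exchange MV = PT holds by construction.
assert abs(money * velocity - price * items) < 1e-9
print(velocity)  # 4.0
```

Because V is defined as a ratio, the identity MV = PT always holds; the substantive questions concern what is counted as spending and as money.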
However, the definition does not uniquely define velocity, since it fails to specify the meaning of "spending" and "money." Actually, economists have worked with several broad types of velocities, and with countless minor variations thereof. Fisher's approach was to include in spending all exchanges of money against goods, services, and securities throughout an economy during a period such as a year and to restrict money to actual means of payment (i.e., privately held demand deposits and currency). The resulting spending-money ratio can be called aggregate transactions velocity, Vt. For several reasons this velocity concept is not very useful, except for purposes of classroom exposition. In the first place, reliable measures of total spending in any economy, even for a single year, do not exist and would be extremely difficult to construct. Second, while study of Vt might help us to understand changes in total spending, the general price level (P), and the volume of transactions (T), these concepts have little interest from a welfare or policy point of view. Finally, the use of Vt could be defended, apart
from its immeasurability, only if money were regarded mainly as a "medium of exchange"; the demand for money would then be sensitive to the volume of spending, and Vt would tend to be fairly stable. Most economists now emphasize the "store of wealth" function of money, and consequently they see no advantage in relating the stock of money to total spending, as Vt does.

A second approach, pioneered in the United States by the Federal Reserve System, is to focus on the velocity of demand deposits alone, in which case spending means "spending by check." This velocity is known as deposit turnover, Vd. Monthly estimates of Vd, based on data from a large sample of banks, are available for the United States since 1919 and are published each month in the Federal Reserve Bulletin. Although these estimates are a valued part of our monetary statistics, Vd, like Vt, does not directly relate to important policy variables such as the level of wholesale prices or the level of national income. And, like Vt, it assumes implicitly that the volume of spending is the major determinant of the demand for money.

Recognition of the shortcomings of Vt and Vd has led modern economists, beginning with Pigou (1927), to develop a third index of money use, income velocity, Vy. Since Vy is merely the ratio of spending for currently produced goods and services (i.e., gross or net national product) to the total money stock, it can be computed quite simply. Annual time series of Vy in the United States since 1867 have been constructed by Friedman and Schwartz (1963). Moreover, with the world-wide development of national income statistics, Vy estimates can now be made for a large number of other countries. Quarterly Vy series are also available for the United States since 1946. In addition to its measurability, Vy has the important advantage of relating the money stock to national product, a concept of major interest to economists. 
Similarly, an index of the prices of final goods and services is much more meaningful than a general price index which includes prices of stocks and many other things that are not vitally important to most policy decisions. While Vy was attacked by Keynes (1930, vol. 2, p. 24) as a "hybrid conception having no particular significance" because some of the money included in its denominator is used to finance purchases other than those of final output, most contemporary economists would reject the criticism as placing too much emphasis on the "transaction motive" for holding cash. If the volume of spending does not dominate the demand for money, then it does not matter that Vy omits large segments of spending from the analysis.
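Because income velocity is a simple ratio, its computation can be sketched in a few lines. The example below is illustrative only: the dollar figures are hypothetical, and the function and variable names are assumptions introduced here, not taken from any of the studies cited. It also shows how the measured ratio depends on how broadly the money stock is defined.

```python
def velocity(spending: float, money_stock: float) -> float:
    """Return a velocity ratio: spending in a period divided by the money stock."""
    if money_stock <= 0:
        raise ValueError("money stock must be positive")
    return spending / money_stock


# Hypothetical annual figures, in billions of dollars (not actual data).
gnp = 500.0           # spending on currently produced goods and services
narrow_money = 140.0  # demand deposits plus currency
broad_money = 200.0   # narrow money plus commercial bank time deposits

vy_narrow = velocity(gnp, narrow_money)
vy_broad = velocity(gnp, broad_money)

print(f"Vy, narrow money: {vy_narrow:.2f}")
print(f"Vy, broad money:  {vy_broad:.2f}")
```

With the same numerator, the broader money definition necessarily gives the lower velocity, which is why velocity series built on different money concepts are not directly comparable in level.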
One can obtain a fourth type of velocity by disaggregating whatever concept of spending one wishes to use and dividing each sector's spending by its money holdings. The sectors can be drawn according to any number of principles (e.g., by regions, industries, or size classes) and at any level of aggregation; hence, the number of conceivable sector velocities is indefinitely large. The idea of sectoral velocity analysis is not new; Keynes (1930, vol. 2) advocated such an approach decades ago. Except for the Federal Reserve estimates of Vd, however, which have always been available by groups of cities as well as on an aggregate basis, sector velocities have been ignored for the most part until quite recently. One can compute annual velocities for business firms and households in the United States since the early 1930s and quarterly business velocities since the late 1940s (see Selden 1962). The principal advantage of the sector approach is that it may facilitate analysis of aggregate velocity. Aggregate velocity is a weighted average of sector velocities, the weights being the share of the money stock that each sector holds. Let $V_{ti}$ and $M_i$ be transactions velocity and money holdings in the ith sector. Then $V_t = \sum_i V_{ti} M_i / M$, where M is the total money stock.
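The weighted-average identity can be checked numerically. The sketch below uses three hypothetical sectors with made-up spending and money holdings; the sector names, figures, and function names are assumptions for illustration, not data from the studies cited.

```python
from typing import Dict, Tuple

# Hypothetical sectors: name -> (spending, money holdings). Not actual data.
sectors: Dict[str, Tuple[float, float]] = {
    "business": (900.0, 60.0),
    "households": (400.0, 80.0),
    "government": (200.0, 10.0),
}


def weighted_average_velocity(data: Dict[str, Tuple[float, float]]) -> float:
    """Sum of sector velocities, each weighted by the sector's share of money."""
    total_money = sum(money for _, money in data.values())
    return sum(
        (spending / money) * (money / total_money)
        for spending, money in data.values()
    )


def direct_velocity(data: Dict[str, Tuple[float, float]]) -> float:
    """Total spending divided by the total money stock."""
    total_spending = sum(spending for spending, _ in data.values())
    total_money = sum(money for _, money in data.values())
    return total_spending / total_money


v_weighted = weighted_average_velocity(sectors)
v_direct = direct_velocity(sectors)
print(f"weighted average of sector velocities: {v_weighted:.4f}")
print(f"total spending / total money:          {v_direct:.4f}")
```

The two figures agree because each sector's money holdings cancel within its term, leaving total spending over total money.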
Thus, changes in aggregate velocity reflect either changes in the weights of sectors or changes in sector velocities. Velocity changes may emanate from different sectors at different times; specific knowledge of the point of origin of a change should contribute to an understanding of its nature.

Another concept, the velocity of active money, Va, was popularized by Keynes (1936). In a sense this may be regarded as a special kind of sector velocity, in which total spending, however defined, is divided by "active" balances only. The relationship of this velocity to aggregate velocity is then V = VaMa/M, where Ma is active money. Because of the difficulty in finding an appropriate basis for separating cash into active and idle components, most economists have found this concept, like Vt, useful mainly in abstract discussions of monetary theory. Angell (1936), Tobin (1947), Bronfenbrenner and Mayer (1960), and several others have attempted to solve this problem by (1) adopting the Keynesian hypothesis that Va changes only gradually over time and (2) finding some period, such as 1929, in which all cash supposedly was drawn into active circulation. Such calculations are not without interest, but there has been a growing tendency in
the 1950s and 1960s for economists to abandon the active-idle dichotomy and to work with total cash instead.

The behavior of velocity

It is doubtful whether any economist of recognized stature, from Petty's day to the present, has regarded the velocity of money as being rigidly fixed over time. Not until the twentieth century, however, did dependable time series become available, permitting close study of velocity movements. U.S. data reveal the existence of fairly regular seasonal and cyclical velocity variations, as well as persistent secular changes. Seasonally, both Vd and Vy reach lows early in the year and highs in the closing months, despite the fact that the money stock has a similar seasonal pattern. Cyclically, all velocity measures tend to rise during general business expansions and fall during contractions, with peaks and troughs in velocity
coinciding with business cycle peaks and troughs. Cyclical amplitudes, interpreted as deviations from secular trends, are substantially greater in V than in M; indeed, the latter usually continues to rise during business contractions, although at a diminished rate. These cyclical changes in velocity can be seen in Figure 1, which reproduces two income velocity series constructed by Friedman and Schwartz, one referring to the velocity of money defined broadly (total adjusted deposits plus currency outside banks) for the period 1869-1960, the other referring to money defined more narrowly (adjusted demand deposits plus currency outside banks) for the period 1915-1960. Cyclical swings in velocity are characteristic of all major sectors of the economy, but they are much more severe for businesses than for households and governmental units. Figure 1 also shows a pronounced and steady downtrend in velocity between the early 1880s and the late 1940s—a pattern that was first noted by
[Figure 1. Income velocity, plotted on a logarithmic scale. Series shown: velocity of currency plus demand deposits; velocity of money. Source: Friedman & Schwartz 1963, p. 640.]
Warburton (1945; 1949). The latter's studies, based on admittedly crude data, suggested that Vy has been declining at a rate of about 1-J- per cent per year since the beginning of the nineteenth century. It is interesting that the more elaborate study of Friedman and Schwartz (1963), covering nine decades ending in 1960, also found a declining trend of slightly over 1 per cent per year. These findings are particularly interesting because they are contrary to the expectation, held by Fisher (1911) and others, that velocity would rise over time. Comparable statistics over extended time periods are lacking for other countries, but fragmentary evidence compiled by Doblin (1951) strongly suggests that secular velocity declines have been a world-wide phenomenon, at least through the 1940s.

Since the end of World War II, on the other hand, Vd and Vy have been rising steadily, except for minor cyclical interruptions. The postwar rise shows up regardless of how spending and money are defined, although the rise is dampened considerably if one follows Friedman and Schwartz and defines M broadly to include commercial bank time deposits as well as demand deposits and currency. Moreover, sectoral studies reveal that the postwar velocity rise has taken place in every sector for which data are available. There has been much controversy over the nature of postwar velocity movements: whether the rise represents a fundamental break with the past or is merely a readjustment from abnormally low levels in the 1930s and during World War II. We shall have more to say on this matter in the next section.

In addition to the temporal changes in velocity already mentioned, there are noteworthy cross-sectional differences at any point in time. Perhaps the most familiar of these differences are in Vd for New York City, for six other major centers, and for the remaining centers for which information is compiled. In 1963 these figures were 84.8, 44.6, and 29.0, respectively.
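The secular trend rates quoted earlier in this section (a decline of slightly over 1 per cent per year) are the kind of figure obtained by fitting a log-linear trend to a velocity series. The sketch below is a minimal illustration of that calculation on a synthetic series constructed to fall by exactly 1 per cent per year; the data, the starting level, and the function name are assumptions made here, and the fit is ordinary least squares, which is not necessarily the method used in the studies cited.

```python
import math
from typing import Sequence


def trend_rate(years: Sequence[float], values: Sequence[float]) -> float:
    """Least-squares slope of log(value) on year (continuous annual rate)."""
    n = len(years)
    logs = [math.log(v) for v in values]
    mean_x = sum(years) / n
    mean_y = sum(logs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, logs))
    sxx = sum((x - mean_x) ** 2 for x in years)
    return sxy / sxx


# Synthetic series: velocity starts at 4.0 and falls 1 per cent each year.
years = list(range(1870, 1961))
series = [4.0 * 0.99 ** (year - 1870) for year in years]

rate = trend_rate(years, series)
percent_per_year = (math.exp(rate) - 1.0) * 100.0
remaining_share = series[-1] / series[0]

print(f"fitted trend: {percent_per_year:.2f} per cent per year")
print(f"level after 90 years relative to start: {remaining_share:.3f}")
```

A steady decline of 1 per cent per year compounds to a fall of roughly 60 per cent over ninety years, which gives a sense of how large the cumulative change implied by such a trend is.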
Among the major sectors covered by Federal Reserve flow-of-funds accounts, corporate business has consistently had higher velocity ratios than noncorporate business, which in turn has higher ratios than the consumer and nonprofit sectors. State and local governments, the farm sector, and nonbank financial intermediaries hold large amounts of cash per dollar of spending, while the federal government operates with relatively small cash balances, though not so small as those of corporate business. Within the business sector there are further interesting differences by industry and by size of firm. Wholesale and retail trade are high velocity sectors, manufacturing is intermediate, and mining and public utilities maintain low velocity ratios. Until recently small firms have tended to have higher velocities than large firms; however, during the general velocity rise of the 1950s the velocities of very large firms rose much more rapidly than those of medium-size and small firms. By the end of the decade most of the earlier size differentials had been eliminated.

Determinants of velocity

Fisher, Marshall, Pigou, and Wicksell. Although a number of early thinkers gained important insights into the problem of what determines monetary velocity, it is fair to say that real progress dates from the first decade or two of this century, with the contributions of Fisher (1911), Marshall (1923), Pigou (1917), and Wicksell (1906). These men worked more or less independently (except Pigou, who was Marshall's student and colleague), and they developed rather different modes of analysis. In fact, Marshall and Pigou chose to work with the reciprocal of velocity, which they misleadingly designated k, rather than with velocity itself. Yet the substance of their analyses was remarkably similar. In each case emphasis was placed on more or less mechanical relationships between payments and receipts. This is evident from Fisher's formal listing of influences on velocity:

1. Habits of the individual.
    (a) As to thrift and hoarding.
    (b) As to book credit.
    (c) As to the use of checks.
2. Systems of payments in the community.
    (a) As to frequency of receipts and disbursements.
    (b) As to regularity of receipts and disbursements.
    (c) As to correspondence between times and amounts of receipts and disbursements.
3. General influences.
    (a) Density of population.
    (b) Rapidity of transportation.

However, implicitly or explicitly all of these economists assigned some role to the rate of interest as a velocity determinant.
This comes out most clearly in Pigou's work (1917). Perhaps the major stumbling block in these early analyses was the sterile manner in which velocity (or its reciprocal) was related to the demand for money. It was recognized that velocity and the
demand for money are intimately related: a rise (fall) in V implies a fall (rise) in the demand for money. However, the neoclassical depiction of the demand for money necessarily took the form of a rectangular hyperbola. M was placed on the horizontal axis; the value of money, 1/P, on the vertical. For given levels of V and T, M times 1/P is fixed; that is, real cash balances are constant. Variations in T/V would cause a shift in the demand curve, but the new curve would again be a rectangular hyperbola. This pseudo integration of monetary theory with orthodox price theory was a cul-de-sac which impeded progress in velocity theory for a generation. To a large extent the theoretical advances made by Angell (1936; 1941), Ellis (1938), and others in the 1920s and 1930s were merely refinements of the technical payments factors isolated earlier by Fisher. The interesting contributions made more recently by Garvy (1959a; 1959b) represent a further development in this direction.

The Hicksian-Keynesian revolution. The transition into modern velocity analysis began with Hicks's famous article (1935) and Keynes's General Theory (1936). Both of these works proposed that the demand for money be analyzed by setting M against the cost of holding it rather than against its exchange value (1/P), cost being measured by forgone yields on other assets. Unfortunately, the analysis was not carried much beyond this. Furthermore, Keynes's discussion, which attracted more attention than Hicks's, was built around the arbitrary distinction between active and idle cash; velocity received scant explicit attention. In fact, Keynes ridiculed "those who make sport with velocity." Many years passed, therefore, before it became generally recognized that the Keynesian discussion of "liquidity preference" was a disguised analysis of velocity.
Insofar as they have been expressly concerned with velocity theory, most Keynesian economists have emphasized the causal role of interest rates, low (high) rates being associated with low (high) velocities.

Postwar developments. The most significant advances in velocity theory in the postwar period have been, essentially, elaborations of Hicks's 1935 contribution. It is now widely accepted that velocity must be analyzed in the framework of the demand for money and that orthodox demand theory can be applied in a fairly straightforward manner to the demand for the services of money. However, the "price" variable (the cost of holding money) has been refined considerably, and attention has been directed increasingly to the impact
of such nonprice determinants as income, wealth, money substitutes, tastes, and expectations.

The cost of holding money. Despite a number of interesting contributions, economists remain sharply divided over the role of the cost of holding money as a determinant of V. On the level of pure theory, Baumol (1952) and Tobin (1956) demonstrated that there are good reasons for thinking that, contrary to the earlier Keynesian emphasis, the demand for transactions balances is a function of interest rates. More significantly, several empirical studies were made. Cagan (1956) found striking relationships during hyperinflations in a number of countries between real balances (and presumably V) and the rate of change of the price level. On the basis of annual data for the United States for 1907-1958, Latane (1954; 1960) concluded that desired holdings of demand deposits plus currency, per dollar of gross national product, were fairly responsive to changes in corporate yields. Meltzer (1963b), using measures similar to those of Latane, also found a strong interest-rate effect on velocity for 1900-1958. In addition to these aggregate time series studies, Selden (1962) and Meltzer (1963a) made cross-section analyses of velocity and the demand for money among American business firms, and found strong indications of interest-rate effects. On the other hand, Friedman (1959, p. 345), in a study of velocity movements over the period 1870-1954, concluded:

    A rise in the bond yield tends to reduce the real stock of money demanded for a given real income (that is, to raise velocity) and conversely. Bond yields, however, play nothing like so important and regularly consistent a role in accounting for changes in velocity as does real income. The short-term interest rate was even less highly correlated with velocity than the yield on corporate bonds.

In part these differences in emphasis reflect differing concepts, measures, and time periods used in the various statistical tests.
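The theoretical point credited above to Baumol (1952) and Tobin (1956) is often summarized by the familiar square-root rule for transactions balances. The sketch below illustrates that rule as it is commonly stated in textbooks, not as a reproduction of either author's derivation: the average cash balance is taken to be sqrt(bT/2i), where T is the volume of transactions per period, b is a fixed cost per conversion of securities into cash, and i is the interest rate. The parameter values and function names are hypothetical.

```python
import math


def transactions_balance(transactions: float, conversion_cost: float,
                         interest_rate: float) -> float:
    """Average cash balance under the square-root rule: sqrt(b*T / (2*i))."""
    if interest_rate <= 0:
        raise ValueError("interest rate must be positive")
    return math.sqrt(conversion_cost * transactions / (2.0 * interest_rate))


def transactions_velocity(transactions: float, conversion_cost: float,
                          interest_rate: float) -> float:
    """Transactions per period divided by the average cash balance."""
    balance = transactions_balance(transactions, conversion_cost, interest_rate)
    return transactions / balance


# Hypothetical values: annual transactions and cost per conversion.
T, b = 12_000.0, 2.0
for i in (0.02, 0.04, 0.08):
    m = transactions_balance(T, b, i)
    v = transactions_velocity(T, b, i)
    print(f"interest rate {i:.0%}: average balance {m:8.1f}, velocity {v:6.2f}")
```

Under this rule a fourfold rise in the interest rate halves desired balances and doubles velocity, that is, the interest elasticity of velocity is one-half. How closely actual behavior conforms to such a rule is precisely what the empirical studies cited above dispute.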
Friedman, in contrast with Latane and Meltzer, included commercial bank time deposits in the money stock, and his period of analysis is substantially longer. But the differences also reflect the fact that in Friedman's work the effect of interest rates on V was examined after allowing for the effect of changes in real income per capita. Aside from these extensive empirical investigations, there was increasing concern, in general commentaries on monetary problems during the 1950s, with the interest elasticity of velocity. It was frequently contended that during periods of rising demand for goods and services, banks and
other lenders can easily sell securities on the open market and use the proceeds to finance additional spending. Thus, while the monetary authorities can keep M from expanding at such times, they may be unable to prevent inflationary increases in V. However, the validity of this line of argument depends on (1) the terms on which the holders of cash are willing to acquire additional securities and (2) the terms on which prospective spenders are willing to incur additional debt. If the first of these relationships is highly interest-inelastic while the second is not, then lenders have little power to circumvent monetary policy. But the facts concerning these interest elasticities, and hence the interest elasticity of V, need much further study before any definite conclusions can be reached.

Other hypotheses. A number of economists, including Warburton (1949), Selden (1956), and Friedman (1959), have studied the role of per capita real income as a velocity determinant. Friedman's analysis is particularly interesting, in that he relies on income changes to explain not only broad secular movements in V but cyclical movements as well. This is done by use of a "permanent income" hypothesis. [See MONEY, article on QUANTITY THEORY, for additional discussion.] As income rises secularly, corresponding to rises in permanent income, the demand for money rises faster than income; hence, the ratio of income to the money stock (Vy) falls. On the other hand, during cyclical expansions measured income rises faster than permanent income; hence, Vy rises. Friedman was able to explain nearly all velocity movements in the United States between 1870 and 1954 in terms of this permanent income hypothesis. However, the persistent rise in Vy, despite rising real incomes, during the 1950s and early 1960s has created a problem for all of these income approaches. The postwar rise in V has stimulated economists to propose other explanations as well.
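The logic of the permanent income explanation described above can be made concrete with a small numerical sketch. It assumes, purely for illustration, that money demand is proportional to permanent income raised to a power greater than one, while measured velocity is measured income divided by the money stock. The functional form, the elasticity of 1.8, the scale factor, and all income figures are assumptions chosen for the example; any elasticity above one gives the same qualitative pattern.

```python
def money_demand(permanent_income: float, scale: float = 0.005,
                 elasticity: float = 1.8) -> float:
    """Assumed money demand: scale * permanent_income ** elasticity."""
    return scale * permanent_income ** elasticity


def income_velocity(measured_income: float, permanent_income: float) -> float:
    """Measured income divided by money demanded at the given permanent income."""
    return measured_income / money_demand(permanent_income)


# Secular growth: measured and permanent income rise together.
v_early = income_velocity(measured_income=100.0, permanent_income=100.0)
v_late = income_velocity(measured_income=200.0, permanent_income=200.0)

# Cyclical expansion: measured income runs ahead of permanent income.
v_boom = income_velocity(measured_income=210.0, permanent_income=200.0)

print(f"velocity, early period:      {v_early:.2f}")
print(f"velocity, later period:      {v_late:.2f}")
print(f"velocity, cyclical upswing:  {v_boom:.2f}")
```

Velocity falls between the two periods because money demand grows faster than income, yet it rises in the upswing because the numerator (measured income) moves while the denominator, tied to permanent income, does not.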
Some have stressed the greater sense of economic security in the postwar world because of the altered economic role of government. Others have pointed out the generally inflationary environment that characterized the 1940s and much of the 1950s, making cash an unattractive asset to hold. However, other than changes in interest rates and income, the factor that has received most attention as a velocity determinant has been wealth. The role of financial wealth has been singled out by Gurley and Shaw (1960, pp. 177-179), who point out that in its broad historical contours the ratio of income to all financial assets has followed a pattern similar to that of the ratio of income to the money stock.
Certainly the growth of money substitutes in the form of claims against nonbank financial intermediaries has been an outstanding feature of the postwar world. A different kind of wealth hypothesis has been put forth by Meltzer (1963b), who found a close multiple correlation between V, corporate bond yields, and nonhuman tangible wealth over the period 1900-1958.

It is clear from these various studies that economists are still some distance from reaching a consensus on the determinants of velocity. Nevertheless, the studies indicate that the velocity concept continues to preoccupy a large number of economists and that important progress has been made.

RICHARD T. SELDEN

BIBLIOGRAPHY
ANGELL, JAMES W. 1936 The Behavior of Money: Exploratory Studies. New York: McGraw-Hill.
ANGELL, JAMES W. 1941 Investment and Business Cycles. New York: McGraw-Hill.
BAUMOL, WILLIAM J. 1952 The Transactions Demand for Cash: An Inventory Theoretic Approach. Quarterly Journal of Economics 66:545-556.
BRONFENBRENNER, MARTIN; and MAYER, THOMAS 1960 Liquidity Functions in the American Economy. Econometrica 28:810-834.
CAGAN, PHILLIP 1956 The Monetary Dynamics of Hyperinflation. Pages 25-117 in Milton Friedman (editor), Studies in the Quantity Theory of Money. Univ. of Chicago Press.
DOBLIN, ERNEST 1951 The Ratio of Income to Money Supply: An International Survey. Review of Economics and Statistics 33:201-213.
ELLIS, HOWARD S. (1938) 1951 Some Fundamentals in the Theory of Velocity. Pages 89-128 in American Economic Association, Readings in Monetary Theory. Philadelphia: Blakiston.
FISHER, IRVING (1911) 1920 The Purchasing Power of Money: Its Determination and Relation to Credit, Interest and Crises. New ed., rev. New York: Macmillan.
FRIEDMAN, MILTON 1959 The Demand for Money: Some Theoretical and Empirical Results. Journal of Political Economy 67:327-351.
FRIEDMAN, MILTON; and SCHWARTZ, ANNA J. 1963 A Monetary History of the United States: 1867-1960. National Bureau of Economic Research, Studies in Business Cycles, No. 12. Princeton Univ. Press. -> Copyright © 1963, by National Bureau of Economic Research.
GARVY, GEORGE 1959a Deposit Velocity and Its Significance. New York: Federal Reserve Bank of New York.
GARVY, GEORGE 1959b Structural Aspects of Money Velocity. Quarterly Journal of Economics 73:429-447.
GURLEY, JOHN G.; and SHAW, EDWARD S. 1960 Money in a Theory of Finance. With a mathematical appendix by Alain C. Enthoven. Washington: Brookings Institution.
HICKS, JOHN R. (1935) 1951 A Suggestion for Simplifying the Theory of Money. Pages 13-32 in American Economic Association, Readings in Monetary Theory. Philadelphia: Blakiston.
HOLTROP, MARIUS W. 1929 Theories of the Velocity of Circulation of Money in Earlier Economic Literature. Economic History 1:503-524.
KEYNES, JOHN MAYNARD (1930) 1958-1960 A Treatise on Money. 2 vols. London: Macmillan. -> Volume 1: The Pure Theory of Money. Volume 2: The Applied Theory of Money.
KEYNES, JOHN MAYNARD 1936 The General Theory of Employment, Interest and Money. London: Macmillan. -> A paperback edition was published in 1965 by Harcourt.
LATANE, HENRY A. 1954 Cash Balances and the Interest Rate: A Pragmatic Approach. Review of Economics and Statistics 36:456-460.
LATANE, HENRY A. 1960 Income Velocity and Interest Rates: A Pragmatic Approach. Review of Economics and Statistics 42:445-449.
MARGET, ARTHUR W. 1938 The Theory of Prices: A Reexamination of the Central Problems of Monetary Theory. Vol. 1. New York: Prentice-Hall.
MARSHALL, ALFRED (1923) 1960 Money, Credit & Commerce. New York: Kelley.
MELTZER, ALLAN H. 1963a The Demand for Money: A Cross-section Study of Business Firms. Quarterly Journal of Economics 77:405-422.
MELTZER, ALLAN H. 1963b The Demand for Money: The Evidence From the Time Series. Journal of Political Economy 71:219-246.
PIGOU, ARTHUR C. (1917) 1951 The Value of Money. Pages 162-183 in American Economic Association, Readings in Monetary Theory. Philadelphia: Blakiston.
PIGOU, ARTHUR C. (1927) 1929 Industrial Fluctuations. 2d ed. London: Macmillan.
SELDEN, RICHARD T. 1956 Monetary Velocity in the United States. Pages 177-257 in Milton Friedman (editor), Studies in the Quantity Theory of Money. Univ. of Chicago Press.
SELDEN, RICHARD T. 1962 The Postwar Rise in the Velocity of Money: A Sectoral Analysis. New York: National Bureau of Economic Research.
TOBIN, JAMES 1947 Liquidity Preference and Monetary Policy. Review of Economics and Statistics 29:124-131.
TOBIN, JAMES 1956 The Interest-elasticity of Transactions Demand for Cash. Review of Economics and Statistics 38:241-247.
WARBURTON, CLARK 1945 Volume of Money and the Price Level Between Two World Wars. Journal of Political Economy 53:150-163.
WARBURTON, CLARK 1949 The Secular Trend in Monetary Velocity. Quarterly Journal of Economics 63:68-91.
WICKSELL, KNUT (1906) 1935 Lectures on Political Economy. Volume 2: Money. London: Routledge. -> First published in Swedish.

IV MONETARY REFORM
In its broadest sense, the term "monetary reform" refers to any programs or measures intended to change basic features of a nation's monetary and banking system. Recently the term has been extended to include proposals for reform of the international financial mechanism through fundamental changes in the present system of operations under the gold exchange standard. But in its most commonly accepted sense, the term relates to the comprehensive stabilization programs adopted in many European countries after World War II with a view to ending monetary disorders or disorganization and re-establishing a well-functioning currency system.

Monetary reform programs after World War II typically provided for a reduction in varying degrees of the liquid asset holdings of the public. While differing in many respects from country to country, the reforms always involved a withdrawal of most, and on occasion all, of the outstanding currency and the issue of a new currency. In most countries that adopted such programs, only a small part of the currency holdings was directly converted into a new currency; the remainder had to be deposited in banks. All or a large part of the balances in bank accounts were usually blocked, with withdrawals or transfers permitted only up to specified amounts or for specified purposes. In some cases, a substantial proportion of the blocked deposits was eventually wiped out. Several reform programs were associated with fiscal measures of varying sorts, such as capital levies or war-profits taxes. In a few cases the compulsory exchange of some of the blocked deposits into nonmarketable government securities was required.

Background and objectives. The objectives of the reform programs can be readily understood given the monetary situation prevailing through most of continental Europe during World War II and immediately afterward. In German-occupied Europe, the diversion of goods and services to the occupation armies, and similar exactions, were typically financed by central banks. The same was true of the large export surpluses vis-a-vis Germany. In that country, only a relatively small part of the war effort was financed out of taxes and public subscriptions to government bonds.
Following liberation of western Europe and the occupation of Germany and Austria, the Continent was subjected to new financial strains. The allies' military expenditures for local supplies and services, and particularly the spending of military currency by their armies, added to monetary disorders during a period of severe disruption of the civilian economy. Yet, despite the vast accumulation of liquid reserves in the hands of the public throughout the Continent and the shrinkage of civilian production, the familiar signs of open inflation—rapidly rising prices and wages and skyrocketing currency circulation—were largely confined to France, Italy, and southeastern Europe. The reason was a rigid
enforcement of comprehensive price, wage, and allocation controls. Experience in postwar Europe demonstrates that when shelves are bare of all save the most essential supplies and actual economic transactions are at a bare minimum, considerable scope exists for the effective enforcement of such controls. [See PRICES, article on PRICE CONTROL AND RATIONING.] As conditions for a recovery of production were re-established, the effectiveness of controls rapidly diminished. Even then in some parts of Europe they were fairly effective in preventing price inflation. But it became apparent that repressed inflation was exerting a deactivating, if not disintegrating, effect on economic life. Farmers resisted selling in legal markets for money with which there was little to buy and which was likely to depreciate; they preferred to barter their produce for consumer goods, including jewelry and other valuables that could serve as hoarding media. Manufacturers were reluctant to use up their remaining stocks of raw materials and semiprocessed goods and preferred to produce not for sale but primarily for the purpose of adding to their inventories. Consumers with large hoards of unwelcome funds at their disposal had no incentive to work at legal wage rates, payment of which added little to their purchasing power in real terms and merely left them with so much more unusable cash holdings. In some countries, notably Germany, money was thus increasingly repudiated as a medium for effecting transactions, and a growing segment of trade moved entirely outside the traditional money economy. Farmers and manufacturers, as well as traders, turned to barter and so-called "compensation trading," with sales of goods tied to the delivery of usable products. Elsewhere, several heterogeneous market spheres existed side by side, with gray and black markets taking over an ever larger share of the distribution of current output. 
Currency in circulation tended to be used only as one of several media of exchange in illegal market deals and as a supplement to the ration ticket in transactions at authorized prices. Especially in Italy and to a lesser extent in France, the control mechanism had largely broken down and open inflation taken hold. In some parts of southeastern Europe, particularly in Hungary and Greece, hyperinflation reigned after the end of the war. Thus in much of postwar Europe a basic task for civilian and military governments was to mop up idle money before it leaked into illegal markets and undermined the control mechanism and to rehabilitate the monetary system so that producers, whether farmers or manufacturers, would again be
responsive to incentives to sell for monetary compensation and workers would depend on current income instead of past savings. A longer-term objective was to make the economy more amenable to the traditional controls of monetary policy. In those parts of Europe where inflation was no longer repressed, the task of monetary reform was to re-establish public confidence in money as a store of value.

Several other major objectives of monetary reform programs had little to do with the removal of excess liquidity. Among such purposes were a census of wealth, the detection of war profiteering and tax evasion, the cancellation of currency held by the enemy, and the unification of the currency in countries where several currencies had been introduced during the war. In some countries, political objectives also played a major role; reforms were directed at depriving certain socioeconomic groupings of most or all of their savings. This was true particularly in the countries of Soviet-occupied Europe, where monetary reforms had the incidental aim of strengthening the planning and allocation system. In sharp contrast, one of the major objectives of the West German currency reform was to revitalize free market forces and to permit the price mechanism to reassert itself as the decisive determinant of economic behavior.

Types. Despite the variety of their purposes, monetary reform programs can be classified by a few basic types, although of course few programs fall wholly in any one category.
A useful classification, based on the method of reform employed, distinguishes (1) those that reduce the money supply by canceling part of the currency in circulation and part of existing bank deposits; (2) those that reduce the money supply by directing part of it into bank deposits, which are then to some extent demonetized or deactivated; (3) those that provide for conversion of the outstanding currency into another currency, without any significant blocking of bank deposits; and (4) those that virtually replace the entire money circulation with a new unit of account, after the pre-existing unit has depreciated to an infinitesimal fraction of its original value. Further useful lines of distinction may be based on whether or not the programs include fiscal devices, such as capital levies directed at absorbing significant amounts of funds held by owners of real, rather than monetary, assets. (For a somewhat different typology of monetary reforms, see Gurley 1953.)

(1) Cancellation: Germany. Monetary reform programs of the first type, featuring a severe reduction of the money supply by simply wiping out large portions of outstanding notes and deposits, were enacted in West Germany and several eastern European countries. West Germany's program, enacted in June 1948, is of special interest because it was a resounding success and a turning point in the postwar history of that country. Under a series of decrees by the occupation powers, individuals were issued Deutsche mark (DM) 60 in exchange for an equal amount of old reichsmark (RM) holdings, and DM60 per employee were paid out to businesses for payroll purposes. Business holdings and individual holdings in excess of the converted amount were credited to bank accounts. Only a small fraction of all bank deposits was eventually converted into Deutsche marks, the great bulk being simply wiped out. For individuals the ultimate conversion ratio was in effect one-to-one for original holdings of no more than RM60, between one-to-one and ten-to-one for those holdings between RM60 and RM600, and between ten-to-one and slightly over fifteen-to-one for those holdings over RM600. The effective conversion ratios were more favorable, however, for heads of families and for businesses with more than one employee, becoming less onerous with increasing size of family or firm. All bonds, mortgages, annuities, and other forms of private indebtedness were written down by 90 per cent; but prices, wages, rentals, and similar payments had to be converted at the one-to-one ratio. Cash holdings of public bodies were canceled and replaced by Deutsche mark allotments based on average monthly receipts over a given period. The government security holdings of financial institutions were simply canceled. Banks received cash reserves and state equalization claims in amounts equal to their new liabilities plus an allotment of 5 per cent of deposit liabilities, the counterpart of which constituted the capital account of their balance sheets. Similar provisions applied to insurance companies and other financial institutions.
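The effective conversion ratios for individuals quoted above can be reproduced with a simple piecewise schedule. The sketch below is a reconstruction under stated assumptions, not a statement of the decrees themselves: it assumes that the first RM60 were exchanged one-to-one, that the DM60 so received was charged against the first RM600 of holdings, and that holdings above RM600 were ultimately converted at DM6.50 per RM100. The last figure is an assumption chosen because it yields the "slightly over fifteen-to-one" limit stated in the text; the function names are likewise hypothetical.

```python
def deutsche_marks_received(reichsmarks: float,
                            head_allowance: float = 60.0,
                            base_ratio: float = 10.0,
                            excess_rate: float = 0.065) -> float:
    """DM received by one individual for a given RM holding (assumed schedule)."""
    if reichsmarks < 0:
        raise ValueError("holdings cannot be negative")
    if reichsmarks <= head_allowance:
        return reichsmarks                  # one-to-one up to RM60
    threshold = head_allowance * base_ratio  # RM600
    if reichsmarks <= threshold:
        return head_allowance               # DM60 charged against up to RM600
    # Assumed rate on the excess over RM600.
    return head_allowance + excess_rate * (reichsmarks - threshold)


def effective_ratio(reichsmarks: float) -> float:
    """RM surrendered per DM received."""
    return reichsmarks / deutsche_marks_received(reichsmarks)


for holding in (60.0, 300.0, 600.0, 10_000.0, 1_000_000.0):
    print(f"RM{holding:>12,.0f}: {effective_ratio(holding):5.2f} to one")
```

Under these assumptions the ratio is one-to-one at RM60, reaches ten-to-one at RM600, and approaches but never exceeds 100/6.5 (about 15.4) to one for very large holdings, matching the ranges given in the text. The more favorable treatment of family heads and multi-employee businesses is not modeled.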
West Germany's monetary reform was not accompanied by a capital levy on real asset holdings. However, one of the military government laws providing for the reform called on appropriate German legislative bodies to frame the necessary legislation for the equalization of the war burden. Such legislation was subsequently adopted, along with laws that provided for special conversion rates applicable to deposit holdings of pensioners, refugees, savers, and selected groups of other liquid-asset holders. (2) Blocking—Belgium. Belgium provides an example of the second type of monetary reforms—
those that do not cancel any part of the money supply but reduce it by requiring the conversion of liquid holdings into illiquid assets and by imposing severe restraints on the spending of these illiquid assets. The Belgian program, executed in October 1944, was the forerunner of all other monetary reform measures in liberated Europe and probably the inspiration for several of the reform laws adopted elsewhere. For immediate needs, the head of each family could exchange old banknotes for new ones, on a one-to-one basis, up to the amount of 2,000 francs per family member; all remaining holdings of banknotes in denominations of 100 francs and higher had to be declared and deposited in blocked bank accounts. Simultaneously, all existing bank deposits were blocked. (A certain portion, representing either the amount held on the day preceding the German invasion or 10 per cent of the amount held immediately before the reform, was excepted; for business firms the exempted portion was 1,000 francs per employee.) A short time later, each deposit owner was permitted to withdraw an additional amount of up to 3,000 francs. Each blocked amount, whether arising from note deposits or from pre-existing deposits, was divided into two parts, with 40 per cent temporarily blocked and 60 per cent definitively blocked until a means for its disposition was determined. A series of general releases gradually deblocked the temporarily blocked deposits. At the end of 1945, the 60 per cent portion of previously deposited notes and frozen bank balances was converted into long-term nonnegotiable government bonds carrying an interest rate of 3.5 per cent; subsequently, a large part of these bonds was absorbed by a special tax program. (3) Simple conversion—Denmark, France.
Turning now to the third type of monetary reforms—those that convert the old currency into a new one, without significant contraction of the money supply—Denmark's currency exchange of July 1945 affords a good illustration. Its major objectives were to reduce currency holdings relative to bank deposits, to prevent the reimport into Denmark of German-held Danish currency, and to facilitate the taxation of war profits. The reform program called for a declaration of wealth, a limited exchange of banknote holdings, the depositing of excess holdings in blocked bank accounts, and the blocking of existing bank deposits if in excess of 10,000 kroner or in excess of 150 per cent of deposit holdings on the day of Denmark's invasion by Germany. Within five months, however, the blocked deposits were released, except those of tax evaders. The French currency reform of June 1945 had
the same purposes as that of Denmark but did not call for even a temporary blocking of deposits. The reform was accompanied by a progressive capital levy and a capital gains tax, with payment of these taxes spread over several years. In February 1948 the French government withdrew all 5,000 franc notes in circulation, and amounts in excess of 10,000 francs were returned to their owners only after they had discharged certain tax liabilities. But this measure did little more than sterilize part of the money supply for a short period. (4) Drastic conversion—Greece, Hungary. The monetary reform programs in Greece and Hungary, which exemplify the fourth type, were put into operation only after protracted periods of currency disturbances and not until inflation had brought about a depreciation of the currencies to an infinitesimal fraction of their prewar value. Special interest attaches to the Hungarian stabilization scheme of 1946, inasmuch as it brought to an end possibly the greatest inflation of history. Its special feature was that it provided for an internally consistent wage and salary structure designed to permit the distribution of scarce supplies at rigidly fixed prices. The program, executed in August 1946, called for the introduction of a new currency unit, the forint, to replace the pengo at the rate of 1 forint to 400 octillion pengo. (This conversion was preceded by the issue earlier in 1946 of a special currency, the so-called tax pengo, a monetary unit of account whose value was related to a price index expressed in terms of the regular pengo.) The reform program was based on computations of the gross national product in relation to its prewar level. Proportionate ceilings were set on wages, somewhat less favorable ceilings were established for salaries, and the income to be allocated to farmers and to manufacturers was related to the new money supply. 
The architects of the reform were insistent on limiting total income to the money value of available goods and services. The program was reinforced by a balanced budget, by the central bank's acquisition of dollars circulating in the country, and by the return of the gold removed by the Nazi regime. (For details, see Nogaro 1948.) Capital and increment levies. Many of the monetary reforms, notably those in western Europe, were accompanied by a census of both monetary and real assets. This served the purpose of laying the basis for capital levies and for taxes on capital increments and war profits—fiscal devices that in several countries, including Denmark and Norway, played a central role in the reform program. The motive, apart from the obvious desire to confiscate
profits resulting from trading with the enemy and illegal transactions, was to distribute the financial burden of monetary sanitation programs more equitably between holders of monetary and real wealth. By and large, monetary reforms that involve a cancellation of currency and bank deposit holdings affect solely households and businesses that have been unable or unwilling to dispose of their liquid funds. Capital and increment levies, on the other hand, can be laid on property owners in approximate proportion to their share in, or gains of, real wealth as well as monetary assets. With few exceptions, capital levies and similar devices have failed to make a major contribution to achieving the objectives of monetary reforms, although some of them have produced handsome yields over time. Most of the nonmonetary property subject to such levies consists of real estate, buildings, plants, equipment, valuables, and securities. Quite apart from the valuation problems involved, such assets cannot be converted into cash with which to discharge the levy because of the absence of capital markets that could absorb large offerings. In actual practice, the collection of these levies had to be spread over many years, which meant that payment was usually made out of current income. The proceeds were rarely employed for the redemption of government debt and contraction of the money supply. In some countries, notably Belgium and the Netherlands, such levies were in part paid out of blocked accounts or nonnegotiable government securities into which blocked accounts had been converted. But even in these two countries, individual tax liabilities often substantially exceeded blocked or nonnegotiable asset holdings. Postwar experience has demonstrated that capital and increment levies, whatever their merit from the viewpoint of social justice and equity, give rise to highly complex assessment and collection problems. 
For this reason they do not commend themselves as an effective tool for the removal of a monetary overhang. Evaluation. Not many additional generalizations about the efficacy of monetary reforms can be made with any assurance; there has been too much diversity in both the design and the execution of reform programs. On the whole, the preventive, ameliorative, and bracing effects of the more far-reaching measures, at least during the six months or year after their adoption, may be judged as quite impressive. In two or three countries, particularly in Germany, the effects of the reforms in stimulating the economy were truly remarkable; in several other countries they succeeded in eliminating black
markets, at least temporarily, and in restoring the public's waning faith in the worth of money as a store of value. The resurgence of both open and repressed inflation and the re-emergence of black markets in many countries relatively soon after the completion of reforms should not be laid at their door, except in the few cases where the scope of the measures was so narrow as to cast doubt on the propriety of their designation as "reforms." In most cases, the reappearance of monetary maladies—which in several countries necessitated another sanitation program—should be attributed not to faulty or weak reforms but to subsequent inflationary monetary and fiscal policies and to the fact that in economies suffering from supply scarcities there was a low propensity to save and yet strong official pressures for investment. This is not to deny that several of the reforms were marred by economic disturbances. A number of technical mistakes were made in the preparation and execution of reform programs, including premature announcements of details, too scanty or too liberal releases of deposits, and misjudgments of the public's transaction requirements. But this need not evoke surprise, since the architects of at least the initial programs had few if any precedents to draw on in their decision making. From the viewpoint of equity, most monetary reform programs of the postwar period left much to be desired. The elimination or blocking of large proportions of the money supply without, or with scant, regard to the total wealth of its holders is a very crude device of the sledge-hammer variety, even if cushioned by exemptions for holders of small amounts of currency and bank deposits.
But social justice would probably not have been served any better if the money surfeits of postwar Europe had been permitted to be absorbed by rising prices or if the authorities had continued their largely unsuccessful attempts to suppress the manifestations of excessive monetary expansion. On balance, the evidence justifies the conclusion that postwar monetary reforms made a major contribution to economic recovery in Europe.
FRED H. KLOPSTOCK
BIBLIOGRAPHY
Currency Reform in Eastern Europe. 1946 Federal Reserve Bank of New York, Monthly Review 28:39-43.
Currency Reform in the Netherlands. 1946 Federal Reserve Bank of New York, Monthly Review 28:8-9.
DE RIDDER, VICTOR A. 1948 The Belgian Monetary Reform: An Appraisal of the Results. Review of Economic Studies 16, no. 1:25-40.
DUPRIEZ, LEON H. 1947 Monetary Reconstruction in Belgium. New York: King's Crown Press.
GROTIUS, FRITZ 1949 Die europäischen Geldreformen nach dem zweiten Weltkrieg. Parts 1-2. Weltwirtschaftliches Archiv 43:106-152, 276-325.
GURLEY, JOHN G. 1953 Excess Liquidity and European Monetary Reforms: 1944-1952. American Economic Review 43:76-100.
KLOPSTOCK, FRED H. 1946 Monetary Reform in Liberated Europe. American Economic Review 36, no. 4:578-595.
KLOPSTOCK, FRED H. 1948a Monetary and Fiscal Policy in Post-liberation Austria. Political Science Quarterly 63, no. 1:99-124.
KLOPSTOCK, FRED H. 1948b Western Europe's Attack on Inflation. Harvard Business Review 26, no. 5:597-612.
KLOPSTOCK, FRED H. 1949 Monetary Reform in Western Germany. Journal of Political Economy 57, no. 4:277-292.
NOGARO, BERTRAND 1948 Hungary's Recent Monetary Crisis and Its Theoretical Meaning. American Economic Review 38, no. 4:526-542.
PESEK, BORIS P. 1958 Monetary Reform and Monetary Equilibrium. Journal of Political Economy 66, no. 5:375-388.
SCHOUTEN, D. B. J. 1948 Theory and Practice of the Capital Levies in the Netherlands. Oxford University, Institute of Statistics, Bulletin 10, no. 4:117-122.
MONISM See PLURALISM.
MONOPOLISTIC COMPETITION See, in order of relevance, MONOPOLY; FIRM, THEORY OF THE; ADVERTISING, article on ECONOMIC ASPECTS.
MONOPOLY
In accordance with its etymological meaning of "one seller," the term "monopoly" in a strict sense refers to a situation in which a seller is the sole source of supply for an economic good that has no significant substitutes. The term is also applied more broadly, however, to any market in which the behavior of sellers is other than purely competitive. Since the price confronting the individual seller in pure competition, as determined by demand and supply in the market as a whole, is essentially independent of the quantity that he chooses to sell, monopoly in the broad sense characterizes the market position of any seller who has a significant degree of discretion about his price and whose quantity sold varies inversely with the price selected. In this sense, at least some degree of monopoly power is both widespread and practically unavoidable. Furthermore, monopoly and competition are not mutually exclusive elements but may both be present in any given market.
Finally, even when the behavior of individual sellers is purely competitive, any artificial restriction on their number or on the quantities that they are permitted to sell may also be classified as monopolistic. Simple monopoly. The elementary static theory of the profit-maximizing equilibrium of a simple monopolist may be illustrated by Figure 1. With the monopolist's product quantity, q, measured on the horizontal axis and such magnitudes as his price, p, and his unit cost, c, on the vertical axis, the diagram depicts illustrative revenue and cost schedules, both average and marginal. The nature of these data can be briefly explained. If the monopolist's total revenue and total cost are designated by R and C, respectively, the corresponding average magnitudes are defined as AR = R/q = p and AC = C/q = c. The AR schedule reflects the demand confronting the seller, since it shows the quantities that customers are willing to buy from him at various alternative prices. Its significantly downward slope, the hallmark of the seller's monopoly power, is in contrast to the essentially horizontal demand that would apply to the individual seller in pure competition. Depending on the context of the analysis, the cost curves may refer to either (1) the long run, when a firm can vary all relevant factors of production, including the number and sizes of its plants, or (2) a short run, when certain components of the firm's plant and equipment are temporarily fixed (giving rise to the distinction between fixed and variable costs). In the long run the AC curve slopes downward, at least initially, as a reflection of the economies of large-scale production. If it turns upward at some point, that reflects an eventual dominance of diseconomies of scale. In the short run AC slopes even more sharply downward initially, as a reflection of the spreading of the fixed or overhead
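[Figure 1: average and marginal revenue and cost curves (AR, MR, AC, MC) of a simple monopolist; horizontal axis q, vertical axis p, c; equilibrium values q0, p0, c0]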
costs over an increased output; it turns upward primarily because of plant-capacity limitations. The concepts of marginal revenue and marginal cost refer to the rate of increase of the corresponding total magnitudes per unit of extra output. If output is increased by some finite amount, Δq, giving rise to similarly finite increments of total revenue, ΔR, and total cost, ΔC, marginal revenue and marginal cost in their discrete versions are defined as MR = ΔR/Δq and MC = ΔC/Δq. For example, if Δq equals one unit, MR and MC reflect the increments of total revenue and total cost, respectively, occasioned by the production and sale of the one unit of extra output. If the basic variables are treated as continuous, the marginal concepts in their continuous versions become MR = dR/dq and MC = dC/dq—the first derivatives of the corresponding total-revenue and total-cost functions, R = R(q) and C = C(q). In this version MR and MC again represent the additional R or C per unit of additional output, but only as the limiting values that are approached when the addition to output is viewed as becoming indefinitely small. It is this definition that is implied when MR and MC are represented by continuous curves, as in Figure 1. As can readily be proved, a marginal magnitude is less than, equal to, or greater than its corresponding average according as the average curve is decreasing, constant, or increasing. The principal analytical significance of these marginal concepts is that the firm's profit-maximizing output is determined where the MR curve cuts the MC curve from above, as at the output q0 in Figure 1. This follows from the definitions of MR and MC, which imply that the firm's total profit (π = R - C) is increased or decreased by an expansion of output according as MR is greater or less than MC. The profit-maximizing price, p0, and unit cost, c0, are then indicated on the AR and AC curves, respectively, at q0.
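The rule that profit is maximized where MR cuts MC from above can be checked numerically. What follows is only a minimal sketch, assuming a linear demand p = A - B*q and a quadratic total cost C = F + C1*q + D*q^2; these functional forms and every parameter value are illustrative assumptions and are not taken from the article.

```python
from fractions import Fraction

# Hypothetical parameters (not from the article), chosen to give round numbers.
A, B = Fraction(100), Fraction(1)                      # demand: p = A - B*q
F, C1, D = Fraction(200), Fraction(20), Fraction(1)    # cost: C = F + C1*q + D*q**2


def average_revenue(q):
    """AR = R/q = p, the demand price at quantity q."""
    return A - B * q


def total_revenue(q):
    return average_revenue(q) * q


def total_cost(q):
    return F + C1 * q + D * q * q


def marginal_revenue(q):
    """MR = dR/dq for R = A*q - B*q**2."""
    return A - 2 * B * q


def marginal_cost(q):
    """MC = dC/dq for C = F + C1*q + D*q**2."""
    return C1 + 2 * D * q


def profit(q):
    return total_revenue(q) - total_cost(q)


# MR = MC  =>  A - 2*B*q = C1 + 2*D*q  =>  q0 = (A - C1) / (2*(B + D))
q0 = (A - C1) / (2 * (B + D))
p0 = average_revenue(q0)      # price read off the AR curve at q0
c0 = total_cost(q0) / q0      # unit cost read off the AC curve at q0
pi0 = profit(q0)

# Brute-force check on a grid of outputs: no grid point yields more profit.
grid = [Fraction(k, 10) for k in range(1, 500)]
best_on_grid = max(grid, key=profit)

print(q0, p0, c0, pi0, best_on_grid)
```

With these assumed numbers the sketch gives q0 = 20, p0 = 80, and c0 = 50, and the grid search confirms that no other output on the grid is more profitable.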
These values identify the equilibrium profit per unit as π0/q0 = p0 - c0; the maximized total profit, π0 = (p0 - c0)q0, is represented by the area of the shaded rectangle. In a popular layman's phrase, monopolists are often represented as "charging what the traffic will bear," but this is, at best, ambiguous. Notice that a smaller "traffic" than q0 would "bear" both a higher price and a higher profit per unit, whereas a lower price than p0 would allow a "traffic" greater than q0. In other words, to maximize total profit the monopolist must balance the favorable effect of a greater quantity against the unfavorable effect of a lower profit per unit, maximizing neither. On the other hand, a positive profit is not a defining characteristic of monopoly. Thus, if the AR
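[Figure 2: zero-profit equilibrium, with the AR curve tangent to the AC curve at the output where MR = MC; horizontal axis q, vertical axis p, c; equilibrium values p0, c0]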
curve is tangent to the AC curve at a single point (with AR < AC everywhere else), as in Figure 2, total profit is merely zero when maximized. As may be relevant in a short run, furthermore, a maximized profit may also be negative—when AR lies below AC at all possible outputs. Conversely, a purely competitive firm may also have positive profit, not only in a short run but in the long run as well (provided only that "profit" is defined in a consistent manner, as the excess of total revenue over the minimum total cost that must be covered if the firm is to be willing to go on producing the given output indefinitely). Monopoly broadly conceived. "Simple" (sometimes called "pure") monopoly, implying essentially a complete absence of competitive influence, is a rare phenomenon, because of the strict conditions that must be met. First, there must be no significant relationship of substitution or complementarity with the products of any other firm or group of firms, save for the inevitable interdependence of all products in the economy as a whole. Second, there must be no threat of potential competition from the possible entry into the market of a supplier of a significantly substitutable product. Then, if the monopolist can vary the output that he produces and sells without having any appreciable effect on the situation or decisions of any other firm in the economy, his equilibrium can be treated in analytical isolation from the remainder of the economy—as that of a one-firm "industry." Other forms of monopoly, to be distinguished from this simple species, include several forms of oligopoly, complementary monopoly, and the nonoligopolistic type of competition among a large number of suppliers of "differentiated" products—that is, products that are relatively close, but not perfect, substitutes. Oligopoly exists when there are relatively few
suppliers of at least relatively close substitutes or when relatively few suppliers account for the predominant part of the industry's total supply [see OLIGOPOLY]. It is classified as "pure" or "differentiated" oligopoly according as the oligopolists' products are perfect or imperfect substitutes. The distinctive feature of oligopoly is the appreciable effect that the individual firm has on the situation of its rivals through its own price, output, and other business decisions, since induced reactions of one kind or another on the part of the rivals are then highly likely. This means that it is highly unlikely that an oligopolist would suppose that the demand curve relevant for his market decisions would be one drawn on an assumption that either the prices or the outputs of his rivals were constant. Yet, since the reactions of rivals are, in general, uncertain, there is a corresponding uncertainty about the relevant demand confronting each oligopolist. This is why the oligopoly problem is such a conundrum, with a wide range of possible outcomes that embrace (1) various forms and degrees of collusion, whether explicit or tacit, toward one extreme and (2) various types and intensities of economic warfare toward the other. Except under conditions of warfare, however, the mutual awareness by oligopolists of their distinctive interdependence typically leads to higher prices and lower outputs than would be the case if that interdependence were ignored. Although less important in reality and comparatively neglected by analysts, cases in which a small number of monopolists supply goods that are either perfect or imperfect complements involve essentially the same type of interdependence as in oligopoly. Here, however, collusion works in the direction of lower prices and greater quantities sold than is the case when the interdependence is ignored or imperfectly appreciated. 
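The opposite effects of collusion for substitutes and for complements can be illustrated with a stylized two-seller example. The sketch below assumes linear demand, constant (or zero) marginal cost, and, for the case in which interdependence is ignored, that each seller treats the other's current choice as fixed. That last assumption is a Cournot-type simplification which the text does not spell out, and all function names and numbers are hypothetical.

```python
from fractions import Fraction


def substitutes(a, c):
    """Two sellers of a homogeneous good; market price P = a - (q1 + q2),
    constant marginal cost c for both.

    Returns (price when each takes the rival's output as given,
             price under joint profit maximization).
    """
    a, c = Fraction(a), Fraction(c)
    # Best response: q_i = (a - c - q_j) / 2, so the symmetric solution is (a - c) / 3.
    q_each = (a - c) / 3
    assert q_each == (a - c - q_each) / 2  # fixed point of the best response
    price_independent = a - 2 * q_each
    # Joint maximization of (P - c) * Q with P = a - Q gives Q = (a - c) / 2.
    price_collusive = a - (a - c) / 2
    return price_independent, price_collusive


def complements(a):
    """Two monopolists whose goods are used one-for-one together; quantity
    sold q = a - (p1 + p2), zero cost.

    Returns (combined price when each takes the other's price as given,
             combined price under joint profit maximization).
    """
    a = Fraction(a)
    # Best response: p_i = (a - p_j) / 2, so the symmetric solution is a / 3.
    p_each = a / 3
    assert p_each == (a - p_each) / 2  # fixed point of the best response
    total_independent = 2 * p_each
    # Joint maximization of T * (a - T) over the combined price T gives T = a / 2.
    total_collusive = a / 2
    return total_independent, total_collusive


sub_independent, sub_collusive = substitutes(a=100, c=10)
comp_independent, comp_collusive = complements(a=90)
print(sub_independent, sub_collusive)    # collusion raises the price of substitutes
print(comp_independent, comp_collusive)  # collusion lowers the combined price of complements
```

Under these assumptions the direction of the effect matches the text: joint action raises the price when the goods are substitutes and lowers the combined price when they are complements. The magnitudes depend entirely on the assumed demand and cost figures.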
When differentiated products are supplied by a large number of small firms, oligopolistic relationships may be absent for much the same reason as in pure competition, because the small individual firm has only a negligible impact on the market position of any one of its many rivals even when it brings about a large relative change in its own sales. Accordingly, the demand relevant for this type of firm's decisions may be drawn on the assumption that all of its rivals' prices are constant— that is, independently determined. Given a sufficient degree of product differentiation, however, the demand confronting this type of firm slopes downward significantly, in contrast to the essentially horizontal demand facing the pure competitor. This market form may be called "differentiated
competition," to distinguish it from pure competition in the same way that differentiated oligopoly is distinguished from pure oligopoly. As to the equilibrium of the individual firm in differentiated competition, this is much the same as in simple monopoly. These two market forms differ, however, in that there is a problem of "group equilibrium" in differentiated competition, just as there is in pure competition. Differentiated competition typically involves not only a simultaneous equilibrium of each individual firm but also an equilibrium adjustment of the number of firms in the long run, through entry or exit in response to profits or losses. In a theoretically important, though hardly realistic, case in which all the actual and potential members of the group have like costs and face like demands, the ultimate long-run equilibrium is of the zero-profit type—with each firm's demand, or AR, curve just tangent to its AC curve, as in Figure 2. As emphasized by Edward H. Chamberlin (1933), the pioneer in the analysis of this and other forms of "monopolistic competition," this heroically simplified special case is merely a convenient illustration of the group-equilibrium concept. Realistically, various asymmetries of both demand and cost are overwhelmingly likely, with resulting differences of price, output, and profit among various firms in the group even in a state of long-run equilibrium. Degree of monopoly power. In the sense that the situation of an individual seller may differ by much or little from that of a pure competitor, his monopoly power is a matter of degree. In the limiting case of pure competition, where the individual seller faces a horizontal demand so that AR and MR coincide, the maximizing of profit at the point where MR = MC also implies an equating of AR and MC. When the AR curve slopes downward so that the MR curve lies below it, however, the equating of MR and MC implies that AR > MC. This led Abba P.
Lerner (1934) to formulate as a quantitative measure of a firm's degree of monopoly power the ratio M = (AR - MC)/AR. That ratio is then zero for the profit-maximizing pure competitor, whereas for the profit-maximizing monopolist it is positive and would approach unity as an upper limit if his price or AR was visualized as becoming indefinitely large, relative to a given positive MC. The degree of monopoly power of a firm in static profit-maximizing equilibrium is closely related to the "elasticity" of demand at the equilibrium point. That elasticity, defined as E = (dq/dp)(p/q), measures the percentage change in quantity demanded, q, relative to the associated percentage change in price, p. It can then be shown that E = AR/(MR - AR), since MR = d(pq)/dq = p + q(dp/dq) = AR(1 + 1/E). Accordingly, when MR = MC it follows that M = -1/E. This is consistent with the implication that the horizontal demand facing a purely competitive firm is "perfectly" or "infinitely" elastic, whereas the elasticity of the downward-sloping demand facing a monopolist is finite and greater than unity in absolute value at a point where profit is a maximum and MC is positive. On the other hand, the relevance of M as a measure of the degree of monopoly power is not limited to situations of static equilibrium. Thus, whether or not MR is equated with MC or is determinate at all, the definition of M in terms of AR and MC makes it a measure of the degree of currently "exerted" monopoly power, even if its value would be different in some alternative position of static equilibrium. This is important, for sometimes (especially in oligopoly) MR may be basically uncertain or indeterminate, and very often monopolists of various types are persuaded by dynamic or "long run" considerations not to seek to maximize profit with respect to the demand and cost data that apply in a given short run. Furthermore, it is precisely the relationships of AR and MC in all of the firms throughout the economy that are of central interest in evaluating the efficiency of allocation of the economy's resources, as will be further discussed below. Other comparative implications. As a reflection of the horizontal demand facing a pure competitor and the resulting equilibrium equality of AR and MC, each pure competitor sells all that it wishes to at the prevailing price. By contrast, since the firm that faces a downward-sloping demand regularly chooses to operate where AR > MC, it remains eager to sell more at its chosen price. If it could sell more at that price, its profit would rise by the amount AR - MC for every extra unit of product sold.
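The relation between Lerner's measure and the elasticity of demand at the profit-maximizing point can be verified with a small numerical sketch. It assumes a linear demand p = A - B*q and a marginal cost MC = C1 + 2*D*q; the functional forms and parameter values are hypothetical, chosen only to give round numbers, and the function names are illustrative.

```python
from fractions import Fraction

# Hypothetical demand p = A - B*q and marginal cost MC = C1 + 2*D*q (illustrative only).
A, B = Fraction(100), Fraction(1)
C1, D = Fraction(20), Fraction(1)


def lerner_index(ar, mc):
    """Lerner's measure M = (AR - MC) / AR."""
    return (ar - mc) / ar


def elasticity(p, q, dq_dp):
    """Point elasticity of demand E = (dq/dp) * (p/q)."""
    return dq_dp * p / q


q0 = (A - C1) / (2 * (B + D))   # output where MR = MC
ar0 = A - B * q0                # price (AR) at q0
mr0 = A - 2 * B * q0            # marginal revenue at q0
mc0 = C1 + 2 * D * q0           # marginal cost at q0

M = lerner_index(ar0, mc0)
E = elasticity(ar0, q0, dq_dp=-1 / B)   # demand q = (A - p)/B has slope -1/B

print(M, E, ar0 / (mr0 - ar0))
```

In this assumed case M equals 1/4 and E equals -4, so that M = -1/E, and the elasticity computed from its definition agrees with AR/(MR - AR). The elasticity is greater than unity in absolute value, as the text requires at a profit maximum with positive MC.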
A variety of important implications follows from this basic contrast between monopoly and pure competition. As an analytical matter, it means that only in pure competition can equilibrium price and quantity be explained by means of the famous law of supply and demand, for only in pure competition can the aggregate willingness of all eligible producers to sell be summarized in an industry supply curve, whose intersection with the industry demand curve then determines the equilibrium price and quantity. Under any form of monopoly, by contrast, supply regularly exceeds demand in the sense that each producer is willing to sell more than he
actually sells at the price he chooses to set. Nor is it possible to analyze a monopolist's equilibrium with reference to any sort of quasi supply curve traced out by a series of hypothetically shifting demand curves, for a different locus of price-quantity equilibrium points would be traced out by every different set of shifting demands, as a reflection of the different relationships between AR and MR that apply when the slope and elasticity of demand are altered. The monopolist's eagerness to sell more at his currently chosen price is also closely related to what is called "nonprice competition," including the two major categories of selling effort and product variation. Advertising, for example, is never relevant for a purely competitive firm, which can always sell what it pleases anyway, at a price over which it has no control. When AR > MC, however, advertising will pay if it induces a sufficient expansion of demand relative to the advertising expense. The higher is M = (AR - MC)/AR, moreover, the greater are the chances that this will be so. If M = .1, for example, an extra dollar spent on advertising will be profitable only if it raises the sales revenue R by more than ten dollars, but if M = .25, it will be profitable if it raises R by more than four dollars. This follows because an extra dollar of R, from expanded sales at an unchanged price, augments profit by a fraction of a dollar equal to the magnitude of M. Naturally, advertising may also change the equilibrium price, and if it does so, the foregoing relationships apply with reference to the new price and the new value of M. In general, advertising may either raise or lower the profit-maximizing price. Thus, if MC is constant over the range of expanded output, the static-equilibrium price will rise or fall according as the elasticity of demand is lowered or raised in absolute value at the level of the original price—as can be proved from the relationship E = AR/(MR - AR).
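The break-even arithmetic for advertising, and the link between elasticity and the profit-maximizing price when MC is constant, can be put in a short sketch. The function names and the marginal-cost figure of 60 are hypothetical. The price rule used, p = MC * E/(1 + E), is simply M = -1/E rearranged, with E negative and greater than one in absolute value.

```python
from fractions import Fraction


def breakeven_revenue_gain(m):
    """Extra sales revenue (at an unchanged price) that one extra dollar of
    advertising must generate to pay for itself. Each extra dollar of R adds
    M dollars of profit, so the threshold is 1/M."""
    m = Fraction(m)
    if m <= 0:
        raise ValueError("advertising cannot pay unless M is positive")
    return 1 / m


def profit_maximizing_price(mc, e):
    """Price implied by (p - MC)/p = -1/E with constant marginal cost mc.

    Requires E < -1, i.e. demand that is elastic at the optimum.
    """
    mc, e = Fraction(mc), Fraction(e)
    if e >= -1:
        raise ValueError("no interior optimum unless |E| > 1")
    return mc * e / (1 + e)


threshold_low_m = breakeven_revenue_gain(Fraction(1, 10))   # M = .1
threshold_high_m = breakeven_revenue_gain(Fraction(1, 4))   # M = .25
print(threshold_low_m, threshold_high_m)

# Constant MC of 60 (hypothetical): a less elastic demand supports a higher price.
prices = [profit_maximizing_price(60, e) for e in (-3, -4, -5)]
print(prices)
```

The two thresholds reproduce the ten-dollar and four-dollar figures in the text. The price list falls as demand becomes more elastic, which is the sense in which advertising that lowers the elasticity (in absolute value) raises the equilibrium price when MC is constant.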
Similarly, if E is unchanged at the original price as the demand curve is shifted to the right, the equilibrium price will rise or fall according as the MC curve is rising or falling. These relationships apply to the "comparative statics" of a spontaneous increase in demand as well as to an increase induced by advertising, since advertising expense is in the nature of a fixed cost, having no impact on marginal cost. At least as a matter of static analysis, it is worth emphasizing that the effect of a shift in demand on the equilibrium price depends on MC, not on AC, as popular discussions would usually have it. Since AC is more likely to be downward sloping than MC
is, the correct analysis is less favorable to the possibility that price will fall as demand is increased than is the erroneous theory so widely held in the advertising profession. When the products of rival firms are inherently homogeneous, as in pure competition or pure oligopoly, strategies of product variation are ruled out by definition. When there is scope for product differentiation, however, the "product" itself becomes a variable, so that deliberate variations in its characteristics become an eligible part of the total market strategy, with simultaneous effects on both cost and demand. From an economic standpoint, the "product" involves not just its physical features but all of those ancillary characteristics that influence its acceptability to customers, such as its packaging, the location and personality of its sellers, any accompanying services, and so on. This gives a very wide scope to the various possible strategies of product variation. Thus, some firms may seek to imitate as effectively as possible the more popular products of their rivals, while the rivals seek to maintain or increase the distinctiveness of theirs. Especially when the magnitude of M is great, consumers may be at least partially compensated for the weakness of price competition by an intensified effort by producers to improve quality and service. When consumers are relatively poor judges of quality differences, however, they may be victimized by deliberate product deterioration. In general, the comparative strengths of price and nonprice competition often tend to be inversely related. Not only is nonprice competition wholly absent when M = 0 and price competition is of maximum effectiveness, but also, as price competition is increasingly inhibited, there is a natural tendency for the various forms of nonprice competition to be correspondingly intensified.
To the extent that at least certain types of nonprice competition are deemed desirable, a difficult problem is posed for public policy as to the desirable degrees of both price and nonprice competition. Bases and limitations of monopoly power. Restraints on competition may be natural, artificial, or both in some combination. Perhaps the simplest illustration of artificially created monopoly power is that based on the exclusive right granted by government in the form of a patent, whether as a gratuitous privilege bestowed by a monarch or, as in the more modern practice, as a reward for invention [see PATENTS]. The modern policy reflects another significant conflict between the social advantages and disadvantages of competition, for it is precisely when unrestrained competition is most
likely to prevent the successful innovator from recovering adequate compensation for his costs and risks that some artificial incentive is most needed to induce his inventive and innovating activity. Although alternative forms of compensation would be possible, the patent systems of most modern nations reflect a judgment that, in the simplest and most favorable case, it is better to have a new product or process available under monopoly control than not to have it available at all. Similar considerations underlie the provisions for copyrighting literary and artistic productions. Other types of franchise may be granted because of the inherent scarcity of an otherwise unappropriated resource. For example, the limited space on city streets may call for the limited licensing of taxicabs, and the limited number of television channels makes unique allocation necessary. More nearly absolute monopolies may also be enfranchised as a recognition of their status as "natural monopolies," where any relevant output can be produced much more efficiently by a single firm than by two or more. This is the case with a variety of "public utilities," which are then typically subjected to governmental regulation to limit their prices and profits [see REGULATION OF INDUSTRY]. Contrasting types of regulation of numbers of producers, outputs, or prices (as in agriculture, oil production, and even liquor retailing in some states) are designed only to limit competition in the interests of the existing producers. Another possible source of monopoly power is the concentrated ownership of distinctive natural resources, such as ore deposits. Whether the consequences are monopolistic in the classical sense or just oligopolistic, such resource ownership frequently leads to monopoly power also over the products for which the distinctive resources are essential. On the other hand, the mere fixity of supply of a natural resource does not make it a monopoly.
Modern economists do not follow Adam Smith's dictum that the rent of land is naturally a monopoly price. In general, since pure competition requires that a homogeneous product be produced by a large number of firms, each large enough to exhaust all net economies of large-scale production, the more "natural" bases of monopoly power are to be found in the impediments to the fulfillment of these necessary conditions for pure competition. These impediments involve considerations that widen the scope for product differentiation, whether real or fancied, and also the conditions that underlie economies of scale, which limit the numbers of eligible
producers of any given product and its relatively close substitutes. Product differentiation would be important even if consumers were both exceedingly mobile and well informed about the inherent properties and relative prices of the various available products. It is all the more important in view of the actual imperfections of consumers' mobility and knowledge, which lead both to a good deal of inertia in consumer behavior and to the choice of products on the basis of reputation and prestige, whether deserved or not. Brand names and trade-marks, protecting the identity of given products, thus have the dual effects of guiding consumers to desired choices and fixing product differentiation in consumers' minds. Whatever its basis, strong consumer loyalty to a given brand strengthens the monopoly power of the supplier, against both existing and potential rivals. Just as economies of scale sometimes produce natural monopolies, so under slightly weaker conditions they may cause some industries to be "natural oligopolies," when only a few firms at most can simultaneously achieve substantially all the potential economies of scale. In this connection there is an important distinction between the economies of the large-scale plant and the further possible economies of the large-scale, multiplant firm. Given the number, sizes, and locations of firms producing a set of substitute products in a specified region, it should be emphasized that the market consequences further depend on the relative mobility of the customers, the products, or both. Thus, much retailing is inherently limited, as far as the spatial extent of the market is concerned, to relatively small neighborhoods. Toward the other extreme, there are meaningful "national" and "world" markets; but these markets are also inherently imperfect because of the costs and delays of communication and transportation. Just as space itself is a continuum, so are markets typically incapable of unique spatial definition. 
The establishment, maintenance, and exercise of monopoly power also depend on the legal framework of permitted and prohibited acts. Clearly, mergers provide an easier path to monopoly than an increased market share that must be fought for competitively. Likewise, explicit agreements to limit competition among otherwise independent firms, whether in formal cartels or more informally, are consistently more effective techniques for achieving monopolistic behavior than the alternative of merely tacit collusion. Collusion, consisting of a mutual self-restraint from at least some forms
of aggressively competitive behavior, frequently relies on one or more of such techniques as price leadership (the imitation of one firm's prices by the others) and market sharing (the policy on the part of each firm not to seek to increase its percentage of industry volume above some mutually recognized "normal" level). Full compliance with the "rules" of price leadership or market sharing is, however, typically difficult to achieve, especially on a lasting basis. Although there is no necessary connection between a firm's degree of monopoly power (as reflected by the relationship of price and marginal cost) and its profitability, these are presumably linked in at least a rough empirical correlation. The persistence of profits through time depends, in turn, on barriers to the entry of potential rivals. Where entry is at least legally free, these barriers depend on net advantages of cost or product acceptance enjoyed by existing firms as compared with would-be entrants. The industries most difficult to enter are usually those with large absolute capital requirements for an efficient scale of production and product distribution, complex patentable technology, strong allegiance to existing brands by consumers, or some combination of these elements. When the nonrecurrent costs of building an efficient organization and achieving a sufficient degree of product acceptance are high, existing firms can continue to enjoy substantial profits without excessive danger of inviting actual attempts at entry. Conversely, concern for potential competition is also a limiting factor on the degree of monopoly power that existing firms can afford to exert. Firms may also be deterred from the fullest possible exploitation of their monopoly power by the fear of government action and by other considerations, such as public, consumer, and employee relations.
Price discrimination.
When a firm with monopoly power can divide its market into submarkets sufficiently distinct so that favored customers cannot readily resell to others, it is frequently more profitable for the monopolist to charge different prices in the different submarkets than to charge the same price throughout his whole market. In the limiting case where the various submarkets are wholly independent and costs per unit of product are the same, static profit-maximizing calls for equating marginal revenue in each submarket with the common marginal cost. This results in different prices when elasticities of demand differ from one submarket to another. Thus, from the previously noted relationship E = AR/(MR - AR), it
follows that AR = MR · E/(1 + E), where E < -1. Hence, when MR is the same in each submarket, AR, or price, will be higher the closer is the elasticity, E, to -1. When submarkets are not wholly independent, either because some limited resale among customers is possible or because the customers are themselves competing firms, the profit-maximizing rules are more complicated. Here, in addition to considerations of relative elasticities, it is also necessary to take into account the tendency of an increase in sales in a particular submarket, resulting from a price cut, to reduce sales in other submarkets. Price discrimination is said to be "personal" when it depends on such features as the age, sex, income, and trade status of customers, as when different prices are charged to children and adults or to men and women at entertainments, or when rich and poor are charged different fees by physicians, or when different prices are charged to retailers and wholesalers even for like quantities of products, or when individual bargaining with customers results in different prices. Another category is "geographical" discrimination, as in the "dumping" of goods abroad at lower prices than at home or as a feature of "delivered-price systems" involving zone pricing or the use of basing points, where the net factory price differs for goods shipped to different localities. Price discrimination may also be either systematic or sporadic. Not all price differences for the "same" product are necessarily discriminatory. Thus, if marginal costs vary for different quantities, and if prices reflect only those cost differences, there is no price discrimination. Similarly, when marginal costs vary between periods of peak and off-peak operation, price differences with the season of the year or even the time of day are not necessarily discriminatory. Indeed, when marginal costs differ and prices do not, that represents a concealed discrimination.
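The pricing rule AR = MR · E/(1 + E) stated at the start of this passage can be illustrated numerically. The following is only a minimal sketch: the common marginal cost of 10 and the two elasticities are hypothetical figures chosen for illustration, and the function name is likewise an assumption, not anything taken from the article.

```python
def discriminating_price(marginal_cost, elasticity):
    """Price (AR) in one submarket from AR = MR * E / (1 + E),
    with MR set equal to the common marginal cost."""
    if elasticity >= -1:
        raise ValueError("the rule requires elastic demand, E < -1")
    return marginal_cost * elasticity / (1 + elasticity)


# Hypothetical inputs: one marginal cost shared by two independent submarkets.
common_mc = 10.0
elasticities = {"submarket A": -4.0, "submarket B": -1.5}

prices = {
    name: discriminating_price(common_mc, e)
    for name, e in elasticities.items()
}

for name, price in prices.items():
    print(f"{name}: E = {elasticities[name]}, price = {price:.2f}")
```

Under these assumed figures the submarket whose elasticity is closer to -1 (submarket B) is charged the higher price, as the rule implies.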
The concept of price discrimination is frequently expanded to cover the comparative prices of products that are related but not identical, such as different models of a generic product or physically comparable products marketed as different brands. The general test of price discrimination, covering these cases as well as the simpler ones, is whether the proportion of price to marginal cost is the same or different from one piece of business to another. Indeed, price discrimination may be defined as the firm's exertion of different degrees of monopoly power, as measured by the value of M = (AR - MC)/AR, from one submarket to another.
This formulation serves to emphasize that price discrimination is inherently a monopolistic phenomenon; in pure competition, where price necessarily equals marginal cost in equilibrium, price discrimination is impossible. Of theoretical interest, though of limited practical importance, is the polar concept of "perfect" discrimination. This involves not only the monopolist's being able to treat each individual customer as a separate submarket but also his charging each customer, at least in effect, different prices for the different increments of product that he buys, in such a way that the monopolist arrogates to himself something approaching the entire potential gain from trade. Under appropriate continuity assumptions, the monopolist then maximizes his profit by equating with his marginal cost the price that he exacts from each customer for the final increment purchased (also equal to marginal revenue). As an alternative technique, the monopolist can achieve this result with an appropriate all-or-nothing offer to each customer for the appropriate aggregate quantity.
Monopsony. In this section and the next some kindred concepts of monopoly will be briefly discussed. Analogous to a seller's monopoly power, a buyer is said to have "monopsony" power when he can significantly affect the price of what he buys by varying the quantity bought. Typically, the monopsonist faces an upward-sloping supply schedule, showing the prices or average costs, AC, at which he may buy alternative quantities. There is then a related schedule of marginal cost, MC, lying above the AC curve when the AC curve is positively sloped. Examples of such schedules are shown in Figure 3.
[Figure 3. Monopsony equilibrium: schedules AC, MC, AB, and MB, with factor quantity f0 and factor price w0.]
Although a monopsonist might conceivably be a large ultimate consumer, the more important instances of monopsony concern the purchase of a factor of production by a firm. If the factor is passively supplied, as with unorganized labor or an intermediate product of a purely competitive industry, and if the buying firm is large enough to have the requisite influence on the factor price, the conditions for monopsony are fulfilled. The monopsonist's equilibrium further depends on schedules of "average benefit," AB, and "marginal benefit," MB, as illustrated in Figure 3. In the rudimentary case where the factor in question is the only variable one, these concepts would correspond to what are called the factor's "average revenue product" and "marginal revenue product," or R/f and dR/df, respectively, where R is the total revenue from the sale of the product and f is the factor quantity. When other factors are also variable, as in the firm's long-run equilibrium, similar concepts involving a "net revenue product" are used instead. The monopsonist's equilibrium factor quantity is then determined where MC cuts MB from below, provided that AC does not exceed AB. In Figure 3, where the equilibrium factor quantity is at f0, the corresponding equilibrium price is determined on the AC schedule at w0, and the aggregate benefit from having that factor quantity (as compared with not producing at all) is indicated by the shaded area. In long-run equilibrium this would be the firm's total profit; in a short-run situation it would be necessary to subtract the fixed costs of the fixed factors to determine the profit. Just as a monopolist would like to sell more at his equilibrium price and may therefore have a motive for exerting himself through advertising and other forms of nonprice competition to do so, so the monopsonist has a similar interest in buying more at his equilibrium price. This is so because his MB would exceed the price for a certain increment of hypothetical extra purchases.
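The monopsonist's equilibrium described above (the quantity f0 where MC cuts MB from below, with the price w0 read off the AC schedule) can be sketched numerically. The linear schedules and all coefficients below are hypothetical assumptions made only for illustration; the article specifies no functional forms. The sketch also assumes that total benefit is zero when no factor is bought, so that total benefit is the area under MB.

```python
# Hypothetical linear schedules (assumptions, not from the article):
#   supply / average cost:  AC(f) = 2 + 0.5 f
#   marginal benefit:       MB(f) = 20 - f
A_SUPPLY, B_SUPPLY = 2.0, 0.5
C_BENEFIT, D_BENEFIT = 20.0, 1.0


def average_cost(f):
    """Factor price the buyer must pay to obtain quantity f."""
    return A_SUPPLY + B_SUPPLY * f


def marginal_cost(f):
    """Derivative of total outlay AC(f) * f; lies above AC when AC rises."""
    return A_SUPPLY + 2.0 * B_SUPPLY * f


def marginal_benefit(f):
    return C_BENEFIT - D_BENEFIT * f


def total_benefit(f):
    """Area under MB from 0 to f (assumes zero benefit at f = 0)."""
    return C_BENEFIT * f - 0.5 * D_BENEFIT * f ** 2


# Equilibrium: MC(f0) = MB(f0). MC rises and MB falls here,
# so MC cuts MB from below.
f0 = (C_BENEFIT - A_SUPPLY) / (2.0 * B_SUPPLY + D_BENEFIT)
w0 = average_cost(f0)                       # price is read off AC, not MC
net_benefit = total_benefit(f0) - w0 * f0   # benefit net of factor outlay

print(f"f0 = {f0}, w0 = {w0}, MB at f0 = {marginal_benefit(f0)}")
print(f"net benefit = {net_benefit}")
```

With these assumed schedules the marginal benefit at f0 exceeds the price w0, which is why the buyer would gladly purchase more at the equilibrium price if he could.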
Similarly, the analogue of a seller's degree of monopoly power is the concept of a buyer's degree of monopsony power, which may be defined symbolically as M' = (MB - AC)/MB, with static-equilibrium values that lie between zero and one when the AC schedule is positive and positively sloped. A monopsony equilibrium with zero profit is also possible. This would be illustrated in a diagram such as Figure 3 if, in the long run, the positively sloped AC curve were shifted upward until it was just tangent to the AB curve.
Bilateral monopoly. In monopoly or monopsony the discretionary power to set the price is uniquely on one side of the market, since the other side consists of passive price-takers. When competition is limited on both sides of the market at the same time, the usual consequence is a bargaining relationship. There are numerous possible patterns, involving either one or a few sellers, one or a few buyers, and homogeneous or differentiated goods. The application to union-management bargaining is obvious. Whether in the strict form of a single seller facing a single buyer or in the looser variants, this class of market relationships is known as "bilateral monopoly." A basic theoretical case, first analyzed by Edgeworth (1881), concerns two persons who each possess fixed stocks of two homogeneous goods and who can trade only with one another. The problem is best formulated and analyzed with reference to the ingenious "box diagram" shown in Figure 4, first employed by Edgeworth himself. Initial stocks of the two goods are designated as x1 and y1 for the first person and as x2 and y2 for the second. The dimensions of the box are then set at x1 + x2 on the horizontal axis and y1 + y2 on the vertical axis. When the first person's quantities are measured conventionally from the origin at I and the second person's are measured in
reverse directions from the origin at II, the point A represents the initial position and any other point within the box represents a possible redistribution of the fixed total quantities of the two goods. The tastes of the two traders are represented by selected indifference curves, such as those labeled I11, I12, and I13 for the first man and I21, I22, and I23 for the second. A person is made better off by any movement to another indifference curve lying farther away from his map's origin. Since the indifference curves through point A (namely, I11 and I21) are not tangent but, rather, intersect, mutually beneficial trade is possible. Specifically, any movement from A into the cigar-shaped area bounded by those two curves represents a simultaneous improvement for both parties, that is, a movement to "higher" indifference curves. The potential benefits of trade are fully exploited, however, only when the traders move to a point where their indifference curves are tangent, on a locus that Edgeworth called the "contract curve." The relevant range of this locus is depicted in Figure 4 by the curve drawn between D1 and D2.
[Figure 4. Edgeworth box diagram: initial position A, indifference curves, contract curve between D1 and D2, offer curves, and the points C, M1, and M2.]
Naturally, the first person would prefer an outcome as close as possible to D1, and the second would prefer an outcome as close as possible to D2. This conflict of interest is the source of the so-called indeterminacy of the bilateral-monopoly problem, since the analyst cannot confidently predict any unique outcome of the parties' bargaining. In the course of that bargaining each party can threaten to refuse to trade at all unless the other is willing to grant sufficiently favorable terms. Hence, one possible outcome is that the parties may simply remain stubbornly at point A. If either can make a plausible take-it-or-leave-it offer to the other, however, an offer of the all-or-nothing type might achieve a result corresponding to perfect monopolistic discrimination, indefinitely close to D1 or to D2 (where the trader with the strategic initiative would reap all the potential gain and the passive trader none at all). Intermediate outcomes somewhere on the contract curve between D1 and D2 are, of course, more plausible under assumptions of a more nearly symmetrical bargaining process. Indeed, game theorists such as Nash (1950; 1953) and Harsanyi (1956), with the aid of further assumptions including the von Neumann-Morgenstern utility functions of the traders, claim to identify a unique solution point on the contract curve, but their reasoning is too complex for brief summary. Other special outcomes, such as those at C, M1, and M2, also emerge from appropriately special assumptions. Thus, if each trader is assumed to be free to trade at some specified exchange ratio, along a straight line through A with a slope reflecting the given exchange ratio, he maximizes his benefit by moving to the point where the specified line is just tangent to one of his indifference curves. The locus of such points for various alternative exchange ratios is known as an "offer curve." It is illustrated in Figure 4 by the dashed curves through A, M2, and C for the first person and through A, M1, and C for the second.
The possible outcome at C, where the offer curves intersect, corresponds to the "competitive" solution, since it implies an equating of supply and demand. On the other hand, if either person has the privilege of setting the exchange ratio and the other then trades in accordance with his offer curve, the best ratio for the active strategist is the one implied at the point of tangency between one of his indifference curves and the offer curve of the other. These outcomes, corresponding to simple monopoly or monopsony solutions, are illustrated at M1 and M2, according as the first person or the second has the privilege of setting the exchange ratio. It should be noticed that these monopoly
points fall short of the contract curve, in contrast to the competitive point which necessarily lies on it. Bilateral monopoly in the context of a monopsonistic firm hiring organized labor or bargaining with a monopolistic supplier of an intermediate product is subject to similar analysis. By comparison with the monopsonistic equilibrium previously identified in Figure 3 (at f0, w0), for example, the competitive solution would call for the higher factor price and quantity at the intersection of AC and MB. Still higher prices and correspondingly reduced quantities along the MB schedule would illustrate a movement toward a monopolistic solution. A linear contract curve passing vertically through the intersection point of AC and MB is also implied in some cases, although not usually in the one involving labor.
Significance for welfare. The "evils" of monopoly and its kindred market situations, together with some possibly offsetting advantages, are at least to some extent in the eye of the beholder. This is especially so with respect to some of the sociopolitical issues involving "big" versus "small" business or the social implications of "economic power," to the extent that these issues overlap with monopoly in the economic sense. Similarly, the tendency of monopoly to intensify the inequality of the distribution of income, though far from clear-cut, is in any event also subject to different evaluations. The remaining aspect of monopoly and monopsony, on which economic analysis has had rather more to say, is the effect on the efficiency of resource allocation. Here the relevant analytical approach is that of welfare economics, with its central concept of "Pareto-optimality." As initially formulated by Vilfredo Pareto, a situation is said to be Pareto-optimal if there is no reallocation of resources or goods that can make one person better off without injury to at least one other person [see WELFARE ECONOMICS].
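Returning to the two-person exchange of the box diagram, the claim that the competitive point lies on the contract curve (and is therefore Pareto-optimal in the sense just defined) can be checked in a small numerical sketch. Everything specific here is an assumption made for illustration: the Cobb-Douglas utility functions, the exponents, and the initial stocks are hypothetical and are not taken from the article, which specifies only indifference curves in general.

```python
# Hypothetical Cobb-Douglas tastes: u_i(x, y) = x**a_i * y**(1 - a_i)
ALPHA_1, ALPHA_2 = 0.6, 0.3

# Hypothetical initial stocks (point A in the box diagram).
X1, Y1 = 8.0, 2.0      # first person
X2, Y2 = 2.0, 8.0      # second person


def utility(x, y, alpha):
    return x ** alpha * y ** (1.0 - alpha)


def mrs(x, y, alpha):
    """Slope of the indifference curve: units of y given up per unit of x."""
    return (alpha / (1.0 - alpha)) * (y / x)


# Competitive exchange ratio p (price of x in units of y), found by
# equating total demand for x with the total stock of x under the
# assumed Cobb-Douglas demands.
p = (ALPHA_1 * Y1 + ALPHA_2 * Y2) / ((1 - ALPHA_1) * X1 + (1 - ALPHA_2) * X2)

# First person's holdings at the competitive point C; the second
# person receives the remainder of the fixed totals.
wealth_1 = p * X1 + Y1
x1_c = ALPHA_1 * wealth_1 / p
y1_c = (1 - ALPHA_1) * wealth_1
x2_c = (X1 + X2) - x1_c
y2_c = (Y1 + Y2) - y1_c

mrs_gap_at_a = abs(mrs(X1, Y1, ALPHA_1) - mrs(X2, Y2, ALPHA_2))
mrs_gap_at_c = abs(mrs(x1_c, y1_c, ALPHA_1) - mrs(x2_c, y2_c, ALPHA_2))

print(f"exchange ratio p = {p:.4f}")
print(f"gap between slopes at A = {mrs_gap_at_a:.4f}")
print(f"gap between slopes at C = {mrs_gap_at_c:.2e}")
```

Under these assumptions the indifference curves intersect at A (their slopes differ) but are tangent at C, and both traders reach higher utility at C than at A, which is the sense in which C lies on the contract curve.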
It is a principal theorem of welfare economics that a universally purely competitive general equilibrium is Pareto-optimal, provided that people's individual tastes are respected and that there are no "externalities," such as a dependence of any firm's output or any household's welfare on the factor employment of other firms or the goods consumption of other households. Short of a full demonstration of that theorem, it may be said that its essence lies in the implied equality of price and marginal cost for all produced goods and the similar equality of price and marginal benefit for all hired factors. By the same token, then, if these ideal conditions are disturbed by a monopolistic inequality of price and marginal cost or by a monopsonistic inequality of factor price and marginal benefit, the result is a departure from Pareto-optimality. Typically, the monopolist produces and sells too little at too high a price, and the monopsonist buys too little at too low a price. Note that if excessive profit were the only evil of monopoly, the equating of price with average cost would be the remedy, as in the conventional philosophy of public-utility regulation. When average cost is decreasing, however, an excess of price over marginal cost is still implied. Under such circumstances the Pareto-optimal equating of price with marginal cost involves a loss, which must then be made good by external subsidy. Even under the indicated ideal conditions, universally pure competition is only a sufficient, not a necessary, condition for Pareto-optimality. As already illustrated for the simplified two-person exchange economy portrayed in Figure 4, there are various ways in which the Pareto-optimal contract curve can be reached, including perfect discrimination as well as pure competition. At the same time, however, that analysis also showed how simple monopoly or monopsony systematically falls short of Pareto-optimality. When universal pure competition is not naturally viable (for example because of persistently decreasing costs) or when externalities are present, Pareto-optimality cannot be achieved by simply preventing monopoly and monopsony. Furthermore, the attainment of that ideal by elaborate special regulations, though theoretically conceivable, obviously involves attendant difficulties and costs that force us to aim at less than the ideal. Under these circumstances there are no simple rules for attaining a "second-best" result. In this context, a blanket indictment of monopoly and monopsony as inefficient is no longer valid. In the wider context of a dynamic economy where technological progress is to be encouraged, this observation acquires still greater force.
ROBERT L. BISHOP
BIBLIOGRAPHY
BAIN, JOE S. 1956 Barriers to New Competition: Their Character and Consequences in Manufacturing Industries. Cambridge, Mass.: Harvard Univ. Press.
BURNS, ARTHUR R. 1936 The Decline of Competition: A Study of the Evolution of American Industry. New York and London: McGraw-Hill.
CHAMBERLIN, EDWARD H. (1933) 1962 The Theory of Monopolistic Competition: A Re-orientation of the Theory of Value. 8th ed. Cambridge, Mass.: Harvard Univ. Press.
COURNOT, ANTOINE AUGUSTIN (1838) 1960 Researches Into the Mathematical Principles of the Theory of Wealth. New York: Kelley. → First published in French.
EDGEWORTH, FRANCIS Y. (1881) 1953 Mathematical Psychics: An Essay on the Application of Mathematics to the Moral Sciences. New York: Kelley.
FELLNER, WILLIAM 1947 Prices and Wages Under Bilateral Monopoly. Quarterly Journal of Economics 61:503-532.
HARSANYI, JOHN C. 1956 Approaches to the Bargaining Problem Before and After the Theory of Games: A Critical Discussion of Zeuthen's, Hicks' and Nash's Theories. Econometrica 24:144-157.
LERNER, ABBA P. 1934 The Concept of Monopoly and the Measurement of Monopoly Power. Review of Economic Studies 1:157-175.
LERNER, ABBA P. 1944 The Economics of Control: Principles of Welfare Economics. New York: Macmillan.
MACHLUP, FRITZ (1952) 1964 The Economics of Sellers' Competition: Model Analysis of Sellers' Conduct. Baltimore: Johns Hopkins Press.
MASON, EDWARD S. (1957) 1964 Economic Concentration and the Monopoly Problem. New York: Atheneum.
NASH, JOHN F. JR. 1950 The Bargaining Problem. Econometrica 18:155-162.
NASH, JOHN F. JR. 1953 Two-person Cooperative Games. Econometrica 21:128-140.
PIGOU, ARTHUR C. (1920) 1960 The Economics of Welfare. 4th ed. London: Macmillan.
ROBINSON, E. A. G. 1941 Monopoly. Cambridge Univ. Press.
ROBINSON, JOAN (1933) 1961 The Economics of Imperfect Competition. London: Macmillan; New York: St. Martin's.
STOCKING, GEORGE W.; and WATKINS, MYRON W. 1951 Monopoly and Free Enterprise. New York: Twentieth Century Fund.
TRIFFIN, ROBERT 1940 Monopolistic Competition and General Equilibrium Theory. Harvard Economic Studies, Vol. 67. Cambridge, Mass.: Harvard Univ. Press.
UNIVERSITIES-NATIONAL BUREAU COMMITTEE FOR ECONOMIC RESEARCH 1955 Business Concentration and Price Policy: A Conference of the Committee. National Bureau of Economic Research, Special Conference Series, No. 5. Princeton Univ. Press.
MONOPSONY See MONOPOLY.
MONTE CARLO METHODS See RANDOM NUMBERS.
MONTESQUIEU
Charles de Secondat, Baron de la Brède et de Montesquieu (1689-1755), made original contributions to social and political theory. He was viewed by Comte and Durkheim as the most important precursor of sociology; by Ernst Cassirer and Franz Neumann as the inventor of ideal-type analysis; by Sir Frederick Pollock as the "father of modern historical research" and of a "comparative theory of politics and law based on wide observation of ... actual systems"; by Friedrich Meinecke as one of the founders of Historismus (historicism or historism) with its relativism, holism, and emphasis on the positive value of the irrational and the customary; and by Hegel, who did not find it easy to praise his predecessors, as the first to explain law and political institutions by reference to characteristics of the social system in which they function (Comte [1830-1842] 1877; Durkheim [1892-1918] 1960, p. 26; Cassirer [1932] 1951, p. 212; Neumann 1949, pp. xl-xli; Pollock [1890] 1960, pp. 86-87; Meinecke 1936, pp. 118-170; Hegel [1821] 1942, p. 16). Now that political sociology has become a recognized discipline, Montesquieu has also been given pride of place as its first modern practitioner (Aron [1960] 1965, pp. 55-56; Runciman 1963, pp. 24 ff.). Nor is there much question that Montesquieu's concept of the general spirit of a society anticipated modern cultural anthropology. Thus, Montesquieu's position as social theorist would seem to be secure. Yet few other theorists of his order of achievement have combined such contributions with such defects: imprecise definition, lack of internal consistency, the tendency to generalize on the basis of inadequate evidence, and, in the Spirit of the Laws, a deplorable lack of organization. To discriminate what remains permanently valuable in Montesquieu from what is unacceptable: this is the difficulty complicating any critical exposition of his thought. Other problems may perplex the modern reader. Montesquieu claimed to be breaking altogether new ground. He prefaced the Spirit of the Laws with the epigraph "prolem sine matre creatam" (a child born of no mother), yet it has been shown that his work in many ways carries on that of his predecessors and shares the concepts, attitudes, and political positions of his contemporaries (Dedieu 1909; Meinecke 1936; Ford 1953; Mauzi 1960; Shackleton 1961; Ehrard 1963; Rothkrug 1965).
The genuine novelty of Montesquieu's work is to be found in its terms of analysis and its theoretical focus: the relations of a society's laws to its type of government, climate, religion, mores (moeurs), customs (manières), and economy. Such an approach is inconsistent with the older notion that there exists an eternal, natural law superior to positive law. Yet Montesquieu refused to abandon the theory of natural law, despite its patent incompatibility with his own. Another difficulty arises from Montesquieu's insistence that his writings did not censure any established institution, that he took his principles not from his prejudices, but from the nature of things. Yet he condemned despotism, slavery, and religious persecution as contrary to natural law or human nature. Thus he wavered between a positivist, relativist concept of law on the one hand and a conventional acceptance of natural law on the other. Montesquieu opposed intellectual systems, for he thought they falsify experience; he emphasized the irreducible diversity of human institutions and history. Yet he also asserted that he had laid down first principles from which all particular cases follow: the histories of all nations are only consequences of these first principles, and every particular law is connected with or depends on another law of a more general extent (1748, preface).
Montesquieu's family stemmed from both the nobility of the sword and the nobility of the robe; it could be traced back 350 years, which, in his view, made its name neither good nor bad. His childhood was a curious combination of aristocracy and rusticity. He was born in the castle at La Brède, but his godfather was a beggar, chosen to remind Montesquieu of his obligation to the poor. He was sent out to nurse with a peasant family for his first three years. His mother died when he was seven; her early death contributed to his detachment and to his distaste for enthusiasm; both qualities were equally prominent in his writing and in his character. At the age of 11 he was sent away to Juilly, a school maintained by the Congregation of the Oratory. At Juilly Montesquieu acquired an education stronger in Latin than in Greek; it was relatively liberal for its day. The philosopher Malebranche was a member of the Congregation, and his influence made itself felt. Montesquieu's Latin studies impressed him with the value of civic virtue and stoicism. In 1705 Montesquieu returned to Bordeaux to study law. Between 1709 and 1713 he was a legal apprentice in Paris. There he came to know some of the most advanced thinkers of his time: Fréret, the Abbé Lama, and Boulainvilliers (Shackleton 1961, pp. 8-13).
On the death of an uncle in 1716, Montesquieu succeeded to considerable wealth, land, and the office of president a mortier in the Parlement of Guyenne. Montesquieu's office was not a sinecure. He worked seriously at his legal duties, but later confessed that he had not understood all the ancient procedures of his court. The truth was that he did not much enjoy his life as a magistrate. Nevertheless, in the Spirit of the Laws he supported the position of the parlementaires against the monarchy, defended venality of office, and condemned as despotism any attempt to divest the parlements of their political functions (1748, book 8, chapter 6). During his residence in Bordeaux, Montesquieu participated in the work of its academy. At that
time the provincial academies provided a setting within which the nobility of the robe could develop an intelligentsia of its own; their members included learned noblemen of the sword as well as educated commoners. Montesquieu did experiments in natural history and physiology. The academy gave him a distaste for prejudice, a priori reasoning, and teleological arguments; from it he acquired a predisposition to materialism.
The "Persian Letters." In his Bordeaux period Montesquieu began the Persian Letters, which was published anonymously in Amsterdam in 1721. An immediate and lasting success, it alone would have assured his reputation. The book is witty and delightful, but Montesquieu's irony and irreverence did more than amuse his readers. By depicting France as seen through the eyes of two Persians, he provided a double perspective, a revealing device used earlier by La Bruyere and Bayle. As Caillois has written, the positive construction later undertaken by Montesquieu in the Spirit of the Laws presupposed a prior sociological revolution, that of "daring to consider as extraordinary and difficult to understand those institutions, those habits, those moeurs, to which one has been accustomed since birth, . . . [which] are so powerfully, so spontaneously respected that in most situations, no alternative to them can be imagined" (Oeuvres completes, vol. 1, p. v in the Gallimard edition). Relativism about values is among the most significant contributions of the Persian Letters to the early Enlightenment. Certain points made in the Persian Letters anticipate what Montesquieu later argued more extensively: that men are always born into a society and that it is therefore meaningless to discuss the origin of society and government; that self-interest is not a sufficient basis for human institutions, as Hobbes had asserted; and that, instead, the possibility of good government depends on education and example, in short, on civic virtue.
Montesquieu did not believe that the absurdity and corruption in French society could be remedied by governmental action. His view of human nature put great stress on the passions, and he believed that jealousy and the desire for domination are among the mainsprings of despotism. He was already concerned with the structure and psychological basis of absolute rule. His models were taken from Louis XIV, as well as from what he read about the Near East and Far East.
Travel and later works. With the success of the Persian Letters, Montesquieu was accepted by the society of regency Paris and lived the life of an aristocratic rake. His Paris friends secured his election to the French Academy in 1728. He sold his office of president a mortier, partly because of financial need and partly because he wanted to live in Paris. As a further result he was at last free to travel. From 1728 to 1731 Montesquieu was away from France, visiting Austria, Hungary, Italy, Germany, Holland, and England. He came to think of himself as a man first and a Frenchman second, and claimed to regard "all the peoples of Europe with the same impartiality as I do the peoples of Madagascar" (ibid., p. 997). The two years he spent in England had the greatest effect on his later work. There he made distinguished friends who taught him to view the English constitution through the eyes of the Whigs, although he was aware also of the Tories' point of view. During his stay he was elected a fellow of the Royal Society and became a Freemason as well. After his return to France he divided his time between his estate at La Brede and Paris; he became an independent scholar dedicated to producing his two great books, the Considerations on the Causes of the Greatness of the Romans and Their Decline (1734) and the Spirit of the Laws (1748). Much of his time was spent in Paris, where he shone among the luminaries of the intellectual salons, now more open to merit than before. Montesquieu encouraged the young philosophes he met there.
Personal religion. In 1755 Montesquieu fell victim to an epidemic sweeping Paris. As he lay dying, he asked to be given the last rites of the church. When he chose as confessor a Jesuit who had helped him publish the Considerations, the Society of Jesus insisted that he first accept certain conditions. Although Montesquieu denied ever having been in a state of disbelief, he was made to consent to having his final confession made public.
It is reported that after receiving the last rites, he said, "I have always respected religion; the ethic of the evangelists is an excellent thing, and the most beautiful gift God could have made to man" (Shackleton 1961, p. 396). Certainly Montesquieu believed in the social and political utility of religion, nor is there any doubt that he held some form of belief compatible with natural religion. But it remains unknown to what degree he believed in the dogmas of his church. He never capitulated to the Jesuits' demands for control of his manuscripts. Ideas about historical causation. The Considerations, perhaps the least known of Montesquieu's three major books, is notable for its style, clarity, and remarkable analysis of historical causation and of the nature of politics. Montesquieu was attracted to Roman history because it was the most complete
record of a political society available to him. His study of Rome led him to concepts he later developed more fully in the Spirit of the Laws: although chance plays some part in human events, these may always be rationally analyzed; the orientation of political actors is in large part to be explained by religion, ideas, maxims, and public opinion (in the Considerations Montesquieu did not emphasize milieu); politics in a free society requires a degree of disunion and conflict; and every society has a "general spirit." Perhaps the single most telling passage in the Considerations states Montesquieu's theory of causation:
It is not fortune that rules the world . . . The Romans had a series of consecutive successes when their government followed one policy, and an unbroken set of reverses when it adopted another. There are general causes, whether moral or physical, which act upon every monarchy, which create, maintain, or ruin it. All accidents are subject to these causes, and if the chance loss of a battle, that is to say, a particular cause, ruins a state, there is a general cause that created the situation whereby this state could perish by the loss of a single battle. (1734, chapter 18)
This statement, which received much attention after the fall of France in 1940, referred in its original context to the place held by war and conquest in Roman policy. Montesquieu reasoned, in what would later be called a dialectical manner, that Rome was first made and then ruined: "Here, in a word, is the history of the Romans. By following their original maxims, they conquered all other peoples. But after such success their republic could no longer be maintained. It became necessary to change the form of government. The new principles caused the Romans to fall from their former grandeur" (ibid., chapter 18). Montesquieu here combined judgments of fact and of value in a way dear to him.
On the one hand he generalized about the effects of scale on governmental structure and functions; on the other he concluded that the Romans had fought too much and conquered too much. Violence, first used as a weapon against other nations, was in turn employed at home. Roman decadence was inherent in the means used to attain greatness. Montesquieu was recasting themes that had originated with opponents of Louis XIV's foreign policy and of mercantilism. The Considerations contains a striking first formulation of Montesquieu's treatment of politics in a free society. The texture of relations among persons and groups is much looser in a free society
than in a despotism. Under freedom, divergencies and even conflicts are essential, for such a society is based on the conciliation of recognized groups, each with its own interest. The virtues of consensus and unanimity are overrated:
Authors who write about the history of Rome never tire of asserting that its ruin was caused by internal division, by contending groups. But these writers fail to see that these divisions were necessary. . . . As a general rule, it may be assumed that whenever everyone is tranquil in a republic, that state is no longer free. What constitutes a union in a political body is difficult to determine. True union is a harmony, in which all the parts, however opposed they may appear, concur in attaining the general good of the society, just as dissonances in music are necessary so that they may be resolved in an ultimate harmony. Union may exist in a state, where apparently only trouble is to be found. . . . But underlying the unanimity of Asiatic despotism, that is to say every government that is not moderate, there is a division of another kind. The peasant, the soldier, the merchant, the noble are related only in the sense that some of them oppress others without meeting any resistance. If this is considered to be union, it can be so only in that sense in which corpses are united when buried in a mass grave. (ibid., chapter 9)
The "Spirit of the Laws"
The Spirit of the Laws, the product of twenty years' work, is so sweeping in its scope that there can be no question of dealing here with all that it covers. Ostensibly a treatise on law, it spills over into a consideration of every domain affecting human behavior and into questions of philosophical judgment about the merit of various kinds of legislation. Its absence of organization is notorious, and many commentators have tried to rearrange the order of the separate books to produce a more coherent argument.
Such schemes can be divided into those which pretend to have divined the true intent of the author and those with the more modest aim of reducing confusion. Behind these different approaches lie two different conceptions of Montesquieu as a thinker. Some argue that he based the Spirit of the Laws on general principles and a discernible over-all design; others that, whatever his intention, he fell far short of such an achievement because he composed the 31 books over so long a period. The supporters of the view that Montesquieu did formulate a distinctive and systematic theory tend to argue that for two reasons Montesquieu deliberately concealed his design: he feared the censure of the authorities, lay and ecclesiastical; and he believed that much of his public should be kept in ignorance of certain truths (about religion, for example). Whatever Montesquieu's intent, the present value of the Spirit of the Laws depends upon two central topics: Montesquieu's classification of political structures and his comparative and historical political sociology.
Types of government. Montesquieu classified governments in terms of three types, each of which is characterized by a nature and a principle. By the "nature" of a government he meant the person or group holding sovereign power; by "principle," that passion which must animate those involved in a form of government if it is to function at its strongest and best. When a government is functioning properly, a legislator who violates the principle of government will provoke revolution. On the other hand, when a government is debilitated by the weakening of its essential principle, it can be saved only by a good legislator capable of strengthening it. The persona of the legislator is used by Montesquieu in the classical sense of an exceptional person called in by a society to give it basic laws. But the retention of this fiction produced an ambiguity when joined to what is novel in Montesquieu's thought, the limits placed upon legislation by the physical and moral causes that combine to form the general spirit of a society. Montesquieu was inconsistent in his recommendations to legislators: sometimes he suggested that the legislator adapt laws to the general spirit of the society, sometimes that he use laws and even religion to combat that spirit. Much depended on whether Montesquieu liked or disliked a particular institution or practice. When classified by their nature, governments fall into three categories. A republic is that form in which the people as a whole, or certain families, hold sovereign power.
A monarchy is that in which a prince rules according to established laws that create channels through which the royal power flows. (Montesquieu's examples of such channels include an aristocracy administering local justice, parlements with political functions, a clergy with recognized rights, and cities with historical privileges.) Despotism is the rule of a single person, who is directed only by his own will and caprice. The principles of these governments differ: virtue is the principle of republics; honor, of monarchies; and fear, of despotism. Montesquieu subdivided republics into democracies and aristocracies. His image of the first was taken from classical Greece and Rome. When he assigned virtue to them as their distinctive principle, he meant those
political qualities requisite to their maintenance: in the case of democracies, love of country, belief in equality, and the frugality and asceticism that lead men to sacrifice their personal pleasures to the general interest. Montesquieu found his model for aristocracy in contemporary republics such as Venice. Although aristocracies also require virtue, it takes the form of moderation in behavior and aspirations by members of the ruling class (the principal weakness of aristocracy being immoderate internal rivalry). Montesquieu thought that monarchy, as found in France and other European states of his time, was the characteristically modern way of ruling territories of intermediate size. The principle of monarchy is honor, that esprit de corps found only in a society based on preferment and distinctions for the few. Such privileges, when demanded and granted, sustain partially autonomous, intermediate groups between the crown and the people. In a famous phrase Montesquieu wrote, "Without a monarch, no nobility: without a nobility, no monarchy. For then there is only a despot" (1748, book 2, chapter 4). Despotism, in Montesquieu's view, has no offsetting virtues. Based on fear, it tolerates no intermediary powers and is moderated, if at all, only by religion. Throughout this analysis Montesquieu used what Max Weber later called ideal types. As Montesquieu phrased it: I have had new ideas; I have had to find new terms, or else to give new meaning to old ones. . . . It should be noted that there is a great difference between saying that a certain quality . . . or virtue is not the spring that moves a government, and saying that it is nowhere to be found in that government. If I say that this wheel, this cog are not the spring that makes this watch go, does it follow that they are not in this watch? . . . In a word, honor exists in a republic, although political virtue is its mainspring; political virtue, in a monarchy, although honor is its principle. 
(1748, "Avertissement de l'auteur")
Montesquieu's treatment of despotism is the most flagrant of his departures from his claim to have derived his principles not from prejudice but from the nature of things. Thus, he asserted that "it is impossible to speak of such monstrous governments without becoming infuriated" (ibid., book 3, chapter 9). Yet he said much that was incisive about the patterns of authority in despotisms. Under this form of government unquestioning obedience is regarded as the only proper response to authority. Education is designed to produce the requisite type of character. The ruled must be ignorant, timid, broken in spirit, requiring little in the way of legislation. Family life is also regulated, and the members of one family are isolated from all others. Men, instead of being trained to live on the basis of mutual respect, are made to respond only to fear of violence. Furthermore, Montesquieu posed the question: Since men love liberty and hate violence, and will therefore presumably rise in rebellion against despotism, why in fact do most of the world's peoples live under despotisms? In part this is because large empires must be governed despotically if their administration is to be effective (ibid., book 8, chapter 19); more important, it is because despotism has but one necessary condition, the human passions, and these exist everywhere. The alternatives to despotism are more difficult to achieve (ibid., book 5, chapter 15). The legislator who wishes to form a government that is free must have unusual skills. He must know how to combine political powers, subject them to rules, moderate them, and yet make them act together. In what are probably the best known and most influential sections of the Spirit of the Laws, those describing, or idealizing, the government of England (ibid., book 11, chapter 6; book 19, chapter 27), Montesquieu went beyond Locke to distinguish clearly between the executive power (extended to foreign affairs), the legislative, and the judicial. He made their rigid separation the condition of liberty. "When the legislative power is united to the executive, there is no more liberty" (ibid., book 11, chapter 6). Nor is there liberty if the judicial power is not separated from the legislative. "Power must check power" (ibid., book 11, chapter 6). But can a government so constituted act effectively? Montesquieu simply asserted that it will because it has to. Taken purely as constitutional doctrine, this theory does not appear to have had much factual basis, even when Montesquieu wrote. Taken as a guide to present-day practice, it is useless and even dangerous.
Yet the theory of the division of powers is more plausible if understood in either a sociological or a psychological sense. Madison, for example, in the 51st Federalist paper, interpreted Montesquieu's doctrine in a psychological sense as involving not rigid separation, but the blending of powers. The means of resisting an attack on the powers given to an office by a constitution should be tied to the ambitions of the person holding office. Thus, Madison combined the formula "Ambition must be made to counteract ambition" with Montesquieu's formula "Power must check power." The possibility of a sociological interpretation emerges clearly from the question, first asked by Bentham: What possible guarantee of liberty can there be in the separation of powers, if all three
powers are controlled by the same group or class? Obviously there can be no guarantee unless each of the three powers is in the hands of a different group or class. In that case, liberty is the outcome of a struggle among groups. This is a struggle of a particular and limited kind that varies with the type of government. Intrigue is essential in a democracy, for when there is no intrigue, the people, whose nature it is to act by passion, become subject to bribery and corruption; in short, they calculate their own interest when they should be directed by patriotic passion (ibid., book 2, chapter 2). In a monarchy, liberty exists when semiautonomous intermediate groups have the power to resist the will of the ruler, or at least to engage in negotiation with him when they feel their interests are threatened. Only in despotism is there no conflict among groups. Thus, even the most political part of Montesquieu's theory has a sociological dimension: conflict has positive functions, and the prerequisite of liberty is the existence of groups that are at least partially independent and that stand between the state and the individual.
Determinants of a society's spirit. The scope of Montesquieu's concern is global: "This work has for its object the laws, customs, and various usages of all peoples" (Oeuvres completes, vol. 2, p. 1137 in the Gallimard edition). Such a subject can be treated adequately only by a method at once comparative and historical. Comparison, the single most valuable capacity of the human mind, is particularly useful when applied to human collectivities (ibid., vol. 2, pp. 54, 57). For if we wish to explain why they have the characteristics they do, it is better to apply hypotheses to general effects known by comparison than to particular effects known from a single case. In making comparisons it should be remembered that in nature even members of the same class are not exactly alike, but only more or less so.
Furthermore, such social phenomena as laws must be regarded as forming part of a system, within which they function in some relation to the other parts of the system. In order to understand a system properly, it is also essential to know how it developed over time: to explain why laws exist, it is necessary to follow the historical process by which they have acquired a function within the context of a system, even though the original system may have ceased to operate (ibid., vol. 2, p. 1103). What constitutes an adequate explanation of why a nation has a given set of laws, a given social and political structure? Montesquieu answered that a satisfactory explanation must include the two major types of causes, physical and moral, which
together form a society's general spirit. Principal among physical causes is climate, which produces a number of physiological and mental consequences. Also to be taken into account are the quality of the terrain, the density of population, and the territorial extent of a society. Montesquieu, who made much of physical causes, nevertheless rejected the notion that they alone directly determine a society's mode of life. On the contrary, moral causes are more important than physical ones: a good legislator can, for example, minimize and overcome even the effects of climate (ibid., vol. 2, pp. 61-62). Many moral causes affect a society's general spirit: religion, laws, maxims, precedents, mores, customs, economy and trade, style of thought, and the atmosphere that is created in a nation's capital or court and then spreads to its outermost limits (1748, book 19, chapter 4). The general character of a society can also be seen in the style of education it gives to its members. There is nothing mystical, no notion of a Volksgeist for example, nothing transcending reason and experience, in Montesquieu's concept of the general spirit of a society. The general spirit results from a number of causes whose effects can be rationally assessed after empirical investigation.
Law. Montesquieu considered law to be among the most crucial determinants of human behavior. However, because his legal definitions in the first six books of the Spirit of the Laws are ambiguous and because he did not build on them in other parts of the book, it is best to seek to understand his use of the term "law" from its use in the work as a whole. For the most part he used "law" to mean any rule of conduct that is supported by governmental sanctions against those who disobey the rule. Montesquieu also used "law" to refer to rights and obligations protected or enforced by courts and to basic rules that must be followed by those who exercise power.
Despite the confusion, Plamenatz was correct to conclude that Montesquieu, more than Hobbes or Locke, understood the social function of law—it is made up of rules that control the governors as well as the governed (1963, p. 263). What is significant is Montesquieu's treatment of law as but one way of affecting human conduct. It is the method peculiar to the government. The society as a whole uses other means: religion, mores, and customs. Montesquieu did not underrate what can be done by laws that have behind them the coercive power of the state. But he wished to call attention to those forces outside the government that may limit the effectiveness of state action and thus serve a function equivalent to law
by using essentially social constraints to restrain human passions, wills, and imagination. Montesquieu did not attempt to reduce government to a derivative function of society, or vice versa; rather, he wished to specify the numerous and complex ways in which the political and social systems interact. Religion. Among the essentially social forces that may affect government, religion ranks high. Montesquieu's treatment of religion wavers between the rationalist theory, which he found in Machiavelli, that elites manipulate the credulous, and a more sophisticated sociological theory, which he was one of the first to develop. When following Machiavelli's lead, Montesquieu treated religion as something used by rulers much as they use laws. Both religion and law, for example, can be employed to overcome the worst effects of climate, such as reluctance to work the land (1748, book 14, chapter 6). Montesquieu also agreed with Machiavelli that it is easier to enforce laws in a religious country than elsewhere. But Montesquieu developed this instrumental theory into the theory that to the extent that religion is an effective force in a society, there is less need for control by the state. Religion, Montesquieu argued, can even save a state that would be overturned if its survival depended upon the capacity of its police to coerce the population. He emphasized the political and social effects of religion, seen always as operating within a given type of social organization: thus, the most sacred and true dogmas may produce the worst consequences, if these dogmas should turn out to be incongruent with the general spirit of a society. In a despotism, religion is the only restraint upon the ruler. In a republic, it is dangerous to allow the clergy to gain strength, but in a monarchy, a strong clergy helps maintain liberty. Religion also can determine men's orientations toward politics, economic activity, population, and liberty. 
In a sentence that later engaged Max Weber's interest, Montesquieu called attention to the fact that the English had been the people who had most effectively combined religion, commerce, and liberty (ibid., book 20, chapter 7).
Mores and customs. Two other causes affecting the general spirit are mores and customs, both of which closely resemble religion in their operation. They may be used as surrogates for laws of the state. "When a people has bonnes moeurs, its laws need not be complex" (ibid., book 19, chapter 22). Mores (moeurs) apply internalized restraints on conduct not specifically prohibited by law; customs (manieres) apply external restraints on such conduct, but the sanctions are social rather than
legal. The distinctions among laws, mores, and customs are analytical. In practice they may be confused, as in China or Sparta. Yet even there one predominated: in Sparta it was mores, in China customs. Implications of social theory. Montesquieu's social theory is especially significant because it emphasizes social determinants of behavior rather than legal sanctions. Hitherto, political theorists who had attempted empirical generalizations had concentrated almost exclusively on explanations based on the behavioral consequences of legal sanctions. Montesquieu offered instead a pluralist view of causation; he did not attempt to establish a hierarchy of causes, with priority assigned to nongovernmental as against governmental action. Montesquieu believed that the general spirit might be determined by any one, or a combination, of the causes he had identified. (Tocqueville was very much in Montesquieu's style when he concluded the first part of Democracy in America with the argument that the success of the United States had been caused more by the constitution than the climate and terrain, but that most important of all had been the mores of the inhabitants.) Montesquieu's emphasis on the general spirit also led him to discuss theories of national character. Every society has its own particular character, a mixture of good and bad qualities. Legislators ought not to fly in the face of this character, unless it violates principles necessary to the government's existence. Otherwise, apparently desirable innovations may produce disastrous consequences. Peoples have their own ways of reaching conclusions, their own style of thought, leur maniere de penser totale (Oeuvres completes, vol. 2, p. 1102 in the Gallimard edition). There can be little doubt about the conservative implications of this theory. Some of them Montesquieu developed; others he did not. 
Inherent in his position is an appeal to the past or a vision of the past from a particular place in the society and politics of his time. Yet to the extent that he was a spokesman for the parliamentary nobility, "Montesquieu was not a true conservative, because he was not satisfied with the way the Bourbon monarchy had developed and was developing in his time" (Palmer 1959, p. 60). Although Montesquieu's work as a theorist should not be assessed simply in terms of his class position, it would be a mistake to ignore its influence on his political values, his theory of politics, and his scheme of analysis taken as a whole. Conflict. The single most important doctrine in the Spirit of the Laws is Montesquieu's theory
that intermediate bodies like the nobility, the parlements, the local courts of seigneurial justice, and the church are all indispensable to political liberty. These and other constituted bodies, such as provinces, towns, guilds, and professional associations, all have their rights, legal powers, and privileges, none of which can be removed, since they all derive from the original institutions of the realm. Their present function is to balance one another and to serve as barriers to despotism. Such constituted bodies are not to be treated as equal in value. To do so would violate the essential principle of monarchy, which rests upon honor derived from inequality. The great—those most distinguished by birth, wealth, or honor—should have a share in legislation equal to their advantages. This, Montesquieu specified, is the power necessary to check the enterprises of the people, and it is as important to the state as the people's power to check the enterprises of the great (1748, book 11, chapter 6). Montesquieu's analysis of the British constitution demonstrates that he did not believe in rule by one class (Palmer 1959, pp. 57-58). In addition to a body of nobles, there should also be a body representing the people, that is, those who are not noble. Classes should be distinct, with the nobility a vital element in the balance. A hierarchical form of society and a noble class jealous of its privileges are essential to the preservation of liberty. Change. There is an ambivalence in Montesquieu's attitude toward political change. On the one hand he opposed large-scale innovations, especially if they were proposed as the implementation of a program deduced from abstract principles; on the other he himself suggested far-reaching reforms. 
In part his ambivalence derived from the fact that the legitimacy of his own class depended on historical rather than abstract arguments; in part from his belief that the reasons for the continued existence of a state are complex and probably unknowable. If the entire system were changed, unanticipated difficulties might arise. Piecemeal change is therefore best—precedents should guide policy (1734, chapter 18). Institutions of long standing tend to improve a people's mores, while new institutions tend to corrupt them (1748, book 5, chapter 7). Politics is an instrument that accomplishes its work by slowly wearing away resistance (ibid., book 14, chapter 13). A prudent administration seldom proceeds to its ends by direct means. It changes by law only what has been established by law; it attempts to change the mores not by legislation, but by introducing new mores. The uniformity invariably sought by a centralized administration leads to despotism. Political wisdom consists
in being able to discriminate those cases in which uniformity is preferable from those other instances in which diversity presents greater advantages (ibid., book 29, chapter 18).
Montesquieu neither opposed all that was new nor defended all that existed. In addition to attacking slavery and religious persecution, he argued that the state owes its inhabitants an assured subsistence, nourishment, clothing, and good health. It is also the state's duty to provide for orphans, the sick, and the old; it should feed the people in the event of famine (ibid., book 23, chapter 19). Much of this was based on a general aristocratic paternalism, but Montesquieu's values emerge clearly from his discussion of slavery. He took the position that slavery is incompatible with the general spirit of both republics and monarchies. Yet he added that emancipated slaves should be given only civil, not political, liberty. Even in popular governments, power should never be allowed to fall into the hands of the lowest classes (le bas peuple). Yet Montesquieu stressed the worth of education and denounced prejudice: knowledge makes men less cruel, prejudice leads them away from humanitarianism (ibid., book 15, chapter 4).
Evaluation
Nothing is easier than to criticize Montesquieu, even in the most valuable parts of his writings. The concepts he used as ideal types are defective, and his typology is both abstract and incomplete. It is abstract in the sense that no existing government fitted his specifications, despite the great number of monarchies in his time. England, whose laws he claimed came closest to achieving liberty, was not a monarchy as he defined it, for intermediate bodies no longer existed there. Montesquieu's types are incomplete even on his own showing: Books 19, 26, and 29 of the Spirit of the Laws either modify or greatly amplify his initial types.
Thus, in Book 19, while discussing the English constitution, he pointed out the advantages of representative over direct democracy. Yet he never included representation in his ideal type of democracy. Also in Book 19, he added political parties to his discussion of the politics of a free society, but again failed to explain how his types should be modified. And he virtually added a fourth type of government, the federative republic, formed by the confederation of a number of republics. It represented his solution to the puzzle of how republics could maintain their intimate scale and at the same time resist aggression by larger neighbors. Forgetfulness, inability or lack of willingness to revise, and absence of organization are everywhere evident in Montesquieu. To his intellectual faults may be added the fact that Montesquieu failed to emancipate his scheme of analysis from the perspective of his class.
Yet in large part he triumphed over these defects. Montesquieu was extraordinarily imaginative in formulating general hypotheses designed to relate those variables that must be taken into account when explaining social and political behavior. Montesquieu advanced a theory of politics and a conception of the relation between the political and social systems whose full usefulness made itself felt only later. He upheld the value of conflict in politics: the importance of pluralism in systems characterized by conciliation, compromise, and bargaining between intermediate groups and the central authority. He formulated the theoretical concepts that authority can be of diverse kinds and that order can be maintained by a variety of devices functionally equivalent to commands enforced by political and legal sanctions. He made comparison the central problem of political sociology and thus directed the focus of inquiry away from Europe to all the societies known, however imperfectly, to man.
MELVIN RICHTER
[For the historical context of Montesquieu's work, see CONSTITUTIONS AND CONSTITUTIONALISM; LEGAL SYSTEMS; POLITICAL THEORY; PUBLIC LAW, article
on COMPARATIVE STUDY; and the biographies of BODIN; MACHIAVELLI; for discussion of the subsequent development of Montesquieu's ideas, see the biographies of COMTE; DURKHEIM; HEGEL; MEINECKE; TOCQUEVILLE.] WORKS BY MONTESQUIEU
(1721) 1964 The Persian Letters. Translated by George R. Healy. Indianapolis: Bobbs-Merrill. -> First published as Lettres persanes.
(1734) 1965 Considerations on the Causes of the Greatness of the Romans and Their Decline. Translated, with notes and an introduction by David Lowenthal. New York: Free Press. -> First published in French. Translations of extracts in the text were provided by Melvin Richter.
(1748) 1950-1961 De l'esprit des loix. Vols. 1-4. Edited by Jean Brethe de la Gressaye. Paris: Societe Les Belles Lettres. -> The best critical edition. Translations of extracts in the text were provided by Melvin Richter. An English translation was published by Hafner in 1962.
Oeuvres completes. Edited by Roger Caillois. 2 vols. Bibliotheque de la Pleiade, Vols. 81, 86. Paris: Gallimard, 1949-1951. -> The most generally available edition. It does not contain Montesquieu's correspondence. Translations of extracts in the text were provided by Melvin Richter.
Oeuvres completes. Vols. 1-3. Paris: Nagel, 1950-1955. -> The best edition of Montesquieu; includes his correspondence.
SUPPLEMENTARY BIBLIOGRAPHY
ALTHUSSER, LOUIS 1959 Montesquieu: La politique et l'histoire. Paris: Presses Universitaires de France. -> A provocative Marxist treatment.
ARON, RAYMOND (1960) 1965 Main Currents in Sociological Thought. Volume 1: Montesquieu, Comte, Marx, Tocqueville: The Sociologists and the Revolution of 1848. New York: Basic Books. -> First published in French. Perhaps the best brief treatment of Montesquieu as a political sociologist.
BARCKHAUSEN, HENRI A. 1907 Montesquieu: Ses idees et ses oeuvres d'apres les papiers de La Brede. Paris: Hachette.
BORDEAUX (France) 1948 Montesquieu et L'esprit des lois: Exposition organisee dans les salons de l'Hotel de Ville de Bordeaux pour celebrer le deuxieme centenaire de la publication de L'esprit des lois. Bordeaux: Delmas.
BORDEAUX (France) 1956 Actes du Congres Montesquieu, reuni a Bordeaux du 23 au 26 mai 1955 pour commemorer le deuxieme centenaire de la mort de Montesquieu. Bordeaux: Delmas. -> Contains 31 critical essays.
CABEEN, DAVID C. 1947 Montesquieu: A Critical Bibliography. New York Public Library. -> An annotated bibliography. Restricted to works by Montesquieu examined by the author at the Columbia University Library and the New York Public Library.
CABEEN, DAVID C. 1955 A Supplementary Montesquieu Bibliography. Revue internationale de philosophie 9:409-434.
CARCASSONNE, E. 1927 Montesquieu et le probleme de la constitution francaise au XVIIIe siecle. Paris: Presses Universitaires de France.
CASSIRER, ERNST (1932) 1951 The Philosophy of the Enlightenment. Princeton Univ. Press. -> First published as Die Philosophie der Aufklärung.
COMTE, AUGUSTE (1830-1842) 1877 Cours de philosophie positive. 4th ed. 6 vols. Paris: Bailliere.
DEDIEU, JOSEPH 1909 Montesquieu et la tradition politique anglaise en France: Les sources anglaises de L'esprit des lois. Paris: Gabalda. -> An important study of English influences on Montesquieu.
DEDIEU, JOSEPH 1913 Montesquieu. Paris: Alcan.
DEDIEU, JOSEPH 1943 Montesquieu, l'homme et l'oeuvre. Paris: Boivin.
DURKHEIM, EMILE (1892-1918) 1960 Montesquieu and Rousseau: Forerunners of Sociology. Ann Arbor: Univ. of Michigan Press. -> Part 1 is a translation of Durkheim's thesis Quid Secundatus politicae scientiae instituendae contulerit (1892); Part 2 was first published in Volume 25 of the Revue de metaphysique et de morale.
EHRARD, JEAN 1963 L'idee de nature en France dans la premiere moitie du XVIIIe siecle. 2 vols. Paris: S.E.V.P.E.N.
FLETCHER, FRANK T. H. 1939 Montesquieu and English Politics (1750-1800). London: Arnold.
FORD, FRANKLIN L. 1953 Robe and Sword: The Regrouping of the French Aristocracy After Louis XIV. Harvard Historical Studies, Vol. 64. Cambridge, Mass.: Harvard Univ. Press.
HEGEL, GEORG WILHELM FRIEDRICH (1821) 1942 The Philosophy of Right. Translated with notes by T. M. Knox. Oxford: Clarendon.
LEVIN, LAWRENCE M. 1936 The Political Doctrine of Montesquieu's Esprit des lois: Its Classical Background. New York: Columbia Univ., Institute of French Studies.
MAUZI, ROBERT 1960 L'idee du bonheur dans la litterature et la pensee francaises du XVIIIe siecle. Paris: Colin.
MEINECKE, FRIEDRICH (1936) 1959 Werke. Volume 3: Die Entstehung des Historismus. Munich: Oldenbourg.
NEUMANN, FRANZ (1949) 1962 Editor's Introduction. In Montesquieu, The Spirit of the Laws. New York: Hafner.
PALMER, ROBERT R. 1959-1964 The Age of the Democratic Revolution: A Political History of Europe and America, 1760-1800. 2 vols. Princeton Univ. Press.
PARIS, UNIVERSITE DE, INSTITUT DE DROIT COMPARE 1952 La pensee politique et constitutionnelle de Montesquieu: Bicentenaire de L'esprit des lois 1748-1948. Paris: Sirey.
PLAMENATZ, JOHN P. 1963 Man and Society: Political and Social Theory. Volume 2: Bentham Through Marx. New York: McGraw-Hill.
POLLOCK, FREDERICK (1890) 1960 An Introduction to the History of the Science of Politics. Boston: Beacon.
RICHTER, MELVIN 1963 [A Book Review of] Montesquieu: A Critical Biography, by Robert Shackleton. History and Theory 3:266-274.
ROTHKRUG, LIONEL 1965 Opposition to Louis XIV: The Political and Social Origins of the French Enlightenment. Princeton Univ. Press.
RUNCIMAN, W. G. 1963 Social Science and Political Theory. Cambridge Univ. Press.
SHACKLETON, ROBERT 1961 Montesquieu: A Critical Biography. Oxford Univ. Press. -> The best biography in any language.
SOREL, ALBERT (1887) 1888 Montesquieu. Translated by Melville B. Anderson and Edward Playfair Anderson. Chicago: McClurg. -> First published in French by Hachette. A German translation was published in Berlin in 1896 by Hofmann.
SPURLIN, PAUL M. 1940 Montesquieu in America, 1760-1801. Louisiana State Univ., Romance Language Series, No. 4. University: Louisiana State Univ. Press.
TOUCHARD, JEAN 1959 Histoire des idees politiques. 2 vols. Paris: Presses Universitaires de France.
MONTESSORI, MARIA
Maria Montessori (1870-1952), Italian educator, was born in the provincial town of Chiaravalle. Her father, a conservative army officer, had little sympathy with his daughter's desire for a career, but she received encouragement from her mother. Montessori attended a lay state school until she was 12, when the family moved to Rome for better educational opportunities. At 14, because of an interest in mathematics and engineering, she went to classes at the technical institute; this interest gave way to an interest in biology, which led ultimately to her decision to study medicine. She became the first woman graduate of a medical school in Italy, despite difficulties which surely enhanced her strong feminist leanings. (She attended several international feminist congresses.)
As an assistant doctor at the Psychiatric Clinic of the University of Rome, she had her first encounter with defective children, and this early experience convinced her that the problem of handicapped children is a pedagogical as well as a medical one. Previous advocates of this approach were Jean Itard, who worked with deaf-mutes as well as with "the wild boy of Aveyron," and Itard's student Edouard Seguin, who founded a school for defectives in Rome; their work reinforced her conviction that the difficulties of the handicapped could be ameliorated by special educational treatment.
In 1899 Montessori became the directress of the State Orthophrenic School in Rome, which served the "hopelessly deficient" children of the city, and later also the "idiot" children. There she taught the children and trained other teachers to work with them. She visited London and Paris to exchange ideas on methods of treatment with others in this field. The mentality of the children in the institution developed so remarkably and unexpectedly that she received considerable attention. Her success made her want to try the same methods and techniques with normal children, and the opportunity came when in 1906 the Italian government gave her the responsibility for 60 children aged three to six from the slums of the San Lorenzo quarter of Rome, the beginning of her famous Casa dei Bambini.
Meanwhile, in 1901 she had left the Orthophrenic School to resume studies at the University of Rome; she sought "further study and meditation" in psychology and philosophy. She was then holding the chair of hygiene at the Scuola di Magistero Femminile in Rome and was a permanent external examiner in the faculty of pedagogy. In 1904 she became a professor at the University of Rome, and from 1904 until 1908 held a chair of anthropology there.
In addition to lecturing (some of her published works were based on her auditors' lecture notes), she was practicing not only in hospitals and clinics but also privately, and it was through this extensive practical application of her methods and principles that she came to formulate her conception of the nature of the child that underlay the program of the Casa dei Bambini. The Montessori method. It was in the early years of the Casa dei Bambini that the fundamentals of what we now know as the Montessori method were developed. This "Children's House," as well
as subsequent ones, proved to be an excellent way of dealing with cultural deprivation. The "prepared environment" set a basic atmosphere for learning, with room for "the liberty of the pupils in their spontaneous manifestations." In keeping with her belief that the teacher must be kept in the background, guiding and disciplining minimally, the entire staff consisted of herself and two untrained young women. The activity materials provided an opportunity for the child to acquire important percepts through sensory-motor means. Each "game" was designed to teach a skill or a fact. There were no benches, desks, or stationary chairs (standard equipment in schools prior to Montessori) but, rather, small chairs and tables, a low washstand, and low blackboards, all making the daily routine easy for the child. Long low cupboards contained the didactic materials, the care of which was entrusted to the children: these materials included counting beads in blocks of ten; two-dimensional geometric puzzles; graduated prisms, rods, and cubes; letters of the alphabet made of sandpaper, cardboard, and wood, for obtaining direct sensory impression of the letters; and series of tuned bells. In this "prepared environment" the child practiced the education of his senses, reading, metrics, grammar, music, manual training, and gymnastics, and he also learned cleanliness, order, poise, absorption, and patience. The pleasure the children took in silently concentrating on the materials was remarkable. Montessori had the ability to learn from observing the children at work on the apparatus and constantly made constructive changes in the "work situation." 
Montessori made certain generalizations on the basis of her observations: that children go through a series of "sensitive periods" with their "creative moments," when they show spontaneous interest in learning and have maximal ability to do so; that children prefer "work" with creative materials to "play" with objects defined as toys; that they have an extraordinary capacity for mental concentration, a desire to repeat activities over and over, and a love of order, for which witness their concern that materials be returned where they "belong"; that "work is its own reward" and there is no need for external reward; and that since spontaneous self-discipline is created by the liberty and independence of the school situation, there is no need for punishment (other than isolation). Indeed, Montessori became quite mystical about this notion of self-discipline: she saw it as a continuation of the cosmic discipline that orders the stars. A further general pattern that she identified was
the existence of spontaneous "advanced interests," for example, "the burst into writing," which precedes by several months the "burst into reading"; by virtue of these "advanced interests," three- and four-year-old children begin to read and write with the materials available to them in the classroom. Influence. Montessori's work grew out of a dedication to individual self-expression that goes back to the eighteenth century; she belongs in the tradition of Rousseau, Froebel, and Pestalozzi. Also, her work is related to that strain in evolutionary thought which stresses development. But the hereditarian stress in Darwin's theory runs counter to her own emphasis on the importance of early experience, and her work was not in harmony with other strong intellectual trends of the first half of the twentieth century: behaviorism, with its emphasis on stimulus-response learning; the notion of fixed intelligence, based on intelligence testing; and the psychoanalytic emphasis on instinctual, and especially psychosexual, determination of personality and behavior. "Progressive education," as conceived primarily by John Dewey, was more in keeping with these trends, and as it came to dominate education, the Montessori system was all but forgotten. Although the Montessori method did spread abroad from Rome after 1918—Montessori's publications were translated into 20 languages, and training courses were set up in England, Ireland, Germany, Spain, Ceylon, and Argentina—there was only a brief flurry of interest in it in the United States when Montessori visited there in 1913. Recently, beginning in the 1950s, there has been a resurgence of interest, related perhaps to such developments as reforms in the mathematics and science curricula in the schools and new concern for handicapped children—handicapped genetically or environmentally. This renewed interest has produced many new Montessori schools and training centers. 
It may well be that the Montessori method is more than a fad, that it deals, instead, with fundamental aspects of learning.
JACQUELINE Y. SUTTON
[See also DEVELOPMENTAL PSYCHOLOGY; EDUCATIONAL PSYCHOLOGY; INTELLECTUAL DEVELOPMENT; and the biographies of CLAPAREDE; DEWEY; GESELL.]
WORKS BY MONTESSORI
(1909) 1964 The Montessori Method: Scientific Pedagogy as Applied to Child Education in "The Children's Houses." Cambridge, Mass.: Bentley. -> First published as Il metodo della pedagogia scientifica. . . . A paperback edition was published by Schocken with an Introduction by J. McV. Hunt.
(1910) 1913 Pedagogical Anthropology. New York: Stokes. -> First published as Antropologia pedagogica.
(1914) 1966 A Montessori Handbook: "Dr. Montessori's Own Handbook." Edited by R. C. Orem. New York: Putnam.
(1916-1917) 1964 The Advanced Montessori Method. 2 vols. Cambridge, Mass.: Bentley. -> Volume 1: Spontaneous Activity in Education. Volume 2: Montessori Elementary Material. First published in Italian.
(1924) 1965 Child in the Church: Essays on the Religious Education of Children and the Training of Character. 2d ed. Edited by Edward M. Standing. St. Paul (Minn.): Catechetical Guild. -> A collection of essays, excerpts, and conversations first published in Italian.
1936 The Secret of Childhood. London: Longmans. -> A second edition was published in 1950 in Italian as Il segreto dell'infanzia.
1946 Education for a New World. Asundale Montessori Training Center, Adyar, Madras Publication Series, No. 1. Madras (India): Kalakshetra.
(1949a) 1964 The Absorbent Mind. 5th ed. Madras (India): Theosophical Publishing House.
(1949b) 1955 The Formation of Man. Madras (India): Theosophical Publishing House. -> First published in Italian.
SUPPLEMENTARY BIBLIOGRAPHY
BRUNER, JEROME S. 1960 The Process of Education. Cambridge, Mass.: Harvard Univ. Press.
DONAHUE, GILBERT E. 1962 Dr. Maria Montessori and the Montessori Movement: A General Bibliography of Materials in the English Language, 1909-1961. Pages 141-175 in Nancy M. Rambusch, Learning How to Learn: An American Approach to Montessori. Baltimore and Dublin: Helicon.
ITARD, JEAN M. G. (1801) 1932 Wild Boy of Aveyron. New York: Century. -> First published as De l'education d'un homme sauvage, ou des premiers developpements physiques et moraux du jeune sauvage de l'Aveyron.
LEWIN, KURT (1931) 1935 Education for Reality. Pages 171-179 in Kurt Lewin, A Dynamic Theory of Personality: Selected Papers. New York: McGraw-Hill.
PIAGET, JEAN (1923) 1959 The Language and Thought of the Child. 3d ed., rev. New York: Humanities Press. -> First published as Le langage et la pensee chez l'enfant.
RAMBUSCH, NANCY M. 1962 Learning How to Learn: An American Approach to Montessori. Baltimore and Dublin: Helicon.
SEGUIN, EDOUARD 1846 Traitement moral, hygiene et education des idiots. Paris: Bailliere.
STANDING, EDWARD M. 1959 Maria Montessori: Her Life and Work. Fresno, Calif.: Academy Library Guild.
STANDING, EDWARD M. 1962 The Montessori Method: A Revolution in Education. Fresno, Calif.: Academy Library Guild.
MOONEY, JAMES
James Mooney (1861-1921), author of "The Ghost-dance Religion and the Sioux Outbreak of 1890" (1896), The Aboriginal Population of America North of Mexico (1928), and other distinguished works on the American Indian, was born in a small town in Indiana of Irish immigrant parents. As a youth, he developed an intense interest in the Indian tribes of the Americas and early determined to be an ethnologist. After working briefly as a schoolteacher and newspaperman in Indiana, he went to Washington, D.C., where he met John Wesley Powell, the founder of the Bureau of American Ethnology in the Smithsonian Institution. Powell hired him as an ethnologist in 1885, and from then until his death Mooney worked for the bureau.
Mooney undertook many field trips among North American tribes, becoming an expert particularly on the Cherokee and the Kiowa. He collected historical and linguistic data which contributed heavily not only to his own work but also to the collaborative effort which led to the publication of the Handbook of American Indians North of Mexico (the famous Bulletin 30 of the Bureau of American Ethnology). He published extensively on the Cherokee and other American Indian tribal groups, always basing his publications on substantial personal field work and historical research. The value of his reports is enhanced today by the fact that his research was done at an early date, when some of the Oklahoma tribes still lived relatively independent of interference in Indian Territory and others, only recently confined to reservations, preserved customs and values of prior ages.
His most celebrated work, "The Ghost-dance Religion and the Sioux Outbreak of 1890," was a careful account of the religion which swept across the reservations west of the Mississippi in 1890 and the following years. Mooney talked to the Paiute prophet, Wovoka, and visited many of the tribes, including the Sioux, who were receptive to the ghost dance. The ghost dancers believed that a native millennium was about to arrive, in which the faithful dancers would be saved, to live in the ancient way in a world relieved of white men and their customs. The organization of the ghost dance among the Sioux, itself a response to various economic and political pressures, led to the killing of Sitting Bull, then chief, and to the massacre of Indians who resisted being disarmed. Mooney's sympathetic account of the dance generally and of Sioux resentments in particular has become a classic of early American ethnography. As a result, the ghost dance has achieved international fame and is often treated in secondary sources as the very prototype of nativistic or revitalization movements. His ability to understand the nativistic aspirations of the Indians and to see in their behavior a homologue with religious enthusiasms in other times and places (in the early phases of the great religions and the movement of Joan of Arc, as well as other Indian movements) was probably owing in part to his own vigorous interest in Irish nationalism.
ANTHONY F. C. WALLACE
[See also INDIANS, NORTH AMERICAN; MILLENARISM; NATIVISM AND REVIVALISM.]
WORKS BY MOONEY
1885 Linguistic Families of the Indian Tribes North of Mexico, With Provisional List of the Principal Tribal Names and Synonyms. U.S. Bureau of American Ethnology, Misc. Publ. No. 3. Washington: The Bureau.
1891 The Sacred Formulas of the Cherokees. Pages 301-397 in U.S. Bureau of American Ethnology, Seventh Annual Report . . . 1885-1886. Washington: The Bureau.
1894 Siouan Tribes of the East. U.S. Bureau of American Ethnology, Bulletin 22:1-101.
1896 The Ghost-dance Religion and the Sioux Outbreak of 1890. Part 2, pages 641-1136 in U.S. Bureau of American Ethnology, Fourteenth Annual Report . . . 1892-1893. Washington: The Bureau. -> An abridged edition with an introduction by Anthony F. C. Wallace was published in 1965 by the Univ. of Chicago Press.
1898 Calendar History of the Kiowa Indians. Part 1, pages 129-445 in U.S. Bureau of American Ethnology, Seventeenth Annual Report . . . 1895-1896. Washington: The Bureau.
1900 Myths of the Cherokee. Part 1, pages 3-548 in U.S. Bureau of American Ethnology, Nineteenth Annual Report . . . 1897-1898. Washington: The Bureau.
1907 The Cheyenne Indians. American Anthropological Association, Memoirs 1:357-442.
1928 The Aboriginal Population of America North of Mexico. Smithsonian Miscellaneous Collections, Vol. 80, No. 7. Washington: Smithsonian Institution.
SUPPLEMENTARY BIBLIOGRAPHY
[HEWITT, J. N. B.] 1922 James Mooney. American Anthropologist New Series 24:209-214. -> Includes a comprehensive bibliography of Mooney's works, prepared by his wife.
HODGE, FREDERICK W. (editor) (1907-1910) 1959 Handbook of American Indians North of Mexico. 2 vols. Smithsonian Institution, Bureau of American Ethnology, Bulletin No. 30. New York: Pageant.
MOORE, HENRY L.
Henry Ludwell Moore (1869-1958), American economist, made the first major attempts to combine economic theory and statistical techniques in the empirical estimation of theoretical economic relationships. Quantitative estimates of elasticities of demand and supply, of productivity changes and of the nature of strikes, of cost curves and of determinants of wage rates are so prominent in (even characteristic of) modern economics that it is difficult to remember that they were initiated only in the present century, and by Moore more than by any other economist.
Moore's life was that of a scholar wholehearted in his devotion to research. After receiving his Ph.D. from Johns Hopkins University in 1896, he began teaching at Smith College, and in 1902 he went to Columbia University, where he remained until 1929; aside from two terms in Karl Pearson's statistical laboratory (in 1909 and 1913), he had no important association with any other institution or type of activity. His premature retirement was due to poor health.
His first professional interest was the history of economic thought. Moore's dissertation was a competent survey of the vast literature on von Thünen's celebrated natural rate of wages, √(ap), where a is the subsistence of a workingman's family and p the total product per workingman (1895). A second study was devoted to Cournot, the great pioneer of the mathematical method in economics (1902). Soon, however, his interests shifted to what he called statistical economics and is now more commonly called econometrics.
His first publication in statistical economics was a set of essays, all connected with labor, Laws of Wages (1911). Several of the essays were devoted to a verification of the marginal productivity theory, as applied to the pattern of average wages in coal mining over time and space and to the differences between wages of individuals. In general these investigations displayed careful and (for that time) sophisticated statistical methodology, but the hypotheses were very loose in their theoretical formulation. Moore's contemporaries properly applauded the purpose and criticized the execution of these studies. A second set of essays in this first volume proposed empirical uniformities, for which theoretical explanations might then be sought. One was a demonstration that within an industry or occupation, the larger the establishment, the higher the wage rates, a finding confirmed by later research.
A second was an attempt to measure the influence of unions on the outcome of strikes, with much more ambiguous results. Three years later Moore's Economic Cycles (1914) launched the most important of all his work, the empirical estimation of theoretical relationships. Yet these estimates were essentially only by-products of Moore's search for a truly fundamental explanation of fluctuations in the level of economic activity. The main theme, as he saw it, is as follows: The principal contribution of this Essay is the discovery of the law and cause of Economic Cycles. The rhythm in the activity of economic life, the alternation
of buoyant, purposeful expansion with aimless depression, is caused by the rhythm in the yield per acre of the crops; while the rhythm in the production of the crops is, in turn, caused by the rhythm of changing weather which is represented by the cyclical changes in the amount of rainfall. The law of the cycles of rainfall is the law of the cycles of the crops and the law of Economic Cycles. (p. 135)
The support for this bold claim consisted of four steps: (1) The discovery of several cycles, the chief being of eight years' duration, in rainfall in Ohio. (2) The argument that the rainfall cycles lead to cycles of equal duration in yields per acre (but lagged by half a cycle). (3) The demonstration that yields per acre are inversely related to the prices of the grain products. (4) The argument that demand curves for agricultural products shift upward in periods of rising industrial prices. If, as Moore for a time believed, the demand curve for pig iron (a typical industrial good) is positively sloped and the volume of pig iron falls when crops decline, then the price of pig iron falls when crops are small, thus lowering the demand for the crops. The cycle in rain has thus been carried through to the cycles in outputs and prices of industrial goods.
Only the third step in this argument, in which statistical demand curves are estimated, was well founded, and it was this part of Moore's work which had the major impact on economics. Moore's first demand curve, that for corn from 1866 to 1911, illustrates his procedures. The historical series of prices and outputs are influenced by increasing population and fluctuations of price levels, so the annual price and quantity are expressed as ratios to the previous years' prices and quantities (link relatives). A linear demand equation then yields an elasticity of -1.12 (with r = -.789); a cubic equation yields an elasticity of -.92.
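The link-relative procedure just described can be illustrated with a minimal sketch. It rests on several assumptions that are not in the article: the price and output figures are invented (they are not Moore's corn series), the helper names are hypothetical, and the regression is run with percentage quantity change as the dependent variable so that the slope can be read directly as an elasticity. The article does not say in which direction Moore fitted his line.

```python
def link_relatives(series):
    """Ratio of each year's value to the previous year's value."""
    return [curr / prev for prev, curr in zip(series, series[1:])]


def pct_changes(series):
    """Percentage change from the previous year: 100 * (link relative - 1)."""
    return [100.0 * (ratio - 1.0) for ratio in link_relatives(series)]


def least_squares(x, y):
    """Return slope, intercept and correlation of the line y = a + b*x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, sxy / (sxx * syy) ** 0.5


# Hypothetical annual prices and outputs, invented for illustration only.
prices = [50.0, 42.0, 55.0, 48.0, 60.0, 45.0, 52.0]
outputs = [100.0, 118.0, 96.0, 108.0, 92.0, 120.0, 106.0]

dp = pct_changes(prices)    # percentage price changes
dq = pct_changes(outputs)   # percentage quantity changes

# Assumed convention: slope of quantity change on price change = elasticity.
elasticity, intercept, r = least_squares(dp, dq)
print(f"elasticity estimate: {elasticity:.2f}, correlation r: {r:.3f}")
```

Working with year-to-year ratios rather than levels is what removes the slow drift from population growth and price-level movements; the regression then sees only the short-run co-movement of price and quantity.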
Moore examined demand functions for periods of rising and of falling general prices—they differed little— but never introduced prices of substitutes or income (for which no satisfactory data existed). The work on demand curves was extended in Forecasting the Yield and the Price of Cotton (1917). Here he developed predictions of the size of the cotton crop on the basis of early-season rainfall which were more reliable than the elaborate crop forecasting system of the U.S. Department of Agriculture. Subsequently Moore introduced the concept of the flexibility of prices (the relative change in price divided by the relative change in
quantity, that is, the reciprocal of the elasticity of demand in a two-variable relationship) and the partial elasticity of demand. Two years later Moore extended this type of analysis to supply curves by correlating percentage changes in acreage with percentage changes in prices a year earlier (1919). This analysis assumes that farmers predict that this year's price (or price change) will continue next year, and this approach led Henry Schultz (who was Moore's chief disciple) to formulate the cobweb analysis.

Until 1923, however, Moore continued to consider his research on the extraterrestrial theory of cycles of primary importance. He found eight-year cycles almost everywhere and ultimately attributed them to the transits of Venus. Eventually he accepted the futility of this work, and he stopped working on the subject shortly after the book Generating Economic Cycles (1923) appeared.

Moore's final book, Synthetic Economics (1929), proposed the boldest of programs: the statistical estimation of Walras' equations of general equilibrium. But although Moore's vision continued to be superlatively farsighted, he had not the power to translate this vision into a workable research program.

The increasing rigor of economic theory, the expanding arsenal of statistical techniques, and the increasing intervention of the state in economic life all contributed to the cordial reception of Moore's work. The 1920s saw an extensive application of his techniques to agricultural products, and from this base empirical estimation of theoretical functions has spread over the entire corpus of economics.

GEORGE J. STIGLER

[For the historical context of Moore's work, see the biographies of COURNOT and THÜNEN. For discussion of the subsequent development of Moore's ideas, see DEMAND AND SUPPLY, article on ECONOMETRIC STUDIES; ECONOMETRICS; TIME SERIES, article on CYCLES; and the biography of SCHULTZ.]

WORKS BY MOORE
1895 Von Thünen's Theory of Natural Wages. Quarterly Journal of Economics 9:291-304, 388-408.
1902 Antoine-Augustin Cournot. Revue de métaphysique et de morale 13:521-543.
1911 Laws of Wages: An Essay in Statistical Economics. New York: Macmillan.
1914 Economic Cycles: Their Law and Cause. New York: Macmillan.
1917 Forecasting the Yield and the Price of Cotton. New York: Macmillan.
1919 Empirical Laws of Demand and Supply and the Flexibility of Prices. Political Science Quarterly 34:546-567.
1923 Generating Economic Cycles. New York: Macmillan.
1929 Synthetic Economics. New York: Macmillan.

SUPPLEMENTARY BIBLIOGRAPHY
STIGLER, GEORGE J. 1962 Henry L. Moore and Statistical Economics. Econometrica 30:1-21. → Contains a complete bibliography of Moore's work on pages 19-21, and references both to the leading reviews by his contemporaries and to the work on statistical demand curves in the 1920s.
MOORE, JOHN BASSETT

John Bassett Moore (1860-1947), the outstanding international lawyer of his generation, was born in Smyrna, Delaware. His father, John Adams Moore, was a prominent physician and for a time a member of the Delaware legislature; his mother, Martha Anne Ferguson, came from a family with classical interests, and Moore frequently said that one of his treasures was his mother's copy of Liddell and Scott's massive Greek-English Lexicon. In 1870 Moore's father, who had moved to the town of Felton, was one of the principal founders of the Felton Institute and Classical Seminary. Moore attended the Felton Seminary, as it was popularly called, and when ready for college chose the University of Virginia, partly because of the climate. He spent three years there, from 1877 to 1880, then studied law privately, and in 1883 was admitted to the Delaware bar. Two years later, when civil service examinations were held to fill the position of law clerk in the Department of State at Washington, Moore was one of the four young men who passed the examination. His selection for the post was certain because the then secretary of state, Thomas F. Bayard of Delaware, was a friend of the family.

From 1886 to 1891 Moore served as third assistant secretary of state and then left Washington to join the faculty of Columbia University where, until his retirement in 1924, he was Hamilton Fish professor of international law and diplomacy. Although Moore was often on leave from the university to perform public services, he never neglected his students: a score who took their doctoral degrees under him did notable service in Washington, on the faculties of law schools, and in political science departments. For six months in 1898 he was again assistant secretary of state. In 1913 he became counselor of the Department of State with power to sign as secretary, but he resigned the next year because he was critical of some phases of Woodrow Wilson's foreign policy.
He was repeatedly an American delegate to international conferences. He served as a member of the Permanent Court of Arbitration, The Hague, from 1912 to 1938 and (even though the United States was not a member of the League of Nations) was a judge of the Permanent Court of International Justice from 1921 to 1928. The first group, he remarked in 1943, was more distinguished than the second, as they constituted only an "eligible list" of persons nominated by their governments to serve as members of a panel of arbitrators and "were not required at once, if ever, to abandon their usual pursuits and live a sacrificial life abroad" (The Collected Papers, vol. 7, p. 348).

Before he went to Columbia, Moore had published a good deal, principally on extradition and extraterritoriality. In 1898 he brought out the History and Digest of the International Arbitrations to Which the United States Has Been a Party; in 1905 he published the textbook American Diplomacy and, the following year, the monumental Digest of International Law. Thereafter, every treatise writer relied on "Moore's Digest." He continued his interest in arbitration and between 1929 and 1933 edited six volumes of International Adjudications, Ancient and Modern. Moore edited a 12-volume edition of The Works of James Buchanan (1908-1911).

Most of Moore's minor writings were published in 1944 as The Collected Papers of John Bassett Moore. The arrangement is chronological: the first item is a previously unpublished Fourth of July speech that Moore had made in 1877 at the age of 17, and the last item a hitherto unpublished, brief monograph, "Peace, Law and Hysteria," described as "a 'dissertation' chiefly written prior to 1936 and completed in 1943" (vol. 7, pp. 220-349). In between are addresses, articles from the law reviews and popular journals, books (e.g., the 1924 "International Law and Some Current Illusions" [vol. 6, pp.
1-280]), legal opinions given to clients, letters to newspapers, and 125 book reviews, whose urbanity sometimes softens devastating criticisms. The much-discussed 1933 article, "An Appeal to Reason," is also included (vol. 6, pp. 416-464). Practically all of these "papers" demonstrate that Moore had a good classical education, that he was a learned historian, that he was steeped in great literature, that he had a keen wit, and that his career had enabled him not infrequently to participate in important events. Moore did not seek involvement in the heated controversies over legal and political matters that followed World War I (he often asserted that this description was incorrect, that there had been previous world wars), but he never concealed his
opinions. He adhered to traditional international law and was skeptical of attempts to "modernize" it. Throughout his life Moore was constantly aware of La Rochefoucauld's maxim "The mind is the dupe of the heart," and he seldom engaged in wishful thinking. His attitude toward the League of Nations Covenant was that expressed by Cardinal Fleury when he was shown the Abbé de Saint-Pierre's Projet de paix perpétuelle: "Admirable, Sire, save for one omission: I find no provision for sending missionaries to convert the hearts of the princes." Moore repudiated "the notion that every alleged violation of international law gives to every member of the international community a right of action against the supposed violator. . . ." This, he maintained, "is no less a counsel of anarchy and confusion than would be the claim that every alleged infraction of municipal law gives to every individual in the domestic community a right of action against the alleged wrongdoer" ([1937] 1944, p. 142).

Moore thought that the search for "collective security" was doomed to failure. He declared that the Kellogg Pact "constitutes with its record, experience and reservations, the most sweeping concession ever made to undefined claims of interest and the right to defend them by force" ([1935] 1944, p. 44) and for the "new neutrality" he had nothing but contempt. "The other day, when some one asked me what the 'new neutrality' meant," he wrote in a letter to the New York Sun on Dec. 10, 1935, "I replied that, as its limitations appeared to be wholly emotional, it perhaps might be best defined in the terms of the 'new chastity,' which encouraged fornication in the hope that it might reach the stage of legalized prostitution. In other words, the 'new neutrality' appears to be intended to get us into war, which is in a special legal category, by acts which cannot be defended on legal or moral grounds" (ibid.).
When, during the Wilson administration, the government of the United States departed from its traditional policy of extending recognition to any new government that controlled its territory and promised to fulfill its obligations, Moore was horrified. He never believed in refusing to recognize a certain regime in order to show disapproval of its character and policies. He outlined at length his views on this matter in an address "Candor and Common Sense" before the Association of the Bar of the City of New York in December 1930 (The Collected Papers, vol. 6, pp. 340-368).

Moore received many foreign decorations and honorary degrees and was a member of the principal learned societies. Fellow lawyers and corporations frequently retained him as special counsel, and from 1925 on he was a director of the Equitable Life Assurance Society. His private papers (including many boxes of correspondence) are in the Library of Congress and are much used by students of the diplomatic history of the period during which he was active.

LINDSAY ROGERS

[For the context of Moore's work, see INTERNATIONAL LAW.]

WORKS BY MOORE
1898 History and Digest of the International Arbitrations to Which the United States Has Been a Party. 6 vols. Washington: Government Printing Office.
1905 American Diplomacy, Its Spirit and Achievements. New York: Harper. → A revision and amplification of a series of articles that appeared in Harper's Magazine.
1906 A Digest of International Law as Embodied in Diplomatic Discussion, Treaties and Other International Agreements . . . . 8 vols. Washington: Government Printing Office.
1908-1911 BUCHANAN, JAMES The Works of James Buchanan, Comprising His Speeches, State Papers, and Private Correspondence. Collected and edited by John Bassett Moore. 12 vols. Philadelphia: Lippincott.
1929-1933 MOORE, JOHN BASSETT (editor) International Adjudications, Ancient and Modern: Modern Series. 6 vols. New York: Oxford Univ. Press.
(1935) 1944 The "New Neutrality" Defined. Volume 7, pages 43-45 in The Collected Papers of John Bassett Moore. Oxford Univ. Press; Yale Univ. Press.
(1937) 1944 The Dictatorial Drift. Volume 7, pages 136-149 in The Collected Papers of John Bassett Moore. Oxford Univ. Press; Yale Univ. Press.
The Collected Papers of John Bassett Moore. 7 vols. Oxford Univ. Press; Yale Univ. Press, 1944. → A comprehensive bibliography of Moore's works appears in Volume 7, pages 351-372.
MORAL DEVELOPMENT

The study of moral development has long been recognized as a key problem area in the social sciences, as indicated by McDougall's statement that "the fundamental problem of social psychology is the moralization of the individual by the society" (1908) or by Freud's statement that "the sense of guilt is the most important problem in the evolution of culture" (1930). However, it is hard to make clear distinctions between moral development and the broader area of social development and socialization (learning to conform to cultural standards). Such topics as the development of patterns of cooperation, of aggression, or of industry and achievement are generally studied under the broader rubric of socialization, although they may also be viewed as moral development insofar as cooperation or nonaggression are considered "good"
and insofar as they involve learning to conform to cultural rules. The past decade has witnessed a great deal of research on moral development (reviewed in Kohlberg 1963a; 1964; Hoffman 1966) viewed as the particular aspects of socialization involved in internalization, i.e., learning to conform to rules in situations that arouse impulses to transgress and that lack surveillance and sanctions. In this research literature, moral development has usually been conceived of as the increase in internalization of basic cultural rules. Various theories and researchers have stressed three different aspects of internalization: the behavioral, emotional, and judgmental aspects of moral action. A behavioral criterion of internalization is that of intrinsically motivated conformity, or resistance to temptation. Such a conception is implicit in the common-sense notion of "moral character" which formed the basis of earlier American research on morality; Hartshorne and May (Columbia University 1928-1930) defined moral character as a set of culturally defined virtues, such as honesty, which could be measured by observing the child's ability to resist the temptation to break a rule (for example, against cheating) when it seemed unlikely that he would be detected or punished. A second criterion of the existence of internalized standards is the emotion of guilt, that is, of self-punitive, self-critical reactions of remorse and anxiety after transgression of cultural standards. Both psychoanalytic and learning theories of conscience have focused upon guilt as the basic motive of morality. It has been assumed that a child behaves morally to avoid guilt. In addition to conduct that conforms with a standard and to emotional reactions of remorse after transgression, the internalization of a standard implies a capacity to make judgments in terms of that standard and to justify maintaining the standard to oneself and to others. 
This judgmental side of moral development has formed the focus of the work and theory of Piaget (1932) and others (Kohlberg 1966). In recent research, then, answers to the problems of moral development have been sought by examining how socialization factors, such as amount, type, and condition of punishment and reward, or opportunities for identification with parents, are related to individual differences in resistance to temptation, guilt, or moral judgment.

Internalization versus situational factors. Kohlberg has argued (1964; 1966) that the study of internalized socialization has cast a limited light upon the classical problems of moral development. Problems have arisen, in the first place, because internalization does not represent a clear dimension of temporal development. Experimental measures of resistance to temptation (honesty) do not indicate any clear age trends toward greater occurrence of honesty from the preschool years to adolescence. Projective measures of intensity of guilt or moral anxiety also do not indicate clear age trends, except in terms of rather rapid and cognitively based age changes in the years eight to twelve, and these changes are in the direction of defining moral anxiety as a reaction to moral self-judgment rather than to more diffuse external events. While clear trends of development have been found in moral judgment, these trends cannot be easily considered to be trends of internalized socialization as such. In the second place, problems have arisen because a distinctive set of socialization factors has not been found that can be considered as an antecedent of moral internalization. Research results suggest that the conditions which facilitate moral internalization (e.g., parental warmth) are the same conditions which, in general, facilitate the learning of nonmoral cultural rules and expectations. In other words, this research does not indicate a distinct area of internalization or of "conscience" (of moral control linked to guilt feelings) that is distinct from general processes of social learning and social control.

Recent research findings, then, reinforce the skeptical conclusions about both common-sense and psychoanalytic conceptions of a faculty of conscience or superego. Such conclusions were the major results of Hartshorne and May's monumental studies of moral character. These scholars found that the most influential factors determining resistance to temptation to cheat or disobey were situational factors rather than a fixed, individual moral character trait of honesty. The first finding that led to this conclusion was the low predictability of cheating in one situation for cheating in another.
A second finding was that children could not be divided into two groups, the "cheaters" and the "honest children." Children's cheating scores were distributed in bell-curve fashion around an average score indicative of moderate cheating. A third finding was the importance of the expediency aspect of the decision to cheat; that is, the tendency to cheat depends upon the degree of risk of detection and the effort required to cheat. Children who cheated in more risky situations also cheated in less risky situations. Thus, noncheaters appeared to act more from caution than honesty. A fourth finding was that even when honest behavior was not dictated by concern about punishment or detection, it was largely determined by immediate situational factors of group approval and example (as opposed to determination by internal moral values). Some classrooms showed a high tendency to cheat, while other, seemingly identically composed classrooms in the same school showed little tendency to cheat. A fifth finding was that moral knowledge or values had little apparent influence on moral conduct, since the correlations between verbal tests of moral knowledge and experimental tests of moral conduct were low. A sixth finding was that where moral values did seem to be related to conduct, these values were somewhat specific to the child's social class or group. Rather than being a universal ideal, honesty was more characteristic of the middle-class child and seemed less relevant to the lower-class child.

The Hartshorne and May findings, then, suggested that honest behavior is determined by situational factors of punishment, reward, group pressures, and group values, rather than by an internal disposition of conscience or character. The general problem raised by these findings is whether moral traits describing moral character are simply value judgments of behavior made by the group or whether they correspond to some inner disposition in the person and hence help us to understand and predict his behavior. Psychologists have usually used "moral development" to mean the formation of internal standards that control behavior. This conception of an internalized standard seems to require some cross-situational generality. It is not useful to speak of behavior as being determined by an internalized rule like "Be honest" or "Don't cheat" if the rule does not predict the individual's behavior and situational forces do. We do not find it useful to speak of the morality of the dog or the rat, although both have been trained to "resist temptation" in specific situations.
We do assume, however, that the animal's resistance to temptation is produced by anxiety aroused by situational cues, rather than by regard for a moral rule. To the extent that human resistance to temptation is not general across situations to which a moral rule pertains and must therefore be predicted by purely situational factors, it would seem to be no more useful to describe human behavior as the result of conscience than it is to describe animal behavior in these terms. Since MacKinnon's research (1938), studies of morality have generally attempted to cope with Hartshorne and May's findings by defining moral internalization in terms of superego, rather than "moral character." Researchers have recognized that moral action was not the direct result of an
internal disposition toward honesty or moral character and instead have assumed it to be the result of a complex balance of internal and external forces, including strength of drives aroused by temptation, defenses against these drives, situational fears, group pressures, etc. However, one distinctively moral force, guilt, was assumed to be a major determinant of action in situations of moral conflict or temptation. The disposition to feel guilt was assumed to be the result of early childhood identifications and experiences of punishment, rather than of situational forces. Accordingly, while moral behavior might be situation-specific, one might still be able to isolate a general process of moral internalization or guilt formation having the same childhood antecedents, regardless of the particular moral situation involved. These childhood antecedents should then have some value for predicting guilt and resistance to temptation in any situation, even though they did not produce a consistent disposition of moral character.

Subsequent research on parental antecedents of guilt and of resistance to temptation has fulfilled this hope only to a very limited extent. Usually the child-rearing correlates of children's resistance to temptation in one situation have not proven to be correlates of resistance in another, and the child-rearing correlates of projective test measures of guilt have not proven to be correlates of actual moral behavior. Finally, projective measures of guilt have not proven to predict consistently actual resistance to temptation behavior (reviewed in Kohlberg 1963a). Kohlberg (1964) has argued that this more recent research evidence is consistent with the Hartshorne and May findings by suggesting that the variables leading to resistance to temptation arise primarily from the situation rather than from fixed habits, character traits like honesty, or permanent superego dispositions to feel guilt.
Following Burton's analysis of honesty (1963), however, one would agree that there is some personal consistency in honest behavior or some determination of honest behavior by general personality traits. These traits, however, seem not to be traits of moral conscience but rather a set of ego abilities corresponding to common-sense notions of prudence and will. In a tradition of moral psychology dating back to the British associationists and utilitarians, moral character is believed to result from practical judgment or reason. In this view, moral action (action based on rational consideration of how one's action affects others) requires much the same capacities as does prudent action (action
based on rational consideration of how it affects the self's long-range interests). Both require empathy (the ability to predict the reactions of others to action), foresight (the ability to predict long-range consequences of action), judgment (the ability to weigh alternatives and probabilities), and capacity to delay (delay of response and preference for the distant, greater gratification over the immediate, lesser gratification). In psychoanalytic theory these factors are included with other aspects of decision making and emotional control in the concept of ego strength. Some of the ego abilities which have been found to correlate consistently with experimental and rating measures of children's honesty include the following: intelligence (IQ); delay of gratification (preference for a larger reward in the future over a smaller reward in the present); and attention (stability and persistence of attention in simple experimental tasks). [See DECISION MAKING, article on PSYCHOLOGICAL ASPECTS.]
These findings suggest that one can predict honesty about as well from an individual's behavior in cognitive-task or other nonmoral situations as one can from his behavior in other situations involving honesty. This, in turn, implies that the study of moral behavior in terms of early experiences centering on specifically moral training of honesty, guilt, etc., is less likely to be fruitful than is a study of moral behavior in terms of more general experiences relevant to ego development and ego control in nonmoral contexts. Some specific moral determinants. While the findings stressed so far suggest the determination of moral action by nonmoral situational and personality forces, there are also some findings suggesting the determination of action by specifically moral values. This research conclusion should not be taken to mean that there is any direct correspondence between conformity of verbal moral beliefs or attitudes and conformity of moral action. Subjects who say that cheating is very bad or that they would never cheat are as likely to cheat in an experimental situation as are subjects who express a qualified view as to the badness of cheating (studies reviewed in Kohlberg 1966). Apparently, the same willingness to deceive in order to make a good appearance which impels cheating also impels the child to make pious moral statements about cheating. A conclusion more consistent with actual research is that there is considerable correspondence between maturity of moral values (the possession of rational and internal reasons for moral action) and maturity of action in moral-conflict situations.
Clear relations between maturity of moral judgment and mature moral action are found in situations in which social norms are ambiguous or conflicting and in which developmentally advanced values clearly predispose toward one course of action rather than another. Such a correspondence is suggested only to a limited extent by Hartshorne and May's findings of moderate correlations between age-linked measures of moral knowledge and experimental measures of honesty. This limited correspondence occurred because they defined moral knowledge largely in terms of verbal conformity of attitudes rather than maturity of moral reasoning and because resistance to cheating is not clearly a developmentally more mature choice or a choice based on moral reasons in the young age group studied. There is evidence, however, suggesting that resistance to cheating does become a more mature alternative at older ages or higher levels of development than those involved in the Hartshorne and May study. Only 11 per cent of college subjects who were at the level of moral principle in a verbal moral-values test cheated in an experimental situation, whereas half the subjects at a level of conventional moral values cheated (this test is discussed later in this article; the findings cited are reviewed in Kohlberg 1966). With younger subjects, the same relations between moral judgment and cheating are not found, since few of the younger subjects are at the level in which not cheating may be defined as relevant to principles of contract, trust, and equity. While college-age subjects making principled moral judgments were more likely to conform to an experimenter in the matter of moral expectation about cheating, such subjects are markedly more autonomous, or less conforming to an experimenter, where the experimenter's expectations violate the subjects' moral values.
Whereas 75 per cent of the morally principled subjects refused to give increasing levels of shock to an experimental "victim" when ordered to do so by an experimenter, only 13 per cent of the remaining subjects refused to do so.

Major questions. The evidence suggests, then, that the basic social science problem of moral development is not that of accounting for individual differences in moral character as revealed in behavior. Moral behavior that involves conformity to social rule is, on the whole, to be explained as the result of the same situational forces, ego variables, and socialization factors that determine behaviors which have no direct moral relevance. A more distinctive focus of analysis centers instead upon the direct study of the development of moral values, judgments and emotions. The study of actual conduct becomes relevant to problems of moral development insofar as research is able to find links between the child's conduct and the development of his moral values and emotions. The major questions which may be asked about moral development, then, are as follows: What is the origin of distinctively moral concepts and emotions in the child? To what extent does the child's development indicate typical or regular trends of change in these concepts and sentiments? What causes or stimulates these developmental changes in moral concepts and sentiments? To what extent are these developmental changes in moral concepts and attitudes reflected in developmental changes in the child's moral action under conditions of conflict or temptation?

Culture and cultural agents. All of the questions may also be asked about the development of morality in cultures. The present article will not attempt to deal with the development of cultural moralities, a topic still most comprehensively treated in the work of Hobhouse (1906). It must be pointed out, however, that most recent psychological as well as sociological thought has assumed that the problem of the origin of moral values is a cultural problem. It has been assumed that morality is a system of rules and values defined by the culture and that the individual child acquires these ready-made values by general cultural-transmission mechanisms such as reinforcement learning or identification. If this were the case, our understanding of the content of the individual's moral beliefs and emotions should be based on seeing it as a cultural, rather than an individual, product. This culturological approach to moral development was first clearly outlined by Durkheim (1898-1911; 1925), who based it on assumptions about the cultural relativism of moral values which are still widely held but which do not seem to be supported by recent research findings.
Durkheim developed his position out of a critique of the British utilitarians (e.g., Hume 1751; Smith 1759; and Mill 1861). The utilitarians assumed that moral values were the products of individual adults, possessed of language and intelligence, who judged the actions of other individual men. The utilitarians suggested that actions by the self or by others whose consequences to the self are harmful (painful) are naturally deemed bad and arouse anger or punitive tendencies, and actions whose consequences are beneficial (pleasant) are naturally deemed good and arouse affection or approving tendencies. Owing to natural tendencies of empathy, to generalization, and to the need for social agreement, acts are judged good (or bad) when their consequences to others are
good (or bad), even if they do not help (or injure) the self. Logical tendencies lead these judgments of consequences to take the form of judging that act right which does the greatest good for the greatest number. [See UTILITARIANISM and the biographies of HUME; MILL; SMITH, ADAM.]

In his critique of the utilitarians Durkheim pointed to the following four phenomena: (1) Morality is basically a matter of respect for fixed rules (and the authority behind those rules), not of rational calculation of benefit and harm in concrete cases. (2) Morality seems universally to be associated with punitive sentiments, sentiments incompatible with the notion that the right is a matter of human-welfare consequences. (3) From group to group there is wide variation as to the nature of the rules arousing moral respect, punitiveness, and the sense of duty. (4) While modern Western societies divorce morality from religion, the basic moral rules and attitudes in many groups are those concerning relations to gods, not men, and hence do not center on human-welfare consequences.

According to Durkheim, these facts in turn implied the following: The mere fact of the existence of an institutionalized rule endows it with moral sacredness, regardless of its human-welfare consequences. Accordingly, moral rules, attitudes, and consequences originate at the group, rather than the individual, level. The psychological origin of moral attitudes, then, is in the individual's respect for the group, the attitudes shared by the group, and the authority figures who represent the groups. The values most sacred to the individual are those which are most widely shared by, and most closely bind together, the group.

While Durkheim's views of the group mind have been widely questioned, the essential implications of his position have been widely accepted. Assumptions common to Durkheim and Freud underlie the research studies of moral internalization previously discussed.
Unlike Durkheim, Freud (1923; 1930) derived moral sentiments and beliefs from respect for, and identification with, individual parents, rather than from respect for the group. Furthermore, Freud derived this respect and identification from instinctual attachments (and defenses against these attachments) and viewed the central rules of morality as deriving their strength and rigidity from the need to counter these instinctual forces. In spite of these differences, Freud agreed in viewing morality (superego) as fundamentally a matter of respect for concrete rules which are culturally variable or arbitrary, since these rules are a manifestation of social authority, and he agreed in viewing punitive (or self-punitive) sentiments toward
deviation as the clearest and most characteristic expression of moral internalization or respect. The research findings on individual moral judgments in a variety of cultures seem incompatible with either of the extreme views just contrasted (Kohlberg 1966). Moral judgments and decisions in all cultures are a mixture of judgments in terms of individual human-utility consequences and judgments in terms of concrete categorical social rules. The utilitarian derivation of respect for rules from utilitarian consequences is as psychologically unfeasible as Durkheim's derivation of concern for individual welfare consequences from respect for social rules as such. A culturally universal core of moral values and moral development may be found, but it is not based on a culturally universal acceptance of moral principles of the utilitarian variety. Individual moral beliefs and sentiments involving universal principles not directly embodied in concrete social rules often develop and often function at a level of conscious opposition and transcendence of group authority, as the utilitarians implied, but this development itself presupposes the development of respect for group authority discussed by Durkheim. Such, at least, seem the implications of recent research oriented to a third, or "developmentalist," concept of morality. In general, the developmental approach to moral psychology (Baldwin 1897; Mead 1934; McDougall 1908; Hobhouse 1906; Piaget 1932; Kohlberg 1966) has attempted to mediate between the extreme positions represented by the utilitarians and by Durkheim. Moral judgment and emotion based on respect for custom, authority, and the group are seen as one phase or stage in the moral development of the individual rather than as the total definition of the essential characteristics of morality it was for Durkheim. 
Judgment of right and wrong in terms of the individual's consideration of social-welfare consequences, universal principles, and justice is seen as a later phase of development. This phase depends upon and integrates many of the emotional features of the earlier customary phase and does not spring directly from the minds of unsocialized rational adults, as it did for the utilitarians. Both a morality of respect for social authority and an autonomous rational morality are to be understood as arising from the development of a self through the process of taking the roles or attitudes of other selves in interactions occurring in institutionalized patterns. Stages of moral development. As elaborated in Piaget's developmental theory (1932), the child first moves from an amoral stage to Durkheim's stage of respect for sacred rules. This is not so
much respect for the group as it is respect for the authority of individual elders such as the parents. Piaget believes that the cognitive limitations of the child of three to eight lead him to confuse moral rules with physical laws and to view rules as fixed external things, rather than as the instruments of human purposes and values. Piaget believes that the child sees rules as absolutes and confuses rules with things because of his "realism" (his inability to distinguish between subjective and objective aspects of his experience) and because of his "egocentrism" (his inability to distinguish his own perspective on events from that of others). In addition to seeing rules as external absolutes, the young child feels that his parents and other adults are all-knowing, perfect, and sacred. This attitude of unilateral respect toward adults, joined with the child's realism, is believed to lead him to view rules as sacred and unchangeable. Piaget believes that intellectual growth and experiences of role taking in the peer group naturally transform perceptions of rules from external authoritarian commands to internal principles. In essence, he views internal moral norms as logical principles of justice. Of these, he says:
In contrast to a given rule, which from the first has been imposed upon the child from outside . . . the rule of justice is a sort of immanent condition of social relationships or a law governing their equilibrium (Piaget [1932] 1948, p. 196).
The sense of justice . . . is largely independent of [adult precept] and requires nothing more for its development than mutual respect and solidarity which holds among children themselves (p. 195).
By "the sense of justice," Piaget means a concern for reciprocity and equality between individuals. However, norms of justice are not simply matters of abstract logic; rather they are sentiments of sympathy, gratitude, and vengeance which have taken on logical form.
Piaget believes that an autonomous morality of justice develops in children of about age eight to ten and eventually replaces an earlier, heteronomous morality based on unquestioning respect for adult authority. He expects the autonomous morality of justice to develop in all children, unless development is fixated by unusual coerciveness of parents or cultures or by deprivation of experiences of peer cooperation. Certain aspects of Piaget's theory have been supported by subsequent research findings, while others have not. Piaget's stage theory suggests a number of cross-culturally universal age trends in the development of moral judgment. At least three such trends have been found to occur in a variety
of Western, Oriental, and aboriginal (American Indian and Malaysian) cultures (evidence summarized in Kohlberg 1966). These include: (1) Intentionality in judgment. Young children tend to judge an act as bad mainly in terms of its actual physical consequences, whereas older children judge an act as bad in terms of the intent to do harm. (2) Relativism in judgment. The young child views an act as either totally right or totally wrong and thinks everyone views it in the same way. If the young child does recognize a conflict in views, he believes the adult's view is always the right one. In contrast, the older child is aware of possible diversity in views of right and wrong. (3) Independence of sanctions. The young child says an act is bad because it will elicit punishment; the older child says an act is bad because it violates a rule, does harm to others, and so forth. The young child's absolutism, nonintentionalism, and orientation to punishment do not appear to depend upon extensive parental use of punishment. Even the permissively reared child appears to have a natural tendency to define good and bad in terms of absolutism and punishment, a tendency which his awareness of punishment by teachers, police, and other parents seems sufficient to stimulate. While specific punishment practices or cultural ideologies do not appear necessary for the formation of the young child's moral ideology of punishment, they may lead to the persistence of this ideology into adolescence or adulthood. In other words, specific cultural factors appear to stimulate or retard age trends of development on the Piaget dimensions, but they do not appear to actually cause the age shifts or trends observed. 
Piaget, then, appears to be correct in assuming certain characteristics of the young child's moral judgment in any society, characteristics which arise from the child's cognitively immature interpretation of acts labeled good and bad by adults, according to the derivation of their goodness or badness from their association with good and bad consequences of physical harm—punishment and reward. However, his interpretation of these aspects of the young child's morality—as deriving from the child's sense of the sacredness of the rules and of adult authority—has not been supported. Piaget (1932) attempts to demonstrate that the young child's attitude toward rules is one of unilateral sacredness by observations of children's behavior and beliefs about the rules of the game of marbles. Swiss children are quoted as saying that the rules of the game can never be changed, that the rules have existed from the beginning of time and have been invented and handed down by God, the head
of the state, or the father. More systematic research suggests that attitudes of rigidity toward game rules seem to decline with age in American children of five to twelve but that attitudes expressing the rigidity or sacredness of moral rules or of laws increase in this period, rather than decline. The young child's ignoring of subjective factors such as intention, then, is not based on respect for sacred rule but on a more or less pragmatic concern for consequences. That young children orient more or less pragmatically to punishment rather than to sacred rule is indicated by a study by Kohlberg, Krebs, and Brener (Kohlberg 1963b). Young children were asked to judge a helpful, obedient act (attentively watching a baby brother while the mother is away) followed by punishment (the mother returns and spanks the baby-sitting child). Most four-year-olds, ignoring his act, say the obedient boy was bad because he got punished. By age seven, a majority say the boy was good, not bad, even though he was punished. Piaget also appears to be incorrect in postulating a general trend from an authoritarian to a peer-group, or democratic, ethic. Postulated general age shifts from obedience to authority to peer loyalty, from justice based on conformity to justice based on equality, have not been generally found. Peer-group participation has not been found to be a factor facilitating development on the Piaget dimensions. More broadly, however, Piaget is correct in assuming a culturally universal age development of a sense of justice, involving progressive concern for the needs and feelings of others and elaborated conceptions of reciprocity and equality. As this sense of justice develops, however, it reinforces respect for authority and for the rules of adult society; it also reinforces more informal peer norms, since adult institutions have underpinnings of reciprocity, equality of treatment, service to human needs, etc.
The last-mentioned conclusion is derived primarily from cross-cultural research by this writer and his colleagues on children's responses to a number of hypothetical moral dilemmas, such as whether to steal an expensive drug to save one's dying wife. In this research every sentence or response of a subject could be reliably classified into one of six stages that have also been divided into three major levels of development as follows:
Level I. Premoral:
Stage 1. Punishment and obedience orientation.
Stage 2. Naive instrumental hedonism.
Level II. Morality of conventional role conformity:
Stage 3. Good-boy morality of maintaining good relations, approval by others.
Stage 4. Authority maintaining morality.
Level III. Morality of self-accepted moral principles:
Stage 5. Morality of contract, of individual rights, and of democratically accepted law.
Stage 6. Morality of individual principles of conscience.
Each of these six general stages of moral orientation could be defined in terms of its specific stance on some 32 aspects of morality. For example, with regard to the aspect "motivation for rule obedience or moral action," the six stages were defined as follows:
Stage 1. Obey rules to avoid punishment.
Stage 2. Conform to obtain rewards, have favors returned, and so on.
Stage 3. Conform to avoid disapproval, dislike by others.
Stage 4. Conform to avoid censure by legitimate authorities and resultant guilt.
Stage 5. Conform to maintain the respect of the impartial spectator judging in terms of community welfare.
Stage 6. Conform to avoid self-condemnation.
It is evident that this aspect of moral development represents successive degrees of internalization of moral sanctions. Other aspects of moral development involve successive cognitive reorganization of the meaning of culturally universal values. As an example, in every society human life is a basic value, even though cultures differ in their definition of the universality of this value or of the conditions under which it may be sacrificed for some other value. With regard to the value of life, the six stages are defined as follows:
Stage 1. The value of a human life is confused with the value of physical objects and is based on the social status or physical attributes of its possessor.
Stage 2. The value of a human life is seen as instrumental to the satisfaction of the needs of its possessor or of other persons.
Stage 3. The value of a human life is based on the empathy and affection of family members and others toward its possessor.
Stage 4. Life is conceived as sacred in terms of its place in a categorical moral or religious order of rights and duties.
Stage 5. Life is valued both in its relation to community welfare and as a universal human right.
Stage 6. Life is valued as sacred and as representing a universal human value of respect for the individual.
It is evident that these stages represent a progressive disentangling or differentiation of moral values and judgments from other types of values and judgments. With regard to the particular aspect—the value of life—the moral value held by the person at stage 6 has become progressively disentangled from status and property values (stage 1), from his instrumental uses to others (stage 2), from the actual affection of others for him (stage 3), etc. While philosophers have been unable to agree upon any ultimate principle of the good which would define "correct" moral judgments, most philosophers agree upon the characteristics which make a judgment a genuine moral judgment (Hare 1952; Kant 1785). Moral judgments are judgments about the good and the right of action. However, not all judgments of "good" or "right" are moral judgments; many are judgments of aesthetic, technological, or prudential goodness or rightness. Unlike judgments of prudence or aesthetics, moral judgments tend to be universal, inclusive, consistent, and based on objective, impersonal, or ideal grounds. "She's really great; she's beautiful and a good dancer" and "The right way to make a martini is five to one" are statements about the good and right which are not moral judgments, since they lack these characteristics. If we say, "Martinis should be made five to one," we are making an aesthetic judgment; we are not prepared to say that we want everyone to make them that way, that they are good in terms of some impersonal ideal standard shared by others, and that we should all make five-to-one martinis whether we wish to or not. 
In a similar fashion, when a ten-year-old answers the "moral should" question "Should Joe tell on his older brother?" in stage 1 terms (the probabilities of getting beaten up by his father and by his brother), he does not answer with a moral judgment that is universal (applies to all brothers in that situation and ought to be agreed upon by all people thinking about the situation) or one that has any impersonal or ideal grounds. In contrast, stage 6 statements not only use specifically moral words like "morally right" or "duty" but use them in a moral way: e.g., phrases such as "regardless of who it was" and "by the law of nature or of God" imply universality; "Morally, I would do it in spite of fear of punishment" implies
impersonality and ideality of obligation, and so on. Thus, the responses of subjects at lower levels to moral-judgment matters fail to be moral responses the same way that the value judgments of subjects at higher levels about aesthetic or morally neutral matters fail to be moral responses. In this sense we can define a moral judgment as "moral" without considering its content (the action judged) and without considering whether it agrees or not with our own judgments or standards. It is also evident that moral development in terms of these stages is a progressive movement toward basing moral judgment on concepts of justice. To base a moral duty on a concept of justice is to base that duty on the right of an individual; to judge an act wrong is to judge it as violating such a right. The concept of a right implies a legitimate expectancy, a claim which I may expect others to agree I have. While rights may be grounded on sheer custom or law, there are two general grounds for a right—equality and reciprocity (including exchange, contract, and the reward of merit). At stages 5 and 6 all the demands of statute or of moral (natural) law are grounded on concepts of justice, i.e., on agreement, contract, and the impartiality of the law and its function in maintaining the rights of individuals. It is apparent that the stages just defined are stages in the development of moral judgment. Rather similar stages, however, have been independently arrived at by Peck and Havighurst (1960), who include emotional and behavioral as well as judgmental traits in their stage definitions. The progressions, or stages, just described imply something more than age trends. In the first place, they imply an invariant sequence in which each individual child must go step by step through each of the kinds of moral judgment outlined. 
It is, of course, possible for a child to move at varying speeds and to stop (become "fixated") at any level of development, but if he continues to move upward, he must move in accord with these steps. The longitudinal study of American boys at ages 10, 13, 16, and 19 suggests that this is the case (Kohlberg 1966). Second, a stage concept implies universality of sequence under varying cultural conditions. It implies that moral development is not merely a matter of learning the verbal values or rules of the child's culture but reflects something more universal in development, which would occur in any culture. In general, the stages in moral judgment just described appear to be culturally universal. Middle-class urban, lower-class urban, and tribal or rural
village boys aged 10 to 21 have been studied in Taiwan, Yucatan, Turkey, and the United States. In all groups, stage 1 appears first and becomes less prevalent with age. Stage 2 appears next and then stages 3 and 4, which increase with age. In all middle-class groups, and some lower-class groups, stages 5 and 6 appear at later ages (primarily ages 16 to 21). These last two stages are not found among tribal or village peasant groups (Kohlberg 1966). Factors in development. It seems obvious that moral stages must primarily be the products of the child's interaction with others, rather than the direct unfolding of biological or neurological structures. However, the emphasis on social interaction does not mean that stages of moral judgment directly represent the teaching of values by parents or direct "introjection" of values by the child. Theories of moral stages view the influence of parental training and discipline as only a part of a world or social order perceived by the child. The child can internalize the moral values of his parents and culture and make them his own only as he comes to relate these values to a comprehended social order and to his own goals as a social self. Culturally universal invariant sequences in the child's social concepts and values imply that there are some universal structural dimensions or invariants in the social world analogous to those in the physical world. Universal physical concepts have been found because there is a universal physical structure which underlies the diversity of physical arrangements in which men live and the diversities of formal physical theories held in various cultures. In somewhat analogous fashion, the social stages imply universal structural dimensions of social experience; this is based on the fact that social and moral action involves the existence of a self in a world composed of other selves playing complementary roles organized into institutional systems.
In order to play a social role in the family, school, or society, the child must implicitly take the role of others toward himself and toward others in the group. One side of such role taking is represented by acts of reciprocity or complementarity (Mead 1934), the other side by acts and attitudes of sameness, sharing, and imitation (Baldwin 1897). These tendencies, intimately associated with the development of language and symbolism, form the basis of all social institutions which represent various patternings of shared or complementary expectations. [See INTERACTION; LANGUAGE, article on LANGUAGE DEVELOPMENT; ROLE, article on PSYCHOLOGICAL ASPECTS.] Such institutional expectations have per se a
normative or moral component involving rights and duties and require moral role taking. While the concrete definitions of required behavior in given roles are relatively fixed throughout age development, the perspectives in which these behaviors are related to a moral order undergo successive stagelike transformation. Required behavior may be based upon power and external compulsion (stage 1), upon a system of exchanges and need satisfactions (stage 2), upon the maintenance of legitimate expectations (stages 3 and 4), or upon ideals or general logical principles of social organization (stages 5 and 6). The order in this development is largely the result of general aspects of cognitive development. Concepts of legitimate expectations presuppose concepts of reciprocity and exchange, while general principles of social organization and justice presuppose concepts of legitimate expectations. The large cognitive component of moral role taking is suggested by correlations between the development of moral judgment and cognitive advance on intelligence tests or on Piaget's cognitive-stage tasks. Intelligence may be taken as a necessary, but not sufficient, cause of moral advance. All morally advanced children are bright, but not all bright children are morally advanced. Cognitive advance is associated with emotional aspects of moral role taking (e.g., the movement of moral motives from punishment to disapproval to self-condemnation) as well as with more intellectual forms of moral role taking in terms of the values and the rights of others (e.g., the movement from conceiving of life as a physical value to conceiving it as based on a universal respect for the human individual). In addition to cognitive advance, opportunities for participation and role taking in all the basic groups to which the child belongs appear to be important for moral development.
Piaget's theory (1932) has stressed the peer group as a source of moral role taking, while other theories (Mead 1934) stress participation in the larger secondary institutions or participation in the family itself (Baldwin 1897). Research results suggest that all these opportunities for role taking are important and that all operate in a similar direction by stimulating moral development rather than producing a particular value system. In three divergent cultures studied, middle-class children were found to be more advanced in moral judgment than matched lower-class children (Kohlberg 1967). This was not because the middle-class children heavily favored a certain type of thought which corresponded to the prevailing middle-class pattern.
Instead, middle-class and working-class children seemed to move through the same sequences, but the middle-class children seemed to move faster and farther. Similar but even more striking differences were found between peer-group participators (popular children) and nonparticipators (unchosen children) in the American sample. Studies underway suggest that these peer-group differences partly arise from, and partly add on to, prior differences in opportunities for role taking in the child's family (family participation, communication, emotional warmth, sharing in decisions, awarding responsibility to the child, pointing out consequences of action to others). Our discussion has stressed the role of intellectual advance and of social participation and role-taking opportunities in family, peer group, and secondary institutions as they facilitate the development of moral judgment. While the evidence is less complete, these same factors appear to correlate with clinical ratings of maturity of moral character (Peck & Havighurst 1960) and experimental or rating measures of honesty and of moral autonomy (Kohlberg 1967; Columbia University 1928-1930). Parental identification and guilt. It is important to note that some of the findings used here to argue for the centrality of role-taking opportunities in moral development have also been interpreted as indicating the centrality of parent identifications in conscience formation. In psychoanalytic and neopsychoanalytic discussions, identification has meant the general tendency to take the role of the punishing and criticizing other; that is, in order to criticize or punish himself after transgression, the child must take the role of another toward himself. Otherwise he would continue to view himself and the situation as he did when he performed the act. For self-criticism to be guilt, the child must "take the role of the other" in a deep or internalized sense, regardless of whether the other knows about his transgression.
Such deep, fixed role taking or identification has been variously hypothesized to result from needs to substitute for an absent or rejecting love object (Freud 1930; Sears et al. 1957), from the need to defend against fear of aggression (A. Freud 1936), or from "status envy" needs (Whiting 1960). It is evident that identification is a special or particular form of role taking as previously defined. As opposed to more general theories of role taking, identification theories of moral formation have assumed: (a) that the child's role taking of parents represents a unique, special, and necessary basis for conscience formation rather than one of
a number of general role-taking relationships; (b) that the basic moral role-taking tendencies leading to conscience formation are formed in early childhood, when the child's weakness can create overwhelmingly strong tendencies to love, fear, and respect and lead to introjecting adult figures and their prescriptions; (c) that basic role taking of parents leads to direct introjection, transfer, or mimicking of fixed parental standards rather than being a step toward the development of general role-taking tendencies which move out into wider social realms and so promote moral advance. In general the research findings suggest the importance of children's role taking of their parents in moral development, but they do not support the notion that conscience is a unique product of parent identifications (Kohlberg 1963a; 1963b; 1964; Hoffman 1966). Parental warmth, children's positive attitudes toward parents, and children's expressed desire to be like their parents correlate positively with acceptance of the conventional moral code as measured by tests of conventional expressions of guilt and of moral judgment. Little evidence, however, has been found to indicate that these variables are correlated with the fixed introjection of particular, individual parental moral values. Furthermore, little evidence has been found to suggest that a close bond to one or both parents is crucially necessary for conscience formation. The most relevant studies come from comparison of kibbutz-reared and family-reared children in Israel. While kibbutz children have regular contacts with parents in evenings and on holidays, parents are little involved in making or enforcing moral or socialization demands upon the child. This task is primarily the function of the nurse-caretaker, the teacher, and the peer group.
Few clear differences have been found between these children and city children in moral judgment, in projective measures of guilt, or in naturalistic observations of moral control of behavior (studies reviewed in Kohlberg 1964). It would appear, then, that affectional relationships (or identification) with parents are important in moral development, more because positive and affectional relations to others are generally conducive to ego development and to role taking and acceptance of social standards than because they provide a unique and direct basis for conscience formation. [See AFFECTION.] Common psychological notions that parental punishment and resultant guilt play a critical role in moral development seem even more questionable in the light of research findings. It seems self-evident that self-induced pain after transgression (guilt) must originate largely from experiences
of transgression-related pain caused by others (punishment). Some core experiences of punishment, or at least of blame, are presumably necessary for the development of guilt reactions, and even the most permissively raised children experience them. Punishment, however, does not directly produce guilt, since the very young punished child does not experience guilt. Furthermore, there does not appear to be a direct relationship between amount of punishment and amount of guilt. We are also not able to say that the more psychologically painful the punishment, the more likely it is to produce guilt. Physical punishment seems to show a low positive correlation with children's use of punishment fantasies as consequences of transgression, but it does not relate positively to types of transgression reaction more representative of guilt. Even for punishment reactions, young children whose parents report they never use physical punishment may make heavy use of it in doll-play transgression stories. Punishment by love withdrawal (ignoring, isolation, a mother's statements that she doesn't like her child when he is bad) has been thought to be especially critical in producing guilt, because loss of love is believed to be more psychologically painful or anxiety-arousing than physical punishment and because it would be expected to lead to implicit role taking or identification with the parent's disapproval. However, love withdrawal has not been found to relate to self-critical guilt (Hoffman 1966). Rather than showing striking or unique relationships to punishment experiences, projective measures of internal guilt show the same general age trends and social correlates as measures of maturity of moral judgment in the school years. This suggests that the development of conscious internal standards of judgment and of empathic and role-taking capacities is the major factor in the genesis of guilt (Kohlberg 1964; Hoffman 1966).
The findings just reviewed, together with findings presented initially in this article, are inconsistent with the notion of a fixed moral structure (conscience-guilt) developing out of experiences of parental punishment and reward and determining moral behavior. This conclusion is not inconsistent with the obvious importance of punishment and reward in the short-term situational control of "moral" (conforming) behavior, as suggested by the Hartshorne and May findings. Experimental studies that manipulate punishment parameters show striking effects upon short-term resistance to temptation in given situations (Aronfreed 1966). In contrast, naturalistic correlational studies of
parameters of parental punishment and reward suggest few clear or persisting effects of these parameters upon later moral behavior (findings reviewed in Kohlberg 1963a). Thus, S-R reinforcement theories may be useful in explaining short-run learning of behavioral conformity, without being adequate for the understanding of what we have considered as characteristic of moral development. Neurotic behavior. In addition to distinguishing between moral development and situational conformity with regard to punishment-guilt factors, it is important to distinguish between moral development and the formation of neurotic inhibitions, anxieties, and punitive feelings resulting from punishment-guilt factors. It is obvious that neurotics suffer from strong feelings of anxiety, depression, low self-esteem, and inhibition. To a considerable extent, psychopathologists have held that these feelings result from guilt experiences resulting in turn from real or fantasied childhood transgressions and associated punishments, and they have developed general theories of moral development from these clinical data. The research findings on guilt and moral factors in neurosis are sparse, but they do suggest limitations to the notion that neurotics suffer from too much general guilt or moral restraint. There is little reason to believe that neurotics are more scrupulous about moral ideals or more morally restrained in their conduct than normal people. Neurotic children have not been found to be higher (or consistently lower) than normal children in projective measures of guilt, in moral judgment, or in resistance to dishonest behavior. (In contrast, pathologically delinquent children are markedly lower on guilt and moral judgment than are either neurotic or normal children.)
While neurotic symptoms do not seem to be explainable as the result of too much general guilt or moral concern resulting from childhood experiences, it does seem plausible to view distinctively "neurotic" moral anxieties and inhibitions (anxieties about matters viewed as morally permissible by the general culture) as the result of childhood experiences and fantasies of parental punishment. Clinical observations as to the genesis of these idiosyncratic moral anxieties may be valid, then, even though they have not provided a useful model for the general understanding of moral development. Such understanding rests on further elaboration of the processes of ego development as these interact with social experiences of which the moral is a universal dimension. LAWRENCE KOHLBERG
[Directly related are the entries DEVELOPMENTAL PSYCHOLOGY and SOCIALIZATION. Other relevant material may be found in CONFORMITY; DELINQUENCY; JUSTICE; LEARNING, article on REINFORCEMENT; PERSONALITY, article on PERSONALITY DEVELOPMENT; PSYCHOANALYSIS; ROLE; SYMPATHY AND EMPATHY; UTILITARIANISM; and in the biography of DURKHEIM.] BIBLIOGRAPHY ARONFREED, J. 1966 Conduct and Conscience: The Experimental Study of Internalization. Unpublished manuscript. BALDWIN, JAMES M. (1897) 1906 Social and Ethical Interpretations in Mental Development: A Study in Social Psychology. 4th ed., rev. & enl. New York: Macmillan. BOWERS, WILLIAM J. 1964 Student Dishonesty and Its Control in College. Cooperative Research Project No. OE 1672. Unpublished manuscript, Columbia Univ., Bureau of Applied Social Research. BURTON, ROGER V. 1963 Generality of Honesty Reconsidered. Psychological Review 70:481-499. COLUMBIA UNIVERSITY, TEACHERS COLLEGE 1928-1930 Studies in the Nature of Character. 3 vols. New York: Macmillan. -> Volume 1: Studies in Deceit, by Hugh Hartshorne and Mark A. May. Volume 2: Studies in Service and Self-control, by Hugh Hartshorne, Mark A. May, and J. B. Maller. Volume 3: Studies in Organization of Character, by Hugh Hartshorne, Mark A. May, and F. K. Shuttleworth. DURKHEIM, EMILE (1898-1911) 1953 Sociology and Philosophy. Glencoe, Ill.: Free Press. -> Written between 1898 and 1911. First published posthumously in French. DURKHEIM, EMILE (1925) 1961 Moral Education: A Study in the Theory and Application of the Sociology of Education. New York: Free Press. -> First published posthumously in French. FREUD, ANNA (1936) 1957 The Ego and the Mechanisms of Defense. New York: International Universities Press. -> First published as Das Ich und die Abwehrmechanismen. FREUD, SIGMUND (1923) 1961 The Ego and the Id. Volume 19, pages 12-63 in Sigmund Freud, The Standard Edition of the Complete Psychological Works of Sigmund Freud. London: Hogarth; New York: Macmillan.
-> First published as Das Ich und das Es. FREUD, SIGMUND (1925) 1961 The Resistances to Psycho-analysis. Volume 19, pages 213-222 in Sigmund Freud, The Standard Edition of the Complete Psychological Works of Sigmund Freud. London: Hogarth; New York: Macmillan. -» First published in German. FREUD, SIGMUND (1930) 1958 Civilization and Its Discontents. Garden City, N.Y.: Doubleday. -» First published as Das Unbehagen in der Kultur. HARE, RICHARD M. 1952 The Language of Morals. Oxford: Clarendon. HOBHOUSE, LEONARD T. (1906) 1951 Morals in Evolution: A Study in Comparative Ethics. With a new introduction by Morris Ginsberg. 7th ed. 2 vols. London: Chapman. HOFFMAN, M. 1966 Childrearing Antecedents of Moral Internalization. Unpublished manuscript. HUME, DAVID (1751) 1957 An Inquiry Concerning the Principles of Morals. New York: Liberal Arts Press. KANT, IMMANUEL (1785) 1949 Fundamental Principles of the Metaphysics of Morals. New York: Liberal Arts Press. -» First published as Grundlegung zur Metaphysik der Sitten.
KOHLBERG, LAWRENCE 1963a Moral Development and Identification. Pages 277-332 in National Society for the Study of Education, Child Psychology. 62d Yearbook. Edited by Harold Stevenson. Univ. of Chicago Press. KOHLBERG, LAWRENCE 1963b The Development of Children's Orientations Toward a Moral Order. Part 1: Sequence in the Development of Moral Thought. Vita humana 6:11-33. KOHLBERG, LAWRENCE 1964 Development of Moral Character and Moral Ideology. Volume 1, pages 383-431 in Martin Hoffman and Lois Hoffman (editors), Review of Child Development Research. New York: Russell Sage Foundation. KOHLBERG, LAWRENCE 1966 Stage and Sequence: The Developmental Approach to Moralization. Unpublished manuscript. KOHLBERG, LAWRENCE 1967 The Development of Children's Orientations Toward a Moral Order. Part 2: Social Experience, Social Conduct, and the Development of Moral Thought. Unpublished manuscript. McDOUGALL, WILLIAM (1908) 1950 An Introduction to Social Psychology. 30th ed. London: Methuen. -> A paperback edition was published in 1960 by Barnes and Noble. MACKINNON, DONALD W. 1938 Violation of Prohibitions. Pages 491-501 in Henry A. Murray (editor), Explorations in Personality. New York: Oxford Univ. Press. MEAD, GEORGE H. (1934) 1963 Mind, Self and Society From the Standpoint of a Social Behaviorist. Edited by Charles W. Morris. Univ. of Chicago Press. -> Published posthumously. MILL, JOHN STUART (1861) 1957 Utilitarianism. Indianapolis, Ind.: Bobbs-Merrill. PECK, ROBERT F.; and HAVIGHURST, ROBERT J. 1960 The Psychology of Character Development. New York: Wiley. PIAGET, JEAN (1932) 1948 The Moral Judgment of the Child. Glencoe, Ill.: Free Press. -> First published in French. A paperback edition was published in 1965. SEARS, ROBERT R.; MACCOBY, ELEANOR E.; and LEVIN, HARRY 1957 Patterns of Child Rearing. Evanston, Ill.: Row, Peterson. SMITH, ADAM (1759) 1948 The Theory of Moral Sentiments. Pages 3-277 in Adam Smith's Moral and Political Philosophy. Edited by Herbert Schneider.
New York: Hafner. WHITING, JOHN W. M. 1960 Resource Mediation and Learning by Identification. Pages 112-126 in Ira Iscoe and Harold W. Stevenson (editors), Personality Development in Children. Austin: Univ. of Texas Press.
MORALE See ATTITUDES; GROUPS, article on GROUP PERFORMANCE; INDUSTRIAL RELATIONS; LEADERSHIP; MILITARY PSYCHOLOGY; WORKERS. MORALS See MORAL DEVELOPMENT. MORBIDITY See EPIDEMIOLOGY; HEALTH; ILLNESS; MEDICAL CARE; MENTAL DISORDERS; PUBLIC HEALTH. MORES See NORMS; VALUES; the biography of SUMNER.
MORGAN, CONWY LLOYD Conwy Lloyd Morgan (1852-1936), habitually known as Lloyd Morgan because of his common surname, was a British comparative psychologist and psychological philosopher who, coming under the influence of Thomas H. Huxley, interested himself in the philosophy of evolution and of human conduct and in the intelligent behavior of animals in their relation to each other and to man. Lloyd Morgan, the son of a solicitor, James A. Morgan, was born in London. He received his early education at the Royal Grammar School in Guildford near London, after his parents had moved from the city. He was already reading philosophy, but to prepare himself to earn a living he enrolled in the School of Mines in London, with the intention of becoming a mining engineer. By chance, at a dinner at the school he found himself seated next to the great Huxley, 27 years his senior. Huxley quizzed the young student of mining about his intellectual interests and recommended that he finish his present training and then shift to work in biology with Huxley at the Royal College of Science. Thereafter Huxley had a new disciple. Lloyd Morgan was much more interested in science than in mining. On completing his training at the school, he accepted a post as a tutor, which took him on tour through North America and Brazil. After that he did indeed go to study with Huxley; Adolf C. Bastian, later the defender of the doctrine of the spontaneous generation of life, was a fellow pupil. In 1878 he obtained the post of lecturer at the Diocesan College at Rondebosch in South Africa. There he taught physical science, English literature, and constitutional history but devoted his leisure to studying geology and natural history. It was becoming clear that teaching was his forte. In 1883 he was appointed a lecturer in geology and zoology at University College, Bristol, where he was to remain for the rest of his professional life.
In 1887 he was made principal of the college, a post equivalent to appointment to a permanent chair. Much later, in 1910, when the college became a university, he acted as vice-chancellor for a year but thereafter returned to teaching, the occupation that he greatly preferred, as professor of psychology and ethics. In 1919 he retired, continuing in the suburbs of Bristol his active life of writing. Then finally he withdrew to Hastings on the English Channel, where he died in 1936. For fifty years at Bristol, Lloyd Morgan, besides being concerned with teaching and college administration, lived the life of a philosopher of nature, an observer of animal behavior, and a writer of
many essays and a dozen books on evolution, especially the evolution of mind, as well as on comparative psychology, especially the emergence of consciousness and the growth of intelligence in the evolutionary scale. (The term "comparative psychology" had been coined by G. J. Romanes in 1882, the year of Darwin's death. Lloyd Morgan's best-known book, An Introduction to Comparative Psychology, was published in 1894, the year of Romanes' death.) Lloyd Morgan was constantly on the alert for significant incidents in the behavior of animals: he brought together the reports of others on this topic, watched his own dogs and cats, and arranged little experiments with them and with newly hatched chicks and ducklings in order to study the distinction between instinctive and learned behavior. He wrote about instinct, learning, intelligence, association, imitation, reasoning, and the perception of relations. Always he compared animals with respect to one another and to man, with especial reference to the scale of mental evolution. He is best known for what has come to be called Lloyd Morgan's canon, which demands parsimony in the inference of an animal's place on the scale of mind from its behavior: "In no case may we interpret an action [of an animal] as the outcome of the exercise of a higher psychical faculty, if it can be interpreted as the outcome of the exercise of one which stands lower in the psychological scale" (1894, p. 63). He used this canon consistently throughout his Comparative Psychology and his later books, always rejecting the inference of the more nearly human level of consciousness in favor of whatever simpler account seemed adequate. Lloyd Morgan is also known for his support of the doctrine of emergent evolution, a view which he shared with his philosophical contemporary Samuel Alexander and which they derived in part from Henri Bergson's concept of elan vital and in part from the concept of entelechy as advocated by the vitalist Hans Driesch. 
Lloyd Morgan tells how quite early he tried to convince a skeptical Huxley that evolution occurs by discrete steps. Evolutionary emergence is equivalent to chemical emergence: the various observable properties of water cannot be predicted from the observable properties of hydrogen and oxygen. Lloyd Morgan presented this view as applied to new biological organizations in his Gifford lectures, published as Emergent Evolution in 1923, shortly after his retirement from Bristol, and again in The Emergence of Novelty of 1933, his last publication of importance, for he was then 81. EDWIN G. BORING
[For the historical context of Morgan's work, see EVOLUTION; for discussion of the subsequent development of Morgan's ideas, see ETHOLOGY; INSTINCT; PSYCHOLOGY, articles on COMPARATIVE PSYCHOLOGY and PHYSIOLOGICAL PSYCHOLOGY.] WORKS BY C. L. MORGAN
1885 The Springs of Conduct: An Essay in Evolution. London: Routledge. 1891 Animal Life and Intelligence. London: Arnold; New York: Scribner. (1894) 1906 An Introduction to Comparative Psychology. London: Scott. 1896 Habit and Instinct. London and New York: Arnold. 1900 Animal Behaviour. London and New York: Arnold. 1912 Instinct and Experience. London: Methuen; New York: Macmillan. 1923 Emergent Evolution: The Gifford Lectures Delivered in the University of St. Andrews in the Year 1922. London: Williams & Norgate. 1926 Life, Mind and Spirit: Being the Second Course of the Gifford Lectures. London: Williams & Norgate. 1929 Mind at the Crossways. London: Williams & Norgate. 1930 The Animal Mind. London: Arnold; New York: Longmans. 1932 Autobiography. Volume 2, pages 237-264 in A History of Psychology in Autobiography. Worcester, Mass.: Clark Univ. Press. 1933 The Emergence of Novelty. London: Williams & Norgate; New York: Holt. SUPPLEMENTARY BIBLIOGRAPHY
Conwy Lloyd Morgan. 1932 Volume 3, pages 952-955 in Psychological Register. Worcester, Mass.: Clark Univ. Press; Oxford Univ. Press. FIELD, G. C. 1949 Morgan, Conwy Lloyd: 1852-1936. Pages 627-628 in Dictionary of National Biography: 1931-1940. Oxford Univ. Press. GRINDLEY, G. C. 1936 Professor C. Lloyd Morgan. British Journal of Psychology 27:1-3. PARSONS, J. H. 1936 Conwy Lloyd Morgan. Royal Society of London, Obituary Notices of Fellows 2:25-27.
MORGAN, LEWIS HENRY Lewis Henry Morgan (1818-1881), American anthropologist, was born near Aurora, New York, of a Welsh family who had settled in New England as early as 1640. He attended Cayuga Academy in Aurora before going to Union College, from which he was graduated in 1840. He then returned to Aurora, where he studied law. In 1844 he went to Rochester and established himself as an attorney. In 1851 he married his cousin, Mary Elizabeth Steele, by whom he had three children. In the 1850s Morgan invested in mining and railroad ventures in the Upper Peninsula of Michigan. From these investments he acquired a modest fortune which he bequeathed to the University of Rochester. He served two terms in the New York State legislature, one in the Assembly and one in the Senate.
He tried repeatedly, but without success, to obtain a position as United States minister to a foreign country. Morgan never served on the staff of a scientific or educational institution; he declined President A. D. White's offer of a chair of ethnology at Cornell University. He retired from his legal practice in 1862, although he continued to represent some of the Michigan corporations in which he had invested. He resided in Rochester until his death. Morgan's ethnological career began when he joined a young men's club, the Grand Order of the Iroquois, in Aurora after graduating from college. In order to pattern this club upon the famous Iroquois confederacy, Morgan undertook an exhaustive study of the Iroquois, their history, and their culture, particularly of the Seneca tribe. The results of his research were published in 1851 as The League of the Ho-de-no-sau-nee, or Iroquois, dedicated to his friend and co-worker Ely S. Parker, a Seneca Indian. Morgan was adopted into the Seneca tribe in 1846, but he did not "live the life of an Indian among them for years," as some have assumed. He was, however, a lifelong and stanch champion of the American Indians in their losing struggle against encroachment by the white man. After a few fallow years, Morgan's interest in ethnology was revived in 1856, when he attended a meeting of the American Association for the Advancement of Science. He returned to further consideration of the Seneca method of designating relatives, which differed radically from Anglo-American usage at many points. In 1858 he discovered that the same system of terminology existed among the Ojibway Indians who lived at Marquette, Michigan. It occurred to Morgan that this system might be widespread and that if it could be found in Asia, the Asiatic origin of the American Indians could be demonstrated. He at once began a vigorous and comprehensive program of field research and circulated questionnaires to distant lands in the hope of obtaining data.
His monumental Systems of Consanguinity and Affinity of the Human Family, published by the Smithsonian Institution in 1871, was the result. He believed that his data definitely proved that the American Indians had migrated to America from Asia. But, more important, his interpretation of the kinship terminologies led him to formulate a comprehensive theory of social evolution, according to which forms of the family evolved by stages from an original state of promiscuity, culminating in monogamy in the stage of civilization.
Morgan's researches and writings led to the publication in 1877 of his best-known and most influential work, Ancient Society. The book attempts to embrace culture in its entirety, but its emphasis is upon the evolution of society. It is divided into four parts, titled (1) "Growth of Intelligence Through Inventions and Discoveries"; (2) "Growth of the Idea of Government"; (3) "Growth of the Idea of the Family"; (4) "Growth of the Idea of Property." Two theories of evolution are used: an idealistic and a materialistic one. According to the idealistic one, institutions are explained as the accumulated product of germs of thought in the human mind; this concept was widely held by Morgan's predecessors and contemporaries. The second theory rests on zoological, ecological, and technological explanations. Man is seen as an animal species effecting life-sustaining adjustments to his habitat by technological means; culture evolves as control by these means is improved and extended. Morgan tended to view the evolution of culture as the progress of the human mind, but he did not avoid the word "evolution" as some have claimed. He divided man's career, which is "one in source, one in experience, and one in progress," into three great stages: savagery, barbarism, and civilization. Each stage was subdivided into upper, middle, and lower "statuses." He likened stages of sociocultural development to successive geological strata. Ancient Society has a number of defects and shortcomings. Morgan's whole theory of the evolution of the family has now been abandoned as obsolete. But this work was the first impressive attempt to provide a scientific account of the origin and evolution of civilization and to illustrate the successive stages of this development by the use of descriptions of specific cultures. For examples Morgan drew on ethnographic knowledge of such societies as the aborigines of Australia and America and on classical sources concerned with the ancient Greeks and Romans. Ancient Society became a classic in Marxist literature.
Marx and Engels were attracted to Morgan's writings: his emphasis on the role of property in the evolution of culture, his criticism of the "property career" of modern societies, and his predictions of a nobler and a more just social order to come unquestionably drew Marx and Engels to his work. Above all, Ancient Society provided the best available account in Marx's day of how culture had actually evolved, and emphasized (or called attention to) the revolutionary character of some cultural changes. Marx died before he was able to write a book he had planned about Morgan's work; in his stead Engels wrote The Origin of the Family, Private Property and the State (1884). Therein he
credited Morgan with having independently formulated the Marxist materialist conception of history. Yet Morgan's lecture entitled Diffusion Against Centralization (1852) as well as several other writings make it clear that he had not clearly grasped the conception of a proletarian revolutionary overthrow of the capitalist order and that he was an enthusiastic admirer of the achievements of the so-called bourgeois revolution, that is, of the emergence and rise to predominance of the industrial and commercial classes as against the landed aristocracy. Mention should be made of Morgan's work in Australian ethnology. He was the first anthropologist to publish a treatise on Australian kinship. Through correspondence, he taught the scientific principles of ethnology to Lorimer Fison, an English missionary in Fiji, and to A. W. Howitt, a police magistrate in Australia. He guided their field work and wrote the introduction to their book, Kamilaroi and Kurnai (1880), which was dedicated to him. Morgan's ethnology was harshly criticized by John F. McLennan and was treated with some condescension by other British anthropologists. Nevertheless, he was recognized in England as a great pioneer in the field. On his European tour in 1870-1871 Morgan met Darwin, Huxley, McLennan, Lubbock, and Maine. He corresponded with these men and also with J. J. Bachofen on the Continent. In the United States, Morgan achieved great distinction. He knew all the leading anthropologists, many of whom came to him for advice and counsel. In 1879 the newly established Archaeological Institute of America asked Morgan to provide it with a comprehensive program for field research in the Americas (1879-1880). Union College awarded him an honorary degree. He was made a fellow of the American Academy of Arts and Sciences in 1868, elected to membership in the National Academy of Sciences in 1875, and elected president of the American Association for the Advancement of Science in 1879. 
Morgan fell into disrepute in the United States when Franz Boas and his students rose to ascendancy in anthropological science. As an American he was looked down upon or ignored by the European-born members of the Boas school. The reaction against cultural evolutionism, which became vigorous in the United States under Boas, and in Europe under the leadership of Fritz Graebner and later of Schmidt and Koppers, took Morgan as its prime target. He was in turn ignored, belittled, and ridiculed. The fact that Ancient Society had become a Marxist classic unquestionably contributed
to the hostility to and rejection of Morgan's work, but it is difficult to gauge the magnitude of this factor. The Catholic anthropologists of the Kulturkreis school, in the United States as well as in Europe, were especially venomous in their attacks upon Morgan's "crass materialism" and his "evolutionist vagaries." The theory of evolution, however, has become respectable again, at least among many cultural anthropologists; the numerous Darwin centennials in 1959 did much to bring about this change of attitude. With this about-face has come a reconsideration and re-evaluation of Morgan and his work. The League of the Iroquois was reprinted for the fifth time in 1962. A new printing of Ancient Society, with an introduction and annotations by Eleanor Burke Leacock, was issued in 1963, and still another edition was published in 1964. The University of Rochester sponsored a series of Lewis Henry Morgan lectures in 1963 and 1964. The re-evaluation of Morgan and his work has contributed greatly to an appreciation of his full stature as one of the great pioneers in the science of anthropology. LESLIE A. WHITE [For discussion of the subsequent development of Morgan's ideas, see ANTHROPOLOGY, article on THE FIELD; CULTURE; EVOLUTION, article on CULTURAL EVOLUTION; FIELD WORK; INDIANS, NORTH AMERICAN; KINSHIP; SOCIAL STRUCTURE; and the biographies of BACHOFEN; BANDELIER; ENGELS; MCLENNAN; MAINE; RIVERS; TYLOR; WESTERMARCK.] WORKS BY L. H. MORGAN (1851) 1962 The League of the Iroquois. New York: Citadel. -> First published as The League of the Ho-de-no-sau-nee, or Iroquois. 1852 Diffusion Against Centralization. Rochester, N.Y.: Dewey. 1868 The American Beaver and His Works. Philadelphia: Lippincott. 1871 Systems of Consanguinity and Affinity of the Human Family. Smithsonian Contributions to Knowledge, Vol. 17, Publication No. 218. Washington: Smithsonian Institution. 1872 Australian Kinship: From Original Memoranda of Reverend Lorimer Fison.
American Academy of Arts and Sciences, Proceedings 8:412-438. (1877) 1964 Ancient Society. Edited by Leslie A. White. Cambridge, Mass.: Belknap. 1879-1880 A Study of the Houses of the American Aborigines With a Scheme of Exploration of the Ruins in New Mexico . . . [and Elsewhere]. Archaeological Institute of America, Annual Report 1:27-80. 1881 Houses and House-life of the American Aborigines. Contributions to North American Ethnology, Vol. 4. Washington: Government Printing Office. 1937 Extracts From the European Travel Journal of Lewis H. Morgan. Edited by Leslie A. White. Rochester Historical Society, Publications 16:219-389.
1959 The Indian Journals, 1859-1862. Edited and with an introduction by Leslie A. White. Ann Arbor: Univ. of Michigan Press. SUPPLEMENTARY BIBLIOGRAPHY
BANDELIER, ADOLPH F. A. 1940 Pioneers in American Anthropology: The Bandelier-Morgan Letters, 1873-1883. Edited by Leslie A. White. 2 vols. Albuquerque: Univ. of New Mexico Press. EGGAN, FRED 1960 Lewis H. Morgan in Kinship Perspective. Pages 179-201 in Gertrude E. Dole and Robert L. Carneiro (editors), Essays in the Science of Culture, in Honor of Leslie A. White. New York: Crowell. EGGAN, FRED 1966 The American Indian: Perspectives for the Study of Social Change. Chicago: Aldine. -> This volume contains Eggan's "Lewis Henry Morgan Lectures" given at the University of Rochester in April 1964. ENGELS, FRIEDRICH (1884) 1942 The Origin of the Family, Private Property and the State. New York: International Publishers. -> First published in German. FISON, LORIMER; and HOWITT, A. W. 1880 Kamilaroi and Kurnai: Group-marriage and Relationship, and Marriage by Elopement, Drawn Chiefly From the Usage of the Australian Aborigines; Also the Kurnai Tribe, Their Customs in Peace and War. With an introduction by Lewis H. Morgan. Melbourne: Robertson. [Lewis Henry Morgan: A Bibliography.] 1923 Volume 2, pages 165-179 in Rochester Historical Society, Publication Fund Series. Rochester, N.Y.: The Society. LOWIE, ROBERT H. 1936 Lewis H. Morgan in Historical Perspective. Pages 169-181 in Robert H. Lowie (editor), Essays in Anthropology Presented to A. L. Kroeber. Berkeley: Univ. of California Press. RESEK, CARL 1960 Lewis Henry Morgan: American Scholar. Univ. of Chicago Press. STERN, BERNARD J. 1931 Lewis Henry Morgan: Social Evolutionist. Univ. of Chicago Press. WHITE, LESLIE A. 1944 Morgan's Attitude Toward Religion and Science. American Anthropologist New Series 46:218-230. WHITE, LESLIE A. 1948 Lewis Henry Morgan: Pioneer in the Theory of Social Evolution. Pages 138-154 in Harry E. Barnes (editor), An Introduction to the History of Sociology. Univ. of Chicago Press. WHITE, LESLIE A. (editor) 1957 How Morgan Came to Write Systems of Consanguinity and Affinity.
Michigan Academy of Science, Arts, and Letters, Papers 42:257-268.
MORTALITY Mortality statistics are by-products of the legal process of death registration [see VITAL STATISTICS]. These data serve various purposes, such as estimating a component of population growth and preparing population projections; delineating health problems, planning public health programs, and assessing health progress; and studying the natural history of disease. The absolute numbers of deaths are useful as a direct measure of the attrition of the population due to deaths. However, for analytical purposes,
death data are generally used in the form of ratios. Properly computed, a death rate expresses the force of mortality on the population at risk. Types of death rate. The crudest form of death rate is the total or general death rate. This is the number of deaths occurring in a particular period of time, usually a year, for each 1,000 persons in the area or population. Because the general death rate (often called the crude death rate) is the mean of the death rates by age, sex, color, and other demographic variables weighted by the demographic composition of the population, an area with a young population, for example, would have a low general death rate, and an area with an old population a high general death rate, even if the set of age-specific death rates for the two areas were the same. In order to take into account the differential mortality by age, sex, or other demographic variable, death rates are usually computed for a specific population class or group. The age-specific death rate is an example of this type of rate. In some cases, comparisons are based on death rates adjusted for differences in population composition. If the rate is standardized for differences in the age composition of two populations, it is called an age-adjusted death rate. A special kind of death rate is the life table death rate. This is a hypothetical set of derived death rates based on certain assumptions of mortality in a stationary living population unaffected by migration or births. One function of the life table which is of interest is the expectation of life. This is the average number of years that will be subsequently lived by a group of persons who have attained a certain age. The expectation of life at birth is the average age at death of all the 100,000 who start life together in the life table cohort.
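The contrast drawn above between the general (crude) death rate and the age-adjusted death rate can be illustrated with a small numerical sketch. Everything in it is hypothetical: the three age groups, the populations, the rates, and the standard population are invented for illustration, and the adjustment shown is direct standardization against a common standard population, which is one common way of standardizing and is assumed here rather than taken from the article.

```python
# Illustrative sketch only: every number below is hypothetical.

def crude_death_rate(populations, age_specific_rates):
    """General (crude) death rate per 1,000: the age-specific rates
    weighted by the area's own age composition."""
    deaths = sum(p * r / 1000 for p, r in zip(populations, age_specific_rates))
    return 1000 * deaths / sum(populations)

def age_adjusted_death_rate(age_specific_rates, standard_population):
    """Directly standardized rate per 1,000: the same age-specific rates
    weighted by a common standard population (an assumed method)."""
    weighted = sum(r * s for r, s in zip(age_specific_rates, standard_population))
    return weighted / sum(standard_population)

# Hypothetical age groups: under 15, 15-64, 65 and over.
rates = [2.0, 5.0, 60.0]       # deaths per 1,000, identical in both areas
young_area = [400, 550, 50]    # persons (thousands), young age structure
old_area = [200, 600, 200]     # persons (thousands), old age structure
standard = [300, 575, 125]     # hypothetical standard population

young_crude = crude_death_rate(young_area, rates)
old_crude = crude_death_rate(old_area, rates)
young_adjusted = age_adjusted_death_rate(rates, standard)
old_adjusted = age_adjusted_death_rate(rates, standard)

print(f"crude: young {young_crude:.2f}, old {old_crude:.2f}")
print(f"age-adjusted: young {young_adjusted:.2f}, old {old_adjusted:.2f}")
```

With identical age-specific rates, the two crude rates differ only because the age compositions differ, while the adjusted rates coincide.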
Another important function is the survival rate, which is the probability that persons of a particular age will survive for a particular period of time, usually a calendar year [see LIFE TABLES]. Cause of death. An important aspect of mortality statistics relates to data derived from the medical information reported on death certificates. Despite their limitations, statistics on causes of death have contributed a great deal in the past to the field of public health [see PUBLIC HEALTH]. The present statistics on causes of death relate to the "underlying cause of death," which is the term used to denote the disease or injury that initiated the train of events leading directly to death; in the case of accident or violence, it may also include the circumstances which produced the fatal injury. These statistics have done good service for
public health in the past; but, with the lessening importance, at least in the United States, of the acute infectious diseases as compared with the chronic noninfectious diseases, the data have become less and less adequate. The selection of a single disease entity as the "underlying cause" poses a real problem in deaths involving chronic diseases, since in such cases it is frequently difficult, if not impossible, to identify a single underlying cause.
International comparison of cause-of-death statistics also presents a problem. In addition to differences arising from incompleteness of death registration in various countries, there are variations in proportion of deaths attended by a physician, in diagnostic acumen of the clinician in attendance, and in the recording of diagnostic information. International comparisons are further complicated by differences in medical concepts of diseases and in the methods of classifying causes of death. In fact, strict international comparability of cause-of-death statistics is at present a virtual impossibility, and too much significance should not be attached to small differences in rates between countries.

World mortality: situation and trends

The estimated annual death rate for the world population is 17 per 1,000 population for the period 1958-1962. As might be expected, the death rate varies over a wide range in different parts of the world (see Table 1).

Table 1. Population estimates, birth rates, and death rates for major regions of the world

Region           Population(a)   Birth rate(b)   Death rate(b)
Africa                 269             46              23
America                430             33              11
  North                206             24               9
  Middle                71             43              14
  South                153             41              13
Asia                 1,764             43              20
Europe                 434             19              10
Oceania                 17             24               8
U.S.S.R.               221             24               7
World total          3,135             37              17

a. 1962, in millions. b. Annual average, 1958-1962, per 1,000 population.
Source: Computed from data in Demographic Yearbook 1963, p. 142. Copyright © United Nations 1964. Reproduced by permission.

If differences in the age composition of the population in various parts of the world were taken into account, the mortality differential would undoubtedly be much greater than that indicated by the crude death rates shown here. Unfortunately, the
data needed to compute age-adjusted death rates are not available for the various regions of the world. In fact, one of the serious problems in international mortality studies is the lack of adequate mortality statistics for a large part of the world. By and large, reliable data are available only for the countries of northern and western Europe, North America, and Oceania. With a few notable exceptions, data for countries in other regions are either very incomplete or nonexistent.
The estimated birth rate for the world population is a little more than twice the estimated death rate. The natural rate of population increase (the difference between the birth and death rates) is highest in the Latin American countries, followed by the countries on the African continent and in Asia. Traditionally, a major part of annual population growth comes from the contribution made by births, but one of the significant demographic developments in the recent postwar period is the sharp acceleration in population growth due to the rapid decline in mortality. Virtually all countries, and more particularly the developing countries, experienced unprecedented declines in mortality while their birth rates remained at a high level.
The rate of decline in world mortality following World War II was dramatic, but the death rate began to level off in the 1950s in a number of countries, such as the United States, England and Wales, Sweden, Norway, Finland, the Netherlands, Japan, and Chile. Intensive studies of the mortality trend for the United States (U.S. Dept. of Health, Education, and Welfare . . . 1964a), Chile (U.S. Dept. of Health, Education, and Welfare . . . 1964b), and England and Wales (U.S. Dept. of Health, Education, and Welfare . . . 1965) indicate that a large part of the acceleration in the decline of general mortality was due to the large reduction in the death rate for infective and parasitic diseases as a result of antimicrobial therapy.
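The ranking of regions by natural increase stated above can be checked against Table 1 by subtracting each death rate from the corresponding birth rate. The following Python sketch does this; the regional figures are those read from Table 1, and the names RATES and natural_increase are illustrative only and do not come from the source.

```python
# Crude birth and death rates per 1,000 population (annual average,
# 1958-1962), as read from Table 1. Each value is (birth rate, death rate).
RATES = {
    "Africa": (46, 23),
    "North America": (24, 9),
    "Middle America": (43, 14),
    "South America": (41, 13),
    "Asia": (43, 20),
    "Europe": (19, 10),
    "Oceania": (24, 8),
    "U.S.S.R.": (24, 7),
    "World total": (37, 17),
}


def natural_increase(birth_rate, death_rate):
    """Rate of natural increase per 1,000 population: births minus deaths."""
    return birth_rate - death_rate


increase = {region: natural_increase(b, d) for region, (b, d) in RATES.items()}

# List the regions from the highest to the lowest rate of natural increase.
for region, value in sorted(increase.items(), key=lambda item: item[1], reverse=True):
    print(f"{region:15s} {value:3d} per 1,000")
```

On these figures Middle and South America show the largest differences (29 and 28 per 1,000), followed by Africa and Asia (23 each), which agrees with the ordering given in the text; the world total works out to 20 per 1,000.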
In the United States, for example, the death rate for infective and parasitic diseases reached a low level, and by the mid-1950s it was no longer significantly influencing the general mortality trend. At the same time, the mortality trend for chronic diseases and for violence was either rising, remaining unchanged, or declining very slowly. This combination of circumstances causes a marked deceleration in the downward trend of the general death rate. Whether this change in the mortality pattern is transient or permanent is difficult to say. It is obviously not possible for the death rate to decline indefinitely. Further reductions in mortality appear possible in the United States, but it does not seem likely that large declines will occur until a major
breakthrough is made in the prevention of deaths from chronic diseases. On the other hand, if the age-specific death rates in the United States were to decline to levels already achieved by several other countries of low mortality, the crude death rate for the United States in 1960 would have been 7.3 per 1,000 population, as compared with the recorded death rate of 9.5 per 1,000 population. For males the expected death rate would have been 7.8, as compared with the recorded rate of 11.0 per 1,000 population. For females the corresponding rates would have been 6.9, as compared with 8.1 per 1,000 population. The leveling off of the death rate as it reaches its irreducible minimum is readily understandable. However, there seems to be no ready explanation for the change in mortality trends at different levels. For example, the death rate for nonwhites in the United States is still considerably higher than that for whites. Yet the rate of decline of the mortality trend for nonwhites has slowed down in the same manner as that for the whites. National death rates are also becoming stabilized at different levels. For example, the Scandinavian countries and the Netherlands have achieved much lower age-specific death rates than the United States, whereas the age-specific death rates for Japan and Chile are higher. Yet the death rates appear to be leveling off in all of these countries. The experience of Chile appears to have important implications for the developing countries. It seems clear that the knowledge and technical means are available for securing significant reductions in the death rate even in developing countries. The institution of mosquito and fly control and/or the widespread introduction of antibiotics for therapeutic purposes will have an immediate impact upon the death rate. However, it would appear that a point of diminishing returns will soon be reached and the decline in mortality come to a halt. 
Accordingly, the study of mortality trends in Chile points to the importance of planning health activities as a part of the social and economic development of the country (U.S. Dept. of Health, Education, and Welfare . . . 1964b).
Death rates by age
Reference was made earlier to the unsatisfactory nature of the crude death rate, which is significantly affected by the age composition of the population to which it refers. Death rates computed for various age groups, as in Table 2, are, of course, free of this problem.

Table 2. Death rates by age group: United States, 1962

Age                   Death rate*
Under 1 year              2,530.1
1-4                          98.1
5-14                         43.9
15-24                       103.5
25-34                       145.2
35-44                       298.2
45-54                       741.0
55-64                     1,692.9
65-74                     3,798.4
75-84                     8,431.5
85 and over              20,510.0
Total population            945.4

* Per 100,000 population.
Source: U.S. Dept. of Health, Education, and Welfare, Public Health Service, National Vital Statistics Division 1964, pp. 1-5.

As indicated by these age-specific death rates, infancy is the most critical period of life, even for a developed country like the United States. Although data are not available to demonstrate this point, it would not be surprising if one-quarter or more of all live births in many of the developing countries fail to survive the first year of life. For the developed countries it is possible to assess the progress made in the reduction of the infant mortality rate. A significant decline in infant mortality has occurred, and remarkably low rates have been achieved by the Netherlands (15.3 per 1,000 live births in 1962), Sweden (15.8 per 1,000 live births in 1961), and Norway (17.9 per 1,000 live births in 1961). A recent study (Shapiro & Moriyama 1963) of the international infant mortality trends indicates that the rate of decline is slowing in many countries of low mortality.
From a relatively high death rate at infancy, the risk of death drops to a minimum at age ten or so. From then on, there is an increase in mortality with increasing age. This is the typical cross-sectional pattern of mortality in countries of low mortality. However, there are a number of countries where the infant mortality rate is lower than that of the United States. Except in extreme old age, lower death rates are also found at other ages in other countries of low mortality.
In countries of low mortality, most of the deaths occur in the older age groups. In the developing countries, by contrast, it would not be unusual for more than half of all deaths to occur among children under five years of age. Under these conditions, it is obvious that the expectation of life at birth could not be very great.
Expectation of life
The Biblical life span of "three-score years and ten" has become the norm for a number of countries. In Sweden, Norway, Denmark, the Netherlands, and Israel the life expectancy at birth is 70
years or more for both males and females. In other countries, such as the United States, Canada, Czechoslovakia, France, England and Wales, Australia, and New Zealand, the average length of life of 70 years or more for the total population has been attained only because of the favorable mortality experience of females. For example, the average expectation of life at birth in the United States for 1962 is 73.4 years for females and 66.8 years for males. If up-to-date life tables were available for all countries, it is probable that a few other countries could be added to the list above.
The world situation with regard to longevity cannot be described with any precision. However, it seems clear that longevity is at present greatest in the northern and western European countries, Canada and the United States on the North American continent, and Oceania. The average life expectancy is less favorable in the central, eastern, and southern European countries. Still lower on the scale are the Latin American countries. The average expectation of life for a large part of the Asian population is low, although an average length of life of 60 years or more may be found in such Asian countries as Japan, Nationalist China (Taiwan), and Ceylon. Life table values for many of the countries on the African continent are not available. The question in a good part of Africa, especially in the southern and tropical countries, is not longevity but survival through childhood.
The increase in longevity of the population in the developed countries has been considerable. For example, in the period 1900-1902 the average expectation of life at birth in the United States was 48 years for males and 51 years for females. In a period of some sixty years, the male population gained about 19 years in life expectancy at birth, while the gain for females was about 22 years. The postwar increase in life expectancy has been spectacular for some countries.
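To compare gains of this kind across countries, each gain can be divided by the number of years over which it occurred. The sketch below does this for the United States figures above and for the Ceylon figures quoted in the example that follows. Taking the midpoint of each multi-year period (1901 and 1946) as the starting year is an assumption made here, and the names FIGURES and annual_gain are illustrative only.

```python
# Life expectancy at birth in years, as quoted in the surrounding text.
# Each value is (start year, start value, end year, end value).
# Assumption: the midpoints 1901 (for 1900-1902) and 1946 (for 1945-1947)
# stand in for the multi-year starting periods.
FIGURES = {
    ("United States", "male"): (1901, 48.0, 1962, 66.8),
    ("United States", "female"): (1901, 51.0, 1962, 73.4),
    ("Ceylon", "male"): (1946, 46.8, 1954, 60.3),
    ("Ceylon", "female"): (1946, 44.7, 1954, 59.4),
}


def annual_gain(start_year, start_value, end_year, end_value):
    """Average gain in life expectancy per calendar year."""
    return (end_value - start_value) / (end_year - start_year)


gains = {key: annual_gain(*values) for key, values in FIGURES.items()}

# Compare the two countries sex by sex.
for sex in ("male", "female"):
    us = gains[("United States", sex)]
    ceylon = gains[("Ceylon", sex)]
    print(f"{sex:6s} U.S. {us:.2f}/yr  Ceylon {ceylon:.2f}/yr  ratio {ceylon / us:.1f}")
```

Under these assumptions the Ceylon gain per year comes out a little over five times the United States gain for males and about five times for females.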
For example, the expectation of life at birth for males in Ceylon increased from 46.8 years in 1945-1947 to 60.3 years in 1954. For females, the corresponding figures were 44.7 years and 59.4 years, respectively. The average annual gain in longevity in Ceylon, as compared with the experience in the United States, is therefore roughly five times greater.
Death rates by marital status
Almost without exception, the mortality among the married 20 years and over is lower, age for age, than the corresponding death rates for the single, widowed, or divorced. This is true for both males and females. Beyond this, the pattern of

Table 3. Ratio of death rates of unmarried persons to death rates of married: Sweden, 1959

                      MALE                          FEMALE
Age       Single   Widowed   Divorced    Single   Widowed   Divorced
20-24       2.00      *         *          2.00      *         *
25-34       2.50     2.50      4.13        2.40     2.00      2.40
35-44       2.12     1.56      2.31        2.23     1.77      2.15
45-54       1.53     1.82      2.58        1.50     1.34      1.59
55-64       1.27     1.42      1.87        1.28     1.19      1.35
65-74       1.21     1.29      1.47        1.09     1.17      1.17
75-84       1.26     1.30      1.41        1.13     1.16      1.10

* Too few cases for significant comparison with married.
Source: Computed from data in Demographic Yearbook 1961, pp. 592-593. Copyright © United Nations 1962. Reproduced by permission.

mortality differentials by marital status varies somewhat by country. In countries like Sweden, the mortality among divorced males is higher by far than the corresponding rates for bachelors or widowers (see Table 3). For females, the differences in death rates between the single, widowed, and divorced are not so great as those observed for males.
The higher mortality among the single has been explained on the basis of selection; that is, those who never marry because of some serious physical impairment or chronic disease have a higher risk of mortality than the married. The single may therefore include among their number a higher proportion of the poorer mortality risks than those who marry. The higher mortality among the widowed has been attributed to the high association of diseases from which both marital partners die or to a less favorable economic situation that they both share.
One of the problems in the interpretation of death rates by marital status is the fact that the informant may not always know the civil status of those living alone. Also, there is the problem of the lack of correspondence between the marital status reported on death certificates and on the census enumeration schedules. Because the married population constitutes a large part of the total population, errors in reporting of marital status affect the data for the married much less than the data for the single, widowed, and divorced.
Death rates by sex
One of the significant constants of mortality statistics in countries of low mortality is the favorable experience among females as compared with that of males. Examination of death rates by sex for a recent year indicates large sex differentials in mortality for the United States and Canada (36 and 38 per cent, respectively) and for New Zealand and Australia (23 and 26 per cent, respectively). In the countries of western Europe the male mortality exceeded the death rate for females by 10 to 20 per cent.
The death rate for females is lower than that for males in each age group from birth to the end of the life span in virtually every country of low mortality. Even in the developing countries the mortality experience among females is generally favorable as compared with males, except in the child-bearing ages. Maternal mortality is a significant public health problem in these countries, as it was in the developed countries some forty or fifty years ago.
It is not clear why female mortality is consistently lower than that among males. One obvious explanation is the biological difference between the sexes; however, biological differences do not appear to account for much of the sex differential in mortality. A good part of the difference in the death rate appears to be due to the increasing mortality among males or to the fact that the death rate among females is declining faster than that among males. Whatever the explanation for this phenomenon, the continued occurrence of the large sex difference in mortality as recorded in a number of countries will have important consequences in terms of the sex composition of the population of the future, especially in the older ages.
Death rates by cause of death
At the turn of the century, infective and parasitic diseases constituted the major public health problems in the world population. Pneumonia and influenza, tuberculosis, diarrhea and enteritis, and the childhood diseases were the principal causes of death in 1900, even in economically developed countries. The large reduction in mortality since 1900 has been achieved primarily through control of the infective diseases. Although influenza and pneumonia still remain significant public health problems, mortality from the chronic diseases has

Table 4. Death rate and proportionate mortality for the five leading causes of death: selected countries* of North America, Europe, and Oceania, 1961

Leading causes of death                      Average death rate        Per cent of
                                             per 100,000 population    total deaths
Heart disease                                         300                   31
Malignant neoplasm                                    172                   18
Vascular lesion of central nervous system             132                   13
Accidents                                              48                    5
Influenza and pneumonia                                37                    4

* Australia, Austria, Belgium, Canada, Denmark, Finland, France, German Federal Republic (including West Berlin), Hungary, Italy, Netherlands, New Zealand, Norway, Portugal, Republic of Ireland, Sweden, United Kingdom, and United States.
Source: Compiled from "The Ten Leading Causes . . ." 1964a.

come to the forefront. The results of the review of causes of death in selected countries of North America, Europe, and Oceania in 1961 are summarized in Table 4. From Table 4 it may be seen that more than 60 per cent of all deaths in the developed countries are attributable to the cardiovascular diseases and to malignant neoplasms. Although accidents rank fourth, they constitute the leading cause of death in the age groups 1 to 44 years; malignant neoplasms are the most frequent cause of death in the age group 45 to 64 years; and heart disease the principal cause of death in the population 65 years and over.
Similar data for selected countries of Africa, South and Central America, and Asia for 1960 are shown in Table 5. The number of countries in Africa, Asia, and South and Central America that met the criteria for inclusion in the World Health Organization compilations is limited, and the 12 countries that were selected do not, by any means, represent the mortality problems in the vast population of these continents.

Table 5. Death rate and proportionate mortality for the five leading causes of death: selected countries* of Africa, South and Central America, and Asia, 1960

Leading causes of death                      Average death rate        Per cent of
                                             per 100,000 population    total deaths
Gastritis, duodenitis, enteritis, and colitis          95                    9
Heart disease                                          77                    7
Influenza and pneumonia                                67                    7
Malignant neoplasms                                    48                    5
Accidents                                              38                    4

* Mauritius, United Arab Republic, Chile, Colombia, Costa Rica, Guatemala, Mexico, Panama, Trinidad and Tobago, Ceylon, Israel (Jewish population), and Japan.
Source: Compiled from "The Ten Leading Causes . . ." 1964b.

Although gastritis, duodenitis, enteritis,
and colitis were the leading causes of death for half of the selected countries, their average death rate and the proportionate mortality are relatively low. A principal cause of death that accounts for only about 9 per cent of all deaths and five leading causes that constitute no more than one-third of all deaths do not suggest any major health problems. Actually, the averages conceal some of the problems indicated by the data for individual countries. For example, the death rate for gastritis, duodenitis, enteritis, and colitis was 700 per 100,000 population in the United Arab Republic, and 36 per cent of all deaths were charged to these intestinal infections.
Adequate mortality statistics for these regions would delineate existing public health problems more clearly. If such statistics were available, it is likely that other infective diseases, such as tuberculosis, dysentery, typhoid, and measles; parasitic diseases, such as schistosomiasis and malaria; and possibly malnutrition and other dietary deficiency diseases would figure prominently as causes of death.
With the availability of knowledge and means for controlling most of the important infective and parasitic diseases, prospects are good for rapid reduction in mortality from these diseases. The resultant increase in survival of the population will bring new problems to the regions affected. These are the problems of the chronic noninfectious diseases with which the developed countries are now struggling.
IWAO M. MORIYAMA
[See also FOOD, article on WORLD PROBLEMS; POPULATION; PUBLIC HEALTH.]
BIBLIOGRAPHY
CAMPBELL, HUBERT 1965 Changes in Mortality Trends: England and Wales, 1931-1961. U.S. National Center for Health Statistics, Vital and Health Statistics, Series 3, No. 3. Washington: Government Printing Office.
Demographic Yearbook 1961. 13th ed. 1961 New York: United Nations. -> Special Topic: Mortality Statistics. Prepared by the Statistical Office of the United Nations in collaboration with the Department of Social Affairs.
Demographic Yearbook 1963. 15th ed. 1963 New York: United Nations. -> Special Topic: Population Census Statistics II. Prepared by the Statistical Office of the United Nations in collaboration with the Department of Social Affairs.
SHAPIRO, S.; and MORIYAMA, I. M. 1963 International Trends in Infant Mortality and Their Implications for the United States. American Journal of Public Health and the Nation's Health 53, no. 5:747-760.
The Ten Leading Causes of Death for Selected Countries in North America, Europe and Oceania, 1954-1956, 1960, 1961. 1964a World Health Organization, Rapport épidémiologique et démographique 17:54-112.
The Ten Leading Causes of Death for Selected Countries in Africa, South and Central America and Asia, 1954-1956, 1960, 1961. 1964b World Health Organization, Rapport épidémiologique et démographique 17:118-152.
U.S. DEPT. OF HEALTH, EDUCATION, AND WELFARE, PUBLIC HEALTH SERVICE, NATIONAL CENTER FOR HEALTH STATISTICS 1964a The Change in Mortality Trend in the United States. Prepared by Iwao M. Moriyama. National Center for Health Statistics, Series 3, No. 1. Washington: Government Printing Office.
U.S. DEPT. OF HEALTH, EDUCATION, AND WELFARE, PUBLIC HEALTH SERVICE, NATIONAL CENTER FOR HEALTH STATISTICS 1964b Recent Mortality Trends in Chile. National Center for Health Statistics, Series 3, No. 2. Washington: Government Printing Office.
U.S. DEPT. OF HEALTH, EDUCATION, AND WELFARE, PUBLIC HEALTH SERVICE, NATIONAL CENTER FOR HEALTH STATISTICS 1965 Changes in Mortality Trends in England and Wales, 1931-1961. Prepared by H. Campbell. National Center for Health Statistics, Series 3, No. 3. Washington: Government Printing Office.
U.S. DEPT. OF HEALTH, EDUCATION, AND WELFARE, PUBLIC HEALTH SERVICE, NATIONAL VITAL STATISTICS DIVISION 1964 Vital Statistics of the United States 1962. Volume 2: Mortality. Part A. Washington: Government Printing Office.
MOSCA, GAETANO
Gaetano Mosca (1858-1941), Italian political scientist, was born in Palermo, Sicily. He took his law degree there in 1881 with the thesis I fattori della nazionalità. The thesis foreshadows some of the characteristics of Mosca's later writings: his detachment from the ideological climate of the risorgimento and his lively sense of history, which acted as a corrective to his strongly positivistic approach.
It is difficult to assess the extent to which Mosca's conceptual approach was influenced by the Sicilian environment of his youth. Sicily was, both socially and politically, the most backward region of Italy, and the introduction of representative government had, if anything, aggravated the political problems of the South. Hence, such diverse scholars as Antonio Gramsci, an Italian Marxist, and William Salomone, an American historian, have attributed to Mosca's Sicilian background his hostility toward democratic ideology and the parliamentary system, which was so evident in his first major work, the Teorica dei governi e governo parlamentare (1884; "On the Theory of Governments and Parliamentary Government"). The book is an outburst against contemporary Italian political life, which, Mosca alleged, had become arbitrary and corrupt as a necessary consequence of popular sovereignty. Such antiparliamentary polemics were common in Europe at the
time; in Italy, however, feelings on this score were particularly intense because the difficulties of an enfeebled regime were exacerbated by problems created by the risorgimento. Mosca's criticism is, in part, simply an instance of the then prevailing antiparliamentarianism; but it stands apart because of its clear-cut distinction between the ideal of liberty on the one hand and the evils of the democratic "myth" on the other hand.
As an old man, Mosca used to blame certain failures in his early academic career on his denial of the principles of popular sovereignty and political representation. The fact is that he qualified to teach constitutional law very early, in 1885, but had no success in various competitions for fellowships for study abroad and for a chair of constitutional law. His writings during this period, e.g., his essay Le costituzioni moderne (1887; "Modern Constitutions"), were entirely in the field of public law; in the absence of chairs of political science in Italy at the time, this was the discipline closest to his interests and the one in which he hoped to make his career.
In 1887 Mosca's setbacks at the universities led him to accept a position as editor of the proceedings of the Chamber of Deputies, a position he kept for ten years. It was, as he later said, an ideal observation post for a young man eager to understand the realities of politics. For most of that period he published little, but it must have been a time of intense study and meditation, decisive for the elaboration and ordering of his thought.
Basic to Mosca's thought was the conviction that only the substitution of scientific truth (such as the doctrine of "ruling class") for "metaphysical abstractions" (such as the democratic myth) would make it possible to purify and to heal political practice. His faith in the redeeming power of political science appears to have been fostered by the prevailing cultural atmosphere of his youth.
At that time, in Italy as elsewhere, positivist philosophy was dominant, and Mosca believed he could transfer its inductive method from the study of nature to the study of human society. The theory of the ruling class. Mosca's ideas were first systematically presented in The Ruling Class (1896), the work that may be said to mark the birth of political science in Italy. Mosca was never to change basically the theory he presented at that time, although by 1923, when the second edition of the work appeared, his doctrine had been in many respects deepened and elaborated. The main outlines of the second edition of The Ruling Class may be summarized as follows. Whatever the form of government, power is always in the hands of an organized minority, the
"ruling class," which has authority over the majority by virtue both of certain characteristics that vary according to the epoch and the situation and of the power derived from organization per se. In accordance with human nature, however, this ruling class always tries to justify its rule by a moral or legal principle, the "political formula," which, however abstract, must be consonant with the conception of life of the community that is governed. The concept of the political formula not only makes Mosca's theory a powerful tool for interpreting historical reality, in that the formula presumably reveals that reality, but also constitutes a reaffirmation of the value of consensus in the organization of the state.
As indicated by the title of Mosca's programmatic lecture "Il principio aristocratico ed il democratico nel passato e nell'avvenire" (1903; "The Aristocratic Principle and the Democratic One, in the Past and the Future"), he held that two opposite tendencies are inherent in society: the aristocratic tendency toward keeping power in the hands of the descendants of those who govern and the democratic tendency toward renewal by means of elements derived from the governed. (Mosca became involved in an acrimonious dispute with Pareto over the priority of the discovery of the concept of the circulation of elites; Mosca's priority is now generally acknowledged.) Paralleling these tendencies are two principles, likewise opposed to each other: the "autocratic," according to which authority is transmitted downward, and the "liberal," by which authority is delegated from below. The two antitheses are independent and may coexist.
The theory acquires a tighter articulation by its distinction between two levels within the ruling class, with government proper being at one level, and at the other, lower level all the existing political forces.
Finally, the theory is crowned by the concept of "juridical defense," possible only when there exists a "balance of social forces" and therefore a government of law dispensing "relative justice" (Meisel 1958, p. 12). Juridical defense, that is, can be realized only when there is a plurality of forces, independent of and checking each other and sharing in the power of government [see CONSTITUTIONAL LAW].
Mosca's use of the concept of juridical defense clearly warrants his being classified as a liberal (in the European sense of the term), for it introduces a value judgment on political systems: those political systems are better which guarantee greater respect for the "moral sense." According to Meisel, what mattered to Mosca was less the substance of this moral sense than the existence of social mechanisms that allow it to flourish. These mechanisms are more likely to exist under conditions of political liberty. Moreover, a value judgment is also implied in the statement that an "open" ruling class is preferable to a "closed" one. By 1923 Mosca had in fact changed his position with regard to representative government, which he distinguished sharply from the parliamentary system (a degenerate form); he attributed to the representative system the highest degree of juridical defense ever attained in history [see REPRESENTATION].
It is not too clear how Mosca arrived at his theory of the ruling class and the political formula. It is known that from his boyhood he was an avid reader of history, and among historians there are, of course, some who more or less consciously realize that in human societies there is always a small group that does the actual governing. More specifically, it was Taine who influenced the development of Mosca's thinking, by his antiegalitarianism and his concept of a bienfaisante aristocracy in particular and in general by the pessimistic view of humanity to which his interpretation of history and politics is closely linked.
When Mosca first presented it, the theory of the ruling class had no influence whatever, either as an instrument of historical interpretation or as a lever for a new discussion of the nature of politics; only later, as a result of Pareto's writings, did the concept of a ruling minority take hold. Instead, Mosca's trenchant criticism of the parliamentary system did have wide repercussions, and it has not unfairly been charged that, although he was a liberal, his attack on the institutions that represent historically the attempt to realize the liberal ideal actually helped to undermine liberty.
Reception of Mosca's work. The Ruling Class did win for Mosca the chair of constitutional law at the University of Turin, where he remained until 1923.
It is hard to say whether and to what extent the new environment influenced his subsequent political thinking. The intellectual atmosphere of Turin, more cosmopolitan than that of Rome, let alone Sicily, must surely have had an impact on him. Also, he came to know such outstanding men as the economist Luigi Einaudi, the ecclesiologist Francesco Ruffini, and the jurist and philosopher Gioele Solari, all of whom were then teaching at Turin and with whom he shared membership in the Liberal party. It was during the first part of his Turin period that Mosca seems to have become aware of the appeal that the doctrine of the political class had vis-a-vis the Marxist theory of the economic class. Having reached high academic standing, Mosca went into active politics. He was elected to the
Chamber of Deputies in 1908 and took his seat among the conservatives (in 1912 he voted against extension of the suffrage). From 1914 to 1916 he was undersecretary for the Colonies, and in 1919 he became a senator. In his attitude toward fascism he was typical of many of the most prominent Italian liberals of the time: he moved from an initial position of benevolently suspended judgment, and even outspoken hope, to one of open opposition. The fascists, for their part, never claimed that their ideology was related to Mosca's theories—as they did in the case of Pareto's— although around 1930 some young fascist intellectuals did maintain that Mosca's criticism of the majority principle and his vehement antiparliamentarianism entitled him to a prominent place among the ideological ancestors of fascism. They were, however, taking a one-sided view of Mosca's theory, which, as has been noted, ultimately led to a liberal position, mainly via the conception of juridical defense but also via the acknowledgment of the value of an "open" ruling class. It should be remembered, however, that although the events of the time had some influence on Mosca's formulation of a liberal position, he reached this position primarily by theoretical reasoning. The new edition of The Ruling Class won for Mosca a call to the University of Rome in 1923, and there, from 1925 to 1933 (when he reached the mandatory retirement age), he occupied Italy's first chair of the history of political institutions and doctrines. The lectures he delivered in Rome were published as the well-known Storia delle dottrine politiche (1932a), remembered especially for its affirmation of the interdependence of political practice and political ideas. Although the 1923 edition of The Ruling Class was well received, Mosca's doctrine continued to have little influence. 
In the case of Italy, at least, the reason for this was the predominance of the philosophy of idealism, which rejected the "generalizations" of the social sciences; although Mosca's work had won the approval of Benedetto Croce, the leader of Italian idealism, this did not move others to penetrate the positivist surface of his thought. Michels was the only one who used Mosca's theory of the ruling class, chiefly in his studies of the oligarchical structure of political parties. Not until after World War II did Mosca's doctrine have some success, in part because enlarged cultural contacts required notice of the Marxist doctrine of classes. Special mention should be made of the revision of Mosca's theories by Guido Dorso in Dittatura, classe politica e classe dirigente (1949), which evaluates the role of the masses in a new way. But it is especially in the United States, with its rich and deep-rooted tradition of research into the phenomena of association, that Mosca has received the attention he merits, from J. Burnham, J. H. Meisel, and others. Today in Europe as well, Mosca's central idea is considered a basic concept and has become common intellectual property.
MARIO DELLE PIANE
[See also ELITES and OLIGARCHY and the biographies of MICHELS; OSTROGORSKII; PARETO.]
WORKS BY MOSCA
1882 I fattori della nazionalità. Rivista europea 13, fasc. 4:703-720.
(1884) 1958 Teorica dei governi e governo parlamentare. Pages 15-328 in Gaetano Mosca, Ciò che la storia potrebbe insegnare: Scritti di scienza politica. Milan: Giuffrè.
(1884-1941) 1958 Ciò che la storia potrebbe insegnare: Scritti di scienza politica. Milan: Giuffrè. → A commemorative volume.
(1887) 1958 Le costituzioni moderne. Pages 445-549 in Gaetano Mosca, Ciò che la storia potrebbe insegnare: Scritti di scienza politica. Milan: Giuffrè.
(1896) 1939 The Ruling Class (Elementi di scienza politica). New York: McGraw-Hill. → An abridged edition, entitled La classe politica, was published by Laterza in 1966.
(1903) 1949 Il principio aristocratico ed il democratico nel passato e nell'avvenire. Pages 1-36 in Gaetano Mosca, Partiti e sindacati nella crisi del regime parlamentare. Bari: Laterza.
(1924) 1949 Lo stato-città antico e lo stato rappresentativo moderno. Pages 37-60 in Gaetano Mosca, Partiti e sindacati nella crisi del regime parlamentare. Bari: Laterza.
(1932a) 1962 Storia delle dottrine politiche. 8th ed. Bari: Laterza. → First published as Lezioni di storia delle istituzioni e delle dottrine politiche.
(1932b) 1958 The Final Version of the Theory of the Ruling Class. Pages 382-391 in James H. Meisel, The Myth of the Ruling Class: Gaetano Mosca and the "Elite." Ann Arbor: Univ. of Michigan Press. → First published as Chapter 40 of Lezioni di storia delle istituzioni e delle dottrine politiche.
Partiti e sindacati nella crisi del regime parlamentare. Bari: Laterza, 1949. → A posthumous volume, containing a large number of minor writings first published between 1897 and 1925.
SUPPLEMENTARY BIBLIOGRAPHY
BOBBIO, NORBERTO 1960 Gaetano Mosca e la scienza politica. Rome: Accademia Nazionale dei Lincei.
BOBBIO, NORBERTO 1962 Gaetano Mosca and the Theory of the Ruling Class. Banca Nazionale del Lavoro, Rome, Quarterly Review 60:3-23.
BURNHAM, JAMES 1943 Mosca: The Theory of the Ruling Class. Pages 79-115 in James Burnham, The Machiavellians: Defenders of Freedom. New York: Day.
CAPRARIIS, VITTORIO DE 1954 Profilo di Gaetano Mosca. Mulino 3:343-364.
COOK, THOMAS I. 1939 Gaetano Mosca's The Ruling Class. Political Science Quarterly 54:442-447.
CROCE, BENEDETTO (1923) 1947 Premessa. In volume 1 of Gaetano Mosca, Elementi di scienza politica. 4th ed. Bari: Laterza. → A book review of the 1923 edition of Mosca's Elementi di scienza politica.
DE PIETRI-TONELLI, ALFONSO 1935 Mosca e Pareto. Rivista internazionale di scienze sociali 43:468-493.
DELLE PIANE, MARIO 1949 Bibliografia di Gaetano Mosca. Florence: La Nuova Italia. → A comprehensive and annotated list of Gaetano Mosca's writings.
DELLE PIANE, MARIO 1952 Gaetano Mosca: Classe politica e liberalismo. Naples: Edizioni Scientifiche Italiane. → Contains a large and up-to-date list of Mosca's writings on pages 377-382.
DORSO, GUIDO 1949 Dittatura, classe politica e classe dirigente. Turin: Einaudi. → See especially pages 121-184 on "Classe politica e classe dirigente."
GRAMSCI, ANTONIO 1949 Il risorgimento. Turin: Einaudi. → See especially page 59.
HUGHES, H. STUART 1954 Gaetano Mosca and the Political Lessons of History. Pages 146-167 in H. Stuart Hughes (editor), Teachers of History: Essays in Honor of Laurence Bradford Packard. Ithaca, N.Y.: Cornell Univ. Press.
HUGHES, H. STUART 1958 Consciousness and Society: The Reorientation of European Social Thought, 1890-1930. New York: Knopf. → See especially pages 252-259.
LUCIOLLI, MARIO 1959 G. Mosca y el pensamiento liberal. Santiago (Chile): Universidad de Chile, Instituto de Ciencias Politicas y Administrativas.
MALAGODI, GIOVANNI F. 1928 Le ideologie politiche. Bari: Laterza. → See especially Chapter 6.
MEISEL, JAMES H. 1958 The Myth of the Ruling Class: Gaetano Mosca and the "Elite." Ann Arbor: Univ. of Michigan Press.
MEISEL, JAMES H. 1964 Mosca "transatlantico." Cahiers Vilfredo Pareto 4:109-117.
PASSERIN D'ENTRÈVES, ALESSANDRO 1959 Gaetano Mosca e la libertà. Politico 24:579-593.
PIOVANI, PIETRO 1951 Momenti della filosofia giuridico-politica italiana. Milan: Giuffrè. → See especially pages 97-143.
RUNCIMAN, W. G. 1963 Social Science and Political Theory. Cambridge Univ. Press. → See especially Chapter 4.
SALOMONE, ARCANGELO WILLIAM 1945 Italian Democracy in the Making: The Political Scene in the Giolittian Era, 1900-1914. Philadelphia: Univ. of Pennsylvania Press. → See especially Chapter 2.
SPITZ, DAVID (1949) 1965 Patterns of Anti-democratic Thought: An Analysis and a Criticism, With Special Reference to the American Political Mind in Recent Times. Rev. ed. New York: Free Press.
VECCHINI, F. 1965 La pensée politique de Gaetano Mosca et ses différentes adaptations au cours du XXe siècle en Italie. Ph.D. dissertation, Univ. of Dijon.
MOTIVATION i. THE CONCEPT ii. HUMAN MOTIVATION
Lawrence I. O'Kelly Robert C. Birney
THE CONCEPT
The concept of motivation has had a comparatively short formal history in experimental psychology, figuring hardly at all in the systematic presentations of such forebears and founders as the English associationists, Wundt, James, and
Titchener. While space does not permit here adequate development of background or supporting documentation, it is probable that motivation became a central variable in behavior theories coincidentally with the change from viewing mind as "structure" to viewing mind as "function." This was the period of the emergence of the functionalism of Dewey and Angell, Freud's psychoanalysis, and McDougall's "hormic," or purposive, psychology. The notion that mind or behavior has directional and energetic components could only have occurred to students who regarded organisms as going and achieving, as desiring and searching, or as solving problems and adapting. Philosophical antecedents. The philosophical heritage of experimental psychology was of little help in telling the functional psychologists how to think about these problems in dynamics. Philosophy had thought much about human values, but there seemed little possibility of generalizing ethical systems across the broad range of species and phyla that seemed to have motivational components in their behavior. Nor, except as a kind of confused background, do the nonhumans have much to contribute to the chronic debate over hedonism. The principles of "association," as contributions to theories of learning and memory, have had a long philosophical history, of course, but they never gave rise to concepts that could be called "motivational." Biological contributions. A good deal more was available from the biologist, particularly concepts of instinct and physiological regulation and a knowledge of neurophysiological bases of behavior. Instinct. Through many revisions, the concept of instinct as a directive force had achieved general acceptance among naturalists and was at hand for the now familiar uses in behavior theory to which it was put by McDougall and by Freud. Evolution, as Darwin saw it, made instinctive behavior clearly adaptive for either the individual or his species. 
Through the years, this concept of instinct has had an eventful career, being rejected out of hand by the radically empirical behaviorists and being dramatically revived into a new and fruitful usefulness by the zoological ethologists. In the study of motivation the concept of instinct becomes useful when it is made to represent rather uniform, genotypically shaped behavior patterns operating in the context of self-maintenance or species maintenance. This, then, enables some of the fundamental characteristics of the motivational concept to be easily discerned. An animal's responses are triggered by some internal physiological change, usually acting conjointly with distinctive external stimulus patterns. The responses are selectively oriented to one or another aspect of the environment and show a flexibly shaped succession that is usually relevant to some adaptive end. There then ensues, if the animal is "successful," achievement of some state of affairs that ends the sequence.
Physiological regulation. Another related set of facts and concepts from biology anticipated the psychologist's concern with motivation and gave him the framework of a model that has had long viability, not only within experimental psychology proper but also, as an inspiration for analogous models, within the social sciences generally. We refer to the facts of physiological regulation. Claude Bernard may properly be said to be their discoverer. In the middle of the last century he was demonstrating that the effective functioning of vertebrate organisms depends on maintenance of the physical and chemical state of body fluids within rather narrow zones of constancy. His maxim that "constancy of the internal environment is the condition of a free life" is among the most celebrated truths of modern physiology. While physiologists usually remained concerned with analysis of the internal mechanisms that ensured constancy, it was obvious that in many, if not all, instances the complete story would involve the behavioral capabilities of the animal. To a working, machinelike body, the most usual and frequent threats to constancy come from the metabolic processes that make work possible, including those that labor themselves to maintain constancy. For example, body temperature may be controlled by using water for evaporative cooling or by using glycogen for warming through shivering. In these instances, stores of water and glucose are depleted. The only replacement sources are in the external world, and the animal must manifest behavior as it searches the environment to locate and to consume the needed substances.
This whole cycle, which goes on endlessly throughout the animal's lifetime, provides a blueprint for theorizing about behavior which is basically and primarily motivational. Also, because animals depend for their very existence on the adequacy of both internal and external aspects of regulation, the structures mediating these functions are subjected to powerful selective biases in their evolution. Neurophysiological bases. In recent years the neurophysiologist and physiological psychologist have been increasingly successful in identifying the neural and endocrine bases for such important motivational conditions as hunger, thirst, and sex; but even before this actual demonstration the functional theories of behavior were assuming that
motivation was firmly anchored in the organic needs of the body. This is clearly illustrated in the following quotation from Dashiell's influential textbook of 1928:
"The primary drives to persistent forms of animal and human conduct are tissue-conditions within the organism giving rise to stimulations exciting the organism to overt activity. A man's interests and desires may become ever so elaborate, refined, socialized, sublimated, idealistic; but the raw basis from which they are developed is found in the phenomena of living matter." (1928, pp. 233-234)
Social motives. The second sentence of the above quotation reflects one of the major problems of the contemporary motivational theorist: the nature and derivation of those human motives that do not seem to be connected in any obvious manner to the waxing and waning of organic needs. All possible positions are represented in the writings of psychologists, ranging from Dashiell's view that social motives simply grow out of physiological strivings to the assertion that social motives may be completely unrelated to biological needs in either their development or their full-fledged operation.
Experimental psychologists have been interested in the study of motivation for its own sake and because of the important role that motivational constructs have come to play in theories of learning and performance. Current developments in the psychology of motivation are taking place in a number of areas without a great deal of apparent interaction. Since the purpose of this article is to present an introductory overview of the field, some of the major fields of interest in motivation will be described, followed by a brief attempt at synthesis.
Physiological mechanisms in motivation
While the earliest and still most basic interest of the psychologist concerned with the physiological mechanisms in motivation is the nature of behavior resulting from alterations in internal physiological states, his study of underlying mechanisms has drawn him into an active partnership with the neurophysiologist and with the physiologist interested in regulatory processes. The application of the Horsley-Clarke stereotaxic technique for precise subcortical brain exploration has led to a vast number of discoveries of the importance of the hypothalamus and brain stem for regulation of the internal constancy of the body. It has further been demonstrated that these structures participate in the more external aspects of such regulation, as in the initiation and termination of eating
and drinking. The underlying mechanisms of drive turn out to be complexly interrelated combinations of physical and chemical changes in cell membranes, endocrine secretion, and neural integration. As the sequence of operations underlying the various regulatory cycles becomes more clearly understood, a generalized schema is beginning to emerge: a pattern of processes not unlike the patterns psychologists and engineers are accustomed to dealing with in the analysis of any type of behavior system. This schema is illustrated in Figure 1, a much simplified representation of the regulatory model of motivated behavior. When the physical or chemical constancies of the body are altered, correctional mechanisms are brought into play by either completely internal homeostatic adjustments or regulation that involves arousal of the animal to discriminatory awareness and to selective orientation with respect to corrective aspects of the environment. When system variables are restored to something approaching optimal levels, signals of restoration act to inhibit the behavioral and physiological processes of correction. Thus the familiar negative feedback principle of control can be applied to guidance of the organic system [see CYBERNETICS].
While regulation frequently is quite automatic and does not require any observable behavioral effort, such is not always the case. As was mentioned earlier, when restoration of system variables to an optimal range consumes substances which must be replaced from sources exterior to the animal or when the environment itself poses local conditions of stress, the animal must make discriminative responses in that environment. That this is so has been recognized for a long time, as the quotation from Dashiell would indicate. What is essentially new is the identification of the mechanisms underlying the processes of the model.
It is almost invariably true that significant variations from the optimum of any physiological system variable are signaled by changes in the physical or chemical characteristics of the extracellular body fluids. For example, oxygen lack is signaled by an increase in the carbon dioxide content of the plasma, dehydration is accompanied by an increase in extracellular osmotic pressure, etc. Either these changes or some of their secondary consequences are adequate stimuli for specialized detector cells which, functioning as quasi-sense organs, react to the index of system disturbance by hormonal and/or neural excitation and response. These responses are the direct cause of correctional activity in the case of the automatic regulations and instigate the discriminatory and orienting responses in the case where behavioral components are necessary. In the latter case it is obvious that docility, flexibility, and variability are introduced, furnishing survival criteria of a kind quite different in many ways from those inherent in the more automatic and internally sufficient type of regulatory mechanism. In short, it would appear that the phylogenetic modifications that have led to the superior mammalian nervous system were decisively determined by the demands of survival by external regulation.
Experimental verification of the essential points in this argument has been accomplished. Discrete
Figure 1. A simplified representation of the regulatory model of motivated behavior. (Labels recoverable from the diagram: System variables; Deviation of system variables from optimum; Internal correctional mechanisms; back to optimum.)
lesions produced in the lateral areas of the hypothalamus cause an animal to become aphagic; the animal will refuse food in the face of a growing and critical caloric need, and eventually, unless maintained by artificial feeding, will die of starvation. In a nearby region ablations can be made that will cause an animal to become adipsic; even when some water is placed in the mouth of such an animal it will refuse to swallow and will attempt to reject the water as if it were some ill-tasting or noxious substance. It has been reported (Andersson & McCann 1956) that aphagic animals will accept food if it is in liquid form and that adipsic animals will accept water if it is contained in fluid that has an acceptable taste and high caloric content. Such results tend to support the notion of relatively discrete neural systems controlling the urges to eat and to drink.
Further support for this point of view is gained by observing the results of lesions in the ventromedial nuclei of the hypothalamus. Animals so injured develop hyperphagia, eating ravenously and far beyond their caloric needs. Such animals, with unlimited supplies of palatable food, become obese, frequently weighing more than twice as much as would be normal for their age. Thus, there would appear to be a specific food-control system, the "need detector" being critically involved with the lateral hypothalamus and the "satiation detector" with the ventromedial hypothalamus. Other parts of the nervous system are, of course, involved also, since the search for and consumption of food requires a vast amount of sensory apparatus, memory capacity, and ability to manage a repertoire of motor responses. But all of this apparatus, important as it is, appears to depend for its operation on signals instigated by the hypothalamic mechanisms. This would seem to be the closest we have yet come to locating a specifically physiological basis for a "pure" motivational component to behavior.
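The two-detector account just described can be made concrete with a toy simulation. The sketch below is purely illustrative: the class name, the thresholds, and the rates of depletion and intake are hypothetical values chosen for the example, not physiological measurements, and the model ignores everything except a single energy reserve. It shows only the logic of the argument: a "need detector" that starts eating, a "satiation detector" that stops it, and the different outcomes when one or the other is disabled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class FoodControlModel:
    """Toy two-detector regulator; every number here is an illustrative assumption."""
    need_threshold: float = 40.0             # reserve below this -> need signal
    satiation_threshold: float = 60.0        # reserve above this -> satiation signal
    metabolic_cost: float = 1.0              # reserve consumed on every time step
    meal_gain: float = 3.0                   # reserve gained on a step spent eating
    need_detector_intact: bool = True        # stands in for the lateral system
    satiation_detector_intact: bool = True   # stands in for the ventromedial system

    def simulate(self, steps: int, reserve: float = 50.0) -> List[float]:
        """Return the reserve level after each of `steps` time steps."""
        eating = False
        history = []
        for _ in range(steps):
            need = self.need_detector_intact and reserve < self.need_threshold
            sated = self.satiation_detector_intact and reserve > self.satiation_threshold
            if need:
                eating = True        # need signal instigates eating
            if sated:
                eating = False       # satiation signal inhibits it (negative feedback)
            reserve -= self.metabolic_cost
            if eating:
                reserve += self.meal_gain
            reserve = max(reserve, 0.0)   # the reserve cannot go below empty
            history.append(reserve)
        return history


intact = FoodControlModel().simulate(500)
no_need_signal = FoodControlModel(need_detector_intact=False).simulate(200)
no_satiation_signal = FoodControlModel(satiation_detector_intact=False).simulate(200)
```

With both detectors intact the reserve cycles within a narrow band around the two thresholds. With the need detector disabled, eating never begins and the reserve runs down to zero, a rough analogue of aphagia; with the satiation detector disabled, eating never stops once begun and the reserve climbs without limit, a rough analogue of hyperphagia (real animals level off at an obese weight, which this sketch does not attempt to reproduce).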
While much remains to be done, it can be said with some confidence that the remaining problem in the physiological analysis of motivation is elucidating the organic basis for learning, memory, and perception; the shape of the purely motivational component is now within our grasp.
Brain stimulation. One development that holds promise of relating the need-signaling and satiation-signaling systems to the systems governing learning, memory, and perception is the now well-known finding of Olds and his associates that animals will perform a wide variety of instrumental acts if rewarded by electrical stimulation in subcortical brain areas; still other areas are found to generate avoidance responses (see Olds 1962 for
a review of studies of the phenomenon). The "reinforcing" areas are located in a somewhat diffuse path that extends from the midline septal nuclei down to the lateral and posterior hypothalamus. Under optimal stimulus conditions and with electrodes implanted in the most favorable hypothalamic loci, animals will continue pressing a lever to receive stimulation for an indefinitely long period, stopping only when apparently fatigued and resuming their efforts after brief periods of rest. Response rates are augmented by concurrently operating tissue needs (Brady et al. 1957) and are modifiable by changes in the electrical parameters of the stimulus. The relations of this organic phenomenon to hedonic and other theories of motivation will be dealt with by other contributors to this section. What appears inescapable is the fact that by quite artificial and nonphysiological means it is possible so to stimulate the brain as to instigate behavioral sequences that look for the most part like ordinary "motivated" behavior. Still an interesting and vital question is just what part of the regulatory mechanism is triggered by the electrical stimuli. Does the stimulation mimic the adequate need-detection stimuli, or does it operate on whatever system is responsible for determining the acceptability, palatability, or "hedonic tone" of peripheral stimuli? There is not yet enough evidence to decide between these alternatives. Perhaps neither possibility is true, and it might turn out that electrical stimulation of the brain creates a completely unique motivational state having little to do with the central mechanism for other physiological drive states. Even if this were true, however, the animal must marshal his sensory and motor equipment in the interest of repeating the instrumental acts leading to brain stimulation, and so at the perceptual and motor sectors of motivated behavior the over-all model is still sufficient [see NERVOUS SYSTEM, article on BRAIN STIMULATION]. 
Activation. In addition to the newly expanded knowledge of neural and hormonal variables in regulation, there has been another development in neurophysiology that has exercised a significant influence on concepts of motivation. The discovery by Horace Magoun (1950; 1958) and his co-workers of a second sensory and motor neural mediating system, working in conjunction with the classical afferent paths to the brain and the pyramidal efferent motor outflow from the brain, has given the psychological theorist a wider range of physiological properties on which to base his thinking about the relation of brain to behavior. Magoun showed that the reticular formation of the brain stem received innervation from most of the afferent
nerves leading from sense organs and that stimulation of sense organs caused excitation to be transmitted not only through the long-known "specific projection pathways" to the corresponding sensory areas of the cortex but also, by means of the reticular formation, diffusely to most or all other parts of the cortex. Because the reticular formation receives excitation from all sensory channels and because any specificity appears lost in the diffuse transmission to the cortex, this system was called the "nonspecific," or "diffuse," projection system. The major feature of the reticular formation seems to be the dependence of the higher regions of the brain on this diffuse excitatory consequence of sensory stimulation for proper transmission and integration of impulses that are carried over the specific projection system. In a classical experiment Moruzzi and Magoun (1949) showed that cats with the reticular formation ablated appeared unable to respond to peripheral stimulation, although electrical recording from the sensory areas of the cortex showed that the signals from the sensory nerves were arriving at the cortical sensory projection areas in a normal fashion. It had been known that the electroencephalogram (EEG) recording the spontaneous massed electrical activity of the cortex showed a regular alternation of a roughly sinusoidal character and of a frequency that was directly related to the degree of alertness of the subject: coma and sleep being accompanied by very slow waves; relaxed waking states, by an intermediate frequency; and a shift toward higher frequencies occurring when the subject either was attending to peripheral sensory stimulation, was actively engaged in tasks, or was disturbed by emotional thoughts. The shift from lower to higher frequencies was shown by Magoun and others to be closely related to activity in the diffuse projection system.
The phenomenon of increased EEG frequency as a consequence of stimulation has been called activation and as such is an index of the widespread changes in the higher nervous system attending integrated behavior. More broadly, then, "activation" is a term used to denote the generalized, nondirectional alerting of the subject as a consequence of external or internal stimulation. In this sense, the concept has been called upon to bear an increasingly heavy theoretical load in discussions of motivated behavior [see ATTENTION]. Considering the importance of activation as a part of the physiological analysis of motivation, at least two main points should be made. To the extent that regulatory imbalance increases the activity level of the animal, it could be surmised that the
various deficiency and excess detector mechanisms, like the peripheral sense organs, contribute to excitation in the diffuse projection system in addition to functioning as the origin of signals specific to the particular system out of equilibrium. If this is so, the diffuse projection system is at least a part of the physiological mechanism underlying "drive" and conforms nicely to some of the behaviorally observable properties of motivation. A second, and perhaps more important, possible property of the activation system is that, in its obvious importance for discriminatory awareness, it provides a common physiological mechanism for mediation of tissue-need motivation and the other more complex forms of motivation that appear to originate in peripheral sensory stimulation or in some relationship to the previous learning and memory of the individual. A number of years ago Morgan (1943) proposed that the essential physiological mechanism in motivation was a kind of general excitatory process, to which he gave the name "central motive state." It would seem that the central motive state could well be the diffuse activation process. To the extent that activation is nonspecific it is a process that can be equally at the service of any adequate stimulus situation, be it internal or external, be it changes in physical constants of plasma or changes in patterns of symbolic sensory input.
Instinctive behavior
The strongly empiricist bias of behavioral science in the United States and in Russia has led to a serious neglect by psychologists of some of the clearest examples of motivational models at work. Most animal species show, to varying degrees, complex behavioral sequences, the performance of which is so similar among members of a species as to suggest compellingly that some of the crucial determinants of the sequence are a part of the genetically transmitted characteristics of that species.
Intensive field and laboratory studies of the behavior of members of many of the animal phyla have led to a renewed interest in instinctive behavior and to several important modifications in our concept of its characteristics. The most important insight is the recognition that instinctive behavior does not run itself off blindly and inflexibly, but rather occurs under an exquisitely balanced set of external and internal stimulus conditions, changes in any of which cause corresponding changes in the details of the instinctive acts; along with this there is a flexibility and adaptability of the activity to the existing conditions, and within the uniformity of the over-all ends, or goals, of the behavior, there is a variety of means employed
that makes any given sequence of instinctive behavior well-nigh unique. Further, the physiological basis for instinctive behavior, so far as it is now known, seems to conform quite well to the general model for regulatory behavior. All of these points can be made more vividly by briefly citing an example. Several species of Pacific salmon have a life cycle that starts with hatching from eggs laid in fresh-water streams at some distance from the ocean. After a few months of growth in the immediate vicinity of their birth site, the young salmon gradually drift downstream, foraging as they go and maintaining a predominant upstream body orientation. Upon reaching the ocean they range over a territory of many thousand square miles for periods that vary from one to four years. During this time, nourished by the plentiful food supply of the seas, they attain a large size. Then, for reasons still obscure, changes occur in the pituitary, the fish stop foraging for food, their digestive tracts start an active atrophic process, and the salmon start on the return trip to the river drainage from whence they came. With a high probability of success they locate the correct river, ascend it, select the correct branchings, and return to the particular stretch of water in which they were hatched. By this time the starvation and the pituitary-directed gonadal changes have produced marked structural alterations in the fish. Their skin pigmentation is changed, their jaw structure is modified, and they are immeasurably weaker. The female selects a favorable spot, scoops out a nest in the stream bottom, and lays her eggs. These are covered with the sperm-bearing fluid of the male. Soon both parents die. (Other closely related species, the steelhead trout and the Atlantic salmon, show the migratory sequence but live to spawn on several repeated occasions.) 
Noteworthy in this sequence is the interdependence of genetic inevitabilities, such as the digestive-tract atrophy, and individually learned memories. In some very real sense the salmon must learn and remember a unique set of geographical coordinates and features. Its return from the ocean to the mouth of the main stream of its home drainage requires not only the capacity to use navigational techniques (it is probable that the salmon navigates by sun angles) but also a memory of these bearings in relation to particular sequences of temperatures, currents, salinities, odors, and other environmental features in order to return successfully to its spawning ground. The complex interweaving of genotypic and phenotypic factors that is seen in patterns of instinctive behavior does not seem completely different
from the types of variables underlying any physiological drive state. Nor is it beyond the realm of possibility that even the most complex and uniquely human motivational conditions may follow the same sort of pattern, in which cultural, individual, and genetic variables interact to produce resultants that have individual variability within a larger context of species uniformity. Modern concepts of instinctive behavior, then, may lend some support to a "neo-McDougallism," not by making culture and learning less important but by making "instincts" more susceptible to phenotypic variation [see INSTINCT].
Psychological aspects of motivation
Turning from the biological material we have been considering to the type of thinking and writing being done by the majority of experimental psychologists, we move from a search for organic substrates to an exercise in theoretically guided research, in which motivational concepts are treated as intervening variables and hypothetical constructs. These constructs are used as mediators between the observable and controllable aspects of stimulation and response, along with such nonmotivational constructs as habit.
The central issue in the theoretical psychology of motivation has been the relationship of motivational variables to those of learning. In what Hunt (1963) has called the traditionally dominant conceptual scheme, behavior is thought of as starting with general random activity, instigated by drive; the latter may be equivalent to painful or uncomfortable internal stimulation consequent to tissue needs. A behavior sequence, or cycle, is terminated when the drive is reduced by the animal coming into commerce with circumstances that terminate the uncomfortable internal stimulation. If, as is usually the case, the tissue need is recurrent, the animal learns quicker and more effective techniques for drive reduction. Thus, in its final form behavior becomes some sort of joint function of drive and habit.
Important issues grow out of this simple basic conceptual scheme. Most influential in experimental psychology has probably been the treatment given to these problems by Hull (1943), in which drive reduction is viewed as a necessary condition for habit acquisition and in which performance is a joint function of habit as a directional component and drive as an energizing component. For well over a decade, almost every study reported in the experimental journals seemed oriented toward issues raised by this formulation or its obvious alternatives. Despite all the careful experimental work, however, it is still impossible to
arrive at any simple specification of the role of motivation in learning. Nor, unfortunately, is it possible to make a clear decision about the correctness of the traditional formulation. There is, for example, enough experimental evidence to lead us to suspect that habits may be acquired quite independently (or even in the absence) of any concurrent drive state; that, while temporal contiguity of an action and drive reduction may facilitate learning, the role of drive reduction may be more apparent than real; that intensity of drive may be related to efficiency of performance by some sort of an inflected function, very high drive states being as detrimental to performance as are very weak drive levels; that decisions between alternative habit possibilities may be a function of perceptual structuring independent of the relative strength of the alternative habits or of the nature of the drive state; that quality and quantity of an incentive may be more important in determining acquisition or performance than any of the detectable properties of concurrent drive states; and that the intensity of a drive state under which a habit is acquired may be a determiner of the strength of the habit, as measured in later tests of retention [see the biography of HULL].
An over-all negative conclusion relevant to the problem of motivation, however, may be stated with some confidence: throughout the range of mammalian species explored (unfortunately almost exclusively rat, cat, dog, monkey, and man) there are many sources for the energizing of behavior that are not easily and directly related to the needs that arise from regulatory processes. Space does not permit a thorough review of the facts or theoretical proposals concerning motivational sources of a nonregulatory nature, although Hunt provides an excellent summary (1963).
One of the principal features of many "intrinsic" motivational proposals is their emphasis on the role of cognitive processes arising from one form or another of incongruity, either with or without the added assumption that the cognitive process is accompanied by or stimulates affective, or emotional, reactions. Thus, Montgomery (1954) and Berlyne (1960) have adduced evidence that behavior may be instigated by unfamiliar stimulus situations which arouse "curiosity" or "exploratory drive." Festinger (1957) has studied the motivational effects of what he terms cognitive dissonance, which refers to uncertain, unfamiliar, or unexpected relationships between stimulus elements and internally stored memories, beliefs, attitudes, etc. While not all of the "nonphysiological" theorists have made use of the concept of activation as the underlying energizing mechanism, Hebb (1955), Malmo (1959), Duffy (1962), and Hunt (1963) have each in his own way argued for the importance of the nonspecific projection system as the neural mediator of intrinsic motivation [see STIMULATION DRIVES].
The cognitive theories represent a departure from the classical formulation for the development of motives unrelated to primary needs. The older point of view maintained that drives could be acquired by the familiar process of conditioning and thus were derivable from primary drives. An earnest search for evidence of acquired drive and secondary reinforcement has been only partially successful. Recent summaries by Mowrer (1960) and by Brown (1961) discuss much of the evidence. Russian workers have reported a great many experiments which demonstrate that almost any internal "automatic" process, such as bile secretion, urine secretion, or gastric acidity, can be brought under the control of environmental stimulation by applying the methods of classical conditioning (see Razran 1961). The significance of these findings for the problem of higher-order motivational processes is potentially great.
What does the experimental psychology of motivation have to contribute to the social scientist? In the present writer's opinion, the strongest developments in motivation research of the past twenty years have been in the basic underlying physiological processes. Not only have mechanisms been specified for individual regulatory states, but an outline, at least of expectation, for more general somatic integrating processes underlying complex perceptual and socially oriented behavior has been achieved. Socialized man, unique as he is, works within limits set by his anatomical characteristics, and a more precise prediction of his behavior may well emerge when his properties as a physiological system are assimilated with his properties as a social being.
LAWRENCE I. O'KELLY
[Directly related are the entries DRIVES; INSTINCT; LEARNING, article on REINFORCEMENT; STIMULATION DRIVES. Other relevant material may be found in EMOTION; HOMEOSTASIS; LEARNING; NERVOUS SYSTEM; PAIN; PERSONALITY: CONTEMPORARY VIEWPOINTS; and in the biographies of CANNON and MCDOUGALL.]
BIBLIOGRAPHY
ANDERSSON, B.; and McCANN, S. M. 1956 The Effects of Hypothalamic Lesions on the Water Intake of the Dog. Acta physiologica scandinavica 35:312-320.
BERLYNE, D. E. 1960 Conflict, Arousal, and Curiosity. New York: McGraw-Hill.
BRADY, JOSEPH V. et al. 1957 The Effect of Food and Water Deprivation Upon Intracranial Self-stimulation. Journal of Comparative and Physiological Psychology 50:134-137.
BROWN, JUDSON S. 1961 The Motivation of Behavior. New York: McGraw-Hill.
DASHIELL, JOHN F. 1928 Fundamentals of Objective Psychology. Boston: Houghton Mifflin.
DUFFY, ELIZABETH 1962 Activation and Behavior. New York: Wiley.
FESTINGER, LEON 1957 A Theory of Cognitive Dissonance. Evanston, Ill.: Row, Peterson.
HEBB, DONALD O. 1955 Drives and the C.N.S. (Conceptual Nervous System). Psychological Review 62:243-254.
HULL, CLARK L. 1943 Principles of Behavior: An Introduction to Behavior Theory. New York: Appleton.
HUNT, J. McV. 1963 Motivation Inherent in Information Processing and Action. Pages 35-94 in O. J. Harvey (editor), Motivation and Social Interaction. New York: Ronald.
MAGOUN, HORACE W. 1950 Caudal and Cephalic Influences of the Brain Stem Reticular Formation. Physiological Reviews 30:459-474.
MAGOUN, HORACE W. (1958) 1963 The Waking Brain. 2d ed. Springfield, Ill.: Thomas.
MALMO, ROBERT B. 1959 Activation: A Neuropsychological Dimension. Psychological Review 66:367-386.
MONTGOMERY, K. C. 1954 The Role of Exploratory Drive in Learning. Journal of Comparative and Physiological Psychology 47:60-64.
MORGAN, CLIFFORD T. (1943) 1965 Physiological Psychology. 3d ed. New York: McGraw-Hill.
MORUZZI, G.; and MAGOUN, HORACE W. 1949 Brain Stem Reticular Formation and Activation of the E.E.G. Electroencephalography and Clinical Neurophysiology 1:455-473.
MOWRER, ORVAL H. 1960 Learning Theory and Behavior. New York: Wiley.
OLDS, J. 1962 Hypothalamic Substrates of Reward. Physiological Reviews 42:554-604.
RAZRAN, GREGORY 1961 The Observable Unconscious and the Inferable Conscious in Current Soviet Psychophysiology: Interoceptive Conditioning, Semantic Conditioning, and the Orienting Reflex. Psychological Review 68:81-147.
II HUMAN MOTIVATION
The language of motivation is a workaday device for all of us in our social world. We speak of aims, purposes, desires, wants, needs, and compulsions in others and use the same language in testifying about ourselves. The language is descriptive, unqualified, contradictory, and misleading. It manifestly will not do for science, and yet, in a pragmatic fashion, we get it to work much of the time in our daily lives. For better or for worse, it has been the departure point for the development of scientific statements about human motivation.
From the outset, systematic writing about human motivation has had to accommodate the fact that our subjective sense of intention is an unreliable index of our behavior. Many behaviors show intentional organization which may be successfully identified by the observer when the behaving person himself cannot report or infer the intention. Efforts to cope with this feature of human motivation have led to a wide range of strategies of theorizing which, in turn, have stimulated rather distinctive styles of research tactics. One result of this state of affairs is that there is not yet any general theory of human motivation, nor does it seem likely that there will be one for quite some time. Let the reader thus be prepared for a certain amount of surveying here, with a special effort to mark numerous reference signs pointing to those sizable nexuses of literature which must be pursued in depth. (For a more extended survey, see Murphy 1954.)
Textbook treatments of social motivation from various viewpoints may be found in recent texts by Atkinson (1964), Brown (1961), and Cofer and Appley (1964). Atkinson provides an excellent historical review of the manner in which the framing of motivational questions has evolved and suggests an essentially cognitive resolution; Brown's book treats the topic from the point of view of Hullian drive theory, with its resultant absorption of motivational questions into the analysis of habit systems; while Cofer and Appley give an exhaustive and eclectic summary of the motivational literature, culminating in the suggestion that research will be best guided by an "equilibration" model focusing on the anticipation and/or sensitization invigoration mechanisms. In these modern treatments of motivation, the fact of socialization is acknowledged but not given any special status beyond that given other sources of stimulus input. The same is true of the response concept, in which no qualitative distinction is made between subjective report and observed behavior.
The effect of this sort of theorizing is to place the burden for distinguishing social classes of stimuli and responses upon spatial location, timing, intensity, association, and complexity. The power of such an approach lies in its reductionist implications, since the observer must give up his "area" terms, such as love, anxiety, ambition, etc., in favor of a step-by-step analysis of the motivated sequence. Such is the approach of Ford and Beach (1951) in describing the pre-conditions, body states, arousing stimuli, and preparatory, consummatory, and withdrawal movements which characterize sexual behavior across species, including man. The limitations of the reductionist approach have been obvious for decades. McDougall (1908) warned against them and tried to provide an alternative that would preserve the value of social motivation in our common vocabulary. Contemporary
writers have also pointed up the severe limitations of reducing the study of motivation to those behavioral sequences which focus on action "in order to" at the expense of action for its own sake of "being." Gordon Allport (1964) has reiterated his often expressed evaluation of theory and research unenlightened by a proper degree of eclecticism. When we add to these considerations the problems posed by the desire of many to write a truly social psychology of motivation—for example, Floyd Allport and Kurt Lewin—we must be prepared to find the literature of the field a disordered array of constructs, theories, methods, empirical findings, and research programs.
There is a sense in which constructs and theories are answers to questions posed by observation. What are the origins of human motives? How do motives develop? What are the motives of men? How do motives affect behavior and experience? By organizing the remainder of this article around these questions, we will be surveying the literature on human social motivation.
The origins of motives
Nearly all serious observers of human behavior have had to frame a statement about the sources and wellsprings of motivated behavior. The early works of McDougall (1908), Freud (1915), and Thorndike (1927) use extensions of the philosophical discussions of hedonism and the role played by the affective dimensions of experience. The general notion is that those behaviors which result in changes in affect soon take on directional qualities, while those which have no observable affective components are not properly called motivational. However, both the Freudian postulation of unconscious affects and the difficulties of objective measurement of affects soon led to a willingness to assert that basic motivational tendencies may emerge as a natural component of behavior in the normal course of maturational development (e.g., White 1959).
Gordon Allport has extended this position by postulating that new motives may develop from old by becoming "functionally autonomous"; that is, early motives produce a profusion of new experiences which transform and redirect them. Jung (1932-1936) extended the maturational change through the life span well beyond the middle years and asserted that new motives continue to appear late in life.
Learning theorists have progressively shown more interest in objective determinants of behavior. The two major research programs have been those of Clark Hull and B. F. Skinner [see HULL; LEARNING, article on INSTRUMENTAL LEARNING]. By emphasizing the role of response consequences ("reinforcers") in learning and the directing influence of stimuli associated with reinforcement, they reduced the motivational bases of behavior to those primary bodily conditions which drive the organism to a sufficient level of arousal to support learning. Thus Brown (1961) argues that all social motivation is based upon primary drive systems that have been elaborated by reinforcement into secondary systems.
The effect of these formulations has been to focus attention on the definition of primary systems. It is an empirical fact that hunger, thirst, pain, and affective arousal surrounding those body systems eventually integrated into adult sexuality have proved easiest to observe and manipulate. But other researchers, such as Sears, Maccoby, and Levin (1957), have demonstrated the feasibility of empirical studies of those systems devoted to cognition, mastery, empathy, and identification of value orientations. These "ego functions," as they are called, appear to be equally primary in driving the organism. Indeed, the difficulties of specifying the attributes of primary drive systems lend force to the models emerging in the discussions between psychologists and ethologists about the proper method of study of emergent patterns of behavior. Here we are warned against placing too much value on "area" terms such as "drive," "primary," etc., in favor of providing a closely specified, objective description of the sequence of events, both organismic and environmental, which contribute to the appearance of a behavior sequence (Bindra 1959). As this advice is more widely adopted, it appears likely that the origin of social motivation will be described as some unique arrangement of determinants known to characterize social motivation throughout the life span.
The development of social motivation
The earliest comprehensive statement of motivational development is Freud's theory of psychosexual stages (1932), wherein intense affects of pleasure and distress progressively focus on the emerging body functions of ingestion, elimination, and orgasm, as well as on fantasied castration threat. These maturational stages have been further supplemented by Erikson's "epigenetic ages" (1950), which emphasize psychosocial stages of development of trust, autonomy, initiative, industry, identity, intimacy, generativity, and ego integrity. Jung also conceives of psychological stages extending through the life span. The social factor in each of these schemes lies in the association of gratification or fear with other persons, acting within the area of interest to the developing human being. Since the focus is on the growing person, the
"other" tends to be treated as an object or agent, and the sense of reciprocity found in social psychology is absent.
Also absent from the above formulations is a closely reasoned theory of learning. McClelland (1951) has presented a discussion of the importance of the first years of life in the formation of motives which points toward the objective analysis of those conditions of early environmental input, autonomic conditioning, undeveloped cognitive discrimination, absence of symbolic control, and failure of extinction due to the unreproducibility of the original learning situations.
Recently more precise statements of early learning of social motives have been set out by Staats and Staats (1963), Bijou and Baer (1961), and Bandura and Walters (1963). The first two pairs of authors use recent developments in Skinnerian analysis of behavior to effect an exposition of the emergence of directed behavior according to the pattern of classical conditioning of "respondents," reinforcement schedules of "operants," use of "discriminant" stimuli, and eventual symbolization of such stimuli. From this point of view, motives appear because some stimuli and reinforcers are more common than others and are easier to discriminate and of greater importance to society.
Important issues remain untouched in this analysis. The defining attributes of reinforcers, beyond their capacity to reinforce, go unanalyzed, thus placing a heavy environmental emphasis on the theory and separating it from the research discussed in the previous section on origins of motivation. The loss of objectivity which occurs when the child masters sufficient language to permit the chief dynamics of reinforcement and discrimination to take place as thought and decision points to the need for a theory of language, and the Skinnerian efforts in this direction have suffered heavy criticism as being too simplistic.
Finally, the value of these theories in generating research on the development of motives remains to be seen. As yet the presentations are descriptive and largely speculative, being reminiscent of a previous effort by E. R. Guthrie [see GUTHRIE]. Judging by the capacity of such theory to stimulate research and by some of the preliminary efforts, considerable research will be produced. Bandura and Walters (1963) write from the sociobehavioristic viewpoint and present a considerable array of research findings in support of their theorizing. They emphasize the importance of social imitation and vicarious reinforcement, that is, change in behavior through observing the reinforcement experience of another. By combining the effects of imitative experience with direct social
reinforcement, they show that social learning may involve sudden "mastery" of whole patterns of behavior in brief periods of time. They further theorize that the establishment of behavior patterns of aggression, dependence, sexual behavior, and self-control follows directly from patterns of reinforcement and stimulus generalization. However, the inhibition of these behaviors by socially acceptable alternative behaviors requires a combination of reinforcement withdrawal, modeling of alternative responses by others, and perhaps cessation of punishment following a restitutive or prosocial response. Punishment alone is said to inhibit the expression of the response in the presence of the punishing agent, but nothing more.
This monograph is an excellent example of the behaviorist's art of analyzing a behavioral sequence into those components of stimuli, responses, and environmental events for which reasonable estimates of relationship can be made. There is the usual behavioristic sense of circumvention of the private, subjective interior of human experience in this writing, although Bandura and Walters devote considerable attention to learned verbal responses. "In contrast, a child may learn to criticize himself for transgression because self-criticism proved a successful means of securing the reinstatement of his parents' affection and approval. In this case, the child's behavior parallels that of an animal who learns to press a mildly charged lever in order to obtain food" (1963, pp. 186-187). The subjective sense of conflict, discrimination, interpretation, and decision—to say nothing of aspiration and purpose—continues to await an adequate theory of reinforcement. [See IMITATION.]
Standing in sharp contrast to the behavioristic approach is the work of McClelland in The Achieving Society (1961).
He, too, attempts to trace the development of motivation, in this case the achievement motive, through an interlocking set of studies designed to focus on the value and meaning of various reinforcement situations as they impinge on the child. The emphasis is on the way parents interpret problem, work, and play situations to the child; on the problem-solving strategies of aspiration and effort which children adopt (Heckhausen 1963); on the effects of such experience as reflected in fantasy, self-evaluation, and choice of long-term interests; and on the eventual appearance in the adult personality of a coherent motivational system that continues to affect decision processes, performance characteristics, and belief systems. Such a program suffers from confusion of definitions, argument by analogy, numerous inadequate controls, and poorly defined construct validity. Its value lies in its holding close to the phenomenal world of the subject, as experienced, in the hope of laboriously introducing those methods uniquely required for the proper study of human motivation. [See ACHIEVEMENT MOTIVATION.]
The motives of men
Asking for a taxonomy of motives implies some sort of list. Thus it was that early writers on the subject (for example, Jeremy Bentham) felt that the naturalist's approach to motivational phenomena would lead to the proper definition of the subject. However, the proliferation of "instinct" theories, with their endless lists of motives, combinations, and hierarchies, eventually led to the discrediting (see Bernard 1926) of the work of men like McDougall (1908) and Troland (1928) and to the suppression of the question, What are the motives of men?
In 1938 Henry Murray reopened the question with the publication of Explorations in Personality. This effort to re-establish the importance of the taxonomic approach to motives and situations seemed to renew interest in programs of research systematically directed at the study of single motive systems. The list of publications devoted to such study is growing steadily.
Generally speaking, the research strategy for the study of motive systems consists of identifying a reasonably finite set of behaviors designated by common-sense language under a single term, for example, affiliation. Efforts are then made to provide adequate measures of both the behavior—that is, an affiliative response—and the disposition so to respond—that is, the need for affiliation.
Once the measure of the motive is established, a series of studies is begun to determine the motive's sensitivity to environmental and social arousal, its role in personality dynamics, and its effects on other specific behavior systems such as perception, cognitive processes, or learning sequences—and to abstract from all of these findings some general statements of motivational processes. Inevitably such research produces enough anomalies to require revision of the original conception of the motive construct itself, the set of behaviors initially said to define it, and some of the assumptions made in its measurement. The studies discussed below illustrate these conditions. Since truly programmatic research is still not common in American social science and psychology, it should not be surprising to find that Murray's mapping of personality domains failed to guide motivational search. Broadly speaking, the programmatic literature does divide into "primary"
systems of sexual behavior and anxiety-driven behavior; "secondary" systems of curiosity, competence, and dependency; and acculturated systems of authoritarianism, affiliation, approval, ingratiation, conformity, achievement, and power. For each of the terms just mentioned, at least one volume or working paper has been published by researchers working continuously on the problem area with coherent methods and purposes.
Certainly the literature on other motivational systems is quite large but fragmented to the point of defying integration. Frequently the study of a particular behavior pattern is accompanied by the invocation of the "need for " phrasing (Cofer and Appley index the need for identity, dreaming, rest, satisfaction, security, sex, and sleep) with no effort made to give the motive construct an independent definition and measurement. It is this practice which has led many writers to abandon the motive construct as unfruitful and redundant (e.g., Rogers 1963; Jones 1956-1962, esp. volume 8; Kelly 1962). However, the more naturalistic, descriptive analysis of motive systems continues to illuminate the nature of human experience in social situations.
Current studies of motivation
Sex. Given the extraordinary preoccupation of psychological studies with sexual behavior, we might expect to cite several comprehensive works on human sexual motivation. But in fact, the recent Encyclopedia of Sexual Behavior (Ellis & Abarbanel 1961) displays a wide diversity of investigation from every conceivable point of interest without providing a clear picture of sexuality as a motivational process. The early comparative study by Ford and Beach (1951) gave some hint of how such investigation might proceed, but we have only the Kinsey studies (Kinsey et al. 1948; 1953), exploratory reports by Maslow (1962), and the more recent intensive studies by Money, Hooker, and Masters (Money 1965) to show the beginning of an assessment of sexual capacities, development, practice, and experience in humans. [See SEXUAL BEHAVIOR.]
Anxiety. The literature on anxiety undoubtedly exceeds that on any other motivational topic. It may be divided into studies of physiological processes, case studies, field studies, and laboratory studies of the effects of anxiety on behavior and experience, assessments of therapeutic procedures for its relief, and theoretical statements on its origin, nature, and role in personality functions. Psychopathology, work loss and impairment, much of the thematic content of contemporary works of art, and the focus of social commentary in both popular and scholarly writing all give testimony to the presence of anxiety in human affairs. Again, however, as with sex, we find a discursive literature remarkably deficient in programmatic intent and lacking in coherence. Hoch's assertions of 1950 remain true:
Today we know a great deal about where and when anxiety occurs, but we are still quite hazy as to how it originates and even what purpose it serves. . . . Some think that anxiety is secondary to an intraorganismic or interorganismic imbalance, being a symptom of a disturbed homeostasis in the organism due to conflicting drives within the individual and the environment; others support the point of view that anxiety itself is the cause of the disturbances we see in most neurotic and in some psychotic manifestations. (Hoch & Zubin 1950, p. 105)
The physiological mechanisms are outlined in the work of Selye (1956) on the "general adaptation syndrome," in Wolff (1953), and in papers delivered at the Symposium on Stress, held in 1953 by the National Research Council and Walter Reed Army Medical Center. The Funkenstein, King, and Drolette (1957) experimental studies of induced stress as it affects physiological and psychological indices reveal the complexity of individual mastery of stressor effects and their attendant anxiety. Janis' strategies (1958) for testing specific hypotheses of anxiety control in preoperative and postoperative patients stand as an excellent example of the careful field studies that are so badly needed. [See ANXIETY; STRESS.]
Aggression. An excellent example of the successful researching of a motivational process is found in Berkowitz' Aggression. By integrating his own research with the large body of material available, he was able to conclude:
. . . the habitually hostile person is someone who has developed a particular attitude toward large segments of the world about him.
He has learned to interpret (or categorize) a wide variety of situations and/or people as threatening or otherwise frustrating to him. Anger is aroused when these interpretations are made, and the presence of relevant cues—stimuli associated with the frustrating events—then evokes the aggressive behavior. In many instances the anger seems to become "short-circuited" with continued repetition of the sequence so that the initial thought responses alone elicit hostile behavior. (1962, pp. 258-259)
Thus a motive component surrounds the experience of threat, while the notion of latent aggression or need for aggression is abandoned in favor of a trait conception of more or less consistent aggressive reactions. However, Berkowitz clearly states that enduring motives may conflict with or support the aggressive response to threat as well as lie at
the seat of the developmental course which leads to the aggressive personality pattern.
Berkowitz' findings probably have great generality for other motivational systems. By tracing the relative weights of the various sources of variance as they are found in capacity for emotional arousal, constitutional capability, early models for action and thought, social settings of support or inhibition, and the structure of interiorized moral standards, he has doubtless identified the basic sources of many important motivational systems. Not that the scheme is yet complete. Notably absent from the work is a treatment of the middle and late life changes which occur presumably from self-education, shifts in ideology, and a steadily lengthening course of experience. This deficit is a common one in the works we are reviewing. [See AGGRESSION.]
Ludic behavior. "Ludic behavior consists in large measure of what we are calling perceptual and intellectual activities—seeking out particular kinds of external stimulation, imagery, and thought" (Berlyne 1960, p. 5). There is a growing body of literature concerned with exploratory behavior, curiosity, manipulation, attention, and epistemic behavior. It is paralleled in the clinical literature by an increased emphasis on the analysis of ego functions. Berlyne's works (1960; 1965) constitute an impressive review of the studies of animals and men engaged in ludic behaviors. He suggests that exploration may be released by some specific stimulus event in the situation or may emerge from an ultrastable stimulus situation in an apparent effort to create "diversive" stimulation. Given these basic motivational dispositions, the processes of socialization, reinforcement, etc.
may then produce more stable response patterns which are placed in the service of conflict reduction; these in turn lead to generalized epistemic behaviors designed to provide information and understanding suitable for adjustment to a wide range of choice and conflict situations. Thus ludic behaviors are usually found concurrently with the activation of other motivational systems, but they are distinct processes in their own right and not merely variants of anxiety, aroused drive states, etc. Thus far this research has not attempted to provide standardized measures of individual differences in ludic motivation. [See CREATIVITY; STIMULATION DRIVES.]
Affiliation. The complexity of human social attachments naturally leads to attempts to distinguish between the qualities of human association. The voluminous clinical and psychoanalytic literature on psychosexual development has generated numerous hypotheses about the sources of attraction, dependence, love, and identity between persons. For several years Sears and his associates have been studying the development of affiliative tendencies in children (Sears et al. 1957; Sears 1963). "For the child, the upshot of this infantile experience is that a certain number of operant responses become firmly established to the various instigators that have been commonly associated with primary gratifications or reinforcing stimuli. The child learns to 'ask' for the mother's reciprocal behavior. These asking movements are the dependency acts whose frequency and intensity we use as a measure of the dependency trait (or action system)" (Sears 1963, p. 31).
It is the appearance, maintenance, growth, and elaboration of these dependency acts that concern Sears, and his studies demonstrate the complexities of tracing these processes. Thus for the sample of four-year-olds for whom data on early infancy was available, the prediction of negative or positive attention-seeking, touching or holding, being near, and seeking reassurance proved to differ for the sexes, with the girls' patterns related to level of maternal care, achievement demands, and sex anxiety for the father. Maternal coldness, slackness of standards, and neglect, without any real permissiveness, and paternal general nonpermissiveness—especially about sex—were related to boys' dependency (Sears 1963, p. 63). Sears is willing to refer to these patterns as motivational systems, but he makes it clear that the sheer complexity of variables requires more precise definition than a list of needs or motives can provide. It is perhaps for this reason that individual difference measures of the dependency disposition are not reported.
Shipley and Veroff (1952) have established a reliable measure of need for affiliation (n Aff), using the modified Thematic Apperception Test procedure of McClelland.
For college student populations in particular, this measure has shown predicted positive relationship to suggestions for conformity (Walker & Heyns 1962), negative subjective reactions to rejection (Hardy 1957), and effort on achievement tasks when they are instrumental to social approval (French 1958), to cite a few salient findings. These studies have been primarily aimed at establishing the construct validity of the n Aff measure and do not constitute a comprehensive study of affiliative behavior. Schachter (1959) set out to study the conditions which cause variation in affiliative action. It has been demonstrated that affiliative tendencies increase with increasing anxiety and hunger and that, for anxiety, ordinal position of birth is an effective discriminator of the magnitude of the affiliative tendency. The over-all findings warrant
the conclusion that affiliative tendencies are a manifestation of needs for anxiety reduction and self-evaluation (Schachter 1959, p. 132). Here we have an example of motive assignment from the observer's interpretation of the situation, supported by subjective report of the subjects.
Approval and ingratiation. A comprehensive program of research on affiliative behavior is found in The Approval Motive by Douglas P. Crowne and David Marlowe (1964).
Starting with an interest in a measure of individual differences in the social-desirability response set to personality inventory items, . . . we directed our search toward the goals and expectations that would impel one to evaluate himself in terms conditioned by the acceptance of others. To do so required us to postulate a motivational state [the approval motive], reflected in test-taking behavior [the Marlowe-Crowne Social Desirability Scale (MC)], and to seek its correlates in behaviors less harassed by the confusions of personality tests. Our findings have been confirmative, although in the process a major alteration of the concept of the approval motive—the defensiveness-and-vulnerable-self-esteem hypothesis—was necessary to account for some unanticipated and initially paradoxical results. (1964, p. 206)
In this case the authors did find a significant correlation (+.55) between the projective n Aff score and the MC score, whose high scorers are
. . . more conforming, cautious, and persuasible, and [whose] behavior is more normatively anchored. . . . The greater amenability to social influence of persons who characterize themselves in very desirable terms is seen in (a) the favorability of their attitudes toward an extremely dull and boring task; (b) their greater verbal conditionability, both directly and vicariously; (c) social conformity; (d) a tendency to give popular word associations; (e) the cautious setting of goals in a risk-taking situation; (f) their greater reactivity . . . in a . . . perceptual-defense task; and (g) susceptibility to persuasion. (1964, p. 190)
These authors chose to keep the concept "approval motive" while finding it useful, as did Schachter, to postulate underlying motivation to maintain and preserve self-esteem. Thus we see the manner in which the hierarchy of social motives must be uncovered by a coherent program of research. The same experience is reported by Jones (1964). He reports a series of studies using instructional and situational manipulation designed to reveal the extent and variety of ingratiation behaviors as well as effects on attitudes, beliefs, and perceptions, especially as they focus on self-esteem. Given instructional or situational sets to enhance ingratiation behavior, subjects (a) emphasize their
positive attributes over their weaknesses, (b) move toward greater public agreement with a target person's stated opinions, and (c) show an adaptive capacity for adjusting these actions to the status, awareness of the target of the subject's intentions, and requirements of the mutual task. Each of these monographs approaches the affiliative process in particular response domains, demonstrates some of the determinants of the behaviors, and finds it useful to infer a generalized disposition having motivational properties. The data suggest that at the center of affiliative behavior lies a concern for self-confirmation, enhancement, esteem, or maintenance, which itself may imply a more basic personality disposition to stabilize, order, and control changes in one's position in the world.
Achievement. A considerable part of one's lifetime is devoted to the performance of tasks whose outcomes provide important consequences for survival, well-being, social rewards, and self-esteem. Clearly understood standards of performance exist for these tasks, and to match or surpass the norm is considered an achievement. Murray gave the following definition of need Achievement (n Ach): "To accomplish something difficult. To master, manipulate, or organize physical objects, human beings, or ideas. To do this as rapidly and as independently as possible. To overcome obstacles and attain a high standard. To excel one's self. To rival and surpass others. To increase self-regard by the successful exercise of talent" (Explorations in Personality . . . 1938, p. 164). In 1953 McClelland, Atkinson, Clark, and Lowell published The Achievement Motive, which presented a projective measure of n Ach (defined now as concern with success in competition with some standard of excellence) and a series of studies designed to establish the construct validity of the measure.
This work was followed by Atkinson's Motives in Fantasy, Action, and Society (1958), which contains further studies of n Ach as well as new projective scoring systems for n Sex, n Power, and n Aff. McClelland's Achieving Society (1961) and Heckhausen (1963) have provided still more research and theory about the achievement motive and its avoidance opposite, fear of failure. The continuously growing body of literature is the subject of reviews by Heckhausen (1965) and Birney (1966) and a collection of papers edited by Atkinson and Feather (1966). The body of knowledge growing out of this sustained research effort has slowly taken the following shape. The child's early efforts to master his world provide the parents with the opportunity to reward independent, self-propelled actions differentially. If such rewards come early in life and are
accompanied by maternal praise and pacing and supportive paternal endorsement, task situations become the cue for realistic aspirations, capacity for delayed gratification, fantasies of success, and the desire for personal responsibility. These preferences lead to realistic occupational aspirations emphasizing moderate risks and personal freedom of decision. Authoritarian work situations are avoided and resisted, and these may include highly demanding academic situations. Vocational careers are marked by upward mobility, preference for moderate-risk business and managerial situations, and concentration on the instrumentalities of working situations. It might be pointed out that this pattern of entrepreneurial features was not initially anticipated by the researchers, being only slowly understood as numerous studies showed that high task achievement did not necessarily denote a high need for achievement in most subjects. By focusing on the motivation measure, rather than on achieving behavior, the form of the motivational system has emerged. This review of systematic studies of human motive systems illustrates the current phase of research now being pursued by persons interested in the identification, measurement, and functional properties of important social motives. Whether Murray's list of motives proves prophetic remains to be seen. As more of these systems are understood, the opportunity for writing a general theory of human motivation will arise. Whether that theory will resemble the many restrictive models of action and behavior also remains to be seen. At the present it appears that human motives play their major role in sensitizing persons to environmental possibilities, directing their choice among incentives, contributing to both their degree of involvement in the situation and their phenomenal sense of it, and ordering the sense of closure and history surrounding the past sequence of events. 
So long as these aspects of life remain denotable, the motive construct will retain its usefulness. ROBERT C. BIRNEY [Directly related are the entries ATTITUDES; DRIVES, article on ACQUIRED DRIVES; STIMULATION DRIVES. Other relevant material may be found in ACHIEVEMENT MOTIVATION; AFFECTION; AGGRESSION; ANXIETY; IMITATION; LEARNING, article on REINFORCEMENT; PERSONALITY, article on PERSONALITY DEVELOPMENT; PERSONALITY: CONTEMPORARY VIEWPOINTS; PROJECTIVE METHODS, article on THE THEMATIC APPERCEPTION TEST; SEXUAL BEHAVIOR;
SOCIALIZATION; STRESS; and in the biographies of ALLPORT; LEWIN; McDOUGALL.]
MOTIVATION: Human Motivation BIBLIOGRAPHY
ALLPORT, GORDON W. 1964 The Fruits of Eclecticism: Bitter or Sweet? Acta psychologica 23:27-44.
ATKINSON, JOHN W. (editor) 1958 Motives in Fantasy, Action, and Society. Princeton, N.J.: Van Nostrand.
ATKINSON, JOHN W. 1964 An Introduction to Motivation. Princeton, N.J.: Van Nostrand.
ATKINSON, JOHN W.; and FEATHER, NORMAN T. (editors) 1966 A Theory of Achievement Motivation. New York: Wiley.
BANDURA, ALBERT; and WALTERS, R. H. 1963 Social Learning and Personality Development. New York: Holt.
BERKOWITZ, LEONARD 1962 Aggression: A Social Psychological Analysis. New York: McGraw-Hill.
BERLYNE, D. E. 1960 Conflict, Arousal, and Curiosity. New York: McGraw-Hill.
BERLYNE, D. E. 1965 Structure and Direction of Thinking. New York: Wiley.
BERNARD, L. L. 1926 Introduction to Social Psychology. New York: Holt.
BIJOU, SIDNEY W.; and BAER, DONALD M. 1961 Child Development. Volume 1: A Systematic and Empirical Theory. New York: Appleton.
BINDRA, DALBIR 1959 Motivation: A Systematic Reinterpretation. New York: Ronald.
BIRNEY, R. C. 1966 Research on the Achievement Motive. Unpublished manuscript.
BROWN, JUDSON S. 1961 The Motivation of Behavior. New York: McGraw-Hill.
COFER, CHARLES N.; and APPLEY, MORTIMER H. 1964 Motivation: Theory and Research. New York: Wiley.
CROWNE, DOUGLAS P.; and MARLOWE, DAVID 1964 The Approval Motive: Studies in Evaluative Dependence. New York: Wiley.
ELLIS, ALBERT; and ABARBANEL, ALBERT (editors) 1961 The Encyclopedia of Sexual Behavior. 2 vols. New York: Hawthorne.
ERIKSON, ERIK H. (1950) 1964 Childhood and Society. 2d ed., rev. & enl. New York: Norton.
Explorations in Personality: A Clinical and Experimental Study of Fifty Men of College Age, by Henry A. Murray et al. 1938 London and New York: Oxford Univ. Press.
FORD, CLELLAN S.; and BEACH, FRANK A. 1951 Patterns of Sexual Behavior. New York: Harper.
FRENCH, E. G. 1958 Effects of the Interaction of Motivation and Feedback on Task Performance. Pages 400-408 in John W.
Atkinson (editor), Motives in Fantasy, Action, and Society. Princeton, N.J.: Van Nostrand.
FREUD, SIGMUND (1915) 1959 Instincts and Their Vicissitudes. Volume 4, pages 60-83 in Sigmund Freud, Collected Papers. International Psycho-analytic Library, No. 10. New York: Basic Books; London: Hogarth.
FREUD, SIGMUND (1932) 1965 New Introductory Lectures on Psycho-analysis. New York: Norton. → First published as Neue Folge der Vorlesungen zur Einführung in die Psychoanalyse.
FUNKENSTEIN, DANIEL H.; KING, STANLEY H.; and DROLETTE, MARGARET E. 1957 Mastery of Stress. Cambridge, Mass.: Harvard Univ. Press.
HARDY, KENNETH R. 1957 Determinants of Conformity and Attitude Change. Journal of Abnormal and Social Psychology 54:289-294.
HECKHAUSEN, HEINZ 1963 Hoffnung und Furcht in der Leistungsmotivation. Meisenheim am Glan (Germany): Hain.
HECKHAUSEN, HEINZ 1965 Leistungsmotivation. Volume 2, pages 602-702 in Handbuch der Psychologie. Göttingen (Germany): Hogrefe.
HOCH, PAUL H.; and ZUBIN, JOSEPH 1950 Anxiety. New York: Grune.
JANIS, IRVING L. 1958 Psychological Stress: Psychoanalytic and Behavioral Studies of Surgical Patients. New York: Wiley.
JONES, EDWARD 1964 Ingratiation: A Social Psychological Analysis. New York: Appleton.
JONES, MARSHALL R. (editor) 1956-1962 Nebraska Symposium on Motivation. Vols. 4, 8, 10. Lincoln: Univ. of Nebraska Press.
JUNG, CARL G. (1932-1936) 1939 The Integration of the Personality. New York: Farrar. → Originally published in German in the 1932-1936 volumes of Eranos Jahrbuch.
KELLY, GEORGE A. 1962 Europe's Matrix of Decision. Volume 10, pages 83-125 in Marshall R. Jones (editor), Nebraska Symposium on Motivation. Lincoln: Univ. of Nebraska Press.
KINSEY, ALFRED C. et al. 1948 Sexual Behavior in the Human Male. Philadelphia: Saunders.
KINSEY, ALFRED C. et al. 1953 Sexual Behavior in the Human Female. Philadelphia: Saunders.
McCLELLAND, DAVID C. 1951 Personality. New York: Sloane.
McCLELLAND, DAVID C. 1961 The Achieving Society. Princeton, N.J.: Van Nostrand.
McCLELLAND, DAVID C. et al. 1953 The Achievement Motive. New York: Appleton.
McDOUGALL, WILLIAM (1908) 1950 An Introduction to Social Psychology. 30th ed. London: Methuen. → A paperback edition was published in 1960 by Barnes and Noble.
MASLOW, ABRAHAM H. 1962 Toward a Psychology of Being. Princeton, N.J.: Van Nostrand.
MONEY, JOHN (editor) 1965 Sex Research: New Developments. New York: Holt.
MURPHY, GARDNER 1954 Social Motivation. Volume 2, pages 601-633 in Gardner Lindzey (editor), Handbook of Social Psychology. Cambridge, Mass.: Addison-Wesley.
ROGERS, CARL 1963 The Actualizing Tendency in Relation to "Motives" and to Consciousness. Volume 11, pages 1-24 in Marshall R. Jones (editor), Nebraska Symposium on Motivation. Lincoln: Univ. of Nebraska Press.
SCHACHTER, STANLEY 1959 The Psychology of Affiliation: Experimental Studies of the Sources of Gregariousness. Stanford Studies in Psychology, No. 1. Stanford Univ. Press.
SEARS, ROBERT R. 1963 Dependency Motivation. Volume 11, pages 25-64 in Marshall R. Jones (editor), Nebraska Symposium on Motivation. Lincoln: Univ. of Nebraska Press.
SEARS, ROBERT R.; MACCOBY, E. E.; and LEVIN, H. 1957 Patterns of Child Rearing. Evanston, Ill.: Row, Peterson.
SELYE, HANS 1956 The Stress of Life. New York: McGraw-Hill.
SHIPLEY, THOMAS E.; and VEROFF, JOSEPH 1952 A Projective Measure of Need for Affiliation. Journal of Experimental Psychology 43:349-356.
STAATS, ARTHUR W.; and STAATS, CAROLYN K. 1963 Complex Human Behavior. New York: Holt.
THORNDIKE, EDWARD L. 1927 The Law of Effect. American Journal of Psychology 39:212-222.
TROLAND, LEONARD T. 1928 The Fundamentals of Human Motivation. New York: Van Nostrand.
WALKER, EDWARD L.; and HEYNS, ROGER W. 1962 An Anatomy for Conformity. Englewood Cliffs, N.J.: Prentice-Hall.
WHITE, ROBERT W. 1959 Motivation Reconsidered: The Concept of Competence. Psychological Review 66:297-333.
WOLFF, HAROLD G. 1953 Stress and Disease. Springfield, Ill.: Thomas.
MOTOR DEVELOPMENT See SENSORY AND MOTOR DEVELOPMENT.
MOVEMENTS See MILLENARISM; NATIVISM AND REVIVALISM; SOCIAL MOVEMENTS; VOLUNTARY ASSOCIATIONS.
MÜLLER, ADAM HEINRICH
Adam Heinrich Müller (1779-1829), German political economist of the romantic school, was born in Berlin and studied at Berlin and Göttingen. In 1802 he moved to Vienna, where he was an intimate friend of Friedrich von Gentz, a politician and writer associated with Metternich. In 1805 Müller was received into the Roman Catholic church, and as he grew older his ideas were increasingly influenced by Catholic thought. He served from 1806 to 1809 in Dresden as a tutor to Prince Bernhard of Saxe-Weimar. There he was associated with the romantic dramatist Heinrich von Kleist in editing the literary journal Phöbus. His most creative book, Die Elemente der Staatskunst (1809), was based on lectures given at Dresden. He spent the years 1809-1811 in Berlin. Because he opposed the reforms of Stein and Hardenberg, opportunities for public service in Prussia were closed to him, and he returned to Vienna, where in 1813 he entered the Austrian government service. Through Gentz he became acquainted with Metternich, whom he served as an adviser and assistant in various posts. Müller was a leading member of the German romantic school of political economy, composed of several political writers and literary figures affiliated with the early German romantic movement. Among them, in addition to Müller and Gentz, were Carl Ludwig von Haller, Johann Joseph von Görres, and Franz von Baader. In varying degrees these writers opposed the marked rationality, the individualism, and the emphasis on material values characteristic of the political economy of the Enlightenment. Inspired by the integrated social organization of the Middle Ages, they sought to develop a political economy based on an organic
conception of society and, thus, to recapture the "German spirit." All were influenced by the philosophy of Fichte, and Müller and Gentz in particular were influenced by Edmund Burke. Müller published copiously in the fields of political economy and social philosophy and served as a kind of intellectual spokesman for the reactionary forces of the post-Napoleonic period. Müller's economic and political ideas were founded, then, on an organic conception of society. In such a form of society political, economic, religious, moral, and aesthetic elements would be merged indivisibly in the state, which would represent the "mysterious reciprocity of all the relationships of life." The state not only would unite all social elements at any given time but also would be the instrument that binds society together through time and fosters the development of national consciousness, or national spirit. As a result of this conception of society, Müller opposed individual freedom in favor of central authority, he opposed competition in favor of cooperation and reciprocity, and he opposed free trade in favor of a national system of protection. He rejected the classical theory that value is determined by exchange in the market and argued that social as well as private usefulness be considered in determining the value of any good. He rejected also the classical concept of wealth as including only material objects and advanced his own, famous concept of spiritual capital. By this, he meant that the capital of a society includes not only material objects but also intangibles derived from the past, such as the national existence, the traditions of the society, the constitution, the language, the motivations and character of the people, the extant knowledge and technology, and other nonmaterial features of the culture.
Müller regarded money as a creature of the state and the value of money as derived from its role as a link between the individuals of the organic society rather than from its exchange value or its metallic content. Because of his organic theory of society, he was unwilling to isolate the economic aspect of society for study and insisted that society be studied as a comprehensive organic unity. This led him to point out some of the excessive narrowness, materialism, and individualism of classical economics, but it also led to diffuse and muddy analysis and exposition. Müller's influence was never great. Despite the economic and political backwardness of the Germany of his time, the prevailing political trend was liberal, and his political position was distinctly counter to the trend. However, as indicated by the
appreciative comments of Roscher and Hildebrand, he did have some influence on the older historical school of economists, who developed his more significant economic ideas. List, who knew Müller personally, acknowledged indebtedness to him. Various groups of socialists, especially Christian socialists (both Protestant and Catholic), have used his ideas. In recent times, Othmar Spann built his "universalist" system of economics on the foundation of Müller's ideas. The German National Socialists found Müller's ideas congenial and resurrected them from obscurity. Many of Müller's ideas today appear quaint, fuzzy, or dangerously reactionary. Yet he stood in the van of a long line of critics whose work has been useful in countering the abstractness, the radical individualism, and the neglect of social values characteristic of the dominant classical economics. Müller's concept of spiritual capital, of the productive power imbedded in cultural factors as well as in the concrete physical wealth of a society, is being rediscovered in the mid-twentieth century, as economists face the problems of economic growth in the underdeveloped areas of the world.
HOWARD R. BOWEN
[For discussion of the subsequent development of Müller's ideas, see ECONOMIC THOUGHT, article on THE HISTORICAL SCHOOL; and the biographies of HILDEBRAND; LIST; ROSCHER.]
WORKS BY A. H. MÜLLER
(1806) 1920 Vorlesungen über die deutsche Wissenschaft und Literatur. New ed. Edited by Arthur Salz. Munich: Drei Masken.
(1809) 1922 Die Elemente der Staatskunst: Öffentliche Vorlesungen. 2 vols. New ed. Edited by Jakob Baxa. Vienna: Wiener Literarische Anstalt.
(1812a) 1931 Ausgewählte Abhandlungen. New ed. Edited by Jakob Baxa. Jena: Fischer. → First published as Vermischte Schriften über Staat, Philosophie, und Kunst.
1812b Die Theorie der Staatshaushaltung und ihre Fortschritte in Deutschland und England seit Adam Smith. Vienna: Schaumburg.
(1816) 1922 Versuche einer neuen Theorie des Geldes mit besonderer Rücksicht auf Grossbrittannien. Edited by Helene Lieser. Jena: Fischer.
(1817) 1920 Zwölf Reden über die Beredsamkeit und deren Verfall in Deutschland. Edited by Arthur Salz. Munich: Drei Masken.
Gesammelte Schriften. Munich: Franz, 1839.
SUPPLEMENTARY BIBLIOGRAPHY
BAXA, JAKOB 1923 Einführung in die romantische Staatswissenschaft. Jena: Fischer.
BAXA, JAKOB (editor) 1924 Staat und Gesellschaft im Spiegel der deutschen Romantik. Jena: Fischer.
BAXA, JAKOB 1930 Adam Müller: Ein Lebensbild aus den Befreiungskriegen und aus der deutschen Restauration. Jena: Fischer. → Contains a bibliography.
ROLL, ERICH (1938) 1942 A History of Economic Thought. 2d ed., rev. & enl. New York: Prentice-Hall. → See especially pages 154-202 on "Political Economy in Germany."
SPANN, OTHMAR (1911) 1930 The History of Economics. New York: Norton. → First published as Die Haupttheorien der Volkswirtschaftslehre auf lehrgeschichtlicher Grundlage. See especially pages 212-270 on "Reaction and Revolution."
TOKARY-TOKARZEWSKY-KARASZEWICZ, J. VON 1913 Adam H. Müller, Ritter von Nittersdorf als Ökonom, Literat, Philosoph und Kunstkritiker: 1779 bis 1829. Vienna: Gerold.
MÜLLER, GEORG ELIAS
Georg Elias Müller (1850-1934) was one of the leaders of the new experimental psychology when it was being "founded" in Germany, just after the middle of the nineteenth century. If Wilhelm Wundt at Leipzig was the founder and therefore first, Müller perhaps was second. Wundt's Leipzig laboratory was certainly the best, but Müller's at Göttingen was clearly second best. If most of the able students flocked to Wundt, still many important psychologists formed their values with Müller. Helmholtz had enormous influence, but he was a sense physiologist, who presently turned physicist. Fechner, the founder of psychophysics, was at heart a philosopher, with no loyalty to psychology as such. Hering was a physiologist, who influenced many by his thinking and his phenomenology of vision, but he was not quite a psychologist. Müller, throughout his forty years at Göttingen, was known for his clear thinking, his vigorous logic, his insistent polemics, and his indefatigable pursuit of theory and fact in each of his three chosen fields of research: psychophysics, memory and learning, and vision. He did not originate any of these fields, but in each he became the leader for a time. He took over psychophysics from Fechner when the latter died. He developed the experimental attack on learning and memory after the interests of Hermann Ebbinghaus, the pioneer, had moved elsewhere. From Hering he picked up the problems of visual sensation and color theory, and he became one of the three leading figures in that area of investigation, along with Hering and Helmholtz. Müller was born on July 20, 1850, in Grimma, a small town 16 miles from Leipzig, which boasted a thirteenth-century castle. His father was a theologian and a professor of religion at the local royal academy, later becoming rector at another village near Leipzig. The son went first to the Gymnasium at Leipzig and then, at the age of 18, to the university there, to study history and philosophy.
He had been a studious boy, with his thinking directed
toward mysticism by his reading of Goethe, Byron, and Shelley but redirected later, by his discovery of Lessing, to the hard clarity that characterized his maturity. At Leipzig he was inducted into Herbartian philosophy; he then went to Berlin to study history. For two years he worried over the choice between history and philosophy, but when he became a soldier in the Franco-Prussian War, escaping from the worrisome dubieties of academic life, he saw clearly that he preferred philosophy. After the war he went back to Leipzig and then moved on to Göttingen to study with the great R. H. Lotze, in the days when Lotze was actively sponsoring the new scientific psychology and it was being said that all philosophy must be firmly founded upon a knowledge of science. He received his doctorate at the hands of Lotze in 1873 after having presented a psychological thesis on the theory of sensory attention, a basic analysis of this function that was still being cited in books on attention 35 years later. After receiving his degree, Müller became a tutor, first at Rötha, near Leipzig, and then at Berlin. A severe illness caused him to return home, and there, during his convalescence, he became interested in Fechner and psychophysics. He wrote a critical monograph, which he presented when he applied at Göttingen in 1876 to become a Dozent and which was published in 1878 as Zur Grundlegung der Psychophysik. This and his critique the next year of the method of constant stimuli, a paper that contained a table of the well-known Müller weights, established Müller as a worthy successor to Fechner, who was then nearing the end of his eighth decade. In 1880 Müller accepted the chair in philosophy at Czernowitz; but in 1881 Lotze was persuaded to go to Berlin, where he died a few months later, and Müller succeeded him at Göttingen, remaining there for forty years of continuous service.
Lotze had held the chair for 37 years, making a total of three-fourths of a century for the two of them. Actually, Müller's productive life extended to almost six decades, from 1873 to 1930. We shall now consider separately the three lines of endeavor that he promoted so successfully through those many years.
Psychophysics. Müller's 1878 monograph was concerned mostly with a critique of Weber's law, the law of the relation of sensory intensity to its stimulus. In 1889 (we can touch only the high points) he pursued this problem with Friedrich Schumann in an experimental study of the discrimination of weight. With an American, Lillian J. Martin, he published a classic paper (1899) on
how anticipation affects the discrimination of weights, one of the early experimental papers on attitude. In 1903 appeared his elaborate study of psychophysical methods, the study that caused E. B. Titchener to delay the publication of his magnum opus on psychophysics for two years while he made revisions.
Memory and learning. In 1894 Müller and Schumann took up Ebbinghaus' work on learning, standardizing the method of complete mastery and working out rules for the use of the nonsense syllables that Ebbinghaus had invented as material for learning. In 1900 Müller, with Alfons Pilzecker, developed the use of reaction times in the memory method of right associates. Much later Müller, working alone, produced three huge volumes (1911-1917) on memory activity, which included much of his work with Rückle, the mathematical prodigy, and also his analysis of the method of introspection.
Vision. Müller's third line of interest was vision, particularly color vision. His classic papers of 1896 and 1897 contain his revision of Hering's theory of color, in which he eliminated some of the contradiction by assuming that the brain adds a constant gray to all the colors induced by the retina (a cortical gray, as he called it). At this time he also laid down his five "psychophysical axioms," principles of the relation of neural events in the brain to the corresponding events in perception. About twenty-five years later these axioms formed the basis for the gestalt psychologists' theory of isomorphism. In 1930, toward the end of his life, Müller published two large volumes on the psychophysics of color sensations; but these tomes made less of an impression than did his earlier work, because Müller was then 80 years old, nine years past his retirement, and the times were moving away from the patterns of his interests.
Nevertheless it must be noted that not so many years before, in the period from 1909 to 1911, some of the important work on visual perception (work cast in the modes of the new experimental phenomenology) had been produced at Müller's laboratory by three soon-to-be-famous psychologists: E. R. Jaensch, David Katz, and Edgar Rubin. Müller had retired in 1921. In 1923 he published a little book polemicizing against the new gestalt psychology, followed in 1924 by a short outline of general psychology as he then saw it. He died at Göttingen on December 23, 1934, an outstanding figure among the pioneers of the new experimental psychology.
EDWIN G. BORING
[For the historical context of Müller's work, see the biographies of EBBINGHAUS; FECHNER; HELMHOLTZ; HERING; LOTZE; TITCHENER; WEBER, E. H.; WUNDT; for discussion of the subsequent development of Müller's ideas, see FORGETTING; GESTALT THEORY; PSYCHOPHYSICS; VISION, especially the article on COLOR VISION AND COLOR BLINDNESS; and the biographies of KATZ and JAENSCH.]
WORKS BY G. E. MÜLLER
1873 Zur Theorie der sinnlichen Aufmerksamkeit. Leipzig: Edelmann.
1878 Zur Grundlegung der Psychophysik. Berlin: Grieben.
1889 MÜLLER, GEORG E.; and SCHUMANN, FRIEDRICH Über die psychologischen Grundlagen der Vergleichung gehobener Gewichte. Archiv für die gesammte Physiologie 45:37-112.
1894 MÜLLER, GEORG E.; and SCHUMANN, FRIEDRICH Experimentelle Beiträge zur Untersuchung des Gedächtnisses. Zeitschrift für Psychologie und Physiologie der Sinnesorgane 6:81-190, 257-339.
1896-1897 Zur Psychophysik der Gesichtsempfindungen. Zeitschrift für Psychologie und Physiologie der Sinnesorgane 10:1-82, 321-413; 14:1-76, 161-196.
1899 MÜLLER, GEORG E.; and MARTIN, LILLIAN J. Zur Analyse der Unterschiedsempfindlichkeit. Leipzig: Barth.
1900 MÜLLER, GEORG E.; and PILZECKER, ALFONS Experimentelle Beiträge zur Lehre vom Gedächtniss. Zeitschrift für Psychologie, Supplement No. 1.
1903 Die Gesichtspunkte und die Tatsachen der psychophysischen Methodik. Ergebnisse der Physiologie 2, part 2: 267-516.
1911-1917 Analyse der Gedächtnistätigkeit und des Vorstellungsverlaufes. 3 parts. Zeitschrift für Psychologie, Supplements no. 5, 8, 9.
1923 Komplextheorie und Gestalttheorie: Ein Beitrag zur Wahrnehmungspsychologie. Göttingen: Vandenhoeck & Ruprecht.
1924 Abriss der Psychologie. Göttingen: Vandenhoeck & Ruprecht.
1930 Über die Farbenempfindungen: Psychophysische Untersuchungen. 2 parts. Zeitschrift für Psychologie, Supplements no. 17, 18.

SUPPLEMENTARY BIBLIOGRAPHY
BORING, EDWIN G. (1929) 1950 A History of Experimental Psychology. 2d ed. New York: Appleton. -> See especially pages 371-379; and the bibliography on pages 382-383.
BORING, EDWIN G. 1935 Georg Elias Müller: 1850-1934. American Journal of Psychology 47:344-348.
BORING, EDWIN G. 1936 Georg Elias Müller. American Academy of Arts and Sciences, Proceedings 70:558-560.
CLAPARÈDE, EDOUARD 1935 Georg Elias Müller: 1850-1934. Archives de psychologie 25:110-114.
KATZ, DAVID 1935a Georg Elias Müller. Acta psychologica 1:234-240.
KATZ, DAVID 1935b Georg Elias Müller. Psychological Bulletin 32:377-380.
VAN ESSEN, JACOB 1935 G. E. Müller ter gedachtenis. Nederlandsch tijdschrift voor psychologie 3:48-58. -> Contains a bibliography.
WATSON, ROBERT I. 1963 The Great Psychologists: From Aristotle to Freud. Philadelphia: Lippincott. -> See especially pages 269-270, "G. E. Müller."
MÜLLER, JOHANNES

Johannes Müller (1801-1858) is frequently referred to as the father of experimental physiology. While it might be argued that the title belongs more properly to Sir Charles Bell, the fact remains that during the first half of the nineteenth century Müller was the dominant figure in the rapidly developing science of physiology. Through his own researches, particularly on reflex action and on human and animal vision, through his massive Handbuch der Physiologie des Menschen (1834-1840; a translation, Elements of Physiology, appeared 1840-1843), which became the standard reference work for physiologists throughout Europe, and through his pupils he made a lasting impression on the biological sciences; and the doctrine for which he became most famous, the law of specific energies of nerves, continues in modified form to present a challenge.

Müller, the son of a shoemaker, was born in Koblenz. In 1819 he matriculated at the University of Bonn, where he received his medical degree in 1822. After a year of further study in Berlin he was habilitated at Bonn in 1823. Until 1830 he was Privatdozent in anatomy and physiology, at which time he was granted a professorship. In 1833 he was called to the chair of anatomy and physiology at the University of Berlin, which he occupied until his death in 1858. During his career he became prominent in international scientific circles, was an active leader in university affairs (being elected Dekan in 1835 and Universitätsrektor in 1838), and during the political upheaval of 1848 he was head of the "fliegende Korps der Universitätsangehörigen." Among his many pupils the best known are Ernst Brücke, Carl Ludwig, and Emil Du Bois-Reymond, the last of whom succeeded Müller in the Berlin chair. Even better known is Hermann von Helmholtz, who, although not a pupil, was closely associated with Müller as a junior colleague and whose epoch-making contributions to sensory physiology are essentially extensions of Müller's pioneering studies. It is a tribute to Müller's greatness as a teacher that none of his pupils remained strictly faithful to their master's doctrine. In the empiricist-nativist controversy, for instance, Müller was on the nativist side of the argument and Helmholtz became the spokesman for the empiricists; Müller's avowedly vitalist position was vigorously rejected by the younger generation of physiologists.

Müller's stature as a scientist is most evident in the Handbuch, in which he summarized and
evaluated the physiological knowledge of his day, reported much of his own research, and defined problems for further investigation. Although his best-known contributions are in sensory physiology and what would now be called the experimental psychology of sensation and perception, he was interested in every aspect of human and animal physiology and even in the broader philosophical implications of natural science.

The famous law of specific energies was first formulated in the 1826 volume on the comparative physiology of vision, and it was amplified in Book 5 of the Handbuch. Briefly stated, it asserts that the basis of differentiation among sensory qualities is to be found not in the physical processes of the external world or in the receptors but in the condition of the sensory nerves. Our knowledge of the external world is thus an interpretation placed upon centrally aroused and immediately apprehended sensations. This is obviously not a totally new doctrine. The early British empiricist philosophers, notably George Berkeley, had made a similar distinction between sensation and interpretation, but without grounding it in anything more than a speculative physiology. A more direct anticipation is to be found in the independent discoveries by Charles Bell and François Magendie of the structural and functional differences between sensory and motor nerves, the sensory nerves being responsible for sensation and the motor nerves for muscular action. Müller extended the principle by according to each nerve its own unique sensory quality: color to the optic nerve, sound to the acoustic nerve, etc.; and a further refinement is to be found in Helmholtz' hypothesis that an even more specific differentiation exists among the constituent fibers of a given nerve. One of the physiological implications of his principle, which Müller recognized but did not fully explore, is that the ultimate correlates of sensory quality are to be sought not in the nerves themselves but in the specialized structures of the cerebral cortex. Müller's principle thus points towards a more generalized theory of cortical localization.

The Handbuch is a treatise on philosophy and psychology as well as on physiology. For Müller, both physiology and psychology are to be subsumed under a broader philosophy of nature. His philosophical views show the influence of the German metaphysical idealists, but he might be more properly classed as an Aristotelian in his conception of nature, and his approach to science was very close to that of Goethe. Purpose, he believed, is an observable fact of nature, without which the world of natural phenomena is unintelligible. Purpose is revealed in the forms of natural objects and events but emerges as conscious mind only with the differentiation of the specialized structures of the central nervous system, the brain being the special organ of consciousness. Causation in nature may be mechanical, chemical, or organic, the third of these involving a special life force (Lebenskraft) that is not reducible to the first two. A science which limits itself to the first two thus provides an incomplete account of nature. In Müller's philosophy of nature, as in Goethe's, the realm of natural law includes not only the mechanical processes of the physical world but also the phenomena of purposive striving, ideation, and reasoning. Müller was a staunch exponent of the experimental method in science, but, also like Goethe, he insisted that the data of unconstricted observation (unbefangene Beobachtung) are fully as legitimate as are those of the laboratory. In this respect he might be considered one of the forerunners of the phenomenological movement in experimental psychology.

ROBERT B. MACLEOD

[Directly related are the entries NERVOUS SYSTEM, especially the article on STRUCTURE AND FUNCTION OF THE BRAIN; SENSES. Other relevant material is found in PSYCHOLOGY, article on PHYSIOLOGICAL PSYCHOLOGY; and in the biographies of BELL; HELMHOLTZ; LASHLEY. The section of the biography of FREUD that deals with the historical background of his thought is also relevant.]

WORKS BY J. MÜLLER
1826 Zur vergleichenden Physiologie des Gesichtssinnes des Menschen und der Tiere, nebst einem Versuch über die Bewegungen der Augen und über den menschlichen Blick. Leipzig: Cnobloch.
(1826) 1927 Über die phantastischen Gesichtserscheinungen. Leipzig: Barth.
(1834-1840) 1840-1843 Elements of Physiology. 2 vols. 2d ed. London: Taylor & Walton. -> First published as Handbuch der Physiologie des Menschen.

SUPPLEMENTARY BIBLIOGRAPHY
BORING, EDWIN G. (1929) 1950 A History of Experimental Psychology. 2d ed. New York: Appleton.
BORING, EDWIN G. 1942 Sensation and Perception in the History of Experimental Psychology. New York: Appleton.
DRIESCH, HANS (1905) 1922 Geschichte des Vitalismus. 2d ed., rev. & enl. Leipzig: Barth. -> An expansion of the main parts of Der Vitalismus als Geschichte und als Lehre, which was translated into English as The History and Theory of Vitalism and published in 1914 by Macmillan.
HABERLING, WILHELM 1924 Johannes Müller: Das Leben des rheinischen Naturforschers. Leipzig: Akademische Verlagsgesellschaft.
KOLLER, GOTTFRIED 1958 Das Leben des Biologen Johannes Müllers, 1801-1858. Stuttgart: Wissenschaftliche Verlagsgesellschaft.
MÜLLER, MARTIN 1927 Über die philosophischen Anschauungen des Naturforschers Johannes Müller. Leipzig: Barth.
POST, KARL 1905 Johannes Müller's philosophische Anschauungen. Halle: Niemeyer.
MULTIPLE COMPARISONS
See under LINEAR HYPOTHESES.

MULTIVARIATE ANALYSIS
i. OVERVIEW    Ralph A. Bradley
ii. CORRELATION (1)    R. F. Tate
iii. CORRELATION (2)    Harold Hotelling
iv. CLASSIFICATION AND DISCRIMINATION    T. W. Anderson
OVERVIEW
Multivariate analysis in statistics is devoted to the summarization, representation, and interpretation of data when more than one characteristic of each sample unit is measured. Almost all data-collection processes yield multivariate data. The medical diagnostician examines pulse rate, blood pressure, hemoglobin, temperature, and so forth; the educator observes for individuals such quantities as intelligence scores, quantitative aptitudes, and class grades; the economist may consider at points in time indexes and measures such as per capita personal income, the gross national product, employment, and the Dow-Jones average. Problems using these data are multivariate because inevitably the measures are interrelated and because investigations involve inquiry into the nature of such interrelationships and their uses in prediction, estimation, and methods of classification. Thus, multivariate analysis deals with samples in which for each unit examined there are observations on two or more stochastically related measurements. Most of multivariate analysis deals with estimation, confidence sets, and hypothesis testing for means, variances, covariances, correlation coefficients, and related, more complex population characteristics.

Only a sketch of the history of multivariate analysis is given here. The procedures of multivariate analysis that have been studied most are based on the multivariate normal distribution discussed below. Robert Adrian considered the bivariate normal distribution early in the nineteenth century, and Francis Galton understood the nature of correlation near the end of that century. Karl Pearson made important contributions to correlation, including multiple correlation, and to regression analysis early in the present century. G. U. Yule and others considered measures of association in contingency tables, and thus began multivariate developments for counted data. The pioneering work of "Student" (W. S. Gosset) on small-sample distributions led to R. A. Fisher's distributions of simple and multiple correlation coefficients. J. Wishart derived the joint distribution of sample variances and covariances for small multivariate normal samples. Harold Hotelling generalized the Student t-statistic and t-distribution for the multivariate problem. S. S. Wilks provided procedures for additional tests of hypotheses on means, variances, and covariances. Classification problems were given initial consideration by Pearson, Fisher, and P. C. Mahalanobis through measures of racial likeness, generalized distance, and discriminant functions, with some results similar to the work of Hotelling. Both Hotelling and Maurice Bartlett made initial studies of canonical correlations, intercorrelations between two sets of variates. More recent research by S. N. Roy, P. L. Hsu, Meyer Girshick, D. N. Nanda, and others has dealt with the distributions of certain characteristic roots and vectors as they relate to multivariate problems, notably to canonical correlations and multivariate analysis of variance. Much attention has also been given to the reduction of multivariate data and its interpretation through many papers on factor analysis and principal components. [For further discussion of the history of these special areas of multivariate analysis and of their present-day applications, see COUNTED DATA; DISTRIBUTIONS, STATISTICAL, article on SPECIAL CONTINUOUS DISTRIBUTIONS; FACTOR ANALYSIS;
MULTIVARIATE ANALYSIS, articles on CORRELATION and CLASSIFICATION AND DISCRIMINATION; STATISTICS, DESCRIPTIVE, article on ASSOCIATION; and the biographies of FISHER, R. A.; GALTON; GIRSHICK; GOSSET; PEARSON; WILKS; YULE.]

Basic multivariate distributions

Scientific progress is made through the development of more and more precise and realistic representations of natural phenomena. Thus science, and to an increasing extent social science, uses mathematics and mathematical models for improved understanding, such mathematical models being subject to adoption or rejection on the basis of observation [see MODELS, MATHEMATICAL]. In particular, stochastic models become necessary as the inherent variability in nature becomes understood.

The multivariate normal distribution provides the stochastic model on which the main theory of multivariate analysis is based. The model has sufficient generality to represent adequately many experimental and observational situations while retaining relative simplicity of mathematical structure. The possibility of applying the model to transforms of observations increases its scope [see STATISTICAL ANALYSIS, SPECIAL PROBLEMS OF, article on TRANSFORMATIONS OF DATA]. The large-sample theory of probability and the multivariate central limit theorem add importance to the study of the multivariate normal distribution as it relates to derived distributions. Inquiry and judgment about the use of any model must be the responsibility of the investigator, perhaps in consultation with a statistician. There is still a great deal to be learned about the sensitivity of the multivariate model to departures from that distributional assumption. [See ERRORS, article on EFFECTS OF ERRORS IN STATISTICAL ASSUMPTIONS.]

The multivariate normal distribution. Suppose that the characteristics or variates to be measured on each element of a sample from a population, conceptual or real, obey the probability law described through the multivariate normal probability density function. If these variates are $p$ in number and are designated by $X_1, \ldots, X_p$, the multivariate normal density contains $p$ parameters, or population characteristics, $\mu_1, \ldots, \mu_p$, representing, respectively, the means or expected values of the variates, and $\tfrac{1}{2}p(p+1)$ parameters $\sigma_{ij}$, $i, j = 1, \ldots, p$, $\sigma_{ij} = \sigma_{ji}$, representing variances and covariances of the variates. Here $\sigma_{ii}$ is the variance of $X_i$ (corresponding to the variance $\sigma^2$ of a variate $X$ in the univariate case) and $\sigma_{ij} = \sigma_{ji}$ is the covariance of $X_i$ and $X_j$. The correlation coefficient between $X_i$ and $X_j$ is $\rho_{ij} = \sigma_{ij}/\sqrt{\sigma_{ii}\sigma_{jj}}$.

The multivariate normal probability density function provides the probability density for the variates $X_1, \ldots, X_p$ at each point $x_1, \ldots, x_p$ in the sample or observation space. Its specific mathematical form is
$$f(x_1, \ldots, x_p) = (2\pi)^{-p/2}\,|\boldsymbol{\Sigma}|^{-1/2} \exp\{-\tfrac{1}{2}(\mathbf{x} - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\mathbf{x} - \boldsymbol{\mu})\}, \qquad -\infty < x_i < \infty,\ i = 1, \ldots, p.$$

[For the explicit form of this density in the bivariate case ($p = 2$), see MULTIVARIATE ANALYSIS, article on CORRELATION (1).] (Vector and matrix notation and an understanding of elementary aspects of matrix algebra are important for any real understanding or application of multivariate analysis. Thus, $\mathbf{x}'$ is the vector $(x_1, \ldots, x_p)$, $\boldsymbol{\mu}'$ is the vector $(\mu_1, \ldots, \mu_p)$, and $(\mathbf{x} - \boldsymbol{\mu})'$ is the vector $(x_1 - \mu_1, \ldots, x_p - \mu_p)$. Also, $\boldsymbol{\Sigma}$ is the $p \times p$ symmetric matrix which has elements $\sigma_{ij}$, $\boldsymbol{\Sigma} = [\sigma_{ij}]$, $|\boldsymbol{\Sigma}|$ is the determinant of $\boldsymbol{\Sigma}$, and $\boldsymbol{\Sigma}^{-1}$ is its inverse. The prime indicates "transpose," and thus $(\mathbf{x} - \boldsymbol{\mu})'$ is the transpose of $(\mathbf{x} - \boldsymbol{\mu})$, a column vector.) Comparison of $f(x_1, \ldots, x_p)$ with $f(x)$, the univariate normal probability density function, may assist understanding; for a univariate normal variate $X$ with mean $\mu$ and variance $\sigma^2$,

$$f(x) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\{-\tfrac{1}{2}(x - \mu)^2/\sigma^2\},$$

where $-\infty < x < \infty$.

The multivariate normal density may be characterized in various ways. One direct method begins with $p$ independent, univariate normal variables, $U_1, \ldots, U_p$, each with zero mean and unit variance. From the independence assumption, their joint density is the product
$$(2\pi)^{-p/2} \exp\Bigl\{-\tfrac{1}{2}\sum_{i=1}^{p} u_i^2\Bigr\},$$

a very special case of the multivariate normal probability density function. If variates $X_1, \ldots, X_p$ are linearly related to $U_1, \ldots, U_p$ so that $\mathbf{X} = \mathbf{A}\mathbf{U} + \boldsymbol{\mu}$, in matrix notation, with $\mathbf{X}$, $\mathbf{U}$, and $\boldsymbol{\mu}$ being column vectors and $\mathbf{A}$ being a $p \times p$ nonsingular matrix of constants $a_{ij}$, then

$$X_i = a_{i1}U_1 + \cdots + a_{ip}U_p + \mu_i, \qquad i = 1, \ldots, p.$$

Clearly, the mean of $X_i$ is $E(X_i) = \mu_i$, where $\mu_i$ is a known constant and $E$ represents "expectation." The variance of $X_i$ is

$$\operatorname{var}(X_i) = E(X_i - \mu_i)^2 = \sum_{k=1}^{p} a_{ik}^2 = \sigma_{ii},$$

and the covariance of $X_i$ and $X_j$, $i \ne j$, is

$$\operatorname{cov}(X_i, X_j) = E[(X_i - \mu_i)(X_j - \mu_j)] = \sum_{k=1}^{p} a_{ik}a_{jk} = \sigma_{ij}.$$

Standard density function manipulations then yield the joint density function of $X_1, \ldots, X_p$ as that already given as the general $p$-variate normal density. If the matrix $\mathbf{A}$ is singular, the results for $E(X_i)$, $\operatorname{var}(X_i)$, and $\operatorname{cov}(X_i, X_j)$ still hold and $X_1, \ldots, X_p$ are said to have a singular multivariate normal distribution; although the joint density function cannot be written, the concept is useful.

A second characterization of the $p$-variate normal distribution is the following: $X_1, \ldots, X_p$ have a $p$-variate normal distribution if and only if $\sum_{i=1}^{p} a_i X_i$ is univariate normal for all choices of the coefficients $a_i$, that is, if and only if all linear combinations of the $X_i$ are univariate normal.

The multivariate normal cumulative distribution function represents the probability of the joint
occurrence of the events $X_1 \le x_1, \ldots, X_p \le x_p$ and may be written

$$P\{X_1 \le x_1, \ldots, X_p \le x_p\} = F(x_1, \ldots, x_p) = \int_{-\infty}^{x_p} \cdots \int_{-\infty}^{x_1} f(u_1, \ldots, u_p)\, du_1 \cdots du_p,$$

indicating that probabilities that observations fall into regions of the $p$-dimensional variate space may be obtained by integration. Tables of $F(x_1, \ldots, x_p)$ are available for $p = 2, 3$ (see Greenwood & Hartley 1962).

Some basic properties of the $p$-variate normal distribution in terms of $\mathbf{X}' = (X_1, \ldots, X_p)$ are the following.
(a) Any subset of the $X_i$ has a multivariate normal distribution. In fact, any set of $q$ linear combinations of the $X_i$ has a $q$-variate normal distribution, a result following directly from the linear combination characterization, $q \le p$.
(b) The conditional distribution of $q$ of the $X_i$, given the $p - q$ others, is $q$-variate normal.
(c) If $\sigma_{ij} = 0$, $i \ne j$, then $X_i$ and $X_j$ are independent.
(d) The expectation and variance of $\sum_{i=1}^{p} a_i X_i$ are $\sum_{i=1}^{p} a_i \mu_i$ and $\sum_{i=1}^{p}\sum_{j=1}^{p} a_i a_j \sigma_{ij}$.
(e) The covariance of $\sum_{i=1}^{p} a_i X_i$ and $\sum_{j=1}^{p} b_j X_j$ is $\sum_{i=1}^{p}\sum_{j=1}^{p} a_i b_j \sigma_{ij}$.
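The construction X = AU + μ and properties (d) and (e) can be checked numerically. The following is only a minimal sketch in plain Python: the matrix `A`, the mean vector `mu`, and the coefficient vectors `a` and `b` are hypothetical values chosen for illustration, and the helper names are not from the article. It forms the dispersion matrix with elements σ_ij = Σ_k a_ik a_jk and then evaluates the mean, variance, and covariance of two linear combinations.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    """Return the transpose of a matrix given as a list of rows."""
    return [list(col) for col in zip(*a)]


def bilinear(a, sigma, b):
    """Return the double sum of a_i * b_j * sigma_ij."""
    return sum(a[i] * b[j] * sigma[i][j]
               for i in range(len(a)) for j in range(len(b)))


# Hypothetical inputs (assumptions for illustration only).
A = [[2.0, 0.0],
     [1.0, 3.0]]        # nonsingular matrix of constants a_ij
mu = [10.0, 20.0]       # mean vector

# With U_1, ..., U_p independent, zero mean, unit variance,
# X = A U + mu has dispersion matrix Sigma = A A'.
sigma = matmul(A, transpose(A))

a = [1.0, -1.0]         # coefficients of the first linear combination
b = [0.5, 0.5]          # coefficients of the second linear combination

mean_aX = sum(ai * mi for ai, mi in zip(a, mu))   # property (d), expectation
var_aX = bilinear(a, sigma, a)                    # property (d), variance
cov_aX_bX = bilinear(a, sigma, b)                 # property (e), covariance

print(sigma, mean_aX, var_aX, cov_aX_bX)
```

Because only second moments are involved, the same arithmetic applies when A is singular; what changes in that case is that the joint density can no longer be written.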
A cautionary note is that $X_1, \ldots, X_p$ may be separately (marginally) univariate normal while the joint distribution may be very nonnormal.

The geometric properties of the $p$-dimensional surface defined by $y = f(x_1, \ldots, x_p)$ are interesting. Contours of the surface are $p$-dimensional ellipsoids. All inflection points of the surface occur at constant $y$ and hence fall on the same horizontal ellipsoidal cross section. Any vertical cross section of the surface leads to a subsurface that is normal or multivariate normal in form and is capable of representation as a normal probability density surface except for a proportionality constant.

Characteristic and moment-generating functions yield additional methods of description of random variables [see DISTRIBUTIONS, STATISTICAL, article on SPECIAL CONTINUOUS DISTRIBUTIONS]. For the multivariate normal distribution, the moment-generating function is

$$M(t_1, \ldots, t_p) = \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} \exp(\mathbf{t}'\mathbf{x})\, f(x_1, \ldots, x_p)\, dx_1 \cdots dx_p = \exp(\mathbf{t}'\boldsymbol{\mu} + \tfrac{1}{2}\mathbf{t}'\boldsymbol{\Sigma}\mathbf{t}),$$

where $\mathbf{t}' = (t_1, \ldots, t_p)$. The moment-generating function may describe either the nonsingular or singular $p$-variate normal distribution. Note that, from its definition, the matrix $\boldsymbol{\Sigma}$ may be shown to be nonnegative definite. When $\boldsymbol{\Sigma}$ is positive definite the multivariate density may be specified as $f(x_1, \ldots, x_p)$. When $\boldsymbol{\Sigma}$ is singular, $\boldsymbol{\Sigma}^{-1}$ does not exist, and the density may not be given. However, $M(t_1, \ldots, t_p)$ may still be given and can thus describe the singular multivariate normal distribution. To say that $\mathbf{X}$ has a singular distribution is to say that $\mathbf{X}$ lies in some hyperplane of dimension less than $p$.

The multivariate normal sample. Table 1 illustrates a multivariate sample with $p = 4$ and sample size $N = 10$; the data here are head measurements.

Table 1. Measurements taken on first and second adult sons in a sample of ten families

       HEAD LENGTH                HEAD BREADTH
First son    Second son     First son    Second son
   191          179            155          145
   195          201            149          152
   181          185            148          149
   183          188            153          149
   176          171            144          142
   208          192            157          152
   189          190            150          149
   197          189            159          152
   188          197            152          159
   192          187            150          151

Source: Based on original data by G. P. Frets, presented in Rao 1952, table 7b.:

One can anticipate covariance or correlation between head length and head breadth and between head measurements of first and second sons. Hence, for most purposes it will be important to treat the data as a single multivariate sample rather than as several univariate samples.

General notation for a multivariate sample is developed in terms of the variates $X_{1\alpha}, \ldots, X_{p\alpha}$ representing the $p$ observation variates for the $\alpha$th sample unit (for example, the $\alpha$th family in the sample), $\alpha = 1, \ldots, N$. In a parallel way $x_{i\alpha}$ may be regarded as the realization of $X_{i\alpha}$ in a particular set of sample data. For multivariate normal procedures, standard data summarization involves calculation of the sample means, $\bar{x}_i = \sum_{\alpha=1}^{N} x_{i\alpha}/N$, $i = 1, \ldots, p$, and the sample variances and covariances,

$$s_{ij} = \sum_{\alpha=1}^{N} (x_{i\alpha} - \bar{x}_i)(x_{j\alpha} - \bar{x}_j)/(N - 1) = \Bigl[\sum_{\alpha=1}^{N} x_{i\alpha}x_{j\alpha} - N\bar{x}_i\bar{x}_j\Bigr]\Big/(N - 1), \qquad i, j = 1, \ldots, p.$$
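The summarization step can be sketched in a few lines of plain Python. This is an illustrative sketch rather than a prescribed procedure: the function names `sample_means` and `sample_covariances` are hypothetical, and the rows are the ten families of Table 1 in the column order head length (first son, second son), head breadth (first son, second son).

```python
# Table 1 data, one tuple per family:
# (length first son, length second son, breadth first son, breadth second son)
TABLE_1 = [
    (191, 179, 155, 145),
    (195, 201, 149, 152),
    (181, 185, 148, 149),
    (183, 188, 153, 149),
    (176, 171, 144, 142),
    (208, 192, 157, 152),
    (189, 190, 150, 149),
    (197, 189, 159, 152),
    (188, 197, 152, 159),
    (192, 187, 150, 151),
]


def sample_means(rows):
    """Return the list of sample means, one per variate."""
    n = len(rows)
    p = len(rows[0])
    return [sum(row[i] for row in rows) / n for i in range(p)]


def sample_covariances(rows):
    """Return the p x p matrix of sample variances and covariances s_ij,
    using the divisor N - 1."""
    n = len(rows)
    p = len(rows[0])
    xbar = sample_means(rows)
    return [[sum((row[i] - xbar[i]) * (row[j] - xbar[j]) for row in rows) / (n - 1)
             for j in range(p)]
            for i in range(p)]


means = sample_means(TABLE_1)
s = sample_covariances(TABLE_1)

print([round(m, 1) for m in means])
for row in s:
    print([round(v, 2) for v in row])
```

The printed values can be compared with the sample statistics tabulated for these data in Table 2.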
Sample correlation coefficients may be computed from $r_{ij} = s_{ij}/\sqrt{s_{ii}s_{jj}}$. For the data of Table 1, the sample values of the statistics are given in Table 2.

Table 2. Sample statistics for measurements taken on sons

MEANS, $\bar{x}_i$
Variate i         1        2        3        4
                190.0    187.9    151.7    150.0

VARIANCES AND COVARIANCES, $s_{ij}$
Variate i \ Variate j     1        2        3        4
        1               81.56    42.00    29.56    18.67
        2                        72.32    11.86    33.11
        3                                 20.01     7.78
        4                                          20.67

CORRELATIONS, $r_{ij}$
Variate i \ Variate j     1        2        3        4
        1                 1       .55      .73      .46
        2                          1       .31      .86
        3                                   1       .38
        4                                            1

The required assumptions for the simpler multivariate normal procedures are that the observation vectors $(X_{1\alpha}, \ldots, X_{p\alpha})$ are independent in probability and that each such observation vector consists of $p$ variates following the same multivariate normal law, that is, having the same probability density $f(x_1, \ldots, x_p)$ with the same parameters, elements of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$.

The joint density for the $p \times N$ random variables $X_{i\alpha}$ is, by the independence assumption, just the product of $N$ $p$-variate normal densities, each having the same $\mu$'s and $\sigma$'s. The joint density may be expressed in terms of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$ and $\bar{\mathbf{x}}$ and $\mathbf{s}$, where $\mathbf{s}$ is the symmetric $p \times p$ matrix with elements $s_{ij}$ and $\bar{\mathbf{x}}' = (\bar{x}_1, \ldots, \bar{x}_p)$. Elements of $\mathbf{S}$, the matrix of random variables corresponding to $\mathbf{s}$, and of the vector $\bar{\mathbf{X}}$ constitute a set of sufficient statistics for the parameters in $\boldsymbol{\Sigma}$ and $\boldsymbol{\mu}$ [see SUFFICIENCY]. Furthermore, it may be shown that $\mathbf{S}$ and $\bar{\mathbf{X}}$ are independent.

Basic derived distributions. The distribution of the vector of sample means, $\bar{\mathbf{X}}' = (\bar{X}_1, \ldots, \bar{X}_p)$, is readily described for the random sampling under discussion. That distribution is again $p$-variate normal with the same mean vector, $\boldsymbol{\mu}$, as in the underlying population but with covariance matrix $N^{-1}\boldsymbol{\Sigma}$. There is complete analogy here with the univariate case.

The joint probability density function of the sample variances and covariances, $S_{ij}$, has been named the Wishart distribution after its developer. This density is

$$w(s_{11}, s_{12}, \ldots, s_{pp}) = \frac{\bigl[\tfrac{1}{2}(N-1)\bigr]^{p(N-1)/2}\, |\boldsymbol{\Sigma}|^{-(N-1)/2}\, |\mathbf{s}|^{(N-p-2)/2} \exp\{-\tfrac{1}{2}(N-1)\operatorname{tr}\boldsymbol{\Sigma}^{-1}\mathbf{s}\}}{\pi^{p(p-1)/4} \prod_{i=1}^{p} \Gamma[\tfrac{1}{2}(N-i)]},$$

where $-\infty < s_{ij} < \infty$, $i < j$, $0 \le s_{ii} < \infty$, and $i, j = 1, \ldots, p$, and the matrix $\mathbf{s}$ is positive definite. The Wishart density is a generalization of the chi-square density with $N - 1$ degrees of freedom for $(N - 1)S^2/\sigma^2$ in the univariate case, in which $S^2$ is the sample variance based on $N$ independent observations from a univariate normal population. Anderson (1958, sec. 14.3) has a note on the noncentral Wishart distribution, a generalization of the noncentral chi-square distribution.

Procedures on means, variances, covariances

Many of the simpler multivariate statistical procedures were developed as extensions of useful univariate methods dealing with tests of hypotheses and related confidence intervals for means, variances, and covariances. Small-sample distributions of important statistics of multivariate analysis have been found; almost invariably the starting point in the derivations is the joint probability density of sample means and sample variances and covariances, the product of a multivariate normal density and a Wishart density, or one of these densities separately.

Inferences on means, dispersion known. If $\boldsymbol{\mu}^*$ is a $p$-element column vector of given constants and if the elements of $\boldsymbol{\Sigma}$ are known, it was shown long ago, perhaps first by Karl Pearson, that when $\boldsymbol{\mu}^* = \boldsymbol{\mu}$, $Q(\bar{\mathbf{X}}) = N(\bar{\mathbf{X}} - \boldsymbol{\mu}^*)'\boldsymbol{\Sigma}^{-1}(\bar{\mathbf{X}} - \boldsymbol{\mu}^*)$ has the central chi-square distribution with $p$ degrees of freedom [see DISTRIBUTIONS, STATISTICAL, article on SPECIAL CONTINUOUS DISTRIBUTIONS]. It was later shown more generally that $Q(\bar{\mathbf{X}})$ has the noncentral chi-square distribution with $p$ degrees of freedom and noncentrality parameter $\tau^2 = N(\boldsymbol{\mu}^* - \boldsymbol{\mu})'\boldsymbol{\Sigma}^{-1}(\boldsymbol{\mu}^* - \boldsymbol{\mu})$ when $\boldsymbol{\mu}^* \ne \boldsymbol{\mu}$. (The symbol $\tau^2$ is consistent with the notation of Anderson 1958, sec. 5.4.)

A null hypothesis, $H_{01}: \boldsymbol{\mu} = \boldsymbol{\mu}^*$, specifying the means of the multivariate normal density when $\boldsymbol{\Sigma}$ is known and when the alternative hypothesis is general, $\boldsymbol{\mu} \ne \boldsymbol{\mu}^*$, may be of interest in some experimental situations.
With significance level $\alpha$ the critical region of the sample space, the region of rejection of the hypothesis $H_{01}$, is that region where $Q(\bar{\mathbf{x}}) \ge \chi^2_{p;\alpha}$ ($\chi^2_{p;\alpha}$ being the tabular value of a chi-square variate $\chi^2_p$ with $p$ degrees of freedom such that $P\{\chi^2_p \ge \chi^2_{p;\alpha}\} = \alpha$) [see HYPOTHESIS TESTING]. The power of this test may be computed when $H_{01}$ is false, that is, when $\boldsymbol{\mu} \ne \boldsymbol{\mu}^*$, by evaluation of the probability $P\{\chi'^2_p \ge \chi^2_{p;\alpha}\}$, where $\chi'^2_p$ is a noncentral chi-square variate with $p$ degrees of freedom and noncentrality $\tau^2$. When the alternative hypotheses are one-sided, in the sense that each component of $\boldsymbol{\mu}$ is taken to be greater than or equal to the corresponding component of $\boldsymbol{\mu}^*$, the problem is more difficult. First steps have been taken toward the solution of this problem (see Kudo 1963; Nüesch 1966).

Since $\boldsymbol{\mu}$ is unknown, it is estimated by $\bar{\mathbf{x}}$. Corresponding to the test given above, the confidence region with confidence coefficient $1 - \alpha$ for $\boldsymbol{\mu}$ consists of all values $\boldsymbol{\mu}^*$ for which the inequality $Q(\bar{\mathbf{x}}) \le \chi^2_{p;\alpha}$ holds [see ESTIMATION, article on CONFIDENCE INTERVALS AND REGIONS]. This confidence region is the surface and interior of an ellipsoid centered at the point whose coordinates are the elements of $\bar{\mathbf{x}}$ in the $p$-dimensional parameter space of the elements of $\boldsymbol{\mu}$.

Paired sample problems may also be handled. Let $Y_1, \ldots, Y_{2p}$ be $2p$ variates with means $\xi_1, \ldots, \xi_{2p}$ having a multivariate normal density, and let $y_{j\alpha}$, $j = 1, \ldots, 2p$, $\alpha = 1, \ldots, N$, be independent multivariate observations from this multivariate normal population. Suppose that $Y_i$ and $Y_{p+i}$, $i = 1, \ldots, p$, are paired variates. Then $X_i = Y_i - Y_{p+i}$, $i = 1, \ldots, p$, make a set of multivariate normal variates with parameters that again may be designated as the elements of $\boldsymbol{\mu}$ and $\boldsymbol{\Sigma}$, $\mu_i = \xi_i - \xi_{p+i}$. Similarly, take $x_{i\alpha} = y_{i\alpha} - y_{p+i,\alpha}$ and $\bar{x}_i = \bar{y}_i - \bar{y}_{p+i}$. Inferences on the means, $\mu_i$, of the difference variates, $X_i$, when $\boldsymbol{\Sigma}$ is known may be made on the same basis as above for the simple sample.
In the paired situation it will often be appropriate to take $\boldsymbol{\mu}^* = \mathbf{0}$, that is, $H_{01}: \boldsymbol{\mu} = \mathbf{0}$. Here $\mathbf{0}$ denotes a vector of 0's. For example, in Table 1 the data can be paired through the association of first and second sons in a family; a pertinent inquiry may relate to the equalities of both mean head lengths and mean head breadths of first and second sons. For association with this paragraph, columns in Table 1 should have variate headings $Y_1$, $Y_3$, $Y_2$, and $Y_4$, indicating that $p = 2$; then $X_1 = Y_1 - Y_3$ measures difference in head lengths of first and second sons and $X_2 = Y_2 - Y_4$ measures difference in head breadths.

There are also nonpaired versions of these procedures. In a table similar to Table 1 the designations "first son" and "second son" might be replaced by "adult male American Indian" and "adult male Eskimo." Then the data could be considered to consist of ten bivariate observations taken at random from each of the two indicated populations with no basis for the pairing of the observation vectors. Anthropological study might require comparisons of mean head lengths and mean head breadths for the two racial groups. The procedures of this section may be adapted to this problem. Suppose that $X_1^{(1)}, \ldots, X_p^{(1)}$ and $X_1^{(2)}, \ldots, X_p^{(2)}$ are the $p$ variates for the two populations, the two sets of variates being stochastically independent of each other and having multivariate normal distributions with common dispersion matrix $\boldsymbol{\Sigma}^*$ but with means $\mu_1^{(1)}, \ldots, \mu_p^{(1)}$ and $\mu_1^{(2)}, \ldots, \mu_p^{(2)}$, respectively. The corresponding sample means are $\bar{x}_1^{(1)}, \ldots, \bar{x}_p^{(1)}$ and $\bar{x}_1^{(2)}, \ldots, \bar{x}_p^{(2)}$, based respectively on samples of independent observations of sizes $N_1$ and $N_2$ from the two populations. Definition of $\boldsymbol{\mu} = \boldsymbol{\mu}^{(1)} - \boldsymbol{\mu}^{(2)}$, $\bar{\mathbf{x}} = \bar{\mathbf{x}}^{(1)} - \bar{\mathbf{x}}^{(2)}$, and $N\boldsymbol{\Sigma}^{-1} = [N_1N_2/(N_1 + N_2)]\boldsymbol{\Sigma}^{*-1}$ permits association and use of $Q(\bar{\mathbf{x}})$ and its properties for this two-sample problem. If the dispersion matrices of the two populations are known but different, a slight modification of the procedure is readily available.

Jackson and Bradley (1961) have extended these methods to sequential multivariate analysis [see SEQUENTIAL ANALYSIS].

Generalized Student procedures. In the preceding section it was assumed that $\boldsymbol{\Sigma}$ was known, but in most applications this is not the case. Rather, $\boldsymbol{\Sigma}$ must be estimated from the data, and the generalized Student statistic or Hotelling's $T^2$, $T^2(\bar{\mathbf{X}}, \mathbf{S}) = N(\bar{\mathbf{X}} - \boldsymbol{\mu}^*)'\mathbf{S}^{-1}(\bar{\mathbf{X}} - \boldsymbol{\mu}^*)$, comparable to $Q(\bar{\mathbf{X}})$, is almost always used. (For procedures that are not based on $T^2$ see Sidak 1967.)
It has been shown that $F(\bar{\mathbf{X}}, \mathbf{S}) = (N - p)\,T^2(\bar{\mathbf{X}}, \mathbf{S})/[p(N - 1)]$ has the variance-ratio or F-distribution with $p$ and $N - p$ degrees of freedom [see DISTRIBUTIONS, STATISTICAL, article on SPECIAL CONTINUOUS DISTRIBUTIONS]. The F-distribution is central when $\boldsymbol{\mu} = \boldsymbol{\mu}^*$, that is, when the mean vector of the multivariate normal population is equal to the constant vector $\boldsymbol{\mu}^*$, and is noncentral otherwise with noncentrality parameter $\tau^2$ already defined.

The hypothesis $H_{01}: \boldsymbol{\mu} = \boldsymbol{\mu}^*$ is of interest, as before. The statistic $F(\bar{\mathbf{X}}, \mathbf{S})$ takes the role of $Q(\bar{\mathbf{X}})$ and $F_{p,N-p;\alpha}$ takes the role of $\chi^2_{p;\alpha}$, where $F_{p,N-p;\alpha}$ is the tabular value of the variance-ratio variate $F_{p,N-p}$ with $p$ and $N - p$ degrees of freedom such that $P\{F_{p,N-p} \ge F_{p,N-p;\alpha}\} = \alpha$. The confidence region for the elements of $\boldsymbol{\mu}$ consists of all values $\boldsymbol{\mu}^*$ for which the inequality $F(\bar{\mathbf{x}}, \mathbf{s}) \le F_{p,N-p;\alpha}$ holds; the region is again an ellipsoid centered at $\bar{\mathbf{x}}$ in the $p$-dimensional parameter space, and the confidence coefficient is $1 - \alpha$.
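As a numerical illustration of the generalized Student statistic, the sketch below computes $T^2$ and the associated F ratio for the paired differences of the Table 1 head measurements (first son minus second son, so p = 2 and N = 10) under the hypothesized mean vector of zeros. It is a minimal sketch in plain Python; the helper names `solve` and `hotelling_t2` are hypothetical, and the small Gaussian-elimination routine is simply one convenient way to apply the inverse of the sample dispersion matrix without an external library.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination
    with partial pivoting; a is a list of rows, b a list."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]   # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def hotelling_t2(rows, mu_star):
    """Return (T2, F) for the hypothesis that the mean vector equals mu_star.
    T2 = N (xbar - mu*)' S^{-1} (xbar - mu*),  F = (N - p) T2 / [p (N - 1)]."""
    n = len(rows)
    p = len(rows[0])
    xbar = [sum(row[i] for row in rows) / n for i in range(p)]
    s = [[sum((row[i] - xbar[i]) * (row[j] - xbar[j]) for row in rows) / (n - 1)
          for j in range(p)]
         for i in range(p)]
    d = [xbar[i] - mu_star[i] for i in range(p)]
    w = solve(s, d)                       # w = S^{-1} d
    t2 = n * sum(di * wi for di, wi in zip(d, w))
    f_ratio = (n - p) * t2 / (p * (n - 1))
    return t2, f_ratio


# Table 1 columns: head length and head breadth of first and second sons.
length_first = [191, 195, 181, 183, 176, 208, 189, 197, 188, 192]
length_second = [179, 201, 185, 188, 171, 192, 190, 189, 197, 187]
breadth_first = [155, 149, 148, 153, 144, 157, 150, 159, 152, 150]
breadth_second = [145, 152, 149, 149, 142, 152, 149, 152, 159, 151]

# Paired differences, first son minus second son.
differences = [(l1 - l2, b1 - b2)
               for l1, l2, b1, b2 in zip(length_first, length_second,
                                         breadth_first, breadth_second)]

t2, f_ratio = hotelling_t2(differences, [0.0, 0.0])
print(round(t2, 4), round(f_ratio, 4))
```

The F ratio obtained this way would be compared with a tabular value $F_{p,N-p;\alpha}$ with 2 and 8 degrees of freedom; looking up that tabular value is left outside the sketch.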
MULTIVARIATE ANALYSIS: Overview
Visualization of the confidence region for the elements of $\mu$ is often difficult. When p = 2, the ellipsoid becomes a simple ellipse and may be plotted (see Figure 1). When p > 2, two-dimensional elliptical cross sections of the ellipsoid may be plotted, and parallel tangent planes to the ellipsoid may be found that yield crude bounds on the various parameters. One or more linear contrasts among the elements of $\mu$ may be of special interest, and then the dimensionality of the whole problem, including the confidence region, is reduced. Some of the problems of multiple comparisons arise when linear contrasts are used [see LINEAR HYPOTHESES, article on MULTIPLE COMPARISONS]. For the simple one-sample problem, $s = [s_{ij}]$ is computed as shown in Table 2. For the paired sample problem, $s$ in $F(\bar{x}, s)$ is the sample variance-covariance matrix computed from the derived multivariate sample of differences, and $\bar{x}$ is the sample vector of mean differences, as before. For the unpaired two-sample problem, it is necessary to replace $Ns^{-1}$ in $F(\bar{x}, s)$, just as it was necessary to replace $N\Sigma^{-1}$ when $\Sigma$ was known. Each population has the dispersion matrix $\Sigma^*$, and two sample dispersion matrices $s^{(1)}$ and $s^{(2)}$ may be computed, one for each multivariate sample, to estimate $\Sigma^*$. A "pooled" estimate of the dispersion matrix $\Sigma^*$ is $s^* = [(N_1 - 1)s^{(1)} + (N_2 - 1)s^{(2)}]/(N_1 + N_2 - 2)$, the multivariate generalization of the pooled estimate of variance often used in univariate statistics. For the two-sample problem, $Ns^{-1}$ in $F(\bar{x}, s)$ is replaced by $[N_1N_2/(N_1 + N_2)]s^{*-1}$. All of the assumptions about the populations and about the samples discussed in the preceding section apply for the corresponding generalized Student procedures. An application of the generalized Student procedures for paired samples may be made for the data in Table 1. The bivariate (p = 2) sample of paired differences (in Table 1, column 1 minus column 2, column 3 minus column 4) is exhibited in Table 3.
Table 3. Difference data on head measurements, first adult son minus second adult son

HEAD-LENGTH DIFFERENCE    HEAD-BREADTH DIFFERENCE
$x_1(d)$                  $x_2(d)$
 12                        10
 -6                        -3
 -4                        -1
 -5                         4
  5                         2
 16                         5
 -1                         1
  8                         7
 -9                        -7
  5                        -1

The sample mean differences and sample variances and covariance of the difference data are given in Table 4, along with the elements $s^{ij}$ of $s^{-1}$. The column headings and statistics in Table 3 and Table 4 have the argument d simply to distinguish them from the symbols in tables 1 and 2.

Table 4. Sample statistics for measurement differences on sons

MEANS, $\bar{x}_i(d)$
Variate i                1        2
                         2.1      1.7

VARIANCES AND COVARIANCES, $s_{ij}(d)$
Variate i \ Variate j    1        2
1                        69.88    32.14
2                                 25.12

ELEMENTS OF $s^{-1}(d)$, $s^{ij}(d)$
Variate i \ Variate j    1        2
1                        .0348    -.0445
2                                 .0967

For a comparison of first and second sons, it may be appropriate to take $\mu^* = 0$ and compute

$$T^2(\bar{x}, s) = 10\,(2.1,\ 1.7)\begin{pmatrix} .0348 & -.0445 \\ -.0445 & .0967 \end{pmatrix}\begin{pmatrix} 2.1 \\ 1.7 \end{pmatrix} = (21,\ 17)\begin{pmatrix} -.00257 \\ .07094 \end{pmatrix} = 1.152,$$

$$F(\bar{x}, s) = 8(1.152)/2(9) = .512.$$

If a significance level $\alpha = .10$ is chosen, then $F_{2,8;.10} = 3.11$ and the differences between paired means are not statistically significant; indeed, they are less than ordinary variation would lead one to expect. (For some sets of data this sort of result should lead to re-examination of possible biases or nonindependence in the data-collection process.) To find those values $\mu^*$ in the confidence region for $\mu$, a general $\mu^*$ must be put in place of 0 in $T^2(\bar{x}, s)$; thus,

$$T^2(\bar{x}, s) = 10\,(2.1 - \mu_1^*,\ 1.7 - \mu_2^*)\begin{pmatrix} .0348 & -.0445 \\ -.0445 & .0967 \end{pmatrix}\begin{pmatrix} 2.1 - \mu_1^* \\ 1.7 - \mu_2^* \end{pmatrix}$$

$$= .348(\mu_1^* - 2.1)^2 - .890(\mu_1^* - 2.1)(\mu_2^* - 1.7) + .967(\mu_2^* - 1.7)^2.$$
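The arithmetic of this paired-sample example can be retraced with a short script. What follows is only a minimal sketch in plain Python, not part of the original article: the function names (`mean_and_covariance`, `solve_linear`, `hotelling_t2`) are hypothetical, and the ten pairs of differences are an assumed reading of Table 3 (in particular the breadth values -3 and -1), chosen because they reproduce the means, variances, and covariance of Table 4.

```python
# Paired differences (head length, head breadth), first son minus second son.
# Assumed reading of Table 3; it reproduces the statistics of Table 4.
DIFFERENCES = [
    (12, 10), (-6, -3), (-4, -1), (-5, 4), (5, 2),
    (16, 5), (-1, 1), (8, 7), (-9, -7), (5, -1),
]


def mean_and_covariance(rows):
    """Return the mean vector and the unbiased sample dispersion matrix."""
    n = len(rows)
    p = len(rows[0])
    means = [sum(row[j] for row in rows) / n for j in range(p)]
    cov = [
        [
            sum((row[i] - means[i]) * (row[j] - means[j]) for row in rows) / (n - 1)
            for j in range(p)
        ]
        for i in range(p)
    ]
    return means, cov


def solve_linear(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x


def hotelling_t2(rows, mu_star):
    """Return (T^2, F) for the one-sample (or paired-difference) problem."""
    n = len(rows)
    p = len(rows[0])
    means, cov = mean_and_covariance(rows)
    diff = [means[j] - mu_star[j] for j in range(p)]
    weighted = solve_linear(cov, diff)          # S^{-1} (xbar - mu*)
    t2 = n * sum(d * w for d, w in zip(diff, weighted))
    f = (n - p) * t2 / (p * (n - 1))            # F with p and n - p d.f.
    return t2, f


t2, f = hotelling_t2(DIFFERENCES, [0.0, 0.0])
print(round(t2, 3), round(f, 3))
```

The value of F is then compared with a tabulated point such as $F_{2,8;.10} = 3.11$; the sketch deliberately leaves the table lookup to the reader rather than assuming any particular statistics library.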
The corresponding $F(\bar{x}, s) = \frac{4}{9}T^2(\bar{x}, s)$. The confidence region on $\mu_1$ and $\mu_2$ with confidence coefficient $1 - \alpha$ consists of those points in the $(\mu_1^*, \mu_2^*)$-space inside or on the ellipse described by

$$.155(\mu_1^* - 2.1)^2 - .396(\mu_1^* - 2.1)(\mu_2^* - 1.7) + .430(\mu_2^* - 1.7)^2 = F_{2,8;\alpha}.$$

This ellipse is plotted in Figure 1 for $\alpha$ = .05, .10, .25, $F_{2,8;\alpha}$ = 4.46, 3.11, 1.66, for clearer insight into the nature of the region.

Figure 1. Elliptical confidence regions on $\mu_1$, $\mu_2$

A number of variants of the generalized Student procedure have been developed, and other variants are bound to be developed in the future. For example, one may wish to test null hypotheses specifying relationships between the coordinates of $\mu$ (see Anderson 1958, sec. 5.3.5). Again, one may wish to test that certain coordinates of $\mu$ have given values, knowing the values of the other coordinates. For another sort of variant, recall that it was assumed for the two-sample application that the dispersion matrices for the two parent populations were identical. If this assumption is untenable, then a multivariate analogue of the Behrens-Fisher problem must be considered (Anderson 1958, sec. 5.6). Sequential extensions of the generalized Student procedures have been given by Jackson and Bradley (1961).
Generalized variances. Tests of hypotheses and confidence intervals on variances are conducted easily in univariate cases through the use of the chi-square and variance-ratio distributions. The situation is much more difficult in multivariate analysis.
For the multivariate one-sample problem, hypotheses and confidence regions for elements of the dispersion matrix, $\Sigma$, may be considered. A first possible hypothesis is $H_{02}: \Sigma = \Sigma^*$, a null hypothesis specifying all of the elements of $\Sigma$. (This hypothesis is of limited interest per se, except when $\Sigma^* = I$ or as an introduction to procedures on multivariate linear hypotheses.) It is clear that a test statistic should depend on the elements $S_{ij}$ of $S$; it is not clear what function of these elements might be appropriate. The statistic $|S|$ has been called the generalized sample variance, and $|\Sigma|$ has been called the generalized variance. The test statistic $|S|/|\Sigma^*|$ was proposed by Wilks, who examined its distribution; simple, exact, small-sample distributions are known only when p = 1, 2. An asymptotic or limiting distribution is available for large N; the statistic $\sqrt{N - 1}\,[(|S|/|\Sigma|) - 1]/\sqrt{2p}$ has the limiting univariate normal density with zero mean and unit variance. It is clear that when $\Sigma = \Sigma^*$ under $H_{02}$, $S$ estimates $\Sigma^*$, and the ratio $|S|/|\Sigma^*|$ should be near unity; it is not clear that the ratio may not be near unity when $S \ne \Sigma^*$. However, values of $S$ that differ substantially from $\Sigma^*$ should lead to rejection of $H_{02}$ (see Anderson 1958, sec. 7.5).
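The large-sample statistic for the generalized variance described above, $\sqrt{N - 1}\,[(|S|/|\Sigma^*|) - 1]/\sqrt{2p}$, is simple to compute once determinants are available. The following is a hedged sketch in plain Python rather than a recommended procedure: the helper names are hypothetical, the sample matrix is taken from Table 4, and the hypothesized matrix `SIGMA_STAR` is an invented illustration, not a value from the article. With N = 10 the normal approximation is at best rough, so the number serves only to show the mechanics.

```python
import math


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [list(row) for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if a[pivot][col] == 0:
            return 0.0
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det                      # a row swap flips the sign
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def generalized_variance_statistic(sample_cov, hypothesized_cov, n_obs):
    """Return (|S|/|Sigma*|, approximate standard normal deviate)."""
    p = len(sample_cov)
    ratio = determinant(sample_cov) / determinant(hypothesized_cov)
    z = math.sqrt(n_obs - 1) * (ratio - 1.0) / math.sqrt(2 * p)
    return ratio, z


S_TABLE_4 = [[69.88, 32.14], [32.14, 25.12]]   # sample dispersion matrix, Table 4
SIGMA_STAR = [[70.0, 30.0], [30.0, 25.0]]      # hypothetical null value, for illustration

ratio, z = generalized_variance_statistic(S_TABLE_4, SIGMA_STAR, n_obs=10)
print(round(ratio, 4), round(z, 4))
```

A ratio near unity (and hence a deviate near zero) is what the null hypothesis leads one to expect; as the text cautions, a ratio near unity does not by itself show that $S$ is close to $\Sigma^*$.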
Wilks's use of generalized variances is only one possible generalization of univariate procedures. Other comparisons of $S$ and $\Sigma^*$ are possible. In nondegenerate cases, $\Sigma^*$ is nonsingular, and the product matrix $S\Sigma^{*-1}$ should be approximately an identity matrix. All of the characteristic roots from the determinantal equation $|S\Sigma^{*-1} - \lambda I| = 0$, where $I$ is the p × p identity matrix, should be near unity; the trace, $\mathrm{tr}\,S\Sigma^{*-1}$, should be near p. Roy (1957, sec. 14.9) places major emphasis on the largest and smallest roots of $S$ and $\Sigma$ and gives approximate confidence bounds on the roots of the latter in terms of those of the former. A test of $H_{02}$ may be devised with the hypothesis being rejected when the corresponding roots of $\Sigma^*$ fail to fall within the confidence bounds. These and other similar considerations have led to extensive study of the distributions of roots of determinantal equations. Complete and exact solutions to these multivariate problems are not available. Suppose that two independent multivariate normal populations have dispersion matrices $\Sigma^{(1)}$ and $\Sigma^{(2)}$, and samples of independent observation vectors of sizes $N_1$ and $N_2$ yield, respectively, sample dispersion matrices $S^{(1)}$ and $S^{(2)}$. The hypothesis of interest is $H_{03}: \Sigma^{(1)} = \Sigma^{(2)}$. In the univariate case (p = 1), the statistic $F = |S^{(1)}|/|S^{(2)}| = S_{(1)}^2/S_{(2)}^2$ is the simple variance ratio and, under $H_{03}$, has the F-distribution with $N_1 - 1$ and $N_2 - 1$ degrees of freedom. The general likelihood ratio criterion for testing $H_{03}$ is, with minor adjustment,

$$\lambda = \frac{|S^{(1)}|^{(N_1 - 1)/2}\,|S^{(2)}|^{(N_2 - 1)/2}}{|S|^{(N_1 + N_2 - 2)/2}},$$

where

$$S = \frac{(N_1 - 1)S^{(1)} + (N_2 - 1)S^{(2)}}{N_1 + N_2 - 2}.$$

If p = 1, then $\lambda$ is a function of F alone. By asymptotic theory for large $N_1$ and $N_2$, $-2\log_e\lambda$ may be taken to have the central chi-square distribution with $\frac{1}{2}p(p + 1)$ degrees of freedom under $H_{03}$. Anderson (1958, secs. 10.2, 10.4-10.6) discusses these problems further. Roy (1957, sec. 14.10) prefers again to consider characteristic roots and develops test procedures and confidence procedures based on the largest and smallest roots of $S^{(1)}S^{(2)-1}$. Heck (1960) has provided some charts of upper percentage points of the distribution of the largest characteristic root.
Multivariate analysis of variance. Multivariate analysis of variance bears the same relationship to the problems of generalized variances as does univariate analysis of variance to simple variances. An understanding of the basic principles of the analysis of variance is necessary to consider the multivariate generalization. The theory of general linear hypotheses is pertinent, and concepts of experimental design carry over to the multivariate case. [See EXPERIMENTAL DESIGN, article on THE DESIGN OF EXPERIMENTS; LINEAR HYPOTHESES, article on ANALYSIS OF VARIANCE.]
Consider the univariate randomized block design with v treatments and b blocks. A response, $X_{\gamma\delta}$, on treatment $\gamma$ in block $\delta$, $\gamma = 1, \dots, v$, $\delta = 1, \dots, b$, is expressed in the fixed-effects model (Model I) as the linear function $X_{\gamma\delta} = \mu + \tau_\gamma + \beta_\delta + \epsilon_{\gamma\delta}$, where $\mu$ is the over-all mean level of response, $\tau_\gamma$ is the modifying effect of treatment $\gamma$ ($\sum_{\gamma=1}^{v}\tau_\gamma = 0$), $\beta_\delta$ is the special influence of block $\delta$ ($\sum_{\delta=1}^{b}\beta_\delta = 0$), and $\epsilon_{\gamma\delta}$ is a random error such that the set of vb errors are independent univariate normal variates with zero means and equal variances, $\sigma^2$. The multivariate generalization of this model replaces the scalar variate $X_{\gamma\delta}$ with a p-variate column vector $X_{\gamma\delta}$ with elements $X_{\gamma\delta i}$, $i = 1, \dots, p$, consisting of responses on each of p variates for treatment $\gamma$ in block $\delta$. Similarly, the scalars $\mu$, $\tau_\gamma$, $\beta_\delta$, and $\epsilon_{\gamma\delta}$ are replaced by p-element column vectors, and the vectors $\epsilon_{\gamma\delta}$ constitute a set of vb independent multivariate normal vector variates with zero means and common dispersion matrices, $\Sigma$. In univariate analysis of variance, treatment and error mean squares are calculated. If these are $S_T^2$ and $S_E^2$, their forms are

$$S_T^2 = \frac{b\sum_{\gamma=1}^{v}(\bar{X}_{\gamma\cdot} - \bar{X}_{\cdot\cdot})^2}{v - 1}$$

and

$$S_E^2 = \frac{\sum_{\gamma=1}^{v}\sum_{\delta=1}^{b}(X_{\gamma\delta} - \bar{X}_{\gamma\cdot} - \bar{X}_{\cdot\delta} + \bar{X}_{\cdot\cdot})^2}{(v - 1)(b - 1)},$$

where $\bar{X}_{\gamma\cdot} = \sum_{\delta=1}^{b}X_{\gamma\delta}/b$, $\bar{X}_{\cdot\delta} = \sum_{\gamma=1}^{v}X_{\gamma\delta}/v$, and $\bar{X}_{\cdot\cdot} = \sum_{\gamma=1}^{v}\sum_{\delta=1}^{b}X_{\gamma\delta}/vb$. The test of treatment equality is the test of the hypothesis $H_{04}: \tau_1 = \dots = \tau_v\ (= 0)$; the statistic used is $F = S_T^2/S_E^2$, distributed as F with v - 1 and (v - 1)(b - 1) degrees of freedom under $H_{04}$ with large values of F statistically significant. When $H_{04}$ is true, both $S_T^2$ and $S_E^2$ provide unbiased estimates of $\sigma^2$ and are independent in probability, whereas when $H_{04}$ is false, $S_E^2$ still gives an unbiased estimate of $\sigma^2$. In the multivariate case the mean squares are replaced by p × p matrices $S_T$ and $S_E$ with elements

$$S_{Tij} = \frac{b\sum_{\gamma=1}^{v}(\bar{X}_{\gamma\cdot i} - \bar{X}_{\cdot\cdot i})(\bar{X}_{\gamma\cdot j} - \bar{X}_{\cdot\cdot j})}{v - 1}$$

and

$$S_{Eij} = \frac{\sum_{\gamma=1}^{v}\sum_{\delta=1}^{b}(X_{\gamma\delta i} - \bar{X}_{\gamma\cdot i} - \bar{X}_{\cdot\delta i} + \bar{X}_{\cdot\cdot i})(X_{\gamma\delta j} - \bar{X}_{\gamma\cdot j} - \bar{X}_{\cdot\delta j} + \bar{X}_{\cdot\cdot j})}{(v - 1)(b - 1)}$$

for $i, j = 1, \dots, p$. It can be shown that $S_T$ and $S_E$ have independent Wishart distributions with v - 1 and (v - 1)(b - 1) degrees of freedom and identical dispersion matrices, $\Sigma$, under $H_{04}$. Thus, the multivariate analysis-of-variance problem is reduced again to the problem of comparing two dispersion matrices, $S_T$ and $S_E$, like $S^{(1)}$ and $S^{(2)}$ of the preceding section. This is the general situation in multivariate analysis of variance, even though this illustration is for a particular experimental design. Wilks (1932a; 1935) recommended use of the statistic $|S_E|/|S_E + S_T|$, Roy (1953) considered the largest root of $S_TS_E^{-1}$, and Lawley (1938) suggested $\mathrm{tr}(S_TS_E^{-1})$. These statistics correspond roughly to criteria on the product of characteristic roots, the largest root, and the sum of the roots, respectively. They lead to equivalent tests in the univariate case (where only one root exists), but the tests are not equivalent in the multivariate case. Pillai (1964; 1965) has tables and references on the distribution of the largest root. A paper by Smith, Gnanadesikan, and Hughes (1962) is recommended as an elementary expository summary with a realistic example.
Other procedures. Other, more specialized statistical procedures have been developed for means, variances, and covariances for multivariate normal populations, particularly tests of special hypotheses. Many models based on the univariate normal distribution may be regarded as special cases of multivariate normal models. In particular, it is often assumed that observations are independent in probability and have homogeneous variances, $\sigma^2$. A test of such assumptions may sometimes be made if the sample is regarded as N observation vectors from a p-variate multivariate normal population with special dispersion matrix under a null hypothesis $H_{05}: \Sigma = \sigma^2I$, where $I$ is the p × p identity matrix and $\sigma^2$ is the unknown common variance. This test and a generalization of it are discussed by Anderson (1958, sec. 10.7). See also Wilks (1962, problem 18.21).
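The relation among the three multivariate analysis-of-variance criteria mentioned earlier (the Wilks determinant ratio, Roy's largest root, and Lawley's trace) can be seen in a small numerical sketch. It is restricted to p = 2 so that the characteristic roots follow from a quadratic equation; the function names are hypothetical, and the matrices `S_T` and `S_E` are invented numbers for illustration, not data from the article.

```python
import math


def det_2x2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def inverse_2x2(m):
    det = det_2x2(m)
    if det == 0:
        raise ValueError("matrix is singular")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def matmul_2x2(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def manova_criteria(s_t, s_e):
    """Return the roots of S_T S_E^{-1} and three criteria built from them."""
    product = matmul_2x2(s_t, inverse_2x2(s_e))
    trace = product[0][0] + product[1][1]
    det = det_2x2(product)
    # Roots of lambda^2 - trace * lambda + det = 0; real for symmetric
    # nonnegative definite S_T and positive definite S_E.
    disc = math.sqrt(max(trace * trace - 4.0 * det, 0.0))
    roots = [(trace + disc) / 2.0, (trace - disc) / 2.0]
    total = [[s_e[i][j] + s_t[i][j] for j in range(2)] for i in range(2)]
    return {
        "roots": roots,
        "wilks": det_2x2(s_e) / det_2x2(total),   # |S_E| / |S_E + S_T|
        "roy": roots[0],                          # largest root
        "lawley": trace,                          # sum of the roots
    }


S_T = [[40.0, 12.0], [12.0, 10.0]]   # hypothetical treatment matrix
S_E = [[20.0, 5.0], [5.0, 8.0]]      # hypothetical error matrix

criteria = manova_criteria(S_T, S_E)
r1, r2 = criteria["roots"]
print(criteria)
# The determinant ratio equals the product of 1 / (1 + root) over the roots.
print(1.0 / ((1.0 + r1) * (1.0 + r2)))
```

With a single root (p = 1) all three criteria are monotone functions of that root, which is why the tests coincide in the univariate case; with two or more roots they weigh the roots differently and can disagree.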
Wilks (1946; 1962, problem 18.22) developed a series of tests on means, variances, and covariances for multivariate normal populations. He considered three hypotheses,

$H_{06}$: $\mu_i = \mu$, $\sigma_{ii} = \sigma^2$, $\sigma_{ij} = \rho\sigma^2$, $i \ne j$, $i, j = 1, \dots, p$;
$H_{07}$: $\sigma_{ii} = \sigma^2$, $\sigma_{ij} = \rho\sigma^2$, $i \ne j$, $i, j = 1, \dots, p$;
$H_{08}$: $\mu_i = \mu$, given that $\sigma_{ii} = \sigma^2$, $\sigma_{ij} = \rho\sigma^2$, $i \ne j$, $i, j = 1, \dots, p$.

$H_{06}$ implies equality of means, equality of variances, and equality of covariances; $H_{07}$ makes no assumption about the means but implies equality of variances and equality of covariances; $H_{08}$ is a hypothesis about equality of means given the special dispersion matrix, $\Sigma$, specified through equality of its diagonal elements and equality of its nondiagonal elements. In these hypotheses $\rho$ is the intraclass correlation, which has been considered in various contexts by other authors [see MULTIVARIATE ANALYSIS, article on CORRELATION (1)]. Wilks showed that the test of $H_{08}$ leads to the usual, univariate, analysis-of-variance test for treatments in a two-way classification. For $H_{06}$ and $H_{07}$, likelihood ratio tests were devised and moments of the test statistics were obtained with exact distributions in special cases and asymptotic ones otherwise.

Other topics of multivariate analysis

This general discussion of multivariate analysis would not be complete without mention of basic concepts of other major topics discussed elsewhere in this encyclopedia.
Discriminant functions. Classification problems are encountered in many contexts [see MULTIVARIATE ANALYSIS, article on CLASSIFICATION AND DISCRIMINATION]. Several populations are known to exist, and information on their characteristics is available, perhaps from samples of individuals or items identified with the populations. A particular individual or item of unknown population is to be classified into one of the several populations on the basis of its particular characteristics. This and related problems were considered by early workers in the field and more recently in the context of statistical decision theory, which seems particularly appropriate for this subject [see DECISION THEORY].
Correlation. The simple product-moment correlation coefficient between variates $X_i$ and $X_j$ was defined above as $\rho_{ij}$, with similarly defined sample correlation, $r_{ij}$ [see MULTIVARIATE ANALYSIS, articles on CORRELATION]. In the bivariate case (p = 2) the exact small sample distributions of $r_{12}$ based on the bivariate normal model were developed by Fisher and Hotelling. The multiple correlation between $X_1$, say, and the set $X_2, \dots, X_p$ may be defined as the maximum simple correlation between $X_1$ and a linear function $\beta_2X_2 + \dots + \beta_pX_p$, maximized through choice of $\beta_2, \dots, \beta_p$. Partial correlations have been developed as correlations in conditional distributions. Canonical correlations extend the notion of multiple correlation to two groups of variates. If the variate vector, $X$, is subdivided so that $X' = (X^{(s)\prime}, X^{(t)\prime})$, $X^{(s)}$ being the column vector with elements $X_1, \dots, X_s$ and $X^{(t)}$ being the column vector with elements $X_{s+1}, \dots, X_p$ (p = s + t), the largest canonical correlation is the maximum simple correlation between two linear functions, $Y^{(s)} = \sum_{i=1}^{s}\beta_iX_i$ and $Y^{(t)} = \sum_{i=s+1}^{p}\gamma_iX_i$. The second largest canonical correlation is the maximum simple correlation between two new linear functions, $Y'^{(s)}$ and $Y'^{(t)}$, similar to $Y^{(s)}$ and $Y^{(t)}$ but uncorrelated with $Y^{(s)}$ and $Y^{(t)}$, and so on. Distribution theory and related problems are given by Anderson (1958, chapter 12) and Wilks (1962, sec. 18.9). The theory of rank correlation is well developed in the bivariate case [see NONPARAMETRIC STATISTICS, article on RANKING METHODS]. Tetrachoric and biserial correlation coefficients have been considered for special situations.
Principal components. The problem of principal components and factor analysis is a problem in the reduction of the number of variates and in their interpretation [see FACTOR ANALYSIS]. The method of principal components considers uncorrelated linear functions of the p original variates with a view to expressing major characteristic variation in terms of a reduced set of new variates. Hotelling has been responsible for much of the development of principal components, and the somewhat parallel treatments of factor analysis have been developed more by psychometricians than by statisticians. References for principal components are Anderson (1958, chapter 11), Wilks (1962, sec. 18.6), and Kendall ([1957] 1961, chapter 2). Kendall ([1957] 1961, chapter 3) gives an expository account of factor analysis.
Counted data. The multinomial distribution plays an important role in analysis when multivariate data consist of counts of the number of individuals or items in a sample that have specified categorical characteristics. The multivariate analysis of counted data follows consideration of contingency tables and relationships between the probability parameters of the multinomial distribution.
Much has been done on tests of independence in such tables, and recently investigators have developed more systematically analogues of standard multivariate techniques for contingency tables [see COUNTED DATA]. Nonparametric statistics. There has been a paucity of multivariate techniques in nonparametric statistics. Except for work on rank correlation, only a few isolated multivariate methods have been developed—for example, bivariate sign tests. The difficulty appears to be that adequate models for multivariate nonparametric methods must contain measures of association (or of nonindependence) that sharply limit the application of the permutation techniques of nonparametric statistics [see NONPARAMETRIC STATISTICS].
Missing values. Only limited results are available in multivariate analysis when some observations are missing from observation vectors. Wilks (1932b), in considering the bivariate normal distribution with missing observations, provided several methods of parameter estimation and compared them. Maximum likelihood estimation was somewhat complicated, but two ad hoc methods proved simpler and yielded exact forms of sampling distributions. Basically, one may obtain estimates of means and variances through weighted averages of means and variances of the available data and estimate correlations from the available data on pairs of variates. If only a few observations are missing, usual analyses should not be much affected; if many observations are missing, little advice may be given except to suggest the use of maximum likelihood techniques and computers for the special situation. It is clearly inappropriate to treat missing observations as zero observations, as has sometimes been done. Some useful references are Anderson (1957), Buck (1960), Nicholson (1957), and Matthai (1951).
Other multivariate results. In a general discussion of multivariate analysis, it is not possible to consider all areas where multivariate data may arise or all theoretical results of probability and statistics that may be pertinent to multivariate analysis. Many of the theorems of probability admit of multivariate extensions; results in stochastic processes, the theory of games, decision theory, and so on, may have important, although perhaps not implemented, multivariate generalizations.
RALPH A. BRADLEY

BIBLIOGRAPHY
Multivariate analysis is complex in theory, in application, and in interpretation. Basic works should be consulted, and examples of applications in various subject areas should be examined critically. The theory of multivariate analysis is well presented in Anderson 1958; its excellent bibliography and reference notations by section make it a good guide to works in the field. Among books on mathematical statistics, other major works are Rao 1952; Kendall & Stuart (1946) 1961; Roy 1957; Wilks 1946; 1962. Greenwood & Hartley 1962 gives references to tables. T. W. Anderson is completing a bibliography of multivariate analysis. Books more related to the social sciences are Cooley & Lohnes 1962; Talbot & Mulhall 1962. Papers that are largely expository and bibliographical are Tukey 1949; Bartlett 1947; Wishart 1955; Feraud 1942; and Smith, Gnanadesikan, & Hughes 1962. Some applications in the social sciences are given in Tyler 1952; Rao & Slater 1949; Tintner 1946; Kendall 1957.

ANDERSON, T. W. 1957 Maximum Likelihood Estimates for a Multivariate Normal Distribution When Some Observations Are Missing. Journal of the American Statistical Association 52:200-203.
ANDERSON, T. W. 1958 An Introduction to Multivariate Statistical Analysis. New York: Wiley.
BARTLETT, M. S. 1947 Multivariate Analysis. Journal of the Royal Statistical Society Series B 9 (Supplement): 176-190. → A discussion of Bartlett's paper appears on pages 190-197.
BUCK, S. F. 1960 A Method of Estimation of Missing Values in Multivariate Data Suitable for Use With an Electronic Computer. Journal of the Royal Statistical Society Series B 22:302-306.
COOLEY, WILLIAM W.; and LOHNES, PAUL R. 1962 Multivariate Procedures for the Behavioral Sciences. New York: Wiley.
FERAUD, L. 1942 Problème d'analyse statistique à plusieurs variables. Lyon, Université de, Annales 3d Series, Section A 5:41-53.
GREENWOOD, J. ARTHUR; and HARTLEY, H. O. 1962 Guide to Tables in Mathematical Statistics. Princeton Univ. Press. → A sequel to the guides to mathematical tables produced by and for the Committee on Mathematical Tables and Aids to Computation of the National Academy of Sciences-National Research Council of the United States.
HECK, D. L. 1960 Charts of Some Upper Percentage Points of the Distribution of the Largest Characteristic Root. Annals of Mathematical Statistics 31:625-642.
JACKSON, J. EDWARD; and BRADLEY, RALPH A. 1961 Sequential χ²- and T²-tests. Annals of Mathematical Statistics 32:1063-1077.
KENDALL, M. G. (1957) 1961 A Course in Multivariate Analysis. London: Griffin.
KENDALL, M. G.; and STUART, ALAN (1946) 1961 The Advanced Theory of Statistics. Rev. ed. Volume 2: Inference and Relationship. New York: Hafner; London: Griffin. → The first edition was written by Kendall alone.
KUDO, AKIO 1963 A Multivariate Analogue of the One-sided Test. Biometrika 50:403-418.
LAWLEY, D. N. 1938 Generalization of Fisher's z Test. Biometrika 30:180-187.
MATTHAI, ABRAHAM 1951 Estimation of Parameters From Incomplete Data With Application to Design of Sample Surveys. Sankhyā 11:145-152.
MORRISON, DONALD F. 1967 Multivariate Statistical Methods. New York: McGraw-Hill. → Written for investigators in the life and behavioral sciences.
NICHOLSON, GEORGE E. JR. 1957 Estimation of Parameters From Incomplete Multivariate Samples. Journal of the American Statistical Association 52:523-526.
NUESCH, PETER E. 1966 On the Problem of Testing Location in Multivariate Populations for Restricted Alternatives. Annals of Mathematical Statistics 37:113-119.
PILLAI, K. C. SREEDHARAN 1964 On the Distribution of the Largest of Seven Roots of a Matrix in Multivariate Analysis. Biometrika 51:270-275.
PILLAI, K. C. SREEDHARAN 1965 On the Distribution of the Largest Characteristic Root of a Matrix in Multivariate Analysis. Biometrika 52:405-414.
RAO, C. RADHAKRISHNA 1952 Advanced Statistical Methods in Biometric Research. New York: Wiley.
RAO, C. RADHAKRISHNA; and SLATER, PATRICK 1949 Multivariate Analysis Applied to Differences Between Neurotic Groups. British Journal of Psychology Statistical Section 2:17-29. → See also "Correspondence," page 124.
ROY, S. N. 1953 On a Heuristic Method of Test Construction and Its Use in Multivariate Analysis. Annals of Mathematical Statistics 24:220-238.
ROY, S. N. 1957 Some Aspects of Multivariate Analysis. New York: Wiley.
SIDAK, ZBYNEK 1967 Rectangular Confidence Regions for the Means of Multivariate Normal Distributions. Journal of the American Statistical Association 626-633.
SMITH, H.; GNANADESIKAN, R.; and HUGHES, J. B. 1962 Multivariate Analysis of Variance (MANOVA). Biometrics 18:22-41.
TALBOT, P. AMAURY; and MULHALL, H. 1962 The Physical Anthropology of Southern Nigeria: A Biometric Study in Statistical Method. Cambridge Univ. Press.
TINTNER, GERHARD 1946 Some Applications of Multivariate Analysis to Economic Data. Journal of the American Statistical Association 41:472-500.
TUKEY, JOHN W. 1949 Dyadic ANOVA: An Analysis of Variance for Vectors. Human Biology 21:65-110.
TYLER, FRED T. 1952 Some Examples of Multivariate Analysis in Educational and Psychological Research. Psychometrika 17:289-296.
WILKS, S. S. 1932a Certain Generalizations in the Analysis of Variance. Biometrika 24:471-494.
WILKS, S. S. 1932b Moments and Distributions of Estimates of Population Parameters From Fragmentary Samples. Annals of Mathematical Statistics 3:1 195.
WILKS, S. S. 1935 On the Independence of k Sets of Normally Distributed Statistical Variables. Econometrica 3:309-326.
WILKS, S. S. 1946 Sample Criteria for Testing Equality of Means, Equality of Variances, and Equality of Covariances in a Normal Multivariate Distribution. Annals of Mathematical Statistics 17:257-281.
WILKS, S. S. 1962 Mathematical Statistics. New York: Wiley. → An earlier version of some of this material was issued in 1943.
WISHART, JOHN 1955 Multivariate Analysis. Applied Statistics 4:103-116.

II CORRELATION (1)
CORRELATION (1) is a general overview of the topic; CORRELATION (2) goes into more detail about certain aspects. The term "correlation" has been used in a variety of contexts to indicate the degree of interrelation between two or more entities. One reads, for example, of the correlation between intelligence and wealth, between illiteracy and prejudice, and so on. When used in this sense the term is not sufficiently operational for scientific work. One must instead speak of correlation between numerical measures of entities, in short, of correlation between variables. If statistical inference is to be used, the variables must be random variables, and for them a probability model must be specified. For two random variables, X and Y, this model will describe the probabilities (or probability densities) with which
(X, Y) takes values (x, y); that is, it will describe probabilities in the (X, Y)-population. One of the characteristics of this population is the correlation coefficient; the available information concerning it is usually in the form of a random sample, $(X_1, Y_1), \dots, (X_n, Y_n)$. Thus, correlation theory is concerned with the use of samples to estimate, test hypotheses, or carry out other procedures concerning population correlations. Surprisingly enough, confusion occasionally sets in, even at this early stage. There are deplorable examples in the literature in which the authors of a study are concerned with whether a certain sample coefficient of correlation can be computed instead of with whether it will be useful to compute it in the light of the research goal and of some special model. The so-called Pearson product-moment correlation coefficient (usually denoted by $\rho$ in the population and r in the sample, and usually termed just the correlation coefficient) is the one most frequently encountered, and the purpose of this article is to survey the situations in which it is employed. Other sorts of correlation include rank correlation, serial correlation, and intraclass correlation. [For a discussion of rank correlation, see NONPARAMETRIC STATISTICS, article on RANKING METHODS; for serial correlation, see TIME SERIES. Intraclass correlation will be touched on briefly at the end of this article.] First, simple correlation between X and Y will be considered, then multiple correlation between a single variable, $X_0$, and a set of variables, $(X_1, \dots, X_p)$, and finally canonical correlation between two sets, $(Y_1, \dots, Y_q)$ and $(X_1, \dots, X_p)$. Partial correlation will be discussed in connection with multiple correlation. The case of two variables is a sufficient setting in which to discuss relationships with regression theory and to point out common errors made in applying correlation methods.
The two most important models for correlation theory are the linear regression model, discussed below (see also Binder 1959), and the joint normal model. The joint normal model plays a central role in the theory for several reasons. First, the conditions for its approximate validity are frequently met. Second, it is mathematically tractable. Finally, of those joint probability laws for which $\rho$ is actually a measure of independence, the joint normal model is perhaps the simplest to deal with. For any two random variables, X and Y, it follows from the definition of $\rho$ that if the variables are independent, they are uncorrelated; hence, to conclude that the hypothesis of zero correlation is false is to assert dependence for X and Y. In the other direction, if X, Y follow a bivariate normal law and are uncorrelated, then X and Y are independent, but this conclusion does not hold in general; the assumption of normality (or some other, similar restriction) is essential, and it is even possible that X and Y are uncorrelated and also perfectly related by a (nonlinear) function. If the probability law for X, Y is only approximately bivariate normal, conventional normal theory can still be applied; in fact, considerable departure from normality may be tolerated (Gayen 1951). For large samples, r itself is in any case approximately normal with mean $\rho$ and with a standard deviation that can be derived if enough is known about the joint probability distribution of X, Y.
Many misconceptions prevail about the interpretation of correlation. These stem in part from the fact that early work in the field reflected confusion about the distinction between sample estimators and their population counterparts. For some time workers were also under the impression that high correlation implies the existence of a cause-and-effect relation when in fact neither correlation, regression, nor any other purely statistical procedure would validate such a relation.
Historically, research in the theory of correlation may be divided into four phases. In the latter part of the nineteenth century Galton and others realized the value of correlation in their work but could deal with it only in a vague, descriptive way [see GALTON]. About the turn of the century Karl Pearson, Edgeworth, and Yule developed some real theory and systematized the use of correlation [see EDGEWORTH; PEARSON; YULE]. From about 1915 to 1928, R. A. Fisher placed the theory of correlation on a more or less rigorous footing by deriving exact probability laws and methods of estimation and testing [see FISHER, R. A.]. Finally, in the 1930s first Hotelling and then Wilks, M. G. Kendall, and others, spurred on by psychologists, particularly Spearman and Thurstone, developed principal component analysis (closely related to factor analysis) and canonical correlation. [For a discussion of principal component analysis, see FACTOR ANALYSIS; for a discussion of canonical correlation, see MULTIVARIATE ANALYSIS, article on CORRELATION (2), and the section "Canonical correlation" below. See also the biographies of SPEARMAN and THURSTONE.]
Along with the mathematical development there occurred an increasing realization among social scientists of the value of mathematics in their work. This produced better communication between them and statistical theorists and also led them to discard the older, and often incorrect, treatments of correlation on which they had relied. Correlation theory is now recognized as an important tool in experimentation, especially in those situations involving many variables. Its main value is in suggesting lines along which further research can be directed in a search for possible cause-and-effect relations in complex situations [see CAUSATION].
In every field of application there are books describing correlation methods and, just as important, acquainting the reader with the types of data he will handle. Some examples are the works of McNemar (1949) in psychology, Croxton, Cowden, and Klein (1939) in economics and sociology, and Johnson (1949) in education. Mathematical treatments on several levels are also available. An excellent elementary work is the book by Wallis and Roberts (1956), which requires very little knowledge of mathematics, yet presents statistical concepts carefully and fully. Those equipped with more mathematics should find the books of Anderson and Bancroft (1952) and Yule and Kendall (1958), at an intermediate level, and Kendall and Stuart (1958-1966), at an advanced level, quite useful.
In the mathematical study of correlations between several variables the natural language is that of classical matrix theory; some knowledge of matrices, linear transformations, quadratic forms, and determinantal equations is required. This expository presentation, however, will not require background in these topics.

Simple correlation

For two jointly distributed random variables, X and Y, denote their population standard deviations by $\sigma_X$ and $\sigma_Y$ and their population covariance by $\sigma_{XY}$. The correlation coefficient is then defined as $\rho_{XY} = \sigma_{XY}/\sigma_X\sigma_Y$. (Both standard deviations are positive, except for the uninteresting case in which one or both variables are constant; in that case the correlation coefficient is undefined.) As Feller has remarked, this definition would lead a physicist to regard $\rho_{XY}$ as "dimensionless covariance."

Elementary properties of $\rho_{XY}$ are that it lies between $-1$ and $+1$, that it is unchanged if constants are added to X and Y or if X and Y are multiplied by nonzero constants of the same sign (if the signs are different, the sign of $\rho_{XY}$ will be changed), and that it takes one of its extreme values only if a perfect linear relation, $Y = a + bX$, exists ($-1$ for $b < 0$, $+1$ for $b > 0$). Also, since the variance of a linear combination is frequently needed, one should note the important relation

$$\sigma^2_{cX+dY} = c^2\sigma_X^2 + d^2\sigma_Y^2 + 2cd\,\rho_{XY}\sigma_X\sigma_Y .$$

The sample correlation coefficient, based on a sample $(X_1, Y_1), \ldots, (X_n, Y_n)$, is

$$r_{XY} = \frac{\sum (X_i - \bar{X})(Y_i - \bar{Y})}{\left[\sum (X_i - \bar{X})^2 \sum (Y_i - \bar{Y})^2\right]^{1/2}} = \frac{s_{XY}}{s_X s_Y},$$

where $\bar{X}$, $\bar{Y}$ are the sample means and $s_X$, $s_Y$, $s_{XY}$ are the sample standard deviations and covariance. Regardless of the specific model adopted, $r_{XY}$ can be used to estimate $\rho_{XY}$ and will have some desirable properties: $r_{XY}$ lies between $-1$ and $+1$ and has approximately $\rho_{XY}$ for its population mean; $r_{XY}$ is a consistent estimator of $\rho_{XY}$, that is, if the sample size is increased indefinitely, $\Pr(|r_{XY} - \rho_{XY}| < \epsilon)$ approaches 1, no matter how small a positive constant $\epsilon$ is chosen.

Normal model. If the joint probability law is bivariate normal, that is, probability is interpreted as volume under the surface

$$f(x, y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho_{XY}^2}} \exp\left\{-\frac{1}{2(1-\rho_{XY}^2)}\left[\left(\frac{x-\mu_X}{\sigma_X}\right)^2 - 2\rho_{XY}\left(\frac{x-\mu_X}{\sigma_X}\right)\left(\frac{y-\mu_Y}{\sigma_Y}\right) + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2\right]\right\},$$

then $f(x, y)$ factors into an expression in x times an expression in y (the condition defining independence of X, Y) if and only if $\rho_{XY} = 0$.

Under normality, $r_{XY}$ is the maximum likelihood estimator of $\rho_{XY}$. Further, the probability law of $r_{XY}$ has been derived (Fisher 1915) and tabulated (David 1938). The statistic $(n-2)^{1/2}\, r_{XY}/(1 - r_{XY}^2)^{1/2}$ can be referred to the t-table with $n - 2$ degrees of freedom to test the hypothesis H: $\rho_{XY} = 0$. In addition, charts (David 1938), which have been reproduced in many books, are available for the determination of confidence intervals. The variable $z = \tanh^{-1} r_{XY}$ is known (Fisher 1925, pp. 197 ff.) to have an approximate normal law with mean $\tanh^{-1}\rho_{XY}$ and standard deviation $1/\sqrt{n-3}$, even for n as small as 10; thus, the z-transformation is especially useful, for example, in testing whether two (X, Y)-populations have the same correlation. Also, it has the advantage of stabilizing variances; that is, the approximate variance of z depends on n but not on $\rho_{XY}$. The quantity $r_{XY}$ itself, though approximately normal with mean $\rho_{XY}$ and standard deviation $(1 - \rho_{XY}^2)/\sqrt{n}$ for very large n, will still be far from normal for moderate n when $\rho_{XY}$ is not near zero.

Even in the bivariate normal case currently under discussion, $r_{XY}$ does not have population mean exactly $\rho_{XY}$, but the slight discrepancy can
be greatly reduced (Olkin & Pratt 1958) by using $r_{XY}[1 + (1 - r_{XY}^2)/2(n-3)]$ instead as an estimator for $\rho_{XY}$.

Biserial and point-biserial correlations. If one variable, say Y, is dichotomized at some unknown point $\omega$, then the data from the (X, Y)-population appear in the form of a sample from an (X, Z)-population, with Z one or zero according as $Y \ge \omega$ or $Y < \omega$. If $\rho_{XY}$ and $\omega$ are of interest, they can be estimated by $r_b$ (biserial r) and $\omega_b$, or by maximum likelihood estimators $\hat{\rho}_{XY}$ and $\hat{\omega}$ (Tate 1955a; 1955b). The latter estimators are jointly normal for large n, and tables of standard deviations are available (Prince & Tate 1966). If $\rho_{XZ}$ is desired, it can be estimated by $r_{XZ}$, usually called point-biserial r. If, however, the assumption of underlying bivariate normality is correct, then $\rho_{XZ} \le \sqrt{2/\pi}\,\rho_{XY}$, so $r_{XZ}$ would be a bad estimator of $\rho_{XY}$. If one thinks in terms of models rather than data, there is no need for confusion on this point. Tate (1955b) gives an expository discussion of both models.

Tetrachoric correlation. If in the bivariate normal case both X and Y are observable only in dichotomized form, the sample values can be arranged in a 2 × 2 table, and one can calculate $r_t$, the so-called tetrachoric r. Unfortunately, the tetrachoric model is not amenable to the same type of simple mathematical treatment as is the biserial model. On this point the reader should consult Kendall and Stuart (1958-1966).

Relation of correlation to regression. The notion of regression is appropriate in a situation in which one needs to predict Y, or to estimate the conditional population mean of Y, for given X [see LINEAR HYPOTHESES, article on REGRESSION]. The discussion given here will be sufficiently general to bring out the meanings of the correlation coefficient and the correlation ratio in regression analysis and to indicate connections between them.
The reader should keep two facts in mind: (1) predictions are described by regression relations, whereas their accuracy is measured by correlation, and (2) assumptions of bivariate normality are not required in order to introduce the notion of regression and to carry its development quite far.

A prediction of Y, $\phi(X)$, is judged "best" (in the sense of least squares), quite apart from assumptions of normality, if it makes the expected mean-square error, $E(Y - \phi(X))^2$, a minimum. It turns out that for best prediction, $\phi(X)$ must be $\mu_{Y|X}$, the mean of the conditional probability law (also often referred to as the regression function) for Y given X, but that if only straight lines are allowed as candidates, the "best" such gives the prediction $A + BX$, with $A = \mu_Y - \mu_X(\rho_{XY}\sigma_Y/\sigma_X)$, $B = \rho_{XY}\sigma_Y/\sigma_X$. The basic quantities of interest, $\mu_{Y|X}$ and $A + BX$, lead to the following decomposition of $Y - \mu_Y$:

$$Y - \mu_Y = (Y - \mu_{Y|X}) + (A + BX - \mu_Y) + (\mu_{Y|X} - A - BX).$$

It can be shown that the right-hand terms are uncorrelated and that, therefore, by squaring and taking expected values they satisfy the basic relation

$$\sigma_Y^2 = E(Y - \mu_{Y|X})^2 + E(A + BX - \mu_Y)^2 + E(\mu_{Y|X} - A - BX)^2 .$$

These terms may be conveniently interpreted as portions of the variation of Y: the first is the variation "unexplained" by X, and the sum of the second and third is the variation explained by the "best" prediction, the second term being the amount explained by the "best" linear prediction. The quantity

$$\eta_{YX}^2 = 1 - \frac{E(Y - \mu_{Y|X})^2}{\sigma_Y^2},$$

the squared correlation ratio for Y on X, is the proportion of variation in Y "explained" by X (that is, by regression). Since it can be shown that $E(A + BX - \mu_Y)^2 = \rho_{XY}^2\sigma_Y^2$, the basic relation may be rewritten as

$$\sigma_Y^2 = E(Y - \mu_{Y|X})^2 + \rho_{XY}^2\sigma_Y^2 + (\eta_{YX}^2 - \rho_{XY}^2)\sigma_Y^2 .$$
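As a worked illustration of this decomposition (an assumed example chosen for easy arithmetic, not one taken from the article), let X be standard normal and let $Y = X^2 + e$, where $e$ is independent of X with mean zero and variance 2. The regression function is then $X^2$, the "best" straight line is the constant 1, and the three portions of $\sigma_Y^2$ can be evaluated directly:

```latex
\begin{align*}
\mu_{Y|X} &= X^2, \qquad \mu_Y = 1, \qquad
  \sigma_Y^2 = \operatorname{Var}(X^2) + \operatorname{Var}(e) = 2 + 2 = 4,\\
B &= \operatorname{cov}(X, Y)/\sigma_X^2 = E(X^3) = 0, \qquad
  A = \mu_Y = 1, \qquad \rho_{XY} = 0,\\
E(Y - \mu_{Y|X})^2 &= \operatorname{Var}(e) = 2
  && \text{(unexplained)},\\
E(A + BX - \mu_Y)^2 &= \rho_{XY}^2\,\sigma_Y^2 = 0
  && \text{(explained linearly)},\\
E(\mu_{Y|X} - A - BX)^2 &= E(X^2 - 1)^2 = 2
  && \text{(explained nonlinearly)},\\
\eta_{YX}^2 &= 1 - \tfrac{2}{4} = \tfrac{1}{2}, \qquad
  \sigma_Y^2 = 2 + 0 \cdot 4 + \left(\tfrac{1}{2} - 0\right) \cdot 4 = 4 .
\end{align*}
```

In this constructed case half of the variation in Y is explained by X, yet none of it by a straight line, so the correlation coefficient alone would suggest no relationship at all.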
If the regression is linear, then $\mu_{Y|X} = A + BX$, the third term drops out of the decomposition of $Y - \mu_Y$, and $\eta_{YX}^2$, the proportion of "explained" variation, coincides with $\rho_{XY}^2$. If in addition $\sigma_{Y|X}^2$, the variance of the conditional law for Y given X, is constant, then this variance coincides with $E(Y - \mu_{Y|X})^2$, and

$$\sigma_{Y|X}^2 = \sigma_Y^2(1 - \rho_{XY}^2).$$

When X, Y follow a bivariate normal law, both conditions are met, and hence this last relation is satisfied.

In any event one can see from the basic relation, and conditions of nonnegativity for mean squares, that $0 \le \rho_{XY}^2 \le \eta_{YX}^2 \le 1$, with $\rho_{XY}^2 = \eta_{YX}^2$ if and only if the regression is linear; when that linearity of regression holds, both quantities equal zero if and only if the regression is actually constant, and both quantities equal one if and only if the point (X, Y) must always lie on a straight line. It should be noted that in general $\eta_{YX}^2 \ne \eta_{XY}^2$, whereas $\rho_{XY}$ is symmetric: $\rho_{XY} = \rho_{YX}$. Traditional terms, now rarely used, are "coefficient of determination" for $\rho_{XY}^2$, "coefficient of nondetermination" for $1 - \rho_{XY}^2$, and "coefficient of alienation" for $(1 - \rho_{XY}^2)^{1/2}$.

The use of data to predict Y from X, by fitting a sample regression curve, evidently involves two types of error, the error in estimating the true regression curve by a sample curve and the inherent sampling variability of Y (which cannot be reduced by statistical analysis) about the true regression curve. (The reader may consult Kruskal 1958 for a concise summary of the above material, together with further interpretive remarks, and Tate 1966 for an extension of these ideas to the case of three or more variables and the consequent consideration of generalized variances.)

It cannot be too strongly emphasized that the correlation coefficient is a measure of the degree of linear relationship. It is frequently the case that for variables Y and X, the regression of Y on X is linear, or at least approximately linear, for those values of X which are of interest or are likely to be encountered. For a given set of data one can test the hypothesis of linearity of regression (see Dixon & Massey [1951] 1957, secs. 11-15). If it is accepted, then the degree of relationship may be measured by a correlation coefficient. If not, then one can in any event measure the degree of relationship by a correlation ratio. In some cases it may be desirable to give two measures (that is, to give estimates of both $\rho_{XY}^2$ and $\eta_{YX}^2 - \rho_{XY}^2$), one for the degree of linear relationship and one for the degree of additional nonlinear relationship. In the case of nonlinear relationship, however, $\mu_{Y|X}$ cannot be estimated satisfactorily for any specific value X = x unless either a whole array of Y observations is available for that x or some specific nonlinear functional form is assumed for $\mu_{Y|X}$. In view of the advantages of using normal theory, it is best whenever possible to make the regression approximately linear by a suitable change of variable and to check the procedure by testing for approximate normality and linearity of regression. [See STATISTICAL ANALYSIS, SPECIAL PROBLEMS OF, article on TRANSFORMATIONS OF DATA.]
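The suggestion of reporting both a linear and a nonlinear measure can be sketched numerically. The following is a minimal illustration, not a procedure given in the article: it assumes simulated data with $Y = X^2$ plus noise, and it estimates the squared correlation ratio by grouping the X values into quantile bins so that each bin supplies an array of Y observations. The function name `correlation_ratio_squared`, the bin count, and the simulated model are all assumptions made for the example.

```python
import numpy as np


def correlation_ratio_squared(x, y, bins=50):
    """Estimate eta^2 for Y on X: between-bin variation of Y over total variation."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Quantile edges give roughly equal-sized arrays of Y observations per bin.
    edges = np.quantile(x, np.linspace(0.0, 1.0, bins + 1))
    groups = np.clip(np.searchsorted(edges, x, side="right") - 1, 0, bins - 1)
    grand_mean = y.mean()
    between = 0.0
    for g in range(bins):
        members = y[groups == g]
        if members.size:
            between += members.size * (members.mean() - grand_mean) ** 2
    total = ((y - grand_mean) ** 2).sum()
    return between / total


# Simulated (hypothetical) data: a purely nonlinear regression of Y on X.
rng = np.random.default_rng(0)
x = rng.normal(size=20_000)
y = x**2 + rng.normal(scale=np.sqrt(2.0), size=x.size)

r_squared = np.corrcoef(x, y)[0, 1] ** 2      # degree of linear relationship
eta_squared = correlation_ratio_squared(x, y)  # degree of total relationship
print(f"r^2 = {r_squared:.3f}, eta^2 = {eta_squared:.3f}")
```

Under these assumptions the squared correlation should be close to zero while the estimated correlation ratio is substantial. Binning discards the variation of the regression function within each bin, so this estimate tends to fall somewhat below the population value.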
When X, Y follow a bivariate normal law, one has not only linear regression for Y on X but also normality for the conditional law of Y given X and for the marginal law of X. If the conditions of the bivariate normal model are relaxed in order to allow X to have some type of law other than normal, while the remaining properties just mentioned are present, some interesting results can be obtained. It is known, for example (Tate 1966), that for large n, $r_{XY}$ is approximately normal with mean $\rho_{XY}$ and standard deviation $(1 - \rho_{XY}^2)(1 + \gamma\rho_{XY}^2/4)^{1/2}/\sqrt{n}$, with $\gamma$ denoting the coefficient of excess (kurtosis minus 3) for the X-population. (For a general treatment of aspects of this case, see Gayen 1951.)

It is an important fact that there is value in $r_{XY}$ even if there exists no population counterpart for
it. This arises in the following way: Let x be a fixed variable subject to selection by the experimenter, and let Y have a normal law with mean $A + Bx$ and standard deviation $\sigma_{Y|x}$. This is called the linear regression model. The usual mathematical theory developed for this model requires that $\sigma_{Y|x}$ actually not depend on x, although slight deviations from constancy are not serious. If a definite dependence on x exists, it can sometimes be removed by an appropriate transformation of Y [see STATISTICAL ANALYSIS, SPECIAL PROBLEMS OF, article on TRANSFORMATIONS OF DATA]. Note that the nonrandom character of x here is stressed by use of a lower-case letter.

The quantities A and B may be estimated, as before, by least squares, and the strength of the resulting relationship may be measured by $r_{xY}$. Distribution theory for $r_{xY}$ is, of course, not the same as in the bivariate normal case, since $r_{xY}$ is only formally the same as $r_{XY}$. In other words, it is important to take into consideration in any given case whether (X, Y) actually has a bivariate distribution or whether X = x behaves as a parameter, an index for possible Y distributions.

Errors in correlation methods. Three common errors in correlation methods have already been mentioned: focusing attention on the data and ignoring the model, concluding that the presence of correlation implies causation, and assuming that no relation between variables is present if correlation is lacking. The literature contains many confused articles resulting from the first type of error and also many illustrations, some humorous, of what could occur if the second type of error were committed, for example, in connection with the high correlation between the number of children and the number of storks' nests in towns of northwestern Europe (Wallis & Roberts 1956, p. 79). The source of the correlation presumably is some factor such as economic status or size of house.
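How a shared factor can produce such a correlation may be sketched with simulated numbers. Everything in the following example is hypothetical: the variable `z` stands in for a common factor such as size of house, `x` and `y` stand in for the two counts, and the unit variances are chosen only so that the population correlation works out to 0.5. No data from the cited study are used.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50_000
z = rng.normal(size=n)        # hypothetical common factor
x = z + rng.normal(size=n)    # hypothetical first variable, driven partly by z
y = z + rng.normal(size=n)    # hypothetical second variable, driven partly by z

# x and y never influence one another, yet they are clearly correlated.
r_xy = np.corrcoef(x, y)[0, 1]


def residual(v, z):
    """Residual of v after least-squares linear prediction from z."""
    slope, intercept = np.polyfit(z, v, 1)
    return v - (intercept + slope * z)


# Once the linear effect of z is removed from both, the correlation vanishes.
r_xy_given_z = np.corrcoef(residual(x, z), residual(y, z))[0, 1]
print(f"r(x, y) = {r_xy:.3f}; after removing z: {r_xy_given_z:.3f}")
```

The correlation between the residuals is the sample partial correlation of the two variables with the common factor held fixed, a quantity taken up under multiple and partial correlation.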
As an artificial but mildly surprising example of the third type of error one should consider the fact that for a standard normal variable X, Y and X are uncorrelated if $Y = X^2$.

A different type of error arises when one tries to control some unwanted condition or source of variation by introducing additional variables. If $U = X/Z$ and $V = Y/Z$, then it is entirely possible that $\rho_{UV}$ will differ greatly from $\rho_{XY}$. For example, $\rho_{XY}$ may be zero but $\rho_{UV}$ very different from zero. The difficulty is clear in this example, but similar difficulties can enter data analysis in insidious ways. Using percentages instead of initial observations can also produce gross misunderstanding. As a very simple example consider $U = X/(X + Y)$
and $V = Y/(X + Y)$ and the fact that $\rho_{UV} = -1$ even if X and Y are independent. Of course, if additional variables, say Z and W, are involved, the magnitude of the correlation between $X/(X + Y + Z + W)$ and $Y/(X + Y + Z + W)$ will not be so great. The adjective usually applied to this type of correlation is "spurious," though "artificially induced" would be better.

A spurious correlation can in certain circumstances be useful; for instance, the idea of so-called part-whole correlation (see McNemar [1949] 1962, chapter 10) deserves consideration in certain situations. If, for example, a test score T is made up of scores on separate questions or subtests, say $T = T_1 + T_2 + \cdots + T_m$, a high correlation $r_{TT_1}$ could not be ascribed wholly to spuriousness. It is altogether possible that $T_1$ would serve as well as T for the purpose at hand.

Multiple and partial correlation

If more than two variables are observed for each individual, say $X_0, X_1, \ldots, X_p$, there are more possibilities to be considered for correlation relationships: simple correlations, $\rho_{ij}$ ($i, j = 0, 1, \ldots, p$, $i \ne j$), multiple correlations between any variable and a set of the others, and partial correlations between any two variables with all or some of the others held fixed. (In this section capital letters for random variables will be omitted in subscripts; only the numerical indexes will be used.)

Multiple correlation. The multiple correlation between $X_0$ and the set $(X_1, \ldots, X_p)$, denoted by $R_{0\cdot 12\cdots p}$, is defined to be the largest simple correlation obtainable between $X_0$ and $a_1X_1 + \cdots + a_pX_p$, where the coefficients, $a_i$, are allowed to vary. It possesses the following properties: $R_{0\cdot 12\cdots p}$ is nonnegative and is at least as large as the absolute value of any simple correlation; if additional variables, $X_{p+1}, X_{p+2}, \ldots$, are included, the multiple correlation cannot decrease. It thus follows that if $R_{0\cdot 12\cdots p} = 0$, all $\rho_{0j}$ are zero. Also, if $R_{0\cdot 12\cdots p} = 1$, then a perfect linear relationship, $X_0 = a_0 + a_1X_1 + \cdots + a_pX_p$, exists for some $a_0, a_1, \ldots, a_p$.

The usual estimator of $R_{0\cdot 12\cdots p}$, based on a random sample of vector observations on $(X_0, X_1, \ldots, X_p)$, is the sample correlation $r_{0\cdot 12\cdots p}$ between $X_0$ and its least squares prediction based on $X_1, \ldots, X_p$. Under the joint normal model, H: $R_{0\cdot 12\cdots p} = 0$ can be tested by referring $[(n - p - 1)/p]\, r_{0\cdot 12\cdots p}^2/(1 - r_{0\cdot 12\cdots p}^2)$ to an F-table with $p$ and $n - p - 1$ degrees of freedom (Fisher 1928). Also, $r_{0\cdot 12\cdots p}$, like $r_{XY}$, is approximately normal for large n with mean $R_{0\cdot 12\cdots p}$ and standard deviation $(1 - R_{0\cdot 12\cdots p}^2)/\sqrt{n}$, provided $R_{0\cdot 12\cdots p} \ne 0$; if $R_{0\cdot 12\cdots p} = 0$, then $n\, r_{0\cdot 12\cdots p}^2$ has approximately a chi-square law with $p$ degrees of freedom. Fisher's z-transformation applies as before, except when $R_{0\cdot 12\cdots p}$ is zero (Hotelling 1953). Note that $R_{0\cdot 12\cdots p}$ and $r_{0\cdot 12\cdots p}$ do not reduce to simple correlations if $p = 1$. Instead, one finds that $R_{0\cdot 1} = |\rho_{01}|$ and $r_{0\cdot 1} = |r_{01}|$.

Regression relationships, in which $X_0$ is predicted by $X_1, \ldots, X_p$, are analogous to those for simple correlation; for example, when regression is linear and conditional variances are constant, $R_{0\cdot 12\cdots p}^2$ is the portion of the variance of $X_0$ "explained" by regression on $X_1, \ldots, X_p$.
Alternatively, if the joint probability law is normal, $\rho_{01\cdot 2}$ can be defined as the simple correlation between $X_0$ and $X_1$, calculated from the conditional law for $X_0$ and $X_1$ given $X_2$, but this is not true in general. Also, since $\rho_{01\cdot 2}$ is the ordinary correlation between the residuals defined above, $\rho_{01\cdot 2}^2$ may be characterized in terms of the unexplained variance in one residual after linear prediction from the other, namely $1 - (\sigma_{0\cdot 12}^2/\sigma_{0\cdot 2}^2)$.

To see an important relation between multiple and partial correlation, think of the variables $X_1, X_2, \ldots, X_p$ as being introduced one at a time and producing increases in multiple correlation with $X_0$. Then

$$1 - R_{0\cdot 12\cdots p}^2 = (1 - \rho_{01}^2)(1 - \rho_{02\cdot 1}^2) \cdots (1 - \rho_{0p\cdot 12\cdots p-1}^2).$$

From this it follows that

$$1 - R_{0\cdot 12\cdots p}^2 = (1 - R_{0\cdot 12\cdots p-1}^2)(1 - \rho_{0p\cdot 12\cdots p-1}^2),$$

which yields a recursion relation that allows for the correction of a multiple correlation when a variable is added or subtracted.

Elaborate and useful computational schemes are available for adding and subtracting variables in correlation analysis. One viewpoint (see Ezekiel and Fox [1930] 1959, appendix 2) is that one should generally start with the largest feasible number of independent variables and then subtract one at a time those that are negligibly useful in predicting $X_0$. Other approaches begin with the best single predictor among $X_1, X_2, \ldots, X_p$ and then add others one at a time until further additions make no substantial improvement.

Many expressions and statements analogous to the above relationships can of course be obtained by rearrangement of subscripts, including those which employ only some of the $p + 1$ variables. Since all parameters involved in this discussion are actually only simple correlations between appropriate pairs of random variables, one can construct estimators by calculating the corresponding sample simple correlations. Thus, for example, $r_{01\cdot 2}$ is calculable from the observation pairs, $(X_{0i} - A_0 - A_2X_{2i},\; X_{1i} - A'_0 - A'_2X_{2i})$, of sample residuals.

Finally, it has been shown (Fisher 1928) that if the multivariate normal model is assumed, many results for $r_{01\cdot 23\cdots p}$ can be obtained from those for $r_{01}$ by replacing $n - 2$ by $n - p - 1$. For example, $(n - p - 1)^{1/2}\, r_{01\cdot 23\cdots p}/(1 - r_{01\cdot 23\cdots p}^2)^{1/2}$ can be referred to the t-table with $n - p - 1$ degrees of freedom as a test of H: $\rho_{01\cdot 23\cdots p} = 0$.

An example. As an example of the applications of multiple and partial correlation, consider an experiment in which $X_0$ represents grade point average, $X_1$ represents IQ, $X_2$ represents hours of study
per week, and the relationship is sought between $X_0$ and $X_1$, with $X_2$ held fixed (Keeping 1962, p. 363). Results based on a sample of 450 school children showed that $r_{0\cdot 12} = 0.82$, $r_{01} = 0.60$, $r_{02} = 0.32$, $r_{12} = -0.35$, and $r_{01\cdot 2} = 0.80$. The positive correlation between $X_0$ and $X_2$, together with the negative correlation between $X_1$ and $X_2$ (a more intelligent student need not study so long), obscured somewhat the strength of the relationship between $X_0$ and $X_1$. It should perhaps be mentioned that from the relation $1 - r_{0\cdot 12}^2 = (1 - r_{02}^2)(1 - r_{01\cdot 2}^2)$ it is clear that $r_{0\cdot 12} \ge r_{01\cdot 2}$, with equality if and only if $r_{02} = 0$. It is true in general, for parameters or sample estimators, that a multiple correlation between a given variable and others is at least as large in magnitude as any simple or partial correlation between that variable and any of the others.

Reduction of the number of variables

Yule and Kendall (1958, chapter 13) offer practical advice of an elementary nature in relation to economy in the number of variables to be considered. In this connection one thing is more or less certain: if the number of variables is, say, greater than ten, an attempt to analyze the interrelations between variables by using their whole correlation matrix offers too many possibilities for the mind to encompass, or for methods to isolate, and is therefore probably a waste of time.

There are less elementary techniques for dealing with problems involving large sets of variables, which have been treated in depth and are worthy of wide application. These include canonical correlation, principal components, and factor analysis.

Canonical correlation. There are cases in which an experimenter wishes to study the interrelations between two sets of variables, $(Y_1, \ldots, Y_k)$ and $(X_1, \ldots, X_p)$. The purpose of canonical correlation theory (Hotelling 1936) is to replace these sets by new (and smaller) sets, at the same time preserving the correlation structure as much as possible.

The method is as follows: Linear combinations, one from each set of variables, are so constructed as to have maximum simple correlation with each other. These linear combinations, denoted by $U_1$ and $V_1$, are called the first pair of canonical variables; their correlation, $\rho_1$, is the first canonical correlation. The process is continued by the construction of further pairs of linear combinations, with the provision that each new canonical variable be uncorrelated with all previous ones. If $k \le p$, the process will terminate with $U_1, U_2, \ldots, U_k$, $V_1, V_2, \ldots, V_k$ and canonical correlations $\rho_1, \rho_2, \ldots, \rho_k$. If $k = 1$, the resulting single
canonical correlation is the multiple correlation for $Y_1$ on $X_1, X_2, \ldots, X_p$. Since $\rho_1 \ge \rho_2 \ge \cdots \ge \rho_k$ and since many canonical correlations may be small, it is clear that the canonical pairs worth preserving may be few.

The usual model specifies a joint normal law in $p + k$ variables, and estimation of canonical correlations can be carried out with a sample by a scheme which parallels that for the construction of $\rho_1, \ldots, \rho_k$. The joint probability law for sample canonical correlations is known both in exact form and in approximate form for large n.

Before canonical correlations are estimated, it may be wise to carry out an initial test for possible complete lack of correlation between the two sets of variables. The hypothesis that $\rho_1 = \rho_2 = \cdots = \rho_k = 0$, or, equivalently, that all correlations between an $X_i$ and a $Y_j$ are zero, may be tested essentially by a procedure of Wilks (see Tate 1966). The hypothesis being tested can be rewritten as $1 - (1 - \rho_1^2)(1 - \rho_2^2) \cdots (1 - \rho_k^2) = 0$, a form analogous to that of other, related tests. There are various tests available for this hypothesis; one should try to choose the one with highest power against the alternative hypotheses of interest. (See Anderson 1958, sec. 14.2; Hotelling 1936.)

Principal components and factor analysis. One of the central problems arising in the application of correlations is that of holding the variables considered down to a manageable number. This was mentioned above in connection with canonical correlation and is also the guiding principle underlying principal components analysis [for a discussion of principal components, see Hotelling 1933 and FACTOR ANALYSIS, article on STATISTICAL ASPECTS]. There one deals with a single set of variables, forming linear compounds that are uncorrelated with one another and arranged in order of decreasing variance. The basis of principal components analysis is the assumption that the more interesting observable quantities are those with larger variation.
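The construction of uncorrelated linear compounds in order of decreasing variance can be sketched as follows. This is a minimal illustration under stated assumptions rather than a method prescribed by the article: it takes the compounds from an eigendecomposition of the sample covariance matrix, and the helper `principal_components` and the simulated three-variable data set are hypothetical.

```python
import numpy as np


def principal_components(data):
    """Return (variances, loadings, scores) from the sample covariance matrix."""
    centered = data - data.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    variances, loadings = np.linalg.eigh(cov)
    order = np.argsort(variances)[::-1]      # rearrange to decreasing variance
    variances, loadings = variances[order], loadings[:, order]
    scores = centered @ loadings             # the linear compounds
    return variances, loadings, scores


# Simulated (hypothetical) data: three variables sharing one latent factor.
rng = np.random.default_rng(2)
latent = rng.normal(size=(2_000, 1))
data = latent @ np.array([[1.0, 0.8, 0.6]]) + 0.5 * rng.normal(size=(2_000, 3))

variances, loadings, scores = principal_components(data)
score_cov = np.cov(scores, rowvar=False)
print("component variances:", np.round(variances, 3))
print("share of total variance:", np.round(variances / variances.sum(), 3))
```

The covariance matrix of the compounds, `score_cov`, is diagonal up to rounding error, which is the sense in which they are uncorrelated. In data of this kind the first compound carries most of the total variance, so retaining it alone would be one way of holding the number of variables down.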
Factor analysis, which is of vast importance in psychological testing, utilizes a similar idea, except that the number of linear compounds to be considered is prescribed by the model. Connections between these two methods are discussed in a monograph by Kendall (1957).

Other methods of correlation

Intraclass correlation. In the discussion of sampling from an (X, Y)-population and the consequent use of the sample to estimate $\rho_{XY}$, there has been no question as to the separate identification of the X and Y for each observation. Thus, one can think of $\rho_{XY}$ as a measure of the interrelation between two classes, an X-class and a Y-class, and hence the term interclass correlation may be used.

As an example of a situation in which the identification of X and Y is not clear, consider measuring the correlation between the weights of identical twins at, say, age five. Here there is in effect only one class, that of pairs of weights of twins. Any establishment of two classes (for example, by considering X the weight of the taller twin and Y the weight of the shorter twin) would be wholly arbitrary and not helpful. The population of weight pairs has a correlation coefficient, and this gives the intraclass correlation, the correlation coefficient between the two weights of a pair in random order.

The method for handling this situation works as well with data involving triplets (one is still, however, interested in correlation for weights in the same family) or any number of children. Consider n observations (families) on k-tuplets, with $k \ge 2$. The method consists essentially in the averaging of products of deviations over all possible $k(k - 1)$ pairs of children. If $X_{ij}$ represents the weight of the jth child in the ith family, then the intraclass correlation, r, is given by $nk(k-1)s^2 r = nk^2 s_m^2 - nks^2$, with $s^2 = \sum\sum (X_{ij} - \bar{X})^2/nk$, the within-families sample variance, and $s_m^2 = \sum (\bar{X}_i - \bar{X})^2/n$, the between-families sample variance. Thus,

$$r = \frac{1}{k-1}\left(\frac{k s_m^2}{s^2} - 1\right).$$

It is clear that $r \ge -1/(k - 1)$ and that for a single family ($n = 1$), $r = -1/(k - 1)$. Intraclass correlation is closely related to components of variance models in the analysis of variance [see LINEAR HYPOTHESES, article on ANALYSIS OF VARIANCE].

Attenuation. Observations on random variables are frequently subject to measurement errors or, at any rate, are observable only in combination with other random variables, so that in attempting to observe U, V one must instead accept $X = U + E$, $Y = V + F$.
Previous methods lead to information about $\rho_{XY}$, when what is relevant is information about $\rho_{UV}$. If E and F are assumed to be uncorrelated with U, V, and each other, then the relation between $\rho_{UV}$ and $\rho_{XY}$ is given by

$$\rho_{XY} = \frac{\operatorname{cov}(U + E,\, V + F)}{\sigma_X\sigma_Y} = \frac{\rho_{UV}}{\left[(1 + \sigma_E^2/\sigma_U^2)(1 + \sigma_F^2/\sigma_V^2)\right]^{1/2}},$$

which shows that $\rho_{XY} \le \rho_{UV}$, with equality occurring only in the trivial case in which E, F are both constant. The coefficient $\rho_{UV}$ is said to be attenuated by the effect of E and F. Correction for attenuation consists in applying to the above relation known or assumed information relative to $\rho_{XY}$, $(\sigma_E/\sigma_U)$, $(\sigma_F/\sigma_V)$ in order to estimate $\rho_{UV}$. (For further discussion, see McNemar 1949.)

R. F. TATE

BIBLIOGRAPHY
ANDERSON, RICHARD L.; and BANCROFT, T. A. 1952 Statistical Theory in Research. New York: McGraw-Hill.
ANDERSON, T. W. 1958 An Introduction to Multivariate Statistical Analysis. New York: Wiley.
BINDER, ARNOLD 1959 Considerations of the Place of Assumptions in Correlational Analysis. American Psychologist 14:504-510.
CROXTON, F. E.; COWDEN, D. J.; and KLEIN, S. (1939) 1967 Applied General Statistics. 3d ed. Englewood Cliffs, N.J.: Prentice-Hall. → Klein became a co-author with the third edition.
DAVID, F. N. 1938 Tables of the Ordinates and Probability Integral of the Distribution of the Correlation Coefficient in Small Samples. London: University College, Biometrika Office.
DIXON, WILFRID J.; and MASSEY, FRANK J. JR. (1951) 1957 Introduction to Statistical Analysis. 2d ed. New York: McGraw-Hill.
EZEKIEL, MORDECAI; and FOX, KARL A. (1930) 1959 Methods of Correlation and Regression Analysis: Linear and Curvilinear. 3d ed. New York: Wiley.
FISHER, R. A. 1915 Frequency Distribution of the Values of the Correlation Coefficient in Samples From an Indefinitely Large Population. Biometrika 10:507-521.
FISHER, R. A. (1925) 1958 Statistical Methods for Research Workers. 13th ed. New York: Hafner. → Previous editions were published by Oliver & Boyd.
FISHER, R. A. 1928 On a Distribution Yielding the Error Functions of Several Well Known Statistics. Volume 2, pages 805-813 in International Congress of Mathematicians (New Series), Second, Toronto, 1924, Proceedings. Univ. of Toronto Press.
GAYEN, A. K. 1951 The Frequency Distribution of the Product-moment Correlation Coefficient in Random Samples of Any Size Drawn From Non-normal Universes. Biometrika 38:219-247.
HANNAN, J. F.; and TATE, R. F. 1965 Estimation of the Parameters for a Multivariate Normal Distribution When One Variable Is Dichotomized. Biometrika 52:664-668.
HOTELLING, HAROLD 1933 Analysis of a Complex of Statistical Variables Into Principal Components. Journal of Educational Psychology 24:417-441, 498-520.
HOTELLING, HAROLD 1936 Relations Between Two Sets of Variates. Biometrika 28:321-377.
HOTELLING, HAROLD 1953 New Light on the Correlation Coefficient and Its Transforms. Journal of the Royal Statistical Society Series B 15:193-225.
JOHNSON, PALMER O. 1949 Statistical Methods in Research. New York: Prentice-Hall.
KEEPING, E. S. 1962 Introduction to Statistical Inference. Princeton, N.J.: Van Nostrand.
KENDALL, M. G. (1957) 1961 A Course in Multivariate Analysis. London: Griffin.
KENDALL, M. G.; and STUART, ALAN 1958-1966 The Advanced Theory of Statistics. New ed. 3 vols. New York: Hafner; London: Griffin. → Volume 1: Distribution Theory, 1958. Volume 2: Inference and Relationship, 1961. Volume 3: Design and Analysis, and Time Series, 1966. The first edition, published in 1943-1946, was written by Kendall alone.
KRUSKAL, WILLIAM H. 1958 Ordinal Measures of Association. Journal of the American Statistical Association 53:814-861.
McNEMAR, QUINN (1949) 1962 Psychological Statistics. 3d ed. New York: Wiley.
OLKIN, INGRAM; and PRATT, JOHN W. 1958 Unbiased Estimation of Certain Correlation Coefficients. Annals of Mathematical Statistics 29:201-211.
PRINCE, BENJAMIN M.; and TATE, ROBERT F. 1966 The Accuracy of Maximum Likelihood Estimates of Correlation for a Biserial Model. Psychometrika 31:85-
TATE, R. F. 1955a The Theory of Correlation Between Two Continuous Variables When One Is Dichotomized. Biometrika 42:205-216.
TATE, R. F. 1955b Applications of Correlation Models for Biserial Data. Journal of the American Statistical Association 50:1078-1095.
TATE, R. F. 1966 Conditional-normal Regression Models. Journal of the American Statistical Association 61:477-489.
WALLIS, W. ALLEN; and ROBERTS, HARRY V. 1956 Statistics: A New Approach. Glencoe, Ill.: Free Press. → A revised and abridged paperback edition of the first section was published in 1962 by Collier.
YULE, G. UDNY; and KENDALL, M. G. 1958 An Introduction to the Theory of Statistics. 14th ed., rev. & enl. London: Griffin. → The first edition was published in 1911 with Yule as sole author. Kendall has been a joint author since the eleventh edition (1937), and the 1958 edition was revised by him. A later printing contains new material.

III
CORRELATION (2)
Correlation, in a broad sense, is any probabilistic relationship between random variables (or sets of random variables) other than stochastic independence. Two random variables are said to be independent when the conditional distribution of one, given the other, does not depend on the given value. Viewed another way, independence means that the probability that both random variables are simultaneously in some given intervals is simply the product of the separate interval probabilities. Whenever independence does not hold, the two random variables are dependent, or correlated. (Terminology is not wholly standard, for the word "correlated" is sometimes used to refer to special kinds of dependence only.) [See PROBABILITY, article on FORMAL PROBABILITY.]
Two sets of random variables (that is, two random vectors) are independent when the conditional distribution of one set, given the other, does not depend on the given values.

The idea of a numerical measure of association between two random variables seems to have originated with Francis Galton, in the last part of the nineteenth century [see GALTON]. From crude beginnings at his hands, the concept passed into those of F. Y. Edgeworth and particularly into those of Karl Pearson, whose academic training had been
MULTIVARIATE ANALYSIS: Correlation (2)
in mathematical physics but who caught Galton's enthusiasm and devoted the rest of his life to statistics [see EDGEWORTH; PEARSON]. From them there came the definition and exploration of the important correlation coefficient,

$$r = \frac{(X_1-\bar X)(Y_1-\bar Y) + (X_2-\bar X)(Y_2-\bar Y) + \cdots + (X_N-\bar X)(Y_N-\bar Y)}{[(X_1-\bar X)^2 + (X_2-\bar X)^2 + \cdots + (X_N-\bar X)^2]^{1/2}\,[(Y_1-\bar Y)^2 + (Y_2-\bar Y)^2 + \cdots + (Y_N-\bar Y)^2]^{1/2}}$$

in sample form. Here $(X_1, Y_1), \ldots, (X_N, Y_N)$ are the members of an $N$-fold bivariate sample, and $\bar X$ and $\bar Y$ are the corresponding sample averages. Of course $r$ may be written more compactly as

$$r = \frac{\sum (X_i-\bar X)(Y_i-\bar Y)}{[\sum (X_i-\bar X)^2 \sum (Y_i-\bar Y)^2]^{1/2}}$$

or still more compactly as

$$r = \frac{\sum x_i y_i}{[\sum x_i^2 \sum y_i^2]^{1/2}},$$

where $x_i = X_i - \bar X$ and $y_i = Y_i - \bar Y$ are the residuals, or deviations from the sample averages. Another way of expressing $r$ is obtained by dividing the numerator and the denominator by $N - 1$:

$$r = \frac{s_{XY}}{(s_{XX}\, s_{YY})^{1/2}},$$

where $s_{XX}$ and $s_{YY}$ are the conventional modes of expressing sample variance and $s_{XY}$ that of sample covariance.

The population, or underlying, correlation coefficient between random variables $X$ and $Y$ is

$$\rho = \frac{\operatorname{cov}(X, Y)}{(\operatorname{var} X \operatorname{var} Y)^{1/2}},$$

where $\operatorname{var} X = E(X - EX)^2$, $\operatorname{var} Y = E(Y - EY)^2$, and $\operatorname{cov}(X, Y) = E[(X - EX)(Y - EY)]$. When the sample of $(X_i, Y_i)$ is random, $r$ is the usual estimator of $\rho$.

Instead of centering the quantities entering into the expressions for $r$ and $\rho$ on $(\bar X, \bar Y)$ and $(EX, EY)$, respectively, estimated or true conditional expectations, given other variables, may be used. Then the correlation coefficients are called partial correlation coefficients.

The adjectives "Pearsonian" and "product-moment" are sometimes used in naming the correlation coefficient. Although Pearson was the first to study it with care, later workers, especially R. A. Fisher, pushed both the theoretical study and the applications of correlation much further [see FISHER, R. A.].
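The equivalent expressions for $r$ given above can be checked numerically. The following is a minimal sketch in Python, using only the standard library; the function names and the five-point sample are hypothetical and serve only as an illustration. The first function works from sums of products of deviations, the second from the sample covariance and variances with divisor $N - 1$.

```python
import math


def pearson_r(xs, ys):
    """Sample correlation from sums of products of deviations about the means."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length N >= 2")
    n_obs = len(xs)
    x_bar = sum(xs) / n_obs
    y_bar = sum(ys) / n_obs
    dx = [x - x_bar for x in xs]  # deviations x_i = X_i - mean(X)
    dy = [y - y_bar for y in ys]  # deviations y_i = Y_i - mean(Y)
    sum_xy = sum(a * b for a, b in zip(dx, dy))
    sum_xx = sum(a * a for a in dx)
    sum_yy = sum(b * b for b in dy)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def pearson_r_from_moments(xs, ys):
    """The same r written as s_XY / sqrt(s_XX * s_YY), each with divisor N - 1."""
    n_obs = len(xs)
    x_bar = sum(xs) / n_obs
    y_bar = sum(ys) / n_obs
    s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n_obs - 1)
    s_xx = sum((x - x_bar) ** 2 for x in xs) / (n_obs - 1)
    s_yy = sum((y - y_bar) ** 2 for y in ys) / (n_obs - 1)
    return s_xy / math.sqrt(s_xx * s_yy)


# Hypothetical sample, used only for illustration.
X = [1.0, 2.0, 3.0, 4.0, 5.0]
Y = [2.0, 1.0, 4.0, 3.0, 5.0]
print(pearson_r(X, Y), pearson_r_from_moments(X, Y))
```

Both functions divide by zero when one of the samples is constant, which corresponds to the case in which $r$ is undefined.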
Applications of correlation

The initial application of correlation was to genetics, although that science remained at a rudimentary stage in England and other Western countries until the early 1900s, when the basic principles published in 1866 by the Austrian monk Gregor Mendel were rediscovered. Subsequent genetic research revealed specific correlations that result from various degrees of relationship, from the extent of random mating, and from other conditions. A substantial compendium of this correlational theory of genetics was published by Fisher (1918).

These specific correlations made it important to compare certain hypotheses suggested by theoretical considerations about the value of a correlation coefficient with the observed results. For example, a theoretical correlation of $\tfrac{1}{2}$ between stature of father and stature of son is suggested by a hypothesis of random mating; this correlation may, however, be obscured by the fluctuations of random sampling. Because of such problems the probability distribution of $r$ in samples from a basic distribution with correlation $\rho$ became an object of mathematical inquiry, of which an account will be given below. Some of the mathematical problems of great complexity are still only partly solved.

Analysis of human abilities. Even before the rediscovery of Mendelian genetics, psychologists became interested in correlation with a view to detecting and analyzing variations in human abilities. A pioneer work was Charles E. Spearman's paper of 1904, which was later revised and expanded into his book The Abilities of Man (1927), leading to the theory that each of the various human abilities tested is the sum of a greater or less quota of "general intelligence" and another independent fraction of an ability special to the particular thing tested [see INTELLIGENCE AND INTELLIGENCE TESTING; the biography of SPEARMAN].
These special abilities were initially thought of as being independent of general intelligence and of each other [see FACTOR ANALYSIS]. If $\rho_{ij}$ denotes the true, or population, correlation between the $i$th and $j$th test scores, the original Spearman theory holds, as a consequence of the assumptions, that for all different subscripts $i, j, k, l$,

$$\rho_{ij}\rho_{kl} - \rho_{ik}\rho_{jl} = 0.$$
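A brief sketch of why a condition of this kind follows from the theory may be helpful. It rests on one common formalization, which is an assumption here and not the article's own notation: each standardized test score $X_i$ is written as a general part $a_i G$ plus an independent specific part, where the loadings $a_i$ and the variables $G$ and $S_i$ are hypothetical symbols introduced only for this illustration.

```latex
% Assumed model: G, S_1, S_2, ... mutually uncorrelated, each with unit variance.
X_i = a_i G + \sqrt{1 - a_i^2}\, S_i , \qquad |a_i| \le 1 .

% Only the general part is shared, so for i \ne j
\rho_{ij} = \operatorname{cov}(X_i, X_j) = a_i a_j .

% Hence, for four different subscripts,
\rho_{ij}\rho_{kl} - \rho_{ik}\rho_{jl}
  = (a_i a_j)(a_k a_l) - (a_i a_k)(a_j a_l) = 0 .
```

Under the same assumption every other pairing of the four subscripts gives the same product $a_i a_j a_k a_l$, so each difference of two such products vanishes as well.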
The population correlations, $\rho_{ij}$, cannot, however, be derived from theory but must be estimated by the sample correlations, $r_{ij}$, obtained from actual test scores. After the problem was recognized, there ensued a long period of wrestling with the difficult mathematics and logic of this problem and of attempting to reformulate the early theory to apply with greater generality to situations involving group factors and other elaborations. Greatly enlarged testing programs supplied vast amounts of data. In the 1920s and 1930s new views of the problem were introduced: in numerous articles in journals, in the work of L. L. Thurstone, in a book by Truman L. Kelley (1928), and in a work of Karl Holzinger and Harry Harman (1941) that was the culmination of work done and papers published during the 1930s [see KELLEY; THURSTONE]. One of those who introduced new ideas and methods was Spearman himself, when he became convinced that his original formulation was inadequate (1927).

Rank-order correlation. Spearman introduced a correlation coefficient for ranked observations that avoids any assumption, either of normality or of any other particular form of distribution [see NONPARAMETRIC STATISTICS, article on RANKING METHODS]. It has been used extensively by statisticians unwilling to make assumptions of particular forms for their data. An exact standard error for the Spearman coefficient was published by Hotelling and Pabst in 1936. Maurice G. Kendall provided another rank correlation coefficient in Biometrika in 1938 and reviewed the subject at length in chapter 16 of his Advanced Theory of Statistics (1943-1946, vol. 1). See also Kendall's monograph on ranking methods (1948).

The correlation ratio. The correlation ratio was originally introduced to deal with nonlinear regression when the data are grouped. It has a strong formal similarity with analysis of variance in the one-way layout. Its theory is treated by Hotelling (1925) and Wishart (1932).
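The two coefficients just described can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a reproduction of any formula printed in the article: the Spearman coefficient is taken as the product-moment correlation of the ranks (with the usual shortcut $1 - 6\sum d^2 / [N(N^2 - 1)]$ when there are no ties), and the squared correlation ratio is taken in its usual form as the between-group sum of squares divided by the total sum of squares. All function names and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation from deviations about the sample means."""
    n_obs = len(xs)
    x_bar = sum(xs) / n_obs
    y_bar = sum(ys) / n_obs
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    sum_yy = sum((y - y_bar) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def ranks(values):
    """Ranks 1..N in ascending order; this sketch assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0] * len(values)
    for rank, idx in enumerate(order, start=1):
        out[idx] = rank
    return out


def spearman_rho(xs, ys):
    """Spearman coefficient as the product-moment correlation of the ranks."""
    return pearson_r(ranks(xs), ranks(ys))


def spearman_rho_from_differences(xs, ys):
    """No-ties shortcut based on the squared rank differences d_i."""
    n_obs = len(xs)
    d2 = sum((a - b) ** 2 for a, b in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d2 / (n_obs * (n_obs ** 2 - 1))


def correlation_ratio_squared(groups):
    """Between-group sum of squares over total sum of squares (one-way layout)."""
    all_y = [y for g in groups for y in g]
    grand = sum(all_y) / len(all_y)
    total = sum((y - grand) ** 2 for y in all_y)
    between = sum(len(g) * (sum(g) / len(g) - grand) ** 2 for g in groups)
    return between / total


# Hypothetical data: y rises, then falls, across three x-groups.
curved = [[1.0, 1.0], [4.0, 4.0], [1.0, 1.0]]
print(spearman_rho([10, 20, 30, 40, 50], [1, 3, 2, 5, 4]))
print(correlation_ratio_squared(curved))
```

In the grouped example the linear correlation between group index and $y$ is zero while the correlation ratio is at its maximum, which is the kind of nonlinear regression the ratio was introduced to handle.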
Effect of deviations from assumptions. The correlation coefficient, $r$, is sensitive to deviations from the usual basic assumptions of normality, independence of observations, and uniform variance among the observations. Extreme deviations in variance, particularly in the form of large deviations of both $X$ and $Y$ in the same term, may cause an exaggeration of $r$ above $\rho$. Effects of nonnormality may be serious and will be discussed later. The effects are generally ignored in the literature.

Lack of independence, another sort of deviation from assumptions, particularly between different observations on the same variate, has been felt to be so serious a menace as to impair deeply the reliability of many correlation coefficients, especially for economic time series. Partial correlation, equivalent to removal of a set of variables that are considered extraneous from both $X$ and $Y$ by least squares, is a useful method. A special case of it is the elimination of trends (best done by least squares), which may be combined with the elimination of seasonal variation, for which special methods have been devised. Caution is needed in such enterprises to obtain "models" that are truly reasonable and do not involve removing too much with the trend, throwing out the baby with the bath water. But the penalty for such a sin is often very light, usually being limited to a reduction in the number of degrees of freedom, whereas a failure to remove significant components of trend, such as secular and seasonal components, may grossly exaggerate the correlation.

Autocorrelation and serial correlation. Autocorrelation, in which each observation on $X$ is matched with another observation on $X$, where there is a fixed time interval between the two observations, may be measured by the same formula or by slight variations of it. Lag correlation is given by the usual formula with a fixed time interval between each $X$ and the corresponding $Y$. In both these situations the distribution is different from that of $r$ based on a random sample. The choice of suitable types of autocorrelations and serial correlations should be made with a view to what is known or believed about the interrelations of the actual observations. Since these interrelations are seldom known exactly, the choice of a particular statistic can often be made so as to relate it suitably both to the true matrix of correlation and to manageable forms for its own distribution.
(For methods useful in finding some such distributions, see papers by Tjalling C. Koopmans 1942 and R. L. Anderson 1942.)

Other applications. Correlation enters biometrics in many places other than genetics. Areas in which it has been widely used are quality control and quantitative anthropology [see PHYSICAL ANTHROPOLOGY; QUALITY CONTROL, STATISTICAL].
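The remarks above on trend elimination and on lag correlation can be illustrated with a small sketch. The series below are invented for the purpose: two series share nothing but a linear trend in time, so their raw correlation is high, while the correlation of the least-squares residuals about the trend is zero. The function names are hypothetical, and a straight line in time is only the simplest choice of trend.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation from deviations about the sample means."""
    n_obs = len(xs)
    x_bar = sum(xs) / n_obs
    y_bar = sum(ys) / n_obs
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    sum_yy = sum((y - y_bar) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def lag_correlation(xs, ys, lag):
    """Correlate each x_t with y_(t+lag); with ys = xs this is an autocorrelation."""
    if lag < 1 or lag >= len(xs) - 1:
        raise ValueError("lag must leave at least two matched pairs")
    return pearson_r(xs[:-lag], ys[lag:])


def detrend(series):
    """Residuals from a least-squares straight line in time t = 0, 1, 2, ..."""
    n_obs = len(series)
    t_bar = (n_obs - 1) / 2
    s_bar = sum(series) / n_obs
    s_tt = sum((t - t_bar) ** 2 for t in range(n_obs))
    s_ts = sum((t - t_bar) * (s - s_bar) for t, s in enumerate(series))
    slope = s_ts / s_tt
    return [(s - s_bar) - slope * (t - t_bar) for t, s in enumerate(series)]


# Invented fluctuations that are uncorrelated with each other.
E1 = [1, -1, 1, -1, 1, -1, 1, -1, 1, -1]
E2 = [1, 1, -1, -1, 1, 1, -1, -1, 1, 1]
X = [t + e for t, e in enumerate(E1)]       # trend of slope 1 plus E1
Y = [2 * t + e for t, e in enumerate(E2)]   # trend of slope 2 plus E2

raw = pearson_r(X, Y)                          # inflated by the common trend
adjusted = pearson_r(detrend(X), detrend(Y))   # trend removed from both series
print(raw, adjusted, lag_correlation(E1, E1, 1))
```

Removing the trend costs one further degree of freedom for each series, which is the light penalty mentioned in the text; the sampling distribution of the lag correlation is not that of $r$ from a random sample and is not addressed by this sketch.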
The precision of r

The formula for $r$ is the same as that used in solid analytic geometry for the cosine of the angle
between two lines through the origin, one to each of the points with coordinates $(x_1, \ldots, x_N)$ and $(y_1, \ldots, y_N)$, except that the formula given in the textbooks is usually confined to three dimensions. Since it is a cosine, $r$ cannot exceed $1$ or be less than $-1$, but when the variates are distributed under reasonable assumptions of continuity, $r$ can take either of these extreme values. If (and only if) $r = \pm 1$, the $Y$'s of the sample are linearly related to their corresponding $X$'s, with the linear function increasing if $r = 1$ and decreasing if $r = -1$.

In order to make substantial use of $r$, it is necessary to have at least an approximation to its probability distribution, which will involve both the true value and the sample size. The probability distribution was first deduced for random samples with $\rho \neq 0$ from the bivariate normal population by Fisher (1915), but the results, although correct, were very difficult to use until simplifying transformations could be found.

One simplification, which in the end proved too drastic, is to use the standard error of $r$, a function of $\rho$ and $N$, and to treat $r$ as normally distributed about $\rho$. This had been done by Karl Pearson and L. N. G. Filon (1898). An earlier version contained an error, whose cause it is instructive to examine: the two sample standard deviations in the denominator of $r$ were regarded as fixed, or the same in all samples; this introduced into the denominator of the standard error of $r$ an extraneous factor, $(1 + r^2)^{1/2}$. The error was corrected in the 1898 paper, which provided the equivalent of the formula $\sigma_r = (1 - r^2)\,n^{-1/2}$, where $n = N - 1$, the number of so-called degrees of freedom in this case. ($N$ could be used instead of $n$ in the above expression, but it is useful and conventional to use the degrees of freedom.)

The above expression appeared in textbooks for several decades, puzzling students by the obvious absurdity that the standard error of $r$ appears as a function of $r$ itself.
Of course the meaning of the above formula is that the asymptotic (or large-sample) standard error of $r$ is $(1 - \rho^2)\,n^{-1/2}$, which is estimated by substituting $r$ for $\rho$ in the expression. The notation of the period was one in which parameters and their estimators were often denoted by the same symbol, a pernicious practice that sometimes misled even those statisticians who presumably used it only as a convenient shorthand. The need for a notational distinction between the two concepts of parameter and estimator was not well understood, even by mathematical statisticians, until after the publication of Fisher's paper of 1915.
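A short worked instance may make the substitution concrete. The numbers are hypothetical, chosen only to keep the arithmetic simple: a sample of $N = 26$ pairs giving $r = 0.6$.

```latex
% Hypothetical values: N = 26, so n = N - 1 = 25 degrees of freedom; r = 0.6.
\hat\sigma_r = (1 - r^2)\, n^{-1/2}
             = \frac{1 - 0.36}{\sqrt{25}}
             = \frac{0.64}{5}
             = 0.128 .
```

The value 0.128 is an estimate of the large-sample standard error $(1 - \rho^2)\,n^{-1/2}$, obtained by putting $r$ in place of the unknown $\rho$; it is not an exact standard error for a sample of this size.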
The development of mathematical theory

The first publication of an exact distribution of a correlation coefficient seems to have been by William S. Gosset (1908), a chemist publishing under the name "Student" because of his employers' opposition to publication [see GOSSET]. The data were supposed to represent a random sample from a bivariate normal population with correlation $\rho = 0$. Fisher's 1915 paper supplied for the first time an exact distribution of $r$ with $\rho \neq 0$. This paper has led to others by various authors, and will stand as a great triumph.

The matter had been on Pearson's mind, and after the publication of Fisher's paper he mobilized the resources of his entire Biometric Laboratory in London to improve the results. In what has come to be referred to as the Cooperative Study (Soper et al. 1917), Pearson, with four collaborators, began with a series expression for the distribution which is remarkable in that although it converges, it does so with extreme slowness. When multiplied by an appropriate factor, however, and integrated to get the moments, the new series converges with great rapidity, especially for large samples. The Cooperative Study also effected other mathematical improvements and provided handsome plates showing the frequency function as a surface with horizontal coordinates $r$ and $\rho$, with drawings and tables.

But then came a fateful step. Difficulties about the foundations of statistical inference were coming more clearly into view, partly as a result of all the work on $r$. It seemed only natural to Pearson to invoke Bayes' theorem of inverse probability to provide a solution of these unsolved problems. The Cooperative Study has a section on the application of the results, with a priori probabilities provided by Pearson's experience and judgment and with far-reaching inferences from hypothetical samples.
Fisher had already taken a stand against Bayesian inference and wrote a rebuttal to the inverse probability argument of the Cooperative Study. However, because of Pearson's opposition, Fisher, still a young man and comparatively unknown, was unable to publish his paper in England. It finally appeared in 1921 in Corrado Gini's new journal Metron, published in Rome. In the 1921 volume and in that of 1924, besides pointing out the absurdities arising from application of inverse probability by Pearson's methods to certain data, Fisher made an important constructive contribution regarding the application of the same distribution to partial correlations with a reduction in the number
of degrees of freedom equal to the number of variates eliminated.

Florence N. David, a member of the Pearson group at University College, London, computed a very fine table (1938) of the correlation distribution in random samples from a normal distribution, using as a principal method the numerical solution of difference equations. It far exceeded in scope and accuracy the short tables previously published in Fisher's initial paper (1915) and in the Cooperative Study (Soper et al. 1917). She used as a principal computational tool the two second-order difference equations previously discovered, which she adapted.

The appropriate formula for the variance of the correlation coefficient, $\sigma_r^2 = (1 - \rho^2)^2/n$, equivalent to the 1898 result of Pearson and Filon, is only the first term of an infinite series of powers of $n^{-1}$ with coefficients involving increasing powers of $\rho$. Additional terms may be computed by various methods, for example, by the rapidly convergent series for the moments of $r$ used in the Cooperative Study (Soper et al. 1917) or by Hotelling (1953, p. 212).

All these approximations to the variance of $r$, however, require a knowledge of $\rho$, which is ordinarily not obtainable. Moreover, when $\rho \neq 0$, the distribution of $r$ is skew, and if $\rho$ is close to $\pm 1$ and the sample is of moderate size, the distribution is very skew indeed. A serious problem is thus created for statisticians who wish to determine, for example, whether the values of $r$ in two independent samples differ significantly from each other or to find a suitably weighted average of several quite different and independent values of $r$, corresponding either to distinct values of $\rho$ or to one common value. Fisher proposed as a solution for such problems the transformation $r = \tanh z$,
$$z = \tanh^{-1} r = \tfrac{1}{2}\log\frac{1 + r}{1 - r},$$
abandoning an inferior transformation of his 1915 paper, and announced that, to a close approximation and with moderately large samples, $z$ has a nearly normal distribution, with means and variances nearly independent of $\rho$. F. N. David examines, in her volume of tables (1938), the accuracy of these statements by Fisher and is inclined to consider them accurate enough for practical use. These descriptive terms are, however, relative, and it still seems that for some cases, especially with small samples, use of the $z$ transformation is not sufficiently accurate. In Fisher's original calculation there are small errors in the mean and variance of $z$, which are
not carried beyond terms of order $n^{-1}$. These are corrected and the series are carried out to terms of order $n^{-2}$ in a paper by Hotelling (1953). These series provide apparent improvements in the accuracy of $z$, at least for large samples. This paper also contains revised calculations on many other aspects of the correlation distribution.

A frequent practical problem is to test the null hypothesis $\rho = 0$ from a single observed correlation. Under normality this null hypothesis corresponds to independence. To this end, $r$ is usually transformed into $z$, which is treated as normally distributed about 0. This practice, however, is not to be recommended. It is far more accurate in such cases to use one of the other three methods of testing the hypothesis $\rho = 0$ (given in the first part of Hotelling 1953).

This 1953 paper is a careful reworking of most of the earlier theory of correlation, with considerable additions. These include three new formulas for the distribution of $r$ when $\rho = 0$ and one formula, involving a very rapidly convergent hypergeometric series, good for all $|\rho| < 1$. With these series there are easily calculated and usually small upper bounds for the error of stopping with any term. There are also attractive series for the probability integral and for the moments of $r$ and of Fisher's transform, $z = \tanh^{-1} r$. Simple improvements are obtained for Fisher's estimates of the bias and variance of $z$. These eliminate certain small errors and go further in the series of powers of $n^{-1}$ to terms of order $n^{-3}$ and carry these through for moments of orders lower than 5. For moments of order 5 or more, all terms are of order $n^{-4}$ or higher. The moments of $r$ through the sixth are given through terms of order $n^{-3}$. The skewness and kurtosis are also given and differ slightly from Fisher's values. Finally, it is proposed that $z$ be modified, particularly for large samples, by using in its place either the first two or all three of the terms of

$$z - \frac{3z + r}{4n} - \frac{23z + 33r - 5r^3}{96n^2}.$$
Here, as throughout the 1953 paper, $n$ means the number of degrees of freedom, which is ordinarily less by unity than the sample number.

A further method for testing $\rho = 0$ is to restate this hypothesis as asserting that the regression coefficient of one variate on the other is truly 0, and to test this by means of Student's $t$, the ratio of the estimated regression coefficient to its estimated standard error; this is a function of $r$. All these methods are accurate only in the case of random sampling from a normal distribution.
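The transformations and the $t$ test discussed above are easy to compute; the following sketch shows one way, using only Python's standard library. Several points are assumptions not stated in the article: the approximate variance $1/(N - 3)$ used for $z$ when comparing two independent correlations, the explicit form $t = r\sqrt{(N - 2)/(1 - r^2)}$ with $N - 2$ degrees of freedom, and the signs of the two correction terms in the modified transforms, which follow the usual statement of Hotelling's (1953) result. The function names are hypothetical.

```python
import math


def fisher_z(r):
    """Fisher's transform z = arctanh r = (1/2) log((1 + r) / (1 - r))."""
    return math.atanh(r)


def modified_z_two_terms(r, n):
    """z with the first correction term; n is the number of degrees of freedom."""
    z = math.atanh(r)
    return z - (3 * z + r) / (4 * n)


def modified_z_three_terms(r, n):
    """z with both correction terms of orders 1/n and 1/n^2."""
    z = math.atanh(r)
    return z - (3 * z + r) / (4 * n) - (23 * z + 33 * r - 5 * r ** 3) / (96 * n ** 2)


def t_statistic(r, n_obs):
    """Student's t for testing rho = 0, a function of r alone (N - 2 d.f.)."""
    return r * math.sqrt((n_obs - 2) / (1 - r * r))


def z_difference(r1, n_obs1, r2, n_obs2):
    """Approximate normal deviate for two independent r's (assumes var z = 1/(N - 3))."""
    se = math.sqrt(1 / (n_obs1 - 3) + 1 / (n_obs2 - 3))
    return (fisher_z(r1) - fisher_z(r2)) / se


# Hypothetical figures, for illustration only.
print(fisher_z(0.5), modified_z_two_terms(0.5, 20), modified_z_three_terms(0.5, 20))
print(t_statistic(0.6, 27), z_difference(0.7, 50, 0.5, 60))
```

All of these rest on random sampling from a bivariate normal population; as the text cautions, the $z$ approximation is least satisfactory for small samples and large values of $r$.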
However, even in this standard situation the use of $z$ is more or less inaccurate, especially for small samples and large values of $r$.

As stated above, Fisher recommends the use of $z$ instead of $r$ also for purposes other than testing $\rho = 0$, such as testing the difference between two independent correlation coefficients or the dispersion among several such values of $r$ or the weights to be applied in averaging them or the accuracy of the average. This idea was carried further by R. L. Thorndike (1933) in a study of the stability of the IQ. Each of his experiments resulted in a correlation between the results of the test given at an earlier and at a later date. With the magnitude of such a correlation coefficient is associated the number of persons in the sample and also the time elapsed between tests. Since the weights to be applied to the independent experiments are inversely proportional to the variances in the several cases, and since the reciprocals of the variances are approximately proportional to the number of cases in the samples when the correlations are transformed into values of $z$, essentially uniform variances are obtained. Thus, in fitting a curve to the several correlations, the method of least squares is appropriate because its assumptions are approximately satisfied. The weights are taken as the numbers of persons in the experiments. More accuracy could presumably be obtained by using instead of $z$ the slightly different expressions $z^*$ and $z^{**}$ obtained by Hotelling (1953, pp. 223-224).

Variance of r in nonnormal cases

In addition to the unreliability of inferences involving correlation coefficients mentioned above, because of correlations between different observations on the same variate and because of nonuniform variances, a quite different source of errors is the nonnormal bivariate distributions that often affect observations.
When these distributions, or their first four moments, are known or approximated, the variance of $r$ is given, to a first approximation, by the formula

$$\sigma_r^2 = \frac{\rho^2}{n}\left[\frac{\mu_{22}}{\mu_{11}^2} + \frac{1}{4}\left(\frac{\mu_{40}}{\mu_{20}^2} + \frac{\mu_{04}}{\mu_{02}^2} + \frac{2\mu_{22}}{\mu_{20}\mu_{02}}\right) - \left(\frac{\mu_{31}}{\mu_{11}\mu_{20}} + \frac{\mu_{13}}{\mu_{11}\mu_{02}}\right)\right],$$

in which $\mu_{ij}$ ($i, j = 0, 1, 2, 3, 4$) is the expectation $E[(X - EX)^i (Y - EY)^j]$. This formula was established by Arthur L. Bowley (1901, p. 423 in the 1920 edition) and later by Maurice G. Kendall (1943-1946, vol. 1, p. 211).

If the moments of the bivariate normal distribution are substituted in this formula, the result is $\sigma_r^2 = (1 - \rho^2)^2/n$, the well-known first approximation. A second approximation is found by multiplying this result by $1 + 11\rho^2/(2n)$, as shown, with considerable extensions, by Hotelling (1953, p. 212).

If instead of being normal the distribution is of uniform density within an ellipse centered at the origin and tilted with respect to the coordinate axes if $\rho \neq 0$, and if the density is 0 outside this ellipse, the formula for the variance, given above, is multiplied by $\tfrac{2}{3}$. This is a substantial reduction.

Another case is a distribution over only four points, with probabilities

$\tfrac{1}{4} + \tfrac{1}{4}\rho$ for $(1, 1)$ and $(-1, -1)$,
$\tfrac{1}{4} - \tfrac{1}{4}\rho$ for $(1, -1)$ and $(-1, 1)$,
and with $\rho$ taking any value between $-1$ and $1$. The moments needed are easily found; since $x^3 = x$ and $y^3 = y$ for the values $\pm 1$, which are the only ones considered, any subscript of 2 or more may be reduced by 2 or 4. The result is $\sigma_r^2 = (1 - \rho^2)/n$, and $\rho$ is the correlation. This variance is larger than that for samples from a normal distribution by the factor $(1 - \rho^2)^{-1}$.

A collection of such cases would be useful in practice because of the importance of nonnormality in correlation.

Partial and multiple correlation: geometry

Suppose that tests of arithmetical and reading abilities, yielding scores $X_1$ and $X_2$, are applied to a group of seventh-grade school children and the correlation between these abilities is sought. A difficulty is that proficiency in both tests depends on age, $X_3$, and general advancement, $X_4$. In this case either or both of $X_3$ and $X_4$ may be incorporated in regression functions fitted by least squares to $X_1$ and $X_2$, and the deviations of $X_1$ and $X_2$ from these functions may be correlated in a way more nearly independent of age and general advancement than $X_1$ and $X_2$ by themselves. Such a correlation is called a sample partial correlation of order 1 or 2, according to the number of variables eliminated, and is denoted by $r_{12.3}$, $r_{12.4}$, or $r_{12.34}$.

If all four variables are measured on each of $N$ children, the results may be pictured as the $N$ coordinates, in a space of $N$ dimensions, of four points, and each of these determines a vector from the origin. If the coordinates are replaced by deviations from the respective four means, this is equivalent to projecting each of the four vectors orthogonally onto the flat subspace through the origin for which the sum of a point's coordinates is zero. Consider the four vectors from the origin to the four projections; the cosines of their angles
are the correlations among the original variables. The above projections may be regarded as the original vectors, from each of which is subtracted its orthogonal projection on the equiangular line (the line of all points whose coordinates are equal among themselves).

The sample partial correlations may be regarded similarly, except that the subtracted projection is onto a subspace that includes the equiangular line, and more. For example, $r_{12.3}$ may be described geometrically as follows: Begin with the plane determined by the equiangular line and the vector from the origin to the point determined by the $N$ observations on $X_3$ as coordinates. Project the vector from the origin to the $X_1$ point onto that plane, and subtract the resulting vector from the $X_1$ vector. This gives the residual values of the $X_1$ observations after best "removing" the effects of a constant and of $X_3$. Now go through the same procedure for $X_2$. Then $r_{12.3}$ is the cosine of the angle between the two vectors of residuals.

In order to compute $r_{12.3}$ it is not necessary to go through this process arithmetically, for $r_{12.3}$ is a simple function of the ordinary correlation coefficients:

$$r_{12.3} = \frac{r_{12} - r_{13}r_{23}}{[(1 - r_{13}^2)(1 - r_{23}^2)]^{1/2}}.$$

From this geometry, which was described by Dunham Jackson (1924), it is easy to see that if $X_1$ and $X_2$ have a joint normal distribution, with independence among the different persons, and $X_3$ is fixed or has an arbitrary distribution, then the deviations of $X_1$ and $X_2$ from their regressions on $X_3$ have a correlation distribution of the same kind, with the sample number reduced by unity.

The definition of $r_{12.4}$ is equivalent to the formula above, with "3" replaced by "4." It may be given a geometrical interpretation like those above. In general, the subscripts before the dot, called primary subscripts, pertain to the variables whose correlation is sought; they are interchangeable.
The subscripts after the dot are called secondary subscripts, refer to the variables being eliminated, and may be permuted among themselves in any order without changing the value of the partial correlation provided by the formula. If $p$ variates and the arithmetic means are eliminated, with $N$ values for each variable, the number of degrees of freedom is reduced to $n = N - 1 - p$.

Partial correlations may also be expressed as ratios of determinants of simple correlations. This fact is useful in proving theorems, but in numerical work the recursive formulas like those above are generally used.
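The two descriptions of a first-order partial correlation, as the correlation of least-squares residuals and as a function of the simple correlations, can be compared directly. The sketch below does so on invented scores for eight children; the second-order coefficient is obtained by applying the same first-order formula to first-order partials, which is one standard form of the recursion and is an assumption here, since the article does not print it. Function names and data are hypothetical.

```python
import math


def pearson_r(xs, ys):
    """Product-moment correlation from deviations about the sample means."""
    n_obs = len(xs)
    x_bar = sum(xs) / n_obs
    y_bar = sum(ys) / n_obs
    sum_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sum_xx = sum((x - x_bar) ** 2 for x in xs)
    sum_yy = sum((y - y_bar) ** 2 for y in ys)
    return sum_xy / math.sqrt(sum_xx * sum_yy)


def residuals(ys, zs):
    """Deviations of ys from its least-squares line on zs (constant included)."""
    n_obs = len(ys)
    y_bar = sum(ys) / n_obs
    z_bar = sum(zs) / n_obs
    slope = (sum((z - z_bar) * (y - y_bar) for z, y in zip(zs, ys))
             / sum((z - z_bar) ** 2 for z in zs))
    return [(y - y_bar) - slope * (z - z_bar) for y, z in zip(ys, zs)]


def partial_r(r_ab, r_ac, r_bc):
    """First-order partial r_ab.c from the three simple correlations."""
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def partial_r_second_order(r_ab, r_ac, r_ad, r_bc, r_bd, r_cd):
    """r_ab.cd by recursion: eliminate c first, then d from the partials."""
    r_ab_c = partial_r(r_ab, r_ac, r_bc)
    r_ad_c = partial_r(r_ad, r_ac, r_cd)
    r_bd_c = partial_r(r_bd, r_bc, r_cd)
    return partial_r(r_ab_c, r_ad_c, r_bd_c)


# Invented scores: x1 arithmetic, x2 reading, x3 age, x4 general advancement.
x1 = [2, 4, 5, 4, 7, 9, 8, 10]
x2 = [1, 3, 2, 5, 4, 7, 8, 6]
x3 = [1, 2, 3, 4, 5, 6, 7, 8]
x4 = [3, 1, 4, 1, 5, 9, 2, 6]

r12, r13, r14 = pearson_r(x1, x2), pearson_r(x1, x3), pearson_r(x1, x4)
r23, r24, r34 = pearson_r(x2, x3), pearson_r(x2, x4), pearson_r(x3, x4)

by_formula = partial_r(r12, r13, r23)
by_residuals = pearson_r(residuals(x1, x3), residuals(x2, x3))
print(by_formula, by_residuals)
print(partial_r_second_order(r12, r13, r14, r23, r24, r34))
```

The agreement of the two first-order values reflects the geometry described above, and interchanging the roles of the third and fourth variables in the recursion leaves $r_{12.34}$ unchanged, as the text states for secondary subscripts.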
Partial correlations were used extensively by Yule (see Yule & Kendall [1911] 1958) in investigations of social phenomena, generally on the basis of the poor-law union as a unit.

Multiple correlation is the correlation of one predictand ("dependent variate") with two or more predictor variables, with least squares as the method of prediction or estimation. The multiple correlation coefficient is the correlation between the observations, $Y$, and the predicted values, $\hat{Y}$ [see LINEAR HYPOTHESES, article on REGRESSION]. The exact sampling distribution of the multiple correlation coefficient $R$, like that of $r$, was discovered by Fisher.

Canonical correlations

The situation of multiple correlation is generalized to the case where one has two sets of variables, with two or more variables in each set, and wishes to use and analyze the relations between the sets. The multiple correlation case is that in which one set consists of only a single variable, whereas in the new situation there are at least two variables in each set.

This problem was dealt with in a brief paper by Hotelling (1935). A longer, definitive version of it and of many related problems appeared the following year (Hotelling 1936a). T. W. Anderson (1958), working with slightly different notation and subject matter, deals with canonical correlations and canonical variates in a population in chapter 12 and in a sample in chapter 13, with related subjects.

A primary objective of canonical correlation analysis is to determine two linear functions, one of variates in the first set, the other of those in the second set, so that the correlation between these two functions is as great as possible. Without loss of generality, one may require the variances of these two linear combinations to be unity, so that covariance is to be maximized. This permits use of the Lagrange multiplier approach, with two fixed conditions.
The resulting equation for maximization is a determinantal equation in the Lagrange multiplier, $\lambda$, written in terms of all the original correlations. If there are $s$ variates in the first set and $t$ in the second, with $s \le t$ as a matter of convention, it turns out that the determinantal equation has $2s$ real roots, all less than or equal to one in absolute value. (They come in pairs of equal magnitude and opposite sign.) If one of the roots is substituted for $\lambda$, the determinant of the determinantal equation is 0; then, if its matrix be used to form linear equations, their solution provides the coefficients of two linear functions of the $s$ and $t$
variates, respectively. The correlations between those pairs of linear functions, lying between 0 and +1, constitute the canonical correlations of the system. The linear functions are the canonical variates and may be regarded either as determined only to within an arbitrary common multiplier or as determined by the conditions that their variance shall equal unity. The greatest root and its corresponding pair of linear functions provide the solution of the primary problem. If all roots are 0, then every correlation of a variate in one set with a variate in the other is 0.

For $s = t = 2$ the calculations are easy by elementary methods. For larger values of $s$ and $t$, however, elementary methods rapidly grow more laborious and may well be superseded by iterative procedures. Such processes are available; different but similar processes are described by Hotelling (1936a; 1936b).

Canonical correlations and variates may be computed for the population, if its correlation matrix is known, exactly as for the sample. If a population canonical correlation, $\rho$, is a single, not a multiple, root of its equation, then large-sample first approximations to it will tend to normality with a standard error that to a first approximation is $(1 - \rho^2)\,n^{-1/2}$, exactly as in the case of elementary correlation. For multiple roots the large-sample approximations have a distribution tending to the chi-square form, with the number of degrees of freedom equal to the multiplicity.

There is some awkwardness in using canonical correlations that may sometimes be avoided, according to the particular purpose, by using functions of them. Symmetric functions often bring special simplicity. If the roots are $r_1, r_2, \ldots, r_s$, two of the most useful symmetric functions are $q = r_1 r_2 \cdots r_s$ and $z = (1 - r_1^2)(1 - r_2^2) \cdots (1 - r_s^2)$; $q$ has been called the vector correlation coefficient and $z$ the vector alienation coefficient.
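For $s = t = 2$ the computation can indeed be carried out by elementary means, as the following sketch suggests. It does not solve the determinantal equation in the form described above; instead it uses an equivalent and widely used formulation, assumed here, in which the squared canonical correlations are the eigenvalues of $R_{11}^{-1} R_{12} R_{22}^{-1} R_{21}$, a $2 \times 2$ matrix whose eigenvalues follow from a quadratic. The function names and the correlation values are hypothetical; the product returned for $q$ is its magnitude only, since the sign is lost in taking square roots.

```python
import math


def inv2(m):
    """Inverse of a 2 x 2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def mul2(p, q):
    """Product of two 2 x 2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def canonical_correlations_2x2(r12, r34, r13, r14, r23, r24):
    """Canonical correlations between the sets (x1, x2) and (x3, x4)."""
    r11 = [[1.0, r12], [r12, 1.0]]          # correlations within the first set
    r22 = [[1.0, r34], [r34, 1.0]]          # correlations within the second set
    cross = [[r13, r14], [r23, r24]]        # first set (rows) by second set
    cross_t = [[r13, r23], [r14, r24]]      # transpose of the cross block
    m = mul2(mul2(inv2(r11), cross), mul2(inv2(r22), cross_t))
    trace = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = math.sqrt(max(trace * trace - 4 * det, 0.0))
    lam_large = (trace + disc) / 2          # squared greatest canonical correlation
    lam_small = max((trace - disc) / 2, 0.0)
    return math.sqrt(lam_large), math.sqrt(lam_small)


# Hypothetical correlations among four variates.
rc1, rc2 = canonical_correlations_2x2(r12=0.5, r34=0.3,
                                      r13=0.4, r14=0.2, r23=0.1, r24=0.3)
q_abs = rc1 * rc2                            # |vector correlation coefficient|
z_alien = (1 - rc1 ** 2) * (1 - rc2 ** 2)    # vector alienation coefficient
print(rc1, rc2, q_abs, z_alien)
```

For larger sets the same matrix can be formed, but its eigenvalues then call for the iterative procedures mentioned in the text or for a numerical linear-algebra routine.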
The coefficients $q$ and $z$ may be used to test different types of deviations from independence between the two sets, but the same is true of other functions of $r_1, \ldots, r_s$, for example, the greatest root. Between the set $(x_1, x_2)$ and the set $(x_3, x_4)$ the vector correlation coefficient is

$$q = \frac{r_{13} r_{24} - r_{14} r_{23}}{\sqrt{(1 - r_{12}^2)(1 - r_{34}^2)}}.$$
This vanishes if the tetrad difference (the numerator) does so. Thus, the tetrad difference, of great importance in factor analysis, may sometimes be tested appropriately by testing q. It is shown by Hotelling (1936a, p. 362) that if complete independence exists between the two sets, the
probability that $q$ is exceeded in a sample of $N$ from a quadrivariate normal distribution is exactly $(1 - |q|)^{N-3}$. (Many other matters involved in the statistics of pairs of variates are also included in Hotelling 1936a and other publications.)

A study of causes of death related to alcoholism in France carried out by Sully Lederman was the starting point of a utilization of canonical correlations and canonical variates by Luu-Mau-Thanh, of the Institut de Statistique de l'Université de Paris and the Institut National d'Études Démographiques (Luu-Mau-Thanh 1963). The first set of variates consisted of three causes of death: alcoholism, liver diseases, and cerebral hemorrhage. The other set consisted of seven other causes of death. The canonical correlations were found to be .812, .450, and .279. The author also calculated principal components for the two sets. He illustrated another kind of application of canonical analysis by some data on grain collected by Frederick V. Waugh (1942) and analyzed by Maurice G. Kendall (1957). Luu-Mau-Thanh commented that the progress of canonical correlation analysis has been hampered by the heavy computational labor required but that the arrival of modern electronic computers will abolish this difficulty.

HAROLD HOTELLING

[See also STATISTICS, DESCRIPTIVE, article on ASSOCIATION.]

BIBLIOGRAPHY

ANDERSON, R. L. 1942 Distribution of the Serial Correlation Coefficient. Annals of Mathematical Statistics 13:1-13. ANDERSON, THEODORE W. 1958 An Introduction to Multivariate Statistical Analysis. New York: Wiley. BOWLEY, ARTHUR L. (1901) 1937 Elements of Statistics. 6th ed. New York: Scribner; London: King. DAVID, FLORENCE N. 1938 Tables of the Ordinates and Probability Integral of the Distribution of the Correlation Coefficient in Small Samples. London: University College, Biometrika Office. FISHER, R. A. 1915 Frequency Distribution of the Values of the Correlation Coefficient in Samples From an Indefinitely Large Population.
Biometrika 10:507-521. FISHER, R. A. 1918 The Correlation Between Relatives on the Supposition of Mendelian Inheritance. Royal Society of Edinburgh, Transactions 52:399-433. FISHER, R. A. 1921 On the "Probable Error" of a Coefficient of Correlation Deduced From a Small Sample. Metron 1, no. 4:3-32. FISHER, R. A. 1924 The Distribution of the Partial Correlation Coefficient. Metron 3:329-333. [GOSSET, WILLIAM S.] (1908) 1943 Probable Error of a Correlation Coefficient. Pages 35-42 in William S. Gosset, "Student's" Collected Papers. Edited by E. S. Pearson and John Wishart. London: University College, Biometrika Office. HOLZINGER, KARL J.; and HARMAN, HARRY H. 1941 Factor Analysis: A Synthesis of Factorial Methods. Univ. of Chicago Press.
HOTELLING, HAROLD 1925 The Distribution of Correlation Ratios Calculated From Random Data. National Academy of Sciences, Proceedings 11:657-662. HOTELLING, HAROLD 1935 The Most Predictable Criterion. Journal of Educational Psychology 26:139-142. HOTELLING, HAROLD 1936a Relations Between Two Sets of Variates. Biometrika 28:321-377. HOTELLING, HAROLD 1936b Simplified Calculation of Principal Components. Psychometrika 1:27-35. HOTELLING, HAROLD 1943 Some New Methods in Matrix Calculation. Annals of Mathematical Statistics 14:1-34. HOTELLING, HAROLD 1953 New Light on the Correlation Coefficient and Its Transforms. Journal of the Royal Statistical Society Series B 15:193-225. HOTELLING, HAROLD; and PABST, MARGARET R. 1936 Rank Correlation and Tests of Significance Involving No Assumption of Normality. Annals of Mathematical Statistics 7:29-43. JACKSON, DUNHAM 1924 The Trigonometry of Correlation. American Mathematical Monthly 31:275-280. KELLEY, TRUMAN L. 1928 Crossroads in the Mind of Man: A Study of Differentiable Mental Abilities. Stanford Univ. Press. KENDALL, MAURICE G. 1938 A New Measure of Rank Correlation. Biometrika 30:81-93. KENDALL, MAURICE G. 1943-1946 The Advanced Theory of Statistics. 2 vols. London: Griffin. -> A new edition, written by Maurice G. Kendall and Alan Stuart, was published in 1958-1966. KENDALL, MAURICE G. (1948) 1955 Rank Correlation Methods. 2d ed. London: Griffin; New York: Hafner. KENDALL, MAURICE G. (1957) 1961 A Course in Multivariate Analysis. London: Griffin. KOOPMANS, TJALLING C. 1942 Serial Correlation and Quadratic Forms in Normal Variables. Annals of Mathematical Statistics 13:14-33. LUU-MAU-THANH 1963 Analyse canonique et analyse factorielle. Institut de Science Economique Appliquee, Cahiers Series E Supplement 138:127-164. PEARSON, KARL; and FILON, L. N. G. (1898) 1948 Mathematical Contributions to the Theory of Evolution.
IV: On the Probable Errors of Frequency Constants and on the Influence of Random Selection on Variation and Correlation. Pages 179-261 in Karl Pearson's Early Statistical Papers. Cambridge Univ. Press. -> First published in Volume 191 of the Philosophical Transactions of the Royal Society of London, Series A. SOPER, H. E. et al. 1917 On the Distribution of the Correlation Coefficient in Small Samples: A Cooperative Study. Biometrika 11, no. 4:328-413. SPEARMAN, CHARLES E. 1904 The Proof and Measurement of Association Between Two Things. American Journal of Psychology 15:72-101. SPEARMAN, CHARLES E. 1927 The Abilities of Man: Their Nature and Measurement. London: Macmillan. THORNDIKE, R. L. 1933 The Effect of the Interval Between Test and Retest Upon the Constancy of the IQ. Journal of Educational Psychology 24:543-549. WAUGH, FREDERICK V. 1942 Regressions Between Sets of Variables. Econometrica 10:290-310. WISHART, JOHN 1932 Note on the Distribution of the Correlation Ratio. Biometrika 24:441-456.
YULE, G. UDNY; and KENDALL, MAURICE G. (1911) 1958 An Introduction to the Theory of Statistics. 14th ed., rev. & enl. London: Griffin. -> Maurice G. Kendall has been a joint author since the eleventh edition (1937). The 1958 edition was revised by Maurice G. Kendall.

IV CLASSIFICATION AND DISCRIMINATION
Classification is the identification of the category or group to which an individual or object belongs on the basis of its observed characteristics. When the characteristics are a number of numerical measurements, the assignment to groups is called by some statisticians discrimination, and the combination of measurements used is called a discriminant function. The problem of classification arises when the investigator cannot associate the individual directly with a category but must infer the category from the individual's measurements, responses, or other characteristics. In many cases it can be assumed that there are a finite number of populations from which the individual may have come and that each population is described by a statistical distribution of the characteristics of individuals. The individual to be classified is considered as a random observation from one of the populations. The question is, Given an individual with certain measurements, from which population did he arise? R. A. Fisher (1936), who first developed the linear discriminant function in terms of the analysis of variance, gave as an example the assigning of iris plants to one of two species on the basis of the lengths and widths of the sepals and petals. Indian men have been classified into three castes on the basis of stature, sitting height, and nasal depth and height (Rao 1948). Six measurements on a skull found in England were used to determine whether it belonged to the Bronze Age or the Iron Age (Rao 1952). Scores on a battery of tests in a college entrance examination may be used to classify a prospective student into the population of students with potentialities of completing college successfully or into the population of students lacking such potentialities. (In this example the classification into populations implies the prediction of future performance.) Medical diagnosis may be considered as classification into populations of disease. 
The problem of classification was formulated as part of statistical decision theory by Wald (1944) and von Mises (1945). [See DECISION THEORY.] There are a number of hypotheses; each hypothesis is that the distribution of the observation is a given one. One of these hypotheses must be accepted and the others rejected. If only two populations are
admitted, the problem is the elementary one of testing one hypothesis of a specified distribution against another, although usually in hypothesis testing one of the two hypotheses, the null hypothesis, is singled out for special emphasis [see HYPOTHESIS TESTING]. If a priori probabilities of the individual belonging to the populations are known, the Bayesian approach is available [see BAYESIAN INFERENCE]. In this article it is assumed throughout that the populations have been determined. (Sometimes the word classification is used for the setting up of categories, for example, in taxonomy or typology.) [See CLUSTERING; TYPOLOGIES.]

The characteristics can be numerical measurements (continuous variables), attributes (discrete variables), or both. Here the case of numerical measurements with probability density functions will be treated, but the case of attributes with frequency functions is treated similarly. The theory applies when only one measurement is available ($p = 1$) as well as when several are ($p \ge 2$). The classification function based on the approach of statistical decision theory and the Bayesian approach automatically takes into account any correlation between variables. (Karl Pearson's coefficient of racial likeness, introduced in a paper by M. L. Tildesley [1921] and used as a basis of classification, suffered from its neglect of correlation between measurements.)

Classification for two populations

Suppose that an individual with certain measurements $(x_1, \ldots, x_p)$ has been drawn from one of two populations, $\pi_1$ and $\pi_2$. The properties of these two populations are specified by given probability density functions (or frequency functions), $p_1(x_1, \ldots, x_p)$ and $p_2(x_1, \ldots, x_p)$, respectively. (Each infinite population is an idealization of the population of all possible observations.) The goal is to define a procedure for classifying this individual as coming from $\pi_1$ or $\pi_2$.
The set of measurements $x_1, \ldots, x_p$ can be presented as a point in a $p$-dimensional space. The space is to be divided into two regions, $R_1$ and $R_2$. If the point corresponding to an individual falls in $R_1$ the individual will be classified as drawn from $\pi_1$, and if the point falls in $R_2$ the individual will be classified as drawn from $\pi_2$.

Standards for classification. The two regions are to be selected so that on the average the bad effects of misclassification are minimized. In following a given classification procedure, the statistician can make two kinds of errors: If the individual is actually from $\pi_1$ the statistician may classify him as coming from $\pi_2$, or if he is from $\pi_2$ the statistician may classify him as coming from $\pi_1$. As shown in Table 1, the relative undesirability of these two kinds of misclassification is measured by $C(2|1)$, the "cost" of misclassifying an individual from $\pi_1$ as coming from $\pi_2$, and $C(1|2)$, the cost of misclassifying an individual from $\pi_2$ as coming from $\pi_1$. These costs may be measured in any consistent units; it is only the ratio of the two costs that is important. While the statistician may not know the costs in each case, he will often have at least a rough idea of them. In practice the costs are often taken as equal.

Table 1: Costs of correct and incorrect classification

                                  Population
Statistician's decision      $\pi_1$        $\pi_2$
$\pi_1$                      0              $C(1|2)$
$\pi_2$                      $C(2|1)$       0
In the example mentioned earlier of classifying prospective students, one "cost of misclassification" is a measure of the undesirability of starting a student through college when he will not be able to finish and the other is a measure of the undesirability of refusing to admit a student who can complete his course. In the case of medical diagnosis with respect to a specified disease, one cost of misclassification is the serious effect on the patient's health of the disease going undetected and the other cost is the discomfort and waste of treating a healthy person.

If the observation is drawn from $\pi_1$, the probability of correct classification, $P(1|1,R)$, is the probability of falling into $R_1$, and the probability of misclassification, $P(2|1,R) = 1 - P(1|1,R)$, is the probability of falling into $R_2$. (In each of these expressions $R$ is used to denote the particular classification rule.) For instance,

$$P(1|1,R) = \int_{R_1} p_1(x_1, \ldots, x_p)\, dx_1 \cdots dx_p. \qquad (1)$$
The integral in (1) effectively stands for the sum of the probabilities of measurements from $\pi_1$ in $R_1$. Similarly, if the observation is from $\pi_2$, the probability of correct classification is $P(2|2,R)$, the integral of $p_2(x_1, \ldots, x_p)$ over $R_2$, and the probability of misclassification is $P(1|2,R)$.

If the observation is drawn from $\pi_1$, there is a cost or loss when the observation is incorrectly classified as coming from $\pi_2$; the expected loss, or risk, is the product of the cost of a mistake times the probability of making it, $r(1,R) = C(2|1)P(2|1,R)$. Similarly, when the observation is from $\pi_2$, the expected loss due to misclassification is $r(2,R) = C(1|2)P(1|2,R)$.

In many cases there are a priori probabilities of drawing an observation from one or the other population, perhaps known from relative abundances. Suppose that the a priori probability of drawing from $\pi_1$ is $q_1$ and from $\pi_2$ is $q_2$. Then the expected loss due to misclassification is the sum of the products of the probability of drawing from each population times the expected loss for that population:

$$q_1 r(1,R) + q_2 r(2,R) = q_1 C(2|1)P(2|1,R) + q_2 C(1|2)P(1|2,R). \qquad (2)$$

The regions, $R_1$ and $R_2$, should be chosen to minimize this expected loss.

If one does not have a priori probabilities of drawing from $\pi_1$ and $\pi_2$, he cannot write down (2). Then a procedure $R$ must be characterized by the two risks $r(1,R)$ and $r(2,R)$. A procedure $R$ is said to be at least as good as a procedure $R^*$ if $r(1,R) \le r(1,R^*)$ and $r(2,R) \le r(2,R^*)$, and $R$ is better than $R^*$ if at least one inequality is strict. A class of procedures may then be sought so that for every procedure outside the class there is a better one in the class (called a complete class). The smallest such class contains only admissible procedures; that is, no procedure out of the class is better than one in the class. As far as the expected costs of misclassification go, the investigator can restrict his choice of a procedure to a complete class and in particular to the class of admissible procedures if it is available.

Usually a complete class consists of more than one procedure. To determine a single procedure as optimum, some statisticians advocate the minimax principle. For a given procedure, $R$, the less desirable case is to have a drawing from the population with the greater risk. A conservative principle to follow is to choose the procedure so as to minimize the maximum risk [see DECISION THEORY].

Classification into one of two populations

Known probability distributions. Consider first the case of two populations when a priori probabilities of drawing from $\pi_1$ and $\pi_2$ are known; then joint probabilities of drawing from a given population and observing a set of variables within given ranges can be defined.
The probability that an observation comes from $\pi_1$ and that the $i$th variate is between $x_i$ and $x_i + dx_i$ $(i = 1, \ldots, p)$ is approximately $q_1 p_1(x_1, \ldots, x_p)\, dx_1 \cdots dx_p$. Similarly, the probability of drawing from $\pi_2$ and obtaining an observation with the $i$th variate falling between $x_i$ and $x_i + dx_i$ $(i = 1, \ldots, p)$ is approximately $q_2 p_2(x_1, \ldots, x_p)\, dx_1 \cdots dx_p$. For an actual observation $x_1, \ldots, x_p$, the conditional probability that it comes from $\pi_1$ is

$$\frac{q_1 p_1(x_1, \ldots, x_p)}{q_1 p_1(x_1, \ldots, x_p) + q_2 p_2(x_1, \ldots, x_p)}, \qquad (3)$$

and the conditional probability that it comes from $\pi_2$ is

$$\frac{q_2 p_2(x_1, \ldots, x_p)}{q_1 p_1(x_1, \ldots, x_p) + q_2 p_2(x_1, \ldots, x_p)}. \qquad (4)$$

The conditional expected loss if the observation is classified into $\pi_2$ is $C(2|1)$ times (3), and the conditional expected loss if the observation is classified into $\pi_1$ is $C(1|2)$ times (4). Minimization of the conditional expected loss is equivalent to the rule

$$R_1: C(2|1)\, q_1 p_1(x_1, \ldots, x_p) \ge C(1|2)\, q_2 p_2(x_1, \ldots, x_p), \qquad R_2: C(2|1)\, q_1 p_1(x_1, \ldots, x_p) < C(1|2)\, q_2 p_2(x_1, \ldots, x_p). \qquad (5)$$
Equivalently, the regions may be written

$$R_1: \frac{p_1(x_1, \ldots, x_p)}{p_2(x_1, \ldots, x_p)} \ge k, \qquad R_2: \frac{p_1(x_1, \ldots, x_p)}{p_2(x_1, \ldots, x_p)} < k, \qquad (6)$$

where $k = [C(1|2)q_2]/[C(2|1)q_1]$. This is the Bayes solution. These results were first obtained in this way by Welch (1939) for the case of equal costs of misclassification.

These inequalities seem intuitively reasonable. If the probability of drawing from $\pi_1$ is decreased or if the cost of misclassifying into $\pi_2$ is decreased, the inequality in (6) for $R_1$ is satisfied by fewer points.

Since the regions depend on $q_1$ and $q_2$, the expected loss does also. The curve A in Figure 1 indicates how the expected loss may vary with $q_1$ (and $q_2 = 1 - q_1$).

[Figure 1: Expected loss as a function of the a priori probability $q_1$ for three procedures. Horizontal axis: a priori probability $q_1$, from 0 to 1; vertical axis: expected loss.]

It may very well happen that the statistician errs in assigning his a priori probabilities. (The probabilities might be estimated from a sample of individuals whose populations of origin are known or can be identified by means other than the measurements for classification; for example, disease categories might be identified by subsequent autopsy.) Suppose that the statistician uses $\bar{q}_1$ and $\bar{q}_2$ $(= 1 - \bar{q}_1)$ when $q_1$ and $q_2$ $(= 1 - q_1)$ are the actual probabilities. Then the expected loss is

$$q_1 C(2|1)P(2|1,R) + (1 - q_1)\, C(1|2)P(1|2,R),$$

where $R_1$ and $R_2$ are based on $\bar{q}_1$ and $\bar{q}_2$. Given the regions $R_1$ and $R_2$, this is a linear function of $q_1$ graphed as the line B in Figure 1, a line that touches A at $q_1 = \bar{q}_1$. The line cannot go below A because the best regions are defined by (6). From the graph it is clear that a small error in $q_1$ is not very important.

When the statistician cannot assign a priori probabilities to the two populations, he uses the fact that the class of Bayes solutions (6) is identical (in most cases) to the class of admissible solutions. A complete class of procedures is given by (6) with $k$ ranging from 0 to $\infty$. (If the probability that the ratio is equal to $k$ is positive a complete class would have to include procedures that randomize between the two classifications when the value of the ratio is $k$.)

The minimax procedure is one of the admissible procedures. Since $R_2$ increases as $k$ increases, and hence $r(1,R)$ increases as $k$ increases, and at the same time $r(2,R)$ decreases, the choice of $k$ giving the minimax solution is the one for which $r(1,R) = r(2,R)$. This is then the average loss, for it is immaterial which population is drawn from. The graph of the risk against a priori probability $q_1$ is, therefore, a horizontal line (labeled C in Figure 1). Since there is one value of $q_1$, say $q_1^*$, such that $k = [C(1|2)(1 - q_1^*)]/[C(2|1)q_1^*]$, the line C must touch A.

Two known multivariate normal populations. An important example of the general theory is that in which the populations have multivariate normal distributions with the same set of variances and correlations but with different sets of means. [See MULTIVARIATE ANALYSIS: OVERVIEW.] Suppose that $x_1, \ldots, x_p$ have a joint normal distribution with means in $\pi_1$ of $E x_i = \mu_i^{(1)}$ and in $\pi_2$ of $E x_i = \mu_i^{(2)}$. Let the common set of variances and
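correlations be $\sigma_1^2, \ldots, \sigma_p^2, \rho_{12}, \rho_{13}, \ldots, \rho_{p-1,p}$. It is convenient to write (6) as

$$R_1: \ln \frac{p_1(x_1, \ldots, x_p)}{p_2(x_1, \ldots, x_p)} \ge \ln k, \qquad R_2: \ln \frac{p_1(x_1, \ldots, x_p)}{p_2(x_1, \ldots, x_p)} < \ln k,$$

where "ln" denotes the natural logarithm. In this particular case

$$\ln \frac{p_1(x_1, \ldots, x_p)}{p_2(x_1, \ldots, x_p)} = \sum_{i=1}^{p} \lambda_i x_i - \frac{1}{2} \sum_{i=1}^{p} \lambda_i \left(\mu_i^{(1)} + \mu_i^{(2)}\right), \qquad (7)$$

where $\lambda_1, \ldots, \lambda_p$ form the solution of the linear equations

$$\sum_{j=1}^{p} \sigma_i \sigma_j \rho_{ij} \lambda_j = \mu_i^{(1)} - \mu_i^{(2)}, \qquad i = 1, \ldots, p,$$

with $\rho_{ii} = 1$. The first term on the right side of (7) is the well-known linear discriminant function obtained by Fisher (1936) by choosing that linear function for which the difference in expected values for the two populations relative to the standard deviation is a maximum. The second term is a constant consisting of the average discriminant function at the two population means. The regions are given by

$$R_1: \sum_{i=1}^{p} \lambda_i x_i - \frac{1}{2} \sum_{i=1}^{p} \lambda_i \left(\mu_i^{(1)} + \mu_i^{(2)}\right) \ge \ln k, \qquad R_2: \sum_{i=1}^{p} \lambda_i x_i - \frac{1}{2} \sum_{i=1}^{p} \lambda_i \left(\mu_i^{(1)} + \mu_i^{(2)}\right) < \ln k. \qquad (8)$$

If a priori probabilities are assigned, then $k$ is $[C(1|2)q_2]/[C(2|1)q_1]$. In particular, if $k = 1$ (for example, if $C(1|2) = C(2|1)$ and $q_1 = q_2 = \tfrac{1}{2}$), $\ln k = 0$, and the procedure is to compare the discriminant function of the observations with the discriminant function of the averages of the respective means.

If a priori probabilities are not known, the same class of procedures (8) is used as the admissible class. Suppose the aim is to find $\ln k = c$, say, so that the expected loss when the observation is from $\pi_1$ is equal to the expected loss when the observation is from $\pi_2$. The probabilities of misclassification can be computed from the distribution of

$$U = \sum_{i=1}^{p} \lambda_i x_i - \frac{1}{2} \sum_{i=1}^{p} \lambda_i \left(\mu_i^{(1)} + \mu_i^{(2)}\right)$$

when $x_1, \ldots, x_p$ are from $\pi_1$ and when $x_1, \ldots, x_p$ are from $\pi_2$. Let $\Delta^2$ be the Mahalanobis measure of distance between $\pi_1$ and $\pi_2$,

$$\Delta^2 = \sum_{i=1}^{p} \lambda_i \left(\mu_i^{(1)} - \mu_i^{(2)}\right).$$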
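The quantities defined so far for two known normal populations can be sketched in a few lines of Python. This is a minimal illustration assuming NumPy; the function names and the numerical parameters are hypothetical, and the covariance matrix is built from the variances and correlation as $\sigma_i \sigma_j \rho_{ij}$.

```python
import numpy as np

def discriminant_coefficients(mu1, mu2, cov):
    """Solve the linear equations cov @ lam = mu1 - mu2 for lambda_1, ..., lambda_p."""
    return np.linalg.solve(cov, mu1 - mu2)

def discriminant_score(x, mu1, mu2, cov):
    """U = sum_i lam_i x_i - (1/2) sum_i lam_i (mu1_i + mu2_i)."""
    lam = discriminant_coefficients(mu1, mu2, cov)
    return float(lam @ x - 0.5 * lam @ (mu1 + mu2))

def mahalanobis_sq(mu1, mu2, cov):
    """Delta^2 = sum_i lam_i (mu1_i - mu2_i)."""
    lam = discriminant_coefficients(mu1, mu2, cov)
    return float(lam @ (mu1 - mu2))

def classify_two(x, mu1, mu2, cov, k=1.0):
    """Return 1 if U >= ln k (region R_1), otherwise 2 (region R_2)."""
    return 1 if discriminant_score(x, mu1, mu2, cov) >= np.log(k) else 2

# Hypothetical parameters: standard deviations 1 and 2, correlation 0.5.
sd = np.array([1.0, 2.0])
rho = 0.5
cov = np.array([[sd[0] ** 2, rho * sd[0] * sd[1]],
                [rho * sd[0] * sd[1], sd[1] ** 2]])
mu1 = np.array([1.0, 2.0])
mu2 = np.array([0.0, 0.0])

x = np.array([0.8, 0.5])  # an observation to be classified (illustrative)
print("U =", discriminant_score(x, mu1, mu2, cov))
print("Delta^2 =", mahalanobis_sq(mu1, mu2, cov))
print("assigned to population", classify_two(x, mu1, mu2, cov, k=1.0))
```

Here `k` is the threshold of (6); passing a value other than 1 corresponds to unequal costs or unequal a priori probabilities.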
The distribution of $U$ is normal with variance $\Delta^2$. If the observation is from $\pi_1$ the mean of $U$ is $\tfrac{1}{2}\Delta^2$; if the observation is from $\pi_2$ the mean is $-\tfrac{1}{2}\Delta^2$. The probability of misclassification if the observation is from $\pi_1$ is

$$P(2|1,R) = \Pr\{U < c \mid \pi_1\} = \Phi\!\left(\frac{c - \tfrac{1}{2}\Delta^2}{\Delta}\right),$$

where $\Phi(z)$ is the probability that a normal deviate with mean 0 and variance 1 is less than $z$. The probability of misclassification if the observation is from $\pi_2$ is

$$P(1|2,R) = \Pr\{U \ge c \mid \pi_2\} = 1 - \Phi\!\left(\frac{c + \tfrac{1}{2}\Delta^2}{\Delta}\right).$$

Figure 2 indicates the two probabilities as the shaded portion in the tails. The aim is to choose $c$ so that

$$C(2|1)\, \Phi\!\left(\frac{c - \tfrac{1}{2}\Delta^2}{\Delta}\right) = C(1|2) \left[1 - \Phi\!\left(\frac{c + \tfrac{1}{2}\Delta^2}{\Delta}\right)\right].$$

If the costs of misclassification are equal, $c = 0$ and the common probability of misclassification is $\Phi(-\tfrac{1}{2}\Delta)$. In case the costs of misclassification are unequal, $c$ can be determined to sufficient accuracy by a trial-and-error method with the normal tables.

If the set of variances and correlations in one population is not the same as the set in the other population, the general theory can be applied, but $\ln[p_1(x_1, \ldots, x_p)/p_2(x_1, \ldots, x_p)]$ is a quadratic, not a linear, function of $x_1, \ldots, x_p$. Anderson and Bahadur (1962) treat linear functions for this case.

[Figure 2: Probabilities of misclassification as shaded areas under normal densities with means $\pm\tfrac{1}{2}\Delta^2$ and variance $\Delta^2$]

Classification with estimated parameters. In most applications of the theory the populations are not known but must be inferred from samples, one from each population.

Two multivariate normal populations. Consider now the case in which there are available random samples from two normal populations and in which the aim is to use that information in classifying another observation as coming from one of the two populations. Suppose the sample $(x_{1\gamma}^{(1)}, \ldots, x_{p\gamma}^{(1)})$ $(\gamma = 1, \ldots, N^{(1)})$ is from $\pi_1$ and the sample $(x_{1\gamma}^{(2)}, \ldots, x_{p\gamma}^{(2)})$ $(\gamma = 1, \ldots, N^{(2)})$ from $\pi_2$. Then $\mu_i^{(1)}$ can be estimated by the mean of the $i$th variate of the first sample, $\bar{x}_i^{(1)}$, and $\mu_i^{(2)}$ by the mean of the second sample, $\bar{x}_i^{(2)}$. The usual estimate of $\sigma_i \sigma_j \rho_{ij}$ based on the two samples is

$$s_{ij} = \frac{1}{N^{(1)} + N^{(2)} - 2} \left[ \sum_{\gamma=1}^{N^{(1)}} \left(x_{i\gamma}^{(1)} - \bar{x}_i^{(1)}\right)\left(x_{j\gamma}^{(1)} - \bar{x}_j^{(1)}\right) + \sum_{\gamma=1}^{N^{(2)}} \left(x_{i\gamma}^{(2)} - \bar{x}_i^{(2)}\right)\left(x_{j\gamma}^{(2)} - \bar{x}_j^{(2)}\right) \right].$$

These estimates may then be substituted into the definition of $U$ to obtain a new linear function of $x_1, \ldots, x_p$ depending on these estimates. The classification function is

$$\sum_{i=1}^{p} l_i x_i - \frac{1}{2} \sum_{i=1}^{p} l_i \left(\bar{x}_i^{(1)} + \bar{x}_i^{(2)}\right),$$

where the coefficients $l_1, \ldots, l_p$ are the solution to

$$\sum_{j=1}^{p} s_{ij} l_j = \bar{x}_i^{(1)} - \bar{x}_i^{(2)}, \qquad i = 1, \ldots, p.$$
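A minimal sketch of the estimated-parameter procedure follows, assuming NumPy and two samples stored as arrays with one row per observation. The function names, the simulated data, and the random seed are hypothetical and serve only to illustrate the pooled covariance estimate and the solution for the coefficients.

```python
import numpy as np

def fit_discriminant(sample1, sample2):
    """Estimate the coefficients l from two samples (rows are observations).

    Returns (l, mean1, mean2, s), where s is the pooled covariance estimate
    with divisor N1 + N2 - 2.
    """
    x1 = np.asarray(sample1, dtype=float)
    x2 = np.asarray(sample2, dtype=float)
    n1, n2 = len(x1), len(x2)
    m1, m2 = x1.mean(axis=0), x2.mean(axis=0)
    d1, d2 = x1 - m1, x2 - m2
    s = (d1.T @ d1 + d2.T @ d2) / (n1 + n2 - 2)
    l = np.linalg.solve(s, m1 - m2)  # solves sum_j s_ij l_j = mean1_i - mean2_i
    return l, m1, m2, s

def classification_function(x, l, m1, m2):
    """Value of sum_i l_i x_i - (1/2) sum_i l_i (mean1_i + mean2_i)."""
    return float(l @ x - 0.5 * l @ (m1 + m2))

# Simulated samples from two normal populations (illustrative values only).
rng = np.random.default_rng(0)
true_cov = np.array([[1.0, 0.3], [0.3, 1.0]])
sample1 = rng.multivariate_normal([1.0, 1.0], true_cov, size=40)
sample2 = rng.multivariate_normal([0.0, 0.0], true_cov, size=60)

l, m1, m2, s = fit_discriminant(sample1, sample2)
new_x = np.array([0.7, 0.4])
w = classification_function(new_x, l, m1, m2)
# With equal costs the cut-off is c = 0.
print("classification function:", w, "-> population", 1 if w >= 0 else 2)
```

The cut-off of 0 in the last line corresponds to the equal-cost case; a different constant $c$ would be substituted for unequal costs.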
Since there are now sampling variations in the estimates of parameters, it is no longer possible to state that this procedure is best in either of the senses used earlier, but it seems to be a reasonable procedure. (A result of Das Gupta [1965] shows that when N(1) = N(2) and the costs of misclassification are equal, the procedure with c = 0 is minimax and admissible.) The exact distributions of the classification statistic based on estimated coefficients cannot be given explicitly; however, the distribution can be indicated as an integral (with respect to three variables). It can be shown that as the sample sizes increase, the distributions of this statistic approach those of the statistic used when the parameters are known. Thus for sufficiently large samples one can proceed exactly as if the parameters were known. Asymptotic expansions of the distributions are available (Bowker & Sitgreaves 1961). A mnemonic device for the computation of the discriminant function (Fisher 1938) is the introduction of the dummy variate, y, which is equal to a constant (say, 1) when the observation is from
$\pi_1$ and is equal to another constant (say, 0) when the observation is from $\pi_2$. Then (formally) the regression of this dummy variate, $y$, on the observed variates $x_1, \ldots, x_p$ over the two samples gives a linear function proportional to the discriminant function. In a sense this linear function is a predictor of the dummy variate, $y$.

In practice the investigator might not be certain that the two populations differ. To test the null hypothesis that $\mu_i^{(1)} = \mu_i^{(2)}$, $i = 1, \ldots, p$, he can use the discriminant function of the difference in sample means

$$\sum_{i=1}^{p} l_i \left(\bar{x}_i^{(1)} - \bar{x}_i^{(2)}\right),$$

which is $(N^{(1)} + N^{(2)})/(N^{(1)} N^{(2)})$ times Hotelling's generalized $T^2$. The $T^2$-test may thus be considered as part of discriminant analysis. [See MULTIVARIATE ANALYSIS: OVERVIEW.]

Classification for several populations

So far, classification into one of only two groups has been discussed; consider now the problem of classifying an observation into one of several groups. Let $\pi_1, \ldots, \pi_m$ be $m$ populations with density functions $p_1(x_1, \ldots, x_p), \ldots, p_m(x_1, \ldots, x_p)$, respectively. The aim is to divide the space of observations into $m$ mutually exclusive and exhaustive regions $R_1, \ldots, R_m$. If an observation falls into $R_g$ it will be considered to have come from $\pi_g$. Let the cost of classifying an observation from $\pi_g$ as coming from $\pi_h$ be $C(h|g)$. The probability of this misclassification is

$$P(h|g,R) = \int_{R_h} p_g(x_1, \ldots, x_p)\, dx_1 \cdots dx_p.$$

If the observation is from $\pi_g$, the expected loss or risk is

$$r(g,R) = \sum_{h \ne g} C(h|g)\, P(h|g,R).$$

Given a priori probabilities of the populations, $q_1, \ldots, q_m$, the expected loss is

$$\sum_{g=1}^{m} q_g\, r(g,R) = \sum_{g=1}^{m} q_g \left[ \sum_{h \ne g} C(h|g)\, P(h|g,R) \right];$$

$R_1, \ldots, R_m$ are to be chosen to make this a minimum.

Using a priori probabilities for the populations, one can define the conditional probability that an observation comes from a specified population, given the values of observed variates, $x_1, \ldots, x_p$. The conditional probability of the observation coming from $\pi_g$ is

$$\frac{q_g p_g(x_1, \ldots, x_p)}{\sum_{k=1}^{m} q_k p_k(x_1, \ldots, x_p)}.$$

If the observation is classified as from $\pi_h$, the expected loss is

$$\sum_{g \ne h} \frac{q_g p_g(x)}{\sum_{k=1}^{m} q_k p_k(x)}\, C(h|g), \qquad (9)$$

where $x$ stands for the set $x_1, \ldots, x_p$. The expected loss is minimized at this point if $h$ is chosen to minimize (9). The regions are

$$R_h: \sum_{g \ne h} q_g p_g(x)\, C(h|g) < \sum_{g \ne k} q_g p_g(x)\, C(k|g) \quad \text{for all } k \ne h; \qquad h = 1, \ldots, m.$$

If $C(h|g) = 1$ for all $g$ and $h$ $(g \ne h)$, then $x_1, \ldots, x_p$ is in $R_k$ if

$$R_k: q_h p_h(x) < q_k p_k(x), \qquad h = 1, \ldots, m,\ h \ne k. \qquad (10)$$
In this case the point $x_1, \ldots, x_p$ is in $R_k$ if $k$ is the index for which $q_g p_g(x)$ is a maximum, that is, $\pi_k$ is the most probable population, given the observation. If equalities can occur with positive probability so that there is not a unique maximum, then any maximizing population may be chosen without affecting the expected loss.

If a priori probabilities are not given, an unconditional expected loss for a classification procedure cannot be defined. Then one must consider the risks $r(g,R)$ over all values of $g$ and ask for the admissible procedures; the form is (10) when $C(h|g) = 1$ for all $g$ and $h$ $(g \ne h)$. The minimax solution is (10) when $q_1, \ldots, q_m$ are found so that

$$r(1,R) = \cdots = r(m,R). \qquad (11)$$

This number is the expected loss. (The theory was first given for the case of equal costs of misclassification by von Mises [1945].)

Several multivariate normal populations. As an example of the theory, consider the case of $m$ multivariate normal populations with the same set of variances and correlations. Let the mean of $x_i$ in $\pi_g$ be $\mu_i^{(g)}$. Then

$$\ln \frac{p_g(x_1, \ldots, x_p)}{p_h(x_1, \ldots, x_p)} = u_{gh}(x_1, \ldots, x_p) = \sum_{i=1}^{p} \lambda_i^{(gh)} \left[ x_i - \tfrac{1}{2}\left(\mu_i^{(g)} + \mu_i^{(h)}\right) \right], \qquad (12)$$

where $\lambda_1^{(gh)}, \ldots, \lambda_p^{(gh)}$ are the solution to

$$\sum_{j=1}^{p} \sigma_i \sigma_j \rho_{ij} \lambda_j^{(gh)} = \mu_i^{(g)} - \mu_i^{(h)}, \qquad i = 1, \ldots, p.$$

For the sake of simplicity, assume that the costs of misclassification are equal. If a priori probabilities, $q_1, \ldots, q_m$, are known, the regions are defined by

$$R_g: u_{gh}(x_1, \ldots, x_p) > \ln \frac{q_h}{q_g} = \ln q_h - \ln q_g, \qquad h = 1, \ldots, m,\ h \ne g, \qquad (13)$$

where $u_{gh}(x_1, \ldots, x_p)$ is (12). If a priori probabilities are not known, the admissible procedures are given by (13), with $\ln q_h$ replaced by suitable constants $c_h$. The minimax procedure is (13), for which (11) holds.

To determine the constants $c_h$, use the fact that if the observation is from $\pi_g$, $u_{gh}(x_1, \ldots, x_p)$, $h = 1, \ldots, m$ and $h \ne g$, have a joint normal distribution with means

$$E\, u_{gh}(x_1, \ldots, x_p) = \frac{1}{2} \sum_{i=1}^{p} \lambda_i^{(gh)} \left(\mu_i^{(g)} - \mu_i^{(h)}\right). \qquad (14)$$

The variance of $u_{gh}(x_1, \ldots, x_p)$ is twice (14), and the covariance between the variables $u_{gh}(x_1, \ldots, x_p)$ and $u_{gk}(x_1, \ldots, x_p)$ is

$$\sum_{i=1}^{p} \lambda_i^{(gh)} \left(\mu_i^{(g)} - \mu_i^{(k)}\right).$$
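The rule for several normal populations with known a priori probabilities and equal costs can be sketched as follows. This is an illustrative Python fragment assuming NumPy; the function names, the zero-based population indices, and the numerical means, covariance matrix, and a priori probabilities are all hypothetical.

```python
import numpy as np

def u_gh(x, g, h, means, cov):
    """Pairwise discriminant: sum_i lam_i [x_i - (mu_g_i + mu_h_i)/2],
    where lam solves cov @ lam = mu_g - mu_h."""
    lam = np.linalg.solve(cov, means[g] - means[h])
    return float(lam @ (x - 0.5 * (means[g] + means[h])))

def classify_several(x, means, cov, priors):
    """Return the (zero-based) index g with u_gh(x) > ln(q_h / q_g) for every h != g.

    Returns None if x lies exactly on a boundary between regions.
    """
    m = len(means)
    for g in range(m):
        if all(u_gh(x, g, h, means, cov) > np.log(priors[h] / priors[g])
               for h in range(m) if h != g):
            return g
    return None

# Hypothetical setting with p = 2 variates and m = 3 populations.
means = np.array([[0.0, 0.0], [2.0, 0.0], [1.0, 2.0]])
cov = np.array([[1.0, 0.2], [0.2, 1.5]])
priors = [0.5, 0.3, 0.2]

x = np.array([1.4, 0.3])  # an observation to be classified (illustrative)
print("assigned to population index", classify_several(x, means, cov, priors))
```

Under these assumptions the rule agrees with choosing the population for which $q_g p_g(x)$ is largest, since the regions are stated in terms of logarithms of density ratios.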
From these means, variances, and covariances one can determine $P(h|g,R)$ for any set of constants $c_1, \ldots, c_m$.

This procedure divides the space by means of hyperplanes. If $p = 2$ and $m = 3$, the division is by half-lines, as in Figure 3.

[Figure 3: Regions of classification into one of three multivariate populations]

If the populations are unknown, the parameters may be estimated from samples, one from each population. If the samples are large enough, the above procedures can be used as if the parameters were known. An example of classification into three populations has been given in Anderson (1958).

The problem of classification when $(x_1, \ldots, x_p)$ are continuous variables with density functions has been treated here. The same solutions are obtained when the variables are discrete, that is, take on a finite or countable number of values. Then $p_1(x_1, \ldots, x_p)$, $p_2(x_1, \ldots, x_p)$, and so on are the respective probabilities (or frequency functions) of $(x_1, \ldots, x_p)$ in $\pi_1$, $\pi_2$, and so on. (See Birnbaum & Maxwell 1960; Cochran & Hopkins 1961.) In this case randomized procedures are essential.

For other expositions see Anderson (1951) and Brown (1950). For further examples see Mosteller and Wallace (1964) and Smith (1947).

T. W. ANDERSON

[Directly related are the entries CLUSTERING; SCREENING AND SELECTION.]

BIBLIOGRAPHY
ANDERSON, T. W. 1951 Classification by Multivariate Analysis. Psychometrika 16:31-50. ANDERSON, T. W. 1958 An Introduction to Multivariate Statistical Analysis. New York: Wiley. ANDERSON, T. W.; and BAHADUR, R. R. 1962 Classification Into Two Multivariate Normal Distributions With Different Covariance Matrices. Annals of Mathematical Statistics 33:420-431. BIRNBAUM, A.; and MAXWELL, A. E. 1960 Classification Procedures Based on Bayes's Formula. Applied Statistics 9:152-169. BOWKER, ALBERT H.; and SITGREAVES, ROSEDITH 1961 An Asymptotic Expansion for the Distribution Function of the W-classification Statistic. Pages 293-310 in Herbert Solomon (editor), Studies in Item Analysis and Prediction. Stanford Univ. Press. BROWN, GEORGE W. 1950 Basic Principles for Construction and Application of Discriminators. Journal of Clinical Psychology 6:58-60. COCHRAN, WILLIAM G.; and HOPKINS, CARL E. 1961 Some Classification Problems With Multivariate Qualitative Data. Biometrics 17:10-32. DAS GUPTA, S. 1965 Optimum Classification Rules for Classification Into Two Multivariate Normal Populations. Annals of Mathematical Statistics 36:1174-1184. FISHER, R. A. 1936 The Use of Multiple Measurements in Taxonomic Problems. Annals of Eugenics 7:179-188. FISHER, R. A. 1938 The Statistical Utilization of Multiple Measurements. Annals of Eugenics 8:376-386. MOSTELLER, FREDERICK; and WALLACE, DAVID L. 1964 Inference and Disputed Authorship: The Federalist. Reading, Mass.: Addison-Wesley. RAO, C. RADHAKRISHNA 1948 The Utilization of Multiple Measurements in Problems of Biological Classification. Journal of the Royal Statistical Society Series B 10:159-193. RAO, C. RADHAKRISHNA 1952 Advanced Statistical Methods in Biometric Research. New York: Wiley. SMITH, CEDRIC A. B. 1947 Some Examples of Discrimination. Annals of Eugenics 13:272-282. TILDESLEY, M. L. 1921 A First Study of the Burmese Skull. Biometrika 13:176-262.
VON MISES, RICHARD 1945 On the Classification of Observation Data Into Distinct Groups. Annals of Mathematical Statistics 16:68-73. WALD, ABRAHAM 1944 On a Statistical Problem Arising in the Classification of an Individual Into One of Two Groups. Annals of Mathematical Statistics 15:145-162. WELCH, B. L. 1939 Note on Discriminant Functions. Biometrika 31:218-220.
MUN, THOMAS
Thomas Mun (1571-1641), English writer on economics, was the third son of a substantial London family. His grandfather was an officer of the mint and acquired a coat of arms, his uncle was also an officer of the mint, and his stepfather was a director of the newly formed East India Company. Nothing is known of his education, but it is presumed, since there were close links between the Indian and the Mediterranean trades, that he served his apprenticeship in the latter. In fact, he says in one of his books that he lived for some time in Italy. He became a prominent and rich member of the East India Company, married the daughter of a Bedfordshire gentleman, and inherited and bought land in the country. One of his daughters married a baronet, another a merchant. His son appears to have lived the life of a country gentleman.
Mun came into the public eye during the economic depression which began in 1620. The books for which he is famous sprang entirely from that depression. The gravest symptom of the depression was the shortage of money, and, indeed, many regarded this shortage as a cause of the depression. In 1621 Mun wrote and published A Discourse of Trade From England Unto the East-Indies to answer the charge that the East India Company, which financed its trade largely by the export of silver coin, was responsible for the depression. His argument was that East Indian goods, when reexported, earned more silver than that originally exported to pay for them. Mun was one of the merchants consulted by the government about the causes of the depression and was a member of the great commission of trade set up in 1622 to make recommendations concerning economic policy. On the commission, he opposed successfully the advocates of two different policies, each based on a distinct theoretical analysis of the mechanism of foreign trade.
One group of advocates held that the export of silver was caused by the undervaluation of silver coin in England and urged therefore that sterling be devalued: this view found articulate expression in Edward Misselden's Free Trade (1622). A second group believed that excessive export was intrinsic in foreign exchanges and advocated exchange control with a fixed exchange rate: this view was, in turn, forcibly presented by Gerard de Malynes in The Maintenance of Free Trade (1622) and elsewhere. Mun composed, or helped to compose, a series of papers directed against both these views, and these papers formed the substance of a book which he completed between 1626 and 1628 and which his
son published in 1664: England's Treasure by Forraign Trade. Mun was a practicing professional merchant, which his opponents for the most part were not, and the book is much more a handbook for merchants and statesmen than an essay in theoretical economics. From the theoretical standpoint, Mun's criticism of de Malynes's views was inadequate. His central thesis was a tautological statement that money flows into or out of the country as the value of exports exceeds or falls short of the value of imports. He recognized the existence of invisible exports but was not original in this respect. Far from questioning in what sense treasure (that is, in the last resort, silver) is synonymous with wealth, he accepted as axiomatic that the balance of payments is the "rule" or "touchstone" of national wealth. He ignored the possibly inflationary effects of an indefinite influx of silver, took no account of international lending, and stated that the "overplus" of the balance and no more ought to be drawn off in taxation—as if government spending were economically irrelevant. Nevertheless, England's Treasure remains a great book, even if it is not exactly a storehouse of the best economic ideas of the age and if its originality must be questioned at many points. It is an important book, first, because Mun's ideas prevailed —devaluation and exchange control were not attempted—and second, because in a single (admittedly partial) analysis it embraced with unrivaled lucidity all the economic variables under discussion at that time. Mun insisted that foreign trade is governed by the demand for commodities, that the flow of goods rules the exchange rate, and that silver itself is merely another commodity. He advocated low export prices, efficient commercial procedures, full exploitation of native skills and resources, low duties on exports, encouragement of re-exports, and the like: in sum, an export drive. 
Perhaps his most notable contribution to economic theory was to recognize and to insist on the principle of elasticity of demand, estimating that a reduction of 25 per cent in the price of cloth (England's chief export) would increase by 50 per cent the quantity sold. He dealt with the great but not insuperable difficulty of drawing up a balance of payments; such a balance was, in fact, established shortly after Mun's book was published. Mun's originality lay in adjusting the conventional doctrine of the balance of trade (or rather of payments) to the new circumstances of rising foreign competition in the export market, especially of fierce economic rivalry with Holland, then the dominant commercial power; it is significant that the book was first published on the eve of the second Anglo-Dutch war. His practical liberalism, typical of the professional merchant of his day, commended him to later laissez-faire economists such as John R. McCulloch, who saw him as a tentative exponent of freedom of trade. He was, however, sharply divided from the laissez-faire economists and remained typically mercantilist in his reiterated distinction between the profit of the individual merchant and the general welfare of the national economy as a whole, as when he stated that the merchant's gain can be the commonwealth's loss and the merchant's loss the commonwealth's gain. His whole argument presupposes that a nation which gains by foreign trade does so at the expense of another.
R. W. K. HINTON
[For the historical context of Mun's work, see ECONOMIC THOUGHT, article on MERCANTILIST THOUGHT; and the biography of MISSELDEN.]
WORKS BY MUN
(1621) 1954 A Discourse of Trade From England Unto the East-Indies. Pages 1-47 in John R. McCulloch (editor), Early English Tracts on Commerce. Cambridge Univ. Press.
(1664) 1959 England's Treasure by Forraign Trade. Oxford: Blackwell.
SUPPLEMENTARY BIBLIOGRAPHY
GOULD, J. D. 1955a The Date of England's Treasure by Forraign Trade. Journal of Economic History 15:160-161.
GOULD, J. D. 1955b The Trade Crisis of the Early 1620's and English Economic Thought. Journal of Economic History 15:121-133.
HARDY, ALFRED L. 1894 Thomas Mun. Pages 1183-1186 in Dictionary of National Biography. London: Smith.
HINTON, R. W. K. 1955 The Mercantile System in the Time of Thomas Mun. Economic History Review Second Series 7:277-290.
MALYNES, GERARD DE 1622 The Maintenance of Free Trade. London: Sheffard.
MISSELDEN, EDWARD 1622 Free Trade: Or, the Meanes to Make Trade Florish. London: Waterson.
SUPPLE, BARRY E. 1959 Commercial Crisis and Change in England, 1600-1642: A Study in the Instability of a Mercantile Economy. Cambridge Univ. Press.
VINER, JACOB 1937 Studies in the Theory of International Trade. New York: Harper.
MÜNSTERBERG, HUGO
Hugo Münsterberg (1863-1916) made his greatest contribution by applying psychology to practical situations in education, medicine, law, and business. He pioneered in this field when most psychologists were still working only on basic theoretical principles. Münsterberg also did theoretical work, but he is best remembered for several books in special applied fields and for his very comprehensive (for his time) Grundzüge (1914a). Most of these practical contributions were made in the eight or nine years prior to his untimely death.
Münsterberg was born in Danzig in 1863, took his Ph.D. under Wundt at Leipzig, and received a medical degree at Heidelberg. In 1892 William James arranged to bring him to Harvard as professor of psychology and director of the psychology laboratory. Except for a year as exchange professor at the University of Berlin in 1910/1911, the remainder of his career was spent at Harvard.
Münsterberg's initial academic interests were principally philosophical. His system was sometimes described as a "voluntaristic idealism." He placed a barrier between philosophy and science: philosophy dealt with the "real world" of purposes, and science was limited to causes. Later he tried to formalize this arrangement as causal psychology and purposive psychology. This dichotomy was included in his introductory textbook, but it was not received very well by students in the beginning course; nor did it have much impact on philosophers in general. The same thing was true of his "action theory," which stated that the vividness of experience depends on the amount of activity in the cerebral motor centers.
It was in applied psychology that Münsterberg's work had a lasting effect, although much of it was generated in the armchair rather than in the laboratory. His books mentioned numerous implications of psychology for problems in the workaday world and gave suggestions for further exploration and research. He did some experimental work himself and supervised research by students. One field he explored was the use of psychology in business and industry. He made some of the first efforts toward validating aptitude tests.
In an era when a correlation coefficient was something rarely understood or used, Münsterberg was in some fashion relating test results to a criterion of efficiency of workers on the job (motormen and telephone operators, for example), and he saw the implications of fatigue and monotony for industrial efficiency. He was one of the first to get in touch with business people to suggest ways psychology could help them. He was also in contact with aeronautical engineers regarding psychological problems connected with flying. Another field which Münsterberg explored was education. His contributions here were less notable in the sense that he did not stand alone. Many academic educators had contacts with psychologists, and together they turned up problems of common interest. With his medical background Münsterberg had
some experience with problems of mental health and did some work in therapy by suggestion. He was among the early users of hypnotism in psychotherapy. During his later years he kept on his desk as a symbol of his interest in hypnosis a paperweight consisting of four glass balls in the form of a tetrahedron. The center of this device provided a good fixation point for a patient being hypnotized.
In the field of law, and especially in regard to testimony, Münsterberg noted how mistakes in perception or lapses in memory contribute to the unreliability of a witness. Nobody else wrote along these lines for two decades. In the 1890s Münsterberg suggested that changes in blood pressure might have some relation to the veracity of testimony. The first experimental work on blood pressure in this context was done by a student in Münsterberg's laboratory. Records of blood pressure are now included in the measurements made by practically every polygraph used for "lie detection."
Münsterberg's contribution to applied psychology had two further facets: first, he let outsiders know how psychology might help them in practical problems; and second, he convinced a small group of psychologists that practical application of the science was a legitimate field for a career. This group has grown through the years.
Had Münsterberg lived longer, it is probable that he would have turned back to philosophy as his major interest. It is said that he had hoped to spend his later years in one of the endowed professorships of philosophy. If that had been possible, his professional life would have come full circle, but Münsterberg died suddenly while lecturing at Radcliffe College in 1916.
HAROLD E. BURTT
[For the historical context of Münsterberg's work, see the biography of WUNDT. For discussion of the subsequent development of his ideas, see APTITUDE TESTING; HYPNOSIS; INDUSTRIAL RELATIONS, article on INDUSTRIAL AND BUSINESS PSYCHOLOGY.]
WORKS BY MÜNSTERBERG
1908 On the Witness Stand. New York: Doubleday.
1909a Psychology and the Teacher. New York: Appleton.
1909b Psychotherapy. New York: Moffat.
1913 Psychology and Industrial Efficiency. Boston: Houghton Mifflin.
(1914a) 1928 Grundzüge der Psychotechnik. 3d ed. Leipzig: Barth.
1914b Psychology, General and Applied. New York: Appleton.
SUPPLEMENTARY BIBLIOGRAPHY
MÜNSTERBERG, MARGARETE 1922 Hugo Münsterberg: His Life and Work. New York: Appleton.
MURDER See CRIME, article on HOMICIDE.
MUSIC
i. ETHNOMUSICOLOGY    Alan P. Merriam
ii. MUSIC AND SOCIETY    Hans Engel
ETHNOMUSICOLOGY
The beginnings of ethnomusicology are usually traced back to the 1880s and 1890s, when studies were initiated primarily in Germany and in the United States. Early in this development there appeared a dual division of emphasis that has remained throughout the history of the field. Definitions. Two polar positions on a definition of "ethnomusicology" are most frequently enunciated: the first is embodied in such statements as "ethnomusicology is the total study of non-Western music," and the second in "ethnomusicology is the study of music in culture." The first derives from a supposition that ethnomusicology should concern itself with certain geographical areas of the world; those who hold this point of view tend to treat the music structurally. The second stresses music in its cultural context, no matter in what geographical area of the world and is concerned with music as human behavior and the functions of music in human society and culture. Consequently, its emphasis on musical structure is not as great, although it does use objective techniques of detailing a musical style to effectuate comparison between song bodies and to attack problems of diffusion, acculturation, and culture history. Thus one emphasis in ethnomusicology concerns the description and analysis of technical aspects of musical structure. In early writings this aim tended to be coupled with attempts to use the concept of social evolution to establish basic laws of the development of music structure through time. Particular attention was also directed toward the problem of the ultimate origin of music; and later, with the rise of Kulturkreis theories and particularly in connection with the study of musical instruments, detailed reconstructions of music diffusion from supposed basic geographical centers were attempted. The second emphasis in ethnomusicology was directed toward the study of music in its ethnologic context, and research in this area was influenced by American anthropology. 
As a result, extreme theories of evolution and diffusion were strongly discounted. Ethnomusicology has thus developed in two directions. On the one hand, music is treated as a structure that operates, it is presumed, according to certain principles inherent in its own construction. On the other hand, since music is produced
by and for people, it must also be regarded as a product of human behavior operating within a cultural context and in conjunction with all the other facets of human behavior. The duality of music as a human phenomenon is thus emphasized in ethnomusicological studies; while musical sound has structure, that structure is produced by human behavior and operates in a total cultural context. Ethnomusicology has also been shaped by various historical processes. Arising at a time when virtually nothing was known outside Western and, to a certain extent, Oriental cultures, ethnomusicology placed heavy emphasis on the unknown areas of the world: Africa, aboriginal North and South America, Oceania, inner Asia, Indonesia. Thus the development of ethnomusicology to a considerable extent paralleled that of anthropology: both disciplines were forced to deal with all these areas at once, the anthropologist with the total cultures of the so-called "primitive" peoples and the ethnomusicologist with the total study of their music. Thus there arose in ethnomusicology a body of techniques and a system of analysis, which, while drawing upon studies of Western music, have taken some unique turns. Music structure. Ethnomusicologists are engaged in a search for the proper balance between the basic parts of their discipline, and this search tends to be made within the framework of three major responsibilities felt by scholars in the field. The first of these areas is the technical study of music structure itself and of how it can best be learned, described, generalized, and compared in specific instances. Even here there is divergence of opinion, as one group of ethnomusicologists argues that the best way to learn a music system is by learning to perform in its style. Performance, most notably in Indonesian and Far Eastern orchestras and styles, is stressed by some scholars, and in many cases with notable results.
On the other hand, this approach is criticized by those who hold that performance cannot be the ultimate goal of ethnomusicology and that the value of performance tends to be overstressed. Ethnomusicologists are agreed, however, that musical sound must ultimately be reduced to notation. Notation by ear in the field is considered unreliable because of the many nuances that are lost, and the usual procedure is to work by ear from tape or disc recordings. In recent years the possibilities of constructing electronic equipment that will give a far more accurately detailed transcription have been explored, and preliminary results indicate that such equipment may, indeed, be both feasible and useful.
The precise transcription of scale systems tuned in intervals different from the Western scale remains somewhat difficult, although such measuring devices as the monochord, electronic equipment, and the cents system can, and do, bring a high degree of precision. Most ethnomusicologists, however, use the Western staff system for notation, employing various special signs to indicate pitch differences and discussing the precise tunings in the body of their report. Analysis is almost always couched in objective, arithmetical, and sometimes statistical terms, with frequencies of appearance of specific characteristics related to the total possibility of the sample. Those characteristics of the music usually considered include melodic range, level, direction, and contour; melodic intervals and interval patterns; ornamentation and melodic devices; melodic meter and rhythm; durational values; formal structure; scale, mode, duration tone, and (subjective) tonic; meter and rhythm; tempo; and vocal style. Other characteristics may be added by the individual student, and almost every body of song demands unique attention in some respects. There remain, however, a number of difficulties in the technical analysis of music. The first of these concerns transcription itself and the accuracy that can be achieved through the use of the human ear. Closely connected with this is the unresolved question of how accurate a transcription must be; that is, can one generalize, or must the accuracy be as high as that presaged by the advent of electronic equipment? A third problem concerns sampling. Theoretically, at least, the musical universe of any given people is infinite, and the questions are thus how large a sample yields reliable results and whether a larger sample will yield significantly different results from a smaller one. 
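As a rough illustration of the cents system mentioned above, the following Python sketch converts pairs of measured frequencies into interval sizes in cents, using the standard definition of 1,200 cents to the octave, and reports how far each interval lies from the nearest interval available in Western equal temperament. The function names and the sample frequencies are hypothetical, invented only for this sketch; they are not drawn from any study cited in this article.

```python
import math


def interval_in_cents(f_low: float, f_high: float) -> float:
    """Return the interval between two frequencies in cents (1200 cents = 1 octave)."""
    if f_low <= 0 or f_high <= 0:
        raise ValueError("frequencies must be positive")
    return 1200.0 * math.log2(f_high / f_low)


def deviation_from_semitone_grid(cents: float) -> float:
    """Signed distance, in cents, from the nearest multiple of 100 cents,
    i.e. from the nearest interval of Western equal temperament."""
    return cents - 100.0 * round(cents / 100.0)


# Hypothetical tuning measurements in Hz (invented for illustration only).
measured = [220.0, 252.0, 289.0, 330.0, 440.0]

# Compare each pair of adjacent scale degrees.
for lower, upper in zip(measured, measured[1:]):
    size = interval_in_cents(lower, upper)
    offset = deviation_from_semitone_grid(size)
    print(f"{lower:6.1f} -> {upper:6.1f} Hz: {size:7.1f} cents "
          f"({offset:+.1f} from the nearest equal-tempered interval)")
```

A signed deviation of this kind is one possible way of stating numerically the pitch differences that special signs on the Western staff are meant to flag; a different reference grid could be substituted without changing the cents calculation itself.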
It must also be decided whether one type of song in a given culture is significantly different from another and, if so, whether these types must be treated separately or lumped together into a general set of results for the entire body of music. Finally, there is the major problem of which elements of a musical style are significant, and whether those that are significant are also characteristic. Despite these questions, the technical analysis of musical style has reached a point at which a high degree of precision is possible, and the directions in which analysis has thus far moved seem clearly to be those that will be refined and more fully exploited in the future. Musical instruments. Associated with the study of musical structure is the study of musical instruments, taken from both the technical and the distributional points of view. Ethnomusicology has supplied detailed studies of the construction and
tuning of instruments, as well as a precise classification of instruments according to the mechanism of sound production (aerophones, chordophones, idiophones, and membranophones). Distributional and diffusion studies of instruments are found for many parts of the world. Music as human behavior. Musical sound does not and cannot constitute a system that operates outside the control of human beings. It is thus a product of the behavior that produces it. Behavior includes a wide variety of phenomena, but within the rubric four particularly important facets can be segregated. The first of these refers to the physical behavior of the musician and his audience. In order to produce vocal sounds, the musician must control the vocal organs and the muscles of throat and diaphragm in certain ways; likewise, in producing instrumental music his breath control and manipulation of fingers or lips upon the instrument can only be achieved through training, whether the musician trains himself or is trained by others. It has further been noted that in performing, musicians take on characteristic bodily postures, tensions, and attitudes, and attempts are being made to correlate these with types of music styles. Similarly, the audience responds to music in physical and physiological ways, but little is known of this phenomenon cross-culturally. A second form of behavior in this context is the social behavior that accompanies music. In response to his social role, the individual musician behaves in specific ways according to his own concept of what that role entails, as well as in response to the pressures placed upon him by society at large. Being a musician means behaving according to culturally defined values; for him, the attitudes and expectations of society, as well as his own attitudes toward himself, define what is considered to be "musicianly." 
But society is shaped also by the musician and his music, for it is often the latter that gives the cues for proper behavior in a given social situation. The third important aspect of music behavior concerns learning on the part of both the specialist and the layman. The musician needs training, whether it is achieved through imitation, apprenticeship, formal schooling, or some other device. Similarly, the nonspecialist learns his music system sufficiently to participate to some extent and certainly well enough to differentiate it from other systems. Finally, verbal behavior is involved in music to the extent to which analytic comment is made by members of a culture on their music system. Theory of music. Beneath the level of behavior as such, however, lies a deeper level, that of the
conceptualization of music. The ethnomusicologist deals with why music sounds the way it does, as well as with the "musts" and "shoulds" of music. Although little material of this kind is available as yet, the problems lie in the nature of the distinctions made between music and nonmusic, the sources from which music is drawn, techniques of composition, the inheritance of musical ability, and other questions of a similar nature. In other words, before music behavior can be acted out, there must be underlying concepts in terms of which the behavior is shaped. There exists, then, a continuum of levels of analysis in the study of musical behavior: music must begin with basic concepts and values, which in turn are translated into actual behavior; this in turn is directed toward the achievement of a specific musical product, or structural sound. There remains one further aspect of the continuum, however, and this appears in the acceptance or rejection of the final product both by the musician and by the members of the society at large. If the product is acceptable to both, then the concepts out of which it has arisen are reinforced and the behavior perfected insofar as possible; if, on the other hand, the product is not adjudged acceptable, then concepts must be changed and translated into different behavior in order to adjust the structured sound to what is considered proper. The product thus inevitably feeds back upon the concept, which in turn shapes behavior so that the product, again, will be successful. Both here and on the behavioral level, ideas and techniques of musical training are of the utmost importance. Under the stimulation of anthropological problems, methods, and theory, the behavioral aspects of ethnomusicology have begun to take on added interest; and by 1950 "ethnomusicology" was replacing the older term "comparative musicology" (vergleichende Musikwissenschaft). Ethnomusicology and related fields.
Growing out of the studies of those interested primarily in music as human behavior has been a third area of responsibility for ethnomusicologists, and this concerns the relationship of the field to other kinds of studies. Two major avenues of research have opened here, the first in the relationship of ethnomusicology to the study of the other arts, and the second in its relationship to the social sciences. Relations with the arts. In respect to the arts as a whole, ethnomusicologists have begun to turn to problems of general aesthetics as these are illuminated by the cross-cultural perspective of comparative music studies. One such problem is the nature of what is called the aesthetic in Western
culture, for those few ethnomusicologists who have considered the subject have in general agreed that the term does not translate well to other cultures, particularly those of nonliterate peoples where the underlying assumptions about music tend to run along different lines. There is a strong suggestion that for most peoples outside Western and Eastern civilization music may be a functional rather than an aesthetic complex in which major emphasis is placed upon what music does rather than philosophic speculation on what it is. This in turn has considerable bearing upon the Western assumption of the interrelatedness of the various arts. What empirical evidence is available seems to indicate that most other peoples do not conceive ideationally of the arts as structurally interrelated, and therefore this concept may well be applicable in the Western context alone. Similar problems that tend to bring evidence to these two major questions include synesthesia, intersense modalities, and so forth. The cross-cultural contribution of ethnomusicology in such problems is potentially considerable, and questions of this nature are being more and more widely considered. Relations with the social sciences. The relationship of ethnomusicology to the social sciences has already been indicated in that an ethnologic component is inherent in the basic organization of the field. As ethnomusicology continues to expand its orientation, it becomes more and more apparent that both ethnomusicologists and social scientists have overlooked a number of possibilities for fruitful cooperation between the two broad areas. The entire study of music as human behavior, of course, lies well within the sphere of social science, as does the application even of technical music analysis to problems such as acculturation, but there are other applications as well.
Among these is the study of music as symbolic behavior, both in itself and as it relates to broader areas of the culture under study. Political, social, legal, economic, and religious concepts can all be symbolized in musical sound and behavior, and it is frequently to be noted that in the arts in general, among them music, symbolic expression tends to cut to the deepest levels of value and belief. The functions of music in any given culture tell much of the organization and processes of the culture at large, and reference is made here not only to "use" but to integrative function as well. Music operates for specific purposes in all cultures, and analysis of these processes reveals much about both specific and general behavior. Song texts are a badly neglected area of study, both in connection with music itself and with the wider culture. Studies have shown that language behavior in song
may differ sharply from that in everyday discourse, with the stress in song often being placed upon the expression of otherwise unutterable feelings, thoughts, attitudes, and ideas; texts are thus very often an extremely important index to basic values. Texts, too, reveal psychological processes in the life of any given culture, such as when they indicate mechanisms of repression or compensation. It is well known that songs can serve functions of social control, as well as educational and historiographical functions. The relevance of music studies to social science is indeed great, and both disciplines might derive considerable benefit from recognizing this fact.
Ethnomusicology, then, is currently in a phase of expansion and development wherein it is engaged in sorting out the kinds of studies of greatest importance to its development. By its very nature it is interdisciplinary, using the techniques, methods, and theories of both musicology and ethnology; from the fusion of the two it gains new and unique strengths.
ALAN P. MERRIAM
[Directly related are the entries CRAFTS; FOLKLORE; PRIMITIVE ART.]
BIBLIOGRAPHY
The works cited below have been chosen to give a broad rather than a selective coverage of widely divergent points of view and methods of approach.
ELLIS, ALEXANDER J. 1885 On the Musical Scales of Various Nations. Journal of the Royal Society of Arts 33:485-527.
HERZOG, GEORGE 1936 A Comparison of Pueblo and Pima Musical Styles. Journal of American Folklore 49:283-417.
HOOD, MANTLE 1963 Music, the Unknown. Pages 215-326 in Frank L. Harrison, Mantle Hood, and Claude V. Palisca, Musicology. Englewood Cliffs, N.J.: Prentice-Hall.
HORNBOSTEL, ERICH M. VON 1905 Die Probleme der vergleichenden Musikwissenschaft. Zeitschrift der Internationalen Musikgesellschaft 7:85-97.
KUNST, JAAP (1950) 1959 Ethnomusicology. 3d enl. ed. The Hague: Nijhoff. First published under the title Musicologica. A supplement was published in 1960.
LOMAX, ALAN 1962 Song Structure and Social Structure. Ethnology 1:425-451.
MCALLESTER, DAVID P. 1955 Enemy Way Music: A Study of Social and Esthetic Values as Seen in Navaho Music. Harvard University, Peabody Museum of American Archaeology and Ethnology, Papers, Vol. 41, No. 3. Cambridge, Mass.: The Museum.
MALM, WILLIAM P. 1959 Japanese Music and Musical Instruments. Rutland, Vt.: Tuttle.
MERRIAM, ALAN P. 1964 The Anthropology of Music. Evanston, Ill.: Northwestern Univ. Press.
NETTL, BRUNO 1964 Theory and Method in Ethnomusicology. New York: Free Press.
NKETIA, J. H. KWABENA (1963) 1965 Drumming in Akan Communities of Ghana. New York: Humanities Press.
SACHS, CURT 1940 The History of Musical Instruments. New York: Norton.
SCHAEFFNER, ANDRÉ 1936 Origine des instruments de musique: Introduction ethnologique à l'histoire de la musique instrumentale. Paris: Payot.
SEEGER, CHARLES 1953 Preface to the Description of a Music. Pages 360-370 in International Society for Musical Research, Fifth Congress, Utrecht, 1952, Report. Amsterdam: Alsbach.
WALLASCHEK, RICHARD 1893 Primitive Music: An Inquiry Into the Origin and Development of Music, Songs, Instruments, Dances, and Pantomimes of Savage Races. London: Longmans.
II MUSIC AND SOCIETY
Music is an expression of inner life, an expression of feelings through the technique of composition, according to the rules of a certain musical style. As expression, music affects the listener as well as the player. It liberates feelings, but it also demands, on the part of the listener, receptiveness and an acquaintance with the style in question.

Music as communication

That music has affectual aspects was stressed in antiquity (in the Greek doctrine of the ethos), in the Middle Ages (musica movet affectum), and in the Baroque era (in the theory of emotions). Carl Philipp Emanuel Bach stated in 1753 that since a musician cannot move unless he himself is moved, he must be able to experience all the emotions that he wishes to awaken in his audience. He lets them know his feelings and, thus, arouses them to sympathy. This expressive character has been disputed by H. G. Nägeli (1826), who spoke of "arabesques," or an interplay of lines, in music, and by E. Hanslick (1854), who wrote that forms that are "moved tones" are the content of music and that the beautiful generates no emotions. This formal aesthetic is in contrast to the expressive aesthetic (Hausegger 1885). But it seems that forms that are merely moved tones, such as arabesques, possess an expressive character, as do all forms (Wellek 1963). All music, even "empty [not aiming at expression] play music," such as Oriental music, is movement and, as such, the expression of demonstrable, nervous, physical sensations. The rhythm of this movement stimulates the listener elementally, causing him to move with it. This is especially evident in dancing. Groups or masses of people can be brought to uniform movement, extending to ecstasy, by endlessly repeated rhythms. A child spontaneously follows a musical movement he hears by making expressive motions,
like those cultivated in the modern expressive dance. The educated concertgoer, to be sure, is trained from an early age to suppress these spontaneous sympathetic movements. It follows, therefore, that music has the character of communication. Sound spontaneously uttered by an individual serves as a contact sound, as a first step toward a call or a shout, or as a decoy, wooing, or warning call. Both speech and music develop symbols. Speech evolves ideas, which lead to thinking and logic. Music begins with emotional sounds, which are followed by signals and calls that serve different social purposes. Yet, even in the animal world we find a play of sounds that is unrelated to social purpose, as in the songbirds. Here we have an instinctual root of purposeless, aesthetic enjoyment. But much music is quite purposeful, integrated into a superordinate social process; it is so-called Gebrauchsmusik. March music enables a group to keep in step and in proper order (and also promotes turgor vitalis), as does dance music. The folk song even today reveals its social purposes in multifarious variety: cradle songs, war songs, courtship and love songs, serenades, religious songs, incantation and curing songs, and work songs. The last type has almost disappeared in our industrialized countries. Nor is it the oldest type of song, as K. Bücher believed (1896), for it presupposes the existence of rhythmic cooperative work. In present-day industry, music is employed as background music, not to speed up the working rhythm but to stimulate the autonomic nervous system and willingness to work. Schoolchildren doing their homework, and even scientists, employ allegedly soft background music, below the threshold of consciousness or aesthetic effect, as a stimulus to do their work.

Musical texts. In every musical performance the composer, the players, the singers, and the listeners interact with one another, often as semiparticipants in popular and exotic music, as in rhythmic clapping. 
Even when the performer himself does not invent, or improvise (as was the case in the past and in most present-day performances of music by preliterate peoples), but more or less freely reproduces the music invented by others (learned by ear in folk music and in Oriental music), or accurately performs music written by others (res facta, in the Middle Ages), an interpersonal process takes place. The folk song is invented by an individual, but it is much modified ("taken apart") by the singer. The motets of the fourteenth century were also modified by the singers. A composition was not regarded as the individual property of a single composer. Everyone
changed it ad lib, adding new voices with new texts, etc. Thus, the musical composition was regarded as common property, a notion that persisted into the seventeenth century and even the eighteenth, when George Frederick Handel took over the compositions of others. The concept of "plagiarism," applied to parts of a work as well as the whole work, is a modern concept. The contemporary practice of copyright is the end product of a long development. To be sure, there were privileges of printing granted by a sovereign, but they were respected only in part. Today even a motif is protected by copyright, although protection is limited to a term of 50 years. Even primitive peoples have a law of musical property. Among the Andaman Islanders an invented song remains the intellectual property of the inventor, for which he is recompensed during festivals, and no one is permitted to sing the song after his death. The same was true among the Iroquois. The present-day law of property covers not only the right to reprint but also the right of performance and, in particular, mechanical reproduction.

The musical professions

Today the musical professions are highly specialized. The primeval musician was creator, singer, and performer in one, as shown in such mythological figures as Jubal and Orpheus. Composers were always performing musicians as well: singers (Josquin) in the fifteenth and sixteenth centuries; singers and conductors (Monteverdi) in the seventeenth century; and pianists (Mozart, Beethoven) and other instrumentalists (Viotti, Spohr) or professional conductors (Wagner, Mahler, Richard Strauss) in the eighteenth and nineteenth centuries and down into the twentieth century. Today, however, specialization characterizes the musical professions, even within a single profession, dividing them into entire categories, such as "serious" and "entertainment" music. 
There is a great diversity of musical roles, running from the highly paid star conductor down to the street musician and the beggar playing music, e.g., with a barrel organ. Moreover, each musical profession has a social scale of its own. The status of a musician is based upon one of two factors: (1) the professional role, which in turn derives from education, cultural level, and the prestige of his audience, and (2) income. There is no correlation between these two factors. In addition to talent and endowment, career and success depend upon circumstances, which are often fortuitous, as well as upon reviews in the press. The occupational category of conductor covers all
degrees of education, depending upon the kind of music; there are conductors of opera, church, military, jazz, and entertainment orchestras, each group being subdivided along an artistic scale. As the status of the church musician diminished during the nineteenth century, that of the conductor rose extraordinarily. In Verdi's time the conductor was unnamed, ranking behind the singers of the opera. After World War I his name might be printed on posters in conspicuous letters, above that of the composer. Singers, too, are categorized: opera singers, concert singers, jazz singers, and singers of popular tunes. Outstanding singers have always enjoyed substantial popularity and financial success. This was true in classical antiquity but has especially been the case since the eighteenth century, when prima donnas and castrati dominated the musical scene. The earnings of Caruso (who died in 1921), which were regarded as enormous in his day, have been overshadowed by the sensational success of more recent hit-tune singers who become millionaires overnight. This is due to mass responsiveness and particularly to the mechanical reproduction and distribution of hit tunes. Artistic reputation and prestige are greatly differentiated, for example, in the profession of the female singer. Female musicians and singers of the lower categories often led dubious lives, sometimes becoming prostitutes (the Syrian ambubaiae in Rome, the mistresses of princes during the Baroque, chansonnières). Instrumentalists also occupy many different positions on the social scale, ranging from the violinist in the orchestra, who is further differentiated according to his position in the orchestra and the quality of the orchestra, up to the eminent soloist, who can count on income from concerts of his own. The same holds true for pianists and other instrumentalists. 
The independent instrumentalist is often a teacher of his instrument—either privately or in schools, conservatories, etc.—thus, improving his financial status and his prestige (gaining the title of "professor"). It was common practice for masters of the past (Handel, Mozart, Beethoven) to make a living at times by giving lessons. A musician's prestige, apart from the special prestige of his profession, has varied through the centuries and even today differs according to country and people. We know of whole hierarchies of musician castes in antiquity: in Babylon, in Egypt, in Judea. Music was often performed by slaves. What is strange is the frequently severe restrictions placed on the civic dignity of a musician in many countries and times, such as ancient Rome, in contrast with the high esteem in which musicians were
held among the Germanic peoples. The skald of the Nordic peoples and the skop of the western Germans were the close confidants of princes. As the Nordic tribes became Westernized, the musician inherited the low status of the Roman mimus, becoming a vagrant minstrel, a tramp, or a street singer. In all of these cases, the individual musician was often able to secure high esteem, wealth, and status at the courts of secular and even ecclesiastical princes. Only with the establishment of cities did the domiciled musician obtain a civil occupation. Celebrated musicians gained high honors. Some of them were raised to the ranks of the nobility (Hofhaimer, Hassler); others were awarded papal decorations (Lasso and Mozart becoming knights; Dittersdorf, Gluck, and Spontini becoming equites aurati). The fact that universities awarded honorary doctorates to celebrated composers contributed to increasing the prestige of musicians. The title of professor gave the top level of musicians the right of presentation at court. A pianist, Ignace Paderewski, even became prime minister of Poland in 1919. The status of the musician was just as indefinite in the eighteenth and nineteenth centuries as his prestige. Fame and riches did not always entail equal rights for many celebrated musicians. For example, Franz Liszt, a rich grand seigneur, who was raised to the ranks of the hereditary nobility, nevertheless encountered resistance at the court when he proposed to marry a princess. This discrepancy between fame as an artist and status in society prevails in parts of the Orient up to the present day. As recently as 1932, at a congress in Cairo, high government officials refused to sit at the same table with eminent musicians of their own country. The prejudice against the artist was reinforced when the "bohemian" type arose in the nineteenth century. Even today the artist is not highly respected among the bourgeois middle class. In a certain sense he is outside society. 
Secondary musical professions. Alongside creating, performing, and directing there are many important professionals who serve the institutional structure of musical life, such as publishers, printers, music engravers, impresarios, and critics. The invention of the printing of music from movable type, in about 1500, made possible the spread of new musical styles and of music of high artistic merit. Nevertheless, in the eighteenth century the sale of handwritten music (music noted down by the copyist) still predominated over engraved notes. It was only toward the end of the eighteenth century that music publishing developed as a commercial enterprise and influenced the style, distribution, and acceptance of compositions.

Impresarios and agents. The impresario, or entrepreneur of public performances, played an important role in the history of opera, but his importance has lessened ever since the high cost of opera made it an unprofitable commercial undertaking, so that it now has to be managed as a subsidized public institution. On the other hand, the entrepreneur, manager, and agent are important elsewhere in the musical world, providing the talent employed in concerts and other musical performances.

Critics. Another related profession is that of the music critic, who works for newspapers, magazines, the radio, etc., either as his principal occupation or as a sideline. Critical evaluation of performed music is found as far back as antiquity, the Middle Ages, and the Age of Enlightenment. Yet, as late as the eighteenth century, critics dealt primarily with musical texts; good examples are Mattheson (1722-1725) in the Critica musica and the French Encyclopedists. Most representative critics of the second half of the nineteenth century, for example, in their battle against Wagner, followed the same line. Contemporary criticism, on the other hand, emphasizes reproduction and performance. The critic is uncontradicted in the pages of his own newspaper, but public opinion, even when it goes against the opinions of the critic, usually triumphs in the end. No special study, no examinations are required for the profession of critic. His certificate of competence is the quality of his style, not special knowledge of the subject, which is often sadly lacking even in prominent critics. The history of criticism proves how greatly critics have erred.

Musicologists. Another musical profession is that of the musicologist who does his work in the quiet of the university. 
Musicology has made a significant contribution to the revival of music composed to order or on commission (Handel, Bach, and today Vivaldi and the music of the Baroque). Musical research in universities explores historical and aesthetic problems, as well as those dealing with instruments and performance.

Public musical life

The term "musical life" is generally taken to mean the total of all public and semipublic performances of music, rather than the private, intimate cultivation of music in the home. It involves, for the most part, the large musical institutions, that is, operas, orchestras, and choruses. These events
are sponsored by the government, the municipality, societies, associations, and commercial entrepreneurs.

Opera. Today the opera is the biggest and most costly musical institution. As an art form, it is a stylized, special case of the theatrical play with music, which is to be found among all peoples and in all periods of history. Opera was begun in the West, by humanist circles in Florence, in 1594, as an aesthetic experiment in recreating the drama of antiquity, which used music to support the text. It developed in the sacred opera of the cardinals' palaces in Rome, in the impresario opera of Venice, including carnival farces and pantomimes, and in the royal opera of European sovereigns. These last two types continue to exist. Opera managed by an impresario was started in Venice by Ferrari in 1637, as a profit-sharing venture among the members of his company; later, in London, between 1720 and 1728, Handel's opera companies took the legal form of joint stock companies. Court opera, presented before members of the court seated hierarchically in the traditional tiered theater, was wholly subsidized by the crown. The musical theater of the people took several forms: the bagatelliste drama, the commedia dell'arte, and the Singspiel. Starting with the caricature of Handel's opera seria in The Beggar's Opera, the Singspiel catered increasingly to the taste of the middle and petty bourgeois, especially in Germany. Singspiel troupes at first often played under the most wretched circumstances, but eventually this theater became a dangerous competitor to the court theater. The Singspiel disseminated the mood of the French Revolution in the theaters of the suburbs, but the political element was often secondary to pantomime and myth (e.g., E. Schikaneder's 1791 Singspiel, Die Zauberflöte, with music by Mozart). 
A combination of the two types of opera were the numerous traveling opera troupes of the impresarios, which were often engaged by princes to play in their court theaters. The era of the Baroque court opera came to an end about 1760. Eight years earlier Empress Maria Theresa had turned her court theater over to the city of Vienna to be operated by the municipality. The French Revolution turned the opera into political propaganda, glorifying revolutionary ideas and heroes, a counterpart to the earlier glorification of princes during the Baroque era. Most of the European monarchies established state opera houses in their principal cities. Many of these have remained active. Because of its many
small principalities, Germany became the country with the largest number of opera houses and orchestras. In 1963 the German Federal Republic, including West Berlin, had 132 theaters with a total attendance of 6.3 million (including 2.4 million at operetta performances), 36 theater orchestras, and 41 independent orchestras, including 9 radio orchestras. East Germany had 86 theaters in 1962, with a total attendance of 3.4 million at the opera and 2.8 million at operettas. Russia had opera performances for the imperial court under Peter the Great, and a public state opera house was founded in Moscow in 1806. Alongside the state opera houses there were princely opera houses and, as late as 1900, private opera houses of princes. Bulgaria, Rumania, and Czechoslovakia have state opera houses. State opera houses were established in Scandinavia fairly late: in 1936, in Sweden, and in 1958, in Norway. Italy, the classic land of opera enterprise, established a royal opera house only in 1929. The opera companies are subsidized on a very lavish scale from the receipts of the amusement tax, for the high cost of operas makes the running of an opera company financially unprofitable. La Scala, Milan, for example, has annual box-office receipts of 1,900 million lire and since 1936 has received an annual subsidy of some 880 million lire. In 1958 the Royal Opera House, Covent Garden, received an inadequate subsidy of £63,000, although £500,000 had been requested. The last big private opera company in Germany was Angelo Neumann's Wagnerian company in the 1880s. In France the major arts have always been centered in Paris, where the grand opera received a subsidy of 20 million francs in 1958. The oldest established opera company in the United States is the Metropolitan Opera in New York, founded in 1884. Like the opera in Chicago, it was initially financed by wealthy businessmen. American opera houses are still dependent upon patrons and foundations. 
In the 1950s there were about 600 opera organizations in 47 states, 25 per cent being professional organizations and the others affiliated with clubs, churches, studios, and colleges. Operas have been performed in colleges, in high schools, in conservatories, and even in elementary schools. The support of large-scale artistic enterprises, whether by princes, institutions, wealthy patrons, or the state, means that these enterprises must conform, within varying limits, to the values of their patrons and publics. There are substantial differences between institutions in the type of
financial support, the artistic achievement, and the intellectual level, depending upon the class of society that supports or attends them.

Operettas and musicals. The Singspiel troupes in the eighteenth century called their plays operettas. However, the modern operetta, which achieved popular success in the second half of the nineteenth century in the Viennese operettas of Johann Strauss and the Parisian ones of Jacques Offenbach and was performed in independent operetta theaters run by impresarios, tried to advance from a provincial style to a quasi-operatic style (e.g., Franz Lehár). In America there developed the musical comedy, designed as light entertainment in a popular musical idiom. The Archers, performed in 1796 in the John Street Theater, New York, may be regarded as the first of this type. Musical comedies have run for years with enormous success in the Broadway theater of the twentieth century, reaping millions for their investors (for example, My Fair Lady, book by A. J. Lerner after Shaw's Pygmalion and music by F. Loewe, ran for six years, from 1956 to 1962).

Orchestras. Orchestras are among the major public musical institutions; some of them are the most important components of the opera houses mentioned above. In the nineteenth and twentieth centuries large orchestras (a hundred men or more) acquired a mass audience in the concert halls of major cities. They are the outgrowth of the chamber orchestras of the seventeenth century, which often consisted of no more than twelve to sixteen men. Enlargement of the orchestras was promoted in Germany by the staging of patriotic celebrations of the Napoleonic Wars and the Congress of Vienna. In England large orchestras had played at the Handel festivals since 1784, and in France gigantic orchestras were assembled by Berlioz in connection with the political demonstrations and events of 1830, 1837 (an orchestra of 146 plus 4 auxiliary orchestras), 1841, 1848, and 1851. 
Speculative enterprises, such as the enormous orchestra used for an American performance of Johann Strauss, with 20,000 singers and players and 100 assistant conductors before an audience of 100,000, represented an extreme form of musical entertainment. In 1965 the United States had 1,385 symphony orchestras of various sizes. Of the top 100 orchestras, 10 were in existence before 1900; 15 were founded between 1900 and 1920; 54 between 1920 and 1940; and 21 after 1940. Of the 77 orchestras in the German Federal Republic today, 40 (not including the radio orchestras) have 60 to 100 players. The other European
countries do not have as many large orchestras. In France there are five large orchestras, four in Paris and one in Strasbourg. In England, London has three orchestras, and there are orchestras in Liverpool, Birmingham, Bournemouth, and Glasgow; London also has one opera orchestra and three large radio symphony orchestras. Milan, Italy, has the orchestra of La Scala and a symphony orchestra; Rome has one symphony orchestra. There are two radio orchestras, one in Rome and the other in Turin. Groups—in fact, masses—of performers have traditionally served as demonstrations of the power of kings, princes, and feudal lords, and more recently of governments. Musicians, usually trumpeters and drummers, announced the appearance of rulers in preliterate Africa and in advanced cultures on ceremonial occasions. Alongside the highly trained orchestras we find orchestras of all musical and social ranks: opera, concert, operetta, vaudeville, circus, coffeehouse, and parade orchestras, as well as those playing at dances and in parks. The social status of orchestra members varies, ranging from those employed by the state or big private enterprises down to those employed by municipalities, to musicians in private employ either in institutions that receive government subsidies or guarantees or in unsubsidized theater orchestras, and finally to the less highly trained musicians, who have to depend on occasional employment. Military musicians often represent competition for independent orchestra players. Changes in the technology of public entertainment may have serious consequences for independent musicians; one instance was the catastrophe that hit musicians employed in movie theaters when the sound film was introduced in 1930. Choral music. The evolution of the modern chorus parallels that of large-scale organizations in general in Western societies. The choruses of the sixteenth and seventeenth centuries cannot be compared in size with the choruses of today. 
They consisted of a few professional singers (as in the Sistine Chapel in Rome), with no more than two to four singers in each choir, except on special royal occasions. The modern chorus received a mighty organizational impetus from the French Revolution, in which choruses played a political role. In 1795 it was proposed to the National Convention that national holidays be celebrated by choeurs universels. Mass choruses, with 2,400 men and women, were organized in Paris, in 1784. They set the example for mass choruses in subsequent revolutions. In Germany and France, between the Napoleonic Wars and the revolutions of 1830 and 1848, national enthusiasm and socialist trends led to the establishment of choruses, male choruses, and choral organizations, which performed at choral festivals. In choral activity of this period, the democratic ideal of a fusion of the various walks of life, from the commoner to the nobleman, seems to have been achieved in Germany. In the decades that followed, choruses were again divided according to class and occupation (e.g., teachers' choral societies, printers' choruses, etc.). The choral movement was furthered in France after the revolution, especially by G. Wilhem, and totaled 60,000 members in 1,500 orpheons by 1893. In the same period there were monster performances in England with over 4,000 participants, half of whom were singers. Most contemporary choruses are organized as societies, which in turn are organized into associations. Germany has 15,000 choruses, with a total membership of 1.3 million, their social and civic significance outstripping their aesthetic achievements. The Scandinavian countries, Finland in particular, are also rich in choruses. In Italy large mixed choruses have evolved slowly because the association of adult men and women, which is so prominent a feature of choral life elsewhere, conflicted with the prevailing Italian pattern of the relations of the sexes. Only since the end of the nineteenth century have women been admitted to the church choirs in Roman Catholic countries. Choral singing has always been popular in England. The major choruses include the Royal Chorus Society, founded in 1873; the Bach Choir, founded in 1876; and the Goldsmith's Choir Union, founded in 1932. An international choral festival, the Eisteddfod, was founded in 1947 in Wales. In addition to the choral societies, in which the artistic aim often is secondary to the desire for conviviality, there are the professional choruses. 
These include the little master choirs of the Roman Catholic and Protestant churches in the sixteenth and eighteenth centuries, the opera choruses of modern times, and outstanding national choirs, such as the Association Chorale Professionelle in France and the Soviet State Chorus in Russia, which continues the tradition of the liturgical choirs and of the pre-1919 Moscow Synodal Choir. In Russia, there also are the first-class and highly paid choruses of the radio networks, newly established everywhere. Ecclesiastical music. Churches are important in musical culture as public institutions, although here music is not an end in itself. They maintain
choirmasters, organists, and musicians. They publish hymn books, and besides their significant influence upon broad, popular musical culture, they stage major events themselves, including services with orchestral and choral accompaniment and church concerts that are outside the liturgical framework (or else they make the churches available for such performances). Music has always been an element of religious worship. In ancient times the performance of music was reserved to persons of cultic importance: priests and magicians. Advanced theocratic civilizations, such as those of the Babylonians, the Sumerians, the Egyptians, and the Jews, had an extensive hierarchical system of religious musical culture. The recital of sacred texts by singing them is found throughout the world, not only because the singing voice enhances the texts but also because of music's magical effect. From the days of the church fathers down to the present, there has been conflict between the concerns of the church musician, who wishes to elevate the faithful with his music, and the preachers, who regard music that is too copious as a distraction from religious meditation. That is why Calvin and Zwingli, for example, forbade all music with the exception of the chorale. Even today the playing of instruments is restricted in the Roman Catholic church, and their use has been governed by numerous edicts for centuries. In the Roman Catholic church, singing, except for the little permitted the congregation, was until recently reserved to the choir of priests, who, from the theological standpoint, are representatives of the angelic choir. Luther introduced singing by the whole congregation (again theologically based on the evangelical approach). Today, the significance of church music is confined to the church itself; yet, its influence is still great. Choirmasters (called church music directors in many churches) and organists train lay church choirs. 
In the missions (as well as in the Negro churches of the United States), the church makes considerable use of the performance of ethnic music. The basic conservatism of the churches in general entails greater cultivation of traditional music, discouraging the development of any generally effective new musical styles. The training of musical taste and skill. Musical education is indispensable to the general cultivation of music. There is no doubt that here it fulfills an important function: bringing up children to be members of society. Musical education is conducted mostly in the schools, where a certain training of children to enjoy music made by singing together, as well as a modest degree of musical instruction,
is an important factor in the development of the musical culture of any society. Little time, however, is scheduled for instruction in music in most countries, and only the wealthier classes of society can afford private teachers of music for their children. Music schools serve primarily to train performing musicians. (The word "conservatory" is derived from the word for the orphan asylums in Venice and Naples during the eighteenth century, in which music gradually became the focus of activity.) In most countries there are "institutes of music" for training the musical elite.

Musicians' associations. Associations of musicians existed even in antiquity (e.g., the association of "Dionysian artists"). The first medieval organizations of musicians, founded in the twelfth and thirteenth centuries, were religious in nature (St. Bartholomew's Hospital in London, St. Nicolaibrüderschaft in Vienna). In 1657 the guilds and corporations of musicians in 43 localities of central Germany united in an "Instrumental-Musikalisches Collegium," sanctioned by Emperor Ferdinand III. Trumpeters and drummers constituted a separate caste, a "noble guild," in the royal courts and armies; this was so even in ancient Rome. Down to the nineteenth century, organized musicians tried to defend their privileges against the unorganized (for example, by playing at weddings, etc., with a restricted number of instruments). Starting in 1808, the Prussian state contributed to the security of musicians and their families, and voluntary welfare agencies for widows, orphans, and pensioners were promoted by Spontini in Berlin in 1842 (and in Vienna by the Tonkünstlersocietät of Gassmann as early as 1770). Like all social insurance, these programs signified a strengthening of the musical professions in their struggle for social recognition and, thus, strengthened the self-confidence of musicians in general. Musicians' unions, as distinct from artistic organizations, are a comparatively recent phenomenon. 
In Germany, the Allgemeiner Deutscher Musikerverband was founded in 1872. Since 1952 the main organization has been the Deutsche Orchester-Vereinigung in der Deutschen Angestellten-Gewerkschaft; it had a membership of 5,676 in 1960 out of a total of about 6,000 orchestra musicians. In England the Incorporated Society of Musicians was established in 1882; in France there is the Syndicat National des Artistes Musiciens, and in the United States, the American Federation of Musicians. Alongside the professional organizations there are also associations of amateurs and patrons of music. At first, these societies did not intend to give public concerts, but only "practice concerts."
Later on, they were succeeded by organizations that gave public concerts (for example, the Concerts of Ancient Music between 1776 and 1848, in England; Le Concert Spirituel between 1725 and 1791, in France; and the Big Concerts in Leipzig, dating from 1743, which were continued in 1781 as the Gewandhaus Concerts). The Allgemeine Deutsche Musikverein, 1861-1937, had as its purpose the "cultivation of music and the advancement of musicians." The Gesellschaft der Musikfreunde was founded in 1812 in Vienna. In England, the New Philharmonic Society was active from 1852 to 1897, and the National Federation of Music Societies was started in 1935. In the United States, there were the Handel and Haydn Society (founded in 1815), the Musical Alliance of America (founded in 1917), and, for orchestral music, the Philharmonic Society of New York (founded in 1842). There is hardly any aspect of musical life that has no organization; school musicians and teachers of music (the Music Teachers' Association, founded in 1876, in the United States), composers, music dealers, instrument makers, etc., all have their own organizations. There are even international bodies such as the International Music Council, established in 1949, which holds national and international congresses; 39 national committees are affiliated with this international organization. Other international organizations include: the Fédération Internationale des Jeunesses Musicales (founded in 1947), the International Folk Music Council (founded in 1947), the Confédération Internationale des Sociétés des Auteurs et Compositeurs (founded in 1926), and the Fédération Internationale des Musiciens (founded in 1948).

Audience and performance

There are various forms of music-playing for larger audiences. The playing of folk music unites the performers, singers, and players.
Either there is no distinct audience at all or some of the listeners become participants, for instance, by clapping in rhythm or by joining in a song or its refrains. On a higher artistic plane, music is performed for an audience that does nothing but listen. At big dances, however, the listeners are also the dancers who express the music rhythmically. This distinction among singers, players, and mere listeners is further subdivided into two categories: "familiar music-playing," represented in the past by the regular performance of music by town musicians, performances at church festivals, court music, music played at table, etc., and the modern "performance," which presupposes thorough study of the music to be played, by orchestras and ensembles. Mozart played his own piano concertos without any rehearsal, and Beethoven's symphonies were played (by amateurs) in big concerts without rehearsal. In these amateur concerts, practice concerts, and glee clubs of the eighteenth century, the relatives of the performers came not only to hear the music but to play cards and smoke as well.

Concerts and publics. As early as the sixteenth and seventeenth centuries, musicians played private concerts and advertised them in the newspapers. In the eighteenth century, as with the opera, which was sometimes open to the public upon payment of an admission fee, there were public concerts open to anyone who paid for admission. These concerts were held at the same time as the concerts for invited guests of the aristocracy and the wealthy of Paris and were often staged on a splendid scale. Ever since the middle of the eighteenth century, however, the concert open to the wide public has been the standard form for the performance of music. The admission ticket is a contract of sale. (The German youth music movement has criticized this form of concert since 1914, believing that it entails the danger that the listener, excluded from active participation, might become inwardly inactive as well. The youth movement advocated "open singing," with the active participation of all the listeners, in opposition to the concert form. It wanted to experience music in a community, with the participation of all those present.) The musician presenting his art tries to gain a "public." This may be a homogeneous audience that is linked to the performers. Such is the case, for example, in a concert of active or passive members of a society, in a school concert, in a concert for a public united in support of a particular artistic goal, etc.
However, the persuasiveness of music is required to establish a community of musical experience in a concert for a metropolitan public, which is brought together by interests not all of which are artistic and which is not at all uniform in taste. In this case, the purely creative social force, the "sociability" of music, is probably only transitory and hard to estimate. It is hyperbole to speak of the creative social force of Beethoven's symphonies. Rather, like all works of art, these symphonies are the work of an individual genius, but they also express the general feelings of mankind (or at least those of a national group during a certain epoch).

The stratification of musical activity

Musical performances differ in the socially different strata of audiences according to the quality of the performance, the artistic and social strivings
of the musicians, the magnitude of the performance (number of performers and type of music), the content of the repertoire, and the style and age of the works performed. They are further divided into two categories: serious and entertainment music. Serious-music performances include symphony concerts, choral concerts, oratorios, recitals, evenings of lieder, and programs of church music. Entertainment-music performances include folk-music and military-band concerts, concerts in public squares, and beer-garden concerts featuring light programs, dances, marches, jazz, and popular singing. Public performances requiring tickets of admission constitute a classification of the listeners according to the price of admission, fixed by the listeners' means. The performances themselves differ, and the ticket effects a spatial stratification within the concert hall itself. The motive for operagoing or concertgoing often is the desire for social contacts and, in the case of expensive concerts and the higher-priced seats, the desire to be seen and to gain or maintain prestige, alongside the interest in the music and often ahead of it (in the case of the "snob"). The optional or frequently prescribed dress for the audience (black tie, evening gown) results in social gradations of the performance, as does the spatial allocation within the hall (orchestra seats versus balcony). Not only does the cost of admission act as a hindrance to lower-income groups, but the social level of the audience (education, dress) may hinder outsiders from attending the concert. The statistics available on the stratification of the radio audience, however, indicate that the inexpensive opportunity of listening to music and the elimination of social shyness do not prevent a quite general stratification of the listeners. "Serious" or "heavy" music (usually called classical music for the sake of simplicity) is generally preferred by those of higher education.
According to these statistics, the desire to hear serious music and the understanding thereof grow with the level of general education and, correspondingly, with musical training. Audience organizations in all countries are endeavoring to lift the financial barriers for the bulk of the population. Yet, serious music cannot easily be made "accessible to the people." The problem of making "true folk music" widely accessible must also be regarded with skepticism in industrialized countries with a predominantly urban population. Folk music lives on in isolated regions and loses its character when it becomes a school song or is arranged for choral singing. Mass communication, with some 300 million
radios in the world and about four times that many listeners, represents a totally new factor in bringing the masses into contact with different types of music. Programs are classified according to content and are broadcast at times that make allowance for the listeners' social status and working hours. (The phonograph record and the jukebox are related forms of mass communication.) Music is disseminated far more widely than at any previous time. This is paralleled by lower intensity; listening to music grows duller and shallower. Mass communication is disseminating Euro-American music among non-Western peoples. Regrettably enough, exogenous entertainment music often displaces these peoples' indigenous music. A mixed style that approaches European music is already developing in the non-Western countries that have traditional musical styles of their own.

HANS ENGEL

BIBLIOGRAPHY

ADORNO, T. W. 1962 Einleitung in die Musiksoziologie: Zwölf theoretische Vorlesungen. Frankfurt am Main (Germany): Suhrkamp.
ALLEN, WARREN D. (1939) 1962 Philosophies of Music History: A Study of General Histories of Music, 1600-1960. New York: Dover.
BAB, JULIUS 1931 Das Theater im Lichte der Soziologie. Leipzig: Hirschfeld.
BAUMOL, WILLIAM J.; and BOWEN, WILLIAM G. 1966 Performing Arts: The Economic Dilemma. New York: Twentieth Century Fund.
BECKER, HOWARD S. 1951 The Professional Dance Musician and His Audience. American Journal of Sociology 57:136-144.
BLAUKOPF, KURT (1950) 1951 Musiksoziologie: Eine Einführung in die Grundbegriffe mit besonderer Berücksichtigung der Soziologie der Tonsysteme. Vienna: Verkauf.
BLAUKOPF, KURT 1952 Musiksoziologie, Bindung und Freiheit bei der Wahl von Tonsystemen. Pages 237-257 in Carl Brinkmann (editor), Soziologie und Leben: Die soziologische Dimension der Fachwissenschaften. Tübingen (Germany): Wunderlich.
BONNOT, RENÉ 1960 Sociologie de la musique. Volume 2, pages 297-298 in Georges Gurvitch (editor), Traité de sociologie. Paris: Presses Universitaires de France.
BÜCHER, K. (1896) 1902 Arbeit und Rhythmus. 3d ed. Leipzig: Teubner.
CROSTEN, WILLIAM L. 1948 French Grand Opera: An Art and a Business. New York: King's Crown Press.
ENGEL, HANS 1933 Musik, Gesellschaft, Gemeinschaft. Zeitschrift für Musikwissenschaft 17:175-185.
ENGEL, HANS 1942 Der Musiker: Beruf und Lebensformen. Pages 180-205 in Von Deutscher Tonkunst: Festschrift zu P. Raabes 70. Geburtstag. Edited by Alfred Morgenroth. Leipzig: Peters.
ENGEL, HANS 1952 Das Chorwesen in soziologischer Sicht. Zeitschrift für Musik 8:433-439.
ENGEL, HANS 1960 Musik und Gesellschaft: Bausteine zu einer Musiksoziologie. Berlin: Hesse.
FARNSWORTH, PAUL R. 1958 The Social Psychology of Music. New York: Dryden.
FELLERER, KARL G. 1963 Soziologie der Kirchenmusik: Materialien zur Musik- und Religionssoziologie. Cologne (Germany): Westdeutscher Verlag.
FISCHER, KARL A. 1951 Kultur und Gesellung: Ein Beitrag zur allgemeinen Kultursoziologie. Schriften der soziologischen Abteilung des Forschungsinstitutes für Sozial- und Verwaltungswissenschaften in Köln, No. 2. Cologne (Germany): Westdeutscher Verlag.
FRANCASTEL, P. 1960 Problèmes de la sociologie de l'art. Volume 2, pages 279-296 in Georges Gurvitch (editor), Traité de sociologie. Paris: Presses Universitaires de France.
GROUT, DONALD J. 1960 A History of Western Music. New York: Norton.
HANSLICK, EDUARD 1854 Vom Musikalisch-Schönen: Ein Beitrag zur Revision der Aesthetik der Tonkunst. Leipzig: Weigel.
HAUSEGGER, FRIEDRICH VON (1885) 1887 Die Musik als Ausdruck. 2d ed. Vienna: Konegen.
HOFSTÄTTER, PETER R. (1956) 1964 Sozialpsychologie. 2d ed. Berlin: Gruyter.
HONIGSHEIM, P. 1958 Soziologie der Kunst, Musik und Literatur. Pages 338-373 in Gottfried Eisermann (editor), Die Lehre von der Gesellschaft: Ein Lehrbuch der Soziologie. Stuttgart (Germany): Enke.
KLAUSMEIER, FRIEDRICH 1963 Jugend und Musik im technischen Zeitalter: Eine repräsentative Befragung in einer Westdeutschen Grossstadt. Bonn: Bouvier.
KNEIF, TIBOR 1966 Gegenwartsfragen der Musiksoziologie: Ein Forschungsbericht. Acta musicologica 38:72-118.
LENZ, FRIEDRICH 1952 Einführung in die Soziologie des Rundfunks. Emsdetten (Germany): Lechte.
MACKERNESS, ERIC D. 1964 A Social History of English Music. London: Routledge.
MATTHESON, JOHANN 1722-1725 Critica musica. 2 vols. Hamburg (Germany): No publisher given.
MEYER, ERNST H. 1952 Musik im Zeitgeschehen: Grundprobleme der Musiksoziologie. Berlin: Henschel.
MOSER, H. J. 1960 Die Tonsprachen des Abendlandes: Zehn Essais als Wesenskunde der europäischen Musik. Berlin: Merseburger.
MUELLER, JOHN H. (1951) 1958 The American Symphony Orchestra: A Social History of Musical Taste. London: Calder.
NÄGELI, HANS G. 1826 Vorlesungen über Musik. Tübingen (Germany): Cotta.
NETTEL, REGINALD 1946 The Orchestra in England: A Social History. London: Cape.
OLKHOWSKY, ANDREY V. 1955 Music Under the Soviets: The Agony of an Art. New York: Praeger.
PINTHUS, GERHARD 1932 Das Konzertleben in Deutschland: Ein Abriss seiner Entwicklung bis zum Beginn des 19. Jahrhunderts. Strassburg: Heitz.
PREUSSNER, EBERHARD (1935) 1950 Die bürgerliche Musikkultur: Ein Beitrag zur deutschen Musikgeschichte des 18. Jahrhunderts. 2d ed. Kassel and Basel: Bärenreiter.
PROESLER, HANS; and BEER, KARL 1955 Die Gruppe; The Group; Le groupe: Ein Beitrag zur Systematik soziologischer Grundbegriffe. Berlin and Munich: Duncker & Humblot.
REINOLD, H. 1955 Musik im Rundfunk. Kölner Zeitschrift für Soziologie und Sozialpsychologie 7:233-246.
RÉVÉSZ, GÉZA (1946) 1954 Introduction to the Psychology of Music. Norman: Univ. of Oklahoma Press. → First published in German.
SIEGMEISTER, ELIE (editor) 1938 Music and Society. New York: Critics Group. → Also published in German in 1948.
SILBERMANN, ALPHONS (1957) 1963 The Sociology of Music. London: Routledge. → First published as Wovon lebt die Musik? Die Prinzipien der Musiksoziologie.
SLOTKIN, J. S. 1943 Jazz and Its Forerunners as an Example of Acculturation. American Sociological Review 8:570-575.
THOMAS, HANS A. 1962 Die deutsche Tonfilmmusik: Von den Anfängen bis 1956. Gütersloh (Germany): Bertelsmann.
[VIAN, BORIS] (1958) 1966 En avant la zizique . . . et par ici les gros sous, by Vernon Sullivan [pseud.]. Paris: Jeune Parque. → Essay on the popular music industry, by an experienced song writer and recording company executive.
WEBER, MAX (1921) 1958 The Rational and Social Foundations of Music. Carbondale: Southern Illinois Univ. Press. → First published in German.
WELLEK, ALBERT 1963 Musikpsychologie und Musikästhetik: Grundriss der systematischen Musikwissenschaft. Frankfurt am Main (Germany): Akademische Verlagsgesellschaft.
WOODFILL, WALTER L. 1953 Musicians in English Society From Elizabeth to Charles I. Princeton Univ. Press.
MYRES, JOHN LINTON

Sir John Linton Myres (1869-1954), a historian of classical antiquity, showed in his work a knowledge and competence gained from other disciplines, notably geography, anthropology, and archeology. The breadth of his interests led him to anthropology, at that time a broad, inclusive study that combined both humanist and scientific orientations, and he devoted much of his long life to furthering its cause in institutional ways: by his long association with the Royal Anthropological Institute, by founding and editing the journal Man, by helping to extend the teaching of anthropology at Oxford, and by organizing various national and international conferences and congresses. For most of his life Myres lived and worked at Oxford, where from 1910 until 1939 he was Wykeham professor of ancient history. His central interest as a scholar was the origin and development of Greek civilization, and his most important book, Who Were the Greeks? (1930), was a contribution to this theme. First delivered as the Sather lectures at the University of California at Berkeley, this book might be described as an inquiry into the ethnological origins of Greek culture: Myres drew his data from geography, physical anthropology, comparative philology, and archeology, as well as from the traditions and beliefs recorded in Greek literature. In his general conclusions he emphasized the heterogeneity of Greek origins and the
processes of selection and adaptation that occurred to produce the seeming unity of the Greek people in its "great age." Myres excelled in this kind of cross-disciplinary study, his own particular contribution being to show the relevance of geography and history for the development of culture. It was this theme that he took up in his Frazer lecture, entitled "An Essay in Geographical History" (see 1943), and that underlay also the collection of essays published shortly before his death, Geographical History in Greek Lands (1953). His other notable books are The Dawn of History (1911) and The Political Ideas of the Greeks (1927). In the field of classics Myres' scholarly achievement was substantial, but in anthropology his influence lay perhaps to a greater extent in his enthusiasm for, and promotion of, the subject and in the opportunities that he created for others. Thus, when Myres was recorder of the anthropological section for the 1899 meeting of the British Association for the Advancement of Science, he wrote to another Oxford-trained classicist, R. R. Marett, asking him to enliven a potentially dull meeting with something "really startling"; for the occasion Marett produced his paper "Pre-animistic Religion," which had indeed the desired effect and brought fame to Marett [see the biography of MARETT]. In the following year, as honorary secretary to the Royal Anthropological Institute, Myres conceived the need for a journal which would report on recent work in the field of anthropological studies and would provide a place for general discussion through shorter articles and reviews. As a result, Man: A Monthly Record of Anthropological Science started publication in 1901. Myres became the first editor, and the form and policy of the journal were largely shaped by him. He was editor from 1901 to 1903 and again from 1931 to 1946. At Oxford, E. B. 
Tylor had been lecturing in anthropology since 1884, but there was no separate department or school until, in the first years of this century, Myres, with others, helped to create such a school and to establish the diploma course in anthropology. He became the first secretary to the committee for anthropology, which the university set up in 1905 to administer the teaching of the course. In 1908 he contributed to a course of public lectures that Marett (who had succeeded him as secretary to the committee) had arranged to stimulate an interest in anthropology. The lectures, published as Anthropology and the Classics, are an interesting reflection of the subject as it was then conceived; some of the other speakers were Arthur J. Evans, Andrew Lang, Gilbert Murray, and F. B. Jevons. In 1912 Myres, with Barbara Freire-Marreco, one of the first pupils in the diploma course, prepared a new edition of Notes and Queries on Anthropology. From 1919 to 1932 Myres was the honorary general secretary to the British Association for the Advancement of Science. He was vice-president of the Royal Anthropological Institute from 1921 to 1923 and thereafter continued to serve as an active member on committees until 1928, when he was elected president, an office he held for the very long period of three years. He was active in the creation of the International Congress of Anthropological and Ethnological Sciences and served as honorary general secretary of the group from its first meeting in 1934 until 1947.

M. J. RUEL

[See also the biographies of FRAZER; MARETT; TYLOR.]

WORKS BY MYRES
1908 Herodotus and Anthropology. Pages 121-168 in Robert R. Marett (editor), Anthropology and the Classics. Oxford: Clarendon.
1911 The Dawn of History. New York: Holt.
1912 BRITISH ASSOCIATION FOR THE ADVANCEMENT OF SCIENCE Notes and Queries on Anthropology. 4th ed. Edited by Barbara Freire-Marreco and J. L. Myres. London: Routledge. → First published in 1874. A sixth, revised edition was published in 1954.
1927 The Political Ideas of the Greeks, With Special Reference to Early Notions About Law, Authority, and Natural Order in Relation to Human Ordinance. London: Arnold; New York: Abingdon.
1930 Who Were the Greeks? Berkeley: Univ. of California Press.
1943 Mediterranean Culture. Cambridge Univ. Press.
1953 Geographical History in Greek Lands. Oxford: Clarendon. → Includes a bibliography of Myres' works.

SUPPLEMENTARY BIBLIOGRAPHY

John Linton Myres: 1869-1954. 1954 Man 54:37-43. → Memorial tributes by Raymond W. Firth and others.
MYTH AND SYMBOL

Myths treat of origins but derive from transitions. By "myths" I do not, of course, mean Märchen, fairy tales, folk tales, Sagen, or legends but sacred narratives telling, as Stith Thompson writes, "of sacred beings and of semi-divine heroes and of the origins of all things, usually through the agency of these sacred beings" (1946, p. 9). Myths relate how one state of affairs became another: how an unpeopled world became populated; how chaos became cosmos; how immortals became mortal; how the seasons came to replace a climate without seasons; how the original unity of mankind became a plurality of tribes or nations; how androgynous beings became men and women; and
so on. Myths are liminal phenomena: they are frequently told at a time or in a site that is "betwixt and between."

Myths as liminal phenomena

When Arnold van Gennep generalized the processual structure of rites de passage (1909), he opened up many lines of investigation that have not as yet been fully exploited. Van Gennep suggested a threefold progression of successive ritual stages: separation, margin (or limen), and aggregation. Many structural and cultural problems are posed by the liminal stage. The individual or group undergoing rites de passage is, during the liminal period, neither here nor there but in limbo. The individual initiand is no longer the incumbent of a culturally defined social position or status but has not yet become the incumbent of another. If a whole social group is in ritual transition, there is frequently an annulment or invalidation of the distinctive arrangement of specialized and mutually dependent positions that composed its preritual structure; nor as yet has its postritual structure been anticipated. The protracted liminal periods found to be marked by collective rites in preliterate societies are not without structure; rather there is a simplification and generalization of structure. The complexities of stratification and segmentation are replaced by dyadic oppositions of instructors and instructed: the interstructural situation often may also be an instructional situation. Initiators collectively confront initiands, and among the initiands there is usually complete equality of status. Preritual distinctions of kinship, wealth, rank, or age are temporarily invalidated. Correlated with these structural changes, the symbols of liminality frequently represent such ideas as death and birth. The loss of preritual status or structural arrangements is interpreted as "death," the growth toward a new status or articulation as "birth" or "infancy."
The loss of status may be emblematized by ritual nudity, or the group's social homogeneity may be emphasized by the wearing of some uniform ritual decoration or dress. The passive attitude of the male initiands may be symbolized by the wearing of female apparel. The absence of status distinctions may be shown further by the use of postures expressive of humility or by decorating the body with earth or ashes. The social invisibility of the initiands may be signified by their total or partial seclusion from the habitats and occasions of secular life, by rules enjoining them to be silent for long periods, and by strange disguises. Liminality is thus a period of structural impoverishment and symbolic enrichment. It is essentially a period of returning to first principles and taking stock of the cultural inventory. To be outside of a particularized social position, to cease to have a specific perspective, is in a sense to become (at least potentially) aware of all positions and arrangements and to have a total perspective. What converts potential understanding into real gnosis is instruction. Instruction takes many forms: it is partly communicated through displays of sacra objects which are shown to the initiands and explained, sometimes with the aid of sacred myths; it partly takes the form of direct ethical instruction, although this is rarely the case in primitive ritual; and very often the cultural knowledge is transmitted by the recital of mythical narratives. It must be remembered that like all ritual phenomena and processes, such sacra, such gnosis, and such myths are felt by those who believe them to have ontological efficacy. They re-create or transform those to whom they are shown or told and alter the capacity of the initiand's being so that he becomes capable of performing the tasks of the new status ahead of him. It is not simply a cognitive restructuring that takes place, nor is it solely a ritual legitimization of the initiand's new social status; rather the rites, myths, and symbols are felt to have something akin to a salvific power: without the ontological aspect the initiand would be "lost"; he would not be able to perform even the physical acts appropriate to his new status nor to fulfill the ritual component of this new status. For example, unless a girl has been ritually "grown" into a woman, as the Bemba put it (Richards 1956), many aspects of adult sexuality will present dangers for her. Thus knowledge, including knowledge imparted by myth, "saves."

Even where myths are not bound to rites, they have a liminal character. Most of them have a genetic or critical reference.
They refer to how things came to be what they are; they are not mere inventories or rules of behavior. They further refer, directly or indirectly, to the biological life-crises of birth, mating, disease, and death. They relate also to climatic or ecological changes, which always involve a restructuring of social relationships, with the possibility of conflict and disorder. The well-known amorality of myths is intimately connected with their existential bearing. The myth does not describe what ought to be done; it expresses what must be. The rhythms and outcomes of biology and climate are both amoral and nonlogical, although they have form and order. To gain power the participant in ritual or the believer in myth (who enacts its episodes in imagination
by identification with its characters) must perform or feign to perform, in act or in fantasy, deeds of murder, cannibalism, adultery, or incest, since the generative processes of inner and outer nature are most directly expressed in such behavior. Liminal symbolism, both in its ritual and mythic expressions, abounds in direct or figurative transgressions of the moral codes that hold good in secular life, such as human sacrifice, human flesh eating, and incestuous unions of brother-sister or mother-son deities or their human representatives. Thus the theory that myths are paradigmatic (Eliade 1957) or that myths afford precedents and sanctions for social status and moral rules (Malinowski 1925) requires some sort of qualification. Myths and liminal rites are not to be treated as models for secular behavior. Nor, on the other hand, are they to be regarded as cautionary tales, as negative models which should not be followed. Rather are they felt to be high or deep mysteries which put the initiand temporarily into close rapport with the primary or primordial generative powers of the cosmos, the acts of which transcend rather than transgress the norms of human secular society. In myth is a limitless freedom, a symbolic freedom of action which is denied to the normbound incumbent of a status in a social structure. What the initiand seeks through rite and myth is not a moral exemplum so much as the power to transcend the limits of his previous status, although he knows he must accept the normative restraints of his new status. Liminality is pure potency, where anything can happen, where immoderacy is normal, even normative, and where the elements of culture and society are released from their customary configurations and recombined in bizarre and terrifying imagery.
Yet this boundlessness is restricted, although never without a sense of hazard, by the knowledge that this is a unique situation and by a definition of the situation which states that the rites and myths must be told in a prescribed order and in a symbolic rather than a literal form. The very symbol that expresses at the same time restrains; through mimesis there is an acting out, rather than the acting, of an impulse that is biologically motivated but socially and morally reprehended.

The "reality" of myths

Many authorities on mythology have stressed the reality, as distinct from the fantastic or unreal aspects, of myth. Malinowski, for example, described how myth "as it exists in a savage community" is "not merely a story told but a reality lived." It is "not an idle tale, but a hard-worked
active force" ([1925] 1948, pp. 100-101). Jung wrote that "the primitive mentality does not invent myths, it experiences them." Myths are "anything but allegories of physical processes. . . . Myths . . . have a vital meaning . . . not merely do they represent, they are the mental life of the primitive tribe, which immediately falls to pieces and decays when it loses its mythological heritage" ([1909-1946] 1953, p. 314). And we find Mircea Eliade writing that myth "is always the recital of a creation; it tells how something was accomplished, began to be. It is for this reason that myth is bound up with ontology; it speaks only of realities, of what really happened, of what was fully manifested" ([1957] 1959, p. 95). Now it is true that for each of these authors, reality or experience has a different meaning. Malinowski's primary intent was to relate the myths of the Trobriand Islanders to their social and cultural experience. Thus, myths of the emergence of clan ancestors from holes in the ground were related to actual topographical features, to the contemporary distribution of Trobriand clans, and to Trobriand kinship patterns and social stratification. By "reality" Malinowski meant that myths are charters of extant social institutions. Although they might mention fictitious beings, their details had a point-to-point correlation with social and cultural arrangements—which were real aspects of Trobriand experience. Jung, on the contrary, regarded myths not as indices of, or charters for, cultural institutions, but as "psychological realities," as expressions of the "archetypes" or "primordial images" of the "collective unconscious." These are real in the sense that they represent inherited forms or patterns (in the Platonic sense of ideas) present in every human being. At first these forms are without specific thought content; content is provided by the specific culture. 
Myths give "a local habitation and a name" to these general forms and give them "reality" by manifesting them to consciousness. By "reality" Eliade means "sacred reality," for, he writes, "it is the sacred that is pre-eminently the real" ([1957] 1959, p. 95). His analysis hinges on a distinction between the sacred and the profane. The sacred for him is sui generis; like Rudolf Otto's das Heilige, the sacred presents itself as something "like nothing human or cosmic . . . a reality of a wholly different order from 'natural' (or 'profane') realities . . . saturated with being . . . equivalent to a power." Myth is a "sacred history" (and hence "saturated with being . . . and power"), and "to relate a sacred history is equivalent to revealing a mystery. For the persons of the myth are not human beings: they are gods or culture heroes, and for this reason their gesta constitute mysteries; man could not know their acts if they were not revealed to him" (ibid., p. 95). We cannot get behind this theological language to the processes underlying myth. The sacred or sacred realm, for Eliade, is inaccessible to us except insofar as it chooses to reveal itself to us in the analogies of mythic or ritual symbolism. Thus, for Malinowski, "reality," as an attribute of myth, is cultural, for Jung psychological, and for Eliade spiritual (as it is indeed for our preliterate interpreters). If, however, myth is merely a charter or precedent for the continuance of rites and customs, it has some weird and numinous features; if it is a bundle of archetypes, it also has close reference to specific cultural and social institutions and relations; while if it is "an irruption of the sacred, of creative energy into the world . . . a surplus of ontological substance" (ibid., p. 97), it has a variety of profane cultural and psychological interconnections.

Analyzing liminal rites and symbolism

The social and cultural context. Possibly the best approach to the problem of cracking the code of myth is the via negativa represented by the liminal phase in initiation rites. But to analyze this adequately, we must take heed of all that Malinowski says concerning the necessity of studying such rites in the live context in which they occur. This context is in every given instance a social field: a structure of social positions and a set of cultural institutions and mechanisms. The specific initiation rite or myth must also be examined as a component of a total system of religious beliefs and practices. Its symbols and episodes, subdivided into such units as significata, stages, words, sentences, motifs, personae, objects and relationships, and the principles and themes underlying these, must be related to those found in other parts of the total religious system.
Next, the properties and structure of the religious system must be compared and contrasted with the properties and structure of other cultural subsystems, such as the kinship system, the economic system, and the legal and political systems. In other words, we have to seek a part of the meaning of a myth in the idiosyncrasy of its cultural context, a context of many dimensions. Nor must we neglect the dynamics of that culture: we must see the rite and the myth as phases in social processes, as being performed or narrated at significant points in the seasonal cycle, at individual or group life-crises, at times of natural catastrophe,
such as famine, drought, flood, and epidemic, or with reference to crises brought on by human lawbreaking or "sinful" action. Before we can say with any certainty what this or that liminal phenomenon is, we must be able to state what it is not. It is not the state of cultural affairs that precedes it or that which follows. But since it is, in some sense, the antithesis of what precedes it, we must know the structure of that cultural state. And since it is, in some sense, a preparation for the state that is to follow, we must know the properties, conditions, and structural features of that state too. Liminality strains toward universality but never realizes it; a specific culture surrounds it in space and time and invades its innermost sanctum. Its very sacra bear the hallmarks of a particular historically derived culture.
Psychogenic factors. Nevertheless, simply because liminality, and the sacred myth which is one of its phenomena, does so strain toward universality, toward the dissolution of specific structural arrangements, there is a rich manifestation of psychical contents otherwise withheld from expression by a preoccupation with norm-governed or pragmatic activities. In many cultures the life-crises of birth, puberty, marriage, and death have been made the occasions of initiation ritual, and since these crises closely concern the experiences and relationships of the nuclear family, it is possible that Freudians and Neo-Freudians can shed much light on the unconscious semantic components of liminal symbolism, especially insofar as these may represent "the return of the repressed." The Jungians, whose therapy rests on the interpretation of symbols ejected from the "collective unconscious" under the pressure of an adult crisis, might discover in the relationship between ritual and crises found in primitive societies some justification for the use of their analytical procedures.
Jung himself uncompromisingly states that myths are first and foremost psychic manifestations that represent the nature of the psyche. All the myths concerned with occurrences of nature, such as summer and winter, the phases of the moon, and the rainy seasons, are definitely not allegories of these objective experiences, nor are they to be understood as explanations of the sunrise, the sunset, and other natural phenomena. Rather, they are symbolic expressions of the inner and unconscious psychic drama that becomes accessible to human consciousness by projection—that is, by being mirrored in the events of nature (Jung 1909-1946). This bluntly psychogenic explanation of myth denies to culture any formative
role in its symbolism. It also excludes the intellectualist variety of psychogenic explanation favored today by Levi-Strauss, who holds that myths, and other religious manifestations, contain ideas that "give access to the mechanism of thought." Myths "pertain to the understanding, and the demands to which it responds and the way in which it tries to meet them are primarily of an intellectual kind" ([1962] 1963, p. 104). Levi-Strauss finds in primitive religious phenomena "the emergence of a logic operating by means of binary oppositions and coinciding with the first manifestations of symbolism"; and in metaphor—which plays an important role in myth—he finds "a primary form of discursive thought" (p. 102). His emphasis is primarily on the "logic of oppositions and correlations, exclusions and inclusions, compatibilities and incompatibilities," which for him "explains the laws of association" found in mythic and ritual symbolism and discourse. When Levi-Strauss analyzes myth, his main aim is to reveal the austere structure of this logic behind its symbolic and bizarre integument. Depth psychologists generally would demur at the stress on logic in this realm; they hold that in unconscious thinking, logically incompatible ideas can coexist and even reinforce one another in a single situation, while symbols may have multiple disparate referents. The followers of Pareto, too, would assert that nonlogical or nonrational symbols must be distinguished from logical symbols, constituting a class whose members derive both form and semantic content from biotic and cultural processes of a noncognitive type; logical symbols are conceived in the conscious mind, as Pallas was in Zeus's head. Nonlogical symbols represent the impress on consciousness of factors external or subliminal to it. Such symbols may subsequently become objects of reflection, and from them many logical symbols may be derived by abstraction. 
But they are not generated by the consciousness, nor are they mutually interrelated in terms of the rules of logic. Many mythic and ritual symbols belong to the class of nonlogical symbols and cannot therefore be analyzed as though they operated by the rules of logic.
The cultural dynamics of ritual. Many of these dilemmas may be resolved if we take the cultural dynamics of ritual as our point of departure. Here we find more than the distinction between the profane and the sacred. In the liminal stage of rites de passage, we find not merely the sacred but the most sacred. And paradoxically this is where we also find the most human, indeed, the all-too-human. Particularly do we discover in this stage a crucial anchoring of ideas and symbols in the human body and in its somatic processes. The body (with its unconscious rhythms and orectic processes) is viewed as the epitome or microcosm of the universe. It becomes the metaphor or model which illustrates most vividly all other profane types of regularity—of nature, of culture, of society, and of thought. In the profane or secular realm—even though in multifunctional communities this too is saturated with religious ideas and imagery—utility and rationality guide behavior and lead to the classification of phenomena and processes, both of nature and society. This rational categorization of reality enables the human community to cope efficiently with the problems of obtaining its food supply and maintaining social order. These classifications "spill over" into the sacred realm and are particularly in evidence in the separation and aggregation phases of ritual, in which the sacred has to come to terms, so to speak, with the profane, where the two realms interdigitate. But in the liminal phase of separation and secret instruction in gnostic sacerrima, the nonlogical and biopsychical modes of thinking and acting prevail. The behavior in such phases is "inspired by things as they are and not by things as they ought to be" (Horton 1963, p. 98).
Trickster tales
In the liminal period we see naked, unaccommodated man, whose nonlogical character issues in various modes of behavior: destructive, creative, farcical, ironic, energetic, suffering, lecherous, submissive, defiant, but always unpredictable. One class of myths which throws into sharp relief many aspects of liminality is that represented by the widely distributed trickster tales. A considerable scholarly literature has accumulated on tricksters (see, for example, Radin 1955; Dumezil 1948; Wescott 1962; Herskovits 1938).
They include the Greek god Hermes, the Norse god Loki, the Yoruba deity Eshu-Elegba, the Fon Legba, the Winnebago trickster Wakdjunkaga, and many others. Tricksters are clearly liminal personalities (threshold men or edge men). Joan Wescott, for example, describes the Yoruba Eshu-Elegba in the following terms: "[Eshu] is ... described as a homeless wandering spirit, and as one who inhabits the market-place, the crossroads, and thresholds of houses. He is present whenever there is trouble and also wherever there is change and transition" (1962, p. 337). In very similar terms Hermes, as the messenger
of the gods, inhabits crossroads, open public places, and doorways, and is associated with commerce. He is the invincible child, well equipped with the powers of nature and instinct. Most tricksters have an uncertain sexual status: on various mythical occasions Loki and Wakdjunkaga transformed themselves into women, while Hermes was often represented in statuary as a hermaphrodite. On other occasions tricksters appear with exaggerated phallic characteristics: Hermes is symbolized by the herm or pillar, the club, and the ithyphallic statue; Wakdjunkaga has a very long penis which has to be wrapped around him and put over his shoulder in a box; Eshu is represented in sculpture as having a long curved hairdress carved as a phallus. In most trickster tales there are many scatological and even coprophagous episodes, exemplifying what Wescott has called the "katabolic nature of the trickster." Tricksters are multiform and ambiguous. For example, myths about Eshu describe him as firstborn and as last-born, as old man and as child. In these four roles the individual normally has privileged freedom from some of the demands of the social code. Other traits ascribed to tricksters include: combined black and white symbolism, aggression, vindictiveness, vanity, defiance of authority, willfulness, individualism, indeterminacy of stature (sometimes tall, sometimes dwarfish), destructiveness, creativeness (the Winnebago trickster transforms the pieces of his broken phallus into plants and flowers for men—hence he is both single and multiple), and libido without procreative outcome. These liminal entities share an antinomian character. They behave as though there were no social or moral norms to guide them. Self-will, caprice, and lust impel them. In a rather different sense from Eliade's, they are "the opposite of the profane," if we include in the latter the notions of moral and jural order. 
Yet though wholly other, they are perfectly familiar to mankind, even jocularly so, for they represent what everyone would secretly like to do. Since their energies are untrammeled and unchanneled, they are supererogatory, and their surplus becomes the source of new substances and beings. They are raw, undomesticated bodily and collective power, undefinable, uncontainable, and compounded equally of polymorphous libido and aggression. It is true that in certain trickster myth cycles (especially in North America), the later tales describe the structuring of the trickster's life and activities: he marries, settles down, has children, obeys kinship and
affinal norms, etc., but here he resembles the initiand who leaves the liminal scene and is "aggregated" once again to society. The unpredictable liminal persona becomes predictable again in terms of the norms and classifications of profane society. The interstructural transition stage is over. Creative chaos has become created cosmos.
Creation myths
But the concept of limen includes not only the Dionysian and polymorphous aspects of human normlessness; it also includes the notions of the mystical and the ascetical. In this regard, there is usually a feeling that the human cultural order is a kind of painted veil over a deeper, superhuman order, the mysteries of which begin to be accessible only to those who have been stripped during initiation of profane status and profane rank. The humility and discipline of the novice, his self-abnegation and self-denial, and his acceptance of the absolute authority of his instructors win for him true gnosis. This set of liminal attitudes is associated with a very different type of mythology than that represented by the trickster cycles. To this type belong such creation tales and chants as the Hebrew Genesis, the Greek Theogony, the Zoroastrian, Gnostic, and Mandaean cosmogonies, the Fon cosmogony, the Quiche Mayan Popol Vuh, the Norse Elder Edda, and the Hawaiian Kumulipo, or creation chant. These all reveal how the One became the Many, how in a series of orderly stages chaos became a cosmos of many dimensions and levels; most of these tell also how sin and death came into the world, and thus they provide a theodicy. These great myths are in many societies recited during liminal periods, the times that are rich in ritual. Every myth of this sort, Eliade holds, "shows how a reality came into existence, whether it be the total reality, the cosmos, or only a fragment—an island, a species of plant, a human institution . . .
to tell how a thing was born is to reveal an irruption of the sacred into the world, and the sacred is the ultimate cause of all real existence" ([1957] 1959, p. 97). At first glance, it might seem that these architectonic masterpieces have little in common with the trickster myths, in which "realities" come into being as the result of caprice or accident. Yet in many of these cosmogonies and theogonies, the deities and heroes mate incestuously, devour one another, and clearly transgress human and cultural norms of justice and equity. By these acts, despite priestly editing, the liminal character of the myth betrays itself. And, indeed, in most of
these cycles of great myths, trickster figures may be found peeping grotesquely forth like the gargoyles on Gothic cathedrals. Myths are not merely a guide to culture, although they are this as well; they point to the generative power underlying human life, a power which from time to time oversteps cultural limits. Surely these huge symbolizations of incest and crime at the level of the deity are more significant than Ruth Benedict supposed when she described their Zuni manifestations as "distortions" due to "various fanciful exaggerations and compensatory mechanisms" (1935, pp. xx-xxi). They represent a return to the deep sources of psychosomatic experience in a legitimized situation of freedom from cultural constraints and social classifications. These relatively short "liminal instants" must counterbalance the long days of utilitarian and culture-bound experience. At the root of the rational is the nonrational, which gives it its meaning, and liminality is that root. Nature (and indeed spirit, the intelligent and immaterial part of man) is still the mentor of culture and the source of its often unpredictable changes. In myth we see nature and spirit at their shaping work—and this in the liminal moment in and out of time.
VICTOR W. TURNER
[See also FOLKLORE; POLLUTION; RELIGION; RITUAL; and the biographies of GENNEP; JUNG; MALINOWSKI; MAUSS; RADIN.]
BIBLIOGRAPHY
BAUMANN, HERMANN 1935 Lunda: Bei Bauern und Jägern in Inner-Angola. Berlin: Würfel.
BENEDICT, RUTH 1935 Zuni Mythology. 2 vols. Columbia University Contributions to Anthropology, Vol. 21. New York: Columbia Univ. Press.
DUMEZIL, GEORGES 1948 Loki. Paris: Maisonneuve.
ELIADE, MIRCEA (1957) 1959 The Sacred and the Profane. New York: Harcourt. → A paperback edition was published in 1961 by Harper.
GENNEP, ARNOLD VAN (1909) 1960 The Rites of Passage. London: Routledge. → First published in French.
GLUCKMAN, MAX (1949) 1963 The Role of the Sexes in Wiko Circumcision Ritual.
Pages 145-167 in Meyer Fortes (editor), Social Structure: Essays Presented to A. R. Radcliffe-Brown. New York: Russell.
HERSKOVITS, MELVILLE J. 1938 Dahomey: An Ancient West African Kingdom. 2 vols. New York: Augustin.
HORTON, ROBIN 1963 The Kalahari Ekine Society: A Borderland of Religion and Art. Africa 33:94-114.
JUNG, CARL G. (1909-1946) 1953 Psychological Reflections: An Anthology of Writings. Selected and edited by Jolande Jacobi. New York: Harper. → A paperback edition was published in 1961.
LEVI-STRAUSS, CLAUDE (1962) 1963 Totemism. Boston: Beacon. → First published as Le totemisme aujourd'hui.
MALINOWSKI, BRONISLAW (1925) 1948 Magic, Science and Religion. Pages 1-71 in Bronislaw Malinowski,
"Magic, Science and Religion," and Other Essays. Glencoe, Ill.: Free Press.
OPLER, MORRIS 1938 Myths and Tales of the Jicarilla Apache Indians. Memoirs of the American Folklore Society, Vol. 31. Philadelphia: The Society.
RADIN, PAUL (1955) 1956 The Trickster: A Study in American Indian Mythology. London: Routledge; New York: Philosophical Library.
RICHARDS, AUDREY I. 1956 Chisungu: A Girls' Initiation Ceremony Among the Bemba of Northern Rhodesia. London: Faber.
THOMPSON, STITH 1946 The Folktale. New York: Dryden.
TURNER, VICTOR W. 1962 Three Symbols of Passage in Ndembu Circumcision Ritual. Pages 124-173 in Max Gluckman (editor), Essays on the Ritual of Social Relations. Manchester Univ. Press.
WESCOTT, JOAN 1962 The Sculpture and Myths of Eshu-Elegba, the Yoruba Trickster. Africa 32:336-354.
WHITE, CHARLES M. N. 1961 Elements in Luvale Beliefs and Rituals. Rhodes-Livingstone Paper No. 33. Manchester Univ. Press.