Journal of Personality and Social Psychology, Vol. 94, No. 5, May 2008

Journal of Personality and Social Psychology
Volume: 94, Issue: 5
May, 2008

Table of Contents:

1) Hedonic Tone and Activation Level in the Mood-Creativity Link: Toward a Dual Pathway to Creativity Model
   De Dreu, C.K.W.; Baas, M.; Nijstad, B.A. pp. 739-756 (184 KB)
2) Judgments of the Lucky Across Development and Culture
   Olson, K.R.; Dunham, Y.; Dweck, C.S.; Spelke, E.S.; Banaji, M.R. pp. 757-776 (348 KB)
3) How to Heat Up From the Cold: Examining the Preconditions for (Unconscious) Mood Effects
   Ruys, K.I.; Stapel, D.A. pp. 777-791 (129 KB)
4) Forming Implicit and Explicit Attitudes Toward Individuals: Social Group Association Cues
   McConnell, A.R.; Rydell, R.J.; Strain, L.M.; Mackie, D.M. pp. 792-807 (308 KB)
5) Maintaining Sexual Desire in Intimate Relationships: The Importance of Approach Goals
   Impett, E.A.; Strachman, A.; Finkel, E.J.; Gable, S.L. pp. 808-823 (298 KB)
6) Receiving Support as a Mixed Blessing: Evidence for Dual Effects of Support on Psychological Outcomes
   Gleason, M.E.J.; Iida, M.; Shrout, P.E.; Bolger, N. pp. 824-838 (255 KB)
7) Nomina Sunt Omina: On the Inductive Potential of Nouns and Adjectives in Person Perception
   Carnaghi, A.; Maass, A.; Gresta, S.; Bianchi, M.; Cadinu, M.; Arcuri, L. pp. 839-859 (173 KB)
8) Taking the Easy Way Out: Preference Diversity, Decision Strategies, and Decision Refusal in Groups
   Nijstad, B.A.; Kaps, S.C. pp. 860-870 (102 KB)
9) Distinguishing Between Silent and Vocal Minorities: Not All Deviants Feel Marginal
   Morrison, K.R.; Miller, D.T. pp. 871-882 (113 KB)
10) Making Choices Impairs Subsequent Self-Control: A Limited-Resource Account of Decision Making, Self-Regulation, and Active Initiative
   Vohs, K.D.; Baumeister, R.F.; Schmeichel, B.J.; Twenge, J.M.; Nelson, N.M.; Tice, D.M. pp. 883-898 (138 KB)
11) Adolescent Personality Moderates Genetic and Environmental Influences on Relationships With Parents
   South, S.C.; Krueger, R.F.; Johnson, W.; Iacono, W.G. pp. 899-912 (138 KB)
12) Societal Threat, Authoritarianism, Conservatism, and U.S. State Death Penalty Sentencing (1977-2004)
   McCann, S.J.H. pp. 913-923 (269 KB)

ATTITUDES AND SOCIAL COGNITION

Hedonic Tone and Activation Level in the Mood–Creativity Link: Toward a Dual Pathway to Creativity Model

Carsten K. W. De Dreu, Matthijs Baas, and Bernard A. Nijstad
University of Amsterdam

To understand when and why mood states influence creativity, the authors developed and tested a dual pathway to creativity model; creative fluency (number of ideas or insights) and originality (novelty) are functions of cognitive flexibility, persistence, or some combination thereof. Invoking work on arousal, psychophysiological processes, and working memory capacity, the authors argue that activating moods (e.g., angry, fearful, happy, elated) lead to more creative fluency and originality than do deactivating moods (e.g., sad, depressed, relaxed, serene). Furthermore, activating moods influence creative fluency and originality because of enhanced cognitive flexibility when tone is positive and because of enhanced persistence when tone is negative. Four studies with different mood manipulations and operationalizations of creativity (e.g., brainstorming, category inclusion tasks, gestalt completion tests) support the model.

Keywords: mood, creativity, cognitive flexibility, emotions, arousal

What enables scientists to make notable contributions, engineers to develop innovative products, and work teams to creatively solve their problems? What hinders stand-up comedians from being funny and keeps poets from being original? When are people creative, and why? What hinders creativity, and when? Partly because of the importance of creativity for human progress and adaptation, these questions are as old as the human sciences (Simonton, 2003). Apart from its obvious, problem-solving function (Mumford & Gustafson, 1988), creative ideation allows individuals to remain flexible (Flach, 1990), giving them the capacity to cope with the advantages, opportunities, technologies, and changes that are a part of their day-to-day lives (Runco, 2004). Accordingly, creativity is studied in a variety of disciplines, including psychology, organizational behavior, and communication sciences.

Creativity is usually defined as the generation of ideas, insights, or problem solutions that are new and meant to be useful (Amabile, 1983; Paulus & Nijstad, 2003; Sternberg & Lubart, 1999). Among the many variables that have been shown to predict creativity, mood stands out as one of the most widely studied and least disputed predictors (e.g., George & Brief, 1996; Isen & Baron, 1991; Mumford, 2003). For example, Ashby, Isen, and Turken (1999) noted that

It is now well recognized that positive affect leads to greater cognitive flexibility and facilitates creative problem solving across a broad range of settings. These effects have been noted not only with college samples but also in organizational settings, in consumer contexts, in negotiation situations . . . and in the literature on coping and stress. (p. 530)

In a similar vein, Lyubomirsky, King, and Diener (2005) concluded that

people in a positive mood are more likely to have richer associations within existing knowledge structures, and thus are likely to be more flexible and original. Those in a good mood will excel either when the task is complex and past learning can be used in a heuristic way to more efficiently solve the task or when creativity and flexibility are required. (p. 840)

Although many studies support the idea that positive mood states trigger more creative responses than do neutral mood control conditions, studies in which positive and negative mood states were compared appear to be less conclusive: “There is also a large literature on negative affect, which indicates that the impact of negative affect is more complex and difficult to predict than is the case for positive affect” (Ashby et al., 1999, p. 532). Indeed, whereas some studies suggest that positive mood states trigger more creativity than do negative mood states (e.g., Grawitch, Munz, & Kramer, 2003; Hirt, Levine, McDonald, Melton, & Martin, 1997; Hirt, Melton, McDonald, & Harackiewicz, 1996), other studies report similar levels of creativity (Bartolic, Basso, Schefft, Glauser, & Titanic Schefft, 1999), and still other studies

Carsten K. W. De Dreu, Matthijs Baas, and Bernard A. Nijstad, Department of Psychology, University of Amsterdam, Amsterdam, the Netherlands. We thank Joyce Jacobs for help in coding the data of Study 4 and Gerben van Kleef, Mark Rotteveel, and Richard Ridderinkhof for comments and suggestions. Correspondence concerning this article should be addressed to Carsten K. W. De Dreu, Department of Psychology, University of Amsterdam, Roetersstraat 15, 1018 WB Amsterdam, the Netherlands. E-mail: [email protected]

Journal of Personality and Social Psychology, 2008, Vol. 94, No. 5, 739–756
Copyright 2008 by the American Psychological Association 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.739

DE DREU, BAAS, AND NIJSTAD

report that negative moods promote creative performance more than do positive or neutral moods (e.g., Carlsson, 2002; Gasper, 2003; Kaufmann & Vosburg, 1997; Madjar & Oldham, 2002). This has led some to call into question the general conclusion that positive mood states produce more creativity than do negative mood states (Shalley, Zhou, & Oldham, 2004) or that negative mood states undermine creative performance (Gasper, 2003; George & Zhou, 2007).

In this article, we reconsider the link between mood and creativity and try to reconcile the seemingly contradictory findings and conclusions reviewed above. First, we argue that creativity can be a function of cognitive flexibility and of cognitive perseverance and persistence. Second, we argue that mood states can be conceptualized in terms of two underlying dimensions: hedonic tone (positive vs. negative) and activation (activating vs. deactivating). Whereas past work on mood states and creativity has predominantly focused on the hedonic tone dimension and on cognitive flexibility, we argue that the extent to which mood states activate or deactivate and the tendency toward cognitive perseverance and persistence need to be taken into account as well. More specifically, we propose that cognitive activation is a necessary precondition for creativity to come about and that hedonic tone determines the route (the flexibility route or the perseverance route) through which creative fluency and originality are achieved. In four studies, we tested (aspects of) the general idea that activating moods with positive tone are linked to cognitive flexibility and thereby promote creative performance, whereas the creativity-enhancing effects of activating moods with negative tone are due to perseverance.

A Dual Pathway to Creative Performance

Creativity researchers often operationalize creativity with measures of fluency, originality, and flexibility (Guilford, 1967; Torrance, 1966). Because we will also use these measures in our studies, it is important to conceptually relate them to each other as well as to the general concept of creativity. Fluency is a measure of creative production and refers to the number of nonredundant ideas, insights, problem solutions, or products that are being generated. Originality is one of the defining characteristics of creativity and refers to the uncommonness or infrequency of the ideas, insights, problem solutions, or products that are being generated (Amabile, 1983; Guilford, 1967; Paulus & Nijstad, 2003; Sternberg & Lubart, 1999; Torrance, 1966). Fluency and originality may be correlated (e.g., quantity breeds quality; Diehl & Stroebe, 1987; Osborn, 1953), but they need not be. For example, creative fluency may manifest itself in a relatively large number of solved insight or perception problems, with the solutions themselves not being particularly new or uncommon (cf. Förster, Friedman, & Liberman, 2004). Moreover, states or traits that influence creative fluency do not necessarily also influence originality and vice versa.

Flexibility as a measure of creativity manifests itself in the use of different cognitive categories and perspectives and of broad and inclusive cognitive categories (Amabile, 1983; Mednick, 1962). Generating ideas in many different categories will, all other things being equal, be associated with more ideas overall (i.e., increased fluency; cf. Nijstad, Stroebe, & Lodewijkx, 2002) as well as with the generation of ideas in categories that are not usually thought of (i.e., originality; cf. Murray, Sujan, Hirt, & Sujan, 1990; also see Isen & Daubman, 1984; Mikulincer, Paz, & Kedem, 1990; Rietzschel, De Dreu, & Nijstad, 2007).

It is important to note that besides being a measure of creative performance, flexibility also refers to a cognitive process. Many researchers have argued that in order to be creative (i.e., produce novel and appropriate products) people must think flexibly, must break set (e.g., Duncker, 1945; Smith & Blankenship, 1991; Smith, Ward, & Schumacher, 1993), and need flat associative hierarchies (e.g., Eysenck, 1993; Mednick, 1962; Simonton, 1999) to arrive at uncommon and disparate (and thus original) associations. Cognitive flexibility can thus not only be seen as a measure of creativity but also as a precursor of the production of many (fluency) and original responses.

However, in addition to cognitive flexibility, it is also possible to achieve creative fluency and originality through hard work, perseverance, and more or less deliberate, persistent, and in-depth exploration of a few cognitive categories or perspectives (Boden, 1998; Dietrich, 2004; Finke, 1996; Schooler, Ohlsson, & Brooks, 1993; Simonton, 1997). Perseverance will manifest itself not in the use of many or broad cognitive categories but rather in the generation of many ideas within a few categories or in longer time-on-task. All other things being equal, generating many ideas in a few categories will also lead to more ideas overall (i.e., fluency; Nijstad et al., 2002). Furthermore, recent work suggests that fluency within categories is associated with originality of ideas within these categories: Because only a limited number of conventional and unoriginal ideas are possible in each category, perseverance within categories eventually leads to original ideas (Rietzschel, Nijstad, & Stroebe, 2007).

Such within-category fluency (e.g., Nijstad & Stroebe, 2006; Nijstad et al., 2002; Nijstad, Stroebe, & Lodewijkx, 2003) can be illustrated with the example of an individual who generates ideas as to how to improve health. This person may think about physical exercise and sport and may start out with common ideas like, “people should spend more time doing physical exercise.” However, provided he or she continues generating ideas within this category, he or she might proceed to more unusual ideas within that category, like, “putting a strong string in your computer keyboard to make typing very hard work.” In previous work in which both flexibility (number of used categories) and within-category fluency were established, no systematic correlation between the two was observed (Nijstad et al., 2002, 2003).

Taken together, creativity can be achieved through enhanced cognitive flexibility, set-breaking, and cognitive restructuring, which manifests itself in the use of many, broad, and inclusive cognitive categories. It can equally well be achieved through enhanced persistence and perseverance, which manifests itself in a higher number of ideas and insights within a relatively low number of cognitive categories, prolonged effort, and relatively long time-on-task. This may apply to idea generation and divergent thinking tasks, as well as to insight tasks that are typically characterized by being ultimately soluble by the average problem solver. Such insight tasks are likely to produce an impasse and a state of high uncertainty as to how to proceed and to produce a kind of “aha” experience when the impasse is suddenly overcome and the solution is discovered after prolonged efforts at solution (Förster et al., 2004; Schooler et al., 1993).
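The three brainstorming indices described above can be made concrete with a small sketch. It is illustrative only: the input format (a list of idea and category pairs), the text normalization used to detect redundant ideas, and the definition of within-category fluency as the mean number of ideas per category used are assumptions made for this example, not the authors' coding procedure.

```python
from collections import Counter


def brainstorm_indices(coded_ideas):
    """Summarize one participant's coded brainstorming output.

    coded_ideas: list of (idea_text, category_label) pairs. The pair format
    and the normalization below are illustrative assumptions.
    """
    # Keep one entry per nonredundant idea (case and spacing are ignored).
    unique = {}
    for text, category in coded_ideas:
        key = " ".join(text.lower().split())
        unique.setdefault(key, category)

    ideas_per_category = Counter(unique.values())
    fluency = len(unique)                  # number of nonredundant ideas
    flexibility = len(ideas_per_category)  # number of categories used
    # Assumed persistence index: mean number of ideas per category used.
    within_category_fluency = fluency / flexibility if flexibility else 0.0
    return {
        "fluency": fluency,
        "flexibility": flexibility,
        "within_category_fluency": within_category_fluency,
    }


# Hypothetical participants: one samples many categories, one stays in a single category.
flexible = [("more exercise", "sport"), ("eat fruit", "diet"),
            ("sleep longer", "rest"), ("ban smoking", "policy")]
persistent = [("more exercise", "sport"), ("cycle to work", "sport"),
              ("stiff keyboard springs", "sport"), ("More  exercise", "sport")]

print(brainstorm_indices(flexible))
print(brainstorm_indices(persistent))
```

In this toy data the two participants reach similar fluency by different routes: the first through many categories, the second through depth within one category.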

MOOD–CREATIVITY LINK REVISITED

Discrete Moods, Creative Fluency, and Originality

A critical implication of the dual pathway model is that any trait or state influencing cognitive flexibility or cognitive persistence and perseverance may lead to novel yet appropriate insights and ideas. With regard to the influence of mood on creative fluency and originality, it may thus be that mood states influence creativity to the extent that they enhance cognitive flexibility, perseverance, or both; perhaps both positive and negative mood states lead to creative fluency and originality, but through different routes.

When thinking about mood states, valence, or hedonic tone, most readily comes to mind. Discrete moods such as anger, anxiety, sadness, and depression all have negative valence, or tone. Discrete mood states such as happiness, elation, and feeling relaxed and calm all have positive valence, or tone. However, in addition to hedonic tone, discrete moods differ in the extent to which they activate or deactivate (Barrett & Russell, 1998; Gray, 1982; Green, Goldman, & Salovey, 1993; Posner, Russell, & Peterson, 2005; Thayer, 1989; Watson, Clark, & Tellegen, 1988). Some mood states are positive in tone and deactivating (calm, relaxed), whereas others are positive in tone and activating (happy, elated). Likewise, some mood states are negative in tone and deactivating (sad, depressed), whereas others are negative in tone and activating (angry, fearful). This applies to temporarily activated and experimentally manipulated mood states (Russell & Barrett, 1999; Watson, Wiese, Vaidya, & Tellegen, 1999), as well as to trait-related differences in mood (Filipowicz, 2006). For example, trait extraversion is often equated with positive affectivity (positive, activating), and trait neuroticism is equated with negative affectivity (negative, activating; Cropanzano, Weiss, Hale, & Reb, 2003; Eysenck, 1993).

Activation, Hedonic Tone, and Creativity

Whether mood states are activating or deactivating may have important effects on creative performance. According to both classic and contemporary work on threat rigidity (Carnevale & Probst, 1998; Staw, Sandelands, & Dutton, 1981) and stress-performance linkage (Broadbent, 1972; Yerkes & Dodson, 1908), an individual’s capacity for complex thinking is altered in a curvilinear fashion as arousal and activation increase. Low levels of arousal lead to inactivity and avoidance, neglect of information, and low cognitive and motor performance. Extremely high levels of arousal reduce the capacity to perceive, process, and evaluate information. However, at moderate levels of arousal, individuals will be motivated to seek and integrate information and to consider multiple alternatives. Provided they are not associated with intense arousal, activating moods are thus more likely than deactivating mood states to increase attention to and integration of information.

That activating mood states may foster creativity also follows from work on the interrelations among arousal, release of specific neurotransmitters such as dopamine and noradrenalin, and working memory capacity (cf., Ashby, Valentin, & Turken, 2002; Flaherty, 2005; Nieuwenhuis, Aston-Jones, & Cohen, 2005; Usher, Cohen, Servan Schreiber, Rajkowski, & Aston Jones, 1999). Working memory capacity refers to the ability to hold information transiently in mind in the service of comprehension, thinking, and planning (Baddeley, 2000). Activation and arousal associate with the release of dopamine and noradrenalin, which in turn play a major role in regulating the excitability of the cortical circuitry on which the working memory function of the prefrontal cortex depends (Dreisbach et al., 2005; Goldman-Rakic, 1996). Moderate levels of dopamine associate with improved working memory performance (Floresco & Phillips, 2001; Kimberg, D’Esposito, & Farah, 1997), more efficient processing of task-relevant information (Drabant et al., 2006), increased maintenance of task-relevant information (Colzato, Van Wouwe, & Hommel, 2007), and better switching between tasks (Dreisbach & Goschke, 2004). Moderate (but not extremely high) levels of noradrenalin enhance prefrontal cortex control of behavior, including (short-term) working memory (Robbins, 1984; Usher et al., 1999) and sustained selective attention on task-relevant information (Chamberlain, Muller, Blackwell, Robbins, & Sahakian, 2006).

Apart from a simple motivating effect of activation, the above indicates that activating mood states rather than deactivating mood states come together with higher levels of dopamine and noradrenalin and greater working memory capacity. Working memory capacity is often taken as a prerequisite for cognitive flexibility, abstract thinking, strategic planning, processing speed, access to long-term memory, and sentience (Baddeley, 2000; Damasio, 2001; Dietrich, 2004). In terms of the dual pathway model outlined in the previous section, it thus appears that both for the cognitive flexibility route and for the persistence route, working memory capacity is required and beneficial. Activating rather than deactivating moods increase working memory capacity, thereby facilitating cognitive flexibility and restructuring, as well as more deliberate, analytical, and focused processing and combining of information. Indeed, affect intensity, measured with both negative and positive high arousing terms, relates to higher levels of creativity in children (Russ & Grossman-McKee, 1990) as well as employees (George & Zhou, 2007).
Whether activating mood states produce creative fluency and originality through enhanced cognitive flexibility or perseverance may depend on that mood state’s hedonic tone. According to the cognitive tuning model (Clore, Schwarz, & Conway, 1994; Schwarz & Bless, 1991), a positive affective state leads individuals to experience their situation as safe and problem free, to feel relatively unconstrained, to take risks, and to explore novel pathways and new possibilities in a relatively loose way, relying on heuristic processing styles (Fiedler, 2000; George & Zhou, 2007; Schwarz & Clore, 1988). Positive affect facilitates primary process cognition in the right hemisphere, which is holistic and analogical (Martindale & Hasenfus, 1978; Martindale, Hines, Mitchell, & Covello, 1984; also see Derryberry, 1989; Faust & Mashal, 2007; Fink & Neubauer, 2006). Consistent with this is a classic study on positive affect and creativity (Isen & Daubman, 1984). Participants in a state of mild happiness were asked to rate how prototypical several exemplars (e.g., bus, camel) were for a particular category (e.g., vehicle), with higher ratings for the weak exemplar (camel) indicating broad cognitive categories (Amabile, 1983; Eysenck, 1993). Results showed that compared with the control condition, happy participants had higher prototypicality ratings, that is, had broader and more inclusive cognitive categories (also see Isen, Niedenthal, & Cantor, 1992; Mikulincer & Sheffi, 2000; Murray et al., 1990). Other work showed that individuals in happy moods choose a global rather than a local visual configuration and perform faster on visual insight tasks that require set-breaking

Figure 1. Schematic overview of the roles of activation and tone in the dual pathway to creativity model. [Diagram not reproduced. Recoverable labels: Activation; Motivation, Working Memory Capacity; Tone (Positive, Negative); Cognitive Flexibility, Inclusiveness; Cognitive Persistence, Perseverance; Creative Fluency and Originality.]

(Fredrickson & Branigan, 2005; Gasper, 2003; Wadlinger & Isaacowitz, 2006). The cognitive tuning model, and related accounts, thus posits that positive affect allows individuals to be inclusive in their thinking, to switch cognitive categories, and to explore uncommon perspectives; positive affect, in other words, increases cognitive flexibility (cf., Ashby et al., 1999).

Negative affect, in contrast, informs the individual that his or her situation is problematic, threatening, and troublesome. Specific action must be taken to remedy the current situation, and this calls for a more constrained, systematic, and analytical approach (Ambady & Gray, 2002; Chaiken, Liberman, & Eagly, 1989; Schwarz & Bless, 1991). Negative affect enhances risk aversion and bolsters detail-oriented processing. It facilitates left hemispherical, secondary process cognition, which is more verbal, sequential, and analytical (Martindale & Hasenfus, 1978; Martindale, Hines, Mitchell, & Covello, 1984; also see Derryberry, 1989; Faust & Mashal, 2007; Fink & Neubauer, 2006). Negative mood states such as anxiety promote narrow perceptual processing, resulting in impaired detection of peripheral (but not central) visual information and impaired performance on secondary (but not primary) tasks; provided it does not become too extreme, such narrowed processing accompanying negative mood states may be adaptive in that it helps prevent distraction while focusing attention on the most important information (Derryberry & Reed, 1998). Indeed, negative activating moods such as fear and anxiety lead to narrow cognitive categories (Mikulincer et al., 1990), lowered ability to shift attention (Derryberry & Reed, 1998), and reduced cognitive flexibility (e.g., Carnevale & Probst, 1998). It is important to note that negative activating moods also increase persistence and perseverance (Gasper & Clore, 2002; Gray & Braver, 2002; Strauss, Hadar, Shavit, & Itskowitz, 1981; but see Baumann & Kuhl, 2005). For example, Verhaeghen, Joormann, and Khan (2005) showed that rumination (persisting, conscious, and negatively valenced self-related thoughts) correlated with creative fluency and originality and that this relationship appeared to be due to greater seriousness about and more time spent on creative activities.

According to our dual pathway model, creative fluency and originality may be achieved through enhanced cognitive flexibility, increased persistence and perseverance, or some combination thereof. On the basis of stress-performance literature and psychophysiological and neuroimaging work on arousal and working memory capacity, we argued that activating moods enhance creative fluency and originality more than do deactivating moods. From a combination of this with the cognitive tuning model

(Schwarz & Bless, 1991), the broaden-and-build perspective (Fredrickson, 1998), and the work on visual and conceptual focusing (e.g., Derryberry, 1989), it follows that activating moods that are positive in tone increase creative fluency and originality primarily through enhanced cognitive flexibility, whereas activating moods that are negative in tone increase creative fluency and originality primarily through enhanced persistence and perseverance. Put differently, whereas we would not necessarily expect differences in creative fluency and originality between activating positive (e.g., happy, elated) moods and activating negative (angry, fearful) moods, we would expect activating positive moods to associate with broader and more inclusive cognitive categories, with greater diversity in the cognitive categories used to generate ideas, and with fast completion times in creative insight tasks. Conversely, we would expect activating negative moods to associate with more ideas within specific cognitive categories and with relatively long completion times in creative insight tasks.¹

Figure 1 provides a schematic overview of the way activation and hedonic tone influence the two routes toward creative fluency and originality. As can be seen, the level of activation associated with a particular mood state serves as the critical entry point, with higher activation leading to greater fluency and originality. However, which pathway is used depends on a mood state’s hedonic tone, with positive tone facilitating the cognitive flexibility route and negative tone facilitating the cognitive perseverance route.

Some indirect evidence for our model is available, albeit outside the domain of creative performance. In their review of the psychological, neurochemical, and functional neuroanatomical mediators of the effects of positive and negative mood on executive functions, Mitchell and Phillips (2007) concluded that negative mood effects on executive functioning are mediated by serotonin, whereas positive mood effects may be mediated by dopamine, with serotonin being particularly involved in effortful processes associated with goal-directed activity and dopamine being particularly involved in switching flexibly between categories and tasks (e.g., Ashby et al., 1999). Spering, Wagener, and Funke (2005) found no overall differences in complex problem solving between positive and negative mood states but did find that negative mood states produced a stronger focus on seeking and using information. Brand, Reimer, and Opwis (2007), finally, showed that participants in a negative mood solved transfer tasks less efficiently than did those in a positive mood; negative mood participants needed more repetitions to reach a mastery level but did not differ from those in a positive mood in their ultimate problem-solving ability. Thus, indeed, there is some evidence that a mood state’s hedonic tone alters the processes by which individuals perform cognitive tasks and solve problems.

¹ Activation not only varies as a function of mood but also, for example, as a function of physical exercise (see also Kaufmann & Vosburg, 1997). Work on physical exercise and creativity is somewhat inconclusive, however, with some finding no differences between exercise and baseline conditions (Isen et al., 1987; Vosburg, 1998), and others finding physical exercises to lead to more divergent thinking (Blanchette, Ramocki, O’Del & Casey, 2005; Steinberg, Sykes, Moss, Lowery, & LeBoutillier, 1997). Unfortunately, in most of these studies, no manipulation checks for exercise-induced arousal or activation and no controls for participants’ physical condition were included, and it is unclear whether the exercise induced low, moderate, or high physical arousal. Furthermore, the tasks used in these experiments capitalized on cognitive flexibility (e.g., functional fixedness, remote associations), which may explain why, in a few cases (e.g., Isen et al., 1987; Vosburg, 1998), happiness (activating positive mood) produced more creativity than did exercise-induced arousal. We return to this in the Conclusions and General Discussion section.

The Present Study: Overview and Hypotheses

To test our model on creative fluency and originality as a function of a mood state’s activation and tone, we conducted four studies. In the first three studies, we used self-generated imagery to induce different mood states (cf., DeSteno, Petty, Rucker, Wegener, & Braverman, 2004; Strack, Schwarz, & Gschneidinger, 1985), some of which were negative in tone (anger, fear, sadness, depression) and some of which were positive in tone (happiness, elation, calm, relaxation). Apart from a hedonic tone contrast, this design allowed us to compute an activation contrast (activating moods [angry, fearful, happy, elated] versus deactivating moods [sad, depressed, calm, relaxed]) that is orthogonal to the hedonic tone contrast and to their interaction. In Study 4, we surveyed individuals’ self-reported mood states (negative activating, positive activating, negative deactivating, or positive deactivating) and used regression analyses to relate these mood dimensions to creative performance.

We also used, across studies, different tasks to assess creative performance. In Studies 1 and 4, we engaged participants in a brainstorming task. Apart from creative fluency and originality, from coded ideas we also derived indices of cognitive flexibility (i.e., the number of cognitive categories from which ideas were sampled) and perseverance (i.e., the number of ideas within a particular cognitive category; cf., Rietzschel, De Dreu, & Nijstad, 2007). In Study 2, we focused on cognitive inclusiveness and breadth of cognitive categories that people use, and in Study 3, we assessed performance on a Gestalt Completion Test (Ekstrom, French, Harman, & Dermen, 1976; Friedman & Förster, 2000; Schooler & Melcher, 1995), a classical insight problem in which participants view a series of fragmented pictures of familiar objects and attempt to perceptually integrate and recognize them, to close each gestalt. According to Förster et al. (2004), “this task may also be seen as requiring visual insight inasmuch as each item is ultimately soluble by the average problem solver and is likely to produce an impasse that may be suddenly overcome after continued efforts at solution” (p. 179).
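The orthogonality of the activation and hedonic tone contrasts in the eight-mood design can be checked with a short sketch. The +1/-1 contrast codes and the assumption of equal cell sizes are illustrative choices made here; the source does not report the exact weights used.

```python
# Assumed contrast codes per mood as (tone, activation):
# tone: +1 = positive, -1 = negative; activation: +1 = activating, -1 = deactivating.
MOODS = {
    "angry": (-1, +1), "fearful": (-1, +1),
    "happy": (+1, +1), "elated": (+1, +1),
    "sad": (-1, -1), "depressed": (-1, -1),
    "calm": (+1, -1), "relaxed": (+1, -1),
}

tone = [t for t, _ in MOODS.values()]
activation = [a for _, a in MOODS.values()]
interaction = [t * a for t, a in MOODS.values()]


def dot(u, v):
    """Sum of elementwise products of two equally long weight vectors."""
    return sum(x * y for x, y in zip(u, v))


# With equal cell sizes, two contrasts are orthogonal when their weights
# have a zero dot product; each contrast's weights also sum to zero.
for name, weights in [("tone", tone), ("activation", activation),
                      ("interaction", interaction)]:
    print(name, "sums to", sum(weights))

print("tone . activation =", dot(tone, activation))
print("tone . interaction =", dot(tone, interaction))
print("activation . interaction =", dot(activation, interaction))
```

With unequal cell sizes the same codes would no longer be exactly orthogonal, so the check would need to weight each product by the cell size.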

Study 1

In Study 1, we induced one of four different mood states (anger, sadness, happiness, and relaxation) and subsequently asked participants to brainstorm on ways to improve teaching at their university. We predicted that both positive and negative activating moods (happy, angry) would be related to greater creative fluency and originality than would both positive and negative deactivating moods (sad, relaxed; Hypothesis 1), that activating positive moods (happy) would be related to greater category diversity than would any other mood state (Hypothesis 2), and that activating negative moods (angry) would be related to greater within-category fluency than would any other mood state (Hypothesis 3).

Method

Design and participants. Undergraduate students (N = 58, 73% women, 27% men) at the University of Amsterdam participated for €5 (approximately U.S. $6.50) and were randomly assigned to one of four different mood conditions (anger, sadness, happiness, relaxation). Gender had no effects, and it is not discussed further. Dependent variables were self-reported activation and hedonic tone, as checks for the mood manipulation, and creative performance during brainstorming as reflected in number of unique ideas, originality of the ideas, number of cognitive categories used (cf. cognitive flexibility), and within-category fluency (cf. cognitive persistence).

Procedure and manipulation of discrete moods. Participants came to the laboratory and were seated in individual cubicles equipped with a chair, a desk, and a computer with keyboard. Participants were told that they would be asked to participate in two different and independent studies; one was an autobiographical memory task (the task used to manipulate discrete moods) and the other was a brainstorming task about possible ways to improve the quality of teaching in the psychology department (the task to assess creativity). Participants were then asked to write down their gender and age and to write a short essay about a situation that happened to them and that made them feel really _____ (depending on discrete mood condition: angry, sad, happy, relaxed). They were given an entire page to report their situation and were asked, after finishing their autobiographical story, to report those keywords or (parts of) phrases they considered vital in making them feel _____ (depending on discrete mood condition: angry, sad, happy, relaxed). In this experiment and in subsequent ones, the content of the stories participants wrote always adhered to instructions. Furthermore, we were unable to discern systematic differences between conditions in length of stories or particular topics participants wrote about.
Upon completion of the mood manipulation task, participants were asked to brainstorm about possible ways to improve the quality of teaching in the psychology department. Participants were reminded that the department attracted more and more new students each year and that this put some pressure on the quality of teaching, “as some of you may have already experienced.” They were further told that the departmental teaching staff was interested in their problem solutions and that they would be given 8 min to type in as many ideas, solutions, or suggestions as they could think of. We emphasized that idea generation would be anonymous and that no one would ever be able to link ideas to names or student identification numbers. Hereafter, participants were asked to start generating ideas. They could type in an idea, and by hitting the Enter key, they could submit this idea and receive a new opportunity to type in an idea. This procedure continued for 8 min, after which participants were informed that the brainstorming session was over, and they were asked to answer a few questions. Then, they were told that the experiment was over, and they were debriefed, paid, and dismissed.

Dependent variables. The ideas, problem solutions, and suggestions generated by the participants were coded and/or transformed into four different components of creativity. First, independent coders counted the number of unique ideas generated per participant (Cohen’s κ = .98). This was our measure of creative fluency. To obtain a measure of originality, independent coders rated each unique idea for originality, defined as “an idea or suggestion that is infrequent, novel, and original” (1 = not original at all to 5 = very original). Interrater agreement was satisfactory by the criteria of Cicchetti and Sparrow (1981; intraclass correlation, ICC[1] = .69), and we used the aggregation across raters as an indicator of originality. To get at cognitive flexibility, we assigned each unique idea to one of the following seven categories: Ideas having to do with (a) university environment, such as (architecture of) lecture halls, seminar rooms, and opening hours; (b) student facilities, such as extracurricular activities, library access, and classroom interiors; (c) student quality, including selecting better students and increasing cooperation and contact among students; (d) teaching materials, such as readers, textbooks, handouts of PowerPoint presentations, examination issues, and grading systems; (e) teachers, such as teacher training and selection, use of teaching evaluations, and use of mentors and coaches; (f) policy, such as scholarships and other financial issues, information distribution, and reduced bureaucracy; and (g) other issues. The higher the number of categories used, the greater the participant’s cognitive flexibility (e.g., Nijstad et al., 2002, 2003). Interrater agreement was good (Cohen’s κ = .71), and differences were resolved through discussion. To get at perseverance, we assessed within-category fluency: the number of unique ideas divided by the number of categories from which these ideas were sampled. To check the manipulation of hedonic tone and level of activation, we asked participants how positive they felt (1 = not positive at all to 5 = very positive) and how activated they felt (1 = not very activated to 5 = very activated).
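The three count-based indices defined above (fluency, flexibility, and within-category fluency) can be illustrated with a minimal sketch. The function name and the example participant are hypothetical; the input is assumed to be one category label per unique idea, as produced by the coding step.

```python
from collections import Counter


def creativity_indices(idea_categories):
    """Compute count-based indices from one category label per unique idea.

    fluency: number of unique ideas
    flexibility: number of distinct categories used
    within_category_fluency: fluency divided by flexibility
    """
    counts = Counter(idea_categories)
    fluency = sum(counts.values())
    flexibility = len(counts)
    # Guard against a participant who generated no ideas at all.
    within = fluency / flexibility if flexibility else 0.0
    return {
        "fluency": fluency,
        "flexibility": flexibility,
        "within_category_fluency": within,
    }


# Hypothetical participant: six unique ideas spread over three categories.
example = ["teachers", "teachers", "policy",
           "teaching materials", "teachers", "policy"]
indices = creativity_indices(example)
print(indices["fluency"], indices["flexibility"],
      indices["within_category_fluency"])  # 6 3 2.0
```

Note that two participants with the same fluency can differ on the other two indices: spreading six ideas over six categories gives high flexibility and low within-category fluency, whereas six ideas in one category gives the reverse.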

Results

Manipulation checks. A 2 (activating vs. deactivating) × 2 (negative tone vs. positive tone) analysis of variance (ANOVA) on self-reported activation revealed only that activating moods (anger, happiness) produced somewhat higher activation than deactivating moods (sadness, relaxation; M = 3.62 vs. M = 3.12), F(1, 54) = 3.78, p < .06 (marginal). The same 2 × 2 ANOVA on self-reported tone revealed only that positive moods (happiness, relaxation) produced more positive feelings than did negative moods (anger, sadness; M = 2.43 vs. M = 1.94), F(1, 54) = 4.12, p < .05. We conclude that our manipulations were successful.

Creative fluency and originality. We submitted the number of unique ideas to a four-level (angry, sad, happy, relaxed) one-way ANOVA. No effects were significant, but a directional test of Hypothesis 1 with planned comparisons showed that more ideas were generated when participants were in an activating mood rather than in a deactivating mood, t(54) = 1.65, p < .05, η² = .05 (see also Table 1). Hedonic tone had no effects (ts < 1). We conclude that creative fluency is a function of the extent to which a mood activates or deactivates (cf. Hypothesis 1). We submitted the averaged originality of ideas to a four-level (angry, sad, happy, relaxed) one-way ANOVA. As predicted, mood influenced originality, F(3, 54) = 3.42, p < .025. A follow-up comparison showed that activating moods (happy, angry) produced more original ideas than did deactivating moods (sad, relaxed), t(54) = 3.12, p < .003, η² = .15 (for cell means, see Table 1). Hedonic tone did not matter: The planned comparison of positive states (happy, relaxed) with negative states (angry, sad) was not significant (M = 2.52 vs. M = 2.39), t(54) < 1, ns, nor was the interaction between tone and activation, F(1, 54) < 1, ns. From these results, we conclude that originality of produced ideas is a function of the extent to which a mood activates or deactivates. This supports Hypothesis 1.

Cognitive flexibility. We submitted the number of categories from which ideas were sampled to a four-level (angry, sad, happy, relaxed) one-way ANOVA. Means were in the predicted direction (see also Table 1), but there were no significant effects to support the hypothesis (Hypothesis 2) that cognitive flexibility is highest among activating, positive moods.

Persistence. A four-level ANOVA on persistence showed a trend for mood, F(3, 54) = 2.44, p < .075. Planned contrasts were computed to examine effects of activation, effects of hedonic tone, and their interaction on persistence. Neither the simple activation contrast nor the simple hedonic tone contrast was significant, t(54) = 1.52, p < .14, η² = .04, and t(54) = −1.05, p < .43, η² = .01, respectively. However, the Tone × Activation contrast was significant, t(54) = 2.66, p < .025, η² = .06, showing that anger produced greater persistence than did the other three mood states (see also Table 1). This supports Hypothesis 3.

Table 1
Means and Standard Deviations for Fluency, Originality, Flexibility, and Perseverance as a Function of Mood (Study 1)

                    Angry          Sad            Happy          Relaxed
Variable            M      SD      M      SD      M      SD      M      SD
Creative fluency    13.32  5.11    10.16  5.67    11.88  5.32    10.42  5.27
Originality         2.73a  0.71    2.06b  0.79    2.77a  0.66    2.26b  0.74
Flexibility         3.65   1.11    3.56   1.19    4.01   1.52    3.46   1.77
Perseverance        3.64a  1.09    2.85b  0.71    2.96b  0.69    3.01b  0.90

Note. Means within a row not sharing the same subscript (a, b) differ significantly at p < .05.

Discussion and Introduction to Study 2

The results support the hypothesis (Hypothesis 1) that activating mood states produce greater creative fluency and originality than do deactivating mood states, and the results support the hypothesis (Hypothesis 3) that activating negative moods produce greater persistence than does any other mood state. One potential limitation of this support, which pertains to the persistence effect in particular, is that effects are tied to one specific mood state (anger). In the next studies, we deal with this by inducing multiple mood states that are similar in tone and activation (e.g., anger and anxiety vs. sadness and depression; elation and happiness vs. relaxation and calm). Replicating support for Hypotheses 1 and 3 would reduce the concern that effects are tied to aspects of a specific mood state other than activation and tone.

Although means were in the predicted direction, Study 1 did not support the hypothesis (Hypothesis 2) that activating positive moods produce greater cognitive flexibility than does any other mood state. Given the strong support in the literature that positive mood states foster cognitive flexibility (e.g., Ashby et al., 1999), the current failure may reflect a Type II error, and a conceptual replication is needed before concluding anything with regard to Hypothesis 2. Accordingly, in Study 2, we asked participants to complete the category inclusion task previously used by Isen and Daubman (1984). We predicted greater category inclusiveness for activating than for deactivating moods when tone is positive (cf. Hypothesis 2). Given that negative tone has been related to narrow perceptual focus (e.g., Derryberry, 1988; Mikulincer et al., 1990), it may be that activating moods produce reduced category inclusiveness when tone is negative. In other words, we expected greater category inclusiveness among activating moods than among deactivating moods when tone is positive rather than negative. We included a mood-neutral control condition in which participants did not do the self-generated imagery and only performed the category inclusion task. This permitted us to explore whether (de)activation and positive (negative) tone promote (inhibit) category inclusiveness.

Method

Design and participants. Undergraduate students (N = 179, 73% women, 27% men) participated for €5 (approximately U.S. $6.50), and they were randomly assigned to one of eight different mood conditions (anger, fear, sadness, depression, happiness, elation, relaxation, calm) or to the mood-neutral control condition. Gender had no effects, and it is not discussed further. Dependent variables were self-reported activation, hedonic tone, and category inclusiveness.

Procedure and manipulation of discrete moods. Participants were seated in individual cubicles and told that they would participate in two independent studies: one about autobiographical memory (the task used to manipulate discrete moods) and one about object recognition (the task used to assess cognitive flexibility). Participants were then given a booklet with instructions about the autobiographical memory study. Discrete moods were manipulated as before, except that participants wrote their essays in separate booklets. In the control condition, the self-generated imagery task was not included, and participants immediately went on with the study about object perception. Upon completion of the mood manipulation task, participants handed in their booklet, and they were asked to turn to their computer to continue with the next experiment about object perception. First, they once again filled in their gender and age (to increase the suggestion that indeed a new and independent experiment had started), and participants were given the category inclusion task to assess their cognitive flexibility (see below). Thereafter, they completed several manipulation checks, and participants were fully debriefed, paid for participation, and dismissed.

Dependent variables. To assess cognitive inclusiveness, participants were asked to rate how prototypical exemplars were of a particular category (1 = not at all to 10 = very prototypical). For each of the four categories we used, three exemplars were presented, one being strongly, one being moderately, and one being weakly prototypical (see Rosch, 1975). Specifically, the four categories (with strong, intermediate, and weak exemplars) were vehicle (bus, airplane, camel), vegetable (carrot, potato, garlic), clothes (skirt, shoes, handbag), and furniture (couch, lamp, telephone). Inclusion ratings across the four categories were aggregated into separate indices for strong, moderate, and weak exemplars (α = .78, .82, and .74, respectively). Cognitive flexibility usually shows up in prototypicality ratings for the weak exemplars more than in ratings for the moderate or strong exemplars (Isen, Daubman, & Nowicki, 1987; Rosch, 1975). Upon completion of the category inclusion task (and, because of an administrative error, in the mood conditions only), we measured hedonic tone by asking participants to rate their affective state on three items (how do you feel: very positive–very negative; very pleasant–very unpleasant; very nice–not at all nice). Ratings were aggregated (α = .89) and coded so that higher scores indicated more positive (and less negative) tone. In addition, we included a measure of activation level. Specifically, we asked participants to rate the following: (a) how energetic do you feel, (b) how engaged are you, and (c) how active are you presently? (1 = not at all to 5 = very much). Ratings were averaged into one activation level index (α = .79).

Results

Manipulation checks. Ratings for the activation level measure were tested in two planned comparisons, one testing all four negative mood states (anger, fear, sadness, depression) against all four positive mood states (happiness, elation, relaxation, calm) and one testing all four deactivating mood states (sadness, depression, relaxation, calm) against all four activating mood states (anger, fear, happiness, elation). Results were as expected: Whereas the hedonic tone contrast was not significant, t(155) < 1, ns, the activation contrast was, t(155) = 1.97, p < .05. Participants reported more activation when activating moods had been induced (M = 4.73) than when deactivating moods had been induced (M = 4.53). Ratings for the tone measure were tested in the same two planned comparisons. Results were as expected: Whereas the activation contrast was not significant, t(155) < 1, ns, the hedonic tone contrast was, t(155) = 4.03, p < .01. Participants reported more positive tone when positive moods had been induced (M = 4.23) than when negative moods had been induced (M = 2.53). We conclude that our manipulations were successful.

Cognitive flexibility. Table 2 gives the mean prototypicality of strong, intermediate, and weak exemplars per condition. Hypothesis 1 was tested in a planned contrast grouping all activating moods versus all deactivating moods. This contrast was not significant for the strong and intermediate exemplars, ts(170) < 1, ns, but was significant for weak exemplars, t(170) = 2.10, p < .037, η² = .03. Prototypicality ratings for weak exemplars were higher in activating mood conditions (M = 6.43) than in deactivating mood conditions (M = 5.98). Hypothesis 2 predicted that activating mood states lead to greater inclusiveness, especially when tone is positive. A directional contrast showed that positive activating moods produced higher inclusiveness (M = 6.51) than did all of the negative mood states and the two deactivating positive mood states (M = 6.10), t(170) = 1.82, p < .035, η² = .025. Furthermore, among the positive moods, the two activating moods produced greater category inclusiveness than did the two deactivating conditions and the control condition, t(170) = 1.98, p < .05, η² = .033, whereas both of the negative activating moods did not produce greater category inclusiveness than did the two deactivating moods and the control condition, t(170) = 1.48, p < .14, η² = .016. However, these patterns notwithstanding, the Tone × Activation contrast was not significant, and Hypothesis 2 received no support; the trend for negative activating moods to produce greater inclusiveness is weaker but is otherwise in the same direction as the trend for positive activating moods.

Comparisons involving the mood-neutral baseline. The conclusions emerging from the above analyses are further supported by specific contrasts involving the mood-neutral control condition (for cell means, see Table 2).
First, activating moods (positive and negative together) produced higher inclusiveness ratings for weak exemplars than did the mood-neutral control condition (M = 6.43 vs. M = 5.71), t(170) = 1.98, p < .05, η² = .04. It is interesting to note that deactivating moods (positive and negative together) did not produce lower inclusiveness ratings for weak exemplars than did the mood-neutral control condition (M = 5.98 vs. M = 5.71), t(170) < 1, ns. Second, consistent with past work (e.g., Isen & Daubman, 1984), we found that happy participants had higher prototypicality ratings for weak exemplars than did control participants, t(170) = 2.03, p < .044, η² = .051. Positive activating moods (happy and elated) produced higher inclusiveness ratings than did the mood-neutral control condition, t(170) = 1.96, p < .05, η² = .031, whereas positive deactivating moods did not produce higher or lower inclusiveness ratings, t(170) < 1, ns. This supports the idea that activating moods promote cognitive flexibility and inclusiveness when tone is positive. However, as mentioned, because the same (nonsignificant) trend emerged for negative activating moods versus deactivating moods, we cannot conclude that Hypothesis 2 received support.
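As a rough check on the group comparisons above, the weak-exemplar cell means reported in Table 2 can be averaged by mood group. This is only a sketch: it takes unweighted means of the cell means (an assumption, since cell sizes are not reported here), so the results can differ slightly from the group means given in the text. The helper name is hypothetical.

```python
# Weak-exemplar prototypicality means per condition, as reported in Table 2.
WEAK = {
    "angry": 6.51, "fearful": 6.22, "depressed": 5.95, "sad": 5.88,
    "happy": 6.63, "elated": 6.38, "relaxed": 6.13, "calm": 5.88,
    "control": 5.71,
}

ACTIVATING = ["angry", "fearful", "happy", "elated"]
DEACTIVATING = ["depressed", "sad", "relaxed", "calm"]
POSITIVE_ACTIVATING = ["happy", "elated"]


def unweighted_mean(conditions):
    """Average the cell means, giving every condition equal weight."""
    return sum(WEAK[c] for c in conditions) / len(conditions)


activating_mean = unweighted_mean(ACTIVATING)
deactivating_mean = unweighted_mean(DEACTIVATING)
positive_activating_mean = unweighted_mean(POSITIVE_ACTIVATING)

# Differences against the mood-neutral control condition.
activating_vs_control = activating_mean - WEAK["control"]
deactivating_vs_control = deactivating_mean - WEAK["control"]
```

Under this equal-weight assumption the activating group sits well above the control mean and the deactivating group only slightly above it, which matches the direction of the contrasts reported in the text.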

Discussion and Introduction to Study 3

Study 2 shows that activating moods increase category inclusiveness. Together with Study 1, we thus have reasonable support for the dual pathway model, which indicates that activating mood states promote creative fluency and originality more than do deactivating mood states and that perseverance is higher among activating moods that are negative in tone (cf. Study 1). Although trends in the data suggested that cognitive flexibility was higher among activating moods that are positive in tone (cf. Study 2), these tendencies for cognitive flexibility were fairly weak: In Study 1, means were as predicted but were not statistically reliable; in Study 2, the critical Activation × Tone interaction was not significant. At present, it cannot be excluded that category diversity (Study 1) and category breadth and inclusiveness (Study 2) reflect not only cognitive flexibility but also persistence and perseverance. Those in an activating positive mood may be cognitively flexible and may, therefore, include peripheral exemplars (e.g., camel) in a particular category (e.g., vehicle). Those in an activating negative mood may persevere and systematically explore possibilities, ultimately concluding that peripheral exemplars fit into a particular category. This possibility implies that those in an activating positive mood are faster than those in an activating negative mood, which indeed fits the results of Isen et al. (1987). In Study 2, we did not track time-on-task and cannot examine this possibility. In Study 3, however, we included time-on-task as a key variable.

The evidence for our model thus far pertains to cognitive and conceptual material (idea generation, cognitive category inclusiveness), and an issue is whether our dual pathway model also predicts perceptual insights and creativity. Creative insight problems differ from the tasks used thus far in that they are soluble, are likely to produce an impasse and a state of high uncertainty as to how to proceed, and are likely to produce a kind of “aha” experience when the impasse is suddenly overcome and the solution is discovered after prolonged efforts at solution (Förster et al., 2004; Schooler et al., 1993). Such tasks can be solved heuristically, through loose and detached processing, which is relatively effortless and fast (Brand et al., 2007). Alternatively, they can also be solved through persevering and analytical probing of a series of hypotheses. This is a relatively effortful and time-consuming process.

From our dual pathway model it follows that activating moods, more than deactivating moods, lead to greater creative fluency and, accordingly, that individuals in activating moods perform better on creative insight tasks (they close more gestalts; see below) than do those in deactivating moods (cf. Hypothesis 1). Because positive affective tone increases cognitive flexibility and restructuring and pairs with a broader visual field, we further expected that individuals in positive activating moods would be able to perform creative insight tasks in relatively short time and would not benefit from longer time-on-task. But, because negative affective tone increases persistence and more effortful processing and pairs with attentional focus, we expected that individuals in negative activating moods benefit from longer time-on-task when performing creative insight tasks. Put differently, whereas we did not expect differences in creative fluency between positive and negative mood states, we did expect longer time-on-task to associate with creative fluency among (activating) negative mood states more than among (activating) positive mood states.

Table 2
Mean Prototypicality Ratings as a Function of Experimental Manipulations (Study 2)

                Experimental condition
Exemplars       Angry  Fearful  Depressed  Sad    Happy  Elated  Relaxed  Calm   Control
Strong          9.64   9.42     9.57       9.68   9.48   9.56    9.56     9.75   9.62
Intermediate    7.53   7.71     7.86       7.76   7.36   6.93    6.90     7.17   7.25
Weak            6.51   6.22     5.95       5.88   6.63   6.38    6.13     5.88   5.71

Note. Higher numbers indicate greater category inclusiveness.

Method

Design and participants. Undergraduate students (N = 90, 66% women, 34% men) participated for €5 (approximately U.S. $6.50) and were randomly assigned to one of eight different mood conditions (anger, fear, sadness, depression, happiness, elation, relaxation, calm) or to the mood-neutral control condition. Gender had no effects, and it is not discussed further. Dependent variables were manipulation checks, number of correctly closed gestalts, and time-on-task.

Procedures, mood manipulations, and creativity task. These were the same as in Study 2, except that all materials were provided through computers, and responses had to be given using a keyboard and a computer mouse. Furthermore, to enhance comparability between mood conditions, we also asked participants in the mood-neutral control condition to perform a task about autobiographical memory. Participants were asked to write a short essay about the route they took to the psychology department. They were specifically asked to pay attention to the buildings they passed and to write their essay in such a way that another person could imagine the route they took. After finishing their autobiographical story, they were asked to report the major building that they passed. Finally, we replaced the category inclusion task with the gestalt completion task, adapted from Förster et al. (2004), which involves recognizing fragmented pictures of familiar objects. After the gestalt completion task, participants answered a short questionnaire, were debriefed, and were paid for participation.

Dependent variables. The hedonic tone and activation manipulations were checked, as in Study 2. We coded each response to a gestalt as correct, incorrect, or missed. Although we had 10 gestalts, initial analyses revealed one picture to be unsolvable (less than 30% correctly closed, and over 50% missed). We decided to base analyses on the remaining 9 pictures (including the tenth gestalt produced similar results and identical conclusions). For each gestalt, we tracked the time in seconds between the appearance of the gestalt on the computer screen and the response (either a word or a hard return indicating a miss). The total time across the nine gestalts served as our second dependent measure.
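The time-on-task measure described above, together with the log transformation applied to it in the analyses, can be sketched as follows. The base-10 logarithm is an assumption (the base is not stated), and the function name and response times are hypothetical.

```python
import math


def log_total_time(times_in_seconds):
    """Sum per-gestalt response times and return the base-10 log of the total.

    The base of the logarithm is an assumption made for this sketch.
    """
    total = sum(times_in_seconds)
    if total <= 0:
        raise ValueError("total time must be positive to take a logarithm")
    return math.log10(total)


# Hypothetical response times (seconds) for the nine retained gestalts.
example_times = [5.0, 8.0, 12.0, 7.0, 9.0, 15.0, 20.0, 14.0, 10.0]
example_score = log_total_time(example_times)  # log10 of 100 seconds
print(example_score)  # 2.0
```

A log transformation of this kind compresses the long right tail that response-time totals typically show, which is the skewness issue the analyses address.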


Results

Manipulation checks. Ratings for the activation level measure were tested in two directional comparisons. The first tested all four negative mood states (anger, fear, sadness, depression) against all four positive mood states (happiness, elation, relaxation, calm), and the second tested all four deactivating mood states (sadness, depression, relaxation, calm) against all four activating mood states (anger, fear, happiness, elation). Results were as expected: Whereas the hedonic tone contrast was not significant, t(81) < 1, ns, the activation contrast showed a trend in the predicted direction: Participants reported more activation when activating moods had been induced (M = 3.35) than when deactivating moods had been induced (M = 2.69), t(81) = 1.53, p < .06. The control condition fell in between (M = 3.11) and did not differ from the activating or deactivating mood conditions, ts(81) < 1, ns. For the tone measure, results were also as expected: Whereas the activation contrast was not significant, t(81) < 1, ns, the hedonic tone contrast was, t(81) = 2.04, p < .025. Participants reported more positive tone when positive moods had been induced (M = 2.89) than when negative moods had been induced (M = 2.53). The control condition fell in between (M = 2.69) and did not differ from either the positive or the negative mood conditions, both ts(81) < 1.20, ns.

Creative fluency. The number of correctly closed gestalts was analyzed using the same set of a priori contrasts as used in Study 2. Means and standard deviations, broken down for experimental condition, are given in Table 3. The planned comparison grouping all positive mood states versus all negative mood states was not significant, t(81) < 1, ns. However, consistent with Hypothesis 1, a planned contrast grouping all activating moods versus all deactivating moods was significant, t(81) = 2.13, p < .036.
Participants in activating mood conditions had more correctly closed gestalts (M = 7.02) than did those in deactivating mood conditions (M = 6.25). Directional tests within the negative mood states showed that activating moods produced more correct responses than did deactivating moods, t(81) = 1.85, p < .05. Likewise, within the positive mood states, activating moods produced more correct responses than did deactivating moods, t(81) = 1.75, p < .05.²

² Differences in correctly closed gestalts may be due to differences in incorrectly closed gestalts, and/or differences in number of nonresponses. For incorrect responses, no a priori contrasts were significant, ts(81) < 1, but participants in the activating mood conditions tended to miss fewer than did those in the deactivating mood conditions (M = 1.10 vs. M = 1.73), t(81) = −1.84, p < .07. This suggests that the lower number of correct closures in the deactivating mood conditions is due to a higher number of missed responses. Furthermore, inspection of Table 3 may suggest that anger and fear differ in terms of correctly closed gestalts and in number of misses. Statistically, however, this is not the case, ts(81) = 0.92 and −1.32, ps > .22, respectively.

Table 3
Means and Standard Deviations for Creative Performance and Time-on-Task as a Function of Experimental Manipulations (Study 3)

Experimental condition   Correct        Missed         Time-on-task   Time for correctly closed gestalts
Angry                    7.36 (1.20)    0.90 (0.90)    1.80 (0.13)    1.68 (0.21)
Fearful                  6.70 (0.94)    1.40 (0.69)    1.79 (0.12)    1.64 (0.15)
Depressed                6.11 (1.73)    1.67 (1.50)    1.74 (0.16)    1.54 (0.17)
Sad                      6.37 (1.19)    2.00 (1.85)    1.78 (0.08)    1.53 (0.21)
Happy                    7.00 (1.00)    1.34 (1.13)    1.73 (0.12)    1.57 (0.12)
Elated                   7.00 (1.41)    1.25 (1.04)    1.78 (0.18)    1.58 (0.13)
Relaxed                  6.00 (2.01)    1.90 (1.37)    1.79 (0.18)    1.58 (0.14)
Calm                     6.55 (1.58)    1.58 (1.48)    1.76 (0.14)    1.57 (0.15)
Control                  6.67 (0.89)    1.53 (0.99)    1.80 (0.17)    1.57 (0.15)

Note. Entries are M (SD). Data for time-on-task and time on correctly closed gestalts are log-transformations of seconds across nine trials.

Cognitive flexibility and persistence. The time participants needed to correctly close the gestalts was log-transformed to deal with skewness. A 2 × 2 (Tone × Activation) ANOVA on log-transformed time revealed the expected interaction between tone and activation, F(1, 71) = 2.69, p < .10 (marginal).³ Participants spent a significantly longer time on the task in the negative activating mood conditions than in the negative deactivating mood conditions (M = 4.01 min vs. M = 3.11 min), F(1, 71) = 3.98, p < .05, and a nonsignificantly shorter time in the positive activating mood conditions than in the positive deactivating mood conditions (M = 3.01 vs. M = 3.22, F < 1). It thus appears that longer time-on-task benefits participants in activating negative moods, who tend to persist, but not those in activating positive moods (see Table 3 for log-transformed overall time-on-task, and time needed to correctly close).

A complementary perspective is obtained by regressing creative fluency (number of correct responses) on level of activation, hedonic tone, time-on-task, and their interactions. This produced a significant regression model, R² = .16, F(6, 68) = 2.23, p < .05. Consistent with the contrast analyses reported before, the main effects for hedonic tone (β = −.039, t < 1) and for time-on-task (β = −.11, t < 1) were not significant, whereas the activation main effect was (β = .21, t = 1.98, p < .05). Furthermore, the interaction between hedonic tone and time-on-task was significant in the activating mood conditions (β = −.52, t = −3.72, p < .001) and not in the deactivating mood conditions (β = −.15, t < 1). Among activating negative mood states, longer time-on-task associated with more correct responses (β = .32, p < .05); among activating positive mood states, shorter time-on-task associated with more correct responses (β = −.76, p < .01). This pattern of results strongly suggests that cognitive processes underlying creative performance qualitatively differ between positive and negative activating mood states. This is consistent with our notion that positive tone impacts creative performance because it allows for cognitive flexibility and set-breaking (cf. Hypothesis 2), whereas negative tone impacts creative performance because it engenders cognitive persistence and perseverance (cf. Hypothesis 3). Furthermore, results suggest that participants in a negative activating mood profited from longer time-on-task, whereas those in a positive activating mood or those in a (positive or negative) deactivating mood did not.
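The Tone × Time-on-Task interaction within the activating moods can be read as a pair of simple slopes: the effect of time-on-task on correct responses, evaluated separately at negative and at positive tone. The sketch below shows that logic only. The coefficients are hypothetical placeholders (they are not the article's estimates), and the -1/+1 coding of tone is an assumption.

```python
def simple_slope(b_time, b_interaction, tone_code):
    """Slope of time-on-task at a given tone value.

    Assumes a model of the form
        correct = b0 + b_tone*tone + b_time*time + b_interaction*tone*time,
    so the slope of time is b_time + b_interaction * tone.
    """
    return b_time + b_interaction * tone_code


# Hypothetical coefficients chosen only to illustrate a negative interaction.
B_TIME = -0.2
B_TONE_X_TIME = -0.5

# Tone coded -1 (negative) and +1 (positive); this coding is an assumption.
slope_negative_tone = simple_slope(B_TIME, B_TONE_X_TIME, -1)
slope_positive_tone = simple_slope(B_TIME, B_TONE_X_TIME, +1)
```

With a negative interaction coefficient, the slope of time-on-task is positive when tone is negative and negative when tone is positive, which is the qualitative pattern the text reports for the activating mood conditions.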

Discussion and Introduction to Study 4

Across a variety of tasks, results showed that activating moods produce more creative fluency and originality than do deactivating moods. We also found that higher creativity associated with enhanced perseverance in the case of negative tone. However, with regard to the idea that cognitive flexibility is enhanced in the case of activating positive moods, evidence was less strong and, in Study 1, statistically not reliable. Furthermore, we did not test the idea that cognitive flexibility (persistence) mediates between positive (negative) activating moods on the one hand and creative fluency and originality on the other. In Study 4, we used the brainstorming task of Study 1 and, to enable formal tests of mediation, engaged a much larger number of participants. We expected higher creative fluency and originality among activating moods than among deactivating moods to be due to greater cognitive flexibility when mood states are positive in tone (Hypothesis 4) and to greater persistence when mood states are negative in tone (Hypothesis 5).

Another goal of Study 4 was to replicate results with a different assessment of mood states. Whereas the first three studies provided good evidence for the causal effects of discrete moods, we cannot exclude a monomethod/operation bias, that is, the possibility that our findings are limited to the specific ways we manipulated mood states in Studies 1 to 3. In Study 4, we therefore used a different method: Participants rated their current mood state on a number of adjectives that were grouped according to their being positive in tone or negative in tone and, independently, activating or deactivating. These four dimensions were correlated with creativity indices.

Footnote 3. An alternative approach would be to compute planned comparisons and to use the overall error terms and degrees of freedom (i.e., those of the control condition as well). Doing so yields a highly significant contrast of negative activating moods against all others, t(81) = 2.28, p < .025, or against the negative deactivating moods, t(81) = 2.27, p < .03. No such effects were obtained when activating positive moods were contrasted against all others or against positive deactivating mood, ts(81) < 1, ns.

Method

Design and participants. We used a correlational design with measures of discrete moods as predictor variables and brainstorming performance (creative fluency, originality, cognitive flexibility, and within-category fluency) as dependent variables. Participants were 546 first-year psychology students (74% women, 26% men) at the University of Amsterdam. They participated for partial fulfillment of a course requirement.

Procedure and independent variables. The study was included in mass testing sessions (approximately 50 participants per session). Participants were seated in large lecture halls behind personal computers, which displayed all materials. Responses to questions could be typed in using the computer keyboard. Participants were not allowed to talk and were required to work individually, at their own pace, and without consulting others. Experimenters supervised testing sessions and, when necessary, helped participants or enforced the above rules (this happened rarely). Discrete moods were assessed by asking participants to complete a series of items that we derived from the PANAS (Watson et al., 1988) or generated for the specific purpose of this study. In total, participants indicated for each of 29 mood items how much of the mood they had experienced since they got up that morning (1 = not at all to 5 = very much so). Thereafter, supposedly as part of a new and unrelated testing session, we introduced the brainstorming task (for further detail, see the Method section of Study 1). When time was over, participants were informed that the test was completed, and they continued with another, unrelated test. At the end of the semester, all participants received a written debriefing along with a mailing address for further questions, and a complaint form to be submitted when they did not want their data to be used (no additional questions or complaints were received).

Independent variables. Initial factor analysis of the mood ratings revealed a six-factor solution, with four factors being readily interpretable and two factors grouping 5 items that had high cross-loadings with other factors. These 5 items were dropped, and the remaining 24 items were submitted to a principal component analysis.
Because, in theory, dimensions could be correlated, we applied oblimin rotation with Kaiser normalization. As expected, we found a four-factor solution, explaining a total of 62% of the variance. Table 4 summarizes the factor loadings and cross-loadings for all items on all four factors. The first factor groups negative activating moods (e.g., angry, guilty), the second factor groups positive activating moods (e.g., happy, elated), the third factor groups negative deactivating moods (e.g., depressed, discouraged), and the fourth factor groups positive deactivating moods (e.g., calm, relaxed). Ratings within each factor were averaged to form one index. Internal reliabilities (Cronbach’s alphas) were acceptable to good (see the Results section).

Dependent variables. The ideas, problem solutions, and suggestions generated by the participants were coded and/or transformed into the same components of creativity as used in Study 1 (i.e., creative fluency, originality, cognitive flexibility, and perseverance; .76 < Cohen’s κ < .98). For originality, interrater agreement was satisfactory, ICC(1) = .67, and we used the aggregation across raters as an indicator of originality.
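The index construction described above (averaging the ratings within a factor and reporting Cronbach's alpha) can be illustrated with a short sketch. The function names `cronbach_alpha` and `scale_index` and the toy ratings are hypothetical; the formula is the standard one, alpha = k/(k-1) × (1 − sum of item variances / variance of the summed score), and nothing here reproduces the study's data.

```python
import numpy as np

def cronbach_alpha(items):
    """Cronbach's alpha for one scale.

    items: 2-D array, rows = participants, columns = items of the scale.
    """
    items = np.asarray(items, dtype=float)
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1)       # variance of each item
    total_variance = items.sum(axis=1).var(ddof=1)   # variance of summed score
    return k / (k - 1) * (1.0 - item_variances.sum() / total_variance)

def scale_index(items):
    """Average the item ratings of each participant into one index."""
    return np.asarray(items, dtype=float).mean(axis=1)

# Toy ratings on a 1-5 scale: five participants, three items (made-up values).
ratings = np.array([
    [1, 2, 1],
    [2, 2, 3],
    [4, 5, 4],
    [3, 3, 4],
    [5, 4, 5],
])

alpha = cronbach_alpha(ratings)
index = scale_index(ratings)
print(round(alpha, 3), index)
```

In the study, this computation would be applied separately to the items of each of the four mood factors.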

Results

Table 5 gives the descriptive statistics for all study variables. As can be seen, we found moderate to strong correlations among the four mood dimensions and strong correlations among our four indicators of creativity. Zero-order correlations between mood dimensions and indicators of creativity were low and generally nonsignificant.


Table 4
Factor Solution and Loadings for Mood Items (Study 4)

Mood item     Factor I    Factor II   Factor III  Factor IV
Disgusted     .76         -.31        .29         -.22
Fearful       .75         -.31        .34         -.57
Ashamed       .74         -.22        .48         -.22
Disdainful    .73         -.26        .33         -.12
Worried       .71         -.20        .28         -.43
Afraid        .71         -.28        .33         -.59
Guilty        .68         -.21        .46         -.27
Angry         .63         -.17        .38         -.37
Upset         .62         -.28        .57         -.54
Happy         -.40        .83         -.39        .40
Elated        -.21        .79         -.17        .21
Excited       -.14        .75         -.18        .08
Drained       .36         -.30        .85         -.27
Lifeless      .27         -.40        .79         -.24
Fatigued      .26         -.14        .78         -.23
Depressed     .36         -.37        .72         -.37
Discouraged   .48         -.35        .64         -.35
Failed        .46         -.24        .62         -.53
Sad           .58         -.33        .62         -.51
Calm          -.37        .24         -.37        .86
Relaxed       -.34        .54         -.36        .77
At ease       -.51        .36         -.37        .65
Eigenvalue    9.59        1.97        1.57        1.10
% variance    41.68       8.59        6.77        4.76

Note. Numbers are factor loadings. Factor loadings in bold within one column are grouped together in subsequent analyses. Factor I = negative activating moods; Factor II = positive activating moods; Factor III = negative deactivating moods; Factor IV = positive deactivating moods.

Creative fluency and originality. To test Hypothesis 1, we regressed creative fluency and originality on the four mood dimensions. Results are summarized in Table 6 and showed, first of all, that both negative and positive activating moods predicted creative fluency. Second, inspection of the regression weights further reveals that positive activating moods significantly predicted originality. Because neither positive nor negative deactivating mood states were related to creative fluency and originality, these results provide new support for the hypothesis (Hypothesis 1) that activating mood states associate with more fluency and originality than do deactivating mood states.

Cognitive flexibility and perseverance. For cognitive flexibility (i.e., category diversity), regression weights in Table 6 revealed that only positive activating moods predicted flexibility; no other predictor was significant. This supports the hypothesis (Hypothesis 2) that activating moods promote cognitive flexibility, especially when tone is positive. Regression weights in Table 6 also reveal that only negative activating moods predicted within-category fluency. This supports the hypothesis (Hypothesis 3) that activating moods lead to greater persistence, especially when tone is negative.

Mediation tests. To test for mediation (i.e., Hypotheses 4 and 5), we computed a series of regressions along the criteria set forth by Kenny, Kashy, and Bolger (1998). We first tested whether cognitive flexibility mediates the effects of positive activating moods on creative fluency and originality (Hypothesis 4). When we regressed originality on positive activating moods after controlling for flexibility, the originally significant effect of positive activating


Table 5
Descriptive Statistics for Dependent and Independent Variables (Study 4)

Variable                        M     SD    1        2        3        4        5        6        7        8
1. Negative activating moods    2.03  0.69  .88      -.45***  .71***   -.59***  .06      .01      .11      -.01
2. Positive activating moods    3.57  0.72           .80      -.49***  .61***   .07      .09*     -.01     .07†
3. Negative deactivating moods  2.58  0.81                    .88      -.56***  .02      .01      .03      .01
4. Positive deactivating moods  3.67  0.78                             .81      -.02     .02      -.06     .02
5. Creative fluency             5.06  3.66                                      -        .75***   .63***   .56***
6. Flexibility                  2.30  1.49                                               -        .08†     .76***
7. Within-category fluency      1.80  0.97                                                        -        .05
8. Originality                  2.68  0.77                                                                 -

Note. N = 546. Scale reliabilities (Cronbach’s α) are on the diagonal. Dashes indicate that there is no scale reliability to report. † p < .10. * p < .05. ** p < .01.

moods dropped to a nonsignificant level (β = .01, t < 1), whereas flexibility was highly significant (β = .76, t = 26.90, p < .001). A Sobel test confirmed that the mediation was significant, z = 2.45, p < .015. In other words, consistent with Hypothesis 4, flexibility fully mediated the effect of positive activating moods on originality (see also Figure 2a). We also explored whether positive activating moods relate to higher fluency because of greater cognitive flexibility; this was not the case. We further examined whether persistence (i.e., within-category fluency) mediated effects of negative activating moods on creative fluency. When we regressed creative fluency on negative activating moods after controlling for within-category fluency, the initially significant effect of negative activating moods dropped to a nonsignificant level (β = .02, t < 1), whereas the effect of within-category fluency was highly significant (β = .63, t = 18.93, p < .001). A Sobel test confirmed that the mediation was significant (z = 2.46, p < .015). In other words, consistent with Hypothesis 5, perseverance fully mediated the effect of negative activating moods on creative fluency (see also Figure 2b).
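The Sobel statistic used in these mediation tests has the form z = ab / sqrt(b²·SE_a² + a²·SE_b²), where a is the weight of the mood dimension on the mediator and b is the weight of the mediator on the outcome with mood controlled. The sketch below is an approximation under stated assumptions: it plugs in the standardized weights and t values reported in Table 6 and in the text, and it backs out each standard error as SE = β/t. The article does not say which inputs were actually used, and the helper names `sobel_z` and `se_from_t` are hypothetical.

```python
import math

def sobel_z(a, se_a, b, se_b):
    """Sobel test statistic for the indirect effect a * b."""
    return (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)

def se_from_t(coef, t_value):
    """Standard error implied by a coefficient and its t value (assumption)."""
    return abs(coef / t_value)

# Positive activating moods -> flexibility -> originality.
# a: beta = .14, t = 2.47 (Table 6); b: beta = .76, t = 26.90 (text).
z_flexibility = sobel_z(0.14, se_from_t(0.14, 2.47),
                        0.76, se_from_t(0.76, 26.90))

# Negative activating moods -> within-category fluency -> creative fluency.
# a: beta = .16, t = 2.48 (Table 6); b: beta = .63, t = 18.93 (text).
z_persistence = sobel_z(0.16, se_from_t(0.16, 2.48),
                        0.63, se_from_t(0.63, 18.93))

print(round(z_flexibility, 2), round(z_persistence, 2))
```

Under these assumptions both values land close to the reported z = 2.45 and z = 2.46. Because each standard error is derived from the same t value, the statistic reduces to t_a·t_b / sqrt(t_a² + t_b²), which shows that the weaker of the two paths (here, path a) largely determines the size of z.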

Discussion

Activating mood states related to a greater overall number of unique ideas and, when mood states were positive, to higher levels of originality. In the case of negative tone, results further showed that activating moods have their effects on creative performance (i.e., creative fluency) because they enhance within-category persistence. It should be noted though that fluency necessarily correlates with both within-category persistence and category diversity (i.e., multiplying these results in creative fluency). However, negative activating moods only affected creative fluency through increased within-category fluency, and neither effect of negative activating moods on category diversity nor mediation of category diversity was found. In the case of positive tone, results showed that activating moods have their effects on originality because of greater cognitive flexibility. These findings fit well with those of the previous studies and support our theoretical framework. Further, Study 4 showed that creativity was related to activating moods and not to deactivating moods. This suggests that activation stimulates creative performance rather than that deactivation undermines creative performance.

Conclusions and General Discussion

In their Annual Review of Psychology article, Brief and Weiss (2002, p. 297) stated,

It is apparent that discrete emotions are important, frequently occurring elements of everyday experience. Even at work, perhaps especially at work, people feel angry, happy, guilty, jealous, proud, etcetera. Neither the experiences themselves, nor their consequences, can be subsumed easily under a simple structure of positive or negative states.

Quite consistent with this observation, the current study indeed showed that positive and negative mood states differentiate in terms of activating or deactivating nature (cf. Russell & Barrett,

Table 6
Regression of Cognitive Flexibility, Creative Fluency, Within-Category Fluency, and Originality on Positive Activating, Positive Deactivating, Negative Activating, and Negative Deactivating Moods (Study 4)

                              Creative fluency   Flexibility        Within-category fluency   Originality
Predictor variable            β       t          β       t          β       t                 β       t
Negative activating moods     .12     2.00*      .02     <1         .16     2.48**            -.01    <1
Positive activating moods     .13     2.29**     .14     2.47**     .04     <1                .11     1.98**
Negative deactivating moods   -.02    <1         .04     <1         -.08    -1.30             .06     <1
Positive deactivating moods   -.03    <1         -0.2    <1         -.65    <1                -.02    <1
R²                            .02*               .015               .02*                      .01

Note. N = 545. * p < .05. ** p < .025.


Figure 2. A: Path of positive activating mood states on originality mediated by cognitive flexibility (category diversity). B: Path of negative activating mood states on creative fluency mediated by cognitive perseverance (within-category fluency). Numbers in brackets are regression weights after the mediator has been controlled for. *p < .05; ***p < .01.

Panel A: Positive activating moods to cognitive flexibility, β = .14*; cognitive flexibility to originality, β = .75***; positive activating moods to originality, β = .11* (.01).

Panel B: Negative activating moods to within-category fluency, β = .16*; within-category fluency to number of unique ideas (fluency), β = .63***; negative activating moods to number of unique ideas (fluency), β = .12* (.02).

1999), and our results indicate that when it comes to creative performance, both activation and hedonic tone are important. Across four studies, findings were consistent with our dual pathway to creativity model, which indicates that only activating, and not deactivating, mood states lead to higher levels of creative fluency and originality, that activating positive mood states lead to creativity through higher levels of cognitive flexibility, and that activating negative mood states lead to higher creativity through increased perseverance within thought categories and longer time-on-task. Below, we discuss implications of these findings for research on mood and on creativity. We also discuss some limitations to our findings and highlight avenues for future research.

Summary of Results and Theoretical Implications

From our dual pathway to creativity model, we derived five hypotheses about the effects of mood states on particular facets of creativity. Hypothesis 1, predicting higher levels of creativity when moods are activating rather than deactivating, received strong support: With regard to creative fluency, it was supported in all three tests (i.e., Studies 1, 3, and 4), and with regard to originality, it was supported in two out of two tests (i.e., in Studies 1 and 4). Hypothesis 3, that negative activating moods positively associate with cognitive persistence, also received good support; direct evidence was obtained in Studies 1 and 4, and indirect evidence was obtained in Study 3. Hypothesis 2, that activating positive moods primarily associate with higher levels of cognitive flexibility, received less support; no evidence was obtained in Studies 1 and 2, indirect evidence was obtained in Study 3, and only in Study 4 were statistical tests supportive. This notwithstanding, Study

4 provided good evidence for mediation Hypotheses 4 and 5: Negative activating moods related to higher fluency because of increased persistence, whereas positive activating moods related to higher originality because of increased flexibility. We thus take these results as quite supportive of our dual pathway to creativity model and its specific application to the mood– creativity link. All in all, results support four conclusions. First, activating moods lead to more creativity than do deactivating moods, most likely because activation stimulates creativity rather than because deactivation undermines it. Second, activating moods with positive tone lead to creative performance through enhanced cognitive flexibility and inclusiveness. Third, activating moods with negative tone lead to creative performance through enhanced cognitive perseverance and persistence. Fourth, and finally, the effects of mood on creativity cannot solely be understood in terms of activation or in terms of hedonic tone; both dimensions are needed to understand how mood states influence creative performance. As discussed below, these conclusions imply that different dimensions of creative performance, such as cognitive flexibility, inclusiveness, or perseverance, cannot and should not be used interchangeably. Further, these conclusions imply that the task used to study creativity may determine the likelihood that some traits or states do, and others do not, appear to successfully predict creativity.

Mood States and Creativity

We began this research with the observation that there seems to be general consensus that positive affect leads to more creativity. Contemporaries tend to explain this effect in terms of hedonic tone. For example, Ashby et al. (1999) noted,

There is substantial reason to believe that affect and arousal are not synonymous . . . and that the increases in cognitive flexibility and creative problem solving reported in so many articles are indeed due to positive affect, not simply to increases in arousal. (p. 532)

Current findings qualify these conclusions. When it comes to discrete mood states, we noted that not only hedonic tone but also activation matters and that tone and activation may take on different roles in the creativity process—activation determines the likelihood of creative performance, and tone determines whether creative performance comes about because of enhanced cognitive flexibility (in the case of positive tone) or because of enhanced perseverance and persistence (in the case of negative tone). As mentioned in footnote 2, this is not the first study to examine the role of activation in the mood– creativity link. Indeed, a number of other studies focused on the role of arousal, typically induced through some form of physical exercise. This past work produced inconsistent results, sometimes showing that physical exercise produces more creativity than no exercise and sometimes showing that it has no effects. Obviously, there are important differences between activation induced through physical exercise and activation associated with a particular mood state. This notwithstanding, it is important to note that past work on physical exercise and creativity did not differentiate between cognitive flexibility and persistence and did not examine possible interactions with hedonic tone—for some participants, physical exercise may have been a pleasant task, putting them in a good mood (happy, upbeat, relaxed) and thus, at best, facilitating cognitive flexibility. For some participants, however, physical exercise may have been an unpleasant task that put them in a negative mood (upset, frustrated, worried, depressed) and thus, at best, facilitating cognitive perseverance. Seen this way, it is not surprising that past work on physical exercise produced inconsistent results. 
Future work on (physical) activation needs to take into account the possible side effects that manipulations have on participants’ mood as well as the dependent variables assessed (flexibility and/or persistence).

Related to this is that past work has revealed an inverted U-shape relationship between level of activation and arousal on the one hand and cognitive and motor performance on the other. Thus, at very low or extremely high levels of activation and arousal, working memory capacity is much lower and relevant brain regions function less effectively than at moderate levels of activation and arousal (cf. Yerkes & Dodson, 1908). An implicit assumption in our work thus far has been that the variation in activation related to particular mood states is in the lower range of this inverted U-shape relation; only intense emotions may temporarily produce the level of activation and arousal that shuts down the system and prohibits people from performing cognitive and motor tasks. Clearly, research is needed to further examine this issue. It would be particularly interesting to see whether exceedingly high levels of activation and arousal undermine cognitive flexibility and persistence to the same degree or to different degrees. Intuitively, it seems that flexibility is more vulnerable than persistence but, once again, research is needed to examine this further.

That mood states impact creativity through different routes (cognitive flexibility or perseverance) has an important methodological implication. Some tasks used in creativity research, such as Rosch’s category inclusion task, capitalize on cognitive flexibility, divergent thinking, and the use of broad and inclusive cognitive categories (cf. Murray et al., 1990). The present work shows that in such tasks positive moods have an advantage over negative moods in producing creative ideas, insights, and problem solutions. Other work has relied on tasks such as brainstorming that allow creativity through persistence and perseverance to come about. The present analysis shows that in such tasks negative moods have an equal or perhaps even better chance than positive moods of predicting creative performance. In short, an important insight that derives from our research is that the creativity task used may be a critical moderator of the relationship between mood (or any other trait or state for that matter) and creativity.

The currently proposed dual pathway to creativity model captures past and current findings on the effects of positive moods on creativity quite well. However, things are less clear cut when considering the effects of negative moods, most notably those for sadness. Although current findings are supportive of the idea that sadness (a negative and deactivating mood state) neither produces nor inhibits creative performance, past work has revealed that sadness can actually stimulate creativity. For example, when the task is being framed as serious, important, and extrinsically rewarding, sadness leads to more creativity than do mood-neutral control conditions (Gasper, 2003; Hirt et al., 1997; also see Martin & Stoner, 1996). One could argue that such task framing is motivating and activating and, as such, is doing what sad people need: They need to be activated to perform because their mood state in and by itself will not drive them toward (creative) performance. Although we believe that the dual pathway to creativity model has promise, we readily accept that invoking moderators may be needed to understand how particular (mood) states influence creative performance.
Important moderators may include task framing and, as we elaborate on below, the specific creativity task used. And although the current analysis focused on hedonic tone and activation as critical dimensions underlying discrete mood states, mood states differ on other dimensions as well, and these may meaningfully relate to creativity. For example, Higgins (2006) has argued that some mood states, such as happiness and anger, associate with approach motivation and promotion focus, whereas other moods, such as fear and feeling relaxed, associate with avoidance motivation and prevention focus (also see Amodio, Shah, Sigelman, Brazy, & Harmon Jones, 2004; Carver, 2004; Higgins, Shah, & Friedman, 1997). Promotion focus relates to more creativity than prevention focus (Friedman & Förster, 2001), and this together may suggest that mood states associated with promotion focus produce more creativity than do mood states associated with prevention focus. Future research may delve further into these possibilities, keeping in mind that a combination of hedonic tone, activation, and, perhaps also, regulatory focus better explains creative performance than do any of these dimensions alone.

Study Limitations and Avenues for Future Research

Before concluding, a few limitations of our study design need comment. First of all, our evidence for mediation is based on correlational designs, and future work is needed to unequivocally establish the causal links. Second, the support for our dual pathway model was stronger and more consistent for the negative activating mood–persistence–creativity pathway than for the positive activating mood–flexibility–creativity pathway. To some, this may be surprising because quite some evidence has been gathered showing that positive tone relates to cognitive flexibility. Recent work by Hirt, Devers, and McCrea (2008) invoked hedonic contingency theory (Wegener & Petty, 1994), which posits that individuals in a positive mood use greater scrutiny in activity choice than do those in neutral or negative moods because fewer activities will be able to maintain or improve their current mood. Hirt, Devers, and McCrea showed that positive mood is indeed related to cognitive flexibility to the extent that it allowed mood maintenance, and their results thus suggest that positive mood effects may be limited to tasks that allow participants to maintain their positive feeling. In a way, this work is consistent with the more general idea underlying the current work that specific tasks may facilitate or inhibit mood effects on creativity because some tasks provide more room for persistence and other tasks, for cognitive flexibility to come about.

Future research could more systematically explore the role of task environment (also see Kaufmann, 2003). For example, some studies on visual perception and set-breaking tend to provide limited time (e.g., 3 min; e.g., Förster et al., 2004) to complete the task. Our Study 3 revealed that participants in an activating negative mood benefited from longer time-on-task (and spent, on average, more than 3 min), whereas those in an activating positive mood did not. An implication of our work thus is that setting time limits may lead to misguided conclusions about the creative potential of particular states or traits.

A final avenue for future research is to analyze creative fluency and originality as a function of variables other than mood. We already discussed regulatory focus and global versus local information processing tendencies (cf. Förster et al., 2004; Friedman & Förster, 2001).
Other candidates for such analyses are the role of intrinsic motivation versus extrinsic motivation (Amabile, 1983), achievement motivation, and traits such as openness to experience and conscientiousness (e.g., McCrae, 1987). It would be interesting to examine to what extent these and other variables known to affect creative fluency and originality do so because of enhanced flexibility, greater persistence, or some combination.

Concluding Thoughts Creativity researchers have long argued that positive mood increases creative performance and have implicitly or explicitly assumed this to be due to enhanced cognitive flexibility and reliance on broad, inclusive cognitive categories. Our results supported this idea and provided first time evidence for the notion that effects of positive mood states are limited to activating moods. Creativity researchers have long struggled with the effects of negative moods on creativity, with some arguing and finding that negative moods undermine creativity and others arguing that it enhances creative performance. The present work clarified, first of all, that negative moods enhance creative performance when mood states are activating rather than deactivating. Second, our results permit the conclusion that negative activating moods lead to creative performance because of enhanced cognitive perseverance and persistence more than because of cognitive flexibility and inclusiveness. Thus, provided some activation, both positive and negative moods engender creative performance, but through cognitive flexibility and cognitive perseverance, respectively. As such, our work suggests that Edison’s famous quote that creativity is


99% perspiration and 1% inspiration may reflect not only that Edison had apt intuition about the psychology of creativity but also that Edison resembled an angry young man more than a happy camper.


Received February 9, 2007
Revision received December 7, 2007
Accepted December 7, 2007

Journal of Personality and Social Psychology 2008, Vol. 94, No. 5, 757–776

Copyright 2008 by the American Psychological Association 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.757

Judgments of the Lucky Across Development and Culture

Kristina R. Olson
Harvard University

Yarrow Dunham
University of California, Merced

Carol S. Dweck
Stanford University

Elizabeth S. Spelke and Mahzarin R. Banaji
Harvard University

For millennia, human beings have believed that it is morally wrong to judge others by the fortuitous or unfortunate events that befall them or by the actions of another person. Rather, an individual’s own intended, deliberate actions should be the basis of his or her evaluation, reward, and punishment. In a series of studies, the authors investigated whether such rules guide the judgments of children. The first 3 studies demonstrated that children view lucky others as more likely than unlucky others to perform intentional good actions. Children similarly assess the siblings of lucky others as more likely to perform intentional good actions than the siblings of unlucky others. The next 3 studies demonstrated that children as young as 3 years believe that lucky people are nicer than unlucky people. The final 2 studies found that Japanese children also demonstrate a robust preference for the lucky and their associates. These findings are discussed in relation to M. J. Lerner’s (1980) just-world theory and J. Piaget’s (1932/1965) immanent-justice research and in relation to the development of intergroup attitudes.

Keywords: preference for the lucky, immanent justice, evaluative contagion, social cognitive development, cross-cultural psychology

Kristina R. Olson, Elizabeth S. Spelke, and Mahzarin R. Banaji, Department of Psychology, Harvard University; Yarrow Dunham, Department of Psychology, University of California, Merced; Carol S. Dweck, Department of Psychology, Stanford University.

This research was supported by funding from the Beinecke Fellowship and National Science Foundation Graduate Research Fellowship to Kristina R. Olson, National Science Foundation Grant BCS-02-1725 to Carol S. Dweck, National Institute of Health Grant HD23103 to Elizabeth S. Spelke, funding from the Reischauer Institute for Japanese Studies to Yarrow Dunham, and funding from the Third Millennium Foundation to Mahzarin R. Banaji. This research was conducted as part of Kristina R. Olson’s doctoral dissertation. A full list of items and supplementary statistics for all eight studies are available at http://www.people.fas.harvard.edu/~banaji/research/olson_luck.htm

We thank Alexa Reynolds for the drawings used in Study 8; the Harvard Museum of Natural History, Highland Park Elementary School, and the Bing Nursery School for providing spaces for conducting research; and Ann Marie Russell, Carla Borras, Rohini Rau-Murthy, Roy Ruhling, Marion Mahone, Valerie Loehr, Patrick Hayden, Lauren Hay, Rachel Montana, Kimihiro Shiomura, Christopher Dial, Katie Lancaster, and Marin Tanaka for assistance in data collection. We also thank the parents, teachers, administrators, and children who participated and assisted in this research and Paul Jose for sending us the immanent-justice stories used in Study 2.

Correspondence concerning this article should be addressed to Kristina R. Olson, Department of Psychology, Harvard University, 33 Kirkland Street, Cambridge, MA 02138. E-mail: [email protected]

In many societies and legal systems across time, one moral tenet has reigned supreme: Individuals are to be judged by the purposeful actions they commit and not by the random events that befall them. This understanding has been broad and deep, evident across time and place, from Aristotle’s Nicomachean Ethics, Roman law (e.g., animus nocendi), and English law (e.g., mens rea) to the modern penal law in the United States and the Rome Statute of the International Criminal Court (United Nations, 2003). This fundamental moral dictum was most clearly described in the 13th century by Henry Bracton (13th century/1968–1977): “A crime is not committed unless an intention to injure exists.” From it we have the practice that volitional and premeditated behaviors, such as stealing and cheating, are punished, just as hard work and helping others are rewarded; these actions speak to the character of the person performing them. On the other hand, we treat differently those behaviors that involve accidental, unintentional, and random causes. Whether the outcomes themselves are good or bad, such as winning a lottery or being hit by a tornado, we are not to attribute these to the character of the actor.

Even when it comes to intentional behavior, we hold that it is those who are involved in producing it who should be held responsible or praised, not those who happen to be associated with the perpetrators via group membership. The Bible supports this belief clearly: “The fathers shall not be put to death for the children, neither shall the children be put to death for the fathers: Every man shall be put to death for his own sin” (Deuteronomy 24:16, King James Version). The belief that deems guilt by association to be immoral is also broad and deep, being upheld by the oldest moral codes from Ptahotep and the Assize of Clarendon to most modern legal doctrine (Banaji & Bhaskar, 2000).

Our research concerns the dissociation between these ratified codes of conduct and the behavior of ordinary humans. It seeks to understand the disparity between belief and action, between abstractly held ideals and everyday moral judgments of good and bad. In these studies, we investigated the developmental aspects of
such dissociations by analyzing the relatively early manifestation of such discrepancies in childhood. Do children recognize that the random bad events that befall others do not make them blameworthy? Do they understand that other people who are associated with an unlucky individual are not blameworthy? Observing children as they grapple with such questions can provide an understanding of the developmental origins of adult minds that routinely offer such judgments with consequence.

We explored this question in the context of two empirical phenomena: preference for the lucky (over the unlucky) and the evaluation of an individual based on his or her association with another actor, what we call evaluative contagion. Preference for the lucky is simply the greater liking, greater preference, or more positive attitude toward those who experience randomly occurring good or lucky events (e.g., finding $5 on the sidewalk) than toward those who experience random bad or unlucky events (e.g., getting splashed by a passing car). More complex evaluations, which we call judgments of the lucky, involve thinking, for example, that lucky people are more likely to perform good actions than are unlucky people. We use the term “random” as the overarching term for lucky and unlucky events, standing in clear contrast with actions that we term “intentional.” Whereas intentional actions tend to be deliberate and foreseen, random events, for our purposes, are those that are not intended or foreseen by the targets of those actions.

Evaluative contagion refers to the extension of evaluations of one actor to his or her associates, such as family or social-group members. Disliking the sibling of someone who was splashed by a passing car would be an example of evaluative contagion, because the negative evaluation of the target of the action (the person splashed) has spread to the sibling of that target. Such evaluations are not only theoretically important but also may have important implications for work on the development of prejudice toward disadvantaged groups. That is, insofar as members of disadvantaged groups tend to experience more unlucky events, a dislike of people associated with others who experience unlucky events could lead to prejudice against members of families or social groups who themselves have not experienced bad or unlucky events.

We seek to establish the generality and breadth of these phenomena across age and culture. As we discuss below, there are several theories relevant to a preference for the lucky. One way to evaluate how well these theories explain the preference for the lucky is to examine the developmental predictions of these theories and to look for convergence or divergence between these theories and the preference for the lucky across development. Therefore, one goal of this article is to investigate how the evaluations of the lucky and evaluative contagion might increase or decrease across development and what these changes imply for alternative explanations of these effects.

In addition, we seek to understand whether these phenomena are cross-culturally invariant or whether something about American or Western culture might lead young children to prefer the lucky and their associates. Previous research (Masuda & Kitayama, 2004; Morris & Peng, 1994) has demonstrated that Westerners tend to use dispositional attributions to explain behavior (e.g., he tripped because he was clumsy), whereas Easterners tend to use situational explanations (e.g., he tripped because there was a cord on the floor). One possible explanation for children’s preference for the lucky could be that children are making dispositional explanations for the lucky events. Such an explanation leads to the prediction that children growing up in a country that tends to use situational explanations for behavior will not show this preference. Therefore, in an initial exploration of the universality of the preference for the lucky and its contagious nature, we presented young Japanese children with the same tasks we presented to American children.

Immanent Justice and Belief in a Just World

As early as 6 months of age, children appear to have a basic understanding of differences between intentional action and unintentional action (Woodward, 1998), and by 3 years of age, children are able to distinguish intentional from unintentional actions in linguistic tasks (Shultz & Wells, 1985; Shultz, Wells, & Sarda, 1980). Nevertheless, considerable evidence suggests that this distinction is not the only guide to children’s evaluations of others. Children’s tendency to evaluate others on the basis of unintentional acts has been stated or implied by several prominent theories, most notably in work on immanent justice (Piaget, 1932/1965) and belief in a just world (BJW; Lerner, 1980).

Immanent Justice

In his groundbreaking work on moral development, Piaget (1932/1965) described the belief that “a fault will automatically bring about its own punishment” (p. 256). A classic example is evident in children’s responses to the following story: After stealing apples from an orchard, a boy rides his bike over a rotting bridge and falls into the water. Piaget asked 6- to 12-year-old children why the boy fell into the water and whether the boy would have fallen into the water had he not stolen the apples. A sizeable number of young children reported that the perpetrator fell into the water because he stole the apples. In other words, the random bad event (falling into the water) was viewed as a direct consequence of an intentional bad action (stealing the apples). Other research extended Piaget’s findings to positive events, showing that children believe that a positive random event will occur as the consequence of an intentional good action (Fein & Stein, 1977).

It is important to note that immanent-justice reasoning is a mistaken belief about the nature of causation. That is, people who endorse immanent-justice reasoning are arguing that a good or bad action can cause a lucky or unlucky event and, consequently, that the lucky or unlucky event would not have occurred if the good or bad action had not occurred. For our purposes, the most important result is the developmental trend of this belief. Piaget (1932/1965) found a decline in immanent-justice reasoning across the elementary school years. Subsequently, other researchers have confirmed the general decline of immanent-justice reasoning throughout childhood (Jahoda, 1958; Jose, 1991; Percival & Haviland, 1978; Suls & Kalle, 1979; but cf. Karniol, 1980; Najarian-Svajian, 1966). This work has been extended more recently into samples of older teens, generally finding that immanent-justice reasoning further decreases in middle and high school (Johnson, 1962; Najarian-Svajian, 1966), although there is new evidence suggesting that immanent-justice reasoning may reemerge in adulthood (Callan, Ellard, & Nicol, 2006; Raman & Winer, 2004).

In our current research, we examined whether young children prefer lucky to unlucky individuals and whether they use evidence of lucky or unlucky events to predict an actor’s future good or bad behavior. That is, do children think a lucky child is more likely than an unlucky child to perform a good action in the future? Although this question is clearly related to immanent justice, there are differences between these procedures. Immanent-justice research focuses on how children reason about the causal consequences of intentional good and bad actions. It shows a general decline in immanent justice with age, presumably because children integrate and articulate a more diverse set of causal principles governing the behavior of agents (Schult & Wellman, 1997). In contrast, judgments of the lucky concern the evaluative consequences of having viewed another’s experience of a lucky or unlucky event. In Study 2, we investigated the relationship between judgments of the lucky and immanent justice by testing the developmental trajectories of both patterns of reasoning in the same participants.

BJW The idea that people get what they deserve is at the heart of Lerner’s BJW theory (Furnham, 2003; Lerner, 1980; Montada & Lerner, 1998). One classic demonstration of BJW involved asking participants about the blameworthiness of a rape victim (Jones & Aronson, 1973). Experimenters manipulated whether the police report revealed that the victim was a virgin, a married woman, or a divorcee and then asked participants how much the victim was to blame for the rape. Counterintuitively, the finding was that participants blamed the virgin and married woman the most and the divorcee the least, though they still blamed the latter. The authors interpreted this finding and many others in terms of their self-protective function: If we believe the world to be a just and fair place, we can reinterpret or explain good and bad events that seem to befall individuals for no reason at all and, as a result, still feel personally safe. In this case, the authors argued that the idea of an innocent virgin or married woman being raped was simply so inconsistent with participants’ view of a just world that they derogated the victim, whereas a divorcee being raped is not as inconsistent with a view of the world as just, and therefore, less blame was necessary to maintain a sense of the world as just. BJW colors not only people’s beliefs about others but also their attitudes. In another study, participants liked a person who was randomly assigned to be electrically shocked with no compensation less than a person who was randomly assigned to be shocked for a payment of $30 (Lerner, 1971). The logic of BJW predicts that the victim of uncompensated shocks was denigrated because of the underlying belief that a truly blameless person would not be so treated. Traditionally, BJW researchers have tested these questions by examining adults’ responses to extreme events that were presumably strong violations of a sense of justice (e.g., rape, electric shock).
Less is known about whether more everyday events (e.g., seeing someone get splashed by a passing car) would trigger just-world beliefs. In addition, the developmental origins of BJW have not been closely studied, as most developmental research has either focused on older children and teens (Furnham, 1985; Furnham & Rajamanickam, 1992) or has involved tasks that have an uncertain relationship to just-world beliefs themselves, such as distribution of resources (Lerner, 1974; Long & Lerner, 1974), rather than


blame and evaluation (but see Fein, 1976). Although little research has been conducted on younger children, Lerner (1977) articulated a theoretical argument about the development of just-world thinking. Most notably, he argued that children move from a focus on getting what they want immediately to understanding that their actions at Time A can be rewarded or punished at Time B. Lerner related this transition to the development of delay of gratification (Long & Lerner, 1974), arguing that once this “action now = consequence later” rule is understood, children begin to apply this understanding to other people, recognizing that a person’s actions now will produce consequences for him or her later. These arguments suggest that children may begin showing just-world beliefs in mid childhood, somewhere around age 6 or 7 years. Lending further credence to this approximate age prediction, Lerner’s own research tended to involve children in middle to late elementary school, although he demonstrated related principles, such as an understanding of parity and equity, in kindergarteners and first graders (Lerner, 1974). The current work is aimed at testing the core proposition that children, starting early in childhood, prefer the lucky over the unlucky. If BJW is indeed the mechanism by which preference for the lucky emerges, then preference for the lucky should emerge sometime after BJW reasoning has developed. However, an alternative possibility is that the tendency to prefer the lucky precedes the more elaborate sort of reasoning described in BJW; if so, it should emerge earlier in development. Indeed, preference for the lucky might be a core, early-developing tendency that is later justified via just-world beliefs. The present studies tested the development of the preference for the lucky and assessed its origin in relation to just-world beliefs.

The Current Work One of the most important tasks children face in navigating their social world is determining who to approach and who to avoid, who is a friend and who is a foe. Therefore, we were interested in whether children would assume that lucky people are more likely to engage in intentional good behaviors and unlucky people are more likely to engage in intentional bad behaviors. In an initial set of studies, we demonstrated that 5- to 7-year-olds prefer lucky to unlucky people and prefer members of a lucky group to members of an unlucky group (Olson, Banaji, Dweck, & Spelke, 2006). Here, we pursued these findings by asking whether children make deeper inferences about lucky and unlucky individuals, such as whether they believe lucky people are more likely to perform intentional good actions, whether the preference for the lucky is observed across cultures, and when this preference begins in childhood. In the first two studies, we examined whether children judge lucky people as more likely to perform intentional good actions than unlucky people and unlucky people as more likely to perform intentional bad actions than lucky people. In both studies, we examined the developmental trajectory of these evaluations, and in the second study, we compared this trajectory with performance on an immanent-justice task. In the third study, we investigated whether young children show evaluative contagion for behavioral predictions, asking whether children believe that the siblings of lucky individuals are more likely to engage in intentional good actions than the siblings of


unlucky individuals. This study, along with the final study, which examines evaluative contagion in novel social groups, suggests that evaluative contagion exists and is not limited to the American context. Placed alongside the work on preference for the lucky, these data suggest that the development of prejudice against members of disadvantaged groups may be fueled by (a) the presence of negative evaluations of individuals who experience unlucky events and (b) the presence of negative evaluations of people merely associated with those who experience misfortune, together resulting in prejudice against the disadvantaged either because they themselves experienced bad luck or because they are associated with others who have. The second emphasis of the current article is an investigation of the developmental course of preferences and judgments favoring the lucky. In the first three studies, we tested children aged 4 to 12 years to assess the developmental trajectory of judgments of the lucky. Previously, this question has only been addressed using an attitude measure in children aged 5 to 7 years (Olson et al., 2006). In Studies 4, 5, and 6, we investigated the basic preference for the lucky in preschool-aged children. Testing such young children allowed us to investigate whether the developmental predictions of Lerner (1977), and therefore just-world beliefs, might explain such a preference. A final question this work seeks to examine is whether our initial discoveries of preference for the lucky are the result of some culture-specific teaching or whether this preference might be invariant across cultures. A cross-cultural test, coupled with work with very young children, can indicate whether a tendency or evaluation might be universal or whether it is the result of specific experiences or antecedents. 
Thus, in Studies 7 and 8, we investigated whether preference for the lucky and evaluative contagion are seen cross-culturally or whether they are the result of a culturally specific experience or process. Taken together, these studies have the potential to deepen our understanding of preference for the lucky, judgments of the lucky, and evaluative contagion effects. They can inform our understanding of whether children merely prefer lucky people or whether they make corresponding behavioral predictions about lucky and unlucky targets. These studies also clarify the relationship between preference for the lucky, immanent justice, and just-world beliefs. Finally, these studies allow a new understanding of the developmental trajectory and cross-cultural generality of these effects.

Study 1: Behavioral Predictions of the Lucky and Unlucky Learning to decide who is good and who is bad is a major component of successful functioning in the social world. Previous research suggests that even 6-month-old infants can distinguish an agent that helps from an agent that harms and can use this information to form preferences for the former (Hamlin, Wynn, & Bloom, 2007). In addition, by the age of 18 months, children prefer to accept a toy from a helpful rather than a harmful actor (Nurock, Jacob, Margules, & Dupoux, 2008). This evidence suggests that very young children evaluate agents on the basis of their helpful or harmful behavior. What do children think when they observe something good or bad befalling someone? Do they form expectations about that person’s future behavior? For example, do children believe that a

person who found $5 on the sidewalk (random good event) is more likely to read a story to her little brother (intentional good event) than a person who was rained on while walking home (random bad event)? Similarly, is the person who was rained on seen as more likely to lie to his mother (intentional bad event) than the person who found $5? We tested this hypothesis and included comparison items in which actors were described as having previously performed intentional good or bad actions. Because such actions invite dispositional attributions and so should motivate consistent predictions about future actions, these items served as a standard against which to compare the impact of random events. In addition, this study investigated whether there are developmental changes in the behavioral predictions of the lucky and unlucky across middle childhood.1 Previous research has suggested that children’s moral reasoning changes considerably between the ages of 4 and 12 years and, most relevant, that immanent-justice reasoning declines across this age range (Jose, 1991; Piaget, 1932/1965). Therefore, we investigated possible age differences in children’s judgments of lucky and unlucky targets. Although we provided conceptual arguments for why these two phenomena are different, we sought to bolster this contention with a direct test. If behavioral predictions following observation of random events stem from the same underlying process as immanent justice, we would expect to see an age-related decrease in children’s tendency to think that lucky people perform good actions and unlucky people perform bad actions.

Method Participants and recruitment. Participants included 57 children (18 female) aged 4 to 12 years (M = 7, SD = 2) who participated while visiting the Harvard Museum of Natural History in Cambridge, Massachusetts, with their parents or guardians. For complete age breakdowns for this and subsequent studies, see Table 1. One additional participant began the study but quit after completing less than half of the study and is therefore excluded from analyses. Participants were approached by an experimenter who asked if they would be interested in participating in a study lasting 5 to 10 min. Interested parents were asked to complete a short consent form, any questions from the child or parent were answered by the experimenter, and the child and parent were escorted to the testing area. The experimenter then explained to the child that he or she was free to stop participation at any time and asked the child if he or she was ready to begin. Although race information was not asked of participants in this study, experimenters observed that the sample was predominately White and middle to upper-middle class.

1 Social cognition research with adult participants has found that adults overestimate their ability to predict the behavior of individuals, for example, thinking that they can predict one’s year-long performance in the Peace Corps from a single interview when in fact, interviews are poor predictors of actual performance (participants estimated r = .59 between interview performance and Peace Corps performance, when in fact, r < .10; Kunda & Nisbett, 1986). The above-mentioned work differs from the work proposed here. Those authors were concerned with accuracy of predictions compared with reality, whereas the current work is focused on whether children make systematic predictions in a particular direction to reveal an underlying belief that lucky people perform good actions.


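Table 1
Sample Size Broken Down by Age for Each Study (After Exclusions)
Columns: Age (years) 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9, 10, 11, 12, followed by total N
Rows: Experiments 1 through 8
Total N by experiment: 1 = 57, 2 = 127, 3 = 78, 4 = 115, 5 = 49, 6 = 25, 7 = 23, 8 = 87
Cell counts in extracted order: 6 12 31 / 29 27 23 / 14 16 17 / 9 6 10 6 10 6 3 8 25 23 19 14 17 14 11 19 7 8 4 7 6 9 / 25 7 21 / 9 4 3 30 28 8 / 1 7 2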

Materials. Thirty-two pictures of White children were selected from the Internet and were arranged into 16 same-sex pairs. Pictures were paired such that both pictures had been rated by several adult raters as equal in attractiveness and approximate age. Of the 16 pairs, eight were pairs of boys and eight were pairs of girls. Adults’ estimates of age ranged from 4 to 12 years. These 16 pairs were then arranged into four same-sex sets of four pairs (two all male, two all female). Four different versions of the task were created, one beginning with each set of four pairs, alternating between four pairs of boys and four pairs of girls. The side of presentation of each picture was orthogonally counterbalanced across participants, yielding eight versions of the task to which participants were sequentially assigned. Procedure. Participants were presented with 16 trials. In each trial, they were first shown two photographs of children and were told their names and one fact about them (e.g., “This is John. John stole a cookie from his brother”). As each person was mentioned, a picture of a child appeared on the screen. Pictures were approximately 2 in. × 3 in. and appeared on either the right or left side of the screen. This part of the trial is called the “learning phase” and always consisted of a learning pair (one fact about each of two children). After the learning phase, participants were asked to guess which of the two children engaged in another action (e.g., “On Sunday, one of these children got into a fight. Which child got into a fight?”). Henceforth, this part of the trial is called the “test phase.” Participants were instructed to point to the child that they believed engaged in the action in question, and their responses were recorded. The facts used in the learning phase were all either intentionally caused (by the actor) or randomly caused (not by the actor) and were either good or bad.
For example, getting rained on or turning on the television to discover that no cartoons were on are examples of random bad events, whereas finding $5 on the sidewalk or getting to eat cake in school because it was a classmate’s birthday are examples of random good events.2 In contrast, pulling a classmate’s hair or cheating on a test were examples of intentional bad actions, and helping to bake cookies for one’s grandmother or sharing toys with one’s little brother were considered intentional good actions. Each pair of learning items was created to be as parallel as possible (e.g., accidentally hitting someone vs. intentionally hitting someone). Across the 16 trials, four kinds of pairings were made in the learning phase: intentional bad versus intentional good, random bad (unlucky) versus random good


(lucky), intentional good versus random good, and intentional bad versus random bad. These four types of learning pairings were crossed with each of the two possible types of test items (intentional good, intentional bad) in the test phase, resulting in eight types of items. The design is summarized in Table 2. Finally, we made two examples of each type of item (e.g., two items that were random good vs. random bad learning trials with an intentional bad question item), resulting in 16 unique questions. The order of mention of the targets (e.g., mentioning lucky vs. unlucky first) was counterbalanced across items. Experimenters in this study and all subsequent studies were trained to state each item in a neutral or slightly positive tone, even when the item was negative in valence. Although this was a less natural way to state the items, it allowed us to be certain that children were not using the experimenter’s tone as information in their responses, so it provided a more conservative test of our hypotheses. Data preparation and analyses. For each item, participants were given a one if they selected the predicted choice (e.g., the lucky target in the lucky vs. unlucky target predicting an intentional good action item) and a zero if they selected the other choice (e.g., the unlucky target). The four items involving the same learning pairs (e.g., lucky vs. unlucky) were combined such that each participant then had a prediction score ranging from 0 (never picked the predicted answer) to 4 (always selected the predicted answer). Thus, if a participant said that a child who turned on the television and found no cartoons on was more likely to cheat on a test than a child who turned on the television and found an extra hour of cartoons on, the participant was given one point for the lucky versus unlucky prediction score.
Similarly, if the participant said that the child who walked to school while it was sunny was more likely to bake a cake for his grandma than the child who walked to school while it was rainy, that participant scored one point for the lucky versus unlucky prediction score. Each child ended up with four prediction scores (lucky vs. unlucky, intentional good vs. intentional bad, intentional good vs. lucky, and intentional bad vs. unlucky) unless that child failed to complete one or more questions required to complete a score. At most, one participant was excluded from each prediction score. We analyzed prediction scores using one-sample t tests, comparing children’s prediction scores with chance (2.0). Finally, to examine possible age changes in predictions, we correlated prediction scores with age.

2 An anonymous reviewer raised an important concern that perhaps children saw lucky items as good, rather than lucky, converting what is a preference for the good into a preference for the lucky. For example, this reviewer pointed out that getting to eat cake for a classmate’s birthday might be seen as good by children and not as lucky. Similarly, one could argue that children perhaps see unlucky events as bad, rather than unlucky. One piece of evidence against this argument is the finding throughout this article that children differentially evaluated lucky and good actors and that they distinguished unlucky and bad actors. However, as a more direct test of this concern, we conducted a small-scale pilot study. We presented a new group of 26 children aged 5–10 years with each of the lucky and intentional good items from Study 1 and asked them to state whether each item was something lucky that happened to the target or whether the actor “meant to do good” (we used this phrase because “intentional good” is confusing for 5-year-olds). A partially overlapping group of 26 children aged 5–10 years completed the parallel task for unlucky and intentional bad actions. Overall, the pilot participants identified 75.5% of the lucky targets as lucky, 74.0% of the intentional good targets as intentional good, 88.5% of the unlucky targets as unlucky, and 83.3% of the intentional bad targets as intentional bad. Chance responding would have been 50%, and the reviewer’s predictions would have suggested results significantly lower than 50% for the lucky and unlucky items, which we did not find. We are therefore confident that children did understand that the lucky events were lucky and that the unlucky events were unlucky.
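The scoring and chance comparison described under data preparation can be illustrated with a minimal sketch. It assumes each item is coded 1 when the predicted target was chosen and 0 otherwise; the function names and the example responses are hypothetical and are not taken from the study.

```python
import math
from statistics import mean, stdev

CHANCE = 2.0  # expected score out of 4 when guessing between two targets


def prediction_score(choices):
    """Sum four 0/1 item codes (1 = predicted target chosen) into a 0-4 score."""
    if len(choices) != 4 or any(c not in (0, 1) for c in choices):
        raise ValueError("expected four items coded 0 or 1")
    return sum(choices)


def one_sample_t(scores, mu=CHANCE):
    """Return (t, df) for a one-sample t test of the mean score against mu."""
    n = len(scores)
    standard_error = stdev(scores) / math.sqrt(n)
    return (mean(scores) - mu) / standard_error, n - 1


# Hypothetical responses from four children (illustrative only, not study data).
children = [[1, 0, 1, 0], [1, 1, 1, 0], [0, 1, 1, 0], [1, 1, 0, 1]]
scores = [prediction_score(c) for c in children]
t_stat, df = one_sample_t(scores)
print(scores, round(t_stat, 3), df)
```

A mean score above 2.0 with a positive t indicates that children chose the predicted target more often than guessing would produce.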

Results Lucky versus unlucky. The comparison between lucky and unlucky targets was the primary result of interest. Our hypothesis was that children would believe that the unlucky target was more likely to perform an intentional bad action and less likely to perform an intentional good action than the lucky target, consistent with predictions of judgments of the lucky. A one-sample t test comparing children’s mean prediction score (M = 2.47) with chance (2.0) supported this hypothesis, t(56) = 3.57, p = .001. Figure 1 shows the proportion of predicted responses made for lucky versus unlucky items and intentional good versus bad items. Using a paired t test, we found no effect of the valence of the question; that is, participants were just as likely to think a lucky target would perform a good action as they were to think an unlucky target would perform a bad action, t(56) = 1.53, p = .13. In addition, there was a nonsignificant but positive relationship between age and prediction score (r = .18, p = .18), indicating that this was not likely to be related to immanent-justice reasoning, which typically shows a decline with age (Jose, 1991; Piaget, 1932/1965). In addition, although previous work has suggested that young children have a poorer understanding of randomness in general than do older children (Weisz, 1980), the increase in predictions based on lucky and unlucky events with age suggests that our result is not due to limitations in children’s understanding of randomness. Intentional good versus intentional bad. We were interested in whether children believed in behavioral consistency, thinking

Table 2
A Schematic Representing the Items Presented in Studies 1 and 3

Learning phase                          Test phase (Who would perform an...?)   Predicted response
Intentional good vs. intentional bad*   Intentional good action                 Intentional good
Intentional good vs. intentional bad*   Intentional bad action                  Intentional bad
Lucky vs. unlucky*                      Intentional good action                 Lucky
Lucky vs. unlucky*                      Intentional bad action                  Unlucky
Intentional good vs. lucky              Intentional good action                 Intentional good
Intentional good vs. lucky              Intentional bad action                  Lucky
Unlucky vs. intentional bad             Intentional good action                 Unlucky
Unlucky vs. intentional bad             Intentional bad action                  Intentional bad

Note. In the learning phase, participants were introduced to two characters (and their siblings, in the case of Study 3). Participants were then asked which of the two characters (or their siblings in Study 3) would perform a different intentional good or bad action in the test phase. We have also listed the predicted response used to conduct analyses. The types of items with asterisks were included in Study 2.
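The crossing summarized in Table 2 (four learning pairings by two test actions, with two examples of each) can be enumerated in a short sketch. The favorability ordering used to derive the predicted response is an assumption introduced here for illustration; it was chosen because it reproduces the predicted-response column of Table 2, and the names are hypothetical.

```python
from itertools import product

# Learning pairings, each written as (first target type, second target type).
LEARNING_PAIRS = [
    ("intentional good", "intentional bad"),
    ("lucky", "unlucky"),
    ("intentional good", "lucky"),
    ("unlucky", "intentional bad"),
]
TEST_ACTIONS = ["good", "bad"]
EXEMPLARS = [1, 2]  # two examples were written for each item type

# Assumed ordering for illustration: the more favorable target is predicted
# for a good test action, the less favorable one for a bad test action.
FAVORABILITY = {"intentional bad": 0, "unlucky": 1, "lucky": 2, "intentional good": 3}


def predicted_response(pair, test_action):
    """Return the target expected to be chosen for this learning pair."""
    pick = max if test_action == "good" else min
    return pick(pair, key=FAVORABILITY.get)


items = [
    {"pair": pair, "test": test, "exemplar": ex,
     "predicted": predicted_response(pair, test)}
    for pair, test, ex in product(LEARNING_PAIRS, TEST_ACTIONS, EXEMPLARS)
]
print(len(items))
```

Each learning pairing contributes four items, which is why every prediction score ranges from 0 to 4.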

Figure 1. Mean proportion of responses in which participants selected the predicted response in Study 1. The predicted response was selecting the lucky or intentional good actor to perform a good action and selecting the unlucky or intentional bad actor to perform a bad action. The proportion of unpredicted responses is simply 1 − the proportion of predicted responses. Because including both the predicted and unpredicted bars is redundant, we only included the predicted responses in the graph.

that a person who did something intentionally good one time would do so a second time and that someone who did an intentional bad action would do another. Previous work has suggested that young children do not tend to believe that a person will necessarily do the same intentional action a second time (Kalish, 2002), but here we asked whether the individual would do a different intentional action of the same valence. We predicted that they would expect valence consistency, and indeed, children viewed an actor who had committed an intentional bad action as more likely to perform a different intentional bad action and an actor who had committed an intentional good action as more likely to perform a different intentional good action as indicated by a one-sample t test comparing children’s average prediction score (M = 3.19) with chance (2.0), t(56) = 9.27, p < .001 (see Figure 1 for proportion of predicted responses). Children were equally likely to think an intentional good actor would do another good action as to think that an intentional bad actor would do another bad action, as indicated by a paired t test in which responses did not differ by valence, t(56) = 1.18, p = .24. Prediction scores were correlated with age (r = .45, p < .001), suggesting that older children were more likely to predict consistency in the behavior of intentional good and bad actors. Also as expected, children’s prediction scores were higher for the intentional good versus bad comparison (M = 3.19) than for the lucky versus unlucky comparison (M = 2.47), at least indirectly indicating that children understand a distinction between intentional and random behavior, t(56) = 4.45, p < .001, although the next two comparisons test this question more directly. Intentional good versus lucky.
To examine whether children distinguish intentional behavior from random behavior, we asked whether children would select an intentional good actor as more likely to perform a different intentional good action than a lucky target and whether they would select a lucky target as more likely to perform an intentional bad action than an intentional good actor. We found that children make these selections, as evidenced by a one-sample t test comparing participants’ average prediction score (M = 2.51) with chance (2.0), t(56) = 3.90, p < .001. Participants were equally likely to think that an intentional good actor would


perform a good action as they were to think that a lucky target would perform a bad action, t(56) = 0.14, p = .89. Age was not significantly correlated with prediction score (r = .14, p = .31), suggesting that children of all ages distinguished intentional good and lucky actors. Intentional bad versus unlucky. In the final set of comparisons, we paired intentional bad actors with unlucky targets and had children report which of these targets would commit other intentional good or intentional bad actions. We asked whether children would select an intentional bad actor as more likely to engage in a different intentional bad action than an unlucky target and whether they would select an unlucky target as more likely to perform an intentional good action than an intentional bad actor. We found support for this prediction, as evidenced by a one-sample t test comparing the average prediction score (M = 2.69) with chance (2.0), t(55) = 5.17, p < .001, suggesting that children do in fact distinguish between actors who perform intentional bad actions and those who experience unlucky events, seeing the former as more likely to perform an additional bad action and the latter as more likely to perform an additional good action. There was no significant effect of the valence of the question asked, indicating that children were equally likely to think that an intentional bad actor would perform other bad actions as to think that an unlucky target would perform more good actions, as indicated by a one-sample t test, t(56) = 0.65, p = .52. In addition, older children were more likely to demonstrate this prediction, as evidenced by a significant correlation between age and prediction score (r = .49, p < .001), a somewhat surprising finding given that the previous comparison of intentional good with random good demonstrated no significant relationship between an intentional and random distinction and age.
Because this finding was unexpected and did not replicate across these conceptually similar comparisons, we do not address this issue further. Nonparametric analyses. Because of possible concerns about the use of parametric statistics throughout this and subsequent studies, we also conducted analyses throughout this article using nonparametric statistics. However, because it is more common to use parametric responses and because of limited space, parametric tests are always reported in this article. The relevant nonparametric tests are available at Mahzarin R. Banaji’s Web site (see the URL in the author note). The findings reported in the text are identical, regardless of our use of parametric versus nonparametric statistics.
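The age correlations reported in this section can be illustrated with a minimal sketch of a Pearson correlation and the t statistic used to test it against zero. The ages and scores below are hypothetical; only r = .18 and the sample of 57 come from the text, and the function names are introduced here for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (df = n - 2) for testing that a correlation is zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


# Hypothetical ages and prediction scores (illustrative only, not study data).
ages = [4, 6, 8, 10, 12]
scores = [2, 2, 3, 3, 4]
r = pearson_r(ages, scores)

# Summary values reported in the text: r = .18 with 57 participants.
t_reported = t_from_r(0.18, 57)
print(round(r, 3), round(t_reported, 2))
```

With 55 degrees of freedom, a t statistic of this size falls short of conventional significance thresholds, in line with the nonsignificant age trend reported for the lucky versus unlucky comparison.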

Discussion This study provided evidence that children make behavioral predictions for lucky and unlucky targets. Children judged unlucky targets as more likely to commit intentional bad actions and less likely to commit intentional good actions than lucky targets. Thus, children do not simply prefer lucky to unlucky targets but make different predictions about lucky and unlucky targets. These differing predictions may suggest that children make enduring dispositional inferences about actors and may rely on these inferences to motivate future predictions, although alternative accounts could explain these findings. Study 1 also provided assurance that our basic method was valid in that our clearest case, comparisons between intentional good and bad actors, showed the expected results. Participants judged intentional good actors as more likely to perform other intentional


good actions compared with intentional bad actors. Indeed, the trends for these cases were even stronger than in the case of random events, demonstrating that children recognize a difference between intentional and random actions. One possible concern regarding these results is that our participants in this study, as well as those in previous studies examining a preference for the lucky (Olson et al., 2006), came from largely advantaged populations (i.e., White, middle- to upper-middle-class children with parents willing and able to take them to a museum, etc.). Perhaps it is because they themselves are lucky or fortunate that they show these effects. To test this possibility, we conducted a pilot study with a sample of 23 participants (aged 6–12 years) who were all Black and all of low socioeconomic status, many living at or below the poverty line. We found that these children, like the children in Study 1, predicted that a lucky target would perform a good action more than an unlucky target and, similarly, that an unlucky target would perform a bad action more than a lucky target, suggesting that one does not need to be a member of a lucky group to make these evaluations.3 As previously mentioned, Piaget (1932/1965) found a decrease in immanent-justice reasoning across childhood. In contrast, we did not find such a pattern. If anything, the general trend was for older children to show behavior more in line with preference for the lucky than younger children showed. At the very least, these results suggest a developmental dissociation between immanent justice and preference for the lucky, militating against the idea that these phenomena arise from the same mental process or belief.
However, to test the relationship between immanent justice and predictions about lucky and unlucky targets’ behavior more directly, in Study 2, we tested both phenomena in the same sample, allowing us to empirically evaluate the relationship between immanent justice and judgments of the lucky.
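The pilot analysis described in Footnote 3 compares the distribution of 0 to 2 scores with a binomial chance distribution using a chi-square goodness-of-fit test. A minimal sketch follows; the observed counts are hypothetical because the footnote does not report them, and the function names are introduced here for illustration. The p value uses the closed form for 2 degrees of freedom, exp(-x/2).

```python
import math

# Chance distribution for a 0-2 score from two independent 50/50 choices.
CHANCE_PROBS = [0.25, 0.50, 0.25]


def chi_square_gof(observed, probs=CHANCE_PROBS):
    """Pearson chi-square statistic and df for observed counts vs. expected."""
    n = sum(observed)
    expected = [n * p for p in probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


def p_value_df2(stat):
    """Upper-tail p value of a chi-square statistic with exactly 2 df."""
    return math.exp(-stat / 2)


# Hypothetical counts of children scoring 0, 1, and 2 (not the pilot data).
stat, df = chi_square_gof([2, 8, 10])
print(round(stat, 2), df, round(p_value_df2(stat), 4))

# Check on the reported summary statistic: chi-square(2) = 7.09.
print(round(p_value_df2(7.09), 3))
```

Applying the closed form to the reported statistic gives a value that rounds to the p of .029 stated in the footnote.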

3 These participants were presented with only two items comparing lucky and unlucky targets. In one item, they were asked which target would perform an intentional good action, and in the other item, they were asked which target would perform an intentional bad action. Participants were given one point if they selected the lucky target to perform the intentional good action and one point if they selected the unlucky target to perform the intentional bad action, resulting in scores of 0, 1, or 2 for each subject. The distribution of these scores was compared with a binomial distribution (25% chance of 0, 50% chance of 1, 25% chance of 2). We found that this pilot sample selected the predicted responses more often than chance, as indicated by a chi-square goodness-of-fit test, χ²(2, N = 23) = 7.09, p = .029.

Study 2: The Dissociation of Immanent Justice and Behavioral Predictions of the Lucky Researchers since Piaget (1932/1965) have found that children believe that intentional bad actions can cause unlucky events to occur, and this thinking has been applied to intentional good actions and lucky events (Fein & Stein, 1977). In these studies, children are often told about a person who has performed, for example, a bad action and who has then experienced an unlucky event (e.g., a boy who stole apples from an orchard and then fell through a bridge on his way home). Children are then asked why the unlucky event happened and/or whether the unlucky event would have happened if the child had not performed the intentional

764

OLSON, DUNHAM, DWECK, SPELKE, AND BANAJI

bad action. The main result is that young children often say that the unlucky event happened because of the intentional bad action and that it would not have happened had the target not performed the intentional bad action. This type of reasoning is referred to as immanent-justice reasoning. Studies have largely shown that as children get older, their immanent-justice reasoning declines (Jahoda, 1958; Jose, 1991; Percival & Haviland, 1978; Suls & Kalle, 1979). As we have discussed, there is a similarity, although more superficial than it might seem, between the procedures that test the idea of immanent justice and the present work. Whereas immanent justice concerns reasoning about causation, the present research is not interested in causal relations. If anything, the causal pathway in our studies must be reversed, as children are told about lucky and unlucky events and then infer intentional good and bad behavior. Although we have argued that these are conceptually distinct phenomena, in Study 2, we tested this dissociation directly by asking the same children to perform both an immanent-justice task and a judgment of the lucky task. Although age should be negatively correlated with immanent-justice reasoning, as previous research suggests, age should be uncorrelated or even slightly positively correlated with behavioral predictions of the lucky, as demonstrated in Study 1.
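The goodness-of-fit test reported for the pilot sample (Footnote 3) can be sketched as follows. This is a minimal illustration, not the authors' analysis script. The source does not report how many children scored 0, 1, or 2, so the counts below are an assumption: they are one set of counts for N = 23 that happens to reproduce the reported statistic. The function names are likewise hypothetical.

```python
import math


def chi_square_gof(observed, expected_probs):
    """Return the chi-square goodness-of-fit statistic and its degrees of freedom."""
    n = sum(observed)
    expected = [n * p for p in expected_probs]
    stat = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return stat, len(observed) - 1


def chi_square_p_df2(stat):
    """Upper-tail p value for a chi-square statistic with exactly 2 df.

    With 2 degrees of freedom the survival function reduces to exp(-x / 2).
    """
    return math.exp(-stat / 2.0)


# Chance distribution for scores of 0, 1, and 2 (two independent 50/50 choices).
chance_probs = [0.25, 0.50, 0.25]

# ASSUMPTION: illustrative counts of children scoring 0, 1, and 2.
# These counts are NOT reported in the source.
assumed_counts = [1, 12, 10]

stat, df = chi_square_gof(assumed_counts, chance_probs)
p_value = chi_square_p_df2(stat)
print(f"chi2({df}, N = {sum(assumed_counts)}) = {stat:.2f}, p = {p_value:.3f}")
```

The closed-form p value is used only because the test has 2 degrees of freedom; a general-purpose routine from a statistics library would be needed for other cases.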

Method

Participants. Participants included 127 children (63 male, 64 female; 118 White, 3 Asian, 2 Hispanic, 1 Middle Eastern, and 1 Black/White biracial; for 2 children, parents did not report race or ethnicity and experimenters could not identify it) between the ages of 5 and 12 years (M = 8.7, SD = 2.0) in a suburban elementary school in Utah from a mostly middle-class background.

Materials. Participants completed two tasks, including an immanent-justice task taken from Jose (1990) and the judgments of the lucky task from Study 1. Across participants, there were a total of eight immanent-justice stories: In four stories, the protagonist performed a bad action and then experienced a negative event, and in four parallel stories, the protagonist performed a good action and then experienced a positive event. For example, in the orchard stories, the negative version involved a boy stealing apples from an orchard and then falling through a board on a bridge and into a river on his way home, whereas the positive version involved a boy helping a farmer to pick apples and then finding a wristwatch on a bridge on his way home. Four scripts were created, each consisting of one version of each of the four base stories. Each script contained two positive and two negative stories. Participants were randomly assigned a script.

The second task involved eight of the judgments of the lucky prediction items from Study 1. These included the four items from the intentional good versus intentional bad set and the four items from the lucky versus unlucky set. The other eight items from Study 1 (the intentional good vs. lucky and intentional bad vs. unlucky items) were not used, as they made the task too long for the youngest children and were unrelated to the question of interest. The items were randomized into three scripts, each containing all eight items. Participants were randomly assigned a script. Across items, the order of mention of lucky and unlucky targets and the order of mention of intentional good and bad targets were counterbalanced.

Procedure. Participants were brought into a conference room in the school and were greeted by an experimenter. They were told they would be playing two games and that in both games there were no right or wrong answers. They were also informed they could quit at any time (although none of the children did). Participants were sequentially assigned to complete either the immanent-justice task and then the preference for the lucky task or vice versa. For the immanent-justice items, children were read a story while being shown a photograph of a boy (the protagonist) and were then asked to recall as much as they could about the story. They were then asked why the good or bad action from the end of the story happened (e.g., “Why did Joey fall into the river?”). Finally, they were asked if the final action would have happened if the initial action had not (e.g., “Would Joey have fallen into the river if he hadn’t stolen the apples?”). For the luck-prediction items, a picture was presented to represent each of the two targets mentioned, and on the test trials, participants were asked to indicate their answers by pointing to the target who they believed had performed the action, as they did in Study 1. After completion of both tasks, participants were thanked for their time and returned to class.

Results

Data preparation. Responses to the why question (e.g., “Why did Joey fall into the river?”) were coded by two judges, using predetermined categories from Jose (1991). The categories included immanent-justice reasoning (e.g., “He fell into the water because he stole the apples”); mediated causality, including physical mediation (e.g., “He fell through the bridge because he was carrying so many apples”) and psychological mediation (e.g., “He fell because he was feeling badly about stealing the apples and did not see the old board”); chance contiguity (e.g., “He fell into the river because the bridge was old and the boards on the bridge were falling apart”); don’t-know responses (e.g., “I have no idea”); and uncodable responses (e.g., “The boy didn’t fall in the water”). Overall, raters agreed on categorization 96% of the time, and in those cases in which they disagreed, the coders discussed their responses and came to an agreement on a final categorization.

Participants’ answers to the why and yes or no (i.e., would x have happened if y had not?) immanent-justice questions were used for statistical analyses only if they had correctly answered the memory question. A correct memory answer required the participant to accurately recall the initial action and the final action in a given story (e.g., remembering that the boy stole apples and fell into the river). Overall, children passed the memory requirement 89% of the time.

For the luck-prediction items, prediction scores were computed for each category (lucky vs. unlucky and intentional good vs. bad) as was done in Study 1, resulting in a total score ranging from 0 (never selected the expected response) to 4 (always selected the expected response) for each participant. Children who completed all eight items (n = 126) were included in all analyses.

Immanent justice. On the why question, children gave immanent-justice responses 47% of the time, mediated-causality responses 3% of the time, chance-contiguity responses 43% of the time, don’t-know responses 4% of the time, and uncodable responses 3% of the time. On the yes or no questions, participants said “yes” 37% of the time and “no” (the immanent-justice response) 63% of the time.


The proportion of immanent-justice responses to the why question was negatively correlated with age (r = −.19, p = .033), indicating that younger children gave more immanent-justice responses than did older children. Don’t-know responses and uncodable responses were also negatively correlated with age (don’t know, r = −.22, p = .018; uncodable, r = −.16, p = .078). In contrast, mediated-causality and chance-contiguity responses were positively correlated with age, indicating that older children were more likely to use these explanations (mediated causality, r = .26, p = .004; chance contiguity, r = .25, p = .004). Age was negatively correlated with “no” answers on the yes or no question, again suggesting that younger children supplied more immanent-justice responses than did older children (r = −.22, p = .016).

Judgments of the lucky. Overall, children were more likely to believe that an intentional good actor would perform an intentional good action and an intentional bad actor would perform an intentional bad action (M = 3.39) than expected by chance, t(125) = 19.74, p < .001, one-sample t test. Children were also more likely to believe a lucky target would perform a good action and that an unlucky target would perform a bad action (M = 2.94) than expected by chance, t(125) = 10.33, p < .001, one-sample t test. Performance on the intentional good versus bad items was correlated with performance on the lucky versus unlucky items (r = .24, p = .007), although overall, children selected the predicted responses more for the intentional good or bad items than for the lucky or unlucky items, t(125) = 4.50, p < .001, paired-samples t test, as was the case in Study 1. In addition, age was positively correlated with both composites (intentional good or bad, r = .37, p < .001; lucky or unlucky, r = .19, p = .032), suggesting that older children were more consistent in their responses. It is important to note that although the latter correlation was not significant in Study 1 and is significant here, the effect sizes in both cases were nearly identical (r = .18 in Study 1, and r = .19 in Study 2), suggesting that the sample size explains this difference. Thus, age was negatively correlated with immanent-justice responses and positively correlated with judgments about the lucky and unlucky.

The relationship between immanent justice and judgments of the lucky. Immanent justice was not related to predictions about the lucky and unlucky, as indicated by nonsignificant correlations between the why and yes or no immanent-justice questions and the lucky or unlucky prediction composite (immanent-justice responses on the why question, r = −.06, p > .50; “no” answers on yes or no question, r = −.05, p > .50). Indeed, as noted, the age trends for immanent justice and judgments of the lucky were in opposite directions (see Table 3 for all means by age). Thus, both conceptual and empirical arguments strongly suggest a distinct basis for each phenomenon.
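The point that nearly identical age correlations (r = .18 in Study 1, r = .19 in Study 2) can differ in significance purely because of sample size can be illustrated with a short sketch. It uses the Fisher z approximation rather than the exact test the authors presumably ran, so the p values are approximate. The Study 2 sample size (n = 126) follows from the reported degrees of freedom; the smaller sample size of 60 is an assumption chosen for illustration, because Study 1's n is not restated in this section.

```python
import math


def fisher_z_p(r, n):
    """Approximate two-tailed p value for a Pearson r via Fisher's z transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    # erfc(|z| / sqrt(2)) equals the two-tailed area under the standard normal curve.
    return math.erfc(abs(z) / math.sqrt(2.0))


# Study 2: r and n as reported in the text.
p_study2 = fisher_z_p(0.19, 126)

# Study 1: r as reported; n = 60 is an ASSUMED, illustrative sample size.
p_small_sample = fisher_z_p(0.18, 60)

print(f"r = .19, n = 126: p = {p_study2:.3f}")
print(f"r = .18, n = 60 (assumed): p = {p_small_sample:.3f}")
```

Under these assumptions the larger sample falls below the conventional .05 threshold (close to the reported p = .032), whereas the smaller sample does not, even though the two correlations are almost the same size.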

Discussion

Despite surface similarities between the judgments of the lucky task and immanent-justice reasoning, these two underlying phenomena are quite distinct. We found no significant relationship between these measures. In addition, whereas immanent-justice reasoning decreased with age, predictions about the lucky increased with age, providing further evidence that the mechanism responsible for these effects is not the same. In addition to an empirical dissociation, we see a theoretical dissociation as well.


Table 3
Proportion of Responses on the Immanent Justice Task by Type and Mean Prediction Score on Lucky/Unlucky and Intentional Good/Bad Behavioral Prediction Items by Participant Age for Study 2

| Age (years) | 5 | 6 | 7 | 8 | 9 | 10 | 11 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| n | 7, 8 | 24, 25 | 22, 23 | 19 | 14 | 16, 17 | 13, 14 |
| “No” responses | 0.44 | 0.79 | 0.71 | 0.67 | 0.61 | 0.60 | 0.54 |
| Why: immanent justice | 0.49 | 0.44 | 0.69 | 0.47 | 0.42 | 0.42 | 0.30 |
| Why: mediation | 0.00 | 0.01 | 0.01 | 0.02 | 0.05 | 0.00 | 0.09 |
| Why: chance | 0.36 | 0.39 | 0.27 | 0.41 | 0.47 | 0.54 | 0.59 |
| Why: don’t know | 0.12 | 0.09 | 0.03 | 0.04 | 0.00 | 0.03 | 0.00 |
| Why: uncodable | 0.04 | 0.07 | 0.00 | 0.05 | 0.05 | 0.00 | 0.02 |
| Lucky/unlucky | 2.75 | 2.28 | 3.22 | 3.05 | 3.36 | 3.06 | 2.85 |
| Intentional good/bad | 2.62 | 3.08 | 3.22 | 3.47 | 3.64 | 3.88 | 3.54 |

Note. When two numbers are present, the valid sample size varied depending on the dependent variable. “No” responses indicated proportion of participants saying no on the yes/no immanent-justice question. The rows starting with why include the proportion of participants at each age giving each of the possible responses to these items. The lucky/unlucky and intentional good/bad rows indicate the average number of times (out of four) that participants at each age selected the good/lucky does good or bad/unlucky does bad responses.
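As a quick consistency check on Table 3, the five why-response proportions at each age should sum to roughly 1 (allowing for rounding). The sketch below transcribes the values from the table; the assignment of each column of values to an age assumes that the columns appear in the same left-to-right order as the age headings, which is an inference from the layout rather than something the source states.

```python
# Why-response proportions by age, transcribed from Table 3.
# Order within each list: immanent justice, mediation, chance, don't know, uncodable.
# ASSUMPTION: value columns map to ages 5-11 in left-to-right order.
why_proportions = {
    5: [0.49, 0.00, 0.36, 0.12, 0.04],
    6: [0.44, 0.01, 0.39, 0.09, 0.07],
    7: [0.69, 0.01, 0.27, 0.03, 0.00],
    8: [0.47, 0.02, 0.41, 0.04, 0.05],
    9: [0.42, 0.05, 0.47, 0.00, 0.05],
    10: [0.42, 0.00, 0.54, 0.03, 0.00],
    11: [0.30, 0.09, 0.59, 0.00, 0.02],
}

# Sum the five categories within each age group.
column_sums = {age: round(sum(values), 2) for age, values in why_proportions.items()}

for age, total in column_sums.items():
    print(f"age {age}: proportions sum to {total:.2f}")
```

Every column sums to within rounding error of 1, which supports reading the extracted values column by column in this way.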

Whereas immanent-justice reasoning relies on a misunderstanding about causation (believing that performing a good or bad action can cause a lucky or unlucky event to occur), predictions of the behavior of lucky and unlucky people are not claims about causality. One could imagine, for example, a person who does not believe in immanent-justice reasoning but does believe that an unlucky person is more likely to perform a bad action. With this effect established, we moved on to ask whether children’s inferences about the actions of lucky and unlucky targets are confined to the targets as isolated individuals or whether associates of lucky and unlucky targets are also affected by the targets’ circumstances. In other words, are those who are related to unlucky people seen as more likely to engage in bad actions? And is the converse true of someone who is the relative of a lucky individual?

Study 3: Evaluative Contagion

Although most people would argue that it is acceptable to judge someone on the basis of his or her intentions, almost nobody believes it to be fair to judge another by the random events that befall him or her. Similarly, some believe it is undesirable to judge an actor’s associate by the actions of the actor, even if the actor has performed a premeditated crime but especially if the actor has been the victim of a random negative or positive event. That is, making negative inferences about the sibling of a known thief is not deemed right by some people, but making negative inferences about the sibling of someone who was the victim of a robbery seems even less permissible. In Study 3, we tested whether this is indeed the case in the actions of children. We asked whether children’s behavioral predictions of the lucky extend beyond evaluations of individuals to evaluations of the associates of lucky and unlucky targets. That is,


if Jan found $5 on the sidewalk, would children believe that Jan’s sister is more likely to perform a good action than Susan’s sister if Susan was splashed by a passing car? We compared these evaluations with evaluations of the siblings of individuals who perform intentional good and bad behaviors. Such a study can provide initial information about whether evaluative contagion occurs in the prediction of behavior. If children evaluate people on the basis of the events that their associates experience, consistent with predictions of evaluative contagion, this may illuminate how stereotypes and prejudice toward social groups, some of whom experience more unlucky events, develop.

Method

Participants. Ninety-four participants (48 female) between the ages of 4 and 12 years (M = 7, SD = 2) were recruited to participate in this study, in the same manner as in Study 1. Participant race and socioeconomic status were not requested, but experimenters reported that participants were largely White and, because of location (campus museum), largely middle and upper-middle class.

Stimuli. The exact items and pictures from Study 1 were used in Study 3 with a few additions: We doubled the number of pictures because a sibling was added for each actor. These pictures were drawn from the same database of pictures used in Study 1. As in Study 1, the side of presentation of pictures was counterbalanced across participants, and the mention of targets (e.g., lucky first vs. unlucky first) was counterbalanced across items.

Procedure. Participants were read a script that included 16 items. On each trial, participants were told the names of two children and a fact about each of them that was classified as either intentional good, intentional bad, lucky, or unlucky (identical to Study 1). Participants were also shown pictures of the siblings of each of the children. They were then told about another action (intentional good or bad) and were told that the sibling of one of the two initial actors had performed that action. Participants were asked to point to the sibling who they believed performed the action. As in the previous studies, when a child or his or her sibling was mentioned, a picture of that child appeared on the screen. Below is a complete example of an intentional comparison:

Intentional good: This is Ross [picture appears] and his brother [picture appears]. Ross shared his toys with his neighbor.
Intentional bad: This is Liam [picture appears] and his brother [picture appears]. Liam stole a toy from his neighbor.
Intentional bad: The brother of either Ross or Liam punched a classmate. Which brother punched his classmate?

Below is a complete example of a random comparison:

Unlucky: This is Jeff [picture appears] and his brother [picture appears]. On Saturday, Jeff turned on the television and found that there were no cartoons on.
Lucky: This is Todd [picture appears] and his brother [picture appears]. On Saturday, Todd turned on the television and found that there was an extra hour of cartoons on.
Intentional good: Either Jeff’s or Todd’s brother helped the teacher clean up after art. Which brother helped his teacher?

This study took slightly longer than previous studies, and therefore, halfway through the script (after eight items), we routinely asked participants if they wanted to keep playing. Often children, especially the younger ones, wanted to stop. We always allowed children to stop whenever they asked, and the majority, if they stopped, stopped after eight items. Therefore, we also alternated whether children started at Item 1 or Item 9 to maximize the number of children completing each item. In total, 20 participants did not complete all 16 items; however, all participants, except the ones described below, completed at least six items.

Data preparation and analyses. Several participants were dropped from analyses for the following reasons: Participants always picked the same side of the screen or picked the same side of the screen on 15 of 16 trials (n = 10), the parent interfered during the task (n = 2), the participant quit after one item (n = 1), or the child clearly did not understand the task (n = 3). After these exclusions, our sample included 78 participants (41 female), aged 4 to 12 years (M = 7, SD = 2). We then computed prediction scores in the same manner as in Study 1; however, because 20 participants did not complete all of the items, we had to exclude these participants from any prediction score in which they did not answer all four items, which resulted in a sample of 58–63 participants for each prediction score (comparable to the number of subjects in Study 1; see Footnote 4). Data were prepared and analyzed using the methods described in Study 1.

Results

Lucky versus unlucky. In our main comparison of interest, we found that participants were significantly more likely to pick the sibling of the unlucky target to perform an intentional bad action than the sibling of a lucky target, who was in turn selected to be more likely to perform a good action, as indicated by a one-sample t test comparing the mean prediction score (M = 2.48) with chance (M = 2.0), t(57) = 3.51, p = .001 (see Figure 2 for proportion of predicted responses). That is, children generalized evaluations of an actor to the moral behavior of his or her siblings. It was possible that this significant effect was driven largely by children believing either that the siblings of lucky people would do more good things or that the siblings of unlucky people would do more bad things; however, a paired-samples t test indicated that there was no significant difference on the basis of the valence of the prediction question, t(57) = 0.90, p = .37. As in Study 1, we found that age did not correlate significantly with prediction score for the lucky versus unlucky comparison, although it was in the positive direction (r = .15, p = .25).

Intentional good versus bad. The sibling of the intentional bad actor was judged as more likely to perform another intentional bad action than was the sibling of the intentional good actor (and vice versa for a different intentional good action), as indicated by a one-sample t test comparing the average prediction score (M = 2.66) with chance, t(58) = 4.52, p < .001 (see Figure 2 for proportion of predicted vs. unpredicted responses). This result suggests that, barring the presence of other information, children use the purposeful behavior of one sibling to predict the purposeful behavior of another sibling. A nonsignificant paired-samples t test indicated that this effect was equally driven by participants’ tendency to see the sibling of an intentional good actor as likely to perform a good action and by participants’ tendency to see the sibling of an intentional bad actor as likely to perform a bad action, t(58) = 0.15, p = .89. As in Study 1, age was correlated with performance on the intentional good versus bad comparison (r = .44, p < .001), again suggesting that older children show more consistency across trials than do younger children.

Intentional good versus lucky. Surprisingly, given the previous results, participants did not distinguish between the siblings of intentionally good and lucky targets in predicting behavior of siblings, as indicated by a one-sample t test comparing the mean prediction score (M = 2.15) with chance (2.0; p > .30). Thus, children did not make a significant distinction between whether a target’s sibling performed an intentional good action or experienced a lucky event in evaluating that target’s future behavior. There was no significant difference between predictions of good versus bad actions, as indicated by a paired-samples t test, t(59) = 0.74, p = .46. Just as in Study 1, the correlation between performance on this comparison and age was not significant (r = .06, p = .63).

Intentional bad versus unlucky. In a similar vein, across all participants, children did not distinguish between intentional bad actors and unlucky targets in predicting sibling behavior (M = 2.17, p < .15). They did not evaluate the sibling of an intentional bad actor to be any more or less likely to perform a different intentional bad action than the sibling of an unlucky target. In addition, there was a correlation between age and this comparison (r = .31, p = .013), suggesting that older children tended to show this expectation more than did younger children (as they did in Study 1). Again, there was no significant difference between prediction scores for good and bad prediction items, t(62) = 0.11, p = .91.

Footnote 4. Because of concerns about the number of participants excluded in these analyses, we reanalyzed the data using proportions, so that a child who completed only two lucky versus unlucky items but selected the predicted response on both items would get a score of 1.0, the same as a child who completed four items and always selected the predicted response. Note, however, that a child who was distracted for one item would look very different when he or she had answered two questions than when he or she had answered four questions. In the former case, the child would get a score of .5 (chance), whereas in the latter case, the child would get a score of .75 (better than chance). This is why we did not analyze the data using this strategy in the text; this limitation notwithstanding, the results of these proportion-based analyses were almost identical to those reported in the text using overall scores.

Figure 2. Mean proportion of responses in which participants selected the predicted response in Study 3. The predicted response was selecting the sibling of the lucky or intentional good actor to perform a good action and selecting the sibling of the unlucky or intentional bad actor to perform a bad action. The proportion of unpredicted responses is simply 1 − the proportion of predicted responses. Because including both the predicted and unpredicted bars is redundant, we only included the predicted responses in the graph.

Discussion

Study 3 demonstrated that children are willing to evaluate people on the basis of the actions and experiences of their siblings. The negative evaluation of unlucky people observed in Studies 1 and 2 “rubs off” on children’s evaluations of their siblings: They are seen as more likely to perform other bad actions. In the same vein, siblings of lucky people are viewed as more likely to perform intentional good actions. These findings provide evidence that evaluative contagion exists and that children’s preference extends to the associates of lucky and unlucky people.

Surprisingly, children seemed to lose the distinction they made in Study 1 between intentional and random events when evaluating the siblings of targets. Although children view siblings of intentional good actors as likely to engage in intentional good actions when compared with intentional bad actors, they do not believe that siblings of intentional good actors are more likely to do so than siblings of lucky targets. Similarly, although siblings of intentional bad actors are seen as likely to engage in intentional bad actions themselves when compared with siblings of intentional good actors, they are not seen as more likely to do so than the siblings of their unlucky counterparts. Lending further evidence to this claim is the fact that, unlike in Study 1, there is no significant difference between the mean scores on the intentional good versus bad items and the mean scores on the lucky versus unlucky items, t(56) = 0.67, p = .51. In other words, whether a child was robbed or was a robber, the sibling was viewed equally negatively despite the fact that the evaluations of the actual robbed child or robber child may have differed. A possible explanation for this pattern is that siblings merely get tagged with a valence (good vs. bad), and the nature of the original source event is not involved in the subsequent evaluation. We return to this affective tagging hypothesis in the General Discussion, but given past findings (e.g., Olson et al., 2006), one could see how being a member of an unlucky or otherwise disadvantaged group could lead to being negatively evaluated, even if the member being evaluated was not the person involved in the original negative event (such as the siblings in this study).

These first three studies stand as evidence of the breadth of the preference for the lucky and evaluative contagion effects. These phenomena extend beyond judgments of preference to beliefs about the likelihood of future action, including predictions of both a target’s actions and the actions of a target’s sibling. All three studies also demonstrated a small increase in the consistency of behavioral predictions for lucky and unlucky targets over development, from roughly age 5 through age 12. In the next three studies, we further investigated the development of preference for the lucky by testing whether even younger, preschool-aged children showed a preference for the lucky.

Study 4: Preference for the Lucky in Preschoolers

The question of how early this preference emerges has not been broached. In the current research, we tested this question directly by creating a simple task that very young children could perform.


Namely, we presented children with pairs of targets and asked them simply “who’s nicer?” As in Study 1, we compared children’s evaluations of lucky and unlucky targets but also compared intentional good with intentional bad actors, intentional good with lucky targets, and intentional bad with unlucky targets. Thus, we explored the emergence of these distinctions in 2.5- to 4.5-year-old children. Evidence of a failure (random performance) at one age and a success at the following age would suggest that either a distinction begins to be made during this period or the task is too hard for children below this age. To differentiate between these two possibilities, we compared performance on the comparison of interest (lucky vs. unlucky) with the other three comparisons (intentional good vs. bad, intentional good vs. lucky, intentional bad vs. unlucky). If children performed above chance in an expected direction on at least one comparison, this would suggest that children understood the task and simply failed to make the lucky versus unlucky distinction. If they failed at all tests, it would either mean that young children fail to make all distinctions or, more likely, that the children failed to understand the task.

Because so many cognitive and social psychological changes occur in children during this time (e.g., emergence of reasoning about false beliefs; Gopnik & Astington, 1988; Wellman, Cross, & Watson, 2001), we placed children into narrower 6-month age ranges to determine exactly how they perform at each age. For ease of discussion, we label each age group by its lower bound (e.g., children aged 36–41 months are called 3.0-year-olds). This sample included not only younger children than the first three samples but also more racial and ethnic diversity. This allowed us to test (beyond the pilot study mentioned in the discussion of Study 1) whether our results are limited to majority group participants.
Finally, by using such young children in this study, we shed light on whether BJW is a likely explanation for children’s preference for the lucky. As was described in our introduction, Lerner (1977) predicted that the origins of just-world beliefs are tied to learning to delay gratification and a transition away from egocentrism. In addition, his theory postulated that children must understand the relationship between their behavior and the consequences that occur later and then must apply this understanding to the behavior of others. All of these abilities are beyond the scope of young preschoolers (Harris, 1992; Kurdek, 1979; Kurdek & Rodgon, 1975; Mischel & Mischel, 1983), so evidence of a preference for the lucky in young preschoolers would call into question BJW theory as an explanation for preference for the lucky in young children.

Method

Participants. Twelve 2.5-year-olds (5 female; 33.2–36.9 months, M = 35.1, SD = 0.77), thirty-one 3.0-year-olds (16 female; 36.3–41.9 months, M = 39.0, SD = 1.8), twenty-nine 3.5-year-olds (16 female; 42.0–47.8 months, M = 45.2, SD = 1.6), twenty-seven 4.0-year-olds (18 female; 48.0–53.6 months, M = 50.4, SD = 1.9), and sixteen 4.5-year-olds (8 female; 54.1–59.9 months, M = 56.7, SD = 2.1) participated. This sample was considerably more diverse than those in the previous studies, as it included 44 White, 11 Black, 19 Asian, 10 Hispanic, 2 Native American, and 17 biracial participants and 12 participants whose parents did not specify race or ethnicity. All participants were recruited while attending a university preschool in California.

Stimuli. Twenty-four pictures of children (12 male, 12 female) were selected from a larger database of photographs and made into 12 same-sex pairs, matched on adult ratings of attractiveness and age. Twenty-four statements were also created such that six involved intentional good events, six involved intentional bad events, six involved random good experiences, and six involved random bad experiences. An object was used to represent each statement (to minimize memory demands), and a photograph of a child was included to represent the target. For example, for the item “[John] helped his parents with the chores,” a vacuum cleaner icon was presented along with a unique picture of a boy. Each statement/object/photograph set was paired with another on a page of a flipbook. In total, participants saw three intentional good–intentional bad pairs, three random good–random bad pairs, three intentional good–random good pairs, and three intentional bad–random bad pairs. In total, we created eight versions of the task to counterbalance for gender of targets, item effects, and the side of the flipbook on which each photograph appeared. The order of mention of targets (e.g., lucky vs. unlucky) varied across items. All subjects completed items in the following order, although the exact items differed across versions: IG (intentional good)–IB (intentional bad), RG (random good)–RB (random bad), IG–IB, RG–RB, IG–IB, RG–RB, IG–RG, IB–RB, IG–RG, IB–RB, IG–RG, IB–RB. This order was selected because the first six items were the primary ones of interest, and we were initially concerned that the younger children might not sit through 12 items (although they did). Participants were sequentially assigned to one of the eight versions.

Procedure. Participants were brought to a small room and were seated next to the experimenter. Children were told “We’re going to play a game. This game is called the ‘who’s nicer?’ game. I will tell you about some people and then I’ll ask you ‘who’s nicer?’ Does that make sense? Are you ready to play?” Once children indicated that they were ready, the experimenter began reading the pairs one at a time until participants completed all 12 items or something caused the participant to finish early (e.g., a fire alarm). Five participants (4%) participated but were excluded because they failed to complete all 12 items. Failure to complete the study was the result of accidents, such as fire drills, or a child needing to use the bathroom during the task.

Results

Data preparation. For each type of comparison, we computed a separate score, giving children one point each time they selected the predicted response (intentional good for the IG–IB comparison, random good for the RG–RB comparison, intentional good for the IG–RG comparison, and random bad for the IB–RB comparison). Each child therefore had a score that ranged from 0 (never picked the predicted response) to 3 (always picked the predicted response) for each type of comparison. Scores were always compared with chance (1.5) using a one-sample t test.

Overall results. Across all participants, responses for all composites differed from chance in the predicted direction: intentional good versus intentional bad (M = 2.06, SD = 0.88), t(114) = 6.82, p < .001; lucky versus unlucky (M = 1.87, SD = 0.88), t(114) = 4.48, p < .001; intentional good versus lucky (M = 1.86, SD = 0.94), t(114) = 4.10, p < .001; intentional bad versus unlucky (M = 1.81, SD = 0.90), t(114) = 3.69, p < .001.
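The comparison with chance can be illustrated from the summary statistics alone. The sketch below is not the authors' analysis code; the function name is hypothetical, and the sample size of 115 is inferred from the reported 114 degrees of freedom. It recomputes the one-sample t statistic for the intentional good versus intentional bad composite. Because the reported means and standard deviations are rounded, recomputed values for the other composites may differ slightly from the published ones.

```python
import math

def one_sample_t(mean, sd, n, chance):
    """t statistic for a sample mean tested against a fixed chance value."""
    standard_error = sd / math.sqrt(n)
    return (mean - chance) / standard_error

# Reported summary values; n = 115 is an inference from df = 114.
t_intentional = one_sample_t(mean=2.06, sd=0.88, n=115, chance=1.5)
print(round(t_intentional, 2))  # 6.82
```

The chance value of 1.5 is half of the three items in each composite, which is the expected score if a child picked between the two targets at random.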

JUDGMENTS OF THE LUCKY

Having a range of ages also allowed us to look for correlations between age and performance. Age was correlated with all of the composites such that older children had higher scores on all composites (intentional good vs. intentional bad, r = .46, p < .001; lucky vs. unlucky, r = .31, p = .001; intentional good vs. lucky, r = .46, p < .001; intentional bad vs. unlucky, r = .35, p < .001). See Figure 3 for a breakdown of responses by age. We found no significant effect of gender on any of the four composites (ps > .10).

2.5-year-olds and 3.0-year-olds. None of the indices differed significantly from chance for 2.5-year-olds (ps > .35) or for 3.0-year-olds (ps > .15).

3.5-year-olds. Participants who were 3.5 years old judged the intentional good actors to be nicer than the intentional bad actors (M = 1.93, SD = 0.88), t(28) = 2.63, p = .014; the lucky targets to be nicer than the unlucky targets (M = 2.03, SD = 0.78), t(28) = 3.70, p = .001; the intentional good actors to be nicer than the lucky targets (M = 1.86, SD = 0.88), t(28) = 2.23, p = .034; and the unlucky targets to be marginally nicer than the intentional bad actors (M = 1.79, SD = 0.82), t(28) = 1.93, p = .064.

4.0-year-olds. Four-year-old participants judged the intentional good actors to be nicer than the intentional bad actors (M = 2.52, SD = 0.75), t(26) = 7.03, p < .001; the lucky targets to be nicer than the unlucky targets (M = 2.15, SD = 0.91), t(26) = 3.71, p = .001; the intentional good actors to be nicer than the lucky targets (M = 2.52, SD = 0.64), t(26) = 8.23, p < .001; and the unlucky targets to be nicer than the intentional bad actors (M = 2.07, SD = 1.00), t(26) = 3.40, p = .002.

4.5-year-olds. The 4.5-year-old participants also viewed the intentional good actors as nicer than the intentional bad actors (M = 2.63, SD = 0.72), t(15) = 6.26, p < .001; the lucky targets as nicer than the unlucky targets (M = 2.13, SD = 0.81), t(15) = 3.10, p = .007; the intentional good actors as nicer than the lucky targets (M = 2.19, SD = 0.91), t(15) = 3.02, p = .009; and the unlucky targets as nicer than the intentional bad actors (M = 2.25, SD = 0.77), t(15) = 3.87, p = .002.

Discussion Across ages, a consistent pattern emerged such that around the age of 3.5 years, children were able to make distinctions between those who performed intentional good versus bad actions and


between those who experienced lucky versus unlucky events, and they made further distinctions between those involved in intentional versus random actions. In particular, it is interesting that these distinctions seem to emerge at approximately the same age. Of course, one possible explanation remains that the task was simply too difficult for younger children. In Study 6, we addressed this possibility by testing children in an even simpler task. Our previous study of attitudes toward the lucky and unlucky looked exclusively at children over the age of 5.0 years. The current study allowed us to see that 4-year-olds do in fact demonstrate this preference and that even 3.5-year-olds do. In addition, this study newly examined the age at which children begin to distinguish between evaluations of intentional actors and random targets. Most of the studies that have compared intentional with accidental events have examined older children (Elkind & Dabek, 1977; Surber, 1982) or collapsed over large age ranges and therefore have not conclusively demonstrated that 3.5-year-olds show this distinction between the intentional and accidental (e.g., Shultz & Wells, 1985; Shultz et al., 1980; Yuill & Perner, 1988). We have demonstrated that the distinction between intentional and random is made reliably around age 3.5 years. At this age, children recognize that an individual is “more good” or “more bad” if he or she acted with intent than if he or she happened to be a mere recipient of such events. Another explanation for this effect needs to be addressed. It is possible that these studies created a preference for the lucky by forcing such a response. That is, perhaps young children actually preferred the lucky and unlucky targets equally but merely demonstrated this bias because they had to select an answer, a judgment they would not have offered if left alone. Such a result is still interesting, and this possibility is worth testing, so we did so in Study 5.

Study 5: Do Young Children Actually Think the Lucky and Unlucky Are Equally Nice? In Study 5, we examined the possibility that our findings in Study 4 were the result of an experimental demand. To test this possibility, in Study 5, we presented children with a forced-choice task that included a third alternative: “They’re exactly the same.” Children were presented with the same stimuli as in Study 4, but were instead asked “Who’s nicer, [Johnny], [Jimmy], or they’re exactly the same?” If anything, this option should have created a new demand: to employ use of the exactly the same response, given its neutral stance and presence as an option. If children’s natural inclination is not to differentiate between lucky and unlucky targets, then we should have seen children providing more exactly the same responses than lucky responses, and we should have seen the difference between lucky and unlucky disappear. In contrast, if children believe that lucky is better than unlucky, we should have continued to see the lucky targets selected more often than the unlucky, in spite of the presence of an exactly the same response.

Figure 3. Mean number of times in Study 4 that children at each age selected the lucky or intentional good actor as nicer than the unlucky or intentional bad actor. A score of 1.5 equals chance. Asterisks indicate that the mean differed significantly from chance. Intent = intentional.

Method

Participants. Participants included 49 children (33 female) between the ages of 4.0 and 5.5 years (M = 54.7 months, SD = 5.1 months) attending the same university preschool as those in Study


OLSON, DUNHAM, DWECK, SPELKE, AND BANAJI

4. Two subjects were excluded from analyses because they did not understand the task, resulting in 47 children (33 female; M = 55.0 months, SD = 5.1 months). This study also employed a diverse sample (racial/ethnic breakdown: 19 White, 4 Black, 7 Asian, 7 Hispanic, 5 biracial or multiracial, and 5 did not specify). We included children in this age group because they had most clearly demonstrated the effects in Study 4.

Stimuli and design. The stimuli were identical to those presented in Study 4, although the exact items were randomized. This time, the 12 items included six that compared lucky to unlucky and six that compared intentional good to bad (because these were the primary questions of interest, and we wanted to collect more data on these items from each participant). The items of most interest (lucky vs. unlucky) were presented first. We included the intentional good versus intentional bad items to test whether the new option, “exactly the same,” changed performance on the clearest type of comparison and to test whether children understood the task. We reasoned that if a child always responded “exactly the same” for each lucky versus unlucky item, this could be either because he or she thought that random events were not indicative of niceness or because he or she did not understand the task. We therefore included intentional good versus intentional bad to distinguish between these two cases. If a child used the exactly the same response for every random comparison but stopped using the exactly the same response for any of the intentional good versus intentional bad items, as a few children did, then we kept the child in the data set because it seemed clear that he or she understood the task.
If a child used the exactly the same response for all 12 items (including intentional and random), we hypothesized that the child did not understand the task, because it seemed unlikely that a child would believe that in all cases an intentional bad actor was just as nice as an intentional good actor. The latter situation occurred only one time, and this child was excluded from the dataset (one of the two excluded above). In total, four versions of the task were created, consisting of two different scripts, each with the order described above. We then counterbalanced the scripts to control for the side of the page on which a given photograph appeared. In all scripts, the order of mention was varied across items. Procedure. First, children were given three training trials for what we described as the first game, the who’s taller game. The experimenter explained that two people would appear and the task would be to say which one was taller: the first one, the second one, or they are exactly the same. In these trials, two stick figures were presented. In the first two trials, one was clearly larger than the other, and the experimenter indicated which one she would select if asked who is taller. In the third trial, two stick figures of the same size were presented, differing in color, and the experimenter indicated that she would say “They’re exactly the same.” Data were not recorded for this task, but anecdotally, children seemed to understand and often shouted their (always correct) responses before the experimenter had a chance to say her opinion. Finally the experimenter said that the subject would get to play a game but that it was a little different from the who’s taller game. Instead, the game would be the who’s nicer game, and children could select either person or say “they’re exactly the same.” All children said they understood, and the experimenter began. 
Children were read each of the 12 pairs of items and were then asked, “Who’s nicer, [Alex], [Andrew], or they’re exactly the same?” As a conservative test, we added “exactly the same” as the final response, as anecdotally, we have observed that children have a tendency to pick the last option given. Children indicated their responses by either pointing or stating their response.

Data preparation. For the six items of each type (random or intentional), we computed a score, tallying the number of times the good, bad, or exactly the same response was given. We then compared each score with chance using a one-sample t test.

Results and Discussion

Contrary to a task-demand explanation, we found that children continued to select the lucky targets (M = 3.14 out of 6) as nicer more often than chance (2.0), t(48) = 4.11, p < .001 (see Figure 4). Because these responses were interdependent, it is not surprising that the unlucky targets were selected less often than chance (M = 1.14), t(48) = −4.83, p < .001, and the exactly the same response did not differ from chance (M = 1.71), t(48) = −0.97, p = .34. It is not surprising that children also selected the intentional good actor most often for the intentional items (M = 4.59) and that this was selected more often than chance (M = 2.0), t(48) = 10.48, p < .001 (see Figure 4). The intentional bad actors were selected less often than chance would predict (M = 0.63), t(48) = −10.07, p < .001, and the exactly the same response was also selected less than chance (M = 0.78), t(48) = −5.92, p < .001. A summary of the results can be seen in Figure 4.

One possible explanation for these results is that children were simply reluctant to use the exactly the same response and that this may therefore have been an unfair test. However, 55% of children used this response at least once during the task. For these participants, we computed a preference-for-the-lucky score by subtracting the number of times they selected the unlucky target as nicer from the number of times they selected the lucky target as nicer. We compared this value with zero using a one-sample t test and found that even those participants who used the exactly the same response at least once selected the lucky more than the unlucky, t(26) = 2.06, p = .05. These results suggest that the findings in Study 4 were not simply the result of a forced-choice task. Instead, we found that young children continued to articulate that the lucky target was nicer than the unlucky target. 
Of course, children did use the exactly the same response from time to time, but they did not do so more often than they selected the lucky target, and the addition

Figure 4. Mean proportion of times in Study 5 that each actor was selected as the nicer one across six lucky versus unlucky items and across six intentional good versus intentional bad items in which the option “exactly the same” was also given. Int. = intentional.


of this option did not undermine the preference for the lucky over the unlucky.
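The scoring used in Study 5 can be sketched as follows. This is a minimal illustration rather than the authors' analysis code: the response labels, the function names, and the example child's answers are hypothetical. Only the chance value of 2.0 (six items divided evenly among three options) and the definition of the preference score (lucky selections minus unlucky selections, compared with zero) come from the text.

```python
from collections import Counter

OPTIONS = ("lucky", "unlucky", "same")  # hypothetical labels for the three responses

def tally(responses):
    """Count how often one child chose each of the three options."""
    counts = Counter(responses)
    return {option: counts.get(option, 0) for option in OPTIONS}

def chance_level(n_items, n_options=len(OPTIONS)):
    """Expected count per option if a child chose among the options at random."""
    return n_items / n_options

def lucky_preference(counts):
    """Lucky selections minus unlucky selections; zero means no preference."""
    return counts["lucky"] - counts["unlucky"]

# Hypothetical answers from one child on the six lucky-versus-unlucky items.
example = ["lucky", "same", "lucky", "unlucky", "lucky", "same"]
counts = tally(example)
print(counts)                      # {'lucky': 3, 'unlucky': 1, 'same': 2}
print(chance_level(len(example)))  # 2.0
print(lucky_preference(counts))    # 2
```

Because the three tallies for a child always sum to six, the scores are interdependent, which is why a lucky count above chance goes together with an unlucky count below chance.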

Study 6: Preference for the Lucky in a Simplified Task

A remaining concern from Study 4 is the difficulty of the task for children younger than 3.5 years of age. This possibility is suggested by evidence that 2.5- and 3.0-year-old children were no more likely to select the intentional good actor than the intentional bad actor as nicer. Despite our attempts to make Study 4 simple, perhaps the memory and attention load (learning and then remembering what two different people did before making a response) was simply too great for our youngest participants. Thus, in Study 6, rather than presenting pairs of targets and asking children to remember both before selecting an answer, we presented one target at a time and simply asked children whether each target was nice or not nice. Because Study 4 demonstrated a preference for the lucky in children beginning at age 3.5, in this study, we tested 3.0-year-old children.

Method

Participants. Participants included twenty-five 3.0-year-old children (11 female, aged 36.7–41.8 months, M = 39.3, SD = 1.5) recruited from either the same campus nursery school in California as the children in Studies 4 and 5 (n = 12) or from a lab database in Massachusetts (n = 13). Participant race was not recorded, but we estimated that the final sample was approximately 60% White and that the remaining 40% were evenly distributed among Black, Asian, and multiracial participants.

Procedure. Participants were brought to a small testing room and were seated next to the experimenter. The experimenter told the child that he or she was going to see some other kids and be asked whether each target was nice or not nice. Participants were presented with a total of 24 targets, six of each type (lucky, unlucky, intentional good, intentional bad) in one of four possible scripts (2 randomized orders × 2 gender orders).

Results and Discussion

Data preparation and analyses. In general, children were more likely to say “nice” than expected by chance (50%), t(24) = 2.27, p = .032, and this was true for some participants more than others. We were not concerned with this fact, however, given that this bias should have been equally prevalent across item types and that our analyses were within subject. A composite score was created for each type of item such that the total number of nice judgments (out of six possible) was computed for each subject. We then compared these means using paired-sample t tests. Participants most often designated the intentional good actors as nice (M = 5.0, SD = 2.1), followed by the lucky targets (M = 4.2, SD = 2.0), then the unlucky targets (M = 3.4, SD = 2.1), and then the intentional bad actors (M = 2.4, SD = 2.2); this pattern was demonstrated by a significant linear trend, F(1, 24) = 24.12, p < .001. In addition, all paired t tests indicated differences. Most notable, the lucky targets were more often labeled as nice compared with the unlucky targets, t(24) = 2.16, p = .041, and the intentional good actors were labeled as nice more often than the intentional bad actors, t(24) = 5.17, p < .001. Children also


selected the intentional good actors as nice more often than the lucky targets, t(24) = 2.22, p = .036, and the unlucky targets as nice more often than the intentional bad actors, t(24) = 3.69, p = .001. Thus, by simplifying the attentional and memory demands of the task, we demonstrated that even children aged 3.0 years prefer lucky to unlucky individuals. In a pilot version of this study with 2.5-year-olds, we found that this task was too difficult for them. Children at this age either said “nice” for every item or simply refused to provide an answer, suggesting that to ask whether children younger than 3.0 years show a preference for the lucky, a completely new, perhaps nonverbal task needs to be created.

Across Studies 4–6, our results indicated that even very young preschoolers demonstrate a preference for the lucky over the unlucky. This preference appears when lucky and unlucky individuals are pitted against each other in a forced choice, when children have an explicit option to like lucky and unlucky targets equally, and when the targets are presented serially. Such a finding causes some problems for the fullest just-world explanation. Lerner’s (1977) hypotheses about the emergence of just-world thinking suggest that children need to be many months if not years older to show the earliest evidence of just-world thinking. His explanation requires that children move beyond the pleasure principle to the reality principle, which is expected to occur around the age of 6 or 7 years. Even with the most generous definition, 3 years of age is clearly too young for such a transition. It seems highly unlikely, given other results in cognitive development, that 3.5-year-olds have the cognitive capacities and awareness, such as perspective taking and delay of gratification, required for just-world types of reasoning (Kurdek, 1979; Kurdek & Rodgon, 1975; Mischel & Mischel, 1983). 
These results provide clear support for the hypothesis that a preference for the lucky is in place by age 3.0 and that it may be present prior to that age. It is possible that in future research with new procedures, such a preference may be detected even earlier. It is also possible that a simpler version of the just-world belief (e.g., a basic belief that good things happen to good people and bad things happen to bad people, without a deeper understanding of the complexity of the world or an ability to inhibit their actions) is held by very young children, although establishing that such a theory is in place would require additional work. One of the major undertakings of the current article was to investigate how widespread the preference for the lucky is. We have now demonstrated that it is seen in children ranging in age from 3 to 12 years, that older children extend this preference to predictions of the intentional behavior of lucky and unlucky targets, and that they extend the preference to the siblings of the targets. In the final two studies, we investigated whether preference for the lucky and evaluative contagion appear cross culturally.

Study 7: Cross-Cultural Evidence of the Preference for the Lucky In two final studies, we asked whether the preference for the lucky is constrained to the minds of young children from Western cultures or whether it might instead be a preference held by young children across very different cultures. As our first test of this question, we investigated whether young children who were raised


in a culture that appears to promote fewer trait inferences than that of the United States show this same preference. Research on causal attributions has examined cross-cultural differences in adults’ tendency to use situational versus dispositional (trait) explanations of human behavior (Masuda & Kitayama, 2004; Morris & Peng, 1994).5 Although sometimes the findings have been mixed, when differences have been found, they have tended to fall along Eastern (Japan, India, etc.) versus Western (United States, England, etc.) lines, with Easterners tending to use more situational explanations for behavior and Westerners using more dispositional explanations (Krull et al., 1999; Masuda & Kitayama, 2004; Morris & Peng, 1994). As noted, it is possible that dispositional attributions are at the heart of children’s preference for the lucky. That is, perhaps because American children live in a culture that values dispositional attributions, they tend to blame unlucky targets and credit lucky targets, essentially overextending dispositional explanations to random events. Therefore, Japan stands as an interesting test case for examining cultural variability in the preference for the lucky over the unlucky. If children in both cultures show a preference for the lucky over the unlucky, then cultural differences in attributions likely do not explain the preference-for-the-lucky effect. For this study, we employed a simple test of preference for the lucky in the form of liking judgments of lucky and unlucky targets. Such a test allowed us to examine whether Japanese children differentially evaluated lucky and unlucky targets, rather than asking them to predict behavior (a task that we reasoned required a more elaborated judgment, so if differences occurred, they could be explained by several factors). In addition, by employing the selected method, we could compare the results with published findings with an equivalent American sample (Olson et al., 2006).

Method

Participants. Twenty-six children from rural Japan participated; 3 were excluded because of poor participation (e.g., giving the same response to every item), resulting in 23 participants (10 female; aged 4–7 years; M = 5). We selected these ages to approximately match those used in Olson et al. (2006), which employed an identical method. In actuality, this sample was approximately 6 months younger than the Olson et al. sample.

Stimuli. In total, 40 vignettes were created, 10 of each type (intentional good, intentional bad, lucky, unlucky). These items were scrambled and divided into lists of 10 items each. The items were then scrambled again, and four more lists were created, making a total of eight lists. Each list contained at least one item of each type, with the gender of the targets alternating by item. Participants were sequentially assigned to a list. All items were taken from Olson et al. (2006) and were translated into Japanese by Yarrow Dunham and then checked and back translated by two native Japanese speakers to ensure accuracy. The only changes made were those necessary to maintain understanding (e.g., in the American version, the target found $5 on the sidewalk, whereas in the Japanese version, the target found 500 yen, and names were changed from Mike to Minoru).

Procedure. First, children were trained to use a 6-point smiley-to-frowny-face scale that they were to use later in the study. The experimenter gave examples of how he would evaluate different people (e.g., his mother vs. his neighbor) using the scale and

Figure 5. Mean liking rating for intentional good, lucky, unlucky, and intentional bad actors in Study 7, as rated by Japanese children. Higher scores indicate greater liking, and error bars indicate standard error of the mean.

asked the child if he or she understood how to use the scale. Children were then read one of the scripts that included 10 items describing the actions of an individual or an event experienced by an individual (e.g., Tarou helped his teacher). After reading each vignette, the experimenter asked children to indicate how much they liked each actor using the 6-point smiley-to-frowny-face scale. These scores were then converted to a 6-point numeric scale. Data preparation. Following the procedure of Olson et al. (2006), the average rating for each type of target (intentional good, intentional bad, lucky, unlucky) for each participant was computed. We used paired t tests to compare ratings of targets.

Results and Discussion

Ratings. Japanese children preferred intentional good targets (M = 4.68) to intentional bad targets (M = 3.06), t(22) = 3.61, p = .002, and lucky targets (M = 4.24) to unlucky targets (M = 3.12), t(22) = 2.87, p = .009 (see Figure 5). These results support the claim that Japanese children have a preference for the lucky over the unlucky, despite living in a culture that tends to use fewer dispositional attributions and despite our use of a non-forced-choice method. It is interesting that the difference between intentional bad actors and unlucky targets was not significant (p > .75) and that the difference between intentional good actors and lucky targets was only marginally significant, t(22) = 1.85, p = .08. In addition, we computed the difference between evaluations of intentional good and bad actors and the difference between evaluations of lucky and unlucky targets. We then compared these differences and found that there was no significant difference, t(22) = 1.32, p = .20. That is, Japanese children made almost no distinction between whether targets engaged in intentional behavior or experienced random events.

It is important to note that the effect size of the preference for the lucky in this sample is very similar to the equivalent effect size in the U.S. sample reported in Olson et al. (2006), which used the same task and a comparable age range (d = 1.07 in United States, d = 0.93 in Japan). However, the comparison of intentional good and intentional bad actors suggests a large difference between the Japanese and American samples. Although both groups preferred the intentional good to intentional bad targets, the effect size in the American sample is more than twice as large as the effect size in the Japanese sample (d = 3.04 in United States, d = 1.30 in Japan), a result perhaps related to past findings of differences between American and Japanese people’s use of dispositional and situational explanations. In addition, in this sample, participants made no significant distinction between the intentional bad and unlucky targets and only a marginal distinction between the intentional good and lucky targets. In comparison, American children made a large distinction between intentional bad and unlucky targets and also a marginal distinction between intentional good and lucky targets.

These results suggest that young children in cultures that vary in evaluations of intentional acts nonetheless blame victims of bad fortune and reward recipients of good fortune in similar ways. Both show a preference for the lucky. In this result, we have initial evidence of cross-cultural generalizability of the preference for the lucky from a country with a culture that provides a meaningful test. In our final study, we took this initial result one step further and asked whether Japanese children also show evaluative contagion.

5 The few studies that have directly examined cross-cultural causal attributions in children have been conducted by Miller (1984, 1986). In those studies, she asked children to spontaneously name examples of intentional good and bad actions from their lives, and she tested whether their explanations for these actions were more situational or dispositional. Her studies differed in several significant ways from the current work: There was no investigation of random events; the events were produced by the subjects, not the experimenters; the children were older than those examined here; and her sample was Indian, not Japanese. She found no significant differences across Indian and American cultures in children’s tendency to use situational versus dispositional explanations.

Study 8: Cross-Cultural Similarity in Evaluative Contagion We asked whether Japanese children extend their preference for the lucky to entire social groups. Children were presented with members of two novel groups, one group that contained some members who had experienced lucky events and one group that contained some members who had experienced unlucky events. It is critical to note that both groups had some members who had experienced neither lucky nor unlucky events (see Levy & Dweck, 1999, for a similar procedure). Children were then introduced to these new members of each group and were asked to indicate which group member they preferred.


Method

Participants. Eighty-seven participants (49 female, aged 4–7 years, M = 5.8, SD = 1) from rural Japan completed the study. One other participant completed the study but had to be removed from the sample because of experimenter error.

Stimuli. An artist drew cartoons of six children, three boys and three girls. The same six pictures were used to represent members of each of the groups; the only difference across groups was the color of their shirts. The lucky and unlucky events were taken from Study 7, and the neutral items were described either as something the actor liked to eat (e.g., Yuko likes oatmeal) or as an activity in which the actor engaged (e.g., Ayumi rides her bike).

Procedure. Participants were presented with four trials. On each trial, they were told about members of two groups. The groups were never explicitly labeled and were distinct only because of shirt color and the side of the screen on which they appeared (e.g., cartoons in blue shirts were always on the right side of the screen, and cartoons in green shirts were always on the left side of the screen). Cartoon children appeared on the screen one at a time, alternating groups (e.g., first a child in a blue shirt appeared, followed by a child in a green shirt). As each picture appeared, participants were told one fact about that child. For Group A, three of the five children were described as experiencing lucky events and two were associated with neutral facts. In Group B, three of five children were described as experiencing unlucky events and two were associated with neutral facts. After the 10 group members had appeared on the screen, two final children appeared, one from each group. These two children, the targets, were identical except for shirt color, and each appeared on the same side of the screen as had the other members of their group. Participants were asked which of these two targets they liked better. Two unique trials like the ones described above were created, and two additional trials were created substituting intentional good actions for lucky events and intentional bad actions for unlucky events, resulting in four final trials. The lucky group, the unlucky group, the intentional good group, and the intentional bad group each appeared on the left once and on the right once.

Data preparation. Data preparation and analysis were identical to those used in Study 2 of Olson et al. (2006). The two lucky versus unlucky items were combined into a composite, and the two intentional good versus bad items were combined into a separate composite. Each composite was computed by giving the subject one point each time they picked the good or lucky actor, resulting in an index score ranging from 0 (never picked the good or lucky actor) to 2 (always picked the good or lucky actor). Because only three scores were possible (0, 1, or 2), nonparametric tests were necessary. Overall results were analyzed using chi-square goodness-of-fit tests (chance was computed to be 25% for 0, 50% for 1, and 25% for 2).

Results and Discussion

Findings. Children’s preferences differed significantly from chance for both the intentional good versus bad comparison, χ²(2, N = 87) = 6.45, p = .040, and the lucky versus unlucky comparison, χ²(2, N = 87) = 16.22, p < .001 (see Figure 6). Inspection of the data indicated that children were more likely to prefer the member of the intentional good group than the member of the intentional bad group and, consistent with evaluative contagion, also preferred the new member of the lucky group to the new member of the unlucky group. These results demonstrate that Japanese children evaluate individuals on the basis of the actions and experiences of others who are socially associated with them.

Figure 6. Proportion of Japanese children’s responses across items in which they preferred the new member of the intentional good or intentional bad group in the intentional good versus intentional bad comparisons (left side) and the member of the lucky or unlucky group in the lucky versus unlucky comparisons (right side) in Study 8.

Again, a comparison with the corresponding U.S. sample is in order. As in Study 7, the effect sizes for these two samples are nearly identical for evaluations of people associated with those experiencing random good and bad events (w = .45 in the United States, from Olson et al., 2006; w = .43 in Japan). Also as in Study 7, American children showed three times as large an effect size for intentional good versus bad items compared with Japanese children (w = .83 for the United States, w = .27 for Japan). Once again, we found that Japanese children showed less of a bias against intentional bad groups than did American children, just as they showed less of a bias against intentional bad individuals.

In sum, children growing up in Japan, where dispositional attributions have been observed to be weaker than in Western cultures, showed a preference for the lucky over the unlucky as well as evaluative contagion in the first cross-cultural tests of these phenomena. These results suggest that evaluative contagion is generalizable across cultures. Japanese children performed nearly identically to American children on this task, preferring members of predominantly lucky groups to members of predominantly unlucky groups.
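The statistics reported for Study 8 can be checked with a short calculation. The sketch below is an illustration only, not the authors' analysis code. It assumes that, under chance, the two trials in each composite behave like independent coin flips (which yields the 25%/50%/25% distribution), that the p value for a chi-square statistic with 2 degrees of freedom is the closed-form upper tail exp(-χ²/2), and that the effect size w is Cohen's w, the square root of χ²/N. All function names are hypothetical.

```python
import math

# Chance model for a 0-2 composite built from two independent
# two-alternative trials: P(0) = .25, P(1) = .50, P(2) = .25.
CHANCE = (0.25, 0.50, 0.25)


def expected_counts(n, probs=CHANCE):
    """Expected number of children at each composite score under chance."""
    return [n * p for p in probs]


def chi_square_gof(observed, probs=CHANCE):
    """Pearson goodness-of-fit statistic for observed counts of scores 0, 1, 2."""
    expected = expected_counts(sum(observed), probs)
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_df2(chi2):
    """Upper-tail p value for a chi-square statistic with 2 df: exp(-chi2 / 2)."""
    return math.exp(-chi2 / 2.0)


def cohens_w(chi2, n):
    """Cohen's w effect size (assumed definition): sqrt(chi2 / N)."""
    return math.sqrt(chi2 / n)


N = 87  # sample size reported for Study 8
REPORTED = (("intentional good vs. bad", 6.45), ("lucky vs. unlucky", 16.22))

for label, chi2 in REPORTED:
    print(f"{label}: p = {p_value_df2(chi2):.4f}, w = {cohens_w(chi2, N):.2f}")
```

Under these assumptions, the reported chi-square values give p of about .040 and w = .27 for the intentional comparison, and p below .001 and w = .43 for the luck comparison, in line with the values in the text. The per-score response counts are not given in this section, so `chi_square_gof` is included only to show how the statistic would be formed from such counts.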

General Discussion

Across eight studies, we have demonstrated that children show a robust tendency to judge the lucky positively. This preference was revealed by a variety of methods and is present in children from a wide range of ethnicities, races, towns, states, countries, and social classes, including predominantly White middle- to upper-middle-class elementary school children in Utah and Massachusetts, low-income Black children in Massachusetts, preschool children from a wide range of ethnicities in California, and rural Japanese children.

Across these many samples and tasks, several results emerged clearly. Young children prefer lucky individuals to unlucky ones, children predict that the lucky are more likely to perform intentional good actions and that the unlucky are more likely to perform intentional bad actions, and children extend these predictions and evaluations to the siblings and group members of lucky and unlucky individuals.

Another major finding of these studies was that the preference for the lucky appeared at a very young age. We cannot conclude that children below the age of 3 years do not prefer the lucky, only that they did not do so in our task, which may simply have been too hard for them. Measures such as looking time and reaching behavior have been used successfully with young infants and even nonhuman primates in other cognitive and social cognitive tasks (Baillargeon, Spelke, & Wasserman, 1985; Nurock et al., 2007; Santos & Hauser, 1999), and perhaps creative researchers can design studies to test whether children (or even other species) prefer the lucky. If evidence for this effect in young infants or other primates were found, it would suggest that either this preference is innate or it grows readily out of early cognition, perhaps in conjunction with early socialization.

The current studies also provided initial evidence that the preference for the lucky is not constrained to Western societies by showing the same tendencies in Japanese school children as in their American counterparts. Although our results are suggestive of cultural invariance, this preference should be examined in countries that differ from the United States and Japan in meaningful ways, such as in beliefs about or experience with luck to test further for cultural invariance. For example, do children who live in surroundings in which they have very little control over their environments and therefore experience unlucky events frequently (e.g., children in refugee camps in Sudan) prefer those who experience lucky events to those who experience unlucky events, or does their own experience attenuate or even reverse this preference?

The value of these results is based both on the empirical demonstrations themselves as well as on the theoretical questions they resolve. Two theories stood out as deserving a test alongside the phenomena of preference for the lucky and evaluative contagion: immanent justice and BJW. Despite a similarity in the structure of the test of immanent justice and the present studies, the results demonstrated a clear dissociation between the two. Whereas immanent justice decreased across age, judgments of the lucky did not and, if anything, increased across childhood. In addition, by demonstrating a preference for the lucky in very young children, we minimized the likelihood that just-world beliefs, as they have been previously described (Lerner, 1977), drive the preference for the lucky in young children.
In a set of related studies in progress, we are now investigating the hypothesis that the preference for the lucky is not driven by justice-related concerns at all but, rather, that a simpler mechanism may be responsible for these effects (Olson, Heberlein, Kensinger, Spelke, Dweck, & Banaji, 2008). In particular, we are investigating the possibility that the affect associated with a good or bad event (whether intended or not) rubs off on the individuals experiencing those events, resulting in evaluations of the individuals that are consistent in valence with the events, a process we call affective tagging. It is important to note that this hypothesis is more parsimonious than many of the justice theories and makes some differing predictions. For example, whereas just-world theory predicts that a preference for the lucky should primarily occur when the events described are extreme and threaten a person’s sense of justice, the affective-tagging hypothesis predicts that lucky individuals will always be associated with some positivity and unlucky individuals will always be associated with some negativity (although in some cases, other factors, such as empathy or impression formation, may work in opposition to these evaluations). This prediction is relevant to the current studies because the items we selected in these studies are trivial events, hardly the events likely to violate one’s sense that the world is just. Therefore, the fact that we see a preference for the lucky even for these events provides some initial evidence in favor of the affective-tagging hypothesis.

One may wonder whether children grow out of the preference for the lucky or, alternatively, whether this preference continues across childhood into adulthood, increasing as the trajectory of the data in this article might suggest. One could imagine that after the age of 12 years, the developmental trajectory shifts and adolescents grow out of this belief. Even if this were the case, a dislike of particular unlucky groups may nonetheless become entrenched in childhood and continue into adulthood, long after the mechanism that formed them has ceased to operate. Another possibility is that adults continue to hold these judgments or even increase them, leading to a continuation of prejudice toward unlucky and disadvantaged people and groups. Our research in progress, in which we use a similar paradigm, suggests that these preferences seem to continue through adulthood, although they abate considerably; this apparent abatement may be due to adults becoming more reluctant to express the preference publicly. In a simple replication of Study 1, we found that adults show the same pattern of believing that lucky targets are more likely to perform good actions and that unlucky targets are more likely to perform bad actions. A preference for the lucky was also found in American adults in a conceptual replication of Study 7, showing that they prefer people who experience lucky events to those who experience unlucky events, even when we used a non-forced-choice design.

As discussed above, this liking of the lucky and disliking of the unlucky is similar to many related findings that suggest that people and things are evaluatively tagged on the basis of the valence of other information associated with that individual or thing. For example, research has shown that adults tend to dislike the bearer of information with which they disagree, even when the bearer of the information disagrees with the information being shared (Manis, Cornell, & Moore, 1974), and that adults see an individual as, for example, more angry if that individual has described another person as angry (Skowronski, Carlston, Mae, & Crawford, 1998).
In addition, even novel objects elicit rapid evaluation (Duckworth, Bargh, Garcia, & Chaiken, 2002), and it is therefore not unreasonable to think that quick evaluations occur when humans observe other humans, a prediction at the heart of the affective-tagging hypothesis (Olson et al., 2008).

Although the preference for the lucky may seem to be an innocent bias, it is possible that it has important and insidious repercussions, in particular because those expressing it are young children. In the real world, random events are, by definition, out of the control of the individuals experiencing them, but they are not completely random in whom they affect. Rather, some groups (those who are disadvantaged) tend to experience these types of events more than do others. Hurricane Katrina, which hit the United States Gulf Coast in August of 2005, stands as a striking example of the unequal impact of random events on members of advantaged and disadvantaged groups. A disproportionate number of those who were stranded in New Orleans were disadvantaged, a disproportionate number of those who died were disadvantaged, and the impact on the lives of those who survived was greater for the victims who were members of disadvantaged groups. Therefore, what at first appears to be an innocuous belief (that lucky people are better than unlucky people) may actually lead to a systematic bias against disadvantaged people and groups, resulting in both inculcation and perpetuation of prejudice in children.

If it is true that the preference for the lucky and the contagion of these judgments play a role in the development and maintenance of prejudice, then this would suggest that to fight prejudice and its development, it is not enough to censor racist remarks, do sensitivity training in schools, and read politically correct stories. As long as negative outcomes continue to fall disproportionately on some groups, we may be unwittingly providing our children with the evidence they use to infer that group’s inferiority. This means that parents, teachers, and society must not only come to understand the preferences young children hold but also must understand that if they wish to change the impact of these preferences, society needs to rectify the injustices that cause disadvantage and/or develop strategies to counteract young children’s early preferences.

Thus, these preferences may be one of the origins of or contributing factors to the development of stereotyping, prejudice, and discrimination, perhaps via the development and maintenance of group hierarchies. Such a conclusion is relevant to social psychological discourse on system-justification theory (Jost & Banaji, 1994) and social dominance theory (Sidanius & Pratto, 1999). Both theories suggest that people are motivated to maintain the status quo in which some social groups have a higher status than others; a preference for the lucky may be one such attitude that contributes to the maintenance of group hierarchies. It is possible that the preference for the lucky is a mechanism for the development and maintenance of system-justifying and social dominance beliefs as well as more specific social-group attitudes. We believe this to be a promising avenue of future research.

References

Baillargeon, R., Spelke, E. S., & Wasserman, S. (1985). Object permanence in five-month-old infants. Cognition, 20, 191–208.
Banaji, M. R., & Bhaskar, R. (2000). Implicit stereotypes and memory: The bounded rationality of social beliefs. In D. L. Schacter & E. Scarry (Eds.), Memory, brain, and belief (pp. 139–175). Cambridge, MA: Harvard University Press.
Bracton, H. (1968–1977). On the laws and customs of England (S. E. Thorne, Trans.). Cambridge, MA: Harvard University Press. (Original work published 13th century) Retrieved August 3, 2006, from Bracton Online, Harvard Law School Library: http://hlsl5.law.harvard.edu/bracton/
Callan, M. J., Ellard, J. H., & Nicol, J. E. (2006). The belief in a just world and immanent justice reasoning in adults. Personality and Social Psychology Bulletin, 32, 1646–1658.
Duckworth, K. L., Bargh, J. A., Garcia, M., & Chaiken, S. (2002). The automatic evaluation of novel stimuli. Psychological Science, 13(6), 513–519.
Elkind, D., & Dabek, R. F. (1977). Personal injury and property damage in the moral judgments of children. Child Development, 48, 518–522.
Fein, D. (1976). Just world responding in 6- and 9-year-old children. Developmental Psychology, 12, 79–80.
Fein, D., & Stein, G. M. (1977). Immanent punishment and reward in six- and nine-year-old children. Journal of Genetic Psychology, 131, 91–96.
Furnham, A. (1985). Just world beliefs in an unjust society: A cross cultural comparison. European Journal of Social Psychology, 15, 363–366.
Furnham, A. (2003). Belief in a just world: Research progress over the past decade. Personality and Individual Differences, 34, 795–817.
Furnham, A., & Rajamanickam, R. (1992). The Protestant work ethic and just world beliefs in Great Britain and India. International Journal of Psychology, 27, 401–416.
Gopnik, A., & Astington, J. W. (1988). Children’s understanding of representational change and its relation to the understanding of false belief and the appearance–reality distinction. Child Development, 59, 26–37.
Hamlin, J. K., Wynn, K., & Bloom, P. (2007, November 22). Social evaluation by preverbal infants. Nature, 450, 557–560.
Harris, P. L. (1992). From simulation to folk psychology: The case for development. Mind & Language, 7, 120–144.


Jahoda, G. (1958). Immanent justice among West African children. Journal of Social Psychology, 47, 241–248.
Johnson, R. C. (1962). A study of children’s moral judgments. Child Development, 33, 327–354.
Jones, C., & Aronson, E. (1973). Attribution of fault to a rape victim as a function of respectability of the victim. Journal of Personality and Social Psychology, 26, 415–419.
Jose, P. E. (1990). Just-world reasoning in children’s immanent justice judgments. Child Development, 61, 1024–1033.
Jose, P. E. (1991). Measurement issues in children’s immanent justice judgments. Merrill-Palmer Quarterly, 37, 601–617.
Jost, J. T., & Banaji, M. R. (1994). The role of stereotyping in system-justification and the production of false consciousness. British Journal of Social Psychology, 33, 1–27.
Kalish, C. W. (2002). Children’s predictions of consistency in people’s actions. Cognition, 84, 237–265.
Karniol, R. (1980). A conceptual analysis of immanent justice responses in children. Child Development, 51, 118–130.
Krull, D. S., Loy, M. H., Lin, J., Wang, C., Chen, S., & Zhao, X. (1999). The fundamental fundamental attribution error: Correspondence bias in individualist and collectivist cultures. Personality and Social Psychology Bulletin, 25, 1208–1219.
Kunda, Z., & Nisbett, R. E. (1986). The psychometrics of everyday life. Cognitive Psychology, 18, 195–224.
Kurdek, L. A. (1979). Children’s coordination of differing cognitive perspectives. Journal of Genetic Psychology, 135, 279–285.
Kurdek, L. A., & Rodgon, M. M. (1975). Perceptual, cognitive, and affective perspective taking in kindergarten through sixth-grade children. Developmental Psychology, 11, 643–650.
Lerner, M. J. (1971). Observer’s evaluation of a victim: Justice, guilt, and veridical perception. Journal of Personality and Social Psychology, 20, 127–135.
Lerner, M. J. (1974). The justice motive: “Equity” and “parity” among children. Journal of Personality and Social Psychology, 29, 539–550.
Lerner, M. J. (1977). The justice motive: Some hypotheses as to its origins and forms. Journal of Personality, 45, 1–52.
Lerner, M. J. (1980). The belief in a just world: A fundamental delusion. New York: Plenum Press.
Levy, S. R., & Dweck, C. S. (1999). The impact of children’s static versus dynamic conceptions of people on stereotype formation. Child Development, 70, 1163–1180.
Long, G. T., & Lerner, M. J. (1974). Deserving, the “personal contract,” and altruistic behavior by children. Journal of Personality and Social Psychology, 29, 551–556.
Manis, M., Cornell, S. D., & Moore, J. C. (1974). Transmission of attitude-relevant information through a communication chain. Journal of Personality and Social Psychology, 30, 81–94.
Masuda, T., & Kitayama, S. (2004). Perceiver-induced constraint and attitude attribution in Japan and the US: A case for the cultural dependence of the correspondence bias. Journal of Experimental Social Psychology, 40, 409–416.
Miller, J. G. (1984). Culture and the development of everyday social explanation. Journal of Personality and Social Psychology, 46, 961–978.
Miller, J. G. (1986). Early cross-cultural commonalities in social explanation. Developmental Psychology, 22, 514–520.
Mischel, H. N., & Mischel, W. (1983). The development of children’s knowledge of self-control strategies. Child Development, 54, 603–619.
Montada, L., & Lerner, M. J. (1998). Responses to victimizations and belief in a just world. New York: Plenum Press.
Morris, M. W., & Peng, K. (1994). Culture and cause: American and Chinese attributions for social and physical events. Journal of Personality and Social Psychology, 67, 949–971.

Najarian-Svajian, P. H. (1966). The idea of immanent justice among Lebanese children and adults. Journal of Genetic Psychology, 109, 57–66.
Nurock, V., Jacob, P., Margules, S., & Dupoux, E. (2008). A precursor of moral judgments in infants. Manuscript in preparation.
Olson, K. R., Banaji, M. R., Dweck, C. S., & Spelke, E. S. (2006). Children’s evaluations of lucky versus unlucky people and their social groups. Psychological Science, 17, 845–846.
Olson, K. R., Heberlein, A., Kensinger, E., Spelke, E. S., Dweck, C. S., & Banaji, M. R. (2008). Preferring the lucky to the unlucky, but not wanting to: An investigation of the role of affective associations in the preference for the lucky. Manuscript in preparation.
Percival, P., & Haviland, J. M. (1978). Consistency and retribution in children’s immanent justice decisions. Developmental Psychology, 14, 132–136.
Piaget, J. (1965). The moral judgment of the child (M. Gabain, Trans.). New York: Free Press. (Original work published 1932)
Raman, L., & Winer, G. A. (2004). Evidence of more immanent justice responding in adults than children: A challenge to traditional developmental theories. British Journal of Developmental Psychology, 22, 255–274.
Santos, L. R., & Hauser, M. D. (1999). How monkeys see the eyes: Cotton-top tamarins’ reaction to changes in visual attention and action. Animal Cognition, 2, 131–139.
Schult, C. A., & Wellman, H. M. (1997). Explaining human movements and actions: Children’s understanding of the limits of psychological explanation. Cognition, 62, 291–324.
Shultz, T. R., & Wells, D. (1985). Judging the intentionality of action-outcomes. Developmental Psychology, 21, 83–89.
Shultz, T. R., Wells, D., & Sarda, M. (1980). Development of the ability to distinguish intended actions from mistakes, reflexes, and passive movements. British Journal of Social and Clinical Psychology, 19, 301–310.
Sidanius, J., & Pratto, F. (1999). Social dominance: An intergroup theory of hierarchy and oppression. New York: Cambridge University Press.
Skowronski, J. J., Carlston, D. E., Mae, L., & Crawford, M. T. (1998). Spontaneous trait transference: Communicators take on the qualities they describe in others. Journal of Personality and Social Psychology, 74, 837–848.
Suls, J., & Kalle, R. J. (1979). Children’s moral judgments as a function of intention, damage, and an actor’s physical harm. Developmental Psychology, 15, 93–94.
Surber, C. F. (1982). Separable effects of motives, consequences, and presentation order on children’s moral judgments. Developmental Psychology, 18, 257–266.
United Nations. (2003). Rome Statute of the International Criminal Court. Retrieved February 22, 2008, from http://untreaty.un.org/cod/icc/index.html
Weisz, J. R. (1980). Developmental change in perceived control: Recognizing noncontingency in the laboratory and perceiving it in the world. Developmental Psychology, 16, 385–390.
Wellman, H. M., Cross, D., & Watson, J. (2001). Meta-analysis of theory-of-mind development: The truth about false belief. Child Development, 72, 655–684.
Woodward, A. L. (1998). Infants selectively encode the goal object of an actor’s reach. Cognition, 69, 1–34.
Yuill, N., & Perner, J. (1988). Intentionality and knowledge in children’s judgments of actor’s responsibility and recipient’s emotional reaction. Developmental Psychology, 24, 358–365.

Received June 21, 2007
Revision received December 15, 2007
Accepted December 16, 2007

Journal of Personality and Social Psychology 2008, Vol. 94, No. 5, 777–791

Copyright 2008 by the American Psychological Association 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.777

How to Heat Up From the Cold: Examining the Preconditions for (Unconscious) Mood Effects

Kirsten I. Ruys and Diederik A. Stapel
Tilburg University

What are the necessary preconditions to make people feel good or bad? In this research, the authors aimed to uncover the bare essentials of mood induction. Several induction techniques exist, and most of these techniques demand a relatively high amount of cognitive capacity. Moreover, to be effective, most techniques require conscious awareness. The authors proposed that the common and defining element in all effective mood induction techniques is the dominating salience of evaluative tone over descriptive meaning. This evaluative-tone hypothesis was tested in two paradigms in which the evaluative meaning of the “primed” concept was more salient than its descriptive meaning (i.e., when subliminal stimulus exposure was so short that mainly the evaluative meaning was activated [see D. A. Stapel, W. Koomen, & K. I. Ruys, 2002] and when the primed concepts were sufficiently extreme such that evaluative meaning always dominated descriptive meaning). Explicit and implicit mood measures showed that the activation of a dominating evaluative tone affected people’s mood states. Implications of these findings for theories on unconscious mood induction are discussed.

Keywords: subliminal perception, priming, mood, affect, need for cognition

Kirsten I. Ruys and Diederik A. Stapel, Tilburg Institute for Behavioral Economics Research (TIBER), Department of Social Psychology, Tilburg University, Tilburg, the Netherlands. This research was supported by a Pionier grant from the Netherlands Organization for Scientific Research awarded to Diederik A. Stapel. We thank Katie Lancaster for her insightful comments on an earlier version of this article. Correspondence concerning this article should be addressed to Kirsten I. Ruys, Tilburg Institute for Behavioral Economics Research (TIBER), Department of Social Psychology, Tilburg University, P.O. Box 90153, Tilburg 5000 LE, the Netherlands. E-mail: [email protected]

What makes people feel good or bad? What is needed to put someone in a positive or a negative mood? What does it take to influence people’s affective states? A quick look at the relevant literature does not really suggest a simple answer to these questions. Past research has shown that there is a myriad of successful techniques to induce positive or negative mood states in people. Recollecting pleasant or unpleasant memories, listening to uplifting or depressing music, reading reports of happy or sad events, watching funny or sad film clips, receiving positive or negative performance feedback, imagining wonderful or horrible life events: all these manipulations can be and have been used successfully to influence how positive or negative individuals feel (Fiedler, 2001; Forgas, 1992; Isen, 1987; Schwarz, 1990).

But what do these techniques have in common? What makes them successful mood induction methods? First, it should be noted that all these techniques demand a relatively high amount of cognitive capacity. Furthermore, all these techniques require conscious awareness to be effective. It is the conscious recalling, listening, reading, watching, or imagining that induces a positive or negative mood. But are these characteristics really necessary to elicit an affective state in an individual? Can one’s mood be affected only through the intensive and conscious experience (or recall) of real (or imagined) mood-eliciting stimuli? Are those the essential ingredients of the mood induction recipe?

We think not. We propose that the common and defining element in all effective mood induction techniques is a dominating salience of evaluative tone.¹ Thus, watching a fragment from “When Harry Met Sally” and looking at the local weather report forecasting sunny spells are similarly successful ways to induce a positive mood state, even though the descriptive content (i.e., falling in love vs. predicted hours of sun) of these mood inducers is very different. What these mood induction methods have in common is their strong, positive evaluative tone. What differs between these techniques is their specific descriptive content. Thus, the crucial ingredient for effective mood induction seems to be a strong evaluative meaning, rather than specific descriptive meaning.

One could argue in more technical terms that successful mood induction techniques cognitively activate (“prime”) positive or negative evaluative meaning more strongly than specific descriptive information. It is not necessarily the specific descriptive content of the memories one recalls, the movie one watches, or the music one listens to that makes one feel good or bad. After all, a memory, movie, or piece of music cannot affect one’s mood. Specific content does not matter: It is the global, diffuse, nonspecific, overall evaluative tone that is primed while one is recalling memories that produces mood effects.

Taking this notion to its extreme, one could argue that whenever the evaluative features of cognitively activated (“primed”) information strongly dominate the descriptive meaning, mood states are likely to be affected. When the evaluative meaning of primed information dominates the descriptive information, then mood states are likely to be affected. All that is needed to induce a mood state is the dominating activation of diffuse, nondescriptive, evaluative information. In short, all you need is evaluative tone. Thus, although intense, long, conscious exposure to or experience of mood-relevant stimuli may be a sufficient precondition for the production of mood effects, it is not a necessary precondition.

¹ See Hugenberg (2005) and Stapel and Koomen (2005) for a similar use of the term evaluative tone.

A Cold Recipe for Hot Mood Effects

A highly relevant question for defining the essential ingredients for producing mood effects is whether it is necessary to expose people to hot emotional material like music and movies or whether it is possible to elicit mood states by exposing people to cold semantic concepts. Some mood effects are easy to explain in terms of the activation of declarative knowledge (i.e., semantic priming). For instance, the retrieval of a positive experience can encourage the recollection of other positive autobiographical memories by semantic association. However, content-free mood effects such as the influence of Barber’s “Adagio for Strings” on people’s processing style are more difficult to explain without the assumption of a hot mood state.

To investigate the influence of hot states versus cold concepts, Innes-Ker and Niedenthal (2002) directly compared the effects of hot emotional states (“I feel happy” vs. “I feel sad”) versus cold emotion concepts (happy vs. sad) on subsequent social judgments. In one study, those researchers showed that priming cold emotion concepts increased the cognitive accessibility of congruent emotion concepts but had no impact on the emotional state of the participants. Another study showed that priming an emotional state (induced by music) influenced self-reported feelings and produced emotion-congruent judgments of an ambiguous target person whose feelings could either be interpreted in terms of happiness or sadness. The priming of emotional concepts had no such effects.

Innes-Ker and Niedenthal (2002) took their results to mean that the mere activation of cold concepts is not sufficient to produce hot emotional states in the perceiver and that the presence of a hot emotional state is necessary to produce emotion-congruent judgments (see also Maringer & Stapel, 2007). Such an interpretation of their results is definitely warranted as well as intuitively appealing.
It makes sense that mere priming of emotional concepts like “happy” and “uplifting” versus “sad” and “melancholy” is less likely to affect people’s mood states than listening to happy, uplifting music or sad, depressing music. The question remains, however, what exactly it is then that makes this hot mood induction procedure, in fact, “hot”? Applying the present evaluative-tone perspective, we propose that it is the global evaluative tone of information activated through so-called hot induction procedures versus activation of specific descriptive meaning of so-called cold induction procedures that makes the difference. We thus argue that movies or music are often more effectively used to induce mood effects because, with these techniques, global nonspecific evaluative information is more strongly activated than its concrete specific descriptive counterpart. Priming specific emotion concepts is probably less successful because it is likely that the descriptive meaning of the primed concepts will overshadow their evaluative tone. Thus, we argue, when evaluative as well as descriptive information is activated, mood effects are less likely than when merely or mainly evaluative information is activated. When evaluative as well as descriptive information is

activated, the evaluative tone is no longer diffuse because in that case it becomes bound to the descriptive information. Although one can use this all-you-need-is-evaluative-tone logic to explain the variable successfulness of hot versus cold priming techniques to produce actual changes in mood states, one should not take this to mean that it is impossible for cold priming techniques to produce mood effects. Rather, our logic suggests that it should be possible for cold concept priming to produce mood effects, given that the evaluative side of a stimulus or concept can be primed without activating its descriptive features. In other words, when exposure to cold concepts activates information that is cognitively unconstrained (i.e., without descriptive meaning, Clore & Colcombe, 2003), mood effects should be possible. Thus, when priming honest versus dishonest activates merely or mainly the cognitively unconstrained, evaluative meaning of these words (positive vs. negative) rather than their evaluative + descriptive meaning (friendly vs. aggressive), mood effects should occur. The question, then, is under what circumstances stimulus exposure results in a relatively strong activation of evaluative (rather than evaluative + descriptive) information. Affective primacy theory (Zajonc, 1980) provides a possible answer to this question because it holds that when people are exposed to a stimulus, affective reactions (i.e., reactions based on an evaluation of the stimulus) occur prior to nonaffective reactions (i.e., reactions based on descriptive stimulus features). This theory has received support from neurological research showing that independent systems exist for coarse evaluative processing and detailed perceptual processing (e.g., Adolphs, 2003; LeDoux, 1989; Zajonc, 2000). The primacy of affective processing was recently corroborated by researchers studying event-related brain potentials in response to emotional faces (Palermo & Rhodes, 2007).
This work shows that crude affective categorization can often occur rapidly, whereas fine-grained processes necessary for recognition of the identity of a face or for discrimination between basic emotional expressions typically need more time. This suggests, as Stapel et al. (2002) have recently shown, that even in the realm of subliminal perception, it is indeed possible to separate evaluation-based and description-based reactions to stimuli (see also Ruys & Stapel, in press–a, in press–b). Stapel and colleagues (2002) demonstrated that when a picture of a happy female face is primed subliminally, evaluative reactions (“positive”) are typically triggered earlier than descriptive reactions (“female”), but neither type of reaction needs awareness to occur. Similarly, when primed with the words honest versus dishonest, people pick up the evaluative meaning of these words (positive vs. negative) prior to their descriptive meaning (honest vs. dishonest; see also Bargh, Litt, Pratto, & Spielman, 1989; Stapel & Koomen, 2005). Thus, both evaluative and descriptive stimulus cues can be detected without awareness, but evaluative cues are often detected earlier (Stapel, 2003). The notion that descriptive meaning is picked up later than evaluative meaning has important consequences for the question of whether (unconsciously) primed cold concepts such as “happy,” “friendly,” and “honest” or “sad,” “aggressive,” and “dishonest” can affect mood states. As Zajonc’s (1980) affective primacy hypothesis suggests and recent work on unconscious affect priming (Ruys & Stapel, in press–a, in press–b; Stapel et al., 2002; Stapel & Koomen, 2005, 2006) demonstrates, when one is subliminally priming concepts (e.g., honest vs. dishonest), exposure

PRECONDITIONS FOR (UNCONSCIOUS) MOOD EFFECTS

time may determine what type of information is actually activated. At very short exposures, the evaluative meaning or tone of these concepts is activated (positive vs. negative). At longer exposures, however, both the descriptive and the evaluative meaning may become available (e.g., honest vs. dishonest). This implies that not only subliminally presented, emotionally charged primes (e.g., happy faces vs. sad faces) but also less emotional, colder primes (e.g., honest vs. dishonest) may affect mood judgments, given that these primes are flashed sufficiently quickly to activate mainly their evaluative tone (see Stapel et al., 2002). In sum then, cold concept priming may produce mood effects when the evaluative meaning of these concepts is more salient and more strongly activated than their descriptive meaning. Zajonc’s (1980) affective primacy theory suggests that one way to achieve this is by flashing concepts sufficiently quickly such that mainly their evaluative tone is activated (see Stapel & Koomen, 2005; Stapel et al., 2002). The increased activation of information that is merely or mainly evaluative and thus diffuse and cognitively unbounded may then spill over to people’s moods (Zajonc, 1980, 2000). In the words of Clore and Colcombe (2003), “[R]epeated suboptimal presentation of positive or negative stimuli activates evaluative meaning. Being objectless, it may become attached to whatever comes to mind next” (p. 343). Thus, objectless evaluative meaning may also become attached to a person’s mood. On a more general level, Bargh (1997, 2006) has noted and demonstrated repeatedly that the cognitive activation of semantic concepts is likely to simultaneously impact all kinds of psychological systems. Priming “aggressive” or “old” may influence perception, motivation, behavior, and evaluation. 
Interaction and exchange between the different psychological systems (Treisman, 1996) may further explain why the cognitive activation of evaluatively toned information may affect and color a person’s mood state.

The Present Studies

In the present studies, we tested our all-you-need-is-evaluative-tone hypothesis, using the same subliminal priming paradigm that we have used in prior work (Stapel & Koomen, 2005, 2006; Stapel et al., 2002). As these recent studies on the affective primacy hypothesis have suggested, nondescriptive, nonspecific evaluative stimulus features are especially likely to be picked up when stimulus exposure is very short. Our logic thus suggests, perhaps somewhat counterintuitively, that mood effects from cold concept priming are especially likely when priming occurs unconsciously. However, the aim of the present research was not only to show that subliminally presented information may produce mood effects but also to test some of the boundary conditions of these effects. In accordance with Clore and Colcombe (2003) and Zajonc (1980; see also Stapel et al., 2002), we hypothesized that when priming activates information that has a strong evaluative tone and is cognitively unconstrained, people’s moods may be affected. Because of this influence, even cold concept priming may induce mood effects, as long as the evaluative tone is sufficiently strong and salient. We investigated our evaluative-tone hypothesis with a variety of mood measures: We examined several indicators of people’s explicit mood judgments and information-processing styles. A multitude of mood measures is crucial to revealing the bare essentials of mood induction. However, central for the occurrence of a

positive or negative affective state is the conscious experience of positive or negative feelings (e.g., Clore, Storbeck, Robinson, & Centerbar, 2005). Thus, in Study 1a and Study 1b, we started by examining explicit mood judgments and asking people how positive or negative their mood was at the current moment. In Studies 2–4, we turned to more indirect measures of mood, in addition to our explicit mood measure. A well-known indirect consequence of moods is that they may influence people’s processing styles: When people feel good, they are more likely to rely on heuristic, easy, and global processing strategies, whereas when people feel bad, they tend to use more demanding, systematic, and local processing strategies (Fiedler, 1990, 1991; Forgas, 1995; Gasper & Clore, 2002). Thus, in the present research, we also focused on people’s processing styles as a measure of people’s mood states.

Study 1

Study 1a

We started the investigation of our all-you-need-is-evaluative-tone hypothesis with a study of the impact of subliminally primed trait concepts on explicit mood judgments. We expected that when prime exposures were short, explicit mood judgments might be affected by the evaluative tone of the primes. However, when prime exposures were long, explicit mood judgments might not be affected because the activation of descriptive meaning could overshadow the evaluative tone. We measured mood immediately after the priming episode by asking people how positive or negative their mood was at that moment.

Method

Participants, design, priming stimuli, and measures. Participants (N = 98) were undergraduates who took part in exchange for partial course credit. The participants were randomly assigned to the conditions of a 2 (prime exposure: long or short) × 2 (prime valence: positive or negative) between-participant design or to a control condition, in which participants were subliminally primed with neutral traits. Overview. Upon arrival, participants were shown into one of eight cubicles in the experimental room and seated in front of a computer. They were then told that they would be involved in a series of unrelated studies. First, participants performed a parafoveal vigilance task (modeled after a task used by Stapel et al., 2002) in which trait concepts were presented outside of participants’ awareness. Participants were told that very short flashes would appear on the screen at unpredictable places and times and that their task was to decide as quickly and accurately as possible whether the flash appeared on the left or right side of the screen. After having completed the vigilance task, participants were thanked for their participation and given the next task. The experimenter told participants, “A colleague of mine, from another university, would like you to complete this simple questionnaire.” Participants were then given a one-page one-question (“Rate how positive or negative your mood is at this moment”) questionnaire that measured mood; they responded using a scale ranging from 1 (negative) to 7 (positive). Next, participants received a funnel debriefing procedure, in which they were probed for awareness of the priming stimuli, awareness of the influence of the priming task

RUYS AND STAPEL

on later judgments, and general suspicion concerning the goal of the study (see Stapel et al., 2002). Finally, participants were thanked and debriefed. Priming. The priming task was modeled after Stapel et al.’s (2002) parafoveal priming task. Once participants were seated in front of their computer, the experimenter explained the vigilance task. Participants were seated so that the distance between their eyes and the computer screen was 80–100 cm. This distance ensured that the priming stimuli were presented outside of participants’ perceptual field. The experimenter then instructed participants to place their index fingers on two keys of the keyboard and to press the left key, labeled “L,” if a flash appeared on the left side of the screen and the right key, labeled “R,” if a flash appeared on the right side of the screen. A fixation point consisting of one X was presented continually in the center of the screen. Participants were given 10 practice trials to become familiar with the procedure and to ensure that they understood it. After answering any questions, the experimenter began the 60 experimental trials of the vigilance task, which took participants approximately 10 min to complete. Priming stimuli were trait concepts printed in black Times New Roman letters (12 point) on a white screen. The words that were flashed in the 10 practice trials and in 40 of the experimental trials were neutral words (e.g., “table,” “chair,” “tree”). In the remaining 20 experimental trials, in the positive priming conditions, the following words were each flashed five times: “confident,” “persistent,” “honest,” “pleasant.” In the negative priming conditions, the following words were each flashed five times: “arrogant,” “stubborn,” “dishonest,” “wrong.” The order in which these words were flashed was random. In the long conditions, words were flashed for 120 ms. In the short conditions, words were flashed for 40 ms.
In all conditions, these words were immediately followed by a 120-ms mask (for details, see Stapel et al., 2002). Awareness and suspicion. Previous subliminal priming studies have shown that the paradigm used here provides sufficient safeguards to prevent participants from becoming aware of the priming stimuli (see Chartrand & Bargh, 1996; Erdley & D’Agostino, 1988; Stapel et al., 2002). However, to ensure that participants were not aware of the priming stimuli, we used an extensive funnel debriefing procedure in which participants were asked increasingly specific questions about the study (see Stapel et al., 2002). Participants were asked what they thought the purpose of the study had been, whether they thought any of the tasks they had performed had been related, whether they thought their performance on one task might have affected their performance on a next task, whether anything about the study seemed strange or suspicious to them, and what they thought the content of the flashes had been during the task. If participants indicated knowledge that the flashes consisted of words, they were further probed for general or specific meaning of these words. Next, in several multiple-choice trials, participants were given the priming stimuli used in this experiment (the positive words or the negative words) and were told that at some of the trials, one of those words was flashed. Participants were then asked to choose (guess) which word was flashed. All participants reported that they had seen flashes. Although some reported seeing “words,” no participant could report on the contents of the primes. Furthermore, participants’ guesses of which of the two words they had seen did not exceed chance, nor did they differ between

conditions (Fs < 1). Finally, there were no participants who thought the vigilance and evaluation tasks were related. Thus, we can safely conclude that we were successful in presenting our priming stimuli outside of awareness and in not alerting participants to the actual relation between the vigilance and judgment tasks. This was also true for the other studies presented here, in which we used the same paradigm as in the present study.
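The trial structure of the vigilance task described in the Priming paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the software used in the study: the helper name `build_trials` is hypothetical, the neutral word list is limited to the three examples quoted in the text and is reused cyclically, the screen side of each flash is assumed to be drawn at random, and the exposure time is assumed to apply to every trial in a condition.

```python
import random

# Word lists quoted in the Priming paragraph. Only three neutral examples are
# given in the text, so they are reused cyclically here (an assumption).
NEUTRAL = ["table", "chair", "tree"]
PRIMES = {
    "positive": ["confident", "persistent", "honest", "pleasant"],
    "negative": ["arrogant", "stubborn", "dishonest", "wrong"],
}

# Timings reported in the text, in milliseconds.
EXPOSURE_MS = {"short": 40, "long": 120}
MASK_MS = 120


def build_trials(valence, exposure, seed=None):
    """Return 60 experimental trials for one participant (hypothetical helper)."""
    if valence not in PRIMES or exposure not in EXPOSURE_MS:
        raise ValueError("unknown valence or exposure condition")
    rng = random.Random(seed)
    # 4 valenced words x 5 repetitions = 20 prime trials.
    words = [w for w in PRIMES[valence] for _ in range(5)]
    # 40 neutral filler trials.
    words += [NEUTRAL[i % len(NEUTRAL)] for i in range(40)]
    rng.shuffle(words)  # random presentation order
    return [
        {
            "word": w,
            "side": rng.choice(["left", "right"]),  # assumed: side is random
            "exposure_ms": EXPOSURE_MS[exposure],
            "mask_ms": MASK_MS,
        }
        for w in words
    ]


trials = build_trials("positive", "short", seed=1)
print(len(trials), trials[0]["exposure_ms"], trials[0]["mask_ms"])
```

The sketch covers only the four valenced cells; the control condition used neutral traits that the text does not list, so it is left out.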

Results

A Prime Valence × Prime Exposure analysis of variance (ANOVA) on the mood measurement revealed the predicted interaction, F(1, 76) = 3.94, p < .05, η² = .05. The two main effects did not reach significance (Fs < 1). As can be seen in Table 1, the interaction effect reflects that, as expected, in the short exposure conditions, participants in the positive priming condition reported feeling more positive (M = 5.60, SD = 1.05) than participants in the negative priming condition (M = 4.90, SD = 1.02), F(1, 76) = 4.43, p < .05, η² = .06, whereas in the long exposure conditions, priming had no effect on experienced affect (F < 1). In those conditions, participants’ mood judgments were similar to those in the control condition (M = 5.28, SD = 1.13). In addition, to provide an overall test of the predicted pattern of results including the control condition, we performed a contrast analysis. On the basis of the predictions, we assigned weights of 1 to the cells that we expected not to differ (the long positive, long negative, and control conditions), a weight of 4 to the cell in which we expected the most positive mood (the short positive condition), and a weight of −7 to the cell in which we expected the most negative mood (the short negative condition). This a priori contrast reached significance, t(93) = 2.01, p < .05. We also performed two contrast analyses to directly compare the short exposure conditions with the control condition. Unfortunately, neither contrast (one comparing positive with control, t(93) = 0.91, p = .37, and one comparing negative with control, t(93) = 1.08, p = .29) reached significance. To test whether the conditions in which we did not expect mood changes to occur (the long exposure conditions) differed from the control condition, we performed two additional contrast analyses.
As expected, neither contrast (one comparing positive with control, t(93) = 0.48, p = .64, and one comparing negative with control, t(93) = 0.21, p = .84) was significant. The results of Study 1a showed that short subliminal exposures to valenced concepts are more likely to influence people’s mood states than long subliminal exposures to these concepts. This finding supported our hypothesis that a dominating positive or negative evaluative tone is sufficient to elicit a corresponding mood state. However, compared with our control condition, the obtained mood effects were not very strong. Therefore, we conducted Study 1b.

Table 1
Means and Standard Deviations for Mood Judgments as a Function of Prime Exposure and Prime Valence (Study 1a)

                     Prime exposure
                 Long            Short
Prime valence    M      SD       M      SD
Positive         5.10   1.17     5.60   1.05
Negative         5.35   1.04     4.90   1.02

Note. Scale range is from 1 to 7. Higher scores indicate more positive judgments. Mean mood judgment in control condition was 5.28 (SD = 1.13).
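The planned contrast used in Study 1a (weights of 1, 1, 1, 4, and −7 over the five cells) can be made concrete with a short Python sketch. It assumes the standard pooled-variance contrast test for independent groups; the function name `planned_contrast` is hypothetical, and because the per-cell sample sizes are not reported, the sketch only evaluates the contrast value from the reported means and leaves the t statistic to the general function.

```python
import math


def planned_contrast(means, sds, ns, weights):
    """Return (contrast estimate, t value, error df) for independent cells.

    Hypothetical helper using the usual pooled within-cell variance.
    """
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")
    df_error = sum(n - 1 for n in ns)
    # Pooled within-cell variance (mean square error).
    mse = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds)) / df_error
    estimate = sum(w * m for w, m in zip(weights, means))
    se = math.sqrt(mse * sum(w ** 2 / n for w, n in zip(weights, ns)))
    return estimate, estimate / se, df_error


# Cell order: long positive, long negative, control, short positive, short negative.
WEIGHTS = [1, 1, 1, 4, -7]
MEANS = [5.10, 5.35, 5.28, 5.60, 4.90]  # means reported for Study 1a

contrast_value = sum(w * m for w, m in zip(WEIGHTS, MEANS))
print(sum(WEIGHTS), round(contrast_value, 2))
```

The weights sum to zero, as a contrast requires, and the reported means give a contrast value of about 3.83. With five cells and N = 98 the error term has 93 degrees of freedom, which matches the reported t(93).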

Study 1b

This study replicated Study 1a with a more extensive explicit mood measure, the Brief Mood Inspection Scale (BMIS; Mayer & Gaschke, 1988). In addition, we investigated whether our mood effects could be explained in terms of response mode effects. Therefore, participants also rated themselves and a good friend on specific trait dimensions. We expected that in contrast to the mood judgments, these self and other trait ratings would be unaffected by the subliminally presented trait terms because the judgments would be relatively specific and descriptive (see Keltner, Locke, & Audrain, 1993).

Method

Participants (N = 65) were undergraduates who took part in exchange for partial course credit. The participants were randomly assigned to the conditions of a 2 (prime exposure: long or short) × 2 (prime valence: positive or negative) between-participant design or to a control condition, in which participants were subliminally primed with neutral traits. The procedure was similar to that used in Study 1a. However, different dependent measures were used. Immediately after the priming procedure, participants completed a self-report measure of emotional state that consisted of selected items from the BMIS. This scale listed nine feeling states: five positive states (“happy,” “content,” “peppy,” “lively,” “active”) and four negative states (“sad,” “gloomy,” “tired,” “drowsy”). Participants indicated how much they were feeling each state using a scale ranging from 1 (definitely do not feel) to 9 (definitely do feel). After participants had completed this mood inspection scale, we asked them to rate themselves and a “good (same sex) friend” on the rating dimensions of “friendly,” “smart,” “physically attractive,” and “athletic,” using a scale ranging from 1 (not at all applicable) to 7 (very applicable). The order of the mood inspection scale on the one hand and the self and good friend ratings on the other hand was counterbalanced to control for possible order effects.
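Scoring the nine mood items into a single composite involves reverse scoring the four negative items on the 1 to 9 scale and checking internal consistency with Cronbach's alpha. The following Python sketch illustrates both steps. The function names are hypothetical, the composite is assumed to be the mean of the nine scored items (the text does not say whether a mean or a sum was used), and alpha is computed with the textbook formula based on sample variances.

```python
from statistics import mean, variance

POSITIVE_ITEMS = ["happy", "content", "peppy", "lively", "active"]
NEGATIVE_ITEMS = ["sad", "gloomy", "tired", "drowsy"]
SCALE_MIN, SCALE_MAX = 1, 9


def reverse_score(rating):
    """Mirror a rating on the 1-9 scale (1 becomes 9, 9 becomes 1)."""
    return SCALE_MIN + SCALE_MAX - rating


def score_items(ratings):
    """Return the nine item scores for one participant, higher = more positive."""
    return [ratings[item] for item in POSITIVE_ITEMS] + [
        reverse_score(ratings[item]) for item in NEGATIVE_ITEMS
    ]


def composite(ratings):
    """Composite mood score, assumed here to be the mean of the scored items."""
    return mean(score_items(ratings))


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of per-participant item-score lists."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Made-up ratings for one hypothetical participant, for illustration only.
example = {item: 7 for item in POSITIVE_ITEMS}
example.update({item: 3 for item in NEGATIVE_ITEMS})
print(composite(example))
```

In this made-up example every negative item is rated 3, which reverse scores to 7, so the composite equals 7.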

Results and Discussion

ANOVAs showed no main or interaction effects of the “order of measures” variable on any of the dependent measures (Fs < 1). ANOVAs also showed that there were no main or interaction effects of the primed information on participants’ self-ratings or ratings of participants’ friends (Fs < 1), indicating that self-ratings or other-ratings on specific trait dimensions (friendly, smart, physically attractive, athletic) were not affected by subliminally primed trait concepts, independent of whether prime exposure time was relatively short or long. Reliability analyses of the nine items on the mood inspection scale were conducted (after reverse scoring the four negative items) to form a composite scale (Cronbach’s α = .82). A Prime Valence × Prime Exposure ANOVA on this measure revealed the predicted interaction, F(1, 45) = 4.21, p < .05, η² = .09. The two main effects did not reach significance (ps > .17). As can be seen in Table 2, the interaction effect reflects that, as expected, in the short exposure conditions, participants in the positive condition reported feeling more positive (M = 7.00, SD = 0.74) than participants in the negative condition (M = 6.15, SD = 0.98), F(1, 47) = 6.18, p < .05, η² = .12, whereas in the long exposure conditions, priming had no effect on experienced affect (F < 1). In those conditions, participants’ mood judgments were similar to those in the control condition (M = 6.61, SD = 1.01). We performed a contrast analysis to provide an overall test of the predicted pattern of results. Using the same weights as in Study 1a, we found that the a priori contrast was significant, t(60) = 2.13, p < .05. In addition, we performed two contrast analyses to directly compare the short exposure conditions to the control condition. However, neither contrast (one comparing positive with control, t(60) = 1.11, p = .27, and one comparing negative with control, t(60) = 1.36, p = .18) reached significance. We also conducted two additional contrasts, testing the control condition against the long exposure conditions. As predicted, these contrasts (one comparing positive with control, t(60) = 0.56, p = .58, and one comparing negative with control, t(60) = 0.07, p = .94) were not significant. Together, the findings of Study 1a and Study 1b support the idea that a dominating positive or negative evaluative tone may elicit a corresponding mood state. Specifically, subliminal exposure to valenced (trait) information is most likely to affect mood judgments when prime exposures are sufficiently short to activate evaluative reactions that have no specific descriptive content. Thus, these two studies suggest that not all subliminal priming effects are created equal.
They show that sometimes longer primes have less impact. The longer one is exposed to evaluatively toned trait information, the less likely it is that such traits may affect one’s mood. The results of Study 1b suggest that such carryover effects are most likely to occur on relatively general and evaluative mood judgments but not on relatively specific descriptive self-judgments and other-person judgments. This suggests that subliminally activated affective information is most likely to spill over into conscious judgment when the target of this judgment is evaluatively ambiguous (see also Stapel et al., 2002). In sum, Study 1a and Study 1b provide the first evidence for our all-you-need-is-evaluative-tone hypothesis. We showed that evaluative tone was critical in influencing people’s explicit mood judgments. However, we needed additional evidence for several reasons: First, the mood effects were not very strong (compared with control conditions) in either study. Second, to increase the generalizability of our hypothesis, we also needed to demonstrate these effects on indirect mood measures. Third, our mood effects could be explained in terms of semantic priming. The exposure to positive (or negative) concepts during the priming episode could have activated other related positive (or negative) concepts in memory, for instance, concepts representing positive (or negative) affective states. This might have increased the tendency for participants to agree with experiencing these positive (or negative) states. To address these three issues, we performed Study 2.

Table 2
Means and Standard Deviations of Scores for Items on Brief Mood Inspection Scale as a Function of Prime Exposure and Prime Valence (Study 1b)

                     Prime exposure
                 Long            Short
Prime valence    M      SD       M      SD
Positive         6.42   0.90     7.00   0.74
Negative         6.58   0.79     6.15   0.98

Note. Scale range is from 1 to 9. Higher scores indicate more positive judgments. Mean mood judgment in control condition was 6.61 (SD = 1.01).
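The effect sizes reported alongside the F tests in Studies 1a and 1b can be related to the F values and their degrees of freedom. The sketch below assumes that the reported η² values are partial eta squared, which the text does not state explicitly; under that assumption the standard conversion from F reproduces the reported values after rounding. The function name `partial_eta_squared` is hypothetical.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# (label, F, df effect, df error, eta squared as reported in the text)
REPORTED = [
    ("Study 1a interaction", 3.94, 1, 76, 0.05),
    ("Study 1a short-exposure effect", 4.43, 1, 76, 0.06),
    ("Study 1b interaction", 4.21, 1, 45, 0.09),
    ("Study 1b short-exposure effect", 6.18, 1, 47, 0.12),
]

for label, f_value, df_effect, df_error, reported in REPORTED:
    computed = round(partial_eta_squared(f_value, df_effect, df_error), 2)
    print(f"{label}: computed {computed:.2f}, reported {reported:.2f}")
```

The same function can be used to check whether any other reported F test, its degrees of freedom, and its effect size are mutually consistent.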

Study 2

An interesting consequence of moods is their impact on people’s processing styles. Several studies have shown that people who feel good tend to rely on heuristic, easy, and global processing strategies, whereas people who feel bad are more likely to use demanding, systematic, and local processing strategies (Fiedler, 1990, 1991; Forgas, 1995; Gasper & Clore, 2002). Thus, for example, people dining out in a restaurant tend to choose the “surprise of the chef” when they are in a good mood but tend to scrutinize the menu and analyze the ingredients of each course in detail before making a choice when they are in a bad mood. Fiedler (2001) has aptly explained the differences between the two processing styles in his adaptive learning viewpoint, which assumes that the processing styles associated with positive and negative moods have respectively evolved in appetitive and aversive situations to cope most effectively with the demands of the situation.² A friendly, appetitive situation demands exploratory and knowledge-driven processing. A threatening, aversive situation requires careful and more systematic processing strategies. Although semantic priming could serve as an alternative explanation for the results in Study 1, semantic priming cannot easily explain mood effects on people’s information-processing styles. For this reason, we mainly focused on mood effects related to information processing to advance our evaluative-tone hypothesis. In Study 2, we explored and further tested the effects of mood on people’s need for cognition.
Previous research has assumed that “individuals high in need for cognition naturally tend to seek, acquire, think about, and reflect back on information to make sense of stimuli, relationships, and events in their world; individuals low in need for cognition, in contrast, are more likely to rely on others (e.g., experts), cognitive heuristics, or social comparisons to provide this structure” (Cacioppo, Petty, Feinstein, & Jarvis, 1996, p. 243). Although need for cognition has often been shown to be a stable personality trait that one can measure reliably with a personality scale (Cacioppo & Petty, 1981; Cacioppo et al., 1996), there are indications that need for cognition is also (at least to a certain extent) context dependent. The features that characterize how people process information as a function of their need for cognition remind us of the two processing styles between which, according to most dual-process models, people alternate depending on the situation: a systematic, effortful information-processing style and a heuristic, easy information-processing style (e.g., Chaiken, 1980; Petty & Cacioppo, 1986). In a similar vein, we

expected that people’s need for cognition could depend on situational factors. We proposed that analogous to the effect of mood on people’s information-processing styles (i.e., more systematic information processing in a negative mood and more heuristic processing in a positive mood), people may experience a higher need for cognition when they feel bad than when they feel good. Thus, need for cognition may indirectly reflect people’s mood states. In Study 2, we used need for cognition to show that a dominating evaluative tone is also essential in evoking effects on indirect mood measures. Before turning to this main objective, we conducted a pretest to demonstrate that a well-known conscious mood induction technique (recalling positive or negative life events) indeed affects individuals’ need for cognition. The aim of this pretest was thus to show that people’s mood states affect their need for cognition.

Pretest

Participants, Design, Mood Induction, and Measures

Participants (N = 38) were undergraduates who took part for partial course credit. The participants were randomly assigned to the conditions of a single-factor (mood induction: positive, negative, or neutral) between-participant design. Participants received a booklet consisting of the mood induction, the need-for-cognition items, and a mood question. We asked them, dependent on the mood induction condition, to remember a positive, a negative, or a neutral event from the past and to try to relive this experience (see, for instance, Bless et al., 1996; Fiedler & Stroehm, 1986, who have successfully used a similar procedure to induce mood). Then, participants completed the following four need-for-cognition items (selected from the Need for Cognition Scale, Cacioppo & Petty, 1982), using a scale ranging from 1 (completely disagree) to 5 (completely agree): “The idea of relying on thought to make my way to the top appeals to me”; “I would prefer a task that is intellectual, difficult, and important to one that is somewhat important but does not require much thought”; “Thinking is not my idea of fun” (reverse coded); and “Learning new ways to think doesn’t excite me very much” (reverse coded). We selected four representative items because of time concerns. Next, participants received the same mood question as in Study 1a.

Results

A mood induction ANOVA performed on the mood question demonstrated that our mood induction method was indeed successful, F(2, 57) = 6.33, p < .01, η² = .27. A further contrast analysis showed that participants who remembered and relived a positive life event indicated that they felt more positive (M = 5.00, SD = 0.89) than participants who remembered and relived a negative event (M = 3.46, SD = 1.20), with the mood rating of the participants in the neutral condition lying between these two extremes (M = 4.43, SD = 1.09), t(35) = 3.53, p < .05. Reliability analyses of the four need-for-cognition items were conducted to form a composite scale (Cronbach’s α = .87). A mood induction ANOVA on this measure revealed the predicted main effect, F(2, 57) = 4.62, p < .05, η² = .21. A contrast analysis showed that, as expected, in the negative mood condition, participants reported a higher need for cognition (M = 4.23, SD = 0.97) than participants from the positive mood condition (M = 3.21, SD = 0.99), with the neutral condition lying between these two extremes (M = 3.59, SD = 0.53), t(35) = 3.02, p < .05. Further analyses showed that the partial correlation (controlling for experimental condition) for these two dependent measures was high, r = .80 (p < .01).

² Note that although the typical finding is that people in a good mood rely more on heuristic, global processing strategies and people in a bad mood rely more on systematic, detailed ways of processing, researchers also have reported more complex findings. For example, mood can have motivational consequences because of mood management pressures during positive, negative, and neutral mood states (Gervey, Igou, & Trope, 2005; Isen, 1987; Wegener & Petty, 1994).

Method

Participants (N = 60) were undergraduates who took part in the study for partial course credit. The participants were randomly assigned to the conditions of a 2 (prime exposure: long or short) × 2 (prime valence: positive or negative) between-participant design or to a control condition, in which participants were subliminally primed with neutral traits. The procedure was similar to that used in Study 1a. However, different dependent measures were used. Immediately after the priming procedure, participants completed the four need-for-cognition items that were used in the pretest. We again included a mood question to determine whether our mood induction worked.

Results and Discussion

A Prime Valence × Prime Exposure ANOVA on the need-for-cognition composite scale (Cronbach’s α = .85) revealed the predicted interaction, F(1, 44) = 4.80, p < .05, η² = .10, and a main effect of prime valence, F(1, 44) = 6.97, p < .05, η² = .14. There was no effect of prime exposure (F < 1). As can be seen in Table 3, the interaction effect reflects that, as expected, in the short exposure conditions, participants in the positive condition reported a lower need for cognition (M = 3.13, SD = 0.49) than participants in the negative condition (M = 4.27, SD = 1.07), F(1, 46) = 12.04, p < .01, η² = .21, whereas in the long exposure conditions, priming had no effect on reported need for cognition (F < 1). In those conditions, participants’ need-for-cognition judgments were similar (Fs < 1) to those in the control condition (M = 3.63, SD = 0.53). To provide an overall test of the predicted pattern of results, we additionally performed a contrast analysis. On the basis of the predictions, we assigned weights of 1 to the cells that we expected not to differ (the long positive, long negative, and control conditions), a weight of 4 to the cell in which we expected the highest need for cognition (the short negative condition), and a weight of −7 to the cell in which we expected the lowest need for cognition (the short positive condition). This a priori contrast was highly significant, t(55) = 3.58, p < .05. In addition, we performed two contrast analyses to directly compare the short exposure conditions with the control condition. The contrast comparing positive with control reached significance, t(55) = 2.40, p < .05, whereas the contrast comparing negative with control was marginal, t(55) = 1.87, p = .08. We also tested whether the long exposure conditions differed from the control condition. As predicted, these two contrasts, one comparing positive with control, t(55) = 0.53, p = .60, and one comparing negative with control, t(55) = 0.18, p = .86, were not significant.

Table 3
Means and Standard Deviations of Scores for Items on Need for Cognition Scale and for Mood Judgments as a Function of Prime Exposure and Prime Valence (Study 2)

                                    Prime exposure
                               Long              Short
Prime valence                M      SD         M      SD
Need for cognition score
  Positive                  3.46   0.59       3.13   0.49
  Negative                  3.57   1.00       4.27   1.07
Mood judgment
  Positive                  4.85   0.90       5.42   0.79
  Negative                  4.73   1.01       3.83   0.94

Note. The Need for Cognition Scale ranges from 1 to 5. Higher scores indicate a higher need for cognition. Mean need for cognition score in control condition was 3.63 (SD = 0.53). The mood judgment scale ranges from 1 to 7. Higher scores indicate more positive judgments. Mean mood judgment in control condition was 4.67 (SD = 0.78).

A Prime Valence × Prime Exposure ANOVA on the mood judgments also revealed the predicted interaction, F(1, 44) = 7.74, p < .01, η² = .15, and a main effect of prime valence, F(1, 44) = 10.45, p < .01, η² = .19. There was no effect of prime exposure (F < 1). The means and standard deviations are depicted in Table 3. Equivalent to the pattern of results for need for cognition, the interaction effect indicated that for the short exposure conditions, participants in the positive condition reported feeling more positive (M = 5.42, SD = 0.79) than participants in the negative condition (M = 3.83, SD = 0.94), F(1, 46) = 18.76, p < .01, η² = .29, whereas for the long exposure conditions, priming had no effect on experienced affect (F < 1). Again in those conditions, participants’ mood judgments were similar (Fs < 1) to those in the control condition (M = 4.67, SD = 0.78). To provide an overall test of the predicted pattern of results, we performed a contrast analysis (with the same weights as in Study 1a) that was significant, t(55) = 4.22, p < .05. Contrast analyses comparing the control condition with the short positive condition, t(55) = 2.08, p < .05, and comparing the control condition with the short negative condition, t(55) = 2.31, p < .05, were also significant. Additional contrast analyses showed that in line with our hypotheses, neither the long positive condition, t(55) = 0.51, p = .62, nor the long negative condition, t(55) = 0.16, p < .87, differed from the control condition. Further analyses showed that the partial correlation (controlling for experimental condition) for our two dependent measures was high, r = .62 (p < .01).
This allowed us to test the robustness of our findings by computing a composite scale of the z-transformed need-for-cognition scores and the recoded and then z-transformed mood judgments. We performed the same contrast analyses as described earlier on our composite scale. A contrast analysis testing the expected overall pattern of results was highly significant, t(55) = 4.24, p < .05. Contrast analyses performed to compare the short positive conditions with control and the short negative conditions with control were also significant, t(55) = 2.17, p < .05, and t(55) = 2.58, p < .05, respectively. Again as expected, additional contrast analyses showed that neither the long positive condition, t(55) = 0.62, p = .54, nor the long negative condition, t(55) = 0.20, p < .84, differed from the control condition.

The results thus show that when prime exposures are short, the evaluative tone of the primes affects people’s reported need for cognition, whereas when prime exposures are long, the evaluative tone of the primes does not affect people’s reported need for cognition. The same pattern of results was obtained on the mood question. These findings indicate that need for cognition is a stable personality trait that also depends on situational factors like mood. More important for the present purposes, people may experience a higher need for cognition when they feel bad than when they feel good. This finding is equivalent to mood effects on people’s information-processing styles. In sum, Study 2 provides support for our evaluative-tone hypothesis on an indirect mood measure.

Thus far, we have demonstrated mood effects when prime exposure is relatively short. The results of Studies 1a, 1b, and 2 strongly suggest that under short exposure conditions, primarily the evaluative tone of the primed information is activated. However, the evaluative-tone logic also suggests another possible method of testing our hypothesis. An alternate way to manipulate the dominance of evaluative tone would be to use primed concepts that are sufficiently extreme in evaluative tone to dominate descriptive meaning, independent of whether prime exposure is extremely or moderately short. Thus, to expand our evidence, we tested this implication of our line of reasoning in Study 3.
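The planned contrasts reported above can be illustrated with a short computation. The following Python sketch computes a contrast t statistic from cell means, standard deviations, and cell sizes, using the pooled within-cell variance as the error term. It is not the authors' analysis script. Only the weights (1, 1, 1, 4, −7) come from the text; the function name `contrast_t`, the equal cell sizes of 12 (inferred from N = 60 and five cells, not reported), and the cell statistics in the usage lines are assumptions made for illustration.

```python
import math


def contrast_t(means, sds, ns, weights):
    """Return (t, df_error) for a planned contrast across independent cells.

    The error term is the pooled within-cell variance (mean square within),
    which is the usual choice for contrasts in a one-way layout.
    """
    if not (len(means) == len(sds) == len(ns) == len(weights)):
        raise ValueError("all inputs must have one entry per cell")
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")

    df_error = sum(ns) - len(ns)
    # Pool the cell variances, weighting each by its degrees of freedom.
    ms_within = sum((n - 1) * sd ** 2 for n, sd in zip(ns, sds)) / df_error
    # Contrast estimate: weighted sum of the cell means.
    estimate = sum(w * m for w, m in zip(weights, means))
    # Standard error of that weighted sum.
    std_error = math.sqrt(ms_within * sum(w ** 2 / n for w, n in zip(weights, ns)))
    return estimate / std_error, df_error


if __name__ == "__main__":
    # Cell order: long positive, long negative, control, short negative, short positive.
    weights = [1, 1, 1, 4, -7]           # weights as described in the text
    means = [3.5, 3.5, 3.5, 4.0, 3.0]    # hypothetical cell means, not the reported data
    sds = [1.0, 1.0, 1.0, 1.0, 1.0]      # hypothetical cell SDs
    ns = [12, 12, 12, 12, 12]            # assumed equal cell sizes (60 / 5)
    t_value, df = contrast_t(means, sds, ns, weights)
    print(f"t({df}) = {t_value:.2f}")
```

With five cells of 12 participants the error degrees of freedom are 55, which agrees with the t(55) reported for the Study 2 contrasts. The function rejects weights that do not sum to zero, because such weights do not define a contrast.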

Study 3

In this study, we demonstrated in a different way that evaluative tone is essential to induce a mood state. What is crucial in our all-you-need-is-evaluative-tone hypothesis is that evaluative cues of a concept are more salient, or more strongly activated, than are its descriptive cues. One way to achieve this state is to present the information for a sufficiently short duration that primarily the evaluative features are activated (Stapel et al., 2002), which we did in Studies 1a, 1b, and 2. Another way to achieve this is to use concepts that are extreme in their valence and thus strong in their (un)desirability (see Stapel & Koomen, 2000). Such concepts have a strong evaluative meaning that could dominate their specific descriptive meaning. Extremely valenced concepts, such as “wonderful” versus “horrific,” for example, have a strong evaluative tone (i.e., positive vs. negative), whereas moderately valenced concepts, such as “pleasant” versus “unpleasant,” although perhaps similarly specific in their descriptive meaning (see Hampson, John, & Goldberg, 1986; Stapel & Koomen, 2000), have a weaker evaluative tone. We therefore expected that when extremely valenced concepts were primed (i.e., with short and long exposures), their evaluative tone would be more likely to dominate their descriptive meaning and thus yield mood effects than when moderately valenced concepts were primed. In the present study, we primed participants with trait concepts of extreme valence or with trait concepts of moderate valence, both under relatively long but subliminal exposure. Prime exposure duration was the same in all conditions and was similar to the long exposure conditions of Study 1a, 1b, and 2. In contrast to these previous studies, we expected that the relatively longer exposures to trait concepts might affect people’s moods but only when the evaluative tone was sufficiently dominant. Therefore, this priming should only work with extreme traits.

Method

Participants (N = 57) were undergraduates who took part for partial course credit. The participants were randomly assigned to the conditions of a 2 (prime extremity: moderate or extreme) × 2 (prime valence: positive or negative) between-participant design or to a control condition in which participants were subliminally primed with neutral traits. The priming stimuli presented in the experimental trials were trait concepts with a moderate or extreme valence, taken from (and pretested by) Stapel and Koomen (2000). In the moderately positive priming condition, the following words were each flashed on the computer screen five times: “thrifty,” “reasonable,” “agreeable,” “pleasant.” In the extremely positive priming condition, the following words were each flashed five times: “wonderful,” “sweet,” “good,” “positive.” In the moderately negative priming condition, the following words were each flashed five times: “stingy,” “weak,” “plain,” “unpleasant.” In the extremely negative priming condition, the following words were each flashed five times: “horrific,” “cruel,” “bad,” “negative.” The procedure was similar to that used in Study 1a, except that the time participants were exposed to the primes was always long (120 ms). The dependent measures were similar to those in Study 2: Participants reported their need for cognition and indicated to what extent they felt positive or negative.

Results

A Prime Valence × Prime Extremity ANOVA on the need-for-cognition composite scale (Cronbach’s α = .83) revealed the predicted interaction, F(1, 41) = 5.28, p < .05, η² = .11, and a main effect of prime valence, F(1, 41) = 6.62, p < .05, η² = .14. There was no effect of extremity (F < 1). As can be seen in Table 4, the interaction effect reflects that, as expected, when the prime words were extreme, participants in the positive condition reported a lower need for cognition (M = 3.07, SD = 0.56) than participants in the negative condition (M = 4.27, SD = 1.07), F(1, 43) = 12.66, p < .01, η² = .23, whereas when the prime words were moderate, priming had no effect on reported need for cognition (F < 1). In those conditions, participants’ need-for-cognition judgments were similar (Fs < 1) to those in the control condition (M = 3.56, SD = 0.59). To provide an overall test of the predicted pattern of results, we also performed a contrast analysis. On the basis of the predictions, we assigned weights of 1 to the cells that we expected not to differ (the moderate positive, moderate negative, and control conditions), a weight of 4 to the cell in which we expected the highest need for cognition (the extreme negative condition), and a weight of −7 to the cell in which we expected the lowest need for cognition (the extreme positive condition). This a priori contrast reached significance, t(52) = 3.76, p < .05. Contrast analyses performed to compare the extreme positive condition with control and the extreme negative condition with control were respectively significant, t(52) = 2.07, p < .05, and marginal, t(52) = 2.01, p = .06. As expected, additional contrast analyses testing the moderate positive condition against control and the moderate negative condition against control were not significant, t(52) = 0.37, p = .72, and t(52) = 0.57, p = .57, respectively.

Table 4
Means and Standard Deviations of Scores for Items on Need for Cognition Scale and for Mood Judgments as a Function of Prime Extremity and Prime Valence (Study 3)

                                    Prime extremity
                             Moderate            Extreme
Prime valence                M      SD         M      SD
Need for cognition score
  Positive                  3.86   0.64       3.07   0.56
  Negative                  3.75   0.90       4.27   1.07
Mood judgment
  Positive                  4.55   0.82       5.27   1.01
  Negative                  4.64   0.92       4.08   1.17

Note. The Need for Cognition Scale ranges from 1 to 5. Higher scores indicate a higher need for cognition. Mean need for cognition score in control condition was 3.56 (SD = 0.59). The mood judgment scale ranges from 1 to 7. Higher scores indicate more positive judgments. Mean mood judgment in control condition was 4.58 (SD = 0.79).

A Prime Valence × Prime Extremity ANOVA on the mood judgments also revealed the predicted interaction, F(1, 41) = 4.68, p < .05, η² = .10, and a marginal effect of prime valence, F(1, 41) = 3.44, p = .07, η² = .08. There was no effect of prime extremity (F < 1). The means and standard deviations are depicted in Table 4. Equivalent to the pattern of results for need for cognition, the interaction effect indicates that when the prime words were extreme, participants in the positive condition reported feeling more positive (M = 5.27, SD = 1.01) than participants in the negative condition (M = 4.08, SD = 1.17), F(1, 43) = 8.57, p < .01, η² = .17, whereas when the prime words were moderate, priming had no effect on experienced affect (F < 1). Again in those conditions, participants’ mood judgments were similar (Fs < 1) to those in the control condition (M = 4.58, SD = 0.79). We then performed a contrast analysis to provide an overall test of the predicted pattern of results. On the basis of the predictions, we assigned weights of 1 to the cells that we expected not to differ (the moderate positive, moderate negative, and control conditions), a weight of 4 to the cell in which we expected the most positive mood (the extreme positive condition), and a weight of −7 to the cell in which we expected the most negative mood (the extreme negative condition). This a priori contrast reached significance, t(52) = 2.73, p < .05. Contrast analyses performed to compare the extreme positive condition with control and the extreme negative condition with control were respectively marginal, t(52) = 1.73, p < .09, and nonsignificant, t(52) = 1.28, p = .21. We then conducted two additional contrasts, comparing the control condition with conditions in which we did not expect mood changes to occur. As expected, neither analysis (one contrasting moderate positive with control and one contrasting moderate negative with control) was significant, t(52) = 0.10, p = .93, and t(52) = 0.13, p = .90, respectively.

Further analyses showed that the partial correlation (controlling for experimental condition) for our two dependent measures (i.e., need for cognition and mood judgment) was high, r = .62 (p < .01). This allowed us to test the robustness of our findings by computing a composite scale of the z-transformed need-for-cognition scores and the recoded and then z-transformed mood judgments. Next, we performed the same contrast analyses as described earlier on our composite scale. The contrast analysis testing the expected overall pattern of results was highly significant, t(52) = 3.83, p < .05. Contrast analyses performed to compare the extreme positive conditions with control and the extreme negative conditions with control were also significant, t(52) = 1.97, p < .05, and t(52) = 2.11, p < .05, respectively. As expected, two additional contrasts comparing moderate positive with control and comparing moderate negative with control were not significant, t(52) = 0.28, p = .78, and t(52) = 0.26, p = .80.

The results of this study are important for two reasons: First, the findings replicate the effect of mood on people’s reported need for cognition. Second, the results support our evaluative-tone hypothesis because mood was only affected when the evaluative meaning of the primed trait concepts dominated their descriptive meaning. Thus, participants who were primed with extreme positive trait concepts reported a lower need for cognition and indicated they felt better than participants who were primed with extreme negative trait concepts. As expected, reported need for cognition and mood were not affected in participants who were primed with moderate trait concepts.

Together, the first three studies demonstrate that a dominating evaluative tone is essential for influencing people’s mood states. We have provided evidence on explicit mood measures and on people’s motivations (i.e., need for cognition). To complete the picture, we set as our final goal providing support for our all-you-need-is-evaluative-tone hypothesis using a different processing-style measure than need for cognition. It was in this spirit that we conducted Study 4.
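The composite scale used in the robustness checks above can be sketched in a few lines of Python. The text states only that need-for-cognition scores were z-transformed and that mood judgments were recoded and then z-transformed. The reverse-scoring rule (scale minimum plus scale maximum minus the score), the use of the sample standard deviation, the averaging of the two z-scores, and the names `z_scores` and `composite_scale` are all assumptions of this sketch rather than details given by the authors.

```python
from statistics import mean, stdev


def z_scores(values):
    """Standardize values to mean 0 and (sample) standard deviation 1."""
    center, spread = mean(values), stdev(values)
    return [(v - center) / spread for v in values]


def composite_scale(nfc_scores, mood_scores, mood_min=1, mood_max=7):
    """Average of z-transformed NFC and reverse-scored, z-transformed mood.

    Assumption: mood is reverse-scored so that high values mean a more
    negative mood, which points in the same direction as a high need
    for cognition in the reported pattern of results.
    """
    if len(nfc_scores) != len(mood_scores):
        raise ValueError("each participant needs one score on both measures")
    recoded_mood = [mood_min + mood_max - m for m in mood_scores]
    z_nfc = z_scores(nfc_scores)
    z_mood = z_scores(recoded_mood)
    return [(a + b) / 2 for a, b in zip(z_nfc, z_mood)]


if __name__ == "__main__":
    # Hypothetical scores for three participants, for illustration only.
    print(composite_scale([1, 2, 3], [7, 4, 1]))
```

Standardizing before combining puts the 1 to 5 need-for-cognition scale and the 1 to 7 mood scale on a common metric, so that neither measure dominates the composite merely because of its range.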

Study 4

As mentioned previously, moods may influence people’s processing styles. People are more likely to rely on heuristic, easy, and global processing strategies when they feel good and tend to use more demanding, systematic, and local processing strategies when they feel bad. Bless, Bohner, Schwarz, and Strack (1990) showed the effects of mood on processing style in the realm of persuasion (see also Petty, DeSteno, & Rucker, 2001). The persuasion literature informs us that the impact of argument strength may differ depending on people’s processing styles. Strong arguments are more convincing than weak arguments for people who use systematic processing strategies, whereas strong and weak arguments are equally convincing for people who use heuristic processing strategies. Building on these findings, Bless and colleagues (1990) showed that for participants who felt good, argument strength was not important in convincing them. For people in a negative mood, however, argument strength was important: People who felt bad were more influenced by strong rather than weak arguments.

The aim of the current research was to provide support for our evaluative-tone hypothesis on a processing-style measure. We used the impact of argument quality on persuasion to show that a dominating positive evaluative tone leads to a good mood, whereas a dominating negative evaluative tone leads to a bad mood. Thus, because people in a positive mood process information heuristically, they are equally likely to be convinced by weak or strong arguments. People in a negative mood tend to process information systematically and therefore are more likely to be convinced by strong rather than weak arguments. We primed participants with positive or negative trait concepts using long or short subliminal exposures, similar to those used in Studies 1 and 2. After the priming episode, participants were presented with several strong and weak arguments in favor of an attitude object and were then asked to indicate their attitude.

Method

Participants (N = 108) were undergraduates who took part in exchange for partial course credit. The participants were randomly assigned to the conditions of a 2 (prime exposure: long or short) × 2 (prime valence: positive or negative) × 2 (argument strength: weak or strong) between-participant design or to a control condition, in which participants were subliminally primed with neutral traits and read strong and weak arguments. The procedure was similar to that used in Study 1a, but we relied on a different measure of processing style. Immediately after the priming procedure, participants read two arguments in favor of using English as the official language of the Dutch university at which this experiment was conducted. This would mean that all classes would be conducted in English rather than in Dutch. The arguments were pretested to be either strong or weak. The strong arguments were as follows: “If English is the official language, the university can attract more international students, which will increase its status and make it easier to improve facilities and teaching resources” and “The international employability of university graduates will increase if English is the official language.” The weak arguments were as follows: “Improving the English of the university’s students may be helpful to them on holidays” and “If English is the official language at our university, it will be easier to understand Anglo-Saxon movies and television series.” After reading the strong or weak arguments, participants were asked to indicate their attitude (1 = completely disagree, 7 = completely agree) toward the statement that it is “a good idea to make English the official language at our university and have all teaching in English.” Similar to the procedure in our previous studies, we included a mood question to assess the successfulness of our mood induction. Finally, to check the successfulness of our argument strength manipulation, we asked participants to indicate the strength of the arguments they had read on a 7-point scale ranging from 1 (completely disagree) to 7 (completely agree).

Results

First, a Prime Valence × Prime Exposure × Argument Strength ANOVA on the argument strength manipulation check revealed the predicted main effect of argument strength, F(1, 88) = 55.67, p < .01, η² = .39 (other Fs < 1), indicating that the strong arguments were indeed judged as stronger (M = 5.00, SD = 0.83) than the weak arguments (M = 3.80, SD = 0.79).

Next, a Prime Valence × Prime Exposure × Argument Strength ANOVA on the attitude judgment revealed a marginal predicted three-way interaction, F(1, 88) = 3.32, p = .07, η² = .04, a marginal Prime Valence × Argument Strength effect, F(1, 88) = 2.82, p = .10, η² = .04, a Prime Valence × Prime Exposure effect, F(1, 88) = 6.61, p < .05, η² = .07, a prime valence effect, F(1, 88) = 10.21, p < .01, η² = .10, and an argument strength effect, F(1, 88) = 21.71, p < .01, η² = .20. As can be seen in Table 5, the pattern of means behind these effects strongly supports our predictions. As expected, in the short exposure positive conditions, participants’ attitudes were similarly positive in the strong (M = 5.14, SD = 0.66) and weak (M = 4.86, SD = 0.86) conditions (F < 1), whereas in the short exposure negative conditions, participants’ attitudes were more positive in the strong condition (M = 4.75, SD = 0.87) than in the weak condition (M = 3.25, SD = 0.75), F(1, 94) = 15.39, p < .01, η² = .14. In other words, after short negative priming, argument strength mattered (suggesting a negative mood was induced), whereas it did not matter after short positive priming (suggesting that a positive mood was induced). In the long exposure conditions, prime valence had, as predicted, no (main or interaction) effect (Fs < 1). Here, strong arguments led to more positive attitudes (M = 4.78, SD = 0.87) than weak arguments (M = 4.05, SD = 0.90), F(1, 94) = 6.07, p < .05, η² = .06. Participants’ attitudes in the long exposure–weak arguments conditions were similar (F < 1) to participants’ attitudes in the control condition (M = 4.08, SD = 0.52).

However, to provide an overall test of the predicted pattern of results, we again performed a contrast analysis. On the basis of the predictions, we assigned weights of 1 to the cells that we expected to have a positive attitude (all strong argument conditions and the short weak positive condition), weights of −1 to the cells in which we expected a relatively negative attitude (the long weak conditions and the control condition), and a weight of −2 to the cell in which we expected the most negative attitude (the short weak negative condition). This a priori contrast was highly significant, t(99) = 6.87, p < .05.

Table 5
Means and Standard Deviations for Attitude as a Function of Prime Exposure, Argument Strength, and Prime Valence (Study 4)

                             Prime exposure
                        Long              Short
Prime valence         M      SD         M      SD
Strong arguments
  Positive           4.82   0.87       5.14   0.66
  Negative           4.70   0.95       4.75   0.87
Weak arguments
  Positive           4.08   0.90       4.86   0.86
  Negative           4.00   0.94       3.25   0.75

Note. Scale ranges from 1 to 7. Higher scores indicate a more positive attitude. Mean attitude judgment in the control condition was 4.08 (SD = 0.52).


A Prime Valence × Prime Exposure × Argument Strength ANOVA on the mood judgments revealed the predicted Prime Valence × Prime Exposure interaction, F(1, 88) = 19.06, p < .01, η² = .18, and a main effect of prime valence, F(1, 88) = 15.13, p < .01, η² = .15 (other Fs < 1). As can be seen in Table 6, the interaction effect reflects that, as expected (as in the other experiments), in the short exposure conditions, participants in the positive condition reported feeling more positive (M = 5.57, SD = 0.69) than participants in the negative condition (M = 4.13, SD = 0.90), F(1, 94) = 39.68, p < .01, η² = .30, whereas in the long exposure conditions, priming had no effect on experienced affect (F < 1) and participants’ moods were similar (F < 1) to those in the control condition (M = 4.75, SD = 0.87). Again, we performed a contrast analysis to provide an overall test of the predicted pattern of results. Using the same weights as in Study 1a, we found that the a priori contrast was significant, t(103) = 5.40, p < .05. Contrast analyses performed to compare the short positive conditions with control and the short negative conditions with control were also significant, t(103) = 2.83, p < .05, and t(103) = 2.10, p < .05, respectively. However, as expected, contrast analyses performed to compare the long positive conditions with control and the long negative conditions with control were not significant, t(103) = 0.28, p = .78, and t(103) = 0, p = 1, respectively.

In a nutshell, this study provides additional support for our all-you-need-is-evaluative-tone hypothesis. The results show that in the long prime exposure conditions, participants were influenced more by the strong than by the weak arguments. Their moods were unaffected. More important, in the short prime exposure conditions, participants primed with positive trait concepts were influenced by both strong and weak arguments, whereas participants primed with negative trait concepts were influenced only by strong arguments. Those participants in the short exposure conditions also reported the expected mood states. Thus, a dominating evaluative tone influenced people’s moods and, therefore, people’s processing styles.
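The effect sizes reported next to the F tests in this study can be related to the F ratios with a standard conversion. The sketch below assumes that the reported η² is partial eta squared, which the text does not state explicitly; the function name `partial_eta_squared` is introduced here for illustration. The F values and degrees of freedom in the usage lines are those reported for the Study 4 attitude ANOVA.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared implied by an F ratio and its degrees of freedom.

    Uses the identity eta_p^2 = (F * df_effect) / (F * df_effect + df_error).
    """
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df must be positive")
    scaled_f = f_value * df_effect
    return scaled_f / (scaled_f + df_error)


if __name__ == "__main__":
    # (label, F) pairs taken from the attitude ANOVA, all with df = (1, 88).
    reported = [
        ("argument strength", 21.71),
        ("prime valence", 10.21),
        ("prime valence x prime exposure", 6.61),
    ]
    for label, f_value in reported:
        print(f"{label}: {partial_eta_squared(f_value, 1, 88):.2f}")
    # prints 0.20, 0.10 and 0.07, matching the values given in the text
```

Under this assumption the conversion reproduces the effect sizes of .20, .10, and .07 given for these three effects.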

Table 6
Means and Standard Deviations for Mood Judgments as a Function of Prime Exposure, Argument Strength, and Prime Valence (Study 4)

                             Prime exposure
                        Long              Short
Prime valence         M      SD         M      SD
Strong arguments
  Positive           4.67   1.16       5.50   0.52
  Negative           4.90   0.88       4.08   0.87
Weak arguments
  Positive           4.67   0.65       5.64   0.84
  Negative           4.60   0.84       4.17   0.84

Note. Scale ranges from 1 to 7. Higher scores indicate a more positive mood. Mean mood judgment in the control condition was 4.75 (SD = 0.87).

Summary of Findings

A critical reader could, perhaps, argue that in some of our studies, direct comparisons between the control condition and the conditions in which we expected mood changes to occur were not always significant. However, we presented multiple studies and replicated our main finding across measures (two types of mood measures, need-for-cognition scale, argument strength logic) and across a large number of studies (Study 1a, Study 1b, Study 2, Study 3, and Study 4). This clearly and convincingly shows that our effect is real and robust.

To provide concrete statistical support for this claim, we performed additional contrast analyses across Studies 1a, 1b, 2, 3, and 4 on the z-transformed mood judgments and across Studies 2 and 3 on the z-transformed need-for-cognition scores. First, we performed an overall test of the predicted pattern on the mood judgments. Using the same weights as in Study 1a, we found that the a priori contrast was highly reliable, t(383) = 7.48, p < .05. In addition, we performed two contrast analyses to directly compare the control condition with the conditions in which we expected mood changes to occur (the short exposure and extreme conditions). Both contrasts, one comparing positive with control, t(383) = 4.04, p < .05, and one comparing negative with control, t(383) = 3.76, p < .05, were highly significant. To make sure that the conditions in which we did not expect mood changes to occur (the long exposure and moderate conditions) did not differ from the control condition, we performed two additional contrast analyses. In line with our hypotheses, neither of the contrasts, one comparing positive with control, t(383) = 0.57, p = .57, and one comparing negative with control, t(383) = 0.14, p = .89, was significant.

Second, we performed similar contrast analyses on the need-for-cognition scores, starting with an overall test of the predicted pattern. Using the same weights as in Study 2, we found that the a priori contrast was highly reliable, t(112) = 4.76, p < .05. Next, we performed the two contrast analyses to directly compare the conditions in which we expected mood changes to occur with the control condition. Both contrasts, one comparing positive with control, t(112) = 2.23, p < .05, and one comparing negative with control, t(112) = 3.08, p < .05, were highly significant. As expected, neither of the contrasts testing the control condition against the conditions in which we did not expect mood changes to occur, one comparing positive with control, t(112) = 0.18, p = .86, and one comparing negative with control, t(112) = 0.28, p = .78, was significant.

Together, these contrast analyses provide strong support for our claim that only very brief presentations of evaluative information or presentations of very extreme evaluative information may influence mood and cognition. The fact that in some of our studies, single low-power comparisons between individual cells did not reach ordinary levels of significance is thus completely surpassed by the robustness and reliability of our hypothesized effect across a large number of studies and measures.

General Discussion

Most successful mood induction techniques rely on the conscious experience or recall of real or imagined mood-eliciting stimuli and demand a relatively high amount of cognitive capacity.

But what exactly are the necessary preconditions to influence people’s mood states? What are the ingredients of a minimal mood induction paradigm? That is the central question of the present research. Together, the results of four studies indicate that successful mood induction is much more basic and simple than previous mood research would suggest. First, our mood induction paradigm does not demand a high amount of cognitive capacity, considering that participants only responded to the location of our masked priming materials. Second, the present studies indicate that moods can be induced without participants’ awareness of the mood-eliciting stimuli. Thus, participants were not conscious that mood states were being elicited. Third, we used valenced trait concepts as mood-eliciting stimuli, which most researchers regard as cold priming materials (see e.g., Clore & Colcombe, 2003; Niedenthal, Rohman, & Dalle, 2003). Recently, it has been argued that the activation of cold semantic concepts is insufficient to induce mood states and thus hot stimulus materials are required to produce mood effects (Clore & Colcombe, 2003; Innes-Ker & Niedenthal, 2002; Niedenthal et al., 2003). At first glance, the results of the present research seem to contradict this idea that hot materials are necessary because our findings show that the activation of semantic concepts can produce affective experiences. However, a closer look supports the view that hot materials are crucial. Our results clearly reveal that mood effects occur when mainly the hot features of these cold concepts are activated. Our results support the notion that the essential hot ingredient in mood-induction procedures is a dominating evaluative tone. The present studies illustrate two ways in which evaluative tone may dominate descriptive cues. According to the affective primacy hypothesis, nondescriptive, nonspecific evaluative stimulus cues are especially likely to be picked up when stimulus exposure is very short. 
Thus, we can separate the activation of evaluative and descriptive meaning with subliminal, very short stimulus exposures, activating mainly the evaluative tone of the stimulus (see also Stapel et al., 2002). The evaluative tone of a stimulus also dominates descriptive meaning when the stimulus is sufficiently extreme. Evaluative meaning is generally more salient in extremely valenced trait concepts than descriptive meaning is. We conclude from our studies that what all successful mood-induction techniques have in common is that they prime positively or negatively toned information at the expense of specific meaning. All you need to produce a mood state is a dominating activation of global, diffuse, nonspecific evaluative information. The activation of mainly evaluative, and thus diffuse, information may then spill over to people’s moods (Zajonc, 2000; see also Forgas, 1995; Schwarz & Clore, 1996). However, mood effects become less likely when evaluative as well as descriptive information is activated, cognitively constraining the evaluative tone.

Summary of Results

In Studies 1a and 1b, we tested our evaluative-tone hypothesis using very short versus relatively long subliminal exposures to positive and negative trait concepts. Both studies demonstrate that subliminal exposure to valenced information is most likely to affect explicit mood judgments when prime exposures are short enough to activate evaluative reactions that have no specific descriptive content. Thus, when subliminal exposure to the valenced
trait concepts was short, then the activated evaluative tone spilled over into participants’ explicit mood reports. In contrast, when subliminal exposure to the valenced trait concepts was relatively long, no effects were found in the participants’ explicit mood reports. Thus, mood changes only took place in the case in which the evaluative tone dominated the descriptive meaning of the prime. The results of Study 1b suggest that such carryover effects are most likely to occur on relatively general and evaluative mood judgments but not on relatively specific, descriptive self-judgments and person judgments. Study 2 extends the findings of Study 1 to an indirect mood measure: The results illustrate that subliminally primed trait concepts affected the participants’ reported need for cognition when the evaluative tone dominated the descriptive meaning of the primes. Thus, the activation of a positive evaluative tone resulted in a more positive mood than the activation of a dominating, negative evaluative tone. Similar to our findings in Study 1, we obtained no mood effects when the descriptive meaning was sufficiently activated (i.e., under relatively long stimulus exposures). Besides supporting our evaluative-tone hypothesis, Study 2 is the first study to demonstrate that need for cognition can fluctuate due to situational factors. Specifically, people experienced a higher need for cognition when they felt bad than when they felt good. This finding demonstrates that need for cognition may serve as an indirect mood measure, equivalent to people’s information-processing styles. Moreover, this finding shows for the first time that need for cognition can be used as a dependent measure. In previous studies, need for cognition always served as a mediating or moderating variable. Thus, the present findings show that it makes sense to distinguish between “state” and “trait” need for cognition in future research. 
Study 3 provides additional evidence for our evaluative-tone hypothesis. In the previous studies, we separated the activation of evaluative and descriptive meaning by presenting the information for a sufficiently short duration that primarily the evaluative features were activated. In Study 3, we used trait concepts with an evaluative meaning that always dominated their descriptive meaning. Thus, we primed participants with trait concepts of extreme valence or with trait concepts of moderate valence, both under relatively long, but subliminal, exposure durations. Consistent with our evaluative-tone hypothesis, reported need for cognition and explicit mood judgments indicated that participants who were primed with extreme positive trait concepts felt more positive than participants who were primed with extreme negative trait concepts. As expected, the mood states of participants who were primed with moderate trait concepts were unaffected. Studies 1, 2, and 3 provided evidence for our evaluative-tone hypothesis on explicit mood measures and on people’s motivational states (i.e., need for cognition). To complete the picture, we conducted Study 4, which directly measured processing style. After the priming episode, we offered participants at a Dutch university either strong or weak arguments in favor of using English as the official university language. In line with our prediction, strong arguments were more persuasive than weak arguments for participants who were exposed for a very short time to negative trait concepts, whereas strong and weak arguments were equally persuasive for participants who were exposed for a very short time to positive concepts. However, when prime exposure was relatively long, strong arguments were more persuasive than

weak arguments for all participants. Together, these findings indicate that participants in a good mood used heuristic, global processing styles to process the arguments, and participants in a bad mood used systematic, detailed processing styles. Thus, the results expand our evaluative-tone hypothesis to yet another indirect mood measure: information-processing style.
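
The pattern of results summarized above can be restated as a simple decision rule. The following Python sketch is offered only as an illustration of the evaluative-tone hypothesis as it is described in this section; it is not a model fitted to the data. The class `Prime`, the function names, and the string labels are hypothetical names introduced here, and treating the primes of Studies 1, 2, and 4 as moderately valenced by default is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prime:
    """A subliminally presented trait concept (hypothetical representation)."""
    valence: str                 # "positive" or "negative"
    exposure: str                # "very_short" or "relatively_long" (both subliminal)
    extremity: str = "moderate"  # "moderate" or "extreme"; the default is an assumption


def evaluative_tone_dominates(prime: Prime) -> bool:
    # Evaluative tone is taken to dominate descriptive meaning under very
    # short exposure (Studies 1, 2, and 4) or for extremely valenced
    # concepts (Study 3).
    return prime.exposure == "very_short" or prime.extremity == "extreme"


def predicted_mood(prime: Prime) -> str:
    # Mood follows the prime's valence only when evaluative tone dominates;
    # otherwise no mood effect is predicted.
    return prime.valence if evaluative_tone_dominates(prime) else "unaffected"


def strong_arguments_more_persuasive(prime: Prime) -> bool:
    # Study 4 pattern: strong arguments beat weak ones unless the
    # participant is in a primed positive mood (heuristic processing).
    return predicted_mood(prime) != "positive"


if __name__ == "__main__":
    for valence in ("positive", "negative"):
        for exposure in ("very_short", "relatively_long"):
            p = Prime(valence, exposure)
            print(valence, exposure, predicted_mood(p),
                  strong_arguments_more_persuasive(p))
```

The rule is deliberately qualitative: it encodes which conditions are expected to show a mood effect, not the size of that effect.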

Affective or Semantic Primacy?

In sum, the results of our studies strongly support the all-you-need-is-evaluative-tone hypothesis. Our findings also provide additional support for affective primacy, the hypothesis that people extract the evaluative meaning of a stimulus before its descriptive meaning. We showed several times, using explicit and implicit mood indicators, that subliminal presentations of valenced trait concepts only affect people’s mood states when exposure is very short. Subliminal presentations of these trait concepts do not affect people’s mood states when exposure is relatively long but still subliminal. We can only conclude from our findings that people extracted evaluative meaning before descriptive meaning. It seems very unlikely that specific content or descriptive information induced good and bad moods in the participants. Recently, researchers have argued that it is a “sin” (Clore et al., 2005, p. 394) to assume that affect may precede semantic analysis and that the evidence for affective primacy is weak (see also Storbeck & Robinson, 2004). These conclusions were based on (a subset of) previous affective priming studies and research that compared semantic priming and affective priming (Storbeck & Robinson, 2004). Storbeck and Robinson (2004) showed that semantic priming was consistent and more robust than affective priming. They took their results to mean that semantic analysis is more obligatory at encoding than affective analysis is. However, another (and less sectarian) way to interpret these results is that (automatic) information processing is flexible (see also Stapel & Koomen, 2006): Sometimes, people extract evaluative meaning before descriptive meaning, whereas at other times, people extract descriptive meaning before evaluative meaning. 
Sometimes it makes sense to see immediately whether an animal is a snake or a spider (by extracting descriptive information), whereas at other times it is more functional to see immediately whether an animal is cute or threatening (by extracting evaluative information). In the specific, minimal mood induction paradigm we used in the current studies, people were more likely to extract evaluative meaning before descriptive meaning when subliminal exposure to the primed information was extremely short. However, we emphasize that even though we expected affective primacy to occur in our current studies, we do not argue that affect always precedes cognition. Rather, we applied the affective primacy logic to create the circumstances to test our evaluative-tone hypothesis.

Unconscious Moods

Our findings show that mood effects are especially likely when we expose people for a very short time to the subliminal primes. Put differently, shorter flashes lead to stronger feelings. Although some emotion researchers have posited that it is indeed possible to elicit mood effects outside of conscious awareness (Chartrand, Van Baaren, & Bargh, 2006; Kihlstrom, Mulvaney, Tobias, & Tobis, 2000; Öhman & Soares, 1994; Robles, Smith, Carver, &
Wellens, 1987; Winkielman & Berridge, 2004; Winkielman, Berridge, & Wilbarger, 2005b), until the present studies hardly any reliable empirical evidence explained how and when moods could be unconsciously induced. Theoretically, it makes sense to assume that people can be aware of their emotional states without being aware of the antecedents that evoked these states. Often, people do not have conscious access to the antecedents that evoked their feelings, thoughts, motivations, and behaviors (Nisbett & Wilson, 1977). Nevertheless, most previous research has pointed out that unconsciously presented stimuli do not affect people’s mood states. Subliminal priming may have an impact on the preference for and liking of neutral target stimuli, without having an impact on people’s affective state (e.g., Banse, 2001; Clore & Colcombe, 2003; Edwards, 1990; Krosnick, Betz, Jussim, & Lynn, 1992; Murphy & Zajonc, 1993; Winkielman, Berridge, & Wilbarger, 2005a). These null findings on mood measures probably led Schwarz and Clore (1996) to conclude that in general, affective priming studies are irrelevant to the study of mood effects. Such studies demonstrate effects of subliminal priming on judgments of words, people, or Chinese characters, but they do not show subliminal priming effects on experienced moods. Thus, Schwarz and Clore (1996) concluded, “In the absence of experienced feelings, affective priming studies may indeed be better conceptualized as reflecting automatic evaluation processes . . . , which have been observed with materials unlikely to elicit any feelings . . . , rather than feeling-based inferences” (p. 440). We posit that Schwarz and Clore (1996) may have been right in their conclusion that to date, there is no consistent evidence that subliminal priming may affect people’s affective feelings. However, this should not be taken to mean that it is impossible for subliminally presented information to influence people’s mood (see e.g., Chartrand et al., 2006). 
There does not seem to be an a priori reason to assume that people’s moods cannot be affected by subliminally primed information. Why should the subliminally primed information only be capable of affecting a perceiver’s judgments of Chinese characters (e.g., Murphy & Zajonc, 1993; Winkielman, Zajonc, & Schwarz, 1997) or other people’s behaviors (e.g., Stapel et al., 2002) but not also affect the perceiver him- or herself? We think that spreading activation, the mechanism presumed to underlie evaluative priming (Bargh, 1997; Ferguson & Bargh, 2003; Wentura, 2000), may equally relate to people’s moods as to people’s liking judgments. However, mood effects need a stronger evaluative tone, one that dominates the descriptive cues. Thus, we explain the lack of mood effects in previous subliminal affective priming research by our hypothesis that evaluative tone needs to overrule the descriptive cues. It seems likely that in previous research, the activation of evaluative cues was sufficient to produce affective priming but insufficient to affect people’s moods because descriptive cues were also activated. In most of the studies that have shown dissociation between effects on liking judgments and mood, researchers used facial expressions as affective primes. Participants were subliminally primed with happy, neutral, or angry faces, which resulted in evaluatively congruent liking judgments of Chinese ideograms or an unfamiliar beverage (e.g., Winkielman et al., 2005a). We think that a possible explanation for the null findings on mood measures might lie in the fact that people are very efficient at “face perception” (Farah, Wilson,

Drain, & Tanaka, 1998, p. 482). Some researchers have even assumed a specialized module for the perception of faces. Other research has demonstrated, for example, that even the social category of a subliminally presented face may have an impact on subsequent judgments (Ruys, Spears, Gordijn, & De Vries, 2007; Stapel et al., 2002). Thus, activation of specific descriptive “face information” is likely to occur early in the information-processing chain. When evaluative cues + face information are activated, effects on liking judgment might still occur. It seems functional to assume that something is pleasant when other people enjoy it. However, it seems less functional that mood states would be affected in this way. Evaluative cues only influence mood when the activation of evaluative cues is cognitively unbound. In sum then, the current studies clearly demonstrate, via a minimal mood induction paradigm, that all you need to influence people’s mood states is evaluative tone. The essential ingredient for the genesis of mood effects is a dominance of evaluative meaning over descriptive meaning. Whereas most previous research findings may be interpreted as suggesting that hot, cognitive-capacity-demanding, and conscious experience of mood-eliciting stimuli are necessary for successful mood induction, the results of the present studies support the notion that people’s mood states can be influenced without their awareness of the mood-eliciting stimuli. Thus, relatively cold concepts can induce hot mental states.

References

Adolphs, R. (2003). Cognitive neuroscience of human social behavior. Nature Reviews Neuroscience, 4, 165–178. Banse, R. (2001). Affective priming with liked and disliked persons: Prime visibility determines congruency and incongruency effects. Journal of Experimental Social Psychology, 15, 501–520. Bargh, J. A. (1997). The automaticity of everyday life. In R. S. Wyer (Ed.), Advances in social cognition: Vol. 10 (pp. 1–61). Mahwah, NJ: Erlbaum. Bargh, J. A. (2006). What have we been priming all these years? On the development, mechanisms, and ecology of nonconscious social behavior. European Journal of Social Psychology, 36, 147–168. Bargh, J. A., Litt, J., Pratto, F., & Spielman, L. A. (1989). On the preconscious evaluation of social stimuli. In A. F. Bennett & K. M. McConkey (Eds.), Cognition in individual and social contexts: Proceedings of the XXV International Congress of Psychology. (Vol. 3, pp. 357–370). Amsterdam: Elsevier/North-Holland. Bless, H., Bohner, G., Schwarz, N., & Strack, F. (1990). Mood and persuasion. A cognitive response analysis. Personality and Social Psychology Bulletin, 16, 331–345. Bless, H., Clore, G. L., Schwarz, N., Golisano, V., Rabe, C., & Woelk, M. (1996). Mood and the use of scripts: Does a happy mood really lead to mindlessness? Journal of Personality and Social Psychology, 71, 665–679. Cacioppo, J. T., & Petty, R. E. (1982). The need for cognition. Journal of Personality and Social Psychology, 42, 116–131. Cacioppo, J. T., Petty, R. E., Feinstein, J. A., & Jarvis, W. B. G. (1996). Dispositional differences in cognitive motivation: The life and times of individuals varying in need for cognition. Psychological Bulletin, 119, 197–253. Chaiken, S. (1980). Heuristic versus systematic information processing and the use of source versus message cues in persuasion. Journal of Personality and Social Psychology, 39, 752–766. Chartrand, T. L., Van Baaren, R. B. V., & Bargh, J. A. (2006). 
Linking automatic evaluation to mood and information processing style: Consequences for experienced affect, impression formation, and stereotyping. Journal of Experimental Psychology: General, 135, 70–77. Chartrand, T. L., & Bargh, J. A. (1996). Automatic activation of impression formation and memorization goals: Nonconscious goal priming reproduces effects of explicit task instructions. Journal of Personality and Social Psychology, 71, 464–478. Clore, G., & Colcombe, S. (2003). The parallel worlds of affective concepts and feelings. In J. Musch & K. C. Klauer (Eds.), The psychology of evaluation (pp. 169–188). Mahwah, NJ: Erlbaum. Clore, G. L., Storbeck, J., Robinson, M. D., & Centerbar, D. B. (2005). Seven sins in the study of unconscious affect. In L. F. Barrett, P. M. Niedenthal, & P. Winkielman (Eds.), Emotion and consciousness (pp. 384–408). New York: Guilford Press. Edwards, K. (1990). The interplay of affect and cognition on attitude formation and change. Journal of Personality and Social Psychology, 59, 202–216. Erdley, C. A., & D’Agostino, P. R. (1988). Cognitive and affective components of automatic priming effects. Journal of Personality and Social Psychology, 54, 741–747. Farah, M. J., Wilson, K. D., Drain, M., & Tanaka, J. N. (1998). What is “special” about face perception? Psychological Review, 105, 482–498. Ferguson, M., & Bargh, J. A. (2003). The constructive nature of automatic evaluation. In J. Musch & K. C. Klauer (Eds.), The psychology of evaluation (pp. 169–188). Mahwah, NJ: Erlbaum. Fiedler, K. (1990). Mood-dependent selectivity in social cognition. In W. Stroebe & M. Hewstone (Eds.), European Review of Social Psychology (Vol. 1, pp. 1–32). New York: Wiley. Fiedler, K. (1991). On the task, the measures, and the mood in research on affect and social cognition. In J. P. Forgas (Ed.), Emotion and social judgments (pp. 83–104). Cambridge, United Kingdom: Cambridge University Press. Fiedler, K. (2001). Affective influences on information processing. In J. P. Forgas (Ed.), Handbook of affect and social cognition. 
Mahwah, NJ: Erlbaum. Fiedler, K., & Stroehm, W. (1986). What kind of mood influences what kind of memory: The role of arousal and information structure. Memory and Cognition, 14, 181–188. Forgas, J. P. (1992). Affect in social judgments and decisions: A multiprocess model. In M. P. Zanna (Ed.), Advances in experimental social psychology (Vol. 25, pp. 227–275). San Diego, CA: Academic Press. Forgas, J. P. (1995). Mood and judgment: The affect infusion model (AIM). Psychological Bulletin, 117, 39 – 66. Gaspar, K., & Clore, G. L. (2002). Attending to the big picture: Mood and global versus local processing of visual information. Psychological Science, 13, 34 – 40. Gervey, B., Igou, E. R., & Trope, Y. (2005). Positive mood and futureoriented self-evaluation. Motivation and Emotion, 29, 269 –295. Hampson, S. E., John, O. P., & Goldberg, L. R. (1986). Category breadth and hierarchical structure in personality: Studies in asymmetries in judgments of trait implications. Journal of Personality and Social Psychology, 51, 37–54. Hugenberg, K. (2005). Social categorization and the perception of facial affect: Target race moderates response latency advantage for happy faces. Emotion, 5, 267–276. Innes-Ker, A., & Niedenthal, P. M. (2002). Emotion concepts and emotional states in social judgment and categorization. Journal of Personality and Social Psychology 83, 804 – 816. Isen, A. M. (1987). Positive affect, cognitive processes, and social behavior. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 20, pp. 203–253). San Diego, CA: Academic Press. Keltner, D., Locke, K. D., & Audrain, P. C. (1993). The influence of attributions on the relevance of negative feelings to personal satisfaction. Personality and Social Psychology Bulletin, 19, 21–29. Kihlstrom, J. F., Mulvaney, S., Tobias, B. A., & Tobis, I. P. (2000). The

emotional unconscious. In E. Eich, J. F. Kihlstrom, G. H. Bower, J. P. Forgas, & P. M. Niedenthal (Eds.), Cognition and emotion. New York: Oxford University Press. Krosnick, J. A., Betz, A. L., Jussim, L. J., & Lynn, A. R. (1992). Subliminal conditioning of attitudes. Personality and Social Psychology Bulletin, 18, 152–162. LeDoux, J. E. (1989). Cognitive–emotional interactions of the brain. Cognition and Emotion, 3, 267–289. Maringer, M., & Stapel, D. A. (2007). Unfinished business: How completeness affects the impact of emotional states and emotion concepts on social judgment. Journal of Experimental Psychology, 43, 712–718. Mayer, J. D., & Gaschke, Y. N. (1988). The experience and metaexperience of mood. Journal of Personality and Social Psychology, 55, 102–111. Murphy, S. T., & Zajonc, R. B. (1993). Affect, cognition, and awareness: Affective priming with optimal and suboptimal stimulus exposures. Journal of Personality and Social Psychology, 64, 723–739. Niedenthal, P. M., Rohman, A., & Dalle, N. (2003). What is primed by emotion concepts and emotion words? In J. Musch & K. C. Klauer (Eds.), The psychology of evaluation (pp. 307–333). Mahwah, NJ: Lawrence Erlbaum. Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84, 231–259. Öhman, A., & Soares, J. J. (1994). “Unconscious anxiety”: Phobic responses to masked stimuli. Journal of Abnormal Psychology, 103, 231–240. Palermo, R., & Rhodes, G. (2007). Are you always on my mind? A review of how face perception and attention interact. Neuropsychologia, 45, 75–92. Petty, R. E., & Cacioppo, J. T. (1986). The elaboration likelihood model of persuasion. In L. Berkowitz (Ed.), Advances in experimental social psychology (Vol. 19, pp. 123–205). New York: Academic Press. Petty, R. E., DeSteno, D., & Rucker, D. D. (2001). The role of affect in attitude change. In J. P. 
Forgas (Ed.), Handbook of affect and social cognition. Mahwah, NJ: Erlbaum. Robles, R., Smith, R., Carver, C. S., & Wellens, A. R. (1987). Influence of subliminal visual images on the experience of anxiety. Personality and Social Psychology Bulletin, 13, 399 – 410. Ruys, K. I., Spears, R., Gordijn, E. H., & De Vries, N. K. (2007). Automatic contrast: Evidence that automatic comparison with the social self affects evaluative responses. British Journal of Psychology, 98, 361–374. Ruys, K. I., & Stapel, D. A. (in press–a). Emotion elicitor or emotion messenger?: Subliminal exposure to two faces of facial expressions. Psychological Science. Ruys, K. I., & Stapel, D. A. (in press– b). The secret life of emotions. Psychological Science. Schwarz, N. (1990). Feelings as information: Informational and motivational functions of affective states. In E. T. Higgins & R. M. Sorrentino (Eds.), Handbook of motivation and cognition: Foundations of social behavior (Vol. 2, pp. 527–561). New York: Guilford Press. Schwarz, N., & Clore, G. (1996). Feelings and phenomenal experiences. In E. T. Higgins & A. W. Kruglanski (Eds.), Social psychology: Handbook of basic principles (pp. 433– 465). New York: Guilford Press.

Stapel, D. A. (2003). Making sense of hot cognition: Why and when description influences our feelings and judgments. In J. P. Forgas, K. D. Williams, and W. von Hippel (Eds.), Social judgments: Implicit and explicit processes (pp. 227–250). New York: Cambridge University Press. Stapel, D. A., & Koomen, W. (2000). How far do we go beyond the information given? The impact of knowledge activation on interpretation and inference. Journal of Personality and Social Psychology, 78, 19 –37. Stapel, D. A., & Koomen, W. (2005). When less is more: The consequences of affective primacy for subliminal priming effects. Personality and Social Psychology Bulletin, 9, 1286 –1295. Stapel, D. A., & Koomen, W. (2006). The flexible unconscious: Investigating the judgmental impact of varieties of unaware perception. Journal of Experimental Social Psychology, 42, 112–119. Stapel, D. A., Koomen, W., & Ruys, K. I. (2002). The effects of diffuse and distinct affect. Journal of Personality and Social Psychology, 83, 60 –74. Storbeck, J., & Robinson, M. D. (2004). Preferences and inferences in encoding visual objects: A semantic comparison of semantic and affective priming. Personality and Social Psychology Bulletin, 30, 81–93. Treisman, A. (1996). The binding problem. Current Opinion in Neurobiology, 6, 171–178. Wegener, D. T., & Petty, R. E. (1994). Mood management across affective states: The hedonic contingency hypothesis. Journal of Personality and Social Psychology, 66, 1034 –1048. Wentura, D. (2000). Dissociative affective and associative priming effects in lexical decision task: Yes versus no responses to word targets reveal evaluative judgment tendencies. Journal of Experimental Psychology: Learning, Memory, and Cognition, 26, 456 – 469. Winkielman, P., & Berridge, K. C. (2004). Unconscious emotion. Current Directions in Psychological Science, 13, 120 –123. Winkielman, P., Berridge, K. C., & Wilbarger, J. L. (2005a). 
Unconscious affective reactions to masked happy versus angry faces influence consumption behavior and judgments of value. Personality and Social Psychology Bulletin, 31, 121–135. Winkielman, P., Berridge, K. C., & Wilbarger, J. L. (2005b). Emotion, behavior, and conscious experience. Once more without feeling. In L. F. Barrett, P. M. Niedenthal, & P. Winkielman (Eds.), Emotion and consciousness (pp. 335–362). New York: Guilford Press. Winkielman, P., Zajonc, R. B., & Schwarz, N. (1997). Subliminal affective priming resists attributional intervention. Cognition and Emotion, 11, 433– 465. Zajonc, R. B. (1980). Feeling and thinking: Preferences need no inferences. American Psychologist, 35, 151–175. Zajonc, R. B. (2000). Feeling and thinking: Closing the debate over the independence of affect. In J. Forgas (Ed.), Feeling and thinking: The role of affect in social cognition (pp. 31–58). New York: Cambridge University Press.

Received January 8, 2007 Revision received December 10, 2007 Accepted December 18, 2007

Journal of Personality and Social Psychology 2008, Vol. 94, No. 5, 792–807

Copyright 2008 by the American Psychological Association 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.792

Forming Implicit and Explicit Attitudes Toward Individuals: Social Group Association Cues

Allen R. McConnell, Miami University
Robert J. Rydell, University of Missouri–Columbia
Laura M. Strain, Miami University
Diane M. Mackie, University of California, Santa Barbara

The authors explored how social group cues (e.g., obesity, physical attractiveness) strongly associated with valence affect the formation of attitudes toward individuals. Although explicit attitude formation has been examined in much past research (e.g., S. T. Fiske & S. L. Neuberg, 1990), in the current work, the authors considered how implicit as well as explicit attitudes toward individuals are influenced by these cues. On the basis of a systems of evaluation perspective (e.g., R. J. Rydell & A. R. McConnell, 2006; R. J. Rydell, A. R. McConnell, D. M. Mackie, & L. M. Strain, 2006), the authors anticipated and found that social group cues had a strong impact on implicit attitude formation in all cases and on explicit attitude formation when behavioral information about the target was ambiguous. These findings obtained for cues related to obesity (Experiments 1 and 4) and physical attractiveness (Experiment 2). In Experiment 3, parallel findings were observed for race, and participants holding greater implicit racial prejudice against African Americans formed more negative implicit attitudes toward a novel African American target person than did participants with less implicit racial prejudice. Implications for research on attitudes, impression formation, and stigma are discussed.

Keywords: attitudes, implicit attitudes, impression formation, prejudice, stigma

Allen R. McConnell and Laura M. Strain, Department of Psychology, Miami University; Robert J. Rydell, Department of Psychological Sciences, University of Missouri–Columbia; Diane M. Mackie, Department of Psychology, University of California, Santa Barbara. These studies were completed while Allen R. McConnell and Laura M. Strain were supported by National Institute of Mental Health Grant MH068279 and National Science Foundation Grant BCS 0601148 and Robert J. Rydell and Diane M. Mackie were supported by National Institute of Mental Health Grant MH63762. We thank John Cacioppo and Penny Visser for their comments on this work. Correspondence concerning this article should be addressed to Allen R. McConnell, Department of Psychology, Miami University, Oxford, OH 45056. E-mail: [email protected]

IMPLICIT AND EXPLICIT ATTITUDE FORMATION

People would like to believe that their attitudes toward others reflect their careful evaluation of others’ unique and individual merits. Although this undoubtedly occurs in some cases, social psychology research raises questions about the pervasiveness of such a reasoned approach to understanding others (Bargh, 1999; Bargh & Chartrand, 1999; Brewer, 1988; Fiske & Neuberg, 1990; Nisbett & Wilson, 1977; Schwarz & Bohner, 2001). At times, people are evaluated by the content of their character, but in other situations, this content can seem largely irrelevant. For example, individuating information about a person can often be relatively inconsequential when perceivers base their evaluations of a person on information associated with the individual’s social group (Fiske, 1998; Fiske & Neuberg, 1990).

Indeed, research has demonstrated that cues providing information about social groups (e.g., obesity, physical attractiveness, race) can impact social perceptions. For example, target people can be viewed differently when their race or ethnicity influences perceptions and interpretations of ambiguous behaviors and events (e.g., Bodenhausen & Wyer, 1985; Duncan, 1976; Sagar & Schofield, 1980). Although these cues typically do not influence perceptions retroactively, they can influence interpretations of ambiguous acts during encoding (e.g., Bodenhausen & Wyer, 1985). However, when a target’s actions are clear-cut instead of ambiguous, accessible social group categories produce little biased assimilation, reducing the influence of groups on perceptions of target individuals (Bruner, 1957; Higgins, 1989; Srull & Wyer, 1979).

Of course, the degree to which a target’s actions can shape one’s attitude is determined, in part, by the extent to which perceivers process individuated information about the target. In fact, Fiske and Neuberg’s (1990) continuum model of impression formation considers the extent to which a target’s behaviors guide social perception instead of information associated with a target’s social group. They proposed that people rely on piecemeal integration (e.g., the behaviors performed by an individual) instead of categorization (e.g., knowledge associated with the group as a whole) under conditions where perceivers are able and willing to devote cognitive resources to understanding target individuals. For example, when motivated and presented with a number of behaviors, a perceiver may come to hold a positive attitude toward a target person who is obese (i.e., a member of a social group associated with negativity) whose behaviors are predominantly positive in nature. Indeed, there is considerable support for the continuum model (for an overview, see Fiske, Lin, & Neuberg, 1999).

However, one interesting feature of this work is its focus on how people use categorization and piecemeal integration in the formation of explicit attitudes (i.e., evaluations that people can report and for which expression can be consciously controlled) toward individual group members. Yet, it is an open question as to how an individual’s social group and individuated behaviors contribute to the formation of implicit attitudes (i.e., evaluations for which people may not initially have conscious access and for which activation cannot be controlled) toward individuals. Within the context of the continuum model, the impact of social group knowledge has been assumed to result from less effortful consideration of individuated information (Fiske et al., 1999). But in the current work, we suggest that group knowledge may impact implicit attitude formation even when perceivers devote considerable cognitive resources to understanding social targets. Specifically, we propose that many social groups are strongly associated with valence and that the nature of this knowledge (i.e., its associative basis) may also have important implications for attitude formation, especially for implicit attitudes.

In the current work, we examine how group association cues affect attitude formation (implicit and explicit) toward individuals. Although these cues may have implications for other aspects of impression formation (e.g., stereotypes, attributions), here we focus exclusively on how these social group cues shape attitude formation toward novel individuals.

Systems of Evaluation

Recent work has established that the processes underlying the formation and change of implicit attitudes differ considerably from those involved in explicit attitudes (e.g., Rydell & McConnell, 2006; Rydell, McConnell, Mackie, & Strain, 2006; Rydell, McConnell, Strain, Claypool, & Hugenberg, 2007). Specifically, we (Rydell & McConnell, 2006; Rydell et al., 2006) have advanced a systems of evaluation approach to attitudes, proposing that there are two independent systems of evaluation that differ in both what information they use and how they act on it (see also Greenwald & Banaji, 1995; Sloman, 1996; Smith & DeCoster, 2000; Strack & Deutsch, 2004). The first system of evaluation, the associative system, operates using paired associations based on similarity and contiguity. In this case, learning is based on the accumulation of information over time to form and strengthen associations in memory. The second system of evaluation, the rule-based system, relies on logic and symbolic representations at a relatively higher order level of cognitive processing. On the basis of a systems of evaluation account, one can delineate evaluations that tap into the associative and rule-based systems of evaluation: implicit and explicit attitudes, respectively (Rydell et al., 2006). This approach is congruent with current conceptualizations of how implicit and explicit attitudes operate, allowing one to generate novel predictions about how evaluations are formed and changed in memory (cf. Gawronski & Bodenhausen, 2006). The associative system of evaluation is relevant to the understanding of how implicit attitudes form and function because implicit attitudes are posited to follow the basic principles of similarity and association (Smith & DeCoster, 2000). 
The rule-based system, however, fits with a conceptualization of explicit attitudes as evaluations based on conscious deliberation or syllogistic reasoning, which can reveal quick changes in expression (Fazio, 1995) but require cognitive resources in their formation and change (Petty & Wegener, 1998). This systems of evaluation approach has proven useful in understanding how implicit and explicit attitudes toward individuals

form differently. As an example, Rydell et al. (2006) showed that implicit attitudes were formed in response to the valence of subliminal primes presented prior to the visual appearance of a target individual, whereas explicit attitudes were formed in response to consciously available descriptions of that target’s behaviors. For instance, when concurrently presented with a series of negative subliminal primes and positive behavioral statements performed by a target person, participants’ implicit attitudes toward the target person were negative but explicit attitudes toward the same person were positive. Consistent with a systems of evaluation account, implicit and explicit attitudes were formed relatively independently of each other, with each responding to the type of information assumed to influence the associative and rule-based systems, respectively. Although implicit attitudes can, given a sufficient amount of information, be responsive to verbal information about a target person (Rydell & McConnell, 2006), implicit attitudes are more responsive to information that is associative in nature (in the case of Rydell et al., 2006, associations that were subliminally paired with the target individual). In the current work, we again focus on how implicit and explicit attitudes (based on different systems of evaluation) can be differentially sensitive to distinct forms of social information. These demonstrations of differences in implicit and explicit attitude change notwithstanding, much remains to be determined about the nature of implicit and explicit attitudes (see Gawronski & Bodenhausen, 2006). For example, in our previous work, we have only explored attitude formation and change for relatively impoverished targets (e.g., a nondescript White man named Bob). However, when perceivers encounter social targets, many group association cues such as skin color may be available. 
Although Bob could potentially be viewed as a member of several social categories, it is likely that such categorizations are not especially salient to our participants for several reasons. First, because they only meet one person instead of a target person in a context of differentiated others, Bob’s race, sex, or other possible categories (e.g., his age, his hairstyle) should not be distinctive. Indeed, social categorization is inherently contextual (e.g., an overweight person may be categorized differently in a group of morbidly obese others), which means encountering a White male target in isolation reduces the number of salient social categories available to a perceiver. Further, a college-age, White male target is not likely to be viewed as deviant or as a member of a minority social group category (e.g., Miller, Taylor, & Buck, 1991), especially to college-age participants who themselves are predominantly European American (e.g., McGuire, McGuire, Child, & Fujioka, 1978). However, the introduction of a target individual with more distinctive social group association cues (e.g., an African American Bob) could presumably have a considerable effect on attitude formation. If so, might these group association cues have different implications for implicit and explicit attitude formation? On the basis of a systems of evaluation analysis, we believe the answer is yes. As Rydell et al. (2006) showed, implicit attitudes were primarily affected by associative information rather than by detailed statements about the target’s behaviors, whereas explicit attitudes were shaped by the valence of the behavioral information instead of the valence of subliminal primes. Because of the sensitivity exhibited by implicit attitudes to information based on associations (see also Sloman, 1996), we reasoned that when group association cues are presented about a target person, such as being overweight, being physi-

MCCONNELL, RYDELL, STRAIN, AND MACKIE

cally attractive, or being African American, these cues, because they are association based in nature, would be used more strongly by the associative system of evaluation and thus influence implicit attitudes toward the target in proportion to how strongly they are associated with positivity or negativity (i.e., stronger cues should have a greater impact). However, in the absence of such group association cues, implicit attitudes toward the person should eventually reflect the valence of target-relevant behavioral information (Rydell & McConnell, 2006). That is, implicit attitudes are sensitive to verbally conveyed information about a target’s behavior, but they will be more strongly influenced by group cues that have strong valence associations. Indeed, Castelli, Zogmaister, Smith, and Arcuri (2004) showed that implicit attitudes can be formed simply by linking a person with a group very strongly associated with valence (e.g., child molesters) in the absence of behavioral information. However, if a person is a member of a social group more weakly associated with valence (or if no group association cues are available at all), implicit attitudes toward the target will reflect the individual’s behaviors (e.g., Rydell & McConnell, 2006; Rydell et al., 2007). In contrast, we expected that the valence of the verbal statements presented about the target person’s behaviors would determine explicit attitudes toward the individual (Rydell & McConnell, 2006; Rydell et al., 2006) regardless of the group association cues presented. That is, when unambiguous statements clearly describe a target person who performs positive or negative behaviors, the likelihood that a group association cue can assimilate such clear-cut behaviors is exceedingly low (Higgins, 1989; Srull & Wyer, 1979). 
However, if a target person’s individual behaviors are ambiguous with respect to valence, a target’s group association cue may serve to disambiguate each behavior, exerting an assimilative effect and thus influencing explicit attitude formation toward the individual in these cases.

Group Association Cues It has been shown that people have strong negative evaluations with groups ranging from the obese and the unattractive (e.g., Nosek, 2005; Rudman, Feinberg, & Fairchild, 2002) to racial outgroups (e.g., Greenwald, McGhee, & Schwartz, 1998; McConnell & Leibold, 2001). But in the current work, we were interested in whether these negative group associations would impact attitude formation about individual targets and, in particular, implicit attitudes toward them. Clearly, obesity (e.g., Crandall et al., 2001), attractiveness (e.g., Dion, Berscheid, & Walster, 1972), and race (e.g., Sagar & Schofield, 1980) can impact deliberate evaluations and judgments. Yet, it is important to note that many studies showing the impact of groups on perceptions and judgments involve situations engineered to be equivocal (e.g., an ambiguous shove in the hallway between two students, student court cases that present a mixture of guilt-suggestive and guilt-exonerating details about defendants) to maximize the likelihood that the cue (e.g., a sketch involving an African American child) will influence perceptions. Thus, in the current study, we expected that group association cues would have a far greater impact on implicit attitudes than on explicit attitudes when a substantial amount of unambiguous verbal information was presented about the target person’s behaviors. However, in cases where the behavioral information about the target person was ambiguous with respect to valence, we anticipated that group association cues would also influence explicit attitudes toward the target

person, consistent with many existent findings showing that social groups can bias judgments in ambiguous situations. In the current work, we examined visual cues strongly associated with positivity or negativity. Specifically, we explored obesity, physical attractiveness, and race. We were drawn to these cues because obesity and race have been studied extensively in research on stigma. For example, people avoid stigmatized group members (e.g., Pryor, Reeder, Yeadon, & Hesson-McInnis, 2004), devalue items associated with them (e.g., Neuberg, Smith, Hoffman, & Russell, 1994; Rozin, Markwith, & Nemeroff, 1992), and evaluate them negatively on implicit (e.g., Castelli et al., 2004; Fazio, Jackson, Dutton, & Williams, 1995; Greenwald et al., 1998; Nosek, 2005; Nosek & Banaji, 2001; Wittenbrink, Judd, & Park, 1997) and explicit (e.g., Crocker, Major, & Steele, 1998; Dovidio, Kawakami, & Gaertner, 2002; Plant & Devine, 1998) measures. Thus, being a member of a stigmatized group provides a strong, negative group association cue. Physical attractiveness can also serve as a strong group association cue (for many of the same reasons as stigmatized group membership), but, unlike stigma, a person’s physical attractiveness can serve as either a positive or a negative cue. For instance, people who are physically attractive are assumed to be competent and positive in domains unrelated to their looks (e.g., Chaiken, 1979; Dion et al., 1972; Eagly, Ashmore, Makhijani, & Longo, 1991), whereas those who are physically unattractive are viewed quite negatively (e.g., Ambady & Rosenthal, 1993; Berscheid & Walster, 1974), even by infants (e.g., Dion, 1973). Thus, whereas obesity and race provide ways to instantiate negative group association cues about target individuals, variability in attractiveness can provide positive and negative group association cues.1

1 In the current work, we use the term group association cue because participants are never directly told anything about the target person’s membership in a social category (e.g., physical attractiveness is inferred from a visual image of the target person, who, on the basis of pretesting, was reliably viewed as normatively attractive). Also, we do not propose that these cues cannot affect explicit attitudes. For instance, group association cues are especially likely to affect deliberate evaluations in circumstances where the cue is perceived to be germane to one’s impression (e.g., physical attractiveness is likely to influence explicit attitudes toward a potential dating partner; Petty & Wegener, 1998).

Figure 1. Sample stimuli used to manipulate obesity (Experiments 1 and 4, top row) and race (Experiment 3, bottom row). Top-row photos are from Nosek et al. (2004) and bottom-row photos are from Minear and Park (2004).

Overview of the Current Work

We conducted four experiments to evaluate whether group association cues would, in general, have a stronger impact on implicit attitudes than on explicit attitudes when forming attitudes toward a group member, as anticipated by a systems of evaluation account. The basic paradigm and the attitude measures used were the same as those applied in previous research (e.g., Rydell & McConnell, 2006; Rydell et al., 2006). Specifically, participants received detailed, verbal information about the behaviors of a novel target person (Bob or Bobbie, depending on the experiment) prior to reporting their implicit and explicit attitudes toward the target. Initially, a number of trials were presented featuring a target photo and behavioral statements about the person to induce either a positive or a negative initial attitude toward the target. New to the current work were manipulations of target photos (see Figure 1 for examples) that allowed us to present a target with negative or positive group association cues (or no salient group association cue in some conditions). Next, participants received either additional neutral (control) statements about the target or additional counterattitudinal (CA) statements about the target (i.e., the valence associated with these subsequent statements was the opposite of the valence of the behavioral statements in the initial learning trials). The CA conditions allowed us to examine how attitudes would change in the face of new and conflicting behavioral information about the target. Past research has shown that presenting a considerable number of CA behaviors (such as in the current work) results in a much more moderated attitude toward the target person (e.g., Kerpelman & Himmelfarb, 1971; Rydell & McConnell, 2006). Whether such revised attitudes toward the individual reflect relatively neutral or relatively ambivalent attitudes toward the target person is less important for the present concerns than is the fact that the introduction of CA information should produce meaningful shifts in attitudes toward the target. More important, we predicted that the introduction of CA information about a target presented with strong group association cues would have a differential impact on explicit and implicit attitudes toward the target person. In general, we expected that explicit attitudes toward the target person would respond to the valence described in the behavioral statements and that they would change after the presentation of CA information. Also, when no salient group association cue was present or when the cue was weakly associated with valence, we expected that implicit attitudes toward the target person would show a pattern similar to the pattern of explicit attitudes. That is, similar to Rydell and McConnell (2006), when large amounts of CA information are presented, implicit attitudes should eventually change in the absence of group association cues. However, when

strong group association cues were present, we expected implicit attitudes to primarily reflect the valence associated with the social group and thus not be strongly moderated by the CA information. We tested these predictions by manipulating group association cues related to obesity (Experiments 1 and 4), physical attractiveness (Experiment 2), and race (Experiment 3). Finally, we anticipated that the group association cue would impact explicit attitudes toward the target when the behavioral statements describing the person were relatively uninformative with respect to valence. Thus, in Experiment 4, we manipulated whether the target individual was or was not obese, and we crossed this factor with another manipulation that varied whether the behavioral statements were clear-cut or ambiguous in terms of valence. As in the previous studies, we expected implicit attitudes to be influenced by the presence of a strong group association cue. However, we also anticipated that the group association cue would impact explicit attitudes toward the target under conditions where the individual’s behaviors were ambiguous (but not when they were unambiguous). As noted previously, group membership should have an impact on explicit attitudes toward the target only when each behavior encountered is ambiguous with respect to valence and, thus, the cue can influence how each behavior is encoded at the time of encounter (e.g., Bodenhausen & Wyer, 1985). However, group cues should not affect explicit attitudes toward the target when each behavior is clear-cut in terms of valence (because each action is not subject to interpretation) even if, ultimately, the final attitude toward the individual is relatively mixed in nature (which is more likely in the CA conditions).

Experiment 1 In Experiment 1, we examined how implicit and explicit attitudes formed and changed for members of a stigmatized group (i.e., those who are overweight) relative to targets who are not stigmatized (i.e., those who are not overweight). This study replicated the basic experimental design of Rydell and McConnell (2006), but it also manipulated a group association cue for the target. Specifically, on a between-subjects basis, participants formed attitudes toward a person, Bob, who appeared to be either overweight or not overweight. In addition to seeing a photo of Bob, participants were presented with a number of positive and negative verbal behavioral statements about him and asked to determine whether each statement was characteristic of him. All participants received the same behavioral statements; however, whether a behavior was characteristic or uncharacteristic of Bob was manipulated systematically to indicate that Bob acted positively (positive behaviors were characteristic and negative behaviors were uncharacteristic of him) or negatively (negative behaviors were characteristic and positive behaviors were uncharacteristic of him). Finally, participants’ implicit and explicit attitudes were assessed using the exact same measures as were used in past research (e.g., Rydell & McConnell, 2006; Rydell et al., 2006). In line with the prediction that the associative system would reflect the negativity associated with a group association cue and the rule-based system would be sensitive to the valence of the behavioral information provided when forming an attitude toward an individual, it was expected that (a) explicit attitudes toward Bob would reflect the valence suggested by the verbal statements presented, (b) implicit attitudes would reflect the valence of the

group association cue that was salient (i.e., the overweight condition would lead to negative implicit attitudes toward Bob regardless of the valence of his behaviors), and (c) implicit attitudes would be based on the behavioral information when no group association cue was salient (i.e., in the not-overweight condition; Rydell & McConnell, 2006).

Method

Participants. A sample of 133 undergraduates at Miami University participated in return for research credit in their introductory psychology courses. They were randomly assigned to a 2 (Bob’s weight: not overweight, overweight) × 2 (valence of the initial verbal behaviors: positive, negative) × 2 (CA condition: control [0 CA], CA conditioning [100 CA]) between-subjects factorial.

Learning task. A modified version of Kerpelman and Himmelfarb’s (1971) attitude learning paradigm was used (see Rydell & McConnell, 2006; Rydell et al., 2006). In this learning task, participants received information about Bob on a computer over the course of 200 trials. On the basis of random assignment, one of four different White men served as the target Bob.2 On each trial, participants were concurrently presented with a picture of Bob and verbal statements of behavior that might be characteristic of him. After reading each behavior, participants indicated whether they believed that behavior was characteristic or uncharacteristic of Bob by pressing the c key or the u key, respectively. After each response, participants were given feedback about whether each behavior was characteristic of Bob. Specifically, feedback consisted of the word correct (in blue text) or incorrect (in red text) positioned in the center of the computer monitor and, at the same time, the behavior was stated correctly, on the basis of the assigned condition, at the bottom of the monitor (e.g., “Helping the neighborhood children is characteristic of Bob” or “Helping the neighborhood children is uncharacteristic of Bob”). Thus, through systematically differing feedback (to be described), participants were exposed to the same behaviors, but the reinforcement was designed to indicate that Bob performed positive or negative acts.

Manipulation of Bob’s weight. To manipulate whether Bob was perceived as overweight or not overweight, the picture of Bob differed as a function of condition.
Half of the participants saw a picture of Bob during each trial that showed he was not overweight, but the rest saw a picture of Bob during each learning trial where the photo of Bob had been morphed from the original (i.e., the picture in which Bob was not overweight) so that Bob appeared to be overweight (see Nosek, Banaji, & Greenwald, 2004). Thus, each not-overweight face was used to create an overweight face that was almost identical except for apparent weight. Manipulation of valence of the initial verbal information. During the first 100 trials, half of the participants received feedback that positive behaviors were characteristic of Bob and negative behaviors were uncharacteristic of Bob (positive initial verbal information). The remaining participants received feedback that negative behaviors were characteristic of Bob and positive behaviors were uncharacteristic of Bob (negative initial verbal information). Manipulation of CA condition. After the first 100 trials, participants in the control condition received 100 neutral trials (i.e., the behavior characteristic of Bob was neither positive nor negative; e.g., “Bob waited at the street corner”). However, participants

in the CA condition (100 CA) received CA feedback about Bob on 100 trials (i.e., the behaviors that were described as characteristic or uncharacteristic of Bob were opposite of the valence presented during the initial learning trials).3 After completing the second block of 100 trials, participants completed measures assessing their attitudes toward Bob.4

Explicit attitude measure. To assess explicit attitudes, we had participants judge how likable Bob was on a scale ranging from 1 (very unlikable) to 9 (very likable). In addition, the participants completed five semantic differential scales, each using a 9-point scale to describe Bob with anchors of good–bad, pleasant–mean, agreeable–disagreeable, caring–uncaring, and kind–cruel. Further, participants provided their evaluation of Bob on a feeling thermometer that ranged in temperature from 0° to 100°. Following past research (e.g., Rydell & McConnell, 2006; Rydell et al., 2006), we standardized the responses for each explicit measure and computed an overall mean (in all experiments, αs > .90). Thus, higher scores indicated more positive explicit attitudes toward Bob.

Implicit attitude measure. The Implicit Association Test (IAT; Greenwald et al., 1998) was used to assess implicit attitudes toward Bob, as implicit attitudes have been studied in previous research (e.g., Rydell & McConnell, 2006; Rydell et al., 2006). In this study, the IAT had 25 stimuli: 1 picture of Bob (Bob was either overweight or not overweight), 4 different pictures of White men who were not Bob (2 were overweight and 2 were not), 10 positive adjectives (e.g., wonderful), and 10 negative adjectives (e.g., disgusting). All stimuli were presented in the center of the monitor and the adjectives were always presented in lowercase letters. As in past work (e.g., Rydell & McConnell, 2006; Rydell et al., 2006), the IAT task featured seven blocks with 20 trials per block.
Participants were informed that the task involved making category judgments using one of two responses (the d or k keys on the keyboard) for a variety of stimuli (photos or words) presented on a computer monitor. During each block, verbal category label reminders appeared on the left and right sides of the display (assignment of particular labels to the d and k keys was counterbalanced across participants and produced no effects). Participants were instructed to complete that task quickly while also minimizing errors, and they were told to keep their index fingers on the d and k keys throughout the experiment to minimize delays in responding. There was a 250-ms intertrial interval.

2 This counterbalancing procedure produced no effects on any of the results. Similar counterbalancing was used in the other experiments and produced no effects as well.
3 In the current work, we contrasted the 0 CA control condition (where no CA information was presented) with the 100 CA condition (where 100 CA items were presented). We selected 100 CA for our comparison because past research (Rydell & McConnell, 2006) has shown that even slow-changing implicit attitudes change after such a large number of CA behaviors. Thus, if implicit attitudes continue to reflect group association cue evaluations under conditions where, without the cue, they would be significantly moderated, it would be an especially compelling demonstration of the unresponsiveness of implicit attitudes to changing behavioral information that has been shown, in the absence of such cues, to produce markedly changed implicit attitudes.
4 In all experiments reported in the current work, the order of attitude measure (i.e., implicit before explicit vs. explicit before implicit) was counterbalanced, and this factor did not qualify any of the results.
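The design and the feedback rule of the learning task described in this Method section can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' experiment software: the names `CELLS`, `portrayed_valence`, and `feedback` are hypothetical, and the treatment of neutral control trials (neutral behaviors always given as characteristic of Bob) is an assumption drawn from the single example in the text.

```python
from itertools import product

# Factor levels of the 2 x 2 x 2 between-subjects design (labels taken from the text).
WEIGHT = ("not overweight", "overweight")
INITIAL_VALENCE = ("positive", "negative")
CA_CONDITION = ("0 CA", "100 CA")

# Every combination of the three factors: 8 between-subjects cells.
CELLS = list(product(WEIGHT, INITIAL_VALENCE, CA_CONDITION))


def opposite(valence):
    """Return the opposite of a 'positive' or 'negative' valence label."""
    return "negative" if valence == "positive" else "positive"


def portrayed_valence(initial_valence, ca_condition, trial):
    """Valence that the feedback portrays as characteristic of Bob on a trial.

    Trials 1-100 reinforce the initial valence. Trials 101-200 are neutral in
    the 0 CA control condition and counterattitudinal in the 100 CA condition.
    """
    if not 1 <= trial <= 200:
        raise ValueError("the learning task has 200 trials")
    if trial <= 100:
        return initial_valence
    if ca_condition == "0 CA":
        return "neutral"  # assumption: neutral items are shown as characteristic
    return opposite(initial_valence)


def feedback(behavior_valence, response_key, portrayed):
    """Return 'correct' or 'incorrect' for a 'c' (characteristic) or
    'u' (uncharacteristic) key press, given the portrayed valence."""
    characteristic = behavior_valence == portrayed
    expected_key = "c" if characteristic else "u"
    return "correct" if response_key == expected_key else "incorrect"


# With 133 participants spread over 8 cells, the between-subjects error
# degrees of freedom are 133 - 8.
error_df = 133 - len(CELLS)
```

Under these assumptions, a participant in the positive, 100 CA cell who presses c for a positive behavior is told "correct" on trial 50 but "incorrect" on trial 150. The computed `error_df` of 125 agrees with the denominator degrees of freedom in the F tests the article reports.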

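[Figure 2 appears here: two bar-chart panels, Explicit Attitudes (left) and Implicit Attitudes (right), with separate bars for 0 CA and 100 CA. The x-axis shows the valence of the initial verbal behaviors (Positive, Negative) within each weight condition (Not overweight, Overweight); the y-axis shows standardized means from -1.5 to 1.5.]

Nonstandardized means listed along the abscissa (0 CA, 100 CA):
Explicit, not overweight: Positive 6.8, 4.1; Negative 1.9, 4.4
Explicit, overweight: Positive 6.6, 4.7; Negative 2.3, 3.5
Implicit (ms), not overweight: Positive 221, -56; Negative -24, 118
Implicit (ms), overweight: Positive -150, -179; Negative -107, -141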

Figure 2. Explicit and implicit attitudes as a function of Bob’s weight, valence of the initial verbal behaviors, and counterattitudinal condition in Experiment 1. Standardized means are presented on the y-axis, and nonstandardized means are listed along the abscissa. CA = counterattitudinal statements.

In Block 1, participants judged photos of Bob or not Bob, and in Block 2, they judged whether the adjectives were negative or positive. In Blocks 3 and 4 (Combination 1), participants judged whether the stimuli were “Bob or negative” or “not Bob or positive.” In Block 5, participants performed the same judgment task as they did in Block 2 except the assignment of response keys to the two valence categories was reversed. Finally, in Blocks 6 and 7 (Combination 2), participants judged whether the stimuli were “Bob or positive” or “not Bob or negative.” As in past IAT research, half of the participants performed Combination 1 in Blocks 3– 4 and Combination 2 in Blocks 6 –7, whereas the rest performed Combination 2 in Blocks 3– 4 and Combination 1 in Blocks 6 –7 (this counterbalancing manipulation produced no effects).5 To assess implicit attitudes toward Bob, we subtracted the mean response latencies of Combination 2 from the mean response latencies of Combination 1 (see Greenwald et al., 1998, for detailed scoring information).6 As in past work (e.g., Rydell & McConnell, 2006; Rydell et al., 2006, 2007), these difference scores were standardized, with greater values indicating relatively more positive implicit attitudes toward Bob. Because IAT scores have long been viewed as relative (rather than absolute) measures of attitudes (e.g., Greenwald et al., 1998; Nosek, Greenwald, & Banaji, 2006), standardization maintains their relativistic nature. Moreover, by standardizing the implicit and explicit attitude measures and treating the type of attitude (implicit vs. explicit) as a within-subjects factor, we can evaluate how implicit and explicit attitudes respond differently to the between-subjects manipulations, testing the central predictions that group association cues have differential effects on implicit and explicit attitudes. Thus, the discussion of the results focuses on analyses of these data. 
However, to provide readers with a better sense of how measures varied within and across experiments (where standardization makes comparisons more difficult), each figure in the current study displays both the means for the standardized explicit and implicit attitude measures along the y-axis (because the inferential statistics were conducted on these values) and the means for nonstandardized explicit (the means of the liking and semantic differential responses, each assessed on 9-point scales) and implicit (the IAT difference score, in milliseconds) attitude measures along the base

of each figure. Because the standardized measures provide the most direct tests of the theoretical predictions in the current work, in the Results and Discussion section, we focus on these data.7
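The scoring steps described above (standardize each explicit measure and average them; subtract the mean Combination 2 latency from the mean Combination 1 latency and standardize the difference) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' analysis code: the function names are hypothetical, every explicit item is assumed to be already coded so that higher values are more positive, and the sample standard deviation is assumed for standardization because the text does not say which form was used.

```python
from statistics import mean, stdev


def standardize(values):
    """Convert raw scores to z scores across participants.

    Assumption: the sample standard deviation (n - 1) is used.
    """
    m = mean(values)
    s = stdev(values)
    return [(v - m) / s for v in values]


def explicit_composite(measures):
    """Average the z scores of several explicit measures per participant.

    `measures` maps a measure name (e.g., liking, a semantic differential,
    the feeling thermometer) to a list of raw scores, one per participant,
    all coded so that higher values are more positive (assumption).
    """
    z_columns = [standardize(column) for column in measures.values()]
    return [mean(row) for row in zip(*z_columns)]


def iat_difference(combination1_ms, combination2_ms):
    """IAT difference score for one participant, in milliseconds.

    Combination 1 pairs Bob with negative; Combination 2 pairs Bob with
    positive. Slower responding in Combination 1 than in Combination 2
    yields a larger score, that is, a more positive implicit attitude.
    """
    return mean(combination1_ms) - mean(combination2_ms)


def implicit_scores(per_participant_latencies):
    """Standardize IAT difference scores across participants.

    `per_participant_latencies` is a list of
    (combination1_ms, combination2_ms) pairs, one pair per participant.
    """
    diffs = [iat_difference(c1, c2) for c1, c2 in per_participant_latencies]
    return standardize(diffs)
```

Because both composites end up on a z-score metric, they can be entered as two levels of a within-subjects factor, which is what allows the implicit versus explicit comparison described in the text.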

Results and Discussion

The attitude measures were examined with a 2 (Bob’s weight) × 2 (valence of the initial verbal information) × 2 (CA condition) × 2 (standardized attitude measure: implicit vs. explicit) mixed-model analysis of variance (ANOVA), with the latter factor within subjects. Several results obtained, but of greatest importance was the four-way interaction that approached significance (see Figure 2), F(1, 125) = 3.01, p = .08.8 To better understand these data, we examined the three-way interactions of Bob’s Weight × Valence of the Initial Verbal Information × CA Condition separately for implicit and explicit attitudes.

5 Within each block, an equal number of relevant stimuli were presented, with the particular order of presentation being randomly determined for each participant. Thus, in Blocks 1, 2, and 5, ten stimuli from the relevant two categories were presented. In Blocks 3, 4, 6, and 7, five stimuli from the relevant four categories (i.e., Bob, not Bob, positive, negative) were presented. With the exception of the inclusion of group association cues, the current IAT is identical to that used in past research (e.g., Rydell & McConnell, 2006; Rydell et al., 2006).
6 Alternative IAT scoring approaches (e.g., Greenwald, Nosek, & Banaji, 2003) produced identical results in the current work.
7 Parallel analyses conducted on the nonstandardized measures produced similar results.
8 Although only marginal in this study, the same group association cue manipulation (i.e., obesity) was used again in Experiment 4 and revealed the predicted significant interaction. Also, this four-way interaction (using other group association cues) was significant at conventional levels in both Experiments 2 and 3. However, because the current four-way interaction was marginal, some degree of caution should be exercised in its interpretation.

Explicit attitudes. For explicit attitudes, a main effect of valence of the initial verbal information was found, F(1, 125) = 89.51, p < .001. Specifically, for those initially receiving positive verbal information, participants reported more positive attitudes toward Bob (M = 0.54) than did those initially receiving negative verbal information (M = -0.51). In addition, this effect was qualified by the expected interaction with CA condition, F(1, 125) = 59.08, p < .001. Simple effect analyses showed that for participants initially receiving positive verbal information, those in the 0 CA condition had more positive attitudes toward Bob (M = 1.01) than did those in the 100 CA condition (M = 0.06), F(1, 125) = 22.27, p < .001. For those initially receiving negative verbal information, the exact opposite pattern emerged, with those in the 0 CA condition evaluating Bob more negatively (M = -0.88) than those in the 100 CA condition (M = -0.13), F(1, 125) = 44.10, p < .001. The three-way interaction was not significant, F(1, 125) = 2.30, ns (see Figure 2, left panel). Thus, the CA information reversed the explicit attitudes that were strongly reflective of the initial verbal information. Also, note that Bob’s weight did not play any role in explicit attitudes toward him whatsoever.

Implicit attitudes. In contrast, implicit attitudes showed a main effect of Bob’s weight, F(1, 125) = 32.43, p < .001. That is, participants had more negative implicit attitudes toward the overweight Bob (M = -0.42) than toward the not-overweight Bob (M = 0.43). Thus, the group association cue had a direct impact on implicit attitudes. Also, the two-way interaction between the valence of the initial verbal information and CA condition was significant, F(1, 125) = 8.30, p < .001. For those who initially received positive verbal information, participants in the 0 CA condition had more positive implicit attitudes toward Bob (M = 0.30) than did those in the 100 CA condition (M = -0.30), F(1, 125) = 6.01, p < .02.
For those who initially received negative verbal information, the opposite pattern emerged, as those in the 0 CA condition held more negative implicit attitudes toward Bob (M = −0.13) than did those in the 100 CA condition (M = 0.12), although this difference was not significant, F(1, 125) = 1.98, ns. Although this two-way interaction suggests that implicit attitudes followed the valence of the initial verbal information and subsequently were changed by the CA information just like explicit attitudes were, this two-way interaction was qualified by the predicted three-way interaction with Bob’s weight, F(1, 125) = 8.38, p < .005 (see Figure 2, right panel). Specifically, the two-way interaction between initial valence of the behavioral information and CA information held for the not-overweight Bob, F(1, 125) = 16.40, p < .001, but was absent for the overweight Bob, F(1, 125) = 0.00, ns. In other words, for the not-overweight Bob, those initially receiving positive verbal information about Bob had more positive implicit attitudes toward him in the 0 CA condition (M = 1.06) than did those in the 100 CA condition (M = −0.03), F(1, 125) = 7.65, p < .01. In the condition where initial verbal information was negative, the opposite pattern emerged, with those in the 0 CA condition having more negative implicit attitudes toward Bob (M = 0.02) than those in the 100 CA condition (M = 0.66), F(1, 125) = 8.22, p < .01.

As expected, when no salient group association cue was present (i.e., not-overweight Bob), implicit attitudes toward Bob followed the same pattern as explicit attitudes, tracking the valence of the large amount of behavioral information provided (Rydell & McConnell, 2006). That is, attitudes reflected the valence of the initial verbal information, and these attitudes reversed after the presentation of a considerable amount of CA information. However, when the group association cue of Bob’s being overweight was displayed, implicit attitudes toward him reflected the well-established association between obesity and negativity and were unaffected by the valence of the verbal information (initial or CA) about him. Thus, when the group association cue was present, implicit attitudes toward the target reflected the valence of the group association cue instead of the valence of the behavioral information provided.
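The analyses above treat implicit and explicit attitudes as two levels of a single standardized measure, which requires putting IAT latencies and rating-scale scores on a common metric. The text does not spell out the standardization procedure; the sketch below assumes ordinary z-scoring across all participants, and the raw scores are hypothetical values chosen only for illustration.

```python
from statistics import mean, stdev

def standardize(scores):
    """Return z-scores: (x - sample mean) / sample standard deviation."""
    m = mean(scores)
    s = stdev(scores)
    return [(x - m) / s for x in scores]

# Hypothetical raw scores for six participants (not data from the study).
iat_effect_ms = [210.0, 35.0, -20.0, 150.0, 10.0, -70.0]  # IAT latency differences
explicit_rating = [7.0, 4.5, 2.0, 5.0, 7.5, 4.0]           # rating-scale scores

# After z-scoring, both measures have mean 0 and SD 1, so they can serve
# as two levels of one within-subjects factor.
z_implicit = standardize(iat_effect_ms)
z_explicit = standardize(explicit_rating)
```

Under this assumption, a standardized mean such as M = −0.42 is read as 0.42 standard deviations below the sample average on that measure.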

Experiment 2 Experiment 1 supported the systems of evaluation prediction that strong group association cues (in this case, being overweight), when present, would influence implicit attitudes toward a novel target. On the one hand, the valence of the behavioral statements determined explicit attitudes (in all cases) and implicit attitudes when no salient group association cue was provided. This work expands our earlier research (e.g., Rydell & McConnell, 2006; Rydell et al., 2006) by showing that social groups with a strong association value (e.g., obesity is negative) are used by the associative system when available. In other words, the negativity associated with Bob being overweight led to negative implicit attitudes toward him even when the statements about his actions conveyed exclusively positive behavioral information. On the other hand, because the behavioral statements were unambiguous with respect to valence, the group association cue had no impact on explicit attitudes toward the target person. Although this provides strong support for our predictions derived from a systems of evaluation perspective, we anticipate that positive group association cues should produce similar outcomes for implicit attitudes as well, leading people to hold positive implicit attitudes toward a target described as performing numerous negative behaviors. Thus, in Experiment 2, we examined a different group association cue, physical attractiveness, which can provide positive (attractive) and negative (unattractive) associations. Indeed, a considerable amount of research on persuasion (Chaiken, 1979) and the “what is beautiful is good” effect (Dion et al., 1972) shows that attractive people are evaluated more positively than average individuals are and that unattractive people are evaluated more negatively than average or attractive individuals are. In Experiment 2, participants learned about either an attractive female, an average female, or an unattractive female named Bobbie. 
On the basis of our reasoning about which types of information the associative and rule-based systems of evaluation would use, we expected that explicit attitudes toward Bobbie would reflect the valence of the behavioral statements provided about her and that implicit attitudes toward her in the absence of a distinctive group association cue (i.e., when Bobbie was of average attractiveness) would do the same. However, when strong salient group association cues were present (i.e., her physical attractiveness is salient), we expected that implicit attitudes toward Bobbie would reflect the valence associated with the cue instead of the valence of her behaviors, leading to relatively positive implicit attitudes toward her when she was presented as being very attractive and relatively negative implicit attitudes toward her when she was presented as being very unattractive.

Method

Participants. A sample of 185 undergraduates at Miami University participated in return for research credit. They were randomly assigned to a 3 (Bobbie’s attractiveness: attractive, average, unattractive) × 2 (valence of the initial verbal information: positive, negative) × 2 (CA condition: 0 CA, 100 CA) between-subjects factorial.

Procedure. The procedure for Experiment 2 was the same as the procedure in Experiment 1 with three exceptions. First, a female target person, Bobbie, was used. Second, the “not-Bobbie” pictures used in the Bobbie IAT were a mixture of other attractive, average, and unattractive women. Third, to manipulate whether Bobbie was attractive, average, or unattractive, we chose images of Bobbie that differed in their level of physical attractiveness. Specifically, pictures were taken from an Internet dating Web site and a face database (Minear & Park, 2004) and rated by a separate group of 40 participants from the same university (none of whom participated in the current study). On the basis of these ratings, on a scale ranging from 1 (extremely unattractive) to 9 (extremely attractive), two images were selected to be attractive Bobbies (M = 7.61), two images were selected to be average Bobbies (M = 5.38), and two images were selected to be unattractive Bobbies (M = 3.15; see Footnote 9). The attractiveness of these pictures differed significantly, F(2, 76) = 205.68, p < .001, with all three levels of attractiveness being significantly different, ps < .001.

[Figure 3: two bar-chart panels, Explicit Attitudes (left) and Implicit Attitudes (right). Y-axis: standardized attitude, −1.5 to 1.5. X-axis: valence of the initial verbal information (Positive, Negative) within each attractiveness level (Attractive, Average, Unattractive). Legend: 0 CA, 100 CA.]

Figure 3. Explicit and implicit attitudes as a function of Bobbie’s attractiveness, valence of the initial verbal behaviors, and counterattitudinal condition in Experiment 2. Standardized means are presented on the y-axis, and nonstandardized means are listed along the abscissa. CA = counterattitudinal statements.

Results and Discussion

The attitude measures were examined with a 3 (Bobbie’s physical attractiveness) × 2 (valence of the initial verbal information) × 2 (CA condition) × 2 (standardized attitude measure: implicit vs. explicit) mixed-model ANOVA, with the last factor being within subjects. Several results obtained, but of greatest importance was the predicted four-way interaction, F(2, 173) = 4.52, p < .02, which is presented in Figure 3. To unpack this effect, we examined the three-way interactions of Bobbie’s Physical Attractiveness × Valence of the Initial Verbal Information × CA Condition separately for implicit and explicit attitudes.

Explicit attitudes. Explicit attitudes once again showed a main effect of valence of the initial verbal information, F(1, 173) = 169.99, p < .001. Similar to the results of Experiment 1, those who initially received positive verbal information evaluated Bobbie more positively (M = 0.50) than did those initially receiving negative verbal information about her (M = −0.51). Also replicating the results of Experiment 1, this effect was qualified by CA condition, F(1, 173) = 236.80, p < .001. For those who initially received positive verbal information about Bobbie, participants in the 0 CA condition had more positive explicit attitudes toward her (M = 1.14) than did those in the 100 CA condition (M = −0.13), F(1, 173) = 108.77, p < .001. For those who initially received negative verbal information about her, the exact opposite pattern emerged, with those in the 0 CA condition evaluating Bobbie more negatively (M = −1.06) than those in the 100 CA condition (M = 0.05), F(1, 173) = 136.92, p < .001. The three-way interaction was not significant, F(2, 173) = 0.80, ns (see Figure 3, left panel). These analyses revealed two effects that paralleled those of Experiment 1. First, explicit attitudes were very responsive to the valence of the initial verbal information and changed dramatically after participants received the CA information. Second, the group association cue manipulation (i.e., Bobbie’s physical attractiveness) did not qualify any of these effects.

Implicit attitudes. However, implicit attitudes showed a main effect of Bobbie’s attractiveness, F(2, 173) = 46.04, p < .001. Specifically, participants had more positive implicit attitudes toward the attractive Bobbie (M = 0.71) than toward the unattractive Bobbie (M = −0.66) or the average Bobbie (M = −0.05), with the latter two also differing significantly. In addition, there was a Valence of the Initial Verbal Information × CA Condition interaction, F(2, 173) = 8.61, p < .005. This interaction showed that a simple effect of CA condition was not significant for those in the positive condition (0 CA M = 0.25, 100 CA M = −0.04), F(1, 173) = 1.92, ns, but it was significant in the negative condition (0 CA M = −0.33, 100 CA M = 0.11), F(1, 173) = 4.63, p < .04.
Most important, this effect was qualified by the predicted three-way interaction, F(2, 173) = 8.87, p < .001 (see Figure 3, right panel). To explore this effect, we examined the interaction between valence of the initial verbal information and CA information for the attractive, average, and unattractive Bobbie conditions separately. For the attractive and the unattractive Bobbies, the two-way interactions were not significant, Fs < 1, ns. Instead, implicit attitudes toward the attractive Bobbie were positive regardless of the valence of the verbal information, and implicit attitudes toward the unattractive Bobbie were negative regardless of the valence of the verbal information. However, the two-way interaction was significant for the average Bobbie, F(1, 173) = 33.34, p < .001. For those who received positive verbal information initially, those in the 0 CA condition had more positive implicit attitudes toward the average Bobbie (M = 0.55) than did those in the 100 CA condition (M = −0.47), F(1, 173) = 18.08, p < .001. For those who received negative verbal information initially, the opposite pattern was found, with those in the 0 CA condition having more negative implicit attitudes toward the average Bobbie (M = −0.64) than those in the 100 CA condition (M = 0.34), F(1, 173) = 15.50, p < .001.

Thus, Experiment 2 replicated the findings of Experiment 1. First, explicit attitudes and implicit attitudes (in the absence of a salient group association cue) followed the valence of the verbal information. Yet, when a distinctive group association cue was present, implicit attitudes reflected the evaluation associated with that cue and not the behaviors performed by the target person. Similar to Experiment 1, when the group association cue was negative (in this case, when the cue was the unattractive Bobbie), implicit attitudes toward her were negative even when the behavioral statements conveyed positivity. Moreover, Experiment 2 showed that when the group association cue was positive (i.e., when the cue was the attractive Bobbie), implicit attitudes toward her were positive, even in cases when the behavior statements suggested negativity. Once again, implicit attitudes reflected the valence of the salient group association cue when present, whereas explicit attitudes toward the target were unaffected by this group association cue and instead reflected the valence of the unambiguous actions performed by the individual.

Footnote 9: There was one blonde and one brunette Bobbie for each of the three levels of attractiveness. The choice of target Bobbie (blonde vs. brunette) was randomly determined, and this factor did not qualify any of the results.
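The pattern behind the two-way interaction for the average Bobbie can be made concrete with the four standardized cell means reported above. The sketch below is descriptive arithmetic only: it computes the two simple effects of CA condition and their difference (the interaction contrast). It does not reproduce the reported F tests, which also depend on error variance that is not given in the text. The function and variable names are illustrative.

```python
def simple_effect(cell_means, valence):
    """Difference between the 0 CA and 100 CA cells for one initial valence."""
    return cell_means[(valence, "0 CA")] - cell_means[(valence, "100 CA")]

# Standardized implicit-attitude means for the average Bobbie, as reported
# in the Experiment 2 results.
average_bobbie_implicit = {
    ("positive", "0 CA"): 0.55,
    ("positive", "100 CA"): -0.47,
    ("negative", "0 CA"): -0.64,
    ("negative", "100 CA"): 0.34,
}

effect_positive = simple_effect(average_bobbie_implicit, "positive")  # about 1.02
effect_negative = simple_effect(average_bobbie_implicit, "negative")  # about -0.98

# The interaction contrast is the difference between the two simple effects.
# Simple effects of opposite sign give a large contrast (a crossover pattern).
interaction_contrast = effect_positive - effect_negative  # about 2.00
```

The opposite signs of the two simple effects are what the text describes as attitudes tracking the initial valence and then reversing after the CA information.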

Experiment 3

So far, we have shown that implicit attitudes can be unresponsive to behavioral information when strong group association cues are available. We contend that the evaluations associated with these cues dominate implicit attitudes because those attitudes are determined by a system of evaluation that is especially sensitive to associative information (Rydell et al., 2006). If this reasoning is correct, the extent to which implicit attitudes are driven by these group association cues should be related to the strength of the association between the cue and evaluations of it. For example, the overweight Bob in Study 1 elicited negative implicit attitudes even in circumstances when he performed many positive behaviors, presumably because most participants had strong associations between obesity and negativity in memory (Nosek, 2005). Yet, group association cues can be linked with valence to varying degrees. For example, although many individuals in American culture exhibit strong automatic associations between African Americans and negativity (Devine, 1989; Greenwald et al., 1998), there is meaningful variability in the extent to which people hold such associations (Fazio et al., 1995; McConnell & Leibold, 2001). Thus, we would anticipate that group association cues influence implicit attitudes more strongly for those with stronger cue-evaluation associations in memory. In other words, as the cue-to-valence association grows weaker, implicit attitudes toward the individual should be increasingly reflective of the behavioral statements about the person.

With this logic in mind, in Experiment 3, we examined another group association cue, a target’s race. Specifically, we replicated Experiment 1 but manipulated target race to either provide a distinctive group association cue (i.e., an African American Bob) or not provide a distinctive group association cue (i.e., a White Bob). In addition, we assessed participants’ evaluative associations with the cue (i.e., their implicit attitudes toward African Americans in general) to examine the relation between their implicit evaluations of the social group cue and their attitudes toward a group target member in particular. We expected to replicate the findings of Study 1 using race as the group association cue, and we anticipated that implicit prejudice against African Americans would account for the magnitude of negative implicit attitudes toward Bob when he was African American. In other words, participants with stronger racial prejudice should be less influenced than those with less prejudice by the behavioral statements about an African American target when forming implicit attitudes toward him. Therefore, we predicted an inverse relation between implicit racial prejudice and implicit (but not explicit) attitudes toward Bob, but only when he was African American and not when he was White.

Method

Participants. A sample of 94 White undergraduates at the University of California, Santa Barbara, participated in return for research credit in their introductory psychology courses. They were randomly assigned to a 2 (Bob’s race: African American, White) × 2 (valence of the initial verbal information: positive, negative) × 2 (CA condition: 0 CA, 100 CA) between-subjects factorial.

Procedure. The procedure was similar to the procedure of Experiment 1 with a few exceptions. First, in the current experiment, we examined the group association cue of race by presenting an African American Bob to half of the participants or a White Bob (as in Experiment 1) to the rest. Several minutes before engaging in the learning task, participants completed a racial IAT where African American and White names were presented with positive and negative adjectives using the same trial and block structure as was used with the IAT in Experiment 1 (see also McConnell & Leibold, 2001). Thus, in one set of critical blocks of this racial IAT, participants judged whether the stimuli were “African American or negative” or “White or positive.” In the other set of critical blocks, they judged whether the stimuli were “African American or positive” or “White or negative.” The difference in mean response latencies for the critical blocks was computed, with higher scores indicating relatively greater implicit prejudice against African Americans (McConnell & Leibold, 2001). After the learning task (involving either an African American or a White Bob target), participants completed the same implicit and explicit attitude measures used in Experiment 1, with the exception that the IAT presented non-Bob targets of the same race as the Bob target (to ensure it assessed implicit attitudes toward Bob specifically and not racial prejudice more generally).
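The racial IAT score described above is a difference in mean response latencies between the two sets of critical blocks. A minimal sketch of that computation follows. The direction of subtraction is an assumption chosen so that higher scores indicate greater implicit prejudice, as the text states; the latencies are hypothetical, and any trial exclusion or latency trimming rules are omitted because they are not described here.

```python
from statistics import mean

def iat_effect(compatible_ms, incompatible_ms):
    """Mean latency difference (ms) between the two sets of critical blocks.

    compatible_ms:   latencies from the "African American or negative" /
                     "White or positive" blocks.
    incompatible_ms: latencies from the "African American or positive" /
                     "White or negative" blocks.
    Assumed direction: incompatible minus compatible, so that slower
    responding in the second pairing yields a higher (more prejudiced) score.
    """
    return mean(incompatible_ms) - mean(compatible_ms)

# Hypothetical latencies for one participant (not data from the study).
compatible = [650, 700, 720]      # mean 690 ms
incompatible = [850, 900, 980]    # mean 910 ms

effect = iat_effect(compatible, incompatible)  # 220 ms
```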

Results and Discussion

Attitudes toward Bob were examined with a 2 (Bob’s race) × 2 (valence of the initial verbal information) × 2 (CA condition) × 2 (standardized attitude measure: implicit, explicit) mixed-model ANOVA, with the last factor being within subjects. Several results obtained, but of greatest importance was the predicted four-way interaction, F(1, 86) = 4.13, p < .05, which is presented in Figure 4. To explore this outcome, we examined the three-way interactions of Bob’s Race × Valence of the Initial Verbal Information × CA Condition separately for implicit and explicit attitudes.

Explicit attitudes toward Bob. Replicating the results of Experiments 1 and 2, explicit attitudes showed a main effect of valence of the initial verbal information, F(1, 86) = 24.83, p < .001. Once again, those who initially received positive verbal information about Bob reported more favorable attitudes toward him (M = 0.40) than did those who initially received negative verbal information about him (M = −0.43). In addition, this main effect was qualified by the interaction with CA condition, F(1, 86) = 10.24, p < .005. Specifically, when initially receiving positive verbal information about Bob, those in the 0 CA condition had more positive attitudes toward him (M = 0.77) than did those in the 100 CA condition (M = 0.02), F(1, 86) = 10.24, p < .005. However, when initially receiving negative behavioral statements about Bob, those in the 0 CA condition had more negative attitudes toward Bob (M = −1.06) than did those in the 100 CA condition (M = 0.21), F(1, 86) = 28.31, p < .001. The three-way interaction was not significant, F(1, 86) = 0.04, ns (see Figure 4, left panel). Thus, the two-way interaction revealed that CA information reversed the attitudes that reflected the valence of the initial information about Bob. Yet, similar to the findings of Experiments 1 and 2, Bob’s race did not play a role in any of these outcomes.

Implicit attitudes toward Bob. In stark contrast, implicit attitudes toward Bob revealed a main effect of Bob’s race, F(1, 86) = 6.07, p < .02. That is, participants had more negative implicit attitudes toward Bob when he was African American (M = −0.19) than when he was White (M = 0.23). Thus, as in Experiments 1 and 2, the group association cue had a direct effect on implicit attitudes toward the target. Also, an interaction between the valence of the initial verbal information and CA condition was found, F(1, 86) = 15.49, p < .001. To examine this interaction, we analyzed the simple effects of CA condition as a function of the valence of the initial verbal information. When the valence of the initial verbal information was positive, participants in the 0 CA condition had more positive implicit attitudes toward Bob (M = 0.51) than did those in the 100 CA condition (M = −0.21), F(1, 86) = 9.28, p < .01. For those receiving initially negative verbal information, the opposite pattern emerged, with those in the 0 CA condition revealing more negative implicit attitudes toward Bob (M = −0.40) than those in the 100 CA condition (M = 0.24), F(1, 86) = 7.12, p < .02. Thus, overall, implicit attitudes toward Bob reflected the valence of the verbal information (i.e., the valence of the initial behavioral information, which was undercut by the CA information), similar to the explicit attitudes toward Bob. But, unlike the explicit attitudes, this two-way interaction was qualified by Bob’s race in the predicted three-way interaction, F(1, 86) = 9.54, p < .005 (see Figure 4, right panel). To explore this effect, we examined the interaction between the valence of the initial verbal information and CA condition for the White and African American Bobs separately. For the White Bob, the two-way interaction was significant, F(1, 86) = 16.20, p < .001; however, it was not significant for the African American Bob, F(1, 86) = 0.01, ns. To examine this interaction for the White Bob, we analyzed the simple effects of the CA condition as a function of the valence of the initial verbal information. For those initially receiving positive verbal information about the White Bob, those in the 0 CA condition had more positive implicit attitudes toward him (M = 0.95) than did those in the 100 CA condition (M = −0.20), F(1, 86) = 12.45, p < .005.
For those initially receiving negative verbal information about the White Bob, the opposite pattern emerged, with those in the 0 CA condition having more negative implicit attitudes toward him (M = −0.53) than those in the 100 CA condition (M = 0.68), F(1, 86) = 12.24, p < .005.

In sum, these effects revealed that implicit attitudes toward a target without a distinctive group association cue (i.e., the White Bob) reflected the valence of the verbal behaviors presented about him, replicating the results of Experiments 1 and 2 and past work involving nondescript targets (Rydell & McConnell, 2006). However, when a group association cue was present (i.e., the African American Bob), the implicit attitudes toward him were reflective of the valence of the group association cue, as found in Experiments 1 and 2.

[Figure 4: two bar-chart panels, Explicit Attitudes (left) and Implicit Attitudes (right). Y-axis: standardized attitude, −1.5 to 1.5. X-axis: valence of the initial verbal information (Positive, Negative) within each race condition (White, African-American). Legend: 0 CA, 100 CA.]

Figure 4. Explicit and implicit attitudes as a function of Bob’s race, valence of the initial verbal behaviors, and counterattitudinal condition in Experiment 3. Standardized means are presented on the y-axis, and nonstandardized means are listed along the abscissa. CA = counterattitudinal statements.

Prejudice against African Americans. To examine whether negative associations with the cue (i.e., prejudice against African Americans) can account for the implicit attitudes toward the African American Bob being negative, we explored the extent to which participants’ implicit prejudice toward African Americans predicted their attitudes toward Bob. In our sample, the average participant revealed relatively strong implicit racial prejudice against African Americans (M = 207.88 ms IAT effect, d = 1.35). In essence, this effect reaffirms the relative negativity participants associated with the group association cue (i.e., being African American). Next, we examined the correlations among participants’ racial prejudice, explicit attitudes toward Bob, and implicit attitudes toward Bob separately as a function of the race condition. As expected, when Bob was White, there were no relations between implicit racial prejudice and implicit attitudes toward Bob (r = .05, ns) or explicit attitudes toward him (r = −.07, ns). However, as predicted, a different pattern emerged when Bob was African American. Although participants’ implicit racial prejudice was unrelated to their explicit attitudes toward Bob (r = .14, ns), implicit racial prejudice was significantly negatively correlated with their implicit attitudes toward him (r = −.50, p < .001). That is, the more negativity they associated with African Americans, the less positive their feelings toward Bob were on implicit (but not explicit) attitude measures, but only when he was Black.
As expected, the relation between racial prejudice and implicit attitudes toward Bob differed as a function of race, z = 2.81, p < .01, but there were no race condition differences in the relation between racial prejudice and explicit attitudes toward Bob, z < 1. These data indicate that the magnitude of the valence associated with the group association cue can account for how Bob’s race led to relatively negative implicit attitudes toward him when he was a member of that social group.
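The z statistic comparing the two prejudice-implicit attitude correlations is consistent with the standard test for the difference between two independent correlations based on Fisher's r-to-z transformation. The text does not name the test or report the per-condition sample sizes, so the sketch below is an illustration under two assumptions: that this test was used, and that the 94 participants were split evenly (47 per race condition). Under those assumptions the result comes out close to the reported z = 2.81.

```python
import math

def compare_independent_correlations(r1, n1, r2, n2):
    """Fisher r-to-z test for the difference between two independent correlations.

    Returns the z statistic and its two-tailed p value under the normal
    approximation.
    """
    z1 = math.atanh(r1)  # Fisher transformation of r1
    z2 = math.atanh(r2)  # Fisher transformation of r2
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    z = (z1 - z2) / standard_error
    p_two_tailed = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_two_tailed

# r = .05 (White Bob) versus r = -.50 (African American Bob).
# n = 47 per condition is an assumption (94 participants split evenly).
z, p = compare_independent_correlations(0.05, 47, -0.50, 47)
```

Because the standard error depends only on the two sample sizes, unequal cell sizes would shift z somewhat; the close match here suggests the groups were roughly equal, but that is an inference, not something the text reports.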

Experiment 4 To this point, we have shown in three different experiments using three different group association cues that implicit attitudes toward an individual reflect the valence (and, in Experiment 3, the extremity of valence) of a salient group association cue when such cues are present but that they are responsive to the valence of the behaviors describing the target when such cues are absent or when the cues have relatively weaker associations with valence. Yet, in each of these studies, explicit attitudes were unaffected by the group association cues. At first blush, these results may seem difficult to reconcile with findings in the literature showing that group membership can impact judgments. We have argued that because the behavioral statements ascribed to the target individuals in the current experiments were both numerous and clear-cut with respect to valence, the ability of the group association cue to induce assimilation effects on explicit attitudes was effectively curtailed. However, we would anticipate that if the target-relevant behaviors were more ambiguous with respect to valence instead of being clear-cut, group association cues would have an impact on explicit attitudes toward the individual by providing a means to

bias the encoding of ambiguous actions (e.g., Bodenhausen & Wyer, 1985). In Experiment 4, we revisited the group association cue used in Experiment 1 by manipulating Bob’s apparent weight in a more simplified experimental design. Specifically, participants were only presented with 100 statements about Bob (whose weight was manipulated between subjects) and were told that each statement was characteristic of him. As part of another between-subjects factor, half of the participants read statements indicating that Bob performed unambiguous positive acts whereas the rest read statements that were relatively ambiguous (i.e., not strongly valenced). In the latter case, we predicted that the group association cue would have an assimilative effect, resulting in a relatively negative explicit attitude toward Bob when he was obese. Such a finding would not only be valuable to test the importance of behavioral ambiguity in how group association cues affect explicit attitudes, but it would also demonstrate that participants in our studies are not reticent to report negative explicit attitudes toward members of stigmatized groups (i.e., perhaps the lack of effect of cues in previous studies reflects engaging in positive impression management). However, when Bob’s behaviors were unambiguously positive, we expected relatively positive explicit attitudes toward Bob regardless of his weight, replicating the results of Experiment 1. We chose positive unambiguous behaviors in this study to provide the best opportunity for Bob’s stigma to impact attitudes toward him (i.e., avoid floor effects). In addition to testing our reasoning that group association cues could impact explicit attitudes toward an individual whose behavior was relatively ambiguous, we also modified our IAT task in the current experiment. In the previous three experiments, we used images of people as IAT stimuli to render Bob (or Bobbie) versus not-Bob (or not-Bobbie) categorizations. 
It is possible that when making these judgments, participants could have been led by each presentation of Bob (or Bobbie) in conditions involving a group association cue to strengthen their association between the target person and the target’s stigma. For example, for participants exposed to an overweight Bob, the IAT task continually re-presents images of an obese Bob throughout the IAT task, which may have served to further reinforce negativity toward Bob. Also, the alternative targets (e.g., the not Bobs) provided distractors that sometimes did and sometimes did not present the group association cue as well, which could introduce unwanted context effects. To any extent that the group association cue was re-presented during the IAT task, the possibility that the implicit attitude measure toward Bob (or Bobbie) reflects a confound of attitudes toward the target and attitudes toward the group association cue itself exists. To eliminate this possibility, in the current experiment, we used an IAT task that presented names and not images of the target and five nontargets. Because the current IAT task did not present visual images of people, we avoided the possibility that the implicit attitude measure was assessing a blend of attitudes toward the target and attitudes toward the group association cue. To summarize, the overall design of the study crossed Bob’s weight (overweight vs. not overweight) with type of behavioral information (100 positive vs. 100 ambiguous) in a more simplified experimental paradigm and with a modified IAT task. For implicit attitudes, a main effect of Bob’s weight was expected, revealing more negative implicit attitudes toward Bob when he was overweight than when he was not (thus replicating the results of Experiment 1). However, for explicit attitudes, we predicted an

interaction, such that the group association cue (i.e., overweight Bob) would reduce the positivity of explicit attitudes toward Bob when his behaviors were ambiguous with respect to valence.

Method

Participants. A sample of 47 undergraduates at Miami University participated in return for research credit in their introductory psychology courses. They were randomly assigned to a 2 (Bob’s weight: not overweight, overweight) × 2 (statement type: positive, ambiguous) between-subjects factorial.

Procedure. Participants were presented with 100 behavior statements about Bob and told that they were all characteristic of him. Each statement was presented on the monitor for 8 s. On the basis of pretested norms, participants assigned to the positive statement condition read 100 statements that implied positivity (e.g., “Bob helped friends move into a new house”), whereas those in the ambiguous statement condition read 100 statements that were relatively valence neutral (e.g., “Bob watched TV with friends”). The image of Bob presented on the monitor and associated with each statement was either overweight or not overweight, depending on condition assignment (using the same stimuli as Experiment 1). Next, participants completed implicit and explicit attitude measures toward Bob (once again counterbalanced). The explicit measures were identical to those used in Experiment 1. However, the implicit measure was a slightly modified version of the IAT. Specifically, it was identical to the IAT used in Experiment 1 except that rather than the presented images being Bob and not-Bob targets, the person-related stimuli were names presented in uppercase font (positive and negative adjectives were presented in lowercase), either BOB or five not-Bob names that began with the same letter (e.g., BEN). There were an equal number of presentations of Bob and not-Bob names in each block.

Results and Discussion

Attitudes toward Bob were examined with a 2 (Bob’s weight) × 2 (statement type) × 2 (standardized attitude measure: implicit, explicit) mixed-model ANOVA, with the latter factor being within subjects. As expected, we observed the predicted three-way interaction, F(1, 43) = 4.16, p < .05, which is illustrated in Figure 5. To explore this effect, the two-way interaction of Bob’s weight and statement type was examined separately for implicit and explicit attitudes.

Explicit attitudes. The Bob’s Weight × Statement Type ANOVA yielded three effects. First, not surprisingly, there was a main effect of statement type, F(1, 43) = 15.45, p < .001, revealing that explicit attitudes toward Bob were more positive when the 100 statements suggested positivity (M = 0.55) than when they were ambiguous with respect to valence (M = −0.57). Also, there was a main effect of Bob’s weight, F(1, 43) = 10.65, p < .01, indicating that explicit attitudes toward Bob were more negative when he was overweight (M = −0.29) than when he was not (M = 0.30). It is important to note that this effect was qualified by an interaction with statement type, F(1, 43) = 6.53, p < .02 (see Figure 5, left panel). Although explicit attitudes toward the Bob described by unambiguous positive behaviors did not differ as a function of his weight, F(1, 22) = 0.75, ns, the same was not true when his behaviors were ambiguous, F(1, 21) = 9.84, p < .01, with Bob being viewed more negatively when he was overweight (M = −1.19) than when he was not (M = −0.01). Thus, as hypothesized, the group association cue did impact explicit attitudes toward the target individual, but only when his behaviors were not clear-cut with respect to valence.

Implicit attitudes. In contrast to the explicit attitudes data, the Bob’s Weight × Statement Type ANOVA for implicit attitudes toward Bob revealed only a main effect of Bob’s weight, F(1, 43) = 13.86, p < .001. Figure 5 (right panel) shows that participants had more negative implicit attitudes when Bob was overweight (M = −0.49) than when he was not (M = 0.48). Thus, the impact of the group association cue was strong in all conditions.
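As a reading aid, the degrees of freedom reported for Experiment 4 can be checked with simple arithmetic. The sketch below is not part of the original analysis; it assumes the usual between-subjects relation (error df = participants minus number of cells) and assumes the simple effects were tested within each statement-type condition with their own error terms, which is what the reported F(1, 22) and F(1, 21) suggest. The function names are hypothetical.

```python
def error_df(n_participants, n_cells):
    """Error df for a between-subjects ANOVA: N minus the number of cells."""
    return n_participants - n_cells


def n_from_error_df(df_error, n_cells):
    """Invert the relation: participants implied by a reported error df."""
    return df_error + n_cells


total_n = 47                                   # sample size reported in the Method
omnibus_df = error_df(total_n, n_cells=4)      # 2 x 2 between-subjects design
n_positive = n_from_error_df(22, n_cells=2)    # simple effect reported as F(1, 22)
n_ambiguous = n_from_error_df(21, n_cells=2)   # simple effect reported as F(1, 21)

print(omnibus_df, n_positive, n_ambiguous, n_positive + n_ambiguous)
```

Under these assumptions the omnibus error df is 43, matching the reported F(1, 43) tests, and the two statement-type conditions would contain 24 and 23 participants, which sum to the reported 47.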
Figure 5. Explicit and implicit attitudes as a function of Bob’s weight and statement type in Experiment 4. Standardized means are presented on the y-axis, and nonstandardized means are listed along the abscissa. CA = counterattitudinal statements. [Figure not reproduced. Panels: Explicit Attitudes (left) and Implicit Attitudes (right); x-axis: Not overweight, Overweight; series: Positive, Ambiguous.]

In sum, these findings show that a group association cue (i.e., Bob’s weight) can affect explicit attitudes, but only when the target’s actions are relatively ambiguous and thus capable of being assimilated by the group association cue. However, when the target’s behaviors were unequivocal, the group association had no impact on explicit attitudes, replicating the results of Experiments 1–3. These findings were obtained using a more simplified attitude-learning paradigm and a modified IAT designed to circumvent possible confounds that might exist with presenting target stimuli with the group association cue.

It is interesting to note that there was no evidence that the type of statement (unambiguously positive vs. ambiguous) qualified the main effect of Bob’s weight on implicit attitudes. Although the type of statement did impact explicit attitudes toward Bob, it appears that the strong group association cue (i.e., Bob’s obesity) had a greater impact on implicit attitudes toward him. These data may, at first glance, seem at odds with the earlier studies showing that the valence of the behaviors impacted implicit attitudes toward targets when group association cues were absent. However, it should be noted that the current study differed from the first three experiments in that the valence manipulations in the former studies pitted two starkly different valence conditions (i.e., 100 positive vs. 100 negative) against each other, whereas the current manipulation (designed to introduce ambiguity rather than polar-opposite valences) was far more modest. Thus, it appears that information (i.e., group association cues) that is especially attuned to the system of evaluation underlying implicit attitudes (i.e., the associative system) has a greater impact on the attitudes produced.

In a similar vein, readers comparing the outcomes for explicit attitudes toward the not-overweight Bob between Experiments 1 and 4 might conclude that similar explicit attitudes resulted despite very different behavior presentations (i.e., 0 CA vs. 100 CA in Experiment 1 when positive information was initially presented, positive vs. ambiguous information in Experiment 4). However, this apparent similarity actually reflects a by-product of the standardization process.
That is, although the standardized explicit attitudes between the two studies were nearly identical for the not-overweight Bob in Experiment 1 (i.e., initially positive behavioral characteristics followed by the 100 CA condition) and the not-overweight Bob in Experiment 4 (i.e., the ambiguous behavior condition), mean nonstandardized explicit attitudes were much more positive in the latter case (M = 6.5) than the former case (M = 4.1), t(26) = 4.49, p < .001, reflecting the absence of negative behavioral information about Bob in Experiment 4. In other words, although the standardization process appears to suggest similar explicit attitudes when comparing between studies (which is not what one would expect from such markedly different behavioral presentations), inspection of the nonstandardized values (see the bottom of Figures 2 and 5) indicates that, indeed, attitudes toward the not-overweight Bob who performed ambiguous behaviors in Experiment 4 were far more positive than were attitudes toward the not-overweight Bob in Experiment 1 who performed 100 positive behaviors followed by 100 negative behaviors. Although the focus on the standardized data in the current analyses can make comparisons between studies somewhat more difficult, the value they provide within studies to directly test the key theoretical questions of interest (i.e., how group association cues differentially impact implicit and explicit attitudes) is substantial.
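The standardization by-product described above can be illustrated with a small sketch. It assumes scores are converted to z scores within each study separately; the ratings below are invented for illustration and are not the article’s data, and the names (`standardize`, `cell_means`, `study_a`, `study_b`) are hypothetical.

```python
from statistics import mean, pstdev


def standardize(scores):
    """Convert raw scores to z scores using the sample's own mean and SD."""
    m = mean(scores)
    sd = pstdev(scores)
    return [(x - m) / sd for x in scores]


# Invented raw ratings (NOT the article's data). Study B is Study A shifted
# upward by 2.4 scale points in every cell.
study_a = {"cell_of_interest": [4.0, 4.2, 4.1], "other_cell": [2.0, 2.2, 2.1]}
study_b = {"cell_of_interest": [6.4, 6.6, 6.5], "other_cell": [4.4, 4.6, 4.5]}


def cell_means(study):
    """Return (raw mean, within-study standardized mean) for the cell of interest."""
    pooled = study["cell_of_interest"] + study["other_cell"]
    z = standardize(pooled)
    n = len(study["cell_of_interest"])
    return mean(study["cell_of_interest"]), mean(z[:n])


raw_a, z_a = cell_means(study_a)
raw_b, z_b = cell_means(study_b)
print(f"Study A: raw = {raw_a:.2f}, z = {z_a:.2f}")
print(f"Study B: raw = {raw_b:.2f}, z = {z_b:.2f}")
```

Because each study is standardized against its own mean and spread, the two cells receive the same z score even though their raw means differ by 2.4 points, which is the kind of between-study masking the text attributes to standardization.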

General Discussion

In the current work, we explored how a target’s social group that is strongly associated with valence affects the formation of attitudes toward the individual. Whereas a considerable amount of research has focused on the impact of social categorization on explicit attitudes toward people (Fiske et al., 1999), the current study is the first to consider how the formation of implicit attitudes
toward targets is influenced by them. On the basis of a systems of evaluation perspective on attitudes (e.g., Rydell & McConnell, 2006; Rydell et al., 2006), a number of novel hypotheses were advanced. For example, because many social group cues have strong associations with valence (e.g., being physically attractive is desirable, being obese is undesirable), we anticipated that such cues would have an especially strong impact on implicit attitude formation for individuals because such attitudes rely on associative knowledge. Indeed, across four experiments, we found strong and consistent support for this prediction. For example, implicit attitudes toward members of stigmatized groups were negative regardless of the valence of the behaviors attributed to these individuals. These outcomes were observed for a wide variety of stigmas, including being overweight (Experiments 1 and 4), being physically unattractive (Experiment 2), and being African American (Experiment 3). However, when groups were associated with positivity (i.e., being physically attractive in Experiment 2), implicit attitudes toward the target were positive, again regardless of the nature of the individual’s actions. Overall, these data suggest that group association cues have an especially strong impact on implicit attitudes because such evaluations are based on a system of evaluation that uses associative knowledge (Rydell & McConnell, 2006; Sloman, 1996; Smith & DeCoster, 2000). Further, these group association cues did not have much impact on explicit attitudes when the target’s behavioral descriptions were clear-cut with respect to valence. However, when the target’s actions were ambiguous, group association cues influenced explicit attitudes toward targets as well (Experiment 4). Thus, in cases where ambiguity exists, social groups can serve as an accessible construct to promote assimilation effects (Bruner, 1957; Higgins, 1989; Srull & Wyer, 1979). 
Yet, when a target’s actions were unequivocal in their implications, group cues did not play a role in explicit evaluations. It is also important to note that although we used group association cues that are widely held in society in the current research, there will be meaningful variability in the extent to which these cues are associated with valence. Accordingly, in Experiment 3, we saw that individual differences in the extent to which African Americans as a group were associated with negativity directly predicted the degree to which implicit attitudes toward a novel African American target were negative as well. This finding further reaffirms that it is the overarching association between group cues and valence that determines how a target’s group identity shapes the formation of implicit attitudes toward the individual. Stronger cue-to-valence associations should result in more extreme implicit attitudes being formed more quickly. Further, meaningful differences in one’s past history of group associations will play an important role in how implicit attitudes are affected by group association cues. Moreover, this outcome indicates that implicit attitudes toward an individual should be more sensitive to the valence of a target’s behavioral information when cue-to-valence associations are relatively weak (just like they are in the absence of salient group association cues). The current work sheds light on a number of important issues involved in understanding others. For instance, most research exploring group prejudice has focused on its pervasiveness, expression, and assessment (e.g., Devine, 1989; Dovidio et al., 2002; Greenwald et al., 1998; McConnell & Leibold, 2001) rather than on its specific impact when people are forming attitudes toward

individuals. And although some important work, both empirical and theoretical, has been directed at considering the impact of social groups on attitude formation (e.g., Brewer, 1988; Fiske & Neuberg, 1990), past work has not considered the implications of social groups for the formation of implicit attitudes toward individuals. Given many striking demonstrations of how implicit attitudes uniquely predict behavior toward members of social groups (e.g., Fazio et al., 1995; McConnell & Leibold, 2001) and toward individuals without any distinctive group identification (Rydell & McConnell, 2006), in the current work, we engaged an underexamined intersection of important issues with a framework for considering how particular types of information (i.e., associative vs. rule based) are especially likely to impact particular types of attitudes (Rydell & McConnell, 2006; Rydell et al., 2006). In general, the systems of evaluation perspective anticipated how group association cues would impact implicit and explicit attitude formation toward a novel individual quite well. More generally, the current work points to the need to further develop models of impression formation to include implicit attitudes toward individuals, and we believe the systems of evaluation approach provides a compelling framework for doing so. In addition to considering the extent to which perceivers expend cognitive resources in attitude formation (e.g., Fiske & Neuberg, 1990), the systems of evaluation perspective suggests that the fit between information type and attitude type matters too. That is, correspondence between the type of knowledge (i.e., associative vs. rule based) and the type of attitude (i.e., implicit vs. explicit) is an important dimension to consider in the attitude formation process. 
Although in the current work we focused on social group cues as one form of associative knowledge, we would anticipate that many other types of cues strongly associated with valence (e.g., experts are good) would be especially influential in implicit attitude formation as well. Extending this point, we believe that the current work can shed new light on processes involved in attitudes and persuasion. For example, several models of attitudes propose that heuristic and peripheral cues influence attitudes and behavior more strongly when one’s motivation to process detailed information is low (e.g., Chaiken, 1979; Petty & Wegener, 1998). But what underlies this outcome? When one considers that nonconscious associations (Rydell et al., 2006) and association-based cues (the current work) play critical roles in determining implicit attitudes and that recent work reveals that implicit attitudes are more likely than explicit attitudes to determine spontaneous behaviors that do not involve deliberation and planning (e.g., Dovidio et al., 2002; Jellison, McConnell, & Gabriel, 2004; McConnell & Leibold, 2001; Rydell & McConnell, 2006), a mechanism to account for how these association-based cues influence behavior in low-effort situations becomes apparent. That is, cues such as physical attractiveness or others not considered in the current work (e.g., expertise) will shape implicit attitudes, which, in turn, are more likely to guide behavior in situations where the rule-based system does not (e.g., no verbal information is available for central route persuasion) or cannot (e.g., limited cognitive resources) operate. Thus, a systems of evaluation perspective suggests that implicit attitudes may serve as a mechanism to explain how association-based cues impact behavior in some situations. Moreover, as the current Experiment 3 suggests, the impact of these cues in shaping implicit attitudes varies on the basis of idiosyncratic associations with the target-relevant cue. Thus, not all cues will have the same impact across individuals.

In addition to these important conceptual issues, the current findings suggest a number of sizable roadblocks in reducing prejudice and discrimination. First, to the extent that stigmas impact implicit attitudes more strongly than they do explicit attitudes, it may often be the case that people will be unaware of their stigma-related biases because the biases are associative in nature, which, in turn, makes it less likely that correction processes will be used (Wegener & Petty, 1995). Further, such nonconscious biases may elicit behavioral confirmation from targets (e.g., Chen & Bargh, 1997), perpetuating such evaluations. Also, situational factors that reduce the impact of the rule-based system of evaluation (e.g., distraction, off-peak circadian rhythms) should exacerbate the influence of the associative system in directing behavior, increasing the likelihood that negative implicit attitudes will guide actions toward stigmatized targets (especially negative, nonverbal behaviors; see McConnell & Leibold, 2001; Richeson & Shelton, 2003).

It is interesting that these group association cues had little effect on explicit attitudes except under conditions where behavioral information was ambiguous. This suggests that although group association cues may play an important role in deliberate judgments and evaluations, their impact may be reduced in situations that are less ambiguous in nature. Does this mean that the consequences of stigma are less important than the literature suggests? We believe the answer to this question is no. First, although most of the behavioral statements presented about the targets in the current study were clear-cut, most social interactions contain considerable ambiguity, increasing the likelihood of biased assimilation (see Experiment 4). Further, in the current work, participants were compelled to process a large number of statements about the target individuals.
However, in real life, people may be far less attentive to individuating information, especially for members of stigmatized groups (Kurzban & Leary, 2001). Thus, in many cases, stigmas may dissuade perceivers from encountering information that could present targets in a much more positive light. Moreover, even if such behaviors are encountered, the extent to which people effortfully individuate such information in such cases may be limited (Fiske & Neuberg, 1990). At any rate, it is clear that additional work is needed to examine how these trade-offs operate in more complex social interaction situations.

To conclude, the current work shows that the formation of implicit attitudes toward members of social groups may often reflect the valence of group association cues instead of the behavioral data available to the perceiver. Explicit attitudes, however, were less influenced by these cues and more determined by descriptions of the target person’s behaviors, unless the available behavioral information was ambiguous with respect to evaluations. This research shows that stigmatized people may face challenges in changing others’ implicit attitudes toward them, even when they perform good deeds, whereas members of highly valued groups can behave badly and still enjoy others’ implicit approbation. Although knowledge of the impression formation process with respect to explicit attitudes is well-developed, the current study reveals that the formation of implicit attitudes toward individuals can operate quite differently (see also Rydell & McConnell, 2006; Rydell et al., 2006). In sum, we believe that a systems of evaluation perspective offers a very insightful theory and useful tools for building an understanding of implicit attitude formation processes, and it demonstrates the importance of appreciating the consequences of these nonconscious evaluations as well.

References

Ambady, N., & Rosenthal, R. (1993). Half a minute: Predicting teacher evaluations from thin slices of nonverbal behavior and physical attractiveness. Journal of Personality and Social Psychology, 64, 431–441.
Bargh, J. A. (1999). The cognitive monster. In S. Chaiken & Y. Trope (Eds.), Dual process theories in social psychology (pp. 361–382). New York: Guilford Press.
Bargh, J. A., & Chartrand, T. L. (1999). The unbearable automaticity of being. American Psychologist, 54, 462–479.
Berscheid, E., & Walster, E. (1974). Physical attractiveness. Advances in Experimental Social Psychology, 7, 211–276.
Bodenhausen, G. V., & Wyer, R. S., Jr. (1985). Effects of stereotypes on decision making and information-processing strategies. Journal of Personality and Social Psychology, 48, 267–282.
Brewer, M. B. (1988). A dual process model of impression formation. In T. K. Srull & R. S. Wyer (Eds.), Advances in social cognition (Vol. 1, pp. 1–36). Hillsdale, NJ: Erlbaum.
Bruner, J. S. (1957). On perceptual readiness. Psychological Review, 64, 123–152.
Castelli, L., Zogmaister, C., Smith, E. R., & Arcuri, L. (2004). On the automatic evaluation of social exemplars. Journal of Personality and Social Psychology, 86, 373–387.
Chaiken, S. (1979). Communicator physical attractiveness and persuasion. Journal of Personality and Social Psychology, 37, 1387–1397.
Chen, M., & Bargh, J. A. (1997). Nonconscious behavioral confirmation processes: The self-fulfilling consequences of automatic stereotype activation. Journal of Experimental Social Psychology, 33, 541–560.
Crandall, C. S., D’Anello, S., Sakalli, N., Lazarus, E., Nejtardt, G. W., & Feather, N. T. (2001). An attribution-value model of prejudice: Anti-fat attitudes in six nations. Personality and Social Psychology Bulletin, 27, 30–37.
Crocker, J., Major, B., & Steele, C. (1998). Social stigma. In D. T. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), Handbook of social psychology (4th ed., Vol. 2, pp. 504–553). New York: McGraw-Hill.
Devine, P. G. (1989). Stereotypes and prejudice: Their automatic and controlled components. Journal of Personality and Social Psychology, 56, 680–690.
Dion, K. (1973). Young children’s stereotyping of facial attractiveness. Developmental Psychology, 9, 183–188.
Dion, K., Berscheid, E., & Walster, E. (1972). What is beautiful is good. Journal of Personality and Social Psychology, 24, 285–290.
Dovidio, J. F., Kawakami, K., & Gaertner, S. L. (2002). Implicit and explicit prejudice and interracial interaction. Journal of Personality and Social Psychology, 82, 62–68.
Duncan, B. L. (1976). Differential social perception and attribution of intergroup violence: Testing the lower limits of stereotyping of Blacks. Journal of Personality and Social Psychology, 34, 590–598.
Eagly, A. H., Ashmore, R. D., Makhijani, M. G., & Longo, L. C. (1991). What is beautiful is good, but . . . : A meta-analytic review of research on the physical attractiveness stereotype. Psychological Bulletin, 110, 109–128.
Fazio, R. H. (1995). Attitudes as object-evaluation associations: Determinants, consequences, and correlates of attitude accessibility. In R. E. Petty & J. A. Krosnick (Eds.), Attitude strength: Antecedents and consequences (pp. 247–282). Mahwah, NJ: Erlbaum.
Fazio, R. H., Jackson, J. R., Dutton, B. C., & Williams, C. J. (1995). Variability in automatic activation as an unobtrusive measure of racial attitudes: A bona fide pipeline? Journal of Personality and Social Psychology, 69, 1013–1027.
Fiske, S. T. (1998). Stereotyping, prejudice, and discrimination. In D. T. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), Handbook of social psychology (4th ed., Vol. 2, pp. 357–411). New York: McGraw-Hill.
Fiske, S. T., Lin, M., & Neuberg, S. L. (1999). The continuum model: Ten years later. In S. Chaiken & Y. Trope (Eds.), Dual-process theories in social psychology (pp. 231–254). New York: Guilford Press.
Fiske, S. T., & Neuberg, S. L. (1990). A continuum of impression formation, from category based to individuating processes: Influences of information and motivation on attention and interpretation. In M. P. Zanna (Ed.), Advances in experimental social psychology (Vol. 23, pp. 1–74). San Diego, CA: Academic Press.
Gawronski, B., & Bodenhausen, G. V. (2006). Associative and propositional processes in evaluation: An integrative review of implicit and explicit attitude change. Psychological Bulletin, 132, 692–731.
Greenwald, A. G., & Banaji, M. R. (1995). Implicit social cognition: Attitudes, self-esteem, and stereotypes. Psychological Review, 102, 4–27.
Greenwald, A. G., McGhee, D. E., & Schwartz, J. L. K. (1998). Measuring individual differences in implicit cognition: The implicit association test. Journal of Personality and Social Psychology, 74, 1464–1480.
Greenwald, A. G., Nosek, B. A., & Banaji, M. R. (2003). Understanding and using the Implicit Association Test: I. An improved scoring algorithm. Journal of Personality and Social Psychology, 85, 197–216.
Higgins, E. T. (1989). Knowledge accessibility and activation: Subjectivity and suffering from unconscious sources. In J. S. Uleman & J. A. Bargh (Eds.), Unintended thought (pp. 75–123). New York: Guilford Press.
Jellison, W. A., McConnell, A. R., & Gabriel, S. (2004). Implicit and explicit measures of sexual orientation attitudes: Ingroup preferences and related behaviors and beliefs among gay and straight men. Personality and Social Psychology Bulletin, 30, 629–642.
Kerpelman, J. P., & Himmelfarb, S. (1971). Partial reinforcement effects in attitude acquisition and counterconditioning. Journal of Personality and Social Psychology, 19, 301–305.
Kurzban, R., & Leary, M. R. (2001). Evolutionary origins of stigmatization: The functions of social exclusion. Psychological Bulletin, 127, 187–208.
McConnell, A. R., & Leibold, J. M. (2001). Relations among the Implicit Association Test, discriminatory behavior, and explicit measures of racial attitudes. Journal of Experimental Social Psychology, 37, 435–442.
McGuire, W. J., McGuire, C. V., Child, P., & Fujioka, T. (1978). Salience of ethnicity in the spontaneous self-concept as a function of one’s ethnic distinctiveness in the social environment. Journal of Personality and Social Psychology, 36, 511–520.
Miller, D. T., Taylor, B., & Buck, M. L. (1991). Gender gaps: Who needs to be explained? Journal of Personality and Social Psychology, 61, 5–12.
Minear, M., & Park, D. C. (2004). A lifespan database of adult facial stimuli. Behavior Research Methods, Instruments, & Computers, 36, 630–633.
Neuberg, S. L., Smith, D. M., Hoffman, J. C., & Russell, F. J. (1994). When we observe stigmatized and “normal” individuals interacting: Stigma by association. Personality and Social Psychology Bulletin, 20, 196–209.
Nisbett, R. E., & Wilson, T. D. (1977). Telling more than we can know: Verbal reports on mental processes. Psychological Review, 84, 231–259.
Nosek, B. A. (2005). Moderators of the relationship between implicit and explicit evaluation. Journal of Experimental Psychology: General, 134, 565–584.
Nosek, B. A., & Banaji, M. R. (2001). The Go/No-Go Association Task. Social Cognition, 19, 625–666.
Nosek, B. A., Banaji, M. R., & Greenwald, A. G. (2004). Project Implicit. Retrieved October 15, 2005, from http://implicit.harvard.edu
Nosek, B. A., Greenwald, A. G., & Banaji, M. R. (2006). The Implicit Association Test at age 7: A methodological and conceptual review. In J. A. Bargh (Ed.), Social psychology and the unconscious: The automaticity of higher mental processes (pp. 265–292). New York: Psychology Press.
Petty, R. E., & Wegener, D. T. (1998). Attitude change: Multiple roles for persuasion variables. In D. T. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), Handbook of social psychology (4th ed., Vol. 1, pp. 323–390). New York: McGraw-Hill.
Plant, E. A., & Devine, P. G. (1998). Internal and external motivation to respond without prejudice. Journal of Personality and Social Psychology, 75, 811–832.
Pryor, J. B., Reeder, G. D., Yeadon, C., & Hesson-McInnis, M. (2004). A dual-process model of reactions to perceived stigma. Journal of Personality and Social Psychology, 87, 436–452.
Richeson, J. A., & Shelton, J. N. (2003). When prejudice does not pay: Effects of interracial contact on executive function. Psychological Science, 14, 287–290.
Rozin, P., Markwith, M., & Nemeroff, C. (1992). Magical contagion beliefs and fear of AIDS. Journal of Applied Social Psychology, 22, 1081–1092.
Rudman, L. A., Feinberg, J., & Fairchild, K. (2002). Minority members’ implicit attitudes: Automatic ingroup bias as a function of group status. Social Cognition, 20, 294–320.
Rydell, R. J., & McConnell, A. R. (2006). Understanding implicit and explicit attitude change: A systems of reasoning analysis. Journal of Personality and Social Psychology, 91, 995–1008.
Rydell, R. J., McConnell, A. R., Mackie, D. M., & Strain, L. M. (2006). Of two minds: Forming and changing valence-inconsistent implicit and explicit attitudes. Psychological Science, 17, 954–958.
Rydell, R. J., McConnell, A. R., Strain, L. M., Claypool, H. M., & Hugenberg, K. (2007). Implicit and explicit attitudes respond differently to increasing amounts of counterattitudinal information. European Journal of Social Psychology, 37, 867–878.
Sagar, H. A., & Schofield, J. W. (1980). Racial and behavioral cues in Black and White children’s perceptions of ambiguously aggressive acts. Journal of Personality and Social Psychology, 39, 590–598.
Schwarz, N., & Bohner, G. (2001). The construction of attitudes. In A. Tesser & N. Schwarz (Eds.), Blackwell handbook of social psychology: Intrapersonal processes (pp. 436–457). Oxford, UK: Blackwell.
Sloman, S. A. (1996). The empirical case for two systems of reasoning. Psychological Bulletin, 119, 3–22.
Smith, E. R., & DeCoster, J. (2000). Dual-process models in social and cognitive psychology: Conceptual integration and links to underlying memory systems. Personality and Social Psychology Review, 4, 108–131.
Srull, T. K., & Wyer, R. S. (1979). The role of category accessibility in the interpretation of information about persons: Some determinants and implications. Journal of Personality and Social Psychology, 37, 1660–1672.
Strack, F., & Deutsch, R. (2004). Reflective and impulsive determinants of social behavior. Personality and Social Psychology Review, 8, 220–247.
Wegener, D. T., & Petty, R. E. (1995). Flexible correction processes in social judgment: The role of naive theories in corrections for perceived bias. Journal of Personality and Social Psychology, 68, 36–51.
Wittenbrink, B., Judd, C. M., & Park, B. (1997). Evidence for prejudice at the implicit level and its relationship with questionnaire measures. Journal of Personality and Social Psychology, 72, 262–274.

Received May 26, 2007
Revision received December 11, 2007
Accepted December 22, 2007

INTERPERSONAL RELATIONS AND GROUP PROCESSES

Maintaining Sexual Desire in Intimate Relationships: The Importance of Approach Goals

Emily A. Impett
University of California, Berkeley

Amy Strachman
University of Southern California

Eli J. Finkel
Northwestern University

Shelly L. Gable
University of California, Santa Barbara

Three studies tested whether adopting strong (relative to weak) approach goals in relationships (i.e., goals focused on the pursuit of positive experiences in one’s relationship such as fun, growth, and development) predicts greater sexual desire. Study 1 was a 6-month longitudinal study with biweekly assessments of sexual desire. Studies 2 and 3 were 2-week daily experience studies with daily assessments of sexual desire. Results showed that approach relationship goals buffered against declines in sexual desire over time and predicted elevated sexual desire during daily sexual interactions. Approach sexual goals mediated the association between approach relationship goals and daily sexual desire. Individuals with strong approach goals experienced even greater desire on days with positive relationship events and experienced less of a decrease in desire on days with negative relationship events than individuals who were low in approach goals. In two of the three studies, the association between approach relationship goals and sexual desire was stronger for women than for men. Implications of these findings for maintaining sexual desire in long-term relationships are discussed.

Keywords: sexual desire, motivation, close relationships, gender differences, daily experience methods

I know nothing about sex, because I was always married.
(Zsa Zsa Gabor)

With this statement, Zsa Zsa highlights a common belief about the decline of sexual interest and activity in long-term relationships. Lack of sexual desire is the most common presenting problem at sex therapy clinics (e.g., Beck, 1995; Hawton, Catalan & Fagg, 1991). In the American survey conducted by Laumann and his colleagues, a lack of sexual desire was reported by 32% of women and 15% of men between the ages of 18 and 29 years (Laumann, Gagnon, Michael, & Michaels, 1994). Recent books by sex therapists and clinicians with such titles as Rekindling Desire: A Step-by-Step Program to Help Low-Sex and No-Sex Marriages (McCarthy & McCarthy, 2003) and Reclaiming Desire: Four Keys to Finding Your Lost Libido (Goldstein & Brandon, 2004) target couples who seek to rekindle sexual intimacy and passion in their relationships. In this article, we introduce and test approach relationship goals (i.e., goals focused on the pursuit of positive experiences in one’s relationship such as fun, growth, and development) as a factor that may help couples to maintain sexual desire over the course of their relationships.

Sexual Desire and Relationship Quality

Although there is no widely accepted definition of sexual desire among researchers and theorists (Levine, 2003), central to many definitions is the need, drive, or motivation to engage in sexual activities (Brezsnyak & Whisman, 2004; Clayton et al., 2006; Diamond, 2004).1 Several large-scale surveys have shown that sexual desire as well as the related constructs of sexual satisfaction and sexual frequency decline with the length of time that partners have been in a relationship (e.g., Johnson, Wadsworth, Wellings, & Field, 1994; Klusmann, 2002). One large survey of German college students revealed that as duration of partnership increased, the frequency of sexual intercourse and sexual satisfaction declined in both women and men. Further, whereas men’s sexual desire remained relatively stable over the course of a relationship, women’s sexual desire dropped steadily after about 1 year of dating (Klusmann,

Emily A. Impett, Institute of Personality and Social Research, University of California, Berkeley; Amy Strachman, Institute for Health Promotion & Disease Prevention Research, University of Southern California, and eHarmony Labs, Pasadena, California; Eli J. Finkel, Department of Psychology, Northwestern University; Shelly L. Gable, Department of Psychology, University of California, Santa Barbara. Preparation of this article was supported by a fellowship awarded to Emily A. Impett from the Sexuality Research Fellowship Program of the Social Science Research Council and by a Ruth L. Kirschstein National Research Service Award to Amy Strachman. We thank Amie Gordon, Anne Peplau, and Deborah Schooler for helpful comments on earlier versions of this article. Correspondence concerning this article should be addressed to Emily A. Impett, Institute of Personality and Social Research, 4143 Tolman Hall, 5050, University of California, Berkeley, CA 94720-5050. E-mail: [email protected]

1 Although we were centrally concerned with the motivational component of sexuality (i.e., sexual desire), there is substantial overlap between sexual desire and the related constructs of sexual arousal and enjoyment. The traditional “human sex response cycle” of Masters and Johnson (1966) and Kaplan (1979) depicts sexual desire as a spontaneous force that itself triggers sexual arousal. In more recent years, however, therapists and researchers have begun to challenge this model, particularly Basson and colleagues (Basson et al., 2004) who have suggested that sexual arousal, desire, and enjoyment co-occur and can reinforce each other. Many people report that their sexual desire increases during sexual intercourse; that is, as they begin to be aroused and enjoy the sexual experience, they recognize that their sexual desire increases, and they become motivated to become even more aroused (Levine, 2002). For these reasons, in some of the studies in the current article, we assessed sexual arousal and enjoyment in addition to sexual desire in order to more fully capture the interrelated components of sexual desire.

Journal of Personality and Social Psychology, 2008, Vol. 94, No. 5, 808 – 823 Copyright 2008 by the American Psychological Association 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.808


APPROACH GOALS AND SEXUAL DESIRE

2002). Another study documented that the association between relationship duration and reduced frequency of intercourse was stronger than the association between age and sexual frequency (Johnson et al., 1994). In short, sexual desire typically peaks at the beginning of relationships when partners are just getting to know each other and often decreases over the course of relationships (Basson, 2002; Levine, 2003). Because both sexual desire and sexual satisfaction play key roles in determining the quality of intimate relationships, relationship scholars and therapists should care about the decline of sexual desire. Many studies of couples who voluntarily attend sex therapy clinics provide support for the idea that low sexual desire is associated with decreased levels of relationship satisfaction, both for individuals with low desire and for their partners (e.g., McCabe, 1997; Trudel, Landry, & Larose, 1997). More recent studies have also documented similar associations between sexual desire and relationship satisfaction in community samples of married couples (Brezsnyak & Whisman, 2004) and dating couples (Regan, 2000; Sprecher, 2002). Further, many empirical studies have documented a significant positive association between sexual satisfaction and dating and marital quality (Yeh, Lorenz, Wickrama, Conger, & Elder, 2006; see review by Sprecher & Cate, 2004). Sex therapists have similarly noted that when sexuality functions well in a marriage, it contributes substantially to the marital bond. However, dysfunctional or nonexistent sexuality robs the marriage of intimacy, satisfaction, and stability (McCarthy, 1999). While a wealth of research has shown that sexual desire contributes to relationship quality and stability, less research has investigated the factors that help promote and sustain sexual desire in relationships. 
In this article, we suggest that the adoption of approach relationship goals may help couples to maintain sexual desire over the course of their relationships. We first introduce the approach–avoidance theoretical perspective guiding this research and apply this theory to the study of sexuality in intimate relationships. We then present the results of three studies designed to test our hypotheses concerning the link between approach relationship goals and sexual desire. Finally, we discuss the implications of this research for couples and sex therapists who wish to promote healthy sexual functioning in long-term relationships.

Approach–Avoidance Motivational Framework

Several theories of motivational processes postulate the existence of distinct approach (also called appetitive) and avoidance (also called aversive) motivational systems (see reviews in Carver, Sutton, & Scheier, 2000; Elliot & Covington, 2001). For instance, Gray’s (1987) neuropsychological model of motivation posits appetitive and aversive motivational systems, referred to as the behavioral approach system (BAS) and the behavioral inhibition system (BIS; see also Carver & White, 1994). Specifically, the BAS is an appetitive system that is primarily sensitive to positive stimuli or signals of reward, whereas the BIS is an aversive system that is primarily sensitive to negative stimuli or signals of punishment. Gray (1990) has shown that the BAS is associated with feelings of hope, whereas the BIS is associated with feelings of anxiety. In a study of motivational dispositions and daily events, Gable, Reis, and Elliot (2000) found that participants with higher BAS sensitivity reported experiencing more daily positive affect than those with lower BAS sensitivity, while participants with higher BIS sensitivity reported experiencing more daily negative affect than those with lower BIS sensitivity.

The approach–avoidance motivational distinction has been particularly helpful in understanding motivation in interpersonal relationships. Basing their work on that of early social motivation theorists (e.g., Boyatzis, 1973; Mehrabian, 1976), Gable and colleagues have recently distinguished between approach and avoidance social goals (Elliot, Gable, & Mapes, 2006; Gable, 2006b). Whereas approach social goals direct individuals toward potential positive outcomes such as intimacy or growth in their relationships, avoidance social goals direct individuals away from potential negative outcomes such as conflict or rejection. For example, in a discussion about child care, a husband who has strong approach goals may be concerned with wanting the discussion to go smoothly and wanting both partners to be happy with the outcome. In contrast, a husband with strong avoidance goals may be more concerned with avoiding conflict about child care and preventing both partners from being unhappy with the outcome (Gable, 2006b). These goals are flexible forms of regulation that may take on diverse manifestations; they may focus on a specific relationship or relationships in general, they may focus on close relationships or acquaintances, and they may focus on a variety of relational concerns such as sexuality, intimacy, and parenting (Dowson & McInerney, 2003; Sorkin & Rook, 2004). In this article, we are specifically concerned with individuals’ goals in their romantic relationships (in Studies 1 and 2) and in their interpersonal relationships more generally (in Study 3).

Just as BAS and BIS are associated with distinct emotional outcomes, research has also shown that approach and avoidance social goals predict different social outcomes (Gable, 2006b; Impett, Gable, & Peplau, 2005).
In three short-term longitudinal studies, approach goals were associated with more positive social attitudes, more satisfaction with social bonds, and less loneliness, whereas avoidance goals were associated with more negative social attitudes, relationship insecurity, and more loneliness (Gable, 2006b). In a daily experience study of dating couples, on days when individuals made sacrifices for approach motives, they experienced greater positive affect and relationship satisfaction; on days when they sacrificed for avoidance motives, they experienced greater negative affect, less relationship satisfaction, and more conflict (Impett, Gable, & Peplau, 2005). In short, the approach system (but not the avoidance system) is associated with positive emotional and social outcomes.

Applying the Approach–Avoidance Motivational Framework to Sexuality

Central to many definitions of sexual desire is the need or motivation to engage in sexual activities or the pleasurable anticipation of such activities in the future (Brezsnyak & Whisman, 2004; Clayton et al., 2006; Diamond, 2004). In short, sexual desire involves the potential rewards and the positive emotional experience that are characteristic of the approach motivational system. In addition to predicting positive affect and relationship satisfaction (Gable, 2006b; Gable et al., 2000; Impett, Gable, & Peplau, 2005), approach relationship goals may also be associated with daily sexual desire and the maintenance of sexual desire over time. One possible reason why approach relationship goals may promote sexual desire concerns people’s motives or reasons for engaging in


IMPETT, STRACHMAN, FINKEL, AND GABLE

sexual activity with a partner (Cooper, Shapiro, & Powers, 1998; Impett, Peplau, & Gable, 2005). People who pursue positive experiences, such as growth and development, in their relationships may view sexual activity as one way to create positive, intimate experiences with a partner. Therefore, compared with people with weak approach relationship goals, those with strong approach relationship goals may think more about sex, be more sensitive to their partners’ cues, create environments that promote intimate interaction, and act more readily upon potential sexual encounters. Previous research has shown that approach motives and goals are primarily linked to outcomes through an exposure process (Elliot et al., 2006; Gable, 2006b; Gable et al., 2000). That is, individuals with strong approach goals or motives tend to report experiencing a greater number of positive events (but not fewer negative events). Therefore, individuals with strong approach goals are likely to experience a greater number of positive events and positive emotions (including desire) with their partners. Because of these previous findings, we believe that approach goals in close relationships will be more strongly related to sexual desire than will avoidance goals because approach goals are primarily associated with positive events (likely through processes that lead to increased exposure) and avoidance goals have not been linked reliably to positive events. It is also likely that people with strong approach goals for their relationships in general may also engage in daily sexual activity for approach reasons such as pleasing a partner or enhancing intimacy in the relationship. Repeatedly engaging in sex for approach reasons, in turn, may promote greater sexual desire. A recent cross-sectional study of late adolescent girls showed that engaging in sex for approach goals (e.g., to express love, for physical attraction) was positively associated with sexual satisfaction (Impett & Tolman, 2006). 
Based on this research, we predicted that individuals with strong approach relationship goals would report engaging in sexual activity for approach reasons, in turn, promoting greater sexual desire.

Gender and Sexual Desire

Many lines of research demonstrate that men show more interest in sex than do women (see review by Baumeister, Catanese, & Vohs, 2001). For example, men think about sex more often than women do (Laumann et al., 1994), report more frequent sexual fantasies (Beck, Bozman, & Qualtrough, 1991), and report greater feelings of sexual desire (Leitenberg & Henning, 1995). Further, men and women differ in their preferred frequency of sex; when dating and marriage partners disagree about sexual frequency, men usually want to have sex more often than women (Julien, Bouchard, Gagnon, & Pomerleau, 1992; Sprecher & Regan, 1996).

Complementing these descriptive gender differences is research demonstrating that women’s sexual desire may be more closely tied to the interpersonal aspects of the relationships than is men’s desire (see review by Peplau, 2003). For instance, when Regan and Berscheid (1999) asked young adults to define sexual desire, men were more likely than women to emphasize physical pleasure and sexual intercourse, whereas women were more likely than men to emphasize the emotional or relational side of sexual desire. Women are more likely than men to engage in sex to enhance commitment and express love for their partners (Basson, 2002; Impett, Peplau, & Gable, 2005). Taken together, these lines of research suggest that women’s sexual desire may be more sensitive than men’s to relationship dynamics, and in particular, to women’s goals for the relationship. For this reason, a secondary goal of the current research was to explore whether the association between approach relationship goals and sexual desire is stronger for women than for men.

Hypotheses and Research Overview

We conducted three studies of individuals in dating relationships to test several predictions from approach–avoidance motivational theory about the maintenance of sexual desire in dating relationships. Study 1 was a 6-month longitudinal study of individuals in dating relationships that included biweekly assessments of sexual desire. In this study, we tested the hypothesis that the adoption of approach relationship goals would buffer against declines in sexual desire over time. Study 2 was a 2-week daily experience study of individuals in dating relationships designed to extend the findings from Study 1 by testing whether approach sexual goals would mediate the link between approach relationship goals and sexual desire. Study 3 was an additional 2-week daily experience study that included (a) a more general measure of social (as opposed to relationship-specific) approach goals, (b) a more detailed measure of sexual goals that distinguished between self-focused and other-focused goals, and (c) measures of positive and negative relationship events. These last measures enabled us to examine how perceptions of the daily relationship climate influence sexual desire and whether relationship events moderate the association of approach goals with sexual desire.

In all three studies, we conducted additional analyses to examine the effects of approach goals on sexual desire beyond the influence of how long people have been involved in their relationships, how satisfied they are with their partners, and how frequently they engage in sexual activity. Finally, in all three studies, we examined gender as a moderator of the link between approach goals and sexual desire, exploring the possibility that the association may be stronger for women than for men as women’s sexual desire may be particularly sensitive to women’s goals for their relationships.

Study 1

We tested three main predictions in a 6-month longitudinal study of college students in dating relationships: (a) Individuals with strong approach goals would report higher sexual desire at study entry than individuals with weak approach goals; (b) individuals who began the study with weak approach relationship goals would experience decreases in sexual desire over the course of the study, whereas individuals who began the study with strong approach relationship goals would not experience such decreases; and (c) avoidance relationship goals would not be significantly associated with sexual desire. Finally, we explored gender as a moderator of the association between approach relationship goals and sexual desire.

Method

Participants and Procedure

Sixty-nine Northwestern University undergraduate students (34 men, 35 women) were recruited via flyers posted around campus to participate in a 6-month longitudinal study of dating processes. Eligibility criteria required that each participant be: (a) a first-year undergraduate at Northwestern University, (b) involved in a dating relationship of at least 2 months’ duration, (c) between 17 and 19 years old, (d) a native English speaker, and (e) the only member of a given couple to participate in the study. Participants who completed all components of the study were paid $100; those who missed some were paid a prorated amount. At the beginning of the study, most participants were 18 years old (7% were 17, 81% were 18, and 12% were 19) and White (74% White, 12% Asian American, 3% Hispanic, 1% African American, and 10% other). On average, participants had been dating their partner for a little over 1 year (M = 13 months; range = 2–42 months). During the 6-month study, 26 participants (38%) broke up with their romantic partner; they were included in the analyses until the breakup.2

This study was part of a larger investigation of dating processes that was divided into four parts: (a) an initial 60-min questionnaire sent via campus mail, (b) a 90-min lab-based session involving additional questionnaires and training for the online sessions, (c) a 10- to 15-min online questionnaire every other week for 6 months (14 in total), and (d) a 60-min lab-based session at the end of the 6-month period. During the training for the online sessions, a researcher reviewed the procedures for the completion of the biweekly surveys, specifically emphasizing that participants should complete their surveys every other Wednesday evening and that their responses were confidential (i.e., they used a password to log onto the server). To bolster and verify compliance, we sent participants reminder e-mails if they forgot to complete a survey on time, and financial incentives were linked to completing each survey. Only surveys received within 48 hr of when they were due were retained in the data set.
Participant retention was excellent: All 69 participants completed the study, and 67 of them completed at least 12 of the 14 online measures on time. Fourteen participants failed to complete the measure of approach and avoidance relationship goals correctly, leaving the final sample at 55 participants.3

Measures

Approach and avoidance relationship goals. As part of the initial questionnaire, participants completed a 4-item measure assessing approach relationship goals (e.g., “I will be trying to deepen my relationship with my romantic partner” and “I will be trying to move toward growth and development in my romantic relationship”; α = .86) and another assessing avoidance relationship goals (e.g., “I will be trying to avoid disagreements and conflicts with my romantic partner” and “I will be trying to make sure that nothing bad happens in my romantic relationship”; α = .66; Gable, 2006a). All questions were answered on 7-point scales (1 = strongly disagree, 7 = strongly agree), and each scale was calculated as an average score of the ratings on the 4 items. In the current study, a two-factor-solution principal components analysis with varimax rotation explained 60% of the scale variance. The first factor (35% of explained variance) included the four approach relationship goals items, and the second factor (25% of explained variance) included the four avoidance goals items. The correlation between the two subscales was .21, p = .12.

Sexual desire. As part of the 14 biweekly online questionnaires, participants answered questions about their sexual desire for and participation in sexual activities with their dating partner. (These activities were not limited to sexual intercourse.) Participants completed a two-item partner-specific measure of sexual desire, answering the questions “I feel a great deal of sexual desire for my partner” and “When my partner and I have sexual contact, I enjoy it a great deal” on 7-point scales (1 = strongly disagree, 7 = strongly agree). Within each of the 14 waves of online data collection, the correlations between these two items were quite high (rs = .69–.98, ps < .001, with an average correlation across these waves of .91). We also assessed frequency of sexual contact with one’s partner. Participants answered the question “How many times did you have sexual contact with your partner over the last 2 weeks?” by typing in a number rather than answering on a response scale.

Relationship satisfaction. Participants answered one question designed to measure relationship satisfaction as part of the 14 biweekly online questionnaires. Specifically, they responded to the statement “I am satisfied with my relationship” on 7-point scales (1 = strongly disagree, 7 = strongly agree).
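The scale scoring and internal-consistency statistics described in the Measures section can be illustrated with a minimal sketch. The ratings below are hypothetical and are not data from the study; the function names are our own, and the alpha computation uses the standard Cronbach formula with sample variances, which is an assumption about how the reported values were obtained.

```python
from statistics import mean, variance

def scale_score(item_ratings):
    """Average one participant's ratings across the items of a scale."""
    return mean(item_ratings)

def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant item-rating lists."""
    k = len(responses[0])                       # number of items in the scale
    items = list(zip(*responses))               # regroup ratings by item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - item_var_sum / total_var)

# Hypothetical 4-item ratings (1 = strongly disagree, 7 = strongly agree).
ratings = [
    [6, 7, 6, 7],
    [4, 4, 5, 4],
    [2, 3, 2, 2],
    [5, 5, 6, 5],
    [7, 6, 7, 7],
]
approach_scores = [scale_score(row) for row in ratings]
alpha = cronbach_alpha(ratings)
```

Each participant thus receives one score per scale on the original 1–7 metric, and alpha summarizes how consistently the four items covary across participants.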

Results

Participants reported an average of 3.16 (SD = 3.92) acts of sexual contact with their partner per 2-week time period of the study. The central hypotheses guiding this study were that approach relationship goals would predict elevated sexual desire at study entry and buffer against declines in sexual desire over time. The two-level data structure included measures assessed on each of the online questionnaires (Level 1) nested within each participant (Level 2). For example, participants who completed all waves of online data collection reported their level of sexual desire on 14 different occasions. Traditional ordinary least squares regression methods assume independence of observations, a criterion that is typically violated when the same individual completes the same measures repeatedly. Therefore, we analyzed the data using multilevel modeling techniques (Raudenbush & Bryk, 2002) with the MIXED procedure in SAS (Littell, Milliken, Stroup, & Wolfinger, 1996). Multilevel modeling approaches provide unbiased hypothesis tests by simultaneously examining variance associated with each level of nesting. A strength of multilevel modeling techniques is that they can readily handle an unbalanced number of cases per person (i.e., number of surveys completed), giving greater weighting to participants who provide more data (Snijders & Bosker, 1999). Following Singer and Willett (2003), we permitted the intercept and slope terms for approach relationship goals to vary randomly; the slope terms for the other predictors were treated as fixed. Finally, all variables were standardized prior to analyses; consequently the coefficients represented changes in standard deviation units of the dependent variable (i.e., sexual desire) associated with a standard deviation unit of the predictor variable. Thus, the coefficients are a convenient measure of effect size.
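The standardization step can be sketched as follows. This is an illustration only: the values are hypothetical, and the use of the sample standard deviation computed over all available observations is an assumption, because the text does not specify these details.

```python
from statistics import mean, stdev

def standardize(values):
    """Convert raw scores to z scores (mean 0, sample SD 1)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]

# Hypothetical raw sexual desire scores on the 1-7 scale (not study data).
raw_desire = [2.0, 4.0, 4.0, 5.0, 7.0, 7.0]
z_desire = standardize(raw_desire)
```

Once predictors and outcome are both expressed as z scores, a coefficient of .35, for example, reads as a .35 standard deviation difference in the outcome per standard deviation of the predictor.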

2 Participant sexual orientation was not assessed in Studies 1 and 3.

3 These 14 participants responded to the approach relationship goals questionnaire items with check marks rather than with the 1–7 rating scale, which meant that we were not able to calculate a score for them. The 55 participants who completed the goals measure correctly did not differ significantly from the 14 who did not on the initial measures of relationship duration, relationship satisfaction, or sexual desire. This problem with the goals measure was subsequently rectified in Studies 2 and 3.


Figure 1. Approach relationship goals as a moderator of the intercept and slope of sexual desire (Study 1). Note: The means were estimated with ±1 standard deviation on approach goals.

Before testing our specific hypotheses concerning approach and avoidance relationship goals and sexual desire, we conducted a preliminary analysis to determine if, on average, participants experienced a decline in sexual desire over the course of the study. This analysis included time as a predictor of the intercept and the slope of sexual desire. In this and all subsequent analyses, time was coded such that the first wave of data collection was 0 and the final wave was 13. A significant effect of time on sexual desire, β = −.02, t(66) = −2.52, p = .02, showed that sexual desire decreased significantly over time at a rate of .02 standard deviation units every 2 weeks; this rate of biweekly decline would lead to an annual decline in sexual desire of approximately half a standard deviation (.52 standard deviation units, to be precise). This decline over time in sexual desire mirrors a similar decline in desire in samples of married couples (e.g., Johnson et al., 1994; Klusmann, 2002). Next, we tested the hypothesis that approach relationship goals would moderate the intercept and slope of sexual desire. We predicted that participants with strong approach goals would begin the study higher in sexual desire and would not experience the decline in sexual desire that characterized the sample as a whole. To test this hypothesis, we simultaneously entered time, approach goals, and avoidance goals to predict both the intercept and the slope of sexual desire. The results showed that approach goals predicted the intercept of sexual desire, β = .35, t(625) = 3.17, p < .01, providing support for the hypothesis that participants with strong approach goals would report greater sexual desire at study entry relative to those with weak approach goals. Approach goals also (marginally) moderated the effect of time on sexual desire, β = .014, t(625) = 1.87, p = .06.
The results showed that whereas participants with low approach relationship goals experienced declines in sexual desire over the course of the study, participants with strong approach goals retained relatively high levels of sexual desire over the course of the study. Figure 1 depicts both of these effects. Consistent with our hypotheses, avoidance goals predicted neither the intercept, β = −.17, p = .13, nor the slope, β = −.01, p = .42, of sexual desire. We then conducted two sets of follow-up analyses. In the first analysis, we controlled for relationship satisfaction and duration (both the intercept and slope terms), and approach relationship goals remained significant predictors of both the intercept, β = .28, t(470) = 2.64, p < .01, and slope of sexual desire, β = .015, t(470) = 2.16, p < .05. In the second analysis, we controlled for the frequency with which participants engaged in sexual intercourse across the 14-day study, and approach relationship goals remained significant predictors of both the intercept, β = .34, t(469) = 3.03, p < .01, and slope of sexual desire, β = .02, t(469) = 2.34, p < .01, pointing to the robust nature of these findings.4,5

The final goal of this study was to explore whether the association between approach relationship goals and sexual desire is stronger for women than for men. To examine this possibility, we included six additional terms involving gender (coded as 1 = men and −1 = women) to the primary analysis described above. Specifically, we examined whether gender moderated the intercept and slope effects for sexual desire and whether gender moderated any of the associations of approach or avoidance relationship goals with the intercept and slope of sexual desire. This analysis revealed a significant gender effect on the sexual desire intercept, β = .18, t(591) = 2.04, p = .04, indicating that men reported greater sexual desire than did women at the beginning of the study. There was no significant gender effect on the sexual desire slope, however, β < .01, t(591) = 0.67, p = .51. Gender also moderated the effect of approach goals on the intercept of sexual desire, β = −.33, t(591) = 3.63, p < .001, suggesting that the association between approach goals and sexual desire at study entry was stronger for women than for men. Finally, gender did not significantly moderate the effect of approach goals on the slope of sexual desire, β = −.012, t(591) = −1.43, p = .15. This result suggests that men and women with weak approach goals did not significantly differ in the tendency to experience decreased sexual desire over time, although this nonsignificant effect trended in the direction of approach goals more positively predicting the slope of sexual desire for women than for men.

4 We conducted additional analyses in which we analyzed both of the individual-item dependent measures of sexual desire separately (i.e., sexual desire and enjoyment). When we included both approach and avoidance goals in an equation simultaneously, approach goals significantly predicted the intercepts of both sexual desire, β = .33, t(469) = 3.05, p < .01, and enjoyment, β = .24, t(467) = 2.17, p < .05; further, approach goals marginally predicted the slope of sexual desire, β = .013, t(469) = 1.77, p = .077, and significantly predicted the slope of sexual enjoyment, β = .02, t(467) = 2.96, p = .003, when both the intercept and the slope of relationship satisfaction and duration were controlled. There were no significant associations between avoidance relationship goals and either of the individual-item dependent measures of sexual desire.

5 In all of the studies reported in this article, we tested for interactions between approach and avoidance goals, and none of these effects was significant. Furthermore, once the interaction terms were added, the effects for the intercept and the slope of approach goals (in Study 1) and the effects for approach goals in Studies 2 and 3 remained significant.
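The slope arithmetic behind the time effect and its moderation by approach goals can be checked with a short sketch. It uses only coefficients reported in the Results; the function name is our own, and one simplification is labeled as an assumption: the time slope from the preliminary (time-only) model stands in for the time coefficient of the full model, which the text does not report.

```python
WAVES_PER_YEAR = 26          # one assessment every 2 weeks
time_slope = -0.02           # reported change in desire (SD units) per wave
approach_by_time = 0.014     # reported Approach Goals x Time coefficient

# Annualized decline implied by the biweekly rate.
annual_decline = time_slope * WAVES_PER_YEAR

def simple_slope(approach_z):
    """Per-wave desire slope at a given standardized approach-goal score.

    Assumption: the time-only slope (-.02) approximates the time
    coefficient of the full model, which is not reported in the text.
    """
    return time_slope + approach_by_time * approach_z

slope_strong = simple_slope(1.0)    # 1 SD above the mean on approach goals
slope_weak = simple_slope(-1.0)     # 1 SD below the mean on approach goals
```

Under these assumptions the per-wave slope is about −.006 for participants 1 SD above the mean on approach goals and about −.034 for those 1 SD below it, which matches the pattern described for Figure 1: a shallow decline for strong approach goals and a steeper one for weak approach goals.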

Brief Discussion

Study 1 provided evidence for the two hypotheses linking approach relationship goals and sexual desire. Not only did approach relationship goals predict greater sexual desire at study entry, but having strong approach relationship goals buffered against declines in sexual desire over a 6-month period. Avoidance goals were not significantly associated with sexual desire at the beginning of the study or trajectories of sexual desire over time. Finally, the association between approach relationship goals and sexual desire at the beginning of the study was stronger for women than for men, pointing to the particular importance of goals focused on obtaining positive outcomes in romantic relationships for enhancing women’s sexual desire.

Why do approach relationship goals buffer against declines in sexual desire over time? Study 2 tested the hypothesis that approach relationship goals promote increased sexual desire during daily sexual interactions, given that people who typically pursue approach goals in their relationships may also be highly motivated to pursue shorter term approach goals, such as enhancing intimacy and closeness, during their sexual interactions with a partner (Gable, 2006b). In addition, Study 2 tested approach sexual goals as a possible mediator of the association between approach relationship goals and sexual desire.

Study 2

We tested three main predictions in a 2-week daily experience study of college students in dating relationships: (a) Approach relationship goals would be associated with increased sexual desire in day-to-day sexual interactions, (b) approach sexual goals would mediate the association between approach relationship goals and sexual desire, and (c) avoidance relationship goals would not be significantly associated with daily sexual desire. In addition, as in Study 1, we examined gender as a possible moderator of the association between approach relationship goals and sexual desire, exploring the possibility that the association would be stronger for women than for men.

Method

Participants and Procedure

The study was advertised as an examination of dating relationships, and participants received credit toward psychology coursework at the University of California, Los Angeles, in exchange for participation. To be eligible, participants had to: (a) be currently involved in a dating (not a marital) relationship, (b) see their partner at least 5 days per week (i.e., no long-distance relationships), and (c) be the only member of a given couple to participate in the study. Of the 121 participants (55 men, 66 women) who completed the study, 2 were engaged to be married, and 18 were cohabitating; the mean relationship length for all participants was 18 months (range = 1 month–8 years). Participants ranged in age from 18 to 38 years (M = 20.2, SD = 2.6). The sample was ethnically diverse: 5% were African American, 36% were Asian or Pacific Islander, 15% were Hispanic, 37% were White, and 7% self-identified as multi-ethnic or other. In addition, all participants identified as heterosexual except one gay man, and he was included in the study.

During an initial session, each participant was given 14 surveys, each containing the daily measures, one for each night of the week. A researcher then reviewed the procedures for completion of the daily surveys, specifically emphasizing that participants should begin completing their surveys that evening, that they should complete one survey each night before going to bed (even if they did not engage in sex on that particular day), that their responses were confidential, that they should not discuss their surveys with their partner, and that if they missed a day, they should leave that particular survey blank. To bolster and verify compliance with the daily schedule, we asked participants to return completed surveys every 2–3 days to a locked mailbox located outside the laboratory. As an incentive, each time participants handed in a set of surveys on time, they received a lottery ticket for one of several cash prizes ($100, $50, $25) to be awarded after the study. Participants who did not return a particular set of surveys on time were reminded by phone or e-mail. Only daily surveys returned on time were treated as valid and retained in the data set. In total, participants completed 1,549 daily surveys on time, an average of 12.8 days per person. Ninety percent of the participants completed all 14 daily reports on time.

Background Measures

In their initial session in the laboratory, participants completed a questionnaire with basic demographic information (i.e., gender, age, ethnicity, relationship duration), as well as the same measure of approach and avoidance relationship goals used in Study 1 (Gable, 2006a). They were instructed to answer the questions about their goals for their relationships over the next few months. In the present study, α = .78 for approach social goals and α = .79 for avoidance social goals. The correlation between the two subscales was .57, p < .001. In addition, participants completed a standard 5-item measure of relationship satisfaction (Rusbult, Martz, & Agnew, 1998). Participants responded to such statements as “Our relationship makes me happy” on 9-point scales (0 = do not agree at all, 8 = agree completely). In this sample, α = .89.

Daily Measures

If participants had engaged in sexual intercourse since they had completed the previous day’s survey, they completed measures of sexual desire and sexual goals.⁶

Sexual desire. Each time that they engaged in sexual intercourse, participants answered two questions designed to measure their sexual desire on 7-point scales (1 = very low, 7 = very high). More specifically, they responded to the following two items: “Rate your own level of sexual desire just prior to engaging in sex,” and “Rate your own level of sexual desire during sex.” A composite sexual desire variable was created by averaging the responses to these two questions (α = .64).

Sexual goals. Each time they engaged in sexual intercourse, participants responded to a nine-item measure of sexual goals adapted from Cooper et al. (1998) and used by Impett, Peplau, and Gable (2005). Participants rated the importance of five approach and four avoidance goals in influencing their decision to engage in sex on 7-point scales (1 = not at all important, 7 = extremely important). The approach items were “to pursue my own sexual pleasure,” “to feel good about myself,” “to please my partner,” “to promote intimacy in my relationship,” and “to express love for my partner.” The avoidance items were “to avoid conflict in my relationship,” “to prevent my partner from becoming upset,” “to prevent my partner from getting angry at me,” and “to prevent my partner from losing interest in me.” The within-person correlation between approach and avoidance sexual goals was .03, p = .49. The reliability coefficients were .71 for approach goals and .90 for avoidance goals.
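To make the scoring concrete, the following is a minimal sketch of how a composite score and a Cronbach's alpha reliability coefficient can be computed from item responses. The function names and the ratings are hypothetical illustrations, not the authors' code or data, and the standard alpha formula is assumed.

```python
from statistics import variance


def composite(items):
    """Average the item responses for each observation.

    items: list of k lists, one list per item, aligned by observation.
    """
    return [sum(scores) / len(scores) for scores in zip(*items)]


def cronbach_alpha(items):
    """Cronbach's alpha: k/(k-1) * (1 - sum of item variances / variance of totals)."""
    k = len(items)
    totals = [sum(scores) for scores in zip(*items)]
    item_variance_sum = sum(variance(item) for item in items)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical 7-point ratings for four sexual interactions (illustration only).
desire_prior = [4.0, 6.0, 5.0, 7.0]
desire_during = [5.0, 6.0, 5.0, 7.0]

desire_composite = composite([desire_prior, desire_during])
alpha = cronbach_alpha([desire_prior, desire_during])
```

With only two items, alpha depends entirely on how strongly the two ratings covary, which is one reason short composites often show modest reliability.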

Results

Participants reported a total of 480 sexual interactions. On average, participants reported engaging in sexual intercourse on 4 days during the 2-week study (SD = 2.3; range = 1–10 days). A central goal of this study was to test predictions about the associations between approach and avoidance relationship goals and sexual desire. To address the nonindependence of the data, analyses were performed using multilevel modeling techniques in the hierarchical linear models (HLM) computer program (HLMwin, Version 5.02; Raudenbush, Bryk, Cheong, & Congdon, 2000). Level-1 (i.e., daily) predictors were centered around each individual’s mean across the 14-day study. This technique, known as group-mean centering, accounts for between-person differences in the sample and assesses whether day-to-day changes from a participant’s own mean are associated with changes in the outcome variable, consequently unconfounding between- and within-person effects. As in Study 1, all variables were standardized prior to analyses.
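Group-mean centering can be illustrated with a short sketch. The helper below is a hypothetical illustration in plain Python, not the HLM program's implementation, and the ratings are made up: each daily value is re-expressed as a deviation from that person's own mean.

```python
from collections import defaultdict
from statistics import mean


def group_mean_center(records):
    """Subtract each person's own mean from that person's daily values.

    records: list of (person_id, value) tuples, one per daily report.
    Returns a list of (person_id, centered_value) in the same order.
    """
    by_person = defaultdict(list)
    for person_id, value in records:
        by_person[person_id].append(value)
    person_means = {pid: mean(values) for pid, values in by_person.items()}
    return [(pid, value - person_means[pid]) for pid, value in records]


# Hypothetical daily ratings for two participants (illustration only).
daily = [("p1", 5.0), ("p1", 7.0), ("p2", 2.0), ("p2", 4.0), ("p2", 3.0)]
centered = group_mean_center(daily)
```

After centering, every participant's values average to zero, so a Level-1 coefficient reflects only within-person, day-to-day variation.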

Relationship Goals and Daily Sexual Desire

The first major hypothesis guiding this study was that approach relationship goals would predict increased daily sexual desire. To test this hypothesis, we entered approach and avoidance relationship goals as simultaneous predictors of daily sexual desire. The results showed that approach relationship goals were positively associated with sexual desire, β = .20, t(117) = 2.84, p < .01. In contrast, avoidance goals were not significantly associated with sexual desire, β = −.07, p = .31.⁷

As in Study 1, we then conducted two sets of follow-up analyses. In the first analysis, we controlled for relationship satisfaction and duration, and the association between approach relationship goals and sexual desire remained significant, β = .17, t(115) = 2.23, p < .05. In the second analysis, we controlled for the frequency with which participants engaged in sexual intercourse across the 14-day study, and the association between approach goals and desire also remained significant, β = .20, t(116) = 2.79, p < .01, pointing to the robust nature of these findings.

We also explored gender as a moderator of the association between approach relationship goals and sexual desire. Similar to the way we conducted analyses in Study 1, we simultaneously entered approach relationship goals, avoidance relationship goals, gender, and two interaction terms (Approach Relationship Goals × Gender; Avoidance Relationship Goals × Gender) to predict daily sexual desire. Although there was no main effect of gender, β = −.06, p = .28, the interaction between gender and approach relationship goals significantly predicted sexual desire, β = .21, t(114) = 2.08, p < .05. As shown in Figure 2, the association between approach relationship goals and sexual desire was stronger for women than for

⁶ Although people certainly experience sexual desire in the absence of having sexual contact with their partner, we focused only on days on which participants reported engaging in sexual contact with the partner for several reasons. First, we focused on desire just before a concrete event (e.g., sexual activity) to improve recall and lessen retrospective biases (Reis & Gable, 2000). Second, because there are multiple reasons that partners may not have engaged in sexual activities, some benign or related to circumstance (e.g., schedule, proximity of partner) and some related to self or partner desire (e.g., rebuffed sexual advances), it would be extremely difficult to compare event days to nonevent days. Finally, we wanted to examine goals for each sexual event to capture the full range of goals that participants experienced across days, and we suspected that asking participants to report sexual goals in the absence of sexual activity would have been difficult and would have produced unreliable data.

⁷ We conducted analyses in which we analyzed both of the individual-item dependent measures of sexual desire separately (i.e., sexual desire just prior to engaging in sex and sexual desire during sex). When we included both approach and avoidance goals in an equation simultaneously, approach relationship goals significantly predicted sexual desire just prior to engaging in sex, β = .20, t(117) = 3.23, p < .01, and marginally predicted sexual desire during sex, β = .14, t(117) = 1.69, p < .10. There were no significant associations between avoidance relationship goals and either of the individual-item dependent measures of sexual desire.

Figure 2. Gender as a moderator of the association between approach relationship goals and sexual desire (Study 2).

men.⁸ Neither avoidance relationship goals, β = −.04, p = .68, nor the interaction between gender and avoidance goals, β = −.06, p = .56, significantly predicted daily sexual desire.

Approach Sexual Goals as a Mediator

Another hypothesis was that approach sexual goals would mediate the association between approach relationship goals and sexual desire (see Figure 3). Standard (ordinary least squares [OLS]) hierarchical regression analysis based on the principles of Baron and Kenny (1986) was used to test mediation. Data were aggregated across days such that each person received summary scores for approach sexual goals and sexual desire. The first requirement in demonstrating mediation is that the predictor variable be associated with the outcome variable. Indeed, approach relationship goals were significantly associated with sexual desire (r = .24, p < .01). The second requirement is to show that approach relationship goals predict the putative mediator, approach sexual goals; indeed they did (r = .45, p < .001). The third requirement is that the mediator predicts the outcome variable (i.e., sexual desire) after the predictor variable is controlled and that this effect could plausibly account for the direct effects between the predictor and the outcome variable. Approach sexual goals significantly predicted sexual desire, β = .33, p < .01, and the direct effect from approach relationship goals to sexual desire dropped to nonsignificance, β = .09, p = .36. A significant Sobel (1982) test indicated that the drop in the value of the latter beta was significant (z = 2.90, p < .01), providing evidence for mediation. In other words, participants with strong approach relationship goals also tended to engage in sexual activity to pursue positive outcomes, in turn promoting greater daily sexual desire.

Brief Discussion

Study 2 replicated and extended the findings from Study 1 in several important ways. First, the results showed that approach relationship goals promoted greater sexual desire during day-to-day sexual interactions. Second, Study 2 demonstrated that approach sexual goals may be an important mechanism by which approach relationship goals promote sexual desire. That is, individuals who are generally oriented toward promoting positive experiences in their relationships also engage in sex to pursue positive outcomes such as a partner’s happiness or increased intimacy. Approach sexual goals were, in turn, associated with sexual desire. Third, this study showed that the association between approach relationship goals and sexual desire was stronger for women than for men, providing further evidence that women’s sexual desire is more closely tied to relationship dynamics than is men’s sexual desire. Fourth, this study showed that avoidance goals were not significantly associated with daily sexual desire.

Study 3

Study 3 was another daily experience study of college students in dating relationships, but Study 3 differed from Study 2 in three

Figure 3. Approach sexual goals as a mediator between approach relationship goals and sexual desire (Study 2). Note: All numbers are ordinary least squares regression coefficients. *p < .05. **p < .01. ***p < .001.

⁸ In addition to testing for interactions with participant gender, we also tested for interactions with ethnicity by creating two dummy-coded variables: 1 = White versus not White, and 2 = Asian versus not Asian. When each of these dummy-coded variables was (separately) added as a covariate, approach relationship goals remained a significant predictor of daily sexual desire. More important, we also created interaction terms between approach goals and both of the dummy-coded variables. Neither of these interaction terms was associated with daily sexual desire. We also tested for interactions with ethnicity in Study 3 using the same strategy; none of the interactions was significant.

important ways. First, the approach relationship goals measure was replaced with a more general measure of approach social goals, allowing us to determine whether the effects were specific to a measure of romantic relationships. Second, Study 3 included a longer, more refined measure of sexual goals that enabled us to determine the relative contributions of both self-focused sexual goals (e.g., “to pursue my own sexual pleasure”) and other-focused sexual goals (e.g., “to please my partner”) to daily sexual desire. Third, Study 3 also included measures of positive and negative relationship events to enable us to examine how people’s perceptions of the daily relationship climate relate to their levels of sexual desire. Each day poses an opportunity for positive events (e.g., partners compliment each other, express their love, or do fun things together) as well as negative events (e.g., they criticize, disagree, or give each other the silent treatment). On the basis of previous research showing a link between relationship satisfaction and increased sexual desire (e.g., Brezsnyak & Whisman, 2004; Sprecher, 2002), we predicted that individuals would experience greater sexual desire on days with more frequent positive events and also on days with less frequent negative events in their relationships. We also predicted that approach social goals would moderate these associations, such that people with strong approach goals would experience even greater sexual desire on days with many positive events because the approach system is sensitive to the presence and absence of positive goal-relevant events (Gable et al., 2000). This prediction is also consistent with work on the upward spiral effect of positive emotions (Fredrickson & Joiner, 2002) and on the role of positive-arousing activities in relationship satisfaction and passionate love (Aron, Norman, Aron, McKenna, & Heyman, 2000). 
Finally, as in Studies 1 and 2, we explored gender as a moderator of the association between approach social goals and sexual desire.

Method

Participants and Procedure

The study was advertised as an examination of “relationships, sexuality, and health,” and participants received credit toward psychology coursework at the University of California, Los Angeles, in exchange for participation. Participants were told that the study was about daily events in relationships, including sexual interactions. To be eligible, participants had to (a) be currently involved in a dating relationship, (b) be sexually active with their partner, (c) see their partner at least 5 days per week (i.e., no long-distance relationships), and (d) be the only member of a given couple to participate in the study. Ninety participants (60 women, 29 men, 1 did not report gender) completed the study. Twelve of the participants did not engage in sexual intercourse during the study; therefore, the final sample consisted of the remaining 77 participants (55 women, 22 men). Two of the participants were married, and 8 were cohabitating; the mean relationship length for all participants was 21 months. Participants ranged in age from 17 to 44 years (M = 20.3, SD = 3.6).⁹ The sample was ethnically diverse: 4% were African American, 32% were Asian or Pacific Islander, 35% were White, 26% identified as multi-ethnic or other, and 3% did not report their ethnicity.

During an initial session, participants were given instructions about how to complete an online survey by logging on to a secure server each day. The daily survey was posted on a Web site, and participants were given a login name and password to use each time they entered the site. Participants were asked to complete the survey at the beginning of each day for 14 consecutive days. The survey asked about the previous day’s relationship and sexual activities. Participants were instructed to complete the survey by 1 p.m. each day. The date and time of survey completion were automatically recorded by the Web site, and research assistants checked this log each morning and e-mailed reminders to participants who had not yet completed their daily surveys. Only surveys completed on time were accepted and included in the data analyses. As an incentive for on-time completion of surveys, participants who completed between 11 and 14 diaries (N = 71) were entered into a lottery drawing for $100. Participants completed a total of 1,182 daily surveys on time, an average of 13 days per person. Ninety percent of participants completed all their surveys on time.

Background Measures

In their initial session in the laboratory, participants completed a questionnaire with basic demographic information (i.e., gender, age, ethnicity, relationship duration), as well as a measure of approach and avoidance social goals (Elliot et al., 2006). Participants responded to four approach statements (e.g., “I will be trying to move toward growth and development in my friendships,” and “I will be trying to deepen my relationship with my friends”) and four avoidance statements (e.g., “I will be trying to make sure nothing bad happens to my close relationships,” and “I will be trying to avoid getting embarrassed, betrayed, or hurt by any of my friends”) on 7-point scales (1 = not at all true of me, 7 = very true of me). They were instructed to answer the questions about their goals for their relationships over the next few months. In the present study, α = .78 for approach social goals and α = .79 for avoidance social goals. The correlation between the two subscales was .36, p < .05. Relationship satisfaction was also assessed using the same measure as in Study 2 (Rusbult et al., 1998; α = .94).

Daily Measures

If participants had engaged in sexual intercourse since they had completed their previous day’s survey, they completed measures of sexual desire and sexual goals.

Sexual desire. Each time that they engaged in sexual intercourse, participants answered three questions designed to measure their sexual desire on 5-point scales (1 = not at all, 5 = very much). The questions were: “How much did you want to have sex?” “How much did you enjoy the sexual experience?” and “How sexually aroused were you during this sexual experience?” A composite variable called sexual desire was created by averaging the responses to these three questions (α = .95).

Sexual goals. Cooper et al.’s (1998) sexual motivation scale was used to measure approach and avoidance sexual goals. The scale consists of 29 items and was modified to assess the participants’ most recent sexual experience. This measure categorizes

⁹ The 44-year-old participant was an outlier in terms of age. All analyses yielded identical conclusions when this person was excluded.

sexual goals using the approach/avoidance distinction as well as a self-focused/other-focused distinction. These two dimensions are crossed to yield four categories of goals (six discrete goals) for engaging in sex: (a) approach self-focused goals (e.g., “I have sex because it feels good” [Enhancement]), (b) approach other-focused goals (e.g., “I have sex to feel emotionally close to my partner” [Intimacy]), (c) avoidance self-focused goals (e.g., “I have sex to reassure myself that I am attractive” [Self-Affirmation], and “I have sex to help me deal with disappointments in my life” [Coping]), and (d) avoidance other-focused goals (e.g., “I have sex because I don’t want my partner to be angry with me” [Partner Approval], and “I have sex just because all of my friends are having sex” [Peer Approval]). The reliability coefficients for Enhancement, Intimacy, Self-Affirmation, Coping, Partner Approval, and Peer Approval were .90, .93, .84, .91, .84, and .78, respectively.

Positive and negative relationship events. Participants completed measures of positive and negative events adapted from previous research (Gable, Reis, & Downey, 2003). Each day, participants indicated whether they experienced each of nine positive relationship events and nine negative relationship events. Positive event items included: “My partner told me that he/she loves me,” “My partner and I participated in an activity that I really enjoy,” “During a discussion, I felt understood and appreciated by my partner,” “My partner did something that made me feel wanted,” “My partner and I did something fun,” “My partner did something special for me,” “My partner complimented me,” “My partner made me laugh,” and “My partner and I talked about making our relationship more serious or committed.” Negative event items included: “My partner and I had a minor disagreement,” “My partner was inattentive and unresponsive to me,” “My partner tried to control what I did,” “We had a major disagreement,” “My partner’s behavior made me question his or her commitment to me,” “My partner criticized me,” “My partner went out with his/her friends instead of spending time with me,” “My partner did something that made me feel irritated or angry,” and “My partner gave me the silent treatment.” Responses to these questions were summed to create separate indices of the total number of positive events and the total number of negative events that participants experienced in their relationships each day.
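The daily event indices described above amount to counting endorsed items. A minimal sketch, assuming each day's responses are stored as nine yes/no values per checklist; the function name and the responses are hypothetical, not the authors' materials:

```python
def daily_event_counts(positive_checks, negative_checks):
    """Return (number of positive events, number of negative events) for one day.

    Each argument is a list of booleans, one per checklist item
    (nine positive items and nine negative items in the measure described).
    """
    return sum(positive_checks), sum(negative_checks)


# Hypothetical responses for one participant on one day (illustration only).
positive = [True, True, False, True, True, False, False, True, False]
negative = [True, False, False, False, False, False, False, False, False]

n_positive, n_negative = daily_event_counts(positive, negative)
```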

Results

Participants reported a total of 283 sexual interactions. On average, participants reported engaging in sexual intercourse on 3.4 days during the 2-week study (SD = 2.0; range = 1–14 days). As in Study 2, the data set was hierarchically nested, with days nested within persons. Multilevel modeling in the HLM computer program (HLMwin, Version 5.02; Raudenbush et al., 2000) was used to examine the hypotheses linking social goals, sexual goals, relationship events, and sexual desire. Level-1 (i.e., daily) predictors were centered around each individual’s mean across the 14-day study, enabling us to determine whether day-to-day changes from a participant’s own mean were associated with changes in the outcome variable. As in Studies 1 and 2, all variables were standardized prior to analyses.

Social Goals and Daily Sexual Desire

As in Studies 1 and 2, we predicted that approach social goals would predict increased daily sexual desire. To test this hypothesis, we entered approach and avoidance social goals as simultaneous predictors of daily sexual desire. The results showed that approach social goals were positively associated with sexual desire, β = .19, t(69) = 2.50, p < .05. In contrast, avoidance goals were not associated with sexual desire, β = −.01, t(69) = .71, p = .84.¹⁰

As in Studies 1 and 2, we then conducted two sets of follow-up analyses. In the first analysis, we controlled for relationship satisfaction and duration, and the association between approach relationship goals and sexual desire remained significant, β = .17, t(67) = 2.08, p < .05. In the second analysis, we controlled for the frequency with which participants engaged in sexual intercourse across the 14-day study, and the association between approach goals and desire also remained significant, β = .19, t(67) = 2.22, p < .05, pointing to the robust nature of these findings.

As in Studies 1 and 2, we explored gender as a moderator of the association between approach social goals and sexual desire. As in those studies, approach social goals, avoidance social goals, gender, and two interaction terms (Approach Goals × Gender; Avoidance Goals × Gender) were used to predict sexual desire. Neither interaction term reached significance (Approach × Gender: β = −.05, p = .76; Avoidance × Gender: β = .14, p = .65). This result suggests that the gender effects found in Studies 1 and 2 may be specific to approach goals in romantic relationships, not in social relationships in general.

Sexual Goals and Daily Sexual Desire

The second major goal of this study was to determine which specific sexual goals were associated with daily sexual desire. Although we were primarily interested in the distinction between self-focused and other-focused approach sexual goals, we also examined the four different measures of avoidance sexual goals. Therefore, we simultaneously entered all six types of sexual goals (Enhancement, Intimacy, Self-Affirmation, Coping, Partner Approval, and Peer Approval) as well as the control variables (relationship duration, relationship satisfaction, and sexual frequency) to predict daily sexual desire. Table 1 displays the results of this analysis. When all six sexual goals were entered simultaneously, both of the measures of approach sexual goals (i.e., Enhancement and Intimacy) significantly predicted daily sexual desire. On days when participants engaged in sexual intercourse more often to pursue positive outcomes either for themselves (i.e., for enhancement goals) or for their relationships (i.e., for intimacy goals), they reported increased sexual desire. Furthermore, these associations remained significant even after we controlled for relationship satisfaction, relationship duration, and frequency of sexual intercourse over the course of the 14-day study. In contrast, two of the measures of avoidance sexual goals (i.e., Coping and Peer Approval) were not significantly associated with sexual desire, and

¹⁰ We conducted additional analyses in which we analyzed each of the three sexual desire items separately (i.e., wanting sex, enjoying sex, being sexually aroused). When we included both approach and avoidance goals simultaneously, approach social goals significantly predicted daily arousal, β = .18, t(69) = 2.39, p < .05, and desire for sex, β = .15, t(69) = 2.04, p < .05, and marginally predicted daily enjoyment, β = .15, t(69) = 1.91, p < .10. There were no significant associations between avoidance social goals and any of the individual-item dependent measures of sexual desire.

Table 1
Associations Between Sexual Goals and Daily Sexual Desire in Study 3

Outcome: daily sexual desire

Variable                          β          t
Approach sexual goals
  Enhancement goals               .65***     7.87ᵃ
  Intimacy goals                  .21**      2.95ᵃ
Avoidance sexual goals
  Self-affirmation goals         −.12†      −1.74ᵃ
  Coping goals                    .04        0.86ᵃ
  Partner approval goals         −.11†      −1.24ᵃ
  Peer approval goals             .02        0.11ᵃ
Control variables
  Relationship duration           .01        0.08ᵇ
  Relationship satisfaction       .12*       1.88ᵇ
  Sexual frequency               −.17*      −1.85ᵇ

Note. All numbers are standardized hierarchical linear model coefficients.
ᵃ df = 244. ᵇ df = 68.
† p < .10. * p < .05. ** p < .01. *** p < .001.

the other two measures of avoidance sexual goals (i.e., Self-Affirmation and Partner Approval) were marginally negatively associated with sexual desire.

Approach Sexual Goals as a Mediator

Another hypothesis, supported in Study 2, was that approach sexual goals would mediate the association between approach social goals and sexual desire. Study 3 used a more detailed measure of sexual goals than that included in Study 2, with many items distinguishing between self-focused (i.e., Enhancement) and other-focused (i.e., Intimacy) approach sexual goals. Before testing for mediation, we examined associations between approach social goals and both types of approach sexual goals. Approach social goals were not associated with enhancement sexual goals, β = .04, p = .72, but were associated with intimacy sexual goals, β = .40, t(69) = 4.43, p < .001. Therefore, in the following analyses, we examined intimacy sexual goals as a mediator of the association between approach social goals and sexual desire. We used the aggregated data and the composite measure of sexual desire.

As in Study 2, standard (OLS) hierarchical regression analysis based on the principles of Baron and Kenny (1986) was used to test mediation. Approach social goals were marginally associated with sexual desire (r = .20, p = .08). Approach social goals were significantly associated with intimacy sexual goals (r = .42, p < .001). Finally, intimacy sexual goals significantly predicted sexual desire after we controlled for approach social goals, β = .39, p < .01, and the marginally significant direct effect from approach social goals to sexual desire dropped to nonsignificance, β = .04, p = .76. A significant Sobel (1982) test indicated that the drop in the value of the betas was significant (z = 2.36, p < .05), providing evidence for mediation. The pattern of results is similar to the one found in Study 2, which is displayed in Figure 3.
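For readers unfamiliar with the Sobel test, the following is a minimal sketch of the first-order Sobel statistic. It requires the unstandardized path coefficients and their standard errors, which the text does not report, so the inputs below are hypothetical and the sketch does not attempt to reproduce the z values given above. Here a is the path from predictor to mediator and b is the path from mediator to outcome with the predictor controlled.

```python
import math


def sobel_z(a, se_a, b, se_b):
    """First-order Sobel statistic for the indirect effect a * b."""
    return (a * b) / math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)


def two_tailed_p(z):
    """Two-tailed p value for z under the standard normal distribution."""
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical path estimates and standard errors (not values from the study).
z = sobel_z(a=0.40, se_a=0.10, b=0.30, se_b=0.10)  # approximately 2.4
p = two_tailed_p(z)
```

A significant z indicates that the indirect effect a * b differs from zero, which is the sense in which the drop in the direct effect is described as significant.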

Relationship Events and Daily Sexual Desire

A third aim of this study was to examine whether daily relationship events predicted daily sexual desire and whether approach social goals moderated these associations. Participants reported an average of 4.23 positive events and 1.25 negative events each day. The most common positive events included “my partner told me that he/she loves me” and “my partner made me laugh.” The most common negative events included “my partner did something that made me feel irritated or angry” and “my partner and I had a minor disagreement.” As predicted, on days when participants reported more frequent positive events (than their own average across the 14-day study), they reported significantly greater sexual desire, β = .26, t(249) = 2.69, p < .01. On days when participants reported more frequent negative events, they reported significantly less sexual desire, β = −.12, t(249) = −2.34, p < .05.

We further predicted that approach social goals would moderate the association between daily positive relationship events and sexual desire. To test this hypothesis, we predicted daily sexual desire from positive events at Level 1. At Level 2, we included approach social goals (grand mean centered) as a predictor of both the intercept of sexual desire and the slope of positive events with sexual desire. A similar model was used for negative events. For positive events, approach social goals were a marginally significant predictor of the slope between sexual desire and positive events, β = .18, t(247) = 1.83, p = .07, such that compared with those with weak approach goals, individuals with strong approach goals experienced a marginally greater increase in sexual desire on days when they reported more positive events (see Figure 4). When relationship duration, sexual frequency, and relationship satisfaction were added as covariates to Level 2, this effect remained marginally significant, β = .16, t(244) = 1.65, p < .10.

We also tested whether approach social goals moderated the association between negative events and desire. Approach social goals were a significant predictor of the slope between sexual desire and negative events, β = .12, t(247) = 2.14, p < .05, such that compared with those with weak approach goals, individuals with strong approach goals experienced less of a decrease in sexual desire on days when they reported more negative events (see Figure 5). When relationship duration, sexual frequency, and relationship satisfaction were added as covariates to Level 2, this effect remained significant, β = .12, t(244) = 2.08, p < .05. Additional analyses conducted to determine whether the interactions between approach goals and positive and negative events were further moderated by gender revealed no significant effects.
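The moderation model described in this section can be written out as a two-level system. The notation below is an assumed, conventional one (i indexes days, j indexes persons), and the random slope term u_{1j} is an assumption because the text does not state whether slopes were allowed to vary randomly. The model for negative events has the same form with negative events in place of positive events.

```latex
\begin{align*}
\text{Level 1:}\quad & \text{Desire}_{ij} = b_{0j} + b_{1j}\,\bigl(\text{Pos}_{ij} - \overline{\text{Pos}}_{j}\bigr) + e_{ij} \\
\text{Level 2:}\quad & b_{0j} = \gamma_{00} + \gamma_{01}\,\bigl(\text{Approach}_{j} - \overline{\text{Approach}}\bigr) + u_{0j} \\
                     & b_{1j} = \gamma_{10} + \gamma_{11}\,\bigl(\text{Approach}_{j} - \overline{\text{Approach}}\bigr) + u_{1j}
\end{align*}
```

In this notation, the Level-1 predictor is group-mean centered and the Level-2 predictor is grand-mean centered, as described in the text. The cross-level coefficient \gamma_{11} is the moderation effect: it captures how the within-person slope of desire on daily events changes with a person's approach social goals.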

Brief Discussion

Study 3 extended the results of the previous two studies in several important ways. First, it showed that approach social goals measured more generally predict daily sexual desire. Second, it extended the findings from Study 2 by showing that approach sexual goals that focus on the self (e.g., to pursue one’s own sexual pleasure) and approach sexual goals that focus on the partner/relationship (e.g., to please one’s partner or enhance intimacy) were both associated with increased sexual desire. Third, it revealed that other-focused approach sexual goals (intimacy goals) mediated the association between approach social goals and daily sexual desire. Fourth, it replicated the results of Studies 1 and 2

APPROACH GOALS AND SEXUAL DESIRE

819

Figure 4. Approach social goals as a moderator of the association between positive events and sexual desire (Study 3).

showing that avoidance social and sexual goals were not significantly associated with daily sexual desire. Finally, it showed that relationship events are an important moderator of the link between approach social goals and daily sexual desire. More specifically, people with strong approach goals experienced even greater sexual desire on days that they reported many positive events and less of a decrease in sexual desire on days that they reported many negative events than individuals with weak approach social goals.

General Discussion

Numerous studies have documented the importance of sexual desire in promoting satisfaction and stability in long-term relationships (e.g., Yeh et al., 2006). Many individuals report that their own or a partner’s low sexual desire creates problems for their relationships (Laumann et al., 1994), and some couples seek sex therapy in order to deal with one or both partners’ lack of sexual desire (McCarthy, 1999).

The three studies described in this article provide converging support for the importance of approach goals in enabling dating couples to maintain high levels of sexual desire. Study 1 showed that the adoption of approach relationship goals buffered against declines in sexual desire over a 6-month period in relationships. Whereas people with weak approach goals (i.e., those with little interest in pursuing growth, fun, and development in their relationships) experienced declines in sexual desire over the course of the 6-month study, people with strong approach goals (i.e., those who possessed a great deal of interest in pursuing positive outcomes in their relationships) maintained high levels of sexual desire over the course of the study. Study 2 showed that approach relationship goals predicted elevated sexual desire during daily sexual interactions and that this association was mediated by approach sexual goals. That is, people who are generally oriented toward creating positive outcomes in their relationships may view sexual interactions as one way to create closeness and intimacy, and their approach sexual goals may, in turn, predict greater desire during daily sexual interactions. Finally, Study 3 showed that approach sexual goals that focus on the self (e.g., to pursue one’s own sexual pleasure) and approach sexual goals that focus on others (e.g., to please one’s partner or to enhance intimacy) were both associated with daily sexual desire. Moreover, Study 3 showed that people with strong approach goals experienced even greater sexual desire on days that they reported many positive events. This effect is consistent with previous research that has shown that strong approach tendencies are associated with even greater increases in approach behaviors when signals of movement toward the goal (i.e., gains) are experienced (Förster, Higgins, & Idson, 1998). In addition, approach goals seemed to buffer against the deleterious effect that negative relationship events had on sexual desire, such that individuals with strong approach goals reported less of a decrease in desire on days that they reported many negative events than individuals with weak approach goals. Although we did not specifically predict this finding, it is consistent with previous work that has found that approach goals are associated with interpreting ambiguous or neutral information in a positive manner (Strachman & Gable, 2006). Thus, those with strong approach goals may have reframed negative events more positively, which may have attenuated the association between negative events and sexual desire.

Figure 5. Approach social goals as a moderator of the association between negative events and sexual desire (Study 3).

The current studies used a combination of longitudinal and daily experience methods to examine the link between approach goals and sexual desire. In Study 1, participants provided biweekly assessments of sexual desire, enabling us to examine the influence of approach goals measured early in relationships on the maintenance of sexual desire over a 6-month period. In Studies 2 and 3, participants reported their sexual desire daily, enabling us to obtain accurate, day-level accounts of desire.
The use of a daily experience method enabled us to study relationship processes within the context of daily life in a way that is not possible with more traditional, cross-sectional designs (Bolger, Davis, & Rafaeli, 2003).

These studies contribute to a growing body of research demonstrating the utility of approach–avoidance models of motivation in understanding a broad range of phenomena in everyday life (e.g., Elliot & Sheldon, 1997; Gable et al., 2000). More specifically, the current studies are part of an emerging area of research that focuses on motivation and close relationships (Gable, 2006b; Gable & Strachman, 2007; Impett, Gable, & Peplau, 2005). Previous research guided by Gable’s (2006b) model of social motivation has shown that approach (but not avoidance) motives and goals are associated with positive outcomes including positive emotions and relationship satisfaction (Gable, 2006a, 2006b; Impett, Gable, & Peplau, 2005). Our results show that sensitivity to positive relationship processes, such as intimacy, growth, and fun, has important implications for close relationships that are independent of sensitivity to negative processes, such as conflict and rejection. It is particularly important to note that there are far fewer studies focusing on the role of positive processes than those focused on negative processes in the field of close relationships, reflecting possible empirical and theoretical oversights (Gable & Haidt, 2005; Reis & Gable, 2003).

Limitations and Future Directions

The results of these studies provide several interesting directions for future research. First, similar to most of the available research on sexual desire, the current research focused on only one member of the couple. Oftentimes, the assessment of sexual desire in couples is relative; that is, people perceive that their sexual desire is either too low or too high only after comparing their own desire with their partner’s desire (Davies, Katz, & Jackson, 1999; Ellison, 2001). Future research should obtain sexual desire reports from both members of dating or married couples. We also measured the relationship and sexual goals of only one member of the couple; however, goals in relationships differ from goals in other domains such as achievement or other life tasks in that they involve coordinating with another person who has his or her own goals. For example, what are the implications for sexual functioning in a relationship if one partner has strong approach goals and the other partner has weak approach goals? Future research should examine the joint contribution of both partners’ goals to both partners’ sexual desire, sampling the partners’ feelings at specific moments in their daily lives as well as over longer periods of time.

Most of the participants in these studies were college students in relatively new relationships in which sexual desire may have been near its peak. It is possible that the effect of relationship goals on desire might be even greater in relationships of greater duration and commitment, such as in married couples. It is also possible that approach goals fail to promote sexual desire in long-term couples who have already experienced steep declines in sexual desire. Future research focusing on relationships of greater duration and commitment is needed to examine these ideas.

Another important direction for future research is to examine the benefits of adopting approach goals for other aspects of relationships in addition to sexuality.
Desires to pursue growth and development in relationships may also be associated with other positive behaviors and processes such as relationship commitment (Strachman & Gable, 2006), willingness to sacrifice (Impett, Gable, & Peplau, 2005), and willingness to forgive a partner’s wrongdoings (Finkel, Rusbult, Kumashiro, & Hannon, 2002).

Another limitation stemmed from the fact that participants’ sexual desire was only assessed on days when they engaged in sexual intercourse (in Studies 2 and 3). Individuals sometimes choose to engage in sexual activity in order to please the partner or to avoid conflict rather than out of personal sexual interest (Levine, 2002). Indeed, research has shown that both men and women report having engaged in sexual behavior in the absence of desire (see review by Impett & Peplau, 2003), suggesting that the experience of sexual desire does not entirely overlap with sexual behavior. An interesting direction for future research would be to assess sexual desire both on days that couples engage in sexual activity and on days that sexual activity does not occur. Future research would also benefit from the use of a measure of sexual desire that distinguishes between desire in solitary and dyadic contexts (e.g., Spector, Carey, & Steinberg, 1996).

Finally, this research was centrally concerned with the motivational component of the human sexual response (i.e., sexual desire), but the measures used in each of the studies included related sexual constructs such as sexual enjoyment and arousal. Although auxiliary analyses using individual items (i.e., sexual arousal, desire, and enjoyment) did not change the pattern of results, it will be important for future research to include more nuanced measures to capture possibly meaningful distinctions among these interrelated sexual constructs.

Although our theoretical framework proposes that motivation influences sexual desire, our data do not provide a definitive test of this direction of causality.
It is also possible that experiencing high levels of sexual desire causes people to pursue approach goals in their relationships. Future research in which both approach and avoidance goals are experimentally manipulated (Strachman & Gable, 2006) would provide a more refined explanation of the findings reported in the current studies. Finally, while the results of the two daily experience studies suggest that the association between approach goals and sexual desire applies equally well to White and Asian participants, it will be important for future research to replicate these effects in a sample of greater racial/ethnic diversity and in non-Western cultures.

Implications

The results from three studies document the importance of approach goals for predicting elevated levels of sexual desire on a daily basis and maintaining desire over time. Unfortunately, the current study cannot address the question of whether it is possible for people with chronically low levels of approach goals to learn to focus on the positive things to be experienced in their relationships. Nevertheless, it is important to note that, by definition, goals are short-term cognitive representations of wants and fears that should be malleable and sensitive to situational cues (Gable, 2006b). Moreover, previous research has shown that goals can be experimentally manipulated in the achievement domain (e.g., Elliot & Harackiewicz, 1996) and in the highly similar area of regulatory focus research (e.g., Shah, Higgins, & Friedman, 1998). Experiments that change relationship goals in situ have yet to be conducted, but on the basis of theory and previous experimental research, we expect that it is possible for people’s goals in their relationships to change over time.

This research also points to the central importance of considering the role of gender in understanding sexual desire in intimate relationships. Approach goals were a stronger predictor of sexual desire for women than for men in the two studies that used a measure of approach goals in romantic relationships. Thus, the results of this study support a growing body of research that demonstrates the importance of relationship dynamics for women’s sexual desire relative to men’s (e.g., Basson, 2002, 2006; Peplau, 2003). In recent years, publicity about treatments for men’s erection problems focused attention on women’s sexuality and provoked a competitive commercial hunt for “the female Viagra” (Tiefer, 2002). This hunt reflects a general medicalization of sexuality; many people are trying to find pharmaceutical “solutions” to what in some cases may be relationship problems.
Because women’s sexual desire is much more closely tied to their goals in relationships than men’s sexual desire, attempts to boost women’s sexual desire through pharmacological intervention may be misguided. The results of this research highlight the importance of considering the interpersonal aspects of relationships when thinking about how to treat problems of low sexual desire—for women but also for men.

Concluding Comments

While a wealth of research has documented important links between sexual desire and relationship quality in intimate relationships (e.g., Regan, 2000; Sprecher, 2002; Yeh et al., 2006), much less research has investigated factors that may help promote and sustain sexual desire in relationships over time. In this article, three studies documented the utility of approach–avoidance motivational theory as well as the important roles of both approach relationship and sexual goals in helping individuals to maintain high levels of sexual desire.

References

Aron, A., Norman, C. C., Aron, E. N., McKenna, C., & Heyman, R. E. (2000). Couples’ shared participation in novel and arousing activities and experienced relationship quality. Journal of Personality and Social Psychology, 78, 273–284.
Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.
Basson, R. (2002). Women’s sexual desire: Disordered or misunderstood? Journal of Sex and Marital Therapy, 28, 17–28.
Basson, R. (2006). Sexual desire and arousal disorders in women. New England Journal of Medicine, 354, 1497–1506.
Basson, R., Leiblum, S., Brotto, L., Derogatis, L., Fourcroy, J., Fugl-Meyer, K., et al. (2004). Revised definitions of women’s sexual dysfunctions. Journal of Sexual Medicine, 1, 40–48.
Baumeister, R. F., Catanese, K. R., & Vohs, K. D. (2001). Is there a gender difference in strength of sex drive? Theoretical views, conceptual distinctions, and a review of relevant evidence. Personality and Social Psychology Review, 5, 242–273.
Beck, J. G. (1995). Hypoactive sexual desire disorder: An overview. Journal of Consulting and Clinical Psychology, 63, 919–927.
Beck, J. G., Bozman, A. W., & Qualtrough, T. (1991). The experience of sexual desire: Psychological correlates in a college sample. Journal of Sex Research, 28, 443–456.
Bolger, N., Davis, A., & Rafaeli, E. (2003). Diary methods: Capturing life as it is lived. Annual Review of Psychology, 54, 579–616.
Boyatzis, R. E. (1973). Affiliation motivation. In D. C. McClelland & R. S. Steele (Eds.), Human motivation: A book of readings (pp. 252–276). Morristown, NJ: General Learning Press.
Brezsnyak, M., & Whisman, M. A. (2004). Sexual desire and relationship functioning: The effects of marital satisfaction and power. Journal of Sex and Marital Therapy, 30, 199–217.
Carver, C. S., Sutton, S. K., & Scheier, M. F. (2000). Action, emotion, and personality: Emerging conceptual integration. Personality and Social Psychology Bulletin, 26, 741–751.
Carver, C. S., & White, T. L. (1994). Behavioral inhibition, behavioral activation, and affective responses to impending reward and punishment: The BIS/BAS scales. Journal of Personality and Social Psychology, 67, 319–333.
Clayton, A. H., Segraves, R. T., Leiblum, S., Basson, R., Pyke, R., Cotton, D., et al. (2006). Reliability and validity of the Sexual Interest and Desire Inventory—Female (SIDI–F), a scale designed to measure severity of female hypoactive sexual desire disorder. Journal of Sex and Marital Therapy, 32, 115–135.
Cooper, M. L., Shapiro, C. M., & Powers, A. M. (1998). Motivations for sex and risky sexual behavior among adolescents and young adults: A functional perspective. Journal of Personality and Social Psychology, 75, 1528–1558.
Davies, S., Katz, J., & Jackson, J. L. (1999). Sexual desire discrepancies: Effects on sexual and relationship satisfaction in heterosexual dating couples. Archives of Sexual Behavior, 28, 553–567.
Diamond, L. M. (2004). Emerging perspectives on distinctions between romantic love and sexual desire. Current Directions in Psychological Science, 13, 116–119.
Dowson, M., & McInerney, D. M. (2003). What do students say about their motivational goals? Towards a more complex and dynamic perspective on student motivation. Contemporary Educational Psychology, 28, 91–113.


Elliot, A., & Harackiewicz, J. M. (1996). Approach and avoidance achievement goals and intrinsic motivation: A mediational analysis. Journal of Personality and Social Psychology, 70, 461–475.
Elliot, A. J., & Covington, M. V. (2001). Approach and avoidance motivation. Educational Psychology Review, 13, 73–92.
Elliot, A. J., Gable, S. L., & Mapes, R. R. (2006). Approach and avoidance motivation in the social domain. Personality and Social Psychology Bulletin, 32, 378–391.
Elliot, A. J., & Sheldon, K. M. (1997). Avoidance achievement motivation: A personal goals analysis. Journal of Personality and Social Psychology, 73, 171–185.
Ellison, C. R. (2001). A research inquiry into some American women’s sexual concerns and problems. Women and Therapy, 24, 147–159.
Finkel, E. J., Rusbult, C. E., Kumashiro, M., & Hannon, P. A. (2002). Dealing with betrayal in close relationships: Does commitment promote forgiveness? Journal of Personality and Social Psychology, 82, 956–974.
Förster, J., Higgins, E. T., & Idson, L. C. (1998). Approach and avoidance strength during goal attainment: Regulatory focus and the “goal looms larger” effect. Journal of Personality and Social Psychology, 75, 1115–1131.
Fredrickson, B. L., & Joiner, T. (2002). Positive emotions trigger upward spirals toward emotional well-being. Psychological Science, 13, 172–175.
Gable, S. L. (2006a). Approach and avoidance relationship goals. Unpublished manuscript, University of California, Los Angeles.
Gable, S. L. (2006b). Approach and avoidance social motives and goals. Journal of Personality, 74, 175–222.
Gable, S. L., & Haidt, J. (2005). What (and why) is positive psychology? Review of General Psychology, 9, 103–110.
Gable, S. L., Reis, H. T., & Downey, G. (2003). He said, she said: A quasi-signal detection analysis of daily interaction between close relationship partners. Psychological Science, 14, 100–105.
Gable, S. L., Reis, H. T., & Elliot, A. J. (2000). Behavioral activation and inhibition in everyday life. Journal of Personality and Social Psychology, 78, 1135–1149.
Gable, S. L., & Strachman, A. (2007). Approaching social rewards and avoiding social punishments: Appetitive and aversive social motivation. In J. Shah & W. Gardner (Eds.), Handbook of motivation science (pp. 561–575). New York: Guilford.
Goldstein, A., & Brandon, M. (2004). Reclaiming desire: Four keys to finding your lost libido. New York: Rodale.
Gray, J. (1987). The psychology of fear and stress (2nd ed.). New York: Cambridge.
Gray, J. A. (1990). Brain systems that mediate both emotion and cognition. Cognition and Emotion, 4, 269–288.
Hawton, K., Catalan, J., & Fagg, J. (1991). Low sexual desire: Sex therapy results and prognostic factors. Behavioral Research and Therapy, 29, 217–224.
Impett, E. A., Gable, S. L., & Peplau, L. A. (2005). Giving up and giving in: The costs and benefits of daily sacrifice in intimate relationships. Journal of Personality and Social Psychology, 89, 327–344.
Impett, E. A., & Peplau, L. A. (2003). Sexual compliance: Gender, motivational, and relationship perspectives. Journal of Sex Research, 40, 87–100.
Impett, E. A., Peplau, L. A., & Gable, S. L. (2005). Approach and avoidance sexual motivation: Implications for personal and interpersonal well-being. Personal Relationships, 12, 465–482.
Impett, E. A., & Tolman, D. L. (2006). Late adolescent girls’ sexual experiences and sexual satisfaction. Journal of Adolescent Research, 6, 628–646.
Johnson, A. M., Wadsworth, J., Wellings, K., & Field, J. (1994). Sexual attitudes and lifestyles. London: Blackwell.
Julien, D., Bouchard, C., Gagnon, M., & Pomerleau, A. (1992). Insiders’ views of marital sex: A dyadic analysis. Journal of Sex Research, 29, 343–360.
Kaplan, H. S. (1979). Hypoactive sexual desire. Journal of Sex and Marital Therapy, 19, 3–24.
Klusmann, D. (2002). Sexual motivation and the duration of partnership. Archives of Sexual Behavior, 31, 275–287.
Laumann, E. O., Gagnon, J. H., Michael, R. T., & Michaels, S. (1994). The social organization of sexuality: Sexual practices in the United States. Chicago: University of Chicago Press.
Leitenberg, H., & Henning, K. (1995). Sexual fantasy. Psychological Bulletin, 117, 469–496.
Levine, S. B. (2002). Reexploring the concept of sexual desire. Journal of Sex and Marital Therapy, 28, 39–51.
Levine, S. B. (2003). The nature of sexual desire: A clinician’s perspective. Archives of Sexual Behavior, 32, 279–285.
Littell, R. C., Milliken, G., Stroup, W. W., & Wolfinger, R. D. (1996). SAS system for mixed models. Cary, NC: SAS Institute.
Masters, W. H., & Johnson, V. E. (1966). Human sexual response (p. 311). Boston: Little, Brown.
McCabe, M. P. (1997). Intimacy and quality of life among sexually dysfunctional men and women. Journal of Sex and Marital Therapy, 23, 276–290.
McCarthy, B. W. (1999). Marital style and its effects on sexual desire and functioning. Journal of Family Psychotherapy, 10, 1–12.
McCarthy, B. W., & McCarthy, E. (2003). Rekindling desire: A step-by-step program to help low-sex and no-sex marriages. New York: Brunner-Routledge.
Mehrabian, A. (1976). Questionnaire measures of affiliative tendency and sensitivity to rejection. Psychological Reports, 38, 199–209.
Peplau, L. A. (2003). Human sexuality: How do men and women differ? Current Directions in Psychological Science, 12, 37–40.
Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods. Thousand Oaks, CA: Sage.
Raudenbush, S. W., Bryk, A. S., Cheong, Y. F., & Congdon, R. T. (2000). HLM5: Hierarchical linear and nonlinear modeling. Chicago: Scientific Software.
Regan, P. C. (2000). The role of sexual desire and sexual activity in dating relationships. Social Behavior and Personality, 28, 51–59.
Regan, P. C., & Berscheid, E. (1999). Lust: What we know about human sexual desire. Thousand Oaks, CA: Sage.
Reis, H. T., & Gable, S. L. (2000). Event-sampling and other methods for studying everyday experience. In H. T. Reis & C. M. Judd (Eds.), Handbook of research methods in social and personality psychology (pp. 190–222). New York: Cambridge University Press.
Reis, H. T., & Gable, S. L. (2003). Toward a positive psychology of relationships. In C. L. Keyes & J. Haidt (Eds.), Flourishing: The positive person and the good life (pp. 129–159). Washington, DC: American Psychological Association.
Rusbult, C. E., Martz, J. M., & Agnew, C. R. (1998). The investment model scale: Measuring commitment level, satisfaction level, quality of alternatives, and investment size. Personal Relationships, 5, 357–391.
Shah, J., Higgins, T., & Friedman, R. S. (1998). Performance incentives and means: How regulatory focus influences goal attainment. Journal of Personality and Social Psychology, 74, 285–293.
Singer, J. D., & Willett, J. B. (2003). Applied longitudinal data analysis. New York: Oxford University Press.
Snijders, T., & Bosker, R. (1999). Multilevel analysis: An introduction to basic and advanced multilevel modeling. London: Sage.
Sobel, M. E. (1982). Asymptotic confidence intervals for indirect effects in structural equation models. In S. Leinhardt (Ed.), Sociological methodology (pp. 290–312). Washington, DC: American Sociological Association.

Sorkin, D. H., & Rook, K. S. (2004). Interpersonal control strivings and vulnerability to negative social exchanges in later life. Psychology and Aging, 19, 555–564.
Spector, I. P., Carey, M. P., & Steinberg, L. (1996). The Sexual Desire Inventory: Development, factor structure, and evidence of reliability. Journal of Sex and Marital Therapy, 22, 175–190.
Sprecher, S. (2002). Sexual satisfaction in premarital relationships: Associations with satisfaction, love, commitment and stability. Journal of Sex Research, 39, 190–196.
Sprecher, S., & Cate, R. M. (2004). Sexual satisfaction and sexual expression as predictors of relationship satisfaction and stability. In J. Harvey, A. Wenzel, & S. Sprecher (Eds.), Handbook of sexuality in close relationships. Mahwah, NJ: Erlbaum.
Sprecher, S., & Regan, P. C. (1996). College virgins: How men and women perceive their sexual status. Journal of Sex Research, 33, 3–15.
Strachman, A., & Gable, S. L. (2006). What you want (and do not want) affects what you see (and do not see): Avoidance social goals and social events. Personality and Social Psychology Bulletin, 32, 1–13.
Tiefer, L. (2002). Beyond the medical model of women’s sexual problems: A campaign to resist the promotion of “female sexual dysfunction.” Sexual and Relationship Therapy, 17, 127–135.
Trudel, G., Landry, L., & Larose, Y. (1997). Low sexual desire: The role of anxiety, depression, and marital adjustment. Sexual and Marital Therapy, 12, 95–99.
Yeh, H.-C., Lorenz, F. O., Wickrama, K. A. S., Conger, R. D., & Elder, G. H. (2006). Relationships among sexual satisfaction, marital quality, and marital instability at midlife. Journal of Family Psychology, 20, 339–343.

Received April 3, 2007
Revision received August 7, 2007
Accepted August 23, 2007

Journal of Personality and Social Psychology 2008, Vol. 94, No. 5, 824–838

Copyright 2008 by the American Psychological Association 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.824

Receiving Support as a Mixed Blessing: Evidence for Dual Effects of Support on Psychological Outcomes

Marci E. J. Gleason, Wayne State University
Masumi Iida and Patrick E. Shrout, New York University
Niall Bolger, Columbia University

Although social support is thought to boost feelings of closeness in dyadic relationships, recent findings have suggested that support receipt can increase distress in recipients. The authors investigated these apparently contrary findings in a large daily diary study of couples over 31 days leading up to a major stressor. Results confirm that daily support receipt was associated with greater feelings of closeness and greater negative mood. These average effects, however, masked substantial heterogeneity. In particular, those recipients showing greater benefits on closeness tended to show lesser cost on negative mood, and vice versa. Self-esteem was examined as a possible moderator of support effects, but its role was evident in only a subset of recipients. These results imply that models of dyadic support processes must accord a central role to between-individual heterogeneity.

Keywords: close relationships, daily diaries, social support, reciprocity, multilevel models

Perceived availability of social support (the belief that social support has been available to one in the past and will be in the future) has been linked to a variety of beneficial outcomes (Sarason, Sarason, & Pierce, 1994). However, the beneficial effects of perceived support stand in contrast to those for concrete acts of support (Lakey & Lutz, 1996). Although a few studies have shown that support receipt is related to some positive outcomes (Feeney & Collins, 2001, 2003), many studies of actual support transactions find that support receipt is associated with negative outcomes and in particular with increased negative mood (Barrera, 1981; Bolger, Zuckerman, & Kessler, 2000; Liang, Krause, & Bennett, 2001; Shrout, Herman, & Bolger, 2006). Studies of support transactions also have a tendency to focus on individual emotions and well-being and are less likely to measure relationship-level variables such as closeness and felt intimacy. Given that negative affect is inversely associated with relationship satisfaction (Bradbury & Fincham, 1989; Edwards, Nazroo, & Brown, 1998; Gottman, 1979; Uebelacker, Courtnage, & Whisman, 2003) and feelings of closeness and intimacy within relationships are positively associated with relationship satisfaction (Reis & Patrick, 1996; Sanderson & Cantor, 1997), it would seem that instances of support receipt could lead to lowered relationship closeness as a secondary outcome. We review the strength of the evidence that support receipt has adverse effects on individual well-being, then review the possible effects of support receipt on relationship closeness and intimacy, and finally outline the strengths of considering both negative mood and relationship closeness outcomes in a single study of social support receipt.

Marci E. J. Gleason, Communication and Behavioral Oncology, Karmanos Cancer Institute, Wayne State University; Masumi Iida and Patrick E. Shrout, Department of Psychology, New York University; Niall Bolger, Department of Psychology, Columbia University. This research was supported by National Institute of Mental Health Grant MH60366 to Niall Bolger. We gratefully acknowledge the invaluable contributions of Amie Rapaport, Diane Ruble, Hiro Yoshikawa, Tom Tyler, Gwen Seidman, Christopher Burke, and the Couples Lab of New York University. Correspondence concerning this article should be addressed to Marci E. J. Gleason, Karmanos Cancer Institute, 4100 John R Street, ROC— Room 336, Detroit, MI 48201; email: [email protected]. Further information can also be found at http://www.psych.nyu.edu/couples.

How Strong Is the Evidence That Support Receipt Can Lead to Increased Distress?

As discussed above, there is an established literature linking generalized perceived support to better outcomes, including reduced distress (e.g., Cohen, 2004), but daily support receipt is frequently linked to negative outcomes. Given the seemingly contradictory nature of these findings, the association between daily support and negative mood has been questioned. Specifically, it has been suggested that the apparent negative effects of support receipt could be due to (a) reverse causation, that is, distress leading to support provision, or (b) a common third cause, such as stress leading to both distress and support, which would create a spurious association.

Two kinds of evidence argue against reverse causation. One is the use of lagged variables, so that distress on one day is related to support on the previous day (e.g., Bolger et al., 2000; Shrout et al., 2006). In these lagged models, yesterday’s negative mood is adjusted for statistically such that the adverse effects of yesterday’s support on today’s mood cannot be due to simple build-up of negative mood. Bolger et al. (2000; and also Shrout et al., 2006, in a somewhat more elaborate model) found that the negative effects of yesterday’s support remain after adjustment. The second source of evidence is from an examination of simulated data that were constructed to conform to a pattern in which distress elicits support (Seidman, Shrout, & Bolger, 2006). Seidman et al. (2006) constructed the fictitious data using a simulation strategy that was first outlined by Abelson (1968) and then analyzed these reverse causation data with Bolger et al.’s analytic approach. They concluded that the effects obtained empirically by Bolger et al. could not have been obtained with data that were constructed under the reverse causation model.

The other alternative model is that of a third variable that leads to both distress and support provision, that is, a spurious association model. An example of this alternative model is the support-seeking–triage model, which posits that the negative outcomes from receiving support are due not to the support itself, but instead to support and psychological distress being simultaneously caused by a precipitating stressful event (Barrera, 1986). According to this model, it is not receiving support that causes distress, but the stressor that simultaneously evokes both psychological distress and increased support from others. Support and negative mood coincide because both are responses to negative events, not because support causes negative mood. However, the association between support and negative outcomes typically remains even after adjustment for relevant third variables. Krause (1997), in a nationwide study of 60-year-olds in Great Britain, found that even when adjusting for health status, individuals who received support had an increased mortality risk, and those who had high perceived availability of support had decreased mortality risk. Experimental studies have also demonstrated that support receipt and not just the precipitating stressor can have deleterious effects on mood.
Bolger and Amarel (2007) found that students asked to give an impromptu speech were more anxious when they received explicit, visible support from a confederate than were students who did not receive support. Furthermore, the Seidman et al. (2006) simulation study discussed above also investigated the third-variable explanation. They created fictitious data in which the level of distress today was caused by yesterday’s distress, as well as adversity experienced today and yesterday. Similarly, the support transactions today were modeled to be more likely when support was provided yesterday and when adversity was experienced either today or yesterday. Unlike the results of the reverse causation simulation study, Seidman et al. found that when Bolger et al.’s (2000) analysis strategy was used, some spurious association was created by the omitted third variable (adversity). However, the size of the bias was very small, even when the magnitude of the effect of adversity was made to be unrealistically large. Seidman et al. concluded that Bolger et al.’s effect sizes were unlikely to be due to an omitted third variable. Taken together, these findings suggest that the association between psychological distress and support receipt is not spurious.
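The third-variable data-generating process described above can be sketched in code. The following is a minimal illustration only, not the simulation that Seidman et al. (2006) actually ran: the function name `simulate_person`, every coefficient, and the adversity base rate are assumptions chosen for readability. The property worth noting is that support never enters the distress equation, so any association between support and distress in data generated this way is spurious by construction.

```python
import random


def simulate_person(n_days, rng, a_lag=0.5, a_adv=0.4,
                    s_base=0.1, s_lag=0.3, s_adv=0.3, p_adv=0.3):
    """Simulate one person's diary under a spurious-association model.

    All parameter values are hypothetical. Adversity drives both distress
    and support; support has no causal effect on distress.
    """
    adversity = [rng.random() < p_adv]  # Day 0 adversity (True/False)
    distress = [0.0]                    # Day 0 distress
    support = [0]                       # Day 0 support receipt (0/1)
    for t in range(1, n_days):
        adversity.append(rng.random() < p_adv)
        # Number of adverse days among today and yesterday: 0, 1, or 2
        recent_adversity = adversity[t] + adversity[t - 1]
        # Distress depends on yesterday's distress and on adversity only
        distress.append(a_lag * distress[t - 1]
                        + a_adv * recent_adversity
                        + rng.gauss(0, 1))
        # Support is more likely after yesterday's support and with adversity
        p_support = min(1.0, s_base + s_lag * support[t - 1]
                        + s_adv * recent_adversity)
        support.append(int(rng.random() < p_support))
    return distress, support, adversity


# Example: one simulated person over 35 diary days, with a fixed seed
distress, support, adversity = simulate_person(35, random.Random(2008))
```

Data produced in this way could then be submitted to the same analysis that is applied to real diary data, to see how much apparent effect of support an omitted third variable can manufacture.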

Support Receipt and Increased Distress

Equity theory and reciprocity research have also been invoked to explain why the receipt of support can be negative (Uehara, 1995; Walster, Berscheid, & Walster, 1973). Both approaches suggest that people will be most satisfied when they perceive their supportive relationships as being equitable or reciprocal. Equity and reciprocity theories posit that both overbenefit (receiving more
support than one has provided) and underbenefit (providing more support than one has received) are psychologically distressing and that individuals are motivated to restore equity either behaviorally by providing or eliciting aid from caregivers or cognitively by psychologically justifying the inequity (Buunk & Schaufeli, 1999; Uehara, 1995; Walster et al., 1973). Buunk and Schaufeli (1999) took an evolutionary approach to reciprocity, suggesting that it is a basic psychological mechanism that developed to maintain social relationships and indicate individuals’ importance in their social groups. Research on couples in which one member is seriously ill has shown that both the ill spouse and the caregiving spouse suffer from frustration, anger, depression, and resentment when the relationship is not judged to be reciprocal (Thompson, Medvene, & Freedman, 1995). From the perspective of reciprocity theory, Uehara (1995) specifically argued that it is being overbenefited—receiving support without returning it—that is particularly psychologically distressing. In this case, recipients are likely to feel obligated to repay what was given to them, and when they cannot, they begin to doubt their status and usefulness in the relationship (see also Roberto & Scott, 1986). In a daily diary study of committed couples, we found that individuals reported increased negative affect and decreased positive affect on days on which they reported receiving support from, but not providing support to, their partners (overbenefit) as compared with days when they only provided support to their partners (underbenefit) or both provided support to and received support from their partner (equitable or reciprocal exchanges; Gleason, Iida, Bolger, & Shrout, 2003). A different explanation for a tendency for distress to increase with received support is one that focuses on possible effects of support on the recipient’s self-esteem. 
Several studies have reported that being helped is associated with decreased self-esteem and depressed mood in the recipient (Nadler, 1987; Nadler & Fisher, 1976). There is some evidence that this explanation is especially relevant in close relationships (Nadler, Fisher, & Itzhak, 1983). Fisher, Nadler, and Whitcher-Alagna (1982) proposed the threat-to-self-esteem model of aid or support receipt that posits that helping consists of both self-threatening and supportive components. The self-threatening components can undermine the recipients’ evaluation of their self-efficacy, competence, and coping abilities, which can in turn lead to increased psychological distress. On the other hand, Fisher et al. theorized that the supportive components could provide comfort and a sense of being cared for by the support provider.

Social Support as Relationship Enhancer

It is perhaps this potential sense of being cared for by one’s partner, bolstered by the positivity of the perceived availability of support, that gives social support its positive reputation. Reis, Clark, and Holmes (2004) and Cutrona (1996) have related the positive findings associated with perceived availability of social support to a global construct called perceived responsiveness of the partner to the self. This perception of partner responsiveness is the central path to the development and maintenance of closeness and intimacy in relationships. Like perceived responsiveness to the self, perceived availability of support seems to be based on both personality characteristics of the perceiver and actual supportive interactions. People who judge themselves as being highly supported are also judged by observers as being more supported, but support recipients who perceived their relationships as more positive also judge the support they receive more positively than observers do (Collins & Feeney, 2000).

Given the research indicating that support receipt increases negative mood, it is surprising that it is judged as positive by the recipient at all. One possible explanation for this contradiction is that support receipt makes one feel closer to the provider of that support because it makes one feel cared for or responded to (Reis et al., 2004) even while increasing personal distress. Gable, Gonzaga, and Strachman (2006) found that when individuals were supportive when talking with their partners about their partners’ successes, the partners (i.e., the support recipients) rated the relationship as more satisfying. Although a positive association between receipt of support and positive relationship variables such as satisfaction, closeness, and intimacy has been found in a few studies (Acitelli & Antonucci, 1994; Hagedoorn et al., 2000), it is unclear whether the support being assessed was actual support received, perceived support, or some combination of both. Regardless, this research on support receipt and relationship variables raises the question of whether support has differential effects on individual-level variables (e.g., personal distress) when compared with relationship-level variables (e.g., relationship closeness).

GLEASON, IIDA, SHROUT, AND BOLGER

Understanding Dual Effects of Support on Personal Distress and Relationship Closeness The idea that support or aid can produce both increased psychological distress and a sense of being cared for by the provider is particularly intriguing. There are at least two possible models of this pattern of effects: an individual differences model and a within-person differential effects model. An individual differences model would mean that support increases personal distress for some people but increases relationship closeness for others. A differential effects model would mean that support receipt leads to both increased personal distress and increased relationship closeness in the same person. Model 1 in Figure 1 shows a representation of an individual differences model of support receipt. In one group (Group A), there is no effect of support on individual distress, but there is a strong and positive effect on relationship closeness. In the other group (Group B), there may be a strong effect of support on individual distress, but no improvement in relationship closeness. If data from these two groups are combined without a formal model of the nature of the moderation (individual differences), then one might conclude from the mixed analysis that couples are likely to experience both relationship exhilaration and individual distress. An individual differences model of the effects of support receipt is consistent with aspects of the relationship enhancement model of social support (Cutrona, Russell, & Gardner, 2005). This model explains the association between actual support receipt, perceived availability of support, relationship satisfaction, and health. 
It suggests that the perceived availability of support is directly related to instances of received support, particularly when the provider is seen as a caring and committed partner, and that this process is cyclical: People who receive consistent beneficial support will trust their partners, and people who trust their partners will benefit from support. Perhaps the negative effects of the receipt of support can be explained by individual differences:

Specifically, people who trust their partners may benefit from support (Group A in Figure 1), whereas people who lack that trust may experience costs that have been described above (Group B in Figure 1). In contrast to the individual differences model, Model 2 in Figure 1 represents a differential effects model, whereby a single support event leads to improved relationship closeness and increased individual distress. We might imagine a stressed worker who comes home to a well-intending partner who attempts to provide him or her with a break. The worker might appreciate the good intentions and feel closer to his or her partner but be distressed by the loss of an evening of productivity. If this were the typical pattern of support provision in the couple, then one would witness simultaneous positive and negative effects of support acts. Although the individual differences model and the differential effects model appear to be discrete alternatives, they can actually be viewed as examples of a range of processes that might vary from couple to couple. For some pairs of partners, support events could lead to closeness but not to distress; for other partners, support events could lead to distress but no closeness; and for still others, there could be dual effects. In the population, some patterns of these relations might be more common than others. Only three studies that we know of have reported on these two processes in the same samples of partners. The Bolger and Amarel (2007) study cited earlier did find evidence for mixed effects of support receipt. Students who received visible support experienced larger increases in negative emotion than those in the nonsupport condition, and they also felt that their partners were more concerned, considerate, and supportive than those in the invisible support condition. However, this study had only a single support event and was not designed to examine individual differences in response to support events. 
Gable, Reis, and Downey (2003) were able to study repeated support events among dating couples. They found that support events that were reported by both partners and recipients (called “hits”) were related to both relationship wellbeing and recipient distress. Even though the data were based on diary reports that allow the study of individual differences, the authors did not include these individual differences (which are called random effects in the multilevel statistics literature) in their statistical model. Gleason, Iida, Bolger, and Shrout (2003) examined the effect of imbalance in support provision and receipt on recipients’ negative mood, and Gleason (2005) analyzed data from the same daily diary study with a focus on relationship closeness outcomes. In both of these analyses (i.e., for both outcomes), Gleason and her colleagues found evidence that the effects of unreciprocated support events varied from couple to couple, but on average unreciprocated support was associated with an increase in negative mood and in a separate analysis with an increase in relationship closeness. However, it was not possible to determine from these two analyses how often the pattern in Model 2 (differential effects model) occurred in the sample.

Figure 1. Possible models for understanding the effects of support receipt on individual distress and relationship closeness. In Model 1, support receipt increases relationship closeness in some individuals (Group A) and increases distress in others (Group B). In Model 2, support receipt increases both distress and closeness in all individuals. The weak or missing effect is represented by a dashed arrow; the strong effect is represented by a solid arrow. [Figure not reproduced. Panels: Model 1: Individual Differences (Moderation), shown separately for Group A and Group B; Model 2: Differential Effects. In each panel, arrows lead from Support Event to Relationship Closeness and to Individual Distress.]

The Current Study

To tease apart the effects of support receipt on negative mood and relationship closeness, we analyzed data from a large daily diary study of nearly 300 cohabitating couples in which one partner was approaching a stressful event, the bar examination (a difficult-to-pass licensing examination for lawyers that they must pass to practice). Using a dataset in which one member of the couple is approaching a significant stressor allowed us to investigate whether responses to support receipt are affected by overall stress level and ensured that we captured couples at a time when support exchanges should have occurred frequently.

A typical analysis would involve estimating and interpreting only the fixed or average effects. Although the fixed effects give valuable information about the predominant pattern of the data, fixed effects alone are unable to distinguish between models like those discussed above. Estimating the random effects of the receipt of support on negative mood and relationship closeness will provide evidence as to whether the effects of receipt of support on the outcomes differ between individuals. If receipt of support increased negative mood on average (a significant fixed effect) and

there was significant variation around it (a significant random effect), we would know that individuals’ negative mood was differentially affected by receipt of support. Furthermore, we could obtain estimates of each individual’s receipt of support effect, which would reveal whether for some people receipt of support decreased negative mood despite the average effect being an increase in negative mood or whether receipt of support increased negative mood in all individuals but to lesser and greater degrees. In the current study, we took such an analysis one step further and built a model in which we simultaneously modeled the effects of support receipt on negative mood and relationship closeness. This special multilevel model is what Raudenbush and Bryk (2002, pp. 185–199) called a multivariate repeated measures model. The model and the large sample size allowed us to estimate the random effects of support receipt on both negative mood and closeness and

then estimate the correlation between them. A significant correlation between the random effects of support receipt on negative mood and relationship closeness would suggest that the effects are systematically linked across individuals, whereas a null correlation would suggest that the effects of support receipt on negative mood and relationship closeness vary by individual but are not linked. This analysis is particularly powerful for two reasons: (a) It allowed us to model the effects of receipt of support on negative mood and closeness simultaneously, thereby allowing us to investigate how these effects are associated within individuals, and (b) it allowed us to see whether and how people systematically differ in their reactions to support receipt without having to identify an explanatory moderator. This second strength is particularly important. Conceptually, we tend to think about the heterogeneity of the responses to support events as possible moderation, as illustrated in Model 1 of Figure 1, but, as stated above, in practice the multilevel models do not require that we specify the variables that distinguish Group A from Group B. Given the difficulty of measuring all possible moderators and the fact that moderation is often difficult to find, it is particularly useful that we can identify systematic variation without having to identify its specific source.

In the current study, we first determined the average response to support receipt, then whether there was reliable variation in those responses; as a third step, we attempted to identify the variables that can account for such variation. The literature suggests two important candidates for moderating variables that we could examine. One is derived from the Cutrona et al. (2005) theory that suggests that support receivers who are in more trusting and satisfying relationships will find support to be more effective in reducing personal distress.
Another is the proposition by Fisher et al. (1982) that support can be a threat to self-esteem—persons who have compromised self-esteem might be more vulnerable to a threat associated with support acts.
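The logic of combining fixed and random effects across two outcomes can be made concrete with a small sketch. It assumes that each person's effect of support receipt equals the average (fixed) effect plus that person's random deviation. All numbers below are invented for illustration and are not estimates from this study, and the function names are hypothetical.

```python
from math import sqrt


def person_effects(fixed_effect, random_deviations):
    """Person-specific effect = average effect + that person's deviation."""
    return [fixed_effect + u for u in random_deviations]


def pearson(xs, ys):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def classify(mood_effect, closeness_effect):
    """Label one person's pattern by the sign of the two support effects."""
    more_distress = mood_effect > 0
    more_closeness = closeness_effect > 0
    if more_distress and more_closeness:
        return "dual effects"      # pattern of Model 2 in Figure 1
    if more_closeness:
        return "closeness only"    # Group A of Model 1
    if more_distress:
        return "distress only"     # Group B of Model 1
    return "neither"


# Invented deviations for five hypothetical people
mood = person_effects(0.10, [0.15, 0.05, -0.20, 0.00, -0.12])
closeness = person_effects(0.20, [-0.30, 0.10, 0.25, 0.05, 0.15])
patterns = [classify(m, c) for m, c in zip(mood, closeness)]
link = pearson(mood, closeness)  # are the two effects linked across people?
```

In an actual multilevel analysis, the correlation between random effects is estimated as a model parameter rather than computed from per-person estimates; the sketch only conveys what that correlation describes.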

Method

Design and Participants

The data were collected in the summers of 2001, 2002, and 2003 by contacting more than 100 law schools in the continental United States. In 2001, 14 schools agreed to participate by allowing their graduating students to be contacted; in 2002, 27 schools participated; and in 2003, 30 schools participated. Because information on students’ marital or cohabitation status was unavailable before recruitment, the school representatives were asked to distribute either a letter or an e-mail to their entire graduating class. Across the 3 years, more than 15,000 students were contacted. To be eligible for participation, couples had to be married or cohabiting for at least 6 months at the time of the recruitment, and only 1 member of the couple could be planning on taking the July bar exam. There were 765 eligible couples who contacted us to participate, and of those 552 were assigned to the diary condition.[1] Of those, 472 couples agreed to participate, resulting in an 86% agreement rate.

The average age of the examinee was 28.9 years (SD = 6.4), and the average age of the partner was 28.4 (SD = 7.8). Of the examinees, 46% were male. Sixty-four percent of the participants were married, and the average length of cohabitation was 4.2 years (SD = 4.9). The composition of the sample was 76.8% White, 7.1% Asian, 4.4% Latino, 2.1% Black, 0.6% American Indian, 5.1% other, and 3.9% not specified for examinees; 78.8% White, 5.2% Asian, 4.8% Latino, 3.8% Black, 0.9% American Indian, 3.0% other, and 3.5% not specified for partners. Couples overwhelmingly agreed on how long they had been romantically involved (mean difference between estimates = 0.03 years), with the average length of relationship being 6.5 years (SD = 5.5), the minimum 8 months, and the maximum 48.8 years. This is a highly educated sample, and therefore is not representative of the population as a whole.

Couples were paid $150 for participation, and each couple was given a chance to win $1,000 on the completion of the study. Couples received an initial payment of $10, two consent forms, two background questionnaires, and two return envelopes when they agreed to participate in the study. They returned the completed background questionnaires an average of 3 weeks before the start of the diary period.

The diary period consisted of the 5 weeks before the bar exam, the 2 days on which the exam took place, and the week after the exam. Between 1 and 2 weeks before the start of the diary period, both members of each couple received an initial packet containing a batch of daily diary questionnaires, a return envelope, and instructions regarding the diary questionnaires. Packets were mailed to each participant on a weekly basis (six packets over the 6 weeks of the study). Each batch consisted of seven identically structured daily diaries with the exception of the last batch, which consisted of nine daily diaries. The diary form included questions regarding mood, relationship closeness, daily troubles or difficulties, relationship conflicts, and support transactions. Participants were asked to complete the questionnaires separately and not to share or discuss their answers with their partners. Participants were also asked to complete the diaries on the days assigned and to indicate whether each diary had been completed on the correct day. Only entries that indicated that they had been completed on the correct day were included in the analyses (88% of completed diaries).

Of those who agreed to participate, 89% returned their background questionnaire (372 couples in which both members returned the background questionnaire and 16 in which only one member did). Two hundred eighteen couples returned all materials (476 participants[2]). The final sample consisted of 293 examinees and 290 partners who completed at least 1 week of the daily diaries.

[1] Because this sample is part of a larger study that focused on methodology, interested couples were randomly assigned to different conditions. Only 72% of interested couples were assigned to the diary design, and our analyses use only those couples.

[2] In some couples, only 1 member returned materials, so the participant numbers reflect how many individual participants returned the materials, and couple numbers reflect the number of couples in which both members returned the materials.

Measures: Dependent Variables

Closeness. Each evening, participants indicated separately how emotionally close and how physically intimate they were with their partner on a scale ranging from 0 to 4 with midpoints. High numbers indicated more emotional closeness and increased physical closeness, and low numbers indicated emotional distance and lack of physical closeness. Cronbach’s alphas for the two items were .71 for examinees and .68 for partners.[3] Items were averaged to create the closeness scale (examinee: M = 2.27, SD = 1.07; partner: M = 2.25, SD = 1.06). We adjusted for yesterday’s relationship closeness by including lagged relationship closeness as a predictor in the model. Controlling for lagged closeness results in the outcome variable being residualized change in relationship closeness (today’s relationship closeness adjusting for yesterday’s relationship closeness). This strengthens the claim that any change in relationship closeness is due to the events of the day in question rather than lingering effects from the day before.

Negative mood. Anger, depressed mood, and anxiety were measured using items from the Profile of Mood States (Lorr & McNair, 1971). For each mood, at least three high-loading items from a factor analysis conducted by Lorr and McNair (1971) were used. Anger and anxiety consisted of three items, and depressed mood consisted of four items. For each of these items, participants rated how they felt “right now” on a 5-point scale ranging from 1 (not at all) to 5 (extremely). The scores were rescaled to a 0–4 interval, and a mean for each mood was obtained by averaging the rescaled values of the relevant items. Anger, depressed mood, and anxiety were highly related (between-person reliability estimate = .60; within-person reliability estimate = .62[4]) and were therefore averaged to form a single negative mood scale that was then centered on the respective overall means (partners: M = 0.31 before centering, SD = 0.52; examinees: M = 0.72 before centering, SD = 0.74). On average, examinees reported more than twice as much negative mood as partners. Negative mood was measured twice each day, once in the morning and once at night. Therefore, we adjusted for morning negative mood, so that our outcome variable was residualized change in negative mood, again strengthening the claim that any change in negative mood was due to events of the day in question rather than lingering effects from the day before.[5]
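The scoring steps for the negative mood scale can be sketched as follows. This is a minimal illustration under stated assumptions: the item names are placeholders rather than the Profile of Mood States wording, and the function names are hypothetical. It rescales each 1–5 rating to 0–4, averages items within each mood, averages the three moods, and then centers the scores on their overall mean.

```python
from statistics import mean

# Placeholder item names (assumed): three anger, three anxiety, four depressed
MOOD_ITEMS = {
    "anger": ["anger_1", "anger_2", "anger_3"],
    "anxiety": ["anxiety_1", "anxiety_2", "anxiety_3"],
    "depressed": ["depressed_1", "depressed_2", "depressed_3", "depressed_4"],
}


def negative_mood(ratings):
    """Uncentered 0-4 negative mood score for one diary report.

    `ratings` maps each item name to a rating from 1 (not at all)
    to 5 (extremely).
    """
    mood_means = []
    for items in MOOD_ITEMS.values():
        rescaled = [ratings[item] - 1 for item in items]  # 1-5 becomes 0-4
        mood_means.append(mean(rescaled))
    return mean(mood_means)  # average of the three mood means


def center(scores):
    """Subtract the overall mean from every score."""
    overall_mean = mean(scores)
    return [score - overall_mean for score in scores]
```

Averaging within each mood before averaging across moods gives anger, anxiety, and depressed mood equal weight even though depressed mood has one more item.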

Measures: Predictor Variables

Support provision and receipt. Participants’ provision of emotional support to their partner and receipt of emotional support from their partner was assessed in the evening each day. Each measure consisted of a single item in which participants reported whether they had provided emotional support to their partner and, separately, whether they had received emotional support from their partner. Specifically, participants were asked to indicate, by circling yes or no, whether they had received (provided) any help from (to) their partner for a worry, problem, or difficulty in the past 24 hr. Examples of support such as listening and comforting were given to clarify the question. Support receipt was coded 1 and a lack of receipt was coded 0; similarly, support provision was coded 1 and a lack of provision was coded 0.[6]

[3] In an effort to lessen the burden of taking the daily diaries, we shortened a closeness scale used in previous studies by one item. We chose to eliminate an item that asked how connected one felt to their partner that day because it appeared redundant with the emotional closeness item (Cronbach’s α = .92). By doing so, we lowered the alpha of the scale, but this is to be expected because Cronbach’s alpha underestimates reliability when items tap different aspects of a construct (Raykov, 1998).

[4] The between-person reliability is interpreted as the between-person reliability of the average of the measures taken on the same day; the within-person reliability is interpreted as the reliability of change within person throughout the study (see Cranford et al., 2006).

[5] Positive mood was also measured using the Profile of Mood States (Lorr & McNair, 1971), but as positive and negative mood have been shown to operate independently of each other, we did not condense them into one scale. We did conduct separate analyses looking at positive mood and the effect of receipt of support and found that in both partners and examinees positive mood was not negatively affected by support receipt and was positively affected by giving support. Gleason, Iida, et al. (2003), on the other hand, found that positive and negative mood behaved similarly (support-only days resulted in an increase in negative mood and a decrease in positive mood).

[6] Practical support was also measured and behaved very similarly to emotional support (smaller beta coefficients, but in the same direction) in both bar examinees and their partners; however, when both emotional and practical support were entered into the models, practical support no longer had any explanatory power, but the results for emotional support remained. Given this pattern, we report only the effects of emotional support.

Covariates

Time. Temporal effects of being in the study were adjusted for by including time in the study as a predictor of both outcomes. It has been shown that the first 3 days of diary studies generally have elevated levels of negative reports, but not of positive reports (Gleason, Bolger, & Shrout, 2003). Given this potentially bias-inducing tendency in our analyses, we eliminated the first 3 days of data, but analyses including these days did not differ from the findings presented. The 4th day of the study was also dropped to account for the use of lagged variables (relationship closeness). The variable representing duration of time in the study was created such that Day 5 was coded 0, Day 6 was coded 1, and so on up to Day 35 (coded 30), resulting in values from 0 to 30 being included in the analyses. Hereinafter, we refer to the days by their code number. Day 30 was the day before the bar examination. We did not include the days of or the days following the examination in the study to limit the sample to persons approaching a stressor. However, additional analyses revealed that the days following the bar examination did not differ in important or dramatic ways from the data reported below.

Weekend. We have found closeness to be systematically higher on weekends than on weekdays, and we therefore adjusted for the effects of the weekend. We represented weekend with a variable that was coded 1 for Saturday and Sunday and 0 for Monday through Friday.

Daily stressors. As discussed above, the support-seeking–triage model has suggested that negative effects of support receipt are due to the fact that stress and support co-occur (stress-eliciting support), and therefore when distress increases after support, it is not because of the support, but because of the stressor that prompted that support. This same potential confound exists for relationship closeness and stressors, given that stressful life events are associated with declines in relationship satisfaction (Tesser & Beach, 1998) and an increase in social support. To rule out these alternative hypotheses, a count of participants’ daily stressors was included in the model. Each day, participants were asked to indicate whether any of 21 possible stressful events had occurred and to indicate any stressful event that occurred that did not correspond to one on the list. The number of events indicated was summed and centered on the grand mean (partner M = 1.61 before centering, SD = 1.55; examinee M = 1.76 before centering, SD = 1.65).

Gender. Gender (coded −.5 for men and .5 for women) was originally included as a covariate, but because it did not affect the variables of interest (receiving and providing support) and to simplify the model presented, it was not included in the analyses reported below.
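The covariate coding rules described in this section can be written out as a short sketch. The function names and the gender labels used as dictionary keys are assumptions made for illustration; only the coding rules themselves (Day 5 coded 0, weekend coded 1, grand-mean centering, and −.5/.5 for gender) come from the text.

```python
from datetime import date
from statistics import mean


def time_code(study_day):
    """Day 5 of the diary period is coded 0, Day 6 is coded 1, and so on."""
    if study_day < 5:
        return None  # Days 1-4 are not analyzed
    return study_day - 5


def weekend_code(calendar_date):
    """1 for Saturday and Sunday, 0 for Monday through Friday."""
    return 1 if calendar_date.weekday() >= 5 else 0


def gender_code(gender):
    """-0.5 for men and 0.5 for women (labels are assumed for illustration)."""
    return {"man": -0.5, "woman": 0.5}[gender]


def grand_mean_center(stressor_counts):
    """Center daily stressor counts on the mean of all supplied counts."""
    grand_mean = mean(stressor_counts)
    return [count - grand_mean for count in stressor_counts]


# Example with a hypothetical diary date that falls on a Saturday
example = (time_code(6), weekend_code(date(2001, 7, 7)), gender_code("woman"))
```

Centering the stressor count means that the model intercept refers to a day with an average number of stressors rather than a day with none.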

Moderating Variables

Both potential moderating variables were measured in the background questionnaire, which both members of the couple completed approximately 3 weeks before starting the diary portion of the study (see above for more details about the background questionnaire administration).

Relationship satisfaction. Overall relationship satisfaction was measured with one item taken from the Dyadic Adjustment Scale (Spanier, 1976), on which 0 = extremely unhappy, 3 = happy, and 6 = perfectly happy. Relationship satisfaction was generally high among these couples (examinees: M = 4.45, SD = 1.04; partners: M = 4.56, SD = 1.03).

Self-esteem. Self-esteem was measured using the Rosenberg Self-Esteem scale (Rosenberg, 1965), a 7-item Likert scale ranging from 0 (low self-esteem) to 4 (high self-esteem; examinees: M = 3.17, SD = 0.59; partners: M = 3.16, SD = 0.60). Alpha reliability was .86 for examinees and .88 for partners.

Analytic Approach

The goal of the current analysis was to examine the effects of receiving support from and providing support to one’s partner on both an individual’s evaluation of the degree of closeness in the relationship and simultaneously on an individual’s level of negative mood. We used a multilevel statistical model to investigate these relationships separately for partners (less stressed) and examinees (highly stressed). The models had two levels: a within-individual level (over time) and a between-individuals level. The model also took into account the fact that outcomes, negative mood and closeness, were clustered within individuals.[7] Using the multivariate approach described by Raudenbush and Bryk (2002), we included both closeness and negative mood in a single multilevel analysis. The multivariate approach allowed us to estimate the correlation between the random effects for negative mood and closeness and to examine the frequencies of participants showing a moderated pattern (see Figure 1, Model 1) and a differential effects pattern (see Figure 1, Model 2). All analyses were conducted using the MIXED procedure in SAS (SAS Institute, 2003).

The within-individual level of the analysis allowed each individual’s relationship closeness and negative mood to be modeled as a function of receipt of support. We predicted a given day’s closeness and negative mood for a particular individual; we adjusted for either yesterday’s closeness or same-day morning negative mood, respectively; number of days in the study; and weekend effects. Given that support transactions may be more likely to take place on days when an individual experiences stressful events, a count of daily stressors was included to adjust for the effects of stressful events as a third variable. The equation was as follows:

$$
\begin{aligned}
Y_{ijk} ={}& (N_{ijk}) \ast \bigl[\, b_{0ni} + b_{1n} Y_{ijk-1} + b_{2n} D_{ik} + b_{3n} W_{ik} + b_{4n} S_{ik} + b_{5n} G_{ik} + b_{6ni} R_{ik} + b_{7n} (G_{ik} \times R_{ik}) + e_{ijk} \,\bigr] \\
&+ (C_{ijk}) \ast \bigl[\, b_{0ci} + b_{1c} Y_{ijk-1} + b_{2c} D_{ik} + b_{3c} W_{ik} + b_{4c} S_{ik} + b_{5c} G_{ik} + b_{6ci} R_{ik} + b_{7c} (G_{ik} \times R_{ik}) + e_{ijk} \,\bigr].
\end{aligned}
\tag{1}
$$

The dependent variable, $Y_{ijk}$, is the outcome for participant $i$ for outcome $j$ (when $j = 1$ it is negative mood; when $j = 2$ it is closeness) on day $k$. Thus, there were two records for each day within participant, so the maximum number of records that a participant contributed was 62. When the outcome is negative mood, $N_{ijk} = 1$ and $C_{ijk} = 0$, and the first part of the model is selected and all of the $b$ coefficients have the subscript $n$. When the outcome is closeness, $N_{ijk} = 0$ and $C_{ijk} = 1$, and the second part of the model is selected and each of the $b$ coefficients has the subscript $c$. $Y_{ijk-1}$ is morning negative mood for individual $i$ when $j$ is equal to 1; $Y_{ijk-1}$ is yesterday’s closeness for the same individual $i$ when $j$ is equal to 2; $D_{ik}$ is the number of days in the study; $W_{ik}$ indicates whether it is a weekend day or not; $S_{ik}$ adjusts for the number of stressors experienced; $G_{ik}$ is the individual’s report of providing (giving) support; $R_{ik}$ is the individual’s report of receiving support; $G_{ik} \times R_{ik}$ is the interaction term for providing and receiving support; and the residual components are represented by $e_{ijk}$.

The coefficient $b_{0ni}$ is the regression intercept for negative mood for individual $i$ and represents negative mood on the first weekday of the study when the individual has neither given nor received support and all other variables are at their projected average level (as morning mood and daily stressors are grand mean centered). The coefficient $b_{0ci}$ is the regression intercept for closeness for individual $i$ and represents closeness on the first weekday of the study when the individual has neither given nor received support and all other variables are at their projected average level (as yesterday’s closeness and daily stressors are grand mean centered).
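The stacked layout described above, with two records per diary day and indicator variables that switch each half of Equation 1 on or off, can be sketched in Python. This is only an illustration of the data structure, not the authors' SAS syntax. The function names, dictionary keys, and diary values are hypothetical, and the 31-day diary length is inferred from the stated maximum of 62 records.

```python
def stack_day(day, neg_mood, closeness):
    """Return the two stacked records for one diary day.

    j = 1 is negative mood (N = 1, C = 0); j = 2 is closeness (N = 0, C = 1).
    """
    return [
        {"day": day, "j": 1, "N": 1, "C": 0, "y": neg_mood},
        {"day": day, "j": 2, "N": 0, "C": 1, "y": closeness},
    ]


def stack_participant(days):
    """Stack a list of (day, neg_mood, closeness) tuples into long format."""
    records = []
    for day, neg_mood, closeness in days:
        records.extend(stack_day(day, neg_mood, closeness))
    return records


# Hypothetical participant with 31 completed diary days (made-up values).
diary = [(k, 0.5, 2.0) for k in range(1, 32)]
records = stack_participant(diary)
print(len(records))  # two records per completed day
```

Because exactly one of N and C equals 1 on every record, multiplying each bracketed half of the equation by its indicator leaves only the coefficients for that record's outcome.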
As Bolger and Shrout (2007) discussed, the mixed-model approach can be specified to acknowledge that the residuals on adjacent days are likely to be correlated, and we used this specification in the analysis we report here. This specification allowed us to account for dependency between outcomes in individuals and within individuals across time.

The between-individual level of the analysis allows us to model possible individual differences in the coefficients specified in Equation 1. We fit a model that treated as random (i.e., allowed to vary across persons) both the intercepts for closeness and negative mood and the effect of support receipt on each of the two outcomes. The formal specification of these models involves the inclusion of random effects in the Level 2 equation. These have a mean of zero but variance that is assumed to be nonzero. For example, the between-individuals level of the model for the intercepts involves the sum of overall means ($\gamma$) and random effects ($u$). Our analytic model also allowed the random effects for the intercepts and the support receipt effects to be correlated across both effect type and outcome variable. For those interested in the details of this analysis, the syntax used is available from Marci E. J. Gleason. The Level 2 equations were

$$
\begin{aligned}
b_{0ni} &= \gamma_{0n} + u_{0ni} \\
b_{0ci} &= \gamma_{0c} + u_{0ci} \\
b_{6ni} &= \gamma_{6n} + u_{6ni} \\
b_{6ci} &= \gamma_{6c} + u_{6ci}.
\end{aligned}
\tag{2}
$$

In addition, we tested the moderation hypothesis in two separate multivariate, multilevel analyses. The same Level 1 equation described in Equation 1 was used for each analysis. The Level 2 equations were modified when testing for moderation to include the moderators (self-esteem or relationship satisfaction), resulting in an additional predictor in each of the four equations. We did not alter the specifications of the random effects for the moderation tests.

⁷ Kenny, Kashy, and Bolger (1998) provided a general description of multilevel statistical models. Raudenbush and Bryk (2002) showed that these models can be influenced by both within- and between-individuals variation. When within-person and between-person effects are predicted to be the same, the multilevel analysis that combines the effects is recommended.

Results

Support Patterns

Examinees reported receiving support on 50% of days, whereas partners reported receiving support on only 40% of days. Examinees reported giving support on 37% of days, whereas partners reported giving support on 53% of days. Examinees’ support receipt increased over time (r = .09, p < .01), and their reports of giving support decreased (r = -.03, p < .01). The opposite pattern was observed for partners; that is, partners received less and provided more support as the bar exam approached (receiving, r = -.03, p < .05; giving, r = .11, p < .01).

Fixed Effects

Table 1 presents the fixed-effect results for both outcomes for partners and examinees. Only the variables of interest are reported here. The main effect of support receipt was significant for both negative mood, partners: $b_{6n}$ = 0.075, t(289) = 3.54, p < .001; examinees: $b_{6n}$ = 0.037, t(292) = 2.04, p < .05, and closeness, partners: $b_{6c}$ = 0.248, t(289) = 6.37, p < .001; examinees: $b_{6c}$ = 0.411, t(292) = 12.19, p < .001. The main effect of giving support on negative mood was significant for examinees, but not for partners, partners: $b_{5n}$ = -0.001, t(289) = -0.10, ns; examinees: $b_{5n}$ = -0.048, t(292) = -2.24, p < .05. The main effect for giving support was significant for both partners and examinees on closeness, partners: $b_{5c}$ = 0.245, t(289) = 8.73, p < .001; examinees: $b_{5c}$ = 0.319, t(292) = 8.63, p < .001.

For negative mood, these effects have to be interpreted in the context of a significant interaction between receipt and provision (see Table 1). When one takes the interaction into account, the current findings replicate those of Gleason, Iida, et al. (2003), in which it was found that supportive equity days (days in which support is both received and provided) are associated with the lowest levels of negative mood and that receipt-only days are associated with the highest levels of negative mood (see Figure 2). As the figure shows, the receipt of support is detrimental to negative mood, but only on days in which the recipient of support did not also provide support to his or her partner.

Table 1
Multilevel Analysis Results Relating Daily Support to Negative Mood and Closeness for Partners and Examinees: Fixed Effects

Variable                                   Partners (n = 290)     Examinees (n = 293)
                                           γ          SE          γ          SE
Negative mood
  Intercept                                0.343**    0.016       0.589**    0.021
  Day × 10                                 0.015*     0.005       0.102**    0.007
  Weekend                                  -0.016†    0.009       -0.016     0.011
  Daily stressors                          0.027**    0.003       0.064**    0.004
  Morning negative mood                    0.428**    0.012       0.472**    0.011
  Receiving emotional support              0.075*     0.021       0.037*     0.018
  Giving emotional support                 -0.001     0.014       -0.048*    0.021
  Receiving × Giving Emotional Support     -0.093*    0.023       -0.080*    0.027
Closeness
  Intercept                                1.989**    0.037       1.971**    0.046
  Day × 10                                 -0.057**   0.010       -0.048**   0.001
  Weekend                                  0.113**    0.019       0.135**    0.019
  Daily stressors                          -0.043**   0.007       -0.065**   0.007
  Yesterday’s closeness                    0.318**    0.011       0.140**    0.011
  Receiving emotional support              0.248**    0.039       0.411**    0.034
  Giving emotional support                 0.245**    0.028       0.319**    0.037
  Receiving × Giving Emotional Support     0.080†     0.046       0.011      0.046

† p < .10. * p < .05. ** p < .001.
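To make the receipt-by-provision interaction concrete, the following Python sketch computes model-implied negative mood for the four types of days from the Table 1 fixed effects. It is an illustrative calculation rather than the authors' procedure: it assumes all covariates sit at their reference values (so only the intercept and the three support terms contribute), and the names `COEFS` and `predicted_mood` are hypothetical.

```python
# Fixed-effect estimates for negative mood, copied from Table 1.
COEFS = {
    "partners": {"intercept": 0.343, "receive": 0.075, "give": -0.001, "both": -0.093},
    "examinees": {"intercept": 0.589, "receive": 0.037, "give": -0.048, "both": -0.080},
}


def predicted_mood(group, received, gave):
    """Model-implied evening negative mood with covariates at reference values.

    received and gave are 0/1 indicators for that day's support reports.
    """
    c = COEFS[group]
    return (c["intercept"]
            + c["receive"] * received
            + c["give"] * gave
            + c["both"] * received * gave)


for group in COEFS:
    for received in (0, 1):
        for gave in (0, 1):
            value = round(predicted_mood(group, received, gave), 3)
            print(group, "received" if received else "no receipt",
                  "gave" if gave else "no giving", value)
```

Under these assumptions, receipt-only days yield the highest predicted negative mood and days with both receipt and provision the lowest in each group, which is the ordering described for Figure 2.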

Figure 3 shows the results for closeness. Support receipt’s positive effects on closeness were evident despite its also being associated with an increase in negative mood. Although there was a marginal interaction between receipt and provision on closeness for partners, suggesting that supportive equity days were particularly positive for partners, the interaction does not diminish the beneficial effects of support receipt on closeness. Partners’ effects of receiving and giving support on relationship closeness did not differ (difference between estimates = 0.002), t(289) = 0.10, ns. However, examinees’ effect of receiving support was greater than the effect of giving support (difference between estimates = 0.09), t(292) = 2.09, p < .05. Days on which support was received and not given were significantly more negative when compared with the other three types of days for both partners (difference between estimates = 0.08), t(289) = 4.45, p < .001, and examinees (difference between estimates = 0.08), t(292) = 5.19, p < .001.

Random Effects

The random effects covariance matrix for both partners and examinees is displayed in Table 2. The model generated random effects for the intercepts of closeness and negative mood both between and within level. The within-level random effects provide information about what is occurring in individuals’ lives that affects their levels of negative mood and closeness that was not captured by our models. As can be seen, the variances for both the intercept for negative mood and closeness are significant, suggesting that the model does not account for all the variation in these variables. In addition, the evidence for a negative covariance between these intercepts suggests that whatever is causing that within-level daily variation affects closeness and negative mood in opposite ways. On a given day, when negative mood is increased, closeness is decreased and vice versa.

The between-person random effects for the intercepts of closeness and negative mood provide information about how negative mood and closeness behave across individuals. By generating random effects, we were able to obtain estimates of each individual’s intercepts for negative mood and closeness. If the random effects are positively correlated, it suggests that individuals who have generally higher levels of closeness also have higher levels of negative mood; if the correlation between them is negative, it suggests that individuals who have generally higher levels of closeness have lower levels of negative mood and vice versa. There are also between-person random effects for the slopes (the effect of receipt on both negative mood and closeness for each individual), and their correlation will give us information as to whether these effects are systematically linked across individuals.

Using the random effects variances and covariances, we calculated the correlations between the intercepts for negative mood and closeness and the effects of receipt on negative mood and closeness (the slopes) at the within level. As can be seen in Table 2, partners’ covariance between the random effect for the intercept of negative mood ($\tau_{0n}$ = .035) and the intercept for closeness ($\tau_{0c}$ = .212) is -0.027. This results in a correlation of -.31 (p < .05) for partners, and similarly the correlation for examinees is -.25 (p < .05). These correlations suggest that people who are higher in negative mood tend to be lower in closeness. The correlation between the random effects of support receipt on both outcomes is -.36 (p < .05) for partners and -.31 (p < .05) for examinees, suggesting systematic differences across individuals.

[Figure 2 plot omitted. Panels: examinees and partners; y-axis: negative mood; x-axis: no receiving versus receiving; separate lines for no giving and giving.]

Figure 2. The effects of support receipt and provision on evening negative mood for both partners and examinees.
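The conversion from a covariance and two variances to a correlation, as used for the intercepts above, can be checked with a short Python sketch. The function name `corr_from_cov` is hypothetical, and the inputs are the partner values quoted in the text.

```python
import math


def corr_from_cov(cov, var_a, var_b):
    """Convert a covariance and two variances into a correlation."""
    return cov / math.sqrt(var_a * var_b)


# Partners, intercepts: variance of negative mood = 0.035,
# variance of closeness = 0.212, covariance = -0.027.
partners_r = corr_from_cov(-0.027, 0.035, 0.212)
print(round(partners_r, 2))  # -0.31
```

The same function applied to the other variance and covariance estimates in Table 2 gives the remaining correlations, subject to rounding of the published estimates.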

A representation of these negative correlations of the slopes can be found in Figure 4. Each point on the scatterplots represents the estimated random effect of receipt on negative mood (x-axis) and on closeness (y-axis) for a single individual; in other words, each point represents how support receipt typically affects a particular individual’s negative mood and feelings of closeness. As can be seen, there are individuals in three of the four quadrants of the scatterplots. Most individuals fall in the upper right quadrant (examinees = 209 individuals; partners = 245 individuals); members in this quadrant experience something akin to the fixed effects: an increase in both closeness and negative mood. However, a sizable portion of individuals also fall into the upper left quadrant (examinees = 80; partners = 40); support receipt is only positive for members of this quadrant: It decreases negative mood and increases closeness. Finally, a few individuals fall into the lower right quadrant (examinees = 4; partners = 5); support receipt is only negative for members of this quadrant: It increases negative mood and decreases closeness.

Selecting individuals at both the positive and the negative ends of the scatterplots and plotting their individual data allows us to see how support differentially affects different people. Figure 5 displays the data for 2 partners and 2 examinees. The top two graphs display data for a partner and examinee for whom the receipt of support is negative. Notice that on days on which they receive support, their closeness ratings are low and their negative mood ratings are high. The bottom two graphs show a strikingly different pattern; these are of individuals who are positively affected by the receipt of support. Notice that on days on which they receive support, their closeness ratings are high and their negative mood ratings are low. These “extreme” individuals are good examples of how people differentially react to the receipt of support, but it is important to note that most individuals fall in the middle of the distribution and experience something more akin to the fixed effects results when they receive support: an increase in negative mood and an increase in feelings of closeness.

[Figure 3 plot omitted. Panels: examinees and partners; y-axis: relationship closeness; x-axis: no receiving versus receiving; separate lines for no giving and giving.]

Figure 3. The effects of support receipt and provision on relationship closeness for both partners and examinees.
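The quadrant reading of the scatterplots can be expressed as a small Python sketch. It is an illustration only: the function name `quadrant` and the example effect values are hypothetical, and treating an effect of exactly zero as positive is an arbitrary choice made here. The counts are those reported in the text.

```python
def quadrant(mood_effect, closeness_effect):
    """Name the scatterplot quadrant for one person's two support-receipt effects.

    mood_effect is plotted on the x-axis and closeness_effect on the y-axis.
    """
    horizontal = "right" if mood_effect >= 0 else "left"
    vertical = "upper" if closeness_effect >= 0 else "lower"
    return f"{vertical} {horizontal}"


# Quadrant counts as reported in the text.
COUNTS = {
    "examinees": {"upper right": 209, "upper left": 80, "lower right": 4},
    "partners": {"upper right": 245, "upper left": 40, "lower right": 5},
}

print(quadrant(0.10, 0.30))   # mood and closeness both increase
print(quadrant(-0.05, 0.40))  # mood improves, closeness increases
print({group: sum(cells.values()) for group, cells in COUNTS.items()})
```

The reported counts sum to the full samples (293 examinees and 290 partners), so every participant falls in one of the three occupied quadrants.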

Moderation

The moderators included in the analyses did not explain why individuals react differently to support receipt. Relationship satisfaction failed to moderate the effects of support receipt on negative mood and closeness for both examinees and partners. Self-esteem did not moderate the receipt of support for examinees on either negative mood ($b_n$ = .024, SE = .027, ns) or closeness ($b_c$ = -.056, SE = .049, ns). It did moderate the effect of support receipt on negative mood for partners ($b_n$ = -0.067, SE = .026), t(289) = -2.58, p < .05, but not the effect of support receipt on closeness for partners ($b_c$ = -.005, SE = .045, ns). These results suggest that when one is not approaching a large stressor, the negative effects of support receipt may be tempered for those with above-average self-esteem (self-esteem was group-mean centered). However, given that the effect of receipt of support on negative mood is $b_n$ = .075 and there are no participants who are more than 0.90 units above the mean in self-esteem (the mean self-esteem for partners was 3.16 and the maximum score was 4.0), this moderation makes receipt of support less detrimental to negative mood, but not on average positive, for those high in self-esteem.

Table 2
Multilevel Analysis Results Relating Daily Support to Negative Mood and Closeness for Partners and Examinees: Random Effects

Level and variable                     Partners (n = 290)     Examinees (n = 293)
                                       τ          SE          τ          SE
Level 1
  Variance of negative mood (NM)       0.128**    0.002       0.176**    0.003
  Variance of closeness (CL)           0.538**    0.010       0.508**    0.009
  Covariance of NM and CL              -0.054**   0.003       -0.057**   0.004
Level 2
  Variances
    NM                                 0.035**    0.004       0.074**    0.009
    CL                                 0.212**    0.023       0.444**    0.045
    Receipt × Negative Mood (RNM)      0.030**    0.006       0.021**    0.005
    Receipt × Closeness (RCL)          0.053**    0.016       0.097**    0.020
  Covariances
    NM–CL                              -0.027**   0.007       -0.045**   0.014
    NM–RNM                             -0.007*    0.004       -0.007*    0.005
    NM–RCL                             0.010      0.010       0.019*     0.009
    CL–RNM                             0.004      0.008       0.027**    0.012
    CL–RCL                             -0.052**   0.015       -0.128**   0.025
    RNM–RCL                            -0.014*    0.007       -0.014*    0.008

Note. Significance tests of Level 1 effects are constricted ratio Wald tests; significance tests of Level 2 variances are chi-squares (df = 4); significance tests of Level 2 covariances are chi-squares (df = 1).
* p < .05. ** p < .001.
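The argument that high self-esteem reduces but does not reverse the effect of support receipt on partners' negative mood can be worked through numerically. This sketch assumes the moderation is linear in self-esteem and that self-esteem is centered at the partner mean of 3.16. The constant and function names are hypothetical.

```python
MAIN_EFFECT = 0.075   # effect of receipt on negative mood at mean self-esteem
MODERATION = -0.067   # change in that effect per unit of self-esteem
MEAN_SELF_ESTEEM = 3.16
MAX_SELF_ESTEEM = 4.0


def receipt_effect(self_esteem):
    """Implied effect of support receipt on negative mood at a given self-esteem."""
    return MAIN_EFFECT + MODERATION * (self_esteem - MEAN_SELF_ESTEEM)


# Effect for a partner at the scale maximum, the most favorable case.
print(round(receipt_effect(MAX_SELF_ESTEEM), 3))  # 0.019
```

Even at the scale maximum the implied effect remains above zero, consistent with the conclusion that support receipt becomes less detrimental, but not beneficial, for mood among partners high in self-esteem.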

Discussion

The results support Model 2 (differential effects) from Figure 1 in that receiving support simultaneously increased relationship closeness and negative mood. However, this was only true on days when support was not provided (22% of days for partners and 21% of days for examinees). On days when support was both received and provided (supportive equity days; 31% of days for partners, 29% of days for examinees), support receipt increased relationship closeness and decreased negative mood. This was true for individuals who were approaching a major stressor (examinees) and for their less stressed partners. The varying stress level did not substantially affect the influence of support receipt or provision on relationship closeness and negative mood.

Although this pattern is evident on average, it is not the whole story. Evidence from the random effects analysis suggests that support also differentially affects individuals, and this is in line with Model 1 (individual differences) from Figure 1. Namely, the negative correlations between the random effects of receipt on negative mood and on closeness suggest that individuals who experience a larger increase in negative mood when they receive support experience less of an increase in closeness; conversely, individuals who experience a larger increase in closeness when they receive support experience less of an increase in negative mood. Although this does not eliminate support receipt’s average effect on outcomes across people, it does indicate that the duality observed for an average individual is limited. The majority of participants were in the middle of the spectrum of support reactivity and reported experiencing increases in both outcomes following support receipt, whereas those on the ends of the spectrum either benefited from support or suffered from support. It seems that support not only differentially affects closeness and negative mood, but also operates differently across individuals.

Several possible explanations for the overall finding that support increases both negative mood and relationship closeness seem likely, including characteristics of the recipient, characteristics of the provider, and characteristics of the relationship. The characteristics of the recipients that we tested here, self-esteem and relationship satisfaction, did not explain our pattern of findings. However, there are many other individual difference constructs, such as attachment style, that are plausible moderators but were not included in this study. An intriguing candidate for moderation, which has intuitive ties to Nadler and colleagues’ work on self-esteem (Fisher et al., 1982; Nadler, 1987; Nadler & Fisher, 1976), is perceived respect from one’s partner or the extent to which one feels respected and self-efficacious in one’s relationship. This construct may be more directly implicated than self-esteem when considering support exchanges between partners. For instance, work on self-efficacy has demonstrated that judging oneself as being inefficacious can impair coping and goal achievement (Bandura, 1982), and the receipt of support may lead some individuals to doubt their ability to accomplish goals on their own.
Although we were not able to test this idea in the current study, it is bolstered by the current findings that providing support is beneficial, particularly when one has received support. Demonstrating one’s efficacy through the provision of support may allow one to accept support from one’s partner without experiencing efficacy declines. Should this be the case, we would then want to determine why receiving support signals lack of efficacy for some individuals but not others. A second explanation for the differential effect documented in this study could be characteristics of the provider. Characteristics such as being a high self-monitor might result in an individual’s being a particularly skilled support provider (Flynn, Reagans, Amanatullah, & Ames, 2006), one whose support may be less detrimental to recipients’ moods and feelings of efficacy. Work by Lakey, Lutz, and Scoboria (2004; Lakey & Scoboria, 2005) on perceived support suggests that the benefits of perceived social support may be derived not only through personality characteristics of the perceiver, but also through relationship factors such as the perceived similarity of the provider to the recipient. Perhaps the same is true in actual support transactions—the more a recipient feels similar to a provider, the more positive the support given by that provider would be. Other relationship characteristics, such as the match or mismatch of communication styles (see Swann, Rentfrow, & Gosling, 2003), could also lead to the effects documented here. Although we cannot offer definitive evidence supporting a particular explanation for these patterns, given that support is an exchange between at least two individuals it seems likely that at least part of the explanation for these effects will come from dyad-level variables or processes.

[Figure 4 plots omitted. Panels: partners and examinees; x-axis: Negative Mood × Receipt of Support; y-axis: Closeness × Receipt of Support.]

Figure 4. Scatterplots of the random effects of Receipt × Closeness and Receipt × Negative Mood for each partner and examinee. Individuals whose point lies in the upper left-hand quadrant are those for whom support receipt decreases negative mood and increases closeness; those in the upper right-hand quadrant are those for whom support receipt increases negative mood and also increases closeness; and those in the lower right-hand quadrant are those for whom support receipt increases negative mood and decreases closeness.

Also worth noting was the benefit of support provision for providers, which, regardless of support receipt, improved mood (for examinees) and relationship closeness. Considering provision’s beneficial qualities, it is important that social support researchers include it in their studies to understand why giving appears to be better than receiving. Reciprocity research would suggest that giving is positive because it repays our debts or puts others in our debt (Uehara, 1995) and that it is a tool that individuals use to prove their worth in their social group (Buunk & Schaufeli, 1999). It may also be that giving support boosts self-esteem by making one feel more competent and needed (Fisher et al., 1982). Provision’s benefits need to be explored more thoroughly, particularly in the context of close relationships.

These results speak directly to two recently proposed theories in the close relationships literature: Cutrona et al.’s (2005) relationship enhancement model of social support and Reis et al.’s (2004) perceived partner responsiveness model. The relationship enhancement model of social support suggests that consistent supportive responses can lead to higher perceived partner support, which leads to greater trust, relationship satisfaction, and ultimately better health. Although this model is widely supported by research on perceived support availability, it has been less clear how to reconcile findings regarding the negativity of social support receipt with these ideas. The current research suggests that supportive acts, despite causing personal distress for some individuals, enhance relationship closeness. Perhaps it is this positive effect of support that leads to the positive effects of perceived support and ultimately to relationship satisfaction and health. Reis et al. (2004) also referred to the benefits of perceived availability of support in their discussion of perceived partner responsiveness and also noted that the research on support receipt itself has questioned its benefit. They suggested that the field needs to examine how the effects of social support differentially affect outcomes and individuals. The current research does both of these and suggests that the benefits of support are largely due to relationship enhancement (i.e., increased closeness), although it is important to note that this association itself varies by individual. Actual instances of support, despite some negative “side effects,” may be one of the more important ways that partners establish responsiveness.

The lack of gender effects on the variables of interest in both studies may seem surprising, but it is consistent with the support literature that has found similarities in support processes for men and women (Neff & Karney, 2004; Porter et al., 2000). However, a few studies have shown that men and women react differently to support receipt (see Antonucci & Akiyama, 1987; Cutrona, 1996). It is important to note that we did find some gender effects (for instance, men tended to experience more negative mood and to be less negatively affected by troublesome events), but not on the variables of interest (support receipt and provision). Perhaps gender differences in support processes would have emerged if we had investigated the amount of support requested or received by men and women.

[Figure 5 plots omitted. Four panels (a partner and an examinee in each row), each showing closeness, receipt of support (yes or no), and negative mood by day in study, with ratings from 0 to 4 and days from 0 to 30.]

Figure 5. Top: Partner and examinee for whom support decreases closeness and increases negative mood. Bottom: Partner and examinee for whom support increases closeness and decreases negative mood.

Limitations

This study had several limitations, some of which have been discussed above. The sample was not randomly chosen and consisted of well-educated individuals. It is possible that this pattern of results would not be present in a less privileged population. In future studies, more diverse samples should be sought to determine the generalizability of these findings.

In addition, given that this was a nonexperimental study, we were unable to completely adjust for the level of stress participants experienced and to definitively establish that support increased negative mood instead of just co-occurring with it. We took steps to limit this concern: We included a count of stressful events in all analyses and adjusted for morning negative mood. In addition, by demonstrating the effects in both highly stressed individuals (examinees) and less stressed individuals (partners), we feel confident that support receipt can have these mixed effects regardless of overall stress level. Furthermore, the simulation study by Seidman et al. (2006) concluded that the negative effects associated with support receipt in naturalistic studies could not reasonably be caused by co-occurring negative events; the parameter values needed for such an association were unrealistic. Finally, a recent experimental study demonstrated the negative effects of support receipt on mood (Bolger & Amarel, 2007). Given these previous findings and the precautions we took to minimize this concern, we feel confident that support is causally linked to negative mood in this study. In the future, however, we hope to demonstrate this pattern of findings in an experimental setting.

Our analyses of support receipt and provision included only emotional support. Although we also asked about practical and instrumental support, we found that emotional support was the driving force behind our findings and therefore included only emotional support in our models. However, research by other individuals has shown that emotional support can be further expanded into such things as esteem support, companionship, and caring (Kang & Rafaeli, 2007). Perhaps such finer distinctions in the type of support received would shed some light on the current findings. For instance, it may be that esteem support is both common and unhelpful, whereas companionship is less common but beneficial. Also, we had participants indicate whether they had received support or not, but did not obtain information on the amount of support they received. Future studies should include a more detailed collection of support that may shed light on both differential effects of support receipt and individual differences in the effects of support receipt.

Finally, although we did include theoretically important constructs as moderators, there are several others that would be of interest that were not included. These include not only characteristics of the recipient (e.g., perceived competence), but also of the provider (e.g., empathy or responsiveness) and of the relationship itself (e.g., communication style). Future studies would do well to examine such possible moderators.

Concluding Thoughts and Implications

The effects of support receipt in this study were consistent and compelling. Although actual support receipt has been linked to negative outcomes (Barrera, 1981; Bolger et al., 2000; Gleason, Iida, et al., 2003), this study demonstrated that this does not appear to be true for relationship closeness. These findings highlight how a single act (receiving support) can make one feel better in one domain (relationship closeness) but worse in another (personal mood). Perhaps it is this duality that allows social support to be considered positive by laypersons even though psychologists have documented that it can be ineffective and even detrimental. Furthermore, evidence of random effects showed that this duality itself showed substantial between-individual heterogeneity, which implies that models of support processes need to accord a more central role to heterogeneity than they have done heretofore. At this time, we have not identified moderators of this heterogeneity and therefore cannot explain why people react differently to support, but understanding that individuals react differently is an important step toward understanding social support processes in close relationships.

In conclusion, it is worth considering that the implications of our heterogeneity findings may not be limited to the social support literature. Although many of the constructs studied in psychology are likely to have differential effects across individuals, dyads, and social contexts, rarely are studies designed in ways that allow such heterogeneity to be reliably distinguished. Potential moderators can be included in studies, of course, but the intensive longitudinal approach taken here allows one to quantify the extent of heterogeneity without having identified moderators a priori. Thus, these intensive designs have the possibility to demonstrate that many of the findings in the field reflect average effects, averages that can obscure important, consequential variability.


Received March 3, 2006
Revision received August 28, 2007
Accepted August 30, 2007

Journal of Personality and Social Psychology 2008, Vol. 94, No. 5, 839–859

Copyright 2008 by the American Psychological Association 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.839

Nomina Sunt Omina: On the Inductive Potential of Nouns and Adjectives in Person Perception

Andrea Carnaghi, University of Trieste
Anne Maass and Sara Gresta, University of Padova
Mauro Bianchi, University of Jena
Mara Cadinu and Luciano Arcuri, University of Padova

Six studies (N = 491) investigated the inductive potential of nouns versus adjectives in person perception. In the first 5 studies, targets were either described by an adjective (e.g., Mark is homosexual) or by the corresponding noun (e.g., Mark is a homosexual) or by both (Study 3). The authors predicted and found that nouns, more so than adjectives, (a) facilitate descriptor-congruent inferences but inhibit incongruent inferences (Studies 1–3), (b) inhibit alternative classifications (Study 4), and (c) imply essentialism of congruent but not of incongruent preferences (Study 5). This was supported for different group memberships and inclinations (athletics, arts, religion, sexual preference, drinking behavior, etc.), languages (Italian and German), and response formats, suggesting that despite the surface similarity of nouns and adjectives, nouns have a more powerful impact on person perception. Study 6 investigated the inverse relationship, showing that more essentialist beliefs (in terms of a genetic predisposition rather than training) lead speakers to use more nouns and fewer adjectives. Possible extensions of G. R. Semin and K. Fiedler’s (1988) linguistic category model and potential applications for language use in intergroup contexts are discussed.

Keywords: language, essentialism, stereotyping

In his seminal work, The Nature of Prejudice, Allport (1954) argued that social category labels such as Jew, Black, or gay, are unusually potent both as cognitive organizing principles and as evaluative reference points. In a similar vein, Rothbart and Taylor (1992) have argued that social categories (such as Jew) are often perceived much like “natural kinds” (such as birds or fish) rather than cultural constructions. People tend to attribute a deep underlying essence to social categories, analogous to biological species that are genetically defined. Perceiving social categories as quasibiological concepts has a number of important implications, including the inexorability and exclusiveness of group membership. Indeed, thinking of social categories as natural kinds implies not only that membership remains stable over time but that it practically excludes simultaneous membership in other categories considering that each object has only a single essence. Furthermore, such natural kind categories also tend to have great inductive potential, allowing inferences about a wealth of related attributes.

As for labels that are associated with categories in general and social categories in particular, there is little doubt that they provide perceivers with important and useful information as they indicate to which class of objects a given instance belongs (Krueger & Clement, 1994; Putnam, 1975; Rothbart, Davis-Stitt, & Hill, 1997). Moreover, even when labels, by virtue of their absolute meaning (e.g., nominal categories such as A and B), are totally irrelevant to the classified objects (e.g., lines that differ in their length), they functionally represent some underlying property of the objects in question (e.g., ordinal categories such as shorter and longer; Tajfel & Wilkes, 1963). There is, indeed, evidence that social labels carry additional information that the object itself does not convey (Bruner, Goodnow, & Austin, 1956). For instance, observers who have to guess the weight of visually presented target persons are greatly influenced by the labels attached to the targets (such as obese vs. above average; see Foroni & Rothbart, 2006). Along the same line, derogatory group labels (e.g., fag, nigger) carry affective information that goes beyond the mere description of the group and that produces distinctly negative associations, quite different from the corresponding neutral group labels (e.g., gay, Afro-American; Carnaghi & Maass, in press; Carnaghi, Maass, Bianchi, Castelli, & Brentel, 2005; Simon & Greenberg, 1996). In summary, there is evidence for Allport’s (1954) claim that noun labels serve both as organizing devices and as carriers of affective information. However, do they fulfill these functions to a greater extent than other linguistic forms? In particular, how do they compare with adjectives, the word form that is dominant in spontaneous person description as well as in personality research in Western languages?¹ Note that despite Allport’s (1954) early intuition regarding the distinct role of nouns, they have remained a neglected word class in social and personality psychology. Although extensively investigated by linguists, nouns (unlike adjectives) have attracted surprisingly little attention in research on person perception and personality. To cite but two examples, practically all Big Five research conducted in different countries and on different languages has relied exclusively on adjectives, with only a few researchers considering the possibility that people may also be described by nouns (e.g., De Raad & Hoskens, 1990; Saucier, 2003; see also Angleitner, Ostendorf, & John, 1990). In a similar vein, Semin and Fiedler’s (1988) influential linguistic category model (LCM) distinguishes four word classes (including three types of verbs and adjectives) relevant to the interpersonal domain, without considering nouns. The scope of the present research is to start filling this lacuna by investigating the distinct role that nouns play, compared with adjectives, in person description. We first briefly review some of the linguistic literature concerning differences between nouns and adjectives and then speculate about the implications for the domain of person perception.

Andrea Carnaghi, Department of Psychology, University of Trieste, Trieste, Italy; Anne Maass, Sara Gresta, Mara Cadinu, and Luciano Arcuri, Department of Psychology, University of Padova, Padova, Italy; Mauro Bianchi, International Graduate College, University of Jena, Jena, Germany. We are grateful to L. Castelli, G. B. Flores D’Arcais, R. Job, L. Lotto, and V. Yzerbyt for their helpful comments on our studies. Correspondence concerning this article should be addressed to Andrea Carnaghi, University of Trieste, Via S. Anastasio 12, 34134, Trieste, Italy. E-mail: [email protected]

Nouns Versus Adjectives: Similarities and Differences

At the surface, nouns and adjectives appear functionally similar, as in the case of statements like Shira is a Jew versus Shira is Jewish. In many Indo–European languages, nouns and adjectives may share the same word stem and differ only in suffix (athlete–athletic, drunk–drunkard), implying, among other things, considerable phonetic similarity. At times, they are distinguishable only by the article (e.g., Marco is an Italian vs. Marco is Italian; Laurie is an anorexic vs. Laurie is anorexic). At least in Indo–European languages, they also seem to play a similar role in personality description (Saucier, 2003).

Despite the striking functional, phonetic, and semantic similarities between nouns and adjectives, there are reasons to believe that nouns have a greater inductive potential and exert a greater effect on impression formation than adjectives. Indeed, linguists have identified a number of ways in which nouns differ from adjectives (e.g., Jespersen, 1986; Lyons, 1977; Wierzbicka, 1986). The first basic difference is that nouns represent a universal word class, whereas adjectives are rare or entirely absent in some languages (Dixon, 1977). Second, and most important for the aims of our research, nouns identify the class to which a given object belongs whereas adjectives denote a quality or property of the object. That is, nouns categorize people by assigning them to a specific group or type or kind of person, which is quite different from adjectives that denote one of many qualities that a person may possess. It is interesting to note that children are sensitive to this difference between nouns and adjectives as early as 2 years of age (Gelman & Coley, 1990). For instance, Hall and Moore (1997) exposed preschoolers along with adults to a target (i.e., a blue creature) that was defined either by an adjective (e.g., This is a blue one) or by a noun (e.g., This is a blue).
Participants were then presented with two additional pictures, one depicting a target-kind matching object (i.e., the same creature but with a different color) and the other depicting a property-matching object (i.e., a novel creature, different from the
target, but with the same color). Depending on experimental condition, participants were asked to point either to the blue (i.e., the noun form) or to the blue one (i.e., the adjectival form). Results indicated that in the adjective condition, participants chose the property-matching object whereas in the noun condition participants typically pointed to the target-kind matching object. In other words, Hall and Moore (1997) showed that preschoolers used noun labels as event-organizing devices based on category membership, whereas they used attributes to indicate a contingent property of the object. Third, whereas adjectives tend to refer to a single property, nouns imply a complex cluster of characteristics, although the noun is generally considered more than the sum of these properties (Wierzbicka, 1986; see also Jespersen, 1968). An example cited by Wierzbicka (1986) may illustrate this point: Whereas blond refers to a single property of the target person, namely his or her hair color, the noun blonde implies multiple properties (including being female and probably a host of additional characteristics including sexy, glamorous, and not particularly intelligent). It is therefore not surprising that nouns elicit richer associations than adjectives. Indeed, Loftus (1972) found that participants listed more associated items when cued by a noun (e.g., fish, clam, etc., when prompted by seafood) than when cued by an adjective (e.g., brick, cement, etc., when prompted by hard). Fourth, and closely related to the above points, adjectives tend to allow distinctions of degree (e.g., a person may be more or less athletic), whereas nouns tend to have an either–or quality (e.g., a person either is or is not an athlete). This difference is clearly demonstrated by the fact that adjectives, but not nouns, allow comparative and superlative forms (e.g., athletic, more athletic, and most athletic).
Fifth, adjectives are parts of speech that modify nouns, suggesting that nouns are primary in information processing and that adjectives can be fully understood only after the noun has been processed. As argued by Wierzbicka (1986), the noun is semantically superordinate, whereas the modifying adjective plays a subordinate role. For instance, talking about a Catholic woman or a female Catholic is not the same, as the focus in the former is on gender (qualified by religion) and the focus in the latter is on religious affiliation (qualified by gender). Indeed, the former phrase directs attention to the subportion of all females that is also Catholic, whereas the opposite holds for the latter phrase. That the adjective is subordinated with respect to the noun is also exemplified by the fact that in languages that have nominal gender (such as German or Italian), the adjective adapts to the gender of the noun rather than vice versa. In line with the idea of superordination versus subordination in sentence processing, existing research generally shows that nouns, compared with adjectives, have primacy in speech production and are more potent memory cues. In speech production, nouns are chosen before adjectives simply because the object has to be defined before its properties can be delineated, regardless of whether adjectives precede or follow nouns in a given language (e.g., a red automobile vs. un’automobile rossa; see Martin, 1969). Also, there is evidence that nouns are generally more available for recall than adjectives (Lockhart, 1969; Lockhart & Martin, 1969). For instance, in Lockhart and Martin’s (1969) study, participants were first asked to memorize a series of adjective–noun pairs, followed by a cued recall test in which either the noun or the adjective served as the cue on the basis of which participants were to recall the missing word. Results supported the idea that nouns are more effective cues than adjectives.

In summary, nouns differ from adjectives in important ways. They (a) categorize events, rather than describing event properties; (b) imply multiple qualities, rather than specifying a single property; (c) have an either–or quality, rather than allowing distinctions of degree; and (d) have a superordinate status in speech production and sentence comprehension. Thus, nouns are more efficient memory cues.

The main thesis of this article is that the above differences between nouns and adjectives, identified by research in linguistics, may have important implications for impression formation. We argue that people will form quite different impressions of and make different inferences about others when nouns rather than adjectives are used to describe them. First, we hypothesize that nouns, compared with adjectives, lead people to draw richer inferences about the target and that these inferences will be largely in line with stereotypical expectancies associated with the label. Put simply, nouns will induce greater stereotyping than adjectives. Second, we believe that nouns are more likely than adjectives to inhibit alternative classifications of the same person. Thus, once a person has been identified as X (e.g., an athlete), it becomes subjectively difficult to believe that the same person may also be Y (e.g., an artist).
In contrast, one adjectival quality is unlikely to preclude another, unless there is a logical (semantic) contradiction. For example, the same person may be athletic as well as artistic. Finally, we predict that nouns will convey greater essentialism than adjectives, thus implying a more profound characteristic of the person. We explain the reasons for each of these predictions below and then provide a brief overview of the present research project.

¹ Throughout this article, we use the terms nouns and adjectives in a less generic sense than is usually done. We are referring only to the subset of nouns and adjectives that can be applied to human beings and that generally denote humans’ social membership (including intimacy groups, social categories, and loose associations, such as professions, see Lickel et al., 2000) as well as their personal qualities (inclinations or personality types).

Stereotypical Inferences

The first question to be addressed is whether nouns will induce stronger inferences about the target person than adjectives and whether these inferences will mirror the stereotypes that are associated with the label. For instance, will people infer typical Jewish habits more strongly when the target person is described as a Jew than when he or she is described as a Jewish person? The fact that nouns assign targets to types and generally delineate multiple properties would indeed suggest that they provide a richer source for subsequent inferences. To our knowledge, there are few studies that have investigated inferences elicited by nouns and adjectives, and most of them have taken a developmental (Gelman & Coley, 1990; Gelman, Collman, & Maccoby, 1986; Gelman & Markman, 1986, 1987) rather than social psychological or cognitive perspective (Markman, 1989; Markman & Smith as cited in Markman, 1989). Gelman and Coley (1990) reported that preschoolers inferred that two animals with the same noun label (e.g., bird), but not with the same adjective label (e.g., wide awake), share the same nonobvious properties (e.g., lives in a nest) even when the two animals were perceptually dissimilar (i.e., a dodo). Thus, nouns, compared with adjectives, led preschoolers to go beyond the perceptual exemplar and draw novel inferences on the basis of the animal category membership (see also, Gelman & Markman, 1986, 1987). Along similar lines, Gelman et al. (1986) found that gender nouns, such as boy, triggered richer inferences than gender-matching properties, namely will grow up to be a daddy. Preschoolers appropriately inferred sex-related properties from a category-defining label (i.e., a boy) but their performance decreased when they accomplished a gender-categorization task in which they had to focus on sex-related properties (i.e., will grow up to be a daddy). In other words, Gelman et al. (1986) observed an asymmetry between inferring properties from a given category (i.e., deduction) and inferring the category from properties (i.e., induction), with the deductive process being more viable than the inductive one.

Turning to the adult literature, few studies have contrasted the use of and the consequences produced by nouns and adjectives. Among the few exceptions is Markman’s (1989) work demonstrating that nouns, compared with adjectives, trigger a richer representation of the object that they define. Markman (1989) asked participants to read a series of labels and to list, for each label, their defining properties. Note that labels were either adjectives (e.g., intellectual) or nouns (e.g., an intellectual). Quantitative analyses revealed that participants listed many more properties for nouns than for adjectives.
In a similar study by Markman and Smith (as cited in Markman, 1989), participants were provided with sentence pairs depicting a person with an adjective (e.g., Alexander is intellectual) and with a corresponding noun (e.g., Alexander is an intellectual). Participants were asked to list what else might be typically expected of this person. Again, results indicated that participants listed more attributes in response to the noun than to the adjective label. In addition, when asked which sentence of each pair seemed to be more informative, participants systematically chose the noun label and justified their choices by claiming that nouns were more lasting and central to the target’s identity. Together, the above research suggests that nouns, more than adjectives, lead perceivers to draw inferences that go beyond the information given. At least in adulthood, perceivers also judge the event in question as more stable and enduring when nouns are used, suggesting that the identity of an object is communicated more forcefully when expressed by nouns than by adjectives. Although previous research on both children and adults clearly shows that nouns stimulate more inferential processing than adjectives, none of the above studies has investigated the degree to which these inferences follow stereotypical beliefs. We predict that nouns elicit only inferences that are congruent with what is stereotypically associated with the label, but they may actually inhibit inferences about aspects that are incongruent. Because nouns represent category-defining labels whereas adjectives stand for category-related properties (Hall & Moore, 1997), we argue that nouns (e.g., Jew) will make category-related contents more salient than the corresponding adjectives (e.g., Jewish). Moreover, impression formation research has consistently shown that to the extent to which a given category is salient, an individual target will be assimilated to the category-related contents (Fiske & Neuberg, 1990). 
On the basis of these premises, we hypothesized stronger stereotypical inferences (e.g., He always goes to the Synagogue), but weaker counterstereotypical inferences (e.g., He always goes to the Church), when an individual target is described by a noun (e.g., Mark is a Jew) than when the same individual target is described by the corresponding adjective (e.g., Mark is Jewish). Thus, unlike prior research, our studies investigate both facilitative and inhibitory effects of nouns (vs. adjectives) on inferential processing. If confirmed, this would have important implications for the perpetuation of social beliefs, considering that the facilitation of stereotypical inferences and inhibition of counterstereotypical inferences, typical of nouns, would render stereotype change rather difficult. The first aim of our research was therefore to investigate this issue.

Inhibition of Alternative Classifications

The second issue to be addressed in this article is whether nouns are likely to inhibit alternative classifications more so than other word forms. At a theoretical level, this idea was originally proposed by Allport in 1954 when he argued that nouns inhibit “alternative classification or even cross-classification” (p. 179) of the same object (or person). In other words, the fact that nouns assign the object (in our case, the person) to a specific, all-encompassing class and that this assignment has the quality of an all-or-none statement makes alternative classifications less likely. Although this hypothesis is interesting and intuitively appealing, we are not aware of any empirical proof (or disconfirmation) reported in the literature. Although individuals can theoretically be classified according to many different dimensions (age, sex, ethnic group membership, profession, religion, nationality, etc.), once a specific classification has occurred, alternative classifications tend to be inhibited. For instance, Macrae, Bodenhausen, and Milne (1995) have demonstrated such inhibitory mechanisms in their well-known research involving a target classifiable either as a woman or as Chinese. When one of the two categories (either woman or Chinese) became highly accessible, the competing category was inhibited. Although the role of inhibitory processes in social categorization is well documented in the social psychological literature, little is known about the role that language plays in this process. Despite the lack of research on this issue, there are reasons to give credence to Allport’s (1954) intuition, considering that nouns delineate a single, all-encompassing type of person, which is less likely to allow competitors than are adjectives describing one out of many possible qualities. Testing this conjecture was one of the main aims of this research.

Essentialism

The third question addressed in this research program is whether nouns imply greater essentialism than adjectives. For example, art-related activities such as painting or visiting art exhibitions are certainly typical of both artistic people and artists, but when performed by an artist, these behaviors are likely to be interpreted as a more profound and enduring behavior tendency than when performed by an artistic person. Indeed, Wierzbicka (1986) has speculated that “human characteristics tend to be designated by nouns rather than adjectives if they are seen as permanent and/or conspicuous and/or important” (p. 357). Inverting this logic, we
predict that human characteristics that are designated by nouns (rather than adjectives) will be perceived as more permanent and/or important. To our knowledge, this hypothesis has not been tested in the past, but indirect evidence for the high degree of essentialism implied by nouns comes from studies comparing nouns with verbs (rather than adjectives). As a matter of fact, Walton and Banaji (2004) found that the same behavioral preference was perceived as a much deeper seated and more central characteristic when the person was described by a noun (e.g., carrot eater) rather than by a verb (e.g., eating carrots). In our studies, we aimed to extend this line of research by focusing on the (much more subtle) difference between adjectives and nouns, hypothesizing that nouns entail greater essentialism than adjectives. Indeed, we varied the label (i.e., noun, such as artist, vs. adjective, such as artistic) with which the target was described and then assessed the inferences about the targets’ behavioral preferences (such as likes to draw paintings). If nouns were indeed shown to imply greater essentialism, then it becomes legitimate to ask whether the inverse relationship may also be true. Not only may nouns induce essentialist perceptions in the listener, but it is also possible that speakers who do hold essentialist beliefs find nouns particularly useful for revealing this conviction. Therefore, the fourth and final aim of this research project was to investigate how a speaker’s beliefs may guide his or her linguistic choices and, more specifically, whether nouns would be considered particularly useful for expressing essentialist beliefs.

Overview of the Studies and Hypotheses To address the above questions, we present a series of studies covering the four interrelated issues. First, in Studies 1 and 2, we tested the hypothesis that nouns (e.g., athlete) trigger more stereotypical inferences and fewer counterstereotypical inferences than the corresponding adjectives (e.g., athletic). In other words, compared with adjectives, nouns were expected to facilitate stereotypical inferences but to prevent counterstereotypical inferences. In an additional experiment (Study 3), we investigated whether this would hold even in the case in which the same person was described both by a noun and by an adjective. Thus, we tested the relative weight of nouns and adjectives in combined noun–adjective descriptions such as Marco is an artistic athlete or Marco is an athletic artist, again hypothesizing a greater impact of nouns in eliciting stereotype-congruent inferences. This study also allowed us to test whether the descriptor encountered first exerts a greater influence on the perception of the target person than the one encountered later. The second question we investigated was whether nouns would inhibit alternative classifications more so than adjectives. Because every person has multiple category memberships (e.g., Jew, woman, academic, Italian, environmentalist), he or she may legitimately be described in many different ways. However, following Allport’s (1954) argument, once targets with multiple memberships have been assigned to one particular social category (e.g., Jew), other classifications (e.g., woman, Italian) should become difficult. Following Allport’s reasoning, we argue that the inhibition of alternative classifications should be much stronger when nouns (Jew), rather than adjectives (Jewish), are used, for the simple reason that nouns already define a category membership whereas adjectives delineate a quality that is compatible with

many different categories. This issue was investigated in Study 4, in which a target person was described either by a noun or by the corresponding adjective. Participants were asked to estimate the likelihood that the target would also belong to another, unrelated category (noun) or possess another, unrelated trait (adjective). The third issue, addressed in Study 5, concerned the differential likelihood of nouns versus adjectives to elicit essentialist perceptions of the target. If nouns are more abstract than adjectives, then the behavioral preferences of a target described by a noun should be perceived as intrinsically linked to the person. In line with previous empirical evidence on the issue under consideration (Walton & Banaji, 2004; but see also Gelman & Heyman, 1999; Markman, 1989), we hypothesized that a target described by a noun (Mark is an athlete), rather than by the corresponding adjective (Mark is athletic), would allow perceivers to appraise the target’s preference as stronger, more persistent, and less prone to be modified under situational constraints. This was expected for the case in which the target’s preference was congruent with the label’s meaning (e.g., he runs three times per week). In contrast, behavioral preferences that are incongruent with the label’s meaning (e.g., he drinks a lot of wine each day) should lead perceivers to consider the target’s preference as less essential, that is as weaker, less enduring, and more likely to vary under situational constraints, when the target is described by a noun rather than by an adjective. Thus, noun (compared with adjective) labels will foster the perception of essence in cases of fit but will undermine the attribution of essence for characteristics that do not concord with the label. Finally, in our last experiment (Study 6), we inverted the logic of the above experiments. 
Rather than investigating the effects that nouns and adjectives have on the attribution of essence, we tested what words people would use in interpersonal communication, depending on whether they do or do not have essentialist beliefs. Thus, our last experiment focused on the speaker rather than on the receiver of the communication. The main prediction was that people would show a greater preference for nouns when they perceive a given characteristic as fundamental, lasting, deep-seated, and/or innate. In contrast, when the same characteristic is perceived as transient or less essential, people should prefer to use adjectives. We induced the belief that athletic abilities (Study 6A) or intelligence (Study 6B) is either genetically determined or mainly a result of training. We expected that participants, when asked to describe a specific target person engaging in athletic (or intelligent) behaviors, would use more nouns and fewer adjectives when they had a genetic theory of athletic (or intellectual) abilities. The nouns and adjectives used in the different studies of this research project cover the main types of social groups identified by Lickel et al. (2000) in their seminal work on intuitive theories of groups, namely intimacy groups (e.g., father, mother), social categories (e.g., homosexuals, Catholics), and loose associations (e.g., professional groups such as artists or athletes). The only type of group not considered in this research is task groups (e.g., students studying for an exam, employees of a local restaurant) that can be defined only in context and hence cannot easily be described by a single noun or adjective. Instead, we added a type of category not considered by Lickel et al., namely nouns and adjectives describing personality characteristics (e.g., individualist, drunkard, genius), because such personality types are particularly common in person descriptions.

Together, we hoped that this research would provide convergent evidence for the powerful effect of lexicalization (Gelman & Heyman, 1999) in the process of impression formation. More than adjectives, we expected nouns (a) to bolster category-congruent expectations and to reduce the expectancy of incongruent behaviors (Studies 1–3), (b) to inhibit possible alternative classifications (Study 4), and (c) to suggest greater essence (Study 5). Finally, in a complementary vein, the essentialist views of speakers were expected to lead them to use more nouns and fewer adjectives (Study 6).

Study 1: Stereotype-Congruent Inferences Study 1 aimed at testing whether the linguistic form (i.e., adjectives vs. nouns) in which an individual target is described can affect perceivers’ estimates of the frequency of descriptor-congruent behaviors. Participants were provided with a short description of a target person by way of either an adjective (i.e., athletic) or a noun (i.e., athlete). Participants were then required to estimate the frequency of a descriptor-congruent behavior as a measure of stereotype application. We expected that participants would make stronger inferences from nouns than from adjectives. This hypothesis was tested on professions and personality types in Study 1A, on intimacy groups in Study 1B, and on a random sample of adjective–noun pairs in Study 1C.

Study 1A Method Participants. Forty-eight students enrolled at the University of Padova participated in the experiment (30 women and 18 men). Participants were recruited at the university library and completed the questionnaire individually. Procedure and materials. Participants were told that we were interested in the way people form an impression about a target about which they had only a restricted amount of information available. This cover story was used in all studies, unless otherwise noted. Participants were provided with a questionnaire containing three sentences each portraying an individual target. Each sentence contained the name of the target (Paolo, Marco, or Fabio), his age (21, 22, or 23 years), and a short description. Depending on the experimental manipulation, the description was given by means of either a noun (e.g., Paul, 22 years old, is an artist) or an adjective (e.g., Paul, 22 years old, is artistic). Each participant received three descriptors either in noun form (i.e., artista [artist], atleta [athlete], and genio [genius]) or in the corresponding adjective form sharing the same word stem to assure a maximum degree of phonetic and semantic similarity (i.e., artistico [artistic], atletico [athletic], and geniale [brilliant]). Participants then had to estimate the frequency of the corresponding descriptor-congruent behavior (respectively, “How many paintings does he draw each week?” “How many kilometers does he run per week?” and “How many problems does he solve in a week?”) on 11-point scales ranging from 1 (few) to 11 (many). These behaviors were selected on the basis of a small pretest (N = 9) showing that each behavior was judged typical of one category label but at the same time unrelated to the remaining two categories. To decrease the risk of participants selecting behaviors that

were more typical of nouns than of adjectives, we had participants in the pretest judge each behavior with respect to category labels comprising both the noun and the adjective descriptor (e.g., “How typical is it for an athlete/for an athletic person to run?”). The order of the presentation of the sentences was counterbalanced. We expected participants’ estimations of the descriptor-congruent behaviors to be stronger in the noun condition than in the adjective condition.

Results and Discussion

First, in order to equate the two dependent variables, we standardized participants’ ratings for each descriptor-congruent behavior into z scores and then analyzed these scores by means of a 3 (type of behavior: draw vs. run vs. solve) × 2 (condition: adjective vs. noun) × 2 (version: Order 1 vs. Order 2) analysis of variance (ANOVA). An almost significant main effect of condition was found, F(1, 44) = 3.81, p = .057, ηp² = .08. Inspection of the means (see Footnote 2) revealed that participants’ ratings tended to be higher in the noun (M = 7.23, SE = 0.33) than in the adjective condition (M = 6.25, SE = 0.36), suggesting that nouns were more likely to elicit stereotypical inferences than adjectives. No other effects were found.

Study 1B This study aimed to extend the results of Study 1A to a different semantic as well as social domain, namely kinship or intimacy groups (see Lickel et al., 2000).

Method Participants. Twenty participants (11 women and 9 men) enrolled at the University of Padova participated in the experiment. Procedure. The experimental procedure was the same as in Study 1A, except for the experimental material. Participants, in a within-participants design, were given two nouns (mother, father) and two corresponding adjectives (maternal, paternal) as the target descriptors. They were then asked to rate the frequency of descriptor-congruent behaviors (i.e., “To what degree is the target interested in food-related issues?”; “To what degree is the target engaged in financial matters?” respectively, for the mother–maternal and for the father–paternal pairs) on a 7-point scale, ranging from 1 (not at all) to 7 (very much). We counterbalanced the order of presentation of the targets. Again, we expected that participants would estimate the frequency of a congruent behavior higher when exposed to a noun rather than to an adjective as the target descriptor.

Results and Discussion Participants’ ratings were first averaged to create two indices of participants’ estimations, one for the nouns (α = .61) and the other for the adjectives (α = .56). In line with our a priori hypothesis, a paired-sample t test revealed that participants’ ratings were higher in the noun (M = 5.88, SE = 0.22) than in the adjective condition (M = 5.35, SE = 0.22), t(19) = 1.71, p < .05, one-tailed, d = .53. These results clearly show that participants were more prone to draw descriptor-congruent inferences when they received a noun rather than an adjective as the target descriptor, thus expanding previous results to the realm of kinship or intimacy groups.

However, a potential limit of the above findings is that both Studies 1A and 1B deal with restricted samples of the adjective–noun pairs that may not be representative. As a matter of fact, one may argue that our results reflected the peculiarity of the experimental material rather than representing a robust phenomenon that can be generalized to any other matched noun–adjective pair. In order to bolster the external validity of our results, we therefore decided to run a new experiment using a larger and randomly selected list of adjective–noun pairs.

Study 1C

Participants. Thirty-seven students at the University of Padova were recruited at the university library and completed the questionnaire individually. Six participants did not entirely fill out the questionnaire and were discarded from the analysis, reducing the sample to 31 students (24 women and 7 men). Procedure and materials. The procedure was the same as that in Study 1A, except for the stimulus material and for the fact that a within-participants design was used. Participants read 24 sentences each portraying an individual target. Each sentence contained the name of the target (e.g., Paolo) and a short description. Twelve sentences included nouns as target descriptors and twelve sentences included adjectives as target descriptors. As for the selection of the adjective–noun pairs, we used a random number table to identify pages of the Zingarelli (1988) dictionary, one of the most widely used Italian dictionaries. Starting from each randomly selected page, we selected the first noun–adjective pair that satisfied the following six criteria: 1.

Noun and adjective shared the same word stem and appeared in close proximity (generally as a part of the same paragraph).

2. Neither noun nor adjective was reported in the dictionary as rare or archaic.

3. Nouns and adjectives were judged as semantically similar according to two independent judges as well as to the definitions provided by both the Zingarelli (1988) and the Devoto and Oli (1971) dictionaries. Note that semantic similarity of the adjective–noun pairs was essential to assure that the same congruent (or incongruent) behaviors could serve as dependent measures. Unpaired samples of nouns and adjectives would necessarily require that different and hence no longer comparable behaviors be used as the dependent measures.

4. Nouns and adjectives were applicable to human beings according to the judgments of two independent raters. For instance, the word pair archeologist–archeological was excluded because the adjective is not easily applicable to human beings.

5. Nouns and adjectives were distinguishable by suffix. We did not allow any entries in which the exact same word was identifiable as noun and as adjective (such as Aboriginal, Panamese), because, in such cases, the two linguistic forms are indistinguishable, unless the article is added.

6. Finally, if no appropriate pairs were found, applying the above criteria, on three consecutive pages, we moved on to the next random number.

Footnote 2. Note that all the statistical analyses of Study 1 (but not of Study 1B), Study 2, and Study 3 were computed on z-transformed scores, whereas, for the sake of clarity, all means are reported as raw scores.

Following these criteria, we selected 12 adjective–noun pairs (see the Appendix). A focus group identified a descriptor-congruent behavior considered diagnostic of both noun and adjective of each pair. Participants were asked to estimate the occurrence of the corresponding descriptor-congruent behavior on a 7-point scale ranging from 1 (few) to 7 (many). In order to avoid any effect due to the order of presentation, we counterbalanced the list of the target descriptions.
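The page-sampling rule used to draw the stimulus pairs can be illustrated with a short Python sketch. It is only an approximation under stated assumptions: a pseudorandom generator stands in for the random number table, and the hypothetical callback `first_valid_pair` stands in for the human judgments behind Criteria 1 to 5 (it returns the first acceptable pair on a page, or `None`). Skipping pairs that were already drawn and capping the number of draws are additions of the sketch, not steps reported in the procedure.

```python
import random
from typing import Callable, List, Optional, Tuple

Pair = Tuple[str, str]  # (noun, adjective)


def select_pairs(
    n_pages: int,
    n_pairs: int,
    first_valid_pair: Callable[[int], Optional[Pair]],
    rng: random.Random,
    pages_per_draw: int = 3,
    max_draws: int = 1000,
) -> List[Pair]:
    """Collect up to n_pairs word pairs starting from randomly drawn pages."""
    selected: List[Pair] = []
    for _ in range(max_draws):  # assumption: cap so the sketch always ends
        if len(selected) >= n_pairs:
            break
        start = rng.randint(1, n_pages)  # stand-in for the random number table
        last = min(start + pages_per_draw - 1, n_pages)
        # Criterion 6: inspect at most three consecutive pages per draw,
        # then move on to the next random number.
        for page in range(start, last + 1):
            pair = first_valid_pair(page)
            if pair is None:
                continue
            if pair not in selected:  # assumption: no pair is used twice
                selected.append(pair)
            break
    return selected


# Hypothetical six-page "dictionary" with acceptable pairs on pages 2 and 5.
toy_dictionary = {2: ("atleta", "atletico"), 5: ("artista", "artistico")}
pairs = select_pairs(6, 2, toy_dictionary.get, random.Random(0))
print(sorted(pairs))
```

The callback keeps the judgment-based criteria outside the sampling loop, so the same loop could be reused with any operationalization of those criteria.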

Results and Discussion Participants’ ratings were first standardized into z scores and averaged, thus creating two indices of target stereotyping, one related to adjectives and one related to nouns. Because the order of presentation did not produce any significant effect, we discarded this factor from the current analyses. We analyzed the data first using the participants and subsequently the stimuli as the unit of analysis. In line with our a priori hypotheses, participants’ ratings were higher in the noun (M = 4.64, SE = 0.11) than in the adjective condition (M = 4.43, SE = 0.12), F(1, 30) = 7.34, p = .011, ηp² = .20. Furthermore, we repeated the analysis using the noun–adjective pairs as random independent variables. Again, findings revealed that nouns (M = 4.64, SE = 0.14) elicited higher ratings than adjectives (M = 4.36, SE = 0.21), F(1, 11) = 6.74, p = .025, ηp² = .38. Taken together, these results confirmed that nouns were more likely than adjectives to trigger stereotypical inferences. Note that the stronger inductive power of nouns versus adjectives has been found using both participants and stimuli as random variables (see Footnote 3). Moreover, the use of an extended and randomly generated list of adjective–noun pairs rules out the possibility that the findings may simply be a function of biased or unrepresentative stimulus material. The fact that an identical result pattern was obtained when randomly selected adjective–noun pairs were used suggests that the findings of Study 1A can be generalized across different linguistic labels within the larger word classes of nouns and adjectives. There is one potential confound that may create problems for the interpretation of our results, namely that nouns and adjectives may differ in implicit valence. Studies on the LCM have shown that with increasing abstraction (e.g., passing from verbs to adjectives), words carry more evaluative information, which, in turn, may affect stereotypical inferences.
For instance, if nouns were to carry more evaluative meaning than adjectives, then greater inductive power for nouns may simply be a function of the fact that this category of words describes more extreme (positive or negative) characteristics than adjectives. To test this possibility, we had 40 volunteers rate the valence of our stimulus material on a 9-point scale from 1 (very negative) to 9 (very positive), half of whom rated the 12 adjectives and the other half the 12 nouns. On average, adjectives (M = 5.49) were rated as positively as nouns (M = 5.23), t(38) = 1.18, p = .25, d = .37. However, the difference between nouns and adjectives may not lie so much in the average positivity of the word but rather in the degree of polarization. Thus, nouns may imply a more extreme evaluation than adjectives both in the positive and negative direction. To test this possibility, we also looked at the absolute distance of each rating from the neutral scale midpoint (5). The degree of polarization was practically identical for adjectives (M = 1.95) and nouns (M = 1.98), t(38) = 0.12, p = .90, d = .04, suggesting that there was no difference in the degree to which the two types of words carried implicit evaluative (positive or negative) content. Although there were no reliable differences in valence or in evaluative polarization between nouns and adjectives, we calculated the relative valence and the relative evaluative polarization of nouns compared with adjectives by subtracting the mean ratings of each adjective from that of the corresponding noun. Thus, the higher the value, the more positive the noun compared with the adjective, or the greater the evaluative polarization of the noun compared with the adjective. We then added relative valence and, subsequently, relative evaluative polarization to our main analysis involving the stereotypical inferences. Remember that we had found that nouns elicit stronger stereotypic inferences than adjectives, F(1, 11) = 6.74, p = .025, ηp² = .38. This effect continued to be reliable when adding relative valence as a covariate, F(1, 10) = 9.31, p = .012, ηp² = .48, and it approached significance when using relative evaluative polarization as a covariate, F(1, 10) = 4.29, p = .066, ηp² = .30. It is therefore unlikely that the differential implicit valence of nouns versus adjectives may have played a critical role in our findings.
In other words, the fact that nouns elicit greater descriptor-congruent inferences than adjectives does not seem to be attributable to differences in evaluative content.
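The two control measures used in this analysis, evaluative polarization and the relative (noun minus adjective) scores, can be made concrete with a minimal Python sketch. The ratings below are hypothetical and serve only to show the arithmetic. Polarization is computed here as the mean absolute distance of the individual ratings from the scale midpoint, which is one reading of the description above and should be treated as an assumption.

```python
from statistics import mean

MIDPOINT = 5  # neutral midpoint of the 1 (very negative) to 9 (very positive) scale


def polarization(ratings):
    """Mean absolute distance of the ratings from the scale midpoint."""
    return mean(abs(r - MIDPOINT) for r in ratings)


def relative_scores(noun_ratings, adjective_ratings):
    """Return (relative valence, relative polarization), noun minus adjective."""
    relative_valence = mean(noun_ratings) - mean(adjective_ratings)
    relative_polarization = polarization(noun_ratings) - polarization(adjective_ratings)
    return relative_valence, relative_polarization


# Hypothetical valence ratings for one noun and its matched adjective.
noun = [8, 9, 7, 8]
adjective = [7, 7, 6, 8]
print(relative_scores(noun, adjective))
```

Positive values indicate that the noun was rated as more positive, or as more evaluatively extreme, than its matched adjective. These per-pair scores are what would enter the item analysis as covariates.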

Study 2: Congruent and Incongruent Inferences So far our studies show that compared with adjectives, nouns facilitate congruent inferences, but they do not test the related prediction that nouns also inhibit inferences about behaviors or habits that are incongruent with the descriptor. Study 2 aimed to investigate this issue.

Study 2A Method Participants. Sixty-four students (39 women and 25 men) enrolled at the University of Padova were recruited in the university library and asked to fill out the questionnaire individually. Two participants were excluded from the analyses because they completed questionnaires collectively, resulting in a final sample of 62 participants (38 women and 24 men). Procedure and materials. Half of the participants received information about the two targets in adjective form (i.e., Paolo, 27 years old, is artistic; Marco, 27 years old, is drunk), and the other half received information about the two targets in noun form (i.e., Paolo, 27 years old, is an artist; Marco, 27 years old, is a drunkard). The order of presentation of the targets was counterbalanced across participants. For each individual description, participants were asked to estimate the frequency of occurrence of two behaviors (i.e., “How many kilometers does he run per week?” and “How many glasses of wine does he drink per week?”) on an 11-point scale ranging from 1 (few) to 11 (many). On the basis of a pretest (N = 9), behaviors were selected so that what was congruent with one category was incongruent with the other, and vice versa. The order of the behaviors to be estimated was counterbalanced across participants. We expected participants to judge descriptor-congruent behaviors more frequent in the noun condition but incongruent behaviors more frequent in the adjective condition.

Footnote 3. Although several authors argued that the min F′ test (Clark, 1973) is needlessly complicated and highly conservative (Baayen, Feldman, & Schreuder, 2006; Brysbaert, 2007), we decided to further compute a min F′ test that turned out to be marginally significant, F′(1, 31) = 3.51, p = .07, confirming, at least in part, the generalizability of our findings to other sets of adjective–noun matching pairs.
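For readers who wish to retrace the min F′ value reported in Footnote 3 for Study 1C, the following Python sketch combines the by-participants and by-items F ratios using the standard formula attributed to Clark (1973). The function and argument names are our own. The inputs are the two F ratios and error degrees of freedom reported for Study 1C, F(1, 30) = 7.34 and F(1, 11) = 6.74.

```python
def min_f_prime(f1, df1_error, f2, df2_error):
    """Combine a by-participants F (f1) and a by-items F (f2).

    Returns min F' and its denominator degrees of freedom.
    """
    min_f = (f1 * f2) / (f1 + f2)
    df_denominator = (f1 + f2) ** 2 / (f1 ** 2 / df2_error + f2 ** 2 / df1_error)
    return min_f, df_denominator


# Values reported for Study 1C.
min_f, df = min_f_prime(f1=7.34, df1_error=30, f2=6.74, df2_error=11)
print(round(min_f, 2), round(df))
```

With these inputs the sketch returns 3.51 with 31 denominator degrees of freedom, which agrees with the reported F′(1, 31) = 3.51.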

Results and Discussion Participants’ ratings on the two items were first z transformed, thus combining them into two indices of participants’ ratings, one for descriptor-congruent behaviors and the other for descriptor-incongruent behaviors. As the order of presentation of the targets and the order of presentation of the to-be-estimated behaviors did not produce any significant effects, we discarded these two factors from the current analyses. A 2 (condition: adjective vs. noun) × 2 (domain: athletics vs. drinking) × 2 (behavior: congruent vs. incongruent) ANOVA was carried out on participants’ ratings, with the former factor as a between-participants variable and the latter factors as within-participants variables. In line with our predictions, the Condition × Behavior interaction was significant, F(1, 58) = 14.14, p < .001, ηp² = .20. As Table 1 shows, participants rated descriptor-congruent behaviors more frequent in the noun than in the adjective condition, F(1, 58) = 9.52, p < .003, ηp² = .14. By contrast, they rated descriptor-incongruent behaviors less frequent in the noun than in the adjective condition, F(1, 58) = 4.12, p < .05, ηp² = .07. Conceptually replicating the results of our previous studies, participants expected the target to show more category-congruent and fewer category-incongruent behaviors when the target was labeled by a noun rather than by an adjective. Please note that in contrast to Study 1, this study included a positive label (i.e., athlete) and a negative label (i.e., drunkard). Because we found greater target stereotyping for nouns than for adjectives regardless of valence, these results provided additional evidence of the independence of the effect under consideration from the valence of the characteristic that is being described (see Footnote 4). Together, this study confirms the idea that nouns have a greater likelihood than adjectives to induce stereotype-congruent expectancies and to inhibit incongruent ones. However, one potential problem is that the specific nouns selected for the above study may have differed semantically from the respective adjectives. Especially in the case of athlete, the noun seems to imply a higher (professional) level of involvement in the activity and may, indeed, evoke different exemplars than the respective adjectives. They may also differ in degree of mutual inclusiveness such that all athletes are athletic but not all athletic people are athletes. If so, the above results could be interpreted in light of the theoretical conjectures and the empirical evidence that fall under the banner of “correspondence bias” (Gilbert & Malone, 1995; Jones, 1990; Jones & Harris, 1967; Ross, Amabile, & Steinmetz, 1977). In line with the literature on the correspondence bias, one may conclude that higher levels of target stereotyping in reaction to athlete–artist than to athletic–artistic may not occur by virtue of the differential linguistic properties of these labels but as a function of the higher dispositional attribution that the professional role entails. Although this alternative explanation seems to hold mainly for those stimulus words that refer to professions but less so for intimacy groups (madre–materno [mother–maternal], padre–paterno [father–paternal]) and personality types (ubriaco–ubriacone [drunk–drunkard], genio–geniale [genius–brilliant], individualista–individualistic [individualist–individualistic]), we decided to conduct an additional study in which this potential confound was removed.

Table 1
Participants’ Frequency Estimates as a Function of Behavior and Condition (Study 2A)

              Congruent behavior      Incongruent behavior
Condition     M          SE           M          SE
Adjective     9.37a      0.22         3.83a      0.32
Noun          10.27b     0.22         2.77b      0.32

Note. Means with different subscripts differ significantly within columns (t test, ps < .05).

Footnote 4. Note that the z transformation does not allow differences between the two domains, drunkard and athlete, to emerge. We therefore repeated the ANOVA on the raw scores. Besides replicating the Condition × Behavior interaction, F(1, 58) = 12.76, p < .001, ηp² = .94, we did not find a main effect of domain. Congruent behaviors were judged more frequent than incongruent behaviors, both in the athletics domain (respectively, M = 9.68 and M = 3.93) and in the drinking domain (respectively, M = 9.95 and M = 2.77). More important, the Condition × Behavior Domain interaction was not significant, F(1, 58) = 0.63, ns, ηp² = .01.

Study 2B In Study 2B, we therefore tested the same general hypothesis, but this was done (a) using semantically very similar nouns and adjectives; (b) extending the investigation to a different language community; and (c) relying on labels referring to social groups, rather than professional roles, personality characteristics, or intimacy groups. We chose two pairs (homosexual–a homosexual and Catholic–a Catholic) that did not make any reference to a profession and for which nouns and adjectives implied each other in a symmetrical fashion (a homosexual person is very likely to be a homosexual and vice versa). Moreover, the extension to a different language, in this case German, was important to understand whether the systematic differences between adjectives and nouns observed in our first two studies can be generalized to other Indo-European languages. Note that Italian and German differ in the order in which nouns and adjectives generally appear. In German, just like in English, the adjective generally precedes the noun, whereas the canonical order in Italian is inverse, although both orders are possible in principle (we come back to this issue in Study 3). Theoretically, it is therefore possible that the greater inferential potential of nouns in Italian is, at least in part, a function of the fact that nouns tend to precede adjectives. This would be in line with findings showing that the recall advantage of nouns over adjectives is particularly pronounced when nouns precede adjectives (Kusyszyn & Paivio, 1966; Paivio, 1963). Thus, it was important to test whether nouns would allow stronger inferences than adjectives even in languages (such as German or English) in which they generally have a less favorable (second) position in relation to adjectives.

Method Participants. Seventy-one students enrolled at the University of Jena participated in this experiment. We selected participants that were neither Catholic nor homosexual. Our final sample then included 65 participants (31 women and 33 men). One additional participant was excluded from the current analyses because his ratings on two items were four standard deviations above the mean (see Footnote 5; for a similar procedure, see Walton & Banaji, 2004). Procedure and materials. Similar to the experimental procedures outlined above, participants read a brief description of two male targets. In one condition, varied between participants, the noun referred to sexual preference and the adjective to religion (a homosexual, Catholic), and in the other, the inverse was true (a Catholic, homosexual). Note that, unlike English, German adjectives and nouns differ not only in terms of the presence or absence of the article but also with respect to the uppercase versus lowercase initial letter and the suffix (noun: ein Homosexueller, ein Katholik; adjective: homosexuell, katholisch). The order in which the targets were presented was varied between participants. Following each target description, participants had to estimate the frequency with which the target person engaged in two behaviors, namely “How often does he attend church in a year?” and “How often does he have one-night stands in a year?” In contrast to our previous studies, an open response format was chosen (__ times per year). A pretest session of a small sample of students (N = 8) indicated that “to attend church” was considered as more typical of a Catholic person–Catholic (M = 5.63, SE = 0.50) than of a homosexual person–homosexual (M = 2.13, SE = 0.35), F(1, 7) = 38.11, p < .001, ηp² = .85, whereas “to have one-night stands” was judged to be more typical of a homosexual person–homosexual (M = 5.50, SE = 0.42) than of a Catholic person–Catholic (M = 1.89, SE = 0.40), F(1, 7) = 41.17, p < .001, ηp² = .86.
Therefore behaviors were selected so that what was congruent with one descriptor was incongruent with the other, and vice versa.

Results and Discussion

Because the order of presentation of the targets and the order of presentation of the to-be-estimated behaviors did not produce any significant effects, these two factors were not considered in the analyses. A 2 (condition: adjective vs. noun) × 2 (domain: homosexuality vs. Catholicism) × 2 (behavior: congruent vs. incongruent) ANOVA was conducted on participants’ z-transformed ratings, with the first factor as a between-participants variable and the latter two as within-participants variables. The Condition × Behavior interaction was significant, F(1, 61) = 8.34, p < .005, ηp² = .12. On the basis of the results of Study 1 and Study 2A and in line with our theoretical conjectures, we expected that nouns would trigger more descriptor-congruent inferences but fewer descriptor-incongruent inferences than adjectives. In keeping with these a priori hypotheses, an inspection of the means revealed that participants’ frequency ratings for the descriptor-congruent behaviors were higher in the noun condition (M = 30.74, SE = 4.74) than in the adjective condition (M = 19.02, SE = 4.67), t(61) = 1.75, p < .04, one-tailed, d = .31. By contrast, participants’ ratings for the descriptor-incongruent behaviors were lower in the noun (M = 4.89, SE = 1.67) than in the adjective condition (M = 8.77, SE = 0.15), t(61) = 1.76, p < .04, one-tailed, d = .31. Again, replicating the results of our previous studies, participants reported stronger target stereotyping when the target was described by a noun rather than by an adjective. Moreover, and in line with results of Study 2A, participants expected the same target to display incongruent behaviors with lower frequency when a noun rather than the corresponding adjective was used as the target descriptor.

In summary, these results provide further support for our hypothesis that nouns facilitate congruent but inhibit incongruent inferences more so than adjectives. It is important to note that the findings of this study also suggest that this is true even in languages in which adjectives have a primacy advantage as they generally precede nouns. Also, in contrast to our previous studies, the adjective–noun pairs of this study did not refer to professional roles, to personality characteristics, or to intimacy groups but instead referred to social categories (see Lickel et al., 2000), suggesting that the same processes that were found for assumed categories in our first studies are also at work when ascribed categories are involved (Mae & Carlston, 2005).
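The z-transformation applied to the open-ended frequency estimates before the ANOVA can be illustrated with a minimal sketch. It assumes that the estimates for each behavior are standardized across participants using the sample standard deviation; the text does not specify these details, and the function name and the data below are hypothetical.

```python
from statistics import mean, stdev


def z_transform(estimates):
    """Standardize raw estimates to a mean of 0 and a standard deviation of 1."""
    center = mean(estimates)
    spread = stdev(estimates)  # sample SD (n - 1); an assumption, not stated in the text
    if spread == 0:
        raise ValueError("cannot standardize constant estimates")
    return [(value - center) / spread for value in estimates]


# Hypothetical answers to "How often does he attend church in a year?"
church_visits = [52, 12, 4, 30, 2]
print([round(z, 2) for z in z_transform(church_visits)])
```

Standardizing in this way puts open-ended answers with very different ranges (for example, church visits and one-night stands per year) on a common scale before they enter the same analysis.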
Moreover, and in line with the results of Study 2A, it is interesting to note that facilitation and inhibition were practically identical for the category that is positively valued in society (Catholic) as for the category that is often the subject of discrimination (homosexual).⁶ Thus, valence of the characteristic and/or the corresponding behavior (“going to church” vs. “having one-night stands”) does not seem to moderate the lexicalization effect. Finally, the robustness of the lexicalization effect is also corroborated by the observation that it emerges independently of the way in which participants report their estimates. Indeed, the current study shows that our previous findings, based on rating scales, extend to a situation in which estimates are generated in a free response format. Together, these results suggest that we are dealing with a robust and pervasive phenomenon that holds across languages, noun–adjective pairs, categories, valence, and types of measurement.

⁵ The Condition × Behavior interaction was still significant, F(1, 62) = 5.87, p < .02, ηp² = .09, even when the outlier was not discarded from the analysis.

⁶ A 2 (condition: adjective vs. noun) × 2 (target: homosexual vs. Catholic) × 2 (behavior: congruent vs. incongruent) ANOVA on the raw data confirmed the Condition × Behavior interaction, F(1, 61) = 5.92, p < .02, ηp² = .67, and also showed that this interaction was not moderated by type of target, F(1, 61) = 0.17, ns, ηp² = .07. Corroborating, albeit in a different manner, the results of Study 1C, this pattern shows that higher levels of target stereotyping occur in reaction to nouns rather than to adjectives, regardless of valence.
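The effect sizes that accompany the F tests in this study can be related to the test statistics themselves. The following sketch assumes the standard conversion ηp² = (F × df_effect) / (F × df_effect + df_error); the function name is hypothetical, and the article does not state how its effect sizes were computed.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Convert an F statistic and its degrees of freedom into partial eta squared."""
    if f_value < 0 or df_effect <= 0 or df_error <= 0:
        raise ValueError("F must be non-negative and both df values positive")
    explained = f_value * df_effect
    return explained / (explained + df_error)


# Condition x Behavior interaction with the outlier excluded: F(1, 61) = 8.34
print(round(partial_eta_squared(8.34, 1, 61), 2))
# Same interaction with the outlier retained: F(1, 62) = 5.87
print(round(partial_eta_squared(5.87, 1, 62), 2))
```

Under this assumption, the two calls give values that round to .12 and .09, the effect sizes reported for these two interaction tests.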


Study 3: Inferences When Nouns and Adjectives Co-Occur

The first two studies, conducted in two different languages, mainly focused on perceivers’ inferences about an individual target who was identified by a single label, either a noun or an adjective. However, in natural language, adjectives and nouns frequently co-occur, with adjectives qualifying nouns. For example, we may refer to a person as an athletic woman or as a female athlete. This offers the possibility for a direct comparative test of our hypothesis, as the relative weight of nouns and adjectives in eliciting stereotype-congruent (and inhibiting incongruent) inferences about the target person can be determined within a single minimal sentence. We know from previous research in which both word forms were presented that adjectives are generally less available for recall than nouns (Lockhart, 1969; Lockhart & Martin, 1969). Indeed, if nouns serve as primary “conceptual pegs” to which adjectival modifiers are attached (Kusyszyn & Paivio, 1966; Paivio, 1963), it is not surprising that they play a more important role in information processing and memory. By extension, nouns can also be expected to preserve their unique ability to induce congruent (and inhibit incongruent) inferences even in the copresence of an adjective. Thus, our main aim was to test whether nouns would induce more stereotypic inferences than adjectives even when the two were presented together. The combination of nouns and adjectives within a single description also allowed us to test whether word order would affect the inferences that people draw about the target. Although nouns are superior to adjectives both in facilitating memory (Martin, 1969) and in guiding inferences, as shown by our Studies 1 and 2, regardless of whether they precede or follow adjectives, it is conceivable that this difference is further enhanced when nouns also enjoy a primacy advantage.
This is in line with classical research showing that recall for adjective–noun pairs is better when nouns precede, rather than follow, adjectives (e.g., Kusyszyn & Paivio, 1966; Paivio, 1963). Extending this reasoning, we argued that (a) nouns would induce greater inferences regarding congruent behaviors than adjectives (replicating Studies 1 and 2) and (b) that this advantage of nouns over adjectives would be enhanced when nouns precede adjectives. Note that we expected nouns, but not adjectives, to profit from the primacy position. The reason for this asymmetrical order effect lies in the fact that nouns constitute the primary element to which qualifying adjectives are subsequently attached. Indeed, various authors have argued that speakers “habitually choose adjectives after choosing the modified nouns” (Martin, 1969, p. 472; for similar arguments regarding the listener, see Ehrlich, 1977). Thus, nouns may induce stronger stereotype-congruent inferences when they are encountered first, whereas the position of adjectives should be largely irrelevant because, regardless of order, the qualifier will become informative only after the to-be-qualified noun has been identified.

Method

Participants. Forty students (23 women and 17 men) enrolled at the University of Padova were recruited in the library and completed the questionnaire individually.

Procedure and materials. Participants read minimal descriptions of eight distinct individual targets, each defined by a common Italian male name, age, and a short description that, in contrast to the above studies, contained two descriptors, one referring to the athletic and the other to the artistic inclination of the same person. Target persons were described by two nouns (e.g., athlete and artist), by two adjectives (e.g., athletic and artistic), by an adjective followed by a noun (e.g., an athletic artist), or by a noun followed by an adjective. Note that although nouns generally precede adjectives, the word order can be inverted in Italian. Also, the material was organized so that the artistic inclination either preceded or followed the athletic one (e.g., an athletic artist or an artistic athlete). Participants rated each of the eight target individuals with respect to the frequency of two behaviors (i.e., “How many kilometers does he run per week?” and “How many exhibitions does he visit per week?”) using an 11-point scale ranging from 1 (few) to 11 (many). Note that in a pretest, a small sample of students (N = 7) indicated that “to run” was considered more typical of an athletic person–athlete (M = 6.43, SE = 0.57) than of an artistic person–artist (M = 2.86, SE = 0.60), F(1, 6) = 38.11, p < .005, ηp² = .75, whereas “to visit exhibitions” was considered more typical of an artistic person–artist (M = 6.00, SE = 0.44) than of an athletic person–athlete (M = 2.86, SE = 0.60), F(1, 6) = 37.08, p < .001, ηp² = .86. Therefore, behaviors were selected so that what was congruent with one descriptor was incongruent with the other, and vice versa. In addition, we varied the order of the stimulus sentences and of the dependent variables between participants.
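The structure of the stimulus set described above can be sketched in a few lines. The sketch assumes that the eight targets result from fully crossing the four descriptor-pair types with the two content orders (athletic first vs. artistic first); the labels and the function name are illustrative and do not reproduce the original Italian materials.

```python
from itertools import product

# Four ways of combining word classes within one description.
PAIR_TYPES = ["noun-noun", "adjective-adjective", "adjective-noun", "noun-adjective"]

# Two content orders: which inclination is mentioned first.
CONTENT_ORDERS = [("athletic", "artistic"), ("artistic", "athletic")]


def build_targets():
    """Cross pair type with content order to obtain one entry per target."""
    targets = []
    for pair_type, (first, second) in product(PAIR_TYPES, CONTENT_ORDERS):
        targets.append({"pair_type": pair_type, "first": first, "second": second})
    return targets


for target in build_targets():
    print(target)
```

Crossing the two factors in this way gives 4 × 2 = 8 descriptions, which matches the number of targets each participant rated.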

Results and Discussion

Data were first z transformed and then combined into two indices of participants’ ratings, one for the behaviors that were congruent with the first descriptor (and incongruent with the second descriptor) and one for the behaviors that were congruent with the second descriptor (and incongruent with the first descriptor). Because the order of presentation of the descriptions as well as the order of presentation of the to-be-estimated behaviors did not produce any significant effects, we discarded these factors from the analyses. A 4 (stimulus pairs: noun–adjective, adjective–noun, noun–noun, adjective–adjective) × 2 (behavioral congruency: congruent with first vs. congruent with second) ANOVA was carried out on participants’ ratings, with all factors as within-participants variables. To facilitate interpretation of the findings, we first report effects pertaining to our main hypothesis, namely that nouns elicit more stereotypical inferences than adjectives, and discuss order effects only subsequently.

Looking first at the upper portion of Table 2, we can see that our main results of Studies 1 and 2 were perfectly replicated. In fact, the noun always elicited stronger congruent inferences than the adjective of each pair, regardless of whether it preceded or followed the adjective. In our design, this was reflected in a reliable interaction between stimulus pairs and behavior congruency, F(1, 39) = 48.07, p < .001, ηp² = .55, when analyzing the upper portion of Table 2 separately. The lower portion of Table 2 shows that on the average, noun–noun pairs (M = 4.99) elicited greater behavioral inferences than adjective–adjective pairs (M = 3.88), F(1, 39) = 26.33, p < .001, ηp² = .40. This replicates our findings obtained on single nouns and adjectives, showing that the inductive potential of nouns greatly exceeds that of adjectives. Together, this suggests that the greater inductive potential of nouns compared with adjectives was confirmed, regardless of whether mixed pairs (upper portion of Table 2) or uniform pairs (lower portion of Table 2) were presented.

Table 2
Participants’ Frequency Estimates as a Function of Behavior Congruency and Stimulus Pair (Study 3)

                          Behavior congruency
                          Congruent with first descriptor    Congruent with second descriptor
Stimulus pair             M        SE                        M        SE
Noun–adjective            5.51     0.19                      3.59     0.22
Adjective–noun            3.80     0.22                      4.89     0.24
Noun–noun                 5.31     0.21                      4.68     0.25
Adjective–adjective       3.91     0.22                      3.85     0.20

In addition, a number of order effects emerged that are in line with our predictions. First, a main effect of behavioral congruency, F(1, 39) = 4.53, p < .04, ηp² = .10, indicated that behaviors congruent with the first descriptor (M = 4.63, SE = 0.16) were rated more frequent than those congruent with the second descriptor (M = 4.25, SE = 0.16). This reflects an order effect, demonstrating that the first descriptor is primary as an information-organizing device. This effect was reliably modified by an interaction with stimulus pair, F(3, 37) = 15.72, p < .001, ηp² = .56, represented in Table 2. The upper portion of Table 2 shows that the advantage of nouns over adjectives was stronger in the noun–adjective condition (M = 5.51 vs. M = 3.59), F(1, 39) = 66.31, p < .001, ηp² = .63, than in the adjective–noun condition (M = 4.89 vs. M = 3.80), F(1, 39) = 22.64, p < .001, ηp² = .37, indicative of a strong order effect. Moreover, and as shown in the upper portion of Table 2, we had argued that word order would mainly be relevant for nouns but less so for adjectives, considering that the noun (regardless of position) has to be processed before the adjective becomes meaningful. Crossover comparisons confirmed this prediction. Behaviors congruent with the noun were judged much more frequent when the noun appeared in the first (M = 5.51) than in the second position (M = 4.89), F(1, 39) = 40.50, p < .001, ηp² = .51. In contrast, for the adjective, the position was practically irrelevant considering that inferences about congruent behaviors were very similar for the adjective in the first (M = 3.80) and in the second position (M = 3.59), F(1, 39) = 0.76, ns, ηp² = .02.
If the order of presentation was relevant to nouns but not to adjectives, then this should also emerge for noun–noun and adjective–adjective pairs. The lower portion of Table 2 illustrates that for the noun–noun pair, behaviors congruent with the first noun (M = 5.31) were judged more frequent than those congruent with the second noun (M = 4.68), although this difference fell slightly short of significance, F(1, 39) = 3.43, p = .07, ηp² = .08. In contrast, there was no difference between behaviors congruent with the first and with the second adjective, F(1, 39) = 0.18, ns, ηp² = .005, suggesting that the order of presentation was irrelevant for adjectives.

Together, three main findings emerge from this study. First, and most important, this study shows that the greater tendency to infer congruent behaviors from nouns than from adjectives also holds when nouns and adjectives co-occur in the same minimal phrase. Second, the information encountered first appears to have a greater weight as it induces a higher expectancy that the target person will engage in behaviors congruent with the descriptor. For example, a person described as an athlete and an artist is believed to engage in more fitness and fewer art-related behaviors than a person described as an artist and an athlete. Third, this primacy effect in which the first information affects impression formation more than information encountered later only holds for nouns but not for adjectives. This is in line with the idea that in noun–adjective pairs, adjectives will acquire their meaning only after nouns have been processed, so it is less important whether they precede or follow the noun.
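The summary means quoted in the Results can be recovered from the cell means of Table 2. The following sketch is only an arithmetic check under the assumption that the reported marginal means are unweighted averages of the cell means, which is reasonable for a fully within-participants design; the dictionary layout and function names are hypothetical.

```python
# Cell means from Table 2: (congruent with first descriptor, congruent with second).
TABLE_2 = {
    "noun-adjective": (5.51, 3.59),
    "adjective-noun": (3.80, 4.89),
    "noun-noun": (5.31, 4.68),
    "adjective-adjective": (3.91, 3.85),
}


def pair_mean(pair):
    """Average rating for one stimulus pair across both behaviors."""
    first, second = TABLE_2[pair]
    return (first + second) / 2


def position_mean(position):
    """Average rating across pairs for behaviors congruent with one position (0 or 1)."""
    values = [cells[position] for cells in TABLE_2.values()]
    return sum(values) / len(values)


def noun_advantage(pair):
    """Noun-congruent minus adjective-congruent mean within a mixed pair."""
    first, second = TABLE_2[pair]
    if pair == "noun-adjective":
        return first - second
    if pair == "adjective-noun":
        return second - first
    raise ValueError("noun advantage is only defined for mixed pairs")


print(pair_mean("noun-noun"), pair_mean("adjective-adjective"))
print(position_mean(0), position_mean(1))
print(noun_advantage("noun-adjective"), noun_advantage("adjective-noun"))
```

Up to rounding, the first two lines of output correspond to the reported means for uniform pairs (4.99 and 3.88) and for the first versus second descriptor (4.63 and 4.25). The last line shows that the noun advantage is larger when the noun comes first, which is the order effect discussed above.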

Study 4: Inhibition of Alternative Classifications

Whereas Studies 1–3 focused on stereotypical inferences, the aim of Study 4 was to corroborate a different hypothesis, namely that nouns, much more than adjectives, would prevent perceivers from alternative classifications of a given social target. In line with Allport’s (1954) suggestions, we hypothesized that nouns are more likely than adjectives to reify the individual target into the contents that they describe, thus impeding alternative classifications. In order to test this hypothesis, we conducted a study in which participants were presented with a target description either in noun form (e.g., athlete) or in adjective form (e.g., athletic). Then participants had to estimate the likelihood that the target may also possess another, unrelated characteristic. Please note that this additional characteristic was, again, provided either in noun form (e.g., artist) or in adjective form (e.g., artistic). We hypothesized that nouns, compared with adjectives, would block alternative classifications of the target, hence making it less likely that the target would also possess the additional, unrelated characteristic. Furthermore, we suspected that this would be mainly true in the case in which the alternative classification was provided by a noun rather than an adjective. The reason for this prediction is straightforward. Assigning a target initially to one class (e.g., dog) makes it difficult to also assign it to another class, but because every target can have a variety of qualities (tall, dark, fast, etc.), nothing argues against assigning a number of additional qualities to the same object. Indeed, in most Indo-European languages, multiple adjectives can legitimately be associated with a single noun (e.g., a large old house), but not vice versa (e.g., a large house and tree).

Method

Participants. Forty students (23 women and 17 men) enrolled at the University of Padova were recruited in the library and completed the questionnaire individually.

Procedure and materials. Participants first read the same cover story as in Study 1. Participants were provided with minimal descriptions of eight distinct individual targets, each defined by a common Italian male name, age, and a descriptor. For half of the targets, the descriptor referred to their athletic inclination, and for the other half, the descriptor referred to their artistic inclination. Following the instructions outlined by Macrae et al. (1995), we decided to use two categories that have no semantic overlap (unlike woman and housewife) and for which it is unlikely that one constitutes a subtype of the other (such as woman and manager). More important, regardless of the semantic content of the descriptor, each target person was labeled either by a noun (e.g., athlete and artist) or by an adjective (e.g., athletic and artistic). After reading a target description, participants had to estimate the probability that the target (e.g., described as athletic) may also have another characteristic (e.g., artistic). In particular, participants were prompted by the following question: “Could you indicate the likelihood that [the name of the target] could also be [descriptor].” Participants provided their estimates by means of a percentage format, ranging from 0% to 100%. We counterbalanced the order of presentation of the target description.

Results and Discussion

Data were first combined to create four indices of participants’ estimations. Specifically, two indices concerned the condition in which the initial target descriptor was provided in noun form and the alternative descriptor either in noun form or in adjective form, and two additional indices were related to the condition in which an adjective was used as the initial target descriptor and the alternative descriptor was provided either in noun form or in adjective form. A 2 (initial target descriptor: noun vs. adjective) × 2 (additional descriptor: noun vs. adjective) ANOVA, with likelihood estimates as the dependent variable, revealed a main effect for initial target descriptor, F(1, 39) = 4.07, p < .05, ηp² = .094, showing that in line with our hypothesis, alternative classifications were more likely when the target had initially been described by an adjective (M = 41.25, SE = 2.67) than by a noun (M = 36.94, SE = 2.72). Also, a main effect for the additional descriptor revealed that participants were more likely to consider the target as also possessing an additional characteristic when that characteristic was expressed in adjective (M = 40.63, SE = 2.81) rather than in noun form (M = 37.56, SE = 2.33), F(1, 39) = 4.23, p < .05, ηp² = .098. Most important, the predicted interaction emerged, although it fell slightly short of significance, F(1, 39) = 3.7, p < .06, ηp² = .087. When the target had initially been described by a noun, the likelihood estimate was lower when the alternative classification was presented in noun form (M = 33.87, SE = 2.86) than in adjective form (M = 40.0, SE = 3.06), F(1, 39) = 6.43, p < .015, ηp² = .15. Thus, participants were reluctant to assign an unrelated characteristic to a target that had initially been described by a noun if the new characteristic took the form of another noun but not when the same characteristic was proposed as an adjective. Please note that this reluctance was not found when the target had initially been described by an adjective. If an adjective had been used as the target descriptor, then an alternative classification of the target was considered equally likely, regardless of whether the alternative descriptor took the form of a noun (M = 41.25, SE = 3.04) or an adjective (M = 41.25, SE = 2.64), F(1, 39) = 0.00, ns, ηp² = .007.

In summary, when an individual target has been described by a noun rather than an adjective, an alternative classification of the same target is unlikely to occur. In line with our predictions, nouns, more than adjectives, appear to pervade the identity of the target, hence impeding a new classification. However, our results also suggest that this depends greatly on the way in which the alternative classification is expressed. When the initial target descriptor takes the form of a noun, perceivers are unlikely to include the target in an alternative classification when this potential alternative is also a noun, although they are willing to assign the target a new adjective. In contrast, when an adjective is used to describe the target initially, perceivers are equally likely to assign a new characteristic to the target, regardless of whether the new information is conveyed in adjective or noun form. These results may be summarized very simplistically as follows: Assuming that nouns assign targets to classes, whereas adjectives describe qualities, participants behave as if each target can belong to only one class (be it the initial descriptor or the alternative one) but can have any number of additional qualities.

Study 5: Implicit Essentialism

To this point, our research project has demonstrated two interrelated phenomena: First, nouns provide much more powerful descriptions of people than adjectives as they induce greater expectations that the person will engage in descriptor-congruent behaviors and, at the same time, will not engage in incongruent behaviors. In other words, despite their great semantic and phonetic similarity, nouns (Carlo is a homosexual) produce more stereotypical views of the person than do adjectives (Carlo is homosexual). Second, nouns block alternative classifications. However, stereotypical inferences and the inhibition of alternative classifications may not be the only way in which nouns differ from adjectives. Rothbart and Taylor (1992) have argued that social categories are cognitively represented by two distinct but related dimensions, namely stereotypicality and essentialism. Whereas Studies 1–3 dealt with the former dimension, the following three studies (Studies 5A, 5B, and 5C) focus primarily on the essentialistic judgments that a noun, compared with the corresponding adjective, may elicit. Therefore, these studies test whether the linguistic form (adjective vs. noun) in which an individual target is described can affect the perceived strength of an individual’s preference for descriptor-congruent behaviors (for a similar paradigm, see Gelman & Heyman, 1999; Walton & Banaji, 2004).

Study 5A

Method

Participants. Fourteen female individuals attending the University of Padova participated in the experiment.⁷ Participants were recruited in a seminar class and completed the questionnaire in a collective session.

Procedure. Participants read five sentences, each portraying a different individual target. Each sentence contained the individual’s name (e.g., Paul), the individual’s description (e.g., is an artist–is artistic) and the individual’s behavioral preference (e.g., he likes to work with plaster).⁸ For each sentence, the descriptive term could be either an adjective or its corresponding noun (atleta–atletico [athlete–athletic], ubriacone–ubriaco [drunkard–drunk], genio–geniale [genius–brilliant], poeta–poetico [poet–poetic], and artista–artistico [artist–artistic]). Note that if the descriptive term was an adjective [noun], its corresponding noun [adjective] was not included in the same questionnaire. Therefore, we created four different versions of the questionnaire in which three individuals were respectively described by adjectives and two by nouns and four versions of the questionnaire in which two individuals were described by adjectives and three by nouns. The participants’ perceptions of the target’s behavioral preferences were assessed on 9-point scales (see Walton & Banaji, 2004) assessing strength (“How strong is X’s preference for this activity?”), stability (“How likely is it that X’s preference for this activity will remain the same in the next two years?”), and resilience (“How likely is it that X’s preference for this activity would remain the same if he was surrounded by friends who did not enjoy the activity in question?”). The first scale was anchored to very weak (1) and very strong (7), and the second and third scales were anchored to very likely to change (1) and very likely to remain the same (7). Note that the individuals’ behavioral preferences were always congruent with respect to the descriptive terms. Therefore, we expected participants to attribute greater essentialism (greater strength, stability, and resilience) to the individual’s preference when the individual was described by a noun rather than by an adjective.

⁷ Only female students participated in Studies 5A and 5C for the simple reason that the seminar, in which participants were asked to volunteer for the study, was exclusively attended by female students on the day of the data collection.

Results and Discussion

Data were first combined within each type of descriptor condition (adjective vs. noun) and within each participant’s ratings on the three scales. Therefore, we created six indices, namely one for perceived strength, one for stability, and one for resilience of the behavioral preference separately for the adjective and the noun stimuli. Because the three measures had reasonable internal consistency in both the adjective (Cronbach’s α = .74) and the noun condition (Cronbach’s α = .63), we averaged the three measures into two overall indices of perceived essentialism, one pertaining to the adjectival stimuli and the other to the noun stimuli. As the version of the questionnaire did not produce any significant effect, we analyzed participants’ ratings on the overall indices of perceived essentialism by means of a paired t test. In line with our hypothesis, participants judged the targets’ behavioral preferences in a more essentialist manner when the target person was described by nouns (M = 5.61, SE = 0.17) rather than by adjectives (M = 5.21, SE = 0.12), t(13) = 2.48, p = .03, d = .73. Thus, greater essentialism (defined as perceived strength, stability, and resilience) was attributed to category-congruent behaviors when the person had been described by a noun rather than by an adjective. Therefore, the same congruent behavior was perceived as a more essential and enduring feature or habit of the person when the target was described by a noun rather than by the corresponding adjective.⁹
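The internal-consistency check that justified averaging the strength, stability, and resilience measures can be sketched as follows. The sketch assumes the standard formula for Cronbach's α, α = k / (k − 1) × (1 − Σ item variances / variance of the sum score); the function name and the ratings are hypothetical and are not the study's data.

```python
from statistics import variance


def cronbach_alpha(items):
    """Cronbach's alpha for k items, each given as a list with one score per participant."""
    k = len(items)
    if k < 2:
        raise ValueError("at least two items are required")
    totals = [sum(scores) for scores in zip(*items)]  # sum score per participant
    item_variance = sum(variance(scores) for scores in items)
    return k / (k - 1) * (1 - item_variance / variance(totals))


# Hypothetical ratings from five participants on the three measures.
strength = [6, 5, 7, 4, 6]
stability = [5, 5, 6, 4, 5]
resilience = [5, 4, 6, 3, 5]
print(round(cronbach_alpha([strength, stability, resilience]), 2))
```

Values closer to 1 indicate that the three measures vary together across participants, which is the precondition for collapsing them into a single essentialism index.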

Study 5B

The present study intended to expand the results of Study 5A beyond the specific noun–adjective pairs that had been included in that experiment. In order to ensure that the findings are not attributable to any unintended bias in the selection of the material and, hence, to the idiosyncratic characteristics of the selected stimulus material, the same randomly selected pairs of adjectives and nouns of Study 1B were included in the current study. As in Study 5A, participants were asked to evaluate the target’s preference for a descriptor-congruent behavior. It is worth noticing that the targets’ behavioral preferences used in the current study were exactly the same as the descriptor-congruent behaviors of Study 1B.

Method

Participants. Twenty-one students (14 women and 7 men) enrolled at the University of Padova participated in the experiment. Participants were recruited in the library and completed the questionnaire individually. One participant was excluded from the present sample because she was not a native speaker of Italian.

Procedure and material. Participants read 24 sentences, each portraying an individual target. Each sentence contained the individual’s name (e.g., Marco), the individual’s description (e.g., is traditional vs. is a traditionalist), and the individual’s behavioral preference (e.g., he likes to send postcards for Christmas). In contrast to the previous studies, the name of the target was kept constant across the descriptions. For each sentence the descriptive term could be either an adjective or its corresponding noun. The participant’s perception of the target’s behavioral preferences was assessed by means of the same three-item scale that we used in Study 5A. Finally, participants were asked to evaluate the valence of each target’s behavioral preference on a 7-point scale ranging from 1 (very negative) to 9 (very positive). Replicating the results of Study 5A, we expected participants to attribute greater essentialism to the individual’s preference when the individual was described by a noun rather than by an adjective.

⁸ A small sample of students, coming from the experimental population, judged the typicality of a series of behaviors with respect to the labels we used in the current study and in Study 6B. Results indicated that (a) “to run during the week” was considered as more typical for an athlete–athletic person (M = 6.67, SE = 0.17) than “to sit on a sofa while watching TV” (M = 2.55, SE = 0.41), t(8) = 8.48, p < .001; (b) “to find new solutions” was judged to be more typical (M = 6.67, SE = 0.17) for a genius–brilliant person than “to watch reality shows” (M = 3.44, SE = 0.73), t(8) = 4.24, p < .003; (c) “to work plaster” was considered as more typical for an artist–artistic person (M = 6.22, SE = 0.55) than “to work in a bank” (M = 2.55, SE = 0.44), t(8) = 4.49, p < .002; (d) “to drink wine” was rated as more typical (M = 6.89, SE = 0.50) for a drunkard–drunk person than “to enjoy solving statistical problem” (M = 2.44, SE = 0.11), t(9) = 8.00, p < .001; (e) “to write poems” was rated as more typical (M = 7.00, SE = 0.00) for a poet–poetic person than “to solve a math quiz” (M = 3.38, SE = 0.50), t(9) = 7.28, p < .001. The typical behavioral preferences were used in Study 5A, and the atypical preferences were used in Study 5B.

⁹ Looking at the data separately for each type of scale, the pattern of means (i.e., Mnoun = 6.29, SE = 0.11, and Madj = 5.80, SE = 0.24, for the strength scale; Mnoun = 5.50, SE = 0.21, and Madj = 5.41, SE = 0.21, for the stability scale; Mnoun = 5.05, SE = 0.14, and Madj = 4.40, SE = 0.16, for the resilience scale) generally mirrored the pattern of results displayed by the overall index of essentialism.


852 Results and Discussion

Because the scales had reasonable internal consistency (adjectives: α = .76; nouns: α = .77), participants’ ratings were averaged into two overall indices of perceived essentialism, one for adjectives and the other for nouns. In line with our hypothesis, participants judged the target’s behavioral preferences in a more essentialist way when the target was labeled by nouns (M = 5.65, SE = 0.20) rather than by adjectives (M = 5.35, SD = 0.18), F(1, 19) = 4.95, p = .04, ηp² = .21. Moreover, because we intended to show that our findings could be generalized not only across participants but also across the adjective–noun pairs, a repeated measures ANOVA was performed on the perceived essentialism indices using the linguistic pairs as random variables. Again, participants’ perception of essentialism was enhanced when the target was defined by means of a noun rather than by means of an adjective, F(1, 11) = 5.70, p = .04, ηp² = .34.

We also wanted to investigate whether noun–adjective differences in implied essentialism vary as a function of the valence of the behavioral inferences. Therefore, participants’ evaluative ratings were averaged to create an index of social desirability of the behavioral preferences (M = 5.05, SE = 0.11). We then calculated the difference in perceived essentialism by subtracting the essentialism index for adjectives from that for nouns and computed the correlation between this index of differential essentialism and social desirability, r(12) = −.18, p = .58. This result indicated that the evaluative reactions elicited by the behavioral preferences did not moderate the advantage of nouns over adjectives in eliciting stronger attribution of essentialism. Finally, although the adjectives and their corresponding nouns did not differ in terms of social desirability or in the degree of polarization (see Study 1B), we recomputed our main analysis involving essentialism, this time controlling for relative valence and relative evaluative polarization. Recall that we had found that nouns elicit greater essentialism than adjectives, F(1, 11) = 5.70, p = .04, ηp² = .34. This result did not change when either relative valence, F(1, 10) = 6.70, p = .03, ηp² = .40, or relative evaluative polarization, F(1, 10) = 5.77, p = .04, ηp² = .37, was entered as a covariate. Together, these results clearly rule out the possibility that differential implicit valence of nouns versus adjectives could account for our findings. In other words, the fact that nouns elicit greater perceived essentialism than adjectives is largely independent of differences in evaluative content.

Thus, greater essentialism (defined as perceived strength, stability, and resilience) was attributed to category-congruent behaviors when the person had been described by a noun rather than by an adjective. The fact that these findings could be extended even to a randomly selected sample of adjective–noun pairs and were largely independent of the valence of the behavioral preferences bolstered the external validity of our data.
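As an illustrative check that is not part of the original analyses, the partial eta-squared values reported above can be recovered from the F ratios and their degrees of freedom through the standard identity ηp² = (F × df_effect) / (F × df_effect + df_error). The Python sketch below assumes only that identity; the function name and the row labels are illustrative and do not come from the article.

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom."""
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# (label, F, df_effect, df_error) as reported in the text above
reported_tests = [
    ("participants as random factor", 4.95, 1, 19),
    ("word pairs as random factor", 5.70, 1, 11),
    ("covariate: relative valence", 6.70, 1, 10),
    ("covariate: relative polarization", 5.77, 1, 10),
]

for label, f_value, df_effect, df_error in reported_tests:
    eta = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: eta_p^2 = {eta:.2f}")
```

Under this identity the four tests yield .21, .34, .40, and .37, which agrees with the values given in the text.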

Study 5C

Study 5C extended the previous studies by investigating whether the linguistic form (i.e., adjective vs. noun) in which an individual target was described may also affect the perceived essentialism of descriptor-incongruent behaviors (for a similar paradigm, see Gelman & Heyman, 1999; Walton & Banaji, 2004). In this case, we expected participants to perceive the target’s behavioral preferences as less essentialist when the target was described by nouns rather than by adjectives.

Method

Participants. Twenty-one female individuals at the University of Padova participated in the experiment. Participants were recruited in a seminar class and completed the questionnaire in a collective session.

Procedure. The procedure was the same as in Study 5A, except for the target’s behavioral preferences, which were always incongruent with respect to descriptive terms. The dependent variables were the same as in Study 5A.

Results and Discussion

Data were first averaged within each condition (adjective vs. noun) to create overall indices of perceived essentialism, one for adjectives and the other for nouns (adjectives: α = .55; nouns: α = .76). Again, as the version of the questionnaire did not produce any significant effect, we analyzed participants’ ratings on the overall indices of perceived essentialism by means of a paired t test. In line with our hypothesis, participants perceived the targets’ (incongruent) behavioral preferences as less essentialist when the targets were described by nouns (M = 4.08, SE = 0.20) rather than by adjectives (M = 4.39, SE = 0.15), t(20) = 2.07, p = .052, d = .38.¹⁰

Because the data of Studies 5A and 5C had been collected within the same experimental population on the same day and because these experiments used very similar procedures (although, strictly speaking, participants were not assigned in a random fashion to experiments), we decided to analyze these data jointly; that is, participants’ ratings on the essentialism scale were analyzed as a function of the type of behavioral preference, namely congruent versus incongruent behaviors, and type of descriptor, namely adjectives versus nouns. A 2 (condition: adjectives vs. nouns) × 2 (behavioral preference: congruent vs. incongruent) ANOVA was conducted on participants’ ratings on the essentialism scale, with the former variable as a within-participants factor and the latter as a between-participants variable. Confirming the adequacy of the experimental material, a main effect of the behavioral preference was found, F(1, 33) = 29.79, p < .001, ηp² = .47, indicating that, unsurprisingly, participants’ ratings were higher for congruent (M = 5.41, SD = 0.17) than for incongruent behavioral preferences (M = 4.23, SD = 0.14). More important for our purpose, a significant Condition × Behavioral Preference interaction was also found, F(1, 33) = 9.97, p < .003, ηp² = .23, indicating that the advantage of congruent over incongruent behavioral preferences was much larger in the noun than in the adjective condition (nouns: 5.61 for congruent and 4.08 for incongruent, difference = 1.53; adjectives: 5.21 for congruent and 4.39 for incongruent, difference = 0.82).

Together, Studies 5A, 5B, and 5C suggest that the linguistic form of the target description reliably affects perceived essentialism. In line with our reasoning, behavioral habits that are congruent with the descriptor are perceived as an essential part of the person, that is, as a strong and enduring preference, when the target is described by a noun rather than by an adjective. In contrast, incongruent preferences are perceived in a less essentialist way when the target person is described by a noun. Thus, nouns drive perceived essentialism in both directions, making congruent preferences appear as stable, defining features of the person but making incongruent preferences appear weak and variable.

¹⁰ Looking at the data separately for each type of scale, the pattern of means (i.e., Mnoun = 4.42, SE = 0.24, and Madj = 4.79, SE = 0.16, for the strength scale; Mnoun = 4.16, SE = 0.25, and Madj = 4.50, SE = 0.20, for the stability scale; Mnoun = 3.65, SE = 0.22, and Madj = 3.86, SE = 0.24, for the resilience scale) generally mirrored the pattern of results displayed by the overall index of essentialism.
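The effect size and the simple differences reported for Study 5C can be checked with a few lines of arithmetic. The article does not state how d was computed, so the Python sketch below rests on an assumption: d is taken as the mean difference divided by the root mean square of the two condition standard deviations, with each standard deviation recovered from its standard error as SE × √n (n = 21). The function names are illustrative only.

```python
import math


def sd_from_se(se: float, n: int) -> float:
    """Recover a standard deviation from a standard error of the mean."""
    return se * math.sqrt(n)


def cohens_d_average_sd(m1: float, m2: float, sd1: float, sd2: float) -> float:
    """Mean difference scaled by the root mean square of the two SDs (assumed convention)."""
    return (m1 - m2) / math.sqrt((sd1 ** 2 + sd2 ** 2) / 2)


n = 21  # participants in Study 5C
d = cohens_d_average_sd(
    4.39, 4.08,             # adjective and noun means
    sd_from_se(0.15, n),    # adjective SD from its SE
    sd_from_se(0.20, n),    # noun SD from its SE
)
print(round(d, 2))  # 0.38

# Congruent minus incongruent ratings in the joint analysis of Studies 5A and 5C
noun_gap = 5.61 - 4.08
adjective_gap = 5.21 - 4.39
print(round(noun_gap, 2), round(adjective_gap, 2))  # 1.53 0.82
```

Under this assumed convention the computation returns the reported d = .38, and the two gaps match the differences of 1.53 and 0.82 given for the interaction. Other conventions (for example, d = t / √n) would give a different value, so the match should be read as suggestive rather than as confirmation of the authors' method.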

Study 6: Essentialist Beliefs and Linguistic Choice

In our final studies (6A and 6B), we intended to invert the logic of the above experiments by asking when speakers will choose nouns or adjectives in discourse. Whereas Studies 5A, 5B, and 5C showed that nouns induce a perception of essentialism, in Study 6 we asked whether people who have more essentialist beliefs about the characteristics of another person would also tend to rely more strongly on nouns when describing the target person. If nouns provide a very general, all-embracing definition of the person, then they should represent an ideal linguistic device for communicating that a given characteristic represents a deeply grounded and essential aspect of a target person. In contrast, adjectives, describing one of many possible qualities of a person, seem better suited for those characteristics that are more superficial and variable. For example, nouns may provide excellent descriptions of those characteristics that are believed to have a genetic basis, whereas adjectives may be better suited for describing acquired characteristics that are likely to undergo change. To put it differently, with reference to Dweck’s (1999) model, nouns should be best suited to describe characteristics perceived as stable entities, whereas adjectives should fit an “incremental” mindset.

In order to test this hypothesis, participants in this study were asked to choose between two possible descriptions of a target person, one formulated in noun form (athlete) and the other in adjectival form (athletic). It is important to note that at the beginning of the study, participants had been induced to adopt either a genetic or an environmental perspective. We predicted that an environmental-causation mindset would lead people to prefer adjectives, whereas thinking about a target person in terms of a genetic predisposition would increase the use of nouns.

Study 6A

Method

Participants. Twenty-eight students (21 women and 7 men) enrolled at the University of Padova were recruited in the library and completed the questionnaire individually.

Procedure. Participants were told that the study focused on students’ ability to comprehend scientific articles. Participants were then presented with a fictitious article that had ostensibly been taken from the “Genetic Journal of Sport.” Participants were randomly assigned either to the genetic or to the nongenetic condition.

Participants in the genetic condition were told the following: Considerable empirical evidence has proven that adults who obtain excellent results in sport competitions have a different genetic makeup than those who do not excel in this domain. Years ago, it was hypothesized that a protein, called proctodineasis, is present in the genome of those who excel in sports. Recently, a team of researchers has tested and confirmed this conjecture.

In contrast, participants in the nongenetic condition were told the following: Considerable empirical evidence has proven that adults who obtain excellent results in sport competitions do not differ in their genetic makeup from those who do not excel in this domain. Years ago, it was hypothesized that a protein, called proctodineasis, was present in the genome of those who excel in sports. Recently, a team of researchers has tested and dismissed this conjecture.

Regardless of experimental conditions, participants then read the following: Specifically, experimenters randomly sampled two distinct groups from the adult population, one that did and one that did not possess the proctodineasis in their genetic makeup. Then, both groups received the same sport training: They had to run 25 minutes two times per week. Finally, both groups took part in a 100 m track competition.

In the genetic condition, the text continued as follows: Results indicated that individuals who have the proctodineasis displayed better sport-related skills than those who did not possess the proctodineasis in their genetic makeup. Moreover, it has been shown that, compared to individuals that did not have the proctodineasis in their genetic makeup, individuals who have the proctodineasis were able to develop higher levels of sport skills, stronger resistance to stress, and higher levels of motivation to compete even if they invested little time in their sport training. The researchers concluded that being good in sports is for the most part genetically determined. Hence, what makes people have success in sports is mainly their predisposition. Not everybody is able to achieve a good physical condition, but only those who have a specific genetic makeup.

In contrast, participants in the nongenetic condition were told the following: Results indicated that individuals who had the proctodineasis displayed sport-related skills very similar to those of individuals who did not possess the proctodineasis in their genetic makeup. Moreover, it has been shown that regardless of whether one has, or does not have, the proctodineasis in one’s genetic makeup, individuals will not develop high levels of sports skills, resistance to stress, and high levels of motivation to compete if they invest only little time in their sport training. The researchers concluded that being good at sports is for the most part not genetically determined. Hence, what makes people have success in sports is not their predisposition. In general, everybody is able to achieve a good physical condition, regardless of his/her genetic makeup.

Note that nowhere in the text of either condition was the noun athlete or the adjective athletic mentioned. In keeping with the cover story, participants were provided with a series of questions allegedly aimed at testing their comprehension of the article they had read (e.g., “Do people who have the proctodineasis develop better sport skills even at low levels of training?”), and they provided their answers by means of a true–false format. Participants were then invited to take part in a further study. Specifically, they were told that we were interested in the way people form an impression of an individual target. Regardless of experimental condition (genetic vs. nongenetic), participants read the following description of the target: Marco is twenty-two years old. He goes running in the park twice a week. He participates in 100 m races. When he had a physical check up, it turned out that Marco has the proctodineasis in his genetic makeup.

When they finished reading the description of the target, participants were provided with two labels, namely athletic or athlete (we counterbalanced the order of the labels). Participants had to choose the label that better described the individual target. Subsequently, they reported their gender and their age and were fully debriefed.

Results and Discussion

A chi-square analysis revealed a significant impact of the condition on participants’ choices (adjective vs. noun), χ²(1, N = 28) = 4.18, p < .04, showing that the likelihood that participants would use a noun rather than an adjective was higher in the genetic (38%) than in the nongenetic condition (7%; see Table 3). In support of our hypothesis, people greatly enhanced their preference for noun labels when a characteristic had been framed in genetic terms, compared with the situation in which the same characteristic was perceived as a transient property. This supports our prediction that people prefer adjectives to describe abilities that are developed through training (incremental theory, in Dweck’s, 1999, terms), whereas nouns are reserved for describing genetically determined abilities (entity theory, in Dweck’s, 1999, terms).

A potential limitation of this study is that, again, a noun was used that may imply a profession and that people may assume that others choose professions that correspond to their naturally given inclinations. Although no reference was made to the target’s profession in the description, it was important to test the same hypothesis on a noun–adjective pair in which the noun does not imply an occupational role. We therefore replicated the above experiment using genio [genius] and geniale [brilliant] as stimuli.

Table 3
Participants’ Choices as a Function of the Experimental Condition in Studies 6A and 6B

                                                    Choice
                                             Noun         Adjective
Study and condition                          n     %      n     %
6A: atleta–atletico [athlete–athletic]
  Genetic                                    5    38      8    62
  Nongenetic                                 1     7     14    93
6B: genio–geniale [genius–brilliant]
  Genetic                                    8    22     28    78
  Nongenetic                                 2     6     32    94

Study 6B

Method

Participants. Seventy students (55 women and 15 men) enrolled at the University of Trieste participated in the experiment. Participants were recruited in class and completed the questionnaire individually but in a collective session.

Procedure. The study was run in class and introduced as research assessing students’ ability to comprehend scientific articles. Participants were informed that they would not be evaluated and that their responses were anonymous; hence, they should not report any personal information on their response sheet. Participants were randomly assigned to the genetic versus nongenetic condition, which took place in two different classrooms. All the experimental material and the dependent variables were presented on a screen by means of a PowerPoint presentation. Participants provided their answers on a questionnaire. Participants were presented with an ostensible scientific article published in the “Genetic Journal of Human Intelligence.”

Participants assigned to the genetic condition were told the following: Considerable empirical evidence has proven that human intelligence has a genetic basis. It has always been thought that a protein, called proctodineasi, distinguishes the genome of very intelligent people and that this protein is responsible for their superior brain development. Only recently, due to the progress in genetic mapping, has it become possible to test this hypothesis empirically. Researchers have subjected 19,980 individuals, randomly chosen from the adult population, to a test battery assessing their IQ. Based on their test results, participants were divided into those with a high intelligence, that is, scoring above average, and those with normal levels of intelligence. Subsequently, genetic maps were created for both groups. Results show that individuals with a very high IQ display the proctodineasi in their genome, whereas the proctodineasi was absent in the genetic makeup of the average intelligence group. The researchers have concluded that this protein (proctodineasi) distinguishes the most intelligent individuals and that this protein accounts for their greater and faster cognitive development.

Participants assigned to the nongenetic condition were told the following: Considerable empirical evidence has proven that human intelligence does not have a genetic basis. It has always been thought that a protein, called proctodineasi, distinguishes the genome of very intelligent people and that this protein is responsible for their superior brain development. Only recently, due to the progress in genetic mapping, has it become possible to test this hypothesis empirically. Researchers have subjected 19,980 individuals, randomly chosen from the adult population, to a test battery assessing their IQ. Based on their test results, participants were divided into those with high intelligence, that is, scoring above average, and those with normal levels of intelligence. Subsequently, genetic maps were created for both groups. Results show that both individuals with a very high IQ and those with average intelligence display the proctodineasi in their genome. The researchers have concluded that this protein (proctodineasi) is present in both highly intelligent and in normal people and that this protein cannot account for greater and faster cognitive development.

In keeping with the cover story, participants were given a series of questions allegedly aimed at testing their comprehension of the scientific article (e.g., “Do people who have the proctodineasi in their genome develop higher levels of intelligence?”), and they provided their answers by means of a true–false format. Then, participants were invited to take part in a further study. Specifically, they were told that we were interested in the way people form an impression of an individual target. Regardless of the experimental condition (genetic vs. nongenetic), participants read the following target description: Marco is twenty-two years old. When one of his friends runs into problems with some computer software, he is very fast at identifying the cause and making the program work properly. He is faster than any of his friends to find the solution to any conundrum. He likes to dedicate his spare time to new inventions, which he is sometimes able to sell to producers. Genetic mapping revealed the presence of the proctodineasi in Marco’s genome.

After reading the description, participants were asked to choose which of two labels, genio [genius] or geniale [brilliant], described the target more appropriately. Participants then reported their gender and their age and were fully debriefed and dismissed.

Results and Discussion

In support of our hypothesis, participants were more likely to pick the noun and less likely to pick the adjective in the genetic than in the nongenetic condition, χ²(1, N = 70) = 3.81, p < .05 (see Table 3). In line with the results of Study 6A, participants’ preference for the noun label was enhanced when the characteristic (i.e., intelligence) was perceived as a genetically determined aspect of the target rather than as a transient property that is largely under the influence of environmental factors.
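For readers who wish to verify the chi-square statistics of Studies 6A and 6B against the cell counts in Table 3, the following Python sketch computes the Pearson chi-square for a 2 × 2 table using the shortcut formula χ² = N(ad − bc)² / [(a + b)(c + d)(a + c)(b + d)]. It assumes that no continuity correction was applied, which the article does not state explicitly; the function name and variable names are illustrative.

```python
def pearson_chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Uncorrected Pearson chi-square for the 2 x 2 table [[a, b], [c, d]]."""
    n = a + b + c + d
    numerator = n * (a * d - b * c) ** 2
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return numerator / denominator


# Rows: genetic, nongenetic; columns: noun choices, adjective choices (Table 3)
study_6a = (5, 8, 1, 14)
study_6b = (8, 28, 2, 32)

chi2_6a = pearson_chi2_2x2(*study_6a)
chi2_6b = pearson_chi2_2x2(*study_6b)

print(round(chi2_6a, 2))  # 4.18
print(round(chi2_6b, 2))  # 3.81
```

With these counts the uncorrected statistic comes out at 4.18 for Study 6A and 3.81 for Study 6B, matching the values reported in the text, which suggests that the reported tests were indeed uncorrected Pearson chi-squares.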

General Discussion

On the surface, nouns and adjectives often appear very similar, at times distinguishable only by relatively minor variations in suffix (Lone is a Dane vs. Lone is Danish) or by the presence or absence of the article (Frederick is a schizophrenic vs. Frederick is schizophrenic). Yet, the present research project suggests that they differ predictably and systematically in the way they are used by speakers and in the impressions and inferences they create in the listener. It is important to note that these differences seem to reflect systematic differences in grammatical roles played by nouns versus adjectives.

First, Studies 1 and 2 show that nouns (compared with adjectives) have a greater potential to induce inferences about the target person that go beyond the information given, confirming findings by Gelman, Markman, and collaborators (Gelman & Markman, 1985, 1986; Markman, 1989; Markman & Smith, as cited in Markman, 1989). Conceptually, this finding is consistent with the fact that nouns define classes that imply multiple properties, whereas adjectives generally delineate a single property, thus lending support to Wierzbicka’s (1986) assertion that “there is more in a noun than meets the eye; there is more in a noun than there is in an adjective” (p. 380). Note, however, that our studies also show that the greater willingness to draw inferences is limited to information that is stereotypically congruent with the noun label. Inferences about incongruent behaviors or habits of the target person are actually inhibited. On the one hand, nouns (e.g., X is a homosexual) foster stereotypical inferences more so than corresponding adjectives (e.g., X is homosexual), but on the other hand, they also tend to inhibit counterstereotypical inferences. Thus, nouns do not simply induce inferential processing, but they channel the inferential activity so that expected behaviors appear more likely and unexpected behaviors appear less likely. Although we did not test this hypothesis in the present set of studies, an interesting implication of this process is that subtle differences in word choice in interpersonal discourse may bolster existing stereotypical beliefs.

Our research also shows that the tendency of nouns to facilitate stereotypic and to inhibit counterstereotypic thoughts is not limited to specific word pairs, to a specific language, to specific stimuli, or to specific response formats. We found very similar evidence in experimenter-generated and in randomly selected word pairs and for different kinds of labels referring to kinship or intimacy groups, social categories, personality types, and loosely associated groups such as professions. We also obtained comparable findings in two different languages (Italian and German), in stimulus sentences differing in valence and content domain, and for closed and open-ended responses. Also, similar results were obtained for nouns and adjectives presented in isolation and for phrases defining the same target both by a noun and by an adjective (Study 3). Thus, it appears that we are dealing with a robust and pervasive phenomenon, at least within the realm of Indo-European languages.

The second result of interest refers to the fact that once a person has been labeled by a noun (e.g., Mark is an athlete), classifications of the same person along different dimensions (e.g., Mark is also an artist) become unlikely.
The fact that categorizations along one dimension (e.g., Chinese) inhibit alternative classifications of the same person (e.g., women) has long been known (Macrae et al., 1995). What is new about the present approach is the role that language plays in this process. Whereas nouns do, indeed, inhibit alternative (noun) classifications, adjectives do not. Nouns are exclusive in the sense that they do not easily “allow” that the same person be assigned also to a different category. Adjectives, in contrast, do not preclude that the target person also has other qualities defined by other adjectives or by nouns. At the risk of oversimplifying, the target person can belong to only one noun category, but she or he may possess any number of different properties. To our knowledge, this is the first empirical support for Allport’s (1954) intuition that “nouns . . . cut slices” (p. 178), thereby preventing alternative classifications.

The third major finding is that nouns imply greater essentialism than adjectives. This is in line with the idea that there is something deeper and more profound about a noun than the sum of the features it entails (Wierzbicka, 1986) and that nouns are more central to the identity of the object or person as they imply great stability and immutability (Gelman & Heyman, 1999), and possibly, also more extreme meanings than adjectives (De Raad, 1992). Our findings extend previous observations (Gelman & Heyman, 1999; Walton & Banaji, 2004) by showing that nouns exceed not only verbs but also adjectives in implicit essentialism. This is remarkable if one considers that nouns and adjectives have much greater surface similarity than nouns and verbs and that, for the aim of the current set of studies, nouns and adjectives were deliberately selected to ensure that they shared the same word stem. Our findings also suggest that features that are semantically incongruent are perceived as less essentialist when the characteristic is stated in noun rather than adjective form. Thus, just as in the case of stereotypicality, nouns seem to polarize attributions of essentialism, making congruent behaviors or habits appear more, and incongruent behaviors less, essential and profound aspects of the person.

Note that the same mechanism can also be reversed, as our last two studies demonstrate. Here we manipulated the mindset of people so that they adopted either a genetic or an environmental view on athletic performance (Study 6A) or intelligence (Study 6B). They then had to describe the same stimulus person showing success in the athletic (or intellectual) domain. In both cases, people who considered the domain genetically determined increased their use of nouns compared with those who held more environmental views. Together, this suggests the possibility of a subtle, self-perpetuating cycle of interpersonal communication, so that people with highly essentialist beliefs tend to report on the behavior of the target person in noun form, which, in turn, induces the perception of high essentialism in the listener.

Throughout this article, we have treated the three phenomena (stereotypical inferences, inhibition of alternative classifications, and implicit essentialism) as logically coherent, yet distinct effects. An interesting issue to be investigated in future research is the possibility of a common underlying mechanism. In particular, following the logic of Rothbart and Taylor (1992), it is conceivable that the implicit essentialism of nouns, lending a quasi-biological quality to socially constructed categories, is the common denominator that then drives both stereotypic inferences and the inhibition of alternative classifications. Only future studies using within-participant designs will be able to provide insights into the causal link between the three phenomena.

Relation to Prior Theorizing

Our research findings appear largely in line with those cognitive–linguistic approaches that conceptualize grammar as closely linked to semantics (Evans & Green, 2006; Langacker, 1986). Yet, one may wonder how our findings relate to social–psychological approaches dealing with the language–cognition interface in the interpersonal domain, such as Semin and Fiedler’s (1988) influential LCM. Although this model does not include nouns among its levels of abstraction and although nouns are generally coded as equivalent to adjectives because of their functional similarity (see Coenen, Hedebouw, & Semin, 2006), there are reasons to propose that nouns be added to the model as the fifth and highest level of language abstraction. Compared with descriptive action verbs (e.g., Shira goes to the Synagogue), interpretative action verbs (e.g., Shira observes the Jewish holidays), state verbs (e.g., Shira feels part of the Jewish community), and adjectives (e.g., Shira is Jewish), the noun phrase Shira is a Jew appears to make the most general and all-encompassing statement about Shira’s religious affiliation.

According to LCM (Semin & Fiedler, 1988), events that are formulated in a relatively concrete form (such as action verbs) draw perceivers’ attention to the specific situational constraints of the event. However, with increasing abstraction, there is an increasing focus on stable and dispositional causes and a greater implicit likelihood that the described characteristic generalizes across situations, behaviors, and objects and over time. As a consequence, characteristics of an individual described abstractly are considered as more stable, less falsifiable, and more informative about the individual while providing little information about the situation in which the event takes place. If nouns can be considered the most abstract level of the LCM, then the noun stimuli used in our studies should exceed the corresponding adjectives in informativeness about the person, perceived duration, and projection into the future.

We tested this possibility on a sample of 55 students, who judged six adjective–noun pairs (athletic–athlete, poetic–poet, artistic–artist, brilliant–genius, drunk–drunkard, and heroic–hero) with respect to the three central queries, two of which were adopted from Semin and Fiedler (1988), referring to the amount of information provided about the protagonist, the enduringness of the characteristic, and the likelihood that the person will be like this in the future. Results show that nouns, compared with adjectives, were seen to provide more information about the target person (Mnoun = 3.97 vs. Madj = 3.59), to imply a more enduring characteristic (Mnoun = 4.49 vs. Madj = 4.03), and to allow better predictions about the future (Mnoun = 4.36 vs. Madj = 3.85; ps < .001). These preliminary results suggest that nouns may legitimately be added to the LCM as a fifth level of abstraction, representing the most general type of assertion that can be made about a target person. In other words, nouns seem to continue the well-established linear trend from descriptive action verbs over interpretive action verbs and state verbs to adjectives in terms of informativeness, enduringness, and predictability. However, there is one feature for which nouns may not match the LCM and for which they may show a deviant, that is, nonlinear, pattern, namely the vividness of imagery.
In general, people find it relatively easy to visually imagine the specific situation when concrete words are used, but the capacity of concrete imaging decreases with increasing language abstraction (Semin & Fiedler, 1988). Extrapolating from these observations, one may expect particularly low imaging for nouns. Yet, there is evidence in the literature that on average, nouns are more likely to evoke concrete visual images than adjectives and that this capacity to arouse imagery may in part explain why nouns are more efficient memory cues than adjectives (e.g., Di Vesta & Ross, 1971; for an overview, see Paivio, 1991). The same may hold for the person-descriptive nouns and adjectives used in our set of studies. As a thought experiment, the reader may try to form, as quickly as possible, a visual image of a person who is “athletic” or of one who is “Jewish” [adjective] compared with “an athlete” or “a Jew” [noun]. It is not unlikely that precise visual images, including specific prototypical exemplars (such as Carl Lewis and Steffi Graf or Albert Einstein and Woody Allen), come to mind more easily when prompted by a noun rather than by an adjective. We tested this conjecture in the above-mentioned follow-up study by asking our participants to indicate how easy they found it to imagine the protagonist or how quickly they had a clear image in their minds when the protagonist was described by a noun (e.g., athlete) or an adjective (e.g., athletic). In line with our expectations, nouns (M = 4.57) elicited vivid images more easily and more quickly than adjectives (M = 4.37; p < .05). Together, this very first evidence suggests that nouns are indeed more general than adjectives in terms of informativeness, enduringness, and predictability and, as such, could be integrated into LCM at the highest level of abstraction, but they do seem to enjoy an advantage in terms of imagery that makes them more similar to concrete verbs. It therefore remains for future research to investigate where nouns may be located on Semin and Fiedler’s (1988) abstraction dimension and whether the unique combination of high generality but also high vividness may be responsible for the great inductive potential of nouns observed in the present set of studies.

Limits and Implications

Being the first of its kind, it is not surprising that the present research project has a number of limitations. First, our hypotheses were tested only with stimulus pairs that, although phonetically and semantically similar, were easily distinguishable by suffix (and, in one case, also by uppercase vs. lowercase of initial, i.e., Katholik [Catholic] vs. katholisch [Catholic]). In some languages (such as German), adjectives and nouns are distinguishable by definition and therefore the criteria that guided the present research are easily met in such languages, whereas in other languages (such as Italian), nouns and adjectives with common derivations often differ only with respect to the article but are not distinguishable on the basis of suffix or spelling (e.g., omosessuale–un omosessuale [homosexual–a homosexual]; cattolico–un cattolico [Catholic–a Catholic]). It therefore remains to be seen whether similar results can also be obtained in the latter case in which word pairs differ only with respect to the presence or absence of the article. Second, although adjective–noun differences seem very robust as they emerged in every study we have run so far, regardless of stimulus materials and measures, it is currently difficult to estimate the average size of the effect. Effect sizes varied greatly across studies from small to large, with noun–adjective differences accounting for as little as 8% to 9% of the variance up to almost 90%. However, most effect sizes fell into the medium range. We observed only one systematic variation in effect size, suggesting that the facilitation of congruent inferences is generally stronger than the inhibition of incongruent inferences, both with respect to stereotypicality and essentialism. Third, we assessed only conscious and deliberate reactions to nouns or adjectives, which may or may not generalize to implicit measures.
However, the fact that practically identical results were obtained with within- and between-participants designs suggests that impression management was not at work. In within-participants designs, people could easily have adjusted their second response so as to appear coherent, which they did not do. One may therefore suspect that participants are largely unaware of their different reactions to nouns and adjectives, although this issue remains to be investigated. Fourth, and most important, our hypotheses were tested on only two European languages, reflecting an ethnocentric bias that is typical of much psychological research. The role of nouns may well be different in languages that have only few adjectives (Dixon, 1977). It is likely that nouns in these languages need to cover both functions (namely assigning objects to classes and defining single features of such objects) that in Indo–European languages are taken over by two distinct parts of speech (nouns and adjectives). Also, even in languages in which adjectives are common, they may differ in form and function from those of Indo–European languages (Croft, 1991). As a case in point, two subclasses of adjectives can be distinguished in Japanese, neither of which resembles exactly the word class that is called “adjective” in Italian, German, or English (Maass, Karasawa, Politi, & Suga, 2006). The first subclass, ending with the suffix –i, is regarded as relatively similar to European adjectives, whereas the second subclass, ending with the suffix –da, represents “nominal adjectives” that are grammatically similar to nouns (Uehara, 1998). To complicate things further, both of these subclasses have properties that equate them to verbs. For example, depending on their syntactic role, they may conjugate, just as verbs do. Also, they do not require a copula to be a predicate, and this is particularly true for the prototypical –i adjectives. Hence, the distinction between adjectives and nouns may be considerably more complex in languages such as Japanese than it is in Western languages (see also Wetzer, 1996). In addition, whereas the major word classes have highly specialized functions and are syntactically distinct in Indo–European languages (with nouns describing objects or entities; adjectives describing qualities or properties; and verbs describing actions, states, or processes), there may be greater interclass overlap in other languages. As a case in point, many categories can easily and flexibly be converted into other grammatical classes in Japanese. Thus, considering the remarkable variations of word classes across languages and the differential use of verbs, adjectives, and nouns found across cultures (e.g., Kashima, Kashima, Kim, & Gelfand, 2006; Maass et al., 2006), we caution against generalizing our findings beyond the linguistic context in which they were generated.

Keeping these limits in mind, we still believe that our findings can have interesting implications for a number of applied areas, including subtle biases in mass communication as well as in interpersonal discourse. The strategic use of nouns versus adjectives may well contribute to the maintenance of socially shared beliefs (for similar language biases, see Maass, 1999; Semin, in press; Wigboldus, Semin, & Spears, 2000).
For example, if nouns (rather than adjectives) are used to describe negative features of an outgroup and/or minority member, this may foster stereotypical thinking about the target (and possibly about its group). For instance, a person learning that Paula is an anorexic may engage in highly stereotypical and essentialist thinking about Paula’s clinical condition, which is likely to be perceived as an enduring and possibly genetically determined aspect of Paula’s personality. Also, the listener may have great difficulty envisaging Paula in ways other than her being an anorexic (e.g., as a musician, a teacher). It may be interesting for future research to investigate how the use of nouns versus adjectives may, in a subtle way, affect the transmission and maintenance of stereotypic beliefs about different social groups, such as ethnic minorities or people with psychopathologies (an area in which nouns seem to prevail, Saucier, 2003).11

In conclusion, nouns and adjectives seem to play distinct roles in person perception despite their apparent similarity in form and function. Nouns are much more potent in directing people’s thoughts, as they suggest greater essentialism and induce stronger stereotyping while blocking alternative classifications of the same person. In short, they are much more prognostic than seemingly similar adjectives. Thus, in reference to the familiar Latin saying that we adopted as the title of this article, nouns are, indeed, omens.12

11 In line with this reasoning, the fifth edition of the Publication Manual of the American Psychological Association (American Psychological Association, 2001) provides a number of guidelines to avoid language bias against persons on the basis of their gender, sexual orientation, and so on, among which is “adjectives are preferred to nouns” (p. 67).

12 Note that the Latin saying nomina sunt omina (or the better known singular form nomen est omen) is generally translated as names are signs (or presages), but among the many meanings of nomen, noun also figures (as a grammatical unit).

References Allport, G. W. (1954). The nature of prejudice. Cambridge, MA: AddisonWesley. American Psychological Association. (2001). Publication Manual of the American Psychological Association (5th ed.). Washington, DC: Author. Angleitner, A., Ostendorf, F., & John, O. P. (1990). Towards a taxonomy of personality descriptors in German: A psycho-lexical study. European Journal of Personality, 4, 89 –118. Baayen, R. H., Feldman, L. B., & Schreuder, R. (2006). Morphological influences on the recognition of monosyllabic monomorphemic words. Journal of Memory and Language, 55, 290 –313. Bruner, J. S., Goodnow, J. J., & Austin, G. A. (1956). A study of thinking. New York: Wiley. Brysbaert, M. (2007). The language-as-fixed-effect-fallacy: Some simple SPSS solutions to a complex problem. London: Royal Holloway, University of London. Carnaghi, A., & Maass A. (in press). Derogatory language in intergroup context: Are “gay” and “fag” synonymous? In Y. Kashima, K. Fiedler, & P. Freytag (Eds.), Stereotype dynamics: Language-based approaches to stereotype formation, maintenance, and transformation. Mahwah, NJ: Erlbaum. Carnaghi, A., Maass, A., Bianchi, M. B., Castelli, L., & Brentel, M. (2005). Gay or fag? On the cognitive and affective consequences of derogatory group label. Unpublished manuscript, University of Padova, Padua, Italy. Clark, H. (1973). Language-as-fixed-effect fallacy: A critique of language statistics in psychological research. Journal of Verbal Learning and Behavior, 12, 335–359. Coenen, L., Hedebouw, L., & Semin, G. R. (2006). Measuring language abstraction: The Linguistic Category Model (LCM). Retrieved April 2007, from http://www.cratylus.org Croft, W. (1991). Syntactic categories and grammatical relations. Chicago: University of Chicago Press. De Raad, B. (1992). The replicability of the Big Five personality dimensions in three word classes of the Dutch language. European Journal of Personality, 65, 477–528. De Raad, B., & Hoskens, M. (1990). 
Personality-descriptive nouns. European Journal of Personality, 4, 131–146. Devoto, G., & Oli, G. C. (1971). Dizionario della lingua italiana [Vocabulary of the Italian language]. Florence, Italy: Le Monnier. Di Vesta, F. J., & Ross, S. M. (1971). Imagery ability, abstractness, and word order as variables in recall of adjectives and nouns. Journal of Verbal Learning and Verbal Behavior, 10, 686–693. Dixon, R. M. W. (1977). Where have all the adjectives gone? Studies in Language, 1, 19–80. Dweck, C. S. (1999). Self-theories: Their role in motivation, personality, and development. Philadelphia: Taylor & Francis. Ehrlich, M. F. (1977). Learning and long-term memorization of sentence: The role of semantic cohesion of the elements. L’Année Psychologique, 77, 41–62. Evans, V., & Green, M. (2006). Cognitive linguistics: An introduction. Mahwah, NJ: Erlbaum.

Fiske, S. T., & Neuberg, S. L. (1990). A continuum model of impression formation from category-based to individuating processes: Influences of information and motivation on attention and interpretation. Advances in Experimental and Social Psychology, 23, 1–74. Foroni, F., & Rothbart, M. (2006). Labeling and categorization: Effects of the mere labeling process and of the nature of the labels. Unpublished manuscript, Free University Amsterdam. Gelman, S. A., & Coley, J. D. (1990). The importance of knowing a dodo is a bird: Categories and inferences in two-year old children. Developmental Psychology, 26, 796 – 804. Gelman, S. A., Collman, P., & Maccoby, E. E. (1986). Inferring properties from categories versus inferring categories from properties: The case of gender. Child Development, 57, 396 – 404. Gelman, S. A., & Heyman, G. D. (1999). Carrot-eaters and creaturebelievers: The effects of lexicalization on children’s inferences about social categories. Psychological Science, 10, 489 – 493. Gelman, S. A., & Markman, E. A. (1985). Implicit contrast in adjectives vs. nouns: Implication for word-learning in preschoolers. Journal of Child Language, 12, 125–143. Gelman, S. A., & Markman, E. M. (1986). Categories and induction in young children. Cognition, 23, 183–209. Gelman, S. A., & Markman, E. M. (1987). Young children’s inductions from natural kinds: The role of categories and appearances. Child Development, 55, 1535–1540. Gilbert, D. T., & Malone, P. S. (1995). The correspondence bias. Psychological Bulletin, 117, 21–38. Hall, D. G., & Moore, C. E. (1997). Red bluebirds and black greenflies: Preschoolers’ understanding of the semantics of adjectives and count nouns. Journal of Experimental Child Psychology, 67, 236 –267. Jespersen, O. (1986). The philosophy of grammar. London: George Allen & Unwin. Jones, E. E. (1990). Interpersonal perception. New York: Freeman. Jones, E. E., & Harris, V. A. (1967). The attribution of attitudes. 
Journal of Experimental Social Psychology, 3, 1–24. Kashima, Y., Kashima, E. S., Kim, U., & Gelfand, M. (2006). Describing the social world: How is a person, a group, and a relationship described in the East and the West? Journal of Experimental Social Psychology, 42, 388 –396. Krueger, J., & Clement, R. W. (1994). Memory-based judgments about multiple categories: A revision and extension of Tajfel’s accentuation theory. Journal of Personality and Social Psychology, 67, 35– 47. Kusyszyn, I., & Paivio, A. (1966). Transition probability, word order, and noun abstractness in the learning of adjective–noun paired associates. Journal of Experimental Psychology, 71, 800 – 805. Langacker, R. W. (1986). An introduction to cognitive grammar. Cognitive Science, 10, 1– 40. Lickel, B., Hamilton, D. L., Wieczorkowska, G., Lewis, A., Sherman, S. J., & Uhles, A. N. (2000). Varieties of groups and the perception of group entitativity. Journal of Personality and Social Psychology, 78, 223–246. Lockhart, R. S. (1969). Retrieval asymmetry in the recall of adjectives and nouns. Journal of Experimental Psychology, 79, 12–17. Lockhart, R. S., & Martin, J. E. (1969). Adjective order and the recall of adjective-noun triples. Journal of Verbal Learning and Verbal Behavior, 8, 272–275. Loftus, E. F. (1972). Nouns, adjectives, and semantic memory. Journal of Experimental Psychology, 96, 213–215. Lyons, J. (1977). Semantics. Cambridge, England: Cambridge University Press. Maass, A. (1999). Linguistic intergroup bias: Stereotype perpetuation through language. In M. P. Zanna (Ed.), Advances in experimental social psychology (Vol. 31, pp. 79 –121). San Diego, CA: Academic Press. Maass, A., Karasawa, M., Politi, F., & Suga, S. (2006). Do verbs and adjectives play different roles in different cultures? A cross-linguistic

analysis of person representation. Journal of Personality and Social Psychology, 90, 734–750. Macrae, C. N., Bodenhausen, G. V., & Milne, A. B. (1995). The dissection of selection in person perception: Inhibitory processes in social stereotyping. Journal of Personality and Social Psychology, 69, 397–407. Mae, L., & Carlston, D. E. (2005). Hoist on your own petard: When prejudiced remarks are recognized and backfire on speakers. Journal of Experimental Child Psychology, 41, 240–255. Markman, E. M. (1989). Categorization and naming in children: Problems of induction. Cambridge, MA: MIT Press. Martin, J. E. (1969). Some competence-process relationships in noun phrases with pronominal and postnominal adjectives. Journal of Verbal Learning and Verbal Behavior, 8, 471–480. Paivio, A. (1963). Learning adjective-noun paired associates as a function of adjective-noun order and noun abstractness. Canadian Journal of Psychology, 17, 370–397. Paivio, A. (1991). Images in mind: The evolution of a theory. Hertfordshire, England: Harvester Wheatsheaf. Putnam, H. (1975). The meaning of meaning. In H. Putnam (Ed.), Mind, language, and reality (Vol. 2, pp. 239 –238). London: Cambridge University Press. Ross, L., Amabile, T. M., & Steinmetz, J. L. (1977). Social roles, social control, and biases in social-perception processes. Journal of Personality and Social Psychology, 35, 485–494. Rothbart, M., Davis-Stitt, C., & Hill, J. (1997). Effects of arbitrarily placed category boundaries on similarity judgments. Journal of Experimental Social Psychology, 33, 122–145. Rothbart, M., & Taylor, M. (1992). Category labels and social reality: Do we view social categories as natural kinds? In G. R. Semin & K. Fiedler (Eds.), Language, interaction and social cognition (pp. 11–36). London: Sage.

Saucier, G. (2003). Factor structure of English-language personality type nouns. Journal of Personality and Social Psychology, 85, 695–708. Semin, G. R. (in press). Stereotypes in the wild. In Y. Kashima, K. Fiedler, & P. Freytag (Eds.), Stereotype dynamics: Language-based approaches to stereotype formation, maintenance, and transformation. Mahwah, NJ: Erlbaum. Semin, G. R., & Fiedler, K. (1988). The cognitive functions of linguistic categories in describing persons: Social cognition and language. Journal of Personality and Social Psychology, 54, 558 –568. Simon, L., & Greenberg, J. (1996). Further progress in understanding the effects of derogatory ethnic labels: The role of preexisting attitudes toward the targeted group. Personality and Social Psychology Bulletin, 12, 1195–1204. Tajfel, H., & Wilkes, A. L. (1963). Classification and quantitative judgment. British Journal of Social Psychology, 54, 101–114. Uehara, S. (1998). Syntactic categories in Japanese: A cognitive and typological introduction. Tokyo: Kuroshio Publishers. Walton, G. M., & Banaji, M. B. (2004). Being what you say: The effect of essentialist linguistic labels on preferences. Social Cognition, 22, 193– 213. Wetzer, H. (1996). The typology of adjectival predication. Berlin, Germany: Mouton de Gruyter. Wierzbicka, A. (1986). What’s in a noun? (Or: How do nouns differ in meaning from adjectives?). Studies in Language, 10, 353–389. Wigboldus, D., Semin, G. R., & Spears, R. (2000). How do we communicate stereotypes? Linguistic bases and inferential consequences. Journal of Personality and Social Psychology, 78, 5–18. Zingarelli, N. (1988). Il nuovo Zingarelli: Vocabolario della lingua italiana [Zingarelli: Vocabulary of the Italian language]. Bologna, Italy: Zanichelli

Appendix

Noun–Adjective Stimulus Pairs Used in Studies 1B and 5B

Eroe [hero]–eroico [heroic]
Internazionalista [internationalist]–internazionale [international]
Eccezione [exception]–eccezionale [exceptional]
Mostro [monster]–mostruoso [monstrous]
Ironista [ironist]–ironico [ironic]
Tradizionalista [traditionalist]–tradizionale [traditional]
Individualista [individualist]–individualistico [individualistic]
Piagnucolone [whiner]–piagnucoloso [whiny]
Celebrità [celebrity]–celebre [famous]
Fatalista [fatalist]–fatalistico [fatalistic]
Pacificatore [peacemaker]–pacificativo [peace promoting]
Musicista [musician]–musicale [musical]

Received February 12, 2007
Revision received July 29, 2007
Accepted August 1, 2007

Journal of Personality and Social Psychology 2008, Vol. 94, No. 5, 860 – 870

Copyright 2008 by the American Psychological Association 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.860

Taking the Easy Way Out: Preference Diversity, Decision Strategies, and Decision Refusal in Groups

Bernard A. Nijstad, University of Amsterdam
Silvia C. Kaps, University of Göttingen

It has often been argued and found that preference diversity is beneficial for the quality of group decisions. However, this literature has neglected the fact that in many situations, it is also possible not to choose. Further, preference diversity can be based on attractions, aversions, or both. The authors argue that some types of preference diversity can lead to biased discussions and choice refusal (i.e., the group refuses to choose any of the available options). In a laboratory experiment, three different patterns were observed. When group members held different aversions before discussion, discussions were aversion driven and group members quickly agreed to refuse all alternatives. When each alternative had both a proponent and an adversary, discussions were longer and unbiased but still often led to refusal, which was accompanied by relatively low levels of outcome satisfaction. Only when preference diversity was based only on attractions did it lead to unbiased discussion, low prevalence of refusal, and high outcome satisfaction. Implications for group decision making are discussed.

Keywords: group decision making, decision refusal, preference diversity

Many important decisions are made in groups. Because different group members can bring unique insights, skills, and information to the table, groups have the potential to outperform individuals when it comes to the quality of decisions. However, although groups can be effective decision makers (see, e.g., McGrath, 1984), various problems are also associated with decision making in groups (also see Kerr & Tindale, 2004). For example, groups are not very effective when it comes to sharing unique information (e.g., Stasser & Titus, 1985), groups sometimes are vulnerable to various cognitive biases and errors (Kerr, MacCoun, & Kramer, 1996; Schulz-Hardt, Frey, Lüthgens, & Moscovici, 2000), and groupthink may even lead to decision fiascos (Janis, 1982). A recurring theme in the group-decision-making literature is that these negative effects mainly occur in (or are exacerbated by) groups with preference homogeneity (e.g., Janis, 1982; Scholten, Van Knippenberg, Nijstad, & De Dreu, 2007; Schulz-Hardt, Brodbeck, Mojzisch, Kerschreiter, & Frey, 2006; Schulz-Hardt et al., 2000). Preference homogeneity in general has the effect that group members become more convinced about the correctness of their choice, because other members apparently have independently come to the same conclusion. This validates one’s own position and may in turn lead to high levels of confidence and a quick (and sometimes erroneous) consensus. Thus, as we review below, researchers often argue that preference diversity enhances the quality of group decision making, because it prevents a premature consensus on a bad alternative and ensures an even-handed and unbiased discussion (e.g., Janis, 1982; Nemeth & Nemeth-Brown, 2003).

Although there is little doubt that preference diversity can enhance the quality of group decision making, we argue that this is not necessarily true. In particular, previous research has neglected the fact that in many situations, groups do not have to reach a decision at all but can also refuse to choose (e.g., not deciding on a new resource allocation system within the department, not hiring any of the job candidates). Preference diversity may lead to decision refusal rather than to constructive interactions and high-quality decisions. Furthermore, preferences can be based on attractions toward some alternatives but also on aversions against other alternatives or on combinations of attractions and aversions. The type of preference diversity (i.e., based on attractions, aversions, or both) determines whether preference diversity leads to constructive or biased group processes and to choice or refusal.

The remainder of this article is structured as follows. First, we give an overview of empirical findings that show that preference diversity enhances the quality of group decision making. We then introduce the possibility that groups can also refuse to choose. On the basis of the negativity bias in judgment and decision making and on the literature regarding group-level decision strategies, we argue that group processes and decision refusal are dependent on the type of preference diversity within groups. We report an experiment in which different types of preference diversity are created and effects on group processes and decision refusal are assessed.

Bernard A. Nijstad, Department of Work and Organizational Psychology, University of Amsterdam, Amsterdam, the Netherlands; Silvia C. Kaps, Department of Psychology, University of Göttingen, Göttingen, Germany. Silvia C. Kaps is now at the Institute of Pedagogical Psychology, University of Brunswick, Braunschweig, Germany. This research was supported by Grant 452-04-311 from the Netherlands Organization for Scientific Research, and their support is gratefully acknowledged. Parts of the research were presented at the October 2005 annual meeting of the Society of Experimental Social Psychology, San Diego, California; at the July 2005 meeting of the European Association of Experimental Social Psychology, Würzburg, Germany; and at the June 2006 INGRoup meeting, Pittsburgh, Pennsylvania. We thank the members of the social decision making group of the University of Amsterdam for their helpful suggestions and comments. Correspondence concerning this article should be addressed to Bernard A. Nijstad, Department of Work and Organizational Psychology, University of Amsterdam, Roetersstraat 15, 1018 WB Amsterdam, the Netherlands. E-mail: [email protected]

The Benefits of Preference Diversity

A number of studies suggest that preference diversity is beneficial for the quality of group decision making, because it stimulates a more even-handed and less biased consideration of information during the decision-making process. These studies show the benefits of preference diversity in the different stages of the decision-making process: during the search for information, the discussion of information, and the integration of information into group judgments and decisions.

Schulz-Hardt et al. (2000) studied information search during group decision making. When group members initially agreed about the best alternative, they tended to mainly search for evidence that supported their position and thus showed a strong confirmation bias. This bias was not present in groups that had diverse preferences, whose members searched for information on all sides of the issue. The effect of preference diversity was mediated by confidence and commitment: Groups with homogeneous preferences were more confident about their choice and felt more committed to it than did heterogeneous groups, and this subsequently led to a search for confirmatory evidence, ignoring potentially relevant counterpreferential information.

Preference homogeneity also affects information sharing during group discussion. Groups have been found to often discuss and repeat shared information (information available to all group members before discussion) more than unshared information (information available to only one of the group members), which can lead to suboptimal decisions (e.g., Larson, Foster-Fishman, & Keys, 1994; Stasser & Titus, 1985, 1987). Schulz-Hardt et al. (2006) recently found that groups in which preferences were diverse discussed more information overall and had longer discussions than did groups in which there was preference homogeneity.
Further, the bias toward sharing and repeating shared rather than unshared information was weaker in groups with preference diversity than in groups with homogeneous preferences. Groups with preference diversity also more often made the correct decision. Prediscussion preference diversity thus prevented a premature consensus on the wrong alternative and led to higher quality decisions. After information search and discussion, information must be integrated into a final judgment or decision. Often, however, group decisions simply reflect prediscussion preferences, even when information that speaks against these preferences becomes available later (e.g., Greitemeyer & Schulz-Hardt, 2003; Postmes, Spears, & Cihangir, 2001). In particular, when decisions cannot be evaluated using objective criteria of correctness, initial majorities will often determine a group’s choice (e.g., Davis, 1973; Laughlin & Ellis, 1986; Stasser, 1999). This is true because group members may follow the “consensus implies correctness” heuristic (Chaiken & Stangor, 1987; Stasser & Birchmeier, 2003): If most people have independently come to prefer the same alternative, it is likely to be the correct choice. Often such a strategy of pooling preferences yields quick and good decisions. However, when initial majorities are wrong, this strategy will lead the group astray. In

that case, preference diversity will force groups to confront their differences and focus on arguments instead of preferences, which eventually leads to a better decision (also see De Dreu, Nijstad, & Van Knippenberg, in press; Scholten et al., 2007). Taken together, these findings suggest that preference homogeneity may lead to a biased consideration of information and arguments and sometimes to a quick consensus on the wrong alternative. Preference diversity, in contrast, is associated with longer and less biased discussions, and diverse groups more often reach a high-quality decision. Many authors have therefore suggested that stimulating debate, criticism, and task conflict should lead to better decisions (see, e.g., Janis, 1982; Nemeth & Nemeth-Brown, 2003; Schulz-Hardt, Jochims, & Frey, 2002).1

Preference Diversity and Decision Refusal

Decision Refusal

Although the findings discussed above clearly suggest that preference diversity can stimulate high-quality decision making, researchers have ignored the fact that it may also lead to stalemates and no decision being made at all. One important feature of the research discussed above is that groups were required to make a choice among the available alternatives. In many situations, however, groups can also decide not to choose. Indeed, decision makers often have to choose among a nonexhaustive set of alternatives, because it is not possible to identify and evaluate all alternatives (e.g., identify and interview all potential job candidates). When choice sets are incomplete, decision makers can reject all available options and decide to invest further resources to identify more options. Corbin (1980) has called this decision refusal. We argue that preference diversity might sometimes lead to decision refusal. This may happen despite the fact that alternatives are adequate, sometimes without an adequate assessment of alternatives, and sometimes at the cost of lower outcome satisfaction.

Before moving on, we need to note three things. First, decision refusal should be distinguished from choosing the status quo: In the case of refusal, one does not choose but rather decides to look for further alternatives before making a choice, which implies that a final decision has not been reached. Second, decision refusal is not by definition maladaptive. Indeed, if all presently available options are unattractive, it might be better to look for more options. However, there is a point after which the investments in the search for more options will no longer pay off. Thus, adaptive decision making should depend on the balance between the costs of searching and the potential benefits of finding a superior alternative (also see Tversky & Shafir, 1992).
Third, in the present study, we do not directly examine the quality of the final group decision. Rather, we examine biases during group discussion and the prevalence of decision refusal (which can be more or less adaptive).
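The trade-off between search costs and the benefit of finding a superior alternative can be made concrete with a minimal sketch. It is an illustration only and not part of the study: it assumes a one-step lookahead rule, and the function name, the option values, the probabilities, and the search cost are all hypothetical.

```python
def should_keep_searching(best_current, candidate_values, candidate_probs, search_cost):
    """One-step lookahead: refuse the current options and search once more
    only if the expected improvement exceeds the cost of searching.

    All inputs are hypothetical illustration values, not data from the study.
    """
    # Expected gain counts only candidates that beat the best option in hand.
    expected_gain = sum(
        prob * max(value - best_current, 0.0)
        for value, prob in zip(candidate_values, candidate_probs)
    )
    return expected_gain > search_cost


# Hypothetical case: the best available option is worth 6; one more search
# yields an option worth 4, 7, or 9 with probabilities .5, .3, and .2.
cheap_search = should_keep_searching(6.0, [4.0, 7.0, 9.0], [0.5, 0.3, 0.2], 0.5)
costly_search = should_keep_searching(6.0, [4.0, 7.0, 9.0], [0.5, 0.3, 0.2], 1.0)
```

Under these assumed numbers the expected improvement is about 0.9, so refusal is worthwhile when searching costs 0.5 but not when it costs 1.0, which mirrors the point that there is a threshold after which further search no longer pays off.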

1 It should be noted, though, that downsides of preference diversity include slower decision making, more conflict, and more stress among group members.

Decision Refusal and Types of Preference Diversity

To make a choice when initial preferences within a group are diverse, group members must negotiate an alternative that is acceptable to all (or most) group members. This implies that some of the group members must concede and accept that a nonpreferred alternative is chosen. However, when group members do not concede, the group may resort to the refusal option as an easy way out, even when alternatives are adequate. Indeed, some evidence indicates that preference diversity can lead to no decision being made when group members fail to concede. Kerr and MacCoun (1985) studied jury decision making and found that public (as opposed to private) polling led to more hung juries, but only for larger juries and for close cases. The authors argued that public polling increased the commitment of jury members to their original verdict, making it less likely that they would give in to reach the required unanimity. Because opposing preferences are more likely in larger juries and with closer cases, these juries were more often hung.

Preference diversity (or opposing factions) will not always lead to decision refusal. This depends on the type of preference diversity. A preference for a certain alternative implies that an alternative is liked more than other alternatives. This might be because that alternative is thought to be attractive, because the other alternatives are seen as relatively unattractive, or both. On this basis, we distinguish between three types of preference diversity. First, preference diversity may be based on attractions (i.e., different group members like different alternatives), in such a way that different alternatives have different proponents. Second, preference diversity may be based on aversions (i.e., different members dislike different alternatives), in such a way that different alternatives have different adversaries. Third, preference diversity may be based on a combination of attractions and aversions (there are both members who like and members who dislike different alternatives), in such a way that alternatives have both proponents and adversaries.
To make a choice in the case of preference diversity, some group members must give in. When preference diversity is based on attractions, group members have to accept that their preferred alternative is not chosen (but another alternative is). When preferences are based on aversions, some group members have to accept that an initially disliked alternative is chosen. We argue that this has different effects on group processes and on decision refusal. Our argument is based on the negativity bias in judgment and decision making and on the literature about decision strategies in groups. Much research has shown that people are more influenced by negative than by positive information (the negativity bias; for reviews, see Baumeister, Bratslavsky, Finkenauer, & Vohs, 2001; Rozin & Royzman, 2001; Skowronski & Carlston, 1989). Furthermore, people are less likely to revise a negative opinion than a positive opinion. For example, Bolster and Springbett (1961) studied hiring decisions and found that when an initial judgment favored hiring, an average of 3.8 unfavorable bits of information were needed to shift the decision toward rejection. However, when an initial judgment favored rejection, an average of 8.8 pieces of positive information were needed to shift the decision toward acceptance. In addition, people are more confident in negative than in positive judgments (Hamilton & Zanna, 1972). This literature therefore suggests that group members are more likely to accept that a preferred alternative is not chosen (i.e., when preferences are based on attractions) than a disliked alternative is chosen (i.e., when preferences are based on aversions). Thus, when preferences

are (partly) based on aversions, it is more likely that groups will refuse to choose.

Types of Preference Diversity and Decision Strategies

We further argue that different types of preference diversity lead to different group decision-making strategies. The literature about group decision making suggests that two broad strategies are available for groups to make a choice (De Dreu et al., in press; Hastie, Penrod, & Pennington, 1983; Stasser & Birchmeier, 2003). Groups can take stock of existing preferences, assume that truth must lie with the majority (cf. Chaiken & Stangor, 1987), and decide accordingly (a preference-driven strategy). Alternatively, groups can exchange and integrate information about alternatives and use a more effortful and systematic information-driven strategy. Research reviewed in the earlier sections of this article suggests that preference diversity prohibits the use of a simple preference-driven strategy. This might subsequently lead groups to adopt an information-driven strategy to resolve their differences and make a choice that is supported by the available information. However, we argue that this will only happen when preference diversity is based on attractions but not when it is based (solely) on aversions.

A critical factor that determines whether groups will use an information-driven strategy is the motivation of group members to argue for and against certain alternatives instead of satisficing and choosing an alternative that appears to be good enough (see Kerr & Tindale, 2004; see also De Dreu et al., in press). When preference diversity is based on attractions, group members are motivated to (continue to) argue in favor of their preferred alternative (and against other alternatives) and to resist the rejection of this alternative. To convince others, they will have to use the positive and negative information available to them and thus adopt an information-driven strategy. We thus predict that preference diversity based (partly) on attractions will lead to unbiased and relatively long discussions.
However, when group members’ preferences are solely based on aversions against alternatives, they will not resist the rejection of their preferred alternative, because only their aversions but not their attractions are strong. When adversaries against all alternatives are present, groups will adopt what might be called an aversion-driven decision strategy: There will be frequent statements against the different options, and the discussion will be biased toward negative information about decision alternatives. Indeed, group members will argue that a certain alternative is not suitable rather than that another alternative is suitable.

Overview and Hypotheses

In a laboratory experiment, three conditions were created in which different types of preference diversity were manipulated. In the attraction condition, preference diversity was based on attractions only (i.e., group members were attracted to different alternatives); in the aversion condition, preference diversity was based on aversions only (i.e., group members had aversions against different alternatives); and in the combination condition, preference diversity was based on both attractions and aversions (group members had both different attractions and different aversions). Groups of 3 participants were asked to choose one of three alternatives, but they could also refuse all alternatives. Hypotheses can be derived about outcomes (group decision, agreement with the group decision) and about group processes (discussion time, exchange of preferences, exchange of information, and conflict).

First, we expected that refusal is least likely in the attraction condition, followed by the combination condition, followed by the aversion condition (Hypothesis 1). The negativity bias should lead to more refusal in the aversion and combination conditions than in the attraction condition, because group members are unlikely to accept that a disliked alternative is chosen. In the aversion condition, the interaction will also be aversion driven, and group members will convince each other that none of the alternatives are suitable, which should lead to a high prevalence of decision refusal. However, in both the attraction and the combination conditions, group members are motivated to defend their preferred alternative, and groups should adopt an information-driven strategy. As a consequence, some group members will concede and accept a nonpreferred alternative on the basis of the evidence provided. In the attraction condition, where no strong aversions are present, this will generally lead groups to make a choice rather than to refuse. In the combination condition, information-driven group interaction might lead some group members to accept that a disliked alternative is chosen on the basis of the evidence; however, other group members might not concede, in which case, decision refusal is the only way out.

Regarding group processes, it is first expected that discussion time will be longer in the attraction and combination conditions than in the aversion condition (Hypothesis 2A). In the former two conditions, discussions will be information driven, which takes time (cf. Brodbeck, Kerschreiter, Mojzisch, Frey, & Schulz-Hardt, 2002; Schulz-Hardt et al., 2006).
In the aversion condition, discussions should be aversion driven, which should lead to a quick refusal of all options. Second, the exchange of positive versus negative information during discussion should be different in the different conditions. Information-driven interactions in both the attraction and the combination conditions should lead to both positive and negative information being exchanged. However, aversion-driven interactions in the aversion condition should lead to more negative than positive information being exchanged. Thus, information type (positive vs. negative) should interact with condition (Hypothesis 2B). Third, differences are expected in the number of times group members state their attraction for or aversion against an option. In the attraction condition, statements of attraction should be made more often than statements of aversion. In the aversion condition, this should be reversed. In the combination condition, there should be many statements of both attraction and aversion. The result should be an interaction between statement (attraction vs. aversion) and condition (Hypothesis 2C). Fourth, because of preference diversity, there should be some degree of conflict in all conditions. However, participants should experience the most conflict in the combination condition, because in that condition, proponents will face adversaries (Hypothesis 2D). Agreement with the group decision is expected to be lower in the combination condition than in the other two conditions (Hypothesis 3). In the combination condition, often a preferred alternative will be rejected (both when the group takes the refusal option and when a nonpreferred alternative is chosen) and a nonpreferred alternative may be chosen (i.e., an alternative one


initially opposed). In the attraction condition, some group members will accept a nonpreferred alternative because they become convinced during the discussion; in the aversion condition, group members will agree that all alternatives should be refused.

Method

Participants, Task, and Design

Participants were 135 students of the University of Amsterdam (31 men and 104 women), who participated in 45 three-person groups (15 groups per condition). They either received course credit or were paid €7 for their participation. The average age of the participants was 20.89 years old (SD = 3.10 years). Participants were told that they were members of a three-person group that had to choose among three candidates who had applied for a teaching position at the department of psychology. They had to decide whom to hire but could also decide not to hire any of the candidates (representing the refusal option).

Before the discussion, participants individually received information about the three candidates. Whereas the group as a whole always received complete information about all candidates, elements of the complete set of information were distributed among group members in different ways. Some of the information was given to only 1 group member (making it completely unshared), whereas other information was given to 2 group members (making it partially shared). In previous research, information distribution was used to create a hidden profile (e.g., Stasser & Titus, 1985, 1987). In contrast, we did not create a hidden profile but used information distribution to create attractions for, aversions against, or neutral opinions about particular candidates.

Three conditions were used, and these conditions differed only in which information was completely unshared and which was partially shared. In the attraction condition, the information given to group members was biased in such a way that each member was attracted to a different candidate (i.e., each candidate had a different proponent) but was neutral toward the two other candidates.
In the aversion condition, information was biased in such a way that each group member had an aversion against a different candidate (i.e., each candidate had an adversary) but was neutral toward the two other candidates. In the combination condition, information was biased in such a way that group members were attracted to one candidate but had an aversion against another candidate (i.e., for each candidate there was a proponent and an adversary) and were neutral toward the third candidate.

Materials and Pretests

To be able to manipulate attractions and aversions, we created candidate profiles using positive and negative information. Eight attributes were used for each candidate (24 attributes in total), 4 of which were positive and 4 of which were negative. For the complete candidate profiles (i.e., all 8 pieces of information per candidate), we had two requirements. First, we wanted no candidate to be clearly better than another candidate, because we wanted to avoid having groups easily converge on a superior candidate. Second, the candidates had to be adequate: When full information about the candidates is available, participants should choose one of

Table 1
Candidate Profiles and Information Distribution in the Three Conditions

                                                                Information distribution
Candidate  Complete profile                 Group member  Attraction          Aversion            Combination
A          1+, 2−, 3−, 4+, 5+, 6+, 7−, 8−   1             1+, 4+, 5+, 6+      2−, 3−, 7−, 8−      2−, 3−*, 7−, 8−*
                                            2             1+, 2−*, 4+, 7−*    1+*, 2−, 4+*, 7−    1+, 2−, 4+, 7−
                                            3             3−*, 5+, 6+, 8−*    3−, 5+*, 6+*, 8−    1+, 4+, 5+*, 6+*
B          1−, 2+, 3−, 4+, 5−, 6+, 7−, 8+   1             3−*, 5−*, 6+, 8+    3−, 5−, 6+*, 8+*    2+, 4+, 6+*, 8+*
                                            2             2+, 4+, 6+, 8+      1−, 3−, 5−, 7−      1−, 3−*, 5−*, 7−
                                            3             1−*, 2+, 4+, 7−*    1−, 2+*, 4+*, 7−    1−, 2+, 4+, 7−
C          1+, 2−, 3−, 4−, 5−, 6+, 7+, 8+   1             1+, 2−*, 4−*, 7+    1+*, 2−, 4−, 7+*    1+, 2−, 4−, 7+
                                            2             3−*, 5−*, 6+, 8+    3−, 5−, 6+*, 8+*    1+, 6+*, 7+, 8+*
                                            3             1+, 6+, 7+, 8+      2−, 3−, 4−, 5−      2−, 3−*, 4−, 5−*

Note. 1 = clarity; 2 = enthusiasm; 3 = involved; 4 = patience; 5 = keeping order; 6 = conscientiousness; 7 = teaching method; 8 = experience. Positive information is indicated with a +, negative with a −. For example, 1+ means that the candidate scores high on the attribute clarity. Unshared information is marked with an asterisk (bold in the original); other information is partially shared.

them and not refuse all candidates. Two pretests were conducted to establish these requirements.

In a first pretest, the eight candidate attributes were rated for importance (on a scale ranging from 1 = not important to 7 = very important) in a sample of 40 people, taken from the same population as the participants in the main study. The attributes were (with average importance ratings in parentheses) clarity when explaining (6.14), enthusiasm for teaching (5.90), involved with students (5.87), patience (5.46), ability to keep order (5.05), conscientiousness (4.97), use of different teaching methods (e.g., discussion, video; 4.40), and teaching experience (3.84). On the basis of these attribute ratings, we created three candidate profiles, in which negative attributes (e.g., is not very conscientious) and positive attributes (e.g., has teaching experience) were matched for importance, in such a way that no candidate was clearly better than another candidate (see Table 1).

In a second pretest, the resulting profiles were given to another sample of 40 participants, drawn from the same population as the participants in the main study. These participants were given the same instruction as was given to those in the main study, and they received complete information about all candidates. Next, they were asked to indicate their choice (i.e., choose Candidate A, B, or C or reject all candidates). Seven participants chose to refuse all candidates (17.5%), but a large majority (82.5%) found at least one candidate good enough (35% chose A, 20% chose B, and 27.5% chose C). Thus, on the whole, the candidates were perceived to be adequate.

To create attractions, we gave some participants only the 4 positive candidate attributes of a particular candidate. To create aversions, we gave some participants only the 4 negative attributes of a candidate. To create a neutral opinion, we gave participants 2 positive and 2 negative attributes of a candidate.
Thus, each participant only had half the information available before the discussion (12 attributes, 4 for each candidate). The remaining information was always available to other group members. Further, some of the information was given to only 1 group member (making it completely unshared), whereas other information was given to 2 group members (making it partially shared). In the attraction condition, all positive information was partially shared and all negative information was unshared. In the aversion condition, all negative information was partially shared and all positive information was unshared. In the combination condition, half of the positive and negative information was partially shared and the remaining half was unshared. The last three columns of Table 1 show the exact information distribution in the three conditions.²
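The sharing rule for the attraction condition can be checked mechanically against Table 1. The following is a minimal sketch, not part of the original study materials: the dictionary `ATTRACTION` is transcribed from the Attraction column of Table 1 (with an ASCII `-` standing in for the minus sign), and the helper names are hypothetical. It counts how many group members hold each item and how many items each member receives.

```python
from collections import Counter

# Attraction-condition cells of Table 1, keyed by (candidate, group member).
# Transcribed by hand; "-" replaces the typeset minus sign.
ATTRACTION = {
    ("A", 1): ["1+", "4+", "5+", "6+"],
    ("A", 2): ["1+", "2-", "4+", "7-"],
    ("A", 3): ["3-", "5+", "6+", "8-"],
    ("B", 1): ["3-", "5-", "6+", "8+"],
    ("B", 2): ["2+", "4+", "6+", "8+"],
    ("B", 3): ["1-", "2+", "4+", "7-"],
    ("C", 1): ["1+", "2-", "4-", "7+"],
    ("C", 2): ["3-", "5-", "6+", "8+"],
    ("C", 3): ["1+", "6+", "7+", "8+"],
}


def holders_per_item(distribution):
    """Count how many group members received each (candidate, item) pair."""
    counts = Counter()
    for (candidate, _member), items in distribution.items():
        for item in items:
            counts[(candidate, item)] += 1
    return counts


def items_per_member(distribution):
    """Count how many items each group member received across candidates."""
    totals = Counter()
    for (_candidate, member), items in distribution.items():
        totals[member] += len(items)
    return totals


counts = holders_per_item(ATTRACTION)
# Distinct holder counts observed for positive and for negative items.
positive_holders = {n for (_, item), n in counts.items() if item.endswith("+")}
negative_holders = {n for (_, item), n in counts.items() if item.endswith("-")}
member_totals = items_per_member(ATTRACTION)

print("distinct items held by the group:", len(counts))
print("holders per positive item:", positive_holders)
print("holders per negative item:", negative_holders)
print("items per member:", dict(member_totals))
```

Under this transcription, every positive item is held by 2 members and every negative item by 1 member, each member holds 12 items, and the group as a whole holds all 24. The same check could be applied to the other two columns of Table 1.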

Procedure

All instructions were provided on paper. Participants were first seated individually and read a general introduction. They had to imagine that they were members of a three-person selection committee that had to decide whom to hire for a teaching position at the department of psychology. They read that there were three candidates who had all finished their master’s degree at a Dutch university (this would formally make them suitable for a teaching position). They had to decide, together with 2 other participants, whom to hire. However, they read that it was also possible not to hire anybody but to look for further candidates. It was emphasized that they should only do this when they really thought that none of the present candidates were suitable for the position. They further read that, in real life, looking for further candidates is expensive and time consuming. They were thus instructed to hire a candidate when they thought he or she was suitable and only reject all candidates when they really thought that none of them were suitable for the job.

After the general instruction, participants individually received the candidate profiles. They had 6 min to study the profiles (this was enough for all participants) and were told that they could not refer to these profiles during the later group discussion. They further read that it was possible that other group members had information different from their own. After 6 min, the experimenter collected the profiles and distributed a prediscussion questionnaire. Participants were asked what their individual decision at that moment would be (hire Candidate A, B, or C or not hire any candidate), and they were asked to rate their certainty about that decision as well as each candidate’s attractiveness.

Next, they received a follow-up instruction: They were told that they would participate in a group discussion and that they should jointly decide whom to hire. No specific instructions were given about decision rules (e.g., unanimity or majority). It was repeated that they should hire a candidate when they thought that candidate was suitable, but they could also decide not to hire any of the candidates but look for further candidates instead. Again it was emphasized that this would cost additional time and money. They were next seated in one room and were provided with a group decision sheet on which they could indicate their decision. All group discussions were videotaped. When the group had reached a decision, the participants were separated and individually filled out a postexperimental questionnaire. After that, they were debriefed, paid, and dismissed.

² By itself, this may lead to a bias in the information discussed. Previous research has shown that the more group members initially hold a piece of information, the more likely it is to be mentioned during discussion (e.g., Stasser & Titus, 1985, 1987). This might lead to more mentioning of positive than of negative information in the attraction condition (because positive information was partially shared and negative information completely unshared) and to more mentioning of negative than of positive information in the aversion condition (because negative information was partially shared and positive information completely unshared). Note, however, that we predicted biased information sharing only in the aversion condition, but (because of an information-driven decision strategy) no bias in the attraction condition.

Dependent Variables

The main dependent variable was group choice, coded dichotomously as choice or refusal. We also measured discussion time. Further, an observer blind to conditions and hypotheses coded every video for information exchange and exchange of attractions and aversions. For every group, it was established which of the total number of 24 pieces of information (8 per candidate) were mentioned accurately and at least once during discussion. Occasionally, inaccurate information was mentioned (e.g., an attribute was ascribed to the wrong candidate), but we only analyzed accurately exchanged information. For each candidate, the observer also counted instances in which participants indicated an attraction for or an aversion to that candidate, resulting in six frequencies for each group (e.g., the number of times an attraction or aversion was mentioned for each candidate). A second observer coded seven video recordings to establish interobserver agreement. Regarding information exchange, the two observers agreed in 93.2% of the cases whether or not a particular piece of information was mentioned, and Cohen’s kappa was .86 (p < .001). Regarding the exchange of preferences, the intraclass correlation was computed to establish interobserver reliability (two-way random model, consistency definition) across the 42 frequencies (7 groups × 6 frequencies), which was .75 (p < .001). Given these high levels of reliability, the scores of the first observer were used.

The prediscussion questionnaire asked for prediscussion preferences as well as for ratings of each candidate on three items. The three candidates were each rated on three 7-point scales (ranging from 1 = completely disagree to 7 = completely agree) to assess suitability for the job (e.g., “Candidate A would be a good teacher”). The three items were averaged per candidate (for all candidates, Cronbach’s α = .96). Participants were also asked to rate the degree to which they were certain about their initial preference (1 item, using a 7-point scale ranging from 1 = very uncertain to 7 = very certain).

These candidate ratings were again given in the postdiscussion questionnaire (Cronbach’s α = .94, .96, and .95 for Candidates A, B, and C, respectively). Further, participants were asked to give their private postdiscussion preference. Agreement with the group decision was measured with two items (e.g., “I agree with the group decision”); these were averaged into a measure of agreement (α = .85). Further, task conflict was assessed with three items (e.g., “there were differences of opinion within the group”; α = .82). One item was used to assess the degree to which the discussion had focused on negative information (“during the discussion it became increasingly clear that the candidates had more negative than positive characteristics”).
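To make the interobserver agreement index concrete, the sketch below computes Cohen's kappa for two observers who each code whether a piece of information was mentioned (1) or not (0). It is an illustration only: the function name and the two short code lists are hypothetical toy data, not the observers' actual codings, which are not reported in the article.

```python
def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two equally long sequences of categorical codes."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must code the same, non-empty set of cases")
    n = len(rater_a)
    # Proportion of cases on which the two raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal proportions.
    categories = set(rater_a) | set(rater_b)
    expected = sum(
        (rater_a.count(c) / n) * (rater_b.count(c) / n) for c in categories
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement equals 1")
    return (observed - expected) / (1.0 - expected)


# Hypothetical codes: 1 = item mentioned, 0 = item not mentioned.
observer_1 = [1, 1, 1, 0, 0, 0, 1, 0]
observer_2 = [1, 1, 0, 0, 0, 0, 1, 1]

kappa = cohens_kappa(observer_1, observer_2)
print(f"kappa = {kappa:.2f}")
```

In this toy example the observers agree on 6 of 8 cases (75%), chance agreement is 50%, and kappa is therefore 0.50. Kappa is lower than raw agreement because it discounts the agreement expected by chance.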

Data Analysis

Prediscussion questionnaire items were analyzed at the individual level, because these can be considered independent observations. However, after the group discussion, group members will have influenced each other, and the ratings on the postdiscussion questionnaire will therefore not be independent. These ratings were therefore aggregated to the group level, by taking group averages. Specific, directional hypotheses were tested using planned comparisons.

Results

Manipulation Checks (Prediscussion Questionnaire)

Per group, the number of group members initially (before discussion) favoring refusal was computed (this measure could vary from 0 to 3). The analysis of variance (ANOVA) on this measure was significant, F(2, 42) = 10.27, p < .001, η² = .33. A Tukey post hoc test showed that the number of group members favoring refusal was significantly higher in the aversion condition (M = 0.67; 10 out of 45 participants in that condition) than in the attraction condition (M = 0.00) or in the combination condition (M = 0.07, 1 out of 45 participants), both ps < .01. The latter two conditions did not differ (p > .90).

In the attraction condition and the combination condition, every group member should prefer a different candidate (cf. Table 1). In the attraction condition, this was the case for all 15 groups (100%). In the aversion condition, where this should not necessarily be the case, it was still true for 5 out of 15 groups (33%). In 8 groups, 1 (6 groups) or 2 members (2 groups) favored no choice (53%). In the remaining 2 groups, an initial majority was in favor of a particular candidate (13%). In the combination condition, 9 groups were completely heterogeneous in preferences (60%). However, 5 groups had an initial majority for a candidate (33%). In the remaining group, 1 group member favored refusal (7%).

A 3 (condition) × 3 (candidate) × 3 (group member) mixed model ANOVA was performed on prediscussion candidate ratings, with candidate as a within-participants variable. This yielded the expected three-way interaction, F(8, 250) = 140.80, p < .001, η² = .82 (see Table 2). Before discussion, group members tended to like and dislike the candidates they should have liked or disliked given the information they had received. For example, Group Member 1 in the attraction condition should like Candidate A and be neutral toward Candidates B and C; Group Member 2 should like Candidate B and be neutral toward Candidates A and C; and

Table 2
Ratings of Candidates A, B, and C Before and After Discussion

                    Attraction              Aversion                Combination
Measure             GM1    GM2    GM3       GM1    GM2    GM3       GM1    GM2    GM3
Pre A     M         6.02   3.49   2.51      1.58   4.40   3.78      1.82   3.71   5.71
          SD        0.70   0.92   0.64      0.57   1.03   0.97      0.91   0.93   0.49
Pre B     M         3.71   5.87   3.76      3.56   1.76   4.22      5.60   1.58   4.76
          SD        0.90   0.47   1.10      1.10   0.67   1.17      0.88   0.73   0.96
Pre C     M         3.78   3.27   6.04      4.20   3.69   1.44      3.93   6.26   1.87
          SD        0.84   1.16   0.50      0.94   1.13   0.54      1.03   0.69   0.71
Post A    M         4.67   3.67   3.38      2.21   3.36   3.17      2.71   3.80   4.44
          SD        1.20   1.21   1.28      1.24   1.38   1.17      1.42   1.47   1.29
Post B    M         3.78   4.47   3.87      3.33   1.95   3.21      4.24   2.93   4.31
          SD        1.40   1.44   1.80      1.20   0.89   1.24      1.35   1.45   1.27
Post C    M         3.73   3.36   4.36      3.38   2.83   2.40      3.40   4.29   2.67
          SD        1.46   1.48   1.20      1.12   1.02   1.09      1.33   1.50   1.10

Note. Bold figures indicate candidate ratings that should be high, italic figures indicate candidate ratings that should be low. GM = Group Member; Pre = prediscussion questionnaire; Post = postdiscussion questionnaire.

Group Member 3 should like Candidate C and be neutral toward Candidates A and B. In general, this was the case. Further, the analysis showed a main effect of condition, F(2, 126) = 40.36, p < .001, η² = .39. As expected, across candidates, participants were most positive in the attraction condition (M = 4.27), followed by the combination condition (M = 3.92) and the aversion condition (M = 3.18). All conditions differed from each other according to a Tukey post hoc test (all ps < .025).³

Participants were also asked to rate how certain they were about their choice. An ANOVA on this measure revealed an effect of condition, F(2, 123) = 14.27, p < .001, η² = .19. A Tukey post hoc test showed that participants in the aversion condition (M = 4.58) were less certain than the others (M attraction = 5.57, M combination = 5.68, both ps < .001). Given that they were not induced to have a strong preference, this result, as well as the other results, confirms that the manipulation has been successful.
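The condition means for the prediscussion ratings can be recovered, to two decimals, by averaging the nine prediscussion cell means per condition in Table 2. The sketch below is illustrative only: `PRE_MEANS` and the helper names are hypothetical, the values are transcribed from the Pre A, Pre B, and Pre C rows of Table 2, and averaging cell means reproduces the grand means only under the assumption of (nearly) equal cell sizes.

```python
from statistics import mean

# Prediscussion cell means from Table 2.
# Outer key: condition; inner key: candidate; list order: GM1, GM2, GM3.
PRE_MEANS = {
    "attraction": {
        "A": [6.02, 3.49, 2.51],
        "B": [3.71, 5.87, 3.76],
        "C": [3.78, 3.27, 6.04],
    },
    "aversion": {
        "A": [1.58, 4.40, 3.78],
        "B": [3.56, 1.76, 4.22],
        "C": [4.20, 3.69, 1.44],
    },
    "combination": {
        "A": [1.82, 3.71, 5.71],
        "B": [5.60, 1.58, 4.76],
        "C": [3.93, 6.26, 1.87],
    },
}


def condition_mean(condition):
    """Unweighted mean of all nine cell means within one condition."""
    cells = [m for ratings in PRE_MEANS[condition].values() for m in ratings]
    return mean(cells)


def candidate_mean(candidate):
    """Unweighted mean of one candidate's cell means across all conditions."""
    cells = [m for by_cand in PRE_MEANS.values() for m in by_cand[candidate]]
    return mean(cells)


condition_means = {c: round(condition_mean(c), 2) for c in PRE_MEANS}
candidate_means = {c: round(candidate_mean(c), 2) for c in ("A", "B", "C")}

print(condition_means)
print(candidate_means)
```

Under these assumptions the averages agree with the reported condition means (4.27, 3.18, and 3.92) and with the candidate means given in Footnote 3 (3.67, 3.87, and 3.83).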

Group Choice (Hypothesis 1)

Group choice was coded dichotomously as choice (Candidate A, B, or C) versus no choice (none of the candidates). A chi-square test yielded a significant effect of condition on group choice, χ²(2, N = 45) = 10.85, p = .004, Cramér’s V = .49. Three groups (20%) made no choice in the attraction condition, 12 groups (80%) did not choose in the aversion condition, and 7 groups (46.7%) made no choice in the combination condition (see Table 3 for all dependent variables). When these frequencies are compared with those obtained in the pretest, in which 40 individuals received complete information and 17.5% refused to choose, the difference is not significant for the attraction condition, χ²(1, N = 55) = 0.05, p = .83, Cramér’s V = .03. It is significant for both the aversion condition, χ²(1, N = 55) = 18.85, p < .001, Cramér’s V = .59, and the combination condition, χ²(1, N = 55) = 4.89, p = .03, Cramér’s V = .30. Thus, in both the aversion and the combination conditions, refusal was more prevalent than it was for individuals who received complete information from the start.

Because it was predicted (Hypothesis 1) that the greatest number of groups would refuse to choose in the aversion condition, followed by the combination condition, followed by the attraction condition, a contrast was specified as [1, 0, −1] for these conditions, respectively. This contrast had a significant effect on choice in a logistic regression, B = 1.39, SE = 0.46, Wald = 9.23, p = .002, Nagelkerke R² = .30. Further, to confirm that the pattern indeed is linear, we specified a quadratic contrast as [−1, 2, −1], for the attraction, combination, and aversion conditions, respectively. This contrast was not significant, B = −0.05, Wald = 0.04, p > .80, Nagelkerke R² = .00. These results confirm Hypothesis 1 and show that the effect indeed is linear.

We tested whether the number of group members that initially (i.e., before the discussion) favored refusal mediated the effect of condition. This was not the case. In a logistic regression, the effect of condition (the contrast specified above) remained significant (B = 1.06, SE = 0.50, Wald = 4.51, p = .03) when controlling for the effect of the number of members initially favoring refusal, which did not have a significant effect, B = 1.36, SE = 1.13, Wald = 1.44, p = .23. Thus, prediscussion preferences cannot (fully) explain group choice, and group processes must have affected choice.

³ A number of other findings emerged in this analysis, all involving the candidate variable. Thus, there was a candidate main effect, F(2, 125) = 3.15, p < .05, η² = .05, showing that Candidate A was rated as less suitable (M = 3.67) than Candidate B (M = 3.87) or Candidate C (M = 3.83). Further, a Candidate × Condition interaction approached significance, F(4, 250) = 2.24, p = .07, η² = .04, and there was a significant Candidate × Member interaction, F(4, 250) = 39.14, p < .001, η² = .39. However, these are not of substantive interest and are all qualified by the three-way interaction; therefore, they are not discussed further.

Table 3
Dependent Variables by Condition

Dependent variable                       Attraction     Aversion       Combination
Group decision (% no choice)             20             80             46.7
Discussion time in min                   8.83 (4.33)    6.41 (3.19)    8.92 (3.81)
Video coding
  Exchange of positive information       .76 (.27)      .57 (.22)      .79 (.13)
  Exchange of negative information       .76 (.19)      .74 (.18)      .78 (.17)
  Statement of attraction                8.60 (4.08)    5.20 (3.90)    8.00 (3.74)
  Statement of aversion                  2.33 (2.32)    6.20 (3.63)    6.07 (2.99)
Postdiscussion questionnaire
  Focus on negative information          3.53 (1.19)    5.33 (1.00)    3.71 (1.04)
  Conflict                               2.88 (0.72)    3.12 (1.25)    3.58 (0.96)
  No. group members who agree            2.80 (0.41)    2.73 (0.46)    2.33 (0.62)
  Agreement with decision                6.02 (0.54)    6.09 (0.55)    5.66 (0.70)

Note. Standard deviations are in parentheses.
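The chi-square statistics reported for group choice can be recomputed from the refusal counts alone. The sketch below is a minimal illustration with hypothetical function and variable names. It assumes an ordinary Pearson chi-square without continuity correction, which is our reading of the reported values rather than something the article states.

```python
import math


def pearson_chi_square(table):
    """Pearson chi-square and Cramer's V for a table of observed counts.

    `table` is a list of rows; no continuity correction is applied.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # Cramer's V scales chi-square by N and the smaller table dimension minus 1.
    k = min(len(table), len(table[0])) - 1
    cramers_v = math.sqrt(chi2 / (n * k))
    return chi2, cramers_v


# Rows: attraction, aversion, combination; columns: refusal, choice (15 groups each).
by_condition = [[3, 12], [12, 3], [7, 8]]
chi2_all, v_all = pearson_chi_square(by_condition)

# Aversion condition versus the second pretest (7 of 40 individuals refused).
aversion_vs_pretest = [[12, 3], [7, 33]]
chi2_av, v_av = pearson_chi_square(aversion_vs_pretest)

print(f"condition effect: chi2 = {chi2_all:.2f}, V = {v_all:.2f}")
print(f"aversion vs. pretest: chi2 = {chi2_av:.2f}, V = {v_av:.2f}")
```

With these counts the uncorrected statistic comes out at about 10.85 (V of about .49) for the three conditions and about 18.85 (V of about .59) for the aversion condition against the pretest, in line with the reported values.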

Discussion Time and Video Coding (Hypotheses 2A–2C)

Discussion time was analyzed with an ANOVA, which did not yield a significant effect of condition, F(2, 42) = 2.10, p = .14, η² = .09. However, a planned comparison of the two conditions for which information-driven interaction was expected (i.e., the attraction condition and the combination condition) with the aversion condition was significant, t(42) = 2.04, p < .05, d = 0.67. Discussion time was shorter in the aversion condition (M = 6.41 min) than in the attraction condition (M = 8.83 min) and the combination condition (M = 8.92 min). This is consistent with Hypothesis 2A.

From the video coding, the proportion of positive and negative information mentioned was computed by dividing the number of positive and negative items mentioned by the total number of positive or negative items available to the group (12, in both cases). These proportions were analyzed with a 3 (condition) × 2 (information: positive vs. negative) ANOVA, with information as a within-groups variable (Table 3). This analysis yielded an effect of information that approached significance, F(1, 42) = 3.79, p = .06, η² = .08, showing that a somewhat higher proportion of negative information (M = .76) than positive information (M = .71) was exchanged. The main effect of condition was not significant, F(2, 42) = 2.43, p = .10, η² = .10. However, there was a significant Condition × Information interaction, F(2, 42) = 4.20, p = .02, η² = .17. In the attraction condition, there was no difference in the exchange of positive and negative information, t(14) = −0.07, p = .94, d = −0.02. The same was true for the combination condition, t(14) = 0.31, p = .76, d = 0.07. In the aversion condition, more negative than positive information was exchanged, t(14) = −5.03, p < .001, d = −0.87. This pattern of effects is consistent with Hypothesis 2B and supports the idea that group discussions in the attraction and combination conditions are information driven and unbiased but are aversion driven in the aversion condition.

The number of times a statement of attraction for or aversion against a candidate was made was analyzed with a 3 (condition) × 2 (statement: attraction or aversion) ANOVA, with statement as a within-groups factor (i.e., statements were summed across candidates). This yielded a main effect of statement, F(1, 42) = 18.27, p < .001, η² = .30, showing that on average, more statements of attraction were made (M = 7.27) than statements of aversion (M = 4.87). This main effect was qualified by a significant interaction, F(2, 42) = 14.13, p < .001, η² = .40 (see Table 3). In the attraction condition, more statements of attraction than statements of aversion were made, t(14) = 7.16, p < .001, d = 1.89. In the aversion condition, the difference was reversed, but this was not significant, t(14) = −1.03, p = .32, d = −0.27. In the combination condition, somewhat more statements of attraction than statements of aversion were made, although this failed to reach significance, t(14) = 1.82, p = .09, d = 0.57. This pattern of findings is largely consistent with Hypothesis 2C, although we had also predicted that in the aversion condition, more statements of aversion than statements of attraction would be made.
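The η² values reported with the F tests in this section are consistent with partial eta squared recovered from the F ratio and its degrees of freedom, η² = (F × df_effect) / (F × df_effect + df_error). The article does not state which formula was used, so this is an assumption; the function name and the labels below are illustrative only. A minimal sketch in Python that recomputes the values from the reported statistics:

```python
def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Recover partial eta squared from an F ratio and its degrees of freedom.

    Assumption: the reported effect sizes are partial eta squared; the
    article itself does not say which variant was computed.
    """
    numerator = f_value * df_effect
    return numerator / (numerator + df_error)


# (label, F, df_effect, df_error, eta squared as reported in the text)
reported_tests = [
    ("discussion time: condition", 2.10, 2, 42, 0.09),
    ("information exchanged: information", 3.79, 1, 42, 0.08),
    ("information exchanged: condition", 2.43, 2, 42, 0.10),
    ("information exchanged: interaction", 4.20, 2, 42, 0.17),
    ("statements: statement", 18.27, 1, 42, 0.30),
    ("statements: interaction", 14.13, 2, 42, 0.40),
]

for label, f_value, df_effect, df_error, reported in reported_tests:
    recomputed = partial_eta_squared(f_value, df_effect, df_error)
    print(f"{label}: recomputed {recomputed:.2f}, reported {reported:.2f}")
```

Because the inputs are F values already rounded to two decimals, a recomputed value can occasionally differ from a reported one in the last digit.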

Postdiscussion Questionnaire: Conflict and Agreement (Hypotheses 2D and 3)

An ANOVA on ratings of task conflict in the postdiscussion questionnaire did not show a significant effect of condition, F(2, 42) = 1.95, p = .16, η² = .09 (M attraction = 2.88, M aversion = 3.12, M combination = 3.58). However, to test the directional Hypothesis 2D, that more conflict would be experienced in the combination condition than in the other conditions, we performed a planned comparison. This comparison was significant, t(42) = 1.86, p < .05 (one-tailed), d = 0.61, confirming Hypothesis 2D.

For each group, we counted how many group members after the discussion chose the same option as the group had done (i.e., scores could range from 0 to 3). An ANOVA on this measure of agreement with the group decision was significant, F(2, 42) = 3.76, p = .03, η² = .15. To test Hypothesis 3, we computed a planned contrast, comparing the combination condition (M = 2.33) with the other conditions (M attraction = 2.80, M aversion = 2.73), and this contrast was significant, t(42) = 2.72, p = .009, d = 0.89. Further, postdiscussion ratings of agreement with the decision did not differ across conditions according to an ANOVA, F(2, 42) = 2.24, p = .12, η² = .10 (M attraction = 6.02, M aversion = 6.09, M combination = 5.66). However, a planned comparison between the combination condition on the one side and the other two conditions on the other was significant, t(42) = 2.09, p < .05, d = 0.68. Thus, agreement with the decision was lower in the combination condition than in the other conditions, and Hypothesis 3 was confirmed.
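The planned comparisons above each set the combination condition against the other two conditions. One common way to express such a comparison is as a contrast with weights that sum to zero. The weights used here (0.5, 0.5, −1, or the reverse) are an assumption, since the article does not list them, and the function name is hypothetical. The sketch below computes only the contrast estimates from the reported means; the t statistics cannot be reproduced because the error term is not reported.

```python
def contrast_estimate(means, weights):
    """Weighted sum of condition means for a planned contrast.

    The weights must sum to zero so that the estimate is 0 when all
    condition means are equal.
    """
    if len(means) != len(weights):
        raise ValueError("means and weights must have the same length")
    if abs(sum(weights)) > 1e-9:
        raise ValueError("contrast weights must sum to zero")
    return sum(w * m for w, m in zip(weights, means))


# Condition order: attraction, aversion, combination (means from the text).
others_minus_combination = [0.5, 0.5, -1.0]   # assumed weights
combination_minus_others = [-0.5, -0.5, 1.0]  # assumed weights

agreement_count = contrast_estimate([2.80, 2.73, 2.33], others_minus_combination)
agreement_rating = contrast_estimate([6.02, 6.09, 5.66], others_minus_combination)
task_conflict = contrast_estimate([2.88, 3.12, 3.58], combination_minus_others)

print(f"agreement (count): {agreement_count:.3f}")
print(f"agreement (rating): {agreement_rating:.3f}")
print(f"task conflict: {task_conflict:.3f}")
```

A positive estimate in each case matches the direction reported in the text: lower agreement and higher task conflict in the combination condition.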

Postdiscussion Questionnaire: Additional Results

In the postexperimental questionnaire, one item asked whether more negative than positive information came up during discussion, and an ANOVA on this item yielded a significant effect of condition, F(2, 42) = 12.65, p < .001, η² = .38. A Tukey post hoc test showed that the aversion condition (M = 5.33) differed significantly from both the attraction condition (M = 3.53, p < .001) and the combination condition (M = 3.71, p = .001), whereas the latter two did not differ (p = .90). This finding is consistent with the video coding data, which also showed that in the aversion condition, more negative than positive information was exchanged. It is also consistent with the reasoning that interactions in the aversion condition are aversion driven, which makes group members converge quickly on the refusal option.

Finally, postdiscussion candidate ratings were analyzed at the group level in a 3 (condition) × 3 (group member) × 3 (candidate) mixed model ANOVA, with the latter two variables as within-group variables. As was the case with prediscussion ratings, the three-way interaction among these variables was significant, F(8, 76) = 8.66, p < .001, η² = .48. As can be seen in Table 2, the pattern of means remained essentially the same from before to after the discussion, only less extreme. There also was a main effect of condition, F(2, 41) = 8.95, p = .001, η² = .30. A post hoc Tukey test (p < .025) showed that, across candidates, the ratings were lower in the aversion condition (M = 2.87) than in the attraction condition (M = 3.92) and the combination condition (M = 3.64), but the latter two conditions did not differ reliably.4

4 As was the case with prediscussion ratings, there also was a Candidate × Member interaction, F(4, 38) = 3.66, p = .01, η² = .28. However, this is not of substantive interest and the effect is qualified by the three-way interaction.

Discussion

It is widely believed that preference diversity can stimulate high-quality group decisions, because it prevents preference-driven decision strategies (i.e., a simple pooling of initial preferences) and stimulates information-driven strategies instead (i.e., systematic and unbiased discussion and integration of information). However, this literature has neglected the fact that in many situations, groups can also refuse to choose. Further, preference diversity may be based on attractions, aversions, or both. We argued that information-driven strategies are only adopted when preferences are based on attractions but not when they are based on aversions. When preferences are based on aversions, groups may even adopt an aversion-driven strategy, with frequent mentioning of aversions and a bias toward exchanging negative information. In addition, on the basis of the negativity bias in judgment and decision making, we argued that aversions are harder to overcome than attractions and that group members more easily accept that a preferred alternative is not chosen than that a disliked alternative is chosen. Aversion-driven strategies and the negativity bias might lead groups to refuse all available alternatives, even though they are adequate.

These ideas were tested in a laboratory experiment in which different types of preference diversity were created. It was systematically varied whether group members held different attractions (attraction condition), different aversions (aversion condition), or both (combination condition) toward different job candidates.

In the attraction condition, in which each group member was attracted to a different alternative and was relatively neutral toward the other alternatives, groups were generally able to reach a decision. They had fairly long discussions, which were unbiased (i.e., both positive information and negative information were discussed to the same degree). In this condition, group members more often expressed attractions than aversions and reported having some conflict. In the end, 80% reached a decision, and group members generally agreed with the decision made by the group. All in all, the pattern is indicative of an information-driven strategy, in which the group decision reflected an integration of relevant information and in which some group members accepted an initially nonpreferred candidate on the basis of the evidence and agreed with the group decision.

In the aversion condition, in which group members were led to dislike different alternatives, the picture is different. Here, most groups (80%) decided to refuse all available alternatives and to invest more resources to look for other alternatives. Discussions were shorter than in the other conditions, aversions were frequently mentioned, and discussions were biased toward negative information (i.e., more negative than positive information was mentioned). Groups seemed to have relatively quickly converged on refusal, and group members usually agreed with that decision. The pattern of results is indicative of aversion-driven group interaction.

In the combination condition, in which different group members held both different attractions and different aversions, the picture is different still. About half of the groups chose an alternative, and the other half decided to refuse all alternatives. Discussions were fairly long and unbiased (i.e., as much positive as negative information came up). Further, both attractions and aversions were frequently mentioned. Agreement with the decision, however, was lowest in this condition. Thus there is evidence for information-driven interaction (long and unbiased discussions), but group members often refused to accept a disliked alternative, leading to lower levels of satisfaction and higher levels of decision refusal.

Our conclusion is threefold. First, preference diversity does not necessarily stimulate information-driven decision processes. When it is also possible not to choose but refuse to make a decision, only preference diversity based on attractions stimulates an unbiased consideration of alternatives. Second, because aversions are harder to change than attractions, decision refusal is more likely when there are adversaries against the decision options. Third, preference diversity based on aversions alone might lead to a biased group decision strategy: one that is based on negative information and the expression of aversions, leading to quick refusal decisions (i.e., an aversion-driven strategy).

One question that has to be addressed is whether decision refusal in the present experiment was a bad decision. Indeed, one could argue that none of the candidates were very good: All had as many negative as positive attributes, and they should perhaps be refused. Speaking against this are the findings of our pretest, showing that individuals who were given full information about the candidates in more than 80% of the cases did make a choice (and that this percentage was higher than in both the aversion and the combination conditions). However, one still could argue that groups with full information would have made a different choice, because they are better at identifying bad alternatives. We believe that is possible, but unlikely. In another study, Nijstad (2006) directly studied a situation where group members received full information from the start, and this information also contained as many positive as negative attributes for each candidate. Groups refused all candidates in only 20% of the cases, not dissimilar to what we found in our pretest.

Implications

We believe that the group-decision-making literature has generally neglected two issues that were central to the present article. The first issue is that preferences are not based solely on attractions but also on aversions. We showed that these might have quite different effects on group interaction processes and group decisions. It might therefore be useful to not only assess preferences but more specifically assess or manipulate attractions and aversions. Indeed, a preference can be weak and based on aversions against other alternatives rather than on attraction toward the preferred alternative. It can also be strong, when it is based on attraction toward one alternative as well as on aversions against the others. Potential effects of aversions versus attractions are biases in the group discussion, failure to reach a decision, and lower levels of outcome satisfaction.

The second issue is that cases of group decision refusal and other forms of decision avoidance deserve more attention. As we argued, for decision makers (individuals as well as groups), it is often impossible to identify, discuss, and evaluate all potential alternatives. It will therefore often be possible to decide to identify more options, and the conditions under which decision refusal occurs deserve more attention. Further, in this experiment, it was possible to either choose or refuse an option. In reality, there are different ways in which a group might avoid making decisions. For example, it often is possible to delay decisions when group members feel that they do not have sufficient information on which to base their choice. Common experience (e.g., in politics) suggests that often committees are called on to do further investigations before the group reaches a decision. Sometimes it might even be possible to shift responsibility for a decision to someone else, such as higher management, in cases where groups reach a stalemate. These all are interesting possibilities for further investigation.

Limitations and Future Directions

A potential limitation of the present study is that the decision to refuse all candidates in our setup was not very costly. Although we emphasized that in the real world, refusing all candidates is costly and time consuming, our participants experienced no real costs. It might be the case that no decision refusal would take place when real costs (e.g., extra time in the laboratory) are present, and this clearly is an issue for future research. We emphasize, however, that making a choice similarly involved no real costs or benefits. Our finding that not all groups quickly rejected all candidates (and that refusal rates were different in the different conditions) indicates that our participants were seriously trying to arrive at the best decision.

Second, in this experiment, rather extreme situations were created, in which all alternatives had proponents and/or adversaries. Although these situations were useful in testing our hypotheses, one may wonder whether this is realistic and how often this actually happens. However, what might quite often happen in real-life situations is that there are one or more adversaries to an alternative that other group members prefer. These adversaries will generally resist being persuaded, and will be unhappy if the group decides to choose their disliked alternative anyway. In future studies, researchers might create such conditions, for example, varying the number of proponents for and adversaries against a particular focal option.

It might also be relatively common for different factions to be in conflict. The literature about bargaining and negotiation suggests that conflicting preferences (e.g., for outcome allocations) can indeed lead to stalemates and impasses (see, e.g., O'Connor & Arnold, 2001). However, in this literature, researchers usually study situations in which individual interests rather than mere preferences conflict. We have also shown that when there are no conflicting interests (i.e., our group members had nothing to gain when their preferred alternative was chosen), preference diversity may lead to stalemates.

Another issue that might be addressed in future work is the effect of leadership. It might be expected that good leaders will help groups overcome their differences of opinion. In past work, it has been shown that leaders can function as information managers in groups (Larson, Christensen, Abbott, & Franz, 1996). Leaders may prevent a biased exchange of information and prevent a (premature) consensus on decision refusal. Leaders can also stimulate more effective conflict handling by emphasizing the importance of information rather than preferences (whether they are attractions or aversions). When less emphasis is put on preferences, group members might more easily concede and even accept initially disliked alternatives on the basis of the evidence that is provided during group discussion.

Although leadership might reduce certain effects, time pressure may exacerbate them. According to Karau and Kelly's (1992) attentional focus model, groups under time pressure focus on only a restricted range of task-relevant cues. Indeed, research has shown that group members filter out certain information during discussion when under time pressure (Kelly & Loving, 2004). When group members mainly focus on negative information, as they did in the aversion condition of the present study, this may lead to an even higher prevalence of choice refusal. The effects of leadership and time pressure thus seem promising avenues for future research.

Conclusion

In this study, we have shown that preference diversity in groups does not necessarily lead to constructive group processes, a more even-handed consideration of information, and unbiased discussions. Instead, when preference diversity is based (partly) on aversions, it may lead to a negative bias in information sharing, stalemates, and decision refusal. Groups may even refuse to choose among available options when the options on the whole are adequate, at the cost of lower outcome satisfaction, simply because decision refusal is an easy way out.

References

Baumeister, R. F., Bratslavsky, E., Finkenauer, C., & Vohs, K. D. (2001). Bad is stronger than good. Review of General Psychology, 5, 323–370.
Bolster, B. I., & Springbett, B. M. (1961). The reaction of interviewers to favorable and unfavorable information. Journal of Applied Psychology, 45, 97–103.
Brodbeck, F. C., Kerschreiter, R., Mojzisch, A., Frey, D., & Schulz-Hardt, S. (2002). The dissemination of critical unshared information in decision-making groups: The effect of pre-discussion dissent. European Journal of Social Psychology, 32, 35–56.
Chaiken, S., & Stangor, C. (1987). Attitudes and attitude change. Annual Review of Psychology, 38, 575–630.
Corbin, R. M. (1980). Decisions that might not get made. In T. S. Wallsten (Ed.), Cognitive processes in choice and decision behavior (pp. 47–67). Hillsdale, NJ: Erlbaum.
Davis, J. H. (1973). Group decisions and social interaction: A theory of social decision schemes. Psychological Review, 80, 97–125.
De Dreu, C. K. W., Nijstad, B. A., & Van Knippenberg, D. (in press). Motivated information processing in group judgment and decision making. Personality and Social Psychology Review.
Greitemeyer, T., & Schulz-Hardt, S. (2003). Preference-consistent evaluation of information in the hidden profile paradigm: Beyond group-level explanations for the dominance of shared information in group decisions. Journal of Personality and Social Psychology, 84, 322–339.
Hamilton, D. L., & Zanna, M. P. (1972). Differential weighting of favorable and unfavorable attributes in impressions of personality. Journal of Experimental Research in Personality, 6, 204–212.
Hastie, R., Penrod, S. D., & Pennington, N. (1983). Inside the jury. Cambridge, MA: Harvard University Press.
Janis, I. L. (1982). Victims of groupthink (2nd ed.). Boston: Houghton Mifflin.
Karau, S. J., & Kelly, J. R. (1992). The effects of time scarcity and time abundance on group performance quality and interaction process. Journal of Experimental Social Psychology, 28, 542–571.
Kelly, J. R., & Loving, T. J. (2004). Time pressure and group performance: Exploring underlying processes in the attentional focus model. Journal of Experimental Social Psychology, 40, 185–198.
Kerr, N. L., & MacCoun, R. J. (1985). The effects of jury size and polling method on the process and product of jury deliberation. Journal of Personality and Social Psychology, 48, 349–363.
Kerr, N. L., MacCoun, R. J., & Kramer, G. P. (1996). Bias in judgment: Comparing individuals and groups. Psychological Review, 103, 687–719.
Kerr, N. L., & Tindale, R. S. (2004). Group performance and decision making. Annual Review of Psychology, 55, 623–655.
Larson, J. R., Christensen, C., Abbott, A., & Franz, T. (1996). Diagnosing groups: Charting the flow of information in medical decision-making teams. Journal of Personality and Social Psychology, 71, 315–330.
Larson, J. R., Foster-Fishman, P. G., & Keys, C. B. (1994). Discussion of shared and unshared information in decision-making groups. Journal of Personality and Social Psychology, 67, 446–461.
Laughlin, P. R., & Ellis, A. L. (1986). Demonstrability and social combination processes on mathematical intellective tasks. Journal of Experimental Social Psychology, 22, 177–189.
McGrath, J. E. (1984). Groups: Interaction and performance. Englewood Cliffs, NJ: Prentice Hall.
Nemeth, C. J., & Nemeth-Brown, B. (2003). Better than individuals? The potential benefits of dissent and diversity for group creativity. In P. B. Paulus & B. A. Nijstad (Eds.), Group creativity: Innovation through collaboration (pp. 63–84). New York: Oxford University Press.
Nijstad, B. A. (2006). Choosing none of the above: Persistence of negativity after group discussion and group decision refusal. Manuscript submitted for publication.
O'Connor, K. M., & Arnold, J. A. (2001). Distributive spirals: Negotiation impasses and the moderating role of disputant self-efficacy. Organizational Behavior and Human Decision Processes, 84, 148–176.
Postmes, T., Spears, R., & Cihangir, S. (2001). Quality of decision making and group norms. Journal of Personality and Social Psychology, 80, 918–930.
Rozin, P., & Royzman, E. B. (2001). Negativity bias, negativity dominance, and contagion. Personality and Social Psychology Review, 5, 296–320.
Scholten, L., Van Knippenberg, D., Nijstad, B. A., & De Dreu, C. K. W. (2007). Motivated information processing and group decision making: Effects of process accountability on information processing and decision quality. Journal of Experimental Social Psychology, 43, 539–552.
Schulz-Hardt, S., Brodbeck, F. C., Mojzisch, A., Kerschreiter, R., & Frey, D. (2006). Group decision making in hidden profile situations: Dissent as a facilitator for decision quality. Journal of Personality and Social Psychology, 91, 1080–1093.
Schulz-Hardt, S., Frey, D., Lüthgens, C., & Moscovici, S. (2000). Biased information search in group decision making. Journal of Personality and Social Psychology, 78, 655–669.
Schulz-Hardt, S., Jochims, M., & Frey, D. (2002). Productive conflict in group decision making: Genuine and contrived dissent as strategies to counteract biased information seeking. Organizational Behavior and Human Decision Processes, 88, 563–586.
Skowronski, J. J., & Carlston, D. E. (1989). Negativity and extremity bias in impression formation: A review of explanations. Psychological Bulletin, 105, 131–142.
Stasser, G. (1999). A primer of social decision scheme theory: Models of group influence, competitive model testing, and prospective modeling. Organizational Behavior and Human Decision Processes, 80, 3–20.
Stasser, G., & Birchmeier, Z. (2003). Group creativity and collective choice. In P. B. Paulus & B. A. Nijstad (Eds.), Group creativity: Innovation through collaboration (pp. 85–109). New York: Oxford University Press.
Stasser, G., & Titus, W. (1985). Pooling of unshared information in group decision making: Biased information sampling during discussion. Journal of Personality and Social Psychology, 48, 1467–1478.
Stasser, G., & Titus, W. (1987). Effects of information load and percentage of shared information on the dissemination of unshared information during group discussion. Journal of Personality and Social Psychology, 53, 81–93.
Tversky, A., & Shafir, E. (1992). Choice under conflict: The dynamics of deferred decision. Psychological Science, 3, 358–361.

Received April 11, 2006
Revision received August 27, 2007
Accepted September 2, 2007

Journal of Personality and Social Psychology 2008, Vol. 94, No. 5, 871–882

Copyright 2008 by the American Psychological Association 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.871

Distinguishing Between Silent and Vocal Minorities: Not All Deviants Feel Marginal

Kimberly Rios Morrison and Dale T. Miller
Stanford University

People's opinions can deviate from that of the average group member in two ways. Descriptive deviants diverge from the average group attitude in a direction consistent with the desirable group attitude; prescriptive deviants diverge from the average group attitude in a direction inconsistent with the desirable group attitude. Three studies tested the hypothesis that descriptive deviants are more willing to express their opinions than either nondeviants or prescriptive deviants. Study 1 found that college students reported more comfort in expressing descriptive deviant opinions because descriptive deviance induced feelings of superior conformity (i.e., being "different but good"). Study 2 found that descriptive deviants reported more pride after expressing their opinions, were rated as more proud by an observer, and were more willing to publicize their opinions. Study 3 showed that political bumper stickers with descriptive deviant messages were displayed disproportionately more frequently than were those with prescriptive deviant messages.

Keywords: minorities, deviance, social norms

Those holding minority opinions have good reason to keep silent. Expressing opinions that differ from those of the majority can expose individuals to a number of forms of social punishment, including exclusion and persecution (Kuran, 1995; Levine, 1989; Marques & Paez, 1994). Recognizing this fact, minorities predictably report feeling more uncomfortable, anxious, awkward, frustrated, and unhappy in group settings than do their majority counterparts (Matz & Wood, 2005; Nemeth & Wachtler, 1983). The best advice for someone who holds a minority opinion and wishes to avoid trouble would thus seem clear: If you aren't prepared to modify your opinion in the direction of the majority, at least keep silent. Indeed, evidence suggests that people holding minority opinions take more time to express their opinions (Bassili, 2003), when they do not avoid expressing them altogether (Noelle-Neumann, 1984).

Not all minorities are silent, however. In her book The Spiral of Silence, Elisabeth Noelle-Neumann (1984) asserts that even when supporters of different minority viewpoints are equal in numbers, they may differ in their reluctance to make their sentiments known. Indeed, minorities can be at least as vocal as majorities under certain circumstances. For example, during the height of the Vietnam War, U.S. Vice President Spiro Agnew famously claimed that the highly visible anti-war protesters constituted a mere "vocal minority" who were out of step with the larger, more representative "silent majority" (Noelle-Neumann, 1984).

Noelle-Neumann may be correct that some minority sentiments are more likely to be publicized than others, but how can we predict which minorities will voice their opinions and which will become trapped in a "spiral of silence"? To begin to answer this question, we distinguish between two features of deviant opinions: the magnitude of deviance from the average group opinion and the direction of this deviance. We then explain why these features might influence patterns of opinion expression.

In this article, the terms minority and deviant neither carry negative connotations nor imply that a person so characterized necessarily perceives him- or herself as dissimilar from the rest of the group. These terms simply refer to individuals whose attitudes are objectively (i.e., statistically) different from those of most other group members (Blanton & Christie, 2003).

Kimberly Rios Morrison and Dale T. Miller, Graduate School of Business, Stanford University. We would like to thank Evan Morrison, Lara Tiedens, Christian Wheeler, Debbie Prentice, Penny Visser, members of the 2004–2005 Workshop in Behavioral Research, and members of the 2005–2006 Miller Lab for their helpful feedback on earlier versions of this article. We also acknowledge Becky Schaumberg, David Campos, Stephanie Lam, Kyle Peterson, and Valerie Rios for their assistance with data collection and coding. Correspondence concerning this article should be addressed to Kimberly Rios Morrison, Graduate School of Business, Stanford University, 518 Memorial Way, Stanford, CA 94305. E-mail: [email protected] or [email protected]

Deviance: Magnitude Versus Direction

Most research on opinion expression has treated minority/majority status as a dichotomous variable and has shown that group members who either hold or are randomly assigned to argue in favor of a minority opinion are less willing to speak out than those who are in the majority (Asch, 1956; Bassili, 2003; Matz & Wood, 2005). A smaller number of studies have allowed for the possibility that opinion deviance can vary within a group. These studies have generally found that the likelihood that a person will publicize an opinion depends on the actual or perceived distance of that opinion from the average group opinion. The greater that distance, the more unwilling group members are to voice their opinions (Glynn, Hayes, & Shanahan, 1997).

The magnitude of distance from the average group opinion is not the only, nor possibly even the most important, determinant of the likelihood that a particular opinion will be expressed. We propose that the direction in which an opinion deviates from the group average is also a critical determinant of whether its holder will be vocal or silent. Conceptualizing deviance in terms of both direction and magnitude recognizes that groups can be viewed as often having two norms with respect to many attitudes and behaviors (see Cialdini, Reno, & Kallgren, 1990; Cialdini & Trost, 1998; Prentice & Miller, 1996). The first norm reflects the common attitudes or behaviors of the group, that is, what group members actually think or do (i.e., the descriptive norm). The second norm reflects the desirable attitudes or behaviors of the group, that is, what group members believe they should or ought to think or do (i.e., the prescriptive norm). To adequately describe the deviance of a position, then, one needs to know the relation of that position to both the descriptive and prescriptive group norms.

In referring to the average group attitude as the descriptive norm, we are assuming (and we conducted a pretest to confirm) that the distribution of group attitudes is unimodal. Were a bimodal distribution to exist, there would be two descriptive norms (i.e., two different groups) in the population. Furthermore, the average attitude of the population would actually represent a minority opinion within either of these two groups (see Levine & Ranelli, 1978). Supporting our assumption, people generally tend to perceive social distributions (e.g., of group members' attitudes and behaviors) as normal rather than bimodal (Nisbett & Kunda, 1985), except when the group under question is unfamiliar to them (but see Judd & Johnson, 1981).

Descriptive and Prescriptive Deviance

When the descriptive norm (average attitude) and prescriptive norm (desirable attitude) of the group diverge, one can actually speak of two distinct categories of deviants: those who deviate from the descriptive norm in a direction consistent with the prescriptive norm and those who deviate from the descriptive norm in a direction inconsistent with the prescriptive norm. We call the former category of individuals descriptive deviants, given their divergence from the descriptive (but not prescriptive) norm. Conversely, we call the latter category of individuals prescriptive deviants, given their divergence from both the prescriptive norm and the descriptive norm.

As an example of descriptive and prescriptive deviance, consider a college campus in which the descriptive norm for political attitudes is slightly liberal. Assuming that the prescriptive norm is also liberal, extremely liberal students would be descriptive deviants, and extremely conservative students would be prescriptive deviants. In this example, although both descriptive and prescriptive deviants hold positions that differ from that of the average group member, only the position of prescriptive deviants undermines the prescriptive group norm. The position of descriptive deviants, more than that of even nondeviants, reinforces the prescriptive group norm.

Our typology of deviance bears a similarity to that proposed by Abrams, Marques, Bown, and Henson (2000). Abrams and colleagues distinguished between pronorm deviants, whose values deviate from the group in the direction of the group prototype (i.e., desirable group attitude), and antinorm deviants, whose values deviate from the group in the direction away from the group prototype. According to their model, pronorm deviants validate the ingroup's norms, and hence function to differentiate the ingroup from other groups, to a greater extent than do antinorm deviants. One consequence of this, demonstrated by Abrams and colleagues, is that pronorm deviants are rated as more prototypical and are evaluated more positively than antinorm deviants and, in some cases, normatives (i.e., nondeviant group members; Abrams et al., 2000; see also Abrams, de Moura, Hutchison, & Viki, 2005; Abrams, Marques, Bown, & Dougill, 2002).

Despite these similarities, the present model departs from that of Abrams and colleagues in two ways. First, whereas Abrams and colleagues emphasize intergroup differences (those between the ingroup and outgroups), we emphasize intragroup differences (those between ingroup members). For example, Abrams and colleagues define descriptive norms as characteristics of the ingroup that distinguish it from relevant outgroups (e.g., members of one sports team wear a certain color jersey in part to avoid being mistaken for the opposing team; Abrams et al., 2000). By contrast, we define descriptive norms as characteristics of most ingroup members that distinguish ingroup nondeviants from ingroup deviants (e.g., most sports fans at a game might wear a single item of clothing in their team's color, whereas other fans might wear more or fewer items of clothing in that color).

Additionally, although both models suggest that those who deviate from group norms in a desirable direction will voice their opinions more readily than those who deviate from group norms in an undesirable direction or (in many cases) those who do not deviate at all, the reason specified by each model is different. Abrams and colleagues would claim that people adopt and express certain (i.e., pronorm) deviant opinions in order to differentiate their own group from other groups. Our model focuses on how people adopt and express certain (i.e., descriptive) deviant opinions in order to maintain a positive individual identity relative to other ingroup members.

Implications for Opinion Expression We argue that the direction of difference from the group average is critical in determining how comfortable a person with a deviant opinion will feel expressing that opinion. Specifically, descriptive deviants will be more comfortable and willing to express their opinions than both nondeviants and prescriptive deviants. As noted earlier, previous research on opinion expression has generally classified group members into one of two categories: majority or minority (Asch, 1956; Matz & Wood, 2005). One demonstration of the link between minority status and opinion expression is Bassili’s (2003) finding that minority opinion holders were slower to report their attitudes in a telephone survey than were majority opinion holders, even after several different measures of attitude strength (e.g., certainty, importance, extremity; see also Krosnick, Boninger, Chuang, Berent, & Carnot, 1993) were controlled. Bassili’s proposed explanation for these results was that holding a minority opinion induces internal conflict in individuals, which in turn leads to slower response latencies (Bassili, 2003; see also Bassili, 1995; Huckfeldt & Sprague, 2000). When group members’ opinions have instead been represented on a continuum, the predominant assumption is that the more one’s opinion differs from that of the group average (in any direction), the more reluctant he or she will be to express that opinion (Bassili, 2003; Glynn et al., 1997). For instance, Bassili (2003) found the “minority slowness effect” (p. 261) to be especially pronounced in

SILENT AND VOCAL MINORITIES

groups with a large majority and among group members who perceived the size of the majority to be large. Though the relation between deviance magnitude and opinion expression is well established, deviance direction has yet to be examined in this context. Moreover, in explaining why those who hold deviant opinions may be unwilling to express their opinions, previous research has focused on the factors that distinguish deviants from nondeviants, such as perceived similarity to or anticipated liking by other group members (Glynn et al., 1997; Hensley & Duval, 1976; Levine & Ranelli, 1978; see also Levine, 1989). This research has suggested that the greater the magnitude of deviance, the lower one’s perceived similarity to one’s peers (Glynn et al., 1997) and the lower one’s chances of being well liked by one’s peers (Hensley & Duval, 1976; Levine & Ranelli, 1978). Given our focus on deviance direction, however, we took a different approach by examining a factor that may distinguish certain (i.e., descriptive) deviants from both nondeviants and other (i.e., prescriptive) deviants. Specifically, our claim that descriptive deviance, as opposed to prescriptive deviance or nondeviance, increases willingness to express oneself derives from the assumption that people have competing psychological needs for similarity and distinctiveness. As uniqueness theory postulates, individuals are reluctant to accept feedback that they are either extremely similar to or extremely different from their peers (Snyder & Fromkin, 1980; see also Lynn & Snyder, 2002). Consistent with this idea, optimal distinctiveness theory proposes that although difference is generally seen as a positive attribute, the desire to be different is balanced by desires to conform and belong to others (Brewer, 1991; Brewer & Gardner, 1996; Vignoles, Chryssochoou, & Breakwell, 2000). 
One way in which people can satisfy their needs for belongingness and distinctiveness simultaneously is by differentiating themselves within a group, yet still remaining members of that group (Hornsey & Jetten, 2004). Most relevant to our purposes, people’s quest to be both similar and different can be promoted by conforming even more than other ingroup members to the norms of the ingroup—that is, by being superior conformists (see Codol, 1975, 1984). Superior conformity may be a reflection of group members’ tendencies to distinguish themselves in “good” ways and avoid distinguishing themselves in “bad” ways (Blanton & Christie, 2003). Some evidence that group members prefer deviating from others in a valued direction comes from the group polarization research of Myers and his colleagues (Myers, 1978; Myers, Wojcicki, & Aardema, 1977). In these studies, participants shifted their opinions in the direction of the prescriptive group norm—and away from the descriptive group norm—when exposed to their peers’ attitudes. The presumed reason for their shift was that they wanted to be not only similar to their peers but also superior to their peers (Myers, 1978). If being different in a positive way (i.e., superior conformity) does indeed satisfy the competing drives for similarity and distinctiveness, then group members who hold extreme but desirable positions (i.e., descriptive deviants) should feel more comfortable publicizing their opinions than both those who hold extreme but undesirable positions (i.e., prescriptive deviants) and those who hold moderate positions (i.e., nondeviants). According to the foregoing analysis, this proposed relation between deviance direction and opinion expression arises because descriptive deviants believe that their opinions render them superior conformists.


If group members simply wanted to be similar to their peers, nondeviants should be more vocal than deviants overall, and this willingness to be vocal should be driven by nondeviants’ perceptions of similarity. If group members simply wanted to be different from their peers, deviants should be more vocal than nondeviants overall, and this willingness to be vocal should be driven by deviants’ perceptions of difference, regardless of whether this difference is “good” or “bad.”

The Present Studies The foregoing analysis suggests that the willingness of deviants to express their opinions should depend on whether the direction in which they deviate from the group average is toward or away from the prescriptive group norm. Descriptive deviants, whose opinions are closer to the desirable (if not average) group opinion, should be more likely to express their opinions than both nondeviants and prescriptive deviants. Studies 1 and 2 focused on the willingness of college students to express their opinions on several important campus issues. For each issue, it was possible to identify both the prescriptive norm (the liberal position in all cases) and the descriptive norm (i.e., the average attitude). In Study 1, the experimental procedure was to ask participants to imagine arguing for one of three randomly assigned positions: descriptive deviant, nondeviant, or prescriptive deviant. This study tested the hypothesis that descriptive deviants’ perception of their superior conformity would mediate the relation between descriptive deviance and self-reported comfort in opinion expression. Study 1 also tested three additional mediational accounts for the greater comfort descriptive deviants felt in expressing their opinions. The first of these was that descriptive deviants, more so than prescriptive deviants, perceive their opinions as similar to those of the majority (Bassili, 2003; Glynn et al., 1997). The second was that descriptive deviants are less likely than prescriptive deviants to believe that they are different in a “bad” way. The third was that descriptive deviants, more so than prescriptive deviants, believe that their (deviant) opinions will make them popular within their peer group (Hensley & Duval, 1976; Levine & Ranelli, 1978). 
The first two alternative accounts, if ruled out, would further reinforce the idea that descriptive deviants are more comfortable not because they see themselves as similar to their peers but rather because they see themselves as “better” group members than their peers. In Study 2, we built upon Study 1 in three ways. First, rather than measuring participants’ responses to a hypothetical scenario, in Study 2 we actually assigned participants to deliver a speech in support of one of three different attitude positions. Second, rather than measuring participants’ felt discomfort in opinion expression, in Study 2 we measured both self-reported and observer-rated pride, an emotion presumed to be conceptually closer to the construct of superior conformity. Finally, Study 2 included a measure of actual behavior—namely, participants’ willingness to have their assigned position made public. The specific hypothesis tested in Study 2 was that participants assigned to argue for a descriptive deviant position would experience (and be rated as experiencing) higher levels of pride, as well as be more willing to publicize their position, than would those assigned to argue for a nondeviant or prescriptive deviant position.


Study 3 was a field demonstration of opinion expression in which we compared the proportions of political bumper stickers with the proportions of registered voters in two different areas. One area was a predominantly Democratic county, where the descriptive norm was defined as slightly liberal and the prescriptive norm as liberal. The other was a predominantly Republican county, where the descriptive norm was defined as slightly conservative and the prescriptive norm as conservative. The specific hypothesis tested was that the messages on the bumper stickers, compared with the actual percentages of voters, would overrepresent the descriptive deviant position and underrepresent the prescriptive deviant position in each county.

Pretest Our first step was to determine the normative positions among the target group (undergraduates at Stanford University) on select group-relevant issues. Specifically, we tested whether group members accurately perceived where they stood on each issue relative to their peers and whether the distributions of group members’ attitudes were in fact unimodal rather than bimodal (see Nisbett & Kunda, 1985). Only then could we claim that the average group attitude, as opposed to an attitude on either extreme, did in fact represent the nondeviant position on each issue. We also sought to determine the prescriptive group norm on each issue in order to ensure that it was different enough from the descriptive group norm to distinguish descriptive deviants from nondeviants. We measured descriptive and prescriptive group norms via a survey that asked students for their attitudes on three different issues: the wages of full-time (nonstudent) Stanford employees, the presence of ethnic-theme dormitories on the Stanford campus, and the use of affirmative action in college admissions decisions. The survey included three questions on each issue, worded as follows: (a) How would you best describe your own attitude toward [the issue]? (b) How would you best describe the typical Stanford student’s attitude toward [the issue]? and (c) How would you best describe the attitude that Stanford students think they should have toward [the issue], in order to fit in with their peers? The first question was designed to assess the descriptive norm (average group attitude) for the issue, the second question, perceptions of the descriptive norm, and the third question, perceptions of the prescriptive norm (desirable group attitude). The third question was particularly important, as the use of the imperative (i.e., “should”) helped to establish the direction in which the students perceived social pressure to be operating. 
Twenty-eight participants responded to these questions on 9-point scales (1 = much too low, 9 = much too high for the wages questions; 1 = very strongly opposed, 9 = very strongly in favor for the ethnic-theme dormitories and affirmative action questions). Lower scores on the wages issue indicated greater degrees of liberalism, as full-time Stanford employees are viewed as being socioeconomically disadvantaged relative to the student population. Conversely, higher scores on the ethnic-theme dormitories and affirmative action issues indicated greater degrees of liberalism. We predicted that the prescriptive norm for each issue would be liberal (as compared with the scale midpoint) and that the prescriptive norm for each issue would be significantly more liberal than both students’ own attitude (actual descriptive norm) and their perception of the typical student’s attitude (perceived descriptive norm). Indeed, the prescriptive norms for all three issues fell on the liberal end (i.e., as compared with the midpoint) of the scale: t(27) = −9.84, p < .001 for wages; t(27) = 4.77, p < .001 for ethnic-theme dormitories; and t(27) = 4.88, p < .001 for affirmative action. Furthermore, in all but one (marginally significant) case, the prescriptive norm was judged to be significantly more liberal than both the actual and perceived descriptive norms, which were at or near the scale midpoint (see Table 1 for means). The actual and perceived descriptive norms did not differ significantly for any of the issues. These results supported the claim that the liberal position represents the prescriptive group norm and that the descriptive and prescriptive norms diverge enough to classify students who hold extreme attitudes (in a direction either consistent or inconsistent with the prescriptive norm) as deviants. They also show that students, at least with respect to these issues, have relatively accurate perceptions of where they stand relative to their peers (see Nisbett & Kunda, 1985).
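As a rough check on the midpoint comparisons above, the one-sample t statistics can be approximately recomputed from the summary statistics in Table 1. The sketch below is illustrative only: it assumes n = 28, a comparison value of 5 (the midpoint of the 9-point scale), and the rounded "socially desirable" means and standard deviations from Table 1. The function name is made up for this example, and small discrepancies from the reported values are expected because the inputs are rounded.

```python
import math

def one_sample_t(mean, sd, n, mu0):
    """t statistic for H0: population mean == mu0, from summary statistics."""
    return (mean - mu0) / (sd / math.sqrt(n))

N = 28          # pretest sample size reported in the text
MIDPOINT = 5    # midpoint of the 9-point scale (assumed comparison value)

# Rounded prescriptive-norm ("socially desirable") M and SD from Table 1
prescriptive = {
    "wages": (2.68, 1.25),
    "ethnic-theme dormitories": (6.32, 1.47),
    "affirmative action": (6.36, 1.47),
}

t_values = {
    issue: one_sample_t(m, sd, N, MIDPOINT)
    for issue, (m, sd) in prescriptive.items()
}
for issue, t in t_values.items():
    print(f"{issue}: t({N - 1}) = {t:.2f}")
```

With these rounded inputs, each statistic lands within a few hundredths of the reported value (−9.84, 4.77, and 4.88, respectively).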

Study 1

Study 1 experimentally tested the hypothesis that people are more comfortable expressing descriptive deviant opinions than nondeviant or prescriptive deviant opinions. This was accomplished by randomly assigning participants to imagine that they had to argue in support of a descriptive deviant, nondeviant, or prescriptive deviant position on an issue and then measuring their comfort in doing so. In addition, four potential mediators were included. One of these assessed participants’ perceptions of the desirability of the two forms of deviance, thereby permitting a direct test of the mediational claim that descriptive deviants are more comfortable because they know that their opinions render them superior conformists (i.e., different from their peers in a “good” way). The other three measures were included to permit tests of alternative explanations. These questions asked how well liked and popular participants believed their assigned position would make them, how similar they believed their assigned position was to that of their peers, and how much they believed their assigned position made them different from their peers in a “bad” way. To the extent that descriptive deviants’ greater comfort was driven by their perception of superior conformity, we would not have expected their greater comfort to depend on their perception that they were more positively evaluated or closer to the average group opinion than were other group members.

Table 1
Mean Attitudes of Participants Toward Each Issue, Pretest

Attitude:                  Own             Typical Stanford student’s   Socially desirable
Issue                      M       SD      M       SD                   M       SD
Wages                      4.07a   1.51    3.43a   1.20                 2.68b   1.25
Ethnic-theme dormitories   5.39ab  2.10    5.00a   1.47                 6.32b   1.47
Affirmative action         5.11a   2.17    5.11a   1.42                 6.36b   1.47

Note. Ratings derived from a 9-point scale: 1 = much too low, 9 = much too high for the wages question; 1 = very strongly opposed, 9 = very strongly in favor for the ethnic-theme dormitories and affirmative action questions. On the wages question, lower scores = greater liberalism; on the dormitories and affirmative action questions, higher scores = greater liberalism. Within each row, means with different subscript letters are significantly different from one another.

Method

Participants. One hundred and thirty-six Stanford students (57 men, 77 women, 2 unspecified) participated in this questionnaire experiment as part of a 1-hr-long mass testing session. They were paid $20 upon completion of the entire session.

Materials and procedure. The questionnaire consisted of two pages. On the first page, participants were asked to imagine being required to deliver a speech to another student advocating a position on one of the three issues used in the pretest. The assigned position was that of a descriptive deviant (much too low for wages, strongly in favor for ethnic-theme dormitories and affirmative action), a nondeviant (a little too low for wages, slightly in favor for ethnic-theme dormitories and affirmative action), or a prescriptive deviant (much too high for wages, strongly opposed for ethnic-theme dormitories and affirmative action). Participants were further instructed to imagine that the other student believed that the position they took in their speech was their true attitude. After reading the scenario, participants indicated how comfortable and nervous they would feel delivering the speech (1 = not at all, 9 = very much). After the nervousness item was reverse coded, responses to the two items were averaged to form a composite (r = .32, p < .001). Finally, participants responded to several questions about their assigned position, also on a scale from 1 (not at all) to 9 (very much). Specifically, they indicated the extent to which actually holding their assigned position would make them similar to their peers (similarity), different from their peers in a good way (superior conformity), and different from their peers in a bad way (different but bad). They also indicated the extent to which actually holding their assigned position would make them well liked by and popular among their peers. These two items were averaged to form a liking index (r = .75, p < .001).
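The comfort/nervousness composite described above can be sketched as follows. This is a minimal illustration, not the authors' scoring script: it assumes the usual reverse-coding rule for a 1-to-9 scale (reversed = 1 + 9 − score), and the function names and the example respondent are hypothetical.

```python
SCALE_MIN, SCALE_MAX = 1, 9  # endpoints of the 9-point response scale

def reverse_code(score, lo=SCALE_MIN, hi=SCALE_MAX):
    """Flip a rating so that high values become low and vice versa."""
    return lo + hi - score

def comfort_composite(comfortable, nervous):
    """Average of the comfort rating and the reverse-coded nervousness rating."""
    return (comfortable + reverse_code(nervous)) / 2

# Hypothetical respondent: fairly comfortable (7), only slightly nervous (3)
example = comfort_composite(7, 3)
print(example)
```

Higher composite scores therefore mean more comfort and less nervousness, which matches the direction in which the Results are reported.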

Results

We hypothesized that participants assigned to express descriptive deviant opinions would report feeling more comfortable and less nervous than those assigned to express either nondeviant or prescriptive deviant opinions. To test this hypothesis, we first submitted the results to a 3 (position: descriptive deviant vs. nondeviant vs. prescriptive deviant) × 3 (issue: wages vs. ethnic-theme dormitories vs. affirmative action) analysis of variance (ANOVA). The main dependent variable of interest was the comfort/nervousness composite, but we also conducted parallel analyses on each of the ancillary measures described earlier. Because these analyses revealed no main effects of issue and no Position × Issue interactions, we collapsed the results across the three issues. For each dependent measure, we ran a one-way ANOVA to examine the overall effect of assigned position. We also ran a planned contrast comparing descriptive deviants (coded as 2) to both nondeviants and prescriptive deviants (both coded as −1). (For a list of means and differences between individual conditions, see Table 2.)
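The partial eta-squared effect sizes reported for each omnibus test in this Results section can be recovered from the F statistic and its degrees of freedom. The sketch below is an illustration rather than the authors' procedure: it assumes the standard identity that partial eta squared equals (F × df_effect) / (F × df_effect + df_error), and the function name is made up for this example.

```python
def partial_eta_squared(f_stat, df_effect, df_error):
    """Partial eta squared from an F statistic and its degrees of freedom."""
    ss_ratio = f_stat * df_effect          # equals SS_effect / MS_error
    return ss_ratio / (ss_ratio + df_error)

# Omnibus tests reported for Study 1: (F, df_effect, df_error)
reported = {
    "comfort/nervousness": (3.12, 2, 133),
    "similarity": (11.98, 2, 133),
    "superior conformity": (6.86, 2, 132),
    "different but bad": (8.14, 2, 132),
    "liking": (16.64, 2, 129),
}

effect_sizes = {
    name: round(partial_eta_squared(*args), 3)
    for name, args in reported.items()
}
print(effect_sizes)
```

The recomputed values agree with the reported effect sizes (.045, .153, .094, .110, and .205) to three decimals.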

Table 2
Mean Ratings of Participants by Assigned Position and Dependent Measure, Study 1

                                             Descriptive deviant   Nondeviant        Prescriptive deviant
Feeling/perception                           M       SD            M       SD        M       SD
Comfort/nervousness                          5.50a   1.98          5.09ab  1.69      4.49b   2.12
Similarity to others                         5.38a   1.69          5.33a   1.52      3.70b   2.26
Superior conformity (“different but good”)   4.87a   2.50          4.18a   2.17      3.09b   2.19
“Different but bad”                          3.20a   2.05          2.91a   1.82      4.73b   2.86
Liking by others                             4.78a   2.07          4.20a   1.82      2.72b   1.57

Note. Ratings derived from a 9-point scale (1 = not at all, 9 = very much) indicating participants’ comfort/nervousness about advocating the assigned position and the extent to which advocating the position made them similar to, different from (in a good way and a bad way), and liked by their peers. Within each row, means with different subscript letters are significantly different from one another.

Comfort/nervousness. The overall effect of assigned position was significant, F(2, 133) = 3.12, p < .05, ηp² = .045. Consistent with predictions, those assigned to be descriptive deviants reported being more comfortable and less nervous than did those assigned to be nondeviants and prescriptive deviants, t(133) = 2.04, p < .05.

Similarity. The overall effect of assigned position was significant, F(2, 133) = 11.98, p < .001, ηp² = .153. Those assigned to be descriptive deviants felt more similar to their peers than did those assigned to be nondeviants and prescriptive deviants, t(133) = 2.60, p = .01.

Superior conformity. The overall effect of assigned position was significant, F(2, 132) = 6.86, p = .001, ηp² = .094.¹ Those assigned to be descriptive deviants were more inclined than those assigned to be nondeviants and prescriptive deviants to believe that their positions made them superior conformists (i.e., different but good), t(132) = 2.97, p < .005.

Different but bad. The overall effect of assigned position was significant, F(2, 132) = 8.14, p < .001, ηp² = .110. However, those assigned to be descriptive deviants were no less likely than those assigned to be nondeviants or prescriptive deviants to believe that their positions made them different but bad, t(132) = −1.50, p < .14.

Liking. The overall effect of assigned position was significant, F(2, 129) = 16.64, p < .001, ηp² = .205. Those assigned to be descriptive deviants were more likely than those assigned to be nondeviants or prescriptive deviants to report that their assigned positions would make them well liked and popular, t(129) = 4.17, p < .001.

¹ The difference in degrees of freedom is due to the fact that one participant (or more, for some of the ancillary measures) did not respond to the item.

Mediational analyses. To test whether the relationship between assigned position (i.e., descriptive deviance) and comfort was driven by feelings of superior conformity, as opposed to assumptions of similarity, feelings of being “different but bad” (among prescriptive deviants), or liking, we conducted mediational analyses using multiple regression (see Baron & Kenny, 1986). For each analysis, we regressed the comfort composite onto the term comparing descriptive deviants to both other conditions, the term comparing nondeviants to prescriptive deviants (to control for differences between these two groups), and the proposed mediator. As noted above, the relation between assigned position and comfort was significant, as were the relations between assigned position and each proposed mediator except “different but bad” (i.e., similarity, superior conformity, and liking). Thus, three of the four potential mediators satisfied Baron and Kenny’s (1986) first two criteria and were submitted to mediational analyses. When the comfort composite was regressed onto assigned position and similarity, there was no relationship between similarity and comfort, β = .132, t(132) = 1.44, p = .15. This finding indicates that perceived similarity did not mediate the effect of assigned position on comfort. When the comfort composite was regressed onto assigned position and the superior conformity measure, superior conformity and comfort were significantly correlated, β = .217, t(131) = 2.47, p < .02. Moreover, the relation between assigned position and comfort was reduced to nonsignificance, β = .111, t(131) = 1.28, p = .20. A Sobel test revealed that this reduction was marginally significant, z = 1.90, p < .06. This finding, consistent with predictions, indicates that feelings of superior conformity partially mediated the effect of assigned position on comfort (see Figure 1). When the comfort composite was regressed onto assigned position and liking, there was no relation between liking and comfort, β = .152, t(128) = 1.58, p > .11. This finding indicates that liking did not mediate the effect of assigned position on comfort.
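The Sobel statistic reported above can be approximately reproduced from the two path t statistics. The sketch below is an illustration under stated assumptions, not the authors' computation. It uses the fact that the Sobel z, a·b / sqrt(b²·se_a² + a²·se_b²), can be rewritten in terms of t_a = a/se_a and t_b = b/se_b. It further assumes that the path from assigned position to superior conformity has the same t as the planned contrast reported for that measure, t(132) = 2.97. The function names are hypothetical.

```python
import math

def sobel_z_from_t(t_a, t_b):
    """Sobel z for an indirect effect, expressed through the two path t statistics.

    Algebraically equal to a*b / sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    when t_a = a / se_a and t_b = b / se_b.
    """
    return (t_a * t_b) / math.sqrt(t_a ** 2 + t_b ** 2)

def two_tailed_p(z):
    """Two-tailed p value for a standard normal z."""
    return math.erfc(abs(z) / math.sqrt(2))

# Assumption: position -> superior conformity shares the planned-contrast t(132) = 2.97.
T_A = 2.97
# Reported: superior conformity -> comfort, controlling for position, t(131) = 2.47.
T_B = 2.47

z = sobel_z_from_t(T_A, T_B)
p = two_tailed_p(z)
print(f"z = {z:.2f}, p = {p:.3f}")
```

Under these assumptions the result is z of about 1.90 with a two-tailed p just under .06, in line with the reported test.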

Discussion

Study 1 demonstrated that group members experimentally assigned to argue for a descriptive deviant position were more comfortable expressing that position than were those assigned a prescriptive deviant or nondeviant position. The results strengthen the claim that the greater comfort of descriptive than prescriptive deviants derives from descriptive deviants’ perception that their position distinguishes them from their peers in a positive way, hence rendering them superior conformists. Study 1 also ruled out two plausible alternative routes by which the observed relation between descriptive deviance and comfort might arise. The first of these involves the possibility that descriptive deviants’ greater willingness to express their opinions is due to a perception that they are similar to the rest of the group (Bassili, 2003; Glynn et al., 1997). The present findings argue strongly against this account. Though descriptive deviants did perceive themselves as being more similar to their peers than did nondeviants and prescriptive deviants, there was no relation between their perceptions of similarity and their comfort in opinion expression. Instead, descriptive deviants were more comfortable expressing their opinions because they clearly recognized that their opinions differed from the group average, albeit in a direction consistent with the prescriptive norm. A second possibility is that descriptive deviants, relative to prescriptive deviants and nondeviants, believe that their opinions will make them more popular and well liked among their peers. Consistent with previous research on reactions to opinion deviants (e.g., Hensley & Duval, 1976; Levine & Ranelli, 1978), Study 1 did find a positive effect of descriptive deviance on anticipated liking by others. However, this effect did not mediate descriptive deviants’ greater comfort in opinion expression. It is possible that anticipated liking may prove to be a more critical predictor of opinion expression when nondeviants are simply compared with deviants, as opposed to when certain (i.e., descriptive) deviants are compared with other group members.

Figure 1. Mediational analysis, Study 1. Path coefficients shown in the diagram: position (coded 2, −1, −1) to superior conformity, .246**; superior conformity to comfort/nervousness, .217*; position to comfort/nervousness, .111 (.173*). *p < .05. **p < .01.

Study 2

In Study 2, we moved beyond a vignette methodology by assigning participants to actually deliver a speech in support of a particular position on a controversial issue and then measuring both their self-reported pride and their pride as rated by an observer. If descriptive deviants do in fact believe that they are different in a good way, then they should be proud of themselves for “one-upping” their peers, not merely relieved (i.e., comfortable) as a result of fitting in with their peers (see Myers, 1978). Indeed, pride-related emotions are generally thought to convey one’s successes and achievements to oneself and one’s group, thereby serving to enhance one’s social status, that is, to make one seem superior to one’s peers (Tracy & Robins, 2007). Study 2 also included a behavioral measure of opinion expression: the extent to which participants were willing to have their assigned opinion made public. We hypothesized that participants assigned to argue for a descriptive deviant (relative to nondeviant or prescriptive deviant) opinion would both experience more pride-related emotions and be more willing to publicly express their assigned opinion.

Method

Participants. Thirty-four Stanford undergraduates participated in this experiment. Participants were randomly assigned to one of three conditions: descriptive deviant (n = 12), nondeviant (n = 12), or prescriptive deviant (n = 10). The experiment took approximately 30 min, and participants were paid $8. One participant was omitted from the analyses because she had previously participated in a similar experiment, and 3 more were excluded because they indicated that they did not know what affirmative action was. The data from the remaining 30 participants (17 women, 13 men) were retained in the final sample.

Materials and procedure. Participants were tested individually. Upon arrival, they were greeted by an experimenter and given a page of written instructions. Participants read that the purpose of the study was to “assess people’s reactions to different types of arguments” and that they would prepare a short speech on a controversial issue. They further read that their speech would be videotaped and shown to another Stanford student in a different experimental session. More specifically, they learned that they would be assigned to argue in support of a particular attitude position on an issue and that whoever viewed the videotape would think that this position represented their true attitude. Participants were instructed to play along as best they could with the experimental deceit and to not give any indication of their real attitude during the speech. Next, the experimenter explained that the issue and position were written on the next page and that participants could also use that page for notes. In all cases, the issue was the use of affirmative action in college admissions decisions. However, participants were assigned to take one of three different positions: descriptive deviant (strongly in favor: affirmative action is definitely a good thing), nondeviant (slightly in favor: affirmative action is probably a good thing, though there are some potential problems with it), or prescriptive deviant (strongly opposed: affirmative action is definitely a bad thing). The experimenter left the room and gave participants 5 min to prepare the speech, which, they were told, could be as short or long as they wanted. Once the 5 min had passed, the experimenter re-entered the room with the video camera and recorded the speech. She then gave participants a packet of paper-and-pencil questionnaires, which contained two of the three dependent measures. The first was a 7-item measure of achievement pride (Tracy & Robins, 2007). The scale items (e.g., accomplished, confident) were interspersed with other pride-irrelevant emotions (e.g., critical, optimistic) to disguise the purpose of the experiment. Participants responded on a 5-point scale (1 = not at all, 5 = extremely) according to how they had felt during the speech, with higher scores reflecting higher levels of pride. The measure demonstrated good reliability (α = .92). The second dependent variable was the behavioral measure of public opinion expression. Following a series of filler questionnaires designed to further minimize suspicion, participants read that ethically the researchers could not use the videotape of the speech in future experimental sessions unless participants were comfortable with their doing so. Participants then indicated whether they wanted their speech to be shown to another participant (0 = no, 1 = yes). Participants were told that in all cases they would be identified by participant number and not by name and that they would receive full payment regardless of which choice they made. At the end of the experiment, participants completed a suspicion probe and were fully debriefed. Specifically, they learned that their speech would never be shown to another participant and that it would only be used for coding purposes within the lab. They were then paid, thanked, and dismissed. None of the participants reported any awareness of the true experimental hypothesis.

Video coding. After the data had been collected, all videotapes were viewed by a coder who was blind to the experimental hypothesis. The coder used a 5-point scale (1 = not at all, 5 = extremely) to rate each participant on two pride-related emotions: proud and confident. Because the coder’s two ratings were highly correlated, r(30) = .84, p < .001, they were averaged to form a composite. Participants’ scores on this composite were marginally correlated with their self-reported pride scores, r(30) = .29, p = .12.

Results

Self-reported pride. We hypothesized that participants assigned to argue for the descriptive deviant position would report higher levels of pride than those assigned to argue for both the nondeviant and prescriptive deviant positions. To test this hypothesis, we submitted participants’ self-reported pride scores to a one-way ANOVA. Paralleling the analytic strategy used in Study 1, participants in the descriptive deviant condition (coded as 2) were compared with participants in both of the other conditions (coded as –1). The overall omnibus test was significant, F(2, 27) = 5.62, p < .01, ηp² = .294. In addition, the planned contrast was significant, t(27) = 3.21, p < .01, indicating that participants in the descriptive deviant condition felt more proud during their speeches than did those in either the nondeviant condition or the prescriptive deviant condition (see Table 3).

Observer-rated pride. We hypothesized that participants assigned to argue for the descriptive deviant position would be rated as more proud than those assigned to argue for both the nondeviant and prescriptive deviant positions. To test this hypothesis, we submitted participants’ pride scores (as rated by the coder) to a one-way ANOVA and planned contrast, as described earlier. The overall omnibus test was marginal, F(2, 27) = 2.91, p = .07, ηp² = .177. However, the planned contrast attained significance, t(27) = 2.13, p = .04, indicating that participants in the descriptive deviant condition—relative to those in the other two conditions—appeared to an outside observer to be more proud (see Table 3).

Behavioral opinion expression. We hypothesized that participants in the descriptive deviant condition would be more willing to give the researchers permission to use their videotape in future research than would participants assigned to the other two conditions. This hypothesis was confirmed, χ²(2, N = 30) = 7.08, p < .03. In total, 5 participants (50%) in the descriptive deviant condition, 0 participants (0%) in the nondeviant condition, and 2 participants (20%) in the prescriptive deviant condition gave permission for their videotape to be shown to another student.
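The Study 2 statistics can be approximately recomputed from the summary values reported in the text and in Table 3. The sketch below is illustrative only and is not the authors' analysis script. It assumes equal groups of 10 participants (inferred from N = 30 and the reported percentages), a pooled error term for the contrast, and a Pearson chi-square without continuity correction. All function and variable names are hypothetical.

```python
import math

# Assumption: 10 participants per condition, inferred from N = 30 and the
# reported percentages (e.g., 5 participants = 50%).
N_PER_GROUP = 10

def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F ratio and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)

def contrast_t(means, sds, weights, n):
    """t statistic for a planned contrast, assuming equal n and a pooled error term."""
    mse = sum(sd ** 2 for sd in sds) / len(sds)  # pooled variance when n is equal
    estimate = sum(w * m for w, m in zip(weights, means))
    std_error = math.sqrt(mse * sum(w ** 2 for w in weights) / n)
    return estimate / std_error

def chi_square_independence(table):
    """Pearson chi-square (no continuity correction) for a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Contrast weights from the text: descriptive deviant = 2, other conditions = -1.
WEIGHTS = [2, -1, -1]

eta_self = partial_eta_squared(5.62, 2, 27)
eta_observer = partial_eta_squared(2.91, 2, 27)

# Means and SDs from Table 3, ordered descriptive deviant, nondeviant, prescriptive deviant.
t_self = contrast_t([2.90, 1.79, 2.09], [1.02, 0.58, 0.63], WEIGHTS, N_PER_GROUP)
t_observer = contrast_t([3.35, 2.40, 2.85], [1.03, 0.66, 0.91], WEIGHTS, N_PER_GROUP)

# Rows: gave permission, declined. Columns follow the same condition order.
chi2_permission = chi_square_independence([[5, 0, 2], [5, 10, 8]])

print(round(eta_self, 3), round(eta_observer, 3))
print(round(t_self, 2), round(t_observer, 2))
print(round(chi2_permission, 2))
```

Under these assumptions the recomputed effect sizes and chi-square agree with the reported values, and the contrast statistics land within rounding error of the reported t values, which is expected because Table 3 gives means and standard deviations to only two decimals.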

Table 3
Mean Self-Reported and Observer-Rated Pride by Condition, Study 2

                  Descriptive deviant    Nondeviant        Prescriptive deviant
Pride               M        SD           M       SD         M        SD
Self-reported     2.90a     1.02        1.79b    .58       2.09b     .63
Observer-rated    3.35a     1.03        2.40b    .66       2.85ab    .91

Note. Within each row, columns with different subscripts are significantly different from one another.

Discussion

The results of Study 2 demonstrate that the relation between descriptive deviance and opinion expression is not limited to hypothetical situations, nor is it limited to self-reported feelings of comfort. When participants were actually instructed to deliver a speech on a controversial issue, those who took the position of a descriptive deviant reported feeling and were rated by an observer as being more proud than did those who took either the nondeviant or prescriptive deviant position. Furthermore, participants in the descriptive deviant condition were especially willing to give permission for their speeches to be used in future research.

One potential limitation of Study 2 (as well as Study 1) is the fact that the descriptive deviant position was always the liberal position. Thus, it cannot be ruled out that those holding extremely liberal positions tend generally to be more comfortable expressing their opinions than do their more conservative peers, even when the liberal position does not represent the prescriptive group norm. The fact that participants were more comfortable expressing descriptive deviant (extremely liberal) positions when the positions were not their own and had merely been assigned to them is reassuring, but still it is not conclusive. A more compelling demonstration would show that descriptive deviants were more comfortable expressing their opinions even when those opinions were conservative. We sought to perform just such a demonstration in Study 3. In addition, unlike Studies 1 and 2, Study 3 employed a measure of opinion expression that occurred naturally in the field.

Study 3

One form of opinion expression (especially in political domains) is the display of bumper stickers on one’s vehicle. The aftermath of the 2004 U.S. presidential election provided an appropriate context in which to examine this form of opinion expression among descriptive and prescriptive deviants. In Study 3, we accomplished this by recording the proportions of pro-Democratic and pro-Republican bumper stickers in two different counties of California: a “Blue” county (i.e., one with more registered Democrats than Republicans) and a “Red” county (i.e., one with more registered Republicans than Democrats).

We argue that the distinction between Democratic and Republican bumper sticker holders is not simply that between individuals who espouse the majority and minority positions in each location. A bumper sticker supporting a particular political candidate or party identifies one not just as a member of a minority or majority group but as an especially committed member of that group. Thus, we argue—indeed, we demonstrate in survey results reported—that displaying a sticker on one’s vehicle in support of a candidate is a sufficiently distinctive act that it qualifies one as a statistical deviant, irrespective of majority/minority status.

For purposes of this study, Democratic sticker holders were classified as descriptive deviants in the Blue county and prescriptive deviants in the Red county. Conversely, Republican sticker holders were categorized as descriptive deviants in the Red county and prescriptive deviants in the Blue county. We hypothesized that, irrespective of the prescriptive county norm, descriptive deviant stickers would be overrepresented and prescriptive deviant stickers underrepresented in each county. More precisely, we expected that the ratio of descriptive deviant stickers to prescriptive deviant stickers in each county would be significantly higher than the ratio of descriptive deviant voters to prescriptive deviant voters.

Method

The study was conducted at six different Target store locations—three in Santa Clara County (the Blue county) and three in San Diego County (the Red county)—over a 2-week period in March 2005. There were 8 days of data collection in total. To minimize the chances that the same sticker would be counted more than once (e.g., on different days or at different times), we assigned one researcher to each parking lot. On the days of the study, the researchers spent approximately 45 min circulating around the parking lots of each store and tallying the numbers of political bumper stickers that were displayed. Pro-Kerry and anti-Bush stickers were both considered to indicate Democratic voters, whereas pro-Bush stickers were classified as indicating Republican voters.2 There were no anti-Kerry stickers in either county.

To measure additional variables of potential interest, the researchers put an addressed and stamped envelope containing a short survey on all sticker-sporting vehicles. Of the 53 individuals with bumper stickers on their vehicles, 30 (57%) returned the survey. Respondents were asked to indicate how they themselves felt about George W. Bush and John Kerry, as well as how they thought the average resident of their county felt about the two candidates. Responses to these questions were made on a 10-point scale (1 = don’t like him at all, 10 = like him very much). The order in which the questions about Bush and Kerry were presented was counterbalanced across participants.

Although participants were not asked to specify their county of residence, the survey items were worded to pertain to either San Diego County or Santa Clara County (e.g., “How would you describe the average San Diego [Santa Clara] County resident’s feelings toward George W. Bush [John Kerry]?”). Thus, we can be reasonably assured that participants who provided answers to these questions were from the county in which the study was conducted. Only 1 participant—a Republican sticker holder in San Diego County—indicated that he or she was not a resident of that county and thus did not complete the survey. The data from that participant were omitted from all subsequent analyses.

Results and Discussion

Bumper sticker ratios. The results from this study were analyzed using chi-square tests and are depicted in Table 4. As a manipulation check (i.e., establishing that descriptive deviants in both Santa Clara and San Diego counties were more vocal than prescriptive deviants), we first looked at the prevalence of Democratic and Republican bumper stickers in each locale. As expected, in Santa Clara County, the number of Democratic (descriptive deviant) bumper stickers exceeded the number of Republican (prescriptive deviant) stickers. In San Diego County, however, there were more Republican (descriptive deviant) stickers than Democratic (prescriptive deviant) stickers. The difference between the counties was significant, χ²(1, N = 52) = 21.0, p < .001.

Next, we tested our hypothesis that descriptive deviant stickers would be overrepresented in both Santa Clara and San Diego counties. To do this, we compared the proportion of descriptive deviant (vs. prescriptive deviant) stickers to that of descriptive deviant (vs. prescriptive deviant) registered voters in each locale (California Secretary of State, 2004).3 Consistent with our hypothesis, the ratio of Democratic-to-Republican stickers was significantly higher than the ratio of Democratic-to-Republican registered voters in Santa Clara County, χ²(1, N = 15) = 7.10, p < .01, indicating that the descriptive deviant (Democratic) voice was indeed overrepresented in the population. However, the Democratic-to-Republican sticker ratio was significantly lower than the Democratic-to-Republican registered voter ratio in San Diego County, χ²(1, N = 37) = 3.85, p < .05 (see Table 4). That is, there were disproportionately more descriptive deviant (Republican) stickers than prescriptive deviant (Democratic) stickers.

2 It would have been interesting to separate pro-Kerry and anti-Bush stickers into different categories and compare their popularities within each county. However, given our small sample size, we were unable to perform such an analysis.

Table 4
Proportions of Democratic and Republican Bumper Stickers and Registered Voters in Santa Clara and San Diego Counties, California, Study 3

                          Santa Clara County                 San Diego County
                     Democratic(a)    Republican(b)     Democratic(b)    Republican(a)
Variable               %       n        %       n         %       n        %       n
Bumper stickers      93.0     14       7.0      1       27.0     10      73.0     27
Registered voters    62.9             37.1              43.0             57.0

(a) Descriptive deviant. (b) Prescriptive deviant.

Survey results. The first goal of the survey was to establish that displaying a bumper sticker identifies oneself as an extremist (i.e., a descriptive or prescriptive deviant), regardless of majority or minority status. To do so, we compared Democratic and Republican respondents’ own feelings toward Bush and Kerry—as reported in the survey—with those of the average resident of their county, using paired samples t tests. Supporting the assumption that bumper sticker holders perceived themselves to be extreme partisans, Democratic respondents in each county reported that they liked Kerry and disliked Bush significantly more than did the average resident of their county. Conversely, Republican respondents in San Diego County reported that they both liked Bush and disliked Kerry significantly more than did the average San Diego County resident (see Table 5). We could not conduct a similar analysis with Republican Santa Clara County residents because there was only 1 such individual in our sample. These findings demonstrated that both descriptive and prescriptive deviants were aware that their opinions were statistically extreme.
The survey results also allowed us to rule out a feasible alternative explanation for the disproportionate sticker-to-vote ratios: specifically, that descriptive deviants (i.e., Democrats) in Santa Clara County are simply more extreme than prescriptive deviants (i.e., Democrats) in San Diego County. In fact, San Diego County Democrats liked Kerry (n = 7, M = 8.29, SD = 0.95) just as much as their Santa Clara County counterparts did (n = 10, M = 8.30, SD = 0.95), t(15) = −.03, ns. Further, there was no difference in the extent to which Democrats from each county disliked Bush (M = 1.14, SD = 0.38 for San Diego County; M = 1.50, SD = 0.85 for Santa Clara County), t(15) = −1.04, ns. Given the limited number of Santa Clara County Republicans in our sample, we could not compare Republicans from Santa Clara County with those from San Diego County.

Summary. Study 3 showed that descriptive deviants were more willing to express their political opinions via bumper stickers than were prescriptive deviants, regardless of whether descriptive deviance was defined as extremely liberal or extremely conservative. This phenomenon occurred even though sticker holders in each county had a generally accurate perception of how their feelings toward each of the two presidential candidates compared with those of the average resident of their county. In fact, in light of the results of Studies 1 and 2, descriptive deviants’ awareness of their superior conformity may have been what prompted them to publicize their position in the first place.
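Two of the Study 3 tests can be approximately recomputed from the values in Tables 4 and 5. The sketch below is illustrative and rests on assumptions the article does not state: that the San Diego County comparison is a one-sample goodness-of-fit chi-square against the registered-voter proportions, that the between-county comparison is a pooled-variance t test, and that the Bush ratings come from the same 7 and 10 Democratic respondents reported for the Kerry ratings. The function names are hypothetical.

```python
import math

def chi_square_goodness_of_fit(observed, expected_proportions):
    """Pearson chi-square comparing observed counts with expected proportions."""
    total = sum(observed)
    stat = 0.0
    for count, proportion in zip(observed, expected_proportions):
        expected = total * proportion
        stat += (count - expected) ** 2 / expected
    return stat

def pooled_t(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Independent-samples t statistic with a pooled variance estimate."""
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_error = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_error, df

# San Diego County, Table 4: 10 Democratic and 27 Republican stickers,
# against 43.0% Democratic and 57.0% Republican registered voters.
chi2_san_diego = chi_square_goodness_of_fit([10, 27], [0.430, 0.570])

# Democrats' self-ratings of Bush from Table 5: San Diego vs. Santa Clara.
# Assumption: n = 7 and n = 10, as reported for the Kerry ratings.
t_bush, df_bush = pooled_t(1.14, 0.38, 7, 1.50, 0.85, 10)

print(round(chi2_san_diego, 2))
print(round(t_bush, 2), df_bush)
```

Under these assumptions the San Diego County statistic and the Bush comparison both match the reported values to two decimals. The same goodness-of-fit calculation does not reproduce the Santa Clara County statistic from the Table 4 percentages, so that test may have used a different computation or inputs that the table does not show.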

General Discussion

The present research takes as its point of departure the assumption that there are two norms from which group members can deviate. The first is the descriptive or statistical norm of the group; individual members’ attitudes can be closer to or further from the average attitude. The second is the prescriptive or prototypical norm of the group; individual members’ attitudes can be closer to or further from the desirable attitude. The existence of these two (often imperfectly correlated) standards means that deviance can take two forms: divergence from the average attitude and divergence from the desirable attitude. Members who deviate from the descriptive norm of the group in a direction away from the prescriptive norm are deviants with respect to both standards. We have used the term prescriptive deviants to describe them. On the other hand, members who deviate from the descriptive norm in the direction of the prescriptive norm are deviants in relation to the first standard, but not in relation to the second. We have used the term descriptive deviants to describe them.

The present research thus departs from traditional perspectives on opinion expression, in which opinions tend to be dichotomously categorized as minority/majority (e.g., Asch, 1956; Matz & Wood, 2005) or to be defined solely based on the magnitude of deviation from the group average (e.g., Bassili, 2003; Glynn et al., 1997). In so doing, the present research introduces a novel framework for classifying (and predicting the behavior of) those group members who hold deviant opinions.

3 We used the percentages of registered voters, rather than the percentages of residents who actually voted for Bush and Kerry, because the former were available for each particular city in which data were collected. The latter were only available countywide. Thus, the data on the percentages of registered voters in each county allowed a more precise test of our hypothesis.

Table 5
Bumper Sticker Holders’ Ratings of George W. Bush and John Kerry by County, Study 3

                                            Bush ratings                            Kerry ratings
                                   Self       Average resident             Self       Average resident
County        Party affiliation   M     SD      M      SD       t         M     SD      M      SD       t
San Diego     Democratic        1.14   0.38    5.86   1.46    9.04*      8.29  0.95    5.57   1.13    6.45*
San Diego     Republican        9.18   0.75    7.45   1.63    3.19*      1.64  1.03    3.73   1.27    5.04*
Santa Clara   Democratic        1.50   0.85    3.70   1.42   −4.70*      8.30  0.95    7.30   0.95    2.37*

Note. Ratings derived from 10-point scale (1 = don’t like him at all, 10 = like him very much).
* p < .05

The distinction between descriptive and prescriptive deviants suggests that not all those whose attitudes diverge from the group average will feel equally deviant. Specifically, those group members whose opinions are statistically extreme but congruent with the prescriptive norm will feel less deviant, and hence will be more willing to express their attitudes, than will those group members whose opinions are both statistically extreme and incongruent with the prescriptive norm, as well as those group members whose attitudes are not extreme. Our proposed explanation for this result is that while both descriptive and prescriptive deviants recognize that their opinions render them different from their peers, the former— but not the latter—also think that their extreme opinions render them superior conformists (i.e., different in a good way) relative to their peers, even their nondeviant peers. The present studies provide strong support for this hypothesis. In Studies 1 and 2, we found a consistent relation between direction of deviance and opinion expression among college students. Study 1 showed that participants were more comfortable expressing descriptive deviant (vs. nondeviant or prescriptive deviant) opinions that were randomly assigned to them and that the relation between descriptive deviance and comfort was mediated by descriptive deviants’ perception of superior conformity (i.e., that they differed from their peers in a good way). Study 2 found that participants assigned to argue for a descriptive deviant position experienced more pride, were rated as being more proud by an observer, and were more willing to publicly express their assigned position than were participants assigned to argue for a nondeviant or prescriptive deviant position. 
Study 3 replicated the first two studies in a field context and demonstrated that descriptive deviants were more vocal than prescriptive deviants, regardless of whether the prescriptive group norm was liberal (as it was in Studies 1 and 2) or conservative.

Deviance, Superior Conformity, and Opinion Expression

The present studies found that descriptive deviants were aware that their opinions diverged from that of the average group member (an indicator of difference). However, because they knew that their opinions conformed to the prescriptive group norm (an indicator of similarity), they were not rendered uncomfortable by their divergence from the average group opinion. In fact, they felt superior to their peers and proud of themselves as a result of being deviant (see Blanton & Christie, 2003; Myers, 1978). These findings are consistent with the idea that people strive to satisfy their competing needs for similarity and difference (e.g., Brewer, 1991; Hornsey & Jetten, 2004; Snyder & Fromkin, 1980). However, our findings extend the idea by providing empirical evidence for a strategy through which these needs can be satisfied. Specifically, individuals can conform even more than other group members to group norms (Codol, 1975)—in the present context, by expressing opinions that make them different but good.

An important conclusion of the present research is that holding deviant attitudes will not inevitably produce conflict in people (e.g., Bassili, 2003; Matz & Wood, 2005; Nemeth & Wachtler, 1983). Group members holding deviant attitudes will only experience marginality, and the attendant reluctance to express their attitudes, when they perceive that their attitudes diverge not just from the average group opinion but also from the desirable group opinion. Those whose attitudes diverge from the average group opinion but not from the desirable group opinion will actually feel emboldened by the knowledge that their attitudes make them superior conformists.

In demonstrating that descriptive deviants are more willing to express their opinions because of feelings of superior conformity, our research distinguishes itself from that of Abrams and colleagues (Abrams et al., 2000, 2002, 2005). Because a key assumption of their model is that people seek to differentiate their ingroup from outgroups, they too may have predicted that descriptive deviants (or more generally, those who validate group norms to the extreme) would be more vocal than either prescriptive deviants (or more generally, those who invalidate group norms to the extreme) or nondeviants. However, Abrams et al. did not test this prediction directly, nor did they show that descriptive deviants seek to distinguish themselves within their group by being superior conformists. Future research should determine whether descriptive deviants exhibit higher levels of opinion expression not only because they perceive themselves as different but good, but also because they perceive themselves as helping to clarify the norms of the ingroup. For example, it is possible that the relation between descriptive deviance and opinion expression is strongest when the norms of the ingroup are made salient (Marques, Abrams, Paez, & Martinez-Taboada, 1998; Marques, Abrams, & Serodio, 2001) or when differences between the ingroup and a relevant outgroup are reinforced (Abrams et al., 2000, 2002).

Opinion Expression in Single-Group Versus Two-Group Contexts

As noted in the introduction, the present article focused on contexts involving a single group whose members’ attitudes toward particular issues were roughly normally distributed. In Studies 1 and 2, we measured the willingness of college students on a moderately liberal campus to express their attitudes toward important campus issues to their peers. In Study 3, we measured the willingness of county residents to display either liberal or conservative political bumper stickers on their vehicles. However, because previous research has suggested that perceptions of group members’ attitudes are often affected by the social context (Doosje, Haslam, Spears, Oakes, & Koomen, 1998; Ellemers, Spears, & Doosje, 2002), we might predict different results depending on which group identity is most accessible in people’s minds. For instance, if participants in Studies 1 and 2 had been thinking about their political identity (i.e., as liberal or conservative) instead of their Stanford student identity, then they likely would have perceived the distribution of student attitudes as bimodal rather than normal. In other words, they would have perceived Stanford students to be divided into two groups with two prescriptive norms (liberal and conservative), instead of a single group with one prescriptive norm (liberal).

When a single group context is inferred, then descriptive deviants—as demonstrated in the present studies—should be more likely than other group members to feel emboldened because their position as descriptive deviants sets them apart from their peers in a direction consistent with the (single) prescriptive group norm. By contrast, when a two-group context is inferred, the concept of descriptive deviance should carry less relevance because group members on either extreme would be considered nondeviants or majorities within that population. Group members who hold moderate attitudes would be considered deviants or minorities within both populations, given their departure from both prescriptive group norms (see Levine & Ranelli, 1978). In this situation, the relation between descriptive deviance and opinion expression would not necessarily emerge, nor would the relationship between descriptive deviance and perceptions of superior conformity. Instead, those with extreme attitudes would simply be more willing to express their opinions within their group than those with moderate attitudes, perhaps because the former feel similar to or well liked by other group members.

Conclusion

The present results suggest that the public discourse in groups with strong social identities will tend to be skewed away from the average position of the group in the direction of the desirable position of the group. What this means in relation to Noelle-Neumann’s (1984) speculation about a “spiral of silence” is that those group members whose private positions deviate from both the descriptive and prescriptive norms are especially prone to remaining silent. By contrast, those group members whose private positions deviate from the descriptive (but not prescriptive) norm are especially prone to voicing their opinions. In our studies, we found that group members were reasonably accurate in their perceptions of the distance between their own attitude and the average group attitude. Over time, however, descriptive deviants’ willingness to speak out could lead to pluralistic ignorance and a mistaken sense of the group position by both ingroup and outgroup members (Miller & McFarland, 1987; Prentice & Miller, 1996). It could also induce a gradual shift in the private opinions of group members toward the desirable opinion (see Myers, 1978; Paicheler, 1977). The changing of descriptive and prescriptive group norms in this manner, in turn, could further increase political and ideological extremism.

References

Abrams, D., de Moura, G. R., Hutchison, P., & Viki, G. T. (2005). When bad becomes good (and vice versa): Why social exclusion is not based on difference. In D. Abrams, M. A. Hogg, & J. M. Marques (Eds.), Social exclusion and inclusion (pp. 161–191). Hove, East Sussex, United Kingdom: Psychology Press.

Abrams, D., Marques, J., Bown, N., & Dougill, M. (2002). Anti-norm and pro-norm deviance in the bank and on the campus: Two experiments on subjective group dynamics. Group Processes & Intergroup Relations, 5, 163–182.

Abrams, D., Marques, J. M., Bown, N., & Henson, M. (2000). Pro-norm and anti-norm deviance within and between groups. Journal of Personality and Social Psychology, 78, 906–912.

Asch, S. E. (1956). Studies of independence and conformity: A minority of one against a unanimous majority. Psychological Monographs, 70 (Whole No. 416).

Baron, R. M., & Kenny, D. A. (1986). The moderator–mediator variable distinction in social psychological research: Conceptual, strategic, and statistical considerations. Journal of Personality and Social Psychology, 51, 1173–1182.

Bassili, J. N. (1995). On the psychological reality of party identification: Evidence from the accessibility of voting intentions and of partisan feelings. Political Behavior, 17, 339–358.

Bassili, J. N. (2003). The minority slowness effect: Subtle inhibitions in the expression of views not shared by others. Journal of Personality and Social Psychology, 84, 261–276.

Blanton, H., & Christie, C. (2003). Deviance regulation: A theory of identity and action. Review of General Psychology, 7, 115–149.

Brewer, M. B. (1991). The social self: On being the same and different at the same time. Personality and Social Psychology Bulletin, 17, 475–482.

Brewer, M. B., & Gardner, W. (1996). Who is this we? Levels of collective identity and self-representations. Journal of Personality and Social Psychology, 71, 83–93.

California Secretary of State. (2004, 18 October). Report of registration: Registration by political subdivision by county. Retrieved October 16, 2007, from http://www.ss.ca.gov/elections/ror_10182004.htm

Cialdini, R. B., Reno, R. R., & Kallgren, C. A. (1990). A focus theory of normative conduct: Recycling the concept of norms to reduce littering in public places. Journal of Personality and Social Psychology, 58, 1015–1026.

Cialdini, R. B., & Trost, M. R. (1998). Social influence, social norms, conformity, and compliance. In D. T. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), The handbook of social psychology (Vol. 2, pp. 151–192). New York: Oxford University Press.

Codol, J.-P. (1975). On the so-called “superior conformity of the self” behavior: Twenty experimental investigations. European Journal of Social Psychology, 5, 457–501.

Codol, J.-P. (1984). Social differentiation and non-differentiation. In H. Tajfel (Ed.), The social dimension (pp. 314–337). Cambridge, United Kingdom: Cambridge University Press.

Doosje, B., Haslam, S. A., Spears, R., Oakes, P. J., & Koomen, W. (1998). The effect of comparative context on central tendency and variability judgments and the evaluation of group characteristics. European Journal of Social Psychology, 28, 173–184.

Ellemers, N., Spears, R., & Doosje, B. (2002). Self and social identity. Annual Review of Psychology, 53, 161–186.

Glynn, C. J., Hayes, A. F., & Shanahan, J. (1997). Perceived support for one’s opinions and willingness to speak out: A meta-analysis of survey studies on the “spiral of silence.” Public Opinion Quarterly, 61, 452–463.

Hensley, V., & Duval, S. (1976). Some perceptual determinants of perceived similarity, liking, and correctness. Journal of Personality and Social Psychology, 34, 159–168.

Hornsey, M. J., & Jetten, J. (2004). The individual within the group: Balancing the need to belong with the need to be different. Personality and Social Psychology Review, 8, 248–264.

Huckfeldt, R., & Sprague, J. (2000). Political consequences of inconsistency: The accessibility and stability of abortion attitudes. Political Psychology, 21, 57–79.

Judd, C. M., & Johnson, J. T. (1981). Attitudes, polarization, and diagnosticity: Exploring the effect of affect. Journal of Personality and Social Psychology, 41, 26–36.

Krosnick, J. A., Boninger, D. S., Chuang, Y. C., Berent, M. K., & Carnot, C. G. (1993). Attitude strength: One construct or many related constructs? Journal of Personality and Social Psychology, 65, 1132–1151.

Levine, J. M. (1989). Reactions to opinion deviance in small groups. In P. B. Paulus (Ed.), The psychology of group influence (pp. 187–231). Hillsdale, NJ: Erlbaum.

Levine, J. M., & Ranelli, C. J. (1978). Majority reaction to shifting and stable attitudinal deviates. European Journal of Social Psychology, 8, 55–70.

Lynn, M., & Snyder, C. R. (2002). Uniqueness theory. In C. R. Snyder & S. J. Lopez (Eds.), Handbook of positive psychology (pp. 395–410). New York: Oxford University Press.

Marques, J. M., Abrams, D., Paez, D., & Martinez-Taboada, C. (1998). The role of categorization and in-group norms in judgments of groups and their members. Journal of Personality and Social Psychology, 75, 976–988.

Marques, J. M., Abrams, D., & Serodio, R. G. (2001). Being better by being right: Subjective group dynamics and derogation of in-group deviants when generic norms are undermined. Journal of Personality and Social Psychology, 81, 436–447.

Marques, J. M., & Paez, D. (1994). The “black sheep effect”: Social categorization, rejection of ingroup deviants, and perception of group variability. In W. Stroebe & M. Hewstone (Eds.), European Review of Social Psychology (Vol. 5, pp. 37–68). Chichester, United Kingdom: Wiley.

Matz, D. C., & Wood, W. (2005). Cognitive dissonance in groups: The consequences of disagreement. Journal of Personality and Social Psychology, 88, 22–37.

Miller, D. T., & McFarland, C. (1987). Pluralistic ignorance: When similarity is interpreted as dissimilarity. Journal of Personality and Social Psychology, 53, 298–305.

Myers, D. G. (1978). Polarizing effects of social comparison. Journal of Experimental Social Psychology, 14, 554–563.

Myers, D. G., Wojcicki, S. B., & Aardema, B. S. (1977). Attitude comparison: Is there ever a bandwagon effect? Journal of Applied Social Psychology, 7, 341–347.

Nemeth, C., & Wachtler, J. (1983). Creative problem solving as a result of majority versus minority influence. European Journal of Social Psychology, 13, 45–55.

Nisbett, R. E., & Kunda, Z. (1985). Perception of social distributions. Journal of Personality and Social Psychology, 48, 297–311.

Noelle-Neumann, E. (1984). The spiral of silence (pp. 1–57). Chicago: University of Chicago Press.

Paicheler, G. (1976). Norms and attitude change I: Polarization and styles of behavior. European Journal of Social Psychology, 6, 405–427.

Prentice, D. A., & Miller, D. T. (1996). Pluralistic ignorance and the perpetuation of social norms by unwitting actors. In M. P. Zanna (Ed.), Advances in Experimental Social Psychology (Vol. 29, pp. 161–209). San Diego, CA: Academic Press.

Tracy, J. L., & Robins, R. W. (2007). The psychological structure of pride: A tale of two facets. Journal of Personality and Social Psychology, 92, 506–525.

Vignoles, V. L., Chryssochoou, X., & Breakwell, G. M. (2000). The distinctiveness principle: Identity, meaning, and the bounds of cultural relativity. Personality and Social Psychology Review, 4, 337–354.

Received February 13, 2007
Revision received August 28, 2007
Accepted September 8, 2007

PERSONALITY PROCESSES AND INDIVIDUAL DIFFERENCES

Making Choices Impairs Subsequent Self-Control: A Limited-Resource Account of Decision Making, Self-Regulation, and Active Initiative

Kathleen D. Vohs, University of Minnesota
Roy F. Baumeister, Florida State University
Brandon J. Schmeichel, Texas A&M University
Jean M. Twenge, San Diego State University
Noelle M. Nelson, University of Minnesota
Dianne M. Tice, Florida State University

The current research tested the hypothesis that making many choices impairs subsequent self-control. Drawing from a limited-resource model of self-regulation and executive function, the authors hypothesized that decision making depletes the same resource used for self-control and active responding. In 4 laboratory studies, some participants made choices among consumer goods or college course options, whereas others thought about the same options without making choices. Making choices led to reduced self-control (i.e., less physical stamina, reduced persistence in the face of failure, more procrastination, and less quality and quantity of arithmetic calculations). A field study then found that reduced self-control was predicted by shoppers’ self-reported degree of previous active decision making. Further studies suggested that choosing is more depleting than merely deliberating and forming preferences about options and more depleting than implementing choices made by someone else and that anticipating the choice task as enjoyable can reduce the depleting effect for the first choices but not for many choices.

Keywords: choice, self-regulation, self-control, decision making, executive function

Kathleen D. Vohs and Noelle M. Nelson, Marketing Department, Carlson School of Management, University of Minnesota; Roy F. Baumeister and Dianne M. Tice, Department of Psychology, Florida State University; Brandon J. Schmeichel, Department of Psychology, Texas A&M University; Jean M. Twenge, Department of Psychology, San Diego State University. Preparation of this article was supported by National Institutes of Health Grant MH12794 to Kathleen D. Vohs and Grant MH 57039 to Roy F. Baumeister, funding from the Social Sciences and Humanities Research Council to Kathleen D. Vohs, and support from the Canada Research Chair Council and the McKnight Land-Grant Professorship program to Kathleen D. Vohs. We would like to thank Alison Boyce, Melissa Lassiter, Sloane Rampton, Denise Kartchner, Krystal Hansen, Mandee Lue Chatterton, Louis Wagner, Allison Park, Erica Greaves, Karyn Cirino, and Megan Kimbrell for assistance conducting the studies included in this article. Correspondence concerning this article should be addressed to Kathleen D. Vohs, Marketing Department, Carlson School of Management, University of Minnesota, Suite 3-150, 321 19th Avenue South, Minneapolis, MN 55455. E-mail: [email protected]

Journal of Personality and Social Psychology, 2008, Vol. 94, No. 5, 883–898. Copyright 2008 by the American Psychological Association. 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.883

The rich complexity of human social life is partly attributable to choice. Each day millions of people make multiple decisions. These range from momentous and far-reaching decisions, such as what career to pursue and whether to order the troops into battle, to relatively fleeting and inconsequential choices, such as whether to take another cup of tea or to floss that night. Moreover, choices have proliferated, increasing the number of decisions people can (and must) make. The diversity of consumer product selection has expanded exponentially, such that the average American supermarket in 1976 carried 9,000 different unique products, whereas 15 years later that figure had ballooned to 30,000 (Waldman, 1992). It is estimated that there are currently 1 million SKUs (stock keeping units, thus unique specific products) in the US and that the average supermarket carries 40,000 of them (Trout, 2005). The coffee shop chain Starbucks boasted in 2003 that it offered each customer 19,000 beverage possibilities at every store. Similar proliferations of alternatives have occurred with television channels, dating partners, investment options, and in countless other spheres.

Has the proliferation of choice uniformly made life easier and better? Possibly not. Consumer behavior scientists long have observed that consumers feel frustrated and overwhelmed with the intense information demands that accompany large assortments (Huffman & Kahn, 1998; Malhotra, 1982). Iyengar and Lepper (2000) found that consumers who faced 24 options, as opposed to 6 options, were less willing to decide to buy anything at all, and those who did buy were less satisfied with their purchase. Such findings suggest that choice, to the extent that it requires greater decision making among options, can become burdensome and ultimately counterproductive. Although we do not argue that having no choice is good, recent commentaries have denounced the notion of ever-increasing choice, using words like “relentless” and “inescapable” (Mick, 2005) to describe this “tyranny of freedom” (Schwartz, 2000, p. 79).

The present investigation was designed to offer a possible explanation for the detrimental effects of choosing. Our approach was based on recent evidence that the self’s executive function relies on a limited resource that resembles a form of strength or energy. Past work has mainly established that this resource is depleted in acts of self-regulation (Baumeister, 2002; Muraven, Tice, & Baumeister, 1998; Vohs & Heatherton, 2000), but it may also be used in other executive activities of the self, most notably in making choices. We hypothesized that this resource is the same as that used for self-regulation. As a result, one repercussion of making choices could be a subsequent reduction in effective self-regulation due to a lack of resources to put toward subsequent tasks and challenges.

Choice and Control

By some analyses, human life is full of constant choices, insofar as almost every time one acts, one could probably have done something different (Sartre, 1956; but cf. Hofmann, Strack, & Deutsch, in press). By that definition, the above Starbucks example would entail that every customer makes 19,000 choices with every order. We use the term choice in a more limited sense, however, to refer to choices made by a conscious consideration among alternatives. Much of the time people proceed by routine, habit, and automatic processes (Bargh, 2002). We consider the contemplation of alternatives and selection among them to be a meaningful and effortful internal act that involves more than habitual behavior.

The most advanced form of choosing involves weighing information about currently available options so as to select the option that seems most promising. This process would be the most flexible and potentially the most adaptive in terms of promoting survival and reproduction (especially in the multidimensional social environment known as human culture), but it requires the most elaborate information-processing apparatus and the most pliant behavior control system, which would suggest that it is a costly skill. The cost of such choosing is our current focus.

Self-Regulatory Resource Depletion

The self’s executive function is the agent that makes decisions, initiates and maintains action, and regulates the self by operating on its inner states (Baumeister, 1998). We define self-regulation as the self exerting control to override a prepotent response, with the assumption that replacing one response with another is done to attain goals and conform to standards. Recent findings have indicated that many of the self’s activities depend on a common resource, akin to energy or strength. This step encompasses responses designed to move the person from the current point toward the standard (cf. operate mode in cybernetic models; Carver & Scheier, 1990). All of these activities draw on the same resource, which is limited and seems easily depleted.

A series of studies has provided evidence that some self-resource is depleted by acts of self-regulation. Baumeister, Bratslavsky, Muraven, and Tice (1998) and Muraven et al. (1998) showed that performing one act of regulating the self impaired performance on a subsequent, seemingly unrelated act of self-control. Presumably, the first act of self-control depleted some common resource that would have been needed to perform better at the second act of self-control. Depletion of the self’s resources (also termed ego depletion) has been linked to multiple behavioral problems, including overeating by dieters (Vohs & Heatherton, 2000), prejudicial responding (Richeson & Shelton, 2003), ineffective self-presentation (Vohs, Baumeister, & Ciarocco, 2005), intellectual underachievement (Schmeichel, Vohs, & Baumeister, 2003), inappropriate sexual responses (Gailliot & Baumeister, 2007), and impulsive overspending (Vohs & Faber, 2007).

Self-regulation and decision making may share more than simply being housed under the executive function of the self. The core question of the present research was whether the resources that drive self-regulation might also govern other activities of the executive function, such as decision making (Vohs, 2006). If so, then making choices should lead to impaired self-control afterward, even on tasks unrelated to making those choices.

Choice Can Impair Self-Control

There are several reasons to think that choosing would deplete the self’s strength. These reasons also differentiate the act of deliberation from that of choosing. Self-regulation presumably consumes resources because the self must override one response and then substitute a different response, and energy is needed to perform these interrupt and initiate functions. In support of the uniqueness of choosing, the reflective–implemental model (Strack, Werth, & Deutsch, 2006) conceptualizes choosing as a quasibehavioral act that ties the selected option to the self via the creation of a mental representation. The initiation of a mental link between the active, intentional, reflective part of the self and the desired option also suggests an energy-consuming act that would deplete regulatory resources (Vohs, 2006).

Prior work has contained mixed findings about whether choosing depletes resources. One study found evidence of depletion using a dissonance paradigm, in which making a choice to perform a counterattitudinal behavior resulted in subsequent impairment in self-control (Baumeister et al., 1998). This finding could mean that choosing depletes the self’s resources but may also mean that dissonance-reduction processes were depleting. Moller, Deci, and Ryan (2006) produced evidence that participants who freely chose their favorite option showed no signs of depletion. They concluded that autonomous choice is not depleting.

We readily accept that some choices are more depleting than others. Pleasantness might well mitigate the impact of choosing, especially if only a few choices are made. Still, we reasoned that making a choice involves a special intrapersonal act. This step, which commits the person to a course of action (Strack et al., 2006), may take effort above and beyond merely thinking about possible options. Hence choosing may consume some of the self’s limited supply of energy, thereby rendering the resource less available for further demands.

DECISION FATIGUE IMPAIRS SELF-REGULATION

Pilot Study

The pilot study was designed to justify the assumptions behind the choice procedure that was to be used in Experiments 1–4, and it also validated a self-report measure for use in Study 5. The purpose was to show that we could measure the exertion involved in choosing.

Method

Participants. Participants were 34 undergraduate students (20 men) who participated in exchange for partial course credit.

Procedure. Participants were randomly assigned to either make choices or rate products. They were given a list of specific varieties of products, such as colored pens, scented candles, popular magazines, and colored t-shirts. Participants in the no-choice condition were asked to indicate the extent to which they had used each product in the past (on a scale from 1 = never to 5 = very often). Participants in the choice condition were given the same list of products but were instructed to choose between two different versions of each product (e.g., a white t-shirt vs. a black t-shirt, a red pen vs. a purple pen). Participants were told that they would receive a small gift on the basis of their choices or ratings (depending on condition). Thus, participants’ responses had potentially real (though relatively minor) outcomes. Both conditions faced a questionnaire with 60 items on it, but only in the choices condition were the items asking for decisions.

Subsequently, participants completed the state version of the Positive and Negative Affect Schedule (PANAS; Watson, Clark, & Tellegen, 1988) and an eight-item questionnaire that served as the manipulation check of the methods. Two of the items asked about the extent to which the previous task had involved making choices, two items asked about engaging in deliberation and careful consideration, one item asked if the task reflected the participant’s own choosing, another item asked about whether the task involved thinking about options, and yet another asked about how active the participant had felt during the previous task. The last item asked for ratings of fatigue. The first seven items were designed to tap into the different aspects of choice making that are important in the depletion of self-resources; the last item on fatigue was included to see whether participants reported feeling more tired after making multiple choices. After completing the product-task questionnaire, participants were debriefed and thanked.

Results and Discussion

A factor analysis of the eight items showed that one factor accounted for 43% of the variance in the unrotated solution (eigenvalue = 3.46), whereas the second factor (eigenvalue = 1.27) accounted for an additional 16% of variance. This principal-components analysis extracted a two-component structure (eigenvalues over 1 selected) for the eight items on the choices questionnaire. Factor scores on each of the choice factors were derived for each participant. Of the two factors, the larger one seemed to correspond most closely to the act of making choices, which is to say that the two highest loading items on this factor were perceptions of the extent to which participants’ task involved (a) thinking about different options, followed by (b) making many choices. The other factor seemed to relate most closely to feelings of fatigue, in that the tiredness item made the biggest contribution to this factor.
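The reported percentages can be reproduced from the eigenvalues under one assumption that the text does not state explicitly: that the principal-components analysis was run on standardized items (a correlation matrix), so that each of the eight items contributes one unit of total variance. The short sketch below works through that arithmetic; the function and variable names are illustrative only.

```python
# Eigenvalues reported for the two retained components.
eigenvalues = [3.46, 1.27]

# Number of questionnaire items. Assumption: the analysis used standardized
# items, so the total variance equals the number of items.
n_items = 8


def variance_share(eigenvalue, total_variance):
    """Percentage of total variance attributable to one component."""
    return 100 * eigenvalue / total_variance


shares = [variance_share(ev, n_items) for ev in eigenvalues]

for ev, share in zip(eigenvalues, shares):
    print(f"eigenvalue {ev:.2f} -> {share:.1f}% of total variance")
```

Rounded to whole percentages, the two shares match the 43% and 16% given above.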


Two independent t-tests, one for each factor, were used to predict factor scores as a function of choices condition versus frequency-rating (i.e., no choice) condition. Scores on the Choices factor were significantly predicted by condition, t(32) = 2.65, p < .02, whereas factor scores on the other factor (i.e., Fatigue) showed no differences as a function of condition, t(32) = 0.49, ns. Thus, the main finding of the pilot study was that participants who made choices among products reported being more active, conscious, and deliberative during the task relative to participants who merely rated the frequency with which they had used the products.

The choices task took about a minute longer (M = 210.32 s, SD = 65.98) than did the frequency-rating task (M = 146.32 s, SD = 44.02), and the difference was significant, t(32) = 3.36, p < .01. Time spent on the task did not correlate with either of the two factor scores, Factor 1: r(34) = .07, ns, Factor 2: r(34) = .12, ns.

As mentioned, participants’ first charge after the product task was to complete the PANAS to test for potential mood differences as a function of condition. As expected, condition did not determine positive affect (choices condition: M = 24.31, SD = 7.09; frequency-rating condition: M = 25.05, SD = 6.61) or negative affect (choices condition: M = 13.19, SD = 4.45; frequency-rating condition: M = 11.89, SD = 2.25), t(32) = 0.32, ns.
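As a rough check on the task-duration comparison, the t statistic can be approximated from the reported means and standard deviations. The sketch below uses the standard pooled-variance formula for two independent groups. The group sizes are an assumption: the text gives 34 participants and 32 degrees of freedom but not the per-condition counts, so an even split of 17 and 17 is used here.

```python
import math


def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """Independent-samples t statistic using a pooled variance estimate."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / standard_error, df


# Task duration in seconds (means and SDs from the text).
# Assumption: 17 participants per condition (not reported in the text).
t_stat, df = pooled_t(210.32, 65.98, 17, 146.32, 44.02, 17)

print(f"t({df}) = {t_stat:.2f}")
```

Under the equal-split assumption the result is about 3.3, close to the reported t(32) = 3.36; a slightly uneven split between conditions would account for a difference of that size.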

Experiments 1A and 1B: Consumer Choices and an Unsavory Drink

Our theory holds that effortful, involving choices could deplete the self’s resources and that this depletion would impair performance on a self-regulation task. Hence in Experiments 1A and 1B, a choice (vs. no choice) manipulation was followed by a self-regulation task. Self-regulatory resource depletion was measured by how much of a bad-tasting (but not harmful) beverage people drank. Making oneself drink an aversive beverage requires self-control insofar as people are disinclined to imbibe it and must therefore force themselves to do something they do not want to do. We used a drink made of a combination of vinegar and water to approximate a “taking one’s medicine” scenario, and in this way we measured behavior and not simply responses on a questionnaire (Baumeister, Vohs, & Funder, 2007). We predicted that people who had made choices among products would not consume as much of the drink as the no-choice participants would.

The two studies were nearly identical. The one main exception was that we altered the no-choice task in Experiment 1B so as to equalize the duration of the initial tasks. While conducting Experiment 1A (and as seen empirically in the pilot study), it occurred to us that the choices task might last longer than the frequency-rating task, a difference that could potentially confound the results. Hence, Experiment 1B used a different no-choice task so as to ensure equal duration in the two conditions.

Method

Participants. Thirty undergraduate students (20 women) participated in Experiment 1A and 30 undergraduate students (18 women) participated in Experiment 1B in exchange for partial course credit.

Procedure. Participants were randomly assigned to a choice task or a no-choice task. Before completing questionnaires, participants in the choices condition were told that at the end they would receive a gift based on their choices during the questionnaire; participants in the no-choice condition were told they would also receive a gift but that it would be chosen for them.

In the choices condition, participants made a long series of choices between products, both within and across categories. Participants made choices between items in the following categories: t-shirts, scented candles, shampoo brands, candy, and socks. After choosing preferred items within each product category, participants then made choices between different categories of products. For instance, a red t-shirt may be labeled Product A and a black t-shirt may be labeled Product B and the questionnaire would ask them to choose between A or B, then A or C, then B or C, and so on. Participants’ options for making the choices were guided by a questionnaire (e.g., “Would you prefer Product A or Product D?”), and participants were told they would be given a gift at the end of the trial based on their responses during this first part of the experiment. Some of the choices involved products that were displayed in the laboratory, such as t-shirts, scented candles, shampoo brands, and color posters. Other categories of products (specifically, candy bars and types of socks) were listed and described on the choices sheet, but the physical products were not present in the laboratory. After choosing between items within each product category, the questionnaire then asked participants to choose between different categories of products (e.g., a t-shirt or a candle). In a final task, participants made choices among occupations described on a sheet of paper. By the end, participants had made 292 choices.

Participants in the no-choice condition in Experiment 1A completed a questionnaire that required them to rate products and occupations but were not asked to choose between or among items. Participants in the no-choice condition completed a questionnaire asking them to indicate which products they had used in the past year; these products were by and large the same as those involved in the choice task. Thus all participants were exposed to similar stimuli, and all were prompted to consider their preferences, with the main difference being rating versus choosing. For Experiment 1B, we had participants in the no-choice condition record their thoughts, feelings, and opinions about eight advertisements, a task that also conjured up participants’ preferences but again did not require them to make choices. The duration of this task was recorded for Experiment 1B (but not in Experiment 1A).

After completing the product-rating task, participants entered another room and were seated at a table on which were placed 20 small paper cups. Each cup held 1 oz of a mixture made with orange drink mix, water, vinegar, and a small amount of sugar. (The drink was made with two cups of vinegar and six cups of water instead of the eight cups of water that are called for in the standard directions.) The experimenter then told the participant that this part of the experiment concerned motivation. “This is a drink that does not taste good to most people. It is not harmful. I will give you a nickel for every ounce you drink; each little cup is one ounce, and each one is identical. How much you drink is up to you.” The number of ounces each participant drank was recorded as a measure of self-regulatory resource depletion; drinking more ounces presumably requires more self-control (to override one’s distaste). After the vinegar-drinking task, participants were paid for their drink consumption and given a free gift.
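The within-category choice sequence described in the procedure (A or B, then A or C, then B or C, and so on) amounts to presenting every unordered pair of items once. A minimal sketch of that enumeration follows. The product labels and the question wording are taken from the example in the text, but the actual item lists and their ordering on the questionnaire are not given, so the four labels used here are hypothetical.

```python
from itertools import combinations


def pairwise_choices(items):
    """Return every unordered pair of items, in the order A-B, A-C, ..., B-C, ..."""
    return list(combinations(items, 2))


# Hypothetical labels for four versions of one product (e.g., t-shirt colors).
products = ["A", "B", "C", "D"]
pairs = pairwise_choices(products)

for first, second in pairs:
    print(f"Would you prefer Product {first} or Product {second}?")

# n items yield n * (n - 1) / 2 pairwise choices.
print(len(pairs))
```

Because the number of pairs grows as n(n - 1)/2, even a handful of items per category quickly produces a long series of decisions.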

Results and Discussion

Experiment 1A provided evidence that making choices hampers the self’s regulatory capacity. Participants who made a series of choices among products and occupations later drank fewer ounces (M = 2.06, SD = 2.46) of an ill-tasting drink as compared to participants who merely rated their frequency of exposure to those same products and occupations (M = 7.67, SD = 5.35), F(1, 29) = 13.57, p < .001.

Experiment 1B likewise found that choice reduced subsequent self-control. Participants in the choices condition drank significantly less of the vinegar drink than participants in the no-choice condition (choices condition: M = 1.89, SD = 2.57; no-choice condition: M = 6.87, SD = 6.46), F(1, 28) = 7.68, p < .01. Time did not confound the results, as the duration of the tasks did not differ by condition, F(1, 28) < 1, ns. These initial data confirmed our prediction that decision making causes a subsequent reduction in self-control, a finding that does not appear to depend on the duration of the initial task.
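The F ratios above can be approximated from the reported group means and standard deviations by computing the between-group and within-group mean squares directly. The sketch below does this for both experiments. Cell sizes are an assumption: each experiment had 30 participants, and an even split of 15 per condition is used because the per-condition counts are not reported.

```python
def one_way_f(groups):
    """One-way ANOVA F ratio from summary statistics.

    groups: list of (mean, sd, n) tuples, one per condition.
    Returns (F, df_between, df_within).
    """
    total_n = sum(n for _, _, n in groups)
    grand_mean = sum(mean * n for mean, _, n in groups) / total_n
    df_between = len(groups) - 1
    df_within = total_n - len(groups)
    ss_between = sum(n * (mean - grand_mean) ** 2 for mean, _, n in groups)
    ss_within = sum((n - 1) * sd ** 2 for _, sd, n in groups)
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Ounces consumed (choices condition, no-choice condition).
# Assumption: 15 participants per cell in each experiment.
exp_1a = [(2.06, 2.46, 15), (7.67, 5.35, 15)]
exp_1b = [(1.89, 2.57, 15), (6.87, 6.46, 15)]

f_1a, df_b, df_w = one_way_f(exp_1a)
f_1b, _, _ = one_way_f(exp_1b)

print(f"Experiment 1A: F({df_b}, {df_w}) = {f_1a:.2f}")
print(f"Experiment 1B: F({df_b}, {df_w}) = {f_1b:.2f}")
```

Under these assumptions the sketch gives values of roughly 13.6 and 7.7, in line with the reported 13.57 and 7.68. With 30 participants in two groups, the error degrees of freedom in this sketch are 28.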

Experiment 2: Consumer Choices and Pain Tolerance

Experiment 2 was designed as a replication and extension of Experiment 1, with several refinements. First, the choice manipulation and the dependent measure were administered by separate experimenters and presented as distinct experiments. We used two different experimenters to avoid the possibility that participants would try to perform well on the self-control task in order to ingratiate themselves with the experimenter in the hopes of getting a better gift (which was promised by the first experimenter as the reward for the first task). Second, the experimenter for the dependent measure was kept blind to condition, which eliminated the possibility of unknowingly biasing the results. Third, we sought convergent validity by using a different dependent measure of self-regulation, the cold pressor task. This task requires participants to submerge their arm in frigid water for as long as possible. Overriding the natural tendency to pull one’s arm out of the near-freezing water thus constitutes an act of self-control. We predicted that making choices would deplete the resource needed for self-control, leaving people less able to keep their hand in the painfully cold water for a long period of time.

Method

Participants. Twenty-five undergraduates (16 women) participated in exchange for partial course credit.

Procedure. Participants were randomly assigned to either the choice condition or the no-choice condition. In the introduction to the experimental session, participants were told that the session would consist of several experiments by different experimenters because each experiment on its own was too short to justify using the whole experimental period; therefore, experimenters across two laboratories arranged their experiments sequentially so as to take up one full time slot.

In the choice condition, participants made many choices between products, both within and across categories, as described in Experiments 1A and 1B. They were once again informed that they would be given a gift at the end of the trial based on their responses during this first part of the experiment.


In the no-choice condition, participants recorded their thoughts, feelings, and opinions about eight advertisements taken from popular magazines. The instructions asked participants to elaborate on their thoughts and opinions and to write detailed comments about their reactions to the ads. Full sheets of lined paper were given to participants to record their reactions to each advertisement. These steps were done to equate the amount of time participants would work at this no-choice task with the amount of time it would take for participants in the choice condition to complete their task. Participants in the no-choice condition were also informed that they would be given the opportunity to select a gift for themselves at the experiment’s end. Following the manipulation (choosing vs. rating), participants were escorted to another room where a second experimenter who was blind to participants’ condition administered the cold pressor task. For the cold pressor task, water temperature was maintained at 1 °C (approximately 34 °F) using a mixture of ice and water. An aquarium pump circulated the water so as to prevent a warm pocket from forming around the participant’s hand. The room air temperature was also maintained at a constant 72 °F (22 °C). Participants first held their nondominant arm (to the elbow) in room temperature water for 1 min to ensure an equal starting point; then they submerged this arm up to the elbow in the ice water. The experimenter asked the participant to hold there for as long as possible. A stopwatch measured the length of time the participant held his or her arm in the water, with the number of seconds serving as the measure of self-control. After completing the cold pressor task, participants were fully debriefed, chose a gift, and were thanked.
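The paired temperatures in the cold pressor procedure are rounded conversions between scales. The following sketch simply applies the standard Celsius and Fahrenheit formulas to the two values given in the text; the function names are illustrative.

```python
def celsius_to_fahrenheit(celsius):
    """Convert degrees Celsius to degrees Fahrenheit."""
    return celsius * 9 / 5 + 32


def fahrenheit_to_celsius(fahrenheit):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (fahrenheit - 32) * 5 / 9


water_f = celsius_to_fahrenheit(1)   # ice-water bath, reported as about 34 °F
room_c = fahrenheit_to_celsius(72)   # room air, reported as 22 °C

print(f"water: {water_f:.1f} °F, room: {room_c:.1f} °C")
```

Both results round to the whole-degree figures stated in the procedure.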

Results

The length of time that participants withstood the pain of holding their arms in unpleasantly cold water was significantly reduced among participants who had made a series of choices (M = 27.70 s, SD = 15.81) relative to participants in the no-choice condition (M = 67.42 s, SD = 56.35), F(1, 23) = 5.97, p < .025. Persistence on the cold pressor task was not confounded with time spent on the first task because the product-rating task took no longer than the choice task, F(1, 23) = 1.76, ns.

Discussion

Experiment 2 provided converging evidence that making many decisions impairs subsequent self-regulation, consistent with the hypothesis that both choosing and self-control depend on a common but limited resource. The design of Experiment 2 bolstered the findings of Experiments 1A and 1B by ruling out several alternative explanations. We used two experimenters in the current study, one to administer the dependent measure and one to administer the product task. Moreover, the experimenter overseeing the dependent measure was blind to condition, thereby eliminating concern that experimenter demand could have contributed to the results. Also, participants in the no-choice condition were told they would be able to choose their own gift from a standard set of options, thereby eliminating concern that their performance on the self-control measure was aimed at persuading the experimenter to offer them a better gift or a more appealing set of options.


Experiment 3: Choosing College Courses and Procrastination

To provide further evidence of the detrimental impact of making choices on subsequent self-regulation, we designed Experiment 3 as a conceptual replication of Experiment 2 but with new procedures for both the choice-task manipulation and the dependent measure of self-regulation. Instead of making choices among small household products, participants in this study either made choices, or not, regarding the courses they would take to satisfy their degree requirements. They were encouraged to take these choices seriously as if they were actually selecting the classes they were to take in future years, so it seems reasonable to assume that they regarded these choices as important and relevant.

Self-regulation was measured in terms of resisting procrastination. Participants were given 15 min to study for an upcoming nonverbal (math) intelligence test that was framed as a predictor of many desirable life outcomes. To practice, we gave participants a packet of sample problems. However, as a competing temptation, they were also allowed to read magazines and play a video game. We knew that self-regulation would be required for most participants to override the seductive pull of games and magazines and make themselves practice arithmetic problems. Most likely, this is a self-regulation dilemma that would be familiar to many college students, namely, whether to push oneself to study for a test or indulge in more pleasant pastimes.

We hypothesized that choosing one’s courses would deplete the self’s resources as compared to merely reading about courses without choosing. Hence, we predicted that participants who made choices would spend more of their time on the time-wasting temptations of magazines and video games and, correspondingly, would devote less time studying for the upcoming test.

Method

Participants. Twenty-six introductory psychology students (17 men) participated in exchange for partial course credit. Data from 2 participants were not included in analyses (leaving 24 participants in the analyses). One participant correctly surmised that the intelligence test was not to be administered, whereas the other was an acquaintance of the experimenter.

Procedure. Participants arrived at the laboratory individually, where they were informed that the experiment examined whether a person’s choice of college major was related to nonverbal intelligence. All participants were shown a list of general education course requirements and a list of all the classes that would satisfy each of these requirements. This information was taken directly from the official undergraduate bulletin, which stated that a total of 36 credit hours (12 courses) in predetermined content areas were required of all undergraduates regardless of major area of study. These 12 courses must be selected from a total of over 60 distinct courses offered at the university.

In the choices condition, participants were directed to spend 8 min indicating which courses they would choose to take to satisfy each of the general education requirements and to write down their selections on the response sheet they were given. If they finished this task, participants were to consult the undergraduate course bulletin to select and then write down the courses they would take to satisfy their major-degree requirements. In the no-choices condition, participants were instructed to peruse course requirements and then read over the descriptions of different courses that satisfy these requirements. These participants were also encouraged to review course descriptions of classes in their major and to consider courses in which they might enroll to satisfy their major-degree requirements. These participants, unlike choice-condition participants, were not asked to make formal choices by writing them down on a response sheet. Rather, they were simply instructed to think about courses in which they would prefer to enroll. After 8 min had elapsed, the experimenter asked participants to complete the mood measure (PANAS).

Participants then began the nonverbal intelligence (math) test portion of the experiment. The experimenter explained the format of the test and told participants that the test was highly predictive of skills important for real-world success. Additionally, participants were told of past research showing that performing practice math problems for 15 min significantly improved performance on the test but practicing for more than 15 min did not lead to additional increases on performance. The experimenter announced he was going to leave the room for 15 min and gave participants a packet of practice math problems. Participants were told they could practice for the upcoming test for as long as they wanted during the next 15 min. The experimenter also noted that participants could look at magazines or play a hand-held video game (both of which were located on a stand next to the participants’ work area) if they did not want to work on the practice problems for the entire practice period.

As the experimenter left the room, a research assistant who was blind to participants’ experimental condition entered an adjacent room and observed participants through a two-way mirror.
The mirror was covered by closed vertical blinds, except for two slats that were slightly bent at an angle that allowed the observer to clearly view participants’ behavior without their knowledge. The observer recorded participants’ behavior every 30 s according to whether the participant was practicing math problems, looking at a magazine, playing the video game, or engaging in some other (unscripted) activity, such as sitting quietly. When the experimenter returned, she asked participants to complete a questionnaire, which contained several manipulation check questions. Finally, participants were informed that they would not be taking the nonverbal test and were debriefed and thanked.

Results Our main prediction was that making a series of choices would result in a state of ego depletion, thereby truncating persistence (or practice) at the math problems and leading to more procrastination. We calculated number of minutes practicing by multiplying number of times the participant was observed practicing by .5 (to represent 30 s in terms of minutes). As expected, the choices versus no-choices manipulation affected how long participants practiced for the upcoming test, t(22) = 2.43, p < .05. After making a series of choices, participants spent less time practicing for the upcoming nonverbal intelligence (math) test (M = 8.39 min, SD = 3.64) than did participants who did not make choices (M = 11.40 min, SD = 1.66). This finding also indicates that depleted participants spent more time playing video games, reading magazines, and doing nothing than did nondepleted participants. Thus, after making choices, people spent more time on self-indulgent activities and less time on effortful studying.
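The conversion from 30-s observations to minutes described above can be sketched as follows. This is a minimal illustration only: the activity codes, the function name minutes_by_activity, and the example session are hypothetical and are not taken from the study's records.

```python
from collections import Counter

# One observation was recorded every 30 s, i.e., every 0.5 min.
INTERVAL_MIN = 0.5


def minutes_by_activity(codes):
    """Convert per-interval activity codes into minutes spent on each activity."""
    counts = Counter(codes)
    return {activity: n * INTERVAL_MIN for activity, n in counts.items()}


# Hypothetical 15-min session (30 observations); not real data.
session = ["math"] * 18 + ["magazine"] * 6 + ["game"] * 4 + ["other"] * 2
minutes = minutes_by_activity(session)
print(minutes["math"])  # 9.0
```

Because every interval receives exactly one code, the minutes across activities sum to the length of the practice period, which is why less time practicing necessarily means more time on the other activities.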

Although our main focus in the current study was on the amount of time spent on the math problems, we also checked to see whether performance on the math problems differed as a function of choice condition. It did not. We counted every problem participants attempted (because sometimes participants did a bit of work on a problem but failed to finish it) and subjected this measure to a t-test with choice condition as a predictor. This measure showed no difference as a function of condition, t(22) < 1, ns. The number of problems completed also showed no difference as a function of choice condition, t(22) < 1, p > .60. Number of problems correctly answered also showed no differentiation by condition, t(22) < 1, p > .80. Last, we conducted an analysis of covariance, comparing the choice and no-choice conditions on number of problems correct, with time spent practicing as the covariate. The effect of the covariate, time spent, approached significance, F(1, 21) = 4.14, p = .06, but condition was not significant, F(1, 21) < 1. We assessed whether the choices manipulation influenced mood states. Consistent with expectations, the choice manipulation did not differentially affect mood. Reports of positive affect, t(22) = 1.01, p = .33, and negative affect, t(22) < 1, ns, were similar in the two groups. Further analyses confirmed that choice and no-choice conditions did not differ with regard to self-rated difficulty of their respective degree programs, t(22) = 1.10, ns, frustration with the tasks (t < 1, ns), or stated importance of performing well on the upcoming test, t(22) = 1.44, ns. Thus, the effects of choice were not due to mood, difficulty, frustration, or perceived importance.

Discussion Experiment 3 conceptually replicated the finding that making a series of decisions leads to subsequent impairment of self-regulation. Participants in this study were given instructions either to select courses to fill the remainder of their undergraduate careers or to read and think about course options without choosing. Subsequently, participants were given the opportunity to practice for an upcoming math test said to be predictive of successful life outcomes, but their studying was compromised by the availability of tempting, fun alternative activities, such as video games and magazines. Participants who had made choices about their future coursework, as compared to those who simply read and considered their options, spent less time studying and practicing for the math test (and spent correspondingly more time indulging in the tempting distracter tasks). Poor or failed self-regulation is an important contributor to procrastination (Tice & Baumeister, 1997), and thus Experiment 3 demonstrates another way in which making many choices can lead to a breakdown of self-control. The fact that choosing what courses to take led to less studying is somewhat counterintuitive. Had the opposite effect been obtained, one might readily have interpreted it as indicating that priming the idea of course work prompted people to study. The fact that choosing courses led to less studying is thus most consistent with a limited-resource model.

Experiments 4A and 4B: Course Content Choices and Solvable and Unsolvable Problems One ambiguity about the findings of Experiment 3 was that participants solved the same number of problems in both conditions, despite the difference in duration of persistence. Although null findings are generally not entitled to substantive interpretation, one could read those results as indicating that people who made choices were better at self-regulation (not worse, as we found in Experiment 2), insofar as they solved approximately the same number of problems in less time. Hence, we felt the importance of conducting a conceptual replication. Experiment 4 tested persistence on unsolvable problems (4A) and solvable problems (4B) after a manipulation of making choices or not. To increase the robustness of our conclusions, we again changed the choice manipulation, in this case to decisions about the psychology course in which participants were currently enrolled. Participants in the choices condition made a series of decisions about the course, choices they were told (veridically) would determine the way the instructor taught the course both during the current term and in subsequent terms. It is possible that participants in Experiment 3 did not see their choices as binding because students can and do change their minds about what courses to take. In contrast, the choices made in Experiment 4 were irrevocable in the sense that once students’ choices were communicated to the instructor via this experiment, there was no opportunity to change the selections, and the instructor did in fact modify the course on the basis of students’ selections. Another change in Experiment 4 was to separate the procedures with different experimenters. When the same experimenter administers both the choice manipulation and the self-regulation measure, it is conceivable that extraneous attitudes toward the experimenter could confound responses to the dependent measure, as noted earlier. Therefore, we used the more elaborate procedure of presenting the tasks as unrelated, including having different experimenters administer the independent and dependent variable tasks in different rooms.
The main measure of self-regulation in this study was persistence at challenging problems. Persistence requires self-regulation insofar as the repeated failures are discouraging and frustrating, and the participant would soon wish to be doing something else—so one has to override the impulse to quit. Because of the possibility that quitting fast on unsolvable problems could be regarded as showing exceptionally good self-regulation, however, we ran two versions of this study, one with unsolvable problems (4A) and the other with solvable problems (4B). With the solvable problems, we were also able to calculate performance quality by counting correct solutions.

Method Participants in Experiment 4A. Forty-one undergraduates (26 women) participated in exchange for partial course credit. One participant was unable to complete the study. Procedure for Experiment 4A. After arriving and completing consent forms, participants were told that the first part of the study involved reviewing instructors’ materials from their psychology class, and the second, unrelated part of the study involved completing a spatial design task. As in Experiment 2, participants were told that because each experiment in this session was rather short, experimenters in the department combined two studies so as to maximize efficiency in use of subject credit hours. The first experimenter handed out the materials that contained the choices


manipulation. All participants were given the same materials, but the instructions that accompanied them were different. Instructions for participants in the choices condition asked them to read the material and, for each section, to choose the option they preferred. Options were always presented as a two-option forced choice. In one example, participants read descriptions of two possible video clips and chose which film clip they would prefer to see. Another item involved choosing between two different styles of a test question, and another item asked them to choose between two paragraphs of text. Participants in the choices conditions were also told (truthfully) that the choices they made would be reviewed by their instructor and would affect her decisions for future lectures and tests both during this semester while the participants were taking her course as well as for future classes. In all, participants made 35 choices, which were presented as important and consequential for the student participants’ lives. Participants were asked to complete all the choices and return the packet to the experimenter before moving on to the next part of the experiment. Participants in the no-choices condition were simply instructed to read the same material that was presented to the participants in the choices condition. They were not asked to make any choices between the options or to rate the material in any way. They were asked to read the material very carefully and return the packet to the experimenter before moving on to the next part of the experiment. Next, participants moved across the hall to complete the persistence part of the experiment with the second experimenter. The persistence measure involved unsolvable tracing puzzles. This procedure was made popular by Glass, Singer, and Friedman (1969), and it has been used in previous studies as a measure of self-regulation. Participants were given a packet containing two complex figures. 
Participants were told that performance on these geometric figures was predictive of future life success due to its links with higher order cognitive abilities. Participants were given two stacks of paper with each page displaying one of the complex figures. The stacks of papers were given to participants so that they could use as many sheets as necessary as they attempted the (unsolvable) task of tracing each figure in its entirety without once lifting the pencil from the paper or retracing any lines. They were asked to bring their sheets back to the experimenter either when they had finished or when they had worked as long as they could on them and wanted to stop. The experimenter recorded how long each participant persisted (to the nearest quarter minute). After finishing, participants were given a manipulation check that asked participants to rate their mood states in terms of how happy, sad, depressed, or confident they felt (four items rated on a scale from 1 = not at all to 7 = very much so). In addition, they were asked to indicate to what extent they felt that their activities during the initial task regarding elements of the course would alter the content and design of the course (on a scale from 1 = not at all to 7 = very much). Last, participants were debriefed and thanked. Participants in Experiment 4B. Forty-two undergraduates (28 women) took part in exchange for partial course credit. Two participants failed to complete the study. Procedure for Experiment 4B. The procedure for Experiment 4B was the same as for 4A, with two changes. First, the length of time it took participants to finish the choices or ratings was held constant at 12 min. This was accomplished by having participants in both conditions work through a lengthy packet of stimuli that


could not be completed in less than a certain amount of time, which in this case was 12 min. After 12 min had elapsed, participants were stopped and informed that they would now move to the second experiment. Second, we altered the operationalization of self-regulation to be persistence at and correct solutions of solvable problems. After being moved to a new laboratory room and greeted by the second experimenter, participants were told that the next study involved a test of simple mathematical calculations, which long have been known to predict success in life. The experimenter explained that this math test was sensitive to brief amounts of practice, and therefore everyone was allowed practice time before taking this test. Participants were given practice sheets of three-digit multiplication problems, which they were told to practice for as long as they could up to 30 min. When participants felt they could not practice any longer, they alerted the experimenter. The experimenter covertly recorded the length of time participants had worked at the math problems (to the nearest quarter minute) and gave participants a question asking them to rate the degree to which their activities during the first task regarding elements of the course would alter the content and design of the course (on a scale from 1 = not at all to 7 = very much). Then, participants were debriefed, thanked, and excused.

Results Unsolvable puzzles (Experiment 4A). Participants who did not have to make choices about the material but merely read through it carefully persisted longer on the tracing task (M = 12.25 min, SD = 4.31) than did participants who were asked to make many choices about the same material (M = 9.11, SD = 3.00), F(1, 38) = 7.12, p < .05. Thus, making choices seems to have depleted some resource, thereby reducing persistence on the second task. Ancillary analyses confirmed that the manipulation was effective: Participants in the choices condition reported that they believed that the responses they made would affect their own course more so than participants in the no-choices condition did, F(1, 38) = 585.95, p < .001. There were no differences on self-reports of being happy, sad, depressed, or confident (Fs < 1). Solvable puzzles (Experiment 4B). Participants who made choices about the course material failed to persist on the practice items for as long as did participants who read about the same material but who did not make choices (choices condition: M = 14.70 min, SD = 4.05; no-choices condition: M = 17.80 min, SD = 4.66), F(1, 38) = 5.00, p < .05. Participants who had made many choices also completed fewer practice problems than did participants who had not made choices, F(1, 38) = 6.23, p < .05. Making choices also appears to have led to poorer performance on the math problems. Participants who had not made choices got significantly more practice problems correct and marginally fewer wrong than participants who were asked to make many choices got, F(1, 38) = 16.56, p < .001 and F(1, 38) = 3.81, p = .06, respectively. The difference in number of errors was probably weakened by the fact that participants in the choice condition spent less time and attempted fewer problems, which should cause them to make fewer errors than they would have made on a longer problem set.
To correct for this, we computed the error rate by dividing number of errors by number attempted for each participant. Analysis of variance on error rates confirmed that participants in the choices condition made more errors per attempt than did participants in the no-choices condition, and this was a significant difference, F(1, 38) = 5.10, p < .05. On the manipulation check, participants in the choices condition were much more likely to believe that they were making choices that would affect the rest of their semester in the classroom than were participants in the no-choices condition, F(1, 38) = 224.48, p < .001. Thus, again, the manipulation was successful.
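The error-rate correction described above (errors divided by problems attempted, per participant) can be illustrated with a short sketch. The function name error_rate and the two example participants are hypothetical and serve only to show why raw error counts can mislead when the number of attempts differs.

```python
def error_rate(errors, attempted):
    """Errors per attempted problem; None when nothing was attempted."""
    if attempted == 0:
        return None
    return errors / attempted


# Hypothetical participants as (errors, attempted); not real data.
short_session = (3, 12)   # quit early: fewer attempts, fewer raw errors
long_session = (4, 20)    # persisted: more attempts, more raw errors

rate_short = error_rate(*short_session)
rate_long = error_rate(*long_session)
print(rate_short, rate_long)  # 0.25 0.2
```

In this made-up case the participant who stopped early has fewer errors in absolute terms (3 vs. 4) but a higher error rate (.25 vs. .20), which is the pattern the rate-based analysis is designed to detect.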

Discussion Experiment 4 showed that making choices about one’s psychology course had a significant and detrimental effect on subsequent task performance. Those who made choices subsequently gave up faster on unsolvable (Experiment 4A) and solvable (Experiment 4B) items, as compared to participants who did not make choices. These findings provide further evidence that making decisions can deplete an important self-regulatory resource, thereby making it more difficult for the person to resist the temptation to quit while performing a wearisome task. Furthermore, Experiment 4B confirmed that making choices had a negative effect not only on persistence but also on quality of performance. Participants who made choices got fewer math problems right and had a significantly higher error rate than did participants who had merely thought about the course options without making choices. Several design features facilitate interpretation of findings. The choices in Experiment 4 were real and consequential, in the sense that they actually influenced the schedule for the remainder of the course (as opposed, possibly, to what participants thought in Experiment 3). Using two experimenters (one unaware of experimental condition) diminished the likelihood that demand characteristics or desire to impress the (first) experimenter influenced the results. The amount of time spent on the first task was the same for all participants in Experiment 4B, ensuring that persistence on the second task was not affected by how much time had been spent on the first task. It was also apparent that less persistence meant poorer performance: Participants who made choices got fewer problems correct (unlike in Experiment 3) and made more errors than did those who did not make choices. In sum, it appears that making choices depleted some resource that was then unavailable to facilitate performance on both unsolvable and solvable tasks. 
Self-regulation is useful for making oneself persist on a difficult task, for overseeing the calculation process, and for checking and correcting errors, all of which are weakened by previous efforts involved in making choices.

Study 5: Decision Fatigue at a Shopping Mall To provide a field test of our central hypothesis, we approached customers at a shopping mall and assessed the number of decisions they had made during their shopping trip thus far. To measure self-regulation, we then asked them to perform easy but tedious arithmetic problems (adding three-digit numbers). This task requires self-regulation because most shoppers would probably rather do something else than perform arithmetic, and so the impulse to quit must be overridden if they are to continue. We predicted that shoppers whose resources were depleted by having made a greater number of prior choices would quit faster on the arithmetic problems.


A conceptual replication of the laboratory findings from Experiments 2–4 was desirable for several reasons. First, this study drew its participants from a nonuniversity sample, which increases confidence in the generalizability of the results. Second, this study avoided a potential confound of differential time spent on different experimental tasks (and shoppers would also furnish estimates of how long they had been shopping, which later could be controlled for when analyzing the impact of prior choices). Third, participation in this study was not affected by a material incentive because no reward or gift was offered. Having shoppers perform math problems gave us two forms of self-regulation to assess. For one, we could check for persistence at the math problems, which is a classic measure of self-control. In addition, as in Experiment 4, we could also check for the carefulness of participants’ work, for which self-regulation would be involved in overseeing the rule-following mathematical process and in checking for possible errors. Hence, we predicted that the state of ego depletion among shoppers who had made many choices would lead to poorer persistence and performance relative to shoppers who had made fewer choices.

Method Participants. Ninety-six shoppers at an open-air shopping mall in Salt Lake City, Utah, were approached, and 19 women and 39 men agreed to participate (60% response rate). The age of participants ranged from 18 years to 59 years, with 91% of participants reporting White (non-Latino) ethnicity, 4% reporting Asian ethnicity, and 5% Latino ethnicity. Procedure. Shoppers were approached by members of the research team and asked for their time in a volunteer (i.e., no remuneration) experiment. Research assistants were instructed not to reveal much about the experiment before participants agreed or declined to participate, so that the details of the task (described next) did not influence who chose to participate. Participants were told the experiment involved answering some questions about their shopping trip and then engaging in a cognitive task. After a brief demographic questionnaire, participants completed the self-report scale, which was the same as that from the pilot study except for combining two redundant items asking about the degree to which choices had been made. Participants were asked to respond to questions by thinking about their behaviors during their shopping trip and to give a numeric rating of 1 (not at all) to 10 (very much so) for the following items: “How many choices did you feel you have made on your shopping trip today?,” “How personally important were the choices you made shopping today?,” “How much careful consideration did you put into choices you have made today?,” “How much did you deliberate before making each choice today?,” “How much did you think about your options prior to making each choice today?,” “How active did you feel in making your choices today?,” and “How tired do you feel right now?” Participants also reported time spent shopping in hours and minutes. Shopping times ranged from 1 min (for participants who had just begun shopping) to 4.5 hr.
Participants were presented with 64 three-digit plus three-digit addition problems printed across two sheets of paper. They were asked to do as many as they could, with the understanding that they could stop anytime they “quit, finished, or decided to give up.” These instructions come from past depletion research (Vohs & Heatherton, 2000) in which self-control was measured as persistence on a cognitive task. Unbeknownst to participants, there was a second research assistant standing approximately 5 ft (1.5 m) away who surreptitiously recorded the amount of time that participants spent on the addition problems. Then participants were debriefed and thanked.

Results Choices scale. First, we conducted a factor analysis on the items from the choice scale to test whether they revealed patterns similar to that seen in the pilot study, which they did. The data were subjected to a varimax rotation (eigenvalues greater than 1 extracted), and a two-component structure emerged. Factor 1 accounted for 49% of the variance observed and Factor 2 accounted for an additional 17%. The items loaded onto factors similarly as in the pilot study. That is, scale items asking about number of choices, importance of the choices, degree of consideration, deliberation, and thought put into the choices, and degree of activity involved in making those choices mainly loaded onto the first factor, whereas the item asking about tiredness loaded strongly and positively on Factor 2. We computed factor scores for each participant and used them as predictors of math performance. Performance on the math problems. Participants’ performance on the math problems was the primary indication of self-control.¹ As mentioned, past research has shown that one consequence of self-regulatory resource depletion is a reduction in cognitive abilities and consequently poorer intellectual performance (Schmeichel et al., 2003). Alongside the two factor scores from the choices scale as extracted by principal-components analysis, the regression models included as predictors time spent shopping, age, ethnicity, and gender (the latter four variables were centered around their means before being entered into the model). The overall model predicting number of problems completed correctly was significant, F(6, 50) = 2.48, p < .04. More pertinent was the significant effect of Factor 1 (i.e., the Choices factor), β = −.32, t(50) = 2.40, p = .02. The factor scores for Factor 2, which represented mainly the tiredness item, did not significantly predict number of correct solutions (β = −.04, t < 1).
The regression model contained no other significant predictors of correctly solved problems (ts < 1.55), except for ethnicity, t(50) = 2.44, p < .02.
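The mean-centering of covariates mentioned above can be illustrated with a simplified sketch. It uses a single predictor and ordinary least squares written out by hand, whereas the reported model was a multiple regression with six predictors; the helper names (center, simple_ols) and all values are hypothetical.

```python
from statistics import mean


def center(values):
    """Subtract the sample mean so the variable has mean zero."""
    m = mean(values)
    return [v - m for v in values]


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope


# Hypothetical data: hours spent shopping and problems solved correctly.
hours = [0.5, 1.0, 2.0, 3.0, 4.5]
correct = [30, 28, 25, 20, 17]

a_raw, b_raw = simple_ols(hours, correct)
a_centered, b_centered = simple_ols(center(hours), correct)
```

Centering leaves the slope unchanged and moves the intercept to the predicted outcome at the average value of the covariate, so it changes how the intercept is read without changing the estimated effect of the predictor.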

Discussion Study 5 provided converging support for the hypothesis that decision making interferes with subsequent self-regulation. Shoppers at an outdoor mall reported how much decision making they had done while shopping that day and then were asked to solve arithmetic problems. Self-regulation was measured by performance on math problems. We found that the more choices the shoppers had made, the worse their computations on simple arithmetic problems. Moreover, the negative impact of prior decisions on math persistence remained significant even after controlling for how long they had been shopping, for how tired they were, and for several demographic categories including gender, age, race, and ethnicity. These findings are consistent with the general hypothesis that making choices depletes an energy resource and thereby impairs subsequent performance. We acknowledge, however, that the correlational design of this study reduces its capacity for drawing causal conclusions. Third variable explanations are still plausible, such as that people who enjoy making effortful decisions while shopping might simultaneously dislike expending effort on math problems. That said, on an a priori basis, one would likely predict the opposite, such that people with high need for cognition would put more thought into both shopping decisions and math problems. In that respect, these findings are less conclusive than those of the prior studies, but they also add valuable convergence. The decisions in this study were not mandated by the experimenter but instead occurred naturally among people during the course of their daily lives. Additionally, interpretation of these findings is strengthened by the fact that the sample was more diverse (in age, education, and income) than the university populations sampled in the preceding studies.

¹ This study had two dependent variables, persistence at the math problems in terms of duration of time spent working on them and also number of math problems completed correctly. The two variables were highly correlated, r(58) = .71, and the regression models yielded highly similar results. Hence a second conclusion from this study is that the more decision making the shoppers had done, the less they persisted on the math problems.

Experiment 6: Choosing Versus Deliberating Versus Implementing With Experiment 6, we began to delve into the processes and possible boundaries of the effects of choosing. In line with the Rubicon model of action (Gollwitzer, 1990, 1996; Heckhausen & Gollwitzer, 1987), we conceptualized the process of choice as involving three key phases: deliberation among options, deciding on a plan of action (i.e., making a choice), and implementing the chosen option. Deliberating among the options involves weighing their pros and cons and comparing them and, perhaps crucially, forming an ad hoc preference where none existed. Making the choice requires actually selecting one option and committing oneself to behave in that way. Implementing the choice involves behaviors that execute the previously chosen option. In principle, any or all phases of the choice process may tax the self’s resources. Of particular interest was the possibility that choosing would itself deplete the self’s energy, above and beyond the processes of deliberating and implementing. Choosing is akin to forming an implementation intention, in the sense that it sets a conditional program for future behavior. The essence of the Rubicon model is the transition between an initial phase of deliberating about the various options to a phase of readiness to take action, which may be in the immediate present or delayed. Thus, the mind undergoes some qualitative change in order to make that transition. To use the popular metaphor of the computer, the difference between deliberating and deciding resembles the difference between performing calculations and writing the output of those onto the disk for storage, where it can be accessed on future occasions as needed. Performing calculations takes energy, but writing onto the disk also consumes energy. By analogy, therefore, choosing would require more energy than merely deliberating. 
As Webb and Sheeran (2003) have shown, having such a conditional program (especially in the form of an implementation intention) helps counteract the effects of ego depletion, so a preestablished program can conserve energy. Thus, we suggest that the choice process expends energy now but may perhaps do so such that the system can save energy later, not unlike the way storing information to a disk enables the computer to retrieve and use the result later without having to repeat the calculations. To provide an initial test of the idea that the act of choosing is depleting apart from the phases of deliberation and implementation, Experiment 6 compared three different conditions. In one, participants only deliberated among options but refrained from making a decision. In another, they made a choice (presumably after also deliberating about the options). In a third condition, they merely implemented choices that had been made for them by someone else, namely a yoked participant in another condition. If the act of choosing is itself depleting, then one should see greater depletion in the choice condition than in the other two conditions. This was our main prediction.

Method

Participants. Sixty-four undergraduates (36 women; 2 participants did not report their gender) participated in exchange for extra course credit or payment. The first 52 were randomly assigned among the three conditions. In response to reviewer-suggested analyses that yielded a marginal (p = .10) and hence inconclusive result, we resumed and ran the final 12 participants, who were randomly assigned between the choice and deliberate-only conditions.

Procedure. Choice condition was manipulated with differing instructions on how to interact with a popular computer website, dell.com. The four pages on the dell.com website contained options for making selections about the computer itself as well as components, services and support options, and accessories. Participants were seated in front of a computer that showed a page for customizing a Dell Dimension desktop computer and then were given one of three sets of instructions. In the implement condition, participants were given sheets of paper that were printouts of the four computer screens that they were to see during this task. Preestablished choices had been made and radio buttons indicated the chosen options. Participants in this condition were simply asked to find the radio button on each page that matched the selected radio button on the printout and click on it with the computer mouse. Thus, they were simply implementing a choice that had already been made by someone else. In the deliberate condition, participants were asked to deliberate about the options on each page and “form an opinion of the information, thinking about what [they] would prefer.” Participants in this group were instructed not to press any buttons to indicate their selections. Participants in the choice condition were asked to deliberate, form preferences, choose the most preferred option in each set, and select it on the website using the computer mouse. The experimenter timed the duration of the dell.com task for each participant.
Participants moved away from the computer at this time and were seated in a small room to perform the anagram task, which consisted of 80 five-letter solvable anagrams. Prior to starting, participants were told that the anagrams constituted a test of verbal ability, a capacity that university students believe is quite important (Vohs & Heatherton, 2001). In line with past work in self-regulation, participants were told to work on the anagrams until they solved them all, wanted to stop, or decided to give up. The experimenter timed their efforts directed at this task as a measure of persistence. Last, participants were given a set of postexperimental questions and were debriefed and thanked.

Results

As a manipulation check, we asked participants the extent to which they had deliberated while performing the dell.com task and found significant differences as a function of condition, F(2, 61) = 10.52, p < .01. As expected, the implement group (M = 2.82, SD = 1.86) reported deliberating less than the other two groups (choice condition: M = 5.64, SD = 2.53; deliberate condition: M = 5.82, SD = 2.26). Participants reported enjoying the dell.com task equivalently across conditions, F(2, 59) = 1.04, p > .30. We had anticipated that there might be differences in the duration of the dell.com task across conditions, and the effect approached significance, F(2, 61) = 2.81, p = .07. Descriptively, the implement group performed this task in the shortest amount of time (M = 223.18 s, SD = 90.80), compared to the deliberate (M = 320.95 s, SD = 84.22) and choice conditions (M = 273.24 s, SD = 172.95).

The main test of our hypothesis was whether there was a significant difference between choosing and not choosing on later self-regulation. There were debilitating effects of engaging in the full choice process on executive functioning. On anagram persistence, not only was the overall test significant, F(2, 61) = 3.99, p < .03, but so was the planned contrast of choosing versus not choosing, a test that compared the choice condition versus the two nonchoice conditions, t(61) = 2.77, p < .01. Comparing the conditions individually revealed that the choice condition (M = 379.24 s, SD = 180.17) led to significantly less persistence than did the implement condition (M = 571.29 s, SD = 286.65), t(61) = 2.66, p < .01, and less persistence than did the deliberate-only condition (M = 514.0 s, SD = 231.37), t(61) = 2.01, p < .05. The difference between the deliberate-only and implement conditions was not reliable (t < 1, ns).
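The planned contrast of choosing versus not choosing can be illustrated with a minimal sketch. It assumes the conventional approach of weighting the group means (here −1 for choice and +0.5 for each nonchoice condition) and using the pooled within-group variance as the error term, which yields 64 − 3 = 61 error degrees of freedom for a sample of 64 in three groups, in agreement with the reported t(61). The function name and the persistence times below are hypothetical and are not the data from Experiment 6.

```python
import math


def planned_contrast(groups, weights):
    """Return (t, df_error) for a planned contrast among group means.

    The error term is the pooled within-group variance (the ANOVA mean
    square error), so df_error = N - number of groups.
    """
    if abs(sum(weights)) > 1e-12:
        raise ValueError("contrast weights must sum to zero")
    means = [sum(g) / len(g) for g in groups]
    ss_within = sum(
        sum((x - m) ** 2 for x in g) for g, m in zip(groups, means)
    )
    df_error = sum(len(g) for g in groups) - len(groups)
    mse = ss_within / df_error
    estimate = sum(w * m for w, m in zip(weights, means))
    std_error = math.sqrt(
        mse * sum(w ** 2 / len(g) for w, g in zip(weights, groups))
    )
    return estimate / std_error, df_error


# Hypothetical persistence times in seconds (illustration only).
choice = [300.0, 350.0, 400.0, 450.0]
deliberate = [450.0, 500.0, 550.0, 600.0]
implement = [500.0, 550.0, 600.0, 650.0]

# Choice versus the average of the two nonchoice conditions.
t_value, df_error = planned_contrast(
    [choice, deliberate, implement], [-1.0, 0.5, 0.5]
)
print(round(t_value, 3), df_error)
```

A positive t under these weights indicates that the nonchoice groups persisted longer on average than the choice group.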

Discussion

Experiment 6 attempted to distinguish among deliberating, choosing, and implementing a choice. Although it is possible that all phases of the decision process can deplete some resources, we did find significant variation among the conditions. Making choices (presumably after some deliberating) was significantly more depleting than either deliberating or implementing alone. Deliberating and implementing were not reliably different from each other. These results point toward the conclusion that actually making the choice itself requires effort and consumes energy, above and beyond the process of thinking about the options and more than expressing or implementing previously made choices.

Experiment 7: Pleasant Versus Unpleasant, Many Versus Few Choices

Experiment 7 addressed two final questions. First, is the depleting effect of choosing cumulative such that making more choices produces more depletion than does making only a few choices? Second, does the subjective enjoyment of the choosing task moderate how depleting the task is?


Regarding the quantity of choice, we reasoned that insofar as choice requires effort, then more choosing should be more fatiguing. If choosing does deplete some psychological resource, then doing more of it should result in more severe depletion. The difference could also address the criticism raised by Moller et al. (2006), who found that making one or two pleasant choices was not depleting. The amount of effort required to make a single choice might be so small as not to produce depletion, but that small amount of effort multiplied by many choices (even pleasant ones) could still be depleting.

Regarding subjective enjoyment, we thought that pleasantness of the choosing process might reduce its deleterious effects. If depletion is caused by forcing oneself to do something, then a pleasant task would presumably be less depleting than an aversive one would be. There was also reason to predict that choice quantity would interact with subjective enjoyment. The beneficial impact of enjoying the task will likely wane as time and exertion increase. By analogy, people may find physical exercise to be less tiring when they enjoy it than when it is aversive, but extended physical exercise (e.g., running for dozens of miles) is still tiring. Given the robust effect of making choices on the executive system in the previous experiments, it seemed likely that the effects of making choices would wear down the executive system over time, such that any positive effects of choice enjoyment would be nullified if the task required making a great many choices.

Hence, we designed the experiment to have participants make choices for a short or long period of time (4 vs. 12 min, respectively) or no choices at all. We also obtained participants’ anticipated enjoyment of the choice task, which was the creation of a gift registry online. Both variables, duration of choice task (no choices vs. 4 min vs. 12 min) and enjoyment of the choice task, were expected to have significant effects on participants’ active responding to a situation in which there was a problem.

Experiment 7 also introduced a change in the dependent variable. Having shown in the preceding studies that decision making affects subsequent self-control, we sought in this study to measure effects on a different manifestation of the self’s executive function, namely initiative or active responding. One previous study provided some initial evidence that responding actively instead of passively requires the same sort of energy used for self-regulation and is therefore vulnerable to depletion (Muraven et al., 1998). In Experiment 7, participants were told that their next task would entail watching a video. For each participant, however, the video playback malfunctioned, thereby rendering the task impossible. The measure was how long the participant (passively) sat there before notifying the experimenter of the malfunction. In this case, passivity was counterproductive for the participant’s presumptive goals of finishing the experiment and going home because it would not be possible to perform the task until the video was fixed.

Method

Participants. One hundred and ten students (ages 18–43 years; M = 21.38, SD = 3.33) at a large midwestern university participated in the experiment for either course credit or monetary payment. Ten participants’ data were removed because of various disruptions in the experimental procedure, such as connectivity problems with the registry website (n = 8) and participants receiving calls on their cellular phones. The final participant tally was 48 men and 52 women, for a total of 100 participants.

Procedure. Prior to arrival at the laboratory, participants had completed a short questionnaire. The questionnaire consisted of items pertaining to participants’ enjoyment of and past experiences with wedding gift registries. These reports showed that only two participants had prior experience creating a gift registry and that the distribution of scores was normal in terms of how enjoyable participants envisioned the creation of a gift registry to be (M = 4.21, SD = 1.65 on a 7-point scale where 1 = not at all enjoyable and 7 = extremely enjoyable). There was, as might be expected, a gender difference in whether participants viewed the process of creating a gift registry as enjoyable, with women reporting more anticipated enjoyment than did men, t(97) = 5.54, p < .01. Via a postexperimental question, we confirmed that participants who anticipated that they would enjoy the gift registry creation task indeed got more enjoyment out of the task. Participants were fairly accurate in their predictions: There was a significant though far from perfect correlation between how much participants thought they would enjoy the gift registry creation task and how much they reported enjoying the task after completing it, r(63) = .31, p < .02. (Degrees of freedom are lower than that for the full sample because one-third of the conditions did not involve creating a gift registry.)

Upon entering the lab, participants were randomly assigned to a no-choices control condition, a short choices condition, or a long choices condition. Participants assigned to the choices conditions spent either 4 min (short choices condition) or 12 min (long choices condition) selecting options from a wedding gift registry using the online interface at target.com’s Club Wed.
For their first task, participants assigned to the no-choices control group were instructed to think about the route they would take to get home from the building. This is a neutral task that has been used in past research (Vohs & Heatherton, 2001). After the choices or thought task, participants completed the PANAS to assess mood. Then, participants were moved to a new room and sat in front of a VCR and television. They were told that they would be watching a short video about which they would be answering questions later. The experimenter left, saying she would be back when the video was finished. The video was rigged, however, to show mostly static with faint images of two people talking in the background, behind the static. Given that the video did not show any discernible scene, the most responsible action for participants to take would be to alert the experimenter. Hence, the dependent measure was how actively participants responded to the problematic video, specifically in terms of duration of time that passed before participants notified the experimenter of the problem. A 15-min ceiling was in place, such that the experimenter entered the room if the participant had not come to alert her by this time. Ostensibly in lieu of watching the broken video, participants then completed postexperimental questionnaires and a demographics form and were debriefed.

Validation study. A separate sample of 20 participants watched the video and rated their reactions in order to clarify the impact of the procedure. They indicated that they thought the video was broken and that the experimenter ought to know that the video was having problems; furthermore, participants said that they quickly gave up trying to watch the fuzzy video. These responses confirmed that the optimal response during the main experiment was in fact to notify the experimenter and that sitting in the room was a passive and ineffectual response.

Results

To analyze the data, we first coded condition into two dummy variables that tested the two choice conditions (low choice, 4 min; high choice, 12 min) against the no-choice (control) condition. The analytical model regressed time spent waiting before alerting the experimenter (i.e., passivity) on five predictors: ratings of anticipated enjoyment (centered), the two dummy variables, and two interaction terms for which each dummy variable was multiplied by the anticipated enjoyment factor. Our predictions about the combined effect of choice and anticipated liking of the choice task can be understood statistically as predicting that only one of the interaction terms would be a significant predictor: the Anticipated Enjoyment × Low Choice Condition interaction. A significant interaction term for the low choice condition only would indicate that anticipated enjoyment ceases to predict passivity after participants had made a high number of choices. That is what we found: The effect of anticipated enjoyment on passivity was only pertinent after participants had made few choices, whereas after participants had made many choices, passivity scores (seconds waited before alerting the experimenter) were unaffected by anticipated enjoyment.

Statistically, there was a significant interaction between enjoyment and the low choice dummy variable, t(92) = 3.399, p = .001, β = −.39. The interaction of the other dummy variable (representing the high choice condition) with enjoyment was not a significant predictor, t(92) < 1, ns, β = .01. The high choice dummy variable on its own, though, was a significant predictor, t(92) = 3.54, p = .001, β = .36, whereas the low choice dummy variable on its own was a nonsignificant predictor, t(92) = 1.84, p < .07, β = .19, and enjoyment was a nonsignificant main effect, t(92) < 1, ns, β = −.04.
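The structure of this regression model (a centered continuous predictor, two dummy variables against the control condition, and two product terms) can be sketched as follows. This is a minimal illustration in plain Python rather than the analysis software actually used, which the text does not name. The function names, the condition labels, the coefficient values, and the nine data rows are all hypothetical; the outcome is generated without noise so that ordinary least squares recovers the generating coefficients exactly.

```python
def design_row(condition, enjoyment, mean_enjoyment):
    """Intercept, centered enjoyment, two dummies, and two product terms."""
    low = 1.0 if condition == "low" else 0.0    # low choice vs. no choice
    high = 1.0 if condition == "high" else 0.0  # high choice vs. no choice
    centered = enjoyment - mean_enjoyment
    return [1.0, centered, low, high, centered * low, centered * high]


def solve_linear_system(matrix, rhs):
    """Solve a square system by Gauss-Jordan elimination with pivoting."""
    n = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def ols(rows, outcome):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, outcome)) for i in range(k)]
    return solve_linear_system(xtx, xty)


# Hypothetical data: three participants per condition (illustration only).
conditions = ["none"] * 3 + ["low"] * 3 + ["high"] * 3
enjoyment = [2.0, 4.0, 6.0] * 3
mean_enjoyment = sum(enjoyment) / len(enjoyment)
rows = [design_row(c, e, mean_enjoyment) for c, e in zip(conditions, enjoyment)]

# Hypothetical generating coefficients, in the same order as design_row.
true_coefs = [200.0, 5.0, 50.0, 150.0, -40.0, -5.0]
passivity = [sum(b * x for b, x in zip(true_coefs, row)) for row in rows]

fitted = ols(rows, passivity)
print([round(b, 3) for b in fitted])
```

Centering the enjoyment ratings before forming the products means that each dummy coefficient is the condition difference at the average level of anticipated enjoyment, which is how the dummy variable effects "on their own" are usually read in a model of this form.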
Using the Aiken and West (1991) procedure, we created Figure 1, which displays predicted passivity scores as a function of the three experimental conditions (no choice, low choice, and high choice) and at different levels of anticipated enjoyment.

Recall that we measured participants’ moods after the choice task to ensure that active responses were not due to transient changes in emotions. In line with previous work, there was no effect of choice task on positive, F(2, 97) < 1, or negative emotion, F(2, 97) = 1.10, p > .30, as measured by the PANAS. Moreover, we wanted to ensure that enjoyment of the choice task did not alter mood states. We correlated anticipated enjoyment with positive and negative mood as well as posttask reported enjoyment with mood and found no correlations to be significant (anticipated enjoyment and positive mood: r(99) = −.02; anticipated enjoyment and negative mood: r(99) = .02; reported enjoyment and positive mood: r(64) = .23, p > .06; reported enjoyment and negative mood: r(64) = −.16, p > .19). Hence, mood did not play a significant role in this experiment.
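The Aiken and West (1991) plotting procedure referred to above amounts to evaluating the fitted regression equation at chosen values of the continuous moderator, conventionally 1 SD below the mean, the mean, and 1 SD above the mean, within each condition. The sketch below shows that computation. The SD of 1.65 is the value reported for anticipated enjoyment; the unstandardized coefficients, the condition labels, and the function names are hypothetical and are chosen only to mimic the qualitative pattern described in the text, because the paper reports standardized coefficients only.

```python
SD_ENJOY = 1.65  # SD of anticipated enjoyment reported in the Method section

# Hypothetical unstandardized coefficients (illustration only), ordered as:
# intercept, enjoyment, low dummy, high dummy, enjoy x low, enjoy x high.
coefs = [150.0, -2.0, 60.0, 180.0, -45.0, 1.0]


def predicted_passivity(coefs, condition, centered_enjoyment):
    """Predicted seconds of passive waiting for one condition and enjoyment level."""
    b0, b_enjoy, b_low, b_high, b_enjoy_low, b_enjoy_high = coefs
    low = 1.0 if condition == "low" else 0.0
    high = 1.0 if condition == "high" else 0.0
    return (
        b0
        + b_enjoy * centered_enjoyment
        + b_low * low
        + b_high * high
        + b_enjoy_low * centered_enjoyment * low
        + b_enjoy_high * centered_enjoyment * high
    )


def simple_slope(coefs, condition):
    """Slope of passivity on enjoyment within a single condition."""
    _, b_enjoy, _, _, b_enjoy_low, b_enjoy_high = coefs
    if condition == "low":
        return b_enjoy + b_enjoy_low
    if condition == "high":
        return b_enjoy + b_enjoy_high
    return b_enjoy


# Points for a plot with the same layout as Figure 1.
levels = {"1 SD below": -SD_ENJOY, "mean": 0.0, "1 SD above": SD_ENJOY}
for condition in ("none", "low", "high"):
    points = {
        label: round(predicted_passivity(coefs, condition, value), 2)
        for label, value in levels.items()
    }
    print(condition, simple_slope(coefs, condition), points)
```

Under these made-up coefficients the low choice line slopes steeply downward with enjoyment while the no choice and high choice lines are nearly flat, which is the shape of interaction the Results describe.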

Discussion

The findings of Experiment 7 add several new aspects to our understanding of the impact of choice. First, quantity of choice contributed to depletion: Participants who made more choices


[Figure 1 plot: lines for High choice, Low choice, and No choice; y-axis: Waiting Passively (secs), 0 to 450; x-axis: Enjoy (1 SD Above, Mean, 1 SD Below).]

Figure 1. Effect of choice condition (high choice vs. low choice vs. no choice) and anticipated enjoyment of choice task (1 SD above the mean vs. mean level vs. 1 SD below the mean) on passive waiting (Experiment 7). Higher numbers indicate more passivity and thus worse self-control.

were more passive in the sense of waiting longer to notify the experimenter of the equipment problem. This result suggests that the more choices one makes, the more depleted one is. Such a pattern is most consistent with the theory that choosing progressively consumes a limited resource. The quantity effect may seem at odds with one null result of Experiment 1, which found no link between the amount of time spent choosing and the degree of depletion. We think the most likely explanation is that the extent of resource depletion is determined by the quantity rather than the duration of choosing. In Experiment 1, all participants in the choice condition made the same number of choices so variations in time pertained merely to how fast they made those choices. In Experiment 7, the manipulated differences in time corresponded to making more versus fewer choices. Also, the variations in time in Experiment 1 may have been too small to produce significant differences in ego depletion, at least with that measure. The design of Experiment 7 ensured that some participants spent 3 times as long as others on the choosing task. The second finding from Experiment 7 was that subjective enjoyment moderated the depleting effect of choice but only in the 4 min condition. Making a few enjoyable decisions was apparently less depleting than making a few aversive decisions. But when many decisions had to be made, the process was depleting regardless of whether it was pleasant or unpleasant. This finding integrates the results of Moller et al. (2006) with the more general patterns of ego depletion. Moller et al. (2006) found that making a couple of easy, enjoyable choices that expressed the self did not produce ego depletion, and our results are consistent with that. But making many choices becomes depleting even when the activity is viewed as an opportunity for positive self-expression.

General Discussion

Ambivalence about choice presents one of the great seeming paradoxes of modern life. On the one hand, the desire for choice seems ubiquitous. People clamor for freedom in their private and political lives. They exhibit patterns such as reactance (Brehm, 1966; Fitzsimons & Lehmann, 2004) and illusions of control (Ariely, 2000; Langer, 1975) that indicate deeply rooted motives to maintain a feeling of having choices. The marketplace, normally a reliable guide to what people want, offers ever more fine-grained choices, from dozens of car makes and models to (most recently) personalized boxes of disposable tissue paper. On the other hand, people tire of the endless demands for choice and the stress of decision making. In related research, there are signs that too much choice can be detrimental to satisfaction and that people resist facing up to the tradeoffs that many choices involve (Iyengar & Lepper, 2000; Luce, Payne, & Bettman, 1999). One recent analysis demonstrated that behavioral commitment (i.e., buying) initially rose with the number of options but fell when even more options were presented (Avni & Wolford, 2007).

The present investigation sought to shed light on the psychic costs of choice. Making choices can be difficult and effortful, and there is a personal price to choosing, which is seen in worse self-regulation. The main hypothesis was that deliberate, effortful choice consumes a limited resource needed for a broad range of executive functions, including self-regulation. Participants made a series of choices about consumer products, college courses, or class materials; in the no-choice conditions, participants read, studied, and rated those materials without choosing among them. Making choices apparently depleted a precious self-resource because subsequent self-regulation was poorer among those who had made choices than it was among those who had not. This pattern was found in the laboratory, classroom, and shopping mall. It was found with assigned choices and spontaneously made choices. It was found with inconsequential and more consequential choices.
Having multiple experiments permitted us to employ a diversity of manipulations and measures, so that possible ambiguities regarding one procedure could be remedied in another. We had some participants make binding and irrevocable choices, whereas other choices could be reversed later. In some studies we assigned them to make choices or not, and in others we measured how many choices they had spontaneously made. We allowed some participants unlimited time to choose, whereas others were required to stop midtask after a fixed interval. We measured self-regulation in terms of how long they could hold a hand in ice water, how much of a bad-tasting beverage they forced themselves to drink, how much they procrastinated while studying, how long they persisted on unsolvable puzzles, and how long they tried and how well they performed on solvable problems. We also employed a range of supplementary measures, including measures of emotion and mood, self-ratings of fatigue, and perceived difficulty of the tasks.

The most parsimonious explanation for all these findings is that making choices depletes some important intrapersonal resource, indeed the same resource that is needed for self-regulation. Experiment 7 also showed the depleting effect on reduced active responding and a corresponding increase in passivity. This provides valuable further evidence that one common resource is used by the self’s executive function for its diverse activities. That is, making decisions, active initiative, and self-control all appear to depend on the same inner resource.

We attempted also to separate the act of choice itself from the related processes of deliberating and expressing (implementing) the choice. Experiments 1–4 showed that choosing was more depleting than just thinking about the options. Experiment 6 found that choosing was more depleting than was the process of putting choices into action and was more depleting than forming a preference while considering options was. Taken together, these findings tentatively argue for something special about choice.
Based on the Rubicon model, we have proposed that making a choice produces a lasting change in the person’s mental apparatus by etching into the mind and brain the prescription for what to do. The change in mental programming is made at the time of choosing, regardless of whether the chosen action is to be implemented immediately or at some unspecified future time. Making this change requires energy and is depleting.

Alternative Explanations

The present investigation needed multiple experiments, partly because there is no single, unambiguous measure of the constructs. There is no single gold standard measure of self-regulatory resource depletion, and so we measured self-regulation in many different behavioral spheres. The diversity of measures was especially important and helpful because of the theoretical assumption that the same resource is used for many diverse self-regulation activities as well as for effortful decision making.

Given that choosing can be aversive, one important alternative explanation would be that the choosing manipulation was more aversive than the control condition was and that bad moods contributed to the various behavioral decrements afterward. Multiple findings speak resoundingly against this view. We measured mood in several studies and found no differences as a function of choice condition. We also observed the depleting effect of choice even when participants had not reported their moods. Experiment 7 did find that aversive choices are more depleting than pleasant choices, but pleasant choices also became depleting, and moreover the effect of enjoyable versus aversive choosing disappeared when participants had made choices for a relatively long time. In short, neither subjective mood nor enjoyment can explain our findings.

In Experiments 1A and 4A, the experimenter had the informal impression that the choice procedure seemed to take longer than the no-choice procedure, raising the possibility that the effects on self-regulation were caused by the longer duration of the initial task. Experiment 7 showed that spending more time on a depleting choice task had a stronger effect. In other studies, however, the time for the two tasks was kept rigidly equal, which permitted the conclusion that the depleting effects of choice were not due specifically to the time devoted to the task. The best way to integrate these findings is to suggest that it is the amount of psychological work rather than the simple duration of participation that accounts for the extent of depletion.

Experiments 2 and 4 used two different experimenters and blind testing procedures. The results remained strong, and so the effects cannot be explained away in terms of seeking to gain favor for the sake of getting a better gift or a sense of having discharged one’s obligation as a research participant. The two-experimenter system also permitted blind testing, which can largely rule out explanations based on experimenter bias or demand characteristics.

Last, it was important for us to confirm empirically that the experimental manipulations about choice were effective. The pilot experiment showed that high-choice procedures made people feel that they were indeed engaging in decision making, as well as putting more deliberate thought into the task, more than the low-choice procedures. The self was more involved in the high-choice procedure than it was in the no-choice procedure, which is why we think that it expended more of its self-resources.
In short, although some findings may seem open to alternative explanations, we attempted to provide evidence against these alternatives with other studies in the current investigation. The most parsimonious explanation for these findings is that making choices depletes a valuable internal resource that is needed for self-regulation, and thus self-regulation is impaired in the aftermath of decision making.

Distinctiveness of Depletion

Some readers may wonder how these self-regulatory resource-depletion effects can be distinguished from other phenomena familiar to cognitive psychology, ranging from cognitive load to mental effort and mental fatigue. Although we are sympathetic to efforts at integrative theorizing that may produce the most general theories, we do note some distinctions and contrasts between the current model and those other processes.

Studies using cognitive load, like ego-depletion studies, are based on the assumption of a limited resource. In particular, cognitive load is presumed to preoccupy attention, which is limited in its capacity. In contrast to ego depletion, however, attention is presumed to be limited only during the time of preoccupation, and so attention reverts to its baseline (full capacity) as soon as the load is lifted, unlike self-regulatory resource-depletion effects, which involve lasting consequences afterward. Attention and willpower are, therefore, two different resources and operate somewhat differently.

There are also empirical distinctions. A cognitive load impairs the maintenance of information in short-term memory (e.g., Szmalec, Vandierendonck, & Kemps, 2005), whereas ego depletion does not impair the maintenance of information in short-term memory (Schmeichel, 2007, Experiment 2). Thus, ego depletion and cognitive load have distinct effects on short-term memory, suggesting that they are dissociable phenomena. Moreover, recent studies by Schmeichel and Baumeister (2007) found that cognitive load procedures produced results opposite to those of ego depletion on cold pressor task performance: Cognitive load led to longer durations, whereas ego depletion yielded shorter durations.

Some attention-related phenomena do appear to reveal a “hangover” effect, but these are extremely short-lived, that is, on the order of milliseconds. The phenomenon of the attentional blink, for example, occurs when participants fail to perceive the second of two target stimuli appearing in rapid succession (500 ms or less) at the same location on a viewing screen (Raymond, Shapiro, & Arnell, 1992). Similarly, repetition blindness is the failure to perceive repetitions of stimuli presented in rapid succession (e.g., Kanwisher & Potter, 1989). Note that both phenomena peak and dissipate quite rapidly. By contrast, the current research found that the hangover effect from making choices persisted over the course of at least a few minutes, and other research on ego depletion has found effects up to 45 min postmanipulation. The differing time courses suggest two resources, one that fluctuates rapidly and is primarily attentional in nature and another that fluctuates over longer periods of time and is primarily related to choice making, willpower, and executive control. Our focus was on the latter.

The concept of mental fatigue is quite general and may encompass some patterns of ego depletion. Nonetheless, mental fatigue refers to something quite different. Mental fatigue is presumed to affect a broad range of processes, extending even to exceptionally simple and uncontrolled processes, such as perceptual discrimination (e.g., Parasuraman, 1979).
It is typically induced by having participants perform tedious tasks for very long periods of time, such as several hours (e.g., Lorist, Boksem, & Ridderinkhof, 2005). In contrast, self-regulatory resource depletion is often induced by manipulations that require less than 10 min. In the present research, Experiment 7 found depletion occurring after just 4 min, which is probably much too brief to permit discussion of mental fatigue in the cognitive science sense.

Concluding Remarks

The present findings suggest that self-regulation, active initiative, and effortful choosing draw on the same psychological resource. Making decisions depletes that resource, thereby weakening the subsequent capacity for self-control and active initiative. The impairment of self-control was shown on a variety of tasks, including physical stamina and pain tolerance, persistence in the face of failure, and quality and quantity of numerical calculations. It also led to greater passivity.

Decision making and self-control are both prominent aspects of the self’s executive function. It is therefore useful to recognize that they draw on a common psychological resource and that one may affect the other. In particular, making many decisions leaves the person in a depleted state and hence less likely to exert self-control effectively. The common resource needed for self-control, active initiative, and effortful decision making may deserve recognition as an important aspect of self and personality.

The human self is quite remarkably different from what is found in most other species. One likely explanation for these differences is that an escalating complexity of social life, including culture, was a defining theme of human evolution (Baumeister, 2005). These uniquely human social systems have conferred remarkable advantages, ultimately including the long and happy lives enjoyed by many modern citizens. But they require advanced psychological capabilities, which are what set the human self apart from the rudimentary selfhood of other animals. Self-control and decision making are central, vital skills for functioning in human culture. Our findings suggest that the formation of the human self has involved finding a way to create an energy resource that can be used to control action in these advanced and expensive ways. Given the difficulty of these modes of action control, the resource is shared and limited. That is presumably why decision making produces at least a temporary impairment in the capacity for self-control.

References

Aiken, L. S., & West, S. G. (1991). Multiple regression: Testing and interpreting interactions. Newbury Park, CA: Sage.
Ariely, D. (2000). Controlling the information flow: Effects on consumers’ decision making and preferences. Journal of Consumer Research, 27, 233–248.
Avni, M. S., & Wolford, G. (2007). Buying behavior as a function of parametric variation of number of choices. Psychological Science, 18, 369–370.
Bargh, J. A. (2002). Losing consciousness: Automatic influences on consumer judgment, behaviour, and motivation. Journal of Consumer Research, 29, 280–285.
Baumeister, R. F. (1998). The self. In D. Gilbert, S. T. Fiske, & G. Lindzey (Eds.), Handbook of social psychology (4th ed., pp. 680–740). Boston: McGraw-Hill.
Baumeister, R. F. (2002). Yielding to temptation: Self-control failure, impulsive purchasing, and consumer behavior. Journal of Consumer Research, 28, 670–676.
Baumeister, R. F. (2005). The cultural animal: Human nature, meaning, and social life. New York: Oxford University Press.
Baumeister, R. F., Bratslavsky, E., Muraven, M., & Tice, D. M. (1998). Ego depletion: Is the active self a limited resource? Journal of Personality and Social Psychology, 74, 1252–1265.
Baumeister, R. F., Vohs, K. D., & Funder, D. C. (2007). Psychology as the science of self-reports and finger movements: Whatever happened to actual behavior? Perspectives on Psychological Science, 2, 396–403.
Brehm, J. W. (1966). A theory of psychological reactance. New York: Academic Press.
Carver, C. S., & Scheier, M. F. (1990). Origins and functions of positive and negative affect: A control-process view. Psychological Review, 97, 19–35.
Fitzsimons, G. J., & Lehmann, D. R. (2004). Reactance to recommendations: When unsolicited advice yields contrary responses. Marketing Science, 23, 82–94.
Gailliot, M. T., & Baumeister, R. F. (2007). Self-regulation and sexual restraint: Dispositionally and temporarily poor self-regulatory abilities contribute to failures at restraining sexual behavior. Personality and Social Psychology Bulletin, 33, 173–186.
Glass, D. C., Singer, J. E., & Friedman, L. N. (1969). Psychic cost of adaptation to an environmental stressor. Journal of Personality and Social Psychology, 12, 200–210.
Gollwitzer, P. M. (1990). Action phases and mindsets. In E. T. Higgins & J. R. M. Sorrentino (Eds.), The handbook of motivation and cognition (Vol. 2, pp. 53–92). New York: Guilford.
Gollwitzer, P. M. (1996). The volitional benefits of planning. In P. M. Gollwitzer & J. A. Bargh (Eds.), The psychology of action: Linking cognition and motivation to behavior (pp. 287–312). New York: Guilford.
Heckhausen, H., & Gollwitzer, P. M. (1987). Thought contents and cognitive functioning in motivational versus volitional states of mind. Motivation and Emotion, 11, 101–120.
Hofmann, W., Strack, F., & Deutsch, R. (in press). Free to buy? Explaining self-control and impulse in consumer behavior. Journal of Consumer Psychology.
Huffman, C., & Kahn, B. E. (1998). Variety for sale: Mass customization or mass confusion. Journal of Retailing, 74, 491–513.
Iyengar, S., & Lepper, M. (2000). When choice is demotivating: Can one desire too much of a good thing? Journal of Personality and Social Psychology, 79, 995–1006.
Kanswisher, N., & Potter, M. C. (1989). Repetition blindness: The effects of stimulus modality and spatial displacement. Memory and Cognition, 17, 117–124.
Langer, E. J. (1975). The illusion of control. Journal of Personality and Social Psychology, 32, 311–328.
Lorist, M. M., Boksem, M. A. S., & Ridderinkhof, K. R. (2005). Impaired cognitive control and reduced cingulate activity during mental fatigue. Cognitive Brain Research, 24, 199–205.
Luce, M. F., Payne, J. W., & Bettman, J. R. (1999). Emotional trade-off difficulty and choice. Journal of Marketing Research, 36, 143–159.
Malhotra, N. (1982). Information load and consumer decision making. Journal of Consumer Research, 8, 419–430.
Mick, D. G. (2005, Fall). Choice writ larger. Newsletter of the Association for Consumer Research. Retrieved March 6, 2007, from http://www.acrwebsite.org/
Moller, A. C., Deci, E. L., & Ryan, R. M. (2006). Choice and ego depletion: The moderating role of autonomy. Personality and Social Psychology Bulletin, 32, 1024–1036.
Muraven, M., Tice, D. M., & Baumeister, R. F. (1998). Self-control as limited resource: Regulatory depletion patterns. Journal of Personality and Social Psychology, 74, 774–789.
Parasuraman, R. (1979, August 31). Memory load and event rate control sensitivity decrements in sustained attention. Science, 205, 924–927.
Raymond, J. E., Shapiro, K. L., & Arnell, K. M. (1992). Temporary suppression of visual processing in an RSVP task: An attentional blink? Journal of Experimental Psychology: Human Perception and Performance, 18, 849–860.
Richeson, J. A., & Shelton, J. N. (2003). When prejudice does not pay: Effects of interracial contact on executive function. Psychological Science, 14, 287–290.
Sartre, J.-P. (1956). Being and nothingness. (H. E. Barnes, Trans.). Secaucus, NJ: Citadel Press. (Original work published 1943)
Schmeichel, B. J. (2007). Attention control, memory updating, and emotion regulation temporarily reduce the capacity for executive control. Journal of Experimental Psychology: General, 136, 241–255.
Schmeichel, B. J., & Baumeister, R. F. (2007). Cognitive load and ego depletion have divergent effects on pain tolerance. Unpublished manuscript, Texas A&M University, College Station.
Schmeichel, B. J., Vohs, K. D., & Baumeister, R. F. (2003). Intellectual performance and ego depletion: Role of the self in logical reasoning and other information processing. Journal of Personality and Social Psychology, 85, 33–46.
Schwartz, B. (2000). Self-determination: The tyranny of freedom. American Psychologist, 55, 79–88.
Strack, F., Werth, L., & Deutsch, R. (2006). Reflective and impulsive determinants of consumer behavior. Journal of Consumer Psychology, 16, 205–216.
Szmalec, A., Vandierendonck, A., & Kemps, E. (2005). Response selection involves executive control: Evidence from the selective interference paradigm. Memory and Cognition, 33, 531–541.
Tice, D. M., & Baumeister, R. F. (1997). Longitudinal study of procrastination, performance, stress, and health: The costs and benefits of dawdling. Psychological Science, 8, 454–458.
Trout, J. (2005, December 5). Differentiate or die. Forbes. Retrieved December 17, 2006, from http://www.forbes.com/opinions/2005/12/02/ibm-nordstrom-cocacola-cx_jt_1205trout.html
Vohs, K. D. (2006). Self-regulatory resources power the reflective system: Evidence from five domains. Journal of Consumer Psychology, 16, 215–221.
Vohs, K. D., Baumeister, R. F., & Ciarocco, N. (2005). Self-regulation and self-presentation: Regulatory resource depletion impairs impression management and effortful self-presentation depletes regulatory resources. Journal of Personality and Social Psychology, 88, 632–657.
Vohs, K. D., & Faber, R. J. (2007). Spent resources: Self-regulatory resource availability affects impulse buying. Journal of Consumer Research, 33, 537–547.
Vohs, K. D., & Heatherton, T. F. (2000). Self-regulatory failure: A resource-depletion approach. Psychological Science, 11, 249–254.
Vohs, K. D., & Heatherton, T. F. (2001). Self-esteem and threats to self: Implications for self-construals and interpersonal perceptions. Journal of Personality and Social Psychology, 81, 1103–1118.
Waldman, S. (1992, January 27). The tyranny of choice: Why the consumer revolution is ruining your life. New Republic, 22–25.
Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief measures of positive and negative affect: The PANAS scales. Journal of Personality and Social Psychology, 54, 1063–1070.
Webb, T. L., & Sheeran, P. (2003). Can implementation intentions help to overcome ego depletion? Journal of Experimental Social Psychology, 39, 279–286.

Received March 13, 2007
Revision received January 14, 2008
Accepted January 14, 2008

Journal of Personality and Social Psychology 2008, Vol. 94, No. 5, 899–912

Copyright 2008 by the American Psychological Association 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.899

Adolescent Personality Moderates Genetic and Environmental Influences on Relationships With Parents

Susan C. South and Robert F. Krueger, University of Minnesota—Twin Cities
Wendy Johnson, University of Minnesota—Twin Cities and University of Edinburgh
William G. Iacono, University of Minnesota—Twin Cities

In contrast with early theories of socialization that emphasized the role of parents in shaping their children’s personalities, recent empirical evidence suggests an evocative relationship between adolescent personality traits and the quality of the parent–adolescent relationship. Research using behavior genetic methods suggests that the association between personality and parenting is genetically mediated, such that the genetic effects on adolescent personality traits overlap with the genetic effects on parenting behavior. In the current study, the authors examined whether the etiology of this relationship might change depending on the adolescent’s personality. Biometrical moderation models were used to test for gene–environment interaction and correlation between personality traits and measures of conflict, regard, and involvement with parents in a sample of 2,452 adolescents (M age = 17.79 years). They found significant moderation of both positive and negative qualities of the parent–adolescent relationship, such that the genetic and environmental variance in relationship quality varied as functions of the adolescent’s levels of personality. These findings support the importance of adolescent personality in the development of the quality of the parent–adolescent relationship.

Keywords: personality, parenting, behavior genetics, moderation

For many years, it was assumed that family environment, including parenting quality, played a causal role in personality development. This theory of socialization maintained that parents played the major, if not defining, role in child development (Bell, 1968). Subsequent to this, the dynamic interactionistic paradigm recognized that children were not simply the products of parental behavior (Caspi & Shiner, 2006; Magnusson, 1990; Patterson, 1982; Sameroff, 1983) but that individual differences in childhood personality also lead to variations in the quality of the parent–child relationship. Some took this position to its extreme, suggesting that parents have little if any impact on adolescent development (Harris, 1995, 1998). A reasonable synthetic perspective is that child personality and parental behavior are related through “bidirectional interactive processes” (Collins, Maccoby, Steinberg, Heatherington, & Bornstein, 2000, p. 222). In the current study, we examined whether the etiology of the parent–adolescent relationship changes depending on the adolescent’s personality by determining the moderating impact of adolescent personality on the genetic and environmental influences on the parent–adolescent relationship.

The Role of Adolescent Personality in the Parent–Adolescent Relationship

Evidence from the literature on personality development supports the notion of temperament (Rothbart & Bates, 2006), consisting of core traits (Asendorpf & Van Aken, 2003) or basic dispositions (McCrae & Costa, 1999) that are present from birth and have links to adult personality (Caspi et al., 2003). This is not to say, however, that personality is set from birth. There are several mechanisms by which individual characteristics transact with the environment, including interpersonal relationships (Shiner & Caspi, 2003). One of the most important relationships for personality development is the parent relationship. Here, evidence supports a bidirectional influence between the temperament or personality and the parent–child relationship. Negative, inappropriate, or unskilled parenting behaviors appear to play a particularly important role in the development of externalizing behaviors, whereas warm and supportive parenting behaviors seem to act as protective factors (Bates, Petit, Dodge, & Ridge, 1998; Belsky, Hsieh, & Crnic, 1998; Rubin, Burgess, Dwyer, & Hastings, 2003; Stoolmiller, 2001). Conversely, children with temperaments high in negative emotionality elicit less supportive parenting behaviors, particularly in households of low socioeconomic

Susan C. South, Robert F. Krueger, and William G. Iacono, Department of Psychology, University of Minnesota—Twin Cities; Wendy Johnson, Department of Psychology, University of Minnesota—Twin Cities and Department of Psychology, University of Edinburgh, Edinburgh, Scotland. This work was supported in part by United States Public Health Service Grants AA09367 to Matthew McGue and DA05147 to William G. Iacono. Wendy Johnson holds an RCUK Fellowship at the University of Edinburgh, Edinburgh, Scotland. Correspondence concerning this article should be addressed to either Susan C. South or Robert F. Krueger, Department of Psychology, University of Minnesota, N218 Elliott Hall, 75 East River Road, Minneapolis, MN 55455. E-mail: [email protected] or [email protected]


SOUTH, KRUEGER, JOHNSON, AND IACONO

status (Paulussen-Hoogeboom, Stams, Hermanns, & Peetsma, 2007). Shiner and Caspi (2003) delineated six environmental processes that work to shape the development of personality through adolescence into adulthood: learning processes, environmental elicitation, environmental construal, social and temporal comparisons, environmental selection, and environmental manipulation. Theoretically, all of these processes could play a part in the emerging relationship between adolescent personality traits and relationships with parents. There is little empirical evidence for the operation of some (e.g., social comparisons, environmental selection and manipulation) but greater support for the ways in which personality traits may shape the elicitation and construal of behavior from others (Shiner & Caspi, 2003). Particularly as children mature into adolescence, individual differences in personality evoke different responses from parents and selection of different types and frequencies of interactions with parents. It is most likely this process of person–environment transaction that contributes to the increasing stability and consistency of personality during adolescence (Roberts & DelVecchio, 2000).

Although there are a greater number of studies connecting personality to other forms of interpersonal relationships (e.g., peers and romantic partners), accumulating empirical evidence supports an evocative relationship between adolescent personality and parent behaviors. Branje, van Lieshout, and van Aken (2004) found a positive, cross-sectional link between adolescents’ agreeableness and perceived support from parents. Levels of conscientiousness and openness in the older of two siblings were related to perceived support from both parents, whereas extraversion and neuroticism in the younger siblings were related to perceived support from the father only. In a later study, Branje, van Lieshout, and van Aken (2005) examined the relationship between agreeableness and perceived support across all possible combinations of family relationships for parents and adolescents. They found no general link across family members between self-reported agreeableness and perceived support; however, they did find significant agreeableness–support correlations within most of the possible family dyads (i.e., mother–father, father–oldest child) except for the mother–younger adolescent relationship. Consistent findings for a link between personality and perceived parental support in older siblings suggest that this association becomes stronger over the course of adolescence.

Longitudinal studies provide a better method than cross-sectional studies for teasing apart causal influences in the personality–parenting association. Asendorpf and Wilpers (1998) examined the longitudinal connections between personality traits and interpersonal relationships in a German college-student sample. Overall, they found strong stability of personality over a period of 18 months, compared with much greater instability of social relationships. Personality traits influenced social relationships, but not vice versa; specific to family, conscientiousness was related to frequency of contact with parents. Further, change in relationship status was unrelated to changes in personality. The authors concluded that by early adulthood, the quality of relationships with peers, family, and romantic partners is a function of personality.

To examine whether personality in adolescence might be more malleable to influences from relationships, Asendorpf and van Aken (2003) studied the association between personality and parenting behavior between the ages of 12 and 17 years. Adolescents who were higher in levels of conscientiousness reported increasing levels of support from their fathers over this age range. There were no significant relationships between perceived support from parents at age 12 and personality traits at age 17. The authors concluded that core traits, like conscientiousness, are relatively stable and enduring traits from birth to adulthood, whereas surface traits are more prone to influence from the environment. Their findings also supported cross-sectional research that hinted at the greater influence of personality on parent relationship as the age of the adolescent increases.

Sources of Influences on the Personality–Parenting Relationship: Findings From Behavior Genetics

Using behavior genetic methods, researchers have attempted to explicate the nature of the relationship between personality and parental relationship quality. For instance, some have hypothesized that the association between personality and family environment is genetically mediated. At this point, it is well known that measures of the family environment, including parenting, are modestly heritable (Bouchard & McGue, 1990; Elkins, McGue, & Iacono, 1997; Hur & Bouchard, 1995; McGue, Elkins, Walden, & Iacono, 2005; Plomin & Bergeman, 1991; Plomin, McClearn, Pedersen, Nesselroade, & Bergeman, 1988; Reiss, Neiderhiser, Heatherington, & Plomin, 2000; Rowe, 1981, 1983). It was proposed that these measures are heritable because they are influenced by personality traits, which are, on average, 50% heritable (Bouchard & Loehlin, 2001). That is, a person’s individual characteristics affect how they interact with others (or, alternately, how they think others interact with them), so the genetic influences on environmental measures actually reflect the personality traits of the person.

The two mechanisms through which genetically influenced characteristics could affect measures of the environment have been labeled evocative and active gene–environment correlations (Scarr & McCartney, 1983). Evocative correlation occurs when differences in people’s genetically influenced characteristics evoke specific responses from the people around them. Adolescents who are highly emotionally labile, for example, likely elicit very different reactions from their parents than do adolescents with calm, even-tempered dispositions. Active correlation occurs when genetic influences on personality affect the process by which individuals select their environments (e.g., amount of time spent with parents) or the process by which they make attributions regarding aspects of their relationships with others. Adolescents prone to excitement and sensation seeking may be more likely to choose to spend time with like-minded peers rather than with parents who try to reinforce more constrained behavior.

Multivariate genetic models with structural equation modeling detect gene–environment correlation by estimating the degree to which genetic influences on one variable are related to the genetic influences on a second variable. As the degree of genetic overlap between personality and parenting increases, it becomes more probable that genetic influences on personality are responsible for genetic influences on parenting. To date, research has found genetic relationships between personality traits and life events (Billig, Hershberger, Iacono, & McGue, 1996; Saudino, Pedersen, Lichtenstein, McClearn, & Plomin, 1997) and risk of divorce (Jocklin, McGue, & Lykken, 1996). Chipuer, Plomin, Pedersen,

ADOLESCENT PERSONALITY MODERATES RELATIONSHIPS

McClearn, and Nesselroade (1993) collected measures of current family environment, neuroticism, and extraversion in a sample of older adult twins. They found that the association between the personality traits and the relationship scale from the Moos Family Environment Scale was primarily genetically mediated; however, a significant portion of the genetic variation in family environment was unaccounted for by personality. Krueger, Markon, and Bouchard (2003) investigated whether the genetic influences on childrearing environment, including parenting, could account for genetic influences on adult personality. Using data from the Minnesota Study of Twins Reared Apart, Krueger et al. found that the connections between the personality traits of negative emotionality and constraint and the amount of cohesion in the recalled family environment were genetically mediated. They concluded that the same genotype that led to adult personality traits also influenced recall of childhood rearing environment. Taken together, the results from these myriad studies would suggest that a person’s genetically influenced personality traits often influence the nature of their relationships within the family.

Current Study

The limitation in the behavior genetic work on personality and parenting relationship is that these bivariate quantitative genetic models average over any group differences within the population. This is similar to any main-effects model in statistics; for instance, a regression equation predicting adolescent smoking behavior from access to cigarettes averages across the sample. However, just as an examination of moderation in that example might reveal a smaller effect when there is greater parental monitoring, the estimated genetic and environmental influences on one phenotype may vary as a function of differences in another. This is a form of Gene × Environment interaction, the genetic susceptibility to environmental risk, or, alternatively, differential genetic expression in different environments. When Gene × Environment interaction occurs, the genetic influences on a phenotype are more or less important depending on the level of a second trait. Examples of this work and their findings include less genetic influence on depression in unmarried women (Heath, Eaves, & Martin, 1998), smaller genetic influence on disinhibition in more religious families (Boomsma, de Geus, van Baal, & Koopmans, 1999), and greater genetic influence on IQ in families of high socioeconomic status (Turkheimer, Haley, Waldron, D’Onofrio, & Gottesman, 2003). Using models that allow for and quantify moderation of this kind to address the current research topic may increase our understanding of how the parent–adolescent relationship develops in the context of the adolescent’s emerging personality.

Research suggests that there is a bidirectional relationship between the development of personality and the emerging parent–child relationship. In prior work using the same sample of adolescent twins, we found that perceptions of the parenting relationship moderated the genetic and environmental influences on adolescent personality traits (Krueger, South, Johnson, & Iacono, in press). We built on that study by examining the other direction of influence: how qualities of the parent–adolescent relationship, as reported by both the parent and adolescent, result from the adolescent’s personality traits. Using new modeling for biometrical moderation, we examined whether the individual characteristics of the adolescent (i.e., personality traits) can change or influence the etiology of the personality–parenting relationship.

Method

Participants

The current study used participants from the Minnesota Twin Family Study (MTFS), an ongoing population-based, longitudinal study of adolescent twins and their families. Birth records and public databases were used to locate more than 90% of twin births in the state of Minnesota from 1971 through 1985. Families were excluded from the study if either twin had a cognitive or physical handicap that would preclude them from completing our daylong, in-person assessment, or if the family lived more than 1 day’s drive from our Minneapolis laboratory. Of the eligible families, 83% agreed to participate. Although there were no significant differences between participating and nonparticipating families in regard to socioeconomic status and self-reported mental health problems, parents in participating families had slightly, albeit significantly, more education (0.25 years) than did parents in nonparticipating families (Iacono, Carlson, Taylor, Elkins, & McGue, 1999). Reflecting the population of Minnesota at the time of the twins’ birth, approximately 98% of the sample was Caucasian. Children gave informed assent, and parents gave informed consent for themselves and their children. The research protocol was approved by the University of Minnesota Institutional Review Board. Further information regarding all aspects of MTFS recruitment is detailed elsewhere (Iacono et al., 1999).

The MTFS uses two cohorts of twins in an accelerated longitudinal design. Participants enter the study at age 11 or 17 years (corresponding to younger and older cohorts, respectively) and return for follow-up assessments approximately every 3 years thereafter. The original 11-year-old cohort consisted of 756 same-sex, reared-together monozygotic (MZ) and dizygotic (DZ) twin pairs: 376 male pairs (254 MZ; 122 DZ) and 380 female pairs (233 MZ; 147 DZ). The 17-year-old cohort consisted of 626 same-sex twin pairs: 289 male pairs (188 MZ; 101 DZ) and 337 female pairs (223 MZ; 114 DZ).

For the purposes of the current study, we used data from both cohorts at the overlapping assessment point of age 17 years: the older cohort at intake and the younger cohort at their second follow-up visit. This included all 1,252 individuals from the older cohort at the intake assessment and 1,320 twins from the younger cohort who completed the second follow-up assessment (87% of the younger cohort). Participants were excluded from this total sample of 2,572 if they were missing data on all of the personality variables and all of the parenting variables (n = 67) and if cotwin data were entirely missing (n = 53). This brought the final sample size to 2,452, including 585 male twin pairs (386 MZ; 199 DZ) and 641 female twin pairs (412 MZ; 229 DZ). The greater percentage of MZ twins relative to DZ twins in this sample is due to an overrepresentation of MZ twins in the population from which the sample was drawn, as well as a somewhat higher participation rate of families with MZ twins (Hur, McGue, & Iacono, 1995). At the time participants completed the measures used in the current study, they ranged in age from 16.55 to 20.12 years, with a mean of 17.79 (SD = 0.65).
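The sample accounting in this section can be verified with simple arithmetic. The following is a minimal bookkeeping sketch; the variable names are hypothetical, and every count is taken directly from the text above.

```python
# Counts reported in the Participants section; variable names are illustrative.
older_cohort_intake = 1252       # individuals from the older cohort at intake
younger_cohort_followup = 1320   # twins completing the second follow-up

total_assessed = older_cohort_intake + younger_cohort_followup

missing_all_measures = 67        # missing all personality and parenting data
missing_cotwin = 53              # cotwin data entirely missing

final_sample = total_assessed - missing_all_measures - missing_cotwin

# Twin pairs in the final sample, keyed by (sex, zygosity).
pairs = {
    ("male", "MZ"): 386,
    ("male", "DZ"): 199,
    ("female", "MZ"): 412,
    ("female", "DZ"): 229,
}
male_pairs = sum(n for (sex, _), n in pairs.items() if sex == "male")
female_pairs = sum(n for (sex, _), n in pairs.items() if sex == "female")

print(total_assessed, final_sample, male_pairs, female_pairs)
```

The totals agree with the reported figures: the two cohorts sum to 2,572, the exclusions leave 2,452, and the 585 male and 641 female pairs account for 2,452 individuals.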


Zygosity

In the MTFS, three estimates are used to determine twin zygosity. MTFS staff evaluates the twins’ physical similarity, including visage, hair color, and face and ear shape. Next, parents complete a standard zygosity questionnaire. Finally, ponderal and cephalic indices and fingerprint ridge count are measured. A previous validation study (N = 50) demonstrated 100% accuracy of zygosity determination when these three estimates agree. When these three estimates do not agree, a blood sample is requested, and a serological analysis is performed.

Assessment of Personality

Personality was measured with the Multidimensional Personality Questionnaire (MPQ; Tellegen & Waller, in press), a 198-item self-report personality measure designed to assess a broad range of personality characteristics across normal populations. All participants were mailed the MPQ prior to the assessment and asked to bring the completed inventory with them to their in-person visit. If the MPQ was not completed upon a participant’s arrival for his or her laboratory assessment or by the end of the day-long visit, the participant was asked to take it home and return the completed measure to the study by mail. Internal consistency reliabilities for the MPQ range from .76 to .89 and 30-day test–retest reliabilities range from .82 to .92.

The MPQ assesses 11 primary personality domains, 10 of which load on three higher order factors (the 11th scale, Absorption, does not load on any factor). The three higher order factors are Positive Emotionality (PEM; a broad measure of positive well-being and tendency to view life as a pleasurable experience), Negative Emotionality (NEM; a propensity to experience psychological distress), and Constraint (CN; a tendency to endorse traditional values and act in a cautious manner). PEM subsumes the lower order scales of Well-Being, Achievement, Social Potency, and Social Closeness. NEM is composed of Aggression, Alienation, and Stress Reaction. Finally, CN is a composite of Traditionalism, Control, and Harm Avoidance. Only the higher order scales were included in the present analyses. MPQ data were available for 2,169 individuals (n = 1,053 men, n = 1,116 women) from the total sample of 2,452.

Parent–Adolescent Relationship Quality The Parental Environment Questionnaire (PEQ) was administered to both the adolescent twins and their mothers to tap perceptions of parent– child relationships. The PEQ was mailed to all participants prior to the assessment. If the PEQ was not completed upon their arrival for their laboratory assessment or by the end of the day-long visit, participants were asked to complete it at home and return it by mail. The PEQ asks mothers and twins to complete 50 items assessing aspects of their relationship on a 4-point scale (1 ⫽ definitely true, 2 ⫽ probably true, 3 ⫽ probably false, and 4 ⫽ definitely false). Twins rated their relationships with their mothers and fathers, whereas mothers rated their own relationship with each individual twin. Items in the versions of the PEQ completed by mothers and twins were essentially the same, with minor changes in wording appropriate for the particular rater. The PEQ was developed by the MTFS because the standard measures of family environment available when the study began

failed to assess dyadic relationships within the family, instead focusing on the overall family climate. The PEQ scales were organized around the two broad domains of conflict (vs. nurturance/warmth) and control and correlate significantly in the expected direction with an alternative measure of the family environment (Elkins et al., 1997). Further details regarding the development, theoretical rationale, and psychometric properties of the PEQ are given in Elkins et al. (1997). Five factor-analytically derived scores are produced by responses to the PEQ: Conflict, Parent Involvement, Regard for Parent, Regard for Child, and Structure. The Structure scale is the only scale to assess the control aspect; in addition, it has low internal reliability (Elkins et al., 1997), thus, we did not consider it in this study. For the present investigation, we used scores for the Conflict scale (12 items, e.g., “my parent often loses her/his temper with me”; ␣ ⫽ .82), Involvement scale (12 items, e.g., “I talk about my concerns and my experiences with my parent”; ␣ ⫽ .74), Regard for Parent scale (eight items, e.g., “I want to be like my parent in a number of ways”; ␣ ⫽ .75), and the Parent Regard for Twin scale (five items, e.g., “my parent is proud of me”; ␣ ⫽ .69; McGue et al., 2005). Scores were prorated for scales missing ratings for 10% or fewer of their items (i.e., the average of the other items was used as the missing item’s score); if more than 10% of the scale’s constituent items were missing, the scale was considered missing. To create composite indices of Regard, Conflict, and Involvement, we averaged ratings from mothers and twins (Burt, Krueger, McGue, & Iacono, 2003; Burt, McGue, Iacono, & Krueger, 2006; Burt, McGue, Krueger, & Iacono, 2005). Because twin ratings about the mother and father were highly correlated (⬎.80; McGue et al., 2005), we first averaged these ratings. 
Significant correlations were also found for (a) twin reports of Regard for Parent and Parent Regard for Twin (r = .65, p < .0001) and (b) mother reports of Regard for Parent and Parent Regard for Twin (r = .41, p < .0001), so we combined scores for these two scales to form summary scores for Regard. Separate composite indices were then created for Regard, Conflict, and Involvement. We averaged the twin composite and the mother report of twin to more fully capture the relational aspect of the parent–adolescent relationship (the correlations between mother report of twin and twin report of mother were .41 for Conflict, .29 for Regard, and .36 for Involvement). This is supported by previous work in the MTFS samples that has found that each informant uniquely contributes to the prediction of external validity criteria, including behavior problems and grades (Burt et al., 2005). PEQ data were available for 2,364 participants for Regard (n_men = 1,119, n_women = 1,245), 2,365 individuals for Conflict (n_men = 1,120, n_women = 1,245), and 2,364 individuals for Involvement (n_men = 1,119, n_women = 1,245).
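The proration rule and the two averaging steps described in this section can be illustrated with a minimal Python sketch. It is not the MTFS scoring code: the function names are hypothetical, and it assumes that a scale score is the sum of its item ratings, which the text does not state.

```python
from statistics import mean
from typing import Optional, Sequence


def prorated_scale_score(items: Sequence[Optional[float]],
                         max_missing: float = 0.10) -> Optional[float]:
    """Sum the item ratings, scoring each missing item (None) as the mean
    of the answered items. Returns None when more than `max_missing` of
    the items are missing. Summing is an assumption of this sketch."""
    answered = [x for x in items if x is not None]
    n_missing = len(items) - len(answered)
    if not answered or n_missing / len(items) > max_missing:
        return None  # the scale is treated as missing
    return sum(answered) + n_missing * mean(answered)


def composite_rating(twin_about_mother: float,
                     twin_about_father: float,
                     mother_about_twin: float) -> float:
    """Average the twin's two ratings first, then average that twin
    composite with the mother's report."""
    twin_composite = mean([twin_about_mother, twin_about_father])
    return mean([twin_composite, mother_about_twin])


# A 12-item scale with one unanswered item (8.3% missing) is prorated;
# with two unanswered items (16.7% missing) it is treated as missing.
one_gap = prorated_scale_score([2.0] * 11 + [None])
two_gaps = prorated_scale_score([2.0] * 10 + [None, None])
```

Under the 10% rule, the five-item and eight-item scales tolerate no missing items at all, and the 12-item scales tolerate exactly one.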

Biometric Analyses

We used biometric modeling to evaluate the genetic and environmental moderation of parent–adolescent relationship quality by adolescent personality. This type of modeling makes use of twin methodology and structural equation modeling to estimate how much of the variance in a trait (phenotype) is due to additive genetic effects, or the effect of individual genes summed over loci (A); shared environmental effects, or the extent to which growing up in the same family makes people similar (C); and nonshared environmental effects, or the extent to which people are unique, despite growing up in the same family (E). The standard univariate ACE model assumes that the A, C, and E components are fixed over the entire population from which the sample is drawn. In other words, there is no provision for the association between the genetic and environmental influences on personality and any other trait. To test our assertion that the genetic and environmental influences on the parent–adolescent relationship differ as a function of adolescent personality, we needed to use a model that allowed the variance components of parent relationship quality to vary as a function of adolescent personality.

This type of analysis has been referred to as a test of gene–environment interaction, or the notion that different environments can lead to different genetic expression of a phenotype. However, this term does not completely describe the nature of the effects we examined. In the current study, neither the moderator variable, personality, nor the dependent variable, parenting behavior, is wholly environmental or wholly genetic. The advantage of the model that we used is that it is possible to decompose the moderator variable into its genetic and environmental variance components and test for gene–environment interaction in the presence of gene–environment correlation (Purcell, 2002). A genetic correlation is the amount of overlap in the genetic influences on two phenotypes and ranges from −1 to 1; similar types of correlations (i.e., overlap) can occur for shared and nonshared environmental influences.
Therefore, we use the more accurate term biometrical moderation to refer to the analyses conducted in these studies; this term better captures the goal of this study: to determine whether the magnitude of genetic and environmental influences on parenting depends on the adolescent’s level of personality (gene–environment interaction) and the extent to which influences acting on parenting also exert influences on personality (gene–environment correlation).

We fit biometric models to the raw data using the Mx software system (Neale, Boker, Xie, & Maes, 2003). To ease interpretation, we recoded the parenting variables as necessary so that they would be positively correlated with the personality variables (which remained scored in the same direction for all models, with higher scores corresponding to greater PEM, NEM, or CN). To correct for potential biases in model fitting, we adjusted the personality and parenting relationship scales for effects of age and gender (McGue & Bouchard, 1984). Each scale was regressed on age, age², Age × Gender, and Age² × Gender, and the standardized residuals from these regressions were used in subsequent analyses. Because not all participants had both MPQ and PEQ data for all scales, we used full-information maximum likelihood with the raw data, a procedure that was also necessary for the moderated biometric models we were using. This procedure relies on the assumption that data are missing at random (Little & Rubin, 2002), an assumption that we considered reasonable in this case, as questionnaire return status was not linked in any way to any study measure status.

Fits of the moderation models were judged relative to the fit of a bivariate decomposition model in which the six moderation parameters (βXcM and βXuM for A, C, and E) were fixed at zero, so that aC + βXcM became aC + (0 × M) = aC. We used two indices to evaluate model fit: (a) the likelihood-ratio test (LRT; distributed as chi squared and computed as the difference in the −2 log-likelihood values for the two models) and (b) the Akaike Information Criterion (AIC; Akaike, 1987). The LRT is used as a goodness-of-fit index, representing the degree of fit between model expectations and observed data. Statistically significant values are associated with a relatively poor fit. Improvements in the model’s fit, from adding or omitting parameters, can be assessed by a statistically significant change in LRT. AIC is also conventionally used to compare the fit of alternative models. The AIC considers goodness of fit in the likelihood sense (how well the model reproduces the observed data) but prefers models that capture the data both accurately and parsimoniously over more complex models. Because the aim of model fitting is to explain the data as parsimoniously as possible, the model with the lowest AIC value is generally considered best.
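As a worked illustration of the two fit indices, the following Python sketch recomputes the comparison of the no-moderation and full moderation models for Conflict moderated by PEM from the −2lnL and df values reported in Table 2. Two points are assumptions rather than statements from the text: the AIC is taken to be −2lnL − 2 × df, which is the form that reproduces the tabled AIC values, and the p value uses a closed-form chi-square tail that holds only for even degrees of freedom (all Δdf in Table 2 are 2, 4, or 6). The function names are hypothetical.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail probability of a chi-square variate.
    Closed form exp(-x/2) * sum_{i<df/2} (x/2)^i / i!, valid for even df only."""
    if df <= 0 or df % 2:
        raise ValueError("closed form requires a positive, even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def compare_models(m2ll_base: float, df_base: int,
                   m2ll_full: float, df_full: int):
    """Likelihood-ratio test and AIC for two nested models.
    AIC = -2lnL - 2*df is assumed here because it matches Table 2."""
    delta_chi2 = m2ll_base - m2ll_full   # difference in -2lnL
    delta_df = df_base - df_full         # number of added parameters
    p_value = chi2_sf_even_df(delta_chi2, delta_df)
    aic_base = m2ll_base - 2 * df_base
    aic_full = m2ll_full - 2 * df_full
    return delta_chi2, delta_df, p_value, aic_base, aic_full


# Conflict moderated by PEM: no moderation vs. A, C, and E full moderation.
result = compare_models(10597.12, 4039, 10581.12, 4033)
```

With these inputs the sketch gives Δχ²(6) = 16.00 with p of about .014, and AIC values of 2519.12 and 2515.12, in line with the corresponding Table 2 entries.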

Results

Descriptive Statistics

We calculated the phenotypic correlations between personality and parenting variables using Mplus (Muthén & Muthén, 1998–2006), which uses a maximum likelihood robust estimator to produce confidence intervals that are adjusted for the nonindependence of the twin data. The relationships among the transformed variables were in the expected directions (see Table 1). Regard was positively correlated with PEM (r = .25; 95% confidence interval [CI] = 0.20, 0.31; p < .0001), negatively correlated with NEM (r = −.25; 95% CI = −0.29, −0.21; p < .0001), and positively correlated with CN (r = .24; 95% CI = 0.19, 0.29; p < .0001).

Table 1
Phenotypic and Twin Correlations for Parenting and Personality Measures

                                                                          Twin correlations
Variable                       1       2       3       4       5       6       MZ      DZ
1. Regard                      –                                               .70     .52
2. Conflict                  −.63      –                                       .72     .46
3. Involvement                .77    −.62      –                               .71     .53
4. Positive Emotionality      .25    −.11     .29      –                       .52     .22
5. Negative Emotionality     −.25     .35    −.25    −.10      –               .49     .17
6. Constraint                 .24    −.26     .26     .17    −.08      –       .54     .21

Note. MZ = monozygotic; DZ = dizygotic.


[Figure 1 path diagram: latent factors Ac, Cc, and Ec load on PERSONALITY through paths aM, cM, and eM and on PARENTING through paths aC + βXcM, cC + βXcM, and eC + βXcM; latent factors Au, Cu, and Eu load on PARENTING only, through paths aU + βXuM, cU + βXuM, and eU + βXuM.]

Figure 1. Path diagram of a biometrical moderation model with adolescent personality (PERSONALITY) moderating the genetic and environmental influences on parent–adolescent relationship (PARENTING). The model is a variation of the bivariate (Cholesky) decomposition model, in which the variances and covariances of the observed variables are decomposed into the proportion of variance associated with genetic (a), shared environmental (c), and nonshared environmental (e) components that are shared between the two phenotypes and unique to the second phenotype. There are two sets of paths contributing genetic and environmental influences: those common to parent–adolescent relationship and the moderator (personality), and those unique to parent relationship. The paths from the moderator (M) variable to the dependent variable are now linear functions of the form a + βM, where a is the parameter for genetic influence on the variable, β is a regression coefficient, and M is the level of the moderator variable. A = genetic variance, C = shared environmental variance, E = nonshared environmental variance.
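The path structure in Figure 1 implies how one variance component of PARENTING, and its correlation with the moderator, change with the moderator level M. The Python sketch below works this out for a single component (A, C, or E). It assumes the standard Cholesky algebra: the component's variance is the sum of the squared common and unique paths, and its correlation with the moderator is the common path divided by the square root of that variance, given a positive path on the moderator side. The class name and the parameter values are hypothetical and are not estimates from this study.

```python
import math
from dataclasses import dataclass


@dataclass
class ModeratedPaths:
    """Paths for one variance component (A, C, or E) of PARENTING."""
    common: float       # e.g., aC: path from the factor shared with PERSONALITY
    beta_common: float  # e.g., βXc: moderation of the common path
    unique: float       # e.g., aU: path from the factor unique to PARENTING
    beta_unique: float  # e.g., βXu: moderation of the unique path

    def variance(self, m: float) -> float:
        """Component variance at moderator level m:
        (aC + βXc*m)^2 + (aU + βXu*m)^2."""
        c = self.common + self.beta_common * m
        u = self.unique + self.beta_unique * m
        return c * c + u * u

    def correlation(self, m: float) -> float:
        """Correlation of this component with the same component of the
        moderator, assuming the moderator's own path (e.g., aM) is positive."""
        c = self.common + self.beta_common * m
        var = self.variance(m)
        if var == 0.0:
            raise ZeroDivisionError("component variance is zero at this level")
        return c / math.sqrt(var)


# Hypothetical values, evaluated at the five levels used for Table 3.
genetic = ModeratedPaths(common=0.2, beta_common=0.05, unique=0.5, beta_unique=0.1)
profile = [(m, genetic.variance(m), genetic.correlation(m)) for m in (-2, -1, 0, 1, 2)]
```

When both β terms are zero the variance is constant across M, which is the no-moderation case; when the unique path is zero at every level the correlation is 1.00, as is the case for rC throughout Table 3.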

Conflict was negatively correlated with PEM (r = −.11; 95% CI = −0.16, −0.06; p < .0001), positively correlated with NEM (r = .35; 95% CI = 0.30, 0.39; p < .0001), and negatively correlated with CN (r = −.26; 95% CI = −0.31, −0.21; p < .0001). Involvement was positively correlated with PEM (r = .29; 95% CI = 0.24, 0.34; p < .0001), negatively correlated with NEM (r = −.25; 95% CI = −0.30, −0.21; p < .0001), and positively correlated with CN (r = .26; 95% CI = 0.21, 0.31; p < .0001).

Also shown in Table 1 are basic MZ and DZ twin correlations for the personality and parenting measures. We computed twin correlations using an intraclass correlation coefficient in SPSS. These MZ and DZ intraclass correlations can be compared to obtain a general indication of the extent to which genetic and environmental influences operate on the phenotype. In all cases, the MZ correlations exceeded the DZ correlations, suggesting that both personality and parenting measures are likely to be influenced by genetic effects. However, MZ correlations for the parenting measures were less than double the DZ correlations, implicating the importance of shared environmental effects.¹
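The informal comparison of MZ and DZ correlations can be made concrete with Falconer's formulas, a textbook heuristic that is not the model-fitting approach used in this study. The Python sketch below applies it to the Table 1 twin correlations for Regard; the function name is hypothetical, and the resulting numbers are rough guides rather than the maximum likelihood estimates reported later.

```python
def falconer(r_mz: float, r_dz: float):
    """Rough ACE estimates from twin correlations (Falconer's formulas).
    h2 = 2(rMZ - rDZ), c2 = 2rDZ - rMZ, e2 = 1 - rMZ."""
    h2 = 2 * (r_mz - r_dz)
    c2 = 2 * r_dz - r_mz
    e2 = 1 - r_mz
    return h2, c2, e2


# Regard: rMZ = .70, rDZ = .52 (Table 1).
regard_h2, regard_c2, regard_e2 = falconer(0.70, 0.52)
```

For Regard this gives roughly h² = .36, c² = .34, and e² = .30. The shared environmental term is positive precisely because the MZ correlation is less than twice the DZ correlation, which is the pattern the text describes for the parenting measures.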

Biometric Moderation Analysis

We tested whether adolescent personality traits moderated the genetic and environmental influences on the quality of the parent–adolescent relationship by fitting our variables to the biometrical moderation model in Figure 1. There were nine combinations of parenting and personality variables: Regard moderated by PEM, NEM, and CN; Conflict moderated by PEM, NEM, and CN; and Involvement moderated by PEM, NEM, and CN. For each of these nine combinations, we compared the fit of a model with all moderation parameters fixed to zero (no moderation) to a model with all moderation parameters estimated (A, C, and E full moderation). As shown in Table 2, full ACE biometrical moderation was significant for seven of the nine possible combinations of parenting and personality variables. These results were supported by both LRT and AIC. The strongest effects were for the moderating effects of PEM, χ²(6) = 54.89, p < .0001; NEM, χ²(6) =

¹ An anonymous reviewer suggested that we augment our results by providing MZ and DZ twin correlations for low versus high levels of personality. If the moderator variable were the same for both twins (e.g., parental marital status or socioeconomic status), then it would be quite easy to calculate twin correlations at different levels of the moderator. However, in this study, the twin pairs are often discordant for level of personality traits (i.e., Twin 1’s level of PEM differs from Twin 2’s level of PEM). So to calculate a twin correlation at low versus high levels of personality is not possible. Indeed, the fact that the biometric moderation models use all individual data, and not covariances between twins, is a strength of the study and why these models are able to calculate ACE variance estimates at different levels of the moderator (Purcell, 2002).

Table 2
Fit Statistics From the Models of Variance Components Allowing for Gene–Environment Interaction and Correlation

Moderating variable: Positive Emotionality

Relationship quality variable     −2lnL      df      Δχ²    Δdf   p       AIC
Regard
  No moderation                   10566.98   4038                         2490.98
  Only A moderation               10547.67   4036   19.31    2    0.000   2475.67
  Only C moderation               10537.09   4036   29.89    2    0.000   2465.09
  Only E moderation               10519.83   4036   47.15    2    0.000   2447.83
  A and C moderation (no E)       10535.49   4034   31.49    4    0.000   2467.49
  A and E moderation (no C)       10517.11   4034   49.87    4    0.000   2449.11
  C and E moderation (no A)       10514.19   4034   52.79    4    0.000   2446.19
  A, C, and E full moderation*    10512.09   4032   54.89    6    0.000   2448.09
Conflict
  No moderation                   10597.12   4039                         2519.12
  Only A moderation               10596.10   4037    1.02    2    0.600   2522.10
  Only C moderation               10588.46   4037    8.67    2    0.013   2514.46
  Only E moderation               10595.86   4037    1.26    2    0.532   2521.86
  A and C moderation (no E)       10587.26   4035    9.86    4    0.043   2517.26
  A and E moderation (no C)       10584.03   4035   13.09    4    0.011   2514.03
  C and E moderation (no A)       10583.78   4035   13.35    4    0.010   2513.78
  A, C, and E full moderation*    10581.12   4033   16.00    6    0.014   2515.12
Involvement
  No moderation                   10497.52   4039                         2419.52
  Only A moderation
  Only C moderation
  Only E moderation
  A and C moderation (no E)
  A and E moderation (no C)
  C and E moderation (no A)
  A, C, and E full moderation     10486.68   4033   10.84    6    0.094   2420.68

Moderating variable: Negative Emotionality

Relationship quality variable     −2lnL      df      Δχ²    Δdf   p       AIC
Regard
  No moderation                   10558.10   4038                         2482.10
  Only A moderation               10509.91   4036   48.19    2    0.000   2437.91
  Only C moderation               10516.23   4036   41.87    2    0.000   2444.23
  Only E moderation               10529.69   4036   28.41    2    0.000   2457.69
  A and C moderation (no E)       10507.58   4034   50.52    4    0.000   2439.58
  A and E moderation (no C)       10504.38   4034   53.71    4    0.000   2436.38
  C and E moderation (no A)*      10502.93   4034   55.16    4    0.000   2434.93
  A, C, and E full moderation     10501.57   4032   56.53    6    0.000   2437.57
Conflict
  No moderation                   10385.29   4039                         2307.29
  Only A moderation
  Only C moderation
  Only E moderation
  A and C moderation (no E)
  A and E moderation (no C)
  C and E moderation (no A)
  A, C, and E full moderation     10365.76   4033   19.53    6    0.003   2299.75
Involvement
  No moderation                   10505.15   4039                         2427.15
  Only A moderation*              10471.38   4037   33.77    2    0.000   2397.38
  Only C moderation               10476.68   4037   28.47    2    0.000   2402.68
  Only E moderation               10486.91   4037   18.23    2    0.000   2412.91
  A and C moderation (no E)       10470.66   4035   34.49    4    0.000   2400.66
  A and E moderation (no C)       10468.75   4035   36.40    4    0.000   2398.75
  C and E moderation (no A)       10471.07   4035   34.08    4    0.000   2401.07
  A, C, and E full moderation     10467.82   4033   37.33    6    0.000   2401.82

Moderating variable: Constraint

Relationship quality variable     −2lnL      df      Δχ²    Δdf   p       AIC
Regard
  No moderation                   10577.61   4038                         2501.61
  Only A moderation               10555.89   4036   21.72    2    0.000   2483.89
  Only C moderation               10553.98   4036   23.63    2    0.000   2481.98
  Only E moderation               10531.40   4036   46.21    2    0.000   2459.40
  A and C moderation (no E)       10552.29   4034   25.32    4    0.000   2484.29
  A and E moderation (no C)       10524.95   4034   52.66    4    0.000   2456.95
  C and E moderation (no A)*      10520.14   4034   57.47    4    0.000   2452.14
  A, C, and E full moderation     10517.74   4032   59.87    6    0.000   2453.74
Conflict
  No moderation                   10491.52   4039                         2413.52
  Only A moderation
  Only C moderation
  Only E moderation
  A and C moderation (no E)
  A and E moderation (no C)
  C and E moderation (no A)
  A, C, and E full moderation     10480.89   4033   10.63    6    0.101   2414.89
Involvement
  No moderation                   10518.30   4039                         2440.30
  Only A moderation
  Only C moderation
  Only E moderation
  A and C moderation (no E)
  A and E moderation (no C)
  C and E moderation (no A)
  A, C, and E full moderation     10502.59   4033   15.70    6    0.015   2436.59

Note. −2lnL = −2 log likelihood; AIC = Akaike’s Information Criterion; A = genetic variance; C = shared environmental variance; E = nonshared environmental variance. Selected moderation models are marked with an asterisk (*). Empty cells indicate that no tests of specific moderation parameters were conducted because the full moderation model was not a better fit to the data than the no-moderation model, according to either raw or transformed data.


56.53, p < .0001; and CN, χ²(6) = 59.87, p < .0001, on Regard. The moderating effects of PEM on Involvement and CN on Conflict could be removed without a significant decrease in fit, suggesting that the genetic and environmental effects on Involvement and Conflict were the same across all levels of PEM and CN, respectively.

Confirmation of findings with transformed data. We used two approaches to confirm that these results were not simply the result of nonnormality in the data. First, we again compared the no-moderation and ACE full moderation models, this time using transformed scores for Conflict (skew = 0.436), Regard (skew = −1.117), and Involvement (skew = −0.571). To use the same type of transformation on all variables, we first reverse scored Conflict and subjected all variables to a square transformation. Results reported above for the raw data (age and gender regressed) were generally replicated using transformed scores for Regard, Conflict, and Involvement; however, the effects of NEM on Conflict, χ²(6) = 7.77, p = ns, and CN on Involvement, χ²(6) = 8.88, p = ns, were no longer significant. In a second, separate attempt to confirm that our findings were not the result of extreme data points, we truncated the raw data for the three personality variables (PEM, NEM, and CN). After age and gender were regressed out of the personality variables, we trimmed the z score residuals such that anyone above 2 or below −2 was recoded to these boundaries. When these truncated personality variables were used, all seven significant effects found with raw data were replicated.²

Further analysis of moderation models. We now turn to further interpretation of the moderation models but limit our discussion to the five models that were best supported with analyses using raw and transformed data: Regard moderated by PEM, NEM, and CN; Conflict moderated by PEM; and Involvement moderated by NEM.
Because the moderation of Conflict by NEM and Involvement by CN could not be completely confirmed using transformed data, we remain cautious in our interpretation of these results and do not discuss these models further. Having established that the full moderation model with all moderation parameters freely estimated provided a better fit to the data than did the no-moderation models for five of the combinations of parenting and personality variables, we then sought to establish which moderation parameters were driving the effect. That is, we wished to determine whether all of the three types of variance (genetic, shared environmental, unique environmental) were moderated by the personality variables or whether moderation occurred for some of these variance components and not others. Starting from the no-moderation, baseline model, we added moderation for each of the A, C, and E paths and their combinations in turn. In sum, a total of six models (in addition to the no-moderation and full moderation models) were run: (a) only A moderation (no C and E), (b) only C moderation (no A and E), (c) only E moderation (no A and C), (d) A and C moderation (no E), (e) A and E moderation (no C), and (f) C and E moderation (no A). The results of this full series of models are presented in Table 2, with the best fitting moderation models highlighted, and are discussed in turn below.

Table 3 presents the estimated variance components (A, C, and E) for Regard, Conflict, and Involvement from the no-moderation models and the best fitting moderation models. Estimates from the no-moderation models are equivalent to estimates from a standard bivariate decomposition model such that they apply to the population in the aggregate, regardless of level of personality. When moderation is significant for any of the ACE components, the variance estimates vary as functions of every level of the moderator variable (here, personality); for ease of presentation, they are shown in Table 3 at five different levels, scaled in standard deviation units (z scored): −2, −1, 0, 1, and 2 standard deviations away from the mean of the moderator.³ Two types of ACE estimates are given in Table 3, for both the moderation and no-moderation models: (a) the raw A, C, and E estimates are shown in the first three columns, followed by the total phenotypic variance; and (b) A, C, and E estimates expressed as proportions of the total variance in parenting (e.g., A% = A / (A + C + E), also referred to as h², the heritability estimate). The final three columns present the genetic and environmental correlations (rA, rC, and rE).

Moderation of Regard by PEM. The models based on PEM and Regard are displayed graphically in Figure 2. The first panel illustrates the unstandardized variance components for Regard from the no-moderation model decomposition of PEM and Regard. As shown, the A (genetic) component is .40, whereas the E (nonshared environmental) and C (shared environmental) components are about .30, corresponding to the variance estimates given in Table 3. As this model does not take into account moderation of parenting by personality, the variance components are static across the range of PEM. As shown in Table 2, the moderation model clearly fits the data better than the no-moderation model.
Further parsing of the full ACE moderation model revealed that two submodels of the moderation model, in which there was moderation on E only, χ²(2) = 47.15, p < .0001, AIC = 2447.83, and moderation on C and E only, χ²(4) = 52.79, p < .0001, AIC = 2446.19, technically fit better than the full ACE moderation model, χ²(6) = 54.89, p < .0001, AIC = 2448.09. However, examination of the plot of the full ACE model revealed a clearly visible effect of A and C, thus convincing us that the full moderation model best represented the true nature of the moderation of Regard by PEM. Figure 2 shows the results of estimates from the full ACE moderation model, in which all three ACE variance components for Regard vary as functions of PEM (shown as z scores from −2 to 2). As PEM increases, the genetic variance of Regard increases, whereas both the shared and nonshared environmental effects decrease. With no moderation, the proportion of variance in individual differences in Regard (from a bivariate decomposition model with PEM) is 40.2% genetic, 31.4% shared environmental, and 28.4% nonshared environmental. This is very similar to the results from the moderation model at a mean level (z score of 0) of PEM (h² = .44, c² = .30, e² = .27). As Table 3 shows, however, the genetic component of variance increases from low to high levels of PEM; because the total phenotypic variance in Regard decreases from low to high levels of PEM, the heritability of Regard is greatest when PEM is greatest. At low levels of PEM, Regard is largely due

² Results for models using the square transformed and truncated data are available from Susan C. South.

³ Variance component estimates for parenting are presented at five levels of personality, but they could easily be extended to any personality score found in the population.


Table 3
Estimates of Unstandardized and Standardized Variance Components and Genetic and Environmental Correlations in Parent–Adolescent Relationship Quality and the Personality Moderating Variables

                               Variance components          Proportions of variance   Correlations with moderator
Relationship quality variable  A      C      E      Total   A(%)    C(%)    E(%)      rA      rC      rE

Moderating variable: PEM
Regard
  No-moderation model          0.40   0.32   0.29   1.00    0.40    0.31    0.28      .23     1.00    .18
  PEM level −2                 0.22   0.69   0.43   1.34    0.16    0.52    0.32      .17     1.00    .36
  PEM level −1                 0.31   0.46   0.33   1.11    0.28    0.42    0.30      .23     1.00    .28
  PEM level 0                  0.42   0.28   0.26   0.96    0.44    0.30    0.27      .27     1.00    .18
  PEM level 1                  0.55   0.15   0.20   0.89    0.62    0.16    0.22      .31     1.00    .04
  PEM level 2                  0.69   0.05   0.15   0.90    0.77    0.06    0.17      .33     1.00   −.14
Conflict
  No-moderation model          0.56   0.17   0.26   0.99    0.56    0.18    0.26     −.06     1.00    .15
  PEM level −2                 0.74   0.01   0.28   1.04    0.72    0.01    0.27     −.18     1.00    .39
  PEM level −1                 0.64   0.07   0.26   0.96    0.66    0.07    0.27     −.12     1.00    .28
  PEM level 0                  0.54   0.17   0.24   0.95    0.57    0.18    0.25     −.05     1.00    .15
  PEM level 1                  0.46   0.33   0.23   1.02    0.45    0.32    0.23      .03     1.00    .01
  PEM level 2                  0.39   0.53   0.23   1.16    0.34    0.46    0.20      .13     1.00   −.12

Moderating variable: NEM
Regard
  No-moderation model          0.40   0.31   0.29   1.00    0.40    0.31    0.28      .28     1.00    .18
  NEM level −2                 0.36   0.09   0.20   0.65    0.55    0.13    0.32      .35     1.00    .36
  NEM level −1                 0.36   0.20   0.24   0.79    0.45    0.25    0.30      .35     1.00    .27
  NEM level 0                  0.36   0.36   0.28   0.99    0.36    0.36    0.28      .35     1.00    .18
  NEM level 1                  0.36   0.56   0.32   1.24    0.29    0.45    0.26      .35     1.00    .10
  NEM level 2                  0.36   0.81   0.38   1.54    0.23    0.52    0.25      .35     1.00    .03
Involvement
  No-moderation model          0.44   0.29   0.27   1.00    0.45    0.29    0.27      .31     1.00    .20
  NEM level −2                 0.24   0.28   0.26   0.78    0.31    0.36    0.33      .63     1.00    .20
  NEM level −1                 0.32   0.28   0.26   0.86    0.38    0.33    0.30      .46     1.00    .20
  NEM level 0                  0.44   0.28   0.26   0.98    0.45    0.29    0.26      .32     1.00    .20
  NEM level 1                  0.60   0.28   0.26   1.13    0.53    0.25    0.23      .22     1.00    .20
  NEM level 2                  0.79   0.28   0.26   1.32    0.59    0.21    0.19      .14     1.00    .20

Moderating variable: CN
Regard
  No-moderation model          0.40   0.32   0.29   1.01    0.40    0.32    0.28      .40     1.00    .06
  CN level −2                  0.44   0.54   0.46   1.44    0.30    0.37    0.32      .32     1.00   −.02
  CN level −1                  0.44   0.40   0.36   1.20    0.37    0.33    0.30      .32     1.00    .02
  CN level 0                   0.44   0.28   0.27   0.99    0.44    0.28    0.27      .32     1.00    .08
  CN level 1                   0.44   0.18   0.20   0.82    0.54    0.22    0.24      .32     1.00    .15
  CN level 2                   0.44   0.10   0.14   0.68    0.65    0.15    0.20      .32     1.00    .26

Note. A = genetic variance; C = shared environmental variance; E = nonshared environmental variance. PEM = Positive Emotionality; NEM = Negative Emotionality; CN = Constraint.
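The relation between the raw and standardized columns of Table 3 is simple arithmetic, sketched below in Python for the Regard estimates at a NEM level of +2. The function name is hypothetical. Because the inputs are the two-decimal values printed in the table, recomputed totals and proportions can differ from the published ones in the last digit.

```python
def standardized_components(a: float, c: float, e: float) -> dict:
    """Express raw A, C, and E variance estimates as proportions of the
    total phenotypic variance: A% = A / (A + C + E), and so on."""
    total = a + c + e
    return {
        "total": total,
        "A%": a / total,  # heritability, h2
        "C%": c / total,  # shared environment, c2
        "E%": e / total,  # nonshared environment, e2
    }


# Regard at NEM = +2 (Table 3): A = 0.36, C = 0.81, E = 0.38.
regard_high_nem = standardized_components(0.36, 0.81, 0.38)
```

This reproduces the pattern described for Regard and NEM: the raw genetic variance is constant at 0.36, but its share of the total falls to about 23% at high NEM because the C and E variances, and therefore the total, have grown.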

to shared (c² = 52%) and nonshared environmental effects (e² = 32%), with a smaller contribution from genetic effects (h² = 16%). At high levels of PEM, a majority of the variance in Regard is due to additive genetic effects (h² = 77%), whereas nonshared environmental effects have decreased (e² = 17%), and there are only very small shared environmental effects (c² = 6%). Examination of the genetic correlations between Regard and PEM reveals a somewhat stronger association between the genetic influences on PEM and the genetic influences on Regard at high levels of PEM (see rA in Table 3). The genetic correlation between PEM and Regard increased from rA = .17 at low levels of PEM (−2) to rA = .33 at high levels of PEM (+2). For adolescents who are higher in PEM, there are more genetic influences common to both this constellation of personality traits and the amount of warmth in their relationship with both parents. Conversely, the overlap between nonshared environmental influences on PEM and Regard decreases as the level of PEM increases (where the proportion of variance attributable to E is lowest). Therefore, the


[Figure 2 plots. Panel A: Variance in Regard at Age 17. Panel B: Variance in Regard as a Function of Positive Emotionality at Age 17 By Source of Variance. In both panels the y-axis is Variance (.0 to 1.2), the x-axis is Positive Emotionality in Standard Deviation Units (−2 to 2), and separate lines are drawn for A, C, and E.]

Figure 2. A: No-moderation model of Regard and Positive Emotionality. B: Moderation model of Regard as a function of Positive Emotionality. A = genetic variance, C = shared environmental variance, E = nonshared environmental variance.

variance in Regard due to nonshared environmental influences is greater at low levels of PEM, and it is here that the nonshared environmental correlation between PEM and Regard is greatest (rE = .36 at PEM of −2).

Moderation of Conflict by PEM. Figure 3 shows the best fitting moderation model of PEM on Conflict. Again, although the full moderation model was not technically the best fitting moderation model according to AIC and p value (see Table 2), it was very close, and the plot of the full moderation model showed an effect of A that justified its designation as the best fitting model. As PEM increased, the genetic variance component of Conflict decreased, the shared environmental variance increased, and the nonshared environmental variance decreased slightly (see Table 3). Thus, at low levels of PEM, the proportion of variance in Conflict was weighted heavily toward genetic (h² = 72%) influences, with the rest mainly attributable to nonshared environmental (e² = 27%) effects. At high levels of PEM, proportion of variance due to genetic effects was much smaller (h² = 34%), whereas shared environmental effects (c² = 46%) increased dramatically and unique environmental (e² = 20%) effects decreased slightly. The nonshared environmental correlation between PEM and Conflict decreased as levels of PEM increased, from rE = .39 at PEM of −2 to rE = −.12 at PEM of +2.

Moderation of Regard by NEM. In the best fitting moderation model of Regard and NEM, there was significant moderation of the C and E variance components (see Figure 4). Contrary to PEM, for which the influence of shared environment on Regard was largely negligible at higher levels of personality, shared environment was greater than genetic and nonshared environmental effects at high levels of NEM. The C and E variance components increased as functions of NEM; as a result, there was greater total variance in Regard at high levels of NEM. Thus, even though the

[Figure 3 plot: Variance in Conflict as a Function of Positive Emotionality at Age 17 By Source of Variance. The y-axis is Variance, the x-axis is Positive Emotionality in Standard Deviation Units (−2 to 2), and separate lines are drawn for A, C, and E.]

Figure 3. Variance in Conflict as a function of Positive Emotionality. A = genetic variance, C = shared environmental variance, E = nonshared environmental variance.

[Figure 4 plot: Variance in Regard as a Function of Negative Emotionality at Age 17 By Source of Variance. The y-axis is Variance, the x-axis is Negative Emotionality in Standard Deviation Units (−2 to 2), and separate lines are drawn for A, C, and E.]

Figure 4. Variance in Regard as a function of Negative Emotionality. A = genetic variance, C = shared environmental variance, E = nonshared environmental variance.


genetic variance remained stable from low to high levels of NEM, the proportion of variance due to genetic effects declined. At low levels of NEM (−2), most of the variance in Regard was split between genetic (h² = 55%) and nonshared environmental effects (e² = 32%). However, at high levels of NEM (+2), the proportion of variance was weighted more heavily toward shared environment (h² = 23%, c² = 52%, e² = 25%). The overlap between unique environmental influences on Regard and NEM was greatest at low levels of NEM (rE = .36 at NEM of −2 SD). It is interesting that, as with the results for PEM and Conflict, when the valence (i.e., positively vs. negatively tinged) of the relationship and the valence of the personality trait are mismatched, the etiology of the relationship variable is more attributable to shared environmental effects at higher levels of the personality trait. That is, C effects increase with increasing levels of personality when either (a) the parenting variable measures a positive aspect of the relationship (Regard) and the personality variable is a relatively more negative trait (NEM) or (b) the parenting variable measures a negative aspect of the relationship (Conflict) and the personality trait is relatively more positive (PEM).

Moderation of Involvement by NEM. In the best fitting moderation model, only the A variance component of Involvement was significantly moderated by NEM (see Figure 5). As the level of NEM increased, the genetic variance increased; because the C and E variance components were stable across all levels of NEM, this resulted in greater phenotypic variance and higher heritability at high levels of NEM. At low levels of NEM, the proportions of variance due to genetic and environmental effects were almost equally divided (h² = 31%, c² = 36%, e² = 33%). The proportion of variance due to genetic effects almost doubled from low to high levels of NEM (h² = 59% at SD = +2). However, the genetic correlation between Involvement and NEM decreased from low to high levels of NEM. Even though genetic effects had a greater influence on Involvement at high levels of NEM, the types of genetic effects operating on Involvement and NEM were different.

[Figure 5 plot: Variance in Involvement as a Function of Negative Emotionality at Age 17 By Source of Variance. The y-axis is Variance, the x-axis is Negative Emotionality in Standard Deviation Units (−2 to 2), and separate lines are drawn for A, C, and E.]

Moderation of Regard by CN. The results of moderation analysis for Regard and CN were similar to the moderation model of Regard and PEM, although in the best fitting model for Regard and CN, moderation was significant for C and E only. The influences on Regard shifted from an almost equal distribution of the proportion of variance among all three components at low levels of CN to largely genetic influences at higher levels of CN (see Figure 6 and the values in Table 3, which show the ACE decomposition of Regard as a function of CN). Estimates derived from the moderation model show a decrease in both sources of environmental variance, from c² = 37%, e² = 32% at low CN to c² = 15%, e² = 20% at high CN. Notably different from the moderation of Regard by PEM, however, was the finding of greater nonshared environmental influences between Regard and CN at higher levels of CN. The nonshared environmental correlation between CN and Regard increased with increasing levels of CN. For adolescents higher in CN, the presence of these nonshared environmental influences common to CN and Regard suggests a within-family selection process linking the two.

Extension of analyses to both sources of report on parent–adolescent relationship. Because we used a composite report of parent–adolescent relationship, we also conducted analyses separately for the mother’s and adolescent’s perspective of the relationship, to determine if the pattern of results differed depending on the reporter. In general, the results for the composite report are closer to the results found using adolescent report rather than mother report.
Using the raw scores for adolescent report, we found that six of the seven moderation models that were significant in the composite report were also significant (PEM and Regard, Conflict; NEM and Regard, Conflict, Involvement; CN and Regard), whereas for mother report of the relationship, five of the same models that were significant for composite report were also an improvement over the no-moderation models (PEM with Conflict; NEM with Regard, Conflict, Involvement; CN with Regard). However, when transformed scores were used, moderation models fit better for four of the same five models using adolescent-only report as when the composite measure was used (PEM with Regard, Conflict; NEM with Regard; CN with Regard), whereas moderation models using mother report were significant for only two of the same models as when the composite measure was used (NEM and Regard, Involvement). Given the moderately strong, but not perfect, nature of the correlation between mother and adolescent report of the relationship, we feel that the composite report most likely is the best way to capture the parent–adolescent relationship. Differences in moderation of the etiology of the parent–adolescent relationship depending on the reporter are difficult to interpret, especially because there is no theory to guide why results would or would not be different. Complete results for moderation analyses of the parent–adolescent relationship (as reported by the mother and the adolescent) can be obtained from Susan C. South.

Figure 5. Variance in Involvement as a function of Negative Emotionality. A = genetic variance, C = shared environmental variance, E = nonshared environmental variance.

Discussion

In the developmental context of the family, it appears that personality and interpersonal relationships are involved in a bidirectional process of reinforcement. The goal of the current study was to examine whether personality traits can moderate genetic


[Figure 6 plot. Panel title: Variance in Regard as a Function of Constraint at Age 17, by Source of Variance. Curves: A, C, E. Vertical axis: Variance. Horizontal axis: Constraint in Standard Deviation Units.]

Figure 6. Variance in Regard as a function of Constraint. A = genetic variance, C = shared environmental variance, E = nonshared environmental variance.

and environmental influences on positive and negative qualities of the parent–adolescent relationship. Specifically, we used a sample of more than 2,400 individual twins at a mean age of 17 years to examine whether adolescent personality traits of NEM, PEM, and CN moderated the genetic and environmental contributions to three aspects of the parent–adolescent relationship—namely, Regard, Conflict, and Involvement. We found (a) significant moderation of all three features of the parent relationship by these personality traits and (b) changes in the genetic and environmental correlations between personality and parenting at different levels of personality. The present study’s large sample size was a notable strength. Of course, it is possible that with an even larger sample size, we may have found significant moderation in the four models in which moderation was not detected (i.e., PEM and Involvement, NEM and Conflict, CN and Conflict, CN and Involvement). The goal of the current study was to extend previous work examining genetic and environmental influences on the relationship between personality and parent relationship quality. Our results support previous research that attributed genetic and environmental influences on perceptions of the family environment to personality traits (Krueger et al., 2003). Further, our findings suggest that whether a genotype will lead to positive or negative aspects of the parent relationship depends on the personality of the adolescent. Beginning with the groundbreaking work of Rowe (1981, 1983), behavior genetic methods have been used to show that putatively “environment” measures have a sizeable genetic component. Prior work from the MTFS that examined the parenting measures used in the current study found small to moderate heritability estimates (Elkins et al., 1997) that tended to increase with age of the adolescent (McGue et al., 2005). 
In accord with these results, we found that aspects of the parent relationship were moderately heritable. In the no-moderation models, which again are comparable to standard bivariate decomposition models, 40% of the variance in Regard, 56% of the variance in Conflict, and 45% of the variance in Involvement was due to genetic effects, with the rest of the variance attributed generally equally to shared and nonshared environment. We also found moderate genetic

correlations between aspects of the parent–adolescent relationship and adolescent personality traits, which is consistent with prior research findings that the heritability of family and parenting measures is due, at least in part, to the characteristics (i.e., personality traits) of the family members (Chipuer et al., 1993; Krueger et al., 2003). The limitation in these prior studies (and in the results of our own unmoderated models) is that estimates of genetic and environmental influences from univariate models and estimates of genetic correlations from bivariate models of personality and family environment are static and fixed across the entire population. Newer models that test for biometrical moderation, or gene–environment interaction, examine whether genetic and environmental influences may differ in different environments. Conceptually, the idea is similar to examining differences in the main effect of a variable across different groups. For example, does the effect of a reading intervention (main effect) differ between men and women (groups)? Now, extending this idea to the realm of biometrical modeling, we replace the main effect with genetic and environmental influences (here, on parenting behaviors) and different groups with different levels of a moderator variable (here, personality traits). Thus, we have the ability to calculate a specific heritability estimate for an individual dependent on where that person falls along the dimension of a second, moderating variable (Purcell, 2002). A major contribution of the current study was the finding that, for aspects of warmth, conflict, and involvement in the parent–adolescent relationship, the relative contribution of genetic and environmental influences varied as a function of the adolescent’s amount of negative emotionality, positive emotionality, or constraint.
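The logic of biometrical moderation can be made concrete with a minimal Python sketch. It assumes a simplified univariate form of the moderation model described by Purcell (2002), in which each path coefficient is a linear function of the moderator M, so that the genetic variance at a given level of M is (a + β_a·M)², and likewise for C and E. The class name, function names, and all parameter values below are hypothetical and chosen purely for illustration; they are not estimates from the present study, and the bivariate models fit here are more elaborate than this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModeratedPaths:
    """Hypothetical path coefficients: a baseline value plus a linear
    slope on the moderator for each of the A, C, and E paths."""
    a: float
    beta_a: float
    c: float
    beta_c: float
    e: float
    beta_e: float


def variance_components(p: ModeratedPaths, m: float) -> tuple[float, float, float]:
    """Return (A, C, E) variances at moderator level m (in SD units).

    Each path is moderated linearly and then squared, as in a
    simplified univariate moderation model.
    """
    var_a = (p.a + p.beta_a * m) ** 2
    var_c = (p.c + p.beta_c * m) ** 2
    var_e = (p.e + p.beta_e * m) ** 2
    return var_a, var_c, var_e


def proportions(p: ModeratedPaths, m: float) -> tuple[float, float, float]:
    """Return (a2, c2, e2): each component as a proportion of the total
    phenotypic variance at moderator level m."""
    var_a, var_c, var_e = variance_components(p, m)
    total = var_a + var_c + var_e
    return var_a / total, var_c / total, var_e / total


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the article).
    paths = ModeratedPaths(a=0.6, beta_a=0.05, c=0.5, beta_c=-0.1, e=0.5, beta_e=-0.05)
    for m in (-2, -1, 0, 1, 2):
        a2, c2, e2 = proportions(paths, m)
        total = sum(variance_components(paths, m))
        print(f"M = {m:+d}: total variance = {total:.2f}, "
              f"a2 = {a2:.2f}, c2 = {c2:.2f}, e2 = {e2:.2f}")
```

With these assumed values, the total variance shrinks and the heritable proportion grows as the moderator increases, which is the kind of pattern a moderation model can detect and a static decomposition cannot.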
Specifically, we found that PEM, NEM, and CN significantly moderated the genetic and environmental influences on Parental Regard, PEM significantly moderated the etiology of Conflict, and NEM moderated the etiology of Involvement. Along with variations in the proportion of variance due to genes and environmental influences, we found different levels of genetic and environmental correlations between personality and parenting at varying levels of personality. These moderation models present a fuller picture of the relative influence of genes and environment on the etiology of the parent–adolescent relationship. When we considered only the static parameter estimates from a no-moderation model, the etiology of Regard was weighted toward genetic influences but with sizable shared and nonshared environmental influences. Results from moderation models tell a more nuanced story. The total phenotypic variance in Regard was generally lowest, and the heritability highest, at the most positive level of the personality trait; that is, at high levels of positive emotionality and constraint and low levels of negative emotionality. Adolescents with the greatest levels of positively valenced personality traits had a level of Regard in their relationship with their parents that better reflected their genotype. This is an example of genes operating differently in different environments: being high in positive emotionality or constraint (or low in negative emotionality) allows for the expression of a genetic predisposition to positive aspects of the parent–adolescent relationship. Our findings provide further empirical support for the role of adolescent personality in the quality of the parent–adolescent relationship. The etiology of the parent–adolescent relationship appears to differ depending on the adolescent’s level of certain


personality traits. Shiner and Caspi (2003) outlined six possible mechanisms that shape personality development. The unique finding from the current study is that it is not simply one of these processes at work for any given combination of personality trait and aspect of parenting: the type of mechanism depends on the level of the personality trait. For example, the substantial heritability and moderate genetic overlap found between positive emotionality and regard at the extreme high end of personality suggest that at least some of the same genes that influence personality also influence the parent–adolescent relationship. This could simply be an example of a more even-tempered adolescent eliciting more positive parenting behaviors or, at least in part, construing his or her parents’ actions in a more positive light. Alternatively, the low heritability and substantial shared environmental influence on regard found at low levels of PEM may indicate a process whereby an adolescent with low levels of positive emotions will compare himself or herself, possibly unfavorably, with his or her siblings, and thus, his or her relationship with the parents will be based on common factors shared by the parents and all siblings within the household. One of the most consistent findings from the current analysis was the importance of shared environment, particularly at extremely high or low levels of personality. Even in the no-moderation models, shared environmental effects had a sizeable influence on the variance in Regard, Conflict, and Involvement. These effects were enhanced or diminished depending on the level of the personality trait. For instance, at lower levels of positive (PEM, CN) and higher levels of negative (NEM) personality traits, the amount of Regard in the parent–adolescent relationships was more attributable to factors shared within the family than to genetics or unique environment.
When an adolescent has a less even-tempered personality, the quality of the parent–adolescent relationship is due largely to factors that are shared in common with other family members. Thus, a more “difficult” teenager may have a relationship with his or her parent that is based on a standard set of rules and interactions shared by all children in the house. Shared environmental influences have been notoriously difficult to detect across a range of phenotypes (Rowe, 1994), leading some to conclude that family has little influence on personality development (Harris, 1995, 1998). The finding of significant shared environmental effects on individual differences in parenting behaviors strengthens the argument for using these new moderation models (Purcell, 2002): shared environmental effects may be particularly important for certain people within the population but are lost when estimates are averaged across the whole sample. It is well established at this point that most psychological phenomena are heritable (Turkheimer, 2000). The field of behavior genetics must now move beyond static estimates of heritability and genetic and environmental correlations toward elucidation of how genetic influences and environment transact. The current article provides evidence that differences in personality can affect the etiology of parent relationship quality. An obvious extension of this work is to examine the dynamic influences of personality traits on relationship quality over time. The stability of personality generally increases from childhood to adulthood (Roberts & DelVecchio, 2000), but personality is already more stable than relationship quality in adolescence (Branje et al., 2004) and young adulthood (Asendorpf & Wilpers, 1998). Further, both personality (McGue, Bacon, & Lykken, 1993) and aspects of family environment (McGue et al., 2005) become more heritable over time. Longitudinal biometric models could help to determine whether the increasing stability in personality is related to expression of genetic variance and articulate how genetic and environmental transactions unfold over time.

References

Akaike, H. (1987). Factor analysis and AIC. Psychometrika, 52, 317–332.
Asendorpf, J. B., & van Aken, M. A. G. (2003). Personality–relationship transaction in adolescence: Core versus surface personality characteristics. Journal of Personality, 71, 629–666.
Asendorpf, J. B., & Wilpers, S. (1998). Personality effects on social relationships. Journal of Personality and Social Psychology, 74, 1531–1544.
Bates, J. E., Petit, G. S., Dodge, K. A., & Ridge, B. (1998). Interaction of temperamental resistance to control and restrictive parenting in the development of externalizing behavior. Developmental Psychology, 34, 982–995.
Bell, R. Q. (1968). A reinterpretation of the direction of effects in studies of socialization. Psychological Review, 75, 81–85.
Belsky, J., Hsieh, K., & Crnic, K. (1998). Mothering, fathering, and infant negativity as antecedents of boys’ externalizing problems and inhibition at age 3: Differential susceptibility to rearing influence? Development and Psychopathology, 10, 301–319.
Billig, J. P., Hershberger, S. L., Iacono, W. G., & McGue, M. (1996). Life events and personality in late adolescence: Genetic and environmental relations. Behavior Genetics, 26, 543–554.
Boomsma, D., de Geus, E., van Baal, G., & Koopmans, J. (1999). A religious upbringing reduces the influence of genetic factors on disinhibition: Evidence for interaction between genotype and environment on personality. Twin Research, 2, 115–125.
Bouchard, T. J., Jr., & Loehlin, J. C. (2001). Genes, evolution, and personality. Behavior Genetics, 31, 243–273.
Bouchard, T. J., Jr., & McGue, M. (1990). Genetic and rearing environmental influences on adult personality: An analysis of adopted twins reared apart. Journal of Personality, 58, 263–292.
Branje, J. T., van Lieshout, C. F. M., & van Aken, M. A. G. (2004). Relations between Big Five personality characteristics and perceived support in adolescents’ families. Journal of Personality and Social Psychology, 86, 615–628.
Branje, J. T., van Lieshout, C. F. M., & van Aken, M. A. G. (2005). Relations between agreeableness and perceived support in family relationships: Why nice people are not always supportive. International Journal of Behavioral Development, 29, 120–128.
Burt, S. A., Krueger, R., McGue, M., & Iacono, W. (2003). Parent–child conflict and the comorbidity among childhood externalizing disorders. Archives of General Psychiatry, 60, 505–513.
Burt, S. A., McGue, M., Iacono, W. G., & Krueger, R. F. (2006). Differential parent–child relationships and adolescent externalizing symptoms: Cross-lagged analyses within a monozygotic twin differences design. Developmental Psychology, 42, 1289–1298.
Burt, S. A., McGue, M., Krueger, R. F., & Iacono, W. G. (2005). How are parent–child conflict and childhood externalizing symptoms related over time? Results from a genetically-informative cross-lagged study. Development and Psychopathology, 17, 145–165.
Caspi, A., Harrington, H., Milne, B., Amell, J. W., Theodore, R. F., & Moffitt, T. E. (2003). Children’s behavioral styles at age 3 are linked to their adult personality traits at age 26. Journal of Personality, 71, 495–513.
Caspi, A., & Shiner, R. G. (2006). Personality development across the life course. In N. Eisenberg, W. Damon, & R. M. Lerner (Eds.), Handbook of child psychology: Vol. 3. Social, emotional, and personality development (6th ed., pp. 300–365). Hoboken, NJ: Wiley.


Chipuer, H. M., Plomin, R., Pedersen, N. L., McClearn, G. E., & Nesselroade, J. R. (1993). Genetic influence on family environment: The role of personality. Developmental Psychology, 29, 110–118.
Collins, W. A., Maccoby, E. E., Steinberg, L., Hetherington, E. M., & Bornstein, M. H. (2000). Contemporary research on parenting: The case for nature and nurture. American Psychologist, 55, 218–232.
Elkins, I. J., McGue, M., & Iacono, W. G. (1997). Genetic and environmental influences on parent–son relationships: Evidence for increasing genetic influence during adolescence. Developmental Psychology, 33, 351–363.
Harris, J. R. (1995). Where is the child’s environment? A group socialization theory of development. Psychological Review, 102, 458–489.
Harris, J. R. (1998). The nurture assumption: Why children turn out the way they do. New York: Free Press.
Heath, A. C., Eaves, L. J., & Martin, N. G. (1998). Interaction of marital status and genetic risk for symptoms of depression. Twin Research, 1, 119–122.
Hur, Y.-M., & Bouchard, T. J., Jr. (1995). Genetic influences on perceptions of childhood family environment: A reared apart twin study. Child Development, 66, 330–345.
Hur, Y.-M., McGue, M., & Iacono, W. G. (1995). Unequal rate of monozygotic and like-sex dizygotic twin birth: Evidence from the Minnesota Twin Family Study. Behavior Genetics, 25, 337–340.
Iacono, W. G., Carlson, S. R., Taylor, J., Elkins, I. J., & McGue, M. (1999). Behavioral disinhibition and the development of substance use disorders: Findings from the Minnesota Twin Family Study. Development and Psychopathology, 11, 869–900.
Jocklin, V., McGue, M., & Lykken, D. T. (1996). Personality and divorce: A genetic analysis. Journal of Personality and Social Psychology, 71, 288–299.
Krueger, R. F., Markon, K. E., & Bouchard, T. J. (2003). The extended genotype: The heritability of personality accounts for the heritability of recalled family environments in twins reared apart. Journal of Personality, 71, 809–833.
Krueger, R. F., South, S. C., Johnson, W., & Iacono, W. G. (in press). The heritability of personality is not always 50%: Gene–environment interactions and correlations between personality and parenting. Journal of Personality.
Little, R. J., & Rubin, D. B. (2002). Statistical analysis with missing data (2nd ed.). New York: Wiley.
Magnusson, D. (1990). Personality development from an interactional perspective. In L. A. Pervin (Ed.), Handbook of personality: Theory and measurement (pp. 193–222). New York: Guilford Press.
McCrae, R. R., & Costa, P. T. (1999). A five-factor theory of personality. In L. A. Pervin & O. P. John (Eds.), Handbook of personality: Theory and research (pp. 102–138). New York: Guilford Press.
McGue, M., Bacon, S., & Lykken, D. T. (1993). Personality stability and change in early adulthood: A behavioral genetic analysis. Developmental Psychology, 29, 96–109.
McGue, M., & Bouchard, T. J. (1984). Adjustment of twin data for the effects of age and sex. Behavior Genetics, 14, 325–343.
McGue, M., Elkins, I., Walden, B., & Iacono, W. G. (2005). Perceptions of the parent–adolescent relationship: A longitudinal investigation. Developmental Psychology, 41, 971–984.
Muthén, L. K., & Muthén, B. O. (1998–2006). Mplus user’s guide (4th ed.) [Computer software and manual]. Los Angeles: Muthén & Muthén.
Neale, M. C., Boker, S. M., Xie, G., & Maes, H. H. (2003). Mx: Statistical modeling (6th ed.) [Computer software]. Department of Psychiatry, Virginia Commonwealth University, Box 900126, Richmond, VA 23298.
Patterson, G. R. (1982). Coercive family process. Eugene, OR: Castalia.
Paulussen-Hoogeboom, M. C., Stams, G. J. J. M., Hermanns, J. M. A., & Peetsma, T. T. D. (2007). Child negative emotionality and parenting from infancy to preschool: A meta-analytic review. Developmental Psychology, 43, 438–453.
Plomin, R., & Bergeman, C. S. (1991). The nature of nurture: Genetic influence on “environmental” measures. Behavioral and Brain Sciences, 14, 373–427.
Plomin, R., McClearn, G. E., Pedersen, N. L., Nesselroade, J. R., & Bergeman, C. S. (1988). Genetic influence on childhood family environment perceived retrospectively from the last half of the life span. Developmental Psychology, 24, 738–745.
Purcell, S. (2002). Variance components models for gene–environment interaction in twin analysis. Twin Research, 5, 554–571.
Reiss, D., Neiderhiser, J. M., Heatherington, M., & Plomin, R. (2000). The relationship code: Deciphering genetic and social patterns in adolescent development. Cambridge, MA: Harvard University Press.
Roberts, B. W., & DelVecchio, W. F. (2000). The rank-order consistency of personality traits from childhood to old age: A quantitative review of longitudinal studies. Psychological Bulletin, 126, 3–25.
Rothbart, M. K., & Bates, J. E. (2006). Temperament. In N. Eisenberg, W. Damon, & R. M. Lerner (Eds.), Handbook of child psychology: Vol. 3. Social, emotional, and personality development (6th ed., pp. 99–166). Hoboken, NJ: Wiley.
Rowe, D. C. (1981). Environmental and genetic influences on dimensions of perceived parenting: A twin study. Developmental Psychology, 17, 203–208.
Rowe, D. C. (1983). A biometrical analysis of perceptions of family environment: A study of twin and singleton sibling kinships. Child Development, 54, 416–423.
Rowe, D. C. (1994). The limits of family influence: Genes, experience, and behavior. New York: Guilford Press.
Rubin, K. H., Burgess, K. B., Dwyer, K. M., & Hastings, P. D. (2003). Predicting preschoolers’ externalizing behaviors from toddler temperament, conflict, and maternal negativity. Developmental Psychology, 39, 164–176.
Sameroff, A. J. (1983). Developmental systems: Contexts and evolution. In W. Kessen (Ed.), Handbook of child psychology: Vol. 1. History, theory, and methods (4th ed., pp. 237–294). New York: Wiley.
Saudino, K. J., Pedersen, N. L., Lichtenstein, P., McClearn, G. E., & Plomin, R. (1997). Can personality explain genetic influences on life events? Journal of Personality and Social Psychology, 72, 196–206.
Scarr, S., & McCartney, K. (1983). How people make their own environments: A theory of genotype–environment effects. Child Development, 54, 424–435.
Shiner, R., & Caspi, A. (2003). Personality differences in childhood and adolescence: Measurement, development, and consequences. Journal of Child Psychology and Psychiatry, 44, 2–32.
Stoolmiller, M. (2001). Synergistic interaction of child manageability problems and parent-discipline tactics in predicting future growth in externalizing behavior for boys. Developmental Psychology, 37, 814–825.
Tellegen, A., & Waller, N. G. (in press). Exploring personality through test construction: Development of the Multidimensional Personality Questionnaire (MPQ). Minneapolis: University of Minnesota Press.
Turkheimer, E. (2000). Three laws of behavior genetics and what they mean. Current Directions in Psychological Science, 9, 160–164.
Turkheimer, E., Haley, A., Waldron, M., D’Onofrio, B., & Gottesman, I. I. (2003). Socioeconomic status modifies heritability of IQ in young children. Psychological Science, 14, 623–628.

Received March 19, 2007
Revision received December 13, 2007
Accepted January 3, 2008

Journal of Personality and Social Psychology 2008, Vol. 94, No. 5, 913–923

Copyright 2008 by the American Psychological Association 0022-3514/08/$12.00 DOI: 10.1037/0022-3514.94.5.913

Societal Threat, Authoritarianism, Conservatism, and U.S. State Death Penalty Sentencing (1977–2004)

Stewart J. H. McCann
Cape Breton University

On the basis of K. Stenner’s (2005) authoritarian dynamic theory, it was hypothesized that the number of death sentences and executions would be higher in more threatened conservative states than in less threatened conservative states, and would be lower in more threatened liberal states than in less threatened liberal states. Threat was based on state homicide rate, violent crime rate, and non-White percentage of population. Conservatism was based on state voter ideological identification, Democratic and Republican Party elite liberalism-conservatism, policy liberalism-conservatism, religious fundamentalism, degree of economic freedom, and 2004 presidential election results. For 1977–2004, with controls for state population and years with a death penalty provision, the interactive hypothesis received consistent support using the state conservatism composite and voter ideological identification alone. As well, state conservatism was related to death penalties and executions, but state threat was not. The temporal stability of the findings was demonstrated with a split-half internal replication using the periods 1977–1990 and 1991–2004. The interactive hypothesis and the results also are discussed in the context of other threat-authoritarianism theories and terror management theory.

Keywords: authoritarian, conservatism, death penalty, intolerance, terror management theory

Some have argued that the United States stands out as one of the most punitive nations in the world and that this proclivity is manifested in public, legislative, and judicial endorsement of the death penalty (e.g., Baumer, Messner, & Rosenfeld, 2003; Soss, Langbein, & Metelko, 2003; Stenner, 2005, p. 191). But hidden within the aggregate national statistics are wide variations in individual and state or regional support for capital punishment (e.g., Barkan & Cohn, 1994; Baumer et al., 2003; Jacobs, Carmichael, & Kent, 2005). Notably, research has shown that authoritarian and conservative persons generally display attitudes more supportive of government-sanctioned executions (e.g., Feldman & Stenner, 1997; Moran & Comfort, 1986; Soss et al., 2003), that conservative jurisdictions are more likely to endorse and exercise the death penalty (e.g., Baumer et al., 2003; Jacobs & Carmichael, 2004; Jacobs et al., 2005), and that societal threat is associated with heightened death penalty support (e.g., Baumer et al., 2003; Jacobs & Carmichael, 2004; Soss et al., 2003). The present research was initiated to test the potential of Stenner’s (2005) authoritarian dynamic theory (ADT) to shed further light on the integration and interaction of these key factors that might cause differences in the level of individual and state support for the death penalty.

ADT

Stenner’s (2005) ADT uses an authoritarian personality dimension characterized at the authoritarian pole by a “preference for uniformity and insistence upon group authority” and at the libertarian pole by a “preference for difference and insistence upon individual autonomy” (p. 15). Authoritarians have a worldview that stresses conformity and obedience, constrains freedom and autonomy, and fosters political pressure for the regulation of moral deviants through policies of official punitive enforcement (p. 17). They are more likely than libertarians to see capital punishment as fair and just (p. 308). Central to ADT (Stenner, 2005; see also Feldman, 2003; Feldman & Stenner, 1997) is the claim that authoritarians respond to societal threat by showing an increase in authoritarian attitudes and behavior, whereas libertarians do not. Threat leads to authoritarian activation only in authoritarian persons when the manifestations are “needed” by the individual. ADT assumptions are consistent with the functional approach to the study of attitudes espoused by Katz (1960):

The “ego-defensive” attitudes of authoritarianism have as their “motivational basis” the maintenance of some collective oneness and sameness that serves the “psychological function” of providing the individual with identity, security, meaning, and/or comfort. Accordingly, those “defensive” stances (racial, political, and moral intolerance) are “aroused” by “emotionally laden suggestions” and “threats” to that oneness and sameness, and “modified” by some “catharsis” or “removal of threat” that relieves the emotional tension and purges those fears. (Stenner, 2005, p. 59)

ADT also predicts a polarization effect with libertarians showing less authoritarian behavior than usual in the face of threatening circumstances. Threats challenge both authoritarians and libertarians (Stenner, 2005, p. 63), and although they may not seem so different in their tolerance-related positions in a low-threat climate, they “will suddenly sharply diverge in the stances they adopt toward any issue touching upon diversity, dissent, and deviance” (Stenner, 2005, p. 323) when threat is elevated.

Correspondence concerning this article should be addressed to Stewart J. H. McCann, Department of Psychology, Cape Breton University, P. O. Box 5300, Sydney, Nova Scotia, Canada B1P 6L2. E-mail: [email protected]

MCCANN


Normative threats are most important in ADT—threats to the “system of oneness and sameness that makes ‘us’ an ‘us’” (Stenner, 2005, p. 17) such as “lack of conformity to or consensus in group values, norms, and beliefs” (p. 18). Although Stenner emphasized normative threats, she also acknowledged that any threat to the collective can lead to somewhat heightened intolerant behavior in authoritarians (p. 29) and reduced intolerance in libertarians (p. 311).

The Threat–Authoritarianism Link in Other Authoritarian Personality Theories

Evidence for the idea that contemporaneous situational threat heightens authoritarianism (e.g., Fromm, 1941; Rokeach, 1960; Sanford, 1966) has accumulated in recent years (e.g., McCann, 1997; Peterson & Gerstein, 2005; Rickert, 1998), and although the link is fairly broadly accepted (e.g., Altemeyer, 1988; Esses, Dovidio, & Hodson, 2002; Feldman, 2003), the underlying dynamics have had various interpretations. Most theorists assume that threat tends to increase everyone’s authoritarianism. For instance, Wilkinson (1972, pp. 103–111) articulated a psychodynamic theory of authoritarian cause in which he makes it quite clear (pp. 18, 105–106) that both authoritarians and nonauthoritarians generally tend to show more authoritarianism when threatened. To Altemeyer (1988), authoritarianism is developed and maintained through principles of social learning, and both authoritarians and nonauthoritarians have the potential and the tendency to show elevated levels of authoritarian behavior and attitudes when there are threats to societal integrity (p. 60). Duckitt (2001) also theorized that both authoritarians and nonauthoritarians show rises in authoritarianism under threat but has acknowledged that perhaps only authoritarians respond to threat in this manner (Duckitt & Fisher, 2003, p. 214). Previous threat–authoritarianism studies at the societal level have involved aggregate authoritarian attitudes and behavior. They have not allowed for the potential interaction of threat and authoritarian disposition in relation to authoritarian displays (e.g., Doty, Peterson, & Winter, 1991; McCann, 1999; Sales, 1973). As Stenner (2005) has noted, such evidence for a threat–authoritarianism link “observed in the aggregate can be equally compatible with a process that depends upon variation in individual predispositions and one that does not” (p. 30).

Terror Management Theory (TMT)

Although ADT prompted the present study, aspects of the theory and the research scenario bear likeness to those in TMT (e.g., Greenberg, Pyszczynski, & Solomon, 1986; Pyszczynski, Solomon, & Greenberg, 2003; Solomon, Greenberg, & Pyszczynski, 1991). Inspired by Becker (1973), TMT essentially says that our internalized worldview, which is an individualized but largely shared symbolic conception of reality that involves all of our values, norms, customs, commitments, identities, institutions, authorities, and so on, acts as a buffer against our biding existential anxiety stemming from our awareness of our own mortality. Our worldview allays existential fear by giving meaning to our lives, by providing self-esteem if we believe that we behave in accord with its inherent value standards, and by promising “literal” or symbolic immortality if we live by its prescriptions. Consequently,

we strenuously defend our worldview whenever we perceive that it is threatened. The most widely researched TMT tenet, with now over 300 demonstrations (Strachan et al., 2007), is that we defend our worldview most vigorously when our own mortality is made salient (e.g., Rosenblatt, Greenberg, Solomon, Pyszczynski, & Lyon, 1989). Another somewhat less researched tenet is that the presence of others with different worldviews is a threat that prompts defense of our worldview (e.g., Greenberg et al., 1990). According to TMT, striving for the maintenance of faith in our cultural worldview to ward off mortality anxiety is an unconscious process that functions in addition to more immediate defensive actions such as denial and rationalization, which we embrace when our own mortality is brought to conscious awareness. The proximal defense temporally gives way to the unconscious distal defense of preserving and bolstering the integrity of our worldview when death-related thought is accessible but not the active object of attention (Greenberg, Arndt, Simon, Pyszczynski, & Solomon, 2000). Although no systematic empirical investigation of the potential of TMT in regard to support for the death penalty and the carrying out of executions has been conducted, Judges (1999) has theoretically linked capital punishment to the dynamics of TMT. He sees the death penalty as mostly a “nonconscious” symbolic defense against mortality terror rather than as a rational process in the service of retribution, deterrence, or incapacitation. To Judges, the conditions are inherent in capital cases for the TMT process to occur: “a reminder of mortality, time to push awareness of it to the fringes of consciousness, and an opportunity to indulge in punitive and often authoritarian aggression against an offending target person” (p. 161).

Relevant Empirical Death Penalty Literature

Supporters of capital punishment are likely to be both more authoritarian and more conservative (e.g., Feldman & Stenner, 1997; Moran & Comfort, 1986; Soss et al., 2003). This is not surprising because strong parallels and intimate connections between the two constructs often have been postulated (e.g., Altemeyer, 1998; Eckhardt, 1991; Wilson, 1973). In addition, the close association between the two is evident in the model of political conservatism as motivated social cognition (Jost, Glaser, Kruglanski, & Sulloway, 2003). As well, although Stenner (2005) views authoritarianism and political conservatism as conceptually distinct constructs that are not necessarily linked across cultures and time (p. 89), she emphasizes the symbiotic nature of their relationship and notes that their interests and concerns often converge, making it difficult to distinguish between the two (p. 175).

Conservative jurisdictions also are more likely to support and exercise the death penalty (Baumer et al., 2003; Jacobs & Carmichael, 2002, 2004; Jacobs et al., 2005). Baumer et al. argued that higher death penalty endorsement in more conservative areas may occur because (a) there simply are more conservative individuals in those areas or (b) because persons in those areas are exposed to the elements of a more conservative political climate. But, of course, the two factors are not entirely independent because a politically conservative climate in a democracy must be put in place and maintained by a predominantly conservative electorate. There also is an association between authoritarianism and preference for conservative, authoritarian, right-wing politicians (e.g., Doty et al., 1991; Kemmelmeier, 2004; Stone & Smith, 1993).

One threat variable that might relate to capital punishment is violent crime. As Baumer et al. (2003, pp. 846–847) have recounted, national homicide rates and death penalty support have fluctuated somewhat similarly since the mid-1960s, leading to speculation that the public view of capital punishment is sensitive to elevations in homicide and violent crime. But Baumer et al. noted a lack of evidence except for a positive correlation between death penalty support and a 3-year lagged violent crime index reported by Rankin (1979). Nevertheless, Baumer et al. did find that persons living in areas with higher violent crime rates were more likely to support capital punishment. In contrast, Jacobs and Carmichael (2002) found no significant link between either homicide or violent crime rates and whether a state had legalized the death penalty, and Jacobs et al. (2005) found no relationship between the number of murders and the number of state death sentences. However, Jacobs and Carmichael (2004) and Jacobs et al. did find that death sentences were more frequent in states with higher violent crime rates even when homicides were held constant. As well, Vidmar (1974) and Thomas and Foster (1975) found evidence that those who felt threatened by criminal activity were more likely to favor capital punishment.

Another variable that has received some support in this context is racial threat. According to Baumer et al. (2003, p. 849), Whites perceive non-Whites as threatening to their rule and influence, and the larger the non-White proportion, the greater the pressure to protect the status quo with enhanced crime control measures. To Jacobs and Carmichael (2004, p.
254), Whites politically counter non-White efforts to gain a fairer share from an inherently unequal social order by calling for and imposing more repressive legal measures designed to have greater impact on non-Whites than on Whites. To Stenner (2005), the presence of a racial minority is a normative threat, at least to authoritarians, simply because it contributes to diversity and undermines their strong desire for uniformity and the authority of the collective. In line with the racial threat hypothesis, Baumer et al. (2003) found that persons residing in areas with higher Black populations showed heightened support for capital punishment, Jacobs and Carmichael (2002) found that the death penalty was more likely to be legal in states with high Black populations, and Soss et al. (2003) found that racial prejudice (often related to authoritarianism) was a relatively strong indicator of White support for capital punishment and that this association was magnified by the threat of a higher proportion of Black residents in a county. As well, Jacobs and Carmichael (2004) showed that racial threat could account for whether a state ever used capital punishment but not variance in the number of death sentences, and Jacobs et al. (2005) found that “states with percentages of blacks above a threshold are more likely to use the death sentence at least once” (p. 672), but the Black percentage was not related to state death sentence frequencies. Three other studies also are noteworthy in the present context: Stenner (1998) found that societal threat was related to surges in decisions for capital punishment in Texas juries in a monthly time series analysis for the period 1960–1982; Feldman and Stenner (1997), using 1992 National Election Study data, found that societal threat interacted with authoritarianism, resulting in greater support for the death penalty among threatened authoritarians; and Stenner (2005), in work with the Cumulative General Social Survey from 1972 to 1994 (Davis & Smith, 1994, as cited in Stenner, 2005), found that authoritarians showed heightened support for capital punishment during threatening periods (p. 31).

The Hypotheses of the Present Study

Although there is some evidence that authoritarianism and conservatism, living in a conservative jurisdiction, and being exposed to threat might have an impact on death penalty support and death sentencing, systematic research has not been carried out that can furnish evidence to support or refute the effects of interactive relationships between authoritarianism and societal threat of the type postulated by ADT at the state level. ADT is centered on the person, but if one assumes that genetic (e.g., McCourt, Bouchard, Lykken, Tellegen, & Keyes, 1999), cultural, and migration factors can lead to unequal development and distribution of authoritarian personalities throughout the nation, that there is a demonstrated tendency for authoritarians to show strong preference for the conservative side of the political spectrum, and that the citizens of more conservatively oriented states tend to be more authoritarian, then it also would seem warranted to extend the fundamental ADT interaction to conservative and liberal states.

ADT leads to the following expectations in regard to the relationships between state conservatism, normative threat, and frequency of death sentences and executions: Because of the strong correspondence between conservatism and authoritarianism in America, one should assume that jurors in murder trials are more likely to be authoritarian in conservative states than in liberal states. Similar differences are likely for other courtroom players and those in the larger community and are apt to be reflected in the policies, precedents, and norms of the state culture. According to ADT, normative threats activate authoritarians to produce more authoritarian behaviors and attitudes and activate libertarians to produce more libertarian behaviors and attitudes.
Consequently, in threatened conservative states, there should be an even stronger tendency for jurors to call for the death penalty than in nonthreatened conservative states, and in threatened liberal states, there should be an even weaker tendency for jurors to call for the death penalty than in nonthreatened liberal states. Similar dynamics also should apply to executions.

TMT can lead to the same interactive hypotheses. To begin with, we should assume that state conservatism or liberalism is represented in the conservatism or liberalism of jurors, other courtroom players, and wider community members and that these differences also are reflected in the state culture. Those with a conservative worldview are especially inclined to impose the death sentence because that is the fitting punishment in that worldview, whereas those with a generally more tolerant liberal worldview are reluctant to impose capital punishment because the values of that worldview are in opposition to the death penalty. The components of the composite threat measure used here (i.e., state homicide rate, violent crime rate, non-White percentage of the population), especially when the threats are relatively high, signify that not all share one’s worldview, and this in itself increases efforts to bolster and defend one’s worldview. Therefore, even before they are selected as murder trial jurors, those in threatened conservative and liberal states may be somewhat more adamant and defensive about their particular worldviews. But murder clearly is a violation of the norms of both conservative and liberal worldviews, and being a juror in a murder trial exposes one to a strong mortality salience (MS) catalyst that eventually leads to the distal defense of strong emphasis on the validity and superiority of one’s worldview. Therefore, the MS of the trial may just further accentuate worldview defenses, resulting in the corresponding differences in the imposition of death sentences and executions. Such bilevel TMT dynamics lead to the same interactive hypotheses as those deduced from ADT.

The present study covered the period 1977–2004. The conservatism of each state was based on a composite of archival measures, and degree of threat was based on a composite derived from estimates of the mean homicide rate, violent crime rate, and non-White percentage of the population in each state over the period in question. The test of the hypotheses involved the estimated level of state conservatism, the estimated level of state threat, the total number of death sentences, and the total number of executions in each state during this period. With statistical control for estimated mean state populations and number of years that a state had a death penalty from 1977 to 2004, the following hypotheses were tested:

Hypothesis 1: State conservatism is positively related to (a) death sentences and (b) executions.

Hypothesis 2: State threat is positively related to (a) death sentences and (b) executions.

Hypothesis 3: There is an interaction between state conservatism and state threat in which the number of (a) death sentences and (b) executions tends to be (i) higher in more threatened conservative states than in less threatened conservative states and (ii) lower in more threatened liberal states than in less threatened liberal states.

Hypothesis 1 is based on existing death penalty research, Hypothesis 2 on the conventional threat–authoritarianism link, and Hypothesis 3 on ADT, but it could have been based on TMT.

Method

Measures

Death sentences from 1977 to 2004. The number of death sentences in each state from 1977 to 1996 was taken from Shepherd (2005), and these data were supplemented by the numbers from 1997 to 2004 as displayed in an annual state table by the Death Penalty Information Center (2006a).

Executions from 1977 to 2004. The number of executions in each state from 1977 to 2004 was calculated from a table presented by the Death Penalty Information Center (2006b).

Population. State populations for the 1977–2004 period were estimated by averaging the state resident populations of 1980, 1990, and 2000, provided by the U.S. Census Bureau (2001).

Death penalty years from 1977 to 2004. The number of years from 1977 to 2004 that each state had a death penalty sentence was taken from the state information of the Death Penalty Information Center (2006c). Values ranged from 0 to 28.

Conservatism. To distinguish between more conservative and more liberal states, a composite based on seven dimensions was formed from five variables (i.e., voter ideological identification, Democratic Party elite liberalism–conservatism, Republican Party elite liberalism–conservatism, composite policy liberalism, and religious fundamentalism) developed by Erikson, Wright, and McIver (1993), combined with the U.S. Economic Freedom Index (Pacific Research Institute, 2005a) and a variable based on the 2004 presidential election results. The economic freedom and election variables pertain more directly to the latter part of the 1977–2004 period and provide a temporal counterbalance to the five Erikson et al. variables.

Ideological identification was based on the state aggregates of 122 national telephone polls conducted by the New York Times and CBS News between 1976 and 1988. Usable ideological identifications were obtained for 141,798 individuals. For all states except Alaska and Hawaii, the percentage who considered themselves conservative or liberal is displayed by Erikson et al. (1993, p. 16). Extensive reliability, validity, and temporal stability evidence also is provided (pp. 21–39). For example, the estimated reliability was .92, based on sampling theory, and .93, based on the Spearman–Brown split-half formula. In regard to temporal stability, when the sampling period was divided into 1976–1982 and 1983–1988 and sampling theory and corrected split-half formulas were applied, the mean stability estimate was .99. For the present study, the variable was the conservative percentage minus the liberal percentage in each state.

Erikson et al. (1993) also determined state party elite ideological identification for each party from data on the conservatism and liberalism of state legislators in 1974 (Uslaner & Weber, 1974); congressional candidates in 1974, 1978, and 1982 (Wright, 1986; Wright & Berkman, 1986); national convention delegates in 1980 (Miller & Jennings, 1987); and local party chairpersons in 1979–1980 (Cotter, Gibson, Bibby, & Huckshorn, 1984). (Each preceding reference is as cited in Erikson et al., 1993, pp. 98–99.)
With Alaska, Hawaii, Nevada, and Nebraska excluded, correlations between the four estimates ranged from .56 to .73 for Republicans and from .52 to .69 for Democrats (Erikson et al., 1993, p. 100). Erikson et al. (1993, p. 101) formed a composite measure for each party by summing the standardized scores of the four components, with lower scores indicating higher conservatism. Democratic and Republican dimensions served separately in the present study.

As well, Erikson et al. (1993, p. 77) constructed a “composite policy liberalism” index on the basis of eight policy issues that reflect conservative and liberal differences: commitment to progressive tax policies, approaches to criminal justice, public educational spending per pupil, scope of Medicaid health provisions, range of Aid to Families with Dependent Children health provisions, degree of hesitancy to ratify the Equal Rights Amendment, extent of legalized gambling, and responses to the consumer protection movement. The eight variables are highly correlated, load on a single factor, and the sum of their standardized values results in a composite with a Cronbach’s alpha of .89 (pp. 76–77). The tabled composite z scores (p. 77) were used in the present research.

It is commonly perceived that fundamentalists are relatively conservative (Erikson et al., 1993, p. 63), and there is empirical support for such a relationship (e.g., Coe & Domke, 2006; Miller & Wattenberg, 1984). Consequently, for their analysis of the relationship between state opinions and state policies, Erikson et al. (1993) estimated the percentage of each state’s population (excluding Nevada) that was fundamentalist (p. 67) from state data reported by Johnson, Picard, and Quinn (1974, as cited in Erikson et al., 1993, p. 65). These estimated state percentages were used here.

The Pacific Research Institute is a conservative think tank advocating “freedom, opportunity, and personal responsibility for all individuals by advancing free-market policy solutions” (Pacific Research Institute, 2005b). The 2004 Economic Freedom Index was constructed from information for each state on 143 judicial, regulatory, fiscal, government size, and welfare spending variables such as state spending, environmental regulations, tax rates, occupational licensing, number of state agencies, and income redistribution from 1995 to 2003. Lower scores indicate higher degrees of conservatism.

The events of September 11, 2001, and their aftermath left the whole nation in a state of heightened threat. If authoritarian and conservative tendencies are more likely to be manifested under threatening conditions, then the degree of state support for the Republican presidential candidate in 2004 affords another opportunity to gauge which states have the most conservative and authoritarian voters. Consequently, the variable used here was the percentage of the popular vote for Bush in each state in 2004 taken from the U.S. Census Bureau (2006).

For the present study, a state conservatism composite was constructed in the following way: After reversing the sign for the Democratic Party elite, Republican Party elite, composite policy liberalism, and economic freedom variables, the seven conservatism variables were standardized, averaged, and the resulting values were converted to z scores. A composite score for Nebraska was calculated using the five available variables, but no score was produced for Nevada, which had scores on only three of the seven state conservatism variables. For the composite state conservatism variable for the 46 states with scores based on complete data (i.e., excluding Alaska, Hawaii, Nebraska, and Nevada), Cronbach’s alpha was .93.
The state ideological identification variable based on individual telephone poll results correlated with the composite (r = .92) and with each of the composite’s six other variables, with correlations ranging from .61 to .84, with three .77 or over.

Threat. A composite threat index was formed by combining information on state homicide rates, violent crime rates, and non-White population percentages. For each state, the homicide rate and violent crime rate constituents contained the respective mean rates for the 1975–2004 period, and the non-White percentage of the population constituent contained the mean percentage based on the census data for 1980, 1990, and 2000. State homicide and violent crime rates for each year from 1975 to 2000 were taken from The Disaster Center (2006) Web site, which provides state crime statistics of the United States Uniform Crime Report for the years from 1960 to 2000. These were supplemented with state homicide and violent crime rates for 2001, 2002, 2003, and 2004 from the Federal Bureau of Investigation (2006). The White percentages of the population in each state in 1980, 1990, and 2000 were averaged to form an estimate of the White percentage of the population in each state over the 1977–2004 period, and this value was subtracted from 100 to form the non-White percentage variable for the present study. For 1980, the White percentage of the population in each state was calculated from the White and total population numbers tabled by the U.S. Census Bureau (1984). The White percentages for 1990 and 2000 were taken from the U.S. Census Bureau (1991, 2001).

For the 48 states, excluding Alaska and Hawaii, the three threat variables were standardized, averaged, and the resulting values were converted to z scores. This resulted in a state threat composite with a Cronbach’s alpha of .92. For use in determining temporal stability, two separate composites also were constructed, the first based on the homicide and violent crime rates from 1975 to 1990 and the mean of the 1980 and 1990 non-White population percentages and the second on the homicide and violent crime rates from 1989 to 2004 and the mean of the 1990 and 2000 non-White population percentages. Cronbach’s alpha was .90 for the first composite and .93 for the second composite, and the two composites were very highly correlated, r(46) = .98, p < .001, indicating that the relative positions of the states on the threat composite were quite stable.
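The composite construction described above (standardize each component, average, convert the result to z scores, and report Cronbach's alpha) can be sketched as follows. This is a minimal illustration under stated assumptions, not the author's procedure: the function names are hypothetical, the sample standard deviation (ddof=1) is assumed because the article does not say which was used, and the five rows of numbers are made up rather than taken from the study.

```python
import numpy as np

def zscore(x):
    """Standardize a 1-D array to mean 0 and SD 1 (sample SD is an assumption)."""
    x = np.asarray(x, dtype=float)
    return (x - x.mean()) / x.std(ddof=1)

def composite(columns):
    """Standardize each component, average across components, re-standardize."""
    z = np.column_stack([zscore(c) for c in columns])
    return zscore(z.mean(axis=1))

def cronbach_alpha(columns):
    """Cronbach's alpha computed on the standardized components (rows = states)."""
    z = np.column_stack([zscore(c) for c in columns])
    k = z.shape[1]
    item_variance_sum = z.var(axis=0, ddof=1).sum()
    total_variance = z.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1.0 - item_variance_sum / total_variance)

# Made-up values for five hypothetical states (NOT the study's data).
homicide = [4.1, 9.8, 2.3, 7.5, 5.0]            # homicides per 100,000
violent = [350.0, 720.0, 210.0, 640.0, 480.0]   # violent crimes per 100,000
non_white = [12.0, 31.0, 6.0, 27.0, 18.0]       # percent of population

threat = composite([homicide, violent, non_white])
alpha = cronbach_alpha([homicide, violent, non_white])
```

The same two functions would apply to the seven conservatism variables once the signs of the four liberalism-keyed variables have been reversed.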

Procedure

Hierarchical multiple regression was used to test the predictions that state conservatism would be associated with higher frequencies of death penalty sentences and executions (Hypothesis 1), that state threat would be associated with more death penalty sentences and executions (Hypothesis 2), and that state conservatism and state threat would be interactively related to the frequency of death penalty sentences and executions (Hypothesis 3).
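A hierarchical regression of this kind enters predictors one step at a time and records the increment in explained variance at each step. The sketch below shows one common way to compute the increment in R² and its F test with ordinary least squares. It is an assumption-laden illustration: the helper names are hypothetical, the data are simulated with a built-in interaction, and nothing here reproduces the study's data or software.

```python
import numpy as np

def r_squared(X, y):
    """R^2 from an ordinary least squares fit of y on X plus an intercept."""
    A = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    centered = y - y.mean()
    return 1.0 - (resid @ resid) / (centered @ centered)

def hierarchical_steps(predictors, y):
    """Enter predictors one at a time; return (delta_R2, delta_F, df2) per step."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    results, r2_prev = [], 0.0
    for k in range(1, len(predictors) + 1):
        X = np.column_stack(predictors[:k])
        r2 = r_squared(X, y)
        df2 = n - k - 1                      # residual df after k predictors
        delta_r2 = r2 - r2_prev
        delta_f = delta_r2 / ((1.0 - r2) / df2)
        results.append((delta_r2, delta_f, df2))
        r2_prev = r2
    return results

# Simulated data for 47 hypothetical states (NOT the study's data).
rng = np.random.default_rng(0)
n = 47
population, years, conservatism, threat = rng.standard_normal((4, n))
interaction = conservatism * threat          # product term, entered last
y = (2.0 * population + 1.0 * conservatism
     + 1.5 * interaction + 0.5 * rng.standard_normal(n))

steps = hierarchical_steps(
    [population, years, conservatism, threat, interaction], y)
```

With 47 cases and one predictor added per step, the denominator degrees of freedom run from 45 down to 41.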

Results

Correlations between the main variables are presented in Table 1. Sentences and executions are highly correlated, as are the two measures of state conservatism. In contrast, threat is not correlated with either conservatism index. Of course, correlations between the other variables and either sentences or executions are meaningless in regard to the hypotheses because they are dependent on state population and years with a death penalty.

Table 1
Correlations Between the Main State Variables in the Study

Variable                              1         2         3         4         5         6         7
1. Sentences                          —
2. Executions                        .66***     —
3. Conservatism composite            .20       .33*       —
4. Conservatism (single measure)     .21       .31*      .92***     —
5. Threat composite                  .56***    .31*      .12       .03        —
6. Population                        .71***    .35*     −.32*     −.28       .55***     —
7. Years with a death penalty        .41**     .24       .47***    .32*      .48***    .15        —

* p < .05. ** p < .01. *** p < .001.

To test the three hypotheses for each criterion, a hierarchical multiple regression equation was computed on the basis of a logical order of entry. The state population control variable was entered first, the state death penalty years control variable was entered second, the state conservatism variable was entered third, the state threat variable was entered fourth, and a variable containing the product of the state conservatism variable and the state threat variable was entered last. The decision to enter conservatism before threat was fairly arbitrary but to some extent based on the assumption that state conservatism is relatively more enduring and state threat potentially more transient. In any case, as noted later, regressions also were computed with the order of the two variables reversed.

Table 2
Variance in State Death Penalty Sentences and Executions Accounted for by Hierarchical Regression Predictors Using Composite Conservatism and State Ideology

Criterion          Predictor                              df       ΔR²     ΔF

Conservatism composite
Death sentences    Population                             1, 45    .502    45.40***
                   Years with the death penalty           1, 44    .097    10.66**
                   Conservatism                           1, 43    .117    17.72***
                   Threat                                 1, 42    .000     0.02
                   Conservatism × Threat interaction      1, 41    .115    28.14***
Executions         Population                             1, 45    .123     6.29*
                   Years with the death penalty           1, 44    .037     1.92
                   Conservatism                           1, 43    .189    12.43***
                   Threat                                 1, 42    .000     0.00
                   Conservatism × Threat interaction      1, 41    .127     9.95**

Ideology
Death sentences    Population                             1, 45    .502    45.40***
                   Years with the death penalty           1, 44    .097    10.66**
                   Ideology                               1, 43    .109    16.03***
                   Threat                                 1, 42    .002     0.29
                   Ideology × Threat interaction          1, 41    .102    22.18***
Executions         Population                             1, 45    .123     6.29*
                   Years with the death penalty           1, 44    .037     1.92
                   Ideology                               1, 43    .142     8.76**
                   Threat                                 1, 42    .001     0.09
                   Ideology × Threat interaction          1, 41    .135     9.86**

* p < .05. ** p < .01. *** p < .001.

With death sentences as the criterion (see Table 2), state population accounted for 50.2% of the variance in death sentences, and death penalty years accounted for a further increment of 9.7%. State conservatism accounted for an additional 11.7% of the variance in death sentences, and the sign of the regression coefficient was positive, indicating support for Hypothesis 1. In contrast, the state threat variable produced a nonsignificant increment of .0%, showing no support for Hypothesis 2.¹ With the state conservatism–threat product term appended, the interaction of conservatism and threat accounted for another 11.5% of the criterion variance, and subsequent standardized equation computations with +1 and −1 standard deviation values for the interacting predictors and the mean of 0 for the others showed the expected ordinal interactive pattern (see Figure 1), supporting Hypothesis 3.
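As a rough consistency check, the ΔF values reported in Table 2 for death sentences (conservatism composite) can be approximated from the reported ΔR² increments alone, using the standard incremental F formula, ΔF = ΔR² / ((1 − R²) / df2), where R² is the cumulative value after the step. The sketch assumes N = 47 and one predictor per step, which matches the tabled degrees of freedom. Small discrepancies are expected because the increments are rounded to three decimals.

```python
def delta_f(delta_r2, r2_cumulative, df2):
    """Incremental F for one added predictor (numerator df = 1)."""
    return delta_r2 / ((1.0 - r2_cumulative) / df2)

# Values transcribed from Table 2, death sentences, conservatism composite.
increments = [0.502, 0.097, 0.117, 0.000, 0.115]
reported_f = [45.40, 10.66, 17.72, 0.02, 28.14]
n = 47  # assumed number of states in the analysis

checks = []
r2 = 0.0
for step, (d, f_rep) in enumerate(zip(increments, reported_f), start=1):
    r2 += d                      # cumulative R^2 after this step
    df2 = n - step - 1           # 45, 44, 43, 42, 41
    checks.append((step, df2, round(delta_f(d, r2, df2), 2), f_rep))

for step, df2, f_calc, f_rep in checks:
    print(f"step {step}: df = 1, {df2}  recomputed F = {f_calc}  reported F = {f_rep}")
```

The recomputed values track the reported ones to within a few tenths, which is about what rounding of the increments would produce.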
With executions as the criterion (see Table 2), state population accounted for 12.3% of the variance in executions, but death penalty years only accounted for a nonsignificant increment of 3.7%. State conservatism accounted for an additional 18.9% of the variance in executions, and the sign of the regression coefficient was positive, indicating support for Hypothesis 1. However, state threat again produced only a nonsignificant increment of .0%, showing no support for Hypothesis 2 (see Footnote 1). With the state conservatism–threat product term appended, the interaction of conservatism and threat accounted for an additional 12.7% of the criterion variance, and subsequent standardized equation computations with +1 and −1 standard deviation values for the interacting predictors and the mean of 0 for the others showed the anticipated ordinal interactive pattern, similar to the one depicted in Figure 1, supporting Hypothesis 3.²

Figure 1. The interaction of state conservatism and threat, with death sentences as the criterion.

¹ To ascertain whether a different entry order would produce significant relationships between threat and the criterion, regressions were recalculated with threat entered before the measure of conservatism. Conservatism again was a significant predictor, but threat accounted for .0% of the variance in the criterion.

² Similar results were found when each of the three threat variables alone was substituted for the composite threat variable in all of the preceding analyses, with either sentences or executions as the dependent variable. As well, to determine whether the preceding hypothesis tests for either sentences or executions were biased by the inclusion of states without a death penalty at any time from 1977 to 2004, the tests were repeated with such states excluded, and essentially the same results again emerged.

Some readers might be concerned with the lack of statistical controls for state demographics in the preceding analyses. In addition to the fact that control variables reduce degrees of freedom substantially with an N of only 47, Erikson et al. (1993) found that only a small amount of the variance in state ideology is related to demographic variables such as education, income, race, religion, gender, and age (pp. 61–62). The only two state-level demographic variables found to be particularly related to ideology were religious fundamentalism and urbanism, respective markers for conservatism and liberalism (p. 229). Of course, fundamentalism was incorporated into the conservatism composite here as a conservative element rather than as something to be controlled. However, the key tests of the hypotheses involving the full sample were repeated with state-level controls for the 1990 percentage of persons 25 and over with at least a high school diploma, disposable personal income per capita as a percentage of the U.S. average, and urban percentage of the population (U.S. Census Bureau, 2007), all entered as a block on the first step. The pattern and magnitude of the relationships remained much the same. The only notable difference was a reduction by about half in the death sentence variance accounted for by the conservatism composite. Of crucial importance to the tested interactive hypothesis, additional regression analyses showed no interactions between threat and education, income, or urbanism in relation to either criterion.

Because ideological identification was based on the polling of almost 142,000 persons (see Erikson et al., 1993, pp. 12–46), it is the most direct assessment of individual levels of conservatism in each of the states among the seven variables in the composite state conservatism measure. Therefore, it seemed prudent and advantageous to test the hypotheses with this single index of conservatism as well. Hypothesis test results using this sole measure were parallel to those obtained with the full state conservatism composite (see Footnote 1 and Table 2), and the anticipated ordinal interactive pattern, similar to the one shown in Figure 1, also was evident for sentences and executions. When the analysis was repeated with appropriate controls for education, income, and urbanism, the results again were analogous to those found using the conservatism composite.
To determine the temporal stability of the results using the composite conservatism variable, separate analyses were conducted on the corresponding data for 1977–1990 and for 1991–2004. Outcomes in each data set essentially were parallel to each other and to those of the earlier analyses (see Table 3). Standardized equation computations with +1 and −1 standard deviation values for the interacting predictors and the mean of 0 for the others also revealed the expected ordinal interactive pattern with either death sentences or executions as the criterion. For example, Figure 2 shows the interactive pattern for 1977–1990 with sentences as the dependent variable.
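The "standardized equation computations" mentioned in the Results amount to plugging +1 and −1 standard deviation values for conservatism and threat into the fitted equation while holding the control variables at their mean of 0. A minimal sketch follows. The coefficients are hypothetical, chosen only to mimic the direction of the pattern described in the text; the article does not report the fitted coefficients, so none of these numbers should be read as results.

```python
def predicted(b0, b_cons, b_threat, b_inter, cons_z, threat_z):
    """Predicted standardized criterion with control variables held at 0."""
    return b0 + b_cons * cons_z + b_threat * threat_z + b_inter * cons_z * threat_z

# Hypothetical standardized coefficients (NOT reported in the article).
coef = dict(b0=0.0, b_cons=0.45, b_threat=0.0, b_inter=0.40)

# Evaluate the four corners: conservative/liberal crossed with high/low threat.
grid = {
    (cons, threat): predicted(cons_z=cons, threat_z=threat, **coef)
    for cons in (-1, 1)
    for threat in (-1, 1)
}

for (cons, threat), value in sorted(grid.items()):
    ideology = "conservative" if cons > 0 else "liberal"
    level = "high" if threat > 0 else "low"
    print(f"{ideology:12s} states, {level:4s} threat: {value:+.2f}")
```

With a positive product-term coefficient, the predicted value rises with threat for conservative states and falls with threat for liberal states, which is the direction stated in Hypothesis 3.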

Table 3
Variance in State Death Penalty Sentences and Executions Accounted for by Hierarchical Regression Predictors for 1977–1990 and for 1991–2004

Years      Criterion        Predictor                          df     ΔR²   ΔF
1977–1990  Death sentences  Population                         1, 45  .442  35.58***
1977–1990  Death sentences  Years with the death penalty       1, 44  .129  13.22**
1977–1990  Death sentences  Conservatism                       1, 43  .093  11.96***
1977–1990  Death sentences  Threat                             1, 42  .000  0.01
1977–1990  Death sentences  Conservatism × Threat interaction  1, 41  .101  17.58***
1991–2004  Death sentences  Population                         1, 45  .543  53.45***
1991–2004  Death sentences  Years with the death penalty       1, 44  .055  6.03*
1991–2004  Death sentences  Conservatism                       1, 43  .133  21.32***
1991–2004  Death sentences  Threat                             1, 42  .000  0.03
1991–2004  Death sentences  Conservatism × Threat interaction  1, 41  .101  24.84***
1977–1990  Executions       Population                         1, 45  .098  4.91*
1977–1990  Executions       Years with the death penalty       1, 44  .062  3.22
1977–1990  Executions       Conservatism                       1, 43  .189  12.46***
1977–1990  Executions       Threat                             1, 42  .033  2.26
1977–1990  Executions       Conservatism × Threat interaction  1, 41  .182  17.09***
1991–2004  Executions       Population                         1, 45  .125  6.43*
1991–2004  Executions       Years with the death penalty       1, 44  .021  1.08
1991–2004  Executions       Conservatism                       1, 43  .188  12.10***
1991–2004  Executions       Threat                             1, 42  .001  0.09
1991–2004  Executions       Conservatism × Threat interaction  1, 41  .101  7.35**

* p < .05. ** p < .01. *** p < .001.

Figure 2. The interaction of state conservatism and threat, with death sentences as the criterion for 1977–1990.

Discussion

The pattern of support for the three hypotheses was clear and consistent whether the dependent variable was the number of death sentences or executions. For each criterion, Hypotheses 1 and 3 were supported, whereas Hypothesis 2 was not. That is, state conservatism was positively related to both criteria, but state threat did not show any direct connection to either. State threat, however, did interact with state conservatism in the ADT- and TMT-predicted manner in regard to both: Faced with higher threat, more conservative states engaged in higher levels of death sentencing and executions, whereas more liberal states engaged in lower levels of death penalty sentencing and executions. Furthermore, the support pattern remained the same whether analyses were based on all available states or just those with a legal death penalty for at least one of the years from 1977 to 2004, whether voter ideology alone was substituted for the conservatism composite, and whether analyses were restricted to 1977–1990 or to 1991–2004.

The association between state conservatism and capital punishment found here (Hypothesis 1) is in line with previous research results, but this is the first study to show a link between state conservatism and the number of both death sentences and executions. The complete lack of support for a direct link between higher state threat and the number of state death sentences or executions (Hypothesis 2), with and without statistical control for state conservatism, perhaps is not so surprising given that in ADT, threat is expected to have its impact only in interactive conjunction with levels of dispositional authoritarianism, and in TMT, MS and threat to worldview also are expected to similarly strengthen adherence to both conservative and liberal worldviews. However, the strong and consistent evidence that state threat interacted with state conservatism (essentially as a proxy for state authoritarianism) in relation to state death sentences and executions in the predicted fashion (Hypothesis 3) offers indirect substantiation for a pivotal aspect of ADT and expands its theoretical sweep to demonstrably account for relevant interactive phenomena in the macro sphere. But the interactive results also could have been predicted by TMT, so they also give additional confirmation to that theory and bring the empirical realm of capital punishment into its extensive and multifarious body of support (see, e.g., Pyszczynski et al., 2003).

The results in regard to Hypotheses 2 and 3, however, are somewhat problematic for explanations based on authoritarianism–threat theories that predict increases in the manifestation of authoritarian tendencies among both authoritarian and nonauthoritarian persons facing threat (e.g., Altemeyer, 1988; Duckitt, 2001; Wilkinson, 1972). Such theories lead to Hypothesis 2, which received no confirmation, but do not lead to Hypothesis 3, which did receive support. Furthermore, the results of previous research at the societal level, which did show the direct threat–authoritarianism link (e.g., Doty et al., 1991; Sales, 1973), could have been masking somewhat divergent authoritarian and libertarian responses (Stenner, 2005, p. 30).

The TMT explanation of the Hypothesis 3 results relies on the assumption that conservatives and liberals respond according to the theory’s two major tenets regarding the effects of MS and challenges to worldview in a similar fashion but with ideologically opposing outcomes, an assumption that has not gone unchallenged. For instance, Paulhus and Trapnell (1997) have suggested that worldview defense in the face of mortality cues may be more or less confined to conservatives and that the usual “worldview” conceptualized and assessed in TMT research really is a “conservative worldview” (p. 43). Jost et al.
(2003) also pointed out that although proponents of TMT have claimed that MS can produce increases in political conservatism or liberalism, whichever is dominant, “most of the demonstrated effects of MS have had a politically conservative or intolerant flavor” (p. 349). Nevertheless, in studies conducted with the conservatism or authoritarianism of the participants as moderating variables, the effects of the “threat” of MS actually have been moderated in the fashion suggested by TMT. For example, MS has been found to increase dislike for dissimilar others among conservatives but to decrease such dislike among liberals (Greenberg, Simon, Pyszczynski, Solomon, & Chatel, 1992), to increase negative evaluations of another with dissimilar attitudes among high but not low authoritarians (Greenberg et al., 1990), to increase support for strong military actions among conservatives but not liberals (Pyszczynski et al., 2006), and to increase interest in reading a one-sided rather than a two-sided article on capital punishment among high but not low authoritarians (Lavine, Lodge, & Freitas, 2005).

Such results, of course, also are readily interpretable through the lens of ADT. Stenner (2005) has stated that ADT “can comfortably accommodate terror management theory, and it goes well beyond the latter in specifying precisely the critical aggravating conditions, and the manner in which their impact is contingent upon variation in individual predispositions” (p. 298). Her conclusion apparently rests on the argument that the impact of MS “appears to be entirely conditional upon authoritarian predisposition” (p. 298). TMT proponents might well counter that such a conclusion was premature, that TMT does allow for the moderating effects of individual differences related to worldview, such as authoritarian disposition, is much more extensive in conceptual reach, more precisely stipulates the conscious and unconscious processes underlying the effects of mortality threat, has amassed a far greater body of diverse affirmative empirical research results, and actually can subsume ADT.

One apparent shortcoming of the present investigation is that individual levels of authoritarianism were not directly assessed in state samples. No such indices exist as far as I know. Therefore, it cannot be known conclusively whether the pattern of results actually occurred because there are more authoritarians in conservative states, as required by the ADT explanation. The best that can be done is to provide circumstantial evidence that supports the reasonableness of the conclusion that the ideological variable, based on the responses of the 142,000 state-sampled individuals, reflects the authoritarian–libertarian dispositions of those respondents and that by inference, the state conservatism composite, which correlates .92 with the ideological variable, also highly reflects underlying state authoritarian–libertarian differences.

New data (P. J. Rentfrow, personal communication, April 23, 2007; Rentfrow, Jost, Gosling, & Potter, 2006; see also Jost, 2006) fortunately provide rich evidence of a different type that also points to justification for using the state ideological variable as a proxy for authoritarian–libertarian differences. Past research at the individual level has shown that the Big Five Openness to Experience factor (McCrae et al., 2000) is consistently correlated negatively with both conservatism (e.g., Butler, 2000; Trapnell, 1994) and authoritarianism (e.g., Peterson, Smirles, & Wentworth, 1997; Trapnell, 1994; see also Stenner, 2005, pp. 144–146). However, Rentfrow et al. (2006) obtained Openness scores for approximately 250,000 Americans in each of two Internet surveys, one in 1999–2000 and another in 2003–2004, and the data were aggregated at the state level for each sample.
The present state ideological variable, which is based on 1976–1988 phone surveys, correlates −.51 and −.39 with 1999–2000 and 2003–2004 Openness scores, respectively, and the matching correlations for the conservatism composite are −.35 and −.27 (P. J. Rentfrow, personal communication, April 23, 2007). These state-level results, coupled with those at the individual level linking authoritarianism to conservatism and both to Openness, strongly suggest that the state conservatism scores in the present study also reflect state authoritarianism levels based on the individual dispositions of those in the state aggregates.

Given that differences exist between ADT and TMT, which theory offers the “best” explanation of the present findings? Perhaps the ADT account is somewhat more parsimonious in the sense that it can predict and explain the present interactive results using a one-step rather than a marginally more complex two-step process to describe the dynamics involved. However, the ADT interpretation depends on the assumption that state conservatism reflects underlying state authoritarianism, whereas the TMT interpretation needs no such assumption. But these are rather superficial advantages, and it seems evident that the results can be successfully explained using either theory. Any present choice between the two competing interpretations seems dependent solely on personal preference. The research scenario and the results simply do not appear to provide sufficient substantive information to permit a rational, empirically based judgment that either ADT or TMT is superior.

In regard to wider implications, it is evident that not only the present results but also those of earlier works in which authoritarianism and conservatism have been used as moderators of the threat of MS (i.e., Greenberg et al., 1990, 1992; Lavine et al., 2005; Pyszczynski et al., 2006) are equally compatible with ADT and TMT. The two theories do have key similarities, but they also differ in several core respects. For example, the idea in ADT that we conform to collective norms and authority in an effort to prevent and relieve anxiety is akin to the TMT concept of adherence to the “social anxiety buffer,” which includes worldview contents to protect us from existential anxiety. As well, both ADT and TMT offer a functional account: Worldview defense is activated or strengthened as needed. But there are apparent differences in regard to what triggers worldview defense. In ADT, it is normative threat or any threat to the collective; in TMT, it is MS and worldview challenges. Nonetheless, the triggers in ADT and TMT are connected and similar in regard to worldview threats and challenges, and the trigger of MS can be seen as a catalyst to the protection of worldview. Such shaded commonality should encourage future research in which ADT and TMT are pitted against each other in critical studies to determine when and how the two differ in their predictive capacities in various contexts and what constitutes the boundaries of their operating domains. Ultimately, the parallels and the divergences should prompt both ADT and TMT proponents to conduct full and formal comparisons of these two intriguing theories and the nature of their respective support.

References

Altemeyer, R. A. (1988). Enemies of freedom: Understanding right-wing authoritarianism. San Francisco: Jossey-Bass.
Altemeyer, R. A. (1998). The other “authoritarian personality.” In M. P. Zanna (Ed.), Advances in experimental social psychology (Vol. 30, pp. 47–91). New York: Academic Press.
Barkan, S. E., & Cohn, S. F. (1994). Racial prejudice and support for the death penalty by Whites. Journal of Research in Crime and Delinquency, 31, 202–209.
Baumer, E. P., Messner, S. F., & Rosenfeld, R. (2003). Explaining spatial variation in support for capital punishment: A multilevel analysis. American Journal of Sociology, 108, 844–875.
Becker, E. (1973). The denial of death. New York: Free Press.
Butler, J. C. (2000). Personality and emotional correlates of right-wing authoritarianism. Social Behavior and Personality, 28, 1–14.
Coe, K., & Domke, D. (2006). Petitioners or prophets? Presidential discourse, god, and the ascendancy of religious conservatives. Journal of Communication, 56, 309–330.
Death Penalty Information Center. (2006a). Sentencing: Death sentences by state, 1977–present. Retrieved June 17, 2006, from http://www.deathpenaltyinfo.org
Death Penalty Information Center. (2006b). Death penalty fact sheet: Number of executions by state since 1976. Retrieved June 24, 2006, from http://www.deathpenaltyinfo.org
Death Penalty Information Center. (2006c). State by state information. Retrieved June 28, 2006, from http://www.deathpenaltyinfo.org
The Disaster Center. (2006). United States uniform crime report. Retrieved July 19, 2006, from http://www.disastercenter.com/crime
Doty, R. M., Peterson, B. E., & Winter, D. G. (1991). Threat and authoritarianism in the United States, 1978–1987. Journal of Personality and Social Psychology, 61, 629–640.
Duckitt, J. (2001). A dual-process cognitive-motivational theory of ideology and prejudice. In M. P. Zanna (Ed.), Advances in experimental social psychology (pp. 41–112). San Diego, CA: Academic Press.
Duckitt, J., & Fisher, K. (2003). The impact of threat on worldview and ideological attitudes. Political Psychology, 24, 199–222.

Eckhardt, W. (1991). Authoritarianism. Political Psychology, 12, 97–124.
Erikson, R. S., Wright, G. C., & McIver, J. P. (1993). Statehouse democracy: Public opinion and policy in the American states. New York: Cambridge University Press.
Esses, V. M., Dovidio, J. F., & Hodson, G. (2002). Public attitudes toward immigration in the United States and Canada in response to the September 11, 2001 “attack on America.” Analyses of Social Issues and Public Policy, 2, 69–85.
Federal Bureau of Investigation. (2006). Uniform crime reports. Retrieved August 2, 2006, from http://www.fbi.gov/ucr/ucr.htm
Feldman, S. (2003). Enforcing social conformity: A theory of authoritarianism. Political Psychology, 24, 41–74.
Feldman, S., & Stenner, K. (1997). Perceived threat and authoritarianism. Political Psychology, 18, 741–770.
Fromm, E. (1941). Escape from freedom. New York: Holt, Rinehart & Winston.
Greenberg, J., Arndt, J., Simon, L., Pyszczynski, T., & Solomon, S. (2000). Proximal and distal defenses in response to reminders of one’s mortality: Evidence of a temporal sequence. Personality and Social Psychology Bulletin, 26, 91–99.
Greenberg, J., Pyszczynski, T., & Solomon, S. (1986). The causes and consequences of a need for self-esteem: A terror management theory. In R. F. Baumeister (Ed.), Public self and private self (pp. 189–212). New York: Springer-Verlag.
Greenberg, J., Pyszczynski, T., Solomon, S., Rosenblatt, A., Kirkland, S., & Lyon, D. (1990). Evidence for terror management theory II: The effects of mortality salience on reactions to those who threaten or bolster the cultural worldview. Journal of Personality and Social Psychology, 58, 308–318.
Greenberg, J., Simon, L., Pyszczynski, T., Solomon, S., & Chatel, D. (1992). Terror management and tolerance: Does mortality salience always intensify negative reactions to others who threaten one’s worldview? Journal of Personality and Social Psychology, 63, 212–220.
Jacobs, D., & Carmichael, J. T. (2002). The political sociology of the death penalty: A pooled time-series analysis. American Sociological Review, 67, 109–131.
Jacobs, D., & Carmichael, J. T. (2004). Ideology, social threat, and the death sentence: Capital sentences across time and space. Social Forces, 83, 249–278.
Jacobs, D., Carmichael, J. T., & Kent, S. L. (2005). Vigilantism, current racial threat, and death sentences. American Sociological Review, 70, 656–677.
Jost, J. T. (2006). The end of the end of ideology. American Psychologist, 61, 651–670.
Jost, J. T., Glaser, J., Kruglanski, A. W., & Sulloway, F. J. (2003). Political conservatism as motivated social cognition. Psychological Bulletin, 129, 339–375.
Judges, D. P. (1999). Scared to death: Capital punishment as authoritarian terror management. University of California at Davis Law Review, 33, 155–248.
Katz, D. (1960). The functional approach to the study of attitudes. Public Opinion Quarterly, 24, 163–204.
Kemmelmeier, M. (2004). Authoritarianism and candidate support in the U.S. presidential elections of 1996 and 2000. Journal of Social Psychology, 144, 218–221.
Lavine, H., Lodge, M., & Freitas, K. (2005). Threat, authoritarianism, and selective exposure to information. Political Psychology, 26, 219–244.
McCann, S. J. H. (1997). Threatening times, strong presidential popular vote winners, and the margin of victory, 1824–1964. Journal of Personality and Social Psychology, 73, 160–170.
McCann, S. J. H. (1999). Threatening times and fluctuations in American church memberships. Personality and Social Psychology Bulletin, 25, 325–336.
McCourt, K., Bouchard, T. J., Jr., Lykken, D. T., Tellegen, A., & Keyes, M. (1999). Authoritarianism revisited: Genetic and environmental influences examined in twins reared apart and together. Personality and Individual Differences, 27, 985–1014.
McCrae, R. R., Costa, P. T., Ostendorf, F., Angleitner, A., Hřebíčková, M., Avia, M. D., et al. (2000). Nature over nurture: Temperament, personality, and life span development. Journal of Personality and Social Psychology, 78, 173–186.
Miller, A. H., & Wattenberg, M. P. (1984). Politics from the pulpit: Religiosity and the 1980 elections. Public Opinion Quarterly, 48, 301–317.
Moran, G., & Comfort, J. C. (1986). Neither “tentative” nor “fragmentary”: Verdict preference of impaneled felony jurors as a function of attitude toward capital punishment. Journal of Applied Psychology, 71, 146–155.
Pacific Research Institute. (2005a). Economic Freedom Index: 2004. Retrieved May 16, 2005, from http://www.pacificresearch.org/pub/sab/entrep/2004/econ_freedom/freedom.html
Pacific Research Institute. (2005b). Pacific Research Institute: 26 years of putting ideas into action. Retrieved June 19, 2005, from http://www.pacificresearch.org/about/index.html
Paulhus, D. L., & Trapnell, P. D. (1997). Terror management theory: Extended or overextended? Psychological Inquiry, 8, 40–43.
Peterson, B. E., & Gerstein, E. D. (2005). Fighting and flying: Archival analysis of threat, authoritarianism, and the North American comic book. Political Psychology, 26, 887–904.
Peterson, B. E., Smirles, K. A., & Wentworth, P. A. (1997). Generativity and authoritarianism: Implications for personality, political involvement, and parenting. Journal of Personality and Social Psychology, 72, 1202–1216.
Pyszczynski, T., Abdollahi, A., Solomon, S., Greenberg, J., Cohen, F., & Weise, D. (2006). Mortality salience, martyrdom, and military might: The great Satan versus the axis of evil. Personality and Social Psychology Bulletin, 32, 525–537.
Pyszczynski, T., Solomon, S., & Greenberg, J. (2003). In the wake of 9/11: The psychology of terror. Washington, DC: American Psychological Association.
Rankin, J. H. (1979). Changing attitudes toward capital punishment. Social Forces, 58, 194–211.
Rentfrow, P. J., Jost, J. T., Gosling, S. D., & Potter, J. (2006). Regional differences in personality predict voting patterns in 1996–2004 U.S. presidential elections. Manuscript submitted for publication.
Rickert, E. J. (1998). Authoritarianism and economic threat: Implications for political behavior. Political Psychology, 19, 707–720.
Rokeach, M. (1960). The open and closed mind. New York: Basic Books.
Rosenblatt, A., Greenberg, J., Solomon, S., Pyszczynski, T., & Lyon, D. (1989). Evidence for terror management theory: I. The effects of mortality salience on reactions to those who violate or uphold cultural values. Journal of Personality and Social Psychology, 57, 681–690.
Sales, S. M. (1973). Threat as a factor in authoritarianism. Journal of Personality and Social Psychology, 28, 44–57.


Sanford, R. N. (1966). Self and society. New York: Atherton Press.
Shepherd, J. M. (2005). Deterrence versus brutalization: Capital punishment’s differing impacts among states. Michigan Law Review, 104, 203–255.
Solomon, S., Greenberg, J., & Pyszczynski, T. (1991). A terror management theory of social behavior: The psychological functions of self-esteem and cultural worldviews. In M. P. Zanna (Ed.), Advances in experimental social psychology (Vol. 24, pp. 93–159). Orlando, FL: Academic Press.
Soss, J., Langbein, L., & Metelko, A. R. (2003). Why do White Americans support the death penalty? Journal of Politics, 65, 397–421.
Stenner, K. (1998). Societal threat and authoritarianism: Racism, intolerance and punitiveness in America, 1960–1994. Dissertation Abstracts International, 58(08), 3294A.
Stenner, K. (2005). The authoritarian dynamic. New York: Cambridge University Press.
Stone, W. F., & Smith, L. D. (1993). Authoritarianism: Left and right. In W. F. Stone, G. Lederer, & R. Christie (Eds.), Strength and weakness: The authoritarian personality today (pp. 144–156). New York: Springer-Verlag.
Strachan, E., Schimel, J., Arndt, J., Williams, T., Solomon, S., Pyszczynski, T., & Greenberg, J. (2007). Terror mismanagement: Evidence that mortality salience exacerbates phobic and compulsive behaviors. Personality and Social Psychology Bulletin, 33, 1137–1151.
Thomas, C. W., & Foster, S. C. (1975). A sociological perspective on public support for capital punishment. American Journal of Orthopsychiatry, 45, 641–657.
Trapnell, P. D. (1994). Openness versus intellect: A lexical left turn. European Journal of Personality, 8, 273–290.
U.S. Census Bureau. (1984). Statistical abstract of the United States. Washington, DC: U.S. Government Printing Office.
U.S. Census Bureau. (1991). Statistical abstract of the United States. Washington, DC: U.S. Government Printing Office.
U.S. Census Bureau. (2001). Statistical abstract of the United States. Washington, DC: U.S. Government Printing Office.
U.S. Census Bureau. (2006). Statistical abstract of the United States. Washington, DC: U.S. Government Printing Office.
U.S. Census Bureau. (2007). Statistical abstract of the United States. Washington, DC: U.S. Government Printing Office.
Vidmar, N. (1974). Retributive and utilitarian motives and other correlates of Canadian attitudes toward the death penalty. Canadian Psychologist, 15, 337–356.
Wilkinson, R. (1972). The broken rebel. New York: Harper & Row.
Wilson, G. D. (1973). The psychology of conservatism. London: Academic Press.

Received November 29, 2006
Revision received January 8, 2008
Accepted January 15, 2008