New Perspectives on Narrative and Multimodality
Routledge Studies in Multimodality
Edited by Kay L. O’Halloran, National University of Singapore
1. New Perspectives on Narrative and Multimodality Edited by Ruth Page
New Perspectives on Narrative and Multimodality
Edited by Ruth Page
New York
London
First published 2010 by Routledge
270 Madison Ave, New York, NY 10016

Simultaneously published in the UK by Routledge
2 Park Square, Milton Park, Abingdon, Oxon OX14 4RN

Routledge is an imprint of the Taylor & Francis Group, an informa business

This edition published in the Taylor & Francis e-Library, 2009.

To purchase your own copy of this or any of Taylor & Francis or Routledge’s collection of thousands of eBooks please go to www.eBookstore.tandf.co.uk.

© 2010 Taylor & Francis

All rights reserved. No part of this book may be reprinted or reproduced or utilised in any form or by any electronic, mechanical, or other means, now known or hereafter invented, including photocopying and recording, or in any information storage or retrieval system, without permission in writing from the publishers.

Trademark Notice: Product or corporate names may be trademarks or registered trademarks, and are used only for identification and explanation without intent to infringe.

Library of Congress Cataloging in Publication Data
New perspectives on narrative and multimodality / edited by Ruth Page.
p. cm. -- (Routledge studies in multimodality ; 1)
Includes bibliographical references and index.
1. Semiotics. 2. Discourse analysis, Narrative. I. Page, Ruth E., 1972–
P99.N48 2009
401'.41--dc22
2009014990
ISBN 0-203-86943-5 Master e-book ISBN
ISBN10: 0-415-99517-5 (hbk) ISBN10: 0-203-86943-5 (ebk) ISBN13: 978-0-415-99517-7 (hbk) ISBN13: 978-0-203-86943-7 (ebk)
Contents

List of Figures and Tables
Permissions
Acknowledgments

1 Introduction
RUTH PAGE

2 Multimodal Storytelling: Performance and Inscription in the Narration of Art History
FIONA J. DOLOUGHAN

3 A Multimodal Approach to Mind Style: Semiotic Metaphor vs. Multimodal Conceptual Metaphor
ROCÍO MONTORO

4 The Computer-Based Analysis of Narrative and Multimodality
ANDREW SALWAY

5 Opera: Forever and Always Multimodal
MICHAEL HUTCHEON AND LINDA HUTCHEON

6 Word-Image/Utterance-Gesture: Case Studies in Multimodal Storytelling
DAVID HERMAN

7 “I Contain Multitudes”: Narrative Multimodality and the Book that Bleeds
ALISON GIBBONS

8 Multimodality and the Literary Text: Making Sense of Safran Foer’s Extremely Loud and Incredibly Close
NINA NØRGAARD

9 Electronic Multimodal Narratives and Literary Form
MICHAEL TOOLAN

10 Gains and Losses? Writing it All Down: Fanfiction and Multimodality
BRONWEN THOMAS

11 Respiratory Narrative: Multimodality and Cybernetic Corporeality in “Physio-Cybertext”
ASTRID ENSSLIN

12 Cruising Along: Time in Ankerson and Sapnar
JESSICA LACCETTI

13 Beyond Multimedia, Narrative, and Game: The Contributions of Multimodality and Polymorphic Fictions
CHRISTY DENA

14 Keg Party Extreme and Conversation Party: Two Multimodal Interactive Narratives Developed for the SMALLab
SARAH HATTON, MELISSA MCGURGAN, AND XIANG-JUN WANG

15 Coda/Prelude: Eighteen Questions for the Study of Narrative and Multimodality
DAVID HERMAN AND RUTH PAGE

Contributors
Index
Figures and Tables

FIGURES

4.1 A local grammar fragment induced from corpora of film scripts and audio description, from Vassiliou (2006).
6.1 Monomodal narration used to evoke a single reference world.
6.2 Monomodal narration used to evoke multiple reference worlds.
6.3 Multimodal narration used to evoke a single reference world.
6.4 Multimodal narration used to evoke multiple reference worlds.
6.5 The Incredible Hulk, Volume 2, Issue 1, created by Stan Lee, written by Gary Friedrich and Marie Severin, inked by George Tuska, lettered by Artie Simek, p. 7. New York: Marvel Comics Group (Issue 102), April 1968.
6.6 Gesture use in off-site narration.
6.7 Gesture use in on-site narration.
6.8 Number of points used to create transpositions between gesture spaces.
6.9 Number of points used to create laminations of gesture spaces.
6.10 A continuum of place-making strategies in narrative.
7.1 “First Pain” and “Then Knowledge.”
7.2 “Trifold.”
7.3 Reemerging input spaces.
12.1 A zoomed-out scene from Cruising illustrating the filmic frames and the written narrative.
12.2 A zoomed-in scene from Cruising illustrating the filmic frames and the written narrative.
12.3 White rounded font from the written narrative.
12.4 The only image in Cruising in which the protagonists appear.
12.5 An enlarged view of the driver. Ankerson and Sapnar, Cruising.
12.6 Filmic sequence showing the black space punctuation between each image.
13.1 Diagram illustrating the relations between principles, modes, and media, as espoused by Kress and van Leeuwen (2001).
14.1 Picture of a student using the green glowing ball to interact with a 3-D trace system in SMALLab.
14.2 Participant navigates through a virtual space in SMALLab’s Alphabet Soup scenario.
14.3 Image used in Keg Party Extreme.
14.4 Participant stands over the keg so to trigger the sounds associated with that location.
14.5 Participants engaging in Conversation Party.
14.6 The image used in Conversation Party.
14.7 Picture of a flowchart about sound trigger and probability system.

TABLES

1.1 Dimensions of the Multimodal Ensemble
2.1 Multimodal Transcription and Analysis.
4.1 Four Corpora of Text Surrogates for Film
6.1 Cassell and McNeill’s (1991) Taxonomy of Gestures
6.2 An Expanded Inventory of Pointing Gestures
Permissions
CHAPTER 6
Figure 6.5 is reproduced from The Incredible Hulk, Volume 2, Issue 1, created by Stan Lee, written by Gary Friedrich and Marie Severin, inked by George Tuska, lettered by Artie Simek, p. 7. New York: Marvel Comics Group (Issue 102), April 1968. HULK: TM & © 2008 Marvel Entertainment, Inc., and its subsidiaries. Used with permission.

CHAPTER 7
Figures 7.1 and 7.2 are reproduced from VAS: An Opera in Flatland, by Steve Tomasula, art and design by Stephen Farrell (2002), Chicago: University of Chicago Press, pp. 9–10 and p. 58 respectively. Used with permission.

CHAPTER 12
Figures 12.1–12.6 are reproduced from Cruising by Ingrid Ankerson and Megan Sapnar. Used with permission.
Acknowledgments
Many people made this volume possible, and I can only single out some of the people who helped facilitate its production and completion. My thanks go to the contributors for their patience and commitment to the project. It has been a pleasure to work with scholars from across Europe, Canada, the United States, and Australia. I am also grateful to Elizabeth Levine and Erica Wetter at Routledge for their prompt and helpful editorial advice. The collection of essays first began to take shape at the Narrative and Multimodality Symposium (April 2007), hosted at Birmingham City University. I am indebted to Birmingham City University for their support that facilitated that event, and for the research leave that has enabled me to bring the present collection to final completion. Special thanks go to Michael Toolan and David Roberts for the conversations that gave rise to the symposium in the first place; to David Herman for believing in the collection and for providing ever-ready encouragement; to Louise Sylvester for scholarly generosity more than reciprocated. Most of all, I thank my family who have supported me in so many ways during the time this project has taken shape, especially Gavin Page, whose practical help at the symposium will be remembered by all present for years to come, and who I am so glad to be sharing the story of my life with.
1 Introduction
Ruth Page
THE MULTIMODAL NATURE OF STORYTELLING

From conversational anecdotes told in face-to-face contexts through to film, digital storytelling, and beyond, narrative experiences employ a rich range of semiotic resources. The multifaceted nature of storytelling is nothing new, and is without doubt far more widespread, creative, and diverse than these initial examples signal. Stories might be spoken or written, and in their performance employ gesture, movement, facial expression, and prosodic elements such as voice quality, pitch, pace, and rhythm. Other narrative resources might include soundtracks, music, image, typeface, and hyperlinks, none of which are exceptional for their presence in stories of various kinds available at the outset of the twenty-first century. Put simply, stories do not consist of words alone. However, the multiple and integrated nature of semiotic resources used in storytelling is less simple to explain than to assert, and is long overdue for systematic and close attention in narrative theory. The dominant and interrelated trends that have shaped contemporary narrative studies in the last three decades provide the backdrop against which we can situate this now pressing need.
TRENDS IN CONTEMPORARY NARRATIVE STUDIES

The interdisciplinary expansion of narratology across the humanities and beyond means that the kinds of stories that now come under scrutiny extend much further than the literary texts typically prominent in classical narratology. Instead, audiovisual stories of various kinds (such as cinematic film or televised serials); stories that foreground image (including tableaux and cartoons of many kinds); and music (ranging from opera to hip-hop) have all been found to be of interest for their narrative function and potential. The stories that harness the rapid development of new technologies are part of this semiotic expansion, and in themselves are often characterized by multiple resources as they integrate words, image, sound, hyperlinks, and animation (for example, as seen on Web-based homepages, or creatively
exploited in digital fiction). The innovative nature and social impact of recent technology developments mean that the question of digital media and its role in narrative processing has come to center stage, prioritizing the place of media in narrative studies more generally. The increasing diversity of narrative texts, combined with an openness to embrace methodology from other fields of inquiry, means that a narratology derived from the study of verbal resources alone can no longer be fully adequate to the task of interrogating storytelling in its broadest sense. Narratologists have long recognized the limitations of early studies in the field. The contextualist rejection of structuralist abstraction led to an increased interest in the situated and process-oriented nature of narrative. The outcome of this was a shift away from earlier text-immanent analyses towards an attempt to account for the nonlinguistic factors that might be involved in the cover-all domain of “context.” Contextual sensitivity in narrative studies has been manifest in various ways. Within the sociolinguistic fields influenced by Labov and Waletzky’s (1967) groundbreaking work on personal narratives, the shaping force of variables such as the narrative participant’s gender, age, and ethnicity began to be given due attention. Likewise, discourse analysts debated the constraining and enabling nature of the immediate situation in which the narrative was elicited, for example, evaluating stories told in interview settings (Lambrou 2005), meal times (Blum Kulka 1993), and peer group conversation (Georgakopoulou 1997). Critically, studies influenced by sociolinguistics also sought to theorize the social function of narrative, for example, as a means of managing interpersonal relations or performing identity work. 
However, even in these more contextually oriented studies, the spoken stories investigated were often transcribed in such a way that paralinguistic features (including gaze or prosodic resources like intonation) appeared as annotations embellishing the verbal record of telling rather than being recognized as semiotic systems in their own right. Literary studies of narrative have followed a similar move away from abstract, quasi-scientific formalist systems, but with a rather different focus. At least in some quarters, literary narratologists rejected empirical analysis of actual and immediate contexts of narration (Chatman 1990), instead conceptualizing “context” in broader, culturally oriented terms. A wealth of what might be termed “critical narratology” emerged, exposing connections between narrative and ideology. Again, the influence of gender, race, and ethnicity on storytelling is found in work that illustrates the move from “poetics to politics” (Currie 1998, 4), seen through recouping the value of previously marginalized writing. For example, feminist narratology sought to take account of literary texts by women (Lanser 1986), while others have used narratological tools to uncover postcolonial, and/or historicist perspectives in literary and cinematic texts (see Aldama 2003). Cognitive Narrative Analysis developed as a separate but allied strand of contextualism in both literary and linguistic domains of study. Here the contextual focus was trained on the user and the cognitive processes
employed in making sense of narratives. As Herman (2003) points out, Cognitive Narrative Analysis is inherently interdisciplinary (drawing on neurolinguistics, psycholinguistics, and anthropology, for example) and intermedial. Nonetheless, within narratology, analyses have tended to be devoted to literary texts or to fabricated examples, again given in verbal form (Herman 1997). In both cases, critical and cognitive trends in contextualist narratology rightly allude to the interactive nature of narrative production and reception. Nonetheless, the nonlinguistic resources of the contextual domain are extrapolated through discussion of given texts, and more specifically mediated through analysis of specific verbal choices from sample (often) literary texts to the neglect of other semiotic resources. From the stance of the early twenty-first century, contemporary narrative analysis has clearly come a long way from its structuralist beginnings and focus on literary texts and analysis. Nonetheless, these origins have left significant legacies for narrative theory that are problematic if we are to take account of the full range of semiotic resources regularly found in storytelling. Most prominent are the assumption of monomodality and the privileging of verbal resources. Both in classical narratology and sociolinguistic accounts of narrative practice, the source material has predominately focused on verbal resources, either realized in the form of literary texts or written transcriptions of spoken data. Similarly, linguistics has functioned as a dominant paradigm in the development of narrative theory. 
From the initial use of Saussurean principles to distinguish between deep and surface structures, through to grammatical metaphors used to explain plot structure (Longacre 1983) or actantial relations (Greimas 1983) and still current in contemporary applications of systemic functional linguistics (Herman 2002) and corpus linguistics (Toolan 2008), tools from linguistics have been used to build evidence for narrative patterns and in turn underpin narrative concepts themselves. Both typical source data and conceptual bias in narratology lead to a situation not just of media-blindness (the assumption that concepts derived from one format can be unproblematically transferred to another) but also of mode-blindness. By this I mean that we should not assume that the dominance of the verbal mode thus far in narrative theorizing means that it is fully adequate to explicate the contribution of other modes (be they visual, verbal, kinaesthetic, or related to conventions such as dress codes). Instead, we need to reconfigure narrative theory and analysis in such a way that verbal resources are understood as only one of many semiotic elements integrated together in the process of storytelling.
MULTIMODAL THEORY: BACKGROUND AND DEFINITIONS

The chapters in this collection mark a paradigm shift away from mode-blindness. Instead, in various ways, the discussions develop and debate our
understanding of narrative resources as multimodal phenomena. Although communication has always exhibited multimodal qualities, it is since the 1990s that conceptualizing multimodality has enjoyed renewed critical interest. Initially associated with the scholarly work of the New London Group, multimodality is rooted in semiotics but interfaces particularly with discourse analysis, systemic functional linguistics, and socially oriented work in Critical Discourse Analysis. As the prefix “multi-” suggests, work that might come under the heading of multimodality is a pluralistic enterprise, drawing on different perspectives and data. Baldry and Thibault thus describe multimodality as “a multipurpose toolkit, not a single tool for a single purpose” (2006, xv). The work in this volume is no exception to such pluralism, and the authors employ different kinds of analyses, survey different kinds of narratives, and take different stances towards multimodality itself. However, the diversity of multimodality is neither random nor ad hoc. Instead, multimodality is grounded in certain central claims, which I revisit now. Above all else, multimodality insists on the multiple integration of semiotic resources in all communicative events. From this perspective, all texts are multimodal (Kress 2000, 187; Baldry and Thibault 2006, 19). Monomodality in comparison is not an actual quality of texts but rather a way of thinking about individual semiotic resources once abstracted from the communicative ensembles in which they occur. Multimodality’s insistence on the multiple resources used in communication is coupled with the democratic stance that all modes are equal. The extensive knowledge of verbal resources (and relative neglect of other systems) in certain quarters is not necessarily a result of the sole existence of language, but rather a continuing aftereffect of the centrality of linguistics (Kress 2000, 193). 
The assertion of modal democracy does not deny that when in use particular modes from the ensemble can relate to one another in various ways (by complementing or contrasting in the meanings they construct, or being arranged in hierarchies), and that in any given text, one mode may dominate. But even when one mode dominates, this does not mean that the other either less prominent or less well-recognized semiotic resources are not in play. Multimodality requires us to fundamentally rethink the position of verbal resources within semiotic configurations (here specifically within narrative theory) and to ask what the narrative system would look like if we examined other modes with equal priority. If multimodality calls attention to the range of semiotic resources used in communication, it also seeks to draw connections between them. In some work this is expressed as a unifying tendency, where multimodality is viewed as “common semiotic principles [that] operate in and across different modes” (Kress and van Leeuwen 2001, 2). Typically, if not ironically, these semiotic principles thus far have borrowed and reworked concepts from various subfields of linguistics (for example, grammars, framing, modality, elaboration). The question of how far such metaphors are useful
is contentious and not without debate (Kress and van Leeuwen 2001, 124–25), and indeed the wider issue of whether it is possible or not to create a single system of principles for all modes lies beyond the scope of this volume. Instead, applying multimodal principles to narrative analysis must negotiate the tensions between finding transferable frames of reference that enable comparisons to be made between different modal combinations and avoiding the levels of abstraction that negate the inevitable influence of localized narrative contexts. The process-oriented nature of multimodal analysis forces us to return to the multifaceted nature of narrative context from a fresh perspective. Drawing on the work of Malinowski (1935), multimodal theory distinguishes between context of situation (the localized situation in which words are uttered) and context of culture. In narrative analysis, these two contextual facets have often been subject to separation across disciplinary divides. In narratological criticism of literary texts, discussion of context appears closer to the broader cultural concerns (for example, as treated in subfields such as feminist narratology). In sociolinguistic studies of narrative, the factors involved in “context of situation” are more strongly foregrounded through typical data (often narratives that occur in talk of various kinds) and methodological preoccupations. Multimodal narrative analysis reminds us that such separation is illusory and that not only does all language operate within cultural systems, so narratives (whether naturally or technologically produced) are received in local contexts by actual audiences. The principles of multimodality provide a significant means of expanding the project of contemporary narratology. 
Like other recent narrative studies that have indicated a turn to media (Ryan 2004), multimodality brings into the frame a number of issues that have previously been overlooked in the verbal hegemony of classical narratology, namely questions of remediation (Bolter and Grusin 1999) and the narrative affordances of specific media. The distinct contribution of a multimodal analysis is to shift the focus from media to modes, and to focus less on a comparison of specific media but instead to reconceptualize all narrative communication as multimodal. Clearly the enormity of fully examining the semiotic resources used in narrative (in its broadest sense) falls far beyond a single framework, and must by necessity be diverse, interdisciplinary, and integrative. On the one hand, this opens up a wealth of possible connections between methodology, source material, and critical perspective. On the other, such diversity is also open to terminological confusion and controversy. Central to the debates and analyses that follow in this volume are the terms “medium” and “mode,” which are notoriously slippery and ambiguous. In order to anchor the chapters that follow within some kind of critical context, I move now to sketch out some definitions against which multimodal narrative analysis might position itself.
CRITICAL TERMINOLOGY

The modes described in multimodal analysis refer specifically to semiotic modes (as opposed to other, specialist uses of the term). Thus a mode is understood here as a system of choices used to communicate meaning. What might count as a mode is an open-ended set, ranging across a number of systems including but not limited to language, image, color, typography, music, voice quality, dress, gesture, spatial resources, perfume, and cuisine. The status of a mode is relative and may vary according to its instantiation within a given community. For example, the potential of particular scents to carry meaning may be high for perfume creators but less so for other individuals who are not trained to differentiate between them. Given the fluid nature of modes, central questions are how, why, and to what extent some modes become particularly privileged in certain contexts. The collection of work in this volume concentrates on narrative as a significant mode by which humans make sense of themselves and the world around them. However, the principles of multimodality remind us that narrative is only one mode amongst many, and a multimodal narrative analysis should not be taken to reinforce narrative imperialism but rather might serve to broaden our understanding of how and why narrative functions in relation to other modes in different contexts. Semiotic modes are realized materially through particular media. Although closely intertwined with multimodality, the analysis of medium is a separable and independent issue. As Kress and van Leeuwen put it, “multimodality and multimediality are not quite the same thing” (2001, 67). In probing this issue further, it is worth distinguishing between the different uses of “medium.” Ryan (2004) rightly points out that two meanings are current in theoretical discussions. First, medium refers to the physical materials used when conveying communication (e.g., print, airwaves, radio). 
Second, it may also be defined as a channel of communication. Both meanings are important in multimodality, which recognizes the potential of materiality for meaning-making. However, modes and medium cannot be mapped univocally. As Kress and van Leeuwen (2001) argue, a given semiotic mode can appear in different media. For example, language can be spoken or written. Conversely, and fundamental to the project of multimodality, different modes can be realized in the same medium, as demonstrated through the use of image and words in comics or illustrated stories. As materiality may function as a source of difference and hence meaning, multimodal narrative analysis treats media as one element operating within the wider ensemble of semiotic modes used in storytelling. It is less concerned with media-specific affordances in isolation. Thus multimodal narrative analysis can be focused on a single medium (print literature) or may survey narratives from different media (audiovisual and written). However, regardless of whether one or more media are involved, the focus remains on the integration of semiotic resources, not the comparison of media alone.
The relationship between materiality and multimodality draws attention to the physical work involved in narrative processing, both in the use of tools and technology, and also by the human body and its sensory organs. Kress and van Leeuwen (2001, 66) argue that although semiotic modes are grounded in physiological experience, they are not naturalistically equated to sensory modes. The senses being addressed, the priority between sensory tracks and the various affordances they yield within a storytelling context thus contribute to but should not be taken as synonymous with semiotic mode in its widest sense. Rather, recognizing that narrative is not just a means of artistic expression but a fundamental human endowment (Herman 2007, 17) the role of sensory modes remains vital to a multimodal narrative analysis. Storyworlds routinely depict sensory modality, indexically evoked through verbal or visual resources, and, more generally, narrative communication is itself embodied. Tellers and audiences interact variously with the substance of their stories, whether that is through gesture and tone of voice in conversational stories, or through the sensorimotor manipulations of a page, keyboard, screen, or other material. Analyzing the holistic contribution of sensory modes to storytelling raises important issues for multimodal narrative analysis. It might ask why certain modes play more significant roles in narrative production and reception than others (for example, visual and auditory senses are usually more prominent than olfactory senses). Likewise, attending to sensory modality demands an adequately theorized account of how the human body interacts with narrative materials of different kinds. 
The analysis of semiotic modes can combine factors used in textual presentation (such as the choice to use words and/or a diagram), sensory perception (the use or evocation of sight and touch, for example), and media of transmission, all of which are situated within a given physical environment. The semiotic resources combine in multiplex configurations across these parameters, demonstrating that multimodal narrative analysis is far more than a text-based concern. Heuristically, the different dimensions of multimodality might be categorized schematically in the following figure.
Table 1.1 Dimensions of the Multimodal Ensemble

Textual resources     Platform of delivery   Physical environment                Sensory modalities
Words                 Digital screen         Private (domestic)                  Sight
Image                 Printed page           Public                              Hearing
Sound                 Cinema/TV screen       Inside/Outside rooms or buildings   Touch
Movement              Face-to-face           Light/dark                          Smell
Olfactory resources   Telephone              Objects/space                       Taste
Of course in actuality, modal elements such as those listed are experienced in synergy rather than separately, and in open-ended configurations. Sketching the range of factors included in multimodality is not intended to be definitive or exclusive, but to make plain that narratives are embodied communicative events, experienced in particular environments. The narrative participants may interact with a story using a variously synthesized range of sensory modalities (sight, hearing, touch, smell, taste). For example, I might read a novel whilst holding it and turning the pages; watching a film I both see the images on the screen and hear the sound track but I will not touch the screen itself. Narratives may be delivered in different media. Thus a written narrative might appear on pages or digital screens of various kinds; an audiovisual narrative might be played out on a theater stage, cinema, television, or computer screen; and I can tell an oral story in a face-to-face situation, or use a telephone or Skype. The physical environment in which a narrative is produced also deploys a range of semiotic resources involved in meaning-making. A narrative may be experienced in private or public spaces (an anecdote can be told over dinner or from a pulpit; a film may be viewed in a cinema or at home on a television; an extract of a narrative might be pinned to a notice board or pasted on a billboard). Such public or private environments may variously use light or darkness, space, heat, or movement in creative ways to generate narrative meaning. Clearly, the contribution made by any one element of the semiotic ensemble may be more or less prominent in the way a story is meaningful. The point here is to bring to light the holistic manner in which narratives can be embedded in different modal contexts. 
Capturing the dynamic interplay of semiotic resources as they contribute to narrative meaning requires some form of transcription as a first step in the process of analysis and interpretation. Given that multimodal theory requires us to rethink the preeminence of any single mode, it is vital that the transcription makes plain not just the verbal content of a narrative, but the wider ensemble of semiotic resources at work. In so doing, this may reveal more clearly patterns of resource integration across different examples of storytelling and help us identify points of commonality and contrast across the narrative spectrum. Transcription must be at once systematic and replicable, but flexible enough to embrace the rich diversity of all that multimodality encompasses. This is no mean feat, and a definitive, established system of multimodal transcription is far from complete in the way that systems for representing single modes have developed (say, for language, prosody, music). In part, this may lie with the recent emergence of multimodality as a field of interest, but the expansive and potentially infinite combinations of modes that might be considered make a single framework of transcription less than desirable. In the present collection of chapters, the contributors use a range of techniques for describing the range of semiotic storytelling resources including description, tabulation, and diagrams. Their focus and the level
of delicacy involved in the transcription also varies, so that some contributors are interested in micro-level interplay between individual words and gestures (Herman) and/or images (Montoro) or spaces (Hatton et al.) whilst others are more concerned with discourse-level issues. All attempts at transcription should be understood as heuristic and present challenges to the unfolding project of multimodality. Whilst multimodality relocates language as only one of many semiotic resources at work in making meaning, verbal preeminence still largely characterizes transcription practices. For the most part, it is nonverbal resources (image, sound, movement, gesture) that are rendered in written verbal description rather than the other way round (that is, words are not so often “translated” into image or color). While transcription is multimodal (using words in combination with layout, font, graphics, and embedding image and links to multimedia) it would seem that academic discourse has some way yet to go in exploiting the multimodal potential of communication. As such, there are inevitable losses entailed by “telling” nonverbal resources in words rather than “showing” them directly. The translation of one mode into another reveals the interpretive nature of transcription, for example, found in the selection of elements from the semiotic context, how nonverbal resources are to be represented, how the transcription itself might be laid out.
ABOUT THIS VOLUME
The complexity of transcription choices is compounded by the diverse range of multimodal narrative analysis. The chapters in this collection reflect just a small part of the variety available, including analysis of print, digital, and audiovisual texts (literature, comics, film, computer games, hyperfiction) alongside oral stories and staged productions such as opera. The focus on a particular semiotic component of each narrative also varies considerably. Some contributors focus more on the multimodal nature of the text (Gibbons, Nørgaard, Toolan), while others focus more on matters related to production (Thomas, Herman) or reception (Michael and Linda Hutcheon, Ensslin). Of course, narrative interaction cannot separate out the issues of production and reception, as these and other chapters in the volume demonstrate. In keeping with the integrative spirit of multimodal theory, the chapters are not categorized according to an isolated focus on a particular mode. Instead, a range of sensory, textual, and environmental resources are considered for their semiotic potential. The analysis ranges across and between visual elements (in Doloughan’s discussion of art history, Herman’s analysis of comics, Gibbons’s chapter on contemporary graphic novels); sound, whether that be in opera (Michael and Linda Hutcheon’s chapter) or used in Interactive Narrative (Hatton et al.); textual resources such as typography (discussed in relation to print texts by Nørgaard and hypertext fiction by Laccetti); gesture (discussed in relation to film by Montoro and oral
storytelling by Herman) and haptic resources, whether that be control of a computer’s mouse (Laccetti) or a printed page (Gibbons). Although each of the chapters may be read in isolation, they are sequenced so that the opening three (Doloughan, Montoro, Salway) contain discussion of audiovisual resources. The chapters by Herman on oral stories and Michael and Linda Hutcheon on opera move to texts that are performed in particular contexts. Herman’s discussion of comics alongside Gibbons’s and Nørgaard’s analyses of contemporary graphic novels provide perspectives on the multimodality of print texts. The remaining six chapters all examine multimodality in new media narratives of various kinds. Toolan, Ensslin, and Laccetti discuss examples of hyperfiction, although their stances to this are very different. Thomas compares fanfiction with the multimodality of the TV series Lost. Hatton and her colleagues describe their construction of Interactive Narrative soundscapes whilst Dena introduces an evolving genre of narratives that cross boundaries between video game, online format, and off-line merchandise. Amongst the variety of subject matter, certain common threads reoccur in the discussion of multimodal narratives found in the chapters in this volume. These might be framed within the overarching sense that storytelling is an experiential process. The exact nature of the particular relationship between the narrative participants (the teller, audience) and the story itself will vary considerably according to the context and the nature of the story being told. However, multimodal theory draws attention to matters of production, not least of the human body. Michael Toolan’s chapter argues that in the case of narrative, some modes appear to be more easily reproduced by human speech than others. The corporeality of narrative interaction is discussed in other chapters too, especially in relation to the use of physical space. 
Hatton and her colleagues describe how movement through an artificially configured space could more or less easily be used to create story-like scenarios. Linda and Michael Hutcheon examine three contrasting uses of stage space to reinterpret the multimodal resources of Wagnerian opera. Dena’s discussion of polymorphic fictions illustrates the ways in which physical space can contribute to the ways in which narrative elements are interpreted. Lastly, Herman’s study of gesture theorizes the relationship between place, space and storyworlds. Human bodies do not just experience stories in space, they also interact with the materiality of narrative texts. The physical work done when interacting with printed pages, digital screens, or computer technology comes to the fore in a number of chapters. Both Nørgaard and Gibbons provide examples of particular instances where the haptic play with print pages can be used to create additional meaning in contemporary novels. But the influence of haptic modality is perhaps felt most strongly in the context of new media narratives, where the reader must physically manipulate an
element of digital apparatus (a mouse, keyboard, headset) in order to make sense of the story being told. The influence on the reader’s response can be highly varied. Toolan recognizes the now well-documented problems that emerge when the digital interface prohibits the reader’s immersion in the text. In contrast, Ensslin uses the physio-cybertext of Pullinger and colleagues’ Breathing Wall to reconfigure concepts of intentionality in reader response. Yet a further perspective is provided by Laccetti’s reading of the haptic demands of Ankerson and Sapner’s Cruising, where the navigation of the multimodal hyperfiction is interpreted from a feminist perspective. The process-oriented nature of multimodal theory is not limited to the physicality of text, teller, or audience but is usefully complemented by cognitive approaches. In this collection, Gibbons draws on findings from neurolinguistics in her discussion of multimodal processing. Montoro and Nørgaard also use multimodal theory in combination with conceptual metaphor to augment our understanding of particular narrative texts.
CODA / PRELUDE
Taken together, the chapters in this volume are but the starting point to a program that brings mutual benefit to both multimodal and narrative studies. Multimodality gains from a close study of the potential of the narrative as an influential mode of discourse that crosses cultures and media. The perceived monomodality of existing narrative theory, and specifically the dominance of verbal resources, is challenged profoundly by multimodality’s persistent investigation of the multiple semiotic tracks at work in storytelling. This leads to a critical rethinking of narrative definitions and concepts, source data and analytical tools, and so reenergizes the field. As such, the central aims that underpin the project of multimodal narrative analysis can be summarized as the desire to:
1. Explore the enabling and constraining properties of different combinations of modes in narrative production or reception.
2. Ask how the relationship between narrative and multimodality is influenced by particular contexts.
3. Critique existing definitions of narrative and construct alternatives that take account of the multimodal nature of communication.
4. Expand the transmedial study of narrative to investigate the relationship between medium and mode.
These aims are realized in various ways and in different measure in the chapters that follow. However, the aims are by no means exhausted by the studies represented here, nor can they cover in full the range of what multimodal narrative analysis might come to embrace. At heart, multimodal
narrative analysis is not just a pluralistic enterprise, but one that seeks to open up dialogue between scholars working in different areas of study. That dialogue can only be represented in part by the chapters in this volume. It seems fitting, then, that the final contribution to the collection is framed not as analysis or discussion but rather as a series of questions that might stimulate further work that brings together scholarship not just from narratologists and those working in multimodality but from the wider academic community too. It is my hope that both the chapters and the closing list of questions in this volume will invite inquiry into narrative in all its multimodal richness for many years to come.
REFERENCES
Aldama, Frederick. 2003. Postethnic Narrative Criticism: Magicorealism in Ana Castillo, Hanif Kureishi, Julie Dash, Oscar “Zeta” Acosta, and Salman Rushdie. Austin: University of Texas Press.
Baldry, Anthony, and Paul Thibault. 2006. Multimodal Transcription and Text Analysis: A Multimedia Toolkit and Coursebook. London: Equinox.
Blum-Kulka, S. 1993. “‘You Gotta Know How to Tell a Story’: Telling, Tales and Tellers in American and Israeli Narrative Events at Dinner.” Language in Society 22 (3): 361–402.
Bolter, Jay, and Richard Grusin. 1999. Remediation: Understanding New Media. Cambridge, MA: MIT.
Chatman, Seymour. 1990. “What Can We Learn from Contextualist Narratology?” Poetics Today 11 (2): 309–28.
Currie, M. 1998. Postmodern Narrative Theory. Basingstoke: Macmillan.
Georgakopoulou, A. 1997. Narrative Performances: A Study of Modern Greek Storytelling. Amsterdam: Benjamins.
Greimas, A. J. 1983. Structural Semantics: An Attempt at a Method. Trans. Daniele McDowell, Ronald Schleifer, and Alan Velie. Lincoln: University of Nebraska Press.
Herman, D. 1997. “Scripts, Sequences and Stories. Elements of a Postclassical Narratology.” PMLA 112 (5): 1046–59.
Herman, D. 2002. Story Logic: Problems and Possibilities of Narrative. Lincoln: University of Nebraska Press.
Herman, D. 2003. “Introduction.” In Narrative Theory and the Cognitive Sciences, ed. David Herman, 1–32. Stanford: CSLI Publications.
Herman, D. 2007. “Introduction.” In The Cambridge Companion to Narrative, ed. David Herman, 3–21. Cambridge: Cambridge University Press.
Kress, Gunther. 2000. “Multimodality.” In Multiliteracies: Literacy Learning and the Design of Social Futures, ed. B. Cope and M. Kalantzis, 182–202. London and New York: Routledge.
Kress, Gunther, and Theo van Leeuwen. 2001. Multimodal Discourse: The Modes and Media of Contemporary Communication. London: Hodder Arnold.
Labov, William, and J. Waletzky. 1967. “Narrative Analysis: Oral Versions of Personal Experience.” In Essays on the Verbal and Visual Arts: Proceedings of the 1966 Annual Spring Meeting of the American Ethnological Society, ed. J. Helm, 12–44. Seattle: University of Washington Press.
Lambrou, Marina. 2005. “Story Patterns in Oral Narratives: A Variationist Critique of Labov and Waletzky’s Model of Narrative Schema.” PhD diss., Middlesex University.
Lanser, Susan. 1986. “Toward a Feminist Narratology.” Style 20 (3): 341–63.
Longacre, Robert. 1983. The Grammar of Discourse. New York and London: Plenum.
Malinowski, B. 1935. Coral Gardens and their Magic. London: Allen and Unwin.
Ryan, Marie-Laure. 2004. “Introduction.” In Narrative Across Media: The Languages of Storytelling, ed. Marie-Laure Ryan, 1–40. London and Lincoln: University of Nebraska Press.
Toolan, Michael. 2008. Narrative Progression in the Short Story: A Corpus Stylistic Approach. Basingstoke: Palgrave Macmillan.
2
Multimodal Storytelling
Performance and Inscription in the Narration of Art History
Fiona J. Doloughan
INTRODUCTION
According to Mitchell (1994), the linguistic turn has been followed by the pictorial turn. Pointing to the prevalence of and concern with contemporary visual culture, Mitchell nevertheless sees evidence of what he calls “the general anxiety of linguistic philosophy about visual representation” (1994, 12), by which he appears to be calling attention to the seeming paradox of finding a language to talk about our experience of the visual that does it justice and does not subordinate the pictorial to the linguistic. In his view what is required is a “broad, interdisciplinary critique . . . one that takes into account parallel efforts such as the long struggle of film studies to come up with an adequate mediation of linguistic and imagistic models for cinema and to situate the film medium in the larger context of visual culture” (15). Art history, he believes, could well play a central role here, if it rises to the challenge of offering “an account of its principal theoretical object—visual representation—that will be usable by other disciplines in the human sciences” (15). Mitchell is acknowledging here the specificity of particular modes of representation while at the same time inviting interdisciplinary inquiries into how visual representation functions in contemporary Western society. His challenge to articulate that understanding in a manner consistent with the “complex interplay between visuality, apparatus, institutions, discourse, bodies, and figurality” (16) has its parallel in efforts to get to grips with the dynamic interplay of modes and media as they are employed by producers and consumers of multimodal texts today.
As a theorist and proponent of multimodality, Kress (2003) has pointed to consistent changes in the contemporary communicational landscape: “on the one hand, the broad move from the now centuries-long dominance of writing to the dominance of the image and, on the other hand, the move from the dominance of the medium of the book to the dominance of the medium of the screen” (1). To put it simply, recognition of the changing communicational landscape has led to attempts to identify the meaning-making potentials and limitations of the various modes and media available to human agents and
to render a more or less systematic account of how these resources are used or can be used in specific communicational and expressive contexts (Kress and van Leeuwen 1996, 2001; Levine and Scollon 2004; O’Halloran 2004; Royce and Bowcher 2007; Machin 2007). The very volume of work produced on multimodality in recent years is evidence of interest in the dynamics of communication and the extent to which our engagement in the world is effected by nonverbal means (e.g., gesture, gaze, positionality). Systems of transcription and analysis, too, have responded to the demands of multimodal communication by trying to capture the multidimensional nature of text in layout and presentation (see, for example, Baldry 2004 and Lim 2007) in an effort to better represent the ways in which we make meanings simultaneously on the basis of sometimes competing, sometimes complementary signals. This chapter will attempt to bring together insights from narrative and multimodality in order to look at the ways in which story is told multimodally in the case of part of an episode of the television series on the power of art narrated by Simon Schama. The episode in question relates the story of the painter Jacques-Louis David (1748–1825) and his transformation from court painter to Revolutionary propagandist. It will, however, do this against the backdrop of Mitchell’s (1994) concern that “visual experience” or “visual literacy” might not be fully explicable on the model of textuality (16). For, as O’Toole (1994) points out, art history as a discipline has its own preferred modes of storytelling and sets of discourses that tend to privilege context, genealogy, and information about the material and biographical conditions under which the work was produced (cf. chap. 5, 169–82). To a certain extent, such a tendency is evident in Schama’s story of art.
However, in focusing on a particular painting and trying to “explain” its power, Schama is of necessity engaging with the specifics of an individual work and evaluating it in the light both of the other works produced by the artist and in relation to ideas about art, its contemporary relevance, and its power to move and speak to its viewers. For O’Toole (1994), the language and systematic mode of analysis offered by semiotics allows the viewer to enter an ongoing conversation about art that is not constrained by the voice of authority or regulated by what he sees as the taken-for-granted discourses of art history. He encourages engagement with the work itself (169) and a focus on the “experience of the painting as we stand before it or study a reproduction of it” (171). While this challenge is a useful reminder that viewing is an active process demanding our input, what it fails to acknowledge is that viewing itself is a highly mediated and situated cultural practice that, like other cultural practices, depends on knowledge and an understanding of craft. Like writing, painting, and more specifically in this case, an ability to “read” paintings, it is a kind of cultural technology acquired over time. The existence of individual paintings, while seemingly autonomous, may depend not only on the prior works of the artist in question but also on the “influence” of
other past and contemporaneous works. This is not to mythologize art but to recognize that decoding signs is not always a transparent process but one that depends partly on cultural and disciplinary knowledge. Even the representational level, i.e., what a painting is about, can be problematic without some knowledge of context and/or convention. The painting that is central to the Schama episode I shall discuss, The Death of Marat (1793), is a case in point. Asked to describe what they saw in the painting, students from a variety of cultural and national backgrounds (Poland, Turkmenistan, Hungary, and England), none of whom had a background in art history, variously described the central character as a turbaned man, a man with a towel on his head, a man who has committed suicide in his bath. This is not to belittle a nonspecialist view but simply to point out that what we see and the words we choose to describe what we see cannot be taken for granted. Other functions, such as modality and composition (O’Toole 1994), can of course be discussed as separate analytical categories but the experience of viewing, as opposed to analysis, is an initially holistic act and one that depends as much on knowledge and prior experience as on what is actually “there.” For as Mitchell (1994, 13) puts it: “we still do not know exactly what pictures are, what their relation to language is, how they operate on observers and on the world, how their history is to be understood, and what is to be done with or about them.” The point I wish to flag up here for further consideration is the extent to which storytelling is a product not only of available resources and their interaction and apprehension by reader, listener, and/or viewer across modes and media, but also of narrative conventions that cannot be assumed to hold across periods and cultures. 
In this sense, I would argue, discourses of art underpin the manner in which we “see” a painting; the verbal impinges upon and interacts with our construction of the visual, which we never approach without presuppositions or assumptions, even if these are not made explicit. In other words, what Lim (2007) calls the context plane intersects with and may not be totally separable from the content plane. Such a view of the interdependency of context and text, textual marks, cues, and mental models owes much to recent developments in narrative from a discourse comprehension perspective. Herman’s (2002) work on the logic of stories has been instrumental in bringing together insights from narratology and more recent work from linguistic and cognitive science aimed at inquiring into the processes involved in complex meaning-making. In other words, Herman’s work offers an attempt to bridge the gap between text production and reception. Additionally, his reorientation of narrative as not just a temporal but also a spatial mode lends plausibility to the view that while the visual and the verbal cannot be conflated and indeed may well have different affordances and constraints (Kress 2003, 1–6), storytelling involves the representation and construction of setting, place, and trajectories both spatial and temporal, as well as actions, events, states, processes, and goals. Conversely, the image, while it may initially be
apprehended as a whole, can be viewed in relation to what O’Toole (1994, 10) calls different episodes, a term that implies not only a spatial but also, potentially, a temporal dimension.
MULTIMODAL COMMUNICATION: “PERFORMANCE” AND “INSCRIPTION”
Taking Kitchener’s famous recruitment poster (1914) by way of example, van Leeuwen (2004) points to the inadequacy of an analysis that assumes that the message here is realized in two discrete acts: a speech act and what he calls an “image act” (7). Rather he proposes seeing this communicative act as multimodal wherein “image and text blend like instruments in an orchestra” and where there is a “fusion of all the component semiotic modalities” (7). He goes on to suggest that speech events be seen as “multimodal microevents in which all the signs present combine to determine its communicative intent” (8). Leaving to one side any problems relating to the notion of communicative intent, the main point seems legitimate: in considering only the transcript of a conversation or a stretch of written text without reference to the whole context in which it took place or was produced (e.g., service encounters in contemporary supermarkets), the linguist is factoring out dimensions of the communicative situation that impinge upon the meanings being made. Thus van Leeuwen concludes:
Genres of speech and writing are in fact multimodal: speech genres combine language and action in an integrated whole, while written genres combine language, image and graphics in an integrated whole. Speech genres should therefore be renamed “performed” genres and written genres “inscribed” genres. Various combinations of performance and inscription are of course possible. (2004, 8; italics in original)
Van Leeuwen’s notion of “performance” and “inscription” is productive in a number of ways. “Performance” brings to the fore the potential kinetic and gestural dimensions of speech, whether naturally occurring or scripted, as it is actualized in particular contexts. It also permits the possibility of “staging” speech in both a literal and a metaphoric sense and draws attention to the performative aspects of language.
The notion of inscription points to the visual and graphic potentials of written language and recognizes the semiotic value of choice of layout, font, typeset, etc., in meaning-making. It accords a spatial dimension to text, including narrative text. Indeed, van Leeuwen goes on to suggest that “[m]ultimodal analysis must work with concepts and methods that are not specific to language, or indeed to any other mode, but can be applied cross-modally” (2004, 15). His emphasis on the importance of visual communication is in the context of what he sees as an overprivileging of language by linguists; by the
same token, he insists on the usefulness of linguistic analysis for students of visual communication (18).
Many of the concepts developed in the study of grammar and text are not specific to language. In some cases, for instance narrative, this has been known for a long time; in others (e.g. transitivity, modality, cohesion) it is only just starting to be realized. (van Leeuwen 2004, 16; italics in original)
This quotation returns us, in a sense, to narrative and to the recognition of its inherent spatial qualities. As Herman (2002) puts it: “narrative can also be thought of as systems of verbal or visual cues prompting their readers to spatialize storyworlds into evolving configurations of participants, objects, and places” (263). However, it is important to draw out some distinctions here in relation to the differing contexts and purposes of the discussion. Van Leeuwen’s aim in this article appears to be a call to linguists to take more seriously the visual components of communication and to offer a rethinking of conventionalized boundaries between language and communication. Herman’s (2002) book-length study, Story Logic, aims to revalorize narrative “as a discourse genre and a cognitive style, as well as a resource for literary writing” (1). At the same time, Herman wishes to reassess “the relations between narrative theory and . . . linguistics and cognitive science” (2), viewing both language theory and narrative theory “as resources for—or modular components of—cognitive science” (5).
Regardless of differences in aims and focus, insofar as narrative theory is interested in exploring the effects of different media on narrativity (Ryan 2003) as well as in revising definitions of narrative to include recognition of the importance of spatial relationships and the notion of paths in narrative domains (Herman 2002), rather than just a series of temporally organized (and causally motivated) events, the interests of narratologists and social semioticians in relation to the interaction of modes and the production and comprehension of multimodal texts are converging. Ryan’s (2003) work on “Defining Narrative Media” provides a bridge in the sense that she produces a typology of narrative media based on the premise that a medium is “narratively relevant if it makes an impact” on at least one of three domains: that of plot, or story; that of discourse, or narrative techniques; and that of narrative as performance or narrative pragmatics (3–4). By narrative pragmatics, she means to refer to the “uses of storytelling and the mode of participation of human agents (authors, actors, readers) in the narrative event” (4). It will be useful to bear in mind this reference to uses of storytelling and mode of participation alongside van Leeuwen’s (2004) notions of performance and inscription in the discussion of Schama’s use of intersecting and complementary narratives in Power of Art. Arguably, Herman’s (2002) notion of “autonomous but interconnected modules of narrative structure—perceptual, emotive, and thematic/conceptual” (277) will also be of relevance here.
OVERVIEW
The program on which I shall focus is one of a series of eight presented by Simon Schama in October and November 2006 on BBC2 on artists from the sixteenth to the twentieth century. According to the introduction to the series presented on the BBC Web site (http://www.bbc.co.uk/arts/powerofart/intro.shtml) and written by Simon Schama, the series aims to look at “how some of the most transforming works got made by human hands.” The series, Schama continues, “drops you . . . into those difficult places and unforgiving dramas when artists managed, against the odds, to astound.” Each program, by and large, seems to follow the same format: an artist’s work is presented against the backdrop of the historical and social context in which it was produced. However, as the previous references to “difficult places” and “unforgiving dramas” suggest, there is also the story of individual protagonists struggling to achieve “[a]rt that aims high . . . against the odds.” And the power of art about which Schama is talking is “the power [of the greatest art] to shake us into revelation and rip us from our default mode of seeing” (my italics). This is art that “seems to have rewired our senses,” leaving us with a different view of the world. Such a view positions art and indeed the individuals who produce it in particular ways: as something that comes out of an encounter with “moments of commotion” and that is the result of a “craft of exhilarating trouble.” At one level the discourse reinforces popular notions of the troubled artist whose life is in some sense valorized by the production of a “masterpiece” towards which he is tending and that outlives him—interestingly, all the artists in the series are men. Yet, in the programs more weight is given to the historical and social context out of which the artists emerged as well as to their family and personal situations and aspirations.
Episode four relates the story of Jacques-Louis David and his trajectory from court painter to Revolutionary painter, a story set against the backdrop of a depiction of events leading up to the French Revolution and its aftermath. As the creator of images for the new France and a propagandist for the architects of the Revolution, it fell to David to avenge Marat’s death by painting him in such a way that he would not be forgotten. Later, however, David’s close identification with the Terror and his reputation as a “Tyrant of the Arts” led to his downfall and with the restoration of the monarchy, he was forced to flee France. The story begins at the end with the death in 1825 of David who had latterly been living in exile in Belgium. It then switches to an earlier period, depicting by means of historical reconstruction Versailles in 1783 with its pretensions to modernity signaled by the presence of hot air balloons. Embedded within the unfolding and contextualizing historical narrative is the personal or biographical narrative of David, which attempts through a focus on salient and colorful details, such as his disfigurement following a sword fight, to understand what made him the kind of artist he became.
David’s development as an artist is set against significant historical events deemed to have influenced or motivated him in some way. The embedded narratives create a spatial dimension to the verbal text, encouraging the viewer/listener to make connections between the different lines of narrative. Alongside the scripted narrative, we view a series of images in time, images that have been assembled and edited in such a way that they impact upon the meanings we as viewers and listeners are enabled to make. While the language of film is useful here in terms of helping to identify the techniques used to produce meanings—wide-angle shots; close-ups; and zooming in on particular images, as is the case with the “forbidden” painting—what we are ultimately required to do as analysts is to “translate” the viewing experience into (more or less technical) words that articulate our understandings of both the what and the how, that is, what it is we have seen/heard/understood and how those meanings have been made. While analytically we can separate out the visual mode from the verbal mode as well as from the acoustic, in actual fact the meanings being constructed are the result of an interaction between and among the various modes, modes whose presentation and articulation are realized through a transmissive and spatiotemporal medium—television in the case of the BBC2 series—with multiple channels. At the same time, the structure and content of the program is mediated not just by the technologies available for its production and transmission but also by the social and cultural conventions governing its narrative construction and interpretation at a given historical moment. In addition, each individual program is the result of a process of production involving a team of technicians, cameramen, editors, and producers who collaborate to produce the final product for the particular target audience.
Schama’s animation of the programs in addition to his role as scriptwriter and, to an extent, designer of text, even though he may not have realized every aspect of the design himself—this is, after all, Simon Schama’s Power of Art—suggests a high degree of control if not of the entire cultural and technologically mediated product, then certainly of the discourse and narrative strategies used in its presentation.
MULTIMODAL TRANSCRIPTION AND ANALYSIS
In an attempt to respect the dynamic quality of the interaction between the various lines of narrative unfolding and intersecting in the prologue, I have brought together the language of film, particularly that which relates to storyboarding (see Readman 2003), with a more “descriptive” but, in my view, necessarily interpretive, analysis of the sights and sounds presented to the viewer in this segment. In this connection, there are a number of related methodological and epistemological points that require
Multimodal Storytelling
some preliminary comment. One important point is captured by Mitchell’s (1994) reference to “the resistance of the icon to the logos” (28), by which he means that pictures cannot simply be translated into words without potential loss of meaning. As he goes on to say, the “otherness or alterity of image and text is not just a matter of analogous structure, as if images just happened to be the ‘other’ to texts” (28). Thus when what one sees is realized in words, the assumption of analogous transfer seems at best problematic and at worst untenable. A related point is a concern with categorization and the presumption of a distinct separation between description and interpretation. For the very act of separating what is experienced dynamically and holistically into discrete, if interrelated, component parts involves a degree of reification and stasis of what is otherwise in process. While measurements can be made of duration of shot type and sound or dialogue, such measurements do not in and of themselves tell the listener or viewer anything significant in the absence of other kinds of information and insights of a more qualitative, contextual, and interpretive nature. Clearly, conventions play a role here. For example, a close-up of an image serves to highlight it and to focus our attention on its details, parts, or attributes rather than, say, inscribing it in a wider context. Sequencing of images is also crucial in making sense of a narrative so that our understanding of it is a product of an editing process as well as of the particular viewer’s ability to “pick up” and activate the production cues.
COMMENTARY
While the table has attempted to simultaneously display a range of modalities, there are a number of features of the scripted and the visual narratives that bear comment both separately and in respect of their interaction. Beginning with the voice-over that frames and provides a commentary on the sequence of shots with which the viewer is presented, we note the highly stylized and scripted nature of the narrative. This is a script designed to hook the viewer/listener through effective use of narrative strategies. It is a staged performance in which the animator and author of the scripted text is conscious of cueing in the viewer/listener by means of design and delivery. The story begins at the end with the funeral of the protagonist, Jacques-Louis David, the script carefully crafted to elicit a series of questions on the part of the viewer/listener, questions that will then be addressed in the course of the program, which will take us back to David’s beginnings, his apprenticeship, his rise to fame, and subsequent fall from grace. From the perspective of emplotment, that is, the manner in which the events of the story are sequenced and articulated so as to suggest, if not explicitly provide, a rationale or motivation for them, this is the classic stuff of narrative.
Table 2.1 Multimodal Transcription and Analysis
Columns: Shot #; Type of shot and duration; Image description; Sound/dialogue; Duration of sound/dialogue; Features of sound/dialogue; Annotations and remarks on function

Shot 1
Type of shot and duration: CU (6 secs.)
Image description: Close-up of statue (screen right) with arm outstretched on top of a building (town hall?) silhouetted against the sky
Sound/dialogue: Music in background. First woodwind instruments,
Features of sound/dialogue: Music in minor key
Annotations and remarks on function: Evocation through both colour/lighting—blacks and dark blues—and music of mournful and lugubrious setting/set of circumstances

Shot 2
Type of shot and duration: MS (6 secs.)
Image description: Statue, still in silhouette (screen left), from further away revealing more details of surrounding building
Sound/dialogue: then brass
Features of sound/dialogue: Music in minor key
Annotations and remarks on function: As above

Shot 3
Type of shot and duration: LS (6 secs.)
Image description: Statue still visible in background but Flemish building on which it sits occupies the foreground
Sound/dialogue: S.1 A funeral cortège clatters its way through the cobbled streets of Brussels.
Duration of sound/dialogue: 4 secs.
Features of sound/dialogue: Stress on ‘clatters’ and ‘cobbled’
Annotations and remarks on function: Movement from part to whole

Shot 4
Type of shot and duration: CU (4 secs.)
Image description: Inside a lined coffin, close up of black boots with silver buckles.
Sound/dialogue: S.2 Inside the carriage is the body of the most powerful French painter there had ever been—
Features of sound/dialogue: Stress on ‘most’ and ‘ever’

Shot 5
Type of shot and duration: MS (7 secs.)
Image description: Cut to head and upper body of man lying in repose in lined coffin
Sound/dialogue: (pause of 3 secs.) Jacques-Louis David. Music resumes
Duration of sound/dialogue: 1 sec.
Features of sound/dialogue: Importance of pause
Annotations and remarks on function: Creation of dramatic effect before articulation of name

[Further Duration of sound/dialogue entries for shots 1–5, row alignment uncertain: 5 secs.; 4 secs.; 3 secs.]

Shot 6
Type of shot and duration: Wipe
Image description: Painting or part of a painting comes into view. Three figures, all in Roman dress, on the right-hand side, one on the left-hand side clutching three swords. Rebellious quality of art?
Sound/dialogue: S.3 Following the carriage is a solemn procession of art students
Duration of sound/dialogue: 5 secs.
Features of sound/dialogue: Stress on ‘carriage’, ‘art’
Annotations and remarks on function: Sense of movement and transition created by wipe and changing quality of light. Paintings and/or episodes from paintings stand for or take the place of participants in the procession.

Shot 7
Type of shot and duration: Wipe
Image description: A battle scene comes into view: woman in centre of painting, in the midst of a throng, a soldier with a shield on her left
Sound/dialogue: holding up placards with the names of his paintings
Duration of sound/dialogue: 5 secs.
Features of sound/dialogue: Pause after ‘paintings’

Shot 8
Type of shot and duration: Wipe
Image description: Battle scene replaced by third image/episode: three distressed women in a domestic setting, clinging to and consoling one another, looking off into the distance at some imminent danger
Sound/dialogue: all but one which just happened to be the greatest of them all
Duration of sound/dialogue: 4 secs.
Features of sound/dialogue: Stress on ‘one’; ‘greatest’
Annotations and remarks on function: Message about the dire consequences of war and conflict. Pre-figuring of ‘terror’ caused by painting?

Shot 9
Type of shot and duration: MS
Image description: Another picture gradually comes into view, as if emerging from the mists of time. A man with a beatific countenance lies slumped in his bath, a letter in his hand. On the crate stand a quill, an ink-stand and a piece of paper. The man seems to have been interrupted at the moment of death. There is a blood stain on a white sheet hanging over the bath and the man’s head is shrouded in a turban-like piece of cloth.
Sound/dialogue: S.4 It was a picture which hadn’t seen the light of day for 30 years. No one, least of all the man who painted it, dared show it. Music resumes (6 secs.)
Duration of sound/dialogue: 8 secs.
Features of sound/dialogue: Stress on ‘picture’, ‘day’ and ‘years’; ‘least’, ‘man’ and ‘dared’
Annotations and remarks on function: Delayed view of prohibited painting creating space of desire and mimicking process of sense-making of emergent visual elements?

Shot 10
Type of shot and duration: Tilt up to CU of face
Sound/dialogue: S.6 No wonder. S.7 It was the most spell-binding thing he had ever made (pause of 4 secs.), a painting before which people had once swooned, a painting both beautiful and repulsive
Duration of sound/dialogue: 1 sec.; 2 secs.
Features of sound/dialogue: Stress on ‘spell-binding’, ‘ever’, ‘swooned’; highlighting of adjectives ‘beautiful’ and ‘repulsive’. Significance of pause
Annotations and remarks on function: Dramatic effect realized by close-up of almost angelic countenance of protagonist. Picture as carrier of attributes

Shot 11
Type of shot and duration: Slow zoom into painting
Image description: Camera gradually zooms into painting seen first at a distance in a gallery
Sound/dialogue: S.8 But the picture was also a guilty secret … the real reason why David’s body was refused burial in France. S.9 So what was it about this painting which made it both his unforgettable masterpiece and his unforgivable crime?
Duration of sound/dialogue: 6 secs.; 12 secs.
Features of sound/dialogue: Emphasis on ‘picture’, ‘guilty’, ‘secret’, ‘real’; ‘refused’; ‘was’, ‘unforgettable’ and ‘unforgivable’
Annotations and remarks on function: Question used to provide context for entering main narrative to follow—focus on motivation. Why this reaction to a painting?

Shot 12
Type of shot and duration: Cut
Image description: Computer-generated graphics of flowers and parts of flowers appear on screen in a kind of choreographed dance.
Duration: 20 secs.
Annotations and remarks on function: Conventional transition into body of program. Petals and flowers associated with drops of blood via red colour symbolizing potential danger and revolutionary zeal of art?

Code: CU = Close-up; MS = Medium shot; LS = Long shot; Wipe = “A transition between two shots whereby the second gradually appears by pushing or ‘wiping’ off the first” (Readman, 2003: 73); Tilt = “A camera movement along a vertical axis, with the camera body swivelling up or down on a stationary tripod” (Readman, 2003: 73).
Specifically, the scripted prologue functions to create a sense of mystery around the funeral in Brussels of this most powerful of French painters. The viewer/listener is invited to wonder why the funeral took place outside France and why David’s most important painting should be missing from the list on the placards of the art students following the procession. The language used, in combination with the images, aims to reconstruct the scene and to trigger in the viewer/listener a sense of drama. The method of development of the scripted narrative serves to underscore this sense of unfolding drama in a number of ways, which are elaborated in what follows. Similarly, the choice of music in a minor key, as it crescendoes and diminuendoes in line with the pauses in the scripted narrative, helps to underscore the lugubriousness of the occasion. The camera shots and the lighting too have a role to play in orchestrating the creation of a particular mood and viewpoint. For example, the withholding and then gradual bringing into focus of the “proscribed” painting serve to create suspense and to “ready” the viewer to take in the painting that had created such a stir. The funeral scene is narrated as if it were happening now—“A funeral cortège clatters its way through the cobbled streets of Brussels.” The historic present (e.g., “clatters”) is a feature of Schama’s mode of narration, as is his preference for what might be termed the language of literary realism, by which I mean to refer to the creation of a reality effect (cf. Barthes’s l’effet de réel) through both description and accumulation of detail. In addition, the use of marked word order and syntax serves to thematize the circumstances and location of the drama (e.g., “inside the carriage”; “following the procession”) and to postpone or delay introduction of the main actor—Jacques-Louis David. 
In the prologue, it is the paintings and their attributes, particularly those of the “proscribed” painting (“both beautiful and repulsive”; “the most spellbinding thing he had ever made”) that constitute the real object of study. There appears to be a progressive staging in the prologue whereby desire is created in the viewer to see the painting that caused such a furore and that, for Schama, triggers such a contradictory set of emotions. This staging is marked by pauses in the scripted narrative designed for dramatic effect and by the emphasis given to particular words through stress. The peculiar power of this particular work is re-created through its alignment with “uniqueness,” “greatness,” and the intensity of the emotions it inspires. This is, after all, a history of art and an exploration of the power of art, its allure and its dangers. Through the use of marked themes; the role of adjectives and superlative expressions in pointing to the qualities and attributes of things, which also serves to evaluate and classify them; the conscious use of a stylized, literary language, including alliteration and onomatopoeia (“cortège clatters . . . cobbled streets of Brussels”), and parallel structures (e.g., “his unforgettable masterpiece and his unforgivable crime”) a message is conveyed about art and its effects, a message that is
delivered via the narrative vehicle. In other words, the (self-)consciously crafted scripted narrative is designed, in conjunction with the sequence of shots, to draw the reader into the story of a protagonist—Jacques-Louis David—whose fortunes have risen and waned in his home country, and the art that he produced and on which his reputation depended. It is a story told through multiple modes: scripted narrative, images, both static and moving, and music. It is in the interaction of these modes realized through the medium of television broadcasting or DVD that the Gesamtkunstwerk is experienced. For someone trained in literary and/or linguistic modes of analysis, it is possible to analyze the way/s in which particular verbal effects are achieved with a degree of confidence. However, when “translating” the manner in which visual effects are created, a process that necessarily takes place in words, the gaps between visual and verbal “language” become apparent. As my attempt to gloss the visual shows, questions arise that relate to a number of factors, such as the extent to which background knowledge (e.g., of history and of art history) is needed to inform the analysis and/or description; the degree to which specialist or technical language is required (e.g., the language of film; the language of art and design) to designate the changing images on screen; and the extent to which interpretive schemata come into play. For example, the shots that show details of the painting The Death of Marat come at a moment in the scripted narrative when we need to understand something of the aesthetic effects and power of the painting. For why would people swoon before it and why was it both beautiful and repulsive?
The close-up of the arm of Marat draped over the bath alongside the details of the bloodstained sheets is interesting in this regard, since there is a grainy, flecked texture (like marble) that serves to aestheticize the body part and transform the corpse into an aesthetic object. In other words, at some level the concept of beauty is both discursively and pictorially constructed. While the script “tells” us about the peculiar impact of the painting and situates it for us, the picture itself “shows” us what that “beauty” consists of and how it looks. Yet, at another level we know that the history of art is full of competing notions of the “beautiful,” the “great,” the “powerful,” and that visual culture, like literary culture, is a product of the politics of taste as well as of the exercise of judgments founded on specialist knowledge. As nonspecialist viewers, we rely on the “expert” opinions of Schama, our guide through the history of art. Whatever access we have to the paintings of Jacques-Louis David is mediated through the narrative commentary of Simon Schama the art historian and through our experience of the program’s mode/s of production. Individual viewers may well “see” and “appreciate” the painting differently depending on prior knowledge and experience and on level of interest as well as personal preferences; however, that vision and appreciation will be shaped by the program’s staging and presentation of the “storyline” as it is realized through the interaction of the various modes on which it draws. At the
same time, were we as viewers to describe what we have seen and experienced through the pictorial mode in words, we would come up against the difficulty in “matching” words to images. With respect specifically to this particular David painting, elsewhere in the program while giving his own evaluation of it, Schama says:
If there’s ever a picture that would make you want to die for a cause, it is Jacques-Louis David’s Death of Marat. That’s what makes it so dangerous—hidden away from view for so many years. . . . I’m not sure how I feel about this painting, except deeply conflicted. You can’t doubt that it’s a solid gold masterpiece, but that’s to separate it from the appalling moment of its creation, the French Revolution. This is Jean-Paul Marat, the most paranoid of the Revolution’s fanatics, exhaling his very last breath. He’s been assassinated in his bath. But for David, Marat isn’t a monster, he’s a saint. This is martyrdom, David’s manifesto of revolutionary virtue.
Again, it is interesting to consider the language of description and evaluation used in this context. Marat has been “assassinated in his bath,” yet as Schama recognizes, this is “a painting that would make you want to die for a cause,” it’s “a manifesto of revolutionary virtue,” and in it Marat has been painted not as a “monster” but as a “saint.” In pointing to the difference between the painting as product and art as a historical, social, and political process, Schama is making clear his interest in the shaping role of context in the production and appreciation of a work of art and of his own “deeply conflicted” feelings about the painting. While recognizing its value as a well-executed artifact—he calls it a “solid gold masterpiece”—and its ability to transform a fanatic into a saint, he feels uncomfortable with the conditions motivating its production, during the Reign of Terror in France.
So the initial evaluation of the painting as both beautiful and repulsive, set up within the frame of a story yet to be told, as in the prologue, is fleshed out here in relation to the historical and contextual variables relating to its production and, for a time, proscription. The fate of both artist and painting provides the motivation for Schama’s account and sits well within the structure of the series as a whole where the focus is on the production of a particular work within the artist’s oeuvre as a whole, a work presented as in some senses the high point of that artist’s production or his defining work.
CONCLUDING REMARKS
Power of Art is an example of a multimodal art historical presentation for an educated and potentially discriminating but not necessarily specialist audience (as evidenced by its transmission on BBC2 at 9 p.m. on a Friday evening) transmitted through the medium of television but available in other complementary formats (e.g., an accompanying book to the series, a DVD, a BBC Web site dedicated to aspects of the series with hyperlinks). With respect to the eight episodes of the television series, they can be seen to have a broadly similar narrative structure, that is, one that projects a story logic upon the unfolding narrative such that the artist-protagonist whose work is both the product of and a departure from the prevailing cultural and artistic norms of the society that produced him, emerges as a kind of hero or antihero (cf. Schama’s evaluation of David as “a monster” but also “a fantastic propagandist”). But this is not just the story of individual artists struggling to overcome the obstacles placed in their way on the road to genius or to compensate for defects of character through their artistic production. It is the story of art itself and of the power of art to affect the viewer with “[v]isions of beauty or a rush of intense pleasure” or to produce “shock, pain, desire, even revulsion.” The discourse about art is part of the story of art, just as the unfolding story is narrated from a particular perspective. Schama’s choice of language in relation both to the description of particular scenes and paintings, as well as his depiction of historical events and individual actions in the light of their projected contexts and possible causes and their apparent consequences, appears to be motivated by a sense of drama and a need to make connections between the various narrative levels and modes. Indeed, the inclusion of historical reconstructions serves to emphasize this performative aspect. Structurally, too, the embedding of a personal or biographical narrative within an unfolding historical narrative dramatizes the relationship between individual life story and the larger historical and social forces that act upon it, shaping and transforming it.
Certainly, narratological tools and terminology can be seen to provide a firm basis for understanding how stories function and how storyworlds are projected. Likewise, work done in the area of multimodality and multimodal discourse analysis is helpful in providing insights into the affordances and constraints of different modes and their possible effects in interaction. However, it seems to me that in spite of the volume of recent work in multimodal discourse analysis, much of which has significantly advanced understanding of processes of meaning-making across modes and media and served to develop specialized tools and technologies for enhanced multimodal analysis, work remains to be done in recognizing the discrepancies and potential incompatibilities of different meaning-making systems. Moreover, the materiality of the marks on the page and the images on the screen while serving to cue particular interpretations of a text do not guarantee them, given the mediating role of culture and context. In this respect, the integrative multisemiotic model proposed by Lim (2007, 198) reflects a desire for systematicity and explanatory adequacy while
seeking to “build in” recognition of the role of both context of situation and context of culture. Similarly, my own attempts at producing a system of transcription and analysis for a multimodal narrative text have pointed to the problems of integrating, rather than representing, visual and verbal modes of storytelling and have served to support Mitchell’s (1994) contention that the text–image difference “does not rest in a master-code” (30) but “between the speaking and the seeing Subject, the ideologist and the iconoclast” (30). What he refers to as “the temptation to science” (30; italics in original) may well be the search for ever-more precise methods of analysis that try to “objectify” what is in actuality an intersubjective interpretive process afforded and/or constrained, depending on your perspective, by culture, convention, and choice. Schama’s Power of Art is as much a vehicle for narrative performance and inscription as it is a product of art historical discourses and literary tropes, realized in and by the affordances of his chosen medium—television—in the era of the celebrity academic.
REFERENCES
Baldry, A. 2004. “Phase and Transition Type and Instance: Patterns in Media Texts as Seen through a Multimodal Concordancer.” In Multimodal Discourse Analysis: Systemic Functional Perspectives, ed. K. O’Halloran, 83–108. London and New York: Continuum.
Herman, D. 2002. Story Logic: Problems and Possibilities of Narrative. Lincoln: University of Nebraska Press.
Kress, G. 2003. Literacy in the New Media Age. London and New York: Routledge.
Kress, G., and T. van Leeuwen. 1996. Reading Images: The Grammar of Visual Design. London: Routledge.
Kress, G., and T. van Leeuwen. 2001. Multimodal Discourse: The Modes and Media of Contemporary Communication. London: Arnold.
Levine, P., and R. Scollon, eds. 2004. Discourse and Technology: Multimodal Discourse Analysis. Washington, DC: Georgetown University Press.
Lim, V. 2007. “The Visual Semantics Stratum: Making Meaning in Sequential Images.” In New Directions in the Analysis of Multimodal Discourse, ed. T. Royce and W. Bowcher, 195–213. New Jersey: Lawrence Erlbaum Associates.
Machin, D. 2007. Introduction to Multimodal Analysis. London: Hodder Arnold.
Mitchell, W. 1994. Picture Theory: Essays on Verbal and Visual Representations. Chicago and London: University of Chicago Press.
O’Halloran, K. 2004. Multimodal Discourse Analysis: Systemic Functional Perspectives. London and New York: Continuum.
O’Toole, M. 1994. The Language of Displayed Art. London: Leicester University Press.
Readman, M. 2003. Teaching Scriptwriting, Screenplays and Storyboards for Film and TV Production. London: British Film Institute.
Royce, T., and W. Bowcher, eds. 2007. New Directions in the Analysis of Multimodal Discourse. New Jersey: Lawrence Erlbaum Associates.
Ryan, M.-L. 2003. “On Defining Narrative Media.” Image and Narrative, Online Magazine of the Visual Narrative, no. 6. http://www.imageandnarrative.be/mediumtheory/marielaureryan.htm.
Schama, S. 2006. Power of Art. http://www.bbc.co.uk/arts/powerofart/.
van Leeuwen, T. 2004. “Ten Reasons Why Linguists Should Pay Attention to Visual Communication.” In Discourse and Technology: Multimodal Discourse Analysis, ed. P. Levine and R. Scollon, 7–19. Washington, DC: Georgetown University Press.
3
A Multimodal Approach to Mind Style
Semiotic Metaphor vs. Multimodal Conceptual Metaphor
Rocío Montoro
INTRODUCTION
This chapter explores the multimodal realizations of mind style both in novels and their filmic adaptations. Mind style is generally understood as the way in which the psychological and cognitive makeup of written fictional narrative entities is projected. Here I aim to demonstrate that looking at mind style indicators in a multimodal environment, as opposed to the more traditional monomodal focus on novelistic forms, can result in further insights into the nature of the concept itself. Investigating mind style has also been commonly associated with those narratives in which narrator and/or characters display certain unconventional psychological traits. Naturally, any delimitation of what constitutes psychological “unconventionality” is, by definition, controversial; still most analyses of mind style realizations coincide in their choice of psychologically unconventional narrators and/or characters because of the markedly salient linguistic features that these narrative elements display. The multimodal focus of this chapter looks into how the salience of linguistic mind style indicators is transposed by exploiting the semiotic resources that typically characterize the filmic mode. For the purposes of this chapter I am restricting my understanding of “unconventionality” to the type of social and psychological behavior that typifies the violent, cruel, and sadistic conduct of psychopaths and murderers, for which Bret Easton Ellis’s American Psycho (1991) and its 2000 filmic adaptation by Mary Harron will be studied. Specifically, I examine how conceptual metaphors, semiotic metaphors, and multimodal conceptual metaphors are exploited in both modes to create such a marked manifestation of an idiosyncratic mind style.
MIND STYLE
The term mind style was originally coined by Fowler in 1977 who, years later, formulated it again as follows:
I have called it mind-style: the world-view of an author, or a narrator, or a character, constituted by the ideational structure of the text. From now on I shall prefer this term to the cumbersome ‘point of view on the ideological plane’ which I borrowed from Uspensky [ . . . ] I will illustrate ideational structuring involving three different types of linguistic features: vocabulary, transitivity, and certain syntactic structures. (Fowler [1986] 1996, 214)
This initial work has been subsequently reconsidered by many others (Black 1993; Bockting 1994; Leech and Short 1981; Margolin 2003; Semino and Swindlehurst 1996; and Semino 2002, 2007) who have progressively expanded on the linguistic signs that function as mind style indicators. So where, for instance, Fowler concentrates on the discussion of vocabulary (namely under- and overlexicalization), transitivity, and certain syntactic structures, Bockting (1994) singles out what she calls “attributive style” and Black (1993), Semino and Swindlehurst (1996), and Semino (2002, 2007) all highlight the significance of metaphorical patterns. Looking at how these linguistic markers typically characterizing the language mode can be translated in the filmic mode into an array of different semiotic resources underscores the existence of superordinate principles seemingly determining the actual meaning-making potential of such resources. As Kress and van Leeuwen point out:
We move away from the idea that the different modes in multimodal texts have strictly bounded and framed specialist tasks [ . . . ] Instead we move towards a view of multimodality in which common principles operate in and across different modes. (2001, 2)
Such commonality of principles across different semiotic modes allowing them to encode and project different meanings is of particular relevance in the analysis of semiotic and multimodal conceptual metaphors as explained in the following. But before considering how these shared principles come to actually function, a further aspect of mind style regularly discussed by most of the scholars in the field needs mentioning. These aforementioned researchers of mind style generally make the creation of a particular mind style conditional on the recurrence, over the whole text, of those markers that project it.
It would follow, then, that the cumulative effect of repeating a particular linguistic indicator is equally echoed in cinematic formats, albeit due to the presence of semiotic resources different from the purely written. So Bockting’s assessment of which recurrent indicators can encode meaning as “the whole field of linguistics: phonology, morphology, lexis, syntax and pragmatics, as well as various para- and non-verbal signs” (Bockting 1994, 160) would appear to be much more thoroughly formulated in multimodal literature as follows:
Meaning is made in many different ways, always, in the many different modes and media which are co-present in a communicational ensemble. This entails that a past (and still existent) common sense to the effect that meaning resides in language alone—or, in other versions of this, that language is the central means of representing and communicating even though there are ‘extra-linguistic’, ‘para-linguistic’ things going on as well—is simply no longer tenable, that it never really was, and certainly is not now. (Kress and van Leeuwen 2001, 111)
It seems, thus, that previous considerations of mind style have focused their attention on the primacy of verbal markers in, mainly, monomodal environments, despite their acknowledgment of the role that “para- and non-verbal signs” (Bockting 1994, 160) can equally play. It is to this obvious gap in, and lack of attention to, the analysis of nonlinguistic mind style projectors that I now turn.
METAPHORICAL PATTERNS: SEMIOTIC METAPHOR AND MULTIMODAL CONCEPTUAL METAPHOR
The study of metaphors has long been used in textual approaches to language for a variety of different purposes, mind style projection being just one of them. The function and role of metaphors in a not purely verbal environment (that is, not monomodally realized as textual resource) have received nowhere near as much attention. Two notable exceptions to the relative lack of studies in the field are O’Halloran’s concept of semiotic metaphors (O’Halloran 1999a, 1999b, 2003, 2004, 2005) and Forceville’s work on the multimodal manifestations of conceptual metaphors (1996, 2002a, 2002b, 2006, 2007), which, despite emerging from two distinct theoretical positions, share an interest in the nonverbal manifestations of metaphors. O’Halloran’s work is firmly rooted in a systemic-functional tradition of multimodal analysis whereby the meaning-encoding process is investigated according to the meta-functions originally identified by Halliday (1985). Following systemic-functional principles, then, semiotic metaphors are defined as follows:
In a manner similar to grammatical metaphor, semiotic metaphor may involve a shift in the function and the grammatical class of an element, or the introduction of new functional elements. However, this process does not take place intra-semiotically as for grammatical metaphor in language, rather it takes place inter-semiotically when a functional element is reconstructed using another semiotic code. With such reconstrual, we see a semantic shift in the function of that element. (O’Halloran 2003, 357)
Semiotic metaphors, consequently, are capable of encoding meaning by mapping semantic content from one specific semiotic code onto another. Lim (2004) illustrates the working of such semiotic metaphors in the way
Rocío Montoro
jewelry advertisers exploit the symbolic meaning of diamonds (for the particular photograph discussed here see Lim [2004, 242], originally taken from www.hearts-on-fire.com):

An example of Semiotic Metaphor is shown in [ . . . ] (t)he visual image of the diamond [ . . . ] juxtaposed with the linguistic clause ‘because he loves me’. This association of the visual image of a diamond with the linguistic clause implies the gift of a diamond is an Expression of love [ . . . ] Indeed, it could be argued that diamonds [ . . . ] are in themselves always semiotic metaphors. (2004, 241)

This transference of meaning from one semiotic medium (visual) onto another (linguistic) firmly encapsulates the way mind style is projected in the cinematic mode. Not only visual but sound resources of various kinds are successfully exploited in the filmic version of American Psycho in an attempt to create the same unconventional psychological profile of the original written version. Before undertaking the explanation of how such transference of meaning is actually effected, we need to look at the second successful way in which multimodal realizations of metaphors have been theoretically postulated, mainly in the work of Forceville (1996, 2002a, 2002b, 2006, 2007), who states:

Since the publication of Lakoff and Johnson’s Metaphors we Live By (1980), conceptual metaphor theory (CMT) has dominated metaphor studies. While one of the central tenets of that monograph is that metaphors are primarily a phenomenon of thought, not of language, conceptual metaphors have until recently been studied almost exclusively via verbal expressions [ . . . ] One result of this focus is that relatively little attention is paid in CMT to the form and appearance a metaphor can assume. (2007, 19)

Forceville’s characterization of multimodal conceptual metaphors stems from cognitivist positions.
Far from considering them as language tropes, conceptual metaphor theory (CMT) highlights the essentially cognitive nature of these forms as thought phenomena that, eventually, become linguistically realized in words. Forceville, nonetheless, is successful in highlighting a significant flaw in the original formulation advocated by Lakoff and Johnson who, although they rightly described metaphors as thought rather than language forms, significantly missed out on acknowledging any other surface realization apart from the verbal. Forceville, amongst a few others (Carroll 1996, 1998; Cienki 1998; Kennedy 1982, 1993; McNeill 1992, 2000; Whittock 1990), has subsequently demonstrated that the cognitive construal of metaphors identified by Lakoff and Johnson can be concretized in multiple ways, many of which bring into play a variety of semiotic resources. Such a finding has led him to isolate two main surface variations for conceptual metaphors, which he terms monomodal and multimodal:
A Multimodal Approach to Mind Style
A multimodal metaphor is here defined as a metaphor whose target and source are not, or not exclusively, rendered in the same mode [ . . . ] Monomodal metaphors will here be defined as metaphors whose two terms are predominantly or exclusively rendered in the same mode. (Forceville 2007, 21, 24)

This author’s categorization of which modes1 he is describing has, in fact, been reworked from one publication to another. Whereas in his 2007 article he identifies five different “modes” (“written language,” “spoken language,” “visuals,” “music,” and “sound”), he had previously isolated nine: “pictorial signs,” “written signs,” “spoken signs,” “gestures,” “sounds,” “music,” “smells,” “tastes,” and “touch” (Forceville 2006, 383). For the purposes of this chapter, I am dispensing with smells, tastes, and touch as they are not endowed with a primary role in either the novel or film, although that does not automatically entail that they should never be borne in mind in other multimodal analyses. Forceville’s more recent reformulation into five “modes” (written, spoken language, visuals, music, and sound), however, does seem to comprise the resources primarily exploited in the projection of mind style in the written and cinematic formats respectively, which is why this chapter primarily uses the latter taxonomy. Most interestingly, though, Forceville’s earlier definition also underscores the necessarily distinct semiotic nature of the modes used as target and source domains in the multimodal realizations of conceptual metaphors. Metaphorical expressions rendered in just the one medium (Forceville’s “mode”) would essentially qualify as monomodal, which seems reminiscent of the position defended by O’Halloran in relation to semiotic metaphors.
Multimodal metaphors in the cognitive tradition, consequently, would appear to share O’Halloran’s notion of inter-semiosis whereby the semantic shift or transference of meaning from one code to another is what more clearly justifies a multimodal label. Failing such transference, the multimodal tag appears harder to justify. Nevertheless, filmic adaptations of written narratives seem to present the metaphor scholar with an interesting challenge. On the one hand, we are analyzing two distinctive formats typically characterized by their use of the printed word in the case of written narratives, and a combination of verbal resources with visuals, music, and sound in films. Strictly speaking, the metaphors identified in the written medium of the novelistic form should be classified as mainly monomodal for both the target and source domains are delivered in the same form, that is, through language. The filmic adaptation, on the other hand, clearly combines various semiotic resources, so a multimodal analysis is not so controversial if, I would suggest, still contingent on the preexisting written narrative from which it would be derived. This derivation from and reliance on a prior mode, the written, would seem to suggest that screen adaptations bank highly on the successful transposition of metaphors from one mode to another, for which the use of various semiotic resources is to be expected. Therefore, it seems profitable to approach the analysis of mind style in two
modes, the film versus the printed novel, from a multimodal perspective and to classify those transposed metaphors as primarily multimodal.
AMERICAN PSYCHO

American Psycho (1991) tells the story of Patrick Bateman, a typical representative of the 1980s yuppie culture in New York. This apparently “fully-integrated-in-society” character, however, is actually leading a double life, one in which his existence is only justified by an urging desire to abuse, torture, and kill others. Bateman’s split personality is linguistically marked by the idiosyncratic metaphorical patterns that encode and project a rather disturbing and unstable mind style, one that characterizes him as the social psychopath he actually is. This idea of instability is distinctively linked to CMT’s traditional assessment of the experiential nature of our conceptualization system.2 Johnson’s analysis of the bodily basis of our making sense of the world (1987) explores the way in which human cognitive traits are largely based on our physical experience of the concept of “balance”:

The experience of physical equilibrium within our bodies gives rise to structures for ordering our experience of so-called psychological realities [ . . . ] We experience our entire psychic makeup in terms of balance. The ideal is a balanced personality [ . . . ] If too much weight is put on one activity or enterprise, to the exclusion of others, the individual is unbalanced [ . . . ] Furthermore, financial, marital, political, or sexual problems can weigh on our minds, throwing us out of balance. In such cases, the resulting forces may drive us to ill-advised, or morally improper, behavior. (Johnson 1987, 88–89)

Were we to accept this as a tenet applicable to most individuals, marked mind styles would most likely be realized via a conceptualization of reality based on the opposite premise, that is, on some kind of “unbalance.” The fact that Patrick Bateman fits the latter mold greatly contributes to the depiction of his social and cognitive unconventionality.
What follows is an insight into how this instability is created and projected via specific metaphorical patterns in the original written form and the cinematic version. American Psycho is firmly rooted in two overarching mega-metaphors (Kövecses 2002, 51; Werth 1994, 102): the VIOLENCE mega-metaphor and the OWNERSHIP mega-metaphor, each one fleshed out by a series of corresponding micro-metaphors (Kövecses 2002, 51). Two of the several micro-metaphors responsible for the embodiment of the VIOLENCE mega-metaphor include the ANGER IS A SUBSTANCE (FLUID/GAS) IN A CONTAINER and ANGER IS A NATURAL FORCE (Kövecses 2000, 216): And while I stand there [ . . . ] something unspoken passes [ . . . ] there’s this weird kind of tension, a bizarre pressure, that fuels the following,
which starts, happens, ends, very quickly [ . . . ] He nods his small head, up, then down, slowly, but before he can answer, my sudden lack of care crests in a massive wave of fury and I pull the knife out of my pocket and I stab him, quickly, in the neck. (Ellis 1991, 286; italics added)

where people from the Lotus Blossom are now standing, staring dumbly at the wreckage, no one helping the cop as the two men lie struggling on the sidewalk, the cop wheezing from exertion on top of Patrick, trying to wrestle the magnum from his grasp, but Patrick feels infected, like gasoline is coursing through his veins instead of blood. (Ellis 1991, 336; italics added)

The first extract describes the completely unmotivated killing of a toddler (the pronouns “he” and “him” in the quotation) in Central Park. Both the ANGER IS A SUBSTANCE (FLUID/GAS) IN A CONTAINER and ANGER IS A NATURAL FORCE micro-metaphors seem to be at work in this extract. Bateman is seen strolling along Central Park when the mere sight of a young boy awakens his desire to commit murder. The second quotation corresponds to an incident during which Bateman is being pursued by the police and similarly employs the ANGER IS A SUBSTANCE (FLUID/GAS) IN A CONTAINER micro-metaphor. Both emphasize the emotions that this character regularly experiences, that is, mainly rage, fury, and wrath, and that due to their conspicuous presence in the novel so thoroughly project his mind style, as illustrated further in the following:
ANGER IS A SUBSTANCE (FLUID/GAS) IN A CONTAINER

My brain is churning. (78)

I want to see Luis’s face contort and turn purple and I want [ . . . ] to [ . . . ] have these be the last words, the last sounds he hears until his own gurglings, accompanied by the crunching of his trachea, drown everything else out. (152)
ANGER IS A NATURAL FORCE

My rage builds, subsides. (126)

I storm out of the men’s room. (153)

Cinema audiences are similarly bombarded with nonverbal realizations of this mega-metaphor. Forceville characterizes the creation of metaphors in films as follows:

Verbo-pictorial metaphor[s], of course, are also possible cinematographically: a significant juxtaposition of a line of dialogue, or a word on a screen signpost, with a close-up of a character, can result in a
verbo-pictorial metaphor. In addition film, in contrast to print ads and billboards, has the medium of sound at its disposal, which means that one of the two metaphorical terms can be suggested by music, or by a sound (effect) instead of spoken/written language. (2002b, 10)

Although the various killings that take place in the novel are synthesized and sometimes slightly modified or adapted in the film, there are still numerous instances during which “spraying” and “splattering” of blood is used as one element of the source domain FLUID (SUBSTANCE) (as part of the ANGER IS A SUBSTANCE [FLUID/GAS] IN A CONTAINER micro-metaphor) to be mapped onto the target domain ANGER. Several other visual signs are combined to realize the various components of this metaphor: the blood as fluid restricted by the physical boundaries that are veins and arteries is literally freed from its confinement during the killings. Cinema viewers can also see the upward directionality of the blood springing from the victim’s body onto the actor’s face, his clothes, and his immediate physical surroundings, thus underscoring the freeing of such fluid. In the scene where the character Paul Allen is murdered, the total “liberation” of the victim’s blood is completed with the final trickle of the liquid seen oozing out from the corpse. Needless to say, color plays a determining role in the killing scenes: for instance, bright redness is intentionally contrasted with the immaculate whiteness of the murderer’s flat where Allen’s killing takes place. Not only is this apartment contemporaneous with minimalist trends in neutral hues, but also all the furniture and part of the floor have been protected with white covers to avoid staining. As a visual resource, color significantly brings to the fore the way in which this micro-metaphor is inter-semiotically realized, as do the violent body and facial gestures of the actor Christian Bale.
Finally, sonic effects aid all of the aforementioned resources by adding further elements also capable of encoding semantic content for the target domain ANGER. In the case of Paul Allen’s murder, for instance, such effects are illustrated by the sound of the victim’s bones crunched under the axe, alongside the panting noises emitted by the executioner due to the exertion caused by the extraordinary strength actually needed to kill a person. Spoken verbal signs are also used in the form of loud swearing at his victim on the part of Bateman. The fact that the actual killings do not tend to be shown in the film appears to heighten even further the metaphorical nature of all the signs here mentioned.

The next set of micro-metaphors illustrates further my argument in relation to Bateman’s mind style. These micro-metaphors operate in a very specific way as their role and effect in the narrative emerge from their functioning, firstly, as a group, but secondly by their being formulated as a rather unorthodox form of propositional logic. The metaphors acting as propositions seem to be part of what is generally known as a logical conditional of the type “If . . . then” (Chapman 2000, 51):
‘If’ VIOLENCE/CRUELTY IS VITALITY & HAPPINESS IS VITALITY → (then) VIOLENCE/CRUELTY IS HAPPINESS

‘If’ VIOLENCE IS HAPPINESS & HAPPINESS IS UP → (then) ABSENCE OF VIOLENCE/CRUELTY IS DOWN

The complexity of what appears to be Bateman’s way of reasoning and of constructing “logical” thinking strengthens a classification of his mind style as idiosyncratic. The inferences that Bateman makes (“violence/cruelty is happiness,” “absence of violence/cruelty is down”) based on the rather unusual logical construction emerging from the combination of these micro-metaphors are particularly disturbing if viewed from a “conventional,” nonpsychotic perspective. Johnson’s (1987) previous assessment on the experiential basis of our conceptualizing the world suggests that these inferences are only possible if we bear in mind the psychotic traits in his personality. If, as Johnson states, our basic experiencing the world in terms of balance influences the way we think, Bateman’s unbalanced personality would explain his pleasure in the presence of violence, cruelty, and murders. Additionally, the overarching VIOLENCE mega-metaphor and its corresponding micro-metaphors are endowed with a further way to project mind style. It would appear that Bateman’s deviant relationship with the world is possible only if certain conventional metaphors, namely HAPPINESS IS VITALITY (Kövecses 2000, 24) and HAPPINESS IS UP (Kövecses 2000, 170), are simultaneously conceptualized with other newly created, disturbing, novel ones, such as VIOLENCE/CRUELTY IS HAPPINESS or ABSENCE OF VIOLENCE IS DOWN. The latter group features in such a pervasive and consistent linguistic way that, in Bateman’s case, they could even be considered to have seemingly ceased being novel metaphors to become an elemental part of his conceptual metaphorical system. The possible “upgrading” from novel to conceptual metaphors would firmly justify why Bateman’s mind style is such a salient one.
For instance: I can’t help but start laughing and I linger at the scene, amused by this tableau. When I spot an approaching taxi, I slowly walk away. Afterwards, two blocks west, I feel heady, ravenous, pumped up, as if I’d just worked out and endorphins are flooding my nervous system, or just embraced that first line of cocaine, inhaled the first puff of a fine cigar, sipped that first glass of Cristal. (Ellis 1991, 127; italics added) The narrator’s state of mind after killing Al, a beggar he has come across in a New York street, is particularly symptomatic of the combination of micro-metaphors that so distinctively characterize Bateman’s worldview. Whereas the above extract emphasizes the elation that follows the killings, such a rush promptly disappears giving way to a rather depressing low, illustrated by, for instance, the following expressions: “I play it cool,” “I ignore them,” “my high slowly dissolves, its intensity diminishing,” “I grow bored,
tired,” or “anticlimactic” (Ellis 1991, 127). The only way out of that low tends to be realized by references to the OWNERSHIP metaphor that I discuss later.

The filmic adaptation is similarly rich in references to this group of micro-metaphors, this time not exclusively realized by verbal means. For instance, HAPPINESS IS VITALITY is repeatedly projected via Bateman’s body on the screen, his exercise routine being based on a rigid and strict discipline. The actor Christian Bale is seen doing sit-ups, rope-skipping, practicing yoga, all the while perspiring profusely. The camera is constantly lingering on shots of Bale’s perfectly sculpted and tanned body, images that exude vitality and health. The actor’s voice-over aids in the realization of HAPPINESS IS VITALITY by reminding the audience of Bateman’s physical achievement; for instance, that voice-over spells out exactly how many crunches he is able to master (one thousand of them!). Having said this, the VIOLENCE metaphor is also at work as sound resources are combined with the “vitality” shots to reaffirm the rather complex connection existing between the two. I am referring to the sound track that accompanies some of the exercise scenes, namely the extremely loud and terrifying screaming noises that, initially, seem to be coming out of one of the rooms in Bateman’s apartment. A subsequent slow move of the camera focusing on the up-and-down motion of Bateman’s sit-ups lets the viewer catch a glimpse of a TV set situated right by the actor’s body, the horrifying noise all the while being heard. Eventually, the viewer is let in as to the origin of the shrill wail, namely the screams of the actress from the horror film The Texas Chainsaw Massacre (1974), together with the actual noise of a chainsaw.
A classic in the horror film genre and an obvious intertextual and metafictional reference for the American Psycho spectators, this movie is in fact the background sound track of choice for Bateman’s exercise regime. It seems justified to assert, thus, that the combination of sonic and visual mediums via the semiotic resources of sound track and the moving image illustrates rather accurately the multimodal realization of a psychopath’s mind style.

Bateman’s coming down from the adrenaline rush injected by violence, embodying the ABSENCE OF VIOLENCE IS DOWN metaphor, is equally present in the filmic mode. The Paul Allen murder scene described earlier is fraught with examples of how this conceptual metaphor is multimodally and inter-semiotically brought about. Kinesics illustrates one such mode but camera work expertly aids in the task too. On the one hand, a medium close-up shot on Bateman’s face and upper torso is maintained after the brutal killing has taken place so that the changes in facial gestures can be clearly appreciated. At first, the upward directionality of his facial muscles correlates with the strain caused by the handling of the axe; such gestural markers are subsequently replaced with a lowering of eyebrows, corners of the mouth, arms, and shoulders, signifying muscular relaxation in the face and body. Further kinetic moves include the downward directionality of Bateman’s hand when he smoothes down his hair; him sitting down and leaning backwards on a sofa; and, finally, his reaching down to get a lighter
so that he can casually smoke a cigar whilst still looking down at his victim who is lying on the floor. The recurrent visual signs indicating downwards directionality are complemented on the sound plane by the utter absence of noises of any kind made by this character who, after loud, heavy breathing as well as terrifying screaming during the murder, seems to have finally “shut down” completely.

A final micro-metaphor to comment on is ANGRY BEHAVIOR IS ANIMAL BEHAVIOR (Kövecses 2000, 21) illustrated in the following:

Now I’m lunging up Lafayette, sweating and moaning and pushing people out of my way, foam pouring out of my mouth, stomach contracting with horrendous abdominal cramps. (Ellis 1991, 144; italics added)

Such graphic realization is linguistically elaborated further in “throwing up all the ham,” “brown streaks of bile,” “I belch,” and “my eyes rolling back into my head, greenish bile dripping in strings from my bared fangs” (Ellis 1991, 144–45). As the novel progresses, Bateman’s way of making sense of the world takes on a more and more extreme form, so towards the end any vestige of humanity has dissipated. Some of the most brutal features associated with animal behavior are mapped onto Bateman’s characterization of himself and people around him:

I cannot seem to control myself, here in a room that contains a whole host of victims [ . . . ] all of them having one thing in common: they are prey. (Ellis 1991, 334; italics added)

By the time humans are seen merely as prey a dehumanization/depersonalization process has taken effect, which the film also evidences profusely. For instance, Bateman’s stalking behavior is symptomatic of how this metaphor is enacted in the filmic mode. In one of these scenes Bateman happens to be walking behind a woman as he is apparently simply strolling in New York.
Several semiotic resources come into play to encode a very specific meaning here but its success is amply determined by the existence of noninteractive as well as interactive vectors between these two participants (Kress and van Leeuwen 1996). On the one hand, the human predator starts marching decidedly in a straight line right behind the female victim who is already moving ahead of him on the screen. The directionality towards the woman in front is quickly coupled with a speeding up of his walking pace so that the noninteractive vector (the victim is unaware of what is going on behind her), representing Bateman as Actor and the woman as Goal, is eventually transformed into an interactive one realized by a gaze vector, with Bateman as Reactor and the female character as Phenomenon.3 The dynamic nature of film, nevertheless, implies that such roles can be reversed with a simple shot switch from the woman to Bateman and vice versa. The fact that the
“hunting” scene begins with a noninteractive vector emphasizes Bateman’s role as a predator. Bateman does not look directly at his victim until he has caught up with her at the traffic lights. The dynamic force of his straight walk seems to suggest that he is simply positioned behind and not literally stalking her. His gaze is firmly focused ahead of him and not on the woman. Even once gaze vectors connect the two interactants, there is nothing to make his murderous intentions obvious. The next scene, nevertheless, clarifies the way directional vectors have been skillfully manipulated to suggest rather than to openly indicate “predation” as Bateman is seen at a launderette looking rather irate and incapable of understanding why some conspicuous red stains (he claims they are cranberry juice) have not come off his very expensive sheets; the clear implication is for those stains to be the aftermath of the fortuitous but unfortunate encounter with his lady victim. Further examples of the metaphor ANGRY BEHAVIOR IS ANIMAL BEHAVIOR materialize in Bateman’s admission of cannibalism, this time using the verbal mode and the sonic medium (he phones his lawyer and openly admits to it), or, rather more graphically, on the one occasion in which Bateman’s animalistic behavior is enacted in the actual “biting” of a victim.

The second mega-metaphor largely responsible for projecting Bateman’s mind style is the OWNERSHIP metaphor.
Again, various micro-metaphors and further idiosyncratic logical conditionals feed the OWNERSHIP references in the following way:

‘If’ MATERIAL THINGS ARE PURCHASABLE ITEMS → (then) MATERIAL THINGS ARE DISPOSABLE

‘If’ PEOPLE ARE ITEMS & ITEMS ARE PURCHASABLE → (then) PEOPLE ARE PURCHASABLE & PEOPLE ARE DISPOSABLE

References to material possessions are amply illustrated throughout the novel as Bateman constantly boasts about, for instance, his “white marble and granite gas-log fireplace” over which hangs “an original David Onica,” his “long white down-filled sofa,” “thirty-inch digital TV set,” or the “chrome and acrylic Washmobile bath-room sink” that he is using until the marble sinks “ordered from Finland” (Ellis 1991, 24, 25) arrive. Similarly, the PEOPLE ARE ITEMS micro-metaphor is present in examples such as the following:

“I’m resourceful,” Price is saying. “I’m creative, I’m young, unscrupulous, highly motivated, highly skilled. In essence what I’m saying is that society cannot afford to lose me. I’m an asset,” [ . . . ] “I mean am I alone in thinking we’re not making enough money?” (Ellis 1991, 3; italics added)

As far as the film is concerned, the decadence of Bateman’s lifestyle is translated in the luxurious restaurants, expensive clothes, or the original
mobile phones used as part of the film set. Likewise the PEOPLE ARE PURCHASABLE micro-metaphor is achieved via the numerous references to money: the camera zooms in on credit cards slammed on tables to pay for expensive meals so that the entire screen is taken up by just that single shot; beggars on the street are tempted with money (again visually enhanced by the zooming in of the camera) that is never given to them; visits to the first ATMs are part of Bateman’s daily routine; the actual “purchase” of people is realized via Bateman’s hiring some prostitutes who, as “bought human items,” are waved bundles of notes at, or actually paid for as part of their services.

The dehumanizing process mentioned before in relation to the VIOLENCE mega-metaphor is also echoed and constantly reinforced by the surface realizations of the OWNERSHIP metaphor. In novel and film alike, certain components from the source domains of “purchasability” or “disposability” are mapped onto the target domain “humanity,” which helps remove any trace of guilt Bateman might have had when the micro-metaphors PEOPLE ARE ITEMS or PEOPLE ARE DISPOSABLE, for instance, are implemented. There are no “ethical” considerations to bear in mind if a bathroom sink (see Ellis 1991, 24, 25) needs disposing of, but neither do they seem to exist in relation to people. The unfolding of events culminating in Bateman’s dehumanization and depersonalization is one aspect in which cinematic tools allow the director a fully multimodal realization that, I would suggest, is not so clearly rendered in the novel.
By virtue of this dehumanization, Bateman’s psyche undergoes an obvious case of dissociation, demarcating two distinct entities that nonetheless still coexist in Bateman’s body, illustrated in the novel as follows:

I start to stalk the dark, cold streets off Central Park West and I catch sight of my face reflected in the tinted windows of a limousine that’s parked in front of Café des Artistes and my mouth is moving involuntarily, my tongue wetter than usual, and my eyes are blinking uncontrollably of their own accord. In the streetlamp’s glare, my shadow is vividly cast on the wet pavement and I can see my gloved hands moving, alternately clutching themselves into fist, fingers stretching, wriggling [ . . . ] and it takes an awesome amount of strength to fight down the urge to start slapping myself in the face. (Ellis 1991, 157)

The dissociation in Bateman’s mind occurs when his eyes perceive that his body is somehow operating of its own accord (mouth is moving, fists clenching), but his brain does not process such movements as responding to his own volition; in fact, some strength is required on his part to resist the urge to slap himself. Providing this dissociation is understood as the eventual consequential effect on his psyche of the deeds that his conceptualization of the world in terms of the combination of the VIOLENCE and OWNERSHIP mega-metaphors has “allowed” him to commit, his
unconventionality can be satisfactorily explained. It is my argument that the film exemplifies the double nature of this narrator more regularly and effectively than the novel. Whereas the latter introduces this theme after some of the killings have taken place (the aforementioned example is an extract from the middle of the novel), the film introduces what can be called the REFLECTION metaphor from the very beginning. Surfaces in Bateman’s apartment are always shiny, glass and mirrors featuring abundantly, so cinema viewers are exposed to Bateman, the character, as a double presence on the screen. His street search for prostitutes always happens with him traveling in a limousine on whose glass windows the woman’s image is reflected. The various prostitutes’ visits to his flat are metafictionally shot, for such sexual encounters are also videotaped by Bateman himself who becomes, therefore, a kind of “intradiegetic director.” These sex scenes are subsequently replayed so that the cinema audience, the prostitutes, and Bateman himself can become “witnesses” of the self-referential nature of this double exposure to filming. Considering that these realizations of the REFLECTION metaphor occur from the beginning and throughout, I would suggest that they act cataphorically, anticipating from the onset what is to occur to Bateman’s state of mind in the rest of the film. Mainly based on visual signs, but also containing some spoken language referring to the act of shooting, recording, or staring at mirrors, this multimodal metaphor is allowed a pervasive presence missing from the written narrative. Hence, the double, split, or dissociated personality of Bateman is cinematically enhanced thanks to the multimodal possibilities offered by the filmic mode itself.

The final metaphor I explore neatly encompasses the key mind style characteristics described so far and fully illustrates the theoretical considerations postulated in relation to metaphorical patterns.
The MUSIC mega-metaphor is exploited in both the novel and film, also contains further micro-metaphors, and firmly links the VIOLENCE and OWNERSHIP mega-structures, drawing from and complementing them. The surface realization of this metaphor in the novel is, nevertheless, primarily based on recurring instances of overlexicalization in overdescriptive passages about 1980s bands, singers, and their music (Huey Lewis and The News, Phil Collins, Whitney Houston). The filmic mode, on the other hand, abounds in semiotic resources that make the MUSIC metaphor a clear exponent of O’Halloran’s semiotic metaphor, initially, simply because of the many songs from the period chosen as sound track (“True Faith” by New Order, “Walking on Sunshine” by Katrina and The Waves, “Simply Irresistible” by Robert Palmer, etc.).4 More importantly, 1980s music is also employed as a meaning-encoder device of a different kind, for which the Paul Allen murder scene previously described needs to be recalled. The visual and sound resources used to frame the violent nature of such a scene have already been established, but the MUSIC metaphor still succeeds in incorporating additional elements to it. The overlexicalization present in the
A Multimodal Approach to Mind Style
novelistic form makes an appearance in the film in the form of Bateman’s “music monologues,” which preface his violent outbursts. As if he were a music critic, Bateman halts the conversation between interactants (Paul Allen and himself) and supplants it with overdescriptive music rants. Visually, close-ups of the actual CD cases he describes are recurrent, as are instances of Bateman’s switching on various state-of-the-art hi-fi systems. Eventually, the songs/bands he is so deftly analyzing are heard by victims and audience alike. More specifically, this mega-metaphor gets concretized in the VIOLENCE IS MUSIC micro-metaphor, which only becomes activated once the first killing takes place. This concretization appears to bring to the fore the multimodal nature of this semiotic metaphor inasmuch as the conceptual field used as source domain, MUSIC, is realized multimedially (sound track, close-up shots of CD cases, actor’s voice, linguistic code of Bateman’s descriptions) and then mapped onto the target domain VIOLENCE. The resulting meaning for cinema spectators is that “music” becomes an expression of “violence,” which raises expectations in the audience: cinema viewers anticipate more killings when this combination of semiotic resources comes together. Furthermore, such anticipation is also encouraged because of the pervasive way in which this semiotic metaphor functions, literally inviting spectators to conclude that for Bateman “music is nothing else but violence.” The filmic version of American Psycho, thus, highlights how the clever manipulation of modes can and certainly does heighten the extreme traits of Bateman’s psychopathic mind style.
CONCLUSION

This chapter has concerned itself with attempting to bridge the existing gap in studies of mind style by extending the analysis of such a traditionally textual structure to multimodal environments. Looking at the filmic adaptation of original literary narratives has revealed the somewhat impoverished perspective that some scholars have so far afforded themselves in their research of this issue. Having said this, their original considerations on the role that conceptual metaphors play in the projection of particular mind styles have been amply confirmed by my analysis in monomodal and multimodal discourses alike. What a multimodal approach brings to light is the real extent to which such foundational structures are actually at the heart of our conceptual system as proven by the multifarious ways in which they are materially concretized. Secondly, this chapter has also reviewed various theoretical positions that deal with the form, function, and realizations of metaphors in modes other than the purely linguistic. The systemic-functional tradition has borne fruit in the shape of semiotic metaphors (O’Halloran 1999a, 1999b, 2003, 2004, 2005) whilst cognitivism has informed the concept of multimodal conceptual metaphors (Forceville 1996, 2002a, 2002b,
Rocío Montoro
2006, 2007). Both perspectives coincide in highlighting ways of assessing discourse in a more comprehensive way than would be the case in monomodal approaches. As demonstrated here, the implementation of either analysis is not exclusive or incompatible but complementary. Finally, I suggest that further research into the way narratives function, in general, can be aided by multimodal approaches to discourse. The new insights gained into the way mind style indicators work in the filmic mode suggest that traditional positions on narrative theory might need to be reevaluated in light of the knowledge acquired from a multimodal narrative perspective. Multimodal approaches, consequently, could and should inform future research on narrative theory, filmic studies, and perhaps even the concept of discourse itself.
NOTES

1. It is worth noting that the use of the term mode by Forceville seems to differ slightly from how other multimodal researchers employ it. For instance, Kress and van Leeuwen’s distinction (2001) of the concepts of “mode” and “medium” seems relevant. As I understand it, Kress and van Leeuwen conceive of the latter term as “medium of production,” that is, “the material resources used in the production of semiotic products and events, including both the tools and the materials used” (2001, 22). Such a definition appears significantly close to Forceville’s understanding of “modes” noted earlier (written language, spoken language, visuals, etc.) as ways of materially rendering the conceptual side of the metaphor. It seems to me that Forceville is here simply misnaming what is essentially the multimedial nature of the surface realization of conceptual metaphors in the cinematic mode. The combination of semiotic resources employed in cinema allows these surface realizations to be far richer than in the printed mode. To avoid confusion, I am here adhering to Kress and van Leeuwen’s distinction whereby any reference to medium is to be understood as the actual medium of production or physical realization (using whichever semiotic resources are necessary) of the conceptual metaphors analyzed.

2. The experiential basis of our cognitive makeup has also been dealt with in multimodal literature as “experiential meaning potential” (Kress and van Leeuwen 2001). Material qualities, hence, do not simply encode meaning because of what they are or how they function but because of our previous experience of them.

3. The use of vectors as a multimodal analysis tool is not free from controversy and has been amply revised from the original formulation by Kress and van Leeuwen (1996) (see Baldry and Thibault 2005, 35; McIntyre 2008 for a recent critique). I am, however, retaining them to indicate directionality, connections between participants, etc.

4. For further assessments on the role of the sonic mode see van Leeuwen (1999), especially in connection to the modality of sound: The modality of sound can be approached along the same lines as the modality of images. Here too, modality judgements are ‘cued’ by the degree to which a number of different parameters are used in the articulation of the sounds, and here too, the context, and more specifically the
coding orientation used in that context, determines the modality value of a particular sound—the degree and kind of truth we will assign to it. (van Leeuwen 1999, 170) My analysis of the MUSIC metaphor here is intricately connected to the degree of truth assigned to the propositions expressed in the meaning encoded in such metaphor. For instance, as I explain later, the certainty of a killing event taking place after particular musical references are made is contextually linked to the psychological frame of mind that characterizes Bateman. Restrictions of space, however, do not allow for further insights into how the modality of sound is linked to the sonic realizations of the conceptual MUSIC metaphor.
REFERENCES

Baldry, Anthony, and Paul J. Thibault. 2005. Multimodal Transcription and Text Analysis. London: Equinox.
Black, Elizabeth. 1993. “Metaphor, Simile and Cognition in Golding’s The Inheritors.” Language and Literature 2 (1): 37–48.
Bockting, Ineke. 1994. “Mind Style as an Interdisciplinary Approach to Characterisation in Faulkner.” Language and Literature 3 (3): 157–74.
Carroll, Noël. 1996. Theorizing the Moving Image. Cambridge: Cambridge University Press.
Carroll, Noël. 1998. Interpreting the Moving Image. Cambridge: Cambridge University Press.
Chapman, Siobhan. 2000. Philosophy for Linguists. An Introduction. London and New York: Routledge.
Cienki, Alan. 1998. “Metaphoric Gestures and some of their Relations to Verbal Metaphoric Expressions.” In Discourse and Cognition. Bridging the Gap, ed. Jean-Pierre Koenig, 189–204. Stanford, CA: CSLI Publications.
Ellis, Bret Easton. 1991. American Psycho. London: Picador.
Forceville, Charles. 1996. Pictorial Metaphor in Advertising. London and New York: Routledge.
Forceville, Charles. 2002a. “The Conspiracy in The Comfort of Strangers: Narration in the Novel and the Film.” Language and Literature 11 (2): 119–35.
Forceville, Charles. 2002b. “The Identification of Target and Source in Pictorial Metaphors.” Journal of Pragmatics 34:1–14.
Forceville, Charles. 2006. “Non-verbal and Multimodal Metaphor in a Cognitivist Framework: Agendas for Research.” In Cognitive Linguistics: Current Applications and Future Perspective, ed. Gitte Kristiansen, Michel Achard, René Dirven, and Francisco Jesús Ruiz de Mendoza Ibáñez, 379–402. Berlin and New York: Mouton de Gruyter.
Forceville, Charles. 2007. “Multimodal Metaphor in Ten Dutch TV commercials.” The Public Journal of Semiotics 1 (1): 19–51.
Fowler, Roger. 1977. Linguistics and the Novel. London and New York: Routledge.
Fowler, Roger. [1986] 1996. Linguistic Criticism. Oxford: Oxford University Press.
Halliday, M. A. K. 1985. Introduction to Functional Grammar. London: Arnold.
Harron, Mary. 2000. American Psycho. Edward R. Pressman Productions.
Herman, David, ed. 2003. Narrative Theory and the Cognitive Sciences. Stanford, CA: Center for the Study of Language and Information.
Hooper, Tobe. 1974. The Texas Chainsaw Massacre. Bryanston Distributing Company.
Johnson, Mark. 1987. The Body in the Mind. The Bodily Basis of Meaning, Imagination, and Reason. Chicago and London: The University of Chicago Press.
Kennedy, John M. 1982. “Metaphor in Pictures.” Perception 11:589–605.
Kennedy, John M. 1993. Drawing and the Blind: Pictures to Touch. New Haven and London: Yale University Press.
Koenig, Jean-Pierre, ed. 1998. Discourse and Cognition. Bridging the Gap. Stanford, CA: CSLI Publications.
Kövecses, Zoltán. 2000. Metaphor and Emotion. Cambridge: Cambridge University Press.
Kövecses, Zoltán. 2002. Metaphor. A Practical Introduction. Oxford: Oxford University Press.
Kress, Gunther, and Theo van Leeuwen. 1996. Reading Images. The Grammar of Visual Design. London and New York: Routledge.
Kress, Gunther, and Theo van Leeuwen. 2001. Multimodal Discourse. The Modes and Media of Contemporary Communication. London: Hodder Arnold.
Kristiansen, Gitte, Michel Achard, René Dirven, and Francisco Jesús Ruiz de Mendoza Ibáñez, eds. 2006. Cognitive Linguistics: Current Applications and Future Perspective. Berlin and New York: Mouton de Gruyter.
Lakoff, George, and Mark Johnson. 1980. Metaphors We Live By. Chicago and London: The University of Chicago Press.
Leech, Geoffrey, and Mick Short. 1981. Style in Fiction. London: Longman.
Lim, Victor Fei. 2004. “Developing an Integrative Multi-semiotic Model.” In Multimodal Discourse Analysis, ed. Kay L. O’Halloran, 220–46. London: Continuum.
Margolin, Uri. 2003. “Cognitive Science, the Thinking Mind and Literary Narrative.” In Narrative Theory and the Cognitive Sciences, ed. David Herman, 271–94. Stanford, CA: Center for the Study of Language and Information.
McIntyre, Dan. 2008. “Integrating Multimodal Analysis and the Stylistics of Drama: A Multimodal Perspective on Ian McKellen’s Richard III.” Language and Literature 17 (4): 309–34.
McNeill, David. 1992. Hand and Mind: What Gestures Reveal about Thought. Chicago: University of Chicago Press.
McNeill, David, ed. 2000. Language and Gesture. Cambridge: Cambridge University Press.
O’Halloran, Kay L. 1999a. “Interdependence, Interaction and Metaphor in Multisemiotic Texts.” Social Semiotics 9 (3): 317–54.
O’Halloran, Kay L. 1999b. “Towards a Systemic Functional Analysis of Multisemiotic Mathematics Texts.” Semiotica 124 (1/2): 1–29.
O’Halloran, Kay L. 2003. “Intersemiosis in Mathematics and Science. Grammatical Metaphor and Semiotic Metaphor.” In Grammatical Metaphor. Views from Systemic Functional Linguistics, ed. Anne-Marie Simon-Vandenbergen, Miriam Taverniers, and Louise Ravelli, 337–65. Amsterdam and Philadelphia: John Benjamins Publishing Company.
O’Halloran, Kay L., ed. 2004. Multimodal Discourse Analysis. London: Continuum.
O’Halloran, Kay L. 2005. Mathematical Discourse: Language, Symbolism and Visual Images. London and New York: Continuum.
Semino, Elena. 2002. “A Cognitive Stylistic Approach to Mind Style in Narrative Fiction.” In Cognitive Stylistics. Language and Cognition in Text Analysis, ed. Elena Semino and Jonathan Culpeper, 95–122. Amsterdam and Philadelphia: John Benjamins Publishing Company.
Semino, Elena. 2007. “Mind Style 25 Years On.” Style 41 (2): 153–73.
Semino, Elena, and Jonathan Culpeper, eds. 2002. Cognitive Stylistics. Language and Cognition in Text Analysis. Amsterdam and Philadelphia: John Benjamins Publishing Company.
Semino, Elena, and Kate Swindlehurst. 1996. “Metaphor and Mind Style in Ken Kesey’s One Flew over the Cuckoo’s Nest.” Style 30 (1): 143–66.
Simon-Vandenbergen, Anne-Marie, Miriam Taverniers, and Louise Ravelli, eds. 2003. Grammatical Metaphor. Views from Systemic Functional Linguistics. Amsterdam and Philadelphia: John Benjamins Publishing Company.
van Leeuwen, Theo. 1999. Speech, Music, Sound. Houndmills and Basingstoke: MacMillan.
Werth, Paul. 1994. “Extended Metaphor – A Text-World Account.” Language and Literature 3 (2): 79–103.
Whittock, Trevor. 1990. Metaphor and Film. Cambridge: Cambridge University Press.
4
The Computer-Based Analysis of Narrative and Multimodality Andrew Salway
INTRODUCTION

This chapter is concerned with how the processing power and multimedia capabilities of modern computers can be exploited to understand the workings of stories told in combinations of different types of media. Since the early days of digital technology, computers have been used to count and analyze words and patterns of words in texts, chiefly under the rubric of corpus stylistics, and more generally in the field of digital humanities. At the risk of oversimplification, much of this work can be characterized by its focus on single literary texts or on the works of single authors, and also by the way in which scholars look for signs of previously hypothesized linguistic or literary phenomena in the texts, which sometimes involves manual annotation of the texts prior to automated analysis. In contrast, this study advocates a novel computer-based approach to the analysis of narrative and multimodality that is characterized by the use of a computer to extract unusually frequent patterns from the surface forms of large collections of multimodal stories. Crucially, these patterns are to be extracted without the cost and bias due to prior manual annotation and the encoding of grammars, pragmatics, and world knowledge. Instead, patterns are extracted from corpora of multimodal texts on purely statistical grounds. The extraction of such patterns is envisioned as a starting point for developing theory in two ways. Firstly, by asking why particular forms are prevalent in a corpus, we may be led to develop explanations of the observed phenomena. Secondly, because the phenomena of interest have known surface forms, it is straightforward to automatically generate more data about their occurrence in the initial corpus and in new instances of multimodal texts in order to test hypotheses. By means of two case studies I assess the potential for this kind of approach, discuss its inherent limitations, and point to opportunities for further work.
A major obstacle to the computer-based analysis of multimodal texts is the current limit on what can be achieved with automatic image and video analysis techniques, compared with text analysis. This issue was sidestepped in the work presented in the first case study as feature films were analyzed by extracting patterns
from corpora of texts that are surrogates for different parts of the multimodal text, i.e., film scripts, subtitles, audio description, and plot summaries. The second case study considers the extent to which image analysis techniques may be used together with language processing in the analysis of multimodal texts, like Web pages. By way of a backdrop to the case studies, I briefly review other work that used computers in the analysis of multimodal texts. I also give a view on the current state-of-the-art technologies for text and image understanding, which should inform our expectations for, and approaches to, automating the analysis of narrative and multimodality.

Baldry (2004) describes a multimodal concordancer with which a researcher annotates a multimodal text, such as a video recording of an advert or a film, with textual descriptions. The system then assists the researcher by generating statistics to help find previously specified patterns in their descriptions. To access and analyze levels of multimodal documents such as rhetorical structure, layout structure, and navigation structure (specifically with a view to finding correlations between these layers and document genres), an XML-based annotation scheme was developed and deployed by Bateman and colleagues (Bateman, Henschel, and Delin 2002; Bateman 2008). An approach to computer-assisted multimodal analysis that does not require transcription or annotation is reported in O’Halloran (2004b). This concentrates on analyzing the visual properties of film by using a standard video editing system to manipulate the image’s brightness, color, contrast, etc., in order to highlight different semiotic choices and to view their effects. Our proposed use of technology for data-driven explorations of narrative and multimodality should be seen as complementary to the development of tools that enable researchers to carry out very deep and detailed analyses of multimodal documents in an interactive fashion.
O’Halloran (2009) explores possible ways of using digital technology for a kind of multimodal analysis that is grounded in social semiotics. She notes how technology has led to new research paradigms in mathematics and science, and suggests analogous impacts of digital technology on the field of multimodality. In particular, attention is drawn to the potential for visualization techniques, coupled with techniques for automated low-level visual content analysis, to help multimodal discourse analysts to unpick the multiple interwoven strands of meaning in media such as video and interactive Web sites. It is argued that these techniques are better suited than static, page-based transcriptions for helping researchers to articulate and elucidate semiotic choices in multimodal documents that are “conceptualised as continuous spatial-temporaltype relations.” There are challenges here, however, to do with designing interfaces for such tools in a way that maintains theoretical consistency and ease of use (O’Halloran et al., forthcoming). Compared with successes in text analysis and language understanding, the use of computers for the automated analysis of multimodal documents will be limited by the fact that image, video, and audio data are much less
computationally tractable than text data. The nature of language and the machine encoding of written text data makes explicit the basic meaning-bearing units (words) and their meaningful sequencing, and so word frequencies and collocation data give insights into document meanings, and some degree of automatic parsing and mapping into semantic representations is possible. By contrast, the machine-level encoding of still and moving images comprises matrices of pixels: each pixel is a color value for one point in the image and so carries no meaning by itself. The state of the art in language understanding technology is represented by work in the field of information extraction; for an overview, see Gaizauskas and Wilks (1998). An information extraction system extracts certain kinds of facts from a given type of text: for example, financial news stories about company takeovers are analyzed to fill a database of “Takeover Events” with details for each event about which company bought the other, for how much, and the people involved. Whilst useful for commercial applications, information extraction technology still falls a long way short of what would be considered to be story understanding in the field of narratology. The recent development of a system to model space and time in restaurant narratives (Mueller 2007) points to the challenges here. It required thirteen thousand lines of computer code to extract information from restaurant narratives and to reason about this information in order to answer generic questions about the dining experience. Whilst efforts are ongoing to produce encodings of general world knowledge,1 it seems to me that Dreyfus’s arguments as to what computers still can’t do hold sway here (Dreyfus 1992).
So, I argue, if we are interested in generally applicable approaches to the computer-based analysis of narrative and multimodality, we should concentrate on exploiting what computers can do, and favor approaches that analyze patterns in the surface forms of multimodal texts. As noted earlier, the machine encodings of written texts make explicit something of the meaning-bearing elements, but this is not so for image and video data. Thus we must consider what can be done to get beyond pixel-level representations using technologies from the interrelated and overlapping fields of image processing, image analysis, and image understanding (or computer vision); for an introduction to these fields see Gonzalez, Woods, and Eddins (2004). The task of automatic image understanding, i.e., producing a meaningful description of what can be seen in an image, is only achievable in highly constrained images, such as single objects set against a plain background and shot with good lighting (for more on the challenges here, see Smeulders et al. 2000). That said, there are some techniques for image and video analysis that have matured to the extent that they could be applied to the analysis of multimodal documents. A review of multimedia technologies by Smeaton (2004) included: face detection and recognition in image and video data; video segmentation into shots and the selection of representative key frames for each shot; and the detection of commonly occurring image content like “children,” “sand,” “water,” “outdoors,” and “sky.” For the analysis of feature films specifically, a technique has been demonstrated
that classifies film sequences into either “action,” “montage,” or “dialogue” (Lehane and O’Connor 2006). Low-level video features to do with the rate of shot change and the amount of motion within the frame were combined to give a measure of “Tempo” in feature films by Adams, Dorai, and Venkatesh (2002): changes in Tempo were shown to coincide with dramatically important moments in films. Such image and video analysis techniques suggest that it may be possible to extract useful features from visual information as part of the analysis of multimodal stories. This potential is discussed further in the second case study, but first we look at work that analyzed only textual surrogates for multimodal stories.
CASE STUDY I: ANALYZING NARRATIVE IN FILM

Implicit in the approach that we are advocating is the assumption that interesting narrative characteristics manifest in regular ways in the digital forms of multimodal texts, like films, so that we can be led to them by extracting forms that are unusually frequent in a large collection of the texts. The work discussed in this section comprises investigations into how narrative functions in film, based on the automated analysis of corpora of film scripts, plot summaries, subtitles, and audio description. The use of these corpora (summarized in Table 4.1) allows us to deal separately with different information streams, or semiotic strands, in the complex multimodal artefact that is film: audio description2 stands as a surrogate for the characters, props, scenes, and actions depicted on screen; subtitles as a surrogate for the dialogue, and film scripts and plot summaries do both. Of course we must recognize the importance of studying the interplay between the moving image and the sound track, and the diverse semiotic choices available within each; for a detailed inventory of the meaning-making elements in film, see O’Halloran (2004b). We must also recognize that these textual surrogates are only an imperfect and incomplete record of the film. For now though, in the context of the current discussion, we seek to provide evidence in support of our underlying assumption by showing that some narrative aspects of film manifest in regular forms.

Table 4.1 Four Corpora of Text Surrogates for Film

Text Type           Source                                                          Number of texts   Number of words
Audio Description   Major UK producers of audio description: RNIB, BBC, and ITFC   73                c. 713,000
Film Scripts        www.imsdb.com, www.simplyscripts.com                            75                c. 1,900,000
Subtitles           www.subscene.com                                                80                c. 516,000
Plot Summaries      www.imdb.com                                                    111               c. 14,000
The research summarized and discussed here used a variety of automated techniques to investigate what kinds of information are commonly provided by audio description, film scripts, subtitles, and plot summaries. Though the details varied, the general method was: (a) to identify unusually frequent words in each corpus by comparing the frequencies of words in the corpus with their frequencies in a general language sample, e.g., the British National Corpus; (b) to identify collocations of the unusually frequent words, i.e., statistically significant word sequences that contained them; and, in the analysis of film scripts and audio description, (c) to merge and generalize the collocations to produce fragments of a local grammar that described syntagmas and paradigms induced from the corpora. This third step can be viewed as an implementation of Harris’s approach to the study of language and information, whereby patterns induced from a text corpus reveal the information structures of a domain (Harris 1988). The general method followed, and the link to Harris, were proposed in work done on information extraction from financial news stories (Traboulsi, Cheng, and Ahmad 2004; Almas and Ahmad 2006). For details of the techniques used to analyze film texts, and for more comprehensive discussion of the results, see the following references: for the analysis of audio description and film scripts see Salway, Vassiliou, and Ahmad (2005) and Salway (2007), and in particular for the induction of local grammars from these texts see Vassiliou (2006); for subtitles, see Lingabavan and Salway (2006); and for plot summaries see Tomadaki and Salway (2006) and Tomadaki (2006). Results from these studies are selected and presented here with a view to showing that patterns identified in these corpora lead us to some of the narrative functioning of film.
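Steps (a) and (b) of the general method can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the procedure used in the cited studies: the function names are hypothetical, "unusual frequency" is approximated by a smoothed ratio of relative frequencies, and word sequences are simply counted rather than tested for statistical significance.

```python
from collections import Counter

def tokenize(text):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (t.strip('.,;:!?"()') for t in text.lower().split())
    return [t for t in tokens if t]

def unusually_frequent(corpus_tokens, reference_tokens, min_count=2):
    """Step (a): rank words by relative frequency in the corpus divided by
    relative frequency in a general-language reference sample.
    The ratio is an assumed stand-in for whatever statistic a study uses."""
    corpus, reference = Counter(corpus_tokens), Counter(reference_tokens)
    n_corpus, n_ref = len(corpus_tokens), len(reference_tokens)
    scores = {}
    for word, count in corpus.items():
        if count < min_count:
            continue
        # Add-one smoothing so words absent from the reference do not divide by zero.
        ref_rel = (reference[word] + 1) / (n_ref + 1)
        scores[word] = (count / n_corpus) / ref_rel
    return sorted(scores, key=scores.get, reverse=True)

def sequences_containing(tokens, node, n=3):
    """Step (b), simplified: count every n-word sequence that contains `node`.
    Sequences may cross sentence boundaries; no significance test is applied."""
    counts = Counter()
    for i in range(len(tokens) - n + 1):
        gram = tuple(tokens[i:i + n])
        if node in gram:
            counts[gram] += 1
    return counts

if __name__ == "__main__":
    corpus = tokenize("John looks at Mary. Mary looks at the door. John walks to the door.")
    reference = tokenize("The cat sat on the mat. The government said that "
                         "the economy looks at risk today.")
    print(unusually_frequent(corpus, reference))
    print(sequences_containing(corpus, "looks", n=2).most_common(3))
```

In practice the reference sample would be a large general-language corpus and the ranking statistic would be chosen with care; the toy sentences here only show the shape of the computation.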
Starting with the corpora of film scripts and audio description (which we discuss together because they exhibit many frequent words and collocations in common), we see that the most unusually frequent words include the following: looks, door, turns, away, head, towards, eyes, room, takes, around, walks, behind. These appear in collocations such as X looks at Y, X walks to the Y, X opens the door and leaves, and X nods her head. It is straightforward to infer that these word sequences are common in film scripts and audio description because the actions that they refer to are frequently depicted in mainstream film. On the basis of the four sequences, we might go on to say that storytelling in film requires the filmmaker to ensure that the audience knows who is looking at what or whom, who is coming and going where, and about nonverbal communication between characters, but of course this selection of four phrases is arbitrary. A more systematic approach was taken by Vassiliou (2006) to merge and generalize collocation patterns to automatically generate a description of local grammar fragments around unusually frequent words. Figure 4.1 shows a local grammar fragment for the word looks. Note that the shaded parts of the diagram required some human intervention, but the rest was generated entirely automatically. The diagram,
read left to right, captures many observed, and predicted, word sequences around looks, and around other words that appeared in similar contexts, i.e., stares, gazes, etc. This local grammar fragment can be interpreted as reflecting one kind of information that these texts, and hence films, frequently convey, i.e., information about what and whom characters are looking at. By automatically generating such local grammar diagrams for the ten most unusually frequent words in audio description and film scripts, Vassiliou (2006) produced a clear picture of some of the kinds of information commonly provided by these texts. By manually abstracting from the diagrams he postulated three main kinds of film events, which were all grounded in empirically observed text forms. He labeled these “Focus_of_Attention” (comprising X looks at Y, etc.), “Change_of_Location” (X walks to Y, etc.) and “Non-verbal_Communication” (X nods her head, etc.). Whilst an unsurprising finding for anyone with a passing knowledge of how stories are told in film, what is significant about the postulation of these events is that each is tied to a realization in the form of word sequences. An important corollary of this is that it is easy to develop an information extraction system to automatically generate a database comprising data about these events in any number of films for which film scripts or audio description is available, as Vassiliou went on to do. The information extraction system, albeit imperfectly, instantiated a database containing data about Focus_of_Attention events, Change_of_Location events and Non-verbal_Communication events in 193 films, including details of the characters, objects, and locations involved, and estimates of when they happened (from audio description time codes and relative positions in film scripts). A similar approach, though implemented less extensively, was taken to the analysis of a subtitle corpus (Lingabavan and Salway 2006).
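Because each postulated event is tied to surface word sequences, the extraction step can be sketched with a few regular expressions. The following Python sketch is hypothetical throughout: the patterns are loosely modelled on the phrases quoted above (X looks at Y, X walks to Y, X nods her head) rather than on Vassiliou’s local grammars, and it assumes that character names are single capitalized words and that each line of audio description arrives with a time code.

```python
import re

# Hypothetical surface patterns, one per event type named in the text.
# They are far simpler than an induced local grammar and will miss many forms.
EVENT_PATTERNS = {
    "Focus_of_Attention": re.compile(
        r"\b(?P<agent>[A-Z][a-z]+) (?:looks|stares|gazes) at (?P<target>(?:the )?\w+)"),
    "Change_of_Location": re.compile(
        r"\b(?P<agent>[A-Z][a-z]+) walks to (?P<target>(?:the )?\w+)"),
    "Non-verbal_Communication": re.compile(
        r"\b(?P<agent>[A-Z][a-z]+) (?:nods|shakes) (?:his|her) head"),
}

def extract_events(lines):
    """Return one record per matched event.

    `lines` is a list of (timecode, text) pairs; the time code is copied
    into each record as an estimate of when the event happens.
    """
    events = []
    for timecode, text in lines:
        for event_type, pattern in EVENT_PATTERNS.items():
            for match in pattern.finditer(text):
                events.append({
                    "type": event_type,
                    "time": timecode,
                    "agent": match.group("agent"),
                    # Patterns without a target group yield None here.
                    "target": match.groupdict().get("target"),
                })
    return events

if __name__ == "__main__":
    description = [
        ("00:01:10", "John looks at Mary."),
        ("00:01:15", "Mary walks to the door."),
        ("00:01:18", "Mary nods her head."),
    ]
    for event in extract_events(description):
        print(event)
```

The records could then be written to a database table with one row per event, which is the kind of output the text describes; storage is omitted to keep the sketch short.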
Figure 4.1 A local grammar fragment induced from corpora of film scripts and audio description, from Vassiliou (2006).

Preliminary results showed unusually frequent words being: don’t, gonna, didn’t, hey, fuck, guy, shit, uh, gotta, doesn’t. Taking these words as candidate nucleates, the standout collocations were I don’t know, I’m gonna, I wanna, and I gotta. On the basis of these results two common dialogue events were postulated: “Statement_of_Lack_of_Knowledge” (realized by I don’t know) and “Statement_of_Future_Intended_Action” (realized by the other three collocations), which had three varieties according to whether the future action was something that the speaker was planning to do to meet one of their own goals, or because of some external force acting on them. Such an analysis is impoverished compared to the rich discussion of the functions of film dialogue provided by Kozloff (2000), but again the key point is that the postulated dialogue events are tied to formal realizations and so data about their occurrence could be extracted automatically from subtitles for further analysis. The corpus of plot summaries analyzed in Tomadaki and Salway (2006) and Tomadaki (2006) was not large enough to allow for the automatic identification of collocation patterns on statistical grounds, but manual inspection of concordances of frequent words did suggest some common kinds of events being described. Compared with the relatively basic actions described in audio description and film scripts (look, walk, take), plot summaries describe larger-scale events such as: help, meet, kill, bring, tell, force, find, discover, love, and murder, with common patterns including fall in love with, X helps Y, X discovers that . . . and X finds Y. The findings reported in this section have made a reasonable case that an automated analysis of narrative texts can be revealing about some of the kinds of information that must be conveyed for successful storytelling, and hence inform theories of how narrative functions.
In particular, by analyzing corpora of different surrogate texts it seems possible to gain insights into how the different semiotic strands of film contribute to the whole. All the work presented earlier is incomplete in the sense that the corpus data could have been drilled more deeply, and the corpora could have been increased in size. With further work we would expect to see more kinds of commonly occurring events identified on the basis of statistically significant patterns. There is also much to be done analyzing the data already extracted about the commonly occurring events. One piece of ongoing research is looking at word patterns in audio description that express information about characters’ mental states—common patterns of this type include CHARACTER looking *ed|*ly (where *ed stands for any word ending -ed, and *ly any word ending -ly), and CHARACTER smiles|stares|looks|walks *ly. For example, the description of an onscreen action as John smiles nervously conveys something about John’s mental state, cf. Palmer’s thought-action continuum and his arguments for the centrality of characters’ mental states to story (Palmer 2004). The description of thoughts and actions in audio description is analyzed by Salway and Palmer (2007).
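The formal realizations discussed in this section lend themselves to simple pattern matching. The following is a minimal sketch in Python, not the tooling used in the studies cited: it tags subtitle lines with the two postulated dialogue events and extracts the two audio description patterns quoted above. The function names are hypothetical, and treating CHARACTER as any single capitalized word is a simplifying assumption made purely for illustration.

```python
import re

# Dialogue events and their realizations, as postulated in the text.
DIALOGUE_EVENTS = {
    "i don't know": "Statement_of_Lack_of_Knowledge",
    "i'm gonna": "Statement_of_Future_Intended_Action",
    "i wanna": "Statement_of_Future_Intended_Action",
    "i gotta": "Statement_of_Future_Intended_Action",
}

# Assumption: a character name is approximated by one capitalized word.
CHARACTER = r"(?P<character>[A-Z][a-z]+)"

MENTAL_STATE_PATTERNS = [
    # CHARACTER looking *ed|*ly
    re.compile(CHARACTER + r" looking (?P<cue>\w+(?:ed|ly))\b"),
    # CHARACTER smiles|stares|looks|walks *ly
    re.compile(CHARACTER + r" (?:smiles|stares|looks|walks) (?P<cue>\w+ly)\b"),
]


def tag_dialogue_events(subtitle_line):
    """Return the dialogue events whose realizations occur in a subtitle line."""
    # Normalize typographic apostrophes so both spellings match.
    normalized = subtitle_line.replace("\u2019", "'").lower()
    return [
        event
        for phrase, event in DIALOGUE_EVENTS.items()
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized)
    ]


def find_mental_state_cues(description_line):
    """Return (character, cue word) pairs matched in an audio description line."""
    hits = []
    for pattern in MENTAL_STATE_PATTERNS:
        for match in pattern.finditer(description_line):
            hits.append((match.group("character"), match.group("cue")))
    return hits
```

Applied to the example from the text, `find_mental_state_cues("John smiles nervously")` pairs John with the cue word nervously. A fuller treatment would need a list of character names for each film rather than the capitalization heuristic assumed here.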
The Computer-Based Analysis of Narrative and Multimodality
CASE STUDY II: ANALYZING MULTIMODALITY IN WEB PAGES
I now turn to address the question of whether interesting patterns can be extracted automatically from the visual components of multimodal documents. As noted previously, we would not expect such patterns to emerge directly from image and video data represented at the level of pixels, but rather from the analysis of prespecified image and video features that can be derived automatically from image and video data, e.g., color distributions in images, rate of shot change in videos, etc. To date, there have been no attempts to derive insights into the narrative functioning of multimodal documents by analyzing image and video features with a method analogous to that described in the previous section for text corpora. Thus, here, the best we can do is to present a framework that was developed manually to account for image–text relations in multimodal documents, and to assess the extent to which current image analysis techniques could automate the framework in the future. Specifically, we argue that patterning in the digital forms of multimodal documents cues various kinds of relations between the elements making up these documents, with examples of how the functioning of image–text relations in Web pages is cued by combinations of text features, image features, and page-layout features. The core of the work discussed here is the system of image–text relations proposed by Martinec and Salway (2005), which combines two kinds of relations—drawing on the ideas of Barthes (1977) and Halliday (1985)—the relative status of images and text, and how they relate to one another in terms of logico-semantics.
Compared with previous functional-systemic approaches to multimodality, such as Kress and van Leeuwen (1996), and most of the papers in O’Halloran (2004a), and in Royce and Bowcher (2007), this system of image–text relations may be considered distinct on two counts: (i) it is intended to be general to all genres of multimodal discourse where images and text co-occur; (ii) in the identification of each image–text relation, great emphasis was placed on the specification of tangible, machine-processable realizations of each relation, e.g., some mix of text, image, and page-layout features detectable in the surface form of multimodal documents. That said, in the context of the current discussion, it must be stressed that the system of image–text relations was developed through an entirely manual analysis of multimodal documents. This section first summarizes Martinec and Salway’s system of image–text relations, and then discusses the extent to which these relations may be realized in forms that a computer can detect automatically, mentioning recent work that has begun to test empirically one kind of image–text relation in online news stories. With regard to their relative status, the relation between an image and a text is equal when either: both the image and the text are required for successful communication, in which case they are equal-complementary; or, both the image and the text can be understood individually, in which case
they are equal-independent. The relation between an image and a text is unequal when either the image or the text can be understood individually; that which cannot be understood individually is subordinate to the other. Consider some examples of common kinds of Web pages. Images tend to be subordinate to the main text on news Web pages whereas the text tends to be subordinate to the image on art gallery Web pages. Image and text often share an equal relationship in Web pages used for science teaching and learning. The relationship is equal-independent when both convey the same information in different ways, and when there is a cross-reference between the image and the text. Some technical images may require text to identify them, in which case the image and the text, probably a caption, are equal-complementary.
We identified three main kinds of logico-semantic relations between images and texts. A text elaborates the meaning of an image, and vice versa, by further specifying or describing it. Photographs on news Web pages frequently elaborate the text, specifically the first paragraph. For example, an image often specifies what a person referred to in the news story looks like—this information is missing from the text. On art gallery Web pages, parts of painting captions elaborate the images (paintings) with a description of image content. It is possible for an image to elaborate a text and a text to elaborate an image at the same time, as in the case of a scientific diagram and accompanying text in an equal-independent relationship. A text extends the meaning of an image, and vice versa, by adding new information; for example, car adverts that comprise images of cars and text giving information about price and performance. A text enhances the meaning of an image, and vice versa, by qualifying it with reference to time, place, and/or cause-effect.
In one online news story we looked at there was an enhancement relation between the image and the first paragraph of the text because the image depicted the effect of the explosion reported in the text. Salway and Martinec (2005) suggested some image features, text features, and page-layout features that might be used together to classify image–text relations automatically, but these were not tested experimentally. It is expected that such classification will be easier if the genre of the image–text combination is known because there may be preferred image–text relations with genre-specific realisations. The features that were considered included: page layout and formatting—the relative size and position of the image and the text, font type and size, image border; references in the text—for example, “this picture shows . . . ,” “see Figure 1,” “on the left,” “is shown by”; the grammatical characteristics of the text—tense, quantification, use of full sentences or short phrases; modality of images—a scale from realistic to abstract, or from photographic to graphic; and framing of images—for example, one centered subject or no particular subject. Crucially, all these features can be analyzed automatically with current multimedia technologies. The text features require straightforward string matching, or relatively simple part-of-speech tagging.3 Page-layout
features, whose importance for the reading of multimodal documents is made clear by Kress and van Leeuwen (1996), can be analyzed automatically with page structure algorithms such as those developed by Cai et al. (2003) and Song et al. (2004). The face detection techniques mentioned in the introduction section of this chapter make it feasible to extract framing features for images of people. As for image modality, defined by Kress and van Leeuwen (1996) as a function of depth, color saturation, color differentiation, color modulation, contextualization, pictorial detail, illumination, and degree of brightness, this can perhaps be interpreted in terms of low-level image features extracted from the analysis of pixel data. Two kinds of features that seem most indicative of status relations are page layout and lexical references. In languages that are read left to right, and top to bottom, we expect the subordinate media type to appear to the right of or below the other and to occupy less space. References in text such as “this painting . . .” or “Figure X shows . . .” are strong indications that the text is about (and therefore subordinate to) the image. Other references like “this is shown in Figure X” may suggest equal status—especially when near the end of the text. Determining logico-semantic relations involves a comparison between what is depicted in the image and what is referred to by the text. If exactly the same people, objects, and events are depicted and referred to, then there is elaboration. If completely new things are depicted or referred to then there is extension. If related temporal, spatial, or causal information is provided then there is enhancement. The question is how such comparisons may be computed. Information extraction techniques can recognize proper nouns and work out who is the subject of a story, and determine what kind of event or state is being referred to in a text.
Image analysis techniques can detect faces, indoor versus outdoor scenes, and framing—all of which may give clues about the main subject being depicted. It might also be interesting to compare the complexity of the image and of the text: image complexity could perhaps be measured as a function of the number of edges/regions or graphic elements. Measures of text complexity relate to sentence length, average word length, and use of embedded clauses. When a text elaborates an image we noticed that often present tense is used, or short phrases rather than complete sentences. When an image elaborates a text in the news domain the image is of a realistic modality and typically depicts a person who is framed so that the head and shoulders fill the photograph, and the text tends to repeat the name of the person depicted. The enhancement relation of cause-effect is realized when the image depicts a process and the text refers to a state, or vice versa. The image seems to normally be a general scene, rather than a closely cropped photograph with one main subject. The first empirical investigation of image–text relations, with both human subjects and automatic classification, was carried out by Hughes et al. (2007). An experiment was conducted to test the hypothesis that
humans can predict the main theme of a text by looking quickly at an associated image. By seeing pictures of people that accompany eighty online news stories but not the text, twenty-five subjects could predict very accurately whether the story was about the specific person/people depicted in the image or about a more general theme. Human performance on this task was then emulated with automatic classification based on low-level image features. Using a face detection algorithm set to detect large full-frontal faces, a measure of variation in image sharpness across the image and certain features intended to correlate with image modality, it was possible to correctly classify photographs into “Specific” or “General” categories in 82.5 percent of eighty online news stories. This result is the first sign that visual features relevant to multimodality can indeed be extracted automatically from image data. The currently available evidence for frequent and regular surface forms in image data is significantly weaker than that for text data discussed in the previous section. The detailed set of image–text relations discussed here was conceived on the basis of manual investigation, albeit with one eye on realizations for each relation in terms of features that are machine processable. What is certain is that there is a good variety of text features, image features, and page-layout features that can be analyzed automatically, and that these fit well with the explanations of multimodality initiated by Kress and van Leeuwen (1996), and developed by Martinec and Salway (2005). One important factor here is that in order for image–text relations to be recognized automatically, they need to be realized consistently in the same ways, as with online news stories.
However, it remains to be seen whether there are many other kinds of multimodal stories that are widely available in digital form, and that exhibit a sufficiently high degree of conventionality in how visual and verbal information are combined. One way to answer this question in the spirit of the current chapter would be to attempt to induce a set of image–text relations based on text, image, and page-layout features extracted from a large sample of multimodal stories. Another direction for further research is to look at multimodality in feature films, perhaps using audio description and subtitles as surrogates for the moving image and dialogue to facilitate the analysis of logico-semantic relations between the two; or by fusing text and video analysis as did Salway, Lehane, and O’Connor (2007).
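Two of the cues discussed in this section can be made concrete with very little code. The sketch below, in Python, is illustrative only and is not an implementation of the Martinec and Salway system: `status_hint` applies the lexical reference cues quoted earlier by string matching, and `logico_semantic_hint` applies the comparison rule for elaboration, extension, and enhancement to two sets of labels. Both function names, the label sets, and the `circumstantial_link` flag are assumptions introduced here; in practice the sets would have to come from information extraction and image analysis, which is the difficult part.

```python
import re

# Cue phrases taken from the examples in the text; the lists are illustrative,
# not an exhaustive or validated inventory.
TEXT_SUBORDINATE_CUES = [r"\bthis painting\b", r"\bfigure \w+ shows\b"]
EQUAL_STATUS_CUES = [r"\bis shown in figure \w+\b"]


def status_hint(text):
    """Return a tentative status relation cued by lexical references, or None."""
    lowered = text.lower()
    if any(re.search(cue, lowered) for cue in TEXT_SUBORDINATE_CUES):
        return "text subordinate to image"
    if any(re.search(cue, lowered) for cue in EQUAL_STATUS_CUES):
        return "equal status"
    return None


def logico_semantic_hint(depicted, mentioned, circumstantial_link=False):
    """Compare what an image depicts with what a text mentions.

    depicted / mentioned: iterables of labels for people, objects, and events.
    circumstantial_link: assumed flag meaning that one side supplies related
    temporal, spatial, or causal information about the other.
    """
    depicted, mentioned = set(depicted), set(mentioned)
    if not depicted or not mentioned:
        return "undetermined"
    if circumstantial_link:
        return "enhancement"
    if depicted == mentioned:
        return "elaboration"  # exactly the same things depicted and mentioned
    if depicted.isdisjoint(mentioned):
        return "extension"  # completely new things on one side
    return "undetermined"  # partial overlap is not covered by the rule
```

Partial overlap is deliberately left undetermined, since the rule as stated covers only the clear cases. A usable classifier would also need the page-layout and image features described above.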
CLOSING REMARKS
To conclude, in broad terms first, the automatic identification of forms peculiar to certain kinds of multimodal narratives can be a starting point in developing data-driven theories that explain their functioning. The data could be used to validate or refute existing theories in a hypothesis-led fashion, but I would argue that the approach lends itself to stimulating new questions and to the development of new theories to account for aspects of narrative and
multimodality that were previously unrecognized, or at least remain as yet to be established. Following the first path certainly, and most probably following the second too, this approach ought to be compatible with many of the diverse theoretical perspectives from which narrative and multimodality are viewed. In the simplest version of the method there would be one step of automatic analysis to identify idiosyncratic features in a corpus of multimodal stories, and then one step to interpret and explain the observed data. However, most likely, to be effective this method will need to be cyclical, so that results from the early steps guide further steps of automated text annotation, the automated generation of databases and the forming of analytical concepts and hypotheses, which will then all feed back into corpus analysis. For example, Herman directs narrative theory so that “the real target of narrative analysis is the process by which interpreters reconstruct the storyworlds encoded in narratives” (Herman 2002, 5), and so that narrative theory will develop through the charting of “constraints on the variable patterning of textual cues with the mental representations that make up storyworlds” (Herman 2002, 12). Previously this charting has started with researchers postulating mental representations and then looking for the textual cues to which they may correspond. Now we may be in a position to offer a complementary approach that reverses the process and starts with the identification of the textual cues—at least those corresponding with the most common mental representations that make up storyworlds. 
From our case studies, it seems that insights into the narrative functioning of multimodal documents can be gained from the statistical analysis of an unannotated corpus.4 However, a fundamental limitation of the approach is that whilst it might facilitate generalizations by drawing attention to frequently occurring forms, it will miss the rarer and, to some, the more intriguing aspects of narrative and multimodality. We must also note that the approach is currently more feasible when surrogate texts are used in place of image and video data. The lack of suitable surrogate texts may then limit the applicability of the approach. That said, ongoing developments in the very active fields of image and video analysis, along with new multimedia encoding standards like MPEG-4 that make the structure and content of multimedia documents more explicit in machine-processable representations, mean that we expect most kinds of multimedia data to become increasingly computationally tractable. Thus, through the interdisciplinary effort of narrative and multimodality scholars, and computer scientists, we should expect much more in the coming years from the computer-based analysis of narrative and multimodality than I have been able to demonstrate here.
ACKNOWLEDGMENTS
My thanks go to colleagues with whom I have worked on narrative and multimodality, and who have helped to shape my thinking. Working with me on the TIWO project to investigate narrative and film from a computational perspective were Elia Tomadaki, Andrew Vassiliou, and Yan Xu; their PhD theses explore in depth some of the issues I touched on only briefly in this chapter. I am very grateful to Radan Martinec for a stimulating and enjoyable collaboration working on image–text relations. My appreciation for narrative has benefited enormously thanks to the help of David Herman and Alan Palmer. Ongoing work with researchers at the Centre for Digital Video Processing, Dublin City University, has helped me to understand a wide range of multimedia content analysis techniques. Needless to say, none of the aforementioned are responsible for flaws in this chapter.
NOTES
1. http://www.cyc.com/.
2. Audio description is produced for the benefit of blind and visually impaired television and film audiences. It is scripted before it is recorded and aligned with the film/television program: our corpus comprises audio description scripts. For more about audio description, see Diaz-Cintas, Orero, and Remael (2007).
3. Such as http://www.connexor.com/demo/tagger/.
4. These ideas are explored with reference to an analysis of monomodal narratives by Salway and Herman (forthcoming).
REFERENCES
Adams, B., C. Dorai, and S. Venkatesh. 2002. “Towards Automatic Extraction of Expressive Elements for Motion Pictures: Tempo.” IEEE Transactions on Multimedia 4 (4): 472–81.
Almas, Y., and K. Ahmad. 2006. “LoLo: A System Based on Terminology for Multilingual Extraction.” In Procs. COLING Workshop on Information Extraction Beyond The Document, eds. M. E. Califf, M. A. Greenwood, M. Stevenson, and R. Yangarber, 56–65. Sydney: Association for Computational Linguistics.
Baldry, A. 2004. “Phase and Transition, Type and Instance: Patterns in Media Texts as Seen through a Multimodal Concordancer.” In Multimodal Discourse Analysis: Systemic Functional Perspectives, ed. K. O’Halloran, 83–108. London: Continuum.
Barthes, Roland. 1977. “Introduction to the Structural Analysis of Narratives.” In Image Music Text, ed. R. Barthes, trans. Stephen Heath, 79–124. New York: Hill and Wang.
Bateman, J. 2008. Multimodality and Genre: A Foundation for the Systematic Analysis of Multimodal Documents. Hampshire: Palgrave Macmillan.
Bateman, J., R. Henschel, and J. Delin. 2002. “A Brief Introduction to the GeM Annotation Scheme for Complex Document Layout.” In Procs. 2nd Workshop on NLP and XML International Conference on Computational Linguistics. Association for Computational Linguistics, eds. N. Ide, L. Romary, and G. Wilcock, 1–8. Morristown, NJ: Association for Computational Linguistics.
Cai, D., S. Yu, J. R. Wen, and W. Y. Ma. 2003. “Extracting Content Structure for Web Pages Based on Visual Representation.” In Lecture Notes in Computer Science (vol. 2642), eds. X. Zhou, Y. Zhang, and M. E. Orlowska, 406–17. Berlin and Heidelberg: Springer.
Diaz-Cintas, J., P. Orero, and A. Remael, eds. 2007. Media for All: Subtitling for the Deaf, Audio Description, and Sign Language. Amsterdam and New York: Rodopi.
Dreyfus, H. 1992. What Computers Still Can’t Do. Cambridge and London: The MIT Press.
Gaizauskas, R., and Y. Wilks. 1998. “Information Extraction: Beyond Information Retrieval.” Journal of Documentation 54 (1): 70–105.
Gonzalez, R., R. Woods, and S. Eddins. 2004. Digital Image Processing Using MATLAB. Upper Saddle River, NJ: Prentice-Hall.
Halliday, M. 1985. An Introduction to Functional Grammar. London: Arnold.
Harris, Z. 1988. Language and Information. New York: Columbia University Press.
Herman, D. 2002. Story Logic: Problems and Possibilities of Narrative. Lincoln and London: University of Nebraska Press.
Hughes, M., A. Salway, G. Jones, and N. O’Connor. 2007. “Analysing Image-Text Relations for Semantic Media Adaptation and Personalisation.” In Procs. 2nd International Workshop on Semantic Multimedia Adaptation and Personalisation, eds. P. Mylonas, M. Wallace, and M. Angelides, 181–86. Los Alamitos, CA: IEEE Computer Society.
Kozloff, S. 2000. Overhearing Film Dialogue. Berkeley and Los Angeles: University of California Press.
Kress, G., and T. van Leeuwen. 1996. Reading Images: The Grammar of Visual Design. London and New York: Routledge.
Lehane, B., and N. O’Connor. 2006. “Movie Indexing via Event Detection.” In Procs. 7th International Workshop on Image Analysis for Interactive Multimedia Services. Incheon, Korea.
Lingabavan, V., and A. Salway. 2006. “What Are They Talking About? Information Extraction from Film Dialogue.” Dept. of Computing, Technical Report CS-06–07, University of Surrey.
Martinec, R., and A. Salway. 2005. “A System for Image-Text Relations in New (and Old) Media.” Visual Communication 4 (3): 337–71.
Mueller, Erik T. 2007. “Modelling Space and Time in Narratives about Restaurants.” Literary and Linguistic Computing 22 (1): 67–84.
O’Halloran, K., ed. 2004a. Multimodal Discourse Analysis: Systemic Functional Perspectives. London: Continuum.
O’Halloran, K. 2004b. “Visual Semiosis in Film.” In Multimodal Discourse Analysis: Systemic Functional Perspectives, ed. K. O’Halloran, 109–30. London: Continuum.
O’Halloran, K. 2009. “Multimodal Analysis and Digital Technology.” In Interdisciplinary Perspectives on Multimodality: Theory and Practice. Proceedings of the Third International Conference on Multimodality, eds. A. Baldry and E. Montagna. Campobasso: Palladino.
O’Halloran, K. L., S. Tan, B. A. Smith, and A. Podlasov. Forthcoming. “Challenges in Designing Digital Interfaces for the Study of Multimodal Phenomena.” Information Design Journal.
Palmer, A. 2004. Fictional Minds. Lincoln and London: University of Nebraska Press.
Royce, T., and W. Bowcher, eds. 2007. New Directions in the Analysis of Multimodal Discourse. Mahwah and London: Lawrence Erlbaum Associates.
Salway, A. 2007. “A Corpus-Based Analysis of the Language of Audio Description.” In Media for All: Subtitling for the Deaf, Audio Description, and Sign Language, eds. J. Diaz-Cintas, P. Orero, and A. Remael, 151–74. Amsterdam and New York: Rodopi.
Salway, A., and D. Herman. Forthcoming. “Digitized Corpora as Theory-Building Resource: New Foundations for Narrative Inquiry.” In New Narratives: Stories and Storytelling in the Digital Age, eds. R. Page and B. Thomas. Lincoln and London: University of Nebraska Press.
Salway, A., B. Lehane, and N. O’Connor. 2007. “Associating Characters with Events in Films.” In Procs. of ACM Conference on Image and Video Retrieval—CIVR 2007, eds. N. Sebe and M. Worring, 510–17. New York: ACM.
Salway, A., and R. Martinec. 2005. “Some Ideas for Modelling Image-Text Combinations.” Dept. of Computing, Technical Report CS-05–02, University of Surrey.
Salway, A., and A. Palmer. 2007. “Describing Actions and Thoughts.” Paper presented at the Advanced Seminar Audiodescription for Visually Impaired People: Towards an Interdisciplinary Research Agenda, University of Surrey, June 27–28.
Salway, A., A. Vassiliou, and K. Ahmad. 2005. “What Happens in Films?” In Procs. of IEEE Conference on Multimedia and Expo, ICME 2005.
Smeaton, A. 2004. “Indexing, Browsing and Searching of Digital Video.” In ARIST—Annual Review of Information Science and Technology, Vol. 38, ed. B. Cronin, 371–407. American Society for Information Science and Technology. Medford, NJ: Information Today, Inc.
Smeulders, A., M. Worring, S. Santini, A. Gupta, and R. Jain. 2000. “Content-Based Image Retrieval: The End of the Early Years.” IEEE Transactions on Pattern Analysis and Machine Intelligence 22 (12): 1349–80.
Song, R., H. Liu, J. R. Wen, and W. Y. Ma. 2004. “Learning Block Importance Models for Web Pages.” In Procs. of WWW 2004, eds. S. I. Feldman, M. Uretsky, M. Najork, and C. E. Wills, 203–11. New York: ACM.
Tomadaki, E. 2006. “Cross-Document Coreference between Different Types of Collateral Texts for Films.” PhD diss., University of Surrey.
Tomadaki, E., and A. Salway. 2006. “Cross-Document Coreference for Cross-Media Film Indexing.” In Procs. of LREC 2006 Workshop on Crossing Media for Improved Information Access, eds. S. Pipendis, H. Coningham, and V. Tablan. European Language Resources Association. Genoa, Italy.
Traboulsi, H., D. Cheng, and K. Ahmad. 2004. “Text Corpora, Local Grammars and Prediction.” In Procs. of 4th International Language Resources and Evaluation Conference, vol. 3, 749–52. Lisbon.
Vassiliou, A. 2006. “Analysing Film Content—A Text-Based Approach.” PhD diss., University of Surrey.
5
Opera Forever and Always Multimodal
Michael Hutcheon and Linda Hutcheon
From its inception in late sixteenth-century Italy, opera has been a multimodal art form, and each mode deployed has always contributed to its complex of musical, verbal, visual, and dramatic meanings. It is no accident that Gunther Kress and Theo van Leeuwen use the image of the Gesamtkunstwerk for multimodality (2001, 1), but live staged opera did not have to wait for Richard Wagner’s christening of that term for the reality of the operatic “total work of art” to come into being.1 But Wagner was perhaps the first composer to think—and create—multimodally: he wrote his own poetic and dramatic verbal libretti; he composed the music; he acted as stage director, helping design sets and costumes; he rehearsed the orchestra and singers; he even built his own theater at Bayreuth. He was the first, in short, to foreground how each semiotic resource brought with it new meaning and did so at every stage of the creation and reception process.2 Today, live performed opera, considered in its social and historical contexts and as materialized by staged sounds, sights, and body movements, has joined the previously accepted score and libretto as constituting the aesthetic identity of operatic works. In this art form, what Kress and van Leeuwen call production media (such as the singers’ voices, gestures, motions; the orchestra’s musical sounds; the stage action and sets, etc.) are coupled with design modes (the musical score and libretto, the director’s interpretive plan, the various designers’ and performers’ visions, etc.)
to create “a particular and quite specific form of the social organization of semiosis” (2001, 67).3 In an even more complicated and yet, strangely, even more obvious manner than, say, a film or a stage play, a pantomime or a Lieder recital, opera tells and enacts stories multimodally through its verbal text and its dramatic (staged) action, as well as through its music.4 Wagner’s use of what have come to be called leitmotivs offers the clearest case of musical narration, of how music can encode meaning: a melody or a sequence of notes comes to be identified directly with, for example, a character, a plot action, or a place, and the subsequent repetition (and musical development) of this musical idea takes on narrative meaning, telling the music-listening audience details that the music-deaf characters may not know.5 Because music is what opera “is” to so many people—from musicologists
to audiences—and because aspects of this dimension (rhythm, timbre, melody, harmony) have already been treated to this kind of semiotic analysis (see van Leeuwen 1999), we will instead concentrate here on what else opera also “is” multimodally, while accepting that vocal music is clearly the defining characteristic of opera as a theatrical mode.6 Since the rise of Regietheater (director’s opera), with its return to the principles of the Gesamtkunstwerk, we have been reminded that live opera is both multimedial—the eye and the ear are addressed by different material media—and multimodal, engaging voice and music, but also language, gesture, visual architectural form, color, and many other semiotic resources. Unlike much work on multimodality (e.g., Kress and van Leeuwen 2001, [1996] 2006), our focus here will not be on encoding or intention but rather on the semiotics of reception of this complex operatic “grammar” (Kress and van Leeuwen [1996] 2006) and its interacting parts. Today’s audience is not Wagner’s first 1876 audience; cultural and historical specificity determines reception, in the opera house as elsewhere. As modern audience members attending Wagner’s tetralogy, Der Ring des Nibelungen, we enter the theater on the second evening to see Die Walküre. Since the interpretation of semiotic production (the multimodal event we are describing here) is not “passive reception” (Kress and van Leeuwen 2001, 67) and although all communication is interactive, given that its very materiality inevitably affects us (71), we also need to take into account the fact that we bring with us what hermeneutics calls a “horizon of expectation.” Part of that horizon stems from our previous experience with theater and opera,7 but part also comes from the previous day’s exposure to the “preliminary evening to the stage festival,” Das Rheingold.8 But, as an audience, we are a heterogeneous group.
Some of us are seeing this work for the first time. Our major interaction in this case, from our own experience and that of many students and friends, is likely to be with the music or the story, depending on whether we are more musically or narratively oriented. The Ring, known for its powerful and complex music, also tells a mythic tale of gods, dwarves, giants, and humans, all struggling for control of the world through the power symbolized by the ring of the title. The world we see in the first work is already a corrupt one that is eventually destroyed—yet also redeemed, in the end, through love and sacrifice. However, given the relatively small number of works in the standard operatic repertoire, some of us in the audience on that second night will have seen one or more previous productions of the Ring. For us, the story is already well known, but the way it is told is what is going to define our multimodal narrative experience. This is because the physical move from what theater semiotician Keir Elam calls the “dramatic texts” (1980, 3)—or the design—to live performance with actual material bodies in actual physical space is an act of narrative production of both the verbal libretto and the musical score. Every element that we see and hear is transformed, indeed defamiliarized, by being placed materially on the stage: an object
or a person is suddenly rendered “significant”—as if placed in quotation marks, made an intentional sign or a signifying agent (Elam 1980, 8–9). As Petr Bogatyrev succinctly put it: “all that is on stage is a sign” (cited in Elam 1980, 7). And all those staged signs, along with the musical ones, create a fictional world “with its own dynamics and governing rules” (Bennett 1990, 149). Susan Bennett argues that this performative mise-en-scène replaces the authors’ printed texts as the “creative aspect in the signifying process” (1990, 5); but in multimodal terms, what occurs is that production adds to design here to accumulate meaning(s). At any given moment, that multimodal mise-en-scène is structured or organized to give emphasis to a sign or sign-cluster intended to localize the audience’s focalization on that aspect of the drama (Bennett 1990, 160), generating new meanings, perhaps different from those intended by the design of the “dramatic text.” A singer’s voice or a set detail can communicate directly, adding meanings not prefigured by the design (Kress and van Leeuwen 2001, 66)—meanings to which we respond cognitively and affectively. Production does not just “realize” design, in other words. For instance, in the “dramatic text,” the stage directions for the opening of Die Walküre read, in translation: “The curtain rises. The interior of a dwelling. The room is built around the trunk of a mighty ash-tree, which forms its central point. Downstage left is the hearth, behind it the storeroom; at the back of the stage is the main entrance to the room; upstage right, steps lead up to an inner chamber; on the same side, at the front of the stage, is a table, behind it a broad bench, let into the wall, and wooden stools in front of it” (Wagner 1993, 122). In the early 1990s, we attended three very different productions of the Ring that materialized these directions in three very different ways. 
In the Metropolitan Opera in New York City, a scene very much like the one Wagner described was displayed when the curtain opened at the start of the second evening. In Bayreuth, Germany, in the theater that Wagner himself built, the parting curtains revealed, instead, a bleak, desolate scene: an empty stage, lit from the side by lights projecting across it like an airport runway, extended back far into the distance. The only notable feature was the shattered remnant of a tree, without leaves or branches, protruding from the front of the stage. This was a picture of total destruction and emptiness. In Brussels, at the Théâtre de la Monnaie, we saw something different again but familiar from the first night, since it is only possible to use one set in this small theater’s stage space: a room containing a piano, some rocks, some furniture, with a large picture window at the back framing a view of the Untersberg. In fact, it was the very view from Hitler’s retreat at Berchtesgaden, made popular in a famous photograph of the Nazi leader.9 How did these different opening scenes affect the audience members—in terms of emotional response, but also in terms of making sense or meaning of what they witnessed? These different visual signifiers, we will argue, organized the interpretive strategies of the audience members. That is, they
Michael Hutcheon and Linda Hutcheon
both shaped and limited the semiosis and thus the interpretation possible when viewing these different productions. Again, we take for granted that the aural dimension—the musically represented storm the audience is hearing—is important. Each conductor10 and each orchestra made this music sound different: the dynamics, the tempi, and the relative emphasis varied, but not as much as did the visual dimensions of these three productions. While different musical meanings were undoubtedly constructed by audience members, the music was less significant to the interpretative reading of difference here than were the other operative modes. Like theater audiences, opera audiences are never passive. As semiotician Anne Übersfelt has taught us, being a spectator (and listener) involves being part of a “receiver audience”—actively decoding, sorting through information (discarding some details, accepting others), in other words, interpreting.11 She argues that we follow a story and constantly “reconstruct the total figure of all the signs engaged concurrently in the performance.” We are asked to engage and identify and at the same time to back off and distance ourselves from what we are seeing and hearing (Übersfelt 1999, 22–23). If meaning is, in the end, our reading of the use made of the stage (Pavis 1989, 129), then the stage is clearly a place of complex and potentially open signification, as many have argued. Obviously, no two audience members see or hear the same opera; then again, no two performances are ever the same. But it is the task and responsibility of the mise-en-scène to constrain and direct the variety of possible interpretations and responses. Bayreuth’s stage may have been all but empty for that opening scene, but it nevertheless had visual signifiers that both opened up and simultaneously limited the audience’s meaning-making process: light, color (or a resolute lack thereof), depth of stage, the blasted tree. 
In contrast, the Metropolitan’s stage replicated in material form almost exactly Wagner’s stage directions, but the audience had to make meaning of this nonetheless, if only to understand that this was to be a “faithful” or antiquarian production. In Brussels, the opening scene was decidedly odd, but already familiar from Das Rheingold the night before, where we had already begun our attempts to decode the visual mode: the odd collection of objects and images on the stage. The designers of these strikingly different scenes12 were each part of a team of artists materializing their designs into its multimodal narrativizing on stage. Opera, unlike contemporary digital media, is still a collaborative art form and therefore is still the domain of “discrete professions and their practices” (Kress and van Leeuwen 2001, 47). Opera, as Jean-Jacques Nattiez (1990, 74) has argued, following Nelson Goodman (1968, 129), is an “allographic” art form, not an “autographic” one. With autographic art we experience in production exactly what the artist materially created: we see the paint on the canvas, as applied there by the artist’s hand. With an allographic art like opera, we see instead the results of an entire team, all interpreting the dramatic texts and developing their own way of
making meaning, that is, of telling this story. At this production stage, the orchestra and conductor work primarily with the score; the director, the (set, costume, and lighting) designers, and all those who make their visions actual on stage (from wig makers to makeup artists, from stagehands to set builders) use the libretto as their main point of departure, though the director must obviously also consider the score. The singing actors work with both libretto and score and with the director’s design concept. They not only express themselves vocally; they also move, gesture, show emotions facially, and so on, as they act out the story. It is the mise-en-scène—that is, the theatrical construct that brings together all these semiotic resources of the operatic grammar—that tells this story, and tells it to both the neophyte opera spectator and to the veteran. Some mise-en-scènes are what we could call traditional; others are different, indeed innovative.13 In the traditional vein, the Metropolitan’s production, directed by Otto Schenk, followed Wagner’s stage directions faithfully—from set design to character action—in what was clearly an attempt to recreate the flavor of Wagner’s own nineteenth-century staging. The story was told in a direct manner, with no emphasis on any of the textual or motivational ambiguities that have fueled over a century of critical and directorial debate. As a result, however, a certain amount of the interpretive complexity and intriguing uncertainty that are part of an experienced audience’s sense of the Ring was lost. The dramatic texts were still narrativized and the design was still materialized in production; all the visual and aural channels of communication were operative. But the story, while easier to follow for the novice, was frustrating for those who knew other productions or the dramatic texts themselves. 
The Bayreuth version, directed by the (formerly East German) enfant terrible Harry Kupfer, was, on the contrary, an interpretation with a distinct theme and a provocative thesis. This was dubbed a “Green” Ring, a post–nuclear holocaust tale that took considerable liberties with the dramatic text of the libretto. The production offered scene after scene of ruined industrial equipment and waste: a destroyed boiler, a shattered factory. For a work that thematizes nature, there was little green on this stage. But the opera’s theme is more specifically one of nature threatened; we have moved on, for here nature is already destroyed. The director constructed an interpretive frame for the audience by having characters appear on stage in scenes not indicated in the libretto: in this version, the CEO of the gods, Wotan, controls and manipulates everything, staying on stage or appearing when not expected. In Brussels, director and designer Herbert Wernicke also created a frame, but using different semiotic means. The initial sets visually alluded to Germany in the 1930s, with characters dressed to suggest an etiolated and declining European nobility, confronting others costumed to suggest the historical figures of fascism. We were clearly in the nightmare of the mid-twentieth century, probing its origins. The provenance of these figures
is relevant to their meanings. As Kress and van Leeuwen point out, signs may be “imported” from one context into another “in order to signify the ideas and values associated with that other context by those who do the importing” (2001, 23). Here, we are also dealing openly with the history of the Nazi appropriation of Wagner’s music as the theme song for the Third Reich. But here the visual mode dominated interpretation: the repetition, often with ironic difference, of some of these costumes portrayed, visually, a narrative of multigenerational continuity—but also of decline. In the second opera, a slim, energetic young father (Siegmund, played by Gary Bachlund) dressed in lederhosen gave way, in the next, to a heavy, clumsy older son (Siegfried, sung by William Cochran) wearing the same costume, but to utterly different effect. Here the Teutonic Wälsung race was not victorious (as the Nazis attempted to assure); it was visibly degenerating. As is evident from these descriptions, the production can and does involve many different modes and media. The visual impact of sets, lighting, color, and perspective, to mention only a few, is part of that multimodality in that everything works to configure the interpretation of the audience. In Brussels, Wernicke also used silent-film-like text projections of the stage directions, thereby linking the semiotic mode of language as printed text to visual image (in a cinematic medium). Here Wernicke joined many other stage directors today who are turning to the semiotic resources of video and film to enlarge the possibilities of stage effects.14 When singing actors step on stage, they too become part of the visual semiotic grammar—through, as we have seen, costuming, movement, gesture, facial expression, blocking.
They also, of course, operate within a musical mode: their particular way of interpreting their characters and their lines contributes to the audience’s act of making meaning of what we are hearing as well as seeing. A forcefully articulating Wotan (John Tomlinson in the Bayreuth staging of Die Walküre) will be decoded differently—as more aggressive—than a (deliberately) weaker one (Franz Ferdinand Nentwig in Brussels). Voice quality itself, of course, is a semiotic resource, even in opera where arguably a single style dominates and singers are judged by the degree to which they excel in it. The material qualities of voices—their tension, roughness, breathiness, loudness, pitch range, vibrato—all make meaning (Kress and van Leeuwen 2001, 82–85). We also hear diegetic noises that arise from the action on stage: a spear striking the stage with great force, a huge building crashing down at the end, and from these too we make meaning.15 To return to the three different productions, a study of the multimodal presentation of the central character of the Ring will give some sense of the possible differences in the semiotic interaction of the different modes. Arguably, Wotan (whom we earlier described as the CEO of the gods) is the protagonist of the story, and in the Metropolitan Opera production, the handsome James Morris, with flowing locks and royal blue robes, was majestic, noble, and dignified. Dominant and dominating in demeanor, costume, and gesture, he was the tragic hero par excellence. At this moment
in his career, Morris possessed one of the most beautiful and mellow of voices, and these sonorities, together with his handsome and grand presence, no doubt were among the reasons why he was cast. In contrast, Bayreuth’s John Tomlinson’s voice, with a totally different and less conventionally beautiful timbre, was a dramatic vehicle of greater force and drive. This Wotan radiated aggressive energy; through his large physical movements, he dominated the stage, wearing a long trench coat with a large fur collar and high boots. Physically violent, emotionally extravagant, this Wotan could not be ignored when he was on stage. Most notably, however, he had strikingly red hair—as would all the Wälsungs we would meet on stage. This was semiotically fitting, for their lineage was, after all, their fate, and through the materiality of color, the audience came to understand this. In contrast to the drive of this Wotan, in Brussels Wernicke chose to use three different singers for the three operas in which Wotan appears. The impact of this decision on the audience, whether novice or experienced, is inevitably a loss of continuity of identification. To this Wernicke added characterization that showed a Wotan growing progressively weaker and losing power—in his visual gestures and actions and in his vocal energy—thereby supporting his interpretation of this as a narrative of decline rather than triumph. The production in each case used all the sensory channels of communication available to it in order to encode materially the various designs. But will what is encoded necessarily be decoded in the same way by all the members of the audience—or, for that matter, by any of them? Let us look at one final example to investigate this issue: the very end of the four nights of the Ring.
Here are the stage directions for the end of Götterdämmerung—calculated to strike terror into the heart of any opera director or designer:
With a single bound she [Brünnhilde] urges the horse into the blazing pyre. The flames immediately flare up so that the fire fills the entire space of the hall and appears to seize on the building itself. Horrified, the men and women press to the very front of the stage. When the whole stage seems to be engulfed in flames, the glow suddenly subsides, so that soon all that remains is a cloud of smoke which drifts away to the back of the stage, settling on the horizon as a layer of dark cloud. At the same time the Rhine overflows its banks in a mighty flood, surging over the conflagration. The three Rhinedaughters are borne along on its waves and now appear over the scene of the fire. . . . A red glow breaks out with increasing brightness from the cloudbank that had settled on the horizon. By its light, the three Rhinedaughters can be seen swimming in circles and merrily playing with the ring on the calmer waters of the Rhine, which has little by little returned to its bed. From the ruins of the fallen hall, the men and women watch moved to the very depths of their being, as the glow from the fire grows in the sky. As
it finally reaches its greatest intensity, the hall of Valhalla comes into view, with the gods and heroes assembled as in Waltraute’s description in Act I. Bright flames seem to flare up in the hall of the gods, finally hiding them from sight completely. (Wagner 1993, 351)
While this is (perhaps) happening on stage, the audience hears the culmination of fifteen or more hours of leitmotivs, coming together in all the semiotic richness of which Wagner’s music is capable. The Metropolitan production, as might be expected by now, came closest to reproducing what Wagner’s text demands: the Gibichung hall collapsed very dramatically; the Rhine waters overflowed, thanks to the effective scrim used in Das Rheingold; Gil Wechsler’s lighting made the fire glow. Conductor James Levine, who consistently preferred to enlarge even further the big dramatic moments of the opera’s music through slow tempi and expansive dynamics, ended Götterdämmerung true to form. Here the ironic pomposity of even the Valhalla leitmotiv was transformed into something more dignified and glorious to support the interpretation of these gods as tragic heroes. All the narrative’s ambiguities—from character motivation to the ethical consequences of their actions—were left unexplored. Audience members, whether new to the Ring or old hands, would not have much trouble decoding what was encoded here. Such was not the case at all in Kupfer’s Bayreuth production, where the ambiguities persisted, remaining unresolved to the end. The director set up the final moments by having Wotan appear physically on stage in a scene in which the dramatic text definitely does not have him present: he entered to throw the shards of his shattered spear onto the funeral bier of Siegfried, thereby signaling his acceptance of not only the end of the Wälsung line but also of his own and his world’s necessary demise.
With this accomplished, Kupfer ended the opera with a stage filled with people in evening dress—looking much like the audience—drinking cocktails and watching, presumably, the apocalypse of the gods on multiple television sets. Two children, a boy and a girl, walked across the stage and off, guided by a flashlight (a beacon of hope?), but watched very attentively by Alberich, the character whose stealing of the Rhine gold began the cycle of destruction we have just experienced. In other words, the cycle may recommence—and no one cares. For some of the audience, this was a postmodern, ironic, self-reflexive jab at the audience, figured as content to passively watch TV (or an opera) instead of caring about the fate of the world. However, from our experience and from reviews at the time, two responses were possible to this same decoding: anger at this negative representation or else ironic (even sardonic) recognition and even acceptance of the implied judgment. In other words, even the same decoding need not inspire the same reaction. Others in the audience at Bayreuth were simply confused and reacted in various ways to their confounding. From their comments afterwards, many, however, seemed simply annoyed that this nontraditional production
deliberately undercut the dramatic glory of the musical apotheosis. In the early years of the production (1988–1991), conductor Daniel Barenboim appeared to us to downplay the music’s aural splendor, in keeping with this deliberately ironic and deflating interpretation. Yet, in the third and last cycle of the final year (1992), that all changed and audiences heard something much more traditionally magnificent. Was this a change in interpretation? Was it a new way of reading the work—playing the grand music off against the ironic dramatic action? Or was it simply a celebratory ending for five years’ worth of collaborative artistic achievement? In Brussels, Wernicke worked even more against audience expectation at the end than did Kupfer. The stage emptied of people, leaving behind a heap of bodies—of all those who had died of the ring’s power during the four nights’ action. (This mass had been building over the four nights, as bodies were left to accumulate on stage.) Nothing else happened; the music rose to its climax, but not in any particularly memorable fashion. Just before it ended, a singular visual and aural coup de théâtre worked to direct—and limit—the possible readings of the end of this narrative of decline and degeneration. With a tremendously loud crash, a small bulldozer destroyed the left part of the stage wall; a workman in a hard hat entered the room/stage, looking confused and then startled, when he realized he was witnessing a mass grave. If the audience members knew that the Brussels opera house was situated very close to a local mass grave from another era, they would have made the connection with the production’s theme of the constant and continuing self-destructive power of humanity. Three different endings, three materially different interpretations encoded; many different ones potentially decoded. There are many other ways in which a production can challenge an audience.
It can, for example, radically reconfigure the entire spatial relation of pit/stage/audience. In 1999, the Netherlands Opera Ring production in Amsterdam changed both orchestras (for the different nights) and their physical placing in the hall: it moved the musicians to a different place for each work—on stage, to one side, in the normal pit, surrounded on all sides by the stage. This physical shifting altered not only the sound—making the music (and the music-production process) a visible and active semiotic participant in the audience’s theatrical experience—but also the visual possibilities for stage action. One final aspect of live opera in performance that we have not mentioned thus far (because none of the three productions discussed in detail here in fact used them at the time) is the verbal and visual presence of surtitles, whether projected at the top of the stage or seen and read on small individual screens visible from each seat. Invented by the Canadian Opera Company in 1983 for their production of Richard Strauss’s Elektra, the now ubiquitous surtitles are not usually full translations of what is being sung. They are abridgments and may not actually be accurate as literal translations at all. Peter Sellars’s updated and indigenized diction in his Mozart/da
Ponte trilogy surtitles represents one extreme; various self-censoring bowdlerizations of politically incorrect wording form the other. For some viewers, surtitles are a visual distraction, drawing attention away from the stage action in their verbal and visual interplay. For others, they provide a rapid and not at all disconcerting way to follow that action with ease. There is little doubt that surtitles have contributed to the democratization of opera as an art form. In this they function like the verbal/visual subtitles on DVD/television/film versions of operas—versions that in themselves have played an important role in bringing opera out of the (expensive) opera house and into the realm of more popular culture.16 The live staging of the visual and aural does not exhaust, arguably, the multimodal meaning-generating possibilities of opera, even if these do remain dominant. Every opera has a set of paratexts that exist as part of the audience’s context of interpretation and therefore the horizon of expectation when generating semiotic meaning. The (verbal/visual) program is the most obvious guide for audiences, setting the scene for our theatrical experience as both culturally significant and meaningful, in part by establishing those expectations. In some European theatres, like Bayreuth, however, unless we purchase it, we will have no such assistance; in North America, we will usually receive a free booklet that includes a cast list, a plot summary, images from the production, and often an interpretive article on the opera, as at the Metropolitan Opera. The ultimate extreme of paratextual meaning-generating must be Herbert Wernicke’s extremely large and extensive libretti-programs, available for purchase at the Brussels Ring.
What made these volumes special was that the verbal text was accompanied by a series of heterogeneous visual images—photographs, paintings, newspaper clippings—that had inspired the director/designer in his multimodal production of the Ring. For instance, a photo of Mussolini swimming in the Adriatic sat alongside the text of Alberich entering the Rhine waters at the start of Das Rheingold: in this production, their shared bald heads not only made the Nibelung dwarf suspect from the start, but established the associative temporal setting in fascist Europe. Other paratexts that guide and shape audience interpretation and response include reviews and interviews. And, almost every production of the Ring is accompanied by a series of seminars or a conference, aimed at contributing to the audience’s context and therefore understanding of the work. Local radio or television programming may work to this end as well as may exhibits at local museums, libraries, or the theater itself. This is how the interpretive community of a production is created. The audience’s context of reception, however, is not the only context to consider in the multimodal grammar of opera. The production is the work of a team that creates meanings within the context of other, earlier stagings. James Treadwell has argued that there is an accumulated vocabulary of Wagner productions, indeed that each new production is in dialogue
with all the others before it and with the history of the progression in terms of style and, we would suggest, meanings (1998, 220). Yet the same is true for the audience member who has seen more than one production. When we attended the Canadian Opera Company’s Ring in 2006, we saw in designer Michael Levine’s bleak rendition of destruction in the sets of Die Walküre an echo of Kupfer’s post–nuclear holocaust world; we interpreted the modern office setting of the Gibichung “hall” as a recalling of Jürgen Flimm’s millennial Ring at Bayreuth. This intertextual memory also contributes to the meaning we construct as audience members. Each production of an opera is literally a retelling of the story and each is different. As an art form, opera has at its disposal a wide range of aural and visual channels and codes of communication, so that what we see and hear on stage is an embodied fictional world rich in meaning possibilities—possibilities whose decoding the encoding theatrical (and musical) mise-en-scène attempts to restrict and control. That it cannot—for the individual opera audience member, either novice or veteran—is part of the semiotic adventure of opera.
NOTES
1. Many other art forms today could qualify for this honorific, of course—most obviously, film and certain new electronic media. And even in the scholarly study of opera, once the monomodal domain of musicology, interdisciplinarity (and therefore multimodality) has become the new norm. See Hutcheon (2006).
2. In other words, unlike the dominant operatic practice of the time, his singers did not bring their own costumes, interpolate their own arias, or decide for themselves, in the absence of any director, how to act on stage.
3. We use throughout the language of Kress and van Leeuwen’s (2001) four domains of practice (or strata) of multimodal resources available to make meaning in any mode: in the realm of content (or “mental work” [2001, 68]), discourse and design; in the realm of expression, production and distribution. We shall italicize these terms for the sake of clarity of reference.
4. For a fuller analysis of how these specific elements interact to “narrativize,” see Hutcheon and Hutcheon (2005).
5. Carolyn Abbate puts this reality of opera’s musical identity best when she writes: “In opera, the characters pacing the stage often suffer from deafness; they do not hear the music that is the ambient fluid of their music-drowned world. This is one of the genre’s most fundamental illusions” (1991, 119).
6. Obviously musical theater in general can be defined this way, but Peter Rabinowitz (2004) argues that there are different interpretive strategies involved for audiences of the two genres of musicals and opera.
7. Other possible specific contexts for audiences here might include knowing Wagner’s source texts (see Levin 1998) or influences on him (see Nattiez 1990, 74–82), information about Wagner’s life and the fate of his music at the hands of the Nazis (see, e.g., Gutman 1990; Rose 1992), other productions, information about this specific performance, how hard it was to get tickets, and so on.
Susan Bennett argues that these all form part of the outer frame of the horizon of expectation of the audience (1990, 149).
8. For Bennett (1990, 149), this would be part of the inner frame of the horizon of expectation.
9. The semiotic issue of provenance, crucial to the meaning of this imported imagery, will be discussed later.
10. Respectively, James Levine, Daniel Barenboim, and Sylvain Cambreling.
11. The contrary view of interpretation in performance sees live performance as resistant to audience comprehension (Garner 1989) or even as “illegible” (Treadwell 1998, 213).
12. Respectively, Günter Schneider-Siemssen, Hans Schavernoch, and (director) Herbert Wernicke.
13. See David Levin’s (1997) terminology—innovative/traditional, critical/literalist, and strong/weak—and James Treadwell’s (1998) challenge to these.
14. Cinema director Atom Egoyan used film both to frame and to enhance the meaning of certain scenes in his production of Richard Strauss’s Salome for the Canadian Opera Company in 1996. The most elaborate operatic use of video to date, however, is Bill Viola’s full-length video for Peter Sellars’s production of Wagner’s Tristan und Isolde for Los Angeles (2004) and Paris (2005).
15. In some opera houses, we may even go beyond the visual and the aural to experience the haptic: the house may vibrate with the music and sound.
16. Clearly this is what Kress and van Leeuwen call distribution and there is no doubt that its editing, camera angles, and other related technical dimensions add meaning, and indeed can change meaning.
REFERENCES
Abbate, C. 1991. Unsung Voices: Opera and Musical Narrative in the Nineteenth Century. Princeton, NJ: Princeton University Press.
Bennett, S. 1990. Theatre Audiences: A Theory of Production and Reception. London and New York: Routledge.
Elam, K. 1980. The Semiotics of Theatre and Drama. London and New York: Routledge.
Garner, S. B. 1989. The Absent Voice: Narrative Comprehension in the Theater. Urbana and Chicago: University of Illinois Press.
Goodman, N. 1968. Languages of Art. New York: Bobbs-Merrill.
Gutman, R. W. 1990. Richard Wagner: The Man, His Mind and His Music. New York: Harcourt Brace Jovanovich.
Hutcheon, L. 2006. “State of the Art: Interdisciplinary Opera Studies.” The Changing Profession series, PMLA 121: 802–10.
Hutcheon, L., and M. Hutcheon. 2005. “Narrativizing the End: Death and Opera.” In A Companion to Narrative Theory, ed. J. Phelan and P. J. Rabinowitz, 441–50. Oxford: Blackwell.
Kress, G., and T. van Leeuwen. [1996] 2006. Reading Images: The Grammar of Visual Design. London and New York: Routledge.
Kress, G., and T. van Leeuwen. 2001. Multimodal Discourse: The Modes and Media of Contemporary Communication. London: Hodder Arnold.
Levin, D. J. 1998. Richard Wagner, Fritz Lang, and the Nibelungen: The Dramaturgy of Disavowal. Princeton, NJ: Princeton University Press.
Levin, D. J. 1997. “Reading a Staging/Staging a Reading.” Cambridge Opera Journal 9:47–72.
Nattiez, J.-J. 1990. Music and Discourse: Toward A Semiology of Music. Trans. Carolyn Abbate. Princeton, NJ: Princeton University Press.
Pavis, P. 1989. “Production, Reception, and the Social Context.” In On Referring in Literature, ed. A. Whiteside and M. Issacharoff, 122–37. Bloomington: Indiana University Press.
Rabinowitz, P. 2004. “Music, Genre, and Narrative Theory.” In Narrative across Media: The Languages of Storytelling, ed. M.-L. Ryan, 305–28. Lincoln and London: University of Nebraska Press.
Rose, P. L. 1992. Wagner: Race and Revolution. New Haven: Yale University Press.
Treadwell, J. 1998. “Reading and Staging Again.” Cambridge Opera Journal 10:205–20.
Übersfelt, A. 1999. Reading Theatre. Trans. F. Collins. Toronto: University of Toronto Press.
van Leeuwen, T. 1999. Speech, Music, Sound. London: Macmillan.
Wagner, R. 1993. Wagner’s Ring of the Nibelung: A Companion. Trans. S. Spencer. London: Thames and Hudson.
6
Word-Image/Utterance-Gesture Case Studies in Multimodal Storytelling David Herman
INTRODUCTION
The Nexus of Semio-Logic and Story Logic
This chapter explores the research challenges and opportunities presented by narratives that exploit more than one semiotic channel to evoke a storyworld. To outline directions for inquiry into multimodal storytelling, I discuss two case studies: on the one hand, word–image combinations in comics and graphic novels; on the other hand, utterance–gesture combinations in videotaped occasions of face-to-face narration. With the word–image combinations, I consider strategies for analyzing a key aspect of narrative structure—namely, character or role—in print texts that deploy a visual as well as a verbal information track. Then, turning to gesture use in face-to-face storytelling, I draw on gesture research, theories of deixis, and recent work on space and place to investigate how utterances and gestures interact in real-time narration. My chapter thus focuses on two quite different case studies, with the first case study emphasizing issues of story interpretation and the second case study foregrounding issues of story production; in this way, I seek to suggest the diversity of the corpora, disciplinary frameworks, and methods of analysis relevant for research on narrative and multimodality. What is more, to anticipate remarks that I expand upon in my concluding section, the approach developed here suggests the need to bring into closer dialogue two strands of postclassical narratology that have for the most part been pursued separately up to now. The two strands at issue are transmedial narratology, or the study of narrative across media, and cognitive narratology, or the study of mind-relevant aspects of storytelling practices, wherever—and by whatever means—those practices occur.1 In the remainder of this section, I sketch the foundational concepts and key research questions that guide my analysis.
Outlining working definitions of mode as well as narrative, I also present what I take to be the general semiotic structure underlying multimodal narrative representations. The following section then uses a page from The Incredible Hulk comics to explore how tools for narrative analysis can be brought to bear on one class of stories instantiating this general semiotic structure, namely, comics and graphic novels—while also considering ways in which the study of graphic storytelling may in turn provide new directions for narrative theory. In shifting to gesture use in face-to-face storytelling, my third section emphasizes the relevance of a further distinction between kinds of narrative representations, one that cuts across the monomodal/multimodal distinction. This is the contrast between worlds that are evoked exophorically, as when a storyteller points at or uses verbal deixis to allude to features of the current communicative context, and worlds that are evoked endophorically, as when the storyteller prompts his or her interlocutors to shift from the here and now to the different spatiotemporal coordinates of earlier situations and events being recounted in the narrative. On-site face-to-face storytelling, for example, offers possibilities for exophoric reference not available in off-site narration or, for that matter, in graphic storytelling in print texts.2 Overall, and in line with the program for inquiry presented in other work (Herman 2004, 2009a), my chapter examines two issues of central importance for research on narrative and multimodality: (a) the narrative-enabling and narrative-constraining effects of the interaction among multiple semiotic channels, and (b) the cognitive protocols that support the design and interpretation of narrative structures emerging from this interplay of semiotic resources.
Foundational Concepts and Key Research Questions
It may be helpful to sketch out working definitions of some of the basic concepts undergirding my analysis, including “mode” (versus “medium”) and “narrative.” I build on Kress and van Leeuwen (2001) and Jewitt (2006) in distinguishing between modes and media. In this work, modes are semiotic channels (better, environments) that can be viewed as a resource for the design of a representation formulated within a particular type of discourse, which is in turn embedded in a specific kind of communicative interaction. By contrast, media can be viewed as means for the dissemination or production of what is being represented in a given mode; thus media “are the material resources used in the production of semiotic products and events, including both the tools and the materials used” (Kress and van Leeuwen 2001, 22). Conversational storytellers, for example, can use two semiotic modes to design verbal as well as visual (gestural) representations in narratively organized discourse, which both reflects and helps create a particular kind of communicative interaction—one that can (though it need not) facilitate an extended turn at talk by the party seeking to convey information about a storyworld. In turn, spoken language and gesture constitute expressive media by virtue of which the representations at issue can be produced and distributed in a more or less localized way: more localized if there is no secondary recording apparatus to disseminate the story in, e.g., the medium of video accompanied by sound; less localized if the storytelling process is videotaped. Further, when communicative interactions are remediated in this way, the medium chosen can affect whether the original multimodality of the interactions is preserved or lost. Thus an audio-recording of a face-to-face storytelling situation not only remediates the interaction but also transforms it into a monomodal representation. The reverse is true when a novel or short story is remediated as a movie. Meanwhile, I define narrative as a type of representation that is situated in—must be interpreted in light of—a specific discourse context or occasion for telling, and that cues interpreters to draw inferences about a structured time-course of particularized events (in contrast with general patterns or trends). In addition, the events represented are such that they introduce disruption or disequilibrium into a storyworld, whether that world is presented as actual or fictional, realistic or fantastic, remembered or dreamed, etc. The representation also conveys what it is like to live through this storyworld-in-flux, highlighting the pressure of events on real or imagined consciousnesses undergoing the disruptive experience at issue.3 Research on narrative and multimodality suggests the importance of examining how the interplay between different semiotic channels or “information tracks” affects stories viewed under each of the three profiles implied by this definition: as a kind of communicative practice, as a textual structure that emerges from (or constitutes a more or less partial record of) this communicative activity, and as a mentally projected storyworld. Or rather, to put the matter in more dynamic terms, the question is how multimodality bears on the process of storyworld (re)construction.
At issue is how producers and interpreters of narratives—in a given communicative context—make principled use of semiotic cues to build up a more or less richly detailed representation of the world to which they relocate (or deictically shift) while engaged in the production and parsing of narratively organized discourse. This process entails (re)constructing the space-time configuration of narrated events, together with an ontology for the narrated world—in other words, a model of the entities, together with their properties and relations, that exist within the narrated domain.
Semiotic Modes and Types of Narrative Representation: A Taxonomy
To investigate these foundational issues, I outline in the present subsection a taxonomy of story kinds that captures the distinction between monomodal and multimodal narratives, and that points to two different ways in which the process of narration (of whatever sort) can evoke storyworlds. I also factor in the distinction between endophoric and exophoric narrative representations. The remainder of my chapter builds on the taxonomy to examine ways in which story logic can be articulated with this general semio-logic, using the case studies to explore aspects of the mapping relations schematized in Figures 6.1–6.4.
Figure 6.1 shows the case of monomodal narration in which the text level of the narrative representation, which by definition consists of one semiotic channel (e.g., print text, or sign language, or a projected image track that lacks an associated audio track—for example a silent film), is anchored in a reference world that likewise constitutes the storyworld. For instance, a fictional narrative in print can use written language to evoke a nonactual reference world, i.e., a storyworld to which interpreters must make an ontological as well as deictic shift. By contrast, Figure 6.2 represents the case of monomodal narration in which the single semiotic channel being exploited is used to evoke more than one reference world; as a result, constructing a mental model of the narrated domain requires projecting these reference worlds into a blended conceptual space (Fauconnier and Turner 2002). Two subtypes of this second kind of monomodal narration can be distinguished. On the one hand, there is a subtype in which none of the reference worlds coincides with the setting in which the narration itself is produced, as when a fictional text evokes the private mental worlds of the characters as well as the “text actual world” (Ryan 1991) to which those characters orient as real, or when a fictional narrative contains a framing as well as a framed tale. On the other hand, there is a subtype in which one of the reference worlds evoked is that in which the process of narration is currently unfolding, as when a user of sign language who is telling a story on-site points to a feature of the landscape to help situate in space the past events that he or she is recounting via that same semiotic channel. We can term the first subtype endophoric monomodal narration and the second exophoric monomodal narration. Figures 6.3 and 6.4, meanwhile, represent cases of multimodal narration—to which the same distinction between endophoric and exophoric reference applies.
In Figure 6.3, multiple semiotic channels are used to evoke a single reference world, which maps directly onto the narrated domain. This setup obtains when information about the current state of a storyworld is conveyed through a coordinated interplay of words and images in comics and graphic novels, or when a storyteller conveying a narrative off-site uses both gestures and utterances to evoke the nonpresent storyworld in face-to-face interaction. In Figure 6.4, by contrast, more than one reference world is evoked by the interplay of semiotic channels. Again, two subtypes can be noted. First, there are endophoric multimodal representations of multiple reference worlds. Think here of a film using sounds from an earlier time frame to denote an inner mental domain that corresponds to the memory of a particular character, while using the image-track to represent the current state of that same storyworld. The remembered sounds could also be intermixed with sounds from the current moment, just as the image-track could portray visual flashbacks to the earlier time as well as the world as it is currently being experienced by the character; the resulting structure would then correspond to the crosshatching among semiotic channels and reference worlds shown in Figure 6.4. Further, there are exophoric multimodal representations of multiple reference worlds, which obtain when storytellers combine utterances and gestures while telling a narrative on-site, for example. In this case, too, both semiotic channels—the visual one supporting gestural communication and the auditory one supporting verbal communication—enable the storyteller to evoke more than one reference world. But now one of the reference worlds embeds the process of narration currently underway. Thus, storytellers can use deictic gestures or points to establish benchmarks in the current environment, which may in turn provide prompts for building a mental model of the storyworld as it was experienced in the past. Likewise verbal deictics can establish reference points in the here and now, while other utterances contained in the narrative prompt interlocutors to make a deictic shift to the nonpresent storyworld or detail particular features of or events within that narrated domain. Building on this general account of how types of semiotic structure in turn afford kinds of narrative representations, my next two sections explore the nexus of semio-logic and story logic in specific sorts of multimodal narratives.
Figure 6.1 Monomodal narration used to evoke a single reference world.
Figure 6.2 Monomodal narration used to evoke multiple reference worlds.
Figure 6.3 Multimodal narration used to evoke a single reference world.
Figure 6.4 Multimodal narration used to evoke multiple reference worlds.
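For readers who find a schematic encoding helpful, the taxonomy can be sketched as a small classification routine. The sketch is illustrative only and is not part of the chapter's apparatus: the names Narration, classify, and evokes_scene_of_telling are hypothetical, and treating a narration as exophoric exactly when the scene of telling is among the evoked reference worlds is a simplifying assumption drawn from the definitions given in this subsection.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet

# Figure numbers follow the chapter's Figures 6.1-6.4.
FIGURES = {
    ("monomodal", "single"): "6.1",
    ("monomodal", "multiple"): "6.2",
    ("multimodal", "single"): "6.3",
    ("multimodal", "multiple"): "6.4",
}


@dataclass(frozen=True)
class Narration:
    """Hypothetical record of one narrative representation."""
    channels: FrozenSet[str]           # semiotic channels exploited
    reference_worlds: FrozenSet[str]   # reference worlds evoked
    # Assumption: True when one evoked world is the setting in which
    # the narration itself is currently unfolding.
    evokes_scene_of_telling: bool = False


def classify(narration: Narration) -> Dict[str, str]:
    """Place a narration in the four-way taxonomy and label its reference type."""
    if not narration.channels or not narration.reference_worlds:
        raise ValueError("need at least one channel and one reference world")
    modality = "monomodal" if len(narration.channels) == 1 else "multimodal"
    worlds = "single" if len(narration.reference_worlds) == 1 else "multiple"
    reference = "exophoric" if narration.evokes_scene_of_telling else "endophoric"
    return {
        "modality": modality,
        "worlds": worlds,
        "reference": reference,
        "figure": FIGURES[(modality, worlds)],
    }


# A comics page combining a framing narration with an embedded tale.
comics_page = Narration(
    channels=frozenset({"words", "images"}),
    reference_worlds=frozenset({"framing narration", "embedded tale"}),
)

# A signer on-site who points to the landscape while recounting past events.
onsite_signer = Narration(
    channels=frozenset({"sign language"}),
    reference_worlds=frozenset({"narrated past", "current landscape"}),
    evokes_scene_of_telling=True,
)

print(classify(comics_page))
print(classify(onsite_signer))
```

The two sample records correspond to cases mentioned in the text: a multimodal representation of multiple reference worlds that is endophoric, and a monomodal, exophoric narration told on-site.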
WORD AND IMAGE IN GRAPHIC STORYTELLING
This section focuses on multimodal narrative representations in graphic storytelling, using word–image combinations in The Incredible Hulk comics as an illustrative instance.4 Centering on a character originally created in 1962, The Incredible Hulk portrays the experiences of Robert Bruce Banner, a nuclear physicist from Dayton, Ohio, who grew up in an abusive home. Banner’s exposure to gamma radiation has led to his bifurcation into the normal human Banner and his alter ego, the creature known as the Hulk. Sudden surges of adrenaline transform Banner into this creature, a green behemoth who can lift one hundred tons and withstand up to three thousand degrees of heat (Fahrenheit). My discussion of the Hulk will focus on a single page taken from the first issue of Volume 2 of the Hulk comic book series. Published in April 1968, this issue postdates the six issues of the first volume of Hulk, published in 1962, as well as the Hulk’s appearances among the ensemble of characters featured in the Tales to Astonish series, including Giant Man, The Wrecker, Madam Macabre, The Sub-Mariner, and others. As my analysis suggests, popular comics storytelling styles like that used in Hulk, no less than the sophisticated narrative representations of countercultural artifacts such as Daniel Clowes’s Ghost World or Alison Bechdel’s Fun Home, activate complex processing strategies—strategies whose semiotic triggers and cognitive dynamics it is the task of narrative analysis to illuminate. In what follows, I use the word–image combinations in Hulk to explore how multimodality bears on one of the key concerns of narrative theory, namely, the idea of character. At issue, more specifically, is how the distribution of narrative information across more than one semiotic channel affects the process of assigning role-based (or role-creating) attributes to participants in storyworlds.
Multimodality, Character Roles, and Contexts in The Incredible Hulk
Adopting structuralist vocabulary for a moment, and foregrounding story logic over semio-logic, consider what (following Greimas 1983) we can call the complex actantial structure of The Incredible Hulk comics—as exemplified in the page reproduced as Figure 6.5. Depending on circumstances, the Hulk may be construed as Subject, who seeks to eliminate a threat or irritant in his environment, or merely to display his indomitable strength, as suggested in the final panel on this page. He also functions as Helper, as when the superhuman strength mentioned by the Hulk himself in the final two panels, and evident from Banner’s altered appearance, enables the Hulk to accomplish feats that Banner the scientist could not bring about on his own power. But he can also be slotted into the role of Opponent, vis-à-vis Banner’s desire for a normal existence, or vis-à-vis the intentions and goals of various Opponents ranging from Giant Man to the big-brained Leader. Indeed, any of these actantial roles may be realized by more than one character in a Hulk comic. These one-many and many-one mappings among actors and actants reveal problems with the conceptual underpinnings of the Proppian tradition from which the Greimasian model derived. The problems can be summarized in the form of a paradox, related to the classic bootstrapping problem (or what has been described in somewhat less pessimistic terms as the hermeneutic circle): a processor cannot assign a role to a character without already having knowledge of the overarching plot structure of which the character is an element; but roles are needed to build up an understanding of this larger configuration, i.e., the plot, in the first place. In what follows, I argue that the narratological version of the bootstrapping problem results from a failure to attend closely enough to the nexus between semio-logic and story logic.
Figure 6.5 The Incredible Hulk, Volume 2, Issue 1, created by Stan Lee, written by Gary Friedrich and Marie Severin, inked by George Tuska, lettered by Artie Simek, p. 7. New York: Marvel Comics Group (Issue 102), April 1968.
As prescient work by Roland Barthes suggests, story understanding in general, and framing inferences about characters in particular, depends on an interplay between top-down and bottom-up processing strategies—an interplay necessarily shaped, in part, by the nature of the semiotic cues involved. For Barthes (1977), people’s stereotypical knowledge about the world allows them to chunk narrative discourse into action sequences; these sequences are elements of a broader experiential repertoire based on recurrent patterns of behavior (quest, betrayal, revenge, etc.). Hence action sequences afford heuristics for assigning roles to characters whose doings trigger the inference that the characters are engaged in some culturally salient behavioral pattern or another. But whereas Barthes followed Bremond in emphasizing the medium-independence of narrated content, more recent research underscores the story-shaping (i.e., constraining and enabling) power of semiotic media vis-à-vis the use of heuristics for narrative comprehension. Thus, in the illustrative page from Hulk, a key issue is how the heuristic construct of transformation affords mapping principles for interpreters of the comic. Readers engage in panel-by-panel mapping of word–image combinations onto mental models of Banner’s metamorphosis into the Hulk, while reciprocally using the output of those mapping operations at any given point to guide interpretation of successive panels. How can the heuristic of transformation be used to make sense of local textual details, verbal as well as visual, by enabling character-to-character and role-to-role mappings over time? Note here that the mapping relationships form an almost perfect chiasmus: Banner’s painful enfeeblement is the gateway to the Hulk’s unconquerable strength, which at this point in the Hulk’s history comes at the cost of a cognitive or at least verbal capacity manifestly inferior to Banner’s—as signaled by the truncated, nonidiomatic syntax and dropped lexical items or nonstandard orthography in the utterances attributed to the Hulk (e.g., too spelled as to in the left-hand speech balloon in the final panel, assuming that this spelling is not an authorial or typesetting error). But this schematization of the pre- and post-metamorphosis Banner/Hulk relationship does not capture the process by which readers construct models of the transformation on the basis of textual cues, while conversely using the models to parse the unfolding text. In essence, understanding of this transformational process requires projecting word–image complexes along an inferred timeline. Readers can thus balance constancy of structure against the alteration of morphological characteristics to compute a trajectory of change, rather than inferring a wholesale substitution of dramatis personae. For example, in the original, color version of this page from Hulk, the background in the sixth panel has the same watermelon color and striated texture as that contained in the first panel, while the purple color of the Hulk’s pants in the final three panels recalls the purple pants Banner is wearing in the third frame while he is being examined by the doctor.
In this way, the visual logic of the page reinforces the causal and chronological links articulated or implied by Oldar the witch in the framing narration she provides in rhyming couplets—couplets set off in yellow, rectangular boxes from the white, rounded speech balloons reporting the characters’ utterances at the embedded or hypodiegetic level evoked through Oldar’s story. For that matter, the panel representing Banner’s transformation into the Hulk (panel six) visually recalls the panel (panel 1) in which Banner is exposed to the gamma rays that caused the transformation, while the purple pants establish participant continuity: the color marks Hulk as a different version of Banner, linked to him through a process of transformation, rather than as an altogether different participant. As I have tried to suggest, character-to-role mappings are dynamically enabled by ongoing, moment-by-moment inferences about constellations of textual cues and the way they can be projected onto segments of a larger story arc. In the case of graphic narratives, even where explicit verbal indicators about the temporal position of events are absent, the rendering of a character’s appearance or of the setting can suggest the position of a given scene or occurrence on an overarching timeline. Likewise, the joint operation of visual and verbal cues can interact with the organization of a story into narrative levels—a structure of embedding and embedded narratives—to trigger inferences about how to map roles onto characters over time. In this respect, note that the page from Hulk qualifies as a multimodal representation of multiple reference worlds; that is, multiple diegetic levels are evoked by the interplay between the comic’s language- and image-based information tracks. Set off from the rounded speech balloons that report the characters’ utterances within the embedded or hypodiegetic narrative that is also conveyed through images, Oldar’s framing story provides an opportunity for retelling the origins of the Hulk in this inaugural issue of the second volume of the Hulk series. Here we can use Emmott’s (1997) term enactor to characterize how interpreters monitor different versions of participants encountered in narrative flashbacks or embedded stories like Oldar’s. In Emmott’s account, readers of print texts rely on mental representations (episodic memory structures containing information about the spatiotemporal coordinates of participants) that Emmott terms contexts; these allow readers to keep track of the current enactor because flashback time is not always signaled by changes in verb tense. In graphic storytelling, however, visual as well as verbal cues (the appearance of a character or what he or she is wearing, the color and texture of the background in a given panel, the contrasting shapes of speech balloons, the contents of a represented speech act) can activate contexts pertinent for a given enactor or character-version. In short, if they hope to improve upon structuralist accounts of character, by clarifying the nature of the interface between localized roles and larger plot configurations, narrative analysts need to study semio-logic in tandem with story logic. In the case of graphic narratives like The Incredible Hulk, one needs to account for how visual cues with varying degrees of detail can supplement verbal cues to serve participant-indexing functions. Emmott’s (1997) context-driven model provides important insights into this process. But further theoretical as well as empirical studies are needed in this connection (cf. Bridgeman 2005; Groensteen 2007). A key question: to what extent do the discourse-processing mechanisms used to reconstruct storyworlds in monomodal print texts account for the monitoring of contexts where more than one information track is involved? In my next section, I shift from word–image combinations in printed graphic narratives to combinations of utterances and gestures in stories told in face-to-face communicative interaction. The center of gravity for my discussion also shifts: from the processing strategies required to make sense of situations, agents, and events evoked by narratives exploiting more than one semiotic channel, to the narrative-producing strategies of tellers who, by coordinating utterances and gestures, prompt their interlocutors to co-construct storyworlds in contexts of real-time narration.
UTTERANCE AND GESTURE IN FACE-TO-FACE NARRATION
Defining gesture broadly as “that range of bodily actions that are, more or less, generally regarded as part of a person’s willing expression” (Kendon 2000, 47), I focus in this section on functions of speech-accompanying gestures used in narrative discourse. In McNeill’s (2000) account, gestures of this sort can be classified as gesticulations, in contradistinction to emblems (like the OK sign), pantomime, or sign language proper. The pilot study reported here is based on videotaped data and examines gesture use in narratives told by storytellers from the state of North Carolina in the United States.5 The storytellers produce a range of gestures in narratives told both on-location and off-site. In particular, they use deictic gestures or “points,” among other gestural modes, to refer to situations, objects, and incidents located in the multiple sets of space-time coordinates orienting narrative speech events. To revert to the taxonomy sketched in the section “Semiotic Modes and Types of Narrative Representation: A Taxonomy,” whereas gestures used in off-site narration can evoke more than one reference world, they function endophorically. By contrast, gestures used in narratives told on-site can contribute to exophoric multimodal storytelling, if tellers use points to refer to elements of the current scene of interaction while also evoking situations and events in the narrated domain. More generally, narrated circumstances, entities, and events can be conceived as more or less spatially and temporally removed from the here and now of the current interaction, and storytellers can use both language and gesture to create transpositions between and laminations of distinct coordinate systems (cf. section “Gesture Use in Stories Told On- and Off-Site: Simulation vs. Computation of Place”). The coordinate systems at issue are those organizing the past (or pasts) being told about, and those organizing the current moment of telling.
My study connects these general, structural issues with a more specific concern: namely, how storytellers use points and other gestures to map abstract, geometrically describable spaces onto lived, humanly experienced places—to invoke a distinction developed in the context of humanistic geography, urban studies, ethnography, and other fields (Johnstone 2004; Tuan 1977; Duncan 2000). In Duncan’s (2000) characterization, place is a portion of geographic space, or rather a principle for its organization “into bounded settings in which social relations and identity are constituted” (582). In turn, a number of researchers, in fields spanning geography, urban studies, ethnography, discourse analysis, and sociolinguistics, have proposed that narrative crucially mediates between spaces and places. As Johnstone (1990) puts it, “[c]oming to know a place means coming to know its stories; new cities and neighborhoods do not resonate the way familiar ones do until they have stories to tell” (109; cf. 119). Accordingly, “in human experience, places are narrative constructions, and stories are suggested by places” (Johnstone 1990, 134). Building on Johnstone’s and others’ studies, I adopt here a more microanalytic approach to the study of narrative, space, and place, focusing on gesture use as a resource for narrative place making—for saturating with lived experience what would otherwise remain an abstract spatial network of objects, sites, zones, and regions.
Points in Context: Building an Inventory of Gestural Functions
My analysis focuses on the role of points or deictic gestures in narratives told on- as well as off-location; but points need to be situated in a broader functional classification of gestures, i.e., an inventory of communicative functions served by gestures. A helpful starting point is the model proposed by Cassell and McNeill (1991) and adapted in Table 6.1; this model distinguishes between two broad classes of gestures: representational and nonrepresentational. However, as Haviland’s (2000) concept of “gesture spaces” suggests, the category of deictics or points needs to be subdivided. Haviland notes that, like other indexical signs, deictic gestures function by projecting a conceptual space within which they situate a given discourse referent, i.e., an entity-within-a-mental-model. Thus, rather than interfacing simply and primordially with a world that stands outside language or semiosis, points project conceptual domains of variable scope. One way of characterizing the differences between kinds of points is by distinguishing between the gesture spaces in which the gestures are anchored; these spaces can be used to suggest the nonsimple relation between a pointing gesture and what it refers to. At issue are the surrounding contexts or spaces, projected by points, “where the conceptual entities ‘pointed to’ reside” (Haviland 2000, 22); such spaces can be calibrated in different ways with the environment in which the speech event itself takes place.
Haviland identifies four such gesture spaces, including a local gesture space where the current speech event unfolds; a narrated gesture space where the situations, entities, or events being told about in the narrative are located; an interactional gesture space, “defined by the configuration and orientation of the bodies of the interactants” (23); and a narrated interactional gesture space, in which a narrative interaction, being recounted within another, framing narrative, is located.
Table 6.1 Cassell and McNeill’s (1991) Taxonomy of Gestures
Representational gestures:
  Iconics: gestures that mimic or iconically represent elements of the propositional content of what’s being said.
  Metaphorics: these gestures, too, are representational, but they represent an abstract idea or conceptual relation as opposed to a particular object or event.
Nonrepresentational gestures:
  Beats: rhythmically timed gestures used to mark the initiation of new discourse topics, mark members in a list of subtopics, and so on.
  Deictics: i.e., indexically referring gestures, or points. In Kita’s (2003) account, “The prototypical pointing gesture is a communicative body movement that projects a vector from a body part. This vector indicates a certain direction, location, or object” (1).
90 David Herman In light of these considerations, the initial inventory presented in Table 6.1 needs to be expanded, to provide a fuller picture of the subtypes of deictic gestures and the different kinds of conceptual spaces that they project. Table 6.2 presents an enriched inventory of points.
Gesture Use in Stories Told On- and Off-Site: Simulation vs. Computation of Place In my pilot study, I transcribed four narratives by two different tellers who each recounted a story off-site as well as on-location. The transcripts were coded to indicate the multiple, sometimes layered functions fulfi lled by storytellers’ gestures. I then compared frequency counts for gesture use in the off-site versus on-location narratives. Figures 6.6 and 6.7 are screen captures of video clips in which speaker x is shown telling a narrative off- and on-site, respectively.6 Although the Ns or token counts obtained in my preliminary study are too low to provide anything more than directions for further research, in this subsection I extrapolate from my fi ndings some general principles bearing on the construction of place through multimodal storytelling. Figures 6.8 and 6.9 represent the number of transpositions between and laminations of gesture spaces (in Haviland’s sense) per one hundred lines of text. To clarify: transpositions correspond to shifts between gesture spaces over the course of the telling of a story, whereas laminations occur when a point is used to project one gesture space into another, creating a layering or blending of spaces calibrated in different ways with the current communicative context. The number of transpositions was determined by counting how many shifts there were, from line to line, between different subclasses of points. Thus if a storyteller uses a point classifiable as a storyworld deictic in one line and one classifiable as a grounded deictic or a Table 6.2 An Expanded Inventory of Pointing Gestures Deictics/Points: Grounded deictics = refer to elements of the HERE AND NOW environment of the current interaction (cf. Rubba 1996); in other words, these are points to entities or locations perceptible to interactants in the here and now. 
Extended deictics = refer to entities or locations not perceptible to interactants but discoverably available to them in the current environmental surround. Storyworld deictics = refer to elements of the THERE AND THEN environment(s)—or storyworld(s)—in which the narrated events unfold. In other words, these are points to entities or locations situated in a narrated world; spatiotemporal coordinates for such elements are established vis-à-vis the deictic center (Zubin and Hewitt 1995) located in the narrated domain. Metanarrative deictics = points used to indicate or clarify the status, position, or identity of objects, events, or participants being told about in the process of narration.
Word-Image/Utterance-Gesture
Figure 6.6 Gesture use in off-site narration.
Figure 6.7 Gesture use in on-site narration.
David Herman
Figure 6.8 Number of points used to create transpositions between gesture spaces.
Figure 6.9 Number of points used to create laminations of gesture spaces.
grounded deictic + storyworld deictic in the next, that would constitute a transposition, whereas two successive lines featuring a storyworld deictic would not. Meanwhile, laminations correspond to any line in which a deictic gesture anchored in gesture spaces associated with the here and now (i.e., grounded, extended, or interactional spaces) is superimposed on a metanarrative or storyworld deictic. Accordingly, in the system of transcription developed for the analysis, wherever a “+” sign joins deictic points in perceptually or inferably available gesture spaces with gesture spaces associated with storyworlds, a lamination has occurred and can be counted as such.

Apart from the spike in the first off-site story, the number of deictic transpositions—or shifts between kinds of gesture spaces—is remarkably stable across the stories. The higher frequency of transpositions in speaker x’s off-site narrative can be explained by the interactional profile of the storytelling situation. Because another participant in the interaction initially displayed reluctance to relinquish the floor, speaker x had to negotiate the launching of his story, and in doing so the speaker’s pointing gestures shifted between the narrated space of the storyworld and the interactional space where the storytelling act itself was unfolding.

If the number of transpositions between kinds of gesture spaces is relatively stable across the narratives, there is by contrast striking variability when it comes to the number of laminations. Here the significant factor seems to be the spatial rather than the interactional parameters of the storytelling situation. Narratives told on-location afford considerably more opportunities for laminations (Goffman 1974)—which might also be characterized as deictic “blends” (Liddell 2000)—layering upon one another the here and now and the there and then.
To put this point another way, uses of narrative to navigate the world can be relatively endophoric, whereby semantic resources internal to the system are marshaled to cue inferences about where things happen when in a storyworld, or relatively exophoric, whereby objects and locations in the here and now function in effect as cognitive artifacts (Hutchins 1995), or tools for thinking, used to map out positions, places, and trajectories of motion in the world being told about. The more laminations cued by gesture use, the more exophoric the narrative representations relying on both utterance and gesture. In turn, these endophoric and exophoric modes support what might be termed narrative simulation and narrative computation of place, respectively. As Figure 6.10 suggests, the simulation versus the computation of place, or endophoric versus exophoric modes of place making, can be viewed as the poles at either end of a continuum. Narrative simulation occupies the end of the continuum where, to grasp the spatial contours of a storyworld, interpreters must relocate away from the here and now to an alternative set of space-time coordinates (cf. Ryan 1991, 31–47); that is, they must engage in a punctual, more or less permanent shift to a different deictic center (cf. Zubin and Hewitt 1995). By contrast, narrative computation occupies the other end of the continuum in question; here making sense of storyworld space entails not an instantaneous, once-and-for-all shift to another deictic center, but rather a durative, ongoing process of juxtaposing and calibrating sets of space-time coordinates. Further, as Figure 6.9 indicates, the extent to which a narrative recruits from the simulative or the computational subsystems will vary according to situation. A high percentage of lamination corresponds to a narrative that is firmly contextually anchored and recruits heavily from the computational subsystem. Less lamination is the hallmark of stories more loosely anchored in the context of the telling, and thus relying more heavily on narrative simulation. Increments along the continuum displayed in Figure 6.10 thus correspond to different mechanisms for experiential saturations of space, i.e., for the narrative construction of place.

Simulation (“endophoric” strategies for place making)
• less lamination
• punctual (and permanent) deictic shift to the storyworld
• more reliance on semantic resources internal to narrative system

Computation (“exophoric” strategies for place making)
• more lamination
• durative, ongoing calibration of storyworld with here and now
• more reliance on pragmatic embeddedness of narrative in contexts of interaction

Figure 6.10 A continuum of place-making strategies in narrative.
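The counting rules described above for transpositions and laminations can be made concrete in a short Python sketch. It is an illustration only, not the procedure actually used in the pilot study: the label strings, the `parse_line` helper, the use of "+" as a plain-text separator, and the decision to skip lines that contain no coded point are all assumptions introduced here, and the six-line coding is invented.

```python
# Illustrative sketch only: labels and helpers are assumptions,
# not the coding scheme used in the study itself.

HERE_AND_NOW = {"grounded", "extended", "interactional"}
NARRATED = {"storyworld", "metanarrative"}


def parse_line(code):
    """Turn a coding such as 'grounded + storyworld' into a set of labels."""
    return frozenset(part.strip() for part in code.split("+") if part.strip())


def count_transpositions(coded_lines):
    """Count shifts between different point codings on successive coded lines.

    Assumption: lines with no point are skipped, so a shift is measured
    against the most recent line that did contain a point.
    """
    count = 0
    previous = None
    for points in coded_lines:
        if not points:
            continue
        if previous is not None and points != previous:
            count += 1
        previous = points
    return count


def count_laminations(coded_lines):
    """Count lines that join a here-and-now point with a narrated-world point."""
    return sum(
        1 for points in coded_lines
        if points & HERE_AND_NOW and points & NARRATED
    )


def per_hundred_lines(count, total_lines):
    """Normalize a raw count to a rate per one hundred transcript lines."""
    if total_lines <= 0:
        raise ValueError("total_lines must be positive")
    return 100.0 * count / total_lines


# Hypothetical six-line transcript coding (invented for illustration).
example_codes = [
    "storyworld",
    "storyworld",
    "grounded + storyworld",
    "",
    "grounded",
    "metanarrative",
]
coded = [parse_line(code) for code in example_codes]

transpositions = count_transpositions(coded)
laminations = count_laminations(coded)
print(transpositions, per_hundred_lines(transpositions, len(coded)))
print(laminations, per_hundred_lines(laminations, len(coded)))
```

In this invented example the two opening storyworld lines do not count as a shift, the move to "grounded + storyworld" does, and only that third line counts as a lamination, giving three transpositions and one lamination. Whether uncoded lines should reset the comparison is a design choice the chapter does not settle; the sketch skips them.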
Multimodality and the Narrative Construction of Place

My pilot study of gesture use in face-to-face storytelling suggests that narrative provides a crucial mediating link between spaces and places, but does so in different ways in different kinds of storytelling situations. Further research is needed to determine the extent to which various storytelling situations promote the simulation or the computation of place. Comparative analysis along these lines may confirm the existence of an overarching, preference-based framework for narrative place making—a framework apparently governing the narrative construction of place in the corpus under study. In the shift from off-site to on-location tellings, storytellers could in principle have continued to rely predominantly on endophoric strategies, or simulation. However, when objects and locations in the local and extended environment presented themselves as possible anchors for spatial reference, storytellers demonstrated a preference to use them to help their interlocutors compute (and not just simulate) places in storyworlds. The preference thus seems to be: Whenever possible, recruit elements from the current environment and use them to help interlocutors compute places, thus reducing the amount of simulation that needs to be done and in effect distributing the cognitive burden of place making across as many components of the material setting as possible. Being able to test the scope and limits of this apparent preference, by studying the construction of place in other environments where exophoric storytelling is possible, is just one of the exciting new directions for research enabled by inquiry into narrative and multimodality.
MULTIMODAL STORYTELLING AND DIRECTIONS FOR NARRATIVE INQUIRY

As mentioned in my introductory section, the approach outlined in this chapter suggests the advantages of working to integrate developments in two strands or subdomains of postclassical research on stories: transmedial narratology and cognitive narratology. Transmedial narratology is premised on the assumption that, although narrative practices that exploit different semiotic resources share common features insofar as they are all instances of the narrative text type (Herman 2008), stories are nonetheless inflected by the constraints and affordances associated with the specific resources on which they draw. Meanwhile, theorists developing cognitive approaches to narrative have worked to enrich the original base of structuralist concepts with ideas about human intelligence unavailable to earlier story analysts, thereby building new foundations for the study of cognitive processes vis-à-vis various dimensions of narrative structure. And here the cognitive and transmedial approaches overlap. The target of cognitive-narratological research is the nexus of narrative and mind not just in print texts but also in face-to-face interaction, cinema, computer-mediated virtual environments, and other settings for storytelling. In turn, “mind-relevance” can be studied vis-à-vis the multiple factors associated with the design and interpretation of narratives across modes and media, including the story-producing activities of speakers, filmmakers, and writers, the processes by means of which interpreters make sense of storyworlds evoked by multimodal as well as monomodal narrative artifacts, and the cognitive states and dispositions of characters in variously configured storyworlds.

Hence the emergence of new questions for postclassical narratology: What sense-making possibilities do multimodal storytelling practices afford that are not afforded by monomodal or single-channel narrative practices, and vice versa? What are the differences among the processing strategies required for multimodal narratives that exploit different semiotic channels, e.g., words and images versus utterances and gestures? Arguably, questions such as these could not have been formulated, let alone addressed, within classical frameworks for narrative inquiry. As the present volume itself suggests, however, story analysts have now begun to engage in the collaborative, cross-disciplinary research needed to chart new directions for the study of narrative and multimodality.
NOTES

1. In the account outlined in Herman (1999), postclassical narratology encompasses frameworks for narrative research that build on the work of classical, structuralist narratologists but supplement that work with concepts and methods that were either ignored by or unavailable to story analysts such as Roland Barthes, Gérard Genette, Algirdas J. Greimas, and Tzvetan Todorov. On transmedial narratology, see Herman (2004), Ryan (2004), and Wolf (2003); see also the section in this chapter titled “Foundational Concepts and Key Research Questions” on the relation between modes and media. On cognitive narratology, see, e.g., Herman (2008, 2009a, and forthcoming).
2. As confirmed by the discussion in the section “Points in Context: Building an Inventory of Gestural Functions,” the distinction between endophoric and exophoric types of reference is not hard and fast; both involve situating entities and locations within conceptual models. Here I use the terms exophoric and endophoric mainly for heuristic purposes, to distinguish between two strategies for reconstructing storyworlds. The exophoric strategy relates the storyworld to features of the environment in which the current communicative interaction is taking place; the endophoric strategy entails building a mental model of the narrated domain that is not contextually anchored in this way (cf. Herman 2002, 331–71).
3. See Herman (2009a) for a book-length account of the definition presented in thumbnail form in this paragraph.
4. A preliminary version of part of this analysis appears in Herman (2009b).
5. My research on gesture use in storytelling was supported in part by National Science Foundation Grants BCS-0236838 and BCS-9910224. I am grateful to Neal Hutcheson, Ben Torbert, and Walt Wolfram for their assistance with this project.
6. Transcripts of these two stories, together with sample annotations (some of which involve coding techniques not described in this chapter), can be found online at: http://people.cohums.ohio-state.edu/herman145/sampletranscripts.html.
REFERENCES

Barthes, Roland. 1977. “Introduction to the Structural Analysis of Narratives.” In Image Music Text, trans. Stephen Heath, 79–124. New York: Hill and Wang.
Bridgeman, Teresa. 2005. “Figuration and Configuration: Mapping Imaginary Worlds in Bande Dessinee.” In The Francophone Bande Dessinee, ed. Charles Forsdick, Laurence Grove, and Libbie McQuillan, 115–36. Amsterdam: Rodopi.
Cassell, Justine, and David McNeill. 1991. “Gesture and the Poetics of Prose.” Poetics Today 12 (3): 375–404.
Duncan, Jim. 2000. “Place.” In The Dictionary of Human Geography, 4th ed., ed. R. J. Johnston, Derek Gregory, Geraldine Pratt, and Michael Watts, 582–84. Oxford: Blackwell.
Emmott, Catherine. 1997. Narrative Comprehension: A Discourse Perspective. Oxford: Oxford University Press.
Fauconnier, Gilles, and Mark Turner. 2002. The Way We Think: Conceptual Blending and the Mind’s Hidden Complexities. New York: Basic Books.
Goffman, Erving. 1974. Frame Analysis: An Essay on the Organization of Experience. New York: Harper and Row.
Greimas, A. J. 1983. Structural Semantics: An Attempt at a Method. Trans. Daniele McDowell, Ronald Schleifer, and Alan Velie. Lincoln: University of Nebraska Press.
Groensteen, Thierry. 2007. The System of Comics. Trans. Bart Beaty and Nick Nguyen. Jackson: University of Mississippi Press.
Haviland, John. 2000. “Pointing, Gesture Spaces, and Mental Maps.” In Language and Gesture, ed. David McNeill, 13–46. Cambridge: Cambridge University Press.
Herman, David. 1999. “Introduction.” In Narratologies: New Perspectives on Narrative Analysis, ed. David Herman, 1–30. Columbus: Ohio State University Press.
Herman, David. 2002. Story Logic: Problems and Possibilities of Narrative. Lincoln: University of Nebraska Press.
Herman, David. 2004. “Toward a Transmedial Narratology.” In Narrative across Media: The Languages of Storytelling, ed. Marie-Laure Ryan, 47–75. Lincoln: University of Nebraska Press.
Herman, David. 2008. “Description, Narrative, and Explanation: Text-Type Categories and the Cognitive Foundations of Discourse Competence.” Poetics Today 29 (3): 437–72.
Herman, David. 2009a. Basic Elements of Narrative. Oxford: Wiley-Blackwell.
Herman, David. 2009b. “Cognitive Approaches to Narrative Analysis.” In Cognitive Poetics: Goals, Gains, and Gaps, ed. Geert Brône and Jeroen Vandaele, 79–118. Berlin: Mouton de Gruyter.
Herman, David. Forthcoming. “Cognitive Narratology.” In The Living Handbook of Narratology, ed. John Pier, Wolf Schmid, Jörg Schönert, and Peter Hühn. Berlin: Mouton de Gruyter.
Hutchins, Edwin. 1995. Cognition in the Wild. Cambridge, MA: MIT Press.
The Incredible Hulk. 1968. Volume 2, Issue 1, created by Stan Lee, written by Gary Friedrich and Marie Severin, inked by George Tuska, lettered by Artie Simek, p. 7. New York: Marvel Comics Group (Issue 102), April.
Jewitt, Carey. 2006. Technology, Literacy, and Learning: A Multimodal Approach. London: Routledge.
Johnstone, Barbara. 1990. Stories, Communities, and Place: Narratives from Middle America. Bloomington: Indiana University Press.
Johnstone, Barbara. 2004. “Place, Globalization, and Linguistic Variation.” In Methods in Sociolinguistics: Papers in Honor of Ronald Macaulay, ed. Carmen Fought, 65–83. Oxford: Oxford University Press.
Kendon, Adam. 2000. “Language and Gesture: Unity or Duality?” In Language and Gesture, ed. David McNeill, 45–63. Cambridge: Cambridge University Press.
Kita, Sotaro, ed. 2003. Pointing: Where Language, Culture, and Cognition Meet. Mahwah, NJ: Lawrence Erlbaum.
Kress, Gunther, and Theo van Leeuwen. 2001. Multimodal Discourse: The Modes and Media of Contemporary Communication. London: Arnold.
Liddell, Scott K. 2000. “Blended Spaces and Deixis in Sign Language Discourse.” In Language and Gesture, ed. David McNeill, 331–57. Cambridge: Cambridge University Press.
McNeill, David. 2000. “Introduction.” In Language and Gesture, ed. David McNeill, 1–10. Cambridge: Cambridge University Press.
Rubba, Jo. 1996. “Alternate Grounds in the Interpretation of Deictic Expressions.” In Spaces, Worlds, and Grammar, ed. Gilles Fauconnier and Eve Sweetser, 221–67. Chicago: University of Chicago Press.
Ryan, Marie-Laure. 1991. Possible Worlds, Artificial Intelligence and Narrative Theory. Bloomington: Indiana University Press.
Ryan, Marie-Laure, ed. 2004. Narrative across Media: The Languages of Storytelling. Lincoln: University of Nebraska Press.
Tuan, Yi-Fu. 1977. Space and Place: The Perspective of Experience. Minneapolis: University of Minnesota Press.
Wolf, Werner. 2003. “Narrative and Narrativity: A Narratological Reconceptualization and Its Applicability to the Visual Arts.” Word and Image 19:80–97.
Zubin, David A., and Lynne E. Hewitt. 1995. “The Deictic Center: A Theory of Deixis in Narrative.” In Deixis in Narrative: A Cognitive Science Perspective, ed. Judith F. Duchan, Gail A. Bruder, and Lynne E. Hewitt, 129–55. Hillsdale, NJ: Erlbaum.
7
“I Contain Multitudes”
Narrative Multimodality and the Book that Bleeds

Alison Gibbons
INTRODUCTION

In search of a “transmedial narratology” (Ryan 2004), the study of narrative has turned its attention to media (Berger 1996; Huisman et al. 2005), forming part of a surge of research into multimodality. It has tended to focus upon digital media, such as hypertext. As a result, multimodality in the literary narratives of innovative print media has been neglected. This is regrettable since, even within mainstream publishing, the turn of the millennium has seen an increase in the inclusion of typography and illustration in fiction. This chapter explores narrative understanding of multimodal novels, including consideration of recent discoveries from neuroscience. The examination of visual elements will be aided by visual perception and multimodal research. This theoretical merger enables an original approach to multimodal texts, both in its expansion of existing work in multimodal studies (Kress and van Leeuwen 1996, 2001; Baldry and Thibault 2006) through consideration of the cognitive dimension and in its application to literary texts.

Steve Tomasula’s “imagetext” novel, VAS: An Opera In Flatland, is a central example of experimental multimodal printed literature (see Gibbons, forthcoming a). Other notable examples include Mark Z. Danielewski’s (2000) House of Leaves (see Gibbons, forthcoming b) and Jonathan Safran Foer’s (2005) Extremely Loud and Incredibly Close (see Nørgaard, this volume). While “multimodal literature” can be seen to encompass graphic novels and children’s picture books, the subgenre I am designating can be seen to be more sophisticated, both in terms of its self-consciousness (using metafictional and intertextual reference, foregrounding its own materiality, innovative typographical textual layouts) and the invitations and demands it issues to readers.

Speaking of readers, it is important to distinguish my usage of the term. Narratology has long categorized types of readers through critical vocabularies aimed to describe speakers and receivers of narrative texts. Booth’s (1983) conception of the “implied reader” has been particularly influential. Discussing notional readers in relation to literary texts with graphic elements, White (2005, 38) suggests that critical approaches to such texts “depend on a concept of an active, determined and adaptable actual reader. This reader is implied and encoded in the text but at the same time, actual and external to it.” This chapter similarly depends on this dual notion of the reader. My usage of the term “reader” will therefore reference an implied reader, but one that, crucially, is seen to physically engage with the novel in an actualized way.
MULTISENSORY PERCEPTION

Current research in neuroscience demonstrates a renewed interest at the interface of modality, sensory perception, and cognition. Recently, there has been a paradigm shift in conceptions of the human mind whereby the modular view, in which sensory modalities were thought to be processed separately, is being replaced by the belief that the brain handles sensory input in an interactive and integrative manner (Thesen et al. 2004). While the traditional view saw the brain as a collection of unisensory modules, the emerging evidence of what Ghazanfar and Schroeder (2006, 278) term “the multisensory nature of most, possibly all, of the neocortex forces us to abandon the notion that the senses ever operate independently during real-world cognition.” The practice of reading multimodal literature can thus be seen as closer to our experiential processing of reality when compared to more conventional novels. This is not to say that more traditional forms of literature are not received in a multimodal fashion. Such a statement would be a reductive account of literary experience, failing to acknowledge the imaginative (potentially visual and aural) capacities of the reader, as well as the physical and locative context of reading with the body situated in relation to the book as artifact. Rather, multimodal novels in their employment of multiple sensory stimuli are self-conscious of their material form, playing upon the integrative nature of cognition and embodied nature of reading.

Not only do humans experience the world in multisensory terms, the media-saturated environment and highly visual culture that has developed into the twenty-first century complements the impression of multimodality as contemporary reality. Considering the process of writing VAS, Steve Tomasula (personal correspondence) reflects:

I never really thought I was writing a hybrid novel, per se, while writing it—I guess I didn’t think about it except to think that this was a way to write that seemed natural, given the times we live in, i.e., given all the graphics, collaged video, etc., in something as pedestrian as the nightly news; this just seemed to be plain old realism to me—the way we communicate today—and I remember being surprised the first time someone suggested that it wasn’t.

Another important finding in neuroscientific research suggests that even the most basic combination of two senses appears to result in enhanced neurological response (Stafford and Webb 2005). This undoubtedly holds implications for readerly reception of multimodal novels. Although neuroscientific classification may deem both word and image as forms of visual stimuli in the reading process, Khateb et al.’s (2002) study of their recognition and resulting written and pictorial processing in the brain verifies that the two modalities do not take the same neural pathways. In fact, while there are areas of similarity, Khateb et al. (2002, 211) deduce that “brain regions engaged during verbal and pictorial recognition are different. More specifically, words and word-like recognition involved more dominantly left-hemisphere regions while image recognition involved more dominantly right hemisphere areas.” Therefore, word and image, as disparate phenomena where neural activity is concerned, may indeed be considered as distinct modalities of literary expression.

For responsive neurological enhancement to occur, the two modalities must be construed as part of the same event, an outcome that is usually the result of a “horizon of simultaneity” (Stafford and Webb 2005, 187) or “co-occurrence” (Bertelson and Gelder 2004, 141). Admittedly, since reading is a practice grounded in temporal progression, the necessity of co-occurrence may seem to undermine any significance that multisensory processing may have for the reception of multimodal fiction. However, I contend that word and image act in synchronicity, engaged in the production of a shared textual meaning. Their narrative congruence is thus perceived as contributing to a joint event. Consequently, multimodal novels, employing a multitude of modes, may create a more intense narrative experience.

Steve Tomasula presumably intended VAS to be experienced as a richly sensory entity, and his proclivity is palpable in the depictions of textual interaction found in the novel. As Square and his wife Circle lie in bed together, Square picks up her book from the nightstand:

The weight of the book surprised him, sex having drained him more than it used to. He ran a palm along its trimmed edges, breathed in the scent of its paper. He wanted to taste it, to put his ear to it like the sea shell so he could, as advised in all the books about “how to write stories,” describe it by using all five senses, to make her (and you) experience the text as an object in the world, real as a brick. Only more so because it was more than its materials. (Tomasula and Farrell 2002, 284)

The presence of behavioral process verbs (Halliday 1985) such as “breathe(d),” “taste,” and “experience” in relation to the book as a textual artifact emphasizes the role of Square as participant in a literary encounter that mediates between cognitive and physical. The similes comparing the book to a seashell and a brick ground the literary article as a corporeal item with which Square may interrelate in mental, physiological, and material ways. The deictic shift from “her” to “you” is also significant. The verb particle “describe” sets up a relationship between Square as a potential speaker and Circle as his hearer/receiver. Conversely, the parenthetical expression “(and you)” instigates a perceptual shift from “her,” referring to Circle in Square’s point of view, to a generalized “you” through indefinite reference. This reference and the fact that it is not located in direct speech means it cannot refer to another enactor in the narrative world. Furthermore, since parenthesis functions independently of the text in which it is embedded, the typographical difference in type size and depth of color denotes a change in discourse levels. Due to its visual status and linguistic convention, this use of parenthesis indicates a literary relocation into a domain where “you” is interpreted as the reader, while the words are assigned by oppositional reference to an implied author, an extrafictional counterpart of Tomasula himself. As Gavins (2007, 98) argues in her investigation of readers’ mental representations, “The use of ‘you’ specifies the inclusion of an enactor of the reader at the text-world level, along with an enactor of the author.”

Tomasula’s use of subjective deictic reference demonstrates that language necessarily has an orientating function; all language, and all literary language, is embodied. In his experience of the book, Square contemplates “how to write stories,” emphasizing the connections between reading and writing, as well as the shared human capacity for sensual imagination. The world of the novel, particularly its imaginative creation, is therefore a mutual act between writer and reader. Tomasula and Farrell’s literary collaboration accentuates sensory features. By employing parenthesis and typography in the earlier extract, they emphasize that multimodal books entice readers into a resonantly physical and sensual encounter with narrative.
MULTIMODALITY AND EMBODIMENT

Embodiment is a crucial concept in cognitive studies (Johnson 1987; Lakoff and Johnson 1999; Gibbs 2006), which consider mind and body as a syndicate through which conceptual information is understood. As Gibbs articulates,

Cognition is what happens when the body interacts with the physical/cultural world. Minds are not internal to the human body, but exist as webs encompassing brains, bodies, and world . . . “embodiment” refers to the dynamical interactions between the brain, the body, and the physical/cultural environment. (2005, 66–67)

Reading is itself an embodied activity, yet it is not always explicitly recognized as such. Multimodal fictions utilize a plurality of semiotic modes in the communication and progression of their tale. As a result, they often emphasize the dynamic and embodied nature of reading. In multimodal novels, different modes of expression are located on the page not in an autonomous fashion, but in such a way that they constantly interact in the production of textual meaning.
The opening to VAS: An Opera In Flatland, written by Steve Tomasula and designed by Stephen Farrell, sets up the theme of embodiment that grips the novel:

First Pain Then Knowledge: a paper cut. (Tomasula and Farrell 2002, 9–10)

When the narrative continues, it becomes clear that at the story level, the opening refers to a physical act: while writing, the main character Square has given himself a paper cut. The initial sentence of Tomasula’s novel reflects embodied understanding, by bringing a physical encounter (through injury) into direct relationship with cognitive awareness using the enumerative conjunct “first” and temporal coordinating conjunction “then.” Moreover, the modes in which the two clauses are represented contribute to the theme of embodiment. The depiction of “first pain” utilizes forms from graphic novels. The capitalization and bold typography of the words combined with their positioning within the rectangle suggest that this information should be interpreted as a narrative box, a convention used to comment upon the visual frame. Its proximal positioning to the starburst shape heightens cognitive association between the two elements. The starburst shape is characterized by strokes that are visually akin to action lines, thus suggesting not only is it a site of pain, it is instilled with action and impact too. The completion of Tomasula’s first sentence overleaf sheds new light for the reader on the first page’s graphic realization. The line can be construed as the edge of a piece of paper, or indeed a page, while the starburst shape comes to signify the point at which the paper cut was received. Conversely, the visual representation of the occurrence of the paper cut is not accompanied by a drawing of a character to which this injury can be attributed. Neither does the page-spanning sentence contain an agent. Nevertheless, the indefinite paper cut presupposes an entity, albeit ambiguous at this point, to have experienced this verbally expressed event. Combined with the sense of impact and movement that action lines imply, the ambiguity surrounding the experiencer of the paper cut enables the construction of a text-world in which the reader may implicate him- or herself in the depicted action. This is enhanced by the positioning of the vertical line in parallel with the edge of the actual page that it can be said to represent, mimicking the real-world situation of the fiction’s actual reader. The visual immediacy of the representation of the paper cut thus induces the reader to envisage the event in relation to the self. The multimodal expression of “first pain” collaborates with verbal grammar, specifically the grammatically deviant lack of agent, to narrate embodiment through invitation to the reader to momentarily project their subjectivity into the narrative action.

Figure 7.1 “First Pain” and “Then Knowledge.”

In comparison, “then knowledge” is printed in a more familiar manner with text left aligned in an unremarkable font, its appearance according with reader assumptions of conventional novels. This suggests its primary aim is the communication of conceptual information concerning the story. While the multimodal first page manifests embodied action, this page privileges the mind in the sense of traditional literary understanding. Like the experience of receiving a paper cut in which pain is felt initially and realization follows, the temporal sequence and multimodal arrangement of Tomasula’s primary sentence enacts this process. The multimodality of the opening page symbolizes the physical experience while the second page clarifies the preceding imagistic design, enabling Tomasula to introduce the theme of embodiment through a multimodal form of narrativity.
CONCEPTUAL METAPHOR AND CONCEPTUAL BLENDING

Embodiment is more than a central theme in VAS. A recurring motif of the novel is the metaphorical rapport between the book and the body. This isomorphism is related to the conceptual metaphor PEOPLE ARE BOOKS, in which the conception of PEOPLE is reinterpreted through reference to experiential knowledge of books. The metaphor has a long-standing history in literature, having been used canonically by Shakespeare (Thompson and Thompson 1987) and Whitman (Manguel 1997) as well as experiencing a revival in postmodern literature such as Jeanette Winterson’s Written on the Body. This conceptual metaphor is recognizable in everyday language, in phrases such as “You can read him like a book” or “She’s an open book” (Thompson and Thompson 1987, 201).

Conceptual Metaphor Theory (Lakoff and Johnson 1980) suggests that human conceptual patterns are metaphorical by nature. Conceptual metaphors provide a means for understanding abstract concepts (the target) by comparison with a basic-level domain (the source) grounded in bodily and/or everyday experience. Conceptual Metaphor Theory sets up metaphorical mapping as unidirectional, a restriction known as the “Invariance Hypothesis” (Lakoff and Turner 1989). In other words, meaning may only be transferred from source to target, rather than functioning in a bipolar manner. However, VAS’s preoccupation with the integration of BODY and BOOK is not merely a case of unidirectional mapping. Consider the book’s materiality. The cover is a dappled peach color, added to which are lines of grayish-blue. Undoubtedly, this is representative of skin and underlying veins. Additionally, the pages are printed in the colors of flesh and blood; they are an off-white shade, while text and image are presented in black, beige, or red. Furthermore, just like a human body, this book will age: as it is read, the spine will crease and it will develop “wrinkles.” Therefore, the book’s visual design performs not a singular metaphorical mapping, but a corporeal realization of the blending of two conceptual domains, BOOK and BODY.

Blending or Conceptual Integration Theory (Coulson and Oakley 2000; Fauconnier and Turner 2002) developed later than Conceptual Metaphor Theory and seeks to explain how conceptual domains and mental spaces, including metaphor, merge to create new meanings. The input spaces, which in metaphor terms would consist of source (input 1) and target (input 2), are united in a blended space. Additionally, there is an emergent structure, containing any meanings or inferences that are perhaps unexplained by the blend (and generic space that holds the common features of the inputs) or come from the reader’s unique experience. This approach differs from Conceptual Metaphor Theory, amalgamating conceptual ideas rather than transposing one onto the other. However, it has been suggested that conceptual metaphor and conceptual blending should be viewed as complementary notions (Grady, Oakley, and Coulson 1999). Cognitive-poetic analysis of the extended metaphor of the BOOK and the BODY in VAS reveals that multimodal integrations require a complex negotiation of metaphorical domains where the reader must comprehend metaphorical relations through cognitive blending practices.

While the central blend of BOOK/BODY has numerous manifestations in VAS, one instance in particular shall be examined here. The extract occupies a trifold foldout page in which two of the frames show DNA chains (Tomasula and Farrell 2002, 58).
Figure 7.2 “Trifold.”
106 Alison Gibbons
DNA stores information using a four-letter code: C, G, A, and T. In these strange configurations, strings of the base letters are strewn around stenciled vertical lines, along with chains of words that form sentences. Consequently, DNA code and words as linguistic units become almost interchangeable. In this transposition, language is merged with the genetic alphabet, creating an emergent structure in which the ascendancy of language is deposed. Language and lineage are blended through their shared status as signifying systems; just as we tell a story using words, our genetic bodies narrate our personal and biological histories. To facilitate reading, the words in this extract are printed in darker type, foregrounding them visually. The first word phrase to emerge is:
DOUBLE HELIXES|BEING BOTH|MESSAGE|AND|MATERIAL| |
The separation of words using a line follows scientific convention. The lines are used to create what is known in scientific discourse as a “reading frame” representing “codons” (Dawkins 2004, 22; Johansen Mange and Mange 1999, 103, 108), which are trinucleotide sequences or in other words a series of three base letters that specify an amino acid. This is relevant to the theme of embodiment since amino acids are important to nearly every chemical process in the body that affects physical and mental functioning. The final word phrase in this spread also mentions the double helix:
BITS AND FRAGMENTS COME|TOGETHER | |LIKE|NUCLEOTIDES| | SOME|SPLIT|APART|AND|DIE OTHERS HOLD LINKING|UP INTO|DOUBLE|HELIXES| |
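The reading-frame convention described above, in which a run of base letters is divided into three-letter codons, can be illustrated with a minimal sketch. The function names `split_into_codons` and `mark_reading_frame` and the sample sequence are hypothetical and chosen only for illustration; they do not come from the novel or from the genetics sources cited. The vertical bar as separator simply echoes the line used in the extract.

```python
# Illustrative sketch only: names and the sample sequence are assumptions.
DNA_BASES = set("CGAT")  # the four-letter code named in the text


def split_into_codons(sequence, frame=0):
    """Divide a base sequence into codons (triplets), starting at offset `frame`."""
    if not set(sequence) <= DNA_BASES:
        raise ValueError("sequence must use only the letters C, G, A, and T")
    usable = sequence[frame:]
    # Ignore trailing bases that do not complete a triplet.
    full_length = len(usable) - len(usable) % 3
    return [usable[i:i + 3] for i in range(0, full_length, 3)]


def mark_reading_frame(sequence, frame=0):
    """Join the codons with a vertical line, echoing the separator used in the extract."""
    return "|".join(split_into_codons(sequence, frame))


if __name__ == "__main__":
    print(mark_reading_frame("GATTACAGG"))
    print(mark_reading_frame("GATTACAGG", frame=1))
```

Shifting the `frame` argument shows why the separating lines matter: the same string of letters yields different triplets depending on where the reading begins.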
Reference to the double helix as start and end of linguistic information in the extract provides a vital clue for the reader about the visual design. The way the letters and words weave around the vertical cords mimics the interlacing of two polynucleotide strands, winding around each other to form a double helix. The visual arrangement of words into diagonal contours directs the eye; the eyes are swept along what Baldry and Thibault (2006) call a cluster before doubling back to the start of the next cluster of words. Performing this ocular movement, the reader takes a spiraling pathway through this part of the text, as though tracing a polynucleotide strand. Comparable to the double helix, which consists of the interaction
of two tightly associated strands, the shared act of literary construction is foregrounded through the readers’ processing of the helix shape. Literature is always a joint venture, the cognitive compromise between the intentions of the writer and imagination of the reader. By guiding the reader’s eyes into an optical performance as an entwining thread against an empty white page, Tomasula and Farrell insinuate themselves as the other DNA strand through negation, accentuating the interdependent relationship connecting author and reader. The analogy of the interweaving structure of DNA with the reading process constructs a parallel blend of BOOK and BODY, reliant upon their shared property of mutual creation: our DNA shows that each one of us is the genetic combination of mother and father, while text is the product of writer and reader. Toward the top of the right frame, Square is described in the act of writing, as “trying to get sCATTerings to AGGluninATe.” The capitalized letters in the words “scatterings” and “agglutinate” are written in the same font as the DNA chains, and are letters from the base code, integrating the genetic and linguistic alphabets further. The word “agglutinate” carries a specialized meaning for both scientific and linguistic discourse. In biochemistry, “agglutination” is the formation of a mass of particles, such as red blood cells, by the action of antibodies, while in linguistics it is the building up of words from component morphemes in such a way that these undergo little or no change of form or meaning. In each discourse, it is a case of bringing things together to make something new from the constituent parts. In VAS, word and image agglutinate, creating both formal and conceptual emergent blend(s) of BOOK and BODY. In his critical article on the Invariance Hypothesis, Stockwell (1999) suggests that source and target of a metaphor may have a relation of “interanimation” rather than irreversible transposition. 
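The observation that the capitalized letters in Square’s phrase all belong to the genetic alphabet can be checked mechanically. The following is a minimal sketch, assuming the spellings exactly as quoted above; the helper names `capitalized_letters` and `is_base_code` are hypothetical and introduced only for illustration.

```python
# Illustrative sketch only: helper names are assumptions, not part of the novel.
DNA_BASES = frozenset("CGAT")  # the four base letters


def capitalized_letters(word):
    """Return the upper-case letters of a word, in order of appearance."""
    return "".join(ch for ch in word if ch.isupper())


def is_base_code(letters):
    """True if the string is non-empty and uses only the letters C, G, A, and T."""
    return bool(letters) and set(letters) <= DNA_BASES


if __name__ == "__main__":
    for word in ("sCATTerings", "AGGluninATe"):
        letters = capitalized_letters(word)
        print(word, letters, is_base_code(letters))
```

Both quoted words pass the check, whereas an ordinary capitalized word such as the character name “Square” does not, which is what sets the typographic play apart from conventional capitalization.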
The cognitive analysis of this extract from VAS demonstrates that multimodal manifestations of metaphor and blending are inclined to interanimate since, unlike language, visual and multimodal forms do not presuppose a specific temporal arrangement. The cumulative undercurrent of the isomorphism of BOOK and BODY, or in other words this “megametaphor” (Werth 1994, 1999), which extends throughout the novel, not only projects emergent structure, which is a synthesis of BOOK and BODY, but each conceptual domain is also reassessed in light of its parallels to the other through what Coulson (2001) calls “retrospective projection” (as Figure 7.3 depicts). The body can be rewritten through cosmetic surgery, for instance, while literature takes on a life of its own. Thus, while a conceptual integration occurs, it is useful to consider the book and body not simply as input spaces to be fed into a blend, but as conceptual domains (in the sense of conceptual metaphor theory) that reemerge from the blend intact yet cognitively reconsidered. The present examination of VAS demonstrates that sustained multimodal blending activates a multidirectional conceptual mapping; the concepts of BOOK and BODY “interanimate” and the reader’s perception
Figure 7.3 Reemerging input spaces.
of the original domains is subsequently transformed in light of the blends that have taken place.
EMBODIMENT AND HAPTIC PERCEPTION
Tomasula and Farrell’s exploration of embodiment does not stop at thematic or design level. VAS seeks to involve the reader physically in its narrative, as the opening sentence hinted in its multimodal provocation to the reader. In an early scene from the novel, Square’s mother-in-law is narrating the story behind the famous opera La Traviata to her granddaughter. There are three acts to La Traviata. Act I concludes with the uncertain possibility of a blossoming romance between the young Alfredo and Violetta. Act II “opens with the couple living together” (Tomasula and Farrell 2002, 24). Clearly, between these two acts there is a disjuncture in time and narrative progression, glossing over the development of Alfredo and Violetta’s relationship and leaving the reader to “fill in the gaps” of the plot. In VAS, there are three pages inserted between the descriptions of these two acts. The first includes an example of one of the many appearances of scientific quotation that pervade the novel. This entry is from The Naked Ape by
Desmond Morris, and details the evolution of sexual intercourse in the advancement of human society from Neanderthal man (Tomasula and Farrell 2002, 21):
For pair bonding and therefore civilization to develop, the naked ape had to acquire a capacity to become sexually imprinted on a single partner. It did so by linking sex to identity through evolutionary changes that favored face-to-face copulation . . . Copulation is most commonly performed with the naked male over the naked female, with the female’s legs apart.
Further down the page, a peculiar diagram can be seen. It would be possible to overlook this puzzling image, yet it is important for an understanding of what occurs overleaf. Through a relationship of visual likeness, this illustration is an icon for the book itself. Its identity as VAS is signaled through the black semicircles on the edges of pages that appear throughout the novel, like thumb tabs in dictionaries and encyclopedias, referencing the name of the scientist or philosopher quoted on that page. The curved dotted line represents the kinetic trajectory of the page as it is lifted. Significantly, there are two arrows beside the dotted line, pointing in each page-turning direction. The structure of Western literature usually ensures that each page is moved from right to left as it is turned and, unless the reader revisits pages to check information, the movement is repeated, page after page accumulating on the left as reading continues. The lower arrow, then, is intriguing since one would not usually expect to move the page back from left to right, or alternate its movement in a fanning motion. The significance of this diagram becomes apparent when the reader turns the page to be confronted by a double-page spread in which the image of a faceless naked male on the left page is reflected by the image of a faceless naked female on the right.
Accompanying text is written in the form of scientific report and describes the physical reactions of the body and actions of male and female in the act of copulation. At the bottom of the right page are found the words, “The partner’s genitals may also become the target for repeated actions. Often rhythmically. Often rhythmically” (Tomasula and Farrell 2002, 23); the repeated words producing an inferred auditory mapping between visual and heard replication and the sexual rhythm of a couple in intercourse. Furthermore, given the scientific context with which this design is framed, the mechanical rhythm of repetition suggests a perfunctory, routine kind of sex—copulation—serving the interests of human reproduction and procreation rather than fulfilling sexual desire. Notably, the noun “copula” and adjective “copulative” refer in linguistics to the connection of words and clauses. Thus, in a similar way to the use of the word “agglutinate,” copulation carries a double entendre of biological (sex for procreation) and linguistic meaning. Once again, the theme of
connection is foregrounded, a theme particularly pertinent in a book that brings together word and image, verbal and visual, in the communication of its narrative. Either through following the diagram’s visual instructions, or by turning the page in order to continue reading, the pages are brought together, fulfilling the act of intercourse for the naked textual lovers. To return to La Traviata, Tomasula and Farrell employ this device to satisfy the narrative gap between Acts I and II of the described opera, verifying the reader’s narrative inference that Alfredo and Violetta have consummated their relationship. Beyond the fiction, it could be said that just as the lovers are moved into a more vehement intimacy in their textual sexual act, the reader is drawn into an “erotics of reading” through a bodily relation with the novel, for it is the reader’s physical interaction with the book as material object that accomplishes the narrative effect. The physical body is part of the haptic sensory modality. As cognitive scientists Lederman and Klatzy (2001, 71) explain, “People use the haptic system to perceive and interact with the world of concrete and virtual objects.” In considering artistic forms and genres that utilize a haptic aesthetic, media critic Laura Marks speaks of the effect such works have on the reception process. To quote Marks (2002, 18) (but with VAS in mind), “by appearing to us as an object with which we interact rather than an illusion into which we enter, [it] calls on [a] sort of embodied intelligence. In the dynamic movement between optical and haptic ways of seeing, it is possible to compare different ways of knowing and interacting.” Moreover, in The Body in the Mind, Mark Johnson states:
The centrality of human embodiment directly influences what and how things can be meaningful for us, the ways in which these meanings can be developed and articulated, the ways we are able to comprehend and reason about our experience, and the actions we take.
Our reality is shaped by the patterns of bodily movement, the contours of our spatial and temporal orientation, and the forms of our interaction with objects. It is never merely a matter of abstract conceptualizations and propositional judgements. (1987, xix) Since embodiment is fundamental to human experience, it seems logical to suggest if the body is more involved in the act of reading, that particular reading may become more meaningful, or loaded with greater significance, as a result. In an article on synesthesia in poetic language, Shen and Cohen (1998, 125) discuss the organizational ranking of modalities along a scale from high to low. Sight is the “highest” modality, followed by sound, smell, and taste, with the “lowest” modality being that of touch. Following Lakoff and Johnson’s (1980, 1999) ideas on embodiment, they convincingly argue that concepts belonging to lower modalities are more accessible since they “involve a more direct, less mediated experience of perception” (Shen and Cohen 1998, 128).
All novels necessarily involve haptic interaction on the part of the reader. Tomasula and Farrell enhance the nature of this interaction by using the immediate contact of the reader with the text as part of the creation of narrative meaning. As such, by tapping into the lower modality of (haptic) touch alongside sight, Tomasula and Farrell increase accessibility of the narrative world(s) making VAS a more vivid reading experience.
CONCLUSION
VAS is visually, thematically, and experientially a narrative of sensation and embodiment. Not only is its preoccupation with the BOOK as BODY/BODY as BOOK metaphor sustained throughout in formal and conceptual blends, VAS utilizes the body and senses of the reader demonstrating, to quote Klatzy and Lederman (2000, 236), that when “stimulated by movement in different ways, the same receptors can tell different stories.” Multimodal literature is a relatively underexplored genre. This chapter has sought to consider multimodality from a cognitive perspective, revealing the neurological and experiential impact that literature that explicitly stimulates the senses may have on the reader. Cognitive poetics (see Gavins and Steen 2003; Stockwell 2002), with its close focus upon stylistic features and reader involvement, reveals multimodal novels as complex dynamic forms that require the reader to invest in narrative physically as well as cognitively. Analogous to our bodies, composed of various matter, chains of DNA, and encompassing our genetic histories, VAS: An Opera in Flatland is multimodal in its employment of the visual, verbal, and somatosensory, which collaborate in the production of textual meaning. This is, therefore, a book that can indeed declare, “I contain multitudes” (Tomasula and Farrell 2002, 298). By taking on a corporeal quality through multimodal metaphoric blending and emphasizing its nature as an object in the world for our bodies, brains, and minds to interact with, VAS leaves a telling impression; if you dissect this book, it is likely to bleed.
REFERENCES
Baldry, A., and P. J. Thibault. 2006. Multimodal Transcription and Text Analysis: A Multimedia Toolkit and Coursebook. London: Equinox.
Berger, A. A. 1996. Narratives in Popular Culture, Media and Everyday Life. California and London: Sage Publications.
Bertelson, P., and B. De Gelder. 2004. “The Psychology of Multimodal Perception.” In Crossmodal Space and Crossmodal Attention, ed. C. Spence and J. Driver, 141–77. Oxford and New York: Oxford University Press.
Booth, W. C. 1983. The Rhetoric of Fiction. Chicago: University of Chicago Press.
Coulson, S. 2001. Semantic Leaps: Frame-Shifting and Conceptual Blending in Meaning Construction. Cambridge: Cambridge University Press.
Coulson, S., and T. Oakley. 2000. “Blending Basics.” Cognitive Linguistics 11 (3/4): 175–96.
Danielewski, Mark Z. 2000. House of Leaves by Zampanò with introduction and notes by Johnny Truant. 2nd ed. London and New York: Pantheon Books.
Dawkins, Richard. 2004. The Ancestor’s Tale: A Pilgrimage to the Dawn of Life. Additional research by Yan Wong. London: Weidenfeld and Nicolson.
Fauconnier, G., and M. Turner. 2002. The Way We Think: Conceptual Blending and the Mind’s Hidden Complexities. New York: Basic Books.
Foer, Jonathan Safran. 2005. Extremely Loud and Incredibly Close. London and New York: Houghton Mifflin Company.
Gavins, J. 2007. Text World Theory: An Introduction. Edinburgh: Edinburgh University Press.
Gavins, J., and G. Steen, eds. 2003. Cognitive Poetics in Practice. London: Routledge.
Ghazanfar, A. A., and C. E. Schroeder. 2006. “Is Neocortex Essentially Multisensory?” Trends in Cognitive Sciences 10 (6): 278–85.
Gibbons, A. Forthcoming a. Multimodality, Cognition, and Experimental Literature. London and New York: Routledge.
Gibbons, A. Forthcoming b. “Multimodality and Cognition: Reading Word and Image.” In Interdisciplinary Perspectives on Multimodality: Theory and Practice, ed. Anthony Baldry and Elena Montagna. Campobasso: Palladino.
Gibbs, R. W. 2005. “Embodiment in Metaphorical Imagination.” In Grounding Cognition: The Role of Perception and Action in Memory, Language, and Thinking, ed. D. Pecher and R. A. Zwaan, 65–92. Cambridge: Cambridge University Press.
Gibbs, R. W. 2006. Embodiment and Cognitive Science. Cambridge: Cambridge University Press.
Grady, J. E., T. Oakley, and S. Coulson. 1999. “Blending and Metaphor.” In Metaphor in Cognitive Linguistics, ed. R. W. Gibbs and G. J. Steen, 101–24. Amsterdam and Philadelphia: John Benjamins Publishing Company.
Halliday, M. 1985. An Introduction to Functional Grammar. London: Arnold.
Huisman, R. E. A., J. Murphet, A. Dunn, and H. Fulton. 2005. Narrative and Media. Cambridge: Cambridge University Press.
Johansen Mange, E., and A. P. Mange. 1999. Basic Human Genetics. 2nd ed. Sunderland, MA: Sinauer Associates, Inc.
Johnson, M. 1987. The Body in the Mind: The Bodily Basis of Meaning, Imagination and Reason. Chicago: University of Chicago Press.
Khateb, A., A. J. Pegna, C. M. Michel, T. Landis, and J.-M. Annoni. 2002. “Dynamics of Brain Activation during an Explicit Word and Image Recognition Task: An Electrophysiological Study.” Brain Topography 14 (3): 197–213.
Klatzy, R. L., and S. J. Lederman. 2000. “Modality Specificity in Cognition: The Case of Touch.” In The Nature of Remembering: Essays in Honor of Robert G. Crowder, ed. H. L. Roediger, I. Neath Nairne, and A. M. Suprenant, 233–45. Washington, DC: American Psychological Association Press.
Kress, G., and T. van Leeuwen. 1996. Reading Images: The Grammar of Visual Design. London: Routledge.
Kress, G., and T. van Leeuwen. 2001. Multimodal Discourse: The Modes and Media of Contemporary Communication. London: Arnold.
Lakoff, G., and M. Johnson. 1980. Metaphors We Live By. Chicago and London: University of Chicago Press.
Lakoff, G., and M. Johnson. 1999. Philosophy in the Flesh: The Embodied Mind and its Challenge to Western Thought. New York: Basic Books.
Lakoff, G., and M. Turner. 1989. More than Cool Reason: A Field Guide to Poetic Metaphor. Chicago and London: University of Chicago Press.
Lederman, S. J., and R. L. Klatzy. 2001. “Designing Haptic and Multimodal Interfaces: A Cognitive Scientist’s Perspective.” In Proceedings of Collaborative Research Centre 453, ed. G. Farber and J. Hoogen, 71–80. Munich: Technical University of Munich.
Manguel, A. 1997. A History of Reading. London: Flamingo.
Marks, L. 2002. Touch: Sensuous Theory and Multisensory Media. Minneapolis and London: University of Minnesota Press.
Ryan, M.-L., ed. 2004. Narrative across Media: The Languages of Storytelling. Lincoln: University of Nebraska Press.
Shen, Y., and M. Cohen. 1998. “How Come Silence is Sweet but Sweetness is not Silent: A Cognitive Account of Directionality in Poetic Synaesthesia.” Language and Literature 7 (2): 123–40.
Stafford, T., and M. Webb. 2005. Mind Hacks: Tips and Tools for Using Your Brain. Sebastopol, CA: O’Reilly.
Stockwell, P. 1999. “The Inflexibility of Invariance.” Language and Literature 8 (2): 125–42.
Stockwell, P. 2002. Cognitive Poetics: An Introduction. London: Routledge.
Thesen, T., J. F. Vibell, G. A. Calvert, and R. A. Osterbauer. 2004. “Neuroimaging of Multisensory Processing in Vision, Audition, Touch, and Olfaction.” Cognitive Processing 5:84–93.
Thompson, A., and J. O. Thompson. 1987. Shakespeare: Meaning and Metaphor. Brighton: Harvester.
Tomasula, S., and S. Farrell. 2002. VAS: An Opera in Flatland. Chicago: University of Chicago Press.
Werth, Paul. 1994. “Extended Metaphor—A Text-World Account.” Language and Literature 3 (2): 77–103.
Werth, Paul. 1999. Text Worlds: Representing Conceptual Space in Discourse. London: Longman.
White, G. 2005. Reading the Graphic Surface: The Presence of the Book in Prose Fiction. Manchester: Manchester University Press.
Winterson, J. 1992. Written on the Body. Chatham: Quality Paperbacks Direct.
8
Multimodality and the Literary Text Making Sense of Safran Foer’s Extremely Loud and Incredibly Close Nina Nørgaard
INTRODUCTION
In a lucid analysis of the typography of a number of literary texts, van Peer (1993) points to a causal connection between the flourishing of literary typographic experimentation and the development of the technologies that enable such experimentation. “New media require new forms for dealing with language and literature,” van Peer argues, and accordingly predicts the likely reverberations of the (at the time) new media, like the computer and other technologies, in the literary output of the near future (van Peer 1993, 59). A quick look at the booksellers’ shelves proves his prediction right. Here we find a multitude of contemporary literary texts that make use of a variety of semiotic modes such as typography, graphics, color, layout, and visual images for their meaning-making. While still predominantly verbal, the explicitly multimodal nature of an increasing number of literary narratives calls for an analytical methodology that accommodates the interplay of different semiotic modes and recognizes the complexity of multimodal narrative meaning, where the significance of visual images, for instance, may well go beyond simple illustration. The approach to multimodality proposed by Kress and van Leeuwen (2001) and Baldry and Thibault (2006) would seem to be a useful tool kit for those who wish to analyze multimodal literary narratives. In seeking to develop a methodology that will allow analysts to deal with multimodal texts in an informed manner, Kress and van Leeuwen and Baldry and Thibault extend the basic ideas of M. A. K. Halliday’s Systemic Functional Linguistics and his view of language as a social semiotics to encompass the analysis of texts that are more than purely verbal. In Kress and van Leeuwen’s view of multimodality, “common semiotic principles operate in and across different modes” (2001, 2).
Rather than operating with specialist “grammars” with specialist terminologies for each semiotic mode in isolation, their aim is hence to develop a “grammar of multimodality” that will provide us with a consistent common methodology and metalanguage for dealing with all the different semiotic modes and their interaction in multimodal texts.1 In spite of the linguistic origin of their view of
communication, Kress and van Leeuwen make a point of emphasizing the equal status in principle of all semiotic modes and the currently changing role of verbal language in communication: “Language is moving from its former, unchallenged role as the medium of communication, to a role as one medium of communication, and perhaps to the role of the medium of comment” (Kress and van Leeuwen 1996, 34; their italics). It is the aim of the present chapter to follow this line of thought by exploring the applicability of a multimodal framework in the analysis of a literary narrative that employs different semiotic modes for its meaning-making and thereby rather emphatically invites such an approach. The text selected for analysis is Jonathan Safran Foer’s Extremely Loud and Incredibly Close (2005) about nine-year-old Oscar who is struggling to come to terms with the loss of his father during the September 11 terror attacks in New York. Through representative examples, my analysis will focus on Foer’s use of the semiotic modes of typography, layout, and photographic images as well as the interaction of these modes with that of wording.
TYPOGRAPHY
Even though the term “multimodality” would appear to imply the existence of “monomodes” and “monomodality,” it is important to realize that no such thing exists. According to Kress and van Leeuwen (2001), all communication is multimodal, thus also written verbal language since this mode consists not of wording only, but has a visual side to it, too, i.e., typography,2 which in turn consists of submodes such as shape and color.3 Typography is an aspect of literary narratives that readers tend not to notice much, especially not when the narrative is set in conventional typefaces such as black Times or Palatino. However, this does not make such typefaces any less semiotic than more eye-catching ones, but simply creates the meaning of “typographically conventional” in a literary context. In an attempt to create a “grammar of typography,” van Leeuwen (2006) sets out the following list of distinctive features of different typefaces: Weight (light ↔ bold4), Expansion (narrow ↔ wide), Slope (sloping ↔ upright), Curvature (angular ↔ rounded), Connectivity (connected ↔ disconnected), Orientation (horizontal orientation ↔ vertical orientation), and Regularity (regular ↔ irregular). To these categories, I would suggest the addition of the feature of Color (for a discussion of the list and the possible inclusion of further subdistinctions such as gloss, hue, etc., see Nørgaard 2009). In combination, and in contrast to the not-chosen, the distinctive features distinguish one typeface from another and provide the analyst with a consistent terminology for describing a given typeface when considering its meaning-potential in context. In Foer’s novel, much can be said about the use of typography: from the variety of meaning created by means of italics, to discourse in the shape of
numerals instead of letters and words. In the following, I will focus on the passage that stands out the most in terms of typography. One day, Oscar comes across a key in an envelope labeled “Black” in his father’s room. He senses the possible significance of the key and consequently contacts people he believes to have the expert knowledge needed to find the lock that the key fits. First, he contacts a locksmith to get information about the key, and then he visits an art supply store where he expects the employees to be experts on color and thus able to help him find out what “Black” might mean. The woman at the art supply store finds it “sort of interesting that the person wrote the word ‘black’ in red pen” (Foer 2005, 44) and takes Oscar to a display of pens where she shows him a pad of paper next to it. On the subsequent page, readers are given the impression that they are shown the actual pad of paper with handwritten scribbling in different colors. Visually, the graphological nature of the text is constructed in particular by features such as slope, connectivity, irregularity, and color, combined with a chaotic disorganized layout, which altogether make the text stand out against the background of the more conventional black typeface of the rest of the narrative. Only on the subsequent page is Oscar (and the reader) given an explanation by the woman:
most people write the name of the color of the pen they are writing with [ . . . ] when someone tests a pen, usually he either writes the name of the color he’s writing with, or his name. So the fact that “Black” is written in red makes me think that Black is someone’s name. [ . . . ] And I’ll tell you something else. [ . . . ] The b is capitalized. You wouldn’t usually capitalize the first letter of a color. [ . . . ] Black was written by Black.
(Foer 2005, 46) After having realized that “Black” is probably a name, Oscar sets out to track down everybody by that name in New York with hopes of finding answers about his father’s death. In the preceding passage, the local meaning-potential of color is determined by a complex multimodal interplay of color and wording: if the name of a color is written in that particular color, the word (e.g., “black”) is likely to refer to the color itself. However, if it is written in a different color, it is likely to afford an additional, different kind of meaning, namely that of “proper name.” In addition to this complexity of meaning in terms of color, the particular relation between wording and the visual, i.e., the linking of the two modes, is clearly meaning-making, too. It is thus significant that the readers read the words and see the handwritten text in the order that Oscar experiences it, i.e., that we see the pad of paper before its meaning is explained to us. Through the visual representation of the scribbling in color we consequently experience what we imagine Oscar experiences, while the phenomenon and its meaning is explained to us through the
wording of the narrative that follows. Even though the visual representation of the handwriting thus creates a special kind of meaning, it is fairly unlikely that the reader would have been able to make sense of the handwriting in color on its own. The full meaning constructed by the passage is clearly a multimodal construct that could not have been created by one of the modes alone. Inspired by van Leeuwen (2005a, 27–42; 2005b) I have argued elsewhere for the existence of (at least) three basic semiotic typographical principles: index, icon, and discursive import (cf. Nørgaard, 2009). In the case of typographical iconicity, the signifier resembles or imitates the signified as when the typographical feature of majuscules, i.e., visual salience, is employed to convey the sonic salience of someone shouting. The meaning-potential of the index, on the other hand, resides in a basically physical and/or causal relation between the typographical signifier and the signified. Thus the archetypical typewriter font, Courier, is often seen as an indexical marker that the text has been produced by a typewriter. Finally, typographic discursive import occurs when the associations of a given typeface are imported along with the typeface into a new domain from the domain where it originally belonged. While Courier is thus frequently seen as an indexical marker of having been produced by a typewriter, this meaning of “typewritten” may in turn be imported into a different context such as a book set in Palatino, where the typeface in itself comes to mean “typewritten” even if it is in actual fact not at all produced by a typewriter. In the example from Foer discussed here, we clearly have indexical typographic meaning in the sense that the typographic signifiers appear to invoke the material origin of their own coming into being so that we (along with Oscar) come to see them as indices that the people who tested the pens were actually there. 
In order to explain the kind of meaning created in this passage, Kress and van Leeuwen’s concept of modality5 is useful. In their work on visual communication, aiming to create a visual grammar, Kress and van Leeuwen have been inspired by the concept of linguistic modality that concerns speaker commitment in terms of probability, usuality, obligation, and inclination (Halliday 1994, 88–92), and is typically expressed by linguistic resources such as modal verbs (“may,” “could,” “would,” etc.) and adverbs (“possibly,” “certainly,” “unlikely,” etc.). In visual communication, Kress and van Leeuwen argue, similar kinds of meaning occur when visual resources signal “as how true or as how real something is represented” (cf. Kress and van Leeuwen 1996, 159–80; van Leeuwen 2005a, 160–77). All things being equal, a naturalistic photograph thus displays higher modality than a photographic image that makes use of soft tone or distinctive manipulation of color.6 In these terms, Foer’s handwritten pages in color signal “high modality,” giving readers the sense that what we see is “what we would have seen if we had been there” (cf. van Leeuwen 2005a, 168). Foer’s high modality choices in terms of typography hence add a certain
Multimodality and the Literary Text
authenticity to the passage, yet it should be noted that ultimately the pad of paper with the color samples is just as much of a construct as the rest of the narrative and has been created specifically for this particular novel as part of its meaning-making. On closer scrutiny we realize that the background is, in fact, of low modality in Kress and van Leeuwen’s sense of the term, in that four black lines form a black frame that constructs the meaning of “the edges of a sheet of paper,” and that the color and quality of the paper is simply that of the book that the reader is presently reading. What nevertheless makes us accept it as “real” (i.e., as the pad of paper that Oscar encounters) are the following three salient features: the graphological look of the text, the chaotic layout, and the use of color. These features all stand out, i.e., are foregrounded, against the expected (i.e., the conventional black printed verbal narrative of the previous pages) and do so to such an extent that we tend not to notice the low modality features with which they clash. Modality is more complex than indicated previously, however, since certain modality markers tend to combine with certain modes and communicative contexts. A difference hence exists between what would be perceived as respectively high and low modality in the abstract realism of diagrams in a science book and in photographic realism (cf. Kress and van Leeuwen 1996, 170). Even though readers may decode the handwriting in color in Foer’s novel as high modality and hence as “what we would have seen if we had been there,” this meaning at the same time appears to clash with what must (still) be considered the (expected) high modality of literary narrative, i.e., plain black typography. 
An important part of the meaning of Foer’s narrative appears to reside in exactly this modality clash, which constructs a postmodern irony similar to that encountered when a cartoon figure occurs in a feature film, or a human character in an animated cartoon. In rounding off the multimodal analysis of Foer’s earlier example, it should be mentioned that a significant difference in meaning exists between the edition of the novel in color and the black-and-white edition. Here, production and distribution (cf. Kress and van Leeuwen 2001) become meaning-making factors in their own right, since the meaning-potential of the passage discussed is clearly reduced in the cheap black-and-white paperback version of the novel, where the reader misses out on trying to make sense of the combination of color and wording that Oscar struggles to understand.
LAYOUT
Like typography, layout is a semiotic mode that is often backgrounded in written literary narratives. Although a novel would be very difficult to read if its pages were entirely covered by words, readers rarely notice the framing of each page by blank margins. Similarly, the processing of line breaks for new paragraphs in the decoding of text is seldom a conscious thing, despite
the semiotic nature of paragraphs as signifiers of “units of meaning.” While clearly participating in the multimodal construction of meaning in literary narratives, layout is so conventionalized that it is typically not noticed. Against this conventionality, Foer explicitly experiments with the semiotic potential of layout. A good example of layout being a meaning-making mode in its own right is the representation, or construction, of the way Oscar’s grandfather, Thomas, communicates with the surrounding world. Thomas has gradually lost his ability to speak, so in order to communicate he writes little messages in notebooks he carries with him. This is not simply explained to the reader through the mode of wording. When reading the novel, we come across a passage that contains one line per page, which gives us a sense of how Thomas uses his notebooks and what it might be like to communicate with him (e.g., Foer 2005, 19–27). While these pages are clearly intended to convey to the readers what Thomas’s notebooks would look like, the high modality of layout clashes with the low modality of the visual look of Thomas’s writing, which occurs as printed typography rather than handwritten text. Even though the surprising layout may well be so salient here that it somewhat overshadows the low modality of the typography, this seems to be yet another multimodal postmodern nudge of Foer’s in the shape of an apparent urge for mimesis that at the same time undermines the mimesis and thereby implicitly points to the textuality of the text (see also Gibbons’s chapter in this volume on the inherently self-conscious nature of multimodal novels). More interestingly perhaps, the use of layout here also seems to be an attempt to represent one mode (i.e., sound, or rather silence) through another mode (i.e., the visual), so that the visual blank (unfilled) space comes to represent aural silence.
It should, of course, be noted that since this is fiction, there is in actual fact no preexisting silence that has been resemiotized (cf. Iedema 2003), but it might be argued that the blank page creates the illusion that such resemiotization has taken place. Later in the novel (Foer 2005, 203–4 and 206–7), blank spaces indicate the words Oscar cannot hear of a conversation next door. Again, visual blank spaces appear to represent aural silence, and interestingly, line length, and hence visual space, seems related to time. This is, in fact, another aspect of the visual side of written verbal language that many readers tend not to give much thought: the fact that the spatial extent of sentences is meaning-making, too. Just think of the staccato effects of short sentences, or the rambling nature of very long ones as, for instance, Molly Bloom’s interior monologue in James Joyce’s Ulysses. In the passage from Extremely Loud and Incredibly Close, the blank spaces are furthermore intimately related to narrative perspective, since they place the reader in Oscar’s position by enabling us to read only what he hears and thus to know all and only what he knows. In this way, the mode of layout participates in the creation of narrative point of view, which must consequently be seen as a multimodal construct, too.
Another passage that springs to the eye because of salient layout choices occurs in a later part of Oscar’s grandfather’s narrative. Here, Thomas has so much on his mind that he starts worrying that there will not be enough space in his notebook to express it all: “There won’t be enough pages in this book for me to tell you what I need to tell you, I could write smaller, I could slice the pages down their edges to make two pages, I could write over my own writing, but then what?” (Foer 2005, 276) and later “I’m running out of room” and “this book that is nearly out of pages” (280). Rather than being conveyed through wording only, the same meaning is simultaneously constructed visually through the mode of layout. Thus the line and type spacing gradually decreases until the text is no longer readable (281) and the chapter ends with an almost completely black page (284). In itself, this iconic nature of the passage in terms of layout is, of course, surprising, as we do not expect text in a novel to be so dense that we cannot read it. The visual density is paralleled by a density of meaning, since Thomas virtually tries to explain and make sense of everything in this chapter, and emphasis is provided in that the “same” thing is conveyed—or constructed—through two modes at the same time. Added to this, attentive readers may sense a kind of visual–verbal cohesion, i.e., meaning-making ties (cf. Halliday and Hasan 1976), between the black pages and the verbal references to Oscar’s father’s grave that occur in the lines immediately preceding and following the black pages (Foer 2005, 281, 285), and perhaps even to Oscar’s search for Black and the “truth” about his father.
PHOTOGRAPHIC IMAGES
A central concept from Systemic Functional Linguistics employed by Kress and van Leeuwen in their visual grammar is that of Halliday’s three metafunctions of language. While Halliday claims that language has developed to express three different major kinds of meaning (experiential, interpersonal, and textual meaning), Kress and van Leeuwen argue that the same types of meaning may be found in a visual context, too. Visual experiential meaning concerns the ways in which we encode experience visually, i.e., the way images represent people, places, things, etc., and the relations between them. Interpersonal meaning concerns the visual encoding of interpersonal relations through the systems of gaze, perspective, and size of frame, which situate the viewer in relation to the represented people, objects, etc. Textual meaning is in Kress and van Leeuwen’s terminology called “compositional meaning” and concerns the compositional structures of the image in terms of information value, salience, and framing (cf. Kress and van Leeuwen 1996). Early on in Foer’s novel we are told that Oscar prints out images from the Internet and puts them in his scrapbook. When we encounter a number of photographic images in the novel, we are therefore likely to conclude that
these images must be images from this collection of Stuff That Happened to Me. In the following, I will focus on one image in particular (Foer 2005, 59). Among fourteen very different images (of, e.g., a wall full of keys, two tortoises, Laurence Olivier playing Hamlet, and a template for folding a paper airplane) is a photographic image consisting of a dark space on the left-hand side and a light one on the right. Against the light background, a falling man in the top right corner stands out as salient. Had it not been for the man, most readers would probably not recognize the black space to the left, yet with the man in the picture, we know immediately that this is the World Trade Center and one of the people who jumped to their death after the September 11 terror attacks. According to Kress and van Leeuwen (1996), the grammar of visual images compositionally resembles that of the sentence, in that in the Western world we tend to interpret what is to the left in the image as Given information and what is to the right as New information. Furthermore, there is a tendency to understand the top part of an image as Ideal and the bottom part as Real. While such general claims about the information value of visual images may seem somewhat simplistic, empirical research in fields like advertising and the design of Web sites has provided the categories with a steadily growing scientific confirmation. Furthermore, the categories of Ideal/Real are supported by evidence from cognitive science that humans in many contexts tend to evaluate “up” as good and “down” as bad (e.g., Lakoff and Johnson 1980, 14–21, on spatialization metaphors). When it comes to the information value of the image of the falling man in Foer’s narrative, the man is in Ideal, yet the readers possess the painful knowledge that he will end up in Real.
The information value of Given and New is also straightforward, since the World Trade Center is presented as Given while the falling man is in New. Let me as a brief aside mention that when trying to recall the images circulating in the media immediately after September 11, the images I remember displayed a similar distribution of Given and New. Perhaps these were the images chosen to reappear again and again because they made such good sense compositionally, until they were no longer shown out of respect for the dead and the surviving relatives. Similarly, the recurring images of the planes hitting the buildings tend to be those that depict the buildings on the left (Given) and the planes coming in from the right (New) (cf., e.g., Debatin 2002, 169–70). A few pages later in Foer’s novel, another image occurs in which the falling man is zoomed in on to the extent that a blurred image of him occupies the entire page (2005, 62). At this point of the narrative, the inclusion of this particular image is not accounted for by the verbal text, and the reader only gets an explanation about two hundred pages later in the narrative where it is revealed that Oscar has magnified the image in an attempt to see if the falling man could be his father (257). In Kress and van Leeuwen’s view, size of frame and thereby the social distance between the depicted participants and the viewer is an interpersonal visual resource of meaning.
To judge from the image from Foer’s narrative, interpersonal meaning is also at play when it comes to zoom, in the sense that by means of close-up the distance between the viewer and the object depicted is decreased. Usually with zoom, we get closer and see more, but to Oscar’s (and perhaps also the reader’s) great frustration, what happens here is that, in Oscar’s words, “the closer you looked, the less you could see” (293). The magnified photographic image of the falling man thus visually epitomizes the entire narrative of Oscar’s impossible quest for his father, to whom he will never manage to get any closer than to the falling man in the image. Not surprisingly, the image of the falling man plays a central role in Foer’s novel: as a participant in the verbal narrative, as a visual image the readers must relate to firsthand, and as verbal–visual cohesive links that tie the narrative together. In addition to this, Oscar, by the end of the novel, realizes that he can print out images of the falling man, reverse the order of the images, flip through them as in a flip book, and thereby make the man float upwards instead of falling down. The last fifteen pages of Foer’s novel consist of these images, thereby inviting the reader to play an active role in the creation of meaning by flipping through the last pages of the novel in order to make the man float from Real to Ideal. Ultimately, this element of the narrative would seem to construct a kind of interpersonal visual-kinesthetic “offer” (cf. Halliday’s interpersonal metafunction, 1994, 68–105), signifying something like “you can make the man float up again if you like,” although it may, of course, at the same time involve the temptation to flip through the pages in the wrong direction, too.
CONCLUSION
When analyzing the multimodal meaning-making of narratives like Foer’s, it is important not to limit the analysis to looking at the meaning created by the different modes in isolation, but also to consider what meanings are created by the specific interaction of the different modes in specific passages. In Baldry and Thibault’s words,

Multimodal texts integrate selections from different semiotic resources to their principles of organisation. [ . . . ] These resources are not simply juxtaposed as separate modes of meaning making but are combined and integrated to form a complex whole which cannot be reduced to, or explained in terms of the mere sum of its separate parts. (2006, 18)

Instead of passing off the use of visual images in literary narratives as mere illustration, for instance, it is imperative that we consider what kinds of meaning are created by the specific word–image relations if we wish to understand and describe their multimodal meaning-potential. In much contemporary communication, we presently face a change in the nature of
the relations between words and images in terms of a general tendency to move away from verbal language carrying the majority of meaning in many contexts, towards a communicative landscape where words and images appear to share the workload more evenly, or where images increasingly take over. An illustrative example of this change of the multimodal division of labor occurs in Iedema (2003, 33–37), who demonstrates how Macintosh computer user manuals have changed in less than ten years (from 1992 to 1999) from being oriented towards the verbal mode to chiefly consisting of visual images. In spite of the experimental nature of Extremely Loud and Incredibly Close, with its extensive use of semiotic modes other than written verbal language for its meaning-making, written verbal language still plays a (very) privileged role in Foer’s narrative. Not only does the majority of the narrative consist of verbal language set in inconspicuous layout and type; some of the information provided through the mode of wording furthermore appears to reflect an authorial lack of confidence in the reader’s ability to decode the meaning created by the different semiotic modes. On first introducing Oscar’s scrapbook, for instance, Foer apparently does not trust the semiotic potential of italics to create the meaning of “title,” but immediately provides an appositional explanation of the title: “I printed out some of the pictures I found [ . . . ] and I put them in Stuff That Happened to Me, my scrapbook of everything that happened to me” (2005, 42). In other cases, more is demanded of the reader in terms of the multimodal decoding of the text, as when the verbal explanation of the magnified visual image of the falling man occurs almost two hundred pages after the occurrence of the image.
In time, one might imagine that along with the new affordances of new media and technologies, authors and readers will gradually become more skilled producers and consumers of multimodal discourse and that these competencies may in turn lead to different uses of the different modes in literary narratives, or to a different division of labor between the different modes—hence the more reason for developing a multimodal methodology that can handle such discourse. While Halliday’s Systemic Functional Linguistics has proved a useful tool kit to draw on for the analysis of the wording of written verbal language in literature (cf., e.g., Toolan 1996; Nørgaard 2003), the approach to multimodality proposed, e.g., by Kress and van Leeuwen looks promising as an extension of that tool kit for dealing with literary narratives that employ several semiotic modes for their meaning-making.
ACKNOWLEDGMENTS
I wish to thank Michael Toolan for his helpful comments on an earlier version of this chapter.
NOTES
1. That Kress and van Leeuwen have selected a linguistic theory as their point of departure for such a grammar may seem somewhat ironic and probably reflects a tendency still to regard verbal language as a privileged mode in contemporary communication.
2. I here use “typography” in a very broad sense of the word, stretching from printed typography via calligraphy to handwriting.
3. Moreover, shape cannot exist without color, nor color without shape.
4. The terms employed by van Leeuwen here are “regular” and “bold,” yet following Felici (2003, 41–42), I find “light” a more suitable term here than “regular,” with “light” and “bold” as the end points of the continuum and “regular” situated in the middle. This choice of terminology furthermore has the advantage of avoiding the overlapping of terms otherwise present in van Leeuwen’s list: Weight (regular ↔ bold) and Regularity (regular ↔ irregular).
5. Please note that in multimodal theory, “mode” and “modality” refer to very different concepts. While sound, gesture, music, visual images, written and spoken language, etc., are seen as different communicative “modes” of meaning, “modality” refers to various semiotic resources for expressing “as how true” or “as how real” something is represented (cf. Halliday 1994, 88–92; Kress and van Leeuwen 1996, 159–80; van Leeuwen 2005a, 160–77).
6. See the following for considerations about the relation between modality, communicative mode, and context.
REFERENCES
Baldry, A., and P. J. Thibault. 2006. Multimodal Transcription and Text Analysis. London and Oakville: Equinox.
Debatin, B. 2002. “‘Plane Wreck with Spectators’: Terrorism and Media Attention.” In Communication and Terrorism. Public and Media Responses to 9/11, ed. B. G. Greenberg, 163–74. Cresskill, NJ: Hampton.
Felici, J. 2003. The Complete Manual of Typography. A Guide to Setting Perfect Type. Berkeley, CA: Peachpit Press.
Foer, J. S. 2005. Extremely Loud and Incredibly Close. London: Hamish Hamilton.
Halliday, M. A. K. 1994. An Introduction to Functional Grammar. 2nd ed. London, New York, Sydney, and Auckland: Arnold.
Halliday, M. A. K., and R. Hasan. 1976. Cohesion in English. London and New York: Longman.
Iedema, R. A. M. 2003. “Multimodality, Resemiotization: Extending the Analysis of Discourse as Multi-semiotic Practice.” Visual Communication 2 (1): 29–57.
Kress, G., and T. van Leeuwen. 1996. Reading Images: The Grammar of Visual Design. London and New York: Routledge.
Kress, G., and T. van Leeuwen. 2001. Multimodal Discourse: The Modes and Media of Contemporary Communication. London: Arnold.
Lakoff, G., and M. Johnson. 1980. Metaphors We Live By. Chicago and London: University of Chicago Press.
Nørgaard, N. 2003. Systemic Functional Linguistics and Literary Analysis. A Hallidayan Approach to Joyce—A Joycean Approach to Halliday. Odense: University Press of Southern Denmark.
Nørgaard, N. 2009. “The Semiotics of Typography in Literary Texts: A Multimodal Approach.” Orbis Litterarum 64 (2): 141–60.
Toolan, M. 1996. Language in Literature. An Introduction to Stylistics. London and New York: Arnold.
van Leeuwen, T. 2005a. Introducing Social Semiotics. London and New York: Routledge.
van Leeuwen, T. 2005b. “Typographic Meaning.” Visual Communication 4 (2): 137–43.
van Leeuwen, T. 2006. “Towards a Semiotics of Typography.” Information Design Journal + Document Design 14 (2): 139–55.
van Peer, W. 1993. “Typographic Foregrounding.” Language and Literature 2 (1): 49–61.
9
Electronic Multimodal Narratives and Literary Form
Michael Toolan
INTRODUCTION
Human communication is overwhelmingly multimodal, but in relation to narratives I propose a rough three-way distinction, of “monomodal” narratives, old-tech multimodal narratives, and contemporary digital-technology multimodal narratives (typically enabled by digitization and computer-dependent). My focus here is on the last of these, and on whether the potential openness and interactivity of some narratives of the latter kind, such that no two consumers will necessarily view the same texts and images, let alone view them in the same sequence, disqualifies them from consideration as narrative art. This depends on how one defines or conceives of narrative art. My conception of it is that as a convention-bound ideal the object of narrative art needs to appear permanent and stable, with a determinately sequenced experienced content, under notionally full authorial control; such constraining delimitations are the basis of artistic form. This definition clearly privileges similarity over difference (it wants all those encountering a particular art object at different times and in different places to meet essentially “the same object”). The privileging is not arbitrary, but rooted in the hope that radically different recipients of the stable narrative object might converge in a shared experience by means of a common focus on a common object; or at least the makings or the illusion of such commonalities. For this, an authorized and stable object, free from reader (or author) alteration (or one that gives every impression of being thus stable and complete), seems a precondition. This leads me to suggest a connection with an important contrast between some older monomodal art narratives (such as the oral tale, the ballad, and the short story), and most multimodal ones, including computer-mediated ones, namely the potential performance, in the individual’s own body (using voice and touch), of the former, and the impossibility of so performing the latter.
This feature of quotable possession, applicable to some monomodal narratives and entirely inapplicable to highly technologized narratives, seems to be an entirely independent issue; but I am not sure that it is.
None of the implied distinctions (monomodal or multimodal; art or nonart; embodied or technologized; narrative or other form) are truly absolute. But it is interesting to consider how the rich multimodal possibilities of digital technology may have to be constrained, by the creator of the digital work, for that work to qualify (for some consumers) as both narrative and art. The more artistically minded weavers of narrative using the new multimodal digital resources, I believe, are in effect reestablishing certain formal limits, including exclusion of fully independent consumer control of content or progression; and by these means such authors are bringing the (controlled) body, voice, touch—and art—back into these works.
QUOTABILITY OR EMBODIED PERFORMANCE
I want to begin by presenting the opening of one of my favorite written narratives. Imagine me reciting this to you, viva voce, as in fact I did at the conference that prompted this book publication.

Lily, the caretaker’s daughter, was literally run off her feet. Hardly had she brought one gentleman into the little pantry behind the office on the ground floor and helped him off with his overcoat than the wheezy hall-door bell clanged again and she had to scamper along the bare hallway to let in another guest. It was well for her she had not to attend to the ladies also.

I could, of course, go on, all the way to “like the descent of their last end, upon all the living and the dead.” I would thereby “give” you (convey to you, perform for you) the entirety of Joyce’s story “The Dead” (Joyce 1956). Every single word. Just as I could “give” you Wordsworth’s “Daffodils” in its entirety, the poem itself and not just the gist. But suppose now that I want to share with you the opening of one of my favorite films. The image is black and white, mainly dark, there’s a chain link fence, and then on it we can see an overgrown notice saying “No Trespassing” and there’s slow, eerie brass music, trombones I think, and the camera pans to show an incredibly haunted castle kind of gothic mansion, lit by a single light, and then the light snaps off just as the music abruptly reaches a final note. Oh and there are various animals in the garden or enclosure around the gothic mansion, including some monkeys I think. No speech so far, but shortly we reach a focus on a man’s lips and moustache and these fleshy lips hoarsely whisper a single word. . . . It’s so difficult to do, this sharing of multimodal products; it so much relies on your already knowing the film, in which case you may be able to visualize it far better than I can remind you of it. No matter how hard I try, I cannot really retell even the opening, let alone the entirety, of Citizen Kane.
I don’t have it in me, physically or otherwise, to reproduce the
sounds, the music, the images, the shifts in the images from far to close, the panning, all of which are essential contributory elements. I would have to come armed with technology such as a DVD player, in which case it would be disingenuous of me to say that I personally was retelling the Orson Welles film. With Joyce’s “The Dead,” however, if sufficiently motivated, I might memorize and retell the whole of that story. If you allow me to use a copy of the story as a kind of promptbook, I could retell the whole thing easily. You might object that in my oral re-creation of the full story some things from Joyce’s story are “missing”: the pages, the print, the paragraphing, the italicizing of certain words and capitalizing of others. And other things have been egregiously added, such as my tone of voice, and the distraction of looking at my face while I deliver the story (but you may look away or close your eyes). All this is perfectly true. But do they amount to alterations to the point that you could claim that you had not heard and encountered Joyce’s story in full? One would be hard pushed to maintain a parallel claim with regard to poetry, e.g., that one cannot properly experience Keats’s Grecian urn ode if one only hears it and does not read it. A distinction might be claimed between lyric poetry, arguably written to be heard, and the short story, written to be read. But the point remains that even though Joyce may have intended “The Dead” to be read (and never spoken), its essential materials are such that it can be performed orally and experienced aurally. We all know this, at some subliminal level, and know its implications for how we regard essentially and overwhelmingly verbal narratives by comparison with multimodal narratives, whether of the operatic or filmic kind, or of the more interactive type, found in hypermedia environments such as the Internet.
I do not have, as normal and natural bodily equipment, resources that can deliver a full compensatory equivalent to what the screen, camera, projector, and sound system deliver. Electronic multimodal narratives thus cannot be taken into my own body; whereas traditional literary narratives, so much less wedded to their familiar modality and technology (writing), can in principle be rendered fully equivalently in speech. Still, I am not claiming that verbal stories are the only forms we can internalize and perform; for example, one might be able to perform the multimodal form that is a song, accompanying oneself on a piano or guitar, although the constituency of those so capable is only a fraction of those who can read and recite. So it is not the case that all nominally monomodal narratives are capable of being internalized and fully reproduced by the individual, while all multimodal ones never are: that would be misleadingly absolute. But it does seem to be the case that traditional “monomodal” literary narratives are so lightly technology- and medium-dependent that they are especially so available. My point concerns quotability, or oral performance, the possibility as it were of bringing a written story or novel into your own body and voicing it in its entirety to others or just to yourself. Or, at whim, of recalling and
using brief snippets of a literary narrative and dropping them into conversation at lunch, or when just walking dully along. From “The Dead,” we might use timeless snippets like “The men that is now is only all palaver and what they can get out of you,” or “One by one, they were all becoming shades.” And so on. And none of this can you do with electronic multimodal creations. It’s even quite difficult with traditional multimodal forms: you can drop a sung line from “My Way” or Don Giovanni (“Regrets? I’ve had a few . . .” or “Andiam, andiam, mio bene”), but these will be pale shadows of the real thing. (As a kind of storyworld-internal multimodality, see Hall [1951] for an enjoyable exposition of the uses of material from Don Giovanni in Joyce’s Ulysses.) What we can recall and quote, by and large, are again only the verbal bits of iconic films, such as “Frankly my dear, I don’t give a damn!”; “We’ll always have Paris”; and “It’ll probably turn out to be a very simple thing.” We may well recall the images and sounds that accompany these words, but we lack the resources to reproduce them. The potential personal bodily possession (in our brains) and retellability (out of our mouths) of literary narratives is part of the special experiential relation we have with them. Nothing remotely comparable operates in our relation with films, let alone hypertext creations. And it helps us, if ever we make the effort to memorize and perform a poem or story, that these are normally complete and stable forms. We don’t memorize Heaney’s first Glanmore sonnet worrying that he might cut a few lines here, add some new ones there, one of these days; and we know it would make no sense for any of us to make such alterations. This possibility of a personal and embodied full possession of literature seems to extend only to poetry and fiction and song, not operas or plays.
Our ability to quote large chunks of a Shakespeare play is not a counterexample: we cannot satisfactorily and solely in our own person reproduce a performance of Hamlet. The distinction seems supported by the fact that an audiobook of one actor reading a novel is quite acceptable, while there seems to be no market for a single-actor recording of a play such as Hamlet. That would not be the real Hamlet; whereas an unabridged one-actor reading of Pride and Prejudice is, I submit, the real novel.
THREE TYPES OF NARRATIVE
As indicated in my Introduction, for present purposes I would like to propose that a broad distinction be recognized between three types of narratives: so-called monomodal narratives, low-tech or old-tech multimodal narratives, and new or digital-electronic-technology multimodal ones. Usually when people talk of multimodal communication they think of centuries-old writing and millennia-old speech as the traditional main alternatives, especially writing, which is sometimes characterized as a reducing of speech to a linear planting of symbols across a surface. This seems essentially monomodal
(although it is widely argued today that even written discourse is multimodal). Perhaps other kinds of narrative should also count as monomodal, such as mime unaccompanied by sound; music without words (perhaps especially music from a single instrument or a wordless voice); silent, textless film; and painting (see Ryan 2004, 21, for similar distinctions). But even these products almost inescapably have word-using titles, introducing an element of at least dual modality. The written narrative, therefore, unaccompanied by other contributory modes such as pictures, sounds, or smells, remains the dominant and best exemplar of whatever it is that multimodal narratives are different from, in their pronounced multimodality. This is so even though traditional paper-based writing often involves the sense of touch as well as that of sight (the distinction between touchable and untouchable writing should not be ignored). As for the distinction between older technology and electronic narratives, good examples of the former include folk song, operas, and graphic novels, while the computer-dependent hypertext story showcases the thoroughly modern multimodality. One of the things that often marks out the new multimodality from the old is the reduction of the interactional space to one that has as its focus a screen—whether this is a computer monitor, as seems currently more common, or a space on a wall. This doesn’t mean that everything in a new-tech multimodal narrative emanates from the screen—the sounds, for example, may come from a number of quite removed sources. But the screen, often quite a small screen, is the fons et origo; the computer screen occupies a larger two-dimensional space than that of an opened quarto book—but not much. In my contrasting of low-technology and electronic multimodality, with exemplars like folk song and the hypertext narrative, film narrative has no special prominence, whether of the celluloid or digital variety.
This may be partly to do with the technologies it uses, partly related to its still important public and shared forms of consumption (alongside, of course, the private and diverse forms enabled by video and DVD), and partly a reflection of the degree to which the collective auteur that makes a film narrative controls what the viewer/addressee gets. Perhaps one can say that film is the first or prototype new-tech multimodality, but is no longer, in the cyber era, the best exemplar. A better example of a thoroughly electronic or new-technology multimodal narrative is Oldton, which has a web presence at http://oldton.com/my_oldton.html. What does such a narrative suggest is characteristic of electronic multimodal narratives?
WHAT IS IT LIKE TO “READ” AN ELECTRONIC MULTIMODAL NARRATIVE?
The Oldton link takes you to a page entitled “In Search of Oldton,” on which appears an array of fifty-two playing-card-shaped panels, with
numbered hearts, clubs, diamonds, and spades at their corners, but with the main area of each card filled by an image and some text, these in such thumbnail miniaturization as to be unrecognizable and unreadable, until, as invited, you pick one. “Picking a card” means manipulating your mouse until the cursor is hovering over one of the card images and then clicking. As is often the case, this multimodal story has no clear, distinct preface or introductory abstract. But if, like an old codex addict, you do not click on one of the cards but on the leftmost and topmost link (In Search of Oldton), then you are taken to what the URL describes as the index page, http://oldton.com/index.html. This page chiefly contains, on the left, an image of scattered cards and the purchase price for this story or game, while on the left, at top, the author of this “90% true” story is declared to be Tim Wright, and the story is said to be “On the web, on the radio, and in print” (though in fact it is no longer—not even via Listen Again—on the radio) and below all that, the following text occurs:
How does a town just disappear?
“When I was six, my dad killed himself and we left the old town where I was very happy. Now I go back to look for it, it isn’t there . . .”
During the last couple of years I’ve been trying to locate Oldton and make sense of what happened there. From the texts, photos and other Oldton artefacts you send me, I am making a map of the place, a grid of 52 cards that locate all the people, events and emotions that have been lost until now. Turn the cards over, and you will find a story from my own past: a pack of half-truths that represent the last remnants of a relationship with my dead dad.
This multimodal narrative proceeds, potentially, via many competing “doorways” on the screen.
In the now-familiar way, the screen has a rich variety of hotlinks that, provided you point and click or perform an analogous topological focusing, will “take you” to various further linked pages with their own graphics, sounds, and arrays of hotlinks. Reading/viewing Oldton is thus somewhat like working through a small printed glossary or encyclopedia, with a finite set of entries, in which if all cross-references are assiduously followed up, all entries will eventually be read. Or, as Ryan (1994) has put it, “No matter how the reader runs the maze, the maze remains the same. . . . the author remains the hidden master of the maze.” As much might be said of the traditional novel or story, if we call each page an “entry,” but with the proviso that the pages must be read in numerical order. That is no small constraint. But ultimately more critical,
for questions of the artistic status of the work, will be whether the multimodal set of entries (pages, texts, images, sounds, etc.) is indeed finite and determined or in any sense not. The preceding comments do not do Oldton justice, however. Anyone who has moved through its pages, figuring out Tim’s Oldton, other people’s Oldton “memories,” and the shared mapping and information about Oldton, and the possibility of moving the double-sided cards around by left or right contiguous moves, or shuffling, or twisting, can see the depth and ingenuity of the site. It’s fascinating, it’s complex, it’s gamelike but not quite a game. Oldton harnesses some of the power of the Internet; but is it complex narrative art? If so, what makes it so; if not, what makes it not? Incidentally, if you follow the hotlinked name Tim Wright to http://oldton.com/timwright.htm you reach a page that tells you, inter alia, that “From Sep 03 to Aug 04, Tim was the Digital Writer In Residence for the Writers for the Future project.” And when you click on the hotlinked name Writers for the Future (http://www.writersforthefuture.com/) you reach—thank God!—that commonest of Web pages, a few short paragraphs of black font text in a plain white landscape-orientated panel set against a grey background, containing at top left a small yellow triangle with an exclamation point within, and that most reassuring message, words of comfort, key of which are “Address not Found.” Here is closure of a kind, consummation by rupture or breakdown, the only ending the Internet can ever deliver. End of story. What has happened here, that these Firefox page load error messages should have become a source of relief and respite, rather than of irritation and frustration? In a nutshell (not yet a webby term of art I hope) I’m suffering from too much information, not enough focus, not enough walls (between texts, tasks); which is why I am perversely content to have run up against a metaphorically brick digital one.
Usually, there is not enough in the way of walls (or edges, in Updike’s variant image, cited later), net, or tramlines. The idea of “not enough walls” makes me think, not entirely comfortably, of another Frost poem “Mending Wall,” and its double-edged praise of fences and separateness (where the farmer who is averse to intercourse happily insists “Good fences make good neighbours”). But is it not an acknowledged failing of many hypertext sites, narrative or otherwise, that, worse than War and Peace with its “too many words,” they have too many links, too many sources of semiotic communication, too many images, too much everything? That I suppose is why when my ISP returns that “Server not Found” no-can-do message I sometimes send up a little private cheer. Now I intend no general criticism of digital writing, nor a general plea for the upholding of merely traditional canons and norms. Nor do I wish to offend individual writers like Tim Wright, author of In Search of Oldton, which is about entirely serious matters. For a fascinating online interview with Tim Wright (actually done by e-mail, in January and February of 2008,
by Gavin Stewart), in which he talks in revealing detail about Oldton and his other projects, see Stewart (2008). By Wright’s own admission, the Oldton project both aimed to evangelize (his word choice) digital writing, and to help him to work through the loss of an Edenic childhood. But one of the Oldton project’s distinguishing features is for me the very reason it cannot qualify as narrative art: it incorporates “user-generated content within a narrative fiction format” (words taken from the Oldton Web site). It contains all sorts of images, sounds, and comments sent in by site users and pertaining to their own memories of Oldton-like childhoods, or of missing fathers, or of life in a small English town circa 1960. Those contributions have been moderated by Wright—so some authorial control is retained there—and loosely gathered and presented in the “Your Oldton” Weblog. The latter is one of three macro-strands of the site; but we should note—and it took me some time to discover this!—once you enter the “Your Oldton” Weblog you cannot get back (e.g., via a back button or link) to the Oedipal crossroads where you can choose between Tim’s Oldton, Your Oldton, and “Our” Oldton. The latter are effectively three distinct sites or silos: the viewer can respond to material on Tim’s Oldton by adding stuff to “Your Oldton,” but there seem to have been no changes to the former by way of response to new material in the latter. The three ring-fenced Oldton webs form a digitized archive, a memorializing, so that the matrix architecture of the project is closer to the text type Hoey (2001) has called the “discourse colony” rather than what for Hoey is its opposite, a narrative. This colony contains many narratives, and other forms too, including a fine poem by a Barry Taylor about his dad’s death. 
So there are forms within the colony that could be judged art; but if In Search of Oldton as a whole is a discourse colony, it can only be art of that form, not narrative form (just as a Mondrian painting or a Mozart concerto, I submit, are not narrative art, amenable to narratological analyses). Denying or rejecting such distinctions (i.e., radically transgressive of established forms) risks dismantling all the recognized categories in terms of which narrative and multimodal analyses are presently possible. Oldton is a mapping, initially onto a grid with fifty-two entry points (the cards); and a map is a picture, not a narrative. On the other hand, if we ask what it is that drives the creation of this map, and viewers’ additions to it, two complex emotions seem to dominate: grief and nostalgia. Grief motivates many of our most cherished narratives (personal or artistic), and I have always emphasized grief as a fundamental motive power in narratives (see, e.g., Toolan 2001, 45–46, 98); so I am certainly willing to concede that there are (many) narrative-like constituents in the discourse colony that is In Search of Oldton.
INTERACTIVITY AND AUTHORIALLY DETERMINED LINEAR PROCESSING
I have no objection to digital-technology or hypertext multimodal narratives whatsoever. I am simply suggesting that their variability of processing and
scope for interactive alteration may compromise their status as a narrative art form. The potential openness and interactivity of hypertext narratives can mean that no two consumers will necessarily view the same texts and images, and that even if they do they need not view them in the same order. There is no object that is the story of Oldton that stays the same across a multitude of consumptions by different viewers or the same viewer at different times. For various purposes, including artistic purposes, that is fine, of course; but not for the purposes of creating works that are both narrative and art. What is wrong with interactivity, a buzzword for twenty years or more in educational circles? Mark Meadows’s definition of interactive narrative pinpoints the problem for me: interactive narrative is “a time-based representation of character and action in which a reader can affect, choose, or change the plot” (2002, 62). Now a narrative in which the reader can change the plot gives rise to innumerable unpredictably different narratives. At its best, such collaborative storymaking could well be narrative art (in the hands of a Beaumont and Fletcher, or Louise Erdrich and Michael Dorris partnership), but even then it won’t be fixed and available to all as a completed object. Or if and as soon as it is, it ceases to be interactive in the required sense. The logic of interactive narrative ordains that out of one narrative starter kit may come many different narratives: reader-controlled degrees of divergence from the given source. As David Herman argues (this volume), a range of concepts from traditional narrative analysis and linguistics can be adapted to apply to multimodal story analysis: narration, focalization, character roles, the semiotics of gesture, deixis, and so on.
But in the case of interactive hypertext narratives, the objects to which these concepts are applied are often too protean in sequence and effect to permit us to analyze them as narratives (as distinct from analyzing them as algorithmic recipes for narratives). Or as Adams (1999) concludes, “Interactivity is almost the opposite of narrative; narrative flows under the direction of the author, while interactivity depends on the player for motive power.” At the heart of my reservations, then, about hypertext multimodal narratives as narrative art amenable to stylistic analysis, is their impermanence as objects with a determinately sequenced experienced content, under authorial control. Why should it matter so much that “art narratives” be entirely author-controlled (to be art) and have determinacy of narrational sequence (to be a single stable narrative)? The brief answer is that these seem required if art narratives are to be repositories of ideas, glimpses of coherence and explanation, complex belief fragments, in the way I believe we expect of such works. By belief fragments I do not mean moments in any grand explanatory narratives; on the contrary, I see literature as a huge ramshackle storehouse of contributions that displace holistic explanations or stories. As readers of literary narratives I believe we cherish the fragments they contain, of pictures of the world and of ourselves that we can live by, that are commensurate with the depth and breadth of our experience and our mental and physical responses to that experience.
None of these fragments in itself amounts to a grand mythic narrative, and often enough the fragments clash with each other, as though they were jigsaw pieces cut from differently shaped puzzles. But the fragments engage our thinking and feeling, and all the other physical and mental responses that can be summed up as “insight,” an inward look, a consciousness of consciousness. The fragments are extensive enough and stable enough to be worth believing in; they are not “mere words,” but the specific wording is so crucial that, if taken away, the experience itself is taken away. So the concrete word-for-word literalness of these texts is crucial, as is the related possibility of memorizing quotations and reciting passages from those texts. Again, literature is not alone in furnishing us with fragments of believable insight: many other art forms can do so also, but with these too there is always an assumption of the work’s stability and permanence. I initially titled this chapter “Hypertext Narratives: Storytelling with the Net Down,” as an echoic use of Robert Frost’s famous critical dictum in praise of meter, to the effect that writing poetry without meter was like playing tennis with the net down. Somewhat along the same lines, John Updike recently complained that the supplanting of the printed word by digital writing and the Internet-enabled “infinite wordstream” threatens us with loss of what the old-fashioned book at its best promised: accountability and intimacy. “The printed . . . book was—still is, for the moment—more exacting, more demanding, of its producer and consumer both.” Why? Because books have edges, whereas “in the electronic anthill, where are the edges?” (Updike 2006). Let me now cite another warning from Frost.
Writing about Frost’s narrative poem “Home Burial,” in which a couple speak to and against each other in their contrasting ways of grieving over the death of their child, Richard Poirier quotes Frost’s own words about the combination of the sayable and what is better left unsaid, in poetry and in relationships (Poirier 1977). “Poetry is measured in more senses than one,” Frost wrote to Sidney Cox in September 1929. “It is measured feet, but more important still it is a measured amount of all we could say an [if] we would. We shall be judged finally by the delicacy of our feeling of where to stop short” (Thompson 1964, 361). It could be said that the central subject of this poem is poetic form seen in the metaphor of domestic form—a debate between a husband and wife about how each “shall be judged finally by the delicacy of our feeling of where to stop short.” I believe we find, in the finest short stories, a similar stopping short, a measuring and giving of some so as to the more effectively go beyond telling or having to tell everything (Munro’s “Open Secrets”). It is done in lyric narrative poems like “Home Burial” as it is in the lyric narrative that is the short story. It certainly makes for a complexity of technique and texture when it comes to stylistic analysis. By contrast some of the more interactive, user-dependent digital-technology narratives are vulnerable to the objection that they have things too easy, without net or tramlines or other constraining rules, with just too
much technologically enabled freedom, to the point where my idea of how a literary narrative should aim to meet a challenge and create a freestanding fragment worthy of special attention and special belief-investment seems to have no place. In a recent essay (Kundera 2007, 154), the novelist Milan Kundera has powerfully reminded us of the importance of composition and form (accepted control and constraint) to the literary or artistic narrative. This is most apparent when, with the ostensible intent of rendering them “immortal,” novels are made accessible by adaptation into such modern multimodal forms as film, television, theater, or cartoon strip:
But that “immortality” is a chimera! For turning a novel into a theatre piece or a film requires first decomposing the composition; reducing it to just its “story”; renouncing its form. But what is left of a work of art once it’s stripped of its form? One means to prolong a great novel’s life through an adaptation and only builds a mausoleum, with just a small marble plaque recalling the name of a person who is not there. (Kundera 2007, 154–55)
A QUESTION OF CONTROL
In art of all kinds we find versions of this willing submission (on the part of creators and consumers) to limits, constraints, and restrictions. Where, in practice, new-tech multimodal narratives remove multiple constraints and allow the consumer to cocreate, they amount to a distinct cultural form, different from traditional narrative art. These freedoms of cocreation are not new—it was always possible for Mozart to let his friends think up a different ending to Don Giovanni at each performance, or for Rembrandt to let others paint the backgrounds in his portraits, or for someone to tell Shakespeare to let someone “fill in” the linking scenes and plot-advancing material. But the richness of digitized Web-based resources now available makes the consumer feel more enabled, more able to cut, paste, and compose, than ever before—especially where the creator invites them so to do. But if William Faulkner were alive and writing his stories today, might he compose “That Evening Sun” with two beginnings and three different endings and an invitation to the reader to write their own middle section, however they thought fit? I don’t think so; Alice Munro is alive and writing today, and she doesn’t. It’s a question of control—control in several senses, but especially authorial control of the textual blueprint, and the controlling but also constraining of the author by whatever formal delimitations that author is prepared to embrace. All authors are under constraints (and liberated by those constraints); it is simply that those formal constraints are most palpable when we contemplate a sonnet or villanelle. Taken together these could be referred to by the ambiguous phrase “the control of the author”; you cannot have art (here, literary narrative art) without the control of
the author. All artists are under formal constraints, and paradoxically are enabled by those constraints. At the same time, and even more than is the case for human communications generally, artworks are required to be analyzable. Here, multimodal forms generally present greater difficulties than written texts, because far and away our preferred means for transcribing and analysis remains writing. Writing works wonderfully well in the analysis of writing, but less well in other circumstances. The difficulties that semioticians and multimodality analysts have with devising an efficient and wieldy transcription system, commensurate with the complex product under scrutiny, are well known. We do not yet have compellingly insightful multimodal recording systems for the transcription/analysis of multimodal forms. Opera scores and film scripts and phonetic transcriptions have their uses, but all are reductive; whereas I can write out the whole of “The Dead,” thereby producing not a transcript but the story itself. Earlier I suggested that the critical question, for artistic status, is whether the multimodal product comprises finite and determined components or not. It is no bar if the author instructs us to absorb those components in any random way we see fit: that is the author’s privilege and a form of authorial control (but we are no longer contemplating one narrative). Nor can one have any objection to art forms that invite consumers’ commentaries and reactions immediately after the artistic event, as long as these are understood to be over the Updikean edge, or the other side of the Frosty wall, that marks the boundary of the artwork (cf. applause after a symphony, or someone writing a poem after hearing a string quartet: the poem does not become part of the quartet).
But if “interactivity” extends to the point where the work requires or allows completion by the reader, then the new technology and its affordances are being harnessed to something that, however potentially fascinating, inspiring, and productive, does not fit my no doubt culture-embedded definition of art. The difficulty with interactivity is that it undercuts immersion, and I believe a sense of immersion amounts to a design feature of art (an idea as old as the catharsis identified in Aristotelian poetics). In a brilliant discussion of the tensions and conflicts between immersion and interactivity, Ryan (1994) argues that traditional fiction can be highly immersive of the reader but in a sense also risks rendering us unquestioning participants. By contrast, postmodern metafiction enables us to see anew that the fictional world of the text is a “non-actual possible world,” so that postmodern knowledge is achieved at the price of expulsion from Edenic immersion. Uncannily (and here’s the third use of the word Eden in this chapter), the narrator of Oldton (90% Tim Wright) himself castigates traditionalists who want to know what each next card will reveal, and defends “noodling around” and seeming randomness. On the eight of clubs card, he remarks “The whole point of most online adventures is to *not* know what the next link will bring,” while on the nine of clubs he adds “You could say that
Adam and Eve got kicked out the [sic] Garden of Eden for using their noodles a little bit too much.” Like me, Ryan intends no sweeping dismissal of virtual reality resources and hypertext fiction, but she does firmly advance one general thesis concerning the anti-immersive nature of hypertext: “a genuine appreciation of a hypertextual network requires an awareness of the plurality of possible worlds contained in the system; but this plurality can only be contemplated from a point of view external to any of these worlds” (Ryan 1994; see Bolter 2000 for a similar view).
CONCLUSION
There is of course a powerful temptation (not less, for a student of integrational linguistics like me, but more!) to argue that anything can be a narrative, anything can be art, and consequently that anything can count as narrative art: it all depends on the situated negotiating interactants. I believe I actually subscribe to these axioms! But against that decontextualized openness one has to set the embeddedness of particular historicized cultural practices, categories, focusings, stipulations, and expectations of traditions or generations of situated negotiating interactants. These give rise to all the received systems and categories we rely on, among which most relevant here are: the narrative, the senses, the modes and media of communication, the novel, the short story, linguistics, narratology, the work of art, hypertext fiction, and so on. Against that contextualized background, a creative storytelling that “lowers the net” to permit “reading” in an almost infinite variety of sequences and to permit additions (rarely deletions, one notes) to the content is arguably by that very relinquishment raising the metaphorical quality standard that readers will tacitly impose. What actually happens when you play tennis with the net down? Assuming all other rules stay the same, the game gets easier in some respects (cf. playing golf with a meter-wide hole) since certain kinds of quality-eliciting difficulty have been removed. Winners will tend to be those with what we revealingly call more brute force and sheer stamina, who can keep on hitting the ball back until the opponent’s legs fail. More strength but less skill is needed. It’s all a question of making distinctions.
Similarly, the high-technology multimodal narrative is not, by virtue of the impossibility of its quotable possession, any less a narrative; but this performable possession feature may be an important and undertheorized ground on which these narratives are a profoundly different kind of narrative, the most alienated from the ordinary person’s body that our culture has yet devised for general consumption. And their nonincorporable unperformability may relate to why extant systems of transcription and analysis (in linguistics and narratology at least) continue to have such difficulties in devising a “grammar” of them that is commensurate with their complexity and is as accepted and
manipulable as our grammars of the sentence and the text. All this makes writing, a moving finger in the sand, a more humane and physically intimate activity than it is sometimes represented as. Digital hypertext narratives inevitably have some old narrative features (e.g., on any occasion of consumption the processing will be linear or sequential) along with some challengingly new features: some are too interactive, too open, with too much control relinquished by their creators, to be narrative art. Authorial control of content and sequence is crucial, and crucially it is partially relinquished in some electronic multimodal narratives. For all that, it is perfectly possible for new-technology multimodal narratives to cede no control of content or sequence to the user or addressee, in which case they obviously have the potential to be narrative art (as here restrictively defined). (By “genuine control” I mean the kind of authorial dictating of sequence that obliges us to read Chapter 2 of Middlemarch before Chapter 3; of course we may, in breach of contract, flout the form.) But some ceding of this control is often cited as a distinctive strength of hypertext fiction. And, for that matter, it is perfectly possible for an author writing in the old monomodal form of the print short story to hand over to the reader/addressee control of some of the sequencing and some of the content—e.g., writing a five-section story and leaving the third section blank, instructing the reader to write it up as they see fit; or presenting the text as unbound chunks, to be read in whatever sequence the reader chooses. These, too, would for me mean that the final product(s) could not be narrative art.
So it is finally neither a question of the semiotic modes of representation (they can be few or many) nor of the medium (screen, paper, sound waves), but the narratological requirement of authorial control of an invariant processing sequence that is critical for (my conception of) narrative art. Oral and print discourses can fail to meet that requirement; if they do they cannot be narrative art, whatever else they may be. Hypertext fictions, testimonies, and life writings are, thanks to the technological affordances, much more prone to fail to meet that requirement. Some time ago, Ryan came to a somewhat similar conclusion: “in literary matters, interactivity conflicts either with immersion or with aesthetic design, and usually with both” (Ryan 1994). Digital, Net-based technology enables kinds and degrees of interactivity, between countless globally dispersed addressees and a particular digital object or an online artist (viewable and audible in real time), that were utterly impossible as recently as twenty years ago. But these powerful new possibilities of interactivity do not make them automatically appropriate to so particular an artistic purpose as narrative art. And one price that electronic multimodality makes us pay is removal from a potential personal and embodied possession of the artwork. One conclusion then has to be that conventional monomodal literary narratives are still special, in certain ways. Yes, the new technologies have changed our lives enormously, and can be tremendously empowering. But
the old ways of doing things, including the old ways of multimodal narration, still have distinct strengths and affordances—somewhat changed, by the very existence now, as communicational options, of the new hypertext genres. Another conclusion is that artistically minded weavers of narrative using the new multimodal digital resources can be expected to establish certain formal rules and limits, including exclusion of fully independent consumer control of content and sequencing, and by these means to bring the constrained reader-viewer-listener—and art—into these works.
REFERENCES
Adams, E. 1999. “Three Problems for Interactive Storytellers.” The Designer’s Notebook. http://www.gamasutra.com/features/designers_notebook/19991229.htm (accessed October 21, 2006).
Bolter, J. D. 2000. Remediation: Understanding New Media. Cambridge: MIT Press.
Hall, Vernon, Jr. 1951. “Joyce’s Use of Da Ponte and Mozart’s Don Giovanni.” PMLA 66 (2): 78–84.
Hoey, Michael. 2001. Textual Interaction: An Introduction to Written Discourse. London and New York: Routledge.
Joyce, James. 1956 (1914). “The Dead.” Dubliners. Harmondsworth: Penguin, 175–226.
Kundera, M. 2007. The Curtain: An Essay in Seven Parts. London: Faber.
Meadows, Mark. 2002. Pause and Effect: The Art of Interactive Narrative. Indianapolis: New Riders.
Poirier, Richard. 1977. Robert Frost: The Work of Knowing. New York: Oxford University Press.
Ryan, Marie-Laure. 1994. “Immersion vs. Interactivity: Virtual Reality and Literary Theory.” Postmodern Culture 5 (1), http://www.humanities.uci.edu/mposter/syllabi/readings/ryan.html.
Ryan, Marie-Laure. 2004. Narrative Across Media: The Languages of Storytelling. Lincoln: University of Nebraska Press.
Stewart, Gavin. 2008. Online interview with Tim Wright. http://spooner.beds.ac.uk/mmrg/?p=134 (accessed May 12, 2009).
Thompson, Lawrance. 1964. Selected Letters of Robert Frost. New York: Holt, Rinehart and Winston.
Toolan, M. 2001. Narrative: A Critical Linguistic Introduction. 2nd ed. London and New York: Routledge.
Updike, J. 2006. “The End of Authorship.” Sunday Book Review, New York Times, June 25.
10 Gains and Losses? Writing it All Down
Fanfiction and Multimodality
Bronwen Thomas
FROM FANZINES TO FANSITES
Fanfiction provides an outlet for readers and audiences of literary texts, comic books, films, video games, and TV shows to share stories based on a particular fictional universe. This activity can be traced back to print-based fanzines dedicated to specific fandoms, but access and distribution (Kress and van Leeuwen 2001) have been significantly boosted in recent years by the advent of the World Wide Web. Publishing fanfiction online is easy and quick, and has the added advantage of being much more difficult to pin down in terms of copyright law. Like their print-based counterparts, fanfiction sites may be dedicated to a specific series of books or a TV show, but sites such as www.fanfiction.net allow users to be much more eclectic in their tastes and allegiances. Readers can navigate and browse between categories, and authors often contribute stories to a number of fandoms, moving readily between prose fiction, film, TV shows, and cartoons. Crossover fanfiction explicitly sets out to “cross” characters and situations from different fictional universes: characters from the Harry Potter novels by J. K. Rowling or the TV show Buffy the Vampire Slayer are particularly well traveled in this respect. Online fanfiction therefore seems to be relatively relaxed when it comes to medium specificity. After all, many of these fandoms are dedicated to multimedia franchises and forms of transmedia storytelling, so it is perhaps only to be expected that fans switch so readily from discussing the book to the film to the TV spin-off. Contributing to or accessing a story on a fanfiction site requires some degree of multimodal literacy and facility with basic Web site protocol, such as selecting from menus, choosing links, and so on.
The sites also contain features familiar to Internet users such as forums, Web communities, and message boards, and homepages are usually bedecked with eye-catching visuals, colors, and even sounds associated with the fictional world of the fandom in question. Authors’ profiles, another distinctive element of the design of these sites, make full use of available multimodal resources, incorporating different colored fonts, uploading user pics, and so on. It seems curious, then, that the stories published on these sites are in essence
indistinguishable from their print-based equivalents, and appear to eschew the possibility of utilizing the multimodal resources at their disposal.
FROM MONOMODALITY TO MULTIMODALITY AND BACK AGAIN?
Theories of multimodality have long argued that we should give more recognition to the materiality of writing, its visual and kinetic properties (Kress 1998), such that it becomes dangerous to equate multimodality unproblematically only with new technologies. Indeed, print fanzines might themselves be described as multimodal affairs, incorporating images alongside text, and including ads, letters pages, and editorial content alongside the stories. However, theorists have also stressed the need to explore how the practice of writing and reading on a computer screen may be qualitatively different from writing and reading a print text (Burbules 1998). For example, once printed, the “text” of a fanzine remains stable and distribution is dependent on cost and commercial considerations. With Web-based material, the concept of the “page” or the “text” becomes much more arbitrary and fluid, even if readers and writers continue to rely on such print-based terms. Though it is easy enough to produce print copies of online fanfiction, these will only ever be crude approximations of the stories as they appear on-screen, and seem amateurish in comparison with the production values of a magazine. I have argued elsewhere (Thomas, forthcoming) that fanfiction is best understood within the context of a “network culture” (Bolter 2001) where the boundaries between authors and readers become blurred, where authors can revise and update their work at will, and where the choices made by readers are affected by the design of the Web site and the presence of menus and links. This means that though the content of online and print-based stories would be difficult to tell apart, the design and presentation of the stories, and the distribution of power between participants, may be very different indeed.
While most fanfiction stories are individually authored, how and where they appear on the Web site will be dictated by the site design, and by editorial choices about how to categorize them. Not so different from the editorial process in producing a magazine, maybe, but fanfiction sites are fluid affairs, constantly changing and evolving, as stories are updated, categories are revised, and even interfaces may be altered. Fanfiction sites also have a potentially limitless capacity to store and archive stories, so that the user can have access to an author’s entire output, and compare stories produced years apart. Discussion of these stories must therefore pay close attention to the specific “production format” (Kress and van Leeuwen 2001) in question, including aspects of the interface design, and the roles of Web site designers, editors, and betareaders.1 Very little work has been done to date on the design and technological affordances of fanfiction sites, though the
process of publication and the practices involved have been the subject of some scrutiny (e.g., Pugh 2005; Thomas 2007). In this chapter, however, I wish to focus primarily on the question of why online fanfiction writers content themselves with a largely monomodal discourse, and the “gains and losses” of their attempts to “write down” their favorite multimodal universes. We can perhaps understand this choice more readily where fanfiction is based on “classic” prose fiction, so that an author may be trying to capture or even re-create the style or narrative voice of the source text. But what about fanfiction based on multimedia texts, such as films, manga comic books, or TV shows? What happens when the fiction is based on a “multimedia stew” (Gardner 2008)? Why do fans want to respond by “writing it all down” when they could (as others have done) create their own YouTube video or draw some fan art? This chapter will explore these questions with a view to critiquing Kress’s claims for a “turn to the visual” (1998, 56), and his contention that writing will gradually only become the preserve of a cultural elite (2003). Fanfiction throws into question not only dire warnings about the decline of writing, but also the idea that “writing appears on the screen subject to the logic of the image” (Kress 2003, 8). My analysis will challenge the notion that it is always “old” media that must compete with and remediate the “new,” as is suggested in Bolter and Grusin’s theory (2000) and is implicit in many theories of multimodality. Furthermore, I will examine and challenge the extent to which recoding of content across different modalities must necessarily be framed within terms such as those of “loss” or “gain” (Kress 2005).
LOST FANFICTION AND THE “MULTIMEDIA STEW”
The discussion will focus on fanfiction devoted to the American television show Lost, which has become notorious for its narrative complexities, particularly the use of symbolism, hermeneutic gaps, and red herrings. The show follows the adventures of the survivors of Oceanic 815 on a mysterious island populated by polar bears and smoke monsters, where the group live in constant fear of “The Others.” Lost is supported by a wide range of multimodal resources, and viewers are highly likely to engage with the narrative over multiple platforms (Journet 2008). Novelizations of the TV show include Endangered Species (2005) and Secret Identity (2006), while a number of books accompanying the series claim to unlock its mysteries. However, the “multimedia stew” that surrounds the TV show has mainly been cooked up online. On Web sites devoted to the show, fans discuss at great length the significance of colors, pieces of music, and recurring motifs, such as the close-up of an eye on which many episodes closed. Dedicated fansites often explicitly try to replicate the style and aesthetic of the TV show, creating disorienting visual and aural effects and setting games
and puzzles for the user. For example, www.thetailsection.com borrows the logo of The Dharma Initiative, the mysterious organization that had previously conducted experiments on the island, while www.enterthehatch.com replicates the countdown device made familiar in the first few shows of the series. Dedicated fanfiction sites such as www.lostfic.com also play with visual cues, incorporating images of key characters on the homepage, along with a cartoon icon based on one of the more bizarre of the island’s inhabitants, the polar bear.
Writing it Down
While Lost fanfiction must be considered within the context of this “multimedia stew,” the fact remains that for those who participate in this activity, it is all about the writing. The fanfiction takes its cue from the TV show in many ways, most obviously where fans refer their readers to specific episodes or plotlines. In the TV show, especially in the first few seasons, each episode follows the backstory of one of the characters, and is focalized from his or her perspective. As Hutcheon (2006) has pointed out, prose fiction based on screen texts often focuses on fleshing out characters’ backstories, or providing the reader with the point of view of minor characters. Lost fanfiction has a high proportion of stories using first-person narration, and frequently delves into the characters’ pasts (and their futures). Such narrative choices seem to bear out the commonly held assumption that prose fiction can “do” introspection and internal thoughts better than TV or film, and thus compensates for the “losses” of the latter. However, any crude attempt to divide up gains and losses in this way is hotly contested by Hutcheon, who argues that screen texts have developed devices that can just as effectively suggest a character’s thought processes or inner turmoil. Fans also use their stories to test out hypotheses based on the gaps left by the TV show, offering another potential “gain.” In this sense, “writing it down” facilitates a process of rationalization and examination, of “fixing” the narrative so that it is available for inspection. Of course, “fixing” the narrative also caters to the fans’ desire to hold on to the story so that they can keep revisiting and enjoying their favorite characters and plotlines, while also sharing their enthusiasms and passions with others.
In this respect, fixing is not so far removed from fixation, evidenced in the way fans’ stories elaborate on a specific habit or characteristic of an individual or a seemingly trivial or inconsequential detail of the plotting. In relation to narrative time, Hutcheon (2006) claims that whereas prose fiction can move the reader around from one period to another, conveying the “meanwhile,” “elsewhere,” and “later” is more readily achievable in a visual medium through devices such as the flashback or the dissolve. In Lost, flashbacks are sometimes explicitly signposted by a “whooshing
noise,” showing the character in close-up deep in thought, or, more rarely, are accompanied by explanatory captions (http://www.lostpedia.com/wiki/Flashbacks). Flashbacks are frequently linked thematically to events taking place on the island, but elsewhere parallels and even relevance may be much more difficult to detect. Moreover, Lost’s use of the flashback and the flash-forward may be deliberately disorientating: in “Through the Looking Glass” we only find out that what we have been watching is a flash-forward at the very end of the episode (http://www.lostpedia.com/wiki/Flashforward). In addition, the show offers little sense of continuity between episodes or between the myriad plot strands that are introduced. Writing it down involves attempting to recapture some of these televisual techniques that are such an intrinsic feature of the show. But the process also offers the “gain” of flattening out time, so that it becomes easier to try to make sense of the “before and after” that is often so difficult to grasp with the TV show. Perhaps one of the main reasons for “writing it down” is that it provides fans with the opportunity to publicly contribute to and participate in ongoing discussions about the show and its characters. Web sites have made it much easier for fans to participate and interact in this way, though as I have claimed previously (Thomas 2007), the degree of editorial control over published content may vary considerably from site to site. Writing it down is also a way of getting valuable (and virtually instantaneous) feedback: many fanfiction authors present themselves as budding writers who are eager for advice and guidance. In this regard, they may be subscribing to the process recognized by Kress and van Leeuwen (2001) whereby users want to associate themselves with writing as a mode that continues to enjoy a high status within their culture.
At the same time, however, writing television fanfiction is seen as liberating precisely because there is no necessity to emulate or compete with an authorial voice (fanfiction author nerdork, LiveJournal 2008), and because TV texts typically do not carry the same kind of high cultural baggage as a classic novel. Moreover, posting a prose narrative is acknowledged as being infinitely easier than producing a video for YouTube: as fanfiction author suryaofvulcan (LiveJournal 2008) rather anachronistically puts it, all one needs to get started are the basic tools of pen and paper! Fanfiction sites have fostered a review culture where the process of writing is laid bare, and where the story is subject to constant updating, discussion, and revision in a way that is simply not possible with a print text (Thomas, forthcoming). The ability to publish stories online also contributes to the sense of ownership and investment fans have in the fictional worlds they write about. Fanfiction here is about wresting control away from the makers of the source text, especially where the fans disapprove of the direction taken in characterization, plotting, and so on. Writing it down can therefore offer an opportunity to voice one’s disapproval in a public way, and to invite debate and a working through of any grievance or dissatisfaction.
Writing it All Down
“Writing it all down” takes us a stage further, implying that the act of writing is therapeutic in some sense, providing a release for the author, and a way of externalizing and working through problems and anxieties. This is borne out in innumerable author’s notes and comments, where fans write about how identification with a particular fandom helped see them through troubled times, such as a bereavement or bullying. Identification with fictional characters thus becomes a kind of projection, a playing out of fantasies as well as anxieties. Although this chapter will focus primarily on “writing it down” rather than “writing it all down,” in order to try and understand more about the practice of writing fanfiction, it will be necessary at times to discuss the fans’ motivations, for which plenty of evidence exists in forums, comments, and notes posted on fanfiction sites.
FANFICTION AND NOVELIZATION
Adaptation theory has always been sensitive to the fact that word and image are governed by different logics, and that “The world told is a different world to the world shown” (Kress 2003, 1). Yet the process of “writing down” a film or TV narrative has been neglected to date in favor of a “one way street leading from older, more elite media to newer, mass-cultural media” (Hark 1999, 175). When a phenomenon like novelization is discussed, this has tended to be in quite negative terms (Baetens 2007), perhaps because the process has been associated most often with crude commercialism and the idea of “cashing in” on a successful franchise. Page to screen adaptations frequently disappoint in terms of viewers’ expectations, for example, where an actor is miscast, or the fictional world is transposed to another location or another period. The problem for writers taking on well-known TV shows is that real-world actors have become “welded” (Hark 1999, 177) to their roles, so it becomes difficult to dislodge from one’s mind the appearance, gestures, and mannerisms of that individual as an attempt is made to concretize them on the page. Nevertheless, for those immersed in a “traversal culture” of the kind described by Lemke (online, n.d.), it becomes ridiculous to try to isolate one modality or medium from another. Instead, as Hutcheon (2006) maintains, it may be preferable to conceive of the process as dialogical in the Bakhtinian sense of the term, rather than to always adhere to crude hierarchies or chronologies based on dubious categories such as that of the “original.” Baetens (2007) claims that novelizations are both anti-adaptation and anti-remediation in Bolter and Grusin’s (2000) sense, as their anachronistic status poses a challenge to the idea of progress he sees as implicit in the remediation model. Baetens goes on to contend that their existence “should force us to rethink more than one stereotypical belief about the relationship
between media” (235). Such rethinking might also fruitfully be extended to theories of multimodality, which have been accused of positing an overly simplistic trajectory from the “settled age of print literacy” to one in which image and screen are more dominant (Prior 2005, 24). In terms of adaptation theory, Baetens’s comments might also provoke more discussion of the relationship between source texts and their adaptations. For example, very little attention has been paid to the question of how audience members familiar with the source text of an adaptation may respond very differently to those who only seek out those texts afterwards. Such an analysis might provide invaluable insights into the audience’s relationship with the characters, depending on whether they have seen those characters fleshed out and embodied on the screen before they encounter them in the pages of a book. Baetens (2007) maintains that there is no semiotic gap between the novelization and the screen narrative, because what is being adapted is usually the printed script of the TV or film. Here Baetens may be thinking more specifically of the professional tie-in or novelization, as fanfiction writers cannot be presumed to have the same access to these resources. Baetens does acknowledge the subversive potential of fanfiction, and its ability to appropriate these media texts for the expression of subcultural sensibilities. But he is perhaps guilty of focusing too narrowly on the textual product without sufficiently taking into account the particular set of social and aesthetic practices surrounding it. For example, a key feature of stories based on television shows is that author and audience share knowledge of how specific character roles and settings have been embodied and concretized.
Fanfiction author thelauderdale (LiveJournal 2008) argues that this sense of belonging to an audience who experience the narrative collectively is something that still persists as an important aspect of TV fandoms, despite the fact that multiple channels, DVD recordings, and the like have made it less likely that we still sit down and watch the shows at the same time.
THE MODES OF TELEVISION NARRATIVE
The ongoing nature of many television narratives, with their “infinitely extended middles” (Fiske 1987) as opposed to clear beginnings, middles, and ends, also affects the author–reader relationship and contributes to creating a distinctive kind of fan community. Of course novels too may be serialized, and the prevalence of fanfiction dedicated to the Harry Potter series of novels by J. K. Rowling is testament to the fervor and excitement publication by installment can help to generate. With television narratives, not only is authorship commonly a team affair, even setting aside the roles of editors, producers, and the like, but over the course of a long-running TV show there may be several changes in personnel, and accompanying that, changes in the show’s style, even its ethos. A good deal of the excitement
(and frustration) generated by Lost comes from the fact that fans have no idea as to the direction the show’s makers will take it, and there has been constant speculation about the number of series to be made and the possibility of a feature-length film. Fanfiction, itself often published in installments, also has the quality of open-endedness, inviting a greater sense of involvement and participation on the part of users. Once again, therefore, it is vital to consider the social practices of specific fanfiction communities, if we are to better understand the “gains and losses” of what they actually do.
“Fleshing Out” Television Characters
Kress and van Leeuwen (2001) borrow from Goffman’s (1981) notion of the actor as the “animator” of a role, to highlight the various modes that contribute towards bringing a character to life, particularly patterns of stress, rhythm, and ritualized modes of phrasing that may be unique to that individual. Kress and van Leeuwen go on to argue that recoding may bring gains in accessibility, for example, preserving an actor’s performance over time via transmission on radio, television, or film, but that this process also necessarily leads to a “loss” of embodiment and some inevitable decontextualization. Fanfiction is to some extent born out of a sense of “loss” and a desire to preserve or re-create, perhaps reinhabit the embodied representation of favorite characters and fictional worlds. My analysis will focus on strategies employed by fanfiction authors to try to compensate for this “loss,” and will explore how far such writing may be capable of creative adaptation (Kress and van Leeuwen 2001). The audience’s engagement with the people who populate their TV screens is worthy of closer scrutiny. While watching actors on a screen is very different from watching actors who share the same physical space as us, as is the case in the theater, even from the earliest days of cinema and television, audiences have developed intense attachments to, and identification with, their screen idols. As John Fiske (1987) has demonstrated, our relationship with television characters is particularly intense because we may follow their stories for months and years, and because of the qualities of “liveness” and “nowness” that television offers. With film actors, it is less likely that they will only be identified with one role, and they are not part of the daily furniture of our lives in the same way as can be the case with television stars.
Television characters function as indexical signs (Kozloff 1992), impossible at times to separate from the actors who play them. In Lost a running gag involves the fact that the actor who plays Hurley never seems to lose weight despite the fact that the passengers are supposedly on starvation rations. The concept of “canon” as applied to television narratives thus demands fidelity to the verbal and paralinguistic traits exhibited by the actors. Though characters in serial narratives can evolve and have memories in the way that characters in television series typically do not, they must exhibit “instant recognisability” (Ellis 1982,
272), typically achieved through the repetition of familiar catchphrases or physical gestures of some kind. Discussions about why they write fanfiction reveal that for the authors, it is all about the characters (LiveJournal 2008). Moreover, authors generally claim not to distinguish between fandoms in terms of media, or to approach writing stories based on multimedia forms any differently than they would those based on novels or short stories. For fanfiction readers, the pleasure of recognition, of meeting favorite characters again is paramount, as is keeping faith with preferred relationships (or “ships”). At the same time, for both authors and readers, there is also pleasure to be had in approaching these familiar characters from new angles, placing them in different situations, or focusing on periods in their lives not previously explored. For the fans, their experiences of transmedia storytelling mean that the worlds of the book, the film, the video game all intersect, and they are able to sustain their identification with, and responses to, particular characters across their different realizations. Nevertheless, fanfiction stories do explicitly refer to the actors who animate the roles, by means of comments about their physical characteristics, mannerisms, and so on. Reviews habitually praise authors for keeping “in character,” and departures from canon in terms of characterization often provoke the strongest negative reactions. For example, unahappyjater2 berates the author of “Beautiful Strangers” on www.fanfiction.net for an “extremely OOC” [out of character] Jack, arguing that he is more complex than this portrayal would allow, and taking issue with the fact that “he would never taunt Kate” in the manner suggested by the story.
Such negative comments partly reflect the reviewer’s interpretation of the characters from the source text (in this case the TV show), but also reflect the ways in which they project outwards from this as though to claim some kind of ownership of the character’s personality, manner of speaking, and so on. Thus while writing it down may help support continuity of character across different media, clashes and inconsistencies may emerge where a writer tries to capture nuances that are more difficult to capture in writing. Writing it down is in a sense always necessarily metonymic, selecting from a character or actor’s physical traits only those that are most significant, as an exhaustive description would be virtually impossible. But authors can draw on the fact that readers will have a memory of that individual as they have been embodied on screen, and in Alternative Universe (AU), slash fiction, or crossover stories may deliberately take liberties with almost any aspect of the character’s personality, sexuality, and even biology.
SPEECH AND DIALOGUE: A LINGUISTIC FINGERPRINT
Norman Page (1988) has claimed that the speech of an individual is as much a part of how we identify and recognize them as their fingerprints.
Fanfiction writers make great efforts to capture the speech rhythms and idiosyncrasies of the characters. Several of the key characters in Lost have distinctive catchphrases or verbal mannerisms, and these components of visual narratives have been shown to be “the most portable, and the easiest for a viewer to extract and make his own” (Kozloff 2000, 27). For example, Hurley is as easy to recognize by his speech, and constant repetition of “dude” and “freakin” as he is by his long curly hair or by his enormous physique. Similarly, Sawyer has a habit of giving each of his fellow survivors nicknames (“Freckles,” “Sweetcheeks”), while Charlie and Desmond both have recognizable British regional accents (Mancunian and Scottish, respectively). Lost fanfiction relies heavily on these aspects of the characters’ speech, and draws on all of the traditional devices and conventions available for the representation of speech from the novel, such as using capitalization to suggest raised voices or dashes to imply hesitation. Italics are routinely used to convey a shift from speech to inner thought, and graphic devices tend to be employed much more liberally where the writer attempts to recapture the sense of urgency that is such a key feature of the TV show: for example, Pen Liddin uses letter repetition (AHHHHHHHHHHHHH!), capitalization, and orthographic changes (“YOU DARE QUESTION MEE?”) to underline the increasing intensity of the action in “Lost’s Grand Finale” on www.fanfiction.net.
ANALYSIS: “SAWYER’S BOOK CLUB”
“Sawyer’s Book Club,” by eponine119 (www.lostfic.com), focuses on a series of conversations between three of the main characters from Lost: Hurley, the overweight Latino lottery winner mentioned earlier; Shannon, the spoilt prima donna who is killed off in series two; and Sawyer, part action hero, part sardonic commentator on events. The story initially draws on references to Watership Down from the episodes “Confidence Man” and “Left Behind,” as Hurley and Sawyer try to interpret the significance of Richard Adams’s novel.3 Later on in the story, Hurley and Sawyer discuss TV shows, and Sawyer tries to explain to Shannon why he has become the unlikely repository of reading material: “No TV on the island, sugar. Makes it easy.” The story is full of these knowing comments about the fictional status of the characters, including a discussion by Hurley and Sawyer about the distinction between first- and third-person narration, and a telling exchange where Hurley tells Sawyer he reminds him of Sam Spade, to which Sawyer replies “That’s crazy . . . Everybody knows Sam Spade is Humphrey Bogart.” The story consists mainly of direct speech, and makes full use of the characters’ verbal idiosyncrasies, including Hurley’s trademark “dude” and Sawyer’s fondness for nicknames, as he refers to Hurley variously as “Pavarotti” and “Puff Daddy.” The story is so dialogue heavy that it could
conceivably be played out as a script, and the speech tags accompanying the dialogue are focused on how the characters react to one another, performing much the same role as the reaction shot in television drama. According to viva_gloria (LiveJournal 2008), when writing TV as opposed to book-based fanfiction, more attention is given to the character’s body language as this is vividly played out in front of us to such an extent that a character may even remain silent throughout a scene. Up to a point, then, the story appears to allow the reader to revisit already familiar characters by rehearsing their quirks and idiosyncrasies. Hurley and Sawyer remain very much “in character,” and the humor derives from the fact that they behave in a predictable fashion, much as we expect of the “instantly recognizable” characters from a television series. However, the story’s self-reflexivity and ironic commentary on the show suggest another layer to the narrative. In her study of fans of Lost, Debra Journet (2008) found that their discussions exhibited all the hallmarks of the kind of “close reading” and critical appraisal usually associated with the study of literature. With “Sawyer’s Book Club” this kind of discussion is incorporated within the story itself, seamlessly and playfully offering different kinds of reading pleasure. Moreover, once we begin to examine the dialogue more closely, we can see that far from focusing purely on the “external” or the “objective,” the characters’ speech, and the narrative framing accompanying that speech, encourages us to engage with the characters’ emotions and thoughts, beyond the level of what they actually say, to consider what they actually might mean, and what they may be thinking while they are speaking. The most obvious way in which this is achieved is through the use of speech tags, for example, portraying Sawyer “trying to decide” how to react to the conversation, or “wondering” just what Hurley is setting out to achieve.
Speech tags represent one of the key ways in which the early novel attempted to re-create for the reader those aspects of a physical, theatrical performance that could not be conveyed by the dialogue alone, functioning as a kind of “stage direction” (Page 1988). But as well as providing the reader with paralinguistic information (“Sawyer leans forward”), speech tags can tell us about the character’s motivation, emotion, reasoning, and the like, and also may carry the interpretative gloss of the narrator. Fanfiction author deird1 (LiveJournal 2008) identifies speech tags as an important tool for the writer who seeks to add their own stamp to the characters’ dialogue, further illustrating fanfiction’s duality in being able to provide users with “more from” as well as “more of” (Pugh 2005) their favorite fictional universes. “Sawyer’s Book Club” makes great play of the banter between characters that is a key feature of the TV show, while also hinting at what might lie behind the characters’ words and their displays of verbal bravado. The story also displays sophistication in its ironic reflection on the TV show and on the nature of fans’ identification with its characters. While it has to be
allowed that this level of sophistication is not shared by the vast majority of fanfiction stories, nevertheless the analysis has helped to highlight aspects of “writing it down” that offer us some insights into the process and the pleasures this affords.
CONCLUSION: WEIGHING UP THE GAINS AND LOSSES
Thus, although at first sight fanfiction may appear quaintly old-fashioned, almost retrograde from a multimodal perspective, we have seen that there are many affordances made possible by “writing it down.” These include the possibility of fleshing out characters, exploring their innermost thoughts, and providing a space wherein plot enigmas and intricacies may be worked through. But we have also seen that it is necessary to consider the specific features of writing online, and to examine the precise social practices that make up this activity. In particular, for many fanfiction devotees, “writing it down” is primarily about sharing one’s enthusiasms, frustrations, and creative aspirations within an environment that is largely supportive, and always responsive. Fanfiction shows us that there is no easy way to map out “gains and losses” between different modalities and across different media, such that we may need to revisit some of the hype and myths surrounding new technologies, and reassess the ways in which they may help reinvigorate writing.
NOTES
1. Betareaders offer their services to fellow fanfiction authors to check on spelling and punctuation, and to give advice and feedback on drafts of stories before they are published on the site.
2. In Lost fanfiction parlance, a “jater” is someone who advocates the Jack/Kate “ship” (relationship) as being central to the show, in opposition to “skaters” who hold out for a Sawyer/Kate “ship,” or “jawyers” who write, or are interested in, stories involving Jack/Sawyer.
3. Tracking and interpreting the significance of intertextual literary references on Lost has become something of a spectator sport not just on dedicated Web sites such as Lostpedia, but also on the pages of the Washington Post (http://www.washingtonpost.com/wp-dyn/content/discussion/2007/07/08/DI2007070800419.html).
REFERENCES
Baetens, J. 2007. “From Screen to Text: Novelization, the Hidden Continent.” In The Cambridge Companion to Literature on Screen, ed. D. Cartmell and I. Whelehan, 226–38. Cambridge: Cambridge University Press.
Bolter, J. D. 2001. Writing Space: Computers, Hypertext and the Remediation of Print. 2nd ed. London: Lawrence Erlbaum Associates.
Bolter, J. D., and R. Grusin. 2000. Remediation. Cambridge: MIT Press.
Burbules, N. C. 1998. “Rhetorics of the Web: Hyperreading and Critical Literacy.” In Page to Screen: Taking Literacy into the Electronic Era, ed. I. Snyder, 102–22. London: Routledge.
Ellis, J. 1982. Visible Fictions. London: Routledge.
eponine119. 2008. “Sawyer’s Book Club.” http://www.lostfic.com/viewstory.php?sid=472.
Fiske, J. 1987. Television Culture. London: Routledge.
Gardner, J. 2008. “Project Narrative Second Debate: ‘The Sopranos’ vs. ‘Lost.’” http://projectnarrative.wordpress.com (accessed January 8, 2008).
Goffman, E. 1981. Forms of Talk. London: Blackwell.
Hark, I. R. 1999. “The Wrath of the Original Cast: Translating Embodied Television Characters to Other Media.” In Adaptations: From Text to Screen, Screen to Text, ed. D. Cartmell and I. Whelehan, 172–84. London: Routledge.
Hutcheon, L. 2006. A Theory of Adaptation. London: Routledge.
Journet, D. 2008. “Literate Acts in Convergence Culture; Lost as Transmedia Narrative.” Paper presented at Project Narrative, Ohio State University.
Kozloff, S. 1992. “Narrative Theory.” In Channels of Discourse, Reassembled, 2nd ed., ed. R. Allen, 52–76. London: Routledge.
Kozloff, S. 2000. Overhearing Film Dialogue. London: Routledge.
Kress, G. 1998. “Visual and Verbal Modes of Representation.” In Page to Screen: Taking Literacy into the Electronic Era, ed. I. Snyder, 53–79. London: Routledge.
Kress, G. 2003. Literacy in the New Media Age. London: Routledge.
Kress, G. 2005. “Gains and Losses: New Forms of Texts, Knowledge and Learning.” Computers and Composition 22 (1): 5–22.
Kress, G., and T. van Leeuwen. 2001. Multimodal Discourse: The Modes and Media of Contemporary Discourse. London: Arnold.
Lemke, J. 2008. “Transmedia Traversals: Marketing Meaning and Identity.” http://www-personal.umich.edu/~jaylemke/papers/transmedia_traversals.htm (accessed January 8, 2008).
LiveJournal. 2008. http://community.livejournal.com/fanwriting/13952.html. Discussion thread opened July 25, 2008.
Page, N. 1988. Speech in the English Novel. 2nd ed. Basingstoke: Macmillan.
Pen Liddin. 2008. “Lost’s Grand Finale.” http://www.fanfiction.net/s/2950532/1/Losts_Grand_Finale (accessed September 24, 2008).
Prior, P. 2005. “Moving Multimodality Beyond the Binaries: A Response to Gunther Kress’s ‘Gains and Losses’.” Computers and Composition 22 (1): 23–30.
Pugh, S. 2005. The Democratic Genre: Fanfiction in a Literary Context. Bridgend: Seren.
Thomas, B. 2007. “Canons and Fanons: Literary Fanfiction Online.” Dichtung-digital 37 (1). http://www.dichtung-digital.org/2007/thomas.htm.
Thomas, B. Forthcoming. “Update Soon! Harry Potter Fanfiction and Narrative as Process.” In New Narratives: Stories and Storytelling in the Digital Age, ed. R. Page and B. Thomas. Lincoln: University of Nebraska Press.
unhappyjater. 2008. Review of “Beautiful Stranger” by kt2785. http://www.fanfiction.net/r/4535570/ (accessed September 24, 2008).
11 Respiratory Narrative
Multimodality and Cybernetic Corporeality in “Physio-Cybertext”
Astrid Ensslin
INTRODUCTION
In this chapter, I take a particular interest in the corporeality of multimodal perception, which comes to the fore especially in new media narratives and interactive environments. In so doing, I draw on recent developments in the theory of cyberculture as well as significant perceptive implications of multimodal discourse analysis. In particular, I shall be looking at the discourses of transcendence and of the multiple situatedness of the perceiving human body as put forward by cybertheorists and contemporary phenomenologists. The central assumptions of these theories tie in with the emphasis on the physical and physiological made by leading theoreticians of multimodal discourse analysis, more specifically Gunther Kress and Theo van Leeuwen (2001) and Anthony Baldry and Paul J. Thibault (2006). Against this backdrop, I shall be revisiting the notion of intentionality in reader-users of digital literature, notably cybertext. Using a so far rare example of what I would call “physio-cybertext,” namely Kate Pullinger, Stefan Schemat, and babel’s The Breathing Wall (2004),1 I explore the implications of Espen Aarseth’s (1997) “text machine,” an alternative textual communication model that places the encoded text (notably the underlying software code) at the center of literary communication and that I consider particularly suitable for the analysis of narratives that set out to “de-intentionalize” the receptive process. To provide a solid theoretical background to this undertaking, I explore some of the most common philosophical, psychological, and, not least, critical concepts of intentionality with respect to whether or not, or rather to what degree, they fit in with my own notion as applied to the aforementioned cybertext.
THE PHYSICALITY AND INTENTIONALITY OF MULTIMODAL PERCEPTION
Our bodies are spatially, temporally, socially, culturally, and, not least, physiologically situated, i.e., contingent upon a variety of external and
internal influences that have a major impact on our mental disposition. In fact, the two cannot be separated. What this means to us as literary and linguistic analysts is that we need to shift our focus away from the primacy of readerly intentionality, which is one of the main underlying assumptions of reader response theory, cognitive stylistics, and hypertext and hypermedia theory. Instead, we need to turn our attention to the interplay of corporeal and psychological functions at work during the receptive process. The notion of intentionality has been explored by leading scholars in philosophy, psychology, and literary criticism since the early 1970s. I want to make it clear that I do not wish to confuse “intentionality” with the critical concept of “intention” (the writer’s or text’s assumed purpose, or the effect he, she, or it is aiming at; see Richards 1929; Hirsch 1967; de Beaugrande and Dressler 1981). Nor am I referring to intentionality as propagated by Speech Act Theory (e.g., Austin 1962; Searle 1969), which denotes the communicative intention of the speaker in performing an illocutionary act successfully, as well as the hearer’s recognition of this intention. Likewise, despite my fascination for Peter Brooks’s Reading for the Plot: Design and Intention in Narrative (1984), I do not share—in my present investigation—his specific idea of narrative intention, which, essentially, relates to the “motor forces” of plot or rather “plotting” and, hence, narrative meaning. In fact, all those aforementioned theories associate intention and intentionality with either the author-producer’s side or the text itself as an abstract static or dynamic concept. Conversely, my main interest lies in the receiver’s side and to what extent we can uphold purpose-driven, goal-directed intentionality in multimodal cybertextual experience.
With regard to the receptive focus, the term “intentionality” has two basic meanings, both of which are, in my view, relevant to the theory and analysis of multimodal new media narrative. The first, rather common meaning implies purpose-driven human action (including linguistic behavior and, more specifically, literary reception). The second, far more technical, notion is “reference or aboutness or some similar relation” (Harman 1998, 602; see also Dennett 1971, 1987; cf. Anscombe 1975; Diamond and Teichman 1979; Chisholm 1981) to a so-called “Intentional object” (what we see, hear, and, more generally, perceive; Searle 1983, capitalization in original). Both meanings are examined and conflated by John Searle (1983) in his phenomenological monograph Intentionality under the heading “Intention and Action” (a section that builds on the line of argument developed in the central chapter of the same book, “Intention and Perception”). For Searle, a (physical) action has two components: the physical action itself (e.g., movement), which is—given a fully functioning, healthy body—intentional in the sense of deliberate (though not always conscious; mark the lowercase i). The second component is Searle’s very notion of experience, which he equates with directness, immediacy, and involuntariness, and which is “intentional” in the sense of directed at an Intentional object. According to Searle, experience forms a major component of perception in
that its direction of fit is mind-to-world, while, at the same time, the Intentional object follows the direction of causation, which is world-to-mind. It is the first of the two notions, i.e., purpose-drivenness in relation to an experienced Intentional object, which has significantly influenced reader response criticism, cognitive stylistics, and, not least, hypertext theory, and it is one of my major aims here to revisit the basic tenets of those critical schools in the light of what I would call “cybertextual de-intentionalization.” The notion of “cybertext” goes back to Espen Aarseth’s (1997, 21) alternative model of textual communication. Aarseth sees texts as “mechanical device[s] for the production and consumption of verbal signs” (21). The term he fittingly uses is “machines,” which underscores both the materiality and semiotic nature of text. Aarseth places the “text machine” at the center of his alternative communicative triangle, surrounding it with the (human) “operator,” the “verbal sign” (which is physically read on the screen), and the material “medium.” These three elements engage in a complex interplay with the text (which, in new media narratives, includes the program code) and with each other. This results in a variety of different cybertextual subgenres, depending on which element is emphasized most strongly. What Aarseth aims to communicate is a distinct sense of performativity conveyed by cybertexts, which renders the operator a constitutive yet potentially disempowered element of (cyber)textual performance. Some new media writers have picked up on the idea of textual empowerment by creating what I refer to as “cybertexts proper,” i.e., works of digital literature that “assume power” over the reader by literally “writing themselves” rather than presenting themselves as an existing textual product.
I cannot here discuss the sheer variety of manifestations this writerly agenda can produce, but shall only focus on one cybertext that seems particularly suitable to the theme of this volume. In fact, I see The Breathing Wall as a revealing example of how “multimodal resource integration”2 (Baldry and Thibault 2006) interlinks with the idea of the cybernetic feedback loop, which merges the human body with the electronic text machine. The focus of literary criticism and stylistics has in the past thirty years clearly shifted to reader–text relationships, the procedurality of reading, and to cognitive aspects of reading as the “creative negotiation between writer, text, reader and context to construct a text world” (Wales 2001, 64). Nevertheless, we still lack a significant degree of auto-physiological knowledge, i.e., insight into and, perhaps more importantly, awareness of the physical processes controlling and interfering with the reading process. Arguably, the advent of multimodal literary analysis will have a significant impact on the development of such (auto-) physical awareness, not least because of the multisensory appeal of the artifacts under investigation. According to Searle (1983), “consciousness and Intentionality are as much part of human biology as digestion or the circulation of the blood” (ix). He advocates “appreciating [the] biological nature” of “mental phenomena” (ix), for which reason I find his theory of Intentionality
particularly compatible with individual aspects of cybertheory, which I now want to outline. A discourse that has recently emerged amongst new media narratologists and computer game theorists is the so-called “discourse of transcendence in writing about digital technology” (Dovey and Kennedy 2006, 106). It implies the transgression of the Cartesian body–mind dualism in that it assumes the concept of the double-situatedness of the body in new media environments. Arguing against the common notion held amongst cultural theorists that new media allow the “disembodiment” (Lister et al. 2003, 248; cf. Dery 1993) of the immersed reader, Marie-Laure Ryan asserts the “embodied nature of perception” (2001, 14; cf. Alison Gibbons’s chapter in this volume; Merleau-Ponty 1962). Martti Lahti (2003) adds, particularly in relation to video game reception, that “we remain flesh as we become machines” (169). Furthermore, to give an example of contemporary science-informed philosophy, Antonio Damasio (2004) describes such allegedly “mental” phenomena as emotions as “a complex pattern of chemical neural responses,” which, given certain stimuli, cause a “temporary change in the state of the body proper” (53). Put differently, we need to distance ourselves from the traditional image of “text” and “reader” as separate units. After all, this division seems to be based on the Cartesian model of perception, which sees consciousness and Intentional directedness in opposition to embodiment and materialization. The double-situatedness of the body implies, on the one hand, that user-readers are “embodied” as direct receivers, whose bodies interact with the hardware and software of a computer. On the other, user-readers are considered to be “re-embodied” through feedback that they experience in represented form, e.g., through visible or invisible avatars (third-person or first-person graphic or typographic representations on screen).
I would argue that we can transfer not only the first but, more interestingly, the second dimension of situatedness to the analysis of new media narratives. Clearly, new media narratives differ from games and virtual environments in that as readers we categorically do not need a physical representation of our own subjectivity in the text-world. On the other hand, every narrative assumes an implied “reader,” and it is one major achievement of cybertextual narrative that this implied reader or “breather,” as in our example, shares his or her phenomenological physicality with the narrator and/or hero(ine) of the story. I will shortly demonstrate what I mean by that. Based on the aforementioned notion of double-situatedness, I suggest a dual dialectic: firstly, the dialectic of corporeal double-situatedness in new media, which ultimately renders the dually embodied user part of a cybernetic feedback loop involving the text machine, the “implied cybertextual reader” and the actual reader. To this I would add, secondly, the dialectics of i/Intentionality, which refers, on the one hand, to purpose-driven readerly action, ultimately aimed at filling Iser’s “information gap” (e.g., 1978)
and reducing the degree of indeterminacy, and to the receiver’s experience of, in the sense of relation to, the Intentional object, on the other.
CYBERTEXTUAL DE-INTENTIONALIZATION? KATE PULLINGER, STEFAN SCHEMAT, AND BABEL’S THE BREATHING WALL
Cybertext authors have problematized and aesthetically implemented the aforementioned dual dialectic by investigating creatively to what extent intentionality may be constrained or reduced through textual autonomy and the entextualized emphasis of the perceiving body. When examining cybernarratives, we therefore need to focus on the sensory nature of textual experience. This involves, on the one hand—and this is common to all multimodal texts3—an examination of how semiotic resources are “clustered” to form an “integrated artefact” (Baldry and Thibault 2006). On the other, we need to look at how the (cyber-)text “‘codes’ the player into the [text] world” (Dovey and Kennedy 2006, 108) and how this multisensory feedback loop allows the thus “embodied”/“re-embodied,” implied reader to evolve and develop his or her mental image of the text-world and, not least, his or her own corporeal, spatiotemporal subjectedness. Kate Pullinger, Stefan Schemat, and babel’s multimodal, hyper- and cybertextual gothic murder mystery The Breathing Wall was first presented at trAce’s 2004 Incubation Conference in Nottingham.4 It is a particularly interesting literary artifact in that it purports to undermine the “normal” cybernetic accommodation process usually experienced in game-play, where the technological idiosyncrasies of a game, in the sense of both hard- and software, are readily “adapted to and appropriated into our available repertoire of bodily behaviours and aptitudes” (Dovey and Kennedy 2006, 111).5 These appropriations, performed by the human body through repetitive use of the same “moves,” are ultimately based on the workings of intentionality, in the sense of both object-directedness, or “intention in action” and “prior intention” (Searle 1983, 84).
Along with a so far rather limited number of similar “physio-cybertexts,”6 The Breathing Wall may well be considered to have the potential to overthrow or at least substantially supplement general assumptions held by current literary criticism and discourse analysis. After all, and fully legitimately, those reader-centered theories accentuate psychological processes operating in interaction with textual structure and perception. Having said that, they tend to neglect the corporeal conditionality, the physical situatedness emphasized by recent cyber-criticism. Together with Berlin-based new media artist and researcher Stefan Schemat and hypermedia author-programmer babel (a.k.a. Chris Joseph, currently Digital Writer in Residence at De Montfort University), Canadian-British novelist and Reader in Creative Writing and New Media (De Montfort) Kate Pullinger has created a
cybertextual narrative that confronts readers with a virtually unbearable dilemma: the entextualized threat of losing control of intention-driven textual decoding and inferencing for the sake of a cybernetically controlled process of information disclosure. More specifically, The Breathing Wall uses the reader’s respiratory system as the driving force for revealing essential referential meaning, or “clues” (the term is appropriate because we are in fact dealing with a murder mystery). The text comes in the form of a CD and requires the reader to use a headset with an attached microphone. The microphone, however, does not, as one might assume, fulfill the purpose of recording the reader’s voice. It much rather captures the breathing rate of the receiver as it is placed right underneath their nostrils. Depending on the rate and depth of inhaling and exhaling, the text (in the combined sense of program code and user interface) will release either more or less information essential for solving the riddle imposed by the story. In other words, the quality of reception is no longer controlled by the reader’s skillful—in the sense of cognitively, analytically, and aesthetically competent—reasoning, goal-directedness, or simply intentionality. It much rather depends on his or her physical condition, or rather situatedness, at the time of “reading,” which involves aspects of location (spatiality), time of day or night (temporality), cultural and social embedding, and, not least, physiology (e.g., tiredness, well-being, metabolic functions). In other words, the perceiving body may, at any given time, range from a calm and relaxed to an agitated or even hyperventilating state.7 The narrative tells the story of Michael, who has been falsely convicted of murdering his ex-girlfriend Lana. When her body is spotted in the park, a note scribbled by him the day before is found in her pocket. It reads as follows and is used as evidence against him.
I don’t know what I’m going to do without you. Whatever happens—it’s your fault. It’s your responsibility. You shouldn’t have broken up with me.
The gothic aspect of the story derives from the nocturnal appearances of Lana’s voice in Michael’s dreams, as she speaks to him through the prison “wall.” In those “dreams,” which the reader co-experiences through watching filmic sequences of mostly inanimate “backgrounds,” Lana gives Michael—and the reader—clues as to what happened to her on the night of her murder. And although, even without “breathing” their way through those “nightdreams,” readers will know by the end of the hypertextual “daydreams” who the killer was, they will not have been initiated in the exact phenomenological processes experienced by the victim. The text itself integrates audio recordings of spoken text and various other types of sound (noise and music), video, animation, graphics,
and hypertextual interaction. With regard to “hypertext,” coauthor/programmer babel (2004) admits that “for this project it meant, very loosely, moving from one chapter to another, or from one passage of text to the next, through (clickable) links or mouse movements.” In all likelihood, the author team soon realized that the crime mystery is, in its generic macrostructure, more contingent upon linear perception than many other literary genres and hence one of the most unsuitable genres for hypertextual representation. Similarly, babel (2004) explains that “a cinematic (sit and watch) approach also contrasted well with Stefan [Schemat]’s software, and the unique physical interaction it requires. So the term hypertext was superseded by ‘daydreams’—as opposed to Stefan’s ‘nightdreams.’” The entire novel is characterized by the visual juxtaposition of animated, open (in the dreams) and nonanimated, closed spaces (in the daydreams), which follow one another as in a slide show (in the latter case) or, respectively, an impressionistic motion picture (in the former). Notably, with very few exceptions, we are dealing with a “foregrounding of backgrounds,” which entails not only all four natural elements but various other “symbols” of life such as the moving fur of a breathing, i.e., “living” animal, and the image of a clock symbolizing the temporality of life. Throughout the text, readers never get to see any of the main characters. Instead, they can hear their voices through loudspeakers or earphones and read handwritten and typographic representations of their speech on screen. The closed-room images depict the dark, stifling interior of a prison, thus multiplying the effect of Michael’s verbal expressions of frustration, emotional suffocation, and despair. The open spaces conveyed by the nightdreams, by contrast, are “open” in the sense of “ambiguous” on a denotational, connotational, and figurative level as well.
They can only be read suggestively or associatively, as readers are, for the major part, not provided with textual material from other semiotic resources that would help them decode the meaning and/or significance of those open spaces in relation to the whole plot. Only in the final dream are the filmic sequences disambiguated denotationally, as the reader is—through Michael’s dreams—taken back to the scene of the killing: the park. I will, in what follows, focus for my analysis on one particular section of the text, the phenomenological turning point, as it were, in which the reader realizes how Lana was killed and how she perceived the process. The visual key, the “Dingsymbol”—to borrow a term from novella theory—is the very close-up shot of a human hand presented as if it were held over the first person’s eyes to protect them from the sun. The animated image first occurs proleptically in Dream 2, at a moment when Lana gives away one of her first cryptic clues, that Michael’s sister Florence will come to see him. The same hand reappears in Dream 4 and undergoes, in indexical supplementation of the verbalized narrative, a semantic or, more precisely, connotational shift. In the first, unmarked instance, the hand symbolizes, through the use of warm, fleshly colors and the light streaming through
between the fingers, a positive, hope-inspiring atmosphere. In the second instance, contrarily, the spoken text recontextualizes the hand to make it become part of Lana’s association of the game “Who am I,” of which she is, ironically, “the master.” She tells Michael she can easily identify who hides behind every hand in the game, whether or not the person is a complete stranger, and whether he or she is calm—with dry hands—or nervous— with wet hands. This last clue inadvertently initiates another semantic shift towards the negative connotational potential of the image and thus makes the attentive reader associate the game with her murderer. And indeed, some time later, in Dream 4, Lana verifies this suspicion by describing in detail the process of her suffocation through the hands of somebody whom she does not identify as either familiar or alien. To sum up, the symbol of a protective and simultaneously threatening hand, which serves as a pars pro toto of both ally and enemy, is used to show, on a literal level, impeded visual sensation and thus, figuratively, withheld insight and knowledge. It therefore serves as an ex negativo reification of the epiphanic moment, which denies the reader access to the essential piece of information while opening up the possibility of doing so. Finally, the psychoanalytical implications of the inter-semiotically revealed infanticide (Lana was in fact killed by her own emotionally deranged father) add yet another level of semiotic encoding to Baldry and Thibault’s (2006) “meaning-compression principle,” i.e., the reduction of complex, higher-scalar phenomena to a minimal amount of semiotic encoding.
CONCLUSION
Arguably, The Breathing Wall may be regarded as one of the very few hypertexts or rather cybertexts defying the allegation that the murder mystery has never really entered the realm of “anti(mono)linear” storytelling (Ensslin 2007). The text clearly deviates from—and thereby reconfirms, on a metageneric level—the rules of the conventional thriller by leaving the solution of the mystery not merely to the reader’s intention-driven, cognitive engagement with the plot, but chiefly to his or her very physical condition both at the time of reading and, more generally, in comparison with the implied, “ideal” reader, or “breather.” Having said that, very few new media narratives now follow the radical anti-linear principles of early hyperfiction as represented most prototypically by Michael Joyce’s, Stuart Moulthrop’s, and Shelley Jackson’s work. In fact, to achieve a maximum revelatory and sensory effect in the reader, The Breathing Wall closely follows Marie-Laure Ryan’s dictum from 2000.
The next generation of hypertexts will have to be visually pleasurable, and hypertext will be a work of design and orchestration as much as
a work of writing. [ . . . ] To remain readable, these conceptual hypertexts will have to be shorter than the hypertext novels of the first generation. And it will be necessary to give a strong allegorical meaning to the action of moving through a textual network—not an invariant generic message inherent to the medium, but a meaning unique to each particular text, and ideally recreated with every use of the device.
Clearly, whilst giving readers the opportunity to traverse the text in a variety of non-mono-linear directions and sequences, the generic and macrostructural implications suggested by The Breathing Wall’s hyper- and cybertextual arrangement reinforce the opposite: the macrotextual linearity characteristic of the conventional thriller. Furthermore, the text aims at a maximum of not only visual but aural and, more importantly with regard to its thematic and medial objectives, haptic pleasurability, although we may have to replace the latter with “haptic challenge,” as the work convincingly problematizes the controversy surrounding the Cartesian mind–body dualism. On the one hand, the text confronts readers with the unconditional nature of their own physical situatedness and thus perceptive and cognitive dependence on the full functionality of their own body (or, more precisely, their full physical functionality as dictated by the text machine). One could argue, in fact, that the long-hailed “empowered” “wreader” (Landow 1997) is transformed into a mere breathing apparatus, whose reading experience depends to a large degree on the spatiotemporal situatedness of their very own metabolism. On the other hand, the very same insight seems to prove—ex negativo— the plausibility of Cartesian dualism, as mind and body, during the reception process, do not appear to cooperate in such a way as to facilitate text–reader interaction.
By the same token, the possibility of a successful reading-breathing process intrinsically underscores the hegemony of the mind, i.e., intentionality, over the body. Overall, this apparent conflict raises the question whether we, in our long-lasting endeavor to identify the most suitable ways of describing and analyzing textual and interpretive processes, have been partially deceived. What my reading of The Breathing Wall suggests is that the multimodal turn facing us as literary analysts affords a turn towards a broader anthropological analysis, which includes the cognitive and the corporeal, particularly when it comes to decoding the largely sensory nature of intersemioticity. Hence, with the advent of experimental forms such as “physio-cybertext,” narrative activity has taken possession of the multimodal and technological affordances of the digital medium, thus pushing first-generation hyperfiction (Hayles 2002) back into its own capricious little “niche.” It is probably too soon to make predictions, but it is in particular the convergence between semiotics, cybertheory, and narratology, with an emphasis on the narrated, perceiving body, to which future stylistics will have to pay due attention.
NOTES
1. I am most grateful to Kate Pullinger, Chris Joseph, and Stefan Schemat for their comments on an earlier version of this chapter.
2. Baldry and Thibault (2006) use the term “resource integration principle” to describe the inherent semiotic in the sense of the “meaning-making” nature of functionalized semiotic resource systems, i.e., systems “of semiotic forms that we can use for the purposes of making texts” (18).
3. As Baldry and Thibault (2006) rightly observe, the expression “multimodal text” is in fact pleonastic, as “in practice, texts of all kinds are always multimodal, making use of, and combining, the resources of diverse semiotic systems in ways that show both generic (i.e. standardised) and text-specific (i.e. individual, even innovative) aspects” (19; emphasis in original).
4. The premiere of the earlier form of it dates as far back as 1994 and “took place in Hamburg at the World Computer Congress [ . . . ]. That work was entitled Respiration of the introjects, and you could only hear the story when you had fallen into a hypnotic sleep (as measured by the regularity of your breathing)” (Stefan Schemat in a personal e-mail exchange with Kate Pullinger, May 29, 2005 [sic]).
5. A physio-cybertextual “cognate” from the world of narrative adventure games is Atari’s Indigo Prophecy (2005), in which the avatar’s breathing rate is matched to that of the player (see Johnson 2007). Unfortunately, a comparative analysis of both cybertexts would go beyond the scope of this chapter.
6. At the time of writing I am aware of only one further specimen of the same technological (though not literary) genre: Lewis LaCook’s Dirty Milk, a “recombinant poem sequence for the Internet, using your computer’s microphone as a catalyst for transformation” (www.lewislacook.org/dirtymilk/, accessed June 11, 2007). Blowing into the microphone triggers a mechanism that will rearrange the poems across the screen. This evokes a strong participatory, physio-cybertextual aesthetic yet, unlike The Breathing Wall, does not embed the reader’s physical situatedness in the flow of a narrative.
7. According to Stefan Schemat, “the more relaxed the ‘breather’ becomes, the deeper access to the story they gain. He says his ultimate goal is to send the ‘breather’ to sleep and that, once you are asleep, all will be revealed in your dreams” (Kate Pullinger in a personal e-mail exchange, June 17, 2007).
REFERENCES
Aarseth, E. J. 1997. Cybertext: Perspectives on Ergodic Literature. Baltimore: Johns Hopkins University Press.
Anscombe, G. E. M. 1975. “The First Person.” In Mind and Language, ed. S. Guttenplan, 45–66. Oxford: Clarendon Press.
Austin, J. L. 1962. How to Do Things with Words. Oxford: Oxford University Press.
babel. 2004. The Breathing Wall: An Online Journal. http://tracearchive.ntu.ac.uk/studio/pullinger/bwone.html (accessed April 25, 2007).
Baldry, A., and P. J. Thibault. 2006. Multimodal Transcription and Text Analysis: A Multimedia Toolkit and Coursebook with Associated On-line Course. London: Equinox.
Brooks, P. 1984. Reading for the Plot: Design and Intention in Narrative. Oxford: Clarendon Press.
Chisholm, R. 1981. The First Person: An Essay on Reference and Intentionality. Brighton: The Harvester Press.
Damasio, A. 2004. Looking for Spinoza. London: Vintage.
de Beaugrande, R., and W. Dressler. 1981. Introduction to Text Linguistics. London: Longman.
Dery, M., ed. 1993. Flame Wars. Special Edition of the South Atlantic Quarterly, 92 (4). Durham, NC: Duke University Press.
Dennett, D. 1971. “Intentional Systems.” Journal of Philosophy 68:87–106.
Dennett, D. 1987. The Intentional Stance. Cambridge, MA: MIT Press.
Diamond, C., and J. Teichman, eds. 1979. Intention and Intentionality: Essays in Honour of G. E. M. Anscombe. Brighton: The Harvester Press.
Dovey, J., and H. Kennedy. 2006. Game Cultures: Computer Games as New Media. Maidenhead: Open University Press.
Ensslin, A. 2007. Canonizing Hypertext: Explorations and Constructions. London: Continuum.
Harman, G. 1998. “Intentionality.” In A Companion to Cognitive Science, ed. W. Bechtel and G. Graham, 602–10. Malden, MA: Blackwell.
Hayles, N. K. 2002. Writing Machines. Cambridge, MA: MIT Press.
Hirsch, E. D. 1967. Validity in Interpretation. New Haven: Yale University Press.
Iser, W. 1978. The Act of Reading. Baltimore: Johns Hopkins University Press.
Johnson, M. S. S. 2007. “Combat to Conversation: Towards a Theoretical Foundation for the Study of Games.” Dichtung-digital 37 (1). http://www.dichtungdigital.org/ (accessed January 27, 2009).
Kress, G., and T. van Leeuwen. 2001. Multimodal Discourse: The Modes and Media of Contemporary Communication. London: Arnold.
Lahti, M. 2003. “As We Become Machines: Corporealized Pleasures in Video Games.” In The Video Game Theory Reader, ed. M. J. P. Wolf and B. Perron, 157–70. London: Routledge.
Landow, G. P. 1997. Hypertext 2.0: The Convergence of Contemporary Critical Theory and Technology. Baltimore: Johns Hopkins University Press.
Lister, M., J. Dovey, S. Giddings, I. Grant, and K. Kelly. 2003. New Media: A Critical Introduction. London and New York: Routledge.
Merleau-Ponty, M. 1962. Phenomenology of Perception. London: Routledge and Kegan Paul.
Pullinger, K., S. Schemat, and babel. 2004. The Breathing Wall. CD-ROM. London: The Sayle Literary Agency.
Richards, I. A. 1929. Practical Criticism. London: Kegan Paul.
Ryan, M.-L. 2000. “Narrative as Puzzle!?—An Interview with Marie-Laure Ryan.” Dichtung-digital. www.dichtung-digital.de/Interviews/Ryan-29-Maerz-00/index.htm (accessed October 27, 2005).
Ryan, M.-L. 2001. Narrative as Virtual Reality: Immersion and Interactivity in Literature and Electronic Media. Maryland: Johns Hopkins University Press.
Searle, J. 1969. Speech Acts: An Essay in the Philosophy of Language. Cambridge: Cambridge University Press.
Searle, J. 1983. Intentionality: An Essay in the Philosophy of Mind. Cambridge: Cambridge University Press.
Wales, K. 2001. A Dictionary of Stylistics. 2nd ed. Harlow: Pearson Education.
12 Cruising Along Time in Ankerson and Sapnar
Jessica Laccetti
Memory has become rhizomatic and roots itself no longer in the word but in the disorientation of new visual and temporal perspectives for the Information Age. Carolyn Guertin
Even if the linear temporality of Newtonian mechanics is disputed, more contemporary representations of time by topological closed curves still rely on geometrical and fundamentally spatial models for their coherence. Elizabeth Grosz
New time-based interactive media have in turn introduced supplementary questions about perception, visuality, and performance. Rita Raley
INTRODUCTION
Cruising is an example of a born digital fiction, a fiction created for online reading. Though the story itself, about three teenage girls cruising around their town in search of fun and friends, may seem simple, the navigation required to read it is not. By requiring readers to learn how to interact with the text and the other sensory modalities, Ankerson and Sapnar link Cruising with a development of time and subjectivity. By carving out a distinct theoretical space sympathetic to the conditions and affordances of the online environment, this chapter engages with interpretation and literary critique, though it is not fully aligned with it, at least not with traditional evocations of narrative theory. As such, this chapter, like the titular Web fiction it analyzes, reworks and reinterprets aspects of feminist theory and close reading in order to examine the differences and the possibilities afforded by this specific online multimodal environment. Multiplicity appears as a technique
both through the variety of modes Cruising employs and through the recognition of constantly becoming, evolving, and en-procès subjectivities. The intersections of feminism and narrative indicate intricate and subtle forms of reading whose process entwines the subjectivities of reader and author (Cavarero 2000). In terms of the online environment this interaction provides a highly significant paradigm for illustrating how the processes of narrative affect the interpretation of subjectivity (Laccetti 2008, 2009a, 2009b). Additionally, works like Cruising constantly call attention to the multiple processes and unfoldings of the narrative, or as Drucker might put it, Cruising “stresses narrativizing as an action taken by a reader” (Drucker 2008, 124), provoking questions about how temporality might be reconsidered in light of multimodal born digital fictions.
NARRATIVE TEMPORALITIES AND BORN DIGITAL FICTIONS
As Stacy Burton notes, the notion of time and its role in narrative “remains vague; generally relying on an unwritten premise that time is a unitary, explicable phenomenon” (Burton 1996, 42).1 In formalist criticism, according to Burton, time is something that can be dissected and tracked as Genette does in order to impose a linear “order” on works where chronology is “neither clear nor coherent” (Genette 1980, 90). Burton critiques the formalist work for ignoring, among other things, the “relationship between temporality, narrativity and experience” (Burton 1996, 42). In fact, the kind of structuralist diagramming favored by narratologists such as Genette has a “tendency to freeze the fluidity of meaning” (Ryan 2003, 334). Cruising is one example of many born digital works that challenge such frameworks and make a break with “mechanical” or “chronometric time,” as Richardson notes is largely the case with fiction of “the first third of the twentieth century” (Richardson 2006, 603). Though Cruising shares a similar “suspicion of linearity and teleology” with Richardson’s corpus of Proust, Conrad, and Faulkner (ibid.), albeit in a very different way, it employs those suspicions to call directly into question the “signifying practises of a male-dominated society” (Sapnar 2002a, 80). It appears that, at least according to digital media theorists Guertin and Sapnar, there is a common interest in rethinking the notion of time alongside a critique or problematization of linear or singular structures (Guertin 1998). Although each employs a different vocabulary, both Guertin and Sapnar recognize the need to articulate conceptions of fluidity and simultaneity. However, there is a concurrent awareness that their attempts might well enact exactly the kind of stasis or linearity they aim to challenge.
An alternative to such a dichotomy is presented in Cruising where Ankerson and Sapnar combine temporality with reader interaction so that the narrative develops through a convergence of times. There is further dialogism at work here. The representation of subjectivity and narrative through the
multiple modes of image, sound, text, and haptics means that there are always interactions occurring; narrative in Cruising is always an enmeshed network of relations both temporal and multimodal.
CRUISING INTO VIEW
In “Narrative Speed and Contemporary Fiction” Kathryn Hume argues that “excessive rapidity” where “events . . . hurtl[e] past too quickly . . . [and] scenes and focal figures change rapidly” denies “real understanding” (2005, 105). Rather differently, Viktor Shklovsky, according to Hume, proposed “retardation” as a way of extending excessively short “kernels” so that they might develop into stories (ibid.). Cruising can be inserted into these kinds of discussions of narrative temporality in a very specific way. Rather than document narrative speed or delay solely within the narrative content, Ankerson and Sapnar invoke it via the Interactive Time of the reading. As Moulthrop might have it, the narrative is “playable” (2008). Born digital fictions such as Cruising exploit the sense of what Hume describes as when “the narrative [is] accelerated beyond some safe comprehension-limit” (2005, 106). It is precisely the notion of narrative speed that “highlight[s] the materiality of text, film, and interface” (Ankerson and Sapnar 2001a) in Cruising. Ankerson and Sapnar’s larger case is that, since the more refined development of the internet, it is possible to narrow one’s interests and focus on concepts like “reactivity” (Sapnar 2003). It is exactly along these lines that, for Sapnar, reactivity appears to constitute an ontology of digital texts. For reactivity is a “condition of time” and as such the narrative comes into being only “in tandem with the viewer’s own movement or action” (ibid.). Instead of conceptualizing narrative as sustaining or giving access to events, episodes, or situations, Sapnar and Ankerson shift the agency to the readers’ “instantaneity of response,” which grants access to the events and consequently to the narrative.
The result, in Cruising, is “un montaggio ibrido” (Di Rosario and Gilebbi 2006), composed of disjunctive rhythms between different modes (at least until a haptic sensibility develops), variable temporality, and the constant creation of narrative as an unfolding that parallels the “shift from space to time . . . an aesthetic shift from mapping, and radial structures, to happening, morphosis, and temporal experiences” (Marsh 2001). When the reader first approaches Cruising, at no point are they provided with any background information, description, or introduction about the characters or plot. In fact, without the reader’s participation, the narrative streams across the computer screen in an incoherent flow of sound, image, video, and text. Thus a brief summary is in order. The unnamed narrator is relating a poignant memory of an event that has occurred numerous times. The narrator is with two friends, Mary Jo and Joanie, and they enact a typical North American teenage pastime; they “cruise” up and down local
streets in search of adventure. As the girls drive up and down the same streets, the narrator notes that this cruising is an effort to find love with the male drivers also driving up and down the same streets. For the sake of succinctness the summary will end here, even though the narrative itself, as long as it is on the reader’s screen, will continue ad infinitum. The unusual tripartite form of Cruising (cinematic, textual, and audiovisual), what Daniel Punday refers to as a “textual ontology,” represents not only the connections between the different modes but the linked relationship between form and content (2003, 111). Punday sees the tripartite appearance of Cruising as indicative of “theme, words and interface”; however, he suggests there is a hierarchy of modes here since there are “various levels” between which the reader must “shuffle” (112). Extending the relationship of the various modes to the theme of temporality reinforces notions that time is not a fixed and organized structure offering only reified significations. Rather temporality here, like the continuous flow of images, sound, and text, is distorted, leaving it up to the reader to negotiate all instances of multiplicity. The speed of the narrative, as Hume (2005) would see it, is easily established: the images, text, and sound flood past the reader. However, while the voice of the narrator is clearly distinguishable, the concurrent flow of black-and-white images and text obscures the textual narrative, leaving the reader feeling queasy or carsick from driving/reading too quickly. In this fiction, the reader acquires too much narrative too quickly. In order to proceed, the reader must learn how to read haptically. According to Frank Wilson:
the new way of mapping the world was an extension of ancient neural representations that satisfy the brain’s need for gravitational and inertial control of locomotion . . . a new physics would eventually have to come into this brain, a new way of registering and representing the behavior of objects moving and changing under the control of the hand. It is precisely such a representational system—a syntax of cause and effect, of stories and of experiments, each having a beginning, a middle, and an end—that one finds at the deepest levels of the organization of human language. (Frost 2007)
Cruising comes into existence in a way that is similar to Wilson’s new mapping of the world. A new ergonomics of reading is essential as a way of “registering” this particular narrative.2 If the reader cannot come to grips, literally, with the temporality of the text, the narrative will remain unrepresentable. The multimodal, tripartite structure of Cruising and the concept of cyclical travel challenge patriarchal assumptions of reading temporality as following a “Hansel–&–Gretel trail of breadcrumbs.”3 Guertin’s appeal to view temporality as a re-envisioning that opens up “ever-shifting perspectives”
(Guertin 2003) coincides with Luesebrink’s insistence that “the reader’s reconstruction of time through the assembling and gathering of text and media is what makes each reading unique” (1998, 111). Both statements raise similar concerns regarding the adequacy of purely formalist methodology. By employing a more process-oriented approach there is an opportunity to view temporality as a dynamic synthesis of other semiotic modes. By opting for a fiction that allows the reader to assemble time and thereby the narrative, Ankerson and Sapnar ensure that each reading experience is unique, but this does raise some difficulties. In Cruising, Ankerson and Sapnar employ a triadic configuration that includes the simultaneous unfolding of a sequence of cinematic frames, text, and audio. For readers still developing their haptic modalities, like Larsen’s neophyte readers, the audio track provides the first easily intelligible rendering of the narrative. The sonic version details the basic storyline concerning the three girls cruising up and down Main Street in search of love. The words flow from the unnamed homodiegetic narrator, relating a story that occurred in the past but remains perpetually present. The racing filmic frames that run through the center of the reader’s screen operate dialogically, at times corresponding to the audio, at other times countering it. Above the speeding reel of film streams text, much like operatic surtitles, that at once mimics the movement of the film frames as well as visualizing the spoken words. The arrangement of the multiple modes on the computer screen (all fleeting without the reader’s interception) reflects the characters’ journey and the reader’s own tenuous grasp on the evanescent narrative. Consequently the Web fiction interweaves the multimodal representation of temporality with the reading experience.
Hayles argues that “navigation becomes a signifying strategy for electronic hypertexts” (Hayles 2004, 83), going on to claim that hypertext “employ[s] both horizontal and vertical registers” and so provides two different sets of information: “narrative” appears on one plane while “linguistic, historical, and geographic” cues appear on another. Importantly, the various registers cannot be polarized. Any interpretation of either of these registers requires attention to multimodality, for it appears both as narrative and as semiotic cue. While the audio spills out, the visual mode of cinematic frames sweeps to the right (at least in this particular reading). This movement means the textual narrative must be read backwards, from right to left, from end to beginning. The images, when streaming from right to left, seem to follow a discernible chronological sequence, flowing from beginning to end and echoing the spoken narrative. The images, however, only match the audible narrative in the sense that they narrate the same events. The tempo at which those events are related differs drastically; the filmic narrative unfolds at a breakneck speed while the woman’s voice keeps pace with contemplative reminiscing. In Cruising, it is up to the reader to regain both temporality and directionality. Once the reader learns how to control the mouse, the narrative
can be slowed down and, in fact, frozen, able to be considered at length. This move establishes what Ryan sees as “the triple unity of interface, theme, and image” (Ryan 2004). However, freezing the narrative in this way profoundly dislocates it from/in time. Time is no “neutral medium,” Grosz explains, it is a “dynamic force” in the “framing” of subjectivity (Grosz 1999a, 3). Indeed, the demand for reader interaction in this way means that Interface Time is not extratemporal but resolutely linked to the inner flow of the narrator’s thoughts and the plot in general. In this way, the reader’s intervention creates a double bind. At once the reader is able to “read” the narrative while at the same time the reader halts the progression of the story. Thus, freeze-framing the narrative highlights the implicit underpinning of general conceptions of representation where “visualisation is a way of fixing (in) time” (Braidotti 1994, 49). Here, however, it is not so much the representation but the narrative itself that becomes fixed, stationary. In a way, then, the reader (re)gains control over the narrative, or at least the telling of it, for as Mulvey sees it “freez[ing] the narrative” is a way for women, as readers and spectators, to “disrupt linearity and cohesion” (1975, 11). The connections between the three instances of multimodal representation do not always tell the same story; they merge and divide in knotted ways. For Guertin, following Harpold, knots “reach” across time and space (Guertin 2003). Thinking of modes as knots helps explain their temporal positioning, always already in relationships with one another. At times the modes appear to offer parallel information, as the spoken words mirror the textual words. At other times the linguistic account swirls by too quickly to maintain time with the spoken narrative, creating a fissure between events and times. For Kress and van Leeuwen this denotes two kinds of visual
Figure 12.1 A zoomed-out scene from Cruising illustrating the filmic frames and the written narrative.
Figure 12.2 A zoomed-in scene from Cruising illustrating the filmic frames and the written narrative.
literacy, “one in which visual communication has been made subservient to language” and another where “language exists side by side with, and independent of, forms of visual representation” (Kress and van Leeuwen 1996, 21). Consequently, to bridge the multiple times, to arrange or rearrange the various modes, and to map the connections among them, the reader must “pull back the perspective” (Punday 2003, 112). “Driving” the narrative across the screen means placating the “coming-of-age . . . hormones” (Ankerson and Sapnar 2001b), enabling the ongoing action, the filmic sequences, and the text to slide slowly from right to left so that the words connect in a legible order. This malleable and amorphous form (amorphous at least until the reader develops her haptic sensibility) highlights Braidotti’s view of “transformation” that here constitutes the act of reading: the “reinscription of the text into a set of discontinuous variations . . . marks the tempo” of a subject’s becoming (Braidotti 2002, 96). In other words, the reader’s physical grappling with the multiplicity of Cruising generates new possibilities of subjectivity; as Butler explains, “multiplicity is not the death of agency, but its very condition” (2004, 194). The multiplicity of modes in Cruising is the condition of the narrativizing, the event of the narrative, all occurring within the orbit of the present. That the narrative is bound up in the present tense can be seen from a feminist stance; Grosz explains that by working within the “parameters” of the present one can explicitly question becoming as a temporal condition, affected by the time in which it is apparent, rather than by a time that is other to it (2005, 73). Thus, evoking temporality as an ever-present cycling or unfolding now, and allowing the reader to intervene in the “speed” of the narrative (as Hume would put it), allows one to explicitly
question the condition of constraints, both in terms of subjectivity and of the reading process itself. The audio version of the narrative is a homodiegetic account of an unnamed woman relating a repeated teenage pastime, the cruising of the streets of Wisconsin. The narrator is the “skinny girl” in the back and is on the lookout for love, like her two female friends and the other cruisers. While the words themselves suggest a quotidian occurrence, the spoken voice, retroactively sarcastic, realizes that “maybe, we could find [love] driving past us, maybe in a pick-up truck . . .” suggesting that the kind of love for which the three friends are searching will not be found by cruising. The streaming narrative text running along the top of the cinematic images acts, in a sense, like a counterpoint, establishing two durations (the tempo of the spoken word and that of the written), and in doing so brings another kind of temporality to the event both for the narrative and for the reader. The white rounded font seems to eschew the rhythmic complications audible in the spoken version, appearing either distinct when the reader “drives” ably or as a flood of hieroglyphs. In a way like the uneasy alignment that Grosz sees between Deleuzian and Irigarayan modes of thought, here too are modes that “rub up against each other unevenly” though provocatively (2005, 163). In Cruising, the two modes, the aural and the textual, are in motion with and against each other, telling the same story but narrating it differently. This pressure on the two competing versions (at least from a beginner reader’s point of view) can illustrate the pressure the narrator feels to behave a certain way; the implosive momentum perhaps represents the skinny girl’s struggle with peer pressure. That the girls may seem unable to fully “disengage themselves from their social [and] sexual identity” (Braidotti 2002, 239, 240)4 might at first have negative connotations implying constraint.
However, in a positive light, this move establishes a continuity that connects the past with the present; here there is no traumatic break with earlier events. The resulting blend of times, what Ettinger would refer to as “plaiting” (Ettinger 2006, 78), assures each subjectivity of their connected mobility, their nomadism (Braidotti 2002, 240). It is not so much the destination itself that secures transformation, but movement; they “become, transforming themselves and us as they go” (ibid.). Both the written text and spoken word foster another kind of connection between the narrative and the reading experience especially apparent when one recognizes the situated knowledge and partial perspective of the narrator. While there is no lengthy background information even
Figure 12.3 White rounded font from the written narrative.
though aspects are narrated as homodiegetic analepsis, the triadic montage of modes is successfully symptomatic of the protagonist’s enduring attempt to challenge the notion of a beginning and ending. Most interesting are the verbs: “night rolling,” “sniffing the street,” “honking at us,” “laughing at them,” “tracing the edge,” and “eyeing life.” Each construction is missing, or more likely, eliding, a finite auxiliary so it remains unclear whether this progressive tense is in the past or present. Significantly, this linguistic play with temporality represents structurally what the becoming experience is to the narrator. Here Ankerson and Sapnar offer a vision of “motion [as] another layer of language” (Sapnar 2002c) and in the process they help disrupt typical ways of reading while connecting the multiple modes with what Kristeva sees as features of women’s time: repetition and eternity (1981, 16). If the sound track and the textual narrative frequently seem at odds with one another, how then might the filmic sequences connect with both sound and text? Here it helps to keep in mind Mulvey’s argument, which posits cinema as irreducibly shaped by sexual difference and shows that classical Hollywood cinema is built upon looks or gazes that reciprocally shape the narrative. Importantly, for Mulvey, it is the men who look and the woman who is displayed, connoting the “looked-at-ness” that Ankerson and Sapnar tackle. With over fifty-four images used, there is only one scene that portrays the characters (see Figure 12.4).5 This kind of nonrepresentation seems all the more marked in light of the strong multimodal presence of the textual, sonic, and haptic unfoldings. The deconstructive gaze derives, not from a genuine opposition (Butler 1997, 140), as in Butler’s sense, but from the capacity for the narrator, and to a lesser extent Mary Jo and Joanie, to perpetually “reinvoke” gazing.
In fact, for the protagonists, their entire pastime of cruising revolves around looking and being looked at. Interestingly, the act of making others
Figure 12.4 The only image in Cruising in which the protagonists appear.
look inculcates a certain kind of power, but also then the process of gazing becomes reciprocal, shifting the truck drivers from subject to object of the girls’ gazes. In a subversive move, the narrator refrains from including any words that denote the act of looking. Consequently, while the young women are on the prowl they only “hope they can find [love] passing” them, rather than see love. Additionally, the men in pickup trucks “drive past” the girls and laugh and wave; they, however, do not look. This subversive maneuver engenders another instance to problematize linear aspects of temporality. Mulvey suggests that stasis and movement are opposing forces in relation to the gaze. Stasis, necessarily, fixes the woman as the object of desire; it “freeze[s] the flow of action in moments of erotic contemplation” (1975, 19). Movement can provide a way out of this predicament; it reinserts subjectivity into time. Although Mulvey notes this in relation to spectacle (as stasis) and narrative (as movement), here it seems feasible to view this temporal distinction solely within the visual mode. Although the gaze imposes itself in a particularly subtle way, the logic of the masculine gaze and the control Mulvey purports it to exert are explicitly parodied. According to Hutcheon, “parodic art both is a deviation from the norm and includes that norm within itself as background material” (1980, 50). Instead of presenting the three young women as objects of desire, Ankerson and Sapnar rework the conventions of the cinematic gaze. The protagonists appear only eleven frames into the narrative rather than at the opening. The narrator, Mary Jo, and Joanie appear only twice in the narrative and both times the images are remarkably grainy. The blurry visual code testifies to Braidotti’s notion of the becoming female subjectivity, “it is still a blank, it is not yet there” (1994, 131). This scene further plays with the (traditional) logic of the gaze.
Rather than granting easy visual accessibility to Mary Jo, Joanie, and the narrator, only one young woman is most fully discernable, the driver wearing glasses (see Figure 12.5). With reference to the online presence of Cruising and the requirement for a “literate” haptics, it is pertinent that the reader cannot manipulate any image, including that of the protagonists. Though Cruising itself is reactive in a general sense, modes within it are not, suggesting, as Sapnar does, that dynamics of looking and interacting are “ripe with issues of power” (Sapnar 2002a, 51). The accompanying front-seat passenger is extremely blurry and the backseat passenger, the narrator, is visible only in the rearview mirror as a reflection (see Figure 12.4). In this way Cruising’s anonymous narrator manipulates the gaze: she can look at the reader, but the reader has access only to a reflection. The story here is resolutely from the inside, a subjective text that “alternately challenges and ignores the institutional apparatus for ‘traditional’ or ‘mainstream’ literature” (Sapnar 2006). The visual problematization of the “hegemonic regime” multimodally represents the concept of time that is at stake in Ankerson and Sapnar’s text by foregrounding the “becoming-ness” of the narrative as well as the reading.6 Thus, what is
Figure 12.5 An enlarged view of the driver. Ankerson and Sapnar, Cruising.
persistently explored in Cruising, on the level of the narrative, its telling, and its reading, is the desire “not to know who we are,” but “what, at last, we want to become” (Braidotti 2002, 2). For Ryan, this means “the interface is much more than a way to manipulate the text—it is a simulative mechanism that enables the reader to participate symbolically in the experience of the speaker” (Ryan 2004). Through the visual evidence of a sujet-en-procès and the necessity to learn to drive the narrative, the only time that remains pertinent to the Web fiction is that of subjective articulation itself. Just as the complex entanglement of images, sound, and the reader’s navigation permit multiple temporalities to emerge, the textual account adds further complication.
I remember cruising Main Street with Mary Jo and Joanie, the heat pumping full blast, windows down, night rolling through Mary Jo’s father’s station wagon like movie credits. I was the skinny girl in back, sniffing the street like a dog. We wanted love. That’s all anybody ever wanted, and we thought maybe we could find it driving past us, maybe in a pick-up truck, and we’d pass each other a few times—them honking at us, us laughing at them, until finally, they’d wave us into Shopko’s parking lot. Joanie would slide a line of pink lipstick on, and we’d all really get to know each other. There were hundreds of us, tracing the edge of small town Wisconsin, eying life through a car we couldn’t yet take to the world.
This lengthy paragraph represents the entirety of the Web fiction narrative. Although it seems short for a story in terms of textual duration, it is genuinely limitless, for the narrative will revolve fugitively as long as the reader allows the URL to remain open.7 From a more structural point of view
the concept of frequency offers some interesting insights into temporality. If for Rimmon-Kenan, following Genette, frequency denotes “the relation between the number of times an event appears in the story and the number of times it is narrated (mentioned) in the text” (2002, 46), then the act of reading Cruising drastically alters its frequency. If the reader chooses to interact with the narrative for more than one loop, then the act of cruising moves from a singular instance (telling once what happened once) to multiple instances where the repetitive telling repeats events. Furthermore, the duration of the textual narrative illustrates another temporal aspect. According to Genette, the narrative pace is calculated by the amount of “space” given to each narrated event. For him, acceleration implies a short textual episode that narrativizes a lengthy event while deceleration is the opposite. In Cruising, then, the somewhat brief textual account describes an event that presumably lasted some time. Here the acceleration that Genette describes is doubled if the reader has yet to develop an adequate haptic sensibility. What this kind of formalist approach makes apparent is that Cruising depends on temporal parameters that come into being during the narrativizing, in the process of reading. Further attempts to negotiate linear time develop via the filmic frames that whiz past the inexperienced reader; however, this is not the only way that time is emplotted. While all three parts of the triadic structure (the voice-over, the filmic sequences, the rolling text) consist of short portions, looped ad infinitum, the filmic frames contain a kind of punctuation. Rather than a deluge of images, each scene is separated by a black (not blank) space. The melodic voice reciting the story and the streaming words make it easy for any reader to conceive of the images as one continuous flow.
However, with practice, the reader can slow down the narrativizing time and see that the Web fiction alternates between images and black spaces, sound and silence. Reading the silences as detached from the filmic frames reduces them to a separate narrative. Rather, readers should employ Raley’s “deep reading” or reading along the “z-axis”: “the user does not simply read the words [ . . . ] but she also reads through and behind [the text]” (Raley 2006). One might argue that what Cruising calls for is a “shallow viewing” in order to enjoy the fleeting, dynamic story. However, in doing so
Figure 12.6 Filmic sequence showing the black space punctuation between each image.
readers would grasp only the superficial and cinematic sense of the story. Reading the gaps alongside the streaming images, narrating voice, and text in a syncretic way facilitates a strategic interference. The interference is strategic in the sense that the coming together of these multiple layers of representation consistently calls attention to the instability of time: the black spaces are “the irreducibility of in-between spaces” (Braidotti 2002, 157). Importantly, the black spaces interrupt the flow of the narrator’s memory. As Irigaray asks, “[w]hat do we call a gap that is full” (cited in Guertin 2003). Without the gaps, the story would continue to swirl by both narrator and reader, but the black spaces signify a time to slow down and make the controlling of time imperative, to “hear the inaudible” (ibid.). In this way, the use of gaps acts as evidence of the double bind of representation: in Cruising there is an absolute necessity to slow down the time of the narrative, the rush of the memories, in order to create coherence. That the work has no “simple starting point” (Punday 2003, 112) and the various modes are consistently entangled can also be viewed through a feminist lens as a bid to remain mobile. For Braidotti mobility is a physical as well as creative feature, but in Cruising mobility, as a nomadic enterprise, appears as an element with temporal dimensions. Just as Braidotti notes that women’s freedom to “take back the night” signifies “the freedom to invent new ways of conducting our lives, new schemes of representation of ourselves,” so too does taking control of the reading of Cruising suggest a certain kind of freedom (1994, 256). However, this freedom, or “potential energy” as Sapnar sees it, is exclusively for the reader: how exactly the protagonists’ experiences unfold results only with the reader’s intervention (Sapnar 2002c).
For Braidotti, the “textual apparatus” is inherently “linear” and “binary” and thus her theory of nomadic subjects is a way of “renewing . . . language” and, by extension, subjectivity (2002, 8). If writing, for Braidotti, is a way to make a space habitable (particularly for women), reading can transform the space into time, pluralizing the reader and the narrative world. Thus the deeply haptic interaction with Cruising means that the unfolding of the narrative is never simply a case of “temporalization,” which for Grosz is “the putting of matter and events into a timeline or chronology” (2005, 75). Rather, the reader interacts with the text at specific moments and “constructs it” much like Braidotti’s view of the intersections of philosophy and feminist theory (Braidotti 1996). Therefore, the explicitly controlled emergence of temporality in Cruising and the tightly linked though uncontrollable multimodality combine and interact to expose what Ankerson and Sapnar perceive as weaknesses inherent in frameworks that preclude dynamism (Sapnar 2002b). In “The Temporality of Hypertext Fiction” Hink demonstrates how “the separation of plot chronology or textual time from narrative time or sequence is a key element of readers’ experiential navigation of hypertext fiction” (2004). Rather differently, in Cruising such a separation renders
the narrative impenetrable. Instead of requiring readers to divide narrative from reading time, Ankerson and Sapnar maintain that a direct correlation exists between Cruising’s narrative drive and their attempt to engage systematically with a notion of time that takes the (re)production of difference into account. Each reader and each reading will enact a different experience of time passing and of the narrative progressing that links with Braidotti’s nomadic becoming where the “subject . . . is definitely not one, but rather multilayered, interactive and complex” (2003, 43). Phrased more generally, the I of the reading and the I of the narrative are braided together alongside a temporal dimension that demands correlation. As Cavarero explains, narrative is reciprocal: it requires a teller and a listener for the story to come into being. As implicitly evident in Cruising, the paradoxes in the narrative logic (requiring the slowing down of time in order to progress the story) are multimodally representative of the constitution of the narrator’s subjectivity. Just as the narrative requires slowing down in order to enable the story to progress, so also the narrator’s subjectivity comes into view with temporal speed, for it is precisely the motion that allows for the construction of subjectivity. Subjectivity in Cruising, like the narrative itself, appears in “in-between spaces . . . temporal points of transition” that deeply implicate the reader in their rendering (Braidotti 2002, 40).
NOTES
1. For a detailing of the ambiguities surrounding time in fiction see Ursula K. Heise, Chronoschisms: Time, Narrative, and Postmodernism (Cambridge: Cambridge University Press, 1997), chap. 1.
2. It is possible that the haptic nature of Cruising may render it more readable by a gamer than by a traditional hypertext or print reader.
3. Guertin “believe[s] that retracing one’s steps in the new media is [not] possible. Instead, we experience re-visionings.” For Guertin this is a specifically feminist way of reading digital fiction (Guertin 2003).
4. Although Braidotti makes these insights based on her interpretation of Thelma and Louise, they are applicable to Cruising.
5. With the aid of a Flash decompiler it was possible to decode the elements and number of elements used to create the Web fiction.
6. For Braidotti, “visualization is the hegemonic regime” (2002, 246).
7. It is not necessary that the reader remain on any “page” of the Web fiction for it to continue of its own accord. As long as the URL is open, it will continue to scroll through the filmic frames.
REFERENCES
Ankerson, Ingrid, and Megan Sapnar. 2001a. “Author Description.” Cruising, Electronic Literature Collection. http://collection.eliterature.org/1/works/ankerson_sapnar__cruising.html (accessed July 10, 2008).
---. 2001b. Cruising, Electronic Literature Collection. http://collection.eliterature.org/1/works/ankerson_sapnar__cruising.html (accessed July 10, 2008).
Braidotti, Rosi. 1994. Nomadic Subjects: Embodiment and Sexual Difference in Contemporary Feminist Theory. New York: Columbia University Press.
---. 1996. “Nomadic Philosopher: A Conversation with Rosi Braidotti—Kathleen O’Grady.” Women’s Studies Resources, University of Iowa. http://bailiwick.lib.uiowa.edu/wstudies/Braidotti/index.html (accessed June 20, 2008).
---. 2002. Metamorphoses: Towards a Materialist Theory of Becoming. Cambridge: Polity Press.
---. 2003. “Becoming Woman: Or Sexual Difference Revisited.” Theory, Culture and Society 20 (3): 43–64.
Burton, Stacy. 1996. “Bakhtin, Temporality, and Modern Narrative: Writing the Whole Triumphant Murderous Unstoppable Chute.” Comparative Literature 48 (1): 42. http://links.jstor.org/sici?sici=0010-4124%28199624%2948%3A1%3C39%3ABTAMNW%3E2.0.CO%3B2-D.
Butler, Judith. 1997. Excitable Speech: A Politics of the Performative. New York: Routledge.
---. 2004. Undoing Gender. New York: Routledge.
Cavarero, Adriana. 2000. Relating Narratives: Storytelling and Selfhood. Trans. Paul Kottman. London: Routledge.
Di Rosario, Giovanna, and Matteo Gilebbi. 2006. “Hyperpoetry: Sincretismi–Ibridazioni–Margini–Interstizi.” Poesianet: Digital and Visual Poetry. http://www.poesianet.it/materiali9.htm (accessed July 15, 2008).
Drucker, Joanna. 2008. “Graphic Devices: Narration and Navigation.” Narrative 16 (2): 121–39. http://search.ebscohost.com/login.aspx?direct=true&db=aph&AN=31627468&site=ehost-live (accessed October 10, 2008).
Ettinger, Bracha L. 2006. “Fascinance and the Girl-to-m/Other Matrixial Feminine Difference.” In Psychoanalysis and the Image, ed. Griselda Pollock, 60–93. Oxford: Blackwell Publishing.
Frost, Gary. 2007. “Reading by Hand: How the Hands Prompt the Mind.” Institute for the Future of the Book. http://www.futureofthebook.com/storiestoc/hand (accessed June 20, 2008).
Genette, Gérard. 1980. Narrative Discourse. Trans. Jane E. Lewin. Oxford: Blackwell.
Grosz, Elizabeth. 1999. “Becoming . . . An Introduction.” In Becomings: Explorations in Time, Memory, and Futures, ed. Elizabeth Grosz, 1–12. Ithaca: Cornell University Press.
---. 2005. Time Travels: Feminism, Nature, Power. Durham, NC: Duke University Press.
Guertin, Carolyn. 2003. “The Archive: Memory, Writing, Feminisms.” Quantum Feminist Mnemotechnics: The Archival Text, Digital Narrative and the Limits of Memory. PhD diss., University of Toronto. http://www.mcluhan.utoronto.ca/academy/carolynguertin/1i.html (accessed May 15, 2008).
Hayles, N. Katherine. 2002. Writing Machines. Cambridge, MA: MIT Press.
---. 2004. “Print Is Flat, Code Is Deep: The Importance of Media-Specific Analysis.” Poetics Today 25 (1): 67–90.
Heise, Ursula K. 1997. Chronoschisms: Time, Narrative, and Postmodernism. Cambridge: Cambridge University Press.
Hink, Gary. 2004. “Temporality of Hypertext Fiction: The Subjective Narrative of Sequence.” Juxtaposition. http://caxton.stockton.edu/Juxtaposition/stories/storyReader$83 (accessed May 3, 2008).
Hume, Kathryn. 2005. “Narrative Speed in Contemporary Fiction.” Narrative 13 (2): 105–24.
Hutcheon, Linda. 1980. Narcissistic Narrative: The Metafictional Paradox. New York: Methuen.
Kress, Gunther, and Theo van Leeuwen. 1996. Reading Images—The Grammar of Visual Design. London: Routledge.
Kristeva, Julia. 1981. “Women’s Time.” Signs 7 (1): 5–12.
Laccetti, Jessica. 2008. “New Media Stories: Subjectivity, Feminism and Narrative Structures.” PhD diss., De Montfort University.
---. 2009a. “Narrative Beginnings in Hyperfictions.” In Anthology of Narrative Beginnings, ed. Brian Richardson, 179–90. University of Nebraska Press.
---. 2009b. “Reading Links as Reading Strategies.” In Blurring the Boundaries, ed. Bernd Herzogenrath. New York: Mellen Press.
Ludwig, Jessica. 2001. “Students’ Poetry Web Site Showcases Hypertext Verse.” Information Technology, The Chronicle of Higher Education. http://chronicle.com/free/2001/06/2001061201t.htm (accessed June 20, 2008).
Luesebrink, Marjorie. 1998. “The Moment in Hypertext: A Brief Lexicon of Time.” Proceedings of the Ninth ACM Conference on Hypertext and Hypermedia: Links, Objects, Time and Space—Structure in Hypermedia Systems, June 20–24, Pittsburgh: ACM, 106–12.
Marsh, Bill. 2001. “Reading Time: For a Poetics of Hypermedia Writing.” Currents in Electronic Literacy 16 (Fall). http://www.cwrl.utexas.edu/currents/fall01/marsh/marsh.html (accessed June 3, 2008).
---. 2008. “Some Joyces, Not an Eco, Instruments and Playable Text.” The Iowa Review Web 9 (2). http://research-intermedia.art.uiowa.edu/tirw/vol9n2/ (accessed August 25, 2008).
Mulvey, Laura. 1975. “Visual Pleasure and Narrative Cinema.” Screen 16 (3): 6–18.
Punday, Daniel. 2003. “Toying with the Parser: Aesthetic Materiality in Electronic Writing.” The Journal of Aesthetics and Art Criticism 61 (2): 105–19.
Raley, Rita. 2006. “Editor’s Introduction: Writing 3D.” The Iowa Review Web. http://www.uiowa.edu/~iareview/mainpages/new/september06/raley/editorsintro.html (accessed May 20, 2008).
Richardson, Brian. 2006. “Making Time: Narrative Temporality in Twentieth-Century Literature and Theory.” Literature Compass 3 (3): 603–12.
Rimmon-Kenan, Shlomith. 2002. Narrative Fiction: Contemporary Poetics. London: Routledge.
Ryan, Marie-Laure. 2003. “Narrative Cartography: Toward a Visual Narratology.” In What Is Narratology? Questions and Answers Regarding the Status of a Theory, ed. Harald Fricke, 333–64. Berlin: Walter De Gruyter Inc.
---. 2004. “Cyberspace, Cybertexts, Cybermaps.” Dichtung Digital 1. http://www.brown.edu/Research/dichtung-digital/2004/1/Ryan/index.htm (accessed June 15, 2008).
Sapnar, Megan. 2002a. “The Code Looks Back: Flash Software, Virtual Spectators and the Interactive Image.” MA diss., Georgetown University.
---. 2002b. Poems that Go, Online Posting, 13:36. http://www.poemsthatgo.com/discussion.htm#form (accessed June 28, 2008).
---. 2002c. “Text that Moves.” Archived Log of Live Chat: With Guests Thomas Swiss of the Iowa Review Web and Megan Sapnar of Poems that Go Led by Deena Larsen. LinguaMOO, trAce Archive. http://tracearchive.ntu.ac.uk/forumlive/chat102002.cfm (accessed July 15, 2008).
---. 2003. “Reactive Media Meets E-Poetry.” Poems that Go 12. http://www.poemsthatgo.com/gallery/winter2003/print_article.htm (accessed July 13, 2008).
---. 2006. “Digital Poetry, Visual Media: The Promises and Pitfalls for New Practitioners.” Scene 360. http://www.scene360.com/STORYboard_article_digitalpoetry.html (accessed May 25, 2008).
13 Beyond Multimedia, Narrative, and Game
The Contributions of Multimodality and Polymorphic Fictions
Christy Dena
INTRODUCTION
Polymorphic fictions offer a unique methodological opportunity to stretch current notions of the role of delivery media and environments in the meaning-making process, as well as our understanding of narrative and game modes. At present, though, the theories that directly describe these practices are not attending to the meaning-making process or the issue of the copresence of narrative and game modes. This chapter discusses the contributions multimodality makes to the study of polymorphic fictions, and what both of these contribute to narrative, game, and media studies.
A QUICK PRIMER ON MULTIMODALITY
[The] arts have begun to use an increasing variety of materials and to cross the boundaries between the various art, design and performance disciplines, towards multimodal Gesamtkunstwerke, multi-media events, and so on. The desire for crossing boundaries inspired twentieth-century semiotics. (Kress and van Leeuwen 2001, 1)
There have been many papers, books, and conferences developing the topic of multimodality, including Anthony Baldry’s (2000) edited collection, Kay O’Halloran’s (2004) edited collection, and the biennial International Conference on Multimodality, which began in 2002. This chapter will focus on the book written by semioticians Gunther Kress and Theo van Leeuwen (2001). Their text is chosen because it has many concerns that are congruent with my own study into polymorphic fictions: the goal of a somewhat “unifying” theory that is inclusive of many modes, the incorporation of experiential elements, the view that communication only occurs with both producer and interpreter, and a focus on combinations of modes. Despite the popularity of the term multimodality, Kress and van Leeuwen’s usage of “multimodality” is actually polysemous. It refers to, I argue, a type of practice (“multimodal texts”), a reaffirmed perspective that all
communication is multimodal, as well as a communication methodology (“multimodal communication”). On the first, Kress and van Leeuwen explain that they are concerned with how “people use the variety of semiotic resources to make signs in concrete social contexts” (2001). This practice perspective describes multimodality as the:
use of several different semiotic modes in the design of a semiotic product or event, together with the particular way in which these modes are combined—they may for instance reinforce each other (‘say the same thing in different ways’), fulfil complementary roles, [ . . . ] or be hierarchically ordered. (Kress and van Leeuwen 2001, 20)
Kress and van Leeuwen are also careful to point out that while the skills involved with this practice are somewhat contemporary, meaning has always been made “in many different modes and media which are copresent in a communication ensemble” (2001, 111). The past view, they explain, that “meaning resides in language alone” or “is the central means of representing and communicating” is “simply no longer tenable” (ibid.). Likewise, in 1998, semiotician Jay Lemke explained that:
All literacy is multimedia literacy: you can never make meaning with language alone, there must always be a visual or vocal realization of linguistic signs that also carries non-linguistic meaning (e.g. tone of voice, or style of orthography). Signs must have some material reality in order to function as signs, but every material form potentially carries meanings according to more than one code. All semiotics is multimedia semiotics, and all literacy is multimedia literacy. (Lemke 1998, 283)
While this understanding of the modally diverse nature of communication is now increasingly prevalent, it is the methodological models that actuate this insight that are in development. So, in order to understand and communicate multimodal practice, indeed identify it as such, Kress and van Leeuwen put forward a theory of multimodal communication.
They moved away from their previous work on image-specific semiotics (Kress and van Leeuwen 1996) and “the idea that the different modes in multimodal texts have strictly bounded and framed specialist tasks,” towards “a view of multimodality in which common semiotic principles operate in and across different modes” (2001, 2). Kress and van Leeuwen’s multimodal practice and methodology are explored here in the context of polymorphic fictions. The first part of this chapter shows how the study of meaning-making in polymorphic fictions develops understanding of semiotic resources, and the second part utilizes their semiotic principles to progress understanding of narrative and game modes in polymorphic fictions and beyond.
INTRODUCING POLYMORPHIC FICTIONS
All objects are walls for writing on. Take any form and project it onto any surface. Earrings can be billboards, tables can be screens, buildings can be magazines. We are the Babylonian translators. (Gerritzen et al. 2001, 129)
The term “polymorphic fictions” refers to fictions that are expressed across multiple forms. Specifically, they are identified by the intra-systemic use of combinations of distinct articulations for meaning. Examples of works put forward by theorists exploring the area in narrative, game, and media studies include the Wachowski Brothers’ The Matrix franchise and its continuing storyline across comics, computer game, feature films, online games, and anime; I Love Bees (42 Entertainment 2004), the “alternate reality game” expressed across Web sites, live events, and pay phones, created to extend and market the fictional world of the computer game Halo 2 (Bungie Studios for Microsoft Game Studio, 2004); Mark Z. Danielewski’s House of Leaves, “which operates across no less than 5 media channels (novel, novella, live performance, recorded music, the web), each integral to the establishment of the narrative storyworld” (Ruppel 2006); and the novel distributed as paragraphs on stickers in streets across the world: Implementation (Montfort and Rettberg 2004). Polymorphic fictions are a difficult phenomenon to capture and analyze due to the challenges they present in theorizing the role of mediums in the meaning-making process, and theorizing the presence of both narrative and game elements at intracompositional (single work) and intercompositional (between works) levels. Both of these factors will be explored here, but due to the nascent nature of research into this phenomenon, this section will first briefly outline the theoretical area.
Polymorphic Fictions Are Combinations of Distinct Articulations
Theories put forward to identify the phenomenon of polymorphic fictions include Jill Walker Rettberg’s “distributed narrative” (Walker 2004),1 Marc Ruppel’s “cross-sited narratives” (Ruppel 2005a), Henry Jenkins’s “transmedia storytelling” (Jenkins 2006), Markus Montola’s “pervasive gaming” (Montola 2005), and Jane McGonigal’s “ubiquitous gaming” (McGonigal 2006). While there are some differences in the phenomena described with these theories, a unifying trait of the practices is the use of multiple mediums. This trait is significantly different to what is commonly considered “multimedia.” While multimedia is a terribly polysemous term, it is invoked here rhetorically to denote the conventional association with a mix of text, images, video, and sound. The problem with this notion of “multimedia” is that it is regarded as being representative of all expressive possibilities, but is oblivious to other medial factors such as a delivery
medium (a computer or book, for example). Therefore, in an attempt to distinguish the phenomenon discussed here from “multimedia” within a media platform, current theorists have referred to these fictions variously as being “distributed across varying media channels (film, web, music, video games, print, live performance, etc.)” (Ruppel 2005b). Further to this, the expressive medium doesn’t just have to be what is conventionally regarded as a media channel: it can denote “physical spaces” (Walker 2004), “locations” (McGonigal 2006) or “spaces” (Montola 2005).2 The street a pervasive game is played in, for instance, can be part of the meaning-making process rather than incidental to it (Davenport 2005; Davies 2007; Flanagan 2007). This need to include semiotic resources beyond media platforms is the reason why the expressive mediums employed in polymorphic fictions are described here as “articulations.” The term articulation is intended to be inclusive of all objects, processes, actions, environments, and media that have the potential to communicate in some way. For some theorists it is the distribution (Walker 2004), or expansion across space (Montola 2005), that is more significant than the need for more than one media channel (Ruppel 2005a, b) or media platform (Jenkins 2006; McGonigal 2006). That is, a work may be accessed through a single computer, but distributed across many Web sites (Walker 2004). While this multiplicity of location within an articulation is a noteworthy practice, it is not a feature of what is proposed here as polymorphic fictions. Instead, polymorphic fictions highlight the peculiar literacy involved in creating and experiencing a work that is expressed across distinct articulations.
Experiencers (readers, audiences, players) of polymorphic fictions have to engage in haptically distinct interaction modes in order to traverse the fictional world: they move from flicking pages in a book to clicking on a keyboard to watching a television screen. “Distinct articulations,” then, captures the change in interaction mode required of the experiencer (and in many cases the literacy required of the creator). But in a definitional context, the notion of “combinations of distinct articulations” is not synonymous with a skill set. This is one reason why the notion of “intra-systemics” has been introduced.
Polymorphic Fictions Are Intra-Systemic
The mere presence of distinct articulations renders virtually all franchises, adaptations, remixes, homages, and so on, a polymorphic fiction. There are two steps needed therefore to prevent the phenomenon from disappearing into such an inordinate scope: a trait that distinguishes polymorphic fictions from transtextuality (Genette [1982] 1997), and another to distinguish them as a type of franchise.3 Polymorphic fictions are not subsumed under transtextuality because that theory is concerned chiefly with different-author relations. It is noted that while there are times Genette does refer to same-author relations in literary works—described variously
as “intratextuality,” “autohypertextuality,” and “autographic”—Genette makes it clear that these practices are exterior to the notion of transtextuality.4 It is also noted that different-author relations across distinct articulations is a novel area of research too,5 but there are other areas of inquiry admirably taking on this task already, such as intermediality (Wolf 2005) and transfictionality (Saint-Gelais 2005).6 Instead, what is argued to be a remarkable yet largely unrecognized phenomenon is the use of a combination of distinct articulations for meaning by a sole producer or creatively controlled group. Indeed, one of the motivations behind Kress and van Leeuwen’s theory of multimodality is their observation that a single person can, with the aid of digitization, create works with different modes and material realizations. “Different modes have technically become the same level of representation,” they explain, “and they can be operated by one multi-skilled person using one interface, one mode of physical manipulation, so that he or she can ask, at every point: ‘Shall I express this with sound or music?’, ‘Shall I say this visually or verbally?’ and so on” (2001, 2). But unlike this multimedia or monodistinct-articulation approach, a polymorphic fiction creator asks: “Shall I express this part of my fictional world with a novel, computer game, painting, or film?” Of course, there have been multimodal texts (and fictional worlds expressed across distinct articulations) before; but previously multimodal texts “were organised as hierarchies of specialist modes” by “hierarchically organised specialists in charge of the different modes” and then integrated by an editing process (Kress and van Leeuwen 2001, 2; original emphasis removed).
Indeed, media scholars Christian Krug and Joachim Frenk argue that it is a democratization of modes that is the outstanding feature of The Matrix:
The franchise is not remarkable because supplementary texts now elaborate on or even modify the story of a successful pre-text—after all, Hollywood has made films out of successful comic books and has expanded on the myths that inform these pre-texts (Superman, Batman). Rather, the radically new potential of the Matrix franchise derives from the status of the various media involved in the process. (Krug and Frenk 2006, 75; original emphasis)
This usage heralds an emerging literacy that requires the creator to have not only an awareness of the affordances and skills needed for a range of articulations (what Sue Thomas et al. [2007] described as “transliteracy: the ability to read, write and interact across a range of platforms, tools and media from signing and orality through handwriting, print, TV, radio and film, to digital social networks”7) but also what I provisionally describe here as a combinatorial literacy. That is, a person who can write a novel and a screenplay is not necessarily capable of writing a story that begins in a novel and continues in a screenplay. The term “intra-systemic” is an attempt to
Christy Dena
capture this emerging combinatorial literacy and creatively organized production process.8
Problematizing Intra-Systemic Distinct Articulations: Levels of Meaning-Making The intra-systemic use of more than one distinct articulation qualifies a work as a polymorphic fiction, but this trait does not detail how meaning-making occurs. This section will outline how the channel of transmission or delivery medium—such as a computer, book, canvas, television, or street—can be an active part of the meaning-making process. Conventionally, a delivery medium is not considered part of a communication ensemble. Werner Wolf, for example, explains that the “technical or institutional channels” are a secondary concern in intermediality (Wolf 2005, 253). However, the configurative role of so-called transmissive or distribution mediums has been argued by many, including Marie-Laure Ryan (2003) and Kress and van Leeuwen: Distribution technologies are generally not intended as production technologies, but as re-production technologies, and are therefore not meant to produce meaning themselves. However, they soon acquire semiotic potential of their own and even unwanted ‘noise’ sources such as the scratches and discolorations of old film prints may become signifiers in their own right. (2001, 21; original emphasis) As is evidenced in the references to “unwanted noise” in this quote, discussions on the semiotic potential of distribution technologies have gathered around the notion of configuration in an extra-systemic sense. That is, the configurative role is incidental, actuated by the technology rather than the creator. When considering the role of a distinct articulation in a polymorphic fiction, the discussion shifts to investigating how a distribution technology or environment is intentionally invoked by the creator to be a part of the meaning-making process. In her paper on the importance of the “meaning, history, and significance” of the space a game is played in, game researcher Mary Flanagan (2007) cites the “urban tourism” game You Are Not Here (Mushon et al. 
2006) as an example of a conscious use of a location. The game, described on the main Web site as a “dislocative tourism agency” (Mushon et al. 2006), “invites participants to become meta-tourists on an excursion through the city of Baghdad” (Flanagan 2007). Participants navigate through New York City using a two-sided map of New York City and Baghdad, and by holding the map up to the light can discern corresponding locations. Once at that location, they find stickers with a phone number they can call to listen to a recorded message narrating details of a corresponding Baghdad location or an event that occurred there. While the actual streets are not
Beyond Multimedia, Narrative, and Game
necessarily significant, it is the choice of city and country that is: “You Are Not Here attempts to expose the contrasts and similarities between two cities [ . . . ] While each city’s realities are politically involved, both the emotional and social perception of these corresponding spaces are completely detached from one another” (Mushon et al. 2006). The space can be seen to operate as part of the meaning-making process in at least two ways: the experiential aspect brings the participant closer to the reality of a remote city than newspaper reports do, and the juxtaposition of the two locations asks the participant to question the differences between the cities so set apart by political rhetoric. An example of a polymorphic fiction that activates the delivery medium as part of the intra-systemic meaning-making process is the novel Cathy’s Book: If Found Call 650–266–8233 (Stewart and Weisman 2006). As one can gather from the book title, the usual “paratextual” element of a book title is actually diegetic, in that the book is presented as if it is the personal diary of a girl that the reader has stumbled upon. Stuck to the inside cover of the book are a variety of personal items one would find in a diary: removable photos, scribbles, and napkins with phone numbers that lead to character recordings and fictional Web sites. In a similar vein to epistolary fictions of the past, the use of the diegetic title, accompanying objects and Web sites all contribute to an immersive effect by asking the reader to interpret the delivery mediums as part of the fictional world.
NARRATIVE AND GAME MODES IN POLYMORPHIC FICTIONS My intuition is that what we will see in the future will be a number of hybrid phenomena which contain elements of what we traditionally used to define either as a game or a story, but which are also themselves altering the very notion of these concepts, and of what a game or a narrative can be. (Klastrup 2003, 18) Beyond the significance of the use and activation of various distinct articulations in polymorphic fictions is the use of narrative and game modes. Polymorphic fictions can involve a combination of distinct articulations that include what can be regarded as narrative-based and game-based compositions. For instance, Douglas Adams created many intra-systemic adaptations and extensions of the Hitchhiker’s Guide to the Galaxy fictional world: the 1978 BBC Radio 4 radio play (partly cowritten with producer John Lloyd), novels, and the story and puzzles for the interactive fiction game (Infocom 1984). Max Barry created his own Internet simulation game, Nation States (Barry 2004), to complement his novel Jennifer Government (2004). The computer game The Sopranos: Road to Respect (2006) was cowritten by the creator of the TV show The Sopranos (HBO), David Chase. Independent filmmaker Lance Weiler wrote and directed his
feature film Head Trauma (2006) as well as an accompanying alternate reality game Hope is Missing (2007). On the face of it, these polymorphic fictions have narrative and game modes in operation at both intercompositional (between works) and intracompositional (within a work) levels. To understand the phenomena, then, theories from both narrative and game studies would need to be drawn on and revised. Currently, however, theories addressing the area described here as polymorphic fictions privilege either a narrative or game mode.
Current Narrative-Based Theories Walker Rettberg proposes “distributed narratives” to describe “stories that aren’t self-contained,” specifically: A new kind of narrative is emerging from the network: the distributed narrative. Distributed narratives don’t bring media together to make a total artwork. Distributed narratives explode the work altogether, sending fragments and shards across media, through the network and sometimes into the physical spaces that we live in. (Walker 2004) Walker Rettberg invokes Aristotle’s dramatic unities from Poetics and summarizes them as unity in time, space, and action: a play should depict action within one day, within one place, and be directed towards a single overarching idea. Walker Rettberg then reframes these as disunities to identify the exotic nature of these works. They can be distributed in time, where the “reader, player or viewer experiences the narrative in bits and pieces over a period of time”; distributed in space, because there “is no single place in which the whole narrative can be experienced”; and distributed across authors, where “no single author or group of authors has complete control of the narrative” (Walker 2004). Ruppel has proposed what he calls “cross-sited narratives” as “multisensory ‘clustered’ or ‘packeted’ stories told across a divergent media set” (Ruppel 2006). These “multi-sited narrative networks,” he explains, have a “narrative sequence [that] is distributed across varying media channels (film, web, music, video games, print, live performance, etc.)” (Ruppel 2005b). To Ruppel, cross-sited narratives are a “truly multimedial method of storytelling” (Ruppel 2005a). And finally, among other theories in media studies, there is Henry Jenkins’s well-known theory of “transmedia storytelling”: A transmedia story unfolds across multiple media platforms with each new text making a distinctive and valuable contribution to the whole. 
In the ideal form of transmedia storytelling, each medium does what it does best—so that a story might be introduced in a film, expanded through television, novels, and comics; its world might be explored
through game play or experienced as an amusement park attraction. (Jenkins 2006, 95–96) Before the discussion continues, it is important to clarify here that transmedial narrative and transmedia storytelling are not analogous concepts. Indeed, despite the repetition of the same core term “transmedia” in narrative, game, media, and even education studies, all of these areas of inquiry are referring to different phenomena. In admittedly simplistic terms, the area variously described as “transmedial narratology,” “transmedial narrative,” and “narrative media studies” interrogates the nature of narrative in light of the relationship between narrative and media. Research questions include the medium-specific and medium-independent nature of narrative. The same concern exists in game studies, where games are regarded as a transmedial phenomenon in that there is “no set of equipment or material support common to all games” (Juul 2001, 48). The nature of game is likewise being explored through an interrogation of its medium-specific and medium-independent character (Juul 2001, 48–52; Eskelinen 2005). In both narrative and game studies the term “transmedial” also refers to an element that is medium-independent. So “transmedial narrative” and “transmedial game” can refer to a research inquiry and a medium-independent element identified within that inquiry. Jenkins’s “transmedia storytelling,” on the other hand, refers to the expansion of stories from a single fictional world across media. It is a story that is expressed in combinations of media platforms. In this sense, transmedia storytelling has more affinities with “transfictionality”—which “covers those practices that expand fiction beyond the boundaries of the work” (Saint-Gelais, translated in Ryan 2006)—than transmedial narrative. 
To confuse the situation even more, there is also a pedagogical approach that is oftentimes utilized by media and narrative studies educators, who sometimes equate it incorrectly with transmedia storytelling: “transmediation.”9 These are not the only usages of transmedia and this disparity is a natural occurrence, but a multidisciplinary address requires an interrogation of discipline- and field-sensitive nomenclature.
Criticisms and Defences of Narrative-Based Theories Walker Rettberg, Ruppel, and Jenkins have all referred to computer games and unique hybrids such as “alternate reality games” to illustrate their respective theories. Despite this, they all invoke narrative in their identification and description of these works. Criticisms of some of these and other narrative-based approaches to contemporary phenomena abound. Game scholar Markku Eskelinen criticizes the “detriment of the narrativist approach (or ideology)” in media studies (Eskelinen 2005), and Espen Aarseth laments the “use of unfocused terms such as emergent narratives” to describe games (2004b, 366). However, media studies terms such as
Jenkins’s use of “storytelling” could also be strategic: championing an aesthetically rather than economically motivated approach to his industry readers, and highlighting a new focus on “aesthetic implications” to academics (Jenkins 2004, 40). Indeed, media scholar John T. Caldwell introduced the term “second shift aesthetics” to recognize similar phenomena and to “bridge the unfortunate gap that has widened between academic studies of industry, from a political-economic perspective, and critical studies in the humanities” (Caldwell 2003, 132). But, as Eskelinen argues, the use of “storytelling” is still inappropriate: As there are different modal contexts and origins (films and games for starters) for cross-media franchises such as Star Wars or Tomb Raider, it is old-fashioned and inaccurate to rename or indiscriminately label these franchise economies and strategies as mere or pure storytelling ecologies or transmedial storytelling. (2005) Anticipating a future challenge, Walker Rettberg has defended her use of the term “narrative”: In earlier work, I have proposed the term distributed narrative to describe the increasing number of texts where elements of a story are distributed in time or space. By using the term narrative, rather than discussing the larger group of texts variously called “contagious media” or “crossmedia”, I wish to emphasize the ways in which our basic knowledge of narrative structures allows us to see connections between fragments that may have no explicit links. (2005) Walker Rettberg’s sense of narrativity lies in what she claims is a narrative-based reading of distributed narratives, not necessarily a characteristic of them. This argument raises two issues: (a) describing a phenomenon according to what is considered the primary way it is perceived; (b) bundling “connection between fragments” with narrative knowledge. 
To describe a work according to the way it is presumed to be interpreted, irrespective of its qualities, is problematic. While the subjective nature of realities is not contended here, what is contended is the merging of an audience or reader-response theory with a theory of a work. Granted, there is no work outside its perception . . . but claiming one perception accounts for everyone’s experience of it is perhaps not the intention but is the logical outcome of Walker Rettberg’s claim. Secondly, the ability to perceive connections between things is not necessarily narrativistic. While “the ability to infer causal relations is essential to narrative understanding” (Ryan 2004, 11), the reverse is not. This argument is taken up in the penultimate section of this chapter. Irrespective of the reason for invoking narrative as the primary trait, Walker Rettberg’s “distributed narrative,” Ruppel’s “cross-sited narratives,” Jenkins’s “transmedia storytelling,” and my own previous
terms—“cross-media narrative” (Dena 2003), “cross-media storytelling” (2004a), “polymorphic narrative” (2004b) and “multi-channel storytelling” (2004c)—could all be seen then as being guilty, intentionally or unintentionally, of what Aarseth describes as “narrative colonialism” or “narrativism”: This is the notion that everything is a story, and that story-telling is our primary, perhaps only, mode of understanding, our cognitive perspective on the world. Life is a story, this discussion is a story, and the building that I work in is also a story, or better, an architectural narrative. (Aarseth 2004a) A narrativistic methodology also thwarts understanding, for “[w]hen games are analysed as stories, both their differences from stories and their intrinsic qualities become all but impossible to understand” (Aarseth 2004b, 362). Indeed, there has been “a question that has split, but also animated and energized, the young academic discipline of video game studies” (Ryan 2006, 181). Fundamentally, game theorists have “made an ontological argument about games being a formally different transmedial mode and cultural genre of expression and communication than stories” (Eskelinen 2005). A lot of discussion has concentrated, then, on how this difference can be understood—games are simulations (Frasca 2003) or rules-based (Juul 2005), for example. Ironically though, the monomodal perspective and its ramifications so criticized by game scholars is also exercised in game studies.
Current Game-Based Theories Walker Rettberg, Ruppel, and Jenkins all refer to alternate reality games (ARGs) in their theories, a type of polymorphic fiction that I have described elsewhere as having high degrees of both narrative and ludic elements (Dena 2007). However, ARGs are also labeled by game scholars as pervasive games (Montola 2005) and ubiquitous gaming (McGonigal 2006). Montola and McGonigal describe these genres in ways similar to Walker Rettberg: Montola describes pervasive games as a genre of gaming that is identified by the systematic “blurring and breaking the traditional boundaries of game” in “spatial, temporal and social dimensions,” unlike a “regular game [that] is played in certain spaces at certain times by certain players” (Montola 2005); and McGonigal has argued that “immersive games [ . . . ] erase game boundaries—physical, temporal and social” (McGonigal 2003) and more recently that ubiquitous games are characterized as (among fourteen other factors): “distributed experiences [that are] distributed across multiple media, platforms, locations, and times” (McGonigal 2006, 43). In these game studies therefore, the focus is likewise on the medial characteristics and the presence of one primary mode without any mention of the copresence of game and narrative modes.10
So, it appears the same practice, ARGs, can be an example of distributed narrative, cross-sited narrative, transmedia storytelling, pervasive gaming, and ubiquitous gaming. But the question that needs to be asked here is: are these researchers studying the nature of narrative, game, or media, or the nature of the phenomena? Their mode-specific labels are perhaps warranted if they are investigating the nature of narrative or game in these phenomena. That is the remit of narrative and game studies, respectively. But it is self-evident that in privileging a narrative or game mode, any insight into the modally diverse nature of polymorphic fictions is thwarted. A methodological goal for the study of polymorphic fictions, then, has been to develop a model for identifying and interrogating the nature of both narrative and game elements. This has led to a questioning of their arbitrary delineations. Beyond confirming and developing notions of difference between narrative and game, similarities have also been observed. The awareness of similarities is not new: Juul has previously noted that it is possible that “games and narratives can on some points be said to have similar traits” (Juul 2001), and Ryan has recently cited similarities between computer games and narrative: characters, events, setting, and trajectories leading from a beginning to an end state (Ryan 2006, 182). Unfortunately though, the conclusion Ryan draws from these similarities is that computer games “have integrated play within a narrative and fictional framework” (Ryan 2006, 182). Ryan has also furthered this argument with a call for a “ludo-narrativism” to acknowledge the “dual nature of video games” (Ryan 2006, 203). While the research questions Ryan proposes for a ludonarrativism definitely progress the area, it is my contention that there is still another step that can be taken. 
This step moves beyond seeing similarities between narrative and game elements as an indication of boundary shifting (that is, it should have been narrative or game to begin with) or complex co-presence. It is a step illuminated by Kress and van Leeuwen’s theory of multimodal communication.
THE INSIGHTS OF MULTIMODAL COMMUNICATION As explained at the beginning of this chapter, Kress and van Leeuwen’s multimodality is a multifaceted term that refers to a type of practice, a reaffirmed understanding of the diverse nature of communication as well as a communication methodology. Their theory of multimodal communication is enunciated with their proposed conceptual and material levels (though they don’t describe them as such). The top level, if you like, has the common semiotic principles, such as “action,” “emotion,” and “framing.” Framing principles can be observed with arms in a painting, borders in a newspaper, and pauses in speech. Framing is a common semiotic principle,
“a multimodal principle, that can be differently realised in different semiotic modes” (Kress and van Leeuwen 2001, 3). The next level consists of modes, which are semiotic resources that “can be realised in more than one production medium” (2001, 21–22; original emphasis removed). Narrative, they explain, is a mode that “can be realised in a range of different media” (2001, 22). Media is, then, the final level: they “are the material resources used in the production of semiotic products and events” (ibid.). Examples of media are paint, cameras, computers, and (human) vocal apparatus. To illustrate the relationships between these levels, consider this diagram (see Figure 13.1). As one can see in this interpretation of Kress and van Leeuwen’s relations, there are three levels: principles, modes, and media. The difficulty with Kress and van Leeuwen’s nomenclature is that they invoke “multimodal” to refer to a combination of modes and a common semiotic principle at the same time. To illuminate the difference between the two, I term a common semiotic principle a “transmodal element.” The following is a summary, then, of the areas of inquiry a theory of multimodal communication affords, reframed with what is argued to be a more appropriate nomenclature:
• transmodal: elements that can be realized in different modes
• transmedial: elements (modes) that can be realized in different media
• mode: methods employed that influence the way messages are presented
• media: material resources used in the production of semiotic products and events
• multimodal: combinations of modes
Figure 13.1 Diagram illustrating the relations between principles, modes, and media, as espoused by Kress and van Leeuwen (2001).
The insight that is highlighted here is the transmodal element that can be realized in different modes. This means that similarities between narrative and game modes are not necessarily an indication of co-presence or the need for a demarcation shift. Instead, similarities indicate elements which are non-mode-specific. This may seem quite fundamental when expressed in such a manner, but as evidenced in the theoretical discussions noted in this chapter, it is not currently utilized as an approach to understanding narrative and game modes. So how does it change the way polymorphic fictions, indeed any creative phenomena, are theorized?
TOWARDS A TRANSMODIOLOGICAL APPROACH [F]or pleasure in the confusion of boundaries, and for responsibility in their construction. (Haraway 1991, 150; original emphasis) Shifting the inquiry to be inclusive of modal diversity facilitates understanding of narrative, game, and media in ways each in isolation cannot. This is why I have shifted the investigation to the mode-agnostic term fictions rather than narrative, storytelling, or game. Fiction is not, it should be noted, intended to be synonymous with literariness and narrative.11 With this methodologically strategic move, the investigation facilitates the study of combinations of modes and transmodal elements. Since transmodal elements appear to be a young area of research, it is worth exploring this area. To do so, let’s return to Walker Rettberg’s claim that distributed narratives are described with the narrative mode in order to “emphasize the ways in which our basic knowledge of narrative structures allows us to see connections between fragments that may have no explicit links” (Walker 2005). This claim about the relationship between narrative knowledge and the ability to see connections between things has an established history in narratology. Narrative interpretation has long been understood in terms of being able to recognize connections through changes of state (though recent theories like David Herman’s work on storyworlds (2002) champion a more complex rendering). Indeed, as mentioned earlier, Ryan has stated that “the ability to infer causal relations is essential to narrative understanding” (Ryan 2004, 11). What I wish to challenge here is the conflation of discerning connections and narrativity, by looking at neuropsychology. Eugene G. 
d’Aquili and Andrew Newberg argue that there are seven fundamental functions that “allow the mind to think, feel, experience, order, and interpret the universe”: holistic, reductionist, causal, binary, abstractive, quantitative, and emotional value (d’Aquili and Newberg 1999). The causal operator, d’Aquili and Newberg explain, “permits reality to be viewed in terms of causal sequences” (1999, 53). The causal operator (indeed all the operators) intimates a conceptual process that is not developed through narrative understanding but is interpreted as narrative understanding. Indeed, if we take “cause” as a possible transmodal principle,
then it can be extrapolated out to narrative and game modes equally in the form of plot and quests respectively. Quests have been rhetorically labeled by Aarseth (2004b) as a “postnarrative discourse.” A quest, Aarseth explains, can be understood as “a perfect path or ‘ideal sequence’ that must be realized, or the game/story will not continue” (ibid., 367). They can be understood, Aarseth continues, as a “string of pearls: within each pearl (microworld) there is plenty of choice, but on the level of the string there is no choice at all” (ibid.). Here, one can see a quest being described according to some chronological conditional structure, much like plot. Indeed, game scholar Susana Tosca (2003) has even juxtaposed quests with plot, with the caveat that one involves action on the behalf of the player and the other narration. Here, causal relations are not peculiarly narrativistic or ludic. That is, the perception of (causal) connections is perhaps better described (at least for methodological purposes) as a transmodal principle: one that can be realized in both narrative and game modes. Indeed, it is exactly these kinds of discussions that illuminate understanding of polymorphic fictions, narrative, and game more than any discussion of a single mode in isolation.
CONCLUDING REMARKS This chapter has attempted to explore new ways of understanding meaning-making through the challenge of polymorphic fictions and the insights of multimodality. The study of polymorphic fictions is introduced as a remarkable phenomenon that contributes to Kress and van Leeuwen’s notion of multimodal texts through insights into the use of semiotic resources beyond (intra-articulation) multimedia: delivery media and environments. Drawing on the levels of meaning-making implied in their multimodal communication model, the key insight of transmodal principles was highlighted and explored in light of the need to develop a methodology to study narrative and game modes in complex phenomena. The approaches offered here have only touched on the possibilities and complications, but hopefully provide some guidance into fruitful future directions for the study of narrative, game, media, and beyond.
NOTES 1. Since writing the essay referred to in this chapter the author has changed her name. This chapter will refer to the author with her married name (Walker Rettberg), but cite her essay using the name it was published under (Walker). 2. This is not the first time, of course, that other elements beyond a media platform have been analyzed as part of the meaning-making process. There are a few chapters in O’Halloran’s (2004) edited collection, for example, that discuss this.
3. To facilitate a comparative and more comprehensive inquiry, practices encompassed in the theory of polymorphic fictions include mass entertainment, marketing and independent film, art, and gaming.
4. Same-author relations are mentioned in Genette’s discussion of transposition and translation when he refers to bilingual writers such as Samuel Beckett and Vladimir Nabokov, who do “self-translations” (Genette [1982] 1997, 214); in his discussion of expansion he refers to Queneau’s own expansions (261); in his discussion of transtylizations he refers to Queneau’s Exercises in Style (226), Valery and Mallarmé (227), and describes the latter as “self-transtylizations.” This “self” prefix continues on occasions: self-excision (231), self-expurgation (235), self-concision (237), self-condensation (243), self-transvocalization (290), and even an “autographic epilogue” (208). How these relations are distinct from his theory of hypertextuality is made clear in his discussion about continuation as a form of imitation, and how “[t]his ‘autotextuality’ or ‘intratextuality,’ is a specific form of transtextuality, which ought perhaps to be considered in itself—but no hurry” (207). Genette also comments in his brief discussion of self-transtylization that he shall not “theorize on the paratextual function of the foretext, or self-hypotext; [as] this may be the topic of another inquiry” (227). Therefore, while Genette has included what he has variously described as intratextuality, autohypertextuality, and autographic, they are all antithetical to his theory of transtextuality.
5. Genette does refer to “hyperartistic practices” ([1982] 1997, 384–87), but these are primarily relations between works within the same art form.
6. Although not indicated in his Routledge entry (2005), Saint-Gelais does include same-author expansions “in order precisely to study and compare what’s at stake in each case” (Saint-Gelais, personal communication).
7. It should be noted that Thomas et al.’s theory of transliteracy is wide enough to embrace combinatorial literacies as well as literacies that recognize digital and nondigital communication types (Thomas et al. 2007).
8. Identifying the intra-systemic nature of collaborative creative works (especially franchises) is not without its complications. Of course, the issue of theorizing creative control outside of solitary art forms like literature is not new, but there are a variety of factors (which cannot be explored here) such as analyzing creative control during production processes, the sharing of primary assets, and confirmations from meta-commentary.
9. Transmediation was introduced by language educator Suhor to refer to the “translation of content from one sign system into another” (Suhor 1984, 250), and since then has been actively employed in teaching “because learners must invent a connection between the two signs systems” (Siegel 1995, 455). It is a pedagogical technique to illuminate relationships between medium and content, primarily through adaptations of single stories.
10. McGonigal does discuss “the novel recombinations of play and performance that ubicomp enables and provokes” (McGonigal 2006, 41). But performance is the result of “ludic” design. While there is obviously a leaning towards recognizing more than a game mode in McGonigal’s ubiquitous gaming, it is not an overt part of her nomenclature or study.
11. See Ryan’s case for distinguishing literary, narrative, and fiction (Ryan 1991).
REFERENCES Aarseth, E. 2004a. “Genre Trouble: Narrativism and the Art of Simulation.” In Electronic Literature Review, (3), ed. N. Wardrip-Fruin and P. Harrigan.
http://www.electronicbookreview.com/thread/firstperson/vigilant (accessed May 31, 2008). (Originally published in First Person: New Media as Story, Performance, and Game, ed. N. Wardrip-Fruin and P. Harrigan, 45–55. Cambridge: MIT Press.)
Aarseth, E. 2004b. “Quest Games as Post-Narrative Discourse.” In Narrative across Media: The Languages of Storytelling, ed. M.-L. Ryan, 361–76. Lincoln: University of Nebraska Press.
Baldry, A. 2000. Multimodality and Multimediality in the Distance Learning Age: Papers in English Linguistics. Campobasso: Palladino.
Barry, M. 2004. Nationstates. http://www.nationstates.net/ (accessed May 31, 2008).
Caldwell, J. T. 2003. “Second-Shift Media Aesthetics: Programming, Interactivity, and User Flows.” In New Media: Theories and Practices of Digitextuality, ed. A. Everett and J. T. Caldwell, 127–44. New York: Routledge.
d’Aquili, E. G., and A. B. Newberg. 1999. The Mystical Mind: Probing the Biology of Religious Experience. Minneapolis: Fortress Press.
Davenport, G. 2005. When Place Becomes Character: A Critical Framing of Place for Mobile and Situated Narratives. http://mf.media.mit.edu/pubs/detail.php?id=427 (accessed May 31, 2008).
Davies, H. 2007. “Place as Media in Pervasive Games.” Proceedings of the 4th Australasian Conference on Interactive Entertainment. Melbourne: RMIT University. ACM International Conference Proceeding Series, vol. 305.
Dena, C. 2003. “Response as Input: The Role of Reader Response, HCI and HRI in the Modeling of a Cross Media Narrative.” Presented at the Seismic Readings 2003 Conference, School of Literary, Visual and Performance Studies, Monash University, Clayton.
Dena, C. 2004a. “Current State of Cross Media Storytelling: Preliminary Observations for Future Design.” Presented at European Information Society Technologies, the Netherlands.
Dena, C. 2004b. “New Media Methodologies, Applied.” Presented at Contexts, School of Creative Arts Postgraduate Seminar Day, University of Melbourne.
Dena, C. 2004c. “Towards a Poetics of Multi-Channel Storytelling.” Presented at Critical Animals: This is Not Art Festival, Newcastle.
Dena, C. 2007. “The Future of Digital Media Culture is All in Your Head: An Argument for the Age of Integrating Media.” In Proceedings of perthDAC 2007: The 7th Digital Arts and Culture Conference, ed. A. Hutchinson, 116–25. Perth: Curtin University of Technology.
Eskelinen, M. 2005. “Explorations in Game Ecology, Part 1.” Forum Computerphilologie. http://computerphilologie.uni-muenchen.de/jg05/eskelinen.html (accessed May 31, 2008).
Flanagan, M. 2007. “Locating Play and Politics: Real World Games and Activism.” Special Issue: Social Media: Narrative and Literacy in Digital Culture, Leonardo On-Line 16 (2–3). http://www.leonardo.info/LEA/PerthDAC/DACSocialMedia.html (accessed Feb 25, 2009).
Frasca, G. 2003. “Simulation versus Narrative: Introduction to Ludology.” In Video/Game/Theory, ed. M. J. P. Wolf and B. Perron, 221–35. London and New York: Routledge.
Genette, G. [1982] 1997. Palimpsests: Literature in the Second Degree. Lincoln: University of Nebraska Press.
Gerritzen, M., G. Lovink, M. Bruinsma, and M. Ruyg. 2001. Catalogue of Strategies. Amsterdam: BIS Publishers.
Haraway, D. 1991. Simians, Cyborgs and Women: The Reinvention of Nature. New York: Routledge.
Herman, D. 2002. Story Logic: Problems and Possibilities of Narrative. Nebraska: University of Nebraska Press.
Jenkins, H. 2004. “The Cultural Logic of Media Convergence.” International Journal of Cultural Studies 7 (1): 33–43.
Jenkins, H. 2006. Convergence Culture: Where Old and New Media Collide. New York: New York University Press.
Juul, J. 2001. “Games Telling Stories?: A Brief Note on Games and Narratives.” Game Studies 1 (1). http://www.gamestudies.org/0101/juul-gts/ (accessed May 31, 2008).
Juul, J. 2005. Half-Real: Video Games between Real Rules and Fictional Worlds. Cambridge, MA: MIT Press.
Klastrup, L. 2003. “Towards a Poetics of Virtual Worlds: Multi-User Textuality and the Emergence of Story.” PhD thesis, Department for Digital Aesthetics and Communication, IT University of Copenhagen.
Kress, G. R., and T. van Leeuwen. 1996. Reading Images: The Grammar of Visual Design. London and New York: Routledge.
Kress, G. R., and T. van Leeuwen. 2001. Multimodal Discourse: The Modes and Media of Contemporary Communication. London: Arnold.
Krug, C., and J. Frenk. 2006. “Enter the Matrix: Interactivity and the Logic of Digital Capitalism.” In The Matrix in Theory, ed. M. Díaz-Diocaretz and S. Herbrechter, 73–92. Amsterdam and New York: Editions Rodopi B.V.
Lemke, J. L. 1998. “Metamedia Literacy: Transforming Meanings and Media.” In Handbook of Literacy and Technology: Technological Transformation in a Post-Typographic World, ed. D. Reinking, L. Labbo, M. McKenna, and R. Kiefer, 283–301. Hillsdale, NJ: Erlbaum.
McGonigal, J. 2003. “A Real Little Game: The Performance of Belief in Pervasive Play.” Presented at the Digital Games Research Association (DiGRA) “Level Up” Conference, University of Utrecht, the Netherlands.
McGonigal, J. 2006. This Might Be a Game: Ubiquitous Play and Performance at the Turn of the Twenty-First Century, Performance Studies and the Designated Emphasis in Film Studies in the Graduate Division. Berkeley: University of California Press.
Montfort, N., and S. Rettberg. 2004. Implementation. http://nickm.com/implementation/ (accessed May 31, 2008).
Montola, M. 2005. “Exploring the Edge of the Magic Circle: Defining Pervasive Games.” Presented at Digital Arts and Culture, Copenhagen, Denmark.
Mushon Zer-Aviv, D. P., K. London, T. Duc, R. Tao, and C. Joseph. 2006. You Are Not Here. http://www.youarenothere.org/ (accessed May 31, 2008).
O’Halloran, K. L. 2004. Multimodal Discourse Analysis: Systemic-Functional Perspectives. London: Continuum.
Ruppel, M. 2005a. “Learning to Speak Braille: Convergence, Divergence and Cross-Sited Narratives.” PhD qualifying exam presentation, University of Maryland College Park.
Ruppel, M. 2005b. “Triggers and Traces: Cross-Sited Narratives and Medial Materialities.” Presented at the Society for Literature, Science, and the Arts Conference, Chicago.
Ruppel, M. 2006. “Many Houses, Many Leaves: Cross-Sited Media Productions and the Problems of Convergent Narrative Networks.” Presented at the Digital Humanities 2006 Conference, Paris-Sorbonne.
Ryan, M.-L. 1991. Possible Worlds, Artificial Intelligence, and Narrative Theory. Bloomington: Indiana University Press.
Ryan, M.-L. 2003. “On Defining Narrative Media.” Image and Narrative, Online Magazine of the Visual Narrative, no. 6. http://www.imageandnarrative.be/mediumtheory/marielaureryan.htm.
Ryan, M.-L. 2004. “Introduction.” In Narrative across Media: The Languages of Storytelling, ed. M.-L. Ryan, 1–40. Lincoln: University of Nebraska Press.
Ryan, M.-L. 2006. Avatars of Story. Minneapolis: University of Minnesota Press.
Saint-Gelais, R. 2005. “Transfictionality.” In Routledge Encyclopedia of Narrative Theory, ed. D. Herman, M. Jahn, and M.-L. Ryan, 612–13. London and New York: Routledge.
Siegel, M. 1995. “More than Words: The Generative Power of Transmediation for Learning.” Canadian Journal of Education 20 (4): 455–75.
Stewart, S., and J. Weisman. 2006. Cathy’s Book: If Found Call 650–266–8233. Philadelphia and London: Running Kids Press.
Suhor, C. 1984. “Towards a Semiotics-Based Curriculum.” Journal of Curriculum Studies 16 (3): 247–57.
Thomas, S., C. Joseph, J. Laccetti, B. Mason, S. Mills, S. Perril, and K. Pullinger. 2007. “Transliteracy: Crossing Divides.” First Monday 12 (12). http://www.uic.edu/htbin/cgiwrap/bin/ojs/index.php/fm/article/viewArticle/2060/1908 (accessed May 31, 2008).
Tosca, S. 2003. “The Quest Problem in Computer Games.” In Proceedings of the First International Conference on Technologies for Interactive Digital Storytelling and Entertainment TIDSE ’03. Darmstadt: Fraunhofer IRB Verlag. http://www.itu.dk/people/tosca/quest.htm (accessed May 31, 2008).
Walker, J. 2004. “Distributed Narrative: Telling Stories across Networks.” Presented at the Association of Internet Researchers 5th Annual Conference, Brighton.
Walker, J. 2005. “Pattern Recognition: Reading Distributed Narratives.” Presented at the Viral and Distributed Narratives Panel, Society for Literature, Science and the Arts 19th Annual Conference, Chicago, November 10–12.
Wolf, W. 2005. “Intermediality.” In Routledge Encyclopedia of Narrative Theory, ed. D. Herman, M. Jahn, and M.-L. Ryan, 252–56. Oxfordshire: Routledge.
14 Keg Party Extreme and Conversation Party
Two Multimodal Interactive Narratives Developed for the SMALLab
Sarah Hatton, Melissa McGurgan, and Xiang-Jun Wang

INTRODUCTION

Recent studies in Interactive Narrative reveal the possibility of a new genre that leverages exploratory participation within multimodal environments. Janet Murray (1997) describes this genre’s affordances as giving users the ability to experience various emotional media and encounter fantasies in interactive spaces. Marie-Laure Ryan further defines Interactive Narrative as enactment through relatively free dialogue and gestures. Importantly, Interactive Narrative can include nontraditional displays that are immersive and allow for a sense of bodily presence in a virtual space representing a conceptual or reconstructed world with sonic and visual references. Interactive Narrative can involve a navigable space, and the themes it propounds should be exploratory in nature. The participants could either be the addressee of something performed or, instead, transcend the boundaries of an actual author (Ryan 2004).

If narrative is what is being enacted by a participant, then it is important to have a deeper understanding of what narrative is. Ryan posits that narrative is a mental representation of causally connected states and events that captures a segment in the history of a world and of its members (Ryan 2004). These worlds can be different from our own. Importantly, this means that story offers a new way of being in a world by providing the opportunity to envision environments and experiences through another’s eyes or ears. In its mimetic sense, Interactive Narrative has the potential to create a multimodal representation relative to the natural world. Interpretations must then be based on both images and sounds. When both images and sound relate the occurrence of a narrative within a fictional world, the multimodal aspects allow a broader range of connections between the reader, his or her senses, and the world being represented.

Looking closer at the modality of sound within Interactive Narrative shows that sonic resources play an integral role in creating mood, revealing information, and sustaining story in Interactive Narrative structures. Theo van Leeuwen looks closely at the semiotic value of sound with regard to speech. He points out that sound quality is multidimensional and that the features of a sound are what help define what it represents (van Leeuwen 1999). It is thus no surprise that video games use sound for redundancy and player feedback. Diegetic sounds within a game may give clues about the contents of an environment or signal that danger is near, or they may simply add to the illusion of a realistic, immersive ambience. Nondiegetic sounds add to mood as well. Music inside games can indicate how much time the player has. Hearing the internal monologue of the hero of the game may also help the story progress. These sonic landscapes are just as important as the intricate 3-D graphics found in most contemporary video games (Friberg and Gardenfors 2004).

Spoken language, as an element of sound, is an important component of story because it can give specific attributes to character and personality. Van Leeuwen points out that the sound of speech can derive significance from different qualities and aspects. These sound qualities include tenseness, pitch, nasality, breathiness, and roughness. The example of breathiness shows how context can change the meaning of a sound: a character’s voice in a story could be breathy because the character is out of breath from running, or because the character is excited or aroused. Additionally, van Leeuwen reveals that humans associate different connotations with different qualities of speech such as accent or pitch. A woman who often speaks with a loud, shrill, high-pitched voice may be thought of as lower class, and a teenager with a southern American accent who uses the word “like” much too often may be thought of as far from intellectual. Our team wanted to investigate the qualities of speech as a sound that could drive narrative because of the different associations these qualities possess (van Leeuwen 1999).

One facet of naturally occurring conversation worth exploring in Interactive Narrative is its ability to function as a soundscape that can generate an exploratory sonic space. Conversations reveal information about narrative because they are an exchange of events between at least two people in a linear format. In order to generate this sonic space, we chose a social situation where many conversations occur simultaneously. Each of these singular exchanges of voice could become a navigable pathway in a larger space. We thus chose a party as the encapsulating space where different conversations could occur.

In the projects described in this chapter, we constructed two party scenarios to be explored as two Interactive Narratives. We carefully composed a visual environment that encapsulated sounds that, when navigated and explored by a participant, can create a cohesive story. In the design of these two Interactive Narrative pieces, we leverage the fact that cultures often treat following a conversation as a main component of narrative. Vocal narrative is personal and relational, not only to the other characters in the story but to the observer as well. Story additionally allows for discovery by providing a sense of anticipation about what actions may occur. When navigating an Interactive Narrative structure, a participant can experience this sense of anticipation while trying to understand the events that lead up to a fictional moment captured in time. These moments can be represented through visual references and sonic clues.

In this chapter, we explain the interaction design and techniques used in developing two multimodal Interactive Narrative scenarios for a hybrid mixed-reality space. In the design of these interactions, we aim to achieve two research goals: (a) create a sonic interactive narrative where sounds are mapped onto a visual so as to reference a moment captured in time; (b) give participants a sense of agency through their ability to discover a series of events that reveal a narrative.
Prior Work

Experimentation in the design of Interactive Narrative dates back at least to 1966. ELIZA (Weizenbaum 1966) was a chat-based project in which the participant was able to message a psychotherapist, which was in reality just a computer. By listening for key words in a computer messaging scenario, ELIZA created a conversational base that was generated by the participant’s initial questions and later responses. ELIZA is a great example of how agency can shift during the interaction. Previous multiplayer Interactive Narrative experiences also include role-playing games like MUDs and LARP (Murray 1997). These experiences allow participants to make choices in how the plot unfolds.

A modern example of a game-based Interactive Narrative is Façade, an artificial intelligence-based art and research experiment that attempts to move beyond traditional branching or hyperlinked narrative such as Afternoon (Joyce 1990), so as to create a fully realized, one-act interactive drama (Mateas and Stern 2003). These works demonstrate that stories, characters, and even emotions can be designed to provide a real interactive experience. The true reward of the game is the exchange of language, or seemingly natural conversation, that goes on between the characters, together with the story elements revealed by those characters. Façade is a successful example of an Interactive Narrative structure that utilizes traditional screen-based platforms.

It is important to understand the role of sound in Interactive Narrative. Previous work in sound design has explored topics such as urban environments, context awareness, architectural spaces, and scenarios that give participants the ability to compose music using nontraditional methods. Gaye, Maze, and Holmquist’s Sonic City is an example of mapping sound to context in order to be aware of and connected to the immediate environment (Gaye, Maze, and Holmquist 2003). Gaye points out that music is inextricable from the lifestyles and textures of daily urban life. Feeling the same way about language, our team also hoped to map sound to a specific context, but in contrast to Sonic City, we decided to devise our own environment in order to facilitate a fictional narrative element as opposed to a feeling of chance encounter. Like Sonic City, we hoped to generate a personal soundscape coproduced by physical movement and local activity. Encounters, events, and behaviors all become means of interaction.

Sound Mapping, another mobile sound installation, also uses architecture and the urban environment to engage each participant as an actual composer, rather than just a passive listener (Mott and Sosnin 1997). Camera Musica, a project by Gerhard Eckel, uses environment as composition as well, but employs a virtual system rather than the physical interface of a city. Eckel goes on to describe what he calls a “meta-composition,” or the method by which the composition should be composed (Eckel 1996). Our own projects also took this idea of meta-composition into consideration. Considering the visual design of our environments was extremely important, especially when relating the sound back to the space for coherency’s sake. Sound is often a product of physical objects colliding, humans speaking, or friction between built structures; thus we wanted to incorporate the visceral quality of sound’s relationship to physical objects.

In summary, our research team draws from previous examples in interactive narrative, sonic landscape design, and contextual associations between sounds and physical objects and spaces. Examples like Façade used traditional keyboard- and mouse-driven interactions; however, we wanted to emphasize the notion of navigating through a real space. Realizing that an immersive environment can provide a richer, multimodal experience, we decided to develop our ideas in a multimodal environment called SMALLab. Utilizing this space made it possible to design an environment with a multimodal interface that incorporates both sound and visual feedback, rewarding participants for their interaction and giving them ownership of their own narrative structure. These rewards depend on actual physical movement through the space. Traditional mouse- and keyboard-driven technologies were inadequate for creating the spatial illusions. We describe the SMALLab architecture and space in the following section.
SMALLab

The Situated Multimodal Arts Learning Lab (Birchfield, Ciofo, and Minyard 2006) is a hybrid physical-digital environment for art making, learning, play, and reflection. SMALLab’s open framework facilitates social interaction and collaborative learning. The physical space fills a fifteen-foot-by-fifteen-foot footprint and extends to a twelve-foot ceiling. An aluminum truss structure supports the sensing and feedback apparatus. Four audio speakers project spatialized sound feedback, while a top-mounted projector provides real-time visual feedback on the floor of the space. As shown in Figures 14.1 and 14.2, SMALLab utilizes six cameras that track the real-time 3-D positions of illuminated glowball objects. The lab allows people of all ages to interact with one another and their composed media worlds through free play, structured movement, and vocalization.

Figure 14.1 Picture of a student using the green glowing ball to interact with a 3-D trace system in SMALLab.
PROJECT ONE
Conversation as Narrative

Our team was motivated to create an interactive soundscape where the participant could reveal audio clues about the condition of a visual interactive space. We chose to simulate the environment of a typical college keg party, where one often finds oneself eavesdropping on other people’s conversations because of their proximity, their volume, and one’s own curiosity. The college party environment is ripe with narrative potential. Conflicts arise between students who no longer live with their parents and are thus free to experiment with their peers. Inhibitions are low, sexual desire is high, and there are no chaperones. Although the law considers all of these partygoers adults, many of them are drinking underage, because United States law requires a person to be twenty-one or over to drink legally. Thus, college parties are full of young, naïve people so eager to drink that they commit crimes weekly. These parties can result in binge drinking and overt sexual pandemonium, and they often end in house calls made by the local police. If the cops show up, then by the end of the night, it is every drunken man or woman for himself or herself. Students either run out of the house or into closets, frantically text messaging and calling their friends to make sure no one got caught.

Our team was especially interested in the types of characters associated with college parties. We wanted to exploit the shallow stereotypes of male and female party attendees. These stereotypes include the male jock, the male misogynist, the stoner, the airhead, the snobby girl, and the flirtatious girl. These characters’ relationships at the party emerge from their conversations about the beer, their opinions on each other’s clothes, and strange occurrences throughout the evening. By listening to the conversations, a participant hears a story evolve. In order to construct these stories, we first staged and photographed a party scene. We then recorded the conversations that stereotypical partygoers carried on at the event. We also recorded ambient sounds, in order to facilitate the illusion of walking around the space.

Figure 14.2 Participant navigates through a virtual space in SMALLab’s Alphabet Soup scenario.
Project 1: Keg Party Extreme

In the design of Keg Party Extreme, we wanted to make sure each participant in our space could become a composer of language and conversation, so that navigating through the space along different paths would allow for alternate experiences. Our initial idea involved the act of eavesdropping on conversations taking place at a typical college party. We decided to use this social construct as the basis of our game, titled Keg Party Extreme.

Creating the Visual Environment

In order to achieve our first research goal, which was to create a sonic interactive narrative where sounds are mapped onto a visual so as to reference a moment captured in time, we first created an environment that encouraged exploration. We had to create a place that would allow a participant to physically travel around the visual space of the scenario as if listening to other partygoers’ conversations. The space was highlighted with objects and artifacts left over from the party, and moving towards them would cause auditory components to play. These components described the types of people at the party and the events that happened.
The referenced moment in time was an American-style university keg party. A keg party, also referred to as a house party, a frat party, or a kegger, is when a group of students who live together in a house or fraternity purchase a giant keg of beer and tap it at their house, and thus have gallons of beer on draught, as if their house were a chaotic open bar for one evening. With the researchers all being graduate students at one of the major party schools of the country, we chose to create a parody of the environment most people associate with our institution. In other words, creating an environment of a fraternity or college party offered us a lot of content to experiment with. We were able to think of a variety of people with different character traits as well as the audio and visual design associated with a sense of place.

By staging and digitally compositing an image, we were able to create a result representative of the aftermath of the party. As displayed in Figure 14.3, we chose iconic artifacts and objects reminiscent of a party, such as beer bottles, a keg, red plastic cups, as well as underwear and furniture strewn across the floor. By enclosing these objects in a large fifteen-by-fifteen-foot visual field, we hoped to create digital relics that revealed a sort of reactive history about the nature of the environment. These specific objects were chosen because we were aiming to create a sense of mystery and humor about what events happened at the party prior to the moment in time we captured with our photograph. The resulting photo was then projected onto the floor of SMALLab. The size of the image was extremely important because we wanted to create a slightly larger-than-life image, which shifted as the participants in our game moved throughout the space. Doing so created the illusion of being able to walk through and explore a space much larger than the actual SMALLab, much closer in size to a large room at the location of the illusory party.
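The shifting of a larger-than-life image with participant movement can be illustrated with a short sketch. This is not the SMALLab implementation, which the chapter does not describe; it is a minimal illustration that assumes a simple linear pan. The fifteen-foot floor size comes from the text, while the function name, the pixel dimensions, and the direction of the pan are hypothetical.

```python
ROOM_FT = 15.0  # SMALLab floor is fifteen feet on a side (from the text)


def pan_offset(x_ft, y_ft, image_px=(3000, 3000), view_px=(1024, 1024)):
    """Return the top-left pixel of the image region to project.

    Assumption: the pan is linear, so a participant at one corner of the
    floor sees one corner of the oversized image, and a participant at the
    opposite corner sees the opposite corner. Pixel sizes are hypothetical.
    """
    # Clamp the tracked position to the floor footprint, then normalize to 0..1.
    u = min(max(x_ft / ROOM_FT, 0.0), 1.0)
    v = min(max(y_ft / ROOM_FT, 0.0), 1.0)
    # The image can slide only by the amount it exceeds the projected view.
    max_dx = image_px[0] - view_px[0]
    max_dy = image_px[1] - view_px[1]
    return (round(u * max_dx), round(v * max_dy))


if __name__ == "__main__":
    print(pan_offset(7.5, 7.5))  # participant standing at the center of the floor
```

Under this assumption, the further the image exceeds the projected view, the more floor the participant appears to cover with each step.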
Figure 14.3 Image used in Keg Party Extreme.
Design of the Interactive Sonic Landscape

The sound, which consisted of the conversations occurring at the party and their accompanying ambient noises, was integral to achieving goal two, which was to give participants a sense of agency through their ability to discover a series of events that reveal a narrative. The sounds of the conversations and ambience not only set the mood of the visual environment, but also gave the most clues as to what happened at our staged party. Our team recorded voices of male and female characters present at the party. The actors were supplied with the image of the scene and instructed to ad-lib conversation under suggested social scenarios as if they were present. The resulting conversations provided clues as to how and why the scene appears to be in this specific condition. A sense of ambiguity is present throughout all of the conversations, so as to provide an open-ended storyline that can be pieced together by the participant in any order as they explore the environment.

Ambient sounds were additionally layered with party conversations for the participant to eavesdrop on. The background conversations, ambient noises, and party music play as soon as one enters the space; the ad-libbed conversations, however, are placed throughout the 3-D space and are triggered only when a participant walks to the location where that conversation took place. Each location was associated with an object in the projected party image. The objects that we tagged were the keg, bra, pile of beer bottles, flip-flop, red plastic cups, tie, chair, and the person passed out on the floor. The sound files attached to the objects within the image were compilations of ambient noises and conversations.
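The triggering of location-bound conversations can be sketched as a proximity test against the tagged objects. The following is a minimal illustration under stated assumptions, not the authors’ implementation: the eight object names come from the text, but the floor coordinates, the trigger radius, the class and function names, and the rule that a clip stops when the participant walks away are all hypothetical.

```python
import math

# Tagged objects from the text; the (x, y) floor positions in feet are hypothetical.
HOTSPOTS = {
    "keg": (3.0, 4.0),
    "bra": (11.0, 2.5),
    "beer_bottles": (7.0, 12.0),
    "flip_flop": (13.0, 9.0),
    "red_cups": (5.0, 8.0),
    "tie": (9.5, 5.0),
    "chair": (2.0, 12.5),
    "passed_out_person": (12.0, 13.0),
}
TRIGGER_RADIUS_FT = 1.5  # hypothetical distance at which a clip is triggered


def active_hotspots(x, y, hotspots=HOTSPOTS, radius=TRIGGER_RADIUS_FT):
    """Return the names of all hot spots within `radius` feet of (x, y)."""
    return sorted(
        name
        for name, (hx, hy) in hotspots.items()
        if math.hypot(x - hx, y - hy) <= radius
    )


class HotspotPlayer:
    """Tracks which clips are playing so each one starts once per visit."""

    def __init__(self):
        self.playing = set()

    def update(self, x, y):
        """Given a new tracked position, return (clips to start, clips to stop)."""
        now = set(active_hotspots(x, y))
        started = sorted(now - self.playing)
        stopped = sorted(self.playing - now)  # assumption: leaving ends the clip
        self.playing = now
        return started, stopped


if __name__ == "__main__":
    player = HotspotPlayer()
    print(player.update(3.0, 4.0))  # walking up to the keg
    print(player.update(8.0, 8.0))  # walking away into open floor
```

In this sketch the ambient layer would simply loop independently of the tracker, since the text describes it as playing whenever someone is in the space.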
Some sounds contained only one conversation, or a portion of a conversation, such as the sound of one person talking on a cell phone, while others contained multiple conversations, some of them sounding simultaneously. Not all of the overheard conversations are voiced at the same volume. Our actors controlled this detail by choosing the volume at which their character spoke according to the type of conversation they were having. For example, the conversations involve from one to four people, and in the larger ones the characters raise their voices in order to hear each other. Compared to individuals whispering, these loud voices possessed a much different temperament. These shifts in volume and inflection supply a relational aspect to the characters, since the participant has to recognize and reason about why these people are speaking that way. The discovery process takes place as participants realize that some conversations appear to be isolated and secretive whereas others are more open. Additionally, by layering conversations on top of one another, we were able to simulate the spatial effect of standing in a room and overhearing several conversations at the same time.

The content of the conversations also related to the object to which they were attached, which allows participants to choose the path they want to take through the space. For example, when a participant walks near the keg, he or she hears a conversation of two guys talking about how bad the beer is. In addition to the object that the sound files are attached to, all of the ad-libbed conversations also refer to one or more other objects present in the party scene, such as the bra and the person passed out on the floor, thus providing multiple threads for the participant to follow as they reveal more information.

Figure 14.4 Participant stands over the keg so as to trigger the sounds associated with that location.

Conclusions

In order to test the experience of Keg Party Extreme, we invited fifteen other graduate students to come and try our project. They gave us verbal feedback on their experience, and we evaluated that feedback against our research goals. Referring back to research goal one, which was to create a sonic interactive narrative where sounds are mapped onto a visual so as to reference a moment captured in time, we looked at how participants interacted in the space. They expressed that when interacting with Keg Party Extreme, they truly felt that they were walking through a space that was in fact larger than the floor of the SMALLab. Not only did they feel immersed in a different environment, but the recorded sounds coupled with the visual data gave them a true sense of being at a party.

Looking at goal two, which was to give participants a sense of agency through their ability to discover a series of events that reveal a narrative, participants felt that the sounds needed to reveal the story more rapidly in order to be more rewarding. We attribute this to participants’ lack of patience to stay at one hot spot and listen to a whole conversation. Standing still and waiting in an immersive, explorative multimodal environment proved to be much more challenging for participants than we had originally imagined. Because participants did not want to stay and listen to a longer conversation, we feel they were not able to construct in their minds a clear narrative of the party and thus did not experience as much agency or ownership as we would have liked.
Project 2: Conversation Party
After Keg Party Extreme, we decided to make another attempt at building a piece that also promoted narrative using sound. Instead of users discovering precomposed conversations pertaining to a specific time and space, we wanted to give users the ability to create conversation between characters within a social soundscape. Within this interaction, a specific time and place can be explored and created through the composition of conversation. The place, again the aftermath of a keg party, includes two characters situated within the party, and their voices are located throughout the visual landscape. Users can select words and phrases to say to one another through the voices of the two characters. Each word and phrase can be voiced in a variety of moods, allowing the users to create not only the words of a conversation but also the emotional response each character has to the other.
In order to address research goal one, which was to create an environment where sounds are mapped onto a visual so as to reference a moment captured in time, we once again structured our scenario around the social construct of the college party and implemented the visual and audio feedback relating to the environment. We chose to record voices of two different characters that could potentially have a variety of conversations in that space about occurrences at that moment in time. While doing so, we had each character ask questions, make statements, or utter responses in up to four different tones of voice. Our male character revealed a laid-back, optimistic, angry, or sleazy state of being within his responses, whereas our female character’s responses exemplified a slutty, naïve, bitchy, or superficial personality.
Figure 14.5 Participants engaging in Conversation Party.
Choosing the Visual Environment
By staging an original photo of the aftermath of a college party (Figure 14.6), we were able to create the visual map for our sounds. We decided to utilize the same system of staging the photo as in Keg Party Extreme.
Figure 14.6 The image used in Conversation Party.
Mood as Agency
In order to address research goal two, which was to give participants a sense of agency due to their ability to discover a series of events that reveals a narrative, we designed the characters’ voices so that they could be composed by the participants in the space. Sound was thus the most important aspect of Conversation Party, because it functioned not only as an environmental setting but also as the elements of the scenario that the participants could discover and design. Our team recorded the voices of a male and female character present at the party. The actors were supplied with a list of questions, statements, and responses common to conversation within this social setting. The questions, statements, and answers were chosen because they could be mixed and matched as an actual progression through a conversation. Each actor recorded the list four times, once for each mood. The resulting pieces of conversation provided an array of responses in four temperaments for each character to voice at the choice of the participant.
The conversation fragments, such as “hello,” “what’s up,” and “umm,” were associated with objects, much as in Keg Party Extreme. The four different tones of voice associated with each piece of conversation were situated in 3-D positions throughout the space. For example, when the ball was held high in the air and a participant activated a sound near the keys, the participant could hear the phrase, “let’s get out of here,” voiced in a superficial tone. Around that spot, the same phrase might also be heard in a bitchy, naïve, or slutty tone of voice. As participants explore the space further, they begin to understand where different moods are placed and can walk to those spots to activate them. We additionally provided a setting for the characters and their designers through ambient sounds. These sounds were not tagged to specific objects within the projected image; they simply looped continuously as one played within the space.
Interaction Design
Our sounds were mapped in 3-D space according to where we felt a character should be situated throughout the image. For example, if a guest in our interaction scenario were to walk over to the keg and move the green glowing ball up through the space, they would hear three to four different versions of our male character asking the question “Can I get you a beer?” Accordingly, we hoped to map some spots around the photo as male character spots, with others being associated with our female character. This idea related to our mission to create this design scenario for two people. Like the two participants utilizing two colored glowballs instead of one in Figure 14.5, our participants could manipulate the sounds within the 3-D space so as to compose their own conversation.
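The spatial mapping described above can be illustrated with a minimal sketch. It is not the SMALLab implementation, which the chapter does not document; the class name `SoundSpot`, the coordinates, the heights assigned to each mood, and the activation radius are all assumptions made for illustration. The sketch places the male character's question at the keg in four moods at different heights and returns whichever recording lies closest to the tracked ball.

```python
from __future__ import annotations

from dataclasses import dataclass
from math import dist
from typing import Optional


@dataclass(frozen=True)
class SoundSpot:
    """One recorded fragment placed at a point in the tracked space."""
    character: str
    phrase: str
    mood: str
    position: tuple[float, float, float]  # (x, y, z); units are hypothetical


def nearest_spot(ball, spots, radius=0.5) -> Optional[SoundSpot]:
    """Return the spot closest to the ball, or None if none lies within radius."""
    best = min(spots, key=lambda s: dist(ball, s.position), default=None)
    if best is not None and dist(ball, best.position) <= radius:
        return best
    return None


# Hypothetical layout: one phrase, four moods, stacked vertically at the keg.
KEG_XY = (2.0, 1.0)
SPOTS = [
    SoundSpot("male", "Can I get you a beer?", mood, (*KEG_XY, height))
    for mood, height in [
        ("laid-back", 0.5),
        ("optimistic", 1.0),
        ("angry", 1.5),
        ("sleazy", 2.0),
    ]
]

# Raising the ball at the keg moves it from one mood to the next.
high = nearest_spot((2.0, 1.0, 1.9), SPOTS)
low = nearest_spot((2.0, 1.0, 0.6), SPOTS)
far = nearest_spot((0.0, 0.0, 0.0), SPOTS)  # too far from any spot
```

A nearest-neighbour lookup with a fixed radius is only one possible design; larger radii would make spots easier to find again at the cost of overlapping moods.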
Because participants in the last project were interested in exploring the relationships among the sound events, we wanted to see if the opportunity to explore different paths and change behaviors would be enjoyable. We decided to design a probability system based on our sound story. When participants came across certain locations, a message would be sent to another location, thereby changing the conversation scenario, as illustrated in Figure 14.7. As a result, how and when participants explored would generate different conversations, making each participant’s interaction with the story a novel experience.
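One way to picture such a trigger-and-probability system is sketched below. The chapter does not specify the actual rules, so everything here is an assumption: the `Location` class, the idea that each location holds relative weights over moods, and the notion that a "message" simply adds weight to one mood at a linked location.

```python
import random


class Location:
    """A spot in the space holding a weighted choice of moods for its phrase."""

    def __init__(self, name, weights):
        self.name = name
        self.weights = dict(weights)  # mood -> relative weight (hypothetical)
        self.links = []               # messages sent whenever this spot is activated

    def link(self, target, mood, boost):
        """Register a message: activating this spot boosts `mood` at `target`."""
        self.links.append((target, mood, boost))

    def receive(self, mood, boost):
        """Handle an incoming message by raising the weight of one mood."""
        self.weights[mood] = self.weights.get(mood, 0.0) + boost

    def activate(self, rng):
        """Pick a mood by weight, then notify every linked location."""
        moods = list(self.weights)
        choice = rng.choices(moods, weights=[self.weights[m] for m in moods], k=1)[0]
        for target, mood, boost in self.links:
            target.receive(mood, boost)
        return choice


rng = random.Random(0)  # seeded so that a walkthrough can be repeated
keg = Location("keg", {"laid-back": 1.0, "angry": 1.0})
keys = Location("keys", {"naive": 1.0, "superficial": 1.0})
keg.link(keys, "superficial", 3.0)  # visiting the keg changes what the keys say

first = keg.activate(rng)    # the message is sent as a side effect
second = keys.activate(rng)  # now four times likelier to be "superficial" than "naive"
```

Under this sketch, the order in which participants visit locations changes the odds of what they hear next, which is the behavior the paragraph above describes in general terms.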
Figure 14.7 Picture of a flowchart about sound trigger and probability system.
OBSERVATIONS
Conversation Party appeared to be too difficult for the participants to control, since a considerable amount of time is needed to explore the space and gain an adequate understanding of which questions, statements, and responses are located where. Once participants learned where specific sounds were located, it was difficult for them to reactivate them due to their specified locations; therefore, the successful conscious composition of conversation was never fully realized by the users. Conversation Party was a much more active experience compared to Keg Party Extreme due to its multiplayer capabilities. The pairs of participants who tried creating a conversation talked to each other about what words they wanted to use and worked together to attempt making a short conversation. Some participants were able to do so and thus also noticed the probability system’s ability to shift the sounds at various spots. Future iterations of Conversation Party will utilize other methods of organizing the sounds within the space so as to create a shorter learning curve.
For Conversation Party, the voice of the characters was an important device in constructing a narrative. Referring back to van Leeuwen, it was the quality of the voices that revealed their context. Combining an angry female voice with a sleazy male voice could yield participant assumptions that the woman was angry with the man for being a liar or a cheater.
CONCLUSIONS AND FUTURE WORK
With regard to our research goals, we conclude that both Keg Party Extreme and Conversation Party rely on sound as the primary means of allowing the participant to experience a sense of agency when interacting within the narrative space. We posit then that mapping sound to a visual environment is an excellent way to contemplate Interactive Narrative within a certain social scenario. By utilizing objects and spatial constructs to compose a meaningful setting or illusion of place, a designer can in fact map out an interactive environment that can inspire thought, a sense of discovery, and exploration of character. Other social situations could be created besides that of a party. Dangerous confrontations between cops and criminals, situations involving cultural differences, or fantasy scenarios could also be explored.
ACKNOWLEDGMENTS
We would like to thank the Arts, Media, and Engineering program at Arizona State University for letting us have the chance to develop multimodal systems in the SMALLab as well as Aaron Cuthbertson, David Birchfield, Aisling Kelliher, and the SMALLab team for their assistance in the design and realization of our work. We would also like to especially thank our voice actors Peter Bugg, Eric McMaster, Marco Rosichelli, Joe Trevino, and David Young. In addition to his excellent acting, Peter Bugg also assisted in photography.
REFERENCES
Birchfield, D., T. Ciofo, and G. Minyard. 2006. SMALLab: A Mediated Platform for Education. Boston: ACM SIGGRAPH.
Carson, D. 2000a. “Environmental Storytelling: Creating Immersive 3D Worlds Using Lessons Learned from the Theme Park Industry.” Gamasutra.com (March).
Carson, D. 2000b. “Environmental Storytelling, Part II: Bringing Theme Park Environment Design Techniques to the Virtual World.” Gamasutra.com (April).
Eckel, G. 1996. “Camera Musica: Virtual Architecture as Medium for the Exploration of Music.” Presented at the International Computer Music Conference, San Francisco.
Friberg, J., and D. Gardenfors. 2004. Audio Games: New Perspectives on Game Audio. Singapore: ACE.
Gaye, L., R. Maze, and L. E. Holmquist. 2003. Sonic City: The Urban Environment as a Musical Interface. Montreal: NIME.
Joyce, M. 1990. Afternoon, a Story, Hypertext Document for Macintosh Computers. Cambridge: Eastgate Systems.
Mateas, M., and P. Sengers. 1999. Narrative Intelligence. Orlando: Narrative Intelligence AAAI.
Mateas, M., and A. Stern. 2003. Facade: An Experiment in Building a Fully-Realized Interactive Drama. San Jose: Game Developers Conference.
Mott, I., and J. Sosnin. 1997. Sound Mapping, an Assertion of Place. Interface.
Murray, J. H. 1997. Hamlet on the Holodeck. Cambridge: MIT Press.
Ryan, M.-L., ed. 2004. Narrative across Media: The Languages of Storytelling. Lincoln and London: University of Nebraska Press.
Stern, A. 2003. “Creating Emotional Relationships with Virtual Characters.” In Emotions in Humans and Artifacts, eds. R. Trappl, P. Petta, A. S. Payr. Cambridge: MIT Press.
van Leeuwen, T. 1999. Speech, Music, Sound. London: Macmillan.
Weizenbaum, J. 1966. “ELIZA—A Computer Program for the Study of Natural Language Communication between Man and Machine.” Communications of the Association for Computing Machinery 9:36–45.
15 Coda/Prelude Eighteen Questions for the Study of Narrative and Multimodality 1
David Herman and Ruth Page
The emergent field of multimodal narrative analysis is less a unified body of scholarship (or framework for inquiry) than an open-ended nexus of connections. Its central agenda is to provoke dialogue among analysts of narrative, on the one hand, and scholars who study the structures and affordances of multimodal discourse, on the other hand. Given the many disciplines that can and should contribute to this dialogue, a single conversation is by no means the end of the matter. The chapters brought together here are but a small indication of the diverse topics that could be interrogated, and as we look to the future, it is clear that this volume is but a first attempt to map out parts of a very rich domain for research. It is perhaps appropriate, then, that the final (printed) words of this text should take the form of a series of questions intended to stimulate further discussion. The questions that follow are neither definitive nor exhaustive. While the reader will find echoes of certain issues covered in the preceding chapters, he or she will also encounter a range of new starting points for research on multimodal storytelling. What is clear is that this enterprise will need to be a richly interdisciplinary affair in which scholars from many fields pool their areas of expertise to investigate complex narrative experiences. As contemporary narrative analysis continues to broaden its focus, turning its attention to stories of all sorts, we invite you to use these closing questions as one means of charting a path through a territory large areas of which remain to be explored. This territory is the place where narrative and multimodality intersect.
1. What is mode, and how does it relate to medium? Further, what are the most productive frameworks for studying modes, multimodality, and media in concert with research on stories and storytelling?
2. In what ways is multimodality a resource for the expression of cultural identity? What is the relationship between multimodality and gender, for example, or multimodality and the body/embodiment?
3. What is the relationship between multimodality and the experience of the lifeworld? How can multimodal representations help immerse interpreters in the lifeworlds of fictional characters as well as real-world persons whose experiences are being recounted, and how might researchers from different fields work together to develop frameworks for studying the semiotic and cognitive processes involved?
4. What is the relationship between digital works and works presented in the more traditional medium of print? Can high-tech multimodal narratives be considered a species of literary art, and if so do they extend the notion of “the literary” itself?
5. What is the relationship between multimodality and literacy practices more generally?
6. In studying multimodality, how can theorists avoid linguistic imperialism? How can they study visual, auditory, and other modes without subordinating them to concepts and terms inherited from linguistic study? What does it mean to say that there is a visual grammar, for example?
7. In studying multimodal texts as reflexes of underlying cognitive processes, or alternatively the cognitive processes activated by the interpretation of multimodal texts, what models will be most productive? Given that cognitive linguistics will not suffice because of its focus on a single mode, how might theorists develop a broader, cognitive-semiotic approach to multimodal texts?
8. How do narrative and nonnarrative elements work together in multimodal texts of different kinds?
9. How does cohesion obtain in the visual versus the verbal modes, and what analytic frameworks might best account for the joint operation of visual and verbal cues in the creation of cohesive ties across narrative segments?
10. Do children orient to multimodality differently than adults? What age-graded factors are relevant for the study of narrative and multimodality? For example, what is the role of prosodic information in narrative comprehension, and how does it shape the process of narrative acquisition?
11. In studying interactions that unfold within multimodal environments (whether computer-mediated or face-to-face), do analysts need to work with transcripts? If so, what should those transcripts look like? How can researchers best capture or represent information about the nonverbal in a language-based transcript?
12. Extrapolating from the previous question(s), what is the best way to study processes of remediation as they bear on issues of multimodality (and vice versa)? How might the study of cinematic adaptations of print texts, for example, be informed by research on multimodality? How might the field benefit from comparative study of (a) the process by which multimodal interactions are remediated as a monomodal document, as with transcripts, and (b) the process by which monomodal texts are remediated as multimodal artifacts, as with cinematic adaptations of novels?
13. How might methods for analyzing multimodal texts be brought into a more complementary relationship with the creative production of multimodal texts (e.g., Web-based fiction)? How can we promote more dialogue between analysts and artists working on or in multimodal textual environments?
14. How do scholars interested in multimodal digital narratives in particular navigate two sets of issues: on the one hand, technical and engineering issues having to do with the design and implementation of digital texts; on the other hand, narratological and hermeneutic issues having to do with the analysis and interpretation of those digital texts to which producers and interpreters orient as narratives?
15. What is the relationship between interactivity and multimodality? For example, what are the experiential or phenomenological differences between a system capable of generating specific sounds when a user intersects with particular locations on a “sound map” and noninteractive multimodal representations that feature a pre-scripted audio track, whether in TV, film, or cut scenes in video games?
16. How might computers be used to search for, retrieve, and analyze the underlying structure of multimodal documents or artifacts, including films? How might the development of new applications for this purpose be interdependent with research on the nature and scope of narrative itself, e.g., if a user wanted to search for scenes of a particular sort in a large database of films?
17. What is the optimal way to integrate multimodality into teaching practices, whether in the humanities or other fields? To what pedagogical goals are spoken versus written versus visual communication best suited? In what ways does technology enable hybrid modes (e.g., online discussion fora and wikis), and what pedagogical gains might accrue to such hybridity?
18. Is there a mismatch between older concepts of story and new means for storytelling? Are multimodal digital texts amenable to analysis via previous understandings of narrative, or do the structure and experience of such works challenge those prior understandings? What are the possibilities and limitations of received ideas about narrative (ideas developed in conjunction with corpora of nondigital texts) when they are brought to bear on digital narratives? Or does this new corpus of works require new theoretical tools as well? In short, has narrativity itself perhaps assumed new forms in the digital age?
NOTES
1. The composition history of this “Coda/Prelude,” so named because it is intended to weave together some of the ideas proposed elsewhere in the volume while also opening up avenues for further inquiry, is in itself richly multimodal, and also highly collaborative. Over the course of the two-day symposium on Narrative and Multimodality: Language, Theory, Contexts, convened by Ruth Page at Birmingham City University in April of 2007, David Herman developed an initial draft of the questions while preparing some remarks for a final roundtable discussion held at the end of that event. The questions were framed in response to the stimulating presentations given by speakers, as well as the lively exchanges among speakers and members of the audience during the discussion periods held after individual talks. The questions were further refined in response to comments made by participants during the closing roundtable session and revised still further while Herman worked to complete his chapter for this volume. Page composed the framing paragraphs that precede the questions, and Herman re-revised the questions once more after reading and responding to her productive framing remarks.
Contributors
Christy Dena researches changes to narrative, game, and media in mass entertainment, independent arts, and gaming. She is currently at the School of Letters, Art and Media, University of Sydney. She has published in Convergence Journal and Electronic Book Review, and has given keynotes and presentations and taught in many countries. Her Web site is www.ChristyDena.com.
Fiona J. Doloughan is a Lecturer in English at the University of Surrey where she teaches Creative Writing. She has a dual background in Comparative Literature and Applied Linguistics with research interests in the areas of contemporary narrative/s; text and image; and concepts of creativity. Specifically, her interests revolve around writing across cultures; cultural translation; adaptation and transformation; and intermediality. Most recently she has written on notions of cultural translation in Monica Ali’s Brick Lane and Xiaolu Guo’s A Concise Chinese-English Dictionary for Lovers.
Astrid Ensslin is Lecturer in Digital Communication at the National Institute for Excellence in the Creative Industries, Bangor University. Her main research interests are in the areas of digital literature, discourse analysis, and language in the (new) media. She is founding editor of Journal of Gaming and Virtual Worlds and the MHRA Working Papers in the Humanities and has guest-edited (with Alice Bell) the 2007 issue of dichtung digital. Her further publications include Canonizing Hypertext: Explorations and Constructions (Continuum 2007), Language in the Media: Representations, Identities, Ideologies (coedited with Sally Johnson, Continuum 2007), as well as articles in Language and Literature, Corpora, Journal of Literature and Aesthetics, Gender and Language, and Sprache und Datenverarbeitung. She is Co-Investigator of the Leverhulme-funded Digital Fiction International Network (DFIN).
Alison Gibbons is a lecturer in Stylistics at the University of Nottingham, UK, where she teaches courses in cognitive poetics, literary linguistics, and narratology. Her doctoral work (2008), entitled Towards a Multimodal Cognitive Poetics, presents a cognitive approach to postmillennial multimodal literature and was undertaken at the University of Sheffield.
Sarah Hatton, Melissa McGurgan, and Xiang-Jun Wang were collaborators during 2006 on the Arts, Media, and Engineering (AME) program’s Situated Multimedia Arts Learning Lab (SMALLab) mixed-reality project at Arizona State University. Collaborating as an interdisciplinary team with diverse backgrounds and skills, the three worked on developing multimodal environments for the SMALLab that explored various aspects of sound and narrative. Hatton will finish her Master of Fine Arts (MFA) degree in digital technology with an AME concentration in May of 2009 and will move on to work in the games industry, with a casual gaming focus. McGurgan finished her MFA in printmaking in spring of 2008. She currently continues to make and show work in the Phoenix area as well as nationally. McGurgan also manages ASU arts outreach programs and runs her own graphic design company. Wang’s computer science research at AME focused on multimedia within social platforms. He is currently working as a software engineer in a social networking company building social platforms for professionals and consumers.
David Herman, who cofounded the Project Narrative initiative at Ohio State University and served as its inaugural director, teaches in OSU’s English Department. He focuses on linguistic and cognitive approaches to narratives of all sorts, from stories exchanged in face-to-face communicative interaction and graphic narratives, to written nonfictional accounts and innovative modern and postmodern literary texts. The author, editor, or coeditor of nine books in the field, Herman also serves as editor of the Frontiers of Narrative book series and of the new journal Storyworlds, both published by the University of Nebraska Press. He was recently awarded a research fellowship from the American Council of Learned Societies for his 2009 project on “Storytelling and the Sciences of Mind.”
Linda Hutcheon holds the rank of University Professor of English and Comparative Literature, and Michael Hutcheon is Professor of Medicine and Deputy Physician in Chief for Education at the Toronto Health Network, both at the University of Toronto. Linda is the author of nine books on contemporary cultural theory, most recently A Theory of Adaptation (2006). The two of them have worked collaboratively and across their very different disciplines on the intersection of medical and cultural history, using opera as their vehicle of choice. They have published a number of articles and three books so far: Opera: Desire, Disease, Death (1996); Bodily Charm: Living Opera (2000); Opera: The Art of Dying (2004). They are currently studying creativity and aging through the late style and later lives of nineteenth- and twentieth-century opera composers.
Jessica Laccetti is Research Fellow at the Institute of Creative Technologies, De Montfort University. Jessica’s research interests focus on born digital narratives, including theory (multimodal, narratological) and practice. Her doctoral thesis examines new media narratives alongside a feminist theory of multimodality. For her current project, Jessica is researching transdisciplinarity in academia, which will help her found a new journal dedicated to creative technologies and transdisciplinary theories and methodologies. Other interests include new media pedagogies, neuroliteracy, postmodern fiction, Web 2.0 applications, and social epistemology. Jessica can be contacted via her Web site: http://www.jesslaccetti.co.uk.
Rocío Montoro is currently a Lecturer at the University of Granada, Spain; she has formerly lectured at the universities of Huddersfield, Sheffield, and Nottingham. Her research interests lie in the interface of language, literature, and their multimodal realisations. She has done work on the way in which frameworks of analysis that underscore the cognitive aspects of language can illuminate multimodal realizations of narratives, especially in cinematic form. Her other major line of work is on the analysis of the narrative genre known as Chick Lit. Her forthcoming projects are entitled The Stylistics of Chick Lit: An Analysis of Cappuccino Fiction and Key Terms in Stylistics.
Nina Nørgaard, PhD, holds the rank of Associate Professor of Applied Linguistics at the Institute of Language and Communication, University of Southern Denmark. She has published various articles on stylistics and multimodality as well as the monograph Systemic Functional Linguistics and Literary Analysis. A Hallidayan Approach to Joyce—A Joycean Approach to Halliday (Odense: University Press of Southern Denmark, 2003). She is currently working on a monograph on Multimodal Stylistics.
Ruth Page is a Reader in the School of English at Birmingham City University. She has published numerous articles and essays that bring together the interests of language and gender studies and narrative theory along with her monograph, Literary and Linguistic Approaches to Feminist Narratology (Palgrave, 2006). With Bronwen Thomas she is coediting New Narratives: Stories and Storytelling in the Digital Age (University of Nebraska Press) and is currently working on the story genres found in the everyday writing that uses Web 2.0 technologies.
Dr. Andrew Salway’s research focuses on the computer-based analysis of multimedia documents, with an emphasis on their narrative and multimodal properties; he is interested in investigating the extent to which these properties manifest in formal features that are amenable to computation. He enjoys collaborating with colleagues from a wide range of academic disciplines and industries. He has written over thirty papers and given international keynote lectures, invited talks, and seminars on multimedia computing, new media, multimodal semiotics, narratology, information studies, corpus linguistics, and audiovisual translation. From 2000 to 2006 he was a Lecturer in the Department of Computing, University of Surrey, UK. Since then he has been working to develop Burton Bradstock Research Labs through research consultancies, collaborative projects, and visiting academic positions.
Bronwen Thomas is Senior Lecturer in Linguistics and Literature in the Media School at Bournemouth University. She is coeditor with Ruth Page of New Narratives: Stories and Storytelling in the Digital Age (to be published by the University of Nebraska Press). She has previously published work on hypertext fiction and online fanfiction, as well as the film adaptation of Michael Ondaatje’s The English Patient.
Michael Toolan is a Professor of English Language in the Department of English, at the University of Birmingham, UK, where he teaches courses in Stylistics, Narrative Analysis, Linguistic Theory, and Language and the Law. Since 2002 he has been editor of the Journal of Literary Semantics. His publications most directly relevant to his contribution to this volume include books on narrative and on integrational linguistic theory: Narrative: A Critical Linguistic Introduction, 2nd ed. (London: Routledge, 2001); Total Speech: An Integrational Linguistic Approach to Language (Durham and London: Duke University Press, 1996); and Narrative Progression in the Short Story: A Corpus Stylistic Approach (Basingstoke: Palgrave, 2008). Work in progress includes a book provisionally titled Immersion and Emotion: How Textual Patternings Shape our Experiencing of Literary Narrative, which extends further a corpus linguistic approach to literary narratives, and will combine this with testing and questioning of readers.
Index
A Aarseth, Espen, 157 accent, 203 actantial structure, 84 action sequences, 84 adaptation theory, 147–148 agency of participants, 210, 211 allographic art, 68 alternate reality games, 191, 193 alternative universes, 150 American Psycho (1991), 36 anti-remediation, 147 art history, 14, 15 art, story of, 28 audience, 148, 149 audience reception, 66, 68 audio description, 53, 54 auditory mapping, 110 auditory mode, 82 authorial control, 127, 134, 137, 140, 146 author-reader relationship, 148 autographic art, 68
B Barenboim, Daniel, 73 Barthes, Roland, 57, 84 Bechdel, Alison, 83 Behavioral process verbs, 101 blending, 105 born digital fiction, 166 British National Corpus, 54
C camera shots, 22, 25, 40 camera shot, close up, 40, 161 character, 83–84, 150, 152, 207, 210, 215; mental state of, 56; speech of, 151; voice of, 215 Citizen Kane, 128 Clowes, Daniel, 83 Cognitive Narrative Analysis, 2, 95, 99 cognitive stylistics, 156 cohesion, visual-verbal, 121 collocation, 54 color, 38, 86, 87, 105, 115, 116, 117, 119 comics, 78 compositional meaning, 121 Conceptual Integration Theory. See blending, 105 Conceptual Metaphor Theory, 34, 105 conceptual metaphor, 36, 104 context, 5, 17, 87; of culture, 5; of situation, 29; of reception, 74 context plane, 16 conversation, 209, 211 corpus stylistics, 50 coup de theatre, 73 crime mystery, 161 critical narratology, 2 cross-sited narrative, 190 cyberculture, 155
D David, Jacques-Louis, 15, 19, 21, 27; The Death of Marat, 16, 26 deictic: blends, 93; gesture, 83, 88; shift, 101; reference, 102; verbal, 83 delivery medium, 188 design, 66, 67, 68; modes, 65 dialogue events, 56 diegetic levels, 87 diegetic noises, 70 digital humanities, 50 digital technology, 50, 127 digital writing, 133, 134, 136 Dingsymbol, 161 directionality, 41 discourse colony, 134 distributed narrative, 185, 190 distribution, 119, 142, 186
E Embodied performance, 128 embodiment, 102, 104, 107, 111, 130, 140, 149, 158 enactor, 87 expressive media, 79
F Face detection techniques, 59, 60 face-to-face narration, 78 fan community, 148 fanfiction, 142 feminist theory, 166 fictional universes, 142 fictional world, 138 film, 14, 50, 53, 131; language of, 20; adaptation, 31, 35; dialogue, 56; scripts, 54 first person narration, 145, 151 focalization. See narrative perspective, 145 Foer, Jonathan Safran, 116 font type, 58, 173 Forceville, Charles, 34 Fowler, Roger, 31, 32 framing boxes, 86 framing, 194–195
G Game mode, 184, 189–190, 194 gaze vectors, 41, 42 gaze, 174–175 Genette, Gerard, 167 Gesamtkunstwerk, 26, 65, 66, 183 gesture, 69; definition of, 87; McNeill’s account of, 88; research 78; space, 89; deictic, 83, 88, 89; pointing, 90 Given and New, 122 graphic novels, 78 Greimas, Algirdas, 84 grief, 134
H Haptic mode, 10, 111, 163, 170, 178, 186; of reading, 169 Herman, David, 16, 18, 61, 196 historic present, 25 human body 7, 10, 127, 139, 155, 202 hyperfiction, 163 hypertext multimodal narratives, 135, 140 hypertext, 170
I Image border, 58 image processing, 52 image resolution, 175 image-text relation, 57, 59, 60 immersion, 138, 202, 205, 210 implied reader, 99 information gap, 158 inscription, 17 intentionality, 155 interactional space, 131 Interactive Narrative, 135, 202; history of, 204 Interactive time, 168 interactive vectors, 41 interactivity, 135, 138, 140 interanimation, 108 Interface Time, 171 intermediality, 3 interpersonal meaning, 121, 123 inter-semiosis, 35 Invariance hypothesis, 108
J Johnson, Mark, 36 Joyce, James, The Dead, 128, 129
K kinesics, 40 Kristeva, Julia, 174 Kupfer, Harry, 69, 72
L La Traviata, 109 Labov, William, 2 Lakoff and Johnson, 34 layout, 119–120 leitmotiv, 72, 65 Levine, James, 72 Levine, Michael, 75 lexical items, 59, 86 libretto, 69 linguistics, 3, 4 literacy, haptic, 175; multimedia, 184; multimodal, 142; visual, 15, 171–172 literary language, 25 literature, 136 live staging, 74 local grammar, 55
location, 188 logico-semantic relations, 58, 59 Lost, 144–145
M Malinowski, 5 meaning compression principle, 162 media, 195 media-blindness, 3 medium—definition of, 6, 79 megametaphor, 108 metaphor, 33; conceptual, 33; multimodal 35; realized in films, 37; semiotic, 33 mind relevance, 95 mind style, 31 mise-en-scène, 68, 69 modality, 118, 119, 120 mode, definition of, 6, 79 mode-blindness, 3 monomodal narratives, 129, 130, 131 monomodality, 4, 81, 116, 193 moving image, 53 multimedia, definition of, 185; texts, 144, 145 multimodal concordancer, 51 multimodality, definition of, 4, 183–184 Mulvey, Laura, 171, 175 music, 25, 44, 45, 70, 203 musical narration, 65 Mussolini, 74
N Narrative, definition of, 80; art, 127, 135, 138, 139; colonialism, 193; constraints, 136, 138, 141; narrative frequency, 177; imperialism, 6 narrative perspective, 120 narrative pragmatics, 18 narrative time, 145–146, 167 narrativity, 192, 196 narratology, 196 narratorial voice, 169 network culture, 143 neuropsychology, 196 neuroscience, 100 New London Group, 4 news story, 58, 60 novel, 35 novelisation, 148
O Oldton, 131–134 orthography, 86, 151 overlexicalization, 44
P Page layout, 57, 59 paratexts, 74 part-of-speech tagging, 58 passive reception, 66 performance, 17, 127 physio-cybertext, 155, 163 pictorial processing, 101 pixel, 52 place, 88, 94 plot summaries, 56 polymorphic fictions, 183 present tense, 172 production, 9, 66, 67, 68, 70, 73, 119; media, 65; format, 143 provenance, 69–70 psychological dissociation, 43, 44 Pullinger, Kate, 155; The Breathing Wall, 159–160
Q Quest, 197 quotability, 129
R Reader, 102, 108, 143, 158; ideal, 162; implied, 99; situation of, 160 reader-response theory, 156 reader-text relationships, 157 reality effect, 25 receiver audience, 68 reception, 9 reference worlds, nonfactual, 81 Regietheater, 66 remediation, 5, 80, 144 restaurant narratives, 52 review culture, 146 role animator, 149
S Schama, Simon, 15, 19, 20, 26, 27, 29 Schenk, Otto, 69 scientific convention, 107; quotation, 109, 110 screen idols, 149 screen, 8, 131, 143 Searle, John, 156 sensory modes, 7, 100 September 11 terror attacks, 122 sociolinguistics, 2, 5, 88 sound quality, 203
sound resources, 34, 38, 42, 120, 202, 205 sound track, 40, 53, 174 space, 8, 18, 73, 88, 93, 120, 188, 202, 205; size of, 208 speech, 150–152 Speech Act Theory, 156 speech and thought presentation, 151 speech balloons, 86, 87 speech tags, 152 status relations, 59 story logic, 16, 18, 28, 78, 83–84 storyworld, 61, 196 Strauss, Richard, 73 Structuralist narratology, 3 subjectivity, 167 surtitles, 73, 74 synesthesia, 111 Systemic Functional Linguistics, 33, 115, 121, 124
T technologies of production, 20 television, 28, 148 tempo, 170 text actual worlds, 81 text complexity, 59 text-world, 104 The Incredible Hulk, 83–84 The Matrix, 185, 187 The Texas Chainsaw Massacre (1974), 40 third person narration, 151 thought-action continuum, 56 Tomasula, Steve, 99, 100 transcription, 8, 9, 15, 20, 29, 51, 93, 138, 139 transfictionality, 191
transliteracy, 187 transmedia storytelling, 142, 150, 185, 190, 191 transmedial narratology, 95, 99 transtextuality, 186 traversal culture, 147 typography, 99, 102, 103, 115, 151; grammar of, 116; meaning of, 118
U ubiquitous gaming, 185
V verbal resources, 3, 4, 26 verbs, 174 visual culture, 14, 26, 100 visual grammar, 70, 118, 121–122 visual literacy, 15 visual resources, 17, 34, 67, 82, 115, 120 visualization techniques, 51 voice quality, 70, 213 voice, 71 voice-over, 40
W Wagner, Richard, 65; Der Ring des Nibelungen, 66; Die Walküre, 67; Götterdämmerung, 72 Wechsler, Gil, 72 Wernicke, Herbert, 69, 73 word-image combinations, 78 World Wide Web, 57, 142 wreader, 163 Wright, Tim, 133, 134 writing, 124, 138; importance of, 145; materiality of, 143; processing of, 101