Academic and Professional Discourse Genres in Spanish
Studies in Corpus Linguistics (SCL) SCL focuses on the use of corpora throughout language study, the development of a quantitative approach to linguistics, the design and use of new tools for processing language texts, and the theoretical implications of a data-rich discipline.
General Editor: Elena Tognini-Bonelli, The Tuscan Word Center / The University of Siena
Consulting Editor: Wolfgang Teubert, University of Birmingham
Advisory Board
Michael Barlow, University of Auckland
Douglas Biber, Northern Arizona University
Marina Bondi, University of Modena and Reggio Emilia
Christopher S. Butler, University of Wales, Swansea
Sylviane Granger, University of Louvain
M.A.K. Halliday, University of Sydney
Susan Hunston, University of Birmingham
Stig Johansson, University of Oslo
Graeme Kennedy, Victoria University of Wellington
Geoffrey N. Leech, University of Lancaster
Anna Mauranen, University of Helsinki
Ute Römer, University of Michigan
Michaela Mahlberg, University of Nottingham
Jan Svartvik, University of Lund
John M. Swales, University of Michigan
Yang Huizhong, Jiao Tong University, Shanghai
Volume 40 Academic and Professional Discourse Genres in Spanish Edited by Giovanni Parodi
Edited by Giovanni Parodi, Pontificia Universidad Católica de Valparaíso
John Benjamins Publishing Company Amsterdam / Philadelphia
The paper used in this publication meets the minimum requirements of American National Standard for Information Sciences – Permanence of Paper for Printed Library Materials, ANSI Z39.48-1984.
Cover design: Françoise Berserik Cover illustration from original painting Random Order by Lorenzo Pezzatini, Florence, 1996.
Library of Congress Cataloging-in-Publication Data
Academic and professional discourse genres in Spanish / edited by Giovanni Parodi. p. cm. (Studies in Corpus Linguistics, ISSN 1388-0373 ; v. 40) Includes bibliographical references and index. 1. Spanish language--Discourse analysis. 2. Spanish language--Spoken Spanish. 3. Spanish language--Variation. 4. Discourse markers. I. Parodi, Giovanni. II. Title. III. Series. PC4434.A23 2010 460.1’41--dc22 2010004884
ISBN 978 90 272 2314 2 (Hb ; alk. paper)
ISBN 978 90 272 8825 7 (Eb)
© 2010 – John Benjamins B.V.
No part of this book may be reproduced in any form, by print, photoprint, microfilm, or any other means, without written permission from the publisher.
John Benjamins Publishing Co. · P.O. Box 36224 · 1020 ME Amsterdam · The Netherlands
John Benjamins North America · P.O. Box 27519 · Philadelphia PA 19118-0519 · USA
To our beloved and encouraging girls and boy: Karime, Mailén, Carolina and Diego
Table of contents
Foreword
About the authors
Introduction
Acknowledgments
chapter 1. Discourse genres, academic and professional discourses: The book and its contents (Giovanni Parodi)
chapter 2. Written discourse genres: Towards an integral conception from a sociocognitive perspective (Giovanni Parodi)
chapter 3. Discourse genres in the PUCV-2006 Academic and Professional Corpus of Spanish: Criteria, definitions, and examples (Giovanni Parodi, Romualdo Ibáñez and René Venegas)
chapter 4. Academic and professional genres: Variations across disciplines (Giovanni Parodi)
chapter 5. University academic genres: A miscellaneous discourse (Giovanni Parodi)
chapter 6. Multi-dimensional analysis of an academic corpus in Spanish (René Venegas)
chapter 7. Automatic text classification of disciplinary texts (René Venegas)
chapter 8. Rhetorical organisation of Textbooks: A “colony-in-loops”? (Giovanni Parodi)
chapter 9. The Textbook genre and its rhetorical organisation in four scientific disciplines: Between abstraction and concreteness (Giovanni Parodi)
chapter 10. The Disciplinary Text genre as a means for accessing disciplinary knowledge: A study from genre analysis perspective (Romualdo Ibáñez)
chapter 11. Academic discourse comprehension in Spanish and English: Accessing disciplinary domains (Romualdo Ibáñez)
chapter 12. Corollary: A critical synopsis of this book and some prospects for future challenges (Giovanni Parodi)
References
Index
Foreword
Over the past 20 years, there has been an explosion of corpus-based studies of discourse, as evidenced by the previous books in this series. However, most of those studies have focused on the analysis of English-language corpora, and this is especially true of the studies that have been published in English. As a result, we currently know surprisingly little about the corpus-based study of linguistic variation in languages other than English.

The present book takes a major step towards filling this gap in previous research, providing detailed corpus-based descriptions of genre variation in Spanish, one of the most widely-spoken languages of the world. The descriptions focus on written discourse in four major disciplines: psychology, social work, industrial chemistry, and construction engineering. However, unlike most previous studies, the research here further distinguishes between academic and professional discourse, and explicitly compares the linguistic patterns of use between the two domains.

Although the book is technically an edited collection, it is much more coherent than most edited books and reads much more like a single-authored treatment of this topic. This is because the book represents the coordinated efforts of the editor (Giovanni Parodi) and two colleagues on his research team at Pontificia Universidad Católica de Valparaíso (Romualdo Ibáñez and René Venegas). Thus, all chapters focus on analysis of the same corpus with the same general research goals: to provide a detailed description of academic and professional research genres in Spanish.

Given that there has been comparatively little previous corpus-based research on Spanish, it might be expected that the standard of research for these studies would not be as rigorous as in recent corpus-based studies of English. But this is far from true. In fact, the research reported here is a model for future corpus-based studies of any language. For example, the PUCV-2006 corpus of academic genres is perhaps the largest and most carefully designed corpus of academic writing: for any language, not just for Spanish! (In addition, the corpus is freely available for other researchers, in contrast to most English-language corpora.) The standard of analysis similarly reflects the current state-of-the-art, applying sophisticated computational techniques coupled with careful qualitative analyses.
In sum, Academic and Professional Discourse Genres in Spanish will clearly be of interest to all scholars interested in Spanish-language research. But its contributions go well beyond that audience; it will be of interest to any scholar interested in the study of discourse and language use, as a model for how corpus-based techniques can be applied to this research domain.

October, 2009
Douglas Biber
Northern Arizona University, USA
About the authors
GIOVANNI PARODI is presently Head of the Postgraduate School of Linguistics at Pontificia Universidad Católica de Valparaíso, Chile, and Editor of Revista Signos. Estudios de Lingüística. He obtained an M.A. in Applied Linguistics and later received his Ph.D. in Linguistics. His major fields of interest are text linguistics, discourse psycholinguistics (reading comprehension and written production processes), and corpus linguistics. Currently he is conducting research in specialized academic/professional written discourse, press media discourse analysis, and computational resources through three grants funded by major Chilean research foundations and international programs, such as ECOS and UNESCO UNITWIN Chairs. His publications include articles in Spanish and English journals and several books published by EUDEBA (2005, 2007) and EUVSA (1999, 2002, 2003, 2005, 2008). In 2007, he edited the book Working with Spanish Corpora, published by Continuum. Forthcoming publications are: Lingüística de Corpus: de la teoría a la empiria (Iberoamericana, 2010), Saber Leer (Aguilar, 2010), and Alfabetización académica y profesional: perspectivas contemporáneas (Planeta, 2010). In recognition of his scientific merits, he was appointed a Member of the Academia Chilena de la Lengua in 2008.

ROMUALDO IBÁÑEZ has a Bachelor’s degree in Linguistics and Literature in English and a Master’s degree in Applied Linguistics. He received his Ph.D. in Linguistics in 2004 from Pontificia Universidad Católica de Valparaíso (PUCV), Chile. His main research areas are discourse comprehension processes, psycholinguistics, and genre theory. He has participated in several research projects as collaborator, with special interest in reading strategies in L1 and L2. He has published several articles in journals and specialised books. He is an associate professor in the graduate programmes at the Institute of Literature and Language Sciences at PUCV. He is currently carrying out research on several projects as principal researcher and as collaborator.

RENÉ VENEGAS is professor at the Pontificia Universidad Católica de Valparaíso, Chile. He has a Ph.D. in Linguistics. He is currently the Publishing Editor of Revista Signos. Estudios de Lingüística. He teaches linguistics and semantics to undergraduate and graduate students. His research interests are academic discourse, the study of meaning with computer tools, the development of computer tools for text analysis, and oral argumentation. Dr. Venegas is a member of the Chilean Linguistic Society (SOCHIL) and the Latin American Discourse Analysis Association (ALED).
Introduction
As connoisseurs of the Edited Book genre, we should remember that the Introduction is normally written towards the end of the writing process, after all required revisions and adjustments; that is to say, once the work has been finished and all the chapters have been compiled into their final structure. Of course, the aesthetic pleasure of printing the entire final text and holding the finished volume in one’s hands is very much like that of an artist gazing at a painting or that of a chef looking at a dish just out of the oven and about to be served. The work is finally finished! However, despite this incredible feeling, we are left with a certain emptiness, disorientation and lack of purpose. The task that engaged us for months and even for years now leaves us with an empty space. This is undoubtedly a good thing, in that it allows us to think about all the efforts put forth, the joint teamwork and our achievements. It also gives us the chance to look back on knowledge construction pathways and the many lessons learned. We can thus realise how little we knew at the outset of the challenge and identify the small and possibly giant advances made. This process leads us to appreciate group discussion and the long, drawn-out technical study sessions that sometimes became intensive and even heated. This process of self-reflection helps us to discover the magic of teamwork. So much has been written and rewritten on this topic, but the incredible relevance of the social construction of knowledge and of situated and collaborative learning must be experienced and understood in order to come alive.

René and Romualdo have gradually become my colleagues and friends. In some cases, they went from being my undergraduate students to full-fledged students in the Ph.D. in Linguistics programme at Pontificia Universidad Católica de Valparaíso, Chile.
As for me, I have gone from being the professor for several of their graduate courses to thesis director for each of these men, who have been my closest collaborators and scientific colleagues to date. I am at a loss to describe the different sensations that come together right now, upon finishing this book as a joint effort with people who have built their academic, professional and personal lives around linguistics. Together with our dreams of scientific advancement in our disciplines, we also grow as people and become our own architects. One word that comes to mind is pride, but healthy pride. I am proud to see René and Romualdo grow as human beings and as scientists sharing this stand with me today, a stand granted to us by the scientific writing encapsulated in this book. I am also especially grateful that we have proved ourselves worthy of this challenge to further corpus linguistics, this time from the rich perspective of discourse genre.

We face a complex construct in this volume and have dared to define and classify genres from sociocognitive and linguistic viewpoints, coining a perspective we consider to be novel, with some outstanding features. Likewise, we are no less proud to have collected one of the largest academic and professional corpora on record in recent times. Our pride is due to its idiosyncratic features, size, situationality and ecology. It is also due to its high degree of specialised subject matter in four scientific disciplines, its on-line availability and the free-of-charge access to texts with morphosyntactic tagging and parsing at our website El Grial (www.elgrial.cl).

Academic and Professional Discourse Genres in Spanish is a title we discussed at great length. We believe the name captures the essence of this contribution. It reveals the focus of our current concerns and our scientific interests: getting to know the discourse practices of students in scientific training, both in their university settings and in their eventual workplaces; looking for the discourse keys leading to the construction of knowledge, cognitive processing and use in everyday life; searching for lexicogrammatical features of specialised texts across different knowledge domains; allowing disciplinarity to emerge as a relevant feature in linguistic use; and approaching similarities and differences in texts from different areas. These and many others are issues that we have attempted to encapsulate in the title of this book, in which we are certainly only scratching the surface of a fascinating world.
The twelve chapters that comprise this volume have been conceived as a single unit, by means of which we have attempted to progressively approach the object under study from several viewpoints. This complementarity constitutes a unique feature of this book, in that all chapters approach and provide a joint perspective on the PUCV-2006 Academic and Professional Corpus of Spanish. We have placed special emphasis on declaring the theoretical principles that frame the subsequent, more empirical studies. This theoretical positioning reveals some of the ontological and epistemological principles of the members of Escuela Lingüística de Valparaíso (www.linguistica.cl) and aims to be a determined contribution to contemporary discussion of discourse genres. Chapter 1 is the central nucleus of these reflections, although these contributions extend throughout each and every one of the other chapters.

Academic and Professional Discourse Genres in Spanish is already out of our hands and has taken on its own life. We sincerely hope that it will be a contribution to the cumulative advancement of science and a mechanism for replication in future research projects. Of course, there will be a lot to improve and revise, and many gaps will remain. We are open to constructive criticism. However, these same possible mistakes will undoubtedly contribute to learning and enrichment for ourselves and for the future researchers of the 21st century. I wish you the best of luck on this adventure.

November, 2009
Giovanni Parodi, Editor
Valparaíso, Chile
Acknowledgments
As is well known, and this book is definite evidence, writing in general (and academic writing maybe even more) is a demanding collective process that brings together several hands and voices. Intertextuality is a prototypic feature and it certainly has been embodied in this book. Although we, the authors, and I, the editor, are obviously responsible for what has been written in each and all the chapters, this book is the end result of a huge network of readers, writers, assistants, field assistants and technical personnel. It is thus fair that we highlight some names and sincerely thank Gilda Bastías, Alejandro Córdova, Leonardo Zamora, Natalia Silva, Astrid Órdenes, Gina Burdiles, Mailén Parodi and Karime Parodi for their essential support. In addition, Roberto Parra as statistical assistant has never failed to contribute innovative facets to support us in specialised calculations, and has always been more than willing to make mathematical sciences easier to understand. Our research assistant Pablo Malverde merits honorary mention. His efforts, tenacity and perseverance never failed and he showed no signs of fatigue. Outstanding motivation, a keen eye and a special creative attitude are the hallmarks of his future as a professional editor. His contributions can be seen in several different touches, innovative designs, graphs and tables throughout this volume. Angela Gerhart, Edith Laupichler, James Conaghan, Maria Inés Zilleruelo, Alan Steel, Todd Gibson, Alan Cahoon and Millaray Salas were fundamental when it came to drawing up a text with the right texture and coherence in academic English. Todd took on the difficult chore of helping to review content aspects and academic writing in English. His was undoubtedly a painstaking and often not sufficiently appreciated task. But Todd had the courage and passion to never let his guard down or cease from his labours. 
His contributions are sprinkled throughout the entire book and have helped to make it a little more legible and understandable in academic English. Our interesting discussions with Todd have kindled several scientific hypotheses as to the differences between academic writing in Spanish and in English; not just in terms of different lexicogrammatical possibilities, but also regarding communicative purposes, scientific perspectives, ways of communicating science, and the hard process of learning how to mean in Spanish and English.

Our friends J. R. Martin, Max Louwerse, Vijay Bhatia and Phil MacCarthy were happy to read and comment on some of these chapters. Their contributions are of great worth to us. We wish to stress that they are not in any way responsible for the final contents published herein. Last but not least, we warmly thank Doug Biber for taking the time to read the entire book and contributing a splendid Foreword, which will certainly enrich this volume. Thank you all!
Romualdo Ibáñez René Venegas Giovanni Parodi
chapter 1
Discourse genres, academic and professional discourses: The book and its contents
Giovanni Parodi
This introductory chapter describes the aim of the book and explains how each chapter contributes to this aim, so that the chapters do not stand alone as isolated or disconnected pieces. In addition, background information is provided about the production process of this volume and how all of the pieces came to be part of this unit. The book focuses on the collection, in-depth description, and quantitative and qualitative analysis of one of the largest available on-line corpora of academic and professional disciplinary written discourse in Spanish across four disciplines (59 million words).
Introducing Academic and Professional Genres: Corpus evidence on Spanish

The issues of specialised discourse and disciplinarity have become increasingly salient. Questions as to how differences in the structure of intellectual scientific fields and educational curricula help to shape academic and professional experience are the focus of several studies across a variety of disciplines, applying a wide range of approaches. In addition, questions as to how cognitive, social and linguistic dimensions of discourse genres interact dynamically and help to develop human communication efficiently are the focus of much debate. At the same time, academic and professional written discourse has become a central and challenging research topic in recent years in a variety of disciplines, inasmuch as successful interaction through written discourse is recognized to be essential for achievement in the academic and professional worlds.

The increasing importance of genre variation across disciplines as an explanatory factor for diverse knowledge construction within discourse communities has been documented over the past decade. The perception that there is no core disciplinary discourse per se and that it is better to talk about disciplinary discourses in the plural is becoming more accepted among researchers. Empirical findings based on various approaches have documented the importance of corpus-based analysis as a way to advance and describe in detail the variation across disciplines and across genres.

Corpora of natural, annotated texts have had a considerable impact on linguistic analysis over the last two decades. More specifically, research into the English language, as well as certain European and Asian languages, has revealed that linguistic studies based on large corpora of digital texts do not always corroborate researchers’ initial intuitions. The use of computer-supported corpora, as well as the availability of computer programmes used to manage these, has boosted linguistic investigation in a way that was previously unthinkable.

The remainder of this initial chapter is divided into four sections. In the next section, I offer a brief review of Spanish and its growing importance in the world as a lingua franca. The following section describes my personal motivation behind producing this volume. The third section identifies the book’s goals and intended audience. Finally, I present a summary of the twelve chapters that make up the book.
Spanish as a lingua franca

Studies on the Spanish language from a corpus linguistics and genre analysis perspective are scant, not only in English-speaking countries but also in Spanish-speaking communities, even though there is growing interest in the area. Over the last decade, corpora have been compiled and software has been developed to meet the needs of researchers, but much research is still needed, particularly on languages other than English. Spanish is quickly becoming an international language, making the need for empirical studies on Spanish language use more urgent than ever.

The Spanish-speaking population is growing faster every day and the language is becoming more and more important. According to figures published by several institutions, over 600 million people speak Spanish around the world, and it is the official language of 22 countries. In the USA alone, there are more than 40 million Spanish speakers, and their numbers are increasing every day. In addition, Spanish is an official language of many international organizations (United Nations, GATT, among others) and is becoming increasingly popular in the world of art, tourism and business. However, there have been surprisingly few corpus-based studies conducted on Spanish to date. Instead, most studies of language variation and use in Spanish tend to focus on examples taken from a few original or made-up texts.
Why this book?

I have been doing research on the structure of the Spanish language for the last eight years, and have pursued this subject from a corpus linguistics approach for the last six years. I have also used multi-register and multi-dimensional analysis to describe Spanish corpora, particularly written and spoken varieties. During this research, guided by the principles of corpus linguistics, I have been fascinated by corpus analysis using computational tools and by collecting electronic corpora. At the same time, I have discovered that research conducted to describe Spanish has remained far from the mainstream of corpus linguistics, with few exceptions in Spain and Latin America. It is also surprising to see how few publications (if any) have devoted space to disseminating research conducted on Spanish using up-to-date methodologies. Therefore, one of my purposes with this book is to help to fill this gap by disseminating research conducted on Spanish but published in English. This is one way to approach the description of Spanish from a discourse and corpus linguistics perspective and to bring relevant findings together in a coherent book, disseminating this scientific knowledge for the advancement of linguistics and for the development of the Spanish language around the world.
What does this book have to offer?

This book mainly focuses on discourse genres emerging from corpora of specialised texts in Spanish employed in university and professional settings across disciplines. Indeed, the successful use of discourse genres in academic and professional worlds is perhaps one of the most important achievements of cognitive-and-socially mediated language interactions of human beings. With this in mind, the book focuses on the collection, in-depth description, and quantitative and qualitative analysis of one of the largest available on-line corpora of academic and professional disciplinary written discourse in Spanish across four disciplines. We also seek to identify all discourse genres in both settings involved (academic as well as professional), capitalizing on a complementary “bottom-up” and “top-down” methodology.

In order to accomplish these objectives, we collected a corpus based on four undergraduate university programmes: Psychology, Social Work, Industrial Chemistry, and Construction Engineering. The academic corpus comprises all the reading materials that students are given during each five-year programme. This corpus, amounting to approximately 59 million words, has been classified into different discourse genres. Complementarily, we also collected another corpus on the same four disciplines and science domains, but in the corresponding professional workplaces and institutional settings. Hence, we have contrastive corpora of the written genres that circulate in both academic and professional settings.

In this book, we help to bridge the aforementioned gap by organizing a collection of papers following some of the principles of corpus linguistics and genre analysis from the perspective of the Valparaíso School of Linguistics in a complementary manner. The twelve chapters that make up this book represent research on contemporary Spanish conducted at one leading university in Latin America: Pontificia Universidad Católica de Valparaíso, Chile. The chapters are organised sequentially, from the more theoretical to the more applied studies. Together they constitute a unit, so that the book reads as a single academic piece of writing. Until now, no book has focused on Spanish variation among diversified corpora, genres and registers in academic and professional domains, or described corpora of academic and professional texts across four disciplines in detail. To date, the PUCV-2006 Corpus of Academic and Professional Spanish is the largest available on-line tagged corpus, separated by genre, downloadable free of charge and accessible at www.elgrial.cl.

As noted, this book is made up of twelve chapters. These range from the theoretical guiding principles of research in terms of genre conception and academic and professional discourse, detailed descriptions of each corpus (academic and professional), computational analysis of different kinds and from different perspectives, qualitative analysis of two specialized genres describing the rhetorical moves (University Textbooks and Disciplinary Texts), to the design and administration of reading comprehension tests based on these genres for university students, in both Spanish and English.
Theoretically speaking, a multi-dimensional perspective (social, linguistic and cognitive) is emphasised, and special attention to the cognitive nature of discourse genres is defended and supported. Although some of the chapters differ from one another in terms of their specific methodologies, they all share the objective of describing the PUCV-2006 Corpus of Academic and Professional Spanish in detail from a corpus linguistics approach, and of offering a move analysis of genres never before analysed in such detail (a quantitative description is given of each occurring move and step in the corpus) or from such a large corpus. Taken together, these studies illustrate several alternative ways of approaching the study of Spanish through corpus research, showing not only quantitative approaches but also qualitative methods.

The innovative approach of this book is partly evidenced by (i) the multi-dimensional conception of discourse genres in a three-pronged, coordinated and interdisciplinary approach: cognitive, linguistic and social dimensions presented from a sociocognitive perspective; (ii) the collection and description of unique corpora of academic and professional discourse in four disciplines: Social Work, Psychology, Industrial Chemistry, and Construction Engineering; and (iii) the identification, analysis and comparison of the discourse genres circulating in academic settings and professional workplaces (9 genres identified in academic settings and 28 in the corresponding professional workplaces), presented with varying methodologies: quantitative and qualitative approaches. In addition, the description and availability of the 491 academic texts in an on-line tagged version (almost 59 million words) is an outstanding feature of this book. The computational analysis employed in the description and comparison of the PUCV-2006 Corpus of Academic and Professional Spanish is another interesting and relevant contribution of this book. The rhetorical organization of two genres is also presented (University Textbook and Disciplinary Text).

All findings are robust empirical data that contribute to advancing the description of academic genres, characterized as a mixed discourse. Disciplinarity emerges as a central issue in knowledge construction by means of written specialized discourse. Specific pedagogical and communication interactions through genres in academic and professional environments are detected and revealed as discourse tools that require reading and writing abilities to help students and professionals to develop language expertise.

This book’s main audience will be English-speaking linguists (of all nationalities), specifically, those interested in Spanish, and in corpus linguistics, move analysis, contrastive linguistics and contrastive rhetoric; undergraduate and postgraduate students who study the Spanish language and whose programmes focus on corpus linguistics studies and discourse analysis, contrastive linguistics and/or on contrastive rhetoric; English-speaking teachers of Spanish, grammarians and discourse researchers.
The book is also intended for linguists around the world interested in corpus linguistics and discourse genres, and academic and professional discourse. Scholars interested in university and professional literacy are also the target audience of this book. The secondary audiences of the book are linguists in all languages, language students, and teachers of Spanish, as well as language teachers in general. Strictly speaking, we wish to highlight that there is no book or special issue in any journal to which this book could be compared. In spite of the fact that corpora have been intrinsic parts of linguistic research in Spanish, on the one hand, neither corpus linguistics nor genre analysis approaches are widely used methodologies for the study of Spanish or research published in journals and books. On the other hand, although Spanish is a language with a number of speakers and readers that is growing exponentially every day and could be said to be the second most important language used as a lingua franca in the world, no initiatives for publishing a collection of research comparable to the one we offer here have been observed as yet.
Outlining the book
This Introductory Chapter provides an overview of this book, describing its objective and all of the chapters in order to explain how each of these contributes to the overall aim of the book, beyond their individual worth as pieces of linguistic research, but not standing alone as isolated and disconnected pieces. As mentioned beforehand, the main focus of this book is to further the description of Spanish with emphasis on its variation across different genres based on particular corpora. In Chapter 2, Giovanni Parodi opens the research contributions by outlining an interdisciplinary framework in which nuclear concepts are advanced from an integral conception of language and genres, all of this considering a sociocognitive perspective. As is well-known, multiple alternative discourse genre conceptions and classifications have been proposed, from both theoretical and empirical approaches. Over the last ten to fifteen years, there has been an important debate in linguistics and some other disciplines in the attempt to clarify this conception. However, the elusive and divergent conceptions underlying this term sometimes encounter extreme opposite viewpoints. In this opening chapter of this book, an interdisciplinary theoretical discussion is offered in order to reach a three-pronged multi-dimensional conception: cognitive, social and linguistic dimensions. Thus, accepting the social as well as the semiotic factors involved, we propose a theoretical framework and a definition with an emphasis on a cognitive perspective and on a subject’s active role in its construction. At the same time, a proposition for academic and professional genre construction in disciplinary domains is sketched. The identification and the classification of discourse genres have been one of the permanent concerns in linguistic studies.
Particularly, since genres as complex objects have become the focus of analysis, multi-dimensional approaches have been developed in order to capture their dynamic nature. In Chapter 3 of the book, written by Giovanni Parodi, Romualdo Ibáñez and René Venegas, we identified, defined, classified, and exemplified discourse genres emerging from the 491 texts (58,594,630 words) that make up the PUCV-2006 Academic and Professional Corpus of Spanish. In order to accomplish these objectives, we carried out a complementary methodology of a deductive and inductive nature. Six criteria were selected and employed in order to analyse all the texts in the corpus: Communicative Macro-purpose, Relations between the Participants, Discourse Organization Mode, Context of Production, Context of Circulation, and Modality. After analysing all the texts of the corpora, twenty-nine genres were identified and grouped according to the criteria mentioned above. Interesting clusterings emerged, reflecting cognitive, social, functional/communicative
and linguistic variations. The importance of using a group of different variables interacting in the identification of genres evidenced high potential and proved to be a powerful methodology. Chapters 4 and 5, both written by Giovanni Parodi, focus on the quantitative description of the largest available on-line corpus (almost 59 million words) of specialised written Spanish across four disciplines: Psychology, Social Work, Industrial Chemistry, and Construction Engineering. The corpora were collected at one Chilean university and from the corresponding professional settings. The corpora description shows that access to disciplinary knowledge is constructed, on the one hand, through a varying repertoire of nine written academic genres, which show varying occurrence across the disciplines under study. On the other hand, a richer group of twenty-eight professional genres is detected. Interesting variations across the four disciplines under study are also identified in professional workplaces. Psycholinguistic and educational implications are proposed in relation to knowledge acquisition, discourse genres and reading comprehension processes. The group of academic genres identified in undergraduate university genres is characterized as a miscellaneous discourse, while a distinction emerging between Social Sciences and Humanities texts and Basic Sciences and Engineering texts is described as abstraction and concreteness. Chapter 6 describes specialised multi-register and multi-genre corpora using diverse corpus methodologies, such as multidimensional analysis. In two complementary studies, René Venegas presents data based on the PUCV-2006 Corpus of Academic Spanish. Both studies employ the five dimensions (i.e. Contextual and Interactive Focus, Narrative Focus, Commitment Focus, Modalising Focus, and Informational Focus) identified by Parodi (2005a).
Each of these dimensions emerged from the functional interpretation of co-occurring lexicogrammatical features identified through multi-dimensional, multi-register and multi-genre analysis. The main assumption underlying these studies is that the dimensions determined by a previous multidimensional analysis can be used to characterise new corpora of university genres. In the first study, we calculate linguistic density across the five dimensions that provide a lexicogrammatical description of the nine academic genres that make up the corpora. In the second study, we compare the PUCV-2006 Corpus with four corpora from different registers (i.e. Latin American Literature Corpus, Oral Didactic Corpus, Research Articles Corpus and Public Policies Corpus). The findings confirm the specialized nature of the genres in the PUCV-2006 Corpus, in which both a strong lexicogrammatical compactness of meanings and an emphasis on regulating the degree to which certainty is manifested are strongly expressed.
The aim of the study reported in Chapter 7, also written by René Venegas, is to classify the academic texts included in the PUCV-2006 Corpus of Spanish, using and comparing two automatic classification methods. These methods are based on shared lexical-semantic content words present in a corpus of academic texts used in four undergraduate university programmes at the Pontificia Universidad Católica de Valparaíso, Chile. A sample of 216 texts from the PUCV-2006 Corpus was employed (30,886,081 words). This subcorpus was divided as follows: 26 texts from Construction Engineering, 31 from Industrial Chemistry, 64 from Social Work, and 95 from Psychology. The classification methods compared in this research are Multinomial Naïve Bayes and Support Vector Machine. Both helped to identify a small group of shared words that allowed René Venegas to classify a new text into the four disciplinary areas in accordance with statistical weights. The findings helped him to conclude that the Support Vector Machine classifies academic texts more efficiently. This method enables automatic identification of the disciplinary domain of an academic text (based on a reduced number of shared content lexemes) and delivers high performance. In Chapter 8, Rhetorical organization of Textbooks: A “colony-in-loops”?, Giovanni Parodi identifies and describes the rhetorical organization of the Textbook genre, based on a part of the PUCV-2006 Corpus of Academic Spanish. This corpus includes a total of 126 textbooks collected from four disciplines: Social Work, Psychology, Construction Engineering and Industrial Chemistry. More specifically, Parodi identified and described the rhetorical moves of the Textbook genre. He also describes the communicative purposes of each of the moves and steps identified, providing examples from the four disciplines. A new macro-level of analysis is introduced and justified, which turned out to be essential for a better description of an extensive textual unit such as these texts. Parodi called it a macro-move.
At the end of the chapter, he answers the question appearing in the chapter’s title and explains why the Textbook should have a rhetorical organization, which he identified as a “colony-in-loops”. The following chapter continues and complements the previous one. In Chapter 9, Giovanni Parodi argues that it is hard to find research in which a combined report of the rhetorical organization of discourse genres and the respective quantifications of the frequency of occurrence of moves and steps are worked together across a group of diversified disciplines. It is even more difficult to find research that complementarily approaches the investigation from robust corpora, collected from ecological principles and based on the analysis of variations which genres reveal by means of diverse scientific disciplines. From this setting, this chapter aims to become a step towards helping to bridge this gap in Spanish. The Textbook genre, the determination of the frequency of occurrence of its rhetorical moves and steps, and the contrastive study of disciplinarity are the focuses of this study. More specifically, in this chapter Parodi concentrates on determining
the frequency of occurrence for the macro-moves, moves and steps identified in a corpus of 126 university textbooks, as part of the PUCV-2006 Corpus of Spanish. Parodi distinguishes the area of knowledge (Basic Sciences and Engineering and Social Sciences and Humanities) as well as the specific disciplines (Social Work, Psychology, Industrial Chemistry and Construction Engineering). Therefore, in this research, Parodi also aims to investigate possible variation patterns among the rhetorical moves and steps across the disciplines involved, so that discipline may emerge as a relevant factor. As a general finding, he concluded that there is an important difference between the texts in this genre across the four disciplines; and this is why he proposed that a distinction should be made between abstraction and concreteness. In Chapter 10, following the methodological steps proposed by Giovanni Parodi in the previous two chapters, Romualdo Ibáñez focuses on the move description of one of the most important genres identified in the PUCV-2006 Corpus of Spanish: the Disciplinary Text. In his study, Romualdo Ibáñez describes the Disciplinary Text, an academic genre that has emerged as one of the most frequent means of written communication in the PUCV-2006 Academic Corpus, in three different disciplinary domains (Social Work, Psychology, and Construction Engineering). In order to do so, he carried out a bottom-up/top-down analysis of the rhetorical organization of 270 texts. The results not only revealed the persuasive communicative purpose of the genre, but also its particular rhetorical organization, which is executed by means of three rhetorical macro-moves, eight moves and eighteen steps. In addition, quantitative results showed a relevant variation between Social Sciences and Humanities and Basic Sciences and Engineering in terms of the rhetorical steps and moves that execute the communicative purpose of the genre.
These results allow Romualdo Ibáñez to state that the genre under study steps beyond the boundaries of one discipline, but, at the same time, some of its central characteristics are affected by disciplinary variation. One of the main implications of this analysis is empirical evidence that the ways of negotiating knowledge through discourse vary depending on the discipline and area of study. Romualdo Ibáñez, in Chapter 11, in a study entitled Academic discourse comprehension in Spanish and English: Accessing disciplinary domains, focuses on the application of the processes of genre identification. He argues that, nowadays, it is essential for undergraduate university students to have psychodiscourse skills that allow them to successfully comprehend academic texts. However, as English has become a lingua franca, in countries where English is not the mother tongue, undergraduate university students must also develop strategies in order to deeply comprehend disciplinary texts written in English. Starting in the last decade, in Chile, this situation has become an issue of great concern for universities and
the government, particularly taking into account the number of texts written in English that tertiary level students have to face during their academic life. As a result, the issue of reading comprehension for academic texts written in English and in Spanish has become a focus of investigation for Chilean researchers, raising questions such as: how much do university students comprehend when facing an academic text? Is there any relation between levels of comprehension and the knowledge students have about the genre they have to read? Can reading skills be transferred to a process developed in a second language? Is disciplinary community integration a variable to be taken into account when explaining the levels of comprehension achieved by students? Approaching some of these questions, in this chapter Romualdo Ibáñez examined the process of comprehension carried out by a group of 112 undergraduate university students belonging to the Industrial Chemistry programme at Pontificia Universidad Católica de Valparaíso, Chile. Ibáñez focused on the level of comprehension these students achieved when facing academic texts written in English and in Spanish. As part of the findings, it is possible to observe a complex interaction among the variables involved, which shows the relation between certain psychodiscourse processes and the particular characteristics of the genre in question that the students had to read. In Chapter 12, the last section of the book, Giovanni Parodi provides a brief account of all findings. In addition, he offers a critical analysis of the theoretical framework and the proposed criteria for identifying genres, including a comparison using the empirical outcomes obtained in these chapters. Projections, challenges and future research niches are outlined.
chapter 2
Written discourse genres
Towards an integral conception from a sociocognitive perspective
Giovanni Parodi
The concept of discourse genre has been a focus of analysis and discussion in recent literature. Multiple alternative conceptions and classifications have been advanced, from both theoretical and empirical perspectives. During the last ten to fifteen years an important debate has taken place in linguistics. However, the elusive and divergent conceptions underlying this concept sometimes encounter extreme opposite viewpoints. This chapter offers an interdisciplinary theoretical approach in order to reach an integral conception. Thus, accepting the social and semiotic factors involved, we propose a theoretical framework and a definition with an emphasis on a cognitive perspective and on a subject’s active role in its construction. At the same time, a proposition for academic and professional genre construction in disciplinary domains is advanced.
Introduction
What exactly do we mean by discourse genres? Are they closed units, easily defined and operationalised? Are they found “out there”, as some scientists suggest, or are they “purely mental” artefacts, as others propose? Are they unique units of analysis created by some radical empirical researcher? Can academic literacy or teaching programmes be feasibly guided by genre theory? That is to say, are genres “teachable” or merely “usable”? Most or all of these questions erratically surround genre theory. However, they demonstrate diverse interests, aims, origins and diverse perspectives. The elusive and divergent theoretical conceptions underlying the term genre offer a wide diversity of alternative options. This may undoubtedly confuse and mislead both the novice and the expert. Approaches from New Rhetoric, Language for Specific Purposes, Systemic Functional Linguistics, Semiolinguistics, and Discourse
Analysis, among others, are all options to be systematically explored and discussed. In some cases, highly relevant differences are detected in the nature of genre conceptions, as well as in possible classifications and educational applications. Therefore, the focus of attention and the means used to approach genre analysis vary greatly, as do the type of categorisations or taxonomies and methods used to conduct empirical investigations. This chapter aims to outline a theoretical framework describing my own integral conception of discourse genres, emphasising a sociocognitivist approach (with special attention to written language). In order to do so, I will base part of my theoretical proposition on the founding principles of Escuela Lingüística de Valparaíso (ELV) (Peronard & Gómez Macker 1985; Gómez Macker 1998; Parodi 2007a, 2008a; Peronard 2007a) and on an empirical exploration of academic and professional genres across scientific disciplines. This study aims neither to identify nor to compare the different perspectives that have approached the analysis of discourse genres. Nevertheless, it aims to review the historical evolution of the concept and its progress and problems. Several studies adequately contribute to this scenario (e.g., Bhatia 1993, 2004; Devitt 2004; Hyland 2007, 2008; Bruce 2008). This chapter presents and defends a theoretical/empirical thesis that may be obvious to many researchers in fields outside linguistics (such as psycholinguistics, cognitive science, evolutionary psychology, and discourse psychology), which tend to prefer inter- or transdisciplinary approaches. Moreover, this thesis may be controversial and non-evident for an important number of communication theory, sociology, discourse analysis or critical discourse analysis researchers, as well as for humanities and social science researchers. This discourse genre thesis is based on the ontological and epistemological principles of my conception of human beings and language.
It is therefore also based on my integral and multi-dimensional conception of discourse. These principles are crucial when it comes to exploring any theory of academic and professional genres, or specialised disciplinary literacy. From this psychodiscourse perspective of language (Parodi 2003, 2005a, 2007a), genres are integrally comprised from a socioconstructivist approach in at least three dimensions: cognitive, social, and linguistic. The cognitive dimension constitutes a fundamental, and until now absent, component of this triadic connection. In this three-pronged relationship, the linguistic dimension allows the articulation of social and cognitive dimensions and helps to empower human beings as crucial communicative agents, thus avoiding reification and externalism. This chapter will focus on these relationships and their fundamental association with discourse genres.
1. Genres as cognitive constructs
The status of the cognitive dimension in language studies has evidenced a relatively uncertain course. The terms cognition or cognitive have been rather absent in language science studies over the last twenty to thirty years. Of course, this does not include the work of Noam Chomsky or the so-called cognitive linguistics (Lakoff 1987; Lakoff & Johnson 1981). The relative absence of these terms not only reflects the lack of attention focused on this dimension, but also implies that attention has been placed on other theoretical foundations. Nevertheless, the presence of other terms, such as knowledge, thought, experience, meaning, processing, concepts, and ideas, does draw our attention. This means that there is a type of linguistics we could call mentalist or psychological, in which mental facts are recognised. However, no investigation into processes of a clearly cognitive nature has been made for this kind of linguistics and we thus detect conceptual vagueness. This clearly evidences that there is a current trend to leave the cognitive dimension of language aside (Parodi 2008a). With regard to discourse genres from a linguistic perspective, I observe no important use of cognitive terms and the scarce examples I have found are vague. This lack of commitment and precision is abundant. Genre theory tends to exclude the cognitive dimension, or it has certainly denied, underestimated and underemphasised the relationship between cognition and language. Exceptions are few, although increasing attention has been paid to this cognitive dimension in recent years (Bhatia 2004; Virtanen 2004; De Beaugrande 2004; van Dijk 2008; Bruce 2008). A multi-dimensional conception of genres must envisage the different angles comprised in such a conceptual proposition. This conception must determine the basic dimensions supporting the complex outlook I adhere to. 
Figure 1 shows an interactive conception of genre in which a cognitive dimension, a social dimension and a linguistic dimension are proposed as fundamental. The principles underlying this conceptual framework are represented in Figure 1, proposing the idea that the relation among these three dimensions is not symmetrical, but rather an interaction in a dialectical cycle. Most essentially, the linguistic dimension performs a fundamental and synergetic role among the three. At the same time, this dimension establishes a nexus between the other two. Nevertheless, differences only exist in terms of degree, and these dialectical interaction processes are inseparable, as modelled in Figure 1. In order to construct a cognitive mental representation of what happens in the social external world, language as a nuclear and connecting tool of human life takes the semiotic process into the cognitive dimension.
[Figure 1. Dimensions that interact in the construction of genres: social dimension, linguistic dimension, cognitive dimension]
Without denying the proposed three dimensions involved and the diverse interactions implied in this conception of genre, I would like to emphasise the conception of discourse genres as cognitive constructs. Specialised literature has not yet fully considered this dimension, tending to excessively lean towards an externalist semiotic conception of genres (Halliday 1978; Kress & Threadgold 1988; Martin 1992; Stubbs 2007). Notwithstanding, this chapter does not aim to atomise the richness of genres, but rather to introduce a dimension that I believe has either intentionally or naively been forgotten or overlooked. This dimension is vital to a full comprehension of the object in question (a more detailed discussion with respect to the internalism/externalism continuum can be found in Parodi 2008a). It is now commonly accepted that human beings build knowledge through interaction with other human beings and in contexts that demand a wide range of discourse instruments. All human interactions produce knowledge, which is stored in the contents of cognition. However, there is sufficient evidence that knowledge, constructed through ontogenetic processes, is stored in the memory of readers/writers and speakers/listeners in a complex representational form whose format has not yet been fully determined. Two exceptional conceptual constructs that are relevant to the cognitive dimension of genres will be discussed. In recent years, the so-called situation model (van Dijk & Kintsch 1983) has gained importance as a knowledge representation built from discourse processing. Such a level of cognitive representation may also display genre knowledge, given that genres exist because the expert reader/listener must have a mental representation of the social situation in which these are produced and used. Likewise, the concept of context model, more recently coined by van Dijk (van Dijk 1999, 2006, 2008) also points to the type of knowledge referenced in this chapter.
[Figure 2. Discourse genres, situational and context models]
This claim gives support to
the cognitive nature of the context construct in discourse processing. These two models imply diverse types of knowledge: some more procedural, others more declarative. Both models are offered as a path towards better comprehension and explanation of the cognitive operations involved in the construction of discourse genres. Due to space and focus considerations, detailed explanations of these two models have not been included herein (see previous references by van Dijk). Figure 2 shows this proposal. These two cognitive constructs provide singular support for a genre theory in which the cognitive dimension provides stability for knowledge. These two models help to explain the psychological substrate of the processing of written discourse. They reveal that genres are not entities existing “out there”, but that they are in fact constructed from knowledge processed from a socioconstructivist approach, then stored and activated using different types of memory systems (Schacter & Tulving 1994). This integral conception points to a wider approach and reflects the multidimensionality of the concept of genre. Therefore, genres are more than mere social constants, as are behaviour and interaction patterns (uniquely defined by variables of social context: place, participants, etc.). This conception provides a more encompassing perspective that also features a cognitive substrate. In addition, this helps to overcome reductionisms implied in extremely rhetorical or contextualist approaches (Parodi 2008a). In keeping with this approach, emphasis is placed on the cognitive dimension of genres, since this aims to claim the central role of the human being as a speaker/writer and listener/reader within a dynamic and participatory communication process. It is ultimately the human being who constructs discourse genres in his/her mind as communicative
instruments, using his/her experience and cognitive representation of contexts and specific social situations and, certainly, by interacting with the rest of the world. Thus, genre knowledge, which is both socially and individually constructed, is stored in the form of cognitive representations. From this point of view, these representations will be activated and will be materialised in specific texts, within social and cultural contexts, according to the nature of each particular communicative event. Moving deeper into this perspective, an integral conception of genre is offered herein without making any distinction between social genres and cognitive genres (Bruce 2008) or between more social and other more linguistic or cognitive alternative variants (van Dijk 2008). The distinctions made as to the views of Bruce or van Dijk are between planes or dimensions within an integral conception. However, if what may be considered from either of these authors is the idea of different genres on different planes, I certainly disagree with this proposition. Cognition, language and context would interact in a dialectical cycle, with each side informing and guiding the other. From my point of view, the Bakhtinian concept of genre (Bakhtin 1998) (although powerful in a sense and enlightening for initial discussion) becomes narrow. I believe that an exclusively contextualist emphasis from a sociosemiotic perspective has fallen into a new theoretical and methodological reductionism. These weaknesses need to be overcome in the conception of discourse genre. Genres are created through a series of social interaction mechanisms and these enable the construction of discourse practices. However, genres are built and rebuilt by means of cognitive and linguistic constructs that are complexly formed together.
In summary, the discourse context of genres is based on knowledge that is fundamentally of a cognitive nature, and thus the subject and his/her memory of previous events constructed in specific settings and interactions provide continuity and stabilisation to the genre construct. In keeping with this socioconstructivist principle, genres evolve and respond to new communicative demands. Every single individual organises his/her knowledge dynamically, using cognitive representation systems that feature categorisation and hierarchy mechanisms. Different theories have proposed different structures or organisation systems of knowledge, such as the theories of schemata, frameworks or scenarios (Rumelhart 1975, 1980; Minsky 1975; Rumelhart & McClelland 1986). Other theories integrate knowledge from multiple sources and account for procedural dynamic representation structures (Kintsch 1998; Kintsch & McNamara 1996). The actual format and modus operandi of cognitive knowledge representations are currently controversial topics. Three alternative options coexist: a propositional approach, a connectionist approach, and a hybrid approach (a combination of the two) (van Dijk & Kintsch 1983; Rumelhart &
McClelland 1986; Kintsch 1988, 1998). For a brief review and critical analysis of these proposals, see Parodi (2005a, 2007a) and Ibáñez (2007a). My approach to the term genre implies a progressive enrichment of my own conception of human language in concrete manifestations that operationalise communicative situations and interactions. Hence, terms such as text type or class imply a more reductionist view and tend to be based on an excessively linguistic perspective. The term genre reveals a wider approach and reflects the multidimensionality of language in action and also an inter- or transdisciplinary point of view. Therefore, the cognitive dimension, the social dimension and the linguistic dimension interact in a complex way, giving form and support to discourse genres. This integrating and encompassing conception moves towards comprehension of human beings and their language, while giving a central role to the subjects and their social construction of knowledge. Based on these principles, a subject interacts in a specific context and constructs his/her own approach to reality using situated cognitions and deliberate actions while interacting with other subjects.
1.1 Interactions among multiple planes of genres
This sociocognitive conception of genres bears a direct link to mental processing, in this case, that of written discourse. From this perspective, the relationship between genres, their respective linguistic text structures and corresponding psycholinguistic processing opens new niches for research. For example, investigation of organisational cognitive constructs and the possible existence of a hierarchy of different written genres and the corresponding degrees of comprehension are highly relevant focuses. More specifically, the study of relationships between the cognitive organisation of genres, specialised cognitive knowledge structure, linguistic organisation of texts and comprehension of written texts offers unexplored fields of investigation. These fundamental interactions are represented in Figure 3. The relationship between written texts that bear disciplinary knowledge, the lexicogrammatical structure of genres and the reader’s previous knowledge is an almost unexplored field. This research area must be developed in order to gain proper understanding of specialised disciplinary genre literacy processes. It is a well-known fact that diverse types of genre emerge in response to and in order to satisfy different communicative demands. Thus, the rhetorical linguistic structure and organisation of these variables are designed in order to achieve these aims. The question underlying Figure 3 is whether different discourse genres
entail or imply different types or levels of cognitive processes, which in turn would be consolidated in diverse representations and demand different types of inferential processes to access full comprehension of written texts.
[Figure 3. Interactions among linguistic structures, cognitive representations and psycholinguistic processing]
2.
Genres: Towards a definition
Finding a concise definition of genre is not an easy task. There is a wide variety of definitions available in specialised literature. This particular construct has been approached from various perspectives, some of which are theoretical, while others are more instructional, more rhetorical, or even more grammatical. However, I believe that excessive emphasis tends to be placed on one component as opposed to another, or that the focus falls on one dimension at the expense of others, which sometimes leads to a rather unbalanced definition. On the one hand, this is a simple question of theoretical and methodological options and orientations; on the other hand, it highlights the fact that a single, brief definition cannot easily account for the richness of the concept or the extent of the very definition attempted. Interesting options are available and, in some cases, strongly defended from different points of view. Some of these include approaches based on new rhetoric (Freedman & Medway 1994; Bazerman 1994, 2008), applied linguistics
(Swales 1990, 2004; Bhatia 1993, 2004), semiodiscourse perspective (Charaudeau 2004), the Sydney School (Halliday 1978; Halliday & Hasan 1989; Martin 1992; Martin & Rose 2008), the German communicative perspective (Heinemann 2000; Heinemann & Viehweger 1991), and discourse analysis (van Dijk 1997, 2002, 2008). From my point of view, the key issue here is an integral conception regarding this phenomenon. A genre is a constellation of potential discourse conventions, sustained by previous knowledge of the speakers/writers and listeners/readers (stored in the memory of each subject), based on contextual, social, linguistic, and cognitive possibilities and/or constraints. This sociocognitively constructed knowledge is operationally codified using highly dynamic mental representations. Thus, a genre as a potential set of resources is an instantiation of groups of conventionalised selections, which present certain synchronically identifiable regularities, but which may also be observed in terms of diachronic variations or stabilisations. Therefore, genres are not static entities but are in fact highly dynamic; although these may evidence major standardisation depending on the communicative purposes to be served. In concrete manifestations, genres are varieties of a language that operate using groups of linguistic-textual features that co-occur systematically throughout the passages of a text, and which are linguistically circumscribed by the purposes of the participants (writers and comprehenders), the contexts of use, etc. These groups of linguistic-textual features may be identified by means of a corpus description – representative of instantiations in concrete texts – and from which prototypical regularities may be projected in order to characterise a certain genre. Figure 4 represents some of the core features of genres. Their conjugation and the operationalisation of their more specific variables give form to an exceptional genre. 
Another genre may emerge from the instantiation of some of these nuclear components, with a different focus on their more specific variables. This figure is open to new core or satellite components, since greater precision and evolution of the components may lead to more enriching variants. For greater detail and specific definitions of each criterion and variables, see Chapter 3 by Parodi, Ibáñez and Venegas. Genres, as structures of knowledge stored as dynamic mental representations, are a constellation of conventions acquired interactively by a subject through communication with other subjects. This conventionalised knowledge, cognitively constructed from cultural contexts, guides the discourse processes that participating subjects put into practice for social interactions. A highly important sine qua non prerequisite, from the perspective of an expert reader/writer, is the participation of subjects who are aware of both their active role in communicative interaction and of the desire to meet the goals of this interaction. Awareness
[Figure 4 components: Communicative Purposes; Context of Circulation; Participants (Writers and Readers); Lexicogrammatical Features (co-occurrent groups); Medium (paper, digital); Discursive Organisation Modes; Other Criteria]
Figure 4. Discourse genre components
stands as a singular feature of human beings and reveals itself as the turning point of discourse and cognition interactions. Therefore, readers and writers must plan, monitor and revise their participation in order to regulate fulfilment of the communicative act. Context and social roles constrain genre structuring; however, subjects with discourse maturity (a product of ontogenetic recursive processes) who are aware of their possibilities and resources may choose between discourse alternatives, make adjustments, propose changes and vary the aim, the focus, the communication process, etc. It is certainly possible that all of this may lead to variation of the genre to the extent that it becomes another genre, but this is the prerogative of the discourse activity of participants. In sum, the expert subject is not fully constrained by the context, but may and must freely decide to adapt to the context and act within the framework in question. Thus, a subject may consciously and purposely violate the conventions of a given genre, but it is the other participants who will evaluate the appropriateness of such a transgression.
2.1
Genres: Sociocognitively and ontogenetically constructed objects
In order to make my approach even more explicit, I would like to more accurately define some of the distinctions between the cognitive and social construction of discourse genres, modelled in Figure 4. Figure 5 illustrates the idea that external physical objects are represented cognitively after these have been processed by human cognition. This figure attempts to represent part of the knowledge planes that a writer/reader must cognitively construct by means of complex ontogenetic processes in his/her interactions in physical, social and cultural settings. This means that the subject must create a cognitive representation of, among other things, different objects, processes and mechanisms, storing them in diverse mnemonic systems. These cognitive constructs are highly dynamic and may change over time.
[Figure 5 components: Subject (Language and Cognition; Cognitive Representation of Genre: Participants, Physical Places, Purposes, Texts); CONTEXT (EXTERNAL). Context Variables: a) Participants: a.1 Social Roles; b) Physical Place: b.1 Place, b.2 Surroundings; c) Texts: c.1 Linguistic Structures, c.2 Topic, c.3 Rhetorical Organisation, c.4 Discourse Organisation Modes, c.5 Purposes]
Figure 5. Interactions between subject and context
The external world is made up of initial planes for the relative and intersubjective construction of individual knowledge. More specifically, considering discourse genres (for example, the participants and their social roles), it is clear that these are “out there”. Nevertheless, since the genre is a discourse tool for social interaction, a cognitive representation of the participants and their potential roles is required in order for the reader to understand what is being written and which communicative functions are available through language. It is undoubtedly the
context that may eventually activate this knowledge, but social interaction will fail to meet its communicative purpose if a previous cognitive construction is not stored in the subject's memory. New knowledge constructions are only possible if there is previous information in the form of cognitive structures. Figure 6 attempts to show the kind of knowledge that a subject constructs using language in his/her relation with society and, of course, with other subjects (for example, knowledge of genres, knowledge of the world and the discourse competence).
[Figure 6 components: Cognitive Representation of Genre (Participants, Physical Places, Purposes, Texts); Background Knowledge; Discourse Competence]
Figure 6. The reader/speaker and the cognitive representation of genres
As mentioned before, cognitive, social and linguistic dimensions must be taken into account in order to fully understand what people do with genres. This approach to genres raises the issue of the status and scope of the concept of “context”. From my perspective, context has a dual nature. It is physical and exists “out there”, but it is also cognitive as a mental phenomenon, as a cognitive representation. At least two planes may thus be distinguished: the external and the internal. Nevertheless, the sociocognitive construction of genres imposes the acceptance that context becomes a purely cognitive construction, since the substrate of representation, storage, activation and eventual reorganisation is basically cognitive. The fact that genres are partly constructed from social interactions of the participants, their social roles, and the cultural objects that comprise the physical settings constitutes a fundamental issue. Nevertheless, this issue does not directly influence the type of representational cognitive format or mechanisms used for the cognitive storage of information.
3.
Academic and professional genres
3.1
Genres, educational settings and scientific disciplines
Written language is the favoured resource through which disciplinary knowledge is created, fixed and transmitted. Distinctively, this takes place through the use of prototypical genres that support the initial construction of specialised knowledge and that gradually consolidate the integration of an individual into a specific discourse community. Academic and professional genres are operationalised in a group of texts that are organised into a continuum in which progression is made from general school texts towards professional and academic university texts. This progression is conceived based on a single person in academic education, who should progress in order to gradually face more diverse scenarios and genres. Figure 7 attempts to capture this genre conceptualisation of the means and context of production and the context of use, among others.
[Figure 7 components: General School Genres; Academic Genres in Technical High Schools; University Academic Genres; Other Specialised Genres (Scientific Domains); Professional Genres in Workplace]
Figure 7. Continuum of genres in different settings
The unidirectional arrows show communication between some alternative genres that a subject going through education processes could eventually avoid by taking different options (such as a technical-professional education). However, it is evident that in elementary and secondary education, these genres present a more obligatory character. As modelled in this figure, the centrality of academic genres within this continuum as axes connecting professional genres and other specialised genres (e.g. scientific genres) reveals a fundamental role in the construction of a specialised disciplinary domain. This academic educational space thus acts as a guideline for initial genre construction, offering a set of alternatives
that constitute means of access to knowledge and to specialised written practices. The diverse relationships represented in Figure 7 capture my conception of the interactions that a subject in education must experience in order to be able to construct a discourse domain in both academic and professional settings. This ideally implies that a particular subject would be supported by an ongoing, progressive literacy process. Figure 8 shows interactions and overlaps between academic and professional genres, as well as between other specialised genres. At the same time, the possible transversality of certain genres over disciplines is considered. These theoretical assumptions will, of course, be contrasted with subsequent empirical research presented in this book.
[Figure 8 components: Academic Genres; Professional Genres; Other Specialised Genres; DISCIPLINES]
Figure 8. Academic and professional disciplines and genres
Figure 8 displays the idea of an intersection between academic and professional genres in one or several disciplines, as well as other specialised genres. Similarly, there will also be other more specific and prototypical genres, idiosyncratic to each domain. The above reveals a substantial dynamism in the construction, evolution and circulation of genres. This theoretical assumption is empirically investigated in some of the following chapters of this volume and is also supported with findings across disciplines (see Chapters 4 and 5). Until very recently, the conception of academic discourse was typically envisaged as a group of unified genres, specifically from language teaching domains. My
point of view regarding this issue is that some of these discourse genres may vary greatly over disciplines and even within a discipline (Bhatia 1993, 2004; Parodi 2005a, 2007a, b, 2008b; Ibáñez 2008; Parodi & Ibáñez 2008). On the other hand, empirical research has also demonstrated that some specific genres may remain relatively homogeneous throughout several scientific disciplines (Venegas 2006; Parodi 2007a, b, c). As accurately stated by Bhatia (2004), genres move across disciplines, although it is logically possible that there is heterogeneity within a genre, as is the case from one discipline to another. The existence of groups of genres, of systems of genres (Bazerman 1994; Martin & Rose 2008; Tardy 2003), of colonies of genres (Bhatia 2004), and of macro-genres (Martin 1992) partly serves to demonstrate this assertion. Discourse genres may thus move across both academic discourse and professional discourse, within one or several disciplines. Some of these genres will be prototypical of only one field, be it academic or professional, or of only one discipline. Others will be present in several settings and may evidence diverse forms and functions. In order to more clearly refine and illustrate the concept of academic and professional genres, the following model in Figure 9 describes and diagrams a more global framework. Specialised discourse acts at the highest and most abstract level. This figure shows the hierarchical interaction between specialised discourse (SD), academic and professional genres (A&PG), disciplinary discourse (DD),
[Figure 9 levels: SPECIALISED DISCOURSE; ACADEMIC GENRES and PROFESSIONAL GENRES; DISCIPLINARY GENRES (Psychology – Chemistry – Biology – Linguistics – ……); TEXTS (Textbook, Didactic, Guideline, Disciplinary Text, Report, Record, …)]
Figure 9. Levels of realisation and hierarchical integration in the continuum of genres and discourse
and the final instantiation in concrete texts (T) as linguistic units, but also as units of meaning. Taken together, they constitute progressive changes of degree from a concrete level to a more abstract one, such as from T to A&PG. This continuum shows the way in which the linguistic system offers multiple potentials, which are selected and organised in terms of certain variables, until they become constituent parts of operationalised objects, that is, of texts such as the Organic Chemistry Textbook in the Industrial Chemistry undergraduate university programme and Didactic Guideline No. 3 from the Organisational Psychology course in the Psychology undergraduate university programme. The framework outlined is intended to capture, on the one hand, my conception of discourse as a wide-ranging phenomenon and, on the other, the concrete nature of text as a linguistic unit in which language is visible and linguistic features are captured. At the same time, the framework outlined in Figure 9 aims to envisage the place and role of genres as an intermediate layer between discourse and texts. In addition, disciplinarity is highlighted as a relevant issue in this integral conception of discourse, genres and texts. The proposed specialised discourse configuration, starting from specific texts and reaching disciplinary genres, is highly consistent with the position defended in this book (Parodi 2008c), as an initial point for empirical and theoretical reflection. Hence, a circular “theoretical-empirical-application-theoretical” (Parodi 2008a) model that covers diverse planes enables the collection of relevant information to support educational proposals in specific disciplines.
3.2
Towards disciplinary literacy based on genres
In an attempt to conceptualise the complex process of construction and learning of disciplinary knowledge based on academic and professional discourse practices, I envisage these processes as a dynamic circuit. Given the highly synergetic way in which diverse communicative instantiations contribute and recontribute to the sociocognitive construction of specialised knowledge, written discourse plays a fundamental role in which the psycholinguistic processes of reading and writing become central for disciplinary literacy. This circuit, modelled in Figure 10, also aims to capture a group of variables relevant to the process of construction and appropriation of disciplinary genres, and to highlight the directional and bidirectional connections through which I conceive the flow and construction of knowledge that take place as situated and distributed psychodiscursive activities. At the same time, Figure 10 shows this circuit is displayed in social and cultural contexts.
[Figure 10 components: CONTEXT (SOCIAL AND CULTURAL ENVIRONMENTS); Discourse Genres; WRITING (Text and Discipline); READING (Text and Discipline); Individual (Contextualized Cognition); Discursive Community (Cognition)]
Figure 10. Circuit of the construction of academic and professional genres
One way to approach the construction of specialised genres is through the processes of reading and writing disciplinary discourse. Therefore, a student entering an academic discourse community at university level must, at the beginning of this process, gain access to disciplinary knowledge basically by reading texts related to the discipline. All of this takes place in the context of a series of formal activities, such as the courses of a syllabus, and in spoken interactions with other students and teachers. From the perspective of writer discourse, the student is thus immersed in a progressively growing reservoir of written discourse and disciplinary genres that gradually support the construction of his/her specialised knowledge. The relevance of reading comprehension processes and the acquisition and development of efficient psycholinguistic strategies is crucial to a deep understanding of the disciplinary contents and the acquisition of specialised genres (Parodi 2003, 2005a; Marinkovich, Peronard & Parodi 2006; McNamara 2007). Focusing on written language as the preliminary approximation to specialised discourse genres constitutes the fundamental discourse and cognitive tools needed in order to become familiar with specific written information. It is thus the disciplinary genres that progressively open spaces towards knowledge of the discipline to the reader and that support him/her throughout gradual integration into the respective discourse community. Therefore, academic literacy processes are key factors when it comes to helping the non-expert to develop into a highly specialised reader and writer in disciplinary communication. One way to
help him/her to pursue this aim is not only to acquire a good collection of situated cognitions in rich and meaningful contexts of instruction, but also to construct new specialised knowledge through processes of distributed cognitions by interacting in discussions, study sessions and other intellectual group activities. This circuit of construction and reconstruction of knowledge based on discourse practices is initially opened by the comprehension of written material and gradually complemented by the practice of writing specialised genres. The synergetic relationships between both practices of specialised reading and writing transform disciplinary knowledge while the subject progressively gains command of disciplinary genres. Some of these genres will only be read in order to access specific knowledge; others will constitute writing tasks with the purpose of communicating specific information. Initially, some of these academic genres must be exclusively read and these will subsequently amplify their communicative purposes when the learner is able to adequately write them and carry out relevant communicative functions in academic practices. Upon reaching the moment when the writer is competent in some highly specialised prototypical genres, he/she will no longer be a novice, but rather an effective member of the group. Similarly, when the writer evidences adequate control of written discourse practices within the community, the subject will have demonstrated his/her capacity to construct and to communicate specialised meanings (Parodi 2007g). Hence, reading is a fundamental preliminary step in accessing knowledge and the discourse organisation of written material, but only effective written discourse production of the required genres reveals the maximum level of discourse competence.
Closing remarks
This chapter began with a series of relevant questions for a theory on discourse genres, questions that were not necessarily all answered within the scope of this study. Emphasis was also placed on terminological and conceptual possibilities, as well as on theoretical and applied complexities. I proposed limiting my discussion to the development of some theoretical principles especially focused on a sociocognitive and discourse perspective of language, highlighting the cognitive dimension of genres. I also moved towards some principles for disciplinary literacy based on both academic and professional specialised genres. Basically, emphasis has been placed on the central place of reading processes as a preliminary starting ability to open access to disciplinary knowledge and specialised genres. Writing continues to be a secondary tool in the development of discourse competence but also serves as the revealing psycholinguistic process of mastery in the language of the discipline.
The principles underlying my conceptual proposal may be summarised as a few core ideas:
a. Genres are articulated in a dynamic and complex manner from cognitive, linguistic and social dimensions;
b. The relation among these three dimensions is not symmetrical, but rather an interaction in a dialectical cycle in which the possible differences are only in degree, since these dialectical interaction processes are inseparable;
c. Most essentially, genres are cognitive constructs;
d. Context is “out there”, but context is mainly considered as a cognitive artefact for this theory of genre;
e. Genres are acquired socioconstructively, which means that the process of construction is through situated and distributed cognitions in a varying range of settings;
f. Genres may be acquired through formal and informal environments, but academic literacy (reading and writing) in academic and professional domains is an educational tool that may greatly help to facilitate the processes of genre construction.
The cognitive dimension of genre theory remains an interesting challenge that still requires theoretical reflection and empirical investigation. In my opinion, the connection established between the situation model and the context model as cognitive representations of knowledge fundamental to processing genres is an innovative route that may provide a better understanding of the cognitive, linguistic and social constitution of genres. Similarly, the integral view of a theory of genres, in which each dimension is integrally articulated without losing emphasis placed on one dimension over another, offers important challenges.
chapter 3
Discourse genres in the PUCV-2006 Academic and Professional Corpus of Spanish
Criteria, definitions, and examples
Giovanni Parodi, Romualdo Ibáñez and René Venegas
The identification and the classification of discourse genres have been one of the important concerns in linguistics studies. Particularly, since genres are analysed as complex objects and multi-dimensional approaches have been employed in order to capture their dynamic nature, more precise distinctive features have been determined. This chapter identifies, defines, classifies and exemplifies discourse genres emerging from the 491 texts forming the PUCV-2006 Academic and Professional Corpus of Spanish. In order to accomplish these objectives, we executed a complementary methodology of deductive and inductive nature. After analysing all the texts of the corpus, twenty-nine genres were identified and grouped. Interesting clusterings emerged reflecting cognitive, social, functional/communicative and linguistic variations.
Introduction
The identification and classification of different types or classes of texts, discourses and genres have always been a major concern for linguists. The purposes behind these taxonomic endeavours have been diverse. For example, theoretical, descriptive or educational perspectives have been explored. In this context, the objective of this chapter is to identify, classify and define the emerging discourse genres based on the texts of the PUCV-2006 Corpus of Spanish. In order to identify and define each genre, we will apply a multi-dimensional framework in which a complementary methodology is developed. In the first part of this chapter, we concentrate on a brief review of fundamental concepts, criteria and variables in order to frame the present study. We then report on the research, with a brief description of the corpora, and we also provide details regarding the methodological procedures. As part of the results,
we show the emerging identification of academic and professional genres and the definitions we arrived at. Furthermore, together with each definition, we offer a representative passage of a text, as an example of each genre in question. Lastly, a descriptive analysis of some of the variables involved is conducted.
1.
A multi-dimensional genre identification
It is worth noting that this chapter does not discuss the concept of genre or focus on its definition. We follow the principles developed by Parodi in Chapter 2 of this book. Therefore, the focus is placed on the discussion of the criteria under analysis and the variables involved in order to achieve in-depth genre classification. Classically, in order to classify the objects under study, text typologies or discourse genre classifications have emphasised dichotomous perspectives around categorical distinctions, based on discrete criteria. This was one of the fundamental goals of Text Linguistics, from the very beginning. From Functional Systemic Linguistics (FSL) and particularly from genre theory, Martin and Matthiessen (1991) and Martin (1992, 1996, 2002) have shown the usefulness of the concept of topology (in opposition to typology) for the identification of genres and their grouping. From the concept of topology, we arrive at the gradation of a semiotic space, rather than a conception based on a finite set of discrete categories. This idea is based on different sources, including adaptation of Wittgenstein’s classical theory (1958), which can be used to distinguish “family resemblance”, Rosch’s proposal (1975) from the theory of prototypes, and the concept of fuzzy categories in Cognitive Linguistics (Lakoff 1972). These ideas are also shared by studies conducted in text linguistics regarding complex multi-dimensional or multi-level taxonomies (Bassol & Torrens 1997; Ciapuscio 2003; Parodi & Gramajo 2003). These complementary ideas have expanded and deepened our comprehension of the complex process of categorising and classifying genres. At the same time, all of these approximations have helped us to realise that our objects of study definitely have porous limits or fuzzy frontiers.
Therefore, the only way to capture the richness involved in a text in order to identify the genre to which it belongs is to analyse it using diverse criteria and at multiple levels. We are sure that these complex approximations, with the co-existence of aspects of the cognitive, linguistic and functional/communicative types, may provide much more profound descriptions and explanations of the phenomena under study. Since we conceive discourse genres as highly complex and dynamic units, where the social, the cognitive, and the linguistic dimensions are proposed as being essential, a focus on several dimensions is most adequate for drawing up valid classifications from a theoretical and empirical point of view.
In addition, a complementary approach that combines deductive or “top-down” and inductive or “bottom-up” perspectives is employed. This combination means that we use both theoretical categories emerging from the researchers’ previous knowledge and empirical data coming from a more “corpus-based” approach (Tognini-Bonelli 2001; Biber, Connor & Upton 2007). Hence, this complementary perspective is vital in order to achieve better and more efficient distinctions among the genres emerging from the corpora. Thus, in order to carry out the classification, we took into consideration the internal characteristics of the texts as well as the extra-linguistic context where these are produced and also where they circulate. This focus is based on the eminently dialogic character of language and attempts to account for the complex socio-linguistic-cognitive relationship between the participants of a certain discourse community.
2.
The research
2.1
The corpora
Here we present a brief description of the PUCV-2006 Corpus, which is, in turn, constituted by both an academic and a professional corpus. More specific details are provided in Chapters 3 and 4 of this volume. The academic corpus comprises nearly 100% of the written material consulted by students during their four-year university programmes: Psychology (PSY), Social Work (SW), Construction Engineering (CE) and Industrial Chemistry (IC). The professional corpus includes some of the texts considered to be fundamental material for dealing with professional duties of individuals who graduated from the aforementioned programmes. In contrast to the aim we had when building the academic corpus, we did not seek to get 100% of the circulating written material, as we know this is unlimited. Instead, we collected all the texts we could possibly gather during a six-month period. This is because our purpose was to obtain the largest possible number of samples from diverse genres, not to track down the total amount of material in circulation. Table 1 sums up information from both corpora. A first look at the data reveals the existence of a larger number of texts in the disciplines of Social Sciences and Humanities (SS&H), in comparison with the disciplines of Basic Sciences and Engineering (BS&E). According to these numbers, it can be seen that Psychology students may read up to four times as much as Industrial Chemistry students, in terms of number of texts. The same situation is true for the professional corpus in the same disciplines. All of this reveals a tendency towards associating a knowledge domain and its disciplines, in the academic as well as in the professional setting, with both a larger number of texts
Table 1. Academic and Professional corpora descriptions

Science Domain                          Discipline   Number of Texts      Number of Texts
                                                     (Academic Corpus)    (Professional Corpus)
Social Sciences & Humanities (SS&H)     PSY          227                  220
                                        SW           142                  101
Basic Sciences & Engineering (BS&E)     CE           69                   62
                                        IC           53                   59
Total                                                491                  442
and lengthy texts, as is the case with PSY and SW. See Chapters 5 and 6 for a more in-depth description of these figures.
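The totals in Table 1 and the "up to four times" comparison can be re-checked with a few lines of Python. This is only a minimal sketch: the counts come from Table 1, while the variable and function names are illustrative assumptions and not part of the original study.

```python
# Text counts per discipline, copied from Table 1.
academic = {"PSY": 227, "SW": 142, "CE": 69, "IC": 53}
professional = {"PSY": 220, "SW": 101, "CE": 62, "IC": 59}

# Grouping of disciplines into science domains, as described in the text.
# (The dictionary name is an illustrative assumption.)
domains = {"SS&H": ("PSY", "SW"), "BS&E": ("CE", "IC")}


def domain_totals(corpus):
    """Sum the number of texts per science domain for one corpus."""
    return {name: sum(corpus[d] for d in members)
            for name, members in domains.items()}


academic_total = sum(academic.values())
professional_total = sum(professional.values())

# Ratio behind the "up to four times" remark (number of texts, academic corpus).
psy_to_ic = academic["PSY"] / academic["IC"]

print(academic_total, professional_total)
print(domain_totals(academic), domain_totals(professional))
print(round(psy_to_ic, 2))
```

The comparison is made on number of texts only, as in the chapter; it says nothing about text length.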
2.2 Criteria and variables used to identify discourse genres
Figure 1 captures some of the features that constitute the discourse genres upon which we will base our analysis. As shown in Figure 1, the six criteria mentioned herein are proposed as initial, but not definitive research categories. They are open to new criteria and to the operationalisation of more productive variables, whether their status is nuclear
[Figure 1 diagram: Communicative Purposes; Context of Circulation; Participants (Writers and Readers); Lexicogrammatical Features (co-occurrent groups); Medium (paper, digital); Discursive Organisation Modes; Other Criteria]
Figure 1. Possible criteria to classify genres
Discourse genres in the PUCV-2006
or satellite. The idea behind this open organisation of criteria is that new insights into genre theory may open up challenging possibilities and produce criteria and variables that capture genres, and identify and differentiate them, in a better manner.
The construction of this proposal of identification and classification followed deductive-inductive principles. The deductive step led us “from the already defined types, traditionally established as observable linguistic objects” (De Beaugrande & Dressler 1981: 251) to identify and select, on this basis, criteria and variables founded on the specialised bibliography. Subsequently, in order to complement and complete this first matrix of features, we proceeded to investigate it empirically in a micro-corpus. By means of these procedures, we arrived at a specific matrix that allowed us to distinguish and classify the texts of the corpora more precisely. The criteria emerged from, among others, variables related to communicative purposes, interaction between writers and readers, contexts where texts circulate, and the predominant discourse organisation modes that characterise the texts.
In sum, in the present study we decided to focus on five criteria and twenty-six variables. As can be observed in Figure 2, these criteria are: (a) communicative macro-purposes, (b) discourse organisation mode, (c) relationship between participants, (d) context of circulation, and (e) modality. We established a set of variables based on the five aforementioned criteria. These variables are analysed in the texts according to the principle of “predominance” of one over the other. As we have emphasised above, this means that we are aware that texts and genres are neither absolutely circumscribed units nor closed in themselves. For example, they feature the co-existence of diverse communicative macro-purposes, and multiple discourse organisation modes are intertwined
Figure 2. Selection of criteria to define genres
[Figure 3 diagram: the five criteria and their variables.
Communicative macro-purpose: to instruct, to consign, to regulate, to guide, to confirm, to persuade, to invite, to offer.
Relation between participants: expert writer – expert reader; expert writer – semi-lay reader; expert writer – lay reader; expert writer – expert and semi-lay reader; expert writer – semi-lay and lay reader; expert writer – expert, semi-lay and lay reader.
Context of circulation: pedagogical, labour, scientific, universal.
Discourse organisation mode: descriptive, narrative, argumentative.
Modality: monomodal, multimodal.]
Figure 3. Criteria and variables that classify genre
throughout their passages. In keeping with the ideas mentioned before, it may be observed in this figure that the matrix of criteria is left open to new possibilities in the blank spaces available in the corners. Figure 3 presents the variables established for each criterion. Each of the criteria and its variables are commented on hereunder.
2.2.1 Communicative macro-purpose
Before illustrating what we understand in this research as communicative macro-purpose, it is necessary to emphasise that texts, as concrete instances of a particular language, are intentionally produced by individuals, according to the needs that context imposes upon them. Therefore, each communicative event or text has a purpose that originates it and which, at the same time, together with other variables, allows it to be identified as belonging to a certain discourse genre (Askehave & Swales 2001). Some authors call this purpose Communicative Purpose of Genre (Swales 1990, 2001, 2004; Bhatia 1993, 1997a, 2004; Dudley-Evans 1994). A communicative purpose is conceived as the ultimate objective for which a discourse genre is used in a communicative exchange. Since a great part of the genres that make up our corpora are known to possess not only a general communicative purpose, but also a set or group of minor communicative purposes from which the general purpose is shaped, we identify this general purpose as Communicative Macro-purpose (see Chapter 8). The macro-purposes identified in this study are:
– To instruct about a specific disciplinary matter.
– To consign the state of a procedure, the health of an individual or a concept.
– To regulate behaviours and/or procedures.
– To persuade of a theoretical or ideological statement.
– To guide behaviours and/or procedures.
– To invite to take part in a public bidding.
– To confirm the validity of a fact or procedure.
– To offer a product or service.
2.2.2 Relationship between participants
In order to describe the relationship between participants, it is necessary to approach the concept of discourse community. This kind of community is constituted by a group of persons who share knowledge about a certain matter, as well as the conventions needed to interact discursively in order to communicate this knowledge. According to Swales (1990) and Bhatia (1993, 2004), discourse communities exist and sustain themselves by means of a dynamic membership in terms of expertise. This is because their members, in most cases, enter the community as novices and gradually become experts as a result of their discursive interaction in the community. Thus, the roles of speech (Halliday & Hasan 1976; Halliday 2004a) which persons can adopt in a communicative exchange as speaker/writer and hearer/reader are projected in a discourse community as roles determined by the participants’ degree of expertise. Although we acknowledge that the stability of a discourse community is based on the continuous development of its members’ degree of expertise, we have decided to establish four central roles: expert writer, expert reader, semi-lay reader, and lay reader. The expert writer and the expert reader are members with a high degree of expertise, which allows them not only to comprehend the genres that circulate within their community, but also to produce them. The semi-lay reader has intermediate expertise. Although he/she might have acquired certain knowledge that is shared by the community, he/she is not yet capable of interacting efficiently through the genres circulating within a community. Lastly, the lay reader corresponds to the member with the lowest degree of expertise. Based on the above roles, we have identified three prototypical relationships between the participants. These relationships are illustrated in Figure 4.
In this figure, prototypical discourse relationships correspond to those existing between an expert writer and an expert reader, between an expert writer and a semi-lay reader, and between an expert writer and a lay or uninformed reader. Nevertheless, it is important to state that, although discourse genres are conceived by some authors as identifying elements of a discourse community (Swales 1990, 2001; Bhatia 1993, 2004), in many cases their circulation is not limited to only
[Figure 4 diagram: two discourse communities. Labels in the first: Expert Writer, Expert Reader, Semi-lay Reader, Lay Reader. Labels in the second: Expert Reader, Semi-lay Reader, Lay Reader.]
Figure 4. Relationships between participants
one community. Therefore, we must acknowledge the possibility of a relationship between an expert writer from one discourse community and a reader from another community, as may be observed in Figure 4. This means that a genre produced within a certain community can also form part of the discourse of another community, very possibly serving another communicative purpose.
2.2.3 Discourse organisation mode
The discourse organisation modes organise the sequence of contents and define the genre as predominantly descriptive, narrative, or argumentative. More specifically, discourse organisation modes are relatively stable ways of combining enunciations, through which an organisation that is recognisable by its internal hierarchic structure and its combining units can be identified (Charaudeau 1992; Adam 1992; Parodi & Gramajo 2005).
– Descriptive Mode: In this discourse organisation mode, three features are identified: denomination, localisation-situation and qualification. These three features allow us to characterise objects, persons, situations or processes on the basis of their qualities or their temporal and spatial circumstances. Through this descriptive mode, information can be organised as, for example, taxonomic orderings or definitions of a type of action, procedure or character (Hamno 1991; Charaudeau 1992; Adam 1992).
– Narrative Mode: In this discourse organisation mode, actions and events are presented in an integrating temporal order, providing unity and orienting action right up to the end; therefore, if one of the parts of the action is displaced or deleted, the entire organisation is altered. This discourse organisation mode is characterised by a high density of cause-effect relations, such as purpose, possibility and temporal proximity (Propp 1928; van Dijk 1983; Charaudeau 1992; Adam 1992).
– Argumentative Mode: In this discourse organisation mode, information may be displayed in a logical, demonstrative or persuasive manner. Each of these manners allows the writer to express a point of view or thesis and different kinds of arguments (or counter-arguments) that support (or refute) the thesis (Lo Cascio 1998; Toulmin 1958; van Dijk 1983; Charaudeau 1992; Adam 1992).
2.2.4 Context of circulation
This criterion indicates the context within which, ideally, the texts belonging to a genre are used. One can distinguish more specialised contexts of restricted circulation from others that are broader, more general and less restricted. The contexts of circulation identified in this study are:
– Pedagogical: formal domain of teaching and learning of contents and procedures.
– Labour: domain wherein a technical or professional activity is carried out.
– Scientific: domain wherein knowledge that is the product of research tasks is generated and transmitted.
– Universal: domain that involves all the above, but which, at the same time, includes contexts of wide circulation that involve society as a whole. This means that it is not restricted to contexts of reduced circulation or to highly specialised ones.
2.2.5 Modality
This criterion corresponds to the semiotic approach used in discourse genres to construct the conceptual network of the message and give it sense. These modes are materialised through two types of signs: verbal (oral-written) and non-verbal (scientific formulae, images, drawings, illustrations, etc.) (Hartshorne & Weiss 1965; Eco 2000; Kress & van Leeuwen 2001). This criterion operates in a dichotomous manner, with predominance of one or the other mode. These are:
– Monomodal: in the genre, only one semiotic mode is present, either verbal (oral or written) or non-verbal (graphs, signals, tables, schemes, images, etc.).
– Multimodal: in the genre, more than one semiotic mode can be present, verbal (oral or written) and non-verbal (graphs, signals, tables, schemes, images, etc.).
2.3 Example of classifying genres from the matrix of criteria and variables
To illustrate how the chosen criteria and variables are used in the identification and comparison of genres, Figure 5 shows a characterisation of the Disciplinary Text and Textbook genres. These two genres have been identified by applying the feature matrix. We will define and exemplify these as follows: As shown in Figure 5, it is possible to identify and distinguish the Textbook genre from the Disciplinary Text genre, based on the co-occurrence of the variables identified in each of the criteria. Thus, for the Textbook, the prototypical structuring variables are: Macro-purpose (to instruct); Discourse organisation mode (descriptive); Context of circulation (pedagogical); Relationship between the participants (expert writer and expert or semi-lay reader); Modality (multimodal). The prototypical variables characterising the Disciplinary Text are: Macro-purpose (to persuade); Discourse organisation mode (argumentative); Context of circulation (scientific); Relationship between the participants (expert writer and expert reader); Modality (monomodal). Textbook and Disciplinary Text differ on all five criteria because, prototypically, they reveal different macro-purposes (to instruct vs. to persuade), different discourse organisation modes (descriptive vs. argumentative), different contexts of circulation (pedagogical vs. scientific), different ideal relationships between reader and writer (expert or semi-lay reader vs. expert reader), and they preferably present different modalities (multimodal vs. monomodal). As observed, the selected
[Figure 5 diagram: the matrix of criteria and variables shown in Figure 3, with the variables that characterise the Textbook and the Disciplinary Text marked.]
Figure 5. Example of the identification of genres: Disciplinary Text and Textbook
criteria and variables allow us to clearly identify one genre from the other and cause the fundamental differences between them to emerge.
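The comparison above can be sketched as a small feature matrix. The following Python fragment is only an illustrative encoding, not the authors' own tool: the class name `GenreProfile`, its field names and the helper `differing_criteria` are assumptions, and the values are the prototypical variables stated in this section for the two genres.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class GenreProfile:
    """One genre described by the five criteria (hypothetical encoding)."""
    macro_purpose: str
    organisation_mode: str
    context: str
    participants: str
    modality: str


# Prototypical variables as given in Section 2.3 for each genre.
textbook = GenreProfile(
    macro_purpose="to instruct",
    organisation_mode="descriptive",
    context="pedagogical",
    participants="expert writer - expert or semi-lay reader",
    modality="multimodal",
)
disciplinary_text = GenreProfile(
    macro_purpose="to persuade",
    organisation_mode="argumentative",
    context="scientific",
    participants="expert writer - expert reader",
    modality="monomodal",
)


def differing_criteria(a, b):
    """Return the names of the criteria on which two genre profiles differ."""
    return [f.name for f in fields(a) if getattr(a, f.name) != getattr(b, f.name)]


print(differing_criteria(textbook, disciplinary_text))
```

Under this encoding the two genres differ on every criterion, which mirrors the claim that the matrix separates them clearly. Genres that share several variables would return a shorter list.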
3. Genres emerging from the corpora
In this section, we identify twenty-nine genres by applying the criteria matrix to the corpora. These genres will be analysed in detail in the following chapters of this book.
3.1 Definitions and exemplifications
Hereunder we present, in alphabetical order, the definitions for each of the twenty-nine genres of the PUCV-2006 Corpus of Spanish. We also provide a text passage representing each of them. These passages were selected from the corpora and across all the disciplines involved.
1. Bidding Specification: The macro-purpose of this discourse genre is to extend an invitation to take part in a bidding process. The invitation comes from public or private entities and is directed towards organisations and/or enterprises which would present bids for the execution of a certain service. The relationship between the participants is between expert writer and expert reader. It circulates in a labour context. Generally, it is monomodal, with a predominant descriptive discourse organisation mode.
2. Brochure: Discourse genre with the macro-purpose of offering products, services and/or information. It preferably circulates in a universal domain and the relationship between the participants is between an expert writer and a semi-lay or lay reader. The predominant discourse organisation mode is descriptive, and generally it uses multimodal resources as support.
3. Calculation Memory: The communicative macro-purpose of this discourse genre is to consign the description of procedures and steps in a building project. It circulates in labour contexts, and the relationship between participants is between an expert writer and an expert reader. The predominant discourse organisation mode is descriptive, and multimodal resources are used.
4. Call for Bids: The macro-purpose of this discourse genre is to publicly invite one or various persons or institutions to carry out activities under pre-established criteria. The relationship between the participants is between an expert writer and an expert reader. It circulates in a labour context. The predominant discourse organisation mode is descriptive. Monomodal resources are preferred.
5. Certificate: Discourse genre whose macro-purpose is to officially confirm an administrative event or achievement. It is normally issued by an authority to confirm the validity of what is stated in the document. The relationship between the participants is between expert writer and lay reader, and the context of circulation is a universal one. The certificate presents a predominantly descriptive discourse organisation mode. Monomodality is normally preferred.
6. Commercial Catalogue: The macro-purpose of this discourse genre is to offer products and/or services. The relationship between the participants is between an expert writer and an expert reader. It circulates ideally in a labour context. Generally, it presents multimodal resources, and a descriptive discourse organisation mode is predominant.
7. Development Plan: The communicative macro-purpose of this discourse genre is to guide actions to produce the achievement of one or more objectives. It circulates between an expert writer and an expert reader within labour contexts. Multimodal resources are used, and the discourse organisation mode is predominantly descriptive.
8. Dictionary: Discourse genre with the macro-purpose of consigning the definition of concepts or procedures of a discipline or matter. Its context of circulation is the pedagogical field, and the relationship between participants is between an expert writer and an expert or semi-lay reader. Preferably, a descriptive discourse organisation mode is used, with multimodal resources.
9. Didactic Guideline: The communicative macro-purpose of this genre is to instruct about a specific disciplinary matter and/or procedure. Its context of circulation is pedagogical, and the relationship between participants is between an expert writer and a semi-lay or lay reader. Preferably, an argumentative discourse organisation mode is used; and, occasionally, multimodal resources are employed.
10. Disciplinary Text: The communicative macro-purpose of this discourse genre is to persuade readers of one or more subject matters of a particular discipline. Ideally, the context of circulation is the scientific field, and the relationship between participants is between an expert writer and an expert reader. Preferably, an argumentative mode of discourse organisation is used. Multimodal resources are also employed.
11. Law: The macro-purpose of this discourse genre is to regulate the behaviour of individuals and the execution of procedures and processes. It ideally circulates in universal contexts. The relationship between participants is between an expert writer and an expert or semi-lay reader. It is monomodal, and its predominant discourse organisation is descriptive.
12. Lecture: This discourse genre has the macro-purpose of persuasion. The relationship between the participants is between an expert writer and an expert or semi-lay reader in a scientific context. Preferably, an argumentative discourse organisation mode is employed, with multimodal resources.
13. Medical Order: The communicative macro-purpose of this discourse genre is to guide the execution of a medical procedure. It circulates within labour contexts, and the relationship between participants is between an expert writer and an expert reader. It is monomodal, and its discourse organisation mode is predominantly descriptive.
14. Medical Report: Discourse genre whose macro-purpose is to consign the health condition of a patient as well as the procedures used for his/her treatment. It is used between an expert writer and an expert reader within labour contexts. It employs monomodal resources and presents a predominantly descriptive discourse organisation mode.
15. Memorandum: The communicative macro-purpose of this discourse genre is to confirm the information requested. Ideally, it circulates in labour contexts, and the relationship between participants is between an expert writer and an expert reader. The predominant discourse organisation mode is descriptive, and monomodal resources are used.
16. News: The macro-purpose of this discourse genre is to confirm events of diverse nature. Ideally, it circulates in universal contexts and the relationship between participants is between an expert writer and an expert, semi-lay or lay reader. The predominant discourse organisation mode is narrative. Multimodal resources are generally used.
17. Observation Guideline: The communicative macro-purpose of this discourse genre is to regulate the observation of behaviour, things or events. It circulates between an expert writer and an expert reader and, ideally, in labour contexts. Generally, it is monomodal and the predominant discourse organisation mode is descriptive.
18. Operating Manual: The macro-purpose of this discourse genre is to regulate behaviours and/or procedures. Ideally, it circulates in labour contexts, and between an expert writer and an expert reader. Normally, multimodal resources are used, and its predominant discourse organisation is descriptive.
19. Plan: The macro-purpose of this discourse genre is to guide the organisation and distribution of an architectural structure, a town (population) or a machine. The relationship between participants is between an expert writer and an expert reader. Ideally, it circulates in labour contexts. Predominantly, the discourse organisation mode is descriptive, and multimodal resources are employed.
20. Quotation: Discourse genre whose macro-purpose is to confirm the value of a property or service. Ideally, it circulates in labour contexts. The relationship between the participants is between an expert writer and an expert or semi-lay reader. Preferably, a descriptive mode of discourse organisation is used, and multimodal resources are employed.
21. Record: The macro-purpose of this discourse genre is to consign the state or condition of a procedure or product. It circulates within labour contexts. The relationship between participants is between an expert writer and an expert reader. Its predominant discourse organisation mode is descriptive. Generally, monomodal resources are used.
22. Regulation: The communicative macro-purpose of this discourse genre is to regulate behaviours and/or procedures. The context of circulation is universal. The relationship between participants is between an expert writer and an expert or semi-lay reader. It usually is monomodal and presents a discourse organisation mode that is predominantly descriptive.
23. Report: This discourse genre has the macro-purpose of consigning situations, procedures and/or problems. Ideally, the context of circulation is the labour field. The relationship between participants is between an expert writer and an expert reader. It is usually monomodal and presents a discourse organisation mode that is descriptive.
24. Research Article: Discourse genre whose macro-purpose is to persuade someone of a point of view, based on a theoretical or empirical study. It ideally circulates in scientific contexts. The relationship between the participants is between an expert writer and an expert reader. Preferably, an argumentative discourse organisation mode is employed. It is supported by multimodal resources.
25. Research Project: The communicative macro-purpose of this discourse genre is to offer a scientific research proposal. Ideally, it circulates in scientific contexts and between an expert writer and an expert reader. It employs monomodal resources. Predominantly, an argumentative discourse organisation mode is displayed.
26. Statement: The macro-purpose of this discourse genre is to consign a decision, intention or agreement regarding the state, condition or nature of something. Normally, it circulates in universal contexts, and the relationship between the participants is between an expert writer and an expert reader. A descriptive discourse organisation mode is employed. Monomodal resources are normally used.
27. Test: The communicative macro-purpose of this discourse genre is to consign the psychological characteristics of an individual. It circulates within labour contexts, and the relationship between participants is between an expert writer and a lay reader. It can be multimodal and its preferred discourse organisation mode is descriptive.
28. Textbook: The macro-purpose of this discourse genre is to instruct regarding concepts and/or procedures within a specialised subject matter. Its context of circulation is the pedagogical field, and the relationship between participants is between an expert writer and a semi-lay or lay reader. Preferably, a descriptive discourse organisation mode and multimodal resources are used.
29. Thesis: This discourse genre has the macro-purpose of persuading in respect of a theoretical or ideological proposal. Ideally, it circulates within scientific contexts, and the relationship between participants is between an expert writer and an expert reader. The predominant discourse organisation mode is argumentative. Multimodal resources are usually employed.
3.2 Analysis of the genres identified according to criteria and variables
The following section focuses on an analysis of the twenty-nine genres identified in the corpora, starting with each of the variables included in the criteria framework. Firstly, we approach the communicative macro-purpose. In Figure 6 the genres are grouped around the eight macro-purposes. As can be observed in Figure 6, seven of the twenty-nine identified genres reveal the predominance of the purpose to consign. This amounts to 24.1% of the total, which makes it the most frequent macro-purpose among the genres in the corpora. At the same time, four genres (Law, Operating Manual, Regulation, and Observation Guideline) correspond to the purpose to regulate (13.8%), another four genres (Research Article, Lecture, Thesis and Disciplinary Text) share the macro-purpose to persuade (13.8%) and another four genres (Certificate, Quotation, Memorandum, and News) coincide in the macro-purpose to confirm (13.8%). On the other hand, the purposes to guide (10.3%) and to offer (10.3%) each gather three different genres. Lastly, the macro-purposes to instruct (6.9%) and to invite (6.9%) assemble two genres each. From these data, it is possible to establish that, although the corpora are of an academic and professional nature, the predominant purposes of the genres appear to be more oriented towards the professional domain than towards the academic one.
[Figure 6 diagram: genres grouped by communicative macro-purpose]
To instruct: TB, DG
To consign: MR, REP, CM, REC, TEST, DIC, STA
To regulate: LAW, OM, REG, OG
To persuade: RA, LECT, THE, DT
To guide: MO, DP, PLA
To invite: CB, BS
To confirm: CERT, QUOT, ME, NEW
To offer: CC, BRO, RP
Key: Bidding Specification (BS), Brochure (BRO), Calculation Memory (CM), Call for Bids (CB), Certificate (CERT), Commercial Catalogue (CC), Development Plan (DP), Dictionary (DIC), Didactic Guideline (DG), Disciplinary Text (DT), Law (LAW), Lecture (LECT), Medical Order (MO), Medical Report (MR), Memorandum (ME), News (NEW), Observation Guideline (OG), Operating Manual (OM), Plan (PLA), Quotation (QUOT), Record (REC), Regulation (REG), Report (REP), Research Article (RA), Research Project (RP), Statement (STA), Test (TEST), Textbook (TB), Thesis (THE)
Figure 6. Classification of the academic and professional genres according to communicative purposes
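The percentages reported for the macro-purposes follow directly from the number of genres in each group. The sketch below recomputes them in Python; the grouping is read off Figure 6 and the genre definitions in Section 3.1, and the names `groups`, `total` and `shares` are illustrative assumptions.

```python
# Genres (by abbreviation) grouped by communicative macro-purpose,
# as read from Figure 6 and the definitions in Section 3.1.
groups = {
    "to instruct": ["TB", "DG"],
    "to consign": ["MR", "REP", "CM", "REC", "TEST", "DIC", "STA"],
    "to regulate": ["LAW", "OM", "REG", "OG"],
    "to persuade": ["RA", "LECT", "THE", "DT"],
    "to guide": ["MO", "DP", "PLA"],
    "to invite": ["CB", "BS"],
    "to confirm": ["CERT", "QUOT", "ME", "NEW"],
    "to offer": ["CC", "BRO", "RP"],
}

# Total number of genres across all macro-purposes.
total = sum(len(genres) for genres in groups.values())

# Share of genres per macro-purpose, as a percentage with one decimal.
shares = {purpose: round(100 * len(genres) / total, 1)
          for purpose, genres in groups.items()}

# List the macro-purposes from most to least frequent.
for purpose, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{purpose}: {len(groups[purpose])} genres ({share}%)")
```

Each genre is counted once, under its predominant macro-purpose, in line with the principle of predominance adopted in this chapter.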
Figure 7. Classification of professional and academic genres according to the discourse organisation mode
Hereunder, we review the distribution of genres with respect to the criterion of discourse organisation mode. This mode is described by means of three variables (see Figure 7). In Figure 7, we observe that the majority of the genres present a predominantly descriptive mode (Dictionary, Memorandum, Bidding Specification, Commercial Catalogue, Call for Bids, Quotation, Statement, Medical Report, Brochure, Report, Law, Textbook, Operating Manual, Calculation Memory, Regulation, Medical Order, Observation Guideline, Development Plan, Plan, Record, Test and Certificate). This mode assembles 75.9% of the genres identified in the corpora. On the other hand, six genres show predominance of the argumentative mode (Research Article, Lecture, Didactic Guideline, Research Project, Thesis and Disciplinary Text), which corresponds to 20.7%. Lastly, only News evidences predominance of the narrative mode, equivalent to 3.4%. In view of the academic and professional nature of the corpora under study, this information is especially relevant, since it shows that a central feature such as the discourse organisation mode tends towards description rather than the other two modes. The following criterion under analysis is the context of circulation. The corresponding data are shown in Figure 8.
Figure 8. Classification of professional and academic genres according to context of circulation
With respect to the context of circulation of these twenty-nine genres, we can establish that the labour context (with 48.3% of the genres) assembles the largest share of the genres in the corpora (Memorandum, Plan, Bidding Specification, Commercial Catalogue, Call for Bids, Quotation, Medical Report, Report, Operating Manual, Calculation Memory, Medical Order, Observation Guideline, Development Plan, Record and Test). In second place comes the universal context, which has 24.1% of the genres in the corpora (Statement, Law, Certificate, Brochure, Regulation, and News). The third place is filled by the scientific context, which assembles 17.2% of the genres (Research Article, Lecture, Research Project, Disciplinary Text and Thesis). With a much lower percentage of genres (10.3%) comes the pedagogical context, which assembles the Textbook, the Didactic Guideline and the Dictionary. The grouping of these genres according to this criterion reveals that, although many of them may circulate within the academic domain, they are ideally produced in professional contexts. In Figure 9, we observe that 100% of the texts from the corpora were produced by expert writers. The type of reader to which the genre is addressed establishes the difference in the relationship between participants.
Figure 9. Classification of professional and academic genres according to the relationship between the participants
As may be noted in Figure 9, the genres Certificate and Test (6.9%) are created exclusively for lay readers. On the other hand, the genres Didactic Guideline, Brochure and Textbook are produced for semi-lay or lay readers, which corresponds to 10.3%. Likewise, Lecture, Quotation, Dictionary, Regulation, Law, Plan, and Test are produced in order to be read by expert or semi-lay readers (24.1%). Lastly, a group of seventeen genres is detected, corresponding to 58.6%, which have been produced to be read by experts (Research Article, Bidding Specification, Commercial Catalogue, Call for Bids, Statement, Medical Report, Report, Operating Manual, Memorandum, Calculation Memory, Observation Guideline, Development Plan, Research Project, Record, Thesis, and Disciplinary Text). This means that these are genres that demand a high degree of previous knowledge in order to be understood and used. Only News has been created for diverse readers, such as experts, semi-lay or lay. The following criterion in analysis corresponds to Modality. Distribution of the genres, according to this criterion, is shown in Figure 10. As can be observed in Figure 10, monomodality characterises 51.7% of the genres in the corpora. On the other hand, 48.3% of the genres show a complementation between verbal mode and non-verbal resources of other semiotic modes, especially
Giovanni Parodi, Romualdo Ibáñez and René Venegas
Figure 10. Classification of professional and academic genres according to modality
with images, graphs and scientific formulae. This finding reveals an interesting balance in the distribution of genres with respect to this criterion. It shows that considerable importance is given to combining more than one mode of communication, not only in academic but also in professional settings.
Closing remarks
As shown in this study, the use of a multidimensional framework constitutes a robust theoretical-empirical support. Together with this, the initial assumption about fuzzy categories helped us to distinguish apparently similar genres that share more than one feature. Moreover, apparently very different genres, which could be located on the periphery, could be identified. At the same time, the idea of predominance of one feature over another, within the same criterion, was implemented as a way of including the idea that the objects under study are constituted by groups of complex co-occurring features and not by unique, simple and homogeneous characteristics. The search for a variable as predominating over others has been a governing principle of our analysis. This same approach is detected
throughout the chapters of this book, since the complex and multi-dimensional nature of the genres demands an ad hoc approach with similar characteristics. The five criteria identified and selected in this research, as well as the twenty-six more concrete and specific variables through which they operate, have allowed us to determine a set of twenty-nine discourse genres. Likewise, this complex multidimensional matrix has enabled the formulation of operational definitions, together with the exemplification and classification of each of the genres identified. This is the main contribution of this study. The construction of a multidimensional proposal from a complex and dynamic perspective of the genres of written discourse has also been one of the underlying purposes of this research. To this end, we have put into practice a complementary methodology in which deductive and inductive methods are merged. Undoubtedly, applying it to corpora of the size of the PUCV-2006 Professional and Academic Corpus, and to texts of such length, has been no easy task and posed a major challenge for the research team. We are sure that the construction of a complex taxonomy, as well as its application to specific genres, is an initial step that must continue towards progressive empirical contrasting. These diverse approximations and procedures must validate this proposal and will show its strengths and weaknesses. Thus, all the investigations reported in the following chapters of this book are based on this first classification. In each chapter, some of the genres identified herein are studied, and the particular questions that contribute in one way or another to the validation or refutation of this classification are duly investigated. Likewise, relevant details are provided that enable in-depth characterisation of some or all of these twenty-nine specialised written genres by means of the four disciplines involved.
Chapter 4
Academic and professional genres: Variations across disciplines
Giovanni Parodi
This chapter focuses on describing the largest available on-line corpora (59 million words) of written specialized Spanish on four disciplines: Psychology, Social Work, Industrial Chemistry, and Construction Engineering. The corpora were collected in one Chilean university and in the corresponding professional settings. The corpus description shows that access to disciplinary knowledge is constructed, on the one hand, through a varying repertoire of nine written academic genres. On the other hand, a larger group of twenty-eight professional genres is detected. Interesting variations across the four disciplines under study are identified.
Introduction
Recent research provides evidence of the reading comprehension problems readers face when processing both academic and professional specialised texts (Parodi 2005b, 2007a, b; Arnoux, Nogueira & Silvestri 2006; Peronard 2007a; Ibáñez 2007a). A similar situation has been documented for academic and professional writing (Parodi 2003; Marinkovich 2001–2002; Carlino 2005; Fernández & Carlino 2007). From the point of view of disciplinary texts, academic and professional genres involve a set of features that have not been described in depth, and thus it is not easy to help readers of such texts to gain deep levels of comprehension. This is partly responsible for negative epistemic attitudes towards specialised texts, based on the impression that these are inaccessible and inconsiderate (Kantor, Anderson & Armbruster 1983). For a novice reader, this means that gaining proper command of disciplinary specialisation becomes a slow and complex process. Few studies focus on the links between academic and professional discourse genres from a corpus perspective. A significant amount of research focuses on disciplines such as medicine, law, business, history, and the area of governmental organisations (Trosborg 2000; Gallardo 2005; Alcaráz Varó, Mateo & Yus 2007;
Ciapuscio 2007; Facchinetti 2007; Mahlberg & Teubert 2007; Candlin 2002; Devitt 2004), yet there are few robust empirical studies in other areas of knowledge (Bruce 2008; Biber, Connor & Upton 2007; Wignell 2007; Connor & Upton 2004; Curado, Edwards & Rico 2007; Flowerdew 2002a; Swales 2004; Vine 2004; Bargiela-Chiappini & Nickerson 1999). There are no previous examples of contrastive studies based on texts from undergraduate university programmes and their corresponding professional settings. Previous studies of Spanish focus mainly on specialised discourse or academic discourse (Cubo de Severino 2005; Castel, Aruani & Severino 2004; Harvey 2005; Ciapuscio 2003; Torner & Battaner 2005; Castelló 2007; Parodi 2004, 2005a, 2007a, b; Montolío 2002; Núñez, Muñoz & Mihovilovic 2006). The research study reported in this chapter is based on large contrastive corpora and moves towards a more ecological and representative description of variations existing across genres and disciplines. This chapter provides information about the processes of collecting, constructing and describing academic and professional written discourse corpora from four disciplines: Industrial Chemistry, Construction Engineering, Social Work, and Psychology. This is followed by a comparison of the genres identified along with an analysis of the findings. The theoretical framework for this study can be found in Chapter 2, where our concept of discourse genre is described in detail.
1. The research
1.1 Method
Our research objectives focus on the description of the written genres that are used in four university undergraduate programmes and their corresponding professional workplaces by collecting and studying the written texts that are read in these contexts and through which specialised knowledge provides access to disciplinary interactions. We examine assigned student readings in four academic degree programmes and the written texts that make up the core of daily written communication in professional settings related to these disciplines.
1.2 Collection of the academic corpus
The academic corpus was constructed by collecting 100% of the written material read during each year of the curriculum for each of the four undergraduate
Table 1. Steps in Academic Corpus collection and processing
Step 1: Construction of a database with all the curricula information from the four university degree programmes (including that of each course)
Step 2: Construction of a database from all obligatory and complementary bibliographic references included in each of the course programmes
Step 3: Preparation of a survey for all professors from each of the four programmes, including a request for complementary materials not included in the course programmes
Step 4: Collection of complementary material for each course, which the professors pass on to students in the form of handouts, digital files, and photocopied material
Step 5: Internet search in order to find the selected books available in digital format, thus minimising digitalisation efforts
Step 6: Collection of the texts from the corresponding libraries and from professors’ offices
Step 7: Process of photocopying each text with the aim of building a paper database
Step 8: Training of a team of assistants to scan and compile all texts
Step 9: Processing all the texts into plain text format (*.txt) using the “El Grial” tagger and parser (www.elgrial.cl)
programmes under study. Nine steps were followed in order to compile the academic corpus and set up a database, as outlined in Table 1.
1.3 Collection of the professional corpus
In order to create the professional corpus, all students from years 2000 to 2006 enrolled in the four degree programmes mentioned above were contacted and invited to participate in the research project. This first contact explored whether each subject would fit the profile of a graduate as determined by the degree programme. It is important to point out that we were not interested in studying the means by which the professional corpus is stored or transmitted (paper, electronic, the Internet, etc.). Our interest was focused on building a database of the main genres employed in each specialised professional field. Thus, this part of the study did not attempt a quantitative analysis of each of the identified genres, as was our aim with the academic corpus. The final objectives of this collection process were to identify the written communication practices of each working environment and to collect as much prototypical written material as possible from that read by the subjects in the course of their professional activities. Table 2 indicates the steps followed in order to build the professional corpus.
Table 2. Steps in the collection of the Professional Corpus
Step 1: Construction of a first database comprised of graduate professionals from the four degree programmes over a period of five years
Step 2: Selection of the number of professionals who met the prerequisite of working in the field of their professional qualification
Step 3: Telephone contact with the graduates and scheduling of interviews
Step 4: Carrying out interviews by means of an in situ group of assistants using an ad hoc protocol, in order to request examples of written texts used in the workplace on a daily basis
Step 5: E-mail contact with the participants enabling the delivery of other materials in electronic format, as agreed in the interview
Step 6: Search of libraries and companies, or an Internet search for certain materials mentioned by but not received from the professionals surveyed
Step 7: Construction of a second database using students completing their final supervised work experience in the four degree programmes
Step 8: Determination of an accessible number of interviewees per discipline or area of specialisation in order to contact or request materials for them
Step 9: Telephone or e-mail contact, or professor supervision of final work experience in order to set up interviews at the university with the aim of obtaining written texts used in the workplace on a daily basis
Step 10: Photocopying of all texts collected from the first and second groups with the aim of building a paper database
Step 11: Training of a team of assistants to scan and compile all texts
Step 12: Processing all the texts into plain text format (*.txt) using the “El Grial” tagger and parser (www.elgrial.cl)
1.4 The PUCV-2006 Corpus of Spanish: The disciplines
As mentioned above, the corpus was compiled from four university undergraduate programmes and from the four professional areas directly related to them. Table 3 shows the four academic and professional areas divided into scientific fields.

Table 3. Academic and professional areas
Academic and Professional Disciplines
Basic Sciences and Engineering: Construction Engineering (CE), Industrial Chemistry (IC)
Social Sciences and Humanities: Social Work (SW), Psychology (PSY)
As stated in the Introduction, the choice of these four disciplines is based on: (a) the exploration of areas different from those classically investigated in English and Spanish, such as law, medicine, economics, history and business; (b) our aim of contrasting, from different points of view, the genres and prototypical features of the texts used in undergraduate university education with those from the professional world in which these university studies are put into practice; and (c) a further interest in the contrast between the disciplines of Basic Sciences and Engineering (BS&E) and those of Social Sciences and Humanities (SS&H).
2. Results and discussion
The first part of this section contains a quantitative description of the corpus. Given the different collection processes used for the academic and professional corpora, these figures are shown separately.
2.1 Academic Corpus
This section presents a description of the Academic Corpus by means of some figures and tables. The idea is to give a general overview of the constitution of this corpus in the four disciplines involved. Figure 1 presents the general number of texts collected, expressed as percentages. A clear distinction between SS&H and BS&E emerges from these data: there is a greater number of texts to be read by students of SS&H. In other words, SS&H students are expected to read far more written material than are BS&E students.
[Figure 1: pie chart. Psychology 46%, Social Work 29%, Construction Engineering 14%, Industrial Chemistry 11%]
Figure 1. Distribution by degree programme in the PUCV-2006 Academic Corpus
Table 4. Constitution of the PUCV-2006 Academic Corpus of Spanish

                                 Number of Texts     %    Number of Words     %
Psychology (PSY)                             227    46         21,933,860    37
Social Work (SW)                             142    29         18,641,309    32
Construction Engineering (CE)                 69    14          8,734,086    15
Industrial Chemistry (IC)                     53    11          9,285,375    16
Total                                        491   100         58,594,630   100
These figures show that, for example, Psychology students are assigned roughly four times as many texts as Industrial Chemistry students (227 versus 53). Table 4 shows information in terms of the number of texts and the number of words for each of the degree programmes. As already mentioned, this heterogeneous percentage distribution confirms the concentration of written material in SS&H. Psychology not only has the largest number of texts (227) but also the largest number of words (21,933,860). Industrial Chemistry has the smallest number of texts (53). The figures in Table 4 are consistent with the observations drawn from Figure 1. Table 4 confirms the tendency towards a larger number of texts and words in the field of SS&H. This implies a possibly larger amount of reading, in terms of time devoted to academic activities and the scope of material to be processed. While the difference between SS&H and BS&E in terms of the number of words is important, it is not as marked as the difference in the number of texts. However, the fact that the share of words to be processed by students in the two SS&H degrees (37% and 32%) is twice that of the two BS&E degrees (15% and 16%) indicates a clear tendency towards a progressive increase in the number of words and the number of texts based on the undergraduate degree programme and the disciplinary area. The findings presented in Table 4 constitute a genuine milestone. There is no previous documentation on a written academic corpus with these features, in Spanish or in any other language; namely, a corpus that is subdivided into disciplines and in which texts are classified into discourse genres, including data on numbers of words and texts. These texts are also available on-line, free of charge for consultation (www.elgrial.cl). We use both communicative-functional and textual-discourse criteria in order to classify all texts in the corpus into specific discourse genres.
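The percentages reported in Table 4 can be recomputed directly from the raw counts. The short Python sketch below does so; the dictionary layout and the helper name `shares` are illustrative assumptions rather than part of the original study, and the figures are simply copied from Table 4.

```python
# Counts copied from Table 4 (PUCV-2006 Academic Corpus).
# The data layout and the helper name are illustrative assumptions.
TABLE_4 = {
    "PSY": {"texts": 227, "words": 21_933_860},
    "SW": {"texts": 142, "words": 18_641_309},
    "CE": {"texts": 69, "words": 8_734_086},
    "IC": {"texts": 53, "words": 9_285_375},
}


def shares(table, key):
    """Return each programme's percentage share of `key`, rounded to an integer."""
    total = sum(row[key] for row in table.values())
    return {name: round(100 * row[key] / total) for name, row in table.items()}


text_shares = shares(TABLE_4, "texts")
word_shares = shares(TABLE_4, "words")

# Ratio of Psychology to Industrial Chemistry, by texts and by words.
psy_ic_texts = TABLE_4["PSY"]["texts"] / TABLE_4["IC"]["texts"]
psy_ic_words = TABLE_4["PSY"]["words"] / TABLE_4["IC"]["words"]

print(text_shares)
print(word_shares)
print(round(psy_ic_texts, 1), round(psy_ic_words, 1))
```

Under these assumptions the computed shares match the rounded percentages in Table 4. The Psychology to Industrial Chemistry ratio comes out at about 4.3 for texts but only about 2.4 for words, which suggests that the contrast between the two programmes is sharper in number of texts than in number of words.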
Table 5 shows the nine genres that were identified and the figures for each type from the total of the PUCV-2006 Academic Corpus. Precise definitions, examples and procedures for each genre are given in Chapter 3. As can be seen in Table 5, a very heterogeneous distribution of genres emerges, with clear concentrations. Two genres are by far the most frequent: Disciplinary Text (DT) and Textbook (TB). This provides an initial overview combining
Table 5. Distribution of the Academic Corpus by genre

Genres                      Number of Texts
Dictionary (DC)                           2
Didactic Guideline (DG)                  41
Disciplinary Text (DT)                  270
Lecture (LECT)                            1
Regulation (REG)                         15
Report (REP)                             11
Research Article (RA)                    22
Test (TEST)                               3
Textbook (TB)                           126
Total                                   491
specialised communication conveyed through different genres, such as DT and TB. The DT deals with specialised disciplinary knowledge in the form of subject matter books (also called disciplinary books), sometimes evidencing a high level of complex and dense prose (see Chapter 10). Although the TB is also oriented towards disciplinary knowledge, TB has more explicit didactic communicative purposes intended for a relatively novice audience. The process of knowledge construction in the TB is supported by diverse pedagogical resources (graphs, tables, diagrams, etc.) in most cases and is also scaffolded for the reader by means of exercises, glossaries, and knowledge application didactic strategies (see Chapters 8 and 9). The corpus presented herein comparatively indicates a reduced number of Didactic Guidelines (DGs) (only 41 were found). Although this genre is the third most common in the corpus, we expected to find more of these, since texts have been collected from four university degree programmes and from academic curricula over a period of four to five years. This finding reveals that this is a commonplace genre in undergraduate university academic life, but that TB and DT are more common. The small number of Research Articles (RA) (22) is also of interest; this could reasonably be assumed to be associated with BS&E degrees. The few examples of RAs are another remarkable finding that may be explained by the fact that these are the most important medium for up-to-date and cutting-edge information, which may predominate in graduate rather than in undergraduate studies. Figure 2 presents the distribution of the academic genres across disciplines. It is not surprising to see that the TB turns out to be the most important academic genre. The TB is the only academic genre used across all four disciplines. This finding partially reveals the way in which students approach new knowledge in these four disciplines. The TB fulfils a clear didactic communicative purpose
[Figure 2: bar chart. Vertical axis: Presence within Areas (0 to 4); horizontal axis: genres (TB, DG, REG, DT, DIC, RA, REP, TEST, LECT); series: SW, PSY, CE, IC]
Figure 2. Distribution of academic genres by discipline
in undergraduate university settings and represents an important discursive tool used to open new pathways to disciplinary knowledge. Its prototypical rhetorical structure involves the presentation of concepts and definitions of increasing complexity, the posing and solving of problems and exercises, and, among other resources, the inclusion of glossaries (see Chapters 8 and 9 in this volume for a detailed description of TB’s rhetorical organisation). This shows that scientific information is codified in a precise sequence and that, at the same time, different instructional resources are displayed in order to support access to new information and test it through direct questions and problems. The distribution of the nine genres across the four disciplines indicates an interesting overlapping of academic genres and a reduced tendency towards exclusive discourse resources. Of the nine genres, 78% are found in at least two of the four disciplines being studied, and only 22% are found to be exclusive to one discipline. Psychology (PSY) is the discipline in which all nine genres are identified. PSY shows greater heterogeneity in discourse tools than the other three disciplines, in contrast to IC, which only has two genres (TB and DG). Four genres are detected as exclusive to the two disciplines of SS&H: RA, REP, TEST, and LECT. On the other hand, limited genre variation is detected in BS&E. Only two genres are used in IC and five in CE. These findings shed light on fundamental discourse distinctions between SS&H and BS&E, implying the existence of differences in linguistic, cognitive and social dimensions associated with genres.
Table 6. Constitution of the Professional Corpus

Area                        Number of Texts     %
Psychology                              220    50
Social Work                             101    23
Construction Engineering                 62    14
Industrial Chemistry                     59    13
Total                                   442   100
2.2 Professional Corpus
The following is a brief quantitative description of the Professional Corpus. As mentioned before, this collection of texts does not represent an exhaustive search of all the texts in circulation, but rather an attempt to build a general profile of the genres. Our aim was not to collect the largest number of texts possible but rather to obtain the most diverse variety of material circulating in these workplaces. Table 6 reveals a proportional similarity to the figures of the Academic Corpus as shown in Table 4. Although a similar collection procedure was used in all disciplinary areas, a larger quantity of available material was again found in the areas of SS&H (PSY accounting for 50% of the total). A decreasing percentage distribution can be observed, as in the case of the Academic Corpus, through PSY, SW, CE and IC. Surprisingly, this finding reveals an interesting distribution with respect to the number of written texts that circulate in both the university and the workplace in the four disciplines. In keeping with the same classification procedures used with the Academic Corpus, all the texts in the Professional Corpus were analysed by the research team in order to identify the emerging genres. A total of twenty-eight discourse genres was determined for the four disciplines. Precise definitions and examples of each professional genre may be found in Chapter 3 in this volume. All genres are listed alphabetically with their corresponding acronyms in Table 7. The range of genres identified based on texts collected from the workplace is more heterogeneous in nature than that emerging from the undergraduate university settings: roughly three times larger (twenty-eight genres versus nine). A preliminary analysis reveals that there are considerable knowledge specialisation and disciplinary restrictions for some genres (e.g., Call for Bids, Regulation, and Development Plan). There are also some more common genres in terms of general use (e.g., News, Dictionary, and Plan).
Some genres are the same as those identified in the Academic Corpus (e.g., Research Article, Disciplinary Text, and Textbook). Genres with a high degree of specialised communication were found in the Professional Corpus. Some of them clearly respond to specific communicative
Table 7. The twenty-eight genres of the Professional Corpus

Genre                      Acronym    Genre                      Acronym
Bidding Specification      BS         News                       NEW
Brochure                   BRO        Observation Guideline      OG
Calculation Memory         CM         Operating Manual           OM
Call for bids              CB         Plan                       PLA
Certificate                CERT       Quotation                  QUOT
Commercial Catalogue       CC         Record                     REC
Development Plan           DP         Regulation                 REG
Dictionary                 DIC        Report                     REP
Disciplinary Text          DT         Research Article           RA
Law                        LAW        Research Project           RP
Lecture                    LECT       Statement                  STA
Medical Order              MO         Test                       TEST
Medical Report             MR         Textbook                   TB
Memorandum                 ME         Thesis                     THE
purposes for discourse interactions such as business transactions, regulatory procedures, reporting and describing medical situations, dissemination of information, and knowledge acquisition. Some of these communicative interactions take, for example, the form of Bidding Specifications, Call for Bids, Quotations, Medical Reports, Brochures, and Textbooks. Considering cross-linguistic impact, the study conducted by Barbara and Scott (1999) on six texts (three in English and three in Brazilian Portuguese) addresses what they call Invitation for Bids (IFBs), which correspond to what we refer to as Call for Bids (CB). In their study, Barbara and Scott described linguistic features of the IFBs and also advanced a preliminary description of the basic rhetorical moves of this genre. Interestingly, they found discourse organisation and communicative purpose similar to what we found in our Spanish collection of CBs (see Chapter 3 in this volume for a description and definition of CB). Figure 3 illustrates the occurrence of each genre across the four disciplines. The most remarkable finding is that only three genres are shared by the four fields of professional activity: Research Article (RA), Brochure (BRO), and Report (REP). This finding empirically supports the assertion stated earlier regarding the possible occurrence of genres across the disciplines. All three involve a high degree of specialisation and are typical of discourse interactions in specific communication settings. It is not surprising to see that RAs appear in the four disciplines, since these represent a vital instrument for the acquisition of cutting-edge knowledge. This indicates that in the four specialised fields, this genre is used by professionals to access information on their respective disciplines as part of their daily working activities. BROs and REPs are also important in transmitting
[Figure 3: bar chart. Vertical axis: Disciplinary Area (0 to 4); horizontal axis: genres (RA, BRO, REP, LECT, TB, OM, REG, REC, CERT, ME, DT, CB, STA, LAW, DP, RP, THE, NEW, MR, MO, OG, TEST, CC, CM, PLA, QUOT, BS, DIC); series: SW, PSY, CE, IC]
Figure 3. Distribution of the professional genres by discipline
and receiving information. BROs provide relevant data and transmit this data to diverse audiences (a good example might be a medical brochure on contagious diseases). REPs provide information about situations, procedures or analyses for a case of commercial interest, medical status or business transactions in all four disciplines. Three genres were found in only two disciplines: CERT in PSY and CE; ME in IC and CE; and DT in SW and PSY. It is interesting that a genre like CERT circulates in disciplines under both SS&H and BS&E, while the remaining two are linked to certain specialties in both fields. ME appears only in the two so-called natural sciences, and DT is exclusive to the field of Social Sciences and Humanities, as was seen with the Academic Corpus. This is a consistent tendency that reveals a distinction in the transference and construction of disciplinary knowledge. While the ME is an important means of communication in BS&E and shows how the flow of information is accomplished in organisations (requesting, organising and directing actions), the DT reveals a special focus on the need for ongoing updating of knowledge, as codified in SS&H disciplinary books. This finding reveals varying requirements of discourse interactions in organisations of a different nature. At the same time, as mentioned before, the RA was found in all four disciplines, reflecting the need for updated information in all professions. Nevertheless, distinctive resources that can be used to meet this communicative purpose also emerged for the SS&H. This finding shows how knowledge is
constructed and reproduced differently in the disciplinary areas involved in this study, in both professional and academic settings. Of the twenty-eight genres, only eleven are found in two, three or four disciplines. The other seventeen are exclusive to only one of the four disciplines. This means that 61% of the genres are specific to only one discipline, indicating the specialised nature of disciplinary communication and the need to construct specific discursive tools to fulfil specific communicative purposes. Similarly, 39% of the genres indicate some degree of variability in distribution over the four disciplines, and only about 11% (three genres) appear in all four.
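The proportions quoted above follow from simple arithmetic on the genre counts. The Python sketch below recomputes them; the constant and function names are illustrative assumptions, and the counts are those given in the text (twenty-eight genres, seventeen exclusive to one discipline, eleven found in two or more, three found in all four).

```python
# Counts taken from the text; names are illustrative assumptions.
TOTAL_GENRES = 28      # genres identified in the Professional Corpus
EXCLUSIVE_TO_ONE = 17  # genres found in a single discipline
IN_TWO_OR_MORE = 11    # genres found in two, three or four disciplines
IN_ALL_FOUR = 3        # RA, BRO and REP


def percentage(part, whole):
    """Percentage of `whole` represented by `part`, to one decimal place."""
    return round(100 * part / whole, 1)


exclusive_pct = percentage(EXCLUSIVE_TO_ONE, TOTAL_GENRES)
shared_pct = percentage(IN_TWO_OR_MORE, TOTAL_GENRES)
all_four_pct = percentage(IN_ALL_FOUR, TOTAL_GENRES)

print(exclusive_pct, shared_pct, all_four_pct)
```

To one decimal place, the values are 60.7%, 39.3% and 10.7%, so the first two round to the 61% and 39% cited in the text, and the three genres shared by all four disciplines amount to just under 11% of the total.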
2.3 Academic and professional genres: Interaction and overlapping
The final part of this chapter compares the Academic and the Professional Corpus. In order to identify the genres that are shared across Academic and Professional domains and those that are exclusive, Figure 4 shows Venn diagrams of the data. Only the Didactic Guideline (DG) is exclusive to the Academic Corpus, while there are twenty genres exclusive to the Professional Corpus. Eight genres were found to be shared and used in both academic and professional settings. These findings reveal empirical information not previously identified in the specialised literature: few genres can be identified as prototypically academic. These findings lead us to classify Academic Discourse as a Miscellaneous Discourse, one which is generally made up of a group of genres that did not originate in academic settings and do not primarily fulfil academic purposes (for a detailed analysis of this kind of discourse, see Chapter 5). Directly connected to the previous idea of a miscellaneous discourse, the DG has been identified as the only genre exclusive to higher education in these four
[Figure 4: Venn diagram. Academic Corpus only: DG. Intersection between the Academic Corpus and the Professional Corpus: DIC, DT, LECT, REG, REP, RA, TEST, TB. Professional Corpus only: BS, BRO, CM, CB, CERT, CC, DP, LAW, MO, MR, ME, NEW, OG, OM, PLA, QUOT, REC, RP, STA, THE]
Figure 4. Genres between Academic Corpus and Professional Corpus (irrespective of discipline)
undergraduate university degree programmes. This finding may come as no surprise, but its empirical nature shows how university instruction develops and uses specific discourse resources to practise taught content and help students to access disciplinary information. It is also evident that this written vehicle constructs and opens new knowledge spaces that put applied or theoretical contents into practice in a decidedly didactic way. Eight genres are shared between the academic and professional settings. These constitute a bridge between the academic world and professional life. For example, TB, RA, DIC, and DT are genres that serve communicative purposes at the university and in the workplace as repositories of disciplinary knowledge. Interestingly, these four genres most likely serve similar functions in the workplace, although specific aims of their use may vary substantially. The profile emerging from Figure 4 reveals a continuum of access and incorporation across the discourse community even though it may not be a response to conscious and careful planning. In other words, this can be viewed as proceeding from an initial academic point, starting with didactic and disseminating genres (such as DG and TB), as well as others with more specialised features that will also subsequently be found in professional settings. The extreme of the continuum, as shown in Figure 4, is progressively displayed to the professional subject only once he/she enters the professional world, as new communicative demands need to be fulfilled. Conversely, the reduced variety of genres found in academic settings (only nine), in comparison with the more widespread and diverse number detected in the workplace, could present an obstacle when going from academic to professional life.
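The overlap summarised in Figure 4 can be reproduced with basic set operations on the genre acronyms listed in Tables 5 and 7. The following Python sketch is illustrative only: the variable names are assumptions, and the Dictionary genre is normalised to the acronym DIC (Table 5 labels it DC, Table 7 DIC).

```python
# Genre acronyms from Table 5 (Academic Corpus). Dictionary is written as
# "DIC" here to match Table 7; Table 5 itself uses "DC" (an assumption made
# for the purpose of comparison).
ACADEMIC = {"DIC", "DG", "DT", "LECT", "REG", "REP", "RA", "TEST", "TB"}

# Genre acronyms from Table 7 (Professional Corpus).
PROFESSIONAL = {
    "BS", "BRO", "CM", "CB", "CERT", "CC", "DP", "DIC", "DT", "LAW",
    "LECT", "MO", "MR", "ME", "NEW", "OG", "OM", "PLA", "QUOT", "REC",
    "REG", "REP", "RA", "RP", "STA", "TEST", "TB", "THE",
}

shared = ACADEMIC & PROFESSIONAL             # genres found in both corpora
academic_only = ACADEMIC - PROFESSIONAL      # exclusive to the Academic Corpus
professional_only = PROFESSIONAL - ACADEMIC  # exclusive to the Professional Corpus

print(sorted(shared))
print(sorted(academic_only))
print(len(professional_only))
```

With these inputs, the intersection contains eight genres, the Academic Corpus has a single exclusive genre (DG), and twenty genres are exclusive to the Professional Corpus, in line with the counts reported for Figure 4.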
The discourse genres collected from professional settings vary greatly in comparison to those identified in the academic world with respect to rhetorical organisations and communicative purposes, as well as lexicogrammatical features. This may well lead to slowing down the integration process, which could affect initial performance of these graduates in the workplace. This raises several questions: Should the academic world include the study of genres found in the workplace? Or should institutions and professional organisations assume responsibility for developing the discursive skills required to comprehend and produce the specialised written genres they work with every day? Is professional literacy in reading and writing genres truly an issue for professional settings? Figure 5 displays only those genres that appear at the same time in both corpora (Academic and Professional) and in the same discipline. If a particular genre appears in one discipline's academic environment but is not detected in the professional setting of that same discipline, it will not be included in this figure.
[Figure 5: diagram of the intersection between the Academic Corpus and the Professional Corpus within the four disciplinary areas (IC, CE, SW, PSY), grouped into the Basic Sciences and Engineering domain and the Social Sciences domain. Genres plotted: REG, TB, REP, DT, RA, LECT. Legend: academic and professional genres within SW and PSY; academic and professional genres within IC and CE.]
Figure 5. Academic and professional genres throughout the four disciplines
No single genre is found in both the Academic Corpus and the Professional Corpus at the same time and within all four disciplines. This means that there is no genre typifying written discourse across the four disciplines that can be easily transferred from the academic to the professional setting. In fact, each genre tends to reveal certain specific features that are very idiosyncratic and conventionalised in response to specific communicative purposes. As indicated in Figure 5, only six genres are found with a diverse degree of commonality across the four disciplines: TB, DT, RA, REP, LECT, and REG. Only one genre is shared by the AC and the PC across three disciplines (IC, SW, and PSY): the TB. As already said, the TB is a conventionalised form of knowledge dissemination supported by various didactic resources. This genre (TB) is meant to serve the goals of specific discourse communities that share purposes common to the academic and professional communities. As stated above, the TB also corresponds to conventionalised communicative settings in routine interactions in which didactic resources such as the definition, explanation, and exemplification of concepts and theories are introduced progressively (see Chapters 8 and 9). These findings show that there are genres found in more than one of the four disciplines being studied, although not in all four and in both corpora (AC and PC). This supports a tendency towards the specialisation of certain genres in certain areas of knowledge (LECT in PSY). This is also evident in the fact that the TB is the only genre of the academic and professional corpora that is shared by
three disciplines. Thus, the TB functions as a mediator between less specialised knowledge and disciplinary specialisation. As was previously stated, its educational orientation makes it a useful discourse tool in both university and professional settings – though, of course, not in all four fields. The data contained in Figure 5 also show the existence of more genres interacting between academic and professional settings in the disciplines of SS&H. A total of four genres appear in PSY and SW (DT, REP, RA and LECT), while only one is detected in CE (REG). No genre, with the exception of the TB, is found in the AC and PC in IC. This reveals the specificity of and difference between genres existing in undergraduate university and professional settings in the two BS&E disciplines.
The profile emerging from these data helps to determine dimensions of the corpora, as well as, for example, links between disciplines and their variety of identified genres. This leads to the detection of a clear tendency towards a progressive increase or decrease in quantitative terms relative to the degree programme with respect to the number of texts, the total number of words, the discipline, and the existing variety of genres in both academic and professional settings. Figure 6 displays these regularities in an inverted pyramid design.
Undergraduate   Number of texts              Number of words
programme       Academic    Professional     Academic
PSY             227         220              21,933,860
SW              142         101              18,641,309
CE              69          62               8,734,086
IC              53          59               9,285,375
Science area    Genre variety
                Academic    Professional
SS&H            14          29
BS&E            7           21
Figure 6. Dimensions of corpora variables
As stated above, a regular pattern emerges from the variables of the corpora. This indicates that if we analyse SS&H, a large number of texts and words are detected in both the Academic and the Professional Corpus. But, comparatively speaking, if we examine the disciplines of BS&E, the number of texts and the number of words show a progressive reduction. For example, the opposition in quantitative terms between Psychology and Construction Engineering is quite revealing, whether we are analysing the academic or the professional corpus. Therefore, if we analyse SS&H (or more specifically, PSY) in general terms, the result is always the same: a clear progressive reduction of the figures for the corpora from SS&H to BS&E. This progressive quantitative distribution associated with the undergraduate degree programmes, disciplines, and genres represents a finding that needs to be corroborated by other corpora in other universities and professional settings. Of course, it is also possible to use these data to establish new research studies and thus generate new hypotheses to explore.
Conclusions
The conclusions are focused on the Academic Corpus, the Professional Corpus, and the similarities or differences between them.
Regarding the Academic Corpus:
1. There are important numerical differences in both the number of texts and the number of words between the SS&H and the BS&E. This is related to the different amount of written material that students are exposed to in their education, and also to a greater variety of genres that circulate in some university degree programmes. The implications for the construction of disciplinary knowledge, for the variation in discourse genres, and for the development of skills in the comprehension of written text are numerous and open to further investigation.
2. Two genres appear most often in this corpus: the Disciplinary Text (270 texts) and the Textbook (126 texts). This shows a balance between a highly specialised genre with almost no concern for laymen or semi-laymen (DT) compared with the special attention given to supporting learners through the use of didactic resources in order to gradually open access to disciplinary knowledge (TB).
3. There is greater variety in the genres of the SS&H (5 in SW and 9 in PSY) than in those of the BS&E (2 in IC and 5 in CE). PSY constitutes the most heterogeneous of the disciplines, as it reveals examples of all the genres identified in this corpus. Subsequent research should focus on a detailed and comparative study of these genres throughout more disciplines in order to detect more similarities and differences.
Regarding the Professional Corpus:
1. There is greater diversity of genres than in the AC. The twenty-eight genres identified reveal varying written means of communication, some of which are very prototypical to the disciplines and some of which are highly specialised for specific contexts and functions (Calculation Memory, Bidding Specification, Medical Report, and Observation Guideline).
2. Of the twenty-eight genres, only eleven are found in more than one of the disciplines, and 61% of the total number corresponds to only one discipline. Therefore, the aforementioned specialisation is directly related to one particular area of knowledge.
3. The absence of the majority of these genres in the academic corpus implies that professionals meet and learn to deal with them directly in the workplace, without specific education or previous knowledge of the genres.
4. These genres must be explored in greater detail in order to obtain more information about their prototypical features, discourse organisation mode, rhetorical structure, linguistic features, etc.
Regarding both corpora:
1. Most genres collected from written professional communication reveal a repertoire that does not occur in academic settings.
2. There is no discourse genre that co-exists in both the AC and the PC and which is also present across the four disciplines. Findings reveal that the disciplines have a genre-specific character in both the undergraduate university and in professional settings. Therefore, there is no means of crossing from the academic world to the professional world within the four disciplines in the form of a common prototypical discourse pathway.
3. The DG emerges as the only genre to distinguish the academic corpus from the professional corpus throughout all four disciplines. Interestingly, the DG is a genre characterised by a special emphasis on didactic resources and institutional tasks, features which are prototypical in an undergraduate educational environment.
4. There are few intersection points between the two corpora in terms of genres that coincide within the four disciplines. Only six genres cross over.
5. Of these six genres, a total of four (67%) are found exclusively in the academic and professional settings of SS&H.
The analysis provided by the genre descriptions helped to uncover fundamental discourse and genre distinctions between SS&H and BS&E. Interesting differences emerged from the comparison of academic and professional corpora in similar disciplinary domains. It is clear that the texts employed as reading
material in some academic disciplines (hard sciences) are not the same as those in other scientific areas (soft sciences). The findings show that the discourse of Social Sciences and Humanities is constructed and re-constructed through different linguistic, cognitive and social dimensions, as opposed to that of Basic Sciences and Engineering. This implies that the members of the corresponding discourse communities interact through different conventionalised social tools and construct different meanings that employ different discursive mechanisms. All of the above supports the claim that cognitive representations of meaning should be different in some way. As claimed in our theoretical framework presented in Chapter 2, the formal features of language reveal the meaning-making process, and if these repeated patterns of usage are recorded in corpora, patterns of meanings become visible (genres). This is what was found through the process of genre identification and through the results of comparing the genres in the four scientific disciplines. SS&H and BS&E build and characterise social relationships in divergent forms, and thus there are systematic distinctions and explanatory hierarchies that are expressed through the whole of language use. Full participation in disciplinary and professional cultures requires informed knowledge of key written genres. These are intimately linked to a discipline’s norms, values and ideologies. They package information used by professionals to communicate with their peers. Understanding the genres of written communication in a disciplinary field is, therefore, essential for professional success. The discourse and cognitive demands that these genres impose upon professionals cannot be determined in the framework of the present study, but it is a challenging open area of research. 
Research studies similar to this study also have pedagogical implications concerning: (a) the selection of written genres, (b) the development of teaching materials, and (c) the preparation of language tests of various kinds, such as the assessment of disciplinary contents and of specialised discourse comprehension. This is because the genres detected in each discipline constitute the language use employed in the written communication to which students and professionals are exposed. According to these results, students from the four disciplines being studied need to develop their discourse and cognitive skills in order to master very specific and specialised varieties of written Spanish.
Chapter 5
University academic genres: A miscellaneous discourse
Giovanni Parodi
The issue of disciplinarity is increasingly salient in language studies. Questions as to how undergraduate students construct disciplinary knowledge from the texts included in the university curricula and how the knowledge structure involved in these texts helps to shape educational experiences and outcomes are the focus of studies across a variety of disciplines, using a wide range of approaches. One way to access the specialised written genres employed by academia is to begin from the tenet that all materials read by students during their university training reveal relevant data about disciplinary genres. This chapter presents an in-depth description of the collection and construction of the PUCV-2006 Academic Corpus, which comprises almost 59 million words, in four disciplines: Industrial Chemistry, Construction Engineering, Social Work, and Psychology. In addition, a genre typology emerged from the 491 texts that comprise this corpus.
Introduction
What are the prototypical written genres that circulate across different academic disciplinary domains of knowledge? What are the most common written discourse devices that help novices to become experts in specialised knowledge? Specialised written discourse literacy in academic university settings has just started to be explored in most countries of the world. Thus, one way to access disciplinary written genres employed by academia in varying settings is to begin with the assumption that all materials read in these contexts reveal relevant data about the means of written communication and knowledge organisations.
Empirical findings from diverse linguistic approaches have documented the relevance of analysis based on corpora as a way of advancing and describing linguistic and discourse variations in greater detail through the disciplines and through prototypical genres (Biber 1988, 1993, 1995, 2005, 2006; Biber, Connor
& Upton 2007; Martin & Veel 1998; Wignell 1998; Williams 1998; Swales 1990, 2004; Flowerdew 2002a; Parodi 2005a, 2006b, 2007a, b). At present, it is increasingly common to find research studies based on large corpora of natural complete texts that employ computational support and statistical techniques. New frontiers are being explored from these perspectives, and refreshing and robust empirical results open interesting new areas of study. Research into the English language, as well as certain European and Asian languages, has revealed that linguistic studies based on large corpora of digital texts do not always corroborate researchers’ initial intuitions. Genre descriptions must be based on sufficient texts of naturally occurring language use in order to ensure that the regularities and patterns observed reveal actual characteristics of the genres under study. The use of computer-readable corpora as well as the availability of computer programmes has boosted linguistic research to a previously unpredictable extent.
Research describing specialised genres based on Spanish corpora and used in academic settings is relatively scant. Most of the research produced in the Spanish language on specialised discourse addresses the so-called specialised-disseminating discourse of magazines, journals, and newspapers (Cademártori 2003; Calsamiglia 2000; Ciapuscio 2003; López 2002). Alternatively, it focuses on discourse markers in a variety of genres (Martín Zorraquino & Portolés 1999; Montolío 2001; Portolés 1998). Other studies concentrate on a few linguistic features in small and exemplary texts (Harvey 2002) or on specialised terminology (Cabré 1999, 2002; Lorente 2002). It is difficult to find research concerning Spanish genre identification and descriptions based on large corpora of texts that are fundamental reading material in undergraduate university programmes across disciplinary domains.
The relevance of a corpus-based approach is that it facilitates construction of empirically founded descriptions to study genre variations. Supported by the computational advances that make it possible to work with growing corpora, researchers capitalise on corpus linguistics principles and may explore all dimensions of language, including semantics and multimodal discourse. In the fifth chapter of this book, we will report on a research study based upon the largest available on-line corpora (59 million words) of written specialised discourses of academic Spanish on four disciplines: Psychology, Social Work, Industrial Chemistry, and Construction Engineering. A more in-depth description and analysis of the 491 texts is presented, complementing the information analysed in Chapter 4. Some of the genres detected are shared by some disciplines, but important distinctions are also detected. Thus, it is clear that access to disciplinary knowledge is constructed by means of a varying repertoire of written genres depending on the disciplinary domain. This heterogeneous identification of emerging genres has led us to propose that academic discourse could be designated as miscellaneous discourse.
1. Theoretical framework
Given the lack of robust available educational discursive data and the scarcity of knowledge about materials for improving the discourse skills of undergraduate university students, our research is focused in this direction. Our initial steps were to collect and study the written texts that a group of university students were assigned as reading material (providing them with knowledge specific to their discipline) in four academic undergraduate degree programmes. It is essential to state some fundamental assumptions for this research. On the one hand, the approach taken towards discourse is decidedly interdisciplinary and of a psychosociolinguistic nature. Hence, the texts under analysis are linguistic and semantic units immersed in cognitive and social contexts; their function is determined cognitively and contextually. Texts are linguistic units with meanings produced by writers/speakers and readers/hearers in specific contexts and with specific purposes, with prior knowledge constructed from human cognition in specific social contexts and stored in their minds. In other words, the texts are conceived of as processes and products of cognition and context. At the same time, these are considered to be discursive tools that partly help human beings to acquire knowledge and then actively and consciously reproduce this knowledge.
1.1 Specialised discourse: Academic settings
The notion of specialised discourse (SD) acts as a general framework that includes other objects of study. Academic discourse (AD) and professional discourse (PD) are analyzed as part of SD. The use of the term “specialised discourse” is currently widely accepted by most language scholars. However, since its initial use, the term has been employed to express a variety of meanings. Hence, SD includes a varying group of texts, but with certain prototypical features. It is precisely this idea of the heterogeneity of texts within a scale of gradation that Parodi (2005a) applies when approaching the notion of SD. According to this conception, SD must necessarily be understood as a continuum in which texts are aligned along a diversified gradient that runs from a high degree to a low degree of specialisation. Thus, SD could be conceived of as a hyperonym of AD and PD. Parodi (2005a) defines SD by using a series of characterizing co-occurring linguistic features. Many researchers also agree that there is a set of features that identify SD, and many of them believe that the specialised lexicon is highly important since it constitutes one relevant nucleus (Cabré 1993; Burdach 2000; Cabré, Doménech, Morel & Rodríguez 2001; Ciapuscio 2003). Academic and professional discourses are made operational by means of a set of texts that can be organised along a continuum in which the
texts are linked together, from general school discourse to university academic discourse to professional discourse in a workplace environment. This is presented graphically in Figure 1.
Figure 1. Continuum of texts in academic and professional fields
Figure 1 illustrates the conception of specialised discourse in academic and professional settings along a continuum of sequential possible interactions. These interactions are thought of from a student’s perspective in which a continuing and updating process of knowledge organization may take place. This distribution of specialised knowledge construction through discourse tools is proposed from the perspective of a learner who faces alternative possibilities in the process of instruction, not necessarily completing all the levels involved. In keeping with the presentation of data in this figure, it can be said that scientific discourse arises in part from AD and, in turn, is linked to and interacts with PD. In other words, this is not the point of view of a researcher or university professor, because the interactions would be different. For example, if research articles are considered, it is highly possible that they would be found in both academic and professional contexts, since these may be discourse tools employed at both levels.
Any newcomer to the study of AD will discover a diversity of approaches and perspectives that hamper initial understanding of the field. As Flowerdew (2002a) suggests, there has been little systematic research into exactly what AD is. When a novice undertakes the study of this kind of discourse, the following questions could be addressed: (a) Are there any existing criteria that accurately define AD? If so, (b) what kind of criteria are they? In trying to briefly answer these questions, three approaches to AD will be considered below: (1) a functional communicative approach, (2) a contextual approach, and (3) a textual approach. In functional communicative terms, AD is characterised by the predominance of descriptions with persuasive and didactic purposes.
Furthermore, it is a
type of discourse that expresses credibility and prestige. From the contextual criterion, AD is employed in academic contexts or for academic purposes (Kennedy 2001; Flowerdew 2002a; Dudley-Evans & St. John 2006). However, it is evident that academic purposes are varied and not always easy to determine, making the criteria rather complex. This is due to the fact that AD does not have clear limits and may be confused with or assimilated into other types of discourse in adjacent fields, such as technical-scientific discourse, professional discourse, pedagogic discourse, or institutional discourse (López 2002; Flowerdew 2002a). From a contextual viewpoint, Hyland (2000) argues that identification of the participants involved in interactions is essential when it comes to characterising AD; in other words, analyzing the texts as social practices is decisive. This approach includes an analysis of the communication media through which these texts circulate and are used (Gunnarsson 1997). Thus, AD is considered as an expression of a specific community (Valle 1997). Due to the fact that AD is oriented towards the dissemination of knowledge, generally through definitions, classifications, and explanations (Wignell 1998), its linguistic features attempt to produce an effect of clarity and even objectivity, avoiding ambiguity and erroneous interpretation. Moreover, in the interest of conciseness, AD tends towards an economy of words, an absence of empty adjectives, and the elimination of redundancy and repetition. It maintains controlled syntax in an established order and has a higher proportion of nominalisations expressed by words that are typical of specialised discourse (Ciapuscio 1992; Halliday 1993; Lang 1997; Gotti 2003; Charaudeau 2004; Parodi & Venegas 2004; Cademártori, Parodi & Venegas 2006). 
Non-linguistic aspects frequently found in this type of discourse, such as chemical formulas, physics equations, virtual recreations, mathematical representations, and symbols, must also be considered. In addition, items such as graphs, tables, figures, diagrams, and other graphic representations are found in this discourse. In view of the above aspects, Lemke (1998) suggests speaking about this discourse as a hybrid, semiotic system. These complex units have also been studied as multimodal texts (Kress & van Leeuwen 2001). In sum, our approach towards the study of academic discourse is decidedly interdisciplinary and of a psychosociolinguistic nature (Parodi 2005b, 2007a, 2008a, b). The texts collected are considered as units produced in cognitive and social contexts, that is, as texts whose function is constructed through complex semiotic and cognitive processing.
2. The research: Describing the PUCV-2006 Academic Corpus
Given the above framework, this research carries out a descriptive-comparative study of linguistic-textual order, beginning with the texts that are read in the academic areas of Basic Sciences and Engineering (BS&E), as well as in the Social Sciences and Humanities (SS&H). This will be accomplished by collecting and examining an academic corpus using a methodology based on corpus linguistics principles (Sinclair 1991; Leech 1991; Stubbs 1996, 2006; Tognini-Bonelli 2001; Teubert 2005; Parodi 2006b, 2007a, c). The Academic Corpus will be determined by four undergraduate degree programmes offered by Pontificia Universidad Católica de Valparaíso, Chile: Industrial Chemistry, Construction Engineering, Social Work and Psychology. The texts that comprise the corpora are collected based on criteria that are highly representative and accurately reflect the academic environment. In this chapter, a description of the collecting processes and a preliminary quantitative and qualitative analysis are executed, as indicated in the Introduction. Special emphasis will be placed on genre classification emerging from texts collected in the four undergraduate university programmes.
2.1 Constitution of PUCV-2006 Corpus
As stated above, the aim is to collect 100% or as close to 100% as possible, of the written material that is required reading or reference material for all requisite courses in the respective undergraduate university degree programmes. The specific research methodology is divided into different stages according to the status and focus of this material (see Table 1). Once all of these steps had been completed, we were sure to have collected the universe of written texts to which students belonging to these four university programmes are exposed and which circulates as possible bibliographical sources for accessing disciplinary knowledge.
3. Results and discussion
Two types of results will be presented below. First, we focus on the constitution of the PUCV-2006 Academic Corpus, as distributed among the academic disciplines and the four undergraduate programmes, in quantitative terms. In addition, the first classification of the texts comprising the total PUCV-2006 Academic Corpus is described in terms of genres. This description identifies nine genres and quantifies the overall occurrence of texts in the corpora.
Table 1. Steps used to collect and process the Academic Corpus
Step 1: Construction of a database with all the information from the curricula of the four university degree programmes (including information for each course)
Step 2: Construction of a database using all obligatory and complementary bibliographical references included in each of the course programmes
Step 3: Preparation of a survey for all professors from each of the four programmes, including a request for complementary materials not included in the course programmes
Step 4: Collection of complementary material for each course, which professors give to students in the form of handouts, digital files, and photocopied material
Step 5: Internet search in order to find the selected books available in digital format, thus minimising digitalisation efforts
Step 6: Collection of the texts from the corresponding libraries and from the professors’ offices
Step 7: Process of photocopying each text with the aim of building a paper database
Step 8: Training of a team of assistants to scan and compile all texts
Step 9: Processing of all the texts into plain text format (*.txt) using the “El Grial” tagger and parser (www.elgrial.cl)
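Once the texts are available as plain text files (Step 9), corpus dimensions such as the number of texts and the number of words per programme can be tallied automatically. The following Python sketch illustrates one possible way of doing so; it is not the procedure implemented in the El Grial tools. The directory layout (one sub-folder per degree programme containing *.txt files), the function names, and the simple regular-expression tokeniser are all assumptions introduced for illustration, so its counts would not necessarily match those reported for the corpus.

```python
import re
from pathlib import Path
from typing import Dict, Tuple

# Assumption: a "word" is any run of letters, digits or underscores.
# In Python 3, \w is Unicode-aware for str patterns, so accented
# Spanish characters (á, ñ, ü, ...) stay inside a single token.
TOKEN = re.compile(r"\w+")


def count_words(text: str) -> int:
    """Count the tokens of a text under the simple definition above."""
    return len(TOKEN.findall(text))


def corpus_dimensions(root: Path) -> Dict[str, Tuple[int, int]]:
    """Return {programme: (number of texts, number of words)}.

    Hypothetical layout: root/<programme>/<text>.txt, UTF-8 encoded.
    """
    dimensions = {}
    for programme_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        n_texts = 0
        n_words = 0
        for txt_file in sorted(programme_dir.glob("*.txt")):
            n_texts += 1
            n_words += count_words(txt_file.read_text(encoding="utf-8"))
        dimensions[programme_dir.name] = (n_texts, n_words)
    return dimensions
```

A call such as `corpus_dimensions(Path("corpus"))`, where `corpus` is a hypothetical folder name, would return one (texts, words) pair per programme folder, which is the kind of information summarised in the tables of this section.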
3.1 Quantitative results
Distribution of the 491 texts in the PUCV-2006 Academic Corpus is presented in Table 2.
Table 2. Constitution of the PUCV-2006 Academic Corpus: Number of texts
Science field                            University programme       Number of documents
Basic Sciences and Engineering (BS&E)    Construction Engineering   69
                                         Industrial Chemistry       53
Social Sciences and Humanities (SS&H)    Social Work                142
                                         Psychology                 227
Total                                                               491
Table 2 shows the high degree of diversity in terms of the number of texts collected in each discipline. In addition, there is a progressive increase in and a substantial difference between the quantity of texts in the fields of BS&E and those of SS&H, as well as a considerable difference between the specific degree programmes. Preliminary interpretation might lead to the impression that students in the SS&H have to read much more than students in BS&E. It might even be said that Psychology students read up to four times more than Industrial
Chemistry students. However, before making such claims, a very relevant issue must be taken into account: the length of each text in question, which may completely alter first impressions based on preliminary data. In order to clear up these data as to the length of the texts and to estimate the overall size of the corpus, Table 3 provides figures evidencing a more complete panorama. It is worth pointing out that, in order to maintain the lowest to highest progressive order, this table has changed the positions of Industrial Chemistry and Construction Engineering. Industrial Chemistry has a smaller number of texts, and Construction Engineering now has a considerably reduced number of total words.
Table 3. PUCV-2006 Academic Corpus: Number of texts and words
                           Number of texts   %     Number of words   %
Psychology                 227               46    22,163,379        39
Social Work                142               29    16,343,175        30
Construction Engineering   69                14    8,813,663         15
Industrial Chemistry       53                11    9,304,407         16
Total                      491               100   56,624,624        100
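The percentage columns of Table 3 can be re-derived from the raw counts, and the two disciplinary areas can be aggregated for comparison. The short Python sketch below shows this arithmetic; only the counts come from the table, while the variable and function names are ours and purely illustrative.

```python
# Raw counts copied from Table 3: programme -> (number of texts, number of words).
TABLE_3 = {
    "Psychology": (227, 22_163_379),
    "Social Work": (142, 16_343_175),
    "Construction Engineering": (69, 8_813_663),
    "Industrial Chemistry": (53, 9_304_407),
}

# Grouping of programmes into the two disciplinary areas used in the chapter.
AREAS = {
    "SS&H": ("Psychology", "Social Work"),
    "BS&E": ("Construction Engineering", "Industrial Chemistry"),
}


def percentage(part, whole):
    """Express part as a percentage of whole."""
    return 100.0 * part / whole


def shares(table):
    """Return {programme: (% of texts, % of words)} relative to the corpus totals."""
    total_texts = sum(texts for texts, _ in table.values())
    total_words = sum(words for _, words in table.values())
    return {
        name: (percentage(texts, total_texts), percentage(words, total_words))
        for name, (texts, words) in table.items()
    }


def area_totals(table, areas):
    """Return {area: (number of texts, number of words)} summed over its programmes."""
    return {
        area: (
            sum(table[name][0] for name in programmes),
            sum(table[name][1] for name in programmes),
        )
        for area, programmes in areas.items()
    }


if __name__ == "__main__":
    for name, (pct_texts, pct_words) in shares(TABLE_3).items():
        print(f"{name:26s} texts {pct_texts:5.1f}%   words {pct_words:5.1f}%")
    totals = area_totals(TABLE_3, AREAS)
    print("SS&H / BS&E word ratio:", round(totals["SS&H"][1] / totals["BS&E"][1], 2))
```

Aggregated in this way, the SS&H programmes contribute somewhat more than twice as many words as the BS&E programmes, which is the comparison drawn in the discussion of the table.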
The data contained in Table 3 are very revealing and match the ideas formulated according to Table 2. In fact, these findings help confirm the tendency towards higher figures in the disciplinary area of Social Sciences and Humanities compared to Basic Sciences and Engineering – not only in terms of the number of texts, but also in terms of the number of words. This may reflect and demand a more extensive reading process in one disciplinary domain than in another, considering the required time of study. Table 3 shows that students in Psychology and Social Work (39% and 30%) read (in terms of number of words) more than twice as much as students in Construction Engineering and Industrial Chemistry (15% and 16%). This same comparison in terms of books is doubled, to the extent that there is almost a 400% increase from one group to the other in this respect. Therefore, as already pointed out by Parodi (2007d), there is a growing and progressive tendency to find connections among the number of texts and the number of words, the university programme, and the disciplinary domain to which the texts belong.
A study designed to classify all texts in the corpora into specific discourse genres was conducted in order to carry out a more in-depth analysis of the written material collected. We used both communicative-functional and textual-discourse criteria (for precise definitions and procedures for each genre, see Chapter 3). Table 4 shows the nine genres that were identified and the figures for each type from the total of the PUCV-2006 Academic Corpus.
Table 4. Distribution by genre
Genres in the PUCV-2006 Corpus   Number of texts
Dictionary (DC)                  2
Didactic Guideline (DG)          41
Disciplinary Text (DT)           270
Lecture (LECT)                   1
Regulation (REG)                 15
Report (REP)                     11
Research Article (RA)            22
Test (TEST)                      3
Textbook (TB)                    126
Total                            491
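The concentration of the corpus in a small number of genres can be made explicit by ranking the counts in Table 4 and expressing each as a share of the 491 texts. The Python sketch below is illustrative only: the counts are copied from the table, while the names `GENRE_COUNTS`, `ranked_shares` and `cumulative_share` are hypothetical.

```python
# Number of texts per genre, copied from Table 4.
GENRE_COUNTS = {
    "DC": 2,
    "DG": 41,
    "DT": 270,
    "LECT": 1,
    "REG": 15,
    "REP": 11,
    "RA": 22,
    "TEST": 3,
    "TB": 126,
}


def ranked_shares(counts):
    """Return (genre, count, % of all texts) tuples, most frequent genre first."""
    total = sum(counts.values())
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return [(genre, n, 100.0 * n / total) for genre, n in ranked]


def cumulative_share(counts, top_n):
    """Percentage of all texts covered by the top_n most frequent genres."""
    return sum(share for _, _, share in ranked_shares(counts)[:top_n])


if __name__ == "__main__":
    for genre, n, share in ranked_shares(GENRE_COUNTS):
        print(f"{genre:5s} {n:4d} {share:5.1f}%")
    print("Top two genres:", round(cumulative_share(GENRE_COUNTS, 2), 1), "%")
```

On these counts the two most frequent genres, DT and TB, together account for roughly four fifths of the texts, while the remaining seven genres share the rest.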
Table 4 lists the genres in alphabetical order by name. Simple, everyday names were used as much as possible. As can be observed, a quite heterogeneous panorama with clear concentrations emerges. Two genres are by far the most frequent: Disciplinary Text (DT) with 270 texts and Textbook (TB) with 126 texts. This genre configuration provides an overall initial description of the way the multidimensionality of discourse is accomplished by means of a varying repertoire of alternative rich communicative tools. On the one hand, we see disciplinary knowledge as presented in books concentrating on highly thematic specialised knowledge in each domain (such as DT), sometimes with a high degree of discourse complexity and abstraction (see Chapter 10). On the other hand, there is the Textbook that, although also oriented towards disciplinary knowledge, shows a didactic nature and focuses on the dissemination of knowledge. These findings reveal two points of the continuum of genres, but not two extreme points. Both the TB and the DT are oriented towards greater specialisation, but with a clear tendency towards the mainstream dissemination of information (see Chapters 8, 9 and 10). DG, which represents one extreme pole of the gradient, does not appear to be quantitatively important in the overall corpus.
In order to compare the genre frequency of occurrence in the two undergraduate university programmes belonging to the Social Sciences and Humanities, Figure 2 provides detailed comparative information. These figures reveal a more in-depth analysis based on the findings displayed in Table 4. Only five common genres are detected in both university programmes. It is indeed noteworthy that the area with the most genre types is Psychology (9), not only in Social Sciences and Humanities, but also in the corpora including Basic Sciences and Engineering (see Figure 3). The discipline of Social Work shows only five of these genres (with an important concentration in two of them). DT
The discipline of Social Work shows only five of these genres (with an important concentration in two of them). DT
[Bar chart: percentage of texts per genre (vertical axis, 0% to 100%) for Psychology and Social Work; horizontal axis: Dictionary, Didactic Guideline, Disciplinary Text, Lecture, Regulation, Report, Research Article, Test, Textbook.]
Figure 2. Genres in Social Sciences & Humanities (PUCV-2006 Academic Corpus)
and TB are the highest-frequency genres detected in both university areas of study, and they prove to be the most common reading material students use during their five-year undergraduate programmes. This distribution clearly reflects the kind of genres through which students access discipline-specific knowledge. These genres help students to acquire academic expertise and become part of the professional community to which they will eventually belong. In the case of Psychology, seven genres emerge with relatively low occurrence: Research Article (9.3%), Didactic Guideline (7.9%), Regulation (1.3%), Report (1.3%), Test (1.3%), Lecture (0.4%), and Dictionary (0.4%). Although these are part of the readings students engage in during university training, they contribute only marginally to a student’s text exposure. We now turn to the genre percentages in Basic Sciences and Engineering, shown in Figure 3. This quantitative analysis reveals interesting differences in internal genre variability that are not found when comparing areas in the Social Sciences and Humanities. Not only are there fewer genres in Industrial Chemistry and Construction Engineering, but there is also less genre diversity. Only five genres were identified in Construction Engineering, with TB dominating the quantitative distribution. Four other genres are part of the collected corpus: two are more closely related to the professional workplace (Regulation and
[Bar chart: percentage of texts per genre (vertical axis, 0% to 100%) for Construction Engineering and Industrial Chemistry; horizontal axis: Dictionary, Didactic Guideline, Textbook, Regulation, Disciplinary Text.]
Figure 3. Genres in Basic Sciences & Engineering (PUCV-2006 Academic Corpus)
Disciplinary Text) and two are more typical of academic environments (Didactic Guideline and Dictionary). It is worth noting that only two genres were identified in Industrial Chemistry: Didactic Guideline and Textbook. This was unanticipated. Both genres have a clear reader-oriented focus: they recognise the dialogic dimension of disciplinary instruction and direct readers towards action and towards understanding the truths and facts under study. These two genres are important academic tools that open pathways to knowledge for novice students. Figures 2 and 3 show that the programmes compared favour genres that disseminate knowledge (TB and DG), as well as more highly specialised genres (DT). Taken together, these figures reveal an important pattern in the way academic genres are situated and distributed in these four university programmes. This provides evidence that the social, cognitive, and discourse interactions of members of these academic communities help to shape their specialised knowledge by means of specific reading materials. It also provides information about the way a university organises its academic curricula. A final figure compares the texts collected in the Social Sciences and Humanities with those collected in Basic Sciences and Engineering. It gives an overall description of the genres in the four disciplines under study and the frequency of their occurrence in the Academic
[Bar chart: percentage of texts per genre (vertical axis, 0% to 100%) for Construction Engineering, Psychology, Industrial Chemistry and Social Work; horizontal axis: Dictionary, Didactic Guideline, Disciplinary Text, Lecture, Regulation, Report, Research Article, Test, Textbook.]
Figure 4. Genres in four university programmes (PUCV-2006 Academic Corpus)
Corpus. The rich variety of genres detected helps to construct a preliminary overview of the communicative interactions that students in these undergraduate programmes must face in their daily activities once they enter the university setting. Access to specialised disciplinary knowledge is thus granted only through these discourse genres. These comparative findings show that written language displays prototypical rhetorical organisations in each discipline and also reveal the way SS&H and BS&E discourse is constructed differently by means of specific discourse tools. Focusing on the main frequencies of occurrence, three genres emerge as the most important in undergraduate university written communication: Disciplinary Text (DT), Textbook (TB), and Didactic Guideline (DG). The high circulation of these genres suggests a selection of materials oriented towards disseminating knowledge (TB and DG), but also towards highly specialised discourse (DT). This variety of reading materials depicts the most common academic writing students encounter in their daily university discourse activities. The findings show that the primary genre found in this corpus is the Disciplinary Text (DT), one that conveys highly specialised knowledge and evidences complex rhetorical organisation (see Chapter 10). The second genre identified in this research, the Textbook, is oriented towards disseminating knowledge. While this genre uses disciplinary prose, it also combines instructional devices such as examples, diagrams,
and problem-solving exercises (see Chapters 8 and 9). TB is widely used in Construction Engineering and Industrial Chemistry (71% and 58.5%), while DT appears most frequently in Social Work and Psychology (82.4% and 64.3%).
3.2 Academic discourse: Disclosing a miscellaneous nature
At the beginning of this chapter, we focused on the idea that specialised discourses across disciplines could be understood as monolithic, relatively compact, and homogeneous. The findings reported have moved towards a more in-depth analysis, describing genres in terms of their particular organisations and arriving at detailed characterisations. The identification of a variety of discourse genres across four academic disciplinary domains draws attention to interesting differences in the discourse that students in each of these disciplines have to face as they are progressively incorporated into a particular discourse community (see also Chapter 4). Figure 5 illustrates four groups of texts from which four groups of genres originate. These groups were formed by applying two of the criteria proposed by Parodi et al. (Chapter 3): the “context of circulation” and the “communicative macro-purpose”. In our opinion, a single criterion cannot capture the essence of a genre; therefore, at least two criteria should be employed. Based on these two criteria, it is possible to identify scientific, professional, universal, and academic genres, as shown in the following figure.
[Diagram: four groups (Scientific Genres, Professional Genres, Universal Genres and Academic Genres) connected by arrows.]
Figure 5. Interactions between diverse genres
Rather than focusing on how each of these groups of genres is identified, Figure 5 illustrates the idea that, within each particular discourse community, genres circulate in the context where they have been produced in order to satisfy its emerging communicative demands. This idea follows from the two aforementioned criteria. For example, the set of genres known as “academic” (so called because, in this case, they were collected in university settings) has not always been produced in this particular context, nor does it always share the original communicative macro-purposes of these discourse communities. The unidirectional arrows arising from the groups of scientific, professional and universal genres show that these genres can supply discursive tools for the constitution of academic genres. This means that, although specific genres may be produced within the academic context, this context is also nurtured by other genres that were not originally created to satisfy the specific demands of university settings. It is, of course, also feasible for genres formulated within university academic settings to circulate within scientific, professional, and universal contexts. Nevertheless, the communicative macro-purposes may potentially be very similar in all these genres. In fact, the arrows emerging from academic genres illustrate this idea in Figure 5. The ideas implied by Figure 5 point to a kind of discourse that is very heterogeneous or miscellaneous in nature. That is to say, many of the genres that circulate within academic contexts may well not originate in these settings. This might not be a surprising or unexpected finding. However, it does disclose theoretical and applied research niches of great interest from linguistic, cognitive, social, and educational perspectives.
Empirical findings show, for example, that in disciplines such as PSY, nine diverse genres are identified in academic settings, while only two genres are identified in IC. It thus becomes highly relevant to establish whether these genres stem from purely academic environments, with their original communicative macro-purposes geared towards university discourse exchanges, or whether they come from, and were originally created for, other contexts and other macro-purposes. For example, the Disciplinary Text and the Test are not necessarily genres that originate in the academic context, nor do they initially have pedagogical purposes. The implications for education and for psycholinguistic processing are among the first issues that arise. Following the criteria table proposed by Parodi et al. (see Chapter 3), Figure 6 illustrates the way in which the scientific, professional, and universal genres may be distributed and how they may be embedded in the constitution of the Academic Discourse, and thus come to belong to the group of academic genres. At the same time, the only genres that are authentically academic in nature are identified in this
[Diagram: genre groups feeding into the Academic Discourse.
Scientific Genres: RA, LECT, DT
Professional Genres: REP, TEST
Universal Genres: REG
Academic Discourse: DG, TB, DIC, together with RA, LECT, DT, REP, TEST, REG]
Key: DG: Didactic Guideline; DIC: Dictionary; DT: Disciplinary Text; LECT: Lecture; REP: Report; REG: Regulation; RA: Research Article; TB: Textbook; TEST: Test
Figure 6. Genres constituting the Academic Discourse
figure: Didactic Guideline, Textbook, and Dictionary. This indicates that these three genres emerge exclusively from pedagogical contexts. According to the research data, six other genres also form part of the so-called academic discourse (AD): RA, LECT, DT, REP, TEST, and REG. They are considered here because they were collected in university settings, but they were originally created in other contexts and possibly to fulfil other communicative purposes, i.e., not directly those pertaining to academic instructional tasks. Thus, the academic discourse identified in this research from a set of nine written genres is clearly a miscellaneous discourse. These findings provide, for the first time, data that elucidate the heterogeneous nature of academic discourse. By unveiling this miscellaneous nature, certainly not at its point of origin but in its realisation configuration, we emphasise the diversity of academic discourse in university settings and point to the many interesting lines of research that these findings open up.
Discussion and final remarks
In the present study, we focused on an in-depth description of the PUCV-2006 Academic Corpus of Spanish, made up of 491 texts in four disciplines. Nine discourse genres emerged from the analysis of these texts. Global findings revealed some intra-disciplinary distinctions, but the variations and connections among the texts across the disciplines of the corpus turned out to be even more interesting. Industrial Chemistry and Psychology were identified as the extreme poles in the continuum of genre occurrence, with two genres at one extreme and nine at the other. In general, more disseminating and reader-oriented genres were found in Basic Sciences and Engineering, with a particularly high frequency for the Didactic Guideline and the Textbook (especially in Industrial Chemistry). The Social Sciences and Humanities evidenced a richer variety of genres, but with a strong concentration on disciplinary-oriented perspectives and less emphasis on didactic resources. With regard to the quantitative analysis of the PUCV-2006 Academic Corpus of Spanish, the data for Social Sciences and Humanities and for Basic Sciences and Engineering revealed differences both in the number of written texts used to transmit disciplinary knowledge and in the variety of genres through which specialised knowledge is disseminated. Thus, Psychology and Social Work tend to employ a greater quantity of texts, of relatively greater extent (at least based on the number of words), over the course of their degree programmes than Industrial Chemistry and Construction Engineering, which tend to use fewer texts and a more limited range of genres.
In addition, these findings helped to characterise academic discourse, which emerges as a highly heterogeneous discourse in which genres arising from other contexts, and not originally created for academic settings, are embedded. The original communicative purposes of these genres (scientific, professional, universal) do not necessarily correspond to those associated with their present context of use. Because of its heterogeneous nature, we named this group of diverse genres found in academic settings miscellaneous discourse. In sum, the findings reported in this study constitute a preliminary approach to the detection of discourse genres across four disciplines in university settings using a “corpus-based” method. The differences between the SS&H and BS&E texts highlight disciplinarity as a vital factor in the way knowledge construction and knowledge reproduction are achieved through specific written discourse genres. This study therefore indicates that, in undergraduate university education, belonging to one discipline or another implies divergent access to knowledge for students through discourse. These different kinds of
knowledge are organised and presented to the novice in each discipline by means of specific discursive mechanisms. As a result, the nature of the discipline determines very specific pedagogical protocols tied to the prototypical analytical procedures that encourage a certain way of approaching science. A few closing questions may be posed, related to the fact that humans are not born with a pre-scientific ability oriented towards one or another area of knowledge, nor are they born with a set of analytical methods organised according to one modus operandi or another. It is only through the process of progressive socialisation, together with psycholinguistic discourse development, that human beings construct knowledge and reasoning techniques. Disciplinary academic discourse and the respective specialised genres emerge as crucial tools in all learning processes, as well as in the acquisition of social and cognitive abilities that help human beings develop strategies. Nevertheless, how can a human being build an approach associated with one discipline or another? How does a human being become a member of one community or another? If there is no pre-acquired or innate knowledge in disciplinary terms, how do brains and minds manage to adapt to a discipline? These questions and others emerge from these findings and await future research and data.
Chapter 6
Multi-dimensional analysis of an academic corpus in Spanish
René Venegas
Recent studies describing specialised and multi-register corpora of Spanish have been carried out using diverse corpus methodologies, such as multi-dimensional analysis. This chapter presents two studies based on the written academic PUCV-2006 Corpus of Spanish. Both employ the five dimensions (i.e. Contextual and Interactive Focus, Narrative Focus, Commitment Focus, Modalising Focus, and Informational Focus) identified by Parodi (2005a). Each of these dimensions emerged from the functional interpretation of co-occurring lexicogrammatical features identified through a multi-dimensional and multi-register analysis. In the first study, we calculate linguistic density across the five dimensions, which provides a lexicogrammatical description of the nine academic genres that make up the corpus. In the second study, we compare the PUCV-2006 Corpus with four corpora from other registers. The findings confirm the specialised nature of the genres in the PUCV-2006 Corpus, which show both a strong lexicogrammatical compactness of meaning and an emphasis on regulating the degree of certainty expressed.
Introduction
Technological advances in corpus linguistics have made possible the study of megacorpora (corpora consisting of over 100 million words) and, more recently, the study of specialised corpora (Parodi 2007b, c). These advances have shown, with increasing precision, the lexicogrammatical patterns which reflect the regularities of a language and, in later studies, those of specialised discourse genres (Biber 1988; Biber, Conrad & Reppen 1998; Biber 2007b; Louwerse, McCarthy, McNamara & Graesser 2004; Parodi 2005a, 2007a, b; Crossley & Louwerse 2007). All these findings represent a source of knowledge that is relevant to disciplines such as computational linguistics and natural language processing.
Further studies are needed in order to demonstrate regularities which characterise the professional and academic dimensions of the Spanish language. Working with his team, Douglas Biber has provided a multi-dimensional perspective of varied registers in English and other languages (Biber 1995, 2003, 2005, 2006, 2007b; Biber & Conrad 2001). Recent studies have also been conducted in this area for Spanish, using diverse corpus methodologies and, more specifically, multi-dimensional analysis to describe specialised and multi-register corpora (Parodi 2005a, 2006, 2007a, b; Biber & Tracy-Ventura 2007). As a continuation of the research carried out on the Spanish language, this Chapter presents a lexicogrammatical description of the PUCV-2006 Corpus of Academic Spanish. We will report on two studies: In the first study, we compare nine academic genres comprising this corpus, using the dimensions proposed by Parodi (2005a). In the second study, we contrast the lexicogrammatical features of the corpus with those constituting four other corpora belonging to different registers. This study is based on the assumption that the dimensions suggested by a previous multi-dimensional analysis may be used to characterise a new corpus of undergraduate university texts. This assumption is based on previous research findings, which showed that co-occurrence patterns reveal how specialised discourse genres transmit disciplinary knowledge (Parodi 2004, 2005a, 2007a, b; Venegas 2005, 2006, 2007a, b). A brief synthesis of the results obtained using this multi-dimensional analysis is presented below, in both English and Spanish. This is followed by the methodology used to analyse the PUCV-2006 Corpus of Written Academic Spanish and the contrasting corpora used in this research. In the last section, we present results, a discussion of those results and implications for future studies of academic genres based on their lexicogrammatical characterisation.
1. Multi-dimensional and multi-register analysis
Multi-dimensional analysis was initially performed in order to identify patterns of co-occurring linguistic features in a wide range of registers in English (Biber 1988; Biber, Conrad & Reppen 1998; Biber & Conrad 2001). This aimed to find linguistic variables in the texts that may be reliably used to compare different spoken and written registers. Biber (1988) employed factor analysis as a technique for multivariate data-reduction analysis. This technique is used to identify simple patterns occurring within relationships among a high number of variables. In particular, it seeks to explain observed variables largely or entirely in terms of a much smaller number of previously unobserved variables, called factors. Once
the factors have been identified, the researcher is able to interpret them as dimensions; in other words, this kind of analysis detects the meaningful underlying dimensions that allow the researcher to explain observed similarities or dissimilarities between the objects being investigated. For this multifactor analysis and the comparison of multiple registers of English, Biber (1988) used a corpus of 481 written and oral texts, divided into 17 written registers and 6 oral registers. The study was based on 67 lexicogrammatical features. The normalised frequencies of these features in each register served as input data for the factor analysis. The analysis yielded six factors which, once interpreted, became dimensions that differentiate the registers according to their location along each dimension. Biber (1988) labelled the resulting dimensions as follows: (1) Involved versus Informational Production; (2) Narrative versus Non-Narrative Concerns; (3) Explicit versus Situation-Dependent Reference; (4) Overt Expression of Persuasion; (5) Abstract versus Non-Abstract Information; (6) Online Informational Elaboration. Thus, for example, registers such as Romantic Fiction, Mystery Novels or Science Fiction have high positive scores on the Narrative dimension, whereas registers such as Academic Prose, Official Documents and Radio Transmissions have negative scores on that dimension; that is, they are characterised as non-narrative. These results have been confirmed by other studies, such as those of Biber, Conrad and Reppen (1998), Louwerse et al. (2004) and Crossley and Louwerse (2007). In the case of Louwerse et al. (2004), the input data were the normalised frequencies of 250 measurements of linguistic cohesion, from which they obtained results very similar to those of Biber (1988).
In their study, the authors argue that focusing exclusively on the lexical level does not allow registers to be characterised textually, and they favour the use of linguistic units that occur above this level. Thus, they used measurements obtained from Coh-Metrix (McNamara, Louwerse & Graesser 2002). This computational tool can be used to measure dimensions such as general identification and reference information, readability, general information on the text and its words, syntactic indices, semantic and referential indices and situation models. While researchers using Coh-Metrix claim to analyse above the lexical level, it is not clear that this is the case: most of the measurements it uses remain at the lexical level (Sabaj 2007). In the case of Spanish, Parodi (2005a, 2007b) performed a multi-dimensional and multi-feature analysis on the El Grial PUCV-2003 Corpus, which comprises 74 technical-professional texts (mainly specialised textbooks), 12 texts from Latin American literature and 150 spoken interviews. This study calculated the normalised frequency of 65 Spanish lexicogrammatical features, defined
functionally and communicatively. Factor analysis was used as described above, grouping lexicogrammatical features. This study resulted in five dimensions: (1) Contextual and Interactive Focus, (2) Narrative Focus, (3) Commitment Focus, (4) Modalising Focus, and (5) Informational Focus. These dimensions, apart from being very similar to those proposed by Biber (1988), differentiated registers not only between spoken and written, but also between specialised (technical/scientific texts) and non-specialised (oral or literary texts). The relationship between specialisation and the written modality of the language in a text is particularly revealing. Both of these are associated with features which constitute the Informational Focus Dimension, which consists of both positive and negative features: modal verbs of obligation, subjunctive mood, nominalisations, prepositional phrases as noun complements and participles in an adjectival function comprise the positive features; whereas third person singular inflections, the indefinite past, the active non-durative form of ser (to be), private verbs, negation pronouns and modal verbs of volition constitute the negative features. This correlation of lexicogrammatical features was empirically demonstrated in Spanish for the first time in this research. Biber and Tracy-Ventura (2007), using 85 lexicogrammatical features, described a spoken and written corpus in Spanish (4,052 texts and 20,301,847 words). They identified six dimensions. One of the most relevant findings is that the first dimension – Oral versus Literate Discourse – allows for differentiation between prototypically spoken registers (by their positive features) and prototypically written registers (by their negative features). This finding confirms results obtained in other languages (Reppen, Fitzmaurice & Biber 2002) and is highly similar to Parodi’s research. 
It also allows for identification of a first dimension, which accounts for much of the total variance of the lexicogrammatical features in the corpus, and which clearly differentiates spoken and written registers. All of these results lead to the conclusion that this particular dimension might be universal and should thus appear in the description of any language. Although all of the studies mentioned above use multi-register corpora, none of these provide a detailed description of academic discourse, except for the study presented by Biber (2006). He identified four dimensions in academic registers. In a second part of the same study, Biber explored the differences between textbooks and classes in various academic disciplines (Engineering, Business Studies, Education, Humanities, Social Sciences and Natural Sciences). According to the study, only two dimensions – Procedural versus Content Focused Register and Reconstructed Register of Events – allow us to differentiate these disciplines. Both dimensions strongly differentiate classes from textbooks. Such classification, based on multi-dimensional analysis, is not only a description of
the more frequent registers of a language, but also of more restricted and specialised areas, such as those employed in universities. A possible conceptual limitation is that Biber (2006) draws no distinction between registers (written, spoken, academic prose, institutional writing, etc.), genres (Textbook, Encyclopaedia, Academic Article), and various communicative situations (Laboratory Conversations, Office Appointment Conversations, Classroom Management and Subject Management). In our research, we apply the definition of genre stated by Parodi in Chapter 1.
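The scoring step that usually follows a factor analysis of this kind can be illustrated with a minimal Python sketch. It assumes the procedure generally associated with Biber (1988): raw feature counts are normalised per 1,000 words, standardised across texts, and then summed, with the features that load negatively on a dimension subtracted from those that load positively. The function names, the two feature labels and all counts below are hypothetical illustrations, not data or code from the studies cited, and the factor analysis itself is not reproduced here.

```python
from statistics import mean, pstdev


def normalised_frequency(count, n_words, per=1000):
    """Return occurrences of a feature per `per` words of running text."""
    return count * per / n_words


def z_scores(values):
    """Standardise values to mean 0 and (population) standard deviation 1."""
    mu = mean(values)
    sd = pstdev(values)
    # A feature with no variation carries no information; score it as 0.
    return [(v - mu) / sd if sd else 0.0 for v in values]


def dimension_scores(texts, positive, negative):
    """Sum z-scores of positive features and subtract those of negative ones.

    `texts` is a list of dicts holding a word total and raw feature counts.
    """
    standardised = {}
    for feature in positive + negative:
        freqs = [
            normalised_frequency(t["counts"].get(feature, 0), t["words"])
            for t in texts
        ]
        standardised[feature] = z_scores(freqs)
    return [
        sum(standardised[f][i] for f in positive)
        - sum(standardised[f][i] for f in negative)
        for i in range(len(texts))
    ]


# Hypothetical toy texts; the counts are invented for illustration only.
toy_texts = [
    {"words": 2000, "counts": {"nominalisations": 120, "private_verbs": 10}},
    {"words": 1000, "counts": {"nominalisations": 40, "private_verbs": 15}},
    {"words": 1000, "counts": {"nominalisations": 20, "private_verbs": 25}},
]

scores = dimension_scores(
    toy_texts, positive=["nominalisations"], negative=["private_verbs"]
)
for text_id, score in enumerate(scores):
    print(f"text {text_id}: {score:+.2f}")
```

In this toy example the first text, dense in nominalisations and sparse in private verbs, receives the highest score. Using the population standard deviation is a simplification chosen for the sketch.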
2. Method
2.1 Corpus
2.1.1 The PUCV-2006 Corpus of Academic Spanish
The PUCV-2006 Corpus of Written Academic Spanish (PUCV-2006 Corpus) is made up of almost all of the texts that circulate in four undergraduate university programmes at the Pontificia Universidad Católica de Valparaíso (PUCV), Chile. These texts are provided by the teachers of obligatory courses spread over five years in the following undergraduate programmes: Industrial Chemistry, Construction Engineering, Social Work and Psychology. The corpus comprises 491 texts (58,594,630 words) and is available free of charge at www.elgrial.cl (see Parodi 2005a, 2007b; Venegas 2008). Table 1 presents the number of texts and words contributed by each of the programmes mentioned above. This corpus includes complete texts, not random samples. This gives a more natural representation of the texts which circulate in these academic areas and supports more reliable results. It also explains why the corpus is not quantitatively balanced. However, it is a highly representative sample of the texts that circulate in these undergraduate programmes.

Table 1. Number of texts and words in the PUCV-2006 Corpus of Spanish
Area                             Degree Programme           Texts         Words
Basic Sciences and Engineering   Industrial Chemistry       53 (10.7%)    9,285,375 (15.8%)
                                 Construction Engineering   69 (14.0%)    8,734,086 (14.9%)
Social Sciences and Humanities   Social Work                142 (28.9%)   18,641,309 (31.8%)
                                 Psychology                 227 (46.2%)   21,933,860 (37.4%)
Total                                                       491           58,594,630
Table 2. Genres present in the four degree programmes of the PUCV-2006 Corpus (x = no genre was identified)
Genre                Construction Engineering   Industrial Chemistry   Psychology      Social Work     Total
Dictionary           1 (1.45%)                  x                      1 (0.44%)       x               2 (0.41%)
Didactic Guideline   1 (1.45%)                  22 (41.51%)            18 (7.93%)      x               41 (8.35%)
Disciplinary Text    7 (10.14%)                 x                      146 (64.32%)    117 (82.39%)    270 (54.99%)
Lecture              x                          x                      1 (0.44%)       x               1 (0.20%)
Regulation           11 (15.94%)                x                      3 (1.32%)       1 (0.70%)       15 (3.05%)
Report               x                          x                      3 (1.32%)       8 (5.63%)       11 (2.24%)
Research Article     x                          x                      21 (9.25%)      1 (0.70%)       22 (4.48%)
Test                 x                          x                      3 (1.32%)       x               3 (0.61%)
Textbook             49 (71.01%)                31 (58.49%)            31 (13.66%)     15 (10.56%)     126 (25.66%)
Total                69 (100%)                  53 (100%)              227 (100%)      142 (100%)      491 (100%)
2.1.2 The genres of the PUCV-2006 Corpus
In the first study, we classified genres based on the criteria stated in Chapter 2 (macro-purpose, role of the participants, discourse organisation mode, context of circulation and modality). Analysis of the 491 texts of the corpus allowed us to identify nine academic genres. Table 2 presents the number and percentage of texts per genre collected from each degree programme, and thus the distribution of the nine genres in the corpus. Construction Engineering texts present only five genres, with Textbook, Regulation and Disciplinary Text predominating; the percentages for Textbook and Regulation are the highest for these genres in the whole corpus. In Industrial Chemistry texts, only two genres appear: the Didactic Guideline (with the highest percentage of all disciplines in the corpus) and the Textbook. Psychology texts include all nine genres, with Disciplinary Text, Textbook and Research Article being the most frequent. Finally, Social Work texts include five of the nine genres. The Disciplinary Text genre is predominant, with the highest percentage of texts in the corpus for this genre, followed by the Textbook and the Report. Overall, as indicated in Table 2, the most frequent genres in the whole corpus are Disciplinary Text (54.99%) and Textbook (25.66%), followed by Didactic Guideline (8.35%), Research Article (4.48%), Regulation (3.05%) and Report (2.24%). The remaining genres all have percentages below 1%.
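The percentages reported in Table 2 can be recomputed from the raw counts with a short Python sketch. The counts are those given in Table 2; the dictionary layout and the function names are illustrative assumptions, not part of the original study.

```python
# Text counts per genre and degree programme, as reported in Table 2.
# Genres that were not identified in a programme are simply omitted.
COUNTS = {
    "Construction Engineering": {
        "Dictionary": 1, "Didactic Guideline": 1, "Disciplinary Text": 7,
        "Regulation": 11, "Textbook": 49,
    },
    "Industrial Chemistry": {"Didactic Guideline": 22, "Textbook": 31},
    "Psychology": {
        "Dictionary": 1, "Didactic Guideline": 18, "Disciplinary Text": 146,
        "Lecture": 1, "Regulation": 3, "Report": 3, "Research Article": 21,
        "Test": 3, "Textbook": 31,
    },
    "Social Work": {
        "Disciplinary Text": 117, "Regulation": 1, "Report": 8,
        "Research Article": 1, "Textbook": 15,
    },
}


def programme_shares(counts):
    """Percentage of each genre within a single programme."""
    total = sum(counts.values())
    return {genre: 100 * n / total for genre, n in counts.items()}


def corpus_shares(all_counts):
    """Percentage of each genre in the whole corpus."""
    totals = {}
    for counts in all_counts.values():
        for genre, n in counts.items():
            totals[genre] = totals.get(genre, 0) + n
    grand_total = sum(totals.values())
    return {genre: 100 * n / grand_total for genre, n in totals.items()}


# Number of genres and most frequent genre in each programme.
for programme, counts in COUNTS.items():
    shares = programme_shares(counts)
    top = max(shares, key=shares.get)
    print(f"{programme}: {len(counts)} genres, "
          f"most frequent {top} ({shares[top]:.2f}%)")

# Corpus-wide genre shares, from most to least frequent.
overall = corpus_shares(COUNTS)
for genre, share in sorted(overall.items(), key=lambda kv: -kv[1]):
    print(f"{genre}: {share:.2f}%")
```

Keeping raw counts rather than percentages as the stored data makes it easy to re-derive either the per-programme or the corpus-wide distribution from the same source.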
Table 3. Multi-register contrasting corpora
Corpus                            Register     Number of Texts   Number of Words
Latin American Literature (LLC)   Literary     12                513,359
Oral Didactic (ODC)               Didactic     36                410,981
Research Articles (RAC)           Scientific   30                81,856
Public Politics (PPC)             Political    19                234,818
Total                                          97                1,241,014
2.2 Contrasting Corpora
Four corpora of different registers, available at www.elgrial.cl, were selected as a contrast for the results obtained in describing the genres that constitute the PUCV-2006 Corpus (see Table 3). The selection rested on the assumption that one or more dimensions would characterise the academic corpus and thereby distinguish it from other registers.
2.2.1 Latin American Literature Corpus
The texts that comprise this corpus were collected from the required reading list for Spanish Language and Communication (Lengua Castellana y Comunicación), a course offered by three technical-professional secondary schools in the city of Valparaiso, Chile. These texts are short stories and novels written by Latin American authors.
2.2.2 Oral Didactic Corpus
The texts of this corpus are transcriptions of conversations between professors and students during different classroom interactions. Four of these are conversations about study techniques and reading and comprehension strategies, held with 75 final-year secondary school students from the Valparaiso area. The remaining 32 texts are transcriptions of conversations held during classes of different courses: 9 from a Language course, 10 from a Mathematics course, 7 from Natural Sciences and 6 from Social Sciences.
2.2.3 Corpus of Research Articles
This corpus comprises 30 research articles randomly chosen from the ARTICOS corpus (Venegas 2005, 2007b): 10 from the Social Sciences, 10 from the Biological Sciences and 10 from the Exact Sciences. The articles were obtained from SciELO (Scientific Electronic Library Online, www.scielo.cl).
2.2.4 Public Politics Corpus
This corpus comprises nineteen speeches by Chilean politicians and economists, all of which deal with poverty from a socio-economic perspective. The views expressed in these speeches represent centre, left-wing and right-wing political parties.
2.3 Procedure
As stated above, the five dimensions used in this research come from the multi-dimensional analysis carried out on the El Grial PUCV-2003 Corpus (Parodi 2005a, 2007b). In order to perform the two studies discussed in this report, the dimensions were operationalised using only the five positive lexicogrammatical features with the highest factorial weight. This decision was based on the idea that the lexicogrammatical features with the highest factorial weight would have sufficient correlational power to differentiate the genres in the corpus, as well as to differentiate the PUCV-2006 Corpus from the other corpora used in the study. The features that comprise the dimensions and the corresponding factor weight of each feature are presented in Table 4. The analysis calculates the co-occurrence of the five lexicogrammatical features of each of the five dimensions for each text of each genre. The frequency of each lexicogrammatical feature was calculated using the computer tool El Manchador de Textos (The Text Highlighter), from the website www.elgrial.cl (see Venegas 2008).

Table 4. Dimensions and factorial weights of each of the features of each dimension
Contextual and Interactive Focus Dimension (D1): cause/effect adverbial clauses (0.945), time adverbs (0.934), negation adverbs (0.928), second person singular pronouns (0.911), first person singular pronouns (0.823).
Narrative Focus Dimension (D2): second person singular pronouns (0.842), first person singular pronouns (0.828), periphrastic future (0.823), imperfect past (0.820), third person plural pronouns (0.708).
Commitment Focus Dimension (D3): private verbs (0.824), first person singular pronouns (0.789), indefinite past (0.705), modal verbs of volition (0.655), first person singular inflections (0.640).
Modalising Focus Dimension (D4): active forms of durative to be (0.671), hedges (0.656), modal verbs of possibility (0.641), adverbs of manner (0.606), predicative adjectives (0.565).
Informational Focus Dimension (D5): modal verbs of obligation (0.496), subjunctive mood (0.494), nominalisations (0.456), participles in adjectival functions (0.413), prepositional phrases as noun complements (0.413).

In simple terms, the co-occurrence of features from each dimension is used to calculate a density index for each dimension within the text. This was done using Formula 1:

$$ID_{dT} = \frac{\sum (fR_{1..n} \cdot \alpha_{1..n}) \cdot \sum (R)}{\sum pT}$$
Formula 1

Read against the worked example given for the Informational Focus Dimension, fR is the frequency of each feature in the text, α is its factorial weight, ∑(R) is the number of features in the dimension and ∑pT is the total number of words in the text.
Formula 1 represents the co-occurrence of features, weighting each feature by its importance or statistical weight relative to the other features of the dimension. The result of the equation is called the dimension density index and allows us to characterise each text by calculating its density along each of the five dimensions. The following example shows the data obtained using the Text Highlighter and the calculation used for each dimension. In this case, the calculation uses the features of the Informational Focus Dimension and a paragraph of the Textbook Air and Water Pollution (Contaminación del agua y del Aire), from the Industrial Chemistry corpus.

Debido a que los residuos con requerimiento de oxígeno eliminan con prontitud el OD en el agua, es importante poder calcular la cantidad de estos contaminantes en una masa de agua determinada. La demanda bioquímica de oxígeno (DBO) del agua es una cantidad relacionada con la proporción de residuos presentes. En una muestra de agua, la DBO indica la cantidad de oxígeno disuelto que se gasta durante la oxidación de los residuos con requerimiento de oxígeno. Se mide incubando una muestra de agua durante cinco días a 20 °C. La cantidad de oxígeno consumido (DBO) se establece mediante la determinación química de la concentración de OD en el agua, antes y después de la incubación. Una DBO de 1 ppm es característica del agua casi pura. El agua se conceptúa como muy pura con una DBO de 3 ppm, y de una pureza dudosa cuando se llega a las 5 ppm. Las autoridades encargadas de la salud pública ponen objeciones a que las aguas residuales entren en las corrientes si la DBO de las primeras sobrepasa las 20 ppm. Una comparación entre estos niveles de DBO y la gama de valores característicos de las fuentes que se da en la tabla 8-2 indica la gravedad del problema. Es evidente que los contaminantes de la tabla 8-2 deben estar muy diluidos al entrar en el agua, si no se quiere que el oxígeno disuelto sea rápidamente eliminado por completo. (AC-QUI-ma425).

Text Highlighter output: Sequence 1: 1; Sequence 2: 2; Sequence 3: 5; Sequence 4: 1; Sequence 5: 11; Total Words: 261; Index: 0.16
“Because oxygen-requiring residues quickly eliminate the DO (dissolved oxygen) in water, it is important to be able to calculate the amount of these pollutants in a given mass of water. The biochemical oxygen demand (BOD) of water is an amount related to the proportion of residues present. In a water sample, the BOD indicates the amount of dissolved oxygen used up during oxidation of the oxygen-requiring residues. It is measured by incubating a water sample for five days at 20 °C. The amount of oxygen consumed (BOD) is established by chemically determining the concentration of DO in the water before and after incubation. A BOD of 1 ppm is characteristic of almost pure water. The water is considered to be very pure with a BOD of 3 ppm, and of questionable purity when the BOD reaches 5 ppm. The authorities in charge of public health object to wastewater entering streams if its BOD is over 20 ppm. A comparison between these levels of BOD and the range of values characteristic of the sources shown in table 8-2 indicates the seriousness of the problem. It is evident that the pollutants in table 8-2 must be highly diluted upon entering the water if the dissolved oxygen is not to be rapidly and completely eliminated”.
Using these data, it is possible to apply Formula 1 to calculate the density index associated with informational density (D5), which in this case corresponds to:

$$ID_{5T} = \frac{((1 \cdot 0.496) + (2 \cdot 0.494) + (5 \cdot 0.456) + (1 \cdot 0.413) + (11 \cdot 0.413)) \cdot 5}{261} = 0.16$$
The index (0.16) gives this text a density value for D5 that will subsequently be compared to the values obtained for each of the dimensions in each of the identified genres.
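The calculation above can be sketched in a few lines of Python. This is only an illustration of Formula 1 as it is applied in the worked example, not the implementation used by the Text Highlighter. The function name, the dictionary layout and the pairing of Sequences 1 to 5 with the D5 features in the order of Table 4 are assumptions.

```python
# Factor weights of the five Informational Focus (D5) features, from Table 4.
D5_WEIGHTS = {
    "modal verbs of obligation": 0.496,
    "subjunctive mood": 0.494,
    "nominalisations": 0.456,
    "participles in adjectival function": 0.413,
    "prepositional phrases as noun complements": 0.413,
}


def density_index(frequencies, weights, total_words):
    """Dimension density index of one text, following the worked example:
    (sum of frequency * weight) * number of features / total words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    weighted_sum = sum(frequencies[name] * weight for name, weight in weights.items())
    return weighted_sum * len(weights) / total_words


# Feature frequencies reported for the sample paragraph (Sequences 1 to 5).
# Pairing them with the features in Table 4 order is an assumption.
sample_frequencies = {
    "modal verbs of obligation": 1,
    "subjunctive mood": 2,
    "nominalisations": 5,
    "participles in adjectival function": 1,
    "prepositional phrases as noun complements": 11,
}

index = density_index(sample_frequencies, D5_WEIGHTS, total_words=261)
print(f"{index:.3f}")
```

With these inputs the unrounded result is about 0.167, which the worked example reports as 0.16.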
3. Results and discussion
3.1 First study
3.1.1 The PUCV-2006 Corpus
Once values were calculated along the five dimensions of the study for each text in the corpus, it was established that the highest density index was obtained in the Informational Focus Dimension (0.16). According to this result, the Informational Focus Dimension best characterises the PUCV-2006 Corpus as a register of academic written Spanish. The dimension with the second-highest density value is the Modalising Focus Dimension (0.07), followed by the Contextual and Interactive Focus Dimension (0.05), the Commitment Focus Dimension (0.03) and finally the Narrative Focus Dimension (0.02). Although these values appear very low, the density indices of the dimensions differ significantly from one another (One-Sample z-Test; p < 0.05). The data obtained show that the PUCV-2006 Corpus is mainly characterised by information-dense texts. This is typical of highly specialised texts, since they use complex syntax to present content. This result is similar to the one obtained
Figure 1. Comparison of the dimensions in the PUCV-2006 Corpus (density index, on a scale from 0 to 0.18, for the Contextual and Interactive Focus, Narrative Focus, Commitment Focus, Modalising Focus and Informational Focus dimensions)
for the Technical-Professional Corpus included in the El Grial PUCV-2003 Corpus (Parodi 2005a; Cademártori, Parodi & Venegas 2007) and for academic prose (Biber 2006). Second, this corpus shows evidence of modalisation, that is, of features that regulate the degree to which facts or doubts are expressed. To a lesser extent, the density index for the Contextual and Interactive Focus Dimension indicates the co-occurrence of features associated with cause/effect, with temporal sequence and with interaction between participants. These features typically appear in dialogues and tend to appear together.
3.1.2 The genres of the PUCV-2006 Corpus
Figure 2 shows the density of each of the dimensions in the nine genres found in the PUCV-2006 Corpus. The Informational Focus Dimension is predominant in all genres. This predominance over the other dimensions is particularly evident in the Regulation genre, where the dimension reaches a density index greater than 0.22. The Research Article, Report, Textbook and Disciplinary Text genres obtained very similar densities in the Informational Focus Dimension, constituting a group of genres characterised by linguistic features associated with dense prose and centred on specialised contents. The Lecture genre has a lower density value in this dimension. This is evidence of sparser prose, and thus indicates that the genre is somewhat closer to speeches and written texts delivered to a general audience. Example 1 is a passage from the Official Chilean Regulations for Steel Construction, 1994, in which all five features of the Informational Focus Dimension were identified (modal verbs of obligation, subjunctive mood, nominalisations, participles in adjectival function, prepositional
Figure 2. Value of the five dimensions in each of the nine genres
phrases as noun complement). As can be observed, there is a significant grouping of features indicating a dense text that, from a lexical and syntactic point of view, is oriented towards compacting information.
Example 1. Presence of the Informational Focus Dimension in the Regulation genre
6. Durante la operación de soldar y durante la etapa inicial de enfriamiento del cordón no deberán martillarse o someterse a vibración las piezas que se unen. Sin embargo, cuando sea necesario para disminuir las tensiones internas, en especial en las soldaduras efectuadas por capas múltiples, se podrá martillarlas con golpes suaves de martillo manual o neumático empleando una herramienta alargada de extremo redondeado. El martilleo se efectuará después que la soldadura se haya enfriado, pero esté todavía caliente al contacto con la mano. (AC-ICC-nm405)
6. During welding and during the initial cooling stage of the weld bead, the pieces being joined must not be hammered or exposed to vibration. However, when it is necessary in order to reduce internal stresses, especially in welds made in multiple layers, they may be hammered with soft blows from a manual or pneumatic hammer, using an elongated tool with a rounded end. Hammering is to be carried out after the weld has cooled but while it is still hot to the touch.
The second most prominent dimension in all the genres is the Modalising Focus Dimension. Lecture and Dictionary are the genres with the highest scores in this dimension; they show features associated with the regulation and attenuation of facts. Modalisation decreases across the remaining genres in the following
descending order: Disciplinary Text, Textbook, Research Article, Test, Didactic Guideline, Report and Regulation. As can be seen in Example 2, in the Lecture genre there is a co-occurrence of features that tend to keep facts concise (to be able + infinitive, seem + that, adverbs of probability) and that also tend to form judgments of similarity or equivalence which do not depend on immediate experience (active-stative form of the verb to be). By comparison, only one of these features appears in Example 1 (to be able + infinitive, i.e. podrá martillarlas, translation: will be able to hammer them). This result points to a complementary relationship in Regulations between the Informational Focus and the Modalising Focus dimensions: a high co-occurrence of features that keep information compact and a low frequency of features that tend to keep facts concise.
Example 2. Presence of the Modalising Focus Dimension in the Lecture genre
Hay otro test diagnóstico que se llama Skid, que es otra entrevista bastante comprensiva para elaborar todas las diferenciaciones de los subtipos de estas personalidades perturbada; serias, y últimamente hay un esquema que probablemente se va a adoptar por la Organización Mundial de la Salud para diagnosticar estructuras de personalidad con problemas y este está dado por el Dr. Loringer, que es una evaluación de personalidad. Quería mencionarles estas tres entrevistas, porque cuando uno empieza a trabajar con pacientes de este tipo pareciera que siempre los autores hablan de otros pacientes y son muchos los que están bajo este tema que es el de personalidad de cuadros limítrofes, y hay mucha necesidad de poder seleccionarlos en forma mucho más homogénea. Quisiera concluir diciéndoles que hay varias entrevistas que están estandarizadas y validadas que se pueden usar para proyectos de estudios de estos pacientes, para evaluar su desarrollo ulterior, para poder evaluar el efecto de medicamentos, y también para evaluar las diferentes técnicas de psicoterapia que se usan para estos pacientes. No quiero prolongar más mi presentación, Muchas gracias. (AC-PSI-co225)
“There is another diagnostic test called Skid, which is another quite comprehensive interview for formulating all the differentiations between subtypes of these disturbed personalities; and a framework has recently come up, given by Dr. Loringer, which is a personality evaluation and that will probably be adopted by the World Health Organisation for diagnosing problems in personality structures. I would like to mention these three interviews, because when one starts to work with patients of this type, it seems that authors always speak about other patients, and many of them fall into this category of borderline personalities, and there is a pressing need to be able to choose between them in a more homogeneous way. I would like to conclude by saying that there are several interviews that have been standardised and validated and can be used to study these patients, to evaluate their ultimate development, to evaluate the effects of medication and also to evaluate the different psychotherapy techniques used for these patients. That concludes my presentation. Thank you”.
The Lecture genre shows the highest modalisation index and is characterised by relatively high indices in three other dimensions (Commitment Focus, Narrative Focus and Contextual and Interactive Focus). This suggests that Lecture constitutes a genre that is very close to speech: speakers on a specialised subject matter demonstrate their commitment to their topic through their vocabulary choices.
If we now concentrate on the most frequently occurring academic genres in the corpus (Disciplinary Text, 54.99%; Textbook, 25.66%; and Didactic Guideline, 8.35%), we can state that the Contextual and Interactive Focus, Narrative Focus and Commitment Focus dimensions significantly differentiate these genres (One-Sample z-Test; p < 0.05). The Disciplinary Text is differentiated from the other two genres by its higher density of features associated with direct reference to physical context, to temporality and to participants, which is similar to the pattern found in the Lecture genre. This genre also shows a stronger tendency than the other genres towards narration and towards expression of the author’s commitment to what is being stated. However, the Modalising Focus Dimension does not differentiate between the Textbook and the Disciplinary Text, showing that both genres present a similar density of modalising features. This leads to the assumption that these genres present information in a subjective way, where the facts remain open to verification and meaning is subject to negotiation.
Turning to the Informational Focus Dimension, the Textbook shows a higher density than both the Disciplinary Text and the Didactic Guideline. This is congruent with the results obtained by Parodi (2005a), in which the Technical Textbook showed the highest values for this dimension. A more detailed characterisation of the Textbook genre is pertinent (see Chapters 7 and 8), since it is the only genre that is found in all four disciplines.
This makes it prototypical of the construction of disciplinary knowledge. Figure 3 shows the values obtained for each dimension in this genre in each of the four disciplines. There is variation in the Textbook across disciplines. The most significant difference is in the Informational Focus Dimension. Construction Engineering and Industrial Chemistry textbooks show a high degree of information compactness and lexicogrammatical density. The Commitment Focus Dimension also shows differences between disciplines. The author often appears to be highly involved in textbooks in the disciplines of Psychology and Social Work. The densities of the Contextual and Interactive Focus, Narrative Focus and Modalising Focus dimensions are not significantly different in the Textbook across disciplines. Thus, there are differences in the way specialised domains of knowledge are presented across textbooks.
Figure 3. The Textbook genre across the disciplines
3.2 Second study
In the second study, the values for each dimension of the PUCV-2006 Corpus are compared with the values for the same dimension in four other corpora (see Table 3). Our interest is to confirm whether the greater degree of specialisation, as identified in the Informational Focus Dimension, is actually an exclusive characteristic of this corpus or is shared by others. As shown in Figure 4, the Contextual and Interactive Focus, Narrative Focus and Commitment Focus dimensions demonstrate a clear difference (One-Sample z-Test; p < 0.05) between the Oral Didactic and Latin American Literature corpora, on the one hand, and the other corpora, on the other. The PUCV-2006 Corpus shows a much lower index in these three dimensions than the corpora of oral or literary registers, though its index is higher than that of the Research Article and Public Politics corpora. The Modalising Focus Dimension, in turn, is highly prototypical of this corpus. The density index for the Informational Focus Dimension is similar in the PUCV-2006 Corpus and in both the Research Article and Public Politics corpora. Furthermore, these three corpora are differentiated from the oral and literary registers (One-Sample z-Test; p < 0.05). The similarity in the Informational Focus Dimension between the PUCV-2006 Corpus and the Research Article corpus is somewhat anticipated, since the
Figure 4. Comparison of the PUCV-2006 Corpus with other contrasting corpora
genres of both corpora are part of so-called specialised discourse (Parodi 2006). In fact, 4.48% of the PUCV-2006 Corpus is made up of research articles. This suggests that the written academic register contains recognisable scientific genres and that the two registers are integrated. Thus, from the perspective of meaning construction, there is a common lexicogrammatical pattern of information density, indicated by the use of modal verbs of obligation, subjunctive verbs, nominalisations, prepositional phrases and participles with adjectival functions. It is surprising that such a similarity exists between the Public Politics Corpus and the PUCV-2006 Corpus in the Informational Focus Dimension. However, a review of the topics of the Public Politics Corpus shows that although these texts are mainly spoken, and therefore probably better characterised by the Contextual and Interactive Focus Dimension, they are predominantly oriented towards topic specialisation, particularly the texts dealing with economics (see Example 4).
The PUCV-2006 Corpus, as established by the first study, shows its highest density index in the Informational Focus Dimension, which makes it similar to more specialised corpora such as the Research Article and Public Politics corpora, and in the Modalising Focus Dimension, which makes it significantly different from all the other corpora. In addition, it presents a moderate level of Contextualisation and Interaction, which differentiates it from the El Grial PUCV-2003 Corpus (Parodi 2005a). Finally, this corpus shows a very low index for co-occurring features associated with narration and commitment.
Example 4. Informational Focus Dimension in a Public Politics Discourse
Debe considerarse la distribución primaria del Ingreso. En los estudios de la Comisión Económica para América Latina y el Caribe frecuentemente la atención para superar el problema distributivo se centra prioritariamente en el factor educacional y su incidencia en la ocupación, aspecto, sin duda, importante pero que no puede servir para perder de vista el punto de partida básico de la distribución del ingreso, o sea la participación en su repartición del capital y de los poseedores de los medios de producción usables en el proceso de producción por los cuales se tenga renta, de un lado, y los trabajadores del otro. Los documentos de la Comisión Económica para América Latina y el Caribe consideran que el trabajo remunerado representa alrededor del 80 por ciento del ingreso total de los hogares en América Latina. Pero, no se detiene en la relación de los niveles remuneracionales con la productividad del trabajo, y por tanto, tampoco con las ganancias obtenidas por el capital, sino que se centra en mostrar las nuevas posibilidades que se presentarían si los trabajadores contaran con un nivel educativo mayor y, por lo tanto, con la posibilidad de acceder a trabajos mejores. (CPP-iz1)
“The primary distribution of income must be considered. In the studies of the Economic Commission for Latin America and the Caribbean, attention to overcoming the distribution problem is frequently centred on the educational factor and its effect on employment, an aspect which is undoubtedly important but which must not lead us to lose sight of the basic starting point of income distribution, that is, the share in that distribution of capital and of the owners of the means of production used in the production process, from which income is derived, on the one hand, and of the workers, on the other. The documents of the Economic Commission for Latin America and the Caribbean consider that paid work represents about 80% of total household income in Latin America. However, they do not dwell on the relationship between pay levels and labour productivity, nor, therefore, on the relationship with the earnings obtained by capital; instead, they concentrate on showing the new possibilities that would arise if workers had a higher level of education and, therefore, the possibility of gaining access to better jobs”.
Conclusions
The aim of this chapter was to characterise the PUCV-2006 Corpus of Academic Spanish by calculating the co-occurrence of linguistic features belonging to five dimensions: Contextual and Interactive Focus, Narrative Focus, Commitment Focus, Modalising Focus, and Informational Focus. To this end, two complementary studies were conducted. In the first study, the corpus and the genres comprising it were described. The main result of this study is that the Informational Focus Dimension best characterises the corpus and all nine of its genres. The second most relevant dimension, both in the corpus and in the genres under study, is the Modalising Focus Dimension. For instance, the Lecture, the Dictionary, the Disciplinary Text and the Textbook are the genres with the highest density indices in this
dimension. This finding demonstrates that the genres of the PUCV-2006 Corpus are principally characterised by densely informative and modalised prose. Both of these attributes emphasise the specialised and academic nature of the genres in the corpus and are consistent with properties identified in previous studies (Chafe & Danielewicz 1987; Halliday & Martin 1993; Parodi 2005a, 2007b; Biber 2006, 2007b; Ciapuscio 2007). Therefore, the genres of this corpus of academic discourse are primarily oriented towards a highly compacted presentation of content through lexical and syntactic structures. In addition, propositions are presented in a modalised way, possibly with the function of regulating the degree to which facts or uncertainties are presented, so that content must be verified or negotiated by the readers, whether specialists or students.
The Textbook genre shows a higher density index in the Informational Focus Dimension in the disciplines of the Basic Sciences and Engineering than in the Social Sciences and Humanities. The disciplinary discourse in the Textbook of Construction Engineering and Industrial Chemistry is more closely associated with information density (and with lexicogrammatical compacting of its content) than the disciplinary discourse of Social Work and Psychology. The Modalising Focus Dimension is similar in the textbooks of all four disciplines. This demonstrates that, in terms of the co-occurrence of lexicogrammatical features, there is no difference across disciplines in the way facts or uncertainties in propositions are regulated in the Textbook genre.
In our second study, we compared the PUCV-2006 Corpus with four other corpora of different registers. The most relevant result here is that the academic corpus is closely aligned with scientific and political discourse in the Informational Focus Dimension. This confirms the specialised nature of the genres comprising the corpus.
Moreover, it presents a higher density along the Modalising Focus Dimension than do the other corpora compared in this research. This result suggests that academic genres are modalised in order to present their disciplinary content in a graduated manner. Unlike the Research Article corpus, the PUCV-2006 Corpus shows a greater co-occurrence of features from the Contextual and Interactive Dimension. This difference demonstrates that academic discourse shows higher density in features associated with temporal and spatial contextualisation of content. It is also characterised by the inclusion of writers in their own statements; this allows writers of academic texts to provide information in a concrete way, thus making their texts accessible to readers. These results are scientifically relevant because they present new data on the characterisation of genres that constitute written academic discourse. Moreover, the data presented here confirm results presented by Parodi (2005a). This strengthens the methodology chosen for this research, given that quantifying the co-occurrence of the first five features of each dimension leads to the conclusion
that: (a) the Contextual and Interactive Focus Dimension best characterises the Oral Didactic Corpus; (b) the Narrative Focus and Commitment Focus dimensions best describe Latin American Literature; and (c) the Informational Focus Dimension best characterises specialised written texts. Investigating dimensions using the principal co-occurring lexicogrammatical features provides qualitative and quantitative empirical information that is relevant to the characterisation of the oral and written genres found in specialised discourse. The findings obtained here allow us to make some projections for further studies of academic genres, such as the confirmation of these results through a new multi-dimensional analysis of a wide-ranging corpus of oral and written academic genres. This could be carried out using lexicogrammatical features and textual indices, such as indices of semantic similarity, indices of cohesion, indices of co-referentiality, and others. Such studies could lead to a deeper identification of the differences and similarities between disciplinary academic genres and other genres of specialised discourse. Future studies should develop algorithms that consider the co-occurrence values of features from different dimensions, in order to achieve automatic classification and retrieval of disciplinary discourse genres. As the following chapter indicates, advances have been made in this field. Specifically, we have automatically classified the texts of the PUCV-2006 Corpus according to disciplinary criteria.
Chapter 7
Automatic text classification of disciplinary texts
René Venegas
The aim of this research is to classify, using and comparing two automatic classification methods, the academic texts included in the PUCV-2006 Corpus of Spanish. The methods are based on shared lexical-semantic content words present in the corpus of academic texts. The classification methods compared in this study are Multinomial Naive Bayes and Support Vector Machine. Both enable the identification of a small group of shared words that help, according to statistical weights, to classify a new text into the four disciplinary areas involved in the corpora. The results allow us to establish that Support Vector Machine classifies academic texts efficiently. Using this method, we were able to automatically identify the disciplinary domain of an academic text – based on a reduced number of shared content lexemes – delivering high performance even in highly-refined disciplines such as Psychology and Social Work.
Introduction
Most of the information processed by a student during his or her university life is in written form. Compulsory and complementary reading materials help students acquire knowledge about the discipline that they will use in their future professional lives. Nowadays, due to the massive production of rapidly disseminated specialised texts, there are more and better opportunities for accessing academic knowledge than in the past. The increasing number of digitised texts (such as e-books, e-journals, and digital newspapers) available on the World Wide Web is evidence of this changing scenario. This new medium of access to written information provides researchers with a unique opportunity to describe and classify genres of academic discourse. The field of academic discourse studies has focused primarily on identifying certain rhetorical structures and lexicogrammatical features of academic discourse genres (Bhatia 1993, 2002, 2004; Flowerdew & Peacock 2001;
Flowerdew 2002a; Halliday & Martin 1993; Halliday 2004b; Hyland 2004; Swales 1990, 2001, 2004). However, the results of analysing sample texts in English may not transfer directly into algorithms that automatically identify disciplines from text. Studies conducted on academic registers from a multi-dimensional and multi-feature perspective in English (Biber 2003, 2005, 2006, 2007; Biber & Conrad 2001) and in Spanish (Parodi 2005a, 2006a, 2007a, b; Biber & Tracy-Ventura 2007) may turn out to be a useful source of information for future algorithms, because they efficiently identify the clusters of lexicogrammatical features that characterise certain academic genres. However, these data have not been tested in automatic classification tasks. From a corpus linguistics and computer language processing perspective, research conducted in these fields has two purposes. The first is to describe and predict linguistic patterns. The second is to build automatic systems for text classification and retrieval by analysing linguistic information in the texts algorithmically. Texts from law and journalism have usually served as test data for such tasks (Boreham & Niblett 1976; Masand & Linoff 1992; Moens & Uyttendaele 1996; Tin-Yau 1998; Moens & Dumortier 2000; Stokes & Carthy 2001). Certain spam detection studies have also used texts from these same disciplinary domains (Lai 2007; Koprinska, Poon, Clark & Chan 2007). The aim of the research presented in this chapter was to classify the texts in the Academic Written Corpus PUCV-2006 automatically into the four disciplines to which the texts belong, using a limited group of words spread across all the texts of the corpus. We applied two alternative methods and then compared the results in order to identify the method that best discriminates among texts from various disciplines.
One application of this research might be the construction of personalised digital libraries aimed at complementing the academic literacy of university students. Another application may be a support system for the study of large-scale specialised corpora. The following discussion briefly touches on the concepts of specialised and academic discourse. Next, we describe both classification methods used in this research: Naive Bayes and Support Vector Machine (readers who find this description too technical may skip Sections 1.2.1, 1.2.2, and 1.2.3 and proceed directly to Section 2). We describe the corpus and the automatic classification methods employed in the method section. Finally, we present the results and conclusion.
1. Theoretical background
1.1 Specialised and academic discourse
Classifying a text as specialised or general discourse is a theoretical and descriptive problem (Schröder 1991; Parodi 2004, 2006b). It is a theoretical problem because researchers have not yet clearly determined the precise boundaries between the notion of general discourse and that of specialised discourse (Gläser 1982, 1993). Due to this ambiguity, researchers have used various terms for this concept: academic discourse, special discourse, professional discourse, technical discourse, and institutional discourse. The choice of terms depends on individual researchers’ ways of formulating their subject matter. Nowadays, the prevailing view of the boundary between these two notions favours a progressive continuum of discourses, from a domain of highly specialised discourses to a domain of general discourses (Gläser 1982; Schröder 1991; Halliday & Martin 1993; Jeanneret 1994; Ciapuscio 1994, 2000; Peronard 1997; Cabré 2002; Parodi 2004, 2006b). Parodi (2004 :10) suggests that: …it is a fact that establishing precise parameters between one type of text and another is a matter of great importance. Regardless of whether the focus of attention is on one or another classificatory criterion, there will always be mixed cases or limits; nevertheless, it seems that specialised discourse belongs to a recognisable category for any speaker of Spanish.
Gotti (2003), following the idea of a continuum and reflecting on the multidimensional nature of specialised discourse, suggests that there is no homogeneity among the various specialised languages. He argues that disciplinary variations produce not only special lexical connotations, but often influence other morphosyntactic and textual features. It is precisely this variation on different levels that reveals a descriptive problem, since the studies of specialised discourse have tended to operate on a lexical and syntactic level. This variation implies lack of integration of other levels, such as the textual level (Gotti 2003). We focus on studying and describing academic discourse at the university level: in particular, on texts that are read in four different academic disciplines at the Pontificia Universidad Católica de Valparaíso (PUCV) and included in the PUCV-2006 Corpus of Written Academic Spanish. We conceive academic discourse as the language used in academic settings, concerning particular disciplinary topics, and implying academic purposes. We assume in this research that academic discourse can be automatically identified through the retrieval of certain linguistic features that tend to co-occur
systematically in texts. This is because such systematic co-occurrence reflects communicative functions inherent to any text (Parodi 2004, 2005a; Venegas 2005, 2006, 2007; Ibáñez 2008; Parodi, Venegas, Gutiérrez, & Ibáñez 2008).
1.2 Towards automatic text classification
Text classification (Ciapuscio 1994, 2000; Parodi 2005a) can be done manually using multilevel criteria (linguistic, textual, pragmatic, functional, etc.). However, a quick, automatic process for completing this task is preferable. The process of creating algorithms for automatic text classification consists of the identification of variables that may be used to discriminate among texts belonging to distinctive and fixed classes. Automatic classification is useful because it can retrieve documents, for example, from the Web. This approach assumes that texts classified together address the same subject matter, have the same origin, or come from the same author (spam filters used for e-mail are an example of a tool developed using these processes) (Jurafsky & Martin 2000; Jackson & Moulinier 2002). Automatic classification of documents may be seen as a process of mathematical-statistical learning. An algorithm captures the distinguishing features of each category or class of document. There are no absolute or unequivocal characteristic identifiers. They lie on a graduated scale. Documents that have a certain characteristic have a statistical probability of belonging to a certain class. These characteristics add up to a coefficient of association for each of the fixed classes. Each coefficient represents the degree of certainty that a document really belongs to the class associated with that coefficient.
1.2.1 Vector representation The first step for any method that automatically assigns numbers to a text is to represent the document (a complete text, part of the text, or a group of texts) by characteristics of the document (for example, punctuation marks, content or functional words or another kind of textual mark). Text classification and information retrieval research projects often represent texts by the so-called bag-of-words classification. The bag-of-words model is a simplifying assumption used in natural language processing and information retrieval. A bag is an unordered collection of words disregarding grammar and even word order; only the number of occurrences is recorded (Jackson & Moulinier 2002). Not all of the words in a text contribute to its characterisation with the same relevance. The classification process removes words like articles, prepositions and conjunctions – known in linguistics as functional words and in natural language processing as stop words. Such words do not distinguish among categories and
thus render classification results misleading (Silva & Ribeiro 2003). In addition, the classifier does not consider words with a low occurrence rate in a given corpus because these words have little or no discriminatory weight. Research in Spanish text processing shows that a stop word list includes between 100 and 200 words (Cerviño, García, Calvo & Ceccatto 2004). In addition to discarding functional words, the classifier may apply lexical transformations such as lemmatisers, word-stem extractors, term-labelling functions, and multiword detectors to the text being classified. These computational tools are useful, as they minimise the number of words to be considered in the word analysis matrix. However, one drawback is that these applications may result in the loss of relevant information, such as gender and number, which may be necessary for accurate classification. Texts are characterised by creating a vector representation of the number of times a word occurs in the document. Each component of the vector corresponds to a word, and its statistical weight is a function of word occurrence. A variety of methods have been suggested to calculate the weight of each word in the representative vector (Salton & McGill 1983; Salton & Buckley 1988; Harman 1992). In general, two counterpoised ideas prevail: (1) if a word occurs frequently in a text, then it is important for characterising the document, but (2) if it occurs throughout a set of documents, then the word has little power to discriminate among documents. The classifier calculates how many times a word occurs in a given document in order to determine its representative capacity. This yields term frequency (tf). If the frequency of a term is extremely high throughout the entire collection of documents, then the researcher discards that term from the set of terms in the collection. In fact, the retrieval power of a term is inversely proportional to its frequency in the collection of documents: this is known as inverse document frequency (idf).
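The bag-of-words representation described above can be sketched in a few lines of Python. This is only a minimal illustration: the stop-word list, the `min_count` threshold, the function name `bag_of_words`, and the sample sentence are all hypothetical and are not taken from the study.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list; real Spanish lists
# are reported to hold between 100 and 200 words.
STOP_WORDS = {"el", "la", "los", "las", "de", "del", "en", "y", "que", "un", "una", "se"}


def bag_of_words(text, stop_words=STOP_WORDS, min_count=1):
    """Return an unordered word -> count mapping.

    Word order and grammar are discarded; stop words and words that
    occur fewer than min_count times are dropped.
    """
    tokens = re.findall(r"[a-záéíóúüñ]+", text.lower())
    counts = Counter(t for t in tokens if t not in stop_words)
    return {word: n for word, n in counts.items() if n >= min_count}


# Invented sample sentence, used only to exercise the function.
sample = "La reacción de los ácidos y la reacción de las bases en el laboratorio"
print(bag_of_words(sample))
print(bag_of_words(sample, min_count=2))
```

Raising `min_count` mimics the removal of low-frequency words, which the text notes carry little or no discriminatory weight.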
In order to calculate the weight of each element of the vector, the classifier determines the inverse document frequency of the term by dividing the overall number of documents by the number of documents containing the term, and then taking the logarithm of that quotient. Multiplying this value by the term frequency gives the tf-idf weighting method. Salton and Buckley (1988) experimented with more than 200 systems for calculating statistical weight. One of the most common is shown below in Equation 1:
$$ w_{ij} = tf_{ij} \times \log\left(\frac{N}{df_j}\right) $$
Equation 1
where w_{ij} is the weight of term j in document i, tf_{ij} is the number of times term j occurs in document i, df_j is the number of documents in which term j occurs, and N is the total number of documents in the collection. A word will receive more weight if it occurs in the document more frequently, and will receive less weight if it occurs in many documents. Using this relatively simple representative method, we can
classify documents by calculating the distance between the vectors of the documents, or between vectors of documents and vectors of classes of documents (Salton & Buckley 1988). This vector representation is the starting point for more complex vector or probabilistic classification models. In keeping with our goal of classifying the academic texts of the PUCV-2006 Corpus and identifying the similarities and differences among the disciplines in this corpus, we will compare two methods of classification based on the vector model: (1) the Naive Bayes (NB) method; and (2) the Support Vector Machines (SVM) method. These two methods are commonly used to classify documents. However, they have not been sufficiently tested on textual corpora in Spanish (despite exceptions, such as Figuerola et al. 2000; and Cerviño et al. 2004). In the following section, the main features of each method will be briefly described. However, a description of the statistical procedures and the calculations of these methods is beyond the scope of this Chapter. This information can be found in the works of Johnson (2000), Manning and Schütze (2003) and Molina and García (2004) for NB; and Cristianini and Shaw-Taylor (2002), Hsu, Chang and Lin (2003) and Betancourt (2005) for SVM.
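As a rough illustration of the vector model, the sketch below weights raw term counts by term frequency times the logarithm of N over document frequency, and then compares documents. Cosine similarity is used here as the vector comparison; this is an assumption, since the chapter does not say which distance measure is meant. The three toy documents and both function names are hypothetical.

```python
import math


def tf_idf_vectors(docs):
    """docs: list of dicts mapping term -> raw count in that document.

    Returns one dict per document with weights tf * log(N / df).
    """
    n_docs = len(docs)
    df = {}
    for doc in docs:
        for term in doc:
            df[term] = df.get(term, 0) + 1
    return [
        {term: tf * math.log(n_docs / df[term]) for term, tf in doc.items()}
        for doc in docs
    ]


def cosine_similarity(u, v):
    """Cosine of the angle between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(term, 0.0) for term, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Invented counts: two chemistry-like documents and one unrelated document.
docs = [
    {"reacción": 3, "método": 1},
    {"reacción": 2, "ácido": 1, "método": 1},
    {"familia": 4, "método": 2},
]
weights = tf_idf_vectors(docs)
print(weights[0])
print(cosine_similarity(weights[0], weights[1]))
print(cosine_similarity(weights[0], weights[2]))
```

In this toy collection the word "método" occurs in every document, so its weight is zero: a term spread across the whole collection has no power to discriminate among documents.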
1.2.2 Naive Bayes classifier (NB) Bayesian classifiers (Duda & Hart 1973) are statistical classifiers that can predict class membership probabilities, that is, the probability that a given sample belongs to a particular class. Many studies have demonstrated that Bayesian classification is highly accurate and fast when applied to large text databases, particularly in Spanish (Molina & Garcia 2004; Cerviño et al. 2004; Bordignon, Peri, Tolosa, Villa & Paoletti 2004). A Naive Bayes classifier assumes the possibility of computing or estimating the distribution of terms (words, bigrams, or phrases) within the documents assigned to each category. The idea is to use this term distribution to predict the class of unseen documents (Jackson & Moulinier 2002). Several studies that compared classification algorithms have determined that the performance of the NB classifier is comparable to that of decision tree and neural network classifiers, procedures that are computationally much more complex (Molina & Garcia 2004). The objective of the Bayesian classifier method is to determine the best (the most probable) hypothesis, given a set of pre-existing data. If we designate P(h) as the a priori probability of hypothesis h, P(D) as the a priori probability of the data (calculated by examination of the existing data) and P(D|h) as the probability of the data given a particular hypothesis, then we want to calculate P(h|D), the posterior probability of h given the observed data; hence the notion of conditional probability. Bayes’s theorem can be used to estimate this:
$$ P(h \mid D) = \frac{P(D \mid h)\, P(h)}{P(D)} $$
Equation 2
In order to estimate the most probable hypothesis (MAP, [maximum a posteriori hypothesis]) we want to find the highest P(h|D), as shown in the following equation:
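$$ h_{MAP} = \arg\max_{h \in H} P(h \mid D) = \arg\max_{h \in H} \frac{P(D \mid h)\, P(h)}{P(D)} = \arg\max_{h \in H} P(D \mid h)\, P(h) $$
Equation 3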
Since P(D) is a constant independent of h, it can be omitted from the maximisation. If we further assume that all hypotheses are equally probable a priori, P(h) can be omitted as well. This leads to the concept of the hypothesis of maximum likelihood (ML) expressed in Equation 4:
$$ h_{ML} = \arg\max_{h \in H} P(D \mid h) $$
Equation 4
The Naive Bayesian classifier has a particular application to classifying a set of characteristics or attributes (a_1, ..., a_n) within a finite set of classes (V). It classifies in accordance with the most probable value, given the values of the attributes. Hence, if this criterion is applied to the classification process, the result is Equation 5:
$$ v_{MAP} = \arg\max_{v_j \in V} P(v_j \mid a_1, \dots, a_n) = \arg\max_{v_j \in V} P(a_1, \dots, a_n \mid v_j)\, P(v_j) $$
Equation 5
The Naive Bayesian classifier also assumes that the values of the attributes are conditionally independent, given the value of the class. This leads to Equation 6 and, in turn, to Equation 7.
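$$ P(a_1, \dots, a_n \mid v_j) = \prod_{i=1}^{n} P(a_i \mid v_j) $$
Equation 6
$$ v_{NB} = \arg\max_{v_j \in V} P(v_j) \prod_{i=1}^{n} P(a_i \mid v_j) $$
Equation 7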
The Naive Bayesian classifier assumes that the effect of an attribute value on a given class is independent of the values of the other attributes. This assumption is called class conditional independence (Jurafsky & Martin 2000; Molina & García 2004; Bordignon et al. 2004). It allows us to simplify the calculations involved.
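To make the independence assumption concrete, the following sketch implements a small multinomial Naive Bayes classifier: the score of a class is its log prior plus the sum of the log probabilities of each word given the class. Several choices here are assumptions rather than details of the study: add-one (Laplace) smoothing, the use of log probabilities to avoid numerical underflow, the toy training data, and the function names.

```python
import math
from collections import Counter, defaultdict


def train_nb(labelled_docs):
    """labelled_docs: list of (class_label, list_of_words) pairs."""
    class_counts = Counter(label for label, _ in labelled_docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, words in labelled_docs:
        word_counts[label].update(words)
        vocab.update(words)
    return class_counts, word_counts, vocab


def classify_nb(words, model):
    """Return the class that maximises log P(v) + sum_i log P(a_i | v)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label in class_counts:
        score = math.log(class_counts[label] / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for w in words:
            if w in vocab:  # words never seen in training are ignored
                # Add-one smoothing (an assumption, not from the chapter).
                score += math.log(
                    (word_counts[label][w] + 1) / (total_words + len(vocab))
                )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Invented training data with two hypothetical classes.
training = [
    ("QUI", ["reacción", "ácido", "reacción"]),
    ("QUI", ["ácido", "solución"]),
    ("PSI", ["conducta", "terapia"]),
    ("PSI", ["terapia", "familia", "conducta"]),
]
model = train_nb(training)
print(classify_nb(["ácido", "reacción"], model))
print(classify_nb(["terapia"], model))
```

Because each word contributes an independent term to the sum, the cost of classifying a text grows linearly with its length, which is the simplification the independence assumption buys.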
1.2.3 Support Vector Machines (SVMs) Support Vector Machines (SVMs) are a relatively new method of data classification. Several researchers have applied this method to a broad range of classification problems with favourable results, and several studies have shown that SVM minimises errors in text classification processes (Cortes & Vapnick 1995; Hsu, Chang & Lin 2003; Baldi, Fresconi & Smyth 2003; Téllez 2005). SVMs are particularly appropriate when working with high-dimensional data, such as vector representations of text documents. Their standard formulation is designed to solve binary classification problems, but they may also be used for multi-class classification by reducing the overall problem to binary sub-problems (Cortes & Vapnick 1995). A normal text classification task involves data for training and data for testing the algorithm, based on quantifiable characteristics of the texts. Each textual unit in the training group contains a classification value, represented by a class label, and multiple characteristics. SVMs produce a model that predicts the classification values (i.e., identifies the class) in the testing phase by using only the attributes (Baldi et al. 2003). SVMs solve the geometric problem of identifying a linear decision frontier between two groups. SVM analysis finds the line (in higher dimensions, a hyperplane) that is oriented so as to maximise the margin between the two groups, similarly to discriminant analysis (Sharma 1996; Hair, Anderson, Tatham & Black 1999). However, SVMs add a new operation called a kernel function (its use is also known as the kernel trick). This operation allows non-linear separations of the data, optimising the classification. A kernel function is a generalisation of the inner product: it measures the similarity between two vectors as the data are projected into a higher-dimensional space. Figure 1 shows that there are multiple ways to separate classes in a high-dimensional space. However, we need the hyperplane of maximum separation.
The frontier of optimal separation is that which minimises the posterior probability of erroneously classifying a new text. This frontier must be orthogonal to the line connecting the centres of mass of these two distributions. In this case, the frontier of optimal separation is the dotted line in Figure 1. Random hyperplanes (the segmented lines) that separate the two classes of the training points (i.e. the values that represent each text used in the training phase) have short distances to individual training points but will generalise poorly when presented
Figure 1. Linear decision frontiers for a binary classification problem (Baldi et al. 2003: 99)
with new data points (Baldi et al. 2003). As the dimensionality of the space of the documents increases, the computational complexity of the problem increases exponentially. Vapnick (2000), in his theory of statistical learning, proposes a hyperplane of maximum separation that has two important properties: (1) it is unique for each linearly separable data set, and (2) its associated risk of overfitting is lower than that of any other separating hyperplane. The margin of separation M of the classifier is the distance between the hyperplane of separation and the nearest training point. Hence, the hyperplane of maximum separation is the one with the maximum margin. In order to calculate this hyperplane, we start by determining the distance from one data point x to the hyperplane of separation (see Figure 2). The margins are defined by the nearest training points, which are the support vectors (as shown in Figure 2). The final determination of the hyperplane of maximum separation involves a series of steps that must satisfy the Karush-Kuhn-Tucker conditions (Vapnick 2000; Cristianini & Shaw-Taylor 2002; Baldi et al. 2003). A key characteristic of SVMs is their use of kernel functions (e.g., polynomial, radial basis function, sigmoidal) to extend the determination of the hyperplane of maximum separation to non-linear cases (Cristianini & Shaw-Taylor 2002). This is accomplished by mapping data points in space X into a higher-dimensional space of characteristics F by a function Φ, and solving the linear learning problem in F (Φ: X → F). The actual function Φ does not need to be known; it is sufficient to have a kernel function k to calculate the inner product in the space of
Figure 2. Illustration of the hyperplane of maximum separation and its margins. The data in circles indicate the support vectors (Baldi et al. 2003: 100)
Figure 3. Mapping data into another space using kernel functions to simplify the classification task (Cristianini & Shaw-Taylor 2002: 28)
characteristics: k(x, y) = Φ(x) · Φ(y) (Cristianini & Shaw-Taylor 2002; Bautista, Guzmán & Figuerola 2004). Figure 3 illustrates this. In sum, the goal of SVM is to find the optimal hyperplane that separates clusters of vectors so that cases with one category of the target variable are on one side
of the plane and cases with the other category are on the other side of the plane. The vectors near the hyperplane are the support vectors. We will compare these two techniques in the text classification task.
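The kernel idea summarised in this section can be checked numerically on a small case. The sketch below uses a two-dimensional quadratic kernel, for which an explicit feature map Φ is known, and shows that the kernel value equals the inner product of the mapped points. It also includes a radial basis function kernel and the point-to-hyperplane distance used to define the margin. The choice of the quadratic example, the `gamma` value, the sample points, and all function names are illustrative assumptions; this is not the SVM training procedure itself.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def phi(x):
    """Explicit feature map for the 2-D homogeneous quadratic kernel."""
    x1, x2 = x
    return (x1 * x1, math.sqrt(2) * x1 * x2, x2 * x2)


def poly2_kernel(x, y):
    """k(x, y) = (x . y)^2, computed without ever building phi(x)."""
    return dot(x, y) ** 2


def rbf_kernel(x, y, gamma=0.5):
    """Radial basis function kernel; gamma is a hypothetical setting."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def distance_to_hyperplane(x, w, b):
    """Distance from point x to the hyperplane w . x + b = 0."""
    return abs(dot(w, x) + b) / math.sqrt(dot(w, w))


x, y = (1.0, 2.0), (3.0, 0.5)
print(poly2_kernel(x, y), dot(phi(x), phi(y)))  # the two values agree
print(rbf_kernel(x, y))
print(distance_to_hyperplane((2.0, 0.0), w=(1.0, 0.0), b=-1.0))
```

The first line of output illustrates why Φ never needs to be computed in practice: the kernel gives the inner product in the mapped space directly from the original vectors.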
2. Method
2.1 The PUCV-2006 Corpus of Written Academic Spanish
As stated above, our main objective is to describe the PUCV-2006 Corpus of Academic Spanish (PUCV-2006 Corpus) and to determine which of the two methods we described will be the most effective in classifying academic texts used in four disciplines. This corpus includes nearly 100% of the texts read by students in four university disciplines: Industrial Chemistry, Construction Engineering, Social Work, and Psychology. The first two represent Basic Sciences and Engineering, and the latter two represent Social Sciences and Humanities. The corpus comprises 491 digital texts, with a total of 58,594,630 words (for more details see Chapter 4). Table 1 indicates the total number of texts and words in the samples used in this study. As shown in Table 1, the sample used in the study (216 texts and more than 30 million words) represents a substantial percentage of the original corpus (about 44% of the digitised texts and 52.71% of the total number of words). The difference in sample percentages does not affect representativeness of the results, because the data used in methods based on vector models are subject to normalisation and statistical weighting processes.

Table 1. Percentages for the number of texts and words used in the research sample with respect to the total corpus

| Domain | Careers | Digitised texts: PUCV-2006 | Digitised texts: Sample | Digitised texts: % | Words: PUCV-2006 | Words: Sample | Words: % |
|---|---|---|---|---|---|---|---|
| Basic Sciences and Engineering | Industrial Chemistry | 53 | 26 | 49.05% | 9,285,375 | 8,209,911 | 88.42% |
| Basic Sciences and Engineering | Construction Engineering | 69 | 31 | 44.92% | 8,734,086 | 4,937,665 | 56.53% |
| Social Sciences and Humanities | Social Work | 142 | 64 | 45.07% | 18,641,309 | 7,601,999 | 40.78% |
| Social Sciences and Humanities | Psychology | 227 | 95 | 41.85% | 21,933,860 | 10,136,506 | 46.21% |
| Total | | 491 | 216 | 43.99% | 58,594,630 | 30,886,081 | 52.71% |
2.2 Procedure
The method has two stages: (1) text pre-processing and (2) classification.
2.2.1 Pre-processing the texts The first step was to create a text file compiling all the texts of the respective disciplines. We did this with an ad hoc Perl programme (cambiopala.pl). We obtained four text files containing textual information from the disciplines under study (QUI.txt, IC.txt, TS.txt, and PSI.txt). This process included the elimination of non-linguistic information and stop words. The four text files were then uploaded to El Grial (www.elgrial.cl). This interface for automatic text tagging and corpus query, based on the Connexor tagger and parser, was used to accurately identify all the nouns, verbs, and adjectives present in each of the four files. This was also complemented by a manual revision system for Spanish, ensuring a nearly 100% reliability percentage (Parodi 2006a). Once the semantic content words were identified, we determined which of these were shared by the four text files by means of another Perl programme (batch.pl). Unlike other studies (Figuerola et al. 2000; Cerviño et al. 2004), we did not do any lemmatisation or root extraction, because we wanted to simplify the process of word recognition in the classification procedures. This allows the
computational programme to tag and process new texts without having to execute any complex lemmatisation or stemming process. After identifying shared semantic content words for the four text files, the computational programme constructed a matrix of 2,729 lexemes, showing the frequency of each shared word in each individual text. This procedure was based on the assumption that the texts could be differentiated according to the patterns of occurrence of shared words. Figure 4 diagrams the pre-processing stage and the programmes associated with each substage.
Figure 4. Pre-processing of the sample texts for the application of the classification techniques
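The two pre-processing outputs described above, the set of content words shared by all four discipline files and the text-by-word frequency matrix, can be sketched as follows. The study used Perl programmes for these steps; the Python below is a stand-in, not the original code. The discipline keys echo the file names QUI, IC, TS and PSI, but the word lists and the function names are invented for illustration.

```python
from collections import Counter


def shared_vocabulary(discipline_words):
    """discipline_words: dict mapping discipline name -> list of content words.

    Returns the sorted list of words that occur in every discipline.
    """
    vocabularies = [set(words) for words in discipline_words.values()]
    return sorted(set.intersection(*vocabularies))


def frequency_matrix(texts, vocabulary):
    """Return one row per text and one column per shared word."""
    rows = []
    for words in texts:
        counts = Counter(words)
        rows.append([counts[w] for w in vocabulary])
    return rows


# Invented content words for the four discipline files.
disciplines = {
    "QUI": ["cálculos", "reacciona", "banco", "cálculos"],
    "IC": ["cálculos", "banco", "cámara"],
    "TS": ["banco", "cálculos", "cartas"],
    "PSI": ["cartas", "banco", "cálculos", "pienso"],
}
shared = shared_vocabulary(disciplines)
matrix = frequency_matrix([disciplines["QUI"], disciplines["PSI"]], shared)
print(shared)
print(matrix)
```

Because inflected forms are kept as distinct words, no lemmatiser is needed when a new text is added: its row is obtained by counting the shared words directly.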
2.2.2 Application of the classification techniques NB and SVM require normalisation of the matrix data (the frequency at which the 2,729 shared words appear in each of the 216 texts) as a prerequisite for constructing the corresponding vector spaces. As mentioned above, Salton and Buckley (1988) experimented with more than 200 systems for calculating weights in order to normalise the data. In our case, we follow the model by Salton (1968) expressed in Equation 1. This process yields a matrix with the same vectors as the original matrix, but with terms weighted according to how much information they provide. The large size of the matrix makes statistical calculations more difficult. The matrix must thus be reduced to a smaller matrix that is representative of the original. In the case of the NB classifier, we combined two techniques for this: document frequency thresholding (DF) and information gain (IG) (see Yang & Pedersen 1997; Manning & Schütze 2003). By applying DF and IG to the NB method, we identified 74 shared-content words that would be useful in the classification task. For the SVM method, only the IG criterion had to be applied. We identified 515 shared-content words that were useful for classifying texts. In addition, the Radial Basis Function (RBF) was used as the kernel function, since this function has a high degree of efficiency during the training phase (Colmenares 2007). The distribution of shared-content words chosen by the statistical methods is quite similar in both methods. In the case of NB, 48% of the words were nouns; 35% were verbs; and 11% were adjectives. SVM also has a preference for nouns (54%), then for verbs (38%) and finally, for adjectives (14%). Table 2 shows an example of eight shared-content words per word class used as classification variables by both methods.
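The two feature-selection criteria named above can be illustrated with a short sketch. Information gain is computed here as the reduction in class entropy obtained by knowing whether a term is present in a document, and document frequency thresholding simply drops terms found in too few documents. The thresholds `min_df` and `top_k`, the toy documents, and the function names are hypothetical; the study does not report the thresholds it used.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(term, docs):
    """docs: list of (class_label, set_of_words) pairs."""
    all_labels = [label for label, _ in docs]
    with_term = [label for label, words in docs if term in words]
    without_term = [label for label, words in docs if term not in words]
    p_term = len(with_term) / len(docs)
    return (
        entropy(all_labels)
        - p_term * entropy(with_term)
        - (1 - p_term) * entropy(without_term)
    )


def select_terms(docs, min_df=2, top_k=2):
    """Keep terms found in at least min_df documents, ranked by information gain."""
    df = Counter(w for _, words in docs for w in words)
    candidates = [w for w, n in df.items() if n >= min_df]
    ranked = sorted(candidates, key=lambda w: (-information_gain(w, docs), w))
    return ranked[:top_k]


# Invented documents from two hypothetical classes.
docs = [
    ("QUI", {"reacciona", "banco"}),
    ("QUI", {"reacciona", "cálculos", "banco"}),
    ("PSI", {"pienso", "banco"}),
    ("PSI", {"pienso", "cartas", "banco"}),
]
print(information_gain("reacciona", docs))
print(information_gain("banco", docs))
print(select_terms(docs))
```

In this toy set, a word that appears in every document gains nothing, while a word confined to one class gains the full class entropy.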
Table 2. Example of shared-content words used by both methods

| Nouns | Verbs | Adjectives |
|---|---|---|
| Antigüedad (ancient times) | Calculan (they calculate) | Absolutas (absolute) |
| Apéndice (appendix) | Pienso (I think) | Acompañada (accompanied) |
| Arma (weapon) | Reacciona (he/she reacts) | Ascendente (ascendant) |
| Arreglo (arrangement) | Reaccionan (they react) | Asimétrica (asymmetrical) |
| Banco (bank) | Sube (something rises) | Convencionales (conventional) |
| Cálculos (calculations) | Suman (they add) | Dividida (divided) |
| Cámara (camera) | Tenerse (to take into) | Doméstico (domestic) |
| Cartas (letters) | Unirse (to join) | Esperado (expected) |

Once the most appropriate shared semantic content words for the classification of the texts were identified, the next step was to use each method to calculate the classification values. This was executed in three phases: (a) the training phase, (b) the testing phase, and (c) the general classification phase. Approximately 80% of the texts for each discipline were used in the training phase, and the remaining 20% were used in the testing phase. This text distribution was confirmed and validated by means of a K-fold cross-validation analysis, which brings out the best input parameters for a classifier model. In the case of SVM, a 34-fold cross-validation analysis and a 3,000 seed were used. The result was an 85% adjustment rate for the training and the test set. We measured the effectiveness of each method using recall and precision metrics. In a statistical classification task, the precision for a class is the number of true positives (i.e. the number of items correctly labelled as belonging to the class) divided by the total number of elements labelled as belonging to the class (i.e. the sum of true positives and false positives, which are items incorrectly labelled as belonging to the class). In this context, recall is defined as the number of true positives divided by the total number of elements that actually belong to the class (i.e. the sum of true positives and false negatives, items that were not labelled as belonging to that class but which should have been). We also calculated the F (Fβ) measurement (Figuerola et al. 2000). This is expressed as:
$$ F_\beta = \frac{(\beta^2 + 1)\, P R}{\beta^2 P + R} $$
Equation 8
where P is precision, R is recall, and β is a parameter that adjusts the relative influence of these two components. β = 1 means that both components of the measurement are weighted equally. With the formulation in Equation 8, when β is greater than 1, recall is favoured over precision, and when β is less than 1, precision is favoured over recall (Manning & Schütze 2003).
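The evaluation measures can be computed directly from counts of true positives, false positives and false negatives. The sketch below is a minimal illustration; the counts are invented and the function name `precision_recall_f` is hypothetical.

```python
def precision_recall_f(true_pos, false_pos, false_neg, beta=1.0):
    """Return (precision, recall, F_beta) for one class.

    F_beta = (beta^2 + 1) * P * R / (beta^2 * P + R)
    """
    labelled = true_pos + false_pos  # items labelled as the class
    actual = true_pos + false_neg    # items that really belong to the class
    precision = true_pos / labelled if labelled else 0.0
    recall = true_pos / actual if actual else 0.0
    denom = beta ** 2 * precision + recall
    f_beta = (beta ** 2 + 1) * precision * recall / denom if denom else 0.0
    return precision, recall, f_beta


# Invented counts for a single class: 8 correct, 2 wrongly included, 8 missed.
for beta in (0.5, 1.0, 2.0):
    p, r, f = precision_recall_f(8, 2, 8, beta=beta)
    print(f"beta={beta}: precision={p:.3f} recall={r:.3f} F={f:.3f}")
```

Running the loop with several values of β on the same counts shows how the measure moves between the precision and recall values as β changes.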
3. Results
We will first examine the results from the classification training phase, followed by the results from the testing phase. We will then focus on the overall results of classification for the two methods and their corresponding evaluation.
3.1 Training phase
We first present results obtained using both classification methods. We selected 172 of the 216 texts (79.6%) for the training phase, and calculated the percentage of correct identifications for each discipline. Table 3 shows the percentage of correct classifications obtained using the NB method and the SVM method with the Radial Basis Function. NB performed well for Industrial Chemistry and Construction Engineering texts, correctly classifying 100% of the Chemistry texts and 91.67% of the Engineering texts. This percentage came to 83.33% for Social Work texts. NB performed poorly with Psychology texts, correctly classifying only 58.33% of these. Without considering disciplines, the NB method correctly classified 83.33% of the texts. Although NB actually outperformed SVM on Chemistry texts, SVM outperformed NB in the other three disciplines, and by a substantial margin in Social Work (98.15%) and Psychology (98.59%). Considering all the texts without the disciplinary criterion, SVM correctly classified 98.04% of the texts, outperforming NB by 14.71 percentage points.

Table 3. Percentage classification in the training phase

| % | Naive Bayes | Support Vector Machine |
|---|---|---|
| Construction Engineering | 91.67% | 100% |
| Industrial Chemistry | 100% | 95.45% |
| Social Work | 83.33% | 98.15% |
| Psychology | 58.33% | 98.59% |
| Average Percentage | 83.33% | 98.04% |
In summary, based on the data from academic texts in the four disciplines presented during the training phase, the SVM classification method performed better than the NB classifier.
3.2 Testing phase
The testing phase is designed to test the algorithms trained in the previous phase. The percentages in Table 4 indicate the predicted level of correct classification for the 44 academic texts that were not included in the training phase (20.4% of the total academic texts). In this predictive stage, as in the training phase, the NB method correctly classified all of the Industrial Chemistry texts. However, for Construction Engineering texts, the correct classification percentage dropped sharply with respect to the training stage (57.14%). The percentage for Social Work texts (80%) is similar to that of the training phase. On the other hand, for Psychology, despite slightly better results than in the training phase, NB only classified 60.87% of these texts correctly. The mean percentage of texts correctly classified by the NB classifier for the four areas was 74.5%, almost 9 percentage points lower than in the previous phase. The most remarkable change from the training phase using SVM was observed for Social Work texts, where the method performed worse than NB (54% against 80% for NB). SVM and NB performed identically in Industrial Chemistry texts (100%). Both algorithms classified the texts similarly, suggesting a commonality among texts in this discipline. For Construction Engineering, SVM outperformed NB (100% against 57.14%). As for Psychology, SVM correctly predicted 93.33% of the academic texts, which is slightly lower than the training phase prediction, but still significantly higher than the NB classifier prediction (60.87%). The mean correct classification percentage for SVM came to 84.66% in the testing phase, which is 13.38 percentage points lower than in the training phase. In general terms, both methods showed a reduced level of classification capacity in the testing phase compared to the training phase. Although NB performed more consistently in the Social Science disciplines, SVM evidenced a better classification capacity in Construction Engineering and Psychology.

Table 4. Classification percentages in the testing phase

| % | Naive Bayes | Support Vector Machine |
|---|---|---|
| Construction Engineering | 57.14% | 100% |
| Industrial Chemistry | 100% | 100% |
| Social Work | 80% | 54% |
| Psychology | 60.87% | 93.33% |
| Average Percentage | 74.50% | 84.66% |
3.3
General classification phase
The third phase examines the general classificatory capacity of both methods when applied to texts belonging to the PUCV-2006 Corpus. Table 5 presents the general classification percentages of academic texts in the four disciplines under study using both methods. Once again, the NB method correctly classified 100% of Industrial Chemistry texts. This percentage is lower for Construction Engineering compared to the training phase but higher when compared to the testing phase. Social Work results are similar to those in the other two phases. NB continued to perform quite poorly with Psychology texts (about 59%). This is by far the weakest discipline for the NB classifier. The mean percentage for the general classification phase came to 81.40%. This percentage is slightly lower than the percentage observed for NB in the training phase (down 1.62%), but higher (by 6.36%) than that of the testing phase. In the general classification phase, SVM levelled off with steadier performance similar to its training phase results (with a small increase in the standard deviation: training phase 1.9 and general classification phase 4.3). Industrial Chemistry and Psychology evidenced very high percentages of correctly classified texts and Construction Engineering and Social Work were only slightly lower, both at approximately 90%. In particular, Construction Engineering improved significantly compared to the testing phase. SVM outperformed NB (6.45% higher) with respect to correctly classified Construction Engineering texts. SVM also outperformed NB (by 6.25%) when classifying Social Work texts. In Psychology, the SVM method maintained its high degree of accurate classification (97.85%), while maintaining its clear superiority (about 39% better) over the NB method in this discipline. The mean percentage of correctly classified academic texts for the SVM method in this general phase amounted to 93.34%.

Table 5. Classification percentages in the general classification phase (%)

                            Naive Bayes    Support Vector Machine
Construction Engineering    83.87%         90.32%
Industrial Chemistry        100%           96.15%
Social Work                 82.81%         89.06%
Psychology                  58.95%         97.85%
Average Percentage          81.40%         93.34%

This percentage,
138 René Venegas
Figure 5. A comparison of methods in each of their phases [chart: values on a 0 to 100 scale for Construction Engineering, Industrial Chemistry, Social Work and Psychology, grouped by method (NB, SVM) within the Training, Testing and General phases]
although somewhat lower than that of the training phase (about 5% lower), is much higher than that reported in the testing phase (about 20.2% higher). Without considering the disciplines, SVM correctly classified 93.34% of the texts, and NB 81.4% of them. This result ranks SVM performance 11.94 percentage points higher than that of the NB classifier. In summary, Figure 5 shows the percentages of correctly classified academic texts from each of the four disciplines, organised according to the methods used in each of the three phases (_Tr = Training; _Te = Testing; _G = General). A remarkable feature indicated by Figure 5 is the consistency of correct Industrial Chemistry text classification, independent of the method used or the classification phase. This suggests that the content lexemes in this case present stable values for word frequency that are characteristic of Industrial Chemistry texts; these values are significantly different from those for texts from the other three disciplines. Another observation is that the SVM method clearly outperformed the NB classifier in each of the phases when it came to classifying Psychology texts. As for Social Work texts, the SVM method generally outperformed the NB method, with the exception of the testing phase, where SVM performance dropped sharply. In contrast, the NB classifier evidences very stable percentage values (about 80%) for Social Work in all three phases. As for Construction Engineering, the NB method evidences substantial variation across the three phases of classification, dropping below the 60% mark in the testing phase. However, the SVM method evidences a high percentage of correct Construction Engineering text classification during the training and the general phases. The general classification phase evidences a noticeable difference between the NB and SVM methods. With the exception of Industrial Chemistry (an area with great stability across the board), the NB classifier shows rather lower results than
SVM. In contrast, the SVM method performs quite well in all four disciplinary areas, obtaining high correct classification percentages in each case.
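The comparisons quoted in this phase can be re-derived from the percentages in Table 5. The following is only a minimal sketch: the names TABLE_5, mean_accuracy and svm_minus_nb are hypothetical, and the means are assumed to be unweighted averages over the four disciplines, which matches the reported figures to within rounding.

```python
# Percentages of correctly classified texts, copied from Table 5.
# All names in this sketch are illustrative, not part of the original study.
TABLE_5 = {
    "Construction Engineering": {"NB": 83.87, "SVM": 90.32},
    "Industrial Chemistry": {"NB": 100.00, "SVM": 96.15},
    "Social Work": {"NB": 82.81, "SVM": 89.06},
    "Psychology": {"NB": 58.95, "SVM": 97.85},
}


def mean_accuracy(table, method):
    """Unweighted mean of the per-discipline percentages for one method."""
    values = [row[method] for row in table.values()]
    return sum(values) / len(values)


def svm_minus_nb(table):
    """Per-discipline difference (SVM minus NB) in percentage points."""
    return {name: round(row["SVM"] - row["NB"], 2) for name, row in table.items()}


if __name__ == "__main__":
    for method in ("NB", "SVM"):
        print(f"{method} mean: {mean_accuracy(TABLE_5, method):.2f}")
    for name, diff in svm_minus_nb(TABLE_5).items():
        print(f"{name}: {diff:+.2f}")
```

A positive difference means that SVM classified a larger share of that discipline's texts correctly; only Industrial Chemistry yields a negative value.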
3.4
Method evaluation
In this section, we apply evaluation metrics in order to validate the results obtained by both methods of classification. As mentioned previously, we will calculate recall and precision. In addition, we will apply the Fβ measurement, which allows us to evaluate the relationship between these two criteria (see Table 6). With respect to the NB method, the data in Table 6 show that NB classifier recall is generally higher than its precision. This is confirmed by the Fβ measurement. The NB method tends to be better at capturing relevant information about the texts as opposed to providing a high degree of text classification precision. As for the SVM method, we have seen that the means for recall and precision are quite similar (R = 0.93 and P = 0.95). More importantly, both measurements are considerably higher than those obtained using the NB method. This suggests that the SVM method classifies academic texts in these four areas with greater precision and recall than the NB method. The Fβ measurement in SVM confirms the balance between the two measurements. The results indicate that the SVM method with the RBF kernel is a better instrument for classifying the texts of the PUCV-2006 Corpus than the NB method when following the methods used in this study.

Table 6. Validation measurements for the results obtained using both methods (R = Recall; P = Precision)

                            Naive Bayes             Support Vector Machine
                            R       P       Fβ      R       P       Fβ
Construction Engineering    0.84    0.76    0.80    0.90    0.97    0.93
Industrial Chemistry        1.00    0.52    0.68    0.96    0.96    0.96
Social Work                 0.83    0.80    0.82    0.89    0.97    0.93
Psychology                  0.59    0.85    0.70    0.98    0.91    0.94
Mean                        0.81    0.73    0.75    0.93    0.95    0.94
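As an illustration of how the Fβ column relates to recall and precision, the sketch below recomputes it from the R and P values in Table 6. The chapter does not state the value of β at this point; β = 1 (the harmonic mean of the two measures) is assumed here because it reproduces the tabulated figures to within rounding. The names f_beta, TABLE_6 and max_deviation are hypothetical.

```python
def f_beta(recall, precision, beta=1.0):
    """Weighted harmonic mean of precision and recall (F-beta)."""
    if recall == 0 and precision == 0:
        return 0.0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)


# (recall, precision, reported F) per method and discipline, copied from Table 6.
TABLE_6 = {
    "NB": {
        "Construction Engineering": (0.84, 0.76, 0.80),
        "Industrial Chemistry": (1.00, 0.52, 0.68),
        "Social Work": (0.83, 0.80, 0.82),
        "Psychology": (0.59, 0.85, 0.70),
    },
    "SVM": {
        "Construction Engineering": (0.90, 0.97, 0.93),
        "Industrial Chemistry": (0.96, 0.96, 0.96),
        "Social Work": (0.89, 0.97, 0.93),
        "Psychology": (0.98, 0.91, 0.94),
    },
}


def max_deviation(table, beta=1.0):
    """Largest gap between a recomputed F value and the reported one."""
    return max(
        abs(f_beta(r, p, beta) - reported)
        for rows in table.values()
        for r, p, reported in rows.values()
    )


if __name__ == "__main__":
    for method, rows in TABLE_6.items():
        for discipline, (r, p, reported) in rows.items():
            print(f"{method:3} {discipline:25} F={f_beta(r, p):.3f} (reported {reported:.2f})")
```

Small gaps between recomputed and reported values are expected, since the R and P entries in the table are themselves rounded to two decimals.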
Conclusions
The contribution of this study is to demonstrate the feasibility of a computerized system of automatic classification for a large number of disciplinary texts that are used in undergraduate university courses in Chile. It is particularly interesting to
observe that the procedure, based on a reduced number of content words shared by the texts from the four disciplines considered, evidences favourable results using either one of the two classification methods tested. This might suggest that choosing this type of linguistic information is very useful for different algorithms, guaranteeing a reliable classification system that is relatively simple to implement in any computer setting. Another particularly interesting finding is the fact that content words statistically chosen as variables show the same distribution percentage regardless of the method used. Thus, there is greater preference for the use of nouns (between 48% and 54%), than for verbs (between 35% and 38%) and finally, for adjectives (between 11% and 14%). Therefore, nouns seem to be the most useful word class for classifying disciplinary texts. These figures may be used to affirm that variation in the frequency of these shared nouns was determined by the difference these words have in each of the four disciplines. This is consistent with the idea that the use of individual words in text representation needed in order to carry out computerized classification and retrieval tasks usually works better than more complex representations (Zhang, Yoshida & Tang 2008). However, in this case, information on these words has been enriched by the tagging process. With regard to the disciplines, it is interesting to note that Industrial Chemistry texts have a very high general classification percentage regardless of the method used. This may be explained by the existence of a clear quantitative pattern in the use of content words associated with the texts of this discipline. Consequently, we may expect that these texts will be more likely to be identified and classified using computational means such as the Internet or digital databases. 
We may also assert that both methods achieve better results when classifying texts from the Social Sciences and Humanities (Social Work and Psychology) than when sorting out Basic Sciences and Engineering texts (Industrial Chemistry and Construction Engineering). This is due to the stronger disciplinary relation between these areas, evinced by the use of similar lexical patterns. An example would be texts such as Theories in Social Psychology (Deutsch & Krauss 1980), included in the Social Work corpus, or Social Psychology of the Organisation Process (Weick 1982), included in the Psychology corpus. Regarding comparison of the methods used, SVM classified the texts of the PUCV-2006 Corpus of Academic Spanish with a higher degree of balance between precision and recall (Fβ 0.94) than the NB method (Fβ 0.75). This difference may exist for three reasons:
a. The NB, despite having excellent recall for Industrial Chemistry texts (1), evidences very low precision (0.52). This implies that the method classifies texts from other disciplines as if they were Industrial Chemistry texts.
b. NB recall in the classification of Psychology texts came to 58.95%, while SVM recall was 97.85%. Thus, despite the fact that both methods evidence low percentages of error (NB 13% and SVM 2.15%) when discriminating between psychology and social work texts, the NB method tends to classify Psychology texts as if they were Industrial Chemistry texts at a significant percentage (22.11%).
c. SVM has a very stable average percentage (94%) for recall and precision in the classification of texts from all four disciplines.
Consequently, SVM appears to be the most adequate method for classifying disciplinary texts in Spanish, based on the reduced number of shared content lexemes, delivering high performance even in highly refined disciplines such as Psychology and Social Work. This confirms the results of some studies that have used different types of texts in English (Joachims 1998; Kang 2005; Zhang, Yoshida & Tang 2008), as well as others that have worked with texts in Spanish (Solorio, Pérez-Coutiño, Montes-y-Gómez, Villaseñor-Pineda & López-López 2005; Téllez 2005).
In sum, the findings of our study confirm the feasibility of creating computer programmes that can both identify content words with a high occurrence in large corpora of digitised texts (on the Internet, for example), and classify them by discipline. This will be highly valuable in the creation and organization of specialised text corpora. Another possible application is for the implementation and administration of virtual libraries for specialised disciplines that can cater to the needs of undergraduate students.
Finally, we can predict further advances in the description of academic texts from these four disciplines through the use of a corpus-based approach, as is the case of several studies conducted at the Pontificia Universidad Católica de Valparaíso over the last few years (Parodi & Venegas 2004; Venegas 2005; Cademártori, Parodi & Venegas 2006; Parodi 2006 a, b; Venegas 2006). In particular, we predict the potential use of the SVM method for the classification of academic texts, this time considering discourse genres.
Chapter 8
Rhetorical organisation of Textbooks
A “colony-in-loops”?
Giovanni Parodi
In this chapter, we identify and describe the rhetorical organization of the Textbook genre, based on a part of the PUCV-2006 Academic Corpus of Spanish. This corpus includes a total amount of one hundred and twenty-six textbooks collected from four disciplines: Social Work, Psychology, Construction Engineering and Industrial Chemistry. In order to do so, we identify steps and moves of the genre, and we describe the communicative macro-purposes underlying these discourse units. In addition, we selected passages from the texts to exemplify the corresponding steps and moves. A new macro-level of analysis is introduced and justified, which becomes necessary for a better description of an extensive textual unit such as this. We have named it macro-move. At the end of the chapter, we answer the question appearing in the chapter’s title and we explain why the Textbook would have a rhetorical organization, which we denominate “colony-in-loops”.
Introduction
The characterisation of discourse genres and their variations across disciplines is a relatively recent concern in linguistics. Empirically-based and in-depth descriptions of academic genres across a variety of disciplines have thus become an interesting arena for contemporary research, supported by technological advances such as computers and software tools. Large numbers of textbooks circulate in paper and electronic formats in undergraduate university settings. Textbooks are also employed as fundamental means of constructing specialised knowledge in a variety of disciplines. However, little information and research can be found regarding Textbook genre organisation from theoretical or applied perspectives. According to the empirical data collected in the PUCV-2003 Corpus of Spanish and the PUCV-2006 Corpus of Spanish, the Textbook is one of the genres evidencing a major impact on certain
areas of technical-professional education, as well as on undergraduate university programmes (Parodi 2004, 2005a, 2008a and b; Parodi & Gramajo 2003). Texts belonging to this genre constitute an important part of the total amount of both corpora (33% and 26%, respectively). These facts not only indicate the importance of the Textbook as a means of accessing specialised knowledge but also the way in which the Textbook’s predominant communicative purpose is exercised across texts. Most textbooks seek to offer a first approach to new knowledge within an area of specialisation as part of their communicative purpose. Moreover, their rhetorical structure is supposed to help readers to create a mental representation of specific disciplinary knowledge. Study of the rhetorical organisation of discourse genres has been restricted to only a few genres and has particularly concentrated on Research Articles. Starting from the seminal work of Swales (1981), this genre has been profoundly explored in several languages, with a special emphasis on English. It is highly probable that this concentration on the genre, due to its fundamental function as a means of scientific communication, may have eclipsed investigation of the rhetorical organisation of other genres, although such research is not entirely absent (e.g., Dudley-Evans 1986; Bazerman 1988; Bhatia 1993, 1997, 2004; Bunton 2002; Barbara & Scott 1999; Biber, Connor & Upton 2007; Moss & Chamorro 2008). Comparatively, research on Spanish is scarcer. Many research works can also be identified concerning Research Articles (Ciapuscio 1996; Ciapuscio & Otañi 2002; Acosta 2006), but few in-depth studies have been conducted on other genres.
Some unique research works are, for instance, those of Núñez (2004), Núñez, Muñoz and Mihovilovic (2006), and Espejo (2006), who approach the study of the Report; those of Cubo de Severino (2000, 2002), who studied textbooks, and those of Bolívar (1999, 2000), who analysed research abstracts for scientific meetings. This scenario reveals an interesting niche for study of the Spanish language. In an attempt to fill this gap, this chapter describes the rhetorical organisation of the Textbook, based in part on the PUCV-2006 Corpus of Academic Spanish. This corpus includes 126 textbooks collected from four disciplines: Social Work, Psychology, Construction Engineering, and Industrial Chemistry. More specifically, we identify and describe the communicative purposes of each of the moves and steps, and provide examples from textbooks from the four disciplines. A new macro-level of analysis is introduced and justified where this becomes necessary for further in-depth description of an extensive discourse text unit, as is normal when this genre is involved. We have named it the macro-move. By the end of the chapter, we answer the question that provides the title for this section of the book, explaining why the Textbook would have a rhetorical organisation, which we call a colony-in-loops.
1.
Textbooks as a discourse genre
The term textbook turns out to be polysemic because it applies at the same time to undergraduate university academic textbooks, technical procedure manuals such as instruction booklets, as well as primary and secondary school textbooks. In addition, sub-classifications into diverse sub-types are possible for some of these genres. Thus, the idea of a genre system (Bazerman 1994; Martin & Rose 2008; Tardy 2003), as well as a colony of genres (Bhatia 2004) or a macro-genre (Martin 1992) may be applied here. Nevertheless, in this chapter we do not inquire into the diverse meanings or possible sub-classifications of the textbook discourse genre. We begin from a theoretical concept of genre as a multi-dimensional construct with a sociocognitive emphasis (see Chapter 2) and from empirical corpus-based findings (several chapters in this volume), partly guided by the communicative purpose the Textbook fulfils, the university training domain, participants involved, the kind of mono- or multi-modality applied, and the predominant discourse organisation mode. As already mentioned, there is not much research on the rhetorical organisation of textbooks. Although there is some research in English, little attention has been paid to university textbooks in Spanish. When studies are found, these are clearly focused on primary and secondary school textbooks (González 2007; King 2007). This genre is somewhat different from that of undergraduate university textbooks. One seminal study in Spanish was conducted in Argentina by Liliana Cubo de Severino (2002, 2005b). Her findings established a preliminary starting point for the description of some rhetorical moves, as well as an approximation of possible reading comprehension difficulties involved in its processing. 
Unfortunately, in our opinion, Cubo de Severino’s findings come from a reduced sample of textbooks where diverse disciplines and subdisciplines (linguistic, sociolinguistic, psycholinguistic, economic and human geography) intersect randomly, with a certain tendency toward homogenisation. Partly for this reason, her results cannot be largely projected or generalised.
1.1
Textbook rhetorical organisation: The rhetorical move approach
As is well known, the study of genres in terms of rhetorical moves was originally developed by Swales (1981, 1990, and 2004) to functionally describe a part or section of Research Articles. This approach, which seeks to operationalise a text into particular segments and identify the communicative purposes of these segments, originated from the educational objective of supporting the teaching of academic writing and reading for non-native speakers of English. The idea of
clearly describing and explaining the rhetorical structure of a particular genre and of identifying each associated purpose is a contribution that can assist beginners and novices who do not belong to a specific discourse community. The move analysis of a genre aims to determine the communicative purposes of a text by categorising diverse text units according to the particular communicative purpose of each unit. Each one of the moves into which a text is segmented constitutes a section, revealing a specific communicative function, but this is linked to and contributes to the general communicative objective of the whole genre. The unique organisation of the moves of a specific genre is what provides its identity and distinguishes it from other genres. It is this underlying organisation, embedded in the textual surface, which the researcher must make visible by identifying the various rhetorical steps and moves. Figure 1 captures the classical idea of progressive integration of specific communicative objectives and the steps embedded in moves, which subsequently give form to the genre. As can be seen in Figure 1, the number of moves and steps of a genre or genre section is not governed by any fixed rules. This is because there is not necessarily a relationship between the rhetorical and functional organisations of a genre’s formal structure. This issue relates to the level of abstraction pursued in the determination of each communicative purpose. In other words, if a researcher seeks an extremely detailed analysis, it is feasible that he/she may atomise each text proposition into a step or move. However, if an analysis is established using
Figure 1. Diagram of the hierarchical and variable rhetorical structure of a genre
a larger degree of abstraction, the number of functions and macro-functions may diminish significantly. This may result in variable descriptions of a genre’s organisation, with varying degrees of specificity. On the other hand, the nonexistence of exact rules for the application of this approach implies that each researcher does not necessarily proceed by means of a previous set of clearly determined phases. Consequently, moves in a genre could be quite variable in terms of their internal organisation, length and connections with other moves. There is an additional characteristic of moves and steps that should be noted, since this contributes to the difficulties involved in this kind of analysis: some moves and steps in a genre may be obligatory, while others may be optional. Swales (1981, 1990, 2004), Bhatia (1993, 2004), and Kwan (2006), as well as Biber, Connor and Upton (2007), propose useful and general orientations as to how to perform a functional identification of moves from a genre analysis perspective. In all of these analyses, keys and procedures are offered in greater or lesser detail in order to carry out a study of the communicative purposes of a text and its segmentation into minor text units. It is important to point out that none of these authors is very clear when it comes to offering a very detailed process, since it is not easy to state the linguistic limits of a discourse unit or its communicative purpose a priori. Therefore, it is evident that this is a question that each researcher must face based on his/her experience with the genre under scrutiny and from what is called a “bottom-up” or a “corpus-driven” approach (Tognini-Bonelli 2001; Thompson & Hunston 2006; Biber et al. 2007). Likewise, a fundamental step in this methodological approach is the triangulation of the data by means of human judgments to evaluate the discourse components identified and the corresponding classification of those text segments as discourse moves in a genre.
This process must occur during the development process for the first descriptions as well as in the final validation of the move and step codification table and the corresponding communicative purposes. This is also known as inter-rater reliability by expert judges and turns out to be a support for the contrast and validation of the researcher’s judgment (Hatch & Lazaraton 1991).
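To make the idea of contrasting several judges' codings concrete, the sketch below computes simple percentage agreement over a set of text segments. It is only an illustration under stated assumptions: percentage agreement is taken here as the measure, averaged over all pairs of raters, and the rater names, move labels and the function pairwise_agreement are hypothetical rather than taken from the study.

```python
from itertools import combinations


def pairwise_agreement(codings):
    """Mean percentage of segments on which each pair of raters agrees.

    `codings` maps a rater name to the list of labels that rater assigned,
    one label per text segment, in the same segment order for every rater.
    """
    raters = list(codings)
    if len(raters) < 2:
        raise ValueError("at least two raters are needed")
    n_segments = len(codings[raters[0]])
    if n_segments == 0 or any(len(codings[r]) != n_segments for r in raters):
        raise ValueError("every rater must code the same non-empty set of segments")

    scores = []
    for first, second in combinations(raters, 2):
        matches = sum(a == b for a, b in zip(codings[first], codings[second]))
        scores.append(matches / n_segments)
    return 100 * sum(scores) / len(scores)


# Hypothetical codings of eight segments by three raters (invented labels).
EXAMPLE_CODINGS = {
    "rater_1": ["M1", "M1", "M2", "M2", "M3", "M3", "M1", "M2"],
    "rater_2": ["M1", "M1", "M2", "M3", "M3", "M3", "M1", "M2"],
    "rater_3": ["M1", "M1", "M2", "M2", "M3", "M3", "M1", "M2"],
}

if __name__ == "__main__":
    print(f"{pairwise_agreement(EXAMPLE_CODINGS):.2f}% agreement")
```

Percentage agreement does not correct for agreement expected by chance; chance-corrected coefficients exist, but they are outside what this sketch assumes.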
2.
The research
As was mentioned, this chapter aims to provide a functional description of the rhetorical moves and steps constituting the Textbook genre. At the same time, it focuses on the analysis and comparison of a corpus belonging to four university undergraduate programmes (Psychology, Social Work, Industrial Chemistry and Construction Engineering), divided into two scientific domains (Social Sciences and Humanities (SS&H) and Basic Sciences and Engineering (BS&E)). We are
also interested in offering examples that illustrate the focus of analysis from all the scientific disciplines involved. This is due to our interest in discovering the mechanisms that allow access to specialised disciplinary knowledge and allow novices to become full members of a discourse community.
2.1
Corpus description
The Textbook is the second most predominant genre in the PUCV-2006 Academic Corpus (126 texts: 26% of the total amount). It is only preceded by the Disciplinary Text (270 texts: 55% of the total number of texts). A relevant finding is the fact that the Textbook is the only one of the nine genres identified in the PUCV-2006 Academic Corpus that is found across the four disciplines under study. Numerical distribution of textbooks per university programme is detailed in Table 1. It is worth emphasising that this constitution of the corpus is not intentional in any instance. The collection process followed strict procedures and the objective was always to access the realm of texts being delivered as obligatory and optional reading material to the students of the four university programmes during the total number of years included in each curriculum. Therefore, the nature of the corpus is highly ecological and representative. All of this material is available electronically at our El Grial website, where the texts have been processed morphosyntactically in plain format (www.elgrial.cl). Further details regarding development and possible consultation using the El Grial interface can be found in Parodi (2006a). It is remarkable to note the heterogeneous distribution of the Textbook within the corpus. In terms of the number of texts, Industrial Chemistry and Construction Engineering together have almost double the presence of Psychology and Social Work (80 texts against 46); the number of texts therefore varies considerably among disciplines. In terms of the number of words, this corpus of textbooks represents 41.5% of the total PUCV-2006 Corpus. These figures show that we are facing texts of substantial length, since the average length per text is close to 200,000 words (about 186,000 on the totals in Table 1).

Table 1. Numerical constitution of the corpus

University Programme             Number of Texts    Number of Words
Psychology (PSY)                 31                 4,925,931
Social Work (SW)                 15                 2,465,747
Industrial Chemistry (IC)        31                 9,161,146
Construction Engineering (CE)    49                 6,936,212
Subtotal SS&H (PSY + SW)         46                 7,391,678
Subtotal BS&E (IC + CE)          80                 16,097,358
Total                            126                23,489,036

This is
another fact to consider in the analysis of the rhetorical organisation of this genre, since this will lead to the identification of more specific communicative purposes of the text units.
2.2
Method
2.2.1 Some preliminary precisions
The method used in this study lies between the so-called descending (deductive) and ascending (inductive) approaches. This is a complementary methodology and not an exclusive one. This complementary methodological option, following Biber et al. (2007) and Baker (2006), does not correspond to either of the two alternatives proposed by these authors, i.e., the “top-down” and “bottom-up” approaches. In the former, research originates from predetermined categories that were previously familiar to the researcher; in the latter, categories emerge exclusively from the data. While these previous distinctions are highly relevant, it is important to state that we do not agree with the separation that Biber et al. (2007) establish between what they call “discourse analysis” and the “corpus-based approach”. This is because the distinction opposes the two approaches in an excessively reductive and radical way by circumscribing, on the one hand, the so-called “discourse analysis” to “top-down” methodologies – and, on the other, the “corpus-based approach” to “bottom-up” methodologies. Basically, our objections focus on the strict division Biber et al. (2007) make between the “top-down” and the “bottom-up” approaches and the assimilation of a “corpus-based approach” to stronger and more scientifically validated studies (as compared to those based on the so-called “discourse analysis” approach). Likewise, we do not agree with Baker (2006) regarding the four supposed advantages of a “corpus-based” approach as opposed to a “top-down” analysis. It does not seem possible to transfer a study from the discourse analysis arena to one of an exclusively inductive investigation, with small, random and exemplary corpora, impressionistic and biased analysis, and scarce data triangulation.
On the contrary, we consider it unfair to emphasise that only studies from a “corpus-based” approach perspective and with “bottom-up” methodologies impose larger restrictions to our cognitive prejudices, display greater focus on the data than its interpretation, are based on digital texts and large corpora, and thus reduce manipulation of the chosen texts. We defend an analysis based on robust corpora of complete, non-mutilated texts that have been ecologically collected with the highest possible degree of representativeness. We are thus in favour of precise methods that will allow for the replication of procedures and proposed categories. In this integrated and
complementary approach, the researcher is not deprived of his/her previous knowledge at the moment of the analysis. The text under analysis will also guide the segmentation process and will complementarily cause the underlying communicative purposes to emerge from the linguistic structure. These communicative purposes are, in fact, in the researcher’s mind and are part of his/her previous knowledge. Based on this knowledge, he/she is able to identify these in the text. On the other hand, the overall configuration of these purposes is a question that does not previously exist in the researcher’s mind; thus, it is this organisation system coming from the text that the analyst must codify and explain. The balance produced between the text’s data and the information that is part of the researcher’s previous knowledge constitutes the final outcome.
2.2.2 Methodological steps of the analysis
The 126 textbooks under study were available in paper and electronic formats. These were analysed using both versions, depending in part on the step of the analysis being executed. Initially, the sections of each text representing particular moves were identified and marked on paper; subsequently, they were identified and separated in the digital files, thus creating independent electronic documents per move. As previously mentioned, the researcher's previous knowledge is an important issue for this analysis. Therefore, his/her experience and awareness of underlying communicative functions play a vital role in the segmentation process for texts and the identification of functional categories. In some cases, these functional segments may correspond to clearly identifiable structural text units – as is the case, for example, with the Table of Contents in the Textbook. In other cases, they may correspond to parts of the text that are less linguistically determined and for which the communicative purpose is more difficult to identify; for instance, The Expression of Gratitude as part of the structural unit Preface. Thus, the functions that a semantic/structural unit may perform, in principle, rest on the analyst’s expert judgment. As previously stated, this first codification must be validated or adjusted according to analysis of the peer evaluation, the linguistic analysis and the researcher’s own revisions. Another relevant step in the analytical process is awareness of decisions regarding the degree of abstraction that was applied in the segmentation of the functional units. As stated earlier, the length of the texts that form the Textbook genre is considerable (an average of 200,000 words). This imposes a level of abstraction higher than that of research in short genres such as the Research Article or with text sections of this genre, which are traditionally brief (an average of 1,000 to 4,000 words).
It has been hard to find a previous study with a complete record of large texts such as textbooks in the available literature. Therefore, awareness of
a higher degree of abstraction applied to each text under study becomes a complex issue that researchers must face carefully. Table 2 shows a summary of the twelve methodological steps, grouped into four stages, applied in order to describe the rhetorical moves of the PUCV-2006 Academic Corpus of Textbooks. In this table, special effort was made to include as many details as possible in order to provide a complete overview of the procedures executed. These twelve steps have guided our analysis and oriented our research. They may help to guide those who are interested in pursuing studies of this kind. We are aware that some of these steps may entail a certain degree of subjectivity and that the researcher must make decisions as each stage and step is developed. Not only the research team, but also the raters of the triangulation process are fundamental to improving the different stages of analysis and the identification of categories and definitions.
2.2.3 Inter-rater reliability

With respect to reliability of the criteria table, three expert reviewers (Hatch & Lazaraton 1991) performed a preliminary analysis. The objective pursued was to obtain a high percentage of agreement in order to ensure consistency of the proposal developed by the researchers. This triangulation process reinforces the empirical data and grants more objectivity to the results. In this research, a high degree of agreement (over 87%) was reached among the reviewers who examined the codification guideline and the corresponding rubrics. This means that we made sure that each of the three raters equally understood the definitions of each macro-move, move, and step, and the purposes identified for each. The most basic and well-known system was used for the statistical calculation of these reliability procedures: a table for agreement on percentages.

3. Results and discussion
In the first part of this section, we detail the results stemming from identification of the Textbook’s rhetorical organisation and then describe the macro-moves, moves and steps identified. A definition of the communicative purpose is given for each of these. In order to incorporate information from the four disciplines involved in this study, some of the steps are illustrated with passages taken from the actual texts of the corpus, chosen from different disciplines. In addition, as a complementary analysis, the approximate structural section where moves and steps occur is identified.
Table 2. Stages and steps to conduct a move analysis in PUCV-2006 Corpus

Stages and steps to conduct a move analysis | Description
Stage 1. Analytical framework configuration | A preliminary analysis is performed from a micro-corpus for the construction of a first criteria table.
Step 1.1. Identifying text units | Based on an initial analytical study of the micro-corpus, a set of text units is identified.
Step 1.2. Determining the observation focus | The degree of abstraction is established in order to observe the communicative purposes forming the genre.
Step 1.3. First equalisation | The relation between the focus of observation and the identified text units is revised and adjusted.
Step 1.4. Assigning communicative purposes | Each identified discourse unit is associated with a communicative purpose.
Step 1.5. Label production | A label is assigned to each identified discourse unit, according to the communicative purpose it eventually fulfills.
Step 1.6. Identifying the general communicative purpose | The genre’s general communicative purpose is determined according to the set of previously identified communicative purposes.
Step 1.7. Designing the first criteria table | Based on previously developed steps, a first classification in terms of macro-moves, moves and steps is designed.
Stage 2. Extension and adjustments | The criteria table is applied to the whole corpus and eventual modifications are made.
Step 2.1. Applying the criteria table | The criteria table is applied to the total amount of texts in the corpus.
Step 2.2. Second equalisation | Based on the application to the total corpus, the necessary modifications are made to the criteria table, which implies including or excluding some macro-moves, moves and/or steps.
Stage 3. Reliability of criteria table | A triangulation process is carried out in order to establish the instrument’s reliability.
Step 3.1. Determining the instrument’s reliability | In order to determine the percentage of agreements among raters, three expert reviewers applied the criteria table following the same procedure.
Step 3.2. Third equalisation | After the triangulation process and over 80% agreement, several adjustments are made to the criteria table in order to settle emerging discrepancies noted after the reviewers had made their analysis.
Stage 4. Establishing the occurrence of functional categories | The final criteria table is applied to the corpus in order to quantify the occurrence of moves and steps.
Step 4.1. Quantification | The occurrence of each move and step in each text in the corpus is quantified.
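The percentage-agreement calculation behind Step 3.1 can be illustrated with a minimal sketch. It assumes that each rater assigns one move label to each of the same text segments; the function name `percent_agreement` and the eight codings below are invented for illustration and are not data from the PUCV-2006 Corpus. The chapter does not say whether agreement was computed pairwise or as unanimous agreement, so the sketch reports both.

```python
from itertools import combinations


def percent_agreement(codings):
    """Return (mean pairwise agreement, unanimous agreement) as percentages.

    codings: list of equal-length label sequences, one sequence per rater.
    """
    n_units = len(codings[0])
    if n_units == 0 or any(len(c) != n_units for c in codings):
        raise ValueError("all raters must code the same non-empty set of units")

    # Agreement for every pair of raters, then averaged over the pairs.
    pair_scores = []
    for a, b in combinations(codings, 2):
        matches = sum(x == y for x, y in zip(a, b))
        pair_scores.append(100.0 * matches / n_units)
    pairwise = sum(pair_scores) / len(pair_scores)

    # Share of units on which all raters chose the same label.
    all_same = sum(len(set(labels)) == 1 for labels in zip(*codings))
    unanimous = 100.0 * all_same / n_units
    return pairwise, unanimous


# Hypothetical codings of eight segments by three raters (invented labels).
rater_1 = ["CON", "CO", "RO", "PRE", "CD", "PRA", "REC", "SA"]
rater_2 = ["CON", "CO", "RO", "PRE", "CD", "PRA", "REC", "SA"]
rater_3 = ["CON", "CO", "PRE", "PRE", "CD", "PRA", "REC", "SA"]

pairwise, unanimous = percent_agreement([rater_1, rater_2, rater_3])
print(f"pairwise: {pairwise:.1f}%  unanimous: {unanimous:.1f}%")
```

Simple percentage agreement does not correct for chance agreement; it is used here only because it is the procedure the chapter reports.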
3.1
Description of the overall Textbook rhetorical organisation
One important issue regarding the results obtained in this research relates to the rhetorical organisational pattern presented in Figure 1 of this chapter. Figure 1 showed a structure of progressively embedded communicative purposes, operationalised from the general purpose of the genre down through moves and steps, that is, from a general to a specific distribution. An innovative element of this initial model is the creation of the concept and coining of the term macro-move. This macro-level helps to better reveal: (a) the length of texts making up this genre, (b) the higher level of abstraction implicit in this analysis, and (c) the recursive functional organisation of certain obligatory sections. As is well-known, a move has been defined as a discourse unit performing a specific function in a text. Thus, each move has a particular communicative purpose and contributes to the overall communicative purpose of the genre. By using the term macro-move, we seek to define a discourse unit of higher rank than that of a move. This implies a more abstract view of the communicative purpose that the macro-move serves. These macro-purposes are inclusive in nature with respect to more limited purposes. This means that the analysis of the Textbook’s functional organisation reveals a complex multilevel distribution that requires specification of a macro-purpose including a set of more specific moves and, in turn, more detailed steps. This major form of organisation also enables the differentiation of nuclear components and satellite components in the genre under study. The detection and inclusion of this more abstract level and higher rank produces a more visible overall form of rhetorical organisation, not only for the analysis but also for possible educational applications. Figure 2 shows the segmentation process, presented according to rank as described above.
Figure 2. Segmentation process of the Textbook functional components
Figure 3. Diagram of possible hierarchic rhetorical structure of genre with macro-moves
Starting from this procedure, our analysis helps to identify three macro-moves in the Textbook – each, in turn, organised within a total of ten moves and as many more specific steps. The macro-moves involve wider and abstract categories that specify or include more enclosed and specialised distinctions. Figure 3 offers an example of the new emerging model revealed from analysis of the corpus under study. This new model is a re-formulation of Figure 1, which corresponds to what we might call the “classical model”. Analysis of the data has led to the identification of three fundamental rhetorical macro-moves in the Textbook genre: Preamble, Conceptualisation & Exercising, and Corollary. Each of these operationalises in moves and, more specifically, in steps (as may be understood from Figure 3). Some of these moves and/or steps constitute nuclear categories, while others clearly serve a satellite communicative function. This means that some obligatory categories are distinguished that are highly constitutive of this genre, while others are clearly optional. This reveals that the Textbook genre is constituted by a group of macro-moves, moves and steps that define its character, but that empirical research has demonstrated the existence of certain categories of minor rank (moves and steps) that are not always present in the corpus of the texts under study. Likewise, a recursively reiterative macro-move that forms the central nucleus of this genre has emerged, i.e., the Textbook’s organisational structure shows that Conceptualisation & Exercising is displayed progressively throughout each text, revealing a very prototypical characteristic of this genre. This macro-move has no
predetermined numerical occurrence and will be produced as many times as the contents of each text require. Figure 4 identifies the three macro-moves, together with their respective constitutive moves.
Figure 4. Macro-moves and moves detected in the Textbook
Preamble: Contextualisation (CON), Contents Organisation (CO), Resources Organisation (RO), Presentation (PRE)
Conceptualisation & Exercising: Concept Definitions (CD), Practice (PRA), Recapitulation (REC)
Corollary: Solutions and Answers (SA), Specifications (SPS), Guidelines (GUID)
Organisation of the macro-moves in the Textbook varies. While the first is displayed in four moves, the following do this in three. This evidences the non-existence of fixed rules or standard procedures for the organisation of a genre. Table 3 summarises a deeper analysis without identifying the specific steps of each move. This table shows the overall rhetorical organisation of this genre, which gives definitions and describes macro-purposes and moves and relates them to an approximate textual structure. This table shows the three macro-moves and ten more specific moves with their respective communicative purposes. Here, Macro-move 1 (Preamble) includes four moves, Macro-move 2 (Conceptualisation & Exercising) includes three moves and Macro-move 3 (Corollary) is described in three more moves. An interesting distribution of the macro-moves in emphasising a form of genre organisation is detected. It is straightforwardly focused on pedagogical purposes, with a revealing display of didactic resources. The three macro-moves are evidence of this distinctive educational concern and prove that the writer/author is quite certain of the objective being pursued. The opening part of the book offers a Preamble, where essential keys are given for discourse comprehension and for precise reader orientation. Special concern is shown by the writer/author for the audience to whom the objectives, procedures, indexes and other didactic resources are displayed – a concern for helping the audience to comprehend and learn from the text.
Table 3. Overall rhetorical organisation of the Textbook genre

Move Name | Communicative Purpose | Structure
Macro-move 1. Preamble (PREA) | To present the book initially and provide useful information to help read the work |
Move 1.1. Contextualisation (CON) | To relate parts of the text, to comment on its contents, and to include acknowledgments. | Prologue/Preface
Move 1.2. Contents Organisation (CO) | To show the book’s contents and its thematic organisation. | Thematic Index/Contents
Move 1.3. Resources Organisation (RO) | To support comprehension of the book’s contents. | Index or Table of Symbols and Abbreviations
Move 1.4. Presentation (PRE) | To comment on references, context and the objective of the text to the reader. | Introduction
Macro-move 2. Conceptualisation & Exercising (C&E) | To provide concepts and definitions, with problems, examples and solutions |
Move 2.1. Concept Definitions (CD) | To describe and explain processes, objects or others. | Nucleus of a chapter
Move 2.2. Practice (PRA) | To present practical tasks based on the contents reviewed in the section. | Part of a chapter
Move 2.3. Recapitulation (REC) | To list global ideas. | End part of a chapter
Macro-move 3. Corollary (COR) | To complement and to deepen the central contents |
Move 3.1. Solutions and Answers (SA) | To point out solutions to the exercises and problems, and offer answers to the problems presented in each chapter. | Annexes/Appendices
Move 3.2. Specifications (SPS) | To support the comprehension of terms, units and abbreviations. | Annexes/Appendices/Glossary
Move 3.3. Guidelines (GUID) | To offer bibliographical sources and support the search for topics through a guideline in alphabetical order. | Analytical Index/Bibliography
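The hierarchy summarised in Table 3, together with the quantification described in Step 4.1 of Table 2, can be sketched as a small data structure. The macro-move and move abbreviations are those of Table 3; the names `TEXTBOOK_GENRE`, `MOVE_TO_MACRO` and `quantify`, as well as the two coded texts, are hypothetical and serve only as an illustration of how occurrences might be counted.

```python
from collections import Counter

# Macro-moves and their moves, using the abbreviations of Table 3.
TEXTBOOK_GENRE = {
    "PREA": ["CON", "CO", "RO", "PRE"],   # Preamble
    "C&E": ["CD", "PRA", "REC"],          # Conceptualisation & Exercising
    "COR": ["SA", "SPS", "GUID"],         # Corollary
}

# Reverse lookup: which macro-move does a given move belong to?
MOVE_TO_MACRO = {
    move: macro for macro, moves in TEXTBOOK_GENRE.items() for move in moves
}


def quantify(coded_texts):
    """Map each move to (number of texts containing it, total occurrences).

    coded_texts: dict of text id -> list of move labels in order of occurrence.
    """
    presence = Counter()
    total = Counter()
    for text_id, moves in coded_texts.items():
        unknown = set(moves) - set(MOVE_TO_MACRO)
        if unknown:
            raise ValueError(f"{text_id}: unknown move labels {sorted(unknown)}")
        total.update(moves)          # every occurrence counts
        presence.update(set(moves))  # each text counts once per move
    return {move: (presence[move], total[move]) for move in MOVE_TO_MACRO}


# Two invented coded texts (not corpus data).
coded = {
    "text_a": ["CON", "CO", "PRE", "CD", "PRA", "REC", "CD", "PRA", "SA", "GUID"],
    "text_b": ["CO", "PRE", "CD", "PRA", "CD", "PRA", "REC", "GUID"],
}

for move, (n_texts, n_total) in quantify(coded).items():
    print(f"{MOVE_TO_MACRO[move]:>4}  {move:<4}  texts={n_texts}  occurrences={n_total}")
```

A move found in every text would be a candidate nuclear category under this count, while one with low presence would look more like a satellite category; where to draw that line is left open here.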
The second macro-move, called Conceptualisation & Exercising, constitutes the heart of the Textbook and reveals its prototypic function, which is “to provide concepts and definitions, with examples, problems and solutions”. This macromove thus gives form to the genre’s nucleus and fulfils its most relevant macro-purpose. Identification of this macro-move reveals the importance of having included a more global and hierarchical level in the analysis. It also allows for emergence of a macro-category that would possibly not have been detected in a unique, more detailed analysis. By means of three specific moves, the second
Figure 5. Diagram representing macro-move organisation in the Textbook genre
macro-move executes the genre fundamentals and condenses what is substantial to this discourse genre. As we will see hereafter, it is possible to argue that the two first moves of this macro-move can be understood as important categories: Move 2.1 Concept Definitions (CD) and Move 2.2 Practice (PRA). Figure 5 captures the idea that nuclear Conceptualisation & Exercising is repeated cyclically throughout textbooks and provides unity and coherence to this genre. These macro-moves are embedded recursively, developing thematic nodes that sometimes make up self-contained units. This organisation approaches the idea of a “discourse colony”, as proposed by Hoey (1986). In this form of discourse organisation, each discourse unit may become very independent and not directly linked to the others in terms of the construction of meaning. It would even be possible, in some cases, to reorder them randomly without affecting the global coherence or the text’s overall thread (a good example could be a Regulation, where a list of rules is enumerated). However, this organisation as a constellation of isolated and independent nuclei is not very common. In most genres, each macromove or move is indeed related to the previous and subsequent ones, and different kinds of connections are established from one thematic nucleus to the next. The Textbook genre emerges as a well-organised and hierarchical sequence of macro-moves where the nucleus may exhibit a cyclical form of organisation. A clear concern for students or learners, who find a source of specialised knowledge that may enable them to begin their academic and professional education in the texts belonging to this genre, is noted. The Textbook’s organisation reveals that it is made up of a group of rhetorical procedures that aim to gradually introduce the novice to specific contents and particular methodologies. At the same time,
these discourse procedures are intended to develop in students or learners particular ways of reasoning that are very much associated with the discipline involved, which the student should gradually acquire in order to be part of the discourse community. This evidences that the writers of textbooks are aware that the audience will have to adopt this way of reasoning, in view of which they will adequately grade the contents under study as long as they aim to create what Kantor, Anderson and Armbruster (1983) call “considerate texts”. “Considerate texts” are texts articulated in a special way in order to be more accessible to readers. Therefore, if textbooks are to be “considerate”, the author/writer of a textbook must cooperate with his/her readers. Communication may operate more effectively by following this rule. Kantor et al. (1983) propose four principles (partly inspired by Grice’s cooperation principles) that they argue are inferred from rhetoric and are supported by research in text comprehension. These principles are structure, coherence, unity, and audience appropriateness. Thus, a “considerate text” is clear and direct in its message and allows the reader to elaborate information efficiently and with minimal cognitive effort.
3.2
Macro-move 1: Preamble
Table 4 illustrates the organisation of Macro-move 1. Some examples are also given. One of the interesting findings emerging from this analysis is the non-existence of a canonical order for the moves included in Macro-move 1. The most prototypical form of organisation is to find the Contextualisation at the beginning, but Contents Organisation, Resources Organisation, and Presentation may also appear randomly. The sequence proposed in Table 4 is detected to be the most common, but the inclusion of Contextualisation before 1.2 and 1.3 is also offered as an optional alternative. Another important finding is that this tendency towards non-canonical and regular organisation also applies to the presentation order of most of the steps in each move belonging to Macro-move 1. This is the case for Move 1.1 and Move 1.4. This characteristic of Macro-move 1 partly coincides with what happens in Macro-move 3, but it is not detected in Macro-move 2 (the Textbook’s nucleus). Some of the moves included in the Textbook genre may undergo a much more detailed analysis, as is the case with the Presentation. However, as has been stated previously, we have opted for an analysis on a more abstract level, which may immediately account for the complete genre organisation, starting from the analysis of each text (the length of which averages 200,000 words). In fact, this higher degree of analytical abstraction implies a coarser level of granularity in the
Table 4. Detailed rhetoric organisation of Macro-move 1. Preamble

Name of Move and Steps | Communicative Purpose | Structure
Move 1.1. Contextualisation (CON) | To relate parts of the text, to comment on its contents and include acknowledgments. | Prologue/Preface
Step 1.1.1. Situating the Reader (SR) | To explain the context for the text’s production. |
Step 1.1.2. Expressing Acknowledgments (EA) | To express thanks to editors, collaborators, students and others. |
Move 1.2. Contents Organisation (CO) | To show the book’s contents and its thematic organisation. | Content Index
Step 1.2.1. Presenting the Contents (PC) | To offer a list of sections and/or parts of the contents of the book by means of a numbered list. |
Move 1.3. Resources Organisation (RO) | To support comprehension of the book’s contents. | Index or Table of Symbols and Abbreviations
Step 1.3.1. Supporting Comprehension (SC) | To provide a list of symbols used in the text that support comprehension. |
Move 1.4. Presentation (PRE) | To describe antecedents, context and objective of the text for the reader. | Introduction
Step 1.4.1. Declaring Textbook Purpose and Audience (DTPA) | To describe the objective and audience. |
Step 1.4.2. Describing the Thematic Nucleus (DTN) | To present the specific thematic nucleus to be discussed. |
Step 1.4.3. Giving Guidelines (GG) | To describe textbook phases, steps or stages. |
analysis but makes it possible to obtain a level of global completeness that allows for the study of each text as a complete unit. Thus, in keeping with this approach, we do not focus only on one section of a genre, as traditionally has been the case in previous research (for example, when only the Introduction is studied in Research Articles or a complete genre is the focus of analysis but with a very limited number of words – such as with fundraising letters). In order to exemplify one step of this macro-move, the following passage of a Textbook on Industrial Chemistry is presented, where Step 1.4.1 from Move 1.4 Presentation is displayed.
Step 1.4.1. Declaring Textbook purpose and audience

“Esta breve y concisa introducción a la química de coordinación, a la del estado sólido y a la inorgánica descriptiva de los elementos representativos se ha concebido para el estudiante que ya ha realizado un curso de iniciación. Se ha realizado con material obtenido por profesores y estudiantes, en clases, seminarios, sesiones de repaso y discusiones dentro y fuera de las horas de trabajo, así como con los libros de texto normalmente utilizados en estos cursos.” (CA-QUI-ma420)

“This brief and concise introduction to the chemistry of coordination, of the solid state and inorganic description of the representative elements, has been conceived for the student who has already passed an introductory course. It has been carried out with material obtained from teachers and students, in class, seminars, revision sessions and discussions within and out of working hours, as well as with the textbooks normally used in these courses.” (CA-QUI-ma420)
This example shows concern about the need to explicitly identify the kind of reader for whom the book is intended: a person who already has some experience with the subject matter. The sources of the material used for the text are also declared, with special emphasis placed on revealing the context in which some of these were issued. At the same time, it is clearly explained herein that the objective is to introduce a specific topic briefly and concisely. The passage is identified in the example with the classification code assigned within the PUCV-2006 Corpus. Each text can thus be consulted online, with its corresponding morphosyntactic labels, by means of the “El Grial” interface at www.elgrial.cl.

3.3
Macro-move 2: Conceptualisation & Exercising

Table 5 gives a detailed description of Macro-move 2, its moves and its steps. As in the preceding section, the communicative purposes and examples are also included. Macro-move 2 must be considered the nucleus of this genre. As mentioned before, a cyclical and spiral-directed form of organisation stands out as the distinctive feature of this genre’s core. The sequence operates regularly and is repeated as many times as the conceptual framework sustaining each text requires. At the same time, another interesting finding is the systematic order in which each of the moves of the second macro-move is displayed, revealing a very regular pattern in the articulation of each new Conceptualisation & Exercising macro-move. Thus, each constitutive move and each particular step are presented in a very carefully planned and hierarchical sequence, where the contents being developed are treated systematically in a perfect execution of the pattern displayed. It is evident that the writers/authors of most of these textbooks appear to be very conscious of the rhetorical organisation they aim to articulate in order to operationalise this genre into particular texts.
Table 5. Detailed rhetorical organisation of Macro-move 2: Conceptualisation & Exercising

Name of Move and Step | Communicative Purpose | Structure
Move 2.1. Concept Definition (CD) | To describe and explain processes, objects or others. | Nucleus of a chapter
Step 2.1.1. Linking Contents (LC) | To link new concepts or procedures with those of one or more preceding chapters. | Introduction to a chapter/section
Step 2.1.2. Presenting the Topic Nucleus (PTN) | To describe and define the object, concept or procedure under study, often accompanied by drawings, figures, tables or formulae. | Nuclei or units of a chapter/section, where a multimodal component is highlighted
Step 2.1.3. Specifying Components or Sections (SCS) | To subclassify or divide the concept or procedure under study into parts, with descriptions and definitions of types, parts or components. | Subunits of a chapter/section
Move 2.2. Practice (PRA) | To present practical tasks based on the contents reviewed in the section. | Part of a chapter
Step 2.2.1. Presenting an Exercise or Example (PE) | To present a problem, exercise or example, accompanied by one or more questions directly related to previous definitions and descriptions. | Exercise-Problem-Example
Step 2.2.2. Solving the Task (SC) | To solve the problem in a concise and direct manner or propose strategies for solving it. Formulae, mathematic equations and brief explicative or descriptive phrases are used. |
Step 2.2.3. Expanding Practice (EP) | To deliver more problems or examples (without the solution) | Supplementary Exercise/Problem
Move 2.3. Recapitulation (REC) | To list global ideas. | End part of a chapter
Step 2.3.1. Macrosemantisising the Contents (M) | To sum up or define nuclear concepts, objects or procedures presented in the chapter; normally, introduced by means of vignettes. | Summary
The cyclical disposition of this macro-move, along with the moves and steps throughout each text, accounts for the way in which the information is organised and reveals, on the one hand, a very recursive format and, on the other, a hierarchical method of organisation. Thus, moves 2.1, 2.2 and 2.3, with their respective steps, display the contents and guide the reader into practice with each new conceptual nucleus. This reveals the purpose of teaching new ideas to the student through these textual sequences and, subsequently, of exercising these
ideas by means of a set of problems and solutions. In fact, as Hyland (1999) states, textbooks are a repository of knowledge that opens paths to those initiating a discipline and allows them to construct preliminary access to this specialised knowledge. To exemplify part of this macro-move, three passages from textbooks from Construction Engineering, Psychology, and Social Work were chosen so that steps 2.1.2, 2.2.1 and 2.3.1 may be identified.

Step 2.1.2. Presenting the topic nucleus (Move 2.1. Concept definitions)

“Componentes del concreto 2.1 Cemento El cemento a emplear en pavimentos de concreto será normalmente el de tipo I, es decir, el de tipo común. En casos especiales en que los pavimentos están expuestos a acciones moderadas de sulfatos, o por requerimientos de tiempo de hidratación, se utilizarán los cementos II a V. En general, el cemento empleado deberá cumplir mínimamente con las Normas de Calidad vigentes de la S.C.T. Ref. 1.” (CA-IC-ma372)

“Concrete components 2.1. Cement The cement to be employed in concrete pavement will normally be type I, i.e., the common type. In special cases where the pavement is exposed to moderate sulphate action, or because of hydration time requirements, cements II to V will be used. Generally, the cement used must at least comply with the S.C.T. Quality Standards in force (Ref. 1).” (CA-IC-ma372)
The passage of this text from Construction Engineering opens with a subtitle announcing the upcoming definitions: Concrete Components. It is followed by a subnumeration clearly indicating a series or list, beginning with the name: Cement. Specifications are given there. In this example, one may note the information advancement and the link to specialised knowledge already presented in the text. The following example realises a fundamental communicative purpose in support of the aforementioned definitions and specifications.

Step 2.2.1. Presenting an exercise or example (Move 2.2. Practice)

“VI. PREGUNTAS Y EJERCICIOS
• ¿Qué hitos históricos marcaron el nacimiento y desarrollo de la Psicología Educacional?
• ¿Qué desafíos le esperan a esta disciplina en construcción?
• ¿Cuál es el rol que debe jugar un Psicólogo Educacional en la realidad actual?, ¿Qué habilidades nuevas debiera desarrollar?” (CA-PSI-ma37)

“VI. QUESTIONS AND EXERCISES
• What historical milestones marked the birth and development of Educational Psychology?
• What challenges await this discipline in construction?
• What role must an Educational Psychologist play in the present reality? What new skills should he/she develop?” (CA-PSI-ma37)
This example was taken from a Psychology textbook and shows how information that has formerly been studied is put into practice. By means of open-ended questions, it aims to strengthen the learning of conceptual nuclei and to support the reader in his/her comprehension and projection of the knowledge under study. It is relevant to highlight the differential character of the first question with regard to the last two questions. The first points to a reproduction of previously presented new knowledge, while the other two lead the reader to make evaluative and projecting inferences. These last two questions reveal the way in which knowledge construction is directed in disciplines of the social sciences and humanities: there is not necessarily a body of absolutely objective knowledge, nor is there intended to be one. Therefore, the reader is stimulated to seek ways and form his/her own opinion of the events under study.

Step 2.3.1. Macro-semantisising the contents (Move 2.3. Recapitulation)

“1.4. RESUMEN Hemos visto cómo el medir consistía en asignar números y que el conjunto de éstos era una variable. Los números asignados, en base a una normativa que es el objeto de la teoría de la medida, ofrecían diferente cantidad de información. Los niveles nominal y ordinal se conocen como niveles débiles de medida y los niveles de intervalo y razón como niveles fuertes de medida.” (CA-TS-ma231)

“1.4. SUMMARY We have seen how measuring consisted of assigning numbers and that the set of these numbers was a variable. The numbers assigned, based upon a set of rules that is the object of measurement theory, offered different amounts of information. The nominal and ordinal levels are known as weak levels of measurement and the interval and ratio levels as strong levels of measurement.” (CA-TS-ma231)
This last example is taken from a Social Work textbook. In the passage, it is clear that the writer/author is concerned with taking up the thread of exposition and summing up the previously treated contents. This rhetorical step illustrates the didactic attitude detected in this genre. The writer/author seeks to support the reader and therefore applies summarising strategies to the contents as a closing step for what had been studied. This is how the communicative purpose of teaching is carried out and how it functions as a guiding light in the learning process.
From these examples, we noted a gradual exercising of nuclear concepts of varying degrees of difficulty. Also, solving questions and problems, together with the application of summarising strategies at the closing of the section of the text, demonstrates how the process of disseminating knowledge operates within this genre and, more specifically, within each of the disciplines of the corpus. This “step-by-step” design of the progressive approach to scientific knowledge implies careful planning and ranking of disciplinary contents; at the same time, it shows a planned interaction between the writer and the reader, where the role of the novice audience is evidently identified. Thus, the writer/author assumes the role of a specialist in a discipline that guides the student through a methodical path to specialised knowledge by means of previously organised steps. This writer-reader communication is made sufficiently clear in this knowledge-disseminating genre by each text that reveals the mechanisms put into practice in order to fulfil its general communicative purpose.
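The cyclical organisation of the genre's nucleus described in this section can be sketched as a simple pattern check. The sketch assumes, for illustration only, that a text is coded as a sequence of macro-move labels and that all three macro-moves are present in order, with Conceptualisation & Exercising occurring one or more times; the single-letter encoding and the function names are hypothetical and are not part of the chapter's procedure.

```python
import re

# One symbol per macro-move so that a coded text becomes a short string.
SYMBOL = {"PREA": "P", "C&E": "E", "COR": "C"}

# Preamble once, Conceptualisation & Exercising one or more times, Corollary once.
CYCLICAL_PATTERN = re.compile(r"PE+C")


def follows_cyclical_pattern(macro_sequence):
    """True if the macro-move sequence is PREA, then repeated C&E, then COR."""
    encoded = "".join(SYMBOL[label] for label in macro_sequence)
    return CYCLICAL_PATTERN.fullmatch(encoded) is not None


def count_cycles(macro_sequence):
    """Number of Conceptualisation & Exercising cycles in the sequence."""
    return sum(label == "C&E" for label in macro_sequence)


# Invented example: a textbook with three conceptual cycles.
example = ["PREA", "C&E", "C&E", "C&E", "COR"]
print(follows_cyclical_pattern(example), count_cycles(example))
```

Because the number of cycles is left open by the pattern, the check accommodates the finding that this macro-move has no predetermined numerical occurrence.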
3.4
Macro-move 3: Corollary
The third and last macro-move is displayed in three moves: Solutions and Answers, Specifications, and Guidelines. Together, they operate in five steps that execute the Textbook’s closing segments. Table 6 presents the details of Macro-move 3. As stated previously, the organisational sequence of these moves may not always display a canonical order. There is a certain degree of variability in their order of occurrence between one text and another, although all of these are always a part of the same macro-move. For example, Move 3.1, Solutions and Answers, may appear subsequent to Move 3.2, Specifications, although the last one is always Move 3.3, Guidelines. This means that a certain degree of interchangeability between the first two moves is feasible, but this is not the case with the third one. The same occurs with certain steps within some of these three moves. For example, steps 3.2.1, Giving specifications, and 3.2.2, Defining terms, may be interchanged in terms of their order of appearance without this affecting, in the words of Kantor et al. (1983), the unity or the coherence of the text. These steps may also coincide with Hoey’s (1986) idea regarding a discourse colony, since this random organisation reveals autonomy between these, where the global meaning is not derived from the sequence in which they appear. The Corollary Macro-move stands out because it contributes to the contents with more exercises and solves the proposed problems (Move 3.1). Thus, the didactic function of this genre is fulfilled once again. Likewise, the communicative purpose underlying the last two moves, Specifications and Guidelines, is to
Table 6. Rhetorical organisation of Macro-move 3: Corollary
Name of Move and Step | Communicative Purpose | Structure
Move 3.1. Solutions and Answers (SA) | To point out solutions to the exercises and problems and give answers to the problems presented in each chapter. | Annexes/Appendices
Step 3.1.1. Resolving and Answering (RA) | To give solutions to exercises and answers to problems presented in each of the preceding chapters. | Annexes/Appendices
Move 3.2. Specifications (SPS) | To support comprehension of terms, units and abbreviations. | Annexes/Appendices/Glossary
Step 3.2.1. Giving Specifications (GS) | To give a set of tables where diverse technical information is recorded. | Annexes/Appendices
Step 3.2.2. Defining Terms (DT) | To support comprehension of technical terms, presented in alphabetical order and accompanied by a brief definition. | Glossary/Key Terms/Definitions
Move 3.3. Guidelines (GUID) | To offer bibliographical sources and support the search for topics through an alphabetically ordered guideline. | Analytical Index/Bibliography
Step 3.3.1. Declaring Sources (DS) | To give bibliographical references. | Bibliography/References
Step 3.3.2. Listing Subjects of Text in Alphabetical Order (LS) | To offer a list of the subjects of the book in alphabetical order, with indications as to their location in the text. | Analytical Index
introduce ways to build knowledge, supporting the new ideas with (among others) annexes, glossaries, tables of contents, analytical indexes, and key terms. In order to exemplify this macro-move, a passage from a Social Work text was selected that accounts for Step 3.2.2, Defining terms, which belongs to Move 3.2, Specifications.
Step 3.2.2. Defining terms (Move 3.2. Specifications)
“Glosario
Acoso sexual: insinuaciones, comentarios o comportamientos de tipo sexual que una persona plantea a otra que no los desea en los que persiste, aunque está claro que esa segunda persona se resiste a ellos.
Agentes de socialización: Grupos o contextos sociales en los que tienen lugar los procesos de socialización. La familia, los grupos de amigos, los colegios, los medios de comunicación y el ámbito laboral son los lugares en los que se produce este aprendizaje cultural.” (CA-TS-ma222)
“Glossary
Sexual harassment: insinuations, comments or behaviour of a sexual nature that one person implies to another who does not welcome them [and] in which the person persists, although it is clear that the second person is resisting them.
Socialising agents: Groups or social contexts where the processes of socialisation take place. Family, peer groups, school, the media and the workplace are the places where this cultural learning is produced.” (CA-TS-ma222)
In this passage where the closing move is identified, comprehension of the contents is supported via the definition of relevant terms, which are unquestionably new to the student, in the writer/author’s view. This didactic resource again points out the special concern for a semi-lay audience, an audience in the process of acquiring new disciplinary knowledge, and shows the organisation of procedures that support the reader.
4. Summary
Further to the presentation of macro-moves and identification of the corresponding purposes, as well as the respective examples from texts belonging to the PUCV-2006 Corpus of Spanish in four disciplines, four unexpected findings emerged from the analysis of this genre. They constitute a very particular superstructural organisation in which prototypical features were identified from the macro-level to the most micro-organisational level. These are (a) the cyclic nature of some macro-moves and moves, (b) the degree of hierarchy governing some moves and steps, (c) the degree of interchangeability of some moves and steps, and (d) the central or obligatory role versus the satellite or more optional function. These all revealed the highly prototypical organisation of the Textbook. These features may be diagrammed into a three-pronged design where two of the components share some characteristics, such as interchangeability in the distributional position and a more satellite role with respect to the genre macro-purpose. The third component is identified as central and highly hierarchical in the distribution and occurrence of the internal steps; it is the component that provides the structuring nucleus for this genre. The two more satellite and distributionally flexible components are the Preamble and the Corollary. The Textbook nucleus comprises Conceptualisation & Exercising. It is possible that this particular organisational form has not been identified before, since the available literature does not record similar data. Therefore, these empirical findings may show some degree of originality in research concerning genre and its rhetorical organisation.
[Diagram: the three macro-moves (Preamble, Conceptualisation & Exercising, and Corollary), each shown with the abbreviations of its constituent steps]
Figure 6. Flexibility and rigidity of macro-moves in the Textbook genre
In brief, the Textbook characterisation may be summed up as shown in Figure 6. Based on this figure, the opening and closing macro-moves show a high degree of interchangeability between the constitutive moves and steps within each macro-move (Preamble and Corollary). This means that they show flexibility in their order of occurrence. In addition, it is evident that they are not explicitly linked and do not need each other for the global coherence and unity of the text. Thus, each of them executes an independent discursive process and fulfils its communicative purpose in an independent manner. In contrast, the central and most prototypical macro-move of the Textbook (Conceptualisation & Exercising) reveals a highly hierarchical internal organisation. All moves and steps within this macro-move follow a sequential and linear pattern that does not admit mobility or interchangeability. This hierarchical sequence ensures an adequate approximation process for disciplinary knowledge and a progressive increase in the degree of difficulty of the contents. An important mechanism in this process is the relation between the moves, by means of linguistic interconnections – as these may allow, for instance, anaphoric references between textual sections. Thus, construction of an integrated global and complementary meaning is ensured. On the other hand, a spiral and recursive distribution of one macro-move as a whole is noticeable. Each conceptual unit, together with the exercises, problems and solutions, may be repeated as many times as book organisation requires. This distinctive rhetorical organisation may be called a “colony-in-loops”. This expression coincides with what is illustrated in Figure 6, on both global and internal levels. There is a certain degree of encapsulation of each of the three
constitutive macro-moves, but, at the same time, these are linked to a macro-organisational level in a hierarchical way. In addition, in the two satellite macro-moves, the idea of a colony emerges due to interchangeability and flexibility in the distribution of the moves and steps of each of the two macro-moves. Nevertheless, the Textbook nucleus shows that it is highly linked in its internal organisation. This distributional sequence is characterised by a series of compartments that, at the same time, are steadily organised. This is why the “colony-in-loops” characterisation may be applied.
Conclusions
Identification of the rhetorical organisation of a genre such as the undergraduate university Textbook has proved to be a complex task. The analytical procedures applied allowed us to identify a set of moves specific to the genre and to construct adequate operational definitions. As part of this task, the triangulation stage turned out to be highly important and helped to corroborate and adjust researchers’ preliminary intuitions. Fulfilling the objectives of this research, the Textbook corpus proved sufficient to obtain reliable and robust results. It is therefore possible to offer broader generalisations from the findings and to develop prototypes that are representative of the genre, with the support of examples. It is also possible to make comparisons with other genres and thereby bring out the most characteristic features of the Textbook. Given the length of the texts that make up this genre and the decision to aim for a higher level of abstraction in the analysis (although with a lower degree of granularity), the concept of macro-move proved to be an innovative resource that helped to obtain a better preliminary approximation of this genre. The analytical and complementary methodology of the deductive-inductive or “top-down” and “bottom-up” processes was a fundamental support for the segmentation and identification process for communicative purposes, in the macro-moves as well as on the other levels of analysis. Considering the most important findings of this study, special attention should be given to identification of the prototypical rhetorical organisation of the Textbook genre, which has been named the “colony-in-loops” form.
This particular organisation is basically characterised by four features: (a) the spiral and cyclical nature of certain macro-moves and moves, (b) a hierarchical sequence of certain macro-moves and moves, (c) flexibility inside a macro-move and the distributional interchangeability of some moves and steps, and (d) the central or satellite function of certain macro-moves. The complementarity of all these features
as part of the complex organisation of a structure, with groups of independent colonies on macro- and micro-levels, emerges as prototypical of the Textbook. All of these features result in textbooks being “considerate texts” (Kantor et al. 1983) where specialised knowledge is delivered in a way intended to disseminate it. This means that this rhetorical organisation pays particular attention to an audience under academic instruction, together with a sequence of contents supported by means of didactic resources – such as, on the one hand, problems and exercises and, on the other, tables, figures and diagrams. Subsequent studies should confirm this singular organisation of macro-moves and reinforce its usefulness. The two following and complementary chapters in this same volume focus, respectively, on the quantification of the rhetorical moves and steps in the Textbook genre, with an emphasis on the four disciplines under study, and on a detailed study of the Disciplinary Text genre.
Chapter 9
The Textbook genre and its rhetorical organisation in four scientific disciplines
Between abstraction and concreteness
Giovanni Parodi
It is hard to find research in which the rhetorical organisation of a discourse genre and the respective frequency of occurrence of moves and steps are worked together across disciplines. This chapter aims to become a step in helping to fill this gap in research on Spanish. The Textbook genre, the occurrence of its rhetorical macro-moves, moves and steps and disciplinarity become the focus of this study. This chapter is offered as a continuation and complementation of the previous one. More specifically, in this chapter we concentrate on the frequency with which each of the macro-moves, moves and steps are executed throughout 126 textbooks. We distinguish the area of knowledge (Basic Sciences and Engineering and Social Sciences and Humanities) as well as the specific discipline (Social Work, Psychology, Industrial Chemistry and Construction Engineering). The main findings show there are interesting differences between the occurrence of some moves and steps across the disciplines under study.
Introduction
In close connection with the previous chapter, the purpose of this section of the book is to determine the frequency of occurrence of the macro-moves, moves and steps in the corpus of 126 university textbooks across four disciplines, as part of the PUCV-2006 Corpus of Spanish. The four disciplines involved are part of the corpora under study in this book: Social Work, Psychology, Industrial Chemistry and Construction Engineering. Special attention is therefore paid to possible variation patterns among the rhetorical moves and steps across disciplines, in order to find out if disciplinarity may emerge as a relevant issue. Available literature features scarce research focusing simultaneously on the rhetorical organisation of discourse genres and on the respective determination
of move frequency. It is even more difficult to find studies that complementarily approach investigation based on diverse corpora of texts collected from ecological principles and covering a range of scientific disciplines. Very few exceptions are documented, such as the research conducted by Kanoksilapatham (2007), comparatively focused on a genre with texts of limited length. Thus, determining the frequency of occurrence of the Textbook genre’s macro-moves, moves and steps and disciplinarity variation is the focus of this research (for detailed analysis of Textbook rhetorical organisation, see Chapter 8 in this same volume).
1. The research
1.1 The Textbook genre
Let us briefly recall the Textbook’s nuclear rhetorical organisation, as identified in Chapter 8. After an exhaustive study, a decision was made to include a new category of analysis, the macro-move. This macro-level was necessary to account for a complete analysis of each text and provide a comprehensive account of the total length of the genre in question (an average of roughly 200,000 words per text). A summary of the identified Textbook organisation is modelled in Figure 1. Genre organisation is displayed in three macro-moves (Preamble, Conceptualisation & Exercising, and Corollary) and ten moves. In a more detailed organisation, nineteen specific steps were identified. For a summarised presentation of Textbook rhetorical organisation, see Annexes 1, 2 and 3.
Figure 1. Rhetorical organisation of the Textbook genre
The nucleus of this genre is expressed in Macro-move 2: Conceptualisation & Exercising, which acts as a recurring cyclical unit that is repeated as many times as required in order to cover the contents to be displayed. By means of this macro-move, along with its corresponding moves and steps, the students are shown a set of technical definitions, explanations and descriptions. They are gradually presented with the new contents (Move 2.1), which are carefully exemplified and summarised. A particular concern is detected in exercising the nuclear contents and in the resolution of the tasks set forth (Move 2.2). The writer/author’s clear decision to guide the audience under training is also expressed in his/her reformulation of the thematic nucleus, since the macro-move is closed with a macrosemantisation where summarising strategies are offered (Move 2.3). Macro-moves 1 and 3 (Preamble and Corollary) do not operate in the same way as Macro-move 2, since they are included only once throughout the text. It is only Macro-move 2 that occurs systematically as a prototypical feature of the Textbook genre and thus becomes a recursive category of great impact. Another interesting contrast between these macro-moves is detected in the more flexible and sometimes random distribution of the moves and steps of Macro-moves Preamble and Corollary. This second feature becomes another important difference compared to Macro-move 2. In Conceptualisation & Exercising, the organisation is more rigid and hierarchical. Moves and steps do not mix, nor is the described order altered. This is clearly due to the author’s careful planning of these didactic resources, which are intended as a means to gradually approach specialised knowledge and the practical stage of exercising. Due to these prototypical features, this genre has been called “colony-in-loops” (see Chapter 8). 
Undoubtedly, these functional mechanisms, through which sets of theoretical and practical units are delivered, are gradually introducing novices to the methodology and procedures of reasoning that future professionals should apply in order to become part of the discourse community. This means that the student is not only learning and acquiring conceptual or declarative knowledge but also procedural knowledge.
1.2 Corpus description
As stated in the Introduction, a corpus of 126 undergraduate university textbooks was collected and analysed. The number of texts and words comprised in the corpus is detailed in Table 1. The figures in this table show how lengthy these texts are in the four disciplines under study. Likewise, this information reveals a tendency towards more frequent occurrence of this genre in the disciplines of Basic Sciences and Engineering. These textbooks contain more than twice as many words (and number 80 texts against 46) as those dealing
Table 1. Numerical constitution of the corpus
University Programme | Area | Number of texts | Number of words
Psychology (PSY) | Social Sciences and Humanities (SS&H) | 31 | 4,925,931
Social Work (SW) | Social Sciences and Humanities (SS&H) | 15 | 2,465,747
Subtotal SS&H | | 46 | 7,391,678
Industrial Chemistry (IC) | Basic Sciences and Engineering (BS&E) | 31 | 9,161,146
Construction Engineering (CE) | Basic Sciences and Engineering (BS&E) | 49 | 6,936,212
Subtotal BS&E | | 80 | 16,097,358
Total | | 126 | 23,489,036
with Social Sciences and Humanities. This difference may be partly due to the greater occurrence in SS&H of another discourse genre which seems to be the most common means of transmission and construction of specialised knowledge in Psychology and Social Work, i.e., the Disciplinary Text. Chapters 4 and 5 discuss and analyse this comparison in detail, and Chapter 10 gives an account of a study of the rhetorical organisation of the Disciplinary Text genre. The total number of words in this Textbook corpus is also relevant information, since there is no record of another study based on almost twenty-four million words. There is no available bibliography describing corpora collected from four disciplines of university communication accounting for all texts used in overall undergraduate university curricula. There is even less literature highlighting disciplinarity as the central feature of variability throughout genres written in Spanish.
1.3 Method
As indicated in the previous chapter, the table of criteria for Textbook rhetorical organisation was verified, and its degree of reliability established, by means of a triangulation process involving three expert judges, with a high consensus percentage (87%). To quantify the moves and determine their frequency of occurrence in the 126 textbooks, the research team divided the analysis into disciplinary domains. During a subsequent phase, quantification by discipline was executed in a crosswise manner. Final quantitative data were triangulated in order to reach a consensual final result when required.
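As an illustration only, frequency of occurrence can be read as the percentage of texts in a group in which a given move appears at least once. The following minimal sketch assumes that reading; the function name, the data layout and the toy annotations are hypothetical and are not taken from the study or from the corpus.

```python
from collections import defaultdict


def occurrence_percentages(annotations):
    """Return {group: {move: percentage of texts in which the move occurs}}.

    `annotations` holds one (group, moves_present) pair per text, where
    `moves_present` is the collection of move labels found in that text.
    A move that never occurs in a group is simply absent from that group's
    dictionary (read it as 0%).
    """
    totals = defaultdict(int)                       # number of texts per group
    counts = defaultdict(lambda: defaultdict(int))  # texts per group containing each move
    for group, moves_present in annotations:
        totals[group] += 1
        for move in set(moves_present):             # count a move once per text
            counts[group][move] += 1
    return {
        group: {move: 100.0 * n / totals[group] for move, n in moves.items()}
        for group, moves in counts.items()
    }


# Hypothetical toy annotations, NOT data from the corpus.
toy = [
    ("BS&E", {"CD", "PRA", "REC"}),
    ("BS&E", {"CD", "PRA"}),
    ("SS&H", {"CD"}),
    ("SS&H", {"CD", "PRA"}),
]
result = occurrence_percentages(toy)
```

Expressing each count as a share of the texts in its own group is what makes groups of unequal size, such as 46 and 80 texts, comparable on the same 0 to 100 scale.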
2. Results and discussion
Figure 2 shows the overall distribution of the macro-move and move frequencies of occurrence across the 126 textbooks of the corpus, separated in each
[Bar chart: percentage of occurrence (0–100) in SS&H and BS&E texts for Moves 1.1 CON, 1.2 CO, 1.3 RO and 1.4 PRE (Macro-move 1: Preamble, PREA); Moves 2.1 CD, 2.2 PRA and 2.3 REC (Macro-move 2: Conceptualisation & Exercising, C&E); and Moves 3.1 SA, 3.2 SPS and 3.3 GUID (Macro-move 3: Corollary, CO)]
Figure 2. Macro-moves and moves in each knowledge area
knowledge domain: Social Sciences and Humanities (SS&H) and Basic Sciences and Engineering (BS&E). We then review more specific steps in each of the four disciplines under study. In order to ensure accurate comparison, all figures have been standardised by means of a percentage process. Clearly, the Textbook genre organisation is not distributed homogeneously in all texts of SS&H and BS&E; interesting differences in the occurrence of some moves characterise the genre. Figure 2 shows that the macro-moves and moves are not equally represented in these textbooks. The data provided in this section demonstrate that the texts of the corpus in BS&E show a 100% occurrence or near the maximum possible in almost all the moves of the three macro-moves identified. The only move with an occurrence under 90% is Recapitulation (REC). In this third move of Macro-move 2 Conceptualisation & Exercising (C&E), textbooks of BS&E show 88% occurrence. In contrast, Social Sciences and Humanities (SS&H) texts stand out due to a low frequency of occurrence in four of the ten moves analysed here. Within Macro-move 1 Preamble (PREA), SS&H texts only amount to 20% in Move 1.3 Resources Organisation (RO). Then, in Macro-move 2 Conceptualisation & Exercising (C&E), the SS&H textbooks evidence only 20% in Move 2.2 Practice (PRA) and a minor occurrence of Move 2.3 Recapitulation (REC) (7%). Lastly, textbooks in this same knowledge domain maintain a 7% frequency of occurrence in Move Solutions & Answers (SA), as part of Macro-move 3 Corollary (CO). These findings indicate an interesting distinction in the rhetorical organisation of the genre between textbooks of BS&E and SS&H, where disciplinarity
[Bar chart: percentage of occurrence (0–100) in SS&H and BS&E texts for the steps of Macro-move 1: Preamble (PREA): SR, ET, PC, ST, DTPA, DTN, GG; Macro-move 2: Conceptualisation & Exercising (C&E): LC, PTN, SCS, PE, ST, EP, M; and Macro-move 3: Corollary (CO): RA, GS, DT, DS, LS]
Figure 3. Distribution of the Text genre’s rhetorical steps in two knowledge domains
shows diversifying rhetorical organisation patterns. This tendency clearly points towards a specific textbook component as the distinguishing feature between the two domains of knowledge under study. This overall analysis of the frequency of occurrence of macro-moves and moves in the entire corpus of textbooks identifies the didactic component in which the tasks suggested are exercised and resolved. In order to more thoroughly describe the rhetorical organisation of the genre under study and delve into these first results, the frequency of occurrence for each step in each macro-move will be determined. According to the data presented in Figure 3, it is clear that this in-depth analysis of texts from the two knowledge domains using macro-moves and steps confirms previous overall findings. The record of occurrence for each of the steps across the 126 textbooks shows that some steps in each of the three macro-moves show important numerical differences between the texts of both knowledge domains. The occurrence, equal to or less than 30% in six steps out of the total nineteen steps in SS&H texts, reveals a focus that is not common to both areas. Comparatively, this low occurrence of some steps in SS&H texts contrasts with a high percentage in the occurrence of all nineteen steps in BS&E texts. Thus, most steps evidence 90% to 100% occurrence in BS&E. Only three steps ranged between 80% and 90%. This shows that BS&E textbooks perform the complete rhetorical organisation described herein, completing each of the steps emerging from the analysis. Likewise, the pedagogic component is also inserted in most BS&E texts
and is evidently of great importance. A comprehensible and well-organised decision is noticeable with respect to the pedagogical function of textbooks in BS&E, since these categories appear repeatedly and at the same level as other central categories. In contrast, SS&H texts do not always conform to the overall organisation described for BS&E texts. Reduced or limited presence of the steps performing certain communicative purposes concentrates on the proposal of exercises, tasks and problems and their guided solutions. Interpretation of these data reveals an exceedingly relevant contrast in the way these texts put knowledge into practice. It is quite clear that the nature of scientific disciplines makes different demands between knowledge domains. In BS&E texts, once a conceptual nucleus has been determined, exemplification and exercising are the fundamental means to integrate and stabilise new knowledge. In SS&H texts, the nature of the objects of study, as well as the degree of abstraction involved in the conceptual clusters, does not directly lead into practical applications. However, it is important to state that this detected tradition could be re-directed, since there is no reason why questionnaires or tasks accompanied by the corresponding guided solutions could not be included as part of SS&H textbooks. Nevertheless, there is evidently a tendency displayed by textbook writers in SS&H disciplines to avoid exercising or solving tasks in a step-by-step disposition, which could be directly related to the subject matter under study. Disciplinarity emerges as a source of distinction and evidences the fact that knowledge is not delivered or exercised in the same way across the scientific disciplines under study. Hence, a distinct mode arises from the data, where each discipline constructs, transmits and re-constructs knowledge in a different manner under one genre which is not totally unified in its organisation.
While BS&E textbooks tend to concentrate on theoretical components and methods of analysis, SS&H textbooks emphasise theorisation, but not practical exercising nor the transmission of models to practice contents or apply this to actual exercises. In order to further analyse these differences, the Textbook corpus was subdivided into each of the four disciplines, and the corresponding calculations were made. Figure 4 provides a detailed and comparative analysis of macro-move and rhetorical step frequency of occurrence. The regularity emerging from each of the nineteen rhetorical steps throughout the texts of each of the four disciplines is surprising. According to the assignment of these steps to BS&E or SS&H, a clear pattern is identified. This revealing finding is detected in the structuring of information and rhetorical functions; on the one hand, in Psychology (PSY) and Social Work (SW) texts and, on the other, in Construction Engineering (CE) and Industrial Chemistry (IC) texts. This more detailed analysis confirms the findings presented in Figures 2 and 3.
[Bar chart: percentage of occurrence (0–100) in PSY, SW, CE and IC texts for the steps of the Preamble: SR, ET, PC, ST, DPTA, DTN, GG; of Conceptualisation & Exercising: LC, PTN, SCS, PE, ST, EP, M; and of the Corollary: RA, GS, DT, DS, LS]
Figure 4. Frequency of Textbook genre’s rhetorical steps in four disciplines
The main differences lie in certain steps of each of the three macro-moves. Supporting Comprehension (SC) is the only step of the first macro-move to evidence substantial differences across the four disciplines. This step presents a frequency of more than 80% in CE and IC, but decreases sharply in SW (10%) and PSY (29%). In textbooks of these two latter disciplines, lists of symbols employed throughout the texts do not emerge as a relevant feature, indicating that technical symbols that require explicit identification in lists supporting reading comprehension are not frequently employed. Conversely, in disciplines such as CE and IC, these iconic resources constitute an important part of disciplinary knowledge, and the writer/author considers that they are a fundamental support for the reader. Besides, although exercises, problems and solutions are very relevant in the nucleus of textbooks in both CE and IC (near 90% occurrence in both disciplines), in PSY (33%) and SW (28%) the frequencies of occurrence are much lower. However, although presenting problems or questions is a purpose that appears to be of relative importance in SS&H, the major difference between disciplines undoubtedly stems from Resolving the Task (RT), Expanding Practice (EP) and Macro-semantisising (MS). These three steps, corresponding to the moves Practice and Recapitulation of the macro-move Conceptualisation & Exercising, reveal the main distinction between IC and CE textbooks and PSY and SW textbooks. Undoubtedly, as may be observed in the percentages of occurrence, these steps are highly optional in PSY and SW textbooks.
The didactic component thus appears as the differentiating feature across the textbooks of the corpus in the four disciplines under study. More precisely, the exercising component, based on the presentation of problems and questions accompanied by the corresponding solutions, in some cases resolved step-by-step and with a detailed analysis of the execution procedure employed, shows differences between the disciplines of BS&E and those of SS&H. The pattern detected is highly irregular across the four disciplines. While the CE and IC textbooks are very similar in the inclusion of the three final steps of macro-move Conceptualisation & Exercising, PSY and SW textbooks show very limited or non-existent occurrence of these steps. Similar results are found in the first step of the last macro-move, Resolving & Answering (RA). The communicative purpose of this textbook section is to deliver very detailed and step-by-step solutions for the tasks and/or problems anticipated in the preceding steps of the previous macro-move. Once again, the frequency of this step in CE and IC texts is over 90%, and under 10% for SW and PSY texts. Although questions had been raised in former sections, it is evident in the SW and PSY textbooks that the writer/author does not feel the need to make the answers explicit or to provide a guided resolution in order to help students to come up with the expected answer. This revealing attitude of the writer/author of SW and PSY textbooks, in contrast to that of the writer/author of IC and CE textbooks, denotes transmission of both a form of knowledge and methodological procedures, that is to say, declarative knowledge and procedural knowledge. Accordingly, Macro-move 2 shows its importance as a functional category with a higher level of abstraction. Hence, the inclusion of the macro-functional level of analysis enabled the detection of differences between disciplines. This also reveals its potential identification and implementation.
If only the concept of a set of moves had been employed, textbooks might have been described as so different across the disciplines mentioned that we would wonder if these all belong to the same genre. Incorporation of the macro-level proves that there actually is a genre shared by the four disciplines under study. It also reveals that part of this macro-function is not performed in all textbooks. The similarity between CE and IC texts, as well as between PSY and SW texts, is impressive and revealing. This finding is most unanticipated and surprising, since such regularity in the rhetorical pattern between the disciplines under study, according to their assignment to an area of knowledge, was unexpected. Although the texts of the four disciplines preserve the global macro-purpose “...to instruct regarding concepts and/or procedures within a specialised thematic” (see Chapter 3) and, therefore, all of them belong to the Textbook genre, there are important distinctions between the disciplines and the knowledge domains. However, the differences detected do not actually mean that the texts from these knowledge domains no longer belong to the Textbook genre. On the contrary,
with all of these texts belonging to the same genre, there are interesting differences in their characteristics across the disciplines. The latter confirms an assumption that is repeatedly made and empirically proved in this book: Differences in the way of articulating the didactic component between textbooks of different areas do not mean that these are no longer textbooks. What is indeed shown empirically is that the genre varies to a great extent across some disciplines (SW and PSY; CE and IC) and that there is also high homogeneity in the variation pattern of disciplines belonging to the same domain (BS&E and SS&H). This noticeable difference in the way of leading a reader to approach disciplinary knowledge is definitely linked to the way scientific work is conceived by the author or writer as a specialist in this discipline. It is also related to the best way of arranging the presentation of new ideas and of how the student is expected to organise knowledge in a specific discipline. These empirical findings based on a Spanish corpus of textbooks partially correspond to what was previously stated by Halliday and Martin (1993), Martin (1998), Martin and Rose (2007), Martin and Veel (1998) and by Wignell (1998, 2007a and b), regarding differences between social and basic scientific texts written in English. Basically, these research works point towards a distinction between an abstract versus a technical mode, between SS&H and BS&E respectively. Although none of these investigations are supported by empirical analysis such as that presented herein, and these do not follow corpus-based principles (Tognini-Bonelli 2001), their contributions generally establish a difference between the BS&E and SS&H texts in English. The findings reported have led us to propose a distinction between abstraction and concreteness as distinctive features between SS&H and BS&E textbooks.
Thus, while IC and CE texts tend to be concrete and precise in the way they articulate new concepts and indicate resolution procedures, PSY and SW text contents are comparatively more abstract: they do not necessarily correspond to a paradigmatic presentation of one single and exclusive scientific position. Likewise, in PSY and SW textbooks, permanent exemplification or exercising does not exist, and no straightforward intention of leading the reader to find a unique and definite way to solve possible emerging problems is observed. At the same time, no carefully laid out specifications as to the way a certain problem should be approached and solved were identified. Apparently, no specific and explicit ways of reasoning are decidedly being imposed on the reader of these textbooks, or at least they were not identified in the analysis executed. In contrast, PSY and SW texts revealed something inherent to these disciplines: the modus operandi which the texts of SS&H put into practice is precisely the prototypical ways in which this disciplinary knowledge is transmitted, constructed and gives form to each of these disciplines. It is not the presence/absence
of a certain rhetorical component, but rather the intrinsic manner of approaching a theme and putting it into practice. Both PSY and SW texts evidence a clear pattern of how to organise contents and present them for study and learning. Hence, the abstraction and absence of definitive truths appear as the most common features in PSY and SW texts. The tendency here is to meditate about the various models without describing a single pattern for the delivery of paradigmatic contents, i.e., for delivery of well-grounded knowledge, unquestioned truths and concepts coming from a defined theoretical position. IC and CE textbooks also explicitly identify the manner in which the reader should approach this knowledge and the expected reasoning format for the resolution of tasks is clearly described. This leads the reader to learn not only the concrete contents, but also the procedures of the disciplines and the steps that each new member in the process of apprenticeship pertaining to these communities is expected to perform and systematise for problems that might come up in the future. In short, IC and CE present a specific and clearly defined path that each student or novice must learn to follow precisely. The nature of IC and CE disciplines imposes a modus operandi that is very different from the one presented in PSY and SW textbooks.
Discussion and conclusions
As stated beforehand, for English, Halliday and Martin (1993) propose a distinction between texts based on principles of abstraction and technicality; Martin and Rose (2008) propose a difference in terms of control of the social world and control of the natural world; and Wignell (1998, 2007a and b) argues that there is a difference in terms of two grammatical resources: technical vocabulary and taxonomic relations (basically those of hyponymy and meronymy). All these investigations have mainly been developed within the paradigm of Systemic Functional Linguistics (SFL) of the School of Sydney, which implies a specific way of understanding knowledge and discourse genres (Halliday 1978; Halliday & Martin 1993; Christie & Martin 1997, 2007). As stated in our framework in Chapter 2, our ontological and epistemological conception of human beings, genres and knowledge does not coincide with the basic assumptions of SFL. This is basically due to our sociocognitive approach to discourse, in which mental representations play a crucial role, as does the subject’s awareness in the construction of disciplinary knowledge. Therefore, knowledge is not defined as a semiotic externalist process, but is constructed during social interactions and is represented in the subject’s mind (Peronard & Gómez 1985; Parodi 2005a, 2007a). Our breaking point with the basic principles of SFL is thus its outright antimentalism.
Our empirical findings for the Spanish language, stemming from the data presented herein, lead us to establish three major differences between the textbooks of two BS&E disciplines and two SS&H disciplines. These differences lie in: (1) the degree of abstraction and concreteness of knowledge, (2) the way the pedagogic device is articulated and (3) the type of methodological procedures being transferred. Thus, quantitative results delivered through the rhetorical steps of Macro-move 2 and their systematic occurrence in the corpora of IC and CE show a clear and precise display of very specific didactic resources to introduce and guide novices in specialised topics and fundamental technical procedures of the disciplines in question. In turn, SW and PSY corpora reveal the manner in which new knowledge is approached by these sciences and demonstrate that exercising and solving the tasks proposed are a minor focus of attention. Likewise, scarce or non-existent detailed explanations of desirable procedures and reasoning were found in SW and PSY texts. In contrast, IC and CE texts evidenced explicit concern to prepare the construction of new specialised knowledge and to announce the resolution of exercises and problems, as well as to indicate the discipline’s ideal prototypical reasoning in order to achieve an expected outcome. Diverging epistemological assumptions between the two groups of scientists and disseminators of knowledge are thus enacted by means of textbooks, giving rise to this specialised genre. Hence, textbooks must not and cannot be conceived exclusively as mere didactic instruments. It is true that a defined pedagogic model underlies the Textbook, where the role of the writer as a disciplinary expert is to instruct a lay or semi-lay student in a specialised subject matter.
This pedagogic model imposed in the textbooks stands out in that it provides a unique access route towards knowledge, where a text is used to execute a one-way instruction process. The apprentice is thus initiated into a new world and develops disciplinary, cultural and social competences. In addition to the pedagogic model, students also gain perspective and understanding of the specific scientific field through textbooks. However, a novice will not become an expert member of a discourse community by the exclusive process of reading university textbooks. As has been argued in Chapter 2 of this book, effective participation in a discourse community takes place when an expert member is also able to write fundamental disciplinary written genres. In fact, a student certainly does not become a full and active member of a discourse community just by reading textbooks. It is by means of writing other specialised genres that the novice will eventually become a member of the community. For example, writing Research Articles, Reports and Lectures, as highly specialised discourse means, constitutes a very important writing exercise in order to approach disciplinary practices.
Likewise, for certain professionals, the production of Calculation Memories, Bidding Specifications and Medical Reports will show their expertise in daily communication. Nevertheless, the reading of textbooks as initiation texts is fundamental as an articulating path towards a knowledge area and its specialised procedures. Thus, reading and writing are clearly revealed as highly interconnected processes. In particular, it is worth noting the way in which these interlink and complement each other as a nuclear part of disciplinary academic literacy (Parodi 2007b). However, it is evident that the reading and writing of other genres leads the new members from “knowledge-telling” to active “knowledge-transforming” (Bereiter & Scardamalia 1987). A gradual process is established by means of these progressive mechanisms towards the writing of specialised genres within a discipline, where the texts actually end up acting as “paths towards knowledge” (Bartolomé 1986). The processes of specialised reading and writing are not executed once and for all, but are complementarily articulated over a lifetime. Acquisition and mastery of discourse genre knowledge thus becomes an ongoing and progressive task which every disciplinary literate person puts into practice every day of his/her academic and professional life. This shows that genres are highly dynamic objects that change progressively in order to meet the information exchange and interaction needs of writers and readers. It also evidences that discourse genres emerge, crystallise and may disappear or become obsolete in order to meet the communication purposes of the members from diverse spheres of academic and professional life. The quantification and analysis of the occurrence of rhetorical moves and steps in the corpus revealed interesting variations in the rhetorical organisation of the Textbook genre.
Interesting differences emerged between Social Sciences and Humanities (SS&H) and Basic Sciences and Engineering (BS&E). The nuclei that give unity and stability to the genre and realise the general purposes of the Textbook also stand out. The strengths of research based on corpora of complete texts reinforce the assumptions of studies conducted on robust corpora, now with the objective of determining the rhetorical organisation of genres. Disciplinarity is confirmed as a highly relevant variable in the rhetorical organisation of a genre such as the Textbook. The findings indicate that texts belonging to the disciplines of Psychology and Social Work are rhetorically organised in a different way from the texts of the disciplines of Construction Engineering and Industrial Chemistry. The pedagogic component emerged as the most important contrast, i.e., the moves that operate these functions evidence high occurrence in BS&E texts, while these do not appear to be prototypical in most texts of SS&H. This difference between the textbook rhetorical organisations across disciplines becomes
very revealing for academic literacy at undergraduate university level. This pedagogic component is displayed basically by means of the proposal of exercises and problems and/or open questions. However, above all, the major difference comes up in the resolution of tasks and specification of procedures in order to obtain the expected answer. All of this shows that disciplinarity offers several resources in the texts of the same genre in order to conduct the process of construction and appropriation of specialised knowledge. It is worth noting that quantitative differences found in the frequency of occurrence of some specific steps across the four disciplines do not mean we are in the presence of two different genres underlying a “macro-genre” called Textbook. The limited occurrence of some functions in Psychology and Social Work texts does not imply that the genre’s general macro-purpose is different. The reduced presence of some rhetorical steps in the pedagogic component in SS&H texts is just a sign of disciplinary variability across one unified genre. In sum, we may conclude from these reported findings that the teaching/learning of rhetorical devices varies from one discipline to the other. This means that teaching takes place in a different manner in diverse disciplines, due to the divergent nature of the disciplines themselves. Furthermore, there is a higher degree of abstraction in SS&H texts, which is not necessarily accumulative. This means that knowledge in PSY and SW is less hierarchically displayed than in BS&E, and its presentation tends to show a construct in line with co-existent alternative theories and focuses. Therefore, diverse theories are often presented in textbooks, approaching a problem in an alternative manner and without definitely discarding any of these. This undoubtedly entails a selection of prototypical features for the written text of the SS&H Textbook genre.
On the other hand, CE and IC texts clearly show devices to transmit specific conceptual knowledge and, at the same time, specific procedures of reasoning. Thus, texts in BS&E revealed a way of organising knowledge with a lower degree of flexibility. Answers included in BS&E textbooks are often limited to a strict display of information and are formulated from one viewpoint. There are few options in the presentation of the information, and knowledge tends to become more concrete and highly accumulative, presented from clearly paradigmatic perspectives. As an explanation for these differences, a distinction between concreteness and abstraction in the mode of displaying knowledge has been proposed, as well as in the modus operandi for displaying the pedagogic component. Subsequent research will have to investigate the connections between linguistic patterns and rhetorical steps in different scientific spheres.
Annex 1: Detailed rhetorical organisation of Macro-move 1: Preamble

Name of Move and Steps | Communicative Purpose | Structure
Move 1.1. Contextualisation (CON) | To relate parts of the text, to comment on its contents and include acknowledgments. | Prologue/Preface
Step 1.1.1. Situating the Reader (SR) | To explain the context for the text’s production. |
Step 1.1.2. Expressing Acknowledgments (EA) | To express thanks to editors, collaborators, students and others. |
Move 1.2. Contents Organisation (CO) | To show the book’s contents and its thematic organisation. | Content Index
Step 1.2.1. Presenting the Contents (PC) | To offer a list of sections and/or parts of the contents of the book by means of a numbered list. |
Move 1.3. Resources Organisation (RO) | To support the comprehension of the book’s contents. | Index or Table of Symbols and Abbreviations
Step 1.3.1. Supporting Comprehension (SC) | To give a list of symbols used in the text that support comprehension. |
Move 1.4. Presentation (PRE) | To describe antecedents, context and objective of the text for the reader. | Introduction
Step 1.4.1. Declaring Textbook Purpose and Audience (DTPA) | To describe the objective and audience. |
Step 1.4.2. Describing the Thematic Nucleus (DTN) | To present the specific thematic nucleus to be discussed. |
Step 1.4.3. Giving Guidelines (GG) | To describe textbook phases, steps or stages. |
Annex 2: Detailed rhetorical organisation of Macro-move 2: Conceptualisation & Exercising

Name of Move and Step | Communicative Purpose | Structure
Move 2.1. Concept Definition (CD) | To describe and explain processes, objects or others. | Nucleus of a chapter
Step 2.1.1. Linking Contents (LC) | To link new concepts or procedures with those of one or more preceding chapters. | Introduction to a chapter/section
Step 2.1.2. Presenting the Topic Nucleus (PTN) | To describe and define the object, concept or procedure under study, often accompanied by drawings, figures, tables or formulae. | Nuclei or units of a chapter/section, where a multimodal component is highlighted
Step 2.1.3. Specifying Components or Sections (SCS) | To subclassify or divide the concept or procedure under study into parts, with descriptions and definitions of types, parts or components. | Subunits of a chapter/section
Move 2.2. Practice (PRA) | To present practical tasks based on the contents reviewed in the section. | Part of a chapter
Step 2.2.1. Presenting an Exercise or Example (PE) | To present a problem, exercise or example, accompanied by one or more questions directly related to previous definitions and descriptions. | Exercise-Problem-Example
Step 2.2.2. Solving the Task (SC) | To solve the problem in a concise and direct manner or propose strategies for solving it. Formulae, mathematic equations and brief explicative or descriptive phrases are used. |
Step 2.2.3. Expanding Practice (EP) | To deliver more problems or examples (without the solution). | Supplementary Exercise/Problem
Move 2.3. Recapitulation (REC) | To list global ideas. | End part of a chapter
Step 2.3.1. Macro-semantisising the Contents (M) | To sum up or define nuclear concepts, objects or procedures presented in the chapter; normally, introduced by means of vignettes. | Summary
Annex 3: Rhetorical organisation of Macro-move 3: Corollary

Name of Move and Step | Communicative Purpose | Structure
Move 3.1. Solutions and Answers (SA) | To point out solutions to the exercises and problems and give answers to the problems presented in each chapter. | Annexes/Appendices
Step 3.1.1. Resolving and Answering (RA) | To provide solutions to exercises and answers to problems presented in each of the preceding chapters. | Annexes/Appendices
Move 3.2. Specifications (SPS) | To support comprehension of terms, units and abbreviations. | Annexes/Appendices/Glossary
Step 3.2.1. Giving Specifications (GS) | To provide a set of tables where diverse technical information is recorded. | Annexes/Appendices
Step 3.2.2. Defining Terms (DT) | To support comprehension of technical terms, presented in alphabetical order and accompanied by a brief definition. | Glossary/Key Terms/Definitions
Move 3.3. Guidelines (GUID) | To offer bibliographical sources and support the search for topics through an alphabetically ordered guideline. | Analytical Index/Bibliography
Step 3.3.1. Declaring Sources (DS) | To provide bibliographical references. | Bibliography/References
Step 3.3.2. Listing Subjects of Text in Alphabetical Order (LS) | To offer a list of the subjects of the book in alphabetical order, with indications as to their location in the text. | Analytical Index
chapter 10
The Disciplinary Text genre as a means for accessing disciplinary knowledge
A study from genre analysis perspective
Romualdo Ibáñez
This chapter describes and illustrates the Disciplinary Text, an academic genre that has emerged as one of the most frequent means of written communication in the PUCV-2006 Academic Corpus of Spanish, in three different disciplinary domains (Social Work, Psychology, and Construction Engineering). In order to do so, we carried out an ascendant/descendent analysis of the rhetorical organisation of 270 texts. The results revealed the communicative purpose of the genre, as well as its particular rhetorical organisation, which is realized by three rhetorical macro-moves, eight moves and eighteen steps. Besides, quantitative results showed relevant variation between Social Sciences and Humanities and Basic Sciences and Engineering in terms of the rhetorical steps and moves that realize the communicative purpose of the genre. These results allow us to say that, although the main characteristics of the genre under study are the same across disciplines, some others are not due to disciplinary variation. One of the main implications of this analysis is the empirical evidence that the ways of transmitting knowledge through discourse varies depending on the discipline and knowledge domain.
Introduction
One of the main assumptions underlying the studies presented in this volume is that exhaustive descriptions of specialised language use can provide empirical data to support the design of adequate didactic tools, in order to improve the discursive practices of university students as new members of a disciplinary community (Parodi 2007a; Ibáñez 2007b). According to some researchers (Bhatia 2002, 2004; Hyland 2004), successful interaction with texts from a particular discipline is a basic condition for becoming a member of the disciplinary community.
Disciplinary literacy should be approached from a theoretical-methodological perspective that enables the integrated study of academic text characteristics, as well as of the psychodiscursive processes involved in processing them. In keeping with this idea, we believe it is impossible to maintain a categorical distinction between so-called externalist studies and those considered to be internalist. Instead, both types of study must be understood as complementary. This standpoint becomes stronger when we assume that the specific disciplinary knowledge and the necessary psychodiscursive skills underlying the processing of any academic text are developed, transformed and transmitted through discursive interaction. For this reason, in the study reported here, we analyse the cognitively-mediated interaction existing between the general communicative purpose, or macropurpose, (see Parodi, Chapter 8) of an academic genre and the diverse discursive units realizing this purpose. We also explore the way in which this interaction may vary depending on the discipline. We focus on a genre present in the PUCV2006 Corpus of Spanish, identified as Disciplinary Text (see Chapter 3). The objective of this research is to describe the Disciplinary Text genre in terms of its rhetorical organisation, identifying the possible differences determined by each discipline and knowledge domain – in this case, Social Sciences and Humanities (SS&H) and Basic Sciences and Engineering (BS&E). In order to do so, we analysed 270 texts corresponding to the genre in question, which, at the same time, is present in three of the four disciplines that make up the PUCV-2006 Corpus of Spanish: Construction Engineering, Psychology and Social Work.
1. Approaching Academic Discourse from the perspective of Genre Analysis
1.1 Academic Genres
There is currently a great variety of perspectives from which the study of discourse genres can be conducted. Among them, we can find those proposed by New Rhetoricians (Bazerman 1994), by Functional Systemic linguists (FSL) (Martin 1992; Martin & Rose 2008), by English for Academic Purposes specialists (EAP) (Swales 1981, 1990, 2004; Bhatia 1993, 2002, 2004) and, more recently, by Corpus linguists (Biber, Connor & Upton 2007). In this context, it is relevant to state that the research presented herein aims to integrate various approaches. That is, considering the relevance we assign to cognitive as well as to social dimensions of disciplinary literacy, we will adopt Genre Analysis, as framed in the pioneering contributions of Swales’ model (1981, 1990);
but we will also integrate recent contributions from other researchers (Bhatia 2004; Kwan 2006; Biber, Connor & Upton 2007; Parodi, in Chapters 8 and 9). We have to emphasise that, unlike works such as those by Freedman and Medway (1994) and Bhatia (2004), this chapter does not represent an in-depth historical revision regarding the evolution of or the diverse perspectives in the study of discourse genres. Rather, we shall directly approach only those proposals and concepts that are central for our analysis. Thus, we must first establish what we understand as text and, of course, what we conceive as a discourse genre. From our point of view, a text constitutes a concrete instance of a particular language, understood as a meaning potential. This lexico-grammatical and semantic instance is consciously and intentionally produced by an individual within a particular context, in order to satisfy a communicative purpose. For this reason, the characteristics of each communicative event, or text, emerge from the linguistic choices an individual can make, according to her/his communicative purpose and restrictions imposed by situational and cultural contexts. Therefore, we believe that even though every communicative exchange has a social objective, every linguistic phenomenon has its origin in the individual. Likewise, we understand discourse genres as the conventionalised standardisation of the linguistic activities that individuals carry out in order to achieve their communicative purpose. This means that even though genres are discursively instantiated as texts in social interaction, they exist at the same time as knowledge structures, which are built and stored according to the previous experiences of the individuals who use them. We thus conceive discourse genres as complex entities, configured by the interaction of three constitutive dimensions: the linguistic dimension, the cognitive dimension and the social dimension (see Parodi, Chapter 8). 
As products of multidimensional interactions, genres can be described in terms of both their contextual and their linguistic characteristics (see Chapter 3). Thus, as texts tend to share some of these characteristics, such as the communicative purpose (Askehave & Swales 2001), they can be identified as belonging to a particular genre. Consequently, for the discourse analyst, a genre is a construct that allows the categorisation of texts according to recurrent features of both a contextual and a linguistic nature. In the same way, knowledge of and familiarity with these characteristics enable expert members of a given disciplinary community to identify certain texts as belonging to one or another discourse genre (Swales 1981, 1990; Bhatia 1993, 1997).
1.2 Disciplinary Text and rhetorical moves
As mentioned in Chapter 3 of this book, the Disciplinary Text emerges from the PUCV-2006 Corpus of Spanish as a written genre whose main purpose is to persuade a disciplinary specialist regarding a certain theoretical, methodological or ideological thesis. This particular purpose makes it possible to distinguish this genre from other genres belonging to Academic Discourse. At the same time, due to its high degree of specialisation, it is not in principle directed to non-expert members of the discursive community. These particular contextual characteristics can be identified in the linguistic and structural features of the text (see Venegas, Chapter 7). Unlike genres such as the Research Article, which has been exhaustively studied in English (Swales 1981, 1990, 2004; Berkenkotter & Huckin 1995; Samraj 2002; Ruiying & Allison 2003; Kanoksilapatham 2007) and also in Spanish (Ciapuscio 1996; Ciapuscio & Otañi 2002), the Disciplinary Text has not been described and, for this same reason, has not been distinguished from other academic genres such as the Textbook. Since we understand that the communicative purpose of the Disciplinary Text is different from that of the Textbook and therefore that its rhetorical characteristics are also different (Parodi, Chapters 8 and 9), we believe it relevant to study and to identify it as a particular and independent genre. Therefore, since we are interested in focusing on the communicative or macro-purpose of the Disciplinary Text, and on the way it is realised discursively in its rhetorical organisation, we follow the central concepts of the Create a Research Space (CARS) model proposed by Swales (1981, 1990, 2004). This model has become increasingly relevant for this kind of research and has inspired a large number of additional studies (Bhatia 1993, 1997; Bolivar 1999, 2000; Bunton 2002; Núñez, Muñoz & Mihovilovic 2006; Kwan 2006; Biber et al. 2007).
In this model, a text is described in terms of a sequence of moves, where each move is associated with a discourse unit that serves a specific communicative purpose. Hence, each move not only has its own purpose, but also contributes to the global purpose of the genre (Swales 1990). According to Swales (1981, 1990) and Bhatia (1993, 2004), the purpose of a particular genre is the basic feature that enables us to distinguish one genre from another. At the same time, these authors maintain that this purpose can be easily identified only by expert members of a certain discourse community. Moves are constituted by multiple elements which, in different combinations, realise the move. These elements are called steps by Swales (1981, 1990, 2004) and strategies by Bhatia (1993). In Table 1, we present an example of the CARS model, by Swales (1990).
Table 1. Move 1 in CARS model
Move 1. To establish Territory (Section: Introduction)

STEPS | EXAMPLES
Step 1. Emphasising the relevance and/or… | The study of the discourse has become fundamental since…
Step 2. Proposing generalities and/or… | There are many situations where…
Step 3. Revising important aspects of former studies | Martin (1992) states that…
Table 1 shows how Swales (1981, 1990, 2004) illustrates the interaction between a move and its constitutive steps. The author states that communicative purposes are realised through these interactions. Applying this model to introductions of Research Articles, Swales (1990) identifies move 1, Establishing the territory, as the move which introduces the general research subject. He then explains that it is constituted by a maximum of three steps: Point out the relevance and/or Propose generalisations and/or Revise important aspects of former studies. We believe that moves may vary in terms of abstraction. Thus, they could be associated with functional units of major or minor length, which may coincide with structural units of diverse extension, such as complete sections of a text – an introduction – or minor textual units that constitute a particular section – a paragraph. This idea turns out to be highly useful when we consider that one of the most salient characteristics of the Disciplinary Text is its extension. Thus, our proposal is different in terms of abstraction with respect to Swales (1981, 1990). The CARS model, as presented by Swales (1981, 1990) and later revised (Swales 2004), is focused on a textual section, the introduction. This narrow scope allows for a detailed analysis of rhetorical moves and their corresponding steps. The same happens in studies focused on genres of short extension (Bhatia 1993, 1997; Connor & Upton 2004). Our research, on the other hand, focuses on a global and integral view of the Disciplinary Text as the unit of analysis, which implies a lengthy text and thus, a higher level of abstraction in terms of moves and steps. This takes us to the conceptions of the macro-move and the macro-purpose, as proposed by Parodi (in Chapter 8). The ideas of the macro-move and macro-purpose imply a large scope of observation. 
This approach entails the identification of major discourse units that serve diverse communicative purposes, which in turn make up the genre’s communicative macro-purpose. This makes it possible to identify certain major units or macro-moves and, therefore, to describe an extensive genre as a whole (Parodi, Chapters 8 and 9).
194 Romualdo Ibáñez
2. The research
This research aims to describe the Disciplinary Text genre in functional-discursive terms. In order to do so, we will first carry out a qualitative description and then a quantitative one.
2.1 The corpus
According to the frequency of appearance in the PUCV-2006 Academic and Professional Corpus of Spanish, the Disciplinary Text genre turns out to be crucial in the construction of disciplinary knowledge at a university level. Table 2 presents this genre’s frequency of appearance in each of the disciplines. As already stated in Chapters 4 and 5, the PUCV-2006 Academic and Professional Corpus of Spanish represents about 100% of the written material that students from four university degree programs (Construction Engineering, Psychology, Industrial Chemistry and Social Work) at Pontificia Universidad Católica de Valparaíso (PUCV), Chile, are required to read during their four years of disciplinary instruction (for further detail concerning this corpus, see Parodi, Chapters 4 and 5). As indicated in Table 2, the PUCV-2006 Academic and Professional Corpus of Spanish consists of 491 texts, grouped into 9 genres. Even more importantly, 55% of the texts that make up the corpus (270 of 491) correspond to the Disciplinary Text genre and are concentrated in three disciplines: Construction Engineering (7), Psychology (146), and Social Work (117). Since we focus on the Disciplinary Text, this research will concentrate on the sub-corpus constituted by the 270 texts identified as pertaining to the Disciplinary Text genre.

Table 2. PUCV-2006 Academic and Professional Corpus of Spanish
| Genre | Construction Engineering | Psychology | Chemistry | Social Work | Texts per genre |
|---|---|---|---|---|---|
| Dictionary | 1 | 1 | 0 | 0 | 2 |
| Didactic Guideline | 1 | 18 | 22 | 0 | 41 |
| Textbook | 49 | 31 | 31 | 16 | 127 |
| Regulation | 11 | 3 | x | 1 | 15 |
| Disciplinary Text | 7 | 146 | x | 117 | 270 |
| Research Article | 0 | 21 | 0 | 1 | 22 |
| Report | 0 | 3 | 0 | 7 | 10 |
| Test | 0 | 3 | 0 | 0 | 3 |
| Lecture | 0 | 1 | 0 | 0 | 1 |
| Total number of texts per discipline | 69 | 227 | 53 | 142 | 491 (total number of texts) |

The Disciplinary Text genre 195
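The share of the corpus taken up by each genre can be recomputed directly from the per-genre totals in Table 2. The following is only an illustrative sketch: the dictionary and the function name `genre_share` are our own and are not part of the original study's tooling.

```python
# Per-genre totals transcribed from Table 2 (summed over the four disciplines).
texts_per_genre = {
    "Dictionary": 2,
    "Didactic Guideline": 41,
    "Textbook": 127,
    "Regulation": 15,
    "Disciplinary Text": 270,
    "Research Article": 22,
    "Report": 10,
    "Test": 3,
    "Lecture": 1,
}


def genre_share(counts, genre):
    """Return the percentage of all corpus texts that belong to `genre`."""
    total = sum(counts.values())
    return 100 * counts[genre] / total


corpus_size = sum(texts_per_genre.values())
dt_share = genre_share(texts_per_genre, "Disciplinary Text")
print(corpus_size, round(dt_share, 1))  # prints: 491 55.0
```

The same function can be applied to any other genre in the table, for instance `genre_share(texts_per_genre, "Textbook")`.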
2.2 Procedures
According to Biber et al. (2007), discourse analysis may be approached from two opposite directions. These authors distinguish two different ways of approaching the corpus under study: the ascending, or “bottom-up”, method and the descending, or “top-down”, method. The first is an inductive procedure, entailing preliminary analysis of the corpus, during which discourse units emerge according to the linguistic features of the texts. The second entails the establishment of categories of analysis prior to studying the texts. In order to conduct our research, we integrated the ascending and descending methods. By doing so, we contrasted categories of analysis established by disciplinary specialists participating in the research with those categories that emerged from the texts under analysis, as detailed in Table 3. As a result of this method, certain categories were rejected, others were reinforced and, of course, some new categories emerged throughout the procedure.
This analytical framework corresponds to the frame designed by Parodi (see Chapter 8) and consists of four stages and twelve methodological steps, as indicated in Table 3. The following is a detailed description of the fundamental procedures in each of these stages.
The most important procedure in the first stage was the establishment of a degree of abstraction for observation. This procedure was carried out after the identification of a series of discourse units which emerged from analytical reading of the texts. After considering the length of the genre under study, the identified discourse units were ranked, meaning that some minor units were subsumed into major ones. This enabled the identification of a series of broader communicative purposes that included other more specific purposes, thus resulting in a framework formalised in a table of criteria which illustrates the moves and macro-moves constituting the genre.
After applying the criteria table to all of the texts in Stage 2, it was possible to perform some necessary modifications. In the third stage, the central step was estimation of the instrument’s reliability by means of an interrater reliability procedure (Hatch and Lazaraton 1991). As part of this process, reviewers were trained as to the moves and macro-moves to be observed, which entailed calibrating the abstraction level of their observational focus. In order to execute this procedure, a micro-corpus was constructed through a random selection of a group of texts pertaining to the three disciplines involved. The outcome of this stage allowed for more consistent application of these criteria. The raters reached a percentage agreement of 85%. Finally, for Stage 4, an electronic tabulation sheet was designed, enabling efficient accounting and subsequent automatic calculation of the results.

Table 3. Possible stages and steps for conducting move analysis
| Stages and steps to conduct a move analysis | Description |
|---|---|
| Stage 1. Analytical framework configuration | A preliminary analysis is performed from a micro-corpus for the construction of a first criteria table. |
| Step 1.1. Identifying text units | Based on an initial analytical study of the micro-corpus, a set of text units is identified. |
| Step 1.2. Determining the observation focus | The degree of abstraction is established in order to observe the communicative purposes forming the genre. |
| Step 1.3. First equalisation | The relation between the focus of observation and the identified text units is revised and adjusted. |
| Step 1.4. Assigning communicative purposes | Each identified discourse unit is associated with a communicative purpose. |
| Step 1.5. Label production | A label is assigned to each identified discourse unit, according to the communicative purpose it eventually fulfills. |
| Step 1.6. Identifying the general communicative purpose | The genre’s general communicative purpose is determined according to the set of previously identified communicative purposes. |
| Step 1.7. Designing the first criteria table | Based on previously developed steps, a first classification in terms of macro-moves, moves and steps is designed. |
| Stage 2. Extension and adjustments | The criteria table is applied to the whole corpus and eventual modifications are made. |
| Step 2.1. Applying the criteria table | The criteria table is applied to the total amount of texts in the corpus. |
| Step 2.2. Second equalisation | Based on the application to the total corpus, the necessary modifications are made to the criteria table, which implies including or excluding some macro-moves, moves and/or steps. |
| Stage 3. Reliability of criteria table | A triangulation process is carried out in order to establish the instrument’s reliability. |
| Step 3.1. Determining the instrument’s reliability | In order to determine the percentage of agreements among raters, three expert reviewers applied the criteria table following the same procedure. |
| Step 3.2. Third equalisation | After the triangulation process and over 80% agreement, several adjustments are made to the criteria table in order to settle emerging discrepancies noted after the reviewers had made their analysis. |
| Stage 4. Establishing the occurrence of functional categories | The final criteria table is applied to the corpus in order to quantify the occurrence of moves and steps. |
| Step 4.1. Quantification | The occurrence of each move and step in each text in the corpus is quantified. |
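A percentage of agreement among three raters, as used in Stage 3, can be sketched as follows. The chapter does not state how the three reviewers' judgements were combined, so this sketch assumes the mean of the pairwise agreement rates. The rater names and the labels assigned to the five text units are hypothetical illustration data, not results from the study.

```python
from itertools import combinations


def pairwise_agreement(labels_a, labels_b):
    """Percentage of text units to which two raters assigned the same label."""
    if len(labels_a) != len(labels_b):
        raise ValueError("both raters must label the same text units")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100 * matches / len(labels_a)


def mean_pairwise_agreement(labels_by_rater):
    """Average the agreement over every pair of raters (an assumed procedure)."""
    pairs = list(combinations(labels_by_rater.values(), 2))
    return sum(pairwise_agreement(a, b) for a, b in pairs) / len(pairs)


# Hypothetical move labels given by three raters to the same five text units.
ratings = {
    "rater_1": ["SE", "CON", "PRE", "STP", "PROP"],
    "rater_2": ["SE", "CON", "PRE", "STP", "CL"],
    "rater_3": ["SE", "PRE", "PRE", "STP", "PROP"],
}

agreement = mean_pairwise_agreement(ratings)
print(round(agreement, 1))
```

Simple percentage agreement does not correct for agreement expected by chance; a chance-corrected coefficient could be substituted in `pairwise_agreement` without changing the rest of the sketch.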
3. Results and discussion
3.1 Qualitative description
As a result of the procedures corresponding to Stages 1, 2 and 3, it was established that the Disciplinary Text genre is characterised by three macro-moves and eight moves, which are detailed in Table 4.

Table 4. Macro-moves and Moves in the Disciplinary Text
| Moves | Communicative Purpose | Possible structure |
|---|---|---|
| Macro-move 1. Preamble (PREA) | To present the work, providing information that facilitates and guides its reading | |
| 1.1. Schematic exposition of contents (SE) | To show the contents of the work and the manner in which it is organised. | Index, Thematic Index/Contents |
| 1.2. Contextualisation (CON) | To situate the work, justifying its creation and showing the author’s point of view regarding the topic in question. | Prologue/Preface, Acknowledgment, Translators’ comments |
| 1.3. Presentation (PRE) | To delimit the work’s focus and describe its organisation. | Introduction |
| Macro-move 2. Theoretical Proposal (TP) | To develop Problems to be approached | |
| 2.1. Establishing theoretical problems (STP) | To identify the problems to be faced, emphasising their relevance. | Introductory Chapter |
| 2.2. Author’s proposal (PROP) | To present the author’s proposals about the topic in question, comparing it with proposals from other authors. | Thematic Chapters |
| 2.3. Closure (CL) | To synthesise the proposals presented by the author, emphasising their value. | Conclusions |
| Macro-move 3. Complements & Orientation (C & O) | To provide additional data and necessary guidelines to understand the work | |
| 3.1. Guidelines (GUID) | To illustrate the manner in which the work’s contents are presented. | Index, Thematics/Contents, Statistical Content Index |
| 3.2. Specifications (SPE) | To support the comprehension of terms and present detailed evidence. | Appendices, Annexes, Glossary, Bibliographic References |
Table 4 shows that the Disciplinary Text is formed by the macro-moves Preamble, Theoretical Proposal and Complements and Orientation. The first aims to familiarise the reader with the topic to be analysed or discussed. It is constituted by three moves: Schematic Exposition of Contents, Contextualisation and Presentation. These moves help the reader to form an idea about the text she/he will be reading and about the author’s view of the topic to be discussed; likewise, they facilitate the search for specific information, should this be necessary. The second macro-move, whose communicative purpose is definitely persuasive, presents the author’s view regarding a theoretical or ideological problem. It comprises three moves: Establishing Theoretical Problems, Author’s Proposal and Closure. By means of these moves, the author develops her/his proposal or approach, with the support of theoretical bases or preliminary studies; she/he also places her/his proposal in relation to those of other authors, emphasising the value of her/his own research and the weaknesses of others’ research. Finally, the macro-move Complements and Orientation is designed to reinforce the comprehension of proposals made by the author, based not only on information referring to specific terms or bibliographic sources, but also on the organisation of the text’s contents. This macro-move is made up of two central moves: Guidelines and Specifications.
Based on the procedures corresponding to Stages 1, 2 and 3 (Table 3), some interesting findings can be reported. First, the moves constituting a macro-move do not always occur in sequence in the texts; on certain occasions, they occur in a different order from that presented in the criteria table. In addition, not every move is present in every text under study. Another interesting finding is that there is sometimes a direct relation between the discursive units and the structural units or text sections.
Such a relation, although not very frequent, may be appreciated between the move Schematic Exposition of Contents and the section Thematic Index, since thematic indexes are not common in the books studied. In fact, the move Contextualisation occasionally appears as forming part of the sections Prologue or Preface, and on other occasions, as part of the Introduction. Finally, it becomes evident that, in the case of texts that compile chapters belonging to different authors, each chapter is configured as an individual unit, through the presence of Macro-move 2, Theoretical Proposal.
In addition to the global view of the Disciplinary Text, in our study we also investigate the genre’s more specific functional constituents. In Tables 5, 6 and 7 we illustrate the relation between each move or macro-move and their respective rhetorical steps. As shown in Table 5, the first macro-move, Preamble, is formed by three moves: Schematic Exposition of Contents, Contextualisation and Presentation, each of which contributes to familiarising the reader with the book’s contents.

Table 5. Macro-move 1. Rhetorical Moves and Steps
Macro-move 1. Preamble (PREA). To present the work, providing information in order to facilitate and guide the reading
| Move | Step | Example |
|---|---|---|
| Move 1.1. Schematic Exposition of Contents (SE) | Step 1.1.1. Schematic presentation of the work’s contents (SP) | “Capítulo I: La media aritmética y el error medio 1. Consideraciones generales 4 2. Media de los errores 4 3. Error Medio 5”. (AC-IC-td355) |
| Move 1.2. Contextualisation (CON) | Step 1.2.1. Introducing the thematic area (IT) | “En este libro se abordan ambas dimensiones: los efectos de las políticas sociales sobre el resto de la economía y el funcionamiento interno de esas políticas”. (AC-TS-td316) |
| | Step 1.2.2. Justifying the work (JW) | “Lo considero una importante contribución al pensamiento sobre esta materia”. (AC-TS-td286) |
| | Step 1.2.3. Explaining the author’s view on the themes in question (AV) | “El propósito de Bosier es llegar a una distribución más equitativa de los ingresos regionales.... Su libro trata, en consecuencia, del balance de poder...” (AC-TS-td286) |
| | Step 1.2.4. Acknowledging (ACK) | “Agradezco el apoyo prestado por el equipo de profesores de la Escuela de Psicología de la Pontificia Universidad Católica de Valparaíso quienes, además de ser mi equipo de trabajo, me han enseñado en distintos momentos de nuestra historia como Escuela la importancia de trabajar en equipo”. (AC-PSI-td84) |
| Move 1.3. Presentation (PRE) | Step 1.3.1. Establishing the scope (SS) | “El análisis de las políticas sociales que se realiza en este libro está referido casi exclusivamente a sus aspectos económico-financieros”. (AC-TS-td316) |
| | Step 1.3.2. Presenting the work’s organisation (ORG) | “El libro se ha organizado en cinco capítulos. El primero ofrece una breve revisión histórica de las políticas sociales”. (AC-TS-td286) |

Thus, and as we can observe in Table 5, the purpose of the first move, Schematic Exposition of Contents, is to indicate the location of diverse thematic contents in the text, and it is performed by means of only one rhetorical step, Schematic Presentation of the Work’s Contents. This step is directly associated with the Index or Contents section. The second move, Contextualisation, carries out its purpose by means of four rhetorical steps: Introducing the Thematic Area, Justifying the Work, Explaining the Author’s View on the Themes in Question and Acknowledging. Unlike the first move, this one does not familiarise readers with the text in terms of its organisational structure, but rather in terms of its background. In this sense, the first step, Introducing the Thematic Area, aims to describe the disciplinary areas and situate the text’s thematic scheme within them. This implies providing a scientific-historical contextualisation of the discipline and, of course, of the topic in question. The second step, Justifying the Work, gives diverse reasons for approaching these topics. The third step, Explaining the Author’s View on the Themes in Question, describes the author’s view, by presenting her/his line of investigation or the epistemological paradigm to which she/he adheres. Finally, the step Acknowledging occasionally appears, emphasising the participation of different actors involved in generating the book. This step helps to fulfil the move’s purpose, since it enables the reader to become acquainted with members of the scientific community to which the author belongs.
The third and last move, Presentation, could be associated with what Bhatia (1997) calls Introduction to the Work, and fulfils its purpose in two steps, Establishing the Scope and Presenting the Work’s Organisation. The first of these steps is closely related to the second move, Contextualisation, since it delimits the study’s focus, thus restricting its scope. The second step, Presenting the Work’s Organisation, describes the thematic contents that will be discussed in each of the book sections, whether in terms of parts or chapters. From the description of moves and steps constituting the Preamble macro-move, we may appreciate the way in which a favourable scenario is set, ensuring an adequate reading of the book. This also entails preparing the way to develop the author’s theoretical proposal.
Table 6 illustrates the rhetorical moves and steps constituting the macro-move Theoretical Proposal. The macro-purpose of this macro-move is not only to present the author’s proposal, but also to point out its value above that of others. The particular persuasive character of this macro-move requires a series of moves that realise such a presentation effectively. The first move, Establishing the Theoretical Problem, the purpose of which is to describe the problem to be approached in detail, fulfils this purpose in two rhetorical steps, Describing the Problem and Revising the Main Respective Theoretical Views. The first of these presents the theoretical and ideological problem and points out the relevance of approaching this problem. The second step gives an overview of the main theses and proposals. The second move, Author’s Proposal, has the purpose of describing in detail the author’s proposal regarding the problem being approached and comparing it with those proposed by others. To achieve this purpose, it consists of two rhetorical steps, Evaluating and Presenting Author’s Proposal. The first of these steps evaluates the quality of the above theses and propositions and occasionally points out their weaknesses. The second step details the author’s theoretical or ideological proposal for the work. Finally, the third move, Closure, aims to sum up the author’s exposition, emphasising its value. It fulfils this purpose in two rhetorical steps, Summing up the Main Propositions and Emphasising Own Value. The first of these synthesises the basic principles of the work, centring on those stated by the author. The second one contrasts the author’s basic principles with those of others, formerly evaluated, pointing out the value of her/his own.

Table 6. Macro-move 2. Rhetorical Moves and Steps
Macro-move 2. Theoretical proposal (TP). To develop the problem to be discussed
| Move | Step | Example |
|---|---|---|
| Move 2.1. Establishing the Theoretical Problem (STP) | Step 2.1.1. Describing the Problem (DP) | “Las políticas sociales se han constituido en una parte importante de la acción del Estado en el ámbito socioeconómico, al menos durante los últimos cuarenta años”. (AC-TS-td316) |
| | Step 2.1.2. Revising the main respective theoretical views (RT) | “En las palabras de Revel-Mouroz: ‘El régimen descentralizado podría usar la descentralización como una técnica de relegitimación...” (AC-TS-td316) |
| Move 2.2. Author’s Proposal (PROP) | Step 2.2.1. Evaluating (EV) | “Con mucha frecuencia se reitera que la planificación – en cualquiera de sus varias dimensiones – se encuentra en un profundo estado de crisis. Si bien esta afirmación es indudablemente cierta, poco es lo que aporta en términos de descubrir la verdadera naturaleza de la crisis, algo que sin duda ayudaría a superarla”. (AC-TS-td286) |
| | Step 2.2.2. Presenting Author’s proposal (PA) | “La planificación debe ser entendida también como una saga fundacional... la planificación no es otra cosa que la organización de la sociedad en el tiempo...” (AC-TS-td286) |
| Move 2.3. Closure (CL) | Step 2.3.1. Summing up the main propositions (MP) | “Se ha analizado críticamente la reforma previsional de 1980 y sus consecuencias. Entre los principales aspectos analizados cabría mencionar los siguientes...” (AC-TS-td316) |
| | Step 2.3.2. Emphasising own value (VAL) | “En síntesis, la reforma descrita deja sin resolver varios de los problemas que tenía nuestra seguridad social y crea otros nuevos. Estos nuevos problemas eran claramente predecibles, dadas las características del sistema que se ha impuesto”. (AC-TS-td316) |
Table 7. Macro-move 3. Rhetorical Moves and Steps
Macro-move 3. Complements and Orientation (C & O). To supply additional information and guidelines needed in order to comprehend the work
| Move | Step | Example |
|---|---|---|
| Move 3.1. Guidelines (GUID) | Step 3.1.1. Schematic presentation of the work’s contents (SP) | “Capítulo I : La media aritmética y el error medio 1. Consideraciones generales 4 2. Media de los errores 4 3. Error Medio 5.” (AC-IC-td355) |
| | Step 3.1.2. Listing text contents in alphabetical order (LIST) | “Refuerzo 124s 139 252-254 281 286 Parcial 295 Regiones, estratos de la personalidad 176s Perceptivo motoras 176 Registro de huellas e impresiones 120s Regresión 198 Relación padre-hijo 171” (AC-PSI-td108) |
| Move 3.2. Specifications (SPE) | Step 3.2.1. Giving specific information (SPI) | Apéndice VIII: Medidas antiguas de España y América. Medidas de longitud: Navarra: 1 Vara = 0,785 m; 1 Legua = 5,49 Km. Aragón: 1 Vara = 0,772 m (Teruel 0,768); 1 Legua = 5,573 Km (Huesca 6,176 Km). Medidas de superficie: Navarra: 1 Robada = 1458 varas cuadradas del país = 8,984 a. Aragón: 1 Fanegra: Huesca (1200 varas cuadradas del país) 7,15 a; Teruel (1600 varas cuadradas castellanas) 11,118 a. 1 Cuartal (1/4 fanegra): Zaragoza 2,384 a. (AC-IC-td355) |
| | Step 3.2.2. Defining terms (DEF) | “Canal: Sistema físico a través del cual se puede transmitir información. Se caracteriza matemáticamente por un alfabeto de entrada y un alfabeto de salida y una función que hace corresponder a cada signo de entrada una función de probabilidad de signos de salida. La capacidad de un canal es la cantidad máxima de información expresada en bits por segundo que puede transmitirse por dicho canal sin perturbaciones. Ver Ruido”. (AC-IC-td351) |
| | Step 3.2.3. Specifying the mentioned sources (REF) | “Fuentes, Fernando, Análisis técnico para proyectos de inversión. ICAP, San José, 1988. Guadagni, A.A., “El problema de la optimización del proyecto de inversión: Consideración de sus diversas variantes”. En BID-Odeplan, Programa de Adiestramiento, preparación y evaluación de proyectos. Vol. V. Santiago, 1976.” (AC-PSI-td122) |
In Table 7, we present the rhetorical moves and steps that constitute the macro-move Complements and Orientation. This third and last macro-move consists of two moves, Guidelines and Specifications, each of which functions to reinforce comprehension of the author’s propositions. The first move aims to show how the contents of the book are presented, and it fulfils this purpose by means of two rhetorical steps, Schematic Presentation of the Work’s Contents and Listing Text Contents in Alphabetical Order. The former indicates the location of diverse thematic contents in the text, in order of appearance. The latter indicates the location of these contents in alphabetical order. The second move aims to facilitate understanding of particular terms and to present detailed evidence. Its objective is realised in three steps, Giving Specific Information, Defining Terms and Specifying the Mentioned Sources. The first step presents additional detailed information that supports the author’s proposal. The second provides definitions of specific terms related to the area under study. Finally, the third step provides detailed information on bibliographic sources mentioned in the body of the work.
The above description represents a view of the Disciplinary Text configuration, not only in terms of the three macro-moves and the eight moves forming the genre, but also in terms of such moves and each of their constituent steps. An important point to be made here is that, as is the case with the moves, the steps that form a move do not always occur sequentially; that is, they occasionally occur in an order that is different from the one presented in the framework of analysis. An example of this is the set of steps that comprise the move Contextualisation. Likewise, on occasions, not all the steps identified as parts of a move are present in the texts under study.
Finally, the step Schematic Presentation of the Work’s Contents is a special case that appears in some of the texts as part of the move Schematic Exposition of Contents, and in others as part of move Guidelines. This step was included in both moves for this reason.
3.2 Quantitative description
The above procedures allowed us to identify a set of functional constituents of the Disciplinary Text genre. In order to complete the description, the occurrence of each of these units was quantified in the total number of texts making up the corpus. This procedure corresponded to Stage 4 of the analysis (see Table 3), and enabled us to distinguish between the functional elements of the genre to be identified as obligatory and those to be identified as optional, according to their frequency of occurrence. Likewise, this procedure allowed us to explore variation in the occurrence of such elements, depending on the discipline and knowledge domain (SS&H and BS&E). In order to arrive at an adequate comparison, all of the measured totals were standardised as percentages. In Figure 1, we present the outcome of this procedure in terms of macro-moves and moves.

[Figure 1: bar chart of average occurrence (0% to 100%) for each move: 1.1 SE, 1.2 CON, 1.3 PRE (Macro-move 1); 2.1 STP, 2.2 PROP, 2.3 CL (Macro-move 2); 3.1 GUID, 3.2 SPE (Macro-move 3).]
Figure 1. Average move occurrence

The quantification of occurrences indicated that all of the macro-moves and most of the moves identified are present in most texts of the corpus. The average rate of occurrence of the identified moves is 83%. As can be observed in Figure 1, notable cases are the moves Contextualisation (CON) (81%), Establishing the Theoretical Problem (STP) (99%), Author’s Proposal (PROP) (90%), Closure (CL) (82%) and Specifications (SPE) (96%), which, on average, occur in 90% of the texts. At this point, it is important to recall that the move Schematic Exposition of Contents (SE) is realised by means of only one rhetorical step, Schematic Presentation of the Work’s Contents (SP), which, on occasion, appears as part of the move Guidelines (GUID), in the third macro-move. This means that the occurrences of this step are distributed across two moves, so the percentage recorded for each move taken separately is lower.
These results reveal the high occurrence of the moves that form macro-move two, Theoretical Proposal, which averages 90%. This high occurrence is relevant in that these moves mainly realise the macro-purpose of the Disciplinary Text genre. This is because, as mentioned, these moves are employed not only to present the author’s proposal with respect to the problem approached, but also to emphasise the relevance of approaching the problem in question and, of course, to point out the value of the author’s proposal over those of others.
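The two averages quoted for Figure 1 can be checked against the individual percentages reported in the text. The sketch below is only a worked arithmetic check; the variable names are ours, and the values are limited to the five moves whose percentages the text states explicitly.

```python
# Occurrence percentages reported in the text for five of the eight moves.
reported = {"CON": 81, "STP": 99, "PROP": 90, "CL": 82, "SPE": 96}


def average(values):
    """Arithmetic mean of a sequence of percentages."""
    values = list(values)
    return sum(values) / len(values)


# Mean over the five moves singled out in the discussion of Figure 1.
highlighted_mean = average(reported.values())

# Mean over the three moves that make up macro-move 2, Theoretical Proposal.
macro_move_2_mean = average(reported[m] for m in ("STP", "PROP", "CL"))

print(round(highlighted_mean, 1), round(macro_move_2_mean, 1))  # prints: 89.6 90.3
```

Both results round to the 90% stated in the text.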
[Figure 2: bar chart of the occurrence (0% to 100%) of each move (SE, CON, PRE, STP, PROP, CL, GUID, SPE) in the three disciplines (legend: SW, PSY, CE).]
Figure 2. Occurrence of moves by discipline
By their high occurrence, these moves prove to be central constituents of the genre and, at the same time, support our idea about the persuasive nature of the Disciplinary Text. As an instrument of persuasion, the Disciplinary Text is produced by an expert member of a disciplinary community for another expert member who is not only able to understand the author’s proposal, but also to adhere to or reject it. Although Figure 1 enables us to establish certain general parameters, it becomes necessary to investigate possible variation imposed by the discipline. In Figure 2 we present the results of the quantification, distinguishing the three disciplines in question and, in this sense, also the knowledge domain.
The results shown in Figure 2 enable us to appreciate important differences between the three disciplines in which this discourse genre was studied. The most relevant case corresponds to the move Contextualisation (CON). In Social Work (SW) and Psychology (PSY), this move reaches 90% and 97% respectively, whilst in Construction Engineering (CE) it occurs in only 58% of the texts. An important point related to disciplinary variation is associated with the way in which the particular epistemic nature of a discipline determines discursive interaction within the community. Therefore, we adhere to Bhatia (2004), since we believe that a genre may vary according to the discipline within which it circulates, without losing its essential features or central functional constituents. This phenomenon is apparent in the differences we have noted above between the Disciplinary Text that circulates in SW and PSY and the one circulating in CE. Applying the criterion used by Kanoksilapatham (2007), according to which an occurrence of 60% is the cut-off that determines whether a constituent of the genre is obligatory or optional, the move Contextualisation should be considered optional in the Disciplinary Text that circulates within CE. The remaining moves (except for the move Schematic Exposition of Contents (SE)) occur in at least 70% of the total number of texts studied in all disciplines. Under this criterion, these may be considered obligatory constituent elements of the Disciplinary Text genre in all three disciplines.
If we consider that CE belongs to the BS&E, while SW and PSY belong to the SS&H, this finding illustrates an important difference between the Disciplinary Texts circulating in the SS&H and those circulating in the BS&E. This illustrates the different ways in which disciplinary knowledge is constructed and transmitted in these diverse knowledge domains. On the one hand, it is evident that in SS&H, it appears necessary not only to situate the work with a description of the themes in question, but also to justify its creation and to show the author’s view of the problem being approached. On the other hand, in BS&E, this move would not be an obligatory element of the genre. This suggests that, according to expert members of the disciplinary communities associated with BS&E, establishing an acquaintance between the reader and the processes within the discipline does not constitute a purpose of the genre. Quantitative analysis of the numerical occurrence of macro-moves and moves enabled us to determine the central moves of the Disciplinary Text.
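The obligatory/optional distinction applied above to the move Contextualisation can be sketched as a simple threshold test. This is only an illustration: the function name `classify` is ours, and treating an occurrence of exactly 60% as obligatory is an assumption, since the text does not say how the boundary case is handled.

```python
CUTOFF = 60  # percentage cut-off the chapter attributes to Kanoksilapatham (2007)


def classify(occurrence_pct, cutoff=CUTOFF):
    """Label a constituent by its rate of occurrence.

    Assumption: a rate equal to the cut-off counts as obligatory.
    """
    return "obligatory" if occurrence_pct >= cutoff else "optional"


# Occurrence of the move Contextualisation (CON) per discipline, as reported.
con_by_discipline = {"SW": 90, "PSY": 97, "CE": 58}

con_status = {d: classify(pct) for d, pct in con_by_discipline.items()}
print(con_status)
# prints: {'SW': 'obligatory', 'PSY': 'obligatory', 'CE': 'optional'}
```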
At the same time, this analysis made it possible to prove the existence of important differences between the characteristics of the Disciplinary Text genre which circulates within SS&H and the one that circulates within BS&E. Nevertheless, the more specific constituents still need to be investigated. Thus, in Figure 3, we present the number of occurrences of the rhetorical steps within each move and macro-move. By completing the qualitative description previously reported (Tables 5, 6, 7), we observed the way in which diverse moves constituting different macro-moves were realised through different steps. We specifically observed that, in some cases, not all the steps identified as part of a move were present in a text. Through the quantification procedure, whose results are presented in Figure 3, the rate of occurrence of each step identified as constituent of a move was established, and we were thus able to determine which steps can be considered optional and which can be considered obligatory within the genre under study.
[Figure 3: bar chart of the occurrence (0% to 100%) of each rhetorical step in the three disciplines (legend: TS, PSI, IC). Steps are grouped by move and macro-move: 1. PREA: 1.1 SE (SE), 1.2 CON (IT, JW, AV, ACK), 1.3 PRE (SS, ORG); 2. TP: 2.1 STP (DP, RT), 2.2 PROP (EV, PA), 2.3 CL (MP, VAL); 3. C&O: 3.1 GUID (SP, LIST), 3.2 SPE (SPI, DEF, REF).]
Figure 3. Occurrence of moves and steps by discipline
In Figure 3, we can see that the eighteen steps identified as constituents of the eight moves are distributed rather heterogeneously, with individual occurrence ranging from a maximum of 100% to a minimum of 13%. An integrated view – that is, observing the three disciplines at the same time – reveals that the most frequently occurring steps are Describing the Problem (DP), Presenting Author’s Proposal (PA) and Specifying the Mentioned Sources (REF), with an average occurrence of 97%, 92% and 97%, respectively. As for the apparently scarce occurrence of the step Schematic Presentation of Work’s Contents (SP), we must recall that this step sometimes appears as part of the move Schematic Exposition of Contents (SE) in the first macro-move and, in other instances, as part of the move Guidelines (GUID) in the third macro-move. If we add these occurrences together, we obtain a percentage of 100%, and therefore this must also be considered to be the step that occurs most frequently. Applying the same criterion used earlier to determine which moves should be considered optional and which obligatory (Kanoksilapatham 2007), we can draw the same distinction in terms of rhetorical steps. To do so, we calculated an average rate of occurrence for each step in each of the three disciplines. As a result, the only rhetorical steps of the Disciplinary Text genre to be considered optional are Giving Specific Information (SPI) (59%) and Defining Terms (DEF) (48%).
We assume that these results stem from the fact that the Disciplinary Text is a genre produced by experts, for experts. Thus, it would not be necessary to provide additional detailed information explaining the author’s propositions nor to deliver definitions of specific terms within the subject. However, according to the data contained in Figure 3, we believe it necessary to identify disciplinary differences in detail. Counting the individual occurrences of the steps previously identified as optional reveals important differences between the disciplines. Giving Specific Information occurs at a rate of 22% in SW, 28% in PSY, and 90% in CE. Defining Terms occurs at a rate of 22% in SW, 28% in PSY, and 95% in CE. Such differences in occurrence suggest that we consider such steps as optional only in SW and PSY, while these are obligatory in CE. Once again, these numbers illustrate disciplinary differences regarding the manner in which knowledge is negotiated and, likewise, support the assertion that these differences are imposed by the epistemological nature of the areas to which each discipline belongs. In this same context, the configuration of the move Contextualisation is also relevant. Results of the rhetorical step quantification reveal that in SS&H, except for the step Acknowledging (ACK), all constituent steps are obligatory, whereas the only obligatory step in BS&E is Introducing the Thematic Area (IT). In fact, it is important to note that the step Justifying the Work (JW) shows the lowest rate of occurrence (40%). Something similar occurs with the rhetorical steps forming the moves previously identified as central within the Disciplinary Text. This is the case with Establishing the Theoretical Problem (STP), Author’s Proposal (PROP) and Closure (CL). The first of these is realised in the SS&H by means of the steps Describing the Problem (DP) and Revising the Main Respective Theoretical Views (RT), which show an average rate of occurrence of over 90%. 
In the case of BS&E, on the other hand, this move is realised mainly through the step Describing the Problem (DP), with occasional optional use of the step Revising the Main Respective Theoretical Views (RT), which occurs at a rate of 48%. Likewise, the move Author’s Proposal (PROP) is realised in SS&H by means of the steps Evaluating (EV) and Presenting Author’s Proposal (PA), which show an average rate of occurrence near 90%. In BS&E, this same move is realised mainly through the step Presenting Author’s Proposal (PA), with occasional optional use of the step Evaluating (EV), which occurs at a rate of 45%. Finally, the move Closure (CL) is realised in SS&H through the steps Summing up the Main Propositions (MP) and Emphasising Own Value (VAL). In BS&E, this move is realised mainly through the step Summing up the Main Propositions (MP).
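The obligatory/optional distinction applied above can be illustrated with a minimal Python sketch. It uses only the per-discipline rates reported in the text for Giving Specific Information (SPI) and Defining Terms (DEF) and the 60% cut-off attributed to Kanoksilapatham (2007). The names `OCCURRENCE`, `CUTOFF`, `classify` and `classify_by_discipline` are hypothetical, and treating a rate of exactly 60% as obligatory is an assumption, since the chapter does not state how the boundary case is handled.

```python
# Occurrence rates (%) per discipline, as reported in the text.
OCCURRENCE = {
    "SPI": {"SW": 22, "PSY": 28, "CE": 90},  # Giving Specific Information
    "DEF": {"SW": 22, "PSY": 28, "CE": 95},  # Defining Terms
}

# Cut-off attributed to Kanoksilapatham (2007) in the chapter.
CUTOFF = 60


def classify(rate, cutoff=CUTOFF):
    """Label a constituent by its occurrence rate.

    Assumption: a rate equal to the cut-off counts as obligatory.
    """
    return "obligatory" if rate >= cutoff else "optional"


def classify_by_discipline(occurrence):
    """Return {step: {discipline: label}} for every reported rate."""
    return {
        step: {disc: classify(rate) for disc, rate in by_disc.items()}
        for step, by_disc in occurrence.items()
    }


if __name__ == "__main__":
    for step, labels in classify_by_discipline(OCCURRENCE).items():
        print(step, labels)
```

Under these assumptions, both steps come out as optional in SW and PSY and obligatory in CE, which matches the reading given in the text.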
The above empirical findings help us not only to understand how the Disciplinary Text genre is constituted but also to notice the way it varies, depending on the discipline within which it circulates and the knowledge domain to which that discipline belongs. Hence, since the Disciplinary Text is an eminently persuasive genre, it presents different ways of arguing or negotiating knowledge. In SW and PSY, the author’s theoretical proposal requires prior validation from a detailed contextualisation, which may even include the work’s justification, whereas in CE, this validation is focused on introducing the topic to be treated. Also, in SW and PSY, the treatment of the problem consists primarily of comparing it with, and evaluating it in relation to, other perspectives, whilst in CE, it consists mainly of the description of the problem and the presentation of the author’s proposal. Finally, in SW and PSY, it is not necessary to supply additional data or specialised terminology in order to reinforce the author’s proposal, whereas in CE, this is highly necessary. Quantification of the rhetorical steps allows us to observe considerable differences regarding discourse organisation. On the one hand, the texts from SS&H show a mode of persuasion based largely on reference to proposals from other authors, implying a certain degree of intertextuality. On the other hand, in texts from BS&E, persuasion is mainly based on the author’s proposal and its empirical support. This implies a great deal of recursivity and circularity in texts from SS&H, while in BS&E texts, the procedure is much more direct and, in this sense, less recursive.
Conclusions
Through this study, we sought to fill a gap concerning the need for thorough descriptions of extensive genres as a whole. We also intended to offer some answers regarding the various ways in which disciplinary knowledge is discursively constructed and transmitted, depending on the discipline and knowledge domain (Social Sciences & Humanities and Basic Sciences & Engineering). We are convinced that providing this type of data can help to improve and facilitate the instruction of university students. Based on the PUCV-2006 Corpus of Spanish, it was possible to note the relevance of the Disciplinary Text in the academic instruction of students in three undergraduate programmes at Pontificia Universidad Católica de Valparaíso, Chile. Likewise, through a review of the literature, we noticed the scarcity of works describing this academic genre. Therefore, and in order to contribute to academic instruction from a psychodiscursive perspective, the objective of the research we
have presented in this chapter has been to describe the rhetorical organisation of the Disciplinary Text. These analyses allowed us not only to describe the rhetorical organisation of the Disciplinary Text, as formed by three macro-moves and eighteen rhetorical steps, but also to determine which functional constituents could be considered optional and which obligatory by examining the degree of occurrence of each constituent. Furthermore, we confirmed the genre’s persuasive nature, which is mainly reflected in the second macro-move, Theoretical Proposal, and in the moves which comprise it. According to our analysis, this macro-move, whose macro-purpose is not only to present the author’s proposal but also to point out its value over others, appears as the central macro-move of the genre under study. We also detected variation in the realisation of moves through their rhetorical steps. This variation depends on the different disciplines within which this genre circulates (Psychology, Social Work and Construction Engineering) as well as the knowledge domain to which these disciplines belong (in this case, SS&H and BS&E). This variation, mainly manifest in the move Contextualisation and in all the moves that constitute the macro-move Theoretical Proposal, shows the different persuasive methods this genre adopts, depending on the disciplinary community within which it circulates and the knowledge domain to which this discipline belongs. In short, and according to the results discussed above, we may claim that the Disciplinary Text circulating in SS&H persuades by means of comparison and evaluation, which implies a higher degree of negotiation. On the other hand, the Disciplinary Text circulating in BS&E realises its communicative purpose basically through the author’s proposals and their theoretical-empirical support, which implies a lesser degree of negotiation.
Nevertheless, we believe that these differences do not turn the genre into another one; on the contrary, they illustrate the integrity of discourse genres and, at the same time, their disciplinary flexibility. This integrity is due to the presence of a set of shared central functional elements that contribute to the achievement of a common macro-purpose. In the case of the Disciplinary Text genre, these elements comprise all the macro-moves and almost all the moves identified in the qualitative analysis, except for the move Contextualisation. The variation becomes manifest, as may be observed in Figure 3, in the rhetorical steps that comprise these moves. These findings show the different ways of negotiating disciplinary knowledge across disciplines. One pedagogical implication is the need to develop psychodiscursive strategies framed in the particular discourse practices of each disciplinary community. This study may support the design of pedagogic instruments oriented to help non-expert members of given disciplinary communities to
develop the necessary skills to efficiently interact with the genres they must face during their process of disciplinary instruction. Another implication involves paying close attention to the gradient of diverse disciplinary genres when distributing study material to students in the classroom. Our research enables us to assert that, due to the complexity of the Disciplinary Text, and because of its persuasive nature, non-expert members of an academic community should not have to face this genre. These non-expert members, or students in the first years of their university undergraduate programmes, should, in our opinion, start on their path to disciplinary integration by means of more didactic genres, such as the Textbook (see Parodi, Chapters 8 and 9).
chapter 11
Academic discourse comprehension in Spanish and English
Accessing disciplinary domains
Romualdo Ibáñez
The psychodiscursive skills that allow university students to comprehend academic texts are considered essential. Moreover, in countries where English is not the mother tongue, university students must also develop the ability to comprehend academic texts written in that language. As a result, reading comprehension of academic texts written in English and in Spanish has become a focus of investigation for Chilean researchers, raising questions such as: how much do university students comprehend when facing an academic text? Is there any relation between levels of comprehension and the knowledge students have about the genre they have to read? Can reading skills be transferred to a process developed in a second language? Approaching some of these questions, in this chapter we examine the comprehension process executed by undergraduate university students when facing disciplinary texts written in English and in Spanish. The students belonged to an Industrial Chemistry programme. We focused on the level of comprehension these students achieved when facing an academic text written in English and also on the way that level of comprehension was predicted by the participants’ English proficiency level, their degree of insertion into the disciplinary community, and their reading skills in Spanish.
Introduction
Most chapters in this book identify and describe different discourse genres from the PUCV-2006 Corpus of Spanish. In this chapter we move toward possible applications of these studies in the field of written discourse comprehension. We have particular interest in academic text comprehension, not only in Spanish as a mother tongue, but also in English as an L2. This interest stems from the central role that written text comprehension plays in academic settings. In our opinion, efficient interaction with academic
texts constitutes a fundamental route to accessing a discipline, not only in terms of specific knowledge acquisition, but also with regard to the integration of students into their disciplinary groups. This idea emphasises the need for adequate psychodiscursive skills in order to interact successfully in a disciplinary community. Unfortunately, research carried out in Latin America and particularly in Chile (Parodi 2007a; Peronard 2007b) reveals serious comprehension deficiencies among university students when reading academic texts written in Spanish. These findings allow us to infer worse results for academic text comprehension in English. If we consider the amount of academic material written in English that these students must read during their four-year undergraduate program, this handicap may become a major obstacle in their disciplinary access. Thus, in pursuit of solutions to the aforementioned needs, we believe it critical to study academic text comprehension, not only in terms of levels achieved in both Spanish and English, but also in terms of variables that may influence those levels, especially when the process is carried out in an L2. This research focuses on academic text comprehension in Spanish and in English by students whose mother tongue is Spanish; special attention is paid to some variables that may determine the level of comprehension achieved when the process is executed in English. We have two aims: first, to determine the level of academic text comprehension in both Spanish and English; and second, to describe the way in which the level of academic text comprehension in English is related to the students' level of English proficiency, their degree of disciplinary expertise, and their reading comprehension ability in their mother tongue.
1. An integrated theoretical perspective
We believe that through the proper integration and application of some advances made in certain areas of research, it is possible to deal with the problems and needs of university students when it comes to academic text comprehension, not only in Spanish as a mother tongue, but also in English as an L2. For example, in the areas of Discourse Analysis and English for Academic Purposes (EAP), there are researchers who describe texts pertaining to different registers and specialised genres (Martin & Rose 2007; Bhatia 1993, 2004). Similarly, researchers from the fields of psycholinguistics (Parodi 2007a) and discourse psychology (McNamara & Kintsch 1996; Kintsch 1998, 2002) have not only focused on the cognitive processes involved in comprehending specialised texts, but also on the processes involved in learning from them. As a cognitive process, academic text comprehension, either in Spanish as a mother tongue or in English as an L2, demonstrates particular characteristics that
differentiate it from non-specialised text comprehension. That is, even though it involves skills associated with any process of written text comprehension, such as word decoding, inference generation, and strategic use of previous knowledge, it also entails knowledge and skills regarding a particular discourse community (Swales 1990, 2004; Bhatia 1993, 2004). Besides, in the case of academic texts written in an L2, proficiency in the L2 plays a fundamental role (Alderson 1984, 2000). For these reasons, this study integrates research in the areas of written discourse comprehension (van Dijk & Kintsch 1983; Graesser, Singer & Trabasso 1994; Parodi 2007a), L2 reading (Alderson 1984, 2000; Koda 2005), academic discourse (Parodi 2007b) and genre theory (Martin 1992; Swales 1990; Bhatia 1993, 2004). We conceive written discourse comprehension as a highly complex goal-oriented cognitive process, consisting of a series of psychodiscursive sub-processes that are supported in turn by a variety of lower level (attention, perception and memory) and higher level (decision-making, monitoring, reflection, among others) cognitive processes. The simultaneous interactions among all the processes involved generate a mental representation of the situation and/or process described in the text, based on both the textual information and the previous knowledge of the reader (van Dijk & Kintsch 1983; Kintsch 1988, 1998; Ibáñez 2007a). One fundamental skill for successful text comprehension corresponds to the strategic use of previous knowledge. This, to a large extent, will determine the level of comprehension that a reader can reach (van Dijk & Kintsch 1983). Therefore, academic discourse comprehension will largely depend on the knowledge shared by a certain discourse community that a reader is able to use efficiently. 
This comprises specialised knowledge about the disciplinary field, as well as knowledge and skills concerning the discursive ways in which that disciplinary knowledge is generated and transmitted. That is, knowledge about different features of a genre used in the discourse community, such as communicative purpose, rhetorical organisation, and lexicogrammar; and skills such as strategic use of disciplinary terminology and identification of genre moves. Besides, in L2 reading comprehension, language proficiency involves both the mother tongue and the L2. This idea is based on the assumption that the reader must be able to transfer certain skills and knowledge used for a comprehension process in the mother tongue to a comprehension process in the L2 (Alderson 1984; Carrell & Grabe 2002; Grabe & Stoller 2002). From our perspective, in L2 academic text comprehension, instead of possessing general proficiency in the L2, the reader must have a command of the genres that realise discourse in the L1 and the L2 of a particular discipline. Command of the genres used in a particular discourse community will be directly associated with the degree of disciplinary expertise of an individual. This
idea relies on the assumption that accessing a discipline is a process through which members of a certain discourse community develop skills and specialised knowledge associated with the discipline, which in turn transforms them from novices into experts. This knowledge and these skills, according to Hyland (2004), are developed through appropriate discourse practices distinctive to each community.
2. The study
Based on the aims stated above, three main hypotheses were explored:
H1: There will be a high positive correlation (r > 0.75) between the level achieved for academic text comprehension in English and the students’ level of proficiency in English.
H2: There will be a medium positive correlation (0.50 < r < 0.74) between the level achieved for academic text comprehension in English and the students’ degree of disciplinary expertise.
H3: There will be a high positive correlation (r > 0.75) between the level achieved for academic text comprehension in English and the level achieved for academic text comprehension in Spanish.
In order to describe the relations in these hypotheses, we used the Pearson Correlation Coefficient (Hernández, Fernández & Baptista 2006; Hatch & Lazaraton 1991) and the Goodman/Kruskal Gamma (Conover 1999), depending on the needs and characteristics of the analysis.
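The two coefficients named above, and the bands stated in the hypotheses, can be sketched as follows. This is an illustration under stated assumptions and not the procedure actually used in the study: the function names (`pearson_r`, `goodman_kruskal_gamma`, `band`) are hypothetical, the score lists are invented purely for demonstration, and tied pairs are simply ignored in the Gamma computation, which is the usual definition of that statistic.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(ss_x * ss_y)


def goodman_kruskal_gamma(x, y):
    """Gamma = (C - D) / (C + D), where C and D count concordant and
    discordant pairs; pairs tied on either variable are ignored.
    Raises ZeroDivisionError if every pair is tied."""
    concordant = discordant = 0
    for i in range(len(x)):
        for j in range(i + 1, len(x)):
            sign = (x[i] - x[j]) * (y[i] - y[j])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (concordant + discordant)


def band(r):
    """Map a coefficient onto the bands stated in H1 to H3."""
    if r > 0.75:
        return "high"
    if 0.50 < r < 0.74:
        return "medium"
    return "outside the hypothesised bands"


# Invented scores for six imaginary students (NOT data from the study).
proficiency = [12, 20, 25, 31, 38, 44]
comprehension = [30, 35, 50, 48, 62, 70]

if __name__ == "__main__":
    print(band(pearson_r(proficiency, comprehension)))
    print(goodman_kruskal_gamma(proficiency, comprehension))
```

Note that, as the hypotheses are worded, a coefficient between 0.74 and 0.75 falls into neither band; the sketch reports such a value as lying outside the hypothesised bands rather than guessing at the intended boundary.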
2.1 Method
For the purposes of the study reported herein, we worked with the total number of students in the degree programme of Industrial Chemistry at the Pontificia Universidad Católica de Valparaíso (PUCV), Chile. All of them are native speakers of Spanish. Detailed information about the participants is given in Table 1. In order to collect the necessary data, three tests were used: a test to measure academic text comprehension in Spanish, a test to measure academic text comprehension in English, and a standardised test to measure proficiency in English. The degree of disciplinary expertise was determined based on the number of years students had been part of the programme (Table 1).
Table 1. Number of participating students and their year of incorporation
Year of incorporation | Number of students
2002 | 8
2003 | 9
2004 | 9
2005 | 14
2006 | 14
2007 | 30
Total number of students | 84
Table 2. Types of questions and scores
Type of question | Subtype | Score range | Number assigned to the question in the test | Maximum score
Literal | | 1-3 | 1, 6, 11, 13 | 12
Local inference | Lexical | 1-3-6 | 2, 8 | 12
Local inference | Co-referential | 1-3-6 | 3, 7 | 12
Local inference | Cause-consequence | 1-3-6 | 4, 9 | 12
Global comprehension | General idea | 1-5-15 | 5 | 15
Global comprehension | Inter-paragraph | 1-5-15 | 12 | 15
Global comprehension | Title | 1-5-10 | 10 | 10
Application | | 1-10-20 | 14 | 20
Total score | | | | 108

2.2 Construction and selection of materials
2.2.1 First test: Academic text comprehension in Spanish
In order to measure academic text comprehension in Spanish, we selected a test previously designed for a research project conducted by Parodi (2007a). The underlying theoretical model assumes comprehension as comprising three levels of discourse representation (Surface Code, Textbase and Situation Models) and two cognitive structures (microstructure and macrostructure). Similarly, it assumes that in order to reach these levels of representation and establish the two structures, it is necessary to carry out several psychodiscourse processes, such as the identification of explicit information, the generation of inferences and the integration of information into a coherent meaning. The test is composed of fourteen open-ended questions aimed at measuring the diverse psychodiscourse processes necessary to generate the different levels of representation and the cognitive structures mentioned above. The questions were assigned different scores and divided into four types. Details are shown in Table 2.
Question types were assigned different scores according to the cognitive load implied in answering them, thus creating a hierarchical organisation. Literal questions measure the recognition of explicit information in the text. As the cognitive demand of this process is lower than that associated with other question types, these questions were assigned a lower score and thus have less influence on the total score of the test. As can be observed in Table 2, Local Inference questions were assigned a higher score than Literal ones. This is because, by generating local inferences (lexical, co-referential and cause/consequence), and so linking explicit and non-explicit information from the text, the reader establishes local coherence, or the Microstructure. Then, as they are directed at measuring Macrostructure generation and the Textbase level, Global Comprehension questions were placed at the next hierarchical level. Finally, the Application question measures generation of the Situation Model. This level of discourse representation is associated with proficient comprehension, since it implies the integration of the meaning acquired from the text with previous knowledge. Reaching this level of comprehension, according to Kintsch (1998, 2002), implies learning and, therefore, the capacity to use the information acquired from the text in novel and diverse situations. Thus, the Application question was assigned the highest score. The other component involved in the design and construction of this test concerns the text. It corresponds to a Textbook section whose topic was the measurement of atmospheric pressure, with a particular emphasis on Bourdon gauges.
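The weighting scheme described above can be checked with a short Python sketch that encodes the rows of Table 2 and adds up the maximum scores per question type. The structure `TABLE_2` and the helpers `max_by_type` and `percentage` are hypothetical names introduced only for this illustration; converting a raw score to a percentage of the 108-point total is an assumption about how the percentages reported later were obtained.

```python
# Rows of Table 2: (question type, subtype, question numbers, maximum score).
TABLE_2 = [
    ("Literal", None, [1, 6, 11, 13], 12),
    ("Local inference", "Lexical", [2, 8], 12),
    ("Local inference", "Co-referential", [3, 7], 12),
    ("Local inference", "Cause-consequence", [4, 9], 12),
    ("Global comprehension", "General idea", [5], 15),
    ("Global comprehension", "Inter-paragraph", [12], 15),
    ("Global comprehension", "Title", [10], 10),
    ("Application", None, [14], 20),
]


def max_by_type(rows):
    """Sum the maximum score of every row belonging to each question type."""
    totals = {}
    for question_type, _subtype, _questions, max_score in rows:
        totals[question_type] = totals.get(question_type, 0) + max_score
    return totals


def percentage(raw_score, rows=TABLE_2):
    """Express a raw score as a percentage of the test total (assumed scale)."""
    total = sum(row[3] for row in rows)
    return 100 * raw_score / total


if __name__ == "__main__":
    print(max_by_type(TABLE_2))
    print(sum(max_by_type(TABLE_2).values()))
```

Read this way, the single Application question carries 20 of the 108 points, more than the four Literal questions together (12 points), which is the hierarchy the paragraph describes.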
2.2.2 Second test: Academic text comprehension in English
This material was designed and built based on the test used to evaluate academic text comprehension in Spanish. It thus contains the same type and number of questions, which of course point to the evaluation of the same processes we described for the test above. The questions were written in Spanish, and the students also had to answer in Spanish. This helped us to evaluate what the students had comprehended from the text, without any interference that might have been presented by possible limitations in terms of their capacity to express themselves in written English. With the aim of estimating test reliability, eight expert peers judged the test, obtaining interrater reliability agreement of over 90%. The test was also applied to a pilot sample composed of students and professors from the degree programme in question. Additionally, we applied Cronbach’s alpha, calculating a coefficient of 0.77. The text we selected corresponds to a Textbook section and its topic is the oxidation/reduction reaction. The adequacy of the discourse genre, the text in particular and its topic were determined by experts in the field of Industrial Chemistry.
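For readers unfamiliar with the reliability coefficient mentioned above, the following Python sketch shows the standard formula for Cronbach’s alpha, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores), where k is the number of items. The function name `cronbach_alpha` and the small score matrix are hypothetical; the pilot data behind the reported coefficient of 0.77 are not given in the chapter and are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for a list of respondents, each a list of item scores.

    Uses sample variances throughout; needs at least two respondents
    and at least two items.
    """
    k = len(scores[0])                      # number of items
    items = list(zip(*scores))              # one tuple of scores per item
    item_variance_sum = sum(variance(col) for col in items)
    total_variance = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - item_variance_sum / total_variance)


# Invented item scores for five imaginary respondents (NOT pilot data).
hypothetical_scores = [
    [3, 6, 5, 10],
    [1, 3, 1, 1],
    [3, 6, 15, 20],
    [1, 1, 5, 10],
    [3, 3, 5, 10],
]

if __name__ == "__main__":
    print(round(cronbach_alpha(hypothetical_scores), 2))
```

The coefficient approaches 1 when respondents who score high on one item also score high on the others, and drops towards 0 when item scores vary independently of one another.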
Table 3. QPT scores and levels (ESOL Examinations, 2007)
QPT Score | QPT Level
0–17 | Beginner
18–29 | Elementary
30–39 | Lower Intermediate
40–47 | Upper Intermediate
48–54 | Advanced
55–60 | Very Advanced
2.2.3 Third test: Evaluation of English proficiency
In order to assess English proficiency, we selected the Quick Placement Test (QPT). This test was designed by Cambridge University through ESOL Examinations, its department for the design and construction of English tests for speakers of other languages. The test allows teachers to place students of English as a second language into different levels according to their proficiency. These levels are presented in Table 3. According to ESOL Examinations (2007), this test allows objective and efficient decisions to be made with respect to the placement of students into different levels or courses. One of its particular qualities is that the test may be applied to learners of any age. As with other materials developed by ESOL, this standardised test has proven validity and reliability, and as such, it is used in Europe and in other countries around the world. The levels specified in Table 3 represent the standardised levels established and described by ALTE (Association of Language Testers in Europe), an organisation that encompasses and standardises the majority of institutions in the field of designing and administering level testing for second languages.
2.3 Application of the measurement tests
Application of the material described above was carried out over a period of forty-five days. During this time, each student first took the test to evaluate academic comprehension in English, then the test to measure academic comprehension in Spanish, and finally the test to measure proficiency in English. This order ensured that administration of the test for academic comprehension in English was not influenced by previous test administrations.
3. Description and discussion of the results
Description, analysis and discussion of the results were carried out in two phases. First, we reviewed results from each of the three tests, and then we reviewed the correlational analysis testing our hypotheses.
3.1 First phase: Results obtained in the evaluation
In this phase, using simple descriptive statistics, we first present the results obtained in the QPT, then those obtained from the test for academic text comprehension in Spanish, and finally those from the test for academic text comprehension in English.
3.1.1 English proficiency
Table 4 shows the distribution of the participating students according to their level of English proficiency, along with the percentage of the total number of students that each level represents. Moreover, the table shows the score corresponding to each level of the test used (Quick Placement Test) and the way in which said levels correspond to the international levels established by ALTE. As indicated in Table 4, the data reveal marked differences in terms of English proficiency and a clear concentration at the more basic levels, since the students in the Beginner, Elementary and Lower Intermediate levels make up 95% of the total number of participants. In particular, 70% of the Industrial Chemistry students (those at the Beginner and Elementary levels) have only basic English proficiency. These results are low if we consider that ALTE (2007) describes Beginner level speakers as only capable of using English in simple communicative situations such as in following basic instructions, participating in basic factual conversations or writing notes. Those who attain the Elementary level are capable of expressing requirements or simple opinions in a familiar context and understanding explicit information pertaining to a known area, such as simple study texts or reports on familiar material (ALTE 2007).

Table 4. Results in terms of levels
ALTE Level | Score per level | QPT Level | Number of students | Percentage of students per level
Breakthrough | 0–17 | Beginner | 33 | 39%
1 | 18–29 | Elementary | 26 | 31%
2 | 30–39 | Lower Intermediate | 21 | 25%
3 | 40–47 | Upper Intermediate | 4 | 5%
4 | 48–54 | Advanced | 0 | 0%
5 | 55–60 | Very Advanced | 0 | 0%
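The percentages quoted in this subsection follow directly from the student counts per level. A minimal Python sketch, assuming only the score bands of Table 3 and the counts of Table 4, is given below; the names `QPT_BANDS`, `STUDENTS_PER_LEVEL`, `qpt_level` and `shares` are hypothetical and serve only this illustration.

```python
# Score bands from Table 3: (lowest score, highest score, QPT level).
QPT_BANDS = [
    (0, 17, "Beginner"),
    (18, 29, "Elementary"),
    (30, 39, "Lower Intermediate"),
    (40, 47, "Upper Intermediate"),
    (48, 54, "Advanced"),
    (55, 60, "Very Advanced"),
]

# Number of students per level, from Table 4.
STUDENTS_PER_LEVEL = {
    "Beginner": 33,
    "Elementary": 26,
    "Lower Intermediate": 21,
    "Upper Intermediate": 4,
    "Advanced": 0,
    "Very Advanced": 0,
}


def qpt_level(score):
    """Return the QPT level whose band contains the given score."""
    for low, high, level in QPT_BANDS:
        if low <= score <= high:
            return level
    raise ValueError(f"score {score} is outside the 0-60 range")


def shares(counts):
    """Percentage of the total number of students at each level."""
    total = sum(counts.values())
    return {level: 100 * n / total for level, n in counts.items()}


if __name__ == "__main__":
    pct = shares(STUDENTS_PER_LEVEL)
    print(round(pct["Beginner"] + pct["Elementary"]))
    print(round(pct["Beginner"] + pct["Elementary"] + pct["Lower Intermediate"]))
```

Adding the first two levels gives the 70% figure, and adding the first three gives the 95% figure, both after rounding to the nearest whole percentage.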
3.1.2 Academic text comprehension in Spanish
The mean percentage attained by the participants in this test was 68.1%. This percentage reflects a relatively high level of academic comprehension in Spanish, since previous research (Peronard, Gómez, Parodi & Núñez 1998) considers 60% the minimum acceptable level. Nevertheless, the mean percentage alone does not give a clear idea of individual performance. Thus, in Table 5, we present the distribution of the students in terms of percentages achieved and the equivalent percentage with respect to the total number of participating students. As can be seen in Table 5, only 10.8% of the participants scored less than 50%. Accordingly, 89% of the participants scored 50% or higher, and more specifically, those who reached the acceptable level (60% or above) make up 69% of the total number of students. These data suggest that the majority of participants can be considered to be proficient when it comes to academic Spanish comprehension.

Table 5. Number of students per range and the equivalent percentage with respect to total number of students
Percentage range | Number of students per range | Equivalent percentage with respect to total number of students
0% – 29% | 0 | 0%
30% – 39% | 4 | 4.8%
40% – 49% | 5 | 6.0%
50% – 59% | 17 | 20.2%
60% – 69% | 16 | 19.0%
70% – 79% | 22 | 26.2%
80% – 100% | 20 | 23.8%

Having described the results in terms of mean percentages, we present results by question type. As stated in the description of the test, each type of question falls into a different level in a hierarchy of difficulty according to the comprehension model underlying this research (van Dijk & Kintsch 1983). Figure 1 shows these results.

Figure 1. Results by question type

Results in terms of question types are partially similar to previous empirical data (Parodi 2007a; Peronard 2007). As predicted, a gradual decrease in the achievement level can be appreciated across the Literal, Local Inference and Global Comprehension questions, due to increased complexity in the psychodiscourse processes involved. We must highlight that low performance on the Global Comprehension questions reveals a low capacity for constructing a textbase. Not attaining this level of representation implies, according to the comprehension model underlying this study (van Dijk & Kintsch 1983; Kintsch & Rawson 2005), incapacity to integrate information extracted from the text into a coherent meaningful representation. However, in the Application question, students attained the second highest level of achievement. This finding is interesting if we consider that coping with this type of question requires the most complex cognitive processing in comparison with the other four question types included in the test. One possible explanation for the high level achieved on the Application question may be related to the degree of disciplinary expertise of these individuals, that is to say, not only specialised theoretical knowledge but also knowledge associated with discourse genres, in this case, Textbook (Bhatia 2004). According to McNamara and Kintsch (1996), in certain reading comprehension situations, the strategic use of previous knowledge may make up for the inability to generate inferences or to make use of macrostrategies, thus enabling construction of an adequate situation model. Hence, in discipline-specific academic situations, knowledge of certain discourse attributes, such as recurrent topics, discourse organisation modes and rhetorical organisation, may greatly facilitate the comprehension process.
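The aggregate shares quoted from Table 5 in this subsection can be re-derived from the per-range counts. The sketch below is a minimal illustration in Python; the names `LOWER_BOUNDS`, `SPANISH_COUNTS` and `share_at_or_above` are hypothetical, and the counts are those listed in Table 5.

```python
# Lower bound (%) of each percentage range in Table 5, with the
# number of students who fell into that range.
LOWER_BOUNDS = [0, 30, 40, 50, 60, 70, 80]
SPANISH_COUNTS = [0, 4, 5, 17, 16, 22, 20]


def share_at_or_above(threshold, counts=SPANISH_COUNTS, lower_bounds=LOWER_BOUNDS):
    """Percentage of students whose range starts at or above `threshold`."""
    total = sum(counts)
    selected = sum(n for low, n in zip(lower_bounds, counts) if low >= threshold)
    return 100 * selected / total


if __name__ == "__main__":
    print(round(share_at_or_above(60)))             # acceptable level or above
    print(round(share_at_or_above(50)))             # 50% or higher
    print(round(100 - share_at_or_above(50), 1))    # below 50%
```

Computed from the raw counts, the share of students below 50% is 10.7% (9 of 84), whereas adding the already rounded table percentages (4.8% and 6.0%) gives the 10.8% quoted in the text; the difference is a rounding effect only.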
3.1.3 Academic text comprehension in English
The mean percentage attained for academic text comprehension in English was 47.7%. This percentage reflects a low level, since, as stated above, we considered 60% as the minimum acceptable level. Nevertheless, it is necessary to determine whether the average percentage accurately reflects the skills of the majority of participants. Thus, Table 6 shows the distribution of the students in terms of their percentages and their equivalent percentage with respect to the total number of students.

Table 6. Number of students per range and the equivalent percentage with respect to the total number of students

Percentage ranges   Number of students per range   Equivalent percentage with respect to total number of students
0% – 29%            20                             23.8%
30% – 39%           12                             14.3%
40% – 49%           14                             16.7%
50% – 59%           14                             16.7%
60% – 69%           13                             15.5%
70% – 79%           6                              7.1%
80% – 100%          5                              6.0%

Table 6 shows that 23.8% of participants scored less than 30%. Similarly, 54.8% of them scored below 50%, and those who attained a level of comprehension considered to meet the acceptable minimum requirements (60% or more) represent just 28.6% of the total number of students. These data suggest that the majority of participants can be considered non-skilled readers of academic texts written in English.
As with the results for academic text comprehension in Spanish, we present the results for academic text comprehension in English in terms of question types. This is shown in Figure 2.

Figure 2. Results by question type

Figure 2 shows a gradual decrease in the levels of achievement in terms of mean percentages per question type across Literal, Local Inference and Global Comprehension questions. As was the case with the test for measuring academic comprehension in Spanish, the Application question again presented the second-highest level of achievement. However, differences are not as marked as in the case of results obtained for the test in Spanish. This slight difference can be observed among Local Inference, Global Comprehension and Application questions. From these results, we can see that students had more problems answering Local Inference and Global Comprehension questions than answering the other types of questions included on the test. Moreover, the results obtained suggest that deficiencies that readers may exhibit when reading in a second language are not clearly evident in Literal questions (76%). As stated above, answering this sort of question only requires the recognition of explicit information in the text. In the case of academic text comprehension in English, the process may be facilitated by the strategic use of disciplinary knowledge related to the specialised lexicon.
Another interesting finding is the level of achievement attained for the Application question; this is higher than the levels achieved with questions involving less complex psychodiscursive processes, though only slightly. What makes this result noteworthy is that one must consider not only the great diversity of cognitive processing and resources implied in reaching this level of comprehension, but also the difficulties involved in carrying out this process in a second language. Results for the Application question suggest that when generating a situation model, lack of proficiency in the second language, in lexicogrammatical terms, can be compensated for by using other types of knowledge – in this case, knowledge of the discipline. These findings call for a reconsideration of the way in which discourse comprehension is conceived of and evaluated when it concerns academic discourse. We believe that in this situation, the degree of disciplinary expertise plays a central role, precisely because of the knowledge involved, especially discourse knowledge and knowledge associated with the tasks and activities typical of the discipline, such as problem-solving.
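The correlation analyses in Section 3.2 report a gamma statistic and compare it with a hypothesised band. As a minimal sketch, assuming that this refers to Goodman and Kruskal's gamma for ordinal data (the chapter does not name the variant), the coefficient can be computed from concordant and discordant pairs. The function names and the toy scores are hypothetical and are not data from the study; only the band 0.74 to 0.89 and the reported value 0.40 come from the text.

```python
from itertools import combinations


def goodman_kruskal_gamma(x, y):
    """Gamma = (C - D) / (C + D), where C and D count concordant and
    discordant pairs of observations; tied pairs are ignored."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        product = (x1 - x2) * (y1 - y2)
        if product > 0:
            concordant += 1
        elif product < 0:
            discordant += 1
    if concordant + discordant == 0:
        raise ValueError("gamma is undefined when every pair is tied")
    return (concordant - discordant) / (concordant + discordant)


def within_band(value, low, high):
    """True when the coefficient falls strictly inside the hypothesised band."""
    return low < value < high


# Hypothetical ordinal scores for four students (not data from the study).
proficiency = [1, 2, 3, 4]
comprehension = [1, 3, 2, 4]
gamma = goodman_kruskal_gamma(proficiency, comprehension)

# The first hypothesis expects 0.74 < r < 0.89; the reported gamma is 0.40.
first_hypothesis_supported = within_band(0.40, 0.74, 0.89)

print(round(gamma, 3), first_hypothesis_supported)
```

Because gamma ignores tied pairs, it tends to be larger in absolute value than tie-corrected ordinal measures computed on the same data, which is worth bearing in mind when a coefficient is compared against a fixed band.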
3.2 Correlation analyses
3.2.1 Correlation between the level achieved for academic text comprehension in English and the level of English proficiency
In the first hypothesis, we state that there is a large positive correlation (0.74 < rxy < 0.89) between the level of comprehension achieved by students when reading an academic text written in English and their English proficiency level. The application of the statistical test yields a gamma of 0.40. This is equivalent to a small positive correlation between the two variables, which means that the hypothesis has not been corroborated.
3.2.2 Correlation between the level achieved for academic text comprehension in English and the degree of disciplinary expertise
In the second hypothesis, we state that there is a medium positive correlation (0.49 < rxz < 0.75) between the level of comprehension achieved by students when reading an academic text written in English and their degree of disciplinary expertise. Application of the statistical test yields a gamma of 0.21. This is equivalent to a weak positive correlation (rxz = 0.21) between the two variables, which means that the hypothesis has not been corroborated.
3.2.3 Correlation between the level achieved for academic text comprehension in English and the level achieved for academic text comprehension in Spanish
In the third hypothesis, we state that there is a high positive correlation (0.74 < rxy < 0.89) between the level of comprehension achieved by students when reading an academic text written in English and their level of comprehension when reading an academic text written in Spanish. Application of the statistical test yields a correlation coefficient of 0.14. This is equivalent to a very small positive correlation between the two variables, which means that the hypothesis has not been corroborated.
3.3 Cluster analysis
The main aim of this research is to identify and relate general results obtained by the participants; however, in trying to find possible reasons for low levels of correlation obtained in the analysis, we decided to carry out another analysis – this time in individual terms. This type of analysis implies visualisation of the way in which the variables involved interact to determine levels of achievement attained by each student. Figure 3 shows that the weak linear correlation may be explained by the existence of what is called a cluster. In order to show data clustering, the figure is divided into four quadrants based on the cut-off score of 60%, previously assumed as the minimum accepted. This procedure shows that the data are distributed over what can be interpreted as three defined clusters. First, in the lower left quadrant, we can observe the participants whose results are low for both tests. This cluster corroborates the idea that there are students who are non-skilled readers in any language and, therefore, their level of proficiency in English is not a determining variable when facing an academic text written in this language. Second, in the lower right quadrant, there are several subjects who obtained high scores on the test for comprehension in Spanish and low scores for comprehension in English. In this particular case, it is necessary to analyse the role of proficiency in English, as well as the degree of disciplinary expertise. Finally, those subjects who obtained high scores on both tests can be observed in the upper right quadrant. In this last case, it is also important to pay attention to the role of the degree of disciplinary expertise and the level of proficiency in English.

Figure 3. General clustering [scatter plot: Academic Comprehension in English (vertical axis) against Academic Comprehension in Spanish (horizontal axis), both from 0% to 100%]

In order to determine the influence that disciplinary expertise and proficiency in English might have on the level achieved by students, we present Figure 4, which shows the way in which the variables involved interact to determine the level of achievement attained by participants in the test to measure academic comprehension in English. To do so, in this instance, we focus only on students who scored at the Beginner level on the QPT test. We can observe that in the lower left quadrant, there is a rather small group made up of students who started the programme in 2003, 2006 and 2007. Then, in the lower right quadrant, there is a large group of students who entered the programme between 2002 and 2007. Finally, in the upper right quadrant, there are very few students, and those present entered in 2004, 2006 and 2007. The figure shows that the students in the lower left quadrant, in addition to having low English proficiency, are also unable to comprehend a text in their mother tongue, and therefore, independent of their degree of disciplinary expertise, they were unable to comprehend the academic text written in English. Furthermore, in the lower right quadrant, we can observe that though the students were able to comprehend an academic text in their mother tongue, independent of their degree of disciplinary expertise, they were not able to transfer this skill in order to comprehend the academic text written in English. This is presumably due to their low level of English proficiency.

Figure 4. Beginners clustering [scatter plot, panel title: Beginner; Academic Comprehension in English (vertical axis) against Academic Comprehension in Spanish (horizontal axis), both from 0% to 100%; data points are labelled with single digits from 2 to 7]

In order to test this assumption, we present Figure 5, where we perform the same type of analysis as we did with Figure 4, but this time focused on students with a higher level of English proficiency, that is, those in the Lower Intermediate and Upper Intermediate levels. From this perspective, we can first observe that the scores obtained for academic comprehension in English are definitely higher than those presented in the previous case (Figure 4). Secondly, if we look at the number of participants placed in each quadrant, it is possible to see a small group of students in the lower left quadrant that entered the programme in 2004, 2005 and 2007. In the lower right quadrant, there is a medium-sized group that entered the programme in 2006 and 2007. Finally, in the upper right quadrant, there is another medium-sized group that started the programme between 2003 and 2007.

Figure 5. Intermediate clustering [scatter plot, panel title: Intermediate; Academic Comprehension in English (vertical axis) against Academic Comprehension in Spanish (horizontal axis), both from 0% to 100%; data points are labelled with single digits from 3 to 7]

This cluster analysis indicates the relevance of the role played by English proficiency level above and beyond the role of the other variables involved. It is also important to highlight that this variable determines the behaviour of the others, since it has been shown that a linguistic threshold is necessary to transfer the skill used to comprehend texts in the mother tongue to a situation in which the text is written in a second language. Nevertheless, it is important to note what happened with the students placed in the lower right quadrant. Although they have a threshold level of English, they could not transfer the skills they use to comprehend texts in their mother tongue to the same process in a second language. Some other students with more years in the programme were able to do so. Taking into account their years of entry (2006 and 2007), it seems that their insufficient degree of disciplinary expertise, independent of their level of English and their comprehension skills, hampered their attainment of a higher level of achievement.
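The quadrant reading used in this cluster analysis can be sketched in a few lines. This is an illustrative sketch only: it assumes that a score of exactly 60% counts as acceptable, and the function name, the quadrant labels and the sample scores are hypothetical rather than data from the study.

```python
CUTOFF = 60.0  # minimum acceptable level used in the chapter, in percent


def quadrant(spanish, english, cutoff=CUTOFF):
    """Place one student in a quadrant of the plot with Spanish comprehension
    on the horizontal axis and English comprehension on the vertical axis."""
    vertical = "upper" if english >= cutoff else "lower"
    horizontal = "right" if spanish >= cutoff else "left"
    return f"{vertical} {horizontal}"


# Hypothetical (Spanish %, English %) pairs, one per student.
sample = [(45, 30), (75, 40), (80, 70), (50, 65)]

# Count how many students fall in each quadrant.
counts = {}
for spanish, english in sample:
    label = quadrant(spanish, english)
    counts[label] = counts.get(label, 0) + 1

print(counts)
```

Running the same count separately for each proficiency group (for instance Beginner versus Intermediate) would reproduce the kind of comparison made between Figures 4 and 5, provided each record also carries the student's proficiency level.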
Conclusions
From the data obtained, it is possible to conclude that the levels of achievement attained by the students on both the test to measure English proficiency and the test for academic comprehension in English are definitely low. This situation is a cause for concern when we consider that these students must face academic texts written in English in most of their courses and throughout the five years of their degree programme. However, one positive aspect to highlight is the level these students achieved for academic text comprehension in Spanish.
In addition, from the analysis performed, it is possible to state that the variable which most determines the level of academic text comprehension in English is the level of proficiency in English, followed by the degree of disciplinary expertise and, finally, comprehension skill. However, although correlational analysis demonstrated the existence of a positive correlation in all cases studied, the correlation level was low in the three relationships examined by this research. Similarly, cluster analysis demonstrates that the variables involved behave differently in different situations. In the case of skill for comprehending texts in the mother tongue, this can only be transferred if the reader possesses a threshold level of proficiency in English. However, the influence of the level of proficiency depends on the psychodiscourse process involved. This can be seen in the results organised by question type. Finally, when considering the degree of disciplinary expertise, we can say that this variable has a direct effect in cases where its level is extremely low, hampering comprehension regardless of comprehension ability and level of English proficiency.
One of the findings we considered highly interesting concerns the results of the Application question. These results indicate that, though unable to generate a textbase, some participants were capable of constructing an adequate Situation Model. We therefore believe that these students probably achieved the third level of representation by using general cognitive strategies directly related to the discipline and, thus, associated with problem solving. Of course, this implies a degree of disciplinary expertise, which increases knowledge related to the particular discourse genre they are familiar with, in this case the Textbook genre. These specific results suggest that we should think about and discuss the way we conceive and evaluate academic discourse comprehension. According to what we stated above, it seems pertinent to ask what we should understand as deep academic comprehension: (a) the process performed by a student who is capable of achieving Local Inference, and using macro-structural strategies that lead to a textbase; or (b) the process performed by a student who manages to create a Situation Model principally based on her/his disciplinary knowledge, including discourse knowledge. In the first case, according to McNamara and Kintsch (1996) and Kintsch (1988, 1998), the student would be able to extract the global meaning from the text and even summarise it; in the second, the student would be able to integrate the information with her/his own previous knowledge and apply it to new situations, as shown by the results obtained in this research.
By means of this research, we obtained relevant data concerning levels of academic discourse comprehension achieved by students, not only in Spanish but also in English. We also explored the relationship between the level of comprehension of academic texts written in English and the variables understood as determinants of the level of achievement attained. One of these is the degree of disciplinary expertise, which, in spite of its relevance, is rarely considered when evaluating comprehension processes in disciplinary settings.
Figure 6. Variables involved in the multidimensional profile across the disciplines [diagram relating Academic Comprehension in Spanish, Academic Comprehension in English, Degree of Disciplinary Expertise and Level of Proficiency in English across Discipline 1, Discipline 2, Discipline 3 and Discipline n]
We wish to highlight that the results obtained motivate further study concerning disciplinary access. In addition, we must move towards a more in-depth description of the interaction between all the variables involved and concentrate on their synergetic relationship, given its importance not only for performance in academic settings but also in professional ones. It would also be interesting to broaden the disciplinary scope and compare the way in which these and other variables are related in other disciplines. This is displayed in Figure 6. From this conception of the phenomenon, a multidimensional profile of the processing of written language may be built in Spanish and English, and in different disciplines. This complex perspective on different skills and knowledge allows us to describe the way in which these variables interact and also the relative weight each of these represents at the university level, epistemically considering different disciplines.
chapter 12
Corollary
A critical synopsis of this book and some prospects for future challenges
Giovanni Parodi
In this last chapter of the book, a review of all findings is presented. A critical analysis of the theoretical framework and the proposed criteria for identifying genres (Chapters 2 and 3) is also offered, together with a comparison with the empirical outcomes obtained throughout the chapters. Projections and future research niches are outlined.
Overview In this final chapter, we focus critically on our comprehensive framework for researching genres, in which the cognitive, social and linguistic dimensions are dynamically interacting as constitutive components of a theory of genre and language. Nevertheless, the cognitive dimension is emphasised in our genre conception, partially because – in our opinion – this has been rather neglected until recently by mainstream theories. This chapter also organises and partly describes the various findings empirically supported within a single framework. This framework emerged from our conception of human beings, language, discourse and genre. This framework also emerged from a long step-by-step development of a school of thought at Escuela Lingüística de Valparaíso (ELV) www.linguistica.cl, and it is aligned with the tenets of our mentors and founders Marianne Peronard and Luis Gómez Macker at Pontificia Universidad Católica de Valparaíso, Chile. The first theoretical notion that drives this framework is that language can only be understood interdisciplinarily and from a psycho-socio-linguistics approach, where cognitive, contextual and linguistic dimensions are dynamically interrelated. Critical to this notion is the explicit sociocognitive principle that defines our ontological and epistemological conception of human beings as conscious and free individuals seeking to interactively construct their knowledge of
the world within society and culture. Also critical is that we do not attempt in any way to study language by itself, disconnected from the world and society. Although it may be clear that cognition is an important dimension in our theory of language and genre, it can be argued that encompassing dimensions are required. This is why we collected complete and unmutilated texts as corpora, representing concrete instances and manifestations of language in use, seeking the highest degree of representativeness. This is also why we are interested in disciplinarity and specialised knowledge construction through written discourse in particular settings, such as universities and professional workplaces. It is further notable that this psychodiscourse framework addresses a dynamic conception of genre, empirically supported and combining complementary methodological approaches: “top-down” and “bottom up” perspectives (that is, a complementation of deductive and inductive methods). Just as this volume is concerned with genre descriptions across disciplines, computational tool developments for improving robust scientific results, and generalisable conclusions as well as educational applications into reading comprehension processes, so too is this framework. This means we are not only interested in the theoretical and descriptive steps of scientific research, but also in developing computational tools and programmes as well as focusing on written language literacy at undergraduate and graduate educational levels, that is, on reading and writing processes as fundamental means to access and produce knowledge. This could be called applied linguistics, or rather, applied interdisciplinary linguistics. This also means that we are interested in researching instructional techniques used to develop successful readers and writers and to foster reading comprehension strategies so as to help readers overcome various levels of comprehension difficulties with specialised genres. 
This encompassing approach is based on the assumption that we need to help contemporary students and professionals who are, as never before, hugely interconnected in so many ways, but who sometimes cannot understand each other at deep levels of comprehension. Language is vital, but it should not be parcelled out so strictly and definitively in order to investigate in depth its functioning parts or fundamental factors or dimensions. There must come a moment when the divided parts, components, sections or levels of language are reassembled into the integrated whole, and this reassembly should also be a stage of the research challenge. In this scenario, comprehending, identifying and describing genres is clearly challenging. First, it involves a series of text processes, such as decoding strategies and vocabulary access. Second, readers and writers must execute these processes in a coordinated fashion in order to accomplish their communicative purposes and fulfil the needs and goals of communicative interactions. Command of specialised textual and discourse processes, command
of the domain and world knowledge, and command of basic linguistic rules are just some of the vital components required. Many readers and writers struggle at one stage or another in the cognitive, textual and communicative processes involved in accurately adjusting to registers and genres and in successfully accomplishing the intended communicative purposes. The stages or dimensions at which readers and writers struggle may differ depending on the reader/writer and the genre characteristics. How to help inexperienced readers/writers to reach successful command of specialised genres is a challenge underlying the founding principles of this book. The findings reported are just one small building block of genre theory and language theory. Analytical and in-depth descriptions of the circulating genres in specific undergraduate and professional settings constitute one of the first levels in this progressive and accumulative line of research. The chapters in this volume focused on genre descriptions and described moves and steps of some important emerging genres that are theoretically motivated and have been shown to be relevant as educational tools in four different undergraduate university programmes.
Corollary: Robust findings and research limitations
Thus far, the findings of our research reported in this volume indicate that there are a number of potential obstacles a reader may encounter when attempting to become part of a specialised discourse community, from the point of view of written genres. The diversity of genres depends on the discipline or scientific domain (Social Sciences and Humanities or Basic Sciences and Engineering) but also on the undergraduate university settings or the professional workplaces, which revealed a quite heterogeneous panorama of written discourse means of communication. Successful undergraduate university students and successful professionals must face this growing and varying diversity of genres and must develop reading strategies to process and access specialised information which is vital for their progress in life and work.
The following comments are offered as a way of identifying strengths and limitations of the studies reported in this book:
– Segmentation, identification and determination of the moves and steps of a genre across the texts are undoubtedly a point for discussion and debate, particularly if the texts belonging to a genre are of considerable length (an average of 200,000 words). Likewise, the abstraction levels implicit in this analysis are a complex issue to be faced by each researcher. Although this can be guaranteed through joint and collaborative work of a research team, or by a triangulation process with expert judges, we incorporated the macro-level known as macro-move in order to specifically support part of these complexities. This higher level proved to be highly suitable to the analysis and the decision turned out to be very useful as well.
– Another critical point regarding the analytical steps used in this research is the corpus-based identification of both academic and professional genres. After a thorough literature review, we decided to formulate a table containing five criteria and their corresponding operationalisation in a set of variables. This deductive and inductive step came up against a series of theoretical and empirical complications. This was partly due to polysemy between different terms in the available literature, and also due to the lack of specific consensual terminology (e.g., discourse organisation mode, modality, role of participants, etc.) and of empirical clarity for application of the identified variables. Notwithstanding, we were able to manage this research stage and successfully identified 9 academic genres and 28 professional genres. Nevertheless, the set of criteria and variables has evidently failed to encapsulate all the richness and complexity of the corpora. In fact, several criteria do not show significant discriminatory power and cut across the genres identified. Such is the case, for example, with the Research Article and the Disciplinary Text. Both are observed to share all of the fundamental criteria, and the definitions we formulated show this. Therefore, one challenge is to further investigate criteria and variables that represent the texts of a corpus and fundamentally make an in-depth distinction between these. Of course, identification of meeting and crossing points constitutes another future challenge.
– It is important to point out that headway has been made regarding technological instruments and expectations for automatisation procedures in computer analysis. However, we still need to refine these instruments and delve further into these findings. Moving beyond manual analysis is another challenge, with potential for the progressive study of increasingly large and robust text corpora.
– One issue that has not been discussed in this book is the identification of lexicogrammatical features in the moves or steps of the genres under study. Describing patterns of structure and organisation across the texts of different genres is an open area of research. This may lead to progressive and automatic identification of moves and steps across genres.
– In direct relation with the point made above, the possibility of in-depth study of the lexicogrammatical features as automatic classifiers of the differences detected in certain genres, such as the Textbook and the Disciplinary Text, remains open. Abstraction and concreteness in Textbooks across Social Sciences and Humanities texts and Basic Sciences and Engineering texts, as well as the differing degrees to which knowledge is negotiated in the Disciplinary Text across disciplines, await further research.
Prospects for future research
We need to continue working on ways to incorporate the many variables interacting in a genre and to keep trying to provide better accounts of most of these. In addition, we should move towards capturing more properties of texts in our corpus design and analysis, as well as building ways to analyse them computationally and automatically. We need to proceed to the study of lexicogrammatical features in the discourse units constituting genres across diverse disciplines. Two other highly relevant considerations to keep in mind for corpus-based descriptions of specialised discourse are the broader physical and social contexts around texts and the psychodiscoursive processing of patterns of structure and organisation across the texts of different genres. On the one hand, recent methods of text description are moving beyond the text itself and are trying to capture interactions in the context of production and use while readers and writers process information, using techniques such as observations, interviews, protocols and focus groups, as well as varying methodologies supported by ethnographic perspectives. On the other hand, reading comprehension methodologies, not only for scientific research descriptions but also for educational purposes, are employing sophisticated technologies, such as on-line techniques, and are also improving off-line instruments. Future corpus research on written discourse should explore and find ways to be more systematic in gradually but robustly incorporating data obtained through social-contextual descriptions, as well as findings emerging from psychodiscoursive processing of written language. Reassembling these results may be difficult, as attested in many inter- or trans-disciplinary domains, but it becomes fundamental when approaching complex and rich objects of study.
Otherwise, the risk is to further continue with a fragmentation of complex objects and not to advance in terms of developing more progressive, cooperative and inclusive work around human beings and discourse in order to help society to resist disintegration of the “wholeness” of human discourse and human beings. This has been the case with some recent mainstream theories of language or models of language.
Closing remarks In this book, we have briefly reviewed a framework to understand genres as multidimensional objects and empirically documented findings regarding genre variation across disciplines. We also stated descriptions based on move analysis methodology of two highly relevant university genres: Textbook and Disciplinary Text. Moreover, we investigated and compared diverse computational tools to account for genre identification and disciplinary variation. In order to sketch a complete panorama from theorising about language and genres to empirically-based descriptions on qualitative and quantitative approaches, we also moved into application stages designing and administering reading comprehension tests to undergraduate university students from the disciplines involved in this research, not only in Spanish but also in English. The shallow levels undergraduate university students of Industrial Chemistry revealed when processing a text from a Disciplinary Text genre is another finding that shows future niches of research. We are aware that we might have ignored potentially important theoretical contributions and outcomes as well as research limitations and projections. However, in this final chapter, our goal was not to include all of the findings of this book nor to summarise all the chapters, but rather to put together some of the most important contributions in order to think of the challenges and to look back on the work undertaken. More research is needed so that we can better understand the nature of discourse genres and the way in which we can help readers to become successful comprehenders of the fundamental genres, but also to be able to face and process new emerging genres. To some extent, many of the chapters in this volume contribute to this understanding. 
However, we look forward to further work describing specialised genres in different settings and disciplines, as well as to cross-linguistic studies and to attempts to harmonise the cognitive, social, and linguistic dimensions.
References
Acosta, O. (2006). Análisis de introducciones de artículos de investigación publicados en la Revista Núcleo 1985–2003. Núcleo, 18(23), 9–30. Adam, J.-M. (1992). Les textes: Types et prototypes. Paris: Nathan-Université. Alcaraz, E., Mateo, J. & Yus, R. (Eds.) (2007). Las lenguas profesionales y académicas. Barcelona: Ariel. Alderson, J. (1984). Reading in a foreign language: A reading problem or a language problem? In J. Alderson & A. Urquhart (Eds.), Reading in a foreign language (pp. 1–24). London: Longman. Alderson, J. (2000). Assessing reading. Cambridge: Cambridge University Press. ALTE (2007). The ALTE framework: A common european level system [on line]. Available at: http://www. alte.org/further_info/framework_english.pdf. Arnoux, E., Nogueira, S. & Silvestre, A. (2006). Comprensión macroestructural y reformulación resuntiva de textos teóricos en estudiantes de institutos de formación de docentes primarios. Revista Signos, 39(60), 9–30. Askehave, I. & Swales, J. (2001). Genre identification and communicative purpose: A problem and a possible solution. Applied Linguistics, 22(2), 95–212. Baker, P. (2006). Using corpora in discourse analysis. London: Continuum. Bakhtin, M. (1998). La estética de la creción verbal. Mexico: Siglo XXI. Baldi, P., Frasconi, P. & Smith, P. (2003). Modeling the internet and the web. Chichester: John Wiley & Sons. Barbara, L. & Scott, M. (1999). Homing in on a genre: Invitations for bids. In F. Bargiela-Chiappini & C. Nickerson (Eds.), Genres, media and discourses (pp. 227–254). London: Longman. Bargiela-Chiappini, F. & Nickerson, C. (Eds.) (1999). Writing Business Genres, Media and Discourses. London: Longman. Bartholomé, D. (1986). Inventing the university. Journal of Basic Writing, 5, 4–23. Bassols, M., & Torrens, A. (1997). Modelos textuales, teoría y práctica. Barcelona: Eumo Editorial. Bautista, E., Guzmán, E. & Figueroa, J. (2004). Predicción de múltiples puntos de series utilizando support vector machines. 
Computación y sistemas, 7(3), 148–155. Bazerman, C. (1988). Shaping written knowledge: The genre and activity of the experimental article in science. Madison, WI: The University of Wisconsin Press. Bazerman, C. (1994). Systems of genres and the enhancement of social intentions. In A. Freedman & P. Medway (Eds.), Genre and New Rhetoric (pp. 79–101). London: Taylor and Francis.
240 References
Bazerman, C. (Ed.) (2008). Handbook of research on writing. History, society, school, individual, text. New York: Lawrence Erlbaum/Taylor & Francis. Beaufort, A. (2007). College writing and beyond. Logan, UT: Utah State University Press. Bereiter, C. & Scardamalia, M. (1987). The psychology of written composition. Hillsdale, NJ: Lawrence Erlbaum. Berkenkotter, C. & Huckin, T. (1995). Genre knowledge in disciplinary communication: Cognition, culture, power. Hillsdale, NJ: Lawrence Erlbaum. Betancourt, G. (2005). Las máquinas de soporte vectorial. Scientia e Técnica, 11(27), 67–72. Bhatia, V. (1993). Analysing genre: Language use in professional settings. London: Longman. Bhatia, V. (1997a). Introduction: Genre analysis and world Englishes. World Englishes, 16, 313–319. Bhatia, V. (1997b). Genre-Mixing in academic introductions. English for Specific Purposes, 16(3), 181–195. Bhatia, V. (2002). A generic view of academic discourse. In J. Flowerdew (Ed.), Academic discourse (pp. 21–39). Cambridge: Cambridge University Press. Bhatia, V. (2004). Worlds of written discourse. A genre based view. London: Continuun. Biber, D. (1988). Variation across speech and writing. Cambridge: Cambridge University Press. Biber, D. (1995). Dimensions of register variation: A cross linguistic comparison. Cambridge: Cambridge University Press. Biber, D. (2003). Variation among university spoken and written registers: A new multi-dimensional analysis. In P. Leistyna & C. Meyer (Eds.), Corpus analysis: Language structure and language use (pp. 47–70). Amsterdam: Rodopi. Biber, D. (2005). Paquetes léxicos en textos de estudio universitario: Variación entre disciplinas académicas. Revista Signos, 38(57), 19–29. Biber, D. (2006). University language: A corpus-based study of spoken and written registers, Studies in Corpus Linguistics 23. Amsterdam: John Benjamins. Biber, D. (2007a). Discourse analysis and corpus linguistics. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the move. 
Using corpus analysis to describe discourse structure, Studies in Corpus Linguistics 28 (pp. 1–20). Amsterdam: John Benjamins. Biber, D. (2007b). University language: A corpus-based study of spoken and written registers. Applied Linguistics, 28, 624–627. Biber, D., Connor, U., & Upton, T. (2007). Discourse on the move. Using corpus analysis to describe discourse structure, Studies in Corpus Linguistics 28. Amsterdam: John Benjamins. Biber, D. & Conrad, S. (2001). Introduction: Multi-dimensional analysis and the study of register variation. In S. Conrad & D. Biber (Eds.), Variation in English: Multi-dimensional studies (pp. 3–12). London: Longman. Biber, D., Conrad, S. & Reppen, R. (1998). Corpus linguistics: Investigating language structure and use. Cambridge: Cambridge University Press. Biber, D. & Tracy-Ventura, N. (2007). Dimensions of register variation in Spanish. In G. Parodi (Ed.), Working with Spanish corpora (pp. 54–89). London: Continuum. Björk, L. & Räisänen, C. (2003). Academic writing: A university writing course. Lund: Studentlitteratur. Bolívar, A. (1999). Los resúmenes para eventos científicos en lingüística aplicada en América Latina: Estructura e interacción. Opción, 15(29), 61–81. Bolívar, A. (2000). Homogeneidad versus variedad en la estructura de los resúmenes de investigación para congresos. Akademus, 2, 121–138.
References 241
Bordignon, F., Peri, J., Tolosa, G., Villa, D. & Paoletti, L. (2004). Experimentos en clasificación automática de noticias en español utilizando el modelo bayesiano [on line]. Available at: http://www.unlu.edu.ar/~tyr/TYR–publica/paper-un lu-bayes-2004.doc. Boreham, J. & Niblett, B. (1976). Classification of legal texts by computer. Information Processing & Management, 12(2), 125–132. Bruce, I. (2008). Academic writing and genre. A systematic analysis. London: Continuum. Bunton, D. (2002). Generic moves in Ph.D. introduction chapters. In J. Flowerdew (Ed.), Academic discourse (pp. 57–75). London: Longman. Burdach, A. (2000). El léxico científico y técnico: un recurso publicitario persuasivo. Onomazein, 5, 189–208. Cabré, M. T. (1993). La terminología: Teoría, metodología y aplicaciones. Barcelona: Antártida. Cabré, M.T. (1999). Hacia una teoría comunicativa de la terminología: Aspectos metodológicos. Revista Argentina de Lingüística, 15, 24–38. Cabré, M. T. (2002). Textos especializados y unidades de conocimiento: Metodología y tipologización. In J. García & M. Fuentes (Eds.), Texto, terminología y traducción (pp. 122–187). Barcelona: Almar. Cabré, M. T., Doménech, M., Morel, J. & Rodríguez, C. (2001). Las características del conocimiento especializado y la relación con el conocimiento general. In M. T. Cabré & J. Feliú (Eds.), La terminología técnica y científica (pp. 173–186). Barcelona: Instituto Universitario de Lingüística Aplicada. Cademártori, Y. (2003). La inscripción de las personas en textos de divulgación científica. Revista Latinoamericana de Estudios del Discurso, 3, 9–17. Cademártori, Y., Parodi, G. & Venegas, R. (2006). El discurso escrito y especializado: Caracterización y funciones de las nominalizaciones en los manuales técnico. Revista Literatura y Lingüística, 17, 243–265. Cademártori, Y., Parodi, G. & Venegas, R. (2007). El discurso escrito y especializado: Las nominalizaciones en los manuales técnicos. In G. 
Parodi (Ed.), Lingüística de corpus y discursos especializados: Puntos de mira (pp. 79−98). Valparaíso: Ediciones Universitarias de Valparaíso. Calsamiglia, E. (2000). Decir la ciencia: Las prácticas divulgativas en el punto de mira. Discurso y Sociedad, 2(2), 3–8. Candlin, C. (Ed.) (2002). Research and practice in professional discourse. Hong Kong: City University of Hong Kong Press. Carlino, P. (2005). Representaciones sobre la escritura y formas de enseñarla en universidades de América del Norte. Revista de Educación, 336 (enero–abril), 143–168. Carrell, P. & Grabe, W. (2002). Reading. In N. Schmitt (Ed.), An introduction to applied linguistics (pp. 233–250). London: Arnold. Castel, V., Aruani, M. & Severino, V. (Eds.) (2004). Investigaciones en ciencias humanas y sociales: Del ABC disciplinar a la reflexión metodológica. Mendoza: Editorial de la Facultad de Filosofía y Letras de la Universidad Nacional de Cuyo. Castelló, M. (Coord.) (2007). Escribir y comunicarse en contextos científicos y académicos: Conocimientos y estrategias. Barcelona: GRAÓ. Cerviño, U., García, J., Calvo, R. & Ceccato, A. (2004). Automatic classification of news articles in Spanish [on line]. Available at: http://citeseer.ist.psu.edu/beresi04automatic.html. Ciapuscio, G. (1992). Impersonalidad y desangetivación en la divulgación científica. Lingüística Española Actual, 2, 183–205. Ciapuscio, G. (1994). Tipos textuales. Buenos Aires: EUDEBA.
242 References
Ciapuscio, G. (1996). El subtipo textual: “Conclusiones” en revistas de lingüística hispánica: Una perspectiva lingüístico-textual contrastiva. Filología XXIX, 1–2, 5–19. Ciapuscio, G. (2000). Hacia una tipología del discurso especializado. Discurso y Sociedad, 2(2), 39–71. Ciapuscio, G. (2003). Textos especializados y terminología. Barcelona: IULA. Ciapuscio, G. (2007). Epistemic modality and academic orality: Pilot study for COTECA (Corpus Textual del Español Científico de la Argentina). In G. Parodi (Ed.), Working with Spanish corpora (pp. 90–105). London: Continuum. Ciapuscio, G. & Otañi, I. (2002). Las conclusiones de los artículos de investigación desde una perspectiva contrastiva. RILL, 15, 117–133. Colmenares, G. (2007). Función de base radial. Radial Basis Function (RBF) [on line]. Available at: http:// www.webdelprofesor.ula.ve/economia/gcolmen/programa/redes_ neuronales/capitulo4_funciones_bases_radiales.pdf. Connor, U. & Upton, T. (Eds.) (2004). Discourse in the professions: Perspectives from corpus linguistics, Studies in Corpus Linguistics 16. Amsterdam: John Benjamins. Conover, W. (1999). Practical nonparametric statistics. New York, NY: John Wiley & Sons. Conrad, S. & D. Biber. 1998. Multi-dimensional methodology and the dimensions of register variation in English. In S. Conrad & D. Biber (Eds.), Variation in English: Multi-dimensional Studies (pp. 13–42). London: Longman. Cortes, C. & Vapnick, V. (1995). Support vector networks. Machine Learning, 20, 273–297. Crismore, A. (1989). Talking with readers. Metadiscourse as rhetorical act. Frankfurt: Peter Lang. Cristianini, N. & Shawe-Taylor, J. (2002). Introduction to support vector machines: And other kernel-based learning methods. Cambridge: University of Cambridge. Crossley, S. & Louwerse, L. (2007). Multi-dimensional register classification using collocations. International Journal of Corpus Linguistics, 12, 453–478. Cubo de Severino, L. (2000). 
El manual universitario como género: Análisis retórico. (unreleased) Presented in a report to Secyt UNT. Cubo de Severino, L. (2002). Evaluación de estrategias retóricas en la comprensión de manuales universitarios. RILL, 15, 55–59. Cubo de Severino, L. (Coord.) (2005a). Los textos de la ciencia. Principales clases del discurso académico-científico. Córdoba: Comunicarte. Cubo de Severino, L. (2005b). Los manuales universitarios. In L. Cubo de Severino (Coord.), Los textos de la ciencia. Principales clases del discurso académico-científico (pp. 325–336). Córdoba: Comunicarte. Curado, A., Edwards, P. & Rico, M. (2007). Approaches to specialised discourse in higher education and professional contexts. Newcastle, UK: Cambridge Scholars Publishing. Chafe, W. & Danielewicz, J. (1987). Properties of spoken and written language. In R. Horowitz & S. J. Samuels (Eds.), Comprehending oral and written language (pp. 83–113). New York, NY: Academic Press. Charaudeau, P. (1992). Grammaire du sens et de l’expression. Paris: Hachette. Charaudeau, P. (2004). La problemática de los géneros. De la situación a la construcción textual. Revista Signos, 37(56), 23–39. Christie, F. & Martin, J. (Eds.) (1997). Genre and institutions. Social Processes in the workplace and the school. London: Continuum. Christie, F. & Martin, J. (Eds.) (2007). Language, knowledge and pedagogy. Functional linguistics and sociological perspectives. London: Continuum.
References 243
De Beaugrande, R. (2004). Language, discourse, and cognition: Retroprospects and prospects. In T. Virtanen (Ed.), Approaches to Cognition through Text and Discourse (pp. 17–31). Berlin: Mouton de Gruyter. De Beaugrande, R. & Dressler, W. (1981). Introducción a la lingüística del texto. Barcelona: Ariel. Deutsch, M. & Krauss, R. (1980). Teorías en psicología social. Barcelona: Paidós. Devitt, A. (2004). Writing genres. Carbondale, IL: Southern Illinois University Press. Duda, R. & Hart, P. (1973). Pattern classification and scene analysis. New York, NY: John Wiley & Sons. Dudley-Evans, T. (1994). Genre analysis: An approach for text analysis for ESP. In M. Coulthard (Ed.), Advances in written text analysis (pp. 219–228). London: Routledge. Dudley-Evans, T. (1986). Genre analysis: An investigation of the introduction and discussion sections of MSc dissertations. In M. Coulthard (Ed.), Talking about text (pp. 128–145). Birmingham: University of Birmingham. Dudley-Evans, T. & St. Johns (2006). Developments in English for Specific Purposes. Cambridge: Cambridge University Press. Eco, U. (2000). Tratado de semiótica general. Barcelona: Lumen. ESOL Examinations (2007). [on line]. Available at: http://www.cambridgeesol.org/index. Espejo, C. (2006). La movida concluyendo en torno al tema en informes de investigación elaborados por estudiantes universitarios. Onomázein, 13(1), 35–54. Facchinetti, R. (Ed.) (2007). Corpus linguistics 25 years on. Amsterdam: Rodopi. Fairthorne, R. (1961). The mathematics of the classification. Towards Information Retrieval. London: Butterworths. Fernández, G. & Carlino, P. (2007). Leer y escribir en los primeros años de la universidad: Un estudio proyectado en ciencias veterinarias y humanas de la UNCPBA. Cuadernos de Educación, 5, 277–289. Figuerola, C. (2000). La investigación sobre recuperación de la información en español. In E. Gonzalo & V. García (Eds.), Documentación, terminología y traducción (pp. 73–82). Madrid: Síntesis. 
Figuerola, C., Zazo, A. & Berrocal, J. (2000). Categorización automática de documentos en español: Algunos resultados experimentales [on line]. Available at: http://imhotep.unizar.es/ jbidi/jbidi2000/14_2000.pdf. Flowerdew, J. (Ed.) (2002a). Academic discourse. London: Longman. Flowerdew, J. (2002b). Introduction: Approaches to the analysis of the academic discourse in English. In J. Flowerdew (Ed.), Academic discourse (pp. 1–20). Cambridge: Cambridge University Press. Flowerdew, J. & Peacock, M. (2001). Issues in EAP: A preliminary perspective. In J. Flowerdew & M. Peacock (Eds.), Research perspectives on English for academic purposes (pp. 315–359). Cambridge: Cambridge University Press. Freedman, A. & Medway, P. (Eds.) (1994). Genre and the new rhetoric. London: Taylor and Francis. Gallardo, S. (2005). Los médicos recomiendan: Un estudio de las notas periodísticas sobre salud. Buenos Aires: Eudeba. Gläser, R. (1982). The problem of style classification in LSP (ESP). Lecture given at the 3rd European Symposium on LSP, Copenhague. Gläser, R. (1993). A multi-level model for a typology of LSP genres. Fachsprache. Internacional Journal of LSP, 15(1−2), 18−26.
244 References
Gómez Macker, L. (1998). Dimensión social de la comprensión verbal. In M. Peronard, L. Gómez Macker, G. Parodi & P. Núñez (Eds.), Comprensión de textos escritos: De la teoría a la sala de clases (pp. 34–58). Santiago de Chile: Andrés Bello. Gómez Macker, L. (2005). El hombre y su palabra. Valparaíso: Ediciones Universitarias de Valparaíso. González, C. (2007). La identidad discursiva de los sujetos participantes en el género editorial de prensa. In G. Parodi (Ed.), Lingüística de corpus y discursos especializados: Puntos de mira (pp. 301–319). Valparaíso: Ediciones Universitarias de Valparaíso. Gotti, M. (2003). Specialized discourse. Linguistic features and changing conventions. Bern: Peter Lang Grabe, W. & Stoller, F. L. (2002). Teaching and researching reading. London: Pearson. Graesser, A., Singer, M. & Trabasso, T. (1994). Constructing inferences during negative text comprehension. Psychological Review, 111, 371–395. Gunnarsson, B. (1997). On the sociohistorical construction of scientific discourse. In B. Gunnarsson, P. Linell & B. Nordberg (Eds.), The construction of professional discourse (pp. 99–126). London: Longman. Hair, J., Anderson, R., Tatham, R. & Black, W. (1999). Análisis multivariante. Madrid: Prentice Hall. Halliday, M. (1978). Language as a social semiotics: The social interpretation of language and meaning. London: Arnold. Halliday, M. (1993). On language and physical science. In M. Halliday & J. Martin (Eds.), Writing science. Literacy and discursive power (pp. 54–68). Pittsburg, PA: University of Pittsburg Press. Halliday, M. (2004a). The ontogenesis of dialogue. In J. Webster (Ed.), The language of early childhood (pp. 144–152). New York, NY: Continuum. Halliday, M. (2004b). On the language of physical science (1988). In J. Webster (Ed.), The language of science (pp. 140–158). London: Continuum. Halliday, M. & Hasan, R. (1976). Cohesion in English. London: Longman. Halliday, M. & Hasan, R. (1989). 
Language, context and text aspects of language in a social-semiotic perspective. Hong Kong: Oxford. Halliday, M. & Martin, J. (1993). Writing science. Literacy and discursive power. Pittsburgh, PA: University of Pittsburgh. Hamon, P. (1991). Introducción al análisis de lo descriptivo. Buenos Aires: Edicial. Harman, D. (1992). Relevance feedback and other query modification techniques. En W. B. Frakes & R. Baeza-Yates (Eds). Information retrieval: Data structures and algorithms (pp. 241–236). Englewood Cliffs, NJ: Prentice Hall. Hartshorne, C. & Weiss, P. (Eds.) (1965). Collected papers of Charles Sanders Pierce. Cambridge, MA: Harvard University Press. Harvey, A. (2002). Representación e imagen del quehacer científico en los Medios de Comunicación. In G. Parodi (Ed.), Lingüística e Interdisciplinariedad: Desafíos del nuevo milenio. Ensayos en honor a Marianne Peronard (pp. 335–353). Valparaíso: EUVSA. Harvey, A. (Comp.) (2005). En torno al discurso: Contribuciones de América Latina. Santiago de Chile: Ediciones Universidad Católica de Chile. Hatch, E. & Lazaraton, A. (1991). The research manual. Design and statistics for applied linguistics. Boston, MA: Heinle & Heinle. Hayes, R. (1963). Mathematical models in information retrieval. Natural language and the computers. New York, NY: McGraw-Hill.
References 245
Heinemann, W. (2000). Textsorten. Zur Diskussion um Basisklassen des Kommunizierens. Rückschau uns Ausblick. In K. Adamzik (Ed.), Textsorten (pp. 9–29). Tübingen: Stauffenburg. Heinemann, W. & Viehweger, D. (1991). Textlinguistik: Eine Einfübrung. Tübingen: Niemeyer. Hernández, R., Fernández, C. & Baptista, P. (2006). Metodología de la investigación. Santiago de Chile: McGraw-Hill. Herrington, A. & Moran, C. (Eds.). (2005). Genre across the curriculum. Logan, UT: Utah State University Press. Hoey, M. (1986). The discourse colony: A preliminary study of a neglected discourse type. In M. Coulthard (Ed.), Talking about text (pp. 1–26). Birmingham: English language research. Hsu, Ch., Chang, Ch. & Lin, Ch. (2003). A practical guide to support vector classification [on line]. Available at: http://www.csie.ntu.edu.tw/~cjlin/papers/guide/guide.pdf. Hyland, K. (1998). Hedging in scientific research articles, Pragmatics & Beyond New Series 54. Amsterdam: John Benjamins. Hyland, K. (1999). Talking to Students: Metadiscourse in introductory coursebooks. English for Specific Purposes, 18(1), 3–26. Hyland, K. (2000). Disciplinary discourses: Writer stance in research articles. In C. Candlin & K. Hyland (Eds.), Writing texts, processes and practice (pp. 99–121). London: Longman. Hyland, K. (2004). Disciplinary discourses: Social interactions in academic writing. London: Longman. Hyland, K. (2007). Genre and second language writing. Ann Arbor, MI: University of Michigan Press. Hyland, K. (2008). Academic clusters: Text patterning in published and postgraduate writing. International Journal of Applied Linguistics, 18(1), 41–62. Ibáñez, R. (2007a). Cognición y comprensión. Una aproximación histórica y crítica al trabajo investigativo de Rolf Zwaan. Revista Signos, 40(63), 81–100. Ibáñez, R. (2007b). Comprensión de textos disciplinares escritos en inglés. Revista de Lingüística Teórica y Aplicada, 45, 67–85. Ibáñez, R. (2008). 
Pasos retóricos en textos disciplinares: Aproximación desde cuatro disciplinas científicas. In G. Parodi (Ed.), Géneros Académicos y Géneros Profesionales: Accesos discursivos para el saber y el hacer. Valparaíso: Ediciones Universitarias de Valparaíso. Jackson, P. & Mouliner, I. (2002). Natural language processing for online applications: Text retrieval, extraction & categorization, Natural Language Processing 5. Amsterdam: John Benjamins. Jeanneret, Y. (1994). Écrire la science. Formes et enjeux de la vulgarisation. Paris: PUF. Joachims, T. (1998). Text categorization with suport vector machines: Learning with many relevant features. Lecture notes in computer science, 1398, 137–142. Johns, A. (2002). Genre in the classroom: Multiple perspectives. Mahwah, NJ: Lawrence Erlbaum. Johnson, D. (2000). Métodos multivariados aplicados al análisis de datos. Ciudad de México: Thomson. Jurafsky, D. & Martin, J. (2000). Speech and language processing: An introduction to natural language processing, computational linguistics, and speech recognition. Englewood Cliffs, NJ: Prentice Hall.
246 References
Kang, Y. H. (2005). Representative term based feature selection method for SVM based document classification. Knowledge-based intelligent information and engineering systems, 3681, 56–61. Kanoksilapatham, B. (2007). Rhetorical moves in biochemistry research articles. In D. Biber, U. Connor & T. Upton (Eds.), Discourse on the move, Studies in Corpus Linguistics 28 (pp. 73–119). Amsterdam: John Benjamins. Kantor, R., Anderson, T. & Armbruster, B. (1983). How inconsiderate are children’s textbooks? Journal of Curriculum Studies, 15(1), 61–72. Kennedy, G. (2001). An introduction to corpus linguistics. London: Longman. King, P. (2007). Estudio multidimensional de la oralidad a partir de los textos escolares para la enseñanza del inglés como lengua extranjera. In G. Parodi (Ed.), Lingüística de corpus y discursos especializados: Puntos de mira (pp. 301–319). Valparaíso: Ediciones Universitarias de Valparaíso. Kintsch, W. (1988). The role of knowledge in discourse comprehension: A construction-integration model. Psychological Review, 95, 163–182. Kintsch, W. (1998). Comprehension: A paradigm for cognition. Cambridge: Cambridge University Press. Kintsch, W. (2002). On the notions of theme and topic in psychological process models of text comprehension. In M. Louwerse & W. van Peer (Eds.), Thematics: Interdisciplinary studies, Converging Evidence in Language and Communication Research 3 (pp. 151–170). Amsterdam: John Benjamins. Kintsch, W. & Rawson, K. (2005). Comprehension. In M. Snowling & C. Hume (Eds.), The science of reading. A handbook (pp. 209–226). Victoria: Blackwell. Koda, K. (2005). Insights into second language reading. A cross-linguistic approach. Cambridge: Cambridge University Press. Koprinska, I., Poon, J., Clark, J. & Chan, J. (2007). Learning to classify e-mail. Information Sciences, 177, 2167–2187. Kress, G. & Threadgold, T. (1988). Towards a social theory of genre. Southern Review, 21(3), 215–243. Kress, G. & van Leeuwen, T. (2001). Multimodal discourse. 
London: Arnold. Kwan, B. S. C. (2006). The schematic structure of literature reviews in doctoral theses of applied linguistics. English for Specific Purposes, 25, 30–55. Lai, C. (2007). An empirical study of three machine learning methods for spam filtering. Knowledge-Based Systems, 20, 249–254. Leech, G. (1991). The state of the art in corpus linguistics. In K. Aijmer & B. Altenberg (Eds.), English corpus linguistics. Studies in honor of Jan Svartvik (pp. 8–29). London: Longman. Lakoff, G. (1972). A study in meaning criteria and the logic of fuzzy concepts. Linguistics Society, 8, 183– 288. Lakoff, G. (1987). Women, fire and dangerous things: What categories reveal about the mind. Chicago, IL: University of Chicago Press. Lakoff, G. & Johnson, M. (1981). Metaphors We live by. Chicago, IL: University of Chicago Press. Lang, M. (1997). Formación de palabras en español. Madrid: Cátedra. Lemke, J. (1998). Multiplying meaning: Visual and verbal semiotics in scientific texts. In J. Martin & R. Veel (Eds.), Reading science: Critical and functional perspectives on discourses of science (pp. 87–113). London: Routledge. Lo Cascio, V. (1998). Gramática de la argumentación. Madrid: Alianza.
References 247
López, C. (2002). Aproximaciones al análisis de los discursos profesionales. Revista Signos, 35(51–52), 195– 215. Lorente, M. (2002). Verbos y discurso especializado. [On line] Available at: http://elies.rediris. es/elies16/Lorente.html. Louwerse, M., McCarthy, P., McNamara, D. & Graesser, A. (2004). Variation in language and cohesion across written and spoken registers. In K. Forbus, D. Gentner & T. Regier (Eds.), Proceedings of the twenty-sixth annual conference of the Cognitive Science Society (pp. 843– 848). Mahwah, NJ: Lawrence Erlbaum. Love, A. (1991). Process and product in geology: An investigation of some discourse features of two introductory textbooks. English for Especific Purposes, 10, 89–109. Love, A. (1993). Lexicogrammatical features of geology textbooks: Process and product revisited. English for Specific Purposes, 12, 197–218. Mahlberg, M. & Teubert, W. (Eds.). (2007). Text, discourse and corpora. Theory and analysis. London: Continuum. Manning, C. & Schütze, H. (2003). Foundations of statistical natural language processing. Cambridge, MA: The MIT Press. Marinkovich, J. (2001–2002). La competencia textual narrativa en adolescentes chilenos y españoles. Lenguas Modernas, 28–29, 145–164. Marinkovich, J., Peronard, M. & Parodi, G. (2006). LECTES Programa de Lectura y Escritura. Valparaíso: EUV. Martin, J. (1992). English text. System and structure. Amsterdam: John Benjamins. Martin, J. (1996). Types of structure: Deconstructing notions of constituency in clause and text. In E. Hovy & D. Scott (Eds.), Burning issues in discourse: A multidisciplinary perspective (pp. 198–234). Heidelberg: Springer. Martin, J. (1998). Discourse of science: Recontextualisation, genesis, intertextuality and hegemony. In J. Martin & R. Veel (Eds.), Reading science: Critical and functional perspectives on discourses of science (pp. 3–14). London: Routledge. Martin, J. (2002). A universe of meaning – how many practices? In A. 
Johns (Ed.), Genre in the classroom: Multiple perspectives (pp. 269–278). Mahwah, NJ: Lawrence Erlbaum. Martin, J. & Matthiessen, C. (1991). Systemic typology and topology. In F. Christie (Ed.), Literacy in social processes: Papers from the first Australian systemic linguistics conference, January 1990 (pp. 345–383). Darwin, NT: Centre for Studies of Language in Education. Martin, J. & Rose, D. (2008). Genre relations. Mapping culture. London: Equinox. Martin, J. & Veel, R. (Eds.). (1998). Reading science. Critical and functional perspectives on discourses of science. London: Routledge. Martín-Valdivia, M., García-Vega, M. & Ureña-López, L. (2003). LVQ for text categorization using a multilingual linguistic resource. Neurocomputing, 55, 665–679. Martín Zorraquino, M. & Portolés, J. (Eds.) (1999). Los marcadores del discurso. Teoría y Análisis. Madrid: Arco Libros. Masand, B. & Linoff, G. (1992). Classifying news stories using memory based reasoning. SIGIR Forum (ACM Special Interest Group on Information Retrieval), 59–65. McNamara, D. (2004). Aprender del texto: Efectos de la estructura textual y las estrategias del lector. Revista Signos, 37(55), 19–30. McNamara, D. (Ed.) (2007). Reading comprehension strategies. Theories interventions, and technologies. New York, NY: Lawrence Erlbaum/Taylor & Francis. McNamara, D. & Kintsch, W. (1996). Learning from texts: Effects of prior knowledge and text coherence. Discourse Processes, 22, 247–288.
248 References
McNamara, D., Louwerse, M. & Graesser, A. (2002). Coh-Metrix: Automated cohesion and coherence scores to predict text readability and facilitate comprehension. Memphis, TN: University of Memphis. Minsky, M. (1975). Framework for representing knowledge. In C. Winston (Ed.), The psychology of computer vision (pp. 45–71). New York, NY: MacGraw Hill. Moens, M.-F. & Uyttendaele, C. (1996). Automatic text structuring and categorization as a first step in summarizing legal cases. Information Processing & Management, 33(6), 27–737. Moens, M.-F. & Dumortier, J. (2000). Text categorization: The assignment of subject descriptors to magazine articles. Information Processing & Management, 36(6), 841–861. Molina, J. & García, J. (2004). Técnicas de análisis de datos aplicaciones prácticas utilizando Microsoft Excel y Weka [on line]. Available at: http://galahad.plg.inf.uc3m.es/~docweb/ad/ transparencias/apuntesAnalisisDatos.pdf. Montolío, E. (2001). Conectores de la lengua escrita. Barcelona: Ariel. Montolío, E. (Coord.) (2002). Manual práctico de escritura académica (3 vols.). Barcelona: Ariel. Montolío, E. & López Samaniego, A. (2008). La escritura en el quehacer judicial: Estado de la cuestión y presentación de la propuesta aplicada en la Escuela Judicial de España. Revista Signos, 41(66), 33–64. Moss, G. & Chamorro, D. (2008). La enseñanza de la ciencia sin asidero en el tiempo ni en el espacio: Análisis del discurso de dos textos escolares. Revista Lenguaje, 36(1), 87–115. Núñez, P. (2004). Hacia una caracterización discursiva de los informes escritos por universitarios. Ponencia presentada en el Segundo Congreso Internacional y Quinto Nacional sobre Lengua Escrita y Textos Académicos. Universidad Autónoma de Tlaxcala, Tlaxcala, Mexico. Núñez, P., Muñoz, A. & Mihovilovic, E. (2006). Las funciones de los marcadores de reformulación en el discurso académico en formación. Revista Signos, 39(62), 471–492. Oteíza, T. (2006). 
El discurso pedagógico de la historia: Un análisis lingüístico sobre la construcción ideológica de la historia de Chile (1970–2001). Santiago de Chile: Frasis. Parodi, G. (2003). Relaciones entre lectura y escritura: Una perspectiva cognitiva discursiva. Valparaíso: Ediciones Universitarias de Valparaíso. Parodi, G. (2004). Textos de especialidad y comunidades discursivas técnico-profesionales: Una aproximación basada en corpus computarizado. Estudios Filológicos, 39, 7–36. Parodi, G. (Ed.) (2005a). Discurso especializado e instituciones formadoras. Valparaíso: Ediciones Universitarias de Valparaíso. Parodi, G. (2005b). Compresión de textos escritos. Buenos Aires: Eudeba. Parodi, G. (2005c). Lingüística de corpus y análisis multidimensional: Exploración de la variación en el corpus PUCV-2003. In G. Parodi (Ed.), Discurso especializado e instituciones formadoras (pp. 83–125). Valparaíso: Ediciones Universitarias de Valparaíso. Parodi, G. (2006a). El Grial: Interfaz computacional para anotación e interrogación de corpus en español. Revista de Lingüística Teórica y Aplicada, 44, 91–115. Parodi, G. (2006b). Discurso especializado y lengua escrita: Foco y variación. Estudios Filológicos, 41, 165–204. Parodi, G. (2007a). Comprensión y aprendizaje a partir del discurso especializado escrito: Teoría y empíria. In G. Parodi (Ed.), Lingüística de corpus y discursos especializados: Puntos de mira (pp. 223–255). Valparaíso: Ediciones Universitarias de Valparaíso. Parodi, G. (Ed.) (2007b). Lingüística de corpus y discursos especializados: Puntos de Mira. Valparaíso: Ediciones Universitarias de Valparaíso. Parodi, G. (Ed.) (2007c). Working with Spanish corpora. London: Continuum.
Parodi, G. (2007d). El discurso especializado escrito en el ámbito universitario y profesional: Constitución de un corpus de estudio. Revista Signos, 40(63), 147–178.
Parodi, G. (2007e). El Grial: Interfaz computacional para anotación e interrogación de corpus en español. In G. Parodi (Ed.), Lingüística de corpus y discursos especializados: Puntos de mira (pp. 31–52). Valparaíso: Ediciones Universitarias de Valparaíso.
Parodi, G. (2007f). Lingüística de corpus: Puntos de mira. In G. Parodi (Ed.), Lingüística de corpus y discursos especializados: Puntos de mira (pp. 13–30). Valparaíso: Ediciones Universitarias de Valparaíso.
Parodi, G. (2007g). Reading-writing connections: Discourse-oriented research. Reading & Writing Interdisciplinary Journal, 20, 225–250.
Parodi, G. (2008a). ¿Qué es ser un lingüista en el siglo XXI?: Reflexión teórica y metateórica. Revista Signos, 41(67), 135–154.
Parodi, G. (2008b). Lingüística de corpus: Una introducción al ámbito. Revista de Lingüística Teórica y Aplicada, 46(1), 93–119.
Parodi, G. (2008c). Written genres in university studies: Evidence from a Spanish corpus in four disciplines. In C. Bazerman, A. Bonini & D. Figueiredo (Eds.), Genre in a changing world. Writing across the curriculum. West Lafayette, IN: Clearinghouse and Parlor Press.
Parodi, G. (2008d). Academic and professional genres: Similarities and differences from a corpus-based approach. In J. Renkema (Ed.), Discourse, of course. Amsterdam: John Benjamins.
Parodi, G. (2010). Lingüística de corpus: De la teoría a la empiria. Frankfurt: Iberoamericana/Vervuert.
Parodi, G. & Gramajo, A. (2003). Los tipos textuales del corpus PUCV-2003: Una aproximación multiniveles. Revista Signos, 36, 207–223.
Parodi, G. & Venegas, R. (2004). BUCÓLICO: Aplicación computacional para el análisis de textos. Hacia un análisis de rasgos de la informatividad. Revista Literatura y Lingüística, 15, 223–251.
Peronard, M. (1997). ¿Qué significa comprender un texto escrito? In M. Peronard, L. Gómez, G. Parodi & P. Núñez (Comps.), Comprensión de textos escritos: De la teoría a la sala de clases (pp. 55–78). Santiago de Chile: Andrés Bello.
Peronard, M. (2007a). La Escuela Lingüística de Valparaíso: Algunos principios fundantes. Revista Signos, 40(65), 489–494.
Peronard, M. (2007b). Lectura en papel y en pantalla de computador. Revista Signos, 40(63), 179–195.
Peronard, M. & Gómez Macker, L. (1985). Reflexiones acerca de la comprensión lingüística: Hacia un modelo. Revista de Lingüística Teórica y Aplicada, 23, 19–32.
Peronard, M., Gómez Macker, L., Parodi, G. & Núñez, P. (1998). Comprensión de textos escritos: De la teoría a la sala de clases. Santiago de Chile: Andrés Bello.
Portolés, J. (1998). Marcadores del discurso. Barcelona: Ariel.
Propp, V. (1928). Morfología del cuento. Madrid: Fundamentos.
Reppen, R., Fitzmaurice, S. & Biber, D. (Eds.) (2002). Using corpora to explore linguistic variation. Studies in Corpus Linguistics 9. Amsterdam: John Benjamins.
Rosch, E. (1975). Cognitive representations of semantic categories. Journal of Experimental Psychology: General, 104, 192–233.
Ruiying, Y. & Allison, D. (2003). Research articles from applied linguistics: Moving from results to conclusions. English for Specific Purposes, 22, 365–385.
Rumelhart, D. (1975). Notes on a schema for stories. In D. Bobrow & A. Collins (Eds.), Representation and understanding: Studies in cognitive science (pp. 185–210). New York, NY: Academic Press.
Rumelhart, D. (1980). Schemata: The building blocks of cognition. In B. Spiro, B. Bruce & W. Brewer (Eds.), Theoretical issues in reading comprehension (pp. 33–58). Hillsdale, NJ: Lawrence Erlbaum.
Rumelhart, D. & McClelland, J. (1986). Parallel distributed processing: Studies in the microstructure of cognition (2 vols.). Cambridge, MA: The MIT Press.
Russell, D. (2002). Writing in the academic disciplines: A curricular history. Carbondale, IL: Southern Illinois University Press.
Sabaj, O. (2007). Hacia una matriz de rasgos lingüísticos con impacto textual: Un estudio exploratorio. Revista Signos, 40(63), 197–218.
Salton, G. (1968). Automatic information organization and retrieval. New York, NY: McGraw-Hill.
Salton, G. & Buckley, C. (1988). Term-weighting approaches in automatic text retrieval. Information Processing and Management, 24(5), 513–523.
Salton, G. & McGill, M. (1983). Introduction to modern information retrieval. New York, NY: McGraw-Hill.
Samraj, B. (2002). Introduction in research articles: Variation across disciplines. English for Specific Purposes, 21, 1–17.
Schacter, D. & Tulving, E. (1994). Memory Systems 1994. Cambridge, MA: MIT.
Schröder, H. (1991). Linguistic and text-theoretical research on languages for special purposes. A thematic and bibliographical guide. In H. Schröder (Ed.), Subject-oriented texts. Languages for special purposes and text theory (pp. 1–48). Berlin: Walter de Gruyter.
Sharma, S. (1996). Applied multivariate techniques. New York, NY: Wiley & Sons.
Silva, C. & Ribeiro, B. (2003). On the evaluation of text processing in text categorization. In Proceedings of the 2003 International Conference on Machine Learning and Applications, 121–127.
Simon, H. A. (1979). Information processing models of cognition. Annual Review of Psychology, 30, 363–396.
Sinclair, J. (1991). Corpus, concordance, collocation. Oxford: Oxford University Press.
Solorio, T., Perez-Coutiño, M., Montes-y-Gomez, M., Villaseñor-Pineda, L. & López-López, A. (2005). Question classification in Spanish and Portuguese. Lecture Notes in Computer Science, 3406, 612–619.
Stokes, N. & Carthy, J. (2001). Combining semantic and syntactic document classifiers to improve first story detection. SIGIR Forum (ACM Special Interest Group on Information Retrieval), 424–425.
Stubbs, M. (1996). Text and corpus analysis. Oxford: Blackwell.
Stubbs, M. (2006). Corpus analysis: The state of the art and three types of unanswered questions. In S. Hunston & G. Thompson (Eds.), System and corpus: Exploring connections (pp. 15–36). London: Equinox.
Stubbs, M. (2007). On texts, corpora and models of language. In W. Teubert & M. Mahlberg (Eds.), Text, discourse and corpora (pp. 127–162). London: Continuum.
Swales, J. (1981). Aspects of article introductions. Birmingham: University of Aston.
Swales, J. (1990). Genre analysis: English in academic and research settings. Cambridge: Cambridge University Press.
Swales, J. (1995). The role of textbooks in EAP writing research. English for Specific Purposes, 14(1), 3–18.
Swales, J. (1998). Other floors, other voices: A textography of a small university building. Mahwah, NJ: Lawrence Erlbaum.
Swales, J. (2001). EAP-related linguistic research: An intellectual history. In J. Flowerdew & M. Peacock (Eds.), Research perspectives on English for academic purposes (pp. 42–54). Cambridge: Cambridge University Press.
Swales, J. (2004). Research genres: Explorations and applications. Cambridge: Cambridge University Press.
Tardy, C. (2003). A genre system view of the funding of academic research. Written Communication, 20(1), 7–36.
Téllez, A. (2005). Extracción de información con algoritmos de clasificación [on line]. Available at: http://ccc.inaoep.mx/~mmontesg/tesis%20estudiantes/TesisMaestria-AlbertoTellez.pdf.
Teubert, W. (2005). My version of corpus linguistics. International Journal of Corpus Linguistics, 10, 1–13.
Thaiss, C. & Zawacki, T. (2006). Engaged writers and dynamic disciplines. Portsmouth: Boynton/Cook.
Thompson, G. & Hunston, S. (Eds.) (2006). System and corpus. Exploring connections. London: Equinox.
Tin-Yau, J. (1998). Automated text categorization using support vector machine. ICONIP, 347–351.
Tognini-Bonelli, E. (2001). Corpus linguistics at work. Studies in Corpus Linguistics 6. Amsterdam: John Benjamins.
Torner, S. & Battaner, P. (Eds.) (2005). Corpus PAAU 1992: Estudios descriptivos, textos y vocabulario. Sèrie Monografies, 9. (1a. ed.). Barcelona: Institut Universitari de Lingüística Aplicada.
Toulmin, S. (1958). The uses of argument. London: Cambridge University Press.
Trosborg, A. (Ed.) (1997). Text typology and translation, Benjamins Translation Library 26. Amsterdam: John Benjamins.
Trosborg, A. (Ed.) (2000). Analysing professional genres, Pragmatics & Beyond New Series 74. Amsterdam: John Benjamins.
Turing, A. M. (1950). Computing machinery and intelligence. Mind, 59, 433–460.
Valle, E. (1997). A scientific community and its texts. In B. Gunnarsson, P. Linell & B. Nordberg (Eds.), The construction of professional discourse (pp. 151–172). London: Longman.
van Dijk, T. (1983). Ciencia del texto: Un enfoque interdisciplinario. Buenos Aires: Ediciones Paidós.
van Dijk, T. (1997). Cognitive context models and discourse. In M. Stamenow (Ed.), Language structure, discourse and the access to consciousness, Advances in Consciousness Research 12 (pp. 189–226). Amsterdam: John Benjamins.
van Dijk, T. (1999). Context models in discourse processing. In H. van Oostendorp & S. Goldman (Eds.), The construction of mental representations during reading (pp. 123–148). Mahwah, NJ: Lawrence Erlbaum.
van Dijk, T. (2002). Tipos de conocimiento en el procesamiento del discurso. In G. Parodi (Ed.), Lingüística e interdisciplinariedad: Desafíos del nuevo milenio. Ensayos en honor a Marianne Peronard (pp. 43–66). Valparaíso: Ediciones Universitarias de Valparaíso.
van Dijk, T. (2006). Discourse, context and cognition. Discourse Studies, 8(1), 159–177.
van Dijk, T. (2008). Discourse and context. A sociocognitive approach. Cambridge: Cambridge University Press.
van Dijk, T. & Kintsch, W. (1983). Strategies of discourse comprehension. New York, NY: Academic Press.
van den Broek, P. & Gustafson, M. (1999). Comprehension and memory for texts: Three generations of reading research. In S. Goldman, A. Graesser & P. van den Broek (Eds.), Narrative comprehension, causality and coherence. Essays in honor of Tom Trabasso (pp. 15–34). Hillsdale, NJ: Lawrence Erlbaum.
Vapnik, V. (2000). The nature of statistical learning theory. New York, NY: Springer.
Venegas, R. (2005). Las relaciones léxico-semánticas en artículos de investigación científica: Una aproximación desde el Análisis Semántico Latente. Tesis doctoral, Pontificia Universidad Católica de Valparaíso, Valparaíso, Chile.
Venegas, R. (2006). La similitud léxico-semántica en artículos de investigación científica en español: Una aproximación desde el Análisis Semántico Latente. Revista Signos, 39(60), 75–106.
Venegas, R. (2007a). Clasificación de textos académicos en función de su contenido léxico-semántico. Revista Signos, 40(63), 239–271.
Venegas, R. (2007b). Using latent semantic analysis in a Spanish research article corpus. In G. Parodi (Ed.), Working with Spanish corpora (pp. 195–216). London: Continuum.
Venegas, R. (2008). Interfaz computacional de apoyo al análisis textual: “El Manchador de Textos”. Revista de Lingüística Teórica y Aplicada, 46(2), 53–79.
Vine, B. (2004). Getting things done at work. The discourse of power in workplace interaction, Pragmatics & Beyond New Series 124. Amsterdam: John Benjamins.
Virtanen, T. (Ed.) (2004). Approaches to cognition through text and discourse. Berlin: Mouton de Gruyter.
von Neumann, J. (1958). The computer and the brain. New Haven, CT: Yale University Press.
Weick, K. (1982). Psicología social del proceso de organización. Mexico: Interamericana.
Wignell, P. (1998). Technicality and abstraction in social science. In J. Martin & R. Veel (Eds.), Reading science. Critical and functional perspectives on discourses of science (pp. 297–326). London: Routledge.
Wignell, P. (2007a). On the discourse of social science. Darwin, NT: Charles Darwin University Press.
Wignell, P. (2007b). Vertical and horizontal discourse and the social sciences. In J. Martin & F. Christie (Eds.), Language, knowledge and pedagogy. Functional linguistics and sociological perspectives (pp. 184–204). London: Continuum.
Williams, G. (1998). Collocational networks: Interlocking patterns of lexis in a corpus of plant biology research articles. International Journal of Corpus Linguistics, 4(2), 121–132.
Wittgenstein, L. (1958). Philosophical investigations. Oxford: Blackwell.
Yang, Y. & Pedersen, J. (1997). A comparative study on feature selection in text categorization [on line]. Available at: http://citeseer.ist.psu.edu/yang97comparative.html.
Zazo, A., Figuerola, C., Alonso, J. L. & Gómez, R. (2002). Recuperación de información utilizando el modelo vectorial [on line]. Available at: http://tejo.usal.es/inftec/2002/DPTOIAIT-2002-006.pdf.
Zhang, W., Yoshida, T. & Tang, X. (2008). Text classification based on multi-word with support vector machine. Knowledge-based systems [on line]. Available at: http://www.sciencedirect.com.
Index
A abstraction 13, 15, 91, 146–147, 150–153, 158, 168, 171, 177, 179–182, 184, 193, 195–196, 235, 237 academic corpus 15, 40, 63, 67, 69–71, 73, 75–76, 78, 80, 83, 88–90, 92–94, 98, 131, 143, 148, 151, 189 academic discourse 15, 30–31, 33, 66, 76, 84–87, 95–99, 104, 118, 121–123, 190, 192, 213, 215, 224, 230 academic discourse comprehension 15, 213 academic genres 11, 13, 29, 34, 59–62, 65, 71–72, 83, 93, 95–96, 101–102, 106, 114, 118–119, 122, 143, 190, 192, 236 academic literacy 17, 33, 35, 122, 183–184 academic purposes 76, 87, 123, 190, 214 academic settings 11, 76–77, 81, 84–85, 96, 98, 123, 213 academic text 14, 16, 121, 190, 213–218, 220–227, 229 academic writing 5, 94, 145 automatic text classification 121 B Bayes 14, 121–122, 126, 135–137, 139 bottom-up 9, 15, 39, 147, 149, 168, 195 C classification 12, 14, 37–39, 41, 58–63, 73, 88, 104, 119, 121–122, 124–141, 147, 152, 160, 196
classification methods 14, 121–122, 135, 140 classifier 125–129, 133–134, 136–139 cognitive 2, 7, 9–10, 12, 17–25, 27–28, 33–35, 37–39, 72, 82, 85, 87, 93, 96, 99, 149, 158, 190–191, 214–215, 217–218, 222, 224, 229, 233, 235, 238 cognitive constructs 19–21, 23, 27, 35 cognitive dimension 18–21, 23, 34–35, 191, 233 cognitive representation 20, 22, 27–28 cognitive structures 28, 217 colony 14, 143–145, 157, 164, 167–168, 173 communicative event 22, 42, 191 communicative functions 27, 34, 124, 150 communicative macro-purposes 41, 96, 143 communicative purpose 15, 28, 42, 44, 71, 74–75, 144–147, 150–153, 156, 159, 161–165, 167, 179, 185–187, 189–192, 196–198, 210, 215 community 16, 29, 33–34, 39, 43–44, 77, 87, 92, 95–96, 99, 146, 148, 158, 173, 182, 189, 191–192, 200, 205, 210–211, 213–216, 235 comprehension 159, 178, 185, 218, 221–223 context 12, 20–23, 26–29, 33, 35, 37, 39, 41–42, 45–51, 55, 57, 59–60, 85, 95–96, 98, 106, 114, 134, 156, 159–160, 185, 190–191, 208, 221, 237
context of circulation 12, 41, 45–46, 48–50, 55, 57, 59–60, 95, 106 continuum 20, 29, 31–32, 77, 85–86, 91, 98, 123 corpus-based 8, 39, 84, 98, 141, 145, 149, 180, 236–237 corpus-driven 147 D deductive 12, 37, 39, 41, 63, 149, 168, 234, 236 dimension 18–21, 23–24, 34–35, 38, 93, 103–104, 107–119, 191, 233–234 dimensional 9–10, 12–13, 18–19, 37–38, 63, 101–104, 108, 119, 122, 128, 145 disciplinary community 16, 189, 191, 205, 210, 213–214 discourse 2, 7, 12, 17, 21, 26, 31, 37, 44, 46–49, 52, 54–55, 76, 96–97, 104, 117, 190, 192, 214 discourse analysis 11, 18, 25, 149, 195, 214 discourse community 29, 33, 39, 43–44, 77, 95–96, 146, 148, 158, 173, 182, 192, 215–216, 235 discourse comprehension 15, 82, 155, 213, 215, 224, 230 discourse genre 2, 12, 17–18, 22, 26, 38, 42, 47–57, 66, 81, 145, 157, 171, 174, 183, 191, 205, 218, 229 discourse organisation 34, 41, 44–57, 59, 74, 81, 145, 157, 209, 222, 236 discourse organisation mode 41, 44–57, 59, 81, 145, 236 discourse unit 147, 152–153, 157, 192, 196
E elgrial 2, 10, 67–68, 70, 89, 105, 107–108, 132, 148, 160 expert writer 43–44, 46–57 G genre 19, 42, 84, 106, 172, 190 genre analysis 8, 10–11, 13, 18, 147, 189–190 I inductive 12, 37, 39, 41, 63, 149, 168, 195, 234, 236 interactive 13, 101, 104, 108, 110–111, 114–119 L labour context 47–49, 60 lexicogrammatical 2, 5, 13, 23, 77, 101–104, 108, 114, 116, 118–119, 121–122, 224, 236–237 lexicogrammatical description 13, 101–102 lexicogrammatical features 2, 13, 77, 101–104, 108, 118–119, 121–122, 236–237 linguistic dimension 18–19, 23, 38, 191 linguistic features 32, 74, 81, 84–85, 87, 102, 111, 117, 123, 195 linguistic patterns 122, 184 M macro-level 14, 143–144, 153, 166, 172, 179, 236 macro-move 14, 143–144, 151, 153–162, 164–165, 167–168, 172–179, 182, 185–187, 193, 197–204, 206–207, 210, 236 Macro-move 155–156, 158–161, 164–165, 173, 175, 179, 182, 185–187, 197–199, 201–202 macro-purpose 12, 42, 46–58, 95, 106, 153, 156, 166, 179, 184, 190, 192–193, 200, 204, 210 macro-purposes 41–42, 46, 58, 96, 143, 155 matrix 41–42, 46–47, 63, 125, 133 mental representation 19–20, 144, 215
move 10–11, 14–15, 31, 143–147, 150–168, 172–179, 182, 185–187, 192–193, 196–208, 210, 213, 231, 236–238 move analysis 10–11, 146, 152, 196, 238 move and step 10, 147, 152, 161, 165, 186–187, 196 multi-dimensional 9–10, 12–13, 18–19, 37–38, 63, 101–104, 108, 119, 122, 145 multi-dimensional analysis 9, 101–102, 104, 108, 119 multimodal 45–55, 57, 84, 87, 161, 186 N Naive Bayes 121–122, 126, 135–137, 139 Naive Bayesian classifier 127–128 P patterns 15, 21, 82, 84, 101–102, 122, 133, 140, 171, 176, 184, 236–237 professional 1–2, 5, 7, 9–13, 17–18, 29–35, 37–40, 45, 58–63, 65–69, 73–82, 85–87, 92, 95–96, 98, 102–103, 107, 111, 121, 123, 144, 157, 183, 194, 231, 234–236 Professional Corpus 2, 12, 37, 40, 68, 73–74, 76, 78–81, 111, 194 professional discourse 2, 10–11, 31–32, 65, 85–87, 123 professional education 29, 144, 157 professional genres 7, 13, 17–18, 29–31, 33, 38, 58, 65, 75–76, 78, 236 professional settings 9–10, 13, 30, 62, 65–66, 76–77, 79–81, 86, 235 professional workplaces 10–11, 13, 66, 234–235 professional written discourse 7, 66 proficiency 213–216, 219–220, 224–229
prototypical features 69, 81, 85, 173, 184 purpose 1, 12, 15, 26, 28, 34, 39, 42, 44–58, 71, 74–75, 95, 106, 122, 144–147, 150–153, 156, 159–167, 171, 178–179, 184–187, 189–193, 196–200, 203–204, 206, 210, 215 R representation 19–20, 22, 27–28, 105, 124–126, 140, 144, 215, 217–218, 222, 229 rhetorical moves 10, 14–15, 74, 145, 147, 151, 169, 171, 183, 192–193, 199–203 rhetorical organisation 72, 94, 143–145, 149, 151, 153, 155–156, 160–161, 165–169, 171–172, 174–176, 183, 186–187, 189–190, 192, 210, 215, 222 rhetorical steps 15, 146, 176–178, 182, 184, 189, 198, 200–201, 203, 206–210 rhetorical structure 72, 81, 144, 146, 154 S semantic 14, 85, 103, 119, 121, 132–133, 150, 191 semantic content words 14, 121, 132–133 skills 15–16, 77, 80, 82, 85, 163, 190, 211, 213–216, 223, 228–229, 231 social dimension 19, 23, 191 sociocognitive 2, 10, 12, 17, 23, 28, 32, 34, 145, 181, 233 specialised discourse 7, 31–33, 66, 82, 84–87, 94, 101–102, 116, 119, 123, 182, 235, 237 specialised written genres 63, 77, 83 strategies 15, 33, 71, 99, 107, 161, 163–164, 173, 186, 192, 210, 229–230, 234–235 structure 1, 7, 9, 23, 44, 54, 72, 81, 83, 144, 146, 150, 153–156, 158–159, 161, 165, 169, 185–187, 197, 200, 236–237
summarising strategies 163–164, 173 Support Vector Machine 14, 121–122, 135–137, 139 T technical-professional 29, 103, 107, 111, 144 text classification 121–122, 124, 128, 131, 138–139 textual features 25, 123 top-down 9, 15, 39, 149, 168, 195, 234
W writer 21, 25, 27, 33–34, 43–57, 155, 158, 163–164, 166, 173, 178–180, 182, 235 written 1, 5, 7, 9–13, 15–18, 21, 23–24, 27, 29–30, 32–34, 39, 45, 63, 65–70, 73, 77–78, 80–85, 88, 90, 94, 97–98, 101–105, 107, 110–111, 116, 118–119, 121–123, 131, 174, 180, 182, 184, 189, 192, 194, 213–215, 218, 223, 225–231, 234–235, 237 written communication 15, 66–67, 82–83, 94, 189
written discourse 7, 9, 17, 21, 23, 32–34, 63, 66, 78, 83, 98, 213, 215, 234–235, 237 written discourse genres 17 written genres 10, 23, 63, 66, 77, 82–84, 97, 119, 182, 235 written in English 15–16, 180, 213–214, 223, 225, 227, 229–230 written material 34, 39, 66–67, 69–70, 80, 88, 90, 194 written registers 102–104 written texts 23–24, 66, 68, 73, 85, 88, 98, 111, 119
In the series Studies in Corpus Linguistics (SCL) the following titles have been published thus far or are scheduled for publication: 40 Parodi, Giovanni (ed.): Academic and Professional Discourse Genres in Spanish. 2010. xii, 255 pp. 39 Gilquin, Gaëtanelle: Corpus, Cognition and Causative Constructions. 2010. xvii, 326 pp. 38 Murphy, Bróna: Corpus and Sociolinguistics. Investigating age and gender in female talk. 2010. xviii, 231 pp. 37 Balasubramanian, Chandrika: Register Variation in Indian English. 2009. xviii, 284 pp. 36 Quaglio, Paulo: Television Dialogue. The sitcom Friends vs. natural conversation. 2009. xiii, 165 pp. 35 Römer, Ute and Rainer Schulze (eds.): Exploring the Lexis–Grammar Interface. 2009. vi, 321 pp. 34 Friginal, Eric: The Language of Outsourced Call Centers. A corpus-based study of cross-cultural interaction. 2009. xxii, 319 pp. 33 Aijmer, Karin (ed.): Corpora and Language Teaching. 2009. viii, 232 pp. 32 Cheng, Winnie, Chris Greaves and Martin Warren: A Corpus-driven Study of Discourse Intonation. The Hong Kong Corpus of Spoken English (Prosodic). 2008. xi, 325 pp. (incl. CD-Rom). 31 Ädel, Annelie and Randi Reppen (eds.): Corpora and Discourse. The challenges of different settings. 2008. vi, 295 pp. 30 Adolphs, Svenja: Corpus and Context. Investigating pragmatic functions in spoken discourse. 2008. xi, 151 pp. 29 Flowerdew, Lynne: Corpus-based Analyses of the Problem–Solution Pattern. A phraseological approach. 2008. xi, 179 pp. 28 Biber, Douglas, Ulla Connor and Thomas A. Upton: Discourse on the Move. Using corpus analysis to describe discourse structure. 2007. xii, 290 pp. 27 Schneider, Stefan: Reduced Parenthetical Clauses as Mitigators. A corpus study of spoken French, Italian and Spanish. 2007. xiv, 237 pp. 26 Johansson, Stig: Seeing through Multilingual Corpora. On the use of corpora in contrastive studies. 2007. xxii, 355 pp. 25 Sinclair, John McH. and Anna Mauranen: Linear Unit Grammar. Integrating speech and writing. 2006. 
xxii, 185 pp. 24 Ädel, Annelie: Metadiscourse in L1 and L2 English. 2006. x, 243 pp. 23 Biber, Douglas: University Language. A corpus-based study of spoken and written registers. 2006. viii, 261 pp. 22 Scott, Mike and Christopher Tribble: Textual Patterns. Key words and corpus analysis in language education. 2006. x, 203 pp. 21 Gavioli, Laura: Exploring Corpora for ESP Learning. 2005. xi, 176 pp. 20 Mahlberg, Michaela: English General Nouns. A corpus theoretical approach. 2005. x, 206 pp. 19 Tognini-Bonelli, Elena and Gabriella Del Lungo Camiciotti (eds.): Strategies in Academic Discourse. 2005. xii, 212 pp. 18 Römer, Ute: Progressives, Patterns, Pedagogy. A corpus-driven approach to English progressive forms, functions, contexts and didactics. 2005. xiv + 328 pp. 17 Aston, Guy, Silvia Bernardini and Dominic Stewart (eds.): Corpora and Language Learners. 2004. vi, 312 pp. 16 Connor, Ulla and Thomas A. Upton (eds.): Discourse in the Professions. Perspectives from corpus linguistics. 2004. vi, 334 pp. 15 Cresti, Emanuela and Massimo Moneglia (eds.): C-ORAL-ROM. Integrated Reference Corpora for Spoken Romance Languages. 2005. xviii, 304 pp. (incl. DVD). 14 Nesselhauf, Nadja: Collocations in a Learner Corpus. 2005. xii, 332 pp. 13 Lindquist, Hans and Christian Mair (eds.): Corpus Approaches to Grammaticalization in English. 2004. xiv, 265 pp. 12 Sinclair, John McH. (ed.): How to Use Corpora in Language Teaching. 2004. viii, 308 pp. 11 Barnbrook, Geoff: Defining Language. A local grammar of definition sentences. 2002. xvi, 281 pp. 10 Aijmer, Karin: English Discourse Particles. Evidence from a corpus. 2002. xvi, 299 pp. 9 Reppen, Randi, Susan M. Fitzmaurice and Douglas Biber (eds.): Using Corpora to Explore Linguistic Variation. 2002. xii, 275 pp.
8 Stenström, Anna-Brita, Gisle Andersen and Ingrid Kristine Hasund: Trends in Teenage Talk. Corpus compilation, analysis and findings. 2002. xii, 229 pp. 7 Altenberg, Bengt and Sylviane Granger (eds.): Lexis in Contrast. Corpus-based approaches. 2002. x, 339 pp. 6 Tognini-Bonelli, Elena: Corpus Linguistics at Work. 2001. xii, 224 pp. 5 Ghadessy, Mohsen, Alex Henry and Robert L. Roseberry (eds.): Small Corpus Studies and ELT. Theory and practice. 2001. xxiv, 420 pp. 4 Hunston, Susan and Gill Francis: Pattern Grammar. A corpus-driven approach to the lexical grammar of English. 2000. xiv, 288 pp. 3 Botley, Simon Philip and Tony McEnery (eds.): Corpus-based and Computational Approaches to Discourse Anaphora. 2000. vi, 258 pp. 2 Partington, Alan: Patterns and Meanings. Using corpora for English language research and teaching. 1998. x, 158 pp. 1 Pearson, Jennifer: Terms in Context. 1998. xii, 246 pp.