CORPUS LINGUISTICS AND TEXTUAL HISTORY
A Computer-Assisted Interdisciplinary Approach to the Peshitta
Edited by P.S.F. van Keulen and W.Th. van Peursen
Van Gorcum
STUDIA SEMITICA NEERLANDICA
edited by Prof. dr. P.C. Beentjes, Prof. dr. W.J. van Bekkum, Dr. M.P.L.M. Bernards, Dr. W.C. Delsman, Dr. M.L. Folmer, Prof. dr. J. Hoftijzer, Prof. dr. T. Muraoka, Prof. dr. K.A.D. Smelik, Prof. dr. H.J. Stroomer, Prof. dr. E. Talstra, Prof. dr. K. van der Toorn, Prof. dr. K.R. Veenhof
Volume 48
Submission of manuscripts
Manuscripts should be submitted to the editor of Van Gorcum Publishers, P.O. Box 43, 9440 AA Assen, The Netherlands. E-mail: [email protected]
Each manuscript submitted is reviewed by two reviewers.
The reviewers will not be identified to the authors.
system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording, or otherwise, without the prior permission of the Publisher.
ISBN 90 232 4194 0
Printed by: Royal Van Gorcum, Assen, The Netherlands
Contents
Acknowledgments
Transliteration Alphabet
Introduction
    Wido van Peursen
Part One: Papers Presented at the CALAP Seminar
CALAP: An Interdisciplinary Debate between Textual Criticism, Textual History and Computer-Assisted Linguistic Analysis
    Konrad D. Jenner, Wido van Peursen and Eep Talstra
How to Transfer the Research Questions into Linguistic Data Types and Analytical Instruments?
    Eep Talstra, Konrad D. Jenner and Wido van Peursen
A Discourse on Method: Basic Parameters of Computer-Assisted Linguistic Analysis on Word Level
    Hendrik Jan Bosman and Constantijn J. Sikkel
Response to 'A Discourse on Method' by Hendrik Jan Bosman and Constantijn J. Sikkel
    Pier G. Borbone
Response to Pier G. Borbone
    Hendrik Jan Bosman and Constantijn J. Sikkel
On not Putting Descartes before D. Hume: Balancing Rationalism and Empiricism in Corpus Tagging. Comments on 'A Discourse on Method' by Hendrik Jan Bosman and Constantijn J. Sikkel
    A. Dean Forbes
Response to A. Dean Forbes
    Hendrik Jan Bosman and Constantijn J. Sikkel
Data Preparation: What are we Doing and Why Should we?
    Janet W. Dyk
Response to 'Data Preparation: What are we Doing and Why Should we?' by Janet W. Dyk
    Geoffrey Khan
Three Approaches to the Tripartite Nominal Clause in Syriac
    Wido van Peursen
Comments on 'Three Approaches to the Tripartite Nominal Clause in Syriac' by Wido van Peursen
    Gideon Goldenberg
Comments on 'Three Approaches to the Tripartite Nominal Clause in Syriac' by Wido van Peursen
    Jan Joosten
A Response to 'Three Approaches to the Tripartite Nominal Clause in Syriac' by Wido van Peursen and a Bit More
    Takamitsu Muraoka
Response to the Responses
    Wido van Peursen
Points of Agreement between the Targum and Peshitta Versions of Kings against the MT: a Sounding
    Percy S.F. van Keulen
Response to 'Points of Agreement between the Targum and Peshitta Versions of Kings against the MT' by Percy S.F. van Keulen
    Bas ter Haar Romeny
A Reply to 'Points of Agreement between the Targum and Peshitta Versions of Kings against the MT' by Percy S.F. van Keulen
    Donald M. Walter
Part Two: I Kings 2:1-9
Textual Features of the Peshitta of I Kings 2:1-9
    Percy S.F. van Keulen
Worked Examples from I Kings 2:1-9: Word Level Analysis
    Hendrik Jan Bosman and Constantijn J. Sikkel
I Kings 2:1-9: Some Results of a Structured Hierarchical Approach
    Janet W. Dyk
Lexical Correspondence and Translation Equivalents: Building an Electronic Concordance
    Janet W. Dyk
Nominal Clauses in the Peshitta of I Kings 2:1-9
    Wido van Peursen
Exegetical and Text-Historical Differences from the MT in the Peshitta Version of I Kings 2:1-9
    Percy S.F. van Keulen
Epilogue: The Peshitta of I Kings 2:1-9 from a Linguistic and Text-Historical Perspective
    Wido van Peursen
Index of passages
Index of authors
Acknowledgments
The present volume has its roots in the CALAP seminar held 3-4 April 2003 at the Netherlands Institute for Advanced Study (NIAS) in Wassenaar, the Netherlands. It is therefore appropriate here to thank those who granted the funds that made the seminar possible: The Netherlands Organization for Scientific Research (NWO), the Peshitta Foundation, and the Leiden Institute for the Study of Religions (LISOR). Regarding the preparation of this volume, special thanks are due to Dr Konrad D. Jenner, who read the draft meticulously and whose corrections and suggestions led to improvements at many points. We are indebted to Mrs Helen Richardson-Hewitt, who performed the task of checking and correcting the English of a number of articles in a most conscientious way. We are also grateful to Mrs Madelon A. Grant and Mrs Roelien E. Smit for their editorial assistance. Finally we wish to thank the editorial board of Studia Semitica Neerlandica for accepting the manuscript for the series.
Leiden, July 2005
Percy van Keulen Wido van Peursen
Transliteration alphabet
Sometimes quotations are given in the following transliteration alphabet used by the WIVU/CALAP databases: >BGDHWZXVJKLMNS
Introduction
Wido van Peursen
The main part of this volume consists of the papers presented at the CALAP seminar 2003. CALAP (Computer-Assisted Linguistic Analysis of the Peshitta) is a joint research project of the Peshitta Institute Leiden (PIL) and the Werkgroep Informatica Vrije Universiteit (WIVU). The main focus of this project is on the language of the Peshitta, its character as a translation and its textual transmission. In order to understand the Syriac translation of the Bible, it is essential to bring into focus matters having to do with the separate language systems, with the interaction between characteristics of the language systems in the translation from Hebrew (source language) into Syriac (target language) and with the translation techniques at work in this translation. To address these questions Konrad Jenner, the director of the PIL, invited Eep Talstra, head of the WIVU, to set up a joint research project. This project was born in the late nineties of the twentieth century under the name CALAP.
A project group, an executive committee and an advisory board were established. The project group consists of Hendrik Jan Bosman, Janet Dyk, Percy van Keulen, Wido van Peursen and Constantijn Sikkel. The executive committee consists of Pier G. Borbone (Pisa), Donald M. Walter (Elkins, USA) and the two initiators of the project. The members of the advisory board are Jan Joosten (Strasbourg), Geoffrey Khan (Cambridge), Arie van der Kooij (Leiden) and Takamitsu Muraoka (Leiden).
The idea to set up a joint research project arose from the conviction that a combination of the expertise represented by the two cooperating institutions would be a sine qua non for an analysis of the Peshitta which is linguistically elegant, technically efficient and exegetically, philologically and text-historically significant.
The PIL is an internationally acknowledged centre of expertise for the study of the text, transmission and interpretation history of the Peshitta and its relation to the other Ancient Versions. Its projects include the preparation of a text edition of the Peshitta (about three-quarters of the edition is complete; the last four volumes are scheduled to be published in the near future), a concordance in six volumes (the first volume appeared in 1997) and a New English Annotated Translation of the Syriac Bible (NEATSB; the first fascicles will appear in 2006 or 2007). The WIVU is an international centre of expertise for the computer-assisted linguistic research of the Hebrew Bible. It was one of the pioneering project groups when information technology entered the scene of biblical studies in the seventies of the twentieth century and it has continued research into technical and methodological aspects related to the application of information technology to the study of the Bible.
The interaction between information technology, linguistics and textual criticism and textual history is one of the innovative aspects of CALAP. Even though the use of the computer has become increasingly important in biblical studies over the last few decades, a combination or confrontation with diachronic text-critical and text-historical approaches has hardly ever taken place. Rather, there is quite often mutual misunderstanding between computer linguists and 'traditional' scholars in the field of linguistics and textual analysis. The mutual misunderstanding is partly due to the fact that the disciplines involved differ considerably, each having its own history and traditions.
On the one hand, many 'traditional' linguists or biblical scholars do not see the advantage of computer-assisted research. Some regard the computer as a tool that can be used, for example, as a search engine. In their view the computer may be a time-saving instrument, but not a tool that can help in formulating or even answering research questions. Others are somehow aware that the computer can do more than function as a search engine, but do not know exactly what can be done with the computer. With some vague notions about artificial intelligence they expect that the computer can give the answers where the human researchers fail. It seems to escape them that computer-assisted research includes a complicated interaction between the human researcher and the computer, whatever method or research strategy is followed.
On the other hand, those who are more acquainted with computer linguistics than with biblical studies will be confronted with some complexities related to the study of the Bible that do not occur in other areas of computer-assisted research of text corpora. In much research that is based, for example, on large corpora of contemporary English, the functional analysis of a text is not considered problematic or controversial, and the labels or tags given to forms and structures may concern formal characteristics and semantic functions indiscriminately. However, the assumption that we can understand the text and that we know the functions and meanings of the linguistic elements of which it consists is not valid in the case of ancient texts. In the study of the Bible or other texts from the remote past, there is a fundamental difference between the formal registration of data and their functional analysis, much more than in other fields of computer linguistics. Another problem concerns the way in which 'the text' has come down to us. It is scholarly unsound to take one manuscript or edition as 'the text'. Accordingly, in the computer-assisted analysis of the Hebrew Bible and its Ancient Versions one has to find a procedure that accounts for variant readings in the manuscripts. For a 'traditional' Old Testament scholar this sounds self-evident; for a computer linguist this is an unwelcome complication.
If computer linguistics and textual history are so different, one may wonder whether an attempt to combine the two disciplines is worthwhile or even whether interaction between the two is possible at all. The initiators of the CALAP project thought that it was. In other fields of linguistic and textual studies, computer-assisted research has demonstrated its worth. If applied to the linguistic and textual analysis of the Bible, it can be valuable as well, provided that due attention is paid to specific problems involved in the study of ancient texts which originated in cultures very different from ours and which may have had a long and complicated transmission history.
The main focus of the papers presented at the CALAP seminar 2003 was the methodology of the interaction between information technology, linguistics and textual criticism and textual history. This interaction is complicated by the fact that each of the disciplines involved itself consists of a broad field of methods, currents of thought and ongoing debates. It is not enough to organise a dialogue between 'the programmer', 'the linguist' and 'the textual critic' as if each of them represents a homogeneous field of studies. For example, the position of 'the linguist' in relation to 'the programmer' and 'the textual critic' will depend much on the position that he or she takes in ongoing debates about linguistic theory and analysis.
The methodological issues that were the central themes of the CALAP seminar are not only relevant to those working in the field of 'Bible and computer'. The fruitful discussions with scholars from several disciplines, some of them with little affinity for information technology, showed that the questions involved are relevant to a broad area of linguistic and textual studies. These questions concern, for example, the confrontation between synchronic and diachronic approaches, the role of linguistic analysis in the interpretation of texts, the interaction of linguistic theory and the analysis of linguistic data, and the balance between elements that can be registered formally and the functional interpretation of these elements. Because of this relevance of the questions involved it was decided that the proceedings of the seminar as well as some additional contributions should be published.
In the first article in this volume Konrad D. Jenner, Wido van Peursen and Eep Talstra give a survey of the discussions preceding and during the CALAP project. Their focus is on the confrontation between synchrony and diachrony, between the formal description of linguistic aspects and the text-critical and text-historical study of manuscripts. Before the start of CALAP the computer-assisted research at the WIVU was mainly focused on a synchronic linguistic analysis of the Masoretic Text of the Hebrew Bible as represented in the Biblia Hebraica Stuttgartensia (BHS). CALAP added new dimensions to this research because it concerned the comparison of two textual witnesses with two different language systems. This raised two questions. The first question is how we define 'language system'. In a structuralistic, synchronic approach, the language system consists of the formal categories of grammar and syntax. In recent years, however, linguists have become increasingly aware of the need to take into account also psycho-linguistic, socio-linguistic and pragmatic factors. The second question relates to the textual basis for a description of the language system. In the comparison of the Hebrew Bible and the Peshitta text-critical questions become urgent. As to the Hebrew text, it cannot be assumed a priori that the Hebrew source text of the Peshitta was identical with the consonantal framework of the Masoretic Text as preserved in the Leningrad Codex. As to the Syriac text, the Leiden Peshitta edition does not give a critically established 'original' text, but a representation of the so-called BTR-text, an average text that is not attested in any extant biblical manuscript. The editors of the Leiden edition never intended to present a text that could be studied in isolation as 'the Peshitta' without the critical apparatus attached to it.
In their second contribution Talstra, Jenner and Van Peursen discuss the question of how the research questions can be transferred into linguistic data types and analytical instruments. The article contains two main sections. The first section addresses the production and presentation of text data. It gives a short explanation of the treatment of the Hebrew data according to the model that has been developed at the WIVU over the last decades. This is followed by a presentation of a new dimension added to the research by CALAP, namely the way in which the computer programs and the analytical procedures that had been used for Biblical Hebrew were adapted so that they could be applied to Classical Syriac. The second section of this paper discusses the instruments that have been designed for the synoptic presentation and comparison of the Hebrew and Syriac data.
The introduction into the computer-assisted research performed in CALAP by Talstra, Jenner and Van Peursen is followed by the contribution of Hendrik Jan Bosman and Constantijn J. Sikkel, who locate the CALAP model of computer-assisted linguistic analysis within the broad and heterogeneous area of computer linguistics. As we said above, computer linguistics is not a uniform field of studies. Bosman and Sikkel contrast corpus linguistics, which has as its aim to describe and explain human language, with natural language processing, the purpose of which is the production of robust and efficient annotation systems. In corpus linguistics the text is an instrument for the linguistic analysis; in natural language processing the linguistic theory is an instrument for the analysis or production of texts. The former usually has data-driven parsing systems, while those of the latter are usually rule-based. The article includes a discussion of specific aspects of CALAP and a comparison with other computer corpora of biblical texts.
In his response to Bosman and Sikkel, Pier G. Borbone emphasises the need to distinguish between 'the text' and the concrete evidence for it (the manuscripts). In their reaction to Borbone's response Bosman and Sikkel describe the way in which variant readings are accounted for in CALAP.
Another response to Bosman and Sikkel is A. Dean Forbes' article. Forbes, too, has many years of experience in the field of computer linguistics. His response shows that the term 'computer linguistics' covers a wide variety of sub-disciplines and approaches. Much depends on the answer given to the question of how one can find a proper balance between a rule-based ('rationalistic') approach and a data-driven ('empiristic') approach, and between a bottom-up and a top-down analysis. The answers to these questions relate to the underlying linguistic model, the goals of research, the linguistic categories and the tagsets one uses, etc.
The contribution by Bosman and Sikkel, as well as their discussion with Borbone and Forbes, takes the perspective of the programmer and the computer linguist. They address questions such as What specific opportunities are involved in the analysis of a Semitic language with a rich morphology? and What are the textual problems one should take into account in the analysis of ancient texts? The article by Janet W. Dyk shows that linguists approach computer-assisted research with very different questions. Their first concern is not how we should prepare data, but the question is What are we doing and why should we? (the title of Dyk's paper). What advantage is there in the computer-assisted research? It is demonstrated how the study of the Syriac and Hebrew language systems can benefit from computer-assisted research. Dyk describes the data preparation at the lower linguistic levels, already briefly touched upon in the contribution of Talstra, Jenner and Van Peursen. She focuses on data preparation at word level and phrase level. The result of the data preparation of the Hebrew and Syriac texts according to the CALAP system is an arrangement of data which makes it possible to compare the two versions at various levels, such as lexical entries, phrase structure, verbal valency, use of conjunctions, clause types, distribution of verbal tenses, clause structure, etc.
Geoffrey Khan argues that the clear-cut choices required by the computational parsing presented in Dyk's paper are not always satisfactory. Historical change in language causes opacity and fuzziness in grammatical parsing and category assignment. This raises again the question as to the object of linguistic research. Since languages are the product of historical development and subject to ongoing change, linguistic research should go beyond a synchronic analysis of the 'language system' and take other factors into account as well.
Whereas the contribution by Bosman and Sikkel and their discussion with Borbone and Forbes show that the terms 'information technology' and 'computer-linguistics' cover a wide variety of sub-disciplines and approaches, Wido van Peursen's contribution and his discussion with Gideon Goldenberg, Jan Joosten and Takamitsu Muraoka demonstrate that the same applies for 'Syriac linguistics'. Van Peursen discusses three approaches to the tripartite nominal clause. He claims that the gap between approaches that are at first sight irreconcilable (especially that between the approaches of Muraoka and Goldenberg) can be bridged if a clear distinction is made between grammatical and pragmatic levels of description and if diachronic factors are taken into account as well. As to the computer-linguistic analysis, Van Peursen shows how the application of the so-called form-to-function principle leads to a formal registration of linguistic data which does not require an a priori choice between the different views on the function of the enclitic personal pronoun.
In the discussion about the nominal clause in Syriac, clauses of the type NO< � play an important role, because they have been put forward as attesting to the pattern Subject - e.p.p. - Pronoun and hence as an argument against the view that the e.p.p. (= enclitic personal pronoun) is a 'lesser subject' that always follows the predicate. Gideon Goldenberg argues that in this type of clause the pronoun in first position is made the formal predicate by the following e.p.p. and that therefore the label Subject - e.p.p. - Pronoun is not applicable. In Goldenberg's view, it is not in this type of clause but rather in clauses of the type � NO< NO< that the e.p.p. shows disagreement with the subject due to attraction to the predicate. He also addresses the alleged copulaic function of the enclitic personal pronoun and the question of whether the grammatical and the pragmatic structure of sentences should be treated differently.
This latter point is also addressed in Jan Joosten's response to Van Peursen. In the parsing of nominal clauses Van Peursen distinguishes between the grammatical categories of subject and predicate and the functional categories of topic and comment. Joosten argues that such a distinction is impossible. He rejects the view that one can establish a scale of definiteness on the basis of grammatical features, without taking into account the context. This issue, which is also addressed by Van Peursen in his response to the responses, plays a crucial role in the linguistic discussions that are involved in, and partly precede, the computer-assisted linguistic analysis.
Related to the debate about the function of the enclitic personal pronoun in tripartite nominal clauses is the discussion about its function in the so-called imperfectly-transformed cleft-sentences. Takamitsu Muraoka argues that the category of cleft sentences as applicable to classical Semitic languages should be abandoned completely, thus taking a position which is the opposite of Goldenberg's. Another point about which there is fundamental disagreement between Goldenberg and Muraoka is the syntactic equivalence of the nominal and the verbal clause. According to Goldenberg this equivalence is an important clue for understanding the nature of nominal sentences; for Muraoka this equivalence is minimal, if it exists at all.
In his response to the responses Van Peursen elucidates some of the points raised in his main paper. In response to a question raised by Joosten, Van Peursen demonstrates the way in which variant readings in the manuscripts (in this case the variation between bi- and tripartite nominal clauses) can contribute to linguistic studies. He concludes that the text-critical evidence is an important source of information about the variety of constructions that were in line with - or at least did not contradict - the Syriac language system.
This brings us to the third discipline that plays a major role in the CALAP project: textual criticism and textual history. Like 'computer-assisted linguistic research' and 'Syriac linguistics', 'textual criticism' also covers a wide variety of sub-disciplines and approaches. In a computer-assisted analysis of the Ancient Versions, one should not ignore the results of centuries of text-critical and text-historical research. In the case of the Peshitta, a central question in the history of research concerns the relation between the Peshitta and the Targumim. Percy S.F. van Keulen argues that a comparison between these two versions may shed light on the interaction of language systems represented by the Hebrew and Aramaic texts and the translation technique used and thus help us appreciate the nature and background of a number of differences between the Peshitta and the Masoretic Text.
In his response to Van Keulen, Bas ter Haar Romeny highlights some issues about the relation between the MT, the Peshitta and the Targum, such as the so-called 'Targumic' character of the Peshitta, James Barr's 'typology of literalism', the need to compare the Peshitta with the MT on the basis of a critical text, and various theories about the dependency of the Peshitta on the Targum (or vice versa), which played an important role in the history of research. Ter Haar Romeny considers Van Keulen's goal to compare the Peshitta and the Targum in order to determine the nature of differences between the MT and the Peshitta of Kings an important objective for future research.
Donald M. Walter's response to Van Keulen consists of two parts. The first part concerns the interpretation of individual passages. The second part deals with further research that should be done in the future. While both Ter Haar Romeny and Walter address the questions that should play a central role in future research, the way in which they address this issue is somewhat different. Ter Haar Romeny gives a survey of research on the Ancient Versions in the present and the past and asks what factors should be taken into account if the research envisioned by Van Keulen is carried out. Walter takes his starting point in the present state of computer-assisted research on the Peshitta and asks how this can be extended. The editors of the present volume fully agree with Walter that this extension should not only pertain to other parts of the Peshitta, but also to other versions of the Old Testament, especially the Targumim and the Septuagint. Unfortunately this was not possible in the scope of the CALAP project. It will be part of a new project, called TURGAMA: Computer-Assisted Analysis of the Peshitta and the Targum: Text, Language and Interpretation. This project will start in 2005.
Since the main focus of the seminar was on the methodology of computer-assisted textual analysis and the interaction of programmers, linguists and biblical scholars, less attention was paid to the results of the computer-assisted research. These results will be published in a joint monograph by Janet Dyk and Percy van Keulen on the Peshitta of Kings and in another monograph on language and interpretation in the Syriac text of Ben Sira by Wido van Peursen. In addition, Hendrik Jan Bosman and Constantijn Sikkel are preparing a monograph on word level analysis. This monograph will discuss the methodology of computer-assisted morphological research. Nevertheless, the members of the CALAP project group wished to include in the present volume a section in which different aspects of the CALAP analysis are applied to a selected passage of the Peshitta. For this purpose I Kgs 2:1-9 was chosen. The purpose of this second section of the volume is to demonstrate the consequences of the more abstract methodological discussions in the first section when it comes to the analysis of concrete texts and the results to which such an analysis may lead. Since this section is based on only a small text sample, the conclusions presented in it are limited in scope.
In the first contribution to this section Van Keulen provides an inventarisation and preliminary categorisation of textual features of the Peshitta that require explanation within the framework of the comparison of the Peshitta with the MT. The inventory made contains both formal differences between the Peshitta and the MT and inner-Syriac variations. The categories distinguished are partly linguistic, and partly exegetical and text-critical.
Bosman and Sikkel take three words from I Kgs 2:1-9 to illustrate the procedure of word level analysis presented in their main paper. The first example is relatively simple. It is used to show how the analysis includes the grapheme level and the morpheme level and the strict distinction that is made between distributional and functional sides of description. The two other examples are more complicated. One concerns the treatment of morphemes that are not visible in the surface form of a word; the other example illustrates the problem of ambiguous forms.
Dyk gives some examples of the research that can be done after the completion of the data preparation. After an independent bottom-up analysis of both the Hebrew text and the Syriac text, a synoptic presentation of the two versions can be made according to the procedure described in her main paper and that of Talstra, Jenner and Van Peursen. From this presentation it is possible to return to the lower linguistic levels of words, phrases and clauses and compare those elements that are parallel. In a second contribution Dyk demonstrates the way in which lexical correspondences and translation equivalents can be analysed on the basis of the synopsis of two texts.
Van Peursen discusses the nominal clauses that occur in I Kgs 2:1-9. He shows how the approaches presented in his main paper can be applied to these clauses. He argues, for example, that the difference in word order between the Hebrew and Syriac text of I Kgs 2:8 should not be ascribed to the Syriac translator's exegetical activity, because it is due to differences between the Hebrew and Syriac language systems.
While Bosman, Sikkel, Dyk and Van Peursen focus on linguistic issues in the Syriac text of I Kgs 2:1-9, Van Keulen discusses those differences from the MT that defy explanation in terms of differences in language system. The computer-linguistic analysis contributes considerably to marking the borderline between linguistically motivated deviations and other differences between the MT and the Peshitta. It is at this borderline that linguistic investigations and text-critical research meet. Once this boundary has been established, a fresh investigation of the exegetical profile of the translator and the text-critical and text-historical position of the Syriac translation is possible and even required. In this way the computer-assisted linguistic analysis can contribute not only to the study of the Syriac language system, but also to the broad field of Peshitta studies.
In the final contribution to this volume Van Peursen formulates some general conclusions about the linguistic and theological profile of the Peshitta of I Kgs 2:1-9.
Part One Papers Presented at the CALAP Seminar
CALAP: An Interdisciplinary Debate between Textual Criticism, Textual History and Computer-Assisted Linguistic Analysis
Konrad D. Jenner, Wido van Peursen and Eep Talstra
1 Introduction
The funds granted by the Netherlands Organization for Scientific Research (NWO) in 1999 to the Peshitta Institute Leiden (PIL) and the Werkgroep Informatica Vrije Universiteit (WIVU) for the CALAP project provided a unique opportunity for a debate between the disciplines and expertise of these two institutions. However, in reality this debate had started much earlier, since several years of discussions preceded the proposal for a joint research project. In the mid-nineties the main participants in the discussion were Konrad D. Jenner, the director of the Peshitta Institute, and Eep Talstra, head of the Werkgroep Informatica.
In the present paper we will describe some of the discussions that took place before and during the CALAP project. At stake was the debate of scholars who have in common that they strive for an adequate and promising approach in Old Testament exegesis, but hold different, sometimes even opposite opinions about the purpose and method of linguistic and textual analysis, and the question of how they can combine their insights and expertise in an interdisciplinary joint research project. Much of the discussion reflected the scholarly debate between a text-critical and text-historical diachronic analysis of the Hebrew Bible and the Ancient Versions on the one hand and their synchronic linguistic and literary analysis on the other. The common challenges were: What types of linguistic data do we need for a proper textual analysis? And how can they be created by the computer? The first question will be addressed in the present contribution, the second in the following article.
2 The PIL and the WIVU as Representatives of Two Disciplines
The history, research tradition and purposes of the two institutions involved in CALAP differed. The PIL was founded in 1959 when the International Organization for the Study of the Old Testament (IOSOT) appointed the Leiden Professor P.A.H. de Boer to be the chief editor of the new critical edition of the Old Testament Peshitta.² As a consequence the main concern of the research carried out at the PIL was the text-critical and text-historical study of the Peshitta. This research was in keeping with the long Leiden tradition of the diachronic analysis of the Old Testament from the perspective of historical and literary criticism, textual transmission and reception history.
The WIVU was founded in 1977. Its purpose was to assist the study of the language and structure of the text of the Hebrew Bible with the help of information technology. The main concern of the WIVU was to develop programs that would enable a computer-assisted analysis of the Masoretic Text as attested in the Codex Leningradensis and to set up a database containing this text in an electronic format together with the results of its full linguistic analysis.³
The text-critical scholar, who investigates the ancient and medieval manuscripts of the Hebrew Bible and the Ancient Versions, will be well aware that not any of these manuscripts, whatever its status or age, contains the text of the Hebrew Bible or one of its translations; they are just textual witnesses.⁴ To attain the best and 'original' reading of a text thorough investigations are needed to weigh the variants contained in the witnesses, to reconstruct the textual history and to distinguish phases and text types.⁵ This will affect the text-critic's view on exegesis. He or she will not take the running text of a text edition for granted as the authoritative or obvious representative of a text, but will always enquire as to its status from the perspective of literary and textual history.
The computer-assisted linguistic analysis of the Hebrew Old Testament as attested in the Codex Leningradensis and edited in the Biblia Hebraica Stuttgartensia poses another challenge to Old Testament exegesis. It raises the
question of which features in the text can be accounted for by referring to the Hebrew language system. The rules of a language system put constraints on oral or written communication. The definition of the rules is necessary to establish how an utterance is modelled within these constraints. For this purpose a thorough and detailed linguistic analysis is necessary.
The practice of linguistic analysis at the WIVU follows a bottom-up procedure, starting with the segmentation of words into morphemes, followed by the combination of words into phrases and the combination of phrases into clauses and ending up with the text-hierarchical relationships between clauses. The result of this analysis is a formal systematic description of linguistic units with increasing size and complexity.⁶
Being aware of the danger of an oversimplification, one could say that the differences between the approach of the PIL and that of the WIVU are more or less parallel to those between the so-called diachronic and synchronic approaches in Old Testament exegesis. These differences concern the framework of presuppositions, assumptions and purposes of textual analysis and the basic understanding of language, texts and their interpretation.⁷ The claims made on the basis of either diachronic or synchronic analysis are in principle valid only within the framework of the presuppositions and points of departure of the respective disciplines. The PIL and the WIVU decided to start an interdisciplinary research project in order to reach conclusions that are valid beyond these frameworks.
1 The original Dutch name of the institute was 'Werkkamers voor de tekstgeschiedenis en de tekstkritiek van het Oude Testament'.
2 See P.A.H. de Boer and W. Baars, The Old Testament in Syriac according to the Peshitta Version. General Preface (Leiden 1972); P.A.H. de Boer, 'Towards an Edition of the Syriac Version of the Old Testament', VT 31 (1981), 346-357.
3 See E. Talstra and F. Postma, 'On Texts and Tools. A Short History of the "Werkgroep Informatica" (1977-1987)', in Computer Assisted Analysis of Biblical Texts. Papers Read at the Workshop on the Occasion of the Tenth Anniversary of the 'Werkgroep Informatica', Faculty of Theology, Vrije Universiteit, Amsterdam, November 5-6, 1987 (ed. E. Talstra; Applicatio 7; Amsterdam 1989), 9-27; idem, 'Introduction', Stuttgart Electronic Study Bible (Stuttgart 2004).
4 See e.g. E. Tov, Textual Criticism of the Hebrew Bible (2nd rev. ed.; Assen 2001), 2; see also P.G. Borbone's contribution to the present volume.
5 The notion of the 'original reading' or the 'original text' assumes that 'at the end of the process of the composition of a biblical book stood a text which was considered authoritative (and hence also finished at the literary level), even if only by a limited group of people, and which at the same time stood at the beginning of a process of copying and textual transmission'. However, this text is an abstraction, because in reality copying and textual transmission did not begin with the completion of the literary composition; see Tov, Textual Criticism, 164-180, esp. 177. Note that in the first edition of his Textual Criticism (Assen 1992) Tov speaks of 'one textual entity (a single copy or tradition) which was considered finished at the literary level', without reference to the notion of 'authoritative'.
6 For further details see the contributions by Talstra-Jenner-Van Peursen, Bosman-Sikkel and Dyk in the present volume.
7 Cf. R.W. Langacker, Fundamentals of Linguistic Analysis (New York 1972), 21: 'When a linguist formulates a syntactic rule, he is presupposing a theory of language that makes it meaningful for him to speak of syntactic rules. When he posits a phonological representation, he is presupposing a metatheory that allows abstract theoretical constructs of this sort (...) In short, every step he takes draws upon metatheoretical assumptions, be it implicit or explicit.' Langacker's remarks on linguistic theories apply mutatis mutandis also to other fields of textual analysis. See also his remark on synchronic and diachronic approaches on p. 15 of the same monograph: 'The linguistic description of a language is called a grammar. A grammar can be regarded as a theory of the structure of a language. Synchronic analysis therefore involves the formulation and evaluation of theories about how languages are put together, while diachronic analysis involves theories pertaining to the evolution of linguistic systems. A theory must be based on the data it purports to explain, but the relationship between data and theory is not necessarily direct or simple. It is thus appropriate for us to examine the nature of linguistic data and its relation to theories that purport to account for it.'
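The bottom-up procedure described in this section (words segmented into morphemes, words combined into phrases, phrases into clauses, and clauses linked in a text hierarchy) can be pictured with a minimal Python sketch. Everything in it is an assumption made for illustration: the class names, the hyphen used as a morpheme boundary, the phrase labels and the English placeholder words are hypothetical and do not reproduce the WIVU programs or their encoding.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Word:
    surface: str          # form as it appears in the text
    morphemes: List[str]  # segments identified at word level


@dataclass
class Phrase:
    label: str
    words: List[Word]


@dataclass
class Clause:
    phrases: List[Phrase]
    parent: Optional[int] = None  # index of the clause this one depends on


def segment(encoded: str, boundary: str = "-") -> Word:
    """Word level: split a pre-encoded form into its morphemes."""
    return Word(surface=encoded.replace(boundary, ""),
                morphemes=encoded.split(boundary))


def group(items: list, spans: List[Tuple[int, int]]) -> List[list]:
    """Combine consecutive items into larger units; spans are (start, end)."""
    return [items[start:end] for start, end in spans]


# Toy input: English placeholders, not real Hebrew or Syriac data.
encoded = ["the", "king-s", "speak", "and", "the", "servant-s", "listen"]
words = [segment(e) for e in encoded]

# Phrase level: words are combined into labelled phrases.
phrase_spans = [(0, 2), (2, 3), (3, 4), (4, 6), (6, 7)]
phrase_labels = ["NP", "VP", "CjP", "NP", "VP"]
phrases = [Phrase(label, ws)
           for label, ws in zip(phrase_labels, group(words, phrase_spans))]

# Clause level: phrases are combined into clauses.
clauses = [Clause(ps) for ps in group(phrases, [(0, 2), (2, 5)])]

# Text level: hierarchical relations between clauses.
clauses[1].parent = 0

for index, clause in enumerate(clauses):
    forms = [" ".join(w.surface for w in p.words) for p in clause.phrases]
    print(index, clause.parent, forms)
```

Each level only consumes the output of the level below it, which is the point of a bottom-up design: decisions about phrases and clauses remain traceable to the formal registration of the words.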
3 Points of Departure of the Interdisciplinary Research
In the preceding paragraph we mentioned the differences in presuppositions and points of departure that exist between diachronic and synchronic approaches, for which we took the research of the PIL and that of the WIVU as representatives. At the start of the project, however, it was evident that there were some assumptions that were shared by both research centres.
3.1 The Importance of the Study of the Syriac Bible
The reason to choose the Peshitta as the object of investigation hardly needs any justification. It is a most valuable document from the perspectives of linguistics, textual criticism, textual history and reception history. Though it is still a matter of scholarly debate in what religious context the Peshitta originated and what was the aim of the translators and/or editors, it is undisputed that its text enjoyed much authority in the Syriac churches and that its text was cared for and transmitted in Christian scriptoria, monasteries or by individual scribes and copyists.
Linguistic and text-critical issues interact in the question as to the source text of the Peshitta. Linguistically there is a strong relationship between the Hebrew Bible and its Syriac translation because they are written in two closely related languages.⁸ Text-critically there is a strong relationship as well. It is generally assumed that the source text of the Peshitta was very similar to the Masoretic Text but not identical to it.⁹ In spite of this consensus no complete reconstruction has ever been made of this slightly different Hebrew text.¹⁰
3.2 The Need of a Formal and Consistent Registration of the Linguistic Data
The PIL and the WIVU formulated as their point of departure that the interpretation of the Hebrew Bible and its Ancient Versions should be based on the consistent and detailed description of the language systems involved. The reason for this focus on a formal and systematic description of the data was the fact that there is an enormous temporal and cultural gap that separates the Ancient Near Eastern culture in which the Masoretic Text and the Peshitta originated and the culture in which the modern scholar lives. This gap increases the danger of misinterpretation, not only in the field of the understanding of religious texts, but also in that of linguistic analysis. For this reason the PIL and the WIVU decided to focus on formal aspects of the languages involved, being well aware, however, of the limitations of this approach.¹¹ The use of information technology would guarantee consistency in the formal analysis.¹²
3.3 Appreciation of the Peshitta as a Literary Document in Its Own Right
Another point of consensus was the insight that the Ancient Versions should be considered as literary and religious documents in their own right with their own linguistic and literary profiles.¹³ This means that the linguistic and ideological profiles of the Masoretic Text and the Peshitta have to be established in an independent investigation before a comparative analysis is done.
8 Those parts of the Peshitta that have been translated from Greek will not concern us here.
9 A. Gelston emphasised this point in his The Peshitta of the Twelve Prophets (Oxford 1987), 111-130; see also M.P. Weitzman, The Syriac Version of the Old Testament (Cambridge 1999), 52-61.
10 K.D. Jenner, 'La Peshitta: fille du texte massorétique?', in L'enfance de la Bible hébraïque. L'histoire du texte de l'Ancien Testament à la lumière des recherches récentes (ed. A. Schenker and Ph. Hugo; Le monde de la Bible 52; Genève 2005), 238-263, esp. 241-242.
11 These will be discussed below, in §§ 4-5.
12 Cf. S.C. Dik and J.G. Kooij, Beginselen van de algemene taalwetenschap (Aula-boeken 448; 3rd ed.; Utrecht/Antwerpen 1973), 48: 'Working with computers is (...) of the utmost importance for linguistics because it forces one to register linguistic phenomena in precise, unequivocal and exhaustively explicit descriptions' ('Het werken met computers is [...] van enorm belang voor de taalwetenschap juist omdat men daarbij gedwongen wordt de taalverschijnselen in nauwkeurige, ondubbelzinnige en volledig expliciete formuleringen vast te leggen'). In the study of ancient languages, the need to start with a formal registration of the linguistic data is even more urgent. This has been emphasised by J. Hoftijzer, W. Richter, G. Khan, and other advocates of the so-called 'form-to-function' approach; cf. e.g. J. Hoftijzer, 'The Nominal Clause Reconsidered', VT 23 (1973), 446-510, esp. 477: 'In the study of languages of which we cannot get a real degree of competence, as we can have with modern languages, the safest way is to start with formal criteria and with formal oppositions. For in such a case it is easier to get a reasonable grip on these phenomena than on functional, semantic and other ones.' On the problems involved in the linguistic analysis of ancient corpora see further below, § 4.8.
13 See e.g. K.D. Jenner, 'Religious Change and Pluralistic Society' (forthcoming); A. van der Kooij, The Oracle of Tyre: the Septuagint of Isaiah XXIII as Version and Vision (VTSup 71; Leiden 1998), 8-19.
3.4 The Role of Linguistic Analysis in Old Testament Exegesis
The PIL and the WIVU agreed that too often interpreters of the Old Testament ignore the linguistic peculiarities of the Hebrew text and the Ancient Versions and their making use of the respective language systems. The importance of linguistic analysis for biblical exegesis can hardly be overestimated. It is impossible to appreciate literary features of any text, if there is no clarity
about the linguistic framework within which it was created. Lexicographic or stylistic peculiarities of the sources of the Pentateuch, for example, can only be determined if the overall framework of the Hebrew language system that covers all sources is established. Likewise, the work of the Peshitta translators can only be understood properly if the possibilities and restraints of the target language are taken into account.
4 Synchrony, Diachrony and 'Language System'
One of the points of departure for the interdisciplinary research project was the insight that the rules of a language system put constraints on the way in which an utterance is modelled and that Old Testament exegesis should deal with the question of which features in the text can be ascribed to the language system (§ 3.4). The questions arise, however, of what we mean by 'language system' and of how this notion relates to the interaction between synchronic and diachronic analysis.
4.1 De Saussure on Synchrony, Diachrony and 'Language State'
In the first decades of the twentieth century, F. de Saussure introduced into linguistics the distinction between synchrony and diachrony.¹⁴ Although he considered both the synchronic and the diachronic axes of languages important, the emphasis of his thinking was laid on the former. This may be related to the fact that in his time historical linguistics was already much developed, while the insight that language as it functions at any one time is a functional system of oppositions was very innovative.¹⁵
De Saussure did not deny the importance of linguistic change. While synchrony concerned the language system as the relation between simultaneous elements, diachrony concerned the substitution of one element for another in time.¹⁶ Accordingly, he defines change in terms of the substitution of elements in the system, which leads to a succession of systems or 'language states'. The changes themselves were eliminated from the description of language as it functions at any one time. However, as J. Barr remarks, 'such a succession of states is difficult to describe or comprehend. How can a state of language, seen synchronically and holistically and containing no change, change into another state which is quite different?'¹⁷ Since language is always subject to change, eliminating change from the description of language implies that the object of research is an abstract ideal, a situation at a 'point' in time, rather than a description of what is going on in language.
De Saussure himself acknowledged that it is difficult to make this abstraction the object of scholarly research. Therefore he made the concession that 'in practice a language-state is not a point but rather a certain span of time during which the sum of the modifications that have supervened is minimal'.¹⁸ However, even with this concession, language is considered as a static system, as the very term 'language state' (état de la langue) implies.
4.2 Beyond Synchrony and Diachrony: Language as a Dynamic System
If we want to study the languages of the Hebrew Bible and the Peshitta we need to make some modifications to the Saussurean understanding of the language system as the interrelatedness of linguistic elements at the synchronic level, both because of general theoretical considerations and because of specific problems related to the corpora involved.
From a general linguistic perspective there is the problem discussed above that a strict synchronic description concerns an abstract ideal of language at a certain 'point' in time, rather than what is actually going on in language, because languages are continually subject to change. In the words of R.W. Langacker: 'A language is a complicated system that changes slowly through the centuries.'¹⁹ Moreover, these changes leave their traces in the languages, resulting in different layers in the system. In his discussion of the phenomenon of 'grammaticalization', J.A. Cook draws attention to the fact that 'grammaticalization often creates layers, so that a form may have more than one meaning and several forms may concurrently express a particular meaning'.²⁰ G. Khan speaks of 'opacity' and 'fuzziness' due to historical change in language.²¹
14 See especially his famous Cours de linguistique générale (Paris 1916). A useful introduction to de Saussure's understanding of language and linguistics is R.S. Wells, 'De Saussure's System of Linguistics', in Structuralism. A Reader (ed. M. Lane; London 1970), 85-123.
15 On the notion of language as a system, see also Langacker, Fundamentals of Linguistic Analysis, 1: 'Modern linguistics is based on the fundamental empirical fact that languages are elaborately and intricately structured. Linguistic analysis is the attempt to discover and describe language structure, to elucidate the pattern and regularity that inhere in the sound-meaning correlations of languages.'
16 Cours, 91; italics ours.
17 Cf. J. Barr, 'The Synchronic, the Diachronic, and the Historical. A Triangular Relationship?', in Synchronic or Diachronic? A Debate on Method in Old Testament Exegesis. Papers Read at the Ninth Joint Meeting of Het Oudtestamentisch Werkgezelschap in Nederland en België and the Society for Old Testament Study held at Kampen, 1994 (ed. J.C. de Moor; OTS 34; Leiden 1995), 1-14, esp. 6.
18 De Saussure, Cours, 101.
19 R.W. Langacker, Language and its Structure. Some Fundamental Linguistic Concepts (2nd ed.; New York 1973), 9.
20 J.A. Cook, 'The Hebrew Verb: A Grammaticalization Approach', ZAH 14 (2001), 117-143, esp. 119-120.
21 P. 155 in the present volume.
To account for linguistic change to which the system is subjected, some insights from system theory as it has been developed in the natural sciences²² may be illuminating. In natural sciences a system is described as a set of interrelated elements that pursue a state of equilibrium. If internal or external factors disturb the balance, the system will try to restore the balance by a feedback mechanism. We can distinguish between a negative and a positive feedback mechanism. The negative feedback tries to restore the disturbed system to its original state. If the negative feedback works, the system might undergo structural and functional changes to achieve a new equilibrium by incorporating the disturbing elements into the system. If the disturbing elements cannot be incorporated the system will essentially move further and further from its original state and collapse in the end. This is called the positive feedback.
If we approach language as a system, it is important to use the notion of an open and changing system as it has been developed in system theory. The language system is exposed to factors that may disturb the balance of the system, but develops mechanisms to restore the balance or to create a new balance by incorporating these factors. The notion of an open system may help to arrive at a model of linguistic analysis that takes into account both the synchronic description of the language at a certain point in time and the diachronic description of language change.²³
Let us take as an example the transition from Late Biblical Hebrew to Mishnaic Hebrew. The language systems of Standard Biblical Hebrew, Late Biblical Hebrew and Mishnaic Hebrew are scholarly abstractions from a continuum of chronological and geographical diversity in the Hebrew language from the so-called Second Temple period. These abstractions are useful because they represent more or less clearly distinguishable forms of Hebrew. The Hebrew language system of this period is unmistakably subject to change. It is, therefore, a dynamic system. Factors that influence the system include Aramaic influence, contact with other languages and dialects, and the status of the classical language and literature in the post-exilic community. Many changes can be observed if we look, for example, at the verbal system: Some verb forms disappear almost completely (like the consecutive verb forms); others lose some of their functions (like the use of yiqtol for repetition in the past). Nevertheless, the system does not collapse; other forms take over certain functions (like the use of qatal or w-qatal as a 'narrative tense') and new syntagms are created (like 'atid l- for the future). The changes concern not only the terms of the system (i.e. one form replacing the other), but also their relationships (i.e. what oppositions are marked and what are not). The result of all these changes is not a 'deteriorated' system, but a new system with different terms and functional oppositions.
22 Cf. L. von Bertalanffy, General System Theory: Foundations, Development, Applications (New York 1968).
23 Cf. Langacker, Fundamentals of Linguistic Analysis, 14: 'A language is a system of rules, and historical changes in the structure of a language are therefore changes in a system of rules.'
4.3 Problems Involved in the 'Synchronic' Analysis of Biblical Hebrew
Apart from the general theoretical linguistic problems involved in a sharp distinction between synchrony and diachrony, there are corpus-specific problems involved in the linguistic study of the Hebrew Bible and the Peshitta. The awareness that the corpora under investigation are the product of a long history of origin and transmission poses the question of whether it is justified to call the language represented in it a system.
If one defines a synchronic linguistic analysis as the description of the language system as it functions at any one time, a synchronic analysis of the Hebrew Bible is by definition impossible. Even if one takes a very loose definition of 'one time', to include a year, ten years, or even a generation, it will be obvious that the Old Testament covers much more than 'one time', because the production and redaction of the texts took place over a long period of many centuries.
One may try to solve this problem by making a historical reconstruction of several states of the language, as it was in, say, the ninth century, the seventh, the fourth and so on. However, even apart from the question of whether such an undertaking has any chance of succeeding, the linguistic analysis of the reconstructed stages cannot any longer be called synchronic. Rather it leads to a reaffirmation of the traditional historical-critical approach.²⁴
Another solution would be to make an analysis of the Masoretic Text in its final shape. However, such an analysis is not 'synchronic' in the Saussurean sense of the word. As Barr observes: 'But far from being a synchronic approach in the sense of Saussure's mind, it seems to be the opposite: it is more like the recent idea (...) of a "time-incorporating developmental linguistics", which would actually diminish the opposition between diachronic and synchronic.'²⁵
It can be concluded that it is impossible to subject the Bible to a purely synchronic analysis in the way De Saussure defined this axis of language. As Barr puts it, the Masoretic Text 'does not give us direct and precise access to any one synchronic state of ancient Hebrew. The materials lie in layers which represent differing stages of analysis and registration over a long time.'²⁶
24 Barr, 'The Synchronic, the Diachronic and the Historical', 3.
25 Barr, 'The Synchronic, the Diachronic and the Historical', 4.
26 Barr, 'The Synchronic, the Diachronic and the Historical', 4. The observations given in this paragraph relate to the synchronic linguistic analysis of the Bible. They are not necessarily valid for synchronic approaches to the literary study of the Bible, which approach the text as a final product, a completed composition.
4.4 Problems Involved in the 'Synchronic' Analysis of Biblical Syriac
At first sight, the situation is less complicated in the case of the Old Testament Peshitta, because it probably originated in a limited span of time of, say, some decades. However, on closer inspection it appears that factors of linguistic variety and change also have played an important role in shaping the extant textual witnesses of the Peshitta.
Firstly, the general linguistic problems discussed in § 4.2 apply here as well. The Peshitta reflects a certain phase in the development of Syriac and a certain position within the Aramaic/Syriac language spectrum. It is one of the earliest texts in Classical Syriac, written in the formative period of the Syriac standard, in which there was less uniformity in Syriac than in later times.²⁷ Jan Joosten has argued that the orthography, vocabulary and syntax of the Peshitta may be defined in three linguistic relations: with other varieties of Syriac, especially the earlier ones, with earlier Aramaic dialects, especially Imperial Aramaic, and with Western Aramaic.²⁸
Secondly, if not in the translation, the temporal/diachronic factor is prominent in the transmission of the Peshitta. If we wish to make a linguistic analysis of the Peshitta we should keep in mind that the transmission, if only in the period between the origin of the Peshitta and the date of the earliest manuscripts, covers more than 'one time', which makes it impossible to assume a priori that the Peshitta reflects a language state as it functioned at 'one time'.
4.5 The Impact of the Source Text on the Linguistic Profile of the Translation
The linguistic analysis of the Peshitta is complicated by the fact that we are dealing with a translation. The question arises of to what extent the language of the source text has influenced the linguistic profile of the translation.²⁹ Nöldeke made little use of the Old Testament Peshitta because in his view it frequently approximates to the original Hebrew text too closely.³⁰
In some cases the translation seems to contain elements that are not part of the Classical Syriac language, such as the object marker yat in the Peshitta of Gen 1:1.³¹ In other cases it is less clear whether the Peshitta reflects 'pure' Classical Syriac. M.D. Koster and others have argued that the original Peshitta followed the Hebrew text more closely than later text types.³² At first sight, there are some phenomena, especially in the earliest biblical manuscripts, that reveal a strong influence from the source text on the linguistic profile of the translator, like the use of the construct state against constructions with Dalath in later manuscripts, the occurrences of bipartite clauses where later manuscripts add the enclitic personal pronoun, or the clinging to the Hebrew word order. However, it cannot be ruled out that the development of the Syriac language played a role as well and that the earliest biblical manuscripts reflect a form of Syriac in which, for example, the construct state or the bipartite nominal clause were more common than in the later standard language.
Even if the Hebrew source text influenced the linguistic profile of the translation, it did not result in a slavish mirroring of the Hebrew language. A number of patterns of equivalent phrase structures (e.g. Hebrew Construct Noun + Abstract Noun corresponding to Syriac Noun + Adjective) show the freedom of the Syriac translator to use constructions different from those in the Hebrew text.³⁴ The same freedom can be observed when we look at lexical correspondences. A survey based on the concordance to the Peshitta of the Pentateuch³⁵ shows that there are many cases where one Hebrew word corresponds to two or three words in Syriac and vice versa, while in other cases there is not an equivalent at word level at all. Moreover, Hebrew roots or lexemes are translated differently in different places. It can be concluded that in general the Syriac translator did not try to mirror the linguistic structure of the Hebrew source text in every detail and that cases like the rendering of 'et with yat in Gen 1:1 are exceptions rather than the rule.
27 G. Goldenberg, 'Bible Translations and Syriac Idiom', in The Peshitta as a Translation: Papers Read at the II Peshitta Symposium held at Leiden 19-21 August 1993 (ed. P.B. Dirksen and A. van der Kooij; Leiden 1995), 25-40, esp. 25.
28 J. Joosten, 'Materials for a Linguistic Approach to the Old Testament Peshitta', JAB 1 (1999), 203-218; see also L. Van Rompay, 'Some Preliminary Remarks on the Emergence of Classical Syriac as a Standard Language. The Syriac Version of Eusebius of Caesarea's Ecclesiastical History', in Semitic and Cushitic Studies (ed. G. Goldenberg and S. Raz; Wiesbaden 1994), 70-89. Van Rompay discusses the place of early Classical Syriac within the Aramaic dialects and its rise as a 'standardised' language. On the relation of early Syriac to other forms of Aramaic, see also K. Beyer, 'Der reichsaramäische Einschlag in der ältesten syrischen Literatur', ZDMG 116 (1966), 242-254.
29 Cf. Goldenberg, 'Bible Translations and Syriac Idiom', 25: 'Such special texts as translations of the Scriptures may be found to include calques or syntactical imitations of the original, which ought to be examined carefully as to whether or not they exist independently in the language in ordinary prose'.
30 Th. Nöldeke, Kurzgefasste syrische Grammatik (2nd ed.; Leipzig 1898; repr. with additional materials: Darmstadt 1966), xiii-xiv: 'Das syrische AT schliesst sich dem hebräischen Urtext oft zu nahe an, und grade wegen der engen Verwandtschaft der Sprachen können wir manchmal nicht erkennen, ob die wörtliche Wiedergabe noch dem wahren syrischen Sprachgebrauch gemäss oder wirklich ein Hebraismus ist.'
31 Thus Goldenberg, 'Bible Translations and Syriac Idiom', 28-29; cf. Weitzman, Syriac Version, 122-123.
32 See below, § 6.3.
33 Cf. Van Peursen, 'Response to the Responses', § 2, pp. 198-200 in the present volume.
34 For more corresponding phrase patterns see W.Th. van Peursen, Language and Interpretation in the Syriac Text of Ben Sira. A Comparative Linguistic and Literary Study (forthcoming), Part Two.
35 P.G. Borbone and K.D. Jenner, The Old Testament in Syriac according to the Peshitta Version V. Concordance 1: Pentateuch (Leiden 1997).
25
4.6 Language System and Language Use
Another Saussurean distinction that plays a crucial role in his understanding of the language system is that between la langue and la parole. The former refers to language as a self-contained system. It is the set of interrelated phonological, morphological and syntactical rules, which enable the production of utterances. The latter refers to the utterances themselves. In this distinction the language system is an abstraction, an entity underlying and structuring human communication, but existing, as it were, independently of language use and language users. A formal description of it takes as its point of departure an ideal non-real language user, whose language use is not hindered by cultural, social or psychological issues, nor by any other factor such as problems that arise from putting utterances into writing, situations of multilingualism or translation processes.36
In reality, however, language does not exist in such an 'ideal' situation. Language is first and foremost a vehicle of communication and a means in the process of conveying information.37 Language use takes place in a large variety of contexts of communication, serving all kinds of purposes.38 Its use depends on communication conventions and language rules that are valid in a given community. In other words, cultural and social factors have a strong impact on the way utterances are modelled.39 Moreover, language is related to the language users' conceptual universe and their world view.40 All this means that the scope of linguistic analysis is very limited if it is restricted to a formal description of the rules of the language system without taking into account 'the historically, sociologically, and psychologically defined extralinguistic situation in which they [i.e. words and utterances] are embedded'.41 One of the places where language and culture meet is literature. As R.W. Langacker puts it:
Nevertheless, language and culture are tightly intertwined. The most obvious example of this interdependence is literature, oral and written: principles of literary style, prosody, and so on that are developed in terms of one language cannot always find satisfactory equivalents in a second. Words designating concepts specific to a given culture are likely to present a serious translation problem. The adoption of a new language is often accompanied by the gradual adoption of a new culture. Language and culture are closely associated in practice, therefore, but there is no reason to believe that any particular type of linguistic structure is specially suited to any particular type of culture.42
If religious literature takes a prominent place in the literature of a certain culture, religion, culture and language appear as three closely related factors.43 The Syriac context provides an interesting example of this. The Bible occupies a central role in Syriac literature. Quotations from the Bible or allusions to biblical passages abound in historiographical, literary, theological, liturgical, philosophical and scientific Syriac literature. Due to this interwovenness of religion, culture and language, Biblical Syriac has become an organic part of literary Syriac.44 The question of how language and language use interact with culture, society and personality is the main concern of disciplines like socio-linguistics, psycho-linguistics and anthropological linguistics. One can say that the influences from sciences like sociology, psychology, anthropology and neurology characterise the main tendencies in linguistics in the last three decades of the twentieth century.45 However, even earlier classical studies of Sapir,46 Hoijer47 and Hymes48 have emphasised the importance of culture, society and personality in the use of language and vice versa.
36 B.Th. Tervoort et al., Psycholinguïstiek (Aula-boeken 481; 4th ed.; Utrecht/Antwerpen 1976), 93; cf. N. Chomsky, Aspects of the Theory of Syntax (Cambridge, Mass. 1965), 3: 'Linguistic theory is concerned primarily with an ideal speaker-listener, in a completely homogeneous speech-community, who knows its language perfectly and is unaffected by such grammatically irrelevant conditions as memory limitations, distractions, shifts of attention and interest, and errors (random or characteristic) in applying his knowledge of the language in actual performance.'
37 E.A. Carswell and R. Rommetveit, 'Introduction', in Social Contexts of Messages (ed. E.A. Carswell and R. Rommetveit; European Monographs in Social Psychology 1; London/New York 1971), 3-12, esp. 3-7; R.H. Robins, General Linguistics (3rd ed.; London/New York 1980), 9-14.
38 See e.g. Robins, General Linguistics, 21-22; Dik-Kooij, Algemene taalwetenschap, 25-31.
39 See e.g. G. Palmer, Towards a Theory of Cultural Linguistics (Austin 1996) and the contributions in Carswell-Rommetveit, Social Contexts of Messages.
40 G. van Steenbergen, 'Componential Analysis of Meaning and Cognitive Linguistics: Some Prospects for Biblical Hebrew Lexicology', JNSL 28 (2002), 19-37, esp. 33-35; Robins, General Linguistics, 23, 298.
41 Carswell-Rommetveit, 'Introduction', 5.
42 Langacker, Language and Its Structure, 16-17.
43 On the interaction of language and religion, see also R. Needham, Belief, Language and Experience (Oxford 1972).
44 Cf. T. Muraoka, 'Response to G. Goldenberg, "Bible Translations and Syriac Idiom"', in Dirksen-Van der Kooij, The Peshitta as a Translation, 41-46, esp. 42: 'It seems to us reasonable to assume that the biblical and related literature played a most vital role in the formative periods of Syriac writing.'
45 Thus A. Wagner, 'Die Stellung der Sprechakttheorie in Hebraistik und Exegese', in Congress Volume Basel 2001 (ed. A. Lemaire; VTSup 92; Leiden 2002), 55-83.
46 The Collected Works of Edward Sapir (ed. P. Sapir et al.; 16 vols.; Berlin/New York 1990-2001); Selected Writings of Edward Sapir in Language, Culture, and Personality (ed. D.G. Mandelbaum; Berkeley 1949).
47 H. Hoijer, 'The Sapir-Whorf Hypothesis', in Language in Culture. Proceedings of a Conference on the Interrelations of Language and Other Aspects of Culture (ed. H. Hoijer; Comparative Studies of Cultures and Civilizations 3; Memoirs of the American Anthropological Association 79; Chicago 1954), 92-105; id., 'The Relation of Language and Culture', in Anthropology Today: An Encyclopedic Inventory (ed. A.L. Kroeber et al.; Chicago 1953), 554-573; id., 'Anthropological Linguistics', in Trends in European and American Linguistics 1930-1960 (ed. Ch. Mohrmann, A. Sommerfelt and J. Whatmough; Utrecht/Antwerpen 1961), 110-127.
48 D. Hymes, Language in Culture and Society. A Reader in Linguistics and Anthropology (New York 1964).
Here, too, we may resort to system theory as developed in the natural sciences. From the viewpoint of system theory, all factors affecting the use of a language can be defined as internal and external factors that influence the system. For the description of the system a 'zero situation' is established and the dynamics of that situation are described, taking into account all relevant factors. Subsequently, changes are registered and the way in which they affect the system is described. In system theory one of the exigencies of a proper analysis of the development of the system is that all factors influencing it are taken into account. In describing language systems this would require an interdisciplinary effort of psycho-linguistics, socio-linguistics, cultural-linguistics, cognitive linguistics and others. In practice it will be hard to arrive at a 'complete' interdisciplinary approach, but it remains important that scholars who focus on one set of factors influencing the language system remain aware of the other sets as well.
4.7 Language System and Style
There is a close relationship between language, style and meaning.49 The language system can be considered the framework in which an utterance is modelled, while style and rhetorics refer to the way in which the utterance is modelled within the conditions of the language system. This means that literary and rhetorical strategies influence the utterance and have an impact on the communication process. Language users connect issues that they regard as important with certain constituents in a clause. They can give priority to such a constituent in the information structure of a text by giving it a syntactically important place in the clause or by mentioning it explicitly, even if it is already known in the context. An unexpected or irregular word order indicates what is the topic of a proposition and what information is conveyed about that topic.50 The syntactical choices evoke an impression that has a deep impact on the interpretation of texts.51 In the context of the CALAP project, where we are dealing with a translation, the questions are whether the Syriac translators were sensitive to the stylistic and rhetorical features of the Hebrew text, whether they could and, even more important, tried to retain these features and whether they have introduced their own stylistic characteristics.
4.8 Analysis of the Language System on the Basis of Text Corpora
In the case of the Hebrew Bible and the Peshitta the question of whether utterances provide access to the abstract system becomes even more urgent because of the restrictions related to the use of written documents. Although corpus linguistics has been celebrated as being more empirical and reliable than many other approaches,52 some scholars have argued that written sources are not a proper basis for describing a language, because 'language is speech and the linguistic competence underlying speech. Writing is no more than a secondary graphic representation of language'.53
That linguistics should primarily deal with speech may be true for modern languages, where we can find native speakers to help us interpret texts. In the case of ancient texts, however, we have in a way to organise our own 'native speaker' by making a description of the language used in those texts, as consistent as possible.54 For that reason a corpus-based approach to ancient texts has no choice but to rely on the analysis of written documents.
One of the areas in which spoken languages differ from their written representation is in the area of language change and language variety. It will therefore affect the diachronic study of language. In the words of R.W. Langacker:
Languages, especially in their primary (spoken) form, are continually changing. Written languages are for the most part more conservative than their spoken progenitors, for a number of reasons. People usually write with more care and formality than they speak with, so that innovations are slower to appear in writing. The tradition of correctness in spelling keeps changes in pronunciation from being reflected in written messages. Because writing is relatively permanent, writings from the past (unlike past speech) continue to remind people of the way things used to be done and thus act as a restraining influence. People concerned with the correctness or purity of language will regard developments in the spoken languages as evidence of linguistic corruption and resist their incorporation into writing. As a result, written language and its spoken counterparts sometimes drift quite far apart.55
49 See e.g. R.A. Jacobs and P.S. Rosenbaum, Transformationen, Stil und Bedeutung (übersetzt und herausgegeben von 1. Guntner und E. Weill; Fischer Athenäum Taschenbücher. Linguistik; Frankfurt am Main 1973).
50 C.H.J. van der Merwe and E. Talstra, 'Biblical Hebrew Word Order. The Interface of Information Structure and Formal Features', ZAH 15/16 (2002-2003), 68-107.
51 Cf. S. Ullmann, Meaning and Style, Collected Papers (Language and Style Series 14; Oxford 1973).
52 See e.g. D. Biber, S. Conrad and R. Reppen, Corpus Linguistics. Investigating Language Structure and Use (Cambridge Approaches to Linguistics; Cambridge 1998), 4.
53 Langacker, Language and Its Structure, 59; similarly Dik-Kooij, Algemene taalwetenschap, 16; cf. E. Lipiński, Semitic Languages: Outline of a Comparative Grammar (OLA 80; Leuven 1997), 86: 'Writing is no more than a secondary, graphic and largely inadequate representation of language. There is even a greater difference between a living language and a "dead" language.'; R. Jakobson, 'Linguistics and Communication Theory', in Structure of Language and Its Mathematical Aspects (ed. R. Jakobson; Proceedings of Symposia in Applied Mathematics 12; New York City 1961), 245-252, esp. 250: 'Attempts to construct a model of language without any relation either to the speaker or to the hearer and thus to hypostasize a code detached from actual communication threatens to make a scholastic fiction from language.'
54 The use of speakers of Modern Hebrew or modern Aramaic dialects as informants will confuse, rather than help the description of the ancient phases of Hebrew and Syriac. It is incorrect, for example, to argue that the Biblical Hebrew verb expresses tense rather than mood or aspect on the basis of the situation in Modern Hebrew. Although no scholar will deny this, one may suspect that the fact that in the Modern Hebrew verbal system tense plays an important role, is at least in part responsible for the increased popularity in the last decades of the understanding of the Biblical Hebrew verb as a tense-based system. With these remarks we do not deny the important heuristic contribution that the study of the modern Northwest Semitic languages can make to the study of the earlier phases. For example, insights about information structure and prosody in Modern Hebrew can illuminate problems related to 'emphasis', 'topic' and 'focus' in Biblical Hebrew.
The conservative tendency that can be observed in written languages appears to be even stronger in the case of liturgical languages. Again in the words of Langacker:
The isolation of liturgical languages from their spoken origins is a similar phenomenon. There is, of course, even greater resistance to change in a language used as the vehicle of religious doctrine and devotion than in any literary language. Moreover, the religious significance is sometimes sufficient to preserve a liturgical language long after its spoken counterpart has ceased to exist.56
It follows that even because of general cultural linguistic considerations the corpus of our research is likely to represent a literary, standardised form of the language with conservative tendencies.
An additional challenge is the way in which the texts are actually handed down to us. Studying the Hebrew Bible and the Peshitta we are dealing with texts that came to us through a tradition of reading and transmission. This has influenced the representation of the language in these corpora. This representation is limited and hybrid. The absence of vowel signs in early Peshitta manuscripts and the discrepancy between the language reflected by the Masoretic vocalisation and the consonantal framework57 restrict the scope of the research and the validity of its conclusions.
The size of the corpora under discussion is very small. The Old Testament contains altogether between 300,000 and 420,000 words.58 Even a relatively modest modern corpus such as the Spoken Dutch Corpus (Corpus Gesproken Nederlands) contains ten times as many words: 8,000,000.59 The British National Corpus contains 100 million words.60 As a consequence, the lexicographical and grammatical information that the corpus provides is limited. The vocabulary of Biblical Hebrew consists of 7,500 to 8,000 words,61 while the size of modern large dictionaries such as the Oxford English Dictionary, Das Deutsche Wörterbuch and the Woordenboek der Nederlandsche Taal vacillates between 350,000 and 450,000 entries.62 In the CALAP project the corpus under investigation was even much smaller than the complete Hebrew Bible. We studied the books of Kings and Ben Sira, which in the Syriac text cover 35,800 and 18,500 functional words respectively.
It stands to reason that the ancient Hebrew vocabulary must have been considerably larger than that preserved in the Old Testament. If we try to describe the relations between words as those between the terms of a system, the fact that our corpus is limited poses a serious problem, because we do not know all the terms in the system. Even such a semantic field as kinship relations, for which a number of words occur in the Bible, cannot be described completely, because not all the terms in the system are known.63
At first sight the problems are less serious in the field of syntax than in the field of lexicography and semantics. However, even there the restrictions of the corpus hinder the analysis. Classical Hebrew epigraphic material shows that ancient Hebrew grammar must have known forms and constructions not attested in the Bible. Moreover, lacunae in Biblical Hebrew paradigms strongly suggest that a number of forms that were part of Classical Hebrew are by chance not attested in the Bible.64
The difficulties raised by the limited size of our corpora compel us to make an analysis that is as exact and consistent as possible. We cannot afford the inaccuracies and inconsistencies that we come across frequently in the computer-assisted analysis of large text data bases. Automatic procedures allowing for 80% or 90% correctness are insufficient in the study of the biblical corpus.
55 Langacker, Language and Its Structure, 61; similarly Lipiński, Semitic Languages, 95.
56 Langacker, Language and Its Structure, 61.
57 An example is the phenomenon of vowel length, which seems to be an essential feature of the realisation of vowels as reflected in the matres lectionis, while it has no phonemic status in the pronunciation reflected by the Masoretic signs.
58 According to E. Ullendorff the Masoretic Text contains 300,000 words; see his [...]
59 See http://lands.let.kun.nl/cgn/home.htm.
60 See http://www.natcorp.ox.ac.uk.
61 Ullendorff, 'Is Biblical Hebrew a Language?', 5.
62 The first edition of the Oxford English Dictionary contained 414,800 entries, Das Deutsche Wörterbuch about 350,000 entries, and the Woordenboek der Nederlandsche Taal between 350,000 and 400,000 entries.
63 Ullendorff, 'Is Biblical Hebrew a Language?', 12-13.
64 Such as a number of verbal forms for the second person feminine and some hophal forms; cf. Ullendorff, 'Is Biblical Hebrew a Language?', 14.
5 Communication Processes Involved in Translation Activities
5.1 Ancient Texts and their Communicative Functions
The writing of a text or a collection of texts aims at communication. In general one can say that a text contains something to be conveyed to an audience.65 Form, content and meaning of the text are influenced by a large number of factors: the author's linguistic abilities (his command of style, idiom and grammar), the way in which he acquires information and uses it in his argument, the character of the author's mental processes (whether or not they are logical, consistent, associative and/or emotional), his cultural and social position, and his personal attitude. These factors also play a role with the reader: What is his attitude? What effect does the text have on him? How does it influence his cognitive processes?
The modern student of ancient texts has the task of recognising the linguistic means used for conveying ideas, concepts, symbols, values and emotions in written documents, and of reconstructing the cognitive processes underlying the documents. This research is hindered by the fact that the material, cultural and religious environment in which the ancient texts originated is very different from that of the modern scholar. Words can be defined as 'tools for message transmission in a shared human world of objects and events',66 but the question is to what extent there is a 'human world of objects and events' shared by the ancient author and the modern scholar and how the latter can get insight into the 'message transmission' of the former.
5.2 Communication and Translation
The communication process becomes even more complex in the case of a translation. In this context the translator is both the reader of an existing text and the author of a new one. The character of a translation is often described in terms of translation technique or strategy. This relates to the translator's view of the work of translation, for example whether he should translate, to use Cicero's and Jerome's words, verbum e verbo or sensum de sensu.67 However, the translation is shaped not only by the translator's ideas of what a translation should look like, but also by his psychological and cognitive processes, such as the observe and recall process: the translator starts from the source text where he finds a small segment of text, makes up his mind how it should be translated, and puts the translation into the new text he is creating. The size of the segments and the cognitive processes between the picking up and writing down of the segments will influence the character of the translation.68 The work of translation of a text or a collection of texts is in itself a process as well. A translator may 'learn' during the translation project and earlier decisions may influence later ones.69
From a linguistic perspective the question is: 'To what extent are language differences barriers to understanding?'70 Some insights from experimental psycho-linguistics can be illuminating. People who are instructed to read a text in a foreign language and reproduce its meaning and purport start with a grammatical parsing of the clauses and try to recognise compositional and redactional aspects. Psycho-linguistic experiments in the sixties of the last century have shown that this involves a complicated process in which 'words emerge in acts of decoding when they [i.e. stimuli of strings of letters] are met with an internally provided request for some meaningful message element.'71 According to R. Rommetveit this demonstrates that 'words can only be explored psychologically as very complex processes; (...) entirely novel perceptual tactics may develop when such tactics are required in order for a perceiver to generate meaning. The perceived word is thus as much a product of efferent processes as of afferent processes'.72 'Meaning' is the result of conceptual-representational, associative and emotional processes. It originates from 'an initial process of reference (...) to branch off into sustained representation, affecting and affected by an associative and an affective activity.'73
The concern for the author's or translator's mental and cognitive processes is one of the basic principles of the traditional historical critical exegesis e mente auctoris. While in subsequent structuralist approaches the attention shifted from the author to the text, in some post-structuralist approaches we see a renewed interest in 'the mind of the author'. Thus one of the main points of departure of Cognitive Linguistics is the insight that 'language does not reside in dictionaries, but in the minds of the speakers and listeners, writers and readers of that language'.74 P.J.P. Van Hecke contrasts this approach with the Saussurean structuralist understanding of language as follows:
65 For models describing the processes, participants, messages and meanings involved, see e.g. Carswell-Rommetveit, 'Introduction', 5-7; Jakobson, 'Linguistics and Communication Theory'; S. Moscovici, 'Communication Processes and the Properties of Language', in Advances in Experimental Social Psychology 3 (1967), 225-270.
66 R. Rommetveit, 'Words, Contexts, and Verbal Message Transmission', in Carswell-Rommetveit, Social Contexts of Messages, 13-26, esp. 14.
67 Cf. S.P. Brock, 'Aspects of Translation Technique in Antiquity', Greek Roman and Byzantine Studies (1979), 69-87.
68 On this process see also Weitzman, Syriac Version, 3-7; K.D. Jenner, De perikopentitels van de geïllustreerde Syrische kanselbijbel van Parijs (MS Paris, Bibliothèque Nationale, Syriaque 341). Een vergelijkend onderzoek naar de oudste Syrische perikopenstelsels (PhD diss., Leiden University, 1993), 216-224.
69 Cf. Weitzman, Syriac Version, 31-33.
70 Cf. Langacker, Language and Its Structure, 4.
71 Rommetveit, 'Verbal Message Transmission', 16.
72 Rommetveit, 'Verbal Message Transmission', 17.
73 Rommetveit, 'Verbal Message Transmission', 20-21.
74 E. van Wolde, 'Wisdom, Who Can Find It? A Non-Cognitive and Cognitive Study of Job 28:1-11', in Job 28. Cognition in Context (ed. E. van Wolde; Biblical Interpretation Series 64; Leiden 2003), 1-35, esp. 2. Note, however, that in Cognitive Linguistics the notion of the author as a user of a language system and a member of a cultural group of people is more anthropological and psychological than the classical principle of e mente auctoris.
The guiding assumption of cognitive semantics, and of cognitive linguistics more generally, is that human language cannot be properly understood without taking into account the way in which people think. In contrast to the structuralist approach to language, with its stress on language-internal paradigmatic and syntagmatic relations, the cognitive approach to linguistics explicitly studies language against the background of human cognition.75
True as this may be, in the case of the Hebrew Bible and the Ancient Versions we do not have direct access to the author's or translator's cognitive and communicative processes. The object of investigation remains primarily the product of literary activity, rather than the communication process itself. This challenges the biblical or Semitic scholar to discover signals that reveal the process of communication. In a recent study of Biblical Hebrew word order, C.H.J. van der Merwe and E. Talstra have argued that word order patterns are 'syntactic forms that display the information structure of utterances of particular points during a communication process' and function as 'the interface between the world of the formal grammar (...) and the cognitive environment of the speakers'. Syntactical analysis of these patterns helps us identify new or presupposed items or propositions and thus reveal 'the authors' assumptions concerning the cognitive environment of their addressees'.76 Accordingly, the study of this and other syntactical issues can help us in establishing the communication processes that resulted in the written documents we have at our disposal.
6 The Textual Basis of the Linguistic Analysis of Biblical Syriac
6.1 Introduction
Accepting the limitations of a linguistic analysis based on ancient written texts, we have to establish more precisely what sources are taken as the representatives of these texts and as the basis for the description of Biblical Hebrew and Peshitta Syriac. For both the Hebrew Bible and the Peshitta printed scholarly editions are available: the Biblia Hebraica Stuttgartensia (BHS) and the Leiden Peshitta edition, called Vetus Testamentum Syriace. The wide acceptance of these editions among scholars argues in favour of their use as the basis for the linguistic research. Moreover, the use of these editions would enable students of Hebrew or Syriac to verify the results of the investigations.
75 P.J.P. Van Hecke, 'Searching for and Exploring Wisdom. A Cognitive-Semantic Approach to the Hebrew Verb ḥāqar in Job 28', in Van Wolde, Job 28 (see preceding footnote), 139-162, esp. 143.
76 Van der Merwe-Talstra, 'Biblical Hebrew Word Order'; quotations from pp. 71-73.
Nevertheless the use of these sources for the analysis is problematic because the two editions reflect two different types of texts and two different editorial policies. The BHS is a diplomatic edition, presenting the text of one manuscript, the Codex Leningradensis. The critical apparatus contains a rather subjective selection, and often also an evaluation, of variant readings from Hebrew sources and the Ancient Versions. The Leiden Peshitta edition presents the so-called BTR text. It gives a reconstruction of an early phase in the development of the Peshitta text.77
Since the running text of the Peshitta edition cannot be taken as the Peshitta or as the representative of Biblical Syriac, a careful evaluation of variant readings is of the utmost importance not only for a reconstruction of the textual history of the Peshitta, but also for its linguistic analysis. Many variant readings concern linguistic phenomena, like the variation between the construct state and the so-called genitive construction with Dalath or that between bipartite and tripartite nominal clauses.78 The question at stake is how differences in the textual transmission should be assessed linguistically and whether these variant readings reflect different language systems, or variation within one system.
6.2 Manuscripts
The two oldest manuscripts of the Peshitta are British Library Add. 14425 (5b1 in the Leiden edition), which contains Genesis and Exodus, and British Library Add. 14512 (5ph1),79 which contains parts of Isaiah and Ezekiel. The earliest manuscript containing the complete Old Testament is Milan, Ambrosian Library B. 21. Inf. (7a1) from the sixth or seventh century. This implies that for large parts of the Old Testament the oldest textual witness is four to five centuries later than the origin of the Peshitta.
Another important manuscript is Florence, Bibl. Medicea Laurenziana Orient. 58 (9a1). This manuscript is characterised by a close agreement with the MT. According to M.P. Weitzman the text type represented in this manuscript goes back to the sixth century.80 The large number of secondary and allegedly wrong readings in this manuscript has been put forward as evidence for a revision of the Peshitta.81 Occasionally, 7a1 and 9a1 agree with the LXX over against the MT. In Kings the agreements are strongest between 7a1 and the Antiochene text of the LXX (formerly called the Lucianic Recension).82 Since it is unlikely that the agreements between the Septuagint and these manuscripts are due to a common Hebrew source or to copyists, they suggest an influence from the Greek text.
6.3 Text Types
The oldest Syriac manuscript containing the complete Old Testament, 7a1, represents a text type which in scholarly literature is called BTR. M.D. Koster introduced this designation in 1977 as a combination of Basic Text (BT) and Textus Receptus (TR).83 In Koster's view BT referred to the text of 7a1, BTR to the text of manuscripts from the sixth and seventh century, and TR to the text type that was the standard from the tenth century onwards. In Koster's reconstruction this text type developed in the ninth century from the Paris Bible (8a1). In the history of research the meaning of BTR has been broadened to include manuscripts from the sixth and ninth century.
The number and kind of manuscripts representing BTR differ for each biblical book. Leaving aside orthography, we can observe that BTR is about 95% identical to the text of 7a1, but lacks the idiosyncrasies of this manuscript. It has much in common with TR and the portion of agreement between BTR and TR increases if TR-manuscripts from the ninth and tenth century are taken into account.
In Koster's reconstruction the oldest Peshitta manuscripts (e.g. 5b1 for Exodus) reflect the earliest attainable stage of the textual history.84 This text
77 See further below, § 6.4.
78 See Van Peursen's 'Response to the Responses' in the present volume, especially § 2 (pp. 197-204).
79 These are the oldest dated biblical manuscripts in any language.
80 M.P. Weitzman, 'The Originality of Unique Readings in Peshitta MS 9a1', in The Peshitta: Its Early Text and History. Papers Read at the Peshitta Symposium held at Leiden 30-31 August 1985 (ed. P.B. Dirksen and M.J. Mulder; MPIL 4; Leiden 1988), 225-258. The correctness of this view is corroborated by readings occurring only in 6�5 and 9a1 in Deuteronomy and by shared distinctive variants between 5ph1 and 9a1 in Isaiah; see W.M. van Vliet's introduction to Deuteronomy in the Leiden Peshitta edition (Leiden 1991), viii, and S.P. Brock, 'Text History and Text Division in Peshitta Isaiah', in Dirksen-Mulder, The Peshitta: Its Early Text and History, 49-80, esp. 52. Note, however, that the text of 9a1 is of a very mixed character and that there are also a number of cases where 9a1 follows BTR or even TR against the earlier manuscripts; see Brock, ibid.
81 W.E. Barnes, The Peshitta Psalter according to the West Syrian Text Edited with a Critical Apparatus (Cambridge 1904), xli-xliv; id., An Apparatus Criticus to Chronicles in the Peshitta Version with a Discussion of the Value of the Codex Ambrosianus (Cambridge 1897), xxx. For arguments against the view that the scribe of 9a1 has revised a standard manuscript of its age to conform more closely with the Hebrew see D.M. Walter, 'The Use of Sources in the Peshitta of Kings', in Dirksen-Van der Kooij, The Peshitta as a Translation, n. 1.
82 Cf. D.M. Walter and K.D. Jenner, The Peshitta of Kings (forthcoming).
83 M.D. Koster, The Peshitta of Exodus. The Development of Its Text in the Course of Fifteen Centuries (Assen/Amsterdam 1977), 2. In Anglo-Saxon literature BTR is often interpreted as 'Basic Textus Receptus'.
84 Koster's conclusions were based on his study of Exodus. A.P. Hayman has indicated that the same is true of Numbers, see his review of Koster, Exodus, JSS 25 (1980), 263-270, esp. 266-267; on the orthographic and linguistic profile of 5b1 see P. Wernberg-Møller, 'Some Scribal and Linguistic Features of the Genesis Part of the Oldest Peshitta Manuscript (B.M. Add. 14425)', JSS 13 (1968), 136-161; on its importance for the early
"
187-204, esp. 187
type is a literal, 'Hebraising' translation of a Hebrew text which i s identical or almost identical to the Hebrew Masoretic Text. Koster even assumes that originally the text of the Peshi(ta 'was still much closer to its Hebrew exem plar than even in the earliest stage attainable from the extant manuscnpt " evidence' and takes this as a criterion for the evaluation of vanant read " ings. Adcordingly, there is no arguing that the original Syriac translation depends on the Septuagint or the Targurn. From the sixth century the. manu scripts show a gradual move away from the Hebrew towards more ldiomatJc literary Syriac. In the seventh and eighth centunes thIS results In the BTR, an intermediate stage between the original text and the later TR. BTR IS the result of an internal Syriac development, which took place independently of the other Ancient Versions. Although some scholars have contested or modified Koster's reconstruc tion, his view that 7al or BTR does not represent the original text of the Peshitta is commonly accepted. The text type attested in 5bl or 9al seems to " be m�ch closer to the original Peshilla. This means that 7al , BTR or the running text of the Leiden edition (see below) are not appropriate as a basis for a linguistic and translational study. The question of how the language system of the target text responds to that of the source text can only be an close as pOSSIble to swered if the analysis is based on a text form that IS as .. the original translation, rather than a secondary text form.
textual history of the Peshitta see A. van der Kooij, 'On the Significance of MS 5b1 for Peshitta Genesis', in Dirksen-Mulder, The Peshitta: Its Early Text and History, 183-224.
[85] M.D. Koster, 'A New Introduction to the Peshitta of the Old Testament', Aramaic Studies 1 (2003), 211-246, esp. 222.
[86] M.D. Koster, 'The Copernican Revolution in the Study of the Origins of the Peshitta', in Targum and Peshitta (ed. P.V.M. Flesher; Targum Studies 2; Atlanta 1998), 15-54, esp. 52: 'Agreement (or disagreement) with the Hebrew text is our best criterion for distinguishing between good (original) and dubious (secondary) readings'; see also Brock, 'Peshitta Isaiah', 78-79.
[87] Note, however, that there are a number of differences between these two manuscripts.
[88] Cf. W.Th. van Peursen, review of C.E. Morrison, The Character of the Syriac Version of the First Book of Samuel (MPIL 11; Leiden 2001), Aramaic Studies 1 (2003): 131-137, esp. p. 136.

6.4 The Leiden Edition

The Leiden Peshitta edition presents basically the BTR text in the broad definition of the average text of all manuscripts dating from the sixth up to and including the tenth century.[89] It is not a diplomatic text since it does not give the text of any extant manuscript, nor is it a critical text stricto sensu because it is a reconstruction of an early (but not 'the earliest') phase in the development of the Peshitta text. This is a pragmatic solution to a very complex situation of textual history and reception history.

In various publications the initiator of the Leiden Peshitta edition, P.A.H. de Boer, has emphasised that the running text of the Leiden edition should not be consulted without the critical apparatus. This apparatus contains the old biblical manuscripts and the so-called lectionaries (with the letter l in the sigla of the Leiden edition, e.g. 9l1). In addition to the readings that are text-critically significant, the lectionaries contain a large number of secondary readings, most of which are adaptations to liturgical use. These variants are not included in the apparatus.

Another category of manuscripts not included in the apparatus are the so-called Masoretic manuscripts (with an m in the sigla of the Leiden edition, e.g. 9m1). From a text-critical viewpoint their value is limited because in general they support readings known from other sources. From a linguistic viewpoint, however, they contain interesting information about Syriac orthography and phonology.[90]

[89] This means that manuscripts later than the tenth century are not used for establishing the running text, although the apparatus includes manuscripts from the eleventh and twelfth century. In the course of the project the rules for determining the running text have slightly changed over the years. The Sample Edition (Song of Songs, Tobit; 1966), Part IV,6 (Canticles or Odes, Prayer of Manasseh, Apocryphal Psalms, Psalms of Solomon, Tobit, 1 Esdras; 1972) and Part IV,3 (Apocalypse of Baruch, 4 Esdras; 1973) give a diplomatic edition of 7a1. Part II,4 (Kings; 1976) is the first volume in which the subjective judgment of the editor plays a larger role (see p. vi in that edition). In the Kings volume all manuscripts are taken into account for establishing the running text, which means that the running text comes closer to TR than in the earlier volumes.
[90] See G. Diettrich, Die Massorah der östlichen und westlichen Syrer in ihren Angaben zum Propheten Jesaia nach fünf Handschriften des British Museum in Verbindung mit zwei Tractaten über Accente (London 1899); Th. Weiss, Zur ostsyrischen Laut- und Akzentlehre auf Grund der ostsyrischen Massorah-Handschrift des British Museum. Mit Facsimiles von 50 Seiten der Londoner Handschrift (Bonner Orientalistische Studien 5; Stuttgart 1933); M.H. Goshen-Gottstein, 'Prolegomena to a Critical Edition of the Peshitta', in id., Text and Language in Bible and Qumran (Jerusalem/Tel Aviv 1960), 163-204.

6.5 Quotations

Biblical passages occur not only in biblical manuscripts, but also as quotations in the works of Syriac writers and commentators.[91] The use of citations in the text-critical study of the Peshitta has been a debated issue. Do they reflect ancient readings that can bridge the gap between the origin of the Peshitta and the earliest manuscripts? Do they provide readings that by chance have been lost in the biblical manuscripts?

In the early decades of the twentieth century G. Diettrich regarded the text-critical value of the commentaries very highly: 'Wenn nicht sehr viel ältere Handschriften gefunden werden können, empfiehlt es sich, erst die Kommentare herauszugeben, ehe man an eine textkritische Ausgabe der Peschitta geht.'[92] [In English: 'Unless much older manuscripts can be found, it is advisable to edit the commentaries first, before proceeding to a text-critical edition of the Peshitta.'] M.H. Goshen-Gottstein held an opposite position. In his view the text-critical value of quotations was minimal for three reasons:[93]

1 It is obvious that early writers often quoted from memory, omitted parts of verses and changed verses to fit their homiletic needs.
2 Fairly often the quotation does not belong to the Peshitta tradition, but rather is based on a different tradition.
3 Agreements between quotations and early manuscripts against later ones are very rare.

The editors of the Leiden Peshitta decided in 1959 not to include quotations in the critical apparatus. Although this decision agreed with Goshen-Gottstein's argument, it was based on practical rather than theoretical considerations. More than thirty years after Diettrich's remark, much Syriac patristic literature was still unedited and editions that did appear did not always have a sound text-critical basis.[94] Had critical editions of the early Syriac authors been available at that time, the quotations would probably have been included.[95]

Goshen-Gottstein's first reason to minimalise the text-critical value of quotations needs modification because of the heterogeneity of the material. While it is generally accepted that Aphrahat quoted from memory,[96] this cannot be claimed for Ephrem when he wrote his commentary on Genesis. It stands to reason that the function and purpose of biblical quotations in Ephrem's commentary differed much from that in Aphrahat's Demonstrations. A.G.P. Janson has investigated the quotations from Genesis 12-25 in Ephrem's commentary.[97] Since Ephrem gives the lemmata in the order of the text, followed by his comment, it is easy to distinguish the quotation and the commentary. Janson convincingly argues that Ephrem took the lemmata from a biblical manuscript.

Goshen-Gottstein's third reason, too, is applicable to Aphrahat's Demonstrations rather than to Ephrem's commentary. R.J. Owens has shown that the numerous citations from Genesis, Exodus and Leviticus in Aphrahat's Demonstrations agree with BTR in many instances, while there are no traces of the text type attested in the oldest biblical manuscript (5b1).[98] In Ephrem's commentary, however, the situation is more complex. Janson has argued that Ephrem used more than one version of the Peshitta at the same time.[99] In many cases Ephrem's readings agree with the text of 5b1. There are also cases where they agree with 7a1 against 5b1 and other cases where they do not agree with any of the biblical manuscripts. The latter group contains a number of readings that are closer to the Hebrew text. On the one hand these observations support the view that 5b1 comes very close to the earliest attainable stage of the text and that the original text was very close to the MT, probably even more than 5b1. On the other hand they show that the earliest phase in the textual history cannot be equated with 5b1, that even in the fourth century multiple text forms existed side by side and that some BTR readings that are not shared by 5b1 already existed in the fourth century.

Goshen-Gottstein's second remark touches upon the very question of what we consider to be a variant. A central question in this respect is what we are doing when we compare quotations with biblical manuscripts and when we register 'variants' attested in commentaries, liturgical books, or others. Many generations of Syriac authors have used the Peshitta in many different ways and for many different usages. Its use is surrounded by a number of unsolved questions about the text form in which the Syriac Bible is manifest in the sources of its reception history. This does not prevent us from registering agreements and disagreements between the text of biblical manuscripts and that of quotations, but it hinders their interpretation from a text-critical and text-historical perspective.

[91] This paragraph is a revised version of the paper 'Die syrische Bibel und ihre Rezeption in Auslegung, Theologie und Liturgie' presented by K.D. Jenner at the Drittes Symposium der deutschsprachigen Syrologen, Bamberg, 26-28 July 2002.
[92] Quotation in L. Haefeli, Die Peschitta des Alten Testamentes mit Rücksicht auf ihre textkritische Bearbeitung und Herausgabe (Alttestamentliche Abhandlungen XI,1; Münster 1927), 115. The number of manuscripts has not increased significantly since Diettrich and Haefeli. The most interesting discovery since then concerns the manuscripts from Dayr as-Suryan. P.A.H. de Boer has made this material accessible by films and registration of variant readings.
[93] Goshen-Gottstein, 'Prolegomena', 35-36.
[94] Cf. De Boer, 'Edition of the Syriac version' (n. 2), 355.
[95] Note that the critical edition of the Syriac New Testament by B. Aland and A. Juckel does include quotations from Syriac literature. The text-critical value of New Testament quotations has been discussed by Von Harnack, Baumstark, Van Unnik and Baarda; see e.g. T. Baarda, The Gospel Quotations of Aphrahat the Persian Sage (PhD diss., Vrije Universiteit, Amsterdam 1975), 300: 'Aphrahat is a witness to the Syriac text of the early fourth century: he read and wrote the text of the Scriptures in the Syriac form or forms of that time. He was, in our view, a reliable witness to that text'.
[96] R.J. Owens, The Genesis and Exodus Citations of Aphrahat the Persian Sage (MPIL 3; Leiden 1983); id., 'Aphrahat as a Witness to the Early Syriac Text of Leviticus', in Dirksen-Mulder, The Peshitta: Its Early Text and History, 1-48; see also id.,
[97] A.G.P. Janson, De Abrahamcyclus in de Genesiscommentaar van Ephrem de Syriër (PhD diss., Leiden University, 1998), 23-119; see especially the conclusions on pp. 76-82.
[98] Owens, Genesis and Exodus Citations; id., 'Early Syriac Text of Leviticus'. Owens does not go so far as to conclude that the BTR can be traced back to the fourth century and that in the fifth century probably two text types existed side by side.
[99] Janson, Abrahamcyclus, 88. Unlike later commentators, Ephrem does not quote from the Septuagint. If Janson is right that he followed the MT in many cases (see e.g. the examples in his Abrahamcyclus, 82-88), this is perhaps not so much due to his own preference, but more because of views about the authority of the Hebrew and the Greek versions of the Pentateuch among his readership. Perhaps the religious background of his audience rendered it preferable to use a Syriac text type that stood very close to the Hebrew. This would explain the manifold quotations of the text type 5b1. It is a debated issue, however, whether Ephrem knew Hebrew.
The text of biblical quotations in early Syriac literature sometimes agrees with one or more of the Targums against the extant Peshitta manuscripts.[100] A number of scholars, including Baumstark, Vööbus, Peters and Running, took this as evidence for an early stage of the textual history in which the Peshitta stood close to the Targums.[101] In Koster's reconstruction of the textual history, however, the text of the Peshitta developed from a text that was very close to the Hebrew to a text in more literary Syriac.[102] In this view there is little reason to use these biblical quotations for the reconstruction of the original Peshitta. Many agreements between Syriac quotations and the Targums can be explained from a common desire to give clarifying expansions. The few remaining passages where the text of a citation indeed shows substantive agreement with a Targum against the biblical manuscripts testify to the continuing circulation of phrases from the Targums within the Syriac church, quite independently of the Peshitta, rather than to the existence of an early lost stage in the transmission of the Syriac Bible.[103]

The question of how the multitude of biblical quotations in Syriac literature can contribute to the reconstruction of the original text of the Peshitta receives different answers. Early scholars like Baumstark too easily assumed an early 'Targumic' text type, which had been lost in the biblical manuscripts. Goshen-Gottstein and Koster evaluate the weight of the evidence according to the agreement with the Masoretic Text. In their reconstruction of the textual history it is unlikely that the Syriac literature provides new information about the original text of the Peshitta. Studies by Weitzman, Owens and Janson, however, suggest that Koster's linear chronological framework needs some modification. If one agrees with Jenner and Weitzman that Koster's three stages are not strictly successive in time,[104] the quotations in early Syriac literature can provide relevant and significant text-critical information about the various alternative, if not competitive, text types that existed in the earliest phase of the textual history. From a text-historical perspective they are important because they allow us to date a number of BTR readings some centuries earlier than 7a1, the main BTR representative.

[100] Cf. Weitzman, Syriac Version, 129-139.
[101] Cf. S.P. Brock, 'Bibelübersetzungen I,4 Die Übersetzungen ins Syrische, 4.1 Altes Testament', Theologische Realenzyklopädie VI (Berlin/New York 1980), 181-189, esp. 182: 'Außerdem scheint der Targumcharakter des Peschitta-Pentateuch in den frühen Jahrhunderten des syrischen Christentums erheblich auffälliger gewesen zu sein, als die gedruckten Texte vermuten lassen: Zitate bei frühen Schriftstellern wie Afrahat und Ephraem bieten ebenso wie frühe Handschriften manchmal Lesarten, die der Targumtradition näher verwandt und die zu einem späteren Zeitpunkt aber in der handschriftlichen Überlieferung der Peschitta verloren gegangen sind.' [In English: 'Moreover, the Targumic character of the Peshitta Pentateuch seems to have been considerably more conspicuous in the early centuries of Syriac Christianity than the printed texts suggest: quotations in early writers such as Aphrahat and Ephrem, like early manuscripts, sometimes offer readings that are more closely related to the Targum tradition and that were lost at a later point in the manuscript transmission of the Peshitta.']
[102] See above, § 6.3 and Koster, 'Copernican Revolution', 29-30.
[103] Weitzman, Syriac Version, 130, 137-139 with further references.
[104] Jenner, Perikopentitels, 356; id., 'Fille du texte massorétique'; Weitzman, Syriac Version, 305.

6.6 Conclusion

The running text of the Leiden Peshitta edition is the BTR, an average text that is not attested in any extant biblical manuscript, but nevertheless might have existed from the sixth up to and including the tenth century. Some quotations from Ephrem and Aphrahat reveal that a number of BTR readings existed earlier, even in the fourth century.

To go back behind the BTR in the direction of the original Peshitta is difficult because of the small number of sources. It seems that BTR has developed from an earlier text type that was close to the Hebrew text and for which we find evidence in 5b1 and 9a1. That the original translation was even closer to the Hebrew than these witnesses (Koster) is possible, but there are hardly any data to corroborate this assumption.

Although the reconstruction of the textual history can help us evaluate variant readings and make assumptions about the original Peshitta, the extant manuscripts do not allow us to attain the original translation. Even readings that are shared by all extant witnesses may be secondary and in need of emendation.[105]

The indirect witnesses to the biblical text, comprising quotations and allusions in early Syriac literature, contain readings that may go back to different traditions. For this reason not every reading should be taken as a variant in the text-critical and text-historical research into the Peshitta.
7 Translation of the Theoretical Considerations into Research Strategies

In the preceding part of this paper we have presented a number of linguistic and text-critical considerations that are involved in the study of the Peshitta. These considerations have led to a number of decisions about the research strategies followed in the CALAP project.

7.1 The Interaction between Linguistic and Text-Critical Analysis

Coming from different disciplines, the members of the CALAP project group were aware that the claims made on the basis of either diachronic or synchronic analysis are in principle valid only within the framework of presuppositions and points of departure of the respective disciplines (§ 2). The complexity of the material under investigation requires an interdisciplinary approach. As G. Altmann puts it:

Je tiefer die Wissenschaft in ein erforschtes Objekt eindringt, desto kompliziertere Methoden (oder Instrumente) benutzt sie. Aus dieser Erkenntnis folgt, daß jede Methode ihre Grenzen hat und daß man mit ihr nur bestimmte Eigenschaften des Objektes erforschen kann. (...) Die Annahme, daß eine bestimmte Methode - sollte sie anfangs auch noch so revolutionär aussehen - alle Probleme zu lösen imstande sei oder eventuell die einzig richtige Methode darstelle, hat sich im Lauf der Geschichte immer als falsch erwiesen. Zwei verschiedene Methoden können zu demselben Ziel führen, und die Entscheidung, welche von ihnen "besser" ist, hängt dann von den gestellten Kriterien ab (Zweck, Einfachheit, Schnelligkeit usw.).[106]

[In English: The more deeply science penetrates an object of investigation, the more complicated the methods (or instruments) it uses. From this insight it follows that every method has its limits and that with it one can investigate only certain properties of the object. (...) The assumption that a particular method, however revolutionary it may look at first, is able to solve all problems, or perhaps represents the only correct method, has always proved false in the course of history. Two different methods can lead to the same goal, and the decision as to which of them is "better" then depends on the criteria set (purpose, simplicity, speed, etc.).]

In the interaction of two disciplines the question arises of whether one of them can claim priority in the procedure of analysis. One could say that corpus linguistics can only start if the corpus to be investigated and its status from a text-historical perspective have been established. This would imply that textual criticism and textual history have priority over linguistic analysis. However, if it is acknowledged that linguistic phenomena belong to the essential characteristics of a text, then corpus linguistics is also an instrument for textual criticism.

To cope with the complex interrelatedness of the two disciplines, it is necessary to start with both a linguistic analysis and a text-critical investigation. In CALAP the linguistic analysis started from the text of the Leiden edition. However, because of the linguistic change reflected in the textual transmission (§ 6.3) it became clear that the analysis of a single manuscript (e.g. 7a1) or a single text type (e.g. BTR) was not sufficient. For this reason it was decided that several textual witnesses and text types (in casu 7a1, 9a1 and BTR) would receive an independent analysis.[107] At this point the text-historical research contributed to the linguistic analysis.

The text-critical and text-historical analysis concentrated on a careful classification and explanation of inner-Syriac variants as well as differences between the Peshitta and the Masoretic Text. While in many cases the variant in question obviously required a text-critical or text-historical explanation, in other cases a linguistic explanation was preferable. At this point the input from the linguistic analysis was of great help. There remained some categories that did not fit exclusively into a linguistic or a text-historical analysis. In those cases the individual passages were investigated and the arguments from both disciplines were carefully weighed. It appeared that some categories, like variation between singular and plural, included both passages in which the variation is due to linguistic aspects and passages in which it is the result of exegetical, translational or text-historical factors.[108]
[105] See above, § 6.3 and Koster, 'Copernican Revolution', 29-30.
[106] G. Altmann, 'Status und Ziele der quantitativen Sprachwissenschaft', in Linguistik und Statistik (ed. S. Jäger; Schriften zur Linguistik 6; Braunschweig 1972), 1-9, esp. 1. Cf. Weitzman, Syriac Version, 7, 293.
[107] See also Bosman-Sikkel, 'Response to Borbone', pp. 119-121 in the present volume, esp. pp. 120-121.
[108] Cf. Van Peursen, 'Epilogue', pp. 345-358 in the present volume, esp. p. 347.
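
The procedure described in § 7.1, in which several witnesses are analysed independently and their differences are first listed and only afterwards explained, can be pictured with a minimal sketch. Everything in it is an assumption made for illustration: the function name `collate`, the placeholder tokens, and the word-by-word alignment are hypothetical and reproduce neither the CALAP programs nor any actual manuscript reading.

```python
def collate(witnesses):
    """List the positions at which aligned witnesses disagree.

    `witnesses` maps a siglum (e.g. "7a1") to a list of word forms.
    Assumption: the witnesses are already aligned word by word.
    """
    lengths = {len(words) for words in witnesses.values()}
    if len(lengths) != 1:
        raise ValueError("witnesses must be aligned to the same length")

    variants = []
    for position in range(lengths.pop()):
        readings = {siglum: words[position] for siglum, words in witnesses.items()}
        if len(set(readings.values())) > 1:
            # The explanation is deliberately left open at this stage: it may
            # turn out to be linguistic, text-historical, or both.
            variants.append(
                {"position": position, "readings": readings, "explanation": None}
            )
    return variants


# Placeholder tokens, not real readings of these manuscripts.
witnesses = {
    "7a1": ["w1", "w2", "w3-pl"],
    "9a1": ["w1", "w2", "w3-sg"],
    "BTR": ["w1", "w2", "w3-pl"],
}

for variant in collate(witnesses):
    print(variant["position"], variant["readings"])
```

Leaving `explanation` empty mirrors the order of work described above: the variant is registered first, and the arguments from both disciplines are weighed afterwards.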
7.2 Appreciating the Peshitta as a Translation

Because of the peculiar problems involved in the analysis of a translation (§§ 4.5, 5.2) the CALAP analysis starts with a formal registration of patterns of agreement and disagreement with the Masoretic Text. This registration is separated from and precedes the interpretation of these patterns in terms of, for example, translation technique or strategy, or multiple translators.

Because of the influence that genre and style may have on linguistic variety (§ 4.7), it was decided that two corpora should be investigated which represent different types of literature: the Books of Kings, which contain mainly prose, and the Wisdom of Ben Sira, one of the sapiential poetic books. A comparison between the two corpora should give us more insight into the linguistic differences that may exist between Biblical Syriac prose and Biblical Syriac poetry.

Two other issues are relevant here as well. The first is the question of how linguistic variation in the Hebrew text is reflected in the Peshitta: To what extent are syntactical differences between different parts of the Hebrew Bible reflected in the Peshitta? If the syntactical differences between several parts of the Peshitta are smaller than those in the corresponding sections in the Hebrew Bible, this would suggest a levelling activity on the part of the Syriac translators. The second question concerns the relation between Biblical Syriac and non-translated Classical Syriac corpora. To answer the question of whether Biblical Syriac is distinct from non-translated Syriac corpora and to what extent the source language has influenced the language of the target text, it is necessary to make a linguistic analysis of a corpus of 'original' Classical Syriac.[109]

A systematic investigation into these two issues was beyond the scope of the CALAP project. They will be two of the main research questions of the TURGAMA project, an advanced research project initiated by W.Th. van Peursen which will start in 2005.
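
The separation between registering patterns and interpreting them, as described in § 7.2, can be sketched as follows. The sketch assumes, purely for illustration, that Hebrew and Syriac elements have already been paired and labelled with a grammatical feature such as number; the function names `register_patterns` and `disagreements` and the sample labels are hypothetical and are not part of the CALAP software.

```python
from collections import Counter


def register_patterns(pairs):
    """Count (Hebrew label, Syriac label) correspondences without interpreting them."""
    return Counter(pairs)


def disagreements(patterns):
    """Select the registered patterns in which source and translation differ."""
    return {pair: count for pair, count in patterns.items() if pair[0] != pair[1]}


# Invented sample: grammatical number in paired Hebrew and Syriac elements.
pairs = [("sg", "sg"), ("sg", "pl"), ("pl", "pl"), ("sg", "pl"), ("pl", "pl")]

patterns = register_patterns(pairs)
print(patterns)
print(disagreements(patterns))
```

Only after such a registration would the patterns be interpreted, for example as translation technique, as the work of multiple translators, or as a feature of the Syriac language system.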
7.3 Ambiguous Forms and Working Hypotheses

Corpus linguistics is empirical in that it analyses the actual patterns of use in extant texts (§ 4.8). The textual analysis can be considered a kind of observation of the empirical data. The information technology guarantees the consistency of the analysis (§ 3.2). The empirical data serve to check suppositions about the language. A supposition has an intuitive character and will often take the current grammatical insights as its point of departure. If there are data to support the supposition, it becomes a hypothesis. If there are enough data to support the hypothesis, it becomes a theory.

Although the computer-assisted analysis follows a consistent and efficient procedure, the outcome of the analysis is not always unequivocal. In that case the human researcher has to find out whether the task given to the computer was ambiguous, or whether the object of research contained ambiguities. This requires a procedure of disambiguation. In the parsing of words into morphemes, for example, a number of ambiguities will appear. Thus an adjective with the Alaph ending may be either a feminine form in the absolute state, or a masculine form in the emphatic state. More options are present if there is a verb form with only three radicals without prefix or suffix.

For the time being, it is the human researcher who takes the disambiguation decisions. These decisions function as working hypotheses that form the starting point for the analysis of the subsequent linguistic levels. Thus if an adjective with the Alaph ending is an attribute to a noun, the working hypothesis about its analysis will be confirmed or rejected at phrase level. If the human researcher has analysed it as a feminine absolute form, but it appears as an attribute to a masculine in the emphatic state, the preliminary disambiguation has to be corrected. If the adjective is the predicate of the nominal clause, the confirmation or rejection of the disambiguation takes place at clause level, where agreement between Subject and Predicate is taken into account.[110] In the future we hope to refine the programs in such a way that in the case of ambiguity the computer can suggest which analysis is the most probable because of contextual and statistical data.

[109] G. Goldenberg, 'Bible Translations and Syriac Idiom', in Dirksen-Van der Kooij, The Peshitta as a Translation, 25-40, esp. 25; T. Muraoka, 'Response to G. Goldenberg, "Bible Translations and Syriac Idiom"', in Dirksen-Van der Kooij, The Peshitta as a Translation, 41-46, esp. 41-42.
[110] In the example of the disambiguation of an adjective the procedure is straightforward, and even if the human researcher has only a basic knowledge of Syriac, it is unlikely that he or she will formulate the wrong 'working hypothesis' as long as he or she follows the current grammatical paradigms. However, the status of the disambiguation decisions as a working hypothesis becomes relevant in more complicated questions. See also § 1.2 of the following article in this volume.

8 Conclusion

The Hebrew Bible and the Ancient Versions can be approached as products of a long and complex history of composition, redaction, transmission and interpretation. At the same time they are the sources that provide data for computer-assisted linguistic analysis. In the present paper we have discussed some of the complexities involved in the study of these ancient texts and the way in which the computer-assisted analysis can respond to these complexities. In many ways our possibilities for research are limited. These limitations also constitute the challenges for the interdisciplinary debate: How can we fully exploit all textual witnesses if we cannot obtain the original text? How can we build and test hypotheses about the language system, if we do not know the complete set of lexical and grammatical features that were present in ancient times? How can we do justice to the hybrid presentation of the language system in the corpora? How can we identify linguistic traces of communicative and mental processes to which we have no direct access?

In the last part of this paper (§ 7) we have indicated some of the research strategies that have been developed in the CALAP project to respond to the challenges. These concern the procedure of the interdisciplinary analysis, the way in which a formal registration of elements in the source text and the translation interacts with traditional research questions about translation technique, the number of translators, and the influence of the source text, and the complex interaction of the human researcher and the computer in the interactive computer-assisted linguistic analysis. How these research strategies can be transferred into linguistic data types and analytical instruments will be discussed in the next paper in this volume.

There is still much uncertainty about the cultural and religious context in which the Peshitta originated and the purpose of the translators. Was this translation primarily intended to provide the Syriac Christians with an understandable, contemporary Bible, or to produce a faithful imitation of the Hebrew original? Any attempt to answer this question should start with a collection of data that enable a full linguistic analysis of the Peshitta and a comparison with the Masoretic Text at all linguistic levels. Accordingly, a systematic, consistent interdisciplinary linguistic and textual analysis is required not only for a deeper insight into the language systems involved, but also for a better understanding of the cultural and religious background in which these documents originated.

We will end this article with a remark that G.A. Miller made in 1965:[111] 'In a word, what I am trying to say, what all my preliminary admonitions boil down to, is simply this: Language is exceedingly complicated. Forgive me for taking so long to say such a simple and obvious thing.'

How to Transfer the Research Questions into Linguistic Data Types and Analytical Instruments

Eep Talstra, Konrad D. Jenner and Wido van Peursen
The basic task set by the goals of the CALAP project is to establish linguistic data and produce instruments that allow a researcher of the Peshitta to ex plain features of the transmitted Peshitta text(s), either in terms of the Syriac language system; or in terms of variations due to processes of transmission of the Hebrew and Syriac manuscripts; or in terms of the translation tech nique and theological preferences in the Peshitta. To put it in general terms: when comparing the Hebrew Bible and the traditions of the Peshitta transla tion, the question is: What is 'system' and what is 'particular'? or: what is to be explained in terms of language (system) and what is to be explained in terms ofliterature (composition and transmission)? This line of research requires both
data and analytical instruments.
Since the
CALAP project, from a technical point of view, can be seen as an extension of texts and instruments developed for the Amsterdam Hebrew database pro ject, this presentation will begin with a short explanation of the existing He brew data and their production (§§ l . l and 1.2). Secondly, some information will be given about the adaptations of the system of Hebrew data production that had to be made so that it could be used for the parsing of Syriac texts (§
1 .3).
Thirdly, the instruments will be presented which were designed to
perform the main analytical tasks, i.e. the synoptic presentation (§ 2.1) and the comparison of Hebrew and Syriac data (§ 2.2).
1 Text Data The first task to be performed was clear from the very start of the CALAP project, namely, to create a database of syntactically analysed Old Testament texts in Syriac, analogous to the database of syntactically analysed texts of the Hebrew Bible that is being developed and produced by the WlVU. To clarifY what type of parsing of the Hebrew texts is involved, the next III
Quoted in Jacobs-Rosenbaum, Transformationen, Slit und Bedeutung, 143.
section will present a small segment of Hebrew text (1 Kgs 2:1-9) that has been grammatically parsed from word level up to the level of clauses and clause connections.

1.1 A Database of Syntactically Analysed Hebrew Texts

The actual process of producing syntactic information has been described in other publications.1 Once produced, the actual linguistic data allow for a number of different presentations. By way of introduction we present first a short section taken from the final product, i.e. a fully analysed Hebrew text. The surface text presentation below is followed by an overview of the internal structure of the data themselves.

An Example of an Analysed Text (Surface Text): 1 Kings 2:1-3

The text is presented here in lines that equal simple clauses. In cases of syntactic embedding (not in this sample text) a line will represent only part of a clause.

[Diagram: the Hebrew text of 1 Kgs 2:1-3 in numbered clause lines (1-15), each with its parsed constituents (e.g. <Pr>, <Su>, <Co>), text type (N, NQ), clause-type label and predicate features, with indentation and vertical lines marking the clause hierarchy.]

Abbreviations and symbols:

Grammatical features of the predicate
  3sgM   3 singular masculine
  3pl-   3 plural
  1sg-   1 singular
  2sgM   2 singular masculine
  -sg-   singular

Parsing of clause constituents
  <Pr>   Verbal predicate
  <PC>   Nominal complement to predication
  <Su>   Subject
  <Ob>   Object
  <Co>   Complement
  <Cj>   Conjunction
         Adjunct
         Locative

Clause-type labels
  WayX   wayyiqtol (+<Su>) clause
  WQtl   w-qatal clause
  PtcA   Participle (active) clause
  PtcP   Participle (passive) clause
  xYqt   yiqtol (not in first position) clause
  InfC   Infinitive construct clause
  Defc   Part of clause, before or after an embedded clause

Text types
  N      Narrative section
  Q      Quotation (direct speech) section

The diagram given above presents the text of three verses from 1 Kings 2, divided into:

1 Textual Domains. A distinction is made between the narrative layer ('N') and the direct speech section used in it ('Q'). Domains that are embedded in other textual domains are indicated by a box drawn around them.
2 Clauses. These are defined as an arrangement of phrases, with a maximum of one verbal or nominal predication in a clause.
3 Phrases. To each phrase a parsing label is added, indicating its function in the clause.
4 Clause Hierarchy. Vertical lines '|' are used to indicate connections between clauses at a distance greater than one line. That means, for example, that not only is the infinitive clause of line 2 connected to line 1, but also the wayyiqtol clause of line 3. Since the connection of line 3 back to line 1 bridges the connection of line 2 to line 1, in the presentation of the text line 2 has been indented further to allow for the line drawn between 3 and 1.

1 E. Talstra, 'A Hierarchy of Clauses in Biblical Hebrew Narrative', in Narrative Syntax and the Hebrew Bible. Papers of the Tilburg Conference 1996 (ed. E.J. van Wolde; Biblical Interpretation Series 29; Leiden 1997), 85-118; id., 'Text Segmentation and Linguistic Levels. Preparing Data for SESB', published with the software package SESB (Stuttgart Electronic Study Bible) (Stuttgart 2004),
In the database itself these clause connections are also labelled by a code that marks the type of connection. The connection of line 2 back to line 1 is of the 'adjunct clause connection' type: the infinitive clause is part of the predication frame of line 1 (verbal predication + Subject + Adjunct, realised by an infinitive clause). On the basis of an internal code number for the preposition ל this clause connection is labelled: '64'. The start of a direct speech section is marked by '999'. This organisation of the data can be represented by a tree diagram of the following type:

[Tree diagram of 1 Kgs 2:1-2, lines [1]-[5], with the clause connections coded <64>, <200>, <64> and <999>.]

The connection of line 3 back to line 1 is of the 'parallel' type [wayyiqtol - wayyiqtol], for which the code '200' is used in the database. The reason for using a system of codes, rather than grammatical labels, is that this procedure always allows grammatical relationships to be established between clauses, even if one still has to discuss how exactly the linguistic features or the function of a particular clause connection should be labelled. The advantage of this procedure is that the database remains open to all kinds of further text linguistic investigations and that annotations can always be added on the basis of continued syntactic research.
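The coding of clause connections described above can be illustrated with a minimal sketch. It is not the database's actual storage format: the class name `ClauseAtom`, the field names and the English glosses in `CODE_LABELS` are assumptions made for illustration. Only the codes 64, 200 and 999 and the connections of lines 2 and 3 back to line 1 come from the text.

```python
from dataclasses import dataclass
from typing import Optional

# Connection codes quoted in the text; the glosses are informal and hypothetical.
CODE_LABELS = {
    64: "infinitive clause connection",
    200: "parallel clauses",
    999: "start of direct speech",
}


@dataclass
class ClauseAtom:
    line: int               # line number in the surface text
    mother: Optional[int]   # line this clause connects back to (None for the first line)
    code: Optional[int]     # numeric connection code, stored instead of a grammatical label


def describe(atom: ClauseAtom) -> str:
    """Render one connection; a code without a gloss is still a valid connection."""
    if atom.mother is None:
        return f"line {atom.line}: (root)"
    label = CODE_LABELS.get(atom.code, "not yet labelled")
    return f"line {atom.line} -> line {atom.mother} <{atom.code}> {label}"


# The connections stated in the text for the first three lines of 1 Kgs 2:1.
atoms = [
    ClauseAtom(line=1, mother=None, code=None),
    ClauseAtom(line=2, mother=1, code=64),
    ClauseAtom(line=3, mother=1, code=200),
]

for atom in atoms:
    print(describe(atom))
```

Because the relation is stored as a number and glossed separately, a connection can be recorded before its grammatical label has been agreed upon, which is the design choice the paragraph above argues for.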
1.2 Parsing and Data Types: The Production of Grammatically Analysed Data

The process of syntactic parsing can be described in terms of three different actions. The first action is text segmentation: a linear text is divided into linguistic segments (blocks) at all textual levels: morphemes, lexemes, groups of words (phrases), groups of phrases (clauses), groups of clauses (sentences). The second step is the calculation of grammatical and lexical features of the segments. The third action is the re-combination of these segments on the basis of their grammatical relations. In this presentation we omit the morpheme and word level analysis that have been presented elsewhere.2

2 Cf. A.J.C. Verheij, Grammatica Digitalis I. The Morphological Code in the 'Werkgroep Informatica' Computer Text of the Hebrew Bible (Applicatio 11; Amsterdam 1994); E. Talstra and C.J. Sikkel, 'Genese und Kategorienentwicklung der WIVU-Datenbank, oder: ein Versuch, dem Computer Hebräisch beizubringen', in Ad Fontes! Quellen erfassen - lesen - deuten. Was ist Computerphilologie? (ed. C. Hardmeier, W.-D. Syring, J.D. Range, E. Talstra; Applicatio 15; Amsterdam 2000), 33-68. See also Bosman-Sikkel, 'Discourse on Method'.

With respect to the linguistic levels beyond word level analysis, it is important to note that these processes cannot always be applied in the same order as mentioned above. Frequently, linguistic features or relations found at a higher linguistic level may require a recalculation of segments or features at a lower level. Thus a participle used as an adjective in a Noun Phrase needs the extra label 'adjective' in addition to the initial label 'verb'. In a large number of cases, however, the parsing process can proceed from smaller elements to larger elements without needing recalculation. These are the cases one can start from.

Words

The text of 1 Kgs 2:2, line 1: אנכי הלך בדרך כל־הארץ can be divided into seven words. Their forms can be assigned to seven different lexemes, with the help of a so-called analytical lexicon. The lexemes receive a label indicating their part of speech; the word forms receive labels for, for example, 'person', 'number', 'gender', also with the help of a lexicon and a morphological paradigm applied by the programs3:

  אנכי    pers.pron.   1st.sg.
  הלך     verb         ptc.sg.
  ב       prep.
  דרך     noun         sg.
  כל      noun         sg.
  ה       det.art.
  ארץ     noun         sg.

3 Verheij, Grammatica Digitalis I.

Phrases

The segmentation of this line into seven words and the analysis at word level is followed by the segmentation at phrase level, which creates four groups of words. Also this process is based on the use of a 'lexicon' of accepted word groups produced in the analysis of previous texts. For example, when reading the words אנכי and הלך the program will not find the word order 'personal pronoun' - 'verb (participle)' within a phrase in the lexicon, with the result that it has to propose to take the word אנכי as a separate phrase. The same is true of the next sequence: 'verb (participle)' and 'preposition'. After that the program will find a match for the sequence of five words following and their morphological features. For example, the preposition ב has no pronominal suffix, otherwise the match would fail. The words דרך and כל are unmarked for 'state', so they do not resist the proposal of a longer phrase with genitive constructions here.

  אנכי               noun phrase
  הלך                verbal phrase
  ב דרך כל ה ארץ     prepositional phrase
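The lexicon-driven phrase segmentation described above can be sketched as a longest-match search over part-of-speech sequences. This is a simplified illustration, not the project's program: the name `PHRASE_LEXICON`, its three entries and the greedy longest-match strategy are assumptions, and the check of morphological features (pronominal suffix, state) mentioned in the text is left out.

```python
# Hypothetical 'lexicon' of previously accepted part-of-speech sequences.
PHRASE_LEXICON = {
    ("pers.pron.",): "noun phrase",
    ("verb",): "verbal phrase",
    ("prep.", "noun", "noun", "det.art.", "noun"): "prepositional phrase",
}


def segment(pos_tags):
    """Split a list of part-of-speech tags into phrases, longest known pattern first."""
    phrases = []
    start = 0
    while start < len(pos_tags):
        # Try the longest candidate first, then shorten it word by word.
        for end in range(len(pos_tags), start, -1):
            candidate = tuple(pos_tags[start:end])
            if candidate in PHRASE_LEXICON:
                phrases.append((PHRASE_LEXICON[candidate], candidate))
                start = end
                break
        else:
            # No accepted pattern: in an interactive run the user would decide here.
            phrases.append(("unresolved", (pos_tags[start],)))
            start += 1
    return phrases


# Part-of-speech sequence of the seven words of 1 Kgs 2:2, line 1.
tags = ["pers.pron.", "verb", "prep.", "noun", "noun", "det.art.", "noun"]
result = segment(tags)
for name, pattern in result:
    print(name, pattern)
```

Since 'personal pronoun' - 'verb' is not an accepted pattern, the pronoun and the participle each end up as a separate phrase, and the remaining five words match as one prepositional phrase.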
The last phrase also presents an example of the recalculation of grammatical features. Since it is concluded that בדרך כל הארץ is acceptable as a prepositional phrase, the implication is that the words דרך and כל can be labelled as nouns in the construct state. On the basis of that information a further analysis of the internal structure of the phrase is possible:

[Bracket diagram: the internal structure of the phrase בדרך כל הארץ, analysed in three successive bracketings.]

Also this type of analysis is performed on the basis of a 'lexicon' with previously accepted patterns. It must be emphasised that all of the procedures mentioned here are run in an interactive way, which means that the user has the possibility of choosing differently, though always within certain rules that allow for particular types of combinations.

Clauses

In a way similar to the division of phrases, the text can also be divided into clauses. How does one decide that the words coming after the line presented here, i.e. וחזקת, actually begin a new clause? Again the programs use a 'lexicon' built up in previous analytical runs. This lexicon contains sequences of phrases that have been identified as clauses on earlier occasions. Thus, when reading 1 Kgs 2:2 a program will be unable to find a match for the phrase order:

  'prepositional phrase' - 'conjunction: ו' - 'verb: qatal'

Since, however, a match with the pattern

  'conjunction: ו' - 'verb: qatal'

can be found, it can be concluded that the prepositional phrase is closing the actual clause and a new clause begins with this pattern. From ו we have a new clause.

Complex Phrases

Of course, things quickly become more complicated. For example, on many occasions the conjunction ו does not mark the start of a pattern that implies the beginning of a new clause. The conjunction ו is also used to connect words or phrases, as can be seen in verse 3: לשמר חקתיו מצותיו ומשפטיו ועדותיו. In such cases a more precise definition of patterns in the 'lexicon of phrases' is sufficient to elicit a decision. A pattern

  'noun (pron. suffix)' - 'noun (identical pron. suffix)'

and a pattern

  'noun (pron. suffix)' - 'conj. ו' - 'noun (identical pron. suffix)'

prompt the recognition of the pattern of four nouns in verse 3 as one phrase.

Complex Clauses: Sentences

On other occasions one will find that a segmentation of the text into 'clauses' is just a preliminary decision in terms of grammar. This is the case with the so-called embedded clauses. They occur, for example, where parts of a noun phrase are expanded by additional relative clauses, as in verse 3:

  1   למען
  2   תשכיל
  3   את כל
  4   אשר
  5   תעשה
  6   ו
  7   את כל
  8   אשר
  9   תפנה
  10  שם

How can this line be segmented into clauses? The program that proposes the segmentation into clauses will search its 'lexicon of phrase combinations' for matches with the sequence of phrases in the text. In reading the text phrase after phrase it will not find in its 'lexicon' any match for the pattern

  את+noun - rel: אשר         as in phrase 3 and 4; phrase 7 and 8,

nor for the pattern

  verb - conj: ו - את        as in phrase 5, 6, 7.

But it will find matches for the pattern

  rel: אשר - verb            as in phrase 4 and 5; phrase 8 and 9,

and

  conj: ו - את+noun          as in phrase 6 and 7.

The application of these patterns will result in the segmentation of this piece of text into four 'clauses', of which, however, the third one is not acceptable as a grammatically correct clause.

  a   למען תשכיל את כל
  b   אשר תעשה
  c   ואת כל
  d   אשר תפנה שם

These phenomena required a modification of the parsing process at sentence level that would make it possible to organise the four 'incomplete clauses'
(we usually call them 'clause atoms') of these lines into three grammatically correct clauses. This work remains an important part of the continuing research of our project of computer-assisted syntactic parsing. We have divided the task into two parts:

The first goal is to design linguistic data structures, so that one will be able to define and store the various linguistic data and their features in a consistent way. This goal has been achieved: a text-linguistic database of the Hebrew text now exists.4 To store the syntactic patterns of the 'clause atoms' listed above, we use the system of coding clause connections as presented in the previous paragraphs. A special set of codes is used to indicate that clause atoms that are disconnected by embedded clauses have to be combined and analysed as one grammatical clause. In such cases the code '223' is used to indicate that the predication is in the first 'clause atom'. A code '222' (not present in the example below) signifies that the predication is in the second 'clause atom'.

[Tree diagram of 1 Kgs 2:3, clause atoms a-d, with the connections coded <223> and <11> (twice).]

The second goal is to design analytical procedures to perform the text syntactic analysis. A program for clause hierarchy has been developed that, again on the basis of lists of previously accepted patterns, is able to propose the connection of clauses into sentences.5 Again, these procedures are necessarily interactive, since on many occasions a user will need the option to choose differently. This implies that computers are 'learning' because of the ever increasing number of previously accepted patterns stored in the database. The computers help the human researcher to be consistent; they do not provide complete objectivity.

4 Cf. Talstra-Sikkel, 'Genese und Kategorienentwicklung der WIVU-Datenbank'.
5 Talstra, 'A Hierarchy of Clauses'; id., 'Workshop: Clause Types, Textual Hierarchy, Translation in Exodus 19, 20 and 24', in Narrative Syntax, 119-132.

More on Complex Phrases

Complex phenomena are also present at a linguistic level below sentences and clauses, i.e. in complex phrases. A good example can be found in verse 5:

  לשני שרי צבאות ישראל לאבנר בן נר ולעמשא בן יתר

In this clause the prepositional phrase לשני שרי צבאות ישראל is expanded by a compound apposition consisting of two parallel parts (לאבנר ... ולעמשא ...), each of which has been expanded again by an apposition (בן נר, בן יתר).

In the parsing of such cases we follow again the order: (1) segmentation into linguistic units - (2) functional labelling. In the process of phrase pattern recognition the segmentation is made and the separate segments are labelled according to their grammatical type. These segments are called 'phrase atoms' (compare our remarks above on 'clause atoms'):

  atom   formal   text
  1      PPhr     לשני שרי צבאות ישראל
  2      PPhr     לאבנר
  3      NPhr     בן נר
  4      CjPhr    ו
  5      PPhr     לעמשא
  6      NPhr     בן יתר

The second step is the encoding of the relations between the segments by labels indicating the type of the relation and codes indicating the distance between the segments, counted in phrase atoms, counting backwards to the grammatically dominating segment.

  atom   functional          distance
  1      head: Complement
  2      Appos               -1Phr
  3      Appos               -1Phr
  4      Link                -2Phr
  5      Paral               -3Phr
  6      Appos               -1Phr

Abbreviations used:
  NPhr    noun phrase (atom)
  CjPhr   conjunction phrase (atom)
  PPhr    preposition phrase (atom)
  Appos   apposition to NPhr or PPhr
  Paral   parallel part of phrase
  Link    phrase level conjunction
  -1, -2, -3   the distance between phrase atoms (this distance is measured in phrase atoms, not in words)

1.2.1 THE FLOW OF LINGUISTIC INFORMATION. AN OUTLINE

In the previous section we have given some examples of the challenges that the computer-assisted grammatical analysis of Biblical Hebrew texts has to respond to. In this section we will present a more general outline of the procedures that have been (and are being) developed for the construction of the Werkgroep Informatica database of syntactically parsed Hebrew texts. Since this research has been done in the area of Biblical Hebrew, the examples will be taken from the Hebrew texts of 1 Kings 2. The parsing of Syriac texts, to be presented in the next section, is performed along the same lines.

Two characteristics of the procedure of analysis should be mentioned here. First, in our parsing procedures the application of descriptive linguistic procedures comes prior to philological analyses that would reflect more of the interpretative and exegetical approach to a specific text.

Second, the procedures used in this research are based on a distributional, taxonomic type of linguistic analysis, rather than on the application of predefined paradigms of linguistic functions taken from functional grammar or on the application of text production rules of a generative type. The idea is to start with the parsing of linguistic data as they present themselves in their textual order and to do this with the help of pattern matching.

More abstract or functional grammatical categories can only be produced when an extensive and consistent inventory of linguistic forms has been made with the help of the database. Therefore, in the phase of the 'production' of new, grammatically analysed data, the registration of simple and compound linguistic forms precedes the assignment of grammatical functions and values. The following statements may make clear in which order linguistic information is produced in the process of grammatical parsing that is described in the next paragraphs.

In terms of 'input' and 'output' the analytical process proceeds from:
  linguistic forms → linguistic functions
  simple forms → compound forms

In terms of the 'order' of data processing one proceeds from:
  grammatical data → lexical data
  lower level (morphology) → higher level (text structure)

1.2.1.1 The Main Line of Parsing

As a result of this, the main line of the procedures for the production of grammatically analysed data is ascendant. Lower level linguistic information is used to generate higher-level information. However, as mentioned in the previous paragraph, during the parsing process one quickly finds that the flow of linguistic information is far from fully 'ascendant'. At each level of grammatical analysis one also has to draw conclusions about features or relations that are valid at the previous level. This means that there will always be a secondary analysis of a descendent type, as in the case of a participle reanalysed as an attributive adjective (see below, p. 58, on 1 Kgs 2:8).

1.2.1.2 A Summary of the Parsing Process

At each level the parsing procedures start with an inventory of linguistic forms, which means a definition of linguistic elements not in terms of their function in a higher-level paradigm, but in terms of the distribution of their lower-level constituents. The inventory is made by a search for matches of patterns in the text with patterns present in data sets that are gradually built up during the parsing process.

Pattern Recognition

The comparison of a new text and the existing data sets is performed:
  to construct from an array of lexemes in a text one or more grammatically acceptable phrase atoms;
  to construct from an array of phrase atoms in a text one or more grammatically acceptable clause atoms;
  to construct from an array of clause atoms in a text a grammatically acceptable clause-hierarchy.

The analytical procedures, therefore, take into consideration the hierarchical linguistic structure of texts and follow primarily an ascendant order of analysis.

On the basis of the study of distributional patterns and their morphological and lexical features, a further linguistic analysis of grammatical relations and functions is made. This results in the production of functional labels for linguistic elements at all grammatical levels of a text. This linguistic research implies asking questions of various kinds, for example:

At what linguistic level can the blocks, features and relations be established?
What morphological, lexical or syntactical restrictions and conditions exist?
What grammatical and lexical patterns can be recognised by the machine?
What grammatical decisions are dependent on lexical data or semantic information?

The best way to answer these questions is to start from the sets of grammatical data that are created and updated during the process of text parsing. The presence of more or less complex texts in the corpora that have to be analysed raises the question of method: How should one present the text segments (blocks) and their various relations in a text database and make them retrievable? And how should one establish the various types of segments and their relations at the different linguistic levels?

1.2.2 DATA TYPES AND DATA SETS

In this paragraph we will list a selection of the most important data types that are produced as part of the Hebrew database. These data types are designed to meet the requirement of the first goal mentioned (see p. 52): to store the results of text syntactic analysis in a consistent way, even if we have not yet been able to find out how many of the analytical procedures can be done completely automatically.6 At each linguistic text level discussed here (words, phrases, clauses, sentences) we will illustrate the three categories of linguistic data distinguished: Blocks, Features and Relations.

1 Data Types: Blocks:
  Morphemes
  Words
  Phrase atoms (distributional: NP [noun phrase], PP [preposition phrase], VP [verb phrase])
  Phrases (functional: <Subject>, +Infinitive, or ו + NP + VP)
  Clauses (functional: W-X-Qatal clause, or Inf. clause as Complement)
  Sentence atoms (distributional) and Sentences (functional)

Some examples taken from 1 Kgs 2:1: ויקרבו ימי דוד למות ויצו את שלמה בנו לאמר
  Morphemes: י and ו in ויקרבו
  Lexemes/Words: ו, קרב, יום, etc.
  Phrase atoms (distributional): [ימי דוד] [את שלמה] [בנו]
  Phrases (functional): [ימי דוד] [את שלמה בנו]
  Clause atoms (distributional = 2 blocks): [ויקרבו ימי דוד] [למות]
  Clauses (functional = 1 block): [ויקרבו ימי דוד למות]
  Sentence atoms/sentences: In the actual data the difference between sentence atoms (= main clauses + embedded clause that functions as one of its constituents) and sentences (= clause connections of the 'final' or 'consecutive' type) has not yet been applied. Further research is also performed to clarify how larger blocks of text, such as 'Paragraphs' and maybe 'Episodes', could possibly be established by constructing them from the more basic blocks mentioned here.

2 Features:
  Morpheme features: These are defined by the position of a morpheme in the paradigm of grammatical word functions
  Lexeme and word features:
    lexical part of speech
    tense, person, number, gender, root formation
    state, phrase-dependent or functional part of speech
  Phrase-atom features and phrase features: phrase type, determination
  Clause-atom features and clause features:
    grammatical type
    phrase-level clause

Some examples taken from 1 Kgs 2:1: ויקרבו ימי דוד למות ויצו את שלמה בנו לאמר

Morpheme features: These are defined by the position a morpheme takes in a particular set of morphemes. For example, the morpheme י is found both in the set of nominal endings (as in ימי) and in the set of verbal prefixes. So the verbal morphemes י and ו as in ויקרבו decide about verbal tense, person and number. NB: The ו in initial position is doing double duty. It is a lexeme, i.e. the conjunction ו, but in combination with the verbal morphemes י and ו it also establishes the wayyiqtol form of the verb קרב.

6 For a further elaboration see C.J. Doedens, Text Databases. One Database Model and Several Retrieval Languages (Language and Computers, Studies in Practical Linguistics 14; PhD diss., University of Utrecht; Amsterdam 1994).
Lexeme and word features:
  lexical part of speech: conjunction ו, the verb קרב, noun יום, etc.
  tense, person, number, gender, root formation: ויקרבו: wayyiqtol (or imperfect consecutive), third person, plural, masculine, qal.
  state, phrase-dependent or functional part of speech: ימי: construct state, noun. For an example of a functional part of speech, see the expression קללה נמרצת in verse 8. The participle נמרצת (fem., absolute state) is an attribute of the noun קללה. Therefore its part of speech functionally changes into 'adjective'.

Phrase-atom features and phrase features:
  phrase type, determination, grammatical function: ימי דוד: Noun Phrase, determined (by name), Subject

Clause-atom features and clause features:
  grammatical type: [ויקרבו ימי דוד] (wayyiqtol + subject clause); [למות] (infinitive clause)
  phrase-level clause: [למות] (adjunct clause)

Sentence atoms are composed from clauses. For example, a wayyiqtol clause + an infinitive clause. In the database no particular features have been assigned to sentence atoms.

3 Relations:
  Phrase-level relations between words or phrase atoms: attributive, genitive
  Clause-level relations between phrases: apposition, specification, parallel and link
  Text-level relations between clauses or clause atoms: attributive, adjunctive, coordinated, start of direct speech

Some examples taken from 1 Kgs 2:1-9, 24
Phrase-level relations between words, phrase atoms, clause atoms:
  attributive: v. 8; v. 9
  genitive: v. 2; v. 8

Clause-level relations between phrase atoms:
The relation type is coded together with the distance, counting back to the governing phrase.
  apposition: v. 8
  specification: v. 8 (<spec-3>)
  parallel and link: v. 5 (<par-3>)

Text-level relations between clauses or clause atoms:
The type is mentioned in connection with the distance, counting back to the governing phrase or clause.
  attributive: v. 4
  object: v. 5
  adjunct: v. 1
  coordinated: v. 24
  start of direct speech: v. 4
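The distance-coded relations listed above (e.g. <par-3>) can be resolved to their governing segment by counting backwards in the sequence of atoms. The following is a minimal sketch under stated assumptions: the class `PhraseAtom`, its field names and the function `governing_index` are hypothetical, and the six-atom sequence is an illustration modelled on the labels used for the v. 5 example (Appos, Link, Paral), not data taken from the database.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PhraseAtom:
    kind: str                       # formal type: PPhr, NPhr or CjPhr
    relation: str                   # functional label: head, Appos, Paral, Link
    distance: Optional[int] = None  # negative, counted backwards in phrase atoms


def governing_index(atoms: List[PhraseAtom], i: int) -> Optional[int]:
    """Return the index of the atom that governs atom i, or None for a head."""
    distance = atoms[i].distance
    if distance is None:
        return None
    target = i + distance
    if distance >= 0 or target < 0:
        raise ValueError(f"atom {i}: distance {distance} does not point to an earlier atom")
    return target


# Illustrative sequence: a head, two appositions, a linking conjunction,
# a parallel part three atoms back, and a final apposition.
atoms = [
    PhraseAtom("PPhr", "head"),
    PhraseAtom("PPhr", "Appos", -1),
    PhraseAtom("NPhr", "Appos", -1),
    PhraseAtom("CjPhr", "Link", -2),
    PhraseAtom("PPhr", "Paral", -3),
    PhraseAtom("NPhr", "Appos", -1),
]

links = [governing_index(atoms, i) for i in range(len(atoms))]
print(links)
```

Storing only a label and a backward distance keeps each atom self-describing: the governing segment is recovered by simple index arithmetic, without a separate pointer table.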
On p. 65 we present an overview of these data structures as applied to 1 Kgs 2:1. In § 1.3.2 we present a text syntactic overview of the Hebrew text of 1 Kgs 2:1-11 together with a similar presentation of the Syriac text.

Abbreviations used in the data presentation:
  Vs           verse
  LexP         lexical part of speech
  prf          preformative
  rtf          root formation
  vbe          verbal ending
  nme          nominal ending
  pnSfx        pronominal suffix
  VbFrm        verbal form / tense
  Pers         person
  Numb         number
  Gend         gender
  State        abs. or constr. state
  PhrP         phrase dep. part of speech
  PType        phrase type
  Dtrm         phrase determination
  Sub-PhrTyp   phrase-internal relations
  Constit      clause constituent parsing
  Lexeme       lexical form of word
  Word         textual form of word

Subcategories:
  PhrP              conj[unction], prop[er name], noun, verb
  PType             CjP = Conjunction Phrase, VP = Verb Phrase, NP = Noun Phrase
  Dtrm              det = determined, idet = indetermined
  Sub-Phrase Type   +rgE = regens connection with next Element, -rcE = rectum connection with previous Element
  Constit           Conj[unction], Pred[icate], Sub[ject], Obj[ect], Appo[sition]

The presentation of the words and constituents of each clause is followed by two lines with higher-level linguistic information:

The Functional information presents the sentence number [Sent], clause number within the sentence [Cl], the clause type [VbCl: verbal clause; or NmCl: nominal clause] and the numbers of the phrases in this clause.

The Formal information presents information about the position of the clause in the textual hierarchy: the line number [Line] and the relations this line has to other lines in the text [rel. to Line]. These relations are in codes: 64 = +Infinitive connection; 200 = formally parallel clauses; 999 = start of direct speech section. (This is just a selection applicable to the data presented here.)
1.3 The Production of Syriac Data
or, selecting manuscript 9a1:
In technical terms the goal of the CALAP project is to have both a Hebrew and a Syriac textual database of biblical texts that makes it possible to compare their respective linguistic formats. This goal requires three further stages of computerised analysis:

1 The preparation of Syriac data, comparable to the Hebrew data produced: the transliteration of the text as prepared and published by the Peshitta Institute Leiden is converted into an electronic format that can be used by programs for grammatical parsing (§ 1.3.1).
2 The parsing process itself, equivalent to the process developed in earlier research for the parsing of Hebrew texts. Existing programs had to be revised to have them work with Syriac texts and have them build data sets of Syriac linguistic analysis (§ 1.3.2). To that end, lists of the morphemes used in Syriac also had to be composed, which in combination with a morphological paradigm (see § 1.2) should allow for the calculation of word functions from actual patterns of morpheme combinations found with textual words.
3 The design of new programs for the linguistic comparison of Hebrew and Syriac data on the level of lexemes, phrases, clauses and text (§ 2).

1.3.1
STAGE J : MACHINE-READABLE SYRIAC DATA
1 .3 . 1 . 1 Reformatting First we needed a transformation of the eXlsttng electronic text of the Peshitta Institute Leiden into a consonantal text according to formats used by the Amsterdam Werkgroep Informatica. Below we present a sample from the text format of the Peshi(ta Institute, a text including manuscript variants, and a sample of the reformatted text. It is possible to select from the input text the variant material present in one of the manuscripts. input
program
output
@1R2 1
wqrbw ywm: why ddwyd
lmmt
%verse 2 , 1 WQRBW JWM "WHJ DOWJD LMMT
WPQD LCLJMWN BRR W>MR
J .3 . 1 . 2 Adding Morphological Segmentation A second program ('Analyse') is used to add markers of morphological seg mentation into the text, according to a paradigm that the research team has established on the basis of Syriac morphology presented in classical gram mar. The program uses an 'analytical lexicon' to make proposals. Words that it cannot find in the lexicon are to be analysed by the user according to the paradigm established. New segmentations are inserted into the analytical lexicon. In this way the lexicon grows during the process of data preparation. Below we present a sample from this Syriac Analytical Lexicon. The column 'input' has the plain word form as it occurs in the consonantal text to be analysed. The column 'output' has the morphological segmentation of that word form. input
-
output
> � XWHJ
>X/ ( J&W+HJ
> - XJHWN
>X/J+HWN
> - XJKWN
>X/J+KWN
> - LH>
>LH/ (J->
> - LHJHWN
>LH/J+HWN
> - LHJHJN
>LH/J+HJN
> - LHJK
>LH/J+K
> R LHJKWN
>LH/J+KWN
> R MJN
>M(>/JN
>"NJN
>NJN
> - NCJH
>NC/J+H=
wpqd l$lyrnwn brh w ' mr [lhl
- g al l i
pil2wit
'Reges02' %verse 2 , 1 WQRBW JWM-WHJ nnWJD LMMT
WPQD LCLJMWN BRH W>MR LH
The morphological segmentation involves the insertion of morpheme mark ers and the reformatting of the text into verses.
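The two Stage 1 steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's pil2wit or Analyse programs: the character correspondences in `CHAR_MAP` are inferred solely from the sample above, the selection of manuscript variants is left out, and the names `reformat_word`, `segment_words` and `ask_user` are hypothetical.

```python
from typing import Callable, Dict, List

# Assumption: correspondences read off the sample above (y -> J, $ -> C,
# ' -> >, : -> "); the actual conversion table is not given in the text.
CHAR_MAP = str.maketrans({"y": "J", "$": "C", "'": ">", ":": '"'})


def reformat_word(word: str) -> str:
    """Convert one word of the input transliteration to the consonantal format."""
    return word.translate(CHAR_MAP).upper()


def segment_words(
    words: List[str],
    lexicon: Dict[str, str],
    ask_user: Callable[[str], str],
) -> List[str]:
    """Look up each word form in the analytical lexicon.

    Unknown forms are handed to ask_user (a stand-in for the manual analysis)
    and the answer is stored, so the lexicon grows while data are prepared.
    """
    segmented = []
    for word in words:
        if word not in lexicon:
            lexicon[word] = ask_user(word)
        segmented.append(lexicon[word])
    return segmented


# Reformat one input line (variant material omitted for simplicity).
line = "wpqd l$lymwn brh w'mr"
consonantal = [reformat_word(w) for w in line.split()]
print(consonantal)
```

Keeping the lexicon as a plain mapping from word form to segmentation mirrors the two-column sample; a real implementation would presumably also have to store alternative segmentations for ambiguous forms.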
input                          program    output

'Reges02'                      Analyse    '1Kings02.at'
%verse 2,1                                %language syriac
WQRBW JWM"WHJ DDWJD LMMT                  1R 2,1 W-QRB[W JWM/(J&W+HJ D-DWJD/ L-!M!M(WT[/
WPQD LCLJMWN BRH W>MR LH                         W-PQD[ L-CLJMWN/ BR/+H W->MR[ L+H
                                          1R 2,2 >N> >ZL[/ >N> B->WRX/-> D-KL/+H= >R-> !!@>T@XJL=[ W-!!HW(>&J[ GBR/->
'Analytical Lexicon'

From this stage of analysis the Syriac texts are available in a format that can be read and analysed by updated versions of the existing programs for the analysis of Hebrew texts (Stage 2). The process of analysis of the Syriac texts runs parallel to the stages outlined above for Hebrew in § 1.2.

1.3.2 STAGE 2: GRAMMATICAL ANALYSIS OF THE SYRIAC DATA

The second stage of the research concerned the adaptation and development of the existing programs for the analysis of Hebrew texts into a much more 'language-independent' series of programs for the morphological and syntactical analysis of biblical languages and documents. This concerns the development of sets of lemmas (Syriac lexicon), sets of grammatical morphemes and sets of grammatical rules for the combination of morphemes and the functional analysis, at the level of words, phrases, clauses and clause constituents. Much of this work has been discussed and prepared by all the members of the Leiden and Amsterdam team in various combinations, depending on the type of cooperation: on programming, on linguistic analysis, or on text-critical analysis. It may be sufficient here just to present the results of the various steps in the grammatical analysis of the Syriac text. The process of analysis and parsing is similar to the process described for the parsing of Hebrew texts in § 1.2.

1 Morpheme level (final result of stage 1)
  1 KGS 2:1   W-QRB[W JWM/(J&W+HJ D-DWJD/ L-!M!M(WT[/

2 Lexeme and word level (adding grammatical word functions)
  1 KGS 2:1   W-QRBW JWMWHJ D-DWJD L-MMT     (textual words)
  1 KGS 2:1   W QRB JWM D DWJD L MWT         (lexemes + word functions)

3 Phrase level (adding phrases, phrase types and phrase functions)
  1 KGS 2.1   [W-] [QRBW] [JWMWHJ] [D-DWJD] [L-] [MMT]

4 Clause level (adding clauses + clause types + constituent parsing)
  1 KGS 2:1   [W-] [QRBW] [JWMWHJ / D-DWJD <sp><Su>]
  1 KGS 2.1   [L-] [MMT]
  1 KGS 2:1   [W-] [PQD] [L-CLJMWN / BRH]
  1 KGS 2.1   [W-] [>MR] [LH]

5 Text level (see the clause hierarchy below, § 1.3.2.2)

At this stage the grammatical analysis ends with a full textual hierarchy of the Syriac text. The resulting data can now be compared with those of the textual hierarchy of the same chapter in the Hebrew text. See below, § 1.3.2.1 (1 Kgs 2:1-12 in Hebrew) and § 1.3.2.2 (1 Kgs 2:1-12 in Syriac).
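How the step from the morpheme level to the word level might begin can be illustrated with a small Python sketch. It rests on assumptions drawn only from the samples shown in this section: a hyphen before the stem separates a proclitic, `[` marks a verbal ending, `/` a nominal ending and `+` introduces a pronominal suffix. The function name `split_word` is hypothetical, and the project's actual paradigm distinguishes far more markers than this.

```python
import re
from typing import Dict, List, Optional, Union

Parts = Dict[str, Union[List[str], str, Optional[str]]]


def split_word(encoded: str) -> Parts:
    """Split a morpheme-level word such as 'W-QRB[W' into its main parts.

    Assumption: only hyphens that precede the first '[' or '/' separate
    proclitics; anything after that point is left inside the core.
    """
    marker = re.search(r"[\[/]", encoded)
    head_end = marker.start() if marker else len(encoded)
    head, tail = encoded[:head_end], encoded[head_end:]

    *proclitics, stem = head.split("-")
    core, _, suffix = (stem + tail).partition("+")

    if "[" in core:
        kind = "verbal"
    elif "/" in core:
        kind = "nominal"
    else:
        kind = "other"
    return {
        "proclitics": proclitics,
        "core": core,
        "suffix": suffix or None,
        "kind": kind,
    }


for word in ["W-QRB[W", "BR/+H", "L+H"]:
    print(word, split_word(word))
```

The word functions themselves (person, number, gender, state and so on) would then be calculated from the combination of markers found in the core, which is the task of the morphological paradigm and is not attempted here.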
1.3.2.1 1 Kgs 2:1-12. Hebrew Text. Presentation of Clause Hierarchy

[Clause hierarchy chart: the clauses of 1 Kgs 2:1-12 in Hebrew script, arranged according to their position in the textual hierarchy, with constituent labels such as <Cj>, <Pr>, <Su>, <Ob> and <Co>, and with the columns Ln, Ttyp, ClLab, Vpng and v.]
1.3.2.2 1 Kgs 2:1-12. Syriac Text. Presentation of Clause Hierarchy

[Clause hierarchy chart: the clauses of 1 Kgs 2:1-12 in Syriac script, arranged according to their position in the textual hierarchy, with constituent labels such as <Cj>, <Pr>, <Su>, <Ob> and <Co>, and with the columns Ln, Ttyp, ClLab, Vpng and v.]
2 Instruments for Comparison
Once a Syriac database has been produced in a format similar to that of the Hebrew database, a number of instruments and procedures are used for further research.

In the first place it is important to perform independent analyses of the Syriac and the Hebrew material, for example, to construct sets of clause types, sets of phrase types, or sets of verbs in combination with their complements. This allows the researcher to analyse the Hebrew and Syriac data separately at the level of language structure.

In the second place it is necessary to construct a synoptic overview of the Hebrew and Syriac data, both in terms of surface text lines and in terms of internal grammatical information. This analysis will allow the researcher to compare text-level data and ask questions about how systematic or how unique certain differences between a Hebrew text and its Syriac counterpart might be. Below we will present some samples of this type of synoptic research.
2.1 The Hebrew and Syriac Synoptic Data Types

In order to create various options for the comparison of Hebrew texts of the Old Testament and their Peshitta counterparts, we have decided to reformat the Hebrew and Syriac data into more compact data categories that would facilitate the comparison of both texts in terms of similar linguistic concepts. A new program has been developed that reads the Hebrew and Syriac data of a particular chapter and reformats them into five data types, applied to each line of text (for samples see below, § 2.2.1-2):

1 the surface text, segmented into phrases;
2 the lexemes used in each phrase;
3 a selection of grammatical functions of each lexeme and of the phrases they are part of;
4 the constituent parsing of each phrase;
5 the clause-type label.
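The five data types can be pictured as one record per line of text. The following Python sketch is an illustration only: the class name `SynopticLine`, its field names and the example values are assumptions made for the sake of the example, not the file format actually used by the project.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SynopticLine:
    """One line of text, reduced to the five data types listed above."""

    reference: str            # e.g. "1 KGS 02,01" (added for orientation)
    surface: List[str]        # 1 surface text, segmented into phrases
    lexemes: List[str]        # 2 lexemes used in each phrase
    functions: List[str]      # 3 selected grammatical functions per phrase
    constituents: List[str]   # 4 constituent parsing of each phrase
    clause_type: str          # 5 clause-type label

    def __post_init__(self) -> None:
        # Data types 1-4 are given phrase by phrase, so they must line up.
        lengths = {
            len(self.surface),
            len(self.lexemes),
            len(self.functions),
            len(self.constituents),
        }
        if len(lengths) != 1:
            raise ValueError("per-phrase data types differ in length")


# Illustrative values only; the function labels are placeholders.
example = SynopticLine(
    reference="1 KGS 02,01",
    surface=["W-", "JQRBW", "JMJ DWD"],
    lexemes=["W", "QRB[", "JWM/ DWD/"],
    functions=["conjunction", "verb", "noun phrase"],
    constituents=["Cj", "Pr", "Su"],
    clause_type="WayX",
)
print(example.clause_type, len(example.surface))
```

Storing the per-phrase information in parallel lists keeps each phrase's text, lexemes and labels at the same index, which is what a later phrase-by-phrase comparison of Hebrew and Syriac lines would need.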
With these data it is possible, first, to calculate which Hebrew clauses are, fully or partially, linguistically parallel to their Syriac counterparts, so that a program will be able to make proposals for a synoptic presentation. On the basis of the five data types listed the texts can be compared in terms of words, but also in terms of grammatical features, regardless of the different Hebrew and Syriac lexemes used. Second, once a synoptic rearrangement of the data has been produced, one will be able to produce material for textual comparison at various linguistic levels - i.e. Hebrew and Syriac lexemes, phrases and clauses - on the basis of their parallel alignment in the synopsis.
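A calculation of this kind can be sketched as follows. The scoring rule is an assumption chosen for illustration, since the text does not specify how the project's program weighs the data types: the clause-type label and the sequence of phrase types each contribute half of the score, and the phrase sequences are compared with `difflib.SequenceMatcher` from the Python standard library. The function name `parallel_score` and the sample values are hypothetical.

```python
from difflib import SequenceMatcher
from typing import List, Tuple

# A clause is reduced here to (clause_type, [phrase_type, ...]).
Clause = Tuple[str, List[str]]


def parallel_score(hebrew: Clause, syriac: Clause) -> float:
    """Return a value between 0.0 (unrelated) and 1.0 (fully parallel).

    Only grammatical categories are compared, so the score is independent
    of the different Hebrew and Syriac lexemes used.
    """
    h_type, h_phrases = hebrew
    s_type, s_phrases = syriac
    type_match = 1.0 if h_type == s_type else 0.0
    phrase_match = SequenceMatcher(None, h_phrases, s_phrases).ratio()
    return (type_match + phrase_match) / 2


# Hypothetical clauses: same phrase structure, different clause type.
hebrew_clause = ("WayX", ["6", "1", "2"])
syriac_clause = ("WQtl", ["6", "1", "2"])
print(parallel_score(hebrew_clause, syriac_clause))
```

A program could propose as synoptic counterparts those line pairs whose score exceeds some threshold and leave the remaining lines for the researcher to align by hand.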
The results of these procedures of collecting and comparing should facilitate the discussion on the interaction of textual criticism and linguistic analysis to which the CALAP project aims to contribute.
2.2 Process of Linguistic Comparison

When the final products of the morphological and syntactical parsing of Hebrew and Syriac texts (§§ 1.2 and 1.3) are available, the following procedure is applied.

Hebrew.data           Syriac.data             [= §§ 1.3.2.1 and 1.3.2.2]
     ↓                     ↓
Prepare [Hebrew]      Prepare [Syriac]        [reformat data = § 2.1]
     ↓                     ↓
Hebrew.Paral.Data     Syriac.Paral.Data       [= format for comparison] [below in § 2.2.1]
     ↓                     ↓
             Synopsis                         [produces Hebrew and Syriac synoptic data]
                ↓
Hebr.Syr.Synopsis + Hebr.Syr.ParalTxt         [= Synoptic presentation of texts] [below in § 2.2.2]
                                              [= Synoptic data files] [below in § 2.2.2]
                ↓
             Compare                          [Comparison of Hebrew and Syriac data]
                ↓
Parallel.Lexemes                              [below in § 2.2.3]
Parallel.Phrases                              [below in § 2.2.3]
Parallel.Clauses                              [below in § 2.2.3]
Below we present some samples of text files as they are produced in this process. Some of these files are just 'data', which means that they are supposed to be read by programs rather than by the human researcher. Some of these files are presentations of the same data in a format that much more resembles the usual scholarly data. In each case we will indicate whether the example presented is just 'data' to be processed further, or whether it is meant also for scholarly reading. Other contributions to the present volume will discuss how the research evaluates and uses the data produced. The texts are listed in the transliteration that is used during the production process.
2.2.1 EXAMPLES OF SYNOPTIC PREPARATION

Hebrew Parallel Data, formatted into categories for comparison with the Syriac text. Sample text: 1 Kgs 2:1-3. Data file only. The lines in Hebrew have been added here for the convenience of the reader.

1 KGS 02,01
             [W-]      [JQRBW]    [JMJ DWD <Su>]
Lexeme       W         QRB[       JWM/ DWD/
PhraseType   6(6)      1(1:11)    2(2.1,3.2)
PhraseLab    509[0]    501[0]     502[0]
ClauseType   WayX

[Further records for 1 Kgs 2:1-3 in the same layout: reference, phrase-segmented text, Lexeme, PhraseType, PhraseLab and ClauseType.]

Syriac Parallel Data, formatted into categories for comparison with the Hebrew text. Sample text: 1 Kgs 2:1-3. Data file only. The lines in Syriac have been added here for the convenience of the reader.

[Records for 1 Kgs 2:1-3 in the same layout as the Hebrew sample.]