Walter J. Savitch
Abstract Machines and Grammars
Little, Brown Computer Systems Series
Abstract Machines and Grammars
Walter J. Savitch
University of California, San Diego
Little, Brown and Company
Boston    Toronto
Library of Congress Cataloging in Publication Data

Savitch, Walter J., 1943-
    Abstract machines and grammars.
    Bibliography: p. 206
    Includes indexes.
    1. Machine theory.  2. Formal languages.  I. Title.
QA267.S29    511    81-16136
ISBN 0-316-77161-9    AACR2
Copyright © 1982 by Little, Brown and Company (Inc.)

All rights reserved. No part of this book may be reproduced in any form or by any electronic or mechanical means including information storage and retrieval systems without permission in writing from the publisher, except by a reviewer who may quote brief passages in a review.

Library of Congress Catalog Card No. 81-16136
ISBN 0-316-77161-9

Published simultaneously in Canada by Little, Brown & Company (Canada) Limited

Printed in the United States of America
To My Parents
Preface
Computability theory is concerned primarily with determining what tasks can be accomplished by idealized computer programs. Formal language theory deals primarily with the ability of certain grammars to express the syntax of various types of languages. Both fields are branches of formal mathematics. The former theory has its roots in mathematical logic, the latter in mathematical linguistics, which includes the mathematical theory of compiler design. In recent years, these two subjects have been studied extensively by computer scientists and have interacted to the point where, from the perspective of a computer scientist, they are frequently viewed as a single unified subject. As such, they have become a standard course topic in many computer science curricula.

This book contains an introduction to these two interrelated subjects. It is suitable for an introductory course at either the advanced undergraduate or beginning graduate level. It can also be read and understood, without the aid of an instructor, by anybody who has either a background in theoretical mathematics or a reasonable amount of programming experience and some exposure to mathematical proofs.

The present text grew out of notes used in both graduate and undergraduate courses at the University of California at San Diego. These courses were designed for students with a good deal of experience in the areas of computer programming and systems, but with limited exposure to formal mathematics. This explains the slightly unorthodox arrangement of the material in this book. In teaching these courses it was found that the notion of a context-free grammar was easily understood by students who have had a fair amount of programming experience. In fact it proved to be the one notion in the course that they were most comfortable with. Primarily for this reason, the book presents context-free grammars first and uses these grammars as the central focus of the book. This approach, while it may not be the traditional way to structure the material, has proved to be pedagogically sound. It has also proved to be a sound and efficient approach mathematically, since context-free grammars turn out to be a convenient and efficient vehicle for presenting a number of other key notions in computability and formal language theory.

There are some points to keep in mind when using this book as a text. In reading this book you will find that a number of details are left as exercises. These exercises should be done by the student. They are usually not difficult, but they do ensure that the student is an active participant in the development of the material presented. Occasionally these exercises, although routine, are lengthy. These instances are noted in the text and, for these exercises, a quick mental review of what is involved should be sufficient for understanding the material.

The material presented has proved to be a bit too much for a one-quarter course, and it may be too much for a one-semester undergraduate course. It is possible to choose a subset of the sections as a shorter course. Sections marked with a star may be omitted without missing results needed later on in the book.

In preparing this book I have sought and received helpful comments from a number of people. I would like to especially thank the following people for their helpful comments on earlier versions of the text: K. Apt, P. W. Dymond, M. L. Fredman, J. Goldstine, B. Levergood, G. Rozenberg, M. J. Stimson, D. Vermeir, P. M. B. Vitanyi, and most especially the students who suffered through and helped correct mistakes in earlier versions of this text. I would also like to thank Sue Sullivan for her expert assistance in typesetting and preparing the manuscript.

W.J.S.
Contents

Chapter One  Introduction
    Procedures and Algorithms
    Sets, Relations, and Functions
    Alphabets and Languages
    Summary of Notation
    Exercises

Chapter Two  Context-Free Grammars
    Definitions and Examples
    Parse Trees and Ambiguity
    Leftmost Derivations
    Chomsky Normal Form
    The Pumping Lemma
    Exercises

Chapter Three  Finite-State Acceptors
    Conceptual Description
    Nondeterministic Machines
    Formal Definitions and Basic Results
    Equivalence of Deterministic and Nondeterministic Acceptors
    The Pumping Lemma and Languages Which Are Not Finite-State
    *Regular Expressions
    *Two-Way Finite-State Acceptors
    Exercises

Chapter Four  Finite-State Transducers
    Definitions and Examples
    *The Pumping Lemma for Finite-State Transducers
    *The Multiplication Function
    Exercises

Chapter Five  Turing Machines and Computable Functions
    The Turing Machine Model
    TL/1, A Programming Language for Turing Machines
    TL/1 Abbreviations
    TL/1S, TL/1 with Subprocedures
    A Compiler for TL/1S
    PS/1, a High-Level Language for Procedures to Compute Functions
    A Compiler for PS/1
    The Church/Turing Thesis
    Exercises

Chapter Six  Universal Turing Machines and the Halting Problem
    A Universal Turing Machine
    The Halting Problem
    Exercises

Chapter Seven  Turing Machines as Accepting Devices
    Recursively Enumerable Languages
    Recognizable Languages (Recursive Languages)
    Exercises

Chapter Eight  *Nondeterministic Turing Machines
    An Informal Example
    Definitions and Examples
    Equivalence of Nondeterministic and Deterministic Turing Machines
    Exercises

Chapter Nine  Pushdown Automata
    Definitions and Examples
    Equivalence of PDA's and Context-Free Grammars
    *Pushdown Transducers
    Deterministic PDA's
    Exercises

Chapter Ten  Unsolvable Questions About Context-Free Languages
    The Post Correspondence Problem
    Containment Questions
    Ambiguity
    Inherent Ambiguity
    Exercises

Appendix: The Greek Alphabet
References for Further Reading
Index of Notation
Alphabetic Index
Chapter One
Introduction
The term machine in the title of this book refers to computers. A more descriptive title would be "computing machines and grammars." Much of the formal development of the subject will be stated in terms of machines or computers. However, the primary object of study will not be computers so much as computer programs.

We assume that the reader has a general idea of what a computer program is. It is a set of instructions for a computer to follow. Of course, the instructions could be carried out by a human being instead of a computer. In this context, you might consider a human being to be a computer. In any event, the set of instructions describes a process to be carried out, and the set of instructions can be given even if no computer is available. Such sets of instructions are frequently called effective procedures or, in certain circumstances, algorithms. If you interpret the term "computer program" loosely enough, then an effective procedure is approximately the same thing as a computer program.

Later on we will give formal definitions of what we mean by effective procedures and algorithms. In this chapter we will give an intuitive idea of what effective procedures and algorithms are. For this intuitive discussion, it is best to think of the instructions as being given to a human being in an informal but still completely precise way. After this intuitive discussion, we go on to describe some of the mathematical notation we will use.

PROCEDURES AND ALGORITHMS
An effective procedure is a finite set of instructions that can be carried out by a computing device, or a human being, in an effective, deterministic, step-by-step manner. Usually, the procedure has some input and produces some output. So, an effective procedure is usually a set of instructions for computing a function from inputs to outputs. However, an effective procedure need not
necessarily compute a function. To qualify as an effective procedure, the instructions must be completely unambiguous and there must be a method for unambiguously determining the order in which the instructions are carried out. The instructions must also be carried out in a discrete, deterministic way. There must not be any resorting to continuous methods or analog devices, and there must not be any resorting to random methods or devices. Usually, an effective procedure has access to facilities for storing and retrieving intermediate calculations.

Programs for digital computers are examples of effective procedures. Since an effective procedure need not be written in a programming language for any particular computer, the notion of an effective procedure is slightly more general than that of a computer program. However, the added generality is slight and we will not go too far astray if we think of an effective procedure as a computer program. When thinking of an effective procedure as a computer program, we must not think of our programs as being limited by any time or machine size constraints. We will not require that the procedure terminate in a "reasonable" amount of time. Also, we will assume that the storage facilities available have no size limitation. At any point in time the procedure will use only a finite amount of such storage. However, we will assume that the storage facilities always have room to hold more information. In particular, if we think in terms of computer programs, we will assume the machine is so large that we never run out of storage and never get an integer overflow or other type of termination due to a size limit of the machine. This may seem unreasonable to some computer programmers. However, it is a reasonable first step in developing a theory of computer programs. More advanced texts refine these notions and develop theories for programs that run realistically efficiently.

An effective procedure need not end. For example, a computer program may contain an infinite loop. If an effective procedure has the property that, for all inputs, the computing process eventually terminates, then the procedure is called an algorithm.

One final word of warning is in order. Later on we will generalize the notion of an effective procedure and consider things called nondeterministic procedures. These nondeterministic procedures are mathematically useful and interesting. However, they do not correspond exactly to our intuitive notion of an effective procedure, and some of the above remarks do not apply to such nondeterministic procedures.

SETS, RELATIONS, AND FUNCTIONS

We will use standard set theory notation and assume familiarity with the elementary notions of set theory. In this section we will briefly describe most of the set theory notation employed in this text. This section need only be skimmed rapidly and then used as a reference if necessary.

Finite sets will usually be denoted by listing their elements (members) enclosed in brackets. For example, {1, 3, 5} denotes the set which contains
exactly the three elements 1, 3, and 5. This notation is extended in the usual way to obtain a notation which describes the set in question by some property that holds for exactly those elements in the set. For example, {x | x is an integer and 1 ≤ x ≤ 3} = {1, 2, 3} and {x | x is an even positive integer} = {2, 4, 6, 8, ...}. When part of the description of a set is clear from context, we will omit it. For example, if it is clear from context that we are discussing integers, then {x | 1 ≤ x ≤ 3} = {1, 2, 3}.

The symbols ∈, ∉, ∪, ∩, −, ⊆, and ⊂ denote membership, nonmembership, set union, set intersection, set difference, set inclusion and set proper inclusion respectively. So, for example, 1 ∈ {1, 2}, 3 ∉ {1, 2}, {1, 2} ∪ {2, 3} = {1, 2, 3}, {1, 2} ∩ {2, 3} = {2}, {1, 2} − {2, 3} = {1}, {1, 2} ⊆ {1, 2, 3}, {1, 2} ⊆ {1, 2}, {1, 2} ⊂ {1, 2, 3} but it is not true that {1, 2} ⊂ {1, 2}. The symbol ∅ will denote the empty set. Thus, for example, {1, 2} ∩ {3, 4} = ∅.
Ordered tuples will be denoted in the usual way. For example, (1, 2, 2) is the ordered three-tuple with the three elements 1, 2 and 2 in that order. Note that the order is important and so an element may occur more than once in an ordered tuple. The sign × will denote Cartesian product. If A and B are sets, then by definition A × B = {(a, b) | a ∈ A and b ∈ B}. For example, {1, 2} × {3, 4} = {(1, 3), (1, 4), (2, 3), (2, 4)}.

A binary relation R on a set A is defined as a set of ordered pairs (two-tuples) of elements of A. If (a, b) ∈ R, then we will write "aRb" instead of "(a, b) ∈ R." Intuitively, a relation is a condition on pairs of elements and we will usually define relations by describing a condition on pairs. For example, we may define aRb to mean a and b are integers and b is the square of a. So, with this definition of R, 2R4 and 4R16 hold but it is not true that 1R2 or 4R2 hold. A more formal definition of this particular relation is R = {(a, b) | a and b are integers and a² = b}. However, we will seldom use this formal set-theory type of definition.

A function f from a set X to a set Y is a subset of X × Y such that: for each x ∈ X there is at most one y ∈ Y such that (x, y) ∈ f. Rather than writing "(x, y) ∈ f," we will follow standard notation and write "f(x) = y." To make the above definition look more familiar, read "f(x) = y" in place of "(x, y) ∈ f." If f is a function from X to Y, we may denote this by writing f : X → Y. The set X is called the domain of f and the set Y is called the range of f. Note that we have not required that the following condition hold:

1. for each x ∈ X, there is a y ∈ Y such that f(x) = y.

If, for some particular x ∈ X, no such y exists, then we say that f is undefined at x. If f does satisfy condition (1), then f is said to be a total function. We will use the terms function and partial function as exact synonyms. The adjective "partial" is to remind ourselves that condition (1) might possibly not hold. To prevent confusion, we will avoid using the unmodified term "function."
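In programming terms, the distinction between total and partial functions is a familiar one. The following sketch (Python; the code and the treatment of "undefined" as a returned None are ours, not the text's) models one total and one partial function in the sense of the definitions above.

```python
from fractions import Fraction

def f(x: int) -> int:
    """A total function on the integers: f(x) = x*x is defined for every x."""
    return x * x

def g(x: int):
    """A partial function on the natural numbers: g(x) = 1/x is
    undefined at x = 0, so condition (1) fails and g is not total."""
    if x == 0:
        return None          # g(0) is undefined
    return Fraction(1, x)
```

Condition (1) holds for f but fails for g at x = 0, which is exactly what makes g partial rather than total.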
The following condition may or may not be satisfied by a partial function:

2. for each y ∈ Y, there is an x ∈ X such that f(x) = y.

If f is a partial function from X to Y and f satisfies (2), then we sometimes say that f is a function from X onto Y. If (2) is possibly not satisfied, we say that f is a function from X into Y. Actually, any function from X to Y can be said to be a function from X into Y, even if (2) is satisfied. However, the term "into" is usually used when (2) does not hold or at least is not definitely known to hold. The words function and mapping are exact synonyms. The only reason for using both words is that it introduces some variety and that they are both standard terms.

We conclude this section with some simple examples of functions in order to clarify our terminology. Let Q denote the set of rational numbers, let I denote the set of integers, and let N denote the set of natural numbers (nonnegative integers). Define f : I → N by f(x) = x². Then f is a total function. Define g : N → Q by g(x) = 1/x, provided x is not zero. Then g is a partial function. The function g is not total, since g(0) is undefined. It is correct to call f a partial function. The term partial function means a function which is not necessarily total. However, if we know f is a total function, we will usually call it that. We will sometimes encounter situations where we do not know whether or not a function is total. In these cases, the function will be referred to as a partial function.

ALPHABETS AND LANGUAGES

The notions of symbol and string of symbols are central to what we will be doing. The input and output to procedures and algorithms will usually consist of strings of symbols, and we will represent many aspects of the computing process itself as strings of symbols. We now review our notation for handling strings and symbols.

The notion of a symbol is a primitive notion which we will not define. It corresponds to the common use of this notion. So, for example, it is true that the English alphabet can be thought of as having 26 symbols.
If we consider both upper- and lower-case letters, then it contains 52 symbols. The word "pop" contains two symbols, "p" and "o." It contains three occurrences of symbols, two occurrences of "p" and one occurrence of "o." It is common practice to use the word "symbol" rather loosely to mean either symbol or occurrence of a symbol. Whenever we do so, the intended meaning should be clear from the context.

Word is another primitive notion. Intuitively, a word is a finite string of symbols or, more correctly, a string of occurrences of symbols. In fact, we will use the words string and word interchangeably. A symbol is a special kind of word. It is a word with only one occurrence of a symbol. There is also a word with zero occurrences of symbols. This is the empty word and will be denoted by the upper-case Greek letter Λ. For example, if asked to find the word between the
symbols "x" and "y" in the words "xabley" and "xy," the correct answers are "able" and Λ. (At the back of this book there is an appendix which gives the names of the Greek letters.)

The last primitive notion we will need is that of concatenation. The concatenation of two strings α and β is the string obtained by writing β after α. The concatenation of α and β is denoted αβ. For example, if α is the string "aba" and β is the string "bab," then αβ is the string "ababab."

SUMMARY OF NOTATION
The following is a tabulation of the notation we will use for handling symbols and strings. Give it a quick reading now and then use it as a reference. The terms are assimilated best through usage and we will be using them frequently.

1.1 Definition

1.1.1 An alphabet is a finite set of symbols.

1.1.2 A language is a set of words.

1.1.3 If L₁ and L₂ are languages, then L₁L₂ denotes the product of the two languages. The product L₁L₂ is the set of all strings of the form αβ where α is a string in L₁ and β is a string in L₂. For example, if L₁ = {a, bb} and L₂ = {ab, c} then L₁L₂ = {aab, ac, bbab, bbc}.

1.1.4 Λ denotes the empty string.

1.1.5 If α is a string and n is a non-negative integer, then αⁿ denotes n occurrences of α concatenated together. By convention, α⁰ = Λ. For example, if α = "abc," then α³ = "abcabcabc." From now on, we will usually not bother with quotes. We will write α = abc instead of α = "abc." This can produce some ambiguity and we will have to rely on context to clear up this ambiguity. For example, if α = 2, then α² = 22 when 2 denotes the symbol "2," and α² = 4 when 2 denotes the number two. In what follows we will always be careful to ensure that the intended meaning is clear from context.

1.1.6 If L is a language, then Lⁿ denotes the product of L with itself n times. Lⁿ can be defined recursively as follows:

    L⁰ = {Λ}
    Lⁿ⁺¹ = LⁿL
1.1.7 If L is a language, then L* is called the star closure of L. It consists of all finite concatenations of strings from L. That is,

    L* = ∪_{n=0}^{∞} Lⁿ
For example, if L contains just the two symbols 0 and 1, then L* consists of all binary numerals with any number of leading zeros.

1.1.8 If L is a language, then L⁺ denotes the language defined as follows:

    L⁺ = ∪_{n=1}^{∞} Lⁿ

So, in particular, if Σ is an alphabet, then Σ⁺ = Σ* − {Λ}.
1.1.9 If α denotes a string, then length(α) denotes the length of α. More precisely, length(α) is the number of occurrences of symbols in α. For example, length(abc) = 3, length(aabb) = 4 and length(Λ) = 0.

In order to make the text more readable, we will follow a number of conventions on symbol usage. Upper-case Greek letters Σ, Γ, ..., will usually range over alphabets. Lower-case Roman letters near the beginning of the alphabet, a, b, c, ..., will usually range over individual symbols. Lower-case Greek letters, α, β, γ, ..., will usually range over strings of symbols. The one notable exception to this rule is δ, which will always denote a function. Upper-case Roman letters, A, B, C, ..., will be used to denote sets and to denote some special kinds of symbols. Since languages are sets, they are denoted by upper-case Roman letters, usually by L. Upper-case Roman letters will also be used to denote machines. In fact, we will usually use M for machines. The other conventions are standard in most fields of mathematics. The letters i, j, and k are usually indices or integers; l, m, and n inevitably stand for integers; f, g and h inevitably denote functions; and of course, x, y and z are variables that might range over anything.

Frequently, we will want an entire word to be treated as a single symbol. We indicate this by enclosing the word in angular brackets. Thus "even" is a word with four occurrences of symbols while "<even>" is just one single symbol. The reason for doing this is to make the mathematics more readable. We could use e or X or a in place of <even>. However, if we want a symbol to use in recording the fact that some count is even, then <even> has obvious advantages over x. It even has advantages over e, which may already stand for something. Sometimes we will use boldface type instead of angular brackets. So even denotes a single symbol just as <even> does.
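The operations tabulated above are concrete enough to compute for finite languages. The following sketch (Python; the function names are ours) implements the product, the power Lⁿ, and a length-bounded slice of the star closure, with ordinary strings standing for words and "" standing for the empty word Λ. It reproduces the example in 1.1.3.

```python
def product(L1, L2):
    """L1L2: all strings ab with a in L1 and b in L2 (Definition 1.1.3)."""
    return {a + b for a in L1 for b in L2}

def power(L, n):
    """L^n, defined recursively as in 1.1.6: L^0 = {Λ}, L^(n+1) = L^n L."""
    result = {""}
    for _ in range(n):
        result = product(result, L)
    return result

def star(L, max_n):
    """A finite slice of the star closure 1.1.7: L^0 ∪ L^1 ∪ ... ∪ L^max_n."""
    return set().union(*(power(L, n) for n in range(max_n + 1)))

L1 = {"a", "bb"}
L2 = {"ab", "c"}
print(sorted(product(L1, L2)))    # ['aab', 'ac', 'bbab', 'bbc'], as in 1.1.3
```

Note that star can only ever enumerate a finite portion of L*, since L* itself is infinite whenever L contains a nonempty word.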
EXERCISES

1. Let J, K, and L be languages. Prove the following:
   a. (JK)L = J(KL)
   b. J(K ∪ L) = JK ∪ JL
   c. If J ⊆ K, then J* ⊆ K*
   d. (L*)* = L*

2. Give examples of languages to show that the following statements are false.
   a. J(K ∩ L) = JK ∩ JL
   b. L⁺ = L* − {Λ}

3. Give an effective procedure that takes an arbitrary natural number as input and gives the square root of the number as output, provided the square root is itself a natural number.
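Exercise 3 also illustrates the distinction drawn earlier between effective procedures and algorithms. The sketch below (Python; one possible approach, not the book's intended answer) gives two versions of a natural-number square-root procedure: the first may run forever on some inputs, the second always halts.

```python
def sqrt_procedure(n: int) -> int:
    """Effective procedure: search k = 0, 1, 2, ... for k*k == n.
    If n is not a perfect square this loop never ends, so the procedure
    computes a partial function and is NOT an algorithm."""
    k = 0
    while True:
        if k * k == n:
            return k
        k += 1

def sqrt_algorithm(n: int):
    """Algorithm: the extra test k*k <= n guarantees termination on
    every input; the output is None when no natural square root exists."""
    k = 0
    while k * k <= n:
        if k * k == n:
            return k
        k += 1
    return None
```

Both are effective procedures in the sense of this chapter; only the second has the termination property that makes it an algorithm.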
Chapter Two
Context-Free Grammars
Many common programming languages have a syntax that can be described, either completely or partially, by a set of rewrite rules called context-free productions. The set of such productions that defines the syntax of a particular language is called a context-free grammar. Context-free grammars are also frequently referred to as BNF grammars, but we will always use the term context-free grammar.

An informal example will help to introduce the notion of a context-free grammar. Suppose we want to describe the syntax of all well-formed arithmetic expressions involving the variables x and y and the operation +. We can do this by a set of productions or rewrite rules which describe how a specified start symbol can be transformed to such an arithmetic expression. Specifically, we can let S be the start symbol and use the following productions: S → (S + S), S → V, V → x and V → y. The production S → (S + S), for example, means S can be rewritten as (S + S). So one possible sequence of rewritings is: S is rewritten as (S + S) using the production S → (S + S), then (S + S) is rewritten as (V + S) using the production S → V, then (V + S) is rewritten as (V + V) using the production S → V again and finally, in two steps, (V + V) is rewritten as (x + y) using the productions V → x and V → y. We can abbreviate this rewriting process by S ⇒ (S + S) ⇒ (V + S) ⇒ (V + V) ⇒ (x + V) ⇒ (x + y). Another possible sequence of rewritings is: S ⇒ (S + S) ⇒ ((S + S) + S) ⇒ ((V + S) + S) ⇒ ((V + V) + S) ⇒ ((V + V) + V) ⇒ ((x + V) + V) ⇒ ((x + y) + V) ⇒ ((x + y) + x). The symbols x, y, +, ) and ( are called terminal symbols, since they cannot be rewritten. The symbols S and V are called nonterminals or sentential variables. It should be clear that the set of all strings of terminal symbols that can be obtained in this way is exactly equal to the set of all well-formed, parenthesized arithmetic expressions involving the terminal symbols. We now make these notions more formal.
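The rewriting process just described is mechanical enough to act out in a few lines of code. This sketch (Python; the representation is ours, not the book's, and spaces are omitted from the sentential forms) replays the first sample sequence of rewritings step by step.

```python
# Each rewriting step replaces one occurrence of a nonterminal (S or V)
# by the right-hand side of a production.
# Productions: S -> (S+S), S -> V, V -> x, V -> y.

def rewrite(form, at, lhs, rhs):
    """Apply production lhs -> rhs to the nonterminal at position `at`."""
    assert form[at] == lhs, "production does not apply at this position"
    return form[:at] + rhs + form[at + 1:]

form = "S"
form = rewrite(form, 0, "S", "(S+S)")   # S => (S+S)
form = rewrite(form, 1, "S", "V")       # (S+S) => (V+S)
form = rewrite(form, 3, "S", "V")       # (V+S) => (V+V)
form = rewrite(form, 1, "V", "x")       # (V+V) => (x+V)
form = rewrite(form, 3, "V", "y")       # (x+V) => (x+y)
print(form)                             # (x+y)
```

Each call corresponds to one application of the ⇒ step defined formally in the next section.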
DEFINITIONS AND EXAMPLES

2.1 Definition

2.1.1 A context-free grammar (cfg) is a quadruple G = (N, T, P, S) where

1. N is a finite set of symbols called nonterminals.
2. T is a finite set of symbols called terminals and is such that N ∩ T is empty.
3. S is an element of N called the start symbol.
4. P is a finite set of strings of the form A → α where A is in N and α is a string in (N ∪ T)*. These strings in P are called productions. Productions are sometimes also called rewrite rules.

We must be able to distinguish between the four elements of G. We choose to do this by the usual mathematical trick: we make the four elements into an ordered four-tuple. Then, by convention, the first entry is the set of nonterminals, the second is the set of terminals, and so forth. This trick requires an arbitrary convention that is hard to remember and is, after all, an arbitrary convention. It does not really matter that N is the first entry. What matters is that N is the entry that is the set of nonterminals. So as not to have to remember this arbitrary convention, we will use more suggestive notation than first entry, second entry, and so forth. The nonterminals of G will be denoted by Nonterminals(G). The next three entries will be denoted by Terminals(G), Productions(G), and Start(G) respectively. So Nonterminals(G) = N, Terminals(G) = T, Productions(G) = P, and Start(G) = S. In mathematical terminology, Nonterminals() is a projection that maps each context-free grammar onto its first coordinate, Terminals() maps it onto its second coordinate, and so forth. The important thing to remember is that if G is a context-free grammar, then Nonterminals(G) is the set of nonterminals of G and the other three projections yield the things they sound like. It will sometimes be useful to have a notation for all the symbols of G, both terminals and nonterminals. So we expand the above notation by defining Symbols(G) to be Nonterminals(G) ∪ Terminals(G).
2.1.2 The relation ⇒ on (Symbols(G))* is defined so that β ⇒ γ holds provided β and γ can be expressed as β = νAμ and γ = ναμ where A → α is in Productions(G) and both ν and μ are strings of symbols. The notation ⇒* denotes any finite (possibly zero) number of applications of ⇒. More precisely, β ⇒* γ if β = γ or if there is an n and there are e₁, e₂, ..., eₙ such that: e₁ = β, eₙ = γ and eᵢ ⇒ eᵢ₊₁ for all i < n. If we have more than one grammar under consideration, we will sometimes write ⇒_G and ⇒*_G to mean ⇒ and ⇒* in the grammar G.

2.1.3 Let w be in (Terminals(G))*. A derivation of w in G is a sequence e₀, e₁, ..., eₙ of strings in (Symbols(G))* such that e₀ = S, eₙ = w and eᵢ ⇒ eᵢ₊₁ for i = 0, 1, ..., n−1.
2.1.4 The language generated by G is defined and denoted by L(G) = {w | S ⇒* w and w ∈ (Terminals(G))*}. A language H is called a context-free language (cfl) provided H = L(G) for some cfg G.

2.1.5 Two cfg's G₁ and G₂ are said to be equivalent if L(G₁) = L(G₂). Intuitively, two cfg's are equivalent if they describe the same set of syntactically "correct sentences."

2.2 Examples of cfg's

2.2.1 Let G = (N, T, P, S) where N = {S}, T = {a, b}, and P = {S → Λ, S → aSb}. Then L(G) = {aⁿbⁿ | n ≥ 0}. A sample derivation in G is S ⇒ aSb ⇒ aaSbb ⇒ aabb.

When two or more productions have the same nonterminal on the left-hand side, we may abbreviate the list of productions. For example, S → Λ and S → aSb would be abbreviated to S → Λ | aSb.
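Definitions 2.1.2 through 2.1.4 are executable as stated. The following sketch (Python; our own representation, with upper-case letters standing for nonterminals and "" for Λ) enumerates the short words of L(G) for the grammar of Example 2.2.1 by breadth-first search over sentential forms, always rewriting the leftmost nonterminal.

```python
from collections import deque

def generate(productions, start, max_len):
    """All terminal strings of length <= max_len derivable from start.
    Upper-case letters are nonterminals; everything else is terminal."""
    language, queue, seen = set(), deque([start]), {start}
    while queue:
        form = queue.popleft()
        i = next((i for i, c in enumerate(form) if c.isupper()), None)
        if i is None:                    # no nonterminal left: a word of L(G)
            language.add(form)
            continue
        for lhs, rhs in productions:
            if form[i] == lhs:
                new = form[:i] + rhs + form[i + 1:]
                terminal_len = sum(1 for c in new if not c.isupper())
                # prune sentential forms whose terminals are already too long
                if terminal_len <= max_len and new not in seen:
                    seen.add(new)
                    queue.append(new)
    return language

P = [("S", ""), ("S", "aSb")]            # S -> Λ | aSb
print(sorted(generate(P, "S", 6), key=len))   # ['', 'ab', 'aabb', 'aaabbb']
```

The output is exactly the words aⁿbⁿ with n ≤ 3, as Example 2.2.1 predicts.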
2.2.2 Let G = (N, T, P, S) where

N = {S, <noun>, <verbphrase>, <verb>}
T = {man, dog, bites}

and P contains the following productions:

S → <noun> <verbphrase>
<verbphrase> → <verb> <noun>
<verb> → bites
<noun> → man | dog

Then L(G) contains exactly four strings: dog bites man, dog bites dog, man bites man, man bites dog. A sample derivation follows:

S ⇒ <noun> <verbphrase> ⇒ man <verbphrase> ⇒ man <verb> <noun> ⇒ man bites <noun> ⇒ man bites dog

Note that we are treating the English words "man," "bites," and "dog" as single symbols. This is quite common in linguistic analysis. The reason is that, for our purposes, the words are basic units that will not be broken down. We will not bother to break "man" down into m-a-n. We could consider "man" as a three-symbol string and redefine G. We get a well-defined cfg either way. However, if we treat "man" as a single symbol, the grammar is easier to handle.

Notice that, in this example, the derived terminal strings are sentences. For this reason, elements of context-free languages are sometimes referred to as "sentences." Although it is a little bit confusing, the formal notion of a derived terminal word corresponds to our intuitive notion of a sentence and the formal notion of a terminal symbol corresponds to our intuitive notion of a word.

2.2.3 In this example we define a cfg G for a subset of the well-formed
arithmetic expressions using + and × as operators, x, y, and z as the only variables, and binary numerals as the constants.

Nonterminals(G) = {S, <variable>, <numeral>, <op>}
Terminals(G) = {1, 0, x, y, z, (, ), +, ×}
Start(G) = S

The productions of G are given below:

S → (S <op> S) | <variable> | <numeral>
<variable> → x | y | z
<numeral> → 0 | 1 | 0<numeral> | 1<numeral>
<op> → + | ×

A sample derivation is the following:

S ⇒ (S <op> S) ⇒ (S + S) ⇒ (<variable> + S) ⇒ (x + S) ⇒ (x + (S <op> S)) ⇒ (x + (S × S)) ⇒ (x + (<variable> × S)) ⇒ (x + (y × S)) ⇒ (x + (y × <numeral>)) ⇒ (x + (y × 1<numeral>)) ⇒ (x + (y × 10))
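A grammar like the one in Example 2.2.3 is also easy to check mechanically. The sketch below (Python; our own construction, not the text's, using the character * in place of the symbol ×) is a small recursive-descent recognizer for this language. Because every compound expression is fully parenthesized, no backtracking is needed.

```python
def parse_S(s, i):
    """Return the position just past one S starting at s[i], or None."""
    if i >= len(s):
        return None
    if s[i] in "xyz":                        # S -> <variable>
        return i + 1
    if s[i] in "01":                         # S -> <numeral>: a run of bits
        while i < len(s) and s[i] in "01":
            i += 1
        return i
    if s[i] == "(":                          # S -> (S <op> S)
        j = parse_S(s, i + 1)
        if j is not None and j < len(s) and s[j] in "+*":
            k = parse_S(s, j + 1)
            if k is not None and k < len(s) and s[k] == ")":
                return k + 1
    return None

def accepts(s):
    """True exactly when the whole string is derivable from S."""
    return parse_S(s, 0) == len(s)

print(accepts("(x+(y*10))"))   # True: the string derived in the sample
print(accepts("x+y"))          # False: compound expressions need parentheses
```

Recognizers of this kind are taken up systematically in Chapter Nine, where pushdown automata are shown to accept exactly the context-free languages.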
PARSE TREES AND AMBIGUITY

A useful observation that can be a great aid to intuition is that derivations can be viewed as trees called parse trees. These parse trees can also be helpful when we study the semantics of languages. Since we will usually use trees in an informal sense, we will explain them by examples. We will use them frequently though, so the examples should be thoroughly understood.

2.3 Examples

In these examples of derivations and their corresponding trees, all upper-case letters are nonterminals and all lower-case letters are terminals. S is the start symbol. We only describe as much of the grammar as is needed for the examples.

2.3.1 S ⇒ AB ⇒ aAaB ⇒ aaaB ⇒ aaab by the productions S → AB, A → aAa, A → a, B → b. The parse tree is given in Figure 2.1(a).

2.3.2 S ⇒ AB ⇒ Ab ⇒ aAab ⇒ aaab by the productions S → AB, B → b, A → aAa, A → a. The parse tree for this derivation is also the tree in Figure 2.1(a). So two distinct derivations can yield the same parse tree.

2.3.3 S ⇒ AB ⇒ aAB ⇒ aB ⇒ ab by the productions S → AB, A → aA, A → Λ, B → b. The parse tree is given in Figure 2.1(b).

[Figure 2.1(a): the parse tree for 2.3.1 and 2.3.2 — root S with children A and B; A has children a, A, a; the inner A has child a; B has child b. Figure 2.1(b): the parse tree for 2.3.3 — root S with children A and B; A has children a and A; the inner A has child Λ; B has child b.]

The parse tree for a string generated by a context-free grammar often has much to do with the meaning of the sentence. For example, the parse trees for the sentences in Example 2.2.2 describe the grammatical structure of the strings as sentences in the English language. Each of the sentences in that example has only one parse tree. If some sentence had two parse trees and if the grammar contained clues to the meaning, then the meaning of the sentence could be ambiguous. For example, it is not hard to construct a context-free grammar that generates a subset of English, mirrors our intuitive notion of English grammar, and yields two parse trees for the following sentence: "They are flying planes." The two parse trees are diagrammed in Figure 2.2. The diagram in Figure 2.2(a) would explain the structure of the sentence if it were the answer to the question "What are those people doing?" The diagram in Figure 2.2(b) would explain the structure of the sentence if it were the answer to the question "What are those things in the sky?"

A classic example of ambiguity that can arise in designing grammars for programming languages has to do with the optional ELSE in the IF-THEN-ELSE construction.
Figure 2.2(a): parse tree in which "are flying" forms the <main v> and "planes" is a <noun>: <sentence> branches to <subject> (to <pronoun>, to "They"), <main v> ("are flying"), and <noun> ("planes").
Figure 2.2(b): parse tree in which "flying planes" forms a <noun phrase>: <sentence> branches to <subject> (to <pronoun>, to "They"), the verb "are", and <noun phrase> ("flying" followed by <noun>, "planes").
2.4 Example
Let G = (N, T, P, S) be the context-free grammar with N = {S, B}, T = {IF, THEN, ELSE, b1, b2, a1, a2}, and P consisting of the following productions:
S → IF B THEN S ELSE S | IF B THEN S | a1 | a2
B → b1 | b2
Intuitively, b1 and b2 are the only Boolean expressions; a1 and a2 are the only "atomic" statements. The statement
IF b1 THEN IF b2 THEN a1 ELSE a2
has two parse trees, as shown in Figure 2.3. The problem is that there is no way to tell which IF-THEN the ELSE should be associated with.
It is common practice to draw parse trees as we have done in Figures 2.1 and 2.2. To see why they are called trees, turn the page upside down. The branching structure will then resemble a tree. This will also make the following standard terminology seem more reasonable. The node corresponding to the occurrence of the start symbol that begins the derivation (the node labeled S
Figure 2.3(a): parse tree in which the ELSE is attached to the inner IF: the root S expands to IF B THEN S, and that inner S expands to IF B THEN S ELSE S.
Figure 2.3(b): parse tree in which the ELSE is attached to the outer IF: the root S expands to IF B THEN S ELSE S, and the S after the first THEN expands to IF B THEN S.
in Figure 2.1(a) and Figure 2.1(b) and the node labeled <sentence> in Figure 2.2(a) and Figure 2.2(b)) is called the root node. The nodes labeled by either a terminal symbol or by a Λ are called leaf nodes or leaves. (In Figure 2.2, the individual words of the sentence are each considered to be single terminal symbols.)

2.5 Definition
2.5.1 A context-free grammar G is called ambiguous if there is some string w in L(G) such that w has two different parse trees in G. G is said to be unambiguous if it is not ambiguous, that is, if every string in L(G) has just one parse tree. A language L is said to be unambiguous if L = L(G) for some unambiguous grammar G.
2.5.2 A context-free language L is said to be inherently ambiguous if every context-free grammar for L is ambiguous; that is, if whenever L = L(G), then G is ambiguous.
2.6 Example
The language L = {a^i b^j c^k | i = j or j = k} is an inherently ambiguous context-free language. The intuitive reason for this is that any grammar for L must have two kinds of parse trees: one kind to get i = j, another kind to get j = k. Thus any string with both i = j and j = k will have two parse trees. This is just an intuitive idea though. A formal proof that L is inherently ambiguous is quite difficult and will not be given until we have had a lot more material on context-free grammars.
LEFTMOST DERIVATIONS

As Examples 2.3.1 and 2.3.2 indicate, two distinct derivations can yield the same parse tree. It is sometimes useful to have a method of associating a unique derivation with each parse tree. One way of accomplishing this is to consider only leftmost derivations. The leftmost derivations are a special subclass of all derivations and, as we shall see, they have the nice property that every parse tree yields a unique leftmost derivation. So parse trees and leftmost derivations are in some sense equivalent notions.

2.7 Definition
Let G = (N, T, P, S) be a context-free grammar. A derivation S = g1 ⇒ g2 ⇒ ... ⇒ gn is said to be a leftmost derivation provided the following holds for all i < n: gi = xAβ and gi+1 = xαβ for some x in (Terminals(G))*, A in Nonterminals(G), β in (Symbols(G))*, and A → α in Productions(G). In less formal terms, a derivation is leftmost provided that, at each step of the derivation, the leftmost occurrence of a nonterminal is rewritten.
2.8 Example
Let G be the cfg of Example 2.2.2. The following is a leftmost derivation:
S ⇒ <noun> <verbphrase> ⇒ dog <verbphrase> ⇒ dog <verb> <noun> ⇒ dog bites <noun> ⇒ dog bites man
The following derivations are not leftmost:
S ⇒ <noun> <verbphrase> ⇒ <noun> <verb> <noun> ⇒ dog <verb> <noun> ⇒ dog bites <noun> ⇒ dog bites man
S ⇒ <noun> <verbphrase> ⇒ dog <verbphrase> ⇒ dog <verb> <noun> ⇒ dog <verb> man ⇒ dog bites man
It is not difficult to see that every parse tree yields a unique leftmost derivation. Hence there is a natural one-to-one correspondence between parse trees and leftmost derivations. Hence an equivalent way of stating Definition 2.5 is to say: A context-free grammar G is called ambiguous if there is some string w in L(G) such that w has two different leftmost derivations. Similarly, the rest of Definition 2.5 could be rewritten with the term "leftmost derivation" substituted for "parse tree" and we would obtain equivalent, alternate definitions of unambiguous context-free grammars and inherently ambiguous context-free languages.
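Since ambiguity can be restated in terms of leftmost derivations, one can hunt for an ambiguity witness mechanically by enumerating bounded leftmost derivations and watching for a terminal string produced twice. The sketch below is ours, not the text's: the grammar encoding (a dict from nonterminal to a list of right-hand-side tuples) and all function names are assumptions, and the depth bound makes this only a partial test, never a proof of unambiguity.

```python
# Sketch (not from the text): search for an ambiguity witness by
# enumerating bounded leftmost derivations.
from collections import defaultdict

def leftmost_expansions(form, productions, nonterminals):
    """Yield every sentential form obtained by rewriting the leftmost nonterminal."""
    for i, sym in enumerate(form):
        if sym in nonterminals:                  # found the leftmost nonterminal
            for rhs in productions[sym]:
                yield form[:i] + rhs + form[i + 1:]
            return                               # rewrite only that occurrence

def ambiguity_witness(start, productions, max_steps=8):
    """Return a string with two distinct leftmost derivations, else None."""
    nonterminals = set(productions)
    seen = defaultdict(int)                      # terminal string -> derivation count
    frontier = [(start,)]
    for _ in range(max_steps):
        next_frontier = []
        for form in frontier:
            if all(s not in nonterminals for s in form):
                seen[form] += 1                  # one completed leftmost derivation
                if seen[form] == 2:
                    return form                  # two leftmost derivations: ambiguous
            else:
                next_frontier.extend(
                    leftmost_expansions(form, productions, nonterminals))
        frontier = next_frontier
    return None

# An ambiguous grammar: S -> SS | a gives the string aaa two leftmost derivations.
P = {"S": [("S", "S"), ("a",)]}
print(ambiguity_witness("S", P))
```

Note that the two derivations are counted because the same finished string is reached along two different expansion paths, exactly the leftmost-derivation restatement of Definition 2.5.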
CHOMSKY NORMAL FORM

It is frequently useful to know that a context-free grammar is in a particularly simple form. There are a number of results that show that every context-free grammar is equivalent to a context-free grammar of a particular type. (Recall that two cfg's, G1 and G2, are said to be equivalent if L(G1) = L(G2).) In this section we consider one example of a special type of context-free grammar, called Chomsky normal form grammars, and show that every context-free grammar is equivalent to a grammar in Chomsky normal form. Intuitively, a grammar is in Chomsky normal form if all parse trees have binary branching at all points except the ends of the tree.
2.9 Definition
2.9.1 A cfg is said to be nonerasing provided (1) the start symbol does not occur on the right-hand side of any production and (2) there are no productions of the form A → Λ, except possibly the production S → Λ where S is the start symbol.
Clause (1) of this definition deserves some explanation. Intuitively, a nonerasing cfg should be one that has no productions of the form A → Λ. This presents a problem if we wish to have the empty string in the language generated. It will turn out that we will have need for a notion of nonerasing cfg which still allows us to derive the empty string. So we made a special case of the empty string and allowed the possibility of including the production S → Λ, where S is the start symbol. In order to keep to the spirit of our intuitive notion of nonerasing cfg, we want to insist that if the production S → Λ is included in the grammar, then it is only used to derive the empty string. Clause (1) of the definition ensures that the production S → Λ is not used for any purpose other than a one-step derivation of the empty string.
2.9.2 A cfg is said to be in Chomsky normal form if it is nonerasing and has only three types of productions:
1. Productions of the form A → BC, where A, B, and C are nonterminals
2. Productions of the form A → a, where A is a nonterminal and a is a terminal
3. If the empty string is in the language, then the grammar includes the production S → Λ, where S is the start symbol.
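Definition 2.9.2 translates directly into a mechanical check. In the sketch below (ours, not the text's) a grammar is a dict from nonterminal to a list of right-hand-side tuples, with the empty tuple standing for Λ.

```python
# A mechanical check of the three production types of Definition 2.9.2,
# plus the nonerasing requirement that the start symbol never occur on a
# right-hand side.  The grammar encoding is our own convention.
def is_chomsky_normal_form(start, nonterminals, terminals, productions):
    for left, rhss in productions.items():
        for rhs in rhss:
            if len(rhs) == 2 and all(s in nonterminals for s in rhs):
                continue                      # type 1: A -> BC
            if len(rhs) == 1 and rhs[0] in terminals:
                continue                      # type 2: A -> a
            if rhs == () and left == start:
                continue                      # type 3: S -> empty string
            return False
    return all(start not in rhs
               for rhss in productions.values() for rhs in rhss)

# G2 of Example 2.10.2 below is in Chomsky normal form:
G2 = {"S": [(), ("A", "B"), ("X", "B")], "X": [("A", "Y")],
      "Y": [("A", "B"), ("X", "B")], "A": [("a",)], "B": [("b",)]}
print(is_chomsky_normal_form("S", set(G2), {"a", "b"}, G2))
```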
2.10 Examples
2.10.1 Let G1 be the cfg defined by setting Start(G1) = S, Nonterminals(G1) = {S, A}, Terminals(G1) = {a, b}, and letting the productions of G1 be
A → aAb | ab
S → Λ | ab | aAb
Then L(G1) = {a^n b^n | n ≥ 0} and G1 is nonerasing.
2.10.2 Let G2 be the cfg defined by setting Start(G2) = S, Nonterminals(G2) = {S, A, B, X, Y}, Terminals(G2) = {a, b}, and letting the productions of G2 be
S → Λ | AB | XB
X → AY
Y → AB | XB
A → a
B → b
Then L(G2) = {a^n b^n | n ≥ 0} and G2 is in Chomsky normal form. The grammars G1 and G2 are equivalent and are equivalent to the grammar G of Example 2.2.1.
In order to prove that every cfg is equivalent to a cfg in Chomsky normal form, we need to make some preliminary observations about cfg's. One fact that we will frequently use is that if A ⇒* w in some cfg and A → C1C2...Cn is the first production used, then we can decompose w and write w = w1w2...wn where Ci ⇒* wi. This is, in fact, the essence of the notion of a cfg. What Ci does is independent of what its neighbors do; that is, it is independent of context. We now proceed to develop a series of theorems that lead to showing that every cfg is equivalent to a cfg in Chomsky normal form.
2.11 Theorem
There is an algorithm which can determine, for any cfg G and any nonterminal A of G, whether or not A ⇒* Λ in G.
PROOF First notice that if A ⇒* Λ in G, then there is always a derivation A ⇒* Λ such that, in the corresponding parse tree, no path from root to tip has more than k + 1 nodes, where k is the number of nonterminals in G. To see this, notice that if A ⇒* Λ and the parse tree has a path with more than k + 1 nodes from root to tip, then some nonterminal B must be repeated on this path. So if we "prune" out the tree below the higher B and "graft" on the tree below the
Figure 2.4: pruning and grafting at a repeated nonterminal: the subtree below the higher occurrence is replaced by the subtree below the lower occurrence, yielding a smaller parse tree for the same derivation A ⇒* Λ.
lower B, we still get a valid parse tree for A ⇒* Λ. (An example is given in Figure 2.4.) Now the parse tree is smaller. We can continue to make it smaller until no path from root to tip contains more than k + 1 nodes.
What this means is that in order to check if A ⇒* Λ in G, we need only check those derivation trees in which the longest path from root to tip has at most k + 1 nodes. The algorithm to test if A ⇒* Λ would first produce all such trees (there are only finitely many of them) and then check if any one of these corresponds to A ⇒* Λ. This is not a formal proof, but can be converted to a formal proof. However, since the formal proof would be much longer and much less intuitive, we will stop here. □

2.12 Theorem
There is an algorithm which, given an arbitrary cfg, will produce an equivalent nonerasing cfg.
PROOF Let G1 = (N1, T1, P1, S1) be an arbitrary cfg. We will first define a cfg G2 such that in G2 the start symbol never occurs on the right-hand side of a production and such that L(G2) = L(G1). We then define a nonerasing cfg G3 such that L(G3) = L(G2) = L(G1).
G2 = (N2, T2, P2, S2) where S2 is a new symbol, N2 = N1 ∪ {S2}, T2 = T1, and P2 contains the production S2 → S1 plus all productions in P1. Clearly, L(G2) = L(G1).
G3 = (N3, T3, P3, S3) where N3 = N2, T3 = T2, S3 = S2, and P3 contains the productions described in (1) and (2) below. By Theorem 2.11 we know that for any A in N2 we can tell whether or not A ⇒* Λ in G2.
1. If S2 ⇒* Λ in G2, then S3 → Λ is in P3.
2. If A → α0B1α1B2α2...Bnαn is in P2, where each Bi ⇒* Λ in G2 and where each αi does not contain any B such that B ⇒* Λ in G2, then all productions of the following form are in P3: A → α0X1α1X2α2...Xnαn
where each Xi is either Λ or Bi and α0X1α1X2α2...Xnαn ≠ Λ.
Clause (2) of this definition needs to be read carefully to see exactly what the relationship between P2 and P3 is. Except for possibly the production S3 → Λ mentioned in clause (1), P3 contains no productions of the form A → Λ. This is because clause (2) explicitly excludes all such productions from P3. So a production A → Λ in P2 gives rise to no productions in P3. If A → g is a production in P2, g is not the empty string, and g contains no nonterminals B such that B ⇒* Λ, then A → g is in P3 and this is the only production in P3 which we get from A → g using clause (2). In this case, α0 = g and n = 0. That is, there are no Bi's. If A → g is in P2 and g does contain some nonterminal B such that B ⇒* Λ, then A → g gives rise to a number of different productions in P3. In this last case, A → g is in P3 and there are also a number of productions A → η in P3, where η is obtained from g by deleting some nonterminals B such that B ⇒* Λ. In fact, by clause (2), A → g in P2 causes P3 to contain all such productions A → η, with the single exception that η = Λ is not allowed.
Clearly, G3 is nonerasing. By inspecting the algorithm to obtain G3 from G2, we see that if A ⇒* w in G3, then A ⇒* w in G2. So L(G3) ⊆ L(G2). In order to show that G3 and G2 are equivalent, it remains to show that L(G2) ⊆ L(G3). To establish this, we will show by induction on the number of steps in the derivation that if A is a nonterminal, w is a nonempty string of terminals, and A ⇒* w in G2, then A ⇒* w in G3. (By the definition of G3, we know that the empty string is in L(G2) if and only if it is in L(G3).) For one-step derivations, the result is immediate. Let k > 1 and assume the result is true for all A, all w ≠ Λ, and all derivations of fewer than k steps. Suppose A ⇒* w in G2 in k steps for some A, and suppose w ≠ Λ. Let A → C1C2...Cl be the first production used.
We can set w = w1w2...wl where Ci ⇒* wi in G2 in fewer than k steps. (Some of the Ci may be terminals. If Ci is a terminal, then wi = Ci.) Now P3 contains the production A → Y1Y2...Yl where Yi = Ci if wi ≠ Λ and where Yi = Λ if wi = Λ. Also, by the induction hypothesis, if wi ≠ Λ, then Ci ⇒* wi in G3. So A ⇒ Y1Y2...Yl ⇒* w1w2...wl = w in G3. Thus A ⇒* w in G3 as desired. □

2.13 Theorem
There is an algorithm which, given an arbitrary cfg, will produce an equivalent grammar that is nonerasing and has no productions of the form A → B, where A and B are nonterminals.
PROOF Let G = (N, T, P, S) be an arbitrary cfg. We know by Theorem 2.12 that we can find an equivalent nonerasing cfg. Thus G can always be replaced by a nonerasing cfg. We assume that G is nonerasing. We first show that there is an algorithm to determine, for any two nonterminals A and B, whether or not A ⇒* B. We then use this algorithm as a subroutine in producing an equivalent cfg G' of the desired form.
The algorithm to tell whether or not A ⇒* B is as follows. We keep a list of nonterminals. Initially, only A is on the list. We then add to the list all
nonterminals C such that A → C is in P. We then add to the list all nonterminals D (not already on the list) such that C → D is in P and C is already on the list. We repeat this last step until there are no more nonterminals to add to the list. A ⇒* B if and only if B is on the final list.
To see that this algorithm works, first note that it always terminates. This is because there are only finitely many symbols that can be added to the list. Secondly, note that since G is nonerasing, A ⇒* B if and only if there are C1, C2, ..., Cn such that A = C1, B = Cn, and Ci → Ci+1 for all i < n. So A ⇒* B if and only if there are such C1, C2, ..., Cn on the final list produced by the algorithm. But B is on the list if and only if such C1, C2, ..., Cn are on the list. So A ⇒* B if and only if B is on the list. Thus the algorithm works.
To get a cfg G' that is equivalent to G and has no productions of the form A → B, proceed as follows: G' = (N', T', P', S') where N' = N, T' = T, S' = S, and P' is the set of all productions of the form A → α where, for some B,
1. B → α is in P
2. A ⇒* B
3. α is not a single nonterminal
(Recall that A ⇒* A. Therefore, A is one of the B's.) Another way of describing P' is as follows: Start with P; throw out all productions of the form A → B; for each production B → α, throw in A → α for each A such that A ⇒* B. The resulting set is P'.
G' is of the correct form since if A → α is a production, then α is not a single nonterminal. If A → α is in G', then A ⇒* α in G. So L(G') ⊆ L(G). It remains to show that L(G) ⊆ L(G'). To see this, note that a derivation in G can be converted to a valid derivation of the same terminal string in G' by the following method: Pick a rule A → B that is applied some place in the derivation and is such that B is not rewritten by a rule of the form B → C. Omit the step that uses A → B. At the subsequent place that B → α is used to rewrite this B (α is not a single nonterminal), use A → α instead. Do this again and again until there are no steps left that use an A → B type rule. The result is a valid G' derivation and uses no rules of the form A → B. □

2.14 Theorem
There is an algorithm which, given an arbitrary cfg, will produce an equivalent cfg in Chomsky normal form.
PROOF Given a cfg G0, we will obtain an equivalent cfg G' in Chomsky normal form. First find a grammar G1 such that G1 is nonerasing, has no rules of the form A → B where A and B are nonterminals, and is equivalent to G0. This is possible by Theorem 2.13. Next, obtain a grammar G2 such that G2 is nonerasing, is equivalent to G1, and has only three types of productions:
1. Possibly S → Λ where S is the start symbol
2. A → a, where a is a terminal
3. A → B1B2...Bn where the Bi are nonterminals and n ≥ 2
To get G2 from G1, proceed as follows. Let G1 = (N1, T1, P1, S1). Define G2 = (N2, T2, P2, S2), where T2 = T1, S2 = S1, and N2 contains all symbols in N1 plus a new nonterminal for each terminal in T1 = T2. If a is in T1 = T2, then <a> will denote the new nonterminal corresponding to a. P2 is the set of all productions of the following forms:
4. S2 → Λ provided S1 → Λ is in P1
5. <a> → a
6. A → a provided A → a is in P1
7. A → X1'X2'...Xn' provided A → X1X2...Xn is in P1, where Xi' = <Xi> if Xi is a terminal and Xi' = Xi if Xi is a nonterminal.
Clearly, G2 has the desired properties. Finally, G' is obtained from G2 as follows: G' = (N', T', P', S') where S' = S2, T' = T2, and N' contains all symbols of N2 plus one new symbol for each string X1X2...Xl described below (l ≥ 2); <X1X2...Xl> will denote the new symbol corresponding to X1X2...Xl. P' contains all productions of the following forms:
8. S' → Λ provided S2 → Λ is in P2
9. A → a, provided A → a is in P2
10. A → BC, provided A → BC is in P2
11. If A → X1X2...Xn is in P2 and n > 2, then all of the following are in P':
a. A → <X1X2...Xn-1> Xn
b. <X1X2...Xl+1> → <X1X2...Xl> Xl+1 provided l ≥ 2
c. <X1X2> → X1X2
G' is easily seen to be in Chomsky normal form. If we show that L(G') = L(G2), then L(G') = L(G2) = L(G1) = L(G0) and so G' and G0 are equivalent. It remains to show that L(G') = L(G2). The proof of this fact is left as an exercise. □
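The two conversions in the proof above can be sketched in code. The sketch below is ours, not the text's: it assumes the input grammar is already nonerasing with no A → B rules (Theorem 2.13), uses a dict-of-tuples encoding with the empty tuple standing for Λ, and names the new nonterminals with angle brackets as in the proof.

```python
# Sketch of Theorem 2.14: steps (4)-(7) lift terminals out of long
# right-hand sides; steps (8)-(11) break right sides of length > 2 into
# binary productions.  Encoding and naming are our own conventions.
def to_cnf(productions):
    prods = {a: list(rhss) for a, rhss in productions.items()}
    lifted = {}
    for a, rhss in prods.items():              # steps (4)-(7): terminal lifting
        new = []
        for rhs in rhss:
            if len(rhs) <= 1:
                new.append(rhs)                # A -> a and S -> empty pass through
                continue
            fixed = []
            for x in rhs:
                if x in prods:
                    fixed.append(x)            # nonterminal: unchanged
                else:
                    lifted["<%s>" % x] = [(x,)]
                    fixed.append("<%s>" % x)   # terminal x becomes <x>
            new.append(tuple(fixed))
        prods[a] = new
    prods.update(lifted)
    out = {}
    def add(left, rhs):
        if rhs not in out.setdefault(left, []):
            out[left].append(rhs)
    for a, rhss in prods.items():              # steps (8)-(11): binarization
        for rhs in rhss:
            left = a
            while len(rhs) > 2:                # A -> <X1...Xn-1> Xn, and so on
                head = "<%s>" % " ".join(rhs[:-1])
                add(left, (head, rhs[-1]))
                left, rhs = head, rhs[:-1]
            add(left, rhs)
    return out

# The grammar fed in here is the nonerasing, unit-free G' of Example 2.15:
G = {"S2": [(), ("a", "S", "b"), ("a", "b")], "S": [("a", "S", "b"), ("a", "b")]}
cnf = to_cnf(G)
```

As the proof warns below in Example 2.15, the output can contain more productions than a hand-built grammar would; this sketch only guarantees the Chomsky normal form shape.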
2.15 Example
In this example we start with a cfg G0, carry out the algorithms given in the proofs of Theorems 2.12 through 2.14, and finally produce an equivalent cfg Gc which is in Chomsky normal form and is equivalent to G0. In all the grammars under discussion, the only terminals will be a and b. In describing the various grammars we will simply give the productions and specify the start symbol. All symbols other than a and b will be nonterminals. G0 is the cfg given in Example 2.2.1 and, as we have seen, L(G0) = {a^n b^n | n ≥ 0}. G0 can be described as follows:
start symbol = S
productions: S → Λ, S → aSb
To get a nonerasing cfg, apply the algorithm in the proof of Theorem 2.12. In the proof of Theorem 2.12, take G1 = G0. The G2 of Theorem 2.12 is then as follows:
start symbol = S2
productions: S2 → S, S → Λ, S → aSb
The G3 of Theorem 2.12 is then as follows:
start symbol = S2
productions: S2 → Λ, S2 → S, S → aSb, S → ab
G3 is a nonerasing cfg equivalent to G0.
Next, to get a nonerasing cfg with no productions of the form A → B, apply the algorithm in the proof of Theorem 2.13 with G = G3, where G3 is the grammar just obtained using Theorem 2.12. The G' of Theorem 2.13 is then as follows:
start symbol = S2
productions: S2 → Λ, S → aSb, S2 → aSb, S → ab, S2 → ab
G' is a nonerasing cfg with no productions of the form A → B. G' is equivalent to the G3 obtained using Theorem 2.12 and so G' is equivalent to G0.
Finally, to get a cfg in Chomsky normal form, apply the algorithm in the proof of Theorem 2.14. In the proof of Theorem 2.14, set G1 = G'. The G2 of Theorem 2.14 is then as follows:
start symbol = S2
productions: S2 → Λ, <a> → a, <b> → b, S → <a>S<b>, S2 → <a>S<b>, S → <a><b>, S2 → <a><b>
Before proceeding, let us first simplify the notation by using the symbols A and B in place of <a> and <b> respectively. We then get a grammar G'2 which is clearly equivalent to G2 and easier to read. G'2 is described as follows:
start symbol = S2
productions: S2 → Λ, A → a, B → b, S → ASB, S2 → ASB, S → AB, S2 → AB
The G' of Theorem 2.14 can be obtained from G'2 rather than from G2. This G' is described as follows:
start symbol = S2
productions: S2 → Λ, A → a, B → b, S → <AS>B, <AS> → AS, S2 → <AS>B, S → AB, S2 → AB
The G' obtained in this way using Theorem 2.14 is in Chomsky normal form and is equivalent to all the above grammars, including G0. This completes the algorithm to find a Chomsky normal form cfg equivalent to G0. To make the grammar easier to read, we will go one step further and simplify the notation. In order to get a more readable grammar, start with this last G' and replace <AS> by X, replace S by Y, and then replace S2 by S. Call the grammar so obtained Gc. Gc can be described as follows:
start symbol = S
productions: S → Λ, A → a, B → b, Y → XB, X → AY, S → XB, Y → AB, S → AB
If we now group together those productions of Gc that have the same symbol on the left-hand side, we get the following, more compact description of Gc:
start symbol = S
productions:
S → Λ | AB | XB
X → AY
Y → AB | XB
A → a
B → b
So Gc is exactly the cfg of Example 2.10.2, which we have already seen to be in Chomsky normal form and equivalent to G0, the grammar of Example 2.2.1.
This example worked out particularly well. In general, the Chomsky normal form grammar produced by this algorithm need not be so clean. It can turn out to have many more productions than are needed and can even have productions that are useless. The grammar produced will always be in Chomsky normal form though and will always be equivalent to the grammar we started out with. If a simple grammar is desired, it is often necessary to perform some additional transformations on the grammar obtained by simply applying the algorithm.
The next result shows that a cfg description of a language can be converted to an effective procedure for testing strings to see whether or not they are in the language.
2.16 Theorem
If G is a cfg, then there is an algorithm that can tell, for any string w, whether or not w is in L(G).
PROOF This is left as an exercise. Hint: First find a Chomsky normal form grammar G' equivalent to G. Then compute a bound on the size of the parse tree for a string of length n. Use this to show that for any length n, we can compute a finite number of derivations such that if w has length n and w is in L(G'), then a derivation of w is in this set. □
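One standard way to realize such a membership test is the CYK dynamic-programming method on a Chomsky normal form grammar. The text leaves the construction as an exercise, so the sketch below is ours, with the same dict-of-tuples grammar encoding as before and the empty tuple standing for Λ.

```python
# CYK membership test for a grammar in Chomsky normal form: one way
# (not the text's) to realize the algorithm promised by Theorem 2.16.
def in_language(w, start, productions):
    if w == "":
        return () in productions.get(start, [])   # only S -> empty gives Λ
    n = len(w)
    # table[i][j] holds the nonterminals deriving the substring w[i:i+j+1]
    table = [[set() for _ in range(n)] for _ in range(n)]
    for i, c in enumerate(w):                     # substrings of length 1
        for a, rhss in productions.items():
            if (c,) in rhss:
                table[i][0].add(a)
    for span in range(2, n + 1):                  # longer substrings
        for i in range(n - span + 1):
            for split in range(1, span):
                for a, rhss in productions.items():
                    for rhs in rhss:
                        if (len(rhs) == 2
                                and rhs[0] in table[i][split - 1]
                                and rhs[1] in table[i + split][span - split - 1]):
                            table[i][span - 1].add(a)
    return start in table[0][n - 1]

# G2 of Example 2.10.2 generates {a^n b^n | n >= 0}:
G2 = {"S": [(), ("A", "B"), ("X", "B")], "X": [("A", "Y")],
      "Y": [("A", "B"), ("X", "B")], "A": [("a",)], "B": [("b",)]}
print(in_language("aabb", "S", G2))
```

The table-filling order mirrors the binary branching of CNF parse trees: every substring of length greater than one must split into two shorter pieces derived by the two nonterminals of some A → BC production.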
THE PUMPING LEMMA

Our next result provides a method whereby, given a suitably long string in a cfl, we can produce infinitely many other strings which are also in the language.
This does not mean that all cfl's are infinite. If the cfl is finite, then there will not be any suitably long strings. This result will be useful in a number of contexts. One of its most common applications is to show that certain languages are not cfl's.

2.17 Theorem (Pumping Lemma for CFL's)
For each context-free language L, we can find an integer k, depending only on L, such that the following holds: If z is in L and length(z) > k, then z may be written as z = uvwxy where
1. length(vwx) ≤ k
2. at least one of v and x is nonempty, and
3. u v^i w x^i y is in L for all i ≥ 0
PROOF Since L is a context-free language, there is a context-free grammar G such that L = L(G). By Theorem 2.14, we may assume that G is in Chomsky normal form. First note that, since G is in Chomsky normal form, we can derive a relationship between the length of a string z in L(G) = L and the longest path from S to a terminal symbol in a parse tree for z. Specifically,
Claim 1a: If the longest path contains at most m nodes (m − 1 arcs), then z has length at most 2^(m−2).
To see this, note that the longest string is obtained if all paths are as long as the longest path. In that case, we start with one node; that node produces two nodes, each of those produces two nodes, and so forth. After m − 2 iterations of this doubling, we have produced a tree with 2^(m−2) nodes at the growing ends. To get a terminal string, each of these 2^(m−2) nonterminal nodes must produce a terminal node. So, after m − 1 arcs we can, in this way, produce a terminal string of length 2^(m−2). Clearly, this is the longest string we can produce under the constraint that no path has more than m − 1 arcs. Another way of stating this relationship is as follows:
Claim 1b: If the length of a terminal string z is greater than 2^n, then some path in the parse tree must contain at least one more than n + 2 nodes.
Let k = 2^n, where n is the number of nonterminals of G, and suppose length(z) > k. Consider a fixed parse tree for z. By Claim 1b, some path, from S to a terminal, has n + 2 or more nodes. Consider the longest such path. Going from S to a terminal along this path, we pass through all the nodes of this path. Consider the last n + 2 such nodes. One is labeled with a terminal and n + 1 are labeled with nonterminals. There are only n nonterminals; therefore, two such nodes must be labeled with the same nonterminal. Call this nonterminal A. The situation is diagrammed in Figure 2.5(a). One way to write a derivation corresponding to this tree is
4. S ⇒* uAy ⇒* uvAxy ⇒* uvwxy = z
where the two A's shown are the ones under consideration. Consider the subtree for A ⇒* vAx ⇒* vwx. Since the first A was chosen to be among the last n + 2 nodes of the longest path, it follows that no path from the first A to a terminal has more than n + 2 nodes. So, by Claim 1a, the string derived, namely vwx, has length at most 2^n. So we have
5. length(vwx) ≤ 2^n = k
Now, by 4, it follows that for all i ≥ 0,
6. S ⇒* uAy ⇒* u v^i A x^i y ⇒* u v^i w x^i y
This is diagrammed in Figure 2.5(b), for the case of i = 2. Finally, it remains only to show that either x or v is nonempty. Again, consider the subtree A ⇒* vAx ⇒* vwx. Since the grammar is in Chomsky normal form, it can be written as A ⇒ BC ⇒* vAx ⇒* vwx = z1z2, where B and C are nonterminals such that B ⇒* z1 and C ⇒* z2. Since all Chomsky normal form grammars are nonerasing, it follows that both z1 and z2 are nonempty. Now the second A must be derived from either the B or the C. If it is derived from the B, then z2 lies completely to the right of w. So z2 is a substring of x and so x is nonempty. This situation is diagrammed in Figure 2.5(a). If A is derived from C, then a similar argument shows that, in this case, v is nonempty. In any case, either x or v is nonempty and the proof is complete. □

We conclude this section with two applications of the pumping lemma. When applying the pumping lemma, it is important to remember that, while we know for certain that any z satisfying the hypothesis of the pumping lemma can be written as z = uvwxy and that this decomposition has the properties listed in the pumping lemma, we do not get to choose how z is divided into u, v, w, x, and y. We only know that there is at least one such decomposition. It is also important to note that we can "pump down" as well as "pump up." In the notation of the pumping lemma, we can take i = 0 and conclude that uwy is in L. So we can use the pumping lemma to produce either longer or shorter strings which are in the given cfl.

2.18 Theorem
L = {a^n b^n c^n | n ≥ 0} is not a context-free language.
PROOF Suppose L were a cfl. We will derive a contradiction. Let k be as in the pumping lemma for cfl's. The string a^k b^k c^k is in L. By the pumping lemma, a^k b^k c^k = uvwxy, where
1. u v^2 w x^2 y is in L and
2. either v or x (or both) is nonempty
First note that v must be either all a's, all b's, or all c's. Otherwise, u v^2 w x^2 y would not have only a's followed by b's followed by c's, and so would not be in L. Similarly, x must be all a's, all b's, or all c's. Say v is all a's and x is all b's (the other cases are similar). If v is nonempty, then u v^2 w x^2 y contains more a's than c's. If x is nonempty, then it contains more b's than c's. In
any case, we conclude that u v^2 w x^2 y is not of the form a^n b^n c^n, and so we have contradicted (1). □
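The case analysis in this proof can be checked by brute force for a small k: no decomposition of a^k b^k c^k that satisfies conditions (1) and (2) of the pumping lemma survives pumping with i = 2. The sketch below is ours, not the text's.

```python
# Exhaustive check of the argument of Theorem 2.18 for a small k:
# every decomposition z = uvwxy with len(vwx) <= k and vx nonempty
# pumps (i = 2) to a string outside {a^n b^n c^n}.
def is_anbncn(s):
    n = len(s) // 3
    return s == "a" * n + "b" * n + "c" * n

def pumping_fails_everywhere(k):
    z = "a" * k + "b" * k + "c" * k
    for i in range(len(z) + 1):                   # u = z[:i]
        for j in range(i, min(i + k, len(z)) + 1):  # vwx = z[i:j], length <= k
            for p in range(i, j + 1):             # v = z[i:p]
                for q in range(p, j + 1):         # w = z[p:q], x = z[q:j]
                    u, v, w, x, y = z[:i], z[i:p], z[p:q], z[q:j], z[j:]
                    if (v or x) and is_anbncn(u + v * 2 + w + x * 2 + y):
                        return False              # this decomposition survives
    return True                                   # no decomposition survives

print(pumping_fails_everywhere(4))
```

A decomposition that straddles two letter blocks pumps to a string with letters out of order, and one inside a single block unbalances the counts; both failure modes are exactly the cases distinguished in the proof.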
2.19 Theorem
There is an algorithm to determine if a given context-free grammar generates a finite or infinite number of words.
PROOF Let G be a cfg, let L = L(G), and let Σ be an alphabet such that L is a subset of Σ*. We will first give an algorithm that produces a finite set F of words such that L is infinite if and only if there is some word w in both F and L. Then to test if L is infinite, we test whether or not w is in L for each w in F. If any such w is in L, then L is infinite. If no such w is in L, then L is finite.
Let F = {z | z ∈ Σ* and k < length(z) ≤ 2k}, where k is the constant associated with G by the pumping lemma 2.17. The next claim tells us that this F is just the sort of set we need for our plan.
Claim 1: L is infinite if and only if there is some word z in F which is also in L.
In order to prove Claim 1, it will suffice to prove two things: (1) if there is a z in L ∩ F, then L is infinite, and (2) if L is infinite, then L ∩ F is not empty.
We first prove (1). Suppose z is in L ∩ F. Then length(z) > k. By the pumping lemma, z = uvwxy where vx ≠ Λ and u v^i w x^i y is in L for all i = 0, 1, 2, .... So there are an infinite number of strings in L, namely, the u v^i w x^i y. Thus (1) is shown.
Next we show (2). Suppose L is infinite. We must show that L ∩ F is not empty. Since L is infinite, it follows that there are arbitrarily long strings in L. So there must be some string z in L such that length(z) > k. Now if length(z) ≤ 2k, then z is in F. Thus z is in L ∩ F and (2) is true. Unfortunately, all we know is that length(z) > k. It could be true that length(z) > 2k. Assume the worst, namely, that length(z) > 2k. We will find another word z' such that z' is in L and k < length(z') ≤ 2k. The string z' is obtained from z by using the pumping lemma. By the pumping lemma, z = uvwxy where vx ≠ Λ, length(vwx) ≤ k, and uwy is in L. Now length(z) > 2k and, since length(vwx) ≤ k, length(uwy) ≥ length(z) − k. Therefore, length(uwy) > k. But vx ≠ Λ. Thus uwy is shorter than z. So, starting with any z such that z is in L and length(z) > 2k, we can produce a string t, namely, t = uwy, such that length(t) > k and length(t) < length(z). If length(t) ≤ 2k, then set z' = t and we have our z' in F ∩ L. If length(t) > 2k, then just as we produced a shorter string in L from z, we can produce a shorter string in L from t. We repeat this process again and again, and get a list of strings in L, each shorter than the previous one. We continue to do this so long as the strings produced are longer than 2k. Since each string produced is shorter than the previous string, we eventually get a string z' in L such that length(z') ≤ 2k and, by the way we are producing these strings, we also know length(z') > k. Therefore, z' is in F ∩ L. Thus F ∩ L is nonempty and (2) is proven.
Figure 2.5(a): the parse tree for z = uvwxy, showing the two occurrences of the nonterminal A on the longest path (the upper A deriving vwx, the lower A deriving w).
Figure 2.5(b): the pumped tree for u v^2 w x^2 y, obtained by stacking a second copy of the subtree for A ⇒* vAx.
Procedure 2.1 Algorithm to tell if L is finite or infinite.
1. Produce F = {z | k < length(z) ≤ 2k}.
2. For each z in F, test if z is in L.
3. If one such test gives an affirmative answer, output "L is infinite."
4. If all tests in 2 give a negative answer, output "L is finite."
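Procedure 2.1 can be sketched directly, assuming a membership test for L (Theorem 2.16) and the pumping constant k; here both are taken as parameters, and the function names and the directly coded example language are ours, not the text's.

```python
# Sketch of Procedure 2.1: L is infinite iff some string z with
# k < len(z) <= 2k is in L (Claim 1).  The membership test and the
# pumping constant k are supplied by the caller.
from itertools import product

def is_infinite(sigma, k, in_language):
    for length in range(k + 1, 2 * k + 1):        # step 1: produce F
        for chars in product(sigma, repeat=length):
            if in_language("".join(chars)):       # step 2: test membership
                return True                       # step 3: L is infinite
    return False                                  # step 4: L is finite

# Toy check with L = {a^n b^n | n >= 0}, membership coded directly:
anbn = lambda s: len(s) % 2 == 0 and s == "a" * (len(s) // 2) + "b" * (len(s) // 2)
print(is_infinite("ab", 4, anbn))
```

The search is exponential in 2k, which is fine for a decidability argument; practical tools would instead look for a useful nonterminal on a cycle of the grammar.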
The algorithm is now easy to state. Procedure 2.1 is the algorithm. By Theorem 2.16, we know we can write a subroutine to test if an arbitrary string is in L. So step 2 can be converted to an algorithm. It is easy to write a subroutine to do step 1, namely, to generate F. Steps 3 and 4 require no subroutines. Thus the thing we called an "algorithm" can indeed be converted to an algorithm. □

EXERCISES
1. Write cfg's for each of the following languages:
a. {αα^R | α ∈ {a, b}*}, where α^R is α written backward. For example, (abb)^R = bba.
b. {a^i b^j c^k | i = j or j = k}.
c. The set of all well-formed parenthesis strings. For example, (()), ()(), and (()(())) are well formed. (())) and )( are not.
2. Prove that:
a. If L1 and L2 are cfl's, then L1 ∪ L2 is a cfl.
b. If L1 and L2 are unambiguous cfl's and L1 ∩ L2 is empty, then L1 ∪ L2 is an unambiguous cfl.
3. Give an example of two cfl's L1 and L2 such that L1 ∩ L2 is not a cfl.
4. Prove that every finite language is a cfl.
5. Prove that the following are not cfl's:
a. {aⁿb²ⁿc³ⁿ | n ≥ 0}
b. {αα | α ∈ {a, b}*}
c. {aⁱ | i a prime}
d. {aⁱ | i = n², for some positive integer n}
e. {aⁿbⁿcbⁿaⁿ | n ≥ 1}
6. Give an example of a cfl L and an alphabet Σ such that L ⊆ Σ* and Σ* − L is not a cfl.
7. Complete the proof of Theorem 2.14 by proving the claim that L(G') = L(G2).
8. Prove Theorem 2.16.
9. Use the algorithm given in the proof of Theorem 2.12 to find a nonerasing cfg equivalent to the following cfg: G = (N, T, P, S) where N = {S, X}, T = {a, b}, P = {S → XS, S → A, X → aXb, X → A}.
10. What is L(G)? G as in 9.
11. For each cfg G below, use the algorithm given in Theorem 2.14 to find a cfg in Chomsky normal form that is equivalent to G.
a. G = (N, T, P, S) where N = {S, A, B, C}, T = {a, b, c}, P = {S → ABC, A → aA, A → a, B → bB, B → b, C → cC, C → c}.
b. G = (N, T, P, S) where N = {S, A, B, C}, T = {a, b, c}, P = {S → ABC, C → A, A → a, B → b, C → c}.
c. G as in 9.
12. For each G in 11, describe L(G).
13. Give an algorithm that will determine for any cfg G whether or not L(G) is empty, that is, whether or not it is possible to generate any terminal string from the start symbol. (Hint: It is much like the algorithm given in the proof of Theorem 2.19.)
Chapter Three
Finite-State Acceptors
In this chapter we present a formal model for digital computers. Later on we will consider other models which capture more details of the computing process, but this model will always play a central role in our considerations. The model is called a finite-state acceptor. A finite-state acceptor is nothing but a digital computer with some restrictions on how it processes input and output. Any digital computer that satisfies these input and output restrictions can be modeled by the mathematics of finite-state acceptors. By digital computer we mean any piece of hardware that is finite and that proceeds in discrete steps. There is a first instant of time, then the machine changes its configuration and is in a new configuration at the second instant of time, then the machine changes its configuration and is in another configuration at the third instant of time, and so forth. The total machine configuration (position of cogs and wheels or orientation of the magnetic field of cores or current in a VLSI chip or whatever) is the state of the machine. If the machine is a real machine, it will be finite and so have only a finite number of configurations or states. While we will place some restrictions on input and output processing, we will place absolutely no restrictions on how the device computes except that it be finite and discrete (digital). The input/output conventions are that it receives its input as a string of symbols, that it only gets one left-to-right scan of this input, and that, immediately after reading the input, it must give one of two outputs which we will refer to as "acceptance" and "rejection." Thus these devices can only answer questions that admit yes/no answers. The set of possible input strings is partitioned into two subsets, those accepted and those rejected. We are modeling the entire computer configuration as a single state. If it is a "hardware-programmed" special-purpose computer, then it is the hardware that is being modeled.
If it is a programmable computer we are modeling, then we are modeling both the hardware and a single, fixed program. Every time we change the program we get a new finite-state model. We are really emphasizing programs at least as much as hardware. In fact, this text is only concerned with programming, but the mathematics does in some gross sense model the hardware as well. Many people find it a help to their intuition to consider finite-state acceptors as hardware devices. For this reason, we sometimes refer to a finite-state acceptor as a finite-state machine. A finite-state acceptor is neither hardware nor exactly a program but a formal mathematical construct that can model both of these. We now describe the working of this abstract mathematical device in more detail and then give a formal mathematical definition for finite-state acceptors.
CONCEPTUAL DESCRIPTION

A finite-state acceptor consists of a device that can exist in only a finite number of different states. There is an input alphabet associated with the device. Any string of symbols from this alphabet may be given to the machine as input. The finite-state acceptor operates in discrete steps. There is a first instant of time, a second instant of time, and so forth, until the end of the computation. The device reads one symbol of the input during each unit of time. On the basis of its state and this symbol, it changes to a new state. This constitutes a single move. It is then ready to perform another move with the next input symbol and this new state. This process continues until the entire input string is read. After reading all the input, the machine either accepts or rejects the input. That is, it somehow outputs a bit of information which indicates either acceptance or rejection of the input. Thus these devices may be viewed as checking to see whether or not the input has some special property. In this text a finite-state acceptor will be defined as an abstract mathematical object consisting of five distinguishable items: a finite set of states, an input alphabet, a designated start state, a next-state or transition function which maps each state/input symbol combination into some other state, and a partitioning of the states into the categories of accepting states and rejecting states. More formally, a finite-state acceptor is a five-tuple (S, Σ, δ, s, Y) where the five elements are the states, input alphabet, next-state function, start state, and accepting states respectively. These five elements must satisfy certain properties in order for this five-tuple to be a finite-state acceptor. S must be a finite set of things which we will call states. The states may be any well-defined mathematical object. So S may be a set of symbols or a set of numbers or a set of labels or a mixture of some of each or any other mathematical entities. S may even be a set of sets. Σ must be a finite set of symbols which we call the input alphabet. The element δ must be a function from S × Σ into S. That is, δ is a function of two arguments: the first is a state from S; the second is a symbol from Σ. If p is in S and a is in Σ, then δ(p, a) must be a state in S. (We will merely require that δ be a partial function. So δ(p, a) is either a state in S or is undefined, but this point need not concern us yet.) If δ(p, a) = q, then this has the following intuitive meaning. If the machine is in state p about to
read a as the next symbol in the input string, then the machine will read the a and enter state q. The fourth entry s will be called the start state. It must be an element of the set S of all states. The last entry Y must be a subset of S. If, after reading an input string, the machine is in a state in Y, then we say the input is accepted. If, after reading the input string, the machine is in a state which is not in Y, then we say that the machine rejects the input. States in Y are called accepting states. A simple example of a finite-state acceptor is diagrammed in Figure 3.1. There are two states: one labeled <even> and one labeled <odd>. The input alphabet consists of the two digits 0 and 1. The start state is the one labeled <even>. The next-state function is represented by the arrows in the diagram. Given input 1, the machine changes state. This is indicated by the arrows labeled 1. Given input 0, the machine remains in the same state. This is indicated by the arrows labeled 0. The state labeled <odd> is an accepting state. The state labeled <even> is a rejecting state. (The rejecting state is not labeled in the figure.) Given input 11010, for example, the machine would go through the following states in order: <even>, <odd>, <even>, <even>, <odd>, <odd>. Since the last state, <odd>, is an accepting state, the string 11010 is accepted. This machine will accept those strings of zeros and ones which contain an odd number of ones and reject those strings of zeros and ones which contain an even number of ones. In terms of the mathematical formalism, this machine is a five-tuple (S, Σ, δ, s, Y) where S = {<even>, <odd>}, Σ = {0, 1}, s = <even>, Y = {<odd>}, and δ is defined by the following equations: δ(<even>, 0) = <even>, δ(<even>, 1) = <odd>, δ(<odd>, 0) = <odd>, and δ(<odd>, 1) = <even>.
Figure 3.1 (start state <even>; accepting state <odd>)
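The machine of Figure 3.1 can be transcribed directly into code. The following Python sketch is ours, not the book's; it represents δ as a dictionary keyed by state-symbol pairs:

```python
# The odd-number-of-ones acceptor: S = {even, odd}, start state
# "even", accepting states Y = {"odd"}.
delta = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd",  "0"): "odd",  ("odd",  "1"): "even",
}

def accepts(w):
    state = "even"                      # the start state s
    for symbol in w:                    # one left-to-right scan
        state = delta[(state, symbol)]
    return state == "odd"               # accept iff final state is in Y
```

On the input 11010 this passes through even, odd, even, even, odd, odd and accepts, just as in the text.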
It can be argued that this model is too restricted to correspond to an arbitrary computer. For example, the machine only gets to read the input once. Given an input string of length n, it must, in fact, complete its computation in n steps. We will eventually show that this restriction does not at all restrict the computing power of these machines. Also, these machines have very restricted output. There are only two possible outputs: either the input is accepted or it is rejected. In the next section we will extend this model to allow arbitrary strings of symbols as output. These obvious objections to the finite-state model turn
out to be without substance. There is, however, one serious shortcoming of the finite-state acceptor model. The model is too restricted to capture the notion of an effective procedure in its full generality. Every finite-state acceptor is an effective procedure. However, there are some effective procedures which cannot be carried out by finite-state acceptors. This will be proven formally later in this book. The intuitive reason for this is easy to see. We said that an effective procedure could have access to a storage facility to hold intermediate results and that there should be no limit to the size of this storage facility. A finite-state acceptor can use its states to, in effect, store intermediate results, but there is a limit to how many intermediate results it can hold. This limitation turns out to be a real limitation. Later on we will define computer models that do not have this limitation and we will see that these models can do more than finite-state acceptors. For this reason we will eventually argue for a different model of the computing process. However, we should not dismiss the finite-state model as being wrong. After all, it can model any real computer with real size restrictions. The only real shortcoming of the finite-state acceptor model is that it is not detailed enough to capture all the concepts we wish to investigate, but it will serve as our first approximation for modeling digital computers. It will also turn up again as a subpart of other more sophisticated models.

NONDETERMINISTIC MACHINES
The model we described above will be called a deterministic finite-state acceptor. We will also use a more general concept than this. Machines of this more general type are called nondeterministic finite-state acceptors. We refer to them as though they were machines. They do not, however, correspond in an obvious way to physical machines. It is possible to build physical machines that closely mirror the operation of abstract deterministic finite-state acceptors. However, for many nondeterministic finite-state acceptors, there may be no obvious sense in which they can be physically realized by the common types of physical machines. They are a purely abstract concept. There are physical machines which can accomplish the same task as a given nondeterministic acceptor. However, they might not mirror the step-by-step operations of the nondeterministic acceptor very closely. Since they do not correspond closely to physical machines, one may wonder why we bother to study nondeterministic machines at all. The answer is that there is only a small amount of extra effort involved in carrying along the added generality of nondeterminism and that nondeterminism has proven to be a fruitful concept in a number of areas of computer science. At first these nondeterministic machines may seem a bit peculiar but once the formal mathematics becomes well understood and routine, it should become clear that they usually do not complicate the formalism very much and do sometimes simplify proofs greatly. (Exercise 3-6 is a good example of how they can simplify a proof.) The one point that should become clear early on is that it is
frequently much easier to design a nondeterministic finite-state acceptor to accomplish a given task than it is to design a deterministic finite-state acceptor to accomplish the same task. Other reasons for studying nondeterminism will appear later on in this text and in subject matter beyond the scope of this text. One clear example of how nondeterminism arises naturally has to do with context-free languages. Much later on in this text we will see that the context-free languages can be characterized by a type of nondeterministic machine called a pushdown automaton. We will also see that the deterministic version of this machine definitely is not adequate to characterize the context-free languages. It may take some time to see the true fruitfulness of this concept of nondeterminism but, with a little effort, one can quickly see that it is at least natural and not too difficult. A nondeterministic finite-state acceptor differs from a deterministic one in that it may have more than one possible next state for some situations. In other words, its transition function is multiple valued. It maps each symbol-state pair into a finite set of states rather than just one state. Suppose a1a2...an is an input string where the ai are individual symbols. If the acceptor is deterministic, then the computation proceeds as follows. In the start state, it will read a1 and go to some state q1, then it will read a2 and go to some state q2, then read a3 and go to some state q3, and so on. After it reads a1a2...an, it will be in a unique state qn. It accepts if qn is one of the states designated as accepting states. If the machine is nondeterministic, then there may not be a unique computation associated with the input a1a2...an. Instead, there may be many computations. In the start state reading a1, it may have a choice of next states. Each of these states may lead to a different computation or computations.
A nondeterministic machine may have many computations associated with the same input. The input is said to be accepted if at least one of these computations processes the entire input and terminates in an accepting state. One way to think of these nondeterministic machine computations is the following. Whenever a state-symbol pair is mapped onto one single new state, then the computation proceeds just as it does for deterministic acceptors. Whenever a state-symbol pair is mapped into some number m ≥ 2 of next states, then m copies of the machine are created, one corresponding to each state. Each copy of the machine is placed in its corresponding one of the m possible new states. These m machines proceed in parallel to process the rest of the input. If any of these parallel computations reaches a state-symbol pair which yields more than one next state, then that computation branches into a set of machines running in parallel. The input is accepted if at least one out of all these parallel computations manages to process the entire input string and to end up in an accepting state. For example, suppose the input string were abcbd, and the acceptor always has two new states whenever it reads a b, but only one new state when it reads any other symbol. The resulting parallel computations can be diagrammed as in Figure 3.2. The acceptor starts in the start state q0. It reads the first symbol, an a. The state-symbol pair (q0, a) is mapped into one single next state, q1. The machine goes to state q1. It reads
Figure 3.2 (a tree of the parallel computations on the input abcbd, with states q0 through q13 labeling the nodes)
the next symbol, which is b. The state-symbol pair (q1, b) is mapped into the two states q2 and q3. The computation branches. Two copies of the machine are created. One is placed in state q2. The other is placed in state q3. Each of these machines processes the rest of the input. The machine in state q2 reads c. The pair (q2, c) maps onto a single state q4 and so this copy of the machine goes to q4. In parallel, the machine in state q3 reads the same c and goes into some state, q5. At the next step, both computations branch. The input abcbd is accepted if at least one of the end states q10, q11, q12, or q13 is an accepting state. Notice that the nodes of the tree in Figure 3.2 do not necessarily receive unique labels. It could, for example, turn out that q6 = q9 = q2. (It is easy to show that, for suitably large inputs, the tree must necessarily have some nonunique labeling of nodes.) A few paragraphs back we said that nondeterministic machines do not correspond in an obvious way to physical machines. When we said this we had in mind a machine that, in some intuitive sense, could do only one thing at a time. We had in mind what is commonly called a serial machine. If we expand our intuitive notion of a machine to include parallel processing machines, then nondeterministic machines can be realized as physical machines. However, it is perhaps more natural to think of such parallel processing machines as a network of machines rather than as a single machine. Let us consider a specific example. Suppose we wish to design a finite-state acceptor which accepts exactly those inputs of the following form: a string of 0's and 1's such that the total number of 1's is odd, followed by a second string of alternating 1's and 0's beginning with a 1 and ending with a 0. For example, 10111010 is to be accepted because 1011 has an odd number of 1's and 1010 is a string of alternating 1's and 0's.
We already discussed an acceptor which recognizes strings which contain an odd number of ones. It is represented in Figure 3.1. A finite-state machine to recognize strings of alternating ones and zeros is also easy to construct. So it should be possible to
realize a suitable machine by somehow hooking up these two machines. The machine to recognize an odd number of ones is run until it accepts some initial segment of the input. The other acceptor is then run to see if it accepts what is left of the input. There is a problem though. Consider the string 101110. The initial segments 10 and 1011 both contain an odd number of ones. Should we switch to the second machine after 10 or after 1011? If we switch after 10, the input is rejected. If we switch after 1011, the input is accepted. Remember that the decision of what to do after processing 10 must be made before the rest of the input is read. There are a number of solutions to this problem. We wish to give an example of a nondeterministic machine. So we will solve the problem by putting the two acceptors together to form a nondeterministic machine. The resulting machine, in some sense, "guesses" at when it should switch from one routine to the other. The resulting nondeterministic machine is shown in Figure 3.3. This machine is nondeterministic because in state <odd> with 1 as the next symbol, there are two states that the machine may change to. It can change to state <even> or it can change to the first state of the machine that recognizes alternating ones and zeros. The input 101110, for example, gives rise to three different computations: one that stays in the odd-number-of-ones machine for the entire input, one that switches to the alternating machine after the initial segment 1011, and one that switches after the initial segment 10 and then reaches a point where no next move is specified, so the computation ends. Figure 3.4 is a diagram of the situation. It is read in the same way as Figure 3.2. One computation terminates in a nonaccepting state, one computation terminates in an accepting state, and one computation terminates before the entire input is processed. For a nondeterministic machine, we say the machine accepts if at least one computation processes the entire input and ends in an accepting state. Thus 101110 is accepted by this machine. Notice that the notion of rejection has been lost for nondeterministic acceptors.
If one computation does not end in an accepting state, it does not mean the input is rejected. Instead of thinking of the states as accepting and rejecting states, think of them as accepting and "no decision" states. We now turn to the formal development of these concepts.
FORMAL DEFINITIONS AND BASIC RESULTS
3.1 Definition

3.1.1 A (nondeterministic) finite-state acceptor is a five-tuple M = (S, Σ, δ, s, Y) where S is a finite set of things called states, Σ is a finite set of input symbols, δ is a function from S × Σ into subsets of S and is called the transition function, s is an element of S called the start state, and Y is a subset of S called the set of accepting states.
Figure 3.3 (a machine to recognize an odd number of ones, followed by a machine to recognize alternating ones and zeros)
Suppose q is a state of M, a is an input symbol, and δ(q, a) = {p1, p2, ..., pn}. This means that whenever M is in state q and a is the next symbol of the input string to be read, then M may read a and change from state q to any one of the states p1, p2, ..., pn. We must be able to distinguish the five elements of M. We choose to do this by expressing M as an ordered five-tuple. This requires remembering the convention that the states are the first entry, the input alphabet is the second, and so forth. In order to avoid having to use this rather arbitrary convention, we will adopt the following notation: States(M) = S, Symbols(M) = Σ, Instructions(M) = δ, Start(M) = s, and Accepting-states(M) = Y.

Figure 3.4 (the three computations of the machine of Figure 3.3 on the input 101110)
3.1.2 An instantaneous description (id) of M is a pair (p, β), where p is a state in S and β is a string in Σ*. Or, to rephrase things in our more suggestive notation, p is in States(M) and β is in [Symbols(M)]*. The intuitive interpretation of (p, β) is that M is in state p and is about to read the string β of input symbols. It is important to note that β is not necessarily the entire input string but only that portion of the input which has not yet been read.

3.1.3 The next step relation ⊢M is a binary relation on the id's of M and is defined as follows. For any string β in Σ*, any symbol a in Σ, and any states p and q in S, (p, aβ) ⊢M (q, β) holds provided q is an element of δ(p, a). The informal interpretation of (p, aβ) ⊢M (q, β) is that M can change from id (p, aβ) to id (q, β) in one step. If, for example, in reading a, M can go from state p to any of the three states q1, q2, and q3, then (p, aβ) ⊢M (q1, β), (p, aβ) ⊢M (q2, β), and (p, aβ) ⊢M (q3, β) all hold. If M can change from id (p, β) to id (q, α) in some finite number of steps, then we write (p, β) ⊢*M (q, α). More precisely, (p, β) ⊢*M (q, α) if there is a positive integer n, there are symbols a1, a2, ..., an in Σ, and there are states p1, p2, ..., pn+1 in S such that β = a1a2...anα, p = p1, q = pn+1, and (p, β) = (p1, a1a2...anα) ⊢M (p2, a2a3...anα) ⊢M ... ⊢M (pn, anα) ⊢M (pn+1, α) = (q, α). We adopt the convention that (p, β) ⊢*M (p, β), for all id's (p, β). The intuition for this convention is that M can go from id (p, β) to id (p, β) in zero steps. When no confusion can result, we will omit the subscript M and write ⊢ and ⊢* rather than ⊢M and ⊢*M.

3.1.4 The machine M is said to be a deterministic finite-state acceptor provided δ(q, a) contains at most one element for every q in S and a in Σ. If M is deterministic and δ(q, a) = {p}, then we will usually omit the braces and write δ(q, a) = p. If δ(q, a) is the empty set, we will sometimes indicate this by
saying "δ(q, a) is undefined." There is a technical difference between {p} and p. Sometimes this difference can be very important. However, in this context we can identify the two things without causing any problems. The informal description of deterministic finite-state acceptors that opened this chapter assumed δ(p, a) had values in States(M). Technically, by our definition it has values in {{p} | p is in States(M)}, but there is an obvious way of identifying this set with States(M). Note that a deterministic finite-state acceptor is a special kind of nondeterministic finite-state acceptor. To say a machine is nondeterministic does not mean that it is not a deterministic machine. It just means that it is not necessarily a deterministic machine. For this reason, the unqualified term "finite-state acceptor" will always mean nondeterministic finite-state acceptor. (Some authors differ on this usage and one should take care to check definitions carefully when using other sources.)
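Definition 3.1.4 is easy to check mechanically. Here is a small sketch of ours (not from the book), representing a transition function as a dictionary mapping state-symbol pairs to sets of next states, with a missing pair playing the role of the empty set:

```python
def is_deterministic(delta):
    # Deterministic means delta(q, a) contains at most one element
    # for every state q and input symbol a appearing in the table.
    return all(len(next_states) <= 1 for next_states in delta.values())
```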
3.1.5 A computation of M on the input α is a sequence of id's (q0, β0), (q1, β1), ..., (qn, βn) such that
1. (q0, β0) = (s, α), where s = Start(M).
2. (qi, βi) ⊢M (qi+1, βi+1) for i = 0, 1, 2, ..., n−1.
3. (qn, βn) is a halting id. That is, either βn = A or else βn = aγ for some string γ and some symbol a such that δ(qn, a) is empty.

In order to emphasize condition (2), we will usually use ⊢ rather than a comma to punctuate the sequence. Thus we will usually write computations in the form (q0, β0) ⊢ (q1, β1) ⊢ ... ⊢ (qn, βn). A computation on the input α represents one possible sequence of actions for the machine with input α. If the machine is deterministic, then each input will give rise to exactly one computation. If the machine is nondeterministic, then the machine will typically have many different computations on a single input string. To illustrate the concept, suppose that abcbd is the input string in question and that the machine is nondeterministic. To be specific, let us assume that the transition function δ yields exactly two instructions for each argument pair which has b as the symbol and yields exactly one instruction for all other arguments. In the discussion that preceded this definition, we noted that the action of such a machine, on the input abcbd, could be diagrammed by the tree in Figure 3.2. Recall that the nodes are labeled by states and that the arcs represent transitions from one state to another. A computation of the finite-state acceptor on the input abcbd corresponds to a path through this tree from root to tip (that is, from left to right). For example, one path passes through the nodes q0, q1, q2, q4, q7, and q11 in that order. This path corresponds to the computation

(q0, abcbd) ⊢ (q1, bcbd) ⊢ (q2, cbd) ⊢ (q4, bd) ⊢ (q7, d) ⊢ (q11, A)
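Definition 3.1.5 can be animated by enumerating every computation from a given id. The following sketch is ours, not the book's; it again represents δ as a dictionary mapping state-symbol pairs to sets of next states:

```python
def computations(delta, state, beta):
    # Return every computation (as a list of id's) starting from
    # the id (state, beta), following Definition 3.1.5.
    next_states = delta.get((state, beta[0]), set()) if beta else set()
    if not next_states:
        # Halting id: the input is exhausted, or delta yields the
        # empty set of instructions for this state-symbol pair.
        return [[(state, beta)]]
    return [[(state, beta)] + rest
            for q in next_states
            for rest in computations(delta, q, beta[1:])]
```

For a deterministic machine each input yields exactly one computation; a nondeterministic machine may yield several.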
As another illustration, consider Figure 3.4. This represents the action of another machine on the input 101110. In this case, there are three computations corresponding to the three paths through the tree. Notice that one computation does not use up all the input string. This is the computation that switches to the alternating machine after the initial segment 10: (<even>, 101110) ⊢ (<odd>, 01110) ⊢ (<odd>, 1110) ⊢ (p, 110), where p is the state the alternating machine enters after reading a 1. The computation ends before using up all the input because the transition function yields the empty set as the set of instructions for the state-symbol pair (p, 1). 3.1.6
M is said to accept an input string α in [Symbols(M)]* provided: (s, α) ⊢*M (p, A) for some state p in Accepting-states(M). Recall that s is the designated start state and A denotes the empty string. Notice that this definition requires that M be in an accepting state after the entire input is read. If M enters an accepting state before all the input is read, that does not indicate acceptance of the input. The language accepted by M, denoted A(M), is defined to be the set of all strings α accepted by M. A set of strings L is called a finite-state language if L = A(M) for some nondeterministic finite-state acceptor M.
3.2 Examples of Finite-State Acceptors

3.2.1 We will describe a finite-state acceptor M in two different but equivalent ways: one in terms of an ordered five-tuple and one in our more intuitive notation.

Definition (1): Let M = (S, Σ, δ, s, Y) where S = {<even>, <odd>}, Σ = {0, 1}, s = <even>, Y = {<odd>}, and δ is defined by the equations below.

Definition (2): Let M be defined so that States(M) = {<even>, <odd>}, Symbols(M) = {0, 1}, Start(M) = <even>, Accepting-states(M) = {<odd>}, and Instructions(M) = δ where δ is defined by the equations below.

Equations describing δ:
δ(<even>, 0) = {<even>}
δ(<even>, 1) = {<odd>}
δ(<odd>, 0) = {<odd>}
δ(<odd>, 1) = {<even>}

M is a deterministic finite-state acceptor and A(M) consists of all strings of 0's and 1's which contain an odd number of 1's. M is, in fact, the machine diagrammed in Figure 3.1.
3.2.2 Let M be defined so that Symbols(M) = {0, 1}, States(M) = {<even 0>, <even 1>, <odd 0>, <odd 1>, <start>}, Start(M) = <start>, Accepting-states(M) = {<odd 0>, <odd 1>}, and Instructions(M) = δ where δ is defined by:
δ(<start>, 0) = {<even 1>, <odd 0>}
δ(<start>, 1) = {<even 0>, <odd 1>}
δ(<even x>, x) = {<odd x>} for x = 0 or 1
δ(<even x>, y) = {<even x>} for x = 0 or 1 and y ≠ x
δ(<odd x>, x) = {<even x>} for x = 0 or 1
δ(<odd x>, y) = {<odd x>} for x = 0 or 1 and y ≠ x

Then M is a nondeterministic finite-state acceptor and A(M) consists of all strings of 0's and 1's which contain either an odd number of 1's or an odd number of 0's. A diagram for M is given in Figure 3.5.

Figure 3.5
The following sequence of id's shows the complete accepting computation of M on input 010.

(<start>, 010) ⊢ (<even 1>, 10) ⊢ (<odd 1>, 0) ⊢ (<odd 1>, A)
The following sequence of id's shows another complete computation of M on input 010.

(<start>, 010) ⊢ (<odd 0>, 10) ⊢ (<odd 0>, 0) ⊢ (<even 0>, A)
Note that this second computation is not an accepting computation because the last state <even 0> is not an accepting state. So M with input 010 has one computation which ends in an accepting state and one computation which ends in a nonaccepting state. A nondeterministic acceptor is said to accept an input if there is at least one accepting computation. Therefore, 010 is in A(M). Our next result shows that finite-state acceptors can at least perform the simple task of checking to see if their input is on a fixed finite list.
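The rule "accept if at least one computation is accepting" can be checked without enumerating computations one at a time: simulate all of them at once by tracking the set of states some computation could currently be in. The following sketch is ours, not the book's:

```python
def nfa_accepts(delta, start, accepting, w):
    # current holds every state that some computation could be in
    # after reading the input consumed so far; a computation whose
    # instruction set is empty simply drops out of the set.
    current = {start}
    for symbol in w:
        current = {q for p in current
                     for q in delta.get((p, symbol), set())}
    return bool(current & accepting)   # some computation accepts
```

On the machine M of 3.2.2, for instance, this finds the accepting computation on 010 even though another computation ends in the nonaccepting state <even 0>.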
3.3 Theorem If L is any finite set of strings, then we can find a deterministic finite-state acceptor M such that L = A(M).

PROOF Let L be a finite language. We will construct a deterministic finite-state acceptor M such that A(M) = L. Let l be the maximal length of any word in L. On an intuitive level, we can think of M as having a table of all words of length at most l. Those words which are in L will be marked in this table. Given any input, M compares this input to the table. M accepts if the input is short enough to be on the table and is marked to indicate that it is in L. More formally, define M so that Symbols(M) is the set of all symbols that occur in any string in L, States(M) contains all strings in [Symbols(M)]* which have length at most l, and States(M) also contains an additional state t which will serve as a "trap" state. In set-theoretic notation, States(M) = {α | α ∈ [Symbols(M)]* and length(α) ≤ l} ∪ {t}, where t is anything which is not in [Symbols(M)]*, Start(M) = A (recall that A denotes the empty string), Accepting-states(M) = L, and Instructions(M) = δ where δ is specified by the equations:

1. δ(α, a) = {αa}, for all a in Symbols(M) and all strings α of length strictly less than l.
2. δ(α, a) = {t}, for all a in Symbols(M) and all strings α of length l.
3. δ(t, a) = {t}, for all a in Symbols(M).

With this definition of δ and with s defined by s = A = Start(M), the following statements are true:

1. If the length of α is greater than l and (s, α) ⊢* (p, A), then p = t.
2. If the length of α is at most l, then (s, α) ⊢* (α, A).
By (1), (2), and the definition of accepting states, it follows that A (M) = L. D The acceptor M, defined in the above proof, has the property that for every state-symbol pair (p, a), the value of the transition function a(p, a) is nonempty. Intuitively, this means that the only thing that can force a computation of M to terminate is running out of input symbols. Not every acceptor has this property. The acceptor in Figure 3.3 has an empty instruction set for the state-symbol pairs (, 1) and (, 0). So, on the input 101110, there is one computation which terminates after reading only the first three symbols (see Figure 3.4). The next lemma says that every finite-state acceptor is equivalent to a finite-state acceptor with nice properties. One of the nice properties is that the transition function is always defined to be a nonempty set of instructions. This is a purely technical result, with little intuitive content. It does, though, simplify a number of proofs.
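The table machine of Theorem 3.3 can be sketched directly. In this sketch (an illustration, not the book's notation) states are Python strings standing for the remembered prefix, and None stands for the trap state t.

```python
from itertools import product

def finite_language_dfa(language):
    """Theorem 3.3 construction: a deterministic acceptor with L = A(M).
    States are all strings over Symbols(M) of length <= l, plus a trap state."""
    symbols = sorted({a for w in language for a in w})
    l = max((len(w) for w in language), default=0)
    trap = None                           # plays the role of the trap state t
    delta = {}
    for n in range(l + 1):
        for w in map(''.join, product(symbols, repeat=n)):
            for a in symbols:
                # extend the remembered prefix, or fall into the trap
                delta[(w, a)] = w + a if n < l else trap
    for a in symbols:
        delta[(trap, a)] = trap
    return delta, '', set(language)       # start state is the empty string (Lambda)

def run(delta, start, accepting, word):
    state = start
    for a in word:
        state = delta.get((state, a), None)
    return state in accepting

delta, start, accepting = finite_language_dfa({'a', 'ab', 'ba'})
print(run(delta, start, accepting, 'ba'))    # -> True
print(run(delta, start, accepting, 'bb'))    # -> False
```

Any input longer than l drives the machine into the trap state, mirroring statement (1) of the proof.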
3.4 Lemma Given any finite-state acceptor M and any alphabet Σ₂ containing Symbols(M), we can find another finite-state acceptor M₂ such that Symbols(M₂) = Σ₂, A(M) = A(M₂), and, for any state-symbol pair (p, a) in
States(M₂) × Symbols(M₂), δ₂(p, a) is nonempty, where δ₂ denotes Instructions(M₂). Furthermore, if M is deterministic, then M₂ will also be deterministic.

PROOF M₂, defined as follows, clearly has the desired properties. States(M₂) consists of all the states in States(M) plus one additional state t, which intuitively serves as a trap state. M₂ has the same start state as M and the same accepting states as M. Let δ = Instructions(M) and define Instructions(M₂) to be δ₂, where δ₂ is defined in three parts. For any state-symbol pair (p, a) in States(M) × Symbols(M₂), let δ₂(p, a) = δ(p, a), provided δ(p, a) is nonempty. If δ(p, a) is empty, then δ₂(p, a) = {t}. For all a in Symbols(M₂), δ₂(t, a) = {t}. □

3.5 Theorem If L₁ and L₂ are finite-state languages, then L₁ ∪ L₂, L₁ ∩ L₂, and L₁L₂ are also finite-state languages.

PROOF Let M₁ and M₂ be nondeterministic finite-state acceptors such that L₁ = A(M₁) and L₂ = A(M₂). First, notice that we may assume that M₁ and M₂ have the following properties:

1. M₁ and M₂ have the same input alphabet.
2. Let δ₁ = Instructions(M₁) and δ₂ = Instructions(M₂). For any pair (p, a) in States(M₁) × Symbols(M₁), δ₁(p, a) is nonempty. Similarly, δ₂(p, a) is nonempty for any pair (p, a) in States(M₂) × Symbols(M₂).

Lemma 3.4 guarantees that we can find M₁ and M₂ which satisfy assumptions (1) and (2). The theorem is proven by defining a finite-state acceptor to accept each of the languages in question. First we define M∪, which accepts L₁ ∪ L₂. States(M∪) = States(M₁) × States(M₂) and Symbols(M∪) = Symbols(M₁) = Symbols(M₂). The transition function δ∪ of M∪ is just δ₁ and δ₂ applied to the two coordinates. That is, for all p₁ in States(M₁), all p₂ in States(M₂), and all a in Symbols(M∪), define δ∪((p₁, p₂), a) = {(q₁, q₂) | q₁ is in δ₁(p₁, a) and q₂ is in δ₂(p₂, a)}. Set Start(M∪) = (Start(M₁), Start(M₂)) and Accepting-states(M∪) = {(p, q) | p is an accepting state of M₁ or q is an accepting state of M₂}. M∪ is just M₁ and M₂ run in parallel and so, with these accepting states,
A(M∪) = A(M₁) ∪ A(M₂) = L₁ ∪ L₂
If M∩ is defined exactly as M∪ except that M∩ has final states Accepting-states(M₁) × Accepting-states(M₂), then

A(M∩) = A(M₁) ∩ A(M₂) = L₁ ∩ L₂
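The parallel-run construction for the union and intersection machines can be sketched as follows. The quintuple machine format and the two example machines are illustrative assumptions; both transition functions are assumed total, as Lemma 3.4 permits.

```python
def product_acceptor(m1, m2, mode):
    """Run M1 and M2 in parallel; mode is 'union' or 'intersection'.
    Each machine is (states, symbols, delta, start, accepting), with
    delta mapping (state, symbol) -> set of next states (assumed total)."""
    (s1, sy, d1, q1, f1) = m1
    (s2, _,  d2, q2, f2) = m2
    delta = {((p1, p2), a): {(r1, r2) for r1 in d1[(p1, a)] for r2 in d2[(p2, a)]}
             for p1 in s1 for p2 in s2 for a in sy}
    if mode == 'union':       # accept if either coordinate accepts
        acc = {(p, q) for p in s1 for q in s2 if p in f1 or q in f2}
    else:                     # intersection: both coordinates must accept
        acc = {(p, q) for p in f1 for q in f2}
    return ({(p, q) for p in s1 for q in s2}, sy, delta, (q1, q2), acc)

def naccepts(m, word):
    states, sy, delta, start, acc = m
    current = {start}
    for a in word:
        current = {q for p in current for q in delta[(p, a)]}
    return bool(current & acc)

# M1: even number of a's; M2: last symbol is b (both over {a, b}).
m1 = ({0, 1}, {'a', 'b'},
      {(0, 'a'): {1}, (0, 'b'): {0}, (1, 'a'): {0}, (1, 'b'): {1}},
      0, {0})
m2 = ({'n', 'y'}, {'a', 'b'},
      {('n', 'a'): {'n'}, ('n', 'b'): {'y'}, ('y', 'a'): {'n'}, ('y', 'b'): {'y'}},
      'n', {'y'})
print(naccepts(product_acceptor(m1, m2, 'union'), 'ab'))          # -> True
print(naccepts(product_acceptor(m1, m2, 'intersection'), 'ab'))   # -> False
print(naccepts(product_acceptor(m1, m2, 'intersection'), 'aab'))  # -> True
```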
Next we show that L₁L₂ is a finite-state language. First, consider the special case where Λ is neither in L₁ nor in L₂. Without loss of generality, we will assume that States(M₁) and States(M₂) are disjoint. If they are not disjoint, we can always make them so by replacing each state of M₁ by a state which is not
in States(M₂). Define a nondeterministic finite-state acceptor M_P by setting States(M_P) = States(M₁) ∪ States(M₂), Symbols(M_P) = Symbols(M₁) = Symbols(M₂), Start(M_P) = Start(M₁), Accepting-states(M_P) = Accepting-states(M₂), and Instructions(M_P) = δ_P, where δ_P is defined as follows. For any symbol a and any state q₁ of M₁, δ_P(q₁, a) = δ₁(q₁, a) provided δ₁(q₁, a) does not contain an accepting state; δ_P(q₁, a) = δ₁(q₁, a) ∪ {s₂} provided δ₁(q₁, a) does contain an accepting state. Here s₂ is the start state of M₂. For any symbol a and any state q₂ of M₂, δ_P(q₂, a) = δ₂(q₂, a). We claim that A(M_P) = A(M₁)A(M₂) = L₁L₂. To see this, note that the following statements are equivalent.
1. ξ is in A(M_P).
2. There are strings α and β such that ξ = αβ and (s₁, ξ) = (s₁, αβ) ⊢*_{M_P} (s₂, β) ⊢*_{M_P} (q_f, Λ), where s₁ is the start state of M₁, s₂ is the start state of M₂, and q_f is in Accepting-states(M_P) = Accepting-states(M₂).
3. There are strings α and β such that ξ = αβ, (s₁, αβ) ⊢*_{M₁} (q₁, β), and (s₂, β) ⊢*_{M₂} (q_f, Λ), where q₁ is an accepting state of M₁ and s₁, s₂, q_f are as in (2).
4. There are strings α and β such that ξ = αβ, (s₁, α) ⊢*_{M₁} (q₁, Λ), and (s₂, β) ⊢*_{M₂} (q_f, Λ), where s₁, s₂, q₁, and q_f are as in (3).
5. There are strings α and β such that ξ = αβ, α is in A(M₁) = L₁, and β is in A(M₂) = L₂.
6. ξ is in L₁L₂.

Hence A(M_P) = L₁L₂ and so L₁L₂ is a finite-state language, at least in the case where Λ is in neither L₁ nor L₂. A careful rereading of the above proof will reveal that it is valid even if Λ is in L₂. In order to see this, try β = Λ and q_f = s₂. Hence we know that L₁L₂ is a finite-state language provided L₁ and L₂ are both finite-state languages and Λ is not in L₁. Finally, consider the case where L₁ and L₂ are finite-state languages and Λ is in L₁. With M_P defined as above, A(M_P) = (L₁ − {Λ})L₂. So (L₁ − {Λ})L₂ is a finite-state language. Also, L₁L₂ = (L₁ − {Λ})L₂ ∪ L₂. From these two facts it follows that L₁L₂ is the union of two finite-state languages. Hence, by the first part of this proof, L₁L₂ is a finite-state language in this case as well. Since we have now considered all cases, we know that L₁L₂ is a finite-state language whenever both L₁ and L₂ are finite-state languages. □

Using Theorems 3.3 and 3.5 and the examples, we see that many languages are finite-state languages. Later on we will present additional operations that allow us to produce even more finite-state languages. However, there are still many interesting languages that are not accepted by any finite-state acceptor.
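The concatenation construction in this part of the proof of Theorem 3.5 can also be sketched in code. The quintuple machine format is an illustrative assumption, the two single-word example machines are assumptions as well, and Λ is assumed to be in neither language (the special case treated first in the proof).

```python
def concat_acceptor(m1, m2):
    """Accepts A(M1)A(M2). Whenever a step of M1 could reach an accepting
    state of M1, the machine may also nondeterministically jump to M2's
    start state. The two state sets are assumed disjoint."""
    (s1, sy, d1, q1, f1) = m1
    (s2, _,  d2, q2, f2) = m2
    delta = {}
    for p in s1:
        for a in sy:
            nxt = set(d1[(p, a)])
            if nxt & f1:                  # could finish the M1 part here
                nxt = nxt | {q2}
            delta[(p, a)] = nxt
    for p in s2:
        for a in sy:
            delta[(p, a)] = set(d2[(p, a)])
    return (s1 | s2, sy, delta, q1, f2)   # accepting states are M2's

def naccepts(m, word):
    states, sy, delta, start, acc = m
    current = {start}
    for a in word:
        current = {q for p in current for q in delta[(p, a)]}
    return bool(current & acc)

# M1 accepts exactly {'a'}; M2 accepts exactly {'b'} (t1, t2 are trap states).
m1 = ({'s1', 'f1', 't1'}, {'a', 'b'},
      {('s1', 'a'): {'f1'}, ('s1', 'b'): {'t1'},
       ('f1', 'a'): {'t1'}, ('f1', 'b'): {'t1'},
       ('t1', 'a'): {'t1'}, ('t1', 'b'): {'t1'}},
      's1', {'f1'})
m2 = ({'s2', 'f2', 't2'}, {'a', 'b'},
      {('s2', 'a'): {'t2'}, ('s2', 'b'): {'f2'},
       ('f2', 'a'): {'t2'}, ('f2', 'b'): {'t2'},
       ('t2', 'a'): {'t2'}, ('t2', 'b'): {'t2'}},
      's2', {'f2'})
mp = concat_acceptor(m1, m2)
print(naccepts(mp, 'ab'))   # -> True
print(naccepts(mp, 'a'))    # -> False
```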
Our next result puts an upper bound on how complex a finite-state language can be. It cannot be any more complex than a context-free language. Further on in this chapter we will see that this upper bound is strict. That is, there are context-free languages that are not finite-state languages.
3.6 Theorem If L is a finite-state language, then L is a context-free language.

PROOF Let M be a nondeterministic finite-state acceptor such that A(M) = L. Without loss of generality, we may assume that the states of M are symbols of some sort and that States(M) ∩ Symbols(M) is empty; if M does not have these properties, we can easily replace M by an equivalent machine which does have these properties. We will define a context-free grammar G such that L(G) = A(M) = L. Let G = (N, T, P, S), where Nonterminals(G) = N = States(M), Terminals(G) = T = Symbols(M), the start symbol of G is S = the start state of M, and P consists of all productions which are of form (1) or form (2) below:

1. A → aB, provided B is in δ(A, a), where δ is the transition function of M.
2. A → Λ, where A is an accepting state of M.

The relationship of M and G is given by the next claim.
Claim 1: For any A, B in N, and any α in T*, A ⇒* αB if and only if (A, α) ⊢* (B, Λ).
The claim is proven by induction on the length of α. For the base of the induction, note that if length(α) = 0, then α = Λ. In this case, A ⇒* αB if and only if A = B which, in turn, is equivalent to saying (A, Λ) ⊢* (B, Λ). For the induction step, suppose length(α) is at least one and that the claim is true for all shorter strings and all elements in N. In this case, α = aβ, where a is a single symbol. The following statements are equivalent.

1. A ⇒* αB.
2. A ⇒ aC ⇒* aβB, for some C in N and some β in T*.
3. There is some C such that C is in δ(A, a) and C ⇒* βB.
4. (A, aβ) ⊢ (C, β) and (C, β) ⊢* (B, Λ), for some C.
5. (A, aβ) ⊢* (B, Λ).

Statements (3) and (4) are equivalent by the induction hypothesis and the definitions involved. All other adjacent statements are equivalent by the definitions involved. Thus Claim 1 is proven. To see that L(G) = L, note that, by Claim 1 and the definitions involved, we can show the following chain of statements to be equivalent: α is in L(G); S ⇒* α; S ⇒* αB, for some accepting state B; (S, α) ⊢* (B, Λ), for some accepting state B; α is in A(M). □
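The grammar built in the proof of Theorem 3.6 is easy to generate mechanically. The tuple encoding of productions below, and the bounded enumerator used to check the result, are illustrative assumptions of this sketch.

```python
def fsa_to_cfg(delta, accepting):
    """Theorem 3.6: one production p -> a q for each q in delta(p, a),
    and p -> Lambda (encoded as the empty tuple) for each accepting p."""
    productions = [(p, (a, q)) for (p, a), nxt in delta.items() for q in nxt]
    productions += [(p, ()) for p in accepting]
    return productions

def bounded_language(productions, start, maxlen):
    """All terminal strings of length <= maxlen derivable in this
    right-linear grammar, by expanding sentential forms (prefix, A)."""
    out, frontier = set(), {('', start)}
    while frontier:
        prefix, a_state = frontier.pop()
        for head, body in productions:
            if head != a_state:
                continue
            if body == ():
                out.add(prefix)          # apply A -> Lambda
            elif len(prefix) < maxlen:
                frontier.add((prefix + body[0], body[1]))
    return out

# Acceptor for strings of 1s of even length; the grammar should match it.
delta = {('even', '1'): {'odd'}, ('odd', '1'): {'even'}}
prods = fsa_to_cfg(delta, {'even'})
print(sorted(bounded_language(prods, 'even', 4)))   # -> ['', '11', '1111']
```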
EQUIVALENCE OF DETERMINISTIC AND NONDETERMINISTIC ACCEPTORS

At first glance it may appear that nondeterministic finite-state acceptors are more powerful than deterministic finite-state acceptors. It is sometimes the case that there is no obvious way to define a deterministic finite-state acceptor for a given language, yet there is a very simple nondeterministic finite-state
acceptor that accepts that language. The next result says that deterministic finite-state acceptors are just as powerful as nondeterministic finite-state acceptors.

3.7 Theorem If M is a nondeterministic finite-state acceptor, then we can find a deterministic finite-state acceptor M_D such that A(M) = A(M_D).

PROOF Let M be a nondeterministic finite-state acceptor. We will define a deterministic finite-state acceptor M_D such that A(M_D) = A(M). Before giving a formal construction for M_D, we will describe M_D in a very informal way. Given an input α = a₁a₂...a_n, M_D will keep a list of the possible states that M could enter in a computation on α. After reading a₁a₂...a_i, this list will contain exactly those states which M could possibly be in after reading a₁a₂...a_i. M_D accepts α if, after scanning all of α, this list contains at least one of the accepting states of M. In effect, M_D is simulating all possible computations of M on α. Clearly, this is adequate to determine if M accepts α. These lists are to be nonredundant. That is, each state occurs at most once on any list. Since M has only a finite number of states, there are only a finite number of lists and so M_D has only a finite number of states. For example, Figure 3.4 diagrams the computation of a nondeterministic machine on the input 101110. The corresponding deterministic machine, with the same input, goes through a sequence of seven such lists of states, beginning with the list containing only the start state <even>. In this construction we consider each such list, for example {<even>}, to be a single state of M_D, even though it happens to be a set of states of M. The formal construction of M_D from M is given in the next two paragraphs. M_D has the same input alphabet as M. States(M_D) is the set of all subsets of States(M). The start state of M_D is {s}, where s is the start state of M. Note that the notion of state depends on the machine in question. Each set of M states is a single M_D state. Each single state of M_D is a set of M states.
An element of States(M_D) is an accepting state if it contains at least one accepting state of M. Let δ denote the transition function of M. The transition function δ_D of M_D is defined so that δ_D(H, a) contains the one state J defined by

J = {p | p is in δ(q, a), for some q in H}

for all H in States(M_D) and all input symbols a. An equivalent definition of this J is:

J = ∪_q δ(q, a),

where the union is taken over all states q in H. M_D clearly is a deterministic finite-state acceptor. So it remains to show that A(M_D) = A(M). This follows immediately once the following claim is demonstrated. The claim is quite obvious and for many readers it may not require any proof. However, we will include a formal inductive proof so as to have the practice of doing a fairly easy
inductive proof in preparation for other more difficult proofs.

Claim: For all H in States(M_D), and all strings α, β in Σ*: (s_D, αβ) ⊢*_{M_D} (H, β) if and only if H = {p | (s, αβ) ⊢*_M (p, β)}, where s = Start(M), s_D = Start(M_D), and Σ denotes the common input alphabet of M and M_D.

The claim is proven by induction on the length of α. Recall that s_D = {s}. If α is the empty word, then (s_D, αβ) ⊢*_{M_D} (H, β) if and only if H = s_D = {s} = {p | (s, αβ) = (s, β) ⊢*_M (p, β)}. Hence, the base case of the inductive argument is proven. If α is at least one symbol in length, then α = γa, where a is a single symbol. In this case, we see that the following statements are equivalent:

1. (s_D, γaβ) ⊢*_{M_D} (H, β).
2. There is some J in States(M_D) such that (s_D, γaβ) ⊢*_{M_D} (J, aβ) and (J, aβ) ⊢_{M_D} (H, β).
3. There is some J in States(M_D) such that J = {q | (s, γaβ) ⊢*_M (q, aβ)} and (J, aβ) ⊢_{M_D} (H, β).
4. There is some J in States(M_D) such that J = {q | (s, γaβ) ⊢*_M (q, aβ)} and H = {p | (q, aβ) ⊢_M (p, β) for some q in J}.
5. H = {p | (s, γaβ) ⊢*_M (p, β)}.

Statements (1) and (2) are equivalent by definition of ⊢*_{M_D}; (2) and (3) are equivalent by the induction hypothesis, since length(γ) < length(α); (3) and (4) are equivalent by definition of δ_D; and (4) and (5) are equivalent by definition of ⊢*_M. Finally, we combine this chain of equivalences to conclude that (1) is equivalent to (5). This demonstrates the inductive step and concludes the proof of the claim. To see that A(M) = A(M_D), note that α is in A(M) if and only if (s, α) ⊢*_M (p, Λ) for some M-accepting state p. Also, by the claim, (s, α) ⊢*_M (p, Λ) for some M-accepting state p if and only if (s_D, α) ⊢*_{M_D} (H, Λ) for some H containing an M-accepting state. But H is an M_D-accepting state if and only if H contains an M-accepting state. So α is in A(M) if and only if α is in A(M_D). □

Using the previous theorem, it is easy to show that the class of finite-state languages is closed under complement. To show this, we simply note that any given finite-state language is accepted by a well-behaved, deterministic finite-state machine. Then, to obtain a machine accepting the complement of the given language, we need only interchange the notions of accepting and nonaccepting states.
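The subset construction of Theorem 3.7 can be sketched as code. Frozensets of M-states stand in for single M_D-states; building only the subsets reachable from {s}, rather than the full power set, is a standard shortcut and an assumption of this sketch.

```python
def determinize(symbols, delta, start, accepting):
    """Subset construction: returns a deterministic machine whose states
    are frozensets of the nondeterministic machine's states."""
    start_d = frozenset({start})
    states_d, todo, delta_d = {start_d}, [start_d], {}
    while todo:
        h = todo.pop()
        for a in symbols:
            # J = union of delta(q, a) over all q in H
            j = frozenset(q for p in h for q in delta.get((p, a), set()))
            delta_d[(h, a)] = j
            if j not in states_d:
                states_d.add(j)
                todo.append(j)
    accepting_d = {h for h in states_d if h & accepting}
    return delta_d, start_d, accepting_d

def run_dfa(delta_d, start_d, accepting_d, word):
    h = start_d
    for a in word:
        h = delta_d[(h, a)]
    return h in accepting_d

# Nondeterministic acceptor: the second-to-last symbol is 1.
delta = {('s', '0'): {'s'}, ('s', '1'): {'s', 'p'},
         ('p', '0'): {'q'}, ('p', '1'): {'q'}}
dd, sd, ad = determinize({'0', '1'}, delta, 's', {'q'})
print(run_dfa(dd, sd, ad, '0110'))   # -> True
print(run_dfa(dd, sd, ad, '0100'))   # -> False
```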
3.8 Theorem If L is a finite-state language and Σ is an alphabet such that Σ* ⊇ L, then Σ* − L is a finite-state language.
PROOF Suppose L is a finite-state language. By Theorem 3.7 we know that L = A(M), where M is a deterministic finite-state acceptor. By Lemma 3.4, we may assume that Σ = Symbols(M) and that the transition function of M defines a next state for every state-symbol pair of M. Let M_C be the same
finite-state acceptor as M except that M_C has a different set of accepting states from M. The accepting states of M_C are all the states which are not accepting states of M. Clearly, M_C accepts a string α in Σ* if and only if M does not accept α. So A(M_C) = Σ* − L, and Σ* − L is thus a finite-state language. □

THE PUMPING LEMMA AND LANGUAGES WHICH ARE NOT FINITE-STATE

Before proceeding to our next result, we need to discuss a simple property of finite-state acceptors. This property is an immediate consequence of the definitions involved. Suppose M is a finite-state acceptor, α, β, and γ are strings of input symbols for M, and both p and q are states of M. Suppose further that β ≠ γ. In proofs, we will frequently show that (p, αβ) ⊢* (q, β) and then conclude that (p, αγ) ⊢* (q, γ). This is perfectly valid, since the relation ⊢* does not depend on the input that is not read. The fact that the relation (p, αβ) ⊢* (q, β) holds depends on M, p, q, and α but has nothing to do with β. So, if we substitute γ for β, it is still true. This point is quite obvious. The only reason we are belaboring it is that we will use it often and we do not want to stop to justify it each time. So we are now discussing it once and for all. The next results show that there are context-free languages which are not finite-state languages. So the class of finite-state languages is a proper subclass of the context-free languages.

3.9 Theorem L = {a^n b^n | n ≥ 1} is not a finite-state language.
PROOF We wish to show that there is no finite-state acceptor M such that A(M) = L. By Theorem 3.7, it will suffice to consider only deterministic acceptors. Hence, it suffices to establish the following claim.

Claim: If M is a deterministic finite-state acceptor and M accepts every string in L, then M will also accept some string which is not in L.

Suppose M is as described in the hypothesis of the claim. M has only a finite number of states but there are infinitely many strings of a's. So it follows that there are two different strings of a's which leave M in the same state. That is, there are i and j such that i ≠ j and, for some single state p, both (s, a^i) ⊢* (p, Λ) and (s, a^j) ⊢* (p, Λ), where s is the start state of M. Hence, if q is the unique state such that (s, a^i b^i) ⊢* (p, b^i) ⊢* (q, Λ), then also (s, a^j b^i) ⊢* (p, b^i) ⊢* (q, Λ). Now a^i b^i is in L and, by our hypothesis, M accepts all strings in L. So q is an accepting state of M. But (s, a^j b^i) ⊢* (q, Λ). So a^j b^i is accepted by M. Finally, recall that j ≠ i, and so a^j b^i is a string which is not in L but is accepted by M. This establishes the claim. □

Our next result is a pumping lemma for finite-state languages which is rather like the pumping lemma for context-free languages. Since finite-state languages are simpler than cfl's, the proof is simpler and the result is, in a sense, simpler. In this case, we are guaranteed that there is a substring, called
β in this pumping lemma, which can be "pumped up." As in the cfl case, we do not get to choose β. We only know that there is some such β. However, we do get to choose any substring ξ and, provided ξ is sufficiently long, we at least know that β lies somewhere within ξ.
3.10 Theorem (The Pumping Lemma for Finite-State Acceptors) Let M be a nondeterministic finite-state acceptor with k states. Suppose ζ is in A(M) and ζ = μξν, where length(ξ) ≥ k and μ, ν are any (possibly empty) strings. It then follows that ξ can be written in the form ξ = αβγ, where 1 ≤ length(β) ≤ k and where μαβ^i γν is in A(M) for all i ≥ 0.

PROOF (The proof is a little easier to follow if you assume M is a deterministic machine. You may find it helpful to make this assumption in your first reading of this proof.) Let M and k be as in the statement of the theorem. Suppose ζ = μξν is in A(M) and length(ξ) ≥ k. Consider any one fixed accepting computation of M on μξν: (s, μξν) ⊢* (p₁, ξν) ⊢* (p₂, ν) ⊢* (p₃, Λ), where s is the start state and p₃ is an accepting state. M takes at least k steps in the partial computation (p₁, ξν) ⊢* (p₂, ν). So it must enter at least k + 1 states in this partial computation. But M only has k states. So it must enter some state twice in this computation. That is, ξ = αβγ, where β is nonempty and

1. (p₁, ξν) = (p₁, αβγν) ⊢* (q, βγν) ⊢* (q, γν) ⊢* (p₂, ν) for some state q.

Choose the decomposition ξ = αβγ so that (1) is true, β is nonempty, and, within these constraints, β is as short as possible. In the partial computation (1), M goes from state q to state q in processing β. If ββ is inserted in place of β, then M can go from state q to state q in processing the first β and then again go from state q to state q in processing the second β. Whether M gets μξν = μαβγν or μαββγν as input, it can still end up in the final state p₃. That is,

(s, μαββγν) ⊢* (p₁, αββγν) ⊢* (q, ββγν) ⊢* (q, βγν) ⊢* (q, γν) ⊢* (p₂, ν) ⊢* (p₃, Λ)

Similarly, for any i ≥ 0, (s, μαβ^i γν) ⊢* (q, β^i γν) ⊢* (q, γν) ⊢* (p₃, Λ). It remains to show that length(β) ≤ k. But β was chosen to be as short as possible and still both be nonempty and satisfy (1). Suppose length(β) > k. We will derive a contradiction. Namely, we will show that there is a nonempty β′ such that β′ is shorter than β and still satisfies (1). Since length(β) > k, M enters at least k + 2 states in processing β. So β = α′β′γ′, where 1 ≤ length(β′) < length(β) and (q, β) = (q, α′β′γ′) ⊢* (q₂, β′γ′) ⊢* (q₂, γ′) ⊢* (q, Λ), for some state q₂. Therefore, (1) is still true if we replace β by β′, α by αα′, γ by γ′γ, and q by q₂. This contradicts the fact that β has been chosen to have minimal length. □
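For a deterministic acceptor, the repeated state in the proof above can be found by a single scan. This sketch (an illustration taking μ and ν empty, with an assumed deterministic transition function total on the symbols that occur) returns a decomposition ξ = αβγ and lets you pump β.

```python
def pump_decomposition(delta, start, word):
    """Scan word, recording when each state is first entered; the first
    revisited state yields word = alpha beta gamma with beta nonempty."""
    state, first_seen = start, {start: 0}
    for i, a in enumerate(word):
        state = delta[(state, a)]
        if state in first_seen:              # loop found: a state repeated
            lo, hi = first_seen[state], i + 1
            return word[:lo], word[lo:hi], word[hi:]
        first_seen[state] = i + 1
    return None                              # no repeat within this word

def run_dfa(delta, start, accepting, word):
    q = start
    for a in word:
        q = delta[(q, a)]
    return q in accepting

# Deterministic acceptor for an even number of 1s (2 states, so k = 2).
delta = {('even', '1'): 'odd', ('odd', '1'): 'even'}
alpha, beta, gamma = pump_decomposition(delta, 'even', '1111')
print((alpha, beta, gamma))                  # -> ('', '11', '11')
# Every pumped string alpha beta^i gamma is still accepted:
print(all(run_dfa(delta, 'even', {'even'}, alpha + beta * i + gamma)
          for i in range(5)))                # -> True
```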
The pumping lemma imposes rather severe constraints on finite-state languages and is very useful in proving that certain sets are not finite-state
languages. As an example, we present the next theorem.
3.11 Theorem L = {a^n | n is a prime} is not a finite-state language.

PROOF Assume L = A(M) for some finite-state acceptor M. We will derive a contradiction. Let n be a prime which is larger than the number of states of M plus two. Apply the pumping lemma with ξ = a^n and both μ and ν empty. Then a^n = a^h a^i a^j, where i ≥ 1, h + j > 1, and a^h a^(mi) a^j is in A(M) for all m ≥ 0. In particular, a^h a^((h+j)i) a^j is in A(M). But h + (h + j)i + j = (h + j)(i + 1) is not a prime. So A(M) contains at least one string of the form a^n, where n is not a prime. This is the desired contradiction. □

Using the pumping lemma, we can show that there are algorithms to test if a given finite-state acceptor accepts any input at all and to test if it accepts a finite or infinite number of input strings. The next theorem shows that, in order to test these properties, we need only test whether or not the finite-state acceptor will accept some word from a suitable finite list of words. The algorithms for these properties then consist of testing whether or not the given finite-state acceptor does indeed accept any of the strings on this finite list. To test if a finite-state acceptor accepts a string, simply try all possible computations on this string; or else use Theorem 3.7 to obtain an equivalent deterministic machine and then run the deterministic machine with this string as input.
3.12 Theorem Let M be a finite-state acceptor with k states.

1. A(M) is nonempty if and only if M accepts a word of length less than k.
2. A(M) is infinite if and only if M accepts some word of length n with k ≤ n < 2k.

Hence, there is an algorithm to determine for any finite-state acceptor M whether M accepts zero, a finite number, or an infinite number of inputs.

PROOF 1. If M accepts a word of length less than k, then A(M) is obviously nonempty. Conversely, suppose A(M) is nonempty. We will show that M accepts some input of length less than k. Let ξ be a word in A(M). If ξ is of length less than k, we are done. Suppose ξ is of length at least k. By 3.10, the pumping lemma, ξ = αβγ, where β is nonempty and αγ is in A(M). If αγ is of length less than k, we are done. If αγ is of length k or more, we apply the pumping lemma to αγ and get a still shorter word in A(M). Proceeding in this way, we eventually find a word in A(M) of length less than k.

2. Suppose A(M) contains a word ξ such that k ≤ length(ξ) < 2k. By the pumping lemma, ξ = αβγ, where β is nonempty and A(M) contains αβ^i γ for all i. So A(M) is infinite. Conversely, suppose A(M) is infinite. Then A(M) contains some word ξ of length at least k. If ξ is of length less than 2k, we are done. Suppose ξ is of length at least 2k. Again we apply the pumping lemma and set ξ = αβγ,
where 1 ≤ length(β) ≤ k and αγ is in A(M). Since length(ξ) ≥ 2k and length(β) ≤ k, we know length(αγ) ≥ k. If αγ is of length less than 2k, we are done. If not, we again apply the pumping lemma to αγ and get a still shorter word which is in A(M). Proceeding in this way, we eventually find a word in A(M) whose length is between k and 2k. □
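Theorem 3.12 can be turned into literal (if exponential) code: test all words of length less than k for emptiness, and all words of length n with k ≤ n < 2k for infiniteness. A deterministic machine with a total transition function is assumed in this sketch.

```python
from itertools import product

def run_dfa(delta, start, accepting, word):
    q = start
    for a in word:
        q = delta[(q, a)]
    return q in accepting

def words(symbols, n):
    return (''.join(t) for t in product(sorted(symbols), repeat=n))

def is_nonempty(symbols, delta, start, accepting, k):
    # Theorem 3.12(1): A(M) nonempty iff some accepted word has length < k
    return any(run_dfa(delta, start, accepting, w)
               for n in range(k) for w in words(symbols, n))

def is_infinite(symbols, delta, start, accepting, k):
    # Theorem 3.12(2): A(M) infinite iff some accepted word has length n, k <= n < 2k
    return any(run_dfa(delta, start, accepting, w)
               for n in range(k, 2 * k) for w in words(symbols, n))

# Even number of 1s over {0, 1}: nonempty (it accepts Lambda) and infinite.
delta = {('even', '0'): 'even', ('even', '1'): 'odd',
         ('odd', '0'): 'odd',   ('odd', '1'): 'even'}
print(is_nonempty({'0', '1'}, delta, 'even', {'even'}, 2))   # -> True
print(is_infinite({'0', '1'}, delta, 'even', {'even'}, 2))   # -> True
```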
*REGULAR EXPRESSIONS

In this section we will give an alternate characterization of finite-state languages. The characterization will also give us a succinct notation for describing finite-state languages. This notation is called regular expression notation. It will turn out that the finite-state languages are exactly the languages which can be described by regular expressions. Intuitively, a regular expression is a description of a set of words. This description is itself a string made up from individual letters; the symbol *; the symbol V, meaning "or"; parentheses; and two additional special symbols. For example, (aVb) is read as "a or b" and describes the language {a, b}. The expression (aVb)* is read "a or b star" and describes the language {a, b}*. We will use juxtaposition to denote the product of languages. For example, (aVb)(cVd) will be a regular expression denoting the language {a, b}{c, d} = {ac, ad, bc, bd}. We will also use the symbols ∅ and Λ to denote the empty set and the empty string respectively. For example, (ΛVa) will be a regular expression denoting the language {Λ, a} and (∅Va) will be a regular expression denoting the language {a}. A formal definition of regular expressions can be given in terms of a context-free grammar.

3.13 Definition
3.13.1 Let Σ be an alphabet. The set of regular expressions over Σ is the context-free language generated by the context-free grammar G defined as follows: Terminals(G) = Σ ∪ {(, ), V, *, Λ, ∅}, Nonterminals(G) = {S}, Start(G) = S, and Productions(G) consist of exactly the productions given below:

1. S → ∅, S → Λ
2. S → a, for each a in Σ
3. S → (SS)
4. S → (SVS)
5. S → S*

(We are, of course, assuming that the symbols Λ and ∅ are not in Σ.)

3.13.2 With each regular expression r, we associate a language, denoted L(r). The definition of L(r) is by induction. The induction follows the pattern of
the parse tree for r and is as follows:

1. L(∅) = ∅, L(Λ) = {Λ}.
2. L(a) = {a}, for each a in Σ.
3. If r is (r₁r₂), then L(r) = L(r₁)L(r₂).
4. If r is (r₁Vr₂), then L(r) = L(r₁) ∪ L(r₂).
5. If r is r₁*, then L(r) = (L(r₁))*.

A language H is called a regular language if H = L(r), for some regular expression r.
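Definition 3.13.2 translates directly into a recursive evaluator. A nested-tuple AST stands in for the parse tree (parsing is skipped), and since L(r₁*) is generally infinite, the star case is truncated at a length bound — both are assumptions of this sketch.

```python
def lang(r, bound):
    """L(r), keeping only words of length <= bound.
    AST forms: 'empty', 'lambda', ('sym', a), ('cat', r1, r2),
    ('or', r1, r2), ('star', r1)."""
    if r == 'empty':
        return set()
    if r == 'lambda':
        return {''}
    op = r[0]
    if op == 'sym':
        return {r[1]}
    if op == 'cat':
        return {u + v for u in lang(r[1], bound) for v in lang(r[2], bound)
                if len(u + v) <= bound}
    if op == 'or':
        return lang(r[1], bound) | lang(r[2], bound)
    if op == 'star':                 # iterate products until nothing new fits
        base = lang(r[1], bound)
        result = new = {''}
        while new:
            new = {u + v for u in new for v in base if len(u + v) <= bound} - result
            result |= new
        return result

# ((aVb)(c(de))) from Example 3.14.2:
r = ('cat', ('or', ('sym', 'a'), ('sym', 'b')),
     ('cat', ('sym', 'c'), ('cat', ('sym', 'd'), ('sym', 'e'))))
print(sorted(lang(r, 10)))                       # -> ['acde', 'bcde']
print(sorted(lang(('star', ('sym', 'a')), 3)))   # -> ['', 'a', 'aa', 'aaa']
```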
3.14 Examples
3.14.1 a, (aVb), ((ab)c), ∅, and Λ are regular expressions describing the languages {a}, {a, b}, {abc}, ∅, and {Λ}, respectively.
3.14.2 r = ((aVb)(c(de))) is a regular expression. The parse tree for r is given in Figure 3.6. Using the parse tree, you can easily construct L(r). For each subtree of the form S → x, where x is a letter, replace this occurrence of S by {x}. Then work your way up the tree, replacing each S by the language determined by the languages at the nodes just below this S. The method of obtaining L(r) from the parse tree is illustrated in Figure 3.7. In this figure all S's have been replaced by a language obtained from languages below by applying the appropriate operation. L(r) is at the root node. So L(r) = {acde, bcde}.
Figure 3.6 [the parse tree for r = ((aVb)(c(de))), with S at every interior node and the letters a, b, c, d, e at the leaves]
3.14.3 r = (((aVb)c)*V((de)f)) is a regular expression and L(r) = (({a} ∪ {b}){c})* ∪ ({de}{f}) = ({a, b}{c})* ∪ {def} = {ac, bc}* ∪ {def}.
Figure 3.7 [the parse tree of Figure 3.6 with each S replaced by its language, from {a}, {b}, {c}, {d}, {e} at the leaves up to {acde, bcde} at the root]
In order to prove that regular languages are the same as finite-state languages we will need one lemma. Since the lemma follows easily from the definition of regular languages, we will leave its proof as an exercise.
3.15 Lemma If L₁ and L₂ are regular languages, then so are L₁L₂, L₁ ∪ L₂, and L₁*.

3.16 Theorem L is a regular language if and only if L is a finite-state language.

PROOF We will show that all regular languages are finite-state languages and conversely. To see that all regular languages are finite-state languages, just note that, by Theorem 3.3, ∅, {Λ}, and {a} are finite-state languages, where a is a single symbol. If L₁ and L₂ are finite-state languages, then so are L₁ ∪ L₂, L₁L₂, and L₁* (Theorem 3.5 and Exercise 3-6). The fact that all regular languages are finite-state languages then follows by induction. To show that every finite-state language is a regular language, we also proceed by induction. This time we perform induction on the number of states in a finite-state acceptor. Specifically, we will show that all languages accepted by one-state finite-state acceptors are regular languages. After that we will show that if all languages accepted by n-state finite-state acceptors are regular, then so are all languages accepted by (n + 1)-state finite-state acceptors. Consider a one-state finite-state acceptor M. If this one state is not an accepting state, then A(M) = ∅, a regular language. If this one state is an accepting state, then A(M) = {a | (q, a) ⊢ (q, Λ)}*, where q is the unique start and accepting state. Hence, A(M) consists of all words made up from a finite list of symbols a₁, a₂, ..., a_m. Let r = (a₁Va₂V...Va_m)*, where we have
deleted some parentheses in order to make r more readable. Then A (M) = L (r), a regular language. So we have shown the base case of our induction. Suppose M is an n + 1 state finite-state acceptor and assume, inductively, that for any n state finite-state acceptor M, A (M) is a regular language. We must show that A (M) is a regular language. Let q be an arbitrary accepting state of M and let Mq be the finite-state acceptor which is identical to M, except that Mq has q as the only accepting state. (If M has no accepting states, then A (M) = 0, a regular language.) We will show that A (Mq) is a regular language. It will then follow that A (M) is a regular language. This is because A (M) =A (Mq 1) U A (Mq 2) U ... U A (Mq), where QJ, q 2, ... , q1 are the accepting states of M, and because the union of finitely many regular languages is itself a regular language. We now turn to the task of showing that A (Mq) is a regular language. Let x, y and z be states of Mq. Let R.iy be the set of all words a such that (x, a) (y, A) on Mq and no state in the computation is z, except possibly the first state x and possibly the last state y. (x = y = z is allowed, and it is possible that the computation takes zero steps.) If s is the start state of Mq and s ~ q, then A (Mq) = (Rfs)* Rffq· What all this says is that any accepting computation starts in s, passes through s some finite number of times, and then goes from the last occurrence of s to the final state q. If q is the start state as well as the final state of Mq, then A (Mq) = (R3q)*. Hence we know that A (Mq) can be obtained from languages of the form R.iy using product and star operations. Thus, applying Lemma 3.15, we see that in order to prove that A (Mq) is a regular language, it will suffice to show that R.iy is a regular language for any states x, y and z. In order to show this, we need to consider one additional construct. Again, let x, y and z be states of Mq. 
Let Kxy be the set of all symbols a such that (x, a) 1- (y, A) on Mq. Kxy is a finite set of symbols. Hence Kxy can be described by a regular expression consisting of these finite symbols connected by V and suitable parentheses. So Kxy is a regular language. We now proceed to show that all languages of the form R.iy are regular languages. If neither d nor e is equal to z, then Rae = A (M) where M is obtained from M by deleting the state z, changing the start state to d and changing the accepting states so that e is the only accepting state. M has only n states. So, by induction hypothesis, Rae is a regular language, provided that neither d nor e is equal to z. To see that, in general, R.iy is a regular language, note that R;y is the union of Kxy and all sets of the form KxdRaeKey• where d and e are states other than z. If x = y then, in addition, we need to union this set with {A}, which is regular. Thus R.iy is obtained from regular languages by taking finite products and unions. Hence, by Lemma 3.15, R.iy is a regular language. This completes the proof since, as outlined above, R;y regular implies A (Mq) is regular which, in turn, implies A (M) is regular. D
CHAPTER 3. FINITE-STATE ACCEPTORS
*TWO-WAY FINITE-STATE ACCEPTORS

One can argue that our definition of finite-state acceptors is too restricted, since the machine only gets to read the input once. We now lift this input restriction. A two-way finite-state acceptor is just like an ordinary (one-way) finite-state acceptor, except that it may back up through the input string. It is helpful to think of the input as being written on a tape and to think of the acceptor as having some reading device, referred to as a head, which can be positioned at any one symbol of the input. At any point in time, the acceptor is in some state and the head is positioned at some symbol. On the basis of this state-symbol pair, the acceptor changes state and moves the head a maximum of one symbol in either direction. Since we do not want the head to leave the input string, we place end markers at each end of the input. Two-way acceptors can be either deterministic or nondeterministic. We will consider only the deterministic case and leave the nondeterministic case as an exercise.

3.17 Definition

3.17.1 A two-way deterministic finite-state acceptor with end markers is a seven-tuple M = (S, Σ, δ, s, Y, ¢, $) where S, Σ, s, and Y are a finite set of states, an input alphabet, a start state, and a set of accepting states, respectively. These items are the same as they are in an ordinary finite-state acceptor. The ¢ and $ are two symbols not in Σ. They are called the left and right end markers respectively. The transition function δ is a partial function from S × (Σ ∪ {¢, $}) into S × {←, →, ↓} such that, for all q in S, δ(q, ¢) ≠ (p, ←) for any p in S, and δ(q, $) ≠ (p, →) for any p in S. By analogy to one-way finite-state acceptors, we write States(M) for S, Symbols(M) for Σ, Instructions(M) for δ, Start(M) for s, Accepting-states(M) for Y, Left-end(M) for ¢ and Right-end(M) for $. The intuitive meaning of δ is as follows.
If δ(q, a) = (p, x), then when M is in state q scanning symbol a, it will change to state p and shift the tape head either left or right or leave it stationary, depending on whether x is ←, →, or ↓ respectively. Our machines have the rather atypical property of having a fixed input tape and a read head that moves on the tape. This may explain any confusion you have about distinguishing between ← and →. The arrow points in the direction of head movement, not tape movement. One can argue that a stationary head and a moving tape is more realistic. However, our convention is certainly mathematically equivalent to a moving tape and is standard notation. Think of the head as a pointer which points to the scanned symbol. The pointer moves. The tape remains stationary.

3.17.2 An id of M is a pair (p, α▷β), where p is a state of M, αβ is a string in ¢Σ*$, and ▷ is a symbol not in Σ ∪ {¢, $}. The intuitive meaning of (p, α▷β) is that M is in state p, the entire input (including end markers) is αβ, and M is scanning the first symbol of β.
3.17.3 The next step relation ⊢_M is defined as follows. If δ(p, a) = (q, →), then (p, α▷aβ) ⊢_M (q, αa▷β) for all αaβ in ¢Σ*$. If δ(p, a) = (q, ↓), then (p, α▷aβ) ⊢_M (q, α▷aβ). If δ(p, a) = (q, ←), then (p, αc▷aβ) ⊢_M (q, α▷caβ) for all strings αcaβ in ¢Σ*$ such that c is a single symbol. If M can get from id (p, μ) to id (q, γ) in some finite number (possibly zero) of steps, then we write (p, μ) ⊢*_M (q, γ). When M is clear from context, we will omit it from ⊢_M and ⊢*_M.

3.17.4 An id (q, α▷β) is called a halting id if there is no id (q', α'▷β') such that (q, α▷β) ⊢_M (q', α'▷β'). Equivalently, (q, α▷β) is a halting id provided δ(q, a) is undefined, where a is the first symbol of the string β.

3.17.5 A computation of M on the input ξ is a (finite or infinite) sequence of id's (q_0, α_0▷β_0), (q_1, α_1▷β_1), (q_2, α_2▷β_2), ... such that:
1. (q_0, α_0▷β_0) = (s, ▷¢ξ$), where s is the start state and ¢ and $ are the end markers.
2. (q_{i-1}, α_{i-1}▷β_{i-1}) ⊢_M (q_i, α_i▷β_i), for all i > 0 such that (q_i, α_i▷β_i) is in the sequence.
3. If the sequence is finite, then the last id is a halting id.
3.17.6 The language accepted by M is denoted A(M) and is defined to be the set of all strings ξ in Σ* such that (s, ▷¢ξ$) ⊢* (p, ¢ξ▷$) for some accepting state p. Recall that s is the start state, ¢ and $ are the end markers, and Σ = Symbols(M). Note that two conditions must be satisfied in order for an input to be accepted. The acceptor must be in an accepting state and the tape head must be scanning the right-hand end marker. If the acceptor enters an accepting state with the tape head in any other position, this does not constitute an acceptance of the input. (The acceptor may, of course, go on to accept the input at a later stage of the computation.) One can argue that a more reasonable notion of acceptance is to merely require that an accepting state is entered and to put no conditions on the position of the tape head. However, it is easy to show that a language is accepted by an acceptor using one of these two notions of acceptance if and only if it is accepted by some other acceptor using the other notion of acceptance. So, since Definition 3.17.6 will turn out to be mathematically more convenient, it is the definition we will adopt.

3.18 Example M, defined as follows, is a two-way deterministic finite-state acceptor with end markers.

States(M) = {1, 2, 3, 3a, 3b, 4a, 4b, t, y}
Symbols(M) = {a, b, c}
Start(M) = 1
Accepting-states(M) = {y}
Left-end(M) = ¢    Right-end(M) = $
Instructions(M) = δ, where δ is defined by Table 3.1.

Any (q, x) which do not occur in the table will not arise in any computation. So δ(q, x) can be anything for these (q, x). We will assume that δ(q, x) is undefined for these (q, x).

Table 3.1

q    x             δ(q, x)   Comment
1    ¢, a, or b    1, →      Check for c
1    c             2, ↓
1    $             t, ↓
2    a, b, or c    2, ←      Find left end marker
2    ¢             3, →
3    a             3a, →     Is first symbol a or b?
3    b             3b, →
3    c             t, ↓
3a   a, b, or c    3a, →     Find right-end marker, remembering a or b
3b   a, b, or c    3b, →
3a   $             4a, ←
3b   $             4b, ←
4a   a             y, →      If last symbol matches first symbol, accept
4b   b             y, →
4a   b or c        t, ↓
4b   a or c        t, ↓
y    $             y, ↓      t is a "trap" state; y is the accepting state
t    anything      t, ↓
A(M) is the set of all strings xαx, where x = a or b and α is a string containing at least one c. M first checks whether the input contains a c. It then goes back to the beginning of the input, remembers the first symbol of the input, and then shifts right until it finds the right end marker. It then shifts left one square and checks that the last input symbol equals the first input symbol. A diagram of an accepting computation of M is given in Figure 3.8. The observant reader will note that we can design another finite-state acceptor which accepts the same language and yet never shifts left. The next theorem says that we can do this for any two-way acceptor.
[Figure 3.8: diagram of an accepting computation of M on the input babcbcb. The head moves right in state 1 until it reaches the first c, moves left in state 2 back to ¢, moves right in state 3b until it reaches $, then shifts left once and, finding that the last symbol b matches the remembered first symbol, enters the accepting state y on $.]
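Definition 3.17 translates almost line for line into a simulator. Below is a Python sketch (the names are ours, not the book's) loaded with the transition table of Example 3.18. The trap rows for t and the row for (y, $) are omitted, since acceptance is checked before each step and a missing instruction already causes rejection; and because a deterministic acceptor that repeats an id can never accept, a repeated (state, position) pair also means rejection:

```python
# Sketch of a two-way deterministic finite-state acceptor with end markers
# (Definition 3.17), loaded with the machine of Example 3.18.
# delta maps (state, symbol) -> (next state, head move); move is -1, 0, +1.

L, S, R = -1, 0, +1   # shift left, stay, shift right

def example_3_18():
    d = {}
    for x in "¢ab":
        d[("1", x)] = ("1", R)        # check for a c
    d[("1", "c")] = ("2", S)
    d[("1", "$")] = ("t", S)          # no c anywhere: trap
    for x in "abc":
        d[("2", x)] = ("2", L)        # find the left end marker
    d[("2", "¢")] = ("3", R)
    d[("3", "a")] = ("3a", R)         # remember the first symbol
    d[("3", "b")] = ("3b", R)
    d[("3", "c")] = ("t", S)
    for x in "abc":
        d[("3a", x)] = ("3a", R)      # find the right end marker
        d[("3b", x)] = ("3b", R)
    d[("3a", "$")] = ("4a", L)
    d[("3b", "$")] = ("4b", L)
    d[("4a", "a")] = ("y", R)         # last symbol matches first: accept
    d[("4b", "b")] = ("y", R)
    return d

def accepts(xi, delta=None, start="1", accepting=("y",)):
    delta = example_3_18() if delta is None else delta
    tape = "¢" + xi + "$"
    state, pos, seen = start, 0, set()
    while True:
        if state in accepting and tape[pos] == "$":
            return True               # accepting state scanning $
        if (state, pos) in seen or (state, tape[pos]) not in delta:
            return False              # id repeats, or M halts: reject
        seen.add((state, pos))
        state, move = delta[(state, tape[pos])]
        pos += move
```

On the input babcbcb of Figure 3.8 the simulator accepts; on inputs without a c, or whose first and last symbols differ, it rejects.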
3.19 Theorem For any language L, the following three statements are equivalent:
1. L is accepted by some two-way deterministic finite-state acceptor with end markers.
2. L is accepted by some (one-way) deterministic finite-state acceptor.
3. L is accepted by some (one-way) nondeterministic finite-state acceptor.

PROOF The equivalence of (2) and (3) was proven in Theorem 3.7. Statement (2) trivially implies (1). In order to establish this theorem, it will suffice to show that (1) implies (2). To accomplish this, we will show that, given a two-way deterministic finite-state acceptor M with end markers, we can find a one-way deterministic acceptor M_1 such that A(M_1) = A(M). We first describe M_1 informally and then give a formal set-theoretic description of how M_1 is obtained from M. Given an input a_1 a_2 ... a_n, the one-way machine M_1 will simulate the two-way machine M operating on ¢a_1 a_2 ... a_n$, where ¢ and $ are the end markers for M. The difficulty with this plan is that M_1 gets to read through the input only once, while M may move back and forth several times. M_1 overcomes this problem as follows. In scanning the input a_1 a_2 ... a_n, it does two things in parallel. While scanning a_i, it will know the state M would be in the first time M scans a_i. It will compute the state M would be in the first time M scans a_{i+1} and it then carries this information with it when it moves on to a_{i+1}. For the two-way acceptor diagrammed in Figure 3.8, this would yield the following sequence of states: 1, 1, 1, 1, 1, 3b, 3b, 3b, 3b. (The first 1 occurs on ¢, which M_1 never gets to see. M_1 actually starts calculating states with the second 1. M_1 also never gets to see the $. However, it will still compute this last 3b as part of its last state, the state it "falls off the end" in.) While it is scanning a_i, it also computes a function f from States(M) into States(M) ∪ {R}, where R is a new symbol.
For any state q, f(q) = p if, when started on a_i in state q, M will eventually shift right to a_{i+1} and the first time that M shifts its head to a_{i+1}, it will be in state p. If M never shifts its head to a_{i+1}, then f(q) = R. Consider the machine M in Example 3.18, and the input babcbcb. As indicated in Figure 3.9, f(1) = 3b for the second c. The complete definition of f for this symbol occurrence is:
f(1) = 3b, f(2) = 3b, f(3) = R,
f(3a) = 3a, f(3b) = 3b,
f(4a) = R, f(4b) = R, f(t) = R, f(y) = R
These state-function pairs will be the states of M_1. We have not yet said how M_1 will compute these states of M and these functions f. For now, assume that M_1 can somehow compute them. We wish to show that it is possible to define the accepting states of M_1 so that A(M_1) = A(M). In general, when M_1 moves on to a_{i+1}, its state consists of the M state for a_{i+1} and the function for a_i. So when M_1 "falls off" the right-hand end of a_1 a_2 ... a_n, it is in state (q, f), where q is the state M would be in the first time it moves its head to the right-end marker $ and where f is the function for a_n.

[Figure 3.9: diagram of the computation of M on the input babcbcb, restarted at the second c in state 1. The head moves left in state 2 back to ¢, then right in state 3b, crossing the second c in state 3b; hence f(1) = 3b for this symbol occurrence.]

To see if (q, f) is an accepting state, calculate the sequence p_1, p_2, ..., p_m of states that M would be in during each successive time it scans $. Do this as follows: p_1 = q. If δ(p_1, $) = (q_2, ↓) for some q_2, then p_2 = q_2. If δ(p_1, $) = (q_2, ←) for some q_2, then p_2 = f(q_2). In general, if δ(p_i, $) = (q_i, ↓) for some q_i, then p_{i+1} = q_i. If δ(p_i, $) = (q_i, ←), then p_{i+1} = f(q_i). If δ(p_i, $) = (q_i, ←) and f(q_i) = R, then the sequence terminates. If δ(p_i, $) is undefined, then the sequence terminates. Eventually, either this sequence terminates or else some state is repeated. Once a state is repeated, the sequence will produce no new states but will continually cycle through some fixed sequence of states. Stop this calculation once a repeated state is found. So m will be at most equal to the number of states of M. The two-way machine M would accept the input if one of these states is an accepting state of M. So (q, f) is an accepting state of M_1 if one of these p_i is an accepting state of M. We still must show how M_1 can update these functions as it shifts from symbol to symbol. If f is the function for a_i, then the function f⁺ for a_{i+1} is defined as follows:

(*)
Let p be any state of M. f⁺(p) = q if there is a sequence p_1, p_2, ..., p_m of states of M such that p = p_1, δ(p_m, a_{i+1}) = (q, →) and, for each j < m, either δ(p_j, a_{i+1}) = (p_{j+1}, ↓) or else there is a state p' such that δ(p_j, a_{i+1}) = (p', ←) and f(p') = p_{j+1}.
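Rule (*) is itself an algorithm: since the machine is deterministic, one simply follows the states scanning the current square, using f to jump over excursions to the left, and a repeated state means M loops forever. A Python sketch (the names are ours; delta maps (state, symbol) to (next state, move) with move in {-1, 0, +1}):

```python
# Sketch of the update rule (*): given the crossing function f for the
# previous square and the scanned symbol a, compute the crossing function
# f_plus for the current square.  "R" means M never shifts right off a.

def f_plus(f, delta, a, states, R="R"):
    fp = {}
    for p in states:
        seen, cur, result = set(), p, R
        while cur not in seen:          # a repeated state means M loops
            seen.add(cur)
            if (cur, a) not in delta:
                break                   # M halts on this square
            q, move = delta[(cur, a)]
            if move == +1:              # first shift right: f_plus(p) = q
                result = q
                break
            if move == 0:               # stationary move: next state is q
                cur = q
            else:                       # left move: f says where M re-enters
                cur = f.get(q, R)
                if cur == R:
                    break
        fp[p] = result
    return fp
```

For example, on a hypothetical three-state table where p shifts right to q, q stays and becomes r, and r shifts right to p, the function sends p to q and both q and r to p.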
Note that if there is a sequence p_1, p_2, ..., p_m with the above property, then p_k ≠ p_j for k ≠ j. So m is at most as large as the number of states of M. So
there are only a finite number of cases to check in order to compute f⁺(p). If no such sequence exists, then f⁺(p) = R. Also notice that (*), the description of f⁺, does not depend on the numbers i or i+1. It depends only on the function f and the symbol a_{i+1}. For example, if a_{i+1} happens to be the symbol c, then to get the definition of f⁺ replace a_{i+1} by c in (*). You will then see that both f and c occur in the text of (*) but there is no longer any explicit mention of i or i+1 in (*).

We will now give a set-theoretic definition of M_1 which follows the above plan. Symbols(M_1) = Symbols(M). The states of M_1 consist of all pairs (p, f), where p is either R or a state of M and f is a total function from States(M) into States(M) ∪ {R}. R may be anything so long as R is not in States(M). The definition of δ_1 = Instructions(M_1) is given in the next paragraph.

For any state (q, f) of M_1 and any symbol a of M_1, δ_1((q, f), a) = (g(q), g), where g depends on f and a and is defined as follows. For any state p of M, g(p) = p' provided there is a sequence p_1, p_2, ..., p_m of states of M such that p = p_1, δ(p_m, a) = (p', →) and, for each j < m, either δ(p_j, a) = (p_{j+1}, ↓) or else there is a state p'' such that δ(p_j, a) = (p'', ←) and f(p'') = p_{j+1}. If no such sequence of states exists, then g(p) = R. (Recall that δ = Instructions(M).)

The start state of M_1 is (f_0(s), f_0), where s is the start state of M and f_0 is defined in the following way. For any state p of M, f_0(p) = p' provided there is a sequence p_1, p_2, ..., p_m of states of M such that p = p_1, δ(p_j, ¢) = (p_{j+1}, ↓) for each j < m, and δ(p_m, ¢) = (p', →). If no such sequence of states exists, then f_0(p) = R.

The set of accepting states of M_1 is the set of all (q, f) such that there is a sequence p_1, p_2, ..., p_m of states of M with the properties that p_1 = q, p_m is an accepting state of M and, for each i < m, either δ(p_i, $) = (p_{i+1}, ↓)
or else there is a state p' such that δ(p_i, $) = (p', ←) and f(p') = p_{i+1}. The argument to show A(M_1) = A(M) was given in the informal description of M_1. □

EXERCISES

1. Define (one-way) finite-state acceptors to accept each of the following
languages:
a. {a^i b^j c^k | any i, j and k}
b. {(a^3 b^2)^n | n ≥ 0} ∪ {(a^2 b^3)^n | n ≥ 0}
c. {a^n b^n | 0 ≤ n ≤ 10^10}
2. Prove: If L is a finite-state language, then L^R is a finite-state language. L^R is the set of all strings α such that α written backwards is in L. For example, if L = {abc, bba}, then L^R = {cba, abb}. (There is an easy proof using a two-way finite-state acceptor but it is more instructive to do the proof using a one-way nondeterministic acceptor.)
3. Prove that the following are not finite-state languages:
a. {a^n b^{2n} | n ≥ 1}
b. {a^n b^m | m ≥ n ≥ 1} ∪ {a^{2p} | p ≥ 1}
c. {αα^R | α ∈ {a, b}*}. α^R is α written backwards. For example, (abb)^R = bba.
d. {a^n | n = m^2, for some positive integer m}
e. {α | α ∈ {a, b}* and the number of occurrences of a in α is m^2, for some positive integer m}. For example, abaaba is in the language since it has 4 = 2^2 occurrences of a.
f. All well-formed parentheses expressions. For example, ( ), ( ( ) ( ) ), and ( ) ( ) ( ) are well-formed; ) (, ( ( ) ) ), and ( ( ( are not well-formed.
g. All PASCAL programs. (This result is not peculiar to PASCAL but applies to almost any common programming language. Of course, we are assuming that the syntax of any such language allows the possibility of infinitely many programs.)
h. All English sentences.
4. Prove the following generalization of 3.10, the pumping lemma:
Theorem. Let M be a (one-way) finite-state acceptor with k states and suppose ξ is in A(M). If any k or more occurrences of symbols in ξ are designated as marked, then ξ can be written in the form ξ = αβγ, where β contains at least one marked occurrence, but no more than k marked occurrences, and where αβ^i γ is in A(M) for all i ≥ 0.
5. Prove that the following is not a finite-state language (Hint: Use Exercise 4): {ab^{n_1} ab^{n_2} ... ab^{n_m} | m = h^2 for some positive integer h and each n_i ≥ 1}.
6. Prove that if L is a finite-state language, then L* is a finite-state language. (Since this result is used in the proof of Theorem 3.16, it should be proven without using Theorem 3.16.)
7. For any language L, define Init(L) to be the set of all words x such that xy is in L for some word y. Prove that if L is a finite-state language, then so is Init(L).
8. Show each of the following:
a. L((a∨b)c*) = L(ac* ∨ bc*)
b. L(((ac*)a*)*) = L(((aa*)c*)*)
c. L((a*ab ∨ ba)*a*) = L((a ∨ ab ∨ ba)*)
9. Define one-way nondeterministic finite-state acceptors to accept each of the languages in Exercise 8.
10. Give regular expressions to describe each of the following languages:
a. {a^i b^j c^k | any i, j, k}
b. {(a^2 b)^n | n ≥ 0} ∪ {ab^n | n ≥ 0}*
c. {a^n | n ≥ 1}
11. Prove Lemma 3.15.
12. As defined in Definition 3.13, regular expressions are fully parenthesized. One can get by with fewer parentheses and still have readability. For example, ((ab)c) and ((a∨b)∨c) could be written as abc and a∨b∨c, respectively, without being misunderstood. Define a cfg that generates all regular expressions as defined in Definition 3.13 plus all "regular expressions" with excess parentheses deleted.
13. Prove that, for every finite-state language L, there is a nondeterministic finite-state acceptor M such that L = A(M) and the state diagram of M (of the type in Figure 3.3) is planar. By planar, we mean that it can be drawn on a piece of paper in such a way that none of the arcs from states to states crosses any other such arcs. (Hint: Use the proof of Theorem 3.16.)
14. Let M be a two-way deterministic finite-state acceptor. Define A'(M) to be the set of all strings ξ such that (s, ▷¢ξ$) ⊢*_M (p, α▷β), for some accepting state p and some α, β. Here s is the start state and ¢, $ are the end markers. Prove that the following two statements are equivalent for any language L: (a) L = A'(M) for some two-way deterministic finite-state acceptor M; (b) L = A(M) for some two-way deterministic finite-state acceptor M.
15. Give a formal definition of a two-way, nondeterministic finite-state acceptor with end markers. Prove: If M is a two-way, nondeterministic finite-state acceptor with end markers, then there is a deterministic (one-way) finite-state acceptor M' such that A(M) = A(M'). (Hint: This is hard but is like other proofs you have seen.)
16. Prove that the set of all finite-state acceptors with states which are natural numbers and with input alphabet {0, 1} can be put into one-to-one correspondence with the natural numbers. (Hint: This is not really hard but it can be long and tedious.)
17.
Let f be a mapping from the set of finite-state acceptors discussed in Exercise 16 onto the natural numbers such that f is a one-to-one correspondence. Let (f(M))_2 denote the binary numeral for the number f(M). Let L = {(f(M))_2 | M is a finite-state acceptor as in Exercise 16 and (f(M))_2 ∉ A(M)}. Prove that L is not a finite-state language. (Hint: If you identify M with (f(M))_2, then L is something like the set of finite-state acceptors which reject themselves. Try a proof by contradiction. The pumping lemma is not likely to be useful here.)
18. Let M be the finite-state acceptor described in Figure 3.3 and let L = A(M). Use the algorithm given in the proof of Theorem 3.6 to obtain a cfg G such that L(G) = L.
19. Prove: L is a finite-state language if and only if L = L(G) for some cfg G such that all productions are in one of the following forms: A → aB, A → a, A → Λ, where A, B are nonterminals and a is a terminal. (Hint: Half of this is proven in Theorem 3.6.)
20. Prove: If L is a cfl, a is a single symbol and L ⊆ {a}*, then L is a finite-state language. (Hint: Use the pumping lemma for cfl's.)
Chapter Four
Finite-State Transducers
Finite-state acceptors do not have output or, at least, have only very rudimentary output. The acceptance of an input is a type of output, but a very restricted type of output. We want to extend the notion of finite-state machines so as to allow them to output a string of symbols. This type of finite-state machine is called a finite-state transducer. A finite-state transducer is basically a one-way finite-state acceptor which has the additional feature of being able to output a symbol every time it reads an input symbol. The final output string is the concatenation of all these symbols. A valid intuition is to think of the transducer as a kind of one-way finite-state acceptor with the additional feature of an output tape. Like the acceptor, the transducer operates in discrete time. In a single step, the transducer either reads one input symbol or decides to make that step without reading any input; on the basis of its current state and the symbol read (if any), it changes state and writes at most one symbol on the output tape. The symbols on the output tape are written from left to right as the transducer goes from step to step. Notice that, unlike a one-way finite-state acceptor, the transducer may make a move (change state and write output) without reading any input symbol. When this happens we say the transducer has moved under input Λ, the empty string. If the transducer were not allowed to move under input Λ, then the length of every output string would be bounded by the length of its corresponding input string. Other reasons for allowing these Λ moves will become apparent as the theory progresses. Some special care is required in interpreting the output of these machines. The only computations that are considered successful are those which end in an accepting state. The only output strings which are considered valid are those which come from computations that end in an accepting state.
If the computation does not end in an accepting state, the output is discarded. There are several reasons for this feature. It allows the machine to anticipate future input and to abort the computation if the expected input does not arrive. It also allows the transducer to do an additional computation, without input, after it has read the entire input string. Without accepting states, it would be difficult to know when this type of computation has ended. Finally, it should be noted that a finite-state transducer may be a nondeterministic device. Each input can give rise to several different outputs. The accepting states allow the machine to pick out certain of these outputs as valid and reject the rest. Below are the formal details.
DEFINITIONS AND EXAMPLES

4.1 Definition

4.1.1 A (nondeterministic) finite-state transducer is a six-tuple M = (S, Σ, Γ, δ, s, Y), where S is a finite set of states, Σ and Γ are finite sets of symbols called the input and output alphabets respectively, s is an element of S called the start state, Y is a subset of S called the set of accepting states, and δ is a partial function from S × (Σ ∪ {Λ}) into the finite subsets of S × (Γ ∪ {Λ}). The function δ is called the transition function and is constrained so that δ(q, Λ) is empty for all accepting states q. By analogy to finite-state acceptors, we will write States(M) for S, In-symbols(M) for Σ, Out-symbols(M) for Γ, Instructions(M) for δ, Start(M) for s, and Accepting-states(M) for Y. We think of M as scanning the input string from left to right, one symbol at a time. If (q, b) is in δ(p, a), where a is in the input alphabet, then this is to be interpreted to mean that M, when scanning a in state p, may go to state q, output b and advance to the next input symbol. If (q, b) is in δ(p, Λ), then this is to be interpreted to mean that whenever M is in state p it may, no matter what the scanned input symbol is, change to state q, output b and continue to scan the same input symbol.

4.1.2 An id of M is a triple (p, β, γ), where p is a state of M, β is in [In-symbols(M)]*, and γ is in [Out-symbols(M)]*. The interpretation of p and β is the same as for a finite-state acceptor: M is in state p and β is that portion of the input that still remains to be scanned. That portion of the output string produced so far is represented by γ.

4.1.3 The next step relation ⊢_M is a binary relation on id's and is defined as follows. For any a in [In-symbols(M)] ∪ {Λ}, any states p, q, any string β of input symbols, any string γ of output symbols, and any b in [Out-symbols(M)] ∪ {Λ}, the relation (p, aβ, γ) ⊢_M (q, β, γb) holds provided (q, b) is in δ(p, a).
The relation ⊢*_M holds between two id's if M can go from the first to the second id in some finite (possibly zero) number of steps. Computations of transducers are defined by exact analogy to computations of one-way finite-state acceptors. A computation of M on the input α is a finite or infinite sequence of id's such that (s, α, Λ) is the first id, the relation ⊢_M holds between consecutive id's, and, if the sequence is finite, then the last id is
a halting id. An id is called a halting id if there is no next id under the relation ⊢_M. Unlike one-way finite-state acceptors, transducers can move without consuming an input symbol. So, unlike one-way finite-state acceptors, transducers can have infinitely long computations.
4.1.4 M is said to be deterministic provided that the following two conditions hold for every state p:
1. δ(p, a) contains at most one element for each a in [In-symbols(M)] ∪ {Λ}.
2. If δ(p, Λ) is nonempty, then δ(p, a) is empty for each input symbol a.
If we did not insist on (2), then there would be situations in which two instructions apply, one for Λ and one for the input symbol scanned. If M is deterministic and δ(p, a) = {(q, b)}, we will omit the braces and write δ(p, a) = (q, b). Notice that a deterministic finite-state transducer is a special type of nondeterministic finite-state transducer. The situation and the usage of the modifiers deterministic and nondeterministic are exactly analogous to those of finite-state acceptors.
4.1.5 Let α be a string of input symbols and γ a string of output symbols. We say that γ is a valid output for the input string α provided that (s, α, Λ) ⊢* (p, Λ, γ), for some accepting state p. (Recall that s = Start(M).) In this case, the computation (s, α, Λ) ⊢* (p, Λ, γ) is called an acceptable computation.

4.1.6 Let f be a partial function from [In-symbols(M)]* into [Out-symbols(M)]*. M is said to compute the partial function f provided that conditions (1), (2), and (3) hold for all input strings α.
1. f(α) is defined if and only if M has some valid output for α.
2. For any input α, M has at most one valid output.
3. If f(α) is defined, then f(α) is the valid output for the input α.
A partial function f which can be computed by some finite-state transducer is called a finite-state transduction. If it can be computed by a deterministic finite-state transducer, then it is called a deterministic finite-state transduction. One can argue that moves under Λ are not natural and that a move of the transducer should always depend on the symbol scanned. We have allowed moves under Λ because it is mathematically convenient and because it is consistent with the other literature on the subject. However, this is just a convention of convenience; our definition is equivalent to one in which the move depends on the symbol scanned and allows the possibility of not advancing the input.
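Definitions 4.1.2 through 4.1.5 can be animated by a breadth-first search over id's. A Python sketch follows (the names and the example machine are ours, not the book's; Λ is represented by the empty string, and we assume the machine has no cycle of Λ-moves, so the search terminates):

```python
from collections import deque

# Sketch: enumerate the valid outputs of a nondeterministic finite-state
# transducer (Definition 4.1.5) by searching the graph of id's.  delta maps
# (state, input symbol or "") to a list of (state, output symbol or "").

def valid_outputs(delta, start, accepting, word):
    results, seen = set(), set()
    queue = deque([(start, word, "")])          # the initial id (s, word, Lambda)
    while queue:
        id_ = queue.popleft()
        if id_ in seen:
            continue
        seen.add(id_)
        p, beta, gamma = id_
        if p in accepting and beta == "":
            results.add(gamma)                  # an acceptable computation
        for q, b in delta.get((p, ""), ()):     # moves under input Lambda
            queue.append((q, beta, gamma + b))
        if beta:
            for q, b in delta.get((p, beta[0]), ()):
                queue.append((q, beta[1:], gamma + b))
    return results
```

For a hypothetical one-state transducer that erases a's and copies b's, the only valid output for abab is bb; replacing the single instruction on a by two alternative instructions yields several valid outputs for the same input, illustrating nondeterminism.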
4.2 Examples

4.2.1 M, defined as follows, is a deterministic finite-state transducer that performs binary addition. States(M) = {<carry>, <no carry>}. The input symbols of M consist of the symbol ¢, together with all pairs of 0's and 1's. That is, In-symbols(M) = {¢} ∪ [{0, 1} × {0, 1}]. The output symbols of M are 0 and 1. The start state is <no carry>. The state <no carry> is also the only accepting state. The transition function δ is defined by Table 4.1. If δ(q, x) = {(p, y)}, then q, x, p and y are in the first, second, third, and fourth columns respectively. For any input of the form

(a_1, b_1)(a_2, b_2) ... (a_n, b_n)¢

M has a valid output consisting of the binary numeral obtained by adding the two binary numerals a_n a_{n-1} ... a_1 and b_n b_{n-1} ... b_1. The output and input are written backwards. That is, the low-order bit is on the left end.

Table 4.1

q (State)     x (Input)   p (Next state)   y (Output)
<no carry>    <0, 0>      <no carry>       0
<no carry>    <1, 0>      <no carry>       1
<no carry>    <0, 1>      <no carry>       1
<no carry>    <1, 1>      <carry>          0
<no carry>    ¢           <no carry>       Λ
<carry>       <0, 0>      <no carry>       1
<carry>       <1, 0>      <carry>          0
<carry>       <0, 1>      <carry>          0
<carry>       <1, 1>      <carry>          1
<carry>       ¢           <no carry>       1
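Table 4.1, read as a deterministic transducer, can be checked mechanically. A Python sketch (the function and variable names are ours; the empty output Λ is written as the empty string):

```python
# Sketch: run the binary-addition transducer of Table 4.1.  Numerals are
# supplied most-significant-bit first and are fed to M low-order bit first,
# terminated by the cent marker, exactly as in Example 4.2.1.

NC, C = "<no carry>", "<carry>"
DELTA = {
    (NC, ("0", "0")): (NC, "0"), (NC, ("1", "0")): (NC, "1"),
    (NC, ("0", "1")): (NC, "1"), (NC, ("1", "1")): (C, "0"),
    (NC, "¢"): (NC, ""),
    (C, ("0", "0")): (NC, "1"), (C, ("1", "0")): (C, "0"),
    (C, ("0", "1")): (C, "0"), (C, ("1", "1")): (C, "1"),
    (C, "¢"): (NC, "1"),
}

def add_binary(x, y):
    n = max(len(x), len(y))
    x, y = x.zfill(n), y.zfill(n)
    # Reverse both numerals and pair them up, low-order bits first.
    tape = [(a, b) for a, b in zip(reversed(x), reversed(y))] + ["¢"]
    state, out = NC, []
    for sym in tape:
        state, bit = DELTA[(state, sym)]
        out.append(bit)
    assert state == NC                  # halted in the accepting state
    return "".join(reversed(out)).lstrip("0") or "0"
```

For example, adding 101 and 11 (five and three) yields 1000.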
4.2.2 Next we give an example of a nondeterministic finite-state transducer that computes the partial function f defined by the equations: f(a^i b^j) = b^i a^j, f(a^i c^j) = c^i a^j for all positive integers i and j. The function f is undefined for all other arguments. Let M be defined by States(M) = {0, <1b>, <1c>, <2b>, <2c>}, In-symbols(M) = Out-symbols(M) = {a, b, c}, the start state is 0, the accepting states are <2b> and <2c>, and the
Table 4.2

State   Input   Next state   Output
0       a       <1b>         b
0       a       <1c>         c
<1b>    a       <1b>         b
<1c>    a       <1c>         c
<1b>    b       <2b>         a
<1c>    c       <2c>         a
<2b>    b       <2b>         a
<2c>    c       <2c>         a

transition function δ is defined in Table 4.2. The value of the transition function, δ(p, x), is empty for all pairs p and x which do not occur in Table 4.2. The transducer is nondeterministic because δ(0, a) contains two instructions, (<1b>, b) and (<1c>, c). Below are two computations of M on the input a^2 b. The first has a valid output; the other does not.
(0, a^2 b, Λ) ⊢ (<1b>, ab, b) ⊢ (<1b>, b, b^2) ⊢ (<2b>, Λ, b^2 a)
(0, a^2 b, Λ) ⊢ (<1c>, ab, c) ⊢ (<1c>, b, c^2)
The second computation ends without processing the entire input string.

*THE PUMPING LEMMA FOR FINITE-STATE TRANSDUCERS
The remainder of this chapter is devoted to showing that there are many common functions which cannot be computed by finite-state transducers. The techniques used are the same as those we used to show that certain languages are not finite-state languages. The fundamental result is again a pumping lemma.
4.3 Theorem (The Pumping Lemma for Transducers) Suppose M is a nondeterministic finite-state transducer with k states, ξ is a string over the output alphabet of M, length(ξ) ≥ k, and both p and q are states of M. If (p, Λ, Λ) ⊢* (q, Λ, ξ), then ξ can be written in the form ξ = αβγ, where β is nonempty and (p, Λ, Λ) ⊢* (q, Λ, αβ^i γ), for all integers i ≥ 0.
PROOF The idea is the same as that for 3.10, the pumping lemma for finite-state acceptors. If length(ξ) ≥ k, then M must make at least k moves in which it outputs a symbol and so M must pass through at least k + 1 states. But M has only k states. So there must be some state such that, on at least two occasions in the computation, M enters this state. Thus ξ can be written ξ = αβγ where β is nonempty and

(p, Λ, Λ) ⊢* (p_2, Λ, α) ⊢* (p_2, Λ, αβ) ⊢* (q, Λ, αβγ) = (q, Λ, ξ)

for some state p_2. Since M can go from state p_2 back to state p_2 and output β in the process, it follows that, for any i ≥ 0, M can go from state p_2 back to state p_2 i times and output β^i in the process. So

(p, Λ, Λ) ⊢* (p_2, Λ, α) ⊢* (p_2, Λ, αβ^i) ⊢* (q, Λ, αβ^i γ) □
The pumping lemma for transducers actually applies in a slightly more general setting than that given in the statement of the theorem. Let M, p, q, and ξ be as in the statement of the theorem. Let ν and μ be any strings over the input alphabet and output alphabet of M respectively. Now suppose (p, ν, μ) ⊢* (q, ν, μξ). Since the computation does not depend on ν and μ, we can conclude that (p, Λ, Λ) ⊢* (q, Λ, ξ) and so, by Theorem 4.3, we can conclude that ξ = αβγ, where β is nonempty and (p, Λ, Λ) ⊢* (q, Λ, αβ^i γ) for all i ≥ 0. But, if (p, Λ, Λ) ⊢* (q, Λ, αβ^i γ), then (p, ν, μ) ⊢* (q, ν, μαβ^i γ), since the computation does not depend on ν and μ. Hence, if we know that (p, ν, μ) ⊢* (q, ν, μξ), then we can conclude that (p, ν, μ) ⊢* (q, ν, μαβ^i γ) for all i. In fact, the only reason for setting μ = ν = Λ in the theorem and proof was to simplify notation. The proof given would work fine if we just carried ν and μ along throughout. We will frequently use the pumping lemma for transducers in just this sort of situation, where ν or μ are not empty.
4.4 Theorem There are functions f such that f is computed by a nondeterministic finite-state transducer but f is not computed by any deterministic finite-state transducer.

PROOF The transducer given in Example 4.2.2 is a nondeterministic finite-state transducer which computes the partial function f defined by the equations:

f(a^i b^j) = b^i a^j,  f(a^i c^j) = c^i a^j,  for all positive i and j.
The function f is undefined for other arguments. We will use 4.3, the pumping lemma, to show that f is not computed by any deterministic finite-state transducer. Assume M is a deterministic finite-state transducer and M computes f. We will derive a contradiction. Consider the input a^{k+1} b, where k is the number of states in M. We first establish a preliminary result. Claim: M will not output any symbols until it reads the b in the input a^{k+1} b.
The first symbol outputted on this input is b. M is deterministic. Hence, if M outputs a symbol before reading the b, then the first symbol outputted on any input which begins with a^{k+1} is a b. But this is impossible, since it implies that the first symbol of the valid output string for the input string a^{k+1}c is b; the first symbol of the valid output for a^{k+1}c is a c, since M computes f(a^{k+1}c) = c^{k+1}a. So the claim is established.

M can output at most one symbol when it reads the last b in a^{k+1}b. Thus, by the Claim, M must output b^k a after it reads the entire input a^{k+1}b. (It may, in fact, output all of b^{k+1}a after reading the entire input. The point is that at least the last k + 1 symbols of output are outputted after all the input is read.) So

(s, a^{k+1}b, Λ) ⊢* (p, Λ, b) ⊢* (q, Λ, bb^k a)

is a valid computation, where s is the start state of M, q is some accepting state of M, and p is some intermediate state in the computation. Setting b^k a = ξ in the pumping lemma for transducers, we conclude that b^k a = αβγ, where β is nonempty and (s, a^{k+1}b, Λ) ⊢* (q, Λ, bαβ^i γ) is valid for all i ≥ 0. But q is an accepting state. So M has infinitely many valid outputs for the input a^{k+1}b. This contradicts the fact that M computes f, and thus has only one valid output for a^{k+1}b. We have derived a contradiction. Hence, the assumption that some deterministic transducer computes f is false, and the proof of the theorem is completed. □

Before leaving Theorem 4.4, we should note that we could find a total function which is computed by a nondeterministic finite-state transducer but is not computed by any deterministic finite-state transducer. If we extend f such that f(α) = Λ for all α for which f previously was undefined, then the above proof still shows that f is not computed by any deterministic finite-state transducer. It is an easy exercise to modify Example 4.2.2 to get a nondeterministic transducer that computes this total function.
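The role of the nondeterministic guess is easy to see in a simulation. The following is a sketch only (the text's Example 4.2.2 is not reproduced in this excerpt, so the state names and transitions here are assumptions): on the first a the machine guesses whether b's or c's will follow, and the wrong guess simply dies for lack of a transition.

```python
# A nondeterministic transducer computing f(a^i b^j) = b^i a^j and
# f(a^i c^j) = c^i a^j, for positive i and j. State names are invented.
delta = {
    ("s", "a"): {("pb", "b"), ("pc", "c")},   # the nondeterministic guess
    ("pb", "a"): {("pb", "b")}, ("pb", "b"): {("qb", "a")},
    ("qb", "b"): {("qb", "a")},
    ("pc", "a"): {("pc", "c")}, ("pc", "c"): {("qc", "a")},
    ("qc", "c"): {("qc", "a")},
}
accepting = {"qb", "qc"}

def transduce(inp):
    """Return the set of valid outputs for inp (empty set = f undefined)."""
    configs = {("s", "")}
    for ch in inp:
        configs = {(q2, out + o)
                   for (q, out) in configs
                   for (q2, o) in delta.get((q, ch), set())}
    return {out for (q, out) in configs if q in accepting}

print(transduce("aaab"))   # {'bbba'}
print(transduce("aacc"))   # {'ccaa'}
print(transduce("abc"))    # set() -- f is undefined here
```

Although the machine is nondeterministic, at most one computation survives to an accepting state, so each defined input has exactly one valid output, as the definition of computing a function requires.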
*THE MULTIPLICATION FUNCTION

The next result shows that one of the most fundamental arithmetic functions cannot be computed by a finite-state transducer.

4.5 Theorem No finite-state transducer can do multiplication. More precisely, there does not exist any deterministic or nondeterministic finite-state transducer which computes a function f with the following properties.

1. f is a function from ({0, 1} × {0, 1})* into {0, 1}*.
2. f(<a₁, b₁><a₂, b₂>...<aₙ, bₙ>) = c₁c₂...cₘ, where cₘcₘ₋₁...c₁ is the binary numeral (without leading zeros) obtained by multiplying the two binary numerals aₙaₙ₋₁...a₁ and bₙbₙ₋₁...b₁.

A number of points should be emphasized about this result. In order to prove that finite-state transducers cannot do multiplication, we must have a precise way to state the problem. The result, though, does not depend on our choice of how to formalize the problem. It would remain true for any
reasonable way of formalizing the phrase "no finite-state transducer can multiply numbers written in the ordinary way." For example, it does not help to give the transducer end markers. Also, if the transducer reads the input high-order bit first instead of low-order bit first, it still could not multiply the numerals. We have also insisted that the two numerals to be multiplied are of the same length. This is just a technical point. By using leading zeros, we can get any two numbers as input. Finally, the choice of base two is purely a notational convenience. The theorem is true for any base.

In order to prove Theorem 4.5, we will need a lemma.

4.6 Lemma Suppose M is a finite-state transducer with k states and suppose that M computes some partial function f. Then, in every acceptable computation, M never outputs more than k symbols without reading an input symbol. That is, if (p₀, ρη, Λ) ⊢* (p₁, η, ν) ⊢* (p₂, η, νξ) ⊢* (p₃, Λ, νξμ) is an acceptable computation, then length(ξ) < k.
PROOF The proof is by contradiction. We will assume that the lemma is false for some finite-state transducer M which computes some partial function f and will go on to derive a contradiction. With this in mind, suppose there is an acceptable computation as described in the lemma, except that length(ξ) ≥ k. Then, applying the Pumping Lemma 4.3 to the partial computation

(p₁, η, ν) ⊢* (p₂, η, νξ)

it follows that ξ = αβγ, where β is nonempty and

(p₁, η, ν) ⊢* (p₂, η, ναβ^i γ)

for all i. So, for all i,

(p₀, ρη, Λ) ⊢* (p₃, Λ, ναβ^i γμ)

is an acceptable computation. So M has infinitely many valid outputs for the input ρη. This is a contradiction, since M computes f and so has only one valid output, namely f(ρη), for the input ρη. □
Proof of Theorem 4.5
Suppose there is a finite-state transducer M that computes the function f described in the statement of Theorem 4.5. We will derive a contradiction and therefore show that there can be no such M. We will look at what M does on inputs of the form:
3. <0, 1><a₁, 0><a₂, 0>...<aₙ, 0><1, 1>

That is, we will see what happens when M tries to multiply the two binary numerals 1aₙaₙ₋₁...a₁0 and 10ⁿ1, for suitably chosen binary digits aᵢ. Since the product of these two numerals is 1aₙaₙ₋₁...a₁1aₙaₙ₋₁...a₁0, the only valid output should be 0a₁a₂...aₙ1a₁a₂...aₙ1.
Let k be the number of states of M. Choose n > k and consider a computation that produces such a valid output. We focus our attention on what happens after the substring 0a₁a₂...aₙ1 has been outputted. Using Lemma 4.6 we can make some observations about how much of the input has been read at this point in the computation. Since n > k, M must have read at least the first symbol <0, 1>. Since there are still n + 1 symbols left to be outputted and n > k, M must not yet have read the last symbol <1, 1>. Hence we can decompose the computation as follows:
4. (s, <0, 1><a₁, 0>...<aₙ, 0><1, 1>, Λ) ⊢* (q, <a_t, 0>...<aₙ, 0><1, 1>, 0a₁a₂...aₙ1) ⊢* (p, Λ, 0a₁a₂...aₙ1a₁a₂...aₙ1)

where s is the start state, p is an accepting state, q is some state, and t ≤ n + 1. There are 2ⁿ such strings a₁a₂...aₙ of binary digits and only k < n < 2ⁿ states. So, two distinct such input strings must leave M in the same state q at this point in the computation. That is, there are two distinct strings a₁a₂...aₙ and a'₁a'₂...a'ₙ such that (4) above and (5) below hold.
5. (s, <0, 1><a'₁, 0>...<a'ₙ, 0><1, 1>, Λ) ⊢* (q, <a'_{t'}, 0>...<a'ₙ, 0><1, 1>, 0a'₁a'₂...a'ₙ1) ⊢* (p', Λ, 0a'₁a'₂...a'ₙ1a'₁a'₂...a'ₙ1)

where s is again the start state, p' is an accepting state, q is the same state as in (4), and t' ≤ n + 1. Combining (4) and (5), we see that the following computation produces a valid output.
6. (s, <0, 1><a₁, 0>...<a_{t-1}, 0><a'_{t'}, 0>...<a'ₙ, 0><1, 1>, Λ) ⊢* (q, <a'_{t'}, 0>...<a'ₙ, 0><1, 1>, 0a₁a₂...aₙ1) ⊢* (p', Λ, 0a₁a₂...aₙ1a'₁a'₂...a'ₙ1)

From (6) it follows that:

7. The product of the two binary numerals 1a'ₙa'ₙ₋₁...a'_{t'}a_{t-1}a_{t-2}...a₁0 and 10^m 1 is

1a'ₙa'ₙ₋₁...a'₁1aₙaₙ₋₁...a₁0

The value of m is the number of digits written as aᵢ or a'ⱼ, for some i or j. (It will turn out that m = n, but we are not yet justified in concluding this, since we have not yet determined that t = t'.) If we perform the multiplication mentioned in (7) by the usual multiplication algorithm, we get another expression for the product. Equating these two expressions for the product, we conclude that

8. 1a'ₙa'ₙ₋₁...a'₁1aₙaₙ₋₁...a₁0 = 1a'ₙa'ₙ₋₁...a'_{t'}a_{t-1}a_{t-2}...a₁1a'ₙa'ₙ₋₁...a'_{t'}a_{t-1}a_{t-2}...a₁0

From (8) it follows that
aₙaₙ₋₁...a₁ = a'ₙa'ₙ₋₁...a'_{t'}a_{t-1}a_{t-2}...a₁ = a'ₙa'ₙ₋₁...a'₁

So a'₁a'₂...a'ₙ = a₁a₂...aₙ. But this is a contradiction, since a₁a₂...aₙ and a'₁a'₂...a'ₙ were chosen to be distinct. Since we derived a contradiction, no such M can exist. □
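The shift-and-add identity that drives this proof can be checked numerically. The sketch below (not from the text) verifies that multiplying 1aₙ...a₁0 by 10ⁿ1 just shifts the first numeral by n + 1 places and re-adds it, so the product's digit string repeats the middle digits aₙ...a₁.

```python
from itertools import product

# Check: (1 a_n ... a_1 0)_2 x (1 0^n 1)_2 = (1 a_n ... a_1 1 a_n ... a_1 0)_2
# for every digit string a_n ... a_1 of length up to 5.
for n in range(1, 6):
    for digits in product("01", repeat=n):
        s = "".join(digits)                 # a_n ... a_1, high order first
        x = int("1" + s + "0", 2)           # 1 a_n ... a_1 0
        y = int("1" + "0" * n + "1", 2)     # 1 0^n 1
        expected = int("1" + s + "1" + s + "0", 2)
        assert x * y == expected
print("identity verified for all digit strings up to length 5")
```

The carry-free addition works because the low-order half of the shifted numeral is entirely zero, which is exactly why the "usual multiplication algorithm" step in (8) produces the repeated digit pattern.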
EXERCISES 1. Define a deterministic finite-state transducer to do binary subtraction.
2. Define a deterministic finite-state transducer that converts base-two numerals into base-eight numerals.

3. Prove that the following functions are not finite-state transductions: f(α) = αα, for all α ∈ {a, b}*; g(α) = αα^R, for all α ∈ {a, b}*. Recall that α^R denotes α written backwards.

4. Prove that no finite-state transducer can convert base-two numerals into base-ten numerals.

5. Prove: If f is any partial function and n is any integer, then we can find a deterministic finite-state transducer "with end markers" which computes a function h such that h and f agree for all input strings of length at most n; that is, for all α of length at most n: a. h(α¢) is defined if and only if f(α) is defined, and b. if f(α) is defined, then f(α) = h(α¢), where ¢ is a new symbol.

6. Prove: If f is a finite-state transduction, then there is an integer m such that the following holds for all arguments α for which α ≠ Λ and for which f(α) is defined: length(f(α)) ≤ m · length(α).

7. Formalize and prove the following statement: The exponentiation functions cannot be computed by any finite-state transducers.

8. Prove that the logarithm functions are not finite-state transductions.

9. Formalize and prove the following statement: No finite-state transducer can do division.

10. Let M be a deterministic two-way finite-state acceptor. The states of M may be any well-defined mathematical entities. Suppose it turns out that the states of M just happen to be symbols over some alphabet which is disjoint from the input alphabet of M. Let f be the function on "coded id's" of M which is defined as follows: f(αpβ) = α'p'β', provided p, p' are states of M, provided α, β, α' and β' are strings over the input alphabet of M, and provided (p, α▷β) ⊢_M (p', α'▷β'). So f is really just ⊢_M with some notational changes which allow id's to be represented as strings of symbols rather than ordered pairs. Design a deterministic finite-state transducer which computes f.
11. Prove: L is a finite-state language if and only if L = f⁻¹(yes) for some finite-state transduction f. By definition, f⁻¹(yes) = {α | f(α) = yes}. By yes we mean the string with the three letters y-e-s.

12. Prove: If f is a finite-state transduction and L is a finite-state language, then f⁻¹(L) is a finite-state language. By definition, f⁻¹(L) = {α | f(α) is in L}.

13. Prove: If L is a finite-state language and f is a finite-state transduction, then f(L) is a finite-state language. By definition, f(L) = {γ | γ = f(α) for some α in L}. (Hint: There is a lot of detail in this one.)

14. Prove: The composition of finite-state transductions yields another finite-state transduction. In other words: Let Σ, Γ and Δ be three alphabets. Let g, respectively f, be a finite-state transduction from Σ* to Γ*, respectively from Γ* to Δ*. Define the partial function h from Σ* to Δ* by h(α) = f(g(α)) provided both g(α) and f(g(α)) are defined; h(α) is undefined otherwise. It then follows that h is a finite-state transduction. (Hint: This is not too hard but there can be a lot of detail to worry about.)

15. Define a finite-state transducer of type 2 to be the same as a finite-state transducer (4.1.1), except that δ is a partial function from S × Σ into finite subsets of S × (Γ ∪ {Λ}) × {→, ↓}. Interpret (q, b, →) ∈ δ(p, a) to mean that, in state p scanning input symbol a, the transducer can go to state q, output b and advance to the next symbol. Interpret (q, b, ↓) ∈ δ(p, a) in the same way except that the transducer does not advance to the next symbol. Define what it means for a finite-state transducer of type 2 to compute a function in the same way as for ordinary finite-state transducers. Prove that for any partial function f: f is computed by a finite-state transducer of type 2 if and only if it is computed by an ordinary finite-state transducer.

16. Let Σ and Γ be two finite alphabets. A homomorphism from Σ* to Γ* is a mapping h such that h(a₁a₂...aₙ) = h(a₁)h(a₂)...h(aₙ) whenever a₁, a₂, ..., aₙ are letters in Σ. A homomorphism is completely determined by its value on individual letters in Σ. Assume that h is a homomorphism from Σ* to Γ* and prove that a. h is a finite-state transduction and b. if L ⊆ Σ* is a cfl, then h(L) = {h(w) | w ∈ L} is also a cfl.
Chapter Five
Turing Machines and Computable Functions
In this chapter we introduce a formal mathematical construct that captures the intuitive notion of an effective procedure. The model is called a Turing machine and is named after its originator, Alan M. Turing. The model is very simple. This is very convenient for developing a mathematical theory. However, its simplicity may lead you to question whether such a simple construct can really accomplish all the tasks that can, in an intuitive sense, be done by an effective procedure. To help demonstrate that Turing machines can model any effective procedure, we will develop a programming language with strong expressive power and then prove that any program written in this programming language can be translated into a Turing machine that accomplishes the same task. If you believe that any effective procedure can be expressed in this programming language, then it will follow that any effective procedure can be expressed with the Turing machine model. The intuitive notion of an effective procedure is somewhat vague, and certainly not a formal mathematical construct. So we will not be able to prove mathematically that any effective procedure can be expressed in this programming language. However, this programming language is powerful enough and enough like some common general-purpose programming languages that it should be intuitively clear that any effective procedure can be written in this programming language. We will actually define three programming languages, each one successively more sophisticated. The last of the three will look very much like the programming language PASCAL, and it will have the expressive power we promised. Before giving any formal definitions, we will give an informal description of the Turing machine model. THE TURING MACHINE MODEL
A Turing machine is similar to a two-way finite-state acceptor. It differs from a two-way finite-state acceptor in that it is allowed to write as well as read on
the tape containing the input. In order to allow the Turing machine to write an arbitrarily long string, the tape containing the input extends infinitely far in both directions. At the start of a computation, the tape contains the input string and all of the tape except for the portion containing the input is blank. This should not sound like a totally unreasonable definition for a machine that is designed to capture the notion of an effective procedure. When discussing finite-state acceptors, we said that such acceptors could not model effective procedures because they did not have access to an unbounded storage facility for storing intermediate results. To get the Turing machine model from the finite-state acceptor model, we have merely added such a storage facility in the simplest way possible. In the next few paragraphs we make this definition a little bit more precise. We then go on to give a formal mathematical definition of the Turing machine model.

A Turing machine is a finite-state machine attached to a storage tape that is infinitely long in both directions. The finite-state machine may be any device so long as it is finite, discrete, and communicates with the tape in the appropriate manner. The finite-state machine is usually referred to as the finite-state control. The tape is divided into squares. Each square of the tape is capable of holding any one symbol from a fixed specified finite tape alphabet. One element of the tape alphabet is distinguished to serve as the blank symbol. The finite-state control communicates with the tape by means of a device called the tape head. The tape head may be positioned at any one square of the tape. Using the tape head, the finite-state control can read the single symbol at the tape head location and may change this symbol to any symbol in the tape alphabet. Figure 5.1 is a diagrammatic representation of a Turing machine. The Turing machine, like the previously discussed machine models, operates in discrete time.
In one time unit, the finite-state control can move the tape head a maximum of one square left or right. As with two-way finite-state acceptors, we think of the tape as being stationary and the tape head as moving. Despite our intuitive view of a stationary tape and a moving head, it is still true that the tape mechanism is similar to an ordinary magnetic tape drive as used on many common computers. The only difference is that a Turing machine must move its head only one square per unit time. There are no fast rewind facilities.
Figure 5.1 A Turing machine: a finite-state control connected by a tape head to a two-way infinite tape.
At any point in time, the finite-state control will be in one state and the tape head will be scanning one symbol on the tape. On the basis of this state and this symbol, the machine will change the symbol scanned to some, possibly different, symbol, will shift the tape head a maximum of one square left or right, and will change the state of the finite-state control to some, possibly different, state. All this constitutes a single move of the Turing machine. For now we will assume that there is always at most one move possible. So, for the present, we are only considering deterministic machines. One state is designated as the start state and a subset of all the states is designated as the set of accepting states. We will always insist that the machine halt when the finite-state control enters an accepting state. A computation of a Turing machine proceeds in much the same way as that of a two-way finite-state acceptor, except that the tape may be written on and that there is a way to get arbitrary strings as output. The input is a string of nonblank symbols from the tape alphabet. Initially, the input string is written on the tape, preceded and followed by an infinite string of blanks, the tape head is positioned at the first symbol of the input string, and the finite-state control is placed in the start state. The computation proceeds in discrete steps as described above. There will be an output only if the machine halts in a particular configuration. If the machine halts with the tape head scanning the first symbol of a blank-free string β, with the rest of the tape blank, and with the finite-state control in an accepting state, then β is the output. In any other situation, there is no output. In particular, there is no output if the machine halts with the finite-state control in some nonaccepting state or if the machine halts with the tape head positioned any place other than at the first symbol of a unique blank-free string on the tape.
There will be a convention to allow the empty string as output, but that need not concern us in this informal description. The convention on where the tape head must be is purely for mathematical convenience. Any other reasonable convention would give similar results. We now give a formal mathematical definition of these concepts.
5.1 Definitions

5.1.1 A (simple) Turing machine is a six-tuple M = (S, Σ, δ, s, B, Y) where S is a finite set of states, Σ is a finite set of symbols referred to as the tape alphabet, s is an element of S called the start state, B is an element of Σ called the blank symbol, and Y is a subset of S called the accepting states. The third element, δ, may be any partial function from S × Σ into S × Σ × {→, ←, ↓} provided that δ(q, a) is undefined whenever q is in Y. The function δ is called the transition function. If δ(p₁, a₁) = (p₂, a₂, →), then this is to be interpreted to mean the following. If the finite control of M is in state p₁ and the tape head is scanning symbol a₁, then M will do all of the following in one move: replace a₁ by a₂, change the state of its finite control to p₂, and shift its tape head one square to the right. If we replace → by ← or ↓ respectively, then
the tape head instructions would be changed to shift left or to remain stationary, respectively. We will employ the following notation: States(M) denotes S, Symbols(M) denotes Σ, Instructions(M) denotes δ, Start(M) denotes s, Blank(M) denotes B, and Accepting-states(M) denotes Y.
5.1.2 An id of M is a pair (p, α▷β), where p is in S, αβ is a string in Σ*, and ▷ is a symbol which is not in Σ. The intuitive meaning of (p, α▷β) is pretty much the same as that for a two-way finite-state acceptor: The storage tape contains αβ, preceded and followed by an infinite string of blanks. The finite-state control is in state p and the tape head is scanning the first symbol of the string β. The id (p, α▷β) denotes, among other things, a tape containing αβ preceded and followed by an infinite string of blanks. Hence, if α▷β is replaced by α▷βB or Bα▷β or, indeed, α▷β concatenated with any number of preceding or trailing blanks, then the meaning of the id is unchanged. We will therefore identify any two id's which differ only by leading or trailing blanks in the second coordinate. If (p, α▷β) is an id, then the tape configuration of this id is α▷β, provided that α does not begin with the blank symbol and β does not end with the blank symbol. If α begins with the blank symbol or β ends with the blank symbol, then the tape configuration of (p, α▷β) is ν▷μ, where ν▷μ is α▷β with all leading and trailing blanks removed. Note that by our identification convention (p, α▷β) = (p, ν▷μ), where ν▷μ is the tape configuration of (p, α▷β).
5.1.3 We will define a partial function Next() from id's of M to id's of M which has the intuitive interpretation that if Next((p₁, α₁▷β₁)) = (p₂, α₂▷β₂), then M will go from id (p₁, α₁▷β₁) to id (p₂, α₂▷β₂) in one step. The function is defined by clauses (1), (2), and (3). The p and q range over States(M); a, b and c range over Symbols(M); α and β range over [Symbols(M)]*.

1. If δ(p, a) = (q, b, ↓), then Next((p, α▷aβ)) = (q, α▷bβ) for all α and β.
2. If δ(p, a) = (q, b, →), then Next((p, α▷aβ)) = (q, αb▷β) for all α and β.
3. If δ(p, a) = (q, b, ←), then Next((p, αc▷aβ)) = (q, α▷cbβ) for all α, β and c.

Notice that by our convention B▷aβ = ▷aβ, where B = Blank(M). So, by taking c = B and α = Λ, clause (3) defines Next((p, ▷aβ)) whenever δ(p, a) = (q, b, ←). Also, since α▷B = α▷, Next((p, α▷)) is defined by some clause whenever δ(p, B) is defined. To be consistent with other notations and to make the notation more compact, we will write

(p₁, α₁▷β₁) ⊢_M (p₂, α₂▷β₂)
as an abbreviation for Next((p₁, α₁▷β₁)) = (p₂, α₂▷β₂). If M can change from id (p₁, α₁▷β₁) to id (p₂, α₂▷β₂) in some finite, possibly zero, number of steps then, as usual, we write

(p₁, α₁▷β₁) ⊢*_M (p₂, α₂▷β₂)

The formal definition of ⊢*_M is the same as it is for finite-state acceptors. When no confusion can result, we will omit the subscript M.

5.1.4 An id (q, ν▷aβ) is called a halting id if δ(q, a) is undefined. In other words, an id is a halting id if the transition function yields no instruction and so there is no next id.

5.1.5 Let ξ be a string in [Symbols(M) − {B}]*. The computation of M on input ξ is the finite or infinite sequence of id's (q₀, α₀▷β₀), (q₁, α₁▷β₁), (q₂, α₂▷β₂), ... such that the first id (q₀, α₀▷β₀) = (s, ▷ξ), where s = Start(M); (qᵢ₋₁, αᵢ₋₁▷βᵢ₋₁) ⊢_M (qᵢ, αᵢ▷βᵢ) for all i ≥ 1 in the sequence; and, if the sequence is finite, then the last id is a halting id. In the latter case, we say that M halts on input ξ.

5.1.6 Next we consider how a simple Turing machine can be used as a device to compute functions. To do this we need a notion of output. Let ξ be a string in (Σ−{B})*, where Σ = Symbols(M) and B = Blank(M). Let s = Start(M). The Turing machine M is said to have output ζ for input ξ provided (s, ▷ξ) ⊢*_M (q, ▷ζ) for some accepting state q. In other words, M has output ζ for the input ξ provided that the computation of M, with input ξ, eventually halts and, when it does halt,

1. The finite-state control is in an accepting state,
2. The only thing written on the tape is ζ, and
3. The tape head is scanning the first symbol of ζ, or is scanning a blank if ζ is the empty string.

Notice that for any input string ξ a simple Turing machine M has at most one valid output.
5.1.7 Let f be a partial function from (Σ−{B})* into (Σ−{B})*. The Turing machine M is said to compute the function f provided the following conditions hold for all strings ξ in (Σ−{B})*.

1. f(ξ) is defined if and only if M has some output for the input ξ.
2. If f(ξ) is defined, then M has output f(ξ) for the input ξ.
According to Definition 5.1.7, every simple Turing machine computes a unique partial function. Of course, it could turn out to be the trivial partial function which is undefined for all input strings. If f is a partial function and f is computed by some simple Turing machine, then f is said to be a computable function. Computable functions are sometimes also called partial recursive functions. If a function is both a total
function and a computable function, it is frequently referred to as a total recursive function. The word "recursive" was derived from one of the earliest definitions of computable functions. Since this definition made heavy use of mathematical recursion, the functions were called recursive. In the present context, the term "recursive" carries little intuitive information, but it is standard terminology. There are a number of more complicated variations of this basic Turing machine model. That is why this version is called simple. The only variation that we will deal with at any length is the nondeterministic Turing machine. Other texts discuss machines with many tapes, but the simple Turing machine, with its one tape, is quite adequate for our purposes. Formal definitions of deterministic and nondeterministic Turing machines will not be given until a later chapter. However, the intuitive idea of these notions should be clear, since we have already discussed them in the context of finite-state machines. Using these intuitive notions, notice that a simple Turing machine is deterministic. That is, in any configuration, there is at most one possible next configuration. Our discussion of Turing machines will be concerned primarily with these simple, and hence deterministic, Turing machines. When discussing Turing machines on any but the most formal level, we will, for this reason, use the unmodified term Turing machine to mean a simple, and hence deterministic, Turing machine. This convention is convenient but is in conflict with the convention we used for finite-state machines. To help alleviate any confusion, we will always use the term simple Turing machine in the statement of theorems. By definition, a simple Turing machine is deterministic. Therefore, there should be no question as to whether the machine in question is deterministic or nondeterministic.
5.2 Example of a Simple Turing Machine

M, defined as follows, is a simple Turing machine. M computes the function f which maps each binary numeral into the binary numeral obtained by adding one to it. States(M) contains four states: <start>, n, c, and <accept>. The start state is <start> and Accepting-states(M) = {<accept>}. Symbols(M) = {0, 1, B}, Blank(M) = B, and δ = Instructions(M) is defined by:

Find the right-hand end of the input:

δ(<start>, 0) = (<start>, 0, →)
δ(<start>, 1) = (<start>, 1, →)
δ(<start>, B) = (c, B, ←)

Add (n and c are no-carry and carry states respectively):

δ(n, 0) = (n, 0, ←)
δ(n, 1) = (n, 1, ←)
δ(n, B) = (<accept>, B, →)
δ(c, 0) = (n, 1, ←)
δ(c, 1) = (c, 0, ←)
δ(c, B) = (n, 1, ↓)

In this case, f(Λ) = 1, but it is easy to redefine the machine so that f(Λ) is undefined.
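A small simulator makes these definitions concrete. This is a sketch, not the text's notation: the tape is a dictionary from integer positions to symbols (absent entries are blank), the moves →, ←, ↓ are written "->", "<-", "!", and the state names start/accept stand in for <start> and the accepting state. The transition table transcribes the increment machine of Example 5.2 (state names and arrow directions as assumed here).

```python
def run(delta, start, accepting, blank, inp, max_steps=10_000):
    """Run a simple Turing machine and return its valid output, or None."""
    tape = {i: ch for i, ch in enumerate(inp)}
    state, head, steps = start, 0, 0
    while (state, tape.get(head, blank)) in delta and steps < max_steps:
        q, b, move = delta[(state, tape.get(head, blank))]
        tape[head] = b
        head += {"->": 1, "<-": -1, "!": 0}[move]
        state, steps = q, steps + 1
    # Simplified output convention: halted in an accepting state with the
    # head on the first nonblank cell (or anywhere, if the tape is blank).
    cells = sorted(pos for pos, sym in tape.items() if sym != blank)
    out = "".join(tape[pos] for pos in cells)
    if state in accepting and (not cells or head == cells[0]):
        return out
    return None

# The increment machine of Example 5.2:
delta = {
    ("start", "0"): ("start", "0", "->"), ("start", "1"): ("start", "1", "->"),
    ("start", "B"): ("c", "B", "<-"),
    ("n", "0"): ("n", "0", "<-"), ("n", "1"): ("n", "1", "<-"),
    ("n", "B"): ("accept", "B", "->"),
    ("c", "0"): ("n", "1", "<-"), ("c", "1"): ("c", "0", "<-"),
    ("c", "B"): ("n", "1", "!"),
}
print(run(delta, "start", {"accept"}, "B", "1011"))   # 1100
print(run(delta, "start", {"accept"}, "B", ""))       # 1
```

On input 1011 (eleven) the machine sweeps right, carries back through the two low-order 1's, absorbs the carry into the 0, and halts scanning the first symbol of 1100 (twelve), exactly the behavior clauses (1)-(3) of Definition 5.1.3 prescribe.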
TLΔ, A PROGRAMMING LANGUAGE FOR TURING MACHINES

Turing machines are not programmable computers. At an intuitive level you can think of the transition function as being the program, but if you change the transition function, you change the machine. So when we talk of a programming language for Turing machines, what we really mean is a programming language that is, in some sense, equivalent to Turing machines. Each program in the language will ultimately describe a different Turing machine. Our first programming language for Turing machines is called TLΔ. This language is very simple and, in fact, is little more than a notational variant of the Turing machine model. Later on we will use TLΔ to define more complicated programming languages. The definition is divided into two parts, the syntax and the semantics of the language. The syntax of the language describes those strings of symbols which are allowed as programs. The semantics tells how these strings of symbols are to be interpreted.
5.3 Syntax of TLΔ

Let Δ be an arbitrary finite alphabet. We will describe a language TLΔ. The elements of TLΔ will be programs describing simple Turing machines whose nonblank tape symbols are the symbols in Δ. TLΔ stands for "Turing language with alphabet Δ." Technically speaking, we are defining a whole family of languages, one for each alphabet Δ. However, they are all essentially the same and so there is really only one language construct being defined.
5.3.1 A Boolean expression of TLΔ is a string of the form SCAN = a, where a is either a single symbol in Δ or is the string BLANK.

5.3.2 A TLΔ label is any nonempty string of symbols that does not contain any of the fourteen symbols IF, THEN, BEGIN, END, SCAN, POINTER, GOTO, ACCEPT, ↓, →, ←, =, :, ;.

5.3.3 The pointer instructions of TLΔ are the three strings POINTER→, POINTER←, and POINTER↓. Pointer instructions, like POINTER→, are two symbols long: POINTER is one symbol and → is another symbol.

5.3.4 A usual TLΔ statement is a string of the form shown in Procedure 5.1, where <Boolean expression> is a Boolean expression, <label 1> and <label 2> are any TLΔ labels, <pointer move> is a pointer instruction, and a is either a symbol
in Δ or the string BLANK. <label 1> is called the label of this statement. Formally, this is a string of symbols all run together with no blanks. However, we will usually write it in some way such as is shown in Procedure 5.1. This is just to make it easier to read and is not part of the formal definition.
Procedure 5.1 A usual TLΔ statement

<label 1>: IF <Boolean expression> THEN
    BEGIN
    SCAN := a;
    <pointer move>;
    GOTO <label 2>
    END
The intuitive meaning of a usual TLΔ statement should be clear. The labels correspond to states, the symbol SCAN is essentially a variable containing the symbol currently under the tape head, and the pointer instruction tells how the tape head is moved. For example, consider the usual TLΔ statement shown in Procedure 5.2. It has the following intuitive meaning. If the finite-state control is in state L1 and a is the symbol currently scanned by the tape head, then a is replaced by b, the tape head is shifted right one square, and the finite-state control changes to state L2.
Procedure 5.2 Example of a usual TLΔ statement

L1: IF SCAN = a THEN
    BEGIN
    SCAN := b;
    POINTER→;
    GOTO L2
    END
5.3.5 An accepting TLΔ statement is a string of the form <label> : ACCEPT, where <label> is any TLΔ label. As you might guess, the string <label> is
called the label of this statement. 5.3.6 A statement of TLΔ is either a usual statement or an accepting statement. 5.3.7 A TLΔ program is a string of the form BEGIN σ1; σ2; ... ; σn END, where each σi is a TLΔ statement, either usual or accepting, and where no two σi have the same label.
5.4 Example of a TLΔ Program
Procedure 5.3 is a TLΔ program that reads in a string of 0's and 1's, replaces all 0's by 1's, and then repositions the tape head at the beginning of the string of 1's. Although we have not yet formally defined the semantics of TLΔ, we can tell what this program does by giving the statements their intuitive meaning. In this case, Δ = {0, 1}.
5.5 Semantics of TLΔ
5.5.1 We will define a way of obtaining a Turing machine M from a given TLΔ program P. M is called the Turing machine realization of P and will serve as the semantics of P. In an intuitive sense, the program P "compiles" or "interprets" or "translates" to the machine code M. The definition is as follows: States(M) = the set of all labels of the statements of P. Symbols(M) = Δ ∪ {B}, where B is any symbol which does not occur in Δ.
Blank(M) = B. Start(M) = the label of the first statement of P. Accepting-states(M) = the set of all labels of accepting statements. Instructions(M) = δ, where δ is defined below. Consider a state <label 1> and a symbol c. The state <label 1> labels some TLΔ statement in P. If this statement is an accepting statement, then δ(<label 1>, c) is undefined. If this statement is a usual TLΔ statement, then it is of the form shown in Procedure 5.4, where <arrow> is one of the three symbols ←, →, and ↑, and where <label 2> is a TLΔ label. The definition of δ(<label 1>, c), where <label 1> labels the usual TLΔ statement in Procedure 5.4 and where c is an arbitrary symbol of M, is given by cases:
Case 1: c = a and <label 2> labels some statement of P.
Case 1B: c = B, a = BLANK, and <label 2> labels some statement of P.
In both cases 1 and 1B, δ(<label 1>, c) = (<label 2>, b, <arrow>) provided b ≠ BLANK. If b = BLANK, then δ(<label 1>, c) = (<label 2>, B, <arrow>).
Procedure 5.3 Example of a TLΔ program
BEGIN
L0: IF SCAN = 0 THEN
BEGIN SCAN := 1; POINTER→; GOTO L0 END;
L1: IF SCAN = 1 THEN
BEGIN SCAN := 1; POINTER→; GOTO L0 END;
L2: IF SCAN = BLANK THEN
BEGIN SCAN := BLANK; POINTER←; GOTO REWIND END;
REWIND: IF SCAN = 1 THEN
BEGIN SCAN := 1; POINTER←; GOTO REWIND END;
L3: IF SCAN = BLANK THEN
BEGIN SCAN := BLANK; POINTER→; GOTO FINISH END;
FINISH: ACCEPT
END
Case 2: c = a but <label 2> does not label any statement of P.
Case 2B: c = B, a = BLANK, but <label 2> does not label any statement of P.
In both cases 2 and 2B, δ(<label 1>, c) is undefined.
Procedure 5.4 A Usual TLΔ Statement
<label 1> : IF SCAN = a THEN
BEGIN
SCAN := b;
POINTER <arrow>;
GOTO <label 2>
END
Case 3: Cases 1, 1B, 2, and 2B do not apply. That is, if a is a usual symbol, then c is not a, and if a = BLANK, then c is not B. In this case, set δ(<label 1>, c) = (<label 3>, c, ↑), where <label 3> is the label of the next statement in P. If there is no next statement, then δ(<label 1>, c) is undefined.
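Cases 1 through 3 can be collected into a short Python sketch of Definition 5.5.1. The tuple encodings of statements are assumptions of this sketch, not part of the formal definition:

```python
# Sketch of delta from 5.5.1. A usual statement is encoded as
# (label, a, b, arrow, goto); an accepting statement as (label, "ACCEPT").
# 'B' plays the blank and '^' the up-arrow (head stays put).
def realize(statements, blank="B"):
    labels = [s[0] for s in statements]
    def delta(label, c):
        i = labels.index(label)
        if statements[i][1] == "ACCEPT":                # accepting: undefined
            return None
        _, a, b, arrow, target = statements[i]
        if c == a or (c == blank and a == "BLANK"):     # cases 1 and 1B
            if target not in labels:                    # cases 2 and 2B
                return None
            return (target, blank if b == "BLANK" else b, arrow)
        if i + 1 < len(statements):                     # case 3: fall through
            return (labels[i + 1], c, "^")
        return None                                     # no next statement
    return delta

d = realize([("L1", "a", "b", ">", "L2"), ("L2", "ACCEPT")])
print(d("L1", "a"))  # ('L2', 'b', '>')
print(d("L1", "x"))  # ('L2', 'x', '^')
```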
5.5.2 If f is a partial function from Δ* to Δ*, then P is said to compute f provided the Turing machine realization of P computes f. For any string ξ in Δ*, P is said to halt on ξ provided the computation of the Turing machine realization of P halts on ξ. In general, any reference to a computation of P is to be interpreted to mean the appropriate computation of the Turing machine realization of P.
5.6 Example of a TLΔ Realization The machine M described below is the Turing machine realization of the program shown in Procedure 5.3.
States(M) = {L0, L1, L2, REWIND, L3, FINISH}
Symbols(M) = {0, 1, B}
Blank(M) = B
Start(M) = L0
Accepting-states(M) = {FINISH}
Instructions(M) = δ, where
δ(L0, 0) = (L0, 1, →)
δ(L0, 1) = (L1, 1, ↑)
δ(L0, B) = (L1, B, ↑)
*δ(L1, 0) = (L2, 0, ↑)
δ(L1, B) = (L2, B, ↑)
δ(L1, 1) = (L0, 1, →)
*δ(L2, 0) = (REWIND, 0, ↑)
*δ(L2, 1) = (REWIND, 1, ↑)
δ(L2, B) = (REWIND, B, ←)
*δ(REWIND, 0) = (L3, 0, ↑)
δ(REWIND, 1) = (REWIND, 1, ←)
δ(REWIND, B) = (L3, B, ↑)
*δ(L3, 0) = (FINISH, 0, ↑)
*δ(L3, 1) = (FINISH, 1, ↑)
δ(L3, B) = (FINISH, B, →)
Those values of δ which are marked by a star are never used in any computation on an input string of 0's and 1's. They are in some sense unintended useless instructions. They are an unfortunate side effect of the Turing machine realization process. However, although in some sense they make δ bigger than it need be, they cause no harm for they produce no bad computations.
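The realization can also be run directly as a check. A Python sketch of a simple Turing machine simulator, with the arrows encoded as `<`, `>`, `^` and the blank as `'B'` (the encoding is an assumption of the sketch):

```python
# The delta of 5.6, with arrows encoded as '<' (left), '>' (right),
# '^' (stay). The starred, never-used entries are included too.
delta = {
    ("L0", "0"): ("L0", "1", ">"), ("L0", "1"): ("L1", "1", "^"),
    ("L0", "B"): ("L1", "B", "^"), ("L1", "0"): ("L2", "0", "^"),
    ("L1", "B"): ("L2", "B", "^"), ("L1", "1"): ("L0", "1", ">"),
    ("L2", "0"): ("REWIND", "0", "^"), ("L2", "1"): ("REWIND", "1", "^"),
    ("L2", "B"): ("REWIND", "B", "<"), ("REWIND", "0"): ("L3", "0", "^"),
    ("REWIND", "1"): ("REWIND", "1", "<"), ("REWIND", "B"): ("L3", "B", "^"),
    ("L3", "0"): ("FINISH", "0", "^"), ("L3", "1"): ("FINISH", "1", "^"),
    ("L3", "B"): ("FINISH", "B", ">"),
}
MOVE = {"<": -1, ">": 1, "^": 0}

def run(delta, inp, start="L0", accepting=("FINISH",)):
    tape = {i: c for i, c in enumerate(inp)}   # positions -> symbols
    state, head = start, 0
    while (state, tape.get(head, "B")) in delta:
        state, sym, arrow = delta[(state, tape.get(head, "B"))]
        tape[head] = sym
        head += MOVE[arrow]
    out, i = "", head                          # read off the output string
    while tape.get(i, "B") != "B":
        out += tape[i]
        i += 1
    return state in accepting, out

print(run(delta, "0011"))  # (True, '1111')
```

On input 0011 the machine replaces the 0's by 1's, rewinds, and halts in FINISH with the head on the first 1, as the intuitive reading of Procedure 5.3 predicts.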
TLΔ ABBREVIATIONS In order to make TLΔ programs more readable, we will use a number of abbreviations which we will describe in this section. All of these new programming constructs are just abbreviations for certain TLΔ statements as already described and are not new types of statements. One abbreviation will be to always omit the string POINTER↑, since it says "do nothing." We will also omit GOTO <label>, where <label> is the label of the next instruction. We will also usually omit all labels L such that, in the abbreviated program, there are no statements of the form GOTO L. Another abbreviation will be to omit SCAN := a when such a statement does not change the value of SCAN. We will also omit the pair BEGIN-END when they enclose only one of the three normal parts. When doing this, we will add or delete semicolons so as to get maximum readability. For example, the piece of program shown in Procedure 5.5a can be abbreviated to the piece of program shown in Procedure 5.5b, provided the rest of the program does not contain any occurrences of GOTO L2. The meaning of these and other abbreviations should be clear from context. If this discussion is a bit unclear, just go on. As long as you can read the abbreviations you are all set. One important point to remember, though, is that these are abbreviations. A TLΔ program has no such statements. They are just a shorthand for use in communicating between people. To get a valid TLΔ program, you must replace the abbreviations by regular TLΔ statements as formally defined in 5.3. The next theorem is so obvious it almost needs no proof. But since the result is fundamental to much of what we will be doing, we will devote some space to it. We would like to say that every simple Turing machine is the realization of some TLΔ program. This is not quite literally true. However, the spirit of this situation is true and is expressed by the next theorem. In order to
Procedure 5.5a Piece of a TLΔ Program
L1: IF SCAN = a THEN
BEGIN
SCAN := a;
POINTER→;
GOTO L2
END;
L2: IF SCAN = BLANK THEN
BEGIN
SCAN := a;
POINTER↑;
GOTO L?
END
Procedure 5.5b Abbreviation of Procedure 5.5a
L1: IF SCAN = a THEN POINTER→;
IF SCAN = BLANK THEN
BEGIN SCAN := a; GOTO L? END
prove the theorem, we need one lemma. 5.7 Lemma Let M be a simple Turing machine, let Δ be the set of nonblank symbols of M, and let f be the partial function from Δ* to Δ* computed by M. We can find another simple Turing machine M' such that M' also computes f and such that every state of M' is a TLΔ label. PROOF Let a be any symbol that is allowed in TLΔ labels. Let p_1, p_2, ..., p_m be an enumeration without repetition of all the states of M. M' is defined in terms of M and a. Symbols(M') = Symbols(M), Blank(M') = Blank(M), States(M') = {a, a^2, a^3, ..., a^m}, the start state of M' is a^i, where p_i is the start state of M, Accepting-states(M') = {a^i | p_i ∈ Accepting-states(M)}, and
Instructions(M') = δ', where δ' is defined as follows: δ'(a^i, b) = (a^j, c, <arrow>) provided δ(p_i, b) is defined and δ(p_i, b) = (p_j, c, <arrow>). Here δ = Instructions(M).
M' is just M with the states renamed. Therefore, M' computes the same function f that M does. □
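The renaming in Lemma 5.7 is mechanical enough to sketch in Python; the tuple encoding of a machine is an assumption of this sketch:

```python
# Sketch of Lemma 5.7: rename the states of M as a, aa, aaa, ...
# (powers of a single symbol 'a' allowed in TL-Delta labels), leaving
# the transition behavior otherwise unchanged.
def rename_states(states, start, accepting, delta):
    name = {p: "a" * (i + 1) for i, p in enumerate(states)}
    new_delta = {(name[p], b): (name[q], c, arrow)
                 for (p, b), (q, c, arrow) in delta.items()}
    return ([name[p] for p in states], name[start],
            {name[p] for p in accepting}, new_delta)

states, start, acc, d = rename_states(
    [1, 2], 1, {2}, {(1, "x"): (2, "x", "^")})
print(states, start, acc, d)
```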
5.8 Theorem Let M be any simple Turing machine, let Δ be the set of nonblank symbols of M, and let f be the partial function from Δ* to Δ* computed by M. We can always find a TLΔ program P such that P also computes f. PROOF By Lemma 5.7, we may assume that the states of M are TLΔ labels. Adopt the notation that B = Blank(M) and δ = Instructions(M). Let a_1, a_2, ..., a_n be an enumeration of Δ ∪ {B}. For each p in States(M) and each a_i, let Symbol(p, a_i), Arrow(p, a_i), and State(p, a_i) be defined by the equation δ(p, a_i) = (State(p, a_i), Symbol(p, a_i), Arrow(p, a_i))
For each nonaccepting state p, define a block of TLΔ code as follows. The code is p : <π_1 code>; <π_2 code>; ... ; <π_n code>, where <π_i code> is the piece of code shown in Procedure 5.6. If δ(p, a_i) is undefined, then State(p, a_i) is taken to be some label not in States(M); in this case, Symbol(p, a_i) and Arrow(p, a_i) are irrelevant and may be anything. Take the concatenation of all these blocks in any order, as long as the block for the start state comes first. Follow this by all statements of the form p : ACCEPT where p is an accepting state of M. Place the symbols BEGIN and END at the beginning and end. Finally, replace all strings of the form SCAN = B by SCAN = BLANK and all strings of the form SCAN := B by SCAN := BLANK. The resulting program is the desired program P. The program P, as just described, does have some abbreviations, namely, some usual TLΔ statements do not have labels. However, it is a trivial matter to add some appropriate labels. In fact, any labels at all, as long as they are new and distinct, will do. □ By Theorem 5.8 we know that a partial function is computed by some simple Turing machine if and only if it is computed by some TLΔ program. In and of itself, that is neither surprising nor very important. It is a stepping stone in our development of more sophisticated programming languages. We will develop two more programming languages, each of them equivalent to simple Turing machines. The first of these two is called TLΔS. While it is much richer than TLΔ, it is also a stepping stone - hopefully, a little larger and more interesting, but, in any event, a useful stepping stone.
TLΔS, TLΔ WITH SUBPROCEDURES
The language TLΔS can be thought of as the language TLΔ augmented to allow subprocedures. TLΔ has no variables in the traditional sense, although
Procedure 5.6 π_i Code
IF SCAN = a_i THEN
BEGIN
SCAN := Symbol(p, a_i);
POINTER Arrow(p, a_i);
GOTO State(p, a_i)
END
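Putting Procedure 5.6 together with the surrounding construction of Theorem 5.8, the whole translation can be sketched as a small generator. The encodings, and the label DEAD standing in for a label not in States(M), are assumptions of this sketch:

```python
# Sketch of the Theorem 5.8 construction: given a Turing machine as
# (states, start, accepting, delta) with blank 'B', emit TL-Delta text.
# DEAD is used wherever delta is undefined.
def tm_to_tl(states, start, accepting, delta, symbols, blank="B"):
    tl = lambda s: "BLANK" if s == blank else s   # B is written BLANK
    def block(p):
        pieces = []
        for a in symbols:
            tgt, b, arrow = delta.get((p, a), ("DEAD", a, "^"))
            pieces.append(f"IF SCAN = {tl(a)} THEN BEGIN SCAN := {tl(b)}; "
                          f"POINTER {arrow}; GOTO {tgt} END")
        return p + " : " + "; ".join(pieces)
    # start state's block first, then the rest, then the ACCEPT statements
    order = [start] + [p for p in states if p != start and p not in accepting]
    lines = [block(p) for p in order]
    lines += [p + " : ACCEPT" for p in accepting]
    return "BEGIN " + "; ".join(lines) + " END"

prog = tm_to_tl(["P", "Q"], "P", ["Q"], {("P", "a"): ("Q", "a", "^")},
                ["a", "B"])
print(prog)
```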
the symbol SCAN is a kind of variable. TLΔS has one variable STRING, where STRING is a special symbol like SCAN, POINTER, and the other special symbols of TLΔ. This variable ranges over strings of symbols from Δ. The structures of the two languages TLΔ and TLΔS are essentially the same. The only real differences are in the Boolean expressions and in what we might naturally call the basic executable statements. In TLΔ, the Boolean expressions are of the form SCAN = a; the basic executable statements are of the forms SCAN := a, POINTER←, POINTER↑, and POINTER→. In TLΔS, the Boolean expressions are of the form STRING ∈ A where A is the name of a finite-state language. The Boolean expression is true provided that the string stored in the variable STRING is a member of the language named by A. The basic executable statements of TLΔS are of the form STRING := f(STRING) where f is the name for a computable partial function f from Δ* to Δ*. The intuitive meaning of such a statement is simple. If STRING contains the string α, then after executing this statement it will contain the string f(α); if f(α) is undefined, then this statement aborts the computation. Such symbols as f and A are in effect names for subprocedures. When we write a TLΔS program, we are assuming that we already have programs for computing the functions named by such symbols f and that we already have programs for deciding membership in the languages named by such symbols A. These programs or subprocedures are either TLΔ programs or finite-state acceptors. With these intuitive remarks in mind, we now proceed with the definition of TLΔS. Since the syntax of TLΔS is so much like that of TLΔ, we will only define it informally.
5.9 Syntax of TLΔS A Boolean expression of TLΔS is a string of the form STRING ∈ A, where A may be any symbol. The symbol A is called a subprocedure name for a language. A TLΔS label is the same thing as a TLΔ label. An accepting statement of TLΔS is the same as an accepting statement of TLΔ. So, in particular, accepting statements have labels. A usual TLΔS statement is a string of the form shown in Procedure 5.7, where <Boolean expression> is a TLΔS Boolean expression, <label 1> and <label 2> are any TLΔS labels, and f is any symbol. The symbol f is called a subprocedure name for a function. <label 1> is called the label of the statement. A TLΔS schema is a string of the form BEGIN σ1; σ2; ... ; σn END where each σi is a TLΔS statement, either accepting or usual, and where no two σi have the same label. A careful reading of the above definition will show that the definition of a TLΔS schema does not depend on Δ. However, the semantics does depend on Δ. Technically speaking, we are again defining a whole family of languages, one for each alphabet Δ.
Procedure 5.7 A Usual TLΔS Statement
<label 1> : IF <Boolean expression> THEN
BEGIN
STRING := f(STRING);
GOTO <label 2>
END
5.10 Semantics of TLΔS A TLΔS schema has no meaning until some meaning is assigned to each subprocedure name in the schema. If we assign a finite-state language to each subprocedure name for a language and assign a computable partial function to each subprocedure name for a function, then there is a natural meaning to the schema. A schema, together with such assignments, is called a TLΔS program. In the following, let P be a TLΔS schema.
5.10.1 An assignment of language subprocedures to P is a function G which maps each subprocedure name for a language onto a finite-state language
consisting of strings from Δ*. So, if A is a subprocedure name for a language, then G(A) is a finite-state language consisting of strings over Δ. 5.10.2 An assignment of function subprocedures to P is a function F which maps each subprocedure name for a function onto a partial function that is computed by some TLΔ program. So, if f is a subprocedure name for a function, then F(f) is a computable partial function from Δ* to Δ*. 5.10.3 A TLΔS program is a triple (P, G, F) where P is a TLΔS schema, G is an assignment of language subprocedures to P, and F is an assignment of function subprocedures to P. If G and F are obvious from context, we will sometimes leave them as understood and speak loosely of P as the program. Also, when the meaning is clear from context, we will write A for G(A) and f for F(f), where A and f are subprocedure names. The distinctions between A and G(A) and between f and F(f) are, in fact, hard to see in practice and can present some confusion. That is because they are examples of a distinction we usually do not make, or at least do not make much of. Specifically, they are examples of the distinction between a thing and its name. The symbol f is the name. F(f) is the thing named by the symbol f. Since we usually name F(f) by the symbol f, some of the following parts of this definition can seem bewildering. F is a function from symbols to functions and, in this definition, f is a symbol. So, since F(f) is a function, F(f)(β) can make sense. It is a function applied to β. Usually, we would just write "f(β)" for F(f)(β), but here we wish to distinguish the function from its name. It requires patience to understand the rest of this definition. 5.10.4 An id of a TLΔS program (P, G, F) is a pair (α, β) where α is a TLΔS label and β is in Δ*. Intuitively, α is the label of the TLΔS statement to be executed next, and β is the string currently stored in the variable STRING.
If α is the label of an accepting statement, then (α, β) is called an accepting id. 5.10.5 Let (P, G, F) be a TLΔS program. We will define a partial function Next( ) from id's of this program to id's of this program. The intuitive interpretation is that if Next(α_1, β_1) = (α_2, β_2), then (P, G, F) will go from id (α_1, β_1) to id (α_2, β_2) in one step. It may turn out to be a big step but still we choose to call it a step. The formal definition of Next(α, β) is by cases:
Case 1: α is not the label of any statement of P. In this case, Next(α, β) is undefined.
Case 2: α is the label of an accepting statement of P. In this case, Next(α, β) is undefined.
Case 3: α is the label of a usual statement of P. In this case, let the statement labeled by α be as shown in Procedure 5.8.
Procedure 5.8 Statement Labeled by α
α : IF STRING ∈ A THEN
BEGIN
STRING := f(STRING);
GOTO <label 2>
END
In this case, we distinguish four subcases:
Subcase 3a: β is in G(A) and F(f)(β) is defined. In this case, Next(α, β) = (<label 2>, F(f)(β)). Recall that F(f) is the function named by f and in most situations would be written in the sloppy notation f(β).
Subcase 3b: β is in G(A) and F(f)(β) is undefined. In this case, Next(α, β) is undefined.
Subcase 3c: β is not in G(A) and this is not the last statement in P. In this case, Next(α, β) = (<label 3>, β) where <label 3> is the label of the next statement in P.
Subcase 3d: β is not in G(A) and this is the last statement in P. In this case, Next(α, β) is undefined.
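A Python sketch of Next, following the cases above. The encodings (usual statements as 4-tuples, None standing in for "undefined") are assumptions of the sketch:

```python
# Sketch of Next (5.10.5). A schema P is a list of statements; a usual
# statement is (label, A, f, goto), an accepting one is (label, "ACCEPT").
# G maps language names to membership predicates, F maps function names
# to partial functions (returning None models "undefined").
def make_next(P, G, F):
    labels = [s[0] for s in P]
    def Next(alpha, beta):
        if alpha not in labels:                          # case 1
            return None
        i = labels.index(alpha)
        if P[i][1] == "ACCEPT":                          # case 2
            return None
        _, A, f, goto = P[i]                             # case 3
        if G[A](beta):
            out = F[f](beta)
            return None if out is None else (goto, out)  # 3a / 3b
        if i + 1 < len(P):
            return (labels[i + 1], beta)                 # 3c
        return None                                      # 3d
    return Next

P = [("L1", "A", "f", "L1"), ("DONE", "ACCEPT")]
G = {"A": lambda s: s != ""}
F = {"f": lambda s: s[1:]}
Next = make_next(P, G, F)
print(Next("L1", "ab"), Next("L1", ""))  # ('L1', 'b') ('DONE', '')
```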
To be consistent with the literature and with our other notations, we will write (α_1, β_1) ⊢ (α_2, β_2) instead of Next(α_1, β_1) = (α_2, β_2). When it is not clear which program we are discussing, we will add subscripts to ⊢, either ⊢_P or ⊢_(P,G,F). As usual, ⊢* denotes any finite number, possibly zero, of applications of ⊢. 5.10.6 If (α, β) is an id of a TLΔS program and Next(α, β) is undefined, then (α, β) is called a halting id. If, in addition, α is the label of an accepting statement, then (α, β) is called an accepting id. The definition of a halting id for a TLΔS program does not exactly correspond to our intuitive notion of a halting id. Consider an id (α, β) and adopt the notation of Definition 5.10.5. If Next(α, β) is undefined because F(f)(β) is undefined (Subcase 3b above) and if F(f)(β) is undefined because the program we are using to compute F(f) does not halt on β, then, by our definition, (α, β) is a halting id. But, in this case, our intuitive notion of what is happening is that the entire computing process does not halt. This discrepancy between intuition and the formal definition is in part due to the fact that our definition of a TLΔS program does not specify which TLΔ programs we should use to compute the functions named by the subprocedure names for
functions. The functions are specified by the assignment of function subprocedures (the F of Definition 5.10.2) but, for any given computable function, there are many different TLΔ programs that compute that function. These different TLΔ programs may have different halting properties even though they compute the same function. Our definition fixes a convention that is independent of which TLΔ programs are used. However, the main reason for using this definition of a halting id is that it is convenient for the formal mathematical development of the notion of a TLΔS program computing a function. Specifically, it makes it easy to define when a computation of a TLΔS program has a valid output, and this definition of valid output, which we will give shortly, does correspond to our intuitive notion of a program halting in a configuration that produces a valid output.
5.10.7 A computation of (P, G, F) on input ξ is a finite or infinite list (α_0, β_0), (α_1, β_1), (α_2, β_2), ... of id's such that:
1. (α_0, β_0) = (<label>, ξ), where <label> is the label of the first statement in P.
2. (α_{i-1}, β_{i-1}) ⊢ (α_i, β_i) for all i ≥ 1 in the sequence.
3. If the list is finite, then the last id is a halting id, and the program is said to halt on input ξ.
5.10.8 (P, G, F) is said to have valid output β for input ξ provided (<label 1>, ξ) ⊢* (<label 2>, β), where <label 1> is the label of the first statement in P and <label 2> is the label of some accepting statement in P. In other words, β is the valid output for input ξ provided that the computation on input ξ halts after executing an accepting statement and at the end of the computation the variable STRING contains β. Every input ξ has at most one valid output. Some input strings may, of course, produce no valid output.
5.10.9 Let f be a partial function from Δ* to Δ*. (Here f stands for a function rather than a symbol and therefore f(ξ) has the usual meaning.) The TLΔS program (P, G, F) is said to compute f provided that the following hold for all ξ in Δ*.
1. f(ξ) is defined if and only if (P, G, F) has valid output for the input ξ.
2. If f(ξ) is defined, then f(ξ) is the valid output for the input ξ.
As with simple Turing machines and TLΔ programs, every TLΔS program computes a unique partial function. Again, this partial function may sometimes turn out to be trivial, but still it is a partial function.
5.11 Example of a TLΔS Program We will describe a TLΔS program (P, G, F) which computes the partial
function f defined so that f(a^i b^j) = a^(i-j) for i > j, f(a^i b^j) = b^(j-i) for i ≤ j, and f is undefined for all other arguments. Set Δ = {a, b}. The schema P is shown in Procedure 5.9. There are two subprocedure names for languages, A and C. G(A) = {a^i b^j | i ≥ 1 and j ≥ 1}. G(C) = Δ* - {a^i b^j | i ≥ 0 and j ≥ 0}. There are two subprocedure names for functions, I and g. F(I) is the identity function which maps each string in Δ* to itself. F(g) is a function which deletes one a and one b from a string provided that string contains at least one a and one b. F(g) is undefined for all other arguments.
Procedure 5.9 TLΔS Schema P
BEGIN
START: IF STRING ∈ C THEN
BEGIN
STRING := I(STRING);
GOTO UNDEFINED
END;
LOOP: IF STRING ∈ A THEN
BEGIN
STRING := g(STRING);
GOTO LOOP
END;
FINISH: ACCEPT
END
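The schema can be exercised with a small interpreter for the Next semantics of 5.10. Modeling G by Python predicates and F by Python functions is an assumption of this sketch (the text requires finite-state languages and TLΔ-computable functions):

```python
# Procedure 5.9's schema, with G and F as in 5.11. Since g is only ever
# applied here to strings in a+b+, dropping the first character (an a)
# and the last character (a b) deletes one a and one b.
import re

P = [("START", "C", "I", "UNDEFINED"),
     ("LOOP", "A", "g", "LOOP"),
     ("FINISH", "ACCEPT")]
G = {"C": lambda s: re.fullmatch(r"a*b*", s) is None,  # complement of a^i b^j
     "A": lambda s: re.fullmatch(r"a+b+", s) is not None}
F = {"I": lambda s: s, "g": lambda s: s[1:-1]}

def run(P, G, F, xi):
    labels = [s[0] for s in P]
    alpha, beta = labels[0], xi
    while alpha in labels:
        i = labels.index(alpha)
        if P[i][1] == "ACCEPT":
            return beta                      # valid output
        _, A, f, goto = P[i]
        if G[A](beta):
            alpha, beta = goto, F[f](beta)   # subcase 3a
        elif i + 1 < len(P):
            alpha = labels[i + 1]            # subcase 3c
        else:
            return None                      # subcase 3d
    return None                              # GOTO to a missing label

print(repr(run(P, G, F, "aaabb")), repr(run(P, G, F, "ba")))  # 'a' None
```

Note how input ba takes the GOTO UNDEFINED branch: the label is missing from the schema, so the run aborts with no valid output, matching f being undefined there.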
In order to show that (P, G, F) is a TLΔS program, we must show that G(A) and G(C) are finite-state languages and that F(I) and F(g) are computable by TLΔ programs. However, it is a routine exercise to show these facts. The first usual TLΔS statement of schema P may deserve a short explanation. There is no statement in P which is labeled UNDEFINED. So the substatement GOTO UNDEFINED simply aborts the computation. Hence, if an input string ξ is in G(C), then the program has no valid output for the input ξ. Hence, if f is the function computed by the program, then f(ξ) is undefined for all ξ in G(C), which is what we wanted. The substatement STRING := I(STRING)
in the first usual TLΔS statement of schema P accomplishes nothing. Since F(I) is the identity function, it produces no change. We had to include some such statement, though. Otherwise, we would violate the definition of TLΔS syntax. In less formal contexts, we could abbreviate the schema P by omitting that substatement. Below we discuss a few other handy TLΔS abbreviations.
5.12 TLΔS Abbreviations Suppose L_1 and L_2 are finite-state languages over Δ. By Theorem 3.5, L_1 ∪ L_2 and L_1 ∩ L_2 are also finite-state languages. By Theorem 3.8, Δ* - L_1 is also a finite-state language. If we wish to write a TLΔS program (P, G, F), where we have a Boolean expression STRING ∈ A and G(A) = L_1 ∪ L_2, then we will sometimes write (STRING ∈ A_1 OR STRING ∈ A_2) in place of STRING ∈ A. In this case, we set G(A_1) = L_1 and G(A_2) = L_2. Similarly, if G(B) = L_1 ∩ L_2, we will sometimes write (STRING ∈ A_1 AND STRING ∈ A_2) in place of STRING ∈ B, again setting G(A_1) = L_1 and G(A_2) = L_2. Similarly, if G(C) = Δ* - L_1, then we will sometimes write (NOT STRING ∈ A_1) in place of STRING ∈ C. In this case we again set G(A_1) = L_1. For example, in 5.11, we could write (NOT STRING ∈ D) in place of STRING ∈ C and set G(D) = {a^i b^j | i ≥ 0 and j ≥ 0}. This is all just a kind of sloppy shorthand, and is not a formally valid way of writing TLΔS programs, but since it does help us see some programs more clearly we will use it.
A COMPILER FOR TLΔS
We now proceed to show that anything that can be done by a TLΔS program can be done by a TLΔ program. The proof is in two steps. First, we show that anything that can be done by a TLΔS program can be done by a TLΓ program, for some larger alphabet Γ ⊇ Δ. This is the heart of the proof. Later on we will show that this TLΓ program can be converted to an equivalent TLΔ program. If we put together all the algorithms given in the proof of this result, we get a "compiler" that takes TLΔS programs as input and gives equivalent TLΔ programs as output. The output TLΔ program will compute the same partial function as the TLΔS program. The formal result is more easily stated with the help of a preliminary definition.
5.13 Definition Two programs are said to be input/output equivalent provided that they both
compute the same partial function. We will apply this definition to two TLΔ programs, two TLΔS programs, or a TLΔ program and a TLΔS program, and also to other types of programs and machines. 5.14 Theorem If Δ is any alphabet with at least two elements and if (P, G, F) is any TLΔS program, then we can find an alphabet Γ ⊇ Δ and a TLΓ program P' such that P' is input/output equivalent to the TLΔS program (P, G, F). Before proving Theorem 5.14, we will first prove two lemmas. Both lemmas are of a rather technical nature and are only needed to cope with some technical details in the proof of Theorem 5.14. The first lemma is, nonetheless, quite intuitive. It says the following: If A is a finite-state language, then there is a computable function h_A which, for inputs ξ, answers the question: "Is ξ in A?" When we use this lemma, we will be writing a TLΓS program which will include a substatement of the form STRING := h_A(STRING). Since, in this case, ξ will be stored in STRING and we do not wish to lose ξ, h_A will return <yes>ξ or <no>ξ instead of just <yes> or <no>.
5.15 Lemma Let Δ be an alphabet with at least two elements. Let <yes> and <no> be two distinct elements of Δ. If A is any finite-state language over Δ, then the total function h_A, defined as follows, is a computable function. For each ξ in A, h_A(ξ) = <yes>ξ; for each ξ in Δ* - A, h_A(ξ) = <no>ξ. PROOF Let M be a deterministic finite-state acceptor which accepts the language A. The states of M may be any mathematically well-defined objects. We would like the states of M to be TLΔ labels. Clearly, if we just rename the states of M with TLΔ labels, then M still accepts the language A. Therefore we can, and will, assume that the states of M are TLΔ labels. (A formal proof that M can always be chosen so that its states are TLΔ labels would be similar to the proof of Lemma 5.7.) Let δ be the transition function of M. Let p_0, p_1, ..., p_m be a list without repetition of all the states of M and let p_0 be the start state. Let a_0, a_1, ..., a_n be a list without repetition of all the symbols of M. By Lemma 3.4, we know that we can choose M so that δ(p_i, a_j) is defined for all p_i and a_j. Let Q_ij = δ(p_i, a_j), for i = 0, 1, ..., m and j = 0, 1, 2, ..., n. We will describe a TLΔ program that computes h_A. The program has a few abbreviations in it, but it is easy to convert it to a formally correct TLΔ program. We will describe this program by a context-free grammar. The grammar has start symbol <program>, and the language generated contains exactly one string. That string is the TLΔ program for h_A. The grammar is given in Table 5.1. To see that this program computes h_A, note that all computations on an input ξ proceed as follows. The block of code <M code> is executed first. This code simulates M step by step with one embellishment: Whenever a nonaccepting state is reached, the block <end no?> is executed and whenever an accepting state is reached, the block <end yes?> is executed. These blocks
check to see if all of ξ is read. They do this by checking for a blank. If all of ξ is read and an accepting state is reached, then the GOTO INA is executed and control passes to the block <yes rewind>. If all of ξ is read and the last p_i was not an accepting state, then control goes from the block <M code> to the block <no rewind>. So after <M code> is executed, all of ξ is read, and control passes to either <yes rewind> or <no rewind> depending on whether or not ξ is in A. Both of these rewind blocks move the pointer to the front of ξ; <yes rewind> puts <yes> in front of ξ; <no rewind> puts <no> in front of ξ. Finally, whichever rewind block is executed, the program ends by a GOTO EXITA. The flowchart in Figure 5.2 may be of some help in understanding the program. □
[Figure 5.2: flowchart of the h_A program. <M code> tests whether ξ ∈ A; on yes, control passes to <yes rewind>, on no, to <no rewind>; both blocks exit via GOTO EXITA.]
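Lemma 5.15's h_A is easy to sketch once a total DFA for A is in hand. The example DFA and the Y/N characters standing in for <yes> and <no> are illustrative assumptions, not from the text:

```python
# Sketch of Lemma 5.15: given a total DFA accepting A, h_A prefixes the
# unchanged input with a yes-marker or a no-marker.
def make_h(dfa_delta, start, accepting, yes="Y", no="N"):
    def h(xi):
        p = start
        for c in xi:
            p = dfa_delta[(p, c)]   # total DFA: always defined
        return (yes if p in accepting else no) + xi
    return h

# Example DFA over {a, b} accepting strings with an even number of a's
# (an illustrative choice of A).
d = {("E", "a"): "O", ("E", "b"): "E", ("O", "a"): "E", ("O", "b"): "O"}
h = make_h(d, "E", {"E"})
print(h("aab"))  # Yaab
print(h("a"))    # Na
```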
The next lemma is another technical detail but with at least a little intuitive content. In proving Theorem 5.14, we will want a simple check to see if a TLΔ program P has a valid output for a given input α. The test we will use is
to check to see if the program reaches an accepting statement in the computation on input α. For an arbitrary TLΔ program P, this test unfortunately does not quite work; there may be inputs α such that, on input α, P reaches an accepting statement and yet P has no valid output for the input α. (The format conditions of Definition 5.1.6 might not hold.) The lemma says we can always rewrite the program P to get an equivalent program P″ for which this simple test does work. One cost of the rewriting is that we may have to enlarge the alphabet used by P″.
Table 5.1 Context-free Grammar for h_A Program
<program> → BEGIN <body>; EXITA: ACCEPT END
<body> → <M code> <yes rewind>; <no rewind>
<M code> → <p_0 block>; <p_1 block>; ... ; <p_m block>
For all p_i which are not accepting states:
<p_i block> → p_i : <end no?>; <π_i0 code>; <π_i1 code>; ... ; <π_in code>
For all p_i which are accepting states:
<p_i block> → p_i : <end yes?>; <π_i0 code>; <π_i1 code>; ... ; <π_in code>
For all states p_i and all symbols a_j:
<π_ij code> → IF SCAN = a_j THEN BEGIN SCAN := a_j; POINTER→; GOTO Q_ij END
<end yes?> → IF SCAN = BLANK THEN BEGIN SCAN := BLANK; POINTER↑; GOTO INA END
<end no?> is just like <end yes?> but with INA replaced by OUTA. Notice that INA will label the start of <yes rewind> and OUTA will label the start of <no rewind>.
<yes rewind> → INA: <rewind 1>; IF SCAN = BLANK THEN BEGIN SCAN := <yes>; POINTER↑; GOTO EXITA END
<no rewind> → OUTA: <rewind 2>; IF SCAN = BLANK THEN BEGIN SCAN := <no>; POINTER↑; GOTO EXITA END
For i = 1, 2:
<rewind i> → POINTER←; LOOPi: IF (NOT SCAN = BLANK) THEN BEGIN POINTER←; GOTO LOOPi END
Note: p_0, p_1, ..., p_m, EXITA, INA, OUTA, LOOP1, and LOOP2 must be m + 6 distinct labels, but this is easy to ensure.
5.16 Lemma If f is computed by a TLΔ program P, then there is an alphabet Γ ⊇ Δ and a TLΓ program P″ such that
1. P″ also computes f.
2. For any input α, if P″ reaches an accepting statement in the computation of P″ on α, then P″ has a valid output for the input α.
PROOF Since the details for the construction of P″ are long and routine, we will merely sketch the construction of P″ and leave the details as one more exercise. Let Γ = Δ ∪ {<pseudo blank>}, where <pseudo blank> is a new symbol which will serve as a pseudo blank symbol. The program P″ will simulate the program P step by step with one small change. Whenever P would write a blank symbol, P″ will write the symbol <pseudo blank>. Whenever P″ reads the symbol <pseudo blank>, P″ will simulate P reading the true blank. In this way P″ can tell, for any input α, whether or not P would reach an accepting statement in the computation of P on input α. If P would reach an accepting statement on input α, then P has a valid output for the input α provided the tape configuration of P, at the end of the computation, satisfies the format conditions of Definition 5.1.6. So, when P″ has finished simulating P, it does the following. If P″ determines that P would end its computation with an accepting
statement, then P″ does not immediately execute an accepting statement. Instead, P″ checks to see if the format conditions for a valid output are satisfied. P″ then executes an accepting statement and produces a valid output only if these format conditions are satisfied. In order to check for these format conditions, P″ must check that P's tape would contain a single string of nonblank characters and that the tape head is positioned properly. This can be done by scanning every square of tape used in the computation and checking to see that there is no instance of two symbols of Δ with one or more blanks or pseudo blanks between them and by checking the tape head position. To do this, P″ must somehow tell which symbols were scanned in the computation. This is the function of the pseudo blank symbol <pseudo blank>. Every square scanned by P″ in its simulation of P will contain either a symbol of Δ or the symbol <pseudo blank>. The piece of tape to be checked is marked by a true blank symbol at each end. We now give a little more detail of this construction of P″. Let <P body> be the program P with the enclosing BEGIN-END deleted; that is, P = BEGIN <P body> END. Define P″ to be the program shown in Procedure 5.10 where <label 1>, <label 2>, and <label 3> are three new labels and <label 3> does not label any P″ statement. <input check> is a block of TLΓ code that tests to see if the input is in Δ*. If it is in Δ*, then control is transferred to <label 1>; if it is not in Δ*, then control is transferred to <label 3>, ending the computation in a nonaccepting mode. The block <simulation code> is obtained from <P body> by the algorithm described in Procedure 5.11.
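The contiguity check described here can be sketched in Python. Representing the used portion of the tape as a dict from positions to symbols, with '#' as the pseudo blank, is an assumption of the sketch:

```python
# Valid-output format check from the proof sketch of Lemma 5.16: among
# the squares used in the simulation, the symbols of Delta must form one
# contiguous block with no (pseudo) blanks inside it, and the head must
# rest on the leftmost symbol of that block.
def valid_format(used, head, delta_syms):
    positions = sorted(i for i, s in used.items() if s in delta_syms)
    if not positions:
        return True   # empty output; a simplifying assumption of the sketch
    contiguous = positions == list(range(positions[0], positions[-1] + 1))
    return contiguous and head == positions[0]

print(valid_format({0: "a", 1: "#", 2: "b"}, 0, {"a", "b"}))  # False
print(valid_format({0: "a", 1: "b"}, 0, {"a", "b"}))          # True
```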
END. Define p" to be the program shown in Procedure 5.10 where , , and are three new labels and does not label any p" statement. is a block of TLr code that tests to see if the input is in a*. If it is in a*, then control is transferred to ; if it is not in a*, then control is transferred to ending the computation in a nonaccepting mode. The block is obtained from by the algorithm described in Procedure 5.11.