LIST OF SYMBOLS TOP IC
SYMBOL
MEANIN G
LOGIC
""'p p 1\ q pvq P (J)q p-+q p ++ q p=.q T F P(xJ,... ,xn) VxP(x) 3xP(x...
3624 downloads
4415 Views
25MB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
LIST OF SYMBOLS TOP IC
SYMBOL
MEANIN G
LOGIC
""'p p 1\ q pvq P (J)q p-+q p ++ q p=.q T F P(xJ,... ,xn) VxP(x) 3xP(x) 3!xP(x)
negation of p conjunction of p and q disjunction of p and q exclusive or of p and q the implication p implies q biconditional of p and q equivalence of p and q tautology contradiction propositional function universal quantification of P(x) existential quantification of P(x) uniqueness quantification of P(x) therefore partial correctness of S
p{S}q
SETS
PAGE 3 4 4
5 6 9 22 24 24 32 34 36 37 63 323
x is a member of S x is not a member of S
U Ai
list of elements of a set set builder notation set of natural numbers set of integers set of positive integers set of rational numbers set of real numbers set equality the empty (or null) set S is a subset of T S is a proper subset of T cardinality of S the power set of S n-tuple ordered pair Cartesian product of A and B union of A and B intersection of A and B the difference of A and B complement of A
1 12 1 12 1 12 1 12 1 12 1 12 1 13 1 13 1 13 1 13 1 14 1 14 1 15 1 16 1 16 1 17 1 17 1 18 12 1 121 1 23 1 23
union of Ai, i= 1 , 2, . . . , n
1 27
n Ai
intersection of Ai, i= 1 , 2, . . . , n
128
A(J)B
symmetric difference of A and B
131
XES x fj S {aJ,... ,an} {x I P(x)} N
Z z+ Q
R S=T 0
SC;T SeT lSI peS) (aJ,... ,an) (a, b) AxB AUB AnB A-B
A n
i=J n
i=J
TOP IC
SYMBOL
MEANIN G
FUNCTIONS
J (a ) J : A ---+ B JI+ h Jd2 J (5) LA (S ) J -I(x ) J og LxJ fx l
value of the function J at a function from A to B sum of the functions J I and J2 product of the functions J I and 12 image of the set 5 under J identity function on A inverse of J composition of J and g floor function of x ceiling function of x term of {ai} with subscript n
1 33 1 33 1 35 135 1 36 138 1 39 1 40 1 43 1 43 1 50
sum of ai, a2, ... , an
1 53
sum of aa over a
1 56
an n Lai i=1 L aa aES n fl an i=!
J (x) is O (g (x » n! J(x) is Q(g(x» J(x) is 8(g (x»
min (x , y) max(x , y) �
PA GE
E
5
product of ai, a2, ... , an
1 62
J (x ) is big-O o f g (x ) n factorial J (x ) is big-Omega of g (x ) J (x ) is big-Theta of g(x) asymptotic minimum of x and y maximum of x and y approximately equal to
1 80 1 85 1 89 1 89 1 92 216 217 395
INTEGERS
alb alb a div b a mod b a == b (mod m ) a =1= b (mod m ) gcd (a , b) 1cm (a, b) (akak-I ... alaO)b
a divides b a does not divide b quotient when a is divided by b remainder when a is divided by b a i s congruent to b modulo m a i s not congruent to b modulo m greatest common divisor of a and b least common multiple of a and b base b representation
20 1 20 1 202 202 203 203 215 217 219
MATRICES
[aij ] A+B AB In At AvB AAB A0B A[n]
matrix with entries aij matrix sum of A and B matrix product of A and B identity matrix of order n transpose of A join of A and B the meet of A and B Boolean product of A and B nth Boolean power of A
247 247 248 25 1 25 1 252 252 253 254
(List of Symbols continued at back of book)
Discrete Mathematics and Its Applications Sixth Edition
Kenneth
H.
Rosen
AT&T Laboratories
Boston Bangkok Milan
Burr Ridge, IL
Bogota
Montreal
Caracas New Delhi
Dubuque, IA
New York
Kuala Lumpur
Lisbon
Santiago
Seoul
San Francisco London
Singapore
St. Louis
Madrid
Sydney
Mexico City
Taipei
Toronto
The
McGraw Hilt Compames
4
DISCRETE MATHEMATICS AND ITS APPLICATIONS, SIXTH EDITION International Edition
2007
Exclusive rights by McGraw-Hill Education (Asia), for manufacture and export. This book cannot be re-exported from the country to which it is sold by McGraw-Hill. The International Edition is not available in North America.
1221
Published by McGraw-Hill, a business unit of The McGraw-Hill Companies, Inc., Avenue of the Americas, New York, NY Copyright by Kenneth H. Rosen. All rights reserved. No part of this publication may be reproduced or distributed in any form or by any means, or stored in a database or retrieval system, without the prior written consent of The McGraw-Hill Companies, Inc., including, but not limited to, in any network or other electronic storage or transmission, or broadcast for distance learning. Some ancillaries, including electronic and print components, may not be available to customers outside the United States.
10020.
© 2007
1020 0909 0808 0707 0606 05 04 03 02 01
CTF
BJE
The credits section for this book begins on page C-I and is considered an extension of the copyright page.
When ordering this title, use ISBN-13: 978-007-124474-9 or ISBN-I0: 007-124474-3 Printed in Singapore www.rnhhe.com
Contents Preface vii
T he MathZone Companion Website xviii
To the Student xx
1
The Foundations: Logic and Proofs .................................. 1
1.1 1 .2 1 .3 1 .4 1 .5 1 .6 1 .7
Propositional Logic . . ... .. . 1 Propositional Equivalences . . . 21 Predicates and Quantifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30 Nested Quantifiers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50 Rules of Inference . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63 Introduction to Proofs . . . . 75 Proof Methods and Strategy . 86 End-of-Chapter Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 04
2
Basic Structures: Sets, Functions, Sequences, and Sums ....... ... 111
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
2 . 1 . Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . : . . . . . . . . 1 1 1 2.2 Set Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 2 1 2 .3 Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 3 3 2 .4 Sequences and Summations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 49 End-of-Chapter Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 63 3
The Fundamentals: Algorithms, the Integers, and Matrices........ 167
3.1 3.2 3.3 3 .4 3.5 3.6 3.7 3.8
Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 67 The Growth of Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 80 Complexity of Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1 93 The Integers and Division . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 200 Primes and Greatest Common Divisors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1 0 Integers and Algorithms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 1 9 Applications of Number Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2 3 1 Matrices 246 End-of-Chapter Material . 257 .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
4
Induction and Recursion .......................................... 263
4. 1 4.2 4.3 4.4
Mathematical Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 263 Strong Induction and Well-Ordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 283 Recursive Definitions and Structural Induction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 294 Recursive Algorithms . .. . . 311 .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
iii
Contents
4.5
5
5. 1 5.2 5.3 5 .4 5.5 5.6
6
6. 1 6.2 6.3 6.4
Program Correctness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 322 End-of-Chapter Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 328 Counting
.
.
.
.
.
.
.
.
.
.
.
7. 1 7.2 7.3 7.4 7.5 7.6
8
8. 1 8.2 8.3 8.4 8.5 8.6
9
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
335
.
The Basics of Counting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 335 The Pigeonhole Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 347 Permutations and Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355 Binomial Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 363 Generalized Permutations and Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 370 Generating Permutations and Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 382 End-of-Chapter Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 386 Discrete Probability
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
393
.
An Introduction to Discrete Probability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 393 Probability Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 400 Bayes' Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 1 7 Expected Value and Variance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 426 End-of-Chapter Material . . . . 442 .
7
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Advanced Counting Techniques
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
449
Recurrence Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 449 Solving Linear Recurrence Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 460 Divide-and-Conquer Algorithms and Recurrence Relations . . . . . . . . . . . . . . . . . . . . . . . 474 Generating Functions . . .... . 484 Inclusion-Exclusion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 499 Applications of Inclusion-Exclusion . . . ........ ... 505 End-of-Chapter Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1 3 .
Relations
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
519
.
Relations and Their Properties . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 1 9 n-ary Relations and Their Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 530 Representing Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 537 Closures of Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 544 Equivalence Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 555 Partial Orderings . . . . . . . .. 566 End-of-Chapter Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58 1 Graphs
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
589
.
9. 1 9.2 9.3
Graphs and Graph Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 589 Graph Terminology and Special Types of Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 597 Representing Graphs and Graph Isomorphism . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 1 1 62 1 9.4 Connectivity .
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
Contents
v
9.5 9.6 9.7 9 .8
Euler and Hamilton Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 633 Shortest-Path Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 647 Planar Graphs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 657 Graph Coloring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 666 End-of-Chapter Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 675
10
Trees
1 0. 1 1 0.2 1 0.3 1 0.4 1 0.5
Introduction to Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 683 Applications of Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 695 Tree TraversaL . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 1 0 Spanning Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 724 Minimum Spanning Trees . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 737 End-of-Chapter Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 743
11
Boolean Algebra
1 1.1 1 1 .2 1 1 .3 1 1 .4
Boolean Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 749 Representing Boolean Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 757 Logic Gates 760 Minimization of Circuits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 766 End-of-Chapter Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78 1
12
Modeling Computation
12.1 1 2 .2 1 2 .3 1 2 .4 1 2 .5
Languages and Grammars . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 785 Finite-State Machines with Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 796 Finite-State Machines with No Output . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 804 Language Recognition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8 1 7 Turing Machines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 827 End-of-Chapter Material . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 838
.
.
.
.
.
.
.
.
.
.
Appendixes
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
749
.
.
.
683
.
785
A-I
A- I Axioms for the Real Numbers and the Positive Integers . . . . . . . . . . . . . . . . . . . . . . . . . . A- I A-2 Exponential and Logarithmic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-7 A-3 Pseudocode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A- I O Suggested Readings
B-1
Answers to Odd-Numbered Exercises Photo Credits C-l
Index of Biographies 1-1
Index 1-2
S-1
About the Author K enneth H. Rosen has had a long career as a Distinguished Member of the Technical Staff at AT&T Laboratories in Monmouth County, New Jersey. He currently holds the position
of visiting research professor at Monmouth University where he is working on the security and privacy aspects of the Rapid Response Database Project and where he is teaching a course on cryptographic applications. Dr. Rosen received his B.S. in Mathematics from the University of Michigan, Ann Arbor ( 1 972), and his Ph.D. in Mathematics from M.LT. ( 1 976), where he wrote his thesis in the area of number theory under the direction of Harold Stark. Before joining Bell Laboratories in 1 982, he held positions at the University of Colorado, Boulder; The Ohio State University, Columbus; and the University of Maine, Orono, where he was an associate professor of mathematics. While working at AT&T Labs, he taught at Monmouth University, teaching courses in discrete mathematics, coding theory, and data security. Dr. Rosen has published numerous articles in professional journals in the areas of number theory and mathematical modeling. He is the author ofthe textbooks Elementary Number Theory and Its App lications, published by Addison-Wesley and currently in its fifth edition, and Discrete Mathematics and Its App lications, published by McGraw-Hili and currently in its sixth edition. Both of these textbooks have been used extensively at hundreds of universities throughout the world. Discrete Mathematics and Its Applications has sold more than 300,000 copies in its lifetime with translations into Spanish, French, Chinese, and Korean. He is also co-author of UNIX The Complete Reference; UNIX System V Release 4: An Introduction; and Best UNIX Tips Ever, each published by Osborne McGraw-Hili. These books have sold more than 1 50,000 copies with translations into Chinese, German, Spanish, and Italian. Dr. Rosen is also the editor of the Handbook of Discrete and Combinatorial Mathematics, published by CRC Press, and he is the advisory editor of the CRC series of books in discrete mathematics, consisting of more than 30 volumes on different aspects of discrete mathematics, most of which are introduced in this book. He is also interested in integrating mathematical software into the educational and professional environments, and has worked on projects with Waterloo Maple Inc.'s Maple software in both these areas. At Bell Laboratories and AT&T Laboratories, Dr. Rosen worked on a wide range of projects, including operations research studies and product line planning for computers and data commu nications equipment. He helped plan AT&T's products and services in the area of multimedia, including video communications, speech recognition and synthesis, and image networking. He evaluated new technology for use by AT&T and did standards work in the area of image network ing. He also invented many new services, and holds or has submitted more than 70 patents. One of his more interesting projects involved helping evaluate technology for the AT&T attraction at EPCOT Center.
Preface I discrete mathematics. For the student, my purpose was to present material in a precise, readable manner, with the concepts and techniques of discrete mathematics clearly presented n writing this book, I was guided by my long-standing experience and interest in teaching
and demonstrated. My goal was to show the relevance and practicality of discrete mathematics to students, who are often skeptical. I wanted to give students studying computer science all of the mathematical foundations they need for their future studies. I wanted to give mathematics students an understanding of important mathematical concepts together with a sense of why these concepts are important for applications. And most importantly, I wanted to accomplish these goals without watering down the material. For the instructor, my purpose was to design a flexible, comprehensive teaching tool using proven pedagogical techniques in mathematics. I wanted to provide instructors with a package of materials that they could use to teach discrete mathematics effectively and efficiently in the most appropriate manner for their particular set of students. I hope that I have achieved these goals. I have been extremely gratified by the tremendous success of this text. The many improve ments in the sixth edition have been made possible by the feedback and suggestions of a large number of instructors and students at many of the more than 600 schools where this book has been successfully used. There are many enhancements in this edition. The companion website has been substantially enhanced and more closely integrated with the text, providing helpful material to make it easier for students and instructors to achieve their goals. This text is designed for a one- or two-term introductory discrete mathematics course taken by students in a wide variety of majors, including mathematics, computer science, and engineering. College algebra is the only explicit prerequisite, although a certain degree of mathematical maturity is needed to study discrete mathematics in a meaningful way.
Goals of a Discrete Mathematics Course A discrete mathematics course has more than one purpose. Students should learn a particular set of mathematical facts and how to apply them; more importantly, such a course should teach students how to think logically and mathematically. To achieve these goals, this text stresses mathematical reasoning and the different ways problems are solved. Five important themes are interwoven in this text: mathematical reasoning, combinatorial analysis, discrete structures, algorithmic thinking, and applications and modeling. A successful discrete mathematics course should carefully blend and balance all five themes.
1 . Mathematical Reasoning: Students must understand mathematical reasoning in order to read, comprehend, and construct mathematical arguments. This text starts with a discussion of mathematical logic, which serves as the foundation for the subsequent discussions of methods of proof. Both the science and the art of constructing proofs are addressed. The technique of mathematical induction is stressed through many different types of examples of such proofs and a careful explanation of why mathematical induction is a valid proof technique. 2 . Combinatorial Analysis: An important problem-solving skill is the ability to count or enu merate objects. The discussion of enumeration in this book begins with the basic techniques of counting. The stress is on performing combinatorial analysis to solve counting problems and analyze algorithms, not on applying formulae. vii
vJli
Preface
3 . Discrete Structures: A course in discrete mathematics should teach students how to work with discrete structures, which are the abstract mathematical structures used to represent discrete objects and relationships between these objects. These discrete structures include sets, permutations, relations, graphs, trees, and finite-state machines.
4. Algorithmic Thinking: Certain classes of problems are solved by the specification of an algorithm. After an algorithm has been described, a computer program can be constructed implementing it. The mathematical portions of this activity, which include the specification of the algorithm, the verification that it works properly, and the analysis ofthe computer memory and time required to perform it, are all covered in this text. Algorithms are described using both English and an easily understood form of pseudocode. 5. Applications and Modeling: Discrete mathematics has applications to almost every conceiv able area of study. There are many applications to computer science and data networking in this text, as well as applications to such diverse areas as chemistry, botany, zoology, linguis tics, geography, business, and the Internet. These applications are natural and important uses of discrete mathematics and are not contrived. Modeling with discrete mathematics is an extremely important problem-solving skill, which students have the opportunity to develop by constructing their own models in some of the exercises.
Changes in the Sixth Edition The fifth edition of this book has been used successfully at over 600 schools in the United States, dozens of Canadian universities, and at universities throughout Europe, Asia, and Oceania. Although the fifth edition has been an extremely effective text, many instructors, including longtime users, have requested changes designed to make this book more effective. I have devoted a significant amount of time and energy to satisfy these requests and I have worked hard to find my own ways to make the book better. The result is a sixth edition that offers both instructors and students much more than the fifth edition did. Most significantly, an improved organization of topics has been implemented in this sixth edition, making the book a more effective teaching tool. Changes have been implemented that make this book more effective for students who need as much help as possible, as well as for those students who want to be challenged to the maximum degree. Substantial enhancements to . the material devoted to logic, method of proof, and proof strategies are designed to help students master mathematical reasoning. Additional explanations and examples have been added to clarify material where students often have difficulty. New exercises, both routine and challenging, have been inserted into the exercise sets. Highly relevant applications, including many related to the Internet and computer science, have been added. The MathZone companion website has benefited from extensive development activity and now provides tools students can use to master key concepts and explore the world of discrete mathematics. Improved Organization •
The first part of the book has been restructured to present core topics in a more efficient, more effective, and more flexible way.
•
• •
Coverage of mathematical reasoning and proof is con centrated in Chapter 1 , flowing from propositional and predicate logic, to rules of inference, to basic proof techniques, to more advanced proof techniques and proof strategies.
•
A separate chapter on discrete structures--Chapter 2 in this new edition---covers sets, functions, sequence, and sums. Material on basic number theory, covered in one sec tion in the fifth edition, is now covered in two sections, the first on divisibility and congruences and the second on pnmes. The new Chapter 4 is entirely devoted to induction and recursion.
Preface ix Logic •
Coverage of logic has been amplified with key ideas explained in greater depth and with more care.
•
Conditional statements and De Morgan's laws receive expanded coverage.
•
The construction of truth tables is introduced earlier and in more detail.
•
More care is devoted to introducing predicates and quantifiers, as well as to explaining how to use and work with them.
•
The application of logic to system specifications-a topic of interest to system, hardware, and software engineers-has been expanded.
•
Material on valid arguments and rules of inference is now presented in a separate section.
Writing and Understanding Proofs •
Proof methods and proof strategies are now treated in separate sections of Chapter 1 .
•
An appendix listing basic axioms for real numbers and for the integers, and how these axioms are used to prove new results, has been added. The use of these axioms and basic results that follow from them has been made explicit in many proofs in the text.
•
The process of making conjectures, and then using different proof methods and strategies to attack these
conjectures, is illustrated using the easily accessible topic of tilings of checkerboards. •
•
Separate and expanded sections on mathematical in duction and on strong induction begin the new Chapter 4. These sections include more motivation and a rich collection of examples, providing many examples dif ferent than those usually seen. More proofs are displayed in a way that makes it pos sible to explicitly list the reason for each step in the proof.
Algorithms and Applications • •
More coverage is devoted to the use of strong induc tion to prove that recursive algorithms are correct. How Bayes' Theorem can be used to construct spam filters is now described.
•
Examples and exercises from computational geometry have been added, including triangulations of polygons.
•
The application of bipartite graphs to matching prob lems has been introduced.
Number Theory, Combinatorics, and Probability Theory •
Coverage of number theory is now more flexible, with four sections covering different aspects of the subject and with coverage of the last three of these sections optional.
•
The introduction of basic counting techniques, and permutations and combinations, has been enhanced.
•
Coverage of counting techniques has been expanded; counting the ways in which objects can be distributed in boxes is now covered.
•
Coverage of probability theory has been expanded with the introduction of a new section on Bayes' Theorem.
•
Examples illustrating the construction of finite-state automata that recognize specified sets have been added.
•
Minimization of finite-state machines is now men tioned and developed in a series of exercises. Coverage of Turing machines has been expanded with a brief introduction to how Turing machines arise in the study of computational complexity, decidability, and computability.
Graphs and Theory of Computation •
The introduction to graph theory has been streamlined and improved.
•
A quicker introduction to terminology and applica tions is provided, with the stress on making the correct decisions when building a graph model rather than on terminology.
•
Material on bipartite graphs and their applications has been expanded.
•
x
Preface
Exercises and Examples •
Many new routine exercises and examples have been added throughout, especially at spots where key con cepts are introduced.
•
Extra effort has been made to ensure that both odd numbered and even-numbered exercises are provided for basic concepts and skills.
•
A better correspondence has been made between examples introducing key concepts and routine exercises.
•
Many new challenging exercises have been added.
•
Over 400 new exercises have been added, with more on key concepts, as well as more introducing new topics.
Additional Biographies, Historical Notes, and New Discoveries • •
Biographies have been added for Archimedes, Hopper, Stirling, and Bayes. Many biographies found in the previous edition have been enhanced, including the biography of Augusta Ada.
The MathZone Companion Website
•
The historical notes included in the main body of the book and in the footnotes have been enhanced.
•
New discoveries made since the publication ofthe pre vious edition have been noted.
(www.mhhe.comlrosen)
•
MathZone course management and online tutorial sys tem now provides homework and testing questions tied directly to the text.
•
Expanded annotated links to hundreds of Internet resources have been added to the Web Resources Guide.
•
Additional Extra Examples are now hosted online, covering all chapters of the book. These Extra Examples have benefited from user review and feedback.
•
Additional Self Assessments of key topics have been added, with 1 4 core topics now addressed.
•
Existing Interactive Demonstration Applets support ing key algorithms are improved. Additional applets have also been developed and additional explanations are given for integrating them with the text and in the classroom.
•
An updated and expanded Exploring Discrete Mathe matics with Maple companion workbook is also hosted online.
Special Features This text has proved to be easily read and understood by beginning students. There are no mathematical prerequisites beyond college algebra for almost all of this text. Students needing extra help will find tools on the MathZone companion website for bringing their mathematical maturity up to the level of the text. The few places in the book where calculus is referred to are explicitly noted. Most students should easily understand the pseudocode used in the text to express algorithms, regardless ofwhether they have formally studied programming languages. There is no formal computer science prerequisite. Each chapter begins at an easily understood and accessible level. Once basic mathematical concepts have been carefully developed, more difficult material and applications to other areas of study are presented. ACCESSIBILITY
This text has been carefully designed for flexible use. The dependence of chapters on previous material has been minimized. Each chapter is divided into sections of approximately the same length, and each section is divided into subsections that form natural blocks of material for teaching. Instructors can easily pace their lectures using these blocks.
FLEXIBILITY
Preface xi WRITING STYLE The writing style in this book is direct and pragmatic. Precise mathe matical language is used without excessive formalism and abstraction. Care has been taken to balance the mix of notation and words in mathematical statements. MATHEMATICAL RIGOR AND PRECISION All definitions and theorems in this text are stated extremely carefully so that students will appreciate the precision of language and rigor needed in mathematics. Proofs are motivated and developed slowly; their steps are all carefully justified. The axioms used in proofs and the basic properties that follow from them are explicitly described in an appendix, giving students a clear idea of what they can assume in a proof. Recursive definitions are explained and used extensively. WORKED EXAMPLES Over 750 examples are used to illustrate concepts, relate different topics, and introduce applications. In most examples, a question is first posed, then its solution is presented with the appropriate amount of detail. APPLICATIONS The applications included in this text demonstrate the utility of discrete mathematics in the solution of real-world problems. This text includes applications to a wide va riety of areas, including computer science, data networking, psychology, chemistry, engineering, linguistics, biology, business, and the Internet. ALGORITHMS Results in discrete mathematics are often expressed in terms of algo rithms; hence, key algorithms are introduced in each chapter of the book. These algorithms are expressed in words and in an easily understood form of structured pseudocode, which is described and specified in Appendix A.3. The computational complexity of the algorithms in the text is also analyzed at an elementary level. HISTORICAL INFORMATION The background of many topics is succinctly described in the text. Brief biographies of more than 65 mathematicians and computer scientists, accom panied by photos or images, are included as footnotes. These biographies include information about the lives, careers, and accomplishments of these important contributors to discrete mathe matics and images of these contributors are displayed. In addition, numerous historical footnotes are included that supplement the historical information in the main body of the text. Efforts have been made to keep the book up-to-date by reflecting the latest discoveries. KEY TERMS AND RESULTS A list of key terms and results follows each chapter. The key terms include only the most important that students should learn, not every term defined in the chapter. EXERCISES There are over 3800 exercises in the text, with many different types of ques tions posed. There is an ample supply of straightforward exercises that develop basic skills, a large number of intermediate exercises, and many challenging exercises. Exercises are stated clearly and unambiguously, and all are carefully graded for level of difficulty. Exercise sets contain special discussions that develop new concepts not covered in the text, enabling students to discover new ideas through their own work. Exercises that are somewhat more difficult than average are marked with a single star *; those that are much more challenging are marked with two stars **. Exercises whose solutions require calculus are explicitly noted. Exercises that develop results used in the text are clearly identified with the symbol G>. Answers or outlined solutions to all odd-numbered exercises are provided at the back of the text. The solutions include proofs in which most of the steps are clearly spelled out. REVIEW QUESTIONS A set of review questions is provided at the end of each chapter. These questions are designed to help students focus their study on the most important concepts
Iii
Preface
and techniques of that chapter. To answer these questions students need to write long answers, rather than just perform calculations or give short replies. SUPPLEMENTARY EXERCISE SETS Each chapter is followed by a rich and varied set of supplementary exercises. These exercises are generally more difficult than those in the exercise sets following the sections. The supplementary exercises reinforce the concepts of the chapter and integrate different topics more effectively. COMPUTER PROJECTS Each chapter is followed by a set of computer projects. The approximately 1 50 computer projects tie together what students may have learned in computing and in discrete mathematics. Computer projects that are more difficult than average, from both a mathematical and a programming point of view, are marked with a star, and those that are extremely challenging are marked with two stars. COMPUTATIONS AND EXPLORATIONS A set of computations and explorations is included at the conclusion of each chapter. These exercises (approximately 1 00 in total) are designed to be completed using existing software tools, such as programs that students or instructors have written or mathematical computation packages such as Maple or Mathematica. Many of these exercises give students the opportunity to uncover new facts and ideas through computation. (Some of these exercises are discussed in the Exploring Discrete Mathematics with Maple companion workbook available online.) WRITING PROJECTS Each chapter is followed by a set of writing projects. To do these projects students need to consult the mathematical literature. Some of these projects are historical in nature and may involve looking up original sources. Others are designed to serve as gateways to new topics and ideas. All are designed to expose students to ideas not covered in depth in the text. These projects tie mathematical concepts together with the writing process and help expose students to possible areas for future study. (Suggested references for these projects can be found online or in the printed Student's Solutions Guide.) APPENDIXES There are three appendixes to the text. The first introduces axioms for real numbers and the integers, and illustrates how facts are proved directly from these axioms. The second covers exponential and logarithmic functions, reviewing some basic material used heavily in the course. The third specifies the pseudocode used to describe algorithms in this text. SUGGESTED READINGS A list of suggested readings for each chapter is provided in a section at the end of the text. These suggested readings include books at or below the level of this text, more difficult books, expository articles, and articles in which discoveries in discrete mathematics were originally published. Some of these publications are classics, published many years ago, while others have been published within the last few years.
How to Use This Book This text has been carefully written and constructed to support discrete mathematics courses at several levels and with differing foci. The following table identifies the core and optional sections. An introductory one-term course in discrete mathematics at the sophomore level can be based on the core sections of the text, with other sections covered at the discretion of the instructor. A two-term introductory course can include all the optional mathematics sections in addition to the core sections. A course with a strong computer science emphasis can be taught by covering some or all of the optional computer science sections.
Preface xiii Optional Computer Science Sections
Chapter
Core Sections
1 2 3 4 5 6 7 8 9 10 11 12
1 . 1-1 .7 (as needed) 2 . 1-2 .4 (as needed) 3 . 1-3.5, 3 . 8 (as needed) 4. 1--4.3 5 . 1-5.3 6. 1 7. 1 , 7.5 8. 1 , 8.3, 8.5 9. 1-9.5 1 0. 1
Optional Mathematics Sections
3.6 4.4, 4.5 5.6 6.4 7.3 8.2
3 .7 5 .4, 5.5 6.2, 6.3 7.2, 7.4, 7.6 8.4, 8.6 9.6-9.8 1 0.4, 1 0.5
1 0.2, 1 0.3 1 1 . 1-1 1 .4 1 2 . 1- 1 2 . 5
Instructors using this book can adjust the level o f difficulty o f their course b y choosing either to cover or to omit the more challenging examples at the end of sections, as well as the more challenging exercises. The dependence of chapters on earlier chapters is shown in the following chart. Chapter I
I
Chapter 2
I
Chapter 3
I
Chapter 4
I
Chapter 5
---� I� 8 9 11
Chapler 6
Chapler 7
Chapler
Chapter
Chapter
Chapter 12
I
Chapter 10
Ancillaries STUDENT'S SOLUTIONS GUIDE This student manual, available separately, contains full solutions to all odd-numbered problems in the exercise sets. These solutions explain why a particular method is used and why it works. For some exercises, one or two other possible approaches are described to show that a problem can be solved in several different ways. Suggested references for the writing projects found at the end of each chapter are also included in this volume. Also included are a guide to writing proofs and an extensive description of common mistakes students make in discrete mathematics, plus sample tests and a sample crib sheet for each chapter designed to help students prepare for exams.
(ISBN-lO: 0-07-310779-4)
(ISBN-J3: 978-0-07-310779-0)
INSTRUCTOR'S RESOURCE GUIDE This manual, available by request for instructors, contains full solutions to even-numbered exercises in the text. Suggestions on how to teach the material in each chapter of the book are provided, including the points to stress in each section and how to put the material into perspective. It also offers sample tests for each chapter and a test
xiv
Preface bank containing over 1 300 exam questions to choose from. Answers to all sample tests and test bank questions are included. Finally, several sample syllabi are presented for courses with differ ing emphasis and student ability levels, and a complete section and exercise migration guide is included to help users of the fifth edition update their course materials to match the sixth edition. (ISBN-10: 0-07-310781-6)
(ISBN-13: 978-0-07-310781-3)
INSTRUCTOR'S TESTING AND RESOURCE CD An extensive test bank of more than 1 300 questions using Brownstone Diploma testing software is available by request for use on Windows or Macintosh systems. Instructors can use this software to create their own tests by selecting questions of their choice or by random selection. They can also sort questions by section, difficulty level, and type; edit existing questions or add their own; add their own headings and instructions; print scrambled versions of the same test; export tests to word processors or the Web; and create and manage course grade books. A printed version of this test bank, including the questions and their answers, is included in the Instructor s Resource
Guide.
(ISBN-1O: 0-07-310782-4)
(ISBN-13: 978-0-07-310782-0)
Acknowledgments I would like to thank the many instructors and students at a variety of schools who have used this book and provided me with their valuable feedback and helpful suggestions. Their input has made this a much better book than it would have been otherwise. I especially want to thank Jerrold Grossman, John Michaels, and George Bergman for their technical reviews of the sixth edition and their "eagle eyes," which have helped ensure the accuracy of this book. I also appreciate the help provided by all those who have submitted comments via the website. I thank the reviewers of this sixth and the five previous editions. These reviewers have provided much helpful criticism and encouragement to me. I hope this edition lives up to their high expectations. Reviewers for the Sixth Edition
Charles Ashbacher, Mount Mercy College
Greg Cameron, Brigham Young University
Beverly Diamond, College o/Charleston
Ian Barlami, Rice University
John Carter, University o/ Toronto
Thomas Dunion, Atlantic Union College
George Bergman, University o/California, Berkeley
Greg Chapman, Cosumnes River College
Bruce Elenbogen, University 0/Michigan, Dearborn
David Berman, University 0/ North Carolina, Wilmington
Chao-Kun Cheng, Virginia Commonwealth University
Herbert Enderton, University o/ California, Los Angeles
Miklos Bona, University 0/ Florida
Thomas Cooper, Georgia Perimeter College, Lawrenceville
Anthony Evans, Wright State University
Prosenjit Bose, Carleton University
Barbara Cortzen, DePaul University
Kim Factor, Marquette University
Kirby Brown,
Daniel Cunningham , Buffalo State College
William Farmer, McMaster University
George Davis, University 0/ Georgia
Li Feng, Albany State University
Polytechnic
University
Michael Button,
The Master s College
Preface xv Stephanie Fitchett, Florida Atlantic University
Jonathan Knappenberger, LaSalle University
Henry Ricardo, Medgar Evers College
Jerry Fluharty, Eastern Shore Community College
EdKorntved, Northwest Nazarene University
Oskars Rieksts, Kutztown University
Joel Fowler, Southern Polytechnic State University
Przemo Kranz, University ofMississippi
Stefan Robila, Montclair State University
Loredana Lanzani, University ofArkansas
Chris Rodger, Auburn University
Yoonjin Lee, Smith College
Robert Rodman, North Carolina State University
Miguel Lerma, Northwestern University
Shai Simonson, Stonehill College
Jason Levy, University ofHawaii
Barbara Smith, Cochise College
Lauren Lilly, Tabor College
Wasin So, San Jose State University
Ekaterina Lioutikova, Saint Joseph College
Diana Staats, Dutchess Community College
V ladimir Logvinenko, De Anza College
Lorna Stewart, University ofAlberta
Joan Lukas, University ofMassachusetts, Boston
Bogdan Suceava, California State University, Fullerton
William Gasarch, University ofMaryland Sudpito Ghosh, Colorado State University Jerrold Grossman, Oakland University Ruth Haas, Smith College Patricia Hammer, Hollins University Keith Harrow, Brooklyn College Bert Hartnell, St. Mary s University Julia Hassett, Oakton College Kavita Hatwal, Portland Community College John Helm, Radford University Arthur Hobbs, Texas A&M University Fannie Howell, Roanoke High School Wei Hu, Houghton College Nan Jiang, University ofSouth Dakota Margaret Johnson, Stariford University Martin Jones, College of Charleston
Lester McCann, University ofArizona Jennifer McNulty, University ofMontana John G. Michaels, SUNY Brockport Michael Oppedisano, Morrisville State College Michael O'Sullivan, San Diego State University Charles Parry, Virginia Polytechnic Institute Linda Powers, Virginia Polytechnic Institute Dan Pritikin, Miami University
Kathleen Sullivan, Seattle University Laszlo Szekely, University ofSouth Carolina Daniel Tauritz, University ofMissouri, Rolla Don VanderJagt, Grand Valley State University Fran Vasko, Kutztown University Susan Wallace, University ofNorth Florida Zhenyuan Wang, University ofNebraska, Omaha
Anthony Quas, University ofMemphis
Tom Wineinger, University of Wisconsin, Eau Claire
Eric Rawdon, Duquesne University
Charlotte Young, South Plains College
Eric Allender, Rutgers University
Zhaojun Bai, University ofCalifornia, Davis
Alfred E. Borm, Southwest Texas State University
Stephen Andrilli, La Salle University
Jack R. Barone, Baruch College
Kendall Atkinson, University ofIowa, Iowa City
Klaus Bichteler, University of Texas, Austin
TimKearns, California Polytechnic State University, San Luis Obispo Reviewers of Previous Editions
Ken W Bosworth, University ofMaryland Lois Brady, California Polytechnic State University, San Luis Obispo
xvi
Preface
Scott Buffett, University ofNew Brunswick
David Jonah, Wayne State University
HoKuen Ng, San Jose State University
Russell Campbell, University ofNorthern Iowa
Akihiro Kanamori, Boston University
Timothy S. Norfolk, Sr. University ofAkron
E. Rodney Canfield, University of Georgia
W ThomasKiley, George Mason University
Truc T. Nguyen, Bowling Green State University
Kevin Carolan, Marist College
TakashiKimura, Boston University
George Novacky, University ofPittsburgh
Tim Carroll, Bloomsburg University
NancyKinnersley, University ofKansas
Jeffrey Nunemacher, Ohio Wesleyan University
Kit C. Chan, Bowling Green State University
Gary Klatt, University of Wisconsin
Allan C. Cochran, University ofArkansas
NicholasKrier, Colorado State University
Peter Collinge, Monroe Community College
Lawrence S. Kroll, San Francisco State University
Ron Davis, Millersville University
Shui F. Lam, California State University, Long Beach
Jaroslav Opatmy, Concordia University Jonathan Pakianathan, University ofRochester Thomas W Parsons, Hofstra University Mary K. Prisco, University of Wisconsin, Green Bay Halina Przymusinska, California Polytechnic State University, Pomona
Nachum Dershowitz, University ofIllinois, Urbana-Champaign
Robert Lavelle, lona College
Thomas Dowling, The Ohio State University
Harbir Lamba, George Mason University
Patrick C. Fischer, Vanderbilt University
Sheau Dong Lang, University of Central Florida
Jane Fritz, University ofNew Brunswick
Cary Lee, Grossmont Community College
Ladnor Geissinger, University ofNorth Carolina
Yi- Hsin Liu, University ofNebraska, Omaha
Jonathan Goldstine, Pennsylvania State University
Stephen C. Locke, Florida Atlantic University
Paul Gormley, Villanova University
George Luger, University ofNew Mexico
Brian Gray, Howard Community College
David S. McAllister, North Carolina State University
Jonathan Gross, Columbia University
Robert McGuigan, Wesifield State College
Laxmi N. Gupta, Rochester Institute of Technology
Michael Maller, Queens College
Michael 1. Schlosser, The Ohio State University
Daniel Gusfield, University of California, Davis
Ernie Manes, University ofMassachusetts
Steven R. Seidel, Michigan Technological University
David F. Hayes, San Jose State University
Francis Masat, Glassboro State College
Douglas Shier, Clemson University
Xin He, SUNY at Buffalo
1.
Alistair Sinclair, University of California, Berkeley
Donald Hutchison, Clackamas Community College
Thomas D. Morley, Georgia Institute of Technology
Carl H. Smith, University ofMaryland
Kenneth Johnson, North Dakota State University
D. R. Morrison, University ofNew Mexico
Hunter S. Snevily, University ofIdaho
M. Metzger, University ofNorth Dakota
Don Reichman, Mercer County Community College Harold Reiter, University ofNorth Carolina Adrian Riskin, Northern Arizona State University Amy L. Rocha, San Jose State University Janet Roll, University ofFindlay Matthew 1. Saltzman, Clemson University Alyssa Sankey, Slippery Rock University Dinesh Sarvate, College of Charleston
Preface
xvii
Daniel Somerville, University ofMassachusetts, Boston
Philip D. Tiu, Oglethorpe University
Thomas Upson, Rochester Institute of Technology
Patrick Tantalo, University of California, Santa Cruz
Lisa Townsley-Kulich, Illinois Benedictine College
Roman Voronka, New Jersey Institute of Technology
Bharti Temkin, Texas Tech University
George Trapp, West Virginia University
James Walker, University ofSouth Carolina
Wallace Terwilligen, Bowling Green State University
David S. Tucker, Midwestern State University
Anthony S. Wojcik, Michigan State University
I would like to thank the staff of McGraw-Hill for their strong support of this project. In particular, thanks go to LizHaefele, Publisher, for her backing; to Liz Covello, Senior Sponsoring Editor, for her advocacy and enthusiasm; and to Dan Seibert, Developmental Editor, for his dedication and attention. I would also like to thank the original editor, Wayne Yuhasz, whose insights and skills helped ensure the book's success, as well as all the other previous editors of this book. To the staff involved in the production of this sixth edition, I offer my thanks to Peggy Selle, Lead Project Manager; Michelle W hitaker, Senior Designer; JeffHuettman, Lead Media Producer; Sandy Schnee, Senior Media Project Manager; Melissa Leick, Supplements Coordi nator; and Kelly Brown, Executive Marketing Manager. I am also grateful to Jerry Grossman, who checked the entire manuscript for accuracy; Rose Kramer and Gayle Casel, the technical proofreaders; Pat Steele, the manuscript copyeditor; and Georgia Mederer, who checked the accuracy of the solutions in the Student's Solutions Guide and Instructor's Resource Guide.
Kenneth H. Rosen
The MathZone Companion Website MathZon�
'",
T he extensive companion website accompanying this text has been substantially enhanced lL for the sixth edition and is now integrated with MathZone, McGraw- Hill 's customizable Web-based course management system. Instructors teaching from the text can now use Math Zone to administer online homework, quizzing, testing, and student assessment through sets of questions tied to the text which are automatical ly graded and accessible in an integrated, exportable grade book. MathZone also provides a wide variety of interactive student tutorials, including practice problems with step-by-step solutions; Interactive Demonstration applets; and NetTutor, a live, personalized tutoring service. This website is accessible at www.mathzone . com by choosing thi s text off the pull-down menu, or by navigating to www.mhhe. com/rosen. The homepage shows the Information Center, and contains login links for the site 's Student Edition and Instructor Edition. Key features of each area are described below:
T H E I N FO R M AT I O N C E NT E R The Information Center contains basic information about the book including the Table of Contents, Preface, descriptions of the Ancil laries, and a sample chapter. Instructors can also learn more about MathZone 's key features, and learn about custom publishing options for the book. M ATHZO N E STUDENT E DI T I ON The Student Edition contains a wealth of resources available for student use, including the following, tied into the text wherever the special icons below are shown in the text: a
E xtra Exam ples You can find a large number of additional examples on the site, covering all chapters of the book. These examples are concentrated in areas where students often ask for additional material. Although most of these examples amplify the basic concepts, more-challenging examples can also be found here.
a
I n teractive Demon stration Applets These applets enable you to interactively explore how important algorithms work, and are tied directly to material in the text such as examples and exercises. Additional resources are provided on how to use and apply these applets.
a
Self Assessments These interactive guides help you assess your understanding of 1 4 key concepts, providing a question bank where each question includes a brief tutorial followed by a multiple-choice question. If you select an incorrect answer, advice is provided to help you understand your error. Using these Self Assessments, you should be able to diagnose your problems and focus on available remedies.
a
Web Resources G uide This guide provides annotated links to hundreds of external websites containing relevant material such as historical and biographical information, puzzles and problems, discussions, applets, programs, and more . These links are keyed to the text by page number. Additional resources in the MathZone Student Edition include :
a
xviii
Exploring Discrete Mathematics with Maple This ancillary provides help for using the Maple computer algebra system to do a wide range of computations in discrete mathematics. Each chapter provides a description of relevant Maple functions and how they are used, Mapl e programs to carry out computations in di screte mathematics, examples, and exercises that can be worked using Maple.
The MathZone Companion Website
xix
•
Applications of Discrete Mathematics
•
A Guide to Proof- Writing
•
Common Mistakes in Discrete Mathematics This guide includes a detailed list of com mon misconceptions that students of discrete mathematics often have and the kinds of errors they tend to make. You are encouraged to review this list from time to time to help avoid these common traps. (Also available in the Student's Solutions Guide.)
•
Advice on Writing Projects
•
NetTutor™
This ancillary contains 24 chapters--each with its own set of exercises-presenting a wide variety of interesting and important applications covering three general areas in discrete mathematics: discrete structures, combinatorics, and graph theory. These applications are ideal for supplementing the text or for independent study. This guide provides additional help for writing proofs, a skill that many students find difficult to master. By reading this guide at the beginning of the course and periodically thereafter when proof writing is required, you will be rewarded as your proof-writing ability grows. (Also available in the Student's Solutions Guide.)
This guide offers helpful hints and suggestions for the W rit ing Projects in the text, including an extensive bibliography of helpful books and articles for research; discussion of various resources available in print and online; tips on doing library re search; and suggestions on how to write well. (Also available in the Student's Solutions Guide.)
NetTutor is a live, Web-based service that provides online tutorial help. Students can ask questions relating to this text and receive answers in real- time during regular, scheduled hours or ask questions and later receive answers.
This part of the website provides access to all of the resources in the MathZone Student Edition, as well as the following resources for instructors:
MATHZONE INSTRUCTOR EDITION •
Course Management Using MathZone's course management capabilities instructors can create and share courses and assignments with colleagues; automatically assign and grade homework, quiz, and test questions from a bank of questions tied directly to the text; create and edit their own questions or import their own course content; manage course announcements and due dates; and track student progress.
•
Instructor's Resource Library
•
Suggested Syllabi
•
Migration Guide
•
Teaching Suggestions This guide contains detailed teaching suggestions for instructors, including chapter overviews for the entire text, detailed remarks on each section, and com ments on the exercise sets.
•
Printable Tests
Printable tests are offered in Word and PDF format for every chapter, and are ideal for in-class exams. These tests were created from the Brownstone Diploma electronic test bank available separately by request on the Instructor's Testing a nd Resource CD.
•
Image Bank PowerPoints Downloadable PowerPoints files are offered containing images of all Figures and Tables from the text. These image banks are ideal for preparing instructor class materials customized to the text.
Collected here are a wide variety of additional instructor teaching resources, created by our wide community of users. These resources include lecture notes, classroom presentation materials such as transparency masters and PowerPoint slides, and supplementary reading and assignments. We will periodically update and add to this library, and encourage users to contact us with their materials and ideas for possible future additions. Several detailed course outlines are shown, offering suggestions for courses with different emphases and different student backgrounds and ability levels.
This guide provides instructors with a complete correlation of all sections and exercises from the fifth edition to the sixth edition, and is very useful for instructors migrating their syllabi to the sixth edition.
To the Student
W
hat is discrete mathematics? Discrete mathematics is the part of mathematics devoted to the study of discrete objects. (Here discrete means consisting of distinct or unconnected elements.) The kinds of problems solved using discrete mathematics include: a
How many ways are there to choose a valid password on a computer system?
a
W hat is the probability of winning a lottery?
a
Is there a link between two computers in a network?
a
How can I identify spam e- mail messages?
a
How can I encrypt a message so that no unintended recipient can read it?
a
W hat is the shortest path between two cities using a transportation system?
a
How can a list of integers be sorted so that the integers are in increasing order?
a
How many steps are required to do such a sorting?
a
How can it be proved that a sorting algorithm correctly sorts a list?
a
How can a circuit that adds two integers be designed?
a
How many valid Internet addresses are there?
You wi11 learn the discrete structures and techniques needed to solve problems such as these. More generally, discrete mathematics is used whenever objects are counted, when relation ships between finite (or countable) sets are studied, and when processes involving a finite number of steps are analyzed. A key reason for the growth in the importance of discrete mathematics is that information is stored and manipulated by computing machines in a discrete fashion. There are several important reasons for studying discrete mathematics. First, through this course you can develop your mathematical maturity : that is, your ability to understand and create mathematical arguments. You will not get very far in your studies in the mathematical sciences without these skills. Second, discrete mathematics is the gateway to more advanced courses in all parts of the mathematical sciences. Discrete mathematics provides the mathematical foundations for many computer science courses including data structures, algorithms, database theory, automata theory, formal languages, compiler theory, computer security, and operating systems. Students find these courses much more difficult when they have not had the appropriate mathematical foundations from discrete math. One student has sent me an e- mail message saying that she used the contents of this book in every computer science course she took ! Math courses based on the material studied in discrete mathematics include logic, set theory, number theory, linear algebra, abstract algebra, combinatorics, graph theory, and probability theory (the discrete part of the subject). Also, discrete mathematics contains the necessary mathematical background for solving problems in operations research (including many discrete optimization techniques), chemistry, engineering, biology, and so on. In the text, we will study applications to some of these areas. Many students find their introductory discrete mathematics course to be significantly more challenging than courses they have previously taken. One reason for this is that one of the primary goals of this course is to teach mathematical reasoning and problem solving, rather than a discrete set of skills. The exercises in this book are designed to reflect this goal. Although there are plenty of exercises in this text similar to those addressed in the examples, a large
WHY STUDY DISCRETE MATHEMATICS?
xx
To the Student
xxi
percentage of the exercises require original thought. This is intentional. The material discussed in the text provides the tools needed to solve these exercises, but your job is to successfully apply these tools using your own creativity. One of the primary goals of this course is to learn how to attack problems that may be somewhat different from any you may have previously seen. Unfortunately, learning how to solve only particular types of exercises is not sufficient for success in developing the problem-solving skills needed in subsequent courses and professional work. This text addresses many different topics, but discrete mathematics is an extremely diverse and large area of study. One of my goals as an author is to help you develop the skills needed to master the additional material you will need in your own future pursuits. I would like to offer some advice about how you can best learn discrete mathematics (and other subjects in the mathematical and computing sciences). You will learn the most by actively working exercises. I suggest that you solve as many as you possibly can. After working the exercises your instructor has assigned, I encourage you to solve additional exercises such as those in the exercise sets following each section of the text and in the supplementary exercises at the end of each chapter. (Note the key explaining the markings preceding exercises.)
THE EXERCISES
Key to the E xercises No marking
A routine exercise
*
A difficult exercise
**
An extremely challenging exercise An exercise containing a result used in the book (The Table below shows where each of these exercises are used.)
(Requires calculus)
An exercise whose solution requires the use of limits or concepts from differential or integral calculus
Hand-Icon Exercises and Where They Are Used Section
Exercise
Section Where Used
Page Where Used
1 .2 1 .6 2.3 3.1 3.2 3.7 4. 1 4.2 4.3 4.3 5.4 5.4 6.2 8.1 9.4 10. 1 10. 1 10. 1
42
1 1 .2 1 .6 2.4 3.1 10.2 6.2 4. 1 4.2 4.3 4.3
758 81 158 1 74 700 412 270 287 296 297 414 428 414 545 685 688 693 700 763 477
I Ll
A.2
81 75a 43 62 30 79 28 56 57 17 21 15 24 49 15 30 48 12 4
6.2 6.4 6.2 8.4 10. 1 10. 1 10. 1 10.2 1 1 .3 7.3
xxii
To the Student The best approach is to try exercises yourself before you consult the answer section at the end of this book. Note that the odd- numbered exercise answers provided in the text are answers only and not full solutions; in particular, the reasoning required to obtain answers is omitted in these answers. The Student 50 Solutions Guide, available separately, provides complete, worked solutions to all odd- numbered exercises in this text. When you hit an impasse trying to solve an odd- numbered exercise, I suggest you consult the Student 50 Solutions Guide and look for some guidance as to how to solve the problem. The more work you do yourself rather than passively reading or copying solutions, the more you will learn. The answers and solutions to the even numbered exercises are intentionally not available from the publisher; ask your instructor if you have trouble with these. You are strongly encouraged to take advantage of additional re sources available on the Web, especially those on the MathZone companion website for this book found at www.mhhe.com/rosen. You will find many Extra Examples designed to clar ify key concepts; Self Assessments for gauging how well you understand core topics; Interactive Demonstration Applets exploring key algorithms and other concepts; a Web Resources Guide containing an extensive selection of links to external sites relevant to the world of discrete mathematics; extra explanations and practice to help you master core concepts; added instruc tion on writing proofs and on avoiding common mistakes in discrete mathematics; in- depth discussions of important applications; and guidance on utilizing Maple software to explore the computational aspects of discrete mathematics. Places in the text where these additional online resources are available are identified in the margins by special icons. You will also find NetTutor, an online tutorial service that you can use to receive help from tutors either via real- time chat or via messages. For more details on these online resources, see the description of the MathZone companion website immediately preceding this "To the Student" message.
WEB RESOURCES
THE VALUE OF THIS BOOK My intention is to make your investment in this text an excellent value. The book, the associated ancillaries, and MathZone companion website have taken many years of effort to develop and refine. I am confident that most of you will find that the text and associated materials will help you master discrete mathematics. Even though it is likely that you will not cover some chapters in your current course, you should find it helpful as many other students have-to read the relevant sections of the book as you take additional courses. Most of you will return to this book as a useful tool throughout your future studies, especially for those of you who continue in computer science, mathematics, and engineering. I have designed this book to be a gateway for future studies and explorations, and I wish you luck as you begin your journey.
Kenneth H. Rosen
C H A P T E R
The Foundations : Logic and Proofs 1.1 Propositional Logic
1.2 Propositional Equivalences
1.3 Predicates and Quantifiers
1.4 Nested
Quantifiers
1.5 Rules of
Inference
1.6 Introduction to Proofs
1.7 Proof Methods and Strategy
1.1
T he rules of logic specify the meaning of mathematical statements. For instance, these rules 11. help us understand and reason with statements such as "There exists an integer that is not
the sum of two squares" and "For every positive integer n , the sum of the positive integers not exceeding n is n (n + 1 )/2." Logic is the basis of all mathematical reasoning, and of all automated reasoning. It has practical applications to the design of computing machines, to the specification of systems, to artificial intelligence, to computer programming, to programming languages, and to other areas of computer science, as well as to many other fields of study. To understand mathematics, we must understand what makes up a correct mathematical argument, that is, a proof. Once we prove a mathematical statement is true, we call it a theorem. A collection oftheorems on a topic organize what we know about this topic. To learn a mathematical topic, a person needs to actively construct mathematical arguments on this topic, and not j ust read exposition. Moreover, because knowing the proof of a theorem often makes it possible to modify the result to fit new situations, proofs play an essential role in the development of new ideas. Students of computer science often find it surprising how important proofs are in computer science. In fact, proofs play essential roles when we verify that computer programs produce the correct output for all possible input values, when we show that algorithms always produce the correct result, when we establish the security of a system, and when we create artificial intelligence. Automated reasoning systems have been constructed that allow computers to construct their own proofs. In this chapter, we will explain what makes up a correct mathematical argument and intro duce tools to construct these arguments. We will develop an arsenal of different proof methods that will enable us to prove many different types of results. After introducing many different methods of proof, we will introduce some strategy for constructing proofs. We will introduce the notion of a conjecture and explain the process of developing mathematics by studying conj ectures.
Propositional Lo g ic Introduction The rules of logic give precise meaning to mathematical statements. These rules are used to distinguish between valid and invalid mathematical arguments. Because a maj or goal ofthis book is to teach the reader how to understand and how to construct correct mathematical arguments, we begin our study of discrete mathematics with an introduction to logic. In addition to its importance in understanding mathematical reasoning, logic has numerous applications in computer science. These rules are used in the design of computer circuits, the construction of computer programs, the verification of the correctness of programs, and in many other ways. Furthermore, software systems have been developed for constructing proofs automatically. We will discuss these applications of logic in the upcoming chapters.
Pro p ositions
I-I
Our discussion begins with an introduction to the basic building blocks of logic-propositions. A proposition is a declarative sentence (that is, a sentence that declares a fact) that is either true or false, but not both.
2
1 -2
1 / The Foundations: Logic and Proofs
EXAMPLE l
Exam= �
All the following declarative sentences are propositions.
1 . Washington, D.C., is the capital of the United States of America. 2 . Toronto is the capital of Canada. 3. 1 + 1 = 2 . 4. 2 + 2 = 3 . Propositions
1
and
3
are true, whereas
2
and 4 are false.
Some sentences that are not propositions are given in Example
EXAMPLE 2
2.
Consider the following sentences.
1. 2. 3. 4.
What time is it? Read this carefully.
x + 1 = 2. x + y = Z.
Sentences 1 and 2 are not propositions because they are not declarative sentences. Sentences 3 and 4 are not propositions because they are neither true nor false. Note that each of sentences 3 and 4 can be turned into a proposition if we assign values to the variables. We will also discuss � other ways to turn sentences such as these into propositions in Section 1 .3 . We use letters to denote propositional variables (or statement variables), that is, variables that represent propositions, just as letters are used to denote numerical variables. The conven tional letters used for propositional variables are p , q , r, S , . The truth value of a proposition is true, denoted by T, if it is a true proposition and false, denoted by F, if it is a false proposition. The area of logic that deals with propositions is called the propositional calculus or propo sitional logic. It was first developed systematically by the Greek philosopher Aristotle more than 2 300 years ago. .
.
.
unkS � ARISTOTLE (384 B.C.E.-322 B.C.E.) Aristotle was born in Stagirus (Stagira) in northern Greece. His father was the personal physician of the King of Macedonia. Because his father died when Aristotle was young, Aristotle could not follow the custom of following his father's profession. Aristotle became an orphan at a young age when his mother also died. His guardian who raised him taught him poetry, rhetoric, and Greek. At the age of 1 7, his guardian sent him to Athens to further his education. Aristotle joined Plato's Academy where for 20 years he attended Plato's lectures, later presenting his own lectures on rhetoric. When Plato died in 347 B.C. E., Aristotle was not chosen to succeed him because his views differed too much from those of Plato. Instead, Aristotle joined the court of King Hermeas where he remained for three years, and married the niece of the King. When the Persians defeated Hermeas, Aristotle moved to Mytilene and, at the invitation of King Philip of Macedonia, he tutored Alexander, Philip's son, who later became Alexander the Great. Aristotle tutored Alexander for five years and after the death of King Philip, he returned to Athens and set up his own school, called the Lyceum. Aristotle's followers were called the peripatetics, which means "to walk about;' because Aristotle often walked around as he discussed philosophical questions. Aristotle taught at the Lyceum for 1 3 years where he lectured to his advanced students in the morning and gave popular lectures to a broad audience in the evening. When Alexander the Great died in 323 B.C.E., a backlash against anything related to Alexander led to trumped-up charges of impiety against Aristotle. Aristotle fled to Chalcis to avoid prosecution. He only lived one year in Chalcis, dying of a stomach ailment in 322 B.C.E. Aristotle wrote three types of works: those written for a popular audience, compilations of scientific facts, and systematic treatises. The systematic treatises included works on logic, philosophy, psychology, physics, and natural history. Aristotle's writings were preserved by a student and were hidden in a vault where a wealthy book collector discovered them about 200 years later. They were taken to Rome, where they were studied by scholars and issued in new editions, preserving them for posterity.
1 . 1 Propositional Logic 3
1-3
unkS �
DEFINITION 1
We now turn our attention to methods for producing new propositions from those that we already have. These methods were discussed by the English mathematician George Boole in 1 854 in his book The Laws of Th ought. Many mathematical statements are constructed by combining one or more propositions. New propositions, called compound propositions, are formed from existing propositions using logical operators.
Let p be a proposition. The negation ofp, denoted by p (also denoted by p), is the statement .....
"It is not the case that p."
The proposition p is read "not p." The truth value of the negation of p, ..... p, is the opposite of the truth value of p . .....
EXAMPLE 3
Exam: �
Find the negation of the proposition "Today is Friday." and express this in simple English.
Solution: The negation is "It is not the case that today is Friday." This negation can be more simply expressed by "Today is not Friday," or "It is not Friday today."
EXAMPLE 4
Find the negation of the proposition "At least
1 0 inches of rain fell today in Miami."
and express this in simple English.
Solution: The negation is "It is not the case that at least
10
inches of rain fell today in Miami."
This negation can be more simply expressed by "Less than
TABLE 1 The Truth Table for the Negation of a Proposition. p
"" p
T
F
F
T
1 0 inches of rain fell today in Miami."
Remark: Strictly speaking, sentences involving variable times such as those in Examples 3 and
4 are not propositions unless a fixed time is assumed.
The same holds for variable places unless a fixed place is assumed and for pronouns unless a particular person is assumed. We will always assume fixed times, fixed places, and particular people in such sentences unless otherwise noted.
Table 1 displays the truth table for the negation of a proposition p . This table has a row for each of the two possible truth values of a proposition p . Each row shows the truth value of -'p corresponding to the truth value of p for this row.
4
1 / The Foundations: Logic and Proofs
/-4
The negation of a proposition can also be considered the result of the operation of the
negation operator on a proposition. The negation operator constructs a new proposition from a single existing proposition. We will now introduce the logical operators that are used to form new propositions from two or more existing propositions. These logical operators are also called
connectives.
DEFINITION 2
Let p and q be propositions. The conjunction of p and q , denoted by p /\ q, is the proposition " " p and q . The conjunction p /\ q is true when both p and q are true and is false otherwise .
Table 2 displays the truth table of p /\ q . This table has a row for each of the four possible combinations of truth values of p and q . The four rows correspond to the pairs of truth values TT, TF, FT, and FF, where the first truth value in the pair is the truth value of p and the second truth value is the truth value of q . Note that in logic the word "but" sometimes is used instead of "and" in a conjunction. For example, the statement "The sun is shining, but it is raining" is another way of saying "The sun is shining and it is raining." (In natural language, there is a subtle difference in meaning between "and" and "but"; we will not be concerned with this nuance here.)
EXAMPLE 5
Find the conjunction of the propositions p and q where p is the proposition "Today is Friday" and q is the proposition "It is raining today."
S olution: The conjunction of these propositions, p /\ q , is the proposition "Today is Friday and
it is raining today." This proposition is true on rainy Fridays and is false on any day that is not a � Friday and on Fridays when it does not rain.
DEFINITION 3
Let p and q be propositions. The disjunction of p and q , denoted by p V q , is the proposition " p or q . The disj unction p V q is false when both p and q are false and is true otherwise.
"
Table 3 displays the truth table for p V q . The use of the connective or in a disjunction corresponds to one of the two ways the word or is used in English, namely, in an inclusive way. A disjunction is true when at least one of the two propositions is true. For instance, the inclusive or is being used in the statement "Students who have taken calculus or computer science can take this class."
TABLE 2 The Truth Table for the Conjunction of Two Propositions. p
TABLE 3 The Truth Table for the Disjunction of Two Propositions.
q
p /\ q
p
q
pVq
T T T F
T
T
T
T
T
T
F
F
T
F
F
T
F
F
F
F
F
F
T F
1 . 1 Propositional Logic
1-5
5
Here, we mean that students who have taken both calculus and computer science can take the class, as well as the students who have taken only one of the two subjects. On the other hand, we are using the exclusive or when we say "Students who have taken calculus or computer science, but not both, can enroll in this class." Here, we mean that students who have taken both calculus and a computer science course cannot take the class. Only those who have taken exactly one of the two courses can take the class. Similarly, when a menu at a restaurant states, "Soup or salad comes with an entree," the restaurant almost always means that customers can have either soup or salad, but not both. Hence, this is an exclusive, rather than an inclusive, or.
EXAMPLE 6
What is the disjunction of the propositions p and q where p and q are the same propositions as in Example 5?
Solution: The disjunction o f p and q , p v q, i s the proposition "Today is Friday or it is raining today." This proposition is true on any day that is either a Friday or a rainy day (including rainy Fridays). .... It is only false on days that are not Fridays when it also does not rain. As was previously remarked, the use of the connective or in a disjunction corresponds to one of the two ways the word or is used in English, namely, in an inclusive way. Thus, a disjunction is true when at least one of the two propositions in it is true. Sometimes, we use or in an exclusive sense. When the exclusive or is used to connect the propositions p and q , the " proposition p or q (but not both)" is obtained. This proposition is true when p is true and q is false, and when p is false and q is true. It is false when both p and q are false and when both are true.
DEFINITION 4
Let p and q be propositions. The exclusive or of p and q , denoted by p E9 q , is the proposition that is true when exactly one of p and q is true and is false otherwise.
The truth table for the exclusive or of two propositions is displayed in Table 4.
unkS � GEORGE BOOLE ( 1 8 1 5-1 864) George Boole, the son of a cobbler, was born in Lincoln, England, in November 1 8 1 5 . Because of his family's difficult financial situation, Boole had to struggle to educate himself while supporting his family. Nevertheless, he became one of the most important mathematicians of the 1 800s. Although he considered a career as a clergyman, he decided instead to go into teaching and soon afterward opened a school of his own. In his preparation for teaching mathematics, Boole-unsatisfied with textbooks of his day-decided to read the works of the great mathematicians. While reading papers of the great French mathematician Lagrange, Boole made discoveries in the calculus of variations, the branch of analysis dealing with finding curves and surfaces optimizing certain parameters. In 1 848 Boole published The Mathematical Analysis a/Logic, the first of his contributions to symbolic logic. In 1 849 he was appointed professor of mathematics at Queen's College in Cork, Ireland. In 1 854 he published The Laws 0/ Thought, his most famous work. In this book, Boole introduced what is now called Boolean algebra in his honor. Boole wrote textbooks on differential equations and on difference equations that were used in Great Britain until the end of the nineteenth century. Boole married in 1 855; his wife was the niece of the professor of Greek at Queen's College. In 1 864 Boole died from pneumonia, which he contracted as a result of keeping a lecture engagement even though he was soaking wet from a rainstorm.
6
I / The Foundations:
Logic and Proofs
1-6
TABLE 4 The Truth Table for
TABLE 5 The Truth Table for the Conditional Statement p -- q.
the Exclusive Or of Two Propositions. p
q
p ffi q
p
q
p -+ q
T
T
T
F
T
T
T
F
T
F
F
F
T
T T
F
T
T
F
F
F
F
F
T
Conditional Statements We will discuss several other important ways in which propositions can be combined.
DEFINITION 5
Assessment i::J
Let p and q be propositions. The conditional statement p � q is the proposition "if p, then q ." The conditional statement p � q is false when p is true and q is false, and true otherwise. In the conditional statement p � q , p is called the hypothesis (or antecedent or premise) and q is called the conclusion (or consequence).
The statement p � q i s called a conditional statement because p � q asserts that q i s true on the condition that p holds. A conditional statement is also called an implication. The truth table for the conditional statement p � q is shown in Table 5. Note that the statement p � q is true when both p and q are true and when p is false (no matter what truth value q has). Because conditional statements play such an essential role in mathematical reasoning, a variety of terminology is used to express p � q . You will encounter most if not all of the following ways to express this conditional statement:
"
"if p, then q "if p, q " " " p i s sufficient for q "q if p " "q when p " " "a necessary condition for p is q " " q unless -'p
"p implies q " " p only if q"
"a sufficient condition for q is p
" " q whenever p " " q is necessary for p " " q follows from p
"
A useful way to understand the truth value of a conditional statement is to think of an obligation or a contract. For example, the pledge many politicians make when running for office is "If I am elected, then I will lower taxes." If the politician is elected, voters would expect this politician to lower taxes. Furthermore, if the politician is not elected, then voters will not have any expectation that this person will lower taxes, although the person may have sufficient influence to cause those in power to lower taxes. It is only when the politician is elected but does not lower taxes that voters can say that the politician has broken the campaign pledge. This last scenario corresponds to the case when p is true but q is false in p � q . Similarly, consider a statement that a professor might make: "If you get
1 00% on the final, then you will get an A."
1 . 1 Propositional Logic
/-7
7
If you manage to get a 1 00% on the final, then you would expect to receive an A. If you do not get 1 00% you may or may not receive an A depending on other factors. However, if you do get 1 00%, but the professor does not give you an A, you will feel cheated. " " " Many people find it confusing that p only if q expresses the same thing as "if p then q . " " To remember this, note that p only if q says that p cannot be true when q is not true. That is, the statement is false if p is true, but q is false. When p is false, q may be either true or false, because the statement says nothing about the truth value of q . A common error is for people to " " think that q only if p is a way of expressing p -* q . However, these statements have different truth values when p and q have different truth values. " The word "unless" is often used to express conditional statements. Observe that q unless " " " -.p means that if -'p is false, then q must be true. That is, the statement q unless -.p is false " " when p is true and q is false, but it is true otherwise. Consequently, q unless -.p and p -* q always have the same truth value. We illustrate the translation between conditional statements and English statements in Example 7.
EXAMPLE 7
Let p be the statement "Maria learns discrete mathematics" and q the statement "Maria will find a good job." Express the statement p -* q as a statement in English.
Exlra �
Solution: From the definition of conditional statements, we see that when p is the statement
ExampleS �
"Maria learns discrete mathematics" and q is the statement "Maria will find a good job," p represents the statement
-*
q
"If Maria learns discrete mathematics, then she will find a good job." There are many other ways to express this conditional statement in English. Among the most natural of these are: "Maria will find a good job when she learns discrete mathematics." "For Maria to get a good job, it is sufficient for her to learn discrete mathematics." and "Maria will find a good job unless she does not learn discrete mathematics." Note that the way we have defined conditional statements is more general than the meaning attached to such statements in the English language. For instance, the conditional statement in Example 7 and the statement "If it is sunny today, then we will go to the beach." are statements used in normal language where there is a relationship between the hypothesis and the conclusion. Further, the first of these statements is true unless Maria learns discrete mathematics, but she does not get a good job, and the second is true unless it is indeed sunny today, but we do not go to the beach. On the other hand, the statement "If today is Friday, then
2 + 3 = 5."
is true from the definition of a conditional statement, because its conclusion i s true. (The truth value of the hypothesis does not matter then.) The conditional statement "If today is Friday, then 2 +
3 = 6."
i s true every day except Friday, even though 2 + 3 = 6 i s false. We would not use these last two conditional statements in natural language (except perhaps in sarcasm), because there is no relationship between the hypothesis and the conclusion in either
8
I / The Foundations: Logic and Proofs
1-8
statement. In mathematical reasoning, we consider conditional statements of a more general sort than we use in English. The mathematical concept of a conditional statement is independent of a cause- and- effect relationship between hypothesis and conclusion. Our definition of a conditional statement specifies its truth values; it is not based on English usage. Propositional language is an artificial language; we only parallel English usage to make it easy to use and remember. The if-then construction used in many programming languages is different from that used in logic. Most programming languages contain statements such as if p then S, where p is a . proposition and S is a program segment (one or more statements to be executed). When execution of a program encounters such a statement, S is executed if p is true, but S is not executed if p is false, as illustrated in Example 8.
EXAMPLE 8
What is the value of the variable
if 2 + 2
x=
x after the statement
= 4 then x := x + 1
:= stands for assignment. The I to x .) Solution: Because 2 + 2 = 4 is true, the assignment statement x := x + I is executed. Hence, .... x has the value 0 + I = I after this statement is encountered. if 0 before this statement is encountered? (The symbol statement + 1 means the assignment of the value of +
x := x
x
CONVERSE, CONTRAPOSITIVE, AND INVERSE We can form some new conditional statements starting with a conditional statement p -+ q . In particular, there are three related conditional statements that occur so often that they have special names. The proposition q -+ p is called the converse of p -+ q . The contrapositive of p -+ q is the proposition -.q -+ -'p. The proposition -'p -+ -.q is called the inverse of p -+ q. We will see that of these three conditional statements formed from p -+ q , only the contrapositive always has the same truth value as p -+ q . .. We first show that the contrapositive, -.q -+ -'p, of a conditional statement p -+ q always has the same truth value as p -+ q . To see this, note that the contrapositive is false only when -'p is false and -.q is true, that is, only when p is true and q is false. We now show that neither the converse, q -+ p, nor the inverse, -'p -+ -'q , has the same truth value as p -+ q for all possible truth values of p and q . Note that when p is true and q is false, the original conditional statement is false, but the converse and the inverse are both true. When two compound propositions always have the same truth value we call them equivalent, so that a conditional statement and its contrapositive are equivalent. The converse and the inverse of a conditional statement are also equivalent, as the reader can verify, but neither is equivalent to the original conditional statement. (We will study equivalent propositions in Section 1 .2.) Take note that one of the most common logical errors is to assume that the converse or the inverse of a conditional statement is equivalent to this conditional statement. We illustrate the use of conditional statements in Example 9.
EXAMPLE 9
What are the contrapositive, the converse, and the inverse of the conditional statement "The home team wins whenever it is raining."?
Solution: Because "q whenever p" is one of the ways to express the conditional statement
p -+ q , the original statement can be rewritten as "If it is raining, then the home team wins."
Consequently, the contrapositive of this conditional statement is "If the home team does not win, then it is not raining."
1 . 1 Propositional Logic
1-9
9
The converse is "If the home team wins, then it is raining." The inverse is "If it is not raining, then the home team does not win." Only the contrapositive is equivalent to the original statement.
BICONDITIONALS We now introduce another way to combine propositions that expresses that two propositions have the same truth value.
DEFINITION 6
Let p and q be propositions. The biconditional statement p *+ q is the proposition "p if and only if q ." The biconditional statement p *+ q is true when p and q have the same truth values, and is false otherwise. Biconditional statements are also called bi-implications.
The truth table for p *+ q is shown in Table 6. Note that the statement p *+ q is true when both the conditional statements p -+ q and q -+ P are true and is false otherwise. That is why we use the words "if and only if " to express this logical connective and why it is symbolically written by combining the symbols -+ and +-. There are some other common ways to express p *+ q :
"p is necessary and sufficient for q " "if p then q , and conversely" "p iff q ." The last way of expressing the biconditional statement p *+ q uses the abbreviation "iff" for "if and only if." Note that p *+ q has exactly the same truth value as (p -+ q ) 1\ (q -+ p).
EXAMPLE 10
Let p be the statement "You can take the flight" and let q be the statement "You buy a ticket." Then p *+ q is the statement "You can take the flight if and only if you buy a ticket." This statement is true if p and q are either both true or both false, that is, if you buy a ticket and can take the flight or if you do not buy a ticket and you cannot take the flight. It is false when p and q have opposite truth values, that is, when you do not buy a ticket, but you can take the flight (such as when you get a free trip) and when you buy a ticket and cannot take the flight � (such as when the airline bumps you).
TABLE 6 The Truth Table for the Biconditional p +-+ q. p
q
p +-+ q
T
T
T
T
F
F
F
T
F
F
F
T
10
1 / The Foundations: Logic and Proofs
1-10
IMPLICIT USE OF BICONDITIONALS You should be aware that biconditionals are not always explicit in natural language. In particular, the "if and only if" construction used in biconditionals is rarely used in common language. Instead, biconditionals are often expressed using an "if, then" or an "only if" construction. The other part of the "if and only if" is implicit. That is, the converse is implied, but not stated. For example, consider the statement in English "If you finish your meal, then you can have dessert." What is really meant is "You can have dessert if and only if you finish your meal." This last statement is logically equivalent to the two statements "If you finish your meal, then you can have dessert" and "You can have dessert only if you finish your meal." Because of this imprecision in natural language, we need to make an assumption whether a conditional statement in natural language implicitly includes its converse. Because precision is essential in mathematics and in logic, we will always distinguish between the conditional statement p --+ q and the biconditional statement p ++ q .
Truth Tables o f Compound Propositions
Demo �
EXAMPLE 11
We have now introduced four important logical connectives--conjunctions, disjunctions, con ditional statements, and biconditional statements-as well as negations. We can use these con nectives to build up complicated compound propositions involving any number of propositional variables. We can use truth tables to determine the truth values of these compound propositions, as Example 1 1 illustrates. We use a separate column to find the truth value of each compound expression that occurs in the compound proposition as it is built up. The truth values of the compound proposition for each combination of truth values of the propositional variables in it is found in the final column of the table. Construct the truth table of the compound proposition (p
v
-.q) --+ (p
/\
q).
Solution: Because this truth table involves two propositional variables p and q , there are four
rows in this truth table, corresponding to the combinations of truth values TT, TF, FT, and FE The first two columns are used for the truth values of p and q , respectively. In the third column we find the truth value of -'q , needed to find the truth value of p v q found in the fourth column. The truth value of p /\ q is found in the fifth column. Finally, the truth value of (p v -.q ) --+ (p /\ q) is found in the last column. The resulting truth table is shown in � Table 7. -'
,
Precedence of Logical Operators We can construct compound propositions using the negation operator and the logical operators defined so far. We will generally use parentheses to specify the order in which logical operators
TABLE 7 The Truth Table of (p v .., q) P
q
-+
..., q
p V ""q
p /\ q
(p
/\
q). (P V ..., q )
->
T
T
F
T
T
T
T
F
T
T
F
F
F
T
T
F F
T
F
F T
F
F
F
(p /\ q )
1 . 1 Propositional Logic
1-1 1
TABLE S Precedence of Logical Operators. Operator Precedence �
1
1\
2 3
---+
4 5
v
*+
11
in a compound proposition are to b e applied. For instance, (p v q) /\ ( -,r ) i s the conjunction of p v q and -'r . However, to reduce the number of parentheses, we specify that the negation operator is applied before all other logical operators. This means that -'p /\ q is the conjunc tion of -'p and q , namely, (-,p) /\ q , not the negation of the conjunction of p and q , namely -'(p /\ q). Another general rule of precedence is that the conjunction operator takes precedence over the disjunction operator, so that p /\ q V r means (p /\ q ) V r rather than p /\ (q V r ) . Because this rule may be difficult to remember, we will continue to use parentheses so that the order of the disjunction and conjunction operators is clear. Finally, it is an accepted rule that the conditional and biconditional operators -+ and *+ have lower precedence than the conjunction and disjunction operators, /\ and v . Consequently, p v q -+ r is the same as (p v q ) -+ r . We will use parentheses when the order ofthe conditional operator and biconditional operator is at issue, although the conditional operator has precedence over the biconditional operator. Table 8 displays the precedence levels of the logical operators, -', /\, v, -+ , and *+ .
Translating English Sentences There are many reasons to translate English sentences into expressions involving propositional variables and logical connectives. In particular, English (and every other human language) is often ambiguous. Translating sentences into compound statements (and other types of logical expressions, which we will introduce later in this chapter) removes the ambiguity. Note that this may involve making a set of reasonable assumptions based on the intended meaning of the sentence. Moreover, once we have translated sentences from English into logical expressions we can analyze these logical expressions to determine their truth values, we can manipulate them, and we can use rules of inference (which are discussed in Section 1 .5) to reason about them. To illustrate the process oftranslating an English sentence into a logical expression, consider Examples 1 2 and 1 3 .
EXAMPLE 12
How can this English sentence be translated into a logical expression? "You can access the Internet from campus only if you are a computer science major or you are not a freshman." Solution: There are many ways to translate this sentence into a logical expression. Although it
is possible to represent the sentence by a single propositional variable, such as p, this would not be useful when analyzing its meaning or reasoning with it. Instead, we will use propositional variables to represent each sentence part and determine the appropriate logical connectives between them. In particular, we let a, c, and / represent "You can access the Internet from campus," "You are a computer science major," and "You are a freshman," respectively. Noting that "only if" is one way a conditional statement can be expressed, this sentence can be repre sented as
a -+ (c V -,/). EXAMPLE 13
How can this English sentence be translated into a logical expression? "You cannot ride the roller coaster if you are under 4 feet tall unless you are older than 1 6 years old."
12
1 / The Foundations: Logic and Proofs
1-12
Solution: Let q, r , and s represent "You can ride the roller coaster," "You are under 4 feet tall,"
and "You are older than 1 6 years old," respectively. Then the sentence can be translated to
Of course, there are other ways to represent the original sentence as a logical expression,
System Specifications Translating sentences in natural language (such as English) into logical expressions is an essential part of specifying both hardware and software systems. System and software engineers take requirements in natural language and produce precise and unambiguous specifications that can be used as the basis for system development. Example 1 4 shows how compound propositions . can be used in this process.
EXAMPLE 14
Express the specification "The automated reply cannot be sent when the file system is full" using logical connectives. Solution: One way to translate this is to let p denote "The automated reply can be sent" and q
denote "The file system is full." Then -'p represents "It is not the case that the automated reply can be sent," which can also be expressed as "The automated reply cannot be sent." Consequently,
System specifications should be consistent, that is, they should not contain conflicting requirements that could be used to derive a contradiction. When specifications are not consistent, there would be no way to develop a system that satisfies all specifications.
EXAMPLE 15
Determine whether these system specifications are consistent: "The diagnostic message is stored in the buffer or it is retransmitted." "The diagnostic message is not stored in the buffer." "If the diagnostic message is stored in the buffer, then it is retransmitted." Solution: To determine whether these specifications are consistent, we first express them using
logical expressions. Let p denote "The diagnostic message is stored in the buffer" and let q denote "The diagnostic message is retransmitted." The specifications can then be written as p v q , -'p, and p ---+ q . An assignment of truth values that makes all three specifications true must have p false to make -'p true. Because we want p v q to be true but p must be false, q must be true. Because p ---+ q is true when p is false and q is true, we conclude that these specifications are consistent because they are all true when p is false and q is true. We could come to the same conclusion by use of a truth table to examine the four possible assignments of truth values to
EXAMPLE 16
Do the system specifications in Example 15 remain consistent ifthe specification "The diagnostic message is not retransmitted" is added? Solution: By the reasoning in Example 1 5 , the three specifications from that example are true
only in the case when p is false and q is true. However, this new specification is -'q , which is
1 . 1 Propositional Logic
J-/3
13
Boolean Searches
unkS �
EXAMPLE 17
Logical connectives are used extensively in searches of large collections of information, such as indexes of Web pages. Because these searches employ techniques from propositional logic, they are called Boolean searches. In Boolean searches, the connective AND is used to match records that contain both of two search terms, the connective OR is used to match one or both of two search terms, and the connective NOT (sometimes written as AND NOT ) is used to exclude a particular search term. Careful planning of how logical connectives are used is often required when Boolean searches are used to locate information of potential interest. Example 1 7 illustrates how Boolean searches are carried out. Web Page Searching Most Web search engines support Boolean searching techniques, which usually can help find Web pages about particular subjects. For instance, using Boolean searching to find Web pages about universities in New Mexico, we can look for pages matching NEW AND MEXICO AND UNIVERSITIES. The results of this search will include those pages that contain the three words NEW, MEXICO, and UNIVERSITIES. This will include all of the pages of interest, together with others such as a page about new universities in Mexico. (Note that in Google, and many other search engines, the word "AND" is not needed, although it is understood, because all search terms are included by default.) Next, to find pages that deal with universities in New Mexico or Arizona, we can search for pages matching (NEW AND MEXICO OR ARIZONA) AND UNIVERSITIES. (Note: Here the AND operator takes precedence over the OR operator. Also, in Google, the terms used for this search would be NEW MEXICO OR ARIZONA.) The results ofthis search will include all pages that contain the word UNIVERSITIES and either both the words NEW and MEXICO or the word ARIZONA. Again, pages besides those of interest will be listed. Finally, to find Web pages that deal with universities in Mexico (and not New Mexico), we might first look for pages matching MEXICO AND UNIVERSITIES, but because the results of this search will include pages about universities in New Mexico, as well as universities in Mexico, it might be better to search for pages matching (MEXICO AND UNIVERSITIES) NOT NEW. The results of this search include pages that contain both the words MEXICO and UNIVERSITIES but do not contain the word NEW. (In Google, and many other search engines, the word "NOT" is replaced by a minus sign "-". In Google, the terms used for this last search would be MEXICO UNIVERSITIES - NEW.) ....
Logic Puzzles
unkS �
Puzzles that can be solved using logical reasoning are known as logic puzzles. Solving logic puzzles is an excellent way to practice working with the rules of logic. Also, computer pro grams designed to carry out logical reasoning often use well-known logic puzzles to illustrate their capabilities. Many people enjoy solving logic puzzles, which are published in books and periodicals as a recreational activity. We will discuss two logic puzzles here. We begin with a puzzle that was originally posed by Raymond Smullyan, a master of logic puzzles, who has published more than a dozen books containing challenging puzzles that involve logical reasoning.
EXAMPLE 18
In [Sm78] Smullyan posed many puzzles about an island that has two kinds of inhabitants, knights, who always tell the truth, and their opposites, knaves, who always lie. You encounter two people A and B . What are A and B if A says "B is a knight" and B says "The two of us are opposite types"?
Exam=�
14
1 / The Foundations: Logic and Proofs
1-14
Let p and q be the statements that A is a knight and B is a knight, respectively, so that -'p and -.q are the statements that A is a knave and that B is a knave, respectively. We first consider the possibility that A is a knight; this is the statement that p is true. If A is a knight, then he is telling the truth when he says that B is a knight, so that q is true, and A and B are the same type. However, if B is a knight, then B 's statement that A and B are of opposite types, the statement (p /\ -.q) V (-'p /\ q), would have to be true, which it is not, because A and B are both knights. Consequently, we can conclude that A is not a knight, that is, that p is false. If A is a knave, then because everything a knave says is false, A 's statement that B is a knight, that is, that q is true, is a lie, which means that q is false and B is also a knave. Furthermore, if B is a knave, then B 's statement that A and B are opposite types is a lie, which is consistent <0lIl with both A and B being knaves. We can conclude that both A and B are knaves. Solution:
We pose more of Smullyan's puzzles about knights and knaves in Exercises 5 5-59 at the end of this section. Next, we pose a puzzle known as the muddy children puzzle for the case of two children. EXAMPLE 19
A father tells his two children, a boy and a girl, to play in their backyard without getting dirty. However, while playing, both children get mud on their foreheads. When the children stop playing, the father says "At least one of you has a muddy forehead," and then asks the children to answer "Yes" or "No" to the question: "Do you know whether you have a muddy forehead?" The father asks this question twice. What will the children answer each time this question is asked, assuming that a child can see whether his or her sibling has a muddy forehead, but cannot see his or her own forehead? Assume that both children are honest and that the children answer each question simultaneously. Solution: Let s
be the statement that the son has a muddy forehead and let d be the statement that the daughter has a muddy forehead. When the father says that at least one of the two children has a muddy forehead, he is stating that the disjunction s V d is true. Both children will answer "No" the first time the question is asked because each sees mud on the other child's forehead. That is, the son knows that d is true, but does not know whether s is true, and the daughter knows that s is true, but does not know whether d is true. After the son has answered "No" to the first question, the daughter can determine that d must be true. This follows because when the first question is asked, the son knows that s V d is
unkS � RAYMOND SMULLYAN (BORN 1 9 1 9) Raymond Smullyan dropped out of high school. He wanted to study what he was really interested in and not standard high school material. After jumping from one university to the next, he earned an undergraduate degree in mathematics at the University of Chicago in 1 955. He paid his college expenses by performing magic tricks at parties and clubs. He obtained a Ph.D. in logic in 1 959 at Princeton, studying under Alonzo Church. After graduating from Princeton, he taught mathematics and logic at Dartmouth College, Princeton University, Yeshiva University, and the City University of New York. He joined the philosophy department at Indiana University in 1 98 1 where he is now an emeritus professor. Smullyan has written many books on recreational logic and mathematics, including Satan, Cantor, and Infinity; What Is the Name of This Book?; The Lady or the Tiger?; Alice in Puzzleland; To Mock a Mockingbird; Forever Undecided; and The Riddle of Scheherazade: Amazing Logic Puzzles, Ancient and Modern. Because his logic puzzles are challenging, entertaining, and thought-provoking, he is considered to be a modem-day Lewis Carroll. Smullyan has also written several books about the application of deductive logic to chess, three collections of philosophical essays and aphorisms, and several advanced books on mathematical logic and set theory. He is particularly interested in self-reference and has worked on extending some of Godel's results that show that it is impossible to write a computer program that can solve all mathematical problems. He is also particularly interested in explaining ideas from mathematical logic to the public. Smullyan is a talented musician and often plays piano with his wife, who is a concert-level pianist. Making telescopes is one of his hobbies. He is also interested in optics and stereo photography. He states "I've never had a conflict between teaching and research as some people do because when I'm teaching, I'm doing research."
1 . 1 Propositional Logic
1-15
TABLE 9 Table for the Bit Operators AND, and XOR.
15
OR,
x
y
xVy
x /\ y
x EB y
0
0
0
0
0
0
1
1
1
0
1
0 0
1 1
1
1
1
1
0
true, but cannot determine whether s is true. Using this information, the daughter can conclude that d must be true, for if d were false, the son could have reasoned that because s V d is true, then s must be true, and he would have answered "Yes" to the first question. The son can reason in a similar way to determine that s must be true. It follows that both children answer "Yes" the .... second time the question is asked.
Logic and Bit Operations Truth Value
Bit
T
1
F
0
unllS �
DEFINITION 7
EXAMPLE 20
Computers represent information using bits. A bit is a symbol with two possible values, namely, o (zero) and 1 (one). This meaning of the word bit comes from binary digit, because zeros and ones are the digits used in binary representations of numbers. The well-known statistician John Tukey introduced this terminology in 1 946. A bit can be used to represent a truth value, because there are two truth values, namely, true and false. As is customarily done, we will use a 1 bit to represent true and a 0 bit to represent false. That is, 1 represents T (true), 0 represents F (false). A variable is called a Boolean variable if its value is either true or false. Consequently, a Boolean variable can be represented using a bit. Computer bit operations correspond to the logical connectives. By replacing true by a one and false by a zero in the truth tables for the operators /\, v, and EB, the tables shown in Table 9 for the corresponding bit operations are obtained. We will also use the notation OR, AND, and XOR for the operators V , /\, and EB, as is done in various programming languages. Information is often represented using bit strings, which are lists of zeros and ones. When this is done, operations on the bit strings can be used to manipulate this information. A bit string is a sequence of zero or more bits. The length of this string is the number of bits
in the string.
1 0 1 0 1 00 1 1 is a bit string of length nine.
We can extend bit operations to bit strings. We define the bitwise OR, bitwise AND, and bitwise XOR of two strings of the same length to be the strings that have as their bits the OR, AND, and XOR of the corresponding bits in the two strings, respectively. We use the symbols V , /\, and EB to represent the bitwise OR, bitwise AND, and bitwiseXOR operations, respectively. We illustrate bitwise operations on bit strings with Example 2 1 .
EXAMPLE 21
Find the bitwise OR, bitwise AND, and bitwise XOR of the bit strings 0 1 1 0 1 1 0 1 1 0 and 1 1 000 1 1 1 0 1 . (Here, and throughout this book, bit strings will be split into blocks of four bits to make them easier to read.)
16
1 / The Foundations: Logic and Proofs
1-16
Solution: The bitwise OR, bitwise AND, and bitwise XOR of these strings are obtained by taking
the OR, AND, and XOR of the corresponding bits, respectively. This gives us 01 101 1 01 10 1 1 000 1 1 1 0 1 11 101 1 1 1 1 1 0 1 000 1 0 1 00 10 1010 101 1
bitwise OR bitwise AND bitwise XOR
Exercises 1 . Which of these sentences are propositions? What are the truth values of those that are propositions?
a) b) c) e)
Boston is the capital of Massachusetts. Miami is the capital of Florida.
2 + 3 = 5. x + 2 = I I.
d ) 5 + 7 = 1 0. 1) Answer this question. 2. Which ofthese are propositions? What are the truth values of those that are propositions? a) b) c) d) e) 1)
Do not pass go. What time is it? There are no black flies in Maine.
4 + x = 5.
The moon i s made of green cheese.
2 n ::: 1 00.
3 . What is the negation of each of these propositions?
a) Today is Thursday.
b) There is no pollution in New Jersey. c) 2 + 1 = 3 . d) The summer i n Maine i s hot and sunny. 4. Let p and q be the propositions
p : I bought a lottery ticket this week. q : I won the million dollar j ackpot on Friday. Express each ofthese propositions as an English sentence. a) --'p b) p v q c) p --+ q d) p /\ q e) p - q 1) --'p --+ --.q b) --'p v (p /\ q ) g) --'p /\ --.q 5 . Let p and q b e the propositions "Swimming at the New Jersey shore is allowed" and "Sharks have been spotted near the shore," respectively. Express each of these com pound propositions as an English sentence. a) --.q d) p --+ --.q g) p _ --.q
c) --'p v q b) p /\ q e) --.q --+ p 1) --'p --+ --.q b) --'p /\ (p V --.q )
unkS�
JOHN WILDER TUKEY ( 1 9 1 5-2000) Tukey, born in New Bedford, Massachusetts, was an only child. His parents, both teachers, decided home schooling would best develop his potential. His formal education began at Brown University, where he studied mathematics and chemistry. He received a master's degree in chemistry from Brown and continued his studies at Princeton University, changing his field of study from chemistry to mathematics. He received his Ph.D. from Princeton in 1 939 for work in topology, when he was appointed an instructor in mathematics at Princeton. With the start of World War II, he joined the Fire Control Research Office, where he began working in statistics. Tukey found statistical research to his liking and impressed several leading statisticians with his skills. In 1 945, at the conclusion ofthe war, Tukey returned to the mathematics department at Princeton as a professor of statistics, and he also took a position at AT&T Bell Laboratories. Tukey founded the Statistics Department at Princeton in 1 966 and was its first chairman. Tukey made significant contributions to many areas of statistics, including the analysis ofvariance, the estimation of spectra of time series, inferences about the values of a set of parameters from a single experiment, and the philosophy of statistics. However, he is best known for his invention, with 1. W Cooley, of the fast Fourier transform. In addition to his contributions to statistics, Tukey was noted as a skilled wordsmith; he is credited with coining the terms bit and software. Tukey contributed his insight and expertise by serving on the President's Science Advisory Committee. He chaired several important committees dealing with the environment, education, and chemicals and health. He also served on committees working on nuclear disarmament. Tukey received many awards, including the National Medal of Science. HISTORICAL NOTE There were several other suggested words for a binary digit, including binit and bigit, that never were widely accepted. The adoption of the word bit may be due to its meaning as a common English word. For an account of Tukey's coining of the word bit, see the April 1 984 issue of Annals of the History of Computing.
1 . 1 Propositional Logic
1-1 7
6.
Let p and q be the propositions "The election is de cided" and "The votes have been counted," respectively. Express each of these compound propositions as an English sentence. a) ""'p b) p v q d) q --+ P c) ""'p /\ q e) ""'q --+ ""'p I) ""'p --+ ""'q g) p ++ q b) ....,q v (....,P /\ q )
7 . Let p and q be the propositions
p : It is below freezing. q : It is snowing.
Write these propositions using p and q and logical connectives. a) b) c) d) e)
It is below freezing and snowing. It is below freezing but not snowing. It is not below freezing and it is not snowing. It is either snowing or below freezing (or both). If it is below freezing, it is also snowing. I) It is either below freezing or it is snowing, but it is not snowing if it is below freezing. g) That it is below freezing is necessary and sufficient for it to be snowing. 8. Let p, q, and r be the propositions p : You have the flu. q : You miss the final examination. r : You pass the course. Express each ofthese propositions as an English sentence. a) p --+ q b) ""'q ++ r d) p v q V r c) q --+ ....,r e) (p --+ ....,r ) v (q --+ ....,r ) I) (p /\ q ) v (""'q /\ r) 9. Let p and q be the propositions
p : You drive over 65 miles per hour. q : You get a speeding ticket. Write these propositions using p and q and logical connectives. a) You do not drive over 65 miles per hour. b) You drive over 65 miles per hour, but you do not get a speeding ticket. c) You will get a speeding ticket if you drive over 65 miles per hour. d) If you do not drive over 65 miles per hour, then you will not get a speeding ticket. e) Driving over 65 miles per hour is sufficient for getting a speeding ticket. I) You get a speeding ticket, but you do not drive over 65 miles per hour. g) Whenever you get a speeding ticket, you are driving over 65 miles per hour. 10. Let p, q, and r be the propositions p : You get an A on the final exam. q : You do every exercise in this book. r : You get an A in this class.
17
Write these propositions using p, q , and r and logical connectives. a) You get an A in this class, but you do not do every exercise in this book. b) You get an A on the final, you do every exercise in this book, and you get an A in this class. c) To get an A in this class, it is necessary for you to get an A on the final. d) You get an A on the final, but you don't do every ex ercise in this book; nevertheless, you get an A in this class. e) Getting an A on the final and doing every exercise in this book is sufficient for getting an A in this class. I) You will get an A in this class if and only if you either do every exercise in this book or you get an A on the final. 1 1 . Let p, q , and r be the propositions p : Grizzly bears have been seen in the area. q : Hiking is safe on the trail. r : Berries are ripe along the trail. Write these propositions using p, q, and r and logical connectives. a) Berries are ripe along the trail, but grizzly bears have not been seen in the area. b) Grizzly bears have not been seen in the area and hiking on the trail is safe, but berries are ripe along the trail. c) If berries are ripe along the trail, hiking is safe if and only if grizzly bears have not been seen in the area. d) It is not safe to hike on the trail, but grizzly bears have not been seen in the area and the berries along the trail are ripe. e) For hiking on the trail to be safe, it is necessary but not sufficient that berries not be ripe along the trail and for grizzly bears not to have been seen in the area. I) Hiking is not safe on the trail whenever grizzly bears have been seen in the area and berries are ripe along the trail. 12. Determine whether these biconditionals are true or false. 2 + 2 = 4 if and only if I + 1 = 2. 1 + 1 = 2 if and only if 2 + 3 = 4 . 1 + I = 3 if and only if monkeys can fly. 0 > I if and only if 2 > l . 13. Determine whether each of these conditional statements is true or false. a) b) c) d)
If I + I = 2, then 2 + 2 = 5 . I f 1 + 1 = 3, then 2 + 2 = 4. If I + I = 3, then 2 + 2 = 5. If monkeys can fly, then 1 + I = 3 . 1 4 . Determine whether each o f these conditional statements is true or false. a) b) c) d)
a) b) c) d)
If I + I = 3, then unicorns exists. If 1 + 1 = 3, then dogs can fly. If I + I = 2, then dogs can fly. If 2 + 2 = 4, then I + 2 = 3 .
18
15. For each of these sentences, determine whether an inclu
16.
1 7.
18.
19.
I - I ii
I / The Foundations: Logic and Proofs
sive or or an exclusive or is intended. Explain your answer. a) Coffee or tea comes with dinner. b) A password must have at least three digits or be at least eight characters long. e) The prerequisite for the course is a course in number theory or a course in cryptography. d) You can pay using U. S. dollars or euros. For each of these sentences, determine whether an inclu sive or or an exclusive or is intended. Explain your answer. a) Experience with C++ or Java is required. b) Lunch includes soup or salad. e) To enter the country you need a passport or a voter registration card. d) Publish or perish. For each of these sentences, state what the sentence means ifthe or is an inclusive or (that is, a disjunction) versus an exclusive or. Which of these meanings of or do you think is intended? a) To take discrete mathematics, you must have taken calculus or a course in computer science. b) When you buy a new car from Acme Motor Company, you get $2000 back in cash or a 2% car loan. e) Dinner for two includes two items from column A or three items from column B. d) School is closed if more than 2 feet of snow falls or if the wind chill is below - 1 00. Write each of these statements in the form "if p , then q " in English. [Hint: Refer to the list of common ways to express conditional statements provided in this section.] a) It is necessary to wash the boss's car to get promoted. b) Winds from the south imply a spring thaw. e) A sufficient condition for the warranty to be good is that you bought the computer less than a year ago. d) Willy gets caught whenever he cheats. e) You can access the website only if you pay a subscrip tion fee. t) Getting elected follows from knowing the right people. g) Carol gets seasick whenever she is on a boat. Write each of these statements in the form "if p, then q " in English. [Hint: Refer to the list of common ways to express conditional statements.] a) It snows whenever the wind blows from the northeast. b) The apple trees will bloom if it stays warm for a week. e) That the Pistons win the championship implies that they beat the Lakers. d) It is necessary to walk 8 miles to get to the top of Long's Peak. e) To get tenure as a professor, it is sufficient to be world-famous. t) If you drive more than 400 miles, you will need to buy gasoline. g) Your guarantee is good only if you bought your CD player less than 90 days ago. h) Jan will go swimming unless the water is too cold.
20. Write each o f these statements i n the form "if p , then q "
21.
22.
23.
24.
25.
in English. [Hint: Refer to the list of common ways to express conditional statements provided in this section.] a) I will remember to send you the address only if you . send me an e-mail message. b) To be a citizen of this country, it is sufficient that you were born in the United States. e) If you keep your textbook, it will be a useful reference in your future courses. d) The Red Wings will win the Stanley Cup iftheir goalie plays well. e) That you get the job implies that you had the best credentials. t) The beach erodes whenever there is a storm. g) It is necessary to have a valid password to log on to the server. h) You will reach the summit unless you begin your climb too late. Write each of these propositions in the form "p if and only if q " in English. a) If it is hot outside you buy an ice cream cone, and if you buy an ice cream cone it is hot outside. b) For you to win the contest it is necessary and sufficient that you have the only winning ticket. e) You get promoted only if you have connections, and you have connections only if you get promoted. d) If you watch television your mind will decay, and conversely. e) The trains run late on exactly those days when I take it. Write each of these propositions in the form "p if and only if q " in English. a) For you to get an A in this course, it is necessary and sufficient that you learn how to solve discrete mathe matics problems. b) If you read the newspaper every day, you will be in formed, and conversely. e) It rains if it is a weekend day, and it is a weekend day if it rains. d) You can see the wizard only if the wizard is not in, and the wizard is not in only if you can see him. State the converse, contrapositive, and inverse of each of these conditional statements. a) If it snows today, I will ski tomorrow. b) I come to class whenever there is going to be a quiz. e) A positive integer is a prime only if it has no divisors other than I and itself. State the converse, contrapositive, and inverse of each of these conditional statements. a) If it snows tonight, then I will stay at home. b) I go to the beach whenever it is a sunny summer day. c) When I stay up late, it is necessary that I sleep until noon. How many rows appear in a truth table for each of these compound propositions?
/-/9
a) b) c) d)
1 . 1 Propositional Logic p -+ -'p (p v -.r) /\ (q V -.s ) q v p v -'s V -.r V -.t (p /\ r /\ t) ++ (q /\ t)
V
u
26. How many rows appear in a truth table for each of these compound propositions? a) b) c) d)
(q -+ -'p) v (-' P -+ -.q) (p v -.t) /\ (p v -'s ) (p -+ r) V ( -,s -+ -.t) v (-,u -+ v) (p /\ r /\ s) V (q /\ t) V (r /\ -.t)
27. Construct a truth table for each of these compound propositions. a) p /\ -.p b) P v -'p c) (p v -'q ) -+ q d) (p v q ) -+ (p /\ q ) e) (p -+ q) ++ (-.q -+ -'p)
t) (p -+ q ) -+ (q -+ p)
28. Construct a truth table for each of these compound propositions. a) p -+ -'p b) P ++ -' P c) p tIJ (p v q ) d) (p /\ q) -+ (p v q ) e) (q -+ -'p) ++ (p ++ q)
t) (p ++ q ) tIJ (p ++ -.q )
29. Construct a truth table for each o f these compound propositions. a) (p v q ) -+ (p tIJ q ) b) (p tIJ q ) -+ (p /\ q ) c) (p V q) tIJ (p /\ q ) d) ( p ++ q ) tIJ (-'P ++ q ) e ) ( p ++ q ) tIJ (-' P ++ -.r)
t) (p tIJ q) -+ (p tIJ -.q )
30. Construct a truth table for each o f these compound propositions. b) P tIJ -'p a) p tIJ p c) p tIJ -.q d) -.p tIJ -.q e) (p tIJ q) v (p tIJ -.q ) t) (p tIJ q) /\ (p tIJ -.q ) 31. Construct a truth table for each of these compound propositions. a) p -+ -.q b) -'p ++ q c) (p -+ q) v (-'P -+ q ) d) (p -+ q ) /\ (-'P -+ q ) e ) ( p ++ q ) v (-' P ++ q )
t) (-'P ++ -.q ) ++ ( p ++ q )
32. Construct a truth table for each o f these compound propositions. a) (p v q ) v r c) (p /\ q ) V r e) (p V q) /\ -.r
b) (p v q ) /\ r d) (p /\ q ) /\ r
t) (p /\ q ) V -'r
33. Construct a truth table for each of these compound propositions. a) p -+ (-.q V r ) b) -'p -+ (q -+ r) c) (p -+ q ) v (-' P -+ r) d) (p -+ q ) /\ (-' P -+ r ) e) (p ++ q ) v (-.q ++ r)
t) (-'P ++ -.q ) ++ (q ++ r)
34. Construct a truth table for « p -+ q ) -+ r) -+
s.
19
35. Construct a truth table for (p ++ q ) ++ (r ++ s ) . 36. What is the value of x after each of these statements is encountered in a computer program, if x = 1 before the statement is reached? a) if 1 + 2 = 3 then x := x + 1 b) if ( 1 + 1 = 3) OR (2 + 2 = 3) then x := x + 1 c) if (2 + 3 = 5) AND (3 + 4 = 7) then x := x + 1 d) if ( 1 + 1 = 2) XOR ( 1 + 2 = 3) then x := x + 1 e) if x < 2 then x := x + 1
37. Find the bitwise OR, bitwise AND, and bitwise XOR of each of these pairs of bit strings. a) 1 0 1 1 1 1 0, 0 1 0 000 1 b) 1 1 1 1 0000, 1 0 1 0 1 0 1 0 c) 00 0 1 1 1 000 1 , 1 0 0 1 00 1 000 d) 1 1 1 1 1 1 1 1 1 1 , 00 0000 0000 38. Evaluate each of these expressions. a) b) c) d)
I 1 000 /\ (0 1 0 1 1 v 1 1 0 1 1 ) (0 1 1 1 1 /\ 1 0 1 0 1 ) v 0 1 000 (0 1 0 1 0 tIJ 1 1 0 1 1 ) tIJ 0 1 000 ( 1 1 0 1 1 v 0 1 0 1 0) /\ ( 1 000 1 v 1 1 0 1 1 )
Fuzzy logic i s used in artificial intelligence. In fuzzy logic, a proposition has a truth value that is a number between 0 and I , inclusive. A proposition with a truth value of 0 is false and one with a truth value of 1 is true. Truth values that are between 0 and 1 indicate varying degrees of truth. For instance, the truth value 0.8 can be assigned to the statement "Fred is happy," because Fred is happy most of the time, and the truth value 0.4 can be assigned to the statement "John is happy," because John is happy slightly less than half the time. 39. The truth value of the negation of a proposition in fuzzy logic is 1 minus the truth value of the proposition. What are the truth values of the statements "Fred is not happy" and "John is not happy"? 40. The truth value of the conjunction of two propositions in fuzzy logic is the minimum of the truth values of the two propositions. What are the truth values of the statements "Fred and John are happy" and "Neither Fred nor John is happy"?
4 1 . The truth value of the disjunction of two propositions in fuzzy logic is the maximum of the truth values of the two propositions. What are the truth values of the statements "Fred is happy, or John is happy" and "Fred is not happy, or John is not happy"? *42. Is the assertion "This statement is false" a proposition? *43. The nth statement in a list of 1 00 statements is "Exactly n of the statements in this list are false." a) What conclusions can you draw from these statements? b) Answer part (a) if the nth statement is "At least n of the statements in this list are false." c) Answer part (b) assuming that the list contains 99 statements.
44. An ancient Sicilian legend says that the barber in a re mote town who can be reached only by traveling a dan gerous mountain road shaves those people, and only those
20
1 / The Foundations: Logic and Proofs
/ -20
people, who do not shave themselves. Can there be such a barber?
then they can save new files. Ifusers cannot save new files, then the system software is not being upgraded."
45. Each inhabitant of a remote village always tells the truth or always lies. A villager will only give a "Yes" or a "No" response to a question a tourist asks. Suppose you are a tourist visiting this area and come to a fork in the road. One branch leads to the ruins you want to visit; the other branch leads deep into the jungle. A villager is standing at the fork in the road. What one question can you ask the villager to determine which branch to take?
51. Are these system specifications consistent? "The router can send packets to the edge system only if it supports the new address space. For the router to support the new ad dress space it is necessary that the latest software release be installed. The router can send packets to the edge sys tem if the latest software release is installed, The router does not support the new address space."
46. An explorer is captured by a group of cannibals. There are two types of cannibals-those who always tell the truth and those who always lie. The cannibals will barbecue the explorer unless he can determine whether a particular cannibal always lies or always tells the truth. He is allowed to ask the cannibal exactly one question.
52. Are these system specifications consistent? "If the file system is not locked, then new messages will be queued. If the file system is not locked, then the system is func tioning normally, and conversely. Ifnew messages are not queued, then they will be sent to the message buffer. If the file system is not locked, then new messages will be sent to the message buffer. New messages will not be sent to the message buffer."
a) Explain why the question "Are you a liar?" does not work. b) Find a question that the explorer can use to determine whether the cannibal always lies or always tells the truth.
53. What Boolean search would you use to look for Web pages about beaches in New Jersey? What if you wanted to find Web pages about beaches on the isle of Jersey (in the English Channel)?
47. Express these system specifications using the propositions p "The message is scanned for viruses" and q "The mes sage was sent from an unknown system" together with logical connectives.
54. What Boolean search would you use to look for Web pages about hiking in West Virginia? What if you wanted to find Web pages about hiking in Virginia, but not in West Virginia?
a) "The message is scanned for viruses whenever the message was sent from an unknown system." b) "The message was sent from an unknown system but it was not scanned for viruses." c) "It is necessary to scan the message for viruses when ever it was sent from an unknown system." d) "When a message is not sent from an unknown system it is not scanned for viruses."
Exercises 55-59 relate to inhabitants of the island of knights and knaves created by Smullyan, where knights always tell the truth and knaves always lie. You encounter two people, A and Determine, if possible, what A and are if they address you in the ways described. If you cannot determine what these two people are, can you draw any conclusions?
48. Express these system specifications using the proposi tions p "The user enters a valid password," q "Access is granted," and r "The user has paid the subscription fee" and logical connectives. a) "The user has paid the subscription fee, but does not enter a valid password." b) "Access is granted whenever the user has paid the sub scription fee and enters a valid password." c) "Access is denied ifthe user has not paid the subscrip tion fee." d) "If the user has not entered a valid password but has paid the subscription fee, then access is granted."
49. Are these system specifications consistent? "The system is in multiuser state if and only if it is operating normally. If the system is operating normally, the kernel is func tioning. The kernel is not functioning or the system is in interrupt mode. If the system is not in multiuser state, then it is in interrupt mode. The system is not in interrupt mode." 50. Are these system specifications consistent? "Whenever the system software is being upgraded, users cannot ac cess the file system. If users can access the file system,
B.
B
B
55. A says "At least one of us is a knave" and says nothing. 56. A says "The two of us are both knights" and says " A is a knave." 57. A says "I am a knave or is a knight" and says nothing. 58. Both A and say "I am a knight." 59. A says "We are both knaves" and says nothing.
B
B
B
B
B
Exercises 60-65 are puzzles that can be solved by translating statements into logical expressions and reasoning from these expressions using truth tables.
60. The police have three suspects for the murder of Mr. Cooper: Mr. Smith, Mr. Jones, and Mr. Williams. Smith, Jones, and Williams each declare that they did not kill Cooper. Smith also states that Cooper was a friend of Jones and that Williams disliked him. Jones also states that he did not know Cooper and that he was out of town the day Cooper was killed. Williams also states that he saw both Smith and Jones with Cooper the day of the killing and that either Smith or Jones must have killed him. Can you determine who the murderer was if a) one of the three men is guilty, the two innocent men are telling the truth, but the statements of the guilty man may or may not be true? b) innocent men do not lie?
1 .2 Propositional Equivalences
1-2 1
61. Steve would like to determine the relative salaries ofthree coworkers using two facts. First, he knows that if Fred is not the highest paid of the three, then Janice is. Sec ond, he knows that if Janice is not the lowest paid, then Maggie is paid the most. Is it possible to determine the relative salaries of Fred, Maggie, and Janice from what Steve knows? If so, who is paid the most and who the least? Explain your reasoning. 62. Five friends have access to a chat room. Is it possible to determine who is chatting if the following information is known? Either Kevin or Heather, or both, are chatting. Either Randy or Vijay, but not both, are chatting. If Abby is chatting, so is Randy. Vijay and Kevin are either both chatting or neither is. If Heather is chatting, then so are Abby and Kevin. Explain your reasoning. 63. A detective has interviewed four witnesses to a crime. From the stories of the witnesses the detective has con cluded that if the butler is telling the truth then so is the cook; the cook and the gardener cannot both be telling the truth; the gardener and the handyman are not both lying; and if the handyman is telling the truth then the cook is lying. For each ofthe four witnesses, can the detective de termine whether that person is telling the truth or lying? Explain your reasoning. 64. Four friends have been identified as suspects for an unau thorized access into a computer system. They have made statements to the investigating authorities. Alice said "Carlos did it." John said "I did not do it." Carlos said
21
"Diana did it." Diana said "Carlos lied when he said that I did it." a) Ifthe authorities also know that exactly one of the four suspects is telling the truth, who did it? Explain your reasoning. b) If the authorities also know that exactly one is lying, who did it? Explain your reasoning. *65. Solve this famous logic puzzle, attributed to Albert Einstein, and known as the zebra puzzle. Five men with different nationalities and with different jobs live in consecutive houses on a street. These houses are painted different colors. The men have different pets and have dif ferent favorite drinks. Determine who owns a zebra and whose favorite drink is mineral water (which is one ofthe favorite drinks) given these clues: The Englishman lives in the red house. The Spaniard owns a dog. The Japanese man is a painter. The Italian drinks tea. The Norwegian lives in the first house on the left. The green house is immediately to the right of the white one. The photogra pher breeds snails. The diplomat lives in the yellow house. Milk is drunk in the middle house. The owner ofthe green house drinks coffee. The Norwegian's house is next to the blue one. The violinist drinks orange juice. The fox is in a house next to that of the physician. The horse is in a house next to that of the diplomat. [Hint: Make a table where the rows represent the men and columns represent the color of their houses, their j obs, their pets, and their favorite drinks and use logical reasoning to determine the correct entries in the table.]
�
1.2 Pro p ositional Equivalences Introduction An important type of step used in a mathematical argument is the replacement of a statement with another statement with the same truth value. Because of this, methods that produce propo sitions with the same truth value as a given compound proposition are used extensively in the construction of mathematical arguments. Note that we will use the term "compound propo sition" to refer to an expression formed from propositional variables using logical operators, such as p /\ q . We begin our discussion with a classification of compound propositions according to their possible truth values. DEFINITION 1
A compound proposition that is always true, no matter what the truth values ofthe propositions that occur in it, is called a tautology. A compound proposition that is always false is called a contradiction. A compound proposition that is neither a tautology nor a contradiction is called a contingency.
Tautologies and contradictions are often important in mathematical reasoning. Example I illus trates these types of compound propositions.
22
/ -22
1 / The Foundations: Logic and Proofs
TABLE 2 De Morgan's
TABLE 1 Examples of a Tautology and a Contradiction.
EXAMPLE 1
Laws.
p
-,p
p v -,p
P I\ -,p
T
F
T
F
F
T
T
F
1\
q)
==
-'p v -.q
-'(p v q)
==
-'p
-'(p
1\
-.q
We can construct examples of tautologies and contradictions using just one propositional vari able. Consider the truth tables of p v """ p and p 1\ """ p , shown in Table 1 . Because p v """ p is � always true, it is a tautology. Because p 1\ """p is always false, it is a contradiction.
Logical Equivalences
Demo� DEFINITION 2
Compound propositions that have the same truth values in all possible cases are called logically also define this notion as follows.
equivalent. We can
The compound propositions p and q are called logically equivalent if p � q is a tautology. The notation p == q denotes that p and q are logically equivalent. The symbol == is not a logical connective and p == q is not a compound proposition but rather is the statement that p q is a tautology. The symbol {} is sometimes used instead of == to denote logical equivalence.
Remark:
�
Extra � Examples .....
EXAMPLE 2
One way to determine whether two compound propositions are equivalent is to use a truth table. In particular, the compound propositions p and q are equivalent if and only if the columns giving their truth values agree. Example 2 illustrates this method to establish an extremely important and useful logical equivalence, namely, that ...... (p v q) of """p 1\ ......q . This logical equivalence is one of the two De Morgan laws, shown in Table 2, named after the English mathematician Augustus De Morgan, of the mid-nineteenth century. Show that ...... (p v q) and """ p 1\ ......q are logically equivalent. Solution: The truth tables for these compound propositions are
TABLE 3 Truth Tables for .(p V q) and .p 1\ .q . -' q
-,p 1\ -,q
F
F
F
F
T
F
T
F T
T
p
q
pVq
-,(p v q )
-,p
T
T
T
F
T
F T F
T
F
T
F T
F F
F
T
F
1 .' . 1
1 .2 Propositional Equivalences
23
TABLE 4 Truth Tables for -,p V q and
p - q.
EXAMPLE 3
p
q
"' p
"' p V q
p -q
T
T
F
T
T
T F
F T
F T
F T
T
F
F
T
T
T
F
Show that p -+ q and ...,p v q are logically equivalent. Solution: We construct the truth table for these compound propositions in Table 4. Because the .... truth values of ""p v q and p -+ q agree, they are logically equivalent.
We will now establish a logical equivalence of two compound propositions involving three different propositional variables p, q , and r . To use a truth table to establish such a logical equivalence, we need eight rows, one for each possible combination of truth values of these three variables. We symbolically represent these combinations by listing the truth values of p, q, and r, respectively. These eight combinations of truth values are TTT, TTF, TFT, TFF, FTT, FTF, FFT, and FFF; we use this order when we display the rows ofthe truth table. Note that we need to double the number of rows in the truth tables we use to show that compound propositions are equivalent for each additional propositional variable, so that 1 6 rows are needed to establish the logical equivalence of two compound propositions involving four propositional variables, and so on. In general, 2n rows are required if a compound proposition involves n propositional variables. EXAMPLE 4
Show that p V (q 1\ r) and (p v q) 1\ (p V r) are logically equivalent. This is the distributive law of disjunction over conjunction. Solution: We construct the truth table for these compound propositions in Table 5. Because the truth values of p V (q 1\ r) and (p v q) 1\ (p V r) agree, these compound propositions are
....
logically equivalent.
TABLE 5 A Demonstration That p V (q Equivalent. p
1\
r) and (p
V
q) 1\ (p
V
r) Are Logically
q
r
q 1\ r
p V (q l\ r)
pVq
pVr
(p V q) 1\ (p V r)
T
T
T
T
T
T
T
T
T
T
F
F
T
T
T
T
T
F
T
F
T
T
T
T
T
F
F
F
T
T
T
T
F
T
T
T
T
T
T
T
F
T
F
F
F
T
F
F
F
F
T
F
F
F
T
F
F
F
F
F
F
F
F
F
24
1 / The Foundations: Logic and Proofs
1 -24
TABLE 6 Logical Equivalences. Equivalence
Name
p /\ T = p pvF=p
Identity laws
pvT=T p /\ F = F
Domination laws
pvp=p P /\ p = p
Idempotent laws
� ( �p ) = p
Double negation law
pvq =q vp P /\ q = q /\ P
Commutative laws
(p V q ) V r = p V (q v r) (p /\ q ) /\ r = p /\ (q /\ r )
Associative laws
p v (q /\ r ) = (p V q ) /\ (p V r) p /\ (q V r ) = (p /\ q ) V (p /\ r )
Distributive laws
� (p /\ q ) = � p v �q � (p v q ) = � p /\ �q
De Morgan's laws
P V (p /\ q ) = p p /\ (p V q ) = p
Absorption laws
p v �p = T p /\ �p = F
Negation laws
Table 6 contains some important equivalences. * In these equivalences, T denotes the com pound proposition that is always true and F denotes the compound proposition that is al ways false. We also display some useful equivalences for compound propositions involving conditional statements and biconditional statements in Tables 7 and 8, respectively. The reader is asked to verify the equivalences in Tables 6-8 in the exercises at the end of the section. The associative law for disjunction shows that the expression P v q V r is well defined, in the sense that it does not matter whether we first take the disjunction of P with q and then the disjunction of P v q with r , or if we first take the disjunction of q and r and then take the disjunction ofP with q v r . Similarly, the expression P /\ q /\ r is well defined. By extending this reasoning, it follows that PI v P2 V . . . V Pn and PI /\ P2 /\ . . . /\ Pn are well defined whenever PI , P2 , . . . , Pn are propositions. Furthermore, note that De Morgan's laws extend to and --'(PI /\ P2 /\ . . . /\ Pn) == (--'PI V --'P2 V . . . v --'Pn). (Methods for proving these identities will be given in Section 4. 1 .) 'Readers familiar with the concept of a Boolean algebra will notice that these identities are a special case of identities that hold for any Boolean algebra. Compare them with set identities in Table I in Section 2.2 and with Boolean identities in Table 5 in Section I l . l .
1 .2 Propositional Equivalences
1 -25
TABLE 7 Logical Equivalences Involving Conditional Statements. p
---+
p
---+
q == � p v q q == �q
p v q == �p
P 1\
� (p
q == � (p ---+
q) ==
�p
---+
---+
q
---+
�q )
P 1\
�q
25
TABLE 8 Logical Equivalences Involving Biconditionals. P P P
*+ q == (p
q ) 1\ (q
---+
p)
q ) V ( �p
1\
�q )
---+
*+ q == � p *+ �q
*+ q == (p
� (p *+ q ) ==
1\
P
*+ �q
r ) == p ---+ (q 1\ r ) (p ---+ r ) 1\ (q ---+ r ) == (p V q ) ---+ r (p ---+ q ) v (p ---+ r ) == p ---+ (q V r ) (p ---+ r) V (q ---+ r ) == (p 1\ q ) ---+ r (p
---+
q) 1\ (p
---+
Using De Morgan's Laws The two logical equivalences known as De Morgan's laws are particularly important. They tell us how to negate conjunctions and how to negate disjunctions. In particular, the equivalence -'(p v q) == -'p /\ -.q tells us that the negation of a disjunction is formed by taking the con junction of the negations of the component propositions. Similarly, the equivalence -'(p /\ q) == -'p v -.q tells us that the negation of a conjunction is formed by taking the disjunction of the negations of the component propositions. Example 5 illustrates the use of De Morgan's laws. EXAMPLE 5
Use De Morgan's laws to express the negations of "Migue1 has a cellphone and he has a laptop computer" and "Heather will go to the concert or Steve will go to the concert." Solution: Let p be "Miguel has a cellphone" and q be "Miguel has a laptop computer." Then "Miguel has a cellphone and he has a laptop computer" can be represented by p /\ q . By the first of De Morgan's laws, -'(p /\ q) is equivalent to -'p v -'q . Consequently, we can express
unkS�
AUGUSTUS DE MORGAN ( 1 806- 1 87 1 ) Augustus De Morgan was born in India, where his father was a colonel in the Indian army. De Morgan's family moved to England when he was 7 months old. He attended private schools, where he developed a strong interest in mathematics in his early teens. De Morgan studied at Trinity College, Cambridge, graduating in 1 827. Although he considered entering medicine or law, he decided on a career in mathematics. He won a position at University College, London, in 1 828, but resigned when the college dismissed a fellow professor without giving reasons. However, he resumed this position in 1 836 when his successor died, staying there until 1 866. De Morgan was a noted teacher who stressed principles over techniques. His students included many famous mathematicians, including Augusta Ada, Countess of Lovelace, who was Charles Babbage's collaborator in his work on computing machines (see page 27 for biographical notes on Augusta Ada). (De Morgan cautioned the countess against studying too much mathematics, because it might interfere with her childbearing abilities! ) De Morgan was an extremely prolific writer. H e wrote more than 1 000 articles for more than 1 5 periodicals. D e Morgan also wrote textbooks on many subjects, including logic, probability, calculus, and algebra. In 1 83 8 he presented what was perhaps the first clear explanation of an important proof technique known as mathematical induction (discussed in Section 4. 1 of this text), a term he coined. In the 1 840s De Morgan made fundamental contributions to the development of symbolic logic. He invented notations that helped him prove propositional equivalences, such as the laws that are named after him. In 1 842 De Morgan presented what was perhaps the first precise definition of a limit and developed some tests for convergence of infinite series. De Morgan was also interested in the history of mathematics and wrote biographies of Newton and Halley. In 1 837 De Morgan married Sophia Frend, who wrote his biography in 1 882. De Morgan's research, writing, and teaching left little time for his family or social life. Nevertheless, he was noted for his kindness, humor, and wide range of knowledge.
26
/ -26
1 / The FOWldations: Logic and Proofs
the negation of our original statement as "Miguel does not have a cell phone or he does not have a laptop computer." Let r be "Heather will go to the concert" and s be "Steve will go to the concert." Then "Heather will go to the concert or Steve will go to the concert" can be represented by r v s . By the second of De Morgan's laws, ..... (r v s) is equivalent to .....r /\ .....s . Consequently, we can express the negation of our original statement as "Heather will not go to the concert and Steve .... will not go to the concert."
Constructing New Logical Equivalences The logical equivalences in Table 6, as well as any others that have been established (such as those shown in Tables 7 and 8), can be used to construct additional logical equivalences. The reason for this is that a proposition in a compound proposition can be replaced by a compound proposition that is logically equivalent to it without changing the truth value of the original compound proposition. This technique is illustrated in Examples 6-8, where we also use the fact that if p and q are logically equivalent and q and r are logically equivalent, then p and r are logically equivalent (see Exercise 56) . EXAMPLE 6
Extra � Examples �
Show that ..... (p -+ q) and p /\ .....q are logically equivalent. We could use a truth table to show that these compound propositions are equivalent (similar to what we did in Example 4). Indeed, it would not be hard to do so. However, we want to illustrate how to use logical identities that we already know to establish new logical identities, something that is ofpractical importance for establishing equivalences ofcompound propositions with a large number of variables. So, we will establish this equivalence by developing a series of logical equivalences, using one of the equivalences in Table 6 at a time, starting with ..... (p -+ q) and ending with p /\ ..... q. We have the following equivalences . Solution:
..... (p
EXAMPLE 7
-+
q) == ..... ( .....p v q) == ..... ( .....p) /\ .....q == p /\ .....q
by Example
3
by the second De Morgan law by the double negation law
Show that ..... (p v ( .....p /\ q)) and ""'p /\ .....q are logically equivalent by developing a series of logical equivalences. Solution: We will use one of the equivalences in Table 6 at a time, starting with .....(p v ( .....p /\ q)) and ending with ""'p /\ .....q . (Note: we could also easily establish this equivalence using a truth
table.) We have the following equivalences .
..... (p v ( ..... p /\ q)) == ""'p /\ ..... ( .....p /\ q) == ""' p /\ [ ..... ( .....p ) v .....q ] == ""' p /\ (p V .....q) == ( ..... p /\ p ) v ( .....p /\ .....q) == F v ( ..... p /\ .....q) == ( .....p /\ ..... q) v F
by the second De Morgan law by the first De Morgan law by the double negation law by the second distributive law because
-' p /\ P ==
F
by the commutative law for disjunction by the identity law for F
Consequently ..... (p v ( .....p /\ q)) and .....p /\ .....q are logically equivalent.
/ :1 7
1 .2 Propositional Equivalences
EXAMPLE 8
27
Show that (p 1\ q) --+ (p v q) is a tautology. Solution: To show that this statement is a tautology, we will use logical equivalences to demon strate that it is logically equivalent to T. (Note: This could also be done using a truth table.) (p 1\ q) --+ (p v q) =. -'(p 1\ q) V (p V q) =. (-'p v -.q) v (p v q) =. (-'p v p) v (-.q v q)
by Example 3 by the first De Morgan law by the associative and commutative laws for disjunction
=. T v T
by Example I and the commutative law for disjunction
=. T
by the domination law
A truth table can be used to determine whether a compound proposition is a tautology. This can be done by hand for a compound proposition with a small number of variables, but when the number of variables grows, this becomes impractical. For instance, there are 2 20 1 , 048, 576 rows in the truth value table for a compound proposition with 20 variables. Clearly, you need a computer to help you determine, in this way, whether a compound proposition in 20 variables is a tautology. But when there are 1 000 variables, can even a computer determine in a reasonable amount of time whether a compound proposition is a tautology? Checking every one of the 2 1 000 (a number with more than 300 decimal digits) possible combinations of truth values simply cannot be done by a computer in even trillions of years. Furthermore, no other procedures are known that a computer can follow to determine in a reasonable amount of time whether a compound proposition in such a large number of variables is a tautology. We will study questions such as this in Chapter 3, when we study the complexity of algorithms. =
Ulksitl
AUGUSTA ADA, COUNTESS OF LOVELACE ( 1 8 1 5-1 852) Augusta Ada was the only child from the marriage of the famous poet Lord Byron and Lady Byron, Annabella Millbanke, who separated when Ada was 1 month old, because of Lord Byron's scandalous affair with his half sister. The Lord Byron had quite a reputation, being described by one of his lovers as "mad, bad, and dangerous to know." Lady Byron was noted for her intellect and had a passion for mathematics; she was called by Lord Byron "The Princess of Parallelograms." Augusta was raised by her mother, who encouraged her intellectual talents especially in music and mathematics, to counter what Lady Byron considered dangerous poetic tendencies. At this time, women were not allowed to attend universities and could not join learned societies. Nevertheless, Augusta pursued her mathematical studies independently and with mathematicians, including William Frend. She was also encouraged by another female mathematician, Mary Somerville, and in 1 834 at it dinner party hosted by Mary Somerville, she learned about Charles Babbage's ideas for a calculating machine, called the Analytic Engine. In 1 838 Augusta Ada married Lord King, later elevated to Earl of Lovelace. Together they had three children. Augusta Ada continued her mathematical studies after her marriage. Charles Babbage had continued work on his Analytic Engine and lectured on this in Europe. In 1 842 Babbage asked Augusta Ada to translate an article in French describing Babbage's invention. When Babbage saw her translation, he suggested she add her own notes, and the resulting work was three times the length of the original. The most complete accounts of the Analytic Machine are found in Augusta Ada's notes. In her notes, she compared the working of the Analytic Engine to that of the Jacquard loom, with Babbage's punch cards analogous to the cards used to create patterns on the loom. Furthermore, she recognized the promises of the machine as a general purpose computer much better than Babbage did. She stated that the "engine is the material expression of any indefinite fimction of any degree of generality and complexity." Her notes on the Analytic Engine anticipate many future developments, including computer-generated music. Augusta Ada published her writings under her initials A.A.L. concealing her identity as a women as did many women did at a time when women were not considered to be the intellectual equals of men. After 1 845 she and Babbage worked toward the development of a system to predict horse races. Unfortunately, their system did not work well, leaving Augusta heavily in debt at the time of her death at an unfortunately young age from uterine cancer. In 1 953 Augusta Ada's notes on the Analytic Engine were republished more than 1 00 years after they were written, and after they had been long forgotten. In his work in the 1 950s on the capacity of computers to think (and his famous Turing Test), Alan Turing responded to Augusta Ada's statement that "The Analytic Engine has no pretensions whatever to originate anything. It can do whatever we know how to order it to perform." This "dialogue" between Turing and Augusta Ada is still the subject of controversy. Because of her fimdamental contributions to computing, the programming language Augusta is named in honor of the Countess of Lovelace.
28
I
/ The Foundations: Logic and Proofs
1-28
Exercises 1. Use truth tables to verify these equivalences.
a) p /\ T == p b) p V F == P d) p v T == T P /\ F == F p v p == p t) P /\ P == P Show that -'(-'p) and p are logically equivalent.
c) e)
2. 3. Use truth tables to verify the commutative laws v
9. Show that each of these conditional statements is a tau tology by using truth tables. b) p -'; (p v q) a) (P /\ q) -,; p d ) (p /\ q) -'; (p -'; q) c) -'p -'; (p -'; q) e ) -'(p -'; q) -'; P t) -'(p -'; q) -'; -.q 10. Show that each of these conditional statements is a tau tology by using truth tables.
a) [-' P /\ (p V q)] -'; q
b) p /\ q == q /\ P 4. Use truth tables to verify the associative laws
a) p V q == q
p
a) (p v q) V r == p V (q V r) b) (p /\ q) /\ r == p /\ (q /\ r)
5 . U s e a truth table t o verify the distributive law
p /\ (q V r)
==
(p /\ q) V (p /\ r).
6. Use a truth table to verify the first De Morgan law
-'(p /\ q)
==
-'p v -'q .
7. Use De Morgan's laws to find the negation of each of the following statements.
a) Jan is rich and happy. b) Carlos will bicycle or run tomorrow. c) Mei walks or takes the bus to class. d) Ibrahim is smart and hard working. 8.
Use De Morgan's laws to find the negation of each of the following statements.
a)
Kwame will take a job in industry or go to graduate school. b) Yoshiko knows Java and calculus. c) James is young and strong. d) Rita will move to Oregon or Washington.
11. 12. 13. 14. 15.
b) [(p -'; q) /\ (q -'; r)] -'; (p -'; r) c) [p /\ (p -'; q)] -'; q d) [(p v q) /\ (p -'; r) /\ (q -'; r)] -'; r Show that each conditional statement in Exercise 9 is tautology without using truth tables. Show that each conditional statement in Exercise l O is tautology without using truth tables. Use truth tables to verify the absorption laws. b) p /\ (p V q) == P a) p v (p /\ q) == p Determine whether (-'P /\ (p -'; q» -'; -.q is tautology. Determine whether (-.q /\ (p -'; q» -'; -'p is tautology.
a a
a a
Each of Exercises 1 6-28 asks you to show that two compound propositions are logically equivalent. To do this, either show that both sides are true, or that both sides are false, for exactly the same combinations of truth values of the propositional variables in these expressions (whichever is easier).
16. Show that equivalent. 17. Show that equivalent.
p
*+
q
and
-'(p *+ q)
(p /\ q ) V ( -' P /\ -.q)
and
p *+ -.q
are
are logically
HENRY MAURICE SHEFFER ( 1 883-1 964) Henry Maurice Sheffer, born to Jewish parents in the western Ukraine, emigrated to the United States in 1 892 with his parents and six siblings. He studied at the Boston Latin School before entering Harvard, where he completed his undergraduate degree in 1 905, his master's in 1 907, and his Ph.D. in philosophy in 1 908. After holding a postdoctoral position at Harvard, Henry traveled to Europe on a fellowship. Upon returning to the United States, he became an academic nomad, spending one year each at the University of Washington, Cornell, the University of Minnesota, the University of Missouri, and City College in New York. In 1 9 1 6 he returned to Harvard as a faculty member in the philosophy department. He remained at Harvard until his retirement in 1 952. Sheffer introduced what is now known as the Sheffer stroke in 1 9 1 3 ; it became well known only after its use in the 1 925 edition of Whitehead and Russell's Principia Mathematica. In this same edition Russell wrote that Sheffer had invented a powerful method that could be used to simplify the Principia. Because of this comment, Sheffer was something of a mystery man to logicians, especially because Sheffer, who published little in his career, never published the details of this method, only describing it in mimeographed notes and in a brief published abstract. Sheffer was a dedicated teacher of mathematical logic. He liked his classes to be small and did not like auditors. When strangers appeared in his classroom, Sheffer would order them to leave, even his colleagues or distinguished guests visiting Harvard. Sheffer was barely five feet tall; he was noted for his wit and vigor, as well as for his nervousness and irritability. Although widely liked, he was quite lonely. He is noted for a quip he spoke at his retirement: "Old professors never die, they just become emeriti." Sheffer is also credited with coining the term "Boolean algebra" (the subject of Chapter 1 1 ofthis text). Sheffer was briefly married and lived most of his later life in small rooms at a hotel packed with his logic books and vast files of slips of paper he used to jot down his ideas. Unfortunately, Sheffer suffered from severe depression during the last two decades of his life.
1 .2 Propositional Equivalences
1-29
Show that p -+ q and �q -+ �p are logically equivalent. Show that �p ++ q and p ++ �q are logically equivalent. Show that �(p EB q) and p ++ q are logically equivalent. Show that �(p ++ q) and �p ++ q are logically equivalent. 22. Show that (p -+ q) /\ (p -+ r) and p -+ (q /\ r) are log ically equivalent. 23. Show that (p -+ r) /\ (q -+ r) and (p v q) -+ r are log ically equivalent. 24. Show that (p -+ q) v (p -+ r) and p -+ (q V r) are log ically equivalent. 25. Show that (p -+ r) V (q -+ r) and (p /\ q) -+ r are log ically equivalent. 26. Show that � p -+ (q -+ r) and q -+ (p V r) are logically equivalent. 27. Show that p ++ q and (p -+ q) /\ (q -+ p) are logically equivalent. 28. Show that p ++ q and �p ++ �q are logically equivalent. 29. Show that (p -+ q) /\ (q -+ r) -+ (p -+ r) is a tautology. 30. Show that (p v q) /\ (�p V r) -+ (q V r) is a tautology. 3 1 . Show that (p -+ q) -+ r and p -+ (q -+ r) are not equivalent. 32. Show that (p /\ q) -+ r and (p -+ r) /\ (q -+ r) are not equivalent. 33. Show that (p -+ q) -+ (r -+ s) and (p -+ r) -+ (q -+ s) are not logically equivalent. The dual of a compound proposition that contains only the logical operators v, /\, and � is the compound proposition obtained by replacing each v by /\, each /\ by v, each T by F, and each F by T. The dual of s is denoted by s* . 34. Find the dual of each of these compound propositions. a) p V �q b) P /\ (q V (r /\ T» 18. 19. 20. 21.
c)
(p /\ �q) v (q /\ F)
35. Find the dual of each of these compound propositions. b) (p /\ q /\ r) v s a) p /\ �q /\ �r c) (p v F) /\ (q v T) 36. When does s* = s, where s is a compound proposition? 37. Show that (s*)* = s when s is a compound proposition. 38. Show that the logical equivalences in Table 6, except for
the double negation law, come in pairs, where each pair contains compound propositions that are duals of each other. **39. Why are the duals of two equivalent compound proposi tions also equivalent, where these compound propositions contain only the operators /\, v, and �? 40. Find a compound proposition involving the propositional variables p, q , and r that is true when p and q are true and r is false, but is false otherwise. [Hint: Use a conjunction of each propositional variable or its negation.] 41. Find a compound proposition involving the propositional variables p, q, and r that is true when exactly two of p, q, and r are true and i s false otherwise. [Hint: Form a dis junction of conjunctions. Include a conjunction for each
29
combination of values for which the propositional vari able is true. Each conjunction should include each of the three propositional variables or their negations.] l? 42. Suppose that a truth table in n propositional variables is specified. Show that a compound proposition with this truth table can be formed by taking the disjunction of conjunctions of the variables or their negations, with one conjunction included for each combination of values for which the compound proposition is true. The resulting compound proposition is said to be in disjunctive normal form.
A collection of logical operators is called functionally com plete if every compound proposition is logically equiva
lent to a compound proposition involving only these logical operators. 43. Show that �, /\, and v form a functionally complete col lection of logical operators. [Hint: Use the fact that every compound proposition is logically equivalent to one in disjunctive normal form, as shown in Exercise 42.] *44. Show that � and /\ form a functionally complete collec tion of logical operators. [Hint: First use a De Morgan law to show that p v q is equivalent to �(�p /\ �q).] *45. Show that � and v form a functionally complete collection of logical operators. The following exercises involve the logical operators NAND and NOR. The proposition p NAND q is true when either p or q , or both, are false; and it is false when both p and q are true. The proposition p NOR q is true when both p and q are false, and it is false otherwise. The propositions p NAND q and p NOR q are denoted by p I q and p + q , respectively. (The op erators I and + are called the Sheffer stroke and the Peirce arrow after H. M. Sheffer and C. S. Peirce, respectively.) 46. Construct a truth table for the logical operator NAND. 47. Show that p I q is logically equivalent to �(p /\ q). 48. Construct a truth table for the logical operator NOR. 49. Show that p + q is logically equivalent to �(p V q). 50. In this exercise we will show that { + } is a functionally complete collection of logical operators. a) Show that p + p is logically equivalent to �p. b) Show that (p + q) + (p + q) is logically equivalent to
p v q.
c) Conclude from parts (a) and (b), and Exercise 49, that
*51. 52. 53. 54. *55.
{ + } is a functionally complete collection of logical operators. Find a compound proposition logically equivalent to p -+ q using only the logical operator +. Show that {I} i s a functionally complete collection of log ical operators. Show that p I q and q I p are equivalent. Show that p I (q I r) and (p I q) I r are not equivalent, so that the logical operator I is not associative. How many different truth tables of compound proposi tions are there that involve the propositional variables p and q ?
30
/-30
1 / The Foundations: Logic and Proofs
56. Show that if p, q , and r are compound propositions
such that p and q are logically equivalent and q and r are logically equivalent, then p and r are logically equivalent. 57. The following sentence is taken from the specification of a telephone system: "If the directory database is opened, then the monitor is put in a closed state, if the system is not in its initial state." This specification is hard to under stand because it involves two conditional statements. Find an equivalent, easier-to-understand specification that in volves disjunctions and negations but not conditional statements. 58. How many of the disjunctions p v -'q , -'p v q , q V r , q V -' r , and -'q v -, r can b e made simultaneously true by an assignment of truth values to p, q , and r? 59. How many of the disjunctions p v -'q v s , -,p V -, r V s, -' p V -' r V -,s , -' p v q v -'s , q V r V -'s , q V -, r V -'S , -'p V -'q V -'S , P v r v s , and p v r v -'s can be made
simultaneously true by an assignment of truth values to p, q , r , and s? A compound proposition is satisfiable if there is an assign ment of truth values to the variables in the compound propo sition that makes the statement form true. 60. Which of these compound propositions are satisfiable? a) (p v q V -,r ) /\ (p V -'q V -,s) /\ (p V -'r V -,s ) /\ (-'p v -'q v -,s) /\ (p V q V -,s) b) (-'p v -'q v r ) /\ (-'p v q v -,s) /\ (p V -'q V -'s) /\ (-'p v -,r V -,s ) /\ (p V q V -'r ) /\ (p V -'r V -,s ) c) (p v q v r ) /\ (p V -'q V -,s) /\ (q V -'r V s) /\ (-'p v r V s) /\ (-'p v q v -,s) /\ (p V -'q V -,r) /\ (-'p v -'q v s) /\ (-'p v -'r V -,s) 61. Explain how an algorithm for determining whether a compound proposition is satisfiable can be used to de termine whether a compound proposition is a tautology. [Hint: Look at -'P, where p is the compound proposition that is being examined.]
1.3 Predicate� and Quantifiers Introduction Propositional logic, studied in Sections 1 . 1 and 1 .2, cannot adequately express the meaning of statements in mathematics and in natural language. For example, suppose that we know that "Every computer connected to the university network is functioning properly." No rules of propositional logic allow us to conclude the truth of the statement "MATH3 is functioning properly," where MATH3 is one of the computers connected to the university network. Likewise, we cannot use the rules of propositional logic to conclude from the statement "CS2 is under attack by an intruder," where CS2 is a computer on the university network, to conclude the truth of "There is a computer on the university network that is under attack by an intruder." In this section we will introduce a more powerful type of logic called predicate logic. We . will see how predicate logic can be used to express the meaning of a wide range of statements in mathematics and computer science in ways that permit us to reason and explore relationships between objects. To understand predicate logic, we first need to introduce the concept of a predicate. Afterward, we will introduce the notion of quantifiers, which enable us reason with statements that assert that a certain property holds for all objects of a certain type and with statements that assert the existence of an object with a particular property.
/-3 /
1 .3 Predicates and Quantifiers
31
Predicates Statements involving variables, such as "x
>
3," "x
= y + 3,"
"x + y = z,"
"computer x is under attack by an intruder," and "computer x is functioning properly," are often found in mathematical assertions, in computer programs, and in system specifications. These statements are neither true nor false when the values of the variables are not specified. In this section, we will discuss the ways that propositions can be produced from such statements. The statement "x is greater than 3" has two parts. The first part, the variable x, is the subject of the statement. The second part-the predicate, "is greater than 3"-refers to a property that the subject of the statement can have. We can denote the statement "x is greater than 3" by P(x), where P denotes the predicate "is greater than 3" and x is the variable. The statement P(x) is also said to be the value of the propositional function P at x. Once a value has been assigned to the variable x, the statement P(x) becomes a proposition and has a truth value. Consider Examples 1 and 2. EXAMPLE 1
Let P(x) denote the statement "x
>
3." What are the truth values of P(4) and P(2)?
Solution: We obtain the statement P (4) by setting x = 4 in the statement "x > 3." Hence, P (4), which is the statement "4 > 3," is true. However, P (2), which is the statement "2 > 3," is .... false. EXAMPLE 2
Let A(x) denote the statement "Computer x is under attack by an intruder." Suppose that ofthe computers on campus, only CS2 and MATH 1 are currently under attack by intruders. What are truth values of A(CS l), A(CS2), and A(MATH l)?
= CSI in the statement "Computer x is under attack by an intruder." Because CS 1 is not on the list of computers currently under attack, we conclude that A(CS I) is false. Similarly, because CS2 and MATH 1 are on the list of .... computers under attack, we know that A(CS2) and A(MATHl ) are true. Solution: We obtain the statement A(CS l) by setting x
We can also have statements that involve more than one variable. For instance, consider the statement "x = y + 3." We can denote this statement by Q(x , y), where x and y are variables and Q is the predicate. When values are assigned to the variables x and y, the statement Q(x , y) has a truth value. EXAMPLE 3
Let Q(x , y) denote the statement "x Q(1 , 2) and Q(3, O)?
= y + 3." What are the truth values of the propositions
= 1 and y = 2 in the statement Q(x , y). Hence, Q(1 , 2) is the statement " 1 = 2 + 3," which is false. The statement Q(3, 0) is the proposition "3 = 0 + 3," .... which is true.
Solution: To obtain Q(I , 2), set x
32
1 / The Foundations: Logic and Proofs EXAMPLE
4
1 -32
Let A(c, n ) denote the statement "Computer c is connected to network n ," where c is a variable representing a computer and n is a variable representing a network. Suppose that the computer MATH I is connected to network CAMPUS2, but not to network CAMPUS l . What are the values of A(MATH I , CAMPUS 1) and A(MATHI , CAMPUS2)? Solution: Because MATH 1 is not connected to the CAMPUS 1 network, we see that A(MATH I ,
CAMPUS I ) is false. However, because MATHI i s connected to the CAMPUS2 network, we ... see that A (MATH 1 , CAMPUS2) is true.
Similarly, we can let R (x , y , z) denote the statement "x + y = z." When values are assigned to the variables x , y, and z, this statement has a truth value. EXAMPLE 5
What are the truth values of the propositions R(1 , 2, 3) and R(O, 0, I)? Solution: The proposition R(1 , 2, 3) is obtained by setting x = 1 , Y = 2, and z = 3 in the statement R (x , y , z). We see that R(1 , 2, 3) is the statement "1 + 2 = 3," which is true. Also
note that R(O, 0, 1), which is the statement "0 + 0 = 1 ," is false. In general, a statement involving the n variables Xl, X2,
•
•
•
...
, Xn can be denoted by
A statement of the form P(XI, X2, , Xn ) is the value of the propositional function P at the n-tuple (Xl, X2, , xn ) , and P is also called a n-place predicate or a n-ary predicate. Propositional functions occur in computer programs, as Example 6 demonstrates. •
.
unkS�
•
•
.
•
CHARLES SANDERS PEIRCE ( 1 839- 1 9 1 4) Many consider Charles Peirce the most original and versatile intellect from the United States; he was born in Cambridge, Massachusetts. He made important contributions to an amazing number of disciplines, including mathematics, astronomy, chemistry, geodesy, metrology, engineer ing, psychology, philology, the history of science, and economics. He was also an inventor, a lifelong student of medicine, a book reviewer, a dramatist and an actor, a short story writer, a phenomenologist, a logician, and a metaphysician. He is noted as the preeminent system-building philosopher competent and productive in logic, mathematics, and a wide range of sciences. His father, Benjamin Peirce, was a professor of mathematics and natural philosophy at Harvard. Peirce attended Harvard ( 1 855-1 859) and received a Harvard master of arts degree ( 1 862) and an advanced degree in chemistry from the Lawrence Scientific School ( 1 863). His father encouraged him to pursue a career in science, but instead he chose to study logic and scientific methodology. In 1 86 1 , Peirce became an aide in the United States Coast Survey, with the goal of better understanding scientific methodology. His service for the Survey exempted him from military service during the Civil War. While working for the Survey, Peirce carried out astronomical and geodesic work. He made fundamental contributions to the design of pendulums and to map projections, applying new mathematical developments in the theory of elliptic functions. He was the first person to use the wavelength of light as a unit of measurement. Peirce rose to the position of Assistant for the Survey, a position he held until he was forced to resign in 1 89 1 when he disagreed with the direction taken by the Survey's new administration. Although making his living from work in the physical sciences, Peirce developed a hierarchy of sciences, with mathematics at the top rung, in which the methods of one science could be adapted for use by those sciences under it in the hierarchy. He was also the founder of the American philosophical theory of pragmatism. The only academic position Peirce ever held was as a lecturer in logic at Johns Hopkins University in Baltimore from 1 879 to 1 884. His mathematical work during this time included contributions to logic, set theory, abstract algebra, and the philosophy of mathematics. His work is still relevant today; some of his work on logic has been recently applied to artificial intelligence. Peirce believed that the study of mathematics could develop the mind's powers of imagination, abstraction, and generalization. His diverse activities after retiring from the Survey included writing for newspapers and journals, contributing to scholarly dictionaries, translating scientific papers, guest lecturing, and textbook writing. Unfortunately, the income from these pursuits was insufficient to protect him and his second wife from abject poverty. He was supported in his later years by a fund created by his many admirers and administered by the philosopher William James, his lifelong friend. Although Peirce wrote and published voluminously in a vast range of subjects, he left more than 1 00,000 pages of unpublished manuscripts. Because of the difficulty of studying his unpublished writings, scholars have only recently started to understand some of his varied contributions. A group of people is devoted to making his work available over the Internet to bring a better appreciation of Peirce's accomplishments to the world.
1 .3 Predicates and Quantifiers
1-33
EXAMPLE 6
33
Consider the statement if x
>
0 then x
:= x + 1 .
When this statement is encountered in a program, the value of the variable x at that point in the execution of the program is inserted into P (x), which is "x > 0." If P (x) is true for this value of x, the assignment statement x := x + 1 is executed, so the value of x is increased by 1 . If P(x) is false for this value of x, the assignment statement is not executed, so the value of x is � not changed. Predicates are also used in the verification that computer programs always produce the desired output when given valid input. The statements that describe valid input are known as preconditions and the conditions that the ouput should satisfy when the program has run are known as postconditions. As Example 7 illustrates, we use predicates to describe both preconditions and postconditions. We will study this process in greater detail in Section 4.4. EXAMPLE 7
Consider the following program, designed to interchange the values of two variables x and y. t emp
:= x
x
.
y
: = t emp
-
y
Find predicates that we can use as the precondition and the postcondition to verify the correctness of this program. Then explain how to use them to verify that for all valid input the program does what is intended. Solution: For the precondition, we need to express that x and y have particular values before
we run the program. So, for this precondition we can use the predicate P(x , y), where P(x , y) is the statement "x = a and y = b," where a and b are the values of x and y before we run the program. Because we want to verify that the program swaps the values of x and y for all input values, for the postcondition we can use Q(x , y), where Q(x , y) is the statement "x = b and y = a ." To verify that the program always does what it is supposed to do, suppose that the precon dition P(x , y) holds. That is, we suppose that the statement "x = a and y = b" is true. This means that x = a and y = b. The first step of the program, temp := x, assigns the value ofx to the variable temp, so after this step we know that x = a , temp = a, and y = b. After the second step of the program, x := y, we know that x = b, temp = a, and y = b. Finally, after the third step, we know that x = b, temp = a , and y = a . Consequently, after this program is run, the � postcondition Q(x , y) holds, that is, the statement "x = b and y = a" is true.
Quantifiers
Assessment �
When the variables in a propositional function are assigned values, the resulting statement becomes a proposition with a certain truth value. However, there is another important way, called quantification, to create a proposition from a propositional function. Quantification expresses the extent to which a predicate is true over a range of elements. In English, the words all, some, many, none, and few are used in quantifications. We will focus on two types of quantification here: universal quantification, which tells us that a predicate is true for every element under consideration, and existential quantification, which tells us that there is one or more element under consideration for which the predicate is true. The area of logic that deals with predicates and quantifiers is called the predicate calculus.
34
I
/ The Foundations: Logic and Proofs
Assessmenl �
DEFINITION 1
1-34
THE UNIVERSAL QUANTIFIER Many mathematical statements assert that a property is true for all values of a variable in a particular domain, called the domain of discourse (or the universe of discourse), often just referred to as the domain. Such a statement is expressed using universal quantification. The universal quantification of P (x) for a particular domain is the proposition that asserts that P(x) is true for all values of x in this domain. Note that the domain specifies the possible values of the variable x . The meaning of the universal quantification of P (x) changes when we change the domain. The domain must always be specified when a universal quantifier is used; without it, the universal quantification of a statement in not defined.
The universal quantification of P(x) is the statement "P(x) for all values ofx in the domain." The notation Vx P(x) denotes the universal quantification of P(x). Here V is called the
universal quantifier. We read Vx P(x) as "for all x P(x)" or "for every x P(x )." An element for which P(x) is false is called a counterexample ofVx P(x).
The meaning of the universal quantifier is summarized in the first row of Table 1. We illustrate the use of the universal quantifier in Examples 8-1 3. EXAMPLE S
Let P(x) be the statement "x + 1 > x ." What is the truth value of the quantification Vx P(x), where the domain consists of all real numbers? Solution: Because P(x) is true for all real numbers x, the quantification
Vx P (x) is true. Generally, an implicit assumption is made that all domains of discourse for quantifiers are nonempty. Note that if the domain is empty, then Vx P (x) is true for any propositional function P (x) because there are no elements x in the domain for which P(x) is false.
Remark:
Besides "for all" and "for every," universal quantification can be expressed in many other ways, including "all of," "for each," "given any," "for arbitrary," "for each," and "for any." It is best to avoid using "for any x" because it is often ambiguous as to whether "any" means "every" or "some." In some cases, "any" is unambiguous, such as when it is used in negatives, for example, "there is not any reason to avoid studying."
Remark:
TABLE 1 Quantifiers. Statement
When True?
When False?
Vx P (x)
P (x) is true for every x. There is an x for which P (x) is true.
There is an x for which P(x) is false.
3x P (x)
P (x) is false for every x.
1 .3 Predicates and Quantifiers 35
1-35
A statement 'Ix P (x ) is false, where P (x) is a propositional function, if and only if P (x) is not always true when x is in the domain. One way to show that P (x) is not always true when x is in the domain is to find a counterexample to the statement 'Ix P (x). Note that a single counterexample is all we need to establish that 'Ix P (x) is false. Example 9 illustrates how counterexamples are used. EXAMPLE 9
Let Q(x) be the statement "x < 2." What is the truth value of the quantification Vx Q (x), where the domain consists of all real numbers? Solution: Q(x) is not true for every real number x , because, for instance, Q (3) is false. That is, x = 3 is a counterexample for the statement Vx Q (x). Thus Vx Q(x)
is false. EXAMPLE 10
Suppose that P (x) is "x 2 > 0." To show that the statement Vx P (x) is false where the uni verse of discourse consists of all integers, we give a counterexample. We see that x = 0 is a counterexample because x 2 = 0 when x = 0, so that x 2 is not greater than 0 when x = O . .... Looking for counterexamples to universally quantified statements is an important activity in the study of mathematics, as we will see in subsequent sections of this book. When all the elements in the domain can be listed-say, X I , X2 , . . . , xn-it follows that the universal quantification 'Ix P (x) is the same as the conjunction
because this conjunction is true if and only if P (X I ), P (X2 ), . . . , P (xn ) are all true. EXAMPLE 11
What is the truth value of 'Ix P (x), where P (x) is the statement "x 2 consists of the positive integers not exceeding 4?
<
1 0" and the domain
Solution: The statement 'Ix P (x) is the same as the conjunction P ( l ) 1\ P (2) 1\ P (3) 1\ P (4),
because the domain consists of the integers 1 , 2, 3, and 4. Because P (4), which is the statement .... "42 < 1 0," is false, it follows that 'Ix P (x) is false. EXAMPLE 12
What does the statement VxN(x) mean if N(x) is "Computer x is connected to the network" and the domain consists of all computers on campus? Solution: The statement VxN(x) means that for every computer x on campus, that computer x
is connected to the network. This statement can be expressed in English as "Every computer on .... campus is connected to the network."
As we have pointed out, specifying the domain is mandatory when quantifiers are used. The truth value of a quantified statement often depends on which elements are in this domain, as Example 1 3 shows.
36
1 / The Foundations: Logic and Proofs
EXAMPLE 13
1-36
What is the truth value of Yx(x 2 2: x) if the domain consists of all real numbers? What is the truth value of this statement if the domain consists of all integers? Solution: The universal quantification Yx(x 2 2: x), where the domain consists of all real num bers, is false. For example, ( i )2 t i . Note that x 2 2: x if and only if x 2 - x = x(x - 1) � O.
Consequently, x 2 � x if and only if x ::'S 0 or x � 1 . It follows that Yx(x 2 2: x) is false if the domain consists of all real numbers (because the inequality is false for all real numbers x with o < x < 1). However, if the domain consists of the integers, Yx(x 2 2: x) is true, because there .... are no integers x with 0 < x < 1 .
THE EXISTENTIAL QUANTIFIER Many mathematical statements assert that there is an element with a certain property. Such statements are expressed using existential quantification. With existential quantification, we form a proposition that is true if and only if P(x) is true for at least one value of x in the domain. DEFINITION 2
The existential quantification of P(x) is the proposition "There exists an element x We use the notation
in the domain such that P (x )."
3x P(x) for the existential quantification of P(x). Here 3 is called the
existential quantifier.
A domain must always be specified when a statement 3x P(x) is used. Furthermore, the meaning of 3x P (x) changes when the domain changes. Without specifying the domain, the statement 3x P (x) has no meaning. The existential quantification 3x P (x) is read as "There is an x such that P(x)," "There is at least one x such that P(x )," or "For some x P (x)." Besides the words "there exists," we can also express existential quantification in many other ways, such as by using the words "for some," "for at least one," or "there is." The meaning of the existential quantifier is summarized in the second row of Table 1 . We illustrate the use of the existential quantifier in Examples 14-16. EXAMPLE 14
Let P (x) denote the statement "x > 3." What is the truth value of the quantification 3x P (x), where the domain consists of all real numbers? Solution: Because "x
> 3" is sometimes true-for instance, when x tification of P (x), which is 3x P(x), is true.
= 4-the existential quan
....
Observe that the statement 3x P (x) is false if and only ifthere is no element x in the domain for which P (x) is true. That is, 3x P (x) is false if and only P(x) is false for every element of the domain. We illustrate this observation in Example 1 5 . EXAMPLE 15
Let Q(x) denote the statement "x = x + I ." What is the truth value of the quantification 3x Q(x), where the domain consists of all real numbers?
1 .3 Predicates and Quantifiers 37
1-3 7
Solution: Because Q(x) is false for every real number x, the existential quantification of Q(x), which is 3x Q(x), is false. �
Generally, an implicit assumption is made that all domains of discourse for quantifiers are nonempty. If the domain is empty, then 3x Q(x) is false whenever Q(x) is a propositional function because when the domain is empty, there can be no element in the domain for which Q(x) is true.
Remark:
When all elements in the domain can be listed-say, XI , X2 , . . . , xn- the existential quan tification 3x P (x) is the same as the disjunction
because this disjunction is true if and only if at least one of P (X I ), P(X2 ), . . . , P (xn) is true. EXAMPLE 16
What is the truth value onx P (x), where P(x) is the statement "x 2 discourse consists of the positive integers not exceeding 4?
> 1 0"
and the universe of
Solution: Because the domain is { l , 2, 3 , 4}, the proposition 3x P (x) is the same as the
disjunction
P( l) v P(2) v P(3) v P(4). Because P(4), which is the statement "42
>
1 0," is true, it follows that 3x P(x) is true.
It is sometimes helpful to think in terms of looping and searching when determining the truth value of a quantification. Suppose that there are n objects in the domain for the variable x . To determine whether Yx P (x) is true, we can loop through all n values ofx to see if P (x) is always true. Ifwe encounter a value x for which P (x) is false, then we have shown that Yx P(x) is false. Otherwise, Yx P(x) is true. To see whether 3x P (x) is true, we loop through the n values of x searching for a value for which P (x) is true. If we find one, then 3x P (x) is true. If we never find such an x, then we have determined that 3x P (x) is false. (Note that this searching procedure does not apply if there are infinitely many values in the domain. However, it is still a useful way of thinking about the truth values of quantifications.)
Other Quantifiers We have now introduced universal and existential quantifiers. These are the most important quantifiers in mathematics and computer science. However, there is no limitation on the number of different quantifiers we can define, such as "there are exactly two," "there are no more than three," "there are at least 1 00," and so on. Of these other quantifiers, the one that is most often seen is the uniqueness quantifier, denoted by 3 ! or 31 . The notation 3 !x P (x) [or 3lx P(x)] states "There exists a unique x such that P (x) is true." Other phrases for uniqueness quantification include "there is exactly one" and "there is one and only one." Observe that we can use quantifiers and propositional logic to express uniqueness (see Exercise 52 in Section 1 . 4), so the uniqueness quantifier can be avoided. Generally, it is best to stick with existential and universal quantifiers so that rules of inference for these quantifiers can be used.
38
I
/ The Foundations: Logic and Proofs
/ -38
Quantifiers with Restricted Domains An abbreviated notation is often used to restrict the domain of a quantifier. In this notation, a condition a variable must satisfY is included after the quantifier. This is illustrated in Example 1 7. We will also describe other forms of this notation involving set membership in Section 2. 1 . EXAMPLE 17
What do the statements ' 0), ' 0 (z2 the domain in each case consists of the real numbers?
= 2) mean, where
0 (x 2 > 0) states that for every real number x with x < 0, x 2 > O . That is, it states "The square of a negative real number is positive." This statement is the same as ' 0). The statement ' 0 (z2 = 2) states that there exists a real number z with z > 0 such that z 2 = 2. That is, it states "There is a positive square root of 2." This statement is equivalent <0lIl to 3z(z > 0 1\ z2 = 2). Solution: The statement '
<
Note that the restriction of a universal quantification is the same as the universal quantifi cation of a conditional statement. For instance, ' 0) is another way of expressing ' 0). On the other hand, the restriction of an existential quantification is the same as the existential quantification of a conjunction. For instance, 3z > 0 (z2 = 2) is another way of expressing 3z(z > 0 1\ z2 = 2).
Precedence of Quantifiers The quantifiers '
Binding Variables When a quantifier is used on the variable x, we say that this occurrence of the variable is bound. An occurrence of a variable that is not bound by a quantifier or set equal to a particular value is said to be free. All the variables that occur in a propositional function must be bound or set equal to a particular value to turn it into a proposition. This can be done using a combination of universal quantifiers, existential quantifiers, and value assignments. The part of a logical expression to which a quantifier is applied is called the scope of this quantifier. Consequently, a variable is free if it is outside the scope of all quantifiers in the formula that specifies this variable. EXAMPLE 18
In the statement 3x(x + y = 1), the variable x is bound by the existential quantification 3x, but the variable y is free because it is not bound by a quantifier and no value is assigned to this variable. This illustrates that in the statement 3x(x + y = I), x is bound, but y is free. In the statement 3x(P(x) 1\ Q(x» v '
1 .3 Predicates and Quantifiers
/-39
39
using two different variables x and y, as 3x(P(x) 1\ Q(x)) v VyR(y), because the scopes of the two quantifiers do not overlap. The reader should be aware that in common usage, the same letter is often used to represent variables bound by different quantifiers with scopes that do not � overlap.
Logical Equivalences Involving Quantifiers In Section 1 .2 we introduced the notion of logical equivalences of compound propositions. We can extend this notion to expressions involving predicates and quantifiers. DEFINITION 3
Statements involving predicates and quantifiers are logically equivalent if and only if they have the same truth value no matter which predicates are substituted into these statements and which domain of discourse is used for the variables in these propositional functions. We use the notation S == T to indicate that two statements S and T involving predicates and quantifiers are logically equivalent. Example 1 9 illustrates how to show that two statements involving predicates and quantifiers are logically equivalent.
EXAMPLE 19
Show that Vx(P(x) 1\ Q(x)) and Vx P (x) 1\ Vx Q(x) are logically equivalent (where the same domain is used throughout). This logical equivalence shows that we can distribute a universal quantifier over a conjunction. Furthermore, we can also distribute an existential quantifier over a disjunction. However, we cannot distribute a universal quantifier over a disjunction, nor can we distribute an existential quantifier over a conjunction. (See Exercises 50 and 5 l .) Solution: To show that these statements are logically equivalent, we must show that they always take the same truth value, no matter what the predicates P and Q are, and no matter which domain of discourse is used. Suppose we have particular predicates P and Q, with a common domain. We can show that Vx(P(x) 1\ Q(x)) and Vx P (x) 1\ Vx Q(x) are logically equivalent by doing two things. First, we show that if Vx(P(x) 1\ Q(x)) is true, then Vx P (x) 1\ Vx Q(x) is true. Second, we show that ifVx P (x) 1\ Vx Q(x) is true, then Vx(P(x) 1\ Q(x)) is true. So, suppose that Vx(P(x) 1\ Q(x)) is true. This means that if a is in the domain, then P(a) 1\ Q(a) is true. Hence, P (a) is true and Q(a) is true. Because P (a) is true and Q(a) is true for every element in the domain, we can conclude that Vx P (x) and Vx Q(x) are both true. This means that Vx P(x) 1\ Vx Q(x) is true. Next, suppose that Vx P (x) 1\ Vx Q(x) is true. It follows that Vx P (x) is true and Vx Q(x) is true. Hence, if a is in the domain, then P (a) is true and Q(a) is true [because P (x) and Q(x) are both true for all elements in the domain, there is no conflict using the same value of a here]. It follows that for all a, P (a) 1\ Q(a) is true. It follows that Vx(P(x) 1\ Q(x)) is true. We can now conclude that
Vx(P(x) 1\ Q(x)) == Vx P (x) 1\ Vx Q(x).
Negating Quantified Expressions We will often want to consider the negation of a quantified expression. For instance, consider the negation of the statement "Every student in your class has taken a course in calculus."
40
I / The Foundations: Logic and Proofs
1-40
This statement is a universal quantification, namely, Vx P (x),
ASS8SSlIeo.ii::!
where P(x) is the statement "x has taken a course in calculus" and the domain consists of the students in your class. The negation of this statement is "It is not the case that every student in your class has taken a course in calculus." This is equivalent to "There is a student in your class who has not taken a course in calculus." And this is simply the existential quantification of the negation of the original propositional function, namely, 3x -.P(x). This example illustrates the following logical equivalence: -'VxP(x)
==
3x
-.P(x).
To show that -.Vx P (x) and 3x (x) are logically equivalent no matter what the propositional function P (x) is and what the domain is, first note that -.Vx P(x) is true if and only if "Ix P(x) is false. Next, note that "Ix P(x) is false if and only there is an element x in the domain for which P(x) is false. This holds if and only if there is an element x in the domain for which -.P(x) is true. Finally, note that there is an element x in the domain for which -. P (x) is true if and only if 3x -.P(x) is true. Putting these steps together, we can conclude that -.VxP(x) is true if and only if 3x -.P(x) is true. It follows that -.Vx P(x) and 3x -.P(x) are logically equivalent. Suppose we wish to negate an existential quantification. For instance, consider the proposi tion "There is a student in this class who has taken a course in calculus." This is the existential quantification 3x Q(x), where Q(x) is the statement "x has taken a course in calculus." The negation of this statement is the proposition "It is not the case that there is a student in this class who has taken a course in calculus." This is equivalent to "Every student in this class has not taken calculus," which is just the universal quantification of the negation of the original propositional function, or, phrased in the language of quantifiers, "Ix -.Q(x). This example illustrates the equivalence -'3x Q(x)
==
"Ix
-. Q(x).
To show that -.3x Q(x) == "Ix -.Q(x) are logically equivalent no matter what Q(x) is and what the domain is, first note that -.3x Q(x) is true if and only if 3x Q(x) is false. This is true if and only if no x exists in the domain for which Q(x) is true. Next, note that no x exists in the domain for which Q(x) is true if and only if Q(x) is false for every x in the domain. Finally, note that Q(x) is false for every x in the domain if and only if -.Q(x) is true for all x in the domain, which holds if and only if Vx-.Q(x) is true. Putting these steps together, we see that -.3x Q(x) is true if and only if Vx-.Q(x) is true. We conclude that -.3x Q(x) and "Ix -.Q(x) are logically equivalent. The rules for negations for quantifiers are called De Morgan's laws for quantifiers. These rules are summarized in Table 2.
1 .3 Predicates and Quantifiers
1-41
TABLE 2
41
De Morgan's Laws for Quantifiers.
Negation
Equivalent Statement
When Is Negation True?
When False?
-,3x P (x)
Yx -, P (x )
For every x , P (x ) i s false.
There is an x for which P (x) is true.
-,Yx P (x)
3x -, P (x)
There is an x for which P (x) is false.
P (x) is true for every x .
When the domain of a predicate P(x) consists of n elements, where n is a positive integer, the rules for negating quantified statements are exactly the same as De Morgan's laws discussed in Section 1 .2. This is why these rules are called De Morgan's laws for quantifiers. When the domain has n elements X l , x2 , ' ' ' ' Xn , it follows that -,Yx P (x) is the same as -,(P(XI) /\ P(X2 ) /\ . . . /\ P(xn)), which is equivalent to -,P(xJ) V -,P(X 2 ) V . . . V -,P(xn) by De Morgan's laws, and this is the same as 3x -,P(x). Similarly, -,3x P(x) is the same as -,(P (X I ) v P (X2 ) v . . . v P(xn)), which by De Morgan's laws is equivalent to -,P(xJ) /\ -,P(X2 ) /\ . . . /\ -,P(xn), and this is the same as Yx-,P(x).
Remark:
We illustrate the negation of quantified statements in Examples 20 and 2 1 . EXAMPLE 20
What are the negations of the statements "There is an honest politician" and "All Americans eat cheeseburgers"? Solution: Let H (x) denote "x is honest." Then the statement "There is an honest politician"
is represented by 3xH (x), where the domain consists of all politicians. The negation of this statement is -,3xH(x), which is equivalent to Yx-,H(x). This negation can be expressed as "Every politician is dishonest." (Note: In English, the statement "All politicians are not honest" is ambiguous. In common usage, this statement often means "Not all politicians are honest." Consequently, we do not use this statement to express this negation.) Let C (x) denote "x eats cheeseburgers." Then the statement "All Americans eat cheese burgers" is represented by YxC (x), where the domain consists of all Americans. The negation of this statement is -,YxC(x), which is equivalent to 3x-,C(x). This negation can be expressed in several different ways, including "Some American does not eat cheeseburgers" and "There .... is an American who does not eat cheeseburgers." EXAMPLE 21
What are the negations of the statements Yx(x 2
>
x) and 3x(x 2
= 2)?
Solution: The negation of Yx(x 2
> x) is the statement -,Yx(x 2 > x), which is equivalent to 3x-,(x 2 > x). This can be rewritten as 3x(x 2 :s: x). The negation of 3x(x 2 = 2) is the statement -,3x(x 2 = 2), which is equivalent to Yx-,(x 2 = 2). This can be rewritten as Yx(x 2 =f. 2). The .... truth values of these statements depend on the domain.
We use De Morgan's laws for quantifiers in Example 22. EXAMPLE 22
Show that -,Yx(P(x) --+ Q(x)) and 3x(P(x) /\ -,Q(x)) are logically equivalent. Solution: By De Morgan's law for universal quantifiers, we know that -,Yx(P(x) --+ Q(x))
and 3x(-,(P(x) --+ Q(x))) are logically equivalent. By the fifth logical equivalence in Table 7 in Section 1 .2, we know that -,(P(x) --+ Q(x)) and P(x) /\ -,Q(x) are logically equivalent for every x . Because we can substitute one logically equivalent expression for another in a logical equivalence, it follows that -'Yx(P(x) --+ Q(x)) and 3x(P(x) /\ -,Q(x)) are logically .... equivalent.
42
1 / The Foundations: Logic and Proofs
1 -42
Translating from English into Logical Expressions Translating sentences in English (or other natural languages) into logical expressions is a crucial task in mathematics, logic programming, artificial intelligence, software engineering, and many other disciplines. We began studying this topic in Section 1 . 1 , where we used propositions to express sentences in logical expressions. In that discussion, we purposely avoided sentences whose translations required predicates and quantifiers. Translating from English to logical ex pressions becomes even more complex when quantifiers are needed. Furthermore, there can be many ways to translate a particular sentence. (As a consequence, there is no "cookbook" approach that can be followed step by step.) We will use some examples to illustrate how to translate sentences from English into logical expressions. The goal in this translation is to pro duce simple and useful logical expressions. In this section, we restrict ourselves to sentences that can be translated into logical expressions using a single quantifier; in the next section, we will look at more complicated sentences that require multiple quantifiers. EXAMPLE 23
Express the statement "Every student in this class has studied calculus" using predicates and quantifiers. Solution: First, we rewrite the statement so that we can clearly identify the appropriate quantifiers
to use. Doing so, we obtain:
"For every student in this class, that student has studied calculus." Next, we introduce a variable x so that our statement becomes "For every student x in this class, x has studied calculus." Continuing, we introduce C (x), which is the statement "x has studied calculus." Consequently, if the domain for x consists of the students in the class, we can translate our statement as 'v'x C (x ) . However, there are other correct approaches; different domains of discourse and other predicates can be used. The approach we select depends on the subsequent reasoning we want to carry out. For example, we may be interested in a wider group of people than only those in this class. Ifwe change the domain to consist of all people, we will need to express our statement as "For every person x, if person x is a student in this class then x has studied calculus." If S(x) represents the statement that person x is in this class, we see that our statement can be expressed as 'v'x(S(x) -+ C(x)). [Caution! Our statement cannot be expressed as 'v'x(S(x) 1\ C (x)) because this statement says that all people are students in this class and have studied calculus!] Finally, when we are interested in the background of people in subjects besides calculus, we may prefer to use the two-variable quantifier Q(x , y) for the statement "student x has studied subject y ." Then we would replace C (x) by Q(x , calculus) in both approaches to obtain .... 'v'x Q(x , calculus) or 'v'x(S(x) -+ Q(x , calculus)). In Example 23 we displayed different approaches for expressing the same statement using predicates and quantifiers. However, we should always adopt the simplest approach that is adequate for use in subsequent reasoning. EXAMPLE 24
Express the statements "Some student in this class has visited Mexico" and "Every student in this class has visited either Canada or Mexico" using predicates and quantifiers.
1 .3 Predicates and Quantifiers 43
1 -43
Solution: The statement "Some student in this class has visited Mexico" means that
"There is a student in this class with the property that the student has visited Mexico." We can introduce a variable x, so that our statement becomes "There is a student x in this class having the property that x has visited Mexico." We introduce M(x), which is the statement "x has visited Mexico." If the domain for x consists of the students in this class, we can translate this first statement as 3xM(x). However, if we are interested in people other than those in this class, we look at the statement a little differently. Our statement can be expressed as •
"There is a person x having the properties that x is a student in this class and x has visited Mexico." In this case, the domain for the variable x consists of all people. We introduce S (x) to represent "x is a student in this class." Our solution becomes 3x(S(x) /\ M(x)) because the statement is that there is a person x who is a student in this class and who has visited Mexico. [Caution! Our statement cannot be expressed as 3x(S(x) -+ M(x)), which is true when there is someone not in the class because, in that case, for such a person x , S(x) -+ M(x) becomes either F -+ T or F -+ F, both of which are true.]
Similarly, the second statement can be expressed as
"For every x in this class, x has the property that x has visited Mexico or x has visited Canada." (Note that we are assuming the inclusive, rather than the exclusive, or here.) We let C (x) be "x has visited Canada." Following our earlier reasoning, we see that if the domain for x consists of the students in this class, this second statement can be expressed as 'v'x(C(x) v M(x)). However, if the domain for x consists of all people, our statement can be expressed as "For every person x , if x is a student in this class, then x has visited Mexico or x has visited Canada." In this case, the statement can be expressed as 'v'x(S(x) -+ (C(x) v M(x))). Instead of using M (x) and C (x) to represent that x has visited Mexico and x has visited Canada, respectively, we could use a two-place predicate V (x . y) to represent "x has visited country y." In this case, V (x . Mexico) and V (x . Canada) would have the same meaning as M (x) and C (x) and could replace them in our answers. If we are working with many statements that involve people visiting different countries, we might prefer to use this two-variable approach. Otherwise, for simplicity, we would stick with the one-variable predicates M(x) and C(x). ....
Using Quantifiers in System Specifications In Section 1 . 1 we used propositions to represent system specifications. However, many system specifications involve predicates and quantifications. This is illustrated in Example 25 . EXAMPLE 25
Use predicates and quantifiers to express the system specifications "Every mail message larger than one megabyte will be compressed" and "If a user is active, at least one network link will be available."
Exam=�
Solution: Let S(m , y) be "Mail message m is larger than y megabytes," where the variable x has the domain of all mail messages and the variable y is a positive real number, and let C (m) denote
44
1 / The Foundations: Logic and Proofs
1 -44
"Mail message m will be compressed." Then the specification "Every mail message larger than one megabyte will be compressed" can be represented as Vm(S(m , 1) --+ C(m)) . Let A(u) represent "User u is active," where the variable u has the domain of all users, let S(n , x) denote "Network link n is in state x ," where n has the domain of all network links and x has the domain of all possible states for a network link. Then the specification "If a user is active, at least one network link will be available" can be represented by 3u A(u) --+ .... 3n S(n , available).
Examples from Lewis Carroll Lewis Carroll (really C. L. Dodgson writing under a pseudonym), the author ofAlice in Wonder land, is also the author of several works on symbolic logic. His books contain many examples of reasoning using quantifiers. Examples 26 and 27 come from his book Symbolic Logic; other examples from that book are given in the exercises at the end of this section. These examples illustrate how quantifiers are used to express various types of statements. EXAMPL E 26
Consider these statements. The first two are called premises and the third is called the conclusion. The entire set is called an argument. "All lions are fierce." "Some lions do not drink coffee." "Some fierce creatures do not drink coffee." (In Section 1 .5 we will discuss the issue of determining whether the conclusion is a valid conse quence of the premises. In this example, it is.) Let P (x), Q (x), and R(x) be the statements "x is a lion," "x is fierce," and "x drinks coffee," respectively. Assuming that the domain consists of all creatures, express the statements in the argument using quantifiers and P (x), Q(x), and R(x).
Solution: We can express these statements as: Vx(P(x) --+ Q (x)). 3x(P(x) /\ --.R(x)). 3x(Q(x ) /\ --.R(x )). Notice that the second statement cannot be written as 3x (P(x) --+ --.R(x)). The reason is that P (x) --+ --.R(x) is true whenever x is not a lion, so that 3x(P(x) --+ --.R(x)) is true as long as there is at least one creature that is not a lion, even if every lion drinks coffee. Similarly, the third statement cannot be written as
unkS�
3x(Q(x) --+ --.R(x )).
CHARLES LUTWIDGE DODGSON ( 1 832- 1 898) We know Charles Dodgson as Lewis Carroll-the pseudonym he used in his writings on logic. Dodgson, the son of a clergyman, was the third of 1 1 chil dren, all of whom stuttered. He was uncomfortable in the company of adults and is said to have spoken without stuttering only to young girls, many of whom he entertained, corresponded with, and photographed (sometimes in poses that today would be considered inappropriate). Although attracted to young girls, he was extremely puritanical and religious. His friendship with the three young daughters of Dean Liddell led to his writing Alice in Wonderland, which brought him money and fame. Dodgson graduated from Oxford in 1 854 and obtained his master of arts degree in 1 857. He was appointed lecturer in mathematics at Christ Church College, Oxford, in 1 855. He was ordained in the Church of England in 1 86 1 but never practiced his ministry. His writings include articles and books on geometry, determinants, and the mathematics of tournaments and elections. (He also used the pseudonym Lewis Carroll for his many works on recreational logic.)
1 .3 Predicates and Quantifiers 45
1 -45
EXAMPLE 27
Consider these statements, of which the first three are premises and the fourth is a valid conclusion.
•
"All hummingbirds are richly colored." "No large birds live on honey." "Birds that do not live on honey are dull in color." "Hummingbirds are small."
Let P (x), Q (x), R(x), and S(x) be the statements "x is a hummingbird," "x is large," "x lives on honey," and is richly colored," respectively. Assuming that the domain consists of all birds, express the statements in the argument using quantifiers and P (x), Q (x), R(x), and Sex).
"x
Solution: We can express the statements in the argument as Yx(P(x) ---+ Sex»�. -.3x(Q(x) 1\ R(x» . Yx ( -.R(x) ---+ -,S(x » . Yx(P(x) ---+ -. Q (x» . (Note we have assumed that "small" is the same as "not large" and that "dull in color" is the same as "not richly colored." To show that the fourth statement is a valid conclusion of the first .... three, we need to use rules of inference that will be discussed in Section 1 .5 .)
Logic Pro gramming
linkS � EXAMPLE 28
An important type of programming language is designed to reason using the rules of predicate logic. Prolog (from Programming in Logic), developed in the 1 970s by computer scientists working in the area of artificial intelligence, is an example of such a language. Prolog programs include a set of declarations consisting of two types of statements, Prolog facts and Prolog rules. Prolog facts define predicates by specifying the elements that satisfy these predicates. Prolog rules are used to define new predicates using those already defined by Prolog facts. Example 28 illustrates these notions. Consider a Prolog program given facts telling it the instructor of each class and in which classes students are enrolled. The program uses these facts to answer queries concerning the professors who teach particular students. Such a program could use the predicates instructor(p, c) and enrolled(s , c) to represent that professor p is the instructor of course c and that student s is enrolled in course c, respectively. For example, the Prolog facts in such a program might include: ins t ruc t o r ( chan , math2 7 3 ) ins t ruc t o r ( pa t e l , e e2 2 2 ) i n s t ruc t o r ( g r o s sman , c s 3 0 1 ) enro l l ed ( kevi n , ma th2 7 3 ) enro l l ed ( j uana , ee 2 2 2 ) enro l l ed ( j uana , c s 3 0 1 ) enro l l ed ( ki ko , math2 7 3 ) enro l l ed ( k i ko , c s 3 0 1 )
(Lowercase letters have been used for entries because Prolog considers names beginning with an uppercase letter to be variables.)
46
I / The Foundations: Logic and Proofs
1 -46
A new predicate teaches(p, s), representing that professor p teaches student s, can be defined using the Prolog rule t eaches ( P , S )
:-
ins t ru c t o r ( P , C ) ,
enro l l ed ( S , C )
which means that teaches(p, s) is true if there exists a class c such that professor p is the instructor of class c and student s is enrolled in class c . (Note that a comma is used to represent a conjunction of predicates in Prolog. Similarly, a semicolon is used to represent a disjunction of predicates.) Prolog answers queries using the facts and rules it is given. For example, using the facts and rules listed, the query ? enro l l ed ( kevi n , ma th2 7 3 )
produces the response yes
because the fact enrolled (kevin, math273) was provided as input. The query ? enro l l ed ( X , math2 7 3 )
produces the response kev i n k i ko
To produce this response, Prolog determines all possible values of X for which enrolled(X, math273) has been included as a Prolog fact. Similarly, to find all the professors who are instructors in classes being taken by Juana, we use the query ? t eaches ( X , j uana )
This query returns pat e l g r o s sman
Exercises :s 4." What are the truth values? a) P (O) b) P (4) c) P (6) 2. Let P (x) be the statement "the word x contains the letter a ." What are the truth values? a) P (orange) b) P (lemon) c) P (true) d) P (false) 3. Let Q(x , y) denote the statement "x is the capital of y ." What are these truth values? a) Q(Denver, Colorado) b) Q(Detroit, Michigan)
1. Let P (x) denote the statement "x
c) Q(Massachusetts, Boston) d) Q(New York, New York) 4. State the value of x after the statement if P (x) then x := 1 is executed, where P (x) is the statement "x > I ," if the
value of x when this statement is reached is a) x = O. b) x = 1 . c) x = 2.
5. Let P (x) be the statement "x spends more than five hours
every weekday in class," where the domain for x consists of all students. Express each of these quantifications in English.
1 .3 Predicates and Quantifiers 47
1-47
b) Vx P(x) a) 3x P(x ) d) "Ix ..... P (x) e ) 3 x ..... P (x) 6. Let N(x) be the statement "x has visited North Dakota,"
where the domain consists of the students in your school. Express each of these quantifications in English.
a) 3x N(x) d) 3x ..... N(x)
e) ..... 3x N (x) t) Vx ..... N (x)
b) Vx N(x) e) .....Vx N (x)
7. Translate these statements into English, where C (x ) is "x is a comedian" and F (x) is "x is funny" and the domain
8.
9.
10.
11.
consists of all people. b) Vx(C (x ) /\ F (x» a) Vx(C (x) -+ F(x» d) 3x(C (x) /\ F (x» e) 3x(C (x) -+ F (x» Translate these statements into English, where R(x) is "x is a rabbit" and H (x) is "x hops" and the domain consists of all animals. b) Vx(R(x ) /\ H (x» a) Vx(R(x) -+ H (x» d) 3x(R(x) /\ H (x» e) 3x(R(x) -+ H(x» Let P (x) be the statement "x can speak Russian" and let Q(x) be the statement "x knows the computer language C++." Express each of these sentences in terms of P (x), Q(x), quantifiers, and logical connectives. The domain for quantifiers consists of all students at your school. a) There is a student at your school who can speak Rus sian and who knows C++. b) There is a student at your school who can speak Rus sian but who doesn't know C++. e) Every student at your school either can speak Russian or knows C++. d) No student at your school can speak Russian or knows C++. Let C (x) be the statement "x has a cat," let D (x) be the statement "x has a dog," and let F(x) be the statement "x has a ferret." Express each of these statements in terms of C (x), D (x), F (x), quantifiers, and logical connectives. Let the domain consist of all students in your class. a) A student in your class has a cat, a dog, and a ferret. b) All students in your class have a cat, a dog, or a ferret. e) Some student in your class has a cat and a ferret, but not a dog. d) No student in your class has a cat, a dog, and a ferret. e) For each of the three animals, cats, dogs, and ferrets, there is a student in your class who has one of these animals as a pet. Let P (x) be the statement "x = x 2 ." If the domain con sists of the integers, what are the truth values?
b) P(l) a) P (O) e) 3x P (x) d) P(- l ) 12. Let Q(x) be the statement "x + 1
>
e) P (2 ) t) Vx P (x) 2x ." If the domain
consists of all integers, what are these truth values?
a) Q(O) d) 3x Q(x) g) Vx ..... Q(x)
b) Q(- I ) e) Vx Q (x)
e) Q ( l ) t) 3x ..... Q(x)
13. Determine the truth value of each of these statements if
the domain consists of all integers.
b) 3n (2n = 3n) d) Vn(n2 :::: n )
a) Vn(n + 1 > n ) e ) 3n(n = -n)
1 4 . Determine the truth value of each of these statements if
the domain consists of all real numbers. b) 3X(x4 < x2) a) 3x (x3 = - 1 )
e) VX« _X)2
= x2)
d) Vx(2x
>
x)
1 5. Determine the truth value of each of these statements if
the domain for all variables consists of all integers. a) Vn (n2 :::: 0) b) 3n(n2 = 2) e) Vn (n2 :::: n ) d ) 3n(n2 < 0) 1 6. Determine the truth value of each of these statements if the domain of each variable consists of all real numbers. b) 3x (x2 = - 1 ) a) 3x(x2 = 2) e) Vx(x2 + 2 :::: 1 ) d) Vx (x2 =1= x ) 17. Suppose that the domain o f the propositional function P (x) consists of the integers 0, 1 , 2, 3, and 4. Write out each of these propositions using disjunctions, conjunc tions, and negations.
a) 3x P (x) d) Vx ..... P (x )
b ) Vx P (x ) e ) ..... 3x P (x )
e) 3x ..... P (x) t) .....Vx P(x )
1 8 . Suppose that the domain of the propositional function P (x) consists of the integers -2, - 1 , 0, 1 , and 2. Write
out each of these propositions using disjunctions, con junctions, and negations.
a) 3x P (x ) d) Vx ..... P (x )
b) Vx P (x) e ) ..... 3x P (x )
e ) 3x ..... P (x) t) .....Vx P (x)
19. Suppose that the domain of the propositional function P (x ) consists of the integers 1 , 2, 3 , 4, and 5 . Express
these statements without using quantifiers, instead using only negations, disjunctions, and conjunctions.
b) Vx P (x) a) 3x P (x) d) .....Vx P (x) e) ..... 3x P (x) e) Vx« x =1= 3) -+ P (x» v 3x ..... P (x)
20. Suppose that the domain of the propositional function P (x ) consists of -5, 3 - 1 , 1 , 3 , and 5. Express these statements without using quantifiers, instead using only negations, disjunctions, and conjunctions. -
a) e) d) e)
,
b ) Vx P (x) 3x P (x ) Vx « x =1= 1 ) -+ P (x» 3x « x :::: 0) /\ P (x» 3x( ..... P (x» /\ Vx« x < 0) -+ P (x»
2 1 . For each of these statements find a domain for which the
statement is true and a domain for which the statement is false. a) Everyone is studying discrete mathematics. b) Everyone is older than 2 1 years. e) Every two people have the same mother. d) No two different people have the same grandmother. 22. For each of these statements find a domain for which the statement is true and a domain for which the statement is false. a) Everyone speaks Hindi. b) There is someone older than 2 1 years.
48
1 / The Foundations: Logic and Proofs
e) Every two people have the same first name. d) Someone knows more than two other people. 23. Translate in two ways each of these statements into logi
cal expressions using predicates, quantifiers, and logical connectives. First, let the domain consist of the students in your class and second, let it consist of all people. a) Someone in your class can speak Hindi. b) Everyone in your class is friendly. e) There is a person in your class who was not born in California. d) A student in your class has been in a movie. e) No student in your class has taken a course in logic programming. 24. Translate in two ways each of these statements into logi cal expressions using predicates, quantifiers, and logical connectives. First, let the domain consist of the students in your class and second, let it consist of all people. a) Everyone in your class has a cellular phone. b) Somebody in your class has seen a foreign movie. e) There is a person in your class who cannot swim. d) All students in your class can solve quadratic equations. e) Some student in your class does not want to be rich. 25. Translate each of these statements into logical expressions using predicates, quantifiers, and logical connectives. a) No one is perfect. b) Not everyone is perfect. e) All your friends are perfect. d) At least one of your friends is perfect. e) Everyone is your friend and is perfect. f) Not everybody is your friend or someone is not perfect. 26. Translate each ofthese statements into logical expressions in three different ways by varying the domain and by using predicates with one and with two variables. a) Someone in your school has visited Uzbekistan. b) Everyone in your class has studied calculus and C++. e) No one in your school owns both a bicycle and a motorcycle. d) There is a person in your school who is not happy. e) Everyone in your school was born in the twentieth century. 27. Translate each ofthese statements into logical expressions in three different ways by varying the domain and by using predicates with one and with two variables. a) A student in your school has lived in Vietnam. b) There is a student in your school who cannot speak Hindi. e) A student in your school knows Java, Prolog, and C++. d) Everyone in your class enjoys Thai food. e) Someone in your class does not play hockey. 28. Translate each ofthese statements into logical expressions using predicates, quantifiers, and logical connectives. a) Something is not in the correct place.
1 -48
b) All tools are in the correct place and are in excellent
condition.
e) Everything is in the correct place and in excellent
condition.
d) Nothing is in the correct place and is in excellent
condition.
e) One of your tools is not in the correct place, but it is
in excellent condition.
29. Express each of these statements using logical operators,
predicates, and quantifiers. a) Some propositions are tautologies. b) The negation of a contradiction is a tautology. e) The disjunction of two contingencies can be a tautology. d) The conjunction of two tautologies is a tautology. 30. Suppose the domain of the propositional function P (x , y) consists of pairs x and y, where x is I , 2, or 3 and y is 1 , 2, or 3. Write out these propositions using disjunctions and conjunctions. a) 3x P(x , 3) b) 'v'y P( I , y) e)
3y-'P(2, y)
d)
'v'x -'P(x , 2)
31. Suppose that the domain of Q(x , y, z) consists of triples x , y, z , where x = 0, I , or 2, y = 0 or I , and z = 0 or 1 .
Write out these propositions using disjunctions and con junctions. a) 'v'y Q(O, y, 0) b) 3x Q(x , I , I ) e) 3z-'Q(O, 0, z ) d) 3x-'Q(x , O, I ) 32. Express each o f these statements using quantifiers. Then form the negation of the statement so that no negation is to the left of a quantifier. Next, express the negation in simple English. (Do not simply use the words "It is not the case that.") a) All dogs have fleas. b) There is a horse that can add. e) Every koala can climb. d) No monkey can speak French. e) There exists a pig that can swim and catch fish. 33. Express each of these statements using quantifiers. Then form the negation of the statement, so that no negation is to the left of a quantifier. Next, express the negation in simple English. (Do not simply use the words "It is not the case that.") a) Some old dogs can learn new tricks. b) No rabbit knows calculus. e) Every bird can fly. d) There is no dog that can talk. e) There is no one in this class who knows French and Russian. 34. Express the negation of these propositions using quanti fiers, and then express the negation in English. a) Some drivers do not obey the speed limit. b) All Swedish movies are serious. e) No one can keep a secret. d) There is someone in this class who does not have a good attitude.
1 .3 Predicates and Quantifiers 49
/-49
d) Video on demand can be delivered when there are at
35. Find a counterexample, if possible, to these universally
quantified statements, where the domain for all variables consists of all integers. a) b) c)
2 ::: x) \lx(x \lx(x \lx(x 01) x >
v
=
<
0)
36. Find a counterexample, if possible, to these universally
quantified statements, where the domain for all variables consists of all real numbers.
a) c)
2 \lx(x \lx( lx l x)0) ¥
b)
>
\lx(x 2
¥ 2)
37. Express each of these statements using predicates and
quantifiers. a) A passenger on an airline qualifies as an elite flyer if the passenger flies more than 25,000 miles in a year or takes more than 25 flights during that year. b) A man qualifies for the marathon if his best previous time is less than 3 hours and a woman qualifies for the marathon if her best previous time is less than 3.5 hours. c) A student must take at least 60 course hours, or at least 45 course hours and write a master's thesis, and receive a grade no lower than a B in all required courses, to receive a master's degree. d) There is a student who has taken more than 2 1 credit hours in a semester and received all A's. Exercises 38-42 deal with the translation between system specification and logical expressions involving quantifiers. 38. Translate these system specifications into English where the predicate y) is is in state y" and where the do main for and y consists of all systems and all possible states, respectively. a) open) b) malfunctioning) v diagnostic» c) open) v diagnostic) d) available) e) working) 39. Translate these specifications into English where is "Printer is out of service," is "Printer is busy," is "Print job is lost," and is "Print job is queued."
x S(x, "x \l3xS(x, x (S(x, 3xS(x, Sex, 3xS(x, \l3x..., x ...,SS(x,(x,
L(j) p 3p(F(p) \lpB(p) 3j(Q(j) (\lpB(p)
a) b) c) d)
B (p)Q(j)
1\
-+
1\
1\
j B(p» 3jL(j) 3jQ(j) L(j» 3pF(p) \ljQ(j» 3jL(j)
p F(p)j
least 8 megabytes of memory available and the con nection speed is at least 56 kilobits per second. 41. Express each of these system specifications using predi cates, quantifiers, and logical connectives. a) At least one mail message, among the nonempty set of messages, can be saved if there is a disk with more than 1 0 kilobytes of free space. b) Whenever there is an active alert, all queued messages are transmitted. c) The diagnostic monitor tracks the status of all systems except the main console. d) Each participant on the conference call whom the host of the call did not put on a special list was billed. 42. Express each of these system specifications using predi cates, quantifiers, and logical connectives. a) Every user has access to an electronic mailbox. b) The system mailbox can be accessed by everyone in the group if the file system is locked. c) The firewall is in a diagnostic state only if the proxy server is in a diagnostic state. d) At least one router is functioning normally if the throughput is between 1 00 kbps and 500 kbps and the proxy server is not in diagnostic mode. 43. Determine whether -+ and -+ are logically equivalent. JustifY your answer. 44. Determine whether � and � are logically equivalent. JustifY your answer. 45. Show that v and v are logically equivalent. Exercises 46-49 establish rules for null quantification that we can use when a quantified variable does not appear in part of a statement.
-+
occur as a free variable in nonempty. v
a) b) 47.
v
==
v
==
v
nonempty.
48.
40. Express each of these system specifications using predi
cates, quantifiers, and logical connectives. a) When there is less than 30 megabytes free on the hard disk, a warning message is sent to all users. b) No directories in the file system can be opened and no files can be closed when system errors have been detected. c) The file system cannot be backed up if there is a user currently logged on.
does not A. Assume that thexdomain is (\lxP(x» A \lx(P(x) A) (3xP(x » A 3x(P(x) A) Establish these logical equivalences, where x does not occur as a free variable in A. Assume that the domain is
46. Establish these logical equivalences, where
-+
-+
\lx(P(x) Q(x» \lxP(x) \lx(P(x) Q(x» \Ix P(x) 3x(P(x) Q(x» 3xP(x) 3xQ(x)
\Ix Q(x) \lxQ(x)
(\IxP(x» A \lx(P(x) A) (3xP(x» A 3x(P(x) A) Establish these logical equivalences, where x does not occur as a free variable in A. Assume that the domain is nonempty. \lx(A P(x» A \lxP(x) 3x(A P(x» A 3xP(x) Establish these logical equivalences, where x does not occur as a free variable in Assume that the domain is a)
1\
==
1\
b)
1\
==
1\
a) b) 49.
==
-+
==
-+
A.
nonempty. a) b)
-+
-+
\lx(P(x) A) 3xP(x) A 3x(P(x) A) \lxP(x) A -+
-+
==
==
-+
-+
50
1 I The Foundations: Logic and Proofs
1 -50
"Ix P(x) VxQ(x) Vx(P(x) Q(x)) 3xP(x) 3xQ(x) 3x(P(x) Q(x» 3!x P(x) x P(x)
50. Show that V not logically equivalent.
and
v
are
51. Show that 1\ not logically equivalent.
and
1\
are
52. As mentioned in the text, the notation "There exists a unique
such that
denotes
is true."
If the domain consists of all integers, what are the truth values of these statements? b) = 1) a) > 1) c) +3 = = + 1) 53. What are the truth values o f these statements?
2 3!3!xx (x(x 2x) d) 3!3!xx (x(x x a) 3!xP(x) 3xP(x) VxP(x) 3!x-'P(x) 3!xP(x) -,VxP(x) 3!x P(x), 3, -+
b) -+ c) -+ 54. Write out where the domain consists ofthe inte gers 1 , 2, and in terms of negations, conjunctions, and disjunctions. 55. Given the Prolog facts in Example 28, what would Prolog return given these queries?
a)
? i n s t ru c t o r ( chan , math2 7 3 )
b) ? i ns t ruc t o r ( pa t e l , c s 3 0 1 ) c) ? enr o l l ed ( X , c s 3 0 1 )
d) e)
? enr o l l ed ( k i ko , Y ) ? t e aches ( g r o s sman , Y )
56. Given the Prolog facts in Example 28, what would Prolog return when given these queries? a) ? enr o l l ed ( kevin , e e 2 2 2 ) b) ? enr o l l ed ( k i ko , math2 7 3 ) c) ? i ns t ruc t o r ( g r o s sman , X ) d) ? i ns t ruc t o r ( X , c s 3 0 1 )
e)
? t e aches ( X , kevi n )
57. Suppose that Prolog facts are used to define the predicates mother(M, Y) andfather(F, X), which represent that M is the mother of Y and F is the father of X, respectively. Give a Prolog rule to define the predicate sibling(X, y), which represents that X and Y are siblings (that is, have the same mother and the same father). 58. Suppose that Prolog facts are used to define the pred icates mother(M, Y) and father(F, X), which repre sent that M is the mother of Y and F is the father of X, respectively. Give a Prolog rule to define the predicate grandfather(X, y), which represents that X is the grand father of Y . [Hint: You can write a disjunction in Prolog
either by using a semicolon to separate predicates or by putting these predicates on separate lines.]
59-62 P(x), "x "x Q(x), R(x)"x P(x), Q(x), R(x),
Exercises
are based on questions found in the book
Symbolic Logic by Lewis Carroll.
59. Let and be the statements is a profes sor," is ignorant," and is vain," respectively. Express each of these statements using quantifiers; logical con nectives; and and where the domain consists of all people.
a) No professors are ignorant. b) All ignorant people are vain. c) No professors are vain. Does (c) follow from (a) and (b)? 60. Let and be the statements is a clear explanation," is satisfactory," and is an excuse," respectively. Suppose that the domain for consists of all English text. Express each ofthese statements using quan tifiers, logical connectives, and and a) All clear explanations are satisfactory. b) Some excuses are unsatisfactory. c) Some excuses are not clear explanations. Does (c) follow from (a) and (b)?
d)
P(x), Q(x),"x R(x)
"xx "x P(x), Q(x), R(x).
*d)
P(x),"xQ(x), R(x), "x S(x) "x R(x), S(x).
"x
61. Let and be the statements is a baby," is logical," is able to manage a crocodile," and is despised," respectively. Suppose that the domain consists of all people. Express each of these statements using quantifiers; logical connectives; and and a) Babies are illogical. b) Nobody is despised who can manage a crocodile. c) Illogical persons are despised. Babies cannot manage crocodiles. >te) Does (d) follow from (a), (b), and (c)? If not, is there a correct conclusion?
P(x), Q(x),
d)
P(x), Q(x), R(x), S(x) "x "x "x P(x), Q(x), R(x), S(x).
"x
62. Let and be the statements is a duck," is one of my poultry," is an officer," and is willing to waltz," respectively. Express each of these statements using quantifiers; logical connectives; and and a) No ducks are willing to waltz. b) No officers ever decline to waltz. c) All my poultry are ducks. My poultry are not officers. Does (d) follow from (a), (b), and (c)? If not, is there a correct conclusion?
*e)d)
1 .4 Nested Quantifiers Introd udion In Section 1 .3 we defined the existential and universal quantifiers and showed how they can be used to represent mathematical statements. We also explained how they can be used to translate
1 .4 Nested Quantifiers
1-51
51
English sentences into logical expressions. In this section we will study nested quantifiers. Two quantifiers are nested if one is within the scope of the other, such as Vx3y(x + y = 0). Note that everything within the scope of a quantifier can be thought of as a propositional function. For example, Vx3y(x + y
= 0)
=
is the same thing as Vx Q (x), where Q(x) is 3yP (x , y), where P (x , y) is x + y O . Nested quantifiers commonly occur in mathematics and computer science. Although nested quantifiers can sometimes be difficult to understand, the rules we have already studied in Section 1 .3 can help us use them. To understand these statements involving many quantifiers, we need to unravel what the quantifiers and predicates that appear mean. This is illustrated in Examples I and 2. EXAMPLE 1
Assume that the domain for the variables x and y consists of all real numbers. The statement VxVy(x + y = y + x) says that x + y = y + x for all real numbers x and y. This is the commutative law for addition of real numbers. Likewise, the statement Vx 3y(x + y = 0) says that for every real number x there is a real number y such that x + y = O . This states that every real number has an additive inverse. Similarly, the statement VxVyVz(x + (y + z) = (x + y) + z) is the associative law for addition of real numbers.
EXAMPLE 2
Translate into English the statement VxVy« x
>
0) /\ (y
<
0) -+ (xy
<
0» ,
where the domain for both variables consists of all real numbers.
Solution: This statement says that for every real number x and for every real number y, if x > 0 and y < 0, then xy < O . That is, this statement says that for real numbers x and y, if x is positive and y is negative, then xy is negative. This can be stated more succinctly as "The product of a
52
I / The
Foundations: Logic and Proofs
1 -52
every x we hit such a y, then 'v'x 3y P (x , y) is true; if for some x we never hit such a y, then 'v'x 3y P (x , y) is false. To see whether 3x 'v'y P (x , y) is true, we loop through the values for x until we find an x for which P (x , y) is always true when we loop through all values for y . Once we find such an x , we know that 3x'v'y P (x , y) is true. Ifwe never hit such an x , then we know that 3x'v'y P (x , y) is false. Finally, to see whether 3x 3y P (x , y) is true, we loop through the values for x, where for each x we loop through the values for y until we hit an x for which we hit a y for which P (x , y) is true. The statement 3x 3y P (x , y) is false only if we never hit an x for which we hit a y such that P (x , y) is true.
The Order of Quantifiers Many mathematical statements involve multiple quantifications of propositional functions in volving more than one variable. It is important to note that the order of the quantifiers is important, unless all the quantifiers are universal quantifiers or all are existential quantifiers. These remarks are illustrated by Examples
EXAMPLE 3
3-5 .
Let P (x , y) be the statement "x + y = Y + x ." What are the truth values of the quantifications 'v'x'v'y P (x , y) and 'v'y'v'x P (x , y) where the domain for all variables consists of all real numbers?
Solution: The quantification
Extra � Examples �
'v'x'v'y P (x , y) denotes the proposition "For all real numbers x , for all real numbers y, x
+ y = y + x ."
Because P (x , y) is true for all real numbers x and y (it is the commutative law for addition, which is an axiom for the real numbers-see Appendix 1 ), the proposition 'v'x'v'y P (x , y) is true. Note that the statement 'v'y'v'x P (x , y) says "For all real numbers y, for all real numbers x , x + y = y + x ." This has the same meaning a s the statement a s "For all real numbers x , for all real numbers y, x + y = y + x ." That is, 'v'x'v'y P (x , y) and'v'y'v'x P (x , y) have the same meaning,
and both are true. This illustrates the principle that the order of nested universal quantifiers in a statement without other quantifiers can be changed without changing the meaning of the .... quantified statement.
EXAMPLE 4
Let Q (x , y) denote "x + y = 0 ." What are the truth values of the quantifications 3y'v'x Q (x , y) and 'v'x 3y Q (x , y), where the domain for all variables consists of all real numbers?
Solution: The quantification 3y'v'x Q (x , y) denotes the proposition "There is a real number y such that for every real number x, Q (x , y)." No matter what value of y is chosen, there is only one value of x for which x + y = O. Because there is no real number y such that x + y = 0 for all real numbers x , the statement 3y'v'x Q(x , y) is false.
1 .4 Nested Quantifiers S3
/ -53
TABLE 1 Quantifications of Two Variables. Statement
When True?
When False?
VxVyP(x , y) VyVxP(x , y)
P(x , y) is true for every pair x, y.
There is a pair x , y for which P(x , y) is false.
Vx 3yP(x, y)
For every x there is a y for which P(x, y) is true.
There is an x such that P(x, y) is false for every y.
3xVyP(x, y)
There is an x for which P(x, y) is true for every y.
For every x there is a y for which P(x, y) is false.
3x 3yP(x , y) 3y3x P(x, y)
There is a pair x , y for which P(x , y) is true.
P(x, y) is false for every pair x , y.
The quantification Vx3y Q (x , y) denotes the proposition "For every real number x there is a real number y such that Q (x , y)." Given a real number x , there is a real number y such that x the statement Yx 3y Q (x , y) is true.
+ y = 0;
namely, y = -x . Hence, ....
Example 4 illustrates that the order in which quantifiers appear makes a difference. The state ments 3yVx P (x , y) and Vx 3y P (x , y) are not logically equivalent. The statement 3yVx P (x , y) is true if and only ifthere is a y that makes P (x , y) true for every x . So, for this statement to be true, there must be a particular value ofy for which P (x , y) is true regardless ofthe choice ofx . On the other hand, Vx 3y P (x , y) is true if and only if for every value ofx there is a value ofy for which P (x , y) is true. So, for this statement to be true, no matter which x you choose, there must be a value of y (possibly depending on the x you choose) for which P (x , y) is true. In other words, in the second case, y can depend on x , whereas in the first case, y is a constant independent of x . From these observations, it follows that if 3yVx P (x , y) is true, then Vx 3y P (x , y) must also be true. However, if Vx 3y P (x , y) is true, it is not necessary for 3yVx P (x , y) to be true. (See Supplementary Exercises 24 and 25 at the end of this chapter.) Table 1 summarizes the meanings of the different possible quantifications involving two
variables. Quantifications of more than two variables are also common, as Example
EXAMPLE 5
5 illustrates.
Let Q (x , y, z) be the statement "x + y = z." What are the truth values of the statements VxVy3z Q (x , y , z) and 3zVxVy Q (x , y , z), where the domain of all variables consists of all real numbers?
Solution: Suppose that x and y are assigned values. Then, there exists a real number z such that + y = z. Consequently, the quantification
x
VxVy3z Q (x , y, z),
54
I / The Foundations: Logic and Proofs
1-54
which is the statement "For all real numbers x and for all real numbers y there is a real number z such that x + y z,"
=
is true. The order of the quantification here is important, because the quantification
3zVxVy Q(x , y, z), which is the statement "There is a real number z such that for all real numbers x and for all real numbers y it is true that + y = z,"
x
is false, because there is no value of z that satisfies the equation x + y = z for all values of x .... and y.
Translating Mathematical Statements into Statements Involving Nested Quantifiers Mathematical statements expressed in English can be translated into logical expressions, as Examples 6-8 show. EXAMPLE 6
Exam= �
Translate the statement "The sum of two positive integers is always positive" into a logical expression.
Solution: To translate this statement into a logical expression, we first rewrite it so that the implied quantifiers and a domain are shown: "For every two integers, if these integers are both positive, then the sum of these integers is positive." Next, we introduce the variables x and y to obtain "For all positive integers x and y, x + y is positive." Consequently, we can express this statement as VxVy((x
>
0)
1\
(y
>
0)
--+
(x + y
>
0» ,
where the domain for both variables consists of all integers. Note that we could also translate this using the positive integers as the domain. Then the statement "The sum of two positive integers is always positive" becomes "For every two positive integers, the sum of these integers is positive. We can express this as
VxVy(x + y
>
0) ,
where the domain for both variables consists of all positive integers. EXAMPLE 7
Translate the statement "Every real number except zero has a multiplicative inverse." (A mul a real number y such that xy 1 .)
tiplicative inverse of a real number x is
=
Solution: We first rewrite this as "For every real number x except zero, x has a multiplicative inverse." We can rewrite this as "For every real number x , if x =I 0, then there exists a real number y such that xy I ." This can be rewritten as
=
Vx((x =I 0) --+ 3y(xy
= 1» .
One example that you may be familiar with is the concept of limit, which is important in calculus.
1 .4 Nested Quantifiers
1 -55
EXAMPLE 8
55
(Requires Calculus) Express the definition of a limit using quantifiers. Solution: Recall that the definition of the statement
lim f(x) = L
x -> a
is: For every real number E > 0 there exists a real number 8 > 0 such that I f(x) - L I < E whenever 0 < Ix - a I < 8. This definition of a limit can be phrased in terms of quantifiers by VE 38Vx(0
<
Ix - a l
<
8 -+ I f(x) - L I
<
E),
where the domain for the variables 8 and E consists of all positive real numbers and for x consists of all real numbers. This definition can also be expressed as VE > 0 38 > O Vx(0
<
Ix - a l
<
8 -+ I f(x) - L I
<
E)
when the domain for the variables E and 8 consists ofall real numbers, rather than just the positive real numbers. [Here, restricted quantifiers have been used. Recall that 'Ix > 0 P (x) means that � for all x with x > 0, P (x) is true.]
Translating from Nested Quantifiers into English Expressions with nested quantifiers expressing statements in English can be quite complicated. The first step in translating such an expression is to write out what the quantifiers and predicates in the expression mean. The next step is to express this meaning in a simpler sentence. This process is illustrated in Examples 9 and 1 0. EXAMPLE 9
Translate the statement Vx(C(x) v 3y(C(y) 1\ F(x , y» )
into English, where C (x) is "x has a computer," F (x , y) is "x and y are friends," and the domain for both x and y consists of all students in your school. Solution: The statement says that for every student x in your school, x has a computer or there is a student y such that y has a computer and x and y are friends. In other words, every student
in your school has a computer or has a friend who has a computer.
EXAMPLE 10
�
Translate the statement 3xVyVz«F(x , y) 1\ F(x , z) 1\ (y :j:. z» -+ --.F(y,z»
into English, where F(a , b) means a and b are friends and the domain for x, y, and z consists of all students in your school. Solution: We first examine the expression (F(x , y) 1\ F (x , z) 1\ (y :j:. z» -+ --. F(y, z). This expression says that if students x and y are friends, and students x and z are friends, and further
more, if y and z are not the same student, then y and z are not friends. It follows that the original statement, which is triply quantified, says that there is a student x such that for all students y and all students z other than y, if x and y are friends and x and z are friends, then y and z are not friends. In other words, there is a student none of whose friends are also friends with each � other.
56
1 / The Foundations: Logic and Proofs
I -56
Translating English Sentences into Logical Expressions In Section 1 .3 we showed how quantifiers can be used to translate sentences into logical expres sions. However, we avoided sentences whose translation into logical expressions required the use of nested quantifiers. We now address the translation of such sentences. EXAMPLE 11
Express the statement "If a person is female and is a parent, then this person is someone's mother" as a logical expression involving predicates, quantifiers with a domain consisting of all people, and logical connectives.
Solution: The statement "If a person is female and is a parent, then this person is someone's mother" can be expressed as "For every person x , if person x is female and person x is a parent, then there exists a person y such that person x is the mother of person y." We introduce the propositional functions F(x) to represent "x is female," P (x) to represent "x is a parent," and M (x , y) to represent "x is the mother of y." The original statement can be represented as Vx« F (x) /\ P (x» -+ 3yM(x , y» . Using the null quantification rule in part (b) of Exercise 47 in Section 1 .3, we can move 3y to the left so that it appears just after Vx , because y does not appear in F(x) /\ P(x). We obtain the logically equivalent expression
Vx 3y« F (x) /\ P (x» -+ M(x , y» . EXAMPLE 12
Express the statement "Everyone has exactly one best friend" as a logical expression involving predicates, quantifiers with a domain consisting of all people, and logical connectives.
Solution: The statement "Everyone has exactly one best friend" can be expressed as "For every person x , person x has exactly one best friend." Introduc ing the universal quantifier, we see that this statement is the same as "Vx (person x has exactly one best friend)," where the domain consists of all people. To say that x has exactly one best friend means that there is a person y who is the best friend of x , and furthermore, that for every person z , if person z is not person y, then z is not the best friend of x . When we introduce the predicate B (x , y) to be the statement "y is the best friend of x ," the statement that x has exactly one best friend can be represented as 3y(B (x , y) /\ Vz « z # y) -+ -.B(x , z » ). Consequently, our original statement can be expressed as
Vx 3y(B (x , y) /\ Vz« z # y) -+ -.B(x , z» ). [Note that we can write this statement as Vx 3 !yB(x , y), where 3 ! is the "uniqueness quantifier" ... defined on page 37. ] EXAMPLE 13
Use quantifiers to express the statement "There is a woman who has taken a flight on every airline in the world."
1 .4 Nested Quantifiers
/-57
57
Solution: Let P (w , f) be "w has taken f" and QU, a) be "f is a flight on a ." We can express the statement as 3w'v'a3f(P (w , f) 1\ QU, a» , where the domains of discourse for w , f, and a consist of all the women in the world, all airplane flights, and all airlines, respectively. The statement could also be expressed as 3w'v'a3fR(w , f, a), where R(w , f, a) is "w has taken f on a ." Although this is more compact, it somewhat obscures the relationships among the variables. Consequently, the first solution is usually preferable. ....
Negating Nested Quantifiers
Assessmenl �
Statements involving nested quantifiers can be negated by successively applying the rules for negating statements involving a single quantifier. This is illustrated in Examples 1 4- 1 6 .
EXAMPLE 14
Express the negation o f the statement 'v'x3y(xy = 1 ) s o that n o negation precedes a quantifier.
Exlra � EIIamples � EXAMPLE 15
Solution: By successively applying De Morgan's laws for quantifiers in Table 2 of Section 1 .3, we can move the negation in -Nx 3y(xy 1 ) inside all the quantifiers. We find that -Nx 3y(xy = 1 ) is equivalent t0 3x ..... 3y(xy = 1 ), which is equivalent t0 3x'v'y ..... (xy = 1 ). Because ..... (xy = l ) can be expressed more simply as xy . =1= 1 , we conclude that our negated statement can be expressed .... as 3x'v'y(xy =1= 1 ).
=
Use quantifiers to express the statement that "There does not exist a woman who has taken a flight on every airline in the world."
Solution: This statement is the negation of the statement "There is a woman who has taken a flight on every airline in the world" from Example 1 3 . By Example 1 3 , our statement can be expressed as ..... 3w'v'a3f(P (w , f) 1\ QU, a» , where P (w , f) is "w has taken f" and QU, a) is "f is a flight on a ." By successively applying De Morgan's laws for quantifiers in Table 2 of Section 1 .3 to move the negation inside successive quantifiers and by applying De Morgan's law for negating a conjunction in the last step, we find that our statement is equivalent to each of this sequence of statements: 'v'w .....'v'a 3f(P(w , f) 1\ QU, a»
'v'w 3a ..... 3f(P(w , f) 1\ Q( J, a» == 'v'w 3a'v'f ..... (P(w , f) 1\ QU, a»
==
==
'v'w 3a'v'f(""'P (w , f) V ..... QU, a» .
This last statement states "For every woman there is an airline such that for all flights, this .... woman has not taken that flight or that flight is not on this airline." EXAMPLE 16
(Requires Calculus) Use quantifiers and predicates to express the fact that limx-+a f(x) does not exist. Solution: To say that limx-->a f(x) does not exist means that for all real numbers L , limx-->a f(x) =1= L . B y using Example 8, the statement limx-->a f(x) =1= L can b e expressed as
.....'v'E > 0 38 > 0 'v'x(O < Ix - a l < 8 --* I f(x) - L I < E).
58
I
1-58
/ The Foundations: Logic and Proofs
Successively applying the rules for negating quantified expressions, we construct this sequence of equivalent statements
-,vE > 0 38 > 0 Vx(O < Ix - a l < 8
-*
I f(x) - L I < E)
== 3E > 0 -.38 > 0 Vx(O < Ix - al < 8
-*
I f(x) - L I < E)
== 3E > 0 V8 > 0 -.VX(O < Ix - al < 8
-*
I f(x) - L I < E)
== 3E > 0 V8 > 0 3x -.(0 < Ix - al < 8
-*
I f(x) - L I < E)
== 3E > 0 V8 > 0 3x(0 < Ix - al < 8 /\ I f(x) - LI
� E).
In the last step we used the equivalence -'(p -* q) == P /\ -'q , which follows from the fifth equivalence in Table 7 of Section 1 .2. Because the statement "limx---+ a f(x) does not exist" means for all real numbers L, limx---+ a f(x) i= L , this can be expressed as
VL 3E > 0 V8 > 0 3x(0 < Ix - a l < 8 /\ I f(x) - L I � E). This last statement says that for every real number L there is a real number E > 0 such that for every real number 8 > 0, there exists a real number x such that 0 < Ix - a I < 8 and � I f(x) - L I � E .
Exercises 1. Translate these statements into English, where the domain for each variable consists of all real numbers. a) 'v'x3y(x < y) b) 'v'x'v'y(((x ::: 0) 1\ (y ::: 0» -+ (xy ::: 0» c) 'v'x'v'y3z(xy = z) 2. Translate these statements into English, where the domain for each variable consists of all real numbers. a) 3x'v'y(xy = y) b) 'v'x'v'y(((x ::: 0) 1\ (y < 0» -+ (x - y > 0» c) 'v'x'v'y3z(x = Y + z) 3. Let Q (x , y) be the statement "x has sent an e-mail mes sage to y," where the domain for both x and y consists of all students in your class. Express each of these quantifi cations in English.
b) 3x'v'y Q (x , y) 3x 3y Q (x , y) 3y'v'x Q(x , y) c) 'v'x 3y Q (x , y) 1) 'v'x'v'y Q (x , y) 'v'y3x Q (x , y) 4. Let P (x , y) be the statement "student x has taken class y," where the domain for x consists of all students in your class and for y consists of all computer science courses at your school. Express each of these quantifications in English.
a)
e)
d)
b) 3x'v'y P (x , y) 3x 3y P (x , y) 3y 'v'x P (x , y) c) 'v'x 3yP (x , y) 1) 'v'x'v'y P (x , y) 'v'y3x P (x , y) 5. Let W (x , y) mean that student x has visited website y, where the domain for x consists of all students in your
a)
e)
d)
school and the domain for y consists of all websites. Express each of these statements by a simple English sentence.
a) W (Sarah Smith, www.att.com) b) 3x W (x , www.imdb.org) c) 3yW(Jose Orez, y) 3y(W(Ashok Puri, y) 1\ W(Cindy Yoon, y» 3y'v'z(y i= (David Belcher) 1\ (W(David Belcher, z) -+ W(y,z» ) 1) 3x 3y'v'z((x i= y) 1\ (W(x , z) B W(y , z))) 6. Let C (x , y) mean that student x is enrolled in class y, where the domain for x consists of all students in your school and the domain for y consists of all classes being given at your school. Express each of these statements by a simple English sentence. a) C (Randy Goldberg, CS 252) b) 3xC (x , Math 695) c) 3yC (Carol Sitea, y) 3x(C (x , Math 222) 1\ C(x , CS 252» 3x 3y'v'z((x i= y) 1\ (C (x , z) -+ C (y, z» ) 1) 3x 3y'v'z((x i= y) 1\ (C(x , z) B C (y, z» ) 7 . Let T (x , y) mean that student x likes cuisine y, where the domain for x consists of all students at your school and the domain for y consists of all cuisines. Express each of these statements by a simple English sentence. a) -.T (Abdallah Hussein, Japanese) b) 3x T (x , Korean) 1\ 'v'x T (x , Mexican)
d) e)
e)d)
.
1 .4 Nested Quantifiers
1-59
c) 3y(T (Monique Arsenault, y) V T (Jay Johnson, y» 'v'x'v'z3y« x =f. z) --+ -'(T (x , y) /\ T (z , y» ) 3x 3z'v'y(T (x , y) *+ T (z, y» t) 'v'x'v'z3y(T(x , y) *+ T (z, y» 8. Let Q(x , y) be the statement "student x has been a contestant on quiz show y." Express each of these sentences in terms of Q(x , y), quantifiers, and logical con nectives, where the domain for x consists of all students at your school and for y consists of all quiz shows on television. a) There is a student at your school who has been a con testant on a television quiz show. b) No student at your school has ever been a contestant on a television quiz show. c) There is a student at your school who has been a con testant on Jeopardy and on Wheel ofFortune. Every television quiz show has had a student from your school as a contestant. At least two students from your school have been con testants on Jeopardy. 9. Let L (x , y) be the statement "x loves y," where the do main for both x and y consists of all people in the world. Use quantifiers to express each of these statements. a) Everybody loves Jerry. b) Everybody loves somebody. c) There is somebody whom everybody loves. Nobody loves everybody. There is somebody whom Lydia does not love. t) There is somebody whom no one loves. g) There is exactly one person whom everybody loves. h) There are exactly two people whom Lynn loves. i) Everyone loves himself or herself. j) There is someone who loves no one besides himself or herself.
d) e)
d) e)
e)d)
10. Let F(x , y) be the statement "x can fool y," where the do main consists of all people in the world. Use quantifiers to express each of these statements. a) Everybody can fool Fred. b) Evelyn can fool everybody. c) Everybody can fool somebody. There is no one who can fool everybody. ) Everyone can be fooled by somebody. t) No one can fool both Fred and Jerry. g) Nancy can fool exactly two people. h) There is exactly one person whom everybody can fool. i) No one can fool himself or herself. j) There is someone who can fool exactly one person besides himself or herself. 1 1 . Let S(x) be the predicate "x is a student," F (x) the pred icate "x is a faculty member," and A(x , y) the predicate "x has asked y a question," where the domain consists of all people associated with your school. Use quantifiers to express each of these statements.
d)e
a) Lois has asked Professor Michaels a question. b) Every student has asked Professor Gross a question.
S9
c) Every faculty member has either asked Professor Miller a question or been asked a question by Profes sor Miller. Some student has not asked any faculty member a question. There is a faculty member who has never been asked a question by a student. t) Some student has asked every faculty member a question. g) There is a faculty member who has asked every other faculty member a question. h) Some student has never been asked a question by a faculty member. 12. Let I (x) be the statement "x has an Internet connection" and C (x , y) be the statement "x and y have chatted over the Internet," where the domain for the variables x and y consists of all students in your class. Use quantifiers to express each of these statements. a) Jerry does not have an Internet connection. b) Rachel has not chatted over the Internet with Chelsea. c) Jan and Sharon have never chatted over the Internet. No one in the class has chatted with Bob. Sanjay has chatted with everyone except Joseph. t) Someone in your class does not have an Internet connection. g) Not everyone in your class has an Internet connection. h) Exactly one student in your class has an Internet connection. i) Everyone except one student in your class has an Internet connection. j) Everyone in your class with an Internet connection has chatted over the Internet with at least one other student in your class. k) Someone in your class has an Internet connection but has not chatted with anyone else in your class. I) There are two students in your class who have not chatted with each other over the Internet. m) There is a student in your class who has chatted with everyone in your class over the Internet. n) There are at least two students in your class who have not chatted with the same person in your class. 0) There are two students in the class who between them have chatted with everyone else in the class. 13. Let M (x , y) be "x has sent y an e-mail message" and T (x , y) be "x has telephoned y," where the domain consists of all students in your class. Use quantifiers to express each of these statements. (Assume that all e-mail messages that were sent are received, which is not the way things often work.) a) Chou has never sent an e-mail message to Koko. b) Arlene has never sent an e-mail message to or tele phoned Sarah. c) Jose has never received an e-mail message from Deborah. Every student in your class has sent an e-mail mes sage to Ken.
d) e)
e)d)
d)
60
I / The Foundations: Logic and Proofs e) No one in your class has telephoned Nina. 1) Everyone in your class has either telephoned Avi or sent him an e-mail message. g) There is a student in your class who has sent everyone else in your class an e-mail message. h) There is someone in your class who has either sent an e-mail message or telephoned everyone else in your class. i) There are two different students in your class who have sent each other e-mail messages. j) There is a student who has sent himself or herself an e-mail message. k) There is a student in your class who has not received an e-mail message from anyone else in the class and who has not been called by any other student in the class. I) Every student in the class has either received an e-mail message or received a telephone call from another student in the class. m) There are at least two students in your class such that one student has sent the other e-mail and the second student has telephoned the first student. n) There are two different students in your class who between them have sent an e-mail message to or telephoned everyone else in the class.
14. Use quantifiers and predicates with more than one variable to express these statements. a) There is a student in this class who can speak Hindi. b) Every student in this class plays some sport. c) Some student in this class has visited Alaska but has not visited Hawaii. d) All students in this class have learned at least one programming language. e) There is a student in this class who has taken ev ery course offered by one of the departments in this school. 1) Some student in this class grew up in the same town as exactly one other student in this class. g) Every student in this class has chatted with at least one other student in at least one chat group. 1 5. Use quantifiers and predicates with more than one variable to express these statements.
a) Every computer science student needs a course in discrete mathematics. b) There is a student in this class who owns a personal computer. c) Every student in this class has taken at least one computer science course. d) There is a student in this class who has taken at least one course in computer science. e) Every student in this class has been in every building on campus. 1) There is a student in this class who has been in every room of at least one building on campus. g) Every student in this class has been in at least one room of every building on campus.
1 -60
16. A discrete mathematics class contains I mathematics major who is a freshman, mathematics majors who are sophomores, computer science majors who are sophomores, mathematics majors who are juniors, computer science majors who are juniors, and 1 computer science major who is a senior. Express each of these state ments in terms of quantifiers and then determine its truth value. a) There is a student in the class who is a junior. b) Every student in the class is a computer science major. c) There is a student in the class who is neither a math ematics major nor a junior. d) Every student in the class is either a sophomore or a computer science major. e) There is a major such that there is a student in the class in every year of study with that major. 17. Express each of these system specifications using predi cates, quantifiers, and logical connectives, if necessary. a) Every user has access to exactly one mailbox. b) There is a process that continues to run during all error conditions only if the kernel is working correctly. c) All users on the campus network can access all web sites whose urI has a .edu extension. *d) There are exactly two systems that monitor every remote server.
2 15
12
2
18. Express each of these system specifications using predi cates, quantifiers, and logical connectives, if necessary.
a) At least one console must be accessible during every fault condition. b) The e-mail address of every user can be retrieved whenever the archive contains at least one message sent by every user on the system. c) For every security breach there is at least one mecha nism that can detect that breach if and only if there is a process that has not been compromised. d) There are at least two paths connecting every two distinct endpoints on the network. e) No one knows the password of every user on the sys tem except for the system administrator, who knows all passwords. 19. Express each of these statements using mathematical and logical operators, predicates, and quantifiers, where the domain consists of all integers. a) The sum of two negative integers is negative. b) The difference of two positive integers is not neces sarily positive. c) The sum of the squares of two integers is greater than or equal to the square of their sum. d) The absolute value of the product of two integers is the product of their absolute values. 20. Express each of these statements using predicates, quan tifiers, logical connectives, and mathematical operators where the domain consists of all integers. a) The product of two negative integers is positive. b) The average of two positive integers is positive.
1 .4 Nested Quantifiers 61
/-6/
e) The difference of two negative integers is not neces
sarily negative. The absolute value of the sum of two integers does not exceed the sum ofthe absolute values ofthese integers. Use predicates, quantifiers, logical connectives, and mathematical operators to express the statement that every positive integer is the sum of the squares of four integers. Use predicates, quantifiers, logical connectives, and math ematical operators to express the statement that there is a positive integer that is not the sum of three squares. Express each of these mathematical statements using predicates, qUahtifiers, logical connectives, and mathe matical operators. a) The product of two negative real numbers is positive. b) The difference of a real number and itself is zero. e) Every positive real number has exactly two square roots. A negative real number does not have a square root that is a real number. Translate each of these nested quantifications into an En glish statement that expresses a mathematical fact. The domain in each case consists of all real numbers. a) + =
d)
21.
22.
23.
3xVy(x y ) y)/\ (y » --+ (x - y VxVy(((x 0 0 0» 3x3y(((x � 0) /\ (y � 0» /\ (x 0» d) VxVy((x ::/= 0) /\ (y ::/= 0) (xy ::/= 0» b) e)
�
<
29.
a) e)
26.
>
<
=
e) e) g) i)
Q(I, y) VyQ(I, 3x3yQ(x, y)y) 3yVxQ(x, VxVyQ(x,y)
Vx3yVzT(x, y, z) Vx3yP(x,y) v Vx3yQ(x,y) Vx3y(P(x, y) /\ 3zR (x, y, z» d) Vx3y(P(x, y) --+ Q(x, y» a)
Q(2, 0) 2) d) Vx3yQ(x,y) 3xQ(x, Vy3xQ(x,y)
b) c)
32. Express the negations of each of these statements so that
all negation symbols immediately precede predicates.
3zVyVxT(x,y, z) 3x3yP(x,y) /\ VxVyQ(x, y) 3x3y(Q(x, y) Q(Y, x» d) Vy3x3z(T(x,y, z) v Q(x, y» a)
b) e)
e) e) g) h) i)
pear only within predicates (that is, so that no negation is outside a quantifier or an expression involving logical connectives).
=
=
=
=
t)
=
=
=
e)
if the domain of each variable consists of all real numbers.
Vx3y(x2 y) =
=
6)
=
b)
e)
34.
28. Determine the truth value of each of these statements
a)
y) -.VxVyP(x, -.Vy3xP(x, y) -.VyVx(P(x, y) v Q(x, y» d) -.(3x3y-.P(x, y) /\ VxVyQ(x, y» -.Vx(3yVzP(x, y, z) /\ 3zVyP(x, y, z» Find a common domain for the variables x, y, and z for which the statement VxVy((x ::/= y) --+ Vz((z x) (z y» ) is true and another domain for which it is false. Find a common domain for the variables x, y, z, and w for which the statement VxVyVz3w((w ::/= x) /\ (w ::/= y)these /\ (wvariables ::/= z» isfortruewhichandit another common domain for is false. a)
Vn3m(n 2+ mm) 0) d) 3nVm(nm 3nVm(n m2)m) Vn3m(n 2 + m 2 5) 3n3m(n2 + m2 3n3m(n 3n3m(n ++ mm 44 /\/\ nn -- mm 2)1 ) 3n3m(n VnVm3p(p (m + n)/2) =
_
33. Rewrite each of these statements so that negations ap
t) b)
<
b)
all negation symbols immediately precede predicates.
b)
b)
b)
e)
the domain for all variables consists of all integers. <
3x3yP(x, y) d) Vy3xP(x, y)
3 1 . Express the negations of each of these statements so that
27. Determine the truth value of each of these statements if
a)
=
-.3y3xP(x, y) -.Vx3yP(x, y) -.3y(Q(y) /\ Vx-.R (x, y» d) -.3y(3xR (x , y) v VxS(x, y» -.3y(Vx3zT(x, y, z) v 3xVzU(x, y, z»
=
are the truth values? a) 1)
VxVyP(x,y) 3xVyP(x, y)
a)
>
<
=
-Y =
e)
>
3xVy(xy y)) /\ (y --+ (xy VxVy(((x 0 /\ (x 0»y» 0» 3x3y((x2 y) d) VxVy3z(x + y z) Let Q(x, y) be the statement "x + y x - y." If the domain for both variables consists of all integers, what <
=
=
1)
pear only within predicates (that is, so that no negation is outside a quantifier or an expression involving logical connectives).
Y >
glish statement that expresses a mathematical fact. The domain in each case consists of all real numbers.
b) e)
=
=
30. Rewrite each of these statements so that negations ap
25. Translate each of these nested quantifications into an En
=
=
and conjunctions.
_
a)
=
e) t) g) b) i) j)
d)
24.
3xVy(xy d) Vx(x 3x3y(x::/= + --+y 0::/=)3y(xy y + x) 1» 0 3xVy(y +::/= y0 --+ 1xy) Vx3y(x 3x3y(x + 2y 2 /\ 2x + 4y 5) Vx3y(x + y 2 /\ 2x 1 ) VxVy3z(z (x + y)/2) Suppose the domain ofthe propositional function P(x, y) consists of pai� x and y, where x is 1 , 2, or 3 and y is 1 , 2, or 3. Write out these propositions using disjunctions e)
35.
=
V
36. Express each of these statements using quantifiers. Then
form the negation of the statement so that no negation is to the left of a quantifier. Next, express the negation in
62
I
/ The Foundations: Logic and Proofs
simple English. (Do not simply use the words "It is not the case that.")
a) No one has lost more than one thousand dollars play ing the lottery. b) There is a student in this class who has chatted with exactly one other student. c) No student in this class has sent e-mail to exactly two other students in this class. Some student has solved every exercise in this book. e) No student has solved at least one exercise in every section of this book.
d)
37. Express each of these statements using quantifiers. Then form the negation of the statement so that no negation is to the left of a quantifier. Next, express the negation in simple English. (Do not simply use the words "It is not the case that.")
a) Every student in this class has taken exactly two mathematics classes at this school. b) Someone has visited every country in the world except Libya. c) No one has climbed every mountain in the Himalayas. Every movie actor has either been in a movie with Kevin Bacon or has been in a movie with someone who has been in a movie with Kevin Bacon.
d)
38. Express the negations of these propositions using quan tifiers, and in English.
a) Every student in this class likes mathematics. b) There is a student in this class who has never seen a computer. c) There is a student in this class who has taken every mathematics course offered at this school. There is a student in this class who has been in at least one room of every building on campus.
d)
39. Find a counterexample, if possible, to these universally quantified statements, where the domain for all variables consists of all integers.
a) b) c)
VxVy(x2 y2 -+ X y) Vx3y(y2 x) VxVy(xy ::: x) =
=
=
40. Find a counterexample, i f possible, t o these universally quantified statements, where the domain for all variables consists of all integers.
a) b) c)
Vx3y(x l/y) Vx3y(y2 - X 3 VxVy(x2 y ) =
<
1 00)
=1=
41. Use quantifiers to express the associative law for multi plication of real numbers. 42. Use quantifiers to express the distributive laws of multi plication over addition for real numbers. 43. Use quantifiers and logical connectives to express the fact that every linear polynomial (that is, polynomial of degree 1 ) with real coefficients and where the coefficient of is nonzero, has exactly one real root.
x
44. Use quantifiers and logical connectivesto express the fact
1 - 62
that a quadratic polynomial with real number coefficients has at most two real roots. 45. Determine the truth value ofthe statement i f the domain for the variables consists of a) the nonzero real numbers. b) the nonzero integers. c) the positive real numbers.
46. Determine the truth value of the statement if the domain for the variables consists of
Vx 3y(xy
=
1)
3xVy(x y2) :s:
a) the positive real numbers. b) the integers. c) the nonzero real numbers. 47. Show that the two statements and where both quantifiers over the first vari able in have the same domain, and both quantifiers over the second variable in have the same domain, are logically equivalent. *48. Show that v and v where all quantifiers have the same nonempty domain, are logically equivalent. (The new variable is used to combine the quantifications correctly.) *49. a) Show that 1\ is logically equivalent to 1\ where all quantifiers have the same nonempty domain. b) Show that v is equivalent to v where all quantifiers have the same nonempty domain.
-.3xVy P(x, y)
Vx3y-.P(x, P (x, y)y),
P (x, y) VxP(x) VxQ(x) VxVy(P(x) Q(Y» , y VxP(x)Q( 3xQ(x) Vx3y (P(x) Y» , Vx3y(P(x) VxP(x) Q(Y» , 3xQ(x)
A statement is in prenex normal form (PNF) if and only if it is of the form
Q \ x \ Q2X2 . . . QkXk P(X \ , X2 , . . . , Xk), Qi, i 1 , 2, P(x\,3xVy(P(x, . . . ,Xk) y) Q(y» 3xP(x) VxQ(x)
where each = . . . , k, is either the existential quanti fier or the universal quantifier, and is a predicate involving no quantifiers. For example, 1\ is in prenex normal form, whereas v is not (because the quantifiers do not all occur first). Every statement formed from propositional variables, predicates, T, and F using logical connectives and quantifiers is equivalent to a statement in prenex normal form. Exercise 5 1 asks for a proof of this fact. *50. Put these statements in prenex normal form. [Hint: Use logical equivalence from Tables 6 and 7 in Section 1 .2, Table 2 in Section 1 .3 , Example 19 in Section 1 .3 , Ex ercises 45 and 46 in Section 1 .3 , and Exercises 48 and 49 in this section.]
a) b) c)
3xP(x) 3xQ(x) -.(VxP(x) VxQ(x» 3xP(x) 3xQ(x)
v v A, where A is a proposition not involving any quantifiers. v
-+
**51. Show how to transform an arbitrary statement to a state ment in prenex normal form that is equivalent to the given statement.
3!xP(x),
*52. Express the quantification introduced on page 38, using universal quantifications, existential quantifica tions, and logical operators.
1 .5 Rules of Inference
1-63
63
1.5 Rules of Inference Introduction Later in this chapter we will study proofs. Proofs in mathematics are valid arguments that estab lish the truth of mathematical statements. By an argument, we mean a sequence of statements that end with a conclusion. By valid, we mean that the conclusion, or final statement of the argument, must follow from the truth ofthe preceding statements, or premises, ofthe argument. That is, an argument is valid if and only if it is impossible for all the premises to be true and the conclusion to be false. To deduce new statements from statements we already have, we use rules of inference which are templates for constructing valid arguments. Rules of inference are our basic tools for establishing the truth of statements. Before we study mathematical proofs, we will look at arguments that involve only compound propositions. We will define what it means for an argument involving compound propositions to be valid. Then we will introduce a collection of rules of inference in propositional logic. These rules of inference are among the most important ingredients in producing valid arguments. After we illustrate how rules of inference are used to produce valid arguments, we will describe some common forms of incorrect reasoning, called fallacies, which lead to invalid arguments. After studying rules of inference in propositional logic, we will introduce rules of inference for quantified statements. We will describe how these rules of inference can be used to produce valid arguments. These rules of inference for statements involving existential and universal quantifiers play an important role in proofs in computer science and mathematics, although they are often used without being explicitly mentioned. Finally, we will show how rules of inference for propositions and for quantified statements can be combined. These combinations of rule of inference are often used together in complicated arguments.
Valid Arguments in Propositional Logic Consider the following argument involving propositions (which, by definition, is a sequence of propositions): "If you have a current password, then you can log onto the network." "You have a current password." Therefore, "You can log onto the network." We would like to determine whether this is a valid argument. That is, we would like to determine whether the conclusion "You can log onto the network" must be true when the premises "If you have a current password, then you can log onto the network" and "You have a current password" are both true. Before we discuss the validity of this particular argument, we will look at its form. Use p to represent "You have a current password" and q to represent "You can log onto the network." Then, the argument has the form
p -+ q p :. q where . ' . is the symbol that denotes "therefore."
64
1 / The Foundations: Logic and Proofs
/-64
We know that when P and q are propositional variables, the statement « p -+ q) 1\ p) -+ q is a tautology (see Exercise l O(c) in Section 1 .2). In particular, when both P -+ q and p are true, we know that q must also be true. We say this form of argument is valid because whenever all its premises (all statements in the argument other than the final one, the conclusion) are true, the conclusion must also be true. Now suppose that both "If you have a current password, then you can log onto the network" and "You have a current password" are true statements. When we replace p by "You have a current password" and q by "You can log onto the network," it necessarily follows that the conclusion "You can log onto the network" is true. This argument is valid because its form is valid. Note that whenever we replace p and q by propositions where p -+ q and p are both true, then q must also be true. What happens when we replace p and q in this argument form by propositions where not both p and p -+ q are true? For example, suppose that p represents "You have access to the network" and q represents "You can change your grade" and that p is true, but p -+ q is false. The argument we obtain by substituting these values of p and q into the argument form is "If you have access to the network, then you can change your grade." "You have access to the network." . ' . "You can change your grade." The argument we obtained is a valid argument, but because one of the premises, namely the first premise, is false, we cannot conclude that the conclusion is true. (Most likely, this conclusion is false.) In our discussion, to analyze an argument, we replaced propositions by propositional vari ables. This changed an argument to an argument form. We saw that the validity of an argument follows from the validity of the form of the argument. We summarize the terminology used to discuss the validity of arguments with our definition of the key notions.
DEFINITION 1
An argument in propositional logic is a sequence of propositions. All but the final proposition in the argument are called premises and the final proposition is called the conclusion. An argument is valid if the truth of all its premises impl}es that the conclusion is true. An argument form in propositional logic is a sequence of compound propositions in volving propositional variables. An argument form is valid if no matter which particular
propositions are substituted for the propositional variables in its premises, the conclusion is true if the premises are all true.
From the definition of a valid argument form we see that the argument form with premises 1\ P2 1\ . . . 1\ Pn ) -+ q is a tautology. The key to showing that an argument in propositional logic is valid is to show that its argument form is valid. Consequently, we would like techniques to show that argument forms are valid. We will now develop methods for accomplishing this task.
P I , P2 , . . . , Pn and conclusion q is valid, when (PI
Rules of Inference for Propositional Logic We can always use a truth table to show that an argument form is valid. We do this by showing that whenever the premises are true, the conclusion must also be true. However, this can be a tedious approach. For example, when an argument form involves 1 0 different propositional variables, to use a truth table to show this argument form is valid requires 2 1 0 = 1 024 different
1-65
1 .5 Rules of inference 65
rows. Fortunately, we do not have to resort to truth tables. Instead, we can first establish the validity of some relatively simple argument forms, called rules of inference. These rules of inference can be used as building blocks to construct more complicated valid argument forms. We will now introduce the most important rules of inference in propositional logic. The tautology (p 1\ (p ---+ q)) ---+ q is the basis ofthe rule of inference called modus ponens, or the law of detachment. (Modus ponens is Latin for mode that affirms. ) This tautology leads to the following valid argument form, which we have already seen in our initial discussion about arguments (where, as before, the symbol : . denotes "therefore."):
p p ---+ q :. q Using this notation, the hypotheses are written in a column, followed by a horizontal bar, followed by a line that begins with the therefore symbol and ends with the conclusion. In particular, modus ponens tells us that if a conditional statement and the hypothesis of this conditional statement are both true, then the conclusion must also be true. Example 1 illustrates the use of modus ponens. EXAMPLE 1
Suppose that the conditional statement "If it snows today, then we will go skiing" and its hypothesis, "It is snowing today," are true. Then, by modus ponens, it follows that the conclusion .... of the conditional statement, "We will go skiing," is true. As we mentioned earlier, a valid argument can lead to an incorrect conclusion if one or more of its premises is false. We illustrate this again in Example 2.
EXAMPLE 2
Determine whether the argument given here is valid and determine whether its conclusion must be true because of the validity of the argument. "If J2 > � , then (J2) 2 >
(J2) 2
= 2 > G) 2 = � ."
(D 2 . We know that J2 > � . Consequently,
Solution: Let p be the proposition "J2 > �" and q the proposition "2 > (�) 2 ." The premises of the argument are p ---+ q and p, and q is its conclusion. This argument is valid because it is constructed by using modus ponens, a valid argument form. However, one of its premises, J2 > �, is false. Consequently, we cannot conclude that the conclusion is true. Furthermore, .... note that the conclusion of this argument is false, because 2 < � . Table 1 lists the most important rules of inference for propositional logic. Exercises 9, 1 0, 1 5 , and 30 in Section 1 .2 ask for the verifications that these rules of inference are valid argument forms. We now give examples of arguments that use these rules of inference. In each argument, we first use propositional variables to express the propositions in the argument. We then show that the resulting argument form is a rule of inference from Table 1 . EXAMPLE 3
State which rule of inference is the basis of the following argument: "It is below freezing now. Therefore, it is either below freezing or raining now."
66
I / The Foundations: Logic and Proofs
1 - 66
TABLE 1 Rules of Inference. Name
Rule ofInference
Tautology
p p -+ q -:. q
[p
-.q p -+ q -:. -'p
[-.q
p -+ q q -+ r -:. p -+ r
[(p -+ q) 1\ (q -+ r)] -+ (p -+ r)
Hypothetical syllogism
p vq -'p -:. q
[(p V q)
Disjunctive syllogism
p -:. p v q
p -+ (p v q)
Addition
p I\ q -:. p
(p l\ q) -+ p
Simplification
p q -:. p 1\ q
[(p)
pvq -'p v r -:. q V r
[(p v q)
1\
(p -+ q)] -+ q
1\
1\
(p -+ q)] -+ -'p
1\
-'p] -+ q
(q )] -+ (p
1\
1\
q)
(-'P V r)] -+ (q V r)
Modus ponens
Modus tollens
Conjunction
Resolution
Solution: Let p be the proposition "It is below freezing now" and q the proposition "It is raining now." Then this argument is of the form p :. p v q This is an argument that uses the addition rule. EXAMPLE 4
State which rule of inference is the basis of the following argument: "It is below freezing and raining now. Therefore, it is below freezing now."
Solution: Let p be the proposition "It is below freezing now," and let q be the proposition "It is raining now." This argument is of the form
P I\ q p :. This argument uses the simplification rule.
1-67
1 .5 Rules of Inference 67
EXAMPLE 5
State which rule of inference is used in the argument: If it rains today, then we will not have a barbecue today. If we do not have a barbecue today, then we will have a barbecue tomorrow. Therefore, if it rains today, then we will have a barbecue tomorrow.
Solution: Let p be the proposition "It is raining today," let q be the proposition "We will not have a barbecue today," and let r be the proposition "We will have a barbecue tomorrow." Then this argument is of the form p -+ q q -+ r : . p -+ r
Hence, this argument is a hypothetical syllogism.
Using Rules of Inference to Build Arguments When there are many premises, several rules of inference are often needed to show that an argument is valid. This is illustrated by Examples 6 and 7, where the steps of arguments are displayed on separate lines, with the reason for each step explicitly stated. These examples also show how arguments in English can be analyzed using rules of inference. EXAMPLE 6
Show that the hypotheses "It is not sunny this afternoon and it is colder than yesterday," "We will go swimming only if it is sunny," "If we do not go swimming, then we will take a canoe trip," and "If we take a canoe trip, then we will be home by sunset" lead to the conclusion "We will be home by sunset."
Exam: C
Solution: Let p be the proposition "It is sunny this afternoon," q the proposition "It is colder than yesterday," r the proposition "We will go swimming," s the proposition "We will take a canoe trip," and t the proposition "We will be home by sunset." Then the hypotheses become -op 1\ q , r -+ p, -or -+ s, and s -+ t . The conclusion is simply t . We need to give a valid argument with hypotheses -op 1\ q , r -+ p, -or -+ s, and s -+ t and conclusion t . We construct an argument to show that our hypotheses lead to the desired conclusion as follows. Step 1 . -op 1\ q 2. -op 3. r -+ p 4. -or 5. -or -+ s
6. 7.
s
s
8. t
-+ t
Reason
Hypothesis Simplification using ( 1 ) Hypothesis Modus tollens using (2) and (3) Hypothesis Modus ponens using (4) and (5) Hypothesis Modus ponens using (6) and (7)
Note that we could have used a truth table to show that whenever each of the four hypotheses is true, the conclusion is also true. However, because we are working with five propositional ... variables, p, q , r, s, and t, such a truth table would have 32 rows. EXAMPLE 7
Show that the hypotheses "If you send me an e-mail message, then I will finish writing the program," "If you do not send me an e-mail message, then I will go to sleep early," and "If I go
68
1 / The Foundations: Logic and Proofs
1 - 68
to sleep early, then I will wake up feeling refreshed" lead to the conclusion "If I do not finish writing the program, then I will wake up feeling refreshed."
Solution: Let p be the proposition "You send me an e-mail message," q the proposition "I will finish writing the program," r the proposition "I will go to sleep early," and s the proposition "I will wake up feeling refreshed." Then the hypotheses are p -+ q , ""'p -+ r, and r -+ s . The desired conclusion is ..... q -+ s . We need to give a valid argument with hypotheses p -+ q , ""' p -+ r , and r -+ s and conclusion ..... q -+ s . This argument form shows that the hypotheses p lead to the desired conclusion. Step
Reason
1 . p -+ q 2 . ..... q -+ ""'p 3. ""'p -+ r 4 . .....q -+ r 5. r -+ s 6 . ..... q -+ s
Hypothesis Contrapositive of ( 1 ) Hypothesis Hypothetical syllogism using (2) and (3) Hypothesis Hypothetical syllogism using (4) and (5)
Resolution
Unks � �
Computer programs have been developed to automate the task of reasoning and proving theo rems. Many of these programs make use of a rule of inference known as resolution. This rule of inference is based on the tautology
«p v q) 1\ ( ..... p V r » -+ (q
V r).
(The verification that this is a tautology was addressed in Exercise 30 in Section 1 .2.) The final disjunction in the resolution rule, q v r , is called the resolvent. When we let q = r in this tautology, we obtain (p v q) 1\ ( .....p V q) -+ q . Furthermore, when we let r = F, we obtain (p v q) 1\ ( ..... p) -+ q (because q v F = q), which is the tautology on which the rule of disjunc tive syllogism is based. EXAMPLE S
Use resolution to show that the hypotheses "Jasmine is skiing or it is not snowing" and "It is snowing or Bart is playing hockey" imply that "Jasmine is skiing or Bart is playing hockey."
Extra � Examples �
Solution: Let p be the proposition "It is snowing," q the proposition "Jasmine is skiing," and r the proposition "Bart is playing hockey." We can represent the hypotheses as ""'p v q and p v r , respectively. Using resolution, the proposition q v r , "Jasmine i s skiing or Bart i s playing ... hockey," follows. Resolution plays an important role in programming languages based on the rules of logic, such as Prolog (where resolution rules for quantified statements are applied). Furthermore, it can be used to build automatic theorem proving systems. To construct proofs in propositional logic using resolution as the only rule of inference, the hypotheses and the conclusion must be expressed as clauses, where a clause is a disjunction of variables or negations of these variables. We can replace a statement in propositional logic that is not a clause by one or more equivalent statements that are clauses. For example, suppose we have a statement of the form p v (q 1\ r ) . Because p v (q 1\ r ) = (p V q) 1\ (p V r ) , we can replace the single statement p v (q 1\ r ) by two statements p v q and p v r , each of which is a clause. We can replace a statement of
1-69
1 .5 Rules of Inference
69
the form -'(p v q ) by the two statements -'p and -.q because De Morgan's law tells us that -'(p v q ) == -'p 1\ -'q . We can also replace a conditional statement p -+ q with the equivalent disjunction -'p v q . EXAMPLE 9
Show that the hypotheses (p
1\
q ) V r and r
-+ s
imply the conclusion p
v s.
Solution: We can rewrite the hypothesis (p 1\ q) V r as two clauses, p V r and q v r. We can also replace r -+ s by the equivalent clause -.r v s . Using the two clauses p V r and -'r v s , .... we can use resolution to conclude p v s .
Fallacies
unkS �
Several common fallacies arise in incorrect arguments. These fallacies resemble rules of infer ence but are based on contingencies rather than tautologies. These are discussed here to show the distinction between correct and incorrect reasoning. The proposition [(p -+ q) 1\ q] -+ p is not a tautology, because it is false when p is false and q is true. However, there are many incorrect arguments that treat this as a tautology. In other words, they treat the argument with premises p -+ q and q and conclusion p as a valid argument form, which it is not. This type of incorrect reasoning is called the fallacy of affirming the conclusion.
EXAMPLE 10
Is the following argument valid? If you do every problem in this book, then you will learn discrete mathematics. You learned discrete mathematics. Therefore, you did every problem in this book.
Solution: Let p be the proposition "You did every problem in this book." Let q be the proposition "You learned discrete mathematics." Then this argument is of the form: if p -+ q and q, then p. This is an example of an incorrect argument using the fallacy of affirming the conclusion. Indeed, it is possible for you to learn discrete mathematics in some way other than by doing every problem in this book. (You may learn discrete mathematics by reading, listening to lectures, .... doing some, but not all, the problems in this book, and so on.) The proposition [(p -+ q ) 1\ -'p] -+ -.q is not a tautology, because it is false when p is false and q is true. Many incorrect arguments use this incorrectly as a rule of inference. This type of incorrect reasoning is called the fallacy of denying the hypothesis. EXAMPLE 11
Let p and q be as in Example 1 0. If the conditional statement p -+ q is true, and -'p is true, is it correct to conclude that -.q is true? In other words, is it correct to assume that you did not learn discrete mathematics if you did not do every problem in the book, assuming that if you do every problem in this book, then you will learn discrete mathematics?
Solution: It is possible that you learned discrete mathematics even if you did not do every problem in this book. This incorrect argument is of the form p -+ q and -'p imply -'q , which .... is an example of the fallacy of denying the hypothesis.
70
1 / The Foundations: Logic and Proofs
1 - 70
Rules of Inference for Quantified Statements We have discussed rules of inference for propositions. We will now describe some important rules of inference for statements involving quantifiers. These rules of inference are used extensively in mathematical arguments, often without being explicitly mentioned. Universal instantiation is the rule of inference used to conclude that P (c) is true, where c is a particular member of the domain, given the premise Vx P (x). Universal instantiation is used when we conclude from the statement "All women are wise" that "Lisa is wise," where Lisa is a member of the domain of all women. Universal generalization is the rule of inference that states that Vx P(x) is true, given the premise that P (c) is true for all elements c in the domain. Universal generalization is used when we show that Vx P (x) is true by taking an arbitrary element c from the domain and showing that P (c) is true. The element c that we select must be an arbitrary, and not a specific, element of the domain. That is, when we assert from Vx P (x) the existence of an element c in the domain, we have no control over c and cannot make any other assumptions about c other than it comes from the domain. Universal generalization is used implicitly in many proofs in mathematics and is seldom mentioned explicitly. However, the error of adding unwarranted assumptions about the arbitrary element c when universal generalization is used is all too common in incorrect reasoning. Existential instantiation is the rule that allows us to conclude that there is an element c in the domain for which P (c) is true if we know that 3x P (x) is true. We cannot select an arbitrary value of c here, but rather it must be a c for which P(c) is true. Usually we have no knowledge of what c is, only that it exists. Because it exists, we may give it a name (c) and continue our argument. Existential generalization is the rule of inference that is used to conclude that 3x P(x) is true when a particular element c with P (c) true is known. That is, if we know one element c in the domain for which P (c) is true, then we know that 3x P (x) is true. We summarize these rules of inference in Table 2. We will illustrate how one of these rules of inference for quantified statements is used in Example 1 2.
EXAMPLE 12
Show that the premises "Everyone in this discrete mathematics class has taken a course in computer science" and "Marla is a student in this class" imply the conclusion "Marla has taken a course in computer science."
TABLE 2 Rules of Inference for Quantified Statements. Rule ofInference
Name
Vx P(x) :. P(c)
Universal instantiation
P(c) for an arbitrary c :. Vx P(x)
Universal generalization
3x P(x) :. P(c) for some element c
Existential instantiation
P(c) for some element c :. 3x P(x)
Existential generalization
1 - 71
l .5 Rules of lnference
71
Solution: Let D(x) denote "x is in this discrete mathematics class," and let C (x) denote "x has taken a course in computer science." Then the premises are Vx(D(x) ---+ C (x» and D(MarIa). The conclusion is C (Marla). The following steps can be used to establish the conclusion from the premises.
EXAMPLE 13
Step
Reason
1. 2. 3. 4.
Premise Universal instantiation from ( 1 ) Premise Modus ponens from (2) and (3)
Vx(D(x) ---+ C (x» D(Marla) ---+ C (Marla) D(MarIa) C(MarIa)
Show that the premises "A student in this class has not read the book," and "Everyone in this class passed the first exam" imply the conclusion "Someone who passed the first exam has not read the book."
Solution: Let C (x) be "x is in this class," B (x) be "x has read the book," and P (x) be "x passed the first exam." The premises are 3x (C (x ) 1\ -.B (x» and Vx (C (x) ---+ P (x» . The conclusion is 3x (P(x) 1\ -.B(x» . These steps can be used to establish the conclusion from the premises. Step
Reason
1. 2. 3. 4. 5. 6. 7.
Premise Existential instantiation from ( l ) Simplification from (2) Premise Universal instantiation from (4) Modus ponens from (3) and (5) Simplification from (2) Conjunction from (6) and (7) Existential generalization from (8)
3x (C(x) 1\ -.B(x» C(a) 1\ -.B(a) C(a) Vx(C (x) ---+ P (x» C (a) ---+ pea) pea) -.B(a) 8. p ea) 1\ -.B(a) 9 . 3x(P(x) 1\ -.B(x»
Combining Rules of Inference for Propositions and Quantified Statements We have developed rules of inference both for propositions and for quantified statements. Note that in our arguments in Examples 1 2 and 1 3 we used both universal instantiation, a rule of inference for quantified statements, and modus ponens, a rule of inference for propositional logic. We will often need to use this combination of rules of inference. Because universal instantiation and modus ponens are used so often together, this combination of rules is sometimes called universal modus ponens. This rule tells us that ifVx(P(x ) ---+ Q (x» is true, and if P (a) is true for a particular element a in the domain of the universal quantifier, then Q (a) must also be true. To see this, note that by universal instantiation, p ea) ---+ Q (a) is true. Then, by modus ponens, Q (a) must also be true. We can describe universal modus ponens as follows: Vx(P(x) ---+ Q (x» P (a), where a is a particular element in the domain : . Q(a) Universal modus ponens is commonly used in mathematical arguments. This is illustrated in Example 14. EXAMPLE 14
Assume that "For all positive integers n, if n is greater than 4, then n 2 is less than 2n " is true. Use universal modus ponens to show that 1 002 < 2 100 •
72
I I
The Foundations: Logic and Proofs
1 - 72
Solution: Let P (n) denote "n > 4" and Q (n) denote "n 2 < 2n ." The statement "For all positive integers n , ifn is greater than 4, then n 2 is less than 2n " can be represented by Vn(P(n) --+ Q(n» , where the domain consists of all positive integers. We are assuming that Vn(P(n) --+ Q(n» is true. Note that P ( l 00) is true because 1 00 > 4. It follows by universal modus ponens that Q(n) � is true, namely that 1 002 < 2 100 • Another useful combination of a rule of inference from propositional logic and a rule of inference for quantified statements is universal modus tollens. Universal modus toll ens combines universal instantiation and modus tollens and can be expressed in the following way: Vx(P(x) --+ Q (x» -' Q (a ) , where a is a particular element in the domain : . -, P (a )
We leave the verification of universal modus tollens to the reader (see Exercise 25). Exercise 26 develops additional combinations of rules of inference in propositional logic and quantified statements.
Exercises 1 . Find the argument form for the following argument and
determine whether it is valid. Can we conclude that the conclusion is true if the premises are true? If Socrates is human, then Socrates is mortal. Socrates is human. . . . Socrates is mortal. 2. Find the argument form for the following argument and
determine whether it is valid. Can we conclude that the conclusion is true if the premises are true? If George does not have eight legs, then he is not an insect. George is an insect. . ' . George has eight legs. 3. What rule of inference is used in each of these
arguments?
a) Alice is a mathematics major. Therefore, Alice is ei ther a mathematics major or a computer science major.
b) Jerry is a mathematics major and a computer science
major. Therefore, Jerry is a mathematics major. c) If it is rainy, then the pool will be closed. It is rainy. Therefore, the pool is closed. If it snows today, the university will close. The uni versity is not closed today. Therefore, it did not snow today. e) If ! go swimming, then I will stay in the sun too long. If! stay in the sun too long, then I will sunburn. There fore, if I go swimming, then I will sunburn. 4. What rule of inference is used in each ofthese arguments?
d)
a) Kangaroos live in Australia and are marsupials. There fore, kangaroos are marsupials.
b) It is either hotter than 1 00 degrees today or the pollu
tion is dangerous. It is less than 1 00 degrees outside today. Therefore, the pollution is dangerous. c) Linda is an excellent swimmer. If Linda is an excellent swimmer, then she can work as a lifeguard. Therefore, Linda can work as a lifeguard. Steve will work at a computer company this summer. Therefore, this summer Steve will work at a computer company or he will be a beach bum. e) If! work all night on this homework, then I can answer all the exercises. If! answer all the exercises, I will un derstand the material. Therefore, if I work all night on this homework, then I will understand the material. 5. Use rules ofinference to show that the hypotheses "Randy works hard," "If Randy works hard, then he is a dull boy," and "If Randy is a dull boy, then he will not get the job" imply the conclusion "Randy will not get the job." 6. Use rules of inference to show that the hypotheses "If it does not rain or if it is not foggy, then the sailing race will be held and the lifesaving demonstration will go on," "If the sailing race is held, then the trophy will be awarded," and "The trophy was not awarded" imply the conclusion "It rained." 7. What rules of inference are used in this famous argu ment? "All men are mortal. Socrates is a man. Therefore, Socrates is mortal." 8. What rules of inference are used in this argument? "No man is an island. Manhattan is an island. Therefore, Manhattan is not a man."
d)
1 .5 Rules of Inference 73
1 - 73
9. For each of these sets of premises, what relevant conclu sion or conclusions can be drawn? Explain the rules of in ference used to obtain each conclusion from the premises. a) "If I take the day off, it either rains or snows." "I took Tuesday off or I took Thursday off." "It was sunny on Tuesday." "It did not snow on Thursday." b) "If ! eat spicy foods, then I have strange dreams." "I have strange dreams if there is thunder while I sleep." "I did not have strange dreams." c) "I am either clever or lucky." "I am not lucky." "If I am lucky, then I will win the lottery." d) "Every computer science major has a personal com puter." "Ralph does not have a personal computer." "Ann has a personal computer." "What is good for corporations is good for the United States." "What is good for the United States is good for you." "What is good for corporations is for you to buy lots of stuff." t) "All rodents gnaw their food." "Mice are rodents." "Rabbits do not gnaw their food." "Bats are not ro dents."
e)
10. For each of these sets of premises, what relevant conclu sion or conclusions can be drawn? Explain the rules of in ference used to obtain each conclusion from the premises.
a)
"If! play hockey, then I am sore the next day." "I use the whirlpool if I am sore." "I did not use the whirlpool." b) "If! work, it is either sunny or partly sunny." "I worked last Monday or I worked last Friday." "It was not sunny on Tuesday." "It was not partly sunny on Friday." c) "All insects have six legs." "Dragonflies are insects." "Spiders do not have six legs." "Spiders eat dragon flies." "Every student has an Internet account." "Homer does not have an Internet account." "Maggie has an Internet account." "All foods that are healthy to eat do not taste good." "Tofu is healthy to eat." "You only eat what tastes good." "You do not eat tofu." "Cheeseburgers are not healthy to eat." t) "I am either dreaming or hallucinating." "I am not dreaming." "If I am hallucinating, I see elephants run ning down the road." 1 1 . Show that the argument form with premises P I , P2 , . . . , Pn and conclusion q -+ r is valid ifthe argument form with premises P I , P2 , . . . , Pn , q , and conclusion r is valid. 12. Show that the argument form with premises /\ t) -+ (r v s), q -+ (u /\ t), u -+ and --.s and conclusion q -+ r is valid by first using Exercise 1 1 and then using rules of inference from Table 1 . 13. For each of these arguments, explain which rules of infer ence are used for each step.
d) e)
p,
a)
(p
"Doug, a student in this class, knows how to write programs in JAVA. Everyone who knows how to write programs in JAVA can get a high-paying job. There fore, someone in this class can get a high-paying job."
b) "Somebody in this class enjoys whale watching. Every person who enjoys whale watching cares about ocean pollution. Therefore, there is a person in this class who cares about ocean pollution." c) "Each of the 93 students in this class owns a personal computer. Everyone who owns a personal computer can use a word processing program. Therefore, Zeke, a student in this class, can use a word processing pro gram." d) "Everyone in New Jersey lives within 50 miles of the ocean. Someone in New Jersey has never seen the ocean. Therefore, someone who lives within 50 miles of the ocean has never seen the ocean." 14. For each of these arguments, explain which rules of infer ence are used for each step. a) "Linda, a student in this class, owns a red convertible. Everyone who owns a red convertible has gotten at least one speeding ticket. Therefore, someone in this class has gotten a speeding ticket." b) "Each of five roommates, Melissa, Aaron, Ralph, Ve neesha, and Keeshawn, has taken a course in discrete mathematics. Every student who has taken a course in discrete mathematics can take a course in algorithms. Therefore, all five roommates can take a course in al gorithms next year." c) "All movies produced by John Sayles are wonder ful. John Sayles produced a movie about coal min ers. Therefore, there is a wonderful movie about coal miners." d) "There is someone in this class who has been to France. Everyone who goes to France visits the Lou vre. Therefore, someone in this class has visited the Louvre." 1 5. For each of these arguments determine whether the argu ment is correct or incorrect and explain why.
a)
All students in this class understand logic. Xavier is a student in this class. Therefore, Xavier understands logic. b) Every computer science maj or takes discrete mathe matics. Natasha is taking discrete mathematics. There fore, Natasha is a computer science major. c) All parrots like fruit. My pet bird is not a parrot. There fore, my pet bird does not like fruit. d) Everyone who eats granola every day is healthy. Linda is not healthy. Therefore, Linda does not eat granola every day. 16. For each of these arguments determine whether the argu ment is correct or incorrect and explain why.
a)
Everyone enrolled in the university has lived in a dor mitory. Mia has never lived in a dormitory. Therefore, Mia is not enrolled in the university. b) A convertible car is fun to drive. Isaac's car is not a convertible. Therefore, Isaac's car is not fun to drive. c) Quincy likes all action movies. Quincy likes the movie Eight Men Out. Therefore, Eight Men Out is an action movie.
1 - 74
1 / The Foundations: Logic and Proofs
74
d)
All iobstennen set at least a dozen traps. Hamilton is a lobstennan. Therefore, Hamilton sets at least a dozen traps. 1 7. What is wrong with this argument? Let H(x ) be "x is happy." Given the premise 3x H (x), we conclude that H(Lola). Therefore, Lola is happy.
y)
18. What is wrong with this argument? Let S(x , be "x is shorter than Given the premise 3s S(s , Max), it follows that S(Max, Max). Then by existential generalization it follows that 3x S(x , x), so that someone is shorter than himself.
y ."
19. Detennine whether each of these arguments is valid. Ifan argument is correct, what rule of inference is being used? If it is not, what logical error occurs? a) If n is a real number such that n > 1 , then n 2 > 1 . Suppose that n 2 > 1 . Then n > 1 . b) Ifn is a real number with n > 3 , then n 2 > 9 . Suppose that n 2 :s 9. Then n :s 3 . c) Ifn is a real number with n > 2, then n 2 > 4 . Suppose that n :s 2. Then n 2 :s 4. 20. Detennine whether these are valid arguments. a) If x is a positive real number, then x 2 is a positive real number. Therefore, if a 2 is positive, where a is a real number, then a is a positive real number. b) If x 2 =1= 0, where x is a real number, then x =1= O. Let a be a real number with a 2 =1= 0; then a =1= O. 21. Which rules of inference are used to establish the conclusion of Lewis Carroll's argument described in Example 26 of Section 1 .3 ? 2 2 . Which rules of inference are used t o establish the conclu sion of Lewis Carroll's argument described in Example of Section 1 .3 ?
27
23. Identify the error o r errors i n this argument that sup posedly shows that if 3x P (x ) 1\ 3x Q (x ) is true then 3x( P (x) 1\ Q(x» is true. 3x P (x ) 1\ 3x Q(x) Premise Simplification from ( 1 ) 3x P (x ) Existential instantiation from (2) P ee) 3x Q (x ) Simplification from ( 1 ) 5 . Q (e) Existential instantiation from (4) 6. P ee) 1\ Q (e) Conjunction from (3) and (5) Existential generalization 3x(P(x) 1\ Q (x » 24. Identify the error or errors in this argument that sup posedly shows that if Vx ( P (x ) v Q (x » is true then Vx P(x) v Vx Q (x ) is true. Premise 1 . Vx (P(x ) v Q (x» Universal instantiation from ( 1 ) 2. P ee) v Q (e) Simplification from (2) 3 . Pee) Universal generalization from (3) 4. Vx P (x ) 5. Q(e) Simplification from (2) 6. Vx Q(x) Universal generalization from (5) Vx(P(x) v Vx Q (x » Conjunction from (4) and (6) 25. Justify the rule of universal modus tollens by showing that the premises Vx( P (x) � Q(x» and --. Q (a) for a partic ular element a in the domain, imply --. P (a). 1. 2. 3. 4.
7.
7.
26. Justify the rule of universal transitivity, which states that ifVx ( P (x) � Q(x » and Vx(Q(x) � R (x » are true, then Vx(P(x ) � R (x » is true, where the domains of all quantifiers are the same. 27. Use rules of inference to show that ifVx(P(x) � (Q(x) 1\ Sex»�) and Vx(P(x ) 1\ R(x» are true, then Vx(R(x) 1\ Sex»� is true. 28. Use rules of inference to show that if Vx(P(x) v Q (x » and Vx« --. P (x) 1\ Q(x» � R (x» are true, then Vx(--.R(x) � P (x» is also true, where the domains of all quantifiers are the same. 29. Use rules of inference to show that if Vx(P(x) v Q(x» , Vx(--. Q (x ) v Sex»�, Vx(R(x) � --.S(x» , and 3x --. P (x) are true, then 3x --.R(x) is true. 30. Use resolution to show the hypotheses "Allen is a bad boy or Hillary is a good girl" and "Allen is a good boy or David is happy" imply the conclusion "Hillary is a good girl or David is happy." 3 1 . Use resolution to show that the hypotheses "It is not rain ing or Yvette has her umbrella," "Yvette does not have her umbrella or she does not get wet," and "It is raining or Yvette does not get wet" imply that "Yvette does not get wet." 32. Show that the equivalence p 1\ --'p == F can be derived using resolution together with the fact that a condi tional statement with a false hypothesis is true. [Hint: Let q = r = F in resolution.] 33. Use resolution to show that the compound proposition (p v q) 1\ (--'P V q ) 1\ (p V --.q ) 1\ (--'P V --'q ) is not sat isfiable. *34. The Logic Problem, taken from WFF 'N PROOF, The Game ofLogic, has these two assumptions: 1. "Logic is difficult or not many students like logic." 2. "If mathematics is easy, then logic is not difficult." By translating these assumptions into statements involv ing propositional variables and logical connectives, deter mine whether each of the following are valid conclusions of these assumptions: a) That mathematics is not easy, if many students like logic. b) That not many students like logic, if mathematics is not easy. c) That mathematics is not easy or logic is difficult. That logic is not difficult or mathematics is not easy. e) That if not many students like logic, then either math ematics is not easy or logic is not difficult. *35. Detennine whether this argument, taken from Kalish and Montague [KaM064] , is valid.
d)
If Superman were able and willing to prevent evil, he would do so. If Supennan were unable to prevent evil, he would be impotent; if he were unwilling to prevent evil, he would be malevolent. Supennan does not prevent evil. If Supennan exists, he is nei ther impotent nor malevolent. Therefore, Supennan does not exist.
1 .6 Introduction to Proofs
1 - 75
75
1.6 Introduction to Proofs Introduction In this section we introduce the notion of a proof and describe methods for constructing proofs. A proof is a valid argument that establishes the truth of a mathematical statement. A proof can use the hypotheses of the theorem, if any, axioms assumed to be true, and previously proven theorems. Using these ingredients and rules of inference, the final step of the proof establishes the truth of the statement being proved. In our discussion we move from formal proofs of theorems toward more informal proofs. The arguments we introduced in Section 1 .5 to show that statements involving propositions and quantified statements are true were formal proofs, where all steps were supplied, and the rules for each step in the argument were given. However, formal proofs of useful theorems can be extremely long and hard to follow. In practice, the proofs of theorems designed for human consumption are almost always informal proofs, where more than one rule of inference may be used in each step, where steps may be skipped, where the axioms being assumed and the rules of inference used are not explicitly stated. Informal proofs can often explain to humans why theorems are true, while computers are perfectly happy producing formal proofs using automated reasoning systems. The methods of proof discussed in this chapter are important not only because they are used to prove mathematical theorems, but also for their many applications to computer science. These applications include verifying that computer programs are correct, establishing that operating systems are secure, making inferences in artificial intelligence, showing that system specifica tions are consistent, and so on. Consequently, understanding the techniques used in proofs is essential both in mathematics and in computer science.
Some Terminology
UDkS it:'l
Formally, a theorem is a statement that can be shown to be true. In mathematical writing, the term theorem is usually reserved for a statement that is considered at least somewhat important. Less important theorems sometimes are called propositions. (Theorems can also be referred to as facts or results.) A theorem may be the universal quantification of a conditional statement with one or more premises and a conclusion. However, it may be some other type of logical statement, as the examples later in this chapter will show. We demonstrate that a theorem is true with a proof. A proof is a valid argument that establishes the truth of a theorem. The statements used in a proof can include axioms (or postulates), which are statements we assume to be true (for example, see Appendix 1 for axioms for the real numbers), the premises, if any, of the theorem, and previously proven theorems. Axioms may be stated using primitive terms that do not require definition, but all other terms used in theorems and their proofs must be defined. Rules of inference, together with definitions of terms, are used to draw conclusions from other assertions, tying together the steps of a proof. In practice, the final step of a proof is usually just the conclusion of the theorem. However, for clarity, we will often recap the statement ofthe theorem as the final step of a proof. A less important theorem that is helpful in the proof of other results is called a lemma (plural lemmas or lemmata). Complicated proofs are usually easier to understand when they are proved using a series of lemmas, where each lemma is proved individually. A corollary is a theorem that can be established directly from a theorem that has been proved. A conjecture is a statement that is being proposed to be a true statement, usually on the basis of some partial evidence, a heuristic argument, or the intuition of an expert. When a proof of a conjecture is found, the conjecture becomes a theorem. Many times conjectures are shown to be false, so they are not theorems.
76
1 - 76
1 / The Foundations: Logic and Proofs
Understanding How Theorems Are Stated Extra � Examples �
Before we introduce methods for proving theorems, we need to understand how many math ematical theorems are stated. Many theorems assert that a property holds for all elements in a domain, such as the integers or the real numbers. Although the precise statement of such theorems needs to include a universal quantifier, the standard convention in mathematics is to omit it. For example, the statement "If x
>
y,
where x and y are positive real numbers, then x 2
>
y2 ."
really means "For all positive real numbers x and y, if x
>
y, then x 2
>
y2 ."
Furthermore, when theorems of this type are proved, the law of universal instantiation is often used without explicit mention. The first step of the proof usually involves selecting a general element of the domain. Subsequent steps show that this element has the property in question. Finally, universal generalization implies that the theorem holds for all members of the domain.
Methods of Proving Theorems
Assessmenl
iC
We now tum our attention to proofs of mathematical theorems. Proving theorems can be difficult. We need all the ammunition that is available to help us prove different results. We now introduce a battery of different proof methods. These methods should become part of your repertoire for proving theorems. To prove a theorem ofthe form Vx (P(x) --+ Q(x» , our goal is to show that P(c) --+ Q(c) is true, where c is an arbitrary element of the domain, and then apply universal generalization. In this proof, we need to show that a conditional statement is true. Because of this, we now focus on methods that show that conditional statements are true. Recall that p --+ q is true unless p is true but q is false. Note that when the statement p --+ q is proved, it need only be shown that q is true if p is true. The following discussion will give the most common techniques for proving conditional statements. Later we will discuss methods for proving other types of statements. In this section, and in Section 1 .7, we will develop an arsenal of many different proof techniques that can be used to prove a wide variety of theorems. When you read proofs, you will often find the words "obviously" or "clearly." These words indicate that steps have been omitted that the author expects the reader to be able to fill in. Unfortunately, this assumption is often not warranted and readers are not at all sure how to fill in the gaps. We will assiduously try to avoid using these words and try not to omit too many steps. However, if we included all steps in proofs, our proofs would often be excruciatingly long.
Direct Proofs A direct proof of a conditional statement p --+ q is constructed when the first step is the assumption that p is true; subsequent steps are constructed using rules of inference, with the final step showing that q must also be true. A direct proof shows that a conditional statement p --+ q is true by showing that if p is true, then q must also be true, so that the combination p true and q false never occurs. In a direct proof, we assume that p is true and use axioms, definitions, and previously proven theorems, together with rules of inference, to show that q must also be true. You will find that direct proofs of many results are quite straightforward, with a fairly obvious sequence of steps leading from the hypothesis to the conclusion. However, direct
1-77
1 .6 Introduction to Proofs
77
proofs sometimes require particular insights and can be quite tricky. The first direct proofs we present here are quite straightforward; later in the text you will see some that are less obvious. We will provide examples of several different direct proofs. Before we give the first example, we need a definition.
DEFINITION 1
EXAMPLE 1
Extra � Examples �
EXAMPLE 2
The integer n is even ifthere exists an integer k such that n = 2k, and n is odd ifthere exists an integer k such that n = 2k + 1 . (Note that an integer is either even or odd, and no integer is both even and odd.)
Give a direct proof of the theorem "If n is an odd integer, then n 2 is odd."
Solution: Note that this theorem states Vn P« n) -+ Q(n)), where P(n) is "n is an odd integer" and Q(n) is "n 2 is odd." As we have said, we will follow the usual convention in mathematical proofs by showing that P (n) implies Q(n), and not explicitly using universal instantiation. To begin a direct proof of this theorem, we assume that the hypothesis of this conditional statement is true, namely, we assume that n is odd. By the definition of an odd integer, it follows that n = 2k + I , where k is some integer. We want to show that n 2 is also odd. We can square both sides of the equation n 2k + I to obtain a new equation that expresses n 2 • When we do this, we find that n 2 = (2k + 1 ) 2 = 4k2 + 4k + I = 2(2k2 + 2k) + I . By the definition of an odd integer, we can conclude that n 2 is an odd integer (it is one more than twice an integer). ... Consequently, we have proved that if n is an odd integer, then n 2 is an odd integer.
=
Give a direct proof that if m and n are both perfect squares, then nm is also a perfect square. (An integer a is a perfect square if there is an integer b such that a = b2 . )
Solution: To produce a direct proof of this theorem, we assume that the hypothesis of this conditional statement is true, namely, we assume that m and n are both perfect squares. By the definition of a perfect square, it follows that there are integers s and t such that m s 2 and n = t 2 . The goal of the proof is to show that mn must also be a perfect square when m and n are; looking ahead we see how we can show this by multiplying the two equations m = s 2 and n = t 2 together. This shows that mn = s 2 t 2 , which implies that mn = (s t)2 (using commutativity and associativity of multiplication). By the definition of perfect square, it follows that mn is also a perfect square, because it is the square of s t, which is an integer. We have proved that if m and ... n are both perfect squares, then mn is also a perfect square.
=
Proof by Contraposition Direct proofs lead from the hypothesis of a theorem to the conclusion. They begin with the premises, continue with a sequence of deductions, and end with the conclusion. However, we will see that attempts at direct proofs often reach dead ends. We need other methods of proving theo rems ofthe form Vx (P (x ) -+ Q (x )) . Proofs oftheorems ofthis type that are not direct proofs, that is, that do not start with the hypothesis and end with the conclusion, are called indirect proofs. An extremely useful type of indirect proof is known as proof by contraposition. Proofs by contraposition make use of the fact that the conditional statement p -+ q is equivalent to its contrapositive, -.q -+ -'p. This means that the conditional statement p -+ q can be proved by showing that its contrapositive, -.q -+ -'p, is true. In a proof by contraposition of p -+ q , we take -.q as a hypothesis, and using axioms, definitions, and previously proven theorems, together with rules of inference, we show that -'p must follow. We will illustrate proof by contraposition
78
1 - 78
1 / The Foundations: Logic and Proofs
with two examples. These examples show that proof by contraposition can succeed when we cannot easily find a direct proof. EXAMPLE 3
Exam: C
Prove that if n is an integer and 3 n + 2 is odd, then n is odd.
Solution: We first attempt a direct proof. To construct a direct proof, we first assume that 3 n + 2 is an odd integer. This means that 3 n + 2 2k + 1 for some integer k. Can we use this fact to show that n is odd? We see that 3 n + 1 2k, but there does not seem to be any direct way to conclude that n is odd. Because our attempt at a direct proof failed, we next try a proof by contraposition. The first step in a proof by contraposition is to assume that the conclusion of the conditional statement "If 3 n + 2 is odd, then n is odd" is false; namely, assume that n is even. Then, by the definition of an even integer, n 2k for some integer k. Substituting 2k for n , we find that 3 n + 2 3(2k) + 2 6k + 2 2(3k + 1). This tells us that 3 n + 2 is even (because it is a multiple of 2), and therefore not odd. This is the negation of the hypothesis of the theorem. Because the negation of the conclusion of the conditional statement implies that the hypothesis is false, the original conditional statement is true. Our proof by contraposition succeeded; we .... have proved the theorem "If 3 n + 2 is odd, then n is odd."
= =
=
EXAMPLE 4
Prove that if n
=
=
=
= ab, where a and b are positive integers, then a :::: .jfi or b :::: .jfi.
Solution: Because there is no obvious way of showing that a :::: .jfi or b :::: .jfi directly from the equation n ab, where a and b are positive integers, we attempt a proof by contraposition. The first step in a proofby contraposition is to assume that the conclusion of the conditional statement "If n ab, where a and b are positive integers, then a :::: .jfi or b :::: .jfi" is false. That is, we assume that the statement (a :::: .jfi) V (b :::: .jfi) is false. Using the meaning of disjunction together with De Morgan's law, we see that this implies that both a :::: .jfi and b :::: .jfi are false. This implies that a > .jfi and b > .jfi. We can multiply these inequalities together (using the fact that if 0 < t and 0 < v, then su < tv) to obtain ab > .jfi . .jfi n . This shows that ab =I- n , which contradicts the statement n abo Because the negation of the conclusion of the conditional statement implies that the hy pothesis is false, the original conditional statement is true. Our proof by contraposition suc ceeded; we have proved that if n ab, where a and b are positive integers, then a :::: .jfi or .... b :::: .jfi.
=
=
=
=
=
VACUOUS AND TRIVIAL PROOFS We can quickly prove that a conditional statement p -+ q is true when we know that p is false, because p -+ q must be true when p is false. Consequently, if we can show that p is false, then we have a proof, called a vacuous proof, of the conditional statement p -+ q . Vacuous proofs are often used to establish special cases of theorems that state that a conditional statement is true for all positive integers [i.e., a theorem of the kind Vn P (n), where p en) is a propositional function] . Proof techniques for theorems of this kind will be discussed in Section 4. 1 . EXAMPLE 5
Show that the proposition P (O) is true, where p en) is "If n consists of all integers.
>
1 , then n 2
>
n" and the domain
Solution: Note that P (O) is "If 0 > 1 , then 02 > 0." We can show P(O) using a vacuous proof, .... because the hypothesis 0 > 1 is false. This tells us that P (O) is automatically true.
1 - 79
1 .6 Introduction to Proofs
79
Remark: The fact that the conclusion of this conditional statement, 02
> 0, is false is irrelevant to the truth value of the conditional statement, because a conditional statement with a false hypothesis is guaranteed to be true.
We can also quickly prove a conditional statement p --+ q if we know that the conclusion q is true. By showing that q is true, it follows that p --+ q must also be true. A proof of p --+ q that uses the fact that q is true is called a trivial proof. Trivial proofs are often important when special cases of theorems are proved (see the discussion of proof by cases in Section 1 .7) and in mathematical induction, which is a proof technique discussed in Section 4. 1 .
EXAMPLE 6
Let P(n) be "If a and b are positive integers with a ::: b, then a n ::: b n ," where the domain consists of all integers. Show that P (O) is true.
Solution: The proposition P(O) is "If a ::: b, then a o ::: b O ." Because a o = b O = I , the conclusion of the conditional statement "If a ::: b, then a O ::: b O " is true. Hence, this conditional statement, which is P (O), is true. This is an example of a trivial proof. Note that the hypothesis, which is .... the statement "a ::: b," was not needed in this proof. A LITILE PROOF STRATEGY We have described two important approaches for proving theorems of the form Vx(P(x) --+ Q(x» : direct proof and proofby contraposition. We have also given examples that show how each is used. However, when you are presented with a theorem of the form Vx(P(x) --+ Q(x» , which method should you use to attempt to prove it? We will provide a few rules of thumb here; in Section 1 .7 we will discuss proof strategy at greater length. When you want to prove a statement of the form Vx(P(x) --+ Q(x», first evaluate whether a direct proof looks promising. Begin by expanding the definitions in the hypotheses. Start to reason using these hypotheses, together with axioms and available theorems. If a direct proof does not seem to go anywhere, try the same thing with a proof by contraposition. Recall that in a proof by contraposition you assume that the conclusion of the conditional statement is false and use a direct proof to show this implies that the hypothesis must be false. We illustrate this strategy in Examples 7 and 8. Before we present our next example, we need a definition.
DEFINITION 2
EXAMPLE 7
The real number r is rational if there exist integers p and q with q =f:. 0 such that r = p / q. A real number that is not rational is called irrational.
Prove that the sum of two rational numbers is rational. (Note that if we include the implicit quantifiers here, the theorem we want to prove is "For every real number r and every real number s , if r and s are rational numbers, then r + s is rational.)
Solution: We first attempt a direct proof. To begin, suppose that r and s are rational numbers. From the definition of a rational number, it follows that there are integers p and q , with q =f:. 0, such that r = p / q , and integers t and u, with u =f:. 0, such that s = t / u . Can we use this information to show that r + s is rational? The obvious next step is to add r = p /q and s = t /u, to obtain P t pu + q t r + s = - + - = =----=-u qu q Because q =f:. 0 and u =f:. 0 , it follows that q u =f:. O . Consequently, we have expressed r + s as the ratio of two integers, pu + q t and qu, where qu =f:. O. This means that r + s is rational. We have proved that the sum of two rational numbers is rational; our attempt to find a direct proof .... succeeded.
80
I
/ The Foundations: Logic and Proofs
EXAMPLE 8
1 -80
Prove that if n is an integer and n 2 is odd, then n is odd.
Solution: We first attempt a direct proof. Suppose that n is an integer and n 2 is odd. Then, there exists an integer k such that n 2 2k + 1 . Can we use this information to show that n is odd? There seems to be no obvious approach to show that n is odd because solving for n produces the equation n = ±J2k + 1 , which is not terribly useful. Because this attempt to use a direct proof did not bear fruit, we next attempt a proof by contraposition. We take as our hypothesis the statement that n is not odd. Because every integer is odd or even, this means that n is even. This implies that there exists an integer k such that n 2k. To prove the theorem, we need to show that this hypothesis implies the conclusion that n 2 is not odd, that is, that n 2 is even. Can we use the equation n 2k to achieve this? By squaring both sides ofthis equation, we obtain n 2 = 4k2 = 2(2k2 ), which implies that n 2 is also even because n 2 2t , where t = 2k2 . We have proved that if n is an integer and n 2 is odd, then ... n is odd. Our attempt to find a proof by contraposition succeeded.
=
=
=
=
Proofs by Contradiction Suppose we want to prove that a statement p is true. Furthermore, suppose that we can find a contradiction q such that -'p -+ q is true. Because q is false, but -'p -+ q is true, we can conclude that -'p is false, which means that p is true. How can we find a contradiction q that might help us prove that p is true in this way? Because the statement r 1\ -'r is a contradiction whenever r is a proposition, we can prove that p is true if we can show that -'p -+ (r 1\ -.r ) is true for some proposition r . Proofs of this type are called proofs by contradiction. Because a proofby contradiction does not prove a result directly, it is another type of indirect proof. We provide three examples of proofby contradiction. The first is an example of an application of the pigeonhole principle, a combinatorial technique that we will cover in depth in Section 5.2. EXAMPLE 9
Show that at least four of any 22 days must fall on the same day of the week.
Solution: Let p be the proposition "At least four of 22 chosen days fall on the same day of the week." Suppose that -'p is true. This means that at most three of the 22 days fall on the same day of the week. Because there are seven days of the week, this implies that at most 2 1 days could have been chosen because for each of the days of the week, at most three of the chosen days could fall on that day. This contradicts the hypothesis that we have 22 days under consideration. That is, if r is the statement that 22 days are chosen, then we have shown that -'p -+ (r 1\ ....,r ) . Consequently, we know that p is true. We have proved that at least four of 22 chosen days fall ... on the same day of the week. EXAMPLE 10
Prove that V2 is irrational by giving a proof by contradiction.
Solution: Let p be the proposition " V2 is irrational." To start a proofby contradiction, we suppose that -'p is true. Note that ""'p is the statement "It is not the case that V2 is irrational," which says that V2 is rational. We will show that assuming that ""'p is true leads to a contradiction. If V2 is rational, there exist integers a and b with V2 alb, where a and b have no common factors (so that the fraction alb is in lowest terms.) (Here, we are using the fact that every rational number can be written in lowest terms.) Because V2 alb, when both sides of this equation are squared, it follows that
=
=
1-81
1 .6 Introduction to Proofs
81
Hence,
By the definition of an even integer it follows that a 2 is even. We next use the fact that if a 2 is even, a must also be even, which follows by Exercise 1 6. Furthermore, because a is even, by the definition of an even integer, a 2c for some integer c . Thus,
=
Dividing both sides of this equation by 2 gives
By the definition of even, this means that b 2 is even. Again using the fact that if the square of an integer is even, then the integer itself must be even, we conclude that b must be even as well. We have now shown that the assumption of ...... p leads to the equation ,J2 alb, where a and b have no common factors, but both a and b are even, that is, 2 divides both a and b. Note that the statement that ,J2 alb, where a and b have no common factors, means, in particular, that 2 does not divide both a and b. Because our assumption of """ p leads to the contradiction that 2 divides both a and b and 2 does not divide both a and b, """ p must be false. .... That is, the statement p, ",J2 is irrational," is true. We have proved that ,J2 is irrational.
=
=
Proofby contradiction can be used to prove conditional statements. In such proofs, we first assume the negation of the conclusion. We then use the premises of the theorem and the negation of the conclusion to arrive at a contradiction. (The reason that such proofs are valid rests on the logical equivalence of p -+ q and (p 1\ ......q ) -+ F. To see that these statements are equivalent, simply note that each is false in exactly one case, namely when p is true and q is false.) Note that we can rewrite a proof by contraposition of a conditional statement as a proof by contradiction. In a proof of p -+ q by contraposition, we assume that ...... q is true. We then show that """p must also be true. To rewrite a proof by contraposition of p -+ q as a proof by contradiction, we suppose that both p and ...... q are true. Then, we use the steps from the proof of ...... q -+ """ p to show that """ p is true. This leads to the contradiction p 1\ """ p , completing the proof. Example 1 1 illustrates how a proof by contraposition of a conditional statement can be rewritten as a proof by contradiction.
EXAMPLE 11
Give a proof by contradiction of the theorem "If 3n + 2 is odd, then n is odd."
Solution: Let p be "3n + 2 is odd" and q be "n is odd." To construct a proof by contradiction, assume that both p and ...... q are true. That is, assume that 3n + 2 is odd and that n is not odd. Because n is not odd, we know that it is even. Following the steps in the solution of Example 3 (a proofby contraposition), we can show that if n is even, then 3n + 2 is even. First, because n is even, there is an integer k such that n 2k. This implies that 3n + 2 3(2k) + 2 6k + 2 2(3k + 1). Because 3n + 2 is 2t, where t 3k + 1 , 3n + 2 is even. Note that the statement "3n + 2 is even" """p , because an integer is even if and only if it is not odd. Because both p and """ p are true, we have a contradiction. This completes the proof by contradiction, proving that if .... 3n + 2 is odd, then n is odd.
=
=
=
=
=
Note that we can also prove by contradiction that p -+ q is true by assuming that p and are true, and showing that q must be also be true. This implies that ......q and q are both true, a contradiction. This observation tells us that we can tum a direct proof into a proof by contradiction. ...... q
82
1 / The Foundations: Logic and Proofs
1 -82
PROOFS OF EQUIVALENCE To prove a theorem that is a biconditional statement, that is, a statement of the form P *+ q, we show that P -+ q and q -+ P are both true. The validity of this approach is based on the tautology (p *+ q ) *+ [(p -+ q ) 1\ (q -+ p)] .
EXAMPLE 12
Exam= �
Prove the theorem "If n is a positive integer, then n is odd if and only if n 2 is odd."
Solution: This theorem has the form "P if and only if q ," where P is "n is odd" and q is "n 2 is odd." (As usual, we do not explicitly deal with the universal quantification.) To prove this theorem, we need to show that P -+ q and q -+ P are true. We have already shown (in Example 1 ) that P -+ q is true and (in Example 8) that q -+ P is true. Because we have shown that both P -+ q and q -+ P are true, we have shown that the .... theorem is true. Sometimes a theorem states that several propositions are equivalent. Such a theorem states that propositions P I , P2 , P3 , . . . , Pn are equivalent. This can be written as
PI *+ P2 *+
.
.
.
*+ Pn ,
which states that all n propositions have the same truth values, and consequently, that for all i and j with 1 :s i :s n and 1 :s j :s n, Pi and Pj are equivalent. One way to prove these mutually equivalent is to use the tautology
This shows that if the conditional statements PI -+ P2 , P2 -+ P3 , . . . , Pn -+ P I can be shown to be true, then the propositions P I , P2 , . . . , Pn are all equivalent. This is much more efficient than proving that Pi -+ Pj for all i =f. j with 1 :s i :s n and 1 :s j :s n . When we prove that a group o f statements are equivalent, we can establish any chain of conditional statements we choose as long as it is possible to work through the chain to go from any one of the-se statements to any other statement. For example, we can show that P I , P2 , and P3 are equivalent by showing that P I -+ P3 , P3 -+ P2 , and P2 -+ P I .
EXAMPLE 13
Show that these statements about the integer n are equivalent:
PI : P2 : P3 :
n is even. n - 1 is odd. n 2 is even.
Solution: We will show that these three statements are equivalent by showing that the conditional statements PI -+ P2 , P2 -+ P3 , and P3 -+ P I are true. We use a direct proof to show that P I -+ P2 . Suppose that n is even. Then n 2k for some integer k. Consequently, n - 1 2k - 1 2(k - 1 ) + 1 . This means that n - 1 is odd because it is of the form 2m + 1 , where m is the integer k - 1 . We also use a direct proof to show that P2 -+ P3 . Now suppose n - 1 is odd. Then n 1 2k + 1 for some integer k. Hence, n 2k + 2 so that n 2 (2k + 2i 4k2 + 8k + 4 2(2k2 + 4k + 2). This means that n 2 is twice the integer 2k2 + 4k + 2, and hence is even. To prove P3 -+ PI , we use a proofby contraposition. That is, we prove that if n is not even, then n 2 is not even. This is the same as proving that if n is odd, then n 2 is odd, which we have .... already done in Example 1 . This completes the proof.
=
=
=
=
=
=
=
=
1 .6 Introduction to Proofs
1 -83
83
COUNTEREXAMPLES In Section 1 .3 we stated that to show that a statement of the form Vx P(x) is false, we need only find a counterexample, that is, an example x for which P (x) is false. When presented with a statement of the form Vx P (x), which we believe to be false or which has resisted all proof attempts, we look for a counterexample. We illustrate the use of counterexamples in Example 14.
EXAMPLE 14
Show that the statement "Every positive integer is the sum of the squares of two integers" is false.
Ex1ra � Examples IIiiiiiI
Solution: To show that this statement is false, we look for a counterexample, which is a par ticular integer that is not the sum of the squares of two integers. It does not take long to find a counterexample, because 3 cannot be written as the sum of the squares of two integers. To show this is the case, note that the only perfect squares not exceeding 3 are 0 2 = 0 and 1 2 = 1 . Furthermore, there is no way to get 3 as the sum of two terms each of which is 0 or I . Conse quently, we have shown that "Every positive integer is the sum of the squares of two integers" is .... false.
Mistakes in Proofs
unlls E:J EXAMPLE 15
There are many common errors made in constructing mathematical proofs. We will briefly describe some of these here. Among the most common errors are mistakes in arithmetic and basic algebra. Even professional mathematicians make such errors, especially when working with complicated formulae. Whenever you use such computations you should check them as carefully as possible. (You should also review any troublesome aspects of basic algebra, especially before you study Section 4. 1 .) Each step of a mathematical proof needs to be correct and the conclusion needs to follow logically from the steps that precede it. Many mistakes result from the introduction of steps that do not logically follow from those that precede it. This is illustrated in Examples 1 5-17. What i s wrong with this famous supposed "proof" that 1 = 2?
"Proof:" We use these steps, where a and b are two equal positive integers. Step
Reason
l. 2. 3. 4. 5. 6.
Given Multiply both sides of ( 1 ) by a Subtract b2 from both sides of (2) Factor both sides of (3) Divide both sides of (4) by a - b Replace a by b in (5) because a = b and simplify Divide both sides of (6) by b
a=b a 2 = ab a 2 - b 2 = ab - b2 (a - b)(a + b) = b(a - b) a+b=b 2b = b
7. 2 = 1
Solution: Every step is valid except for one, step 5 where we divided both sides by a - b. The error is that a - b equals zero; division of both sides of an equation by the same quantity is .... valid as long as this quantity is not zero. EXAMPLE 16
What is wrong with this "proof" ? "Theorem:" If n 2 is positive, then n is positive.
84
1 / The Foundations: Logic and Proofs
1 -84
"Proof:" Suppose that n 2 is positive. Because the conditional statement "If n is positive, then n 2 is positive" is true, we can conclude that n is positive.
Solution: Let P (n) be "n is positive" and Q(n) be "n 2 is positive." Then our hypothesis is Q(n). The statement "If n is positive, then n 2 is positive" is the statement "In(P(n) --+ Q(n)). From the hypothesis Q(n) and the statement "In(P(n) --+ Q(n)) we cannot conclude p en), because we are not using a valid rule of inference. Instead, this is an example of the fallacy of affirming the conclusion. A counterexample is supplied by n - 1 for which n 2 1 is positive, but n is � negative.
=
EXAMPLE 17
=
What is wrong with this "proof" ? "Theorem:" If n is not positive, then n 2 is not positive. (This is the contrapositive of the "theorem" in Example 1 6.)
"Proof:" Suppose that n is not positive. Because the conditional statement "If n is positive, then n 2 is positive" is true, we can conclude that n 2 is not positive.
Solution: Let P (n) and Q(n) be as in the solution of Example 1 6. Then our hypothesis is -,P(n) and the statement "Ifn is positive, then n 2 is positive" is the statement "In(P(n) --+ Q(n)). From the hypothesis -, P (n) and the statement "In(P(n) --+ Q(n)) we cannot conclude -'Q(n), because we are not using a valid rule of inference. Instead, this is an example of the fallacy of denying � the hypothesis. A counterexample is supplied by n - 1 , as in Example 1 6.
=
Finally, we briefly discuss a particularly nasty type of error. Many incorrect arguments are based on a fallacy called begging the question. This fallacy occurs when one or more steps of a proof are based on the truth of the statement being proved. In other words, this fallacy arises when a statement is proved using itself, or a statement equivalent to it. That is why this fallacy is also called circular reasoning.
EXAMPLE 18
Is the following argument correct? It supposedly shows that n is an even integer whenever n 2 is an even integer. Suppose that n 2 is even. Then n 2 This shows that n is even.
= 2k for some integer k. Let n = 21 for some integer I. =
Solution: This argument is incorrect. The statement "let n 21 for some integer I" occurs in the proof. No argument has been given to show that n can be written as 21 for some integer I. This is circular reasoning because this statement is equivalent to the statement being proved, namely, "n is even." Of course, the result itself is correct; only the method of proof is wrong. � Making mistakes in proofs is part of the learning process. When you make a mistake that someone else finds, you should carefully analyze where you went wrong and make sure that you do not make the same mistake again. Even professional mathematicians make mistakes in proofs. More than a few incorrect proofs of important results have fooled people for many years before subtle errors in them were found.
Just
a
Beginning
We have now developed a basic arsenal of proof methods. In the next section we will introduce other important proof methods. We will also introduce several important proof techniques in
/-85
1 .6 Introduction to Proofs
85
Chapter 4, including mathematical induction, which can be used to prove results that hold for all positive integers. In Chapter 5 we will introduce the notion of combinatorial proofs. In this section we introduced several methods for proving theorems of the form Vx(P(x) --+ Q(x )), including direct proofs and proofs by contraposition. There are many theorems of this type whose proofs are easy to construct by directly working through the hypotheses and definitions of the terms of the theorem. However, it is often difficult to prove a theorem without resorting to a clever use of a proof by contraposition or a proof by contradiction, or some other proof technique. In Section 1 .7 we will address proof strategy. We will describe various approaches that can be used to find proofs when straightforward approaches do not work. Constructing proofs is an art that can be learned only through experience, including writing proofs, having your proofs critiqued, and reading and analyzing other proofs.
Exercises 1. Use a direct proofto show that the sum of two odd integers is even. 2. Use a direct proof to show that the sum oftwo even inte gers is even. 3. Show that the square of an even number is an even number using a direct proof. 4. Show that the additive inverse, or negative, of an even number is an even number using a direct proof. 5. Prove that if m + n and n + p are even integers, where m , n , and p are integers, then m + p is even. What kind of proof did you use? 6. Use a direct proof to show that the product of two odd numbers is odd. 7. Use a direct proof to show that every odd integer is the difference of two squares. 8. Prove that if n is a perfect square, then n + 2 is not a perfect square. 9. Use a proof by contradiction to prove that the sum of an irrational number and a rational number is irrational. 10. Use a direct proof to show that the product of two rational numbers is rational. 1 1 . Prove or disprove that the product of two irrational num bers is irrational. 12. Prove or disprove that the product of a nonzero rational number and an irrational number is irrational. 13. Prove that if x is irrational, then 1 /x is irrational. 14. Prove that if x is rational and x =1= 0, then I /x is rational. 15. Use a proof by contraposition to show that if x + y :::: 2, where x and y are real numbers, then x :::: I or y :::: I . S' 16. Prove that if m and n are integers and m n is even, then m is even or n is even. 17. Show that if n is an integer and n 3 + 5 is odd, then n is even using
a) a proof by contraposition. b) a proof by contradiction. 18. Prove that if n is an integer and even using
3n + 2 is even, then n
is
a) a proof by contraposition. b) a proof by contradiction. 19. Prove the proposition P (O), where P (n ) is the proposition "If n is a positive integer greater than I , then n 2 > n ." What kind of proof did you use? 20. Prove the proposition P ( I ), where pen) is the proposi tion "If n is a positive integer, then n 2 :::: n ." What kind of proof did you use? 2 1 . Let p en) be the proposition "If a and b are positive real numbers, then (a + b)n :::: a n + bn ." Prove that P ( I ) is true. What kind of proof did you use? 22. Show that if you pick three socks from a drawer contain ing j ust blue socks and black socks, you must get either a pair of blue socks or a pair of black socks. 23. Show that at least 1 0 of any 64 days chosen must fall on the same day of the week. 24. Show that at least 3 of any 25 days chosen must fall in the same month of the year. 25. Use a proofby contradiction to show that there is no ratio nal number r for which r 3 + r + = O . [Hint: Assume that r = a / b is a root, where a and b are integers and a / b is in lowest terms. Obtain an equation involving integers by multiplying by b 3 • Then look at whether a and b are each odd or even.] 26. Prove that if n is a positive integer, then n is even if and only if n + 4 is even.
I
7
27. Prove that if n is a positive integer, then n is odd if and only if 5 n + 6 is odd. 28. Prove that m 2 = n 2 if and only if m = n or m = -n . 29. Prove or disprove that if m and n are integers such that mn = 1 , then either m = I and n = I , or else m = - I and n = - 1 . 30. Show that these three statements are equivalent, where a and b are real numbers: (i) a is less than b, (ii) the average of a and b is greater than a , and (iii) the average of a and b is less than b. 3 1 . Show that these statements about the integer x are equiv alent: (i) 3x + 2 is even, (ii) x + 5 is odd, (iii) x 2 is even.
1 / The Foundations: Logic and Proofs
86
1 -86
32. Show that these statements about the real number x are equivalent: (i) x is rational, (ii) x /2 is rational, and (iii) 3x - 1 is rational.
37.
33. Show that these statements about the real number x are equivalent: (i) x is irrational, (ii) 3x + 2 is irrational, (iii) x /2 is irrational. 34. Is this reasoning for finding the solutions of the equation .J2x 2 - 1 = x correct? (1) .J2x 2 - 1 = x is given; (2) 2X 2 - 1 = x 2 , obtained by squaring both sides of ( 1 ); (3) x 2 - 1 = 0, obtained by subtracting x 2 from both sides of (2); (4) (x - 1 )(x + 1 ) = 0, obtained by factoring the left hand side of x 2 - 1 ; (5) x = 1 or x = - 1 , which follows because ab = 0 implies that a = 0 or b = O . 35. Are these steps for finding the solutions of .Jx + = - x correct? (1) .Jx + = - x is given; (2) x + 3 = x 2 - 6x + obtained by squaring both sides of ( I ); (3) 0 = x 2 - 7x + 6, obtained by subtracting x + from both sides of (2); (4) 0 = (x - 1 )(x - 6), obtained by factoring the right-hand side of (5) x = 1 or x = 6, which follows from (4) because ab = 0 implies that a = 0 or b = O . 36. Show that the propositions PI , P2 , P 3 , and P4 can be shown
3 3
9,
3 3
3
(3);
38.
39.
40.
41.
42.
to be equivalent by showing that PI ++ P4, P2 ++ P3 , and P I ++ P3 · Show that the propositions PI , P2 , P3 , P4, and P5 can be shown to be equivalent by proving that the conditional statements P I -+ P4 , P3 -+ PI . P4 -+ P2 , P2 -+ P5 , and P5 -+ P3 are true. Find a counterexample to the statement that every positive integer can be written as the sum of the squares of three integers. Prove that at least one of the real numbers a I . a 2 , . . . , an is greater than or equal to the average of these numbers. What kind of proof did you use? Use Exercise to show that if the first 10 positive inte gers are placed around a circle, in any order, there exist three integers in consecutive locations around the circle that have a sum greater than or equal to 1 7 . Prove that if is an integer, these four statements are equivalent: (i) is even, (ii) + 1 is odd, (iii) + 1 is odd, (iv) is even. Prove that these four statements about the integer are equivalent: (i) n 2 is odd, (ii) 1 - is even, (iii) n 3 is odd, (iv) n 2 + 1 is even.
39
nn 3n
n
3n
n
n
1.7 Proof Methods and Strategy Introduction
Assessment �
In Section 1 .6 we introduced a variety of different methods of proof and illustrated how each method is used. In this section we continue this effort. We will introduce several other important proof methods, including proofs where we consider different cases separately and proofs where we prove the existence of objects with desired properties. In Section 1 .6 we only briefly discussed the strategy behind constructing proofs. This strategy includes selecting a proof method and then successfully constructing an argument step by step, based on this method. In this section, after we have developed a wider arsenal of proof methods, we will study some additional aspects of the art and science of proofs. We will provide advice on how to find a proof of a theorem. We will describe some tricks of the trade, including how proofs can be found by working backward and by adapting existing proofs. When mathematicians work, they formulate conjectures and attempt to prove or disprove them. We will briefly describe this process here by proving results about tiling checkerboards with dominoes and other types of pieces. Looking at tilings of this kind, we will be able to quickly formulate conjectures and prove theorems without first developing a theory. We will conclude the section by discussing the role of open questions. In particular, we will discuss some interesting problems either that have been solved after remaining open for hundreds of years or that still remain open.
Exhaustive Proof and Proof by Cases Sometimes we cannot prove a theorem using a single argument that holds for all possible cases. We now introduce a method that can be used to prove a theorem, by considering different cases
1 .7 Proof Methods and Strategy 87
/-87
separately. This method is based on a rule of inference that we will now introduce. To prove a conditional statement of the form
the tautology
can be used as a rule of inference. This shows that the original conditional statement with a hypothesis made up ofa disjunction of the propositions PI , P2 , . . . , Pn can be proved by proving each of the n conditional statements Pi -+ q , i 1 , 2, . . . , n , individually. Such an argument is called a proof by cases. Sometimes to prove that a conditional statement P -+ q is true, it is convenient to use a disjunction PI v P2 V . . . V P instead of P as the hypothesis of the n conditional statement, where P and PI v P2 V . . . V P are equivalent. n
=
EXHAUSTIVE PROOF Some theorems can be proved by examining a relatively small num ber of examples. Such proofs are called exhaustive proofs, because these proofs proceed by exhausting all possibilities. An exhaustive proof is a special type of proof by cases where each case involves checking a -single example. We now provide some illustrations of exhaustive proofs.
EXAMPLE 1
Prove that (n + 1 )2 � 3n if n is a positive integer with n :s 4.
Solution: We use a proof by exhaustion. We only need verify the inequality (n + 1 ) 2 ::: 3n when n 1 , 2, 3 , and 4. For n 1 , we have (n + 1 )2 2 2 4 and 3n 3 1 3 ; for n 2, we have (n + Ii 3 2 9 and 3n 3 2 9; for n 3, we have (n + 1 )3 43 64 and 3n 3 3 27; and for n 4, we have (n + 1 ) 3 5 3 1 25 and 3n 3 4 8 1 . In each of these four cases, we see that (n + 1 ) 2 � 3n . We have used the method of exhaustion to prove that (n + 1 ) 2 � 3n if n .... is a positive integer with n :s 4.
=
Ex1ra � Examples IIiiiiiiI EXAMPLE 2
= = = = = = = = = =
= = =
= = = =
= = =
Prove that the only consecutive positive integers not exceeding 1 00 that are perfect powers are 8 and 9. (An integer is a perfect power if it equals na, where a is an integer greater than 1 .)
Solution: We can prove this fact by showing that the only pair n, n + 1 of consecutive positive integers that are both perfect powers with n < 1 00 arises when n 8. We can prove this fact by examining positive integers n not exceeding 1 00, first checking whether n is a perfect power, and if it is, checking whether n + 1 is also a perfect power. A quicker way to do this is simply to look at all perfect powers not exceeding 1 00 and checking whether the next largest integer is also a perfect power. The squares of positive integers not exceeding 1 00 are 1 , 4, 9, 1 6, 25, 36, 49, 64, 8 1 , and 1 00. The cubes of positive integers not exceeding 1 00 are 1 , 8, 27, and 64. The fourth powers of positive integers not exceeding 1 00 are 1 , 1 6, and 8 1 . The fifth powers of positive integers not exceeding 1 00 are 1 and 32. The sixth powers of positive integers not exceeding 1 00 are 1 and 64. There are no powers of positive integers higher than the sixth power not exceeding 1 00, other than 1 . Looking at this list of perfect powers not exceeding 1 00, we see that n = 8 is the only perfect power n for which n + 1 is also a perfect power. That .... is, 23 = 8 and 3 2 9 are the only two consecutive perfect powers not exceeding 1 00.
=
=
People can carry out exhaustive proofs when it is necessary to check only a relatively small number of instances of a statement. Computers do not complain when they are asked to check a much larger number of instances of a statement, but they still have limitations. Note that not even a computer can check all instances when it is impossible to list all instances to check.
88
1 -88
1 / The Foundations: Logic and Proofs
PROOF BY CASES A proof by cases must cover all possible cases that arise in a theorem. We illustrate proofby cases with a couple of examples. In each example, you should check that all possible cases are covered.
EXAMPLE 3
Prove that if n is an integer, then n 2 :::: n .
=
Solution: We can prove that n 2 :::: n for every integer by considering three cases, when n 0, when n :::: 1 , and when n :::: - 1 . We split the proof into three cases because it is straightforward to prove the result by considering zero, positive integers, and negative integers separately.
=
=
Case (i). When n 0, because 02 0, we see that 02 :::: 0. It follows that n 2 :::: n is true in this case. Case (ii). When n :::: 1 , when we multiply both sides of the inequality n :::: 1 by the positive integer n , we obtain n . n :::: n . 1 . This implies that n 2 :::: n for n :::: 1 . Case (iii). In this case n :::: - 1 . However, n 2 :::: 0 . It follows that n 2 :::: n . Because the inequality n 2 :::: n holds in all three cases, we can conclude that i f n i s an integer,
EXAMPLE 4
=
Use a proof by cases to show that IxY I Ix l ly l , where x and y are real numbers. (Recall that la l , the absolute value of a , equals a when a :::: ° and equals -a when a :::: 0.)
Solution: In our proof of this theorem, we remove absolute values using the fact that la I = a when a :::: ° and la I -a when a < 0. Because both Ix I and Iyl occur in our formula, we will need four cases: (i) x and y both nonnegative, (ii) x nonnegative and y is negative, (iii) x negative and y nonnegative, and (iv) x negative and y negative. (Note that we can remove the absolute value signs by making the appropriate choice of signs within each case.)
=
=
Case (i). We see that PI --+ q because xy :::: ° when x :::: ° and y :::: 0, so that IxY I xy = Ix l ly l · Case (ii). To see that P2 --+ q , note that if x :::: ° and y < 0, then xy :::: 0, so that IxY I -xy x (-y) Ix l ly l . (Here, because y < 0, we have Iyl -y.) Case (iii). To see that P3 --+ q, we follow the same reasoning as the previous case with the roles of x and y reversed. Case (iv). To see that P4 --+ q , note that when x < ° and y < 0, it follows that xy > 0. Hence, IxY I xy (-x)(-y) Ix l ly l .
=
=
=
=
=
=
=
Because we have completed all four cases and these cases exhaust all possibilities, we can
=
LEVERAGING PROOF BY CASES The examples we have presented illustrating proof by cases provide some insight into when to use this method of proof. In particular, when it is not possible to consider all cases of a proof at the same time, a proofby cases should be considered. When should you use such a proof? Generally, look for a proofby cases when there is no obvious way to begin a proof, but when extra information in each case helps move the proof forward. Example 5 illustrates how the method of proof by cases can be used effectively.
EXAMPLE 5
Formulate a conjecture about the decimal digits that occur as the final digit of the square of an integer and prove your result.
Solution: The smallest perfect squares are 1 , 4, 9, 1 6, 25, 36, 49, 64, 8 1 , 1 00, 1 2 1 , 144, 1 69, 1 96, 225, and so on. We notice that the digits that occur as the final digit of a square are 0, 1 , 4, 5 , 6, and 9, with 2, 3 , 7, and 8 never appearing as the final digit ofa square. We conjecture
1 .7 Proof Methods and Strategy 89
/-89
this theorem: The final decimal digit of a perfect square is 0, 1 , 4, 5 , 6 or 9. How can we prove this theorem? We first note that we can express an integer n as l Oa + b, where a and b are positive integers and b is 0, 1 , 2, 3 , 4, 5 , 6, 7, 8, or 9. Here a is the integer obtained by subtracting the final decimal digit ofn from n and dividing by 1 0. Next, note that ( 1 Oa + b i = 1 00a 2 + 20ab + b2 = 1 0( 1 0a 2 + 2b) + b2 , so that the final decimal digit ofn 2 is the same as the final decimal digit of b 2 . Furthermore, note that the final decimal digit of b 2 is the same as the final decimal digit of ( 1 0 b)2 = 1 00 - 20b + b2 . Consequently, we can reduce our proof to the consideration of six cases. Case (i}. The final digit of n is 1 or 9. Then the final decimal digit of n 2 is the final decimal digit of 1 2 = 1 or 92 = 8 1 , namely, 1 . Case (ii). The final digit of n is 2 or 8 . Then the final decimal digit of n 2 is the final decimal digit of 22 = 4 or 8 2 = 64, namely, 4. Case (iii}. The final digit of n is 3 or 7. Then the final decimal digit of n 2 is the final decimal digit of 3 2 = 9 or 7 2 = 49, namely, 9. Case (iv). The final digit of n is 4 or 6. Then the final decimal digit of n 2 is the final decimal digit of 42 = 1 6 or 6 2 = 36, namely, 6. Case (v). The final decimal digit of n is 5 . Then the final decimal digit of n 2 is the final decimal digit of 5 2 = 25, namely 5 . Case (vi}. The final decimal digit o f n i s O. Then the final decimal digit o f n 2 i s the final decimal digit of 02 = 0, namely O.
Because we have considered all six cases, we can conclude that the final decimal digit of n 2 , ... where n is an integer is either 0, 1 , 2, 4, 5, 6, or 9. Sometimes we can eliminate all but a few examples in a proof by cases, as Example 6 illustrates.
EXAMPLE 6
Show that there are no solutions in integers x and y of x 2 + 3y 2 = 8.
Solution: We can quickly reduce a proof to checking just a few simple cases because x 2 > 8 when Ix I ::: 3 and 3y2 > 8 when Iy I ::: 2. This leaves the cases when x takes on one of the values -2, - 1 , 0, 1 , or 2 and y takes on one of the values - 1 , 0, or 1 . We can finish using an exhaustive proof. To dispense with the remaining cases, we note that possible values for x 2 are 0, 1 , and 4, and possible values for 3y 2 are 0 and 3, and the largest sum of possible values for x 2 and 3y 2 is ... 7. Consequently, it is impossible for x 2 + 3y 2 = 8 to hold when x and y are integers. WITHOUT LOSS OF GENERALITY In the proof in Example 4, we dismissed case (iii), where x < 0 and y ::: 0, because it is the same as case (ii), where x ::: 0 and y < 0, with the roles of x and y reversed. To shorten the proof, we could have proved cases (ii) and (iii) together by assuming, without loss of generality, that x ::: 0 and y < O. Implicit in this statement is that we can complete the case with x < 0 and y ::: 0 using the same argument as we used for the case with x ::: 0 and y < 0, but with the obvious changes. In general, when the phrase "without loss of generality" is used in a proof(often abbreviated as WLOG), we assert that by proving one case of a theorem, no additional argument is required to prove other specified cases. That is, other cases follow by making straightforward changes to the argument, or by filling in some straightforward initial step. Of course, incorrect use of this principle can lead to unfortunate errors. Sometimes assumptions are made that lead to a loss in generality. Such assumptions can be made that do not take into account that one case may be substantially different from others. This can lead to an incomplete, and possibly unsalvageable, proof. In fact, many incorrect proofs of famous theorems turned out to rely on arguments that used the idea of "without loss of generality" to establish cases that could not be quickly proved from simpler cases. We now illustrate a proof where without loss of generality is used effectively.
90
1 -90
1 / The Foundations: Logic and Proofs
EXAMPLE 7
Show that (x + y Y < xr + yr whenever x and y are positive real numbers and r is a real number with 0 < r < 1 .
Solution: Without loss of generality we can assume that x + y = 1 . [To see this, suppose we have proved the theorem with the assumption that x + y = 1 . Suppose that x + y = t. Then (x / t ) + (y/ t ) = 1 , which implies that « x / t ) + (y/ t )Y < (x / tY + (y/ tY . Multiplying both sides of this last equation by tr shows that (x + y y < xr + yr .] Assuming that x + y = 1, because x and y are positive, we have 0 < x < 1 and 0 < y < 1 . I Because 0 < r < 1 , it follows that 0 < 1 r < 1 , so X -r < 1 and y l -r < 1 . This means that x < xr andy < yr . Consequently, xr + yr > X + y = 1 . This means that (x + yy = l r < xr + yr . This proves the theorem for x + y = 1 . Because we could assume x + y = 1 without loss of generality, we know that (x + yy < xr + yr whenever x and y are positive real numbers and r is a real number with 0 < r < 1 . .... -
COMMON ERRORS WITH EXHAUSTIVE PROOF AND PROOF BY CASES A common error of reasoning is to draw incorrect conclusions from examples. No matter how many separate examples are considered, a theorem is not proved by considering examples unless every possible case is covered. The problem of proving a theorem is analogous to showing that a computer program always produces the output desired. No matter how many input values are tested, unless all input values are tested, we cannot conclude that the program always produces the correct output.
EXAMPLE 8
Is it true that every positive integer is the sum of 1 8 fourth powers of integers?
Solution: To determine whether n can be written as the sum of 1 8 fourth powers of integers, we might begin by examining whether n is the sum of 1 8 fourth powers of integers for the smallest positive integers. Because the fourth powers of integers are 0, 1 , 1 6, 8 1 , . . . , if we can select 1 8 terms from these numbers that add up to n , then n is the sum of 1 8 fourth powers. We can show that all positive integers up to 78 can be written as the sum of 1 8 fourth powers. (The details are left to the reader.) However, if we decided this was enough checking, we would come to the wrong conclusion. It is not true that every positive integer is the sum of 1 8 fourth powers .... because 79 is not the sum of 1 8 fourth powers (as the reader can verify). Another common error involves making unwarranted assumptions that lead to incorrect proofs by cases where not all cases are considered. This is illustrated in Example 9.
EXAMPLE 9
What is wrong with this "proof" ? "Theorem:" If x is a real number, then x 2 is a positive real number.
"Proof:" Let P I be "x is positive," let P 2 be "x is negative," and let q be "x 2 is positive." To show that P I -+ q is true, note that when x is positive, x 2 is positive because it is the product of two positive numbers, x and x . To show that P2 -+ q , note that when x is negative, x 2 is positive because it is the product of two negative numbers, x and x . This completes the proof.
Solution: The problem with the proof we have given is that we missed the case x = O. When x = 0, x 2 = 0 is not positive, so the supposed theorem is false. If P is "x is a real number," then we can prove results where P is the hypothesis with three cases, P I , P2 , and P3 , where P I is "x is positive," P2 is "x is negative," and P 3 is "x = 0" because of the equivalence .... P B PI V P2 V P3 ·
1 .7 Proof Methods and Strategy 91
1-91
Existence Proofs Many theorems are assertions that objects of a particular type exist. A theorem of this type is a proposition of the form 3x P (x), where P is a predicate. A proof of a proposition of the form 3x P(x) is called an existence proof. There are several ways to prove a theorem of this type. Sometimes an existence proof of 3x P (x) can be given by finding an element a such that P (a) is true. Such an existence proof is called constructive. It is also possible to give an existence proof that is nonconstructive; that is, we do not find an element a such that P ea) is true, but rather prove that 3x P (x) is true in some other way. One common method of giving a nonconstructive existence proof is to use proof by contradiction and show that the negation of the existential quantification implies a contradiction. The concept of a constructive existence proof is illustrated by Example 1 0 and the concept of a nonconstructive existence proof is illustrated by Example 1 1 .
EXAMPLE 10
Extra ;'" Examples �
A Constructive Existence Proof Show that there is a positive integer that can be written as the sum of cubes of positive integers in two different ways.
Solution: After considerable computation (such as a computer search) we find that
Because we have displayed a positive integer that can be written as the sum of cubes in two � different ways, we are done.
EXAMPLE 11
A Nonconstructive Existence Proof
that xY is rational.
Show that there exist irrational numbers x and y such
Solution: By Example 1 0 in Section 1 .6 we know that .j2 is irrational. Consider the number v"i .j2 . If it is rational, we have two irrational numbers x and y with xY rational, namely, x = .j2 v"i v"i and y = .j2. On the other hand if .j2 is irrational, then we can let x = .j2 and y = .j2 r,;v"i h � v"i.v"i) r,; = v 2 2 = 2. so that xY = (v 2 )'-' 2 = '\1' 2 This proof is an example of a nonconstructive existence proof because we have not found irrational numbers x and y such that xY is rational. Rather, we have shown that either the pair v"i x = .j2, y = .j2 or the pair x = .j2 , Y = .j2 have the desired property, but we do not know � which of these two pairs works ! Nonconstructive existence proofs often are quite subtle, as Example 1 2 illustrates.
EXAMPLE 12
unkS�
Chomp is a game played by two players. In this game, cookies are laid out on a rectangular grid. The cookie in the top left position is poisoned, as shown in Figure l ea). The two players take turns making moves; at each move, a player is required to eat a remaining cookie, together with all cookies to the right and/or below it (see Figure I (b), for example). The loser is the player who has no choice but to eat the poisoned cookie. We ask whether one of the two players has a winning strategy. That is, can one of the players always make moves that are guaranteed to lead to a win?
Solution: We will give a nonconstructive existence proof of a winning strategy for the first player. That is, we will show that the first player always has a winning strategy without explicitly describing the moves this player must follow. First, note that the game ends and cannot finish in a draw because with each move at least one cookie is eaten, so after no more than m x n moves the game ends, where the initial grid is
92
1 - 92
1 / The Foundations: Logic and Proofs ...............
® -' @ " , . ' -. -
...
.
-" .
' ''',- - ! ' .. �
' \
(b)
FIGURE 1
(a) Chomp, the Top Left Cookie is Poison
(b) Three Possible Moves.
m x n _ Now, suppose that the first player begins the game by eating just the cookie in the bottom right comer. There are two possibilities, this is the first move of a winning strategy for the first player, or the second player can make a move that is the first move of a winning strategy for the second player. In this second case, instead of eating just the cookie in the bottom right comer, the first player could have made the same move that the second player made as the first move of a winning strategy (and then continued to follow that winning strategy). This would guarantee a win for the first player. Note that we showed that a winning strategy exists, but we did not specify an actual winning strategy. Consequently, the proof is a nonconstructive existence proof. In fact, no one has been able to describe a winning strategy for that Chomp that applies for all rectangular grids by describing the moves that the first player should follow. However, winning strategies can be described for certain special cases, such as when the grid is square and when the grid only has
Uniqueness Proofs Some theorems assert the existence of a unique element with a particular property. In other words, these theorems assert that there is exactly one element with this property. To prove a statement of this type we need to show that an element with this property exists and that no other element has this property. The two parts of a uniqueness proof are:
Existence: We show that an element x with the desired property exists. Uniqueness: We show that if y #- x , then y does not have the desired property. Equivalently, we can show that if x and y both have the desired property, then x = y. Remark: Showing that there is a unique element x such that P(x) is the same as proving the
statement 3x(P(x) /\ Vy(y #- x
-+
-,P(y» ).
We illustrate the elements of a uniqueness proof in Example 1 3 .
1 .7 Proof Methods and Strategy
1 -93
EXAMPLE 13
ExIra ;"" Examples "'"
93
Show that if a and b are real numbers and a =I=- 0, then there is a unique real number r such that ar + b = O.
Solution: First, note that the real number r = -b/a is a solution of ar + b = 0 because a( -b/a) + b = -b + b = O. Consequently, a real number r exists for which ar + b = O. This is the existence part of the proof. Second, suppose that s is a real number such that as + b = O. Then ar + b = as + b, where r = -b / a . Subtracting b from both sides, we find that ar = as . Dividing both sides of this last equation by a, which is nonzero, we see that r = s . This means that if s =I=- r, then as + b =I=- O. .... This establishes the uniqueness part of the proof.
Proof Strategies Finding proofs can be a challenging business. When you are confronted with a statement to prove, you should first replace terms by their definitions and then carefully analyze what the hypotheses and the conclusion mean. After doing so, you can attempt to prove the result using one of the available methods of proof. Generally, if the statement is a conditional statement,
linlls E:2
GODFREY HAROLD HARDY ( 1 877- 1 947) Hardy, born in Cranleigh, Surrey, England, was the older of two children of Isaac Hardy and Sophia Hall Hardy. His father was the geography and drawing master at the Cranleigh School and also gave singing lessons and played soccer. His mother gave piano lessons and helped run a boardinghouse for young students. Hardy's parents were devoted to their children's education. Hardy demonstrated his numerical ability at the early age of two when he began writing down numbers into the millions. He had a private mathematics tutor rather than attending regular classes at the Cranleigh School. He moved to Winchester College, a private high school, when he was 1 3 and was awarded a scholarship. He excelled in his studies and demonstrated a strong interest in mathematics. He entered Trinity College, Cambridge, in 1 896 on a scholarship and won several prizes during his time there, graduating in 1 899. Hardy held the position of lecturer in mathematics at Trinity College at Cambridge University from 1 906 to 1 9 1 9, when he was appointed to the Sullivan chair of geometry at Oxford. He had become unhappy with Cambridge over the dismissal of the famous philosopher and mathematician Bertrand Russell from Trinity for antiwar activities and did not like a heavy load of administrative duties. In 193 1 he returned to Cambridge as the Sadleirian professor of pure mathematics, where he remained until his retirement in 1942. He was a pure mathematician and held an elitist view of mathematics, hoping that his research could never be applied. Ironically, he is perhaps best known as one of the developers of the Hardy-Weinberg law, which predicts patterns of inheritance. His work in this area appeared as a letter to the journal Science in which he used simple algebraic ideas to demonstrate errors in an article on genetics. Hardy worked primarily in number theory and function theory, exploring such topics as the Riemann zeta function, Fourier series, and the distribution of primes. He made many important contributions to many important problems, such as Waring's problem about representing positive integers as sums of kth powers and the problem of representing odd integers as sums of three primes. Hardy is also remembered for his collaborations with John E. Littlewood, a colleague at Cambridge, with whom he wrote more than 100 papers, and the famous Indian mathematical prodigy Srinivasa Ramanuj an. His collaboration with Littlewood led to the joke that there were only three important English mathematicians at that time, Hardy, Littlewood, and Hardy Littlewood, although some people thought that Hardy had invented a fictitious person, Littlewood, because Littlewood was seldom seen outside Cambridge. Hardy had the wisdom of recognizing Ramanujan's genius from uncoventional but extremely creative writings Ramanujan sent him, while other mathematicians failed to see the genius. Hardy brought Ramanujan to Cambridge and collaborated on important joint papers, establishing new results on the number of partitions of an integer. Hardy was interested in mathematics education, and his book A Course of Pure Mathematics had a profound effect on undergraduate instruction in mathematics in the first half of the twentieth century. Hardy also wrote A Mathematician 50 Apology, in which he gives his answer to the question of whether it is worthwhile to devote one's life to the study of mathematics. It presents Hardy's view of what mathematics is and what a mathematician does. Hardy had a strong interest in sports. He was an avid cricket fan and followed scores closely. One peculiar trait he had was that he did not like his picture taken (only five snapshots are known) and disliked mirrors, covering them with towels immediately upon entering a hotel room. HISTORICAL NOTE The English mathematician G. H. Hardy, when visiting the ailing Indian prodigy Ramanujan in the hospital, remarked that 1 729, the number of the cab he took, was rather dull. Ramanujan replied "No, it is a very interesting number; it is the smallest number expressible as the sum of cubes in two different ways."
94
1 / The Foundations: Logic and Proofs
/-94
you should first try a direct proof; if this fails, you can try an indirect proof. If neither of these approaches works, you might try a proof by contradiction. FORWARD AND BACKWARD REASONING Whichever method you choose, you need a starting point for your proof. To begin a direct proof of a conditional statement, you start with the premises. Using these premises, together with axioms and known theorems, you can construct a proof using a sequence of steps that leads to the conclusion. This type of reasoning, calledforward reasoning, is the most common type of reasoning used to prove relatively simple results. Similarly, with indirect reasoning you can start with the negation of the conclusion and, using a sequence of steps, obtain the negation of the premises. Unfortunately, forward reasoning is often difficult to use to prove more complicated results, because the reasoning needed to reach the desired conclusion may be far from obvious. In such cases it may be helpful to use backward reasoning. To reason backward to prove a statement q , we find a statement p that we can prove with the property that p ---+ q . (Note that it is not helpful to find a statement r that you can prove such that q ---+ r , because it is the fallacy of begging the question to conclude from q ---+ r and r that q is true.) Backward reasoning is illustrated in Examples 1 4 and 1 5 .
unkS � . SRINIVASA RAMANUJAN ( 1 887-1 920) The famous mathematical prodigy Ramanujan was born and raised in southern India near the city of Madras (now called Chennai). His father was a clerk in a cloth shop. His mother contributed to the family income by singing at a local temple. Ramanujan studied at the local English language school, displaying his talent and interest for mathematics. At the age of 1 3 he mastered a textbook used by college students. When he was 1 5 , a university student lent him a copy of Synopsis ofPure Mathematics. Ramanujan decided to work out the over 6000 results in this book, stated without proof or explanation, writing on sheets later collected to form notebooks. He graduated from high school in 1 904, winning a scholarship to the University of Madras. Enrolling in a fine arts curriculum, he neglected his subjects other than mathematics and lost his scholarship. He failed to pass examinations at the university four times from 1 904 to 1 907, doing well only in mathematics. During this time he filled his notebooks with original writings, sometimes rediscovering already published work and at other times making new discoveries. Without a university degree, it was difficult for Ramanujan to find a decent job. To survive, he had to depend on the goodwill of his friends. He tutored students in mathematics, but his unconventional ways of thinking and failure to stick to the syllabus caused problems. He was married in 1 909 in an arranged marriage to a young woman nine years his junior. Needing to support himself and his wife, he moved to Madras and sought a job. He showed his notebooks of mathematical writings to his potential employers, but the books bewildered them. However, a professor at the Presidency College recognized his genius and supported him, and in 1 9 1 2 he found work a s an accounts clerk, earning a small salary. Ramanujan continued his mathematical work during this time and published his first paper in 1 9 1 0 in an Indian journal. He realized that his work was beyond that of Indian mathematicians and decided to write to leading English mathematicians. The first mathematicians he wrote to turned down his request for help. But in January 1 9 1 3 he wrote to G. H. Hardy, who was inclined to tum Ramanujan down, but the mathematical statements in the letter, although stated without proof, puzzled Hardy. He decided to examine them closely with the help of his colleague and collaborator J. E. Littlewood. They decided, after careful study, that Ramanujan was probably a genius, because his statements "could only be written down by a mathematician of the highest class; they must be true, because if they were not true, no one would have the imagination to invent them." Hardy arranged a scholarship for Ramanuj an, bringing him to England in 1 9 14. Hardy personally tutored him in mathematical analysis, and they collaborated for five years, proving significant theorems about the number of partitions of integers. During this time, Ramanujan made important contributions to number theory and also worked on continued fractions, infinite series, and elliptic functions. Ramanujan had amazing insight involving certain types of functions and series, but his purported theorems on prime numbers were often wrong, illustrating his vague idea of what constitutes a correct proof. He was one of the youngest members ever appointed a Fellow of the Royal Society. Unfortunately, in 1 9 1 7 Ramanujan became extremely ill. At the time, it was thought that he had trouble with the English climate and had contracted tuberculosis. It is now thought that he suffered from a vitamin deficiency, brought on by Ramanujan's strict vegetarianism and shortages in wartime England. He returned to India in 1 9 1 9, continuing to do mathematics even when confined to his bed. He was religious and thought his mathematical talent came from his family deity, Namagiri. He considered mathematics and religion to be linked. He said that "an equation for me has no meaning unless it expresses a thought of God." His short life came to an end in April 1 920, when he was 32 years old. Ramanujan left several notebooks of unpublished results. The writings in these notebooks illustrate Ramanujan's insights but are quite sketchy. Several mathematicians have devoted many years of study to explaining and justifying the results in these notebooks.
1 .7 Proof Methods and Strategy
/ - 95
EXAMPLE 14
95
Given two positive real numbers x and y, their arithmetic mean is (x + y) /2 and their geometric mean is .jXY . When we compare the arithmetic and geometric means of pairs of distinct positive real numbers, we find that the arithmetic mean is always greater than the geometric mean. [For example, when x = 4 and y = 6, we have 5 = (4 + 6)/2 > � = .J24 .] Can we prove that this inequality is always true?
Solution: To prove that (x + y) /2 > .jXY when x and y are distinct positive real numbers, we can work backward. We construct a sequence of equivalent inequalities. The equivalent inequalities are (x + y)/2 (x + y)2 /4
>
.jXY,
>
xy ,
>
4xy,
(x + y)2 x 2 + 2xy + y 2 x 2 - 2xy + y2
>
4xy,
>
0,
(x - y)2
>
o.
Because (x - y)2 > 0 when x =1= y, it follows that the final inequality is true. Because all these inequalities are equivalent, it follows that (x + y) /2 > .jXY when x =1= y. Once we have carried out this backward reasoning, we can easily reverse the steps to construct a proof using forward reasoning. We now give this proof. Suppose that x and y are distinct real numbers. Then (x - y)2 > 0 because the square of a nonzero real number is positive (see Appendix 1 ). Because (x - y)2 = x 2 - 2xy + y 2 , this implies that x 2 - 2xy + y2 > O. Adding 4xy to both sides, we obtain x 2 + 2xy + y 2 > 4xy. Because x 2 + 2xy + y 2 = (x + y)2 , this means that (x + y) 2 2: 4xy. Dividing both sides of this equation by 4, we see that (x + y)2 /4 > xy. Finally, taking square roots of both sides (which preserves the inequality because both sides are positive) yields (x + y)/2 > .jXY. We conclude that if x and y are distinct positive real numbers, then their arithmetic mean (x + y)/2 is greater � than their geometric mean .jXY.
EXAMPLE 15
Suppose that two people play a game taking turns removing one, two, or three stones at a time from a pile that begins with 1 5 stones. The person who removes the last stone wins the game. Show that the first player can win the game no matter what the second player does.
Solution: To prove that the first player can always win the game, we work backward. At the last step, the first player can win if this player is left with a pile containing one, two, or three stones. The second player will be forced to leave one, two, or three stones if this player has to remove stones from a pile containing four stones. Consequently, one way for the first person to win is to leave four stones for the second player on the next-to-last move. The first person can leave four stones when there are five, six, or seven stones left at the beginning of this player's move, which happens when the second player has to remove stones from a pile with eight stones. Consequently, to force the second player to leave five, six , or seven stones, the first player should leave eight stones for the second player at the second-to-last move for the first player. This means that there are nine, ten, or eleven stones when the first player makes this move. Similarly, the first player should leave twelve stones when this player makes the first move. We can reverse this argument to show that the first player can always make moves so that this player wins the game no matter what the second player does. These moves successively leave twelve, eight, and � four stones for the second player.
96
1 / The Foundations: Logic and Proofs
1 -96
ADAPTING EXISTING PROOFS An excellent way to look for possible approaches that can be used to prove a statement is to take advantage of existing proofs. Often an existing proof can be adapted to prove a new result. Even when this is not the case, some of the ideas used in existing proofs may be helpful. Because existing proofs provide clues for new proofs, you should read and understand the proofs you encounter in your studies. This process is illustrated in Example 1 6.
EXAMPLE 16
In Example 1 0 of Section 1 .6 we proved that ..ti is irrational. We now conjecture that J3 is irrational. Can we adapt the proof in Example l O in Section 1 .6 to show that J3 is irrational?
Ex1ra � Examples �
Solution: To adapt the proof in Example l O in Section 1 .6, we begin by mimicking the steps in that proof, but with ..ti replaced with J3. First, we suppose that J3 d / c where the fraction c / d is in lowest terms. Squaring both sides tells us that 3 c2 / d2 , so that 3d2 c2 . Can we use this equation to show that 3 must be a factor of both c and d, similar to how we used the equation 2b 2 a 2 in Example 1 0 in Section 1 .6 to show that 2 must be a factor of both a and b? (Recall that an integer s is a factor of the integer t if t /s is an integer. An integer n is even if and only if 2 is a factor of n .) In turns out that we can, but we need some ammunition from number theory, which we will develop in Chapter 3 . We sketch out the remainder of the proof, but leave the justification of these steps until Chapter 3 . Because 3 is a factor of c2 , it must also be a factor of c. Furthermore, because 3 is a factor of c, 9 is a factor of c2 , which means that 9 is a factor of 3d2 • This implies that 3 is a factor of d2 , which means that 3 is a factor of that d. This makes 3 a factor of both c and d, which is a contradiction. After we have filled in the justification for these steps, we will have shown that J3 is irrational by adapting the proof that ..ti is irrational. Note that this proof can be extended to show that In is irrational whenever n � is a positive integer that is not a perfect square. We leave the details of this to Chapter 3 .
=
=
=
=
A good tip is to look for existing proofs that you might adapt when you are confronted with proving a new theorem, particularly when the new theorem seems similar to one you have already proved.
Looking for Counterexamples In Section 1 .5 we introduced the use of counterexamples to show that certain statements are false. When confronted with a conjecture, you might first try to prove this conjecture, and if your attempts are unsuccessful, you might try to find a counterexample. If you cannot find a counterexample, you might again try to prove the statement. In any case, looking for counterex amples is an extremely important pursuit, which often provides insights into problems. We will illustrate the role of counterexamples with a few examples.
EXAMPLE 17
In Example 14 in Section 1 .6 we showed that the statement "Every positive integer is the sum of two squares of integers" is false by finding a counterexample. That is, there are positive integers that cannot be written as the sum of the squares of two integers. Although we cannot write every positive integer as the sum of the squares of two integers, maybe we can write every positive integer as the sum of the squares of three integers. That is, is the statement "Every positive integer is the sum of the squares of three integers" true or false?
Solution: Because we know that not every positive integer can be written as the sum of two squares of integers, we might initially be skeptical that every positive integer can be written as the sum of three squares of integers. So, we first look for a counterexample. That is, we can show that the statement "Every positive integer is the sum of three squares of integers"
1 .7 Proof Methods and Strategy
97
is false if we can find a particular integer that is not the sum of the squares of three integers. To look for a counterexample, we try to write successive positive integers as a sum of three squares. We find that 1 = 02 + 02 + 1 2 , 2 = 02 + 1 2 + 1 2 , 3 = 1 2 + 1 2 + 1 2 , 4 = 02 + 02 + 22 , 5 = 02 + 1 2 + 2 2 , 6 = 1 2 + 1 2 + 22 , but we cannot find a way to write 7 as the sum of three squares. To show that there are not three squares that add up to 7, we note that the only possible squares we can use are those not exceeding 7, namely, 0, 1 , and 4. Because no three terms where each term is 0, 1 , or 4 add up to 7, it follows that 7 is a counterexample. We conclude that the statement "Every positive integer is the sum of the squares of three integers" is false. We have shown that not every positive integer is the sum of the squares of three integers. The next question to ask is whether every positive integer is the sum of the squares of four positive integers. Some experimentation provides evidence that the answer is yes. For example, 7 = 1 2 + 1 2 + 1 2 + 22 , 25 = 42 + 22 + 2 2 + 1 2 , and 87 = 92 + 2 2 + 1 2 + 1 2 . It turns out the conjecture "Every positive integer is the sum of the squares offour integers" is true. For a proof, � see [Ro05].
Proof Strategy in Action Mathematics is generally taught as if mathematical facts were carved in stone. Mathematics texts (including the bulk of this book) formally present theorems and their proofs. Such presen tations do not convey the discovery process in mathematics. This process begins with exploring concepts and examples, asking questions, formulating conjectures, and attempting to settle these conjectures either by proof or by counterexample. These are the day-to-day activities of math ematicians. Believe it or not, the material taught in textbooks was originally developed in this way. People formulate conjectures on the basis of many types of possible evidence. The exam ination of special cases can lead to a conjecture, as can the identification of possible patterns. Altering the hypotheses and conclusions of known theorems also can lead to plausible conjec tures. At other times, conjectures are made based on intuition or a belief that a result holds. No matter how a conjecture was made, once it has been formulated, the goal is to prove or disprove it. When mathematicians believe that a conjecture may be true, they try to find a proof. If they can not find a proof, they may look for a counterexample. When they cannot find a counterexample, they may switch gears and once again try to prove the conjecture. Although many conjectures are quickly settled, a few conjectures resist attack for hundreds of years and lead to the devel opment of new parts of mathematics. We will mention a few famous conjectures later in this section.
Tilings
unkS �
We can illustrate aspects of proof strategy through a brief study of tHings of checkerboards. Looking at tHings of checkerboards is a fruitful way to quickly discover and prove many different results, with these proofs using a variety of proof methods. There are almost an endless number of conjectures that can be made and studied in this area too. To begin, we need to define some terms. A checkerboard is a rectangle divided into squares of the same size by horizontal and vertical lines. The game of checkers is played on a board with 8 rows and 8 columns; this board is called the standard checkerboard and is shown in Figure 2. In this section we use the term board to refer to a checkerboard of any rectangular size as well as parts of checkerboards obtained by removing one or more squares. A domino is a rectangular piece that is one square by two squares, as shown in Figure 3 . We say that a board is tiled by dominoes when all its squares are covered with no overlapping dominoes and no dominoes overhanging the board. We now develop some results about tiling boards using dominoes.
98
1 / The Foundations: Logic and Proofs
1 - 98
B
rn FIGURE
EXAMPLE 18
2
The Standard Checkerboard.
FIGURE 3
Two Dominoes.
Can we tile the standard checkerboard using dominoes?
Solution: We can find many ways to tile the standard checkerboard using dominoes. For example, we can tile it by placing 32 dominoes horizontally, as shown in Figure 4. The existence of one such tiling completes a constructive existence proof. Of course, there are a large number of other ways to do this tiling. We can place 32 dominoes vertically on the board or we can place some tiles vertically and some horizontally. But for a constructive existence proof we needed to .... find just one such tiling. EXAMPLE 19
Can we tile a board obtained by removing one ofthe four comer squares of a standard checker board?
Solution: To answer this question, note that a standard checkerboard has 64 squares, so removing a square produces a board with 63 squares. Now suppose that we could tile a board obtained from the standard checkerboard by removing a comer square. The board has an even number of squares because each domino covers two squares and no two dominoes overlap and no dominoes overhang the board. Consequently, we can prove by contradiction that a standard checkerboard with one square removed cannot be tiled using dominoes because such a board has an odd .... number of squares. We now consider a trickier situation. EXAMPLE 20
Can we tile the board obtained by deleting the upper left and lower right comer squares of a standard checkerboard, shown in Figure 5?
Solution: A board obtained by deleting two squares of a standard checkerboard contains 64 2 = 62 squares. Because 62 is even, we cannot quickly rule out the existence of a tiling of the standard checkerboard with its upper left and lower right squares removed, unlike Example 1 9, where we ruled out the existence of a tiling of the standard checkerboard with one comer square removed. Trying to construct a tiling of this board by successively placing dominoes might be a first approach, as the reader should attempt. However, no matter how much we try, we cannot find such a tiling. Because our efforts do not produce a tiling, we are led to conjecture that no tiling exists.
1-99
1 .7 Proof Methods and Strategy
99
OJ OJ OJ OJOJOJOJ FIGURE 4
Tiling the Standard Checkerboard.
FIGURE 5 The Standard Checkerboard with the Upper Left and Lower Right Corners Removed.
We might try to prove that no tiling exists by showing that we reach a dead end however we successively place dominoes on the board. To construct such a proof, we would have to consider all possible cases that arise as we run through all possible choices of successively placing dominoes. For example, we have two choices for covering the square in the second column of the first row, next to the removed top left comer. We could cover it with a horizontally placed tile or a vertically placed tile. Each of these two choices leads to further choices, and so on. It does not take long to see that this is not a fruitful plan of attack for a person, although a computer could be used to complete such a proof by exhaustion. (Exercise 2 1 asks you to supply such a proof to show that a 4 x 4 checkerboard with opposite comers removed cannot be tiled.) We need another approach. Perhaps there is an easier way to prove there is no tiling of a standard checkerboard with two opposite comers. As with many proofs, a key observation can help. We color the squares of this checkerboard using alternating white and black squares, as in Figure 2. Observe that a domino in a tiling of such a board covers one white square and one black square. Next, note that this board has unequal numbers of white square and black squares. We can use these observations to prove by contradiction that a standard checkerboard with opposite comers removed cannot be tiled using dominoes. We now present such a proof.
Eb FIGURE 6 A Right Triomino and a Straight Triomino.
Proof: Suppose we can use dominoes to tile a standard checkerboard with opposite comers removed. Note that the standard checkerboard with opposite comers removed contains 64 - 2 = 62 squares. The tiling would use 62/2 = 3 1 dominoes. Note that each domino in this tiling covers one white and one black square. Consequently, the tiling covers 3 1 white squares and 3 1 black squares. However, when we remove two opposite comer squares, either 32 of the remaining squares are white and 30 are black or else 30 are white and 32 are black. This contradicts the assumption that we can use dominoes to cover a standard checkerboard with opposite comers removed, completing the proof. .... We can use other types of pieces besides dominoes in tilings. Instead of dominoes we can study tilings that use identically shaped pieces constructed from congruent squares that are connected along their edges. Such pieces are called polyominoes, a term coined in 1 953 by the mathematician Solomon Golomb, the author of an entertaining book about them [G094] . We will consider two polyominoes with the same number of squares the same if we can rotate and/or flip one ofthe polyominoes to get the other one. For example, there are two types of triominoes (see Figure 6), which are polyominoes made up of three squares connected by their sides. One
100
1 / The Foundations: Logic and Proofs
/ - / 00
type of triomino, the straight triomino, has three horizontally connected squares; the other type, right triominoes, resembles the letter L in shape, flipped and/or rotated, if necessary. We will study the tHings of a checkerboard by straight triominoes here; we will study tilings by right triominoes in Section 4. 1 .
EXAMPLE 21
Can you use straight triominoes to tile a standard checkerboard?
Solution: The standard checkerboard contains 64 squares and each triomino covers three squares. Consequently, if triominoes tile a board, the number of squares of the board must be a mul tiple of 3 . Because 64 is not a multiple of 3, triominoes cannot be used to cover an 8 x 8 .... checkerboard. In Example 22, we consider the problem of using straight triominoes to tile a standard checkerboard with one comer missing.
EXAMPLE 22
Can we use straight triominoes to tile a standard checkerboard with one of its four comers removed? An 8 x 8 checkerboard with one comer removed contains 64 1 = 63 squares. Any tiling by straight triominoes of one of these four boards uses 63/3 = 2 1 triominoes. However, when we experiment, we cannot find a tiling of one of these boards using straight triominoes. A proof by exhaustion does not appear promising. Can we adapt our proof from Example 20 to prove that no such tiling exists? -
Solution: We will color the squares of the checkerboard in an attempt to adapt the proof by contradiction we gave in Example 20 of the impossibility of using dominoes to tile a standard checkerboard with opposite comers removed. Because we are using straight triominoes rather than dominoes, we color the squares using three colors rather than two colors, as shown in Figure 7. Note that there are 2 1 green squares, 2 1 black squares, and 22 white squares in this coloring. Next, we make the crucial observation that when a straight triomino covers three squares of the checkerboard, it covers one green square, one black square, and one white square. Next, note that each of the three colors appears in a comer square. Thus without loss of generality, we may assume that we have rotated the coloring so that the missing square is colored green. Therefore, we assume that the remaining board contains 20 green squares, 2 1 black squares, and 22 white squares. If we could tile this board using straight triominoes, then we would use 63/3 = 2 1 straight triominoes. These triominoes would cover 2 1 green squares, 2 1 black squares, and 2 1 white squares. This contradicts the fact that this board contains 20 green squares, 2 1 black squares, .... and 22 white squares. Therefore we cannot tile this board using straight triominoes.
The Role of Open Problems Many advances in mathematics have been made by people trying to solve famous unsolved problems. In the past 20 years, many unsolved problems have finally been resolved, such as the proof of a conjecture in number theory made more than 300 years ago. This conjecture asserts the truth of the statement known as Fermat's Last Theorem.
THEOREM
1
FERMAT'S LAST THEOREM
The equation
x n + yn = z n
has no solutions in integers x,
y,
and z with xyz i= 0 whenever n is an integer with n
>
2.
1 .7 Proof Methods and Strategy
1-101
101
FIGURE 7 Coloring the Squares of the Standard Checkerboard with Opposite Corners Removed with Three Colors.
unkS
�
2
Remark: The equation x + y
2 = z 2 has infinitely many solutions in integers x, y, and z; these
solutions are called Pythagorean triples and correspond to the lengths of the sides of right triangles with integer lengths. See Exercise 30.
This problem has a fascinating history. In the seventeenth century, Fermat jotted in the margin of his copy of the works of Diophantus that he had a "wondrous proof" that there are no integer solutions of x n + y n = zn when n is an integer greater than 2 with xyz =1= O. However, he never published a proof (Fermat published almost nothing), and no proof could be found in the papers he left when he died. Mathematicians looked for a proof for three centuries without success, although many people were convinced that a relatively simple proof could be found. (Proofs of special cases were found, such as the proof of the case when n = 3 by Euler and the proof of the n = 4 case by Fermat himself.) Over the years, several established mathematicians thought that they had proved this theorem. In the nineteenth century, one ofthese failed attempts led to the development of the part of number theory called algebraic number theory. A correct proof, requiring hundreds of pages of advanced mathematics, was not found until the 1 990s, when Andrew Wiles used recently developed ideas from a sophisticated area of number theory called the theory of elliptic curves to prove Fermat's last theorem. Wiles's quest to find a proof of Fermat's last theorem using this powerful theory, described in a program in the Nova series on public television, took close to ten years ! (The interested reader should consult [Ro05] for more information about Fermat's Last Theorem and for additional references concerning this problem and its resolution.) We now state an open problem that is simple to describe, but that seems quite difficult to resolve. EXAMPLE 23 unkS
�
The 3x + 1 Conjecture Let T be the transformation that sends an even integer x to x /2 and an odd integer x to 3x + 1 . A famous conjecture, sometimes known as the 3x + 1 conjecture, states that for all positive integers x, when we repeatedly apply the transformation T , we will eventually reach the integer 1 . For example, starting with x = 1 3 , we find T ( 1 3) = 3 . 1 3 + 1 = 40, T (40) = 40/2 = 20, T (20) = 20/2 = 1 0, T ( 1 0) = 1 0/2 = 5 , T (5) = 3 · 5 + 1 = 1 6, T ( 1 6) = 8, T (8) = 4, T (4) = 2, and T (2) = 1 . The 3x + 1 conjecture has been verified for all integers x up to 5 . 6 · 1 0 1 3 .
102
I / The Foundations: Logic and Proofs
1 - 1 02
The 3x + 1 conjecture has an interesting history and has attracted the attention of math ematicians since the 1 950s. The conjecture has been raised many times and goes by many other names, including the Collatz problem, Hasse's algorithm, Ulam's problem, the Syracuse problem, and Kakutani's problem. Many mathematicians have been diverted from their work to spend time attacking this conjecture. This led to the joke that this problem was part of a conspiracy to slow down American mathematical research. See the article by Jeffrey Lagarias [La85] for a fascinating discussion of this problem and the results that have been found by .... mathematicians attacking it. In Chapter 3 we will describe additional open questions about prime numbers. Students already familiar with the basic notions about primes might want to explore Section 3 .4, where these open questions are discussed. We will mention other important open questions throughout the b09k.
Additional Proof Methods In this chapter we introduced the basic methods used in proofs. We also described how to leverage these methods to prove a variety of results. We will use these proof methods in Chapters 2 and 3 to prove results about sets, functions, algorithms, and number theory. Among the theorems we will prove is the famous halting theorem which states that there is a problem that cannot be solved using any procedure. However, there are many important proof methods besides those we have covered. We will introduce some of these methods later in this book. In particular, in Section 4. 1 we will discuss mathematical induction, which is an extremely useful method for proving statements of the form Vn P (n ) , where the domain consists of all positive integers. In Section 4.3 we will introduce structural induction, which can be used to prove results about recursively defined sets. We will use the Cantor diagonalization method, which can be used to prove results about the size of infinite sets, in Section 2.4. In Chapter 5 we will introduce the notion of combinatorial proofs, which can be used to prove results by counting arguments. The reader should note that entire books have been devoted to the activities discussed in this section, including many excellent works by George P6lya ([P06 1 ] , [P07 1 ] , [P090]). Finally, note that we have not given a procedure that can be used for proving theorems in mathematics. It is a deep theorem of mathematical logic that there is no such procedure.
Exercises 1.
Prove that n 2 + I ::: 1 :::: n ::::
4.
2n
when n is a positive integer with
2. Prove that there are no positive perfect cubes less 1 000 that are the sum of the cubes of two positive integers. 3. Prove that if x and y are real numbers, then max(x , y) + min(x , y) = x + y. [Hint: Use a proof by cases, with the two cases corresponding to x ::: y and x < y, respectively. ] 4. Use a proof by cases to show that min(a , min(b, c» = min(min(a , b), c) whenever a, b, and c are real numbers. 5. Prove the triangle inequality, which states that if x and y are real numbers, then Ix l + I y l ::: Ix + y l (where Ix l rep resents the absolute value of x, which equals x if x ::: 0 and equals -x if x < 0).
6. Prove that there is a positive integer that equals the sum of the positive integers not exceeding it. Is your proof constructive or nonconstructive? 7. Prove that there are 1 00 consecutive positive integers that are not perfect squares. Is your proof constructive or non constructive? 8. Prove that either 1 05 00 + 1 5 or 1 0500 + 1 6 is
2·
2·
not a perfect square. Is your proof constructive or nonconstructive? 9. Prove that there exists a pair of consecutive integers such that one of these integers is a perfect square and the other is a perfect cube. 10. Show that the product of two of the numbers 65 1000 - 5 8 1 92 + + 2 2001 , and 8 2001 + 3 1 77 , 7 9 1 2 1 2 _
92399
244493
1-103
1 .7 Proof Methods and Strategy
7 1 777 is nonnegative. Is your proof constructive or noncon structive? Do not try to evaluate these numbers ! ] 1 1 . Prove or disprove that there i s a rational number x and an irrational number y such that xY is irrational. 12. Prove or disprove that if a and b are rational numbers, then a b is also rational. 13. Show that each of these statements can be used to ex press the fact that there is a unique element x such that P(x) is true. [Note that we can also write this statement as 3 ! x P (x).] a) 3xVy(P(y) ++ x = y) b) 3x P(x) /\ VxVy(P(x) /\ P (y) -+ X = y) c) 3x(P(x) /\ Vy(P(y) -+ x = y» 14. Show that if a, b, and c are real numbers and a =I 0, then there is a unique solution of the equation ax + b = c. 15. Suppose that a and b are odd integers with a =I b. Show there is a unique integer c such that la - cl = I b - c l . 16. Show that i f r i s an irrational number, there i s a unique integer such that the distance between r and is less than 1 /2. 17. Show that if is an odd integer, then there is a unique integer k such that is the sum of k - 2 and k + 3 . 18. Prove that given a real number x there exist unique num bers and E such that x = + E, is an integer, and O � E < l. 19. Prove that given a real number x there exist unique num bers n and E such that x = - E , is an integer, and O � E < l. 20. Use forward reasoning to show that if x is a nonzero real number, then x 2 + l /x 2 :::: 2. Start with the inequality (x - l /x)2 :::: 0 which holds for all nonzero real numbers x .] 21. The harmonic mean of two real numbers x and y equals 2xy /(x + y). By computing the harmonic and geometric means of different pairs of positive real numbers, formu late a conjecture about their relative sizes and prove your conjecture. 22. The quadratic mean of two real numbers x and y equals J(x 2 + y2 )/2. By computing the arithmetic and quadratic means of different pairs of positive real num bers, formulate a conjecture about their relative sizes and prove your conjecture. *23. Write the numbers 1 , 2, on a blackboard, where is an odd integer. Pick any two of the numbers, and k, write Ij - kl on the board and erase and k. Continue this process until only one integer is written on the board. Prove that this integer must be odd. *24. Suppose that five ones and four zeros are arranged around a circle. Between any two equal bits you insert a 0 and between any two unequal bits you insert a 1 to produce nine new bits. Then you erase the nine original bits. Show that when you iterate this procedure, you can never get nine zeros. Work backward, assuming that you did end up with nine zeros.]
[Hint:
n
n
n n
n
n
n
n
n
[Hint:
. . . , 2n
[Hint:
j
j n
103
25. Formulate a conjecture about the decimal digits that ap
pear as the final decimal digit of the fourth power of an integer. Prove your conjecture using a proof by cases. 26. Formulate a conjecture about the final two decimal digits of the square of an integer. Prove your conjecture using a proof by cases. 27. Prove that there is no positive integer such that 2 +
n3
=
n
t oo.
n
28. Prove that there are no solutions in integers x and y to the
equation 2x 2 + 5y 2 = 14. 29. Prove that there are no solutions in positive integers x and y to the equation x4 + y4 = 625 . 30. Prove that there are infinitely many solutions i n posi tive integers x, y, and z to the equation x 2 + y2 = z2 . Let x = m 2 - n 2 , y = and z = m 2 + 2 , where m and are integers.] 31. Adapt the proof in Example in Section 1 .6 to prove that if = abc, where a , b, and c are positive integers, then a � �, b � �, or c � �. 32. Prove that � is irrational. 33. Prove that between every two rational numbers there is an irrational number. 34. Prove that between every rational number and every irra tional number there is an irrational number. *35. Let = xl Y I + X2 Y2 + . . . + Xn Yn , where X I , X2 , . . . , Xn and Y I , Y2 , , Yn are orderings of two different se quences of positive real numbers, each containing elements. a) Show that takes its maximum value over all order ings of the two sequences when both sequences are sorted (so that the elements in each sequence are in nondecreasing order). b) Show that takes its minimum value over all order ings ofthe two sequences when one sequence is sorted into nondecreasing order and the other is sorted into nonincreasing order. 36. Prove or disprove that if you have an 8-gallon jug of water and two empty jugs with capacities of 5 gal lons and 3 gallons, respectively, then you can measure 4 gallons by successively pouring some of or all of the water in a jug into another jug. 37. Verify the 3x + 1 conjecture for these integers. a) 6 b) 7 c) 1 7 d) 2 1 38. Verify the 3x + 1 conjecture for these integers. a) 1 6 b) 1 1 c) 35 d) 1 1 3 39. Prove or disprove that you can use dominoes to tile the standard checkerboard with two adjacent comers removed (that is, comers that are not opposite). 40. Prove or disprove that you can use dominoes to tile a stan dard checkerboard with all four comers removed. 41. Prove that you can use dominoes to tile a rectangular checkerboard with an even number of squares. 42. Prove or disprove that you can use dominoes to tile a 5 x 5 checkerboard with three comers removed.
[Hint: n
S
n
. • .
S
S
4
2mn,
n
n
104
1 / The Foundations: Logic and Proofs
1 - 1 04
43. Use a proof by exhaustion to show that a tiling using
dominoes of a 4 x 4 checkerboard with opposite cor ners removed does not exist. First show that you can assume that the squares in the upper left and lower right corners are removed. Number the squares of the original checkerboard from 1 to 1 6, starting in the first row, moving right in this row, then starting in the left most square in the second row and moving right, and so on. Remove squares 1 and 1 6 . To begin the proof, note that square is covered either by a domino laid horizontally, which covers squares and 3, or vertically, which covers squares and 6. Consider each of these cases separately, and work through all the subcases that arise.] *44. Prove that when a white square and a black square are removed from an 8 x 8 checkerboard (colored as in the text) you can tile the remaining squares of the checkerboard using dominoes. Show that when one black and one white square are removed, each part of the partition of the remaining cells formed by insert ing the barriers shown in the figure can be covered by dominoes.] 45. Show that by removing two white squares and two black squares from an 8 x 8 checkerboard (colored as in the text) you can make it impossible to tile the remaining squares using dominoes. *46. Find all squares, ifthey exist, on an 8 x 8 checkerboard so that the board obtained by removing one of these square can be tiled using straight triominoes. First use
[Hint:
2
2
2
[Hint:
[Hint:
Figure for Exercise 44. arguments based on coloring and rotations to eliminate as many squares as possible from consideration.] *47. a) Draw each of the five different tetrominoes, where a tetromino is a polyomino consisting of four squares. b) For each of the five different tetrominoes, prove or disprove that you can tile a standard checkerboard using these tetrominoes. *48. Prove or disprove that you can tile a l O x 10 checkerboard using straight tetrominoes.
Key Terms and Results TERMS proposition: a statement that is true or false propositional variable: a variable that represents a proposition truth value: true or false ...., p (negation ofp): the proposition with truth value opposite
to the truth value of p
logical operators: operators used to combine propositions compound proposition: a proposition constructed by combin
ing propositions using logical operators
truth table: a table displaying the truth values of propositions p V q (disj unction ofp and q): the proposition "p or q ," which
is true if and only if at least one of p and q is true q (conjunction of p and q): the proposition "p and q " which is true if and only if both p and q are true p EB q (exclusive or of p and q): the proposition "p XOR q " which is true when exactly one of p and q is true p - q (p implies q): the proposition "if p, then q ," which is false if and only if p is true and q is false converse ofp _ q: the conditional statement q � p contrapositive of p _ q: the conditional statement
p
/\
""'q � ""'p
inverse ofp - q: the conditional statement ""'p � ""'q p � q (biconditional): the proposition "p if and only if q ,"
which is true if and only if p and q have the same truth value bit: either a 0 or a 1 Boolean variable: a variable that has a value of 0 or 1 bit operation: an operation on a bit or bits bit string: a list of bits bitwise operations: operations on bit strings that operate on each bit in one string and the corresponding bit in the other string tautology: a compound proposition that is always true contradiction: a compound proposition that is always false contingency: a compound proposition that is sometimes true and sometimes false consistent compound propositions: compound propositions for which there is an assignment of truth values to the vari ables that makes all these propositions true logically equivalent compound propositions: compound propositions that always have the same truth values
1 - 1 05
predicate: part of a sentence that attributes a property to the subject propositional function: a statement containing one or more variables that becomes a proposition when each of its vari ables is assigned a value or is bound by a quantifier domain (or universe) of discourse: the values a variable in a propositional function may take 3x P(x) (existential quantification of P(x»: the proposition that is true if and only if there exists an x in the domain such that P(x) is true 'v'xP(x) (universal quantification of P(x»: the proposition that is true if and only if P (x) is true for every x in the domain logically equivalent expressions: expressions that have the same truth value no matter which propositional functions and domains are used free variable: a variable not bound in a propositional function bound variable: a variable that is quantified scope of a quantifier: portion of a statement where the quan tifier binds its variable argument: a sequence of statements argument form: a sequence of compound propositions involv ing propositional variables premise: a statement, in an argument, or argument form, other than the final one conclusion: the final statement in an argument or argument form valid argument form: a sequence of compound propositions involving propositional variables where the truth of all the premises implies the truth of the conclusion valid argument: an argument with a valid argument form rule of inference: a valid argument form that can be used in the demonstration that arguments are valid fallacy: an invalid argument form often used incorrectly as a rule of inference (or sometimes, more generally, an incor rect argument) circular reasoning or begging the question: reasoning where one or more steps are based on the truth of the statement being proved theorem: a mathematical assertion that can be shown to be true conjecture: a mathematical assertion proposed to be true, but that has not been proved
Review Questions
105
proof: a demonstration that a theorem is true axiom: a statement that is assumed to be true and that can be used as a basis for proving theorems lemma: a theorem used to prove other theorems corollary: a proposition that can be proved as a consequence of a theorem that has just been proved vacuous proof: a proof that p � q is true based on the fact that p is false trivial proof: a proof that p � q is true based on the fact that q is true direct proof: a proofthat p � q is true that proceeds by show ing that q must be true when p is true proof by contraposition : a proof that p � q is true that pro ceeds by showing that p must be false when q is false proof by contradiction: a proof that p is true based on the truth of the conditional statement �p � q, where q is a contradiction exhaustive proof: a proof that establishes a result by checking a list of all cases proof by cases: a proof broken into separate cases where these cases cover all possibilities without loss of generality: an assumption in a proofthat makes it possible to prove a theorem by reducing the number of cases needed in the proof counterexample: an element x such that P (x) is false constructive existence proof: a proof that an element with a specified property exists that explicitly finds such an ele ment nonconstructive existence proof: a proof that an element with a specified property exists that does not explicitly find such an element rational number: a number that can be expressed as the ratio of two integers p and q such that q # 0 uniqueness proof: a proof that there is exactly one element satisfying a specified property
RESULTS The logical equivalences given in Tables 6, 7, and 8 in Section 1 .2. De Morgan's laws for quantifiers. Rules of inference for propositional calculus. Rules of inference for quantified statements.
Review Questions 1. a) Define the negation of a proposition. b) What is the negation of "This is a boring course"? 2. a) Define (using truth tables) the disjunction, conjunc tion, exclusive or, conditional, and biconditional of the propositions p and q . b ) What are the disjunction, conjunction, exclusive or, conditional, and biconditional ofthe propositions "I'll go to the movies tonight" and "I'll finish my discrete mathematics homework"?
3. a) Describe at least five different ways to write the con ditional statement p � q in English. b) Define the converse and contrapositive ofa conditional statement. c) State the converse and the contrapositive of the con ditional statement "If it is sunny tomorrow, then I will go for a walk in the woods." 4. a) What does it mean for two propositions to be logically equivalent?
1 I
106
The Foundations: Logic and Proofs
b) Describe the different ways to show that two com pound propositions are logically equivalent. c) Show in at least two different ways that the compound propositions � P v (r -+ �q) and �P v �q V �r are equivalent. 5. (Depends on the Exercise Set in Section 1.2)
6. 7.
8.
9.
a) Given a truth table, explain how to use disjunctive nor mal form to construct a compound proposition with this truth table. b) Explain why part (a) shows that the operators /\, v, and � are functionally complete. c) Is there an operator such that the set containing just this operator is functionally complete? What are the universal and existential quantifications of a predicate P (x)? What are their negations? a) What is the difference between the quantification 3xVy P (x , y) and Vy3x P (x , y), where P (x , y) is a predicate? b) Give an example of a predicate P (x , y) such that 3xVy P (x , y) and Vy3x P (x , y) have different truth values. Describe what is meant by a valid argument in proposi tional logic and show that the argument "If the earth is flat, then you can sail off the edge of the earth," "You can not sail off the edge of the earth," therefore, "The earth is not flat" is a valid argument. Use rules of inference to show that if the premises "All zebras have stripes" and "Mark is a zebra" are true, then the conclusion "Mark has stripes" is true.
1 - 1 06
10. a) Describe what is meant by a direct proof, a proof by contraposition, and a proof by contradiction of a con ditional statement P -+ q . b) Give a direct proof, a proof by contraposition and a proof by contradiction of the statement: "If n is even, then n + 4 is even." 1 1 . a) Describe a way to prove the biconditional P # q . b) Prove the statement: "The integer 3 n + 2 i s odd if and only if the integer 9n + 5 is even, where n is an integer." 12. To prove that the statements PI , P2 , P3 , and P4 are equivalent, is it sufficient to show that the conditional statements P4 -+ P2 , P3 -+ PI , and PI -+ P2 are valid? If not, provide another set of conditional statements that can be used to show that the four statements are equivalent. 13. a) Suppose that a statement of the form Vx P (x) is false. How can this be proved? b) Show that the statement "For every positive integer n, n 2 ::: 2 n " i s false. 14. What is the difference between a constructive and nonconstructive existence proof? Give an example of each. 1 5. What are the elements of a proof that there is a unique element x such that P (x), where P(x) is a propositional function? 1 6. Explain how a proof by cases can be used to prove a result about absolute values, such as the fact that Ixy l = Ix l ly l for all real numbers x and y .
Su pp lementary Exercises 1. Let P be the proposition "I will do every exercise in this book" and q be the proposition "I will get an ''I>:' in this course." Express each of these as a combination of P and q.
a) I will get an "Pl' in this course only if ! do every exer cise in this book. b) I will get an "Pl' in this course and I will do every exercise in this book: c) Either I will not get an "Pl' in this course or I will not do every exercise in this book. For me to get an "A" in this course it is necessary and sufficient that I do every exercise in this book. 2. Find the truth table of the compound proposition (p v
d)
q) -+ (p /\ �r).
3. Show that these compound propositions are tautologies.
a) (�q /\ (p -+ q» -+ � P b) « p v q) /\ �p) -+ q 4. Give the converse, the contrapositive, and the inverse of these conditional statements. a) If it rains today, then I will drive to work. b) If lx l = x, then x ::: O. c) If n is greater than 3, then n 2 is greater than 9.
5 . Given a conditional statement P -+ q , find the converse of its inverse, the converse of its converse, and the con verse of its contrapositive. 6. Given a conditional statement p -+ q , find the inverse of its inverse, the inverse of its converse, and the inverse of its contrapositive. 7. Find a compound proposition involving the propositional variables p, q , r, and s that is true when exactly three of these propositional variables are true and is false otherwise.
8. Show that these statements are inconsistent: "If Sergei takes the job offer then he will get a signing bonus." "If Sergei takes the job offer, then he will receive a higher salary." "If Sergei gets a signing bonus, then he will not receive a higher salary." "Sergei takes the job offer." 9. Show that these statements are inconsistent: "If Miranda does not take a course in discrete mathematics, then she will not graduate." "If Miranda does not graduate, then she is not qualified for the job." "If Miranda reads this book, then she is qualified for the job." "Miranda does not take a course in discrete mathematics but she reads this book."
I - I 1! 7
Supplementary Exercises
ing the propositional function G(x , y), which represents "x is the grandmother of y."
10. Suppose that you meet three people, A, B , and C , o n the
island of knights and knaves described in Example 1 8 in Section 1 . 1 . What are A, B , and C if A says "I am a knave and B is a knight" and B says "Exactly one of the three of us is a knight"? 1 1 . (Adapted from [Sm78]) Suppose that on an island there are three types of people, knights, knaves, and normals. Knights always tell the truth, knaves always lie, and nor mals sometimes lie and sometimes tell the truth. Detec tives questioned three inhabitants of the island-Amy, Brenda, and Claire-as part of the investigation of a crime. The detectives knew that one ofthe three commit ted the crime, but not which one. They also knew that the criminal was a knight, and that the other two were not. Ad ditionally, the detectives recorded these statements: Amy: "I am innocent." Brenda: "What Amy says is true." Claire: "Brenda is not a normal." After analyzing their informa tion, the detectives positively identified the guilty party. Who was it? 12. Show that if is a proposition, where is the condi tional statement "If is true, then unicorns live," then "Unicorns live" is true. Show that it follows that can not be a proposition. (This paradox is known as Lob s
S S
S
S
paradox.)
13. Show that the argument with premises "The tooth fairy is
a real person" and "The truth fairy is not a real person" and conclusion "You can find gold at the end of the rainbow" is a valid argument. Does this show that the conclusion is true? 14. Let P (x) be the statement "student x knows calculus" and let Q(y) be the statement "class y contains a student who knows calculus." Express each ofthese as quantifications of P (x) and Q(y). a) Some students know calculus. b) Not every student knows calculus. c) Every class has a student in it who knows calculus. d) Every student in every class knows calculus. e) There is at least one class with no students who know calculus. 15. Let P (m , n) be the statement "m divides n ," where the
domain for both variables consists of all positive inte gers. (By "m divides n" we mean that n = for some integer Determine the truth values of each of these statements. b) P (2, 4) a) P(4, 5) d) 3m "In P (m , n) c) "1m "In P (m , n) e) 3n "1m P (m , n) 1) Vn P ( I , n) 16. Find a domain for the quantifiers in 3x 3y(x =1= y 1\ Vz «z =1= x) 1\ (z =1= y» such that this statement is true. 1 7. Find a domain for the quantifiers in 3x 3y(x =1= y 1\ Vz « z = x) V (z = y» ) such that this statement is false. 18. Use existential and universal quantifiers to express the statement "No one has more than three grandmothers" us-
k.)
km
107
19. Use existential and universal quantifiers to express the
20.
21.
22. 23.
24. 25. 26.
27.
28.
29.
30.
31.
statement "Everyone has exactly two biological parents" using the propositional function P (x , y), which repre sents "x is the biological parent of y." The quantifier 3n denotes "there exists exactly n ," so that 3n x P (x) means there exist exactly n values in the do main such that P (x) is true. Determine the true value of these statement where the domain consists of all real numbers. b) 3 \ x(lx l = 0) a) 30 x(x z = - 1 ) d) 3 3 x(x = Ix I ) c) 3 z x(x z = 2) Express each o f these statements using existential and universal quantifiers and propositional logic where 3 n is defined in Exercise 20. b) 3 \ x P (x) a) 3 0x P (x) d ) 3 3 X P (X) c) 3z x P (x) Let P (x , y) be a propositional function. Show that 3x Vy P (x , y) --+ Vy 3x P (x , y) is a tautology. Let P (x) and Q(x) be propositional functions. Show that 3x ( P (x) --+ Q(x» and "Ix P (x) --+ 3x Q(x) always have the same truth value. If Vy 3x P (x , y) is true, does it necessarily follow that 3x Vy P (x , y) is true? If "Ix 3y P (x , y) is true, does it necessarily follow that 3x Vy P (x , y) is true? Find the negations of these statements. a) If it snows today, then I will go skiing tomorrow. b) Every person in this class understands mathematical induction. c) Some students in this class do not like discrete math ematics. d) In every mathematics class there is some student who falls asleep during lectures. Express this statement using quantifiers: "Every student in this class has taken some course in every department in the school of mathematical sciences." Express this statement using quantifiers: "There is a build ing on the campus of some college in the United States in which every room is painted white." Express the statement "There is exactly one student in this class who has taken exactly one mathematics class at this school" using the uniqueness quantifier. Then express this statement using quantifiers, without using the uniqueness quantifier. Describe a rule of inference that can be used to prove that there are exactly two elements x and y in a domain such that P (x) and P (y) are true. Express this rule of inference as a statement in English. Use rules of inference to show that if the premises Vx( P (x) --+ Q(x» , Vx(Q(x) --+ R (x» , and -.R(a), where a is in the domain, are true, then the conclusion -'P (a) is true.
108
x 3 is irrational, then x is irrational. x is irrational and x 0, then .jX is irra tional. Prove that given a nonnegative integer n, there is a unique nonnegative integer m such that m 2 � n (m + 1 )2 . 10 Prove that there exists an integer m such that m 2 Is your proof constructive or nonconstructive?
32. Prove that if 33. Prove that if 34. 35.
/ - / 08
1 / The Foundations: Logic and Proofs
�
<
>
1 000
.
36. Prove that there is a positive integer that can be written ·
as the sum of squares of positive integers in two differ ent ways. (Use a computer or calculator to speed up your work.)
37. Disprove the statement that every positive integer is the
sum of the cubes of eight nonnegative integers.
38. Disprove the statement that every positive integer is the
sum of at most two squares and a cube of nonnegative integers.
39. Disprove the statement that every positive integer is the
sum of 36 fifth powers of nonnegative integers.
40. Assuming the truth of the theorem that states that ...;n is
n
irrational whenever is a positive integer that is not a perfect square, prove that ..fi + .../3 is irrational.
Computer Proj ects Write programs with the specified input and output.
1. Given the truth values of the propositions p and q, find the
truth values of the conjunction, disjunction, exclusive or, conditional statement, and biconditional of these proposi tions. 2. Given two bit strings of length find the bitwise AND, bitwise OR, and bitwise XOR of these strings. 3. Given the truth values of the propositions p and q in fuzzy logic, find the truth value of the disjunction and the con-
n,
junction of p and q (see Exercises 40 and 4 1 of Section 1 . 1 ). *4. Given positive integers and interactively play the game of Chomp. * 5. Given a portion of a checkerboard, look for tilings of this checkerboard with various types ofpolyominoes, including dominoes, the two types oftriominoes, and larger polyomi noes.
m n,
Computations and Explorations Use a computational program or programs you have written to do these exercises.
1 . Look for positive integers that are not the sum of the cubes
of nine different positive integers.
2. Look for positive integers greater than 79 that are not the
sum of the fourth powers of 1 8 positive integers.
3. Find as many positive integers as you can that can be writ-
ten as the sum of cubes of positive integers, in two different ways, sharing this property with 1 729. *4. Try to find winning strategies for the game of Chomp for different initial configurations of cookies. *5. Look for tilings of checkerboards and parts of checker boards with polynominoes.
Writing Proj ects Respond to these with essays using outside sources.
1 . Discuss logical paradoxes, including the paradox of Epi
menides the Cretan, Jourdain's card paradox, and the bar ber paradox, and how they are resolved. 2. Describe how fuzzy logic is being applied to practical ap plications. Consult one or more of the recent books on fuzzy logic written for general audiences. 3. Describe the basic rules of WFF 'N PROOF, developed by Layman Allen. Give examples of some of the games included in WFF 'N
of Modern Logic, PROOF.
The Game
4. Read some of the writings of Lewis Carroll on symbolic
logic. Describe in detail some of the models he used to represent logical arguments and the rules of inference he used in these arguments. 5. Extend the discussion of Prolog given in Section 1 .3, ex plaining in more depth how Prolog employs resolution. 6. Discuss some of the techniques used in computational logic, including Skolem's rule. 7. ''Automated theorem proving" is the task of using com puters to mechanically prove theorems. Discuss the goals
1 - 1 09
and applications of automated theorem proving and the progress made in developing automated theorem provers. 8. Describe how DNA computing has been used to solve instances of the satisfiability problem. 9. Discuss what is known about winning strategies in the game of Chomp.
Writing Projects
109
10. Describe various aspects of proof strategy discussed by
George P6lya in his writings on reasoning, including [P062], [P07 1 ] , and [P090] . 1 1 . Describe a few problems and results about tilings with polyominoes, as described in [G094] and [Ma9 1 ] , for example.
C H A P T E R
Basic Structures : Sets, Functions, Sequences, and Sums 2.1 Sets 2.2 Set Operations 2.3 Functions 2.4 Sequences and Summations
2.1
M
uch of discrete mathematics i s devoted to the study of discrete structures, used to represent discrete obj ects. Many important discrete structures are built using sets, which are collections of obj ects. Among the discrete structures built from sets are combinations, unordered collections of obj ects used extensively in counting; relations, sets of ordered pairs that represent relationships between obj ects; graphs, sets of vertices and edges that connect vertices; and finite state machines, used to model computing machines . These are some of the topics we will study in later chapters. The concept of a function is extremely important in discrete mathematics. A function assigns to each element of a set exactly one element of a set. Functions play important roles throughout discrete mathematics. They are used to represent the computational complexity of algorithms, to study the size of sets, to count obj ects, and in a myriad of other ways. Useful structures such as sequences and strings are special types of functions. In this chapter, we will introduce the notion of a sequence, which represents ordered lists of elements. We will introduce some important types of sequences, and we will address the problem of identifying a pattern for the terms of a sequence from its first few terms. Using the notion of a sequence, we will define what it means for a set to be countable, namely, that we can list all the elements of the set in a sequence. In our study of discrete mathematics, we will often add consecutive terms of a sequence of numbers. Because adding terms from a sequence, as well as other indexed sets of numbers, is such a common occurrence, a special notation has been developed for adding such terms . In this section, we will introduce the notation used to express summations. We will develop formulae for certain types of summations. Such summations appear throughout the study of discrete mathematics, as, for instance, when we analyze the number of steps a procedure uses to sort a list of numbers into increasing order.
Sets Introduction In this section, we study the fundamental discrete structure on which all other discrete structures are built, namely, the set. Sets are used to group obj ects together. Often, the obj ects in a set have similar properties. For instance, all the students who are currently enrolled in your school make up a set. Likewise, all the students currently taking a course in discrete mathematics at any school make up a set. In addition, those students enrolled in your school who are taking a course in discrete mathematics form a set that can be obtained by taking the elements common to the first two collections. The language of sets is a means to study such collections in an organized fashion. We now provide a definition of a set. This definition is an intuitive definition, which is not part of a formal theory of sets.
DEFINITION 1
Lijfill �m 2- /
A
set is an unordered collection of obj ects.
Note that the term object has been used without specifying what an obj ect is. This description of a set as a collection of obj ects, based on the intuitive notion of an obj ect, was first stated by the German mathematician Georg Cantor in 1 89 5 . The theory that results from this intuitive 111
112
2 / Basic Structures: Sets, Functions, Sequences, and Sums
Assessmenl
�
DEFINITION 2
2-2
definition of a set, and the use of the intuitive notion that any property whatever there is a set consisting of exactly the objects with this property, leads to paradoxes, or logical inconsistencies. This was shown by the English philosopher Bertrand Russell in 1 902 (see Exercise 38 for a description of one of these paradoxes). These logical inconsistencies can be avoided by building set theory beginning with axioms. We will use Cantor's original version of set theory, known as naive set theory, without developing an axiomatic version of set theory, because all sets considered in this book can be treated consistently using Cantor's original theory. The objects in a set are called the elements, or members, of the set. A set is said to contain its elements. We will now introduce notation used to describe membership in sets. We write a E A to denote that a is an element of the set A . The notation a f/ A denotes that a is not an element of the set A. Note that lowercase letters are usually used to denote elements of sets. There are several ways to describe a set. One way is to list all the members of a set, when this is possible. We use a notation where all members of the set are listed between braces. For example, the notation {a , b, c , d} represents the set with the four elements a , b, c, and d.
EXAMPLE 1
The set V of all vowels in the English alphabet can be written as V = {a , e, i,
EXAMPLE 2
The set 0 of odd positive integers less than 1 0 can be expressed by 0
EXAMPLE 3
Although sets are usually used to group together elements with common properties, there is nothing that prevents a set from having seemingly unrelated elements. For instance, {a , 2, Fred, New Jersey} is the set containing the four elements a, 2, Fred, and New Jersey. ....
,=
0,
u}.
....
{ l , 3 , 5 , 7, 9}.
Sometimes the brace notation is used to describe a set without listing all its members. Some members of the set are listed, and then ellipses ( . . . ) are used when the general pattern of the elements is obvious.
EXAMPLE 4 Extra � Examples �
The set of positive integers less than 1 00 can be denoted by { I , 2, 3 , . . . , 99} . Another way to describe a set is to use set builder notation. We characterize all those elements in the set by stating the property or properties they must have to be members. For instance, the set 0 of all odd positive integers less than 1 0 can be written as o
= {x I x is an odd positive integer less than 1 O} ,
or, specifying the universe a s the set o f positive integers, as o
= {x
E
Z+ I x is odd and x
<
1O}.
We often use this type o f notation to describe sets when it i s impossible to list all the elements of the set. For instance, the set Q + of all positive rational numbers can be written as
Q + = {x
E
R I x = p / q , for some positive integers p and q .
These sets, each denoted using a boldface letter, play an important role in discrete mathematics: N = {O, 1 , 2, 3 , . . . }, the set of natural numbers Z = { . . , -2, - 1 , 0, 1 , 2, . . . } , the set of integers .
2-3
2. 1 Sets
1 13
z+ = { I , 2, 3, . . . } , the set of positive integers Q = {p/q I p E Z, q E Z, and q =I=- OJ, the set of rational numbers R, the set of real numbers (Note that some people do not consider 0 a natural number, so be careful to check how the term natural numbers is used when you read other books.) Sets can have other sets as members, as Example 5 illustrates.
EXAMPLE 5
The set {N, Z, Q , R } is a set containing four elements, each of which is a set. The four elements of this set are N, the set of natural numbers; Z, the set of integers; Q, the set of rational numbers; ..... and R, the set of real numbers. Remark: Note that the concept of a datatype, or type, in computer science is built upon the concept of a set. In particular, a datatype or type is the name of a set, together with a set of
operations that can be performed on objects from that set. For example, boolean is the name of the set {O, I } together with operators on one or more elements of this set, such as AND, OR, and NOT. Because many mathematical statements assert that two differently specified collections of objects are really the same set, we need to understand what it means for two sets to be equal.
DEFINITION 3
EXAMPLE 6
Two sets are equal if and only if they have the same elements. That is, if A and B are sets, then A and B are equal if and only ifYx (x E A ++ x E B). We write A = B if A and B are equal sets.
The sets { l , 3 , 5 } and { 3 , 5 , I } are equal, because they have the same elements. Note that the order in which the elements of a set are listed does not matter. Note also that it does not matter if an element of a set is listed more than once, so { I , 3 , 3 , 3 , 5 , 5 , 5 , 5 } is the same as the set ..... { l , 3 , 5 } because they have the same elements. Sets can be represented graphically using Venn diagrams, named after the English mathe matician John Venn, who introduced their use in 1 88 1 . In Venn diagrams the universal set U, which contains all the objects under consideration, is represented by a rectangle. (Note that the universal set varies depending on which objects are of interest.) Inside this rectangle, circles or other geometrical figures are used to represent sets. Sometimes points are used to represent the particular elements of the set. Venn diagrams are often used to indicate the relationships between sets. We show how a Venn diagram can be used in Example 7.
GEORG CANTOR ( 1 845-1 9 1 8) Georg Cantor was born i n St. Petersburg, Russia, where his father was a successful merchant. Cantor developed his interest in mathematics in his teens. He began his university studies in Zurich in 1 862, but when his father died he left Zurich. He continued his university studies at the University of Berlin in 1 863, where he studied under the eminent mathematicians Weierstrass, Kummer, and Kronecker. He received his doctor's degree in 1 867, after having written a dissertation on number theory. Cantor assumed a position at the University of Halle in 1 869, where he continued working until his death. Cantor is considered the founder of set theory. His contributions in this area include the discovery that the set of real numbers is uncountable. He is also noted for his many important contributions to analysis. Cantor also was interested in philosophy and wrote papers relating his theory of sets with metaphysics. Cantor married in 1 874 and had five children. His melancholy temperament was balanced by his wife's happy disposition. Although he received a large inheritance from his father, he was poorly paid as a professor. To mitigate this, he tried to obtain a better-paying position at the University of Berlin. His appointment there was blocked by Kronecker, who did not agree with Cantor's views on set theory. Cantor suffered from mental illness throughout the later years of his life. He died in 1 9 1 8 in a psychiatric clinic.
114
2 / Basic Structures: Sets, Functions, Sequences, and Sums
2-4
u
a
•
v
o•
Venn Diagram for the Set of Vowels.
F IGURE 1
EXAMPLE 7
Draw a Venn diagram that represents V, the set of vowels in the English alphabet.
Solution: We draw a rectangle to indicate the universal set U, which is the set of the 26 letters of the English alphabet. Inside this rectangle we draw a circle to represent V. Inside this circle ... we indicate the elements of V with points (see Figure 1). There is a special set that has no elements. This set is called the empty set, or null set, and is denoted by 0. The empty set can also be denoted by { } (that is, we represent the empty set with a pair of braces that encloses all the elements in this set). Often, a set of elements with certain properties turns out to be the null set. For instance, the set of all positive integers that are greater than their squares is the null set. A set with one element is called a singleton set. A common error is to confuse the empty set 0 with the set {0}, which is a singleton set. The single element of the set {0} is the empty set itself1 A useful analogy for remembering this difference is to think of folders in a computer file system. The empty set can be thought of as an empty folder and the set consisting of just the empty set can be thought of as a folder with exactly one folder inside, namely, the empty folder.
DEFINITION 4
The set A is said to be a subset of B if and only if every element of A is also an element of B . We use the notation A 5; B to indicate that A is a subset of the set B .
We see that A 5; B i f and only if the quantification
Yx(x
E
A
--+
x
E
B)
is true.
unkS � BERTRAND RUSSELL ( 1 872- 1 970) Bertrand Russell was born into a prominent English family active in the progressive movement and having a strong commitment to liberty. He became an orphan at an early age and was placed in the care of his father's parents, who had him educated at home. He entered Trinity College, Cambridge, in 1 890, where he excelled in mathematics and in moral science. He won a fellowship on the basis of his work on the foundations of geometry. In 1 9 1 0 Trinity College appointed him to a lectureship in logic and the philosophy of mathematics. Russell fought for progressive causes throughout his life. He held strong pacifist views, and his protests against World War I led to dismissal from his position at Trinity College. He was imprisoned for 6 months in 1 9 1 8 because of an article he wrote that was branded as seditious. Russell fought for women's suffrage in Great Britain. In 1 96 1 , at the age of 89, he was imprisoned for the second time for his protests advocating nuclear disarmament. Russell's greatest work was in his development of principles that could be used as a foundation for all of mathematics. His most famous work is Principia Mathematica, written with Alfred North Whitehead, which attempts to deduce all of mathematics using a set of primitive axioms. He wrote many books on philosophy, physics, and his political ideas. Russell won the Nobel Prize for literature in 1 950.
2-5
2.1 Sets
115
u
FIGURE 2
EXAMPLE 8
Venn Diagram Showing that A Is a Subset of B.
The set of all odd positive integers less than 1 0 is a subset of the set of all positive integers less than 1 0, the set of rational numbers is a subset of the set of real numbers, the set of all computer science majors at your school is a subset of the set of all students at your school, and the set of all people in China is a subset of the set of all people in China (that is, it is a subset of .... itself). Theorem 1 shows that every nonempty set S is guaranteed to have at least two subsets, the empty set and the set S itself, that is, 0 � S and S � S.
THEOREM !
For every set S, (i ) 0 � S and (ii ) S � S.
Proof: We will prove (i ) and leave the proof of (ii ) as an exercise. Let S be a set. To show that 0 � S, we must show that Yx(x E 0 -+ X E S) is true. Because the empty set contains no elements, it follows that x E 0 is always false. It follows that the conditional statement x E 0 -+ X E S is always true, because its hypothesis is always false and a conditional statement with a false hypothesis is true. That is, Yx(x E 0 -+ X E S) is true. This <J completes the proof of (i). Note that this is an example of a vacuous proof. When we wish to emphasize that a set A is a subset of the set B but that A =F B, we write A C B and say that A is a proper subset of B. For A C B to be true, it must be the case that A � B and there must exist an element x of B that is not an element of A . That is, A is a proper subset of B if YX(X E A
-+
x E B) !\ 3x(x E B
!\
x f/ A)
is true. Venn diagrams can be used to illustrate that a set A is a subset of a set B. We draw the universal set U as a rectangle. Within this rectangle we draw a circle for B. Because A is a subset of B, we draw the circle for A within the circle for B. This relationship is shown in Figure 2.
unkS � JOHN V E N N ( 1 834- 1 923) John Venn was born into a London suburban family noted for its philanthropy. He attended London schools and got his mathematics degree from Caius College, Cambridge, in 1 857. He was elected a fellow of this college and held his fellowship there until his death. He took holy orders in 1 859 and, after a brief stint of religious work, returned to Cambridge, where he developed programs in the moral sciences. Besides his mathematical work, Venn had an interest in history and wrote extensively about his college and family. Venn's book Symbolic Logic clarifies ideas originally presented by Boole. In this book, Venn presents a systematic development of a method that uses geometric figures, known now as diagrams. Today these diagrams are primarily used to analyze logical arguments and to illustrate relationships between sets. In addition to his work on symbolic logic, Venn made contributions to probability theory described in his widely used textbook on that subject.
foenn
116
2-6
2 / Basic Structures: Sets, Functions, Sequences, and Sums
One way to show that two sets have the same elements is to show that each set is a subset of the other. In other words, we can show that if A and B are sets with A S; B and B S; A, then A = B. This turns out to be a useful way to show that two sets are equal. That is, A = B, where A and B are sets, if and only ifYx(x E A � x E B ) and Yx(x E B � x E A), or equivalently if and only if Yx(x E A ++ x E B). Sets may have other sets as members. For instance, we have the sets A = {0 , {a } , {b } , {a , b } }
and
B = {x I x is a subset of the set {a , b } } .
Note that these two sets are equal, that is, A = B . Also note that {a } E A , but a rI. A. Sets are used extensively in counting problems, and for such applications we need to discuss the size of sets.
DEFINITION 5
EXAMPLE 9
distinct elements in S �re n is a nonnegative integer, we say that S is a finite set and that n is the cardinality of S. The cardinality of S is denoted by I S I.
Let S be a set. Ifthere are exactly n
Let A be the set of odd positive integers less than 1 0. Then I A I = 5 .
EXAMPLE 10
Let S be the set of letiers in the English alphabet. Then l S I = 26.
EXAMPLE 11
Because the null set has no elements, it follows that 1 0 1 = 0. We will also be interested in sets that are not finite.
DEFINITION 6
EXAMPLE 12
A set is said to be infinite if it is not finite.
The set of positive integers is infinite. The cardinality of infinite sets will be discussed in Section 2.4.
The Power Set Many problems involve testing all combinations of elements of a set to see if they satisfy some property. To consider all such combinations of elements of a set S, we build a new set that has as its members all the subsets of S .
DEFINITION 7
EXAMPLE 13
a set S, the power set of S is the set of all su\?sets of the set S. The power set of S is denoted by P(S).
Given
What is the power set of the set to, 1 , 2 }?
Solution: The power set P ({O , I , 2}) is the set of all subsets of to , 1 , 2 } . Hence, P ({O , 1 , 2 } ) = {0 , to}, { l } , {2} , to, 1 } , to, 2 } , { l , 2 } , to, 1 , 2 } } .
Note that the empty set and the set itself are members of this set of subsets.
2. 1 Sets
2- 7
EXAMPLE 14
1 17
What is the power set of the empty set? What is the power set of the set {0}?
Solution: The empty set has exactly one subset, namely, itself. Consequently, P (0) = {0} . The set {0} has exactly two subsets, namely, 0 and the set {0} itself. Therefore, P ({0}) = {0, {0} } . If a set has n elements, then its power set has 2n elements. We will demonstrate this fact in several ways in subsequent sections of the text.
Cartesian Prod uets The order of elements in a collection is often important. Because sets are unordered, a different structure is needed to represent ordered collections. This is provided by ordered n-tuples.
DEFINITION 8
The ordered n-tuple (aI , a2 , . . . , an ) is the ordered collection that has al as its first element, a2 as its second element, . . . , and an as its nth element.
We say that two ordered n -tuples are equal if and only if each corresponding pair of their elements is equal. In other words, (a I , a2 , . . . , an ) = (bl , b2 , , bn ) if and only if ai = bi , for i = 1 , 2, . . . , n . In particular, 2-tuples are called ordered pairs. The ordered pairs (a , b) and (c, d) are equal if and only if a = c and b = d. Note that (a , b) and (b, a) are not equal unless a = b. Many of the discrete structures we will study in later chapters are based on the notion of the Cartesian product of sets (named after Rene Descartes). We first define the Cartesian product of two sets. •
unks�
.
•
RENE DESCARTES ( 1 596- 1 650) Rene Descartes was born into a noble family near Tours, France, about
200 miles southwest of Paris. He was the third child of his father's first wife; she died several days after his
birth. Because of Rene1s poor health, his father, a provincial judge, let his son's formal lessons slide until, at the age of 8, Rene entered the Jesuit college at La Fleche. The rector of the school took a liking to him and permitted him to stay in bed until late in the morning because of his frail health. From then on, Descartes spent his mornings in bed; he considered these times his most productive hours for thinking. Descartes left school in 1 6 1 2, moving to Paris, where he spent 2 years studying mathematics. He earned a law degree in 1 6 1 6 from the University of Poitiers. At 1 8 Descartes became disgusted with studying and decided to see the world. He moved to Paris and became a successful gambler. However, he grew tired of bawdy living and moved to the suburb of Saint-Germain, where he devoted himself to mathematical study. When his gambling friends found him, he decided to leave France and undertake a military career. However, he never did any fighting. One day, while escaping the cold in an overheated room at a military encampment, he had several feverish dreams, which revealed his future career as a mathematician and philosopher. After ending his military career, he traveled throughout Europe. He then spent several years in Paris, where he studied mathemat ics and philosophy and constructed optical instruments. Descartes decided to move to Holland, where he spent 20 years wandering around the country, accomplishing his most important work. During this time he wrote several books, including the Discours, which contains his contributions to analytic geometry, for which he is best known. He also made fundamental contributions to philosophy. In 1 649 Descartes was invited by Queen Christina to visit her court in Sweden to tutor her in philosophy. Although he was reluctant to live in what he called "the land of bears amongst rocks and ice," he finally accepted the invitation and moved to Sweden. Unfortunately, the winter of 1 649-1 650 was extremely bitter. Descartes caught pneumonia and died in mid-February.
1 18
2-8
2 / Basic Structures: Sets, Functions, Sequences, and Sums
DEFINITION 9
Let A and B be sets. The Cartesian product of A and B, denoted by A x B, is the set of all ordered pairs (a , b), where a E A and b E B . Hence, A x B = { (a , b ) I a E A
/\
b E B}.
EXAMPLE 15
Let A represent the set of all students at a university, and let B represent the set of all courses offered at the university. What is the Cartesian product A x B?
E.IItra � Examples llliiiil
Solution: The Cartesian product A x B consists of all the ordered pairs ofthe form (a , b ) , where a is a student at the university and b is a course offered at the university. The set A x B can be
....
used to represent all possible enrollments of students in courses at the university. EXAMPLE 16
What is the Cartesian product of A = { I , 2} and B = {a , b , c}? Solution: The Cartesian product A x B is
A x B = {(1 , a), ( 1 , b ), ( 1 , c), (2, a), (2, b), (2, c)} . A subset R of the Cartesian product A x B i s called a relation from the set A to the set B . The elements of R are ordered pairs, where the first element belongs to A and the second to B . For example, R = {(a , 0), (a , 1 ), (a , 3), (b , 1 ), (b , 2), (c, 0), (c, 3)} is a relation from the set {a , b , c} to the set to, 1 , 2, 3 } . We will study relations at length in Chapter 8. The Cartesian products A x B and B x A are not equal, unless A = 0 or B ::::;:: 0 (so that A x B = 0) or A = B (see Exercises 26 and 30, at the end of this section). This is illustrated in Example 1 7. EXAMPLE 17
Show that the Cartesian product B x A is not equal to the Cartesian product A x B , where A and B are as in Example 1 6. Solution: The Cartesian product B x A is
B x A = {(a , I ), (a , 2), (b, I), (b, 2), (c, 1 ), (c, 2)} . This is not equal to A x B , which was found in Example 1 6. The Cartesian product of more than two sets can also be defined. DEFINITION 10
The Cartesian product of the sets A ) , A2 , . . , A n , denoted by A) x A2 X i s the set of ordered n -tuples (a) , a2 , . . . , an ), where aj belongs to Aj 1 , 2, . . . , n . In other words, .
A)
EXAMPLE 18
X
•
•
•
x
for
An , i =
A2 x · · · x An = {(a) , a2 , . . . , an ) I aj E Ai for i = 1 , 2, . . , n }. .
What is the Cartesian product A x B x C , where A = to, l }, B = { I , 2}, and C = to, 1 , 2}? Solution: The Cartesian product A x B x C consists of all ordered triples (a , b , c), where a E A, b E B, and c E C. Hence,
A x B x C = { CO, 1 , 0), (0, I , I ), (0, 1 , 2), (0, 2, 0), (0, 2, I ) , (0, 2, 2), ( 1 , 1 , 0), ( 1 , 1 , 1 ), .... ( 1 , 1 , 2), ( 1 , 2, 0), ( 1 , 2, 1 ), ( 1 , 2, 2)} .
2. 1 Sets
2-9
1 19
Using Set Notation with Quantifiers Sometimes we restrict the domain of a quantified statement explicitly by making use of a particular notation. For example, VXE S (P (x» denotes the universal quantification of P (x) over all elements in the set S . In other words, VXE S(P(x» is shorthand for Vx(x E S -+ P (x» . Similarly, 3xE S(P(x » denotes the existential quantification of P (x ) over all elements in S. That is, 3XE S(P(x» is shorthand for 3x (x E S 1\ P (x » .
EXAMPLE 19
What do the statements VXE R (x 2 ::: 0) and 3XE Z (x 2 = 1) mean?
Solution: The statement VXE R(x 2 ::: 0) states that for every real number x , x 2 ::: O. This state ment can be expressed as "The square of every real number is nonnegative." This is a true statement. The statement 3XE Z(x 2 = 1 ) states that there exists an integer x such that x 2 = 1 . This statement can be expressed as "There is an integer whose square is I ." This is also a true -
Truth Sets of Quantifiers We will now tie together concepts from set theory and from predicate logic. Given a predicate P , and a domain D, we define the truth set of P to be the set of elements x in D for which P (x) is true. The truth set of P (x) is denoted by {x E D I P (x ) } .
EXAMPLE 20
What are the truth sets of the predicates P (x), Q (x), and R (x), where the domain is the set of integers and P (x) is "Ix l = 1 ," Q(x) is "x 2 = 2," and R (x ) is " Ix l = x ."
Solution: The truth set of P , {x E Z I Ix I = I } , is the set of integers for which Ix I = 1 . Because Ix I = 1 when x = 1 or x = - 1 , and for no other integers x , we see that the truth set of P is the set { - I , I } . The truth set of Q , { x E Z I x 2 = 2} , is the set of integers for which x 2 = 2. This is the empty set because there are no integers x for which x 2 = 2. The truth set of R , {x E Z I lx l = x}, is the set of integers for which Ix l = x. Because Ix l = x if and only if x ::: 0, it follows that the truth set of R is N, the set of nonnegative -
Exercises 1 . List the members of these sets. a) {x I x is a real number such that x 2 = I } b) {x I x is a positive integer less than 1 2 } c ) { x I x i s the square o f an integer and x < 1 00} d) {x I x is an integer such that x 2 = 2} 2. Use set builder notation to give a description of each of
these sets. a) to, 3 , 6, 9, I2} b ) { - 3 , -2, - I , O, I , 2, 3 } c) {m , n , 0 , p } 3 . Determine whether each of these pairs o f sets are equal.
a) { I , 3 , 3 , 3 , 5, 5, 5 , 5, 5 } , {5 , 3 , I } c) 0, {0} b) { { I } } , { I , { I } } 4. Suppose that A = {2, 4, 6}, B = {2, 6}, C = {4, 6}, and D = {4, 6, 8 } . Determine which of these sets are subsets
of which other of these sets.
5. For each of the following sets, determine whether 2 is an
element of that set. a) {x E I x is an integer greater than I } b) {x E I x is the square of an integer} d) {{2} , { { 2 } } } c) {2, {2}} 1) { { {2} } } e ) ({2} , {2, {2} } }
RR
2 / Basic Structures: Sets, Functions, Sequences, and Sums
120
6. For each of the sets in Exercise 5, determine whether
is an element of that set.
{2}
2-10
22. Determine whether each of these sets is the power set of
a) c)
7. Determine whether each of these statements is true or
false.
b) 00 E {OJ{OJ {OJ{OJE E0 {OJ0 {OJ {OJ {0} {0} Determine whether these statements are true or false. b) 0{0}E E{0,{{0}} {0}} 0{0}E E{0}{0} {0} {{0}} {0, {0}} {{0}} {0,{{0},{0}}{0}}
a) 0 c) e) g) 8.
a) c) e) g)
d) t)
c
c
C
S;
C
d) t)
c
false.
a) d)
x{x}E E{x}{{x}} b) 0{x} {x}{x} S;
e)
S;
c) t)
0{x}E E{x}{x}
10. Use a Venn diagram to illustrate the subset of odd
integers in the set of all positive integers not exceeding 1 0. 1 1 . Use a Venn diagram to illustrate the set of all months of
the year whose names do not contain the letter set of all months of the year.
R in the
12. Use a Venn diagram to illustrate the relationship A S; B and B S; C .
13. Use a Venn diagram to illustrate the relationships A C B and B c C .
14. Use a Venn diagram to illustrate the relationships A C B and A C C .
15. Suppose that A , B , and C are sets such that A S; B and B S; C . Show that A S; C .
E B and A B . What is the cardinality of each of these sets? {a}{a, {a}} b) {a,{{a}}{a}, {a, {am S;
16. Find two sets A and B such that A 17.
a) c)
d)
18. What is the cardinality of each of these sets?
19.
0{0, {0}}
b) {0,{0} {0}, {0, {0m Find the power set of each of these sets, where a and b are distinct elements. {0, {0}} {a} b) {a, b} a) c)
d)
a)
c)
20. Can you conclude that A = B if A and B are two sets
with the same power set?
2 1 . How many elements does each of these sets have where
a and b are distinct elements? b, {a,{a},b}})({a}}}) b) P({a, P({0, a, P(P(0» a)
c)
23.
d)
=
a)
=
c,
x
x
24. What is the Cartesian product A x B , where A is
25.
C
9 . Determine whether each o f these statements i s true or
a and b are distinct elements. b) {0,{0, {a}, {a}} {b}, {a, b}} {0,0 {a}, {0, a}} Let A {a, b, d} and B {y, z}. Find A B b) B A a set, where
26. 27. 28.
the set of courses offered by the mathematics depart ment at a university and B is the set of mathematics pro fessors at this university? What is the Cartesian product A x B x C , where A is the set of all airlines and B and C are both the set of all cities in the United States? Suppose that A x B = where A and B are sets. What can you conclude? Let A be a set. Show that x A = A x = Let A = c ,B = and C = I . Find a) A x B x C C x B x A d) B x B x B c) C x A x B
0,
0 {x, y}, b)
{a, b, }
0 0. {O, }
29. How many different elements does A x B have if A has m elements and B has n elements? 30. Show that A x B =1= B x A, when A and B are nonempty, unless A = B . 3 1 . Explain why A x B x C and ( A x B) x C are no� the same. 32. Explain why (A x B) x (C x D) and A x (B x C) x D are not the same. 33. Translate each of these quantifications into English and determine its truth value. a) =1= - 1 ) = c) 0) d) =
2 'IxER(x 'IxE Z (x2 >
b) 3XE 3XEZR (x(x22 2)x)
34. Translate each of these quantifications into English and
determine its truth value. a) = - 1) c) -1
3XER(x(X 3 E Z) 'IxEZ
b) 'IxE 3xEZZ (x(x 2+ElZ)> x)
d)
35. Find the truth set of each of these predicates where the
domain is the set of integers. a) c)
2 P(x : "X ) R(x): "2x + 1
< 3"
=
0"
b) Q(x): "X 2 > x"
36. Find the truth set of each of these predicates where the
domain is the set of integers. a) 2': I "
3 P(x b) R(x): Q(x):) : "x"x"X 2 x2"2" c)
=
<
*37. The defining property of an ordered pair is that two or
dered pairs are equal if and only if their first elements are equal and their second elements are equal. Surpris ingly, instead of taking the ordered pair as a primitive con cept, we can construct ordered pairs using basic notions from set theory. Show that if we define the ordered pair = (c, to be then if and only
(a, b)
{{a}, {a, b}}, (a, b)
d)
2.2 Set Operations
2-1 1
[Hint:if aFirste show that {{a}, {a, b}} {{e},a {e,e and d}} ifb andd.only and b d. ] This exercise presents Let S be the set that contains a set x if the set x does not belong to itself, so that S {x I x ¢ x}. Show the assumption that S is a member of S leads to a contradiction. if
*38.
=
=
=
=
=
Russell's paradox.
=
:J
a)
121
b) toShow the assumption that S is not a member of S leads a contradiction. By parts (a) and (b) it follows that the set S cannot be de fined as it was. This paradox can be avoided by restricting
the types of elements that sets can have.
*39. Describe a procedure for listing all the subsets of a finite
set.
2.2 Set O p erations Introduction
unks� DEFINITION 1
Two sets can be combined in many different ways. For instance, starting with the set of mathe matics majors at your school and the set of computer science majors at your school, we can form the set of students who are mathematics majors or computer science majors, the set of students who are joint majors in mathematics and computer science, the set of all students not majoring in mathematics, and so on. Let A and B be sets. The union ofthe sets A and B , denoted by A U B , is the set that contains those elements that are either in A or in B , or in both.
An element x belongs to the union of the sets A and B if and only if x belongs to A or x belongs to B . This tells us that A U B = {x I x
E
A
v
x
E
B}.
The Venn diagram shown in Figure 1 represents the union of two sets A and B . The area that represents A U B is the shaded area within either the circle representing A or the circle representing B . We will give some examples of the union of sets.
EXAMPLE 1
The union of the sets { l , 3 , 5 } and { l , 2 , 3 } is the set { l , 2, 3 , 5 } ; that is, { l , 3 , 5 } U { l , 2, 3 } = � { l , 2, 3 , 5 } .
EXAMPLE 2
The union ofthe set of all computer science majors at your school and the set of all mathematics majors at your school is the set of students at your school who are majoring either in mathematics � or in computer science (or in both).
DEFINITION 2
Let A and B be sets. The intersection of the sets A and B , denoted by A n B , is the set containing those elements in both A and B .
An element x belongs to the intersection of the sets A and B if and only if x belongs to A and x belongs to B . This tells us that A n B = {x I x
E
A
1\ X E
B}.
122
2 / Basic Structures: Sets, Functions, Sequences, and Sums
2- 1 2
u
A UB
is shaded.
FIGURE 1 Venn Diagram Representing the Union of A and B.
A nB
is shaded.
FIGURE 2 Venn Diagram Representing the Intersection of A and B.
The Venn diagram shown in Figure 2 represents the intersection oftwo sets A and B . The shaded area that is within both the circles representing the sets A and B is the area that represents the intersection of A and B . We give some examples of the intersection of sets. EXAMPLE 3
The intersection of the sets { I , 3 , 5 } and { I , 2, 3 } is the set { I , 3 } ; that is, { I , 3 , 5 } n { l , 2, 3} = .... { I , 3}.
EXAMPLE 4
The intersection of the set of all computer science majors at your school and the set of all mathematics majors is the set of all students who are joint majors in mathematics and computer .... science.
DEFINITION 3
EXAMPLE 5
Two sets are called disjoint if their intersection is the empty set.
Let A = { I , 3 , 5 , 7, 9} and B = {2, 4, 6, 8 , 1 O} . Because A n B = 0, A and B are disjoint. .... We often are interested in finding the cardinality of a union of two finite sets A and B . Note that I A I + I B I counts each element that is in A but not in B or in B but not in A exactly once, and each element that is in both A and B exactly twice. Thus, if the number of elements that are in both A and B is subtracted from I A I + I B I , elements in A n B will be counted only once. Hence, IA U B I = IAI + IB I - IA n B I . The generalization of this result to unions of an arbitrary number of sets is called the principle of inclusion-exclusion. The principle of inclusion-exclusion is an important technique used in enumeration. We will discuss this principle and other counting techniques in detail in Chapters 5 and 7. There are other important ways to combine sets.
2-13
2.2 Set Operations
1 23
--��-'-----,
u
B
A -B
A is shaded.
is shaded.
FIGURE 3 Venn Diagram for the Difference of A and B.
DEFINITION 4
FIGURE 4 Venn Diagram for the Complement of the Set A.
Let A and B be sets. The difference of A and B , denoted by A B , is the set containing those elements that are in A but not in B . The difference of A and B is also called the complement ofB with respect to A. -
An element x belongs to the difference of A and B if and only if x that A - B = {x I x
E
A
/\
E
A and x rt B . This tells us
x rt B } .
The Venn diagram shown in Figure 3 represents the difference of the sets A and B . The shaded area inside the circle that represents A and outside the circle that represents B is the area that represents A - B . We give some examples of differences of sets. EXAMPLE 6
The difference of { I , 3 , 5 } and { l , 2 , 3 } is the set { 5 } ; that is, { l , 3 , 5 } - { l , 2, 3 } = { 5 } . This is .... different from the difference of { I , 2, 3} and { I , 3 , 5 } , which is the set {2}.
EXAMPLE 7
The difference of the set of computer science majors at your school and the set of mathematics majors at your school is the set of all computer science majors at your school who are not also .... mathematics majors. Once the universal set U has been specified, the complement of a set can be defined.
DEFINITION 5
Let U be the universal set. The complement of the set A , denoted by A , is the complement of A with respect to U . In other words, the complement of the set A is U A . -
An element belongs to A i f and only i f x rt A . This tells u s that
if = {x I x rt A } . In Figure 4 the shaded area outside the circle representing A i s the area representing A. We give some examples of the complement of a set.
124
2- 1 4
2 / Basic Structures: Sets, Functions, Sequences, and Swns TABLE 1 Set Identities. Identity
Name
AU0=A AnU=A
Identity laws
AUU =U An0=0
Domination laws
AUA=A AnA=A
Idempotent laws
-
(:4) = A
Complementation law
AUB =BUA AnB =BnA
Commutative laws
A U (B U C) = (A U B) U C A n (B n C) = (A n B) n C
Associative laws
A n (B U C) = (A n B) U (A n C) A U (B n C) = (A U B) n (A U C)
Distributive laws
A U B = An B A n B = AU B
De Morgan's laws
A U (A n B) = A A n (A U B) = A
Absorption laws
A U A= U A n A= 0
Complement laws
EXAMPLE 8
Let A = {a , e, i , 0, u } (where the universal set is the set of letters of the English alpha� bet). Then A = {b , c , d, j, g, h , j, k , I , m , n , p , q , r, S , t , v , w , x , y , z } .
EXAMPLE 9
Let A be the set of positive integers greater than 1 0 (with universal set the set of all positive � integers). Then A = { l , 2, 3 , 4, 5 , 6, 7, 8 , 9 , l O} .
Set Identities Table 1 lists the most important set identities. We will prove several of these identities here, using three different methods. These methods are presented to illustrate that there are often many different approaches to the solution of a problem. The proofs of the remaining identities will be left as exercises. The reader should note the similarity between these set identities and the logical equivalences discussed in Section 1 .2. In fact, the set identities given can be proved directly from the corresponding logical equivalences. Furthermore, both are special cases of identities that hold for Boolean algebra (discussed in Chapter 1 1 ). One way to show that two sets are equal is to show that each is a subset of the other. Recall that to show that one set is a subset of a second set, we can show that if an element belongs to the first set, then it must also belong to the second set. We generally use a direct proof to do this. We illustrate this type of proof by establishing the second of De Morgan's laws.
2-15
2.2 Set Operations
EXAMPLE 10
125
Prove that A n B = A U B .
Solution: To show that A n B = A U B , we will show that A n B S; A U B and that A U B S; A n B. First, we will show that A n B S; A U B . So suppose that x E A n B . By the definition of complement, x f/ A n B . By the definition of intersection, -.( (x E A) /\ (x E B» is true. Applying De Morgan's law (from logic), we see that -.(x E A) or -.(x E B). Hence, by the definition of negation, x f/ A or x f/ B . By the definition of complement, x E A or x E B . It follows by the definition of union that x E A U B . This shows that A n B S; if U B . Next, we will show that if U B S; A n B . Now suppose that x E A U B . By the definition of union, x E A or x E B . Using the definition of complement, we see that x f/ A or x f/ B . Consequently, -.(x E A ) V -.(x E B ) i s true. By De Morgan's law (from logic), we conclude that -.«x E A) /\ (x E B» is true. By the definition of intersection, it follows that -.(x E A n B) holds. We use the definition of complement to conclude that x E A n B . This shows that if U B � A n B . Because we have shown that each set is a subset of the other, the two sets ... are equal, and the identity is proved. We can more succinctly express the reasoning used in Example l O using set builder notation, as Example 1 1 illustrates. EXAMPLE 11
Use set builder notation and logical equivalences to establish the second De Morgan law A n B = A U B.
Solution: We can prove this identity with the following steps. AnB = = = = =
{x {x {x {x {x
I I I I I
x fJ. A n B } -.(x E ( A n B » } -.(x E A /\ X E B)} -.(x E A) V -.(x E B)} x fJ. A V x fJ. B }
by definition of complement by definition of does not belong symbol by definition of intersection by the first De Morgan law for logical equivalences by definition of does not belong symbol
= {x I x E A V X E B } = {x I x E A U B }
by definition of union
=AUB
by meaning of set builder notation
by definition of complement
Note that besides the definitions of complement, union, set membership, and set builder ... notation, this proof uses the first De Morgan law for logical equivalences. Proving a set identity involving more than two sets by showing each side of the identity is a subset of the other often requires that we keep track of different cases, as illustrated by the proof in Example 1 2 of one of the distributive laws for sets. EXAMPLE 12
Prove the first distributive law from Table 1 , which states that A n (B U C) = (A n B) U (A n C) for all sets A, B , and C .
Solution: We will prove this identity by showing that each side is a subset of the other side. Suppose that x E A n (B U C) . Then x E A and x E B U C . By the definition of union, it follows that x E A, and x E B or x E C (or both). Consequently, we know that x E A and x E B or that x E A and x E C . By the definition of intersection, it follows that x E A n B or x E A n C . Using the definition of union, we conclude that x E (A n B ) U (A n C) . We conclude that A n (B U C) S; (A n B ) U (A n C) .
126
2- 1 6
2 / Basic Structures: Sets, Functions, Sequences, and Swns
TABLE 2 A Membership Table for the Distributive Property.
A
B
C
BUC
A n CB U C)
AnB
AnC
(A n B) U (A n C)
1 1 1 1 0 0 0 0
1 1 0 0 1 1 0 0
1 0 1 0 1 0 1 0
1 1 1 0 1 1 1 0
1 1 1 0 0 0 0 0
1 1 0 0 0 0 0 0
1 0 1 0 0 0 0 0
1 1 1 0 0 0 0 0
Now suppose that x E (A n B ) U (A n C). Then, by the definition of union, x E A n B or x E A n C . By the definition of intersection, it follows that x E A and x E B or that x E A and x E C . From this we see that x E A, and x E B or x E C . Consequently, by the definition of union we see that x E A and x E B U C . Furthermore, by the definition of intersection, it follows that x E A n (B U C). We conclude that (A n B) U (A n C) <; A n (B U C). This completes the .... proof of the identity. Set identities can also be proved using membership tables. We consider each combination of sets that an element can belong to and verify that elements in the same combinations of sets belong to both the sets in the identity. To indicate that an element is in a set, a 1 is used; to indicate that an element is not in a set, a 0 is used. (The reader should note the similarity between membership tables and truth tables.) EXAMPLE 13
Use a membership table to show that A n (B U C) = (A n B) U (A n C).
Solution: The membership table for these combinations of sets is shown in Table Z. This table has eight rows. Because the columns for A n (B U C) and (A n B) U (A n C) are the same, the .... identity is valid. Additional set identities can be established using those that we have already proved. Consider Example 14. EXAMPL E 14
Let A , B , and C be sets. Show that A U (B n C) = (C U B ) n A .
Solution: We have A U (B n C ) = A n (B n C)
C) = (B U C) n A = (C U B) n A
= A n (B U
b y the first D e Morgan law by the second D e Morgan law by the commutative law for intersections by the commutative law for unions.
G eneralized Unions and Intersections Because unions and intersections of sets satisfy associative laws, the sets A U B U C and A n B n C are well defined; that is, the meaning of this notation is unambiguous when A , B , and
2.2 Set Operations
2- 1 7
127
--.-. �----, - �---.---.----
u
(a) FIGURE 5
A U B U C is
shaded.
(b) A n B n C is
shaded.
The Union and Intersection of A, B, and C .
C are sets. That is, we do not have to use parentheses to indicate which operation comes first because A U (B U C) = (A U B) U C and A n (B n C) = (A n B) n C . Note that A U B U C contains those elements that are in at least one of the sets A , B , and C, and that A n B n C contains those elements that are in all of A , B , and C . These combinations of the three sets, A , B , and C, are shown in Figure 5 . EXAMPLE 15
Let A = {a, 2, 4, 6, 8}, B = {a, 1 , 2, 3 , 4} , and C = {a, 3, 6, 9}. What are A U B U C and A n B n c?
Solution: The set A U B U C contains those elements in at least one of A , B , and C . Hence, A U B U C = {a, 1 , 2, 3 , 4, 6, 8 , 9 } . The set A n B n C contains those elements i n all three o f A , B , and C . Thus,
A n B n C = {oJ . We can also consider unions and intersections of an arbitrary number of sets. We use these definitions.
DEFINITION 6
The union of a collection of sets is the set that contains those elements that are members of at least one set in the collection.
We use the notation
A I U A2 U . . . U An =
n
U Ai i=1
to denote the union of the sets A I , A 2 , . . . , A n .
D EFINITI 0 N 7
The intersection of a collection of sets is the set that contains those elements that are members of all the sets in the collection.
128
2 / Basic Structures: Sets, Functions, Sequences, and Sums
2- 1 8
We use the notation
A I n A2 n . . . n A n =
n
n Ai
i= 1
to denote the intersection of the sets A I , A 2 , . . . , A n . We illustrate generalized unions and intersections with Example 1 6.
EXAMPLE 16
Let A ; = {i , i + 1 , i + 2, . . . }. Then,
n
n
;= 1
i= 1
U A ; = U {i, i + 1 , i + 2, . . . } = { l , 2, 3 , . . . } , \
,
and
n
n A = n ; n {i , i + 1 , i + 2, . . . } = {n , n + 1 , n + 2, . . . } . i= 1 i=1 We can extend the notation we have introduce for unions and intersections to other families of sets. In particular, we can use the notation 00
A I U A2 U · · · U A n U · · · =
U Ai
;= 1
to denote the union of the sets A I , A 2 , . . . , An ' . . . . Similarly, the intersection of these sets can be denoted by
A l n A2 n . . . n An n · · · =
00
n A; .
;=1
More generally, the notations n; E I A i and U; E I A i are used to denote the inte,section and union of the sets A i for i E I , respectively. Note that we have n; E I A ; = {x I Vi E I (x E A;)} and U; E I A ; = {x I 3i E I (x E A i ) } .
EXAMPLE 1 7
Suppose that A ; = { l , 2, 3 , . . . , i } for i = 1 , 2, 3 , . . . . Then, 00
00
;= 1
;= 1
00
00
;= 1
;= 1
U A ; = U { l , 2, 3 , . . . , i } = { I , 2, 3 , . . . } = Z+
and
n A ; = n { l , 2, 3 , . . . , i } = { l } .
To see that the union of these sets is the set of positive integers, note that every positive integer is in at least one of the sets, because the integer n belongs to A n = { I , 2, . . . , n } and every element of the sets in the union is a positive integer. To see that the intersection of these sets, note that the only element that belongs to all the sets A I , A 2 , . . . is 1 . To see this note that ... A I = { l } and 1 E A i for i = 1 , 2, . . . .
2.2 Set Operations
2- 1 9
129
Computer Representation of Sets There are various ways to represent sets using a computer. One method is to store the elements of the set in an unordered fashion. However, if this is done, the operations of computing the union, intersection, or difference of two sets would be time-consuming, because each of these operations would require a large amount of searching for elements. We will present a method for storing elements using an arbitrary ordering of the elements of the universal set. This method of representing sets makes computing combinations of sets easy. Assume that the universal set U is finite (and of reasonable size so that the number of elements of U is not larger than the memory size of the computer being used). First, specify an arbitrary ordering of the elements of U , for instance ai , a2 , . . . , an . Represent a subset A of U with the bit string of length n , where the i th bit in this string is 1 if aj belongs to A and is 0 if aj does not belong to A. Example 1 8 illustrates this technique. EXAMPLE 18
Let U = { l , 2, 3 , 4, 5 , 6, 7 , 8 , 9, 1 O} , and the ordering of elements of U has the elements in increasing order; that is, aj = i . What bit strings represent the subset of all odd integers in U , the subset of all even integers in U , and the subset of integers not exceeding 5 in U ?
Solution: The bit string that represents the set o f odd integers i n U , namely, { I , 3 , 5 , 7, 9}, has a one bit in the first, third, fifth, seventh, and ninth positions, and a zero elsewhere. It is 1 0 1 0 1 0 1 0 1 0. (We have split this bit string of length ten into blocks of length four for easy reading because long bit strings are difficult to read.) Similarly, we represent the subset of all even integers in U, namely, {2, 4, 6, 8, 1 0} , by the string 01 0101 0101. The set of all integers in U that do not exceed 5, namely, { l , 2 , 3 , 4, 5 } , i s represented by the string 1 1 1 1 1 0 0000. Using bit strings to represent sets, it is easy to find complements of sets and unions, inter sections, and differences of sets. To find the bit string for the complement of a set from the bit string for that set, we simply change each 1 to a 0 and each 0 to 1 , because x E A if and only if x ¢ A. Note that this operation corresponds to taking the negation of each bit when we associate a bit with a truth value-with 1 representing true and 0 representing false. EXAMPLE 19
We have seen that the bit string for the set { I , 3, 5 , 7, 9} (with universal set { l , 2, 3, 4, 5 , 6, 7, 8 , 9, 1 OJ) is 10 1 0 1 0 1 0 1 0. What is the bit string for the complement of this set?
Solution: The bit string for the complement of this set is obtained by replacing Os with 1 s and vice versa. This yields the string 01 0101 0101 , which corresponds to the set {2, 4, 6, 8, 1 O} .
2 / Basic Structures: Sets, Functions, Sequences, and Sums
130
2-20
To obtain the bit string for the union and intersection of two sets we perform bitwise Boolean operations on the bit strings representing the two sets. The bit in the ith position of the bit string of the union is 1 if either of the bits in the ith position in the two strings is 1 (or both are 1 ), and is 0 when both bits are O. Hence, the bit string for the union is the bitwise OR of the bit strings for the two sets. The bit in the i th position of the bit string of the intersection is 1 when the bits in the corresponding position in the two strings are both 1 , and is 0 when either of the two bits is 0 (or both are). Hence, the bit string for the intersection is the bitwise AND of the bit strings for the two sets. EXAMPLE 20
The bit strings for the sets { I , 2, 3 , 4, 5 } and { I , 3 , 5 , 7, 9} are 1 1 1 1 1 0 0000 and 1 0 1 0 1 0 1 0 1 0, respectively. Use bit strings to find the union and intersection of these sets.
Solution: The bit string for the union of these sets is 1 1 1 1 1 0 0000 v 1 0 1 0 1 0 1 0 1 0 = 1 1 1 1 1 0 1 0 1 0, which corresponds to the set { l , 2, 3 , 4, 5 , 7, 9}. The bit string for the intersection of these sets IS
1 1 1 1 1 0 0000 /\ 1 0 1 0 1 0 1 0 1 0 = 1 0 1 0 1 0 0000, which corresponds to the set { I , 3 , 5 } .
Exercises 1 . Let A be the set of students who live within one mile of
school and let B be the set of students who walk to classes. Describe the students in each of these sets. a) A n B AUB c) A - B d) B - A 2. Suppose that A is the set of sophomores at your school and B is the set of students in discrete mathematics at your school. Express each of these sets in terms of A and B. a) the set o f sophomores taking discrete mathematics in your school the set of sophomores at your school who are not tak ing discrete mathematics c) the set of students at your school who either are sopho mores or are taking discrete mathematics d) the set of students at your school who either are not sophomores or are not taking discrete mathematics 3. Let A = { , 2, 3 , 4, 5} and B = {O, 3 , 6 . Find a) A U B . A n B. c) A - B . d) B - A . 4 . Let A = {a , c , e and B = {a, c, e, g, Find A n B. a) A U B . d) B - A . c) A - B . In Exercises 5- 1 0 assume that A i s a subset o fsome underlying universal set
b)
b)
l
b, d, }
h}.
U.
b)
b)
}
b, d, f,
5. 6. 7. 8. 9. 10. 11.
12. 13. 14.
�ove the complementation law in Table 1 by showing that A = A. Prove the identity laws in Table 1 by showing that a) A U = A . A n = A. Prove the domination laws i n Table 1 by showing that An = a) A U = Prove the idempotent laws in Table 1 by showing that a) A U A = A . A n A = A. Prove the complement laws in Table I by showing that A n A= a) A U A = Show that a) A = A. -A= Let A and B b e sets. Prove the commutative laws from Table 1 by showing that a) A U B = B U A . A n B = B n A. Prove the first absorption law from Table 1 by showing that if A and B are sets, then A U (A n B) = A . Prove the second absorption law from Table 1 by showing that if A and B are sets, then A n (A U B) = A.
0 U U.
-0
U.
b) U b) 0 0. b) b) 0. b) 0 0.
b)
8},
Find the sets A and B if A - B = { I , 5 , 7, B-A= {2, and A n B = {3 , 6, 1 5 . Prove the first De Morgan law in Table 1 by showing that if A and B are sets, then A U B = A n B
lO},
9}.
::-2 1
2.2 Set Operations a)
16.
by showing each side is a subset of the other side. b) using a membership table. Let A and B be sets. Show that (A n B) A . b) AA n (B(A -U A)B).= 0. A - B A.
a) � c) � e) A U (B - A) = A U B .
�
d)
1 7 . Show that i f A, B, and C are sets, then A n B n C =
AU E U C
a) by showing each side is a subset of the other side.
b) using a membership table.
18. Let A, B, and C be sets. Show that a) (A U B) � (A U B U C). (A n B n C) � (A n B). c) (A - B) - C � A - C . d) (A - C) n (C - B) = e) (B - A) U (C - A) = (B U C) - A. 19. Show that if A and B are sets, then A - B = A n E.
b)
0.
20. Show that if A and B are sets, then (A n B) U 21. 22.
23.
24. 2S.
26.
(A n E) = A . Prove the first associative law from Table by showing that if A, B, and C are sets, then A U (B U C) = (A U B) U C . Prove the second associative law from Table by show ing that if A, B, and C are sets, then A n (B n C) = (A n B) n C . Prove the second distributive law from Table by show ing that if A , B, and C are sets, then A U (B n C) = (A U B) n (A U C). Let A, B, and C be sets. Show that (A - B) - C = (A - C) - (B - C). Let A = {O, 2, 4, 6, 8, 10}, B = {O, 1 , 2, 3 , 4, 5 , 6}, and C = {4, 5 , 6, 7, 8 , 1O}. Find A U B U C. a) A n B n C . d) (A n B) U C. c ) (A U B) n c. Draw the Venn diagrams for each of these combinations of the sets A, B, and C. a) A n (B U C) AnBnC c) (A - B) U (A - C) U (B - C) Draw the Venn diagrams for each of these combinations of the sets A, B, and C . a ) A n (B - C ) (A n B) U (A n C) c) (A n E) U (A n C) Draw the Venn diagrams for each of these combinations of the sets A, B, C, and D. AUBUCUD a) (A n B) U (C n D) c) A - (B n C n D) What can you say about the sets A and B if we know that a) A U B = A? A n B = A? c) A - B = A? d) A n B = B n A? e) A - B = B - A? Can you conclude that A = B if A, B, and C are sets such that a) A U C = B U C? A n C = B n C? c) A U B = B U C and A n C = B n C?
I
I
9,
b)
b)
27.
I
-
-
-
31. Let A and B be subsets of a universal set A � B if and only if E � A?
b)
29.
30.
b)
b)
U. Show that
The symmetric difference of A and B , denoted by A EB B, is the set containing those elements in either A or B , but not in both A and B . 32. Find the symmetric difference o f { , 3 , 5 } and { , 2, 3 } . 33. Find the symmetric difference o f the set o f computer science majors at a school and the set of mathematics majors at this school. 34. Draw a Venn diagram for the symmetric difference of the sets A and B . 35. Show that A EB B = ( A U B) - ( A n B). 36. Show that A EB B = (A - B) U (B - A). 37. Show that if A is a subset of a universal set then A EB = A. a) A EB A = c) A EB = A. d) A EB A = 38. Show that if A and B are sets, then a) A EB B = B EB A . (A EB B) EB B = A . 39. What can you say about the sets A and B i f A EB B = A? *40. Determine whether the symmetric difference is associa tive; that is, if A, B, and C are sets, does it follow that A EB (B EB C) = (A EB B) EB C? *41 . Suppose that A, B , and C are sets such that A EB C = B EB C. Must it be the case that A = B? 4 2 . If A, B , C, and D are sets, does it follow that (A EB B) EB (C EB D) = (A EB C) EB (B EB D)? 43. If A, B, C, and D are sets, does it follow that (A EB B) EB (C EB D) = (A EB D) EB (B EB C)? *44. Show that if A , B , and C are sets, then
I
b)
U 0.
b)
+ +
I
U, 0 U .
IA U B U CI = IAI IBI ICI - IA n BI - IA n CI - IB n CI IA n B n CI·
+
(This i s a special case o f the inclusion-exclusion princi ple, which will be studied in Chapter 7.) *45. Let A i = { I , 2, 3, . . . , i} for i = 1 , 2, 3 , . . . . Find n
U Ai . i =1 i= 1 *46. Let A i = { . . . , -2, - 1 , 0, . . . , i } . Find a)
I,
b)
28.
131
47. Let
i= 1 i =1 Ai b e the set of all nonempty bit strings (that is, bit
strings of length at least one) of length not exceeding i . Find n
U Ai • i =1 i= 1 48. Find U� I Ai and n� 1 A i if for every positive integer i , a ) Ai = I i , i i 2, . . . } . A i = {O, n . c ) A i = (0, i ) , that is, the set o f real numbers x with a)
b)
d)
+ I, + 0 < < i. A = (i, 00), that is, the set o f real numbers i
x >
x
i.
x
with
132
2 / Basic Structures: Sets, Functions, Sequences, and Sums
49. Find U:: I Aj and n:: 1 Aj if for every positive integer i ,
a) Aj = { - i, -i + l , . . . , - I , O, I , . . . , i - l , i } . b) A j = { - i, i } . c) A j = [-i, i ] , that is, the set o f real numbers x with
-i :::: x :::: i.
d ) Aj = [i, 00], that is, the set of real numbers x with
x
?:
i.
50. Suppose that the universal set is
51.
52.
53. 54. 55.
U
= { I , 2 , 3 , 4, 5 , 6, 7, 8, 9, 1 O } . Express each ofthese sets with bit strings where the ith bit in the string is 1 if i is in the set and 0 otherwise. a) {3 , 4, 5 } b) { l , 3 , 6, I O} c) {2, 3 , 4, 7, 8 , 9} Using the same universal set as in the last problem, find the set specified by each of these bit strings. a) 1 1 1 1 00 1 1 1 1 b) 0 1 0 1 1 1 1 000 c) 1 0 0000 000 1 What subsets of a finite universal set do these bit strings represent? a) the string with all zeros b) the string with all ones What is the bit string corresponding to the difference of two sets? What is the bit string corresponding to the symmetric difference of two sets? Show how bitwise operations on bit strings can be used to find these combinations of A = { a , b, e, d , e} , B = {b, e, d, g, p, t, v } , C = { e , e, i, 0 , u , x , y , z}, and D = {d, e, h , i , n , o, t , u , x , y } . a) A U B c) (A U D ) n (B U C)
b) A n B d) A U B U C U D
56. How can the union and intersection of n sets that all are
U
subsets of the universal set be found using bit strings? The successor of the set A is the set A U { A } . 57. Find the successors o f the following sets. b) 0 a) { 1 , 2, 3 } d) .{ 0, {0}} c) {0} 58. How many elements does the successor of a set with n elements have? Sometimes the number of times that an element occurs in an unordered collection matters. Multisets are unor dered collections of elements where an element can oc cur as a member more than once. The notation { m i ' a i , m 2 ' a2 , . . . , mr . ar } denotes the multiset with element al oc curring m I times, element a2 occurring m 2 times, and so on. The numbers m j, i = 1 , 2, . . . , r are called the multiplicities of the elements aj , i = 1 , 2 , . . . , r . Let P and Q b e multi sets. The union o f the multisets P and Q is the multi set where the multiplicity of an element is the maximum of its multiplicities in P and Q . The intersec tion of P and Q is the multi set where the multiplicity of an element is the minimum of its multiplicities in P and Q . The
2-22
difference of P and Q is the multi set where the multiplicity of an element is the multiplicity of the element in P less its mul tiplicity in Q unless this difference is negative, in which case the multiplicity is O. The sum of P and Q is the multiset where the multiplicity of an element is the sum of multiplicities in P and Q . The union, intersection, and difference of P and Q are denoted by P U Q, P n Q, and P - Q, respectively (where these operations should not be confused with the anal ogous operations for sets). The sum of P and Q is denoted by P + Q. 59. Let A and B be the multisets {3 · a , 2 · b, 1 . e} and {2 . a , 3 . b, 4 . d}, respectively. Find
a) A U B . c) A - B . e) A + B .
b) A n B . d) B - A .
60. Suppose that A i s the multiset that has as its elements the
types of computer equipment needed by one department of a university where the multiplicities are the number of pieces of each type needed, and B is the analogous multi set for a second department of the university. For instance, A could be the multi set { I 07 . personal comput ers, 44 . routers, 6 . servers} and B could be the multiset { 1 4 . personal computers, 6 . routers, 2 . mainframes} . a) What combination o f A and B represents the equip
ment the university should buy assuming both depart ments use the same equipment? b) What combination of A and B represents the equip ment that will be used by both departments if both departments use the same equipment? c) What combination of A and B represents the equip ment that the second department uses, but the first department does not, if both departments use the same equipment? d) What combination of A and B represents the equip ment that the university should purchase if the depart ments do not share equipment? Fuzzy sets are used in artificial intelligence. Each element in the universal set has a degree of membership, which is a real number between 0 and 1 (including 0 and 1 ), in a fuzzy set The fuzzy set is denoted by listing the elements with � their degrees of membership (elements with 0 qegree of mem bership are not listed). For instance, we write {0.6 Alice, 0.9 Brian, 004 Fred, 0. 1 Oscar, 0.5 Rita} for the set (of famous people) to indicate that Alice has a 0.6 degree of membership in Brian has a 0.9 degree of membership in Fred has a 004 degree of membership in Oscar has a 0. 1 degree of membership in and Rita has a 0.5 degree of membership in (so that Brian is the most famous and Oscar is the least famous ofthese people). Also suppose that is the set of rich people with = {OA Alice, 0.8 Brian, 0.2 Fred, 0.9 Oscar, 0.7 Rita}. 61. The complement of a fuzzy set is the set S, with the degree of the membership of an element in S equal to 1 minus the degree of membership of this element in Find F (the fuzzy set of people who are not famous) and R (the fuzzy set of people who are not rich).
U
S.
F, F
F,
R
S
F F,
F,
R
S
S.
2.3 Functions
2-23
62. The union of two fuzzy sets S and T is the fuzzy set
UU
63. The intersection of two fuzzy sets S and T is the fuzzy
S T, where the degree of membership of an element in S T is the maximum of the degrees of membership of this element in S and in T. Find the fuzzy set of rich or famous people.
2.3
133
set S n T , where the degree of membership of an element in S n T is the minimum of the degrees of membership of this element in S and in T . Find the fuzzy set n of rich and famous people.
FUR
F R
Functions
Introduction In many instances we assign to each element of a set a particular element of a second set (which may be the same as the first). For example, suppose that each student in a discrete mathematics class is assigned a letter grade from the set { A , B , C , D , F } . And suppose that the grades are A for Adams, C for Chou, B for Goodfriend, A for Rodriguez, and F for Stevens. This assignment of grades is illustrated in Figure 1 . This assignment is an example of a function. The concept of a function is extremely impor tant in mathematics and computer science. For example, in discrete mathematics functions are used in the definition of such discrete structures as sequences and strings. Functions are also used to represent how long it takes a computer to solve problems of a given size. Many computer programs and subroutines are designed to calculate values of functions. Recursive functions, which are functions defined in terms of themselves, are used throughout computer science; they will be studied in Chapter 4. This section reviews the basic concepts involving functions needed in discrete mathematics.
DEFINITION 1
Let A and B be nonempty sets. A jUnction f from A to B is an assignment of exactly one element of B to each element of A . We write f(a) = b if b is the unique element of B assigned by the function f to the element a of A . If f is a function from A to B , we write f : A -+ B .
Remark: Functions are sometimes also called mappings or transformations.
Assessmenl �
Functions are specified in many different ways. Sometimes we explicitly state the assign ments, as in Figure 1 . Often we give a formula, such as f(x) = x + 1 , to define a function. Other times we use a computer program to specify a function.
Adams
. --------�.� . A
Chou
•
. B
Goodfriend
•
. c
Rodriguez
•
. D
Stevens
•
�.. �.
--------------
FIGURE 1
F
Assignment of Grades in a Discrete Mathematics Class.
134
2-24
2 / Basic Structures: Sets, Functions, Sequences, and Sums
f
. ----�-----r--� .
a
b = f(a)
A
B
FIG URE 2
The Function! Maps A to B.
A function ! : A -+ B can also be defined in terms of a relation from A to B. Recall from Section 2. 1 that a relation from A to B is just a subset of A x B . A relation from A to B that contains one, and only one, ordered pair (a , b) for every element a E A, defines a function f from A to B . This function is defined by the assignment f(a ) = b, where (a , b) is the unique ordered pair in the relation that has a as its first element.
DEFINITION 2
If ! is a function from A to B , we say that A is the domain of f and B is the codomain of f. If f(a) = b, we say that b is the image of a and a is a preimage of b. The range of f is the set of all images of elements of A . Also, if f is a function from to B , we say thatfmaps A to B .
A
Figure 2 represents a function f from A to B . When we define a function we specify its domain, its codomain, and the mapping of elements of the domain to elements in the codomain. Two functions are equal when they have the same domain, have the same codomain, and map elements of their common domain to the same elements in their common codomain. Note that if we change either the domain or the codomain of a function, then we obtain a different function. If we change the mapping of elements, then we also obtain a different function. Examples 1-5 provide examples of functions. In each case, we describe the domain, the codomain, the range, and the assignment of values to elements of the domain. EXAMPLE 1
What are the domain, codomain, and range of the function that assigns grades to students described in the first paragraph of the introduction of this section?
Solution: Let G be the function that assigns a grade to a student in our discrete mathematics class. Note that G (Adams) = A, for instance. The domain of G is the set {Adams, Chou, Goodfriend, Rodriguez, Stevens}, and the codomain is the set {A, B , C , D , F } . The range of G is the set {A, .... B , C , F}, because each grade except D is assigned to some student. EXAMPLE 2
Let R be the relation consisting of ordered pairs (Abdul, 22), (Brenda, 24), (Carla, 2 1 ), (Desire, 22), (Eddie, 24), and (Felicia, 22), where each pair consists of a graduate student and the age of this student. What is the function that this relation determines?
Solution: This relation defines the function f, where with f(Abdul ) = 22, f(Brenda) = 24, f(Carla) = 2 1 , f(Desire) = 22, f(Eddie) = 24, and f(Felicia) = 22. Here the domain is the set {Abdul, Brenda, Carla, Desire, Eddie, Felicia} . To define the function f, we need to specify a codomain. Here, we can take the codomain to be the set of positive integers to make sure
2.3 Functions 135
2-25
that the codomain contains all possible ages of students. (Note that we could choose a smaller � codomain, but that would change the function.) Finally, the range is the set {2 1 , 22, 24} . EXAMPLE 3
Extra � Examples "'"
Let I be the function that assigns the last two bits of a bit string of length 2 or greater to that string. For example, 1( 1 1 0 1 0) = 1 0. Then, the domain of I is the set of all bit strings of length � 2 or greater, and both the codomain and range are the set {OO, 0 1 , 1 0, I I } .
EXAMPLE 4
Let I: Z -+ Z assign the square of an integer to this integer. Then, I(x) = x 2 , where the domain of I is the set of all integers, we take the the codomain of I to be the set of all integers, and the range of I is the set of all integers that are perfect squares, namely, {O, 1 , 4, � 9, . . . } .
EXAMPLE S
The domain and codomain of functions are often specified in programming languages. For instance, the Java statement int ftoor(float real){ . . . } and the Pascal statement functionftoor(x : real): integer
both state that the domain of the floor function is the set of real numbers and its codomain is � the set of integers. Two real-valued functions with the same domain can be added and multiplied.
DEFINITION 3
Let II and !2 be functions from A to R. Then II R defined by
(II
+ !2 and Id2 are also functions from A to
+ !2)(x ) II (x ) + !2 (x ) , (/1 !2)(X) fi (x )/2(X). =
=
Note that the functions It + !2 and It !2 have been defined by specifying their values at x in terms of the values of It and !2 at x . EXAMPLE 6
Let It and !2 be functions from R to R such that It (x) = x 2 and 12 (X) = x - x 2 . What are the functions It + !2 and It !2? Solution: From the definition of the sum and product of functions, it follows that (/t + !2)(x ) = It (x) + !2(x ) = x 2 + (x - x 2 ) = X
and
When I is a function from a set
A to a set B , the image of a subset of A can also be defined.
136
2 / Basic Structures: Sets, Functions, Sequences, and Sums
DEFINITION 4
2-26
Let f be a function from the set A to the set B and let S be a subset of A. The image of S under the function f is the subset of B that consists of the images of the elements of S. We denote the image of S by f(S), so
f(S) = { t I 3s E S (t = f(s» } . We also use the shorthand (f(s) I s E S} to denote this set.
Remark: The notation f(S) for the image of the set S under the function f is potentially
ambiguous. Here, f(S) denotes a set, and not the value of the function f for the set S .
EXAMPLE 7
A
Let = {a , b, c, d, e} and B = { l , 2, 3 , 4} with f(a) = 2, f(b) = I , f(c) = 4, f(d) = I , .... and f(e) = 1 . The image of the subset S = {b, c , d } i s the set f(S) = { I , 4} .
One-to-One and Onto Functions Some functions never assign the same value to two different domain elements. These functions are said to be one-to-one.
DEFINITION 5
A function f is said to be one-to-one, or injective, if and only if f(a) = f(b) implies that a = b for all a and b in the domain of f. A function is said to be an injection if it is one-to-one.
Note that a function f is one-to-one if and only if f(a) =1= f(b) whenever a =1= b. This way of expressing that f is one-to-one is obtained by taking the contrapositive of the implication in the definition. Remark: We can express that f is one-to-one using quantifiers as VaVb(f(a) = f(b)
or equivalently VaVb(a =1= b the function.
Assessmenl
�
EXAMPLE S
-+
-+ a = b) f(a) =1= f(b » , where the universe of discourse is the domain of
We illustrate this concept by giving examples of functions that are one-to-one and other functions that are not one-to-one. Determine whether the function f from fa, b, c, d} to { I , 2, 3 , 4, 5 } with f(a) f(c) = I, and f(d) = 3 is one-to-one.
==
4, f(b)
=
5,
Extra � Examples �
Solution: The function f is one-to-one because f takes on different values at the four elements .... of its domain. This is illustrated in Figure 3 .
EXAMPLE 9
Determine whether the function f(x) = x 2 from the set of integers to the set of integers is one-to-one.
Solution: The function f(x) = x 2 is not one-to-one because, for instance, f( l ) f(- I ) = 1 , but 1 =1= - 1 .
2.3 Functions
2-2 7
137
a . b . c . d . . 5
FIGURE 3
A One-to-One Function.
Note that the function f(x) = x 2 with its domain restricted to Z+ is one-to-one. (Technically, when we restrict the domain of a function, we obtain a new function whose values agree with those of the original function for the elements of the restricted domain. The restricted function .... is not defined for elements of the original domain outside of the restricted domain.) EXAMPLE 10
Determine whether the function f(x) = x + I from the set ofreal numbers to itselfis one-to-one.
Solution: The function f(x) = x + I is a one-to-one function. To demonstrate this, .... note that x + I =1= y + I when x =1= y. We now give some conditions that guarantee that a function is one-to-one.
DEFINITION 6
A function f whose domain and codomain are subsets of the set of real numbers is called increasing if f(x) .::: f(Y), and strictly increasing if f(x) < f(Y) , whenever x < y and x and y are in the domain of f. Similarly, f is called decreasing if f(x) ::: f(Y), and strictly decreasing if f(x) > f(Y), whenever x < y and x and y are in the domain of f. (The word strictly in this definition indicates a strict inequality.) < y -+ f(x) .::: f(Y» , strictly increasing if VxVy(x < y -+ f(x) < f(Y» , decreasing if VxVy(x < y -+ f(x) ::: f(Y» , and strictly de creasing if VxVy(x < y -+ f(x) > f(Y» , where the universe of discourse is the domain of f.
Remark: A function f is increasing if VxVy(x
From these definitions, we see that a function that is either strictly increasing or strictly decreasing must be one-to-one. However, a function that is increasing, but not strictly increasing, or decreasing, but not strictly decreasing, is not necessarily one-to-one. For some functions the range and the codomain are equal. That is, every member of the codomain is the image of some element of the domain. Functions with this property are called onto functions.
DEFINITION 7
A function f from A to B is called onto, or surjective, if and only if for every element b E B there is an element a E A with f(a) = b. A function f is called a surjection if it is onto.
= y), where the domain for x is the domain of the function and the domain for y is the codomain of the function.
Remark: A function f is onto if Vy3x (f(x)
We now give examples of onto functions and functions that are not onto.
138
2 / Basic Structures: Sets, Functions, Sequences, and Sums a e
�
b e
e l
c e
e2
d e
2-28
-----;..�
FIGURE 4
e3
An Onto Function.
EXAMPLE 11
Let f be the function from {a , b, c, d} to { I , 2, 3} defined by f(a) = 3, f(b) = 2, f(c) = 1 , and f(d) = 3 . I s f an onto function?
Ex1ra � Examples ....
Solution: Because all three elements of the codomain are images of elements in the domain, we see that f is onto. This is illustrated in Figure 4. Note that if the codomain were { l , 2, 3 , 4}, .... then f would not be onto.
EXAMPLE 12
Is the function f(x) = x 2 from the set of integers to the set of integers onto?
Solution: ·The function f is not onto because there is no integer x with x 2 = - 1 , for instance. .... EXAMPLE 13
Is the function f(x) = x + 1 from the set of integers to the set of integers onto?
Solution: This function is onto, because for every integer y there is an integer x such that f(x) = y. To see this, note that f(x) = y if and only if x + 1 = y, which holds if and only if .... x = y - l. DEFINITION 8
The function f is a one-to-one correspondence, or a bijection, if it is both one-to-one and onto.
Examples 1 4 and 1 5 illustrate the concept of a bijection. EXAMPLE 14
Let f be the function from {a , b, c, d} to { l , 2, 3, 4} with f(a) = 4, f(b) = 2, f(c) = 1, and f(d) = 3 . Is f a bijection?
Solution: The function f is one-to-one and onto. It is one-to-one because no two values in the domain are assigned the same function value. It is onto because all four elements of the .... codomain are images of elements in the domain. Hence, f is a bijection. Figure 5 displays four functions where the first is one-to-one but not onto, the second is onto but not one-to-one, the third is both one-to-one and onto, and the fourth is neither one-to-one nor onto. The fifth correspondence in Figure 5 is not a function, because it sends an element to two different elements. Suppose that f is a function from a set A to itself. If A is finite, then f is one-to-one if and only if it is onto. (This follows from the result in Exercise 68 at the end of this section.) This is not necessarily the case if A is infinite (as will be shown in Section 2.4). EXAMPLE 15
Let A be a set. The identity function on A is the function LA : A -+ A, where LA (X )
=x
for all x E A . In other words, the identity function LA is the function that assigns each element to itself. The function LA is one-to-one and onto, so it is a bijection. (Note that L is the Greek .... letter iota.)
2.3 Functions
2-29
(a)
(b)
One-to-one, not onto .1
a.
.o;z •
2
b•
b. .3
c.
c. .4
FIG U RE 5
Onto, not one-to-one
� �
d.
(c) a.
One-to-one, and onto
(d) Neither one-to-one nor onto .1
a.
.1 b.
.2
.2 c•
.3
•3
d.
.4
><
(e) Not a function
.1
.1 .O
::::=:: : :
d.
139
� .2
b. .3 c.
.4
.4
Examples of Different Types of Correspondences.
Inverse Functions and Compositions of Functions Now consider a one-to-one correspondence f from the set A to the set B. Because f is an onto function, every element of B is the image of some element in A . Furthermore, because f is also a one-to-one function, every element of B is the image of a unique element of A. Consequently, we can define a new function from B to A that reverses the correspondence given by f. This leads to Definition 9.
DEFINITION 9
Let f be a one-to-one correspondence from the set A to the set B . The inversefunction of f is the function that assigns to an element b belonging to B the unique element a in A such that f(a) = b. The inverse function of f is denoted by f- I . Hence, f- I (b ) = a when f(a) = b. Remark: Be sure not to confuse the function f- I with the function I lf, which is the function
that assigns to each x in the domain the value 1 If(x). Notice that the latter makes sense only when f(x) is a non-zero real number. They are not the same. Figure 6 illustrates the concept of an inverse function. If a function f is not a one-to-one correspondence, we cannot define an inverse function of f. When f is not a one-to-one correspondence, either it is not one-to-one or it is not onto. If f is not one-to-one, some element b in the codomain is the image of more than one element in the domain. If f is not onto, for some element b in the codomain, no element a in the domain exists for which f(a) = b . Consequently, if f is not a one-to-one correspondence, we cannot assign to each element b in the codomain a unique element a in the domain such that f(a ) = b (because for some b there is either more than one such a or no such a).
FIGURE 6
The Function I- I Is the Inverse of Function I.
140
2 / Basic Structures: Sets, Functions, Sequences, and Sums
2-30
A one-to-one correspondence is called invertible because we can define an inverse of this function. A function is not invertible if it is not a one-to-one correspondence, because the inverse of such a function does not exist. EXAMPLE 16
Let f be the function from {a , b, c} to { l , 2, 3} such that f(a) = 2, feb) = 3, and f(c) = 1 . Is f invertible, and if it is, what is its inverse?
Solution: The function f is invertible because it is a one-to-one correspondence. The in verse function f- I reverses the correspondence given by f, so f- I ( 1 ) = c, f- I (2) = a, and .... f- I (3) = b. EXAMPLE 17
Let f : Z
-+
Z be such that f(x) = x + 1 . Is f invertible, and if it is, what is its inverse?
Solution: The function f has an inverse because it is a one-to-one correspondence, as we have shown. To reverse the correspondence, suppose that y is the image of x , so that y = x + 1 . Then x = y - 1 . This means that y - 1 is the unique element of Z that is sent to y by f. Consequently, .... f- I (y) = Y - 1 . EXAMPLE 18
Let f be the function from R to R with f(x) = x 2 . Is f invertible?
Solution: Because f( -2) = f(2) = 4, f is not one-to-one. If an inverse function were defined, .... it would have to assign two elements to 4 . Hence, f is not invertible. Sometimes we can restrict the domain or the codomain of a function, or both, to obtain an invertible function, as Example 1 9 illustrates. EXAMPLE 19
Show that if we restrict the function f(x) = x 2 in Example 1 8 to a function from the set of all nonnegative real numbers to the set of all nonnegative real numbers, then f is invertible.
Solution: The function f(x) = x 2 from the set of nonnegative real numbers to the set of non negative real numbers is one-to-one. To see this, note that if f(x) = f(Y), then x 2 = y 2 , so x 2 - y 2 = (x + y)(x - y) = O. This means that x + y = O or x - y = 0, so x = -y or x = y. Because both x and y are nonnegative, we must have x = y. So, this function is one-to-one. Furthermore, f(x) = x 2 is onto when the codomain is the set of all nonnegative real numbers, because each nonnegative real number has a square root. That is, if y is a nonnegative real number, there exists a nonnegative real number x such that x = ,.;y, which means that x 2 = y. Because the function f(x) = x 2 from the set of nonnegative real numbers to the set of non negative real numbers is one-to-one and onto, it is invertible. Its inverse is given by the rule .... f- I (y) = ,.;y.
DEFINITION 10
Let g be a function from the set A to the set B and let f be a function from the set B to the set C . The composition ofthe functions f and g, denoted by f o g, i s defined by (f 0 g) (a)
=
f(g(a)).
In other words, f o g is the function that assigns to the element a of A the element assigned by f to g(a). That is, to find (f 0 g)(a) we first apply the function g to a to obtain g(a) and then we apply the function f to the result g(a) to obtain (f 0 g)(a) = f(g(a)). Note that the
2.3 Functions
2-3 1
(f
0
141
g)(a)
•
f( g(a »
A
B
c
fo g
FIGURE 7
The Composition of the Functions! and g.
composition f o g cannot be defined unless the range of g is a subset of the domain of f. In Figure 7 the composition of functions is shown. EXAMPLE 20
Let g be the function from the set {a , b, c} to itself such that g(a) = b, g(b) = c, and g(c) = a . Letfbe the function from the set {a , b , c} to the set { l , 2, 3 } such that f(a) = 3, feb) = 2 , and f(c) = 1 . What is the composition of f and g, and what is the composition of g and f?
Solution: The composition f o g is defined by (f 0 g)(a) = f(g(a)) = feb) = 2, (f 0 g) (b) = f(g(b)) = f(c) = 1 , and (f 0 g)(c) = f(g(c)) = f(a) = 3 . Note that g 0 f i s not defined, because the range of f i s not a subset of the domain of g . .... EXAMPLE 21
. Let f and g be the functions from the set of integers to the set of integers defined by f(x) = 2x + 3 and g(x) = 3x + 2. What is the composition of f and g? What is the composition of g and f?
Solution: Both the compositions f o g and g 0 f are defined. Moreover, (f 0 g)(x) = f(g(x)) = f(3x + 2) = 2(3x + 2) + 3 = 6x + 7 and (g 0 f)(x) = g(f(x)) = g(2x + 3) = 3(2x + 3) + 2 = 6x + 1 1 . Remark: Note that even though f o g and g 0 f are defined for the functions f and g in
Example 2 1 , f o g and g 0 f are not equal. In other words, the commutative law does not hold for the composition of functions. When the composition of a function and its inverse is formed, in either order, an identity function is obtained. To see this, suppose that f is a one-to-one correspondence from the set A to the set B . Then the inverse function f - I exists and is a one-to-one correspondence from B to A . The inverse function reverses the correspondence of the original function, so f - I (b) = a when f(a) = b, and f(a) = b when f - I (b) = a . Hence, (f - I 0 f )(a) = f - I (f(a)) = f- \b) = a , and (f 0 f - I )(b) = f(f - I (b)) = f(a) = b.
142
2 / Basic Structures:
Sets, Functions, Sequences, and Sums
o
0
o
•
0
0
0
o
0
0
0
0
0
0
o
0
.
0
0
0
0
o
0
0
0
0
0
0
o
0
o
0
0
0
0
o
•
0
0
0
0
0
o
0
o
0
0
0
0
2-32
e (-3, 9)
(3,9) e
e (-2, 4)
(2,4) e
e (I, 1)
(-1 , 1 ) e (0, 0)
FIGURE 8 The Graph of f(n) = 2n + 1 from Z to Z.
FIGURE 9 The Graph of f(x) = x2 from Z to Z.
Consequently 1- 1 0 I = LA and 1 0 1- 1 = L B , where the sets A and B , respectively. That is, (1- 1 )- 1 = I.
LA
and
LB
are the identity functions on
The Graphs of Functions We can associate a set of pairs in A x B to each function from A to B. This set of pairs is called the graph of the function and is often displayed pictorially to aid in understanding the behavior of the function.
DEFINITION 11
Let I be a function from the set A to the set B . The graph of the function I is the set of ordered pairs {(a , b) I a E A and I(a) = b}.
From the definition, the graph of a function I from A to B is the subset of A x B containing the ordered pairs with the second entry equal to the element of B assigned by I to the first entry. EXAMPLE 22
Display the graph of the function I(n) = 2n + 1 from the set of integers to the set of integers.
Solution: The graph of I is the set of ordered pairs of the form (n , 2n + 1 ), where n is an integer. .... This graph is displayed in Figure 8. EXAMPLE 23
Display the graph of the function I(x) = x 2 from the set of integers to the set of integers.
Solution: The graph of I is the set of ordered pairs of the form (x , I(x» = (x , x 2 ), where x is .... an integer. This graph is displayed in Figure 9.
Some Important Functions Next, we introduce two important functions in discrete mathematics, namely, the floor and ceiling functions. Let x be a real number. The floor function rounds x down to the closest integer less than or equal to x, and the ceiling function rounds x up to the closest integer greater than or
2.3 Functions
2-33
3
3 ---.0
2
-3
-2
2
-1 -1
3
-3
-2
2
-1
�
-3
-3 y = [xl
3
-1 -2
FIGURE 1 0
�
2
---.0 -2
(a)
143
(b) y = [xl
Graphs of the (a) Floor and (b) Ceiling Functions.
equal to x . These functions are often used when objects are counted. They play an important role in the analysis of the number of steps used by procedures to solve problems of a particular SIze.
DEFINITION 12
Thefioorfunction assigns to the real number x the largest integer that is less than or equal to
x. The value of the floor function at x is denoted by Lx J . The ceilingfimction assigns to the real number x the smallest integer that is greater than or equal to x. The value of the ceiling function at x is denoted by rx l . Remark: The floor function is often also called the greatest integerfunction. It is often denoted
by [xl
EXAMPLE 24
These are some values of the floor and ceiling functions:
L!J
linkS �
EXAMPLE 25
= 0, r ! l = 1 , L - ! J
= - 1 , r - ! 1 = 0, L3 . 1 J = 3 , r3 . 1 l = 4, L 7 J = 7, f7 1 = 7.
�
We display the graphs of the floor and ceiling functions in Figure 1 0. In Figure 1 0(a) we display the graph of the floor function LxJ . Note that this function has the same value throughout the interval [n , n + 1 ), namely n , and then it jumps up to n + 1 when x = n + 1 . In Figure 1 0(b) we display the graph of the ceiling function rxl . Note that this function has the same value throughout the interval (n , n + 1 ] , namely n + 1 , and then jumps to n + 2 when x is a little larger than n + 1 . The floor and ceiling functions are useful in a wide variety of applications, including those involving data storage and data transmission. Consider Examples 25 and 26, typical of basic calculations done when database and data communications problems are studied. Data stored on a computer disk or transmitted over a data network are usually represented as a string of bytes. Each byte is made up of 8 bits. How many bytes are required to encode 1 00 bits of data?
Solution: To determine the number of bytes needed, we determine the smallest integer that is at least as large as the quotient when 1 00 is divided by 8, the number of bits in a byte. Consequently, P OO /8 1 = r 1 2 . 5 1 = 1 3 bytes are required. �
144
2 / Basic Structures: Sets, Functions, Sequences, and Sums
2-34
TABLE 1 Useful Properties of the Floor and Ceiling Functions.
(n is an integer) ( I a) LxJ ( I b) rxl ( I c) Lx J ( I d) rx l
= n if and only if n :-:: x < n + I = n if and only if n - I < x :-:: n = n if and only if x - I < n :-:: x = n if and only if x :-:: n < x + I
(2) x -
I
(3a)
<
LxJ :-:: x :-:: rxl
<
x+I
L -xJ = - fx l = - LxJ
(3b) r - x l (4a) (4b)
EXAMPLE 26
Lx + nJ = LxJ + n rx + n l = rx l + n
In asynchronous transfer mode (ATM) (a communications protocol used on backbone networks), data are organized into cells of 53 bytes. How many ATM cells can be transmitted in 1 minute over a connection that transmits data at the rate of 500 kilobits per second?
Solution: In 1 minute, this connection can transmit 500, 000 · 60 = 30, 000, 000 bits. Each ATM cell is 53 bytes long, which means that it is 53 . 8 = 424 bits long. To determine the number of cells that can be transmitted in 1 minute, we determine the largest integer not exceeding the quotient when 30, 000, 000 is divided by 424. Consequently, L30, 000, 000j424J = 70, 754 ATM ... cells can be transmitted in 1 minute over a 500 kilobit per second connection. Table 1 , with x denoting a real number, displays some simple but important properties of the floor and ceiling functions. Because these functions appear so frequently in discrete mathematics, it is useful to look over these identities. Each property in this table can be established using the definitions of the floor and ceiling functions. Properties ( l a), ( l b), ( l c), and ( l d) follow directly from these definitions. For example, ( l a) states that Lx J = n if and only if the integer n is less than or equal to x and n + 1 is larger than x . This is precisely what it means for n to be the greatest integer not exceeding x , which is the definition of Lx J = n . Properties ( l b), ( l c), and ( l d) can be established similarly. We will prove property (4a) using a direct proof.
Proof: Suppose that Lx J = m , where m is a positive integer. By property ( l a), it follows that m ::::; x < m + 1 . Adding n to both sides of this inequality shows that m + n ::::; x + n < m + n + 1 . Using property ( l a) again, we see that Lx + n J = m + n = LxJ + n . This completes the proof.
rxl
-
2.3 Functions
2-35
EXAMPLE 27
145
�
Prove that if x is a real number, then L2x J = Lx J + Lx + J .
Solution: To prove this statement we let x = n + E , where n is a positive integer and 0 :s E < I . There are two cases to consider, depending on whether E is less than or greater than or equal to � . (The reason we choose these two cases will be made clear in the proof.)
�.
We first consider the case when 0 :s E < In this case, 2x = 2n + 2E and L2x J = 2n because 0 :s 2E < 1 . Similarly, x + � = n + + E), so Lx + J = n , because 0 < + E < 1 . Consequently, L2xJ = 2n and LxJ + Lx + J = n + n = 2n . Next, we consider the case when :s E < I . In this case, 2x = 2n + 2E = (2n + 1 ) + (2E 1 ). Because 0 :s 2E - 1 < 1 , it follows that L2xJ = 2n + 1 . Because Lx + �J = Ln + (� + E)J = Ln + 1 + (E - )J and 0 :s E - < 1 , it follows that Lx + = n + 1 . Consequently, � L2xJ = 2n + 1 and LxJ + Lx + J = n + (n + 1 ) = 2n + 1 . This concludes the proof.
�
�
�
EXAMPLE 28
�
�
(�
�
�
�J
Prove or disprove that rx + y 1 = rx l + ry 1 for all real numbers x and y.
Solution: Although this statement may appear reasonable, it is false. A counterexample is supplied by x = and y = With these values we find that rx + y l = + = rl1 = 1, � but rx l + ry 1 = r 1 + r 1 = 1 + 1 = 2.
�
�
�
�.
r� �l
There are certain types of functions that will be used throughout the text. These include polynomial, logarithmic, and exponential functions. A brief review of the properties of these functions needed in this text is given in Appendix 2. In this book the notation log x will be used to denote the logarithm to the base 2 of x, because 2 is the base that we will usually use for logarithms. We will denote logarithms to the base b, where b is any real number greater than 1 , by 10gb x , and the natural logarithm by In x . Another function we will use throughout this text is the factorial function f : N -+ Z+, denoted by f(n) = n ! . The value of f(n) = n ! is the product of the first n positive integers, so f(n) = I · 2 · · · (n - I ) · n [and f(O) = O! = 1 ] . EXAMPLE 29
unkS�
We have f( l ) = l ! = l , f(2) = 2 ! = 1 · 2 = 2, f(6) = 6 ! = 1 · 2 · 3 · 4 · 5 · 6 = 720, and f(20) 1 · 2 · 3 · 4 · 5 · 6 · 7 · 8 · 9 · 1 0 · 1 1 . 1 2 · 1 3 · 1 4 · 1 5 · 1 6 · 1 7 · 1 8 · 1 9 · 20 = � 2,432,902,008, 1 76,640,000.
JAMES STIRLING ( \ 692- \ 770) was born near the town of Stirling, Scotland. His family strongly supported the Jacobite cause of the Stuarts as an alternative to the British crown. The first infonnation known about James is that he entered Balliol College, Oxford, on a scholar�hip in 1 7 1 1 . However, he later lost his scholarship when he refused to pledge his allegiance to the British crown. The first Jacobean rebellion took place in 1 7 1 5, and Stirling was accused of communicating with rebels. He was charged with cursing King George, but he was acquitted of these charges. Even though he could not graduate from Oxford because of his politics, he remained there for several years. Stirling published his first work, which extended Newton's work on plane curves, in 1 7 1 7. He traveled to Venice, where a chair of mathematics had been promised to him, an appointment that unfortunately fell through. Nevertheless, Stirling stayed in Venice, continuing his mathematical work. He attended the University of Padua in 1 72 1 , and in 1 722 he returned to Glasgow. Stirling apparently fled Italy after learning the secrets of the Italian glass industry, avoiding the efforts of Italian glass makers to assassinate him to protect their secrets. In late 1 724 Stirling moved to London, staying there 1 0 years teaching mathematics and actively engaging in research. In 1 730 he published Methodus Differentialis, his most important work, presenting results on infinite series, summations, interpolation, and quadrature. It is in this book that his asymptotic fonnula for n ! appears. Stirling also worked on gravitation and the shape of the earth; he stated, but did not prove, that the earth is an oblate spheroid. Stirling returned to Scotland in 1 735, when he was appointed manager of a Scottish mining company. He was very successful in this role and even published a paper on the ventilation of mine shafts. He continued his mathematical research, but at a reduced pace, during his years in the mining industry. Stirling is also noted for surveying the River Clyde with the goal of creating a series of locks to make it navigable. In 1 752 the citizens of Glasgow presented him with a silver teakettle as a reward for this work.
2 / Basic Structures: Sets, Functions, Sequences, and Sums
146
2-36
Example 29 illustrates that the factorial function grows extremely rapidly as n grows. The rapid growth of the factorial function is made clearer by Stirling's formula, a result from higher mathematics that tell us that n ! '" J2rrn(n/et . Here, we have used the notation f(n) '" g(n), which means that the ratio f(n)/g(n) approaches 1 as n grows without bound (that is, limn -4oo f(n)/g(n) = 1). The symbol '" is read "is asymptotic to." Stirling's formula is named after James Stirling, a Scottish mathematician of the eighteenth century.
Exercises
f not a function from to if 1/x?? f(x ) b) f(x) = .JX f(x) ±J(x 2 + I )? Determine whether f is a function from Z to if n. . ) = ±J1i2+1 b) f(n f(n) f(n) 1/(n 2 - 4). Determine whether f is a function from the set of all bit strings to the set of integers if f(S) isis the position of a 0 bit in S. the number o f 1 bits i n S. b) f(S f(S) ) i s the smallest integer i such that the ith bit of S is 1 and f(S) = 0 when S is the empty string, the string with no bits. R
1. Why is
2.
3.
a)
=
c)
=
a)
=
c)
=
c) the function that assigns to a bit string the number of
R
,.--,,..-
R
a)
c)
4. Find the domain and range of these functions. Note that
in each case, to find the domain, determine the set of ele ments assigned values by the function. a) the function that assigns to each nonnegative integer its last digit the function that assigns the next largest integer to a positive integer c) the function that assigns to a bit string the number of one bits in the string d) the function that assigns to a bit string the number of bits in the string 5. Find the domain and range of these functions. Note that in each case, to find the domain, determine the set of ele ments assigned values by the function. a) the function that assigns to each bit string the number of ones minus the number of zeros the function that assigns to each bit string twice the number of zeros in that string c) the function that assigns the number of bits left over when a bit string is split into bytes (which are blocks of 8 bits) d) the function that assigns to each positive integer the largest perfect square not exceeding this integer 6. Find the domain and range of these functions. a) the function that assigns to each pair of positive inte gers the first integer of the pair the function that assigns to each positive integer its largest decimal digit
ones minus the number of zeros in the string
d) the function that assigns to each positive integer the
largest integer not exceeding the square root of the integer e) the function that assigns to a bit string the longest string of ones in the string 7. Find the domain and range of these functions. a) the function that assigns to each pair of positive inte gers the maximum of these two integers the function that assigns to each positive integer the number of the digits 0, 1 , 2, 3, 4, 5, 6, 7, 8, 9 that do not appear as decimal digits of the integer c) the function that assigns to a bit string the number of times the block 1 1 appears d) the function that assigns to a bit string the numerical position of the first 1 in the string and that assigns the value 0 to a bit string consisting of all Os 8. Find these values.
b)
b)
b)
b)
9.
10.
l.1 J b) r -01 . 11 1 LL -O.l J rr -2.99· 1 1 L ! + r!l J r L! J + r! 1 + ! 1 Find these values. b) L�J r�lLL --lJ� J r �l r3 1 L! + r � 1 J L! ' L � J J Determine whether each of these functions {a, b, c, d} to itself is one-to-one. a) f(a ) = b, f(b) = a, f(c) = c, f(d) = d b) f(a f(a) = d,b, f(b b, f(c) d, f(d) ) = f(b)) == b, f(c) == c, f(d) == dc a) c) e) r2 .991 g)
d) 1) h)
a) c) e) g)
d) 1) h)
from
c) 1 1 . Which functions in Exercise 1 0 are onto?
12. Determine whether each of these functions from
is one-to-one. a) c)
b) f(n) f(n) == nn2 +2 1 r11
f(n f(n )) = nn 3- 1 =
d)
13. Which functions in Exercise 1 2 are onto?
f: Z f(m, n 2m ) = b) f(m, n) = m 2 - nn.2 •
14. Determine whether a)
x
Z � Z is onto if
Z to Z
2.3 Functions
2-3 7
f(m, n ) == mI m + n I+n 1 . f(m, n) f(m, n ) = m 2l -- 4. l. Detennine whether the function f: Z f(m, n ) == mm 2++n.n2 • f(m, n) c) f(m, n ) = m. f(m, n) = I n l . f(m , n ) = m - n . c)
26. Let S = { - I , 0, 2, 4, 7 } . Find f(S) if
d) e)
15.
a) b)
xZ
�
Z is onto if
d) e)
16. Give an example of a function from N to N that is a) one-to-one but not onto. b) onto but not one-to-one.
17.
18.
19.
20.
21.
22.
23.
24.
25.
c) both onto and one-to-one (but different from the iden tity function). d) neither one-to-one nor onto. Give an explicit fonnula for a function from the set of integers to the set of positive integers that is a) one-to-one, but not onto. b) onto, but not one-to-one. c) one-to-one and onto. d) neither one-to-one nor onto. Detennine whether each of these functions is a bijection from R to R. a) f(x) = -3x + 4 b) f(x) = -3x 2 + 7 c) f(x) = (x + I )/(x + 2) d) f(x) = x 5 + I Detennine whether each of these functions is a bijection from R to R. a) f(x) = 2x + 1 b) f(x) = x 2 + 1 c) f(x) = x3 d) f(x) = (x 2 + I )/(x 2 + 2) Let f: R � R and let f(x) > 0 for all x E R. Show that f(x) is strictly increasing if and only if the function g(x) = 1 /f(x) is strictly decreasing. Let f: R � R and let f(x) > O. Show that f(x) is strictly decreasing if and only if the function g(x) = 1 /f(x) is strictly increasing. Qive an example of an increasing function with the set of real numbers as its domain and codomain that is not one-to-one. Give an example of a decreasing function with the set of real numbers as its domain and codomain that is not one-to-one. Show that the function f(x) = from the set of real number to the set of real numbers is not invertible, but if the codomain is restricted to the set of positive real numbers, the resulting function is invertible. Show that the function f(x) = x I from the set of real numbers to the set of nonnegative real numbers is not invertible, but if the domain is restricted to the set of nonnegative real numbers, the resulting function is invertible.
eX
I
147
b) f(x) = 2x + 1 . a) f(x) = 1 . d) f(x) = L(x 2 + I )/3J . c) f(x) = fx /5 l 27. Let f(x) = Lx 2 /3J . Find f(S) if a) S = { 2, - 1 , 0, 1 , 2, 3 } . b ) S = {O, 1 , 2, 3 , 4, 5 } . c ) S = { l , 5 , 7, l l } . d) S = {2, 6 , 1 0, 1 4 } . 28. Let f(x) = 2x . What is a) f(Z)? c) f(R)? b) f(N)? 29. Suppose that g is a function from A to B and f is a func tion from B to C . a) Show that i f both f and g are one-to-one functions, then f o g is also one-to-one. b) Show that if both f and g are onto functions, then f o g is also onto. *30. If f and f o g are one-to-one, does it follow that g is one-to-one? Justify your answer. *31 . If f and f o g are onto, does it follow that g is onto? Justify your answer. 32. Find f o g and g 0 f, where f(x) = x 2 + 1 and g(x) = x + 2, are functions from R to R. 33. Find f + g and fg for the functions f and g given in Exercise 32. 34. Let f(x) = x + b and g(x) = ex + where b, e, and are constants. Detennine for which constants b, e, and it is true that f o g = g 0 f. 35. Show that the function f(x) = x + b from R to R is invertible, where and b are constants, with #- 0, and find the inverse of f. 36. Let f be a function from the set A to the set Let S and T be subsets of A . Show that a) f(S U T ) = f(S) U f(T ). b) f(S n T) c;:: f(S) n f(T ). 37. Give an example to show that the inclusion in part (b) in Exercise 36 may be proper. Let f be a function from the set A to the set B . Let S be a subset of B . We define the inverse image of S to be the subset of A whose elements are precisely all pre-images of all ele ments of S. We denote the inverse image of S by f - I (S), so f - I (S) = { E A f( ) E S } . The notation f - I is used in two different ways. Do not confuse the notation intro duced here with the notation f - I for the value at y of the inverse of the invertible function f. Notice also that f - I (S), the inverse image of the set S, makes sense for all functions f, not just invertible functions.) 38. Let f be the function from R to R defined by f(x) = x 2 • Find b) f - I ({x 1 0 < x < I }). a) f - I ({ I }). c) f - I ({x I x > 4}) . 39. Let g(x) = LxJ . Find a) g - I ({O}). b) g - I ({- I , O, I }). I c) g ({X 1 0 < x < I }).
-
d d
a
d,
a
a I a
a
(Beware: (y)
a, a,
a B.
148
2 / Basic Structures: Sets, Functions, Sequences, and Sums
40. Let f be a function from A to B . Let S and T be subsets
of B . Show that a) f - I (S U T ) = f - I (S) U f - I ( T ). b) f - I (S n T ) = f - I (S) n f - I ( T ). 41. Let f be a function from A to B. Let S be a subset of B . Show that f - I (:5) = f- I (S).
�
42. Show that Lx + J is the closest integer to the number x,
except when x is midway between two integers, when it is the larger of these two integers. 43. Show that fx - � 1 is the closest integer to the number x , except when x i s midway between two integers, when it is the smaller of these two integers.
44. Show that if x is a real number, then fxl - LxJ = I
if x is not an integer and fxl - LxJ integer.
=
2-38
57. Data are transmitted over a particular Ethernet network
in blocks of 1 500 octets (blocks of 8 bits). How many blocks are required to transmit the following amounts of data over this Ethernet network? (Note that a byte is a syn onym for an octet, a kilobyte is 1 000 bytes, and a megabyte is 1 ,000,000 bytes.) a) 1 50 kilobytes of data b) 3 84 kilobytes of data c) 1 .544 megabytes of data d) 45.3 megabytes of data 58. Draw the graph of the function fen) = 1 - n 2 from Z to z.
59. Draw the graph of the function f(x) = L2xJ from R to
R.
0 if x is an
60. Draw the graph of the function f(x) = Lx /2J from R to
45. Show that if x is a real number, then x - I < Lx J ::::: x :::::
61. Draw the graph of the function f(x) = Lx J + Lx /2J from
46. Show that if x is a real number and m is an integer, then
62. Draw the graph ofthe function f(x) = fxl + Lx /2J from
R.
fxl < x + 1 .
fx + m l
=
R to R.
fx l + m .
47. Show that if x is a real number and n is an integer, then a) x < n if and only if LxJ < n . b) n < x if and only if n < fxl 48. Show that if x is a real number and n is an integer, then a) x ::::: n if and only i[ fxl ::::: n . b) n ::::: x i f and only if n ::::: LxJ . 49. Prove that ifn is an integer, then Ln /2J = n /2 ifn is even
and (n - 1 )/2 if n is odd.
50. Prove that if x is a real number, then L -x J = - fx 1 and 51.
f -xl = - LxJ . The function INT is found on some calculators, where INT(x ) = Lx J when x is a nonnegative real number and INT(x) = xl when x is a negative real number. Show that this INT function satisfies the identity INT( -x) = -INT(x ). Let a and be real numbers with a < Use the floor and/or ceiling functions to express the number of integers n that satisfy the inequality a ::::: n ::::: Let a and be real numbers with a < Use the floor and/or ceiling functions to express the number of integers n that satisfy the inequality a < n < How many bytes are required to encode n bits of data where n equals a) 4? b) 1 O? c) 500? d) 3000? How many bytes are required to encode n bits of data where n equals a) 7? b) 1 7? c) 1 00 1 ? d) 28,800? How many ATM cells (described in Example 26) can be transmitted in 10 seconds over a link operating at the fol lowing rates? a) 128 kilobits per second ( 1 kilobit = 1 000 bits) b) 300 kilobits per second c) I megabit per second ( 1 megabit = 1 ,000,000 bits)
f
52.
53.
54.
55.
56.
b
b
b.
b.
b.
b.
R to R.
63. Draw graphs of each of these functions.
�
b) f(x) = L2x + I J a) f(x) = Lx + J c) f(x) = fx /3 1 d) f(x) = f l /x l e) f(x) = fx - 2 1 + Lx + 2J g) f(x) = f Lx - J + f) f(x) = L2xHx/2 1
�
�l
64. Draw graphs o f each o f these functions. b) f(x) fO.2x l a) f(x) = f3x - 2 1 c) f(x) = L - 1 /xJ d) f(x) = Lx 2 J e) f(x) = fx/21 Lx /2J f) f(x) = Lx /2J + fx /2J g) f(x ) = L2 fx /2l + J =
�
65. Find the inverse function of fex) = x3 + 1 . 66. Suppose that f is an invertible function from Y to Z and
g is an invertible function from X to Y . Show that the inverse of the composition f o g is given by (f 0 g) - I = g - I o f- I .
U. U
67. Let S be a subset of a universal set The characteristic function fs of S is the function from to the set to, 1 }
such that fs(x) = 1 if x belongs to S and fsex) = 0 if x does not belong to S. Let A and B be sets. Show that for all x , a) hnB eX ) = h ex) . fB eX ) b) hUB (X) = h ex) + /H ex) - fA (X) . /H ex) c) hex) = 1 - fA (X) d) fAEBB (X) = fA eX) + fB eX) - 2 h (x)fB eX ) 68. Suppose that f is a function from A to B , where A and B are finite sets with 1 A 1 = 1 B I . Show that f is one-to-one if and only if it is onto. 69. Prove or disprove each ofthese statements about the floor and ceiling functions. a) r Lx J 1 = Lx J for all real numbers x . b ) L2xJ = 2 LxJ whenever x i s a real number. c) fxl + fyl - fx + yl = 0 or 1 �henever x and y are . real numbers.
2.4 Sequences and Summations
2-39
d) rxyl e)
I�l
= rxl ryl for all real numbers x and y. x 1 = for all real numbers x .
l;J
70. Prove or disprove each ofthese statements about the floor
and ceiling functions. a) L rx l J = rxl for all real numbers x . b) Lx + y J = LxJ + LY J for all real numbers x and y . c) r rx/2 1 /2 1 = rx/41 for all real numbers x . d) L ,JrxT J = J for all positive real numbers x . e) LxJ + LY J + Lx + y J .::: L2xJ + L2yJ for all real numbers x and y. 71. Prove that if x is a positive real number, then a) L vTxT J = L J. b) r ,JrxT l = 72. Let x be a real number. Show that L3xJ = LxJ + Lx + 1 J + Lx + � J . A program designed to evaluate a function may not produce the correct value ofthe function for all elements in the domain of this function. For example, a program may not produce a correct value because evaluating the function may lead to an infinite loop or an overflow. Similarly, in abstract mathemat ics, we often want to discuss functions that are defined only for a subset of the real numbers, such as 1 /x , and arcsin (x). Also, we may want to use such notions as the "youngest child" function, which is undefined for a couple having no children, or the "time of sunrise," which is undefined for some days above the Arctic Circle. To study such situations, we use the concept of a partial function. A partial function f from a set A to a set B is an assignment to each element a in a subset of A, called the do main of definition of f, of a unique element in B . The sets A and B are called the domain and codomain of f, respec tively. We say that f is undefined for elements in A that are not in the domain of definition of f. We write f : A --+ B to denote that f is a partial function from A to B . (This is the same notation as is used for functions. The context in which the notation is used determines whether f is a partial function or a total function.) When the domain of definition of f equals A, we say that f is a total function.
L..jX
..jX ..jX r l.
..jX,
b
149
73. For each of these partial functions, determine its domain,
codomain, domain of definition, and the set of values for which it is undefined. Also, determine whether it is a total function. a) f : Z --+ R, fen) = l i n b) c) d) e)
f : Z --+ Z, fen) = rn/21 f : Z x Z --+ Q, f(m , n) = m/n f : Z x Z --+ Z, f(m , n) = mn f : Z x Z --+ Z, f(m , n) = m - n if m > n
74. a) Show that a partial function from A to B can be viewed
as a function f* from A to B element of B and
f* (a) =
I
f(a) u
U
{u } , where u is not an
if a belongs to the domain of definition of f if f is undefined at a.
b ) Using the construction i n (a), find the function f* cor
responding to each partial function in Exercise 73 .
E3f> 75. a) Show that if a set S has cardinality
m , where m is a positive integer, then there is a one-to-one correspon dence between S and the set { I , 2, . . . , m } . b) Show that i f S and T are two sets each with m ele ments, where m is a positive integer, then there is a one-to-one correspondence between S and T . *76. Show that a set S i s infinite i f and only ifthere is a proper subset A of S such that there is a one-to-one correspon dence between A and S. *77. Show that the polynomial function f : Z+ x Z+ --+ Z+ with f(m , n) = (m + n - 2)(m + n - 1 )/2 + m is one to-one and onto. *78. Show that when you substitute ( 3 n + 1 )2 for each occur rence of n and ( 3 m + 1 )2 for each occurrence of m in the right-hand side of the formula for the function f(m , n) in Exercise 77, you obtain a one-to-one polynomial func tion Z x Z --+ Z. It is an open question whether there is a one-to-one polynomial function Q x Q --+ Q.
2.4 Sequences and Summations Introduction Sequences are ordered lists of elements. Sequences are used in discrete mathematics in many ways. They can be used to represent solutions to certain counting problems, as we will see in Chapter 7. They are also an important data structure in computer science. This section contains a review of the notation used to represent sequences and sums of terms of sequences. When the elements of an infinite set can be listed, the set is called countable. We will conclude this section with a discussion of both countable and uncountable sets. We will prove that the set of rational numbers is countable, but the set of real numbers is not.
1 50
2 / Basic Structures: Sets, Functions, Sequences, and Sums
2-40
Sequences A sequence is a discrete structure used to represent an ordered list. For example, 1 , 2, 3, 5, 8 is a sequence with five terms and 1 , 3, 9, 27, 8 1 , . . . , 30, . . . is an infinite sequence.
DEFINITION 1
A sequence is a function from a subset of the set ofintegers (usually either the set {O, 1 , 2, . . . }
or the set { I , 2 , 3 , . . J) to a set S. We use the notation an to denote the image of the integer n . We call an a term of the sequence. .
We use the notation {an } to describe the sequence. (Note that an represents an individual term of the sequence {an } . Also note that the notation {an } for a sequence conflicts with the notation for a set. However, the context in which we use this notation will always make it clear when we are dealing with sets and when we are dealing with sequences. Note that although we have used the letter a in the notation for a sequence, other letters or expressions may be used depending on the sequence under consideration. That is, the choice of the letter a is arbitrary.) We describe sequences by listing the terms of the sequence in order of increasing subscripts. EXAMPLE 1
Consider the sequence {a n }, where
The list of the terms of this sequence, beginning with a I , namely,
starts with 1,
DEFINITION 2
1 1 1 2' 3' 4' · · · ·
A geometric progression is a sequence of the form
a , ar, ar 2 , . . . , ar n ,
•
•
•
where the initial term a and the common ratio r are real numbers.
Remark: A geometric progression is a discrete analogue ofthe exponential function f(x)
EXAMPLE 2
= arX .
The sequences {bn } with bn = (_ I )n , {en } with Cn = 2 · 5 n , and {dn } with dn = 6 · ( l /3t are geometric progressions with initial term and common ratio equal to 1 and - 1 ; 2 and 5; and 6 and 1 /3, respectively, if we start at n = 0. The list of terms bo , b l , b2 , b3 , b4 , begins with •
1, -1, 1, -1, 1, . . . ;
•
•
2.4 Sequences and Summations
2-41
the list of terms co , C I , C2 , C3 , C4 ,
•
•
•
151
begins with
2, 1 0 , 50, 250, 1 250, . . . ; and the list of terms do , dl , d2 , d3 , d4 ,
•
•
.
begins with
2 2 2 6, 2, 3 ' 9 ' '···· 27 DEFINITION 3
An arithmetic progression is a sequence of the form
a , a + d, a + 2d, . . . , a + nd, . . . where the initial term a and the common difference d are real numbers. Remark: An arithmetic progression is a discrete analogue of the linear function f(x)
EXAMPLE 3
= dx + a .
The sequences {sn } with Sn = - 1 + 4n and {In } with fn = 7 - 3n are both arithmetic progres sions with initial terms and common differences equal to - 1 and 4, and 7 and -3, respectively, if we start at n = 0. The list of terms so , Sl , S2 , S3 , . . begins with .
- 1 , 3 , 7, 1 1 , . . . , and the list of terms 10 , f l , 12 , 13 ,
.
.
. begins with
7, 4, 1 , -2, . . . . Sequences of the form a I , a2 , . . . , an are often used in computer science. These finite sequences are also called strings. This string is also denoted by a l a2 . . . a n . (Recall that bit strings, which are finite sequences of bits, were introduced in Section 1 . 1 .) The length of the string S is the number of terms in this string. The empty string, denoted by A, is the string that has no terms. The empty string has length zero. EXAMPLE 4
The string abcd is a string of length four.
Special Integer Sequences A common problem in discrete mathematics is finding a formula or a general rule for constructing the terms of a sequence. Sometimes only a few terms of a sequence solving a problem are known; the goal is to identify the sequence. Even though the initial terms of a sequence do not determine the entire sequence (after all, there are infinitely many different sequences that start with any finite set of initial terms), knowing the first few terms may help you make an educated conjecture about the identity of your sequence. Once you have made this conjecture, you can try to verify that you have the correct sequence. When trying to deduce a possible formula or rule for the terms of a sequence from the initial terms, try to find a pattern in these terms. You might also see whether you can determine how a term might have been produced from those preceding it. There are many questions you could ask, but some of the more useful are: • • •
Are there runs of the same value? That is, does the same value occur many times in a row? Are terms obtained from previous terms by adding the same amount or an amount that depends on the position in the sequence? Are terms obtained from previous terms by multiplying by a particular amount?
152
2 / Basic Structures: Sets, Functions, Sequences, and Sums • •
EXAMPLE S
Extra � ExampleS �
2-42
Are terms obtained by combining previous terms in a certain way? Are there cycles among the terms?
Find formulae for the sequences with the following first five terms: (a) 1 , 1 /2, 1 /4, 1 /8, 1 / 1 6 (b) 1 , 3, 5, 7, 9 (c) 1 , - 1 , 1 , - 1 , 1 .
Solution: (a) We recognize that the denominators are powers of2. The sequence with an = 1 /2n , n = 0, 1 , 2, . . . is a possible match. This proposed sequence is a geometric progression with a = 1 and r = 1 /2. (b) We note that each term is obtained by adding 2 to the previous term. The se quence with an = 2n + 1 , n = 0, 1 , 2, . . . is a possible match. This proposed sequence is an arithmetic progression with a = 1 and d = 2. (c) The terms alternate between 1 and - 1 . The sequence with an = (- 1 t, n = 0, 1 , 2 . . . i s a possible match. This proposed sequence is a geometric progression with a = 1 and ... r = -1. Examples 6 and 7 illustrate how we can analyze sequences to find how the terms are constructed.
EXAMPLE 6
How can we produce the terms of a sequence if the first 1 0 terms are 1 , 2, 2, 3, 3, 3, 4, 4, 4, 4?
Solution: Note that the integer 1 appears once, the integer 2 appears twice, the integer 3 appears three times, and the integer 4 appears four times. A reasonable rule for generating this sequence is that the integer n appears exactly n times, so the next five terms of the sequence would all be 5, the following six terms would all be 6, and so on. The sequence generated this way is a ... possible match. EXAMPLE 7
How can we produce the terms of a sequence if the first 1 0 terms are 5, 1 1 , 1 7, 23, 29, 35, 4 1 , 47, 5 3 , 59?
Solution: Note that each of the first 10 terms ofthis sequence after the first is obtained by adding 6 to the previous term. (We could see this by noticing that the difference between consecutive terms is 6.) Consequently, the nth term could be produced by starting with 5 and adding 6 a total of n - 1 times; that is, a reasonable guess is that the nth term is 5 + 6(n - 1 ) = 6n - 1 . ... (This i s an arithmetic progression with a = 5 and d = 6.) Another useful technique for finding a rule for generating the terms of a sequence is to compare the terms of a sequence of interest with the terms of a well-known integer sequence, such as terms of an �thmetic progression, terms of a geometric progression, perfect squares, perfect cubes, and so on. The first 1 0 terms of some sequences you may want to keep in mind are displayed in Table 1 . EXAMPLE 8
Conjecture a simple formula for an if the first 1 0 terms of the sequence {a n } are 1 , 7, 25, 79, 24 1 , 727, 2 1 85, 6559, 1 968 1 , 59047.
Solution: To attack this problem, we begin by looking at the difference of consecutive terms, but we do not see a pattern. When we form the ratio of consecutive terms to see whether each term is a multiple of the previous term, we find that this ratio, although not a constant, is close to 3. So it is reasonable to suspect that the terms of this sequence are generated by a formula involving 3 n • Comparing these terms with the corresponding terms of the sequence {3 n } , we notice that the nth term is 2 less than the corresponding power of 3 . We see that an = 3 n - 2 ... for 1 :s n :s 1 0 and conjecture that this formula holds for all n .
2.4 Sequences and Su�ations
153
TABLE 1 Some Useful Sequences. First 10 Terms
nth Term
1 , 4, 9, 1 6 , 25, 36, 49, 64, 8 1 , 1 00, . . .
n2 n3 n4 2n 3n
1 , 8 , 27, 64, 125, 2 1 6, 343 , 5 1 2, 729, 1 000, . . . 1 , 1 6 , 8 1 , 256, 625 , 1 296, 240 1 , 4096, 656 1 , 1 0000, . . . 2, 4, 8 , 1 6 , 32, 64, 128, 256, 5 1 2, 1 024, . . . 3 , 9, 27, 8 1 , 243 , 729, 2 1 87, 656 1 , 1 9683 , 59049, . . . 1 , 2, 6, 24, 1 20, 720, 5040, 40320, 362880, 3628800, . . .
n!
linkS �
We will see throughout this text that integer sequences appear in a wide range of contexts in discrete mathematics. Sequences we have or will encounter include the sequence of prime numbers (Chapter 3), the number of ways to order n discrete objects (Chapter 5), the number of moves required to solve the famous Tower of Hanoi puzzle with n disks (Chapter 7), and the number of rabbits on an island after n months (Chapter 7). Integer sequences appear in an amazingly wide range of subject areas besides discrete mathematics, including biology, engineering, chemistry, and physics, as well as in puzzles. An amazing database of over 1 00,000 different integer sequences can be found in the On Line Encyclopedia of Integer Sequences. This database was originated by Neil Sloane in the 1 960s. The last printed version of this database was published in 1 995 ([SIPI95]); the current encyclopedia would occupy more than 1 50 volumes of the size of the 1 995 book. New sequences are added regularly to this database. There is also a program accessible via the Web that you can use to find sequences from the encyclopedia that match initial terms you provide.
Summations Next, we introduce summation notation. We begin by describing the notation used to express the sum of the terms
from the sequence {an } . We use the notation or to represent
Here, the variable j is called the index of summation, and the choice of the letter j as the variable is arbitrary; that is, we could have used any other letter, such as i or k. Or, in notation,
n
L
}= m
a} =
n
L
i= m
ai =
n
L ak .
k= m
Here, the index of summation runs through all integers starting with its lower limit m and ending with its upper limit n . A large uppercase Greek letter sigma, L, is used to denote summation.
154
2 / B�sic Structures: Sets, Functions, Sequences, and Sums
2-44
The usual laws for arithmetic apply to summations. For example, when a and b are real numbers, we have L� = I (aXj + bYj ) = a L� = l Xj + b L� = l Yj ' where X l , X2 , " " Xn and Y l , Y2 , . . . , Yn are real numbers. (We do not present a formal proof of this identity here. Such a proof can be constructed using mathematical induction, a proof method we introduce in Chap ter 4. The proof also uses the commutative and associative laws for addition and the distributive law of multiplication over addition.) We give some examples of summation notation.
EXAMPLE 9
Express the sum of the first 1 00 terms of the sequence {an }, where an = 1 / n for n = 1 , 2, 3 , . . . .
Extra � Examples IIiiiiiI
Solution: The lower limit for the index of summation is 1 , and the upper limit is 1 00. We write this sum as
1 00 1
L j = l �' }
EXAMPLE 10
�
What is the value of L = l j 2 ?
Solution: We have 5
/ = 1 2 + 2 2 + 3 2 + 42 + 5 2 L j=l = 1 + 4 + 9 + 1 6 + 25 = 55.
EXAMPLE 11
What is the value of L � = 4 (- I )k ?
Solution: We have 8
( _ 1 )k = (_ 1 )4 + ( - 1 ) 5 + ('- 1 ) 6 + ( _ 1 ) 7 + (_ 1 ) 8 L =4
k
= 1 + (- 1 ) + 1 + (- 1 ) + 1 = 1.
unks
iC
NEIL SLOANE (BORN 1 939) Neil Sloane studied mathematics and electrical engineering at the Uni versity of Melbourne on a scholarship from the Australian state telephone company. He mastered many telephone-related jobs, such as erecting telephone poles, in his summer work. After graduating, he de signed minimal-cost telephone networks in Australia. In 1 962 he came to the United States and studied electrical engineering at Cornell University. His Ph.D. thesis was on what are now called neural networks. He took a job at Bell Labs in 1 969, working in many areas, including network design, coding theory, and sphere packing. He now works for AT&T Labs, moving there from Bell Labs when AT&T split up in 1 996. One of his favorite problems is the kissing problem (a name he coined), which asks how many spheres can be arranged in n dimensions so that they all touch a central sphere of the same size. (In two dimensions the answer is 6, because 6 pennies can be placed so that they touch a central penny. In three dimensions, 1 2 billiard balls can be placed so that they touch a central billiard ball. Two billiard balls that just touch are said to "kiss," giving rise to the terminology "kissing problem" and "kissing number.") Sloane, together with Andrew Odlyzko, showed that in 8 and 24 dimensions, the optimal kissing numbers are, respectively, 240 and 1 96,560. The kissing number is known in dimensions I , 2, 3, 8, and 24, but not in any other dimensions. Sloane's books include Sphere Packings, Lattices and Groups, 3d ed., with John Conway; The Theory ofError-Correcting Codes with Jessie MacWilliams; The Encyclopedia ofInteger Sequences with Simon Plouffe; and The Rock-Climbing Guide to New Jersey Crags with Paul Nick. The last book demonstrates his interest in rock climbing; it includes more than 50 climbing sites in New Jersey.
2.4 Sequences and Summations
2-45
1 55
Sometimes it is useful to shift the index of summation in a sum. This is often done when two sums need to be added but their indices of summation do not match. When shifting an index of summation, it is important to make the appropriate changes in the corresponding summand. This is illustrated by Example 1 2 . EXAMPLE 12
Suppose we have the sum
5
/ L j= l Extra � ExampleS �
but want the index of summation to run between ° and 4 rather than from 1 to 5 . To do this, we let j - 1 . Then the new summation index runs from ° to 4, and the term j 2 becomes I i . Hence,
k=
(k +
5
4
(k + li. / = kL L j= l =O It is easily checked that both sums are 1
+ 4 + 9 + 1 6 + 25 = 55.
Sums of terms of geometric progressions commonly arise (such sums are called geometric series). Theorem 1 gives us a formula for the sum of terms of a geometric progression.
THEOREM 1
{
If a and r are real numbers and r i= 0, then
n
ar j = L j=O
ar n + l - a r-l (n + l )a
if r i 1 if r
= 1.
Proof: Let
To compute S, first multiply both sides of the equality by r and then manipulate the resulting sum as follows:
n
rS
= r L ar j
substituting summation formula for S
j= o n = L ar j +1 j= o n+ l = L ar k k=l
= =
(t ) + ar k
by the distributive property
shifting the index of summation, with k
(ar n + 1 - a)
k= O S + (ar n +1 - a)
removing k
=
n
+ 1 term and adding k
substituting S for summation formula
=
=
j
+ I
0 term
156
2 / Basic Structures: Sets, Functions, Sequences, and Sums
2-46
From these equalities, we see that
rS = S + (ar n + ! - a ) . Solving for S shows that if r =1= 1 , then
ar n + ! - a S = --- r-1 If r = 1 , then clearly the sum equals (n + l )a . EXAMPLE 13
<J
Double summations arise in many contexts (as in the analysis of nested loops in computer programs). An example of a double summation is
4
3
L L ij· i =! j =! To evaluate the double sum, first expand the inner summation and then continue by computing . the outer summation:
4
3
4
(i + 2i + 3i) L L ij = L i =! j =! i =!
=
4
6i L i =!
= 6 + 1 2 + 18 + 24 = 60. We can also use summation notation to add all values of a function, or terms of an indexed set, where the index of summation runs over all values in a set. That is, we write
L /(S )
SES
to represent the sum of the values EXAMPLE 14
What is the value of Ls E {O, , 4 } 2
/(s ) , for all members s of S .
s? s
Solution: Because Ls E {O, , 4 } S represents the sum of the values of for all the members of the 2 set {O, 2, 4}, it follows that
L s = 0 + 2 + 4 = 6.
.... {O, 2 , 4 } Certain sums arise repeatedly throughout discrete mathematics. Having a collection of formulae for such sums can be useful; Table 2 provides a small table of formulae for commonly occurring sums. We derived the first formula in this table in Theorem 1 . The next three formulae give us the sum of the first n positive integers, the sum of their squares, and the sum of their cubes. These three formulae can be derived in many different ways (for example, see Exercises 2 1 and 22 at the end of this section). Also note that each of these formulae, once known, can easily be proved using mathematical induction, the subject of Section 4 . 1 . The last two formulae in the table involve infinite series and will be discussed shortly. Example 1 5 illustrates how the formulae in Table 2 can be useful. SE
2.4 Sequences and Summations
2-4 7
157
TABLE 2 Some Useful Summation Formulae. Sum
n L ar k (r =P 0) k=O n Lk k= l n L k2 k=l n L k3 k= l L X k , Ix l I k=O L, kX k - 1 , Ix l k= l
Closed Form
ar n +1 - a , r =P I r-I n(n + I ) 2
n(n + I )(2n + I ) 6
n 2 (n + 1 )2 4
00
I I -x
<
00
<
--
I
I ( I - x)2
---
Using the fonnula L� = I k2 = n(n + 1 )(2n + 1 )/6 from Table 2, we see that
1 00
2
� k =
k = 50
1 00 · 1 0 1 · 20 1 49 · 50 · 99 = 338, 350 - 40,425 = 297 , 925. 6 6
SOME INFINITE SERIES Although most of the summations in this book are finite sums, infinite series are important in some parts of discrete mathematics. Infinite series are usually studied in a course in calculus and even the definition of these series requires the use of calculus, but sometimes they arise in discrete mathematics, because discrete mathematics deals with infi nite collections of discrete elements. In particular, in our future studies in discrete mathematics, we will find the closed fonns for the infinite series in Examples 1 6 and 1 7 to be quite useful. EXAMPLE 16
Exlra �
Examples �
(Requires calculus) Let x be a real number with Ix I < 1 . Find L::O= o x n .
' . By Theorem 1 with SolutIOn: a = 1 and r = x we see that L nk = 0 x n = x k+ 1 approaches 0 as k approaches infinity. It follows that 00
xk+l x-
I I . Because Ix I < 1 ,
-1 x k+ 1 - 1 = = k--+oo x - I 1 -x x-I
� x n = lim
f::o
We can produce new summation fonnulae by differentiating or integrating existing fonnulae.
158
2 / Basic Structures: Sets, Functions, Sequences, and Sums EXAMPLE 17
2-48
(Requires calculus) Differentiating both sides of the equation 1 L x k = -, 1 x k=O 00
from Example 1 6 we find that
� kx k-l
k=l
�
= ( 1 1 x)2 · _
(This differentiation is valid for
Ix I
<
1 by a theorem about infinite series.)
Cardinali ty In Section 2. 1 we defined the cardinality of a finite set to be the number of elements in the set. That is, the cardinality of a finite set tells us when two finite sets are the same size, or when one is bigger than the other. Can we extend this notion to an infinite set? Recall from Exercise 75 in Section 2.3 that there is a one-to-one correspondence between any two finite sets with the same number of elements. This observation lets us extend the concept of cardinality to all sets, both finite and infinite, with Definition 4.
DEFINITION 4
The sets A and B have the same cardinality if and only ifthere is a one-to-one correspondence from A to B .
We will now split infinite sets into two groups, those with the same cardinality as the set of natural numbers and those with different cardinality.
D EFINITION 5
A set that is either finite or has the same cardinality as the set of po s itive integers is called countable. A set that is not countable is called uncountable. When an infinite set S is countable, we denote the cardinality of S by �o (where � is aleph, the first letter of the Hebrew alphabet).
We write l S I
=
�o and say that S has cardinality "aleph null."
We now give examples of countable and uncountable sets. EXAMPLE 18
Show that the set of odd positive integers is a countable set.
Solution: To show that the set of odd positive integers is countable, we will exhibit a one-to-one correspondence between this set and the set of positive integers. Consider the function fen)
= 2n - 1
from z+ to the set of odd positive integers. We show that f is a one-to-one correspondence by showing that it is both one-to-one and onto. To see that it is one-to-one, suppose that f(n) f(m). Then 2n - 1 2m - 1 , so n m . To see that it is onto, suppose that t is an odd positive
=
=
=
2.4 Sequences and Summations
2-49
159
Tenns not circled are not listed because they repeat previously listed tenns
2
3
4
5
6
7
8
9
10
11
12
3
5
7
9
11
13
15
17
19
21
23
I I I I I I I I I I I I
FIGURE 1 A One-to-One Correspondence Between Z+ and the Set of Odd Positive Integers.
FIGURE 2 Countable.
The Positive Rational Numbers Are
integer. Then t is 1 less than an even integer 2k, where k is a natural number. Hence t = 2k - 1 = ... f(k). We display this one-to-one correspondence in Figure 1 . An infinite set is countable if and only if it is possible to list the elements of the set in a sequence (indexed by the positive integers). The reason for this is that a one-to-one correspon dence f from the set of positive integers to a set S can be expressed in terms of a sequence a i , a2 , . . . , an , . . . , where al = f( 1 ), a2 = f(2), . . . , an = f(n), . . . . For instance, the set of odd integers can be listed in a sequence a I , a2 , . . . , an , . . . , where an = 2n - 1 . We can show that the set of all integers is countable by listing its members. EXAMPLE 19
Show that the set of all integers is countable.
Solution: We can list all integers in a sequence by starting with 0 and alternating between positive and negative integers: 0, 1 , - 1 , 2, -2, . Alternately, we could find a one-to-one correspondence between the set of positive integers and the set of all integers. We leave it to the reader to show that the function f(n) = n/2 when n is even and f(n) = -(n - 1 )/2 when n is ... odd is such a function. Consequently, the set of all integers is countable. .
.
.
It is not surprising that the set of odd integers and the set of all integers are both countable sets. Many people are amazed to learn that the set of rational numbers is countable, as Example 20 demonstrates. EXAMPLE 20
Show that the set of positive rational numbers is countable.
Solution: It may seem surprising that the set of positive rational numbers is countable, but we will show how we can list the positive rational numbers as a sequence rl , r2 , . . . , rn , First, note that every positive rational number is the quotient p /q of two positive integers. We can arrange the positive rational numbers by listing those with denominator q = 1 in the first row, those with denominator q = 2 in the second row, and so on, as displayed in Figure 2. The key to listing the rational numbers in a sequence is to first list the positive ratio nal numbers p/q with p + q = 2, followed by those with p + q = 3, followed by those with p + q = 4, and so on, following the path shown in Figure 2. Whenever we encounter a num ber p/q that is already listed, we do not list it again. For example, when we come to 2/2 = 1 we do not list it because we have already listed 1 / 1 = 1 . The initial terms in the list of posi tive rational numbers we have constructed are 1 , 1 /2, 2, 3 , 1 /3 , 1 /4, 2/3 , 3/2, 4, 5, and so on. These numbers are shown circled; the uncircled numbers in the list are those we leave out .
•
.
.
160
2 / Basic Structures: Sets, Functions, Sequences, and Sums
2-50
because they are already listed. Because all positive rational numbers are listed once, as the ... reader can verify, we have shown that the set of positive rational numbers is countable.
unks
�
EXAMPLE 21
Extra � Examples .....
We have seen that the set of rational numbers is a countable set. Do we have a promising candidate for an uncountable set? The first place we might look is the set of real numbers. In Example 2 1 we use an important proof method, introduced in 1 879 by Georg Cantor and known as the Cantor diagonalization argument, to prove that the set of real numbers is not countable. This proof method is used extensively in mathematical logic and in the theory of computation. Show that the set of real numbers is an uncountable set.
Solution: To show that the set of real numbers is uncountable, we suppose that the set of real numbers is countable and arrive at a contradiction. Then, the subset of all real numbers that fall between 0 and 1 would also be countable (because any subset of a countable set is also countable; see Exercise 36 at the end of the section). Under this assumption, the real numbers between 0 and 1 can be listed in some order, say, r" r2 , r3 , . . . . Let the decimal representation of these real numbers be r, = O.d" d1 2d1 3d'4 . . . r2 = O.d2 1 d22d23d24 . . . r3 = 0.d3 , d32d33d34 . . .
r4 = 0.d4 , d42 d43 d44 . . .
where djj E (O, 1 , 2, 3 , 4, 5 , 6, 7, 8, 9}. (For example, ifr, = 0.23794 1 02 . . . , we have dl 1 = 2, d1 2 = 3, d1 3 = 7, and so on.) Then, form a new real number with decimal expansion r = 0.d, d2d3d4 . . . , where the decimal digits are determined by the following rule: if djj "# 4 if djj = 4. (As an example, suppose that r, = 0. 23794 1 0 2 . . . , r2 = 0.44590 1 3 8 . . . , r3 = 0.09 1 1 8764 . . . , r4 = 0. 80553900 . . . , and so on. Then we have r = 0.d, d2 d3d4 = 0.4544 . . . , where d, = 4 because d" "# 4, d2 = 5 because d22 = 4, d3 = 4 because d33 "# 4, d4 = 4 because d44 "# 4, and so on.) Every real number has a unique decimal expansion (when the possibility that the expansion has a tail end that consists entirely of the digit 9 is excluded). Then, the real number r is not equal to any of r, , r2 , . . . because the decimal expansion of r differs from the decimal expansion of rj in the ith place to the right of the decimal point, for each i . Because there i s a real number r between 0 and 1 that i s not in the list, the assumption that all the real numbers between 0 and 1 could be listed must be false. Therefore, all the real nuinbers between 0 and 1 cannot be listed, so the set of real numbers between 0 and 1 is uncountable. Any set with an uncountable subset is uncountable (see Exercise 37 at the end of this section). ... Hence, the set of real numbers is uncountable. .
.
Exercises 1. Find these terms of the sequence ( _ 3 )n + 5n . a) ao
d)
as
{an }, where an
=
2·
2. What is the term as of the sequence {an } if an equals n a) 2 - 1 ? b) 7? d) ( 2 n c) 1 + ( _ l )n ? -
-
.
2.4 Sequences and Summations
· 2-51
3. What are the terms ao, a I , a2 , and a ofthe sequence {an },
3 where an equals b) (n a) 2n l? d) c) n 2 ? 4. What are the terms ao, a I , a2 , and a ofthe sequence {an }, 3 where an equals a) (-2t? b) 3? d) 2 n (-2t? c) 7 4n ? 5. List the first 1 0 terms of each of these sequences. a) the sequence that begins with 2 and in which each successive term is 3 more than the preceding term b) the sequence that lists each positive integer three times, in increasing order c) the sequence that lists the odd positive integers in in creasing order, listing each odd integer twice d) the sequence whose nth term is n ! - 2 n e) the sequence that begins with 3, where each succeed ing term is twice the preceding term t) the sequence whose first two terms are 1 and each succeeding term is the sum ofthe two preceding terms (This is the famous Fibonacci sequence, which we will study later in this text.) g) the sequence whose nth term is the number of bits in the binary expansion of the number n (defined in Section 3 .6) h) the sequence where the nth term is the number of let ters in the English word for the index n
l? + It++ fnj21? nj2J L
+ LjJ
+
+
6. List the first 1 0 terms of each of these sequences. a) the sequence obtained by starting with 10 and obtain
ing each term by subtracting 3 from the previous term b) the sequence whose nth term is the sum of the first n
positive integers
c) the sequence whose nth term is 3 n - 2 n d) the sequence whose nth term is e) the sequence whose first two terms are 1 and 2 and
L.Ji1J
each succeeding term is the sum of the two previous terms t) the sequence whose nth term is the largest integer whose binary expansion (defined in Section 3 .6) has n bits (Write your answer in decimal notation.) g) the sequence whose terms are constructed sequentially as follows: start with 1 , then add 1 , then multiply by 1 , then add 2, then multiply by and so on h) the sequence whose nth term is the largest integer such that � n
k!
2,
k
7. Find at least three different sequences beginning with the
terms 1 , 2, 4 whose terms are generated by a simple for mula or rule. S. Find at least three different sequences beginning with the terms 3, 5, 7 whose terms are generated by a simple for mula or rule. 9. For each ofthese lists ofintegers, provide a simple formula or rule that generates the terms of an integer sequence that begins with the given list. Assuming that your formula
161
or rule is correct, determine the next three terms of the sequence. 1 , 0, 1 , 1 , 0, 0, 1 , 1 , 1 , 0, 0, 0, 1 , . . . 1 , 2, 2, 3 , 4, 4, 5 , 6, 6, 7, 8 , 8 , . . . 1 , 0, 2, 0, 4, 0, 8, 0, 1 6, 0, . . . 3 , 6, 1 2, 24, 48, 96, 1 92, . . . 1 5 , 8, 1 , -6, - l 3 , -20, -27, . . . t) 3 , 5 , 8 , 1 2 , 1 7, 23 , 30, 38, 47, . . . g) 2, 1 6, 54, 128 , 250, 432, 686, . . . h) 2, 3 , 7, 25 , 1 2 1 , 72 1 , 504 1 , 4032 1 , . . .
a) b) c) d) e)
10. For each ofthese lists of integers, provide a simple formula
or rule that generates the terms of an integer sequence that begins with the given list. Assuming that your formula or rule is correct, determine the next three terms of the sequence.
a) 3 , 6, 1 1 , 1 8 , 27, 38, 5 1 , 66, 83 , 1 02, . . . b) 7, 1 1 , 1 5 , 1 9 , 23 , 27, 3 1 , 3 5 , 39, 43 , . . . c) 1 , 1 0, 1 1 , 1 00, 1 0 1 , 1 1 0, 1 000, 1 00 1 ,
d) e)
t) g) b)
Ill,
1 0 10, 1 0 1 1 , . . . 1 , 2, 2, 2, 3 , 3 , 3 , 3 , 3 , 5 , 5 , 5 , 5 , 5 , 5 , 5 , . . . 0, 2, 8 , 26, 80, 242, 728 , 2 1 86, 6560, 1 9682, . . . 1 , 3 , 1 5 , 105, 945 , 1 0395 , l 35 1 35 , 2027025, 34459425, . . . 1 , 0, 0, 1 , 1 , 1 , 0, 0, 0, 0, 1 , 1 , 1 , 1 , 1 , . . . 2, 4, 1 6, 256, 65536, 4294967296, . . .
* * 1 1 . Show that if an denotes the nth positive integer that is not
+ .Ji1
a perfect square, then an = n { } ' where {x } denotes the integer closest to the real number x . * 12. Let an be the nth term o f the sequence 1 , 2, 2, 3, 3, 3 , 4, 4, 4, 4, 5, 5, 5, 5, 5, 6, 6, 6, 6, 6, 6, . . . , constructed by including the integer exactly times. Show that an = ffn 1 · 13. What are the values of these sums?
L +J L (k + l) k= 1 L; =1 01 3
a)
5
c)
k
k
4
b)
d)
L (-2)j j =O L (2j+ 1 - 2j ) j =O 8
I
14. What are the values of these sums, where S = { , 3 ,
5 , 7}? a)
L} L (I j} ) jES jES
d)
L
1 jES 15. What is the value of each of these sums of terms of a geometric progression? c)
a)
8
L 3 · 2j j =O
b)
8
L 2j j=1 8
8
d) L 2 · (-3)j L (-3)j j =O j=2 1 6. Find the value of each of these sums. c)
8
a)
L (I + (-IY) j =O
b)
8
L (3j - 2j ) j =O
2 / Basic Structures: Sets, Functions, Sequences, and Sums
162
c)
8
L (2 . 3 j + 3 . 2j ) j=o
d)
8
31. Determine whether each of these sets is countable or un countable. For those that are countable, exhibit a one-to one correspondence between the set of natural numbers and that set. a) the negative integers b) the even integers c) the real numbers between 0 and d) integers that are multiples of 7
L (2j + 1 - 2j ) j=O
17. Compute each of these double sums.
3 L L (2 + 3 j ) i=Oj=O 3 d) L L ij i =Oj = 1
3 L L (i + j ) i=lj=1 3 c) L L i i= l j=O
2
a)
b)
2
2 2
i
18. Compute each of these double sums.
!
3 L L (3 i + 2j ) i=Oj=O 2 3 d) L L i=Oj=O
3 2 L L (i - j ) i=lj=1 3 c) L L j i = 1 j =O a)
b)
2
2
i 2l
that L� = I (aj - aj - I ) = an - aO , where ao , a I , . . . , an is a sequence of real numbers. This type of
19. 5how
sum is called telescoping.
20. Use the identity l /(k(k + 1 » = 1 / k - l /(k + 1 ) and Ex ercise 1 9 to compute L � = I l /(k(k + 1 » .
2
- 1)2
21. Sum both sides of the identity k - (k = 2k - 1 from k = 1 to k = and use Exercise 1 9 to find a) a formula for
n
L� = I (2k - 1 ) (the sum of the first
odd natural numbers). b) a formula for L � = I k.
n
* 22. Use the technique given in Exercise 1 9, together with the result of Exercise 2 1 b, to derive the formula for L � = I k given in Table 2. Take ak = k3 in the telescoping sum in Exercise 1 9.)
2
(Hint:
23. Find L�� lOo k. (Use Table 2.)
24. Find L�� 99 k3 . (Use Table 2.)
* 25. Find a formula for L;= o L .J7(j , when m is a positive integer. * 26. Find a formula for L ;= 0 LJkj , when m is a positive integer. There is also a special notation for products. The product of
am , am + I , . . . , an is represented by
27. What are the values of the following products?
i n :�o lOo _
b) d)
n 1 = 1 ( I )i
n �l o= 5 i
n1 = 1 2
Recall that the value of the factorial function at a posi tive integer denoted by is the product of the posi tive integers from 1 to inclusive. Also, we specify that
O! = 1 .
32. Determine whether each of these sets is countable or un countable. For those that are countable, exhibit a one-to one correspondence between the set of natural numbers and that set. a) the integers greater than 1 0 b ) the odd negative integers c) the real numbers between 0 and 2 d) integers that are multiples of 1 0 33. Determine whether each o f these sets i s countable or un countable. For those that are countable, exhibit a one-to one correspondence between the set of natural numbers and that set. a) all bit strings not containing the bit 0 b) all positive rational numbers that cannot be written with denominators less than 4 c) the real numbers not containing 0 in their decimal representation d) the real numbers containing only a finite number of 1 s in their decimal representation 34. Determine whether each of these sets is countable or un countable. For those that are countable, exhibit a one-to one correspondence between the set of natural numbers and that set. a) integers not divisible by 3 b) integers divisible by 5 but not by 7 c) the real numbers with decimal representations con sisting of all 1 s d) the real numbers with decimal representations of all I s or 9s 35. If A is an uncountable set and B is a countable set, must A - B be uncountable? 36. Show that a subset of a countable set is also countable.
j =m a) c)
2-52
n,
n!
n, n!,
28. Express using product notation. 29. Find L� = o j ! . 30. Find n� = o j ! .
37. Show that if A and B are sets, A is uncountable, and A S; B , then B is uncountable. ' ,
38. Show that if A and B are sets with the same cardinality, then the power set of A and the power set of B have the same cardinality. 39. Show that if A and B are sets with the same cardinal ity and C and D are sets with the same cardinality, then A x C and B x D have the same cardinality. 40. Show that the union oftwo countable sets is countable. **41. Show that the union of a countable number of countable sets is countable.
Key Terms and Results
2-53
42. Show that the set Z+ x
Z+ is countable.
*43. Show that the set of all finite bit strings is countable. *44. Show that the set of real numbers that are solutions of
quadratic equations ax 2 +
bx
b,
+ c = 0, where a, and c are integers, is countable. *45. Show that the set of all computer programs in a partic ular programming language is countable. A com puter program written in a programming language can be thought of as a string of symbols from a finite alphabet.] *46. Show that the set of functions from the positive inte gers to the set {O, 1 , 2, 3 , 4, 5 , 6, 7, 8, is uncountable. First set up a one-to-one correspondence between
[Hint:
9}
[Hint:
163
I I I(n)
the set of real numbers between 0 and and a subset of these functions. Do this by associating to the real number 0.d1 d2 dn the function with = dn .] • • •
• . •
*47. We say that a function is computable if there is a com
puter program that finds the values of this function. Use Exercises 45 and 46 to show that there are functions that are not computable.
I S S peSI ), I(s)} s I(s )
*48. Show that if
is a set, then there does not exist an onto function from to the power set of Suppose such a function existed. Let T = E and show that no element can exist for which = T.]
(s S. S[Hint: i s f/.
Key Terms and Results TERMS set: a collection of distinct objects axiom: a basic assumption of a theory paradox: a logical inconsistency element, member of a set: an object in a set o (empty set, null set): the set with no members universal set: the set containing all objects under consideration Venn diagram: a graphical representation of a set or sets S = T (set equality): and T have the same elements S � T (S is a subset of T).: every element of is also an
S
element of T
S
S is a subset of T and S =I= T a set with n elements, where n is a nonnegative integer a set that is not finite lSI the number of elements in S the set of all subsets of S
S e T (S is
a proper subset of
T):
finite set:
infinite set: (the cardinality of S): P(S) (the power set of S): A U B (the union of A and B): the set containing those ele
ments that are in at least one of A and B
A n B (the intersection of A and B): the set containing those
elements that are in both A and B .
A - B (the difference o f A and B): the set containing those
every element of B is the image of some element in A
one-to-one function, injection: a function such that the im
ages of elements in its domain are all different
one-to-one correspondence, bijection: a function that is both
one-to-one and onto
inverse of I: the function that reverses the correspondence
I (when I is a bijection) Lxfx1JI(g(x» to x thethelargest integer not exceeding x smallest integer greater than or equal to x a function with domain that is a subset of the set of given by
l o g (composition of 1 and g): the function that assigns (floor function): (ceiling function):
sequence:
integers
geometric progression: a sequence ofthe form a, ar, ar 2 , . . . ,
where a and r are real numbers
arithmetic progression: a sequence of the form
a + 2d, . . . , where a and d are real numbers
a, a + d,
string: a finite sequence empty string: a string of length zero
L7= 1 aj : the sum al + a2 + . . . + an D7= 1 aj : the product al a2 · . . an
elements that are in A but not in B
countable set: a set that either is finite or can be placed in
set that are not in A
uncountable set: a set that is not countable Cantor diagonalization argument: a proof technique that
A (the complement of A): the set of elements in the universal A EEl B (the symmetric difference of A and B): the set con
taining those elements in exactly one of A and B
membership table: a table displaying the membership of ele
ments in sets
function from A to B : an assignment of exactly one element
of B to each element of A domain ofI: the set A, where is a function from A to B codomain ofI: the set B, where is a function from A to B
b
I
range ofI: the set of images of onto function, surjection: a function from A to B such that
I I b I(I(a a ) b )
is the image of a under I: = a is a pre-image of b under I:
=
one-to-one correspondence with the set of positive integers
can be used to show that the set of real numbers is uncountable
RESULTS
2.2
The set identities given in Table 1 in Section The summation formulae in Table 2 in Section 2.4 The set of rational numbers is countable. The set of real numbers is uncountable.
164
2/ Basic Structures: Sets, Functions, Sequences, and Sums
2-54
Review Questions 1 . Explain what it means for one set to be a subset of another
2. 3. 4.
5.
6.
7. 8.
set. How do you prove that one set is a subset of another set? What is the empty set? Show that the empty set is a subset of every set. a) Define l S I , the cardinality of the set S. b) Give a formula for IA U B I , where A and B are sets. a) Define the power set of a set S. b) When is the empty set in the power set of a set S? c) How many elements does the power set of a set S with n elements have? a) Define the union, intersection, difference, and sym metric difference of two sets. b) What are the union, intersection, difference, and sym metric difference of the set of positive integers and the set of odd integers? a) Explain what it means for two sets to be equal. b) Describe as many of the ways as you can to show that two sets are equal. c) Show in at least two different ways that the sets A - (B n C) and (A - B) U (A - C) are equal. Explain the relationship between logical equivalences and set identities. a) Define the domain, codomain, and the range of a function. b) Let f(n) be the function from the set of integers to the set of integers such that f(n) = n 2 1 . What are the domain, codomain, and range of this function? a) Define what it means for a function from the set of positive integers to the set of positive integers to be one-to-one.
+
9.
b) Define what it means for a function from the set of
positive integers to the set of positive integers to be onto. c) Give an example of a function from the set of posi tive integers to the set of positive integers that is both one-to-one and onto. d) Give an example of a function from the set of positive integers to the set ofpositive integers that is one-to-one but not onto. e) Give an example of a function from the set of posi tive integers to the set of positive integers that is not one-to-one but is onto. f) Give an example of a function from the set of positive integers to the set of positive integers that is neither one-to-one nor onto. 1 0. a) Define the inverse of a function. b) When does a function have an inverse? c) Does the function f(n) = 1 0 - n from the set of inte gers to the set of integers have an inverse? If so, what is it? 1 1. a) Define the floor and ceiling functions from the set of real numbers to the set of integers. b) For which real numbers x is it true that LxJ = rx l ? 12. Conjecture a formula for the terms of the sequence that begins 8 , 14, 32, 86, 248 and find the next three terms of your sequence. 13. What is the sum ofthe terms ofthe geometric progression a
+ +...+ ar
ar
n
when r =1= I ?
14. Show that the set o f odd integers i s countable. 1 5. Give an example of an uncountable set.
Supplementary Exercises 1 . Let A be the set of English words that contain the letter x, and let B be the set of English words that contain the letter q . Express each of these sets as a combination of A and B. a) The set of English words that do not contain the letter x.
b) The set o f English words that contain both an x and a q.
c) The set o f English words that contain an x but not a q . d) The set o f English words that do not contain either an x
or a q . e) The set o f English words that contain an x or a q , but not both. 2. Show that if A is a subset of B, then the power set of A is a subset of the power set of B . 3 . Suppose that A and B are sets such that the power set of A is a subset of the power set of B. Does it follow that A is a subset of B?
4. Let E denote the set of even integers and 0 denote the set of odd integers. As usual, let Z denote the set of all
integers. Determine each of these sets.
a) E U O
b) E n O
c) Z - E
d) Z - O
5. Show that if A and B are sets, then A - (A - B) = A n B. 6 . Let A and B b e sets. Show that A � B i f and only if A n B = A. 7 . Let A , B, and C be sets. Show that ( A - B ) - C i s not necessarily equal to A - (B - C). 8. Suppose that A, B, and C are sets. Prove or disprove that (A - B) - C = (A - C) - B . 9 . Suppose that A , B , C, and D are sets. Prove or disprove that (A - B) - (C - D) = (A - C) (B - D). -
10. Show that if A and B are finite sets, then IA n BI :::: I A U B I . Determine when this relationship is an equality.
2-55
Computer Projects
x and y is it true that rx + y1 = rx 1 + ryl ? For which real numbers x and y it true that rx + y1 = rx 1 + LYJ? 2 Prove that Ln/2J rn/2 1 = Ln /4J for all integers n. Prove that if m is an integer, then Lx J + Lm - x J = m - 1 , unless x is an integer, in which case, it equals m. Prove that if x is a real number, then LLx /2 J /2J = Lx /4J. Prove that if n is an odd integer, then rn 2 /41 = (n 2 + 3)/4. Prove that if m and are positive integers and x is a real number, then
A and B be sets in a finite universal set U . List the following in order of increasing size.
20. For which real numbers
a) b)
21.
1 1 . Let
I A I , I A U B I , I A n B I , l U I , 10 1 I A - B I , I A EEl B I , I A I I B I , I A U B I , 1 0 1 12. Let A and B be subsets ofthe finite universal set U . Show that I A n 81 IUI - IAI - IBI IA n BI ·
=
+
+
13. Let 1 and g be functions from { I , 2, 3 , 4} to {a , b, c, d}
and from {a , b, c, d} to { I , 2, 3 , 4}, respectively, such that d, c, a, and 1(4) b, and 2, g(b) I, 3, and g(d) = 2. a) Is 1 one-to-one? Is g one-to-one? b) Is 1 onto? Is g onto? c) Does either 1 or g have an inverse? If so, find this inverse. 14. Let 1 be a one-to-one function from the set A to the set B. Let S and T be subsets of A. Show that I(S n T) I(S) n I(T). 15. Give an example to show that the equality in Exercise 1 4 may not hold i f 1 i s not one-to-one. Suppose that 1 is a function from A to B. We define the func tion SI from P(A) to P(B) by the rule SI(X) I(X) for each subset X of A. Similarly, we define the function SI-I from P(B) to P(A) by the rule SI-I (y) I- I (y) for each subset Y of B. Here, we are using Definition 4, and the defi nition of the inverse image of a set found in the preamble to Exercise 38, both in Section 2.3. *16. a) Prove that if 1 is a one-to-one function from A to B, then SI i s a one-to-one function from P(A) t o P(B). b) Prove that if 1 is an onto function from A to B, then SI is an onto function from P(A) to P(B). c) Prove that if 1 is an onto function from A to B, then SI-I is a one-to-one function from P(B) to P(A). d) Prove that if 1 is a one-to-one function from A to B, then SI-I is an onto function from P(B) to P(A). e) Use parts (a) through (d) to conclude that if 1 is a one-to-one correspondence from A to B, then SI is a one-to-one correspondence from P(A) to P(B) and SI-I is a one-to-one correspondence from P(B) to
= =1(3) = g(a) 1(1) = = =1(2)g(c)
=
22. 23. 24. 25. 26.
=
=
*27.
=
P(A).
17. Prove that if 1 and g are functions from A to B and SI Sg (using the definition in the preamble to Exercise
= I(x) = g(x) x n n = rn/21 + Ln/2J . Lx + yJ = Lx J + LYJ? x y is it true that
1 6), then for all E A. 18. Show that i f i s an integer, then 19. For which real numbers and
*28.
165
IS
n
l Lx J + n J = l x � n J m Prove that if m is a positive integer and x is a real number, then Lmx J = Lx J + lx + � J + lx + � J + . . . + lx + m � 1 J We define the by setting U I = 1 and U 2 = 2. Furthermore, after determining whether the in tegers less than n are Ulam numbers, we set n equal to the next Ulam number if it can be written uniquely as the sum of two different Ulam numbers. Note that U 3 = 3, U = 4, U = 6, and U6 = 8. U1am numbers
4
5
a ) Find the first 20 Ulam numbers. b) Prove that there are infinitely many Ulam numbers.
kt
�O I . (The notation used here for products is defined in the preamble to Exercise 27 in Section 2.4.) *30. Determine a rule for generating the terms ofthe sequence that begins 1 , 3 , 4, 8 , 1 5 , 27, 50, 92, . . . , and find the next four terms of the sequence. *31 . Determine a rule for generating the terms of the se quence that begins 2, 3 , 3 , 5 , 1 0 , 1 3 , 39, 43 , 1 72, 1 77, 885, 89 1 , . . . , and find the next four terms of the sequence. *32. Prove that if A and B are countable sets, then A x B is also a countable set. 29. Determine the value ofn! I
Computer Proj ects Write programs with the specified input and output.
n
A and B of a set with elements, use bit strings to find A, A U B, A n B, A - B, and
1. Given subsets
A EEl B.
A and B from the same universal set, find A U B, A n B, A - B, and A B (see preamble to Exer
2. Given multisets
cise 59 of Section 2.2).
+
1 66
2 / Basic Structures: Sets, Functions, Sequences, and Sums
3. Given fuzzy sets A and B, find if, A U B, and A n B (see
preamble to Exercise 61 of Section 2.2). 4. Given a function I from { I , 2, . . . , n} to the set of integers, determine whether I is one-to-one.
2-56
, n} to itself, determine whether I is onto. 6. Given a bijection I from the set { I , 2, . . . , n} to itself, find 1- 1 . S. Given a function I from { I , 2, . . .
Computations and Explorations Use a computational program or programs you have written to do these exercises.
1 . Given two finite sets, list all elements in the Cartesian prod
uct of these two sets. 2. Given a finite set, list all elements of its power set. 3. Calculate the number of one-to-one functions from a set S to a set T, where S and T are finite sets of various sizes. Can you determine a formula for the number of such functions? (We will find such a formula in Chapter 5 .) 4. Calculate the number of onto functions from a set S to a set T, where S and T are finite sets of various sizes. Can
you determine a formula for the number of such functions? (We will find such a formula in Chapter 7.) * S. Develop a collection of different rules for generating the
terms of a sequence and a program for randomly select ing one of these rules and the particular sequence gen erated using these rules. Make this part of an interactive program that prompts for the next term of the sequence and determine whether the response is the intended next term.
Writing Proj ects Respond to these with essays using outside sources.
1 . Discuss how an axiomatic set theory can be developed
to avoid Russell's paradox. (See Exercise 38 of Sec tion 2. 1 .) 2. Research where the concept of a function first arose, and describe how this concept was first used. 3. Explain the different ways in which the Encyclopedia 01 Integer Sequences has been found useful. Also, describe a few of the more unusual sequences in this encyclopedia and how they arise.
4. Define the recently invented EKG sequence and describe
some of its properties and open questions about it.
S. Look up the definition of a transcendental number. Explain
how to show that such numbers exist and how such num bers can be constructed. Which famous numbers can be shown to be transcendental and for which famous numbers is it still unknown whether they are transcendental? 6. Discuss infinite cardinal numbers and the continuum hypothesis.
The Fundamentals : Algorithms, the Integers, and Matrices 3.1 Algorithms 3.2 The Growth of Functions
3.3 Complexity of Algorithms
3.4 The Integers
and Division
3.5 Primes and Greatest Common Divisors
3.6 Integers and Algorithms
3.7 Applications of Number Theory
3.8 Matrices
3.1
M
any problems can b e solved by considering them a s special cases of general problems. For instance, consider the problem of locating the largest integer in the sequence 10 1 2 , 144, 2 1 2, 98. This i s a specific case o f the problem o f locating the largest integer in a sequence of integers. To solve this general problem we must give an algorithm, which specifies a sequence of steps used to solve this general problem. We will study algorithms for solving many different types of problems in this book. For instance, algorithms will be developed for finding the greatest common divisor of two integers, for generating all the orderings of a finite set, for searching a list, for sorting the elements of a sequence, and for finding the shortest path between nodes in a network. One important consideration concerning an algorithm is its computational complexity. That is, what are the computer resources needed to use this algorithm to solve a problem of a specified size? To measure the complexity of algorithms we use big- O and big-Theta notation, which we develop in this chapter. We will illustrate the analysis of the complexity of algorithms in thi s chapter. Furthermore, we will discuss what the complexity of a n algorithm means in practical and theoretical terms. The set of integers plays a fundamental role in discrete mathematics. In particular, the concept of division of integers is fundamental to computer arithmetic. We will briefly review some of the important concepts of number theory, the study of integers and their properties. Some important algorithms involving integers will be studied, including the Euclidean algorithm for computing greatest common divisors, which was first described thousands of years ago. Integers can be represented using any positive integer greater than I as a base. Binary expansions, which are used throughout computer science, are representations with 2 as the base . In this chapter we discuss base b representations of integers and give an algorithm for finding them. Algorithms for integer arithmetic, which were the first procedures called algorithms, will also be discussed. This chapter also introduces several important applications of number theory. For example, in this chapter we will use number theory to make messages secret, to generate pseudorandom numbers, and to assign memory locations to computer files. Number theory, once considered the purest of subj ects, has become an essential tool in providing computer and Internet security. Matrices are used in discrete mathematics to represent a variety of discrete structures. We review the basic material about matrices and matrix arithmetic needed to represent relations and graphs. Matrix arithmetic will be used in numerous algorithms involving these structures.
I,
Al g orithms Introduction There are many general classes ofproblems that arise in discrete mathematics. For instance: given a sequence of integers, find the largest one; given a set, list all its subsets; given a set of integers, put them in increasing order; given a network, find the shortest path between two vertices. When presented with such a problem, the first thing to do i s to construct a model that translates the problem into a mathematical context. Discrete structures used in such models include sets, sequences, and functions-structures discussed in Chapter 2-as well as such other structures as permutations, relations, graphs, trees, networks, and finite state machines-concepts that will be discussed in later chapters.
3- /
1 67
168
3
/ The Fundamentals: Algorithms, the Integers, and Matrices
3-2
Setting up the appropriate mathematical model is only part of the solution. To complete the solution, a method is needed that will solve the general problem using the model. Ideally, what is required is a procedure that follows a sequence of steps that leads to the desired answer. Such a sequence of steps is called an algorithm.
DEFINITION 1
An algorithm is a finite set ofprecise instructions for performing a computation or for solving
a problem.
The term algorithm is a corruption of the name al-Khowarizmi, a mathematician of the ninth century, whose book on Hindu numerals is the basis of modern decimal notation. Originally, the word algorism was used for the rules for performing arithmetic using decimal notation. Algorism evolved into the word algorithm by the eighteenth century. With the growing interest in computing machines, the concept of an algorithm was given a more general meaning, to include all definite procedures for solving problems, not just the procedures for performing arithmetic. (We will discuss algorithms for performing arithmetic with integers in Section 3.6.) In this book, we will discuss algorithms that solve a wide variety of problems. In this section we will use the problem of finding the largest integer in a finite sequence of integers to illustrate the concept of an algorithm and the properties algorithms have. Also, we will describe algorithms for locating a particular element in a finite set. In subsequent sections, procedures for finding the greatest common divisor of two integers, for finding the shortest path between two points in a network, for multiplying matrices, and so on, will be discussed. EXAMPLE l
Describe an algorithm for finding the maximum (largest) value in a finite sequence of integers.
Exam= �
Even though the problem of finding the maximum element in a sequence is relatively trivial, it provides a good illustration of the concept of an algorithm. Also, there are many instances where the largest integer in a finite sequence of integers is required. For instance, a university may need to find the highest score on a competitive exam taken by thousands of students. Or a sports organization may want to identify the member with the highest rating each month. We want to develop an algorithm that can be used whenever the problem of finding the largest element in a finite sequence of integers arises. We can specify a procedure for solving this problem in several ways. One method is simply to use the English language to describe the sequence of steps used. We now provide such a solution.
Solution ofExample 1: We perform the following steps. 1 . Set the temporary maximum equal to the first integer in the sequence. (The temporary maximum will be the largest integer examined at any stage of the procedure.) 2. Compare the next integer in the sequence to the temporary maximum, and if it is larger than the temporary maximum, set the temporary maximum equal to this integer.
unkS � ABU JA ' FAR MOHAMMED IBN MUSA AL-KHOWARIZMI (C. 780 - C. 850) al-Khowarizmi, an as tronomer and mathematician, was a member of the House of Wisdom, an academy of scientists in Baghdad. The name al-Khowarizmi means "from the town of Kowarzizm," which was then part of Persia, but is now called Khiva and is part of Uzbekistan. al-Khowarizmi wrote books on mathematics, astronomy, and geography. Western Europeans first learned about algebra from his works. The word algebra comes from al-jabr, part of the title of his book Kitab al-jabr w 'al muquabala. This book was translated into Latin and was a widely used textbook. His book on the use of Hindu numerals describes procedures for arithmetic operations using these numerals. European authors used a Latin corruption of his name, which later evolved to the word algorithm, to describe the subject of arithmetic with Hindu numerals.
3 . 1 Algorithms
169
3 . Repeat the previous step if there are more integers in the sequence. 4. Stop when there are no integers left in the sequence. The temporary maximum at this
procedure max(a] , a2 ,
. . . , an : integers) max := a] for i := to n if max < aj then max := aj { max is the largest element}
2
This algorithm first assigns the initial term of the sequence, ai , to the variable max. The "for" loop is used to successively examine terms of the sequence. If a term is greater than the current value of max, it is assigned to be the new value of max. There are several properties that algorithms generally share. They are useful to keep in mind when algorithms are described. These properties are: • • • • • • •
Input. An algorithm has input values from a specified set. Output. From each set of input values an algorithm produces output values from a specified set. The output values are the solution to the problem. Definiteness. The steps of an algorithm must be defined precisely. Correctness. An algorithm should produce the correct output values for each set of input values. Finiteness. An algorithm should produce the desired output . after a finite (but perhaps large) number of steps for any input in the set. Effectiveness. It must be possible to perform each step of an algorithm exactly and in a finite amount of time. Generality. The procedure should be applicable for all problems of the desired form, not just for a particular set of input values.
170
/
3 The Fundamentals: Algorithms, the Integers, and Matrices
EXAMPLE 2
3-4
Show that Algorithm 1 for finding the maximum element in a finite sequence of integers has all the properties listed.
Solution: The input to Algorithm 1 is a sequence of integers. The output is the largest integer in the sequence. Each step of the algorithm is precisely defined, because only assignments, a finite loop, and conditional statements occur. To show that the algorithm is correct, we must show that when the algorithm terminates, the value of the variable max equals the maximum of the terms of the sequence. To see this, note that the initial value of max is the first term of the sequence; as successive terms of the sequence are examined, max is updated to the value of a term if the term exceeds the maximum of the terms previously examined. This (informal) argument shows that when all the terms have been examined, max equals the value of the largest term. (A rigorous proof of this requires techniques developed in Section 4. l .) The algorithm uses a finite number of steps, because it terminates after all the integers in the sequence have been examined. The algorithm can be carried out in a finite amount of time because each step is either a comparison or an assignment. Finally, Algorithm 1 is general, because it can be used to .... find the maximum of any finite sequence of integers.
Searching Algorithms The problem of locating an element in an ordered list occurs in many contexts. For instance, a program that checks the spelling of words searches for them in a dictionary, which is just an ordered list of words. Problems of this kind are called searching problems. We will discuss several algorithms for searching in this section. We will study the number of steps used by each of these algorithms in Section 3.3. The general searching problem can be described as follows: Locate an element x in a list of distinct elements aj , a2 , . . . , an , or determine that it is not in the list. The solution to this search problem is the location of the term in the list that equals x (that is, i is the solution if x = aj ) and is 0 if x is not in the list.
unkS �
THE LINEAR SEARCH The first algorithm that we will present is called the linear search, or sequential search, algorithm. The linear search algorithm begins by comparing x and aj . When x = a j , the solution is the location of a j , namely, 1 . When x #- ai , compare x with a2 . If x = a2, the solution is the location of a 2 , namely, 2. When x #- a2, compare x with a3 . Continue this process, comparing x successively with each term of the list until a match is found, where the solution is the location of that term, unless no match occurs. If the entire list has been searched without locating x , the solution is O . The pseudocode for the linear search algorithm is displayed as Algorithm 2.
ALGORITHM 2 The Linear Search Algorithm.
procedure linear search(x : integer, a 1 , a2 ,
i
:=
. . . , an : distinct integers)
1
while (i :5
n and x =1= ai) i := i 1 if i :5 n then location : = i else location := 0 {location is the subscript of the term that equals x , or is 0 if x is not found}
+
3 . 1 Algorithms
3-5
links �
EXAMPLE 3
171
THE BINARY SEARCH We will now consider another searching algorithm. This algorithm can be used when the list has terms occurring in order of increasing size (for instance: if the terms are numbers, they are listed from smallest to largest; if they are words, they are listed in lexicographic, or alphabetic, order). This second searching algorithm is called the binary search algorithm. It proceeds by comparing the element to be located to the middle term of the list. The list is then split into two smaller sublists of the same size, or where one of these smaller lists has one fewer term than the other. The search continues by restricting the search to the appropriate sublist based on the comparison of the element to be located and the middle term. In Section 3.3, it will be shown that the binary search algorithm is much more efficient than the linear search algorithm. Example 3 demonstrates how a binary search works. To search for 1 9 in the list 1 2 3 5 6 7 8 1 0 1 2 1 3 1 5 1 6 1 8 1 9 20 22, first split this list, which has 16 terms, into two smaller lists with eight terms each, namely, 1 2 3 5 6 7 8 10
1 2 1 3 1 5 1 6 1 8 1 9 20 22.
Then, compare 19 and the largest term in the first list. Because 10 < 1 9, the search for 19 can be restricted to the list containing the 9th through the 1 6th terms of the original list. Next, split this list, which has eight terms, into the two smaller lists of four terms each, namely, 12 13 15 16
1 8 1 9 20 22.
Because 16 < 19 (comparing 1 9 with the largest term of the first list) the search is restricted to the second of these lists, which contains the 1 3th through the 1 6th terms of the original list. The list 1 8 1 9 20 22 is split into two lists, namely, 18 19
20 22.
Because 19 is not greater than the largest term of the first of these two lists, which is also 1 9, the search is restricted to the first list: 1 8 1 9, which contains the 1 3th and 1 4th terms of the original list. Next, this list of two terms is split into two lists of one term each: 1 8 and 1 9. Because 1 8 < 1 9, the search is restricted to the second list: the list containing the 1 4th term of the list, which is 1 9. Now that the search has been narrowed down to one term, a comparison is made, .... and 1 9 is located as the 1 4th term in the original list. We now specify the steps of the binary search algorithm. To search for the integer x in the list a I , a2 , . . . , an , where al < a2 < . . . < a n , begin by comparing x with the middle term of the sequence, a m , where m = L(n + 1 )/2J . (Recall that LxJ is the greatest integer not exceeding x .) If x > am , the search for x can be restricted to the second half of the sequence, which is am + l , am +2 , . . . , an . Ifx is not greater than am , the search for x can be restricted to the first half of the sequence, which is a I , a2 , . . . , am . The search has now been restricted to a list with no more than rn 121 elements. (Recall that rxl is the smallest integer greater than or equal to x .) Using the same procedure, compare x to the middle term of the restricted list. Then restrict the search to the first or second half of the list. Repeat this process until a list with one term is obtained. Then determine whether this term is x . Pseudocode for the binary search algorithm is displayed as Algorithm 3 .
172
3 / The Fundamentals: Algorithms, the Integers, and Matrices
3-6
ALGORITHM 3 The Binary Search Algorithm.
procedure binary search
i j
:=
:=
I
(x : integer, aI , a2 , . . . , an : increasing integers) {i is left endpoint of search interval}
n {j is right endpoint of search interval}
while i < j begin m :=
L(i
+ j)/2J
+1
if x > am then i := m else j := m end if x = ai then location := else location := 0
i
{location is the subscript of the term equal to x , or 0 if x is not found}
Algorithm 3 proceeds by successively narrowing down the part of the sequence being searched. At any given stage only the terms beginning with ai and ending with aj are under consideration. In other words, i and j are the smallest and largest subscripts of the remaining terms, respectively. Algorithm 3 continues narrowing the part of the sequence being searched until only one term of the sequence remains. When this is done, a comparison is made to see whether this term equals x .
Sorting
Demo �
Ordering the elements of a list is a problem that occurs in many contexts. For example, to produce a telephone directory it is necessary to alphabetize the names of subscribers. Putting addresses in order in an e-mail mailing list can determine whether there are duplicated addresses. Creating a useful dictionary requires that words be put in alphabetical order. Similarly, generating a parts list requires that we order them according to increasing part number. Suppose that we have a list of elements of a set. Furthermore, suppose that we have a way to order elements of the set. (The notion of ordering elements of sets will be discussed in detail in Section 8.6.) Sorting is putting these elements into a list in which the elements are in increasing order. For instance, sorting the list 7, 2, 1 , 4, 5, 9 produces the list 1 , 2, 4, 5, 7, 9. Sorting the list d, h, c, a,f (using alphabetical order) produces the list a, c, d,J, h. An amazingly large percentage of computing resources is devoted to sorting one thing or another. Hence, much effort has been devoted to the development of sorting algorithms. A surprisingly large number of sorting algorithms have been devised using distinct strategies. In his fundamental work, The Art o/Computer Programming, Donald Knuth devotes close to 400 pages to sorting, covering around 1 5 different sorting algorithms in depth! There are many reasons why sorting algorithms interest computer scientists and mathematicians. Among these reasons are that some algorithms are easier to implement, some algorithms are more efficient (either in general, or when given input with certain characteristics, such as lists slightly out of order), some algorithms take advantage of particular computer architectures, and some algorithms are particularly clever. In this section we will introduce two sorting algorithms, the bubble sort and the insertion sort. Two other sorting algorithms, the selection sort and the binary insertion sort, are introduced in the exercises at the end of this section, and the shaker sort is introduced in the Supplementary Exercises at the end of this chapter. In Section 4.4 we will discuss the merge sort and introduce the quick sort in the exercises in that section; the tournament sort is introduced in the exercise set in Section 1 0.2. We cover sorting algorithms both because sorting is an important problem and because these algorithms can serve as examples of many important concepts.
3 . 1 Algorithms
3- 7
First pass
C; 4 5
Third pass
C
� 3
2 (3
4 1 5
2 3
C� 5
2 3 1
Second pass
4 (5
1 2 ( 3
unkS �
EXAMPLE 4
2
2 1
C�
4
[1]
4
(!
:
an interchange
[1]
C
Fourth pass ( 1 2
[iJ [iJ FIGURE 1
(�
rn
173
[1]
( : pair in correct order
numbers in color guaranteed to be in correct order
The Steps of a Bubble Sort.
THE BUBBLE SORT The bubble sort is one of the simplest sorting algorithms, but not one of the most efficient. It puts a list into increasing order by successively comparing adjacent elements, interchanging them if they are in the wrong order. To carry out the bubble sort, we perform the basic operation, that is, interchanging a larger element with a smaller one following it, starting at the beginning of the list, for a full pass. We iterate this procedure until the sort is complete. Pseudocode for the bubble sort is given as Algorithm 4. We can imagine the elements in the list placed in a column. In the bubble sort, the smaller elements "bubble" to the top as they are interchanged with larger elements. The larger elements "sink" to the bottom. This is illus trated in Example 4. Use the bubble sort to put 3, 2, 4, 1, 5 into increasing order.
Solution: The steps of this algorithm are illustrated in Figure 1 . Begin by comparing the first two elements, 3 and 2. Because 3 > 2, interchange 3 and 2, producing the list 2, 3 , 4, 1 , 5 . Because 3 < 4, continue by comparing 4 and 1 . Because 4 > 1 , interchange 1 and 4, producing the list 2, 3, 1 , 4, 5. Because 4 < 5, the first pass is complete. The first pass guarantees that the largest element, 5, is in the correct position. The second pass begins by comparing 2 and 3. Because these are in the correct order, 3 and 1 are compared. Because 3 > 1 , these numbers are interchanged, producing 2, I , 3 , 4, 5. Because 3 < 4, these numbers are in the correct order. It is not necessary to do any more comparisons for this pass because 5 is already in the correct position. The second pass guarantees that the two largest elements, 4 and 5, are in their correct positions. The third pass begins by comparing 2 and I . These are interchanged because 2 > 1 , produc ing 1 , 2, 3 , 4, 5 . Because 2 < 3, these two elements are in the correct order. It is not necessary to do any more comparisons for this pass because 4 and 5 are already in the correct positions. The third pass guarantees that the three largest elements, 3, 4, and 5, are in their correct positions. The fourth pass consists of one comparison, namely, the comparison of I and 2. Because .... 1 < 2, these elements are in the correct order. This completes the bubble sort.
ALGORITHM 4 The Bubble Sort.
procedure bubblesort(aJ , . . . , an : real numbers with n for i := to n - 1 for j := 1 to n - i if aj > aj + J then interchange aj and aj +J
I
{aJ , . . . , an is in increasing order}
:::
2)
174
3/
The Fundamentals: Algorithms, the Integers, and Matrices
linkS �
3-8
THE INSERTION SORT The insertion sort is a simple sorting algorithm, but it is usually not the most efficient. To sort a list with n elements, the insertion sort begins with the second element. The insertion sort compares this second element with the first element and inserts it before the first element if it does not exceed the first element and after the first element if it exceeds the first element. At this point, the first two elements are in the correct order. The third element is then compared with the first element, and if it is larger than the first element, it is compared with the second element; it is inserted into the correct position among the first three elements. In general, in the jth step of the insertion sort, the jth element of the list is inserted into the correct position in the list of the previously sorted j 1 elements. To insert the jth element in the list, a linear search technique is used (see Exercise 43); the jth element is successively compared with the already sorted j 1 elements at the start ofthe list until the first element that is not less than this element is found or until it has been compared with all j 1 elements; the jth element is inserted in the correct position so that the first j elements are sorted. The algorithm continues until the last element is placed in the correct position relative to the already sorted list of the first n 1 elements. The insertion sort is described in pseudocode in Algorithm 5. -
-
-
-
EXAMPLE
5
Use the insertion sort to put the elements of the list 3 , 2, 4, 1 , 5 in increasing order.
Solution: The insertion sort first compares 2 and 3 . Because 3 > 2, it places 2 in the first position, producing the list 2, 3, 4, 1 , 5 (the sorted part of the list is shown in color). At this point, 2 and 3 are in the correct order. Next, it inserts the third element, 4, into the already sorted part of the list by making the comparisons A > 2 and 4 > 3 . Because 4 > 3 , 4 is placed in the third position. At this point, the list is 2, 3 , 4, 1 , 5 and we know that the ordering of the first three elements is correct. Next, we find the correct place for the fourth element, 1 , among the already sorted elements, 2, 3 , 4. Because 1 < 2, we obtain the list 1 , 2, 3 , 4, 5 . Finally, we insert 5 into the correct position by successively comparing it to 1 , 2, 3, and 4. Because 5 > 4, it goes at the � end of the list, producing the correct order for the entire list. ALGORITHM 5 The Insertion Sort.
procedure insertion sort(al ' a2 , for j := 2 to n begin
. . . , an :
real numbers with n ::: 2)
i := 1 > aj i := i m := aj for := 0 to j aj - k := aj - k - l aj := m {ai , a2 , . . . , an are sorted}
while aj
k
end
+I
i-I
Greedy Algorithms Many algorithms we will study in this book are designed to solve optimization problems. The goal of such problems is to find a solution to the given problem that either minimizes or maximizes the value of some parameter. Optimization problems studied later in this text include finding a route between two cities with smallest total mileage, determining a way to encode
3-9
3. 1 Algorithms
unks �
1 75
messages using the fewest bits possible, and finding a set of fiber links between network nodes using the least amount of fiber. Surprisingly, one of the simplest approaches often leads to a solution of an optimization problem. This approach selects the best choice at each step, instead of considering all sequences of steps that may lead to an optimal solution. Algorithms that make what seems to be the "best" choice at each step are called greedy algorithms. Once we know that a greedy algorithm finds a feasible solution, we need to determine whether it has found an optimal solution. To do this, we either prove that the solution is optimal or we show that there is a counterexample where the algorithm yields a nonoptimal solution. To make these concepts more concrete, we will consider an algorithm that makes change using coins.
EXAMPLE 6
Consider the problem of making n cents change with quarters, dimes, nickels, and pennies, and using the least total number of coins. We can devise a greedy algorithm for making change for n cents by making a locally optimal choice at each step; that is, at each step we choose the coin ofthe largest denomination possible to add to the pile of change without exceeding n cents. For example, to make change for 67 cents, we first select a quarter (leaving 42 cents). We next select a second quarter (leaving 1 7 cents), followed by a dime (leaving 7 cents), followed by a nickel .... (leaving 2 cents), followed by a penny (leaving 1 cent), followed by a penny.
Demo �
We display a greedy change-making algorithm for n cents, using any set of denominations of coins, as Algorithm 6.
ALGORITHM 6 Greedy Change-Making Algorithm.
procedure for
i := I
change(Cl , C2 ,
Cl
>
to r while ::: begin
n
C2
, Cr : values of denominations of coins, where Cr ; a positive integer)
• • •
> ... >
n:
Ci
add a coin with value Ci to the change
end
n := n
-
Cj
We have described a greedy algorithm for making change using quarters, dimes, nickels, and pennies. We will show that this algorithm leads to an optimal solution in the sense that it uses the fewest coins possible. Before we embark on our proof, we show that there are sets of coins for which the greedy algorithm (Algorithm 6) does not necessarily produce change using the fewest coins possible. For example, if we have only quarters, dimes, and pennies (and no nickels) to use, the greedy algorithm would make change for 30 cents using six coins-a quarter and five pennies-whereas we could have used three coins, namely, three dimes.
LEMMA l
If n is a positive integer, then n cents in change using quarters, dimes, nickels, and pennies using the fewest coins possible has at most two dimes, at most one nickel, at most four pennies, and cannot have two dimes and a nickel. The amount of change in dimes, nickels, and pennies cannot exceed 24 cents.
176
/
3 The Fundamentals: Algorithms, the Integers, and Matrices
3- / 0
Proof: We use a proof by contradiction. We will show that if we had more than the specified number of coins of each type, we could replace them using fewer coins that have the same value. We note that if we had three dimes we could replace them with a quarter and a nickel, if we had two nickels we could replace them with a dime, if we had five pennies we could replace them with a nickel, and if we had two dimes and a nickel we could replace them with a quarter. Because we can have at most two dimes, one nickel, and four pennies, but we cannot have two dimes and a nickel, it follows that 24 cents is the most money we can have in dimes, nickels, <J and pennies when we make change using the fewest number of coins for n cents. THEOREM l
The greedy algorithm (Algorithm 6) produces change using the fewest coins possible. Proof: We will use a proof by contradiction. Suppose that there is a positive integer n such that there is a way to make change for n cents using quarters, dimes, nickels, and pennies that uses fewer coins than the greedy algorithm finds. We first note that q ', the number of quarters used in this optimal way to make change for n cents, must be the same as q , the number of quarters used by the greedy algorithm. To show this, first note that the greedy algorithm uses the most quarters possible, so q ' :s q . However, it is also the case that q ' cannot be less than q . If it were, we would need to make up at least 25 cents from dimes, nickels, and pennies in this optimal way to make change. But this is impossible by Lemma 1 . Because there must be the same number of quarters in the two ways to make change, the value of the dimes, nickels, and pennies in these two ways must be the same, and these coins are worth no more than 24 cents. There must be the same number of dimes, because the greedy algorithm used the most dimes possible and by Lemma 1 , when change is made using the fewest coins possible, at most one nickel and at most four pennies are used, so that the most dimes possible are also used in the optimal way to make change. Similarly, we have the same number <J of nickels and, finally, the same number of pennies.
The Halting Problem
unkS �
We will now describe a proof of one of the most famous theorems in computer science. We will show that there is a problem that cannot be solved using any procedure. That is, we will show there are unsolvable problems. The problem we will study is the baiting problem. It asks whether there is a procedure that does this: It takes as input a computer program and input to the program and determines whether the program will eventually stop when run with this input. It would be convenient to have such a procedure, if it existed. Certainly being able to test whether a program entered into an infinite loop would be helpful when writing and debugging programs. However, in 1 936 Alan Turing showed that no such procedure exists (see his biography in Section 12.4). Before we present a proof that the halting problem is unsolvable, first note that we cannot simply run a program and observe what it does to determine whether it terminates when run with the given input. If the program halts, we have our answer, but if it is still running after any fixed length of time has elapsed, we do not know whether it will never halt or we just did not wait long enough for it to terminate. After all, it is not hard to design a program that will stop only after more than a billion years has elapsed. We will describe Turing's proof that the halting problem is unsolvable; it is a proof by contradiction. (The reader should note that our proof is not completely rigorous, because we have not explicitly defined what a procedure is. To remedy this, the concept of a Turing machine is needed. This concept is introduced in Section 1 2.5.) Proof: Assume there is a solution to the halting problem, a procedure called H (P, I). The procedure H (P, I ) takes two inputs, one a program P and the other I , an input to the program .p . H (P, l ) generates the string "halt" as output if H determines that P stops when given I as
3- 1 1
3 . 1 Algorithms
177
If H(P, P) "halts," then loop forever =
P as program Program
Input Program P
H(P' l)
Output H(P, P)
Program K(P)
If H(P, P)
P as input FIGURE 2
"loops forever," then halt =
Showing that the Halting Problem is Unsolvable.
input. Otherwise, H (P, l) generates the string "loops forever" as output. We will now derive a contradiction. When a procedure is coded, it is expressed as a string of characters; this string can be interpreted as a sequence of bits. This means that a program itself can be used as data. Therefore a program can be thought of as input to another program, or even itself. Hence, H can take a program P as both of its inputs, which are a program and input to this program. H should be able to determine if P will halt when it is given a copy of itself as input. To show that no procedure H exists that solves the halting problem, we construct a simple procedure K (P), which works as follows, making use of the output H (P, P). If the output of H (P, P) is "loops forever," which means that P loops forever when given a copy of itself as input, then K(P) halts. If the output of H (p, P) is "halt," which means that P halts when given a copy of itself as input, then K (P) loops forever. That is, K (P) does the opposite of what the output of H (P, P) specifies. (See Figure 2.) Now suppose we provide K as input to K. We note that if the output of H(K, K) is "loops forever," then by the definition of K we see that K (K) halts. Otherwise, ifthe output of H (K , K) is "halt," then by the definition of K we see that K (K) loops forever, in violation of what H tells us. In both cases, we have a contradiction. Thus, H cannot always give the correct answers. Consequently, there is no procedure that
Exercises 1. List all the steps used by Algorithm 1 to find the maximum
2
2
of the list 1 , 8, 1 , 9, 1 1 , , 14, 5, 1 0, 4. 2. Determine which characteristics of an algorithm the fol lowing procedures have and which they lack. a) procedure positive integer) while > 0
double(n: n n : = 2n divide(n: positive integer) n 0 nm :=:= nlin 1 positive integer) sum : =i 0 1sum(n: 0 sum := choose(a,b: + i integers) : = either a or b
b) procedure while ::: begin
n
4. Describe an algorithm that takes as input a list of integers
5.
6.
7.
-
end c) procedure while
8.
<
sum
d) procedure x
9.
3. Devise an algorithm that finds the sum of all the integers
in a list.
1 0.
and produces as output the largest difference obtained by subracting an integer in the list from the one following it. Describe an algorithm that takes as input a list of inte gers in non decreasing order and produces the list of all values that occur more than once. Describe an algorithm that takes as input a list of in tegers and finds the number of negative integers in the list. Describe an algorithm that takes as input a list of inte gers and finds the location of the last even integer in the list or returns 0 if there are no even integers in the list. Describe an algorithm that takes as input a list of distinct integers and finds the location of the largest even integer in the list or returns 0 if there are no even integers in the list. A palindrome is a string that reads the same forward and backward. Describe an algorithm for determining whether a string of characters is a palindrome. Devise an algorithm to compute x n , where x is a real
n
n
n
n
n
1 78
3 / The Fundamentals: Algorithms, the Integers, and Matrices number and is an integer. [Hint: First give a procedure for computing x n when is nonnegative by successive n
n
multiplication by x, starting with 1 . Then extend this pro cedure, and use the fact that x - n = I /x n to compute x n when n is negative.] 1 1 . Describe an algorithm that interchanges the values of the variables x and y, using only assignments. What is the minimum number of assignment statements needed to do this? 12. Describe an algorithm that uses only assignment state ments that replaces the triple (x , y, z ) with (y, z, x). What is the minimum number of assignment statements needed? 13. List all the steps used to search for 9 in the sequence 1 , 3, 4, 5, 6, 8, 9, 1 1 using b) a binary search. a) a linear search. 14. List all the steps used to search for 7 in the sequence given
in Exercise 1 3 .
15. Describe an algorithm that inserts an integer x i n the ap
propriate position into the list a i , a2 , . . . , an of integers that are in increasing order. 16. Describe an algorithm for finding the smallest integer in a finite sequence of natural numbers. 1 7. Describe an algorithm that locates the first occurrence of the largest element in a finite list of integers, where the integers in the list are not necessarily distinct. 18. Describe an algorithm that locates the last occurrence of the smallest element in a finite list of integers, where the integers in the list are not necessarily distinct. 19. Describe an algorithm that produces the maximum, me dian, mean, and minimum of a set of three integers. (The median of a set of integers is the middle element in the list when these integers are listed in order of increasing size. The mean of a set of integers is the sum of the integers divided by the number of integers in the set.) 20. Describe an algorithm for finding both the largest and the smallest integers in a finite sequence of integers. 2 1 . Describe an algorithm that puts the first three terms of a sequence of integers of arbitrary length in increasing order. 22. Describe an algorithm to find the longest word in an English sentence (where a sentence is a sequence of sym bois, either a letter or a blank, which can then be broken into alternating words and blanks). 23. Describe an algorithm that determines whether a function from a finite set of integers to another finite set of integers is onto. 24. Describe an algorithm that determines whether a function from a finite set to another finite set is one-to-one. 25. Describe an algorithm that will count the number of 1 s in a bit string by examining each bit ofthe string to determine whether it is a 1 bit. 26. Change Algorithm 3 so that the binary search procedure compares x to am at each stage of the algorithm, with the algorithm terminating if x = am . What advantage does this version of the algorithm have? .
3-12
27. The ternary search algorithm locates an element in a list
ofincreasing integers by successively splitting the list into three sublists of equal (or as close to equal as possible) size, and restricting the search to the appropriate piece. Specify the steps of this algorithm. 28. Specify the steps of an algorithm that locates an element in a list of increasing integers by successively splitting the list into four sublists of equal (or as close to equal as possible) size, and restricting the search to the appropriate piece. 29. A mode of a list of integers is an element that occurs at least as often as each of the other elements. Devise an algorithm that finds a mode in a list of nondecreasing integers. 30. Devise an algorithm that finds all modes (defined in Exercise 29)in a list of nondecreasing integers. 3 1 . Devise an algorithm that finds the first term of a sequence ofintegers that equals some previous term in the sequence. 32. Devise an algorithm that finds all terms of a finite se quence of integers that are greater than the sum of all previous terms of the sequence. 33. Devise an algorithm that finds the first term of a sequence of positive integers that is less than the immediately pre ceding term of the sequence. 34. Use the bubble sort to sort 6, 3, 1 , 5, 4, showing the lists obtained at each step. 35. Use the bubble sort to sort 3, 1 , 5, 7, 4, showing the lists obtained at each step. 36. Use the bubble sort to sort J, m, a, showing the lists obtained at each step. *37. Adapt the bubble sort algorithm so that it stops when no interchanges are required. Express this more efficient ver sion of the algorithm in pseudocode. 38. Use the insertion sort to sort the list in Exercise 34, show ing the lists obtained at each step. 39. Use the insertion sort to sort the list in Exercise 35, show ing the lists obtained at each step. 40. Use the insertion sort to sort the list in Exercise 36, show ing the lists obtained at each step. The selection sort begins by finding the least element in the list. This element is moved to the front. Then the least element � among the remaining elements is found and put into the sec ond position. This procedure is repeated until the entire list has been sorted. 41. Sort these lists using the selection sort. a) 3, 5, 4, 1 , 2 b) 5, 4, 3, 2, 1 c) 1 , 2, 3 , 4, 5 42. Write the selection sort algorithm in pseudocode. I3r 43. Describe an algorithm based on the linear search for de termining the correct position in which to insert a new element in an already sorted list. 44. Describe an algorithm based on the binary search for de termining the correct position in which to insert a new element in an already sorted list.
2,
d, k,
b,
3 . 1 Algorithms
3- / 3
45. How many comparisons does the insertion sort use to sort
, n?
the list 1 , 2, . . . 46. How many comparisons does the insertion sort use to sort the list - 1 , . . . , 2, I ?
n, n
The binary insertion sort i s a variation o f the insertion sort that uses a binary search technique (see Exercise 44) rather than a linear search technique to insert the ith element in the correct place among the previously sorted elements. 47. Show all the steps used by the binary insertion sort to sort
the list 3, 2, 4, 5, 1 , 6. 48. Compare the number of comparisons used by the inser
tion sort and the binary insertion sort to sort the list 7, 4, 3, 8, 1 , 5, 4, 2. *49. Express the binary insertion sort in pseudocode. 50. a) Devise a variation of the insertion sort that uses a lin ear search technique that inserts the jth element in the correct place by first comparing it with the (j - 1 )st element, then the (j - 2)th element if necessary, and so on. b) Use your algorithm to sort 3, 2, 4, 5, 1 , 6. c) Answer Exercise 45 using this algorithm. d) Answer Exercise 46 using this algorithm. 51. When a list of elements is in close to the correct order,
would it be better to use an insertion sort or its variation described in Exercise 50? 52. Use the greedy algorithm to make change using quarters, dimes, nickels, and pennies for a) 87 cents. c) 99 cents.
b) 49 cents. d) 33 cents.
53. Use the greedy algorithm to make change using quarters,
dimes, nickels, and pennies for a) 5 1 cents. c) 76 cents.
b) 69 cents. d) 60 cents.
54. Use the greedy algorithm to make change using quar
ters, dimes, and pennies (but no nickels) for each of the amounts given in Exercise 52. For which ofthese amounts does the greedy algorithm use the fewest coins of these denominations possible? 55. Use the greedy algorithm to make change using quar ters, dimes, and pennies (but no nickels) for each of the amounts given in Exercise 53. For which of these amounts does the greedy algorithm use the fewest coins of these denominations possible? 56. Show that if there were a coin worth 1 2 cents, the greedy algorithm using quarters, l 2-cent coins, dimes, nickels, and pennies would not always produce change using the fewest coins possible. 57. a) Describe a greedy algorithm that solves the problem of scheduling a subset of a collection of proposed talks in a lecture hall by selecting at each step the proposed talk that has the earliest finish time among all those that do not conflict with talks already scheduled. (In Section 4. 1 we will show that this greedy algorithm
179
always produces a schedule that includes the largest number of talks possible.) b) Use your algorithm from part (a) to schedule the largest number of talks in a lecture hall from a pro posed set of talks, if the starting and finishing times of the talks are 9 :00 A.M. and 9:45 A.M.; 9:30 A.M. and 1 0:00 A.M.; 9:50 A.M. and 10: 1 5 A.M.; 1 0:00 A.M. and 10:30 A.M.; 10: 1 0 A.M. and 1 0:25 A.M.; 10:30 A.M. and 1 0:55 A.M.; 10: 1 5 A.M. and 1 0:45 A.M.; 1 0:30 A.M. and 1 1 :00 A.M.; 1 0:45 A.M. and 1 1 :30 A.M.; 1 0:55 A.M. and 1 1 :25 A.M.; 1 1 :00 A.M. and 1 1 : 15 A.M. Suppose we have s men m I , m 2 , . . . , ms and s women W I , W 2 , . . . , Ws ' We wish to match each person with a mem � ber of the opposite gender. Furthermore, suppose that each � person ranks, in order of preference, with no ties, the people ofthe opposite gender. We say that an assignment of people of opposite genders to form couples is stable if we cannot find a man m and a woman W who are not assigned to each other such that m prefers W over his assigned partner and W prefers m to her assigned partner. 58. Suppose we have three men m l , m 2 , and m 3 and three women W I , W 2 , and W 3 . Furthermore, suppose that the preference rankings of the men for the three women, from highest to lowest, are m l : W 3 , W I , W 2 ; m 2 : W I , W 2 , W 3 ; m 3 : W 2 , W 3 , W I ; and the preference rankings ofthe women for the three men, from highest to lowest, are W I : m I , m 2 , m 3 ; W 2 : m 2 , m l , m 3 ; W 3 : m 3 , m 2 , m I . For each of the six possible assignments of men and women to form three couples, determine whether this assignment is stable. *59. The deferred acceptance algorithm can be used to con struct an assignment ofmen and women. In this algorithm, members of one gender are the suitors and members of the other gender the suitees. The algorithm uses a se ries of rounds; in each round every suitor whose proposal was rejected in the previous round proposes to his or her highest ranking suitee who has not already rejected a pro posal from this suitor. The suitees reject all the proposals except that from the suitor that this suitee ranks highest among all the suitors who have proposed in this round or previous rounds. The proposal of this highest ranking suitor remains pending and is rejected in a later round if a more appealing suitor proposes in that round. The series of rounds ends when every suitor has exactly one pending proposal. All pending proposals are then accepted. a) Write the deferred acceptance algorithm in pseudocode. b) Show that the deferred acceptance algorithm terminates. c) Show that the deferred acceptance always terminates with a stable assignment. 60. Show that the problem of determining whether a program with a given input ever prints the digit 1 is unsolvable. 61. Show that the following problem is solvable. Given two programs with their inputs and the knowledge that exactly one of them halts, determine which halts. 62. Show that the problem of deciding whether a specific pro gram with a specific input halts is solvable.
180
3/
The Fundamentals: Algorithms, the Integers, and Matrices
3- 1 4
3.2 The Growth of Functions Introduction In Section 3 . 1 we discussed the concept of an algorithm. We introduced algorithms that solve a variety of problems , including searching for an element in a list and sorting a list. In Section 3.3 we will study the number of operations used by these algorithms. In particular, we will esti mate the number of comparisons used by the linear and binary search algorithms to find an element in a sequence of n elements. We will also estimate the number of comparisons used by the bubble sort and by the insertion sort to sort a list of n elements. The time required to solve a problem depends on more than only the number of operations it uses. The time also depends on the hardware and software used to run the program that implements the algorithm. However, when we change the hardware and software used to implement an algorithm, we can closely approximate the time required to solve a problem of size n by multiplying the previous time required by a constant. For example, on a supercomputer we might be able to solve a problem of size n a million times faster than we can on a PC. However, this factor of one million will not depend on n (except perhaps in some minor ways). One of the advan tages of using big-O notation, which we introduce in this section, is that we can estimate the growth of a function without worrying about constant multipliers or smaller order terms. This means that, using big-O notation, we do not have to worry about the hardware and software used to implement an algorithm. Furthermore, using big- O notation, we can assume that the different operations used in an algorithm take the same time, which simplifies the analysis considerably. Big-O notation is used extensively to estimate the number of operations an algorithm uses as its input grows. With the help of this notation, we can determine whether it is practical to use a particular algorithm to solve a problem as the size of the input increases. Furthermore, using big-O notation, we can compare two algorithms to determine which is more efficient as the size of the input grows. For instance, if we have two algorithms for solving a problem, one using 1 00n 2 + 1 7n + 4 operations and the other using n 3 operations, big-O notation can help us see that the first algorithm uses far fewer operations when n is large, even though it uses more operations for small values of n , such as n = 1 0 . This section introduces big- O notation and the related big-Omega and big-Theta notations. We will explain how big- O, big-Omega, and big-Theta estimates are constructed and establish estimates for some important functions that are used in the analysis of algorithms.
Big-O Notation The growth of functions is often described using a special notation. Definition 1 describes this notation.
DEFINITION 1
Let f and g be functions from the set of integers or the set of real numbers to the set of real numbers. We say that f(x) is 0 (g(x» if there are constants C and k such that
If(x ) 1 :::: C1g(x)1 whenever x
>
k. [This i s read as "f(x) is big-oh of g(x )."]
3-/5
3.2 The Growth of Functions
Assessment
�
unks
�
EXAMPLE l
Extra � Examples Iiiiiiiij
181
The constants C and k in the definition of big-O notation are called witnesses to the relationship I(x) is O (g(x» . To establish that I(x) is O (g(x» we need only one pair of witnesses to this relationship. That is, to show that I(x) is O (g(x» , we need find only one pair of constants C and k, the witnesses, such that I /(x) 1 � C lg(x) 1 whenever x > k. Note that when there is one pair of witnesses to the relationship I(x) is O (g(x» , there are irifinitely many pairs of witnesses. To see this, note that if C and k are one pair of witnesses, then any pair C' and k', where C < C' and k < k', is also a pair of witnesses, because I /(x) 1 � C jg(x) 1 � C' lg(x) 1 whenever x > k' > k. A useful approach for finding a pair of witnesses is to first select a value of k for which the size of I /(x) 1 can be readily estimated when x > k and to see whether we can use this estimate to find a value of C for which I /(x) 1 < C Ig(x) 1 for x > k. This approach is illustrated in Example I . Show that I(x) = x 2 + 2x + 1 is O (x 2 ).
Solution: We observe that we can readily estimate the size of I(x) when x > I because x and 1 < x 2 when x > 1 . It follows that
<
x2
whenever x > 1 , as shown in Figure 1 . Consequently, we can take C = 4 and k = 1 as witnesses to show that I(x) is O (x 2 ). That is, I(x) = x 2 + 2x + 1 < 4x 2 whenever x > 1 . (Note that it is not necessary to use absolute values here because all functions in these equalities are positive when x is positive.) Alternatively, we can estimate the size of I(x) when x > 2. When x > 2, we have 2x � x 2 and 1 � x 2 • Consequently, if x > 2, we have
It follows that C = 3 and k = 2 are also witnesses to the relation I(x) is O (x 2 ).
4
3
The part of the graph of/(x) x 2 + 2x + 1 that satisfies/ex) < 4x 2 is shown in color. =
2
2 FIGURE 1
The Function x2 + 2x + 1 is O(x2 ).
182
3/
The Fundamentals: Algorithms, the Integers, and Matrices
3- 1 6
Observe that in the relationship "f(x) is O (x 2 )," x 2 can be replaced by any function with larger values than x 2 • For example, f(x ) is O (x 3 ), f(x) is O (x 2 + X + 7), and so on. It is also true that x 2 is O (x 2 + 2x + 1 ), because x 2 < x 2 + 2x + 1 whenever x > 1 . This .... means that C = 1 and k = 1 are witnesses to the relationship x 2 is O (x 2 + 2x + 1 ). Note that in Example 1 we have two functions, f(x) = x 2 + 2x + 1 and g(x) = x 2 , such that f(x) is O (g(x)) and g(x) is O (f(x))-the latter fact following from the inequality x 2 ::: x 2 + 2x + 1 , which holds for all nonnegative real numbers x . We say that two functions f(x) and g(x) that satisfy both of these big-O relationships are of the same order. We will return to this notion later in this section.
= O (g(x)). However, the equals sign in this notation does not represent a genuine equality. Rather, this notation tells us that an inequality holds relating the values of the functions f and g for sufficiently large numbers in the domains of these functions. However, it is acceptable to write f(x) E O (g(x)) because O (g(x)) represents the set of functions that are O (g(x)).
Remark: The fact that f(x) is O (g(x)) is sometimes written f(x)
Big-O notation has been used in mathematics for more than a century. In computer science it is widely used in the analysis of algorithms, as will be seen in Section 3.3. The German mathematician Paul Bachmann first introduced big- O notation in 1 892 in an important book on number theory. The big- O symbol is sometimes called a Landau symbol after the German mathematician Edmund Landau, who used this notation throughout his work. The use ofbig- O notation in computer science was popularized by Donald Knuth, who also introduced the big-Q and big-e notations defined later in this section. When f(x) is o (g(x)), and h(x) is a function that has larger absolute values than g(x) does for sufficiently large values of x, it follows that f(x) is O (h(x)). In other words, the function g(x) in the relationship f(x) is o (g(x )) can be replaced by a function with larger absolute values. To see this, note that if if x
>
k,
Ig(x ) 1 for all x
>
k, then
>
k.
I f(x) 1 ::: C lg(x) 1 and if I h (x ) 1
>
I f(x) 1 ::: C l h (x ) 1
ifx
Hence, f(x) is O (h (x)). When big- O notation is used, the function g in the relationship f(x ) is O (g(x)) is chosen to be as small as possible (sometimes from a set of reference functions, such as functions of the form x n , where n is a positive integer).
unllS � PAUL GUSTAV HEINRICH BACHMANN ( 1 837- 1 920) Paul Bachmann, the son ofa Lutheran pastor, shared his father's pious lifestyle and love of music. His mathematical talent was discovered by one of his teachers, even though he had difficulties with some of his early mathematical studies. After recuperating from tuberculosis in Switzerland, Bachmann studied mathematics, first at the University of Berlin and later at Gottingen, where he attended lectures presented by the famous number theorist Dirichlet. He received his doctorate under the German number theorist Kummer in 1 862; his thesis was on group theory. Bachmann was a professor at Breslau and later at Munster. After he retired from his professorship, he continued his mathematical writing, played the piano, and served as a music critic for newspapers. Bachmann's mathematical writings include a five-volume survey of results and methods in number theory, a two-volume work on elementary number theory, a book on irrational numbers, and a book on the famous conjecture known as Fermat's Last Theorem. He introduced big-O notation in his 1 892 book Analytische Zahlentheorie.
3-1 7
3 .2 The Growth of Functions
183
The part of the graph ofj(x) that satisfies j(x) < Cg(x) is shown in color.
k
FIGURE 2
The Function I(x) is O(g(x» .
In subsequent discussions, we will almost always deal with functions that take on only positive values. All references to absolute values can be dropped when working with big- O estimates for such functions. Figure 2 illustrates the relationship I(x) is O (g(x » . Example 2 illustrates how big- O notation is used to estimate the growth of functions.
EXAMPLE 2
Show that 7x 2 is O (x 3 ).
Solution: Note that when x > 7, we have 7x 2 < X 3 • (We can obtain this inequality by multiplying both sides of x > 7 by x 2 .) Consequently, we can take e = 1 and k = 7 as witnesses to establish the relationship 7x 2 is O (x 3 ). Alternatively, when x > 1 , we have 7x 2 < 7x 3 , so that e = 7 and .... k = 1 are also witnesses to the relationship 7x 2 is O (x 3 ). Example 3 illustrates how to show that a big- O relationship does not hold.
EXAMPLE 3
Show that n 2 is not O (n).
Solution: To show that n 2 is not O (n), we must show that no pair of constants e and k exist such that n 2 � en whenever n > k. To see that there are no such constants, observe that when n > 0 we can divide both sides of the inequality n 2 � en by n to obtain the equivalent inequality n :::: e . We now see that no matter what e and k are, the inequality n � e cannot hold for all n with n > k. In particular, once we set a value of k, we see that when n is larger than the .... maximum of k and e , it is not true that n � e even though n > k.
unks � EDMUND LANDAU ( 1 877- 1 938) Edmund Landau, the son of a Berlin gynecologist, attended high school and university in Berlin. He received his doctorate in 1 899, under the direction of Frobenius. Landau first taught at the University of Berlin and then moved to Gottingen, where he was a full professor until the Nazis forced him to stop teaching. Landau's main contributions to mathematics were in the field of analytic number theory. In particular, he established several important results concerning the distribution of primes. He authored a three-volume exposition on number theory as well as other books on number theory and mathematical analysis.
184
/
3 The Fundamentals: Algorithms, the Integers, and Matrices
EXAMPLE
4
3- / 8
Example 2 shows that 7x 2 is O (x 3 ). Is it also true that x 3 is O (7x 2 )?
Solution: To determine whether x 3 is 0 (7x 2 ), we need to determine whether there are constants C and k such that x 3 ::::: C (7x 2 ) whenever x > k. The inequality x 3 ::::: C(7x 2 ) is equivalent to the inequality x ::::: 7C , which follows by dividing the original inequality by the positive quantity x 2 . Note that no C exists for which x ::::: 7C for all x > k no matter what k is, because x can be made arbitrarily large. It follows that no witnesses C and k exist for this proposed big-O
Some Important Big-O Results Polynomials can often be used to estimate the growth of functions. Instead of analyzing the growth of polynomials each time they occur, we would like a result that can always be used to estimate the growth of a polynomial. Theorem I does this. It shows that the leading term of a polynomial dominates its growth by asserting that a polynomial of degree n or less is O (x n ).
THEOREM l
Let f(x) = an x n + an _ I X n - 1 + . . . + a l x + ao, where ao , a l , " " an - I , an are real numbers. Then f(x) is O (x n ).
unkS � DONALD E. KNUTH (BORN \ 938) Knuth grew up in Milwaukee, where his father taught bookkeeping at a Lutheran high school and owned a small printing business. He was an excellent student, earning academic achievement awards. He applied his intelligence in unconventional ways, winning a contest when he was in the eighth grade by finding over 4500 words that could be formed from the letters in "Ziegler's Giant Bar." This won a television set for his school and a candy bar for everyone in his class. Knuth had a difficult time choosing physics over music as his major at the Case Institute of Technology. He then switched from physics to mathematics, and in 1 960 he received his bachelor of science degree, simultane ously receiving a master of science degree by a special award of the faculty who considered his work outstanding. At Case, he managed the basketball team and applied his talents by constructing a formula for the value of each player. This novel approach was covered by Newsweek and by Walter Cronkite on the CBS television network. Knuth began graduate work at the California Institute of Technology in 1 960 and received his Ph.D. there in 1 963 . During this time he worked as a consultant, writing compilers for different computers. Knuth joined the staff ofthe California Institute of Technology in 1 963, where he remained until 1 968, when he took a job as a full professor at Stanford University. He retired as Professor Emeritus in 1 992 to concentrate on writing. He is especially interested in updating and completing new volumes of his series The Art of Computer Programming. a work that has had a profound influence on the development of computer science, which he began writing as a graduate student in 1 962, focusing on compilers. In common jargon, "Knuth," referring to The Art of Computer Programming. has come to mean the reference that answers all questions about such topics as data structures and algorithms. Knuth is the founder ofthe modern study of computational complexity. He has made fundamental contributions to the subject of compilers. His dissatisfaction with mathematics typography sparked him to invent the now widely used TeX and Metafont systems. TeX has become a standard language for computer typography. Two of the many awards Knuth has received are the 1 974 Turing Award and the 1 979 National Medal of Technology, awarded to him by President Carter. Knuth has written for a wide range of professional journals in computer science and in mathematics. However, his first publication, in 1 957, when he was a college freshman, was a parody ofthe metric system called "The Potrzebie Systems of Weights and Measures," which appeared in MAD Magazine and has been in reprint several times. He is a church organist, as his father was. He is also a composer of music for the organ. Knuth believes that writing computer programs can be an aesthetic experience, much like writing poetry or composing music. Knuth pays $2.56 for the first person to find each error in his books and $0.32 for significant suggestions. If you send him a letter with an error (you will need to use regular mail, because he has given up reading e-mail), he will eventually inform you whether you were the first person to tell him about this error. Be prepared for a long wait, because he receives an overwhelming amount of mail. (The author received a letter years after sending an error report to Knuth, noting that this report arrived several months after the first report of this error.)
3 .2 The Growth of Functions
3- / 9
Proof: Using the triangle inequality (see Exercise 5 in Section 1 .7), if x
>
185
1 we have
I f(x) 1 = l an x n + an_ I x n - 1 + . . . + alx + ao l ::::: Ian Ix n + l an_ I l x n - 1 + . . . + l al l x + l ao l = x n (Ian I + l an - I I /x + . . + l a l l /x n - I + l ao l /x n ) ::::: x n (Ian l + l an - I I + . . . + l al l + l ao ! ) . .
This shows that
where C = Ian I + l an - I I + . . . + l ao I whenever x + . . . + l ao l and k = 1 show that f(x) is O (x n ).
>
1 . Hence, the witnesses C = I an I + l an - I I
We now give some examples involving functions that have the set of positive integers as their domains. EXAMPLE
5
How can big- O notation be used to estimate the sum of the first n positive integers?
Solution: Because each of the integers in the sum of the first n positive integers does not exceed n, it follows that 1 + 2 + . . . + n ::::: n + n + . . . + n = n 2 • From this inequality it follows that 1 + 2 + 3 + . . . + n is O (n 2 ) , taking C = 1 and k = 1 as witnesses. (In this example the domains of the functions in the big- O relationship are the set of .... positive integers.) In Example 6 big- O estimates will be developed for the factorial function and its loga rithm. These estimates will be important in the analysis of the number of steps used in sorting procedures. EXAMPLE 6
Give big- O estimates for the factorial function and the logarithm of the factorial function, where the factorial function f(n) = n ! is defined by
n! = 1 · 2 · 3 · . . . · n whenever n is a positive integer, and O! = 1 . For example,
1! = 1,
2 ! = 1 · 2 = 2,
3 ! = 1 · 2 · 3 = 6,
4 ! = 1 · 2 · 3 · 4 = 24.
Note that the function n ! grows rapidly. For instance,
20! = 2,432,902,008, 1 76,640,000. Solution: A big-O estimate for n ! can be obtained by noting that each term in the product does not exceed n . Hence, n! = 1 · 2 · 3 · · · · · n ::::: n · n · n · . . . . n
186
3/
The Fundamentals: Algorithms, the Integers, and Matrices
3-20
This inequality shows that n ! is O (n n ), taking C = 1 and k = 1 as witnesses. Taking logarithms of both sides of the inequality established for n ! , we obtain log n ! :::: log n n = n log n . This implies that log n ! is 0 (n log n), again taking C = 1 and k = 1 as witnesses. EXAMPLE 7
In Section 4. 1 , we will show that n < 2n whenever n is a positive integer. Show that this inequality implies that n is O (2n ), and use this inequality to show that log n is O (n).
Solution: Using the inequality n < 2 n , we quickly can conclude that n is O (2 n ) by taking k = C = I as witnesses. Note that because the logarithm function is increasing, taking logarithms (base 2) of both sides of this inequality shows that log n
<
n.
It follows that log n is O (n). (Again we take C = k = 1 as witnesses.) If we have logarithms to a base b, where b is different from 2, we still have 10gb n is O (n) because 10gb n =
log n log b
<
n log b
whenever n is a positive integer. We take C = 1 / log b and k = 1 as witnesses. (We have used � Theorem 3 in Appendix 2 to see that 10gb n = log n / log b.) As mentioned before, big-O notation is used to estimate the number of operations needed to solve a problem using a specified procedure or algorithm. The functions used in these estimates often include the following:
I , log n , n , n log n , n 2 , 2 n , n ! Using calculus it can be shown that each function in the list is smaller than the succeeding function, in the sense that the ratio of a function and the succeeding function tends to zero as n grows without bound. Figure 3 displays the graphs of these functions, using a scale for the values of the functions that doubles for each successive marking on the graph. That is, the vertical scale in this graph is logarithmic.
The Growth of Combinations of Functions Many algorithms are made up of two or more separate subprocedures. The number of steps used by a computer to solve a problem with input of a specified size using such an algorithm is the sum of the number of steps used by these subprocedures. To give a big-O estimate for the number of steps needed, it is necessary to find big-O estimates for the number of steps used by each subprocedure and then combine these estimates. Big- O estimates of combinations of functions can be provided if care is taken when different big- O estimates are combined. In particular, it is often necessary to estimate the growth of the sum and the product of two functions. What can be said if big- O estimates for each of two
3 .2 The Growth of Functions
187
4096 2048 \024 512 256 128 64
32 16 8 4 2
2
3
FIGURE 3
5
4
6
7
8
A Display of the Growth of Functions Commonly Used in Big-O Estimates.
functions are known? To see what sort of estimates hold for the sum and the product of two functions, suppose that fl (x) is O (gl (x» and fz (x) is O (g2 (X» . From the definition of big- O notation, there are constants C ) , C2 , k) , and k2 such that
when x
>
k) , and
I fz(x) 1 :s C2 Ig2 (X) 1 when x
>
k2 . To estimate the sum o f fl (x) and fz(x), note that
l UI + fz)(x ) 1 = I fl (x) + fz(x ) 1 :s I fl (x) 1 + I fz(x) 1
using the triangle inequality la + h i ::: l a l + I h I -
When x i s greater than both k l and k2 , it follows from the inequalities for I fl (x ) 1 and I fz(x ) 1 that I fl (x) 1 + I fz(x) 1 :s C l lgl (x) 1 + C2 Ig2 (X) 1 :s Cl lg(x) 1 + C 2 Ig(x ) 1
= (C I + C2 )lg(x) 1 = C lg(x) I ,
where C = C I + C2 and g(x) = max( lgl (x) l , Ig2 (X) I ). [Here max(a , b) denotes the maximum, or larger, of a and b.] This inequality shows that l UI + fz)(x) I :s C lg(x ) 1 whenever x > k, where k = max(k l , k2 ). We state this useful result as Theorem 2 .
188
3/
The Fundamentals: Algorithms, the Integers, and Matrices
THEOREM 2
3-22
Suppose that II (x) is O (gl (x)) and h ex) is 0 (g2(X)). Then (II + h)(x) is O (max(lgl (x) l , Ig2(X) I )) · We often have big- 0 estimates for II and 12 in terms of the same function g. In this situation, Theorem 2 can be used to show that (II + h)(x) is also O (g(x)), because max(g(x), g(x)) = g(x). This result is stated in Corollary 1 .
COROLLARY 1
Suppose that II (x) and h(X) are both O (g(x )). Then (II + h)(x ) is O (g(x)). In a similar way big- O estimates can be derived for the product of the functions II and h. When x is greater than max(kl , k2 ) it follows that 1 (ll h)(x ) 1 = 1 /1 (x) l l h(x) 1 :::: C 1 Igl (X) I C2 Ig2 (X) 1 :::: C I C2 1 (gl g2)(x) 1 :::: C I (gl g2 )(x ) l , where C = C I C2 . From this inequality, it follows that II (x)h(x) is 0 (g l g2 ), because there are constants C and k, namely, C = C I C2 and k = max(kl , k2), such that I (ld2 )(X) I :::: C lgl (X)g2 (X) 1 whenever x > k. This result is stated in Theorem 3.
THEOREM 3
Suppose that II (x) is O (g I (X)) and 12(x) is 0 (g2(X)). Then (ll h)(x) is 0 (gl (X)g2(X)). The goal in using big-O notation to estimate functions is to choose a function g(x) that grows relatively slowly so that I(x) is 0 (g(x )). Examples 8 and 9 illustrate how to use Theorems 2 and 3 to do this. The type of analysis given in these examples is often used in the analysis of the time used to solve problems using computer programs.
EXAMPLE 8
Give a big- O estimate for I(n) = 3n log(n !) + (n 2 + 3) log n , where n is a positive integer.
Solution: First, the product 3n log(n !) will be estimated. From Example 6 we know that log(n !) is O (n log n). Using this estimate and the fact that 3n is O (n), Theorem 3 gives the estimate that 3n 10g(n !) is 0 (n 2 10g n). Next, the product (n 2 + 3) log n will be estimated. Because (n 2 + 3) < 2n 2 when n > 2, it follows that n 2 + 3 is 0 (n 2 ). Thus, from Theorem 3 it follows that (n 2 + 3) log n is 0 (n 2 10g n). Using Theorem 2 to combine the two big-O estimates for the products shows that I(n) = .... 3n 10g(n ! ) + (n 2 + 3) log n is 0 (n 2 10g n). EXAMPLE 9
Give a big- O estimate for I(x) = (x + 1 ) log(x 2 + 1 ) + 3x 2 .
Solution: First, a big- O estimate for (x + 1 ) 10g(x 2 + 1 ) will be found. Note that (x + 1 ) is o (x). Furthermore, x 2 + 1 :::: 2X 2 when x > 1 . Hence, log(x 2 + 1 ) :::: log(2x 2 ) = log 2 + log x 2 = log 2 + 2 10g x :::: 3 10g x , if x
>
2. This shows that log(x 2 + 1 ) i s O (log x).
3.2 The Growth of Functions
3-23
1 89
From Theorem 3 it follows that (x + 1 ) log(x 2 + 1 ) is O (x 10g x). Because 3x 2 is O (x 2 ), Theorem 2 tells us that f(x) is O (max(x log x, x 2 » . Because x log x ::::: x 2 , for x > 1 , it follows .... that f(x) is O (x 2 ).
Big-Omega and Big-Theta Notation Big- O notation is used extensively to describe the growth of functions, but it has limitations. In particular, when f(x) is O (g(x» , we have an upper bound, in terms of g(x), for the size of f(x) for large values ofx . However, big-O notation does not provide a lower bound for the size of f(x) for large x . For this, we use big-Omega (big-Q) notation. When we want to give both an upper and a lower bound on the size of a function f(x), relative to a reference function g(x), we use big Theta (big-8) notation. Both big-Omega and big-Theta notation were introduced by Donald Knuth in the 1 970s. His motivation for introducing these notations was the common misuse of big- O notation when both an upper and a lower bound on the size of a function are needed. We now define big-Omega notation and illustrate its use. After doing so, we will do the same for big-Theta notation. There is a strong connection between big- O and big-Omega notation. In particular, f(x) is Q(g(x» if and only if g(x) is o (f(x» . We leave the verification of this fact as a straightforward exercise for the reader.
DEFINITION 2
Let f and g be functions from the set of integers or the set of real numbers to the set of real numbers. We say that f(x) is Q(g(x» if there are positive constants C and k such that I f(x ) 1 ::: C lg(x ) 1
whenever x
EXAMPLE 10
>
k. [This is read as "f(x) is big Omega o f g(x)."] -
The function f(x) = 8x 3 + 5x 2 + 7 is Q(g(x» , where g(x) is the function g(x) = x 3 • This is easy to see because f(x) = 8x 3 + 5x 2 + 7 ::: 8x 3 for all positive real numbers x . This is equivalent to saying that g(x) = x 3 is O (8x 3 + 5x 2 + 7), which can be established directly by .... turning the inequality around. Often, it is important to know the order of growth of a function in terms of some relatively simple reference function such as x n when n is a positive integer or eX , where e > 1 . Knowing the order of growth requires that we have both an upper bound and a lower bound for the size of the function. That is, given a function f(x), we want a reference function g(x) such that f(x) is O (g(x» and f(x) is Q(g(x» . Big-Theta notation, defined as follows, is used to express both of these relationships, providing both an upper and a lower bound on the size of a function.
DEFINITION 3
Let f and g be functions from the set of integers or the set of real numbers to the set of real numbers. We say that f(x) is 8(g(x» if f(x) is O (g(x» and f(x) is Q(g(x» . When f(x) is 8(g(x» , we say that "f is big-Theta of g(x)" and we al so say that f(x) is of order g(x).
When f(x) is 8(g(x» , it is also the case that g(x) is 8(f(x » . Also note that f(x) is 8(g(x» if and only if f(x) is o (g(x» and g(x) is O (f(x» (see Exercise 25).
190
3/
The Fundamentals: Algorithms, the Integers, and Matrices
3-24
Usually, when big-Theta notation is used, the function g(x) in 8 (g(x » is a relatively simple reference function, such as x n , c;X , log x , and so on, while f(x) can be relatively complicated. EXAMPLE 11
We showed (in Example 5) that the sum of the first n positive integers is O(n 2 ). Is this sum of order n 2 ?
Solution: Let fen) = 1 + 2 + 3 + . . . + n . Because we already know that fen) is O(n 2 ), to show that f(n) is of order n 2 we need to find a positive constant C such that f(n) > Cn 2 for sufficiently large integers n . To obtain a lower bound for this sum, we can ignore the first half of the terms. Summing only the terms greater than rn /21 , we find that 1 + 2 + . . . + n � rn /2 1 + ( rn /21 + I ) + . . . + n � rn/21 + rn/2 1 + . . . + rn/21 = (n - rn/21 + I ) rn/21 �
(n /2)(n /2)
= n 2 /4 .
This shows that fen) is Q(n 2 ). We conclude that fen) is of order n 2 , or in symbols, fen) is .... 8(n 2 ) . We can show that f(x) is 8 (g(x» if we can find positive real numbers C 1 and C2 and a positive real number k such that
C 1 Ig(x ) 1 :s I f(x) 1 :s C2 Ig(x)1 whenever x EXAMPLE 12
>
k. This shows that f(x) is O (g(x» and f(x) is Q(g(x» .
Show that 3x 2 + 8x log x is 8 (x 2 ) .
Solution: Because 0 :s 8x log x :s 8x 2 , it follows that 3x 2 + 8x log x :s I Ix 2 for x > I . Conse quently, 3x 2 + 8x log x is O (x 2 ) . Clearly, x 2 is O (3x 2 + 8x log x). Consequently, 3x 2 + 8x log x .... is 8 (x 2 ) . One useful fact is that the leading term of a polynomial determines its order. For example, if f(x) = 3x5 + X 4 + 1 7x 3 + 2, then f(x) is of order x 5 • This is stated in Theorem 4, whose proof is left as an exercise at the end of this section.
THEOREM
4
EXAMPLE 13
n
n
Let f(x) = anx + an _ l X - 1 + . . . + a l X + ao, where aO , a l , n an i- O. Then f(x) is of order x .
. . . , an
are real numbers with
The polynomials 3x8 + I Ox7 + 22 1x 2 + 1444, x 19 - 1 8x 4 - 1 0, 1 1 2, and _x 99 + 40,OOlx 98 + .... I OO, 003x are of orders x 8 , x 19 , and x 99 , respectively. Unfortunately, as Knuth observed, big- O notation is often used by careless writers and speakers as if it had the same meaning as big-Theta notation. Keep this in mind when you see big- O notation used. The recent trend has been to use big-Theta notation whenever both upper and lower bounds on the size of a function are needed.
3.2 The Growth of Functions
3-25
191
Exercises In Exercises 1-14, to establish a big- O relationship, find wit nesses C and such that I f(x)1 ::: C lg(x)1 whenever x >
k
1. Determine whether each of these functions is O (x). a) f(x) = 10 b) f(x) = 3x + 7 c) f(x) = x 2 + x + 1 d) f(x) = S log x e) f(x) = LxJ t) f(x) = fx/21
k.
2. Determine whether each of these functions is 0 (x 2 ). a) f(x) = 1 7x + 1 1 b) f(x) = x 2 + 1 000 c) f(x) = x log x d) f(x) = x4/2 e) f(x ) = 2x t) f(x) = LxJ . fx l 3. Use the definition of "f(x) is O (g(x» " to show that 4. 5. 6. 7.
8.
9. 10. 11. 12. 13. 14.
15. 16. 17.
18. 19.
x4 + 9x3 + 4x + 7 is 0 (X4). Use the definition of "f(x) is O (g(x» " to show that 2x + 1 7 is O W ). Show that (x 2 + I )/(x + I ) is O (x). Show that (x3 + 2x)/(2x + I) is 0 (x 2 ). Find the least integer n such that f(x) is O (x n ) for each of these functions. a) f(x) = 2x3 + x 2 10g x b) f(x) = 3x3 + (log x)4 c) f(x) = (x4 + x 2 + 1 )/(x3 + I ) d) f(x) = (x4 + S log x)/(x4 + 1 ) Find the least integer n such that f(x) i s O (x n ) for each of these functions. a) f(x) = 2x 2 + x3 10g x b) f(x) = 3x 5 + (log x)4 c) f(x) = (x4 + x 2 + 1 )/(x4 + 1 ) d) f(x) = (x3 + S log x)/(x4 + I ) Show that x 2 + 4x + 1 7 i s 0 (x3 ) but that x3 i s not 0 (x 2 + 4x + 1 7). Show that x3 is 0 (X4) but that X4 is not 0 (x3 ). Show that 3x4 + I is 0 (x4/2) and x4/2 is 0 (3x4 + I ). Show that x log x is 0 (x 2 ) but that x 2 is not O (x log x). Show that 2 n is 0 (3 n ) but that 3 n is not 0 (2n ). Is it true that x3 is o (g(x » , ifg is the given function? [For example, if g(x) = x + I , this question asks whether x3 is O(x + I ).] a) g(x) = x 2 b) g(x) = x3 d) g(x) = x 2 + x4 c) g(x) = x 2 + x3 e) g(x) = ]X t) g(x) = x3 /2 Explain what it means for a function to be 0 ( 1 ). Show that if f(x) is O (x), then f(x) is 0 (x 2 ). Suppose that f(x), g(x), and h (x) are functions such that f(x) is O (g(x» and g(x) is O(h(x» . Show that f(x) is O(h(x» . Let be a positive integer. Show that I k + 2 k + . . . + n k is O(n k + l ). Give as good a big-O estimate as possible for each of these functions. b) (n log n + n 2 )(n3 + 2) a) (n 2 + 8)(n + I )
k
c) (n ! + 2 n )(n3 + log(n 2 + I » 20. Give a big-O estimate for each o f these functions. For the
function g in your estimate f(x) is O (g), use a simple function g of smallest order. a) (n3 + n 2 10g n)(log n + I ) + ( 1 7 10g n + 1 9)(n3 + 2) b) (2n + n 2 )(n3 + 3 n ) c) (n n + n2n + s n )(n ! + s n ) 2 1 . Give a big-O estimate for each ofthese functions. For the function g in your estimate that f(x) is O (g(x» , use a simple function g of the smallest order. a) n log(n 2 + I ) + n 2 10g n b) (n log n + 1 )2 + (log n + 1 )(n 2 + I ) c) n 2n + n n 2 22. For each function in Exercise I , determine whether that function is Q(x) and whether it is 8(x). 23. For each function in Exercise 2, determine whether that function is Q(x 2 ) and whether it is 8(x 2 ). 24. a) Show that 3x + 7 is 8(x). b) Show that 2x 2 + x - 7 is 8(x 2 ). c) Show that Lx + 1 /2J is 8(x). d) Show that log(X 2 + I) is 8(log2 x). e) Show that 10g I O x is 8(log2 x). 25. Show that f(x) is 8(g(x» if and only if f(x) is O (g(x» and g(x) is O (f(x» . 26. Show that if f(x) and g(x) are functions from the set of real numbers to the set of real numbers, then f(x) is O (g(x» if and only if g(x) is Q(f(x» . 27. Show that if f(x) and g(x) are functions from the set of real numbers to the set of real numbers, then f(x) is 8(g(x» if and only if there are positive constants C 1 , and C2 such that C I Ig(x ) 1 ::: I f(x ) 1 ::: C 2 Ig(x)1 whenever x > 28. a) Show that 3x 2 + x + I is 8(3x 2 ) by directly finding the constants C 1 , and C 2 in Exercise 27. b) Express the relationship in part (a) using a picture showing the functions 3x 2 + x + I , C I . 3x 2 , and C2 · 3x 2 , and the constant on the x-axis, where C 1 , C 2 , and are the constants you found in part (a) to show that 3x 2 + x + I is 8(3x 2 ). 29. Express the relationship f(x) is 8(g(x» using a picture. Show the graphs of the functions f(x), C I Ig(x ) l , and C2 Ig(x ) l , as well as the constant on the x -axis. 30. Explain what it means for a function to be Q( 1 ). 3 1 . Explain what it means for a function to be 8( 1 ). 32. Give a big-O estimate of the product of the first n odd positive integers. 33. Show that if f and g are real-valued functions such that f(x) is O (g(x» , then for every positive integer fk (x) is O (gk (x» . [Note that fk (x) = f(x l .] 34. Show that for all real numbers a and with a > I and > I , if f(x) is o (logb x), then f(x) is O (loga x).
k,
k.
k,
k
k
k
b
b
k,
192
3 / The Fundamentals: Algorithms, the Integers, and Matrices
3-26
c) x 2 is o(2X ). d) x 2 + X + 1 is not o(x 2 ).
35. Suppose that f(x) is o (g(x » where f and g are increas
ing and unbounded functions. Show that log I f(x)1 is O (log I g(x ) 1 ). 36. Suppose that f(x) is O (g(x» . Does it follow that 2f(x ) is . o (2g(x» ? 37. Let fl (x) and hex) be functions from the set of real num bers to the set of positive real numbers. Show that if fl (x) and f2 (X) are both 8(g(x » , where g(x) is a function from the set of real numbers to the set of positive real numbers, then fl (x) + hex) is 8(g(x» . Is this still true if fl (x) and f2 (X) can take negative values? 38. Suppose that f(x), g(x), and h (x ) are functions such that f(x) is 8(g(x» and g(x) is 8(h (x» . Show that f(x) is 8(h (x» . 39. If fl (x) and hex) are functions from the set of posi tive integers to the set of positive real numbers and fl (x) and f2 (x) are both 8(g(x» , is (fl - f2 )(X) also 8(g(x» ? Either prove that it is or give a counterexample. 40. Show that if fl (x) and f2 (X) are functions from the set of positive integers to the set of real numbers and fl (x) is 8(gl (x» and hex) is 8(g2 (X» , then (fJ /2 )(X) is 8(gl g2 (X» . 41. Find functions f and g from the set of positive integers to the set of real numbers such that fen) is not O (g(n» and g(n ) is not O (f(n» simultaneously. 42. Express the relationship f(x) is Q(g(x» using a picture. Show the graphs of the functions f(x) and Cg(x), as well as the constant k on the real axis. 43. Show that if fl (x) is 8(gl (x» , hex) is 8(g2 (X» , and f2 (X) i= 0 and g2 (x) i= 0 for all real numbers x > 0, then (fJ /h )(x) is 8« gJ /g2 )(X» . n I n 44. Show that if f(x) = anx + an_lx - + . . . + al x + ao, where ao , ai , . . . , an - I , and an are real numbers and an i= 0, then f(x) is 8(x n ). Big- O , big-Theta, and big-Omega notation can be extended to functions in more than one variable. For example, the state ment f(x , y) is O (g(x , y» means that there exist constants C , kl , and k2 such that I f(x , y ) 1 :::: C lg(x , y ) 1 whenever x > kl and y > k2 . 45. Define the statement f(x , y) is 8(g(x , y» . 46. Define the statement f(x , y) is Q(g(x , y» . 47. Show that (x 2 + xy + X log y)3 is O (x 6y3 ). 48. Show that x 5 y3 + x4y4 + x 3y 5 is Q(x3y 3). 49. Show that Lxy J is O (xy). 50. Show that fxYl is Q(xy). The following problems deal with another type of asymptotic notation, called Iittle-o notation. Because little-o notation is based on the concept of limits, a knowledge of calculus is needed for these problems. We say that f(x) is o(g(x» [read f(x) is "little-oh" of g(x)], when f(x) lim = o. oo ---> g(x ) x 51. (Requires calculus) Show that b) x log x is O(X 2 ). a) x 2 is o(x3 ).
52.
(Requires calculus) a) Show that if f(x ) and g(x) are functions such that f(x)
53.
54.
*55. *56. 57.
58.
59. 60.
is o(g(x» and c is a constant, then cf(x) is o(g(x» , where (cf)(x) = cf(x). b) Show that if fl (X), h ex), and g(x) are functions such that fl (x) is o(g(x» and h ex) is o(g(x» , then (fl + h)(x) is o(g(x» , where (fl + h)(x) = fl (x) + hex). (Requires calculus) Represent pictorially that x log x is O(X 2 ) by graphing x log x , x 2 , and x log x/x 2 . Explain how this picture shows that x log x is O(X 2 ). (Requires calculus) Express the relationship f(x) is o(g(x» using a picture. Show the graphs of f(x), g(x), and f(x)/g(x). (Requires calculus) Suppose that f(x) is o(g(x» . Does it follow that 2f(x ) is o(2g(x» ? (Requires calculus) Suppose that f(x ) is o(g(x» . Does it follow that log I f(x)1 is o(log Ig(x ) I )? (Requires calculus) The two parts ofthis exercise describe the relationship between little-o and big- O notation. a) Show that if f(x) and g(x) are functions such that f(x) is o(g(x» , then f(x) is O (g(x» . b) Show that if f(x) and g(x) are functions such that f(x) is O (g(x» , then it does not necessarily follow that f(x) is o(g(x» . (Requires calculus) Show that if f(x) is a polynomial of degree n and g(x) is a polynomial of degree m where m > n , then f(x) is o(g(x» . (Requires calculus) Show that if fl (x) is O (g(x» and hex) is o(g(x» , then fl (x ) + hex) is O (g(x» . (Requires calculus) Let Hn be the nth harmonic number 1 1 1 Hn = I + + + " ' + ' 2" 3 ;;
Show that Hn is o (log n). [Hint: First establish the inequality n n L -:1 < -1 dx X I j =2 J by showing that the sum of the areas of the rectangles of height I /} with base from } 1 to } , for } = 2, 3 , . . . , n, is less than the area under the curve y = 1 /x from 2 to n.] *61 . Show that n log n is O (log n ! ). � 62. Determine whether log n ! is 8(n log n). Justify your answer. *63. Show that log n! is greater than (n log n)/4 for n > 4. [Hint: Begin with the inequality n ! > n(n - 1 ) ( n - 2) · · · fn /2l] Let f(x) and g(x) be functions from the set ofreal numbers to the set of real numbers. We say that the functions f and g are asymptotic and write f(x) g(x) iflimx---> oo f(x)/g(x) = 1 . 64. (Requires calculus) For each of these pairs of functions, determine whether f and g are asymptotic.
I
-
�
3. 3
3-2 7
+ + + ++
+
+
a) f(x) = x2 3x 7, g(x) = x2 b) f(x) = x2 log x , g(x) = x3 c) f(x ) = X4 log(3x8 7), g(x) = (x2 1 7x 3)2 f(x) = (x3 x2 X 1 )4 , g(x ) = (x4 x3 x2 X 1 )3 .
d)
3.3
+ + ++ ++ +
10
65.
Complexity of Algorithms
193
(Requires calculus) +
For each o f these pairs o f functions, determine whether f and g are asymptotic. a) f(x) = log(X2 1), g(x) = log x b) f(x) = 2x+3 , g(x) = 2x+ 7 c) f(x) = 2 1' , g(x) = 2X 2 f(x) = 2x2 +x+ l , g(x) = 2x2 +2x
d)
Com p lexity o f Algorithms
Introduction When does an algorithm provide a satisfactory solution to a problem? First, it must always produce the correct answer. How this can be demonstrated will be discussed in Chapter 4. Second, it should be efficient. The efficiency of algorithms will be discussed in this section. How can the efficiency of an algorithm be analyzed? One measure of efficiency is the time used by a computer to solve a problem using the algorithm, when input values are of a specified size. A second measure is the amount of computer memory required to implement the algorithm when input values are of a specified size. Questions such as these involve the computational complexity ofthe algorithm. An analysis of the time required to solve a problem of a particular size involves the time complexity of the algorithm. An analysis of the computer memory required involves the space complexity of the algorithm. Considerations of the time and space complexity of an algorithm are essential when algorithms are implemented. It is obviously important to know whether an algorithm will produce an answer in a microsecond, a minute, or a billion years. Likewise, the required memory must be available to solve a problem, so that space complexity must be taken into account. Considerations of space complexity are tied in with the particular data structures used to implement the algorithm. Because data structures are not dealt with in detail in this book, space complexity will not be considered. We will restrict our attention to time complexity.
Time Complexity The time complexity of an algorithm can be expressed in terms of the number of operations used by the algorithm when the input has a particular size. The operations used to measure time complexity can be the comparison of integers, the addition of integers, the multiplication of integers, the division of integers, or any other basic operation. Time complexity is described in terms of the number of operations required instead of actual computer time because of the difference in time needed for different computers to perform basic operations. Moreover, it is quite complicated to break all operations down to the basic bit operations that a computer uses. Furthermore, the fastest computers in existence can perform basic bit operations (for instance, adding, multiplying, comparing, or exchanging two bits) in 1 0- 9 second ( 1 nanosecond), but personal computers may require 1 0- 6 second ( 1 microsecond), which is 1 000 times as long, to do the same operations. We illustrate how to analyze the time complexity of an algorithm by considering Algorithm 1 of Section 3 . 1 , which finds the maximum of a finite set of integers. EXAMPLE 1
Describe the time complexity of Algorithm 1 of Section 3 . 1 for finding the maximum element in a set.
Solution: The number of comparisons will be used as the measure of the time complexity of the algorithm, because comparisons are the basic operations used.
194
3/
The Fundamentals: Algorithms, the Integers, and Matrices
. 3-28
To find the maximum element of a set with n elements, listed in an arbitrary order, the temporary maximum is first set equal to the initial term in the list. Then, after a comparison has been done to determine that the end of the list has not yet been reached, the temporary maximum and second term are compared, updating the temporary maximum to the value of the second term if it is larger. This procedure is continued, using two additional comparisons for each term of the list--one to determine that the end of the list has not been reached and another to determine whether to update the temporary maximum. Because two comparisons are used for each of the second through the nth elements and one more comparison is used to exit the loop when i = n + 1 , exactly 2(n - 1 ) + 1 = 2 n - 1 comparisons are used whenever this algorithm is applied. Hence, the algorithm for finding the maximum of a set of n elements has ... time complexity 8(n), measured in terms of the number of comparisons used. Next, we will analyze the time complexity of searching algorithms. EXAMPLE 2
Describe the time complexity of the linear search algorithm.
Solution: The number of comparisons used by the algorithm will be taken as the measure of the time complexity. At each step of the loop in the algorithm, two comparisons are performed one to see whether the end of the list has been reached and one to compare the element x with a term of the list. Finally, one more comparison is made outside the loop. Consequently, if x = aj , 2i + 1 comparisons are used. The most comparisons, 2n + 2, are required when the element is not in the list. In this case, 2n comparisons are used to determine that x is not aj , for i = 1 , 2, . . . , n , an additional comparison is used to exit the loop, and one comparison is made outside the loop. So when x is not in the list, a total of 2 n + 2 comparisons are used. Hence, a ... linear search requires at most O (n) comparisons. WORST-CASE COMPLEXITY The type of complexity analysis done in Example 2 is a worst case analysis. By the worst-case performance of an algorithm, we mean the largest number of
operations needed to solve the given problem using this algorithm on input of specified size. Worst-case analysis tells us how many operations an algorithm requires to guarantee that it will produce a solution. EXAMPLE 3
Describe the time complexity of the binary search algorithm in terms of the number of compar isons used (and ignoring the time required to compute m = [(i + j) / 2] in each iteration of the loop in the algorithm).
Solution: For simplicity, assume there are n = 2 k elements in the list ai , a2 , . . . , an , where k is a nonnegative integer. Note that k = log n . (If n , the number of elements in the list, is not a power of 2, the list can be considered part of a larger list with 2 k +1 elements, where 2 k < n < 2k + l . Here 2 k + 1 is the smallest power of 2 larger than n .) At each stage of the algorithm, i and j, the locations of the first term and the last term of the restricted list at that stage, are compared to see whether the restricted list has more than one term. If i < j, a comparison is done to determine whether x is greater than the middle term of the restricted list. At the first stage the search is restricted to a list with 2 k - 1 terms. So far, two comparisons have been used. This procedure is continued, using two comparisons at each stage to restrict the search to a list with half as many terms. In other words, two comparisons are used at the first stage of the algorithm when the list has 2 k elements, two more when the search has been reduced to a list with 2 k -1 elements, two more when the search has been reduced to a list with 2k - 2 elements, and so on, until two comparisons are used when the search has been reduced to a list with 2 1 = 2 elements. Finally, when one term is left in the list, one comparison tells us that there are no additional terms left, and one more comparison is used to determine if this term is x.
3-29
3.3
Complexity of Algorithms
195
Hence, at most 2k + 2 = 2 log n + 2 comparisons are required to perform a binary search when the list being searched has 2k elements. (If n is not a power of 2, the original list is expanded to a list with 2 k + 1 terms, where k = L log n J , and the search requires at most 2 flog n 1 + 2 comparisons.) Consequently, a binary search requires at most 8(log n) comparisons. From this analysis it follows that the binary search algorithm is more efficient, in the worst case, than a .... linear search. AVERAGE-CASE COMPLEXITY Another important type of complexity analysis, besides worst-case analysis, is called average-case analysis. The average number of operations used to solve the problem over all inputs of a given size is found in this type of analysis. Average-case time complexity analysis is usually much more complicated than worst-case analysis. However, the average-case analysis for the linear search algorithm can be done without difficulty, as shown in Example 4. EXAMPLE
4
Describe the average-case performance of the linear search algorithm, assuming that the element is in the list.
x
Solution: There are n types of possible inputs when x is known to be in the list. If x is the first term of the list, three comparisons are needed, one to determine whether the end of the list has been reached, one to compare x and the first term, and one outside the loop. If x is the second term of the list, two more comparisons are needed, so that a total of five comparisons are used. In general, if x is the ith term of the list, two comparisons will be used at each of the i steps of the loop, and one outside the loop, so that a total of 2i + 1 comparisons are needed. Hence, the average number of comparisons used equals 3 + 5 + 7 + . . . + (2n + 1 ) n
2( 1 + 2 + 3 + . . . + n) + n n
In Section 4. 1 we will show that
1 +2+3+···+n =
n (n + 1 ) . 2
Hence, the average number of comparisons used by the linear search algorithm (when x is known to be in the list) is
2[n(n + 1 )/2] + I = n + 2, n
-----
which is 8 (n ). Remark: In the analysis in Example 4 it has been assumed that x is in the list being searched
and it is equally likely that x is in any position. It is also possible to do an average-case analysis of this algorithm when x may not be in the list (see Exercise 1 3 at the end of this section).
Remark: Although we have counted the comparisons needed to determine whether we have
reached the end of a loop, these comparisons are often not counted. From this point on we will ignore such comparisons.
WORST-CASE COMPLEXITY OF TWO SORTING ALGORITHMS We analyze the worst-case complexity of the bubble sort and the insertion sort in Examples 5 and 6. EXAMPLE
5
What is the worst-case complexity of the bubble sort in terms of the number of comparisons made?
196
3/
The Fundamentals: Algorithms, the Integers, and Matrices
3-30
Solution: The bubble sort (described in Example 4 in Section 3 . 1 ) sorts a list by performing a sequence of passes through the list. During each pass the bubble sort successively compares adjacent elements, interchanging them if necessary. When the ith pass begins, the i - I largest elements are guaranteed to be in the correct positions. During this pass, n - i comparisons are used. Consequently, the total number of comparisons used by the bubble sort to order a list of n elements is (n - l )n 2 using a summation formula that we will prove in Section 4 . 1 . Note that the bubble sort always uses this many comparisons, because it continues even if the list becomes completely sorted at some intermediate step. Consequently, the bubble sort uses (n - l )n/2 comparisons, so it has .... 8(n 2 ) worst-case complexity in terms of the number of comparisons used. (n - 1 ) + (n - 2) + . . . + 2 + 1 =
EXAMPLE 6
What is the worst-case complexity of the insertion sort in terms of the number of comparisons made?
Solution: The insertion sort (described in Section 3 . 1 ) inserts the jth element into the correct position among the first j - 1 elements that have already been put into the correct order. It does this by using a linear search technique, successively comparing the jth element with successive terms until a term that is greater than or equal to it is found or it compares aj with itself and stops because aj is not less than itself. Consequently, in the worst case, j comparisons are required to insert the jth element into the correct position. Therefore, the total number of comparisons used by the insertion sort to sort a list of n elements is 2+3+" '+n =
n (n + 1 ) - 1, 2
using the summation formula for the sum of consecutive integers that we will prove in Section 4. 1 and noting that the first term, 1 , is missing in this sum. Note that the insertion sort may use considerably fewer comparisons if the smaller elements started out at the end of the list. We .... conclude that the insertion sort has worst-case complexity 8(n 2 ).
Understanding the Complexity of Algorithms Table 1 displays some common terminology used to describe the time complexity of algorithms. For example, an algorithm that finds the largest of the first 1 00 terms of a list of n elements by TABLE 1 Commonly Used Terminology for the Complexity of Algorithms. Complexity
Terminology
8( 1 )
Constant complexity
8(log n ) 8(n)
Logarithmic complexity
8(n log n ) 8(n b )
n log n complexity
8(bn ), where b 8(n !)
Linear complexity Polynomial complexity >
1
Exponential complexity Factorial complexity
3.3
unks �
Complexity of Algorithms
197
applying Algorithm 1 to the sequence of the first 1 00 terms, where n is an integer with n ::: 1 00, has constant complexity because it uses 99 comparisons no matter what n is (as the reader can verify). The linear search algorithm has linear (worst-case or average-case) complexity and the binary search algorithm has logarithmic (worst-case) complexity. Many important algorithms have n log n complexity, such as the merge sort, which we will introduce in Chapter 4 . An algorithm has polynomial complexity if it has complexity 8(n h ), where b is an integer with b ::: 1 . For example, the bubble sort algorithm is a polynomial-time algorithm because it uses 8(n 2 ) comparisons in the worst case. An algorithm has exponential complexity if it has time complexity 8(bn ), where b > 1 . The algorithm that determines whether a compound proposition in n variables is satisfiable by checking all possible assignments of truth variables is an algorithm with exponential complexity because it uses 8(2n ) operations. Finally, an algorithm has factorial complexity if it has 8(n ! ) time complexity. The algorithm that finds all orders that a travelling salesman could use to visit n cities has factorial complexity; we will discuss this algorithm in Chapter 9. A problem that is solvable using an algorithm with polynomial worst-case complexity is called tractable, because the expectation is that the algorithm will produce the solution to the problem for reasonably sized input in a relatively short time. However, if the polynomial in the big-8 estimate has high degree (such as degree 1 00) or if the coefficients are extremely large, the algorithm may take an extremely long time to solve the problem. Consequently, that a problem can be solved using an algorithm with polynomial worst-case time complexity is no guarantee that the problem can be solved in a reasonable amount of time for even relatively small input values. Fortunately, in practice, the degree and coefficients of polynomials in such estimates are often small. The situation is much worse for problems · that cannot be solved using an algorithm with worst-case polynomial time complexity. Such problems are called intractable. Usually, but not always, an extremely large amount of time is required to solve the problem for the worst cases of even small input values. In practice, however, there are situations where an algorithm with a certain worst-case time complexity may be able to solve a problem much more quickly for most cases than for its worst case. When we are willing to allow that some, perhaps small, number of cases may not be solved in a reasonable amount of time, the average-case time complexity is a better measure of how long an algorithm takes to solve a problem. Many problems important in industry are thought to be intractable but can be practically solved for essentially all sets of input that arise in daily life. Another way that intractable problems are handled when they arise in prac tical applications is that instead oflooking for exact solutions of a problem, approximate solutions are sought. It may be the case that fast algorithms exist for finding such approximate solutions, perhaps even with a guarantee that they do not differ by very much from an exact solution. Some problems even exist for which it can be shown that no algorithm exists for solving them. Such problems are called unsolvable (as opposed to solvable problems that can be solved using an algorithm). The first proof that there are unsolvable problems was provided by the great English mathematician and computer scientist Alan Turing when he showed that the halting problem is unsolvable. Recall that we proved that the halting problem is unsolvable in Section 3 . 1 . (A biography of Alan Turing and a description of some of his other work can be found in Chapter 1 2.) The study of the complexity of algorithms goes far beyond what we can describe here. Note, however, that many solvable problems are believed to have the property that no algorithm with polynomial worst-case time complexity solves them, but that a solution, if known, can be checked in polynomial time. Problems for which a solution can be checked in polynomial time are said to belong to the class NP (tractable problems are said to belong to class P).* There is also an important class of problems, called NP-complete problems, with the property that if any of these problems can be solved by a polynomial worst-case time algorithm, then all problems in the class NP can be solved by polynomial worst-case time algorithms. ' N P stands
for nondeterministic polynomial time.
198
3 / The Fundamentals: Algorithms, the Integers, and Matrices
3-32
The satisfiability problem is an example of an NP-complete problem-we can quickly verify that an assignment of truth values to the variables of a compound proposition makes it true, but no polynomial time algorithm has been discovered for finding such an assignment of truth values. [For example, an exhaustive search of all possible truth values requires E>(2n ) bit operations where n is the number of variables in the compound proposition.] Furthermore, if a polynomial time algorithm for solving the satisfiability problem were known, there would be polynomial time algorithms for all problems known to be in this class of problems (and there are many important problems in this class). Despite extensive research, no polynomial worst-case time algorithm has been found for any problem in this class. It is generally accepted, although not proven, that no NP-complete problem can be solved in polynomial time. For more information about the complexity of algorithms, consult the references, including [CoLeRiStO l ] , for this section listed at the end of this book. (Also, for a more formal discussion of computational complexity in terms of Turing machines, see Section 12.5.) Note that a big-E> estimate of the time complexity of an algorithm expresses how the time required to solve the problem can change as the input grows in size. In practice, the best estimate (that is, with the smallest reference function) that can be shown is used. However, big-E> esti mates of time complexity cannot be directly translated into the actual amount of computer time used. One reason is that a big-E> estimate f(n) is E>(g(n» , where f(n) is the time complexity of an algorithm and g(n) is a reference function, means that CJg(n) � f(n) � C2 g(n) when n > k, where C J , C2 , and k are constants. So without knowing the constants C J , C2 , and k in the inequality, this estimate cannot be used to determine a lower bound and an upper bound on the number of operations used in the worst case. As remarked before, the time required for an oper ation depends on the type of operation and the computer being used. Often, instead of a big-E> estimate on the worst-case time complexity of an algorithm, we have only a big-O estimate. Note that a big-O estimate on the time complexity of an algorithm provides an upper, but not a lower, bound on the worst-case time required for the algorithm as a function of the input size. Never theless, for simplicity, we will often use big- O estimates when describing the time complexity of algorithms, with the understanding that big-E> estimates would provide more information. However, the time required for an algorithm to solve a problem of a specified size can be determined if all operations can be reduced to the bit operations used by the computer. Table 2 displays the time needed to solve problems of various sizes with an algorithm using the indicated number of bit operations. Times of more than 1 0 1 00 years are indicated with an asterisk. (In Section 3 . 6 the number of bit operations used to add and multiply two integers will be discussed.) In the construction of this table, each bit operation is assumed to take 1 0-9 second, which is the time required for a bit operation using the fastest computers today. In the future, these times will decrease as faster computers are developed. It is important to know how long a computer will need to solve a problem. For instance, if an algorithm requires 1 0 hours, it may be worthwhile to spend the computer time (and money)
TABLE 2 The Computer Time Used by Algorithms. Bit Operations Used
Problem Size log n
n
n
n2
n log n
3
10 1 02
3 7
1 0-9 1 0-9
1 0-8
S
1 0-7
S
S
1 0-8 1 0-7
X
X
S
7
X
1 0-7
S
1 0-5
S
1 03
1 .0
X
1 0-8
S
1 0-6
S
1
X
1 0-5
S
1 0-3
S
x
S
S
n!
2n 1 0-6 S 1 0 1 3 yr
4X
3 *
*
*
*
*
1 04
1 .3
X
1 0-8 s
1 0-5
S
1
X
1 0-4
S
1 0- 1
1 05 1 06
1 .7
X
1 0-8 s
1 0-4 1 0-3
S
2 2
X
1 0-3
S
10 s
*
*
1 7 min
*
*
2
X
1 0-8
S
S
X
1 0-2 s
S
X
1 0-3
S
3.3 Complexity of Algorithms
3-33
199
required to solve this problem. But, if an algorithm requires 1 0 billion years to solve a problem, it would be unreasonable to use resources to implement this algorithm. One of the most interesting phenomena of modem technology is the tremendous increase in the speed and memory space of computers. Another important factor that decreases the time needed to solve problems on computers is parallel processing, which is the technique of performing sequences of operations simultaneously. Efficient algorithms, including most algorithms with polynomial time complexity, bene fit most from significant technology improvements. However, these technology improvements offer little help in overcoming the complexity of algorithms of exponential or factorial time complexity. Because of the increased speed of computation, increases in computer memory, and the use of algorithms that take advantage of parallel processing, many problems that were considered impossible to solve five years ago are now routinely solved, and certainly five years from now this statement will still be true.
Exercises procedure
1. How many comparisons are used by the algorithm given
2.
3.
4.
in Exercise 1 6 of Section 3 . 1 to find the smallest natural number in a sequence of natural numbers? Write the algorithm that puts the first four terms of a list of arbitrary length in increasing order. Show that this al gorithm has time complexity 0 ( 1 ) in terms ofthe number of comparisons used. Suppose that an element is known to be among the first four elements in a list of 32 elements. Would a linear search or a binary search locate this element more rapidly? Determine the number of multiplications used to find starting with x and successively squaring (to find X4 and so on). Is this a more efficient way to find than by multiplying x by itself the appropriate number of times? Give a big-O estimate for the number of comparisons used by the algorithm that determines the number of 1 s in a bit string by examining each bit of the string to determine whether it is a 1 bit (see Exercise of Section 3 . 1 ). a) Show that this algorithm determines the number of 1 bits in the bit string S :
n
x 2'
5.
*6.
k 2 X x2 ,
,
25
count :S= i=0 0bit count ( S : bit string) := count + 1 count S := S (S - 1 ) {count i s the number o f 1 s in S } procedure
while begin end
power y :=i ao:=1 I n power c y {y:= =y :+=a power aicn +power n an_l Cn -1 + . . . + alc + ao l where the final value of y is the value of the polynomial at x = c. Evaluate 3x 2 + x + 1 at x = 2 by working through each step ofthe algorithm showing the values assigned for := begin
*
end
a)
at each assignment step.
b) Exactly how many multiplications and additions are
n =c
used to evaluate a polynomial of degree at x ? (Do not count additions used to increment the loop variable.) 8. There is a more efficient algorithm (in terms of the num ber of multiplications and additions used) for evaluating polynomials than the conventional algorithm described in the previous exercise. It is called Horner's method. This pseudocode shows how to use this method to find the value ' of +...+ + at x + procedure real numbers)
alx ao = c
an xn an _lX n - 1 alX ao = c. Horner(c, ao, aI, a2 , . . . , an : y :=i a:=n 1 n + an-i n -1 + . . . + alc + ao l {y y=:=ancyn 2+c an_lC Evaluate 3x + x + 1 at x = 2 by working through for
to
*
-
anxn an_lXn-1
to
*
/\
Here S I i s the bit string obtained by changing the rightmost 1 bit of S to a 0 and all the 0 bits to the right of this to 1 s. [Recall that S /\ (S - 1 ) is the bitwise AND of S and S - 1 .] b) How many bitwise AND operations are needed to find the number of 1 bits in a string S? 7. The conventional algorithm for evaluating a polynomial + +...+ + at x can be ex pressed in pseudocode by
polynomial(c, ao, aI, . . . , an : real
numbers)
a)
each step ofthe algorithm showing the values assigned at each assignment step. b) Exactly how many multiplications and additions are used by this algorithm to evaluate a polynomial of degree at x ? (Do not count additions used to increment the loop variable.) 9. What is the largest for which one can solve in one
n =c n
200
3 / The Fundamentals: Algorithms, the Integers, and Matrices
second a problem using an algorithm that requires fen ) bit operations, where each bit operation i s carried out in 1 0 - 9 second, with these values for f(n)? a) log n b) n c) n log n 1) n ! e) 2 n d) n 2 10. How much time does an algorithm take to solve a problem of size n if this algorithm uses 2n 2 2 n bit operations, each requiring 1 0 - 9 second, with these values of n ? a) 10 b) 20 c ) 50 d) 1 00 1 1 . How much time does an algorithm using 2 50 bit operations need if each bit operation takes these amounts of time? a) 1 0 - 6 second b) 1 0- 9 second c) 1 0 - 1 2 second 12. Determine the least number of comparisons, or best-case performance, a) required to find the maximum of a sequence of n in tegers, using Algorithm 1 of Section 3 . 1 . b) used to locate an element in a list of n terms with a linear search. c) used to locate an element in a list of n terms using a binary search. 13. Analyze the average-case performance ofthe linear search algorithm, if exactly half the time element x is not in the list and if x is in the list it is equally likely to be in any position. 14. An algorithm is called optimal for the solution of a prob lem with respect to a specified operation if there is no algorithm for solving this problem using fewer operations. a) Show that Algorithm 1 in Section 3 . 1 is an optimal al gorithm with respect to the number of comparisons of integers. Comparisons used for bookkeeping in the loop are not of concern here.) b) Is the linear search algorithm optimal with respect to the number of comparisons of integers (not including comparisons used for bookkeeping in the loop)? 1 5. Describe the worst-case time complexity, measured in terms of comparisons, of the ternary search algorithm described in Exercise 27 of Section 3 . 1 . 16. Describe the worst-case time complexity, measured in terms of comparisons, of the search algorithm described in Exercise 28 of Section 3 . 1 . 17. Analyze the worst-case time complexity o f the algorithm you devised in Exercise 29 of Section 3 . 1 for locating a mode in a list of nondecreasing integers. 18. Analyze the worst-case time complexity of the algorithm you devised in Exercise 30 of Section 3 . 1 for locating all modes in a list of nondecreasing integers.
3-34
19. Analyze the worst-case time complexity of the algorithm
20.
+
(Note:
21.
22.
23.
24.
25.
26.
27.
28.
you devised in Exercise 3 1 of Section 3 . 1 for finding the first term of a sequence of integers equal to some previous term. Analyze the worst-case time complexity of the algorithm you devised in Exercise 32 of Section 3 . 1 for finding all terms of a sequence that are greater than the sum of all previous terms. Analyze the worst-case time complexity of the algorithm you devised in Exercise 33 of Section 3 . 1 for finding the first term of a sequence less than the immediately preced ing term. Determine the worst-case complexity in terms of com parisons of the algorithm from Exercise 5 in Section 3 . 1 for determining all values that occur more than once in a sorted list of integers. Determine the worst-case complexity in terms of compar isons of the algorithm from Exercise 9 in Section 3 . 1 for determining whether a string is a palindrome. How many comparisons does the selection sort (see preamble to Exercise 4 1 in Section 3 . 1 ) use to sort n items? Use your answer to give a big-O estimate of the complex ity ofthe selection sort in terms of number of comparisons for the selection sort. Find a big-O estimate for the worst-case complexity in terms of number of comparisons used and the number of terms swapped by the binary insertion sort described in the preamble to Exercise 47 in Section 3 . 1 . Show that the greedy algorithm for making change for n cents using quarters, dimes, nickels, and pennies has O(n) complexity measured in terms of comparisons needed. Describe how the number of comparisons used in the worst case changes when these algorithms are used to search for an element of a list when the size of the list doubles from n to 2n, where n is a positive integer. a) linear search b) binary search Describe how the number of comparisons used in the worst case changes when the size of the list to be sorted doubles from n to 2n , where n is a positive integer when these sorting algorithms are used. b) insertion sort a) bubble sort c) selection sort (described in the preamble to Exercise 4 1 in Section 3 . 1 ) d) binary insertion sort (described in the preamble to Exercise 47 in Section 3 . 1 )
3.4 The Integers and Division Introduction The part o f mathematics involving the integers and their properties belongs to the branch of mathematics called number theory. This section is the beginning of a four-section introduction
3.4 The Integers
3-35
and Division
201
to number theory. We will develop the basic concepts of number theory used throughout com puter science. In this section we will review some basic concepts of number theory, including divisibility and modular arithmetic. In Section 3.5 we will cover prime numbers and begin our discussion of greatest common divisors. In Section 3 . 6 we will describe several important algo rithms from number theory, tying together the material in Sections 3 . 1 and 3 .3 on algorithms and their complexity with the notions introduced in this section and in Section 3 . 5 . For example, we will introduce algorithms for finding the greatest common divisor of two positive integers and for performing computer arithmetic using binary expansions. Finally, in Section 3.7, we will con tinue our study ofnumber theory by introducing some important results and their applications to computer arithmetic and cryptology, the study of secret messages. We will rely on proof methods we studied in Chapter 1 to develop basic theorems in number theory. This will give you an oppor tunity to use what you have learned about proof methods and proof strategies in a new setting. The ideas that we will develop in this section are based on the notion of divisibility. Divi sion of an integer by a positive integer produces a quotient and a remainder. Working with these remainders leads to modular arithmetic, which is used throughout computer science. We will discuss three applications of modular arithmetic in this section: generating pseudorandom num bers, assigning computer memory locations to files, and encrypting and decrypting messages.
Division When one integer is divided by a second, nonzero integer, the quotient may or may not be an inte ger. For example, 1 2/3 = 4 is an integer, whereas 1 1 /4 = 2.75 is not. This leads to Definition 1 .
DEFINITION 1
If a and b are integers with a #- 0, we say that a divides b if there is an integer c such that = ac. When a divides b we say that a is a factor of b and that b is a multiple of a . The notation a I b denotes that a divides b. We write a 1 b when a does not divide b. b
Remark: We can express a I b using quantifiers as 3c(ac
is the set of integers.
= b), where the universe of discourse
In Figure 1 a number line indicates which integers are divisible by the positive integer d. EXAMPLE 1
Determine whether 3 1 7 and whether 3 I 1 2 .
Solution: It follows that 3 17, because 7/3 is not an integer. On the other hand, 3 I 1 2 because
Let n and d be positive integers. How many positive integers not exceeding n are divisible by d?
Solution: The positive integers divisible by d are all the integers of the form dk, where k is a positive integer. Hence, the number of positive integers divisible by d that do not exceed n equals the number of integers k with 0 < dk :s n , or with 0 < k :s n/d. Therefore, there are
- 3d
- 2d
FIGURE 1
-d
o
d
2d
Integers Divisible by the Positive Integer d.
1 ..
3d
202
3 / The Fundamentals: Algorithms, the Integers, and Matrices THEOREM l
3-36
Let a , b, and c be integers. Then (i) if a I b and a I c, then a I (b + c); (ii) if a I b, then a I bc for all integers c; (iii) if a I b and b I c, then a I c .
Proof: We will give a direct proof of (i) . Suppose that a I b and a I c. Then, from the definition of divisibility, it follows that there are integers s and t with b = as and c = at. Hence,
b + c = as + at = a(s + t) . Therefore, a divides b + c . This establishes part (i) of the theorem. The proofs of parts (ii) and <J (iii) are left as exercises for the reader. Theorem 1 has this useful consequence.
COROLLARY l
If a, b, and c are integers such that a I b and a I c, then a 1 mb + nc whenever m and n are integers.
Proof: We will give a direct proof. By part (ii) of Theorem 1 it follows that a I mb and a I nc <J whenever m and n are integers. By part (i) of Theorem 1 it follows that a 1 mb + nco
The Division Al g orithm When an integer is divided by a positive integer, there is a quotient and a remainder, as the division algorithm shows.
THEOREM 2
THE DIVISION ALGORITHM Let a be an integer and d a positive integer. Then there are unique integers q and r , with 0 � r < d, such that a = dq· + r .
We defer the proof o f the division algorithm to Example 5 and Exercise 37 in Section 4.2. Remark: Theorem 2 is not really an algorithm. (Why not?) Nevertheless, we use its traditional
name.
DEFINITION 2
In the equality given in the division algorithm, d is called the divisor, a is called the dividend, q is called the quotient, and r is called the remainder. ihis notation is used to express the quotient and remainder:
q = a div d,
r
=
a mod d .
Examples 3 and 4 illustrate the division algorithm.
3.4 The Integers and Division
3-3 7
EXAMPLE 3
203
What are the quotient and remainder when 1 0 1 is divided by I I ?
Solution: We have 1 0 1 = 1 1 · 9 + 2.
Hence, the quotient when 1 0 1 is divided by 1 1 is 9 = 101 div 1 1 , and the remainder is 2 =
....
1 0 1 mod 1 1 .
EXAMPLE 4
What are the quotient and remainder when - 1 1 is divided by 3?
Solution: We have - 1 1 = 3(-4) + 1 .
Hence, the quotient when - 1 1 is divided by 3 is -4 = - 1 1 div 3, and the remainder is 1 = - 1 1 mod 3 . Note that the remainder cannot b e negative. Consequently, the remainder i s not - 2 , even though - 1 1 = 3(-3) - 2,
because r = -2 does not satisfy 0 .:::
r <
3.
Note that the integer a i s divisible by the integer d i f and only i f the remainder i s zero when
a is divided by d.
Modular Arithmetic In some situations we care only about the remainder of an integer when it is divided by some specified positive integer. For instance, when we ask what time it will be (on a 24-hour clock) 50 hours from now, we care only about the remainder when 50 plus the current hour is divided by 24. Because we are often interested only in remainders, we have special notations for them. We have a notation to indicate that two integers have the same remainder when they are divided by the positive integer m .
DEFINITION 3
I f a and b are integers and m i s a positive integer, then a i s congruent to b modulo
m if m divides a - b. We use the notation a = b (mod m ) to indicate that a is congruent to b modulo m . If a and b are not congruent modulo m , we write a ¢: b (mod m).
The connection between the notations used when working with remainders is made clear in Theorem 3 .
THEOREM 3
Let a and b b e integers, and let m b e a positive integer. Then a if a mod m = b mod m .
=
b
(mod m ) if and only
The proof of Theorem 3 i s left as Exercises 1 1 and 1 2 at the end of this section.
EXAMPLE 5
Determine whether 1 7 is congruent to 5 modulo 6 and whether 24 and 1 4 are congruent modulo 6.
204
3/
3-38
The Fundamentals: Algorithms, the Integers, and Matrices
Solution: Because 6 divides 1 7 - 5 = 1 2 , we see that 1 7 == 5 (mod 6) . However, because 24 .... 14 = 1 0 is not divisible by 6, we see that 24 ¢ 1 4 (mod 6). The great German mathematician Karl Friedrich Gauss developed the concept of congru ences at the end of the eighteenth century. The notion of congruences has played an important role in the development of number theory. Theorem 4 provides a useful way to work with congruences. THEOREM
4
Let m be a positive integer. The integers a and b are congruent modulo m if and only if there is an integer k such that a = b + km .
Proof: If a == b (mod m), then m I (a - b). This means that there is an integer k such that a - b = km , so that a = b + km . Conversely, if there is an integer k such that a = b + km ,
THEOREM S
Let m be a positive integer. If a
a +c
==
b + d (mod m)
==
b (mod m) and c
ac
and
Proof: Because a == b (mod m ) and c and d = c + tm . Hence,
==
==
==
d (mod m), then
bd (mod m).
d (mod m), there are integers s and t with b = a + sm
b + d = (a + sm) + (c + tm) = (a + c) + m (s + t) and
bd = (a + sm)(c + tm) = ac + m eat + cs + stm). Hence,
a+c
==
b + d (mod m )
and
ac
==
bd (mod m).
links � KARL FRIEDRICH GAUSS ( 1 777- 1 855) Karl Friedrich Gauss, the son of a bricklayer, was a child prodigy. He demonstrated his potential at the age of 1 0, when he quickly solved a problem assigned by a teacher to keep the class busy. The teacher asked the students to find the sum of the first 1 00 positive integers. Gauss realized that this sum could be found by forming 50 pairs, each with the sum 1 0 1 : 1 + 1 00, 2 + 99, . . . , 50 + 5 1 . This brilliance attracted the sponsorship of patrons, including Duke Ferdinand of Brunswick, who made it possible for Gauss to attend Caroline College and the University of Gottingen. While a student, he invented the method of least squares, which is used to estimate the most likely value of a variable from experimental results. In 1 796 Gauss made a fundamental discovery in geometry, advancing a subject that had not advanced since ancient times. He showed that a 1 7 -sided regular polygon could be drawn using just a ruler and compass. In 1 799 Gauss presented the first rigorous proof of the Fundamental Theorem of Algebra, which states that a polynomial of degree n has exactly n roots (counting multiplicities). Gauss achieved worldwide fame when he successfully calculated the orbit of the first asteroid discovered, Ceres, using scanty data. Gauss was called the Prince of Mathematics by his contemporary mathematicians. Although Gauss is noted for his many discoveries in geometry, algebra, analysis, astronomy, and physics, he had a special interest in number theory, which can be seen from his statement "Mathematics is the queen of the sciences, and the theory of numbers is the queen of mathematics." Gauss laid the foundations for modem number theory with the publication of his book Disquisitiones Arithmeticae in 1 80 1 .
3.4
3-39 EXAMPLE 6
Because 7
==
2 (mod 5) and 1 1
18 = 7 + 1 1
==
==
The Integers and Division
205
1 (mod 5), it follows from Theorem 5 that
2 + 1 = 3 (mod 5)
and that 77 = 7 . 1 1
==
2 . 1 = 2 (mod 5).
We will need the following corollary of Theorem 5 in Section 4.4.
COROLLARY 2
Let m be a positive integer and let a and b be integers. Then (a + b) mod m = « a mod m ) + (b mod m )) mod m
and ab mod m = « a mod m )(b mod m )) mod m .
Proof: By the definitions mod m and the definition of congruence modulo m , we know that a == (a mod m ) (mod m ) and b == (b mod m ) (mod m). Hence, Theorem 5 tell us that a+b
==
(a mod m ) + (b mod m ) (mod m )
and ab
==
(a mod m )(b mod m ) (mod m).
The equalities in this corollary follow from these last two congruences by Theorem 3 .
We must be careful working with congruences. Some properties we may expect to be true are not valid. For example, if ae == be (mod m), the congruence a == b (mod m ) may be false. Similarly, if a == b (mod m) and e == d (mod m), the congruence a C == bd (mod m) may be false. (See Exercise 23 .)
Applications of Congruences Number theory has applications to a wide range of areas. We will introduce three applications in this section: the use of congruences to assign memory locations to computer files, the generation of pseudorandom numbers, and cryptosystems based on modular arithmetic. EXAMPLE
7
UnkS �
Hashing Functions The central computer at an insurance company maintains records for each of its customers. How can memory locations be assigned so that customer records can be retrieved quickly? The solution to this problem is to use a suitably chosen hashing function. Records are identified using a key, which uniquely identifies each customer's records. For in stance, customer records are often identified using the Social Security number of the customer as the key. A hashing function h assigns memory location h (k) to the record that has k as its key. In practice, many different hashing functions are used. One of the most common is the function
h(k) = k mod m where m is the number of available memory locations.
206
3/
The Fundamentals: Algorithms, the Integers, and Matrices
3-40
Hashing functions should be easily evaluated so that files can be quickly located. The hashing function h (k) = k mod m meets this requirement; to find h (k) , we need only compute the remainder when k is divided by m . Furthermore, the hashing function should be onto, so that all memory locations are possible. The function h (k) = k mod m also satisfies this property. For example, when m = 1 1 1 , the record of the customer with Social Security number 0642 1 2 848 is assigned to memory location 1 4, because h (0642 1 2 848) = 0642 1 2848 mod 1 1 1 = 1 4.
Similarly, because h (03 7 1 492 1 2) = 037 1 492 1 2 mod 1 1 1 = 65 ,
the record of the customer with Social Security number 037 1 492 1 2 is assigned to memory location 65. Because a hashing function is not one-to-one (because there are more possible keys than memory locations), more than one file may be assigned to a memory location. When this happens, we say that a collision occurs. One way to resolve a collision is to assign the first free location following the occupied memory location assigned by the hashing function. For example, after making the two earlier assignments, we assign location 1 5 to the record of the customer with the Social Security number 1 07405723 . To see this, first note that h (k) maps this Social Security number to location 1 4, because h ( 1 07405723) = 1 07405723 mod 1 1 1 = 1 4 ,
but this location is already occupied (by the file of the customer with Social Security number 0642 1 2 848). However, memory location 1 5 , the first location following memory location 14, is free. There are many more sophisticated ways to resolve collisions that are more efficient than the simple method we have described. These are discussed in the references on hashing functions � given at the end of the book. EXAMPLE 8
unkS�
Randomly chosen numbers are often needed for computer simu lations. Different methods have been devised for generating numbers that have properties of randomly chosen numbers. Because numbers generated by systematic methods are not truly random, they are called pseudorandom numbers. The most commonly used procedure for generating pseudorandom numbers is the linear congruential method. We choose four integers: the modulus m, multiplier a, increment c, and seed Xo, with 2 :s a < m , 0 :s c < m , and 0 :s Xo < m . We generate a sequence of pseudorandom numbers {xn }, with 0 :s Xn < m for all n , by successively using the congruence Pseudorandom Numbers
Xn + ! = (axn + c) mod m . (This is an example of a recursive definition, discussed in Section 4.3 . In that section we will show that such sequences are well defined.) Many computer experiments require the generation of pseudorandom numbers between 0 and 1 . To generate such numbers, we divide numbers generated with a linear congruential generator by the modulus: that is, we use the numbers Xn / m .
3.4
3-41
The Integers and Division
207
m
= 9,
For instance, the sequence of pseudorandom numbers generated by choosing
a = 7, c = 4, and Xo = 3 , can be found as follows: X I = txo + 4 mod 9 = 7 . 3 + 4 mod 9 X2 = 7x + 4 mod 9 = 7 . 7 + 4 mod 9 1 X 3 = 7X2 + 4 mod 9 = 7 . 8 + 4 mod 9 X4 = 7X3 + 4 mod 9 = 7 . 6 + 4 mod 9 X 5 = 7X4 + 4 mod 9 = 7 . 1 + 4 mod 9
= 25 mod 9 = 53 mod 9 = 60 mod 9 = 46 mod 9 = 1 1 mod 9
X6 = 7X 5 + 4 mod 9 = 7 . 2 + 4 mod 9 = 1 8 mod 9 X7 = 7X6 + 4 mod 9 = 7 . ° + 4 mod 9 = 4 mod 9
Xg = 7X7 + 4 mod 9 = 7 · 4 + 4 mod 9 = 32 mod 9 X9 = txg + 4 mod 9 = 7 . 5 + 4 mod 9 = 39 mod 9
=7 =8 =6 =1 =2 =° =4 =5 =3
Because X9 = Xo and because each term depends only on the previous term, this sequence is generated: 3 , 7, 8 , 6, 1 , 2, 0, 4, 5 , 3 , 7, 8 , 6 , 1 , 2 , 0, 4, 5 , 3 , . . . .
This sequence contains nine different numbers before repeating. Most computers do use linear congruential generators to generate pseudorandom numbers. Often, a linear congruential generator with increment c = ° is used. Such a generator is called a pure multiplicative generator. For example, the pure multiplicative generator with modulus 2 3 1 - 1 and multiplier 7 5 = 1 6, 807 is widely used. With these values, it can be shown that ... 2 3 1 - 2 numbers are generated before repetition begins.
Cryptology Congruences have many applications to discrete mathematics and computer science. Discussions of these applications can be found in the suggested readings given at the end of the book. One of the most important applications of congruences involves cryptology, which is the study of secret messages. One of the earliest known uses of cryptology was by Julius Caesar. He made messages secret by shifting each letter three letters forward in the alphabet (sending the last three letters of the alphabet to the first three). For instance, using this scheme the letter B is sent to E and the letter X is sent to A. This is an example of encryption, that is, the process of making a message secret. To express Caesar's encryption process mathematically, first replace each letter by an integer from ° to 25, based on its position in the alphabet. For example, replace A by 0, K by 1 0, and Z by 25 . Caesar's encryption method can be represented by the function f that assigns to the nonnegative integer p, p :::: 25, the integer f(p) in the set {O, 1 , 2 , . . . , 25} with
f(p) = (p + 3) mod 26. In the encrypted version of the message, the letter represented by p is replaced with the letter represented by (p + 3) mod 26. EXAMPLE 9
What is the secret message produced from the message "MEET YOU IN THE PARK" using the Caesar cipher?
Solution: First replace the letters in the message with numbers. This produces 12 4 4 19
24 1 4 20
8 13
19 7 4
1 5 0 1 7 1 0.
208
3/
The Fundamentals: Algorithms, the Integers, and Matrices
3-42
Now replace each of these numbers p by f(p) = (p + 3) mod 26. This gives I S 7 7 22
1 1 7 23
1 1 16
22 1 0 7
1 8 3 20 1 3 .
Translating this back to letters produces the encrypted message "PHHW B RX LQ WKH <0lIl SDUN." To recover the original message from a secret message encrypted by the Caesar cipher, the function f- I , the inverse of f, is used. Note that the function f- I sends an integer p from {O, 1 , 2, . . . , 25} to f- I (p) = (p - 3) mod 26. In other words, to find the original message, each letter is shifted back three letters in the alphabet, with the first three letters sent to the last three letters of the alphabet. The process ofdetermining the original message from the encrypted message is called decryption. There are various ways to generalize the Caesar cipher. For example, instead of shifting each letter by 3, we can shift each letter by k, so that
f(p) = (p + k) mod 26. Such a cipher is called a shift cipher. Note that decryption can be carried out using I f - (p) = (p - k) mod 26. Obviously, Caesar's method and shift ciphers do not provide a high level of security. There are various ways to enhance this method. One approach that slightly enhances the security is to use a function of the form
f(p) = (ap + b) mod 26, where a and b are integers, chosen such that f is a bijection. (Such a mapping is called an affine transformation.) This provides a number of possible encryption systems. The use of one of these systems is illustrated in Example 1 0. EXAMPLE 10
What letter replaces the letter K when the function f(p) = (7p + 3) mod 26 is used for en cryption?
Solution: First, note that 10 represents K. Then, using the encryption function specified, it follows that f(l O) = (7 · 1 0 + 3) mod 26 = 2 1 . Because 2 1 represents V , K is replaced by V <0lIl in the encrypted message.
unkS �
Caesar's encryption method, and the generalization of this method, proceed by replacing each letter of the alphabet by another letter in the alphabet. Encryption methods of this kind are vulnerable to attacks based on the frequency of occurrence of letters in the message. More sophisticated encryption methods are based on replacing blocks of letters with other blocks of letters. There are a number of techniques based on modular arithmetic for encrypting blocks of letters. A discussion of these can be found in the suggested readings listed at the end of the book.
Exercises 1. Does 1 7 divide each of these numbers?
a) 68
b) 84
c) 357
d) 1 00 1
2 . Show that i f a i s a n integer other than 0 , then a) 1 divides a. b) a divides O. 3. Show that part (ii ) of Theorem 1 is true.
4. Show that part (iii ) of Theorem 1 is true.
b b -b. b I a, b i d, aba, I b,cd. d
5. Show that if a I and then a = or a = 6. Show that if then
c , and
where a and
b
are integers,
are integers such that a I
C and
3.4 The Integers and Division 209
3-43
e be, then a I b. e
e =1= 0, such that Prove or disprove that if a I be, where a, b, and e are pos itive integers, then a I b or a I e.
7. Show that if a, b, and are integers with
a I 8.
9. What are the quotient and remainder when
a) b) c) d) e)
1 9 is divided by 7? - 1 1 1 is divided by I I ? 789 is divided by 23? 1 00 1 is divided by 13? 0 is divided by 1 9? t) 3 is divided by 5? g) -1 is divided by 3? b) 4 is divided by I? 10. What are the quotient and remainder when a) 44 is divided by 8? b) 777 is divided by 2 1 ? c) - 1 23 is divided by 1 9? d) - 1 is divided by 23? e) -2002 is divided by 87? t) 0 is divided by 1 7? g) 1 ,234,567 is divided by 1 00 1 ? b) - 1 00 i s divided by 1 O I ? 1 1 . Let m be a positive integer. Show that a = b (mod m ) if a mod m = b mod m . 12. Let m be a positive integer. Show that a mod m = b mod m if a = b (mod m). 13. Show that if n and are positive integers, then rn / 1 = L(n - 1 ) / J 1 . 14. Show that if a is an integer and d is a positive inte ger greater than 1 , then the quotient and remain der obtained when a is divided by d are La/dJ and a - d La/dJ , respectively. 15. Find a formula for the integer with smallest absolute value that is congruent to an integer a modulo m , where m is a positive integer. 16. Evaluate these quantities. a) - 1 7 mod 2 b) 1 44 mod 7 c) - 1 0 1 mod 1 3 d) 1 99 mod 1 9 17. Evaluate these quantities. a) 13 mod 3 b ) -97 mod 1 1 c) 155 mod 1 9 d) -22 1 mod 23 18. List five integers that are congruent to 4 modulo 1 2 . 1 9 . Decide whether each o f these integers i s congruent to 5 modulo 1 7. a) 80 b) 1 03 c) -29 d) - 1 22 20. Show that if a = b (mod m) and = d (mod m), where a, b, d, and m are integers with m =:': 2, then a - = b - d (mod m). 21. Show that if n I where n and are positive integers greater than 1 , and if a = (mod where a and are integers, then a = (mod n). 22. Show that if a, b, e, and are integers such that m =:': 2, > 0, and a = b (mod m), then ae = (mod
k+ k
e,
e
k
e
m, b
b mm), m be
e
b
me).
23. Find counterexamples to each of these statements about
congruences. a) If ae = (mod m), where a, b, and m are integers with m =:': 2, then a = b (mod m). b) If a = b (mod m ) and = (mod m ), where a, b, d, and m are integers with and positive and m =:': 2, then aC = bd (mod m). 24. Prove that if n is an odd positive integer, then n 2 = 1 (mod 8). 25. Show that if a, b, and m are integers such that =:': 1 , m =:': 2, and a = b (mod m ), then a k = b k (mod m ) whenever i s a positive integer. 26. Which memory locations are assigned by the hashing function = mod 1 0 1 to the records of insurance company customers with these Social Security numbers? a) 1 04578690 b) 432222 1 87 c) 37220 1 9 1 9 d) 501 33 8753 27. A parking lot has 3 1 visitor spaces, numbered from 0 to 30. Visitors are assigned parking spaces using the hash ing function = mod 3 1 , where is the number formed from the first three digits on a visitor's license plate. a) Which spaces are assigned by the hashing function to cars that have these first three digits on their license plates? 3 1 7, 9 1 8, 007, 1 00, 1 1 1 , 3 1 0 b) Describe a procedure visitors should follow to find a free parking space, when the space they are assigned is occupied. 28. What sequence of pseudorandom numbers is gener ated using the linear congruential generator Xn+ 1 = (4xn 1 ) mod 7 with seed xo = 3? 29. What sequence of pseudorandom numbers is gener ated using the pure multiplicative generator Xn+ 1 = 3 xn mod 1 1 with seed Xo = 2? 30. Write an algorithm in pseudocode for generating a se quence of pseudorandom numbers using a linear congru ential generator. 3 1 . Encrypt the message "DO NOT PASS GO" by translating the letters into numbers, applying the encryption func tion given, and then translating the numbers back into letters. a) f(p ) = (p 3) mod 26 (the Caesar cipher) b) f(p ) = (p 1 3 ) mod 26 c) f(p ) = (3 p + 7) mod 26 32. Decrypt these messages encrypted using the Caesar cipher.
be
k
e, e ed d
k h(k) k
e,
k,
h(k) k
k
+
++
a) EOXH MHDQV b) WHVW WRGDB c) HDW GLP VXP
All books are identified by an International Standard Book Number (ISBN), a l O-digit code X I X2 XIO, assigned by the publisher. (This system will change in 2007 when a new 1 3-digit code will be introduced.) These 1 0 digits consist of . • .
210
3 / The Fundamentals: Algorithms, the Integers, and Matrices
blocks identifying the language, the publisher, the number as signed to the book by its publishing company, and finally, a I -digit check digit that is either a digit or the letter X (used to represent 1 0). This check digit is selected so that L:�l == 0 (mod 1 1 ) and is used to detect errors in individual digits and transposition of digits.
ix;
33. The first nine digits of the ISBN of the European version
3-44
of the fifth edition of this book are 0-07- 1 1 988 1 . What is the check digit for that book? 34. The ISBN of the fifth edition of is 0-32- 1 23Q072, where Q is a digit. Find the value of Q . 35. Determine whether the check digit o f the ISBN for this textbook was computed correctly by the publisher.
ory and Its Applications
Elementary Number The
3.5 Primes and Greatest Common Divisors Introduction In Section 3 .4 we studied the concept of divisibility of integers. One important concept based on divisibility is that of a prime number. A prime is an integer greater than I that is divisible only by I and by itself. The study of prime numbers goes back to ancient times. Thousands of years ago it was known that there are infinitely many primes. Great mathematicians of the last 400 years have studied primes and have proved many important results about them. However, many old conjectures about primes remain unsettled. Primes have become essential in modern cryptographic systems. An important theorem, the Fundamental Theorem of Arithmetic, asserts that every positive integer can be written uniquely as the product of prime numbers. The length oftime required to factor large integers into their prime factors plays an important role in cryptography. In this section we will also study the greatest common divisor of two integers, as well as the least common multiple of two integers. We will develop an algorithm for computing greatest common divisors in Section 3 .6.
Primes Every positive integer greater than I is divisible by at least two integers, because a positive integer is divisible by I and by itself. Positive integers that have exactly two different positive integer factors are called primes.
DEFINITION 1
A positive integer p greater than 1 is called prime if the only positive factors of p are 1 and p. A positive integer that is greater than 1 and is not prime is called composite.
Remark: The integer n is composite if and only if there exists an integer a such that a I n and 1 < a < n. EXAMPLE 1
The integer 7 is prime because its only positive factors are 1 and 7, whereas the integer 9 is .... composite because it is divisible by 3 . The primes less than 1 00 are 2, 3 , 5, 7, 1 1 , 1 3 , 1 7, 1 9, 23, 29, 3 1 , 37, 4 1 , 43, 47, 53, 59, 6 1 , 67, 7 1 , 73, 79, 83, 89, and 97. In Section 7.6 we introduce a procedure, known as the sieve of Eratosthenes, that can be used to find all the primes not exceeding an integer n .
The primes are the building blocks of positive integers, as the Fundamental Theorem of Arithmetic shows. The proof will be given in Section 4.2.
3.5 Primes and Greatest Common Divisors
3-45
THEOREM l
211
Every positive integer greater than 1 can be written uniquely as a prime or as the product of two or more primes where the prime factors are written in order of nondecreasing size.
THE FUNDAMENTAL THEOREM OF ARITHMETIC
Example 2 gives some prime factorizations of integers. EXAMPLE
2
The prime factorizations of 1 00, 64 1 , 999, and 1 024 are given by 1 00 = 2 · 2 · 5 · 5 = 2 2 5 2 , 64 1 = 64 1 , 999 = 3 . 3 · 3 · 37 = 3 3 · 37, 1 024 = 2 · 2 · 2 · 2 · 2 · 2 · 2 · 2 · 2 · 2 = 2 1 0 .
It is often important to show that a given integer is prime. For instance, in cryptology, large primes are used in some methods for making messages secret. One procedure for showing that an integer is prime is based on the following observation.
THEOREM
2
If n is a composite integer, then n has a prime divisor less than or equal to ..;n.
Proof: If n is composite, by the definition of a composite integer, we know that it has a factor a with 1 < a < n . Hence, by the definition of a factor ofa positive integer, we have n = ab, where b is a positive integer greater than I . We will show that a :s ..;n or b :s ..;n. If a > ..;n and b > ..;n, then ab > ..;n . ..;n = n , which is a contradiction. Consequently, a :s ..;n or b :s ..;n. Because both a and b are divisors of n , we see that n has a positive divisor not exceeding ..;n. This divisor is either prime or, by the Fundamental Theorem of Arithmetic, has a prime divisor <J less than itself. In either case, n has a prime divisor less than or equal to ..;n. From Theorem 2, it follows that an integer is prime if it is not divisible by any prime less than or equal to its square root. In the following example this observation is used to show that 1 0 1 is prime. EXAMPLE 3
Show that 1 0 1 is prime.
Solution: The only primes not exceeding -JIOf are 2, 3, 5, and 7. Because 1 0 1 is not divisible by 2 , 3, 5, or 7 (the quotient of 1 0 1 and each of these integers is not an integer), it follows that � 1 0 1 is prime. Because every integer has a prime factorization, it would be useful to have a procedure for finding this prime factorization. Consider the problem of finding the prime factorization of n . Begin by dividing n by successive primes, starting with the smallest prime, 2. If n has a prime factor, then by Theorem 3 a prime factor p not exceeding ..;n will be found. So, if no prime factor not exceeding ..;n is found, then n is prime. Otherwise, if a prime factor p is found, continue by factoring n j p . Note that n j p has no prime factors less than p . Again, if n j p has no prime factor greater than or equal to p and not exceeding its square root, then it is prime. Otherwise, if it has a prime factor q , continue by factoring n j (pq). This procedure is continued until the factorization has been reduced to a prime. This procedure is illustrated in Example 4.
212
3 / The Fundamentals: Algorithms, the Integers, and Matrices
EXAMPLE
4
3-46
Find the prime factorization of 7007.
Solution: To find the prime factorization of 7007, first perform divisions of 7007 by successive primes, beginning with 2. None of the primes 2, 3, and 5 divides 7007. However, 7 divides 7007, with 7007/7 = 1 00 1 . Next, divide 1 00 1 by successive primes, beginning with 7. It is immediately seen that 7 also divides 1 00 1 , because 1 00 1 /7 = 1 43 . Continue by dividing 143 by successive primes, beginning with 7. Although 7 does not divide 1 43, 1 1 does divide 143, and 1 43 / 1 1 = 1 3 . Because 13 is prime, the procedure is completed. It follows that the prime .... factorization of 7007 is 7 . 7 . 1 1 . 1 3 = 7 2 . 1 1 . 1 3 .
LinkS �
Prime numbers were studied in ancient times for philosophical reasons. Today, there are highly practical reasons for their study. In particular, large primes play a crucial role in cryp tography, as we will see in Section 3.7. THE INFINITUDE OF PRIMES It has long been known that there are infinitely many primes. This means that whenever P I . P2 Pn are the n smallest primes, we know there is a larger prime not listed. We will prove this fact using a proof given by Euclid in his famous mathematics text, The Elements. •
THEOREM 3
.
.
.
•
There are infinitely many primes.
Proof: We will prove this theorem using a proofby contradiction. We assume that there are only finitely many primes, P I . P2 Pn . Let •
.
.
.
•
Q = PI P2 ' " Pn + 1 . By the Fundamental Theorem of Arithmetic, Q is prime or else it can be written as the product of two or more primes. However, none of the primes Pj divides Q, for if Pj I Q, then Pj divides Q P I P2 . . . Pn = 1 . Hence, there is a prime not in the list PI . P2 Pn . This prime is either Q . if it is prime, or a prime factor of Q . This is a contradiction because we assumed that we have listed all the primes. Consequently, there are infinitely many primes. (Note that in this proof we do not state that Q is prime! Furthermore, in this proof, we have given a nonconstructive existence proof that given any n primes, there is a prime not in this list. For this proof to be constructive, we would have had to explicitly give a prime not in our original list of <J n primes.) -
•
.
.
.
•
Because there are infinitely many primes, given any positive integer there are primes greater than this integer. There is an ongoing quest to discover larger and larger prime numbers; for almost all the last 300 years, the largest prime known has been an integer of the special form 2P I , where P is also prime. Such primes are called Mersenne primes, after the French monk Marin Mersenne, who studied them in the seventeenth century. The reason that the largest known prime has usually been a Mersenne prime is that there is an extremely efficient test, known as the Lucas-Lehmer test, for determining whether 2P 1 is prime. Furthermore, it is not currently possible to test numbers not of this or certain other special forms anywhere near as quickly to determine whether they are prime. -
-
EXAMPLE 5
The numbers 2 2 - 1 = 3, 23 1 = 7, and 2 5 1 = 3 1 are Mersenne primes, while 2 1 1 2047 is not a Mersenne prime because 2047 = 23 · 89. -
-
-
1 =
....
3.5 Primes and Greatest Common Divisors
3-4 7
213
Progress in finding Mersenne primes has been steady since computers were invented. As of early 2006, 43 Mersenne primes were known, with 12 found since 1 990. The largest Mersenne prime known (again as of early 2006) is 23 0 , 402 , 45 7 1 , a number with over nine million dig its, which was shown to be prime in December 2005 . A communal effort, the Great Internet Mersenne Prime Search (GIMPS), is devoted to the search for new Mersenne primes. By the way, even the search for Mersenne primes has practical implications. One quality control test for supercomputers has been to replicate the Lucas-Lehmer test that establishes the primality of a large Mersenne prime. -
THE DISTRIBUTION OF PRIMES Theorem 3 tells us that there are infinitely many primes. However, how many primes are less than a positive number x ? This question interested mathe maticians for many years; in the late eighteenth century, mathematicians produced large tables of prime numbers to gather evidence concerning the distribution of primes. Using this evidence, the great mathematicians of the day, including Gauss and Legendre, conjectured, but did not prove, Theorem 4. THEOREM
4
unkS
�
unks
�
THE PRIME NUMBER THEOREM The ratio of the number of primes not exceeding x and x / In x approache s 1 as x grows without bound. (Here In x is t4e natural logarithm of x .)
The Prime Number Theorem was first proved in 1 896 by the French mathematician Jacques Hadamard and the Belgian mathematician Charles-Jean-Gustave-Nicholas de la Vallee-Poussin using the theory of complex variables. Although proofs not using complex variables have been found, all known proofs of the Prime Number Theorem are quite complicated. We can use the Prime Number Theorem to estimate the odds that a randomly chosen number is prime. The Prime Number Theorem tells us that the number of primes not exceeding x can be approximated by x / In x . Consequently, the odds that a randomly selected positive integer less than n is prime are approximately (n / In n )/n = 1 / In n . Sometimes we need to find a prime with a particular number of digits. We would like an estimate of how many integers with a particular number of digits we need to select before we encounter a prime. Using the Prime Number Theorem and calculus, it can be shown that the probability that an integer n is prime is also approximately 1 / In n . For example, the odds that an integer near 1 0 1000 is prime are approximately 1 / In 1 0 1 000 , which is approximately 1 /2300. (Of course, by choosing only odd numbers, we double our chances of finding a prime.)
MARIN MERSENNE ( 1 5 88- 1 648) Mersenne was born in Maine, France, into a family of laborers and attended the College of Mans and the Jesuit College at La Fleche. He continued his education at the Sor bonne, studying theology from 1 609 to 1 6 1 1 . He joined the religious order of the Minims in 1 6 1 1 , a group whose name comes from the word minimi (the members of this group were extremely humble; they consid ered themselves the least of a}1 religious orders). Besides prayer, the members of this group devoted their energy to scholarship and study. In 1 6 1 2 he became a priest at the Place Royale in Paris; between 1 6 1 4 and 1 6 1 8 he taught philosophy at the Minim Convent at Nevers. He returned to Paris in 1 6 1 9, where his cell in the Minims de l' Annociade became a place for meetings of French sCientists, philosophers, and mathe maticians, including Fermat and Pascal. Mersenne corresponded extensively with scholars throughout Europe, serving as a clearinghouse for mathematical and scientific knowledge, a function later served by mathematical journals (and today also by the Internet). Mersenne wrote books covering mechanics, mathematical physics, mathematics, music, and acoustics. He studied prime numbers and tried unsuccessfully to construct a formula representing all primes. In 1 644 Mersenne claimed that 2P - 1 is prime for p = 2, 3 , 5, 7, 1 3 , 1 7, 1 9, 3 1 , 67, 1 27, 257 but is composite for all other primes less than 257. It took over 300 years to determine that Mersenne's claim was wrong five times. Specifically, 2P - 1 is not prime for p = 67 and p = 257 but is prime for p = 6 1 , P = 87, and p = 1 07. It is also noteworthy that Mersenne defended two of the most famous men of his time, Descartes and Galileo, from religious critics. He also helped expose alchemists and astrologers as frauds.
214
3 1 The Fundamentals: Algorithms, the Integers, and Matrices
3-48
U sing trial division with Theorem 2 gives procedures for factoring and for primality testing. However, these procedures are not efficient algorithms; many much more practical and efficient algorithms for these tasks have been developed. Factoring and primality testing have become important in the applications of number theory to cryptography. This has led to a great interest in developing efficient algorithms for both tasks. Clever procedures have been devised in the last 30 years for efficiently generating large primes. However, even though powerful new factor ization methods have been developed in the same time frame, factoring large numbers remains extraordinarily more time consuming. Nevertheless, the challenge of factoring large numbers interests many people. There is a communal effort on the Internet to factor large numbers, especially those of the special form k n ± 1 , where k is a small positive integer and n is a large positive integer (such numbers are called Cunningham numbers). At any given time, there is a list of the "Ten Most Wanted" large numbers of this type awaiting factorization.
Conj ectures and Open Problems About Primes Number theory is noted as a subject for which it is easy to formulate conjectures, some of which are difficult to prove and others that have remained open problems for many years. We will describe some conjectures in number theory and discuss their status in Examples 6-9. EXAMPLE 6
Extra � Examples IIiiiiI1
It would be useful to have a function f(n) such that f(n) is prime for all positive integers n . If we had such a function, we could find large primes for use in cryptography and other applications. Looking for such a function, we might check out different polynomial functions, as some mathematicians did several hundred years ago. After a lot of computation we may encounter the polynomial f(n) = n 2 - n + 4 1 . This polynomial has the interesting property that f(n) is prime for all positive integers n not exceeding 40 . [We have f( l ) = 4 1 , f(2) = 4 3, f(3) = 47, f(4) = 53, and so on.] This can lead us to the conjecture that f(n) is prime for all positive integers n . Can we settle this conjecture?
Solution: Perhaps not surprisingly, this conjecture turns out to be false; we do not have to look far to find a positive integer n for which f(n) is composite, because f( 4 1 ) = 4 1 2 - 4 1 + 4 1 = 4 1 2 . Because f(n) = n 2 - n + 4 1 is prime for all positive integers n with 1 ::: n ::: 40, we might be tempted to find a different polynomial with the property that f(n) is prime for all positive integers n . However, there is no such polynomial. It can be shown that for every polynomial f(n) with integer coefficients, there is a positive integer y such that f(y) is composite. (See ... Exercise 37 in the Supplementary Exercises at the end of the chapter.) Many famous problems about primes still await ultimate resolution by clever people. We describe a few of the most accessible and better known of these open problems in Examples 7-9. Number theory is noted for its wealth of easy-to-understand conjectures that resist attack by all but the most sophisticated techiques, or simply resist all attacks. We present these conjectures to show that many questions that seem relatively simple remain unsettled even in the twenty-first century. EXAMPLE 7
linkS �
Goldbach's Conjecture In 1 742, Christian Goldbach, in a letter to Leonhard Euler, conjec tured that every odd integer n, n > 5, is the sum of three primes. Euler replied that this conjecture is equivalent to the conjecture that every even integer n , n > 2, is the sum of two primes (see Exercise 39 in the Supplementary Exercises at the end of the chapter). The conjecture that every even integer n , n > 2, is the sum of two primes is now called Goldbach's conjecture. We can check this conjecture for small even numbers. For example, 4 = 2 + 2, 6 = 3 + 3, 8 = 5 + 3, 10 = 7 + 3, 12 = 7 + 5, and so on. Goldbach's conjecture was verified by hand calculations for numbers up to the millions prior to the advent of computers. With computers it can be checked for extremely large numbers. As of early 2006, the conjecture has been checked for all positive even integers up to 2 . 1 0 1 7 •
3-49
3.5 Primes and Greatest Common Divisors
215
Although no proof of Goldbach's conjecture has been found, most mathematicians believe it is true. Several theorems have been proved, using complicated methods from analytic number theory far beyond the scope of this book, establishing results weaker than Goldbach's conjecture. Among these are the result that every even positive integer greater than 2 is the sum of at most six primes (proved in 1 995 by O. Raman!) and that every sufficiently large positive integer is the sum of a prime and a number that is either prime or the product of two primes (proved in 1 966 by 1. R. Chen). Perhaps Goldbach's conjecture will be settled in the not too distant � future. EXAMPLE S
linkS
�
EXAMPLE 9
unkS
�
There are many conjectures asserting that there are infinitely many primes of certain special forms. A conjecture of this sort is the conjecture that there are infinitely many primes of the form n 2 + 1 , where n is a positive integer. For example, 5 22 + 1 , 1 7 = 42 + 1 , 37 = 62 + 1 , and so on. The best result currently known is that there are infinitely many positive integers n such that n 2 + 1 is prime or the product of at most two primes (proved by Henryk Iwaniec in 1 973 using advanced techniques from analytic number theory, far beyond the scope of this book). � =
Twin primes are primes that differ by 2, such as 3 and 5, 5 and 7, 1 1 and 1 3, 1 7 and 1 9, and 4967 and 4969. The twin prime conjecture asserts that there are infinitely many twin primes. The strongest result proved concerning twin primes is that there are infinitely many pairs p and p + 2, where p is prime and p + 2 is prime or the product of two primes (proved by 1. R. Chen in 1 966). The world's record for twin primes, as of early 2006, � consists of the numbers 1 6, 869, 987, 339, 975 . 2 1 7 1 , 960 ± 1 numbers with 5 1 , 779 digits. The Twin Prime Conjecture
Greatest Common Divisors and Least Common Multiples The largest integer that . divides both of two integers is called the greatest common divisor of these integers. DEFINITION 2
Let a and b be integers, not both zero. The largest integer d such that d I a and d I b is called the greatest common divisor of a and b . The greatest common divisor of a and b is denoted by gcd(a b). ,
The greatest common divisor of two integers, not both zero, exists because the set of common divisors of these integers is finite. One way to find the greatest common divisor of two integers is to find all the positive common divisors of both integers and then take the largest divisor. This is done in Examples 1 0 and 1 1 . Later, a more efficient method of finding greatest common divisors will be given. EXAMPLE 10
What is the greatest common divisor of 24 and 36?
Solution: The positive common divisors of 24 and 36 are 1 , 2, 3, 4, 6, and 12. Hence, � gcd(24, 36) = 1 2 . unkS
�
CHRISTIAN GOLDBACH ( 1 690- 1 764) Christian Goldbach was born in Konigsberg, Prussia, the city noted for its famous bridge problem (which will be studied in Section 9.5). He became professor of mathematics at the Academy in St. Petersburg in 1 725. In 1 728 Goldbach went to Moscow to tutor the son ofthe Tsar. He entered the world of politics when, in 1 742, he became a staff member in the Russian Ministry of Foreign Affairs. Goldbach is best known for his correspondence with eminent mathematicians, including Euler and Bernoulli, for his famous conjectures in number theory, and for several contributions to analysis.
216
3 / The Fundamentals: Algorithms, the Integers, and Matrices
EXAMPLE 11
3-50
What is the greatest common divisor of 1 7 and 22? Solution: The integers 1 7 and 22 have no positive common divisors other than 1 , so that ... gcd( 1 7, 22) = 1 . Because it is often important to specify that two integers have no common positive divisor other than 1 , we have Definition 3 .
DEFINITION 3
EXAMPLE 12
The integers a and b are relatively prime if their greatest common divisor is 1 . By Example 1 1 it follows that the integers 1 7 and 22 are relatively prime, because ... gcd( 1 7, 22) = 1 . Because we often need to specify that no two integers in a set of integers have a common positive divisor greater than 1 , we make Definition 4.
DEFINITION 4
EXAMPLE 13
The integers ai , a2 , . . . , an are pairwise relatively prime if gcd(ai > aj ) = 1 whenever 1 � i < j � n. Determine whether the integers 1 0, 1 7, and 21 are pairwise relatively prime and whether the integers 1 0, 1 9, and 24 are pairwise relatively prime.
Solution: Because gcd( 1 0, 1 7) = 1 , gcd( 1 0, 2 1 ) = 1 , and gcd( 1 7 , 2 1 ) = 1 , we conclude that 1 0, 1 7, and 2 1 are pairwise relatively prime. Because gcd( 1 0, 24) = 2 > 1 , we see that 1 0, 1 9, and 24 are not pairwise relatively prime . ... Another way to find the greatest common divisor of two integers is to use the prime factor izations of these integers. Suppose that the prime factorizations of the integers a and b, neither equal to zero, are
a .= P aI , P2a2 . . . Pnan , b = P b,I P2b2 . . . Pnbn ' where each exponent is a nonnegative integer, and where all primes occurring in the prime factorization of either a or b are included in both factorizations, with zero exponents if necessary. Then gcd(a , b) is given by min(a " b
min(a2
, b2) ,) min(an , bn ) gcd(a , b) PI P2 ' ' ' Pn where min(x , y) represents the minimum of the two numbers x and y. To show that this formula for gcd(a , b) is valid, we must show that the integer on the right-hand side divides both a and b, and that no larger integer also does. This integer does divide both a and b, because the power of each prime in the factorization does not exceed the power of this prime in either the factorization of a or that of b. Further, no larger integer can divide both a and b, because the exponents of the primes in this factorization cannot be increased, and no other primes can be included. '
EXAMPLE 14
Because the prime factorizations of 1 20 and 500 are 1 20 = 2 3 . 3 . 5 and 500 = 22 . 5 3 , the greatest common divisor is
Prime factorizations can also be used to find the least common multiple of two integers.
3 .5 Primes and Greatest Common Divisors
3-5 /
217
The least common multiple of the positive integers a and b is the smallest positive integer that is divisible by both a and b. The least common multiple of a and b is denoted by lcm(a , b).
DEFINITION 5
The least common multiple exists because the set of integers divisible by both a and b is nonempty, and every nonempty set of positive integers has a least element (by the well-ordering property, which will be discussed in Section 4.2). Suppose that the prime factorizations of a and b are as before. Then the least common multiple of a and b is given by Icm(a , b) =
max(a \ , btl
PI
max(a2 , b2)
P2
. . . Pnmax(an , bn )
where max(x , y) denotes the maximum ofthe two numbers x and y. This formula is valid because a common multiple of a and b has at least max(ai , bi ) factors of Pi in its prime factorization, and the least common multiple has no other prime factors besides those in a and b. EXAMPLE 1 5
What is the least common multiple of 23 3 5 7 2 and 24 3 3 ?
Solution: We have
Theorem 5 gives the relationship between the greatest common divisor and least common multiple oftwo integers. It can be proved using the formulae we have derived for these quantities. The proof of this theorem is left as an exercise for the reader.
Let a and b be positive integers. Then
THEOREM 5
ab
=
gcd(a , b) · lcm(a , b).
Exercises 1. Determine whether each of these integers is prime. a) b) c) d) 1) e) 2.
2171 2997 143 111 Determine whether each of these integers is prime. 19 27101 93107 1) 113 Find the prime factorization of each of these integers. 881001 126 729 1111 1) 909,090
a) c) e) 3.
b) d)
a) d)
b) e)
c)
4. Find the prime factorization of each of these integers.
5. *6.
39143
81 289
101 899 Find the prime factorization of 1O!. How many zeros are there at the end of 100!? a) d)
b) e)
c)
1)
*7. Show that log2 8.
*9. 10. 11. 12.
3
is an irrational number. Recall that an ir rational number is a real number x that cannot be written as the ratio of two integers. Prove that for every positive integer there are consecu tive composite integers. Consider the consecutive integers starting with + + Prove or disprove that there are three consecutive odd positive integers that are primes, that is, odd primes ofthe form p, p and p + Which positive integers less than are relatively prime to Which positive integers less than are relatively prime to Determine whether the integers in each of these sets are pairwise relatively prime.
12? 30?
a) c)
+ 2,
21, 41, 34, 49,55 64 25,
[(n HinI)!t: 2.n,] 4. 12 30 b) d)
nn
17,14, 18,17, 8519, 23
218
3 / The Fundamentals: Algorithms, the Integers, and Matrices
13. Determine whether the integers in each of these sets are
pairwise relatively prime. a) 1 1 , 1 5 , 1 9 b ) 14, 1 5 , 2 1 c) 12, 1 7, 3 1 , 37 d) 7, 8, 9, 1 1 14. We call a positive integer perfect if it equals the sum of its positive divisors other than itself. a) Show that 6 and 28 are perfect. b) Show that 2P - I (2P - 1 ) is a perfect number when 2P - 1 is prime. 15. Show that if 2n - 1 is prime, then is prime. Use the identity 2ab - 1 (2a - 1 ) . (2a (b - l ) + 2a (b - 2) + . . . + 2a + 1 ).] 16. Determine whether each of these integers is prime, veri fYing some of Mersenne's claims. a) 27 - I b) 29 - 1 c) 2 1 1 - I d) 2 1 3 - 1 The value of the Euler q,-function at the positive integer is defined to be the number of positive integers less than or equal to that are relatively prime to n . ¢ is the Greek letter phi.) 1 7. Find a) ¢(4). b) ¢( I O). c) ¢(1 3). 18. Show that n is prime if and only if ¢(n) = n - I . 19. What is the value of ¢(p k ) when P is prime and k is a positive integer? 20. What are the greatest common divisors of these pairs of integers? a) 2 2 . 33 . 5 5 , 2 5 . 33 . 5 2 b) 2 · 3 · 5 · 7 · 1 1 . 1 3 , 2 1 1 . 39 1 1 . 1 7 1 4 d) 22 . 7, 53 · 1 3 c) 1 7, 1 7 1 7 1) 2 · 3 · 5 · 7, 2 · 3 . 5 · 7 e) 0, 5 2 1 . What are the greatest common divisors of these pairs of integers? a) 37 · 53 . 73 , 2 1 1 . 3 5 . 59 b) 1 1 · 13 . 1 7, 29 . 37 . 5 5 . 73 c) 233 1 , 23 1 7 d) 4 1 · 43 · 53, 4 1 · 43 · 53 e) 3 1 3 . 5 1 7 , 2 12 . 7 2 1 1) 1 1 1 1 , 0 22. What is the least common multiple of each pair in Exercise 20? 23. What is the least common multiple of each pair in Exercise 21? 24. Find gcd( l OOO, 625) and lcm( 1 000, 625) and verifY that gcd( 1 000, 625) · lcm( 1 000, 625) 1 000 · 625 . 25. Find gcd(92928, 123552) and lcm(92928, 123552), and verifY that gcd(92928, 1 23552) . lcm(92928, 1 23 552) 92928 . 1 23552. First find the prime factorizations of 92928 and 1 23552.] 26. lEthe product of two integers is 273 8 5 2 7 1 1 and their great est common divisor is 23 345, what is their least common multiple? 27. Show that if a and b are positive integers, then ab = gcd(a , b) · lcm(a , b). Use the prime factorizations
n
=
n
(Note:
[Hint: n
•
=
[Hint:
[Hint:
=
3-52
of a and b and the formulae for gcd(a , b) and lcm(a , b) in terms of these factorizations.] 28. Find the smallest positive integer with exactly different factors when n is a) 3 . b) 4. c) 5 . d) 6. e) 1 0. 29. Can you find a formula o r rule for the nth term o f a se quence related to the prime numbers or prime factoriza tions so that the initial terms of the sequence have these values? a) 0, 1 , 1 , 0, 1 , 0, 1 , 0, 0, 0, 1 , 0, 1 , . . . b) 1 , 2, 3 , 2, 5 , 2, 7, 2, 3 , 2, 1 1 , 2, 1 3 , 2, . . . c) 1 , 2, 2, 3 , 2, 4, 2, 4, 3 , 4, 2, 6, 2, 4, . . . d) 1 , 1 , 1 , 0, 1 , 1 , 1 , 0, 0, I , 1 , 0, 1 , 1 , . . . e) 1 , 2, 3 , 3 , 5 , 5 , 7, 7, 7, 7, 1 1 , I I , 1 3 , 1 3 , . . . 1) 1 , 2, 6, 30, 2 1 0, 23 1 0, 30030, 5 1 05 1 0, 9699690, 223092870, . . . 30. Can you find a formula or rule for the nth term of a se quence related to the prime numbers or prime factoriza tions so that the initial terms of the sequence have these values? a) 2 , 2, 3 , 5 , 5 , 7, 7, 1 1 , 1 1 , l l , l l , 1 3 , 1 3 , . . . b) 0, 1 , 2, 2, 3 , 3 , 4, 4, 4, 4, 5 , 5 , 6, 6, . . . c) 1 , 0, 0, 1 , 0, 1 , 0, I , 1 , 0, 1 , 0, I , . . . d) 1 , - 1 , 0, - 1 , 1 , - 1 , 0, 0, 1 , - 1 , 0, - I , I , 1 , . . . e) I , I , 1 , 1 , 1 , 0, 1 , 1 , 1 , 0, I , 0, 1 , 0, 0, . . . 1) 4, 9, 25 , 49, 1 2 1 , 1 69, 289, 36 1 , 529, 84 1 , 96 1 , 1 369, . . . 3 1 . Prove that the product of any three consecutive integers is divisible by 6. 32. Show that if a, b, and m are integers such that m :::: 2 and a == b (mod m), then gcd(a , m ) gcd(b, m). *33. Prove or disprove that - 79n + 1 60 1 is prime when ever is a positive integer. 34. Prove or disprove that P I P2 . . . Pn + I is prime for every positive integer where P I , P2 , . . . , Pn are the smallest prime numbers. 35. Adapt the proof in the text that there are infinitely many primes to prove that there are infinitely many primes of the form 4k + 3 , where k is a nonnegative integer. Suppose that there are only finitely many such primes q l , q2 , · · · , qn , and considerthe number4q l q2 " · qn - I .] *36. Prove that the set of positive rational numbers is countable by setting up a function that assigns to a rational number P / q with gcd(p, q ) 1 the base I I number formed from the decimal representation of P followed by the base I I digit A, which corresponds to the decimal number 1 0, followed by the decimal representation of q . *37. Prove that the set of positive rational numbers is countable by showing that the function K is a one to-one correspondence between the set of positive rational numbers and the set of positive integers , l 2 2a l P 2a2 • . . P 2a, q 2b l - 1 q22b - 1 . . qt2b - 1 , I'f K( m / ) = P I s 2 where gcd(m , = I and the prime-power factorizations o f m and are m P aI l P2a2 q bl l q 2b2 . . . qtb, . Psa, and
n
I,
-I,
=
n2
n
n,
n
[Hint:
=
n n) n .
•
=
. • •
n
=
3-53
3.6 Integers and Algorithms
3.6
219
Integers and Algorithms Introduction As mentioned in Section 3 . 1 , the term algorithm originally referred to procedures for performing arithmetic operations using the decimal representations of integers. These algorithms, adapted for use with binary representations, are the basis for computer arithmetic. They provide good illustrations of the concept of an algorithm and the complexity of algorithms. For these reasons, they will be discussed in this section. There are many important algorithms involving integers besides those used in arithmetic, including the Euclidean algorithm, which is one of the most useful algorithms, and perhaps the oldest algorithm, in mathematics. We will also describe an algorithm for finding the base b expansion of a positive integer for any base b and for modular exponentiation, an algorithm important in cryptography.
Representations of Integers In everyday life we use decimal notation to express integers. For example, 965 is used to denote 9 · 1 02 + 6 · 1 0 + 5 . However, it is often convenient to use bases other than 1 0. In particular, computers usually use binary notation (with 2 as the base) when carrying out arithmetic, and octal (base 8) or hexadecimal (base 1 6) notation when expressing characters, such as letters or digits. In fact, we can use any positive integer greater than I as the base when expressing integers. This is stated in Theorem 1 . THEOREM l
Let b be a positive integer greater than I . Then if n is a positive integer, it can be expressed uniquely in the form k k n = akb + ak_ l b - ' + . . . + a l b + ao ,
where k is a nonnegative integer, ao , a " . . . , ak are nonnegative integers less than b, and ak ::j:. o.
A proof of this theorem can be constructed using mathematical induction, a proof method that is discussed in Section 4. 1 . It can also be found in [Ro05] . The representation of n given in Theorem I is called the base b expansion of n . The base b expansion of n is denoted by (akak - ' . . . a , aoh. For instance, (245) 8 represents 2 . 8 2 + 4 · 8 + 5 = 1 65 . Choosing 2 as the base gives binary expansions of integers. In binary notation each digit is either a 0 or a 1 . In other words, the binary expansion of an integer is just a bit string. Binary expansions (and related expansions that are variants of binary expansions) are used by computers to represent and do arithmetic with integers. BINARY EXPANSIONS
EXAMPLE 1
What is the decimal expansion of the integer that has ( l 0 1 0 1 1 1 1 1 )2 as its binary expansion?
Solution: We have ( l 0 1 0 1 1 1 1 l h = I . 2 8 + 0 . 2 7 + 1 . 2 6 + 0 . 2 5 + 1 . 24 + 1 . 2 3 + 1 . 22 + 1 . 2 ' + 1 . 20 = 35 1 . HEXADECIMAL EXPANSIONS Sixteen is another base used in computer science. The base 1 6 expansion of an integer is called its hexadecimal expansion. Sixteen different digits are
220
3 / The Fundamentals: Algorithms, the Integers, and Matrices
3-54
required for such expansions. Usually, the hexadecimal digits used are 0, 1 , 2, 3, 4, 5, 6, 7, 8, 9, A, B, C, D, E, and F, where the letters A through F represent the digits corresponding to the numbers 1 0 through 1 5 (in qecimal notation). EXAMPLE 2
What is the decimal expansion of the hexadecimal expansion of (2AEOB) ' 6 ?
Solution: We have (2AEOB) ' 6 = 2 . 1 64 + 1 0 . 1 6 3 + 1 4 . 1 62 + 0 · 1 6 + 1 1 = ( 1 75627) 10 . Each hexadecimal digit can be represented using four bits. For instance, we see that ( 1 1 1 0 0 1 0 1 )2 = (E5) ' 6 because ( 1 1 1 0h = (E)' 6 and (0 1 0 1 h = (5) '6. Bytes, which are bit strings of length eight, can be represented by two hexadecimal digits. BA SE CONVERSION
We will now describe an algorithm for constructing the base b expan sion of an integer n. First, divide n by b to obtain a quotient and remainder, that is,
n = bqo + ao ,
0 ::; ao
<
b.
The remainder, ao , is the rightmost digit in the base b expansion of n. Next, divide qo by b to obtain
We see that a, is the second digit from the right in the base b expansion of n. Continue this process, successively dividing the quotients by b, obtaining additional base b digits as the remainders. This process terminates when we obtain a quotient equal to zero. EXAMPLE 3
Extra � ExaDlples �
Find the base 8, or octal, expansion of ( 1 2345) 10 .
Solution: First, divide l 2345 by 8 to obtain 1 2345 = 8 . 1 543 + 1 . Successively dividing quotients by 8 gives 1 543 1 92 24 3
= = = =
8 8 8 8
. 1 92 + 7, · 24 + 0, · 3 + 0, · 0 + 3.
Because the remainders are the digits of the base 8 expansion of 1 2345, it follows that ( 1 2345) 1 0 = (3 007 1 ) 8 . EXAMPLE 4
Find the hexadecimal expansion of ( 1 77 1 30) 1 0 .
Solution: First divide 1 77 1 30 by 1 6 to obtain 1 77 1 30 = 1 6 · 1 1 070 + 1 0.
3.6 Integers and Algorithms
3-55
221
Successively dividing quotients by 1 6 gives 1 1 070 69 1 43 2
= = = =
1 6 · 69 1 + 14, 1 6 · 43 + 3 , 16 · 2 + 1 1 , 1 6 · 0 + 2.
Because the remainders are the digits of the hexadecimal (base 1 6) expansion of ( 1 77 1 30) 10 , it follows that ( 1 77 1 30) 10 = (2B3EA) 16 . (Recall that the integers 1 0, 1 1 , and 1 4 correspond to the hexadecimal digits A, B, and E, .... respectively.) EXAMPLE 5
Find the binary expansion of (24 1 ) 10 .
Solution: First divide 24 1 by 2 to obtain 24 1 = 2 . 1 20 + 1 . Successively dividing quotients by 2 gives 1 20 = 2 . 60 + 0, 60 = 2 · 30 + 0, 30 = 2 · 1 5 + 0, 15 = 2 · 7 + 1, 7 = 2 · 3 + 1, 3 = 2 · 1 + 1, 1 = 2 · 0 + 1. Because the remainders are the digits of the binary (base 2) expansion of (24 1 ) 1 O , it follows that (24 1 ) 10 = ( 1 1 1 1 000 1 )2 . The pseudocode given in Algorithm 1 finds the base b expansion (a k - l . . . al aO )b of the integer n.
ALGORITHM 1 Constructing Base
procedure
q := n k := O
base b expansion(n:
0 ak := q b q : = Lq/b k : = k + 1J b
b
Expansions.
positive integer)
while q =1= begin
mod
end {the base
expansion of n is
(ak- l . . . alao)b}
222
3-56
3 / The Fundamentals: Algorithms, the Integers, and Matrices
TABLE 1 Hexadecimal, Octal, and Binary Representation of the Integers 0 through 15. Decimal
0
1
2
3
4
5
6
7
8
9
10
II
12
13
14
15
Hexadecimal
0
1
2
3
4
5
6
7
8
9
A
B
C
D
E
F
Octal
0
1
2
3
4
5
6
7
10
11
12
13
14
15
16
17
Binary
0
1
10
11
1 00
101
1 10
III
1 000
1 00 1
1010
101 1
1 1 00
1 101
1 1 10
1111
In Algorithm 1 , q represents the quotient obtained by successive divisions by b, starting with q = n . The digits in the base b expansion are the remainders of these divisions and are given by q mod b. The algorithm terminates when a quotient q = 0 is reached. Remark: Note that Algorithm 1 can be thought of as a greedy algorithm.
Conversion between binary and hexadecimal expansions is extremely easy because each hexadecimal digit corresponds to a block of four binary digits, with this correspondence shown in Table 1 without initial Os shown. (We leave it as Exercises 1 1 and 1 2 at the end ofthis section to show that this is the case.) This conversion is illustrated in Example 6. EXAMPLE 6
Find the hexadecimal expansion of ( 1 1 1 1 1 0 1 0 1 1 1 1 00)z and the binary expansion of (A8D) 16 .
Solution: To convert ( 1 1 1 1 1 0 1 0 1 1 1 1 00h into hexadecimal notation we group the binary digits into blocks of four, adding initial zeros at the start of the leftmost block if necessary. These blocks are 00 1 1 , 1 1 1 0, 1 0 1 1 , and 1 1 00, which correspond to the hexadecimal digits 3, E, B, and C, respectively. Consequently, ( 1 1 1 1 1 0 1 0 1 1 1 1 00)2 = (3EBC) 16 ' To convert (A8D) 1 6 into binary notation, we replace each hexadecimal digit by a block of four binary digits. These blocks are 1 0 1 0, 1 000, 1 1 0 1 . Consequently, (A8D) 1 6 = .... ( 1 0 1 0 1 000 1 1 0 1 )z .
Algorithms for Integer Operations The algorithms for performing operations with integers using their binary expansions are ex tremely important in computer arithmetic. We will describe algorithms for the addition and the multiplication of two integers expressed in binary notation. We will also analyze the compu tational complexity of these algorithms, in terms of the actual number of bit operations used. Throughout this discussion, suppose that the binary expansions of a and b are
a
=
(an - I an - 2 . . . a l aO )2 , b
=
(bn - I bn - 2 . . . b l bo)2 ,
so that a and b each have n bits (putting bits equal to 0 at the beginning of one of these expansions if necessary). We will measure the complexity of algorithms for integer arithmetic in terms of the number of bits in these numbers. Consider the problem of adding two integers in binary notation. A procedure to perform addition can be based on the usual method for adding numbers with pencil and paper. This method proceeds by adding pairs of binary digits together with carries, when they occur, to compute the sum of two integers. This procedure will now be specified in detail. . To add a and b, first add their rightmost bits. This gives
ao + bo
=
Co . 2 + So,
where So is the rightmost bit in the binary expansion of a + b and Co is the carry, which is either o or 1 . Then add the next pair of bits and the carry,
a l + b l + Co
=
C I . 2 + Sl ,
3.6 Integers and Algorithms
3-5 7
223
where S I is the next bit (from the right) in the binary expansion of a + b, and C I is the carry. Continue this process, adding the corresponding bits in the two binary expansions and the carry, to determine the next bit from the right in the binary expansion of a + b. At the last stage, add an - I , bn - I , and Cn - 2 to obtain Cn - I · 2 + Sn- I . The leading bit of the sum is Sn = Cn - I . This procedure produces the binary expansion of the sum, namely, a + b = (Sn Sn - I Sn - 2 S I SO )z. .
EXAMPLE 7
•
•
Add a = ( 1 1 1 0h and b = ( 1 0 1 1 )2 .
Solution: Following the procedure specified in the algorithm, first note that
ao + bo = 0 + 1 = 0 . 2 + 1 , so that Co = 0 and So = 1 . Then, because
al + bl + Co = 1 + 1 + 0 = 1 . 2 + 0, it follows that C I = 1 and S I = O . Continuing, 1 1 1
a2 + b2 + C I = 1 + 0 + 1 = 1 . 2 + 0,
1 1 1 0 101 1
so that C2 = 1 and S2 = O. Finally, because
I 1 00 I
a3 + h3 + C2 = 1 + 1 + 1 = I . 2 + I ,
FIGURE 1
Adding (l l lOh and (lOl lh-
it follows that C3 = 1 and S3 = 1 . This means that S4 = C3 = 1 . Therefore, S = a + b = .... ( 1 1 00 1 h . This addition is displayed in Figure l . The algorithm for addition can be described using pseudocode as follows.
ALGORITHM 2 Addition of Integers.
add(a, b: a b (an- Ian -2 . . . alaO)2 (bn - Ibn-2 blbo)2 , c := 0 j := 0 1 d : = L(aj + bj + c)/2J Sj : = aj + bj + C 2d c := d Sn := c {the binary expansion of the sum is (Sn Sn- l . , . SO)2 } procedure
positive integers) {the binary expansions of and are and respectively} • • •
for begin
to
n
-
-
end
Next, the number of additions of bits used by Algorithm 2 will be analyzed. EXAMPLE 8
How many additions of bits are required to use Algorithm 2 to add two integers with n bits (or less) in their binary representations?
224
3 / The Fundamentals: Algorithms, the Integers, and Matrices
3-58
Solution: Two integers are added by successively adding pairs of bits and, when it occurs, a carry. Adding each pair of bits and the carry requires three or fewer additions of bits. Thus, the total number of additions of bits used is less than three times the number of bits in the expansion. Hence, the number of additions of bits used by Algorithm 2 to add two n-bit integers .... is O (n). Next, consider the multiplication of two n-bit integers a and b. The conventional algorithm (used when multiplying with pencil and paper) works as follows. Using the distributive law, we see that
ab
a(b0 2° + b l 2 1 + . . . + bn _ 1 2n - l ) n I = a(b0 20 ) + a(h 2 1 ) + . . . + a(bn _ 1 2 - ).
=
We can compute ab using this equation. We first note that abj = a if bj = I and abj = 0 if bj = O. Each time we multiply a term by 2, we shift its binary expansion one place to the left and add a zero at the tail end of the expansion. Consequently, we can obtain (abj )2j by shifting the binary expansion of abj j places to the left, adding j zero bits at the tail end of this binary expansion. Finally, we obtain ab by adding the n integers abj 2j , j = 0, I , 2, . . . , n - l. This procedure for multiplication can be described using the pseudocode shown as Algorithm 3 .
ALGORITHM 3 Multiplying Integers.
multiply(a, a (bn - 1 bn -2 b 1 boh. j := 0 1
b: positive integers)
procedure
(an - l an -2 . . . alao 2
{the binary expansions of and
and b are respectively}
for to n begin if then else end
shifted
• • •
-
bj = 1 Cj := a Cj : = 0 {co, Cl, . . . , C p := 0 n l j := 0 1 p := p + Cj p ab
j
places
are the partial products}
for {
to n
-
is the value of
}
Example 9 illustrates the use of this algorithm.
EXAMPLE 9
Find the product of a = ( l 1 O)z and b = ( l 0 1 )z .
Solution: First note that
abo · 2° = ( 1 1 O)z . I . 2° ab 1 . 2 1 = ( l 1 O)z . 0 . 2 1
= ( 1 1 O)z , = (0000)z ,
)
3.6 Integers and Algorithms
3-59 1 1 0 101 1 1 0 000 1 1 0 1 1 1 10 FIGURE 2
Multiplying (l lOh and (lOlhEXAMPLE 10
225
and
To find the product, add ( 1 1 0)2 , (OOOOh, and ( 1 1 000h. Carrying out these additions (using Algorithm 2, including initial zero bits when necessary) shows that a b = ( l I I l Oh . This mul .... tiplication is displayed in Figure 2. Next, we determine the number of additions of bits and shifts of bits used by Algorithm 3 to multiply two integers. How many additions of bits and shifts of bits are used to multiply a and b using Algorithm 3?
Solution: Algorithm 3 computes the products of a and b by adding the partial products Co , C\ , C2 , . . . , and Cn - I . When bj = 1 , we compute the partial product Cj by shifting the binary expansion of aj bits. When bj = 0, no shifts are required because Cj = O . Hence, to find all n of the integers ab j 2j , j = 0, 1 , . . . , n - 1 , requires at most 0+ I +2+···+n - I shifts. Hence, by Example 5 in Section 3 .2 the number of shifts required is O (n 2 ). To add the integers abj from j = 0 to j = n - 1 requires the addition of an n -bit integer, an (n + I )-bit integer, . . . , and a (2n)-bit integer. We know from Example 8 that each of these additions requires O (n) additions of bits. Consequently, a total of O (n 2 ) additions of bits are .... required for all n additions.
ALGORITHM 4 Computing div and mod.
procedure
q := 0 r := la l
division algorithm(a:
d r := r - d := + 1 a<0 r 0 r := d - r := + 1 ) =a d
while r begin
q end if begin q
end {q
integer,
d:
positive integer)
�
q
>
and
then
-(q
div
is the quotient,
r=a
mod
d
is the remainder}
Surprisingly, there are more efficient algorithms than the conventional algorithm for mul tiplying integers. One such algorithm, which uses O (n 1 . 5 8 5 ) bit operations to multiply n-bit numbers, will be described in Section 7.3. Given integers a and d, d > 0, we can find q = a div d and r = a mod d using Algorithm 4. In this algorithm, when a is positive we subtract d from a as many times as necessary until what
226
3-60
3 / The Fundamentals; Algoritluns, the Integers, and Matrices
is left is less than d . The number oftimes we perform this subtraction is the quotient and what is left over after all these subtractions is the remainder. Algorithm 4 also covers the case where a is negative. This algorithm finds the quotient q and remainder r when l a I is divided by d. Then, when a < 0 and r > 0, it uses these to find the quotient -(q + 1 ) and remainder d r when a is divided by d. We leave it to the reader (Exercise 57) to show that, assuming that a > d, this algorithm uses 0 (q log a ) bit operations. There are more efficient algorithms than Algorithm 4 for determining the quotient q = a div d and the remainder r = a mod d when a positive integer a is divided by a positive integer d (see [Kn98] for details). These algorithms require 0 (log a . log d) bit operations. If both of the binary expansions of a and d contain n or fewer bits, then we can replace log a . log d by n 2 . This means that we need O (n 2 ) bit operations to find the quotient and remainder when a is divided by d . -
Modular Exponentiation In cryptography it is important to be able to find bn mod m efficiently, where b, n, and m are large integers. It is impractical to first compute bn and then find its remainder when divided by m because b n will be a huge number. Instead, we can use an algorithm that employs the binary expansion of the exponent n , say n = (ak - I . . . al ao )z . Before we present this algorithm, we illustrate its basic idea. We will explain how to use the binary expansion of n to compute bn . First, note that
To compute b n , we find the values of b, b 2 , (b 2 )2 = b4 , (b4 )2 = b8 , • • • , b2k . Next, we multiply the terms b 2j in this list, where aj = 1 . This gives us b n • For example, to compute 3 1 1 we first note that 1 1 = ( 1 0 1 1 )2 , so that 3 1 1 = 383 2 3 1 • By successively squaring, we find that 3 2 = 9, 3 4 = 92 = 8 1 , and 38 = (8 1 )2 = 656 1 . Consequently, 3 1 1 = 383 2 3 1 = 656 1 9 3 = 1 77, 1 47. The algorithm successively finds b mod m, b 2 mod m, b4 mod m, . . . , b2k-' mod m and multiplies together those terms b2j mod m where aj = 1 , finding the remainder of the product when divided by m after each multiplication. Pseudocode for this algorithm is shown in Al gorithm 5 . Note that in Algorithm 5 we use the most efficient algorithm available to compute values of the mod function, not Algorithm 4. ·
ALGORITHM 5 Modular Exponentiation.
modular exponentiation(b; n = (ak- ]ak-2 . . . a]aoh, m; x ;= 1 power ;= b m i ;= 0 1 a = 1 x ;= (x · power) m poweri ;= (power · power) m {x equals bn m} procedure
integer,
positive integers)
for begin if
mod to k
-
then
mod
mod
end
mod
We illustrate how Algorithm 5 works in Example 1 1 .
·
3.6 Integers and Algorithms
3-61
EXAMPLE 11
227
Use Algorithm 5 to find 3 644 mod 645 .
Solution: Algorithm 5 initially sets x = 1 and power = 3 mod 645 = 3 . In the computation of 3 644 mod 645, this algorithm determines 3 2i mod 645 for j = 1 , 2, . . . , 9 by successively squaring and reducing modulo 645 . If aj = 1 (where aj is the bit in the jth position in the binary expansion of 644), it multiplies the current value of x by 3 2i mod 645 and reduces the result modulo 645 . Here are the steps used:
i i i i i i i i
ao = 1 : Because a) = 2 : Because a2 = 3 : Because a 3 = 0: Because
= 4: Because a4
= 5: Because a 5
a6 = 7: Because a 7
= 6: Because
mod 645
i = 8 : Because ag i = 9: Because a9
= 0, we have x = 1 and power = 3 2 mod 645 = 9 mod 645 = 9; = 0, we have x = 1 and power = 92 mod 645 = 8 1 mod 645 = 8 1 ; = 1 , we have x = 1 · 8 1 mod 645 = 8 1 and power = 8 1 2 mod 645 = 656 1 mod 645 = 1 1 1 ; = 0, we have x = 8 1 and power = 1 1 1 2 mod 645 = 1 2 , 32 1 mod 645 = 66; = 0, we have x = 81 and power = 662 mod 645 = 4356 mod 645 = 486; = 0, we have x = 81 and power = 4862 mod 645 = 236, 1 96 mod 645 = 1 26; = 0, we have x = 8 1 and power = 1 262 mod 645 = 1 5 , 876 mod 645 = 396; = 1, we find that x = (8 1 · 396) mod 645 = 47 1 and power = 3962 mod 645 = 1 5 6, 8 1 6 = 81; = 0, we have x = 47 1 and power = 8 1 2 mod 645 = 656 1 mod 645 = 1 1 1 ; = 1 , we find that x = (47 1 . 1 1 1 ) mod 645 = 36.
This shows that following the steps of Algorithm 5 produces the result 3 644 mod .... 645 = 36. Algorithm 5 is quite efficient; it uses 0 ((log m )2 log n) bit operations to find bn mod m (see Exercise 56).
The Euclidean Algorithm
unkS �
The method described in Section 3 . 5 for computing the greatest common divisor of two integers, using the prime factorizations of these integers, is inefficient. The reason is that it is time consuming to find prime factorizations. We will give a more efficient method of finding the greatest common divisor, called the Euclidean algorithm. This algorithm has been known since ancient times. It is named after the ancient Greek mathematician Euclid, who included a description of this algorithm in his book The Elements. Before describing the Euclidean algorithm, we will show how it is used to find gcd(9 1 , 287). First, divide 287, the larger of the two integers, by 9 1 , the smaller, to obtain 287 = 9 1 . 3 + 1 4 . Any divisor of 9 1 and 2 8 7 must also b e a divisor of 287 - 9 1 · 3 = 14. Also, any divisor of 9 1 and 1 4 must also be a divisor of 287 = 9 1 · 3 + 14. Hence, the greatest common divisor of 9 1 and 287 i s the same as the greatest common divisor of 9 1 and 1 4 . This means that the problem of finding gcd(9 1 , 287) has been reduced to the problem of finding gcd(9 1 , 1 4). Next, divide 91 by 1 4 to obtain 9 1 = 14 · 6 + 7.
228
3 / The Fundamentals: Algorithms, the Integers, and Matrices
3-62
Because any common divisor of9 1 and 1 4 also divides 9 1 - 1 4 . 6 = 7 and any common divisor of 1 4 and 7 divides 9 1 , it follows that gcd(9 1 , 1 4) = gcd( 1 4, 7). Continue by dividing 14 by 7, to obtain 1 4 = 7 · 2. Because 7 divides 14, it follows that gcd( 1 4, 7) = 7. Furthermore, because gcd(287, 9 1 ) = gcd(9 1 , 1 4) = gcd( 1 4, 7) = 7, the original problem has been solved. We now describe how the Euclidean algorithm works in generality. We will use successive divisions to reduce the problem of finding the greatest common divisor of two positive integers to the same problem with smaller integers, until one of the integers is zero. The Euclidean algorithm is based on the following result about greatest common divisors and the division algorithm. LEMMA 1
Let a = bq + r, where a , b, q , and r are integers. Then gcd(a , b) = gcd(b, r) .
Proof: If we can show that the common divisors of a and b are the same as the common divisors of b and r, we will have shown that gcd(a , b) = gcd(b, r), because both pairs must have the same greatest common divisor. So suppose that d divides both a and b. Then it follows that d also divides a - bq = r (from Theorem 1 of Section 3 .4). Hence, any common divisor of a and b is also a common divisor of b and r. Likewise, suppose that d divides both b and r . Then d also divides bq + r = a . Hence, any common divisor of b and r is also a common divisor of a and b.
ro rl
= rl ql + r2 = r2q2 + r3
0 ::::: r2 0 ::::: r3
<
rl , r2 ,
rn - 2 = rn - I qn - I + rn rn - I = rn qn '
0 ::::: rn
<
rn - I ,
<
Eventually a remainder of zero occurs in this sequence of successive divisions, because the sequence of remainders a = ro > r l > r2 > . . . :::: 0 cannot contain more than a terms. Fur thermore, it follows from Lemma 1 that gcd(a , b) = gcd(ro , rl ) = gcd(rl , r2) = . . . = gcd(rn - 2 , rn - d
unkS �
= gcd(rn _ l , rn ) = gcd(rn , 0) = rn . Hence, the greatest common divisor is the last nonzero remainder in the sequence of divisions.
The
EU CLID (C. 325-265 B.C.E. ) Euclid was the author ofthe most successful mathematics book ever written, which appeared in over different editions from ancient to modem times. Little is known about Euclid's life, other than that he taught at the famous academy at Alexandria in Egypt. Apparently, Euclid did not stress applications. When a student asked what he would get by learning geometry, Euclid explained that knowledge was worth acquiring for its own sake and told his servant to give the student a coin "because he must make a profit from what he learns."
Elements,
1000
3-63
3.6 Integers and Algorithms EXAMPLE 12
229
Find the greatest common divisor of 4 1 4 and 662 using the Euclidean algorithm.
Solution: Successive uses of the division algorithm give: 662 = 4 1 4 . 1 + 248 4 1 4 = 248 . 1 + 1 66 248 = 1 66 . 1 + 82 1 66 = 82 · 2 + 2 82 = 2 · 4 1 . Hence, gcd(4 1 4, 662) = 2, because 2 is the last nonzero remainder. The Euclidean algorithm is expressed in pseudocode in Algorithm 6.
ALGORITHM 6 The Euclidean Algorithm.
procedure gcd(a , b: positive integers) x := a y := b while y =1= 0 begin r : = x mod y x := y y := r end (gcd(a , b) is x }
In Algorithm 6 , the initial values o f x and y are a and b, respectively. At each stage o f the procedure, x is replaced by y, and y is replaced by x mod y, which is the remainder when x is divided by y. This process is repeated as long as y =1= O. The algorithm terminates when y = 0, and the value of x at that point, the last nonzero remainder in the procedure, is the greatest common divisor of a and b. We will study the time complexity of the Euclidean algorithm in Section 4.3, where we will show that the number of divisions required to find the greatest common divisor of a and b, where a ?: b, is O (log b).
Exercises 1. Convert these integers from decimal notation to binary
notation. b) 4532 a) 23 1 2. Convert these integers notation. b) 1 023 a) 32 1 3. Convert these integers notation. a) 1 1 c) 1 0 1 0 1 0 1 0 1
III
c) 97644
from decimal notation to binary c) 1 00632
from binary notation to decimal b) 1 0 0000 000 1 d) 1 1 0 1 00 1 000 1 0000
4. Convert these integers from binary notation to decimal
notation. a) 1 l O l l c) lOl l 1 1 1O
II
b) 1 0 1 0 1 1 0 1 0 1 d) 00 000 1 l l l l
III I I
5. Convert these integers from hexadecimal notation to bi
nary notation. b) 1 35AB a) 80E d) DEFACED c) ABBA 6. Convert (BADFACED) 1 6 from its hexadecimal expansion to its binary expansion.
230
7. Convert (ABCDEF) 1 6 from its hexadecimal expansion to
its binary expansion. 8. Convert each of these integers from binary notation to hexadecimal notation. a) 1 1 1 1 0 1 1 1 b) 1 0 1 0 1 0 1 0 1 0 1 0 c) 1 1 1 0 1 1 1 0 1 1 1 0 1 1 1 9. Convert ( 1 0 1 1 0 1 1 1 I O I l h from its binary expansion to
its hexadecimal expansion.
10. Convert ( 1 1 000 0 1 1 0 00 1 1 h from its binary expansion
to its hexadecimal expansion. 1 1 . Show that the hexadecimal expansion of a positive integer
12.
13. 14. 1 5. 16.
1 7.
1 8. 19. 20. 21. 22. 23.
3-64
3 / The Fundamentals: Algorithms, the Integers, and Matrices
can be obtained from its binary expansion by grouping to gether blocks of four binary digits, adding initial digits if necessary, and translating each block of four binary digits into a single hexadecimal digit. Show that the binary expansion of a positive integer can be obtained from its hexadecimal expansion by translat ing each hexadecimal digit into a block of four binary digits. Give a simple procedure for converting from the binary expansion of an integer to its octal expansion. Give a simple procedure for converting from the octal expansion of an integer to its binary expansion. Convert (734532 1 ) 8 to its binary expansion and ( 1 0 10 1 1 1 0 1 1 )2 to its octal expansion. Give a procedure for converting from the hexadecimal ex pansion of an integer to its octal expansion using binary notation as an intermediate step. Give a procedure for converting from the octal expansion of an integer to its hexadecimal expansion using binary notation as an intermediate step. Convert ( 1 2345670) 8 to its hexadecimal expansion and (ABB093BABBA) 1 6 to its octal expansion. Use Algorithm 5 to find 7644 mod 645 . Use Algorithm 5 to find 1 1 644 mod 645 . Use Algorithm 5 to find 3 2003 mod 99. Use Algorithm 5 to find 123 1 00 1 mod 1 0 1 . Use the Euclidean algorithm to find a) gcd( 1 2 , 1 8). b) gcd( 1 1 1 , 20 1). c) gcd( 1 00 1 , 1 33 1 ). d) gcd( 1 2345 , 5432 1 ). e) gcd( 1 000, 5040). f) gcd(9888, 6060).
24. Use the Euclidean algorithm to find a) gcd( I , 5). b) gcd( 1 00, 1 0 1 ). c) gcd( 123 , 277). d) gcd( 1 529, 1 4039). e) gcd( 1 529, 1 4038). f) gcd( l 1 I 1 1 , 1 1 1 1 1 1 ). 25. How many divisions are required to find gcd(2 1 , 34) using
the Euclidean algorithm? 26. How many divisions are required to find gcd(34, 55) using the Euclidean algorithm? 27. Show that every positive integer can be represented uniquely as the sum of distinct powers of 2. [Hint: Con sider binary expansions of integers.]
28. It can be shown that every integer can be uniquely repre
sented in the form
+
+...+ +
ek 3 k ek _ 1 3 k- 1 e l 3 eo , where ej = - I , O, or 1 for j = 0, 1 , 2, Expansions ofthis type are called balanced ternary expansions. Find the balanced ternary expansions of a) 5. b) 1 3 . c) 37. d) 79. 29. Show that a positive integer is divisible by 3 if and only if the sum of its decimal digits is divisible by 3 . 30. Show that a positive integer i s divisible by 1 1 i f and only if the difference of the sum of its decimal digits in even numbered positions and the sum of its decimal digits in odd-numbered positions is divisible by 1 1 . 3 1 . Show that a positive integer is divisible by 3 if and only if the difference of the sum of its binary digits in even numbered positions and the sum of its binary digits in odd-numbered positions is divisible by 3 . One's complement representations o f integers are used to simplify computer arithmetic. To represent positive and nega tive integers with absolute value less than 2 n - l , a total ofn bits is used. The leftmost bit is used to represent the sign. A 0 bit in this position is used for positive integers, and a 1 bit in this position is used for negative integers. For positive integers, the remaining bits are identical to the binary expansion ofthe integer. For negative integers, the remaining bits are obtained by first finding the binary expansion of the absolute value of the integer, and then taking the complement of each of these bits, where the complement of a I is a 0 and the complement of a 0 is a 1 . 32. Find the one's complement representations, using bit strings of length six, of the following integers. a) 22 b) 3 1 c) -7 d) - 1 9 33. What integer does each ofthe following one's complement representations of length five represent? a) 1 1 00 1 b) 0 1 1 0 1 c ) 1 000 1 d) 1 1 1 1 1 34. If m is a positive integer less than 2n - l , how is the one 's complement representation of -m obtained from the one's complement of m , when bit strings of length n are used? 35. How is the one's complement representation of the sum of two integers obtained from the one's complement rep resentations of these integers? 36. How is the one's complement representation ofthe differ ence of two integers obtained from the one's complement representations of these integers? 37. Show that the integer m with one's complement represen tation (an - I an -2 . . . al ao ) can be found using the equation al · 2 ao. m = -an _I (2 n -1 - 1 ) an _ 2 2 n - 2 Two's complement representations of integers are also used to simplify computer arithmetic and are used more commonly than one's complement representations. To represent an in teger x with _2 n -1 :::: x :::: 2n - 1 - 1 for a specified positive integer n , a total of n bits is used. The leftmost bit is used to represent the sign. A 0 bit in this position is used for positive
+
. . . , k.
+...+
+
3. 7 Applications of Number Theory
3-65
integers, and a 1 bit in this position is used for negative inte gers, just as in one's complement expansions. For a positive integer, the remaining bits are identical to the binary expan sion of the integer. For a negative integer, the remaining bits are the bits of the binary expansion of 2 n - 1 - Ix I . Two's com plement expansions of integers are often used by computers because addition and subtraction of integers can be performed easily using these expansions, where these integers can be either positive or negative. 38. Answer Exercise 32, but this time find the two's comple ment expansion using bit strings of length six. 39. Answer Exercise 33 if each expansion is a two's complement expansion of length five. 40. Answer Exercise 34 for two's complement expansions. 41. Answer Exercise 35 for two's complement expansions. 42. Answer Exercise 36 for two's complement expansions. 43. Show that the integer m with two's complement represen tation (an - l an -2 . . . al aO ) can be found using the equation m = -an - I ' 2n - 1 an _ 2 2n -2 al · 2 ao . 44. Give a simple algorithm for forming the two's comple ment representation of an integer from its one's comple ment representation. 45. Sometimes integers are encoded by using four-digit binary expansions to represent each decimal digit. This produces the binary coded decimal form of the in teger. For instance, is encoded in this way by 0 1 1 1 1 00 1 000 1 . How many bits are required to represent a number with n decimal digits using this type of encoding? A Cantor expansion is a sum of the form
+...+
+
+
791
+ . . . + a2 2 ! + a l l ! , where ai is an integer with 0 ::: ai ::: i for i 1 , 2, . . . , n . an n !
+
an - l en - 1 ) !
=
231
46. Find the Cantor expansions of
a) 2. c) 1 9. e) 1 000.
b) d)
t)
787.. 1 ,000,000.
*47. Describe an algorithm that finds the Cantor expansion of
an integer. *48. Describe an algorithm to add two integers from their
Cantor expansions.
49. Add ( 1 0 1 1 1 )2 and ( 1 1 0 1 0)2 by working through each step
of the algorithm for addition given in the text.
50. Multiply ( 1 1 1 O)z and ( 1 0 1 O)z by working through each
step of the algorithm for multiplication given in the text.
51. Describe an algorithm for finding the difference of two
binary expansions.
52. Estimate the number ofbit operations used to subtract two
binary expansions.
53. Devise an algorithm that, given the binary expansions of
the integers a and b, determines whether a > b , a or a < b.
=
b,
54. How many bit operations does the comparison algorithm
from Exercise 53 use when the larger of a and b has n bits in its binary expansion?
55. Estimate the complexity of Algorithm 1 for finding the
base b expansion of an integer n in terms of the number of divisions used.
*56. Show that Algorithm 5 uses O ((log m i 10g n) bit opera tions to find b n mod m . 57. Show that Algorithm 4 uses O (q log a ) bit operations,
assuming that a >
d.
3.7 App lications of Numb er Theory Introduction Number theory has many applications, especially to computer science. In Section 3 .4 we de scribed several of these applications, including hashing functions, the generation of pseudo random numbers, and shift ciphers. This section continues our introduction to number theory, developing some key results and presenting two important applications: a method for perform ing arithmetic with large integers and a recently invented type of cryptosystem, called a public key system . In such a cryptosystem, we do not have to keep encryption keys secret, because knowledge of an encryption key does not help someone decrypt messages in a realistic amount of time. Privately held decryption keys are used to decrypt messages. Before developing these applications, we will introduce some key results that play a central role in number theory and its applications. For example, we will show how to solve systems of linear congruences modulo pairwise relatively prime integers using the Chinese Remainder Theorem, and then show how to use this result as a basis for performing arithmetic with large integers. We will introduce Fermat's Little Theorem and the concept of a pseudoprime and will show how to use these concepts to develop a public key cryptosystem.
232
3-66
3 / The Fundamentals: Algorithms, the Integers, and Matrices
Some Useful Results An important result we will use throughout this section is that the greatest common divisor of two integers a and b can be expressed in the form sa + tb, where s and t are integers. In other words, gcd(a, b) can be expressed as a linear combination with integer coefficients of a and b. For example, gcd(6, 1 4) = 2, and 2 = (-2) · 6 + 1 . 14. We state this fact as Theorem 1 . THEOREM 1
If a and b are positive integers, then there exist integers + tb.
sa
s
and t such that gcd(a , b) =
We will not give a formal proof of Theorem 1 here (see Exercise 36 in Section 4.2 and [Ro05] for proofs), but we will provide an example of a method for finding a linear combination of two integers equal to their greatest common divisor. (In this section, we will assume that a linear combination has integer coefficients.) The method proceeds by working backward through the divisions of the Euclidean algorithm. (We also describe an algorithm called the extended Euclidean algorithm that can be used to express gcd(a , b) as a linear combination of a and b in the preamble to Exercise 48.) EXAMPLE 1
Express gcd(252, 1 98) = 1 8 as a linear combination of 252 and 1 98.
Solution: To show that gcd(252, 1 98) = 1 8, the Euclidean algorithm uses these divisions: 252 = 1 1 98 = 3 54 = 1 36 = 2
. 1 98 + 54 · 54 + 36 · 36 + 1 8 · 1 8.
Using the next-to-1ast division (the third division), we can express gcd(252, 1 98) = 1 8 as a linear combination of 54 and 36. We find that 1 8 = 54 - 1 . 36. The second division tells us that 36 = 1 98 - 3 . 54. Substituting this expression for 36 into the previous equation, we can express 18 as a linear combination of 54 and 1 98. We have 1 8 = 54 - 1 · 36 = 54 - 1 . ( 1 98 - 3 · 54) = 4 · 54 - 1 . 1 98 . The first division tells u s that 54 = 252 - 1 . 1 98 . Substituting this expression for 5 4 into the previous equation, we can express 1 8 as a linear combination of 252 and 1 98. We conclude that 1 8 = 4 · (252 - 1 . 1 98) - 1 . 1 98 = 4 · 252 - 5 . 1 98 , completing the solution.
3.7 Applications of Number Theory
3-67
233
We will use Theorem I to develop several useful results. One of our goals will be to prove the part of the Fundamental Theorem of Arithmetic asserting that a positive integer has at most one prime factorization. We will show that if a positive integer has a factorization into primes, where the primes are written in nondecreasing order, then this factorization is unique. First, we need to develop some results about divisibility.
LEMMA 1
If a, b, and e are positive integers such that gcd(a b) ,
=
I and a
I be, then a I e.
Proof: Because gcd(a , b) = I, by Theorem I there are integers s and t such that sa + tb = 1 . Multiplying both sides of this equation by c, we obtain
sac + tbc
=
c.
Using Theorem I of Section 3 .4, we can use this last equation to show that a I c . By part (ii ) of that theorem, a I tbc . Because a I sac and a I tbc, by part (i ) of that theorem, we conclude
LEMMA 2
If P is a prime and p i a l a2 . . . an , where each aj is an integer, then P I aj for some i .
We can now show that a factorization of an integer into primes is unique. That is, we will show that every integer can be written as the product of primes in nondecreasing order in at most one way. This is part of the Fundamental Theorem of Arithmetic. We will prove the other part, that every integer has a factorization into primes, in Section 4.2.
Proof (of the uniqueness of the prime factorization of a positive integer): We will use a proof by contradiction. Suppose that the positive integer n can be written as the product of primes in two different ways, say, n = P I P2 . . . Ps and n = q l q2 . . . qt , each pj and qj are primes such that P I :::: P2 :::: . . . :::: Ps and q l :::: q2 :::: . . . :::: q t . When we remove all common primes from the two factorizations, we have
where no prime occurs on both sides of this equation and u and v are positive integers. By Lemma I it follows that Pi , divides qjk for some k. Because no prime divides another prime, this is impossible. Consequently, there can be at most one factorization of n into primes in
234
3
/
3-68
The Fundamentals: Algorithms, the Integers, and Matrices
Lemma 1 can also be used to prove a result about dividing both sides of a congruence by the same integer. We have shown (Theorem 5 in Section 3 .4) that we can multiply both sides of a congruence by the same integer. However, dividing both sides of a congruence by an integer does not always produce a valid congruence, as Example 2 shows, EXAMPLE 2
The congruence 1 4 == 8 (mod 6) holds, but both sides of this congruence cannot be divided by ... 2 because 1 4/2 = 7 and 8/2 = 4, but 7 ¢= 4 (mod 6). However, using Lemma 1 , we can show that we can divide both sides of a congruence by an integer relatively prime to the modulus. This is stated as Theorem 2.
THEOREM 2
Let m be a positive integer and let a , b, and e be integers. Ifae I , then a == b (mod m).
==
be (mod m) and gcd(e, m)
=
Proof: Because ae == be (mod m ), m l ac - be = e(a - b). By Lemma 1, because gcd(e, m) =
Linear Congruences A congruence of the form
ax
==
b (mod m),
where m is a positive integer, a and b are integers, and x is a variable, is called a linear congruence. Such congruences arise throughout number theory and its applications. How can we solve the linear congruence ax == b (mod m ), that is, find all integers x that satisfy this congruence? One method that we will describe uses an integer a such that aa == 1 (mod m), if such an integer exists. Such an integer a is said to be an inverse of a modulo m . Theorem 3 guarantees that an inverse ofa modulo m exists whenever a and m are relatively prime.
THEOREM 3
If a and m are relatively prime integers and m > 1 , then an inverse of a modulo m exists. Furthermore, this inverse is unique modulo m. (That is, there is a unique positive integer a less than m that is an inverse of a modulo m and every other inverse of a modulo m is congruent to a modulo m .)
Proof: By Theorem 1 , because gcd(a , m ) = 1 , there are integers s and I such that sa + 1m = 1 . This implies that sa + 1m Because 1m sa
==
== ==
1 (mod m).
0 (mod m ), it follows that
1 (mod m).
Consequently, s is an inverse of a modulo m. That this inverse is unique modulo m is left as
3-69
3.7 Applications of Number Theory
235
The proof of Theorem 3 describes a method for finding the inverse of a modulo m when a and m are relatively prime: find a linear combination of a and m that equals I (which can be done by working backward through the steps ofthe Euclidean algorithm); the coefficient of a in this linear combination is an inverse of a modulo m . We illustrate this procedure in Example 3 .
EXAMPLE 3
Find an inverse of 3 modulo 7.
Solution: Because gcd(3 , 7) = I, Theorem 3 tells us that an inverse of 3 modulo 7 exists. The Euclidean algorithm ends quickly when used to find the greatest common divisor of 3 and 7: 7 = 2 · 3 + 1. From this equation we see that -2 · 3 + 1 · 7 = 1 . This shows that -2 is an inverse 00 modulo 7 . (Note that every integer congruent to -2 modulo ... 7 is also an inverse of 3, such as 5, -9, 12, and so on.) When we have an inverse
a
of a modulo
m,
we can easily solve the congruence ax
b (mod m ) by multiplying both sides of the linear congruence by a, as Example 4 illustrates. EXAMPLE 4
What are the solutions of the linear congruence 3x
==
==
4 (mod 7)?
Solution: By Example 3 we know that -2 is an inverse of 3 modulo 7. Multiplying both sides of the congruence by -2 shows that -2 · 3x
==
-2 · 4 (mod 7).
Because -6 == 1 (mod 7) and -8 == 6 (mod 7), it follows that if x is a solution, then x == -8 == 6 (mod 7). We need to determine whether every x with x == 6 (mod 7) is a solution. Assume that x == 6 (mod 7). Then, by Theorem 5 of Section 3 .4, it follows that 3x
==
3 · 6 = 18
==
4 (mod 7),
which shows that all such x satisfy the congruence. We conclude that the solutions to the congruence are the integers x such that x == 6 (mod 7), namely, 6, 1 3 , 20, . . . and - 1 , -8, ... - 15, . . . .
The Chinese Remainder Theorem
linkS � EXAMPLE S
Systems oflinear congruences arise in many contexts. For example, as we will see later, they are the basis for a method that can be used to perform arithmetic with large integers. Such systems can even be found as word puzzles in the writings of ancient Chinese and Hindu mathematicians, such as that given in Example 5 . In the first century, the Chinese mathematician Sun-Tsu asked: There are certain things whose number is unknown. When divided by 3, the remainder is 2; when divided by 5, the remainder is 3; and when divided by 7, the remainder is 2. What will be the number of things?
236
3 / The Fundamentals: Algorithms, the Integers, and Matrices
3- 70
This puzzle can be translated into the following question: What are the solutions of the systems of congruences x == x == x ==
2 (mod 3), 3 (mod 5), 2 (mod 7)?
We will solve this system, and with it Sun-Tsu's puzzle, later in this section. The Chinese Remainder Theorem, named after the Chinese heritage of problems involving systems of linear congruences, states that when the moduli of a system of linear congruences are pairwise relatively prime, there is a unique solution of the system modulo the product of the moduli.
THEOREM 4
THE CHINESE REMAINDER THEOREM Let m 1 , m 2 , , m n be pairwise relatively , an arbitrary integers. Then the system .
prime positive integers and ai , a2 , x == x ==
.
.
.
.
.
al (mod m l ), a2 (mod m2),
has a unique solution modulo m = m 1 m 2 m n . (That is, there is a solution x with 0 ::: x m , and all other solutions are congruent modulo m to this solution.) •
•
•
<
Proof: To establish this theorem, we need to show that a solution exists and that it is unique modulo m. We will show that a solution exists by describing a way to construct this solution; showing that the solution is unique modulo m is Exercise 24 at the end of this section. To construct a simultaneous solution, first let
for k = 1 , 2, . . , n . That is, Mk is the product of the moduli except for mk. Because mj and mk have no common factors greater than 1 when i -I k, it follows that gcd(mb Md = 1 . Conse quently, by Theorem 3, we know that there is an integer Yk, an inverse of Mk modulo mb such that .
To construct a simultaneous solution, form the sum
We will now show that x is a simultaneous solution. First, note that because Mj == 0 (mod mk) whenever j -I k, all terms except the kth term in this sum are congruent to 0 modulo mk. Because MkYk == 1 (mod mk) we see that x ==
akMkYk
==
ak (mod mk),
for k = I , 2, . . . , n. We have shown that x is a simultaneous solution to the n congruences.
3.7 Applications of Number Theory
3- 71
237
Example 6 illustrates how to use the construction given in the proof of Theorem 4 to solve a system of congruences. We will solve the system given in Example 5, arising in Sun-Tsu's puzzle.
EXAMPLE
6
To solve the system of congruences in Example 5, first let m = 3 · 5 · 7 = 1 05 , M I = m/3 = 3 5 , M2 = m /5 = 2 1 , and M3 = m /7 = 1 5 . We see that 2 i s an inverse o f M I = 35 modulo 3, because 35 == 2 (mod 3); 1 is an inverse of M2 = 21 modulo 5, because 21 == 1 (mod 5); and 1 is an inverse of M3 = 1 5 (mod 7), because 1 5 == 1 (mod 7). The solutions to this system are those x such that x ==
a l M 1 Y l + a2M2Y2 + a3 M3Y3 = 2 · 35 · 2 + 3 · 2 1 . 1 + 2 · 1 5 · 1 = 233
==
23 (mod 1 05).
It follows that 23 is the smallest positive integer that is a simultaneous solution. We conclude that 23 is the smallest positive integer that leaves a remainder of 2 when divided by 3, a remainder .... of 3 when divided by 5, and a remainder of 2 when divided by 7.
Computer Arithmetic with Large Integers Suppose that m l , m 2 , . . . , m n are pairwise relatively prime integers greater than or equal to 2 and let m be their product. By the Chinese Remainder Theorem, we can show (see Exercise 22) that an integer a with ° :::: a < m can be uniquely represented by the n -tuple consisting of its remainders upon division by m i , i = 1 , 2, . . . , n . That is, we can uniquely represent a by
(a mod m l , a mod m 2 , . . . , a mod m n ). EXAMPLE 7
What are the pairs used to represent the nonnegative integers less than 1 2 when they are rep resented by the ordered pair where the first component is the remainder of the integer upon division by 3 and the second component is the remainder of the integer upon division by 4?
Solution: We have the following representations, obtained by finding the remainder of each integer when it is divided by 3 and by 4: 0 1 2 3
= (0, 0) = (1 , 1) = (2, 2) = (0, 3)
4 5 6 7
= = = =
( 1 , 0) (2, 1 ) (0, 2) ( 1 , 3)
8 9 10 11
= = = =
(2 , 0) (0, 1 ) ( 1 , 2) (2, 3).
....
To perform arithmetic with large integers, we select moduli m l , m 2 , . . . , m n , where each mi is an integer greater than 2, gcd(m i, m j ) = 1 whenever i =I- j , and m = m 1 m 2 m n is greater than the result of the arithmetic operations we want to carry out. Once we have selected our moduli, we carry out arithmetic operations with large integers by performing componentwise operations on the n -tuples representing these integers using their remainders upon division by m i , i = 1 , 2, . . . , n . Once we have computed the value of each component in the result, we recover its value by solving a system of n congruences modulo mj , i = 1 , 2, . . . , n . This method ofperforming arithmetic with large integers has severa1 valuable features. First, it can be used to perform arithmetic with integers larger than can ordinarily be carried out on a computer. Second, computations with respect to the different moduli can be done in parallel, speeding up the arithmetic. •
•
•
238
3 / The Fundamentals: Algorithms, the Integers, and Matrices
EXAMPLE 8
3- 72
Suppose that performing arithmetic with integers less than 1 00 on a certain processor is much quicker than doing arithmetic with larger integers. We can restrict almost all our computations to integers less than 1 00 if we represent integers using their remainders modulo pairwise relatively prime integers less than 1 00. For example, we can use the moduli of 99, 98, 97, and 95. (These integers are relatively prime pairwise, because no two have a common factor greater than 1 .) By the Chinese Remainder Theorem, every nonnegative integer less than 99 . 98 . 97 . 95 = 89,403,930 can be represented uniquely by its remainders when divided by these four moduli. For example, we represent 1 23,684 as (33, 8, 9, 89), because 1 23,684 mod 99 = 33, 1 23,684 mod 98 = 8; 1 23,684 mod 97 = 9; and 1 23,684 mod 95 = 89. Similarly, we represent 4 1 3 ,456 as (32, 92, 42, 1 6). To find the sum of 1 23,684 and 4 1 3,456, we work with these 4-tuples instead of these two integers directly. We add the 4-tuples componentwise and reduce each component with respect to the appropriate modulus. This yields (33 , 8 , 9, 89) + (32, 92, 42, 1 6) = (65 mod 99, 1 00 mod 98, 5 1 mod 97, 1 05 mod 95) = (65 , 2, 5 1 , 1 0). To find the sum, that is, the integer represented by (65, 2, 5 1 , 1 0), we need to solve the system of congruences x == x == x == x ==
65 2 51 10
(mod 99) (mod 98) (mod 97) (mod 95)
It can be shown (see Exercise 39) that 537, 1 40 is the unique nonnegative solution of this system less than 89,403,930. Consequently, 537, 140 is the sum. Note that it is only when we have to recover the integer represented by (65, 2, 5 1 , 1 0) that we have to do arithmetic with .... integers larger than 1 00. Particularly good choices for moduli for arithmetic with large integers are sets of integers of the form 2k - 1 , where k is a positive integer, because it is easy to do binary arithmetic modulo such integers, and because it is easy to find sets of such integers that are pairwise relatively prime. [The second reason is a consequence of the fact that gcd(2a - I , 2 b - I ) = 2gcd(a, b) - I , as Exercise 4 1 shows.] Suppose, for instance, that we can do arithmetic with integers less than 2 3 5 easily on our computer, but that working with larger integers requires special procedures. We can use pairwise relatively prime moduli less than 2 3 5 to perform arithmetic with integers as large as their product. For example, as Exercise 42 shows, the integers 2 3 5 - I , 234 - I , 233 - 1 , 2 3 1 - 1 , 229 - I , and 223 - I are pairwise relatively prime. Because the product of these six moduli exceeds 2 1 84 , we can perform arithmetic with integers as large as 2 1 84 (as long as the results do not exceed this number) by doing arithmetic modulo for each of these six moduli, none of which exceeds 2 3 5 .
Pseudoprimes In Section 3.5 we showed that an integer n is prime when it is not divisible by any prime p with In. Unfortunately, using this criterion to show that a given integer is prime is inefficient. It requires that we find all primes not exceeding In and that we carry out trial division by each such prime to see whether it divides n .
p �
3. 7
3- 73
Applications of Number Theory
239
Are there more efficient ways to determine whether an integer is prime? According to some sources, ancient Chinese mathematicians believed that n was prime if and only if
2n - 1
==
1 (mod n).
Ifthis were true, it would provide an efficient primality test. Why did they believe this congruence could be used to determine whether an integer is prime? First, they observed that the congruence holds whenever n is prime. For example, 5 is prime and
2 5- 1
=
24
=
16
==
1 (mod 5).
Second, they never found a composite integer n for which the congruence holds. The ancient Chinese were only partially correct. They were correct in thinking that the congruence holds whenever n is prime, but they were incorrect in concluding that n is necessarily prime if the congruence holds. The great French mathematician Fermat showed that the congruence holds when n is prime. He proved the following, more general result.
THEOREM S
FERMAT'S LITTLE THEOREM then aP
-1
==
If P is prime and a is an integer not divisible by p,
1 (mod p).
Furthermore, for every integer a we have a P == a
(mod p).
The proof of Theorem 5 is outlined in Exercise 1 7 at the end of this section. Unfortunately, there are composite integers n such that 2 n - 1 == 1 (mod n ) . Such integers are called pseudoprimes to the base 2 .
EXAMPLE 9
The integer 34 1 is a pseudoprime to the base 2 because it is composite (34 1 = 1 1 . 3 1 ) and as Exercise 27 shows
2 340
==
1 (mod 34 1 ) .
We can use an integer other than 2 a s the base when we study pseudoprimes.
UDks it:J PI ERRE DE F ERMAT ( 1 60 1 - 1 665) Pierre de Fermat, one of the most important mathematicians of the seventeenth century, was a lawyer by profession. He is the most famous amateur mathematician in history. Fermat published little of his mathematical discoveries. It is through his correspondence with other mathematicians that we know of his work. Fermat was one of the inventors of analytic geometry and developed some of the fundamental ideas of calculus. Fermat, along with Pascal, gave probability theory a mathematical basis. Fermat formulated what was the most famous unsolved problem in mathematics. He asserted that the equation xn + yn = zn has no nontrivial positive integer solutions when n is an integer greater than 2. For more than years, no proof (or counterexample) was found. In his copy of the works of the ancient Greek mathematician Diophantus, Fermat wrote that he had a proof but that it would not fit in the margin. Because the first proof, found by Andrew Wiles in 1 994, relies on sophisticated, modem mathematics, most people think that Fermat thought he had a proof, but that the proof was incorrect. However, he may have been tempting others to look for a proof, not being able to find one himself.
300
240
3 I The Fundamentals: Algorithms, the Integers, and Matrices
D EFINITION 1
3- 74
Let b be a positive integer. If n is a composite positive integer, and bn - 1 n is called a pseudoprime to the base b.
==
1 (mod n), then
Given a positive integer n , determining whether 2n - 1 == 1 (mod n) is a useful test that provides some evidence concerning whether n is prime. In particular, if n satisfies this congruence, then it is either prime or a pseudoprime to the base 2; if n does not satisfy this congruence, it is composite. We can perform similar tests using bases b other than 2 and obtain more evidence as to whether n is prime. If n passes all such tests, it is either prime or a pseudoprime to all the bases b we have chosen. Furthermore, among the positive integers not exceeding x , where x is a positive real number, compared to primes there are relatively few pseudoprimes to the base b, where b is a positive integer. For example, less than 1 0 1 0 there are 455,052,5 1 2 primes, but only 1 4,884 pseudoprimes to the base 2. Unfortunately, we cannot distinguish between primes and pseudoprimes just by choosing sufficiently many bases, because there are composite integers n that pass all tests with bases with gcd(b, n) = 1 . This leads to Definition 2.
D EFINITI ON 2
A composite integer n that satisfies the congruence bn
-
1 ==
1 (mod n) for all positive integers
b with gcd(b n) = 1 is called a Carmichael number. (These numbers are named after Robert Carmichael, who studied them in the early twentieth century.) ,
EXAMPLE 10
The integer 5 6 1 is a Carmichael number. To see this, first note that 561 is composite because 5 6 1 = 3 · 1 1 . 1 7 . Next, note that if gcd(b, 5 6 1 ) = I , then gcd(b, 3) = gcd(b, 1 1 ) = gcd(b , 1 7) = 1 . Using Fermat's Little Theorem we find that b2
==
l 1 (mod 3), b O
==
l 1 (mod 1 1 ), and b 6
==
1 (mod 1 7).
It follows that b 5 60 = (b2 )280 b 5 60 = (b l O ) 5 6 b 5 60 = (b I 6 )3 5
== == ==
1 (mod 3), 1 (mod 1 1 ), 1 (mod 1 7).
By Exercise 23 at the end of this section, it follows that b 5 60 == 1 (mod 5 6 1 ) for all positive .... integers b with gcd(b, 5 6 1 ) = 1 . Hence 5 6 1 is a Carmichael number. Although there are infinitely many Carmichael numbers, more delicate tests, described in the exercise set at the end ofthis section, can be devised that can be used as the basis for efficient
ROBERT DANIEL CARMICHAEL ( 1 879- 1 967) Robert Daniel Carmichael was born in Alabama. He re ceived his undergraduate degree from Lineville College in 1 898 and his Ph.D. in 1 9 1 1 from Princeton. Carmichael held positions at Indiana University from 1 9 1 1 until 1 9 1 5 and at the University of Illinois from 1 9 1 5 until 1 947. Carmichael was an active researcher in a wide variety of areas, including number theory, real analysis, differential equations, mathematical physics, and group theory. His Ph.D. thesis, written under the direction of G. D. Birkhoff, is considered the first significant American contribution to the subject of differential equations.
3.7 Applications of Nwnber Theory
241
probabilistic primality tests. Such tests can be used to quickly show that it is almost certainly the case that a given integer is prime. More precisely, if an integer is not prime, then the probability that it passes a series of tests is close to O . We will describe such a test in Chapter 6 and discuss the notions from probability theory that this test relies on. These probabilistic primality tests can be used, and are used, to find large primes extremely rapidly on computers.
Public Key Cryptography In Section 3 .4 we introduced methods for encrypting messages based on congruences. When these encryption methods are used, messages, which are strings of characters, are translated into numbers. Then the number for each character is transformed into another number, either using a shift or an affine transformation modulo 26. These methods are examples of private key cryp tosystems. Knowing the encryption key lets you quickly find the decryption key. For example, when a shift cipher is used with encryption key k, a number p representing a letter is sent to
c = (p + k) mod 26. Decryption is carried out by shifting by -k; that is,
p = (c - k) mod 26. When a private key cryptosystem is used, a pair of people who wish to communicate in secret must have a separate key. Because anyone knowing this key can both encrypt and decrypt messages easily, these two people need to securely exchange the key. In the mid- 1 970s, cryptologists introduced the concept of public key cryptosystems. When such cryptosystems are used, knowing how to send someone a message does not help you decrypt messages sent to this person. In such a system, every person can have a publicly known encryption key. Only the decryption keys are kept secret, and only the intended recipient of a message can decrypt it, because the encryption key does not let someone find the decryption key without an extraordinary amount of work (such as more than 2 billion years of computer time).
The RSA Cryptosystem In 1 976, three researchers at M.I.T.-Ronald Rivest, Adi Shamir, and Leonard Adleman introduced to the world a public key cryptosystem, known as the RSA system, from the initials of its inventors. (As is often the case with cryptographic discoveries, the RSA system had already been discovered in secret government research. Clifford Cocks, working in secrecy at the United Kingdom's Government Communications Headquarters, had discovered this cryptosystem in 1 973, but his invention was unknown to the outside world until the late 1 990s. An excellent account of this earlier discovery, as well as the work of Rivest, Shamir, and Adleman, can be found in [Si99] .) The RSA cryptosystem is based on modular exponentiation modulo the product oftwo large primes, which can be done rapidly using Algorithm 5 in Section 3.6. Each individual has an encryption key consisting of a modulus n = pq , where p and q are large primes, say, with 200 digits each, and an exponent e that is relatively prime to (p - 1 )(q - 1 ). To produce a usable key, two large primes must be found. This can be done quickly on a computer using probabilistic primality tests, referred to earlier in this section. However, the product of these primes n = pq , with approximately 400 digits, cannot be factored in a reasonable length of time. As we will see, this is an important reason why decryption cannot be done quickly without a separate decryption key.
242
3 / The Fundamentals: Algorithms, the Integers, and Matrices
3- 76
RSA Enc ryption In the RSA encryption method, messages are translated into sequences of integers. This can be done by translating each letter into an integer, as is done with the Caesar cipher. These integers are grouped together to form larger integers, each representing a block ofletters. The encryption proceeds by transforming the integer M , representing the plaintext (the original message), to an integer C , representing the ciphertext (the encrypted message), using the function
C = Me mod n . (To perform the encryption, we use an algorithm for fast modular exponentiation, such as Algorithm 5 in Section 3 .6.) We leave the encrypted message as blocks of numbers and send these to the intended recipient. Example 1 1 illustrates how RSA encryption is performed. For practical reasons we use small primes p and q in this example, rather than primes with 1 00 or more digits. Although the cipher described in this example is not secure, it does illustrate the techniques used in the RSA cipher.
EXAMPLE 11
Encrypt the message STOP using the RSA cryptosystem with p = 43 and q = 59, so that n = 43 . 59 = 2537, and with e = 1 3 . Note that gcd(e, (p
-
l )(q
-
1 )) = gcd( 1 3 , 42 · 58) = 1 .
Solution: We translate the letters in STOP into their numerical equivalents and then group the numbers into blocks of four. We obtain 1819
1415.
RONALD RIVEST (BORN 1 948) Ronald Rivest received a B.A. from Yale in 1 969 and his Ph.D. in computer science from Stanford in 1 974. Rivest is a computer science professor at M.I.T. and was a cofounder of RSA Data Security, which held the patent on the RSA cryptosystem that he invented together with Adi Shamir and Leonard Adleman. Areas that Rivest has worked in besides cryptography include machine learning, VLSI design, and computer algorithms. He is a coauthor of a popular text on algorithms ([CoLeRiStO 1 D.
ADI SHAMIR (BORN 1 952) Adi Shamir was born in Tel Aviv, Israel. His undergraduate degree is from Tel Aviv University ( 1 972) and his Ph.D. is from the Weizmann Institute of Science ( 1 977). Shamir was a research assistant at the University of Warwick and an assistant professor at M.I. T. He is currently a profes sor in the Applied Mathematics Department at the Weizmann Institute and leads a group studying computer security. Shamir's contributions to cryptography, besides the RSA cryptosystem, include cracking knapsack cryptosystems, cryptanalysis of the Data Encryption Standard (DES), and the design of many cryptographic protocols.
LEONARD ADLEMAN (BORN 1 945) Leonard Adleman was born in San Francisco, California. He received a B.S. in mathematics ( 1 968) and his Ph.D. in computer science ( 1 976) from the University of California, Berkeley. Adleman was a member of the mathematics faculty at M.I.T. from 1 976 until 1 980, where he was a coinventor of the RSA cryptosystem, and in 1 980 he took a position in the computer science department at the University of Southern California (USC). He was appointed to a chaired position at USC in 1 985. Adleman has worked on computer security, computational complexity, immunology, and molecular biology. He invented the term "computer virus." Adleman's recent work on DNA computing has sparked great interest. He was a technical adviser for the movie Sneakers, in which computer security played an important role.
3- 7 7
3.7 Applications o f Number Theory
243
We encrypt each block using the mapping C = M I 3 mod 2537. Computations using fast modular multiplication show that 1 8 1 9 1 3 mod 2537 = 208 1 and ... 1 4 1 5 13 mod 2537 = 2 1 82. The encrypted message is 208 1 2 1 82.
RSA Decryption The plaintext message can be quickly recovered when the decryption key d, an inverse of e modulo (p - I )(q - I ), is known. [Such an inverse exists because gcd(e, (p - 1 )(q - I )) = 1 .] To see this, note that if de == I (mod (p - I )(q - I )), there is an integer k such that de = 1 + k(p - I )(q - I ). It follows that
By Fermat's Little Theorem [assuming that gcd(M , p) = gcd(M , q ) = 1 , which holds except in rare cases], it follows that MP - l == I (mod p) and Mq - I == I (mod q). Consequently, Cd
==
l M . (M P - 1 i(q - )
==
M. I
==
M (mod p)
Cd
==
I M · ( M q - I )k(p - )
==
M·I
==
M (mod q).
and
Because gcd(p , q ) = I, it follows by the Chinese Remainder Theorem that Cd
==
M (mod pq ).
Example 1 2 illustrates how to decrypt messages sent using the RSA cryptosystem.
EXAMPLE 12
We receive the encrypted message 098 1 046 1 . What is the decrypted message if it was encrypted using the RSA cipher from Example I I ?
Solution: The message was encrypted using the RSA cryptosystem with n = 43 . 59 and expo nent 1 3 . As Exercise 4 shows, d = 937 is an inverse of 1 3 modulo 42 . 5 8 = 2436. We use 937 as our decryption exponent. Consequently, to decrypt a block C, we compute p = C 93 7 mod 2537. To decrypt the message, we use the fast modular exponentiation algorithm to compute 098 1 93 7 mod 2537 = 0704 and 046 1 93 7 mod 2537 = 1 1 1 5 . Consequently, the numerical version of the original message is 0704 1 1 1 5 . Translating this back to English letters, we see that the ... message is HELP.
RSA as a Public Key System
unkS �
Why is the RSA cryptosystem suitable for public key cryptography? When we know the factorization ofthe modulus n , that is, when we know p and q , we can use the Euclidean algorithm to quickly find an exponent d inverse to e modulo (p - 1 )(q - 1 ). This lets us decrypt messages sent using our key. However, no method is known to decrypt messages that is not based on
3- 78
3 / The Fundamentals: Algorithms, the Integers, and Matrices
244
finding a factorization of n , or that does not also lead to the factorization of n . Factorization is believed to be a difficult problem, as opposed to finding large primes p and q , which can be done quickly. The most efficient factorization methods known (as of 2005) require billions of years to factor 400-digit integers. Consequently, when p and q are 200-digit primes, messages encrypted using n = pq as the modulus cannot be found in a reasonable time unless the primes p and q are known. Active research is under way to find new ways to efficiently factor integers. Integers that were thought, as recently as several years ago, to be far too large to be factored in a reasonable amount of time can now be factored routinely. Integers with more than 1 00 digits, as well as some with more than 1 50 digits, have been factored using team efforts. When new factorization techniques are found, it will be necessary to use larger primes to ensure secrecy of messages. Unfortunately, messages that were considered secure earlier can be saved and subsequently decrypted by unintended recipients when it becomes feasible to factor the n = pq in the key used for RSA encryption. The RSA method is now widely used. However, the most commonly used cryptosystems are private key cryptosystems. The use of public key cryptography, via the RSA system, is growing. However, there are applications that use both private key and public key systems. For example, a public key cryptosystem, such as RSA, can be used to distribute private keys to pairs of individuals when they wish to communicate. These people then use a private key system for encryption and decryption of messages.
Exercises 1 . Express the greatest common divisor of each ofthese pairs
of integers as a linear combination of these integers. b) 2 1 , 44 c) 36, 48 a) 1 0, I I e) 1 1 7, 2 1 3 34, 55 t) 0, 223 h) 3454, 4666 i) 9999, 1 1 1 1 1 g) 1 23 , 2347 Express the greatest common divisor of each ofthese pairs of integers as a linear combination of these integers. c) 35, 78 b) 33, 44 a) 9, 1 1 e) 1 0 1 , 203 2 1 , 55 t) 1 24, 323 h) 3457, 4669 i) 1 000 1 , 1 3422 g) 2002, 2339 Show that 1 5 is an inverse of 7 modulo 26. Show that 937 is an inverse of 13 modulo 2436. Find an inverse of 4 modulo 9. Find an inverse of 2 modulo 1 7 . Find an inverse o f 1 9 modulo 1 4 1 . Find an inverse o f 1 44 modulo 233 . Show that i f a and m are relatively prime positive inte gers, then the inverse of a modulo m is unique modulo m. Assume that there are two solutions b and e of the congruence ax == 1 (mod m). Use Theorem 2 to show that b == e (mod m ).] Show that an inverse of a modulo m does not exist if gcd(a , m) > 1 . Solve the congruence 4x == 5 (mod 9). Solve the congruence 2x == 7 (mod 1 7). Show that if m is a positive integer greater than 1 and ae == be (mod m), then a == b mod m /gcd(e, m).
d)
2.
d)
3. 4. 5. 6. 7. 8. *9.
[Hint:
1 0. 11. 12. *13.
14. a) Show that the positive integers less than I I , except
1 and 1 0, can be split into pairs of integers such that each pair consists of integers that are inverses of each other modulo 1 1 . b) Use part (a) to show that 1 O ! == - 1 (mod 1 1 ). 1 5. Show that if p is prime, the only solutions of x 2 == 1 (mod p) are integers x such that x == 1 (mod p) and x == - 1 (mod p). *16. a) Generalize the result in part (a) of Exercise 14; that is, show that if p is a prime, the positive integers less thanp, except 1 and p - 1, can be split into (p - 3)/2 pairs of integers such that each pair consists of integers that are inverses of each other. Use the result of Exercise 1 5 .] b) From part (a) conclude that (p - I ) ! == - 1 (mod p) whenever p is prime. This result is known as Wilson's
[Hint:
Theorem. c) What can we conclude if n is a positive integer such that (n - I ) ! ¥= - 1 (mod n)?
*17. This exercise outlines a proof of Fermat's Little Theorem. a) Suppose that a is not divisible by the prime p. Show
that no two of the integers 1 . a , 2 · a , . . . , (p - l)a are congruent modulo p. b) Conclude from part (a) that the product of 2, . . . , p - I is congruent modulo p to the prod uct of a , 2a , . . . , (p - I )a . Use this to show that
I,
(p - I ) !
==
aP - 1 (p - I ) ! (mod p l .
3- 79
3.7 Applications of Number Theory
c) Use Theorem 2 to show from part (b) that aP - 1 == 1 (mod p ) if P f a . Use Lemma 2 to show that P does not divide (p - 1 ) 1 and then use Theorem 2.
[Hint:
Alternatively, use Wilson's Theorem.] == a (mod p ) for all inte gers a .
d) Use part (c) to show that a p
18. Find all solutions to the system o f congruences. x ==
2 (mod 3) 1 (mod 4) x == 3 (mod 5) 19. Find all solutions to the system of congruences. x == 1 (mod 2) x == 2 (mod 3) x == 3 (mod 5) x == 4 (mod 1 1 ) *20. Find all solutions, if any, to the system of congruences. x == 5 (mod 6) x == 3 (mod 1 0) x == 8 (mod 15) *21. Find all solutions, if any, to the system of congruences. x == 7 (mod 9) x == 4 (mod 1 2) x == 1 6 (mod 2 1 ) 22. Use the Chinese Remainder Theorem to show that an integer a, with 0 :::: a < m = m l m 2 ' " m n , where the integers m l , m 2 , " " m n are pairwise relatively prime, can be represented uniquely by the n-tuple (a mod m I , a mod m 2 , . . . , a mod m n ) . *23. Let m I , m 2 , . . . , mn be pairwise relatively prime integers greater than or equal to 2. Show that if a == b (mod m i l for = 1 , 2, . . . , n , then a == b (mod m ), where m = m 1 m 2 . . . m n ' (This result will be used in Exercise 24 to prove the Chinese remainder theorem. Consequently, do not use the Chinese remainder theorem to prove it.) *24. Complete the proof of the Chinese Remainder Theorem by showing that the simultaneous solution of a system of linear congruences modulo pairwise relatively prime integers is unique modulo the product of these moduli. Assume that x and y are two simultaneous solu tions. Show that m i I x - y for all j . Using Exercise 23, conclude that m = m l m 2 . . . m n I x - y.] 25. Which integers leave a remainder of 1 when divided by 2 and also leave a remainder of 1 when divided by 3? 26. Which integers are divisible by 5 but leave a remainder of 1 when divided by 3? 27. a) Show that 2 340 == 1 (mod 1 1 ) by Fermat's Little The orem and noting that 2 340 = (2 1 0 )34 . b ) Show that 2 340 == 1 (mod 3 1 ) using the fact that 2 340 (2 5 )68 = 32 68 . c) Conclude from parts (a) and (b) that 2 340 == 1 (mod 34 1). 28. a) Use Fermat's Little Theorem to compute 3 3 02 mod 5, 3 302 mod 7, and 33 02 mod 1 1 . b) Use your results from part (a) and the Chinese Rex ==
i
[Hint:
245
mainder Theorem to find 3 302 mod 385. (Note that 385 = 5 . 7 . 1 1 .) 29. a) Use Fermat's Little Theorem to compute 5 2003 mod 7, 5 2003 mod 1 1 , and 5 2003 mod 1 3 . b ) Use your results from part (a) and the Chinese Re mainder Theorem to find 5 2003 mod 1 00 1 . (Note that 1 00 1 = 7 . 1 1 . 1 3 .) I3r Let n be a positive integer and let n - 1 = 2S , where s is a nonnegative integer and is an odd positive integer. We say that n passes Miller's test for the base b if either Y == 1 (mod i n) or b 2 t == - 1 (mod n) for some j with 0 :::: j :::: s - 1 . It can be shown (see [Ro05]) that a composite integer n passes Miller's test for fewer than n /4 bases b with 1 < b < n .
t
t
*30. Show that i f n i s prime and b i s a positive integer with
n f b, then n passes Miller's test to the base b.
3 1 . Show that 2047 passes Miller's test to the base 2, but that
32. 33. *34.
it is composite. A composite positive integer n that passes Miller's test to the base b is called a strong pseudoprime to the base b. 1t follows that 2047 is a strong pseudoprime to the base 2. Show that 1 729 is a Carmichael number. Show that 282 1 is a Carmichael number. Show that if n = PI P2 . . . Pk. where P I , P2 , . . . , Pk are distinct primes that satisfy Pj 1 I n - 1 for j = 1 , 2, . . . , k, then n is a Carmichael number. a) Use Exercise 34 to show that every integer ofthe form (6m 1 )( 1 2m 1 )( 1 8m 1 ), where m is a positive integer and 6m 1 , 1 2m 1 , and 1 8 m 1 are all primes, is a Carmichael number. b) Use part (a) to show that 1 72,947,529 is a Car michael number. Find the nonnegative integer a less than 28 represented by each of these pairs, where each pair represents (a mod 4, a mod 7). c) ( 1 , 1) a) (0, 0) b) ( 1 , 0) d) (2, 1 ) e) (2, 2) f) (0, 3) g) (2, 0) h) (3, 5) i) (3, 6) Express each nonnegative integer a less than 1 5 using the pair (a mod 3 , a mod 5). Explain how to use the pairs found in Exercise 37 to add 4 and 7. Solve the system of congruences that arises in Example 8. Show that whenever a and b are both positive integers, then (2a - 1 ) mod (2b - 1 ) = 2a mod b - 1 . Use Exercise 40 to show that if a and b are pos itive integers, then gcd(2a - 1 , 2 b - 1 ) = 2gcd(a , b) - 1 . [Hint: Show that the remainders obtained when the Eu clidean algorithm is used to compute gcd(2a - 1 , 2 b - 1 ) are o f the form 2r - 1 , where r i s a remainder arising when the Euclidean algorithm is used to find gcd(a , b).] Use Exercise 41 to show that the integers 23 5 - 1 , 234 - 1 , 233 - 1 , 23 1 - 1 , 2 29 - 1 , and 2 2 3 - 1 are pairwise rela tively prime. Show that if P is an odd prime, then every divisor of the -
35.
36.
37. 38. 39. *40. * *41.
=
42.
43.
+
++
++
+
246
3-80
3 / The Fundamentals: Algorithms, the Integers, and Matrices
-I
Mersenne number 2P is ofthe form 2kp + 1 , where k is a nonnegative integer. [Hint: Use Fermat's Little The orem and Exercise 4 1 .] 44. Use Exercise 43 to determine whether M 1 3 = 2 1 3 - 1 = 8 1 9 1 and M2 3 = 2 2 3 - 1 = 8 , 3 8 8 , 607 are prime. *45. Show that we can easily factor n when we know that n is the product of two primes, p and q , and we know the value of (p - 1 )(q - 1). 46. Encrypt the message ATTACK using the RSA system with n = 43 . 59 and e = 1 3 , translating each letter into inte gers and grouping together pairs of integers, as done in Example 1 1 . 47. What is the original message encrypted using the RSA system with n = 43 . 59 and e = 1 3 ifthe encrypted mes sage is 0667 1 947 067 1 ? (Note: Some computational aid is needed to do this in a realistic amount of time.) The extended Euclidean algorithm can be used to express � gcd(a , b) as a linear combination with integer coefficients � of the integers a and b. We set So = 1 , SI = 0, to = 0, and tl = 1 and let sj = Sj-2 - qj _ I Sj _ 1 and tj = tj - 2 - qj _l tj _ 1 for j = 2, 3 , . . . , n, where the qj are the quotients in the divi sions used when the Euclidean algorithm finds gcd(a , b) (see page 228). It can be shown (see [Ro05]) that gcd(a , b) = Sn a + tn b. 48. Use the extended Euclidean algorithm to express gcd(252, 356) as a linear combination of 252 and 356. 49. Use the extended Euclidean algorithm to express gcd( 1 44, 89) as a linear combination of 1 44 and 89. 50. Use the extended Euclidean algorithm to express gcd( 1 00 1 , 1 0000 1 ) as a linear combination of 1 00 1 and 1 0000 1 . 51. Describe the extended Euclidean algorithm using pseudocode. If m is a positive integer, the integer a is a quadratic residue of m if gcd(a , m) = 1 and the congruence x 2 == a (mod m ) has a solution. I n other words, a quadratic residue o f m i s an integer relatively prime to m that is a perfect square modulo m . For example, 2 is a quadratic residue of 7 because gcd(2, 7) = 1 and 3 2 == 2 (mod 7) and 3 is a quadratic nonresidue of 7 because gcd(3 , 7) = 1 and x 2 == 3 (mod 7) has no solution. 52. Which integers are quadratic residues of 1 1 ? 53. Show that if p is an odd prime and a is an integer not divisible by p, then the congruence x 2 == a (mod p) has
either no solutions or exactly two incongruent solutions modulo p. 54. Show that if p is an odd prime, then there are exactly (p - 1 )/2 quadratic residues of p among the integers 1 , 2, . . . , p - 1 . If p is an odd prime and a is an integer not divisible by p, the
(�)
Legendre symbol
is defined to be 1 if a is a quadratic
residue of p and - 1 otherwise. 55. Show that if p is an odd prime and a and b are integers
with a
==
b (mod p), then
56. Prove that if p is an odd prime and a is a positive integer
not divisible by p, then
(�)
==
a (p - I )/2 (mod p).
57. Use Exercise 56 to show that if p is an odd prime and a
and b are integers not divisible by p, then
(�) (�) (%) . =
58. Show that if p is an odd prime, then - 1 is a quadratic
residue of p if p == 1 (mod 4), and - 1 is not a quadratic residue of p if p == 3 (mod 4). [Hint: Use Exercise 56.] 59. Find all solutions of the congruence x 2 == 29 (mod 35). [Hint: Find the solutions of this congruence modulo 5 and modulo 7, and then use the Chinese Remainder Theorem.] 60. Find all solutions of the congruence x 2 == 16 (mod 1 05). [Hint: Find the solutions of this congruence modulo 3, modulo 5, and modulo 7, and then use the Chinese Re mainder Theorem.] *61 . Explain why it would not be suitable to use p, where p is a large prime, as the modulus for encryption in the RSA cryptosystem. That is, explain how someone could, without excessive computation, find a private key from the corresponding public key if the modulus were a large prime, rather than the product of two large primes.
3.8 Matrices Introduction Matrices are used throughout discrete mathematics to express relationships between elements in sets. In subsequent chapters we will use matrices in a wide variety of models. For instance, matrices will be used in models of communications networks and transportation systems. Many algorithms will be developed that use these matrix models. This section reviews matrix arithmetic that will be used in these algorithms.
3-81
3 . 8 Matrices
DEFINITION 1
m n
matrices. equal
m
247
n
A matrix is a rectangular array of numbers. A matrix with rows and columns is called x matrix. The plural of matrix is A matrix with the same number of rows
an
as columns is called
square.
Two matrices are
if they have the same number of rows
and the same number of columns and the corresponding entries in every position are equal.
EXAMPLE 1
The matrix
[� n
is a
3
x
2 matrix.
We now introduce some terminology about matrices. Boldface uppercase letters will be used to represent matrices.
DEFINITION 2
Let
all a2l
aln a2n
al2 a22
A=
The ith matrix
The
row
of
A
is the
1 x
(i, element entry j)th
or
n
matrix
[ail, ai2,"" ain]. The jth
of A is the element
column
of
A
is the
nx1
aij, that is, the number in the ith row and
jth column of A. A convenient shorthand notation for expressing the matrix A is to write
A= [aij], which indicates that A is the matrix with its (i, j)th element equal to aij.
Matrix Arithmetic The basic operations of matrix arithmetic will now be discussed, beginning with a definition of matrix addition.
DEFINITION 3
Let the
=
m n (i,
sum
[aij] and B= [bij] be x matrices. The of A and B, denoted by A + B, is matrix that has aij + bij as its j)th element. In other words, A + B= [aij + bij].
mA x n
248
3-82
3 / The Fundamentals: Algorithms, the Integers, and Matrices
The sum of two matrices of the same size is obtained by adding elements in the corresponding positions. Matrices of different sizes cannot be added, because the sum oftwo matrices is defined only when both matrices have the same number of rows and the same number of columns.
EXAMPLE 2
We have
[
1
2
3
] [
0 -1 2 -3 +
4
o
3 1 -1
4-3 1
-1
] [4
4
-2
]
0 = 3 - 1 -3 .
2
2
5
2
We now discuss matrix products. A product oftwo matrices is defined only when the number of columns in the first matrix equals the number of rows of the second matrix.
DEFINITION 4
product
n
of A and B, denoted by AB, is Let A be an m x k matrix and B be a k x matrix. The ofthe corresponding products the of sum the m x matrix with its (i, J)th entry equal to the if AB = [cii]' then words, other In B. of elements from the ith row of A and the Jth column
n
cij =
ail blj
+ ai2b2i + . . . + aik bki.
In Figure 1 the colored row of A and the colored column of B are used to compute the element cii of AB. The product of two matrices is not defined when the number of columns in the first matrix and the number of rows in the second matrix are not the same. We now give some examples of matrix products. EXAMPLE 3
Let o
1 1
2
41
0
2
]
and
Find AB if it is defined.
Extra � Examples �
Solution: Because A is a 4 3 matrix and B is a 3 matrix, the product AB is defined and is a4 matrix. To find the elements of AB, the corresponding elements of the rows of A and the columns ofB are first multiplied and then these products are added. For instance, the element in x2
x
x2
the (3, l )th position of AB is the sum of the products ofthe corresponding elements of the third row of A and the first column of B; namely, 3 . 2 + 1 . 1 + 0 . 3 = 7. When all the elements of
all a 12 a2 l a22 ail
ail
am I am2
FIGURE 1
[
al k a2k bll b 12 b21 b22 b k l b k2 am k aik
The Product of A
=
blj
b2 j
bl' b2n bkn · · ·
bkj
[aii] and B = [hii].
] [CCll21 ClC222 CIC2n. ] . . .
Cm l Cm2
Cij
Cmn
3-83
3 . 8 Matrices
249
[14�
AB are computed, we see that
AB�
not
4, 4
4;
Matrix multiplication is commutative. That is, if A and B are two matrices, it is not necessarily true that AB and BA are the same. In fact, it may be that only one of these two products is defined. For instance, if A is 2 x 3 and B is 3 x then AB is defined and is 2 x however, BA is not defined, because it is impossible to multiply a 3 x matrix and a 2 x 3 matrix. In general, suppose that A is an m x matrix and B is an r x s m atrix. Then AB is defined only when = r and BA is defined only when s = m. Moreover, even when AB and BA are both defined, they will not be the same size unless m = = r = s . Hence, if both AB and BA are defined and are the same size, then both A and B must be square and of the same size. Furthermore, even with A and B both x matrices, AB and BA are not necessarily equal, as Example demonstrates.
4
EXAMPLE 4
n
n
n n
Let A=
U �J
and
B=
Does AB = BA?
[i �J.
Solution: We find that and
n
BA =
[j �J.
Hence, AB f=. BA.
Algorithms for Matrix Multiplication
n [cij ] [bij ]. n
1.
The definition of the product of two matrices leads to an algorithm that computes the product of two matrices. Suppose that C = is the m x matrix that is the product of the m x k matrix A = and the k x matrix B = The algorithm based on the definition of the matrix product is expressed in pseudocode in Algorithm
[aij ]
ALGORITHM 1 Matrix Multiplication.
procedure matrix multiplication(A, for i := 1 to m for j := 1 to n begin
end
Ie
B: matrices)
Cij := 0 for q := 1 to k Cij := Cij + aiq bqj
= [Cij ] is the product of A and B}
250
3-84
3 / The Fundamentals: Algorithms, the Integers, and Matrices
We can determine the complexity of this algorithm in terms of the number of additions and multiplications used. EXAMPLE 5
n n
How many additions of integers and multiplications of integers are used by Algorithm 1 to multiply two x matrices with integer entries?
Solution: There are n 2 entries in the product of A and To find each entry requires a total of n multiplications and n - 1 additions. Hence, a total of n 3 multiplications and n 2 (n - 1 ) additions B.
....
are used.
n n
Surprisingly, there are more efficient algorithms for matrix multiplication than that given in Algorithm 1 . As Example 5 shows, multiplying two x matrices directly from the definition requires multiplications and additions. Using other algorithms, two x matrices can be multiplied using multiplications and additions. (Details of such algorithms can be found in [CoLeRiStO l ] .)
O(n 3 ) O(n..n)
linkS �
EXAMPLE 6
n n
MATRIX-CHAIN MULTIPLICATION There is another important problem involving the complexity of the multiplication of matrices. How should the matrix-chain A I A2 · · · An be computed using the fewest multiplications of integers, where A I , A2 , . . . , An are m l x m 2 , m 2 x m 3 , . . . , m n x m n + 1 matrices, respectively, and each has integers as entries? (Because matrix multiplication is associative, as shown in Exercise 1 3 at the end of this section, the order of the multiplication used does not matter.) Before studying this. problem, note that m l m 2 m 3 multiplications of integers are performed to multiply an m l x m 2 matrix and an m 2 x m 3 matrix using Algorithm 1 (see Exercise 23 at the end of this section). Example 6 illustrates this problem. In which order should the matrices AI, A2 , and A3 -where A I is 30 x 20, A2 is 20 x 40, and A3 is 40 x 1 0, all with integer entries-be multiplied to use the least number of multiplications of integers?
Solution: There are two possible ways to compute A I A2 A3 . These are A I (A2 A3 ) and (A I A2 )A3 . If A2 and A 3 are first multiplied, a total of20 · 40 . 1 0 8000 multiplications of integers are =
used to obtain the 20 x 1 0 matrix A2 A3 . Then, to multiply A I and A2 A3 requires 30 . 20 . 1 0 = 6000 multiplications. Hence, a total of 8000 + 6000
=
1 4,000
multiplications are used. On the other hand, if A I and A2 are first multiplied, then 30 . 20 · 40 = 24,000 multiplications are used to obtain the 30 x 40 matrix A I A2 . Then, to multiply A I A2 and A3 requires 30 x 40 x 1 0 = 1 2,000 multiplications. Hence, a total of 24,000 + 1 2,000
=
36,000
multiplications are used. Clearly, the first method is more efficient. Algorithms for determining the most efficient way to carry out matrix-chain multiplication are discussed in [CoLeRiStO l].
3-85
3.8 Matrices
251
Transposes and Powers of Matrices We now introduce an important matrix with entries that are zeros and ones.
DEFINITION 5
matrix of order n is the n n matrix = [8ij ], where 8ij = 1 if i 8iThej =identity 0 if i #- j . Hence In
x
o
1
0 1
0 0
o
0
1
=
j and
A m n
Multiplying a matrix by an appropriately sized identity matrix does not change this matrix. In other words, when is an x matrix, we have
Powers of square matrices can be defined. When
Ar = '-..-' AAA· · ·A. r
A is an n n matrix, we have x
times
The operation of interchanging the rows and columns of a square matrix is used in many algorithms.
DEFINITION 6
EXAMPLE 7
A = [aij ] hij = aji
m n . . . ,n
transpose A, = . . . ,m.
n= [him matrix j ], then
Let be an x matrix. The of denoted by N, is the x obtained by interchanging the rows and columns of A. In other words, if At for i = 1 , 2, and j 1 , 2,
The transpose of tbe matrix
[! � �]
is the matrix
Un
Matrices that do not change when their rows and columns are interchanged are often im portant.
DEFINITION 7
i
if A = i nsymmetric and 1 j n.
A square matrix A is called
for all and j with 1 ::s
::s
::s
::s
At. Thus A
= [aij] is symmetric if aij aji =
252
3-86
3 / The Fundamentals: Algorithms, the Integers, and Matrices
Note that a matrix is symmetric if and only if it is square and it is symmetric with respect to its main diagonal (which consists of entries that are in the ith row and ith column for some i). This symmetry is displayed in Figure 2.
EXAMPLE 8
The matrix
[ ! b �]
is symmetric.
0 1 0
Zero-One Matrices
FIGURE 2
A
A matrix with entries that are either 0 or 1 is called a zero-one matrix. Zero-one matrices are often used to represent discrete structures, as we will see in Chapters 8 and 9. Algorithms using these structures are based on Boolean arithmetic with zero-one matrices. This arithmetic is based on the Boolean operations v and /\, which operate on pairs of bits, defined by
Symmetric Matrix.
bl bl D EFINITION 8
/\
V
Let A
b2 = { b2 = {
0
=
1
otherwise,
I if 0
bl = b2 bl = b2 = 1 or
otherwise.
1
[aij] and B = [bij] be m
x n zero-one matrices. Then thejoin of A and B is the matrix with (i, j)th entry aij v bij. The j oin of A and B is denoted by A v B. The meet of A and B is the zero-one matrix with (i, j)th entry aij /\ bij. The meet of A and B is denoted by A /\ B. =
zero-one
EXAMPLE 9
I if
Find the join and meet of the zero-one matrices
A= [ Solution:
]
1 0 1 0 1 0 '
We find that the join of
A
v B_ -
The meet of
A
1 1
[
A
/\ B _ -
IVO Ovl Ovl Ivl
A
and B is
] [
IVO 1 _ OvO 1
and B is
[
I/\O 0 /\1 1/\ 0 0 /\1 1/\1 0 /\ 0
] [ _ -
0 0 0 0 1 0
]
.
We now define the Boolean product of two matrices.
3-87
DEFINITION 9
3.8 Matrices
Let A
=
[aij] be an m x k zero-one matrix and B
the Boolean product of A and B, denoted
Cij where Cij
253
=
/\
(ail
b1j) v (ai2
/\
b2 j) v ..
.
= [bij] be a k x n zero-one matrix Then . by A 0 B, is the m x n matrix with (i, j )th entry
V
(aik
/\
bkj).
A B
Note that the Boolean product of and is obtained in an analogous way to the ordinary product of these matrices, but with addition replaced with the operation V and with multiplication replaced with the operation I\. We give an example of the Boolean products of matrices. EXAMPLE 10
Find the Boolean product of
Solution: A0B= [
A B,
The Boolean product
=[
and
where
A0B
is given by
(1/\l ) v (O/\O) (O/\ I) v (1/\O) (1/\l ) v (O/\O)
(1/\l ) v (O/\1) (O/\ I) v (1/\ I) (1/\l ) v (O/\1)
1vO 1vO OVO OvO Ovl Ovl IvO IvO OvO
]
(1/\ 0) V (0 /\1) (O/\O) v (l /\1) (1/\O) v (O/\ I)
]
Algorithm 2 displays pseudocode for computing the Boolean product of two matrices.
ALGORITHM 2 The Boolean Product.
procedure Boolean product (A, for i := 1 to m for j := 1 to n begin
Cij := 0 := 1 to k Cij := Cij V (aiq
forq
{C
end
B: zero-one matrices)
/\
bqj )
= [cij] is the Boolean product of A and B}
We can also define the Boolean powers of a square zero-one matrix. These powers will be used in our subsequent studies of paths in graphs, which are used to model such things as communications paths in computer networks.
254
3-88
3 / The Fundamentals: Algorithms, the Integers, and Matrices
DEFINITION 10
Let A be a square zero-one matrix and let r be a positive integer. Therth Boolean power of A is the Boolean product ofr factors of A. Therth Boolean product of A is denoted by A[r]. Hence A[r]
=
A0A0A0 ... 0A .
\
I
v
r
times
(This is well defined because the Boolean product of matrices is associative.) We also define
A[O] to be In.
EXAMPLE 11
Let A
�
Solution:
[r � � J
n.
We find that
We also find that
A[3]
Find A[,] for all positive integers
�
A [ 2]
0
A
�
[l ] o 1 I 0 1 1
1 o 1
,
Additional computation shows that
The reader can now see that
A n] A ] [
=
[5 for all positive integers
II
n n with
� 5.
The number o f bit operations used to find the Boolean product o f two be easily determined. EXAMPLE 12
How many bit operations are used to find A
Solution: 2n 3
0
nxn n 2.
B, where A and Bare
A0 A0 2n 0
2,
n n x
matrices can
zero-one matrices?
n
There are n 2 entries in B. Using Algorithm a total of DRs and ANDs are used to find an entry of B. Hence, bit operations are used to find each entry. Therefore, .... bit operations are required to compute A B using Algorithm
Exercises
1.uth a) b) c)
U��n
What size is A? What is the third column of A? What is the second row of A?
d) What is the element of A in the (3, 2 )th position? e) What is AI? 2.
[-� � i] ,
Find A + B, where a)
A=
o
-2
-3
3-89
B= b)
0
A=
b) h
c)
A�
0 5 5
9
-2
B=
-2
B�
0
,B -
5
-1
A=
b) h
0 -1 4
c) h
-4
-1 0
-2
-1 o
B�
o . 1
0
2
4
Find a matrix A such that
6.
[Hint; Finding A requires that you solve systems oflinear equations.] Find a matrix A such that
1
7.
8.
15.
Show that matrix addition is associative; that is, show that if A, B, and C are all m x n matrices, then A + (B + C)=(A + B) + c. Let A be a x 4 matrix, B be a 4 x 5 matrix, and C be a 4 x 4 matrix. Determine which of the following products are defined and find the size of those that are defined. a) AB b) BA c) AC d) CA e) BC t) CB What do we know about the sizes of the matrices A and B if both of the products AB and BA are defined? In this exercise we show that matrix multiplication is dis tributive over matrix addition. a) Suppose that A and B are m x k matrices and that C is a k x n matrix. Show that (A + B)C=AC + BC. b) Suppose that C is an m x k matrix and that A and Bare k x n matrices. Show that C(A + B)=CA + CB. In this exercise we show that matrix multiplication is associative. Suppose that A is an m x p matrix, B is a p x k matrix, and C is a k x n matrix. Show that A(BC) =(AB)C. The n x n matrix A= [aij ] is called a diagonal matrix if aij =0 when i =1= j. Show that the product of two n x n diagonal matrices is again a diagonal matrix. Give a sim ple rule for determining this product. Let
3
Find a formula for An, whenever n is a positive integer. 1 6. Show that (Aty =A. 1 7. Let A and B be two n x n matrices. Show that a) (A + B)I =AI + BI. b) (ABY = BlAt. If A and B are n x n matrices with AB = BA=In, then B is called the inverse of A (this terminology is appropriate be cause such a matrix B is unique) and A is said to be invertible. The notation B=A-\ denotes that B is the inverse of A. 18. Show that
[ i ; -�]3
3 2] [7 33] 3 -3 7
1 1 A= 1 4 0 -1
2
14.
255
2
5.
[
13.
2 .
2
B�
1 -1
12.
2 .
0 0 1 0 - 1 - 1 ,B= 1 . - 1 0 -1 0 1 -1 2
11.
-I
_
-2
10.
-1
Find the product AB, where a)
9.
2
Find AB if a)
4.
-I
1 A= -4 B=
3.
[ ; -33 -�l [-3- -3 -3 -�J. [ �l [; �J. [� �J. [! -ll [i J [ 1 ::i] [-' 3 -3- J [ ' '] [ -I] -3 gl [-: 3 -3 n D [ � ��l [-� 3 3 �l
3.8 Matrices
-1 -1
1 o
Let A be an m x n matrix and let 0 be the m x n matrix that has all entries equal to zero. Show that A=0 + A= A + O. Show that matrix addition is commutative; that is, show that if A and B are both m x n matrices, then A + B=B + A.
is the inverse of
19.
Let A be the 2
A=
x 2
matrix
[� �l
3 I The
256
A-I=
A=
22.
23.
24.
25.
26.
-be
0, then
-b
ad �
]be .
[- 11 23J .
al lxl + a12 xZ + . . . + alnxn= h a21x I + azz xz + . . . + aznxn= bz
in the variables X I , Xz , . . . , Xn can be expressed as AX = B, where A = [aij ], X is an n x1 matrix with Xi the entry in its ith row, and B is an n x1 matrix with the entry in its ith row. b) Show that if the matrix A= [aij ] is invertible (as de fined in the preamble to Exercise 1 8), then the solution of the system in part (a) can be found using the equa tion X= A-lB. Use Exercises 1 8 and 26 to solve the system
bi
27.
28.
7xI - 8X2 + 5X3= 5 -4x I + 5Xl - 3X3 = -3 X I - Xl + X 3 = 0
Let
[� �J
A=
Find
ad - be
Find A- I . [Hint: Use Exercise 1 9.] Find A3. Find ( A- I ) 3 . Use your answers to (b) and (c) to show that (A -1)3 is the inverse of A 3 . Let A be an invertible matrix. Show that (An)-I=(A -I)n whenever n is a positive integer. Let A be a matrix. Show that the matrix AN is symmetric. [Hint: Show that this matrix equals its transpose with the help of Exercise 1 7b.] Show that the conventional algorithm uses m I m Z m 3 mul tiplications to compute the product ofthe m I xm z matrix A and the m z xm 3 matrix B. What is the most efficient way to multiply the matrices AI , Az, and A 3 with sizes a) 20 x50, 50 x 10, 10 x40? b) 10 x5, 5 x50, 50 xl ? What is the most efficient way to multiply the matrices A I, Az, A 3 , and � if the dimensions of these matrices are 1 0 x2, 2 x5, 5 x20, and 20 x3, respectively? a) Show that the system of simultaneous linear equations a) b) c) d)
21.
=1=
ad ad
Let
be
[ -e�be
Show that if ad -
20.
3-90
Fundamentals: Algorithms, the Integers, and Matrices
a) A v B.
29.
[i �] 0 1 0
Find a) A v B.
[� �l c) AOB.
and
B�
c) AOB.
b) AA B.
'
[! lJ 1 0 0
Find the Boolean product of A and B, where A=
31.
B=
b) AA B.
Let
A�
30.
and
Let
[
0 0 0 I 0 1 1 I
1]
and
h[i !J 0 0 I
Find
a) A [l] . c) Av A [l] v A [3 ] .
32.
Let A be a zero-one matrix. Show that
33.
In this exercise we show that the meet and join opera tions are commutative. Let A and B be m xn zero-one matrices. Show that
34.
In this exercise we show that the meet and join opera tions are associative. Let A, B, and C be m xn zero-one matrices. Show that
a) Av A=A.
b) AA A= A.
a) Av B=Bv A.
b) BA A= AAB.
a) ( Av B)v C= Av (Bv C). b) (A A B)AC= AA(BAC).
35.
We will establish distributive laws of the meet over the join operation in this exercise. Let A, B, and C be m xn zero-one matrices. Show that a) Av (BAC)=(A v B)A (A v C). b) AA(Bv C)=(A AB)v (A AC).
36. 37.
Let A be an n xn zero-one matrix. Let I be the n xn identity matrix. Show that A 0 I= lOA= A. In this exercise we will show that the Boolean prod uct of zero-one matrices is associative. Assume that A is an m xp zero-one matrix, B is a p xk zero-one matrix, and C is a k xn zero-one matrix. Show that AO(BOC)=(AOB)OC.
3-9/
Key Terms and Results
257
Key Terms and Results TERMS
a finite set of precise instructions for performing a computation or solving a problem searching algorithm: the problem of locating an element in a list linear search algorithm: a procedure for searching a list element by element binary search algorithm: a procedure for searching an or dered list by successively splitting the list in half sorting: the reordering of the elements of a list into nonde creasing order greedy algorithm: an algorithm that makes the best choice at each step I(x) is O(g(x»: the fact that I/(x)1 ::::: C lg(x)1 for all x > k for some constants C and k witness to the relationship I(x) is O(g(x»: a pair C and k such that I/(x)1 ::::: C lg(x)1 whenever x > k I(x) is O(g(x»: the fact that I /(x)1 :::: C lg(x)1 for all x > k for some positive constants C and k I(x) is 8(g(x»: the fact that I(x) is O(g(x» and I(x) is Q(g(x» time complexity: the amount of time required for an algorithm to solve a problem space complexity: the amount of storage space required for an algorithm to solve a problem worst-case time complexity: the greatest amount of time re quired for an algorithm to solve a problem of a given size average-case time complexity: the average amount of time required for an algorithm to solve a problem of a given size a I b (a divides b): there is an integer c such that b = ac prime: a positive integer greater than I with exactly two pos itive integer divisors composite: a positive integer greater than 1 that is not prime Mersenne prime: a prime of the form 2P I , where p is prime gcd(a, b) (greatest common divisor of a and b): the largest integer that divides both a and b relatively prime integers: integers a and b such that gcd(a, b) = 1 pairwise relatively prime integers: a set of integers with the property that every pair of these integers is relatively prime Icm(a, b) (least common multiple of a and b): the smallest positive integer that is divisible by both a and b a mod b: the remainder when the integer a is divided by the positive integer b a = b (mod m) (a is congruent to b modulo m): a - b is divisible by m encryption: the process of making a message secret decryption: the process of returning a secret message to its original form n = (akak-\ a\aO )b: the base b representation of n binary representation: the base 2 representation of an integer hexadecimal representation : the base 1 6 representation of an integer algorithm:
-
• • •
octal representation: the base 8 representation of an integer linear combination of a and b with integer coefficients: a
number of the form sa + tb, where s and t are integers
inverse of a modulo m: an integer 7i such that aa == I (mod m) linear congruence: a congruence ofthe form ax == b (mod m ) ,
where x is a variable
pseudoprime to the base 2:
a composite integer n such that
pseudoprime t o the base b:
a composite integer n such that
2n - \
bn - l
==
I (mod n)
==
1 (mod n)
a composite integer n such that n is a pseudoprime to the base b for all positive integers b with gcd(b, n) = I private key encryption: encryption where both encryption keys and decryption keys must be kept secret public key encryption: encryption where encryption keys are public knowledge, but decryption keys are kept secret matrix: a rectangular array of numbers matrix addition: see page 247 matrix mUltiplication: see page 248 In (identity matrix of order n): the n x n matrix that has entries equal to 1 on its diagonal and Os elsewhere ' A (transpose of A): the matrix obtained from A by inter changing the rows and columns symmetric: a matrix is symmetric if it equals its transpose zero-one matrix: a matrix with each entry equal to either 0 or I A V B (the join of A and B): see page 252 A 1\ B (the meet of A and B): see page 252 A 0 B (the Boolean product of A and B): see page 253 Carmichael number:
RESULTS linear and binary search algorithms: (given in Section 3 . 1 ) bubble sort: a sorting that uses passes where successive items
are interchanged if they are out of order a sorting that at the jth step inserts the jth el ement into the correct position in the list of the sorted first j I elements
insertion sort: -
The linear search has O (n) complexity. The binary search has O (log n) complexity. The bubble and insertion sorts have O (n 2 ) complexity. log n ! is O (n log n). If II (x) is O (gl (X» and hex) is O (g2(X» , then (Ji + h)(x) is o (max(g\ (x), g2(X» ) and (ftf2)(X) is O (gl (x) g2(X » . If ao, aI , . . . , an are real numbers with an =I 0, then an x n + an _ lXn- l + . . . + alX + ao is O (x n ) and 8(x n ). Every positive inte ger can be written uniquely as the product of primes, where the prime factors are written in order of increasing size. division algorithm: Let a and d be integers with d positive. Then there are unique integers q and r with 0 ::::: r < d such that a = dq + r . If a and b are positive integers, then ab = gcd(a, b)-lcm(a , b).
Fundamental Theorem of Arithmetic:
3 / The Fundamentals: Algorithms, the Integers, and Matrices
258
for finding greatest common divisors (see Algorithm 6 in Section 3.6) Let b be a positive integer greater than 1 . Then if n is a positive integer, it can be expressed uniquely in the form n = akbk + ak _ lbk- l + . . . + alb + ao . The algorithm for finding the base b expansion of an integer (see Algorithm 1 in Section 3.6) The conventional algorithms for addition and multiplication of integers (given in Section 3.6) The modular exponentiation algorithm (see Algorithm 5 in Section 3.6) Euclidean algorithm :
3-92
The greatest common divisor of two integers can be expressed as a linear combination with integer coefficients of these integers. Ifm is a positive integer and gcd(a , m) = 1 , then a has a unique inverse modulo m. Chinese Remainder Theorem: A system of linear congru ences modulo pairwise relatively prime integers has a unique solution modulo the product of these moduli. Fermat's Little Theorem: If p is prime and p Ia, then aP - 1 == 1 (mod p).
Review Questions
Define the term algorithm. What are the different ways to describe algorithms? What is the difference between an algorithm for solv ing a problem and a computer program that solves this problem? a) Describe, using English, an algorithm for finding the largest integer in a list of n integers. b) Express this algorithm in pseudocode. c) How many comparisons does the algorithm use? a) State the definition of the fact that f(n) is o (g(n » , where f(n) and g(n) are functions from the set of positive integers to the set of real numbers. b) Use the definition of the fact that fen) is O (g(n» directly to prove or disprove that n 2 + 1 8n + 1 07 is O(n 3 ). c) Use the definition of the fact that fen) is O (g(n» directly to prove or disprove that n 3 is O (n 2 + 1 8n + 1 07). a) How can you produce a big-O estimate for a function that is the sum of different terms where each term is the product of several functions? b) Give a big-O estimate for the function fen) = (n ! + 1 )(2 n + 1 ) + (n n- 2 + 8 n n- 3 )(n 3 + 2n ). For the function g in your estimate f(x) is o (g(x» use a sim ple function of smallest possible order. a) Define what the worst-case time complexity, average case time complexity, and best-case time complexity (in terms of comparisons) mean for an algorithm that finds the smallest integer in a list of n integers. b) What are the worst-case, average-case, and best-case time complexities, in terms of comparisons, ofthe al gorithm that finds the smallest integer in a list of n integers by comparing each of the integers with the smallest integer found so far? a) Describe the linear search and binary search algorithm for finding an integer in a list of integers in increasing order. b) Compare the worst-case time complexities of these two algorithms. c) Is one of these algorithms always faster than the other (measured in terms of comparisons)?
Describe the bubble sort algorithm. Use the bubble sort algorithm to sort the list 5, 2, 4, 1 , 3. c) Give a big-O estimate for the number of comparisons used by the bubble sort. a) Describe the insertion sort algorithm. b) Use the insertion sort algorithm to sort the list 2, 5, 1 , 4, 3. c ) Give a big-O estimate for the number of comparisons used by the insertion sort. a) Explain the concept of a greedy algorithm. b) Provide an example of a greedy algorithm that pro duces an optimal solution and explain why it produces an optimal solution. c) Provide an example of a greedy algorithm that does not always produce an optimal solution and explain why it fails to do so. State the Fundamental Theorem of Arithmetic. a) Describe a procedure for finding the prime factoriza tion of an integer. b) Use this procedure to find the prime factorization of 80,707. a) Define the greatest common divisor of two integers. b) Describe at least three different ways to find the great est common divisor of two integers. When does each method work best? c) Find the greatest common divisor of 1 ,234,567 and 7,654,32 1 . d) Find the greatest common divisor of 2 3 3 5 5 7 791 1 and 29375 5 73 13. a) Define what it means for a andb to be congruent mod ulo 7. b) Which pairs of the integers - 1 1 , -8, -7, - 1 , 0, 3, and 17 are congruent modulo 7? c) Show that if a and b are congruent modulo 7, then l Oa + 1 3 and -4b + 20 are also congruent modulo 7. Describe a procedure for converting decimal (base 10) expansions of integers ipto hexadecimal expansions. a) How can you find a linear combination (with integer coefficients) of two integers that equals their greatest common divisor?
1. a) b) c)
7. a) b)
2.
8.
3.
4.
5.
6.
9.
10. 11.
12.
13.
14. 15.
Supplementary Exercises
3-93
Express gcd(84, 119) as a linear combination of 84 and 119. a) What does it mean for a to be an inverse of a modulo m? b) How can you find an inverse of a modulo m when m is a positive integer and gcd(a , m) = I ? c) Find an inverse of 7 modulo 19. a) How can an inverse of a modulo m be used to solve the congruence ax == b (mod m) when gcd(a , m) = I ? b) Solve the linear congruence 7x == 1 3 (mod 19). a) State the Chinese Remainder Theorem. b) Find the solutions to the system x == 1 (mod 4), x == 2 (mod 5), and x == 3 (mod 7). Suppose that 2n -1 == 1 (mod n). Is n necessarily prime? b)
16.
17,
18.
19.
259
What is the difference between a public key and a private key cryptosystem? b) Explain why using shift ciphers is a private key system. c) Explain why the RSA cipher system is a public key system. Define the product oftwo matrices A and B. When is this product defined? a) How many different ways are there to evaluate the product A I A2 A3 A! by successively multiplying pairs of matrices, when this product is defined? b) Suppose that A I , A2 , A3 , and A! are 1 0 x 20, 20 x 5, 5 x 10, and lO x 5 matrices, respectively. How should A I A2 A3 A! be computed to use the least number of multiplications of entries?
20. a)
21. 22.
Supplementary Exercises 1. a) b) 2. a) b) 3. a) b) 4. a)
b) 5. a)
b) c)
6. a)
Describe an algorithm for locating the last occurrence of the largest number in a list of integers. Estimate the number of comparisons used. Describe an algorithm for finding the first and second largest elements in a list of integers. Estimate the number of comparisons used. Give an algorithm to determine whether a bit string contains a pair of consecutive zeros. How many comparisons does the algorithm use? Suppose that a list contains integers that are in order oflargest to smallest and an integer can appear repeat edly in this list. Devise an algorithm that locates all occurrences of an integer x in the list. Estimate the number of comparisons used. Adapt Algorithm 1 in Section 3 . 1 to find the maxi mum and the minimum of a sequence of n elements by employing a temporary maximum and a temporary minimum that is updated as each successive element is examined. Describe the algorithm from part (a) in pseudocode. How many comparisons of elements in the sequence are carried out by this algorithm? (Do not count com parisons used to determine whether the end of the se quence has been reached.) Describe in detail the steps of an algorithm that finds the maximum and minimum of a sequence of n el ements by examining pairs of successive elements, keeping track of a temporary maximum and a tempo rary minimum. If n is odd, both the temporary maxi mum and temporary minimum should initially equal the first term, and if n is even, the temporary minimum and temporary maximum should be found by compar ing the initial two elements. The temporary maximum and temporary minimum should be updated by com-
paring them with the maximum and minimum of the pair of elements being examined. b) Express the algorithm described in part (a) in pseudo code. c) How many comparisons of elements of the sequence are carried out by this algorithm? (Do not count compari sons used to determine whether the end of the se quence has been reached.) How does this compare to the number of comparisons used by the algorithm in Exercise 5? * 7. Show that the worst-case complexity in terms of compar isons of an algorithm that finds the maximum and mini mum of n elements is at least f3n/21 - 2. 8. Devise an efficient algorithm for finding the second largest element in a sequence of n elements and deter mine the worst-case complexity of your algorithm. The shaker sort (or bidirectional bubble sort) successively compares adjacent pairs of elements, exchanging them ifthey are out of order, and alternately passing through the list from the beginning to the end and then from the end to the beginning until no exchanges are needed. 9. Show the steps used by the shaker sort to sort the list 3, 5, 1 , 4, 6, 2. 10. Express the shaker sort in pseudocode. 1 1 . Show that the shaker sort has 0 (n 2 ) complexity measured in terms of the number of comparisons it uses. 12. Explain why the shaker sort is efficient for sorting lists that are already in close to the correct order. 13. Show that (n log n + n 2 )3 is O(n6). 14. Show that 8x 3 + 12x + 1 00 log x is O(x 3 ). 15. Give a big-O estimate for (x 2 + x(log x)3 ) . (2X + x 3 ). 16. Find a big-O estimate for 2:J = 1 JU + 1). n *1 7. Show that n ! is not O (2 ). o
260
Show that n n is not O (n I). Find four numbers congruent to 5 modulo 1 7. Show that if a and d are positive integers, then there are integers q and r such that a = dq + r where -d 12 < r ::::: d12. �1. Show that if ae == be (mod m), then a == b (mod m i d), where d = gcd(m , e) . �2. How many zeros are at the end of the binary expansion of IOO I O !? 23. Use the Euclidean algorithm to find the greatest common divisor of 10,223 and 33,34 1 . 24. How many divisions are required to find gcd(144, 233) using the Euclidean algorithm? 25. Find gcd(2n + I , 3n + 2), where n is a positive integer. [Hint: Use the Euclidean algorithm.] 26. a) Show that if a and b are positive integers with a ::: b, then gcd(a , b) = a if a = b, gcd(a , b) = 2 gcd(aI2, b12) if a and b are even, gcd(a , b) = gcd(aI2, b) if a is even and b is odd, and gcd(a , b) = gcd(a - b, b) if both a and b are odd. b) Explain how to use (a) to construct an algorithm for computing the greatest common divisor of two posi tive integers that uses only comparisons, subtractions, and shifts of binary expansions, without using any divisions. c) Find gcd(1 202, 4848) using this algorithm. 27. Show that an integer is divisible by 9 if and only if the sum of its decimal digits is divisible by 9. 28. Prove that if n is a positive integer such that the sum of the divisors of n is n + I, then n is prime. 29. Adapt the proofthat there are infinitely many primes (The orem 3 in Section 3.5) to show that are infinitely many primes ofthe form 6k + 5 where k is a positive integer. 30. a) Explain why n div 7 equals the number of weeks in n days. b) Explain why n div 24 equals the number of days in n hours. A set of integers is called mutually relatively prime if the greatest common divisor of these integers is 1 . 3 1 . Determine whether these sets of integers are mutually rel atively prime. a) 8, 1 0, 1 2 b) 1 2, 15, 25 d) 2 1 , 24, 28, 32 c) 15, 2 1 , 28 32. Find a set of four mutually relatively prime integers such that no two of them are relatively prime. 33. a) Suppose that messages are encrypted using the func tion f(p ) = (ap + b) mod 26 such that gcd(a , 26) = 1 . Determine a function that can be used to decrypt messages. b) The encrypted version of a message is LJMKG MGMXF QEXMW Ifit was encrypted using the func tion f(p ) = (7p + 1 0) mod 26, what was the original message? 34. Show that the system of congruences x == 2 (mod 6) and x == 3 (mod 9) has no solutions. �18. 19. 20.
3-94
3 / The Fundamentals: Algorithms, the Integers, and Matrices 35. *36.
*37.
* *38.
39.
40.
41.
Find all solutions of the system of congruences x == 4 (mod 6) and x == 1 3 (mod IS). a) Show that the system of congruences x == al (mod m I ) and x == a2 (mod m 2 ) has a solution if and only if gcd(m l , m 2 ) I al - a2 · b) Show that the solution in part (a) is unique modulo lcm(m l , m 2 ). Prove that if f(x) is a nonconstant polynomial with inte ger coefficients, then there is an integer y such that f(y) is composite. [Hint: Assume that f(xo) = p is prime. Show that p divides f(xo + kp) for all integers k. Obtain a con tradiction ofthe fact that a polynomial of degree n, where n > 1 , takes on each value at most n times.] Show that if a and b are positive irrational numbers such that 1 Ia + 1 I b = 1 , then every positive integer can be uniquely expressed as either Lka J or LkbJ for some posi tive integer k. Show that Goldbach's conjecture, which states that every even integer greater than 2 is the sum of two primes, is equivalent to the statement that every integer greater than 5 is the sum ofthree primes. Prove that there are no solutions in integers x and y to the equation x 2 - 5y 2 = 2. [Hint: Consider this equation modulo 5.] Find An if A is
Show that if A = cI, where e is a real number and I is the n x n identity matrix, then AB = BA whenever B is an n x n matrix. 43. Show that ifA is a 2 x 2 matrix such that AB = BA when ever B is a 2 x 2 matrix, then A = cI, where e is a real number and I is the 2 x 2 identity matrix. An n x n matrix is called upper triangular if aij = 0 when ever i > j . 44. From the definition o f the matrix product, devise an algo rithm for computing the product of two upper triangular matrices that ignores those products in the computation that are automatically equal to zero. 45. Give a pseudocode description of the algorithm in Exer cise 44 for multiplying two upper triangular matrices. 46. How many multiplications of entries are used by the al gorithm found in Exercise 44 for multiplying two n x n upper triangular matrices? 47. Show that ifA and B are invertible matrices and AB exists, then (AB) -I = B - 1 A- I . 48. What is the best order to form the product ABCD if A, B, C, and D are matrices with dimensions 30 x 1 0, l Ox 40, 40x 50, and 50 x 30, respectively? Assume that the number of multiplications of entries used to mul tiply a p x q matrix and a q x r matrix is pqr. 49. Let A be an n x n matrix and let 0 be the n x n matrix all of whose entries are zero. Show that the following are true. 42.
3-95
Computer Projects a) A <:) 0 = 0 <:) A = 0 b) A v 0 = 0 v A = A c) A /\ 0 = 0 /\ A = 0
50. Show that ifthe denominations of coins are co , c l , . . . , ck , where k is a positive integer and c is a positive integer, c > 1 , the greedy algorithm always produces change us ing the fewest coins possible.
261
Devise an algorithm for guessing a number between 1 and 2n - 1 by successively guessing each bit in its binary expansion. 52. Determine the complexity, in terms of the number of guesses, needed to determine a number between 1 and 2n - 1 by successively guessing the bits in its binary expansion. 51.
Computer Projects Write programs with these inputs and outputs.
1. 2. 3. 4. 5. 6. 7.
8.
9.
10. 11.
12. 13. *14.
Given a list of n integers, find the largest integer in the list. Given a list of n integers, find the first and last occurrences of the largest integer in the list. Given a list of n distinct integers, determine the position of an integer in the list using a linear search. Given an ordered list of n distinct integers, determine the position of an integer in the list using a binary search. Given a list of n integers, sort them using a bubble sort. Given a list of n integers, sort them using an insertion sort. Given an integer n, use the greedy algorithm to find the change for n cents using quarters, dimes, nickels, and pennies. Given an ordered list of n integers and an integer x, find the number of comparisons used to determine the position of an integer in the list using a linear search and using a binary search. Given a list of integers, determine the number of compar isons used by the bubble sort and by the insertion sort to sort this list. Given a positive integer, determine whether it is prime. Given a message, encrypt this message using the Caesar cipher; and given a message encrypted using the Caesar cipher, decrypt this message. Given two positive integers, find their greatest common divisor using the Euclidean algorithm. Given two positive integers, find their least common multiple. Given a positive integer, find the prime factorization of this integer.
15. Given a positive integer and a positive integer b greater than 1 , find the base b expansion of this integer. 16. Given the positive integers a, b, and m > 1 , find ab modm.
17. Given a positive integer, find the Cantor expansion of this integer (see the preamble to Exercise 46 of Section 3.6). 18. Given a positive integer n, a modulus m, multiplier a, increment c, and seed xo, with 0 .:s a < m, 0 .:s c < m, and 0 .:s Xo < m, generate the sequence of n pseudo random numbers using the linear congruential generator Xn + 1 = (axn + c) modm. 19. Given positive integers a and b, find integers s and t such that sa + tb = gcd(a, b). 20. Given n linear congruences modulo pairwise relatively prime moduli, find the simultaneous solution ofthese con gruences modulo the product of these moduli. 2 1 . Given an m x k matrix A and a k x n matrix B, find AB.
Given a square matrix A and a positive integer n, find An . 23. Given a square matrix, determine whether it is symmetric. 24. Given an n l x n 2 matrix A, an n 2 x n 3 matrix B, an n 3 x n 4 matriJi. C, and an n 4 x n 5 matrix D, all with inte ger entries, determine the most efficient order to multiply these matrices (in terms of the number of multiplications and additions of integers). 25. Given twom x n Boolean matrices, find their meet and join. 26. Given anm x k Boolean matrix A and a k x n Boolean matrix B, find the Boolean product of A and B. 27. Given a square Boolean matrix A and a positive integer n, find A[n ) . 22.
Computations and Explorations Use a computational program or programs you have written to do these exercises.
1.
We know that nb is O(dn ) when b and d are positive num bers with d :::: 2 . Give values ofthe constants C and k such that nb .:s C dn whenever x > k for each of these sets of values: b = 10, d = 2; b = 20, d = 3; b = 1 000, d = 7.
2.
Compute the change for different values of n with coins of different denominations using the greedy al gorithm and determine whether the smallest number of coins was used. Can you find conditions so that the
262
3.
4. 5.
6. 7.
3 / The Fundamentals:
Algorithms, the Integers, and Matrices
greedy algorithm is guaranteed to use the fewest coins possible? Using a generator of random orderings of the integers I , 2, . . . , n, find the number of comparisons used by the bubble sort, insertion sort, binary insertion sort, and selection sort to sort these integers. Determine whether 2P - I is prime for each ofthe primes not exceeding 1 00. Test a range of large Mersenne numbers 2P - I to de termine whether they are prime. (You may want to use software from the GIMPS project.) Look for polynomials in one variables whose values at long runs of consecutive integers are all primes. Find as many primes of the form n 2 + I where n is a
3-96
8. 9.
10.
11.
positive integer as you can. It is not known whether there are infinitely many such primes. Find 10 different primes each with 100 digits. How many primes are there less than 1 ,000,000, less than 10,000,000, and less than 1 00,000,000? Can you propose an estimate for the number of primes less than x where x is a positive integer? Find a prime factor of each of 10 different 20-digit odd in tegers, selected at random. Keep track of how long it takes to find a factor of each ofthese integers. Do the same thing for 1 0 different 30-digit odd integers, 1 0 different 40-digit odd integers, and so on, continuing as long as possible. Find all pseudoprimes to the base 2, that is, composite integers n such that 2n -1 == I (mod n ) , where n does not exceed 10,000.
Writing Projects Respond to these questions with essays using outside sources.
1. 2.
3.
4. 5.
6.
7.
8.
9.
1 0.
Examine the history of the word algorithm and describe the use of this word in early writings. Look up Bachmann's original introduction of big-O no tation. Explain how he and others have used this notation. Explain how sorting algorithms can be classified into a taxonomy based on the underlying principle on which they are based. Describe the radix sort algorithm. Describe what is meant by a parallel algorithm. Explain how the pseudocode used in this book can be extended to handle parallel algorithms. Explain how the complexity of parallel algorithms can be measured. Give some examples to illustrate this concept, showing how a parallel algorithm can work more quickly than one that does not operate in parallel. Describe the Lucas-Lehmer test for determining whether a Mersenne number is prime. Discuss the progress of the GIMPS project in finding Mersenne primes using this test. Explain how probabilistic primality tests are used in prac tice to produce extremely large numbers that are almost certainly prime. Do such tests have any potential draw backs? The question of whether there are infinitely many Carmichael numbers was solved recently after being open for more than 75 years. Describe the ingredients that went into the proof that there are infinitely many such numbers. Summarize the current status of factoring algorithms in terms oftheir complexity and the size of numbers that can currently be factored. When do you think that it will be feasible to factor 200-digit numbers?
11.
12.
13.
14.
15.
16. 17. 18. 19.
Describe the algorithms that are actually used in modern computers to add, subtract, multiply, and divide positive integers. Describe the history of the Chinese Remainder Theorem. Describe some of the relevant problems posed in Chi nese and Hindu writings and how the Chinese Remainder Theorem applies to them. When are the numbers of a sequence truly random num bers, and not pseudorandom? What shortcomings have been observed in simulations and experiments in which pseudorandom numbers have been used? What are the properties that pseudorandom numbers can have that ran dom numbers should not have? Describe how public key cryptography is being applied. Are the ways it is applied secure given the status of fac toring algorithms? Will information kept secure using public key cryptography become insecure in the future? Describe how public key cryptography can be used to send signed secret messages so that the recipient is rela tively sure the message was sent by the person claiming to have sent it. Show how a congruence can be used to tell the day ofthe week for any given date. Describe some of the algorithms used to efficiently mul tiply large integers. Describe some of the algorithms used to efficiently mul tiply large matrices. Describe the Rabin public key cryptosystem, explaining how to encrypt and how to decrypt messages and why it is suitable for use as a public key cryptosystem.
C HAPTER
Induction and Recursion 4.1 Mathematical
Induction
4.2 Strong
Induction and Well-Ordering
4.3 Recursive
Definitions and Structural Induction
4.4 Recursive
Algorithms
4.5 Program
Correctness
M any mathematical statements assert that a property is true for all positive integers. 3-
n: n!n ::: nn, n n
n
Examples of such statements are that for every positive integer is divisible by 3; a set with elements has 2n subsets; and the sum of the first positive integers is + 1 )/2. A major goal of this chapter, and the book, is to give the student a thorough understanding of mathematical induction, which is used to prove results of this kind. Proofs using mathematical induction have two parts. First, they show that the statement holds for the positive integer 1 . Second, they show that if the statement holds for a positive integer then it must also hold for the next larger integer. Mathematical induction is based on the rule of inference that tells us that if and � + 1 » are true for the domain of positive integers, then is true. Mathematical induction can be used to prove a tremendous variety of results. Understanding how to read and construct proofs by mathematical induction is a key goal of learning discrete mathematics. In Chapters 2 and 3 we explicitly defined sets and functions. That is, we described sets by listing their elements or by giving some property that characterizes these elements. We gave formulae for the values of functions. There is another important way to define such objects, based on mathematical induction. To define functions, some initial terms are specified, and a rule is given for finding subsequent values from values already known. Sets can be defined by listing some of their elements and giving rules for constructing elements from those already known to be in the set. Such definitions, called are used throughout discrete mathematics and computer science. Once we have defined a set recursively, we can use a proof method called structural induction to prove results about this set. When a procedure is specified for solving a problem, this procedure must solve the problem correctly. Just testing to see that the correct result is obtained for a set of input values does not show that the procedure always works correctly. The correctness of a procedure can be guaranteed only by proving that it always yields the correct result. The final section of this chapter contains an introduction to the techniques of program verification. This is a formal technique to verify that procedures are correct. Program verification serves as the basis for attempts under way to prove in a mechanical fashion that programs are correct.
n(n
V P(n)
P(I) Vk(P(k) P(k
recursive definitions,
4.1
always
Mathematical Induction Introduction Suppose that we have an infinite ladder, as shown in Figure 1 , and we want to know whether we can reach every step on this ladder. We know two things: 1 . We can reach the first rung of the ladder. 2. If we can reach a particular rung of the ladder, then we can reach the next rung. Can we conclude that we can reach every rung? By ( 1 ), we know that we can reach the first rung of the ladder. Moreover, because we can reach the first rung, by (2), we can also reach the second rung; it is the next rung after the first rung. Applying (2) again, because we can reach the second rung, we can also reach the third rung. Continuing in this way, we can show that we can reach the fourth rung, the fifth rung, and so on. For example, after 1 00 uses of (2), we know that we can reach the 1 0 1 st rung. But can we conclude that we are able to reach every rung of this
4- /
263
264
4 / Induction and Recursion
FIGURE 1
4-2
Climbing an Infinite Ladder.
infinite ladder? The answer is yes, something we can verify using an important proof technique called mathematical induction. That is, we can show that P (n) is true for every positive integer n , where P(n) is the statement that we can reach the nth rung of the ladder. Mathematical induction is an extremely important proof technique that can be used to prove assertions of this type. As we will see in this section and in subsequent sections of this chapter and later chapters, mathematical induction is used extensively to prove results about a large variety of discrete objects. For example, it is used to prove results about the complexity of algorithms, the correctness of certain types of computer programs, theorems about graphs and trees, as well as a wide range of identities and inequalities. In this section, we will describe how mathematical induction can be used and why it is a valid proof technique. It is extremely important to note that mathematical induction can be used only to prove results obtained in some other way. It is not a tool for discovering formulae or theorems.
Mathematical Induction
Assessmenl �
In general, mathematical induction* can be used to prove statements that assert that P(n) is true for all positive integers n, where P (n) is a propositional function. A proof by mathe matical induction has two parts, a basis step, where we show that P ( l ) is true, and an in ductive step, where we show that for all positive integers k, if P(k) is true, then P(k + 1 ) i s true. · Unfortunately. using the terminology "mathematical induction" clashes with the terminology used to describe different types of reasoning. In logic, deductive reasoning uses rules of inference to draw conclusions from premises, whereas inductive reasoning makes conclusions only supported, but not ensured, by evidence. Mathematical proofs, including arguments that use mathematical induction, are deductive, not inductive.
4-3
4. 1 Mathematical Induction
265
PRINCIPLE OF MATHEMATICAL INDUCTION To prove that P en ) is true for all pos itive integers n , where P (n ) is a propositional function, we complete two steps: BASIS STEP:
We verify that p e l ) is true.
INDUCTIVE S TEP:
all positive integers k.
We show that the conditional statement P (k)
-+
P (k + 1 ) is true for
To complete the inductive step of a proof using the principle of mathematical induction, we assume that P(k) is true for an arbitrary positive integer k and show that under this assumption, P(k + 1 ) must also be true. The assumption that P(k) is true is called the inductive hypothesis. Once we complete both steps in a proof by mathematical induction, we have shown that P en) is true for all positive integers, that is, we have shown that Yn P (n ) is true where the quantification is over the set of positive integers. In the inductive step, we show that Yk( P(k) -+ P (k + 1 )) is true, where again, the domain is the set of positive integers. Expressed as a rule of inference, this proof technique can be stated as [P(1)
A
Y k(P(k)
-+
P (k + 1 )) ]
-+
Y n P (n ) ,
when the domain is the set of positive integers. Because mathematical induction is such an important technique, it is worthwhile to explain in detail the steps of a proof using this technique. The first thing we do to prove that P en ) is true for all positive integers n is to show that p e l ) is true. This amounts to showing that the particular statement obtained when n is replaced by 1 in Pen ) is true. Then we must show that P (k) -+ P (k + 1 ) is true for every positive integer k. To prove that this conditional statement is true for every positive integer k, we need to show that P(k + 1 ) cannot be false when P (k) is true. This can be accomplished by assuming that P(k) is true and showing that P (k + 1 ) must also be true.
under this hypothesis not ifi! is assumed
Remark: In a proofby mathematical induction it is
assumed that P (k) is true for all positive integers ! It is only shown that that P (k) is true, then P (k + 1 ) is also true. Thus, a proof by mathematical induction is not a case of begging the question, or circular reasoning.
When we use mathematical induction to prove a theorem, we first show that P ( 1 ) is true. Then we know that P (2) is true, because P ( 1 ) implies P (2). Further, we know that P (3) is true, because P (2) implies P (3). Continuing along these lines, we see that P en ) is true for every positive integer n . WAYS TO REMEMBER HOW MATHEMATICAL INDUCTION WORKS Thinking of the infinite ladder and the rules for reaching steps can help you remember how mathematical induction works. Note that statements ( 1 ) and (2) for the infinite ladder are exactly the basis step and inductive step, respectively, of the proof that P en ) is true for all positive integers n , where P(n) is the statement that we can reach the nth rung of the ladder. Consequently, we can invoke mathematical induction to conclude that we can reach every rung. Besides the infinite ladder, several other useful illustrations of mathematical induction can help you remember how this principle works. One ofthese involves a line of people: person one,
unkS �
H I STORICAL NOTE The first known use of mathematical induction is in the work of the sixteenth-century mathematician Francesco Maurolico ( 1 494 - 1 575). Maurolico wrote extensively on the works of classical mathematics and made many contributions to geometry and optics. In his book Arithmeticorum Libri Duo. Maurolico presented a variety of properties of the integers together with proofs of these properties. To prove some of these properties he devised the method of mathematical induction. His first use of mathematical induction in this book was to prove that the sum of the first n odd positive integers equals n2•
266
4 I Induction and Recursion
4-4
,.�""" 14 Pt",,,, /J Ptrsan 12
Person
1 Person
FIGURE 2
Persan l l Per�on / 0 Person 9 Person 8 Person 7 Person 6 Person 5 Person 4
2 Person 3
People Telling Secrets.
person two, and so on. A secret is told to person one, and each person tells the secret to the next person in line, if the former person hears it. Let P en ) be the proposition that person n knows the secret. Then P ( 1 ) is true, because the secret is told to person one; P (2) is true, because person one tells person two the secret; P (3) is true, because person two tells person three the secret; and so on. By the principle of mathematical induction, every person in line learns the secret. This is illustrated in Figure 2. (Of course, it has been assumed that each person relays the secret in an unchanged manner to the next person, which is usually not true in real life.) Another way to illustrate the principle of mathematical induction is to consider an infinite row of dominoes, labeled 1 , 2 , 3 , . . . , n , . . . , where each domino is standing up. Let P en ) be the proposition that domino n is knocked over. If the first domino is knocked over-i.e., if P ( 1 ) i s true-and if, whenever the kth domino i s knocked over, i t also knocks the (k + l )st domino over-i.e., if P (k) -+ P (k + 1 ) is true for all positive integers k-then all the dominoes are knocked over. This is illustrated in Figure 3 .
Examples o f Proofs b y Mathematical Induction Many theorems state that P en ) is true for all positive integers n , where p en) is a propositional function, such as the statement that 1 + 2 + . . . + n = n (n + 1 )/2 for all positive integers n or the statement that n ::::: 2 n for all positive integers n . Mathematical induction is a technique
FIGURE 3
Illustrating How Mathematical Induction Works Using Dominoes.
4-5
4. 1
Mathematical Induction
267
lin P(n),
unks �
for proving theorems of this kind. In other words, mathematical induction can be used to prove statements of the form where the domain is the set of positive integers. Mathematical induction can be used to prove an extremely wide variety of theorems, each of which is a statement of this form. We will use a variety of examples to illustrate how theorems are proved using mathematical induction. The theorems we will prove include summation formulae, inequalities, identities for combinations of sets, divisibility results, theorems about algorithms, and some other creative results. In later sections, we will employ mathematical induction to prove many other types of results, including the correctness of computer programs and algorithms. Mathematical induction can be used to prove a wide variety of theorems, not just summation formulae, inequalities, and other types of examples we illustrate here. Note that there are many opportunities for error in induction proofs. We will describe some incorrect proofs by mathematical induction at the end of this section and in the exercises. PROVING SUMMATION FORMULAE We begin by using mathematical induction to prove several different summation formulae. As we will see, mathematical induction is particularly well suited for proving that such formulae are valid. However, summation formulae can be proven in other ways. This is not surprising because there are often different ways to prove a theorem. The major disadvantage of this use of mathematical induction is that you cannot use it to derive a summation formula. That is, you must already have the formula before you attempt to prove it by mathematical induction. We begin by using mathematical induction to prove a formula for the sum of the smallest positive integers.
EXAMPLE 1
Ex1ra � ExamPles lliiiiiiil
n
n 1 + 2 + · · · + n = n(n 2+ 1) . Solution: P(n) n n(n + 1)/2 . P(n) n = 1, 2, P(I) P(k) P(k + 1) k = 1, 2, P(1) . 1 = 1(1 2+ 1) . P(k) k. 1 + 2 + · · · + k = k(k 2+ 1) . P(k + 1) 1 + 2 + . . . + k + (k + 1) = (k + 1) [(k2+ 1) + 1 ] = (k + 1)(k2 + 2) k+ 1 P(k), 1 + 2 + . . . + k + (k + 1) = k(k 2+ 1) + (k + 1) k(k + 1) + 2(k + 1) 2 = (k + 1)(k2 + 2) Show that if
is a positive integer, then
Let be the proposition that the sum of the first positive integers is We must do two things to prove that is true for 3, . Namely, we must show that is true and that the conditional statement implies is true for 3, . . . . .
BASIS STEP:
IS
.
true, because
For the inductive hypothesis we assume that That is, we assume that
INDUCTIVE S TEP:
positive integer
.
Under this assumption, it must be shown that
is also true. When we add
holds for an arbitrary
is true, namely, that
to both sides of the equation in
we obtain
268
4 / Induction and Recursion
4-6
This last equation shows that P (k + 1) is true under the assumption that P(k) is true. This completes the inductive step. We have completed the basis step and the inductive step, so by mathematical induction we know that P is true for all positive integers That is, we have proven that 1 + 2 + . . . + =
(n)
n(n
n.
n.
n
As we noted, mathematical induction is not a tool for finding theorems about all positive integers. Rather, it is a proof method for proving such results once they are conjectured. In Example 2, using mathematical induction to prove a summation formula, we will both formulate and then prove a conjecture. EXAMPLE 2
Conjecture a formula for the sum of the first using mathematical induction.
Solution:
The sums of the first
I = I, 1 + 3 + 5 + 7 = 1 6,
n
n
n
positive odd integers. Then prove your conjecture
positive odd integers for
1 + 3 = 4, 1 +3+5+7+
9 = 25 .
n
= 1 , 2, 3 , 4, 5 are
1+3+5=
(2n
9,
proven
conjecture
From these values it is reasonable to conjecture that the sum of the first positive odd integers is 2 , that is, 1 + 3 + 5 + . . . + - 1 ) = n 2 • We need a method to that this is correct, if in fact it is. Let denote the proposition that the sum of the first odd positive integers is n 2 • Our conjecture is that is true for all positive integers. To use mathematical induction to prove this conjecture, we must first complete the basis step; that is, we must show that P ( 1 ) is true. Then we must carry out the inductive step; that is, we must show that P (k + 1 ) is true when P (k) is assumed to be true. We now attempt to complete these two steps.
pen) pen)
n
P ( l ) states that the sum of the first one odd positive integer is 1 2 . This is true because the sum of the first odd positive integer is 1 . The basis step is complete. BASIS S TEP:
To complete the inductive step we must show that the proposition P (k) -+ P (k + I ) is true for every positive integer k. To do this, we first assume the inductive hypothesis. The inductive hypothesis is the statement that P (k) is true, that is, INDUCTIVE S TEP:
I + 3 + 5 + . . . + (2k - I) = k2 • [Note that the kth odd positive integer is (2k - 1 ), because this integer is obtained by adding 2 a total of k - 1 times to 1 .] To show that Vk(P (k) -+ P (k + I » is true, we must show that if P(k) is true (the inductive hypothesis), then P (k + I ) is true. Note that P (k + I ) is the statement that 1 + 3 + 5 + . . . + (2k - 1 ) + (2k + 1 ) = (k + 1 )2 . So, assuming that P (k) is true, it follows that 1 + 3 + 5 + . . . + (2k
-
1 ) + (2k + 1 ) = [ 1 + 3 + . . . + (2k - 1 )] + (2k + 1 ) = k2 + (2k + 1 ) = k2 + 2 k + 1 = (k + l ) 2 .
This shows that P (k + 1 ) follows from P (k). Note that we used the inductive hypothesis P (k) in the second equality to replace the sum of the first k odd positive integers by k2 .
4- 7
4. 1
Mathematical Induction 269
We have now completed both the basis step and