Instructor’s Guide for Poole’s
Linear Algebra Second Edition
Michael Prophet Douglas Shaw University of Northern Iowa
Australia • Canada • Mexico • Singapore • Spain United Kingdom • United States
Mathematics Editor: John-Paul Ramin
Assistant Editor: Stacy Green
Compositor: Andy Bulman-Fleming
COPYRIGHT © 2006 Brooks/Cole, a division of Thomson Learning, Inc. Thomson Learning™ is a trademark used herein under license.
Thomson Higher Education 10 Davis Drive Belmont, CA 94002-3098 USA
ALL RIGHTS RESERVED. Instructors of classes adopting Linear Algebra, Second Edition, by David Poole, as an assigned textbook may reproduce material from this publication for classroom use or in a secure electronic network environment that prevents downloading or reproducing the copyrighted material. Otherwise, no part of this work covered by the copyright hereon may be reproduced or used in any form or by any means (graphic, electronic, or mechanical, including but not limited to photocopying, recording, taping, Web distribution, information networks, or information storage and retrieval systems) without the written permission of the publisher.
Printed in the United States of America 1 2 3 4 5 6 7 07 06 05
Asia Thomson Learning 5 Shenton Way #01-01 UIC Building Singapore 068808
For more information about our products, contact us at: Thomson Learning Academic Resource Center 1-800-423-0563 For permission to use material from this text or product, submit a request online at http://www.thomsonrights.com Any additional questions about permissions can be submitted by email to
[email protected]
Trademarks Maple is a registered trademark of Waterloo Maple, Inc. Mathematica is a registered trademark of Wolfram Research, Inc. Matlab is a registered trademark of The MathWorks, Inc.
ISBN 0-534-99861-5
Australia/New Zealand Nelson Thomson Learning 102 Dodds Street Southbank, Victoria 3006 Australia
Canada Thomson Nelson 1120 Birchmount Road Toronto, Ontario M1K 5G4 Canada
UK/Europe/Middle East/South Africa Thomson Learning High Holborn House 50/51 Bedford Row London WC1R 4LR United Kingdom
Latin America Thomson Learning Seneca, 53 Colonia Polanco 11560 Mexico D.F. Mexico
Spain (includes Portugal) Thomson Paraninfo Calle/Magallanes, 25 28015 Madrid Spain
Preface
The purpose of this Instructor’s Guide is to save you time while helping you to teach an honest, interesting, student-centered course. For each section, there are suggested additions to your lecture that can supplement (but not replace) things like taking the inverse of $\begin{bmatrix} 2 & 1 & 6 \\ 3 & 1 & -1 \\ 5 & 2 & -1 \end{bmatrix}$. Speaking of problems like that, for each section we have some worked-out routine examples that you can use straight out of our guide without having to do the computations yourself ahead of time.
Lecturing is not your only option, of course. This guide provides group activities, ready for copying, that will allow your students to discover and explore the concepts of linear algebra. You may find that your classes become more “fun”, but we assure you that this unfortunate by-product of an engaged student population can’t always be avoided.
This guide was designed to be used with Linear Algebra, Second Edition as a source of both supplementary and complementary material. Depending on your preferences, you can occasionally glance through the Guide for content ideas and alternate approaches, or you can use it as a major component in planning your day-to-day classes. In addition to lecture notes and group activities, each section has technology tips, sample homework assignments, and sample quiz questions.
The unfortunate among us remember two linear algebra courses: the dry, computational one where we do Gaussian elimination by hand on larger and larger systems, and the dry, theoretical one where students do the same induction proof on the same abstract vector spaces repeatedly, occasionally taking a break to find yet another method of putting things in row-echelon form. Poole’s book finds a third way, bringing to life the beautiful structures that we linear algebra fans love so much, and adding accessible applications that show how this wonderful subject can be applied. We were proud to write this Instructor’s Guide in that spirit.
We value reactions from all of our colleagues who are teaching from this guide, both to correct any errors and to suggest additional material for future editions. We are especially interested in which features of the guide are the most and the least useful. Please email any feedback to
[email protected]. We would like to thank David Poole, John-Paul Ramin and Bob Pirtle for giving us this opportunity. Stacy Green has been a wonderful editor, and her guidance throughout this project has been greatly appreciated. The “Find the Error” problems came from a raid of Dr. Poole’s notes (we added the moon pie taunts). Melissa Potter’s help has been invaluable on this and other projects. She always does wonderful work: catching errors, offering suggestions, and reminding us that students are people, and that we should stop using the definite article to describe them. When deadlines slipped, she was always ready to work on short notice. Thanks again, Melissa.
Andy Bulman-Fleming, one of the best typesetters in the business, again did a stellar job, with turnaround times that we had no right to ask for. When we agreed to write this book, our first question was, “Can we have Andy?” because we love the clarity of his graphics and the smoothness of his design. We think it makes a big difference, and we’re sure you will agree. We would also like to thank the University of Northern Iowa mathematics department, for always encouraging its professors to try new things and to pursue their passions. Our colleagues, department head, and dean have been very supportive. This Instructor’s Guide has meant many late evenings, weekend meetings, sudden phone calls, more late evenings, and bringing laptop computers to the Village Inn during breakfast. Our wives, Laurel and Margaret, have never complained about being Brooks/Cole Widows, and we hope they will be happy to have us back. We are proud to dedicate this book to them. Michael Prophet Douglas Shaw
Contents
How to Use the Instructor’s Guide
1 Vectors
1.1 The Geometry and Algebra of Vectors
1.2 Length and Angle: The Dot Product
1.3 Lines and Planes
1.4 Code Vectors and Modular Arithmetic
2 Linear Equations
2.1 Introduction to Systems of Linear Equations
2.2 Direct Methods for Solving Linear Systems
2.3 Spanning Sets and Linear Independence
2.4 Applications
2.5 Iterative Methods for Solving Linear Systems
3 Matrices
3.1 Matrix Operations
3.2 Matrix Algebra
3.3 The Inverse of a Matrix
3.4 LU Factorization
3.5 Subspaces, Basis, Dimension, and Rank
3.6 Introduction to Linear Transformations
3.7 Applications
4 Eigenvalues and Eigenvectors
4.1 Introduction to Eigenvalues and Eigenvectors
4.2 Determinants
4.3 Eigenvalues and Eigenvectors of n × n Matrices
4.4 Similarity and Diagonalization
4.5 Iterative Methods for Computing Eigenvalues
4.6 Applications and the Perron-Frobenius Theorem
5 Orthogonality
5.1 Orthogonality in Rn
5.2 Orthogonal Complements and Orthogonal Projections
5.3 The Gram-Schmidt Process and the QR Factorization
5.4 Orthogonal Diagonalization of Symmetric Matrices
5.5 Applications
6 Vector Spaces
6.1 Vector Spaces and Subspaces
6.2 Linear Independence, Basis, and Dimension
6.3 Change of Basis
6.4 Linear Transformations
6.5 The Kernel and Range of a Linear Transformation
6.6 The Matrix of a Linear Transformation
6.7 Applications
7 Distance and Approximation
7.1 Inner Product Spaces
7.2 Norms and Distance Functions
7.3 Least Squares Approximation
7.4 The Singular Value Decomposition
7.5 Applications
How to Use the Instructor’s Guide
For each section of Linear Algebra, Second Edition, this Instructor’s Guide provides information on the items listed below.
1. Suggested Time and Emphasis
These suggestions assume that the class is fifty minutes long. They also advise whether or not the material is essential to the rest of the course. If a section is labeled “optional”, the time range given is the amount of time for the material in the event that it is covered.
2. Points to Stress
This is a short summary of the big ideas to be covered.
3. Sample Questions
• Drill Questions: Some instructors have reported that they like to open or close class by handing out a single question. These questions are designed to be straightforward “right down the middle” questions for students who have read but not yet mastered the material.
• Discussion Questions: These are more open-ended questions designed to provoke a lively conversation among the students, or between the students and the instructor. While most of them have answers, some of them do not, and some of them will be answered later on in the course. The idea here is to get the students talking mathematics, as opposed to talking about mathematics.
• Test Questions: These questions are meant to be interesting ones to add to an exam. We do not recommend making all of the questions like ours; the questions we provide are meant to add spice to a more routine test.
4. Lecture Notes
These suggestions are meant to work along with the text to create a classroom atmosphere of experimentation and inquiry.
5. Lecture Examples
These are routine examples with all the computations worked out, designed to save a bit of time in class preparation.
6. Tech Tips
Many students have access to symbolic algebra packages like Maple, Mathematica, Matlab, and the TI-89 or TI-92 calculators. These tips are meant to give you ideas on incorporating technology into your course.
7. Group Work
One of the main difficulties instructors have in presenting group work to their classes is that of choosing an appropriate group task. Suggestions for implementation and answers to the group activities are provided first, followed by photocopy-ready handouts on separate pages. This guide’s main philosophy of group work is that there should be a solid introduction to each exercise (“What are we supposed to do?”) and good closure before class is dismissed (“Why did we just do that?”).
8. Sample Core Assignment
Every teacher has a different philosophy on assigning homework. These problems have been selected to form the basis for a homework assignment. Many instructors will want to assign a superset of our problems, and some might want to trim them slightly. The problems that require proofs as answers are marked with a superscript P.
1 Vectors
1.1 The Geometry and Algebra of Vectors
Suggested Time and Emphasis 1 class. Essential material.
Points to Stress 1. A vector must have an initial point and a terminal point. 2. Translation invariance of vectors, including the concept of standard position. 3. The geometric interpretation of vector addition. 4. Linear combinations.
Drill Question Find a vector v that can be written as a linear combination of [1, 1, 1] and [2, 0, 3]. Answer Answers will vary.
Discussion Question Is there a vector w in Rn, other than 0, that has the property v + w = v for every v in Rn? Answer No
Test Question Can every element of R3 be written as a linear combination of v1 = [2, 1, 0], v2 = [0, 1, 0], and v3 = [2, 2, 0]? Answer No; any linear combination of these vectors has z-component 0.
Lecture Notes
• Draw [3, 1] and [1, 3] on the board in standard position. Have the students draw vectors in standard position, and show them how they can always be “swallowed” by linear combinations of these two vectors. Examples are given below.
[Figures: four xy-plane plots of example vectors in standard position.]
• Assume v1 and v2 are in standard position. Demonstrate that the vector v3 = v1 − v2 can be viewed as the vector with initial point v2 and terminal point v1. Draw this picture in R2 and (if you can) in R3.
• Ask the students to describe the set of all possible linear combinations of [1, 0] and [0, 1]. Extend to all linear combinations of [1, 0, 0], [0, 1, 0], and [0, 0, 1]. Then ask them to explore the linear combinations of [1, 0, 0], [0, 0, 1], and [−2, 0, 2].
• Let v1 = [1, 2, 3] and let v2 = [4, 5, 6]. Find a third vector v3, such as [5, 7, 9], so that R3 is not the set of all linear combinations of v1, v2, and v3.
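The spanning questions in these notes can be checked quickly before class. The following is a minimal sketch in Python (our choice of language, not the text's; the helper name det3 is hypothetical). It relies on the fact that three vectors in R3 span R3 exactly when the determinant of the matrix they form is nonzero. Determinants are not introduced until Chapter 4, so this is meant as a check for the instructor rather than something to show students at this point.

```python
def det3(v1, v2, v3):
    """Determinant of the 3x3 matrix whose rows are v1, v2, v3."""
    a, b, c = v1
    d, e, f = v2
    g, h, i = v3
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

# [5, 7, 9] = [1, 2, 3] + [4, 5, 6], so these three vectors are dependent.
print(det3([1, 2, 3], [4, 5, 6], [5, 7, 9]))    # 0: they do not span R^3
# The standard basis vectors span R^3.
print(det3([1, 0, 0], [0, 1, 0], [0, 0, 1]))    # 1
# Every combination of these three has y-component 0.
print(det3([1, 0, 0], [0, 0, 1], [-2, 0, 2]))   # 0
```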
Lecture Examples
• Geometric and algebraic addition and subtraction:
[Figures: two xy-plane plots illustrating the sum and the difference below.]
[3, −2] + [−5, −3] = [−2, −5]
[3, −2] − [−5, −3] = [8, 1]
• Addition of 4-vectors: [4, 2, 9, π] + [−4, 1, 1, e] = [0, 3, 10, e + π]
Tech Tip Show students how to draw vectors in Maple using the arrow command. Consider drawing a sequence of vectors of the form [cos t, sin t], where t ranges between 0 and 2π .
Group Work 1: The Spanning Set
The purpose of this activity is to give the students a sense of how two non-parallel, two-dimensional vectors span an entire plane.
Start by giving each student or group of students a sheet of regular graph paper and a transparent grid of parallelograms formed by two vectors u and v.
[Figure: grid of parallelograms formed by the vectors u and v.]
Next give each student a point (x, y) in the plane. By placing the grid over the graph paper they should estimate values of r and s such that [x, y] = ru + sv. Repeat for several other points, including some that require negative values of r and/or s, until the students have convinced themselves that every point in the plane can be expressed in this manner. Now repeat the activity with different vectors u and v, perhaps using the same points as before. As a wrap-up, give the students specific vectors u and v, such as u = [3, 1] and v = [−1, −2], and have them determine values of r and s for several points. See if they can find general formulas for r and s in terms of the point (x, y). What “goes wrong” algebraically if u and v are parallel?
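For the wrap-up, it may help to have exact values of r and s on hand. The sketch below (Python; the function name coefficients and the sample points are our own, hypothetical choices) solves [x, y] = ru + sv by Cramer's rule. For u = [3, 1] and v = [−1, −2] the general formulas work out to r = (2x − y)/5 and s = (x − 3y)/5. When u and v are parallel, the quantity u1v2 − u2v1 is zero, which is the division by zero that "goes wrong" algebraically.

```python
from fractions import Fraction

def coefficients(u, v, p):
    """Return (r, s) with p = r*u + s*v, using Cramer's rule in R^2."""
    det = u[0] * v[1] - u[1] * v[0]
    if det == 0:
        # Parallel vectors: the system has no unique solution.
        raise ValueError("u and v are parallel")
    r = Fraction(p[0] * v[1] - p[1] * v[0], det)
    s = Fraction(u[0] * p[1] - u[1] * p[0], det)
    return r, s

u, v = [3, 1], [-1, -2]
for point in [(2, -1), (5, 5), (-4, 3)]:   # sample points, chosen arbitrarily
    r, s = coefficients(u, v, point)
    print(point, "r =", r, "s =", s)
```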
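Group Work 2: Cones in General
This activity will introduce students to the concept of the cone of a set of vectors. You might want to stop after the first question and make sure they all understand the definition, or give a different example to do all together before setting them loose.
Answers
1. [Figure: the shaded wedge between the positive x-axis and the ray through [1, 1].]
2. [Figure: the shaded wedge between the ray through [−1, 1] and the ray through [1, 1].]
3. No. The three generators are linearly independent, so the only way to write the vector is [0, 7, 3] = 2 [1, 1, 1] + 2 [−1, 2, 1] + (−1) [0, −1, 1], and the last coefficient is negative.
4. Yes. For example, [0, 7, 3] = 0 [1, 1, 1] + 0 [−1, 2, 1] + 3 [0, −1, 1] + 10 [0, 1, 0].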
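Because the three generators in Question 3 are linearly independent, the coefficients of [0, 7, 3] are unique, so membership in the cone comes down to their signs. A minimal sketch of that check, in Python with hypothetical helper names (det3, coefficients, in_cone), assuming the generators are independent:

```python
from fractions import Fraction

def det3(r1, r2, r3):
    """Determinant of the 3x3 matrix with rows r1, r2, r3."""
    a, b, c = r1
    d, e, f = r2
    g, h, i = r3
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def coefficients(v1, v2, v3, target):
    """Solve target = c1*v1 + c2*v2 + c3*v3 by Cramer's rule.

    Assumes v1, v2, v3 are linearly independent, so the solution is unique.
    """
    d = det3(v1, v2, v3)
    return (Fraction(det3(target, v2, v3), d),
            Fraction(det3(v1, target, v3), d),
            Fraction(det3(v1, v2, target), d))

def in_cone(v1, v2, v3, target):
    """True when every (unique) coefficient is nonnegative."""
    return all(c >= 0 for c in coefficients(v1, v2, v3, target))

gens = ([1, 1, 1], [-1, 2, 1], [0, -1, 1])
print(coefficients(*gens, [0, 7, 3]))   # the coefficients are 2, 2, -1
print(in_cone(*gens, [0, 7, 3]))        # False: one coefficient is negative
```

Once the fourth generator [0, 1, 0] is allowed, coefficients are no longer unique, and 3 [0, −1, 1] + 10 [0, 1, 0] = [0, 7, 3] is one nonnegative combination.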
Suggested Core Assignment Note: Exercises requiring proofs are marked with a superscript P.
Exercises 2, 5, 6, 12, 14, 18, 22, 24P
Group Work 1, Section 1.1 The Spanning Set
Group Work 2, Section 1.1 Cones in General
The cone in Rn generated by a collection of vectors v1, v2, v3, . . . , vk is the set of all nonnegative linear combinations of these vectors, drawn in standard position. (A nonnegative linear combination is of the form $\sum_{i=1}^{k} c_i v_i$, where $c_i \geq 0$, i.e. a linear combination where all of the coefficients are nonnegative.)
1. Cones in R2 are easy to identify. Shade in the cone generated by v1 = [1, 1] and v2 = [2, 0].
2. Shade in the cone generated by v1 = [−1, 1] , v2 = [0, 2], and v3 = [1, 1].
3. Does the vector [0, 7, 3] belong to the cone generated by [1, 1, 1], [−1, 2, 1], and [0, −1, 1]?
4. Does the vector [0, 7, 3] belong to the cone generated by [1, 1, 1], [−1, 2, 1], [0, −1, 1], and [0, 1, 0]?
1.2 Length and Angle: The Dot Product
Suggested Time and Emphasis 1 class. Essential material.
Points to Stress 1. The dot product and its properties. 2. Definition of orthogonality. 3. Length of a vector. 4. Projections.
Drill Question Given the vectors a and b below, draw proj_a b and proj_b a.
[Figure: the vectors a and b.]
Discussion Question Investigate the following question: What is the maximum number of mutually orthogonal nonzero vectors in Rk?
It’s easy to see geometrically that in R2, there exist two mutually orthogonal vectors and there cannot exist three. In R3, sketch the lines determined by three mutually orthogonal vectors. Notice that this creates a set of coordinate axes just like the standard ones. Every element of R3 belongs to one of the “octants” created by your new axes, and thus every element of R3 is a linear combination of the three mutually orthogonal vectors. Show why this implies that there cannot be a fourth nonzero vector that is orthogonal to the other three.
Test Question Consider the following pairs of vectors, all of which have length 1:
[Figure: four pairs of unit vectors, (a, b), (c, d), (e, f), and (g, h).]
Put the following quantities in order, from smallest to largest: a · b, c · d, e · f, g · h
Answer e · f, c · d, a · b, g · h
Lecture Notes
• Assume that a · b = a · c and a ≠ 0. Pose the question, “Is it necessarily true that b = c?” When you’ve convinced them (perhaps by example) that the answer is “no”, the next logical question to ask is, “What can we say about b and c?” It can be shown that b and c have the same projection onto a, since a ⊥ (b − c).
• Derive the formula for proj_u v as in Exercise 57: The vector proj_u v must be a scalar multiple of u, and thus we know that proj_u v = cu for some c. By definition, v − cu is orthogonal to u, so we can use this fact to solve for c.
• Demonstrate the distributive property in general. Note that while the dot product is commutative and distributive, the associative property makes no sense, as it is not possible to take the dot product of three vectors.
• Demonstrate the proper formation of statements involving dot products. For example, if c is a scalar, the statement c (a · b) makes sense, while the statements d · (a · b) and c · a do not.
• This is a nice, direct application of vector projections: It is clear that a weight will slide more quickly down ramp 2 than down ramp 1:
[Figure: ramp 1 and ramp 2.]
Gravity is the same in both cases, yet there is a definite difference in speed. The reason behind this is interesting. Gravity is doing two things at once: it is letting the weight slide down, and it is also preventing the weight from floating off the ramp and flying into outer space. We can draw a “free body diagram” that shows how the gravity available to let the weight slide down is affected by the angle of the ramp.
[Figure: free body diagrams for ramp 1 and ramp 2, each labeled “force causing weight to move”.]
A block slides faster on a steeper slope because the projection of the gravitational force in the direction of the slope is larger. There is more force pushing the block down the slope, and less of a force holding it to the surface of the slope.
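The ramp discussion can be made numerical. The sketch below is an illustration only, in Python: the angles 15° and 40°, the value g = 9.8, and the function name along_slope are all our own assumptions, since the guide gives no numbers. It computes the projection of the gravity vector onto a unit vector pointing down the ramp, which equals g sin θ.

```python
import math

def along_slope(angle_deg, g=9.8):
    """Size of the projection of gravity [0, -g] onto the downhill direction."""
    theta = math.radians(angle_deg)
    downhill = (math.cos(theta), -math.sin(theta))   # unit vector down the ramp
    gravity = (0.0, -g)
    # Because downhill has length 1, the dot product is the scalar projection.
    return gravity[0] * downhill[0] + gravity[1] * downhill[1]

print(along_slope(15))   # gentler ramp: smaller component along the slope
print(along_slope(40))   # steeper ramp: larger component along the slope
```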
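Lecture Examples
• [5, 6, 2] · [−3, 1, 0] = −9
• Two vectors that are orthogonal: a = [5, 6, 2], b = $\left[1, -1, \tfrac{1}{2}\right]$
• Projections: If a = [2, 1, −1] and b = [3, 2, 7], then $\operatorname{proj}_a b = \left[\tfrac{1}{3}, \tfrac{1}{6}, -\tfrac{1}{6}\right]$, $\operatorname{proj}_b a = \left[\tfrac{3}{62}, \tfrac{1}{31}, \tfrac{7}{62}\right]$, and $\|a\| = \sqrt{6}$.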
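These examples can be re-derived exactly using the formula proj_a b = ((a · b)/(a · a)) a. A minimal Python sketch follows; the helper names dot and proj are ours, and exact fractions are used so the output can be compared with the values above.

```python
from fractions import Fraction

def dot(a, b):
    """Dot product of two vectors of the same length."""
    return sum(x * y for x, y in zip(a, b))

def proj(a, b):
    """Projection of b onto a: ((a . b) / (a . a)) a."""
    scale = Fraction(dot(a, b), dot(a, a))
    return [scale * x for x in a]

a, b = [2, 1, -1], [3, 2, 7]
print(dot([5, 6, 2], [-3, 1, 0]))                # -9
print(dot([5, 6, 2], [1, -1, Fraction(1, 2)]))   # 0, so the vectors are orthogonal
print([str(x) for x in proj(a, b)])              # components 1/3, 1/6, -1/6
print([str(x) for x in proj(b, a)])              # components 3/62, 1/31, 7/62
print(dot(a, a))                                 # 6, so the length of a is sqrt(6)
```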
Tech Tips • Have the students write a routine that takes, as its input, two non-zero vectors in R3 and computes the angle between them. • A more advanced challenge would be to have the students devise a routine that finds a vector perpendicular to two given vectors. (In essence, the students are being asked to derive something like the cross product.)
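One possible shape for the two routines, sketched in Python rather than in any of the packages named in this guide (the function names angle_between and perpendicular are hypothetical). The second routine is the cross product written out in components, which is what students would be led to derive.

```python
import math

def dot(a, b):
    """Dot product of two vectors of the same length."""
    return sum(x * y for x, y in zip(a, b))

def angle_between(a, b):
    """Angle in degrees between two nonzero vectors in R^3."""
    cos_theta = dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))
    cos_theta = max(-1.0, min(1.0, cos_theta))   # guard against rounding error
    return math.degrees(math.acos(cos_theta))

def perpendicular(a, b):
    """A vector orthogonal to both a and b (the cross product a x b)."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

print(angle_between([1, 0, 0], [0, 1, 0]))   # a right angle
n = perpendicular([1, 1, 0], [1, 0, 1])
print(n, dot(n, [1, 1, 0]), dot(n, [1, 0, 1]))   # both dot products are 0
```

If the two inputs to perpendicular are parallel, the result is the zero vector, which is worth pointing out as a case the routine should report.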
Group Work 1: Orthogonal Projections Rule
This worksheet extends what the students know about orthogonal projections to R3. If they have a strong conceptual knowledge of projections, this should come easily.
Answers
1. P is a plane through the origin.
2. Answers will vary.
3. [u1 − c1 − c2, u2 − c1, u3 − c1]
4. −3c1 − c2 + u1 + u2 + u3 = 0, u1 − c1 − c2 = 0
5. $c_1 = \tfrac{1}{2}(u_2 + u_3)$, $c_2 = \tfrac{1}{2}(2u_1 - u_2 - u_3)$
Group Work 2: A Glimpse of Things to Come
This activity foreshadows later work, expressing a given vector as a linear combination of two other vectors. Students may find this difficult because the vectors and components are given in general. One of the main hurdles for a linear algebra student is making the transition to thinking abstractly.
Answers
1. We know that v · u = 0, which implies v1 u1 + v2 u2 = 0.
2. This is clear if one thinks about it geometrically. There is also an algebraic solution:
$\operatorname{proj}_u(v) = \left(\frac{u \cdot v}{u \cdot u}\right) u = (0)\,u = 0$
3. The easiest way to see this is geometrically. If v and u are orthogonal, then v and [u2, −u1] are parallel. The dot product of two nonzero parallel vectors cannot be 0. And v · [u2, −u1] = v1 u2 − v2 u1. There is a more cumbersome algebraic proof:
$v_1 = -\frac{v_2 u_2}{u_1}$
$v_1 u_2 = -\frac{v_2 (u_2)^2}{u_1}$
$v_1 u_2 - v_2 u_1 = -\frac{v_2 (u_2)^2}{u_1} - \frac{v_2 (u_1)^2}{u_1} = -\frac{v_2}{u_1}\left(u_2^2 + u_1^2\right)$
The only way that $-\frac{v_2}{u_1}\left(u_2^2 + u_1^2\right)$ can be zero is if v2 = 0, but then orthogonality forces u1 to be zero as well, causing the expression to be undefined. The case v2 = u1 = 0 is easy to check directly.
4. Again, this is easy to see geometrically. To obtain an algebraic solution, we choose an arbitrary vector [x1, x2] and solve the system
x1 = au1 + bv1
x2 = au2 + bv2
This system has a unique solution provided u1 v2 − u2 v1 is not zero, which was shown in the previous question.
Group Work 3: The Right Stuff
Give each group of students a different set of three points, and have them use vectors to determine if they form a right triangle. (It is easiest to write the points directly on the pages before handing them out.) They can do this using dot products, by calculating side lengths and using the Pythagorean Theorem, or by calculating the slopes of lines between the pairs of points. Have the students whose points are in R2 carefully graph their points to provide a visual check. At the end of the exercise, point out that using the dot product is the easier method.
Sample triples:
(−2, −1), (−2, 8), (8, −1): Right
(0, 0), (10, 7), (−14, 20): Right
(3, 4), (3, 12), (6, 5): Not right
(−1, −2, −3), (0, 0, −4), (−1, −1, −1): Right
(2, 1, 2), (3, 3, 1), (2, 2, 4): Right
(2, 3, 6), (3, 4, 7), (3, 3, 6): Right
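If you want to make up additional triples, the dot product test is easy to automate. A small sketch in Python (the function name is_right_triangle is hypothetical): at each vertex, form the two side vectors leaving that vertex and check whether their dot product is 0.

```python
def is_right_triangle(p, q, r):
    """True if the two sides meeting at some vertex have dot product 0."""
    def sub(a, b):
        return [x - y for x, y in zip(a, b)]

    def dot(a, b):
        return sum(x * y for x, y in zip(a, b))

    # Try each point in turn as the vertex of the possible right angle.
    return any(dot(sub(b, a), sub(c, a)) == 0
               for a, b, c in [(p, q, r), (q, p, r), (r, p, q)])

print(is_right_triangle((0, 0), (10, 7), (-14, 20)))        # True
print(is_right_triangle((3, 4), (3, 12), (6, 5)))           # False
print(is_right_triangle((2, 3, 6), (3, 4, 7), (3, 3, 6)))   # True
```

The exact comparison with 0 is safe here because the sample points have integer coordinates; with decimal coordinates a small tolerance would be needed.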
Group Work 4: The Regular Hexagon
If the students have trouble with this one, copy the figure onto the blackboard. Then draw a point at its center, and draw lines from this point to every vertex. This modified figure should make the exercise more straightforward.
Answers
1. 1, 1, 1
2. 120°
3. $\cos 60^\circ = \tfrac{1}{2}$
4. $-\tfrac{1}{2}$
5. $\left[-\tfrac{1}{2}, 0\right]$, $\left[-\tfrac{1}{4}, \tfrac{\sqrt{3}}{4}\right]$
6. 1
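Group Work 5: Find the Error (Part 1)
Answer There is no cancellation law for dot products. For example, $\begin{bmatrix} 1 \\ 2 \end{bmatrix} \cdot \begin{bmatrix} 3 \\ 4 \end{bmatrix} = 11$ and $\begin{bmatrix} 1 \\ 2 \end{bmatrix} \cdot \begin{bmatrix} -1 \\ 6 \end{bmatrix} = 11$, but it is not the case that $\begin{bmatrix} 3 \\ 4 \end{bmatrix} = \begin{bmatrix} -1 \\ 6 \end{bmatrix}$. Some students may make the argument that no two vectors exist that meet the requirements of this problem. Examples of two such vectors are $u = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$ and $v = \begin{bmatrix} 2 \\ 0 \end{bmatrix}$.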
Group Work 6: Find the Error (Part 2)
Answer No two such vectors exist. This can be shown by Cauchy-Schwarz, or by the fact that u · v = ||u|| ||v|| cos θ. Either way, |u · v| can be at most ||u|| ||v|| = √5, which is less than 3.
Suggested Core Assignment Exercises 4, 10, 16, 19, 25, 30, 36, 43, 44, 54P, 59P, 63P, 65P
Group Work 1, Section 1.2 Orthogonal Projections Rule
We know how to orthogonally project one vector onto another. Let’s try to extend this procedure in the setting of R3.
1. Let v1 = [1, 1, 1] and v2 = [1, 0, 0] be vectors (in standard position) in R3 and let P denote the set of all linear combinations of v1 and v2. Describe what P looks like.
2. With P fixed, we want to orthogonally project a given vector u onto P. That is, we want the orthogonal component of u that belongs to P. Sketch P and orthogonal projections for a couple of different u’s. Let’s let proj_P u denote the projection.
3. Now since proj_P u belongs to P, we can write proj_P u = c1 v1 + c2 v2 for some constants c1 and c2. Our goal is to obtain formulas for c1 and c2. Expressing u using coordinates, we write u = [u1, u2, u3]. Now express, in coordinate form, the vector u − proj_P u. Make sure to use the equation for proj_P u and the coordinates of v1 and v2 above.
4. From our work in Problem 2, we know that u − proj_P u is orthogonal to both v1 and v2. Combine this fact with your work from Problem 3 to find two equations involving the unknowns c1 and c2.
5. Finally, solve the above equations to give formulas for c1 and c2.
Group Work 2, Section 1.2 A Glimpse of Things to Come
Let u = [u1, u2] and v = [v1, v2] be nonzero orthogonal elements of R2.
1. Why do we know that v1 u1 + v2 u2 = 0?
2. Show that the projection of v onto u is always 0.
3. Show that v1 u2 − v2 u1 is never equal to zero.
4. Show that any vector in R2 can be written as a linear combination of v and u.
Group Work 3, Section 1.2 The Right Stuff
Consider the points ( , ), ( , ), and ( , ). These three points form a triangle. Is this triangle a right triangle? Justify your answer.
Group Work 4, Section 1.2 The Regular Hexagon
Consider the following regular hexagon:
[Figure: a regular hexagon drawn on x- and y-axes, with the vectors a = (1, 0), b, and c and the angle θ marked.]
1. Compute ||a||, ||b||, and ||c||.
2. What is the angle θ?
3. What is a · c?
4. What is a · b?
5. Compute proj_a b and proj_b c.
6. Compute the x-component of a + b + c.
Group Work 5, Section 1.2 Find the Error (Part 1)
It is a beautiful spring morning. You are about to go to your 4 P.M. class, but have stopped at a convenience store to buy carrot sticks and bottled water for a healthy snack. As you wait in line to pay for your purchases, whistling to yourself, you notice a wild-eyed gentleman standing in line in front of you, buying a moon pie. “Well aren’t you a merry grig?” he asks. You nod noncommittally, since you have no idea what a “grig” is. He takes your nod to mean that you would like further conversation, and asks, “Where are you off to now?” “Why, I’m off to my linear algebra class, to learn some useful information about vectors.” “Vectors, vectors,” he says, half to himself. “I remember learning about vectors... I remember learning... LIES!” “What do you mean, ‘lies’?” you ask. “Everything we’ve learned about vectors is as true as it is useful.” “Oh yes? You think you know it all, do you?” By this point, he has paid for his purchases. As you pay for yours, you notice him writing on his receipt:
Let u be a vector such that ||u|| = √2. Choose a vector v ≠ u such that u · v = 2. Now we have
2 (u · u) = u · u + u · v
2 (u · u) − 2 (u · v) = u · u + u · v − 2 (u · v)
2 (u · u) − 2 (u · v) = u · u − u · v
2u · (u − v) = u · (u − v)
2u = u
2 = 1
“I’ve seen someone try this before,” you say dismissively, “in college algebra. But you are not allowed to divide by zero.” “Ah, but I am not dividing by zero! Since u ≠ v, we know that u − v cannot be zero! Now you go enjoy your class, while I go and enjoy my moony pie!” And the stranger leaves, singing a strange song to himself, and opening the wrapper to his moon pie. Could Linear Algebra be flawed already? Two can’t equal one, can it? Find the error in the gentleman’s reasoning.
Group Work 6, Section 1.2 Find the Error (Part 2)
After determining the stranger’s mistake, you go to your Linear Algebra class. Your teacher tells you to pay particular attention to page 18, so you take a scrap of paper to mark the page. You notice that you are using the gentleman’s receipt, and that he has written something on the front of the receipt as well!
Dear Merry Grig, If I haven’t already convinced you that your teacher is nothing but a purveyor of falsity, check this out: Let u be a vector such that ||u|| = 1. Choose a vector v such that u · v = 3 and ||v|| = √5. Now we have
||u − v||² = (u − v) · (u − v) = u · u − 2 (u · v) + v · v = 0
Hence u = v, since ||u − v|| = 0. But u and v have different lengths!
Well, gosh darn him anyway! How can two things be the same, and yet different? Find the error.
1.3 Lines and Planes
Suggested Time and Emphasis 1 class. Recommended material.
Points to Stress 1. The definition and intuitive idea of “normal”. 2. The normal, vector, parametric and general forms of the equation of a line. 3. The normal, vector, parametric and general forms of the equation of a plane. 4. The vector form of the equation of a line (plane) viewed as a translation of all linear combinations of one (two) fixed vector(s).
Drill Question
1. Consider the line with equation 2x + y = 0. Sketch this line, and then write its equation in vector form, and then again in normal form.
Answer $\begin{bmatrix} x \\ y \end{bmatrix} = t \begin{bmatrix} 1 \\ -2 \end{bmatrix}$, $\begin{bmatrix} 2 \\ 1 \end{bmatrix} \cdot \begin{bmatrix} x \\ y \end{bmatrix} = 0$
2. Why do you even need the vector p? Why can’t you just specify a line with the direction vector alone?
Answer Without the vector p, the line would go through the origin.
Test Question Do the following four points all lie on the same plane? Why or why not? (3, 2, 1), (3, 1, 2), (5, 2, 3), (2, 0, 2)
Answer Yes. An equation of the plane is
$\begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \\ 1 \end{bmatrix} + \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} s + \begin{bmatrix} -1 \\ -1 \\ 0 \end{bmatrix} t$
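To generate variants of this test question, one way to check four points for coplanarity is to take the three vectors from the first point to the others and test whether their determinant (the scalar triple product) is 0. This is a sketch in Python with hypothetical names (det3, coplanar), and it uses determinants, which students will not have seen at this stage.

```python
def det3(r1, r2, r3):
    """Determinant of the 3x3 matrix with rows r1, r2, r3."""
    a, b, c = r1
    d, e, f = r2
    g, h, i = r3
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def coplanar(p0, p1, p2, p3):
    """True if the four points in R^3 lie on a common plane."""
    def sub(a, b):
        return [x - y for x, y in zip(a, b)]
    return det3(sub(p1, p0), sub(p2, p0), sub(p3, p0)) == 0

print(coplanar((3, 2, 1), (3, 1, 2), (5, 2, 3), (2, 0, 2)))   # True
print(coplanar((0, 0, 0), (1, 0, 0), (0, 1, 0), (0, 0, 1)))   # False
```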
Discussion Question What would the equation of a line in 4-space look like? How about a plane in 4-space? What would ax + by + cz + dw = e represent?
Lecture Notes
• Students are being asked to do something that seems unusual to them. They are taking a concept that they presumably understand, finding the equation of a line, and are being asked to do it again, only this time in a more complicated way that they don’t fully understand. One can motivate the students by discussing the problem of generalization. ax + by = c (which simplifies to y = mx + b) is the equation of a line in 2-space. How do we get a line in 3-space? One guess would be ax + by + cz = d, but that turns out to be a plane, not a line. The problem with our standard way of writing the equation of a line is that it doesn’t naturally extend to three dimensions. The techniques of this section not only generalize to three (or more) dimensions, but they do so in a simple, natural way.
• Example 1 is very dense, and provides the students with an excellent opportunity to learn how to read a mathematics textbook. Ask the students to reread Example 1, and give them time, waiting until almost all of the students are done before going on. And then go through it with them, sentence by sentence. Notice that almost every sentence requires a bit of thought. “The left-hand side of the equation is in the form of a dot product”, “The vector n is perpendicular to the line”, “The equation n · x = 0 is the normal form of the equation of l” all convey new concepts. Students may need to remind themselves that if the dot product of two vectors is zero, then the vectors are orthogonal. They may need to look up the definition of “orthogonal” if they’ve forgotten it. Students tend to try to read mathematics textbooks too quickly, and this example gives you an opportunity to demonstrate the process of truly understanding every sentence before going on, or at least making notes of what they need to ask questions about. (This isn’t a bad thing to do two or three times throughout the semester.)
• Be sure to note the distinction between the point (1, 2), the vector [1, 2], and the vector [1, 2] in standard position. When we discuss the normal and vector forms of the equation of a line, we are assuming that our vectors are in standard position. Perhaps discuss what happens if we remove this assumption. (The term for this hard-to-picture object is a pencil.)
• Students get direction vectors and normal vectors confused. Direction vectors tend to agree with the students’ geometric intuition. They point along the line, and have a “rise” and “run” interpretation that harkens back to the concept of slope. The normal vector literally goes off at a right angle to their intuition. It is important to draw a picture such as Figure 5, and to show the students how the two different forms of the equation work.
• If you will be covering cross products (as done in the Exploration) this is a good time to foreshadow them: “Wouldn’t it be useful if we had a good way of finding a vector perpendicular to two given vectors?”
• The following is a different approach to lines in R2 and planes in R3. It has the advantage of extending nicely to higher dimensions. Let v = [2, 4]; this vector defines a (linear) function mapping R2 → R by the rule w → w · v. Let fv denote this function. A level set La of fv is the set of all w such that fv (w) = a. Sketch L0. Now sketch L1. Let’s move up in dimension; let v = [0, 1, 0] and sketch (in R3) the level set L0. What can we say about these level sets in general? If you start with a plane in R3, can you express it as the level set of some v? These level sets can obviously be defined in Rk for any integer k; these level sets are called hyperplanes.
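Lecture Examples
• Various forms of the equation of the line that goes through the points (1, 3) and (2, 8) in R^2:
General form: y = 5x − 2, or 5x − y = 2
Normal form: [5, −1] · ([x, y] − [1, 3]) = 0, or [5, −1] · [x, y] = [5, −1] · [1, 3] (= 2)
Vector form: [x, y] = [1, 3] + t [1, 5], or [x, y] = [2, 8] + t [1, 5]
Parametric form: x = 1 + t, y = 3 + 5t, or x = 2 + t, y = 8 + 5t
[Figure: graph of the line in the xy-plane]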
Chapter 1 Vectors
• Various forms of the equation of the plane that goes through (1, 1, 1), (2, 2, 1) and (2, 1, 2). Note: We first observe that the vectors [1, 1, 0] and [1, 0, 1] are in the plane, and determine that [1, −1, −1] is perpendicular to both by playing with dot products.
Normal form: [1, −1, −1] · [x, y, z] = [1, −1, −1] · [1, 1, 1] (= −1)
General form: x − y − z = −1
Vector form: [x, y, z] = [1, 1, 1] + s [1, 1, 0] + t [1, 0, 1], or [x, y, z] = [2, 2, 1] + s [1, 1, 0] + t [1, 0, 1]
Parametric form: x = 1 + s + t, y = 1 + s, z = 1 + t, or x = 2 + s + t, y = 2 + s, z = 1 + t
[Figure: graph of the plane in xyz-space]
• The distance between the point (2, 1, 8) and the plane 2x − 3y + z = 5 is (2/7)√14 ≈ 1.069045
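The distance in the last example can be confirmed with a few lines of code. This is a sketch using the standard point-to-plane formula |n · p − d| / ‖n‖; the function name is an illustrative choice, not something from the text.

```python
import math

def distance_point_to_plane(point, normal, d):
    """Distance from `point` to the plane normal . x = d."""
    dot = sum(n * p for n, p in zip(normal, point))
    return abs(dot - d) / math.sqrt(sum(n * n for n in normal))

# The plane 2x - 3y + z = 5 has normal vector [2, -3, 1].
dist = distance_point_to_plane((2, 1, 8), (2, -3, 1), 5)
print(dist)                      # numeric value of the distance
print(2 * math.sqrt(14) / 7)     # the closed form (2/7) * sqrt(14), for comparison
```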
Tech Tip A CAS can draw a plane by expressing the general form of the plane as z = f(x, y) and then using a 3D plot command. Use this approach to illustrate the different ways three planes can intersect (the empty set, a point, a line).
Group Work 1: The Match Game
This is a pandemonium-inducing game. Give each group four index cards. Each card contains an equation of a different line, and each card’s equation is in a different form. Tell the students that their goal is to trade cards, and wind up with four different descriptions of the same line.
For the convenience of the teacher, each row below contains a winning combination. Make sure that each team starts with descriptions from different rows.
After the dust settles, lead the students in a discussion of optimal strategies for this game.

Row 1
Category A: The line between (0, 0, 1) and (1, 2, 1)
Category B: [x, y, z] = [2, 4, 1] + t [1, 2, 0]
Category C: x = 2t, y = 4t, z = 1
Category D: A line through (−4, −8, 1) and parallel to [3, 6, 0]

Row 2
Category A: The line between (0, −3, 3) and (3, 3, 0)
Category B: [x, y, z] = [1, −1, 2] + t [1, 2, −1]
Category C: x = 2 + t, y = 1 + 2t, z = 1 − t
Category D: A line through (4, 5, −1) and parallel to [2, 4, −2]

Row 3
Category A: The line between (1, 3, 2) and (1, −1, 6)
Category B: [x, y, z] = [1, 2, 3] + t [0, −1, 1]
Category C: x = 1, y = −t, z = 5 + t
Category D: A line through (1, 0, 5) and parallel to [0, −π, π]

Row 4
Category A: The line between (0, 0, 4) and (12, 8, 8)
Category B: [x, y, z] = [9, 6, 7] + t [−3, −2, −1]
Category C: x = 6 − 6t, y = 4 − 4t, z = 6 − 2t
Category D: A line through (−12, −8, 0) and parallel to [3, 2, 1]

Row 5
Category A: The line between (5, 0, 7) and (−2, −7, 0)
Category B: [x, y, z] = [3, −2, 5] + t [−1, −1, −1]
Category C: x = 2 − 2t, y = −3 − 2t, z = 4 − 2t
Category D: A line through (0, −5, 2) and parallel to [5, 5, 5]

Row 6
Category A: The line between (−3, 3, −9) and (3, −3, 9)
Category B: [x, y, z] = [0, 0, 0] + t [−1, 1, −3]
Category C: x = −1 + t, y = 1 − t, z = −3 + 3t
Category D: A line through (2, −2, 6) and parallel to [3, −3, 9]

Row 7
Category A: The line between (−4, 2, 1) and (−11, 1, −1)
Category B: [x, y, z] = [3, 3, 3] + t [7, 1, 2]
Category C: x = 10 + 7t, y = 4 + t, z = 5 + 2t
Category D: A line through (17, 5, 7) and parallel to [−7, −1, −2]
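If you want to double-check a row of cards (or make new ones), the following sketch tests whether two point-and-direction descriptions give the same line in R^3. The helper names are made up for illustration, and the check assumes nonzero direction vectors with integer or exactly representable entries.

```python
def cross(u, v):
    """Cross product of two vectors in R^3."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def same_line(p1, d1, p2, d2):
    """True if p1 + t*d1 and p2 + t*d2 are the same line (d1, d2 nonzero)."""
    offset = tuple(b - a for a, b in zip(p1, p2))
    zero = (0, 0, 0)
    # Directions must be parallel, and p2 - p1 must be parallel to d1 (or zero).
    return cross(d1, d2) == zero and cross(offset, d1) == zero

def from_points(p, q):
    """Convert a 'line between p and q' card to (point, direction) form."""
    return p, tuple(b - a for a, b in zip(p, q))

# The four cards of the second row, each as (point, direction).
row2 = [
    from_points((0, -3, 3), (3, 3, 0)),   # Category A
    ((1, -1, 2), (1, 2, -1)),             # Category B
    ((2, 1, 1), (1, 2, -1)),              # Category C
    ((4, 5, -1), (2, 4, -2)),             # Category D
]
first = row2[0]
print(all(same_line(*first, *card) for card in row2))   # True: a winning hand

# A card from the first row does not match.
print(same_line(*first, (2, 4, 1), (1, 2, 0)))          # False
```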
Group Work 2: Planes from Points Give each group two sets of three points each, one non-collinear set and one collinear set. Ask the students to give a parametric equation of the unique plane containing the points. For the second set of points this is a trick question, since collinear points do not determine a plane. Sample sets of points are given below.
Non-collinear: (−1, 4, 2), (3, 1, 1), (7, 2, 0)
Non-collinear: (0, 0, 0), (1, 2, 3), (2, 5, 9)
Non-collinear: (0, −5, 5), (0, 1, 1), (0, 3, 4)
Collinear: (0, 0, 0), (1, 2, 3), (3/2, 3, 9/2)
Collinear: (−1, 4, 2), (3, 1, 1), (7, −2, 0)
Collinear: (0, −5, 5), (0, 1, 1), (0, −2, 3)
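When making up additional point sets, a quick collinearity check is handy. The sketch below (function name illustrative) uses the fact that three points P, Q, R in R^3 are collinear exactly when the cross product of Q − P and R − P is the zero vector; `Fraction` keeps the arithmetic exact for the set containing 3/2 and 9/2.

```python
from fractions import Fraction

def collinear(p, q, r):
    """True if the three points in R^3 lie on one line."""
    u = [b - a for a, b in zip(p, q)]   # Q - P
    v = [b - a for a, b in zip(p, r)]   # R - P
    cross = (u[1] * v[2] - u[2] * v[1],
             u[2] * v[0] - u[0] * v[2],
             u[0] * v[1] - u[1] * v[0])
    return cross == (0, 0, 0)

print(collinear((-1, 4, 2), (3, 1, 1), (7, 2, 0)))    # False
print(collinear((-1, 4, 2), (3, 1, 1), (7, -2, 0)))   # True
print(collinear((0, 0, 0), (1, 2, 3), (Fraction(3, 2), 3, Fraction(9, 2))))  # True
```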
Group Work 3: Calculus and Linear Algebra Answers
1. [Figure: the vectors f′(π/4), f′(2π/3), and f′(7π/4) drawn in standard position]
2. [Figure: the curve f(t) with the vectors f′(π/4), f′(2π/3), and f′(7π/4) redrawn at the points f(π/4), f(2π/3), and f(7π/4)]
3. x = [cos t, sin t] + s [−sin t, cos t]
4. They are orthogonal.
5. x = g(t0) + t g′(t0)
Group Work 4: Plane to See This activity is designed to give the students an opportunity to apply what they’ve learned about equations of planes. Answers
1. [x, y, z] = [1, 2, 0] + s [2, 0, −3] + t [0, 2, −1]
2. Answers will vary. Sample answers:
[x, y, z] = [1, 2, 0] + s [2, 0, −3]
[x, y, z] = [1, 2, 0] + s [2, 0, −3] + t [1, 0, 0]
3. [2, −2, −2] · ([x, y, z] − [−1, 2, 3]) = 0
4. [x, y, z] = [1, 2, 0] + s [2, 0, −3] + t [0, 2, −1]
5. [0, 2, −1] · ([x, y, z] − [−1, 2, 3]) = 0
Suggested Core Assignment Exercises 2, 6, 9, 11, 13, 22, 24, 28, 32, 41P, 43, 48
Group Work 1, Section 1.3 The Match Game Your teacher has just handed you four index cards, each with an equation of a different line. Your task is, by clever trading with other groups, to wind up with four different descriptions of the same line. The winning team gets a boffo prize, so go for it!
Group Work 2, Section 1.3 Planes from Points 1. Consider the following set of three points:
POINT 1: __________ POINT 2: __________ POINT 3: __________ Find a parametric equation of the unique plane containing these points.
2. Repeat Problem 1 using these points:
POINT 1: __________ POINT 2: __________ POINT 3: __________
Group Work 3, Section 1.3 Calculus and Linear Algebra In beginning calculus, we learned to interpret the derivative of a real-valued function of one variable f : R → R. A function of one variable that takes on values in R^k is called a vector-valued function. For example, if we let f(t) = [t, t^2, t^3], then f is a vector-valued function.
One important question is how we should interpret the derivative of a vector-valued function. Let’s start with a definition. If f(t) = [f_1(t), . . . , f_k(t)] : R → R^k, then f′(t) = [f_1′(t), . . . , f_k′(t)]. This should feel intuitive. If, for example, f(t) = [t, t^2, t^3], then we are defining f′(t) to be [1, 2t, 3t^2]. Notice that we interpret f′(t) not as a point in R^k but as a vector in R^k.
1. Let f(t) = [cos t, sin t]. Sketch the image curve f(t) in R^2. Calculate f′(t) and draw a few f′(t) vectors in standard position on the same graph.

2. Notice that the f′(t) vectors, while quite beautiful, fail to reveal much about the curve f(t). Redraw each vector you have drawn in standard position in a better position, one that better captures the connection between f and f′.

3. For each redrawn vector, give the equation of the corresponding tangent line using vector form.

4. Let v be the vector [cos t0, sin t0] and let u be the derivative vector at t0. What do you notice about v and u?

5. In general, if g(t) = [g_1(t), . . . , g_k(t)], find the vector form of the tangent line to this curve at t = t0.
Group Work 4, Section 1.3 Plane to See We are going to explore how much information is needed to uniquely determine a plane in R^3. Consider the three noncollinear points P = (−1, 2, 3), Q = (1, 2, 0), and R = (1, 0, 1). For each of the following, either determine (in vector or normal form) the uniquely defined plane or exhibit two distinct planes satisfying the description. (Here PQ denotes the vector from P to Q, and PR the vector from P to R.)
1. A plane that contains P, Q and R.

2. A plane that contains the line determined by P and Q but does not contain R.

3. A plane that contains the vector PQ and is orthogonal to the vector PR.

4. A plane that contains the vectors PQ and PR.

5. A plane that contains P and is orthogonal to the vector PQ − PR.
1.4
Code Vectors and Modular Arithmetic
Suggested Time and Emphasis 1/2–1 class. Modular arithmetic recommended, error-correcting codes optional.
Points to Stress 1. Modular arithmetic, with an emphasis on binary arithmetic. 2. The concept of error-correcting codes. 3. Parity check digits.
Drill Question Circle the binary vector which has a parity different from that of the others:
[1, 0, 0, 1, 1, 1, 0]   [1, 0, 0, 1, 0, 0, 0]   [1, 1, 1, 1, 1, 0, 1]   [1, 0, 0, 1, 0, 1, 0]
Answer [1, 0, 0, 1, 0, 1, 0]
Discussion Questions 1. Why do we use error-correcting codes? 2. What would happen if every bit of a vector came across correctly except the parity bit?
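Test Question Is 0-534-34174-8 a valid ISBN? Why or why not? Answer Yes. With the check vector [10, 9, 8, 7, 6, 5, 4, 3, 2, 1], the dot product with the digits is 176 = 16 · 11, which is 0 modulo 11, so the check digit is correct.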
Lecture Notes
• Point out that modular arithmetic comes up in other contexts besides this one. A good example to use is addition modulo 12. For example, if someone starts reading a book at 9 A.M., and reads for six hours, what time is it? If a student starts studying at 10 A.M., and studies for 23 hours, what time is it?
• One can introduce a little bit of symbolic logic as a way of justifying the importance of binary arithmetic. If the students look at the binary multiplication table, this corresponds to the idea of “and”. The statement ab = 1 is true only if a = 1 and b = 1. Ask the students to think of a meaning for the addition table. It turns out to be exclusive or: a + b = 1 only if a = 1 or b = 1, but not both. (Exclusive or tends to come up in the English language, such as “You may have fries or cole slaw,” or “Your money or your life”.)
• Notice that there is a hierarchy of codes. Example 5 talks about a simple error-detecting code. We put a check bit at the end, and if there has been a single error, we know there is a problem. One drawback is that if there is an even number of errors, they are not detected. It is possible to fix this problem, but the price is that we need to add more than one extra bit. Another drawback is that we only know that an error has occurred, but we don’t know which bit is faulty. It is possible to create error-correcting codes that tell you exactly what needs to be changed. But again, we pay the price of having to add extra bits. In general, the more sophisticated the code, the more bits that have to be added, and the more bits that we add, the slower the transmission rate. This trade-off is a special case of the age-old problem of speed versus accuracy.
• The textbook gives two examples of error-detecting codes that are used in common products, the UPC and the ISBN. Students could be asked to verify the check digit on their Linear Algebra textbook for both the ISBN and the UPC. Discuss why the check vector used with UPCs will detect all single errors.
• Discuss self-correcting codes. One simple self-correcting code just repeats each word three times. For example, to encode 1011 we would send the string 101110111011. Now, if there is a single error, we can deduce where the error was and correct it. For example, if you received the transmission 101011101010, you would write
1010 1110 1010
You would know that there was an error (the copies disagree in the second bit), but since two of the three copies agree, the intended message was 1010. The disadvantage to this method is that it takes 3k bits to send a k-bit message. There are much more efficient self-correcting codes created by some very smart people.
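The ideas in these notes are easy to demonstrate on a computer. The following is a minimal sketch with made-up function names; it assumes the even-parity convention (all bits, including the check bit, sum to 0 modulo 2) and a repetition code that sends each word three times in a row.

```python
def clock_hour(start_hour, hours_elapsed):
    """Hour on a 12-hour clock face (1..12); A.M./P.M. is not tracked."""
    return (start_hour + hours_elapsed - 1) % 12 + 1

def has_even_parity(bits):
    """True if the bits (including the check bit) sum to 0 modulo 2."""
    return sum(bits) % 2 == 0

def decode_repetition(received, k):
    """Majority-vote decoding of a k-bit word that was sent three times."""
    copies = [received[i * k:(i + 1) * k] for i in range(3)]
    return "".join(
        "1" if sum(copy[j] == "1" for copy in copies) >= 2 else "0"
        for j in range(k)
    )

print(clock_hour(9, 6))     # 3  (six hours after 9 A.M. is 3 P.M.)
print(clock_hour(10, 23))   # 9  (23 hours after 10 A.M. is 9 A.M. the next day)

# Binary arithmetic: multiplication mod 2 is "and", addition mod 2 is "exclusive or".
for a in (0, 1):
    for b in (0, 1):
        assert (a * b) % 2 == (a & b)
        assert (a + b) % 2 == (a ^ b)

print(has_even_parity([1, 0, 0, 1, 0, 1, 0]))   # False: three 1s
print(decode_repetition("101011101010", 4))     # 1010
```

Majority voting corrects any single error, but two errors in the same bit position of different copies would decode incorrectly, which is a useful point for discussion.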
Lecture Examples
• A binary vector with a check digit, where a transmission error has occurred: [1, 0, 0, 1, 0, 0, 1, 0, 0, 0]
• A valid UPC: [UPC image]
• An invalid UPC: [UPC image]
• An invalid UPC that would go undetected: [two UPC images, labeled “Original” and “Error undetected”]
Tech Tip Have the students design a function that returns 1 if a given UPC is valid and 0 if it is invalid.
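One possible solution is sketched below. It assumes a 12-digit UPC and the check vector [3, 1, 3, 1, . . . , 3, 1], with a code valid when its dot product with the check vector is 0 modulo 10; the function name and the sample digit string are illustrative choices, not taken from the text.

```python
def upc_is_valid(digits):
    """Return 1 if the 12-digit UPC passes the check, and 0 otherwise."""
    if len(digits) != 12:
        return 0
    check_vector = [3, 1] * 6          # [3, 1, 3, 1, ..., 3, 1]
    total = sum(c * d for c, d in zip(check_vector, digits))
    return 1 if total % 10 == 0 else 0

sample = [0, 3, 6, 0, 0, 0, 2, 9, 1, 4, 5, 2]   # a digit string that passes the check
print(upc_is_valid(sample))                     # 1

corrupted = sample.copy()
corrupted[3] = 7                                # a single-digit error
print(upc_is_valid(corrupted))                  # 0
```

Because 3 and 1 are both relatively prime to 10, changing any single digit changes the total modulo 10, which is why every single error is caught.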
Group Work 1: Find the Counterfeit This activity is a direct application of the UPC described in the textbook. Answer
The UPC of “The Starry Nights” is invalid.
Group Work 2: Beyond Parity You may want to do Part 1 with the students, or a different example of your own devising. Answers
1. [1, 0, 1, 1, 1, 1, 0]
2. [1, 1, 1, 0, 1, 0, 1]
3. If the intended message was [1, 1, 0, 1] then the transmitted word should be [1, 1, 0, 1, 1, 0, 1]. We didn’t get that, so there is an error. Now, since we are assuming there was only one error, we can try flipping each bit of the received word, in turn:
[0, 1, 0, 1, 0, 1, 1] is valid
[1, 0, 0, 1, 0, 1, 1] is valid
[1, 1, 1, 1, 0, 1, 1] is invalid
[1, 1, 0, 0, 0, 1, 1] is invalid
[1, 1, 0, 1, 1, 1, 1] is invalid
[1, 1, 0, 1, 0, 0, 1] is invalid
[1, 1, 0, 1, 0, 1, 0] is invalid
So we know that the first or second bit is faulty, but we don’t know which.
4. This particular system allows us to narrow down where the error is, but does not allow us to find it exactly. It is an improvement over a single parity bit, because it allows us to localize the error. It has the disadvantage of taking 7 bits of bandwidth to send a 4-bit message. There are actually more complex codes that allow us to find the error precisely.
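The answers above can be reproduced with a short script. This is a sketch under the handout's rules (check bits 5, 6, and 7 are the dot products of v with [1, 1, 1, 1], [1, 1, 0, 0], and [0, 0, 1, 1], all modulo 2); the function names are illustrative.

```python
CHECKS = [[1, 1, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1]]

def encode(v):
    """Append the three check bits to the 4-bit message v."""
    return v + [sum(a * b for a, b in zip(v, c)) % 2 for c in CHECKS]

def is_valid(word):
    """A 7-bit word is valid if its last three bits match its first four."""
    return encode(word[:4]) == word

def single_flip_candidates(word):
    """Bit positions (1-based) whose flip turns `word` into a valid word."""
    candidates = []
    for i in range(len(word)):
        flipped = word.copy()
        flipped[i] ^= 1
        if is_valid(flipped):
            candidates.append(i + 1)
    return candidates

print(encode([1, 0, 1, 1]))        # [1, 0, 1, 1, 1, 1, 0]
print(encode([1, 1, 1, 0]))        # [1, 1, 1, 0, 1, 0, 1]

received = [1, 1, 0, 1, 0, 1, 1]
print(is_valid(received))                 # False
print(single_flip_candidates(received))   # [1, 2]
```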
Suggested Core Assignment Exercises 3, 14, 16, 22, 23, 24, 36, 37, 46, 53P, 54
Group Work 1, Section 1.4 Find the Counterfeit Three of the following CDs were bought at my local music store. One of them was bought from an Evil Counterfeiter, who will soon be brought to justice. Which one is the pirated CD?
Group Work 2, Section 1.4 Beyond Parity We wish to transmit binary vectors of length 4 such as [1, 0, 1, 1] and [1, 1, 1, 0]. Because we are afraid of transmission error, we are going to add some bits to make an error-detecting code. Let v be the 4-bit vector we want to transmit. Bit number 5 will be v · [1, 1, 1, 1], bit number 6 will be v · [1, 1, 0, 0], and bit number 7 will be v · [0, 0, 1, 1]. All addition is modulo two. For example, [1, 1, 0, 1, 1] · [1, 1, 1, 1, 0] = 1. 1. If v = [1, 0, 1, 1], what vector will we transmit?
2. If v = [1, 1, 1, 0], what vector will we transmit?
3. If you receive the vector [1, 1, 0, 1, 0, 1, 1], was there an error? If you know for sure there was exactly one
error, can you determine which bit is faulty?
4. What are the advantages of this system over the single-parity-bit system? What are the disadvantages?
2 Linear Equations 2.1
Introduction to Systems of Linear Equations
Suggested Time and Emphasis 1/2–1 class. Essential material.
Points to Stress 1. Basic definitions including linear equations, systems of linear equations and augmented matrices. 2. Geometric interpretations of the solution set of a system of linear equations, including viewing them as sets of vectors. 3. Dependent, independent, and inconsistent systems: systems with infinitely many solutions, systems with a single solution, and systems with no solution. (The details of solving these systems are given in the next section.)
Drill Question Consider the system of equations
x + y = −1
2x − 3y = 8
(a) Find the augmented matrix corresponding to this system. (b) Put the matrix you obtained in part (a) into upper triangular form.
Answer
(a)
[ 1  1 | −1 ]
[ 2 −3 |  8 ]
(b)
[ 1  1 | −1 ]
[ 0 −5 | 10 ]
Discussion Question Can a system with 2 equations and 3 unknowns be inconsistent? Answer Yes
Sample Test Question: Consider the system
ax + 3y + 2z = 5
bx + cy + 4z = 9
5x + by + cz = 16
If {x = 1, y = 2, z = 3} is a solution to this system, then find a, b and c.
Answer a = −7, b = 31, c = −17. (Substituting gives a + 12 = 5, b + 2c = −3, and 2b + 3c = 11.)
Lecture Notes
• After showing that every linear system has an associated augmented matrix, ask the question, “Does every matrix have an associated linear system?” The students may not think to look at n × 1 matrices.
• Ask the question, “Why isn’t there any other possibility for the number of solutions to a linear system besides 0, 1, and infinitely many?” Consider a linear system of m equations in n variables. We will show that if there is more than one solution, then there must be infinitely many solutions. Suppose u, v ∈ R^n are distinct solutions to the system. Let λ be a number between 0 and 1, and let w = λu + (1 − λ)v. Because each equation is linear, substituting w into its left-hand side gives λ times the right-hand side plus (1 − λ) times the right-hand side, which is the right-hand side again. So w solves the system for every λ, and different values of λ give different solutions w.
• Inconsistent systems can be illustrated by starting with the equation x − y − z = 3 and sketching the plane of solutions for this equation.
[Figure: the plane x − y − z = 3]
We can then add a second equation, such as x − y − z = 4, such that the two equations form an inconsistent system, and sketch the two parallel planes, noting they do not intersect.
[Figure: the parallel planes x − y − z = 3 and x − y − z = 4]
Or we can instead add a second equation such as x + y − z = 3 and show them intersecting in a line.
[Figure, two panels: the planes x − y − z = 3 and x + y − z = 3, intersecting in a line]
Finally, we can find a third equation so that the resulting system has a one-point solution set. (A possible third equation is −x − y − z = 3, giving the solution x = 0, y = 0, z = −3.)
[Figure: the three planes intersecting in a single point]
• Foreshadow Lies My Computer Told Me, perhaps by doing the first two problems of that section, and asking the students to do the rest.
• Introduce the concept of homogeneous systems as a way of getting the students to think about dependent and inconsistent systems. Define a homogeneous system as “a system where the equations are all equal to zero”:
3x − 2y − z = 0
4x + 5y − 4z = 0
2x − 8y − z = 0
Start by asking the class some simple questions: Find and solve a dependent homogeneous system, a homogeneous system with a unique solution, and finally one with no solution. (Give them some time to do this; a lot of learning will take place while they go through the process of creating and solving problems both forward and backward.) When students or groups of students finish early, ask them to articulate why there cannot be an inconsistent homogeneous system.
After the students have thought about this type of system, bring them all together. Point out that it is clear that x = 0, y = 0, z = 0 is always a solution to a homogeneous system, and so there cannot be an inconsistent one. The only possible solution sets are the single point (0, 0, 0) and a set with infinitely many points, one of which is (0, 0, 0).
Lecture Examples
• A linear system with one solution:
The system:
x − y + 3z = 2
2x + y + z = 0
−x + 3y − z = 1
The augmented matrix:
[  1 −1  3 | 2 ]
[  2  1  1 | 0 ]
[ −1  3 −1 | 1 ]
The matrix in row echelon form:
[ 1 −1  3 |  2 ]
[ 0  3 −5 | −4 ]
[ 0  0 16 | 17 ]
The solution: x = −3/4, y = 7/16, z = 17/16
• A linear system with no solution:
The system:
x − y + 3z = 2
2x + y + z = 0
−x + y − 3z = 1
The augmented matrix:
[  1 −1  3 | 2 ]
[  2  1  1 | 0 ]
[ −1  1 −3 | 1 ]
The matrix in row echelon form:
[ 1 −1  3 |  2 ]
[ 0  3 −5 | −4 ]
[ 0  0  0 |  3 ]
• A linear system with infinitely many solutions:
The system:
x − y + 3z = 2
2x + y + z = 0
−x + y − 3z = −2
The augmented matrix:
[  1 −1  3 |  2 ]
[  2  1  1 |  0 ]
[ −1  1 −3 | −2 ]
The matrix in row echelon form:
[ 1 −1  3 |  2 ]
[ 0  3 −5 | −4 ]
[ 0  0  0 |  0 ]
The solution: x = 2/3 − (4/3)t, y = −4/3 + (5/3)t, z = t
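The three lecture examples can be classified by machine as a check. The sketch below is one way to do it, not the textbook's algorithm verbatim: it row-reduces with exact fractions and then looks for a row of the form [0 ... 0 | nonzero] (no solution) or for fewer pivots than variables (infinitely many solutions). The function names are illustrative.

```python
from fractions import Fraction

def row_echelon(aug):
    """Return a row echelon form of an augmented matrix (list of rows)."""
    m = [[Fraction(x) for x in row] for row in aug]
    rows, cols = len(m), len(m[0])
    r = 0                                    # index of the next pivot row
    for c in range(cols - 1):                # never pivot on the constants column
        pivot = next((i for i in range(r, rows) if m[i][c] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, rows):
            factor = m[i][c] / m[r][c]
            m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        r += 1
    return m

def classify(aug):
    """Describe the solution set of the system with augmented matrix `aug`."""
    ref = row_echelon(aug)
    n_vars = len(aug[0]) - 1
    pivots = sum(any(x != 0 for x in row[:-1]) for row in ref)
    inconsistent = any(all(x == 0 for x in row[:-1]) and row[-1] != 0 for row in ref)
    if inconsistent:
        return "no solution"
    return "one solution" if pivots == n_vars else "infinitely many solutions"

one = [[1, -1, 3, 2], [2, 1, 1, 0], [-1, 3, -1, 1]]
none = [[1, -1, 3, 2], [2, 1, 1, 0], [-1, 1, -3, 1]]
many = [[1, -1, 3, 2], [2, 1, 1, 0], [-1, 1, -3, -2]]
for system in (one, none, many):
    print(classify(system))
```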
Tech Tip Have the students use a CAS to graph three planes, no two of which are parallel, that represent an inconsistent linear system.
Group Work 1: The Lemonade Trick Students tend to find “mixture problems” difficult. This will serve as a review and extension of those problems. If a group finishes early, ask them to show that, regardless of how many liters she winds up with, the proportions of the final recipe will be the same. Answers
2 liters of each. We never said it was a complicated recipe!
Group Work 2: The Shape of the System If the students have read the section beforehand, this group work will be fairly straightforward. For a tougher challenge, give the students this group work before they have seen the material from the section. Problem 4 gets at the idea of a hyperplane. The students aren’t expected to come up with a rigorous description of four-dimensional geometry, but it is valuable to have them discuss such concepts at this point. Answers
1. A line in R^2
2. Answers will vary. Anything like 2x = 5 or 2x + 1 = 5 is correct. The solution set is a point on the real number line.
3. A plane in R^3
4. The empty set, a point in R^3, a line in R^3, or a plane in R^3
5. It will be interesting to hear the students give their descriptions of a hyperplane.
Group Work 3: Augmenting the Augmented This is a straightforward computational exercise. It can be given to the students to do individually or perhaps in pairs; it does not warrant larger groups. Answers
1. Given at start of Problem 2
2. It is easy to solve; computations could have been made easier by dividing the equations at each stage to produce a leading coefficient of 1.
3. As above, the equations should be divided by their leading coefficients.
Suggested Core Assignment Exercises 2, 5, 13, 17, 22, 26, 29, 35, 40
Group Work 1, Section 2.1 The Lemonade Trick A college student is dissatisfied with her lemonade options, and decides to mix three kinds of lemonade in her quest to find the perfect cup. She mixes some of her father’s sour lemonade (which turns out to be 40% lemon juice, 10% sugar, and 50% water), some store-bought sweet lemonade (10% lemon juice, 40% sugar, and the rest water, preservatives, artificial flavors, etc.) and some of her grandmother’s watery lemonade (10% lemon juice, 10% sugar, 80% water, cherry juice, and a bit of Worcestershire sauce for “color”). When she is finished, she has six liters of lemonade that contains 20% lemon juice and 20% sugar. How many liters of each kind has she mixed?
Group Work 2, Section 2.1 The Shape of the System 1. Describe the solution set to a single linear equation of two variables, such as
2x + 3y = 4
We are not looking for the actual answer — just describe the shape of the set of points. Is it a circle? A triangle? What is it?
2. Give an example of a linear equation in one variable. What is the shape of the solution set?
3. With the above in mind, describe the shape of the solution set to a linear equation in three variables.
4. Now consider a system of three equations in three variables. Describe the possible shapes that the solution
set can have.
5. Try to describe the shape of the solution set to a linear equation in four variables.
Group Work 3, Section 2.1 Augmenting the Augmented Consider the following linear system:
4x − 2y + z = 20
x + y + z = 5
9x + 3y + z = 25
1. Copy this system onto the left side of a piece of paper and write the corresponding augmented matrix on the right side. Now proceed as in the text: transform both the system and the matrix into a triangular pattern.

2. You should end up with something like this:
4x − 2y + z = 20
6y + 3z = 0
5z = 20
[ 4 −2 1 | 20 ]
[ 0  6 3 |  0 ]
[ 0  0 5 | 20 ]
Notice what has happened: we eliminated the first variable (x) from every equation below the first; then we eliminated the second variable (y) from every equation below the second. Let’s continue this process in the reverse direction. Eliminate the third variable from every equation above the third. Then eliminate the second variable from every equation above the second. Remember to continue to perform each elimination operation on the corresponding matrix. Is the resulting linear system easier to solve?

3. An augmented matrix in the form
[ 1 0 0 ··· 0 | c1 ]
[ 0 1 0 ··· 0 | c2 ]
[ 0 0 1 ··· 0 | c3 ]
[ ⋮ ⋮ ⋮  ⋱  ⋮ | ⋮  ]
[ 0 0 0 ··· 1 | cn ]
is said to be in reduced row echelon form. What extra steps are needed to put your matrix into this form?
2.2
Direct Methods for Solving Linear Systems
Suggested Time and Emphasis 1–2 classes. Essential material (systems over Zp optional).
Points to Stress 1. Solving systems using elementary row operations (Gaussian elimination and Gauss-Jordan elimination). 2. Row equivalence, row echelon form and the definition and uniqueness of reduced row echelon form. 3. The rank of a matrix, including the rank theorem. 4. Homogeneous linear systems.
Drill Question Which of the following matrices are in row echelon form? Which are in reduced row echelon form?
(a)
[ 1 4 0 0  0 ]
[ 0 0 1 0 −5 ]
[ 0 0 0 1  π ]
[ 0 0 0 0  0 ]
(b)
[ 1 0 0 0 0 ]
[ 0 0 1 0 0 ]
[ 0 1 0 1 0 ]
[ 0 1 0 0 1 ]
(c)
[ 1 4 3 0  5 ]
[ 0 0 1 2 −5 ]
[ 0 0 0 1  π ]
[ 0 0 0 0  0 ]
Answer (a) and (c) are in row echelon form. (a) is in reduced row echelon form.
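For instructors who want to generate more drill matrices, the definitions can be turned into a checker. This sketch assumes the definition of row echelon form in which leading entries need not equal 1 (zero rows at the bottom, each leading entry strictly to the right of the one above), and the usual extra conditions for reduced row echelon form; the function names are illustrative.

```python
import math

def leading_index(row):
    """Column index of the first nonzero entry, or None for a zero row."""
    return next((j for j, x in enumerate(row) if x != 0), None)

def is_row_echelon(m):
    previous_lead = -1
    seen_zero_row = False
    for row in m:
        lead = leading_index(row)
        if lead is None:
            seen_zero_row = True
        elif seen_zero_row or lead <= previous_lead:
            return False          # nonzero row below a zero row, or lead not to the right
        else:
            previous_lead = lead
    return True

def is_reduced_row_echelon(m):
    if not is_row_echelon(m):
        return False
    for i, row in enumerate(m):
        lead = leading_index(row)
        if lead is None:
            continue
        if row[lead] != 1:
            return False          # leading entry must be 1
        if any(other[lead] != 0 for k, other in enumerate(m) if k != i):
            return False          # rest of the pivot column must be zero
    return True

a = [[1, 4, 0, 0, 0], [0, 0, 1, 0, -5], [0, 0, 0, 1, math.pi], [0, 0, 0, 0, 0]]
b = [[1, 0, 0, 0, 0], [0, 0, 1, 0, 0], [0, 1, 0, 1, 0], [0, 1, 0, 0, 1]]
c = [[1, 4, 3, 0, 5], [0, 0, 1, 2, -5], [0, 0, 0, 1, math.pi], [0, 0, 0, 0, 0]]
for name, matrix in (("a", a), ("b", b), ("c", c)):
    print(name, is_row_echelon(matrix), is_reduced_row_echelon(matrix))
```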
Discussion Question Is it possible for a system of equations to have exactly two solutions? Why or why not? (Hint: look at the geometry in R2 and R3 .)
Test Question If possible, give examples of the following: (a) A matrix in row echelon form that is not in reduced row echelon form. (b) A matrix in reduced row echelon form but not in row echelon form. Answer
(a) Answers will vary. Possible answer:
[ 1 2 ]
[ 0 1 ]
(b) There is no such matrix.
Lecture Notes
• Define overdetermined systems, and give examples such as the following:
2x − y − 3z = −2
x + y + z = 3
x + 2y − 3z = 0
x + 2y + 5z = 8
−x − y + 5z = 3
Answer [x, y, z] = [1, 1, 1]
• For a given linear system, let C denote the coefficient matrix of the system and let A denote the augmented matrix. Discuss the rank of C vs the rank of A. Answer rank C ≤ rank A. If rank C < rank A, the system is inconsistent.
• Start introducing applications, such as the following: A pet shop has 100 animals, consisting of puppies, kittens, and turtles. A puppy costs $30, a kitten costs $20, and a turtle costs $5. If there are twice as many kittens as puppies, and if the stock is worth $1050, how many of each type of animal are there? Answer p + k + t = 100, 30p + 20k + 5t = 1050, and 2p − k = 0. There are 10 puppies, 20 kittens, and 70 turtles.
• For a proof-oriented course, start showing some sample abstract proofs, done in the style the students are expected to do. For example, prove Theorem 1.
• Discuss the geometry of homogeneous systems. For example, ax + by = 0 corresponds to a line through the origin, and ax + by + cz = 0 corresponds to a plane through the origin. So systems of these equations will always have the trivial solution, and cannot have a unique nontrivial solution.
Show that row reduction on the coefficient matrix, as opposed to the augmented matrix, will always suffice when solving a homogeneous system.
Lecture Examples
• A 3 × 3 system with one solution, including the geometry involved:
x + y + z = 5
2x + y − z = 2
x − y + z = 1
Augmented matrix:
[ 1  1  1 | 5 ]
[ 2  1 −1 | 2 ]
[ 1 −1  1 | 1 ]
The matrix in row echelon form:
[ 1 1 1 |  5 ]
[ 0 1 3 |  8 ]
[ 0 0 6 | 12 ]
Solution: {x = 1, y = 2, z = 2}
[Figure: the three planes intersecting in a single point]
• A homogeneous 3 × 3 system with only the trivial solution:
x + y + z = 0
2x + y − z = 0
x − y + z = 0
• A 3 × 3 system with infinitely many solutions:
x + y − z = 0
x − 2y + z = 4
x + 4y − 3z = −4
Tech Tips
• Use a computer to put matrices in reduced row echelon form. Make the matrices bigger and bigger, until it takes over a minute. (Many CAS’s can generate matrices with random entries.)
• Find the reduced row echelon form of
[ a b ]
[ c d ]
using a computer.
Group Work 1: Meet the Identity Matrix This activity illustrates the idea that if A and B are both equivalent to I, then they are equivalent to each other. After this activity, it is possible to start addressing the idea of equivalence classes. The students can try to classify which 2 × 2 matrices are equivalent to I2 and which are not. Answers
1. Answers will vary. It is easiest if they start by swapping rows 1 and 3.
2. Answers will vary. This is an easy way, starting from I3:
R2 + (1/2)R1:
[  1  0 0 ]
[ 1/2 1 0 ]
[  0  0 1 ]
2R2:
[ 1 0 0 ]
[ 1 2 0 ]
[ 0 0 1 ]
R1 + R3:
[ 1 0 1 ]
[ 1 2 0 ]
[ 0 0 1 ]
2R1:
[ 2 0 2 ]
[ 1 2 0 ]
[ 0 0 1 ]
2R3:
[ 2 0 2 ]
[ 1 2 0 ]
[ 0 0 2 ]
3. We can go from A to I3 as in the first part, and then proceed from I3 to B as in the second part.
4. It is not possible. Any sequence of operations that makes c21 = 0 also makes c22 = 0.
Group Work 2: Reduced Row Echelon Form of 2 × 2 Matrices This activity can stand alone or be used with Group Work 1 to get further into the concept that putting matrices in reduced row echelon form is a way of putting them into equivalence classes. Answers
1.
[ 1 0 ]   [ 1 ∗ ]   [ 0 1 ]   [ 0 0 ]
[ 0 1 ] , [ 0 0 ] , [ 0 0 ] , [ 0 0 ]
2. Answers will vary. Possible answers include
[ 1 2 ]     [ 1 0 ]
[ 3 4 ]  →  [ 0 1 ]
and
[ 1 2 ]     [ 1 2 ]
[ 2 4 ]  →  [ 0 0 ]
There is no nonzero matrix that is row equivalent to the zero matrix, and no matrix with all nonzero entries that is row equivalent to the third form, whose first column is zero.
Group Work 3: Literal Translation Is there a relationship between the solutions of the linear system AX = B1 and the solutions of AX = B2? In this activity, we see evidence that these solution sets (provided they exist) are simply translates of each other. In the particular case of our Group Work, we find the solution sets to be parallel lines. Answers
1. (a)
[ 1 0 1/2 | 0 ]
[ 0 1 1/2 | 0 ]
x = −t/2, y = −t/2, z = t
(b) A line in R^3
(c) x = t [−1/2, −1/2, 1] + [0, 0, 0]
2. Answers will vary, but should be in the following form:
(a)
[ 1 0 1/2 | ∗ ]
[ 0 1 1/2 | ∗ ]
x = −t/2 + ∗, y = −t/2 + ∗, z = t
(b) A line in R^3
(c) x = t [−1/2, −1/2, 1] + [∗, ∗, 0]
3. They are all parallel lines.
Suggested Core Assignment Exercises 5, 6, 11, 13, 18, 21, 27, 37, 46, 49, 51P
Group Work 1, Section 2.2 Meet the Identity Matrix The following matrices are called identity matrices:
I2 =
[ 1 0 ]
[ 0 1 ]
I3 =
[ 1 0 0 ]
[ 0 1 0 ]
[ 0 0 1 ]
I4 =
[ 1 0 0 0 ]
[ 0 1 0 0 ]
[ 0 0 1 0 ]
[ 0 0 0 1 ]
1. Go from A =
[ 3 2 −5 ]
[ 3 4 −1 ]
[ 1 1  1 ]
to I3 using elementary row operations.

2. Go from I3 to B =
[ 2 0 2 ]
[ 1 2 0 ]
[ 0 0 2 ]
using elementary row operations.

3. Is it possible to go from A to B using elementary row operations? Either explain why you can’t, or explain why you can, without going through the actual work of doing it.

4. Is it possible to go from C =
[ 3  6 ]
[ 9 18 ]
to I2 using elementary row operations? Either do it or explain why you can’t.
Group Work 2, Section 2.2 Reduced Row Echelon Form of 2 × 2 Matrices 1. Find all possible 2 × 2 matrices that are in reduced row echelon form.

2. Find 2 × 2 matrices with nonzero entries that can be put into the reduced row echelon forms you found in Question 1.
Group Work 3, Section 2.2 Literal Translation Consider the linear system
4x − 2y + z = c1
x + y + z = c2
where c1 and c2 are real numbers. Our goal is to discover a relationship between the solution sets of this system for various values of c1 and c2 . 1. Let’s start with the homogeneous case, that is, the case where c1 = c2 = 0.
(a) Use Gaussian elimination to solve the system.
(b) What is the shape of the solution? Is it a point in R2 ? A point in R3 ? A line in R2 ? A line in R3 ? A plane in R3 ?
(c) Express your answer in the vector form of Section 1.3, that is, x = p + td.
2. Now, let each person in your group choose a different pair of numbers c1 and c2 and repeat the previous
part.
3. Compare your solutions. How are they related to each other? How are they related to the homogeneous
solution?
2.3
Spanning Sets and Linear Independence
Suggested Time and Emphasis 1–1 1/2 classes. Essential material.
Points to Stress 1. Definitions of linear dependence and linear independence. 2. The equivalent definitions of linear dependence, specifically Theorems 2–4. 3. Definition of the span of a set of vectors, including geometric interpretation and spanning sets of R^n, including the standard unit vectors. 4. Theorem 1: A system of linear equations with augmented matrix [A | b] is consistent if and only if b is a linear combination of the columns of A.
Drill Question Consider the following four vectors in R^3. Are they linearly independent? Why or why not?
[1, 2, 3]   [1, 3, 4]   [−1, 0, 2]   [0, 2, −1]
Answer No. Any set of four vectors in R^3 is linearly dependent, since there are more vectors than components.
Discussion Question What methods will determine whether or not [1, 2, 4], [−1, 0, −1], and [−1, 2, 2] are linearly independent?
Sample Test Questions
1. True or False:
(a) All sets of m vectors in R^n are linearly dependent if m < n.
(b) All sets of m vectors in R^n are linearly dependent if m = n.
(c) All sets of m vectors in R^n are linearly dependent if m > n.
2. If possible, give an example of a linearly independent subset of R^4 whose span contains e1 and e4 but fails to span all of R^4.
Answers
1. (a) False (b) False (c) True
2. {e1, e4}
Lecture Notes
• Inform the students that subsets of R^n that are both linearly independent and spanning play an important role in linear algebra. Start by asking for subsets (say of R^3) that are independent, but not spanning; then for subsets that are dependent, but spanning. Now find examples of independent, spanning sets. Do this for various R^n. What is the relationship between the size of these sets and n?
• If you will be assigning any of Exercises 43–47, do one for the class, writing out the detail that the students will be expected to turn in.
• Using the linearly dependent vectors from the discussion question, [1, 2, 4], [−1, 0, −1], and [−1, 2, 2], show that the corresponding homogeneous linear system with augmented matrix [A | 0] has a nontrivial solution.
• Make clear the relationship between linear systems (LS), augmented matrices (AM), and collections of vectors (CV). Note that all vectors are considered to be column vectors. At this point the relation LS ⇔ AM should be clear. Let v1, . . . , vm be a CV from R^n. Let c1, . . . , cn be constants and x1, . . . , xm be variables. By forming the vector c = [c1, . . . , cn], we can create a LS (that is, show that CV ⇒ LS) by attempting to write c as a linear combination of v1, . . . , vm:
x1 v1 + x2 v2 + · · · + xm vm = c
or equivalently
x1 [v11, v21, . . . , vn1] + x2 [v12, v22, . . . , vn2] + · · · + xm [v1m, v2m, . . . , vnm] = [c1, c2, . . . , cn]
or equivalently
v11 x1 + v12 x2 + · · · + v1m xm = c1
v21 x1 + v22 x2 + · · · + v2m xm = c2
⋮
vn1 x1 + vn2 x2 + · · · + vnm xm = cn
which is a linear system.
• To see that every LS gives rise to a linear combination of a CV (that is, LS ⇒ CV), simply start with a generic LS such as
v11 x1 + v12 x2 + · · · + v1m xm = c1
v21 x1 + v22 x2 + · · · + v2m xm = c2
⋮
vn1 x1 + vn2 x2 + · · · + vnm xm = cn
and work the above argument backwards by recognizing that this LS has the equivalent vector representation
x1 [v11, v21, . . . , vn1] + x2 [v12, v22, . . . , vn2] + · · · + xm [v1m, v2m, . . . , vnm] = [c1, c2, . . . , cn]
or x1 v1 + x2 v2 + · · · + xm vm = c
• Show that the vectors (written as columns) u = [1, 2, 3], v = [4, 5, 6], w = [7, 8, 9] do not span R^3 using the following argument:
(a) Let A be the matrix with columns u, v, and w. Show that the linear system with augmented matrix [A | 0] has nontrivial solutions and thus, by Theorem 3, the vectors are dependent.
(b) If one of the vectors can be expressed as a linear combination of the other two, then these three cannot span R^3, since the size of a spanning set of R^3 must be at least 3. Notice that this is kind of neat: these particular vectors happen to be linearly dependent.
Lecture Examples
• Four vectors in $\mathbb{R}^4$ that are linearly independent:
[1, 3, −1, 2], [3, 2, 0, −1], [2, 1, 1, 1], [−4, 2, 1, 0]
• Four vectors in $\mathbb{R}^4$ that are not linearly independent (the fourth is the first, plus twice the second, minus three times the third):
[1, 3, −1, 2], [3, 2, 0, −1], [2, 1, 1, 1], [1, 4, −4, −3]
Tech Tip This is a good place to introduce the reduced row echelon form command of the CAS. To determine if a collection of vectors is independent, define a matrix whose rows are the vectors and then apply the reduced row echelon form command. A zero row will result if and only if the vectors are dependent.
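For instructors without a CAS at hand, the row-reduction test described in this tip can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `rref` and `is_independent` are made up for this sketch, and exact fractions are used so that no rounding can hide a zero row. The two test sets are the lecture examples above.

```python
from fractions import Fraction

def rref(rows):
    """Return the reduced row echelon form of a matrix given as a list of rows."""
    m = [[Fraction(x) for x in row] for row in rows]
    n_cols = len(m[0]) if m else 0
    pivot_row = 0
    for col in range(n_cols):
        if pivot_row == len(m):
            break
        # Find a row at or below pivot_row with a nonzero entry in this column.
        pivot = next((r for r in range(pivot_row, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[pivot_row], m[pivot] = m[pivot], m[pivot_row]
        lead = m[pivot_row][col]
        m[pivot_row] = [x / lead for x in m[pivot_row]]
        # Clear the rest of the column.
        for r in range(len(m)):
            if r != pivot_row and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[pivot_row])]
        pivot_row += 1
    return m

def is_independent(vectors):
    """Vectors (as rows) are independent exactly when the RREF has no zero row."""
    return all(any(x != 0 for x in row) for row in rref(vectors))

independent_set = [[1, 3, -1, 2], [3, 2, 0, -1], [2, 1, 1, 1], [-4, 2, 1, 0]]
dependent_set = [[1, 3, -1, 2], [3, 2, 0, -1], [2, 1, 1, 1], [1, 4, -4, -3]]
print(is_independent(independent_set))
print(is_independent(dependent_set))
```

Students can compare the output with what their CAS reports for the same two sets.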
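Group Work 1: Ones and Zeros
Question 3 is harder than it looks: of the fifteen possible sets of four, twelve are linearly independent, so most first attempts will fail. Don't tell the students this right away.
Answers
1., 2. Answers will vary. Call two of these vectors complementary if they sum to [1, 1, 1, 1]. Any set of four that is not made up of two complementary pairs is linearly independent.
3. Any set made up of two complementary pairs works. For example, [1, 1, 0, 0] + [0, 0, 1, 1] = [1, 0, 1, 0] + [0, 1, 0, 1], so these four vectors are linearly dependent.
4. Answers will vary. A correct conjecture: a set of four such vectors is linearly dependent exactly when it consists of two complementary pairs.
5. There are exactly four vectors with one 0 and three 1s, and this set is linearly independent, so no dependent set of four exists.
6. Dependent sets of four are easy to find here. For example, this is a linearly dependent set:
\[ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 1 \\ 1 \\ 0 \end{bmatrix} \]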
Group Work 2: Four Cases
After the students have worked on Problems 3 and 4 for a while, allow the possibility that “impossible” is an acceptable answer. If a group finishes early, have them try to prove their answers.
Answers
1. Answers will vary. The identity matrix works.
2. Answers will vary. Example: $\begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}$
3., 4. There are no such matrices.
Group Work 3: Ranking Your Independence
Answers
3. Students can do this by counting the nonzero rows of the reduced row echelon form of each matrix.
4. This should be the same answer as above, with no work needed.
5. The rank of a matrix is equal to the rank of its transpose.
Suggested Core Assignment Exercises 4, 5, 10, 16, 18P , 20P , 25, 28, 35, 38, 48P
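Group Work 1, Section 2.3 Ones and Zeros
Consider the set of all vectors in $\mathbb{R}^4$ with two 0s and two 1s:
\[ \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 0 \\ 1 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \quad \begin{bmatrix} 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \quad \begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix} \]
1. Find a set of four of these vectors that is linearly independent.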
2. Find a different such set.
3. Hey, this is pretty easy. Find a set of four that is linearly dependent.
4. Make a conjecture about vectors in R4 with two 0s and two 1s.
5. Does your conjecture work with the set of vectors in R4 with one 0 and three 1s?
6. Does your conjecture work with the set of all vectors in R4 consisting of only 0s and 1s?
Group Work 2, Section 2.3 Four Cases 1. Create a 3 × 3 matrix whose rows are linearly independent and whose columns are linearly independent.
2. Create a 3 × 3 matrix whose rows are linearly dependent and whose columns are linearly dependent.
3. Create a 3 × 3 matrix whose rows are linearly independent and whose columns are linearly dependent.
4. Create a 3 × 3 matrix whose rows are linearly dependent and whose columns are linearly independent.
Group Work 3, Section 2.3 Ranking Your Independence
A matrix can be viewed as a collection of row vectors or as a collection of column vectors. We now investigate, at least in one case, the extent to which the interpretation matters.
1. Within your group, choose between 3 and 8 vectors from $\mathbb{R}^5$. (Be nice and use integer entries.) Now swap your set of vectors with another group.
2. Form two matrices from the given set: one matrix formed by listing the vectors as rows, the other matrix formed by listing the vectors as columns.
3. Find the number of linearly independent rows of each matrix.
4. Find the rank of each matrix.
5. Draw a conclusion.
2.4 Applications
Suggested Time and Emphasis 1 class. Optional material.
Points to Stress We recommend going through one or two of the applications below in depth, as opposed to going through all of them lightly. 1. Allocation of resources 2. Balancing of chemical equations 3. Network analysis 4. Finite linear games
Drill Question Describe one of the applications in the text.
Discussion Question Have you seen any of these applications in your other classes? Do you think these techniques may help you in the future?
Test Question
Find a value of k such that the system
\[ \begin{aligned} x + 2y + 3z &= 0 \\ 3x + 6y + kz &= 0 \\ x + 4z &= 0 \end{aligned} \]
has at least one nontrivial solution.
Answer
k = 9
Lecture Notes
• Exercises 41–44 involve partial fraction decomposition. If your students have had calculus before, this is an excellent application to discuss with them. Start with the general problem of trying to integrate $\dfrac{3x + 1}{x^2 + 2x - 3}$ (see Exercise 41) or even $\dfrac{x^3 + x + 1}{x (x - 1) (x^2 + x + 1) (x^2 + 1)^3}$ (see Exercise 44) and then go into the details of partial fraction decomposition, emphasizing that it winds up being a linear algebra problem.
• Discuss the idea of mutual funds: a security that allows an investor to own a little bit of many companies, as opposed to a straight stock purchase, which is all from one company. Point out that if one is very particular about what one wants to own, choosing mutual funds becomes a linear algebra problem. Present the following (simplified, perhaps contrived) problem:
There are three mutual funds. For every share of Doug’s Friendly Fund you buy, you get one share of GE, one share of Disney, and one share of Exxon. For every share of Mike’s Marvy Fund, you get two shares of GE and one share of Exxon. And for every share of Joe’s Jolly Fund you buy you get three shares of Disney. Assume that we want to wind up with 400 shares of GE, 700 shares of Disney, and 250 shares of Exxon. How much of each fund must we buy?
Answer
100 shares of Doug’s Friendly Fund, 150 shares of Mike’s Marvy Fund, and 200 shares of Joe’s Jolly Fund.
• Have the students play a game with coffee cups. Take six cups. A move consists of flipping two adjacent cups, or of flipping the last cup without flipping any others. The first player chooses an ending position and a starting position, and the second player has to get from one to the other. After letting them play for a while, discuss Example 7 with the students, and let them try to solve their games.
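The mutual fund problem above can be checked mechanically. The sketch below sets it up as a 3 × 3 system, with one unknown per fund and one equation per stock, and solves it by Gauss-Jordan elimination in exact arithmetic. The helper name `solve` and the variable names are assumptions made for this illustration, and the solver assumes the system has a unique solution.

```python
from fractions import Fraction

def solve(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination.

    Assumes the coefficient matrix is invertible.
    """
    n = len(a)
    m = [[Fraction(x) for x in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        lead = m[col][col]
        m[col] = [x / lead for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[-1] for row in m]

# Columns: Doug's, Mike's, Joe's fund. Rows: GE, Disney, Exxon.
holdings = [[1, 2, 0],
            [1, 0, 3],
            [1, 1, 0]]
target = [400, 700, 250]
shares = solve(holdings, target)
print([int(s) for s in shares])
```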
Lecture Examples
• An example of balancing chemical equations: lead(IV) oxide plus hydrochloric acid yields lead chloride, plus chlorine and water:
\[ \mathrm{PbO_2} + \mathrm{HCl} \longrightarrow \mathrm{PbCl_2} + \mathrm{Cl_2} + \mathrm{H_2O} \]
Answer
\[ \mathrm{PbO_2} + 4\,\mathrm{HCl} \longrightarrow \mathrm{PbCl_2} + \mathrm{Cl_2} + 2\,\mathrm{H_2O} \]
• Exercise 22 is a very good, and practically important, electrical networks problem.
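To connect the chemical equation to a homogeneous linear system, one possible sketch is to record how many atoms of each element each compound contains, with reactants counted positively and products negatively. A balanced equation is then a vector in the null space of that matrix. The layout and names below are assumptions chosen for this illustration.

```python
# Rows: Pb, O, H, Cl. Columns: PbO2, HCl, PbCl2, Cl2, H2O.
# Reactant columns are positive and product columns are negative, so a
# balanced equation is a vector x with composition * x = 0.
composition = [
    [1, 0, -1,  0,  0],  # Pb
    [2, 0,  0,  0, -1],  # O
    [0, 1,  0,  0, -2],  # H
    [0, 1, -2, -2,  0],  # Cl
]
coefficients = [1, 4, 1, 1, 2]  # PbO2 + 4 HCl -> PbCl2 + Cl2 + 2 H2O

residual = [sum(a * x for a, x in zip(row, coefficients)) for row in composition]
print(residual)
```

A residual of all zeros confirms that every element is balanced.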
Tech Tip Give the students some large, nontrivial circuits or chemical equations to solve. Perhaps have them find actual ones on the Internet. Have them solve them and cite their sources.
Group Work: Curve Fitting
Problem 2 of this activity assumes that the students can use a calculator to put matrices in reduced row echelon form.
Answers
1. $f(x) = 4x^2 + 2x + 1$
2. $g(x) \approx -15.51 \sin x + 15.052x + 5$
[Figures: graphs of y = f(x) and y = g(x) for −2 ≤ x ≤ 2]
3. There are infinitely many cubic functions that fit these points. Any function of the form
\[ \frac{t - 1}{2} x^3 + (5 - t) x^2 + \frac{5 - t}{2} x + t \]
passes through all three points.
[Figure: graph of one such cubic for −2 ≤ x ≤ 2]
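The first two answers can be reproduced without a calculator's matrix functions. The sketch below builds one linear equation per data point and solves for the three unknown coefficients, once with basis functions x², x, 1 and once with sin x, x, 1. The helper `solve` is a hypothetical name for this illustration and uses floating-point Gauss-Jordan elimination with partial pivoting.

```python
import math

def solve(a, b):
    """Gauss-Jordan elimination with partial pivoting on floats."""
    n = len(a)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        lead = m[col][col]
        m[col] = [x / lead for x in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [row[-1] for row in m]

points = [(1, 7), (-1, 3), (2, 21)]
ys = [y for _, y in points]

# Problem 1: unknowns a, b, c in a x^2 + b x + c.
quad = solve([[x ** 2, x, 1] for x, _ in points], ys)
# Problem 2: unknowns a, b, c in a sin x + b x + c.
sine = solve([[math.sin(x), x, 1] for x, _ in points], ys)

print([round(v, 3) for v in quad])
print([round(v, 3) for v in sine])
```

For Problem 3 the same setup has four unknowns but only three equations, which is why a free parameter appears in the answer.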
Suggested Core Assignment Exercises 1, 4, 8, 11, 16, 18, 20, 26P , 34
Group Work, Section 2.4 Curve Fitting
1. A quadratic function is a function of the form $f(x) = ax^2 + bx + c$. Assume we know that f(1) = 7, f(−1) = 3, and f(2) = 21. Find a, b, and c. Graph the function to check your answer.
2. What you have just done is taken the points (1, 7), (−1, 3), and (2, 21), and fit a curve (a parabola) to those points. But a parabola isn’t the only curve out there. What if we wanted to fit them to a strange curve? Let $g(x) = a \sin x + bx + c$ and find a, b, and c such that g goes through all three points. Again, graph the function to make sure your answer is correct.
3. Assume that you want to fit the points to a cubic function. What happens?
2.5 Iterative Methods for Solving Linear Systems
Suggested Time and Emphasis ½–1 class. Recommended material.
Points to Stress 1. Jacobi’s method for solving linear systems. 2. The Gauss-Seidel method for solving linear systems. 3. The concepts of iterates converging to a solution or diverging.
Drill Question Suppose we have a linear system involving unknowns x1 , . . . , xn . Is it possible that the Jacobi Method converges to a vector [a1 , . . . , an ], but x1 = a1 , . . . , xn = an fails to be a solution to the linear system? Answer
No. See Theorem 2.
Discussion Question In the two equation, two variable case, the Gauss-Seidel method can be carried out using a graphical approach on two intersecting lines. Suppose one of the lines is x2 = x1 . Pick a point on this line and draw a second line through this point. Will the graphical approach work? That is, will the graphical approach converge to the point of intersection? Do this for several second lines. What property must the second line possess in order for the graphical approach to converge? Answer
The second line must have a slope between −1 and 1.
Test Question
Give examples of one 4 × 4 matrix that is strictly diagonally dominant, and one that is not.
Answers (will vary)
Diagonally dominant: $\begin{bmatrix} 100 & 1 & 1 & 1 \\ 1 & 100 & 1 & 1 \\ 1 & 1 & 100 & 1 \\ 1 & 1 & 1 & 100 \end{bmatrix}$. Not diagonally dominant: $\begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \end{bmatrix}$.
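When grading answers that "will vary", a quick check of the definition can help. A minimal sketch follows; the function name is an assumption for this illustration. It tests whether each diagonal entry is larger in absolute value than the sum of the absolute values of the other entries in its row.

```python
def is_strictly_diagonally_dominant(a):
    """True if |a[i][i]| exceeds the sum of |a[i][j]| over j != i in every row."""
    return all(
        abs(row[i]) > sum(abs(x) for j, x in enumerate(row) if j != i)
        for i, row in enumerate(a)
    )

dominant = [[100, 1, 1, 1], [1, 100, 1, 1], [1, 1, 100, 1], [1, 1, 1, 100]]
not_dominant = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
print(is_strictly_diagonally_dominant(dominant))
print(is_strictly_diagonally_dominant(not_dominant))
```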
Lecture Notes
• Repeat the discussion question above, this time with an arbitrary first line. Try to come up with a relationship between the two lines that will guarantee convergence.
• Go through the details of Example 3, making sure to cover how a linear system is derived.
• There are good reasons that we are using the Gauss-Seidel (GS) method on a linear system and not on a nonlinear system. Suppose we are looking for the solution to a linear system with two variables $x_1$ and $x_2$ and two equations. For i ∈ {1, 2}, let $x_i^*$ denote the value of $x_i$ that solves the system. The GS method asks us to solve for each of the unknowns in terms of the other; thus we obtain $x_1$ as a function of $x_2$ [$x_1 = x_1(x_2)$] and vice versa [$x_2 = x_2(x_1)$]. Let $x_i^k$ denote the kth approximation to $x_i^*$. Starting with an initial guess of $x_1^0 = x_0$, the GS method produces approximations like this:
\[ x_2^1 = x_2(x_1^0), \qquad x_1^1 = x_1(x_2^1), \qquad x_2^2 = x_2(x_1^1), \qquad \ldots \]
So, for example, the sixth approximation to $x_2^*$ is
\[ x_2^6 = x_2(x_1^5) = x_2(x_1(x_2(x_1(x_2(x_1(x_2(x_1(x_2(x_1(x_2(x_0))))))))))) \]
Now suppose that $x_1(x_2) = x_2$ (that is, $x_1$ is the identity function) and $x_2(x_1) = x_1^2 - 2$. In this case, the sixth approximation to $x_2^*$ is
\[ x_2^6 = x_2(x_2(x_2(x_2(x_2(x_2(x_0)))))) \]
Graph $x_1$ and $x_2$ and notice the two points of intersection. Now use the GS method to try to find one of these points of intersection. Notice what happens when a starting value of $x_0 = 1.2$ is used: the iterations appear to wander about in a chaotic manner. The picture below shows the first 100 iterations of the graphical representation of the Gauss-Seidel method in this case.
[Figure: the first 100 Gauss-Seidel iterations drawn between the curves $y = x^2 - 2$ and $y = x$]
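The wandering behavior is easy to reproduce numerically. Since $x_1$ is the identity function here, each full Gauss-Seidel pass amounts to replacing x by x² − 2. The sketch below, with a made-up function name, records the first 100 values starting from 1.2; printing a handful shows that they do not settle toward either intersection point.

```python
def gauss_seidel_orbit(x0, steps):
    """Iterate x -> x*x - 2, the composite update when x1(x2) = x2 and x2(x1) = x1*x1 - 2."""
    orbit = [x0]
    for _ in range(steps):
        orbit.append(orbit[-1] ** 2 - 2)
    return orbit

orbit = gauss_seidel_orbit(1.2, 100)
print([round(v, 4) for v in orbit[:8]])
print(min(orbit), max(orbit))
```

Students can try other starting values, including the intersection points themselves, and compare.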
Lecture Examples
• Two iterates of the Jacobi method, starting with x = 0, on the system
\[ \begin{aligned} 3x_1 - x_2 + x_3 &= 1 \\ 3x_1 + 6x_2 + 2x_3 &= 0 \\ 3x_1 + 3x_2 + 7x_3 &= 4 \end{aligned} \]
After two iterates, we have $x_1 = 0.1428$, $x_2 = -0.3571$, $x_3 = 0.4285$. The exact solution is $x_1 = \frac{2}{57}$, $x_2 = -\frac{9}{38}$, $x_3 = \frac{25}{38}$.
• Two iterates of the Gauss-Seidel method on the system above, starting with $x^0 = 0$:
\[ \begin{aligned} 3x_1 - x_2 + x_3 &= 1 \\ 3x_1 + 6x_2 + 2x_3 &= 0 \\ 3x_1 + 3x_2 + 7x_3 &= 4 \end{aligned} \]
After two iterates: $x_1^2 = 0.1111$, $x_2^2 = -0.2222$, $x_3^2 = 0.6190$
Actual answer: $x_1 = \frac{2}{57}$, $x_2 = -\frac{9}{38}$, $x_3 = \frac{25}{38}$
• Two iterates of the Gauss-Seidel method, starting with x = 0, on the system
\[ \begin{aligned} 10x_1 - x_2 &= 9 \\ -x_1 + 10x_2 - 2x_3 &= 7 \\ -2x_2 + 10x_3 &= 6 \end{aligned} \]
After two iterates, we have $x_1 = 0.979$, $x_2 = 0.9495$, $x_3 = 0.7899$. The exact solution is $x_1 = \frac{473}{475}$, $x_2 = \frac{91}{95}$, $x_3 = \frac{376}{475}$.
Note that both of the systems above give rise to strictly diagonally dominant matrices.
Tech Tip This is an excellent place to introduce the idea of writing simple programs. Maple, Mathematica, Matlab, and CAS calculators all have programming capability. Students should, from scratch, write programs to use the Gauss-Seidel method.
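As a reference point for what a student program might look like, here is one possible sketch in Python rather than in a CAS language. The function names are assumptions for this illustration. The only difference between the two methods is whether a newly computed component is used immediately (Gauss-Seidel) or held back until the next sweep (Jacobi). Exact fractions are used so the iterates can be compared with the lecture examples.

```python
from fractions import Fraction

def jacobi_step(a, b, x):
    """One Jacobi sweep: every component uses only the previous iterate x."""
    n = len(a)
    return [
        (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
        for i in range(n)
    ]

def gauss_seidel_step(a, b, x):
    """One Gauss-Seidel sweep: each new component is used as soon as it is computed."""
    x = list(x)
    n = len(a)
    for i in range(n):
        x[i] = (b[i] - sum(a[i][j] * x[j] for j in range(n) if j != i)) / a[i][i]
    return x

def iterate(step, a, b, count):
    """Apply `step` count times, starting from the zero vector."""
    a = [[Fraction(v) for v in row] for row in a]
    b = [Fraction(v) for v in b]
    x = [Fraction(0)] * len(b)
    for _ in range(count):
        x = step(a, b, x)
    return x

a = [[3, -1, 1], [3, 6, 2], [3, 3, 7]]
b = [1, 0, 4]
print([float(v) for v in iterate(jacobi_step, a, b, 2)])
print([float(v) for v in iterate(gauss_seidel_step, a, b, 2)])
```

A natural extension for students is a stopping rule based on how much successive iterates differ.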
Group Work 1: Another Form of Jacobi’s Method
This activity demonstrates, through a specific example, how to express the kth iterate of the Jacobi Method in terms of the previous iterate. The expression involves matrix multiplication and vector addition.
Answers
1. $x_1 = \frac{1}{10} x_2 + \frac{9}{10}$, $x_2 = \frac{1}{10} x_1 + \frac{2}{5} x_3 + \frac{7}{10}$, $x_3 = \frac{2}{5} x_2 + \frac{3}{5}$
2. $T = \begin{bmatrix} 0 & \frac{1}{10} & 0 \\ \frac{1}{10} & 0 & \frac{2}{5} \\ 0 & \frac{2}{5} & 0 \end{bmatrix}$, $c = \begin{bmatrix} \frac{9}{10} \\ \frac{7}{10} \\ \frac{3}{5} \end{bmatrix}$
3. After 7 iterates we find $x_1 = 1.02399$, $x_2 = 1.24051$, $x_3 = 1.09597$.
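For instructors who want to show where T and c come from in general, the following short derivation may help. It assumes a square system $Ax = b$ with nonzero diagonal entries; the symbols D and R are introduced here only for this sketch.

```latex
% Split A into its diagonal part D and the remaining off-diagonal part R.
A = D + R, \qquad D = \operatorname{diag}(a_{11}, \ldots, a_{nn})

% Jacobi solves the i-th equation for x_i using the previous iterate:
D\,x^{(k)} = b - R\,x^{(k-1)}
\quad\Longrightarrow\quad
x^{(k)} = T\,x^{(k-1)} + c,
\qquad T = -D^{-1}R, \quad c = D^{-1}b

% Entrywise:
T_{ij} = -\frac{a_{ij}}{a_{ii}} \ (i \neq j), \qquad T_{ii} = 0, \qquad c_i = \frac{b_i}{a_{ii}}
```

In this form each Jacobi iterate is one matrix multiplication followed by one vector addition, which is the point of the activity.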
Group Work 2: Abject Failure If the coefficient matrix of a linear system fails to be strictly diagonally dominant then the iterative methods of this section could fail to converge to the solution of the linear system. In this activity, both the Jacobi Method and Gauss-Seidel Method will fail to converge. Gaussian elimination is then used to reveal the solution. Answer
Students will find that both the Jacobi Method and Gauss-Seidel Method will fail to converge. Using Gaussian elimination, we find x1 = 2, x2 = −3, x3 = 1.
Suggested Core Assignment Exercises 5, 11, 18, 20 , 23, 28
Group Work 1, Section 2.5 Another Form of Jacobi’s Method
1. Consider Exercise 25. The goal is to determine, within an accuracy of 0.001, the values of $t_1, t_2, \ldots, t_6$. From the given diagram, set up the associated augmented matrix A.
2. Here is a notational convenience: let $t_j^k$ denote the kth iterate for $t_j$. Some members of the group should proceed to find iterates $t_1^k, \ldots, t_6^k$, where k = 0, 1, 2, . . ., using the method of Example 1 (use $t_1^0 = t_2^0 = \cdots = t_6^0 = 0$ as the starting point). The other members of the group should find the iterates in the following way: it can be shown that $t_i^k$ can be solved for explicitly in terms of the previous iterate, elements of A, and elements of $[\,b_1 \; b_2 \; \cdots \; b_6\,]$. Specifically, for k = 1, 2, . . ., we have
\[ t_i^k = \frac{\displaystyle\sum_{j=1,\; j \neq i}^{6} \left(-a_{ij}\, t_j^{k-1}\right) + b_i}{a_{ii}} \]
3. Compare the two approaches and determine which would be easier to implement (on a computer, for example).
Group Work 2, Section 2.5 Abject Failure Try both the Jacobi method and the Gauss-Seidel method on the following system. Then solve the system using Gaussian elimination. How do these techniques compare? x1 − 5x2 − x3 = 16 6x1 − x2 − 2x3 = 13 7x1 + x2 + x3 = 12
3 Matrices
3.1 Matrix Operations
Suggested Time and Emphasis 1–2 classes. Essential material.
Points to Stress 1. Definition of a matrix and various special cases (square, diagonal and identity). 2. Matrix equality and arithmetic (addition, subtraction, scalar and matrix multiplication). 3. Matrix-column representation of multiplication. 4. Block partitioning of matrices.
Drill Question If A is a 3×5 matrix and B is a 5×7 matrix, is the product matrix AB defined? If so, what are its dimensions? If not, why not? Answer
Yes, 3 × 7
Discussion Question Suppose v1 , v2 , and v3 are column vectors in R3 , and A is a 3 × 3 matrix. Consider the three vectors wi = Avi for i = 1, 2, 3. Give the conditions under which the vectors wi are linearly independent. Answer
The only way this can happen is for A to have linearly independent columns (or rows) and for the set of vectors {vi } to be linearly independent.
Test Question
Find a nonzero 2 × 2 matrix A such that $A^2 = 0$.
Sample Answer $\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$
Lecture Notes
• Make a point about the relative sizes of matrices and the operations covered in this section: addition and subtraction require matrices of identical size, and addition is commutative; multiplication requires the number of columns of the left factor to match the number of rows of the right factor and, as such, this operation is not necessarily commutative. (There are matrices such that AB = BA, but this is the exception, rather than the rule.)
• Do an example such as Example 8 to reinforce the comment made in the text: a solution to Ax = b allows the column vector b to be expressed as a linear combination of the columns of A; that is, a solution guarantees that b is in the span of A’s columns. Point out that if the columns of A are linearly independent then there can exist at most one solution to Ax = b (regardless of b). • To contrast with the matrix column representation, discuss the outer product expansion for matrix multiplication. This representation will come up in the discussion of later topics. • Discuss and illustrate how correct block partitioning can speed up matrix multiplication. This process is demonstrated in Group Work 2. • Compute the rank of a nonsquare matrix. Compute the rank of its transpose, and note that it does not change. • Discuss the special case of multiplication of an arbitrary matrix by a diagonal matrix.
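The matrix-column representation and the outer product expansion mentioned in these notes can be compared side by side on a small example. The sketch below uses the two 3 × 3 matrices from this section's lecture examples; the helper names are assumptions for this illustration. It computes AB three ways: entry by entry, column by column, and as a sum of outer products.

```python
def matmul(a, b):
    """Standard row-by-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def column(m, j):
    return [row[j] for row in m]

def outer(u, v):
    """Outer product of a column u and a row v."""
    return [[x * y for y in v] for x in u]

def add(m1, m2):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

a = [[-6, 0, 5], [-10, 1, 1], [-3, -10, 5]]
b = [[-4, -8, -1], [-10, 2, 3], [-1, -9, 8]]
product = matmul(a, b)

# Matrix-column representation: column j of AB is A times column j of B.
cols = [column(matmul(a, [[x] for x in column(b, j)]), 0) for j in range(3)]
by_columns = [list(row) for row in zip(*cols)]  # reassemble columns into a matrix

# Outer product expansion: AB is the sum over k of (column k of A)(row k of B).
by_outer = [[0] * 3 for _ in range(3)]
for k in range(3):
    by_outer = add(by_outer, outer(column(a, k), b[k]))

print(product)
print(by_columns == product, by_outer == product)
```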
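Lecture Examples
• Matrix addition:
\[ \begin{bmatrix} -6 & 0 & 5 \\ -10 & 1 & 1 \\ -3 & -10 & 5 \end{bmatrix} + \begin{bmatrix} -4 & -8 & -1 \\ -10 & 2 & 3 \\ -1 & -9 & 8 \end{bmatrix} = \begin{bmatrix} -10 & -8 & 4 \\ -20 & 3 & 4 \\ -4 & -19 & 13 \end{bmatrix} \]
• Matrix scalar multiplication:
\[ -3 \begin{bmatrix} -6 & 0 & 5 \\ -10 & 1 & 1 \\ -3 & -10 & 5 \end{bmatrix} = \begin{bmatrix} 18 & 0 & -15 \\ 30 & -3 & -3 \\ 9 & 30 & -15 \end{bmatrix} \]
• Matrix multiplication:
\[ \begin{bmatrix} -6 & 0 & 5 \\ -10 & 1 & 1 \\ -3 & -10 & 5 \end{bmatrix} \begin{bmatrix} -4 & -8 & -1 \\ -10 & 2 & 3 \\ -1 & -9 & 8 \end{bmatrix} = \begin{bmatrix} 19 & 3 & 46 \\ 29 & 73 & 21 \\ 107 & -41 & 13 \end{bmatrix} \]
• Matrix multiplication:
\[ \begin{bmatrix} -6 & 0 & 5 & 1 \end{bmatrix} \begin{bmatrix} -10 \\ 1 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 65 \end{bmatrix} \]
• Matrix multiplication:
\[ \begin{bmatrix} -10 \\ 1 \\ 1 \\ 0 \end{bmatrix} \begin{bmatrix} -6 & 0 & 5 & 1 \end{bmatrix} = \begin{bmatrix} 60 & 0 & -50 & -10 \\ -6 & 0 & 5 & 1 \\ -6 & 0 & 5 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix} \]
• Matrix transpose:
\[ \begin{bmatrix} -6 & 0 & 5 \\ -10 & 1 & 1 \\ -3 & -10 & 5 \end{bmatrix}^{T} = \begin{bmatrix} -6 & -10 & -3 \\ 0 & 1 & -10 \\ 5 & 1 & 5 \end{bmatrix} \]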
Tech Tip Use a CAS to set up and solve the system of equations that results from solving Problem 5 in Group Work 1.
Group Work 1: Does Not Commute
There is an important bit of logic that may go over the students’ heads. When we say addition is commutative, we mean that A + B = B + A regardless of A and B. When we say that multiplication is not commutative, we mean that it is not necessarily the case that AB = BA, but we do not mean that this is never the case. For example, the students have already seen that the identity matrix commutes with square matrices of appropriate size. Problem 5 may be reserved for students with a CAS.
Answers
1. $AB = \begin{bmatrix} 19 & 13 \\ 8 & 2 \end{bmatrix}$, $BA = \begin{bmatrix} 15 & 13 \\ 12 & 6 \end{bmatrix}$
2. x = 1
3. There is no such value.
4. x = 5, y = 2
5. Yes, provided $b_{12} \neq 0$: take $y = a_{12} b_{21} / b_{12}$ and $x = a_{22} + a_{12} (b_{11} - b_{22}) / b_{12}$. If $b_{12} = 0$, such x and y need not exist.
Group Work 2: Offensive Blocking
This activity will allow students to experiment with the concept of block matrices, as discussed in the text. Emphasize, either before or after they have worked through the activity, that this technique is useful only when it makes the computation easier. For example, if
\[ A = \begin{bmatrix} 2 & 5 & 1 \\ 6 & 9 & -2 \\ 5 & 2 & -23 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 0.2 & 0.235 & 0.23 \\ 1.6 & 9.23 & -2.4 \\ 2.212 & 2.221 & -2.3 \end{bmatrix} \]
then there is no advantage to be gained from breaking A and B up into blocks. Students may wonder why it is advantageous to make computations simpler, given that computers can multiply matrices together. It turns out that, even for a computer, multiplying large matrices together is a very slow process. In some applications, a computer program may have to multiply many large matrices together in a very short time. If the programmer uses knowledge of block matrices, the computer will be able to multiply the matrices in a reasonable length of time.
Answers
1. $\begin{bmatrix} 2 & 6 & 3 \\ 7 & 3 & 2 \end{bmatrix}$
2. Same answer, but students should have gotten it more quickly by blocking the matrices like so:
\[ A = \left[\begin{array}{cc|cc} 1 & 0 & 3 & 2 \\ 0 & 1 & -1 & 1 \end{array}\right] \qquad B = \left[\begin{array}{ccc} 2 & 3 & 1 \\ 7 & 4 & 1 \\ \hline 0 & 1 & 0 \\ 0 & 0 & 1 \end{array}\right] \]
3. $\begin{bmatrix} 27 & 20 & 0 & 0 & 0 & 0 \\ 54 & 40 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix}$
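One way to see the saving in Problem 2 is to write A as [I | C] and B as D stacked on E, so that AB = D + CE and the identity block costs nothing. The sketch below checks this against the ordinary product. The block names C, D, E and the helper names are assumptions made for this illustration, and this particular partition is just one reasonable choice.

```python
def matmul(a, b):
    """Standard row-by-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def add(m1, m2):
    return [[x + y for x, y in zip(r1, r2)] for r1, r2 in zip(m1, m2)]

A = [[1, 0, 3, 2],
     [0, 1, -1, 1]]
B = [[2, 3, 1],
     [7, 4, 1],
     [0, 1, 0],
     [0, 0, 1]]

# Partition A = [I | C] (columns 0-1 and 2-3) and B = [D; E] (rows 0-1 and 2-3).
C = [row[2:] for row in A]
D = B[:2]
E = B[2:]

# Block multiplication: AB = I*D + C*E = D + C*E.
blocked = add(D, matmul(C, E))
print(blocked)
```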
Suggested Core Assignment Exercises 1, 6, 7, 8, 19, 20, 24, 30P , 33, 38P
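Group Work 1, Section 3.1 Does Not Commute
1. Show that the two matrices
\[ A = \begin{bmatrix} 3 & 1 \\ 0 & 2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 5 & 4 \\ 4 & 1 \end{bmatrix} \]
do not commute; i.e., AB ≠ BA.
2. Find x such that the following matrices commute:
\[ A = \begin{bmatrix} 3 & 1 \\ x & 2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 5 & 4 \\ 4 & 1 \end{bmatrix} \]
3. Find x such that the following matrices commute:
\[ A = \begin{bmatrix} 4 & 1 \\ x & 2 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 5 & 4 \\ 4 & 1 \end{bmatrix} \]
4. Find x and y such that the following matrices commute:
\[ A = \begin{bmatrix} x & 4 \\ y & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 3 & 2 \\ 1 & 1 \end{bmatrix} \]
5. If $a_{ij}$ and $b_{st}$ are constants, is it always possible to find x and y such that
\[ A = \begin{bmatrix} x & a_{12} \\ y & a_{22} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \]
commute?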
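Group Work 2, Section 3.1 Offensive Blocking
Consider the following matrices:
\[ A = \begin{bmatrix} 1 & 0 & 3 & 2 \\ 0 & 1 & -1 & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 2 & 3 & 1 \\ 7 & 4 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \]
1. Compute AB using standard matrix multiplication.
2. Try to partition both matrices into blocks, so that computing AB using block multiplication is faster and easier.
3. Now use blocking to find AB if
\[ A = \begin{bmatrix} 2 & 3 & 0 & 0 & 0 & 0 \\ 4 & 6 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} 6 & -2 & 0 & 0 & 0 & 0 \\ 5 & 8 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} \]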
3.2 Matrix Algebra
Suggested Time and Emphasis 1 class. Essential material.
Points to Stress
1. Algebraic properties of addition and scalar multiplication (Theorem 1).
2. The extension of previously introduced concepts such as span, linear combination and linear independence.
3. Properties that matrix multiplication possesses (Theorem 2) and does not possess (namely, commutativity).
4. Properties of the transpose.
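Drill Question
Describe the span of the matrices $\begin{bmatrix} 2 & 0 \\ 0 & 0 \end{bmatrix}$ and $\begin{bmatrix} 0 & 3 \\ 0 & 0 \end{bmatrix}$.
Answer
All matrices of the form $\begin{bmatrix} * & * \\ 0 & 0 \end{bmatrix}$.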
Discussion Question We can think of a vector as a matrix with one of the dimensions equal to 1. Now we can extend vector addition to matrix addition. What other aspects of our work with vectors can now be extended to matrices? Answer
The discussion may touch on ideas of linear combinations, linear independence, scalar multiplication, and so forth.
Test Question
Find a 3 × 3 matrix A ≠ I such that $A \cdot A^T = I$.
Sample Answer $\begin{bmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$
Lecture Notes
• Prove Theorem 4(b) in class, using the properties of the transpose. (This is Exercise 34 in the text.)
• Write out the proof of Theorem 3(d) as in the book, or put it on a transparency. Now go through the proof, line by line. Show the students the thought process behind decoding mathematical text, in particular, how it should be read more slowly than one would read a newspaper.
• Foreshadow Section 3.3: Matrix Inverses. Have the students compute
\[ \begin{bmatrix} 2 & -2 & 0 \\ -2 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} \frac{3}{2} & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \left( = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \right) \]
and show them how to use this fact to quickly solve $\begin{bmatrix} \frac{3}{2} & 1 & 0 \\ 1 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 1 \\ 0 \\ -1 \end{bmatrix}$.
• Compare e (A · B) to e (A) · e (B) for each of the three elementary row operations e.
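The foreshadowing example can be verified in a few lines. The sketch below, whose names are chosen only for this illustration, confirms that the product is the identity and then obtains the solution of the system by multiplying both sides on the left by the first matrix.

```python
from fractions import Fraction as F

def matmul(a, b):
    """Standard row-by-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

m_inv = [[2, -2, 0], [-2, 3, 0], [0, 0, 1]]
m = [[F(3, 2), 1, 0], [1, 1, 0], [0, 0, 1]]

# The product is the identity, so m_inv undoes multiplication by m.
identity = matmul(m_inv, m)

# To solve m [x, y, z]^T = [1, 0, -1]^T, multiply both sides by m_inv.
rhs = [[1], [0], [-1]]
solution = matmul(m_inv, rhs)
print([int(row[0]) for row in solution])
```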
Lecture Examples
• The use of Theorem 2 (from Section 3.1): Prove that $A^m A^n = A^{m+n}$. We have
\[ A^m = \underbrace{A (A (\cdots A))}_{m \text{ times}} \qquad A^n = \underbrace{A (A (A (\cdots A)))}_{n \text{ times}} \]
Then we use associativity to obtain
\[ \underbrace{A (A (\cdots A))}_{m \text{ times}} \; \underbrace{A (A (A (\cdots A)))}_{n \text{ times}} = \underbrace{A (A (A (A (A (\cdots A)))))}_{m+n \text{ times}} \]
• Span and linear dependence: If
\[ A = \begin{bmatrix} -1 & 20 \\ 8 & 25 \\ 17 & 27 \end{bmatrix} \qquad B = \begin{bmatrix} 1 & 5 \\ 2 & 5 \\ 3 & 8 \end{bmatrix} \qquad C = \begin{bmatrix} -1 & 0 \\ 0 & 1 \\ 1 & -1 \end{bmatrix} \]
then A is in the span of {B, C}, so {A, B, C} is a linearly dependent set of matrices. In particular, A = 4B + 5C.
Tech Tip Students can use a CAS to verify the results of the four theorems for arbitrary, large matrices.
Group Work 1: A Special Property --- Upper Triangularity
Students have seen upper triangular matrices before when solving an n × n system: the coefficient matrix ends up being upper triangular. Remind the students of this fact after you’ve defined upper triangular matrices. Page 1 is optional; if it is to be used, it needs very little introduction. Problem 3 is Exercise 29 from the text.
Answers
1. The third and fourth matrices have the property.
2., 3. Answers will vary.
4. A square matrix is called lower triangular if all of the entries above the main diagonal are zero.
5. $\begin{bmatrix} * & 0 & 0 \\ 0 & * & 0 \\ 0 & 0 & * \end{bmatrix}$ or $\begin{bmatrix} * & 0 \\ 0 & * \end{bmatrix}$. These matrices are called diagonal matrices.
6. This is true exactly when every diagonal entry is nonzero.
7. Answers will vary. Here is one proof: Begin by recalling that an n × n matrix is upper triangular if and only if every element below the diagonal is zero. Let A and B be n × n upper triangular matrices. Let $A_i$ denote the ith column of A. Let C = AB, and let $C_i$ denote the ith column of C. We will show that every element of $C_j$ below the jth row is zero. Recall that a property of matrix multiplication allows us to write
\[ C_j = \sum_{i=1}^{n} b_{ij} A_i \]
But B is upper triangular, so $b_{ij} = 0$ for i > j and we actually have
\[ C_j = \sum_{i=1}^{j} b_{ij} A_i \]
And, since A is upper triangular, for i = 1, . . . , j, every element of $A_i$ below the jth row is zero. Therefore every element of $C_j$ below the jth row is zero. So C is upper triangular.
Group Work 2: Trace
Have all the groups do the first three problems. Assign each group one of the remaining four to write on a transparency, instructing them to work on the others if they get done early. If there is time, have the groups present their solutions. Otherwise, photocopy their solutions and distribute the next day, if they are correct. If the students get stuck on Problem 5, you may want to give them the hint that they need look only at the diagonal entries of AB and BA.
Answers
1. 22
2. The trace of a matrix is the sum of the diagonal entries.
3. n
4. True. Use commutativity of addition: the ith diagonal entry of A + B is $a_{ii} + b_{ii}$.
5. True. The ith diagonal entry of AB is $\sum_j a_{ij} b_{ji}$, so $\operatorname{tr}(AB) = \sum_i \sum_j a_{ij} b_{ji} = \sum_j \sum_i b_{ji} a_{ij} = \operatorname{tr}(BA)$. Note that the individual diagonal entries of AB and BA may differ; only their sums must agree.
6. This is false in general. It is false for almost any two matrices.
7. The ith diagonal entry of $AA^T$ is the dot product of the ith row of A with itself, $\sum_j (a_{ij})^2$. Summing over i gives the result.
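A small numerical experiment can make Problems 5 through 7 concrete. In the sketch below, whose helper names are assumptions for this illustration, the pair of 2 × 2 matrices has tr(AB) = tr(BA) even though the diagonals of AB and BA are different, and the identity matrix gives a quick counterexample for Problem 6. The 4 × 4 matrix is the one from Problem 1 of the handout.

```python
def matmul(a, b):
    """Standard row-by-column matrix product."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def transpose(m):
    return [list(row) for row in zip(*m)]

a = [[0, 1], [0, 0]]
b = [[0, 0], [1, 0]]
ab, ba = matmul(a, b), matmul(b, a)
print(ab, ba, trace(ab), trace(ba))

# Problem 6 counterexample: tr(I I) = 2 but tr(I) tr(I) = 4.
identity = [[1, 0], [0, 1]]
print(trace(matmul(identity, identity)), trace(identity) * trace(identity))

# Problems 1 and 7 on the handout's 4 x 4 matrix.
sample = [[2, 1, 2, 3], [3, 3, 7, 0], [1, 2, 9, 8], [4, 9, 4, 8]]
sum_of_squares = sum(x * x for row in sample for x in row)
print(trace(sample), trace(matmul(sample, transpose(sample))) == sum_of_squares)
```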
Suggested Core Assignment Exercises 3, 6, 8, 10, 16, 20P , 26, 37, 38, 44P
Group Work 1, Section 3.2 A Special Property --- Upper Triangularity
We are going to look at some matrices that have a special property: $A = [a_{ij}]$ has the property if $a_{ij} = 0$ when i > j.
1. Which of the following matrices have the property?
\[ \begin{bmatrix} 1 & 0 & 2 & 0 \\ 0 & 1 & 0 & 1 \\ 2 & 0 & 1 & 0 \\ 0 & 1 & 0 & 2 \end{bmatrix} \qquad \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} 1 & 2 & 1 & 2 \\ 0 & 1 & 2 & 1 \\ 0 & 0 & 1 & 2 \\ 0 & 0 & 0 & 1 \end{bmatrix} \qquad \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix} \]
2. Find a 5 × 5 matrix with this property.
3. Explain the property in your own words.
A Special Property --- Upper Triangularity
A square matrix is called upper triangular if all of the entries below the main diagonal are zero. Thus, the form of an upper triangular matrix is
\[ \begin{bmatrix} * & * & \cdots & * & * \\ 0 & * & \cdots & * & * \\ 0 & 0 & \ddots & \vdots & \vdots \\ \vdots & \vdots & & * & * \\ 0 & 0 & \cdots & 0 & * \end{bmatrix} \]
where the entries marked ∗ are arbitrary. A more formal definition of such a matrix $A = [a_{ij}]$ is that $a_{ij} = 0$ if i > j.
4. What would be a good definition for “lower triangular”?
5. Can a non-zero matrix be both upper triangular and lower triangular simultaneously? If not, why not? If so, give an example of such a matrix, and come up with a term for a matrix with both of these properties.
6. When is an upper triangular matrix row-equivalent to the identity matrix?
7. Prove that the product of two n × n upper triangular matrices is upper triangular.
Group Work 2, Section 3.2 Trace
We define the trace of an n × n matrix $A = [a_{ij}]$ by $\operatorname{tr} A = \sum_{i=1}^{n} a_{ii}$.
1. Find the trace of
\[ \begin{bmatrix} 2 & 1 & 2 & 3 \\ 3 & 3 & 7 & 0 \\ 1 & 2 & 9 & 8 \\ 4 & 9 & 4 & 8 \end{bmatrix} \]
2. Define the trace of a matrix in simple language.
3. What is the trace of the identity matrix In ?
4. Prove or find a counterexample: tr (A + B) = tr A + tr B .
5. Prove or find a counterexample: If A and B are square, then tr (AB) = tr (BA).
6. Prove or find a counterexample: If A and B are square, then tr (AB) = tr A tr B .
7. Prove that $\operatorname{tr}\left(AA^T\right) = \sum_{i=1}^{n} \sum_{j=1}^{n} (a_{ij})^2$.
3.3 The Inverse of a Matrix
Suggested Time and Emphasis 2 classes. Essential material.
Points to Stress 1. Definition of inverse, including its uniqueness, and noninvertible matrices. 2. Properties of inverses, and their use in solving systems. 3. Inverses of 2 × 2 matrices, including the definition of the determinant of a 2 × 2 matrix. 4. Inverses of higher-dimension matrices. 5. The fundamental theorem of invertible matrices (Theorem 7).
Drill Question If AB = I , where I is the identity matrix, is it necessarily true that BA = I ? Answer Yes
Discussion Question Why is it useful to find the inverse of a matrix? Answer There are many possible ways this discussion can go. One reason is that finding the inverse of a matrix enables us to solve equations easily.
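Test Question
Use the fact that
\[ \begin{bmatrix} -1 & 2 & -2 & 2 \\ 1 & -1 & 2 & -1 \\ 1 & -1 & 1 & -1 \\ -1 & 2 & -2 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} 1 & 0 & 2 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 1 & -1 & 0 \\ 1 & 0 & 0 & -1 \end{bmatrix} \]
to solve the system of equations
\[ \begin{aligned} -w + 2x - 2y + 2z &= 3 \\ w - x + 2y - z &= 1 \\ w - x + y - z &= 4 \\ -w + 2x - 2y + z &= 1 \end{aligned} \]
Answer
w = 11, x = 2, y = −3, z = 2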
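The intended method for this test question is a single matrix-vector product. A minimal sketch, with variable names chosen for this illustration, multiplies the given inverse by the right-hand side and then substitutes the result back into the original system as a check.

```python
a = [[-1, 2, -2, 2], [1, -1, 2, -1], [1, -1, 1, -1], [-1, 2, -2, 1]]
a_inv = [[1, 0, 2, 0], [0, 1, 0, 1], [0, 1, -1, 0], [1, 0, 0, -1]]
b = [3, 1, 4, 1]

def matvec(m, v):
    """Multiply a matrix by a column vector."""
    return [sum(c * x for c, x in zip(row, v)) for row in m]

solution = matvec(a_inv, b)   # [w, x, y, z]
check = matvec(a, solution)   # should reproduce b
print(solution, check)
```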
Lecture Notes
• Point out that the advantage of inverses really kicks in when we are solving a series of large systems with similar equations. For example, we can solve
\[ \begin{aligned} w + x + y + z &= 4 \\ 2w + 2x + 3y - z &= 5 \\ w - x - 2y - z &= 0 \\ w - 2x - 4y - z &= -1 \end{aligned} \]
using Gauss-Jordan elimination (finding that w = 2, x = 1, y = 0, and z = 1), or we can use inverses, and it will take roughly the same amount of work. But, if we also had to solve
\[ \begin{aligned} w + x + y + z &= 1 \\ 2w + 2x + 3y - z &= 2 \\ w - x - 2y - z &= 3 \\ w - 2x - 4y - z &= 4 \end{aligned} \]
and
\[ \begin{aligned} w + x + y + z &= 4 \\ 2w + 2x + 3y - z &= 0 \\ w - x - 2y - z &= 2 \\ w - 2x - 4y - z &= -3 \end{aligned} \]
we could take the inverse of $A = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 2 & 2 & 3 & -1 \\ 1 & -1 & -2 & -1 \\ 1 & -2 & -4 & -1 \end{bmatrix}$ once, and compute $A^{-1} b$ as many times as we like.
• Perhaps take this opportunity to talk about the inverse of a complex number: how do we find $\dfrac{1}{3 + 4i}$? The technique, multiplying by $\dfrac{3 - 4i}{3 - 4i}$, is not as important as the concept that given a real, complex, or matrix quantity it is often possible to find an inverse that will reduce it to unity. One can also add “inverse functions” to this discussion. In this case f (x) = x is the identity function, so-called because it leaves inputs unchanged (analogous to multiplying by 1). Try to get the students to see the conceptual similarities in solving the three following equations:
\[ 3x = 2 \qquad\qquad (3 + i)\, x = 2 - 4i \qquad\qquad \begin{bmatrix} 2 & 1 \\ 3 & -4 \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix} \]
• After doing a standard example or two, throw a noninvertible matrix on the board before defining invertibility. “Unexpectedly” run into trouble and thus discover, with your class, that not every matrix has an inverse. Examples of noninvertible matrices are given in the lecture examples.
• Demonstrate that $\left(A^{-1}\right)^{-1} = A$ for $A = \begin{bmatrix} 8 & 2 \\ 15 & 4 \end{bmatrix}$. Explain why this has to be true in general, by the definition of inverse. One can prove this fact fairly simply:
\[ \begin{aligned} \left(A^{-1}\right)^{-1} A^{-1} &= I \\ \left(A^{-1}\right)^{-1} A^{-1} A &= I A \\ \left(A^{-1}\right)^{-1} \left(A^{-1} A\right) &= A \\ \left(A^{-1}\right)^{-1} &= A \end{aligned} \]
It is amusing to see how messy a general algebraic proof is. The algebraic proof for the 2 × 2 case follows: Let $A = \begin{bmatrix} a & b \\ c & d \end{bmatrix}$. Then
\[ \begin{aligned} \left(A^{-1}\right)^{-1} &= \begin{bmatrix} \dfrac{d}{ad - bc} & -\dfrac{b}{ad - bc} \\[2ex] -\dfrac{c}{ad - bc} & \dfrac{a}{ad - bc} \end{bmatrix}^{-1} \\ &= \frac{1}{\dfrac{d}{ad - bc} \cdot \dfrac{a}{ad - bc} - \dfrac{b}{ad - bc} \cdot \dfrac{c}{ad - bc}} \begin{bmatrix} \dfrac{a}{ad - bc} & \dfrac{b}{ad - bc} \\[2ex] \dfrac{c}{ad - bc} & \dfrac{d}{ad - bc} \end{bmatrix} \\ &= \frac{1}{1 / (ad - bc)} \cdot \frac{1}{ad - bc} \begin{bmatrix} a & b \\ c & d \end{bmatrix} = A \end{aligned} \]
• Note that there is a formula for finding the inverses of 3 × 3 matrices, just as there is one for 2 × 2 matrices. Unfortunately, it is so complicated that it is easier to do 3 × 3 inverses manually than to use a formula.
\[ \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & i \end{bmatrix}^{-1} = \frac{1}{afh - aei + bdi - bfg + ceg - cdh} \begin{bmatrix} fh - ei & bi - ch & ce - bf \\ di - fg & cg - ai & af - cd \\ eg - dh & ah - bg & bd - ae \end{bmatrix} \]
Notice that in the 2 × 2 case we wound up with an expression ad − bc that determined invertibility. Now, in the 3 × 3 case we have a longer expression, afh − aei + bdi − bfg + ceg − cdh, that fulfills the same role. Point out to the students that we will be calling this the 3 × 3 determinant, and that we will find better ways to compute it than memorizing this formula.
• Illustrate Theorem 7 by the following diagram. It shows that all of the statements are equivalent, and the double arrows signify “easy proof” or “by definition”.
[Diagram: the statements “rank (A) = n”, “The reduced row echelon form of A is I”, “A is nonsingular”, “|A| ≠ 0”, “AX = B has a unique solution”, and “AX = 0 has a unique solution”, joined by implication arrows.]
Chapter 3 Matrices
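Lecture Examples
• Invertible real matrices
\[ \begin{bmatrix} -2 & 1 \\ 3 & 3 \end{bmatrix}^{-1} = \begin{bmatrix} -\frac13 & \frac19 \\ \frac13 & \frac29 \end{bmatrix} \qquad \begin{bmatrix} -1 & 1 & 1 \\ 4 & -1 & 1 \\ 4 & 2 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} -\frac15 & \frac1{15} & \frac2{15} \\ 0 & -\frac13 & \frac13 \\ \frac45 & \frac25 & -\frac15 \end{bmatrix} \]
\[ \begin{bmatrix} 0 & 2 & 0 & 0 \\ 2 & 0 & 0 & 0 \\ 0 & 0 & 5 & 0 \\ 0 & 1 & 0 & 6 \end{bmatrix}^{-1} = \begin{bmatrix} 0 & \frac12 & 0 & 0 \\ \frac12 & 0 & 0 & 0 \\ 0 & 0 & \frac15 & 0 \\ -\frac1{12} & 0 & 0 & \frac16 \end{bmatrix} \]
• Noninvertible real matrices
\[ \begin{bmatrix} 5 & 4 \\ 15 & 12 \end{bmatrix} \qquad \begin{bmatrix} 1 & 2 & 3 \\ -1 & 3 & -2 \\ -1 & 8 & -1 \end{bmatrix} \qquad \begin{bmatrix} 8 & 3 & 5 & -8 \\ -7 & 6 & 0 & -6 \\ -5 & -3 & 7 & 9 \\ -4 & 6 & 12 & -5 \end{bmatrix} \]
• A complex matrix and its inverse
\[ \begin{bmatrix} 1 & 2i \\ 1 - i & 1 \end{bmatrix}^{-1} = \frac{1}{-1 - 2i} \begin{bmatrix} 1 & -2i \\ -1 + i & 1 \end{bmatrix} = \begin{bmatrix} -\frac15 + \frac25 i & \frac45 + \frac25 i \\ -\frac15 - \frac35 i & -\frac15 + \frac25 i \end{bmatrix} \]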
Group Work: Some Special Kinds of Matrices This activity uses various types of matrices to give students practice with inverses, and to introduce them to the vocabulary of matrix algebra. Before handing out the sheet, start by defining a diagonal matrix, one for which $a_{ij} = 0$ whenever $i \neq j$, such as $\begin{bmatrix} 1 & 0 \\ 0 & 2 \end{bmatrix}$ or $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}$. Ask the students to find the inverse of $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}$, and then of an arbitrary diagonal matrix. They will find that
\[ \begin{bmatrix} a & 0 & 0 \\ 0 & b & 0 \\ 0 & 0 & c \end{bmatrix}^{-1} = \begin{bmatrix} 1/a & 0 & 0 \\ 0 & 1/b & 0 \\ 0 & 0 & 1/c \end{bmatrix}. \]
Then hand out the activity, which will allow the students to explore more special types of matrices. Problem 4 hits an important theorem in matrix algebra: that the only invertible idempotent matrix is the identity. If the students finish early, follow up on Problem 10 by asking how they could tell, at a glance, if a given permutation matrix is its own inverse. When everyone is done, close the activity by discussing that question. It turns out that if the permutation just swaps pairs of elements, then doing it twice will bring us back to the identity permutation. All one needs to check is that if $a_{ij} = 1$ then $a_{ji} = 1$ as well.
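Answers
Answers will vary.
1. $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$
2. $\begin{bmatrix} 1 & 0 & 0 \\ 0 & \frac12 & \frac12 \\ 0 & \frac12 & \frac12 \end{bmatrix}$ or $\begin{bmatrix} 1 & 0 \\ 2 & 0 \end{bmatrix}$
3. $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$
4. Another cannot exist. If $A^2 = A$, then $A^2 A^{-1} = A A^{-1}$ and $A = I$.
5. $\begin{bmatrix} 42 \\ 51 \\ 27 \end{bmatrix}$
6. $\begin{bmatrix} b \\ d \\ a \\ c \\ e \end{bmatrix}$
7. $\begin{bmatrix} -1 & -2 & -3 & -4 & -5 \\ -6 & -7 & -8 & -9 & -10 \\ 6 & 7 & 8 & 9 & 10 \\ a & b & c & d & e \\ 1 & 2 & 3 & 4 & 5 \end{bmatrix}$
8. They permute the rows of a matrix without changing them.
9. $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{bmatrix}$
10. $\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}$

Suggested Core Assignment Exercises 2, 6, 8, 12, 15P , 22, 38, 44, 52, 53, 55, 64, 69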
Group Work, Section 3.3 Some Special Kinds of Matrices
A matrix A is called idempotent if $A^2 = A$.
1. Find a 3 × 3 diagonal idempotent matrix that is not the identity.
2. Find a nondiagonal idempotent matrix.
3. Find an invertible 2 × 2 idempotent matrix.
4. Find a different invertible 2 × 2 idempotent matrix, or show why one cannot exist.
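A permutation matrix has a single 1 in each row and each column. The rest of the entries are zeros. Here are some permutation matrices:
\[ \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \end{bmatrix} \qquad \begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \]
5. Compute $\begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 27 \\ 42 \\ 51 \end{bmatrix}$.

6. Compute $\begin{bmatrix} 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \\ e \end{bmatrix}$.

7. Compute $\begin{bmatrix} 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 2 & 3 & 4 & 5 \\ 6 & 7 & 8 & 9 & 10 \\ -1 & -2 & -3 & -4 & -5 \\ -6 & -7 & -8 & -9 & -10 \\ a & b & c & d & e \end{bmatrix}$.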
8. Why are matrices of this type called permutation matrices?
9. Find a 3 × 3 permutation matrix $A \neq I$ such that $A^{-1} = A$.
10. Find a 3 × 3 permutation matrix $A \neq I$ such that $A^{-1} \neq A$.
3.4
LU Factorization
Suggested Time and Emphasis 1 class. Optional material.
Points to Stress 1. An LU factorization of A. 2. Theorem 1 3. A $P^T LU$ factorization of A.
Drill Question Does every square matrix A have a unique LU factorization? Answer No. It is unique only if A is nonsingular.
Discussion Question How might we define a “pseudo-LU factorization” for a nonsquare matrix A? Answer Begin by formulating a definition of “pseudo-upper (or -lower) triangularity” for nonsquare matrices.
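Test Question Find a $P^T LU$ factorization of $\begin{bmatrix} 0 & 1 \\ 2 & 0 \end{bmatrix}$. Answer $\begin{bmatrix} 0 & 1 \\ 2 & 0 \end{bmatrix} = \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix} = P^T LU$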
Lecture Notes
• Calculate the LU factorization of a nonsingular matrix A in two ways: using the method applied in Example 3, and using the alternative way described after Example 3. You may find the alternative method easier to execute, but perhaps less intuitive.
• Have the students find the LU factorization of $A = \begin{bmatrix} -1 & 4 & -2 \\ 2 & -6 & -4 \\ 2 & 0 & -25 \end{bmatrix}$. The correct answer is
\[ A = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 1 & 0 \\ -2 & 4 & 1 \end{bmatrix} \begin{bmatrix} -1 & 4 & -2 \\ 0 & 2 & -8 \\ 0 & 0 & 3 \end{bmatrix} \]
Now do this example in front of them, but make the mistake of performing elementary row operations out of order. Ask the students to tell you what you did wrong.
• Prove that the nonsingular matrix $\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$ does not possess an LU factorization, then exhibit the easy $P^T LU$ factorization.
• Consider going through the proof of Theorem 3 in class. It provides a good review of several important topics.
• Show the students that every permutation matrix is cyclic; that is, given an n × n permutation matrix P , there exists a positive integer k such that $P^k = I_n$.
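Lecture Examples
• An LU factorization of a matrix A:
Let $A = \begin{bmatrix} -4 & 5 & -2 \\ -3 & 2 & -1 \\ 1 & 1 & 0 \end{bmatrix}$. Then
\[ A = LU = \begin{bmatrix} 1 & 0 & 0 \\ \frac34 & 1 & 0 \\ -\frac14 & -\frac97 & 1 \end{bmatrix} \begin{bmatrix} -4 & 5 & -2 \\ 0 & -\frac74 & \frac12 \\ 0 & 0 & \frac17 \end{bmatrix}. \]
• Using the LU factorization of A to solve Ax = b:
Let $A = \begin{bmatrix} -4 & 5 & -2 \\ -3 & 2 & -1 \\ 1 & 1 & 0 \end{bmatrix}$ and $b = \begin{bmatrix} 5 \\ 4 \\ -1 \end{bmatrix}$. Then
\[ A = LU = \begin{bmatrix} 1 & 0 & 0 \\ \frac34 & 1 & 0 \\ -\frac14 & -\frac97 & 1 \end{bmatrix} \begin{bmatrix} -4 & 5 & -2 \\ 0 & -\frac74 & \frac12 \\ 0 & 0 & \frac17 \end{bmatrix}. \]
Let y = U x and solve Ly = b, giving
\begin{align*}
y_1 &= 5 \\
\tfrac34 y_1 + y_2 &= 4 \\
-\tfrac14 y_1 - \tfrac97 y_2 + y_3 &= -1
\end{align*}
So $y = \begin{bmatrix} 5 \\ \frac14 \\ \frac47 \end{bmatrix}$. Now we solve U x = y and find
\begin{align*}
-4x_1 + 5x_2 - 2x_3 &= 5 \\
-\tfrac74 x_2 + \tfrac12 x_3 &= \tfrac14 \\
\tfrac17 x_3 &= \tfrac47
\end{align*}
and so $x = \begin{bmatrix} -2 \\ 1 \\ 4 \end{bmatrix}$.
• Two $P^T LU$ factorizations of A:
Let $A = \begin{bmatrix} 0 & 4 & 0 \\ 2 & -6 & -4 \\ 2 & 0 & -20 \end{bmatrix}$. Then
\[ A = P^T LU = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}^T \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & -\frac32 & 1 \end{bmatrix} \begin{bmatrix} 2 & 0 & -20 \\ 0 & 4 & 0 \\ 0 & 0 & 16 \end{bmatrix} \]
and
\[ A = P^T LU = \begin{bmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix}^T \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 1 & \frac32 & 1 \end{bmatrix} \begin{bmatrix} 2 & -6 & -4 \\ 0 & 4 & 0 \\ 0 & 0 & -16 \end{bmatrix}. \]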
Tech Tip As a project, have students use the programming language of a CAS to write a procedure that returns the LU factorization of a nonsingular matrix A.
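For instructors who want a reference solution to compare against student work, the following is one possible sketch of such a procedure, written in Python rather than in a CAS language. The function name `lu_factor` is an illustrative choice, not something from the text. The sketch assumes that no row interchanges are needed, and it uses exact fractions so that the output can be compared directly with the lecture example above.

```python
from fractions import Fraction

def lu_factor(a):
    """Return (L, U) with A = LU and L unit lower triangular.

    Assumes A can be reduced without row interchanges; otherwise a
    P^T LU factorization is needed and a ValueError is raised.
    """
    n = len(a)
    u = [[Fraction(x) for x in row] for row in a]
    l = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for k in range(n):
        if u[k][k] == 0:
            raise ValueError("zero pivot: a row interchange is required")
        for i in range(k + 1, n):
            m = u[i][k] / u[k][k]   # multiplier used in R_i - m R_k
            l[i][k] = m             # the multiplier becomes the (i, k) entry of L
            u[i] = [x - m * y for x, y in zip(u[i], u[k])]
    return l, u

# The matrix A from the first lecture example.
L, U = lu_factor([[-4, 5, -2], [-3, 2, -1], [1, 1, 0]])
```

Storing each multiplier directly in L mirrors the bookkeeping of the elimination method, which is why L comes out unit lower triangular without any extra work.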
Group Work: The Diagonal Within Another common approach to the factorization described in this section includes a diagonal matrix D sandwiched between the L and the U . Thus, we write A = LDU , where A is nonsingular. This is briefly discussed in the text just prior to Exercises 31 and 32. In this case, both L and U have 1s on their diagonals. The LDU decomposition is an easy step away from the LU factorization.
Answers
1. It is nonsingular.
2. $A = LU = \begin{bmatrix} 1 & 0 & 0 \\ \frac32 & 1 & 0 \\ 2 & -2 & 1 \end{bmatrix} \begin{bmatrix} 2 & 1 & 4 \\ 0 & \frac12 & -1 \\ 0 & 0 & -1 \end{bmatrix}$
3. The specific L in our problem is nonsingular. In general, a unit lower (or upper) triangular matrix is easily seen to be nonsingular, since elementary row operations can always transform such a matrix into the identity matrix.
4. U is the product of nonsingular matrices: $U = L^{-1} A$.
5. U is always nonsingular, and thus is row equivalent to the identity matrix. If U had a zero on its diagonal, then it would no longer be row equivalent to I .
6. $D = \begin{bmatrix} 2 & 0 & 0 \\ 0 & \frac12 & 0 \\ 0 & 0 & -1 \end{bmatrix}$ and therefore $U_1 = \begin{bmatrix} 1 & \frac12 & 2 \\ 0 & 1 & -2 \\ 0 & 0 & 1 \end{bmatrix}$.
7. $A = LDU_1$
8. By the assumption, we can conclude that A has an LU factorization and that U is nonsingular. Thus, the diagonal of U contains no 0. Let D be the diagonal matrix whose diagonal entries are those of U , and let $U_1$ be the matrix formed by dividing the ith row of U by its diagonal entry $u_{ii}$. Then $A = LDU_1$.
Suggested Core Assignment Exercises 2, 4, 6, 8, 10, 14, 18, 26P
Group Work, Section 3.4 The Diagonal Within
Consider the matrix $A = \begin{bmatrix} 2 & 1 & 4 \\ 3 & 2 & 5 \\ 4 & 1 & 9 \end{bmatrix}$.
1. Verify that A is nonsingular.
2. Let’s call a matrix factorable if it can be reduced to row echelon form without using any row interchanges.
Then Theorem 1 says that every factorable matrix has an LU factorization. Verify that A is factorable, and find an LU factorization of A.
3. Verify that the unit lower triangular matrix L is nonsingular. Explain why, in general, an LU factorization
of a matrix results in a nonsingular L.
4. Verify that our U is nonsingular. Explain why, in general, the LU factorization of a nonsingular matrix
results in a nonsingular U .
5. Our matrix U has no 0 on its diagonal. Explain why, in a general LU factorization of a nonsingular matrix
A, U has no 0 on its diagonal.
6. Define the diagonal matrix D whose diagonal entries are those of a matrix U . Now find a unit upper
triangular matrix U1 such that U = DU1 .
7. Express A as a product of a unit lower triangular matrix, a diagonal matrix, and a unit upper triangular
matrix.
8. Prove that if a square matrix A is factorable and nonsingular, then A can be written as a product of a unit
lower triangular matrix, a diagonal matrix, and a unit upper triangular matrix.
3.5
Subspaces, Basis, Dimension, and Rank
Suggested Time and Emphasis 2 classes. Essential material.
Points to Stress 1. Definition of subspace; subspace spanned by v1 , . . . , vn . 2. Row and column spaces of a matrix. 3. Definition of basis and dimension. 4. The Rank Theorem and its connection to the Fundamental Theorem (Theorem 9).
Drill Question If A has n columns, why is null (A) a subspace of $\mathbb{R}^n$? Answer It is a set of vectors in $\mathbb{R}^n$ containing 0, and it is closed under addition and scalar multiplication: if Ax = 0 and Ay = 0, then A (x + y) = 0, and if Ax = 0, then A (kx) = 0.
Discussion Questions 1. Give an example of a pair of 3 × 3 matrices that are row equivalent but have different column spaces. 2. Give an example of a pair of 3 × 3 matrices that are row equivalent and have the same column space. Possible Answers
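1. $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$, $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$
2. $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$, $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}$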
Test Question Is the vector [10, 11, 12] in the span of [1, 2, 3], [4, 5, 6], and [7, 8, 9]? Answer Yes: [10, 11, 12] = −[4, 5, 6] + 2 [7, 8, 9].
Lecture Notes
• It is instructive to point out the potential problem with the solution method of Example 5(b) if you are performing the reduced row echelon calculation on a calculator. This method assumes that you are keeping track of the elementary row operations used on the way to row echelon form and, moreover, that no row swapping takes place. If using a calculator to put matrices in reduced row echelon form, we recommend the method of Example 5(a): list the given vectors as the first columns of a matrix and the vector in question as the last column, then put the matrix in reduced row echelon form.
• If your students use calculators to put matrices in reduced row echelon form, then determining a basis from a given set of vectors is quite easy. List the vectors as rows of a matrix (on a calculator) and put the matrix in reduced row echelon form. The nonzero rows form a basis (and in particular are linearly independent). It is worth discussing the reasons why the rows of the matrix in reduced row echelon form are linearly independent, and why they form a basis for the space spanned by the original set of vectors.
Note: One can tie the previous two points together nicely. When determining if a vector is in the span of a given set of vectors, we can list the vectors as columns of a matrix and find its reduced row echelon form. When determining a basis for the space spanned by a given set, we can list the vectors as rows of a matrix and find its reduced row echelon form.
• The technique for determining a basis for the column space of a matrix (Example 11) has an interesting benefit: the basis that is returned is always a subset of the original set of columns. That is, the basis chosen through this method always uses vectors from the given collection. Note that this differs from the method described above and in Example 10. So, is there a method (other than a modified version of Example 11) to find a basis from a given set of vectors that is a subset of the given set? As explained in Group Work 1, there is a trial-and-error technique that will accomplish this. Admittedly, this technique is not highly efficient (we are assuming calculator use for this) but it does give students a “hands-on” feel in constructing a basis.
• As an in-class discussion, find the conditions in Theorem 9 that are (almost) immediate equivalences of condition (d): the reduced row echelon form of A is the identity matrix. For example, note that nullity (A) is exactly the number of zero rows of the reduced row echelon form of A. Thus, the reduced row echelon form of A is equal to I if and only if nullity (A) = 0.
• If A is m × n, then is it true that $\operatorname{null}(A) = \operatorname{null}(A^T)$? Of course not! null (A) belongs to $\mathbb{R}^n$ while $\operatorname{null}(A^T)$ belongs to $\mathbb{R}^m$. Is it true that $\operatorname{nullity}(A) = \operatorname{nullity}(A^T)$? Also no; consider $A = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix}$.
• This is a good time to introduce the concept of transition matrices. Notice that these are not the same as the transition matrices discussed in Section 3.7. We’ll consider the square case here. For a fixed integer n, let V denote $\mathbb{R}^n$ equipped with basis $B_V = \{v_1, \ldots, v_n\}$ and let W denote $\mathbb{R}^n$ equipped with basis $B_W = \{w_1, \ldots, w_n\}$. A given vector u ∈ V has a particular coordinate vector (with respect to $B_V$). Let’s call it $u_{B_V}$. The question is, what is the coordinate vector of u regarded as an element of W ? That is, can we find $u_{B_W}$? This is where a transition matrix comes in. There is a matrix M with the property that
\[ u_{B_W} = M u_{B_V} \]
for every u. This is the transition matrix from V to W . There are a variety of equivalent ways to determine the transition matrix; the discovery of one such method could be a class activity. One way is to let $M = A^{-1} B$, where A is the n × n matrix whose columns are $w_1, \ldots, w_n$ and B is the n × n matrix whose columns are $v_1, \ldots, v_n$.
Lecture Examples
• A subspace of $\mathbb{R}^3$ that is not a subspace of $\mathbb{R}^2$:
\[ V = \{ [x, y, z] \in \mathbb{R}^3 : 3x + y - 2z = 0 \} \]
• Determining if a given vector is in the row or column space of a matrix:
Let $A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$ and let $v = \begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix}$. Then v is in the column space of A, but v is not in the row space of A.
• Finding a basis for null (A):
Let $A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 0 \end{bmatrix}$. Then $v = \begin{bmatrix} -1 \\ -1 \\ 1 \end{bmatrix}$ is a basis for null (A).
• Finding information about the space spanned by a set of given vectors:
Let $v_1 = \begin{bmatrix} 0 \\ 2 \\ 3 \\ 0 \end{bmatrix}$, $v_2 = \begin{bmatrix} 5 \\ 0 \\ 0 \\ 8 \end{bmatrix}$, $v_3 = \begin{bmatrix} 9 \\ 0 \\ 0 \\ 12 \end{bmatrix}$, and $v_4 = \begin{bmatrix} 0 \\ -4 \\ -6 \\ 0 \end{bmatrix}$. Then a basis for the span of these vectors is $w_1 = [1, 0, 0, 0]$, $w_2 = [0, 1, \frac32, 0]$, and $w_3 = [0, 0, 0, 1]$, the nonzero rows of the reduced row echelon form of the matrix whose rows are $v_1, \ldots, v_4$. The dimension of span $(v_1, v_2, v_3, v_4)$ is 3.
Let $v_5 = \begin{bmatrix} 0 \\ 14 \\ 15 \\ 0 \end{bmatrix}$. Then the coordinate vector of $v = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix}$ with respect to the basis $B = \{v_1, v_2, v_3, v_5\}$ of $\mathbb{R}^4$ is $\begin{bmatrix} 1 \\ 2 \\ -1 \\ 0 \end{bmatrix}$.
Tech Tip Write a CAS procedure to find the coordinate vector with respect to a given basis.
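As a point of comparison for this Tech Tip, here is a minimal sketch in Python (not a CAS language) of one way such a procedure might look. The function name `coordinates` is hypothetical. It assumes the basis vectors all lie in R^n with n vectors given, and it solves for the coefficients by Gauss-Jordan elimination with exact fractions.

```python
from fractions import Fraction

def coordinates(basis, v):
    """Solve c1*b1 + ... + cn*bn = v and return [c1, ..., cn]."""
    n = len(basis)
    # Augmented matrix: the basis vectors are the columns, v is the last column.
    m = [[Fraction(basis[j][i]) for j in range(n)] + [Fraction(v[i])]
         for i in range(n)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("the given vectors do not form a basis")
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [x / m[col][col] for x in m[col]]      # scale the pivot to 1
        for r in range(n):
            if r != col:                                 # clear the rest of the column
                m[r] = [x - m[r][col] * y for x, y in zip(m[r], m[col])]
    return [row[n] for row in m]

# The basis {v1, v2, v3, v5} and the vector v from the lecture example.
B = [[0, 2, 3, 0], [5, 0, 0, 8], [9, 0, 0, 12], [0, 14, 15, 0]]
coords = coordinates(B, [1, 2, 3, 4])
```

Running this on the lecture example is a quick way to confirm the stated coordinate vector before class.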
Group Work 1: Trial by Ferr Answers (may vary): 1. {[4, 7, 1] , [1, 0, 0]} 2. {[1, 3, −2] , [2, 1, 4] , [0, 1, −1]} 3. {[1, 0, −8, 1] , [0, 1, −2, 3] , [−10, 2, 0, 4] , [12, −1, −20, 1]}
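The trial-and-error procedure in this activity can also be checked by machine. The sketch below, in Python, is one hedged way to do so: the helper names `rank` and `basis_from_subset` are illustrative, and the approach (keep a vector only if it raises the rank) is a restatement of the worksheet's idea rather than the worksheet's exact steps. Processing the vectors in the order listed yields the answers shown above; a different order may yield a different basis of the same size.

```python
from fractions import Fraction

def rank(rows):
    """Rank of a list of row vectors, by forward elimination with exact arithmetic."""
    m = [[Fraction(x) for x in row] for row in rows]
    r = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(r, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[r], m[pivot] = m[pivot], m[r]
        for i in range(r + 1, len(m)):
            f = m[i][col] / m[r][col]
            m[i] = [x - f * y for x, y in zip(m[i], m[r])]
        r += 1
    return r

def basis_from_subset(vectors):
    """Keep each vector only if it is independent of those already kept."""
    kept = []
    for v in vectors:
        if rank(kept + [v]) == len(kept) + 1:
            kept.append(v)
    return kept

set1 = basis_from_subset([[4, 7, 1], [1, 0, 0], [6, 7, 1], [-4, 0, 0]])
set2 = basis_from_subset([[1, 3, -2], [2, 1, 4], [3, -6, 18], [0, 1, -1], [-2, 1, -6]])
```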
Group Work 2: Another Space, Another Time Answers
1. $Z = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}$
2. Answers will vary. One subspace is W , the set of all scalar multiples of $\begin{bmatrix} 1 & 0 \\ 0 & 2 \\ 3 & 0 \\ 0 & 4 \end{bmatrix}$.
3. $\operatorname{span}(A_1, \ldots, A_k) = \left\{ \sum_{i=1}^{k} c_i A_i : c_i \in \mathbb{R} \right\}$. By the definition of span $(A_1, \ldots, A_k)$, this set contains the zero matrix and is closed under scalar multiplication and addition.
4. $\{A_1, \ldots, A_k\}$ is linearly independent in $M_{42}$ if $\sum_{i=1}^{k} c_i A_i$ is equal to the zero matrix only when $c_1 = \cdots = c_k = 0$.
5. One possible linearly independent set is
\[ \left\{ \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \right\} \]
6. The simplest possibility:
\[ \begin{bmatrix} 1 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 1 \\ 0 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 0 & 0 \\ 0 & 1 \end{bmatrix} \]

Suggested Core Assignment Exercises 3, 4, 8, 12, 14, 19, 23, 30, 32, 37, 42, 47, 52, 55P , 62P
Group Work 1, Section 3.5 Trial by Ferr For a given matrix A, assume that it’s easy to find the reduced row echelon form of A (using a calculator, for example). Given a set of vectors {v1 , . . . , vk } ⊂ Rn , we want a basis for the space spanned by {v1 , . . . , vk } which is a subset of {v1 , . . . , vk }. One way to find such a basis is to list the first two vectors as rows of a matrix A and then find the reduced row echelon form of A. If there is a zero row, replace the second row of A with the next vector and repeat. If the reduced row echelon form of A has no zero row, then use the next vector as the third row of A, and repeat. Every time the reduced row echelon form of A has a zero row, we throw out the dependent vector and put in a new one. At the end of this process, we will have a largest possible linearly independent set. Now try this procedure on the following sets. On each set, different group members can perform this procedure in varying orders (for example, start with the last two vectors in the set). While the basis choice may vary among group members, explain why the sizes of the bases are equal. 1. {[4, 7, 1] , [1, 0, 0] , [6, 7, 1] , [−4, 0, 0]}
2. {[1, 3, −2] , [2, 1, 4] , [3, −6, 18] , [0, 1, −1] , [−2, 1, −6]}
3. {[1, 0, −8, 1] , [0, 1, −2, 3] , [−10, 2, 0, 4] , [12, −1, −20, 1] , [−13, 2, 24, 1] , [18, −7, 14, 1]}
Group Work 2, Section 3.5 Another Space, Another Time Let M42 denote the set of all 4 × 2 matrices. Equip M42 with the operations of + (regular matrix addition) and · (regular scalar multiplication). Our goal is to show that M42 has subspace properties similar to Rn . Look at the definition of subspace in the text. Replace Rn with M42 (and call the elements of M42 vectors). 1. What plays the role of 0 in M42 ?
2. Find some subspaces of Mnm .
3. If A1 , . . . , Ak belong to M42 , find an appropriate definition for span (A1 , . . . , Ak ).
Prove that
span (A1 , . . . , Ak ) is a subspace of M42 .
4. If A1 , . . . , Ak belong to M42 , find an appropriate definition for the phrase, “{A1 , . . . , Ak } forms a linearly
independent set in M42 .”
5. Find a linearly independent subset of M42 .
6. Find a basis for M42 .
3.6
Introduction to Linear Transformations
Suggested Time and Emphasis 1–1½ classes. Essential material.
Points to Stress 1. Definition of linear transformations. 2. The matrix representation of a linear transformation. 3. The inverse of a linear transformation and its matrix representation.
Drill Question Give an example of a linear transformation that has no inverse. Possible Answer $L \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x \\ 0 \end{bmatrix}$
Discussion Question Let $L : \mathbb{R}^2 \to \mathbb{R}^2$ be a translation by a fixed vector v. Thus, L (x) = x + v. Is L a linear transformation? Answer
No, it is not (unless v = 0). One reason is that L (0) = v ≠ 0.
Test Question If T : Rm → Rn , what are the dimensions of the matrix representation of T ? Answer n × m
Lecture Notes
• You may want to emphasize the one-to-one correspondence between linear transformations and matrices. Every linear transformation has a unique matrix representation and every matrix has a unique associated transformation. You can mention that when the transformation maps $\mathbb{R}^n$ into itself, it is often called a linear operator. Thus, linear operators are associated with square matrices.
• Emphasize that a linear transformation is completely determined by its action on a basis of $\mathbb{R}^n$. For example, consider the following basis for $\mathbb{R}^4$:
{[0, 4, 0, 1] , [−2, 5, 0, 2] , [−3, 5, 1, 1] , [−1, 2, 0, 1]}
Let T be the linear transformation from $\mathbb{R}^4 \to \mathbb{R}^3$ such that
T ([0, 4, 0, 1]) = [3, 1, 2]
T ([−2, 5, 0, 2]) = [2, −1, 1]
T ([−3, 5, 1, 1]) = [−4, 3, 0]
and T ([−1, 2, 0, 1]) = [6, 1, −1]
Use this information to find T (x) for various $x \in \mathbb{R}^4$. Also, find the matrix representation of T .
• Example 5 is a good in-class exercise. Part (a) is an easy, straightforward application of the preceding material, and part (b) is a nice contrast: the problem can be simply stated but the actual computations assure the student that there is some significant mathematics going on.
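• Consider doing the details of Exercises 26 and 40 in class.
• Notice that the linear transformation $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$ maps $\begin{bmatrix} a \\ b \end{bmatrix}$ to $\begin{bmatrix} -b \\ a \end{bmatrix}$, and thus rotates a given set of points 90° relative to the origin. Similarly, $\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$ will stretch a figure in the x-direction, and $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$ will reflect it in the y-axis. These transformations can be illustrated by applying them to some simple polygons, or to a more complex figure. (Also, point out that not all transformations are linear transformations.)
[Figure: a sample figure and its images under $\begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$, $\begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$, $\begin{bmatrix} 2 & 0 \\ 0 & 1 \end{bmatrix}$, $\begin{bmatrix} -1 & 0 \\ 0 & 1 \end{bmatrix}$, and a transformation labeled “Not linear”.]
• Notice that the transformation $\begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix}$ maps $\begin{bmatrix} a \\ b \\ c \end{bmatrix}$ to $\begin{bmatrix} a \\ b \\ 0 \end{bmatrix}$, meaning that this transformation “flattens” a three dimensional figure, that is, it maps it to its projection onto the xy-plane.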
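Lecture Examples
• Example of a linear transformation:
Let $T : \mathbb{R}^3 \to \mathbb{R}^2$ such that $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + y \\ y + z \end{bmatrix}$.
• Example of a nonlinear transformation:
Let $T : \mathbb{R}^3 \to \mathbb{R}^2$ such that $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} xy \\ yz \end{bmatrix}$.
• A matrix transformation at work:
Let $T_A : \mathbb{R}^3 \to \mathbb{R}^2$ where $A = \begin{bmatrix} 2 & 3 & 1 \\ 0 & 1 & 1 \end{bmatrix}$. So $T_A \begin{bmatrix} 7 \\ 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 2 & 3 & 1 \\ 0 & 1 & 1 \end{bmatrix} \begin{bmatrix} 7 \\ 2 \\ 0 \end{bmatrix} = \begin{bmatrix} 20 \\ 2 \end{bmatrix}$.
• The matrix transformation of a linear transformation:
Let $T : \mathbb{R}^3 \to \mathbb{R}^2$ such that $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + y \\ y + z \end{bmatrix}$. Let $A = \begin{bmatrix} T(e_1) & T(e_2) & T(e_3) \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$. Then $T = T_A$.
• The matrix transformation of the composition of two linear transformations:
Let $T : \mathbb{R}^3 \to \mathbb{R}^2$ be $T \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} x + y \\ y + z \end{bmatrix}$. Let $U : \mathbb{R}^2 \to \mathbb{R}^2$ such that $U \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} x - y \\ x + y \end{bmatrix}$. Let $A = \begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & -1 \\ 1 & 1 \end{bmatrix}$. Then $U \circ T = BA = \begin{bmatrix} 1 & 0 & -1 \\ 1 & 2 & 1 \end{bmatrix}$.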
Tech Tip Use the vector capabilities of your CAS to graph the image of various vectors under the matrix transformation $\begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$ for various values of θ.
Group Work 1: The Murdered Vector If a group finishes early, ask them to find the “murder weapon”, the element of R4 that maps to v. Answer
Yes. If u = [−5, −3, 4, 12, t] for any value of t, then T u = v.
Group Work 2: Coordinating Our Efforts Answers
1. The ith column of A is given by $Ae_i$, where $e_i$ denotes the ith standard basis vector of $\mathbb{R}^n$. But $Ae_i = A[v_i]_B = [Pv_i]_B$.
2. If B is the standard basis, then $[Pv_i]_B = [Pe_i]_B = Pe_i$, and from the above, we know that $[Pv_i]_B$ is the ith column of A. Thus, the ith column of A is $Pe_i$, which agrees with Theorem 2.
3. We will determine the columns of A.
Column 1: $[Pv_1]_B = [2, 1, 1]^T_B = \left[\frac43, \frac23, -\frac13\right]^T$.
Column 2: $[Pv_2]_B = [1, 0, 3]^T_B = \left[\frac53, -\frac23, \frac43\right]^T$.
Column 3: $[Pv_3]_B = [1, 1, 1]^T_B = \left[\frac23, \frac13, \frac13\right]^T$.
Thus, $A = \begin{bmatrix} \frac43 & \frac53 & \frac23 \\ \frac23 & -\frac23 & \frac13 \\ -\frac13 & \frac43 & \frac13 \end{bmatrix}$.
Suggested Core Assignment Exercises 2, 4P , 8, 12, 16, 18, 20, 23, 26, 42P , 48, 54P
Group Work 1, Section 3.6 The Murdered Vector T is a linear transformation. We know very little about it. All we know is that T : R4 → R5 and ⎡ ⎤ ⎡ ⎤ 0 1 ⎛⎡ ⎤⎞ ⎛⎡ ⎤⎞ 1 1 ⎢1⎥ ⎢ ⎥ ⎜⎢ 2 ⎥⎟ ⎢ ⎥ ⎜⎢ 2 ⎥⎟ ⎢ 1 ⎥ ⎜⎢ ⎥⎟ ⎢ ⎥ ⎜⎢ ⎥⎟ ⎢ ⎥ T ⎜⎢ ⎥⎟ = ⎢ 1 ⎥ T ⎜⎢ ⎥⎟ = ⎢ 0 ⎥ ⎢ ⎥ ⎥ ⎝⎣ 3 ⎦⎠ ⎝⎣ 0 ⎦⎠ ⎢ ⎣0⎦ ⎣ 0⎦ 0 3 1 −1 ⎤ ⎡ 1 ⎛⎡ ⎤⎞ ⎡ ⎤ ⎛⎡ ⎤⎞ 0 0 1 ⎢ 0⎥ ⎜⎢ 1 ⎥⎟ ⎢ 1 ⎥ ⎥ ⎢ ⎜⎢ 0 ⎥⎟ ⎜⎢ ⎥⎟ ⎢ ⎥ ⎥ ⎢ ⎜⎢ ⎥⎟ T ⎜⎢ ⎥⎟ = ⎢ ⎥ T ⎜⎢ ⎥⎟ = ⎢ 0 ⎥ ⎥ ⎢ ⎝⎣ 2 ⎦⎠ ⎣ 1 ⎦ ⎝⎣ 2 ⎦⎠ ⎣ −1 ⎦ 3 0 3 0 ⎤ 1 ⎢ 4⎥ ⎥ ⎢ ⎥ ⎢ Is v = ⎢ 7 ⎥ in the range of T ? Prove your answer. ⎥ ⎢ ⎣ 8⎦ −2 ⎡
Group Work 2, Section 3.6 Coordinating Our Efforts Linear transformations are completely determined by their actions on bases. So, looking back at Section 3.5, we see that we can use the information (and notation) from the “Coordinates” subsection to generalize the notion of matrix transformation. Let B = {v1 , . . . , vn } be a basis for V = Rn and let P : V → V be a linear transformation. The matrix representation of P with respect to B is the matrix A such that [P v]B = A [v]B
for all v ∈ V . 1. Show that the columns of A are [P vi ]B .
2. Show that if B is the standard basis, then A is the matrix of P (as in the text).
3. Let $P([x, y, z]^T) = \begin{bmatrix} x + z \\ z \\ x + y \end{bmatrix}$ and let $B = \left\{ [1, 0, 1]^T, [1, 2, 0]^T, [0, 1, 1]^T \right\}$. Find the matrix representation of P with respect to B.
3.7
Applications
Suggested Time and Emphasis 1 class. Optional material.
Points to Stress Most of the applications in this section demonstrate that raising matrices to powers can give useful information about real-world phenomena, sometimes unexpectedly so. We recommend doing one or two of the following in detail: 1. Markov Chains 2. Modeling population growth with a Leslie matrix 3. Graph theory: adjacency matrices and digraphs 4. Error-correcting codes
Drill Question Let B be a matrix. Give an example where finding B 2 is important. Answers will vary. Several are given in the text.
Discussion Question What are some pros and cons of the Leslie population growth model?
Lecture Notes
• This is a good time to explore some properties of powers of a matrix. For example, consider ! ! 3 33 3 − 21 4 2 8 −4 A= B = 23 7 −23 13 2 2 −4
As k → ∞, $A^k \to \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$ and $B^k \to \begin{bmatrix} \infty & -\infty \\ \infty & -\infty \end{bmatrix}$. Notice that, just looking at the matrices, this behavior is hard to predict. In fact, it might even seem that the results should go the other way around, given that the absolute values of the entries of A are all larger than one. (Perhaps have the students guess the long-term behaviors of $A^k$ and $B^k$ before revealing them.) It turns out that if the eigenvalues of a matrix are all less than one in absolute value, its powers will approach the zero matrix as k → ∞, and if the eigenvalues are all greater than one in absolute value, the entries will grow unbounded.
• A classic, unsolved problem in mathematics (actually, in fourth-grade arithmetic) is the Collatz Conjecture. Choose any positive integer, and repeatedly apply the following function:
\[ C(n) = \begin{cases} \dfrac{n}{2} & \text{if } n \text{ is even} \\ 3n + 1 & \text{if } n \text{ is odd} \end{cases} \]
Repeatedly applying this function to the number 7, for example, gives us the following sequence: 7, 22, 11, 34, 17, 52, 26, 13, 40, 20, 10, 5, 16, 8, 4, 2, 1, 4, 2, 1, . . .
The conjecture is that all numbers wind up in a 4, 2, 1, . . . cycle, and although it has been verified for all numbers through about $1.12 \times 10^{17}$, it has not been proven in general. In 2001, Zarnowski showed that the conjecture can be restated as a problem involving Markov chains. We slightly change the function, like so:
\[ Z(n) = \begin{cases} \dfrac{n}{2} & \text{if } n \text{ is even} \\ \dfrac{3n + 1}{2} & \text{if } n \text{ is odd, } n > 1 \\ 1 & \text{if } n = 1 \end{cases} \]
Under this formulation, the path of 7 is now 7, 11, 17, 26, 13, 20, 10, 5, 8, 4, 2, 1, 1, 1, 1, . . .
and the Collatz Conjecture becomes, “All positive integers end up at 1.” We now fix n and define the transition matrix P by
\[ P_{ij} = \begin{cases} 1 & \text{if } Z(i) = j \\ 0 & \text{otherwise} \end{cases} \]
So if we let n = 8, we get
\[ P = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \end{bmatrix}. \]
Notice that there is no entry in row 7, because Z (7) = 11, which is “out of bounds”. It turns out that we can find the kth entry of m’s orbit by looking at the mth row of $P^k$. For example:
The path of 3 is 3, 5, 8, 4, 2, 1
The third row of $P$ is [0, 0, 0, 0, 1, 0, 0, 0]
The third row of $P^2$ is [0, 0, 0, 0, 0, 0, 0, 1]
The third row of $P^3$ is [0, 0, 0, 1, 0, 0, 0, 0]
The third row of $P^4$ is [0, 1, 0, 0, 0, 0, 0, 0]
The third row of $P^5$ is [1, 0, 0, 0, 0, 0, 0, 0]
The Collatz conjecture is true if it turns out that, for arbitrarily large n, the transition matrix converges to
\[ \begin{bmatrix} 1 & 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & 0 & \cdots \\ 1 & 0 & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \ddots \end{bmatrix}. \]
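The n = 8 computation above is easy to reproduce for a classroom demonstration. The following Python sketch is one way to do it; the names `z`, `transition_matrix`, and `mat_mul` are illustrative choices, and the only assumptions are the definitions of Z and P given above.

```python
def z(n):
    """The modified Collatz map Z defined above."""
    if n == 1:
        return 1
    return n // 2 if n % 2 == 0 else (3 * n + 1) // 2

def transition_matrix(n):
    """Entry (i, j) is 1 when Z(i) = j; rows whose image exceeds n stay zero."""
    p = [[0] * n for _ in range(n)]
    for i in range(1, n + 1):
        if z(i) <= n:
            p[i - 1][z(i) - 1] = 1
    return p

def mat_mul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

P = transition_matrix(8)
power = P
orbit_of_3 = []
for _ in range(5):
    # The position of the 1 in the third row of P^k is the k-th step of 3's path.
    orbit_of_3.append(power[2].index(1) + 1)
    power = mat_mul(power, P)
```

Students can change the 8 to a larger n and watch how many rows of the powers settle into the pattern [1, 0, 0, ...].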
• A graph is said to be connected if one can make a path from any vertex to any other vertex in the graph. If we are able to draw the graph nicely, then it is easy to tell, by inspection, if a given graph is connected.
[Figure: two sample graphs, one labeled “Connected” and one labeled “Not connected”.]
Notice that even if a graph is too complicated to draw, if its adjacency matrix is A, then $\lim_{k \to \infty} A^k$ determines whether or not the graph is connected.
Lecture Examples
• Markov Process: Every year, 10% of all University of Okoboji students change their major to mathematics, and 25% of University of Okoboji math majors change their major to something else, or graduate. If University of Okoboji enrollment remains a constant 13,500 and the math department starts with 100 majors, what can we say about the long-term departmental enrollment? Answer The number of math majors will approach 2/7 of the total enrollment, or 3857 students.
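The long-term answer can be checked numerically with a few lines of Python. This sketch rests on one interpretive assumption that matches the stated answer of 2/7: each year 10% of the students who are not math majors switch into mathematics, while 25% of the current math majors leave. The function name `step` and its parameters are illustrative.

```python
def step(math_majors, total, gain=0.10, loss=0.25):
    """One year: 10% of non-majors join mathematics, 25% of majors leave."""
    others = total - math_majors
    return math_majors + gain * others - loss * math_majors

majors = 100.0
for year in range(100):
    majors = step(majors, 13500)
```

Starting from a different initial number of majors and reaching the same limit makes the point that the steady state does not depend on the initial condition.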
• Graph Theory: The following graph is called bipartite. One way to define bipartite is that it has no odd cycles — a path from a vertex to itself must have an even number of steps.
How many two-cycles are there for the left-hand vertices? For the right-hand vertices? How about four-cycles? six-cycles?
Answer
These are the relevant adjacency matrices:
\[ A = \begin{bmatrix} 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 & 1 \\ 1 & 1 & 1 & 0 & 0 \\ 1 & 1 & 1 & 0 & 0 \end{bmatrix} \qquad A^2 = \begin{bmatrix} 2 & 2 & 2 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 2 & 2 & 2 & 0 & 0 \\ 0 & 0 & 0 & 3 & 3 \\ 0 & 0 & 0 & 3 & 3 \end{bmatrix} \]
\[ A^3 = \begin{bmatrix} 0 & 0 & 0 & 6 & 6 \\ 0 & 0 & 0 & 6 & 6 \\ 0 & 0 & 0 & 6 & 6 \\ 6 & 6 & 6 & 0 & 0 \\ 6 & 6 & 6 & 0 & 0 \end{bmatrix} \qquad A^4 = \begin{bmatrix} 12 & 12 & 12 & 0 & 0 \\ 12 & 12 & 12 & 0 & 0 \\ 12 & 12 & 12 & 0 & 0 \\ 0 & 0 & 0 & 18 & 18 \\ 0 & 0 & 0 & 18 & 18 \end{bmatrix} \]
So we have the (trivial) result that the number of two-cycles are 3 and 2, and that the number of four-cycles are 12 and 18.
• A (6, 3) binary code:
\[ P = \begin{bmatrix} 1 & 1 & 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 & 1 & 0 \\ 0 & 1 & 1 & 0 & 0 & 1 \end{bmatrix} \qquad G = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \\ 0 & 1 & 1 \end{bmatrix} \]
Allowable words: 000000, 001011, 010101, 011110, 100110, 101101, 110011, 111000.
Let c = 101101. If we transmit c with an error in the fourth place, we get c′ = 101001. $Pc' = \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}$, which is the fourth column of P .
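The error-location step in the (6, 3) code example can be sketched in Python as follows. The function names `syndrome` and `correct_single_error` are hypothetical, and the sketch assumes at most one bit is in error, as in the example; P is the parity check matrix given above.

```python
# Parity check matrix of the (6, 3) code from the lecture example.
P = [[1, 1, 0, 1, 0, 0],
     [1, 0, 1, 0, 1, 0],
     [0, 1, 1, 0, 0, 1]]

def syndrome(word):
    """P times the received word, with arithmetic mod 2."""
    return [sum(p * w for p, w in zip(row, word)) % 2 for row in P]

def correct_single_error(word):
    """Flip the bit whose column of P equals the syndrome (one error assumed)."""
    s = syndrome(word)
    if not any(s):
        return list(word)              # zero syndrome: accept the word as is
    columns = [list(col) for col in zip(*P)]
    position = columns.index(s)        # raises ValueError if no column matches
    fixed = list(word)
    fixed[position] ^= 1
    return fixed

received = [1, 0, 1, 0, 0, 1]          # c with an error in the fourth place
```

Because the six columns of P are distinct and nonzero, each single-bit error produces a different syndrome, which is what makes the lookup work.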
Tech Tip An important graph in Graph Theory is the Heawood graph:
[Figure: the Heawood graph, with two of its vertices labeled A and B.]
It is clear there is no one-path from A to B , for they are not adjacent. One can also see, by inspection, there is only one two-path from A to B . It is easy to convince oneself that there are no k-paths from A to B if k is odd. One could even find all 11 four-paths. Use the techniques of this section to find how many k-paths there are from A to B for various values of k . (Sample answers: 1 two-path, eleven four-paths, 103 six-paths, 935 eight-paths.)
Group Work 1: Markov in Literature The first three questions emphasize that a linear Markov process is not dependent on initial conditions. Answers
1. 30
2. 30
3. 30
4. 3/22 ≈ 13.6%
Group Work 2: The Round Robin Tournament This is similar to Example 5 in the text, with an interesting phenomenon occurring when the weighting of indirect victories is changed. Don’t mention the similarity to Example 5 right away; let the students discover it for themselves. You may want to introduce the activity by showing the students how to do matrix multiplication on their calculators, if such technology is allowed in your course. Answers
1. 1st place (tie): John Dandy and The Blurs. 3rd place: The Toys. 4th place: Frenzy. 5th place (tie): The Brotherhood of Dada and Number None.
2. 1st place: The Blurs. 2nd place: John Dandy. 3rd place: The Toys. 4th place: Number None. 5th place: Frenzy. 6th place: The Brotherhood of Dada.
3. 1st place: The Blurs. 2nd place: John Dandy. 3rd place: The Toys. 4th place: Frenzy. 5th place: Number None. 6th place: The Brotherhood of Dada.
Suggested Core Assignment Markov Chains: Exercises 1, 2, 3, 4, 10, 12, 20 Graphs and Digraphs: Exercises 26, 28, 30, 34, 41, 43, 52, 53 Error-Correcting Codes: Exercises 61, 62, 64, 68, 72
Group Work 1, Section 3.7 Markov in Literature The following is a quotation from Lewis Carroll’s Alice In Wonderland: Alice couldn’t help laughing as she said “I don’t want you to hire me — and I don’t care for jam.” “It’s very good jam,” said the Queen. “Well, I don’t want it to-day, at any rate.” “You couldn’t have it if you did want it,” the Queen said. “The rule is, jam to-morrow and jam yesterday — but never jam to-day.”
Let’s assume that the Queen was not quite as cruel as she said she was. If you don’t get jam today, assume there is a 15% chance you will get jam tomorrow, and if you get jam today, there is a 5% chance that you get jam tomorrow.
1. If the Queen has 220 employees, and on the first day none of them gets jam, how many of them will get jam per day in the long run?
2. If the Queen has 220 employees, and on the first day half of them get jam, how many of them will get jam per day in the long run?
3. If the Queen has 220 employees, and on the first day all of them get jam, how many of them will get jam per day in the long run?
4. If you don’t get jam on the first day, what is the probability you will get jam on the 1000th day?
Group Work 2, Section 3.7 The Round Robin Tournament
The Bay-City Frisbee Soccer league has their first full tournament. The results are as follows:
ROUND 1
Number None defeats The Blurs
The Toys defeat The Brotherhood of Dada
John Dandy defeats Frenzy
ROUND 2
The Blurs defeat The Toys
Frenzy defeats Number None
John Dandy defeats The Brotherhood of Dada
ROUND 3
The Blurs defeat John Dandy
The Toys defeat Number None
Frenzy defeats The Brotherhood of Dada
ROUND 4
The Blurs defeat The Brotherhood of Dada
John Dandy defeats Number None
The Toys defeat Frenzy
ROUND 5
The Blurs defeat Frenzy
John Dandy defeats The Toys
The Brotherhood of Dada defeats Number None
1. Rank the teams according to their win-loss record.
2. Notice there are ties. Rank the teams according to their total victories, direct and indirect.
3. Team Frenzy argues that indirect victories are not as important as direct victories. Rank the teams again, but this time have an indirect victory count only 1/3 as much as a direct victory.
4. Which of the above rankings is the most fair? Why? Are there any other rankings you can come up with that would be more fair?
4 Eigenvalues and Eigenvectors
4.1 Introduction to Eigenvalues and Eigenvectors
Suggested Time and Emphasis 1 class. Essential material.
Points to Stress 1. Eigenvectors are generalizations of steady state vectors. 2. Eigenvectors are unusual vectors, in that they force a given matrix to behave like a scalar. 3. Geometric interpretations of eigenvalues, eigenvectors, and eigenspaces. 4. Computations of eigenvalues, eigenvectors, and eigenspaces for a given matrix.
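Drill Question
Why is $\begin{bmatrix}3\\4\end{bmatrix}$ an eigenvector of the matrix $\begin{bmatrix}5&-3\\4&-2\end{bmatrix}$?
Answer
Because $\begin{bmatrix}5&-3\\4&-2\end{bmatrix}\begin{bmatrix}3\\4\end{bmatrix} = \begin{bmatrix}3\\4\end{bmatrix} = 1\begin{bmatrix}3\\4\end{bmatrix}$. Furthermore, the corresponding eigenvalue is 1.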
Discussion Questions How many distinct eigenvalues can a 2 × 2 matrix have? How many distinct eigenvalues can a 3 × 3 matrix have? Why must a matrix be square in order to have eigenvalues?
Test Question
The unit vectors x in $\mathbb{R}^2$ and their images Ax under the action of a 2 × 2 matrix A are drawn head-to-tail below. Estimate the eigenvalues of A.
[Figure: eigenpicture of A]
Answer
1/2, 2
Lecture Notes
• Show that the eigenvalues of a diagonal matrix such as $\begin{bmatrix}\lambda_1&0\\0&\lambda_2\end{bmatrix}$ are $\lambda_1$ and $\lambda_2$. (Perhaps discuss diagonal n × n matrices as well.) This topic is addressed fully in Section 4.3.
• Present this simple proof showing that a matrix A is singular if and only if it has a zero eigenvalue: If λ = 0 is an eigenvalue with eigenvector v, then Av = λv = 0, which implies that there is a nontrivial solution to Av = 0, and hence A is singular. Conversely, if Av = 0 has a nontrivial solution, then v is an eigenvector for the eigenvalue λ = 0.
• Draw the “eigenpicture” of a matrix A for which every vector is an eigenvector, such as this one:
[Figure: eigenpicture in which every vector is an eigenvector]
• The matrix $\begin{bmatrix}1&0&0\\0&2&0\\0&0&4\end{bmatrix}$ transforms an object in three dimensions. The object’s length is unchanged, its width is doubled, and its height is quadrupled. The eigenvalues 1, 2, and 4 clearly relate to the associated transformation. The matrix $\frac{1}{12}\begin{bmatrix}20&-9&8\\0&48&0\\4&9&16\end{bmatrix}$ also doubles the object’s size in a particular dimension, quadruples it in another, and leaves a third alone. Finding eigenvalues allows us to extract that information from this matrix.
• Demonstrate that $A = \begin{bmatrix}\frac{\sqrt3}{2}&\frac12\\[2pt]-\frac12&\frac{\sqrt3}{2}\end{bmatrix}$ has no real eigenvalue. First algebraically, assuming that $A\begin{bmatrix}a\\b\end{bmatrix} = k\begin{bmatrix}a\\b\end{bmatrix}$ and trying to find $\begin{bmatrix}a\\b\end{bmatrix}$, and then geometrically, showing that A rotates vectors 30°, so that there is no nonzero real solution to Ax = λx, and finally by examining the “eigenpicture”.
• Note that, in general, matrix multiplication and scalar multiplication give quite different results, so eigenvectors are very special vectors.
• Note that each eigenvalue has infinitely many associated eigenvectors, but one can always find a maximal linearly independent set of eigenvectors which serves as a basis for the eigenspace.
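Lecture Examples
• A 2 × 2 matrix with two distinct eigenvalues:
$\begin{bmatrix}2&3\\0&-1\end{bmatrix}$ has characteristic polynomial $\lambda^2 - \lambda - 2$, eigenvalues −1 and 2, eigenvectors $a\begin{bmatrix}-1\\1\end{bmatrix}$ ($a \neq 0$) corresponding to λ = −1 and $b\begin{bmatrix}1\\0\end{bmatrix}$ ($b \neq 0$) corresponding to λ = 2, and eigenspaces $\left\{\begin{bmatrix}-a\\a\end{bmatrix}\right\}$ and $\left\{\begin{bmatrix}b\\0\end{bmatrix}\right\}$.
• A 2 × 2 matrix with one distinct eigenvalue and two eigenvectors:
$\begin{bmatrix}-3&0\\0&-3\end{bmatrix}$ has characteristic polynomial $\lambda^2 + 6\lambda + 9$, eigenvalues −3 and −3, eigenvectors $a\begin{bmatrix}1\\0\end{bmatrix}$ and $b\begin{bmatrix}0\\1\end{bmatrix}$ corresponding to λ = −3, and eigenspace $\left\{\begin{bmatrix}a\\b\end{bmatrix}\right\}$.
• A 2 × 2 matrix with one distinct eigenvalue and one eigenvector:
$\begin{bmatrix}1&1\\0&1\end{bmatrix}$ has characteristic polynomial $\lambda^2 - 2\lambda + 1$, eigenvalues 1 and 1, eigenvector $a\begin{bmatrix}1\\0\end{bmatrix}$ corresponding to λ = 1, and eigenspace $\left\{\begin{bmatrix}a\\0\end{bmatrix}\right\}$.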
Tech Tips
• As mentioned in Section 3.6, if the absolute value of each of a matrix A’s eigenvalues is less than one, then as k → ∞, $A^k$ approaches the zero matrix. Let students experiment with this fact using technology.
• Have students write a program for creating “eigenpictures” for a matrix, as shown in the textbook. The students can then use their diagrams to estimate eigenvectors associated with a matrix.
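One possible way to run the first experiment is sketched below. The two matrices are illustrative choices (both triangular, so their eigenvalues can be read off the diagonal) and are not taken from the textbook.

```python
def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_power(A, k):
    """Compute A^k for k >= 1 by repeated multiplication."""
    result = A
    for _ in range(k - 1):
        result = mat_mul(result, A)
    return result

def largest_entry(M):
    """Largest absolute value among the entries of M."""
    return max(abs(v) for row in M for v in row)

A = [[0.5, 0.4], [0.0, 0.8]]   # eigenvalues 0.5 and 0.8: powers shrink toward zero
B = [[1.1, 0.0], [0.0, 0.5]]   # eigenvalue 1.1 exceeds one: powers blow up

for k in (1, 5, 10, 50, 100):
    print(k, largest_entry(mat_power(A, k)), largest_entry(mat_power(B, k)))
```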
Group Work 1: A Dynamical System on Graphs
This group activity is based on the introduction to Chapter 4. If this activity is done in class, it should be referred to whenever appropriate throughout coverage of Chapter 4. Introduce this activity by going over Problem 1 in Section 4.0 with the students. After defining a complete graph for them, draw $K_4$ as done in the text:
[Figure: the complete graph $K_4$ with vertices $v_1$, $v_2$, $v_3$, $v_4$]
Then pick an arbitrary vector such as [1, 2, 3, 4] and label each vertex with its corresponding vector component. Write the adjacency matrix of the graph:
\[
A = \begin{bmatrix} 0&1&1&1\\ 1&0&1&1\\ 1&1&0&1\\ 1&1&1&0 \end{bmatrix}
\]
Relabel the vertices with the scaled components (divide each product by its largest component):
\[
\begin{bmatrix} 0&1&1&1\\ 1&0&1&1\\ 1&1&0&1\\ 1&1&1&0 \end{bmatrix}
\begin{bmatrix}1\\2\\3\\4\end{bmatrix}
= \begin{bmatrix}9\\8\\7\\6\end{bmatrix}
\;\longrightarrow\;
\begin{bmatrix}1\\ \tfrac89\\ \tfrac79\\ \tfrac69\end{bmatrix}
\]
\[
\begin{bmatrix} 0&1&1&1\\ 1&0&1&1\\ 1&1&0&1\\ 1&1&1&0 \end{bmatrix}^{2}
\begin{bmatrix}1\\2\\3\\4\end{bmatrix}
= \begin{bmatrix}21\\22\\23\\24\end{bmatrix}
\;\longrightarrow\;
\begin{bmatrix}\tfrac{21}{24}\\ \tfrac{22}{24}\\ \tfrac{23}{24}\\ 1\end{bmatrix}
\]
and so forth, to show that a steady state solution exists. After scaling,
\[
\begin{bmatrix} 0&1&1&1\\ 1&0&1&1\\ 1&1&0&1\\ 1&1&1&0 \end{bmatrix}^{100}
\begin{bmatrix}1\\2\\3\\4\end{bmatrix}
\;\longrightarrow\;
\begin{bmatrix}1\\1\\1\\1\end{bmatrix}
\]
Note that one can view this as a sort of “flow” where the smallest vertex gets larger at the expense of the large vertices. Perhaps show that the steady state vector is the same, regardless of initial vector. Now hand out the activity. To extend this activity, either as a homework assignment or for students who finish early, have them explore cycles as done in the text.
Answers
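Answers will vary based upon the way the graph is labeled. The answers below are based on this labeling:
[Figure: the Petersen graph with vertices labeled 1 through 10; vertices 1, 2, 3, 4, 5 form a cycle, and the remaining edges are 1-7, 2-8, 3-9, 4-10, 5-6, 6-8, 6-9, 7-9, 7-10, 8-10]
1.
\[
A = \begin{bmatrix}
0&1&0&0&1&0&1&0&0&0\\
1&0&1&0&0&0&0&1&0&0\\
0&1&0&1&0&0&0&0&1&0\\
0&0&1&0&1&0&0&0&0&1\\
1&0&0&1&0&1&0&0&0&0\\
0&0&0&0&1&0&0&1&1&0\\
1&0&0&0&0&0&0&0&1&1\\
0&1&0&0&0&1&0&0&0&1\\
0&0&1&0&0&1&1&0&0&0\\
0&0&0&1&0&0&1&1&0&0
\end{bmatrix}
\]
2., 3. Each column after n = 0 is scaled by its largest component and rounded to two decimal places.
n     0    1     2     3     4     5     6     7     8     9     10
v1    4    0.36  1     0.62  1     0.81  1     0.91  1     0.96  1
v2    3    0.73  0.75  0.82  0.86  0.90  0.93  0.95  0.97  0.98  0.99
v3    2    0.64  0.71  0.80  0.85  0.90  0.93  0.95  0.97  0.98  0.99
v4    1    0.55  0.67  0.79  0.85  0.90  0.93  0.95  0.97  0.98  0.99
v5    0    0.45  0.63  0.77  0.84  0.90  0.93  0.95  0.97  0.98  0.99
v6    0    0.45  0.63  0.77  0.84  0.90  0.93  0.95  0.97  0.98  0.99
v7    1    1     0.46  1     0.71  1     0.86  1     0.94  1     0.97
v8    2    0.64  0.71  0.80  0.85  0.90  0.93  0.95  0.97  0.98  0.99
v9    3    0.27  0.96  0.61  0.99  0.80  1.00  0.91  1.00  0.96  1.00
v10   4    0.36  1     0.62  1     0.81  1     0.91  1     0.96  1
The steady state vector is [1, 1, 1, 1, 1, 1, 1, 1, 1, 1]: every vertex of the Petersen graph has degree 3, so the vector of all ones satisfies Ax = 3x, and the scaled iterates approach it.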
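The iteration in this activity is easy to automate, which helps when checking student tables. The sketch below is one way to do it; the edge list is an assumption about how the Petersen graph is labeled (an outer 5-cycle on vertices 1 to 5, with vertices 6 to 10 inside), and a different labeling will give different intermediate columns but the same steady state.

```python
# Assumed labeling of the Petersen graph (hypothetical; match it to your own drawing).
edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 1),      # outer cycle
         (1, 7), (2, 8), (3, 9), (4, 10), (5, 6),     # spokes
         (6, 8), (6, 9), (7, 9), (7, 10), (8, 10)]    # inner star

n = 10
neighbors = [[] for _ in range(n)]
for u, v in edges:
    neighbors[u - 1].append(v - 1)
    neighbors[v - 1].append(u - 1)

def iterate(x, steps):
    """Apply x -> Ax (sum over neighbors), then scale by the largest component."""
    for _ in range(steps):
        y = [sum(x[j] for j in neighbors[i]) for i in range(n)]
        largest = max(y)
        x = [value / largest for value in y]
    return x

x0 = [4, 3, 2, 1, 0, 0, 1, 2, 3, 4]
for step_count in range(1, 11):
    print(step_count, [round(v, 2) for v in iterate(x0, step_count)])
```

Scaling by the largest component works here because the starting labels are nonnegative; for vectors with negative entries one would scale by the entry of largest absolute value instead.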
Group Work 2: Introduction to Similarity
In addition to giving students an opportunity to apply what they have learned in Section 4.1, this activity foreshadows Section 4.4. The first version involves 2 × 2 matrices, and the second extends the idea to 3 × 3 matrices. Either or both versions can be distributed, depending on the preparation of the particular class. Students in an advanced class should be asked to prove the conjectures they provide for Problem 6 in Version 1.
Answers
Version 1
1. 3, −1
2. $\begin{bmatrix}3&1\\-2&1\end{bmatrix}\begin{bmatrix}3&1\\0&-1\end{bmatrix}\begin{bmatrix}3&1\\-2&1\end{bmatrix}^{-1} = \begin{bmatrix}\frac{13}{5}&-\frac35\\[2pt]-\frac{12}{5}&-\frac35\end{bmatrix}$
3. 3, −1
4. $\begin{bmatrix}0&1\\1&2\end{bmatrix}\begin{bmatrix}3&1\\0&-1\end{bmatrix}\begin{bmatrix}0&1\\1&2\end{bmatrix}^{-1} = \begin{bmatrix}-1&0\\-7&3\end{bmatrix}$
5. 3, −1
6. The eigenvalues are always 3 and −1. (That is, they are always the same as the eigenvalues of A.)
To prove this, assume that S is invertible and consider the set of solutions to $|SAS^{-1} - \lambda I| = 0$:
\[
|SAS^{-1} - \lambda I| = 0
\iff |SAS^{-1} - \lambda SS^{-1}| = 0
\iff |S(A - \lambda I)S^{-1}| = 0
\iff |S|\,|A - \lambda I|\,|S^{-1}| = 0
\iff |A - \lambda I| = 0
\]
The last step uses the fact that $|S|$ and $|S^{-1}|$ are nonzero.
Version 2
1. 4, 8, 12
2. $\begin{bmatrix}1&2&1\\1&-2&0\\0&3&2\end{bmatrix}\begin{bmatrix}12&3&0\\0&8&0\\24&15&4\end{bmatrix}\begin{bmatrix}1&2&1\\1&-2&0\\0&3&2\end{bmatrix}^{-1} = \begin{bmatrix}40&-4&-18\\[2pt]\frac{22}{5}&\frac{38}{5}&-\frac{11}{5}\\[2pt]\frac{276}{5}&-\frac{36}{5}&-\frac{118}{5}\end{bmatrix}$
3. 4, 8, 12
4. $\begin{bmatrix}0&0&1\\0&1&0\\-2&0&0\end{bmatrix}\begin{bmatrix}12&3&0\\0&8&0\\24&15&4\end{bmatrix}\begin{bmatrix}0&0&1\\0&1&0\\-2&0&0\end{bmatrix}^{-1} = \begin{bmatrix}4&15&-12\\0&8&0\\0&-6&12\end{bmatrix}$
5. 4, 8, 12
6. The eigenvalues are always 4, 8, and 12. (That is, they are always the same as the eigenvalues of A.) See the proof of Version 1, Problem 6 above.
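The Version 1 computations can be checked with exact rational arithmetic. This is a small sketch for the 2 × 2 case only; the helper names are illustrative, and the trace and determinant are compared because together they determine the characteristic polynomial λ² − (trace)λ + (determinant).

```python
from fractions import Fraction

def mat_mul(X, Y):
    """Product of two 2 x 2 matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse_2x2(M):
    """Exact inverse of a 2 x 2 integer matrix with nonzero determinant."""
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[Fraction(d, det), Fraction(-b, det)],
            [Fraction(-c, det), Fraction(a, det)]]

def trace_and_det(M):
    """Coefficients that determine the characteristic polynomial of a 2 x 2 matrix."""
    (a, b), (c, d) = M
    return (a + d, a * d - b * c)

A = [[3, 1], [0, -1]]
for S in ([[3, 1], [-2, 1]], [[0, 1], [1, 2]]):
    similar = mat_mul(mat_mul(S, A), inverse_2x2(S))
    print(similar, trace_and_det(similar), trace_and_det(A))
```

Both similar matrices share the trace 2 and determinant −3 of A, so all three have characteristic polynomial λ² − 2λ − 3 and eigenvalues 3 and −1.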
Group Work 3: Matrix Construction
After students are put into groups, they should be “warmed up” by being asked to compute eigenvalues and eigenvectors associated with a few matrices. Problem 4 is significantly harder than the others, but will become easier after the students have learned about similar matrices.
Answers (may vary)
1. $\begin{bmatrix}-1&0&0\\0&2&0\\0&0&3\end{bmatrix}$
2. $\begin{bmatrix}-1&3&4\\0&2&5\\0&0&3\end{bmatrix}$
3. $\begin{bmatrix}1&-\sqrt2\\-\sqrt2&0\end{bmatrix}$ works (its trace is 1 and its determinant is −2).
4. $\begin{bmatrix}-4&\frac72&\frac12\\[2pt]-6&6&1\\-6&3&2\end{bmatrix}$ works, as does any matrix of the form $A\begin{bmatrix}-1&0&0\\0&2&0\\0&0&3\end{bmatrix}A^{-1}$ whose entries are all nonzero.
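Student answers to Problem 4 will vary, so a quick checker is handy. The sketch below tests whether given numbers are eigenvalues of a 3 × 3 matrix by evaluating det(M − λI) exactly; the sample matrix is one candidate with all entries nonzero, and the function names are illustrative.

```python
from fractions import Fraction as F

def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def is_eigenvalue(M, lam):
    """True when det(M - lam * I) is exactly zero."""
    shifted = [[M[r][s] - (lam if r == s else 0) for s in range(3)] for r in range(3)]
    return det3(shifted) == 0

# One candidate answer with no zero entries (an example, not the only answer).
M = [[F(-4), F(7, 2), F(1, 2)],
     [F(-6), F(6), F(1)],
     [F(-6), F(3), F(2)]]

for lam in (-1, 2, 3):
    print(lam, is_eigenvalue(M, lam))
print(all(entry != 0 for row in M for entry in row))
```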
Suggested Core Assignment
Exercises 3, 5, 9, 12, 13, 15, 19, 22, 25, 28, 35P, 37P
Group Work 1, Section 4.1 A Dynamical System on Graphs
Meet the Petersen graph:
[Figure: the Petersen graph, unlabeled]
1. As you did in class with $K_4$, label the vertices of the Petersen graph and find the 10 × 10 adjacency matrix A.
2. Now let x = [4, 3, 2, 1, 0, 0, 1, 2, 3, 4]. Compute Ax, scale it, and write down the new vertex labels. (A table is provided below.)
3. Iterate the process, as done in class. What is the steady state vector?
                    v1   v2   v3   v4   v5   v6   v7   v8   v9   v10
Initial Value       4    3    2    1    0    0    1    2    3    4
First Iteration
Second Iteration
Third Iteration
Fourth Iteration
Fifth Iteration
Sixth Iteration
Seventh Iteration
Eighth Iteration
Ninth Iteration
Tenth Iteration
Group Work 2, Section 4.1 Introduction to Similarity (Version 1)
1. Find the eigenvalues of $A = \begin{bmatrix}3&1\\0&-1\end{bmatrix}$.
2. Compute $SAS^{-1}$ if $S = \begin{bmatrix}3&1\\-2&1\end{bmatrix}$.
3. Find the eigenvalues of $SAS^{-1}$.
4. Compute $SAS^{-1}$ if $S = \begin{bmatrix}0&1\\1&2\end{bmatrix}$.
5. Find the eigenvalues of $SAS^{-1}$ in this case.
6. Make a conjecture about the eigenvalues of $SAS^{-1}$, where A and S are 2 × 2 matrices.
Group Work 2, Section 4.1 Introduction to Similarity (Version 2)
1. Find the eigenvalues of $A = \begin{bmatrix}12&3&0\\0&8&0\\24&15&4\end{bmatrix}$.
2. Compute $SAS^{-1}$ if $S = \begin{bmatrix}1&2&1\\1&-2&0\\0&3&2\end{bmatrix}$.
3. Find the eigenvalues of $SAS^{-1}$.
4. Compute $SAS^{-1}$ if $S = \begin{bmatrix}0&0&1\\0&1&0\\-2&0&0\end{bmatrix}$.
5. Find the eigenvalues of $SAS^{-1}$ in this case.
6. Make a conjecture about the eigenvalues of $SAS^{-1}$, where A and S are 3 × 3 matrices.
Group Work 3, Section 4.1 Matrix Construction 1. Construct a diagonal matrix with eigenvalues −1, 2, and 3.
2. Construct an upper triangular matrix with eigenvalues −1, 2, and 3.
3. Construct a symmetric, nondiagonal 2 × 2 matrix with eigenvalues −1 and 2.
4. Construct a 3 × 3 matrix with eigenvalues −1, 2 and 3, all of whose entries are nonzero.
4.2 Determinants
Suggested Time and Emphasis 1 class. Essential material.
Points to Stress
1. The recursive definition of the determinant of an n × n matrix as cofactor expansion.
2. The computation of the determinant of an n × n matrix using row reduction.
3. The use of the determinant of an n × n matrix to determine matrix singularity.
4. Properties of the determinant relating to matrix multiplication, inverses, transposes and triangular matrices.
Drill Question State two different uses of the determinant. Answer Answers will vary.
Discussion Question An elementary row operation changes the determinant of a matrix in a simple, predictable way. Is there a similar relationship between elementary row operations and eigenvalues? Answer No
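Test Question
Prove that, for a square matrix A, $AA^T$ is invertible if and only if A is invertible.
Answer
A is invertible if and only if $\det(A) \neq 0$. Since $\det(A^T) = \det(A)$, we have $\det(AA^T) = \det(A)\det(A^T) = \det(A)^2$, which is nonzero if and only if $\det(A) \neq 0$. Hence the theorem is true.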
Lecture Notes
• The determinant of a matrix is not the same thing as the absolute value of a number, even though both are written with vertical bars. For example, determinants can be negative, while absolute values cannot. However, they do have similarities with respect to inverses and products: $|AB| = |A|\,|B|$ and $|A^{-1}| = 1/|A|$. Another similarity is in their geometric interpretations. The absolute value of a number gives its magnitude. If a matrix transforms a two-dimensional (three-dimensional) shape, the absolute value of the determinant gives the factor by which its area (volume) increases or decreases, which can be viewed as analogous to magnitude.
• Justify Cramer’s Rule. Note that it only saves us time if we are interested in the values of a few variables in a large system.
• Point out that it is best to choose a row or column that has many zeros when computing with cofactor expansion.
• Define the Wronskian of a pair of functions: $W(f, g) = \begin{vmatrix} f & g \\ f' & g' \end{vmatrix}$. State that, on an interval where g is nonzero, f = kg if and only if the Wronskian is zero. This fact can be proved by showing that W = 0 if and only if $\left(\frac{f}{g}\right)' = 0$. The concept of the Wronskian extends to more than two functions: if the functions are linearly dependent, then the Wronskian is zero, and the converse holds for well-behaved families such as analytic functions.
• Show that if an n × n matrix A has an eigenvalue of 1, then |A − I| = 0, to foreshadow Section 4.3.
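Lecture Examples
• A determinant of a 3 × 3 matrix evaluated by both cofactor expansion and the “basket” method:
\[
\begin{vmatrix}1&9&2\\7&3&6\\8&0&5\end{vmatrix}
= 8\begin{vmatrix}9&2\\3&6\end{vmatrix} - 0\begin{vmatrix}1&2\\7&6\end{vmatrix} + 5\begin{vmatrix}1&9\\7&3\end{vmatrix}
= 8(48) + 5(-60) = 84
\]
\[
\begin{vmatrix}1&9&2\\7&3&6\\8&0&5\end{vmatrix} = 15 + 432 + 0 - 48 - 0 - 315 = 84
\]
• A determinant of a 4 × 4 matrix evaluated by cofactor expansion into 3 × 3 matrices:
\[
\begin{vmatrix}1&0&0&1\\2&3&5&7\\-1&0&1&2\\9&8&1&2\end{vmatrix}
= 1\begin{vmatrix}3&5&7\\0&1&2\\8&1&2\end{vmatrix} - 1\begin{vmatrix}2&3&5\\-1&0&1\\9&8&1\end{vmatrix}
= 24 - (-26) = 50
\]
• A determinant of a 4 × 4 triangular matrix evaluated by multiplication along its diagonal:
\[
\begin{vmatrix}1&0&0&0\\2&3&0&0\\-1&0&1&0\\9&8&1&2\end{vmatrix} = 1\cdot3\cdot1\cdot2 = 6
\]
• Cramer’s Rule used to find the second variable in a 4 × 4 system of equations:
\[
\begin{bmatrix}1&1&1&1\\1&-1&1&-1\\1&2&3&-4\\1&10&2&-2\end{bmatrix}
\begin{bmatrix}w\\x\\y\\z\end{bmatrix}
= \begin{bmatrix}11\\-9\\-25\\6\end{bmatrix}
\;\Rightarrow\;
x = \frac{\begin{vmatrix}1&11&1&1\\1&-9&1&-1\\1&-25&3&-4\\1&6&2&-2\end{vmatrix}}{\begin{vmatrix}1&1&1&1\\1&-1&1&-1\\1&2&3&-4\\1&10&2&-2\end{vmatrix}} = \frac{72}{36} = 2
\]
• The inverse of the 3 × 3 matrix $\begin{bmatrix}1&4&4\\2&8&0\\-1&-1&-1\end{bmatrix}$ found by Gaussian elimination:
\[
\left[\begin{array}{ccc|ccc}1&4&4&1&0&0\\2&8&0&0&1&0\\-1&-1&-1&0&0&1\end{array}\right]
\Longrightarrow
\left[\begin{array}{ccc|ccc}1&4&4&1&0&0\\0&0&-8&-2&1&0\\0&3&3&1&0&1\end{array}\right]
\Longrightarrow
\left[\begin{array}{ccc|ccc}1&4&4&1&0&0\\0&3&3&1&0&1\\0&0&-8&-2&1&0\end{array}\right]
\]
\[
\Longrightarrow
\left[\begin{array}{ccc|ccc}1&4&4&1&0&0\\0&1&1&\frac13&0&\frac13\\0&0&-8&-2&1&0\end{array}\right]
\Longrightarrow
\left[\begin{array}{ccc|ccc}1&0&0&-\frac13&0&-\frac43\\0&1&1&\frac13&0&\frac13\\0&0&-8&-2&1&0\end{array}\right]
\Longrightarrow
\left[\begin{array}{ccc|ccc}1&0&0&-\frac13&0&-\frac43\\0&1&1&\frac13&0&\frac13\\0&0&1&\frac14&-\frac18&0\end{array}\right]
\]
\[
\Longrightarrow
\left[\begin{array}{ccc|ccc}1&0&0&-\frac13&0&-\frac43\\0&1&0&\frac1{12}&\frac18&\frac13\\0&0&1&\frac14&-\frac18&0\end{array}\right]
\]
So $\begin{bmatrix}1&4&4\\2&8&0\\-1&-1&-1\end{bmatrix}^{-1} = \begin{bmatrix}-\frac13&0&-\frac43\\[2pt]\frac1{12}&\frac18&\frac13\\[2pt]\frac14&-\frac18&0\end{bmatrix}$.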
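• Transformation of a shape by a matrix with large determinant and by a matrix with small determinant:
$A = \begin{bmatrix}1&1\\-1&1\end{bmatrix}$, with determinant 2:
[Figure: a shape in the xy-plane and its image under A]
B, a 2 × 2 matrix with ones on the diagonal and off-diagonal entries 1 and 1/2, with determinant 1/2:
[Figure: the same shape and its image under B]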
Tech Tip Have students use their calculators or computers to compute determinants of matrices with dimensions 1 × 1, 2 × 2, 3 × 3, etc. At some point, the technology will slow down noticeably. Have them try to time the program, and then graph time versus dimension. Have them try to describe the growth.
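If students program the determinant themselves, the recursive cofactor definition makes the growth easy to see. The following sketch is one way to set up the experiment; the test matrices are arbitrary (an assumption made only so every entry is nonzero), and the timings will differ from machine to machine.

```python
import time

def det(M):
    """Determinant by cofactor expansion along the first row (recursive definition)."""
    n = len(M)
    if n == 1:
        return M[0][0]
    total = 0
    for j in range(n):
        # Minor: delete row 0 and column j.
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

# Sanity checks against the lecture examples.
print(det([[1, 9, 2], [7, 3, 6], [8, 0, 5]]))
print(det([[1, 0, 0, 1], [2, 3, 5, 7], [-1, 0, 1, 2], [9, 8, 1, 2]]))

# Time the computation for dimensions 1 through 8.
for n in range(1, 9):
    M = [[(i * j + i + j) % 7 + 1 for j in range(n)] for i in range(n)]
    start = time.perf_counter()
    det(M)
    print(n, time.perf_counter() - start)
```

Cofactor expansion does on the order of n! multiplications, so each increase in dimension multiplies the running time by roughly n. Built-in determinant routines on calculators and computers typically use row reduction, which grows far more slowly, so students may need larger matrices before they notice a slowdown.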
Group Work 1: Determinants and Matrix Operations
This activity is designed to allow students to develop some of the ideas of Theorem 3 on their own. It is assumed that they have some sort of technology available to compute all the determinants, since there are too many to be done by hand in a reasonable length of time. The questions ask for conjectures; students can also be asked to prove their conjectures if there is time.
Answers
1. 4, 4, 1/4
2. −3, −3, −1/3
3. 24 and 1/24 are good guesses.
4. −12, 16, 9. Good guesses are 96 and −72.
5. −4, −4, −4, 3, −3. An even number of row swaps does not change the determinant; an odd number changes its sign.
Group Work 2: An Unusual Function
This activity shows how a complicated-looking function can turn out to be relatively simple.
Answers
1. −3, 0
2. x = 1 is the only root, because $\begin{vmatrix}x&2&3\\4&5&6\\5&7&9\end{vmatrix} = 3x - 3$.
3. Expanding along row 1 shows that $|A| = (a_4a_8 - a_5a_7)x + c$, where c does not depend on x. Provided $a_4a_8 - a_5a_7 \neq 0$, one can see that there are values of x that make |A| positive, and values that make it negative. It is also easy to see that |A| is a continuous function of x. Now apply the Intermediate Value Theorem. (If $a_4a_8 - a_5a_7 = 0$, then |A| is constant and A may be nonsingular for every x.)
Group Work 3: Not Your Father’s Chebyshev
This activity introduces the idea of a two-dimensional Chebyshev system: two functions f and g with the following interesting property.
Definition f and g form a Chebyshev system if $\begin{vmatrix} f(a) & f(b) \\ g(a) & g(b) \end{vmatrix}$ is strictly positive when −1 ≤ a < b < 1.
The concept of a system of functions (even a system of two functions) may be difficult for students to grasp. Begin this activity by explaining the concept of a Chebyshev system to the students, and then hand out the worksheet. It turns out that the concept of a Chebyshev system extends to n functions in a natural way. There have been many books written on Chebyshev systems and their applications. For example, mathematicians use Chebyshev systems to generalize the notion of convexity.
Answers
1. It is not a Chebyshev system because the determinant is zero.
2. Yes, no, yes, yes, no
3. It is not. Compare a = 0, b = 0.9 to a = −0.9, b = 0.
4. g must be strictly increasing, because the determinant equals g(b) − g(a).
5. f(x) = x, g(x) = $x^2$. The determinant is ab(b − a), which changes sign.
6. These are not trivial to come by. One that will work: f(x) = −sin x, g(x) = cos x. The determinant is sin(b − a), which is positive for 0 < b − a < 2.
Group Work 4: Find the Error
Note that skew-symmetric matrices were introduced in the exercises for Section 3.2.
Answer
It is not the case in general that det(−A) = −det(A). For an n × n matrix, $\det(-A) = (-1)^n \det(A)$. If A is a square matrix of odd dimension, then det(−A) = −det(A) and the argument is valid. If A is a square matrix of even dimension, then det(−A) = det(A) and the argument breaks down. If a student or group of students solves this very quickly, have them try to find a 3 × 3 skew-symmetric matrix whose determinant is nonzero (there is none), and try to get them to figure out why the proof is valid in the 3 × 3 case.
Suggested Core Assignment Exercises 4, 8, 13, 17, 20P , 24, 27, 29, 43P , 49, 54P , 56P , 69P
Group Work 1, Section 4.2 Determinants and Matrix Operations
Let $A = \begin{bmatrix}1&0&1\\1&3&2\\1&-4&1\end{bmatrix}$ and $B = \begin{bmatrix}1&1&0\\1&0&2\\0&1&1\end{bmatrix}$.
1. Compute $|A|$, $|A^T|$, and $|A^{-1}|$.
2. Compute $|B|$, $|B^T|$, and $|B^{-1}|$.
3. If a 3 × 3 matrix C has determinant 24, guess the values of $|C^T|$ and $|C^{-1}|$.
4. Compute $|AB|$, $|A^2|$, and $|B^2|$. Guess the values of $|AC|$ and $|BC|$.
5. Compute $\begin{vmatrix}1&3&2\\1&0&1\\1&-4&1\end{vmatrix}$, $\begin{vmatrix}1&-4&1\\1&3&2\\1&0&1\end{vmatrix}$, and $\begin{vmatrix}1&0&1\\1&-4&1\\1&3&2\end{vmatrix}$. Guess the value of $\begin{vmatrix}1&1&0\\0&1&1\\1&0&2\end{vmatrix}$ and check to see if your guess is right. Try again with $\begin{vmatrix}0&1&1\\1&1&0\\1&0&2\end{vmatrix}$. Can you explain your result?
Group Work 2, Section 4.2 An Unusual Function
Let $f(x) = \begin{vmatrix}x&2&3\\4&5&6\\5&7&9\end{vmatrix}$.
1. Compute f(0) and f(1).
2. What are the roots of f(x)?
3. Let $A = \begin{bmatrix}x&a_1&a_2\\a_3&a_4&a_5\\a_6&a_7&a_8\end{bmatrix}$, where $a_i \in \mathbb{R}$ and $a_4a_8 - a_5a_7 \neq 0$. Prove that there is a value of x such that A is singular.
Group Work 3, Section 4.2 Not Your Father’s Chebyshev
In this activity we describe a property that a pair of functions, f and g, may or may not have:
f and g form a Chebyshev system if $\begin{vmatrix} f(a) & f(b) \\ g(a) & g(b) \end{vmatrix}$ is strictly positive when −1 ≤ a < b < 1.
1. The simplest functions to think about are constant functions. Is the pair of functions f(x) = 1, g(x) = 2 a Chebyshev system? Why or why not?
2. Let’s leave f(x) = 1 for now and play with g(x). Are these systems Chebyshev systems?
f(x) = 1, g(x) = x
f(x) = 1, g(x) = $x^2$
f(x) = 1, g(x) = $x^3$
f(x) = 1, g(x) = $-e^{-x}$
f(x) = 1, g(x) = sin πx
3. Let f(x) = 1 and suppose that g(x) has the following graph:
[Figure: graph of g for −1 ≤ x ≤ 1]
Is this a Chebyshev system? Why or why not?
4. Can you come up with a criterion for deciding if {f, g} is a Chebyshev system if we know that f(x) = 1?
5. Find a pair of functions, f and g, neither one constant, that does not form a Chebyshev system. Justify your answer.
6. Find a pair of functions, f and g, neither one constant, that does form a Chebyshev system. Justify your answer.
Group Work 4, Section 4.2 Find the Error
It is a beautiful Spring day. You are very happy, because the supermarket has just gotten in a shipment of fresh broccoli. You buy a nice fresh head and start to walk home with it, looking at it with pride. Suddenly, someone is standing in your way. It is the wild-eyed gentleman!
“You actually like that broccoli, Einstein?”
“I do,” you say. “Broccoli is good for you. And it tastes great, as long as you steam it, instead of boiling it.”
He takes out a moon pie and starts unwrapping it. “I’m not about to take culinary advice from someone who once told me that linear algebra makes sense.”
“It does!” you protest. “For example, today we learned about determinants. They have all sorts of wonderful properties, and they make perfect sense.”
He starts to look sly, as he picks a little fragment of moon pie from his front tooth. “Determinants, eh? Say, there, do you remember what a skew-symmetric matrix is?” Before giving you a chance to answer, he draws this in the air with his finger:
\[
\begin{bmatrix} 0&1&2\\ -1&0&3\\ -2&-3&0 \end{bmatrix}
\]
“A skew-symmetric matrix is any matrix with the property that $A^T = -A$.”
“That’s fine...” you say, anxious to leave him to his moon pie and go home to eat your broccoli.
“So, what’s the determinant of that matrix I just drew in the air?”
“I can’t do it in my head! I’d need a calculator, or at least paper and pencil.”
“That’s the problem with you young people. You can’t even count your thumbs without a calculator. The determinant is zero, my Merry Grig! And here’s how you know. For a skew symmetric matrix, $A = -A^T$. So...
\[
\det(A) = \det(A^T) = \det(-A) = -\det(A)
\]
so
\[
2\det(A) = 0, \qquad \det(A) = 0
\]
See? Every skew-symmetric matrix has determinant zero!”
You are genuinely impressed. “Thank you. But that just proves my point. Linear Algebra is a subject full of beauty and interesting facts.”
The gentleman snorts. “Oh, yes. But let’s see if you learned anything. What is the determinant of $\begin{bmatrix}0&1\\-1&0\end{bmatrix}$?”
“Well,” you say, “It is a skew-symmetric matrix, so the determinant is zero.”
“You kids today are all liripoops — every last one of you. The way I see it, the determinant of that matrix is 0 · 0 − 1 · (−1), which is 1. But then again,” he says, wiping chocolate off his lips, “I don’t have a calculator on me.” He smiles broadly. He’s just proven that the determinant of a skew-symmetric matrix has to be zero, and yet you are looking at a simple one whose determinant is not zero! Don’t let him get away with this sort of thing! Find the error.
4.3 Eigenvalues and Eigenvectors of n × n Matrices
Suggested Time and Emphasis 1 class. Essential material.
Points to Stress
1. Eigenvalues as roots of the characteristic polynomial.
2. Eigenspace of an eigenvalue λ as the null space of A − λI.
3. The Fundamental Theorem of Invertible Matrices.
4. The relationship between the algebraic multiplicity of λ and the dimension of the corresponding eigenspace.
5. The eigenvalues of a triangular matrix.
Drill Question
What are the eigenvalues of $\begin{bmatrix}1&0&0&0\\3&2&0&0\\-2&-1&3&0\\6&-4&2&2\end{bmatrix}$?
Hint: no computations should be necessary to answer this question!
Answer
1, 2 (multiplicity 2), 3
Discussion Question Let A be a square matrix and let B be the reduced row echelon form of A. How are the eigenvalues of B related to the eigenvalues of A? Answer There is no relationship.
Test Question
Is the matrix $\begin{bmatrix}2&3&4&5&6\\0&1&2&3&4\\0&0&0&1&2\\0&0&0&-1&0\\0&0&0&0&-2\end{bmatrix}$ invertible? Why or why not?
Answer
It is not invertible. One of its eigenvalues is 0, so its determinant is 0.
Lecture Notes
• The matrices $\begin{bmatrix}3&0\\0&3\end{bmatrix}$ and $\begin{bmatrix}9&0\\0&1\end{bmatrix}$ have the same determinant, but different eigenvalues. Notice that each of these matrices represents a very different type of linear transformation. The determinants tell us that the area of a transformed region will be increased by a factor of 9, but the eigenvalues give us more detail. The first matrix stretches the region by a factor of 3 in each direction, allowing it to maintain its shape. The second stretches the region by a factor of 9 in the x-direction and does not stretch the region in the y-direction.
• Randomly give each student a letter from A through O. (Some students will probably have the same letter.) Have them attempt to prove Theorem 3 for their particular letter. If there is time, have them exchange proofs with a neighbor, who will verify that the proof is correct. Then go through some of the proofs yourself, or point out where they are shown in the text.
• Go through the surprising Cayley-Hamilton Theorem, as discussed in Exercises 33–38.
• Show that the determinant of a matrix is the product of its eigenvalues. One way to do this is as follows: Let $|A - xI| = P(x) = \prod_{i=1}^{n} (\lambda_i - x)$. Substituting x = 0, we have $|A| = \prod_{i=1}^{n} \lambda_i$.
• Demonstrate that if λ is an eigenvalue of A and µ is an eigenvalue of B, then λµ is not necessarily an eigenvalue of AB.
• Note that the converse of Theorem 4(c) is true. Prove the converse for the case where A is nonsingular, and perhaps for the case where A is diagonalizable. The proof gets tricky if A is not diagonalizable.
Lecture Examples
• A 3 × 3 matrix with three eigenvalues:
$\begin{bmatrix}1&0&-1\\2&-1&5\\0&0&2\end{bmatrix}$ has characteristic polynomial $\lambda^3 - 2\lambda^2 - \lambda + 2$ and eigenvectors $\begin{bmatrix}0\\1\\0\end{bmatrix}$ corresponding to λ = −1, $\begin{bmatrix}-1\\1\\1\end{bmatrix}$ corresponding to λ = 2, and $\begin{bmatrix}1\\1\\0\end{bmatrix}$ corresponding to λ = 1.
• A 3 × 3 matrix with two eigenvalues, one with two eigenvectors, one with one eigenvector:
$\begin{bmatrix}1&-3&\frac32\\0&1&0\\0&-2&2\end{bmatrix}$ has characteristic polynomial $\lambda^3 - 4\lambda^2 + 5\lambda - 2$, eigenvectors $\begin{bmatrix}0\\1\\2\end{bmatrix}$ and $\begin{bmatrix}1\\0\\0\end{bmatrix}$ corresponding to λ = 1, and eigenvector $\begin{bmatrix}\frac32\\0\\1\end{bmatrix}$ corresponding to λ = 2.
• A 3 × 3 matrix with two eigenvalues, each with one eigenvector:
$\begin{bmatrix}-5&-7&-1\\3&5&1\\-6&-6&2\end{bmatrix}$ has characteristic polynomial $\lambda^3 - 2\lambda^2 - 4\lambda + 8$ and eigenvectors $\begin{bmatrix}-1\\1\\0\end{bmatrix}$ corresponding to λ = 2 and $\begin{bmatrix}-\frac{17}{9}\\1\\-\frac43\end{bmatrix}$ corresponding to λ = −2.
• A demonstration of the Cayley-Hamilton theorem for a 3 × 3 matrix:
$A = \begin{bmatrix}-5&-7&-1\\3&5&1\\-6&-6&2\end{bmatrix}$ has characteristic polynomial $\lambda^3 - 2\lambda^2 - 4\lambda + 8$.
\[
A^3 = \begin{bmatrix}-8&-16&-12\\0&8&12\\-24&-24&8\end{bmatrix}
\]
\[
-2A^2 = -2\begin{bmatrix}10&6&-4\\-6&-2&4\\0&0&4\end{bmatrix} = \begin{bmatrix}-20&-12&8\\12&4&-8\\0&0&-8\end{bmatrix}
\]
\[
-4A = \begin{bmatrix}20&28&4\\-12&-20&-4\\24&24&-8\end{bmatrix}
\]
\[
8I = \begin{bmatrix}8&0&0\\0&8&0\\0&0&8\end{bmatrix}
\]
The sum of these four matrices is indeed the zero matrix.
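The Cayley-Hamilton demonstration can be repeated with technology. Here is a minimal sketch for the same 3 × 3 matrix; the helper functions are illustrative, and the characteristic polynomial λ³ − 2λ² − 4λ + 8 is taken from the example rather than computed.

```python
def mat_mul(X, Y):
    """Product of two square integer matrices."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def linear_combination(terms):
    """Sum of coefficient * matrix over a list of (coefficient, matrix) pairs."""
    n = len(terms[0][1])
    return [[sum(c * M[i][j] for c, M in terms) for j in range(n)] for i in range(n)]

A = [[-5, -7, -1], [3, 5, 1], [-6, -6, 2]]
I = [[1 if i == j else 0 for j in range(3)] for i in range(3)]
A2 = mat_mul(A, A)
A3 = mat_mul(A2, A)

# Substitute A into its characteristic polynomial: A^3 - 2A^2 - 4A + 8I.
result = linear_combination([(1, A3), (-2, A2), (-4, A), (8, I)])
print(result)
```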
Tech Tip Have students come up with 4 × 4 real matrices with the following properties: • 4 real eigenvalues • 3 real eigenvalues and 1 complex eigenvalue • 2 real and 2 complex eigenvalues • 1 real eigenvalue and 3 complex eigenvalues • No real eigenvalue and 4 complex eigenvalues
After the students have discovered that the complex eigenvalues come in conjugate pairs, they should be able to figure out which of the combinations are possible. The restriction that the matrices must be real can then be removed. Different combinations of eigenvalues and number of distinct eigenvectors can also be explored.
Group Work 1: Generalized Eigenvectors
When solving a system of differential equations, it is convenient when the eigenvalues of the system have full geometric multiplicity. If an eigenvalue has “faulty” geometric multiplicity, this indicates that there are “not enough” independent (associated) eigenvectors. In this case it is necessary to manufacture “generalized eigenvectors” by using the process discovered in the group work.
Answers
1. λ = 1. The only eigenvectors are the nonzero multiples of $\begin{bmatrix}1\\0\\0\\0\end{bmatrix}$.
2. This comes directly from the definition of an eigenspace.
3. $\Lambda = \begin{bmatrix}0&1&0&0\\0&0&1&0\\0&0&0&1\\0&0&0&0\end{bmatrix}$, $\Lambda^2 = \begin{bmatrix}0&0&1&0\\0&0&0&1\\0&0&0&0\\0&0&0&0\end{bmatrix}$, $\Lambda^3 = \begin{bmatrix}0&0&0&1\\0&0&0&0\\0&0&0&0\\0&0&0&0\end{bmatrix}$. The null space of Λ has dimension 1, the null space of $\Lambda^2$ has dimension 2, and the null space of $\Lambda^3$ has dimension 3. Since $\Lambda^4$ is the zero matrix, the null space of $\Lambda^k$ has dimension 4 for every k ≥ 4.
4. We choose $\Lambda^4$ because it is the smallest power whose null space has dimension 4, giving us the desired number of linearly independent generalized eigenvectors for a 4 × 4 matrix.
5. For a given λ, ordinary eigenvectors form a basis for the null space of A − λI. The generalized eigenvectors form a basis for the null space of $(A - \lambda I)^k$ for some integer k.
Group Work 2: Introduction to Diagonalization
The subjects of the next section are similarity and diagonalization. This activity is designed to foreshadow that material, allowing students to discover the concept and make a conjecture which will be confirmed or rebutted in the next section. It assumes students have technology available to them so they do not have to compute determinants by hand.
Answers
1. Eigenvalues: 8, 24, 32. Eigenvectors: λ = 8 corresponds to $\begin{bmatrix}1\\-1\\1\end{bmatrix}$, λ = 24 corresponds to $\begin{bmatrix}2\\3\\1\end{bmatrix}$, and λ = 32 corresponds to $\begin{bmatrix}3\\-1\\1\end{bmatrix}$.
2. $S^{-1} = \frac13\begin{bmatrix}2&-1&2\\-2&1&1\\-1&2&-1\end{bmatrix}$
3. The eigenvalues are the same. Eigenvectors: λ = 8 corresponds to $\begin{bmatrix}-5\\2\\4\end{bmatrix}$, λ = 24 corresponds to $\begin{bmatrix}1\\0\\1\end{bmatrix}$, and λ = 32 corresponds to $\begin{bmatrix}-3\\2\\2\end{bmatrix}$.
4. The eigenvalues are the same. We can’t let $S = \begin{bmatrix}1&-1&1\\0&0&0\\1&1&0\end{bmatrix}$ because it is singular.
5. We have transformed A into a diagonal matrix whose entries are the eigenvalues. $S^{-1}AS = \begin{bmatrix}8&0&0\\0&24&0\\0&0&32\end{bmatrix}$.
6. $\begin{bmatrix}\frac23&-\frac13&\frac23\\[2pt]-\frac23&\frac13&\frac13\\[2pt]-\frac13&\frac23&-\frac13\end{bmatrix}\begin{bmatrix}6&0&3\\2&5&2\\4&-2&7\end{bmatrix}\begin{bmatrix}1&-1&1\\1&0&2\\1&1&0\end{bmatrix} = \begin{bmatrix}9&0&0\\0&3&0\\0&0&6\end{bmatrix}$
7. We can only diagonalize n × n matrices that have n linearly independent eigenvectors. Such matrices are called diagonalizable.
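The product in Answer 6 can be verified exactly with rational arithmetic. The sketch below uses the names P and P_inv for the eigenvector matrix and its inverse; these names are a convenience for the example and do not appear on the worksheet.

```python
from fractions import Fraction as F

def mat_mul(X, Y):
    """Product of two square matrices given as lists of rows."""
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

A = [[6, 0, 3], [2, 5, 2], [4, -2, 7]]

# Columns of P are eigenvectors of A for the eigenvalues 9, 3, and 6.
P = [[1, -1, 1], [1, 0, 2], [1, 1, 0]]
P_inv = [[F(2, 3), F(-1, 3), F(2, 3)],
         [F(-2, 3), F(1, 3), F(1, 3)],
         [F(-1, 3), F(2, 3), F(-1, 3)]]

print(mat_mul(P_inv, P))              # should be the identity matrix
print(mat_mul(mat_mul(P_inv, A), P))  # should be diagonal
```

The order of the eigenvalues on the diagonal matches the order of the eigenvector columns in P, which is worth pointing out when students obtain a differently ordered diagonal matrix.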
Suggested Core Assignment Exercises 4, 10, 13P , 17, 20P , 24, 28P , 30, 32P , 34, 36, 38, 39P
133
Group Work 1, Section 4.3 Generalized Eigenvectors. ⎡
1 ⎢0 ⎢ Let A = ⎢ ⎣0 0
1 1 0 0
0 1 1 0
⎤ 0 0⎥ ⎥ ⎥. 1⎦ 1
1. Find the eigenvalues and eigenvectors of A.
2. You should notice that there are fewer than four total linearly independent eigenvectors. For some applications, we need to have one eigenvector for every row of a square matrix. We are going to try some vectors that are “almost as good” as eigenvectors for some applications. We let $\Lambda = A - \lambda I$. Explain why the eigenspace you found is the same as the null space of $\Lambda$.
3. Consider the null spaces of $\Lambda$, $\Lambda^2$, $\Lambda^3$, and so forth. What do you notice about their dimensions?
4. Find a basis for the null space of $\Lambda^4$. We are going to call those vectors the generalized eigenvectors of $A$. Why did we pick $\Lambda^4$ instead of $\Lambda^3$ or $\Lambda^5$?
5. Why do you think they are called generalized eigenvectors?
Group Work 2, Section 4.3 Introduction to Diagonalization
Consider the matrix $A = \begin{bmatrix} 44 & -1 & -37 \\ -12 & 23 & 27 \\ 12 & 1 & -3 \end{bmatrix}$.
1. Now that you have thoroughly considered it, find its eigenvalues and eigenvectors. (Matrices love when you do that.)
2. I grow bored with $A$. Let us create the new matrix $S = \begin{bmatrix} 1 & -1 & 1 \\ 1 & 0 & 2 \\ 1 & 1 & 0 \end{bmatrix}$. Find $S^{-1}$.
3. Now that we have $S$, $A$, and $S^{-1}$, let’s combine them into a new matrix, $S^{-1}AS$. Make $S^{-1}AS$ happy by finding its eigenvalues and eigenvectors.
4. You should have noticed something interesting about the eigenvalues. Compare them to the eigenvalues of $A$. Try making your own new matrix $S$ and find the eigenvalues and eigenvectors of $S^{-1}AS$. Have you spotted a pattern? What happens to your pattern if you try $S = \begin{bmatrix} 1 & -1 & 1 \\ 0 & 0 & 0 \\ 1 & 1 & 0 \end{bmatrix}$? Why?
5. We are going to create a special matrix $S$. Let the columns of $S$ be the eigenvectors of $A$. Now compute $S^{-1}AS$. We say you have just diagonalized $A$. What do you think that means?
6. Now try to diagonalize $\begin{bmatrix} 6 & 0 & 3 \\ 2 & 5 & 2 \\ 4 & -2 & 7 \end{bmatrix}$.
7. Do you think we can diagonalize any matrix? If so, why? If not, what word should we use to describe a matrix that can be diagonalized?
4.4 Similarity and Diagonalization
Suggested Time and Emphasis 1 class. Essential material.
Points to Stress 1. Similar matrices share important characteristics. 2. An n × n matrix is diagonalizable if and only if its eigenvectors span Rn . 3. Similarity to a diagonal matrix is a desirable trait for a matrix to have.
Drill Question Can two distinct eigenvalues have the same associated eigenvector? Answer No
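Discussion Question Is it true that if $A$ is diagonalizable, $A^k$ must be diagonalizable? Is it true that if $A$ is not diagonalizable, $A^k$ cannot be diagonalizable? Is it true that if $A^k$ is diagonalizable, $A$ must be diagonalizable? Answer Yes, no, no. The last two questions are contrapositives of each other, and $A = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$ is a counterexample to both: $A$ is not diagonalizable, but $A^2 = O$ is diagonal.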
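Test Question Prove that if $A$ is a diagonalizable matrix with 1 as its only eigenvalue, then $A$ is the identity matrix. Answer
Since $A$ is diagonalizable, $MAM^{-1} = D$ for some invertible $M$, where $D$ is diagonal with the eigenvalues of $A$ on its diagonal, so $D$ is the identity matrix. This implies that $A = M^{-1}I_nM = I_n$.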
Lecture Notes
• Point out that singularity and diagonalizability are independent properties, perhaps showing four matrices displaying all combinations of the two properties.
• Prove Theorem 3.6 and discuss Lemma 6, going back to the generalized eigenvectors if Group Work 1 in the previous section was used.
• Find $\begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix}^{100}$ by diagonalizing $\begin{bmatrix} 3 & 4 \\ 1 & 2 \end{bmatrix} = S^{-1}AS$ (with $A$ diagonal) and then computing $S^{-1}A^{100}S$.
• Note that two matrices with identical eigenvalues may not be similar. They are guaranteed to be similar when both matrices are diagonalizable and each eigenvalue has the same multiplicity in both.
Lecture Examples
• A 3 × 3 matrix that is not diagonalizable: $\begin{bmatrix} 1 & 1 & 0 \\ 0 & 1 & 1 \\ 0 & 0 & 1 \end{bmatrix}$
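• A 3 × 3 matrix diagonalized:
$\begin{bmatrix} -4 & 3 & 0 \\ -6 & 5 & 0 \\ 0 & 0 & 4 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} -1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 4 \end{bmatrix} \begin{bmatrix} 1 & 1 & 0 \\ 1 & 2 & 0 \\ 0 & 0 & 1 \end{bmatrix}^{-1}$
• A 4 × 4 matrix diagonalized:
$\begin{bmatrix} 2 & -1 & -1 & 1 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 3 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 3 & 0 \\ 0 & 0 & 0 & 2 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}^{-1}$
Tech Tip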
⎡
⎡ 22 ⎤ ⎤ 8 2 1 3 7 7 2 ⎢ ⎢ ⎥ 53 13 ⎥ Have the students find a way to determine if the matrices ⎣ −1 0 4 ⎦ and ⎣ − 11 28 − 28 4 ⎦ are similar. 15 15 55 2 1 3 28 − 28 4
(The students can easily do this by diagonalizing both matrices.) Then have them try to do the same with ⎡ ⎡2 ⎤ ⎤ 4 1 1 1 0 3 −3 3 ⎢ ⎢ ⎥ 5 5 ⎥ ⎣ 0 1 1 ⎦ and ⎣ 32 3 − 3 ⎦. (These two matrices are similar, but it is very hard to show because the first 1 2 1 0 0 1 3 3 3 one is not diagonalizable.)
Group Work 1: Six by Six
This activity challenges students to use what they have learned about similarity in a new situation. Only as a last resort, give students the hint to think about similarity.
Answers
1. Answers will vary. One possibility is $D = \begin{bmatrix} -3 & 0 & 0 & 0 & 0 & 0 \\ 0 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 4 & 0 \\ 0 & 0 & 0 & 0 & 0 & 6 \end{bmatrix}$.
2. Any matrix of the form $MDM^{-1}$, where $D$ is the diagonal matrix with diagonal entries $-3, -1, 0, 1, 4, 6$ and $M$ is a nonsingular 6 × 6 matrix chosen so that the product is neither diagonal nor triangular.
Group Work 2: A Disease Model
This model does not take into account shifting populations, but it is otherwise a realistic discrete disease model. The solution involves raising a matrix to a high power, which can be done using diagonalization.
Answers
1. Answers may vary. The general idea is that these equations are an almost literal translation of the problem.
2. $A = \begin{bmatrix} 0.7 & 0.1 \\ 0.3 & 0.9 \end{bmatrix}$
3. $A^{30} \begin{bmatrix} r(0) \\ s(0) \end{bmatrix} \approx \begin{bmatrix} 0.25 & 0.25 \\ 0.75 & 0.75 \end{bmatrix} \begin{bmatrix} r(0) \\ s(0) \end{bmatrix}$. Approximately 25% of the original population has the rash, while approximately 75% are rash-free.
4. After ten years, the matrix has stabilized to $\begin{bmatrix} 0.25 & 0.25 \\ 0.75 & 0.75 \end{bmatrix}$. 25% of the original population will have the rash, while 75% will be rash free. The matrix will be so close to $\begin{bmatrix} 0.25 & 0.25 \\ 0.75 & 0.75 \end{bmatrix}$ that the difference will be one tiny blemish on the foot of one person.
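For instructors who would like to confirm answer 3 numerically, here is a minimal sketch in plain Python that raises the transition matrix to the 30th power by repeated multiplication. The helper names `matmul` and `matpow` are assumptions made for this illustration, not part of the text.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def matpow(M, n):
    """Return M**n for a square matrix M and an integer n >= 0."""
    size = len(M)
    result = [[1.0 if i == j else 0.0 for j in range(size)]
              for i in range(size)]
    for _ in range(n):
        result = matmul(result, M)
    return result

# Transition matrix of the disease model (answer 2).
A = [[0.7, 0.1],
     [0.3, 0.9]]

A30 = matpow(A, 30)
for row in A30:
    print([round(v, 4) for v in row])
```

The second eigenvalue of this matrix is 0.6, which is why thirty steps are already enough for the powers to settle.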
Group Work 3: Find the Error Answer
The stranger called two different matrices P . The matrix P that transforms A to B is not the same matrix as the one that transforms C to D.
Suggested Core Assignment Exercises 4, 6, 11, 14, 20, 27, 33P , 37, 41P , 47P , 48P
Group Work 1, Section 4.4 Six by Six My favorite numbers are −3, −1, 0, 1, 4, and 6. 1. Find a 6 × 6 matrix which has my favorite numbers as eigenvalues.
2. Very clever. Now find a 6 × 6 nondiagonal, nontriangular matrix with those eigenvalues.
Group Work 2, Section 4.4 A Disease Model
Westbury, New York is a wonderful city, almost a paradise. The only problem is that this year there was an outbreak of the Westbury Rash. $r(n)$ represents the number of people with the rash on the $n$th day, and $s(n)$ gives the number of people without it. Every day 70% of the people with the rash still have it, and 30% of them heal. Also, 10% of the people who don’t have it catch it, and the remaining 90% of the healthy people stay healthy.
1. Explain why the above can be summarized by the equations
$r(n + 1) = 0.7r(n) + 0.1s(n)$
$s(n + 1) = 0.3r(n) + 0.9s(n)$
2. Model this system as a matrix equation of the form
$A \begin{bmatrix} r(n) \\ s(n) \end{bmatrix} = \begin{bmatrix} r(n + 1) \\ s(n + 1) \end{bmatrix}$
3. Compute $\begin{bmatrix} r(30) \\ s(30) \end{bmatrix}$, and describe what is happening in Westbury after the first month of the outbreak.
4. What is happening after the first ten years?
Group Work 3, Section 4.4 Find the Error It is a beautiful Spring day. You are at a boxing match, sitting in the eighth row. You have your Linear Algebra book out, so you can work on your homework while you watch the fight. During the third round, you hear a familiar voice calling out. “Yahhh Booo! Punch that killbuck! Give him what-for!” You turn to see the wild-eyed gentleman who hates linear algebra! You make eye contact, and you see him glance down at your book. “Similar matrices, eh?” “Yes,” you say, a little testily. “There is nothing wrong with similar matrices. It’s a simple idea. If A is similar to B , and B is similar to C , then A is similar to C .” “Is it true that if A is similar to B , and C is similar to D, then AC is similar to BD?” asks the stranger, with an odd gleam in his eye. “I don’t know. Watch the fight.” Suddenly, a big guy sitting behind the both of you says, with a pronounced Brooklyn accent, “Hey! I wanna know! What’s the answer?” “Yeah!” adds the woman sitting on the other side of you. “Now I’m all curious!” “Me, too!”... “And me!”... “What’s the answer?” There is a brief pause in the conversation as the audience applauds. Evidently someone has just been knocked down in the ring. He gets back up, and the wild-eyed gentleman speaks: “Let A ∼ B . Then P −1 AP = B . Let C ∼ D. Then P −1 CP = D.” “Dat’s just the definition of similar matrices. Tell us somet’ing dat we don’t know!” demands the big guy.
“Well, then $P^{-1}(AC)P = P^{-1}AP\,P^{-1}CP = BD$.” “So it is true,” says the woman on the other side of you. “My curiousicallity has been satisfized.” “Very nice,” you concede. “You have shown that if $A \sim B$ and $C \sim D$, then $AC \sim BD$.” “Let’s see an example!” demands another man behind you, a thin man with a thin moustache and a bow tie. You open your book and look at a couple of examples. “We can let $A = \begin{bmatrix} 1 & 2 \\ 0 & -1 \end{bmatrix}$ and $B = \begin{bmatrix} 1 & 0 \\ -2 & -1 \end{bmatrix}$. My book says that they are similar. And we can let $C = \begin{bmatrix} 5 & -1 \\ 2 & 2 \end{bmatrix}$ and $D = \begin{bmatrix} 4 & 0 \\ 0 & 3 \end{bmatrix}$; the book says that $C$ is similar to $D$.”
The big guy says, “Then we get dat $AC = \begin{bmatrix} 9 & 3 \\ -2 & -2 \end{bmatrix}$.”
The woman sitting next to you adds “And $BD$ should be exactily $\begin{bmatrix} 4 & 0 \\ -8 & -3 \end{bmatrix}$!”
Everybody nods. There is a brief pause as the crowd applauds a knockout. But when the applause subsides, the man with the bow-tie says, “Wait a minute! The characteristic polynomial of $AC$ is $\lambda^2 - 7\lambda - 12$, and that is not the same as the characteristic polynomial of $BD$, which is $\lambda^2 - \lambda - 12$.”
“Oh,” you say. “Then they can’t be similar.” “But we’ve just proved that they were, by what that wild-eyed guy said! $A$ is similar to $B$, and $C$ is similar to $D$, so $AC$ must be similar to $BD$!” says the woman next to you. “But the characteristic polynomials ain’t the same!” says the big guy. Everyone looks at you, expectantly. You look to where the wild-eyed stranger was sitting, and on his seat is a note – “Went to get moony pie.” The matrices have to be similar. The matrices cannot be similar, by Theorem 2. Where was the mistake in the stranger’s reasoning? The crowd is waiting, and they’re hungry for blood!
4.5 Iterative Methods for Computing Eigenvalues
Suggested Time and Emphasis 1 class. Optional material.
Points to Stress 1. The concept of a dominant eigenvalue and its associated dominant eigenvector. 2. The power method of approximating the dominant eigenvalue, and the inverse power method for approximating the eigenvalue that is smallest in absolute value. 3. The shifted inverse power method for finding other eigenvalues.
Drill Question Why do we scale the vectors when using the power method? Answer To reduce roundoff error.
Discussion Question Consider this potential drawback to the shifted power method: We never actually find λ1 , we find an approximation to it that has some appropriate error. Now we use our value of λ1 to approximate λ2 , which compounds the error. Would this drawback cause the power method to be useless in approximating the eigenvalues of large matrices? Answer It could. If our approximation to λ1 is poor, this error will affect subsequent approximations.
Test Question Find a 2 × 2 nondiagonal matrix which has no dominant eigenvalue. Answer Answers will vary.
Lecture Notes
• Emphasize how amazing these methods are. We have thought of eigenvalues as roots of the characteristic polynomial, and now we are able to find them without even looking at this polynomial. It is also worth pointing out that using the Rayleigh quotient can roughly double the speed of convergence with very little extra work.
• Do Exercise 46 and use it to justify the power method. The proof of the power method (in the diagonalizable case) is easy to understand and nicely reinforces earlier course material.
• Discuss the text before Exercise 41 concerning the use of the power method in finding roots of polynomials.
• Notice that Gerschgorin’s Theorem gives us a very nice way to show that a matrix is invertible. If 0 is not contained in any of the Gerschgorin disks of $A$, then $A$ is invertible. Unfortunately, if 0 is contained in one or more of these disks, that does not mean that $A$ is not invertible.
Lecture Examples
• A 2 × 2 matrix whose eigenvalues are determined using each of the methods described in the text: Let $A = \begin{bmatrix} 3 & 9 \\ 2 & 0 \end{bmatrix}$. The eigenvalues of $A$ are −3 and 6.
• Using the power method to find the dominant eigenvalue, we obtain the following values (column vectors are written as rows):

  k     x_k              m_k
  0     [1, 1]           1
  1     [12, 2]          12
  2     [9/2, 2]         9/2
  3     [7, 2]           7
  4     [39/7, 2]        39/7
  5     [81/13, 2]       81/13
  6     [53/9, 2]        53/9
  7     [321/53, 2]      321/53 ≈ 6.06

• Using the shifted power method on $A$ with the eigenvalue $\lambda = 6$, we apply the power method to $A - 6I = \begin{bmatrix} -3 & 9 \\ 2 & -6 \end{bmatrix}$. We obtain the following values:

  k     x_k              m_k
  0     [1, 1]           1
  1     [6, −4]          6
  2     [−9, 6]          −9
  3     [−9, 6]          −9

Therefore, the second eigenvalue of $A$ is −9 + 6 = −3.
• We use the shifted inverse power method to find the eigenvalue of $A$ closest to −1. $A + 1I = A + I$, so solving $(A + I)x_1 = y_0$, we calculate the following:

  k     x_k                      y_k               m_k
  0     [1, 1]                   [1, 1]            1
  1     [4/7, −1/7]              [1, −1/4]         4/7
  2     [−13/56, 3/14]           [1, −12/13]       −13/56
  3     [−121/182, 37/91]        [1, −74/121]      −121/182
  4     [−787/1694, 269/847]     [1, −538/787]     −787/1694

At this point, $1/m_4 \approx -2.15$, and thus $-1 + 1/m_4 \approx -3.15$.
• An illustration of Gerschgorin’s Theorem: Let $A = \begin{bmatrix} 5 & 2 & -2 \\ 5 & -4 & -5 \\ 4 & -8 & -1 \end{bmatrix}$. $A$ has eigenvalues 3, 6, and −9. The three Gerschgorin disks are $D_1$ with center (5, 0) and radius 4, $D_2$ with center (−4, 0) and radius 10, and $D_3$ with center (−1, 0) and radius 12. Note that each eigenvalue lies within at least one of these disks, as shown:
[Figure: the three Gerschgorin disks and the eigenvalues plotted in the plane, with both axes running from −10 to 10]
Tech Tip An excellent way to learn this material is to write code implementing the various methods, for example, Rayleigh’s method.
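As one possible starting point for such an assignment, here is a minimal sketch of the power method in plain Python. It uses exact fractions so that the output can be compared directly with a hand computation; the function name `power_method` and the choice to return only the scaling factors $m_k$ are assumptions made for this illustration.

```python
from fractions import Fraction

def power_method(A, x0, steps):
    """Return the scaling factors m_1, ..., m_steps of the power method.

    Each step computes x = A*y, takes m to be the component of x with the
    largest absolute value, and rescales y = x / m.
    """
    y = [Fraction(v) for v in x0]
    factors = []
    for _ in range(steps):
        x = [sum(Fraction(a) * b for a, b in zip(row, y)) for row in A]
        m = max(x, key=abs)          # component of largest absolute value
        factors.append(m)
        y = [v / m for v in x]
    return factors

# A 2 x 2 example with eigenvalues -3 and 6.
A = [[3, 9],
     [2, 0]]

for k, m in enumerate(power_method(A, [1, 1], 7), start=1):
    print(k, m, float(m))
```

Students can extend the same loop to the shifted and inverse variants by replacing the multiplication step with a shifted matrix or a linear solve.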
Group Work 1: When the Power Method Goes Awry After the students have worked through this activity, you can either let the matter rest, or point out that it is possible to determine the eigenvalues of the second matrix experimentally. To do this, use the inverse method to find the smallest root, and then use the shifted power method to find the rest. Answers
1. Starting with $x_0 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, we calculate $m_0 = 1$, $m_1 = -5$, $m_2 = -\frac{7}{5}$, $m_3 = -\frac{17}{7}$, $m_4 = -\frac{31}{17}$, $m_5 = -\frac{65}{31}$, $m_6 = -\frac{127}{65} \approx -1.95$.
2. Starting with $x_0 = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, we calculate $m_0 = 1$, $m_1 = -6$, $m_2 = \frac{5}{6}$, $m_3 = \frac{24}{5}$, $m_4 = \frac{29}{24}$, $m_5 = \frac{96}{29}$, $m_6 = \frac{125}{96}$. There appears to be no convergence. This is due to the fact that there is no dominant eigenvalue.
Group Work 2: Fun with Gerschgorin’s Theorem
This activity lets students practice with complex eigenvalues and demonstrates Gerschgorin’s Theorem in the plane. The students will need to use technology to approximate the relevant eigenvalues numerically.
Answers
1. [Figure: the Gerschgorin disks of $A$, drawn on axes with x from −4 to 8 and y from −2 to 2] The matrix cannot be singular because 0 cannot be an eigenvalue.
2. 5.36, −3.96, 1.60
3. [Figure: the same disks with the three eigenvalues added]
4. We do not know whether or not this matrix is singular, because 0 may or may not be an eigenvalue (see the first graph below). We calculate the eigenvalues to be approximately $2.88 + 0.15i$, $0.42 + 1.14i$, and $0.70 - 1.28i$, and add them to our graph.
[Figure: two graphs on axes with x and y from −2 to 4; the first shows the Gerschgorin disks of $B$, the second shows the disks with the eigenvalues added]
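The disk computations in this activity are easy to automate. The following is a minimal sketch in plain Python, assuming matrices are stored as lists of rows; the function names `gerschgorin_disks` and `zero_excluded` are hypothetical names chosen for this illustration. It reports each disk as a (center, radius) pair and tests whether 0 lies outside every disk, which is the sufficient condition for invertibility used in answers 1 and 4.

```python
def gerschgorin_disks(A):
    """Return a (center, radius) pair for each row of the square matrix A."""
    disks = []
    for i, row in enumerate(A):
        # Radius: sum of absolute values of the off-diagonal entries in row i.
        radius = sum(abs(v) for j, v in enumerate(row) if j != i)
        disks.append((row[i], radius))
    return disks

def zero_excluded(A):
    """True if 0 lies outside every disk, which guarantees A is invertible."""
    return all(abs(center) > radius for center, radius in gerschgorin_disks(A))

A = [[2, 1, 0],
     [1, 5, 2],
     [1, 0, -4]]

B = [[2, 1, 0],
     [1, 1 + 1j, 2],
     [1, 0, 1 - 1j]]

print(gerschgorin_disks(A), zero_excluded(A))
print(gerschgorin_disks(B), zero_excluded(B))
```

A result of `False` from `zero_excluded` is inconclusive: the matrix may or may not be singular, which is exactly the situation in Problem 4.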
Suggested Core Assignment Exercises 2, 7, 14, 20, 23, 32, 36, 41, 48, 52P , 53P
Group Work 1, Section 4.5 When the Power Method Goes Awry
1. Consider the matrix $A = \begin{bmatrix} -2 & \frac{3}{2} & -\frac{9}{2} \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$. This matrix has an eigenvalue of −2, and there is none larger in absolute value. Oops, I gave it away! Use the power method to find this eigenvalue.
2. Now consider the matrix $B = \begin{bmatrix} -2 & 2 & -6 \\ 0 & 2 & -3 \\ 0 & 0 & 1 \end{bmatrix}$. $B$ has an eigenvalue of 2, and there is none larger in absolute value. Oops, I did it again! Go ahead and use the power method to find this eigenvalue. What happens differently in this case and why?
Group Work 2, Section 4.5 Fun with Gerschgorin’s Theorem
Consider the matrix $A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 5 & 2 \\ 1 & 0 & -4 \end{bmatrix}$.
1. Draw the Gerschgorin disks associated with this matrix on the axes below. From this picture alone, can you tell whether or not the matrix is singular?
[Blank axes: x from −4 to 8, y from −2 to 2]
2. Compute the eigenvalues of this matrix using technology.
3. Add the eigenvalues to your picture. How does your picture illustrate Gerschgorin’s Theorem?
4. Repeat the previous three questions for $B = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 1 + i & 2 \\ 1 & 0 & 1 - i \end{bmatrix}$.
[Blank axes: x from −2 to 4, y from −2 to 4]
4.6 Applications and the Perron-Frobenius Theorem
Suggested Time and Emphasis 1–3 classes. Essential material.
Points to Stress 1. A revisitation of Markov chains. 2. A revisitation of Leslie matrices. 3. The Perron-Frobenius theorem. 4. Linear recurrence relations. 5. Systems of linear differential equations.
Drill Question Give an example of a recurrence relation.
Discussion Question What are some of the real-world uses of the Perron-Frobenius theorem?
Test Question Suppose $A > 0$ has eigenvalues $\{\lambda_1, \ldots, \lambda_k\}$. Let $\rho(A) = \max_{i = 1, \ldots, k} |\lambda_i|$. Is it true that, for some $i \in \{1, \ldots, k\}$, $\rho(A) = \lambda_i$? Answer Yes. This is the conclusion of Perron’s Theorem.
Lecture Notes • Rather than stating Theorem 4, students can be led to discover it, if they are able to quickly take matrices to large powers. • In previous sections, we advised talking about what happens to Ak as k → ∞ in the case where the dominant eigenvalue Λ was less than one, and greater than one. This section discusses the common case where Λ = 1. • The proofs of Theorems 7(a) and 7(b) provided in the text are remarkable in part because they are completely geometric. Perhaps go over these proofs with the students, both to explain the theorems and to help reinforce the geometry of eigenvalues and linear transformations. • The Fibonacci sequence comes up in a wide variety of applications. For example, a female bee, or worker, comes from a fertilized egg, laid by the queen bee. A male bee, or drone, comes from an unfertilized egg, laid by either the queen bee or a worker bee. In other words, a female bee has a mother and a father, and a male bee has only a mother. This makes life strange for bee genealogists. For example, a drone has two grandparents, but a worker has three! It turns out that if you look back through the generations, a Fibonacci sequence develops: 1 drone has 1 parent, 2 grandparents, 3 great-grandparents, 5 great-great-grandparents, and 8 great-great-great-grandparents. 150
• The theory of nonnegative matrices (A ≥ 0), now occupying a vast amount of literature, exhibits its simplest but most elegant form in the case of positive matrices (A > 0). It is here that Perron made his fundamental discoveries in 1907. Often Perron’s Theorem is stated using ρ (A), the spectral radius of A. ρ (A) is defined to be the maximum of {|λi |}, where {λi } is the set of distinct eigenvalues of A. In class, give your class a pair of 2 × 2 matrices (one positive matrix and one signed matrix), and have them compute the spectral radius for each matrix. One of the consequences of Perron’s Theorem is that ρ (A) is always an eigenvalue of A (when A > 0). In fact, the Theorem goes on to tell us that ρ (A) is the only eigenvalue of A with this (maximum) modulus. Exhibit a matrix for which ρ (A) is not an eigenvalue of A.
• It is often of interest to determine the eventual behavior of $A^m$ as $m \to \infty$. A consequence of Perron’s Theorem is that, whenever $A > 0$ and $\rho(A) = 1$ (as for a positive stochastic matrix), $A^m$ converges to a matrix with rank 1. This is not true in general if we merely assume that $A \geq 0$; for example, consider $A = I_n$. However, show using 2 × 2 examples that there exists a matrix $A \geq 0$ that is not positive but satisfies $A^2 > 0$. Will Perron’s Theorem hold in this case? (The answer is yes.)
• The exponential of $A$, $e^A$, is relatively easy to work with when $A$ is diagonalizable (as illustrated, for example, in the proof of Theorem 12). Let’s now consider a nondiagonalizable example: Let $A = \begin{bmatrix} 1 & 4 \\ -1 & -3 \end{bmatrix}$. Note that $A$ has one eigenvalue $\lambda = -1$ and that the geometric multiplicity of $\lambda$ is 1. Thus, $A$ fails to be diagonalizable. As such, we want to avoid computing $e^A$ via the definition and so we need a new “trick,” one that takes advantage of the fact that $A$ possesses a single eigenvalue. It turns out that if $A$ and $B$ commute, then $e^{A+B} = e^A e^B$ (the proof of this requires, not surprisingly, the consideration of the product of two infinite series). So we write $A = \lambda I + (A - \lambda I)$ and note that
$e^A = e^{\lambda I + (A - \lambda I)} = e^{\lambda I} e^{A - \lambda I}$
since $I$ commutes with every matrix. Now check that $e^{\lambda I} = e^{\lambda} I$ and then expand the above as
$e^A = e^{\lambda}\left[ I + (A - \lambda I) + \frac{1}{2!}(A - \lambda I)^2 + \frac{1}{3!}(A - \lambda I)^3 + \cdots \right]$
From the Group Work in Section 4.3 involving generalized eigenvectors, it turns out to be easy to compute powers of
$A - \lambda I = A + I = \begin{bmatrix} 2 & 4 \\ -1 & -2 \end{bmatrix}$
since
$(A - \lambda I)^2 = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}$
Thus
$e^A = e^{-1}\left( I + \begin{bmatrix} 2 & 4 \\ -1 & -2 \end{bmatrix} \right) = e^{-1}\begin{bmatrix} 3 & 4 \\ -1 & -1 \end{bmatrix}$
This method works whenever $A$ has a single eigenvalue.
• Discrete linear dynamical systems arising from powers of a 2 × 2 real matrix can be broken down into the following cases:
1. $\lambda_1$ and $\lambda_2$ are real
(a) $|\lambda_1|, |\lambda_2| < 1$: 0 is an attractor
(b) $|\lambda_1|, |\lambda_2| > 1$: 0 is a repeller
(c) $|\lambda_1| > 1$, $|\lambda_2| < 1$: 0 is a saddle
(d) $|\lambda_1|, |\lambda_2| = 1$: all points are stationary
(e) Special cases where $|\lambda_1| = 1$, $|\lambda_2| \neq 1$
2. $\lambda_1$ and $\lambda_2$ are complex conjugates
(a) $|\lambda_1| < 1$: spiral in
(b) $|\lambda_1| > 1$: spiral out
(c) $|\lambda_1| = 1$: elliptical orbit
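The closed form for $e^A$ in the matrix exponential note above, for $A = \begin{bmatrix} 1 & 4 \\ -1 & -3 \end{bmatrix}$, can be compared against a direct partial sum of the defining series. The sketch below is plain Python under the assumption that forty terms of the series are more than enough for a matrix of this size; the helper names `matmul` and `exp_series` are hypothetical.

```python
import math

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def exp_series(A, terms=40):
    """Approximate e^A by summing the first `terms` terms of the power series."""
    n = len(A)
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]            # holds A^k / k!
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]
        total = [[t + s for t, s in zip(trow, srow)]
                 for trow, srow in zip(total, term)]
    return total

A = [[1, 4],
     [-1, -3]]

series = exp_series(A)
# Closed form from the nilpotent trick: e^A = e^{-1} (I + (A + I)).
closed = [[3 / math.e, 4 / math.e],
          [-1 / math.e, -1 / math.e]]

print(series)
print(closed)
```

The two printed matrices should agree to many decimal places, which gives students a concrete check on the algebra.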
Lecture Examples
• Consider the transition matrix $P = \begin{bmatrix} \frac{1}{3} & \frac{2}{5} \\ \frac{2}{3} & \frac{3}{5} \end{bmatrix}$. The limit $\lim_{m \to \infty} P^m = L = \frac{3}{8}\begin{bmatrix} 1 & 1 \\ \frac{5}{3} & \frac{5}{3} \end{bmatrix} = \begin{bmatrix} \frac{3}{8} & \frac{3}{8} \\ \frac{5}{8} & \frac{5}{8} \end{bmatrix}$.
• Consider the positive matrix $P = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}$. $\rho(P) = 4$ is an eigenvalue, and $[1, 1, 1]$ is a corresponding positive eigenvector.
• $P = \begin{bmatrix} 0 & 1 & 1 \\ 1 & 0 & 1 \\ 1 & 1 & 0 \end{bmatrix}$ is a nonnegative irreducible matrix.
• Consider the linear system $x' = Ax$, where $A = \begin{bmatrix} -1 & 0 & 1 \\ 3 & 0 & -3 \\ 1 & 0 & -1 \end{bmatrix}$. Three eigenvalues of $A$ are $\lambda_1 = 0$, $\lambda_2 = 0$, and $\lambda_3 = -2$. Three linearly independent eigenvectors of $A$ are $v_1 = [0, 1, 0]$, $v_2 = [1, 0, 1]$, and $v_3 = [-1, 3, 1]$. The general solution to the system is $x(t) = C_1 e^{\lambda_1 t} v_1 + C_2 e^{\lambda_2 t} v_2 + C_3 e^{\lambda_3 t} v_3 = C_1 v_1 + C_2 v_2 + C_3 e^{-2t} v_3$.
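• Discrete linear dynamical systems arising from powers of a 2 × 2 real matrix. Initial values are all $x_0 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}$.
$A = \begin{bmatrix} 6 & -1 \\ 12 & -1 \end{bmatrix}$ has eigenvalues 2 and 3.
$A = \begin{bmatrix} 0 & \frac{1}{6} \\ -2 & \frac{7}{6} \end{bmatrix}$ has eigenvalues $\frac{1}{2}$ and $\frac{2}{3}$.
$A = \begin{bmatrix} 1 & 2 \\ -1 & 1 \end{bmatrix}$ has eigenvalues $1 \pm \sqrt{2}\,i$.
$A = \begin{bmatrix} \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \\ -\frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix}$ has eigenvalues $\frac{\sqrt{2}}{2} \pm \frac{\sqrt{2}}{2}\,i$.
[Figures: for each matrix, the trajectory of $x_0$ plotted in the xy-plane]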
Tech Tip The Collatz function is given by
$f(n + 1) = \begin{cases} 3f(n) + 1 & \text{if } f(n) \text{ is odd} \\ \frac{1}{2} f(n) & \text{if } f(n) \text{ is even} \end{cases}$
There is a classic, unproven conjecture that for all choices of initial value $f(0)$, there exists an $n$ such that $f(n) = 1$. Students should investigate this conjecture by starting with different initial values and seeing if they always go to 1.
If there is additional interest, students should investigate $\sigma(k)$, the “stopping time” of $k$. If we let $f(0) = k$, then $\sigma(k)$ is the smallest number such that $f(\sigma(k)) = 1$. The Collatz conjecture, then, states that $\sigma(k)$ is always finite. Make sure that the students look at $\sigma(k)$ up through $k = 30$, for there are some surprises. The graph of $\sigma(k)$ versus $k$ has a very unexpected, interesting structure that is still being investigated.
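A minimal sketch of the stopping-time investigation in plain Python follows. The function name `stopping_time` is an assumption made for this illustration, and the loop simply applies the Collatz rule until the value 1 is reached, so it would not terminate for a starting value that never reaches 1.

```python
def stopping_time(k):
    """Return sigma(k): the number of Collatz steps needed to go from k to 1."""
    steps = 0
    while k != 1:
        k = 3 * k + 1 if k % 2 else k // 2
        steps += 1
    return steps

# Tabulate sigma(k) for k = 1, ..., 30, the range suggested in the Tech Tip.
for k in range(1, 31):
    print(k, stopping_time(k))
```

Plotting the printed pairs makes the irregular structure of the stopping times easy to see.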
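Group Work 1: The Power of Perron
Students will get to examine some consequences of Perron’s Theorem through a small example. In particular, they will consider the limiting value of $A^k$ as $k \to \infty$ as well as an “infinite average” $\lim_{m \to \infty} \frac{1}{m} \sum_{j=1}^{m} A^j$.
Answers
1. $\lambda_1 = 1 - \alpha - \beta$, $\lambda_2 = 1$
2. The spectral radius is 1, and $\lambda_2 = 1$ is an eigenvalue. There is no other eigenvalue with modulus equal to the spectral radius except when $\alpha = \beta = 0$ (where $\lambda_1 = 1$) or $\alpha = \beta = 1$ (where $\lambda_1 = -1$).
3. $\lambda_1$ corresponds to $[1, -1]$; $\lambda_2$ corresponds to $[\beta, \alpha]$. Let $S = \begin{bmatrix} 1 & \beta \\ -1 & \alpha \end{bmatrix}$. The eigenvector corresponding to the spectral radius is positive.
4. $\frac{1}{\alpha + \beta} \begin{bmatrix} \beta & \beta \\ \alpha & \alpha \end{bmatrix}$
5. If $\alpha = \beta = 0$, then $\lim_{m \to \infty} A^m = I_2$. If $\alpha = \beta = 1$, the limit does not exist.
6. $\begin{bmatrix} \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} \end{bmatrix}$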
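Answers 4 and 6 can be checked numerically for particular parameter values. The sketch below is plain Python; the values $\alpha = 0.3$ and $\beta = 0.2$ are arbitrary choices made for illustration, and the helper names `matmul`, `matpow`, `cesaro_average`, and `perron_matrix` are hypothetical.

```python
def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def matpow(M, n):
    """Return M**n for a square matrix M and an integer n >= 0."""
    size = len(M)
    result = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(n):
        result = matmul(result, M)
    return result

def cesaro_average(M, m):
    """Return (1/m) * (M + M^2 + ... + M^m)."""
    size = len(M)
    total = [[0] * size for _ in range(size)]
    power = [[1 if i == j else 0 for j in range(size)] for i in range(size)]
    for _ in range(m):
        power = matmul(power, M)
        total = [[t + p for t, p in zip(trow, prow)]
                 for trow, prow in zip(total, power)]
    return [[t / m for t in row] for row in total]

def perron_matrix(alpha, beta):
    """The 2 x 2 matrix from the worksheet."""
    return [[1 - alpha, beta], [alpha, 1 - beta]]

# Answer 4 with alpha = 0.3, beta = 0.2: the limit should be [[0.4, 0.4], [0.6, 0.6]].
print(matpow(perron_matrix(0.3, 0.2), 100))

# Answer 6 with alpha = beta = 1: the averages should approach [[0.5, 0.5], [0.5, 0.5]].
print(cesaro_average(perron_matrix(1, 1), 1000))
```

For $\alpha = \beta = 1$ the powers themselves alternate between two matrices, which is why only the averages settle down.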
Group Work 2: Since You Brought It Up
This activity is a beautiful blending of linear algebra and concepts from calculus.
Answers
1. $e^{At} = I + tA + \frac{t^2}{2!}A^2 + \cdots$
2. $\frac{d}{dt} e^{At} = A + \frac{t}{1!}A^2 + \frac{t^2}{2!}A^3 + \cdots$
3. $\frac{d}{dt} e^{At} = Ae^{tA}$. Theorem: $x = e^{At}$ solves the differential equation.
4. $\frac{d}{dt}\left(x(t)\right) = \frac{d}{dt}\left(e^{At}v\right) = \left(\frac{d}{dt}e^{At}\right)v = Ae^{At}v = Ax(t)$
5. These two quantities should be equal.
6. $e^{At}v = v + t\lambda v + \frac{t^2}{2!}\lambda^2 v + \frac{t^3}{3!}\lambda^3 v + \cdots$
7. It is equal to $e^{\lambda t}v$.
8. $e^{At}v = e^{\lambda t}v$
Suggested Core Assignment Markov Chains: Exercises 4, 8, 10P Population Growth: Exercises 15, 17, 23P The Perron-Frobenius Theorem: Exercises 31, 33, 37P Linear Recurrence Relations: Exercises 43, 46, 53P , 55 Systems of Linear Differential Equations: Exercises 61, 73, 77, 85 154
Group Work 1, Section 4.6 The Power of Perron
Here we examine some consequences of Perron’s Theorem (and more general theory) using a 2 × 2 example. For $0 \leq \alpha \leq 1$ and $0 \leq \beta \leq 1$, let $A = \begin{bmatrix} 1 - \alpha & \beta \\ \alpha & 1 - \beta \end{bmatrix}$. We are interested in $A^m$ for large $m$.
1. $A^m$ is easy to calculate when $A$ is diagonalizable, so let’s try that first. Compute the eigenvalues of $A$.
2. What is the spectral radius of $A$? Is the spectral radius an eigenvalue? If so, are there any other eigenvalues with modulus equal to the spectral radius?
3. Assume that $\alpha + \beta \neq 0$. Find the eigenvectors corresponding to $\lambda_1$ and $\lambda_2$. If $A > 0$, what do you notice about the eigenvector corresponding to the spectral radius? Using these eigenvectors, find a matrix $S$ such that $S^{-1}AS$ is diagonal.
4. If $\alpha$ and $\beta$ are not both equal to 1, determine $\lim_{m \to \infty} A^m$.
5. Now let’s consider the two cases we avoided. If $\alpha = \beta = 0$, determine $\lim_{m \to \infty} A^m$. If $\alpha = \beta = 1$, what happens to $\lim_{m \to \infty} A^m$?
6. Let’s look at the case where $\alpha = \beta = 1$ a bit more. Rather than look for a limiting value for $A^m$, let’s consider a limit of “averages”. That is, determine if the limit
$\lim_{m \to \infty} \frac{1}{m} \sum_{j=1}^{m} A^j$
exists.
Group Work 2, Section 4.6 Since You Brought It Up...
Let’s take a more complete look at the role of $e^A$ in solving a linear system of differential equations. Consider the system $x' = Ax$, where $A$ is a square matrix of dimension $n$. We want to prove some results about solutions to a system of this form.
1. Expand $e^{At}$ using the definition of the matrix exponential.
2. Now, using term-by-term differentiation, compute $\frac{d}{dt} e^{At}$.
3. Compare the above with $Ae^{tA}$. What do you notice? State a theorem!
4. Let $v \in \mathbb{R}^n$ and let $x(t) = e^{At}v$. Use your theorem to show that $x(t)$ solves the system $x' = Ax$. Hint: Use the fact that $\frac{d}{dt}\left(e^{At}v\right) = \left(\frac{d}{dt}e^{At}\right)v$.
5. We see then that solutions to $x' = Ax$ are as hard to obtain as $e^{At}$ is to obtain. Unfortunately, $e^{At}$ is often too difficult to compute directly from the definition. But there is some good news: Suppose that $v$ is an eigenvector of $A$ corresponding to the eigenvalue $\lambda$. Then, for every positive integer $k$, what can we say about $A^k v$ as compared to $\lambda^k v$?
6. Notice that
$e^{At}v = \left( I + tA + \frac{t^2}{2!}A^2 + \frac{t^3}{3!}A^3 + \cdots \right) v$
Now distribute $v$ through the infinite sum; you will be left with terms of the form $A^k v$. Replace these terms with your result above. What do you get?
7. Finally, factor $v$ out of your infinite sum and use the exponential definition of $e^{cI}$ to simplify your result.
8. State a theorem expressing your results.
5 Orthogonality
5.1 Orthogonality in $\mathbb{R}^n$
Suggested Time and Emphasis 1 class. Essential material.
Points to Stress 1. Sets of mutually orthogonal vectors. 2. Using Theorem 2 to find the coordinates of a vector with respect to a given orthogonal basis. 3. Definition and examples of orthogonal matrices. 4. Theorems 6 and 8: Properties of orthogonal matrices.
Drill Question Let $M = \begin{bmatrix} 0 & 0 \\ a & b \end{bmatrix}$. Can $a$ and $b$ be found to make $M$ an orthogonal matrix? Why or why not? Answer
No. There are many reasons, one of which is that $M^{-1}$ does not exist.
Discussion Question Is every orthogonal matrix diagonalizable? Answer The answer turns out to be “yes,” but this question can lead to discussions of complex matrices and other topics that are beyond the scope of this chapter. Students may not be able to obtain the answer to this question, but the idea here is to get a discussion going, to try to find counterexamples, and to reinforce the ideas of orthogonality and diagonalizability.
Test Question Assuming that v · w = 3 and that A is an orthogonal matrix, compute Av · Aw. Answer 3
Lecture Notes • If you discuss Section 5.0, review Example 5 from Section 3.5. This example derives the formula for projecting a vector onto a line. (It is not a bad idea to review that example even if you are not discussing Section 5.0.) • Note that the converse to Theorem 1 is false. There exist plenty of sets of linearly independent vectors that are not orthogonal. Students tend to confuse a statement with its converse, so it is good to point this out explicitly. 159
• Remind the students that we can view a square matrix as the associated matrix of a linear transformation in Euclidean space — that we can think of Ax as a transformation that takes a vector in, say, R3 to another vector in R3 . This perspective will make many of the abstract definitions and theorems in this section concrete and intuitive. For example, an orthogonal matrix can now be thought of as the matrix of a transformation that does not change the length of vectors. Theorem 6 now says that not only does it not change the length of vectors, but if two vectors are transformed, the angle between them does not change. Theorem 8 now says that the inverse transformation also doesn’t change the length of vectors, that it doesn’t change the area of transformed shapes, and that following one such transformation by another will not change the length of vectors.
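Lecture Examples
• A set of linearly independent vectors that are not orthogonal: $\begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$, $\begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$, $\begin{bmatrix} 2 \\ 3 \\ 0 \end{bmatrix}$
• An orthogonal 4 × 4 matrix: $A = \begin{bmatrix} \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{2} & -\frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ -\frac{1}{2} & \frac{1}{2} & \frac{1}{2} & -\frac{1}{2} \end{bmatrix}$
• One can easily check that the columns and rows form an orthogonal set.
• One can easily check that the columns and rows each have length one.
• Corresponding basis $\alpha$ of $\mathbb{R}^4$: $\left\{ \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \\ -\frac{1}{2} \end{bmatrix}, \begin{bmatrix} \frac{1}{2} \\ -\frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \end{bmatrix}, \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \\ \frac{1}{2} \end{bmatrix}, \begin{bmatrix} \frac{1}{2} \\ \frac{1}{2} \\ \frac{1}{2} \\ -\frac{1}{2} \end{bmatrix} \right\}$
• $A^{-1} = A^T = \begin{bmatrix} \frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \\ \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & -\frac{1}{2} & \frac{1}{2} \\ \frac{1}{2} & \frac{1}{2} & \frac{1}{2} & -\frac{1}{2} \end{bmatrix}$
• $\det A = 1$, eigenvalues of $A$ are −1 (multiplicity two) and $\frac{1}{2} \pm \frac{\sqrt{3}}{2}\,i$
• If $v = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix}$, $[v]_\alpha = \begin{bmatrix} -4 \\ 3 \\ 2 \\ 1 \end{bmatrix}$.
• A matrix with determinant 1 that is not orthogonal: $\begin{bmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 2 & 8 & 3 \end{bmatrix}$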
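The orthogonal 4 × 4 lecture example lends itself to a quick exact check. The following is a minimal sketch in plain Python using exact fractions; the helper names `transpose`, `matmul`, and `coordinates` are assumptions made for this illustration. It confirms that $A^T A = I$ and computes the coordinates of $v = [1, 2, 3, 4]$ relative to the basis formed by the columns of $A$, using the fact that each coordinate with respect to an orthonormal basis is a dot product.

```python
from fractions import Fraction

h = Fraction(1, 2)
A = [[h, h, h, h],
     [-h, -h, h, h],
     [-h, h, -h, h],
     [-h, h, h, -h]]

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)]
            for row in X]

def coordinates(Q, v):
    """Coordinates of v relative to the orthonormal basis of columns of Q."""
    return [sum(a * b for a, b in zip(col, v)) for col in zip(*Q)]

identity = [[1 if i == j else 0 for j in range(4)] for i in range(4)]

print(matmul(transpose(A), A) == identity)
print([int(c) for c in coordinates(A, [1, 2, 3, 4])])
```

The same `coordinates` helper only gives correct results when the columns really are orthonormal, so the first check should always be run before the second.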
Tech Tips • Give the students orthogonal sets of three vectors in R4 and challenge them to find a fourth vector that is mutually orthogonal to them.
Group Work: Hadamard Matrices This activity requires a bit of setup, but it is a nice introduction to an entire field of mathematical research. Go over the introduction and the first problem with the students, and then use your judgment to decide whether they can continue without you. After they finish, inform them of a theorem that says for an n × n Hadamard matrix to exist, n must be 1, 2, or divisible by 4. It is currently unknown whether there is a Hadamard matrix for every multiple of 4. One reason people are interested in this problem is that these matrices are crucial in coding theory. As of 1993, mathematicians had found one for every n up through n = 424. Hadamard matrices have many important properties and applications, and are still being researched. An entire course could be occupied with Hadamard matrices. Answers 1. A Hadamard matrix of order n (n even) is an n × n square matrix satisfying the following conditions:
• Each entry is either 1 or −1
• The first row and the first column consist entirely of 1s
• Columns other than the first contain $\frac{n}{2}$ 1s and $\frac{n}{2}$ −1s
• The dot product of any two distinct columns is 0
Any equivalent description is fine for this problem. It’s a tough one!
2. This follows directly from the definition of an orthogonal matrix.
3. Answers will vary.
4. Answers will vary. One quick proof is as follows: We know that $\frac{1}{\sqrt{n}}A$ is orthogonal, so $\left(\frac{1}{\sqrt{n}}A\right)^T = \left(\frac{1}{\sqrt{n}}A\right)^{-1}$. Therefore,
$\left(\frac{1}{\sqrt{n}}A\right)\left(\frac{1}{\sqrt{n}}A\right)^T = I \iff \frac{1}{n}AA^T = I \iff AA^T = nI$
Suggested Core Assignment Exercises 2, 4, 6, 10, 14, 20, 25P, 28P, 30, 35
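For instructors who want to generate or check Hadamard matrices quickly, here is a minimal sketch in plain Python. It builds matrices of order $2^k$ by the standard doubling construction (each step replaces $H$ by the block matrix with rows $[H \; H]$ and $[H \; {-H}]$) and tests the defining property $AA^T = nI$. The function names `sylvester` and `is_hadamard` are hypothetical names chosen for this illustration, and this construction is only one of many ways to produce such matrices.

```python
def sylvester(k):
    """Return a 2^k x 2^k Hadamard matrix built by repeated doubling."""
    H = [[1]]
    for _ in range(k):
        # Top block row is [H H]; bottom block row is [H -H].
        H = [row + row for row in H] + [row + [-v for v in row] for row in H]
    return H

def is_hadamard(H):
    """Check that every entry is 1 or -1 and that H H^T = n I."""
    n = len(H)
    if any(len(row) != n or any(v not in (1, -1) for v in row) for row in H):
        return False
    return all(
        sum(a * b for a, b in zip(H[i], H[j])) == (n if i == j else 0)
        for i in range(n) for j in range(n)
    )

for row in sylvester(3):
    print(row)
print(is_hadamard(sylvester(3)))
```

Student answers to Problem 3 can be pasted in as a list of rows and passed to `is_hadamard` for a quick check.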
Group Work, Section 5.1 Hadamard Matrices In this activity, we are going to look at a special type of matrix called a Hadamard matrix. Here are some examples of Hadamard matrices: ⎡ ⎤ 1 1 1 1 1 1 1 1 ⎢ 1 −1 1 −1 1 −1 1 −1 ⎥ ⎢ ⎥ ⎢ ⎥ ⎤ ⎡ ⎤ ⎡ ⎢ 1 1 −1 −1 1 1 −1 −1 ⎥ 1 1 1 1 1 1 1 1 ⎢ ⎥ ⎢ 1 −1 −1 1 1 −1 −1 1 ⎥ ⎢ 1 −1 1 −1 ⎥ ⎢ 1 1 −1 −1 ⎥ 1 1 ⎢ ⎥ ⎢ ⎥ ⎥ ⎢ ⎢ ⎥ ⎢ ⎥ ⎥ ⎢ ⎢ 1 1 1 1 −1 −1 −1 −1 ⎥ ⎣ 1 1 −1 −1 ⎦ ⎣ 1 −1 1 −1 ⎦ 1 −1 ⎢ ⎥ ⎢ 1 −1 1 −1 −1 1 −1 1 ⎥ 1 −1 −1 1 1 −1 −1 1 ⎢ ⎥ ⎢ ⎥ ⎣ 1 1 −1 −1 −1 −1 1 1 ⎦ 1 −1 −1 1 −1 1 1 −1 ⎡ ⎤ 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 ⎢ 1 −1 1 −1 1 −1 1 −1 1 −1 1 −1 1 −1 1 −1 ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ 1 1 −1 −1 1 1 −1 −1 1 1 −1 −1 1 1 −1 −1 ⎥ ⎢ ⎥ ⎢ 1 −1 −1 1 1 −1 −1 1 1 −1 −1 1 1 −1 −1 1 ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ 1 −1 1 1 −1 −1 −1 −1 1 1 1 1 −1 −1 −1 −1 ⎥ ⎢ ⎥ ⎢ 1 1 1 −1 −1 −1 1 1 1 −1 1 −1 −1 −1 1 1 ⎥ ⎢ ⎥ ⎢ 1 −1 −1 −1 −1 1 −1 1 1 1 −1 −1 −1 1 −1 1 ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ 1 1 −1 1 −1 1 1 −1 1 −1 −1 1 −1 1 1 −1 ⎥ ⎢ ⎥ ⎢ 1 −1 1 1 1 1 1 1 −1 −1 −1 −1 −1 −1 −1 −1 ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ 1 1 1 −1 1 −1 1 −1 −1 −1 1 1 −1 −1 1 1 ⎥ ⎢ ⎥ ⎢ 1 −1 −1 −1 1 1 −1 −1 −1 1 −1 1 −1 1 −1 1 ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ 1 1 −1 1 1 −1 −1 1 −1 1 1 −1 −1 1 1 −1 ⎥ ⎢ ⎥ ⎢ 1 −1 1 1 −1 −1 −1 −1 −1 −1 −1 −1 1 1 1 1 ⎥ ⎢ ⎥ ⎢ 1 1 1 −1 −1 −1 1 1 −1 −1 1 1 1 −1 1 −1 ⎥ ⎢ ⎥ ⎢ ⎥ ⎣ 1 −1 −1 −1 −1 1 −1 1 −1 1 −1 1 1 1 −1 −1 ⎦ 1 1 −1 1 −1 1 1 −1 −1 1 1 −1 1 −1 −1 1 1. From the examples above, try to guess a definition of a Hadamard matrix.
2. Show that if A is an n × n Hadamard matrix, then √1n A is an orthogonal matrix.
3. Find an 8 × 8 Hadamard matrix different than the one given.
4. Prove that if A is an n × n Hadamard matrix, then A AT = nI .
5.2
Orthogonal Complements and Orthogonal Projections
Suggested Time and Emphasis 1–2 classes. Recommended material.
Points to Stress 1. The definition of orthogonal complement. 2. The four fundamental subspaces of an m × n matrix, and their relationship. 3. Generalized orthogonal projections. 4. Corollary 6: If A is an m × n matrix, then rank (A) + nullity (A) = n.
Drill Question If W is a two-dimensional subspace of R5 , what is the dimension of W ⊥ ? Answer 3
Discussion Question Let W be a subspace of Rn and let P be the orthogonal projection onto W. For x ∈ Rn, we may consider P x to be an approximation to x in W. Is P x a “good” approximation to x? Answer In fact P x is the best approximation to x in W, because the distance ‖x − w‖ is minimized (over w ∈ W) by choosing w = P x.
Test Question Let W be a subspace of Rn and let P be the orthogonal projection onto W . Describe the set of x ∈ Rn such that P x = 0. Answer P x = 0 if and only if x is orthogonal to W .
Lecture Notes
• This section makes heavy use of definitions and examples from Section 3.4. Perhaps start class with a review of Examples 9, 11 and 12 from that section.
• Given a subspace $W$ of $\mathbb{R}^n$, it is important to emphasize that $W^\perp$ is also always a subspace of $\mathbb{R}^n$. Consider doing the proof of Theorem 1(a) early in class. You may also want to take this opportunity to foreshadow Theorem 5. For example, suppose $W$ is a three-dimensional subspace of $\mathbb{R}^5$ with basis $\{u_1, u_2, u_3\}$. To determine $W^\perp$, find all vectors in $\mathbb{R}^5$ that are simultaneously orthogonal to $u_1$, $u_2$, and $u_3$. Let $x$ be a vector in $\mathbb{R}^5$; note that this vector has five coordinates. If we require $x \cdot u_i = 0$ for $i \in \{1, 2, 3\}$, then this places three independent linear constraints on the coordinates of $x$. Solving this linear system, we find that there are two free coordinates. And, of course, 2 is exactly the dimension of $W^\perp$.
• There is potential for confusion when attempting to visualize $W$ and $W^\perp$. For example, if $W$ is a two-dimensional subspace of $\mathbb{R}^3$, ask your students to describe $W^\perp$ geometrically. On first thought, one might describe $W^\perp$ as “that plane which is orthogonal to the plane $W$”, which makes sense, since it is easy to visualize two orthogonal planes in $\mathbb{R}^3$. Take this opportunity to show students that the “plane orthogonal to plane $W$” contains vectors which are not orthogonal to $W$. Indeed, the two planes have an entire line in common, and only the zero vector is orthogonal to itself! Be sure to point out that in this case, $W^\perp$ consists only of a (one-dimensional) line.
• From one point of view, Theorem 3 (the Orthogonal Decomposition Theorem) is the centerpiece of this section. Note that all subsequent results rely on this theorem, and the central definition of the section (orthogonal projection) is heavily employed in its proof. Notice that in the proof of the theorem we find the phrase “...we choose an orthogonal basis...”. Certainly every finite-dimensional subspace has a basis, but can we always find an orthogonal basis? The answer is “yes,” but the verification of this claim is the subject of Section 5.3. To give students a basic sense of why the Orthogonal Decomposition Theorem should hold, continue along the same line as in the second lecture note above. Specifically, if $W$ is a $k$-dimensional subspace of $\mathbb{R}^n$, then, by an equation count, there should be $n - k$ linearly independent vectors orthogonal to $W$. We call the span of these vectors $W^\perp$. If $x \in \mathbb{R}^n$, then $x$ must be a linear combination of elements from $W \cup W^\perp$; otherwise $x$ would fail to be in the span of $W \cup W^\perp$, which would imply that the span of $\{x\} \cup W \cup W^\perp$ is $(n + 1)$-dimensional, which is impossible inside $\mathbb{R}^n$.
Lecture Examples
• Fun with a 5 × 4 matrix of rank 3:
\[
A = \begin{bmatrix} 1 & -2 & -1 & 1 \\ 0 & 1 & 3 & 2 \\ -2 & 2 & 1 & 0 \\ 3 & -5 & 0 & 5 \\ -1 & 0 & 0 & 1 \end{bmatrix}
\]
A in reduced row-echelon form:
\[
\begin{bmatrix} 1 & 0 & 0 & -1 \\ 0 & 1 & 0 & -\frac{8}{5} \\ 0 & 0 & 1 & \frac{6}{5} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{bmatrix}
\]
The fundamental subspaces of A:
1. Basis for the row space of $A$: $\left\{[1, 0, 0, -1],\ [0, 1, 0, -\tfrac{8}{5}],\ [0, 0, 1, \tfrac{6}{5}]\right\}$. Notice that the first three rows of $A$ are also a basis for the row space, but are not as easy to work with.
2. Basis for the null space of $A$: $\left\{[1, \tfrac{8}{5}, -\tfrac{6}{5}, 1]\right\}$
3. Basis for the column space of $A$: $\{[1, 0, 0, 3, 1],\ [0, 1, 0, 1, 0],\ [0, 0, 1, 0, 1]\}$
4. Basis for the null space of $A^T$: $\{[3, 1, 0, -1, 0],\ [1, 0, 1, 0, -1]\}$. (These come from the row relations $r_4 = 3r_1 + r_2$ and $r_5 = r_1 + r_3$.)
It is easy to verify Theorem 2. Also, rank (A) = 3, nullity (A) = 1, and rank (A) + nullity (A) = 4, which is the number of columns of A.
• Fun with a subspace of $\mathbb{R}^5$: Let $W$ have the following basis: $\{[1, 2, 3, 4, 5],\ [1, 0, 1, 0, 1]\}$. Then an orthogonal basis of $W$ is $\{[1, 0, 1, 0, 1],\ [-2, 2, 0, 4, 2]\}$ and a basis of $W^\perp$ is $\{[-1, -2, 0, 0, 1],\ [0, -2, 0, 1, 0],\ [-1, -1, 1, 0, 0]\}$.
Let $v$ be the vector $[1, 1, 1, 1, 1]$. Then
\[
\operatorname{proj}_W(v) = [1, 0, 1, 0, 1] + \tfrac{3}{14}[-2, 2, 0, 4, 2] = \left[\tfrac{4}{7}, \tfrac{3}{7}, 1, \tfrac{6}{7}, \tfrac{10}{7}\right],
\]
\[
\operatorname{perp}_W(v) = [1, 1, 1, 1, 1] - \left[\tfrac{4}{7}, \tfrac{3}{7}, 1, \tfrac{6}{7}, \tfrac{10}{7}\right] = \left[\tfrac{3}{7}, \tfrac{4}{7}, 0, \tfrac{1}{7}, -\tfrac{3}{7}\right],
\]
and the orthogonal decomposition of $v$ is $\left[\tfrac{4}{7}, \tfrac{3}{7}, 1, \tfrac{6}{7}, \tfrac{10}{7}\right] + \left[\tfrac{3}{7}, \tfrac{4}{7}, 0, \tfrac{1}{7}, -\tfrac{3}{7}\right]$.
Let $u$ be the vector $[1, 2, 3, 4, 5]$. Then $\operatorname{proj}_W(u) = 3[1, 0, 1, 0, 1] + [-2, 2, 0, 4, 2] = [1, 2, 3, 4, 5]$, $\operatorname{perp}_W(u) = [1, 2, 3, 4, 5] - [1, 2, 3, 4, 5] = [0, 0, 0, 0, 0]$, and the orthogonal decomposition of $u$ is $[1, 2, 3, 4, 5] + [0, 0, 0, 0, 0]$.
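Both lecture examples can be verified with exact rational arithmetic. The sketch below is an illustration only, with hypothetical helper names (`dot`, `proj`). The two vectors used for the null space of $A^T$ are an assumption of this sketch, derived from the row relations $r_4 = 3r_1 + r_2$ and $r_5 = r_1 + r_3$ of the matrix $A$.

```python
from fractions import Fraction as F


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def proj(v, orthogonal_basis):
    """Project v onto the span of mutually orthogonal vectors (no normalization needed)."""
    result = [F(0)] * len(v)
    for u in orthogonal_basis:
        c = F(dot(v, u), dot(u, u))
        result = [r + c * a for r, a in zip(result, u)]
    return result


A = [[1, -2, -1, 1], [0, 1, 3, 2], [-2, 2, 1, 0], [3, -5, 0, 5], [-1, 0, 0, 1]]
null_A = [F(1), F(8, 5), F(-6, 5), F(1)]
null_AT = [[3, 1, 0, -1, 0], [1, 0, 1, 0, -1]]  # from r4 = 3 r1 + r2 and r5 = r1 + r3
columns = list(zip(*A))

# Every row of A is orthogonal to null(A); every column is orthogonal to null(A^T).
rows_orthogonal_to_null = all(dot(row, null_A) == 0 for row in A)
cols_orthogonal_to_null_AT = all(dot(c, y) == 0 for c in columns for y in null_AT)

# Second example: orthogonal decomposition of v = [1, 1, 1, 1, 1] relative to W.
W_orthogonal_basis = [[1, 0, 1, 0, 1], [-2, 2, 0, 4, 2]]
v = [1, 1, 1, 1, 1]
p = proj(v, W_orthogonal_basis)
perp = [a - b for a, b in zip(v, p)]

print(rows_orthogonal_to_null, cols_orthogonal_to_null_AT)
print(p, perp)
```

Using `Fraction` keeps the sevenths exact, so the results can be compared directly with the values in the example.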
Tech Tip Have the students use an arbitrary matrix to verify Theorem 2.
Group Work 1: Cookie Dough Joe
In this section we’ve defined the very natural concept of an orthogonal projection onto a subspace W. Note that the “orthogonal” part of this definition is captured in the fact that (by construction) x − P x is orthogonal to W. There are, however, ways to project x onto W other than orthogonally. We explore this more general scheme here.
Answers
1. It suffices to show that $Pv_i = v_i$ for each $i = 1, \dots, k$. But by the choice of the $u_j$, we have $Pv_i = \sum_{j} (v_i \cdot u_j)\, v_j = (v_i \cdot u_i)\, v_i = v_i$.
2. Let $u_1 = [u_{11}, u_{12}, u_{13}]$ and $u_2 = [u_{21}, u_{22}, u_{23}]$. Then $v_1 \cdot u_1 = u_{11} + u_{12} = 1$, $v_2 \cdot u_1 = u_{11} = 0$, $v_1 \cdot u_2 = u_{21} + u_{22} = 0$, and $v_2 \cdot u_2 = u_{21} = 1$. Therefore, $u_{11} = 0$, $u_{12} = 1$, $u_{13} = s$, $u_{21} = 1$, $u_{22} = -1$, and $u_{23} = t$, and so $u_1 = [0, 1, s]$ and $u_2 = [1, -1, t]$.
3. $Pe_3 = sv_1 + tv_2 = [s, s, 0] + [t, 0, 0] = [s + t, s, 0]$, so $s = 1$ and $t = -1$.
4. Let $x = [x_1, x_2, x_3]$, so $Px = (x_2 + sx_3)\, v_1 + (x_1 - x_2 + tx_3)\, v_2$. Then $x - Px = [-x_3(s + t), -sx_3, x_3]$. We want $(x - Px) \cdot v_i = 0$ for $i = 1, 2$. This yields $(x - Px) \cdot v_1 = -x_3(s + t) - sx_3 = 0$ and $(x - Px) \cdot v_2 = -x_3(s + t) = 0$, so $s = t = 0$.
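The family of projections in these answers is easy to experiment with. The following sketch (the function name `make_projection` is hypothetical) builds $P$ for given $s$ and $t$ from $v_1 = [1, 1, 0]$, $v_2 = [1, 0, 0]$, $u_1 = [0, 1, s]$, $u_2 = [1, -1, t]$, so that students can see that every choice gives $P^2 = P$, while only $s = t = 0$ makes $x - Px$ orthogonal to $W$.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


V1, V2 = [1, 1, 0], [1, 0, 0]  # the (non-orthogonal) basis of W from Problem 2


def make_projection(s, t):
    """Return the map P x = (x . u1) v1 + (x . u2) v2 with u1 = [0, 1, s], u2 = [1, -1, t]."""
    u1, u2 = [0, 1, s], [1, -1, t]

    def P(x):
        c1, c2 = dot(x, u1), dot(x, u2)
        return [c1 * a + c2 * b for a, b in zip(V1, V2)]

    return P


x = [2, 5, 7]

oblique = make_projection(1, -1)      # the choice from Problem 3
print(oblique([0, 0, 1]))             # image of e3
print(oblique(oblique(x)) == oblique(x))

orthogonal = make_projection(0, 0)    # the choice from Problem 4
residual = [a - b for a, b in zip(x, orthogonal(x))]
print(dot(residual, V1), dot(residual, V2))
```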
Group Work 2: Co-Minimal Projections
A projection (as defined in Group Work 1) can be used as a convenient approximating tool, in the following sense. Let W be a subspace of Rn. For x ∈ Rn we seek a “good” approximation to x in W.
Answers
1. Calculate $\|x - Px\|$. The smaller this quantity, the better the approximation.
2. $\|x - Px\| \le \|x - w\|$ for all $x \in \mathbb{R}^n$ and $w \in W$.
3. Since $x - Px$ is orthogonal to $w - Px \in W$, we have $(x - Px) \cdot (w - Px) = 0$, and so
\begin{align*}
\|x - w\|^2 &= (x - w) \cdot (x - w) \\
&= (x - w) \cdot (x - w) + 2\,(x - Px) \cdot (w - Px) \\
&= x \cdot x - 2\,(x \cdot w) + w \cdot w + 2\,(x \cdot w) - 2\,(x \cdot Px) - 2\,(w \cdot Px) + 2\,(Px \cdot Px) \\
&= (x - Px) \cdot (x - Px) + (Px - w) \cdot (Px - w) \\
&= \|x - Px\|^2 + \|Px - w\|^2
\end{align*}
This implies that $\|x - w\| \ge \|x - Px\|$, with equality holding if and only if $w = Px$.
Group Work 3: Find the Error Answer
The basis used for the plane W is not an orthogonal basis.
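The stranger's computation and its repair can be replayed exactly. This is a sketch with hypothetical helper names. The corrected projection is obtained here by subtracting the component of $v$ along the normal vector $[1, -1, -1]$, which is one of several valid routes (orthogonalizing $u_1, u_2$ first would give the same result).

```python
from fractions import Fraction as F


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sub(u, v):
    return [a - b for a, b in zip(u, v)]


def sum_of_projections(v, vectors):
    """Add up the projections of v onto each vector; equals proj_W(v) only for an orthogonal set."""
    total = [F(0)] * len(v)
    for u in vectors:
        c = F(dot(v, u), dot(u, u))
        total = [t + c * a for t, a in zip(total, u)]
    return total


u1, u2 = [1, 1, 0], [1, 0, 1]   # spans W: x - y - z = 0, but u1 . u2 = 1
normal = [1, -1, -1]
v = [1, 2, 3]

w_wrong = sum_of_projections(v, [u1, u2])
perp_wrong = sub(v, w_wrong)
print(w_wrong, dot(perp_wrong, u1), dot(perp_wrong, u2))    # dot products are not zero

w_right = sub(v, sum_of_projections(v, [normal]))           # remove the normal component
perp_right = sub(v, w_right)
print(w_right, dot(perp_right, u1), dot(perp_right, u2))    # dot products are zero
```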
Suggested Core Assignment Exercises 2, 6, 8, 10, 16, 20
Group Work 1, Section 5.2
Cookie Dough Joe
We say that $P : \mathbb{R}^n \to W$ is a projection if $P \circ P = P^2 = P$; that is, if $Pw = w$ for all $w \in W$. Thus a projection onto $W$ is a map that “fixes” $W$. Of course, an orthogonal projection is a special type of projection. We will now outline the recipe for constructing projections and then you will have a chance to build one of your own. Here are the steps:
I Let $\{v_1, \dots, v_k\}$ be a basis for $W$ in $\mathbb{R}^n$. (Note we do not require any orthogonality.)
II Here is the place that requires work: choose $u_1, \dots, u_k$ in $\mathbb{R}^n$ so that $v_i \cdot u_j = \delta_{ij}$ (here we use $\delta_{ij} = 0$ for $i \ne j$ and $\delta_{ij} = 1$ for $i = j$). This will involve solving an (underdetermined) linear system, so there will be a non-unique selection of such $u_j$’s.
III Here is the way we define a projection onto $W$ relative to the basis $v_1, \dots, v_k$:
\[
Px = (x \cdot u_1)\, v_1 + \cdots + (x \cdot u_k)\, v_k
\]
Now for some examples. 1. Show that the construction above does in fact yield a projection.
Hint: Show that P w = w for all w ∈ W .
2. In R3 , let W be the subspace with basis v1 = [1, 1, 0] and v2 = [1, 0, 0]. Let’s find all possible u1 , u2 such
that vi · uj = δ ij . Initially we have 6 unknowns (3 coordinates for u1 and 3 for u2 ); the δ ij requirements yield 4 equations, and so we expect 2 free parameters associated with u1 and u2 . Now find this family of u1 , u2 pairs.
3. Let u1 = [0, 1, s] and u2 = [1, −1, t]. How should s and t be chosen so that P e3 = P [0, 0, 1] = [0, 1, 0]?
4. Find s and t such that x − P x is orthogonal to W .
Group Work 2, Section 5.2 Co-Minimal Projections Let P : Rn → W be a projection. Let’s say that our approximation to x from W is P x (note that this approximation is the best possible if x ∈ W , since P = I on W ). 1. What is the natural way to measure how well P x approximates x?
2. Given $x \in \mathbb{R}^n$, let $w_x \in W$ be an approximation to $x$ from $W$, and let’s say $w_x$ is a better approximation than $\hat{w}_x$ if $\|x - w_x\| \le \|x - \hat{w}_x\|$. Let $P$ denote the orthogonal projection onto $W$. What inequality must we establish in order to verify the claim that $Px$ is the best approximation to $x$?
3. Verify that $Px$ is the best approximation to $x$.
Hint: Use the fact that $x - Px$ is orthogonal to every element of $W$ and compare $\|x - w\|^2$ with $\|x - Px\|^2 + \|Px - w\|^2$.
Group Work 3, Section 5.2
Find the Error
It is a beautiful spring day. You and your friends are lounging at an outdoor classical music concert. The string quartet has just taken a break, and you are discussing your linear algebra.
“I love the Orthogonal Decomposition Theorem,” you say.
“It is so elegant,” says one friend.
“It is so perfect,” says another.
“It is so FALSE,” says the wild-eyed gentleman who has crept up behind you!
“What do you mean,” you say. “I love this theorem and it is the essence of that which is True!”
“Oh, is it, my Merry Grig? Let’s consider this subspace $W$ of $\mathbb{R}^3$: $x - y - z = 0$.”
“That’s a plane through the origin, so I buy that it is a subspace,” says one of your friends.
“...with normal vector $\begin{bmatrix} 1 \\ -1 \\ -1 \end{bmatrix}$”, says the other.
“Yes, that’s obvious,” you say. “We also know that it is spanned by $u_1 = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$ and $u_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}$. But what is your point?”
The wild-eyed stranger smiles. “So by that theorem, we can find an orthogonal decomposition of $v = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$, right?”
You nod. “Yes. You can write $v = w + w^\perp$, where $w = \operatorname{proj}_W(v)$ and $w^\perp = \operatorname{perp}_W(v)$.” There was a time when you would have found it odd that both you and the stranger could speak in math notation, but that time has long since passed.
The stranger now computes, writing his calculations down on a legal pad.
\[
w = \operatorname{proj}_W(v) = \frac{u_1 \cdot v}{u_1 \cdot u_1}\, u_1 + \frac{u_2 \cdot v}{u_2 \cdot u_2}\, u_2
= \frac{3}{2}\begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix} + \frac{4}{2}\begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}
= \begin{bmatrix} \frac{7}{2} \\ \frac{3}{2} \\ 2 \end{bmatrix}
\]
\[
w^\perp = \operatorname{perp}_W(v) = v - w = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} - \begin{bmatrix} \frac{7}{2} \\ \frac{3}{2} \\ 2 \end{bmatrix} = \begin{bmatrix} -\frac{5}{2} \\ \frac{1}{2} \\ 1 \end{bmatrix}
\]
“Is that copacetic by you?” he asks. Assuming that the word “copacetic” means “correct”, you nod. And then he asks you to check his work.
Your friends listen intently as you speak. They adore you. “Well,” you say, “clearly $w + w^\perp = v$. And $w \in W$ since $\frac{7}{2} - \frac{3}{2} - 2 = 0$. Now we just need to check that $w^\perp \in W^\perp$. So we take the dot products of $w^\perp$ with $u_1$ and $u_2$, hoping to get zero...”
You pause. “...because w⊥ has to be perpendicular to every vector in W ...” Another pause, followed by silence. The dot products aren’t zero. “Umm...”, you say, “This isn’t right. There was some mistake...” “Oh, really, smartie? Then where is it? What’s my error?” And as your friends look at you expectantly, the quartet starts playing a strange little tune. The stranger runs up on stage and begins to sing: Don’t like Linear Algebra Always makes me cry Determinants are awful Just give me moony pies! Matrices are messy Eigenspace is dry You can take the inverse And I’ll take moony pies! To your horror, your friends are laughing and clapping along! Save your honor! Find the error.
5.3
The Gram-Schmidt Process and the QR Factorization
Suggested Time and Emphasis 1 class. Recommended material.
Points to Stress 1. Gram-Schmidt Process leading to an orthogonal basis. 2. Construction of an orthonormal basis. 3. Factorization of matrix A as QR.
Drill Question True or False: The Gram-Schmidt Process (as described in our text) always produces an orthonormal basis. Answer False. It yields an orthogonal basis.
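Discussion Question Suppose {v1 , . . . , vk } form a basis for W and ‖vi‖ = 1 for each i = 1, . . . , k. Will the Gram-Schmidt Process applied to this basis produce an orthonormal basis? Answer Not necessarily. Only the first output vector is guaranteed to have length 1; subtracting projections generally shortens the others.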
Test Question Suppose A has linearly independent columns. Is AT A nonsingular? Answer Yes, by QR factorization.
Lecture Notes
• This is a good place to make a point regarding Section 5.2. If $W$ is a subspace of $\mathbb{R}^n$, then a linear transformation $P : \mathbb{R}^n \to W$ such that $Pw = w$ for all $w \in W$ and $x - Px \in W^\perp$ for all $x$ is called an orthogonal projection onto $W$. Section 5.2 assures us that there is a unique orthogonal projection onto each $W$. This result is easiest to prove when we assume that we have an orthogonal basis for $W$, and the role of the current section is to ensure that such a basis can always be chosen. However, as illustrated in Group Work 1 of Section 5.2, the construction of an orthogonal projection can be accomplished without using an orthogonal basis. Consider drawing an “arbitrary” subspace (plane containing the origin) on the board and indicate how vectors are orthogonally projected onto the subspace. Note there is no need to invoke an orthogonal basis for such projecting to take place. In Group Work 1 of this section we take another look at orthogonal projection construction.
• The Gram-Schmidt Process can be stated in a condensed form which is easier to remember. It goes like this: Given a basis $\{x_1, \dots, x_k\}$, define the orthogonal basis $\{v_1, \dots, v_k\}$ by $v_1 = x_1$ and, for $i = 2, \dots, k$, $v_i = x_i - \operatorname{proj}_{W_{i-1}} x_i$, where $W_{i-1}$ is the span of $\{x_1, \dots, x_{i-1}\}$.
• The proof that the Gram-Schmidt Process yields an orthogonal basis is almost trivial when the process is expressed as $v_i = x_i - \operatorname{proj}_{W_{i-1}} x_i$ for $i = 2, \dots, k$. First note that, for $i \ge 2$, $v_i$ is orthogonal to $x_j$ for $j = 1, \dots, i - 1$ by construction (see Section 5.2), and hence to all of $W_{i-1}$. Therefore, since $v_j \in W_{i-1}$ for $j = 1, \dots, i - 1$, we have $v_i$ orthogonal to $v_j$ for $j = 1, \dots, i - 1$.
• Only square matrices have the possibility of having an inverse. Nonetheless, QR factorization provides us with an opportunity to define a “generalized inverse” for an m × n matrix A (with m ≥ n) whenever the columns of A are linearly independent. How should the “generalized inverse” be defined? In what sense is it “easy” to compute?
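Lecture Examples
• Find an orthogonal basis for the subspace of $\mathbb{R}^4$ spanned by
\[
v_1 = \begin{bmatrix} 2 \\ 1 \\ 0 \\ -1 \end{bmatrix} \qquad
v_2 = \begin{bmatrix} 1 \\ 0 \\ 2 \\ -1 \end{bmatrix} \qquad
v_3 = \begin{bmatrix} 0 \\ -2 \\ 1 \\ 0 \end{bmatrix}
\]
Note these vectors are linearly independent. Performing the Gram-Schmidt process we find
\[
w_1 = v_1, \qquad
w_2 = \begin{bmatrix} 0 \\ -\frac{1}{2} \\ 2 \\ -\frac{1}{2} \end{bmatrix}, \qquad
w_3 = \begin{bmatrix} \frac{2}{3} \\ -\frac{4}{3} \\ -\frac{1}{3} \\ 0 \end{bmatrix}.
\]
• Let $A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 0 & -2 \\ 0 & 2 & 1 \\ -1 & -1 & 0 \end{bmatrix}$. Applying the QR factorization, we find
\[
A = \begin{bmatrix} 2 & 1 & 0 \\ 1 & 0 & -2 \\ 0 & 2 & 1 \\ -1 & -1 & 0 \end{bmatrix}
= \begin{bmatrix}
\frac{\sqrt{6}}{3} & 0 & \frac{2\sqrt{21}}{21} \\
\frac{\sqrt{6}}{6} & -\frac{\sqrt{2}}{6} & -\frac{4\sqrt{21}}{21} \\
0 & \frac{2\sqrt{2}}{3} & -\frac{\sqrt{21}}{21} \\
-\frac{\sqrt{6}}{6} & -\frac{\sqrt{2}}{6} & 0
\end{bmatrix}
\begin{bmatrix}
\sqrt{6} & \frac{\sqrt{6}}{2} & -\frac{\sqrt{6}}{3} \\
0 & \frac{3\sqrt{2}}{2} & \sqrt{2} \\
0 & 0 & \frac{\sqrt{21}}{3}
\end{bmatrix} = QR
\]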
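The condensed form of the Gram-Schmidt Process translates almost line for line into code. The sketch below is an illustration with hypothetical function names, not the CAS routine mentioned in the Tech Tip: it runs the process exactly on the three vectors of the first example, then normalizes in floating point to recover $Q$ and $R$ for the second.

```python
from fractions import Fraction as F
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def gram_schmidt(xs):
    """v_i = x_i - proj_{W_{i-1}}(x_i), using the orthogonal vectors already built."""
    vs = []
    for x in xs:
        v = [F(a) for a in x]
        for u in vs:
            c = dot(x, u) / dot(u, u)          # coefficient of the projection onto u
            v = [a - c * b for a, b in zip(v, u)]
        vs.append(v)
    return vs


def qr_from_gram_schmidt(xs):
    """Return (columns of Q, rows of R) for the matrix whose columns are xs."""
    ws = gram_schmidt(xs)
    qs = [[float(a) / math.sqrt(dot(w, w)) for a in w] for w in ws]
    R = [[dot(q, x) if j >= i else 0.0 for j, x in enumerate(xs)]
         for i, q in enumerate(qs)]
    return qs, R


X = [[2, 1, 0, -1], [1, 0, 2, -1], [0, -2, 1, 0]]   # v1, v2, v3 (the columns of A)
W = gram_schmidt(X)
Q_cols, R = qr_from_gram_schmidt(X)

print(W)
for row in R:
    print([round(entry, 4) for entry in row])
```

The diagonal of `R` holds the lengths of the $w_i$, which is a quick way to connect the two examples in class.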
Tech Tip This is a good opportunity for the students to explore the built-in Gram-Schmidt function of their CAS.
Group Work 1: Cookie Dough Joe (Part 2)
This activity illustrates a method for building an orthogonal projection onto a subspace.
Answers
1. M is symmetric.
2. The linearity of the dot product and matrix multiplication imply that P is linear.
3. It suffices to show that $Pv_i = v_i$ for each $i = 1, \dots, k$. By definition
\[
Pv_i = [v_i \cdot v_1, \dots, v_i \cdot v_k]\, M^{-1} \begin{bmatrix} v_1 \\ \vdots \\ v_k \end{bmatrix}
\]
But $[v_i \cdot v_1, \dots, v_i \cdot v_k]$ is the $i$th row of $M$, and therefore $[v_i \cdot v_1, \dots, v_i \cdot v_k]\, M^{-1} = e_i^T$ (the $i$th standard unit row vector), whence $Pv_i = v_i$.
4. It suffices because the $v_i$’s form a basis for $W$.
5.
\[
Px \cdot v_i = \left([x \cdot v_1, \dots, x \cdot v_k]\, M^{-1} \begin{bmatrix} v_1 \\ \vdots \\ v_k \end{bmatrix}\right) \cdot v_i
= [x \cdot v_1, \dots, x \cdot v_k]\, M^{-1} \begin{bmatrix} v_1 \cdot v_i \\ \vdots \\ v_k \cdot v_i \end{bmatrix}.
\]
But the column $\begin{bmatrix} v_1 \cdot v_i \\ \vdots \\ v_k \cdot v_i \end{bmatrix}$ is the $i$th column of $M$, so $M^{-1} \begin{bmatrix} v_1 \cdot v_i \\ \vdots \\ v_k \cdot v_i \end{bmatrix} = e_i$, and therefore $Px \cdot v_i = x \cdot v_i$.
6. Since $Px \cdot v_i = x \cdot v_i$, we have $(x - Px) \cdot v_i = 0$.
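The construction in this activity can be tried on the plane from the Find the Error activity of Section 5.2, using its non-orthogonal basis directly. This is a minimal sketch with hypothetical function names. Rather than forming $M^{-1}$ explicitly, it solves $Ma = g$ with $g = [x \cdot v_1, \dots, x \cdot v_k]$, which gives the same coefficients because $M$ is symmetric.

```python
from fractions import Fraction as F


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def solve(M, b):
    """Solve M a = b by Gauss-Jordan elimination over the rationals (M assumed nonsingular)."""
    n = len(M)
    aug = [[F(x) for x in row] + [F(bi)] for row, bi in zip(M, b)]
    for col in range(n):
        pivot_row = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot_row] = aug[pivot_row], aug[col]
        pivot = aug[col][col]
        aug[col] = [x / pivot for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n] for row in aug]


def project(x, basis):
    """Orthogonal projection of x onto span(basis) via the matrix M = [v_i . v_j]."""
    M = [[dot(vi, vj) for vj in basis] for vi in basis]
    g = [dot(x, vi) for vi in basis]
    a = solve(M, g)
    return [sum(ai * vi[c] for ai, vi in zip(a, basis)) for c in range(len(x))]


basis = [[1, 1, 0], [1, 0, 1]]      # spans the plane x - y - z = 0; not orthogonal
x = [1, 2, 3]
Px = project(x, basis)
residual = [a - b for a, b in zip(x, Px)]
print(Px, [dot(residual, v) for v in basis])
```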
Group Work 2: Interpolation
Answers
1. Let $w = a_1 v_1 + \cdots + a_k v_k$. The $a_i$’s are to be solved for. The set of equations is
\begin{align*}
a_1\, v_1 \cdot v_1 + a_2\, v_1 \cdot v_2 + \cdots + a_k\, v_1 \cdot v_k &= c_1 \\
a_1\, v_2 \cdot v_1 + a_2\, v_2 \cdot v_2 + \cdots + a_k\, v_2 \cdot v_k &= c_2 \\
&\;\;\vdots \\
a_1\, v_k \cdot v_1 + a_2\, v_k \cdot v_2 + \cdots + a_k\, v_k \cdot v_k &= c_k
\end{align*}
2. The matrix equation is
\[
\begin{bmatrix}
v_1 \cdot v_1 & v_1 \cdot v_2 & \cdots & v_1 \cdot v_k \\
v_2 \cdot v_1 & v_2 \cdot v_2 & \cdots & v_2 \cdot v_k \\
\vdots & \vdots & \ddots & \vdots \\
v_k \cdot v_1 & v_k \cdot v_2 & \cdots & v_k \cdot v_k
\end{bmatrix}
\begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_k \end{bmatrix}
=
\begin{bmatrix} c_1 \\ c_2 \\ \vdots \\ c_k \end{bmatrix}.
\]
Let $M$ be the coefficient matrix, so that $M = [m_{ij}]$ with $m_{ij} = v_i \cdot v_j$.
3. $M$ is symmetric. $M$ should be nonsingular.
4. Yes, because $A$ has linearly independent columns.
5. $A^T = R^T Q^T$.
6. $A^T A = R^T Q^T Q R$. But $Q$ has orthonormal columns, so $Q^T Q = I$ and we have $A^T A = R^T R$.
7. $A^T A = R^T R$ is the product of two nonsingular matrices, and is hence nonsingular. Moreover, a quick check confirms that $M = A^T A$.
Suggested Core Assignment Exercises 2, 5, 7, 9, 10, 12, 18, 20P , 21
Group Work 1, Section 5.3 Cookie Dough Joe (Part 2) Starting with a subspace W of Rn , we learn a simple construction method for building an orthogonal projection onto W . Let {v1 , . . . , vk } be a basis for W . There is a matrix associated with this basis that will be important to us; we will study this matrix later. For now, we simply define the k × k matrix M = [mij ] where mij = vi · vj . 1. State one obvious property of M .
If $\{v_1, \dots, v_k\}$ is a basis, then $M$ is nonsingular. The verification of this fact is covered in Group Work 2. Here is how we define our orthogonal projection $P$:
\[
Px = [x \cdot v_1, \dots, x \cdot v_k]\, M^{-1} \begin{bmatrix} v_1 \\ \vdots \\ v_k \end{bmatrix}
\]
2. Verify that P is a linear transformation.
3. Verify that P w = w for every w ∈ W.
4. We now want to verify that x − P x is orthogonal to W. Why does it suffice to merely check (x − P x) · vi = 0 for i = 1, . . . , k?
5. Use the definition of P to simplify the expression P x · vi .
6. Use this simplified form of P x · vi to conclude that x − P x is orthogonal to W .
Group Work 2, Section 5.3
Interpolation
We know that if {v1 , . . . , vk } is a basis for W, then every element of W is a unique linear combination of these basis elements. Suppose that each element of W can be treated as a function (on some domain) and that x1 , . . . , xk are allowable “inputs” for all elements of W. If c1 , . . . , ck are arbitrary real numbers, is it true that there exists an element w ∈ W such that w (xi ) = ci for i = 1, . . . , k? If this were always true, then this would endow v1 , . . . , vk with a special “interpolation property”, a property which is stronger than linear independence. In this activity we consider a specific case of this interpolation problem: let {v1 , . . . , vk } be a basis of W and let c1 , . . . , ck be real numbers; we seek an element w ∈ W such that w (vi ) = ci, where we define w (vi ) = vi · w.
1. Write the linear system corresponding to this problem.
2. Write the matrix equation corresponding to this system; use M to denote the coefficient matrix. Be sure
to indicate the form of the elements of M .
3. What matrix property does M have? What matrix property does M need in order to guarantee that the
linear system will have a unique solution?
4. Let’s prove that M is nonsingular using QR factorization. Let A be the matrix whose ith column is the
basis vector vi . Is the QR factorization valid for A? Why?
5. Let A = QR be this factorization. Use this factorization to express AT .
6. Use the above to simplify AT A.
7. Use this to conclude that M is nonsingular.
5.4
Orthogonal Diagonalization of Symmetric Matrices
Suggested Time and Emphasis 1 class. Optional material.
Points to Stress 1. Definition of orthogonally diagonalizable. 2. The eigenvalues of a symmetric matrix are real. 3. A matrix is symmetric if and only if it is orthogonally diagonalizable. 4. The spectral decomposition of a symmetric matrix.
Drill Question Is every diagonalizable matrix symmetric? Answer No, but every symmetric matrix is diagonalizable.
Discussion Question
(a) Is it possible to have a matrix which is not symmetric, all of whose eigenvalues are real?
(b) Is it possible to have a matrix which is diagonalizable but not symmetric?
Answer
(a) Yes; consider an upper-triangular matrix.
(b) Yes.
Test Question
Does the matrix $A = \begin{bmatrix} 1 & 2 \\ 2 & 3 \end{bmatrix}$ have a spectral decomposition? Why or why not?
Answer
Yes, because it is symmetric.
Lecture Notes
• Emphasize that the Spectral Theorem equates symmetric matrices with those that are orthogonally diagonalizable. Do this by exhibiting a non-symmetric matrix which is diagonalizable but not orthogonally diagonalizable, such as $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 2 & 3 \\ 0 & 0 & 3 \end{bmatrix}$.
• Suppose A is an n × n diagonalizable matrix with a single distinct eigenvalue. Must A be symmetric? Answer: Since A is diagonalizable, it must possess n linearly independent eigenvectors, all of which belong to the same eigenspace. Therefore, using Gram-Schmidt, we can create an orthogonal Q which diagonalizes A; thus A is symmetric.
• Construct a Venn diagram showing the relationship between matrices with real eigenvalues, symmetric matrices, diagonalizable matrices and orthogonally diagonalizable matrices within the universe of real matrices. Here is how the picture should look:
[Figure: Venn diagram with regions labeled Real Matrices, Diagonalizable, Real Eigenvalues, Symmetric, and Orthogonally Diagonalizable]
• Spectral Decomposition is often used in Principal Component Analysis.
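Lecture Examples
• A nonsymmetric matrix with imaginary eigenvalues: $\begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$ has eigenvalues $\pm i$.
• A symmetric matrix with real eigenvalues: $\begin{bmatrix} 0 & 1 & -1 \\ 1 & 1 & 2 \\ -1 & 2 & 1 \end{bmatrix}$ has eigenvalues 1, −2, and 3.
• Orthogonal diagonalization of a matrix:
\[
\begin{bmatrix} 0 & 1 & -1 \\ 1 & 1 & 2 \\ -1 & 2 & 1 \end{bmatrix}
=
\begin{bmatrix}
\frac{\sqrt{3}}{3} & 0 & -\frac{\sqrt{6}}{3} \\
-\frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} & -\frac{\sqrt{6}}{6} \\
\frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} & \frac{\sqrt{6}}{6}
\end{bmatrix}
\begin{bmatrix} -2 & 0 & 0 \\ 0 & 3 & 0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix}
\frac{\sqrt{3}}{3} & 0 & -\frac{\sqrt{6}}{3} \\
-\frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} & -\frac{\sqrt{6}}{6} \\
\frac{\sqrt{3}}{3} & \frac{\sqrt{2}}{2} & \frac{\sqrt{6}}{6}
\end{bmatrix}^{T}
\]
• Spectral decomposition of a symmetric matrix:
\[
A = -2 \begin{bmatrix} \frac{\sqrt{3}}{3} \\ -\frac{\sqrt{3}}{3} \\ \frac{\sqrt{3}}{3} \end{bmatrix}
\begin{bmatrix} \frac{\sqrt{3}}{3} & -\frac{\sqrt{3}}{3} & \frac{\sqrt{3}}{3} \end{bmatrix}
+ 3 \begin{bmatrix} 0 \\ \frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} \end{bmatrix}
\begin{bmatrix} 0 & \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{bmatrix}
+ 1 \begin{bmatrix} -\frac{\sqrt{6}}{3} \\ -\frac{\sqrt{6}}{6} \\ \frac{\sqrt{6}}{6} \end{bmatrix}
\begin{bmatrix} -\frac{\sqrt{6}}{3} & -\frac{\sqrt{6}}{6} & \frac{\sqrt{6}}{6} \end{bmatrix}
\]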
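Since a spectral decomposition is just a weighted sum of projection matrices, it can be assembled without a dedicated CAS routine. The sketch below is an assumption-laden illustration: the names are hypothetical, and it uses the unnormalized eigenvectors $[1, -1, 1]$, $[0, 1, 1]$, $[-2, -1, 1]$ together with the identity $q q^T / (q \cdot q)$ for the projection onto the line through $q$, which avoids square roots and keeps the arithmetic exact.

```python
from fractions import Fraction as F


def line_projection(q):
    """Matrix q q^T / (q . q): the orthogonal projection onto the line spanned by q."""
    norm_squared = sum(a * a for a in q)
    return [[F(a * b, norm_squared) for b in q] for a in q]


def spectral_sum(pairs):
    """Sum of lambda * (projection onto eigenvector) over (lambda, eigenvector) pairs."""
    n = len(pairs[0][1])
    total = [[F(0)] * n for _ in range(n)]
    for lam, q in pairs:
        Pq = line_projection(q)
        total = [[total[r][c] + lam * Pq[r][c] for c in range(n)] for r in range(n)]
    return total


# Eigenvalue / eigenvector pairs for the lecture example (eigenvectors are mutually orthogonal).
pairs = [(-2, [1, -1, 1]), (3, [0, 1, 1]), (1, [-2, -1, 1])]
A = spectral_sum(pairs)
for row in A:
    print([str(entry) for entry in row])
```

The same function, fed any orthogonal set of eigenvectors with chosen eigenvalues, produces a symmetric matrix with that eigenstructure.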
Tech Tips • Most CAS’s do not have a built-in function to perform a spectral decomposition. Have the students find such a decomposition for a 6 × 6 symmetric matrix.
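Group Work 1: The Missing Matrix
If a group finishes early, ask them if their solution to Problem 3 is unique.
Answers
1. Answers will vary, but one possible formulation is
\[
w_1 = \begin{bmatrix} \frac{2\sqrt{5}}{5} \\ \frac{\sqrt{5}}{5} \\ 0 \end{bmatrix} \qquad
w_2 = \begin{bmatrix} \frac{\sqrt{30}}{30} \\ -\frac{\sqrt{30}}{15} \\ \frac{\sqrt{30}}{6} \end{bmatrix} \qquad
w_3 = \begin{bmatrix} \frac{\sqrt{6}}{6} \\ -\frac{\sqrt{6}}{3} \\ -\frac{\sqrt{6}}{6} \end{bmatrix}
\]
2. Yes. A has 3 orthonormal eigenvectors; therefore A is orthogonally diagonalizable and thus symmetric.
3. Answers will vary; one is
\[
A = \begin{bmatrix}
\frac{41}{30} & -\frac{11}{15} & -\frac{1}{6} \\
-\frac{11}{15} & \frac{37}{15} & \frac{1}{3} \\
-\frac{1}{6} & \frac{1}{3} & \frac{13}{6}
\end{bmatrix}
\]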
Group Work 2: The Missing Group Work Suggested Core Assignment Exercises 2, 6, 10, 12, 13P , 18, 24, 27P
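Group Work 1, Section 5.4
The Missing Matrix
1. Let S be the space spanned by the vectors
\[
v_1 = \begin{bmatrix} 2 \\ 1 \\ 0 \end{bmatrix} \qquad
v_2 = \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix} \qquad
v_3 = \begin{bmatrix} 0 \\ -2 \\ 1 \end{bmatrix} \qquad
v_4 = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}
\]
Find an orthonormal basis for S.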
2. Let A be a 3 × 3 matrix with eigenspace S and eigenvalues 1, 2, and 3. Must A be symmetric? Why?
3. Find a possible A.
5.5
Applications
Suggested Time and Emphasis 1 class. Optional material.
Points to Stress 1. Dual codes (provided the students have read Section 3.6). 2. Quadratic forms. 3. Rotated conic sections.
Drill Questions
1. Consider the (5, 3) generator matrix $G = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 0 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}$. Why isn’t it in standard form?
2. Which of the following are quadratic forms in x, y, and z?
(a) $2x^2 + 2y^2 + 2z^2$
(b) $2x^2 + 2xy + 2z^2$
(c) $2x^2 + xy + z^2$
(d) $x^2 + 2x + 1 + y^2 + z^2$
(e) $x + y + z + x^2 + y^2 + z^2$
(f) $xy + yz$
3. Identify each conic section as standard position, translated, or rotated.
(a) $x^2 + y^2 + 2x + 1 = 0$
(b) $x^2 - y^2 - 4 = 0$
(c) $x^2 + 2xy + y^2 + 1 = 0$
Answers
1. The first three rows do not form the identity matrix.
2. (a), (b), (c), (f)
3. (a) Translated
(b) Standard position
(c) Rotated
Discussion Questions
1. What are some advantages of dealing with a self-dual code?
2. Why is the concept of maximizing a quadratic form subject to the constraint ||x|| = 1 an important special case?
Answers
1. It gives us another way of checking errors; the dot product of any two received vectors should be zero.
2. The idea is that if we are maximizing one of these functions over the space of vectors x, the special case answers the natural question, “What happens if we fix the magnitude of our input, and just vary the direction?”
Lecture Notes • Any code examples presented in Section 3.6 should be reviewed. The duals of these codes can now be found.
• Notice that in Section 3.6, we found that a matrix G in a certain form creates an error-correcting code, and that there is an associated matrix P that acts as the parity-check for that code. In this section, we now show that there are other matrices that generate the same code vectors. The only requirements for a pair of matrices G and P to generate an error-correcting code are that G has linearly independent rows, that P has linearly independent columns, and that P G = 0. It is then shown how to put these arbitrary matrices in the form discussed in Section 3.6. • Once one has found G and P for a code C , finding C ⊥ becomes trivial, by Theorem 1. • If the students are taking multivariable calculus, point out that they will look at (or have already looked at) constrained optimization problems. Note that one could do the examples in the text without using Theorem 5 (by solving for one variable in the constraint equation, using that to simplify the function f , and using partial derivatives), but Theorem 5 is much quicker.
• The text points out that the origin is a minimum for a positive definite quadratic form, and a maximum for a negative definite quadratic form. When the form is indefinite, the origin is called a saddle point.
• Show students a series of conic sections that approach a degenerate conic section:
[Figure: two rows of five graphs. First row: the hyperbolas $y^2 - x^2 + c = 0$ for $c = 1, \tfrac{1}{2}, \tfrac{1}{8}, \tfrac{1}{128}, \tfrac{1}{512}$. Second row: the parabolas $ax^2 - y + 1 = 0$ for $a = 1, \tfrac{1}{2}, \tfrac{1}{4}, \tfrac{1}{16}, \tfrac{1}{512}$.]
Lecture Examples
• Finding $C^\perp$: Let $G = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$ and $P = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}$. Then $C = \left\{ \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix} \right\}$ (see Example 6 in Section 3.6). We now find $C^\perp$: $G^\perp = \begin{bmatrix} 1 & 0 \\ 0 & 1 \\ 1 & 1 \end{bmatrix}$ and $P^\perp = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix}$, so $C^\perp = \left\{ \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} \right\}$ (two bit words plus a parity bit). Note that this dual code is not error-correcting.
• Classifying a quadratic form: Let $f(x, y, z) = -x^2 - 6y^2 - 4z^2 + 4xz + 4xy - 8yz$. The associated matrix is $A = \begin{bmatrix} -1 & 2 & 2 \\ 2 & -6 & -4 \\ 2 & -4 & -4 \end{bmatrix}$, so $f(x, y, z) = \begin{bmatrix} x & y & z \end{bmatrix} \begin{bmatrix} -1 & 2 & 2 \\ 2 & -6 & -4 \\ 2 & -4 & -4 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}$. The eigenvalues of $A$ are 0, −1, and −10. This is a negative semi-definite form.
• A translated and rotated hyperbola:
[Figure: three graphs, captioned as follows.]
The hyperbola $x^2 - 2y^2 + 1 = 0$ in standard position.
The hyperbola, translated: $x^2 + 2x - 2y^2 + 8y - 6 = 0 \Rightarrow (x + 1)^2 - 2(y - 2)^2 + 1 = 0$.
The original hyperbola, rotated 30° counterclockwise: $\tfrac{1}{4}x^2 + \tfrac{3\sqrt{3}}{2}xy - \tfrac{5}{4}y^2 + 1 = 0$.
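The classification of the quadratic form $f(x, y, z) = -x^2 - 6y^2 - 4z^2 + 4xz + 4xy - 8yz$ in the second example can be confirmed with integer arithmetic alone. This sketch (hypothetical function names) checks that $\det(A - \lambda I)$ vanishes at the three stated eigenvalues, and evaluates the form at $[2, 0, 1]$, a vector assumed here to lie in the null space of $A$, to show that the form reaches 0 away from the origin.

```python
A = [[-1, 2, 2], [2, -6, -4], [2, -4, -4]]


def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def char_poly_at(lam):
    """Value of det(A - lam * I); zero exactly when lam is an eigenvalue of A."""
    shifted = [[A[r][c] - (lam if r == c else 0) for c in range(3)] for r in range(3)]
    return det3(shifted)


def quadratic_form(v):
    """Value of v^T A v."""
    return sum(v[r] * A[r][c] * v[c] for r in range(3) for c in range(3))


print([char_poly_at(lam) for lam in (0, -1, -10)])   # all zero
print(quadratic_form([2, 0, 1]))                     # zero at a nonzero vector
print(quadratic_form([1, 0, 0]), quadratic_form([0, 1, 0]))  # negative elsewhere
```

Eigenvalues that are all nonpositive, with 0 among them, are what make the form negative semi-definite rather than negative definite.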
Tech Tip All the applications in this section can be explored for larger dimensions by using a CAS. A CAS can allow students to animate conic sections with a parameter, such as x2 − y 2 + k = 0 or x2 + kxy + y2 = 0, and thus see how each term in a conic section affects the shape.
Group Work Any of the homework problems in the sections you cover are adaptable to group activities. Make sure to do an example first, to make sure the students at least know where to begin.
Suggested Core Assignment
Dual Codes: Exercises 2, 6, 12, 14, 18, 19
Quadratic Forms: Exercises 24, 26, 30, 32, 33, 36, 37, 41, 42, 43, 44, 52P, 55, 56, 68, 69, 92
6 Vector Spaces 6.1
Vector Spaces and Subspaces
Suggested Time and Emphasis 1 class. Essential material.
Points to Stress 1. The definition of a vector space, stressing concepts such as closure and the zero vector. 2. Specific vector spaces such as Rn , Pn , and Mm×n , and the idea that these are all particular examples of
the abstract concept of “vector space.” 3. The idea that if we prove something about abstract vector spaces, we have learned something that applies to all specific examples of vector spaces.
Drill Question The text gives many examples of vector spaces. Describe two of them.
Discussion Question Consider the hierarchy of subspaces of F given in Figure 2. Can we narrow it down further? How? Answer There are many ways to find subspaces of P . Notice that the degree-two polynomials do not form a vector space (no zero, not closed under addition) but the polynomials of second degree or less do.
Test Question Give an example of a subspace of P4 . Then give an example of an infinite subset of P4 that is not a vector space. Answer Polynomials of degree 2 or less, polynomials of degree exactly 2
Lecture Notes
• After discussing several of the examples of vector spaces in the text, show how general results (such as Theorem 1(d)) apply to each of the examples. The idea of Chapter 6 is that we are going to be proving things about abstract vector spaces, and these theorems will apply to the examples of vector spaces given in the text, and any other vector space we can think of. Pick three very different vector spaces (for example, column vectors in R3, 2 × 3 matrices, and P2) and tell the class that you will be going back to these spaces again and again. Have students verify that these are all vector spaces, just to make sure they gain a familiarity with them.
• Give some examples of sets that are not vector spaces, such as the set of discontinuous functions. Have the class check the space of functions f satisfying f (0) = 1 and the space of functions f satisfying f (0) = 0.
• Note that given a finite set of vectors in a space V, the span of the set is a vector space (Theorem 3). Point out that it is not always easy to go in the other direction: given a subspace of V, it is not always possible to find a nice, finite set of vectors whose span is that subspace. For example, F is very difficult to describe as a span of a small set of vectors.
• This is an unusual vector space: Let V consist of the set of all functions f such that $f'' + xf' + (\sin x) f = 0$. Even though we aren’t necessarily able to find specific functions that meet this criterion, we can show that V is a vector space: Because V ⊂ F, we need only show that if f ∈ V, then cf ∈ V, and that if f1 , f2 ∈ V then (f1 + f2 ) ∈ V. Both of these statements are easy to show by direct substitution, followed by the use of the sum and constant multiple rules of differentiation.
Lecture Examples Note that this section has 21 examples, so it may not be necessary to supplement. • Let Σ be the set of all sequences. An element v of Σ will look like this: {v0 , v1 , v2 , v3 , . . .} with vi ∈ R. Addition and scalar multiplication are defined as follows: cv = {cv0 , cv1 , cv2 , cv3 , . . .} and v + w = {v0 + w0 , v1 + w1 , v2 + w2 , . . .}. One can verify that Σ is a vector space. One subspace of Σ is the set of finite power sequences: sequences for which there exists an N such that vi = 0 for all i > N . Another interesting subspace of Σ is the set of convergent sequences. The set of alternating sequences is not a vector space (it is not closed under addition), and restricting to alternating sequences whose first term is positive does not help: that set is closed under addition, but it has no zero vector and is not closed under multiplication by negative scalars. The set of sequences that converge to zero is a vector space, but the set of sequences that converge to 1 is not a vector space.
Group Work 1: Is It a Vector Space? Begin by going over a few of the text examples with the students, or Exercises 6 and 7. If a group finishes early, have them see if they can come up with an unusual vector space on their own, with a non-standard addition or scalar multiplication. Close by pointing out how much quicker it is to prove that something is not a vector space than to prove that it is. Answers 1. S is not a vector space because it violates Property 4. 2. S is not a vector space because it violates Property 8. 3. S is a vector space. 4. S is not a vector space because it violates Properties 9 and 10.
Group Work 2: Rotations and Reflections This activity explores a vector space that comes up in group theory and combinatorics. The first version is more open-ended (and thus a bit more difficult) than the second. Depending on your class, this activity may require a detailed introduction, perhaps actually starting it as a class before breaking up into groups to finish.
Section 6.1 Vector Spaces and Subspaces
Answers Version 1 1. We can define addition of rotations as doing one rotation, followed by the other. That is, v15 + v30 = v45 .
Note that our addition is naturally going to wind up modulo 360: v270 + v180 = v90 , v180 + v180 = v0 . We define scalar multiplication similarly: avθ = vaθ . Again, this necessarily turns out to be modulo 360. 2. (a) No. For example, v10 + w is not in S ∪ {w}.
(b) Yes, it does. Students should verify all ten properties of vector spaces, making algebraic or geometric arguments. Version 2 1. Students can interpret the ten properties algebraically or geometrically. 2. See Version 1.
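The arithmetic in the Version 1 answers can be checked with a short script. This is only a sketch of the modulo-360 bookkeeping, assuming each rotation v_θ is represented by its angle in degrees; the function names `add` and `scale` are hypothetical helpers, not notation from the text, and the script does not verify the ten vector space properties.

```python
def add(theta1, theta2):
    """v_theta1 + v_theta2: one rotation followed by the other, reduced modulo 360."""
    return (theta1 + theta2) % 360


def scale(a, theta):
    """a * v_theta: rotate by theta a total of a times, reduced modulo 360."""
    return (a * theta) % 360


# Examples taken from the answers and the handouts
print(add(15, 30))    # v15 + v30
print(add(270, 180))  # v270 + v180
print(add(180, 180))  # v180 + v180
print(scale(10, 45))  # 10 v45
```

Students can use the same two functions to test individual properties on sample angles before attempting a general argument.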
Suggested Core Assignment Exercises 1, 3, 7, 12P , 25, 28, 33, 40, 45, 47, 49P , 50P , 54, 62P
Group Work 1, Section 6.1 Is It a Vector Space? Determine if each of the following structures is or is not a vector space. Justify your answers.
1. S = R3 . We define scalar multiplication as usual. We define addition this way:
\[ \begin{bmatrix} v_1 \\ v_2 \\ v_3 \end{bmatrix} + \begin{bmatrix} w_1 \\ w_2 \\ w_3 \end{bmatrix} = \begin{bmatrix} v_1 + w_1 \\ v_2 + w_2 \\ 0 \end{bmatrix} \]
2. S = M2×3 . We define addition as usual. We define scalar multiplication this way:
\[ k \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = \begin{bmatrix} a_{11} & a_{12} & k a_{13} \\ k a_{21} & k a_{22} & k a_{23} \end{bmatrix} \]
3. S is the set of all 4 × 4 matrices in upper triangular form. Addition and scalar multiplication have the usual definitions.
4. S = M2×3 . We define addition as usual. We define scalar multiplication this way:
\[ k \begin{bmatrix} a_{11} & a_{12} & a_{13} \\ a_{21} & a_{22} & a_{23} \end{bmatrix} = \begin{bmatrix} k a_{22} & k a_{12} & k a_{13} \\ k a_{21} & k a_{11} & k a_{23} \end{bmatrix} \]
Group Work 2, Section 6.1 Rotations and Reflections (Version 1) We wish to display a highly expensive circular bejeweled bracelet:
[Figure: the circular bracelet, with its ruby, diamonds, and emeralds labeled]
There is nothing magical about having the ruby at the top. We can rotate the pendant by 90◦ clockwise, to make it look like this:
[Figure: the bracelet rotated 90◦ clockwise]
We can write down all possible rotations of the pendant this way: v90 = rotate the pendant by 90◦ clockwise vθ = rotate the pendant by θ◦ clockwise, where 0 < θ < 360 v0 = leave it alone and find something else to do with your time
Notice that if we were to get all radical and rotate it by 90◦ counterclockwise, that would be equivalent to v270 . Similarly, rotating it by 450◦ clockwise would be equivalent to v90 . 1. Your challenge is to make this set of rotations into a vector space. You are going to have to define addition
(What does v15 + v30 mean? Compute v180 + v180 ) and scalar multiplication (what does 8v15 mean?) Once you’ve defined these operations, prove that you have come up with a vector space.
2. Let S be the set of vectors in your vector space. I am going to add a new vector w.
w = flip the pendant across a vertical diameter
So, if we apply w to the pendant, we get this picture:
(a) Does S ∪ {w} form a vector space? Why or why not?
(b) Does span (S ∪ {w}) form a vector space?
Group Work 2, Section 6.1 Rotations and Reflections (Version 2) We wish to display a highly expensive circular bejeweled bracelet:
[Figure: the circular bracelet, with its ruby, diamonds, and emeralds labeled]
There is nothing magical about having the ruby at the top. We can rotate the pendant by 90◦ clockwise, to make it look like this:
[Figure: the bracelet rotated 90◦ clockwise]
We can write down all possible rotations of the pendant this way: v90 = rotate the pendant by 90◦ clockwise vθ = rotate the pendant by θ◦ clockwise, where 0 < θ < 360 v0 = leave it alone and find something else to do with your time
Notice that if we were to get all radical and rotate it by 90◦ counterclockwise, that would be equivalent to v270 . Similarly, rotating it by 450◦ clockwise would be equivalent to v90 . 1. Our challenge is to make this set of rotations into a vector space.
We can define addition naturally: v15 + v30 = v45 . That is, we first rotate by 15◦ , then by 30◦ , making a total of 45◦ . We have to be a bit careful here: v180 + v270 = v90 . (Why?) We can define scalar multiplication naturally, too: 3v15 = v45 . We rotate by 15◦ three times, for a total of 45◦ . Again, care must be taken: 10v45 = v90 . (Why?) Show that our vectors, along with our definitions of vector addition and scalar multiplication, form a vector space.
2. Let S be the set of vectors in your vector space. I am going to add a new vector w.
w = flip the pendant across a vertical diameter
So, if we apply w to the pendant, we get this picture:
(a) Does S ∪ {w} form a vector space? Why or why not?
(b) Does span (S ∪ {w}) form a vector space?
6.2
Linear Independence, Basis, and Dimension
Suggested Time and Emphasis 1 class. Recommended material.
Points to Stress 1. Definitions of linear dependence, linear independence, basis, and dimension for general vector spaces. 2. Applications of these concepts to a few specific vector spaces other than Rn . 3. Definition of the coordinate vector of v with respect to B: [v]B . 4. Theorem 7: the consequences of a vector space having dimension n.
Drill Question
Let $\mathcal{B} = \{1 + x,\ x + x^2,\ 1 + x^2\}$ be a basis of P2 . It is a fact that
\[ (1 + x) + 2(x + x^2) + 3(1 + x^2) = 5x^2 + 3x + 4. \]
Now let $v = 5x^2 + 3x + 4$. Find the coordinate vector $[v]_{\mathcal{B}}$ of v with respect to $\mathcal{B}$.
Answer $[v]_{\mathcal{B}} = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$
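When the linear combination is not handed to students, the coordinate vector comes from solving a linear system. The sketch below assumes each polynomial is stored as its coefficient list (constant, x, x^2) and uses a small hand-written Gauss-Jordan routine over exact fractions; `solve` is a hypothetical helper written for this illustration and assumes the coefficient matrix is invertible.

```python
from fractions import Fraction


def solve(matrix, rhs):
    """Solve a square, invertible system by Gauss-Jordan elimination over exact fractions."""
    n = len(matrix)
    aug = [[Fraction(x) for x in row] + [Fraction(r)] for row, r in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it into place
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Clear the column in every other row
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[-1] for row in aug]


# Columns are the basis polynomials 1 + x, x + x^2, 1 + x^2 in coordinates (1, x, x^2)
basis_columns = [[1, 0, 1],
                 [1, 1, 0],
                 [0, 1, 1]]
v = [4, 3, 5]  # 4 + 3x + 5x^2

coords = solve(basis_columns, v)
print(coords)
```

The entries of `coords` are the weights on 1 + x, x + x^2, and 1 + x^2, in that order.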
Discussion Question Since every vector in a vector space can be written as a coordinate vector, why don’t we just study Rn exclusively? Answer If all we care about is the way that the vectors behave based on the vector space operation, we could do this. But often there are operations (such as differentiating elements of the vector space F ) that don’t have a nice analogy in Rn .
Test Question
Consider the vector space P2 of polynomials of degree at most 2. Does the set of vectors $\{x^2 + 1,\ x + 1,\ x^2 + x\}$ form a basis? Answer Yes
Lecture Notes • Return to three of the vector spaces discussed in the previous section (for example, column vectors in R3 , 2 × 3 matrices, and P2 ) and find linearly independent sets of vectors in these spaces, linearly dependent sets, and bases for the spaces.
• As suggested in Section 6.1, let V consist of the origin and all line segments in the plane that have one end at the origin. Call the other end of a segment the terminating point. We define addition of two segments by coordinate-wise addition of their non-origin ends. Define scalar multiplication by coordinate-wise multiplication. Note that two nonzero segments form a linearly independent set exactly when they do not lie on the same line through the origin. Have the students try to come up with a basis for this vector space. One simple one is the segment from (0, 0) to (1, 0), and the segment from (0, 0) to (0, 1). If v is the segment from (0, 0) to (a, b), then the coordinate vector of v with respect to this basis is $\begin{bmatrix} a \\ b \end{bmatrix}$.
• Notice that Theorem 3 allows us to consider all vector spaces of a given dimension to be somehow the same. Intuitively, students should notice the similarity between these statements:
\[ \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix} + 2 \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 3 \\ 2 \\ 3 \\ 6 \end{bmatrix} \]
\[ \begin{bmatrix} 1 & 2 & 3 & 4 \end{bmatrix} + 2 \begin{bmatrix} 1 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 2 & 3 & 6 \end{bmatrix} \]
\[ (1 + 2x + 3x^2 + 4x^3) + 2(1 + x^3) = 3 + 2x + 3x^2 + 6x^3 \]
\[ \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + 2 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} 3 & 2 \\ 3 & 6 \end{bmatrix} \]
• Have the students come up with examples illustrating the six parts of Theorem 7. Each statement is actually very strong, and the differences between them can seem subtle to students.
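Lecture Examples • Coordinate vectors for a vector space that is not Rn :
Let V be the subspace of F determined by all solutions to the differential equation $\dfrac{d^2 y}{dx^2} = -y$, that is, the set of all functions y = A sin x + B cos x. One basis for this subspace: $\mathcal{B}_1 = \{\sin x, \cos x\}$. Another basis for this subspace: $\mathcal{B}_2 = \{\sin x + \cos x, \sin x - \cos x\}$. Let u = sin x, v = 2 sin x − 4 cos x. Then
\[ [u]_{\mathcal{B}_1} = \begin{bmatrix} 1 \\ 0 \end{bmatrix}, \qquad [v]_{\mathcal{B}_1} = \begin{bmatrix} 2 \\ -4 \end{bmatrix}, \qquad [u]_{\mathcal{B}_2} = \begin{bmatrix} \tfrac12 \\ \tfrac12 \end{bmatrix}, \qquad [v]_{\mathcal{B}_2} = \begin{bmatrix} -1 \\ 3 \end{bmatrix} \]
\[ [u + v]_{\mathcal{B}_1} = [3 \sin x - 4 \cos x]_{\mathcal{B}_1} = \begin{bmatrix} 3 \\ -4 \end{bmatrix} = [u]_{\mathcal{B}_1} + [v]_{\mathcal{B}_1} \]
\[ [u + v]_{\mathcal{B}_2} = [3 \sin x - 4 \cos x]_{\mathcal{B}_2} = \begin{bmatrix} -\tfrac12 \\ \tfrac72 \end{bmatrix} = [u]_{\mathcal{B}_2} + [v]_{\mathcal{B}_2} \]
\[ [8u]_{\mathcal{B}_2} = [8 \sin x]_{\mathcal{B}_2} = \begin{bmatrix} 4 \\ 4 \end{bmatrix} = 8 \begin{bmatrix} \tfrac12 \\ \tfrac12 \end{bmatrix} = 8 [u]_{\mathcal{B}_2} \]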
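• A linearly dependent set in V : {w1 = sin x + 5 cos x, w2 = 2 sin x + 10 cos x}. Corresponding linearly dependent coordinate vectors:
\[ [w_1]_{\mathcal{B}_1} = \begin{bmatrix} 1 \\ 5 \end{bmatrix}, \qquad [w_2]_{\mathcal{B}_1} = \begin{bmatrix} 2 \\ 10 \end{bmatrix}, \qquad [w_1]_{\mathcal{B}_2} = \begin{bmatrix} 3 \\ -2 \end{bmatrix}, \qquad [w_2]_{\mathcal{B}_2} = \begin{bmatrix} 6 \\ -4 \end{bmatrix} \]
Note that $\mathcal{B}_1$ and $\mathcal{B}_2$ are bases because they are linearly independent and span V . They conform to Theorem 6, since they both are sets of two vectors, which means that V has dimension 2. Therefore any set of two linearly independent vectors, even $\{\pi \sin x + e \cos x,\ \sqrt{2} \sin x + \tfrac{22}{7} \cos x\}$, automatically forms a basis.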
Tech Tips • Explore the relevant built-in functions on a computer algebra system
Group Work: Wronski Beat Note that this uses Exercise 15 from the text. Answers
1. No; sin 2x = 2 sin x cos x.
2. Yes, but this is not easy to show at this point.
3. $W(x) = e^x (\cos x - \sin x)$. Linearly independent.
4. W (x) = 0. Linearly dependent.
5. $W(x) = 3 \sin^3 x \cos x + 3 \cos^3 x \sin x$. Linearly independent.
6. We prove the contrapositive: if the functions are not linearly independent, then W = 0. Let g = kf , then take
\[ W = \begin{vmatrix} f & kf \\ f' & kf' \end{vmatrix} = k f f' - k f f' = 0. \]
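Answers 3, 4, and 5 can be spot-checked numerically. The following is a minimal sketch, assuming the derivatives of each function are typed in by hand (no symbolic differentiation is performed); `det` and `wronskian` are hypothetical helpers written for this illustration, and agreement at a few sample points is a sanity check, not a proof.

```python
import math


def det(m):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total


def wronskian(rows, x):
    """rows[i][j] is the i-th derivative of the j-th function, supplied by hand."""
    return det([[f(x) for f in row] for row in rows])


# Problem 3: {e^x, sin x}
w3 = [[math.exp, math.sin],
      [math.exp, math.cos]]

# Problem 4: {e^x, 2e^x + 1, e^x - 2}
w4 = [[math.exp, lambda x: 2 * math.exp(x) + 1, lambda x: math.exp(x) - 2],
      [math.exp, lambda x: 2 * math.exp(x), math.exp],
      [math.exp, lambda x: 2 * math.exp(x), math.exp]]

# Problem 5: {sin x, cos x, sin x cos x}; note (sin x cos x)' = cos 2x and (sin x cos x)'' = -2 sin 2x
w5 = [[math.sin, math.cos, lambda x: math.sin(x) * math.cos(x)],
      [math.cos, lambda x: -math.sin(x), lambda x: math.cos(2 * x)],
      [lambda x: -math.sin(x), lambda x: -math.cos(x), lambda x: -2 * math.sin(2 * x)]]

for x in (0.3, 0.7, 2.0):
    print(x, wronskian(w3, x), wronskian(w4, x), wronskian(w5, x))
```

The middle column should be zero up to rounding, while the other two columns should match the closed forms in Answers 3 and 5.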
Suggested Core Assignment Exercises 3, 8, 15P , 17, 21, 23, 28, 34, 37, 43P , 46, 54P , 59P
Group Work, Section 6.2 Wronski Beat Applying standard function addition and scalar multiplication to the set of all infinitely differentiable functions forms a vector space, a subspace of F . Sometimes it is hard to determine if a set of vectors in F is linearly independent. Let’s try some examples: 1. Is S = {sin x, sin x cos x, sin 2x} a linearly independent set?
2. Is R = {sin x, cos x, sin x cos x} a linearly independent set?
Now that you have seen that these problems are a bit tricky, you are ready for the good news. There is a way of telling if a set of functions is linearly independent. We take our collection of functions {fi (x)} and find a new function called the Wronskian W (x). If W (x) = 0, the zero function, then the vectors are linearly dependent. Otherwise, they are linearly independent. The Wronskian of {f1 , f2 , . . . , fn } is given by
\[ W(x) = \begin{vmatrix} f_1(x) & f_2(x) & \cdots & f_n(x) \\ f_1'(x) & f_2'(x) & \cdots & f_n'(x) \\ \vdots & \vdots & \ddots & \vdots \\ f_1^{(n-1)}(x) & f_2^{(n-1)}(x) & \cdots & f_n^{(n-1)}(x) \end{vmatrix} \]
provided all the derivatives exist and are smooth.
3. Find the Wronskian of the set $\{e^x, \sin x\}$. Is this set linearly independent?

4. Find the Wronskian of the set $\{e^x,\ 2e^x + 1,\ e^x - 2\}$. Is this set linearly independent?
5. Find the Wronskian of the set R from Problem 2. Is this set linearly independent?
6. Prove that f and g are linearly independent if their Wronskian is not identically zero.
6.3
Change of Basis
Suggested Time and Emphasis 1 class. Recommended material.
Points to Stress 1. Definition and notation for a change-of-basis matrix. 2. Properties of a change-of-basis matrix (Theorem 1). 3. Computing a change-of-basis matrix as in Examples 3 and 4.
Drill Question Let V = P2 and W = R3 . Does there exist a change-of-basis matrix that will allow us to express an element of V as an element of W ? Answer No. A change-of-basis matrix allows for different representations within the same vector space.
Discussion Question Must a change-of-basis matrix always be square? Answer Yes, because it must be nonsingular.
Test Question Could the following be a change-of-basis matrix?
\[ \begin{bmatrix} 1 & 2 & 3 \\ 0 & 0 & 4 \\ 0 & 0 & 5 \end{bmatrix} \]
Answer No. It is singular.
Lecture Notes • In Rn , we describe a vector using an ordered (column) list of numbers; for example, we can define $x \in \mathbb{R}^2$ by writing $x = \begin{bmatrix} 2 \\ 3 \end{bmatrix}$. Note that in doing this we are actually giving the coordinate vector of x with respect to the standard basis of R2 . This should not be confused with the way in which we describe the coordinate vector of x with respect to a (non-standard) basis $\mathcal{B} = \{v_1, v_2\}$. For example, we might have $[x]_{\mathcal{B}} = \begin{bmatrix} -3 \\ -1 \end{bmatrix}$ for a particular $\mathcal{B}$. This is not the same as saying $x = \begin{bmatrix} -3 \\ -1 \end{bmatrix}$.
• Pose this quick quiz question to the students: Is there a basis $\mathcal{B}$ for Rn such that $x = [x]_{\mathcal{B}}$ for all x ∈ Rn ? The answer is yes, and there is only one such choice: the standard basis!
• If students have a CAS or graphing calculator available then the method described in Solution 2 to Example 3 for finding a change-of-basis matrix seems to be the most efficient. This method takes advantage of the fact that the matrix $P_{\mathcal{E} \leftarrow \mathcal{B}}$ can be found with little or no effort.
• Let PB←C denote the change-of-basis matrix for bases B and C . Discuss properties that this matrix must have and properties that it need not have. For example, must PB←C be nonsingular? (Yes.) Must PB←C be diagonalizable? (No.) What can be concluded if PB←C is diagonalizable? What would it mean if P were symmetric?
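Lecture Examples We calculate $P_{\mathcal{B} \leftarrow \mathcal{C}}$ in three different settings. In each setting, $\mathcal{E}$ denotes the standard basis.
• Let V = R3 , $\mathcal{B} = \left\{ \begin{bmatrix} 1 \\ 2 \\ 0 \end{bmatrix}, \begin{bmatrix} -1 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix} \right\}$, and $\mathcal{C} = \left\{ \begin{bmatrix} 10 \\ 10 \\ 15 \end{bmatrix}, \begin{bmatrix} 5 \\ 5 \\ 0 \end{bmatrix}, \begin{bmatrix} -5 \\ 20 \\ -5 \end{bmatrix} \right\}$. Then
\[ P_{\mathcal{B} \leftarrow \mathcal{C}} = (P_{\mathcal{E} \leftarrow \mathcal{B}})^{-1} P_{\mathcal{E} \leftarrow \mathcal{C}} = \begin{bmatrix} 1 & -1 & 1 \\ 2 & 0 & -1 \\ 0 & 1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 10 & 5 & -5 \\ 10 & 5 & 20 \\ 15 & 0 & -5 \end{bmatrix} = \begin{bmatrix} 9 & 3 & 6 \\ 7 & -1 & 3 \\ 8 & 1 & -8 \end{bmatrix} \]
• Let V = P2 , $\mathcal{B} = \{-x + x^2,\ 1,\ x + x^2\}$, and $\mathcal{C} = \{2x,\ 4 - 2x^2,\ 2x^2\}$. Then
\[ P_{\mathcal{B} \leftarrow \mathcal{C}} = (P_{\mathcal{E} \leftarrow \mathcal{B}})^{-1} P_{\mathcal{E} \leftarrow \mathcal{C}} = \begin{bmatrix} 0 & 1 & 0 \\ -1 & 0 & 1 \\ 1 & 0 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 0 & 4 & 0 \\ 2 & 0 & 0 \\ 0 & -2 & 2 \end{bmatrix} = \begin{bmatrix} -1 & -1 & 1 \\ 0 & 4 & 0 \\ 1 & -1 & 1 \end{bmatrix} \]
• Let V = M22 , $\mathcal{B} = \left\{ \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 0 \\ 1 & 1 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 1 \end{bmatrix} \right\}$, and $\mathcal{C} = \left\{ \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}, \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}, \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}, \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} \right\}$. Writing coordinates with respect to $\mathcal{E}$ in the order (1,1), (1,2), (2,1), (2,2) entry, we have
\[ P_{\mathcal{B} \leftarrow \mathcal{C}} = (P_{\mathcal{E} \leftarrow \mathcal{B}})^{-1} P_{\mathcal{E} \leftarrow \mathcal{C}} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 1 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix}^{-1} \begin{bmatrix} 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 0 \\ 1 & 0 & 0 & 0 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 1 & 0 \\ 0 & \tfrac12 & 0 & \tfrac12 \\ 1 & \tfrac12 & 0 & -\tfrac12 \\ 0 & -\tfrac12 & 0 & \tfrac12 \end{bmatrix} \]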
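The first of these calculations (V = R3) can be reproduced by row reduction rather than by inverting a matrix explicitly. The sketch below row-reduces the augmented matrix formed from the two standard-basis matrices until the left block is the identity, using exact fractions. The function name `change_of_basis` is a hypothetical helper for this illustration, and it assumes the left block is invertible.

```python
from fractions import Fraction


def change_of_basis(p_eb, p_ec):
    """Row-reduce [P_EB | P_EC] to [I | P_BC], assuming P_EB is invertible."""
    n = len(p_eb)
    aug = [[Fraction(x) for x in rb + rc] for rb, rc in zip(p_eb, p_ec)]
    for col in range(n):
        # Bring a nonzero pivot into position and scale its row to a leading 1
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        # Clear the pivot column in all other rows
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


# Columns of each matrix are the basis vectors of B and C in standard coordinates
p_eb = [[1, -1, 1],
        [2, 0, -1],
        [0, 1, 1]]
p_ec = [[10, 5, -5],
        [10, 5, 20],
        [15, 0, -5]]

p_bc = change_of_basis(p_eb, p_ec)
for row in p_bc:
    print([int(x) for x in row])
```

Each column of the result lists the coordinates of one vector of C with respect to B, which students can confirm by forming the corresponding linear combination of the vectors in B.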
Tech Tip Use the methods presented in Examples 2, 3, and 4 to compute the change-of-basis matrix for a specific high-dimensional vector space equipped with two different bases. Compare the speeds of the three methods.
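Group Work 1: The Shape of the Future Answers
1. $B_2(1; x) = 1$, $B_2(x; x) = x$, $B_2(x^2; x) = \tfrac12 x + \tfrac12 x^2$, $B_2(e^x; x) = \left( x e^{1/2} + 1 - x \right)^2$. Monotonicity and concavity are both preserved.
2.
\begin{align*}
B_2(f + g)(x) &= \sum_{k=0}^{2} (f + g)\!\left(\tfrac{k}{2}\right) \binom{2}{k} x^k (1 - x)^{2-k} \\
&= \sum_{k=0}^{2} \left[ f\!\left(\tfrac{k}{2}\right) + g\!\left(\tfrac{k}{2}\right) \right] \binom{2}{k} x^k (1 - x)^{2-k} \\
&= \sum_{k=0}^{2} f\!\left(\tfrac{k}{2}\right) \binom{2}{k} x^k (1 - x)^{2-k} + \sum_{k=0}^{2} g\!\left(\tfrac{k}{2}\right) \binom{2}{k} x^k (1 - x)^{2-k} \\
&= B_2 f(x) + B_2 g(x)
\end{align*}
and
\[ B_2(cf)(x) = \sum_{k=0}^{2} (cf)\!\left(\tfrac{k}{2}\right) \binom{2}{k} x^k (1 - x)^{2-k} = c \sum_{k=0}^{2} f\!\left(\tfrac{k}{2}\right) \binom{2}{k} x^k (1 - x)^{2-k} = c\, B_2 f(x). \]
3. $P_{\mathcal{E} \leftarrow \mathcal{B}} = \begin{bmatrix} 1 & 0 & 0 \\ -2 & 2 & 0 \\ 1 & -2 & 1 \end{bmatrix}$, $P_{\mathcal{B} \leftarrow \mathcal{E}} = (P_{\mathcal{E} \leftarrow \mathcal{B}})^{-1} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & \tfrac12 & 0 \\ 1 & 1 & 1 \end{bmatrix}$
4. $[f]_{\mathcal{E}} = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$, $[f]_{\mathcal{B}} = P_{\mathcal{B} \leftarrow \mathcal{E}} [f]_{\mathcal{E}} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & \tfrac12 & 0 \\ 1 & 1 & 1 \end{bmatrix} \begin{bmatrix} a \\ b \\ c \end{bmatrix} = \begin{bmatrix} a \\ a + \tfrac12 b \\ a + b + c \end{bmatrix}$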
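The polynomial identities in Answer 1 can be checked by evaluating the Bernstein polynomial directly from its definition. This is a minimal sketch using exact fractions at a few sample points; the function name `bernstein` is a hypothetical helper for this illustration, and sampling only spot-checks the identities rather than proving them.

```python
from fractions import Fraction
from math import comb


def bernstein(f, n, x):
    """Evaluate B_n(f; x) = sum over k of f(k/n) * C(n, k) * x^k * (1 - x)^(n - k)."""
    return sum(f(Fraction(k, n)) * comb(n, k) * x ** k * (1 - x) ** (n - k)
               for k in range(n + 1))


samples = [Fraction(0), Fraction(1, 3), Fraction(1, 2), Fraction(4, 5), Fraction(1)]
for x in samples:
    b_one = bernstein(lambda t: 1, 2, x)      # should equal 1
    b_x = bernstein(lambda t: t, 2, x)        # should equal x
    b_x2 = bernstein(lambda t: t * t, 2, x)   # should equal x/2 + x^2/2
    print(x, b_one, b_x, b_x2)
```

Changing the second argument lets students see how the approximation to x^2 improves as n grows.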
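Group Work 2: Back to the Future This can be done as an out-of-class project. Answers
1.
\begin{align*}
B_2 f(x) = B_2\!\left(a + bx + cx^2\right) &= \begin{bmatrix} a & b & c \end{bmatrix} \begin{bmatrix} B_2 1 \\ B_2 x \\ B_2 x^2 \end{bmatrix} = [f]_{\mathcal{E}}^T \begin{bmatrix} B_2 1 \\ B_2 x \\ B_2 x^2 \end{bmatrix} \\
&= [f]_{\mathcal{E}}^T \left( \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \tfrac12 \\ 0 & 0 & \tfrac12 \end{bmatrix} \right)^{T} \begin{bmatrix} 1 \\ x \\ x^2 \end{bmatrix} = \left( E [f]_{\mathcal{E}} \right)^T \begin{bmatrix} 1 \\ x \\ x^2 \end{bmatrix}
\end{align*}
2.
\begin{align*}
[B_2 f(x)]_{\mathcal{B}} &= P_{\mathcal{B} \leftarrow \mathcal{E}} [B_2 f(x)]_{\mathcal{E}} = P_{\mathcal{B} \leftarrow \mathcal{E}} E [f]_{\mathcal{E}} \\
&= P_{\mathcal{B} \leftarrow \mathcal{E}} E P_{\mathcal{E} \leftarrow \mathcal{B}} [f]_{\mathcal{B}} = P_{\mathcal{B} \leftarrow \mathcal{E}} E (P_{\mathcal{B} \leftarrow \mathcal{E}})^{-1} [f]_{\mathcal{B}} \\
&= \begin{bmatrix} 1 & 0 & 0 \\ \tfrac14 & \tfrac12 & \tfrac14 \\ 0 & 0 & 1 \end{bmatrix} [f]_{\mathcal{B}}
\end{align*}
Therefore, $B = \begin{bmatrix} 1 & 0 & 0 \\ \tfrac14 & \tfrac12 & \tfrac14 \\ 0 & 0 & 1 \end{bmatrix}$.
3. The matrix representation is simply the product $P_{\mathcal{C} \leftarrow \mathcal{E}} E (P_{\mathcal{C} \leftarrow \mathcal{E}})^{-1}$.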
Suggested Core Assignment Exercises 1, 4, 6, 10, 11, 16, 18, 22P
Group Work 1, Section 6.3 The Shape of the Future We will see in upcoming sections that the concept of a linear transformation can be extended from Rn to arbitrary finite-dimensional vector spaces. Moreover, such transformations will always have a matrix representation (dependent on the choice of basis). We look at a specific example of this over the course of two group activities.
1. For a continuous function f (x) defined on the interval [0, 1], we define the nth degree Bernstein polynomial of f (x), Bn (f ; x), to be the polynomial
\[ B_n(f; x) = \sum_{k=0}^{n} f\!\left(\tfrac{k}{n}\right) \binom{n}{k} x^k (1 - x)^{n-k} \]
where $\binom{n}{k}$ denotes the binomial coefficient $\dfrac{n!}{k!\,(n-k)!}$. These polynomials have wonderful “shape-preserving” properties. For example, if f (x) is an increasing function, then so is Bn (f ; x); and if the graph of y = f (x) is concave up, then so is the graph of y = Bn (f ; x). Let’s verify some of this for B2 (f ; x). Calculate B2 (f ; x) for f (x) = 1, f (x) = x, $f(x) = x^2$, and $f(x) = e^x$. Is monotonicity preserved? Is concavity preserved?

2. Let P2 denote the polynomials of degree at most 2 defined on [0, 1]. A mapping T : P2 → P2 is said to be a linear transformation if T (f + g) (x) = T f (x) + T g (x) and T (cf ) (x) = cT f (x) whenever c ∈ R. To simplify the notation, from now on let’s write B2 f (x) instead of B2 (f ; x). Verify that B2 : P2 → P2 is a linear transformation.

3. Let f (x) be an element of P2 and write $f(x) = a + bx + cx^2$. Let $\mathcal{E} = \{1, x, x^2\}$ denote the standard basis for P2 and let $\mathcal{B} = \{(1 - x)^2,\ 2x(1 - x),\ x^2\}$ denote the (so-called) Bernstein basis for P2 . Find $P_{\mathcal{E} \leftarrow \mathcal{B}}$ and $P_{\mathcal{B} \leftarrow \mathcal{E}}$.

4. Use your results to find $[f]_{\mathcal{E}}$ and $[f]_{\mathcal{B}}$.
Group Work 2, Section 6.3 Back to the Future We continue our investigation of the linear operator
\[ B_n f(x) = \sum_{k=0}^{n} f\!\left(\tfrac{k}{n}\right) \binom{n}{k} x^k (1 - x)^{n-k} \]
mapping P2 into itself. We are interested in the following two bases for P2 : $\mathcal{E} = \{1, x, x^2\}$ and $\mathcal{B} = \{(1 - x)^2,\ 2x(1 - x),\ x^2\}$.
1. Let $E = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & \tfrac12 \\ 0 & 0 & \tfrac12 \end{bmatrix}$. E is the matrix representation of B2 with respect to the basis $\mathcal{E}$, because for every f ∈ P2 we have $B_2 f(x) = \left( E [f]_{\mathcal{E}} \right)^T \begin{bmatrix} 1 \\ x \\ x^2 \end{bmatrix}$ or equivalently $[B_2 f(x)]_{\mathcal{E}} = E [f]_{\mathcal{E}}$. Use the linearity of B2 to verify that these equations are valid.

2. Let B denote the matrix representation of B2 with respect to the basis $\mathcal{B}$; that is, let B be the matrix such that for every f ∈ P2 we have $B_2 f(x) = \left( B [f]_{\mathcal{B}} \right)^T \begin{bmatrix} (1 - x)^2 \\ 2x(1 - x) \\ x^2 \end{bmatrix}$ or equivalently $[B_2 f(x)]_{\mathcal{B}} = B [f]_{\mathcal{B}}$. Use the result from Question 1 and the change-of-basis matrix $P_{\mathcal{B} \leftarrow \mathcal{E}}$ to find B .

3. Look at your solution to Question 2. If $\mathcal{C}$ is another basis for P2 , describe how to obtain the matrix representation for B2 with respect to $\mathcal{C}$.
6.4
Linear Transformations
Suggested Time and Emphasis 1 class. Essential material.
Points to Stress 1. Definition of a linear transformation. 2. Theorem 1(a) 3. A linear transformation is completely determined by its action on a basis.
Drill Question Does every linear transformation have an inverse? Answer No
Discussion Question As mentioned in this section, if T : V → W and B is a basis for V , the set T (B) need not be a basis for W . Can you find an example that illustrates this fact? Can you find conditions which would guarantee that T (B) is a basis? Answer Answers will vary. One such condition is that T be invertible.
Test Question How many linear transformations T : P2 → P2 are there such that T (1) = x + 2, $T(x) = x + x^2$, and $T(x^2) = 1$? Answer There is only one such linear transformation.
Lecture Notes • Begin by reviewing Theorem 2 from Section 3.5: every linear transformation from Rm → Rn is, essentially, an n × m matrix.
• Emphasize Examples 3 and 4, since they illustrate linear transformations that do not possess finite matrix representations.
• Consider the nonstandard vector space V , the set of positive real numbers with vector addition defined as multiplication (of real numbers) and scalar multiplication, cv, defined as exponentiation: $v^c$. Define a nontrivial linear transformation mapping V into V (such as $v \mapsto v^2$) and verify that it is linear.
• The first of the Remarks following the definition of invertibility hints at a close relationship between the domain and codomain of an invertible transformation. Have the students try to define an invertible transformation between V and W if the dimensions of these spaces are not the same. This should provide evidence that such a transformation can exist only between spaces of identical dimension.
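For the nonstandard space of positive real numbers, a quick numerical experiment can make the linearity check concrete before students write the algebraic proof. The sketch below assumes the operations described above (vector addition is ordinary multiplication, scalar multiplication is exponentiation); the names `vadd`, `smul`, and `T` are hypothetical helpers for this illustration, and testing a few values is a sanity check rather than a proof.

```python
import math


def vadd(v, w):
    """Vector addition in V: ordinary multiplication of positive reals."""
    return v * w


def smul(c, v):
    """Scalar multiplication in V: c "times" v is v raised to the power c."""
    return v ** c


def T(v):
    """Candidate linear transformation on V: v maps to v squared."""
    return v ** 2


v, w, c = 2.0, 3.5, -1.5

# Additivity: T(v + w) versus T(v) + T(w), using the operations of V
print(T(vadd(v, w)), vadd(T(v), T(w)))

# Homogeneity: T(cv) versus c T(v), using the operations of V
print(T(smul(c, v)), smul(c, T(v)))
```

In this space the zero vector is the number 1, so students can also confirm that T sends the zero vector to the zero vector by computing T(1).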
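Lecture Examples
• Verification that T is a linear transformation, where T : P2 → P1 is defined by $T(a + bx + cx^2) = b + cx$:
\begin{align*}
T\!\left( (a + bx + cx^2) + (d + ex + fx^2) \right) &= T\!\left( (a + d) + (b + e)x + (c + f)x^2 \right) \\
&= (b + e) + (c + f)x = (b + cx) + (e + fx) \\
&= T(a + bx + cx^2) + T(d + ex + fx^2)
\end{align*}
and $T\!\left( k(a + bx + cx^2) \right) = T(ka + kbx + kcx^2) = kb + kcx = k(b + cx) = k\,T(a + bx + cx^2)$.
• Verification that T is not a linear transformation, where $T : \mathbb{R}^2 \to \mathbb{R}^2$ is defined by $T\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} ab \\ b \end{bmatrix}$:
\[ T\!\left( \begin{bmatrix} a \\ b \end{bmatrix} + \begin{bmatrix} c \\ d \end{bmatrix} \right) = T\begin{bmatrix} a + c \\ b + d \end{bmatrix} = \begin{bmatrix} (a + c)(b + d) \\ b + d \end{bmatrix}, \quad \text{but} \quad T\begin{bmatrix} a \\ b \end{bmatrix} + T\begin{bmatrix} c \\ d \end{bmatrix} = \begin{bmatrix} ab \\ b \end{bmatrix} + \begin{bmatrix} cd \\ d \end{bmatrix} = \begin{bmatrix} ab + cd \\ b + d \end{bmatrix}. \]
Therefore, T is not linear.
• Illustration that a linear transformation T is completely determined by its action on a basis: Let T : P2 → M22 where $T(1) = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}$, $T(x) = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}$, and $T(x^2) = \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix}$. Therefore,
\[ T(a + bx + cx^2) = a\,T(1) + b\,T(x) + c\,T(x^2) = a \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix} + b \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} + c \begin{bmatrix} 0 & 0 \\ 0 & 1 \end{bmatrix} = \begin{bmatrix} a & b \\ 0 & c \end{bmatrix} \]
• Verification that T and T′ are inverses of each other: Define $T : \mathbb{R}^3 \to P_2$ by $T\begin{bmatrix} a \\ b \\ c \end{bmatrix} = c + bx + ax^2$ and $T' : P_2 \to \mathbb{R}^3$ by $T'(c + bx + ax^2) = \begin{bmatrix} a \\ b \\ c \end{bmatrix}$. Then
\[ (T' \circ T)\begin{bmatrix} a \\ b \\ c \end{bmatrix} = T'(c + bx + ax^2) = \begin{bmatrix} a \\ b \\ c \end{bmatrix} \quad \text{and} \quad (T \circ T')(c + bx + ax^2) = T\begin{bmatrix} a \\ b \\ c \end{bmatrix} = c + bx + ax^2. \]
Thus, $T' = T^{-1}$.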
Tech Tips • Let V be the cubic polynomials and choose a specific linear transformation T : V → V . Describe how T affects the graphs of various elements of V .
Group Work 1: Linear Transactions There is no handout for this activity. Each group member defines a nontrivial linear transformation for each of the following pairs of spaces, and then trades transformations with another student. The recipient then verifies that each of the transformations they received is linear.
1. T1 : P3 → R5
2. T2 : M22 → P2
3. T3 : R3 → M23
4. T4 : M23 → R
5. T5 : P2 → U33 , where U33 denotes the vector space of all 3 × 3 upper triangular matrices.
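Group Work 2: Bases, Matrices and Transformation Answers
1. $[T_A(1)]_{\mathcal{E}} = A \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 2 \\ 1 \end{bmatrix}$, and therefore $T_A(1) = 1 + 2x + x^2$. $[T_A(x)]_{\mathcal{E}} = A \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \\ 0 \end{bmatrix}$, and therefore $T_A(x) = 3x$. $[T_A(x^2)]_{\mathcal{E}} = A \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 1 \\ 0 \end{bmatrix}$, and therefore $T_A(x^2) = 1 + x$.
2. By the linearity of matrix multiplication, we have
\[ [T_A(v_1 + v_2)]_{\mathcal{E}} = [T_A v_1]_{\mathcal{E}} + [T_A v_2]_{\mathcal{E}} \quad \text{and} \quad [T_A(cv)]_{\mathcal{E}} = c\,[T_A v]_{\mathcal{E}} \]
By the uniqueness of coordinate vectors, we have that TA is linear.
3. By the linearity of TA , we have
\begin{align*}
T_A(a + bx + cx^2) &= a\,T_A(1) + b\,T_A(x) + c\,T_A(x^2) \\
&= a(1 + 2x + x^2) + b(3x) + c(1 + x) \\
&= (a + c) + (2a + 3b + c)x + ax^2
\end{align*}
4. $[T_A(1)]_{\mathcal{B}} = A \begin{bmatrix} 1 \\ -1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ -1 \\ 1 \end{bmatrix}$, and therefore $T_A(1) = (1 + x^2) - x^2 + (x - x^2) = 1 + x - x^2$. $[T_A(x)]_{\mathcal{B}} = A \begin{bmatrix} 0 \\ 1 \\ 1 \end{bmatrix} = \begin{bmatrix} 1 \\ 4 \\ 0 \end{bmatrix}$, and therefore $T_A(x) = 1 + 5x^2$. $[T_A(x^2)]_{\mathcal{B}} = A \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 0 \\ 3 \\ 0 \end{bmatrix}$, and therefore $T_A(x^2) = 3x^2$.
5. A change of basis has resulted in a new (different) transformation.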
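The contrast between Answers 1 and 4 can be reproduced with a few lines of code. This is a sketch under the assumption that each polynomial is stored as its coefficient list (constant, x, x^2); the helpers `matvec` and `combine` and the variable names are hypothetical, introduced only for this illustration. The same matrix A is applied to coordinate vectors, and only the basis used to interpret the coordinates changes.

```python
A = [[1, 0, 1],
     [2, 3, 1],
     [1, 0, 0]]


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def combine(coeffs, basis):
    """Linear combination of polynomials stored as [const, x, x^2] coefficient lists."""
    return [sum(c * p[i] for c, p in zip(coeffs, basis)) for i in range(3)]


E = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]    # 1, x, x^2
B = [[1, 0, 1], [0, 0, 1], [0, 1, -1]]   # 1 + x^2, x^2, x - x^2

# Coordinates of 1, x, x^2 with respect to each basis (worked out by hand)
coords_E = {"1": [1, 0, 0], "x": [0, 1, 0], "x^2": [0, 0, 1]}
coords_B = {"1": [1, -1, 0], "x": [0, 1, 1], "x^2": [0, 1, 0]}

for name in ("1", "x", "x^2"):
    via_E = combine(matvec(A, coords_E[name]), E)
    via_B = combine(matvec(A, coords_B[name]), B)
    print(name, via_E, via_B)
```

Each printed list gives the coefficients of the image polynomial, so the two columns of output show directly that the two definitions of the transformation disagree.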
Suggested Core Assignment Exercises 2, 5, 9, 15, 18, 20, 22P , 24P , 27, 35P
Group Work 2, Section 6.4 Bases, Matrices and Transformation In Section 6.6, we will learn how linear transformations and matrices are related and the role that bases play in this relationship. The goal of this activity is to notice that matrices can define linear transformations on arbitrary finite-dimensional vector spaces. We also note how a change of basis affects the transformation.
1. Let V = P2 and let $\mathcal{E}$ denote the standard basis for V . We will use the matrix $A = \begin{bmatrix} 1 & 0 & 1 \\ 2 & 3 & 1 \\ 1 & 0 & 0 \end{bmatrix}$ to define the following transformation TA : V → V : for v ∈ V , let TA v be the element of V with (standard basis) coordinate vector $[T_A v]_{\mathcal{E}}$ equal to $A [v]_{\mathcal{E}}$. Calculate $T_A(1)$, $T_A(x)$, and $T_A(x^2)$.

2. Prove that TA is a linear transformation.

3. Use the above results to evaluate $T_A(a + bx + cx^2)$, where a, b, c ∈ R.

4. Let’s change the basis that we are using. Let $\mathcal{B} = \{1 + x^2,\ x^2,\ x - x^2\}$ and, for v ∈ V , let $[T_A v]_{\mathcal{B}} = A [v]_{\mathcal{B}}$. Calculate $T_A(1)$, $T_A(x)$, and $T_A(x^2)$.
6.5
The Kernel and Range of a Linear Transformation
Suggested Time and Emphasis 1 class. Optional material.
Points to Stress 1. The definitions of kernel, range, rank and nullity of a linear transformation. 2. The kernel and range of a transformation from Rm → Rn given by a matrix. 3. Definitions of one-to-one and onto. 4. The Rank Theorem. 5. Theorem 7 and isomorphisms.
Drill Question
Consider the linear transformation $T : \begin{bmatrix} w \\ x \\ y \\ z \end{bmatrix} \mapsto \begin{bmatrix} x \\ w \\ 0 \\ z \end{bmatrix}$ on R4 . What is the kernel? The range? The rank? The nullity?
Answer
Kernel: $\left\{ \begin{bmatrix} 0 \\ 0 \\ a \\ 0 \end{bmatrix} : a \in \mathbb{R} \right\}$, range: $\left\{ \begin{bmatrix} a \\ b \\ 0 \\ c \end{bmatrix} : a, b, c \in \mathbb{R} \right\}$, rank: 3, nullity: 1
Discussion Question Ask students to describe what it means for a linear transformation to have kernel {0}. Then discuss the special case where the rank is 0.
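Test Question Let $T : \mathbb{R}^2 \to \mathbb{R}^2$ be given by $T\begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} b \\ a \end{bmatrix}$. Is T an isomorphism? Prove your answer.
Answer It is an isomorphism. It is easy to show that T (u + v) = T (u) + T (v) and T (cu) = cT (u), so T is linear. Since T (T (u)) = u for every u, T is its own inverse and is therefore one-to-one and onto.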
Lecture Notes • Make sure that the students understand that the kernel and range “live” in different spaces. If T : V → W , the kernel lives in V and the range lives in W . One way to emphasize this idea is to let T : P3 → R2 be given by $T(ax^3 + bx^2 + cx + d) = \begin{bmatrix} c \\ d \end{bmatrix}$. Now the kernel of T is a set of polynomials in P3 , and the range of T is a set of vectors in R2 .
• Theorem 8 says something very powerful. We now know that, for example, all eight-dimensional vector spaces are isomorphic. If you are tired of working with vectors in M2,4 that look like this: $\begin{bmatrix} a & n & n & o \\ y & i & n & g \end{bmatrix}$ you can, instead, choose to work in M1,8 : $\begin{bmatrix} p & l & e & a & s & a & n & t \end{bmatrix}$, knowing all your results will carry over. Notice that M2,4 and M1,8 are isomorphic only as vector spaces. If your application involves, for example, matrix multiplication, then we can no longer get away with substituting 1 × 8 matrices for 2 × 4 matrices. If you are dealing with isomorphic vector spaces, the assumption is that you are using only the operations of addition and scalar multiplication.
• This is an interesting isomorphism: Take the set of complex numbers C = {a + bi}. If you consider only the operations of addition and multiplication of a complex number by a real number, they form a vector space. And, by Theorem 8, this vector space is isomorphic to R2 . A natural isomorphism T : C → R2 is given by $T(a + bi) = \begin{bmatrix} a \\ b \end{bmatrix}$.
• Ask the students if all linear transformations from R → R are one-to-one and onto. T (x) = ax is one-to-one and onto, except in the case where a = 0, in which case it is neither.
Lecture Examples • Isomorphic spaces: Let V be the set of solutions to the differential equation $y'' + y = 0$. It turns out that V = {a sin x + b cos x : a, b ∈ R}. Let $T : \mathbb{R}^2 \to V$, where $T\begin{bmatrix} a \\ b \end{bmatrix} = a \sin x + b \cos x$. Then $\ker(T) = \left\{ \begin{bmatrix} 0 \\ 0 \end{bmatrix} \right\}$, nullity (T ) = 0, range (T ) = V , and rank (T ) = 2. T is one-to-one and onto, and maps a two-dimensional space to a two-dimensional space, and is thus an isomorphism.
• Let $T : \mathbb{R}^4 \to \mathbb{R}^2$ be given by $T\begin{bmatrix} a \\ b \\ c \\ d \end{bmatrix} = \begin{bmatrix} a \\ 0 \end{bmatrix}$. Then $\ker(T) = \left\{ \begin{bmatrix} 0 \\ b \\ c \\ d \end{bmatrix} \right\}$, nullity (T ) = 3, $\operatorname{range}(T) = \left\{ \begin{bmatrix} a \\ 0 \end{bmatrix} \right\}$, and rank (T ) = 1. This is neither one-to-one nor onto, and hence is not an isomorphism.
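When a transformation is given by a matrix, the rank and nullity in examples like these can be confirmed by row reduction, with the Rank Theorem supplying the nullity as the number of columns minus the rank. The sketch below uses exact fractions; the function name `rank` is a hypothetical helper for this illustration, and the matrix shown is the standard matrix of the map from R4 to R2 in the second example.

```python
from fractions import Fraction


def rank(matrix):
    """Rank via forward elimination over exact fractions."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    rank_count = 0
    for col in range(len(rows[0])):
        # Look for a pivot in this column at or below the current pivot row
        pivot = next((r for r in range(rank_count, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[rank_count], rows[pivot] = rows[pivot], rows[rank_count]
        for r in range(rank_count + 1, len(rows)):
            factor = rows[r][col] / rows[rank_count][col]
            rows[r] = [x - factor * y for x, y in zip(rows[r], rows[rank_count])]
        rank_count += 1
    return rank_count


# Standard matrix of T(a, b, c, d) = (a, 0)
M = [[1, 0, 0, 0],
     [0, 0, 0, 0]]

r = rank(M)
nullity = len(M[0]) - r  # Rank Theorem: rank + nullity = dimension of the domain
print(r, nullity)
```

The same function can be pointed at any matrix students bring from the exercises, which makes it a convenient check on hand computations of kernels and ranges.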
Tech Tips • Have a CAS compute the rank and nullity of some large linear transformations, presented in matrix form.
Group Work 1: One-to-One and Onto This activity deals with functions between sets, not between vector spaces. It is meant to give students an opportunity to play with “one-to-one” and “onto” without the baggage of addition and scalar multiplication. Remind them that a relation is only well-defined if every element of the domain is mapped to a unique element of the range. Some of the problems can be answered only by polling the class after they have finished working. Answers
Chairs: Well-defined, one-to-one, onto (if all the chairs are occupied; otherwise not).
Eye color: Usually well-defined, usually not one-to-one, usually onto.
Mom & Dad’s birthplace: Usually not well-defined.
Molecules: Well-defined, one-to-one, not onto.
Spleens: Well-defined, one-to-one, onto.
Pencils: Not well-defined. Students may have more than one pencil, or (horrors!) none.
Social Security Number: Well-defined, one-to-one, not onto.
February birthday: Not well-defined (some people were born on February 29!).
Birthday: Well-defined, probably not one-to-one but you should check, we hope it is not onto!
Cars: Not well-defined (some have none, some have more than one), not onto.
Cash: Well-defined, usually not one-to-one, not onto.
Middle names: Not well-defined (some have none, some have more than one).
Identity: Well-defined, one-to-one, onto.
Algebra instructor: Well-defined, not one-to-one, onto.
Group Work 2: A Transformation Question 1 is optional, depending on the amount of time you have. If a group finishes early, have them try to come up with an isomorphism between the two spaces that is not the standard one, and to show that they do indeed have an isomorphism. Answers
1. One can show that T (cA) = cT (A) and that T (A + B) = T (A) + T (B) algebraically.
2. $\ker(T) = \left\{ \begin{bmatrix} 0 & 0 \\ 0 & u \end{bmatrix} : u \in \mathbb{R} \right\}$; nullity (T ) = 1.
3. $\operatorname{range}(T) = \left\{ \begin{bmatrix} a \\ b \\ c \\ -a + b + c \end{bmatrix} \right\}$, rank (T ) = 3.
4. No. It is neither one-to-one nor onto.
5. Yes. All we need is for there to exist an isomorphism between the two spaces. Just because T isn’t an isomorphism does not mean there isn’t some isomorphism out there.
Suggested Core Assignment Exercises 1, 3, 5, 7, 10, 14, 16, 18, 24, 30P , 33P , 37P
Group Work 1, Section 6.5 Functions in the Classroom Which of the following relations are well-defined functions? Of the ones that are well-defined, which are one-to-one? Which are onto?

Domain | Function Values | Function
All the people in your classroom | Chairs | f (person) = his or her chair
All the people in your classroom | The set {blue, brown, green, hazel} | f (person) = their eye color
All the people in your classroom | Cities | f (person) = birthplace of their mom and dad
All the people in your classroom | R, the real numbers | f (person) = number of molecules in their body
All the people in your classroom | Spleens | f (person) = their own spleen
All the people in your classroom | Pencils | f (person) = their pencil
All the people in the United States | Integers from 0–999999999 | f (person) = their Social Security number
All the living people born in February | Days in February, 2007 | f (person) = their birthday in February 2007
All the people in your classroom | Days of the year | f (person) = their birthday
All the people in your classroom | Cars | f (person) = their car
All the people in your classroom | R, the real numbers | f (person) = how much cash they have on them
All the people in your college | Names | f (person) = their middle name
All the people in your classroom | People | f (person) = himself or herself
All the people in your classroom | People | f (person) = their linear algebra instructor
Group Work 2, Section 6.5
A Transformation
Let $T : M_{2,2} \to \mathbb{R}^4$ be given by
\[ T \begin{bmatrix} r & s \\ t & u \end{bmatrix} = \begin{bmatrix} r \\ r + s \\ r + t \\ r + s + t \end{bmatrix}. \]
1. Show that T is a linear transformation.
2. Find the kernel of T and the nullity of T .
3. Find the range and the rank of T .
4. Is T an isomorphism?
5. Are the two spaces $M_{2,2}$ and $\mathbb{R}^4$ isomorphic?
6.6
The Matrix of a Linear Transformation
Suggested Time and Emphasis 2–3 classes. Essential material.
Points to Stress
1. The construction of the matrix representation of a given linear transformation with respect to given bases.
2. Every linear transformation between finite-dimensional spaces has a matrix representation with respect to given bases.
3. The construction of the matrix representation for the inverse of a given linear transformation.
4. The matrix representation for a linear transformation between vector spaces changes in a predictable way when associated bases are changed.
Drill Question Can a linear transformation which maps a 3-dimensional vector space onto a 2-dimensional vector space ever be invertible? Answer No
Discussion Question Let T : V → V be a linear transformation. If T is diagonalizable, must T be invertible? If T is invertible, must it be diagonalizable? Answer No. A main point of this section is that every such T has a matrix A associated with it (up to basis choice); this matrix is diagonalizable/invertible if and only if T is diagonalizable/invertible. Since there is no relationship between matrix diagonalizability and invertibility, there is no such connection involving the diagonalizability and invertibility of a linear transformation.
Test Question Is the linear transformation $T : P_2 \to P_2$ defined by $T(f) = f'$ invertible? Answer
No
Lecture Notes
• Note that Figure 1 is at the heart of what this section is about; vector spaces V and W are essentially copies of $\mathbb{R}^n$ and $\mathbb{R}^m$ respectively (via isomorphisms). As such, any linear transformation from V to W must induce a linear transformation from $\mathbb{R}^n$ to $\mathbb{R}^m$, and Chapter 3 provides a complete description of all such transformations.
• Remind students that Theorem 1 establishes that there is, in some sense, no difference between linear transformations and matrices. This theorem describes how to find an associated matrix from a given linear transformation. Conversely, given two vector spaces with chosen bases and a matrix (of appropriate size), we can immediately define an associated linear transformation between the two spaces.
• Let T be a linear transformation between two finite-dimensional vector spaces. Ask students to describe (using terminology from the course) the set of matrices that can represent T. The description should involve the concept of similar matrices.
• Consider the approach to matrix representations described in Group Work 2 of Section 6.4: to find the matrix representation of a linear transformation $T : V \to V$ with respect to a nonstandard basis $\mathcal{B}$, first find the matrix representation $[T]_{\mathcal{E}}$ with respect to the standard basis $\mathcal{E}$ and then calculate the product $P_{\mathcal{B} \leftarrow \mathcal{E}} \, [T]_{\mathcal{E}} \, (P_{\mathcal{B} \leftarrow \mathcal{E}})^{-1}$.
• Consider going through the proof of Theorem 2 in detail.
• Look at the additions to The Fundamental Theorem of Invertible Matrices (Theorem 5) and verify the new equivalences.
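Lecture Examples
• Find the matrix A of T with respect to bases $\mathcal{B}$ and $\mathcal{C}$: Let $V = M_{22}$ and $W = \mathbb{R}^2$, and define $T : V \to W$ by T(A) = the first column of A. Let $\mathcal{B} = \{E_4, E_2, E_3, E_1\}$ be a basis for V and let $\mathcal{C} = \left\{ \begin{bmatrix} 1 \\ 1 \end{bmatrix}, \begin{bmatrix} -1 \\ 1 \end{bmatrix} \right\}$. Then
\[ T(E_4) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad T(E_2) = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad T(E_3) = \begin{bmatrix} 0 \\ 1 \end{bmatrix}, \quad T(E_1) = \begin{bmatrix} 1 \\ 0 \end{bmatrix}. \]
Thus
\[ [T(E_4)]_{\mathcal{C}} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad [T(E_2)]_{\mathcal{C}} = \begin{bmatrix} 0 \\ 0 \end{bmatrix}, \quad [T(E_3)]_{\mathcal{C}} = \begin{bmatrix} \frac12 \\ \frac12 \end{bmatrix}, \quad [T(E_1)]_{\mathcal{C}} = \begin{bmatrix} \frac12 \\ -\frac12 \end{bmatrix}. \]
Therefore,
\[ A = \begin{bmatrix} 0 & 0 & \frac12 & \frac12 \\ 0 & 0 & \frac12 & -\frac12 \end{bmatrix}. \]
• Changing the matrix of T as a basis changes: Let $V = M_{22}$, $\mathcal{B} = \mathcal{E} = \{E_1, E_2, E_3, E_4\}$, and $\mathcal{C} = \{A, B, C, D\}$, where
\[ A = \begin{bmatrix} 1 & 0 \\ 0 & 0 \end{bmatrix}, \quad B = \begin{bmatrix} 1 & 1 \\ 0 & 0 \end{bmatrix}, \quad C = \begin{bmatrix} 1 & 1 \\ 1 & 0 \end{bmatrix}, \quad D = \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}. \]
Define $T : V \to V$ by $T(M) = M^T$. The matrix of T with respect to $\mathcal{B}$ is
\[ A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]
Note that
\[ P_{\mathcal{B} \leftarrow \mathcal{C}} = \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \]
so the matrix of T with respect to $\mathcal{C}$ is
\[ P^{-1} A P = \begin{bmatrix} 1 & -1 & 0 & 0 \\ 0 & 1 & -1 & 0 \\ 0 & 0 & 1 & -1 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 1 \\ 0 & 1 & 1 & 1 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & -1 & 0 & 0 \\ 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]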
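• Computations with T using A: Let $V = M_{22}$ and $\mathcal{B} = \mathcal{E} = \{E_1, E_2, E_3, E_4\}$. Define $T : V \to V$ by $T(M) = M^T$ (transpose). The matrix of T with respect to $\mathcal{B}$ is
\[ A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}. \]
Let $M = \begin{bmatrix} 3 & -2 \\ 1 & 0 \end{bmatrix}$. Note that $[M]_{\mathcal{B}} = \begin{bmatrix} 3 \\ -2 \\ 1 \\ 0 \end{bmatrix}$ and
\[ A [M]_{\mathcal{B}} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 3 \\ -2 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \\ -2 \\ 0 \end{bmatrix} = [T(M)]_{\mathcal{B}} \]
since $T(M) = \begin{bmatrix} 3 & 1 \\ -2 & 0 \end{bmatrix}$.
• Composition of transformations: Let $V = M_{22}$ and $\mathcal{B} = \mathcal{E} = \{E_1, E_2, E_3, E_4\}$. Define $T : V \to V$ by $T(M) = M^T$ (transpose) and $U : V \to V$ by interchanging rows. The matrices of T and U with respect to $\mathcal{B}$ are respectively
\[ A = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \quad \text{and} \quad C = \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}. \]
Let $M = \begin{bmatrix} 3 & -2 \\ 1 & 0 \end{bmatrix}$. Note that $(T \circ U)(M) = T(U(M)) = \begin{bmatrix} 1 & 3 \\ 0 & -2 \end{bmatrix}$. So
\[ AC [M]_{\mathcal{B}} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \\ 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 3 \\ -2 \\ 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1 \\ 3 \\ 0 \\ -2 \end{bmatrix} = [T(U(M))]_{\mathcal{B}}. \]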
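The last two examples can be checked mechanically. The sketch below is one possible way to do so in plain Python; the helper names (`coords`, `matrix_of`, `mat_vec`, `mat_mul`) are illustrative and not from the text, and the standard basis of $M_{22}$ is assumed to be ordered row by row.

```python
# Coordinates are taken with respect to the standard basis E1..E4 of M22,
# ordered row by row, so [[a, b], [c, d]] has coordinate vector [a, b, c, d].

def coords(m):
    """Coordinate vector of a 2x2 matrix with respect to the standard basis."""
    return [m[0][0], m[0][1], m[1][0], m[1][1]]

def transpose(m):
    return [[m[0][0], m[1][0]], [m[0][1], m[1][1]]]

def swap_rows(m):
    return [m[1], m[0]]

def matrix_of(transformation):
    """Matrix of a linear map on M22: column j holds the coordinates of T(E_j)."""
    basis = [[[1, 0], [0, 0]], [[0, 1], [0, 0]],
             [[0, 0], [1, 0]], [[0, 0], [0, 1]]]
    columns = [coords(transformation(e)) for e in basis]
    return [[columns[j][i] for j in range(4)] for i in range(4)]

def mat_vec(a, v):
    return [sum(a[i][k] * v[k] for k in range(len(v))) for i in range(len(a))]

def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

A = matrix_of(transpose)   # matrix of T(M) = M^T
C = matrix_of(swap_rows)   # matrix of U (interchange the rows)
M = [[3, -2], [1, 0]]

print(mat_vec(A, coords(M)))              # coordinates of T(M)
print(mat_vec(mat_mul(A, C), coords(M)))  # coordinates of T(U(M))
```

Building each matrix column by column from the images of the basis vectors mirrors the construction stressed in this section, so students can compare the printed vectors with a direct computation of the transposed or row-swapped matrix.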
Tech Tip Find the matrix representation for differentiation (with respect to the standard basis) on the set of polynomials of degree 5 or smaller, and verify that this representation is correct by using it to differentiate some polynomials.
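For instructors without a CAS at hand, a minimal Python sketch of this tip follows. It assumes the standard basis {1, x, ..., x^5} and stores a polynomial as its list of coefficients, constant term first; the function names are illustrative only.

```python
def differentiation_matrix(n):
    """(n+1) x (n+1) matrix of d/dx on P_n in the basis {1, x, ..., x^n}."""
    size = n + 1
    D = [[0] * size for _ in range(size)]
    for j in range(1, size):
        D[j - 1][j] = j          # d/dx of x^j is j x^(j-1)
    return D

def apply(D, coeffs):
    """Multiply the matrix D by a coefficient vector."""
    return [sum(row[k] * coeffs[k] for k in range(len(coeffs))) for row in D]

D = differentiation_matrix(5)
p = [7, -1, 0, 4, 0, 2]          # 7 - x + 4x^3 + 2x^5
print(apply(D, p))               # coefficients of p'(x) = -1 + 12x^2 + 10x^4
```

Because the first column of `D` is zero, every constant polynomial is sent to 0, which ties in with the Test Question: differentiation is not one-to-one and hence not invertible.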
Group Work 1: Find the Diagonal
There are intentionally few instructions given. One goal is to get students to recall that this question can be answered simply by finding matrix representations (with respect to any basis) for the given transformation and then checking eigenvectors. The assumption is that standard bases will be used here.
Answers
1. Yes 2. No 3. Yes 4. Yes 5. No
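Group Work 2: Finding a Basis
This activity may be harder than it initially appears. To help the students get started, first determine the matrix that represents T with respect to the standard basis of $P_1$. Call this matrix B. We can apply Theorem 4 and state $A = (P_{\mathcal{E} \leftarrow \mathcal{B}})^{-1} B P_{\mathcal{E} \leftarrow \mathcal{B}}$, where $\mathcal{E}$ denotes the standard basis and $\mathcal{B}$ the basis we are looking for. Rewrite the equality as $P_{\mathcal{E} \leftarrow \mathcal{B}} A = B P_{\mathcal{E} \leftarrow \mathcal{B}}$. This equality involves four dependent linear equations, from which the matrix $P_{\mathcal{E} \leftarrow \mathcal{B}}$ can be determined. There are two free parameters. Once a particular $P_{\mathcal{E} \leftarrow \mathcal{B}}$ is determined, a basis $\mathcal{B}$ can be found.
Answer
Let $\mathcal{E}$ denote the standard basis and $P = P_{\mathcal{E} \leftarrow \mathcal{B}}$. Since $T(1) = 1 + 2x$ and $T(x) = 3 + 4x$, the columns of the matrix of T with respect to $\mathcal{E}$ are the coordinate vectors of these images, so
\[ B = \begin{bmatrix} 1 & 3 \\ 2 & 4 \end{bmatrix}. \]
By Theorem 4, we have $A = P^{-1} B P$, which implies $PA = BP$. Write the unknown matrix P as $P = \begin{bmatrix} p_1 & p_2 \\ p_3 & p_4 \end{bmatrix}$. Then $PA = BP$ gives rise to the following four linear equations:
\[ \begin{aligned} p_1 + 3p_3 &= 5p_1 + 2p_2 \\ p_2 + 3p_4 &= p_1 \\ 2p_1 + 4p_3 &= 5p_3 + 2p_4 \\ 2p_2 + 4p_4 &= p_3 \end{aligned} \]
Solving in terms of $p_2$ and $p_4$, we find
\[ P = \begin{bmatrix} p_2 + 3p_4 & p_2 \\ 2p_2 + 4p_4 & p_4 \end{bmatrix}. \]
Thus, for example, choosing $p_2 = -1$ and $p_4 = 1$, we obtain $P = \begin{bmatrix} 2 & -1 \\ 2 & 1 \end{bmatrix}$, which is invertible, and therefore the basis $\mathcal{B} = \{2 + 2x, -1 + x\}$ works. As a check, $T(2 + 2x) = 8 + 12x = 5(2 + 2x) + 2(-1 + x)$ and $T(-1 + x) = 2 + 2x = 1(2 + 2x) + 0(-1 + x)$, which are exactly the columns of A.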
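Any candidate basis a group proposes can be checked directly against the definition of T, without going through Theorem 4. The following sketch is one way to do that; it hard-codes T from the worksheet (T(1) = 1 + 2x, T(x) = 3 + 4x), represents c0 + c1 x as the pair (c0, c1), and uses helper names that are purely illustrative.

```python
from fractions import Fraction

def T(p):
    """T on P1, where p = (c0, c1) stands for c0 + c1*x."""
    c0, c1 = p
    return (c0 * 1 + c1 * 3, c0 * 2 + c1 * 4)

def coords_in_basis(p, b1, b2):
    """Solve s*b1 + t*b2 = p for (s, t) by Cramer's rule."""
    det = Fraction(b1[0] * b2[1] - b2[0] * b1[1])
    s = (p[0] * b2[1] - b2[0] * p[1]) / det
    t = (b1[0] * p[1] - p[0] * b1[1]) / det
    return (s, t)

def matrix_of_T(b1, b2):
    """Matrix of T with respect to the basis {b1, b2}."""
    col1 = coords_in_basis(T(b1), b1, b2)
    col2 = coords_in_basis(T(b2), b1, b2)
    return [[col1[0], col2[0]], [col1[1], col2[1]]]

# Candidate basis {2 + 2x, -1 + x}; the result should be [[5, 1], [2, 0]].
print(matrix_of_T((2, 2), (-1, 1)))
```

Since the worksheet has more than one correct answer, a checker of this kind is more useful than an answer key: any basis whose matrix comes out as A is acceptable.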
Suggested Core Assignment Exercises 2, 5, 7, 13P , 17, 20, 31, 34, 39P , 46P
Group Work 1, Section 6.6
Find the Diagonal
Determine which, if any, of the following linear transformations are diagonalizable.
1. $T : M_{22} \to M_{22}$ such that $T(A) = A^T$.
2. $T : P_3 \to P_3$ such that $T(f) = f'$.
3. Let V be the two-dimensional vector space spanned by $\mathcal{B} = \{e^x, e^{2x}\}$. Then define $T : V \to V$ such that $T(f) = \int f \, dx$, with constant of integration equal to zero.
4. $T : P_3 \to \mathbb{R}^4$ such that $T(a + bx + cx^2 + dx^3) = \begin{bmatrix} a + c \\ d + b \\ a + c \\ d + b \end{bmatrix}$.
5. $T : P_3 \to \mathbb{R}^4$ such that $T(a + bx + cx^2 + dx^3) = \begin{bmatrix} a + c \\ b \\ a + c \\ d + b \end{bmatrix}$.
Group Work 2, Section 6.6
Finding a Basis
Consider the linear transformation $T : P_1 \to P_1$ such that $T(1) = 1 + 2x$ and $T(x) = 3 + 4x$, and the matrix
\[ A = \begin{bmatrix} 5 & 1 \\ 2 & 0 \end{bmatrix}. \]
Suppose A is the matrix that represents the transformation T. Find a basis $\mathcal{B}$ such that our supposition is correct. Hint: There is more than one correct answer. Consider using Theorem 4 and the standard basis.
6.7
Applications
Suggested Time and Emphasis 1 class. Optional material.
Points to Stress We recommend choosing one application to explore in detail, as opposed to trying to touch on them all. 1. Homogeneous linear differential equations. 2. Linear Codes
Drill Questions
1. True or false: The differential equation $y' - ay = 0$ has a solution regardless of the value of a.
2. Is $C = \left\{ \begin{bmatrix} 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} \right\}$ a binary linear code?
Answers
1. True
2. No. It is not closed under addition.
Discussion Questions
1. Try to come up with real-world examples of phenomena modeled by $y' = k$, $y' = ky$, or $y'' = ky$.
2. Why do we concentrate on binary linear codes? What would be the advantages and disadvantages of base 3 or base 4 codes?
Answers
1. Answers will vary.
2. The discussion can touch on many possible advantages, the main one being that you could store much more information in the same number of digits. Perhaps the discussion can go in the direction of quantifying how much more information can be stored. The disadvantages are practical. If the signals being sent are of type, say, 0, 1, and 2, then when making the transition between 0 and 2 using current methods we have to pass through 1. It is a lot harder to tell the difference between three signal types than two. Allow the discussion to go in other directions, if it happens.
Test Questions
1. Find a value of A such that solutions of the differential equation $y'' + 6y' + Ay = 0$ are of the form $y = c_1 e^{-3x} + c_2 x e^{-3x}$.
2. Find a C(8, 2) code.
Answers
1. A = 9
2. Answers will vary. One possible answer:
\[ C = \left\{ \begin{bmatrix} 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix} \right\} \]
Lecture Notes
• Note that the proof of Theorem 1 used only homogeneity and linearity. Thus, we can extend Theorem 1 to say that the solution space of any homogeneous linear differential equation is a subspace of F. If we remove linearity, this may no longer be true. For example, consider the nonlinear differential equation $y''(y'' + 1) = 0$. It is easy to check that every function of the form $y = -\frac12 x^2 + ax + b$ is a solution. In particular, $y_2 = -\frac12 x^2$ is in the solution space, but $y_1 = -x^2$ is not (here $y_1'' = -2$, so $y_1''(y_1'' + 1) = 2$). Yet $y_1 = y_2 + y_2$, and therefore the solution space is not a subspace of F.
• The text points out that a model for the motion of an object on a spring is $y'' - ky = 0$. This model assumes no friction, so the solution is a periodic function: the object moves up and down forever. If we want to add friction to the model we get another term: $y'' + ay' - ky = 0$. If a = 0 we are back to the frictionless case. If the characteristic equation has non-real complex roots, then we have damped oscillation (the oscillation dies out, as it would if you attached a baseball to a vertical Slinky). If it has two distinct real roots, then we have no oscillation. (Picture attaching a 16-ton weight to a Slinky.) In the borderline case of a repeated real root, we say the system is critically damped.
[Figure: three graphs of y against t, labeled “Damped oscillation”, “No oscillation”, and “Critically damped”.]
• The Reed-Muller code is set up here, but is applied in Section 7.5. Cover it here if and only if you plan to bring it back in Section 7.5.
• For some applications, we are thinking of code vectors that are ordered, so they can be incremented or decremented. One easy way to order code vectors is to think of them as binary numbers. The disadvantage of this method is that sometimes many bits have to be changed at once in an increment operation. For example, going from $\begin{bmatrix} 1 & 0 & 1 & 1 & 1 \end{bmatrix}^T$ to $\begin{bmatrix} 1 & 1 & 0 & 0 & 0 \end{bmatrix}^T$ requires a change of four of the five bits. If, for a given application, this makes a difference, one can use the Gray codes, which are orderings designed to have only one bit change at a time. Here is an example of a Gray code:
 0: 0 0 0 0    4: 0 1 1 0    8: 1 1 0 0   12: 1 0 1 0
 1: 0 0 0 1    5: 0 1 1 1    9: 1 1 0 1   13: 1 0 1 1
 2: 0 0 1 1    6: 0 1 0 1   10: 1 1 1 1   14: 1 0 0 1
 3: 0 0 1 0    7: 0 1 0 0   11: 1 1 1 0   15: 1 0 0 0
• The text mentions that the Mariner 9 spacecraft uses a 700 × 832 pixel grid with 64 shades of gray. The students may not have a good idea of what that kind of resolution is like. Show them the following pictures:
700 × 832, 225 shades of gray
700 × 832, 64 shades of gray
700 × 832, 8 shades of gray
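The Gray code listed in the notes above is the binary-reflected one, and it can be generated with a single exclusive-or. The sketch below is a minimal illustration (the function names are not from the text); it also counts the bits that change in the ordinary binary increment from 10111 to 11000 mentioned in the notes.

```python
def gray(n):
    """Binary-reflected Gray code of the integer n."""
    return n ^ (n >> 1)

def bits_changed(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")

# Print the 4-bit Gray code in order.
for n in range(16):
    print(f"{n:2d}: {gray(n):04b}")

# Ordinary binary counting can flip many bits at once (10111 -> 11000) ...
print(bits_changed(0b10111, 0b11000))
# ... while consecutive Gray code words always differ in exactly one bit.
print(all(bits_changed(gray(n), gray(n + 1)) == 1 for n in range(15)))
```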
Lecture Examples
• A simple homogeneous first order linear differential equation: $y' - 2y = 0$ has solution $y = Ae^{2x}$.
• A homogeneous linear differential equation whose characteristic equation has two real roots: $y'' + y' - 2y = 0$ has solution $y = Ae^x + Be^{-2x}$.
[Figure: graphs of several solutions, y against x.]
• A homogeneous linear differential equation whose characteristic equation has one real root: $y'' - 4y' + 4y = 0$ has solution $y = Ae^{2x} + Bxe^{2x}$.
[Figure: graphs of several solutions, y against x.]
• A homogeneous linear differential equation whose characteristic equation has no real root: $y'' + 2y' + 2y = 0$ has solution $y = Ae^{-x} \cos x + Be^{-x} \sin x$.
[Figure: graphs of several solutions, y against x.]
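The three second-order cases in these examples differ only in the sign of the discriminant of the characteristic equation. As a rough sketch (assuming the equation is written as y'' + a y' + b y = 0 with real constants a and b; the function name and output format are illustrative), the form of the general solution can be produced as follows.

```python
import math

def general_solution(a, b):
    """Form of the general solution of y'' + a y' + b y = 0,
    read off from the roots of r^2 + a r + b = 0."""
    disc = a * a - 4 * b
    if disc > 0:                       # two distinct real roots
        r1 = (-a + math.sqrt(disc)) / 2
        r2 = (-a - math.sqrt(disc)) / 2
        return f"y = A e^({r1:g}x) + B e^({r2:g}x)"
    if disc == 0:                      # one repeated real root
        r = -a / 2
        return f"y = A e^({r:g}x) + B x e^({r:g}x)"
    p = -a / 2                         # complex roots p +/- qi
    q = math.sqrt(-disc) / 2
    return f"y = A e^({p:g}x) cos({q:g}x) + B e^({p:g}x) sin({q:g}x)"

print(general_solution(1, -2))
print(general_solution(-4, 4))
print(general_solution(2, 2))
```

The same three branches correspond to the spring discussion in the Lecture Notes: distinct real roots give no oscillation, a repeated root gives critical damping, and complex roots give (damped) oscillation.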
Tech Tips • Use the animation feature on a CAS to represent a family of solutions to a differential equation dynamically. • Program a CAS to generate a Reed-Muller code of any given order.
Group Work 1: Nonhomogeneous Differential Equations
This activity foreshadows material that would be covered in any first semester course on differential equations. The second question will be very difficult for students. Good hints are “The answer is a linear function,” or “The answer is of the form y = ax + b”. If a group finishes early, have them try to find the general solution to $y^{(4)} + y'' = 8$. This question can also be used to replace Problem 4, in case you do not want to assign it. The answer is $y = 4x^2 + Ax + B + C \cos x + D \sin x$.
Close by summing up the technique, or having the students do it: To solve a linear differential equation, we can find a particular solution, and then add it to the general solution of the associated homogeneous differential equation.
Answers
1. $y = Ae^{3x}$
2. $y = -\frac13 x - \frac19$
3. $y = -\frac13 x - \frac19 + Ae^{3x}$
4. Use the linearity of the derivative.
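For instructors who want the linearity argument behind Answer 4 written out, here is one way to phrase it. The operator symbol $L$, the coefficient functions $a_k(x)$, and the right-hand side $g(x)$ are notation introduced only for this sketch; they stand for the left-hand side and right-hand side of whatever linear equation is being considered.

```latex
% L is a linear differential operator with (possibly non-constant) coefficients:
%   L[y] = y^{(n)} + a_{n-1}(x) y^{(n-1)} + \dots + a_1(x) y' + a_0(x) y .
% Suppose L[y_p] = g (particular solution) and L[y_h] = 0 (homogeneous solution).
\begin{align*}
L[y_p + y_h]
  &= (y_p + y_h)^{(n)} + a_{n-1}(x)\,(y_p + y_h)^{(n-1)} + \dots + a_0(x)\,(y_p + y_h) \\
  &= \bigl( y_p^{(n)} + a_{n-1}(x)\, y_p^{(n-1)} + \dots + a_0(x)\, y_p \bigr)
   + \bigl( y_h^{(n)} + a_{n-1}(x)\, y_h^{(n-1)} + \dots + a_0(x)\, y_h \bigr) \\
  &= L[y_p] + L[y_h] = g(x) + 0 = g(x).
\end{align*}
```

The only facts used are that the derivative of a sum is the sum of the derivatives and that multiplication by $a_k(x)$ distributes over addition, so nothing depends on the coefficients being constant.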
Group Work 2 Exercise 34 from the text makes a good group exercise.
Suggested Core Assignment Homogeneous Linear Differential Equations: Exercises 1, 3, 6, 13, 16, 17, 22P Linear Codes: 23, 26, 30, 33P , 38, 39P
Group Work, Section 6.7
Nonhomogeneous Differential Equations
1. This section has taught you how to solve homogeneous linear differential equations with constant coefficients. This tool allows you to take on some formidable looking differential equations! For example, try to solve $y' - 3y = 0$.
2. One nice thing about your new skill is that it has an impressive number of syllables. When you get home from class today, tell a younger sibling or friend this: “Today I solved some homogeneous linear differential equations with constant coefficients.” Check out their reaction! The odd thing is that things get more difficult when we remove syllables. For example, this is what can happen when we take away the word “homogeneous”: $y' - 3y = x$. Find a solution to this differential equation.
3. You have just found one solution to $y' - 3y = x$. It turns out that there are infinitely many. Try to find the general solution. Here’s a hint: see what happens when you add $e^{3x}$ to your solution.
4. We have just discovered a nice technique in general. Consider this truly horrible differential equation:
\[ y''' + (\sin x)\, y'' + (3x + 5)\, y' + \pi y = \ln(|x + 4|) \]
We’re not going to ask you to solve it. I don’t know if we could solve it! I don’t know if even Dr. Poole could solve it! But let’s assume that $y = y_p(x)$ is a solution of this equation. (The ‘p’ stands for ‘particular.’ This is a particular solution of the differential equation, not a general one.) Now we look at the homogeneous equation
\[ y''' + (\sin x)\, y'' + (3x + 5)\, y' + \pi y = 0 \]
Notice that it is the same equation, only with the right-hand side replaced by 0. Prove that if $y = y_h(x)$ is the general solution to the homogeneous equation above, then $y = y_p(x) + y_h(x)$ is a solution to our original equation.
7 Distance and Approximation
7.1
Inner Product Spaces
Suggested Time and Emphasis 1–2 classes. Essential material for a two-semester course; optional for a one-semester course.
Points to Stress 1. The dot product is an example of an inner product, but not the only such example. 2. The definition of an abstract inner product, and basic properties of inner products. 3. Abstract definitions of length, distance, and orthogonality. 4. The generalization of the Pythagorean Theorem.
Drill Questions
1. How are dot products and inner products related?
2. Let a and b be real numbers. Define $\langle a, b \rangle = |a + b|$. Show that $\langle a, b \rangle$ is not an inner product on the real numbers.
Answers
1. A dot product is a particular kind of inner product.
2. Answers will vary.
Discussion Question The way we have defined the dot product gives us $a \cdot b = 0$ if and only if a and b are orthogonal. Can we change “orthogonal” to “parallel” and still have it be a valid inner product? Answer No. Every vector is parallel to itself, so by our definition $\langle a, a \rangle = 0$ for all a. Now choose u and v so that u is not parallel to v. We have $\langle u, v \rangle \neq 0$. But $\|u\| = \|v\| = 0$, which contradicts the Cauchy-Schwarz inequality.
Test Question Given a symmetric matrix A and vectors x and y, define $\langle x, y \rangle = x^T A y$. Why must A be a positive definite matrix for this to be an inner product? Answer
An inner product requires $\langle x, x \rangle > 0$ for all $x \neq 0$. Since A is symmetric, $x^T A x > 0$ for all nonzero x if and only if A is positive definite.
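Lecture Notes
• The text gives an example of the inner product $\langle u, v \rangle = 2u_1 v_1 + 3u_2 v_2$. It also shows how a “circle” in the associated inner product space is an ellipse from the standard perspective. Examine the space determined by $\langle u, v \rangle = u_1 v_1 + 8u_2 v_2$. Discuss the concept of orthogonality, perhaps by showing various pairs of vectors that are orthogonal in this space.
[Figure: the vectors $\begin{bmatrix} 1 \\ 0 \end{bmatrix}$ and $\begin{bmatrix} 0 \\ 1 \end{bmatrix}$, captioned “Orthogonal in both this and the standard inner product space”; the vectors $\begin{bmatrix} 1/\sqrt{2} \\ 1/\sqrt{2} \end{bmatrix}$ and $\begin{bmatrix} -8/\sqrt{65} \\ 1/\sqrt{65} \end{bmatrix}$, captioned “Orthogonal in this inner product space”.]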
• After drawing a 3-4-5 right triangle, ask the students if it is possible to have a triangle with side lengths 3, 4, and 5 that is not a right triangle. Note that the Pythagorean Theorem goes in both directions. If a triangle is a right triangle, then $a^2 + b^2 = c^2$. But it is also true that if $a^2 + b^2 = c^2$ then the triangle must be a right triangle. Now discuss the extension of the theorem to arbitrary inner product spaces.
• If you are going to cover generalized orthogonal projections and the Gram-Schmidt process, do not assume your students remember these topics from a scant few weeks ago. Take the time to review the case of $\mathbb{R}^n$ with the standard dot product before moving to the abstract case, or a case like Example 8.
• Ask the students to come up with an inner product on the space of upper triangular 2 × 2 matrices. One example:
\[ \left\langle \begin{bmatrix} a_{11} & a_{12} \\ 0 & a_{22} \end{bmatrix}, \begin{bmatrix} b_{11} & b_{12} \\ 0 & b_{22} \end{bmatrix} \right\rangle = a_{11} b_{11} + a_{12} b_{12} + a_{22} b_{22} \]
Lecture Example
Consider the space of quadratic polynomials. We define $\langle a_1 x^2 + b_1 x + c_1, \; a_2 x^2 + b_2 x + c_2 \rangle = a_1 a_2 + b_1 b_2 + 3 c_1 c_2$. Then if $u = 3x^2 + 2x + 1$ and $v = x^2 - x - 1$, we have the following:
• $\langle u, v \rangle = -2$
• $\langle u, u \rangle = 16$
• $\|u\| = 4$
• $d(u, v) = 5$
• u and v are not orthogonal
• $\operatorname{proj}_u v = \dfrac{\langle u, v \rangle}{\langle u, u \rangle} u = -\frac38 x^2 - \frac14 x - \frac18$
Note how hard it is to get geometric intuition from an arbitrary space and inner product. u, v, and $\operatorname{proj}_u v$ are graphed below. Do you see an obvious connection?
[Figure: graphs of u, v, and $\operatorname{proj}_u v$, y against x.]
• Demonstration of Cauchy-Schwarz: $|\langle u, v \rangle| = 2$, $\|u\| \, \|v\| = 4\sqrt{5}$
• Demonstration of the Triangle Inequality: $\|u + v\| = \sqrt{17} \approx 4.123$, $\|u\| + \|v\| = 4 + \sqrt{5} \approx 6.236$
• An orthogonal basis of $P_2$ with respect to this inner product: $\{x^2, x, 1\}$
• An orthonormal basis of $P_2$ with respect to this inner product: $\left\{x^2, x, \frac{1}{\sqrt{3}}\right\}$
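The numbers in this example are easy to recompute, which can be a useful exercise when students doubt that a weighted formula really behaves like a dot product. The sketch below stores $ax^2 + bx + c$ as the tuple (a, b, c); the helper names are illustrative, not from the text.

```python
import math
from fractions import Fraction

def inner(p, q):
    """<a1 x^2 + b1 x + c1, a2 x^2 + b2 x + c2> = a1 a2 + b1 b2 + 3 c1 c2."""
    return p[0] * q[0] + p[1] * q[1] + 3 * p[2] * q[2]

def norm(p):
    return math.sqrt(inner(p, p))

def add(p, q):
    return tuple(x + y for x, y in zip(p, q))

def sub(p, q):
    return tuple(x - y for x, y in zip(p, q))

u = (3, 2, 1)     # 3x^2 + 2x + 1
v = (1, -1, -1)   # x^2 - x - 1

# Projection of v onto u, kept exact with fractions.
scale = Fraction(inner(u, v), inner(u, u))
proj = tuple(scale * c for c in u)

print(inner(u, v), inner(u, u), norm(u), norm(sub(u, v)))
print(proj)
print(abs(inner(u, v)) <= norm(u) * norm(v))   # Cauchy-Schwarz
print(norm(add(u, v)) <= norm(u) + norm(v))    # Triangle Inequality
```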
Tech Tip Write code to produce the nth Legendre polynomial.
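One possible solution to this tip is sketched below. It assumes the classical normalization $P_n(1) = 1$ and uses Bonnet's recursion $(k+1)P_{k+1} = (2k+1)\,x\,P_k - k\,P_{k-1}$; polynomials obtained by applying Gram-Schmidt to $\{1, x, x^2, \ldots\}$ with the integral inner product on $[-1, 1]$ differ from these only by scalar factors. Coefficients are stored constant term first, and the function names are illustrative.

```python
from fractions import Fraction

def legendre(n):
    """Coefficients (constant term first) of the nth Legendre polynomial,
    normalized so that P_n(1) = 1."""
    prev, curr = [Fraction(1)], [Fraction(0), Fraction(1)]   # P_0 = 1, P_1 = x
    if n == 0:
        return prev
    for k in range(1, n):
        x_curr = [Fraction(0)] + curr                         # x * P_k
        padded_prev = prev + [Fraction(0)] * (len(x_curr) - len(prev))
        # Bonnet's recursion: (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}
        nxt = [((2 * k + 1) * a - k * b) / (k + 1)
               for a, b in zip(x_curr, padded_prev)]
        prev, curr = curr, nxt
    return curr

def integral_inner(p, q):
    """Exact integral of p(x) q(x) over [-1, 1]."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:          # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

print(legendre(2))   # (3x^2 - 1)/2
print(legendre(3))   # (5x^3 - 3x)/2
print(integral_inner(legendre(2), legendre(3)))
```

The second helper lets students confirm orthogonality with respect to the integral inner product without any numerical integration.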
Group Work Exercises 39 and 40 from the text make excellent group activities.
Suggested Core Assignment Exercises 1, 3, 5, 6, 10, 12, 14, 18, 20, 22, 27, 28, 35P , 38, 42
7.2
Norms and Distance Functions
Suggested Time and Emphasis 1–2 classes. Essential material for a two-semester course; optional for a one-semester course.
Points to Stress 1. The definitions of $\|\cdot\|_s$, $\|\cdot\|_E$, and $\|\cdot\|_m$. 2. The definitions of matrix norm and compatible matrix norm. 3. The interpretation of Figure 1, the definition of operator norm, and Theorem 3. 4. The description of Jacobi’s method given in Equation 8.
Drill Question If A is a square matrix, what is an easy description of $\|A\|_1$? Answer
Maximum absolute column sum
Discussion Question Let M denote the matrix associated with Jacobi’s method. Can anything be concluded about the condition number of this matrix? Answer Note the form of M given in the text. It is possible that M is singular, in which case cond (M) = ∞. Thus, there is not much that can be said about cond (M) in general.
Test Question If A is symmetric, what can we say about $\|A\|_1$ and $\|A\|_\infty$? Answer
They are equal.
Lecture Notes
• Be sure to distinguish between a vector norm and a matrix norm. Use $\mathbb{R}^2$ as an example to show students that while $\|v\|_m = \max\{|v_1|, \ldots, |v_n|\}$ defines a vector norm on $\mathbb{R}^n$, the operation $\|A\|_m = \max_{i,j = 1..n} \{|a_{ij}|\}$ does not define a matrix norm on $M_{nn}$.
• Let $\|v\|$ define a vector norm. What can be said about a matrix norm compatible with $\|v\|$ versus the operator norm induced by $\|v\|$? The operator norm induced by $\|v\|$ is the smallest matrix norm compatible with $\|v\|$. Note that we must be careful when considering compatibility. For example, equip $\mathbb{R}^2$ with the vector norm $\|v\|_s$. We know $\|A\|_1$ is a matrix norm compatible with $\|v\|_s$. Is it true that $\|A\|_\infty$ (defined in this setting to be the maximum absolute row sum) is also compatible with $\|v\|_s$? The answer is no. Consider $M = \begin{bmatrix} 3 & 2 \\ 1 & 4 \end{bmatrix}$. Note that $\|M\|_\infty = 5$ and $\|M\|_1 = 6$. Moreover, $\|M e_2\|_s = 6 > 5 = \|M\|_\infty \|e_2\|_s$.
• Equation 9 asserts that $\|x_{k+1} - x\| \le \|M\| \, \|x_k - x\|$. Certainly $\|M\| < 1$ implies that $\|x_{k+1} - x\| < \|x_k - x\|$, but how does it follow that $\|x_k - x\| \to 0$? Consider the following: $\|x_{k+1} - x\| \le \|M\| \, \|x_k - x\|$ implies that $\|x_{k+1} - x\| \le \|M\|^{k+1} \|x_0 - x\|$. Now what happens to $\|M\|^{k+1}$ as $k \to \infty$?
• Equip $\mathbb{R}^2$ with the norm $\|v\|_p = (|v_1|^p + |v_2|^p)^{1/p}$ for a fixed real number $p \ge 1$. For example, if p = 1, then $\|v\|_p = \|v\|_s$. Let $\|A\|_p$ denote the operator norm on $M_{22}$ induced by $\|v\|_p$. We have a nice formula for $\|A\|_p$ when p = 1. There is also an easy-to-use expression for $\|A\|_p$ in the case p = 2 (we look at this in Group Work 2, “$\|A\|_2$ Guessing”, and examine it in more detail in Section 7.4). Investigate the possibility of a formula for other integer values of p. Note that no such formula is known in general.
• The Cauchy-Schwarz inequality is used in Example 7(b). This is a good place for the students to see Cauchy-Schwarz in action.
• In deriving a matrix form of Jacobi’s method, we are required to invert the diagonal matrix D (the diagonal entries of D are those of the strictly diagonally dominant matrix A). How can we guarantee that D is nonsingular? (Can A have a zero on its diagonal?)
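The first two lecture notes in this section each rest on a small counterexample, and both can be checked in a few lines. In the sketch below the all-ones matrix `J` is an illustrative choice made here (the text does not name a specific matrix for the first note), while `M` and `e2` are the matrix $\begin{bmatrix} 3 & 2 \\ 1 & 4 \end{bmatrix}$ and second standard basis vector from the second note; the function names are likewise illustrative.

```python
def sum_norm(v):
    return sum(abs(x) for x in v)

def max_entry(A):
    """Largest absolute entry of A (a candidate that fails to be a matrix norm)."""
    return max(abs(x) for row in A for x in row)

def norm_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def norm_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def mat_vec(A, v):
    return [sum(row[k] * v[k] for k in range(len(v))) for row in A]

# Max-entry fails the matrix-norm property ||AB|| <= ||A|| ||B||.
J = [[1, 1], [1, 1]]
print(max_entry(mat_mul(J, J)), max_entry(J) * max_entry(J))

# The infinity norm is not compatible with the sum norm.
M = [[3, 2], [1, 4]]
e2 = [0, 1]
print(sum_norm(mat_vec(M, e2)), norm_inf(M) * sum_norm(e2))  # left exceeds right
print(sum_norm(mat_vec(M, e2)), norm_1(M) * sum_norm(e2))    # compatible: left <= right
```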
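Lecture Examples
• Calculations of $\|v\|_s$, $\|v\|_E$, and $\|v\|_m$: Let $v = [1, 2, -13, 5]$. Then $\|v\|_s = 1 + 2 + 13 + 5 = 21$, $\|v\|_E = \sqrt{1 + 4 + 169 + 25} = \sqrt{199} \approx 14.1$, and $\|v\|_m = 13$.
• Calculation of $\|A\|_1$ and $\|A\|_\infty$: Let $A = \begin{bmatrix} 1 & -2 & 3 \\ 4 & -5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$. Then $\|A\|_1 = \max\{12, 15, 18\} = 18$ and $\|A\|_\infty = \max\{6, 15, 24\} = 24$.
• Calculation of cond(A): Let $A = \begin{bmatrix} 1 & -2 & 3 \\ 4 & -5 & 6 \\ 7 & 8 & 9 \end{bmatrix}$. Then
\[ A^{-1} = \begin{bmatrix} -\frac{31}{32} & \frac{7}{16} & \frac{1}{32} \\ \frac{1}{16} & -\frac18 & \frac{1}{16} \\ \frac{67}{96} & -\frac{11}{48} & \frac{1}{32} \end{bmatrix}, \]
so cond(A) with respect to $\|A\|_1$ is
\[ \|A^{-1}\|_1 \, \|A\|_1 = \max\left\{\tfrac{83}{48}, \tfrac{19}{24}, \tfrac18\right\} \cdot 18 = \tfrac{249}{8} \approx 31.1. \]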
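These calculations can be reproduced exactly with rational arithmetic, which avoids the rounding questions that a floating-point CAS session tends to raise. The sketch below is one way to do it in plain Python; the Gauss-Jordan `inverse` helper and the other function names are illustrative, and no attempt is made to handle singular input.

```python
import math
from fractions import Fraction

def sum_norm(v):
    return sum(abs(x) for x in v)

def euclidean_norm(v):
    return math.sqrt(sum(x * x for x in v))

def max_norm(v):
    return max(abs(x) for x in v)

def norm_1(A):
    """Maximum absolute column sum."""
    return max(sum(abs(row[j]) for row in A) for j in range(len(A[0])))

def norm_inf(A):
    """Maximum absolute row sum."""
    return max(sum(abs(x) for x in row) for row in A)

def inverse(A):
    """Inverse by Gauss-Jordan elimination with exact fractions
    (assumes A is square and invertible)."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

v = [1, 2, -13, 5]
A = [[1, -2, 3], [4, -5, 6], [7, 8, 9]]
cond_1 = norm_1(inverse(A)) * norm_1(A)

print(sum_norm(v), euclidean_norm(v), max_norm(v))
print(norm_1(A), norm_inf(A))
print(cond_1, float(cond_1))
```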
Tech Tips • Using a CAS, study $\|A\|_2$ of a 2 × 2 matrix by first parametrically graphing the unit circle, and then graphing the image of this circle under multiplication by A.
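Group Work 1: The Shapes of a Sphere
This activity explores the shape of the unit sphere in $\mathbb{R}^2$ for a continuum of norms. Students may require assistance for some of the more unfamiliar norms.
Answers
1. [Figure: the unit sphere for the sum norm, a diamond with vertices (±1, 0) and (0, ±1).]
2. [Figure: the unit sphere for the Euclidean norm, the unit circle.]
3. [Figure: the unit sphere for the max norm, a square with vertices (±1, ±1).]
4. The shape gradually transforms from a diamond (p = 1) to a circle (p = 2) and then to a square (p = ∞).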
Group Work 2: $\|A\|_2$ Guessing
While not absolutely required, a graphing device is very helpful for this activity. The goal here is to discover an expression for $\|A\|_2$ in the case where A is a symmetric 2 × 2 matrix. The members of each group should compare results in order to reach the conclusion that $\|A\|_2$ is in this case equal to the largest absolute value of an eigenvalue of A (the spectral radius of A). Answers
Answers to Questions 1–4 will vary. 5. $\|A\|_2$ is the maximum of the absolute values of the eigenvalues.
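If no graphing device is available, the experiment can be run numerically instead. The sketch below samples unit vectors (cos t, sin t), records the largest value of ‖Av‖, and compares it with the eigenvalues of a symmetric 2 × 2 matrix obtained from the quadratic formula. The sample matrix and the function names are illustrative choices, not taken from the worksheet.

```python
import math

def max_stretch(A, samples=20000):
    """Approximate the maximum of ||A v|| over unit vectors v = (cos t, sin t)."""
    best = 0.0
    for k in range(samples):
        t = 2 * math.pi * k / samples
        c, s = math.cos(t), math.sin(t)
        v1 = A[0][0] * c + A[0][1] * s
        v2 = A[1][0] * c + A[1][1] * s
        best = max(best, math.hypot(v1, v2))
    return best

def eigenvalues_symmetric(A):
    """Eigenvalues of a symmetric 2x2 matrix [[a, b], [b, d]]."""
    a, b, d = A[0][0], A[0][1], A[1][1]
    mean = (a + d) / 2
    radius = math.hypot((a - d) / 2, b)
    return mean - radius, mean + radius

A = [[1, 2], [2, -3]]        # any symmetric 2x2 matrix will do
low, high = eigenvalues_symmetric(A)
print(max_stretch(A))
print(low, high, max(abs(low), abs(high)))
```

Choosing a matrix whose negative eigenvalue has the larger absolute value, as here, helps students see why the answer involves absolute values rather than simply the largest eigenvalue.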
Suggested Core Assignment Exercises 2, 3, 4, 6, 7, 11, 12, 20, 24, 26, 30, 37, 38
Group Work 1, Section 7.2
The Shapes of a Sphere
In this activity, we will be working exclusively in $\mathbb{R}^2$. The elements of $\mathbb{R}^2$ are vectors; let’s agree to represent vectors as points in the xy-plane. For example, the xy-pair (or point) (1, 1) represents the vector with initial point the origin and terminal point (1, 1). With this understanding, make sketches of the following sets (referred to as unit spheres):
1. $S = \{v \in \mathbb{R}^2 : \|v\|_s = 1\}$
2. $S = \{v \in \mathbb{R}^2 : \|v\|_E = 1\}$
3. $S = \{v \in \mathbb{R}^2 : \|v\|_m = 1\}$
[Figure: blank xy-axes for the sketches.]
4. Let $S_p = \{v \in \mathbb{R}^2 : \|v\|_p = 1\}$ for $p \ge 1$. How does $S_p$ change as p ranges from 1 to ∞?
Group Work 2, Section 7.2
$\|A\|_2$ Guessing
Choose a symmetric 2 × 2 matrix A. To compute $\|A\|_2$ we will need to know the maximum stretching factor of A when applied to the unit vectors.
1. Explain why every unit vector in $\mathbb{R}^2$ can be written as $\begin{bmatrix} \cos t \\ \sin t \end{bmatrix}$ for some $t \in [0, 2\pi)$.
2. For $t \in [0, 2\pi)$, let $v = \begin{bmatrix} \cos t \\ \sin t \end{bmatrix}$ and compute Av. Your answer should be a 2 × 1 matrix where each entry is a function of t. Let $v_1(t)$ denote the (1, 1) entry of Av and $v_2(t)$ denote the (2, 1) entry of Av.
3. Let $f(t) = \sqrt{v_1(t)^2 + v_2(t)^2}$. Find the approximate maximum value of this function on $[0, 2\pi)$. This is a good place to use a graphing device!
4. The maximum value you found in Question 3 is the value of $\|A\|_2$. Now find the eigenvalues of A.
5. What is the relationship between the eigenvalues of A and $\|A\|_2$?
7.3
Least Squares Approximation
Suggested Time and Emphasis 1–1½ classes. Recommended material.
Points to Stress 1. Definition of the best approximation. 2. Definition of a least squares approximation and the connection with best approximation. 3. Solution technique for a least squares problem. 4. The pseudoinverse.
Drill Question Does the system Ax = b always have a least squares solution? Answer Yes
Discussion Question Can the least squares method be used to fit a polynomial of any given degree through a data set? Answer The least squares method can fit a polynomial of degree n through a data set of at least n+1 points.
Test Question How is $A^+$, the pseudoinverse of A, connected with the least squares solution to the problem Ax = b? Answer
The least squares solution is given by $A^+ b$.
Lecture Notes
• Discussions involving the existence and uniqueness of least squares solutions help to emphasize the connection between best approximation and solving a least squares problem Ax = b. Let V denote the space spanned by the columns of A; the vector $\operatorname{proj}_V(b)$ is the best approximation to b from V. This vector always exists and is unique (in our inner-product space setting there is always a single best approximation to b from V, as per Theorem 1). Since $\operatorname{proj}_V(b)$ belongs to the column space of A, it is clear that this vector can be written in the form Ax for some vector x. Thus, as stated in Theorem 2, Ax = b always has at least one least squares solution. Can there be more than one? Yes, despite the uniqueness of $\operatorname{proj}_V(b)$. This is demonstrated in the Group Work.
• Note that the text implicitly assumes uniqueness when defining “the best approximation”. It is possible in a normed space V that there are many “best approximations” to a given $v \in V$ from a subspace W. For example, let $\mathbb{R}^2$ be normed by $\|v\|_m$. Let W denote the horizontal axis and v = (1, 1). Then every $w \in W$ of the form w = (x, 0), 0 ≤ x ≤ 2, is a best approximation of v.
• Discuss Example 6 in detail. Stress that this problem, which seems quite nonlinear, has in fact a linear solution.
• Use a nonsquare matrix A with independent columns to verify the Penrose conditions of a pseudoinverse given in Theorem 5. Point out how the pseudoinverse is a generalization of the standard inverse. Note that, for such a matrix A, the equation Ax = b will have exact solutions only when b is in col(A).
Lecture Examples
• Solution to a least squares problem: Find the parabola that gives the best least squares approximation to the points (5, 6), (3, 1), (−3, 7), (2, 0), (1, 1). The associated matrices are
\[ A = \begin{bmatrix} 1 & 5 & 25 \\ 1 & 3 & 9 \\ 1 & -3 & 9 \\ 1 & 2 & 4 \\ 1 & 1 & 1 \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} 6 \\ 1 \\ 7 \\ 0 \\ 1 \end{bmatrix}. \]
The solution is $y = \frac{775}{2002} x^2 - \frac{145}{154} x + \frac{113}{143}$.
[Figure: the five data points and the least squares parabola, y against x.]
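• Solution to a least squares problem using QR factorization: Let
\[ A = \begin{bmatrix} 2 & 4 & -1 \\ 3 & 6 & 2 \\ 1 & -1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \quad \text{and} \quad b = \begin{bmatrix} 1 \\ 2 \\ 3 \\ 4 \end{bmatrix}. \]
The corresponding QR factorization of A is
\[ Q = \begin{bmatrix} \frac{2\sqrt{15}}{15} & \frac{\sqrt{102}}{51} & -\frac{4\sqrt{102}}{51} \\ \frac{\sqrt{15}}{5} & \frac{\sqrt{102}}{34} & \frac{5\sqrt{102}}{102} \\ \frac{\sqrt{15}}{15} & -\frac{4\sqrt{102}}{51} & -\frac{\sqrt{102}}{51} \\ \frac{\sqrt{15}}{15} & -\frac{5\sqrt{102}}{102} & \frac{\sqrt{102}}{34} \end{bmatrix}, \qquad R = \begin{bmatrix} \sqrt{15} & \frac{5\sqrt{15}}{3} & \frac{\sqrt{15}}{3} \\ 0 & \frac{\sqrt{102}}{3} & -\frac{\sqrt{102}}{102} \\ 0 & 0 & \frac{7\sqrt{102}}{34} \end{bmatrix}. \]
Then
\[ Q^T b = \begin{bmatrix} \sqrt{15} \\ -\frac{6\sqrt{102}}{17} \\ \frac{4\sqrt{102}}{51} \end{bmatrix}. \]
Thus, we must solve
\[ Rx = \begin{bmatrix} \sqrt{15} \\ -\frac{6\sqrt{102}}{17} \\ \frac{4\sqrt{102}}{51} \end{bmatrix}. \]
This gives
\[ x = \begin{bmatrix} \frac{55}{21} \\ -\frac{22}{21} \\ \frac{8}{21} \end{bmatrix}. \]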
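• Calculation of a pseudoinverse: Let $A = \begin{bmatrix} 1 & 2 \\ 1 & 0 \\ 1 & 0 \end{bmatrix}$. Then $(A^T A)^{-1} = \begin{bmatrix} \frac12 & -\frac14 \\ -\frac14 & \frac38 \end{bmatrix}$, and therefore
\[ A^+ = (A^T A)^{-1} A^T = \begin{bmatrix} 0 & \frac12 & \frac12 \\ \frac12 & -\frac14 & -\frac14 \end{bmatrix}. \]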
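The parabola fit and the pseudoinverse example above both come down to the formula $(A^T A)^{-1} A^T$, which is valid when the columns of A are linearly independent. The sketch below carries out both computations with exact fractions; it is only an illustration of the normal-equations approach (not of the QR route), and the helper names are assumptions made here.

```python
from fractions import Fraction

def transpose(A):
    return [list(col) for col in zip(*A)]

def mat_mul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def inverse(A):
    """Gauss-Jordan inverse with exact fractions (assumes A is invertible)."""
    n = len(A)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(A)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [x / p for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def pseudoinverse(A):
    """(A^T A)^{-1} A^T, valid when the columns of A are independent."""
    At = transpose(A)
    return mat_mul(inverse(mat_mul(At, A)), At)

# Least squares parabola c0 + c1 x + c2 x^2 through the five data points.
points = [(5, 6), (3, 1), (-3, 7), (2, 0), (1, 1)]
A = [[1, x, x * x] for x, _ in points]
b = [[y] for _, y in points]
coeffs = mat_mul(pseudoinverse(A), b)
print(coeffs)                      # [c0], [c1], [c2]

# Pseudoinverse of the 3 x 2 example.
A2 = [[1, 2], [1, 0], [1, 0]]
A2_plus = pseudoinverse(A2)
print(A2_plus)
print(mat_mul(mat_mul(A2, A2_plus), A2) == A2)   # one of the Penrose conditions
```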
Tech Tips • Many CAS’s have high-level commands that make the least squares approximations and QR factorizations into a one-step process. Have your students explore these routines.
Group Work: A Tale of Concrete Sidewalks and Rugged Individualism
This section demonstrates the close connection between best approximation (in an inner-product space) and the least squares approximation problem. However, there is at least one important difference between these two approximations: uniqueness. While there is always, in our setting, a unique best approximation to b from W, there need not be a unique solution to the corresponding least-squares problem.
Answers
1. No
2. Using the Gram-Schmidt process, we find an orthogonal basis for col(A) to be $\{u_1, u_2\}$ with $u_1 = \begin{bmatrix} 2 \\ -1 \\ 2 \end{bmatrix}$ and $u_2 = \begin{bmatrix} -\frac{17}{9} \\ \frac{40}{9} \\ \frac{37}{9} \end{bmatrix}$. Calculating $\operatorname{proj}_V(b)$, where V is spanned by $u_1$ and $u_2$, we find the best approximation to be $\begin{bmatrix} \frac{64}{181} \\ \frac{73}{181} \\ \frac{244}{181} \end{bmatrix}$.
3. y belongs to the column space of A, and Ax represents a linear combination of the columns of A.
4. There are infinitely many solutions x; the reduced row echelon form of $[A \mid y]$ is
\[ \left[ \begin{array}{ccc|c} 1 & 0 & 1 & \frac{47}{181} \\ 0 & 1 & 1 & \frac{30}{181} \\ 0 & 0 & 0 & 0 \end{array} \right]. \]
5. Add a row to A so that the columns are independent, and add a fourth entry to b.
Suggested Core Assignment Exercises 5, 6, 10, 16, 20, 24, 26, 28, 30, 32, 34, 28, 44, 50, 58P
Group Work, Section 7.3
A Tale of Concrete Sidewalks and Rugged Individualism
1. Let $A = \begin{bmatrix} 2 & -1 & 1 \\ -1 & 4 & 3 \\ 2 & 5 & 7 \end{bmatrix}$ and $b = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$. Does b belong to col(A)?
2. Find the best approximation to b from col(A).
3. Denote your answer to the above question by y. Then any vector x such that Ax = y will be a least squares solution. Before trying to find x, explain why such a vector must exist.
4. Now try to solve Ax = y. What do you find?
5. What changes to A and b could be made so that b remains excluded from col(A), but there is a unique solution to the least squares problem?
7.4
The Singular Value Decomposition
Suggested Time and Emphasis 1–2 classes. Optional material.
Points to Stress 1. Definition of singular values. 2. The singular value decomposition. 3. The final version of the Fundamental Theorem of Invertible Matrices. 4. Moore-Penrose inverses and Theorem 6.
Drill Question
How is $\|A\|_2$ related to the singular value decomposition of the matrix A?
Answer
$\|A\|_2$ is the largest singular value of A.
Discussion Question
What is wrong with the following argument? Suppose A is an n × n upper triangular matrix with real eigenvalues $\lambda_1, \ldots, \lambda_n$. Then $A^T$ is lower triangular and has the same set of eigenvalues. Thus, $A^TA$ has eigenvalues $\lambda_1^2, \ldots, \lambda_n^2$, and thus the singular values of $A^TA$ are $\lambda_1, \ldots, \lambda_n$.
Answer
It is not necessarily true that $A^TA$ has eigenvalues $\lambda_1^2, \ldots, \lambda_n^2$.
Test Question
Find the singular values of $A = \begin{bmatrix} a & 0 \\ 0 & -a \end{bmatrix}$.
Answer
$|a|$
Lecture Notes
• Discuss when the singular values of square matrix A are exactly the eigenvalues of A (see also Group Work 2).
• Let $U\Sigma V^T$ be the singular value decomposition of A. Demonstrate that the orthogonal matrices U and V have, respectively, simple relationships with $AA^T$ and $A^TA$. Note that $AA^T = U\Sigma V^T V\Sigma^T U^T = U\Sigma\Sigma^T U^T$, where $\Sigma\Sigma^T$ is a diagonal matrix with main diagonal entries $\sigma_1^2, \ldots, \sigma_r^2$. These entries are the nonzero eigenvalues of $AA^T$ and thus we see that the columns of U must be orthonormal eigenvectors of $AA^T$. Similarly, $A^TA = V\Sigma^T U^T U\Sigma V^T = V\Sigma^T\Sigma V^T$, and thus the columns of V are normalized eigenvectors of $A^TA$.
• The singular value decomposition of A can be used to describe the effect of multiplication by A on the unit sphere. This is well illustrated by Figure 4.
• Illustrate Theorem 2 by connecting the result back to the spectral decomposition of Section 5.4.
Chapter 7 Distance and Approximation
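Lecture Examples
• The singular value decomposition of a matrix:
Let $A = \begin{bmatrix} 3 & 0 \\ 0 & -5 \\ 0 & 0 \end{bmatrix}$. Then the singular value decomposition of A is
$$A = \begin{bmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} 5 & 0 \\ 0 & 3 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}$$
• The outer product form of the singular value decomposition of a matrix:
Let $A = \begin{bmatrix} 3 & 0 \\ 0 & -5 \\ 0 & 0 \end{bmatrix}$. Then by applying the outer product form of the singular value decomposition, we have
$$A = 5 \begin{bmatrix} 0 \\ -1 \\ 0 \end{bmatrix} \begin{bmatrix} 0 & 1 \end{bmatrix} + 3 \begin{bmatrix} -1 \\ 0 \\ 0 \end{bmatrix} \begin{bmatrix} -1 & 0 \end{bmatrix}$$
• The pseudoinverse of a matrix:
Let $A = \begin{bmatrix} 3 & 0 \\ 0 & -5 \\ 0 & 0 \end{bmatrix}$. Then
$$A^+ = V\Sigma^+U^T = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} \frac{1}{5} & 0 & 0 \\ 0 & \frac{1}{3} & 0 \end{bmatrix} \begin{bmatrix} 0 & -1 & 0 \\ -1 & 0 & 0 \\ 0 & 0 & 1 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} & 0 & 0 \\ 0 & -\frac{1}{5} & 0 \end{bmatrix}$$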
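• A minimum length least squares solution:
Let $A = \begin{bmatrix} 3 & 0 \\ 0 & -5 \\ 0 & 0 \end{bmatrix}$ and $b = \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix}$. The minimum length least squares solution to Ax = b is
$$x = A^+b = \begin{bmatrix} \frac{1}{3} & 0 & 0 \\ 0 & -\frac{1}{5} & 0 \end{bmatrix} \begin{bmatrix} 1 \\ 2 \\ 3 \end{bmatrix} = \begin{bmatrix} \frac{1}{3} \\ -\frac{2}{5} \end{bmatrix}$$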
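The lecture examples above can be verified by multiplying out the factors. The sketch below assumes the factors U, Σ, and V^T exactly as listed in the first example and uses plain Python lists with exact fractions; `matmul` and `transpose` are illustrative helpers, not library functions.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def transpose(M):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*M)]

# Factors of the singular value decomposition A = U * Sigma * V^T.
U = [[0, -1, 0], [-1, 0, 0], [0, 0, 1]]
Sigma = [[5, 0], [0, 3], [0, 0]]
Vt = [[0, 1], [-1, 0]]

A = matmul(matmul(U, Sigma), Vt)

# Sigma^+ transposes Sigma and inverts each nonzero singular value.
Sigma_plus = [[Fraction(1, 5), 0, 0], [0, Fraction(1, 3), 0]]
A_plus = matmul(matmul(transpose(Vt), Sigma_plus), transpose(U))

# Minimum length least squares solution x = A^+ b for b = (1, 2, 3).
b = [[1], [2], [3]]
x = matmul(A_plus, b)

print(A)        # rows (3, 0), (0, -5), (0, 0)
print(A_plus)   # rows (1/3, 0, 0), (0, -1/5, 0)
print(x)        # column (1/3, -2/5)
```

The third entry of b has no effect on x because the third row of A is zero, which is a useful point to raise in class.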
Tech Tips • Use a CAS to graph the two- or three-dimensional image of the unit sphere under multiplication by the singular value decomposition of a matrix A.
Group Work 1: Revisiting Non-Uniqueness in Least Squares
In this activity we re-examine the problem from the Group Work of Section 7.3: students will express the set of all solutions to a particular least squares problem and then use scaling to find the minimum length solution. We compare this result to the pseudoinverse approach.
Answers
1. No, because b does not belong to col(A).
2. The normal equations for the solution are $A^TAx = A^Tb$. This yields $x_1 = \frac{47}{181} - s$, $x_2 = \frac{30}{181} - s$, $x_3 = s$.
3. A solution is of the form $v = \begin{bmatrix} \frac{47}{181} - s \\ \frac{30}{181} - s \\ s \end{bmatrix}$, and $\|v\|^2 = \left(s - \frac{47}{181}\right)^2 + \left(s - \frac{30}{181}\right)^2 + s^2$. This quantity is minimized at $s = \frac{77}{543}$. Thus the v we are looking for is $v_0 = \begin{bmatrix} \frac{64}{543} \\ \frac{13}{543} \\ \frac{77}{543} \end{bmatrix}$, with minimum length approximately 0.18594.
4. It does agree.
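The minimization in Answer 3 can be sketched in a few lines. This assumes the solution set has the form p + s·d with particular solution p = (47/181, 30/181, 0) and direction d = (−1, −1, 1), as in Answer 2; the variable names are illustrative.

```python
from fractions import Fraction
import math

def dot(u, v):
    """Euclidean dot product of two vectors."""
    return sum(a * b for a, b in zip(u, v))

# Least squares solutions are v(s) = p + s * d.
p = [Fraction(47, 181), Fraction(30, 181), Fraction(0)]
d = [Fraction(-1), Fraction(-1), Fraction(1)]

# ||p + s d||^2 is a quadratic in s; setting its derivative to zero
# gives s = -(p . d) / (d . d).
s = -dot(p, d) / dot(d, d)
v0 = [pi + s * di for pi, di in zip(p, d)]
length = math.sqrt(float(dot(v0, v0)))

print(s)       # 77/543
print(v0)      # (64/543, 13/543, 77/543)
print(length)  # about 0.18594
```

The minimizer is the point of the solution line that is orthogonal to d, so `dot(v0, d)` is zero; this connects the calculus argument to the projection viewpoint of the chapter.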
Group Work 2: Condition, Rinse, Repeat
In this activity we see that a fundamental property of the eigenvalues of a square matrix A fails to hold for the singular values of the same matrix.
Answers
1. It is symmetric.
2. $3$, $3 + \sqrt{2}$, $3 - \sqrt{2}$
3. $A^TA = A^2$, and thus the eigenvalues of $A^TA$ are the eigenvalues of $A^2$, which are the squares of the eigenvalues of A.
4. $3$, $3 + \sqrt{2}$, $3 - \sqrt{2}$
5. 5.011, 4.355, 0.962; these are not the same as the eigenvalues.
6. No. Consider the singular values of $MAM^{-1}$ where $M = \begin{bmatrix} 1 & 2 & 4 \\ 2 & 2 & 2 \\ 4 & 2 & 2 \end{bmatrix}$.
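One way to see that similarity changes the singular values without computing them is to compare Frobenius norms, since the sum of the squared singular values of a matrix equals the sum of the squares of its entries. The sketch below assumes the matrices A and M from the handout (M with rows (1, 0, 4), (2, 2, 1), (4, 2, 2)); the Gauss-Jordan `inverse` helper is illustrative.

```python
from fractions import Fraction

def matmul(X, Y):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)] for row in X]

def inverse(M):
    """Invert a square matrix by Gauss-Jordan elimination with exact fractions."""
    n = len(M)
    aug = [[Fraction(x) for x in row] + [Fraction(int(i == j)) for j in range(n)]
           for i, row in enumerate(M)]
    for col in range(n):
        # Find a row with a nonzero pivot and swap it into place.
        pivot = next(r for r in range(col, n) if aug[r][col] != 0)
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pv = aug[col][col]
        aug[col] = [x / pv for x in aug[col]]
        # Clear the rest of the column.
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def frobenius_squared(M):
    """Sum of squared entries, which equals the sum of squared singular values."""
    return sum(x * x for row in M for x in row)

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

A = [[2, 0, 1], [0, 3, 0], [1, 0, 4]]
M = [[1, 0, 4], [2, 2, 1], [4, 2, 2]]
A1 = matmul(matmul(M, A), inverse(M))

print(A1)
print(trace(A), trace(A1))                          # equal: similar matrices share eigenvalues
print(frobenius_squared(A), frobenius_squared(A1))  # different: singular values changed
```

For the symmetric matrix A the two sums agree with the eigenvalues (9 + 11 + 6√2 + 11 − 6√2 = 31), while for A1 the sum of squared entries is larger, so its singular values cannot coincide with its eigenvalues.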
Suggested Core Assignment Exercises 2, 8, 10, 14, 18, 22, 27P , 38, 42, 47, 52P , 58
Group Work 1, Section 7.4
Revisiting Non-Uniqueness in Least Squares
1. Let $A = \begin{bmatrix} 2 & -1 & 1 \\ -1 & 4 & 3 \\ 2 & 5 & 7 \end{bmatrix}$ and $b = \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix}$. Does Ax = b have an exact solution?
2. Find the set of all least squares solutions to Ax = b.
3. Find, from your set above, the element of minimum Euclidean norm.
4. Now use the Moore-Penrose inverse as in Theorem 6 to find the minimum length least squares solution to
Ax = b. Your answer should agree with your answer to the above question.
Group Work 2, Section 7.4
Condition, Rinse, Repeat
1. Let $A = \begin{bmatrix} 2 & 0 & 1 \\ 0 & 3 & 0 \\ 1 & 0 & 4 \end{bmatrix}$. What important characteristic about A do you notice?
2. Compute the eigenvalues of A.
3. Verify that the singular values of A are identical to the eigenvalues of A. Explain why this is so.
4. Let $M = \begin{bmatrix} 1 & 0 & 4 \\ 2 & 2 & 1 \\ 4 & 2 & 2 \end{bmatrix}$. Compute $A_1 = MAM^{-1}$. What are the eigenvalues of $A_1$?
5. Compute the singular values of $A_1$. What do you notice about these values compared to the eigenvalues of $A_1$?
6. If M is symmetric, does that guarantee that the singular values of A and $MAM^{-1}$ agree?
7.5
Applications
Suggested Time and Emphasis 1–2 classes. Optional material.
Points to Stress We suggest doing one application in depth rather than trying to touch on all of them. 1. Function approximations. 2. Error-correcting codes.
Drill Questions
1. What is the difference in the result of approximating $e^x$ by a quadratic function, as done in this section, and approximating it by a Taylor polynomial of order 2?
2. Consider a code with 16 distinct code vectors. If the minimum Hamming distance between any pair of vectors is 3, how many errors can this code detect?
Answers
1. The Taylor polynomial is the best fit at a point. This section gives the best overall fit on an interval.
2. Up to 2 errors
Discussion Question
The function approximations subsection assumed a particular norm on continuous functions. Why is that norm a good one to use in the context of approximating functions by other functions? What would happen if we used a different norm? Notice that the text takes care to specify that we are assuming that we are in the realm of inner product spaces. Where is that used in the text?
Test Questions
1. Find the best linear approximation to $f(x) = x^2$ on the interval $[-1, 1]$.
2. If the minimum Hamming distance between code vectors in a binary code is 25, how many errors can it detect? How many errors can it correct?
Answers
1. $f(x) \approx \frac{1}{3}$
2. 24, 12
Lecture Notes
• Tie this material to the idea of Taylor approximations in calculus. In calculus we had the advantage of approximating at a point, so we were able to merely make the function and the approximation match at the point, and then add as many derivatives as we liked. Here, we are working in a different ballpark, trying to make the functions match as closely as possible on infinitely many points. It turns out to be possible to write Taylor polynomials as a kind of projection, but that is outside the scope of this course.
• Fourier approximations are a lot of fun. They allow you to take something like an instant of music (which is a very complicated wave) and approximate it by a set of constants. This is (in essence) how CD players work. A CD contains discrete data like Fourier coefficients, and the CD player reconstitutes them into sound waves. Also, the difference between a pure tone, like that of a tuning fork, and a complex tone, like a note on a violin, is overtones. Adding overtones is very similar to adding terms in a Fourier series.
• Do a simple example of a Fourier polynomial. For example, we can model this periodic series of parabolas:
$$f(x) = \begin{cases} \qquad \vdots \\ (x + 4\pi)^2 & \text{for } -5\pi < x < -3\pi \\ (x + 2\pi)^2 & \text{for } -3\pi < x < -\pi \\ x^2 & \text{for } -\pi < x < \pi \\ (x - 2\pi)^2 & \text{for } \pi < x < 3\pi \\ (x - 4\pi)^2 & \text{for } 3\pi < x < 5\pi \\ \qquad \vdots \end{cases}$$
[Figure: graph of $y = f(x)$ for $-4\pi \le x \le 4\pi$]
Due to symmetry, the coefficients of the sine function are all zero, and the cosine coefficients are $a_0 = \frac{\pi^2}{3}$, $a_1 = -4$, $a_2 = 1$, $a_3 = -\frac{4}{9}$, and $a_4 = \frac{1}{4}$:
$$g(x) = \frac{\pi^2}{3} - 4\cos x + \cos 2x - \frac{4}{9}\cos 3x + \frac{1}{4}\cos 4x$$
[Figure: graph of $y = g(x)$ for $-4\pi \le x \le 4\pi$]
• Note the difference between an error-detecting code and an error-correcting code. In general, code vectors do not have to be as long for error-detecting codes as they do for error-correcting codes. Intuitively, this is because an error-correcting code needs to have more information: it has to convey that an error has occurred and also convey the error’s position. Mathematically, it is because vectors in an error-correcting code have to be much farther apart (measuring with the Hamming distance).
• Point out how clever the Reed-Muller code actually is. There are 64 code vectors, meaning that 64 shades of gray can be transmitted. There can be as many as 15 errors in the transmission of each code vector, and they will be detected. If there are fewer than eight errors in the transmission, not only can they be detected, but they can be corrected.
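The Fourier coefficients quoted in the notes above for the periodic parabola can be approximated numerically. The sketch below is one possible check, not the text's method: it assumes the convention that $a_0$ is the average value of f on $[-\pi, \pi]$ and uses a hand-written composite Simpson's rule; the names `simpson` and `fourier_coefficients` are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def fourier_coefficients(f, order):
    """Return (a0, [a1..a_order], [b1..b_order]) for f on [-pi, pi]."""
    lo, hi = -math.pi, math.pi
    a0 = simpson(f, lo, hi) / (2 * math.pi)   # average value of f
    a = [simpson(lambda x, k=k: f(x) * math.cos(k * x), lo, hi) / math.pi
         for k in range(1, order + 1)]
    b = [simpson(lambda x, k=k: f(x) * math.sin(k * x), lo, hi) / math.pi
         for k in range(1, order + 1)]
    return a0, a, b

# One period of the parabola example: f(x) = x^2 on [-pi, pi].
a0, a, b = fourier_coefficients(lambda x: x * x, 4)
print(a0)   # close to pi^2 / 3
print(a)    # close to -4, 1, -4/9, 1/4
print(b)    # close to 0, by symmetry
```

Calling `fourier_coefficients(math.exp, 4)` in the same way gives values close to those listed for $y = e^x$ in the Lecture Examples, which lets students experiment with functions that have no symmetry.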
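Lecture Examples
• Approximation of Functions:
• The best cubic approximation to $f(x) = \sin \pi x$ on $[-1, 1]$:
An orthogonal basis is $\left\{1,\ x,\ x^2 - \frac{1}{3},\ x^3 - \frac{3}{5}x\right\}$. The inner products are
$$\langle 1, \sin \pi x \rangle = 0 \qquad \langle x, \sin \pi x \rangle = \frac{2}{\pi} \qquad \left\langle x^2 - \tfrac{1}{3}, \sin \pi x \right\rangle = 0 \qquad \left\langle x^3 - \tfrac{3}{5}x, \sin \pi x \right\rangle = \frac{4\left(\pi^2 - 15\right)}{5\pi^3}$$
$$\langle 1, 1 \rangle = 2 \qquad \langle x, x \rangle = \frac{2}{3} \qquad \left\langle x^2 - \tfrac{1}{3}, x^2 - \tfrac{1}{3} \right\rangle = \frac{8}{45} \qquad \left\langle x^3 - \tfrac{3}{5}x, x^3 - \tfrac{3}{5}x \right\rangle = \frac{8}{175}$$
The linear approximation is $L(x) = \frac{3}{\pi}x$ and the cubic approximation is
$$C(x) = \frac{3}{\pi}x + \frac{35\left(\pi^2 - 15\right)}{2\pi^3}\left(x^3 - \frac{3}{5}x\right).$$
[Figures: graphs of $y = \sin \pi x$, $y = L(x)$, and $y = C(x)$ on $[-1, 1]$ and on $[-2, 2]$]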
• A Fourier polynomial for $y = e^x$ on $[-\pi, \pi]$:
$a_0 \approx 3.676$
$a_1 \approx -3.676$, $b_1 \approx 3.676$
$a_2 \approx 1.470$, $b_2 \approx -2.941$
$a_3 \approx -0.735$, $b_3 \approx 2.206$
$a_4 \approx 0.432$, $b_4 \approx -1.730$
[Figures: graphs of the Fourier polynomials for n = 1, 2, 3, 4, and 5; the fifth-degree Fourier polynomial compared with $y = e^x$]
• Minimum Hamming distance: ⎧⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤ ⎡ ⎤⎫ 1 ⎪ 0 1 1 1 0 ⎪ ⎪ ⎪ ⎪ ⎢ 0 ⎥ ⎢ 1 ⎥ ⎢ 0 ⎥ ⎢ 0 ⎥ ⎢ 1 ⎥ ⎢ 1 ⎥⎪ ⎪ ⎪ ⎪ ⎪ ⎢ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎪ ⎪ ⎥ ⎪ ⎪ ⎢ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎪ ⎥ ⎬ ⎨⎢ 0 ⎥ ⎢ 1 ⎥ ⎢ 1 ⎥ ⎢ 0 ⎥ ⎢ 0 ⎥ ⎢ 1 ⎥⎪ ⎥ , ⎢ ⎥ , ⎢ ⎥ , ⎢ ⎥ , ⎢ ⎥ , ⎢ ⎥ , then d = 3. If C = ⎢ ⎢ 0 ⎥ ⎢ 0 ⎥ ⎢ 1 ⎥ ⎢ 1 ⎥ ⎢ 1 ⎥ ⎢ 1 ⎥⎪ ⎪ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥⎪ ⎪ ⎪ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥ ⎢ ⎥⎪ ⎪ ⎪ ⎪ ⎪ ⎪⎣ 0 ⎦ ⎣ 0 ⎦ ⎣ 1 ⎦ ⎣ 0 ⎦ ⎣ 1 ⎦ ⎣ 1 ⎦⎪ ⎪ ⎪ ⎪ ⎪ ⎭ ⎩ 0 1 0 0 0 1 243
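The best cubic approximation to $\sin \pi x$ from the first lecture example can be checked numerically by projecting onto the orthogonal basis. This is a sketch under stated assumptions: the inner product is $\langle f, g \rangle = \int_{-1}^{1} f(x)g(x)\,dx$, approximated here with a hand-written composite Simpson's rule, and the helper names are illustrative.

```python
import math

def simpson(f, a, b, n=2000):
    """Composite Simpson's rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def inner(f, g):
    """Inner product <f, g> = integral of f * g over [-1, 1]."""
    return simpson(lambda x: f(x) * g(x), -1.0, 1.0)

# Orthogonal basis 1, x, x^2 - 1/3, x^3 - (3/5) x on [-1, 1].
basis = [
    lambda x: 1.0,
    lambda x: x,
    lambda x: x ** 2 - 1 / 3,
    lambda x: x ** 3 - 0.6 * x,
]

def target(x):
    return math.sin(math.pi * x)

# Projection coefficients <p, f> / <p, p> for each basis polynomial p.
coeffs = [inner(p, target) / inner(p, p) for p in basis]

def C(x):
    """Best cubic approximation assembled from the projection coefficients."""
    return sum(c * p(x) for c, p in zip(coeffs, basis))

print(coeffs)
print(3 / math.pi, 35 * (math.pi ** 2 - 15) / (2 * math.pi ** 3))
```

The even basis functions receive coefficient zero because $\sin \pi x$ is odd, so only the x and $x^3 - \frac{3}{5}x$ terms survive, matching the closed forms for L(x) and C(x).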
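Group Work 1: Fun with Fourier
The first two pages of this activity should be given before Fourier series are discussed in class. This activity will get students looking at combinations of sine curves, while at the same time foreshadowing the concepts of infinite series and Fourier series.
Answers
1. No
2. $S(x) = \begin{cases} 1 & \text{if } -2\pi \le x < -\pi \text{ or } 0 \le x < \pi \\ -1 & \text{if } -\pi \le x < 0 \text{ or } \pi \le x \le 2\pi \end{cases}$
3. $\sin x$
4. $\sin x + \frac{1}{3}\sin 3x$
5. $n = 5$
6. $\sin x + \frac{\sin 3x}{3} + \frac{\sin 5x}{5} + \frac{\sin 7x}{7} + \frac{\sin 9x}{9} + \frac{\sin 11x}{11} + \frac{\sin 13x}{13} + \frac{\sin 15x}{15} + \frac{\sin 17x}{17} + \frac{\sin 19x}{19}$
[Figure: graph of this approximation for $-2\pi \le x \le 2\pi$, with values near 1 and −1]
7. [Figure: graph of $T(x)$ for $-2\pi \le x \le 2\pi$, with values between $-\pi/2$ and $\pi/2$]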
Group Work 2: A Binary Code
This activity will allow students to see why self-correcting codes have to have large vectors.
Answers
1. It is impossible. If the distance between any v and $\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \end{bmatrix}^T$ is 3, then the distance between v and $\begin{bmatrix} 0 & 0 & 0 & 0 & 0 \end{bmatrix}^T$ is 2.
2. Answers will vary. It is possible to have a set with 6 elements, and there is no set with more than 6. One possible set:
$$\left\{ \begin{bmatrix} 0 \\ 0 \\ 0 \\ 0 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \\ 0 \\ 1 \\ 1 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 1 \\ 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \right\}$$
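Student answers to this activity can be checked quickly by computing the minimum Hamming distance of a proposed code. The sketch below writes each code vector as a bit string and uses the standard facts that a code with minimum distance d detects up to d − 1 errors and corrects up to ⌊(d − 1)/2⌋; the function names are illustrative.

```python
from itertools import combinations

def hamming(u, v):
    """Number of positions in which two equal-length bit strings differ."""
    return sum(a != b for a, b in zip(u, v))

def minimum_distance(code):
    """Smallest Hamming distance between any two distinct code vectors."""
    return min(hamming(u, v) for u, v in combinations(code, 2))

# The six-vector code from Answer 2, one string per vector.
code = ["000000", "100110", "110001", "001011", "011100", "111111"]

d = minimum_distance(code)
detectable = d - 1            # errors that are guaranteed to be detected
correctable = (d - 1) // 2    # errors that are guaranteed to be corrected

print(d, detectable, correctable)
```

Replacing `code` with a student's list shows immediately whether any pair of vectors is closer than 3, which is the usual failure when students try to add a seventh vector.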
Suggested Core Assignment
Approximation of Functions: Exercises 2, 3, 6, 8, 10, 14, 19, 21, 26
Error-Correcting Codes: Exercises 30, 33, 36, 38, 42P
Group Work 1, Section 7.5
Fun with Fourier
The following function S(x) is called a square wave.
[Figure: graph of the square wave $y = S(x)$ for $-2\pi \le x \le 2\pi$, taking the values 1 and −1]
1. Can you find a function on your calculator which has the given graph?
2. Write a formula that has this graph for −2π ≤ x ≤ 2π .
3. Which of the functions sin x and cos x gives a better approximation to S (x)?
4. Select the function from among the following which gives the best approximation to the square wave: $\sin x$, $\sin x + \sin 3x$, $\sin x + \frac{1}{2}\sin 3x$, $\sin x + \frac{1}{3}\sin 3x$, and $\sin x + \frac{1}{4}\sin 3x$.
5. Find the integer n which makes $f(x) = \sin x + \frac{1}{3}\sin 3x + \frac{1}{n}\sin 5x$ the best possible approximation to the square wave.
6. A Fourier approximation of a function is an approximation of the form
F (x) = a0 + a1 cos x + b1 sin x + a2 cos 2x + b2 sin 2x + · · · + an cos nx + bn sin nx
You have just discovered the Fourier approximation to S (x) with three terms. Find the Fourier approximation to S (x) with ten terms, and sketch its graph.
7. The following expressions are Fourier approximations to a different function, T (x):
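$T(x) \approx \sin x$
$T(x) \approx \sin x - \frac{1}{2}\sin 2x$
$T(x) \approx \sin x - \frac{1}{2}\sin 2x + \frac{1}{3}\sin 3x$
$T(x) \approx \sin x - \frac{1}{2}\sin 2x + \frac{1}{3}\sin 3x - \frac{1}{4}\sin 4x$
$T(x) \approx \sin x - \frac{1}{2}\sin 2x + \frac{1}{3}\sin 3x - \frac{1}{4}\sin 4x + \frac{1}{5}\sin 5x$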
Sketch T (x).
Group Work 2, Section 7.5
A Binary Code
We wish to create a binary code with minimum Hamming distance 3 that can correct one error.
1. For practical reasons, we want the vectors to be elements of $\mathbb{R}^5$ and we want $\begin{bmatrix} 1 & 1 & 1 & 1 & 1 \end{bmatrix}^T$ and $\begin{bmatrix} 0 & 0 & 0 & 0 & 0 \end{bmatrix}^T$ to be in the set. Create a code with as many vectors as possible (at least four) or explain why it is not possible.
2. Now assume we want the vectors to be elements of $\mathbb{R}^6$ and we want $\begin{bmatrix} 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix}^T$ and $\begin{bmatrix} 1 & 1 & 1 & 1 & 1 & 1 \end{bmatrix}^T$ to be in the code. Find a code with as many vectors as possible, again making sure that the minimum Hamming distance between any pair of vectors is 3.