Real-World Problems for Secondary School Mathematics Students
Case Studies

Edited by

Juergen Maasz, University of Linz, Austria
John O'Donoghue, University of Limerick, Ireland

SENSE PUBLISHERS
ROTTERDAM / BOSTON / TAIPEI
A C.I.P. record for this book is available from the Library of Congress.
ISBN: 978-94-6091-541-3 (paperback)
ISBN: 978-94-6091-542-0 (hardback)
ISBN: 978-94-6091-543-7 (e-book)
Published by: Sense Publishers, P.O. Box 21858, 3001 AW Rotterdam, The Netherlands www.sensepublishers.com
Printed on acid-free paper

Cover image: The Living Bridge, University of Limerick. © Patrick Johnson, 2008.
"The Living Bridge – An Droichead Beo" is the longest pedestrian bridge in Ireland and links both sides of the University of Limerick's campus across the river Shannon. The bridge is constructed of 6 equal spans and follows a 350 metre long curved alignment on a 300 metre radius.
All Rights Reserved © 2011 Sense Publishers No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
TABLE OF CONTENTS
Preface

1. Modelling in Probability and Statistics: Key Ideas and Innovative Examples
   Manfred Borovcnik and Ramesh Kapadia
2. Problems for the Secondary Mathematics Classrooms on the Topic of Future Energy Issues
   Astrid Brinkmann and Klaus Brinkmann
3. Coding Theory
   Tim Brophy
4. Travelling to Mars: A Very Long Journey: Mathematical Modelling in Space Travelling
   Jean Charpin
5. Modelling the Storage Capacity of 2D Pixel Mosaics
   Simone Göttlich and Thorsten Sickenberger
6. Mathematics for Problems in the Everyday World
   Günter Graumann
7. Political Polls and Surveys: The Statistics Behind the Headlines
   Ailish Hannigan
8. Correlations between Reality and Modelling: "Dirk Nowitzki Playing for Dallas in the NBA (U.S.A.)"
   Herbert Henning and Benjamin John
9. Exploring the Final Frontier: Using Space Related Problems to Assist in the Teaching of Mathematics
   Patrick Johnson
10. What are the Odds?
    Patrick Johnson and John O'Donoghue
11. Models for Logistic Growth Processes (e.g. Fish Population in a Pond, Number of Mobile Phones within a Given Population)
    Astrid Kubicek
12. Teaching Aspects of School Geometry Using the Popular Games Rugby and Snooker
    Jim Leahy
13. Increasing Turnover? Streamlining Working Conditions? A Possible Way to Optimize Production Processes as a Topic in Mathematics Lessons
    Juergen Maasz
14. Mathematics and Eggs: Does this Topic Make Sense in Education?
    Juergen Maasz and Hans-Stefan Siller
15. Digital Images: Filters and Edge Detection
    Thomas Schiller
16. Modelling and Technology: Modelling in Mathematics Education Meets New Challenges
    Hans-Stefan Siller

List of Contributors
PREFACE
We should start by pointing out that this is not a mathematics textbook – it is an ideas book, full of ideas for teaching real world problems to older students (15 years and older, Upper Secondary level). These contributions by no means exhaust all the possibilities for working with real world problems in mathematics classrooms, but taken as a whole they do provide a rich resource for mathematics teachers that is readily available in a single volume. While many papers offer specific, well worked-out lesson-type ideas, others concentrate on the teacher knowledge needed to introduce real world applications of mathematics into the classroom. We are confident that mathematics teachers who read the book will find a myriad of ways to introduce the material into their classrooms, whether in ways suggested by the contributing authors or in their own ways – perhaps through mini-projects, extended projects, practical sessions or enquiry based learning. We are happy if they do!

Why did we collect and edit them for you, the mathematics teachers? In fact we did not collect them for you but rather for your students! They will enjoy working with them at school. Having fun learning mathematics is a novel idea for many students. Since many students do not enjoy mathematics at school, they often ask: "Why should we learn mathematics?" Solving real world problems is one good answer to this question (and not the only one!). If your students enjoy learning mathematics by solving real world problems, you will enjoy your job as a mathematics teacher more. So in a real sense the collection of examples in this book is for you too.

Using real world problems in mathematics classrooms places extra demands on teachers and students that need to be addressed. We need to consider at least two dimensions related to classroom teaching when we teach real world problems. One is the complexity (intensity or degree) of reality teachers think is appropriate to import into the classroom; the other concerns the methods used to learn and work with real problems. Papers in this collection offer a practical perspective on each dimension, and more.

Solving real world problems often leads to a typical decision situation where you (we hope together with your students) will ask: Should we stop working on our problem now? Do we have enough information to solve the real world problem? These are not typical questions asked in mathematics lessons. What students should learn when they solve real world problems is that an exact calculation is not enough for a good solution. They should learn the whole process of modelling, from the first step of abstracting important information from the complex real world situation, to the next steps of the mathematical modelling process. For example, they should learn to write down equations to describe the situation; do calculations; interpret the results of the calculation; improve the quality of the model; calculate again (several times if needed); and discuss the results with others. Last but not least, they should reflect on the solution process in order to learn for the future.
How real should real world problems be? More realistic problems are generally more complex, and more complex problems demand more time to work them out. On the other hand, a very simplified reality will not motivate students intrinsically to work towards a solution (which is much better for sustained learning). Experience suggests starting with simple problems and simple open questions and moving on to more complex problems. We think it is an impossible task for students without any experience of solving complex real problems to start by solving difficult real problems. It is better to start with a simpler question and add complexity step by step.

The second dimension of classroom teaching is concerned with methods of teaching real world problems. We are convinced that learning and teaching are more successful if you use open methods like group work, project planning, enquiry learning, practical work, and reflection. A lot of real world problems have more than one correct solution, and may in fact have several that are good from different points of view. The different solutions need to be discussed and considered carefully, and this is good for achieving general educational aims like "students should become critical citizens". Students are better prepared for life if they learn how to decide which solution is better in relation to the question and the people who are concerned.

Finally, we would like to counter a typical "No, thank you" argument against teaching real world problems. Yes, you will need more time for this kind of teaching than you need for a typical lesson training students in mathematical skills and operations. Yes, you will need to prepare more intensively for these lessons and be prepared for a lot of activity in your classroom. You will need to change your role from that of a typical teacher at the centre of the classroom, knowing and telling everything, to that of a manager of the learning process who knows how to solve the problem. But you need help to get started! We hope you will use this book as your starter pack. We don't expect you to teach like this every day, but only on occasions during the year. It should be one of your teaching approaches, but not the only one. Try it and you will be happy, because the results will be great for the students and for you!

ACKNOWLEDGEMENTS
We would like to thank all those who made this book possible, especially the many authors who so generously contributed papers. This collaboration – the sharing of insights, expertise and resources – benefits all who engage in an enterprise such as this, and offers potential benefits to many others who may have access to this volume. We are especially pleased to bring a wealth of material and expertise to an English speaking audience which might otherwise have remained unseen and untapped. The editors would like also to record their thanks to their respective organizations who have supported this endeavour, viz. the Institut für Didaktik der Mathematik,
Johannes Kepler University, Linz, and the National Centre for Excellence in Mathematics and Science Teaching and Learning (NCE-MSTL) at the University of Limerick.

Juergen Maasz, University of Linz, Austria
John O'Donoghue, NCE-MSTL, University of Limerick

Limerick and Linz, Autumn 2010
MANFRED BOROVCNIK AND RAMESH KAPADIA
1. MODELLING IN PROBABILITY AND STATISTICS
Key Ideas and Innovative Examples
This chapter explains why modelling in probability is a worthwhile goal to pursue in teaching statistics. The approach will depend on the stage one aims at: secondary schools, or introductory courses at university level in various applied disciplines which cover substantial content in probability and statistics, as this field of mathematics is the key to understanding empirical research. It also depends on the depth to which one wants to explore the mathematical details. Such details may be handled more informally, supported by simulation of properties and animated visualizations to convey the concepts involved. In this way, teaching can focus on the underlying ideas rather than technicalities, and focus on applications.

There are various uses of probability. One is to model random phenomena. Such models have become more and more important as, for example, modern physicists build their theory completely on randomness; risk, too, occurs everywhere – and not only since the financial crisis of 2008. It is thus important to understand what probabilities really mean and the assumptions behind the various distributions – the following sections deal with genuine probabilistic modelling. Another use of probability is to prepare for statistical inference, which has become the standard method of generalising conclusions from limited data; the whole area of empirical research builds on a sound understanding of statistical conclusions going beyond the simple representation of data – sections 6 and 7 will cover ideas behind statistical inference and the role probability plays therein.

We start with innovative examples of probabilistic modelling to whet the appetite of the reader. Several examples are analysed to illustrate the value of probabilistic models; the models are used to choose between several actions to improve the situation according to a goal criterion (e.g., reduce cost). Part of this modelling approach is to search for crucial parameters which strongly influence the result. We then explain the usual approach towards probability – and the sparse role that modelling plays therein – by typical examples, ending with a famous and rather controversial example which led to some heated exchanges between professionals. Indeed, we look at this example (Nowitzki) in some depth as a leitmotiv for the whole chapter: readers may wish to focus on some aspects or omit sections which deal with technical details such as the complete solution. In the third and fourth sections, basic properties of Bernoulli experiments are discussed in order to model and solve the Nowitzki task from the context of sports.
The approach uses fundamental properties of the models, which are not always highlighted as they should be in teaching probability. In the fifth section, the fundamental underlying ideas for a number of probability distributions are developed; this stresses the crucial assumptions for any situation in which the distribution might be applied. A key property is discussed for some important distributions: waiting times, for example, may or may not depend on the time already spent waiting. If they are independent of it, this sheds a special light on the phenomenon which is to be modelled. In a modelling approach, more concepts than usual have to be developed (with the focus on informal mathematical treatment), but the effort is worthwhile, as these concepts allow students to gain a more direct insight into the assumptions which the situation to be modelled is required to fulfil.

In the sixth section, the statistical question – is Nowitzki weaker in away than in home matches? – is dealt with thoroughly. This gives rise to various ways to tackle this question within an inferential framework. We deal informally with the methods that comprise much of what students should know about the statistical comparison of two groups, which forms the core of any introductory course at university for all fields in which data is used to enrich research. While the assumptions should be checked in any case of application, such a crucial test of the assumptions might be difficult. It will be argued that the perspectives of probabilistic and statistical applications are different, and that linking heuristic arguments might be more attractive in the case of statistical inference.

Probability distributions are used to make a probability statement about an event, or to calculate expected values to make a decision between different options. Or they may be used to describe the 'internal structure' of a situation by the model's inherent structure and assumptions. Statistical applications focus on generalizing facts beyond the available data. For that purpose they interpret the given data by probabilistic models. A typical question is whether the data is compatible with a specific hypothesized model. This model, as well as the answer, is interpreted within the context. For example, can we assume that a medical treatment is – according to some rules – better than a placebo treatment?

The final two sections resume the discussion of teaching probability and statistics, some inherent difficulties, and the significance of modelling. Conclusions are drawn and pedagogical suggestions are made.

INNOVATIVE EXAMPLES OF PROBABILISTIC MODELLING
As we shall see in a later section, a key assumption of independence is not justified in many exercises set in probability. Indeed, the key question in any modelling is the extent to which the underlying assumptions are or are not justified. Rather than the usual approach of mechanistic applications of probability, a more realistic picture of potential applications will be developed in this section by some selected innovative examples. They describe a situation from the 'real world' and state a target to improve or optimize. A spectrum of actions or interventions is open for use to improve a criterion such as reduction of costs. A probability distribution is chosen to model the situation, even though the inherent assumptions might not be perfectly fulfilled. A solution is derived and analysed: how does it change due to changes in the parameters involved in the model, and how does it change due to violations of the assumptions? Sensitive parameters are identified; this approach offers ways of making the best use of the derived solutions and corroborating the best actions to initiate further investigations to improve the available information. The examples deal with novel applications – blood samples, twin light bulbs, telephone call times, and spam mail. Simulation, spreadsheets and other methods are used, and illustrate the wide range of ideas where probability helps to model and understand a wide and diverse range of situations.

Blood Samples Modelled with Binomial Probabilities

The following example uses the binomial model for answering a typical question which might be studied. The 'outcome' might be improved even if the model has some drawbacks: the cost situation is improved and hints for action are drawn from the model used, though the actual cost improvement cannot be directly read off the solution. Crucial parameters that strongly influence the solution are identified, for which one may strive to get more information in the future.

Example 1. Donations of blood have to be examined as to whether they are suitable for further processing or not. This is done in a special laboratory after the simple determination of the blood groups. Each donation is judged – independently of the others – 'contaminated' with a probability p = 0.1 and suitable with the complementary probability q = 0.9.
a. Determine the distribution of the number of non-suitable donations if 3 are drawn at random.
b. 3 units of different donations are taken and mixed. Only then are they examined jointly as to whether they are suitable or not. If one of those mixed was already contaminated, then the others will be contaminated and become useless. One unit has a value of € 50.-. Determine the loss for the various numbers of non-suitable units among those which are drawn and mixed. Remember: If exactly one is non-suitable, then the two others are 'destroyed'.
c. Determine the distribution of the loss as a random variable.
d. Calculate the expected loss if 3 units of blood are mixed in a 'pool'.
e. Testing blood for suitability costs € 25.- per unit tested; the price is independent of the quantity tested. By mixing 3 units in a pool, a sum of € 50.- is saved. With the expected loss from d., does it pay to mix 3 different units, or should one apply the test for suitability separately to each of the blood units?

A solution to this example is presented in a spreadsheet (Figure 1); the criterion for the decision is based on the comparison of potential loss and benefit by pooling; pooling is favourable if and only if:

Expected loss by pooling < Reduction of cost of lab testing.   (1)

All blood donations are assumed to be contaminated with a probability of 0.1, independently of each other. That means we model the selection of blood donations,
which should be combined in a pool and analysed in the laboratory jointly, as if it were 'coin tossing with a success probability of p = 0.1'. While the benefit is a fixed number, the loss has to be modelled by expected values. Comparison (1) yields 25.65 < 50; hence mixing 3 blood units and testing them jointly saves costs.

[Figure 1 shows a spreadsheet with the complete calculation. With X the number of non-suitable units in the pool (modelled as drawing with replacement, X ~ Bin(n = 3, p = 0.1)) and Y the loss caused by units destroyed through pooling:

X = i    P(X = i)    Units destroyed    Loss Y    Contribution to E(Y)
0        0.729       0                  0         0
1        0.243       2                  100       24.30
2        0.027       1                  50        1.35
3        0.001       0                  0         0

Hence E(Y) = 25.65 €. Testing the 3 units separately costs 75 €, testing the pool costs 25 €, a reduction of 50 € – these two figures have to be compared.]

Figure 1. Spreadsheet with a solution to Example 1.
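The spreadsheet computation is short enough to express in a few lines of code. The following sketch – in Python rather than Excel; the variable names are ours – recomputes the loss distribution and the expected loss for a pool of 3 units, and the figures 25.65 and 50 from comparison (1) should reappear:

```python
from math import comb

p, n = 0.1, 3               # probability that a unit is NOT suitable; size of the pool
VALUE, LAB_COST = 50, 25    # value of one unit and cost of one laboratory test, in EUR

expected_loss = 0.0
for i in range(n + 1):      # i = number of non-suitable units in the pool
    prob = comb(n, i) * p**i * (1 - p)**(n - i)   # binomial probability P(X = i)
    destroyed = n - i if i > 0 else 0             # suitable units ruined by mixing
    expected_loss += prob * destroyed * VALUE

reduction = LAB_COST * (n - 1)   # one pooled test instead of n separate tests
print(round(expected_loss, 2), reduction, expected_loss < reduction)
# 25.65 50 True -> pooling the 3 units pays off, cf. comparison (1)
```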
Example 2. Explore the parameters in Example 1.
a. For a probability of p = 0.1 of being contaminated, what is a sensible recommendation for the number of blood units being mixed?
b. How does the recommendation in a. depend on the probability of suitable blood donations?

While a spreadsheet is effective in assisting to solve Example 1, it becomes vital for tackling the open questions in Example 2. Such an analysis of the input parameters gives much more rationale to the questions posed. Of course, there are doubts on whether the suitability of blood donations might be modelled by a Bernoulli process. Even if so, how is one to get trustworthy information about the 'success' probability for suitable blood units? As this is a crucial input, it should be varied and the consequences studied. Also, just mixing a fixed number of blood units gives no clue why such a size of pool should be chosen. It also gives no feeling about how the size makes sense relative to the expected monetary gain or loss connected to the size of the pool which is examined jointly. In a spreadsheet, a slide control for the input parameters p and size n is easily constructed and allows an effective interactive investigation of the consequences (or of the laboratory cost and the value of a unit). From a spreadsheet as in Figure 2, one may read off (q = 1 – p):

Expected net saving (q = 0.9, n = 4) = 26.22.   (2)
This yields an even better expected net saving than in (1). Interactively changing the size n of the pool shows that n = 4 yields the highest expected value in (2), so that combining 4 units is 'optimal' if the proportion of suitable blood units is as high as q = 0.9. With a smaller probability of good blood units, the best size of the pool declines drastically; with q = 0.8, size 2 is the most favourable in cost reduction, and the cost reduction is as low as 9 € per pool of 2 as compared to 26.22 € per 4 units combined into a pool. The reduction of cost per unit is 4.5 € with q = 0.8 and 6.6 € with q = 0.9. There is much more to explore, which will be left to the reader.

[Figure 2 shows the spreadsheet 'Modelling of pooling – exploring the effect of parameter changes', with input cells for the laboratory cost of testing (25), the value of a unit (50), the proportion q of suitable units (0.90) and the size of pool n (4). For each number k of suitable units in the pool it lists P(k), the reduction of test cost by pooling, the cost of units destroyed by pooling and the resulting net saving, yielding an expected net saving of 26.22.]

Figure 2. Spreadsheet to Example 2 – with slide controls for an interactive search.
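What the slide controls do interactively can also be scripted. A sketch under the same assumptions (function and variable names are ours) that computes the expected net saving per pool and searches over the pool size:

```python
from math import comb

def expected_net_saving(q, n, value=50, lab_cost=25):
    """Expected net saving per pool of n units, each suitable with probability q."""
    total = 0.0
    for k in range(n + 1):                       # k = number of suitable units in the pool
        prob = comb(n, k) * q**k * (1 - q)**(n - k)
        reduction = lab_cost * (n - 1)           # one pooled test replaces n single tests
        destroyed = k * value if k < n else 0    # suitable units are lost if any unit is bad
        total += prob * (reduction - destroyed)
    return total

for q in (0.8, 0.9):
    best_n = max(range(2, 11), key=lambda n: expected_net_saving(q, n))
    print(q, best_n, round(expected_net_saving(q, best_n), 2))
# q = 0.9 -> n = 4 with 26.22; q = 0.8 -> n = 2 with 9.0
```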
It is merely an assumption that the units are independent and have the same probability of being contaminated. Nevertheless, the crucial parameter is still the size of such a probability, as the other measurements (of success) are highly sensitive to it. If it were a bit higher than given in the details of the example, pooling would not lead to decreasing costs of testing. If it were considerably smaller, then pools of even bigger size than suggested would lead to considerable savings of money. The monotonic influence becomes clear even if the exact size of the cost decrease cannot be given. Closer monitoring of the quality of the incoming blood donations is wise, to clarify circumstances under which the required assumptions are less justified than usual.

Lifetime of Bulbs Modelled with a Normal Distribution

Example 3. Bulbs are used in a tunnel to brighten it and increase the security of traffic; the bulbs have to be replaced from time to time. The first question is when a single bulb has to be replaced. The second is whether it is possible to find a time when all bulbs may be replaced jointly. The reason for such an action is that one has to send a special repair team into the tunnel and block traffic for the time of the replacement. While costs may be reduced by a complete replacement, the lifetime of the still functioning lamps is lost. The time for replacement is, however, mainly determined by security arguments: with what percentage of bulbs still working is the tunnel still sufficiently lit? Here we assume that the tunnel is no longer secure if over two thirds of the bulbs have failed.
Two systems are compared for their relative costs: single bulbs, and twin bulbs which consist of two single bulbs – the second is switched on when the first fails. Which system is less costly to install? The lifetime of bulbs in hours is modelled by a normal distribution with mean lifetime μ = 1900 and standard deviation σ = 200.
a. What is the probability that a single bulb fails within 2300 hours in service?
b. Determine the time when two thirds of the single bulbs have failed.
c. Determine an adequate model for the lifetime of twin bulbs and – based upon it – the time when two thirds of the twin bulbs have failed. Remark: If independence of failure is assumed, then the lifetime of twin bulbs is also normally distributed with parameters: mean = sum of the single means, variance = sum of the single variances.
d. Assume that at least one third of the lamps in the system have to function for security reasons. The cost of one twin bulb is 2.5 €, a single lamp costs 1 €. The cost of replacing all lamps in the tunnel is 1000 €. For the whole tunnel, 2000 units have to be used to light it sufficiently at the beginning. Relative to such conditions, is it cheaper to install single or twin bulbs?

The solution can be read off from a spreadsheet like Figure 3. The probability in a. of failing before 2300 hours may be found by standardizing this value by the parameters to (2300 – 1900)/200 = 2 and calculating the value of the standard normal Φ(2). Parts b. and c. require us to calculate the ⅔ quantile of the normal distribution, which traditionally is done by using probability tables, or which may be solved directly by a standard function. As a feature of spreadsheets, such a quantile may be determined by a sliding control, which allows us to vary x in the formula P(X ≤ x) until the probability ⅔ is reached in the box. For twin bulbs, first the parameters have to be calculated: μT = 1900 + 1900 and σT = √(200² + 200²). Such a relation between the parameters of single and twin bulbs may be corroborated by simulation – a proof is quite complex. For the final comparison of costs in d., the calculation yields that single bulbs will be exchanged after 1986.1 hours, twin bulbs after 3921.8 hours. The overall costs of the two systems have to be related to unit time. Of course, the lifetime of the bulbs is not really normally distributed. The estimation of the parameters (mean and standard deviation of lifetime) is surrounded by imprecision. As a perfect model, the normal distribution might not serve. However, perceived as a scenario in order to investigate "what happens if …?", it helps to analyse the situation and detect crucial parameters. Thus, it might yield clues for the decision on which system to install. On the basis of such a scenario, we have a clear goal, namely to compare the expected costs per unit time E[C]/h of the two systems (see Figure 3):

E[C]/h (single bulbs) – E[C]/h (twin bulbs) = 1.51 – 1.53 < 0.   (3)

The comparison of expected costs thus gives 1.51 € per unit time for single as opposed to 1.53 € for twin bulbs.
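The quantiles and the cost comparison can be reproduced without probability tables; the sketch below uses Python's statistics.NormalDist in place of the spreadsheet's sliding control (the 2000 units and the 1000 € replacement cost are taken from part d.):

```python
from statistics import NormalDist

single = NormalDist(mu=1900, sigma=200)
twin = NormalDist(mu=3800, sigma=(200**2 + 200**2) ** 0.5)  # sum of two independent bulbs

print(round(single.cdf(2300), 4))              # a. P(single bulb fails within 2300 h) = 0.9772
for name, dist, price in (("single", single, 1.0), ("twin", twin, 2.5)):
    t = dist.inv_cdf(2 / 3)                    # b./c. time when two thirds have failed
    cost_per_hour = (2000 * price + 1000) / t  # d. 2000 units plus 1000 EUR replacement cost
    print(f"{name}: exchange after {t:.1f} h, {cost_per_hour:.2f} EUR per hour")
# single: exchange after 1986.1 h, 1.51 EUR per hour
# twin:   exchange after 3921.8 h, 1.53 EUR per hour
```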
A crucial component is the price of the twin bulbs: a reduction from 2.5 to 2.4 € per twin bulb (which amounts to a reduction of 4%)
changes the decision in favour of twin bulbs. Hence this approach allows students to investigate assumptions, the actual situation and relative costs. It encourages them to use their own prior knowledge of the situation by varying parameters and costs. This gives practice in applying the normal distribution and then investigating consequences of the situation, some of which may be hard to quantify.

[Figure 3 shows the spreadsheet solution with the densities of the lifetimes of single bulbs (μ = 1900, σ = 200) and twin bulbs (μT = 3800, σT = 282.84), the ⅔ quantiles of 1986.1 and 3921.8 hours, and the comparison of cost per unit time:

Light system    Cost per unit    Number    Total cost    Exchange after    € per unit time
Single          1.0              2000      3000          1986.1 h          1.51
Twin bulbs      2.5              2000      6000          3921.8 h          1.53
Variant         2.4              2000      5800          3921.8 h          1.48

(Total cost = cost of the units plus 1000 € for replacing all lamps.)]

Figure 3. Spreadsheet with a solution to Example 3.
In fact, there is a systematic error in the normal distribution, as for a lifetime no negative values are possible. However, for the model used, the probability of values less than 0 amounts to 10⁻²¹, which is negligible for all practical purposes. We will illustrate key ideas behind distributions in section 5; the hazard rate is one of them, which does not belong to the standard repertoire of concepts in probability at lower levels. Hazard is a tendency to fail. However, it is different from propensity (to fail), which relates the tendency to fail to new items only. To look at an analogy: it is folklore that the tendency to fail (die) for human beings differs with age. The older we get, the higher the age-related tendency to die. Again, such a relation is complex to prove. However, a simulation study may be analysed accordingly, and the percentage of failures can be calculated in two ways: one based on all items in the experiment, the other based only on those which are still functioning at a specific point of time. This enhances the idea of an age-related (conditional) risk of failing as compared to a general risk of failing, which implicitly is related to all items exposed to failure in the experiment. The normal distribution may be shown to have an increasing hazard. Such an increasing risk of failing with increasing age is reasonable for the lifetime of bulbs – as engineers know. Thus, even if the normal model may not be interpreted directly by relative frequencies of failures, it may count as a reasonable model for the situation. The exact amount of net saving by installing the optimal system cannot be quantified, but its size may well be read from the scenario. These are the sort of modelling issues to discuss with students in order to enhance their understanding and interest.
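The two ways of calculating the percentage of failures can be made visible with a small simulation. A sketch under the normal lifetime model (the age bands of 100 hours are our choice): the unconditional share divides failures in a band by all items ever installed, while the hazard divides by the items still alive at the start of the band.

```python
import random

random.seed(1)
lifetimes = [random.gauss(1900, 200) for _ in range(100_000)]  # normal lifetime model

N = len(lifetimes)
for t in (1500, 1800, 2100):                    # age bands of width 100 hours
    failures = sum(t <= x < t + 100 for x in lifetimes)
    still_alive = sum(x >= t for x in lifetimes)
    print(t, round(failures / N, 3), round(failures / still_alive, 3))
# the unconditional share of failures rises and then falls again,
# while the hazard (failures among those still alive) keeps increasing with age
```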
Call Times and Cost – the Exponential and Poisson Distributions Combined

Example 4. The length Y of a telephone call is a random variable and can be modelled as exponentially distributed with parameter λ = 0.5, which corresponds to a mean duration of a call of 1/λ = 2. The cost k of a call is a function of its duration y and is given by a fixed amount of 10 for y ≤ 5, and amounts to 2y for y > 5.
a. Determine the expected cost of a telephone call.
b. Calculate the standard deviation (s.d.) of the cost of a telephone call.
c. Determine an interval for the cost of a single call with a probability of 99%.
The number of calls during a month will be modelled by a Poisson distribution with μ = 100; this coincides with an expected count of 100 calls. Determine
d. the probability that the number of phone calls does not exceed 130 per month;
e. a reasonable upper constraint for the maximum cost per month and a reasonable number depicting the risk that such a constraint does not hold – describe how you would proceed to determine such a limit more accurately.

[Figure 4 shows the simulation study 'Cost of single calls and bills of 130 calls'. From 714 simulated calls, the expected cost of a single call is estimated as E ≈ 10.33 with s.d. σ ≈ 1.59 (tricky integrals, or simulation). While the cost of single calls cannot be approximated by a normal distribution (exponential distributions are heavily skewed), the bill for 130 calls – as a sum – may be approximated by the normal distribution (CLT!), though it is still slightly skewed; its parameters are En = n·E = 1342.68 and σn = √n·σ = 18.10, giving a 99% quantile of 1384.78, as compared with a 99% quantile of 1393.78 for the simulated bills. The risk that the number of calls in a period exceeds 130 is about 0.002, which seems 'negligible' (d.). The simulation study is based on only 714 random numbers; it still fluctuates, but would stabilize if more data were generated.]

Figure 4. Spreadsheet with a solution to Example 4.
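The core of this simulation study is easily reproduced in code. A sketch under the stated assumptions (λ = 0.5, the given cost function, bills of 130 calls; sample sizes are our choice):

```python
import random
from statistics import mean, stdev

random.seed(2)

def cost(y):                 # cost of a call of length y, as given in Example 4
    return 10 if y <= 5 else 2 * y

calls = [cost(random.expovariate(0.5)) for _ in range(100_000)]
print(round(mean(calls), 2), round(stdev(calls), 2))   # a./b. roughly 10.33 and 1.59

bills = sorted(sum(cost(random.expovariate(0.5)) for _ in range(130))
               for _ in range(2_000))
print(round(bills[int(0.99 * len(bills))], 1))   # e. simulated 99% bill quantile, near 1390
```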
The models suggested are quite reasonable. However, the analytic difficulties are considerable – even at university level. A solution to part e. may be found from a simulation scenario based on the assumptions. The message of such examples is that not all models can be handled rigorously. The key idea here is to understand the assumptions implicit in the model, judge whether they form a plausible framework for the real situation to be modelled, and interpret the result accordingly.
(4)
A further key idea used in this example is the normal approximation for the cost of the bill of 130 calls for a period as it is the sum of the ‘independent’ single call cost. The histogram of simulated bills ‘confirms’ that to some extent but shows still some skewness, which originates from the highly skewed exponential distribution. To apply the approximation, mean and s.d. of the sum of independent and identically distributed (called iid, a property analogue to (9) later) variables has to be known. An estimate of single-call cost may again be gained from a simulation study as the integrals here are not easy to solve. From single calls Xi to bills of n calls, the following mathematical relation is needed: iid
X i ~ X , Tn := X 1 + ... + X n , then E (Tn ) = n ⋅ E ( X ) , σ (Tn ) = n ⋅ σ ( X ) . (5)
The first part is intuitive – if E(X) is interpreted as fair prize of a game, then Tn describes the win of n repetitions of it; the fair prize of it should be n times the prize of a single game. The second part is harder and has to be investigated further. As it is fundamental, a simulation study can be used to clarify matters. Spam Mail – Revising Probabilities with Bayes’ Formula Example 5. Mails may be ‘ham’ or ‘spam’. To recognize this and build a filter into the mail system, one might scan for words that are contained in hams and in spams; e.g., 30% of all spams contain the word ‘free’, which occurs in hams with a frequency of only 1%. Such a word might therefore be used to discriminate between ham and spam mails. Assume that the basic frequency of spams in a specific mailbox is 10 (30)% 9
BOROVCNIK AND KAPADIA
a. If a mail arrives with the word 'free' in it, what is the probability that it is spam?
b. If a message passes such a mail filter, what is the probability that it is actually spam?
c. Suggest improvements of such a simple spam filter.

The easiest way to find the solution is to re-read all the probabilities as expected frequencies of a suitable number of mails. If we base our thoughts on 1000 mails, we expect 100 (300) to be spam. Of these, we expect 30 (90) to contain the word 'free'. The imaginary data – natural frequencies in the jargon of Gigerenzer (2002) – are in Table 1, from which it is easy to derive answers to the questions: if the mail contains 'free', its conditional probability of being spam is 0.7692 (30/39 with 10% spam overall) and 0.9278 (90/97 with 30% spam). The filter, however, has limited power to discriminate between spam and ham, as a mail which does not contain the word 'free' still has a probability of being spam of 0.0728 (70/961) or 0.2326 (210/903), depending on the overall rate of spam mails.

Table 1. Probabilities derived from fictional numbers – natural frequencies

            10% spam overall               30% spam overall
        'free'   no 'free'    all      'free'   no 'free'    all
Ham        9        891        900        7        693        700
Spam      30         70        100       90        210        300
all       39        961       1000       97        903       1000
This shows that the filter is not suitable for practical use. Furthermore, the conditional probabilities are different for each person. The direction for further investigation seems clear: find more words that separate between ham and spam mails, and let the filter learn from the user, who classifies several mails into these categories. The table with natural (expected) frequencies – sometimes also called a 'statistical village' – delivers the same probabilities as the Bayesian formula, but is easier to understand and much better accepted by laypeople. For further development, the inherent key concept of conditional probability should be made explicit. Updating probabilities according to new incoming information is a basic activity. It has a wide range of applications, such as in medical diagnosis or in court, when indicative knowledge has to be evaluated. The Bayesian formula to solve this example is

P(spam | 'free') = P(spam) · P('free' | spam) / [ P(spam) · P('free' | spam) + P(ham) · P('free' | ham) ].   (6)
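Formula (6) and the natural-frequency table can be cross-checked in a few lines (the function name is ours):

```python
def p_spam_given_word(prior_spam, p_word_spam=0.30, p_word_ham=0.01):
    """Posterior probability of spam for a mail containing the word, formula (6)."""
    spam = prior_spam * p_word_spam
    ham = (1 - prior_spam) * p_word_ham
    return spam / (spam + ham)

print(round(p_spam_given_word(0.10), 4))   # 0.7692, i.e. 30/39 in Table 1
print(round(p_spam_given_word(0.30), 4))   # 0.9278, i.e. 90/97 in Table 1
```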
Gigerenzer (2002) gives several examples of how badly that formula is used by practitioners who apply it quite frequently (without knowing the details, of course). There are many issues to clarify, amongst them the tendency to confuse P(spam|'free') and P('free'|spam). Starting with two-way tables is advisable, as advocated by Gigerenzer (2002); later, the structure of the process should be made explicit to promote the idea of continuously updating knowledge with new evidence.
MODELLING IN PROBABILITY AND STATISTICS
Summary Overall, these four examples show the interplay between modelling and probability distributions, with simplifying assumptions made but then analysed in order for students to develop deeper understanding of the role, probability may play to make decisions more transparent. Probability models are not only part of glass bead games but may partially model situations from reality and contribute to compare alternatives for action rationally. In each case, there is an interesting and key question to explore within a real-life context. Simplifying assumptions are made in order to apply a probability distribution. The results are then explored in order to check the validity of the assumptions and find out the sensitivity of the parameters used. We have only given an outline of the process in each example; a longer time would be needed with students in the classroom or lecture hall. THE USUAL APPROACH TOWARDS TEACHING PROBABILITY
The usual approach towards probability is dominated by games of chance, which per se is not wrong as the concepts stem from such a context, or from early insurance agreements, which are essentially loaded games of chance with the notable exception of symmetry arguments; probabilities that would otherwise arise from symmetry considerations in games are replaced by estimates for the basic (subjective) probabilities. The axiomatic rules of probabilities are usually discussed cursorily and merely used as basic rules to obey when dealing with probabilities. Thus, no detailed proofs of simple properties are done and if done are simplified and backed up by didactical methods like tree diagrams, which may be applied as one deals primarily with finite or countably infinite probability spaces. The link from an axiomatically ‘determined’ probability to distributions is not really established1 – so the many probability distributions develop their own ‘life’. Part of their complexity arises from their multitude. Normally, only a few are dealt with in teaching ‘paradigmatically’. At the secondary stage this is mainly the binomial and normal distributions; in introductory courses at universities hypergeometric, Poisson, or exponential distributions are also added. The various distributions are dealt with rather mechanistically. A short description of an artificial problem is followed by the question for the probability of an event like ‘the number of successes (in a binomial situation) does not exceed a specified value’. Hereby, the context plays a shallow role. Questions of modelling, e.g., what assumptions are necessary to apply the distribution in question, or, in what respect are such requirements fulfilled in the context, are rarely addressed. The following examples illustrate the usual ‘approach’. Example 6 illustrates an attitude towards modelling, which is not rare. Example 6. For the production of specific screws, it is known that ‘on average’ 5% are defective. If 10 screws are packed in a box, what is the probability that one finds two (or, not more than two) defective pieces in a box? The screws could be electric bulbs, or electronic devices, etc; defective may be defined as ‘lasting less than 2000 hours’. The silent assumption in all these examples 11
is: 'model' the selected items as a random sample from the production. Sometimes the model is embodied by the paradigmatic situation of random selection from an urn with two sorts of marbles – marked 1 and 0 – predominantly with (sometimes without) replacement of the drawn marbles. The context is used as a fig leaf to tell different stories for very similar tasks, namely to drill skills in calculating probabilities from the right distribution. Neither a true question to solve, nor alternative models, nor a discussion of the validity of assumptions is involved. No clear view is given of why probabilistic modelling helps to improve one's understanding of the context. The full potential of probability to serve as a means of modelling is missed by such an attitude; see Borovcnik (2011).

Modelling from a Sporting Context – the Nowitzki Task

The following example shows that such restricted views on probability modelling are not bound to single teachers, textbooks, or researchers in educational statistics. The example is taken from a centrally organized final exam in a federal state in Germany, but could have been taken from anywhere else. The required assumptions to solve the problems posed are clarified. This is – at the same time – a fundamental topic in modelling. The question, as we shall see later, aroused fierce controversy.

Example 7 (Nowitzki task). The German professional basketball player Dirk Nowitzki plays in the American professional league NBA. In the season 2006–07 he achieved a success rate of 90.4% in free throws. (For the original task, which was administered in 2008, see Schulministerium NRW, n.d.2)

Probabilistic part. Calculate the probability that he
a. scores exactly 8 points with 10 trials;
b. scores at the most 8 points with 10 trials;
c. is successful in free throws at the most 4 times in a series.

Statistical part. In home matches he scored 267 points with 288 free throws; in away matches the success rate was 231/263. A sports reporter [claimed that] Nowitzki has a considerably lower success rate away. At a significance level of 5%, analyse whether the number of scores in free throws away
a. lies significantly below the 'expected value' for home and away matches;
b. lies significantly below the 'expected value'3 for home matches.

This example will be referred to as the Nowitzki task and will be used extensively below to illustrate the various modelling aspects both in probability and statistics. From the discussion it will become clearer what assumptions we have to rely on and how these are 'fulfilled' differently in probabilistic and statistical applications.

MODELLING THE NOWITZKI TASK
The Nowitzki task (Example 7) has a special history in Germany, as it was an item in a centrally administered exam. The public debate provoked a harsh critique: its probabilistic part was criticized as unsolvable in the form it was posed; its statistical part was disputed as too difficult, and the ministerial solution was 'attacked' for blurring the fundamental difference between 'true' values of parameters of models and estimates thereof. The probabilistic task shows essentially the same features as Example 6; the context could be taken from anywhere, it is arbitrary. The statistical part, however, allows for more discussion of the fundamental problem of empirical research, which has to deal with generalizing results from samples to populations. Various models are compared with respect to how well they model the situation. Here we focus on modelling and then solving the probability part.

Basic Assumptions of Bernoulli Processes

This task is designed to be an exemplar of an application of the binomial distribution. It is worthwhile to repeat the basic features of the model involved. This distribution allows one to model experiments with a fixed number of trials and two outcomes for each repetition of the experiment (trial); one is typically named 'success' and the other 'failure'. The basic assumptions are that
– the probability p of success is the same for all trials, and
– single trials do not 'influence' each other, which means – probabilistically speaking – the trials are independent.
Such assumptions are usually subsumed under the name of Bernoulli experiments (a Bernoulli process of experiments). If the results of the single trials are denoted by random variables X1, ..., Xn with
Xi = 1 (success) or Xi = 0 (failure),   (7)
the random variables have to be independent, i.e.,

P(Xi = xi, Xj = xj) = P(Xi = xi) · P(Xj = xj) for i ≠ j.   (8)
Such a product rule also holds for more than two variables Xi of the process. The assumption is usually denoted by

Xi ~ X ~ B(1, p)  (iid),   (9)

where 'iid' refers to independent, identically distributed random variables Xi.
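The link between assumption (9) and the binomial distribution is easy to make concrete: simulate iid 0/1 trials and tally the successes. A minimal sketch (the value p = 0.904 anticipates the Nowitzki models below):

```python
import random
from collections import Counter
from math import comb

random.seed(5)
n, p, runs = 10, 0.904, 50_000
counts = Counter(sum(random.random() < p for _ in range(n)) for _ in range(runs))

for k in (8, 9, 10):
    simulated = counts[k] / runs
    exact = comb(n, k) * p**k * (1 - p)**(n - k)   # binomial probability B(n, p)
    print(k, round(simulated, 3), round(exact, 3))
# the simulated relative frequencies settle near the binomial probabilities
```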
The model, however, is not uniquely determined by such a Bernoulli process. Still missing is information about the value of p. From the perspective of modelling, the decision on which distribution to apply is only one step towards a model for the situation in question. Usually, the model consists of a family of distributions which differ by the values of one (or more) parameters. The next step is to find values for the parameters, to fix one of the distributions as the model to use for further consideration. How to support the modeller in choosing such a family, like the binomial or normal distributions, is discussed from a modelling perspective in section 5. The step of modelling to get values for the parameters of an already chosen model is outlined here. The process of getting numbers for the parameters (p here) is quite similar for all models. In this section matters are discussed for Bernoulli processes. Some features arise from the possibility of modelling the data as coming from a finite or an infinite population. The parameter p will be called the 'strength' of Nowitzki. The Nowitzki task was meant to go beyond an application within games of chance. The probability of success of a single trial of Nowitzki is not determined by a combinatorial consideration (justified by the symmetry of an experiment). The key idea for getting numbers for the parameters is to incorporate some 'knowledge' or information. The reader is reminded that the problem involves 10 trials, and the task will be treated as an open task to explore rather than just an examination question. So one might, for example, form a hypothesis on the success rate subjectively.

Model 1. The probability p could be determined by a hypothesis like: Nowitzki is equally good as in the last two years, when his success rate was (e.g.) 0.910. Supposing that such a value also holds for the present season, the model is fixed. This value could well be a mere fiction – just to 'develop a scenario' and determine what its consequences would be. On the basis of this model, the random variable

Tn = X1 + ... + Xn ~ B(n = 10, p = 0.910),   (10)
i.e., the (overall) number of successes follows a binomial distribution with parameters 10 and 0.910.

Model 2. The probability p is estimated from the given data, i.e., p̂ = 0.904. In this case, the further calculations are usually based on

Tn ~ B(n = 10, p = 0.904).   (11)
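Under such a model, parts a. and b. of the probabilistic task reduce to binomial arithmetic; a minimal sketch (the printed values are rounded, and the exact figures depend on which model for p is adopted):

```python
from math import comb

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.904
print(round(binom_pmf(8, n, p), 4))                         # a. exactly 8 of 10: ~0.185
print(round(sum(binom_pmf(k, n, p) for k in range(9)), 4))  # b. at most 8 of 10: ~0.248
```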
This is in some sense the 'best' model, as the estimation procedure leading to p̂ = 0.904 fulfils certain optimality criteria like unbiasedness, minimum variance, efficiency, and asymptotic normality.

Model 2 (variant). The approach in model 2 misses the fact that the estimate p̂ is not identical to the 'true'4 value of p. There is some inaccuracy attached to the estimation of p. Thus, it might be better to first derive a (95%) confidence interval for the unknown parameter p, leading to an interval

[pL, pU] = [0.8792, 0.9284],   (12)

and only then calculate the required probabilities in Example 7 with the worst case pL and the best case pU.
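Interval (12) can be reproduced from the season totals of 498 successes in 551 free throws with the standard normal-approximation (Wald) interval – our assumption about how (12) was computed, though other constructions give nearly the same numbers at this sample size:

```python
from statistics import NormalDist

successes, trials = 498, 551
p_hat = successes / trials
z = NormalDist().inv_cdf(0.975)                    # 1.96 for a 95% interval
half = z * (p_hat * (1 - p_hat) / trials) ** 0.5   # Wald standard error
print(round(p_hat - half, 4), round(p_hat + half, 4))   # ~[0.8792, 0.9284], cf. (12)
```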
This procedure leads to a confidence interval for the required probability, reflecting the inaccuracy of the estimate p̂.

Model 3. The value of p is equated (not estimated) to the success rate of the whole season, i.e., p := 0.904. This leads to the same model as in (11) but with a completely different connotation. The probability of success in a free throw may – after the end of the season – be viewed as factually known: the number of successes (498) divided by the number of trials (551). There is no need to view it any longer as unknown and to treat the success rate of 0.904 as an estimate p̂ of an unknown p, which results from an imagined infinite series of Bernoulli trials.

Investigating and Modelling the Unknown Value of p

Three different ways may be pursued to provide the required information for the unknown parameter. With the binomial distribution, one needs information about the value of p, which may be interpreted as the success rate of the Bernoulli process in the background. The information used to fix the parameter has a different connotation in each case, as described here. In due consequence, the models applied inherit part of this meaning. The cases to differentiate are:
i. p is known
ii. p is estimated from data
iii. p is hypothesized from similar situations
With games of chance, symmetries of the involved random experiment allow one to derive a value for p; e.g., ½ for heads in tossing a 'fair' coin – case i. Most applications, however, lack such considerations, and one has to invoke ii. or iii.

i. The assumption of equiprobability for all possible cases (of which one is called 'success') is – beyond games of chance – sometimes more a way of putting it, just to fix a model to work with. For coin tossing, this paves the way to avoid tedious data production (actually tossing the coin quite often) and to work with such a model to derive some consequences on the basis of 'what would be if we suppose the coin to be fair (symmetric)'. Normally, for coins, closer scrutiny would not deviate too much from the presupposition of p = ½ and would yield quite similar results. With the basketball task, the value of p could be known from the past (model 1), which also relies on the further assumption that 'things remain the same'5, i.e., there was no change from past to present. This is an assumption for which a substantial justification is usually lacking – those who do not rely too heavily on the information of the past might react more intelligently in the present.6

ii. Closer to the present situation in Example 7 is to use the latest data available (from the present season). The disadvantage of such a procedure is that the difference between the 'true' value of p and an estimate p̂ might be blurred, thereby forgetting that the estimate is inaccurate. One possibility to deal with this is the variant of model 2. The inaccuracy of estimates is best illustrated by confidence intervals. Varying the size of the underlying samples from which the data stem gives a clearer picture of the influence of this lack of information. An important assumption about the data base from which the estimate is calculated is: it has to be a random sample of the underlying Bernoulli process, which is
essentially the same as the parent Bernoulli process. Clearly, the assumption that a sample is random is rarely fulfilled and is often beyond scrutiny. Usually there are qualitative arguments to back up such an assumption. It is to be noted that the estimation of the probability is interwoven with the two key assumptions of Bernoulli experiments – the same success probability in all trials, and independence between trials. Otherwise, probabilities, such as the probability of success with a free throw, have no meaning.

iii. A hypothesis about the success probability could be corroborated by knowledge about the past, as in i. However, the season is completed and, as a matter of fact, the success rate of Nowitzki was 0.904. Applying the factual knowledge about all games yields a value for the unknown parameter as

p := 0.904,   (13)

which amounts to much more concrete information than the estimation procedure leading to p ≈ p̂ = 0.904. The success probability in (13) may be justified and clarified by the following: As the season is completed, one knows all the data. There will be no more. A new season will have irretrievably different constellations. The success rate in 2006–07 is – as a matter of fact – 0.904. There could well be the question of how to interpret this number and whether it is possible to interpret it as a success probability. The key question is whether it makes sense to interpret this 0.904 as a success probability. This interpretation is bound to the assumption that the data stem from independent repetitions of the same Bernoulli experiment. This requires – taken literally – that for each free throw of Nowitzki the conditions have been exactly the same, independently of each other and independent of the actual score and the previous course of the match, etc. From this point of view, one might question whether the data really are compatible with the prerequisites of a Bernoulli process. One could, e.g., inspect the number of 'runs' (points or failures in a series) and evaluate whether they are above or below the expected value for a Bernoulli process, in order to check the plausibility of its underlying assumptions.

The way in which information is used to get numerical values for the unknown parameter influences the character of the model which is fixed by it. From a modelling perspective, this has deep consequences, as any interpretation of results from the model has to take such 'restrictions' into consideration. If the value of p is thought to be known – either by reference to a symmetric experiment, or by an unambiguous statement like 'from long-standing experience from the past we know p to be 0.910' in the wording of Example 7 – the probabilistic part of the task becomes trivial from a modelling perspective, and a direct application of binomial probabilities is required. The solution may be found either by means of a hand-held calculator or a spreadsheet, or even by old-fashioned probability tables – the answer is straightforward and undisputed. The discussion about the various ways to deal with information about the success rate p might lead to the didactical conclusion that such questions have to be
excluded from a final exam, especially if it is set centrally. The information in such tasks has to be conveyed clearly, and the models have to be precisely and explicitly determined by the very text (not the context) of the task. The question remains – under such circumstances – whether it would still be worthwhile to teach probability if it were reduced to a mere mechanistic application of formulae in such exams. What is interesting is how the process of modelling allows for an answer to the problem, and in what respect such a model misses important features of the situation involved. To choose between various possible models and to critically appreciate the model finally chosen is a worthwhile standard to reach in studying probability. Only in rare cases is there one distinct answer to the problem in question.
The assumptions of a Bernoulli process are not well fulfilled in sports, nor in many other areas where such methods are ‘blindly’ applied. Such assumptions establish (more or less well) a scenario (as opposed to a model that fits the real situation very well), which allows an inspection of the situation on the basis of ‘what would be – if we assume …’. Then, of course, situations have to be set out where such scenarios may deliver suitable orientations despite their lack of fit to the situation (for the idea of a scenario instead of a model, see Borovcnik, 2006).
If p is not known directly, there are various ways to fill the gap of information – the scale ranges from hypotheses of differing credibility to estimates from statistical data of differing relevance (depending on the ‘degree of randomness’ of the sample). Clearly, a true value of p has to be distinguished from an estimate p̂ of p. The whole of inferential statistics is based on a careful discrimination between true parameters and estimates thereof. However, again, matters are not so easy and clear-cut. What may be viewed as a true parameter in one model may be viewed as an estimate in another model – see the ideas developed subsequently.
If the option of a statistical estimate of the unknown parameter is chosen as in (11), then the data have to fulfil the assumption of a random sample – an independent repetition of the same basic experiment yielding each item of data. The accuracy linked to a specific sample may best be judged by a confidence interval as in (12). It might be tempting to reduce the length of such a confidence interval, and so increase the precision of information about the unknown parameter, by increasing the sample size. However, in practice, obtaining more data usually means a lower quality of data; i.e., the data no longer fulfil their fundamental property of being a random sample, which introduces a bias in the data with no obvious way to repair it.
If the option of hypothesizing values for the unknown parameter is chosen as in (10), or in (13), one might have trouble justifying such a hypothesis. In some cases, however, good arguments might be given. For the statistical part of Example 7, when it comes to evaluating whether Nowitzki is better in home than in away matches, a natural hypothesis emerges from the following modelling: split the Bernoulli process for home and away matches with different values pH and pA for p. The assumption of equal strength (home and away) leads to the hypothesis
pA = pH, or pA − pH = 0.   (14)
Analysis of the data is then done under the auspices ‘as if the difference of the success probabilities home and away were zero’. However, it is not straightforward to derive the distribution for the test statistic p̂A − p̂H.

More about Assumptions – A Homogenizing Idea ‘Behind’ the Binomial Distribution

In the context of sports, it is dubious to interpret relative frequencies as a probability and – vice versa – it is difficult to justify estimating an unknown probability by relative frequencies. What is different in the sports context from games of chance, where the idea of relative frequencies emerged? It is the comparability of single trials, the non-dependence of single trials, that is the hinge for transferring the ideas from games to other contexts. For probabilistic considerations such a transfer seems to be more crucial than for a statistical purpose, which focuses on a summary viewpoint.
In theory, the estimation of the success parameter p improves with increasing sample size. Here, this requires combining several seasons together. However, the longer the series – especially in sports – the less plausible are the assumptions of a Bernoulli process. And, if relative frequencies are used to estimate the underlying probabilities, condition (9) of a Bernoulli process has to be met. Only then do the estimates gain in precision as the sample size increases. However, for Nowitzki’s scores, the assumptions have to be questioned. People and the sport change over time, making the assumption of random, independent trials as the basic modelling approach less tenable.
Take the value of 0.904 as an estimate of his current competency to make a point with a free throw – formalized as a probability p. This requires an unrealistic assumption of ‘homogenizing’: that Nowitzki’s capacity was constant for the whole season and independent of all accompanying circumstances – not even influenced by whether, in one game, everything is virtually decided, with his team leading or trailing by a big gap close to the end of the match, or the match is tied and this free throw – the last event in the match – will decide about winning or losing. For statements related to the whole entity of free throws, such a homogenization might be a suitable working hypothesis. Perhaps the deviations from the assumptions balance out over a longer series. To apply results derived on such a basis for the whole entity to a sequence of 10 specific throws and calculate the probability of 8 points at most, however, makes hardly any sense – and even less so if it deals with the last 10 throws of an all-deciding match. To imbue p with a probabilistic sense and to apply the binomial distribution sensibly, one has to invent a scenario like the following: all free throws of the whole season have been recorded by video cameras. Now we randomly select 10 clips and ask: how often does Nowitzki make the point?

‘SOLUTION’ OF THE PROBABILISTIC PART OF THE NOWITZKI TASK
In this section, the solutions are derived from the various models and critically appraised. As the choice of model depends not only on assumptions but also on an
estimation of unknown parameters, the question arises which of the available models to choose – a fundamental issue in modelling. Several models are dealt with in the Nowitzki task and their relative merits are made clear. The methods of determining a good choice for the parameters also convey key features of pedagogy – some knowledge is taken from the context, some will be added by statistical estimation. While for parts a. and b. the results seem straightforward, part c. gives ‘new’ insights. This task was fiercely rejected by some experts as unsolvable. However, by highlighting a fundamental property of Bernoulli series, a solution of part c. becomes easier. If the chosen model is taken seriously, then the modeller is in the same situation as in any game of chance. In such games, the player can start at any time – it does not matter. The player can also eliminate any randomly chosen games without a general change in the result. That is the key idea involved.

Calculation of the Probability of Simple Events – Parts a. and b.

With model 1 and the assumption that p = 0.910, the distribution is given in (10). Using a spreadsheet gives the solution with this specific binomial distribution as set out in Table 2. Model 3 is handled in the same way.
Figure 5. Probability distributions for the number of hits under the various models.
With model 2 and the estimate p̂ = 0.904, the distribution is fixed in (11). Using model 2 (variant), one derives the confidence interval (12) for Nowitzki’s ‘strength’ and uses the binomial distribution with parameters corresponding to the worst and best cases for the playing strength. The distributions for the number of hits in 10 trials are depicted in Figure 5. While models 1 and 2 are similar, there is a huge difference between the best and worst cases in model 2 (variant).

Table 2. Probabilities calculated under the various models
             Model 1     Model 2 (3)7    Model 2 (variant)
             p = 0.910   p̂ = 0.904      Worst case pL   Best case pU
P(T10 = 8)   0.1714      0.1854          0.2345          0.1273
P(T10 ≤ 8)   0.2254      0.2492          0.3449          0.1573
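The table values can be reproduced with a few lines of code. The following is a minimal sketch in Python (assuming SciPy is available); the worst/best cases pL and pU are taken here as the endpoints of an approximate 95% interval for 498/551, which appears to be how the interval (12) was obtained – treat these limits as an assumption, not as the authors’ computation.

```python
from math import sqrt
from scipy.stats import binom

# Reproducing Table 2: binomial probabilities for 10 free throws under the
# various models. pL and pU are assumed to be Wald 95% limits for 498/551.
p_hat = 498 / 551
half = 1.96 * sqrt(p_hat * (1 - p_hat) / 551)
cases = {"model 1": 0.910, "model 2 (3)": p_hat,
         "worst case pL": p_hat - half, "best case pU": p_hat + half}
for name, p in cases.items():
    print(f"{name}: P(T10 = 8) = {binom.pmf(8, 10, p):.4f}, "
          f"P(T10 <= 8) = {binom.cdf(8, 10, p):.4f}")
```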
It is remarkable, and worthy of pedagogical discussion in the classroom, that the solutions differ so much when the assumptions seem to be very similar. The inaccuracy conveyed by the confidence interval (12) on p only reflects a margin of just under 5 percentage points. Nevertheless, the variant of model 2 gives a margin of 0.1573 to 0.3449 for the probability in question b.8 Thus, it is crucial to remember that one has only estimates of the unknown parameter and imperfect knowledge.

The Question ‘Nowitzki Scores at Most Four Times in a Series’

Task c. was disputed in a public discussion in which statisticians were also involved. It was claimed that it cannot be solved without an explicit number of observations given. Suggestions to fix the task are reported and a correct solution using a key property of Bernoulli processes is given below. The following passage is taken from an open letter to the ministry of education (Davies et al, 2008): “The media reported that one part of the [Nowitzki task] was not solvable because the number of trials is missing. This – in fact – is true and therefore several interpretations of the task are admissible, which lead to differing solutions.”
Davies (2009, p. 4) illustrates three ways to cope with the missing number of trials. The first is to take n = 10 as it was used in the first part of the task. This suggestion comes out of ‘student logic’ but leads into an almost intractable combinatorial problem: one has to inspect all 2^10 = 1024 series of 0’s and 1’s (for failures and successes) for whether they contain a single segment of five or more 1’s (indicating the complement of the event in question) or not. A second possibility is to take n = 5, which makes the problem very simple: the only sequence not favourable to the event in question is 1 1 1 1 1. Thus the probability of the complementary series, which one is looking for here, equals
1 − p^5   (15)
and the result is dependent on the chosen model (see Table 3). Again it is surprising that the change from model 2 to its variant reveals such a high degree of imprecision implicit in the task: the required probability is known only within a range of 0.31 to 0.47 if one refers to an estimate of the unknown probability of Nowitzki’s strength. But the reader is reminded that model 3 does not have this problem. Its result coincides with model 2 (without the variant) as the parameter is set to be known by (13).

Table 3. Probabilities of the series ‘less than 5’ with n = 5 under the various models
          Model 1     Model 2 (3)    Model 2 (variant)
          p = 0.910   p̂ = 0.904     Worst case pL   Best case pU
p^5       0.6240      0.6031         0.5253          0.6898
1 − p^5   0.3760      0.3969         0.4747          0.3102
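Table 3 needs nothing beyond a power of p. A quick check in Python (standard library only; pL and pU are the assumed limits from the sketch after Table 2):

```python
# Quick check of Table 3: p^5 and its complement under the various models.
p_hat = 498 / 551
for label, p in [("model 1", 0.910), ("model 2 (3)", p_hat),
                 ("worst case", 0.8792), ("best case", 0.9284)]:
    print(f"{label}: p^5 = {p**5:.4f}, 1 - p^5 = {1 - p**5:.4f}")
```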
The third and final remedy that Davies (2009) offers to fill the gap of the missing number of trials refers to an artificial new gaming situation: “We imagine a game where two players perform free throws. One of the players begins and continues to throw as long as his ball passes correctly through the basket and he scores. If Nowitzki starts this game, what is the probability that he scores at most four times in a series in his first try?” Now, the possible sequences of this game are
0, 10, 110, 1110, 11110.   (16)
The solution emerging from (16) coincides exactly with that where the number of trials is fixed at 5, which numerically was the officially accepted solution (though it was derived without the assumption of n = 5 trials). Extracting the common factor 1 − p from the single probabilities involved in (16), we get the solution by:
(1 + p + p^2 + p^3 + p^4)(1 − p) = 1 − p^5.   (17)
The third attempt to solve task c. without the missing number of trials yields the solution. However, it implies an artificial new gaming situation, which makes things unnecessarily complicated. In fact, the task is solvable without supplementing the missing number of trials and without this artificial game. One only has to remind oneself of what really amounts to a Bernoulli process and which properties are fundamental to such processes. The property in question will – once recalled – lead to a deeper understanding of what Bernoulli processes are. The next section will illustrate this idea.
If one agrees with the assumption (9) of a Bernoulli process for the trials of Nowitzki, then part c. of the probabilistic task is trivial. If the conditions are always the same throughout, then it does not matter when one starts to collect the data. Mathematically speaking: if X1, X2, ... is a Bernoulli process with relation (9), then the following two subprocesses have essentially the same probabilistic features, i.e., they also follow property (9):
– random start of data collection at i0:  X_{i0}, X_{i0+1}, ...  iid as  X ~ B(1, p);   (18a)
– random selection i0, i1, … out of all the data:  X_{i0}, X_{i1}, ...  iid as  X ~ B(1, p).   (18b)
This is a fundamental property of Bernoulli processes in particular and of random samples in general. One may start with the data collection whenever one chooses; therefore, (18a) applies. One can also eliminate some data if the elimination is
undertaken randomly as in (18b). While this key property of Bernoulli processes should be explained intuitively to students, it could also be supported by simulation studies to address ‘intuitive resistance’ from students. Statisticians have coined the term iid variables. One needs to explain to students that each single reading comes from a process that – at each stage (for each single reading) – has a distribution independent of the other stages and which follows an identical (i.e., the same) distribution throughout (hence iid); this deceptively complex idea takes time for students to absorb. It has already been mentioned that in sports such an assumption is doubtful, but the example was put forward with this assumption for modelling, which therefore will not be challenged at this stage.
If it does not matter when we (randomly) start the data collection, we just go to the sports hall and wait for the next free throws. We note whether
– Nowitzki does not score a point more than four times in a series – event A, or
– he succeeds in scoring more than four times in a series – event Ā.
Clearly it holds:
P(Ā) = p^5 · 1 and P(A) = 1 − p^5.   (19)
The term p^5 in (19) stands for the first five 1’s in the complementary event; this probability has to be multiplied by 1 as from the sixth trial onwards the outcome does not matter (and is therefore the certain event). The result coincides with solution (16). However, there is no need to develop this imaginary game with an opponent as Davies (2009) does in his attempt to ‘fix’ the task. The task is easily solved using the fundamental property (18a) of Bernoulli processes. If one just goes to a match (maybe one is late – it would not matter!) and observes whether Nowitzki scores more than four times in a series right from the start, then everything beyond the fifth trial is redundant and the solution coincides with solution (15), where the number of observations is fixed at n = 5.

The Solution is Dependent on the Number of Trials Observed!

If the number n of trials is pre-determined, the probability of having at most four successes in a series changes. If one observes the player only up to four times, he cannot have more than four successes, whence it holds P(A) = 1. The longer one observes the player, the greater the chance of finally seeing him score more than four times in a series. It holds:
P(A | n) → 0 as n → ∞.   (20)
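The random-start argument can be made concrete with a short simulation. A minimal sketch in Python (standard library only; the seed and the number of runs are arbitrary illustrative choices):

```python
import random

# Simulation support for (19): with p = 498/551 (model 2/3), watch the
# process from an arbitrary starting point (property (18a)) and record
# whether the first five observed trials are all hits - event A-bar.
random.seed(1)
p, runs = 498 / 551, 100_000
a_bar = sum(all(random.random() < p for _ in range(5)) for _ in range(runs))
print("simulated P(A-bar):", a_bar / runs)      # close to p**5 = 0.6031
print("theoretical P(A) = 1 - p^5:", 1 - p**5)  # 0.3969
```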
KEY IDEA BEHIND VARIOUS DISTRIBUTIONS
In this section, we explain the underlying key pedagogical ideas of the following seven distributions:
a. Binomial distribution: repeated, independent trials, called a Bernoulli process;
b. Hypergeometric distribution: repeated dependent trials;
c. Poisson distribution: completely random events in time – the Poisson process;
d. Geometric distribution: waiting times in the Bernoulli process;
e. Exponential distribution: Poisson and memory-less waiting;
f. Weibull distribution: conditional failure rates or hazards;
g. Normal distribution: the hypothesis of independent elementary errors.
For students’ understanding, and also for good modelling reasons, it is an advantage to have a key idea behind each of the distributions. Otherwise, it is hard to justify a specific distribution as a suitable model for the phenomenon under scrutiny. Why a Poisson distribution, or why a normal? The key idea of a distribution should convey a direct way of judging whether such a distribution could model the phenomenon in question. It allows one to check the necessary assumptions in the data-generating process and whether they are plausible.
Such a fundamental idea behind a specific distribution is sometimes hidden; it is difficult to recognise it from discrete probabilities or density functions, which might also have complicated mathematical terms. Other concepts related to a random variable might help to reveal to students ‘the’ idea behind a distribution. For example, a feature like the memory-less property is important for the phenomenon which is described by a distribution. However, this property is a mathematical consequence of the distribution and cannot directly be recognized from its shape or mathematical term. In the context of waiting times, the memory-less property means that the ongoing waiting time until the ‘event’ occurs has the same distribution throughout – regardless of the time already spent waiting for this event. Or, technical units might show (as human beings do) a phenomenon of wearing out, i.e., the future lifetime has, amongst other things, an expected value that decreases with the age of the unit (or the human being). To describe such behaviour, further mathematical concepts have to be introduced, like the so-called hazard (see below). In technical applications, continuous servicing of units might postpone wearing out. For human beings, insurance companies charge higher premiums for a life insurance policy to older people. While further mathematical concepts might be – at first sight – an obstacle for teaching, they help to shed light on key ideas for a distribution, highlight the ‘internal mechanisms’ lurking in the background, and also help to understand the phenomena better. By contrast, the usual examination of whether a specific distribution is an adequate model for a situation is performed by a statistical test of whether the data are compatible with what is to be ‘expected’ from a random sample of this model. Such tests focus on the ‘external’ phenomenon of frequencies as observed in data.

a. Binomial Distribution – Repeated Independent Trials

A Bernoulli process may be represented by drawing balls from an urn in which there is a fixed proportion p of balls marked by a 1 (success), the rest being marked by a 0 (failure). If one draws a ball from that urn n times, always replacing the drawn ball, the number of successes follows a binomial distribution with parameters n and p.
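The urn scheme translates directly into a simulation. A minimal sketch in Python (standard library only; the parameter values are illustrative choices):

```python
import random

# Urn model of the binomial idea: n draws with replacement from an urn with
# a proportion p of balls marked 1; the count of 1's is binomial(n, p).
def draw_successes(n: int, p: float) -> int:
    return sum(random.random() < p for _ in range(n))

random.seed(2)
sample = [draw_successes(10, 0.5) for _ in range(10_000)]
print("mean number of successes:", sum(sample) / len(sample))  # ~ n*p = 5
```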
There are characteristic features inherent to the depicted situation: repeated experiments with the same success probability p, and independent trials (mixing the balls before each draw), so that the success probability remains the same throughout. One could also spin a wheel repeatedly with one sector marked 1 and the other marked 0. This distribution was discussed at length with the Nowitzki task. The binomial distribution is intimately related to the Bernoulli process (9), which may also be analysed from the perspective of continuously observing its outcomes until the first event occurs – see the geometric distribution below.

b. Hypergeometric Distribution – Repeated Dependent Trials

This distribution, too, is best explained by the artificial but paradigmatic context of drawing balls from an urn with a fixed number of marked (by 1’s) and non-marked (the 0’s) balls as in the binomial situation; however, now the drawn balls are not replaced. Under this assumption, the number of marked balls among the n drawn follows a hypergeometric distribution. The characteristics are repeated experiments with the same success probability p but dependent trials, so that the success probability remains the same only if one does not know the history of the process; otherwise there is a distinct dependence. The context of drawing balls also explains why – under special circumstances – the hypergeometric may be well approximated by the binomial distribution: if the number n of balls drawn from the urn is small compared to the number N of all balls in the urn, then the dependence between successive draws is weak and the conditions (9) of a Bernoulli process are nearly met.

c. Poisson Distribution – Pure Random Events in Time

It is customary to introduce the Poisson distribution as the – approximate – distribution of rare events in a Bernoulli process (p small); it is also advantageous to relate this distribution to the Poisson process, even if this is lengthy and more complex. The process of generating ‘events’ (e.g., emitted radioactive particles), which occur in the course of time (or in space), should intuitively obey some laws comparable to those of the Bernoulli process:
– The start of the observations is not relevant for the probability of observing any event; see the fundamental property (18a) – this leads to A1 in (21) below.
– If one observes the process in non-overlapping intervals, the pertinent random variables have to be independent, which corresponds to the independence of the single observations Xi in the Bernoulli process – this leads to A4.
– The main difference between the processes lies in the fundamental frequency pulsing: in the Bernoulli process, there is definitely a new experiment Xi at ‘time’ i, whereas in the Poisson process time flows continuously with no apparent performing of an experiment (leading to an event or not) – events just occur.
– It remains to fix the probability of an event. As there is no distinct experiment with the outcome of the event (or its non-occurrence), we can speak only of an intensity λ of the process to bear events. This intensity has to be related to unit time; its mathematical treatment in A2 involves infinitesimal concepts.
– Paradoxically, a further requirement has to be imposed: even if two or more events may occur in an interval of time which is not too small, the probability of such coincidences should become negligible as the length of the observation interval becomes small – that leads to A3 below.
Mathematically speaking (compare, e.g., the classic text of Meyer, 1970, p. 166), a Poisson process has to meet the following conditions (the random variable Xt counts the number of events in the interval (0, t)):
A1  If Yt counts the events in (t0, t0 + t), then Yt ~ Xt;
A2  P(XΔt = 1) = λ·Δt + o(Δt);
A3  P(XΔt ≥ 2) = o(Δt);
A4  Xt and Yt are independent random variables if they count events in non-overlapping time intervals.   (21)
Assumptions (21) represent pure randomness; they imply that such a process has no preference for any time sequence, has no coincidences as would occur by ‘intention’, and shows no dependencies on other events observed. The assumptions may also be represented locally by a grid square, as is done in Example 8. The main difference between the Poisson and the Bernoulli process lies in the fact that there is no definite unit of time, linked to trials 1, 2, 3, etc., which may lead to the event (1) or not. Here, the events just occur at specific points of time, but one cannot trace when an ‘experiment’ is performed. The success probability p of the Bernoulli process associated with single experiments becomes an intensity λ per unit time. The independence of trials now becomes the independence of counts of events in mutually exclusive intervals in the sense of A4. The Poisson process has a further analogue to the Bernoulli process in terms of waiting for the first event – the geometric and the exponential distribution (which describe waiting times in the pertinent situations) have similar properties (see below).
We present one example here to illustrate a modelling approach to the Poisson. This shows how discussions can be initiated with students on the theoretical ideas presented above, and helps students to understand how and when to apply the Poisson distribution.

Example 8. Are the bomb attacks on London during World War II the result of a planned bombardment, or may they be explained by purely random hitting? To compare the data to the scenario of a Poisson process, the area of South London is divided into grid squares of ¼ square kilometre each. The statistics in Table 4 show, e.g., 93 squares with 2 impacts each, which amounts to 186 bomb hits. In sum, 537 impacts were observed.
Table 4. Number of squares in South London with various numbers of bomb hits – comparison to the frequencies under the assumption of a Poisson process with λ = 0.9323

No. of hits in a square                        0        1        2       3       4      5 and more   all
No. of grid squares with such a no. of hits    229      211      93      35      7      1            576
Expected numbers under Poisson process         226.74   211.39   98.54   30.62   7.14   1.57
If targeting is completely random, it follows the rules of a Poisson process (21), and the number of ‘events’ per grid square then follows a Poisson distribution. The parameter λ is estimated from the data to fix the model by
λ = 537/576 = 0.9323 hits per grid square.   (22)
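The expected counts in Table 4 follow directly from this λ. A minimal sketch in Python (standard library only):

```python
from math import exp, factorial

# Expected counts of Table 4: Poisson(lam) probabilities for 0..4 hits,
# scaled to 576 grid squares; the last class collects '5 and more'.
lam, squares = 537 / 576, 576
probs = [exp(-lam) * lam**k / factorial(k) for k in range(5)]
expected = [squares * q for q in probs] + [squares * (1 - sum(probs))]
for k, e in enumerate(expected):
    print("5 and more" if k == 5 else k, f"{e:.2f}")
# 226.74, 211.39, 98.54, 30.62, 7.14, 1.57 - as in Table 4
```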
As seen from Table 4, the fit of the Poisson distribution to the data is extremely good. Feller (1968, p. 161) highlights the basic property of the Poisson distribution as modelling pure randomness and contrasts it with widespread misconceptions: “[The outcome] indicates perfect randomness and homogeneity of the area; we have here an instructive illustration of the established fact that to the untrained eye randomness appears as regularity or tendency to cluster.”
In any application, one might inspect whether the process of data generation fulfils such conditions – which could justify or rule out this distribution as a candidate for modelling. The set of conditions, however, also structures thinking about phenomena which may be modelled by a Poisson distribution. All phenomena following internal rules which come close to the basic requirements of a Poisson process are open to such modelling.

d. Geometric Distribution – Memory-Less Waiting for an Event

Here, a Bernoulli process with success parameter p is observed. In contrast to the binomial distribution, the number of trials is not fixed. Instead, one counts the number of trials until the first event (which corresponds to the outcome ‘marked’ by a 1) occurs. The resulting distribution ‘obeys’ the following ‘memory-less’ property:
P(T > k0 + k | T > k0) = P(T > k).   (23)
This feature implies that the remaining waiting time for the first event is independent of the time k0 one has already waited for it – waiting earns no bonus. Such a characterization of the Bernoulli process helps in clarifying some basic misconceptions. The following example can be used to motivate students with regard to the underlying features.
Example 9. Young children remember long waiting times for the first six to come up on a die. As waiting times of 12 and longer still have a probability of 0.1346 (see also Figure 6), this induces them to ‘think’ that a six has a lower probability than the other numbers on the die, for which such a ‘painful’ experience is not internalised.
Figure 6. Waiting for the first six of a die – Bernoulli process with p = 1/6.
Figure 7. Exponential distribution has the same shape as the geometric distribution.
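The probability quoted in Example 9 is a one-line computation – the first six appears at trial 12 or later exactly when the first 11 throws show no six. A quick check in Python:

```python
# Example 9: P(waiting time >= 12) = P(no six in the first 11 throws).
p = 1 / 6
print((1 - p) ** 11)   # 0.1346...
```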
e. Exponential Distribution – Memory-Less Waiting for Events in Time

The exponential distribution is connected to two key ideas: one links it to the Poisson process; the other uses the concept of the conditional failure rate. In a Poisson process, if one is waiting for the next event to occur and the data are the subsequent waiting times between the events, then the exponential distribution is the model of choice. This is due to a mathematical theorem (see Meyer, 1970, p. 191). It can also be illustrated by simulation studies. An important feature of the exponential distribution is its memory-less property:
P(t0 < T ≤ t0 + Δt | T > t0) = P(t0 < T ≤ t0 + Δt) / P(T > t0) = P(0 < T ≤ Δt).   (24)
Due to the memory-less property, the conditional probability of failing within Δt units of time for a device that has reached age t0 is the same as within the first Δt units for a new device. This implies that the future lifetime (or waiting time) is independent of the age reached (or the time already spent waiting), i.e., of t0, which amounts to a further characterization of ‘pure’ randomness. Exponential and geometric distributions share the memory-less property; this explains why the models have the same shape. If this conditional failure probability is calculated per unit time and the time length Δt is made smaller, one gets the conditional failure rate, or hazard h(t):
h(t) = lim_{Δt→0} P(t0 < T ≤ t0 + Δt | T > t0) / Δt.   (25)
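The memory-less property (24) is easy to check empirically. A minimal simulation sketch in Python (standard library; λ, t0 and s are arbitrary illustrative choices):

```python
import math
import random

# Memory-less check: among exponential waiting times that exceed t0, the
# fraction exceeding t0 + s should match the unconditional P(T > s).
random.seed(3)
lam, t0, s = 1.0, 2.0, 1.5
data = [random.expovariate(lam) for _ in range(200_000)]
survivors = [t for t in data if t > t0]
print(sum(t > t0 + s for t in survivors) / len(survivors))  # ~ exp(-lam*s)
print(math.exp(-lam * s))
```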
A hazard (rate) is just a different description of a distribution. Now it is possible to express the other key idea behind the exponential distribution, namely that its
related conditional failure rate (or hazard) is constant over the lifetime. If a (technical) unit’s lifetime is analysed and the internal structure supports the view that the remaining lifetime is independent of the unit’s age, then it may be argued that an exponential distribution is the model of choice. While such a property might seem paradoxical (old units are as good as new units), it is in fact well fulfilled for electronic devices for a long part of their ordinary lifetime. Mechanical units, on the contrary, do show a wearing effect, so that their conditional lifetime gets worse with age. Similarly with human beings, with the exception of infant mortality, when – at the youngest ages – humans’ lifetime as a probability distribution improves.

f. Weibull Distribution – Age-Related Hazards

Lifetimes are an important issue in technical applications (reliability issues and quality assurance); waiting times are important in describing the behaviour of systems. There are some families of distributions which may serve as suitable models. The drawback with these is that they require more mathematics to describe their density functions. Furthermore, the shape of their density gives no clue why they should yield a good model for a problem to be analysed. Viewing lifetimes (or waiting times) from the perspective of units that have already reached some specific age (have waited some specific time) sheds much more light on such phenomena than analysing the behaviour of new items (with no time spent waiting in the system). One would, of course, simulate such models first, explore the simulated data, and draw preliminary conclusions before starting to delve deeper into mathematical issues. It may pay to learn – informally – about hazards and use this concept instead of probability densities to study probability models. Hazards directly embody the basic assumptions which have to be fulfilled in applications. With the key idea of the hazard or conditional failure rate, the discussion can relate to infant mortality (decreasing), purely random failures due to exponential lifetime (constant) and wearing-out effects (increasing). The power function is the simplest model to describe all these different types of hazard:
h(t) = β (t/α)^(β−1), α, β > 0.   (26)
The parameter α is interpreted as the scale of time, while β influences the shape and thus the nature of the change of hazard over the lifetime.

g. Normal Distribution – the Hypothesis of Independent Elementary Errors

Any random variable that might be split into a sum of other (hidden) variables is – according to the central limit theorem (CLT) – approximately normally distributed. This explains the key underlying idea and the ubiquity of the normal distribution. In the history of probability, the CLT prompted the ‘hypothesis of elementary errors’ (Gauss and earlier), where any measurement error was hypothesized to be the result
(sum) of other, elementary errors. This supported the use of the normal distribution for modelling measurement errors in astronomy and geodesy. A generalization to the normal ‘law’ of distribution by Quételet and Galton is straightforward: it is an expression of God’s will (or Nature) that any biometric measurement of human beings and animals is normally distributed, as it emerges from a superposition of elementary ‘errors of nature’ (Borovcnik, 2006)9. An interesting article about the history and myth of the normal law is Goertzel (n.d.). The mathematics was first proved by de Moivre and Laplace; the single summands Xi had then been restricted to a Bernoulli process (9). In this way, the binomial distribution is approximately normally distributed and the approximation is good enough if there are enough elements in the sum:
Tn = X1 + X2 + ... + Xn.   (27)
To illustrate matters, the Galton board or an electronic quincunx (see, e.g., Pierce, n.d.) may be used in teaching. Such a board has several rows of pegs arranged in a shape similar to Pascal’s triangle. Marbles are dropped from the top and then bounce their way down. At the bottom they are collected in little bins. Each time a marble hits one of the pegs, it may bounce either left or right. If the board is set up symmetrically, the chances of bouncing either way are equal and the marbles in the bins follow the ‘bell-shaped’ curve of the normal distribution. If it is inclined, a skewed distribution emerges, which also normalizes if enough rows are taken. Theoretically, one has to standardize the value of the sum Tn according to
Un = (Tn − E(Tn)) / √var(Tn),   (28)
and the CLT in its crudest form becomes:
If the Xi ~ X form an iid process with finite variance var(X) < ∞, then lim_{n→∞} P(Un ≤ u) = Φ(u).   (29)
Here Φ(u) stands for the cumulative distribution function of the standard normal distribution, i.e., with parameters 0 and 1. Despite such mathematical intricacies, the result is so important that it has to be motivated in teaching. The method of simulation is again suitable, not only to clarify the limiting behaviour of the sum (the distribution of its standardized form converges to the normal distribution) but also to get an orientation about the speed of convergence. Furthermore, this convergence behaviour is highly influenced by the shape of the distribution of the single Xi’s. A scenario of simulating 1000 different samples of size n = 20 and then n = 40 from two different distributions (see Figure 8) may be seen from Figure 9.
Figure 8. A symmetric and a skewed distribution for the single summands in the scenario.
Figure 9. Scenario of 1000 samples: distribution of the mean compared to the normal curve – left drawing from an equi-distribution, right drawing from a skewed distribution.
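The scenario of Figure 9 is easy to re-create. A minimal sketch in Python (standard library; an exponential distribution serves here as an assumed stand-in for the skewed distribution of Figure 8):

```python
import random
import statistics

# 1000 samples of size n from a skewed distribution; their means should
# already look roughly normal for n = 20 and better still for n = 40.
random.seed(4)
for n in (20, 40):
    means = [statistics.fmean(random.expovariate(1.0) for _ in range(n))
             for _ in range(1000)]
    print(n, round(statistics.fmean(means), 3), round(statistics.stdev(means), 3))
```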
The graph shows the frequency distributions of the mean of the single items of data instead of the sum (27) – this is necessary to preserve the scales, as the sums simply diverge. For the limit, calculating the mean still does not suffice, as the mean converges weakly to one number (the expected value of the Xi if all have the same distribution) – thus in the limit there would be no distribution at all. The simulation scenarios in Figure 9 illustrate that the calculated means of the repeated samples have a frequency distribution which comes quite close to a normal distribution for only 20 summands. If the items of data are skewed, the approximation is slightly worse, but with calculated means of 40 items of data in each sample the fit is sufficiently good again.
The influence of the single summands (like those in Figure 8) on the convergence behaviour of the sum may be studied interactively: with a spreadsheet with slide
controls for the number of values in the equi-distribution for the single data, one can easily see that more values give faster convergence to the fit. With a slide control for the skewness, e.g., to move the two highest values further away from the bulk of the data, one may illustrate a negative effect on convergence, as the fit becomes worse this way. By changing the slide controls, the effect on the distribution of an item of data Xi is seen from the bar graphs in Figure 8, and the effect on the ‘normalizing’ of the distribution of the mean of the repeatedly drawn samples may be studied interactively from Figure 9.

Distributions Connected to the Normal Distribution

There are quite a few distributions which are intimately connected to the normal distribution. Their main usage is to describe the theoretical behaviour of certain test statistics based on a sample from a normal distribution. Amongst them are the t, the χ2 and the F distribution. They are mainly used for coping with the mathematical problems of statistical inference and not for modelling phenomena. The χ2 distribution is somewhat of an exception to this, as the so-called Maxwell and Rayleigh distributions (the square root of a χ2) are also used by physicists to model the velocity of particles (like molecules) in two- or three-dimensional space (with 2 or 3 degrees of freedom); see also Meyer (1970, pp. 220).

SOLUTIONS TO THE STATISTICAL PART OF THE NOWITZKI TASK

This section returns to the statistical questions of the Nowitzki task. Is Nowitzki weaker away than at home? The usual way to answer such questions is a statistical test of significance. Such a modelling approach includes several steps to transfer the question from the context into a statistical framework, in which a null hypothesis reflects the situation of ‘no differences’ and alternative distributions depict situations of various degrees of difference. As always in empirical research, there is no unique way to arrive at a conclusion. The chosen model might fit better for one expert and less well for another. Already the two questions posed in the formulation of the example (away weaker than home, or, away weaker than in all matches) give rise to disputes. The logic of a statistical test does not make things easier, as one always has to refer to fictional situations in the sense of ‘what would be if …’. Errors of type I and II, or p values, give rise to many misinterpretations by students. And there are many different test statistics which use information that could discriminate between the null and alternative hypotheses differently (not only in the sense of less precise and more precise, but simply differently, with no direct way of comparison). Moreover, one has to estimate parameters, or use other information to fix the hypotheses. If a Bernoulli process with success probability p is observed n times, then for the expected value of the number of successes Tn it holds:
Tn = X1 + ... + Xn:   E(Tn) = n·p.   (30)
Here, the value of p usually is not known and is called the true value of the (success) probability.

Solutions to the First Statistical Part – Nowitzki Away Weaker than Home & Away?

Part a. of the statistical questions in Example 7 is ill-posed10 insofar as the comparison of away matches against all matches does not reflect what one really should be interested in: is Nowitzki away weaker than in home matches? A comparison including all matches blurs the differences. Therefore, this question is omitted here; we will only discuss the inherent problems. With the three different success probabilities for home, away and all matches, it holds:
p0 = (nH/n)·pH + (nA/n)·pA.   (31)
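Relation (31) can be verified at once with the season’s figures (cf. Table 5 below). A quick check in Python:

```python
# Check of (31): the overall strength is the trial-weighted mix of the
# home and away strengths (data of the 2006-07 season).
nH, nA = 288, 263
pH, pA = 267 / nH, 231 / nA
n = nH + nA
print(round((nH / n) * pH + (nA / n) * pA, 3))   # 0.904 = 498/551
```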
If the season is regarded as a self-contained entity, all success rates are known. If they are perceived as probabilities, the next question to discuss is whether there is one common process or two or more with different probabilities; a covariate like the ‘location’ of the free throw (home, away) would explain the differences. If p0 is set as known, the problem may be handled in this way: ‘May the away free throws be modelled by a Bernoulli process with p0 from the overall strength?’ If the data are seen as a sample from an infinite (Bernoulli) process, p0 has to be estimated from them; however, there are drawbacks in question a. and its modelling. Firstly, by common sense, no one would compare away scores to all scores in order to find differences between the two groups of trials, away and home. Secondly, as the overall strength is estimated, it could also be estimated by the separate scores of away and home matches using equation (31): p̂A and p̂H are combined into an estimate p̂0 of p0. And the test would be performed with the data on away matches, which coincide with 231 = nA·p̂A. Confusing here is that pA is dealt with as unknown (a test is performed on whether it is lower than the overall strength), an estimate of it is used to get an estimate of the overall strength, and it is used as known data to perform the test.

Solution to the Second Statistical Part – Nowitzki Weaker Away Than at Home?
In this subsection, the away scores are compared to the home matches only (part b). Various steps are required to transform a question from the context to the statistical level. It is illustrated how these steps lead to hypotheses at the theoretical level which correspond to and ‘answer’ the question at the level of context. Three different Bernoulli processes are considered: home, away, and all (home and away combined). After the end of the season, the related success probabilities are (factually) known from the statistics (see Table 5). Or, one could at least estimate some of these probabilities from the data.
Table 5. Playing strength as success probabilities from hits and trials of the season

Matches   Hits       Trials     Strength ‘known’ or estimated
Home      TH = 267   nH = 288   pH = 267/288 = 0.927
Away      TA = 231   nA = 263   pA = 231/263 = 0.878
All       T = 498    n = 551    p0 = 498/551 = 0.904
The basis of reference for comparing the away results is the success in home matches. For home matches the success probability is estimated as in model 2, or ‘known’ from the completed season as in model 3:
pH = 0.927.   (32)
Formally, the question from the context can be transferred to a (statistical) test problem in several steps of choosing the statistical model and the hypotheses:
TA ~ B(nA, π),   (33)
i.e., the number of hits (successes) in away matches is binomially distributed with unknown parameter π (despite reservations, this model is used subsequently). As the null hypothesis,
H0: π = pH   (34)
will be chosen. This corresponds to ‘no difference between away and home matches’ from the context, with pH designating the success probability in home matches. For the alternative, a one-sided hypothesis
H1: π < pH   (35)
is suggested. As the advantage of the home team is strong folklore in sports generally, a one-sided11 hypothesis makes more sense than the two-sided alternative π ≠ pH. The information about pH comes from the data; therefore it will be estimated by 0.927 to form the basis of the ‘model’ for the away matches. No other information about the ‘strength’ in home matches is available. Thus, the reference distribution for the number of successes in away matches is the following:
TA |H0 ~ B(nA, 0.927).   (36)
Relation (36) corresponds to the probabilistic modelling of the null effect that ‘away matches do not differ from home matches’. The alternative is chosen to be
one-sided as in (35). The question is whether the observed score of TA = 231 in away matches amounts to an event which is significantly too low for a Bernoulli process with 0.927 as the success probability lying in the background of (36). Under this assumption, the expected value of successes in away matches is 263 · 0.927 = 243.8. The p value of the observed number of 231 successes is as small as 0.0033! Consequently, away matches differ significantly from home matches (at the 5% level of significance).
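The quoted p value follows from the binomial reference distribution (36). A one-line check in Python (assuming SciPy is available):

```python
from scipy.stats import binom

# One-sided p value of the observed away score under H0, model (36).
print(binom.cdf(231, 263, 0.927))   # about 0.003
```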
Figure 10. Results of 2000 fictitious seasons with nA = 263 trials – based on an assumed strength of p = 0.927 (corresponding to home matches).
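A simulation along the lines of Figure 10 needs only a few lines. A minimal sketch in Python (standard library; the seed is an arbitrary choice):

```python
import random

# 2000 fictitious 'seasons' of 263 free throws with the home strength
# 0.927; compare the observed away score of 231 to this distribution.
random.seed(5)
seasons = [sum(random.random() < 0.927 for _ in range(263))
           for _ in range(2000)]
print(sorted(seasons)[100])                    # lower 5% quantile
print(sum(s <= 231 for s in seasons) / 2000)   # simulated p value, ~0.003
```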
In Figure 10, the scores of Nowitzki in 2000 fictitious seasons are analysed. The scenario is based on his home strength with 263 free throws (the number of trials in away matches in the season 2006–07), i.e., on the distribution of the null hypothesis in (36). From the bar chart it is easily seen that the observed score of 231 is far out in the distribution; it belongs to the smallest results of these fictitious seasons. In fact, the p value of the observation is 0.3%. A simulation study gives concrete data; the lower 5% quantile of the artificial data separates the 5% extremely low values from the rest. It is easy to understand that if actual scores are smaller than this threshold, they may be judged as not ‘compatible’ with the underlying assumptions (of the simulated data, especially the strength of 0.927). Handling a rejection limit (‘critical value’) from simulated data is easier than deriving a 5% quantile from an ‘abstract’ probability distribution.

Validity of Assumptions – Contrasting Probabilistic and Statistical Points of View

The scenario of a Bernoulli process is more compelling for evaluating the question of whether Nowitzki is weaker in away matches than for the calculation of single probabilities of specific short sequences. In fact, whole blocks of trials are compared. This is not to ask for the probability of a number of successes in short periods of the process but to ask whether there are differences in large blocks on the whole. Homogenization means a balancing-out of short-term dependencies, or of fluctuations of the scoring probability over the season due to changes in the form of the player,
or due to social conditions like quarrels in the team over an unlucky loss. In this way, the homogenization idea seems more convincing. The compensation of effects across the single elements of a universe is essentially a fundamental constituent of a statistical point of view. For a statistical evaluation of the problem from the context, the scenario of a Bernoulli process – even though it does not apply really well – might allow for relevant results. For whole blocks of data which are to be compared against each other, a homogenization argument is much more compelling, as the violations of the assumptions might balance out ‘equally’. The situation seems to be different from the probabilistic part of the Nowitzki problem, where it was doomed to failure to find suitable situations for which this scenario could reasonably be applied.

Alternative Solutions – Nowitzki Weaker Away than at Home?

Some alternatives to deal with this question are discussed; they avoid the ‘confusion’ arising from the different treatment of the parameters (some are estimated from the data and some are not). Not all of them are in the school curricula.

Table 6. Number of hits and trials
Matches       Hits   Failures   Trials
Home          267    21         288
Away          231    32         263
All matches   498    53         551
Fisher’s exact test. This test is based on the hypergeometric distribution. It is remarkable that it relies on fewer assumptions than the Bernoulli series, and it uses nearly all the information about the difference between away and home matches. The (one-sided) test problem is now depicted by the pair of hypotheses:
H0: pA − pH = 0 against H1: pA − pH < 0.   (37)
‘No difference’ between away and home matches is modelled by an urn problem relative to the data in Table 6: all N = 551 trials are represented by balls in an urn; A = 498 are white and depict the hits, 53 are black and model the failures. The balls are well mixed and then one draws n = 263 balls for the away matches (without replacement). The test is based on the number NW of white balls among the drawn balls. Here, a hypergeometric distribution serves as the reference distribution:
NW |H0 ~ Hyp(N = 551, A = 498, n = 263).   (38)
Small values of NW indicate that the alternative hypothesis H1 might hold. The observation of 231 white balls has a p value of 3.6%. Therefore, at a level of 5% (one-sided), Nowitzki is significantly weaker away than in home matches.
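The hypergeometric p value can be confirmed directly. A one-line check in Python (assuming SciPy; note SciPy’s argument order for the parameters of (38)):

```python
from scipy.stats import hypergeom

# P(NW <= 231) under (38): Hyp(N=551, A=498, n=263).
print(hypergeom.cdf(231, 551, 498, 263))   # about 0.036
```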
With this approach, neither the success probability for away nor that for home matches was estimated. The hypergeometric distribution need not be tackled with all the mathematical details. It is sufficient to describe the situation structurally and get (estimates of) the probabilities by a simulation study.

Test for the difference of two proportions. This test treats the two proportions for success in away and home matches in the same manner: both are estimated from the data, which are modelled as separate Bernoulli processes according to (9):
p̂A estimates the away strength pA; p̂H the home strength pH.   (39)
The (one-sided) test problem is again depicted by (37). However, the test statistic now is directly based on the estimated difference in success rates p̂A − p̂H. By the central limit theorem, this difference (as a random variable) – normalized by its standard error – is approximately normally distributed; it holds:
U := (p̂A − p̂H) / √( p̂A(1 − p̂A)/nA + p̂H(1 − p̂H)/nH )  ~ approx. N(0, 1).   (40)
MODELLING IN PROBABILITY AND STATISTICS
scores to the score in all matches. A value for all matches – if estimated from the data – also contains the away matches. However, this should be avoided, not only for methodological reasons but by common sense too. If the season is seen as self-contained, the value of p = 0.927 is known. A test of 0.927 against alternative values of the strength less than 0.927 corresponds to the question ‘Is Nowitzki away weaker than home?’ 0.927 might as well be seen as an estimate of a larger imaginary season. An evaluation of its accuracy (as in section 3) is usually not pursued. A drawback might be seen in the unequal treatment of the scores: home scores are used to estimate a parameter pH while the away scores are treated as random variable. Note that in this test situation no ‘overlap’ occurs between data used to fix the null hypothesis and data used to perform the test. The alternative tests discussed here treat the home and away probabilities in a symmetric manner: both are assumed as unknown; either both are estimated from the data, or estimation is avoided for both. These tests express a correspondence between the question from the context and their test statistics differently. They view the situation as a two-way-sample. Such a view paves the way to more general types of questions in empirical research, which will be dealt with below. STATISTICAL ASPECTS OF PROBABILITY MODELLING
Data have to be seen in a setting of model and context. If two groups are compared – be it a treatment group receiving some medical treatment and a control group receiving only placebo (a pretended treatment), two success rates might be judged for difference as in the Nowitzki task. Is treatment more effective than placebo? The assumption of Bernoulli processes ‘remains’ in the background (at least if we measure success only on a 0–1 scale). However, such an assumption requires a heuristic argument like the homogenization of data in larger blocks. The independence assumption for the Bernoulli model is not really open to scrutiny as it leads to methodological problems (a null hypothesis can not be statistically confirmed). The idea of searching for covariates serves as a strategy to make the two groups as equal as they could be. Data may be interpreted sensibly and used for statistical inference – in order to generalize findings from the concrete data – only by carefully checking whether the groups are homogenous. Only then, do the models lead to relevant conclusions beyond concrete data. If success is measured on a continuous scale, the mathematics becomes more complicated but the general gist of this heuristic still applies. Dealing with the Inherent Assumptions A further example illustrates the role of confounders. Example 10. Table 7 shows the proportions of girl births in three hospitals. Can they be interpreted as estimates of the same probability for a female birth? Assume such a proportion equals 0.489 worldwide; with hospital B (and 358 births) the proportion of girls would lie between 0.437 and 0.541 (with a probability of 37
BOROVCNIK AND KAPADIA
Table 7. Proportion of girls among new-borns Hospital A B C ‘World stats’
Births 514 358
Proportion of girl births 0.492 0.450 0.508 0.489
approximately 95%). The observed value of 0.450 is quite close to the lower end of this interval. This consideration sheds some doubt on it that the data have been ‘produced’ by mere randomness, ie., by a Bernoulli process with p = 0.489. It may well be that there are three different processes hidden in the background and the data do not emerge from one and the same source. Such ‘phenomena’ are quite frequent in practice. However, it is not as simple as that one could go to hospital B if one wants to give birth to a boy? One may explain the big variation of girl births between hospitals by reference to a so-called covariate: one speculation refers to location; hospital A in Germany, C in Turkey, and B in China. In cases where covariates are not open to scrutiny as information is missing about them, they might blur the results – in such cases these variables are called confounders. For the probabilistic part of Nowitzki, it might be better to search for confounders (whether he is in good form, or had a quarrel with his trainer or with his partner) in order to derive at a probability that he will at the most score 8 times out of 10 trials instead of modelling the problem by a Bernoulli process with the seasonal strength as success probability. Such an approach cannot be part of a formal examination but should feature in classroom discussion. Empirical Research – Generalizing Results from Limited Data Data always has to be interpreted by the use of models and by the knowledge of context, which influences not only the thinking about potential confounders but also guides the evaluation of the practical relevance of conclusions drawn. A homogenization idea was used to support a probabilistic model. For the Bernoulli process the differing success probabilities should ease out, the afflictions of independency should play less of a role when series of data are observed, which form a greater entity – as is done from a statistical perspective. This may be the case for the series of away matches as a block and – in comparison to it – for the home matches as a block. One might also be tempted to reject such a homogenization idea for the statistical part of the Nowitzki task. However, a slight twist of the context, leaving the data unchanged, brings us in the midst of empirical research; see the data in Table 8, which are identical to Table 5. Table 8. Success probabilities for treatment and control group Group Treatment Control All 38
Success 267 231 498
Size nT = 288 nC = 263 N = 551
Success probability pT = 0.927 pC = 0.878 p0 = 0.904
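To see what such a comparison amounts to numerically, here is a sketch of a standard two-proportion test on the data of Table 8 (Python; this particular test statistic is a common choice, not prescribed by the text):

    from math import sqrt

    sT, nT = 267, 288                 # successes and size, treatment group
    sC, nC = 231, 263                 # successes and size, control group
    pT, pC = sT / nT, sC / nC
    p0 = (sT + sC) / (nT + nC)        # pooled success probability
    se = sqrt(p0 * (1 - p0) * (1 / nT + 1 / nC))   # s.e. of pT - pC under H0
    print(round((pT - pC) / se, 2))   # z ~ 1.94, borderline at the 5% level (one-sided)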
This is a two-sample problem: we are faced with judging a (medical) treatment for effectiveness (only with a 0–1 outcome, not with a continuous response). The Nowitzki question now reads: Was the actual treatment more effective than the placebo treatment applied to the persons in the control group? How can we justify the statistical inference point of view here? We have to model success in each of the two groups by a different Bernoulli process. This modelling includes the same success probability throughout for all people included in the treatment group, as well as the independence of success between different people. Usually, such a random model is introduced by the design of the study. Of course, the people are not selected randomly from a larger population but are chosen by convenience – they are primarily patients of the doctors who are involved in the study. However, they are randomly attributed to one of the groups, i.e., a random experiment like coin tossing decides whether they are treated by the medical treatment under scrutiny or receive a placebo, which looks the same from outside but has no expected treatment effect – except via the person's psychological expectation that it could have one. Neither the patient, nor the doctors, nor the persons who measure the effect of the treatment should know to which group a person is attributed – the gold standard of empirical research is the so-called double-blind randomized treatment and control group design. The random attribution of persons to either group should make the two groups as comparable as they can be – it should balance all known covariates and all unknown confounders which might interfere with the effect of the treatment. Despite all precautions, patients will differ by age, gender, stage of the disease, etc. Thus, they do not have a common success probability that the treatment is effective. All one can say is that one has undertaken the usual precautions and one hopes that the groups are now homogeneous enough to apply the model in the manner of a scenario: 'what do the data tell us if we think that the groups meet a ceteris paribus condition?' A homogenization argument is generally applied to justify drawing conclusions out of empirical data. It is backed by the random attribution of persons to the groups which are to be compared. The goal of randomizing is to get two homogeneous groups that differ only with respect to what has really been administered to them: medication or placebo.

CONCLUSIONS
The two main roles for probability are to serve as a genuine tool for modelling and to prepare and understand statistical inference.
– Probability provides an important set of concepts for modelling phenomena from the real world. Uncertainty or risk, which combines uncertainty with impact (win or loss, as measured by utility), is either implicitly inherent to reality or emerges from our partial knowledge about it.
– Probability is the key to understanding much empirical research and how to generalize findings from samples to populations. Random samples play an eminent role in that process. The Bernoulli process is a special case of random sampling. Moreover, inferential statistical methods draw heavily on a sound understanding of conditional probabilities.
The present trend in teaching is towards simpler concepts, focusing on a (barely) adequate understanding thereof. In line with this, a problem posed to the students has to be clear-cut, with no ambiguities involved – neither about the context nor about the questions. Such a trend runs counter to any sensible modelling approach.

The discussion about the huge role intuitions play in the perception of randomness was initiated by Fischbein (1975). Kapadia and Borovcnik (1991) focused their deliberations on 'chance encounters' towards the interplay between intuitions and mathematical concepts, which might mutually influence and enhance each other. Various psychologically oriented approaches have been seen in the pertinent research. Kahneman and Tversky (1972) showed the persistent bias of popular heuristics people use in random situations; Falk and Konold (1992) deal with causal strategies and the so-called outcome approach, a tendency to re-formulate probability statements into a direct, clear prediction. Borovcnik and Peard (1996) have described some specificities which are peculiar to probability and not to other mathematical concepts, and which might account for the special position of probability within the historic development of mathematics. The research on understanding probability is still ongoing, as may be seen from Borovcnik and Kapadia (2009). Lysø (2008) makes some suggestions to take up the challenge of intuitions right from the beginning of teaching.

All these endeavours to understand probability more deeply, however, seem to have had limited success. On the contrary, the more the educational community became aware of the difficulties, the more it tried to suggest cutting out critical passages, which means that probability is slowly but silently disappearing from the content being taught. It is somehow a solution that resembles that of the mathematicians when they teach probability courses at university: they hurry to reach sound mathematical concepts and leave all ambiguities behind.

The approach of modelling offers a striking opportunity to counterbalance this trend. Arguments that probability should be reduced in curricula at schools and at universities in favour of more data-handling and statistical inference might be met by the examples of this chapter; they connect approaches towards context and applications like that of Kapadia and Andersson (1987). Probability serves to model reality, to impose a specific structure upon it. In such an approach, key ideas to understand probability distributions turn out to be a fundamental tool to convey the specific implicit thinking about the models used and the reality modelled thereby.

Contrary to the current trend, the position of probability within mathematics curricula should be reinforced instead of being reduced. We may have to develop innovative ways to deal with the mathematics involved. To learn more about the mathematics of probability might not serve the purpose, as we may see from studies on understanding probabilistic concepts by Díaz and Batanero (2009). The perspective of modelling seems more promising: a modeller never understands all mathematical relations between the concepts. However, a modeller 'knows' about the inherent assumptions of the models and the restrictions they impose upon a real situation. Indirectly, modelling was also supported by Chaput, Girard, and Henry (2008, p. 6). They suggest the use of simulation to construct mental images of randomness.
Real applications are suggested by various authors to overcome the magic ingredients in randomness, e.g. Garuti, Orlandoni, and Ricci (2008, p. 5). Personal conceptions about probability are characterized by an overlap between objective and subjective conceptions. In teaching, subjective views are usually precluded; Carranza and Kuzniak (2008, p. 3) note the resulting consequences: “Thus the concept […] is truncated: the frequentist definition is the only one approach taught, while the students are confronted with frequentist and Bayesian problem situations.” In modelling real problems, the two aspects of probability are always present; it is not possible to reduce matters to one of these aspects, as the situation might lose its sense. Modelling thus might lead to a more balanced way of teaching probability.

From the perspective of modelling, the overlap between probabilistic and deterministic reasoning is a further source of complications. Ottaviani (2008, p. 1) stresses that probability and statistics belong to a line of thought which is essentially different from deterministic reasoning: “It is not enough to show random phenomena. […] it is necessary to draw the distinction between what is random and what is chaos.” Simulation or interactive animations may be used to reduce the need for mathematical sophistication. The idea of a scenario helps to explore a real situation, as shown in section 1. The case of taking out an insurance policy for a car is analysed in some detail in Borovcnik (2006). There is a need for a reference concept wider than the frequentist approach.

The perspective of modelling will help to explore issues. In modelling, it is rarely the case that one already knows all relevant facts from the context. Several models have to be used in parallel until one may compare the results and their inherent assumptions. A final overview of the results might help to solve some of the questions posed but raises some new questions. Modelling is an iterative cycle, which leads to more insights step by step. Of course, such a modelling approach is not easy to teach, and it is not easy for the students to acquire the flexibility in applying the basic concepts to explore various contexts.

Examinations are a further hindrance. What can and should be examined, and how should the results of such an exam be marked? The problems are multiplied by the need for centrally set examinations. Such examination procedures are intended to achieve comparability of the results of final exams throughout a country. They also form the basis for interventions in the school system: if a high percentage fail such an exam in a class, the teacher might be blamed, while if such low achievement is found in a greater region, the exam papers have to be revised, etc. However, higher-order attainment is hard to assess. While control over the results of schooling via central examinations 'guarantees' standards, such a procedure also has a levelling effect in the end. The difficult questions about a comparison of different probability models, evaluating the relative advantages of these models, and giving justifications for the choice of one or two of them – genuine modelling aspects involve ambiguity – might not be left enough scope in the future classroom of probability.
As a consequence of such trends, teaching will focus even more on developing basic competencies. Of applying probabilistic models in the sense of modelling contextual problems, only remnants may remain – mainly in the sense of a mechanistic 'application' of rules or ready-made models. To use one single model at a time does not clarify what a modelling approach can achieve. Even when one model is finally used, there is still much room for further modelling activities, like tuning the model's parameters to improve a specific outcome, which corresponds to one's benefit in the context. A wider perspective on modelling presents much more potential for students to really understand probability.

NOTES

1 Kolmogorov's axioms are rarely well-connected to the concept of distribution functions.
2 All texts from German are translated by the authors.
3 'What is to be expected?' would be less misleading than 'significantly below the expected value'.
4 A 'true' value is nothing more than a façon de parler.
5 This is quite similar to the 'ceteris paribus' condition in economic models.
6 In fact, the challenge is to detect a break between past and present; see the recent financial crisis.
7 Model 3 yields identical solutions to model 2. However, its connotation is completely different.
8 The final probability need not be monotonically related to the input probability p, as in this case.
9 Quetelet coined the idea of 'l'homme moyen'. Small errors superimposed on the (ideal value of) l'homme moyen 'lead directly' to the normal distribution.
10 Beyond common sense issues, the ill-posed comparison of scores in away matches against all matches has several statistical and methodological drawbacks.
11 The use of a one-sided alternative has to be considered very carefully. An unjustified use could lead to a rejection of the null hypothesis and to a statistical 'proof' of this pre-assumption.
REFERENCES

Borovcnik, M. (2006). Probabilistic and statistical thinking. In M. Bosch (Ed.), European research in mathematics education IV (pp. 484–506). Barcelona: ERME. Online: ermeweb.free.fr/CERME4/
Borovcnik, M. (2011). Strengthening the role of probability within statistics curricula. In C. Batanero, G. Burrill, C. Reading, & A. Rossman (Eds.), Teaching statistics in school mathematics. Challenges for teaching and teacher education: A joint ICMI/IASE study. New York: Springer.
Borovcnik, M., & Kapadia, R. (2009). Special issue on “Research and Developments in Probability Education”. International Electronic Journal of Mathematics Education, 4(3).
Borovcnik, M., & Peard, R. (1996). Probability. In A. Bishop, K. Clements, C. Keitel, J. Kilpatrick, & C. Laborde (Eds.), International handbook of mathematics education (pp. 239–288). Dordrecht: Kluwer.
Carranza, P., & Kuzniak, A. (2008). Duality of probability and statistics teaching in French education. In C. Batanero, G. Burrill, C. Reading, & A. Rossman (Eds.), Joint ICMI/IASE study: Teaching statistics in school mathematics. Challenges for teaching and teacher education. Monterrey: ICMI and IASE. Online: www.stat.auckland.ac.nz/~iase/publications
Chaput, B., Girard, J. C., & Henry, M. (2008). Modeling and simulations in statistics education. In C. Batanero, G. Burrill, C. Reading, & A. Rossman (Eds.), Joint ICMI/IASE study: Teaching statistics in school mathematics. Challenges for teaching and teacher education. Monterrey: ICMI and IASE. Online: www.stat.auckland.ac.nz/~iase/publications
Davies, P. L. (2009). Einige grundsätzliche Überlegungen zu zwei Abituraufgaben (Some basic considerations on two tasks of the final exam). Stochastik in der Schule, 29(2), 2–7.
Davies, L., Dette, H., Diepenbrock, F. R., & Krämer, W. (2008). Ministerium bei der Erstellung von Mathe-Aufgaben im Zentralabitur überfordert? (Ministry of Education overcharged with preparing the exam paper in mathematics for the centrally administered final exam?) Bildungsklick. Online: http://bildungsklick.de/a/61216/ministerium-bei-der-erstellung-von-mathe-aufgaben-im-zentralabiturueberfordert/
Díaz, C., & Batanero, C. (2009). University students' knowledge and biases in conditional probability reasoning. International Electronic Journal of Mathematics Education, 4(3), 131–162. Online: www.iejme.com/
Falk, R., & Konold, C. (1992). The psychology of learning probability. In F. Sheldon & G. Sheldon (Eds.), Statistics for the twenty-first century, MAA Notes 26 (pp. 151–164). Washington, DC: The Mathematical Association of America.
Feller, W. (1968). An introduction to probability theory and its applications (Vol. 1, 3rd ed.). New York: J. Wiley.
Fischbein, E. (1975). The intuitive sources of probabilistic thinking in children. Dordrecht: D. Reidel.
Garuti, R., Orlandoni, A., & Ricci, R. (2008). Which probability do we have to meet? A case study about statistical and classical approach to probability in students' behaviour. In C. Batanero, G. Burrill, C. Reading, & A. Rossman (Eds.), Joint ICMI/IASE study: Teaching statistics in school mathematics. Challenges for teaching and teacher education. Monterrey: ICMI and IASE. Online: www.stat.auckland.ac.nz/~iase/publications
Gigerenzer, G. (2002). Calculated risks: How to know when numbers deceive you. New York: Simon & Schuster.
Girard, J. C. (2008). The interplay of probability and statistics in teaching and in training the teachers in France. In C. Batanero, G. Burrill, C. Reading, & A. Rossman (Eds.), Joint ICMI/IASE study: Teaching statistics in school mathematics. Challenges for teaching and teacher education. Monterrey: ICMI and IASE. Online: www.stat.auckland.ac.nz/~iase/publications
Goertzel, T. (n.d.). The myth of the Bell curve. Online: crab.rutgers.edu/~goertzel/normalcurve.htm
Kahneman, D., & Tversky, A. (1972). Subjective probability: A judgement of representativeness. Cognitive Psychology, 3, 430–454.
Kapadia, R., & Andersson, G. (1987). Statistics explained. Basic concepts and methods. Chichester: Ellis Horwood.
Kapadia, R., & Borovcnik, M. (1991). Chance encounters: Probability in education. Dordrecht: Kluwer.
Lysø, K. (2008). Strengths and limitations of informal conceptions in introductory probability courses for future lower secondary teachers. In Eleventh International Congress on Mathematics Education, Topic Study Group 13 “Research and development in the teaching and learning of probability”. Monterrey, México. Online: tsg.icme11.org/tsg/show/14
Meyer, P. L. (1970). Introductory probability and statistical applications. Reading, Massachusetts: Addison-Wesley.
Ottaviani, M. G. (2008). The interplay of probability and statistics in teaching and in training the teachers. In C. Batanero, G. Burrill, C. Reading, & A. Rossman (Eds.), Joint ICMI/IASE study: Teaching statistics in school mathematics. Challenges for teaching and teacher education. Monterrey: ICMI and IASE. Online: www.stat.auckland.ac.nz/~iase/publications
Pierce, R. (n.d.). Quincunx. In Math is Fun – Maths Resources. Online: www.mathsisfun.com/data/quincunx.html
Schulministerium NRW. (n.d.). Zentralabitur NRW. Online: www.standardsicherung.nrw.de/abiturgost/fach.php?fach=2. Ex 7: http://www.standardsicherung.nrw.de/abitur-gost/getfile.php?file=1800
Manfred Borovcnik
Institute of Statistics, Alps-Adria University Klagenfurt, Austria

Ramesh Kapadia
Institute of Education, University of London, London WC1H 0AL, United Kingdom
ASTRID BRINKMANN AND KLAUS BRINKMANN
2. PROBLEMS FOR THE SECONDARY MATHEMATICS CLASSROOMS ON THE TOPIC OF FUTURE ENERGY ISSUES
INTRODUCTION
The students' interest and motivation in the mathematics classroom, towards the subject as a whole, may be increased by using and applying mathematics. “The application of mathematics in contexts which have relevance and interest is an important means of developing students' understanding and appreciation of the subject and of those contexts.” (National Curriculum Council, 1989, para. F1.4). Such contexts might be, for example, environmental issues that are of general interest to everyone. Hudson (1995) states that “it seems quite clear that the consideration of environmental issues is desirable, necessary and also very relevant to the motivation of effective learning in the mathematics classroom”.

One of the most important environmental impacts is that of energy conversion systems. Unfortunately this theme is hardly treated in mathematics education. Dealing with this subject may not only offer advantages for the mathematics classroom, but also provide a valuable contribution to the education of our children. The younger generation especially will be confronted with the environmental consequences of the extensive usage of fossil fuels, and thus a sustainable change from our presently existing power supply system to a system based on renewable energy conversion has to be achieved. The decentralised character of this future kind of energy supply surely requires more personal effort from everyone, and thus it is indispensable for young people to become familiar with renewable energies.

However, at the beginning of the 21st century there was a great lack of suitable school mathematical problems concerning environmental issues, especially ones strongly connected with future energy issues. An added problem is that the development of such mathematical problems requires the co-operation of experts in future energy matters, with their specialist knowledge, and mathematics educators, with their pedagogical content knowledge. The authors, working in such a collaboration, have developed a special didactical concept to open the field of future energy issues for students, as well as for their teachers, and this is presented below. On the basis of this didactical concept we have created several series of problems for the secondary mathematics classroom on the topics of rational usage of energy, photovoltaics, thermal solar energy, biomass, traffic, transport, wind energy and hydro power. The collection of worked out problems, with an extensive solution to each problem, has been published in a book in the
German language (Brinkmann & Brinkmann, 2005). Further problems dealing with so-called energy hybrid systems, i.e., combinations of several energy types, will be developed (see Brinkmann & Brinkmann, 2009). Some problem examples are presented in the third section of this article.

DIDACTICAL CONCEPT
The cornerstones of the didactical concept developed by the authors in order to promote renewable energy issues in mathematics classrooms are:
– The problems are chosen in such a way that the mathematical contents needed to solve them are part of mathematics school curricula.
– Ideally every problem should concentrate on a special mathematical topic such that it can be integrated into an existing teaching unit, as project-oriented problems referring to several mathematical topics are seldom picked up by teachers.
– The problems should be of a greater extent than usual text problems, in order to enable the students and also their teachers to concern themselves in a more intensive way with the subject.
– The problems should not require special knowledge of teachers concerning future energy issues and especially physical matters. For this reason all non-mathematical information and explanations concerning the problem's foundations are included in separate text frames.
– In this way information about future energy issues is provided for both teachers and students, helping them to concentrate on the topic. Thus, a basis for interdisciplinary discussion, argumentation and interpretation is given.

EXAMPLES OF MATHEMATICAL PROBLEMS
The Problem of CO2 Emission

This is an inter-disciplinary problem linked to the subjects of mathematics as well as chemistry, physics, biology, geography, and social sciences. Nevertheless, it may be treated in lower secondary classrooms. With respect to mathematics, the conversion of quantities is practised; knowledge of the rule of three and percentage calculation is required. The amount of CO2 produced annually in Germany, especially by transport and traffic, is illustrated vividly so that students become aware of it.

Information: In Germany, each inhabitant produces an annual average of nearly 13 t of CO2 (carbon dioxide). Combustion processes (for example from power plants or vehicle combustion motors) are responsible for this emission into the atmosphere. Assume now that this CO2 would build up a gaseous layer which stays directly above the ground.

a) What height would this CO2-layer reach in Germany after one year?
Hints:
– Knowledge from chemistry lessons is helpful for your calculations. There you learned that amounts of material can be measured with the help of the unit 'mole'. 1 mole of CO2 weighs 44 g and occupies a volume of 22.4 l under normal standard conditions (pressure 1013 hPa and temperature 0°C). With these values you can calculate approximately.
– You will find the surface area and the number of inhabitants of Germany in a lexicon.

Help: Find the answers to the following partial questions in the given order.
i) How many tons of CO2 are produced in total in Germany every year?
ii) What volume in l (litres) does this amount of CO2 take? (Regard the hint!)
iii) How many m³ of CO2 are therefore produced annually in Germany? Express this in km³!
iv) Assuming the CO2 produced annually in Germany forms a low layer of gas directly above the ground, what height would it have?

Information:
– In Germany the amount of waste is nearly 1 t for each inhabitant (private households as well as industry) every year; the average amount of CO2 produced per inhabitant is therefore 13 times this.
– The CO2 which is produced during combustion processes and emitted into the atmosphere distributes itself in the air. One part will be absorbed by plants with the help of photosynthesis; a much greater part goes into solution in the oceans' waters. But the potential for CO2 absorption is limited.
– In the 1990s, 20% of the total CO2-emissions in Germany came from the combustion engines engaged in traffic activities alone.

b) What height would the CO2-layer over Germany have, if this layer resulted only from the annual emissions from individual vehicles? How many km³ of CO2 is this?
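For teachers who want to check orders of magnitude, here is a minimal sketch of the computation for a) and b) (Python). The molar values come from the hint; Germany's surface area (c. 357,000 km²) and population (c. 82 million) are assumed values that should be replaced by looked-up figures:

    # rough check of the CO2-layer height (molar data per the hints;
    # area and population are assumptions, look them up for classroom use)
    area_km2 = 357_000          # surface area of Germany (assumed)
    inhabitants = 82e6          # population of Germany (assumed)
    t_per_person = 13           # t of CO2 per inhabitant and year (given)

    total_t = inhabitants * t_per_person       # i) total mass in t
    litres = total_t * 1e6 / 44 * 22.4         # ii) 1 t = 1e6 g; 44 g <-> 22.4 l
    km3 = litres / 1e3 / 1e9                   # iii) l -> m3 -> km3
    height_m = km3 / area_km2 * 1000           # iv) layer height in m
    print(f"{km3:.0f} km3, layer of {height_m:.2f} m")        # ~ 543 km3, 1.52 m
    print(f"traffic only: {0.2 * height_m:.2f} m, {0.2 * km3:.0f} km3")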
Usable Solar Energy

This problem deals with the heating of water in private households using solar energy. It can be treated in lessons involving the topic of percentage calculation and the rule of three. It requires the understanding and usage of data representations.

Information: In private households the warm water required can be partly heated up by solar thermal collectors. They convert the solar radiation energy into thermal energy. This helps us to decrease the usage of fossil fuels, which leads to environmental problems.
Private households need heated water preferably in the temperature region 45°–55°C. In our region, the usable thermal energy from the sun is not permanently sufficient to bring water to this temperature, because of its seasonal behaviour. Thus, an input of supplementary energy is necessary.
Figure 1. A solar thermal energy plant (source: DGS LV Berlin BRB).
The following figure (Figure 2) shows how much of the energy needed for heating up water to a temperature of 45°C in private households can be covered by solar thermal energy and how much supplementary energy is needed.

Figure 2. Usable solar energy and additional energy (percentage of covered energy, 0–100%, by month from March (1) to February (12)).
The following problems refer to the values shown in Figure 2.
a) What percentage of the thermal energy needed for one year can be provided by solar thermal energy?

Information:
– Energy is measured by the unit kWh. An average household in central Europe consumes nearly 4000 kWh of energy for the heating of water per year.
– 1 l of fossil oil provides approximately 10 kWh of thermal energy. The combustion of 1 l of oil produces nearly 68 l of CO2.
– The great amount of CO2 produced worldwide at present by the combustion of fossil fuels damages the environment.

b) How many kWh may be provided in one year by solar thermal energy?
c) How many litres of oil have to be bought to supply the supplementary thermal energy needed for one year for a private household?
d) How many litres of oil would be needed without solar thermal energy?
e) How many litres of CO2 could be saved by an average household in Germany during one year by using solar collectors?
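The monthly coverage values must be read off Figure 2; the arithmetic that follows is then simple. A sketch (Python) with purely hypothetical monthly percentages, only to show the structure of the solution; it also assumes equal water demand in every month:

    # hypothetical monthly solar coverage in % (read the real values off Figure 2)
    coverage = [60, 70, 85, 90, 90, 80, 65, 45, 30, 25, 30, 45]   # Mar..Feb
    year_fraction = sum(coverage) / len(coverage) / 100   # a) assumes equal demand
    solar_kwh = 4000 * year_fraction                      # b)
    oil_l = (4000 - solar_kwh) / 10                       # c) supplementary oil
    oil_l_no_solar = 4000 / 10                            # d)
    co2_saved_l = solar_kwh / 10 * 68                     # e)
    print(round(solar_kwh), round(oil_l), round(co2_saved_l))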
The Problem of Planning Solar Collector Systems

This problem deals with calculations for planning solar collector systems by using linear functions. The understanding and usage of graphical representations is practised.

Information:
– In private households, the heating of warm water can partly be done by solar collector systems. Solar collector systems convert radiation energy into thermal energy. This process is called solar heat.
– The usage of solar heat helps to save fossil fuels like natural gas, fuel oil or coal, whose combustion damages the environment.
– Energy is measured by the unit kWh. An average household in central Europe consumes nearly 4000 kWh of thermal energy per year for the heating of warm water.
– Private households need heated water preferably in the temperature region 45°–55°C. At our latitude, the usable thermal energy from the sun is not permanently sufficient to bring water to this temperature, because of its seasonal behaviour. Thus, an input of supplementary energy is necessary.
– The warm water requirement per day per person can be reckoned at about 50 l. By energy-conscious living this value can easily be reduced to 30 l per person and day.
– 1 l of fossil oil provides approximately 10 kWh of thermal energy. The combustion of 1 l of oil produces nearly 68 l of CO2.

The diagram in Figure 3 provides data for planning a solar collector system for a private household. It shows the dependence of the collector area needed on, for example, the part of Germany where the house is situated, on the number of persons living in the respective household, on the desired amount of warm water per day per person, as well as on the desired output of thermal energy provided by solar thermal energy (in percent).
Figure 3. Dimensioning diagram (storage capacity 100–400 l against the number of persons (2–6) and the collector area in m² (2–8)).
Example: In a household in central Germany with 4 persons and a consumption of 50 l of warm water per day for each one, a collector area of 4 m² is needed for a reservoir of 300 l and an energy coverage of 50%.
a) What would be the collector area needed for the household you are living in? What assumptions do you need to make first? What would be the minimal possible collector area, and what the maximal one?
b) A collector area of 6 m² that provides 50% of the produced thermal energy is installed on a house in southern Germany. How many persons could be supplied with warm water in this household?
c) Describe, using a linear function, the dependence of the storage capacity on the number of persons in a private household. Assume first a consumption of 50 l of warm water per day per person, and second a consumption of 30 l. Compare the two function terms and their graphical representations.
d) Show a graphical representation of the dependence of the collector area on a chosen storage capacity, assuming a thermal energy output of 50% for a house in central Germany.

Sun Collectors

This problem can be integrated in lessons about quadratic parabolas and uses their focal property as an application in the field of sun collectors.
Information:
– Direct solar radiation may be concentrated in a focus by means of parabolic sun collectors (Figure 4). These use the focal property of quadratic parabolas.
– Such sun collectors are figures with rotational symmetry; they evolve by rotation of a quadratic parabola. Their inner surface is covered with a reflective mirror surface; that is why they are named parabolic mirrors.
– Sun beams may be assumed to be parallel. Thus, if they fall on such a collector parallel to its axis of rotation, the beams are reflected so that they all cross the focus of the parabola. The thermal radiation energy may be focused this way in one point.
– The temperature of a heating medium, which is led through this point, becomes very high relative to the environment. This is used for heating purposes, but also for the production of electric energy.
Figure 4. Parabolic sun collectors (source: DLR).
a) A parabolic mirror was constructed by rotation of the parabola y = 0.125 x². Determine its focal length (x and y are measured in metres).
b) A parabolic mirror has a focal length of 8 m. Which quadratic parabola was used for its construction?
c) Does the parabolic mirror with y = 0.0625 x² have a greater or a smaller focal length than the one in b)? Generalize your result.
d) A parabolic mirror shall be constructed with a width of 2.40 m and a focal length of 1.25 m. How great is its arch, i.e., how much does the vertex lie deeper than the border?
e) In Figure 5 you see a parabolic mirror, the EuroDish, with a diameter of 8.5 m. Determine from the figure, neglecting errors resulting from the projection sight, its approximate focal length and the associated quadratic parabola.
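The key fact behind a)–d) is that the parabola y = a·x² has its focus at height f = 1/(4a) above the vertex. A quick numerical check (Python; the focal-length formula is standard, the numbers come from the tasks):

    def focal_length(a):          # focus of y = a*x**2 lies at y = 1/(4*a)
        return 1 / (4 * a)

    print(focal_length(0.125))    # a) -> 2.0 m
    print(1 / (4 * 8))            # b) a = 1/(4f) -> 0.03125, i.e. y = x**2/32
    print(focal_length(0.0625))   # c) -> 4.0 m; halving a doubles f
    print(0.2 * 1.2**2)           # d) a = 1/(4*1.25) = 0.2; depth at x = 1.2 m -> 0.288 m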
Figure 5. EuroDish system (source: Wikimedia Commons).
Information: Other focussing sun collectors are figures with length-symmetry; they evolve by shifting a quadratic parabola along the direction of one axis. They are named parabolic trough solar collectors (Figure 6).
Figure 6. Parabolic trough solar collectors in Almería, Spain and California, USA (source: DLR and FVEE/PSA/DLR).
f) The underlying function of a parabolic trough solar collector is given by y = 0.35 x² (1 unit = 1 m). Where does the heating pipe have to be installed?
Photovoltaic Plant and Series Connected Efficiencies

The aim of this problem is to make students familiar with the principle of series connected efficiencies, as they occur in complex energy conversion devices. As an example, an off-grid photovoltaic plant for the conversion of solar energy to AC current as a self-sufficient energy supply is considered. The problem can be treated in a teaching unit on the topic of fractions.

Figure 7 shows the components of an interconnected energy conversion system to build up a self-sufficient electrical energy supply. This kind of supply system is of special interest for developing countries, and also for buildings in rural off-grid areas (Figure 8). Figure 7 shows in schematic form the production of electrical energy from solar radiation with the help of a solar generator for off-grid applications. In order to guarantee a gap-free energy supply for times without sufficient solar radiation, a battery as an additional storage device is included.

Figure 7. Off-grid photovoltaic plant: solar generator (η_PV) – charge control (η_CC) – battery (η_B) – inverter (η_I) – consumer.
Figure 8. Illustration of an off-grid photovoltaic plant on the mountain hut “Starkenburger Hütte” (Source: Wikipedia).
Information: The components of an off-grid photovoltaic (PV) plant are 1) a solar generator, 2) a charge control, 3) an accumulator and 4) an inverter (optional, for AC applications). The solar generator converts the energy of the solar radiation into electrical energy as direct current (DC). The electricity is passed to a battery via a charge control. From there it can be transformed, directly or later after storage in the battery, to alternating current (AC), which is what most electric devices need. Unfortunately, it is not possible to use the radiation energy without losses. Every component of the conversion chain produces losses, so only a fraction of the energy input for each component will be the energy input for the following component. The efficiency η of a component is defined by

η = (power going out) / (power coming in).

Power is the energy converted in 1 second. It is measured by the unit W or kW (1 kW = 1000 W). For comparison, standard electric bulbs need a power of 40 W, 60 W or 100 W; a hair blower consumes up to 2 kW.

Assume in tasks a), b) and c) that all the electric current is first stored in the battery before it reaches the consumer.
a) Consider that the momentary radiation on the solar generator would be 20 kW. Calculate the outgoing power for every component of the chain, if
η_PV = 3/25, η_CC = 19/20, η_B = 4/5 and η_I = 23/25.
b) What is the total system efficiency
η_total = (gained power for the consumer) / (insolated power)?
How can you calculate η_total by using only the values η_PV, η_CC, η_B and η_I? Give a formula for this calculation.
c) Transform the efficiency values given in a) into decimal numbers and percentages. Check your result obtained in a) with these numbers.
d) How do the battery efficiency and the total system efficiency change, if only 1/3 of the electric power delivered by the charge control were stored in the battery and the remaining 2/3 went directly to the inverter? What is your conclusion from this?
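For a) and b), the outgoing powers are obtained by multiplying along the chain; a minimal sketch (Python, using exact fractions since the problem sits in a teaching unit on fractions):

    from fractions import Fraction

    power = Fraction(20)                       # insolated power in kW
    chain = [("PV", Fraction(3, 25)), ("CC", Fraction(19, 20)),
             ("B", Fraction(4, 5)), ("I", Fraction(23, 25))]
    for name, eta in chain:
        power *= eta                           # each stage keeps the fraction eta
        print(name, float(power), "kW")        # 2.4, 2.28, 1.824, ~1.678 kW
    print("total efficiency:", float(power / 20))   # product of all etas, ~0.0839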
Wind Energy Converter

This problem deals with wind energy converters. It can be treated in lessons on geometry, especially calculations of circles, or in lessons on quadratic parabolas. The conversion of quantities is practised.

Information: The nominal power of a wind energy converter depends upon the rotor area A with the diameter D, as shown in Figure 9 below.
Figure 9. Nominal power related to the rotor area (rotor diameter D – nominal power): 27 m – 225 kW, 33 m – 300 kW, 40 m – 500 kW, 44 m – 600 kW, 48 m – 750 kW, 54 m – 1000 kW, 64 m – 1500 kW, 72 m – 2000 kW, 80 m – 2500 kW.
a) Interpret the meaning of Figure 9.
b) Show the dependence of the nominal power of the wind energy converter on the rotor diameter D and on the rotor area A, respectively, by graphs in co-ordinate systems.
c) Find the formula which gives the nominal power of the wind energy converter as a function of the rotor area and of the rotor diameter, respectively.
d) What rotor area would you expect to need for a wind energy converter with a nominal power of 3 MW? Give a reason for your answer. (Note: 1 MW = 1000 kW.) What length should the rotor blades have in this case?

Information:
– The energy that is converted in one hour [h] by the power of one kilowatt [kW] is 1 kWh.
– In central Europe, wind energy converters produce their nominal power on average for 2000 hours a year, when wind energy conditions are sufficient.

e) Calculate the average amount of energy in kWh which would be produced by a wind energy converter with a nominal power of 1.5 MW during one year in central Europe.

Information:
– An average household in central Europe consumes nearly 4000 kWh of electrical energy per year.
f) In theory, how many average private households in central Europe could be supplied with electrical energy by a 1.5 MW wind energy converter? Why do you think that this could only be a theoretical calculation?
g) Assume the nominal power of a 600 kW energy converter would be reached at a wind speed of 15 m/s, measured at the hub height. How many km/h is this? How fast do the tips of the blades move, if the rotation speed is 15 per minute? Give the solutions in m/s and km/h, respectively. Compare the result with the wind speed.

Wind Energy Development

This problem requires the usage and interpretation of data and statistics, which is done in the context of wind energy development in Germany.

Information: At the end of 1990 the installed wind energy converters in Germany had a total nominal power of 56 MW. At the end of 2000 this amount had increased to a total of 6113 MW. Power is the energy converted in a time unit; it is measured by the unit Watt [W]. 10⁶ W are one Megawatt [MW]. The following table shows the development of the newly installed wind power in Germany in the years 1991–2000.

Table 1. Development of new installed wind energy in Germany
Year   Number of new installed    Total of new installed    Total of
       wind energy converters     nominal power [MW]        nominal power [MW]
1991   300                        48
1992   405                        74
1993   608                        155
1994   834                        309
1995   911                        505
1996   804                        426
1997   849                        534
1998   1010                       733
1999   1676                       1568
2000   1495                       1665
a) Fill in the missing data in the 4th column.
b) Show the development of the annual new installed nominal power and of the total annual nominal power in graphical representations.
c) Considering only the data of the years 1991–1998, what development with respect to the installation of new wind power in Germany could be expected, in your opinion? Give a well-founded answer! Compare your answer with the real data given for 1999 and 2000 and comment on it. What is your projection for 2005? Why?
d) Calculate, using the data in Table 1, for each of the years 1991–2000 the average size of the newly installed wind energy converters in kW. (Note: 1 MW = 1000 kW.) Show the respective development graphically and comment on it. Can you offer a projection for the average size of a wind energy converter that will be installed in 2010?
e) Comment on the graphical representation in Figure 10. Also take into account political and economical statements and possible arguments.
Figure 10. Development of wind energy use in Europe: installed capacity in gigawatt, 1990–2020 – predictions and reality (predictions: European Commission Advanced Scenario 1996; EWEA 1997; European Commission White Paper 1997; European Commission PRIMES 1998; Greenpeace/EWEA 2002; IEA 2002; plus the actual development).
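Task d) reduces to one division per year; a minimal sketch of that computation (Python, values taken from Table 1):

    converters = [300, 405, 608, 834, 911, 804, 849, 1010, 1676, 1495]
    new_mw     = [48, 74, 155, 309, 505, 426, 534, 733, 1568, 1665]
    for year, n, mw in zip(range(1991, 2001), converters, new_mw):
        print(year, round(mw * 1000 / n), "kW")   # average converter size grows
                                                  # from 160 kW (1991) to ~1114 kW (2000)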
Betz' Law and Differentiation

This problem deals with the efficiency of a wind energy converter; it can be treated in lessons on differentiation and the determination of local extreme values.
Information: 2% of the radiated solar energy is converted to kinetic energy of air molecules. In combination with the earth's rotation, this results in wind production. The kinetic energy of an air mass Δm is E = ½ · Δm · v², in which v denotes the velocity of the air mass. The kinetic energy can be written as E = ½ · ρ · ΔV · v², given the density of air ρ = 1.2 g/l and the relation Δm = ρ · ΔV with the volume element ΔV. The power is defined as the ratio of energy to time, P = E/Δt.

a) A wind volume ΔV flows through a rotor area A and needs the time Δt to travel the distance Δs. Therefore the speed is v = Δs/Δt. Determine the general formula for the volume element which passes the rotor area A during the time interval Δt as a function of the wind speed.
b) Give the formula for the amount of wind power P_wind which passes through the rotor area A as a function of the wind velocity. Show that the power increases with the third power of the wind velocity.
Information: A rotor of a wind energy converter with area A slows down the incoming wind speed from v1 in front of the rotor to the lower speed v2 behind the rotor (Figure 11). The wind speed in the rotor area itself can be shown to be the average of v1 and v2, i.e., v = (v1 + v2)/2. The converted power is then given by:

P_c = P1 − P2 = ½ · (ΔV/Δt) · ρ · (v1² − v2²).
Figure 11. Wind flow through the rotor area (speed v1 and cross-section A1 in front of the rotor, v at the rotor area A, v2 and A2 behind it).
c) Express the formula for the determination of P_c as a function of A, v1 and v2.
d) Describe the converted power P_c in c) as a function of the variable x = v2/v1 and of v1.
Information: The efficiency of a wind energy converter (power coefficient) is defined as the ratio of the converted power to the wind power input, c_P = P_c / P_wind.

e) Express the power coefficient as a function of the variable x = v2/v1. Draw the graph of this function. Note that x ∈ [0, 1] – why?
f) Determine the value x_max which corresponds to the maximum value of the power coefficient, the so-called Betz' efficiency. This is the value of x which gives the best energy conversion.
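Working through b)–e) yields the closed form c_P(x) = ½ · (1 + x)(1 − x²); under that assumption, the maximization in f) can be checked symbolically (a sketch in Python using sympy):

    from sympy import symbols, diff, solve, Rational

    x = symbols('x', positive=True)
    cP = Rational(1, 2) * (1 + x) * (1 - x**2)   # power coefficient from e)
    x_max = solve(diff(cP, x), x)[0]             # critical point in (0, 1)
    print(x_max, cP.subs(x, x_max))              # -> 1/3 and 16/27 (~ 0.593)

So the rotor extracts at most 16/27 ≈ 59% of the wind power – the Betz limit.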
Biomass and Reduction of CO2 Emissions

This problem deals with fossil fuels and biomass, especially with the production of CO2 emissions and possibilities for their reduction. The conversion of quantities is practised, and knowledge of the rule of three and percentage calculation is required.

Information: In Germany, for example, an average private household consumes nearly 18000 kWh of energy annually. 80% of this amount is for heating purposes and 20% for electrical energy. The energy demand for heating and hot water is mainly covered by the use of fossil fuels like natural gas, fuel oil or coal. Assume that the calorific values of gas, oil and coal can be converted to usable heating energy with a boiler efficiency of 85%. This means that 15% is lost in each case.
a) The following typical specific calorific values are given:
Natural gas: 9.8 kWh/m³
Fuel oil: 11.7 kWh/kg
Coal: 8.25 kWh/kg
(That is, the combustion of 1 m³ of natural gas supplies 9.8 kWh, 1 kg of fuel oil supplies 11.7 kWh and 1 kg of coal supplies 8.25 kWh.)
What amount of these fuels is necessary annually for a private household in each case?
b) The specific CO2-emissions are approximately:
Natural gas: 2.38 kg CO2/kg
Fuel oil: 3.18 kg CO2/kg
Coal: 2.90 kg CO2/kg
The density of natural gas is nearly 0.77 kg/m³. How many m³ of CO2 does this make each year for a private household in Germany in each case?
Hint: Amounts of material can be measured with the help of the unit 'mole'. 1 mole of CO2 weighs 44 g and has a volume of approximately 22.4 l.
Information: Wood is essentially made from CO2 taken from the atmosphere and water. The bound CO2 is discharged by burning the wood and is used again in the growth of plants. This is the so-called CO2-circuit. Spruce wood has a specific calorific value of nearly 5.61 kWh/kg. The specific CO2-emissions are approximately 1.73 kg CO2/kg.
c) How many kg of spruce wood would be needed annually for a private household instead of gas, oil or coal? (Assume again a boiler efficiency of 85%.) How many m³ of fossil CO2-emissions could be saved in this case?
Information: Spruce wood as piled up split firewood has a storage density of 310 kg/m³.
d) How much space has to be set aside in an average household for a fuel storage room which contains a whole year's supply of wood? Compare this with your own room!
e) Discuss the need for saving heat energy with the help of heat insulation.
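Tasks a) and c) follow the same pattern – divide the required fuel energy by the calorific value. A compact sketch (Python, numbers from the information boxes above):

    heating_kwh = 18_000 * 0.80          # 80% of the annual 18000 kWh is heat
    fuel_kwh = heating_kwh / 0.85        # boiler efficiency 85%
    calorific = {"gas (m3)": 9.8, "oil (kg)": 11.7,
                 "coal (kg)": 8.25, "spruce (kg)": 5.61}
    for fuel, kwh_per_unit in calorific.items():
        print(fuel, round(fuel_kwh / kwh_per_unit))   # a) and c)
    # -> roughly 1729 m3 gas, 1448 kg oil, 2053 kg coal, 3020 kg spruce wood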
Automobile Energy Consumption

This problem can be treated in lessons on trigonometry. Its solution requires knowledge of the rule of three. The problem makes clear the dependence of an automobile's energy consumption on the distance–height profile, the moved mass and the velocity.

Tim and Lisa make a journey through Europe. Just before the frontier to Luxembourg their fuel tank is empty. Fortunately they have a reserve tank filled with 5 l of fuel. “Let's hope it will be enough to reach the first filling station in Luxembourg. There, the fuel is cheaper than here,” Tim says. “It would be good if we had an exact description of the route, then we would be able to calculate our range,” answers Lisa.

Information:
– In order to drive, the resisting forces have to be overcome. Therefore a sufficient driving force F_drive is needed. For an average standard car, the law for this force (in N) is given by the following formula:
F_drive = (0.2 + 9.81 · sin α) · m + 0.3 · v², for F_drive ≥ 0,
where m, the moving mass (in kg), is the mass of the vehicle, passengers and packages; v is the velocity (in m/s); and α is the angle relative to the horizontal line. α is positive for the uphill direction and negative in the downhill case (Figure 12).
– The energy E (in Nm) which is necessary for driving can be calculated, in the case of a constant driving force, by E = F_drive · s, with s as the actual distance driven (in m).
– The primary energy contained in the fuel amounts to about 9 kWh for each litre of fuel. (kWh is the symbol for the energy unit 'kilowatt-hour'; 1 kWh = 3 600 000 Nm.)
– The efficiency of standard combustion engines in cars for average driving conditions is between 10% and 20% nowadays; this means only 10%–20% of the primary energy in the fuel is available to generate the driving forces.
Figure 12. Definition of the angle α.
The distance which Tim and Lisa have to drive to the first filling station in Luxembourg can be approximately described by a graphical representation like the one given in Figure 13. (Attention: think of the different scaling of the co-ordinate axes.) The technical data sheet of their vehicle gives the unladen weight of their car as about 950 kg. Tim and Lisa together weigh c. 130 kg, and their packages nearly 170 kg. The efficiency of the engine can be assumed to be c. 16%.

Figure 13. Distance–height diagram to the next filling station (h is the height above mean sea level; heights marked at 140 m, 200 m and 350 m; route sections I, II and III).
a) Can Tim and Lisa take the risk of not searching for a filling station before the frontier? Assume at first that the speed they drive is 100 km/h.
b) Would Tim and Lisa have less trouble if they had only 50 kg of packages instead of 170 kg?
c) Would the answer to a) change if Tim and Lisa chose their speed to be only 50 km/h?
Help for a):
i) Note that the speed in the formula for F_drive has to be measured in m/s.
ii) The value for sin α can be calculated with the information given in Figure 13.
iii) Determine for each section the force F_drive and the energy needed. The distance s has to be measured in m. Convert the energy from Nm into kWh.
iv) Determine the total energy which is needed for the whole distance as a sum over the three different sections.
v) How many kWh of driving energy are provided by the 5 l of reserve fuel? Consider the efficiency of the motor.
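The section lengths and gradients must be read off Figure 13; the mechanics of the calculation can nevertheless be sketched. Below, purely for illustration, a flat road is assumed (sin α = 0), so the resulting range is hypothetical (Python):

    m = 950 + 130 + 170             # total moving mass in kg
    v = 100 / 3.6                   # 100 km/h in m/s
    f_drive = 0.2 * m + 0.3 * v**2  # driving force on a flat road, sin(alpha) = 0
    e_avail = 5 * 9 * 0.16          # 5 l x 9 kWh/l x 16% engine efficiency, in kWh
    range_m = e_avail * 3_600_000 / f_drive      # energy / force = distance in m
    print(round(f_drive), "N;", round(range_m / 1000), "km on a flat road")
    # -> 481 N; about 54 km, so the hills in Figure 13 decide the answer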
Automobiles: Forces, Energy and Power

This is a problem that can be treated in higher secondary mathematics education, in the context of differential and integral calculus. The problem shows the dependence of an automobile's power and energy consumption on the distance–height profile, the moved mass and the velocity.

Kay has a new electric vehicle, of which he is very proud. He wants to drive his girlfriend Ann from the disco to her home. Ann jokes: “You will never get over the hill to my home with this car!” “I bet that I will,” says Kay. The distance–height characteristic of the street from the disco (D) to the house (H) in which Ann lives is shown in Figure 14, and it can be described by the following function:

h(x) = (1/1000) · (−4x² + 104x + 300) for x ∈ [0, 20],
where x and h are measured in kilometres (km).

Figure 14. Distance–height diagram between D and H.
a) Show that it is possible to calculate the real distance s corresponding to a given height function h over the interval [x1, x2] with the help of the following formula:

s = ∫ from x1 to x2 of √(1 + (h′(x))²) dx.
Help: Consider the right-angled triangle shown in Figure 15.

Figure 15. Geometrical representation (right-angled triangle with legs Δx and Δh and hypotenuse Δs).
b) How long is the real distance which Kay has to drive from the disco to the house where Ann is living?
Help: Show that
∫ √(1 + x²) dx = ½ · (x · √(1 + x²) + ln(x + √(1 + x²))) + const.
with the help of the derivative of ½ · (x · √(1 + x²) + ln(x + √(1 + x²))).
(Hint: This partial result is helpful for solving problem f).)
c) α denotes the angle of the tangent to the curve h at the point x0 on the x-axis. Prove that
sin α = h′(x0) / √(1 + (h′(x0))²).
(Note that h′(x0) = tan α.)
d) Assume Kay wants to drive at a constant speed of 110 km/h. Determine the driving power necessary at the top of the hill (maximum of h) and at the points with h(x) = 0.4 and h(x) = 0.8. For this purpose you need the following data: Kay's electric vehicle has an empty weight of 620 kg; Kay and Ann together weigh nearly 130 kg.

Information:
– In order to drive, the resisting forces have to be overcome. Therefore a sufficient driving force F_drive is needed. For an average standard car, the law for this force (in N) is given by the following formula:
F_drive = (0.2 + 9.81 · sin α) · m + 0.3 · v², for F_drive ≥ 0,
where m, the moving mass (in kg), is the mass of the vehicle, passengers and packages; v is the velocity (in m/s); and α is the angle relative to the horizontal line. α is positive for the uphill direction and negative in the downhill case (Figure 16).
– The driving power P (in Nm/s), which is needed to hold the constant speed, can be calculated as the product of the driving force (in N) and the velocity (in m/s):
P = F_drive · v.
The power P is measured with the unit [kW], with 1 kW = 1000 Nm/s.

Figure 16. Angle α depending on the function h(x).
e) Kay's electric vehicle has a nominal power of 25 kW. Is it possible for him to bring Ann home?
f) Determine the driving energy which has to be consumed for the route from the disco to Ann's home. Assume that Kay drives uphill with a speed of 80 km/h and downhill with a speed of 110 km/h. (Attention: Because h(x) as well as x are expressed in km in the function equation of h, the resulting energy in the following equation is obtained in N·km = 1000 Nm.)

Information:
– The pure driving energy E (in Nm), which is necessary for driving, can be calculated as:
E = ∫ from 0 to s0 of F_drive ds = ∫ from 0 to x0 of F_drive · √(1 + (h′(x))²) dx,
where s is the actual distance (in m) driven.
– The energy E is usually measured in kilowatt-hours [kWh]; 1 kWh = 3 600 000 Nm.

g) The actual charged electrical energy in the batteries of Kay's vehicle is 6 kWh. The driving efficiency of his electric vehicle is nearly 70%; this means only 70% of the stored energy can be used for driving. Is the charging status of Kay's batteries sufficient to bring Ann to her home, under the assumptions in f)?
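As a plausibility check for d) and e): at the top of the hill sin α = 0, so the force formula can be evaluated directly (a sketch in Python; mass and speed are taken from the task):

    m = 620 + 130                      # vehicle plus passengers, in kg
    v = 110 / 3.6                      # 110 km/h in m/s
    f_top = 0.2 * m + 0.3 * v**2       # at the hill top sin(alpha) = 0
    p_top = f_top * v / 1000           # power in kW
    print(round(p_top, 1), "kW")       # ~ 13.1 kW, well below the 25 kW of e)

The steep stretches, where sin α ≠ 0, are where the 25 kW limit actually gets tested.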
CLASSROOM IMPLEMENTATION
The problems on environmental issues developed by the authors must be seen as an offer of teaching material. In each case, the students' abilities have to be considered. In lower achieving classes it might be advisable not to present every problem at full length. In addition, lower achievers need a lot of help in solving complex problems that require several calculation steps. The help given in some problems, as in example 1 above, addresses such students. The problems should be presented to higher achievers without much help included. It might even be of benefit not to present the given hints from the beginning. Students would thus have to find out which quantities are needed in order to solve the problem. The problem would become more open and the students would be more involved in modelling processes.

As the intention of the authors is also an informative one, in order to give more insight into the field of future energy issues, the mathematical models/formulas are mostly given in the problem texts. Students are generally not expected to find out the often complex contexts by themselves; these are already presented, thus guaranteeing realistic situation descriptions. The emphasis in the modelling activities lies rather in the argumentation and interpretation processes demanded, recognising that mathematical solutions lead to a deeper understanding of the studied contents.

In the context of an evaluation of lessons dealing with problems like those presented in this paper, students were asked, among other things, to express what they had mainly learned. The given answers can be divided into three groups: mathematical concepts, contents concerning renewable energy topics, as well as convenient problem solving strategies. As regards the last point, students stressed especially that they had learned that it is necessary to read the texts very carefully, and also to consider the figures and tables very carefully. Almost all students expressed that they would like to work on many more problems of this kind in mathematics classes, as the problems are interesting, relevant for life, and more fun than pure mathematics.

Classroom experiences show that students react in different ways to the problem topics. While some are horrified by recognizing, for example, that the worldwide oil reserves will already be running low during their lifetime, others are unmoved by this fact, as twenty or forty years in the future is not a time they worry about. In school lessons there are again and again situations where students drift into political and social discussions related to the problem contexts. Although desirable, this would sometimes lead to too great a loss of time for mathematical education itself. Co-operation with teachers of other school subjects would be profitable if possible.

OUTLOOK AND FINAL REMARKS
In order to integrate future energy issues into the curricula of public schools, several initiatives have already been started in Germany, supported by and in co-operation with the 'Deutsche Gesellschaft für Sonnenenergie e.V. (DGS)', the German section of the ISES (International Solar Energy Society). There exists a European project, named “SolarSchools Forum”, that aims to integrate future energy issues into the curricula of public schools. In the context of this
project the German society for solar energy (DGS) highlights the teaching material the authors created (http://www.dgs.de/747.0.html). Most of this material is only available in the German language. This article is a contribution towards making these materials accessible in English also. (The English publications up to now (see e.g. Brinkmann & Brinkmann, 2007) present only edited versions of some of the problems.)

Although education in the field of future energy issues is of general interest, the project that we presented in this paper seems to be the only major activity focusing especially on mathematics lessons. The number of problems should thus be increased, especially with problems which deal with a combination of different renewable energy converters, like hybrid systems, to give an insight into the complexity of system technology. Additionally, the sample mathematical problems on renewable energy conversion and usage have to be permanently adjusted to new developments because of the dynamic evolution of the technology in this field.

REFERENCES

Brinkmann, A., & Brinkmann, K. (2005). Mathematikaufgaben zum Themenbereich Rationelle Energienutzung und Erneuerbare Energien (Mathematical problems on the topics of rational usage of energy and renewable energies). Hildesheim, Berlin: Franzbecker.
Brinkmann, A., & Brinkmann, K. (2007). Integration of future energy issues in the secondary mathematics classroom. In C. Haines, P. Galbraith, W. Blum, & S. Chan (Eds.), Mathematical modelling (ICTMA 12): Education, engineering and economics (pp. 304–313). Chichester: Horwood Publishing.
Brinkmann, A., & Brinkmann, K. (2009). Energie-Hybridsysteme – Mit Mathematik Fotovoltaik und Windkraft effizient kombinieren (Energy hybrid systems – combining photovoltaics and wind power efficiently with mathematics). In A. Brinkmann & R. Oldenburg (Eds.), Schriftenreihe der ISTRON-Gruppe. Materialien für einen realitätsbezogenen Mathematikunterricht, Band 14 (pp. 39–48). Hildesheim, Berlin: Franzbecker.
Hudson, B. (1995). Environmental issues in the secondary mathematics classroom. Zentralblatt für Didaktik der Mathematik, 27(1), 13–18.
National Curriculum Council. (1989). Mathematics non-statutory guidance. York: National Curriculum Council.
Astrid Brinkmann
Institute of Mathematics Education
University of Münster, Germany

Klaus Brinkmann
Umwelt-Campus Birkenfeld
University of Applied Science Trier, Germany
TIM BROPHY
3. CODING THEORY
INTRODUCTION
Throughout the world, teachers of mathematics instruct their pupils in the wonders of numbers. It is always a challenge to find areas where the topics covered at school intersect the world of the student, but the search for such intersection points is well rewarded by arousing the interest of the student in the topic. Teachers already have a very heavy workload and may simply not have the time to do such research. This chapter attempts to link the students’ world of shopping, curiosity and music to modular arithmetic, trigonometry and complex numbers. It is hoped that the busy teacher will find here ideas to enliven classwork and use the students’ natural curiosity as a pedagogical tool in the exploration of numbers.

In today’s world we rely on digital information for many things. Without digital storage there would be no such thing as a CD or a DVD, satellite television or an mp3 player. NASA would not have been able to receive pictures from Mars. Mobile phones would not work.
Figure 1. Asteroid ice (Courtesy NASA/JPL-Caltech).
Information stored in digital form can, indeed will, become corrupted. This leads to errors in the information stored or transmitted. This article is about the two processes of first detecting and then correcting these errors. Sometimes the errors are unimportant. You may be speaking to someone on a telephone line with a faint crackle in the background. This crackle, while it may be irritating, does not prevent the transmission or reception of information: your voices. Sometimes the errors that could occur are very important. You may be making a purchase with a credit card. If there is an error in transmitting or receiving the amount of money involved it could seriously upset either you or your bank.
The simplest way of storing information for use in digital media is in binary form. All the information is encoded as a sequence of the two digits 0 and 1. A computer is a machine that contains banks of switches that can be either on or off. A switch that is on can be represented by the number 1; a switch that is off can be represented by the number 0. To store and transmit information in this format means working with only two integers, 0 and 1. These are called binary digits, from which we get the word bit. When we limit the available integers used in arithmetic, the process is called modular arithmetic. This is the key to much error detection and correction. We will look first at errors in bar codes.

BAR CODES
Modular Arithmetic

Prime numbers are numbers that are divisible only by themselves and one. Prime numbers are involved in coding information so that it can be used in digital media. It turns out that many of the uses of prime numbers depend on modular arithmetic. What is this? While the numbers go on forever, human beings don’t. A way to begin looking at the endless rise of the numbers is to look at them in cycles. We do not count the hours of the day as rising forever. In fact we have two different systems for keeping track of time, a twelve hour and a twenty four hour system, and they are both cyclic. The older of the two takes a fundamental cycle of twelve and begins again after twelve is reached. An addition table would look something like the following (Table 1):

Table 1. Addition on a clock

+ 1 2 3 4 5 6 7 8 9 10 11 12
1 2 3 4 5 6 7 8 9 10 11 12 1
2 3 4 5 6 7 8 9 10 11 12 1 2
3 4 5 6 7 8 9 10 11 12 1 2 3
4 5 6 7 8 9 10 11 12 1 2 3 4
5 6 7 8 9 10 11 12 1 2 3 4 5
6 7 8 9 10 11 12 1 2 3 4 5 6
7 8 9 10 11 12 1 2 3 4 5 6 7
8 9 10 11 12 1 2 3 4 5 6 7 8
9 10 11 12 1 2 3 4 5 6 7 8 9
10 11 12 1 2 3 4 5 6 7 8 9 10
11 12 1 2 3 4 5 6 7 8 9 10 11
12 1 2 3 4 5 6 7 8 9 10 11 12
Notice that the number 12 plays the part that is normally taken by zero. Whatever we add to 12 is unchanged:

12 + 2 = 2
12 + 7 = 7
12 + 12 = 12

What this means is that two hours after 12 noon is 2 pm, seven hours after 12 noon is 7 pm, and twelve hours after 12 noon is 12 midnight. Adding hours is arithmetic. To calculate the sum of two numbers in this system we add them together and then, if the total is greater than twelve, subtract twelve from the total. We learn to do this at a very early age and do not see any problems with it. The method is illustrated below:

7 + 2 = 9
7 + 6 = 13 and 13 – 12 = 1
5 + 11 = 16 and 16 – 12 = 4

Since the number 12 plays the part of zero we will use zero whenever the number 12 appears. This gives rise to Table 2, which is addition using only the numbers from zero to eleven. We call this addition modulo 12 and it is an example of modular arithmetic.

Table 2. Addition modulo 12

+ 1 2 3 4 5 6 7 8 9 10 11 0
1 2 3 4 5 6 7 8 9 10 11 0 1
2 3 4 5 6 7 8 9 10 11 0 1 2
3 4 5 6 7 8 9 10 11 0 1 2 3
4 5 6 7 8 9 10 11 0 1 2 3 4
5 6 7 8 9 10 11 0 1 2 3 4 5
6 7 8 9 10 11 0 1 2 3 4 5 6
7 8 9 10 11 0 1 2 3 4 5 6 7
8 9 10 11 0 1 2 3 4 5 6 7 8
9 10 11 0 1 2 3 4 5 6 7 8 9
10 11 0 1 2 3 4 5 6 7 8 9 10
11 0 1 2 3 4 5 6 7 8 9 10 11
0 1 2 3 4 5 6 7 8 9 10 11 0
Barcodes contain information that can be read electronically by scanners. This eliminates certain errors and minimises others. The barcode, as shown in Figure 2, consists of various groups of digits. The first two or three digits give the country to which the barcode has been issued. This is not the same as the country in which the product is manufactured. In Figure 2 the first three digits, 539, indicate that the barcode was issued to some group in Ireland. The remaining digits in this example, except for the final one, have no meaning. The final digit is called a check digit. It is in evaluating the check digit that modular arithmetic is used. When a barcode is scanned, errors can occur.
Figure 2. Barcode.
These errors can be caused by a multitude of events, from dirt blocking either a bar or a space to random electronic errors. While no automatic system can trap all errors, a check digit can catch quite a few.

Check Digit

In the EAN-13 (European Article Number) system (Figure 3) the first two (or sometimes three) digits give the country to which the barcode has been issued. The next five (or four if the country used three) digits belong to a particular company. The following five digits identify the actual product. There is one more digit to go. This is the check digit. It is calculated from all the digits already present.
Figure 3. Meaning of the numbers.
The barcode reader will do the same calculation, and if a different digit is arrived at then an error has occurred. Some errors can even be corrected. The process of calculating the check digit is quite simple to carry out. Indeed a description of the process is much more complicated than actually doing it. Consider the fictitious bar code above: 5391234512342. The final digit, 2, is the check digit. This is calculated using the following steps in Table 3:

Table 3. How to calculate the check digit

1. Write the number in a row.
2. Label the digits alternately Odd (O) or Even (E), starting at the right and moving left.
3. Make out a table with 3 below each O and 1 below each E. These are the weights.
4. Multiply the digits by their weights mod 10.
5. Add up the new row mod 10.
6. If the total is t then the check digit, c, is the solution to t + c = 0 mod 10.
All this may seem very complicated but an example should make it clear. The barcode we invented above, before a check digit was assigned to it, was 539123451234. We will go through the process of calculating the check digit.

Table 4. Calculation of the check digit

Barcode      5    3    9    1    2    3    4    5    1    2    3    4
Position     E    O    E    O    E    O    E    O    E    O    E    O
Weights      1    3    1    3    1    3    1    3    1    3    1    3
Calculation  5×1  3×3  9×1  1×3  2×1  3×3  4×1  5×3  1×1  2×3  3×1  4×3
Result       5    9    9    3    2    9    4    5    1    6    3    2
Adding these results together modulo 10 gives us

5 + 9 + 9 + 3 + 2 + 9 + 4 + 5 + 1 + 6 + 3 + 2 = 58 ≡ 8 (mod 10)

This is the value of t above. We calculate the check digit by subtracting this number from 10 to get

c = 10 – 8 = 2

When the barcode reader analyses this barcode it only takes it a fraction of a second to calculate the check digit and compare it to the one on the barcode. However the check digit can do more than this. Quite often a particular bar can be smudged and be unreadable to the scanner. If there is only one error and the check digit is legible then the barcode reader can calculate the correct digit by reversing the process, and shopping can proceed. For example, if the scanner reads the number 4#02030145620, where # represents an unreadable smudge, the calculation would proceed as follows (writing x for the unreadable digit):

Table 5. Correction of an error

Barcode      4    x    0    2    0    3    0    1    4    5    6    2
Position     E    O    E    O    E    O    E    O    E    O    E    O
Weights      1    3    1    3    1    3    1    3    1    3    1    3
Calculation  4×1  x×3  0×1  2×3  0×1  3×3  0×1  1×3  4×1  5×3  6×1  2×3
Result       4    3x   0    6    0    9    0    3    4    5    6    6
Adding these results together modulo 10 gives us

4 + 3x + 0 + 6 + 0 + 9 + 0 + 3 + 4 + 5 + 6 + 6 ≡ 3x + 3 (mod 10)

The check digit actually read, 0, must complete this sum to a multiple of 10, so we need 3x + 3 ≡ 0 (mod 10), that is, 3x ≡ 7 (mod 10). We are looking for a digit between 0 and 9 which, when multiplied by 3, leaves a remainder of 7. 7 is not divisible by 3, and neither is 17, but 3×9 = 27. This tells us that the unreadable digit was 9. The error has been both detected and corrected.

How good is this method at picking up errors? Studies have shown that slightly more than 79% of errors are the replacement of one digit by a different digit. Suppose a digit, d, whose weight is 1 has been replaced by the digit e. The weighted sum will now change by the amount d – e (mod 10). This will only go undetected if d – e = 0 (mod 10), which means that d = e, so there is no error. Similarly, if a digit whose weight is 3 was altered we would have 3(d – e) = 0 (mod 10), and since 3 has no factor in common with 10 this again forces d = e, so there is no error. The check digit approach traps all of the commonest type of error.
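The whole procedure is easy to automate. The following short Python sketch (the function names are invented here for illustration, not taken from any standard library) computes the check digit of a 12-digit EAN body and recovers a single smudged digit, exactly as in the worked examples above:

```python
def ean13_check_digit(digits12):
    """Check digit for a 12-digit EAN body: weights 1, 3, 1, 3, ... from the left."""
    t = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits12)) % 10
    return (10 - t) % 10

def recover_smudge(digits13):
    """Recover a single unreadable digit, marked as None, in a full 13-digit EAN."""
    pos = digits13.index(None)
    w = 3 if pos % 2 else 1                      # weight of the smudged position
    partial = sum(d * (3 if i % 2 else 1)
                  for i, d in enumerate(digits13) if d is not None) % 10
    # find the x with partial + w*x = 0 (mod 10); unique since gcd(w, 10) = 1
    return next(x for x in range(10) if (partial + w * x) % 10 == 0)

assert ean13_check_digit([5, 3, 9, 1, 2, 3, 4, 5, 1, 2, 3, 4]) == 2     # Table 4
assert recover_smudge([4, None, 0, 2, 0, 3, 0, 1, 4, 5, 6, 2, 0]) == 9  # Table 5
```

The uniqueness of the recovered digit rests on the same fact used in the error analysis above: multiplication by 1 or by 3 is invertible modulo 10.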
DEEP SPACE

Figure 4. Saturn (Courtesy NASA/JPL-Caltech).
We have all looked in wonder at the photographs sent back from deep space by NASA (Figure 4). How are they sent? How are they received? How do we know that the information has not been corrupted? Mathematics is, of course, the answer to all these questions. To begin, the information must be recorded. In other words a digital photograph has to be taken. This process stores all the required information as a sequence of the digits 0 and 1. This information, a sequence of bits, is sent to Earth as a bitstream.

Images as Bits

The grid drawn in Figure 5 has certain squares coloured black. It represents a fairly badly drawn smiley face. The only colours used here are black and white. This is very easy to translate into the digits that are used by all computers. If we represent a white square by the digit 1 and a black square by the digit 0 then the whole picture is represented by the number
1001001110010011111111111100011101101110101111011101101111100111
Figure 5. Smiley face.
The receiver of the above number can reconstruct the face only because it is known that the face is drawn on a grid with 8 squares horizontally and 8 squares vertically. Each of the eight squares in a row can be white or not, so a row of eight squares has 256 possible combinations of white or black (1 or 0). All black will be 0 and all white will be 255. A collection of eight bits is called a byte. Going from left to right, the decimal value of a 1 will be

128  64  32  16  8  4  2  1

while the decimal value of a zero is always zero. For example the number

10010011 = 128 + 0 + 0 + 16 + 0 + 0 + 2 + 1 = 147

The big sequence of zeros and ones written above can be thought of as the sequence of the following eight numbers:

10010011 = 147
10010011 = 147
11111111 = 255
11000111 = 199
01101110 = 110
10111101 = 189
11011011 = 219
11100111 = 231

Can you see that 0110110001101100000000000011100010010001010000100010010000011000 will give the inverse smiley face of Figure 6?
Figure 6. Inverse smiley.
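As a quick check, a computer can do the byte arithmetic for us. The following Python sketch (an illustration of this chapter's example, with names chosen here) converts each 8-bit row of the smiley face to its decimal value and produces the inverted image:

```python
rows = ["10010011", "10010011", "11111111", "11000111",
        "01101110", "10111101", "11011011", "11100111"]

# decimal value of each row, e.g. 10010011 -> 128 + 16 + 2 + 1 = 147
print([int(r, 2) for r in rows])   # [147, 147, 255, 199, 110, 189, 219, 231]

# invert the image by swapping every 0 and 1 (equivalently, 255 minus each value)
inverse = "".join("1" if bit == "0" else "0" for bit in "".join(rows))
print(inverse)
```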
This shows how simple images can be translated into a sequence of binary digits. Depending on the information that is going to be transmitted, the details of the coding will differ a lot. This does not matter here. You can see how a sequence of binary digits can even carry pictures. The next question is how to get the binary sequence from outer space to Earth.

Phase Modulation

You have probably seen the graph of y = sin x many times before (Figure 7). This graph, and modifications that can be carried out on it, is the secret to transmitting information from deep space back to our own planet.
Figure 7. Sin(x).
Electromagnetic radiation travels through space at about 300 000 km/s. Nothing can travel any faster. The distances to be covered are so vast that this speed is needed. Electromagnetic radiation, depending on its frequency, is perceived as light, radio, x-rays, γ-rays etc. It is as radio waves that NASA’s spacecraft transmit information back to their receivers. Radio waves have the advantage of requiring little energy to produce and passing easily through the Earth’s atmosphere. How can these radio waves carry information? Specifically, how can these waves carry a stream of the digits 0 and 1? To answer this question we need to look more closely at the graph of Sin(x). Figure 8 shows two sin waves with different phases.
Figure 8. Illustration of phase.
The red curve is the graph of Sin(x). The blue curve has exactly the same shape but is in a different place: it is the graph of Sin(x – θ). θ displaces the original wave by a certain amount, and is called the phase of the wave. There are three different methods by which waves can be used to carry information. The maximum height of the wave is called its amplitude. This can be modified to give Amplitude Modulation or AM, which is common with certain radio signals. The number of wave fronts that pass a given point in a certain time is called the frequency of the wave. This gives rise to Frequency Modulation or FM, which is also used in the transmission of radio waves. The third method, Phase Modulation, is the method used to get signals across the Solar System. Recall that all we need is to transmit a sequence of the digits 0 and 1. This is done in the following manner. Figure 9 shows three different graphs of waves. When the spacecraft is ready to transmit information back to Earth it begins the process by broadcasting a simple wave. This is received some time later at the Earth’s surface and allows the receiver to detect the frequency and phase of the transmitted wave. For our purposes we can regard this as a phase of zero. The point P is a typical point on the wave. Once communication is established between the transmitter on the spacecraft and the receiver on the ground, the transmitter begins to modify the wave. It does this by changing the phase of the wave.
Figure 9. Phase shift.
The point P′ is on a wave that has its phase shifted by 90°, which might represent the digit 0. The point P′′ is on a wave that has its phase shifted by –90°, which might represent the digit 1. So by modulating the phase of the wave a sequence of the two digits 0 and 1 can be transmitted back to Earth. This requires very little energy. This is just as well because, as NASA points out, by the time the signal reaches Earth it is so weak that it would take 40,000 years to collect enough energy from it to light a Christmas tree bulb for one millionth of a second.

Coding the Information

If the signal is very weak and the distances to be covered are very large then there is obviously a strong possibility of errors occurring. These can occur in creating the signal, transmitting the signal or receiving the signal. While the mathematicians at the Jet Propulsion Laboratory (JPL) have invented many ways of detecting and correcting errors, they all involve redundant information. In its simplest form this means that every binary digit (bit) is sent three times. Suppose that a very small part of the information being transmitted is 1011; then the transmitting computer will first transform this into 111 000 111 111. Figure 10 shows the sequence of waves that would then be transmitted to Earth. Each digit 1 in our system is sent as Sin(x + π/2), which gives a phase shift of –90°. Each digit 0 is sent as Sin(x – π/2), which gives a phase shift of 90°. Naturally this is just one continuous signal that is received at Earth. By looking at the phases of these waves the scientists at the receiving end are able to retrieve the message 111 000 111 111. Hence, knowing the redundancy that was applied, they can deduce that the original message was 1011.
Figure 10. 111 000 111 111.
Suppose that instead of the sequence of waves that are shown in Figure 10 a slightly different set arrived at the receiver. Look at the sequence of waves shown in Figure 11. An error has occurred and one wave has a different phase than those above.
Figure 11. 111 000 111 011.
At some point one of the waves had its phase changed. There are many ways this could have happened. The result is that the message received is not a string of triples of identical digits. Instead it is 111 000 111 011. The fact that not every group of three consists of a repeated digit alerts the receiver immediately that an error has occurred. The receiver even knows where the error is: 011. The fact that there are two occurrences of the digit 1 and only one occurrence of the digit 0 strongly implies that it is the 0 that is incorrect, that the message should have been 111 000 111 111, and hence that the original sequence was 1011.

This method, with its built-in redundancy, increases the length of the message by a factor of three. This is very wasteful, and the methods actually used by NASA are much more efficient. This example merely demonstrates the principle that errors can be both detected and corrected. Hamming codes are much more efficient and are the basis of many of the codes used by NASA and, indeed, in most types of digital coding. The Hamming code uses redundancy also but it is nothing like as wasteful as the tripling method defined above. Hamming codes work by a very clever use of Geometry.
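Before turning to geometry, the tripling scheme itself can be captured in a few lines. Here is a minimal Python sketch of it (this implements the simple repetition code described above, not the more efficient codes NASA actually uses):

```python
def encode(bits):
    """Send every bit three times: '1011' -> '111000111111'."""
    return "".join(b * 3 for b in bits)

def decode(received):
    """Majority vote in each block of three; corrects one flipped bit per block."""
    blocks = [received[i:i + 3] for i in range(0, len(received), 3)]
    return "".join("1" if b.count("1") >= 2 else "0" for b in blocks)

assert encode("1011") == "111000111111"
assert decode("111000111011") == "1011"   # the corrupted block 011 is outvoted
```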
Geometries

The word “Geometry” conjures up for us ideas of points and lines and probably certain properties of triangles. Euclid, about 300 BC, assembled all that was known of Geometry into thirteen books called the Elements. For many centuries the theorems in these books were regarded as much a part of the physical world as the mathematical one. During the Renaissance artists discovered the secrets of perspective drawing. This is the method of drawing far things smaller than near ones. Here parallel lines will meet. Think of the appearance of railway tracks as they disappear into the distance. Mathematicians, particularly the French mathematician Gérard Desargues, became very interested in these ideas and realized, after some resistance, that the artists were using a geometry other than that of Euclid. If one other geometry could exist then why not more? The very meaning of the words point, line and distance can change. To see how this can be used in error correction we need to look at some definitions, particularly distance. In the geometry of Euclid the distance between two points is the length of the line segment connecting them. In coordinate geometry we use a formula involving squares and square roots to calculate this:

√((x₁ − x₂)² + (y₁ − y₂)²)
In ordinary life you may use quite different definitions. If a motorist stopped you to inquire how far he was from his destination you might reply: “It is 5 km as the crow flies but the road winds a lot and you will have to drive 7 km”. If you saw a signpost indicating that the distance to a city was 156 km, you would know that there were 156 km of road to cover; the actual length of the shortest line segment may be only 120 km. As you see, “distance” means what we want it to mean. In Hamming codes only the digits 0 and 1 are used. A word is defined as a particular sequence of these digits. We regard each word as a point in a space. The “distance” between two words is defined as the number of digits that differ between the two words. Thus the distance between 1100101 and 1010110 is 4, since there are exactly four positions where the strings of digits differ. The geometric properties of the space where the words live are then used in error detection and correction. The mathematics required is quite advanced and uses tools from Linear Algebra, especially vectors and matrices. Vectors can be thought of as line segments pointing in specific directions. Matrices can be thought of as objects that do things to vectors.
Figure 12. Vectors.
The blue vector (the one closest to the horizontal axis) in Figure 12 has been transformed by a matrix into the green one. This is a rotation matrix. In a certain sense, there are only specific directions these vectors can point, and this makes error detection possible. The distance function then makes error correction possible also. We will illustrate this with a simple example using just two valid code words: 000 and 111. It is not too difficult to imagine a situation where just two words are needed: if 000 signifies Off and 111 signifies On, then we have our binary system back again. There are three digits being used here, which gives us eight possible words:

000 001 010 011 100 101 110 111

Only the first and last of these are valid. The other six are errors. We can regard each of these words as points in 3D space where they form a cube. In two dimensions you are familiar with representing points with two coordinates. We refer to these pairs as (x, y), where the number x tells how far to move in the horizontal direction and the number y gives the distance in the vertical direction. Similarly, once we move into three dimensions another direction becomes possible and so we need three numbers to represent each point. The triplet is usually referred to as (x, y, z). As we increase the number of dimensions we increase the number of coordinates. We lose the ability to draw pictures, but the principles are the same and we can work out distances and equations of curves as easily in seven dimensions as in two. To see how error correction with Hamming codes works we will stay in three dimensions to get a feel for the process. In Figure 13 we show all the possible code words as the coordinates of a cube. Each side of the cube is one unit long as measured using the Hamming definition of distance. Here is how we use this geometry to correct words with one error. If the transmitted word is 000 then the possible errors lead to 001, 010 and 100. These are the only words that can be got from 000 with just one error. You can think of them as being on a sphere of radius 1 in this space, with centre 000, as in Figure 14.
Figure 13. Cube of words.
Figure 14. Errors on sphere.
All these words are also on a sphere of radius 2 centred at 111. The sphere of radius 1 centred at 111 is shown in Figure 15 and contains the triplets 110, 101 and 011. The two spheres divide the error words into two sets with no elements in common, as shown in Figure 16. To correct a word that has only one error we simply find the nearest code word to the error word. Other methods allow the detection and correction of more errors but are beyond the scope of this introduction.
Figure 15. More errors.
Figure 16. Spheres of the Hamming code.
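Nearest-codeword decoding for this toy code takes only a few lines. The sketch below (written for the two-codeword example above, not for a full Hamming code) uses the Hamming distance defined earlier:

```python
def hamming_distance(u, v):
    """Number of positions in which two equal-length words differ."""
    return sum(a != b for a, b in zip(u, v))

def correct(word, codewords=("000", "111")):
    """Decode a received word to the nearest valid codeword."""
    return min(codewords, key=lambda c: hamming_distance(word, c))

assert hamming_distance("1100101", "1010110") == 4
assert correct("010") == "000"   # on the sphere of radius 1 around 000
assert correct("011") == "111"   # on the sphere of radius 1 around 111
```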
COMPACT DISCS

Digital Music

Compact discs (CDs) are used to store many different types of information today. They were originally used to store sound, particularly music. How can music be encoded on the surface of a disc and how can it be retrieved? You will not be surprised to discover that, once again, trigonometry and binary digits form the key. All sound is transmitted as waves of some form. The sound itself needs a medium to carry it. The medium, usually air, will vibrate. This vibration is carried from one place to another as a wave. If certain regular patterns are present then humans call it music. From this point of view there is no difference between Grand Opera and Heavy Metal. Physically they will both be transmitted as waves with certain patterns.
Figure 17. A sound wave.
Figure 17 shows a pure sound wave. This is composed of a combination of various sin waves of different frequencies and amplitudes. At any instant between the start and end of the sound, the wave is at some particular point. This is what we mean by saying that the sound is continuous; technically it is an analog signal. This type of signal cannot be represented just by using the two digits 0 and 1. The first thing to be done is to sample the signal at various places. This returns the values of the sound at specific points, but not everywhere.
Figure 18. Sampling a wave.
Figure 18 shows a series of blue lines imposed on the sound wave illustrated in Figure 17. These are the values of the sound wave at specific points. To attain a sample that is close to the original sound there must be a large number of sample points. For a CD the sound is sampled 44,100 times each second. This gives rise to a sample such as that shown in Figure 19.
Figure 19. Sampled sound.
Each blue line represents a particular amplitude. The sampling rate is 44,100 Hz (Hz means per second), and each sample is stored as a 16-bit number, which allows 65536 different amplitudes to be used to reconstruct the sound. What use are all these amplitudes if digital media can only store the digits 0 and 1? These digits can be used to build up numbers of any size. 0 represents a switch in the Off position and 1 represents a switch in the On position, so each digit distinguishes between two states. Hence a sequence of 16 digits can be combined to give the large number of 2¹⁶ = 65536 values. Since one of these positions has all the switches turned off, sets of sixteen bits (two bytes) can be used for all the numbers between 0 and 65535. The process is the same as we saw in constructing the smiley face from sets of eight bits.
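The sampling and quantisation step can be sketched in a few lines of Python (the signal here is a made-up pure tone; real CD audio is of course far more complex):

```python
import math

SAMPLE_RATE = 44100        # samples per second, as on a CD
LEVELS = 2 ** 16           # 65536 possible amplitudes (sixteen bits, two bytes)

def sample_and_quantise(duration_s, freq_hz=440.0):
    """Sample a pure sine tone and map each value to one of 65536 levels."""
    n = int(SAMPLE_RATE * duration_s)
    samples = []
    for i in range(n):
        value = math.sin(2 * math.pi * freq_hz * i / SAMPLE_RATE)   # in [-1, 1]
        samples.append(round((value + 1) / 2 * (LEVELS - 1)))       # in [0, 65535]
    return samples

one_ms = sample_and_quantise(0.001)            # one millisecond of a 440 Hz tone
print(len(one_ms), min(one_ms), max(one_ms))   # 44 samples, each 0..65535
```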
Physical Structure

Figure 20. Structure of a CD.

The CD itself is a plastic disc (Figure 20). During its manufacture the plastic is shaped with very small bumps arranged in a continuous spiral. This is next covered with a thin layer of highly reflective aluminium. An acrylic layer protects the aluminium, and the label is printed onto the acrylic. A laser beam is reflected from the aluminium layer, where the change in reflectivity caused by the bumps is interpreted as a sequence of the digits 0 and 1. Thus we have come, via trigonometry, to a bitstream yet again. A sophisticated piece of electronics then converts the digital signal back into analog form and this is used to convert the data back to sound. The very high sample rate means that the loss in quality is undetectable to most human ears, although some musicians claim that they can distinguish between the sound on a CD and the analog sound of vinyl records. In Figure 21, because the sample rate is low, either the green or red curve, or indeed many others, could be reconstructed.
Figure 21. Low sample rate.
In Figure 22, however, it would be very difficult to reconstruct the wrong wave as the high sample rate leaves hardly any freedom.
Figure 22. High sample rate.
This is the reason why such a high sampling rate is needed in the transformation of an analog sound wave to a set of digital data.

Error Detection and Correction

Errors, of course, can occur either in the manufacturing process or from physical damage to the CD. In the section on barcodes we saw how check digits can detect and correct simple errors. A very sophisticated extension of this method is used in detecting and correcting errors on a CD. The methods used involve complex numbers. These are numbers of the form a + bi, where i is defined by the equation

i² = −1

Using complex numbers it is possible to get various roots of the number 1.
Figure 23. Complex roots of 1.
The red points in Figure 23 are the eight eighth roots of 1. These are

1/√2 + (1/√2)i,  i,  −1/√2 + (1/√2)i,  −1,  −1/√2 − (1/√2)i,  −i,  1/√2 − (1/√2)i,  1
This means that any of these numbers raised to the power of eight will give the number 1. The data points can be regarded as the coefficients of a certain polynomial, p. The complex roots of 1 are used to create a second polynomial, g. The product of these two polynomials is analysed by the CD player, which divides this result by g. If this division process leaves a remainder then there is an error in the received data. By evaluating this remainder at the complex roots of 1 the error can be corrected. All this happens far too quickly to be detected by the ear, so the human listener hears the continuous sound of a melody thanks to some very complicated electronics and mathematics. This method is called a Reed-Solomon code and was first described in 1960. The particular roots of 1 to be used depend on the length of each word and the number of errors to be checked. It is almost incredible to think that nearly 50% of errors can be corrected by this method. This explains why a CD, unlike a vinyl record, does not exhibit gradual signs of decay. As long as the scratches on the CD remain below a critical level, the error correcting methods will be able to reconstruct the sound perfectly. If the faults in the CD rise above a certain level then correction is impossible and it seems to us that the CD has suddenly been corrupted.
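A full Reed-Solomon coder is well beyond a few lines, but the roots of unity at its heart are easy to verify numerically. A small Python sketch:

```python
import cmath

# the eight eighth roots of 1: e^(2*pi*i*k/8) for k = 0, ..., 7
roots = [cmath.exp(2j * cmath.pi * k / 8) for k in range(8)]

for z in roots:
    assert abs(z ** 8 - 1) < 1e-9   # each root, raised to the eighth power, is 1
    print(f"{z.real:+.4f} {z.imag:+.4f}i")
```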
TO DO IN CLASS

Art Work

In this activity the teacher will get different groups in the class to draw pictures on graph paper and transmit the information in code. Time constraints probably mean
that the grid on the paper should be fairly large. A (simple) piece of art should be drawn with each square clearly either blank or covered, as in the smiley face drawn earlier. Divide the students into different groups. Each group will have two sheets of graph paper. After a discussion each group will draw a simple figure on one sheet of graph paper. Using binary arithmetic this figure should then be written as a binary number and then translated into decimal. The decimal number is written on the blank sheet of graph paper. The blank sheets are now passed to different groups and the figures reconstructed. Remember, if a figure is drawn incorrectly the fault may lie with the translation into a decimal number or the translation from the decimal number to the corresponding binary number. How do these correspond to the errors discussed above?
Tim Brophy
National Centre for Excellence in Mathematics and Science Teaching and Learning (NCE-MSTL)
University of Limerick
JEAN CHARPIN
4. TRAVELLING TO MARS: A VERY LONG JOURNEY Mathematical Modelling in Space Travelling
INTRODUCTION
Just over forty years ago, Neil Armstrong, Edwin ‘Buzz’ Aldrin and Michael Collins were the first people to travel to the Moon. This was, in the words of Neil Armstrong, ‘one small step for man, one giant leap for mankind’. The next big milestone in space travel is to reach our next closest neighbour in the solar system: Mars. This planet is much further away than the Moon and there are a lot of challenges related to this trip. Mathematics will be key to solving them. This chapter introduces a few activities related to this trip, focussing on two aspects of the school curriculum:
1. Geometry: circles and ellipses. The activities proposed in the first section of this chapter involve some simple geometry: drawing circles and ellipses, studying the distance between two points belonging to the circles, and using the properties of aligned points and diameters to determine the minimum and maximum distance between the Earth and Mars.
2. Large numbers. Space travel involves large numbers, and understanding what they represent is rather difficult for everyone. The simplest way to make sense of these values is to make a careful choice of units: large numbers are then transformed into much smaller ones which are much easier to interpret. The activity presented in the second section will show a simple way to achieve this.
These activities offer mathematics teachers interesting and rewarding ways to engage secondary school students, and are accessible to students and teachers alike.

CIRCLES, ELLIPSES AND DISTANCE BETWEEN THE EARTH AND MARS
In this first part, the orbits of Earth and Mars will be studied. In the Solar system, the planets move around the Sun describing a curve known as an ellipse. The properties of orbits and ellipses will be briefly reviewed at first, and two possible activities will then be presented.

Some Background

Some properties of ellipses and circles. Figure 1 shows some of the properties of ellipses. They are flattened versions of a circle and have a lot of properties in common with this well known curve. A point belongs to a circle if its distance to the centre is equal to the radius. An ellipse also has a centre, denoted O on the figure, but this is only a symmetry point. A point P belongs to the ellipse if the sum of its distances to the two points F₁ and F₂, known as the foci, equals a constant:

P ∈ Ellipse ⇔ PF₁ + PF₂ = d₁ + d₂ = C
Figure 1. Geometry of an ellipse.
If the value of the constant distance d₁+d₂ is just above the distance between the two foci, F₁F₂, the ellipse will be very flat. Conversely, if d₁+d₂ is much larger than 2f, the ellipse will be very close to a circle. An ellipse may be characterised by its eccentricity, e = f/a, defined as the ratio between the distance between the two foci, 2f, and the length of the major axis, 2a, in Figure 1. By definition, the eccentricity satisfies 0 < e < 1. When the eccentricity is close to e = 0, the ellipse is nearly a circle, and it becomes flatter and flatter as the value of the eccentricity e approaches 1. Ellipses are the key geometrical figures for describing the trajectories of planets, called orbits. This aspect will now be detailed briefly.

Earth, Mars and the solar system. Earth and Mars are two planets of the Solar system. They rotate around the Sun on different paths called orbits and at different speeds. Some data about their orbits are provided in Table 1.

Table 1. Some data about the Earth and Mars (days in the table refer to Earth days)

       Orbital radius    Orbital year    Eccentricity
Earth  150 million km    365.25 days     0.017
Mars   228 million km    687 days        0.093
Planets rotating around the Sun describe an ellipse; this can be shown using the standard laws of physics. The Sun is one of the foci. The eccentricity of a planet’s orbit is a given parameter: its value cannot be changed. The eccentricity of the Earth’s orbit is extremely low, so its orbit can be approximated as a circle with the Sun at its centre. The eccentricity of Mars is not as low; it is actually the second largest in the Solar system. The orbit of Mars will still look close to a circle, but not with the Sun at its centre. This will be further detailed in the first activity. The planets rotate continuously around the Sun. The time it takes a planet to travel all around its orbit is called the orbital year. This property varies from one planet to the other: the orbital year for Mars is about twice the value of the orbital year for the Earth. This parameter will be used in the last activity of the chapter.

Activity 1: Drawing Ellipses

The trajectory of Mars may be approximated as a circle with its centre away from the Sun. This can be explained by the following activity. Participants need an approximately 15 cm piece of string, a sheet of paper and a pen. To draw an ellipse, pin the two ends of the string to two separate points on the sheet. These will be the foci. Pull the string taut with the pen and move the pen around to reach all possible positions. The resulting points form an ellipse. The activity could unfold as follows:
– Draw a 15 cm straight line segment in the middle of the paper sheet.
– Determine the middle of the segment.
– Draw a few foci on the horizontal segment and draw the corresponding ellipses as shown on Figure 2:
  – F6 and F’6 located 6 cm left and right of the centre point O,
  – F4 and F’4 located 4 cm left and right of the centre point O,
  – F2 and F’2 located 2 cm left and right of the centre point O.
– From what value of the focal distance do ellipses look like a circle?
– The ellipse describing the orbit of Mars will be close to an ellipse with foci 1 cm left and right from the centre of the ellipse. Is it possible to approximate the orbit of Mars with a circle?
Learners doing this activity should work in pairs. The first person could pin the extremities of the string on the foci with his/her fingers while the second person moves the pen. For maximal effect, the order indicated above should be respected. The first ellipse is going to be quite flat; the last two are quite close to circles. Their centres are the point O. Figure 2 shows a typical outcome for this activity. The points F2 and F’2 are the two foci 2 centimetres left and right of the central point. If the extremities of the 15 centimetre piece of string are pinned down on these two points, this leads approximately to the ellipse E2. Using the same method, points F4 and F’4 lead to ellipse E4 while foci F6 and F’6 lead to ellipse E6. As can be seen on Figure 2, the ellipses get flatter and flatter as the distance between the centre O and the foci increases. If the foci are close to the centre of the ellipse, the curve looks very similar to a circle.
Figure 2. Typical result for activity 1.
If the trajectory of Mars were at this scale, the foci would be located left and right of the centre at the distance

f = a × e = (15/2) × 0.093 ≈ 0.7 cm
This means that the trajectory of Mars will be even closer to a circle than ellipse E2. The orbit of Mars can therefore be approximated with a circle, but with its centre away from the Sun. This orbit and the orbit of the Earth will now be studied to evaluate the minimum and maximum distances between these two planets.
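The same arithmetic for the real orbit confirms how nearly circular it is. A short Python sketch (taking the orbital radius from Table 1 as the semi-major axis a, which is an approximation):

```python
import math

a = 228.0   # semi-major axis of Mars's orbit, in millions of km (Table 1)
e = 0.093   # eccentricity of Mars's orbit (Table 1)

f = a * e                       # distance from the centre to each focus
b = a * math.sqrt(1 - e ** 2)   # semi-minor axis of the ellipse

print(f"focus offset f = {f:.1f} million km")   # about 21.2: the '21 million km'
print(f"semi-minor   b = {b:.1f} million km")   # about 227.0: almost a circle
```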
Activity 2: Distance between Mars and the Earth

Figure 3 shows an approximation of the trajectories of Earth and Mars. The orbit of the Earth is a circle with the Sun at its centre and a radius of approximately 150 million kilometres. The orbit of Mars is also a circle: its centre is located approximately 21 million kilometres away from the Sun and its radius is around 228 million kilometres. In this activity, participants will draw the orbits for Mars and the Earth and study the distance between these two planets. For this activity, a sheet of paper, a ruler and a compass are required.

Figure 3. Orbits for Mars and the Earth.
The activity could unfold as follows:
– Choose a scale adapted to the sheet size: the scaled orbit of Mars should fit on the page. For an A4 sheet, an appropriate scale is 1 cm for 30 million kilometres. To start with, participants can fill in the following table.

                                   Distance (millions of km)   Size on the graph (in cm)
Scale                              30                           1
Orbital radius of the Earth        150
Orbital radius of Mars             228
Distance between orbital centres   21
The activity could continue as follows:
– Draw a point in the centre of the sheet. This will represent the Sun.
– Using the last line of the table above, determine a possible position for the orbital centre of Mars. This could be in any direction on the sheet.
– Using the values calculated in the table above, draw the circles representing the orbits of Earth and Mars. The orbit centre for the Earth is the Sun. The orbit centre for Mars is the point determined in the previous step.
– With the help of the ruler, participants can work out the minimum and maximum possible distances between the Earth and Mars.
– The participants can also retrieve these results using the geometry of aligned points.
In theory, the Earth and Mars could be anywhere on their orbits. They could be in positions M1 and E1 as shown on Figure 4. The points corresponding to the minimum and maximum distances can be determined when both planets are aligned with their orbit centres: these positions are located at the intersection between the circles describing the orbits and the line joining the two orbit centre points C and S (see Figure 4).
Figure 4. Distance between Mars and the Earth.
The distance is minimum when the Earth and Mars are in positions E2 and M2 respectively. The distance will be maximum when the Earth and Mars are in positions E2 and M3. Since the points E2, M2, M3, C and S are aligned, these maximum and minimum distances can be calculated as functions of the orbital radii of the Earth and Mars, REarth and RMars, and the distance d between the orbit centres. As shown on Figure 4, the minimum distance E2M2 may be expressed as

M2E2 = M2C − CS − SE2

All these distances are known:
– M2C is the orbital radius of Mars, 228 million kilometres,
– CS is the distance between the orbital centres of the Earth and Mars, 21 million kilometres,
– SE2 is the orbital radius of the Earth, 150 million kilometres.
The minimum distance between the two planets is therefore

M2E2 = 228 − 21 − 150 = 57 million kilometres
Using a similar method, the maximum distance between the Earth and Mars is

M3E2 = M3C + CS + SE2

Here again, all these distances are known:
– M3C is the orbital radius of Mars, 228 million kilometres,
– CS is the distance between the orbital centres of the Earth and Mars, 21 million kilometres,
– SE2 is the orbital radius of the Earth, 150 million kilometres,
and the maximum distance is

M3E2 = 228 + 21 + 150 = 399 million kilometres

The distance between the two planets will constantly oscillate between these two values. Nice computer simulations showing the evolution of the distance between the Earth and Mars may be found on the internet (see for example www.windows2universe.org/mars/mars_orbit.html). The distances involved in the journey are so large that the shuttle launch time must be planned carefully. The planets will keep moving around the Sun during the shuttle journey. It is therefore important to evaluate the travel time to make sure that the shuttle will reach Mars. This second aspect will now be presented.
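The aligned-points calculation is also a natural first programming exercise. A minimal Python sketch of it:

```python
R_EARTH = 150    # orbital radius of the Earth, in millions of km
R_MARS = 228     # orbital radius of Mars, in millions of km
D_CENTRES = 21   # distance between the two orbit centres, in millions of km

# both planets aligned with the orbit centres C and S (points E2, M2 and M3)
minimum = R_MARS - D_CENTRES - R_EARTH   # 57 million km
maximum = R_MARS + D_CENTRES + R_EARTH   # 399 million km

print(f"minimum distance: {minimum} million km")
print(f"maximum distance: {maximum} million km")
```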
DEALING WITH LARGE NUMBERS: TRAVELLING BETWEEN THE EARTH AND MARS

The distance between the Earth and Mars varies constantly. In the previous two activities, the minimum and maximum distances between the two planets have been approximated:
– The minimum distance is about 57 million kilometres,
– An approximation of the maximum distance is 399 million kilometres.
Travelling to Mars from the Earth will require a very long travel time. In this section, the time required for this journey will be evaluated for three means of travel: a standard car, a jet plane and a space shuttle. Typical velocities may be seen in Table 2. Simple properties of velocities will be reviewed first and then the corresponding activity will be described.

Table 2. Typical velocities (note the different unit for the space shuttle)

               Standard average velocity   Unit
Car            90                          km/h
Jet plane      900                         km/h
Space shuttle  7.5                         km/s
Velocity, Time and Distances

The velocity of a vehicle is defined as the ratio between the distance travelled and the time it takes to travel it:

Velocity = Distance / Time

If a vehicle travels at constant velocity, the distance travelled may be expressed as

Distance = Velocity × Time

If the vehicle covers a distance at constant velocity, the time it takes to travel this distance is

Time = Distance / Velocity    (1)
All these quantities must have a unit. Typically, the velocity is expressed in kilometres per second or kilometres per hour, denoted km/s and km/h respectively, the time is in hours (h) or in seconds (s), and the distance in kilometres (km).

Activity 3: Velocities and Distances

The travelling velocity of space shuttles is difficult to comprehend. A good way to overcome this difficulty is to compare the travel time of the space shuttle and other means of transport over familiar distances. People are used to travelling in cars and planes and understand how long it should take to go from one place to another. These common sense values can be compared with the travel time of the space shuttle. The first part of this activity is based around this concept.

Table 3. Travel times

Town                          Belfast   Cork   Dublin   Galway
Distance from Limerick (km)   323       105    198      105
Travel time (car)
Travel time (jet plane)
Travel time (space shuttle)
In Table 3, the first line gives the distance between Limerick and other important towns in Ireland. This table should of course be modified to reflect the local geography. The travel times to these towns can be estimated for the three means of transport and the corresponding average velocities given in Table 2.
For example, the distance between Dublin and Limerick is 198 kilometres. The travel times can be calculated as follows:
– Travel by car. The average velocity of a car is 90 km/h. The car travel time between Limerick and Dublin can be calculated using formula (1):

Time = Distance / Velocity = 198 / 90 = 2.2 hours = 2 hours and 12 minutes

The time calculated is in hours because the velocity of the car is expressed in kilometres per hour.
– Travel by plane. The average velocity of a jet plane is 900 km/h. The jet plane travel time between Limerick and Dublin can again be calculated using formula (1):

Time = Distance / Velocity = 198 / 900 = 0.22 hours = 13 minutes

– Travel by space shuttle. The average velocity of a space shuttle is 7.5 km/s. As in the previous cases, the travel time can be calculated using formula (1):

Time = Distance / Velocity = 198 / 7.5 = 26.4 seconds
In this case, the travel time is in seconds because the velocity of the space shuttle is expressed in kilometres per second. The results in Table 3 underline the speed of space travel: a journey that would take over two hours by car or about 13 minutes by jet plane can be done in a few seconds with a space shuttle. This activity also highlights the importance of units in making sense of large numbers. The velocity of the space shuttle could be expressed as 27000 kilometres per hour instead of 7.5 kilometres per second. This value would be harder to manipulate, and the results in the last row of Table 3 would be much more difficult to calculate. The travel time between the Earth and Mars may be calculated using formula (1) as well. In this case, the distance to be covered is 57 million kilometres and the velocity considered is the velocity of the space shuttle, 7.5 kilometres per second:

Time = Distance / Velocity = 57,000,000 / 7.5 ≈ 7,600,000 seconds
Once again, the meaning of this huge result is difficult to grasp. To make sense of this figure, it should be expressed not in seconds but in days. There are 3600 seconds in an hour, so

7,600,000 seconds = 7,600,000 / 3600 ≈ 2111 hours

Similarly, there are 24 hours in a day, so the figure can be further transformed to

7,600,000 seconds ≈ 2111 hours = 2111 / 24 ≈ 88 days
If the space shuttle was travelling at the velocity of a jet plane, the travel time could be expressed as

Time = Distance / Velocity = 57,000,000 / 900 ≈ 63,333 hours = 63,333 / 24 ≈ 2,639 days ≈ 7.22 years
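Formula (1) and the unit conversions are easily bundled into a small program. The following Python sketch (the function names are invented here) reports a travel time in whichever unit is easiest to read:

```python
def travel_time_s(distance_km, velocity, unit="km/h"):
    """Travel time in seconds at constant velocity, using formula (1)."""
    speed_km_s = velocity / 3600 if unit == "km/h" else velocity
    return distance_km / speed_km_s

def readable(seconds):
    """Express a time in seconds, hours or days, whichever reads best."""
    if seconds < 3600:
        return f"{seconds:.1f} seconds"
    hours = seconds / 3600
    return f"{hours:.1f} hours" if hours < 48 else f"{hours / 24:.0f} days"

print(readable(travel_time_s(198, 90)))                       # Limerick-Dublin by car: 2.2 hours
print(readable(travel_time_s(57_000_000, 7.5, unit="km/s")))  # Earth-Mars by shuttle: 88 days
```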
Using appropriate units makes the difference between the two means of transport very obvious: a distance that could be covered in a couple of months with the space shuttle would take years with a standard jet plane. This outcome was far from obvious when the figures were expressed in hours or seconds. If the space shuttle was travelling along the shortest (theoretical) distance, it would take nearly three months to reach Mars. Although considerably shorter, the travel time by space shuttle is still extremely long. While the shuttle is on its way the planets will be moving significantly. The last part of this activity will estimate how much the planets have moved during the space shuttle journey.

If the shuttle was launched from the Earth when it is closest to Mars (this corresponds to points E2 and M2 in Figure 4) and the shuttle was travelling on a minimum distance straight line, it would never reach Mars: the planet will have moved significantly from the position it occupied at the time of launch. In the last part of this activity, the new positions of the planets will be calculated. The position of Earth and Mars 88 days after the beginning of the journey can be calculated using the values for the orbital year provided in Table 1. In one orbital year a planet covers its entire orbit, corresponding to 360°. The new position of the planet may be calculated using proportions:
– The orbital year of the Earth is 365.25 days: the planet travels 360° on its orbital circle during this period. In 88 days, it will cover the angle

θE = 360 × 88 / 365.25 ≈ 87°

– The orbital year of Mars is 687 days: the planet travels 360° on its orbital circle during this period. In 88 days, it will cover the angle

θM = 360 × 88 / 687 ≈ 46°
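The proportion argument generalises immediately to any planet and any journey time. A minimal Python sketch with the Table 1 values:

```python
ORBITAL_YEARS = {"Earth": 365.25, "Mars": 687}   # in Earth days (Table 1)

def angle_swept(planet, days):
    """Angle in degrees a planet sweeps along its orbit in the given time."""
    return 360 * days / ORBITAL_YEARS[planet]

for planet in ORBITAL_YEARS:
    print(planet, round(angle_swept(planet, 88)), "degrees")   # Earth 87, Mars 46
```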
Figure 5 shows the positions of the planets 88 days after the shuttle launch. The Earth has moved almost 90° along its orbit while Mars is approximately 45° away from its original position. If the shuttle travels on the shortest-distance straight line from the Earth, it will miss Mars by tens of millions of kilometres. This shows that the travel time is too long for the movement of the planets to be neglected. When planning the exact trajectory of the shuttle, this aspect must be taken into account, and the journey will be several million kilometres and several months longer. The journey back will last several months as well. Space travel is definitely no easy matter.
Figure 5. Position of Mars and the Earth at time of launch and after 88 days.

CONCLUSION
This chapter has underlined the importance of mathematics in even the simplest problems posed by manned interplanetary travel. Travelling through space is one of the greatest challenges of the twenty-first century, and it will be up to the younger generations to find out how these challenges will be overcome. When he first set foot on the Moon, Neil Armstrong said the famous words ‘one small step for man, one giant leap for mankind’. What the first person setting foot on Mars will say is unknown as yet, but these words will most certainly be remembered for generations. However, this will only happen once all the technical difficulties are resolved. Mathematics will play a key role in this challenge, and the examples developed in this chapter are only the tip of the iceberg. Space travel will set mathematicians challenges for years and years to come.
ACKNOWLEDGEMENTS
Jean Charpin acknowledges the support of the Mathematics Applications Consortium for Science and Industry (www.macsi.ul.ie), funded by the Science Foundation Ireland mathematics initiative grant 06/MI/005. The help and comments of Dr Joanna Mason and Dr William Lee, both from MACSI, University of Limerick, are kindly acknowledged. Parts of this chapter were used in the ‘Travelling to Mars’ activity sheet developed for Engineers Week 2010 in partnership with the Enterprise Ireland Steps Programme and the National Centre for Excellence in Mathematics and Science Teaching and Learning, University of Limerick, Ireland.

Jean Charpin
Mathematics Applications Consortium for Science and Industry (MACSI)
University of Limerick
SIMONE GÖTTLICH AND THORSTEN SICKENBERGER
5. MODELLING THE STORAGE CAPACITY OF 2D PIXEL MOSAICS
MATHEMATICAL MODELLING WITH STUDENTS
Mathematical modelling is one of the core competencies of today’s knowledge-based society. Description and abstraction of real problems using mathematical language enables the simulation and optimisation of extensive systems with mathematical tools and IT capabilities. Moreover, mathematical modelling with students provides new directions in motivation and knowledge transfer as well as problem solving. It is therefore recommended that it be integrated into interdisciplinary MINT1 education.

For 16 years, the Department of Mathematics at the University of Kaiserslautern has been organising mathematical modelling weeks for selected students of the 10th–12th form. Teachers, too, can participate in this event to broaden their knowledge through further education. In total, 40 students and 16 teachers were engaged for the duration of one week in 8 different and realistic problems from industry, the economy, society, sports, IT and physics. Research assistants from universities and research centres supported each group by offering advice and help whenever it was needed.

During the 2008 modelling week, a team consisting of 5 students and 2 teachers from different schools worked on the optimisation of new two-dimensional (2D) bar codes, the so-called pixel mosaics. The problem was specially developed for students interested in mathematical modelling as well as in IT, with experience in any programming language. Due to the complexity of the problem, it is recommended to implement such a task in an interdisciplinary mathematics and IT class at AP level rather than in a regular class.

CAN PAPER TALK?
Take a coke bottle, a biscuit tin or a shoe box: nowadays, black and white bar codes are printed on nearly all consumer goods and scanned at the supermarket checkout, replacing the manual typing of prices. Bar codes accompany us everywhere, every day. For example, suitcases at airports can be sorted with these codes; libraries, too, have introduced the bar code on their library cards for the borrowing of books. Therefore, for many students, it is very exciting to understand the functionality of these codes and how much mathematics they may contain [7].
The bar code encodes a 13-digit ‘European Article Number’ (EAN), which identifies each item and consists of a 12-digit item number plus a control number. The control number indicates whether the item number was correctly read during the scanning process or if transmission problems have occurred. For the control number, EAN uses a digit which is calculated from a specially weighted checksum of the item number. To this end, the digits of the item number are multiplied alternately by the factors 1 and 3 and then summed up. The control number completes this sum to the next multiple of 10. Hence, single errors, such as the wrong input of digit ‘8’ as ‘3’, as well as most pairwise mix-ups of adjacent digits, such as ‘41’ instead of ‘14’, can easily be detected. The example in Figure 1 shows a bar code for yellow nectarines with the EAN 2404105001722. The weighted checksum of the first 12 digits of the EAN is 48. The control number 2 completes this sum to the value 50 (see [3] for a detailed description).

When we refer to codes or coding in this context, it does not mean encrypting a message or translating it into a cipher; rather, a code is developed from a message for its electronic transmission. Errors that may occur are identified and ideally corrected during the transmission process. The bar codes in question only recognise transmission errors, but are unable to localise or even correct them. This motivates the development of error-correcting codes which, like bar codes, are optically readable but store more data and, consequently, information about correction modes. Black and white bar codes would require too much space for this.

The new bar code generation consists of quadratic pixel mosaics. They are printed on Deutsche Bahn’s online tickets (see Figure 2) or on online stamps named ‘StampIt’ issued by the Deutsche Post. Such a mosaic of black and white pixels can be read through an optic camera or even with a mobile phone camera. The photographed mosaics are then read out by specific software and finally decoded. This method enables the transmission of a higher data volume compared to the bar codes used so far. Besides, completely new applications are emerging from it: a weblink directing to a company or advertising homepage can be translated into a space-saving pixel mosaic and placed as a magazine advertisement. The mosaic is photographed with a mobile phone, and the decoded link behind it opens a webpage.
Figure 1. Price tag for fruit with an EAN bar code.
Figure 2. Online-Ticket of Deutsche Bahn (sample).
Personal weblinks of profile pages, such as Facebook pages, could also be coded into a pixel mosaic and printed on the back of a T-shirt. Anyone photographing the mosaic at the next party could view the web profile of the shirt's wearer via the internet browser of his mobile. In the course of the modelling week, the students had to investigate the maximum data volume that could be saved in a 2D pixel mosaic and whether it was possible to extend this storage capacity by developing a new concept. A particularly exciting question for the students was whether it would be possible to make paper speak: if speech were coded into a pixel mosaic, photographed with a mobile phone and played via an integrated loudspeaker.

BACKGROUND INFORMATION: 2D PIXEL-MOSAICS
The following background information on the construction and functionality of pixel mosaic variants was compiled by the team via internet research.

2-Dimensional Bar Codes

Pixel mosaics can also be described as 2-dimensional bar codes, as they code data horizontally as well as vertically. They have been developed since the late 1980s and were primarily used in the production departments of the vehicle and electronics industries for the unique identification of single components. Since then, various types of pixel mosaics have been developed and applied in different areas. The Deutsche Bahn, for example, uses the Aztec Code for coding the data of its online tickets; the Deutsche Post implemented Datamatrix for its online stamp business, whereas the QR-Code (Quick Response Code) is the most widespread (Figure 3). Numerous QR codes can already be found in the public sector which can be recorded with a mobile phone camera and evaluated inside the phone by decoding software ('code reader'). They contain advertising slogans, internet links or useful information about sights.2 This trend will probably soon find its way to Europe.
Figure 3. The word "Modelling Week" in Aztec Code, Datamatrix and QR-Code.
After a quick discussion, the team decided to analyse the QR code more thoroughly and then to search for an improved approach to storage capacity. The individual improvement steps as well as the results achieved should apply to other mosaic types in a similar way.

The Structure of the QR-Code3

The QR-Code consists of black and white pixels and can store a message of up to 7089 characters, depending on the mode that is used. The mode determines the type of the data to be stored. A distinction is drawn between
– numerical data, i.e., only digits,
– alphanumerical data, i.e., digits as well as Latin letters, and
– special characters, e.g. Japanese or Chinese characters.
First, the data is converted into binary numbers, coded by an error correction process, covered with a special mask, and then stored as a pixel mosaic. This process is called coding. The inverse process, that is, the read-out of a message from a pixel mosaic, is referred to as decoding. We will now focus on the single steps of coding. At first, each character of the message is converted into a binary number. For alphanumerical messages, the extended ASCII table4 is often used, which spends one byte (= 8 bits) of information on the representation of a character and can thus represent exactly 256 different characters. The binary message produced this way is then coded by an error correction process. Here, the QR code relies on the so-called Reed-Solomon process (see [13]), which is also used for coding data on a compact disc or DVD. Coding data with this process enables the localisation of errors and the recovery of messages even if the QR code can no longer be properly read. The Reed-Solomon algorithm also allows for the adjustment of the correction level, making it possible to choose between minimum storage space requirement and best possible error correction, depending on the application. On the maximum correction level, a message can be reconstructed even if up to 30% of the surface of the pixel mosaic is damaged.
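As a small illustration of the first coding step, the following Python sketch (our own illustration, not the team's software) converts each character of a message into its 8-bit binary representation:

    # Each character is mapped to one byte (8 bits) using its ASCII value.
    def to_binary(message):
        return " ".join(format(ord(ch), "08b") for ch in message)

    print(to_binary("HALLO"))
    # 01001000 01000001 01001100 01001100 01001111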
Figure 4. Structure of the QR-Code.
The binary code word produced by the error correction process is now transferred into a pixel mosaic. Black pixels represent a 1, whereas white pixels represent a 0. Finally, a so-called mask is laid over the code word: a short, repeating sequence of zeros and ones that is added pixel- and bitwise to the code word. This means that for each 1 in the mask, the respective pixel of the code word changes its colour: black pixels turn white and vice versa. This avoids large monochrome surfaces, which could easily cause problems when the data is captured with an optical camera. Figure 4 shows the detailed structure of the QR code of the message 'HALLO':
– The three Finder Patterns serve to align the QR code, enabling the decoding of a message even if the QR code is photographed in profile, distorted or twisted.
– The Information Patterns indicate the mask that is used as well as information about further dimensions.
– The Timing Patterns serve as coordinate axes and scale reference; therefore, black and white pixels alternate continuously.
– The four pixels in the bottom right corner of the mosaic indicate the mode that is used.
– The message length, i.e., the number of characters in the binary system, is coded directly above the mode.
The code word, consisting of the message and the error correction data, connects to the message length information. The single pixels are populated from bottom to top as well as from right to left. Of course, the actual dimension of a mosaic depends on the length of the message and is therefore calculated in advance. Pixels that are not used are set to 0; with the mask applied, another pattern of black and white pixels emerges there, which is identified as "zero information" when it is read out.

MODELLING AND OPTIMISATION: THE QUATTRO-CODE
After a thorough analysis of the structure and functionality of the QR code, the team's aim was to develop its own model of a pixel mosaic with a higher storage capacity. The new code was to store more information with the same number of pixels and thereby also enable the transmission of compressed audio files such as short voice messages. The following approaches were developed and discussed (see Figure 5):
Figure 5. Three different proposals for memory optimisation: shapes, grey shades or colours.
The first two approaches, however, were soon dismissed. The students realised that using different shapes would require a high-definition camera as well as an algorithm for the identification of the patterns; both were regarded as an additional difficulty and source of error. The readout of pixel mosaics using several grey shades instead of
black and white is very much dependent on the lighting conditions during the shot, and there is no guarantee of ideal and consistent lighting in every situation. The team eventually concentrated on the third approach. The use of different colours initially raises the question of how many colour shades should be applied. The team chose four different colours, which are assigned to the binary pairs 00, 01, 10 and 11. In contrast to the QR code, which requires 8 pixels, 1 byte (= 8 bits) of information is thus coded into 4 colour pixels. For their work, the students opted for white, red, green and blue, as these colours contrast sharply with each other. Below, we exemplify the representation of the message "HALLO" with coloured pixels, error correction not being taken into account:
– First, the ASCII numerical code is assigned to every single letter of the message: H = 72, A = 65, L = 76, O = 79,
– the ASCII numerical codes are written in the binary numerical system (8 bits): 72 = 01001000, 65 = 01000001, 76 = 01001100, 79 = 01001111,
– the binary numbers are then combined: 'H A L L O' = 01001000 01000001 01001100 01001100 01001111,
– and finally subdivided into pairs, each pair being replaced by its assigned colour (white, red, green or blue).
The pixel mosaic generated this way was named the Quattro code by the team.

Error Correction

An error correction algorithm followed as the next step: errors in the read data should be identified, localised and corrected. Note that the QR code uses the Reed-Solomon error correction. This algorithm divides the binary message into units of 8 bits each and calculates their error correction information. Depending on the correction level, the calculated code word becomes 3 to 8 bits longer than the original message, since both the message and the error correction information must be stored.
The algorithm is based on the construction of certain polynomials, their evaluation at predefined nodes and subsequent interpolation. In order to avoid large function values, all calculations are carried out in a finite field of integers. This additional difficulty prevented the students from getting deeper into the theory of this procedure. Instead, they put their focus on the development of their own error correction procedure, based, like that of conventional bar codes, on simple check sums. For this, the group pursued several approaches, of which a suitable one was selected.

A linear system of equations for error correction. The first approach the students developed was not designed for the use of several colours. Instead, they considered a message composed of eight bits (e.g. 10011011). The individual bits are assigned to the variables a, b, c, d, e, f, g and h; here the variables take the values a = 1, b = 0, c = 0, ..., h = 1. Next, six control numbers are generated, calculated as the solutions of six linear equations. The aim is to choose this system of linear equations in such a way that differences between read and recalculated control numbers uniquely identify the incorrect bit of the message. This depends on a smart combination of the variables in the equations; in particular, no two variables may occur in exactly the same set of equations. Furthermore, each variable should appear in at least two and at most three equations. For instance, the following equations meet the desired criteria:
a + c + e = x1
a + d + h = x2
a + c + f = x3
b + d + g = x4
b + f + h = x5
b + e + g = x6

The binary addition (modulo 2) of each left-hand side produces a 6-bit control number (x1, x2, x3, x4, x5, x6); here, the control number is 011010. Having this tool at hand, the error correction works for this system as follows: If there is an error at one position of the 8-bit message, the result of the linear system changes in at least two equations; hence the recalculated control number differs in at least two positions from the original control number. Thus, conclusions about the incorrect bit are possible and corrections can be made. If the message is read as 11011011, the recalculated control number is 011101,
which differs from the original control number at the positions x4, x5 and x6. This points to a read error in the variable b, since only this variable occurs in exactly these equations. Thus, the correct message is 10011011. The disadvantage of this system is that it can only detect and correct single errors. Several errors in the message, or errors in reading the control numbers, could cancel each other out or lead to differences at more than three positions of the control number. Such errors can neither be recognised nor corrected.

Check sums for the entire message. Another possibility for generating check sums is to consider the entire message, with all pixels arranged line by line from the top left to the bottom right. This results in a square, which should be as small as possible; empty space may be filled with zero values. In a second step, check sums can be introduced and calculated, e.g. check sums for rows, columns and diagonal elements. Together with these check sums, the original message is encoded into the final code word. The check sums are used to detect possible errors in the transmission of the code word. This is done by comparing the stored check sums calculated from the original message with the recalculated check sums of the read-in code word. Using this approach, the students demonstrated the detection and correction of an incorrect code word ''by hand''. However, they found it difficult to turn this approach into an algorithm, so this way of error correction was not used in the end. But the group followed up the idea of introducing check sums for error correction and adapted the procedure locally to the Quattro code.

Check Sums for 2 x 2 Pixels

The error correction approach ultimately implemented was specially developed for pixel mosaics using four different colours. It enables the algorithm to detect and correct a single error within four pixels of information.
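The linear-equation scheme described above is easy to test in a few lines of code. The following Python sketch (our own illustration) computes the six control bits by addition modulo 2 and corrects a single flipped bit via the pattern of failed checks:

    # Indices 0..7 correspond to the variables a..h.
    EQUATIONS = [
        (0, 2, 4),  # x1 = a + c + e
        (0, 3, 7),  # x2 = a + d + h
        (0, 2, 5),  # x3 = a + c + f
        (1, 3, 6),  # x4 = b + d + g
        (1, 5, 7),  # x5 = b + f + h
        (1, 4, 6),  # x6 = b + e + g
    ]

    def control_number(bits):
        return [sum(bits[i] for i in eq) % 2 for eq in EQUATIONS]

    # Each message bit lies in a unique set of equations, so the set of
    # failed checks identifies a single flipped bit.
    SYNDROME = {tuple(k for k, eq in enumerate(EQUATIONS) if i in eq): i
                for i in range(8)}

    def correct_single_error(bits, stored):
        recomputed = control_number(bits)
        failed = tuple(k for k in range(6) if recomputed[k] != stored[k])
        if not failed:
            return bits                                  # no error detected
        if failed in SYNDROME:
            i = SYNDROME[failed]
            return bits[:i] + [1 - bits[i]] + bits[i + 1:]
        raise ValueError("multiple errors cannot be corrected")

    message = [1, 0, 0, 1, 1, 0, 1, 1]      # 10011011, as in the text
    stored = control_number(message)        # [0, 1, 1, 0, 1, 0] = 011010
    garbled = [1, 1, 0, 1, 1, 0, 1, 1]      # bit b flipped
    print(correct_single_error(garbled, stored) == message)   # True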
Figure 6. Sketched representation of the team's own error correction procedure.
One byte of information, represented by four coloured pixels, is arranged as a square (see Figure 6), and the different colours depict the values 0, 1, 2 and 3 in the base-four number system. Three additional check sums are used for error detection and correction: the first and second check sums give the
sums of the diagonal and anti-diagonal elements, respectively, and the third check sum is the sum of the upper row of the square. The three check sums extend the message to a code word; they are represented as additional coloured pixels and displayed in the pixel mosaic. As an example we consider the letter "H": the ASCII code 01 00 10 00 results in the Quattro code pixels with the values 1, 0, 2, 0, and we get the following square for the calculation of the additional check sums.
Figure 7. Diagram of the Quattro Code of “H” in a square and the additional calculated check sums.
With these three check sums we can now identify and correct a single error within the four pixels. If we assume that a pixel of the original square was read in incorrectly, the newly calculated check sums will not all match the stored check sums. Let us take a closer look at Figure 6 and refer to the columns as A and B and to the rows as C and D, respectively. If, for instance, the pixel (A/D) is read in incorrectly, the newly calculated check sum of the anti-diagonal does not match the stored check sum. However, the first and the third check sums confirm the correctness of the pixels (A/C), (B/C) and (B/D). Hence the single error can only have occurred in pixel (A/D) and can be corrected. A single error among the other pixels can be detected and corrected analogously. So far, our error correction has assumed that at least the three check sums have been read in correctly. However, if one of them is read in incorrectly, then this check sum indicates an error in the message although the message itself was read in correctly. To detect errors in the check sums, a fourth check sum is added as the sum of the first three check sums. With regard to our example of the letter 'H' of the message 'HALLO', the fourth check sum is the sum of the first, second and third check sums: 1 + 2 + 1 = 0 mod 4 (see Figure 7). In this way the code word of the letter 'H' is given by: H = 01 00 10 00 plus the four additional check sums 01 10 01 00:
(Coloured pixel representation: the four message pixels of 'H' followed by the four check-sum pixels.)
Note again the assignment of the four colours to the bit pairs 00, 01, 10 and 11 introduced above. This enables us first to check the correctness of the check sums, and then to locate and correct an error in the
message afterwards: If the fourth check sum is correct, we can assume that the first, second and third check sums are also correct and proceed with the error localisation and correction of the message itself as described above. If the fourth check sum does not match the sum of the first three check sums, the first three check sums are recalculated from the read-in pixels of the message. In a second step the fourth check sum is recalculated and again compared to the stored one. If the fourth check sum is now correct, it can be assumed that an error occurred in one of the first three check sums but that the message itself was read in correctly. If the recalculation of the check sums does not result in a correct message, multiple errors have occurred among the four pixels of the message and/or the four pixels of the corresponding check sums. In that case error detection and correction are no longer possible and this byte of information is lost; a question mark is used at the affected place to indicate the failure. In most cases the message can still be interpreted by the reader, e.g. if instead of the original message 'HALLO' only 'H?LLO' is decoded.

The developed error correction for the Quattro code requires an additional 8 bits for every 8 bits of information (equivalent to 4 pixels). So the required storage space could instead be used to store the message twice. In doing so, errors could still be detected and localised, but one criterion would be missing to decide which of the two copies contains the correct message. Here, an approach based on check sums is more efficient.

The Layout of the Quattro Code

Compared to the QR code, the layout of the Quattro code is slightly different. The Finder and Timing Patterns (see Section The Structure of the QR-Code) have been merged and are also used as coordinate system and scaling reference. Beginning in the upper left, the pixel mosaic is filled up with information. The first eight pixels represent the length of the message; since each pixel takes one of four values, up to 4^8 = 65536 different lengths can be coded. This is followed, from left to right, by the coded message and the error correction data. Four coloured pixels are needed to store one byte (= 8 bits) of information. The Quattro code can thus store information much more compactly than the QR code. Figure 8 shows the message "Modelling Week" encoded as a QR code and as a Quattro code. It can be seen that the newly developed Quattro code stores the same message in significantly fewer pixels. However, the error correction data requires more space compared to the QR code. The students identified this fact and suggested, for the next generation of the code, the use of the more efficient Reed-Solomon method for error correction. The Quattro code uses less space to code a message, but there are two main drawbacks: on the one hand, the pixel mosaic must be printed in colour; on the other hand, the colours used need to remain distinguishable. These additional sources of error can, however, be limited by suitable printers and camera lenses.
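A minimal sketch of the 2 x 2 check-sum scheme in Python (our own illustration; the square is assumed to be filled row by row, as in Figure 7):

    # Pixel values are in {0, 1, 2, 3}, one value per colour; all sums mod 4.
    def quattro_check_sums(square):
        # square = [[top_left, top_right], [bottom_left, bottom_right]]
        c1 = (square[0][0] + square[1][1]) % 4    # diagonal
        c2 = (square[0][1] + square[1][0]) % 4    # anti-diagonal
        c3 = (square[0][0] + square[0][1]) % 4    # upper row
        c4 = (c1 + c2 + c3) % 4                   # check sum of the check sums
        return [c1, c2, c3, c4]

    # The letter 'H' (ASCII 72 = 01 00 10 00) gives the pixel values 1, 0, 2, 0:
    print(quattro_check_sums([[1, 0], [2, 0]]))   # [1, 2, 1, 0] = 01 10 01 00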
Figure 8. Comparison of a QR code (left) and a black/white print of the coloured Quattro code (right).
Implementation

The team developed two computer programs: one to encode a text message into a Quattro code and one to decode a Quattro code back into the original message. The encoding program transforms a word, a text phrase or even an audio file into a Quattro code which can be displayed on the screen or printed on paper. Independently of that, the decoding program reads a Quattro code as an image file, analyses the image and decodes the original data. If the original data was a text message, the decoded message is displayed on the screen; if the original data was an audio file, the decoded file is played through the loudspeaker.

FURTHER DEVELOPMENTS OF THE QUATTRO CODE
The good project results encouraged the group to discuss and document potential further developments and optimisations of the Quattro code. One improvement could be to use more than four different colours; ideally one would use eight different colours, so that 3 bits (instead of 2 bits) of information could be stored in a single coloured pixel. These colours should be chosen from the basic colour map (for the Quattro code the RGB colour map was used) so that the contrast between the colours is as large as possible. Another potential for optimising the storage capacity lies in the error correction method: the relation between the length of the code word and the length of the message might be improved by integrating the Reed-Solomon error correcting method. Regarding the use of mobile phones and their built-in cameras, the decoding algorithm of the Quattro code should be implemented in Java so as to run on a Java-enabled mobile phone. Finally, the group discussed the idea of pixel mosaics that change in time. These kinds of pixel mosaics could easily be represented on displays consisting of a large number of small LEDs. Instead of taking photos, one would use the video function of a mobile phone camera to read in ten or more different configurations of a Quattro code per second. Time-animated LED pixel mosaics can, however, no
longer be printed on T-shirts or posters, but they could be integrated into an electronic chip or badge.

DIDACTIC REMARKS
From the didactic point of view, the presented modelling task requires competencies in different areas. In the following we analyse and clarify the theoretical framework of this complex problem-solving process.

Modelling process: The individual steps and milestones in mathematical modelling are described by the well-known modelling process (see, e.g., [1, 2, 11]). Simplifying and structuring, mathematical modelling, working with algorithms, finding a mathematical solution, and interpreting and validating the mathematical results were the main aspects of the students' work, and each step was taken during the team work. Further modelling tasks for use at secondary school can be found in [4, 6, 8, 9, 12, 14]; mathematical modelling can also be applied in primary school (see, e.g., [5, 8, 10]).

Interdisciplinary application-oriented education: The presented modelling task requires the study of new mathematical concepts (in particular, calculation with binary numbers) and in-depth experience in computer science (such as PHP and Java programming). Hence, it is recommended to implement such a modelling task in interdisciplinary mathematics and IT classes. For many students the above-indicated relation to everyday life (bar codes printed on nearly all consumer goods, the possibility of using pixel mosaics on T-shirts, etc.) may additionally have a positive impact on their motivation.

Knowledge transfer: The task requires a high level of knowledge transfer and an overall good student input. First, alphanumeric messages must be converted into binary numbers and arranged in pairs. Secondly, these pairs of numbers are translated into colour boxes, furnished with error correction data and arranged in square form. To enable the decoding, each of these algorithms must be bijective, which requires concentrated work so as not to lose track of the structure of the code.

Knowledge documentation and presentation skills: The team work during the week included the documentation of the acquired knowledge by means of a project report and a 25-minute talk on the project results on the last project day. The presentation was followed by a brief discussion of the successes and missteps of the team work, and the group members were asked questions by all participants as well as by the external scientists in charge. Although the processes of putting the project results down on paper and creating slides for the final presentation were initially regarded as tedious or even disturbing, in retrospect the students were proud to present their own project results in front of a large audience and to receive positive feedback. The presentation of project results should therefore be an integral part of modelling with students.

All participants called the Modelling Week 2008 in Lambrecht a big success, as reflected by the questionnaires: The students were highly motivated to work on their modelling task because they could work on authentic, complex and
open problems with real-life applications. Moreover, the description of the modelling problem gave no information about the field of mathematics required to solve it. This fact in particular is a big challenge for the teachers in a team: at the start they are on the same level of knowledge as the students, but after analysing and structuring the problem they can benefit from their mathematical knowledge and point the team to the most useful mathematical tools. We hope that the interdisciplinary nature of mathematical modelling and problem solving will be integrated into day-to-day school classes. In any case, the working atmosphere during the Modelling Week in Lambrecht was extremely good and the modelling task was really fun.

NOTES
1 MINT is an abbreviation for the subjects "Mathematics, Computer Science, Natural Science and Technology".
2 http://www.youtube.com/watch?v=OxFR6r-Dqk4 (last viewed on 19 December 2008).
3 The QR-Code was developed by the Japanese company Denso Wave. It is now standardised under ISO 18004.
4 ASCII is an abbreviation for "American Standard Code for Information Interchange".
REFERENCES/BIBLIOGRAPHY
[1] Blum, W. (1996). Anwendungsbezüge im Mathematikunterricht – Trends und Perspektiven. Schriftenreihe Didaktik der Mathematik, 23, 15–38.
[2] Blum, W., et al. (2002). Application and modelling in mathematics education. Journal für Mathematik-Didaktik, 23(3), 262–280.
[3] Dorfmayr, A. (2007). Von Strichcode bis ASCII – Codierungstheorie in der Sekundarstufe I. In G. Greefrath & J. Maaß (Eds.), Materialien für einen realitätsbezogenen Mathematikunterricht (Vol. 11, pp. 9–17). ISTRON-Schriftenreihe.
[4] Eck, C., Garcke, H., & Knabner, P. (2008). Mathematische Modellierung. Berlin: Springer Verlag.
[5] Göttlich, S. (2007). Mathematische Modellierung in der Mittelstufe: Personalausweis für Schildkröten. In Beiträge zum Mathematikunterricht 2007 (pp. 324–327). Hildesheim: Verlag Franzbecker.
[6] Hamacher, H., Korn, E., Korn, R., & Schwarze, S. (2004). Mathe und Ökonomie: Neue Ideen für einen projektorientierten Unterricht. Wiesbaden: Universum Verlag.
[7] Herget, W. (1994). Artikelnummern und Zebrastreifen, Balkencode und Prüfziffern – Mathematik im Alltag. In W. Blum, H.-W. Henn, M. Klika, & J. Maaß (Eds.), Materialien für einen realitätsbezogenen Mathematikunterricht (Vol. 1, pp. 69–84). ISTRON-Schriftenreihe.
[8] Hinrichs, G. (2008). Modellierung im Mathematikunterricht: Mathematik Primar- und Sekundarstufe. Heidelberg: Spektrum Verlag.
[9] Kiehl, M. (2006). Mathematisches Modellieren für die Sekundarstufe II. Berlin: Cornelsen Verlag.
[10] Maaß, K. (2007). Praxisbuch Mathematisches Modellieren: Aufgaben für die Sekundarstufe I. Berlin: Cornelsen Verlag.
[11] Maaß, K. (2006). What are modelling competencies? ZDM, 38(2), 113–142.
[12] Pesch, H. J. (2002). Schlüsseltechnologie Mathematik – Einblicke in aktuelle Anwendungen der Mathematik. Wiesbaden: Teubner Verlag.
[13] Reed, I., & Solomon, G. (1960). Polynomial codes over certain finite fields. Journal of the Society for Industrial and Applied Mathematics, 8(2), 300–304.
[14] Sonar, T. (2001). Angewandte Mathematik, Modellbildung und Informatik. Wiesbaden: Vieweg Verlag.
Simone Göttlich
Department of Mathematics
University of Kaiserslautern
[email protected]

Thorsten Sickenberger
Department of Mathematics
Heriot-Watt University, Edinburgh
[email protected]
GÜNTER GRAUMANN
6. MATHEMATICS FOR PROBLEMS IN THE EVERYDAY WORLD
PRACTICE ORIENTATED MATHEMATICS EDUCATION (PRINCIPAL IDEAS)
The conception of "Practice Orientated Mathematics Education" used here as a basis has the goal of letting students experience mathematics as a tool for problems of the everyday world. It was developed about thirty-five years ago (see e.g. Graumann 1976, 1977 and 1987). The main objective was to provide information on the relevance of mathematics in everyday life and to allow people to see mathematics as a tool that lets us master problems in everyday life. When referring to the term "everyday life" or "everyday world" I do not mean only outward events; "everyday life" also refers to causal and social interaction with nature and within communities such as family, municipality, nation or mankind. Furthermore, "everyday life" comprehends the mental and cultural goods of our own as well as of other civilisations. So the situations for "Practice Orientated Mathematics Education" can come from the sphere of living of the pupils or their parents, from the sphere of work or from the sphere of science and the arts, but they can also concern the world of politicians1 in a community, a nation or the European parliament, or a problem of nature and mankind. With respect to the goal of learning for future life we have to concede that we can only make forecasts; however, every pedagogical conception is faced with this dilemma. Moreover, any pedagogy is alive and fruitful only when it can combine well-founded analysis of the future with hope and confidence in the rightness of pedagogical actions. Furthermore, I must concede that different children have, and will always have, different everyday worlds. A pedagogical conception therefore must be reducible to a common standpoint as well as give enough freedom for individual formations. I would also like to mention that for the numerous problems encountered in everyday life it is not always necessary for students to have all the knowledge needed to solve the problem available to them when they start to work on it; sometimes they can find out new mathematical ideas while solving the problem. Having this sketched general view of "orientation to everyday life" in mind, we can find a lot of situations of everyday life, or problem fields true to life, which can be solved with the help of mathematics and are suited to the levels of knowledge and the abilities of our learning group.
PRACTICE ORIENTATED MATHEMATICS EDUCATION (METHODICAL RUN)
Due to the main goal of 'Practice Orientated Mathematics Education', and because we know that people transfer learnt concepts best when they learn them in a context similar to the situation in which they will use them, I believe that every unit within this conception must start with a situation encountered in everyday life or a problem field which is true to life. The methodical run of a problem can then be organised in the following five steps:

Because each unit begins with a situation which occurs, has occurred or could occur in everyday life, in the first step the students have to become familiar with the given situation (if it is not immediately clear to them). This means that all students have to develop an understanding of the circumstances surrounding the situation; they must feel they could really be in this situation. There are several ways to help ensure this, and the choice depends on several factors, most notably the prerequisites of the learning group. Strategies to help students develop an understanding of the situation include a presentation, a narration (by the teacher, a student or an external expert), a picture or sequence of pictures, a radio play or tape recording, a movie or video tape, a planned role-play or a discussion in the learning group.

The second step requires the students to define the problem and the conditions connected with it more precisely. More complex problems also have to be divided into separate, more manageable problems with specific questions. Often we also have to decide which directions are interesting for us. In general, all these tasks should be done by the students themselves; the students' prior experience of working with such problems will help the teacher decide whether the work must be structured ahead of time or whether the teacher only has to stand ready to help.

The third step, which follows on from the second, requires the existing information to be separated into essential and unessential information. It is also necessary at this stage to find out what important information is missing and how it can be procured. At times this step can be addressed immediately, but other problems may need a lot of effort in this area; sometimes even a new description or modelling of the situation is advisable.

In the fourth step the partial problems and questions discussed in the second step have to be solved. The way of finding the solutions of course depends on the specific problem and the presuppositions of the learning group. Normally we can only give hints about heuristics and general problem solving strategies.

If we have solved all partial problems, then in the fifth step the single solutions must be combined in order to devise a solution of the entire problem. Finally this solution has to be interpreted and integrated into the given situation of everyday life, and the limitations of the solution must also be noted.

This sequence of steps should not be interpreted as a strict scheme; rather it should be used as a guide. Moreover, with respect to the general goals of practice orientated mathematics education, the focus on reality must dominate the work at all times. As a consequence, in such a unit (which can be taught by the mathematics teacher alone or in cooperation with a teacher of another subject) not only mathematical topics have to
be discussed but also those topics of other subjects which are necessary to understand the situation adequately.

EXAMPLES ALREADY PERFORMED IN SCHOOL (SHORT DESCRIPTION)
In order to illustrate this conception and to underline the importance of working on everyday problems, I will now give some examples of problem orientation with problem fields of the everyday world by sketching units which have already been carried out in school in the way described.

Expenses of a Dinner at Home

This problem was carried out as early as grade three. In the first step students were presented with a clip on the tape recorder: a family with two children have just finished a dinner, and the parents are talking about the cost of the dinner, which consisted of sausages, potatoes, vegetables and a dessert. The price of a sausage, the potatoes, the vegetables, the necessary milk and pudding powder, as well as an approximation of the cost for electricity etc., were mentioned in this talk.2 Afterwards the pupils discussed the situation in the classroom and clarified the problem. Then they picked out the given information about the situation while listening to the tape a second time. After discussing with each other they had all the important information and began to calculate first the cost of each food type and then the total cost.3 Finally they all compared their results and discussed the possible different costs of other similar meals. An extension of this problem field is the approximation of the expenses for all meals for one day or for one month, or for another dinner.

Expenses of Buying an Automobile or a Motor Cycle

This problem field was introduced in grade four as well as in grades eight and nine. As an introduction a video tape was used. In one version there was again a family with two children: the father discussed how their old car had to be repaired, but thought that it would be better to buy a new car and sell the old one; the mother then mentioned the problem of the expenses, including insurance etc. In the next scene the parents met in town, looked at a new car and talked about price and loan with a car salesman. In a second version a girl who had left school after grade nine and begun a vocational education wanted to buy a motor cycle. Her father did not like that and told her that he would not finance a motor cycle for her; if he could not prevent the purchase, he would only buy leather clothes for her safety. The girl first went to a shop that sold motorbikes and looked for a suitable motorbike at the right price. Then she visited a bank, inquiring about a loan and its conditions. Finally, at home, she did some calculations about her income, her normal expenses and the monthly payments into a savings account. In addition she also looked at an advertisement in the newspaper about a loan with very low interest rates. At this point the video ends.
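The loan arithmetic in these scenarios can be captured with the standard annuity formula; the following Python sketch uses invented figures purely for illustration:

    # Monthly instalment for a loan repaid in equal payments.
    def monthly_payment(principal, annual_rate, months):
        r = annual_rate / 12
        return principal * r / (1 - (1 + r) ** -months)

    # Assumed values: a 2500 euro motorbike, 8% annual interest, 24 months.
    print(round(monthly_payment(2500.0, 0.08, 24), 2))   # about 113.07 euro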
A questionnaire focussing on the pupils' knowledge about cars and their cost was presented first.4 Then the children in class watched the video two more times, because the presented situation is very complex. After that it was necessary to carry out a class discussion in order to analyse the problem and look at specific aspects of it. When students began discussing the monthly cost of a car or a motor cycle, the video was played a third time to pick up specific information. The step of deciding which particular problems the students wanted to investigate, and of separating these problems into single questions, turned out differently in different classes. In one class the students only asked for the monthly cost of repaying the loan, insurance and approximate costs for fuel.5 In another, this problem was embedded in the problem of living expenses in general,6 including extensive investigations; for example, some students questioned parents and looked for statistical information. Depending on the problem chosen, the final discussion covered different aspects.

Extending a Loft

This unit was taught to 14-year-old pupils in a basic course in elementary school and took five lessons. The introduction to the situation was given to the pupils through role-playing, in which the situation of two boys in a family occupying a small part of a house was narrated. The two boys have only one room, and problems arise when either wants to invite friends over. As a result their father suggests that the loft in their part of the house be extended into a living and sleeping room for one of the boys. Because they cannot afford to spend much money on this project, the family has to work on it themselves; before tackling the project it is good to make a rough estimate of its financial cost. After an introduction to the problem via a written description and a whole-class discussion, the task of calculating the financial costs was given to the pupils in school. Before working on this problem they discussed the type of the house and different forms of roofs. The teacher told the class that the (fictitious) family owns half of a house with a hip roof (Figures 1 and 2). In the second and third lesson (on the next day and the day after) the situation of the family and their project was recalled. The first task was the insulation of the floor and the walls. The partial problems of covering the floor and the walls with insulation and wooden panels, as well as the paperhanging and painting of the walls, then had to be worked out. For this the pupils were required to calculate several areas as well as find out the cost of the material, including an estimate for small materials like nails and screws (for homework, pupils had to price relevant items in local building centres). This work was done by the pupils in groups, and as a positive by-product different solutions emerged and were compared by the students. In the fourth lesson the installation of heating elements was discussed. It emerged that the size and price of the heating depend on the volume of the loft, so the computation of the volume was the main task for this lesson. The lesson was taught by five student teachers simultaneously, each teaching a small group of pupils. For each group a model of a hip-roofed solid made of synthetic foam was available. The children knew a formula for a prism but not
one for a pyramid. In all groups the children themselves had the idea to cut the model into pieces, which was easily done with a big knife (Figure 3). The volume of the middle part, a prism, was calculated first. With the peaked solids several reflections and trials were carried out, until some of the students cut these solids again and put three of the quarter-pyramids together to form a cube (which was possible because the height of the pyramid had been arranged to equal half of its breadth). [Some lessons later the maths teacher of this class deepened this experience by working out the general formula for a pyramid.]
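The volume computation can be sketched as follows; the dimensions are invented, since the text does not give the measurements of the loft, but the dissection argument is the one from the lesson: a pyramid over a square base with its apex above one corner and height equal to the base side fills exactly one third of a cube, so the volume of each peaked end is base area times height divided by three.

    # Hip-roofed loft: a triangular prism along the ridge plus two
    # pyramid-shaped ends; assumes roof height = breadth / 2 (45 degree hips).
    def hip_roof_volume(length, breadth, height):
        ridge = length - breadth                  # ridge shortened by both hips
        prism = (breadth * height / 2) * ridge
        ends = 2 * (breadth * (breadth / 2) * height) / 3   # base area * h / 3
        return prism + ends

    print(hip_roof_volume(8.0, 6.0, 3.0))   # 54.0 cubic metres (assumed sizes)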
Figure 1. The net of the loft-room.
Figure 2. One end of a hip-roofed building.
Figure 3. Three parts of a right-angled pyramid (see above on the right) form a cube.
In the fifth lesson the calculation for the heating was finished and the price for the entire project was worked out. A test with single questions relating to this project was given some days later.

Dyke Raising

This unit was designed for students in grade eight or nine. The problem was introduced using pictures of stormy tidal waves and their destructive effects, as well as of possible measures to prevent them. During the discussion the students were to put themselves into the situation of a planner or parliamentarian of a particular state where the dyke has to be renewed and raised.
(Diagram: cross-section of the new dyke from seaside to landside, with the sand kernel of the old dyke inside.)
In a second step the students had to find out the construction plan of a dyke and what is necessary to renew and raise an old dyke. The next step required modelling and computation. The students worked out the volume of sand needed for renewing the dyke, for which the difference between the old and the new dyke over a length of twelve kilometres had to be computed. For this, the shapes of the old and the new dyke could be seen as prisms with trapezium-shaped cross-sections.
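A sketch of the sand-volume computation with invented cross-sections (the text does not give the actual dimensions):

    # Old and new dyke modelled as prisms with trapezium cross-sections.
    def trapezium_area(base, top, height):
        return (base + top) / 2 * height

    old_area = trapezium_area(30.0, 3.0, 6.0)    # assumed old dyke, metres
    new_area = trapezium_area(40.0, 3.0, 8.0)    # assumed new dyke, metres
    length = 12000.0                             # twelve kilometres in metres
    print((new_area - old_area) * length)        # 876000.0 cubic metres of sand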
Furthermore, the surface of the new dyke had to be calculated in order to determine the amount of clay and grass layer needed. For the cost of transportation the students made approximations guided by information from transport companies. In a final step all the single computations were brought together and the result was discussed. Issues such as the accuracy of the result and additional costs like payment for the workers were also raised.

Sound Nutrition

The idea of nutrition and changes in body weight is well understood among students. It was introduced to students in grade seven with the help of an advertisement for children's chocolate (Figure 4). This advertisement said something about "double milk" but nothing about calories. On the packaging of the original product the children found a table with the amounts of carbohydrate, fat and protein as well as minerals and vitamins in 100 g of chocolate. This encouraged the students to investigate other products like dark chocolate, milk, chips and Coke in the same regard. With this they produced their own comparative table of the carbohydrate, fat and protein contents of several products. They found, for example, that the fat content of 100 g of children's chocolate is four times that of one glass of milk.
Figure 4. Children chocolate [+ milk / – cocoa / 8 bars].
Whilst discussing the different contents of foodstuffs, the students' interest turned to questions about calories and body weight. At this point the definitions of one calorie/kilocalorie and one joule, as well as the conversion of one unit of measurement into the other, were provided by the teacher. The students then investigated the amount of calories in different foodstuffs and worked out a table to compare different foodstuffs with respect to their calorie count. The unit finished with a discussion about sound nutrition and the role of different formulas for normal weight and the body mass index.7
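A short sketch of the conversions and formulas mentioned here (the example values are invented; 1 kcal = 4.184 kJ is the standard thermochemical factor):

    # Energy conversion and body mass index.
    def kcal_to_kj(kcal):
        return kcal * 4.184

    def bmi(weight_kg, height_m):
        return weight_kg / height_m ** 2

    print(round(kcal_to_kj(550), 1))    # e.g. 550 kcal per 100 g -> 2301.2 kJ
    print(round(bmi(60.0, 1.70), 1))    # 60 kg at 1.70 m -> BMI 20.8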
Architecture in Our City and Aesthetic Aspects in Works of Art Based on Geometry

This unit was taught once in two parallel classes with sixth-grade students, and a second time with sixth- and seventh-graders in a project week in a nine-year elementary school. The starting problem was a building gap (dating back to World War Two) in front of one side of the old market in our town. A student of architecture and design had worked out a good plan to fill this gap, which was published in the newspaper. In class we discussed this proposal and, with the help of pictures, several shapes of buildings from different ages (Figure 5). Afterwards the gap and some other buildings in the town were physically inspected by the students. The children drew pictures and gave written descriptions of some old buildings or of details relating to these buildings. As a result, in addition to enhancing their ability to draw and to make relevant descriptions, the pupils also learnt something about different styles of architecture, gained a feeling for aesthetic aspects and saw the problems that an architect faces when restoring ancient buildings.
Figure 5. Two buildings of the 16th century with, in the middle, a building designed by a student of architecture.
In the second lesson a picture of the Porta Nigra in Trier, with its many round arches, was presented. The children first had to draw a picture of it by hand (Figure 6) and then made a more accurate picture with the help of ruler and compasses. In this lesson the handling of a pair of compasses also had to be revised. In the third and fourth lessons, a double lesson, the children first created some pictures which can be seen as pieces of art consisting only of circles (or pieces of circles) and rectangles. Then they had to focus on ornaments consisting of circles and
rectangles. They analysed some given examples, drew some themselves and finally produced works of art made of coloured yarn and cardboard, in which figures of circles and rectangles were used as a basis.
Figure 6. Porta Nigra – drawn by a pupil.
NOTES
1 For all people in a democracy it is necessary to have at least a little knowledge of, and empathy with, the problems politicians have to work on.
2 Introductory radio play: Speaker: Mr. and Mrs. Newman are sitting at the dinner table and eating their dessert. The two children have already left the dining-room. (You hear some noises having to do with eating the dessert.) - Father: Do you have some more of the pudding? - Mother: No. I divided the pudding into exactly four portions. - Father: You should have made much more pudding; then there would be leftovers. - Mother: You know, we do not have enough money to always cook double portions. How much do you think this dinner cost for the four of us? - Father: Ah, we have had four fried sausages, potatoes and vegetables, and then the dessert. I would say: 5 Euros. - Mother: I don't think that would be enough. One sausage costs 80 cents. For the potatoes I paid 50 cents in the market and the vegetables cost €1.40. For the pudding I needed half a litre of milk, the powder and a little bit of sugar. One litre of milk costs 90 cents and I paid 15 cents for the powder. We must also remember that it costs approximately 30 cents for the little bit of sugar, some salt for the potatoes, seasonings, water and electricity. - Father: O.k., but these are mostly small amounts.
3 Working sheet (for assistance):
4 sausages ……………………… _______ Euro
potatoes ………………………… _______ Euro
vegetables ……………………… _______ Euro
½ l milk ………………………… _______ Euro
powder …………………………. _______ Euro
diverse things (sugar, salt, seasonings, etc.) … _______ Euro
Total expenses ………………….. _______ Euro
What do you think a breakfast for the four people of this family costs?
4 Questionnaire used at the beginning: Do you have a car at home? - What do your parents use it for? - What does a normal car use to run the engine? (Water, gasoline, oil, air, electricity, coal, beer, ice?) - What does a new middle-class car cost? (Give an average price of five different car makes.) - Give an approximation of the operating expenses of a middle-class car. - Do you know something about insurance for a car? - Give a short explanation of liability and full car insurance!
5 Working sheet (for assistance): - Operating expenses: gasoline (for about 20,000 km per year), motor oil (about 1 litre per year), inspection (including change of motor oil), tyres (extra tyres for winter, changed twice a year), repairs (approximate amount per year). - Fixed expenses: insurance (liability and/or full insurance per year), tax (per year), prime cost of the car (spread over approximately 6 years). Look up these expenses for five different types of middle-class car and calculate the average expenses for a month.
6 Working sheet (for assistance): Living expenses of a family are e.g. expenses for lodging (including heating, water, electricity, gas, refuse), expenses for housekeeping (including food, washing, cleaning, etc.), expenses for clothing (approximate average per month), expenses for holidays (monthly reserves), and expenses for other things like newspapers, phone, radio and insurances.
7 Other themes relating to mathematics and health are, for example, the growth of microbes in food, the costs and physical effects of the consumption of cigarettes or alcohol, the probability of infection with AIDS or the quality of an AIDS test, and measuring people's fitness.
REFERENCES
Graumann, G. (1976). Praxisorientiertes Sachrechnen. In Beiträge zum Mathematikunterricht 1976 (pp. 79–83). Hannover.
Graumann, G. (1977). Praxisorientierter Geometrieunterricht. In Beiträge zum Mathematikunterricht 1977 (pp. 98–101). Hannover.
Graumann, G. (1987). Geometry in everyday life. In E. Pehkonen (Ed.), Articles on mathematics education (pp. 11–23). Research Report 55, Department of Teacher Education, University of Helsinki.
Graumann, G. (1989). Geometry in everyday life. In W. Blum, et al. (Eds.), Applications and modelling in learning and teaching mathematics (pp. 153–158).
Günter Graumann
University of Bielefeld
AILISH HANNIGAN
7. POLITICAL POLLS AND SURVEYS The Statistics Behind the Headlines
INTRODUCTION
Statistical literacy has been defined as the ability to use "sound statistical reasoning to intelligently cope with the requirements of citizenship, employment and family" (Franklin et al., 2007, p. 1). Primary and second level mathematics curricula around the world include statistics and probability to develop students' data handling skills and their decision making skills in the face of uncertainty. The ability to produce, summarise and draw conclusions from numeric data is increasingly important in today's knowledge society and "empowers people by giving them the tools to think for themselves, to ask intelligent questions of experts, and to confront authority confidently" (Steen, 2001, p. 22). The Guidelines for Assessment and Instruction in Statistics Education (GAISE) (GAISE report, 2005), recently endorsed by the American Statistical Association, contain six key recommendations for teaching statistics: emphasise statistical literacy and develop statistical understanding; use real data; stress conceptual knowledge rather than mere knowledge of procedures; foster active learning in the classroom; use technology for developing conceptual understanding and analysing data; and use assessment to improve and evaluate student learning. Political polls and surveys are one of the most visible applications of statistics in the media. The aim of this paper is to outline some of the statistical concepts associated with survey methodology and to explore some of the challenges of carrying out these surveys in practice by using real examples. The paper also aims to develop statistical literacy by discussing questions to be asked when interpreting the results of surveys, and it gives an example of a classroom activity which could be used to develop statistical thinking.

POLITICAL POLLS
We often read headlines in newspapers saying things like: a certain percentage of the population is satisfied with the current government's performance. How can the newspaper make such a statement when it hasn't asked everyone in the country their opinion of the government? The newspaper has neither the time nor the money to ask everyone in the population, so it has taken a subset of the population and inferred that what happens for that subset is what happens for the whole population. Is this
inference valid? How do you select a representative subset? How many people do you need to select? What mistakes can you make selecting this subset, and what can be done to minimise these mistakes? Without understanding the concepts behind selecting a subset of the population, i.e., sampling, we can make serious errors in our conclusions about the population.

Consider a pre-election poll on which political party is going to win the election. The population is the entire group of objects or subjects about which information is required; in this case, the population is all adults who will vote in the election. A sample is any subset of a population, for example a subset of individuals who will vote in the election. A unit is any individual member of the population, e.g. an individual who will vote in the election. A sampling frame is a list of the individuals in the population, e.g. the electoral register could represent a list of those who will vote in the election. A variable is a quantity or attribute which we can measure for each unit and whose value changes from unit to unit, e.g. the political party the individual will vote for. We are interested in the percentage of the population who will vote for a particular party. This number is called a parameter, i.e., some value (for example an average value or a percentage) that we are interested in calculating for the population. We will never find out the value of the parameter unless we ask everyone in the population and all of them respond. The value of a parameter is, therefore, usually unknown but a fixed number; we estimate it by using information from a sample. A statistic represents some value (for example an average value or a percentage) that we are interested in calculating for the sample, e.g. the percentage of a sample of adults who will vote for a particular political party. We can usually find out this value since we can ask everyone in the sample, but if we took a different sample of people we might get a different answer. The value of a statistic is, therefore, known but not fixed. We estimate the value of the parameter by using the value of the statistic.

Sampling Methods

Sampling methods are the different ways of selecting a subset of individuals from the population. We can select a group simply because it is easy for us to contact these people and they are willing to answer our questions. This method of sampling is appropriately called convenience sampling: the sample is identified primarily by convenience, e.g. volunteer panels for consumer research. The advantage is relatively easy sample selection and data collection, but it is impossible to evaluate how representative the sample is of the population. We can also use an "expert" to pick the people he or she considers most representative of the population. This is called judgement sampling; the quality of the sample depends on the judgement of the person selecting it. Quota sampling is another method of sampling widely used in opinion polling and market research. Interviewers are each given a quota of subjects of a specified type to survey; for example, an interviewer might be told to find 80 adults and interview them, where the 80 adults have to be made up of 20 adult men aged 40 or over, 20 adult men aged under 40, 20 adult women aged 40 or over and 20 adult
women aged under 40. Quotas are often based on gender, age group or geographical location. It is clear that these methods are open to mistakes. Let's go back to the example of the pre-election poll on who is going to win the election. If we pick a group of people to answer this question simply because it is convenient for us, our sample may include only people from a particular background and exclude everyone from a different background. Can we be certain that the conclusions for these people can be applied to the whole population? If there is a tendency for a certain group of the population to be omitted from the sample, or if people who refuse to co-operate form a group which is, in some way, different from the rest of the sample, we have what is called a biased sample. The definition of bias in the context of sampling is a systematic tendency to overestimate or underestimate the population parameter of interest. For the election example, if our sample consists solely of people from a disadvantaged area with high unemployment, we may under-estimate the level of support for a particular political party.

How can we eliminate bias? Bias can be eliminated by taking a random sample. This is a sample where everyone in the population has the same chance of getting into the sample, and the fact that one individual has got into the sample does not affect the chances of another individual getting into the sample, i.e., everyone has an independent and equal chance of being included. This method of sampling is called simple random sampling. All methods of sampling based on taking a random sample are called probability sampling methods, since we know the chance (or probability) of someone getting into our sample. Convenience, judgement and quota sampling are called non-probability sampling methods, since we do not know the chances of an individual getting into our sample. Many methods of statistical analysis make the assumption that the sample from which the data are collected is a random sample. There are many different types of probability sampling. These include:

Stratified sampling: Stratified sampling techniques are generally used when the population is heterogeneous (dissimilar) and certain homogeneous (similar) sub-populations can be isolated. These sub-populations are called strata. The population is divided into strata such that each unit in the population belongs to one and only one stratum; the basis for forming the strata could be age, gender, industry type etc. A simple random sample is then taken from each stratum.

Cluster sampling: The population is divided into separate groups of units called clusters; each unit belongs to one and only one cluster. A simple random sample of clusters is selected from a list of all clusters, and all units within each chosen cluster are included in the sample. A cluster could be a housing estate or other well-defined area. Cluster sampling is typically used when a researcher cannot get a complete list of the members of the population they wish to study but can get a complete list of groups or 'clusters' of the population. It is also used when a simple random sample would produce a list of subjects so widely scattered that surveying them would be far too expensive, for example people who live in different parts of the country. For instance, it is difficult to obtain a list of all second level mathematics teachers in a country, but a list of all second level schools is available. The school can act as
a cluster of mathematics teachers – a random sample of schools can be selected from the list of all schools, and all mathematics teachers in the schools selected are included in the sample.

Systematic sampling: For example, if a sample of size 50 is required from a population with 5000 units, we would include in the sample one unit for every 5000/50 = 100 units in the population. One of the first 100 units of the population is selected at random from a list of all members of the population. Other sample units are found by starting with the first unit and then selecting every 100th unit that follows in the population list. In effect, the sample of 50 is identified by moving systematically through the population and identifying every 100th unit after the first randomly selected unit.

Sample Size

Precision is a measure of how close a statistic is expected to be to the true value of a parameter. Lack of precision occurs when every time we take a sample and ask the question of interest, we get a very different answer, i.e., the result of the sampling is not repeatable. How can we make conclusions for the population if we get different answers from each sample? How can we correct this problem? If we increase the sample size, we also increase the repeatability or precision of our results. A large random sample will include many people with lots of different characteristics, whereas a small sample will not have the same range of people or characteristics. Thus, the results from one small sample may differ considerably from the results of another small sample depending on the range of people and characteristics it includes.

We are interested in, for example, estimating the proportion of the population with a particular attribute, i.e., p. If we take repeated random samples of size n from this population, calculate the proportion of each sample with the attribute of interest, i.e., p̂, and plot all the values of p̂, we have what is called a sampling distribution. Provided that the sample size n is large, this sampling distribution looks bell-shaped and symmetric, i.e., it is approximately normally distributed. For example, if we toss a coin 50 times (n=50) and get the proportion of times we get heads, repeat this experiment 1000 times and plot all these proportions, we get a distribution similar to Figure 1 for simulated data. The middle of this distribution, i.e., the mean, is the best estimate of p, which in our example is 0.5, i.e., if the coin is fair we will get heads 50% of the time. The standard deviation of this distribution is given by
√(p(1-p)/n)    (1)

and represents the error in using p̂ as an estimate of p, or sampling error. Our standard deviation in Figure 1 is 0.07. If we repeat our experiment but this time toss a coin 100 times (n=100) and get the proportion of times we get heads, repeat 1000 times and plot all these proportions, we get the distribution in Figure 2. Again, the middle of the distribution is 0.5 but the standard deviation is smaller since n is
larger and this results in a narrower range of values. Students can simulate these distributions themselves using online applets, e.g. http://www.rossmanchance.com/applets/Reeses/ReesesPieces.html.
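Where the applet is not available, the simulation is also easy to script. The following is a minimal sketch in Python (not part of the original text; function and variable names are our own), matching the sample sizes and 1000 repetitions of Figures 1 and 2:

```python
import random

def sampling_distribution(n, repetitions=1000):
    """Toss a fair coin n times, record the proportion of heads,
    and repeat to build an empirical sampling distribution."""
    proportions = []
    for _ in range(repetitions):
        heads = sum(random.random() < 0.5 for _ in range(n))
        proportions.append(heads / n)
    return proportions

for n in (50, 100):
    props = sampling_distribution(n)
    mean = sum(props) / len(props)
    sd = (sum((p - mean) ** 2 for p in props) / len(props)) ** 0.5
    print(f"n = {n}: mean = {mean:.4f}, standard deviation = {sd:.4f}")
```

Plotting the simulated proportions as a histogram (e.g. with matplotlib) reproduces the bell shapes of Figures 1 and 2; the printed standard deviations should come out close to 0.07 for n=50 and 0.05 for n=100, as in the figures.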
[Figure: histogram with fitted normal curve of the proportion of heads; mean = 0.5026, StDev = 0.07097, N = 1000.]
Figure 1. Histogram of proportion of heads in 50 tosses (n=1000).
Using a property of the normal distribution, 95% of the values of p̂ lie in the range mean ± 1.96 standard deviations, which is

p ± 1.96 √(p(1-p)/n)    (2)
Therefore, in repeated sampling, 95% of the time p̂ lies in this range, so we call this range a 95% confidence interval for p. The margin of error of using p̂ as the best estimate of p is, therefore, given by

margin of error = 1.96 √(p̂(1-p̂)/n)    (3)
Using the most conservative value of p̂ = 0.5, i.e., the value that maximises (1), we can investigate the effect of sample size on the margin of error in Table 1.

[Figure: histogram with fitted normal curve of the proportion of heads; mean = 0.5000, StDev = 0.05129, N = 1000.]
Figure 2. Histogram of proportion of heads in 100 tosses (n=1000).

Table 1. Effect of sample size on the margin of error
n        margin of error
50       ±0.14
100      ±0.10
1000     ±0.03
8000     ±0.01
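Table 1 can also be regenerated with a few lines of code, which makes a useful classroom check of formula (3). A minimal sketch in Python (the function name is our own):

```python
from math import sqrt

def margin_of_error(p_hat, n, z=1.96):
    """Margin of error for a sample proportion at 95% confidence, formula (3)."""
    return z * sqrt(p_hat * (1 - p_hat) / n)

# Most conservative case: p_hat = 0.5 maximises the margin of error.
for n in (50, 100, 1000, 8000):
    print(f"n = {n:5d}: margin of error = ±{margin_of_error(0.5, n):.2f}")
```

The same function, called with an observed sample proportion instead of 0.5, reproduces the party-by-party margins of error calculated later in this chapter.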
How large does the sample have to be? That depends on how confident one wants to be about the results (Table 1 is based on a 95% confidence level, which is the most commonly used one) and what margin of error is acceptable. If one wants to estimate to within ±3%, i.e., within ±0.03 of the true proportion with 95% confidence, a sample size of 1000 is needed. Consider the example of a poll on the result of the Lisbon Treaty Referendum, which was held in Ireland in October 2009. A newspaper report from the Irish Times on September 25th, 2009 stated that
Support for the Lisbon Treaty is holding steady but the No side has gained ground over the past three weeks, according to the latest Irish Times/TNS mrbi poll. The poll shows that 48 per cent are likely to vote Yes, an increase of two points since the last Irish Times poll in early September, while 33 per cent say they would vote No, an increase of four points. The number of people in the Don't Know category has dropped by six points to 19 per cent. When undecided voters are excluded, the Yes side has 59 per cent with 41 per cent in the No camp.

The article also gave information on the sampling strategy used, the sample size and the margin of error, i.e.,

The latest poll was taken on Tuesday and Wednesday of this week among a representative sample of 1,000 voters in face-to-face interviews at 100 sampling points in all 43 constituencies. The margin of error is plus or minus 3 per cent.

Using this information, the proportion of all voters in the referendum who will vote Yes lies in the range statistic ± margin of error = p̂ ± 0.03 with 95% confidence. 19% of the sample were undecided, i.e., 190 of the 1000 in the sample. If a voter refuses to declare his/her voting intention, the usual practice is to exclude them and to base estimates on those who do respond, but such a procedure reduces the sample size (Kmietowic, 2007). The estimates of the proportions for the respondents are unbiased, but that does not necessarily mean that the estimated proportions are unbiased estimates of the population proportions. This would be the case if non response was random, but this is not likely in political polls – respondents may have a good reason for not declaring their voting intentions, e.g. they don't want their support for a controversial political party to be known, so support for this political party would be underestimated. If we exclude the undecided voters from the Lisbon referendum poll, of the remaining 810 in the sample, 59% intended to vote Yes. Using p̂ = 0.59 and (3), the
margin of error = 1.96 √(0.59 × 0.41 / 810) = ±0.034    (4)
and a 95% confidence interval for the true proportion of voters that will vote Yes is given by 0.59 ± 0.034 = [0.56, 0.62]. The following month the Lisbon Treaty was passed with 67% in favour, so the poll had underestimated the proportion of people who voted Yes. It should be noted that this type of poll reflects public opinion at a particular time, and predicting the future behaviour of voters even a few weeks later is challenging.

Harding (1992), in his article on "Political Polls and Errors", pointed out that where the margin of error is mentioned in newspaper reports, it only applies to a single sample proportion of voters who will vote for a particular political party, i.e., the largest sample proportion. For smaller sample proportions, e.g. political parties with a smaller proportion of support, the standard error is smaller. Thus the margin of error given overstates the sampling errors of the smaller sample proportions.
For example, the Sunday Business Post carried out a poll of voters in February 2010 (Sunday Business Post, February 21st, 2010). A random sample of 1007 adults aged 18 or over was interviewed by telephone. Using a 95% confidence level, this sample size would give a margin of error of ±3%. A random digit dial (RDD) method was used to ensure a random selection of households to be included. Half of the sample was selected from a RDD of fixed telephone lines; the other half was selected using a RDD of mobile phones. It was reported that support for the political party Fine Gael was highest, with 34% of the sample giving this party their first preference vote. Using p̂ = 0.34 and (3), the
margin of error for Fine Gael = 1.96 √(0.34 × 0.66 / 1007) = ±0.03    (5)
or 3%. 27% of the sample gave Fianna Fail, the other large political party, their first preference vote. Using p̂ = 0.27 and (3), the
margin of error for Fianna Fail = 1.96 √(0.27 × 0.73 / 1007) = ±0.027    (6)
or 2.7%. Just 9% of the sample gave their first preference vote to the political party Sinn Fein. Here, the margin of error is smaller again:
margin of error for Sinn Fein = 1.96 √(0.09 × 0.91 / 1007) = ±0.02    (7)
or 2%, so for political parties with a smaller proportion of support, the margin of error is smaller than the 3% margin of error given. Kmietowic (2007) also cautioned against using the reported margin of error when investigating the sample lead of one political party over another. The standard error for the sample lead involves a more complicated calculation and is larger than the standard error of the two individual sample proportions. If we take the example from the Sunday Business Post above, the proportion which supports Fine Gael is p̂1 = 0.34 and the proportion which supports Fianna Fail is p̂2 = 0.27. Does Fine Gael have a significant lead over Fianna Fail? Using the standard error of the sample lead calculation below from Kmietowic (2007), we get
standard error of the lead = √((p̂1 + p̂2 - (p̂1 - p̂2)²)/n) = √((0.34 + 0.27 - 0.07²)/1007) = 0.025
and the margin of error is 1.96 × 0.025 = 0.05. The difference between the two sample proportions is 0.34 - 0.27 = 0.07, so this difference ± margin of error = 0.07 ± 0.05 = [0.02, 0.12], giving us a 95% confidence interval for the sample lead of Fine
Gael over Fianna Fail. Since this interval does not include zero, where zero represents no difference between the two proportions, we can conclude Fine Gael have a statistically significant lead over Fianna Fail in the population. Note the margin of error of 0.05 for the sample lead is bigger than the margin of error for Fine Gael given in (5) and for Fianna Fail given in (6).

SURVEYS
Non Response

One of the pieces of information rarely given by newspapers is the response rate of the survey. We are told the size of the final sample but we are rarely told how many people were contacted in order to achieve the desired sample size. For example, interviewers may have had to ask thousands of voters in order to get 1000 voters who were willing to take part and answer their questions. Can we assume that there is no relationship between failure to respond and the answers to the questions of interest? Or will those that fail to respond result in a biased sample which is not reflective of the population of interest?

A recent survey, reported in a national newspaper, stated that crime rates in Ireland were the highest in the EU, with Irish people being more at risk of assault, burglary, theft and sexual attack (Irish Independent, February 5th, 2007). The EU International Crime Survey (EU ICS) was conducted in the 15 pre-accession EU countries, together with Poland, Estonia and Hungary. The newspaper reported that 22% of Irish people had been recent victims of crime, compared to the EU average of 15%. However, official statistics in Ireland had shown that Ireland had one of the lowest crime rates in the EU. The Central Statistics Office (CSO) in Ireland, in their report on interpreting crime statistics, drew attention to the differences in the size, scope and methodology between their survey and the EU ICS. The CSO survey used face-to-face interviewing in about 29,000 households throughout Ireland with a response rate of 92%. The EU ICS interviewed just over 2,000 persons using a fixed-line telephone survey. The response rate was 42%, and this included those who could not be contacted by telephone as well as those who were contacted but refused to take part. The CSO suggested that low response rates for these types of surveys can result in prevalence estimates which are too high, as those people who have not been victims of crime and who feel they have nothing to tell may be less likely to participate. The CSO also suggested that those who are hard to contact may be, on average, more at risk of being victims of crime because of their mobility; for example, their houses may be unattended for long periods of time. It was also not reported in the EU ICS if the response rate was similar across the countries involved in the survey.

The populations surveyed by the CSO and the EU ICS were also different. The CSO only included adults over 18 whereas the EU ICS included people aged 16 or over. It was suggested that those in their late teens can be more vulnerable than average to some types of crime, and this may have resulted in differences in the percentages who have been victims of crime in the two surveys. The questions asked in both surveys were also different. In the EU ICS, the telephone interviewers asked
those who responded very specific questions about assault, including incidents which occurred in the home. The CSO survey excluded these types of assault because such topics were considered too sensitive for this type of face-to-face survey, and it was felt that specific supports would need to be offered to those asking and answering the questions to enable these topics to be covered in a survey.

What Questions Should I Ask When Interpreting the Results from a Survey?

Comparing the CSO and EU ICS survey methodologies raises some important questions to be asked when interpreting the results of a survey.
– Who carried out the survey? Was it an independent body?
– What was the population? Who is being represented by this survey?
– How was the sample selected? Was it a random sampling method? Was it a convenience sample? What is the potential bias?
– How large was the sample? What was the margin of error?
– What was the response rate? How many people had to be contacted before the desired sample size was achieved? Did non response result in a sample which is different from the population and how would this affect the estimates?
– How were the subjects contacted? Was it a face to face interview, telephone survey or mail survey? Did this affect the response rate? Did it result in a sample which was different from the population of interest?
– When was the survey conducted? Is it a recent survey? Had a particular unusual event just occurred which may have influenced people's opinions?
– What were the exact questions asked? Were they leading questions? Were they sensitive questions?

AN ACTIVITY FOR THE CLASSROOM
The most effective way for students to understand the concepts and challenges involved in carrying out a survey is to implement a survey themselves. Consider the following: your school is interested in promoting sustainable modes of transport, i.e., walking, cycling or taking the bus to and from school. The school needs to establish the modes of transport currently used by pupils in the school and the reasons why these modes of transport are used. Students are required to design a survey to answer these questions of interest.

First we need to define the population, sample, etc. The population is all students in the school. The sample is a subset of students in the school, for example 100 students. The sampling frame is a list of all students in the school. A unit is an individual student, and the variables of interest could be the age of the student, gender, how far the student lives from school etc. One of the parameters of interest could be the percentage of all students in the school that walk to school, and the statistic which estimates this parameter is the percentage of the sample of students that walk to school.

What sampling strategies should be considered and discussed? A convenience sample could be selected by standing at the school gate 10 minutes before school
starts and handing out a questionnaire to the first 100 students passing by. These 100 students are the sample. How would this sampling strategy result in a biased sample? Students who arrive earlier or later than 10 minutes before school starts will not have a chance of getting into your sample. They may arrive earlier because of the method of transport they use to get to school. Therefore the estimate of the percentage of students who use a particular form of transport may be an over- or under-estimate of the true percentage. It is clear that using a non-probability sampling method like convenience sampling can result in a biased sample.

To minimize bias, a random sample should be selected. A simple random sample could be selected as follows. First, a list of all the pupils in the school is required. Each pupil is given a unique number. Using a calculator, random number tables or a computer, 100 unique random numbers in the appropriate range are produced. The students with these numbers make up the sample.

Selecting a Simple Random Sample Using a Calculator

A pseudo-random number table is a computer generated table consisting of the numbers 0 to 9. The numbers in any position in the table are equally likely to be 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 and the numbers in different positions are independent – the value of one has no influence on the value of another. Pseudo-random number tables are used by calculators and software packages to select random samples from a population. To select a random sample of size 100 from a population of size 1000, give everyone in the population a unique number, i.e., 0 to 999. Using a scientific calculator, generate a random number in the range 0 to 999. The unit in the population with the number generated by the calculator is the first person to be selected in your sample. Generate another random number. The unit in the population with the number generated by the calculator is the second person to be selected in your sample. Keep generating random numbers until you have the required sample size. Ignore repeated numbers.

Contacting the Students

Once the students have been selected, how will the information be obtained from them? Using a self-completed questionnaire? Asking the students questions face to face? Collecting the information from them in class? Posting the questionnaires to them? Posting a questionnaire to the students would require access to their home addresses, which may not be possible. It may also result in a low response rate. Contacting the students selected in the sample in school should result in a higher response rate. The students selected could be in any of the classes in the school, so time and resources will be required to contact all of these students in their classes and encourage them to respond. It will also have to be decided if the students are asked the questions face to face in their classes or if they are given a questionnaire to complete in the classroom which is collected from them at the end of the class period.
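The calculator procedure for selecting a simple random sample can equally be scripted; Python's standard library, for instance, draws a sample without repeated numbers in a single call. A minimal sketch (the pupil numbering is illustrative):

```python
import random

population = list(range(1000))           # one unique number, 0 to 999, per pupil
sample = random.sample(population, 100)  # 100 distinct pupils, all equally likely
print(sorted(sample)[:10])               # inspect the first few selected numbers
```

Because random.sample never returns the same element twice, the "ignore repeated numbers" step is built in. The same call, applied to a list of 40 class numbers with a sample size of 4, implements the cluster sampling strategy described below.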
Other Sampling Strategies

It is often difficult to get access to a list of the population, and simple random sampling can only be implemented if a list of all the students in the school is accessible. Without this list, we could use the class structure of the school as our sampling frame. Say for example the school has 40 classes with approximately 25 students in each class. We could use a class in the school as a cluster and get a list of all the classes. Each class is given a number from 0 to 39. A calculator, random number tables or a computer is used to randomly produce 4 unique numbers in the range 0 to 39. All students in classes with these four numbers make up the sample, i.e., 4 classes of 25 students. This type of sampling (cluster sampling) is a useful strategy here as long as we can assume each class is as likely to use different modes of transport as any other class. If, however, younger classes are more likely to use a particular type of transport, then this will affect the usefulness of the clustering approach.

Sample Size and Margin of Error

Based on a sample of size 100, what is the margin of error for the percentage of all students in the school that walk to school, with 95% confidence? From Table 1, a sample of size 100 gives a margin of error of 10%, i.e., with 95% confidence, the percentage of all students in the school who walk to school lies in the range statistic ± margin of error, i.e., p̂ ± 10%, where p̂ is the percentage of students in the sample who walk to school. If this margin of error is too large, then a larger sample is required in order to be more precise in our estimation.

What are the Exact Questions to Be Asked?

Asking the students "How did you get to school today?" may result in a different answer than the answer to "On a typical day, how do you get to school?" The first question may have been affected by the weather on a particular day or an unusual event. The second question would give the method of transport most commonly used by the student but may not capture information about the full range of transport used by a student; for example, a student may cycle to school three days a week but be driven to school by their parents on the other days.

By discussing and deciding on a sampling strategy, sample size calculations, how to contact the sample selected and what questions to ask those selected for this example, students can learn the challenges and statistical issues involved in surveys.

SUMMARY
Statistics is an increasingly important part of the mathematics curriculum at second level. Statistical literacy is considered a vital life skill in today's knowledge economy. Using real data, fostering active learning and developing statistical thinking are three of the six recommendations for teaching statistics in the GAISE report. This paper
has demonstrated how to implement these guidelines in the classroom using headlines from newspapers, giving an example of a survey to be implemented by students and discussing the questions to ask before interpreting the results of a survey.

REFERENCES

Central Statistics Office, Ireland. (2007). Interpreting crime statistics. http://www.cso.ie/releasespublications/documents/crime_justice/current/interretingcrimestats.pdf
Franklin, C., Kader, G., Mewborn, D. S., Moreno, J., Peck, R., Perry, M., et al. (2007). Guidelines for Assessment and Instruction in Statistics Education (GAISE) report – A pre-K-12 curriculum framework. Alexandria, VA: American Statistical Association.
Guidelines for Assessment and Instruction in Statistics Education (GAISE) college report. Alexandria, VA: American Statistical Association.
Harding, D. (1992). Political polls and errors. Teaching Statistics, 14(2), 6.
Kmietowic, Z. (2007). Sampling errors in political polls. Teaching Statistics, 16(3), 17–74.
Steen, L. (Ed.). (2001). Mathematics and democracy: The case for quantitative literacy. National Council on Education and the Disciplines. Princeton, NJ: Woodrow Wilson Foundation.
Ailish Hannigan
Department of Mathematics and Statistics
University of Limerick
HERBERT HENNING AND BENJAMIN JOHN
8. CORRELATIONS BETWEEN REALITY AND MODELLING “Dirk Nowitzki Playing for Dallas in the NBA (U.S.A.)”
INTRODUCTION
Mathematical modelling and mathematics are a "key technology". Mathematics is one of the core competences in developing reliable and efficient simulations for technical, economic and biological systems; thereby, mathematics has found a new role as a key technology. In order to simulate any process, it is necessary to find an appropriate model for it and to create an efficient algorithm to evaluate the model. In practice, one of the main restrictions is still time: if one wants to optimize the process, the simulation must be very fast and, therefore, model and algorithm must be looked at as a whole and, together, made as efficient as possible. Four problems are very important:
– A problem-finding competence, i.e., the capacity to discover real world problems which may be solved successfully by simulation (this seems not to be well developed in teachers);
– To develop a hierarchy of models, which allows us together with...
– To construct, for each model, the most efficient evaluation algorithm, to reduce the simulation time;
– To check the reliability of the simulation, its limitations and possible extensions.
There is never an end in modelling a real world problem. While modelling a real-world problem, we move between reality and mathematics. The modelling process begins with the real-world problem. By simplifying, structuring and idealizing this problem, you get a real model (Figure 1). The mathematizing of the real model leads to a mathematical model. By working within mathematics, a mathematical solution can be found. This solution has to be interpreted first and
Figure 1. Modelling cycle.
then validated (Blum, 2004). A global cognitive analysis yields the following ideal-typical solution, oriented towards the cycle. Competence can be regarded as the ability of a person to check and to judge the factual correctness and the adequacy of statements and tasks personally and to transfer them into action. Similar views can be found in the didactical discussion about modelling: "Research has shown that knowledge alone is not sufficient for successful modelling: the student must also choose to use that knowledge, and to monitor the process being made." (Tanner & Jones, 1995, p. 124). Based on these concepts, I define the term "modelling competency" as follows: "Competencies for modelling include abilities of modelling problems as well as the will to use these abilities." A further important basis is the different sub-competencies mentioned in Maaß (2004, p. 173). Modelling competencies contain:
– Competencies to understand the real problem and to set up a model based on reality.
– Competencies to set up a mathematical model from the real model.
– Competencies to solve mathematical questions within this mathematical model.
– Competencies to interpret mathematical results in a real situation.
– Competencies to validate the solution.
Mathematical modelling is a permanent interaction between reality and mathematics.

MATHEMATICAL LITERACY AND MODELLING
The "Programme for International Student Assessment" (PISA) gives a precise definition of the term mathematical literacy as "an individual's capacity to identify and understand the role that mathematics plays in the world, to make well-founded mathematical judgements and to engage in mathematics, in ways that meet the needs of that individual's current and future life as a constructive, concerned and reflective citizen" (Organisation for Economic Cooperation and Development (OECD), 1999, p. 41). The concept of mathematical literacy connects the development of mathematical structures with the treatment of realistic tasks. This connection can be considered as analyzing, assimilating, interpreting and validating a problem – in short, modelling. Within this perspective, modelling competencies form a part of mathematical literacy, and the examination of modelling competencies is helpful in clarifying the mathematical literacy of students.

The OECD/PISA identifies two major aspects of the construct mathematical literacy: mathematical competencies and mathematical big ideas (chance, change and growth, dependency, relationships and shape). Among others, modelling is described as one of the major competencies that build mathematical competence. Mathematical modelling needs an overarching set of abilities which can be identified in the well-known modelling cycle.

The modelling cycle normally has its starting point in a certain situation in the real world. Simplifying it, structuring it and making it more precise leads to the formulation of a problem and to a real model of the situation. If appropriate, real data are collected in order to provide more information on the situation at one's disposal.
If possible and adequate, this real model – still a part of the real world in our sense – is mathematised. That is, the objects, data, relations and conditions involved in it are translated into mathematics, resulting in a mathematical model. Now mathematical methods come into play and are used to derive mathematical results. The results have to be re-translated into the real world and interpreted in relation to the original situation. At the same time the problem solver validates the model by checking whether the problem solution obtained by interpreting the mathematical results is appropriate and reasonable for his or her purposes. If need be, the whole process has to be repeated with a modified or a totally different model. At the end, the obtained solution of the original real world problem is stated and communicated (Blum, 2002, pp. 149–171).

LEVELS OF MODELLING COMPETENCE
Here we introduce a level-oriented description of the development of modelling competence, characterized in three levels:
– Level 1: Recognition and understanding of modelling
– Level 2: Independent modelling
– Level 3: Meta-reflection on modelling
Competence, as a theoretical construct, cannot be observed directly. One can only observe students' behaviour and actions as they solve problems, for example. Competence is understood here as a measurable variable, in the sense that the level of competence can be inferred by observing the behaviour of students. In a pilot study (Henning and Keune, 2004; Keune et al., 2004) students' behaviour was observed as they worked on modelling problems, with the goal of reaching conclusions concerning their levels of modelling competencies. The theoretical assumption here was that at the first level, procedures and methods can be recognized and understood, as a prerequisite to being able to independently solve problems at the second level. Conscious solving of problems in the sense of this paper requires, accordingly, knowledge of the procedure. Furthermore, the authors make the assumption that meta-reflection on modelling would at the very least require both familiarity with modelling and personal experience. Within this perspective, the levels of modelling competencies could be considered as one dimension of at least three dimensions in which a modelling activity takes place, the other two being the level of complexity (contexts, methods, technical skills) and the educational level.

CHARACTERISTIC ABILITIES
Level 1 – Recognize and Understand Modelling

Characterized by the ability to recognize and describe the modelling process, and to characterize, distinguish, and localize phases of the modelling process.

Level 2 – Independent Modelling

Characterized by the ability to analyze and structure problems, abstract quantities, adopt different perspectives, set up mathematical models, work on models, interpret
results and statements of models, and validate models and the whole process. Pupils who have reached this second level are able to solve a problem independently. Whenever the context or scope of the problem changes, pupils must be able to adapt their model or to develop new solution procedures in order to accommodate the new set of circumstances that they are facing.

Level 3 – Meta-Reflection on Modelling

Characterized by the ability to critically analyze modelling, formulate criteria of model evaluation, reflect on the purposes of modelling, and reflect on the application of mathematics. At this third level of competency, the overall concept of modelling is well understood. Furthermore, the ability to critically judge and recognize significant relationships has been developed. Consideration of the part played by models within various scientific areas of endeavour, as well as their utilization in science in general, is present. This implies that finished models are examined and any inferences drawn from them evaluated (Jablonka, 1996), while at the same time criteria for model evaluation are scrutinized (Henning and Keune, 2002). Here is an example for Level 1 ("Watertank").

WATERTANK
During a math class students are asked to describe a watertank as it is filled. The tank is one meter wide, empty at the beginning and is filled with one litre of water per second. The students receive further information from the teacher e.g. shape and measurements of the tank. Here you see one student’s results. He sketched the tank of water and depicted in a graph how the water-level changed over time.
Figure 2. Sketch of a student.
A1) How could the student have established the course of the graph?
A2) Is there other information which the student did not use?
The teacher judges that the results so far are good and encourages the student to find a formula for calculating the water-level.
A3) What steps would the student have to take in order to set up a formula for calculating the water-level?
The following is a real situation in a modelling task, as an example for Levels 2 and 3.

HOW TO EVALUATE THE TRAJECTORY OF DIRK NOWITZKI'S SHOT?
Motivation

Being an enthusiastic basketball player myself, I naturally follow the professional leagues in the media. What impresses me most is the shooting accuracy of some professional players. Roughly ten years ago Dirk Nowitzki became only the second German to play in the NBA (National Basketball Association), the world's best basketball league. In 2004 the magazine DIE ZEIT printed an interview with Nowitzki's advisor, mentor and personal coach Holger Geschwindner, without whom Nowitzki arguably would not have been as successful as he is today. In that interview Geschwindner, who holds a degree in mathematics, describes how he developed an individual shooting technique for Nowitzki:

"I took a paper and a pen and asked myself: 'Is there a shot where you can make mistakes but the ball still goes through the hoop?' [...] Then I drew a sketch: The incidence angle of the ball must be at least 32 degrees, Dirk is 2.13m tall, his arms have a certain length and if you know the laws of physics, you find a solution quickly." (Ewers, 2004, translated by the author)

At first it is surprising to find physics mentioned in a sports article. But after a short while you start thinking: which laws could Geschwindner be referring to, and how did he hit on the 32 degree angle? I started analyzing and comprehending Geschwindner's statements, especially with regard to mathematics in school. Can you discuss the whole topic, or side aspects of it, with students in school? How can this interdisciplinary reference be utilized in physics lessons? These questions are picked out as the central themes of the following paper.

Mathematical Modelling

According to the Rahmenrichtlinien des Landes Sachsen-Anhalt, mathematical modelling is a mandatory task in schools. It is also described as one of the skills to be trained in the Bildungsstandards (Projektgruppe and LISA, 2008). In addition to conforming to the guidelines, it is also an objective to connect mathematics with reality. The aim is to show the wide-ranging meaningfulness of mathematics in school, which is often questioned by students. The physicist describes motion sequences with formulas; the chemist handles reaction equations; the stress analyst calculates the bearing structure of a building. They all use mathematical tools although the original problem had nothing to do with mathematics. Mathematical modelling works with nearly every problem of any complexity. Applying this method includes phrasing and solving non-mathematical problems using mathematical language. This is done by differentiating between the real world
(non-mathematical) and the mathematical world. In every modelling task, the steps of the following cycle are executed (Figure 3):
Figure 3. Modelling process (NSW, 2006).
The starting point is a real-world problem. Then a situation model is created by simplifying, idealizing and structuring the task. Now the real-world model has to be transferred into mathematics by generating a mathematical problem within a mathematical model. To solve the mathematical problem, well-known algorithms are used. Then the mathematical results are transferred back into the real-world situation, to be able to interpret them with regard to the real-world problem. Afterwards, the results are reviewed and evaluated with respect to the real world. If the result is illogical or unrealistic, every single step, e.g. the overall procedure, the transfer processes and the algorithms, has to be checked for correctness.

With this new way of setting a task, teachers have a means at hand to spark students' motivation and interest in mathematical and everyday life problems. In addition, students learn how to deal consciously and critically with questions, which also helps them to get to know the benefits of mathematics on their own. In my opinion it is extremely important that students develop confidence in their (individual) abilities to solve problems. There is not one specific way to handle a certain problem, no calculator replacing the mental activity. Students develop their individual solutions; these can differ from one another but still end up with the same result, which, by the way, does not necessarily mean a specific numerical value but in fact the interpretation of results, including their implications in the real world.

The activities of a teacher change if he uses this new way of setting tasks: it requires a greater amount of time, and tasks are more complex and seem more difficult. Students with poorer performance who used to work with strict patterns are challenged. At the beginning it appears that there are going to be some complications. The lessons are less predictable and illustrations become a lot more important.
Both students and teachers are required to be more flexible, as well as able to follow the train of thought of others.

BASKETBALL - PLAY
The Idealized Shot

Incidence angle = 32° – how did Geschwindner arrive there? As mentioned in the first part, the mathematician Geschwindner believes that the incidence angle of the basketball falling through the basket should not be smaller than 32°. The following part shows how he arrived at this result.

For a basketball shot we assume a trajectory parabola, as known from physics. The incidence angle represents the slope of the trajectory parabola when the ball is falling through the basket, in case you make the shot, or bouncing off the rim, in case you miss it. If you hold the basketball directly above the rim and let it fall downwards without giving any impulse in either direction, due to gravity the ball will fall through the basket. The incidence angle would be 90°. You cannot reach this angle with a usual shot, which will be explained later. As the lowest possible incidence angle we assume 0°. This would represent a ball thrown horizontally at the level of the basket. The ball would bounce against the front and back of the rim. It is impossible to score with this incidence angle.

We need to look at the incidence angle with respect to the plane at the height of the basket, which is located 3.05m above the court. This plane is parallel to the ground and for this reason parallel to the basketball court, too. To determine the lowest possible incidence angle with the basketball still falling through the basket, we look at the following sketch:
Figure 4. Sketch to calculate the lowest possible incidence angle in case of a made shot.
Figure 4 shows schematically how to evaluate the lowest possible incidence angle. Here we assume the basketball is falling directly through the basket. It is definitely possible that the ball would hit the rim first, then bounce up, and fall down through the basket afterwards. But for the shooter this is hard to control: instead of falling through the basket, the ball could just as well fall down beside the rim.
In the sketch, the incident ball is shown by the parallel straight lines crossing the bold line representing the basket at both ends. In the case of the lowest possible incidence angle the ball hits neither the front nor the back of the rim. The distance between the two parallel straight lines represents the diameter of the basketball. In the triangle in Figure 4 the lengths of two of the three sides are known. The diameter of the basket is 0.45m and the basketball has a perimeter of 0.75m (University Mainz, 2006). From the perimeter of the basketball we get the following diameter:
u = π · d  ⟹  d = u / π    (1)

d = 75 cm / 3.14 ≈ 23.87 cm    (2)
Let α be the incidence angle, which can be identified with the help of the following trigonometrical relation:

sin(α) = opposite leg / hypotenuse = diameter of ball / diameter of basket = 23.87 cm / 45 cm ≈ 0.5304    (3)

α ≈ 32.04°    (4)
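The calculation in (1)–(4) is short enough to verify directly, for example in Python (a sketch; the variable names are our own):

```python
from math import asin, degrees, pi

ball_perimeter = 75.0    # cm (University Mainz, 2006)
basket_diameter = 45.0   # cm
ball_diameter = ball_perimeter / pi                     # equation (2)
alpha = degrees(asin(ball_diameter / basket_diameter))  # equations (3)-(4)
print(f"ball diameter = {ball_diameter:.2f} cm, minimal incidence angle = {alpha:.2f} degrees")
```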
To evaluate the incidence angle, no more than basic mathematical knowledge and tools are necessary: mathematical modelling to get the sketch in Figure 4, evaluations on right-angled triangles (trigonometrical relations), as well as perimeter evaluations of a circle and a sphere respectively. The lowest possible incidence angle of 32° could thus be validated almost exactly.

Reconstructing the trajectory of a shot. During the regular NBA season every team plays 82 games. In the following playoffs the teams could play up to 28 more games, and at least 16 more for the team that wins the championship. Consequently, a team could play more than 100 games in one season. Looking at professional basketball from this point of view, teams and players aim at saving their forces. Therefore we take a look at how many shots Dirk Nowitzki released in the 2009/2010 season: he averaged 19 field goal attempts and seven free throw attempts, which makes 26 shots overall per game. His field goal percentage and free throw percentage were 47.5% and 90.8% respectively (http://www.nba.com/playerfile/dirk_nowitzki/career_stats.html). Since Dirk Nowitzki takes about 2500 shots in one season during games alone, not counting shots in practice, it appears logical to minimize the expenditure of energy for every single shot. That is why we model the shortest trajectory of the basketball while shooting a free throw with an incidence angle of 32°. As in general mathematics lessons, the goal is to reconstruct a function with the help of three known characteristic points.
To be able to operate in our well-known two-dimensional Cartesian coordinate plane the basketball is assumed to be a point mass. The basket is at 3.05m (ten feet high). Now the distance between the basket and the point where the ball leaves the shooter’s hands has to be identified. Therefore we use the following figure of a basketball court:
Figure 5. Dimensions of a NBA basketball court in feet and inches (osovo.com, 2010).
Since the basketball is assumed to be a point mass and the point where the ball leaves the shooter's hands is assumed to be directly above the free throw line, the distance between those two points is 19 feet minus 63 inches, as shown in Figure 5, which equals 13 feet and nine inches. The lesson can thus also be used to repeat unit conversions. By using the following information we can transform the distance into the metric system (Brockhaus, 2004):

1′ = 1 foot ≙ 0.3048 metres    (5)
1″ = 1 inch = 1/12 foot ≙ 0.0254 metres    (6)
With these data the distance is evaluated as 4.191m. It remains to determine the height of the point at which the ball leaves the shooter's (Dirk Nowitzki's) hands. He is 2.13m tall and the ball leaves his hand just above his head. Since the basketball may be assumed to be a point mass – we use the center of the basketball – the height
of the point where the ball leaves Nowitzki's hands is assumed to be at 2.20m. To illustrate the procedure that follows, we use another sketch (Figure 6):
Figure 6. Schematic sketch of a free throw.
Since we assume a basketball shot follows a trajectory parabola, a general second order equation can be used to start determining the functional equation:

y = f(x) = ax² + bx + c    (7)
From our considerations above we get the following points:
Height of the basket: P1 = (0 / 3.05)
Release point: P2 = (4.19 / 2.2)
Incidence angle: α = 32.04°
If this information is inserted into the general equation above, we receive the following system of three equations in three variables:
y = f(0) = a·0² + b·0 + c = 3.05    (8)

y = f(4.19) = a·4.19² + b·4.19 + c = 2.2    (9)

f′(x) = 2ax + b    (10)

f′(0) = 2a·0 + b = tan(32°)    (11)
From equation (8) we get c = 3.05, and from equation (11) it follows that b = 0.625. It only remains to determine the variable a with the help of equation (9):
a = (2.2 - c - 4.19·b) / 4.19² = (2.2 - 3.05 - 4.19 × 0.625) / 17.5561 = -3.46875 / 17.5561 ≈ -0.198    (12)
From the calculations above, the following functional equation holds for the trajectory of the basketball (Figure 7):
y = f(x) = -0.198x² + 0.625x + 3.05    (13)
Figure 7. Trajectory of the basketball according to equation (13) shown with the help of the algebraic computer software Maple®; the bold red circles mark the basket and the point where the ball leaves the shooter’s hands.
This functional equation changes if a different incidence angle or a different height where the ball leaves the shooter's hands is assumed. The latter naturally depends on the height of the shooter. When shooting a jump shot, the height where the ball leaves the shooter's hands changes because the shooter jumps vertically to be able to shoot over possible defenders. The following Table 1 shows how parameters a, b and c change if the height when dropping the ball is constant but the incidence angle varies. Such tables can be created with a spreadsheet, so that the impact of changing one parameter can be observed directly. Table 1 obviously shows that parameter c remains constant and is independent of the chosen incidence angle. Parameter c represents the intersection with the y-axis. During a lesson the relevance of this parameter can be discussed with students to increase their understanding. In this particular example the parameter represents the height of the basket. On the German national team, Dirk Nowitzki plays with Heiko Schaffartzik (1.83m), and there are also two players on the Dallas Mavericks team with the same height, the Frenchman Rodrigue Beaubois and the Puerto Rican José Juan Barea. Though the shot of every single player is different, the height of a player influences the way of shooting tremendously.
Table 1. Impact of changing the incidence angle on parameters a, b and c if the height when dropping the ball (2.20m) as well as the distance of the shooter from the basket (4.19m) remains constant

Incidence angle α [°]   a [1/m]   b        c [m]
32                      -0.198    0.6249   3.05
34                      -0.209    0.6745   3.05
36                      -0.222    0.7265   3.05
38                      -0.235    0.7813   3.05
40                      -0.249    0.8391   3.05
42                      -0.263    0.9004   3.05
44                      -0.279    0.9657   3.05
46                      -0.296    1.0355   3.05
48                      -0.313    1.1106   3.05
50                      -0.333    1.1918   3.05
52                      -0.354    1.2799   3.05
54                      -0.377    1.3764   3.05
56                      -0.402    1.4826   3.05
58                      -0.430    1.6003   3.05
60                      -0.462    1.7321   3.05
62                      -0.497    1.8807   3.05
64                      -0.538    2.0503   3.05
66                      -0.584    2.2460   3.05
68                      -0.639    2.4751   3.05
70                      -0.704    2.7475   3.05
Figure 8. Illustration of the trajectory of the ball with an incidence angle of 32° (blue), 40° (green), 50° (brown) and 60° (black) according to Table 1, with the red circles marking the basket and the release point.
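Table 1 need not be produced with a spreadsheet; a short script following equations (8)–(12) generates the same values. A sketch in Python (the function name is our own; the default arguments are the chapter's values for Nowitzki's free throw):

```python
from math import radians, tan

def trajectory_coefficients(angle_deg, release_height=2.20,
                            distance=4.19, basket_height=3.05):
    """Coefficients of y = a*x**2 + b*x + c with the basket at x = 0,
    following equations (8)-(12)."""
    c = basket_height                                         # equation (8)
    b = tan(radians(angle_deg))                               # equation (11)
    a = (release_height - c - distance * b) / distance ** 2   # equation (12)
    return a, b, c

for angle in range(32, 72, 2):  # the angle grid of Table 1
    a, b, c = trajectory_coefficients(angle)
    print(f"{angle:2d} degrees: a = {a:6.3f}, b = {b:.4f}, c = {c:.2f}")
```

Calling the same function with release_height=1.85 reproduces the equations of Table 2 for the shorter player.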
For a player with a height of 1.83m, a release point of 1.85m is assumed because his arms are not as long as those of a player who is 2.13m tall; therefore he does not release the ball from as high above his head. The calculation to determine the trajectory is similar to the one above, resulting in the following equations (Table 2):

Table 2. Functional equations if the incidence angle is varied and the dropping point (1.85m) as well as the distance between shooter and basket (4.19m) remains constant

Incidence angle α [°]   functional equation
32                      y = f(x) = -0.217x² + 0.625x + 3.05    (14)
40                      y = f(x) = -0.269x² + 0.839x + 3.05    (15)
50                      y = f(x) = -0.353x² + 1.192x + 3.05    (16)
Figure 9. Illustration of the trajectory of a shot with the same incidence angle of 32° but from players with a different height (blue – height where the ball leaves the shooter’s hands 2.20m, black – 1.85m).
At this point the length of the trajectory could be compared to those where the height when dropping the ball is varied but the incidence angle remains constant. Since the length of a trajectory is calculated as

L(a, b) = ∫[a, b] √(1 + (f′(x))²) dx,    (17)
the integral to be solved will take the following form:

F(x) = ∫[x₁, x₂] √(ax² + bx + c) dx    (18)
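Although this antiderivative is beyond school mathematics, the length in (17) is easy to approximate numerically, for instance in the Project Week setting mentioned in the next paragraph. A sketch using a simple polyline approximation (function name and step count are our own choices):

```python
from math import sqrt

def trajectory_length(a, b, x_start, x_end, steps=100_000):
    """Approximate the arc length of f(x) = a*x**2 + b*x + c between
    x_start and x_end by summing many short chords (c has no effect)."""
    dx = (x_end - x_start) / steps
    total = 0.0
    for i in range(steps):
        x = x_start + i * dx
        dy = a * ((x + dx) ** 2 - x ** 2) + b * dx  # f(x+dx) - f(x)
        total += sqrt(dx * dx + dy * dy)
    return total

# Trajectory (13): released at x = 4.19 m, basket at x = 0
print(f"length of the 32 degree trajectory: {trajectory_length(-0.198, 0.625, 0.0, 4.19):.2f} m")
```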
Given that solving these types of integrals is not part of mathematics in school, the exact length of the trajectory will not be determined during a regular lesson. But this task can be picked out as a central topic during a Project Week, a workshop for experts in the afternoon or as preparation for the Mathematical Olympiad. To be able to compare the lengths of the trajectories nevertheless, the local maximum of the functional equation is used. Obviously a trajectory becomes longer if and only if its maximum is higher, given a fixed distance between starting point and endpoint. Figure 8 shows the direct relationship between the incidence angle and the y-value of the maximum. Consequently, the higher the maximum, the longer the trajectory and the more power is needed to overcome gravity.

To be able to compare the lengths of the trajectories in cases of unequal heights when dropping the ball, the maxima have to be looked at in a different way. Figure 9 shows the trajectories of two shooters with different heights, both aiming at the same incidence angle. The maxima of both trajectories can be evaluated by setting the first derivative of the functional equation to zero. This is how the x-value of the maximum is determined. The y-value is determined by reinserting this x-value into the functional equation.

Table 3. Maximum of the trajectories of shooters with different height but their shot having the same incidence angle of 32°

Body height   release point   x-value of the maximum   y-value of the maximum   absolute height differential
2.13m         2.20m           1.58m                    3.54m                    1.34m
1.83m         1.85m           1.44m                    3.50m                    1.65m
Table 3 shows that the maximum height of the trajectory of the ball shot by the shorter player is lower by four centimetres. But at the same time the absolute height differential differs by 29 centimetres. Therefore, smaller players, who usually have less muscle, have to use more power to score a basket. The larger the incidence angle of the basketball while falling through the basket, the larger the horizontal variance of the shot may be. It is crucial that the center of the basketball falls through the center of the rim when shooting with an incidence angle of 32°. This is not mandatory with larger incidence angles. However, the player has to exert more power to reach a larger incidence angle. Therefore, the need for a specific shooting form for each individual player becomes clear.

Finally, the question should be asked how high a shot would need to be to reach an incidence angle of 90°. A look at the functional equation leads to the conclusion that it is impossible to let the ball fall straight down through the basket with a regular shot: the slope at x=0 would have to be infinite. Therefore we assume an incidence angle of 89° as an approximation. The functional equation is determined using equations (8) to (13) as follows:

y = f(x) = -13.721x² + 57.290x + 3.05    (19)
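The maximum of (19) can be checked with the vertex formula of a parabola, y = c - b²/(4a); a quick sketch:

```python
a, b, c = -13.721, 57.290, 3.05   # coefficients of equation (19)
x_max = -b / (2 * a)              # x-coordinate of the vertex
y_max = c - b ** 2 / (4 * a)      # maximum height of the trajectory
print(f"vertex at x = {x_max:.2f} m, maximum height = {y_max:.2f} m")
```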
The maximum of the functional equation is at a height of y = 62.85m, an unrealistic height for a basketball shot. The power and the impulse required to shoot a basketball with a mass of 600g to a height of 62.85m can be evaluated in a physics lesson, as well as the question of how many human beings would be able to produce such a shot.

Possible sources of error. At the beginning it needs to be mentioned that the model of the basketball as a point mass is an idealization. Contrary to the basketball, the point mass has no volume. Along with this, the rotation around the three spatial axes is ignored. Many basketball players shoot with a backspin, which means that the ball rotates as if it were rolling backwards on a plane. This spin induces stability of the trajectory. In this context it has to be discussed whether modelling a trajectory parabola is correct or whether a ballistic curve is more appropriate. Due to ball rotation and air friction there is deceleration as well as the Magnus effect known from physics. The latter is the reason why soccer players are able to do a "banana kick" and table tennis players are able to play a "curve ball". In addition, the data regarding the different lengths are imprecise: the exact distance between the center of the basket and the point where the ball leaves the shooter's hands is not known but is an estimate which varies between individuals. The same applies to the height of the point where the ball leaves the shooter's hands, which largely depends on the body height of the player. For the purpose of pure calculation and the enrichment of mathematics lessons these deviations are acceptable.

Covered topics in mathematics. As mentioned before, at the beginning of this or analogous tasks, mathematical modelling is mandatory. At the same time the height where the ball leaves the shooter's hands needs to be estimated, since it cannot be determined exactly. Moreover, the height when dropping the ball can differ throughout the game, so that using a mean is practicable for this task. The expertise of modelling and estimating must be trained. It is not an ability which every person is capable of right away. In fact, students must be introduced to this challenge through tasks with an increasing level of difficulty.

Another topic which can be dealt with during lessons is the calculation of percentages. While evaluating the trajectory of a shot, the shooting percentages of a player from different positions on the court were mentioned. Students can discuss the meaning of a shooting percentage for the next shot. Can players deviate from their own percentages during one season? Do they have to miss their next shot if their percentage in one game is above their average? Is a successful shot guaranteed if a player usually scores 50% of his shots and has missed his only shot on that day? In this context the terms absolute and relative frequency, as well as probability, can be addressed and applied to athletics in general.

To be able to make a quantitative analysis, the American units of length were transformed into the European ones at the beginning. Thus, the students not only learn how to convert units but also understand why the basket is exactly three metres and five centimetres high: because it equates to ten feet in the American units of length.
Furthermore, the students learn to draw a sketch to illustrate and understand problems, as well as to explain them to their classmates. Beyond that, they learn to extract information from their classmates' sketches or other illustrations. The whole task is designed to deal with aspects of analysis, which is also covered in regular classes. At this point new aspects are reasonably combined with the old ones to complement each other. Of course, quadratic equations are focused on. Students evaluate derivatives, maxima and minima, and reconstruct a functional equation with the help of a few known points. They do so by evaluating systems of equations and implementing their knowledge about trigonometrical functions.

SUMMARY
Mathematical modelling can greatly enrich math lessons in school. Like every other didactical method, though, it may not be the only way of teaching; it is a reasonable addition to many other didactical methods. Besides, it has to be introduced slowly and with caution, just like team work: students do not learn how to work together gainfully overnight, and neither can they construct a mathematical model ad hoc. The greatest benefit of this type of setting a task is being able to adjust the task to the interests of the class and of individual students respectively. If students are not interested in sports, this particular example should not be used because the intrinsic motivation will not be raised. In addition, this particular example shows that mathematical modelling can be introduced early. It is the teacher's task to single out aspects along with the relevant considerations and evaluations, ranging from converting units to dealing with trigonometrical functions in combination with a second order equation. It is an instrument to enrich lessons at every single class level.

REFERENCES

Blum, W. (2002). ICMI Study 14 - Applications and modelling in mathematics education. Educational Studies in Mathematics, 51, 149–171.
Blum, W. (2007). Mathematisches Modellieren – zu schwer für Schüler und Lehrer? Beiträge zum Mathematikunterricht. Retrieved February 24, 2010, from http://www.mathematik.uni-dortmund.de/ieem/BzMU/BzMU2007/Blum.pdf
Brockhaus GmbH. (2004). Brockhaus-Enzyklopädie in 5 Bänden (10th ed.). Leipzig.
Ewers, C. (2004, January 15). Angewandte Theorie. DIE ZEIT.
International Basketball Federation. (2008). Official basketball rules.
Henning, H., & Keune, M. (2002). Modelling a spreadsheet calculation. In I. Vakalis, D. H. Hallett, C. Kourouniotztis, D. Quinney, & C. Tzanakis (Eds.), Proceedings of the second international conference on the teaching of mathematics. Hersonissos: Wiley, ID 114 CD-Rom.
Henning, H., & Keune, M. (2004). Levels of modelling competences. In H.-W. Henn & W. Blum (Eds.), ICMI Study 14: Applications and modelling in mathematical education (pp. 115–120).
Jablonka, E. (1996). Meta-Analyse von Zugängen zur mathematischen Modellbildung und Konsequenzen für den Unterricht. Berlin: transparent.
Kaiser, G., & Borromeo Ferri, R. (2008). Realitätsbezüge und mathematische Modellierung. Einführung in die Mathematikdidaktik. University of Hamburg, unpublished.
Keune, M., Henning, H., & Hartfieldt, C. (2004). Niveaustufenorientierte Herausbildung von Modellbildungskompetenzen im Mathematikunterricht. In H. Henning (Ed.), Technical Report No. 1. University of Magdeburg.
Kultusministerium des Landes Sachsen-Anhalt. (2003). Rahmenrichtlinien Gymnasium, Mathematik Schuljahrgänge 5–12. Retrieved March 15, 2010, from http://www.nba.com/playerfile/dirk_nowitzki/career_stats.html
Maass, K. (2004). Mathematisches Modellieren im Unterricht - Ergebnisse einer empirischen Studie. Berlin: Franzbecker.
NSW Department of Education and Training. (2006). Curriculum K-12 directorate. Mathematical modelling and the general mathematics syllabus. Retrieved July 8, 2010, from http://www.curriculumsupport.education.nsw.gov.au/secondary/mathematics/assets/pdf/s6_teach_ideas/cs_articles_s6/cs_model_s6.pdf
Retrieved October 12, 2010, from http://www.osovo.com/diagram/basketballcourt.gif
OECD. (1999). Measuring student knowledge and skills: A new framework for assessment. Paris.
Projektgruppe SINUS-Transfer Sachsen-Anhalt des Landesinstituts für Lehrerfortbildung, Lehrerweiterbildung und Unterrichtsforschung von Sachsen-Anhalt (LISA) (Eds.). (2008). Kompetenzentwicklung im Mathematikunterricht. Halle.
Schmidt, P. (2007). Präzisionsoptimierung des Basketballwurfs. Retrieved March 15, 2010, from http://www.vde.de/de/Regionalorganisation/Bezirksvereine/Nordbayern/YoungNetregional/Schuelerwettbewerbe/Schuelerforum/10Schuelerforum/Documents/MCMS/H811_88T_Schmidt_Wurfpraezision.pdf
Tanner, H. F. R., & Jones, S. A. (1995). Teaching mathematical thinking skills to accelerate cognitive development. In Proceedings of the 19th Psychology of Mathematics Education conference (PME-19) (pp. 121–128). Recife.
University of Mainz. (2006). BASKETBALL – REGELN, FB 26. Retrieved March 15, 2010, from http://www.sport.uni-mainz.de/Schaper/Dateien/ZusammenfassungRegelwerk.doc
Herbert Henning
Institute for Algebra and Geometry
Otto-von-Guericke University of Magdeburg
[email protected]

Benjamin John
Otto-von-Guericke University of Magdeburg
[email protected]
PATRICK JOHNSON
9. EXPLORING THE FINAL FRONTIER Using Space Related Problems to Assist in the Teaching of Mathematics
INTRODUCTION
Gene Roddenberry, the creator of Star Trek, described space as the final frontier – the concluding step along a line of exploratory endeavours that started when people first embarked on journeys of discovery to learn more about the world around them. In the last 50 years we have become increasingly captivated and mystified by the possibilities of what exists beyond our planet. This heightened interest in all things space related presents teachers of mathematics with an opportunity to motivate their students and demonstrate the relevance of mathematics in an interesting real-world context. It has long been known that students commonly identify mathematical abstraction and lack of relevance as causative factors for their dislike of, and failure in, mathematics (Singh, 1993).
The 20th century saw scientists and engineers working on endeavours to explore and explain the mysteries of the Universe. These endeavours present mathematics and science teachers with opportunities to bring real-world data and ideas into the classroom to motivate their students and demonstrate the relevance of their subject discipline. Boaler (1994) also stresses this point by stating that contexts are used in order to motivate and interest students, providing them with examples which enrich and enliven the curriculum. The high level of interest displayed by the media and the public in the solar system and space-related issues presents an opportunity for educators to teach topics in an exciting context while at the same time engaging students' interest. This chapter explores several space related ideas by providing the necessary background and then problems that can be solved using an assortment of standard school level mathematical concepts and tools.
SCIENTIFIC NOTATION
Scientific notation is used to express numbers that are too large or too small to be conveniently written using standard decimal notation. Scientific notation results in a number being written in the form $a \times 10^b$, where $a$ is a real number with $1 \le a < 10$ and $b$ is an integer. This form of number will be familiar to most of us, as we will have dealt with numbers of this nature when learning the rules of indices. It is in fact the rules of indices that will be used here to allow us to simplify and hence evaluate problems involving scientific notation. Many applications involving scientific notation exist in the study of space – the distances between astronomical objects, for example, result in measurements too large to be written using standard decimal notation. The following problems rely on students being both familiar and comfortable with the rules of indices.

Question: You are told that the mass of the sun is $1.98 \times 10^{33}$ grams and the mass of a single proton is $1.67 \times 10^{-24}$ grams. How many protons are in the sun?

Solution: To solve this problem we simply need to divide the mass of the sun by the mass of a single proton. The difficulty is in dealing with the mass when it is written using scientific notation. To answer this question we divide the coefficients (the number $a$ in the form above) by each other and then divide the base 10 powers by each other. The rules of indices need to be used when dividing the powers. Recall that

$$\frac{10^x}{10^y} = 10^{x-y}$$

and so in the problem here we have

$$\frac{10^{33}}{10^{-24}} = 10^{33-(-24)} = 10^{57}.$$

Hence the final answer is

$$\frac{1.98 \times 10^{33}}{1.67 \times 10^{-24}} = 1.19 \times 10^{57} \text{ protons.}$$
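The arithmetic is easy to check on a computer. The following short Python sketch is an addition to the text, not part of the original problem, and shows how scientific notation carries over directly into code:

```python
# Verifying the proton count; Python reads and writes scientific
# notation natively, so the hand calculation maps over directly.
mass_sun = 1.98e33      # grams, from the problem statement
mass_proton = 1.67e-24  # grams

protons = mass_sun / mass_proton
print(f"{protons:.2e}")  # -> 1.19e+57
```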
Another area of space mathematics where scientific notation is commonly used is in determining the time it takes particles to move over the vast distances between objects in space. The following example deals with Coronal Mass Ejections (CME). A Coronal Mass Ejection is an ejection of material from the solar corona of the Sun, as shown in Figure 1 below.
Figure 1. Coronal Mass Ejection from the Sun. Image courtesy of SOHO/EIT consortium. SOHO is a project of international cooperation between ESA and NASA.
The ejected material, normally electrons and protons, spreads out from the Sun, and when it reaches the Earth it collides with and affects the magnetosphere surrounding the Earth. It is the collision of the CME with the Earth's magnetosphere that causes the release of power that results in the Northern and Southern Auroras. This release of power can affect satellites and radio transmissions and even cause power blackouts, and so we need to be able to accurately predict when these emissions will collide with the magnetosphere.

Question: A Coronal Mass Ejection travels a distance of $1.56 \times 10^{11}$ metres in 14 hours. What is the speed in kilometres per second?

Solution: We know that

$$\text{Speed} = \frac{\text{Distance}}{\text{Time}}.$$
To work out the time in seconds (as we are asked for the final answer in terms of seconds) we first determine how many seconds there are in an hour (3,600) and write this in scientific notation as $3.6 \times 10^3$ seconds. Hence our distance divided by time formula gives

$$\frac{1.56 \times 10^{11}}{14 \times (3.6 \times 10^3)} = 0.0310 \times 10^8 \text{ metres per second.}$$
Since it is the norm when using scientific notation to always have the decimal point between the first two non-zero digits of the coefficient (recall $1 \le a < 10$), we re-write the answer above as $3.10 \times 10^6$ metres per second. Since we are looking for our answer in kilometres per second we divide this by 1000, or $1 \times 10^3$. This yields $3.10 \times 10^3$, or roughly 3,100 kilometres per second.

Note for Teachers: How fast is this? Perhaps have your students work out the distance between two cities – the distance from London to Cairo is about 3,500 km. If a flight takes approximately 4 hrs 30 mins to cover this distance, then they can compute the average speed at which the plane travels. From this and other calculations of this nature involving trains or cars they can gain a truer appreciation of how fast 3,100 kilometres per second really is!

RADIATION EXPOSURE
Every day we are exposed to different forms and varying amounts of radiation. Most of this radiation is harmless, but if we accumulate too much over a period of time it can potentially do damage. Most of the radiation that we are exposed to here on Earth is electromagnetic radiation from sources such as sunlight, power cables and radio waves. Radiation from sunlight is not dangerous to us because the Earth's magnetic field shields us from most of the harmful galactic radiation. Astronauts who stay for prolonged periods on board
the International Space Station are unfortunately exposed to higher levels of radiation due to their position in space. Because of this exposure and the future prospect of extended space voyages, new and more accurate ways of measuring space radiation and shielding astronauts from it are currently being examined by both NASA (National Aeronautics and Space Administration) and ESA (European Space Agency). A number of probes and spacecraft launched in the last decade have been equipped with a new monitoring device called a Standard Radiation Environment Monitor (SREM). These devices are extremely sensitive to the highly charged particles emitted by the Sun and also to other radiation originating from interstellar space. The SREM's main purpose is to identify radiation hazards threatening its host spacecraft, but also to yield a detailed picture of the space radiation environment in our solar system. Due to the varied locations of the SREMs within the solar system, scientists are now able to monitor solar particle events at the same time on similar instruments and so can produce a more accurate picture of radiation levels within our solar system. In Europe radiation dosages are measured in the standard SI units of grays or sieverts. One gray is defined as the absorption of one joule of radiation by one kilogram of matter. One sievert, on the other hand, is the measure of the equivalent dose. The equivalent dose is a measure of the radiation dose to tissue in which an attempt has been made to allow for the different relative biological effects of different types of radiation. Equivalent dose is therefore a more biologically significant measure of radiation and hence more commonly used when talking about radiation affecting human tissue. A radiation exposure of less than 1 sievert has no adverse health effects. An exposure measuring between 1 and 2 sieverts will result in non-fatal illness. An exposure measuring between 2 and 5 sieverts will cause serious illness that may even result in death. An exposure reading greater than 5 sieverts is fatal. Now that we have discussed radiation and know how to measure and interpret the results, we can look at the following problem.

Question: An astronaut travels to the Moon and spends one week on the lunar surface before returning to Earth. Calculate the total amount of radiation exposure in sieverts.

Solution: Let's say the astronaut spends one day in Earth orbit before departing, where the radiation dosage is $1.9 \times 10^{-1}$ mSv (millisieverts) per day. On the way out the astronaut passes through the Van Allen belts (torus-shaped regions of high radiation surrounding the Earth), which takes half a day at an exposure level of 3.0 mSv/day. The journey to the Moon takes two days at $5.0 \times 10^{-1}$ mSv/day. The stay on the lunar surface under shielded conditions amounts to $3.0 \times 10^{-1}$ mSv/day. The astronaut returns to Earth along a similar course taking the same amount of time, but does not spend a day in Earth orbit at the end of the trip.
To solve this problem we examine each part of the astronaut's trip separately and then sum the total radiation exposure.

One day in Earth orbit: $1.9 \times 10^{-1}$ mSv/day × 1 day = 0.19 mSv
Passage through Van Allen belts: 3.0 mSv/day × 0.5 days = 1.5 mSv
Journey to Moon: $5.0 \times 10^{-1}$ mSv/day × 2 days = 1.0 mSv
Stay on Moon: $3.0 \times 10^{-1}$ mSv/day × 7 days = 2.1 mSv
Return journey from Moon: $5.0 \times 10^{-1}$ mSv/day × 2 days = 1.0 mSv
Passage through Van Allen belts: 3.0 mSv/day × 0.5 days = 1.5 mSv

Summing these values, we determine the total radiation exposure to be 7.29 mSv. Converting millisieverts to sieverts, the total exposure is $7.29 \times 10^{-3}$ Sv, or 0.00729 Sv. This value is well within the acceptable radiation limits, and so the astronaut should not experience any illness due to radiation exposure.
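Dose bookkeeping of this kind is a natural candidate for a short spreadsheet or program. The sketch below is an illustrative addition (the segment labels are just ours) that reproduces the sum:

```python
# Each leg of the trip as (label, dose rate in mSv/day, duration in days),
# using the values from the problem statement.
segments = [
    ("Earth orbit",            0.19, 1.0),
    ("Van Allen belts (out)",  3.0,  0.5),
    ("Journey to Moon",        0.5,  2.0),
    ("Lunar surface",          0.3,  7.0),
    ("Return journey",         0.5,  2.0),
    ("Van Allen belts (back)", 3.0,  0.5),
]

total_mSv = sum(rate * days for _, rate, days in segments)
print(f"{total_mSv:.2f} mSv = {total_mSv / 1000:.5f} Sv")  # 7.29 mSv = 0.00729 Sv
```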
AURORA PROBLEMS

The sun is our planet's main source of light, but in addition to light the sun also emits particles, mainly electrons and protons, into space, as we already mentioned when we discussed CMEs. Due to the intensity of the light given off by the sun it is almost impossible during the day to see any effect of these particle emissions, but as the earth rotates and night time comes the effects of these solar emissions become more apparent. The dazzling, hypnotic glow of the Auroras in the night skies of both the Northern and Southern hemispheres alerts us to an invisible clashing of forces in the skies above us. Auroras, sometimes referred to as the northern and southern (polar) lights, are natural light displays in the sky, particularly in the Polar Regions. They occur when the emitted particles (carried on solar winds) collide with the Earth's magnetic field high up in the ionosphere and result in a greenish or reddish glow in the night sky, as seen in Figure 2. As mentioned before, they are most visible closer to the poles due to the longer periods of darkness and the intensity of the magnetic field. Sunlight takes just over eight minutes to travel the 150 million kilometres from the Sun to the Earth (speed of light ≈ 300,000 kilometres per second). The emitted particles are carried on solar winds towards the Earth. These solar winds, or solar storms as they are sometimes called, travel at speeds of between 250 kilometres per second and 2,500 kilometres per second. Thus, it takes the solar wind particles somewhere between 17 hours and 7 days to travel to Earth. Based on this knowledge, scientists are able to predict when there will be "high" or "low" Aurora activity. For years people viewing the Auroras had no idea at what height above them an Aurora was located. Up until the late 19th century, Aurora observers tried to determine the height of the Auroras by the method of triangulation. Triangulation is the process of determining the location of an object by measuring angles to it from known points rather than measuring distances to the point directly – which in this case was impossible.
Figure 2. Solar emissions interacting with the Earth’s magnetosphere. Image courtesy of SOHO/EIT consortium. SOHO is a project of international cooperation between ESA and NASA.
One of the earliest of these recorded measurements was made by the French scientist Jean-Jacques d’Ortous de Mairan between 1731 and 1751. From two polar stations located 20 km apart, observers measured the angles A and B between the ground and a specific spot on an Aurora as can be seen from the diagram in Figure 3.
Figure 3. Diagram of original estimates of Aurora height.
If, for example, the observers at station A recorded an angle of 58 degrees and the observers at station B measured an angle of 114 degrees, we can then use the Sine Rule to determine the height of the Aurora. Recall that for any triangle ABC the Sine Rule states that

$$\frac{a}{\sin(A)} = \frac{b}{\sin(B)} = \frac{c}{\sin(C)}$$

where $a$, $b$ and $c$ are the lengths of the three sides opposite the angles $A$, $B$ and $C$ respectively. Applying the Sine Rule to the triangle shown in Figure 3 (whose third angle, at the Aurora, is $180° - 58° - 114° = 8°$) yields

$$\frac{20}{\sin(8°)} = \frac{x}{\sin(58°)}$$

This gives us a value of 121.86 km for the length of the side opposite the angle A, as can be seen in Figure 4.
Figure 4. Updated height calculations.
Using this value we can now work out the length of the side 'h' shown in Figure 4, which represents the height of the Aurora above the ground. The angle 'D' is easily calculated as $180° - B$, which yields 66 degrees. Then, using the Sine Rule again, we have

$$\frac{h}{\sin(66°)} = \frac{121.86}{\sin(90°)}$$

This gives a value of 111.33 km for 'h'. This value is consistent with what we have already stated with respect to the Auroras occurring in the Earth's ionosphere – the ionosphere is located approximately between 50 km and 500 km above the Earth's surface.
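A few lines of code make it easy to repeat the triangulation for other observed angles. This Python sketch is an addition to the text, using the station angles and baseline from the example above:

```python
import math

A_deg, B_deg, baseline_km = 58, 114, 20

C_deg = 180 - A_deg - B_deg                      # angle at the aurora: 8 degrees
side = baseline_km * math.sin(math.radians(A_deg)) / math.sin(math.radians(C_deg))
h = side * math.sin(math.radians(180 - B_deg))   # height above the ground

print(f"side = {side:.2f} km, height = {h:.2f} km")  # ~121.87 km, ~111.33 km
```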
Questions of this nature are often encountered by surveyors who are attempting to determine the height of a building, a tree, an electricity pylon etc. By using trigonometry as well as a clinometer (a device for measuring angles of elevation) the height of the required structure can be easily determined without requiring it to be physically measured. We have just determined a method for calculating the height of an Aurora above the surface of the Earth. A subsequent question related to this problem might be the following. Question: If an Aurora is currently visible in the night sky directly above your head, will the Aurora also be visible to an observer 1,500km directly south of your location? Solution: We will solve this problem by calculating the maximum distance an observer can be away from the Aurora so that it is still visible above their horizon. We can see from Figure 5 that if the Aurora is located at the point ‘C’ at a height ‘h’ above the surface of the Earth then the maximum distance south that an observer can be and still see the Aurora is the point ‘P’. The point ‘P’ is the intersection point of a straight line through the point ‘C’ that is tangential to the circle representing the Earth (assume the Earth is a perfect sphere). This tangent line is actually the horizon line of the observer located at ‘P’. Moving further south than ‘P’ on the surface of the Earth would result in the Aurora dropping below the observer’s horizon and thus no longer being visible.
Figure 5. Determining maximum visible range of a sighting.
An easy way of expressing the distance between yourself and the observer located at 'P' is in terms of the angle 'A' between the two of you. Therefore, we need to determine the angle 'A' (the angle of latitude). Looking at the right-angled triangle OPC formed in Figure 5 we can conclude that

$$\cos(A) = \frac{R}{R+h}.$$

Using the fact that we know the radius of the Earth (6,378 km) and the height of the Aurora above our head (let's say 320 km), we can now determine the angle between us and the observer located at 'P':

$$A = \cos^{-1}\left(\frac{R}{R+h}\right)$$
$$A = \cos^{-1}\left(\frac{6{,}378}{6{,}378 + 320}\right) = 17.78 \text{ degrees}$$

Knowing the angle between you and the observer located at 'P', we can now determine the arc length, i.e. the distance along the circumference of the Earth between you and the observer. The circumference of the Earth is $2\pi R$ and the maximum range of the observer is 17.78 degrees south of your location. Therefore, the maximum viewing distance is located

$$\left(\frac{17.78}{360}\right) \times 2\pi(6{,}378) = 1{,}978 \text{ km}$$

south of your location along the circumference of the Earth. This is the maximum distance an observer can be located from your position and still see the Aurora. The original question asked if an observer located 1,500 km south of your position could still see the Aurora – based on the result just determined, the observer should be able to see the Aurora from their current location provided they have a clear sky.
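The same computation, written as a short Python function (an illustrative addition; the function name is ours):

```python
import math

R = 6378.0  # Earth radius in km

def max_viewing_distance(h):
    """Maximum arc distance (km) from which an object at height h km,
    directly above a given point, is still on the observer's horizon."""
    A = math.acos(R / (R + h))   # latitude angle in radians
    return A * R                 # arc length = angle * radius

print(f"{max_viewing_distance(320):.0f} km")  # ~1979 km (the text rounds to 1,978)
```

The teacher questions that follow can all be explored by calling this function, or inverting it, for different heights.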
Notes for Teachers: Using the question just completed, it is possible, as it is with most applied problems, to come up with follow-up questions to demonstrate to the students that the solution techniques applied to the specialised problem can be applied more generally to solve other problems of a similar nature. The following are some more general questions, based on the example just completed, that you can use with your students to allow them to expand and test their understanding of the solution technique employed above.
What is the angle of latitude formed between you and the observer located 1,500 km south of you?

If, at the same time that you were viewing the Aurora, a meteor passed overhead at a height of 120 km, would the meteor have been visible to the observer located 1,500 km south of your position?

What is the minimum height at which an object would have to be located above your head for it to be visible to the observer located 1,500 km south of your position?

CALCULUS PROBLEMS
Additionally, it is very easy to set up a problem based on the type of question just covered so that calculus is involved. For example, you could ask the following question.

Question: What is the rate of change of the line of sight radius, CP in Figure 5, for each additional kilometre the Aurora is above the ground?

Solution: In plain terms, we are trying to figure out what difference each additional kilometre in the height of the Aurora makes to the maximum viewing location of the observer. To solve this problem we first of all need to derive a formula for the line of sight radius CP. This is straightforward to do: observing that the triangle OPC in Figure 5 is right-angled, we can conclude that

$$(R+h)^2 = (CP)^2 + R^2 \;\Rightarrow\; (CP)^2 = (R+h)^2 - R^2$$
Expanding the right hand side of this equation and then taking the square root of both sides yields

$$CP = (h^2 + 2Rh)^{\frac{1}{2}}$$

We are attempting to calculate the rate of change of the line of sight with respect to a change in the height of the Aurora, so in mathematical notation we are looking to calculate

$$\frac{d(CP)}{dh}$$
By inspection we can conclude that we will have to use the chain rule when differentiating our equation for $CP$. Let $U = h^2 + 2Rh$, so that

$$\frac{d(CP)}{dh} = \frac{d(CP)}{dU} \times \frac{dU}{dh}$$

This yields

$$\frac{d(CP)}{dh} = \frac{1}{2}U^{-\frac{1}{2}} \times (2h + 2R) = \frac{1}{2}(h^2 + 2Rh)^{-\frac{1}{2}}(2h + 2R)$$

Looking back at the original question relating to the Aurora, we know that $h = 320$ km and $R = 6{,}378$ km. Filling these values into the equation for the rate of change of the line of sight radius with respect to the height of the Aurora, we arrive at an answer of 3.274 km. This means that for every 1 km increase in the height 'h', the line of sight radius 'CP' will increase by 3.274 km.
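The differentiation can also be checked symbolically. The following sketch uses the SymPy library, an optional aid rather than part of the original text:

```python
import sympy as sp

h, R = sp.symbols('h R', positive=True)
CP = sp.sqrt(h**2 + 2*R*h)

dCP_dh = sp.diff(CP, h)   # simplifies to (h + R)/sqrt(h**2 + 2*R*h)
print(sp.simplify(dCP_dh))
print(float(dCP_dh.subs({h: 320, R: 6378})))  # -> ~3.274 km per km of height
```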
DETERMINING VISIBLE SURFACE AREA

Another problem involving the Earth and objects above its surface that is encountered by scientists and astronauts is determining the amount of the Earth's surface area that is visible from a particular height in space. Figure 6 shows a diagram of this situation. Let's assume that a space shuttle is located at the point 'D', a distance 'h' above the surface of the Earth. The astronauts want to know what percentage of the Earth's surface is visible from the space shuttle.
Figure 6. Space shuttle located at 'D' above the surface of the Earth.
This problem requires us to know a little about surfaces of revolution before we can solve it. If in Figure 6 we assume that the point 'A' is the origin, then what we want to do is calculate the surface area of the region found by rotating the hashed region around the y-axis between the points 'E' and 'B'. The area of a surface of revolution can be found by integrating an area element (say $dS$). This area element is constructed by rotating an arc length ($dl$) of a curve about a given line – in our case the y-axis, as shown in Figure 7. If the radius of the element $dl$ is taken to be $r$, then on rotation it will generate a circular band of width $dl$ and length (circumference) $2\pi r$. The area of the band will therefore be

$$dS = 2\pi r \, dl.$$
Figure 7. Construction of the area element dS .
Since we are rotating around the y-axis, we shall think of the coordinates of a point on the arc 'EC' in Figure 6 as having the form $(g(y), y)$. This means that the curve representing the arc 'EC' will have the form $x = g(y)$. To determine the surface area of the entire region as the radius $r$ varies between $y = B$ and $y = E$, we now integrate the element $dS$. Recall that we are rotating around the y-axis, and so the radius of each circular band is the x-coordinate of the points along the curve 'EC'. This now yields

$$S = 2\pi \int_{y=B}^{y=E} |x| \, dl$$

which gives

$$S = 2\pi \int_{y=B}^{y=E} |g(y)| \, dl$$
The final step is to find a representation for $dl$, the arc length, that depends only on the variable $y$. The differential triangle shown in Figure 8 suggests that

$$(dl)^2 = (dx)^2 + (dy)^2$$
Figure 8. Differential triangle.
Since our curve is specified by an equation of the form $x = g(y)$, we can divide both sides by $(dy)^2$ and then take the square root to get

$$\left(\frac{dl}{dy}\right)^2 = \left(\frac{dx}{dy}\right)^2 + 1$$

$$dl = \sqrt{1 + \left(\frac{dx}{dy}\right)^2}\; dy$$

Since $x = g(y)$, we can clearly see that

$$\frac{dx}{dy} = g'(y)$$

and so the equation for the arc length can be written fully in terms of $y$ as

$$dl = \sqrt{1 + (g'(y))^2}\; dy.$$
This means that the final equation for the area of the surface of revolution is

$$S = 2\pi \int_{y=B}^{y=E} |g(y)| \sqrt{1 + (g'(y))^2}\; dy.$$
Now that we have an equation that tells us the amount of surface area that is visible from a height 'h' above the surface of the Earth, we need to attempt to simplify this equation. We start by returning to Figure 6 and noting that the radius of the Earth is 'R', '$g(y)$' represents the x-coordinate of a point on the arc 'EC' and '$y$' represents the y-coordinate of a point on the arc 'EC'. Using this information, along with the fact that the triangle ABC is right-angled, we can conclude that

$$g(y) = \sqrt{R^2 - y^2}$$
We next attempt to determine the coordinates of the points 'B' and 'E' along the y-axis, which are the limits of our integral. The coordinates of 'E' are straightforward: it is located on the y-axis and also on the circumference of the Earth, and so must be a distance 'R' above the origin 'A'. To determine 'B' we note that the triangles ABC and ACD are similar (one maps onto the other under uniform scaling). Due to this fact we have

$$\frac{AB}{R} = \frac{R}{R+h}$$

$$AB = \frac{R^2}{R+h}$$
Since ‘A’ is the origin, the result for ‘AB’ is actually the y-coordinate of the point ‘B’ as required. Placing all this new information back into the original equation for the area of the surface of revolution we get
$$S = 2\pi \int_{y=\frac{R^2}{R+h}}^{y=R} \sqrt{R^2 - y^2}\; \sqrt{\frac{R^2}{R^2 - y^2}}\; dy$$

$$S = 2\pi \int_{y=\frac{R^2}{R+h}}^{y=R} R \; dy$$

From this we can conclude that the surface area of the Earth visible from a height 'h' above the Earth is

$$2\pi R \left[ R - \frac{R^2}{R+h} \right] = \frac{2\pi R^2 h}{R+h}.$$
Originally we were asked to find the percentage of the Earth's surface that is visible from a height 'h'. To do this we note that the total surface area of the Earth
is $4\pi R^2$ (assuming that the Earth is a perfect sphere), and by determining the ratio of visible Earth surface to total Earth surface we have

$$\frac{2\pi R^2 h}{4\pi R^2 (R+h)} = \frac{h}{2(R+h)}$$
Knowing 'R', the radius of the Earth (6,378 km), and the height 'h' of the space shuttle above the surface of the Earth, this simple formula will determine the percentage of the Earth's surface that will be visible to the astronauts.

Question: On March 23, 2009 two members of the Space Shuttle Discovery crew, Richard Arnold and Joseph Acaba, performed an extravehicular activity while attached to the ISS at an altitude of 350 km above the Earth's surface. What percentage of the Earth's surface did they see?

Solution: Using the equation we have just derived, we can conclude that

$$\frac{h}{2(R+h)} = \frac{350}{2(6{,}378 + 350)} = 0.026$$

or 2.6% of the Earth's surface was visible to the astronauts. Another follow-up question on this topic might be the following:

Question: On July 20th 1969, Commander Neil Armstrong and Lunar Module Pilot Edwin 'Buzz' Aldrin, Jr. became the first humans to walk on the Moon. The distance between the Earth and the Moon at the time of their landing was 393,309 km. What percentage of the Earth's surface was visible to them?
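As an illustrative aside (not part of the original chapter), the visible fraction is a one-line function in code, and evaluating it also exposes the limiting behaviour, that no observer ever sees more than half the sphere:

```python
R = 6378.0  # Earth radius in km

def visible_fraction(h):
    """Fraction of the Earth's surface visible from height h km."""
    return h / (2 * (R + h))

print(visible_fraction(350))     # ~0.026, i.e. 2.6% from the ISS spacewalk
print(visible_fraction(393309))  # ~0.492 at the Earth-Moon distance
# As h grows without bound the fraction approaches 1/2.
```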
CONCLUSION

By opening our eyes, and our minds, it is possible to identify real-world mathematical problems that may be of interest to our students. Presented in this chapter are several different mathematical problems on the theme of space and the solar system. These problems cover a wide range of mathematical topics and so can be used at different stages of students' mathematical development, and hence with different student age groups. It is hoped that the problems presented here will encourage teachers to
develop their own mathematical questions based on real-world situations, or possibly encourage them to expand upon the problems presented here to come up with new and challenging questions for their students. The problems here have been presented as 'closed', completed questions, although teachers may find it more suitable to present the material as projects or even mini-projects for students to explore during group work sessions in the classroom. With this approach the teacher can choose to include or omit information from the questions and allow the students to debate and examine the questions to gain a deeper and fuller understanding of the context and the mathematical tools employed to arrive at the solution.

REFERENCES

Aurora. (n.d.). Retrieved April 22, 2010, from Wikipedia: http://en.wikipedia.org/wiki/Aurora_(astronomy)
Boaler, J. (1994). When do girls prefer football to fashion? An analysis of female underachievement in relation to 'realistic' mathematic contexts. British Educational Research Journal, 20(5), 551–564.
Radiation. (n.d.). Retrieved April 10, 2010, from Wikipedia: http://en.wikipedia.org/wiki/Radiation
Singh, E. (1993). The political dimension of adult numeracy: Conclusions of a survey into attitudes to mathematics. In C. Julie, D. Angelis, & Z. Davis (Eds.), Political dimensions of mathematics education 2: Curriculum reconstruction for society in transition (pp. 335–341). Cape Town: Miller Maskew Longman (Pty) Ltd.
Space Math @ NASA. (n.d.). National Aeronautics and Space Administration. Retrieved April 20, 2010, from http://spacemath.gsfc.nasa.gov/
Patrick Johnson
National Centre for Excellence in Mathematics and Science Teaching and Learning (NCE-MSTL)
University of Limerick
PATRICK JOHNSON AND JOHN O’DONOGHUE
10. WHAT ARE THE ODDS?
INTRODUCTION
Probability as a topic features on school curricula around the world and buying lottery tickets is an everyday experience for many young adults, especially in Ireland, when they support their local sports clubs or charities. Therefore there is a point of intersection between everyday experience of students and their school mathematics curriculum that can be exploited by mathematics teachers who are alert to the possibilities. But first the teachers must have the requisite background knowledge of the application and the mathematics involved. This chapter was developed with the busy mathematics teacher in mind who would like to present interesting applications of mathematics to his students but who does not have a lot of spare time to delve into the topic ab initio. An interesting pedagogical perspective is adopted that exploits students’ need to know how things work, in this case their need to know how lotteries work. LOTTERIES
Many different definitions of a lottery exist. A simple definition is that a lottery is the "drawing of lots in which prizes are distributed to the winners among persons buying a chance." Most modern lotteries are organised as charity events. National lotteries attempt to make money for charitable groups within the country, whereas local lotteries attempt to make money for their club or society. The general idea is that numbered balls/tickets are randomly drawn from a container/drum and prizes are given to those who hold the tickets with the same numbers as the ones that have been drawn out. The number of balls in the drum at the start of the draw and the number of balls that will be selected from the drum both affect the probability of an individual winning the top prize by matching all of the selected balls against their ticket.
We will start this section by first focusing on the Irish National Lottery and determining the odds of winning a prize with this system. We will then expand our focus to other lottery systems, particularly the EuroMillions lottery, and calculate the odds of winning in these systems.
There are 45 balls in the drum at the start of the Irish National Lottery game, from which six are randomly selected. To win the jackpot an individual must hold the ticket that matches all six of the chosen balls. The order in which the balls are drawn does not matter. We will start by working out the odds of winning the jackpot from a system consisting of 45 balls and then see what the chances are of matching a smaller selection of the winning numbers. The drum originally holds 45 differently-numbered balls. This means that an individual has a 1 in 45 chance of predicting the number of the first ball selected from the drum. Another way of thinking about this is that there are 45 different ways of choosing the first number. When it comes to the second number, there are now only 44 balls left in the drum (because the balls already drawn are not returned to the drum), so there is now a 1 in 44 chance of predicting this number. What this means is that each of the 45 ways of choosing the first number has 44 different ways of choosing the second. To work out the number of ways of correctly predicting the first two numbers drawn we multiply the total number of ways of selecting each number together – therefore the number of ways of correctly predicting 2 numbers drawn from 45 is

$$45 \times 44$$

This principle is known as the multiplication principle. The multiplication principle states that if one event can occur in $m$ ways and a second can occur independently of the first in $n$ ways, then the two events can occur in $mn$ ways. This principle can easily be expanded to cover more than two independent events. Therefore we can conclude that the number of ways of selecting 6 balls from 45 is

$$45 \times 44 \times 43 \times 42 \times 41 \times 40 = 5{,}864{,}443{,}200$$
The final thing that needs to be taken into consideration is that the order in which the balls are drawn from the drum does not matter. By this we mean that if a ticket has the numbers 1, 2, 3, 4, 5, and 6, it wins as long as all the numbers 1 through 6 are drawn, no matter what order they come out in. Hence, given any set of 6 numbers, there are $6 \times 5 \times 4 \times 3 \times 2 \times 1 = 6!$ or 720 ways they could be drawn. Therefore, to find how many ways there are of selecting 6 numbers from 45 when order doesn't matter, we divide the total number of ways of selecting 6 numbers from 45 by the total number of orderings of 6 numbers:

$$\frac{5{,}864{,}443{,}200}{720} = 8{,}145{,}060$$
This means that the chance of selecting the correct 6 numbers from the 45 balls is 1 in 8,145,060.

Did you know? Unlike in the United States, where lottery wins are taxed, European jackpots are generally tax-free and the winning jackpot is paid out immediately in one lump sum.

Another way of looking at the problem of selecting $r$ balls from $n$, which will prove more efficient in the long term, is to treat it as a counting problem. In this instance we use a mathematical idea called a combination. A combination can be defined as any arrangement of elements into groups without regard to their order in the group. This is exactly what we are looking for when choosing lottery balls. The formula for determining the number of ways of selecting $r$ objects from $n$ is given as

$$\binom{n}{r} = \frac{n!}{r!(n-r)!}$$

where $n!$ is factorial $n$ $(n \times (n-1) \times (n-2) \times \dots \times 2 \times 1)$. Understanding where this formula comes from is rather straightforward when we think again in terms of the lottery system. In the lottery system we are selecting 6 balls from 45. Writing this as a combination yields

$$\binom{45}{6} = \frac{45!}{6!(45-6)!}$$
We notice that $(45-6)!$ is $39!$. The $39!$ in the denominator cancels with every value from 39 down to 1 in the $45!$ in the numerator, and so we are left with

$$\binom{45}{6} = \frac{45 \times 44 \times 43 \times 42 \times 41 \times 40}{6!}$$
We recall that $45 \times 44 \times 43 \times 42 \times 41 \times 40$ is the total number of ways of choosing 6 objects from 45 and that $6! = 720$ is the total number of arrangements of 6 objects. Hence we can see that the combination approach to the lottery problem ($C(45,6)$ as it is sometimes written) yields the same result as we previously calculated.
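As a small aside (assuming Python 3.8 or later, where `math.comb` is available), the two routes to the same count can be checked directly:

```python
import math

print(math.comb(45, 6))                                   # 8145060
print(45 * 44 * 43 * 42 * 41 * 40 // math.factorial(6))   # the same value
```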
To determine the chance of matching any given number of balls we must divide the total number of possible combinations (for example, 8,145,060 when choosing 6 balls from 45) by the number of combinations producing the desired result. The denominator equates to the number of ways one can select the required winning numbers multiplied by the number of ways one can select the losing numbers. To obtain a score of $n$ (for example, if you are attempting to match 5 of the 6 balls drawn, then $n$ would be equal to 5), there are

$$\binom{6}{n}$$

ways of selecting the $n$ winning numbers from the 6 drawn balls. In the case of the losing numbers, there are

$$\binom{39}{6-n}$$

ways to select them from the 39 losing lottery numbers. Hence the total number of combinations giving that result is, using the multiplication principle again, the first number multiplied by the second. The expression is therefore

$$\frac{\dbinom{45}{6}}{\dbinom{6}{n}\dbinom{39}{6-n}}$$
Table 1 below lists the formulae and the associated chance of matching the selected number of balls for a lottery system consisting of 45 balls. We can see from Table 1 that you can be approximately 97% certain that you will match 2 or fewer numbers when picking 6 from 45. Based on these results it seems remarkable that so many people participate in a lottery on a regular basis, but if questioned on why they continue to play, most people would traditionally reply with a phrase like "someone has to win so why not me." If you look on the Irish National Lottery website you will see that the odds they list are slightly different from the odds determined here. The only odds that remain the same are the odds of winning the jackpot – i.e., matching all six numbers chosen. The difference between the game outlined here and the Irish National Lottery is that the Irish Lottery includes a "bonus ball", or "powerball", which is selected at the end of the traditional selection. Players then have the ability to win when they match 5, 4 or 3 numbers plus the bonus ball. It might not be obvious how this bonus ball affects the odds of just matching 5, 4 or 3 of the winning numbers, but this will now be explored in more detail.
Table 1. Chance and probability associated with a 45 ball game

| Match | Calculation | Chance of occurring (1 in …) | Probability of occurring (1/Chance) |
|-------|-------------|------------------------------|-------------------------------------|
| 0 | $\binom{45}{6} \big/ \binom{6}{0}\binom{39}{6}$ | 2.496475 | 0.4005646 |
| 1 | $\binom{45}{6} \big/ \binom{6}{1}\binom{39}{5}$ | 2.357782 | 0.4241272 |
| 2 | $\binom{45}{6} \big/ \binom{6}{2}\binom{39}{4}$ | 6.6017920 | 0.1514740 |
| 3 | $\binom{45}{6} \big/ \binom{6}{3}\binom{39}{3}$ | 44.562096 | 0.0224405 |
| 4 | $\binom{45}{6} \big/ \binom{6}{4}\binom{39}{2}$ | 732.7989204 | 0.00136463 |
| 5 | $\binom{45}{6} \big/ \binom{6}{5}\binom{39}{1}$ | 34,807.94872 | 0.000028729 |
| 6 | $\binom{45}{6} \big/ \binom{6}{6}\binom{39}{0}$ | 8,145,060 | 0.0000001227 |
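Table 1 can be reproduced in a couple of lines of code, which also lets students experiment with other drum sizes. A possible sketch (an addition to the text):

```python
import math

total = math.comb(45, 6)

# Chance and probability of matching exactly n of the six winning numbers.
for n in range(7):
    ways = math.comb(6, n) * math.comb(39, 6 - n)
    print(f"match {n}: 1 in {total / ways:,.4f} (probability {ways / total:.7f})")
```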
BONUS BALL OR POWERBALL
Many lottery systems involve a "bonus ball". The bonus ball can be drawn from the same pool of numbers as the main lottery or it can be selected from a different
drum of numbers. If the bonus ball is drawn from a pool of numbers different from the main lottery, then simply multiply the odds from the main lottery by the odds of getting the bonus ball to determine the overall odds. For example, in the 6 from 45 lottery system that we have already looked at, if there were another drum containing 8 bonus ball numbers, then the chance of getting a score of 3 and the bonus ball would be approximately 1 in

$$44.562096 \times 8 = 356.496768$$

In the case where more than one bonus ball is selected from the separate bonus ball drum, the approach is to treat the bonus ball selections as a mini-lottery by themselves and then simply multiply the bonus ball odds by the required main lottery odds, as shown before. The situation is slightly different when the bonus ball is drawn from the same pool of numbers as the main lottery. In this case, when calculating the number of winning combinations it is necessary to take the bonus ball into consideration. After the 6 balls which make up the main lottery selection are drawn, an extra ball is drawn from the same pool of balls, and this becomes the bonus ball. A more substantial prize is awarded for matching 5 plus the bonus ball than for just matching 5. We shall explore this case further by focusing on an example; as mentioned previously, the number of ways of matching 5 of the 6 selected balls is

$$\binom{6}{5}\binom{39}{1} = 234$$
Since your ticket has 1 unmatched number remaining, and because there are only 39 balls remaining, we can deduce that 1/39 of the 234 match-5 combinations will have their final ball matching the bonus ball. Therefore there are

$$\binom{6}{5}\binom{39}{1} \times \frac{1}{39} = 6$$
ways of matching 5 and the bonus ball. Since the total number of possible combinations is 8,145,060, we can conclude that there is a 1 in 1,357,510 (8,145,060/6) chance of matching 5 and the bonus ball. Similarly, we can conclude that 38 of the remaining 39 balls will not match the bonus ball, and so the chance of matching 5 without the bonus ball is

$$\frac{\dbinom{45}{6}}{\dbinom{6}{5}\dbinom{39}{1} \times \frac{38}{39}} = 35{,}723.94737$$
So the chance of matching 5 balls from a total of 45 in a lottery system where there is one bonus ball chosen from the same pool is approximately 1 in 35,724
(which is the result displayed on the Irish National Lottery website). Table 2 below shows the calculations and probability of winning in the Irish National Lottery – matching any combination of fewer than 3 balls does not result in a payout.

Table 2. Chance and probability associated with the Irish National Lottery

| Match | Calculation | Chance of occurring (1 in …) | Probability of occurring (1/Chance) |
|-------|-------------|------------------------------|-------------------------------------|
| 3 | $\binom{45}{6} \big/ \left[\binom{6}{3}\binom{39}{3} \times \frac{36}{39}\right]$ | 48.27560455 | 0.020714396 |
| 3 + Bonus | $\binom{45}{6} \big/ \left[\binom{6}{3}\binom{39}{3} \times \frac{3}{39}\right]$ | 579.3072546 | 0.0017261996 |
| 4 | $\binom{45}{6} \big/ \left[\binom{6}{4}\binom{39}{2} \times \frac{37}{39}\right]$ | 772.4096728 | 0.0012946497 |
| 4 + Bonus | $\binom{45}{6} \big/ \left[\binom{6}{4}\binom{39}{2} \times \frac{2}{39}\right]$ | 14,289.57895 | 0.0000699810 |
| 5 | $\binom{45}{6} \big/ \left[\binom{6}{5}\binom{39}{1} \times \frac{38}{39}\right]$ | 35,723.94737 | 0.0000279924 |
| 5 + Bonus | $\binom{45}{6} \big/ \left[\binom{6}{5}\binom{39}{1} \times \frac{1}{39}\right]$ | 1,357,510 | 0.0000007366 |
| 6 | $\binom{45}{6} \big/ \binom{6}{6}\binom{39}{0}$ | 8,145,060 | 0.0000001227 |
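The same-drum bonus-ball adjustment is easy to check numerically. A short sketch (not from the original text) verifying two of the Table 2 entries:

```python
import math

total = math.comb(45, 6)
match5_ways = math.comb(6, 5) * math.comb(39, 1)   # 234 ways to match exactly 5

print(total / (match5_ways * 1 / 39))    # 1357510.0 -> "5 + Bonus"
print(total / (match5_ways * 38 / 39))   # ~35723.95 -> "5" without the bonus
```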
With the Irish National Lottery the minimum buy-in is €3, which gets you two lines on a ticket, and each additional line costs €1.50. A question that is commonly asked with regard to lotteries is the following: "Is it better to play 20 lines in one week or two lines a week for 10 weeks?" To work out which approach gives the greater odds of winning, we recall that the chance of winning the lottery is 1 in 8,145,060. If we play 20 lines in a single week, the chance of winning is now

$$8{,}145{,}060 \div 20 = 407{,}253$$

When looking at the case of playing the lottery over a 10 week period, it is easier to work in terms of probabilities. Remember that the probability of an event occurring is the reciprocal of the chance of the event occurring. Therefore, in the case of playing 20 lines in the one week, the probability of winning is

$$\frac{1}{407{,}253} = 0.0000024555$$

We know that probabilities only exist between 0 and 1, where 0 means the event cannot occur and 1 means the event is guaranteed to occur. Additionally, we know that

Prob(Losing) = 1 – Prob(Winning)

Since the probability of a single line winning the lottery is 0.000000122, we can conclude that the probability of that line losing the lottery in any given week is

$$1 - 0.000000122 = 0.999999878$$
To get our final answer we need to slightly adjust our way of thinking. The probability of winning at least once in 20 plays (over the 10 week period) is the same as the probability of not losing 20 times in a row. Since we know the probability of losing, we can easily deduce the probability of losing 20 times in a row, and then the probability of not losing 20 times in a row can be found by taking 1 – Prob(losing 20 times in a row).

Probability of losing = 0.999999878
Probability of losing 20 times in a row = $(0.999999878)^{20}$ = 0.99999756
Probability of NOT losing 20 times in a row = 1 – 0.99999756 = 0.00000244

Therefore the probability of winning when you play 20 times over the 10 weeks is 0.00000244. The probability of winning from playing 20 lines in the same week is thus very slightly better – at full precision the two strategies differ only by about $3 \times 10^{-12}$; the larger gap visible above is an artefact of rounding to eight decimal places – which suggests that you are marginally better off playing one big lump sum in a given week rather than spreading it out over a number of weeks.
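Exact rational arithmetic makes the comparison transparent and sidesteps the rounding issue entirely. A possible sketch (an addition to the text):

```python
from fractions import Fraction

p = Fraction(1, 8_145_060)       # probability that one line wins the jackpot

p_one_week = 20 * p              # 20 distinct lines in a single draw
p_ten_weeks = 1 - (1 - p)**20    # at least one win in 20 independent plays

print(float(p_one_week))                 # 2.4554...e-06
print(float(p_ten_weeks))                # 2.4554...e-06
print(float(p_one_week - p_ten_weeks))   # ~2.9e-12, in favour of one week
```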
EUROMILLIONS
EuroMillions is a lottery game in which players from various European countries participate in a co-ordinated game with a common draw. The game is currently, as of 2010, available to players over the age of 18 (16 in the United Kingdom) in Austria, Belgium, France, Ireland, Luxembourg, Portugal, Spain, Switzerland and the United Kingdom. EuroMillions is similar to most lottery games, but due to the increased jackpot value (the minimum is normally €15 million) and the increased number of players, the odds against winning the jackpot are considerably longer than in a national lottery game. EuroMillions consists of two separate drums of balls. The main drum holds 50 balls from which 5 are selected. The second drum holds the "lucky balls" – there are nine lucky balls from which 2 are selected. This is an example of a bonus ball game in which the bonus balls are selected from a different drum to the main balls, unlike the Irish National Lottery game. We shall now examine how to determine the odds of winning in the EuroMillions game.

Did you know? The largest payout ever made by the EuroMillions lottery was on the 3rd of February, 2006 when the jackpot of €183 million, which had rolled over eleven times, was won by three individuals (two French and one Portuguese).

As mentioned previously, determining the odds in a game where the bonus balls are drawn from a pool that is different from the main pool is done by determining the odds of the main lottery and the bonus lottery separately and then multiplying the two odds together to work out the overall odds. We shall first look at the main lottery. This is a game consisting of 50 balls from which 5 are selected. Therefore the total number of ways of choosing 5 balls from 50 is

$$\binom{50}{5} = 2{,}118{,}760$$

At some stage we shall be attempting to determine the odds of selecting fewer than 5 winning numbers from the 50, so at this point it is beneficial to come up with a general formula to cover all the instances that we will encounter. As in the Irish lottery example, we shall look at the number of ways of obtaining the required number of winning numbers on the ticket (5, 4, 3 etc.) and the losing numbers on the ticket. Since we are selecting only 5 balls from the 50 balls in the pool, the winning numbers that we have on our ticket will have to be chosen from these balls. Therefore the total number of ways of selecting $w$ winning numbers from 5 is

$$\binom{5}{w}$$
Once the 5 winning numbers have been removed from the pool, there are 45 losing numbers remaining. Therefore the number of ways of selecting $l$ losing numbers from 45 is

$$\binom{45}{l}$$

From this we can conclude that the total number of ways of selecting $w$ winning numbers and $l$ losing numbers is (by the multiplication principle)

$$\binom{5}{w}\binom{45}{l}$$

and hence that the formula for calculating the chance of selecting $w$ winning numbers and $l$ losing numbers when drawing 5 balls from 50 is

$$\frac{\dbinom{50}{5}}{\dbinom{5}{w}\dbinom{45}{l}}$$

This formula consists of the total number of ways of selecting 5 balls from 50 in the numerator, and in the denominator the total number of ways of choosing our particular selection from the winning balls – e.g. getting 4 winning balls only.

Did you know? The largest payment to a single individual of the EuroMillions lottery was in May 2009, when what is thought to be a world record amount of €126 million was paid to a Spanish winner.

Similarly, we can look at the "lucky ball" selection as a separate lottery draw and arrive at the following formula. In the case of the lucky ball selection we are drawing 2 balls from a pool containing 9, and so we have

$$\frac{\dbinom{9}{2}}{\dbinom{2}{w}\dbinom{7}{l}}$$
To finally work out the chance of winning in the EuroMillions lottery we multiply both formulae together:

$$\frac{\dbinom{50}{5}}{\dbinom{5}{w_1}\dbinom{45}{l_1}} \times \frac{\dbinom{9}{2}}{\dbinom{2}{w_2}\dbinom{7}{l_2}}$$
where $w_1$ and $l_1$ are the number of winning and losing balls in the main lottery, and $w_2$ and $l_2$ are the number of winning and losing balls in the lucky ball lottery, respectively (Table 3).

Table 3. Chance and probability associated with the EuroMillions lottery

| Match | Calculation | Chance of occurring (1 in …) | Probability of occurring (1/Chance) |
|-------|-------------|------------------------------|-------------------------------------|
| 5 + 2 | $\frac{\binom{50}{5}}{\binom{5}{5}\binom{45}{0}} \times \frac{\binom{9}{2}}{\binom{2}{2}\binom{7}{0}}$ | 76,275,360 | 0.000000013 |
| 5 + 1 | $\frac{\binom{50}{5}}{\binom{5}{5}\binom{45}{0}} \times \frac{\binom{9}{2}}{\binom{2}{1}\binom{7}{1}}$ | 5,448,240 | 0.000000183 |
| 5 + 0 | $\frac{\binom{50}{5}}{\binom{5}{5}\binom{45}{0}} \times \frac{\binom{9}{2}}{\binom{2}{0}\binom{7}{2}}$ | 3,632,160 | 0.000000275 |
| 4 + 2 | $\frac{\binom{50}{5}}{\binom{5}{4}\binom{45}{1}} \times \frac{\binom{9}{2}}{\binom{2}{2}\binom{7}{0}}$ | 339,001.6 | 0.000002949 |
| 4 + 1 | $\frac{\binom{50}{5}}{\binom{5}{4}\binom{45}{1}} \times \frac{\binom{9}{2}}{\binom{2}{1}\binom{7}{1}}$ | 24,214.4 | 0.000041297 |
| 4 + 0 | $\frac{\binom{50}{5}}{\binom{5}{4}\binom{45}{1}} \times \frac{\binom{9}{2}}{\binom{2}{0}\binom{7}{2}}$ | 16,142.933 | 0.000061946 |
| 3 + 2 | $\frac{\binom{50}{5}}{\binom{5}{3}\binom{45}{2}} \times \frac{\binom{9}{2}}{\binom{2}{2}\binom{7}{0}}$ | 7,704.581818 | 0.000129792 |
| 3 + 1 | $\frac{\binom{50}{5}}{\binom{5}{3}\binom{45}{2}} \times \frac{\binom{9}{2}}{\binom{2}{1}\binom{7}{1}}$ | 550.3272727 | 0.0018171 |
| 2 + 2 | $\frac{\binom{50}{5}}{\binom{5}{2}\binom{45}{3}} \times \frac{\binom{9}{2}}{\binom{2}{2}\binom{7}{0}}$ | 537.5289641 | 0.001860364 |
| 3 + 0 | $\frac{\binom{50}{5}}{\binom{5}{3}\binom{45}{2}} \times \frac{\binom{9}{2}}{\binom{2}{0}\binom{7}{2}}$ | 366.8848485 | 0.00272565 |
| 1 + 2 | $\frac{\binom{50}{5}}{\binom{5}{1}\binom{45}{4}} \times \frac{\binom{9}{2}}{\binom{2}{2}\binom{7}{0}}$ | 102.3864693 | 0.009766915 |
| 2 + 1 | $\frac{\binom{50}{5}}{\binom{5}{2}\binom{45}{3}} \times \frac{\binom{9}{2}}{\binom{2}{1}\binom{7}{1}}$ | 38.394926 | 0.026045108 |
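All of Table 3 follows from one small function. A possible sketch (an addition to the text, using the same notation as above):

```python
import math

def euromillions_chance(w1, w2):
    """'1 in X' chance of matching w1 main numbers and w2 lucky balls."""
    main = math.comb(50, 5) / (math.comb(5, w1) * math.comb(45, 5 - w1))
    lucky = math.comb(9, 2) / (math.comb(2, w2) * math.comb(7, 2 - w2))
    return main * lucky

print(f"{euromillions_chance(5, 2):,.0f}")  # 76,275,360 (the jackpot)
print(f"{euromillions_chance(3, 1):,.2f}")  # ~550.33
print(f"{euromillions_chance(2, 1):,.2f}")  # ~38.39
```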
SOME IDEAS FOR USE IN THE MATHEMATICS CLASSROOM
Mathematics teachers who have mastered this material are in a good position to exploit students' interest in topical issues such as 'winning on the lotto' by working with them to figure out how things – in this case lotteries – work. In the process students are learning probability concepts in a real situation. Mathematics teachers will see many opportunities in this material for developing their own lessons. We would like to add some ideas of our own for consideration. Mathematicians use a tried and tested approach when they are confronted with problems or want to know how something new works: they specialise and generalise. By that we mean they look at simpler special cases so that they can understand the general case. This approach can work in the classroom by introducing games, for example, and/or examining a small lottery closely before moving on to bigger lotteries. Active learning is easily promoted because it is employed naturally by the students.
Class Lottery Game (Class Project 1)

Construct your own class lottery game. Take any number of ping-pong balls or tennis balls, let's say 6, and number them 1 through 6. Decide on the rules for the lottery – say 4 balls will be selected from the six. Play the lottery game within your class and see if anyone wins the jackpot. Determine the probability of someone winning the lottery. Did the calculated probability of winning and the actual number of winners in your class match up? Discuss.

Small Lottery (Class Project 2)

Sports clubs throughout Ireland raise funds by running small scale lotteries, e.g. pick 3 numbers from the first 36 natural numbers. How does this lottery work? Write a report and present it to the class.

Small Lottery (Class Project 3)

A sports club runs a small scale lottery based on selecting 4 from 36. How does this lottery work? How is it different from the one above? Write a report and present it to the class explaining the differences and why a club might select this kind of lottery over the other.

Questions for Consideration and Resolution

What steps could you take to guarantee that you win the jackpot in the Irish Lottery (select 6 numbers from 42)?

To win the Irish Lottery you have decided to buy all the possible combinations. How long would it take to print off the tickets if one ticket can be printed each second? Would this approach to winning the Lottery in that week then work? Discuss.

REFERENCES

How Stuff Works. Introduction to how lotteries work. Retrieved June 22, 2010, from http://entertainment.howstuffworks.com/lottery.htm
EuroMillions – Wikipedia. Retrieved June 23, 2010, from http://en.wikipedia.org/wiki/EuroMillions
Irish National Lottery Homepage. Retrieved June 22, 2010, from http://www.lotto.ie/
Lottery Math. Retrieved June 24, 2010, from http://members.cox.net/mathmistakes/rawdata.htm
Patrick Johnson and John O'Donoghue
National Centre for Excellence in Mathematics and Science Teaching and Learning (NCE-MSTL)
University of Limerick
ASTRID KUBICEK
11. MODELS FOR LOGISTIC GROWTH PROCESSES (E.G. FISH POPULATION IN A POND, NUMBER OF MOBILE PHONES WITHIN A GIVEN POPULATION)
INTRODUCTION
6th grade students (aged 15–16 years) were taught how to set up models for realistic logistic growth processes without the use of differential equations. The main aim was to pass on to them a tool that is easy to handle and easy to apply for modelling logistic growth processes. VENSIM PLE was used as software for the presentation of causal loop diagrams. EXCEL was used to set up the runs for the models, because the students were familiar with it. Many growth processes in real life are neither linear nor exponential. Most growth processes have a limited capacity and are therefore logistic, approaching a certain limit, the maximum capacity. The problem at school is that differential equations appear quite late in the Austrian syllabus (usually grade 8, aged 18), whereas sequences and series, as well as exponential and logistic functions, are taught in grade 6. Consequently, I was looking for a way to teach logistic growth in grade 6 without using differential equations. The system dynamics approach is an alternative that is based not on differential equations but on recurrence relations. This approach enables students to work out models for various growth processes using Excel. The advantage is that this is possible in grade 6, so that one can work on linear, exponential and logistic growth. Additionally, the project can be seen as support for students to improve their ability to think about and work with complex problems. Furthermore, it is possible to include real life problems in mathematics lessons, which can help to improve the image of the subject mathematics at school: it is to be seen as a supportive tool, rather than as an isolated theoretical subject. VENSIM PLE can also be used to represent data in other subjects, for example economics. The main problem the students worked on was to set up a model for the growth of a certain population of fish under given conditions. After they had set up their main model of this growth process, the students were instructed to change the parameters and interpret their results. Finally, they could work on other problems themselves. But before they started their work, they got an introduction to VENSIM PLE, so that they were able to use it to plot causal loop diagrams and stock and flow diagrams. Another important aim of the project was to make students aware of the fact that there is a difference between quantitative and qualitative models. For modelling
qualitative factors, causal loop diagrams were used. It was easy for the students to realize that these diagrams can become tremendously large. Stock and flow diagrams, on the other hand, should not include too many factors, because they are used for calculations, and the more factors occur, the more complex and therefore the more difficult to solve these calculations become. Stock and flow diagrams are used for models of quantitative data, where it is important to distinguish between stocks and flows. Such diagrams were used in the project to model logistic growth processes1. One problem that came up during this phase of the project was that many students were very much used to simple problem-solving strategies. In their opinion mathematical problems had one definite way to be solved, and the result was then either correct or incorrect. This was presumably how they had been trained in class and how they were used to solving problems: the teacher offers a solution and they take it over and apply it to similar examples with different values. It took a while until this different way of working in mathematics was accepted by a majority of students. Weaker students had more difficulty accepting this approach to problem solving. After this introduction to system dynamics, VENSIM and ways of presentation using causal loop and stock and flow diagrams, I showed the students how to use EXCEL for logistic growth processes. A factor called "space"2, which had to be introduced to the students, was of great importance for modelling in the context of the task. The students then worked out appropriate stock and flow diagrams in VENSIM and used Excel to work out their runs. They were advised to work in pairs or at least small groups, and we discussed their results in the end. After they had finished the model of the fish population they worked on other logistic growth models. Some students continued their work on the first model and started changing some parameters. They came up with interesting results. Others managed to get chaotic behaviour, and we discussed possible reasons and effects. It was interesting to see how this activity improved teamwork in class. This was not a main intention when I started the project, but it showed that such maths lessons offer more than one alternative to conventional lessons. Following the summary of the realization of the project in class, I have added some suggestions concerning didactics.

REALIZATION IN CLASS
PHASE 1: Growth Processes

Before the students started their work I introduced growth processes in general. The conclusion was that growth processes in real life are mostly not linear or exponential, as we had assumed in many previous maths lessons.

PHASE 2: Causal Loop Diagrams Using Software (VENSIM PLE3)

After a short introduction concerning the nature of causal loop diagrams and the use of VENSIM PLE in that context, I asked the students to develop their own diagrams.
They did not get detailed instructions concerning topics. I only asked them to work on growth processes. Here are some results (Figure 1):
Figure 1. Number of dogs in Austria (student: Silvia K.).
The interpretation of the students: The diagram shows a positive loop: dogs in Austria – births of dogs; but this is influenced by numerous other factors. The connection between "stable politics" and "people feel save" is unclear to me, as is the connection between "bad economy" and "people feel lonely". Another example (Figure 2):

Figure 2. Growth of plants (student: Silvia K.).
The following example shows the amount of money in a bank account (Figure 3):

Figure 3. Money on bank account (student: Irene A.).
This example shows a positive loop: the more money in a bank account, the higher the interest, the more money in the account, etc. This causal loop diagram enables us to work out an appropriate stock and flow diagram for working with quantitative factors, as it does not include too many factors. Too many factors reduce the chance of getting a proper stock and flow diagram. What matters is that students realise that these restrictions simplify the mathematical calculations but at the same time make the model less realistic. In real life there are certainly more factors that affect the amount of money in a bank account (Figure 4):

Figure 4. Additional factors that influence the amount of money on a bank account.
This diagram shows other factors that could influence the amount of money in a bank account, such as the local economy, additional expenses or receipts. Figure 4 can easily be enlarged using further factors. At this stage of the project it was
important for me that the students realized that they themselves have to decide whether they work with qualitative or quantitative factors. This decision should be made before one starts modelling.

PHASE 3: Qualitative vs. Quantitative Models

Before one starts modelling it should be decided whether it is important to distinguish between stocks and flows (Figure 5). If so, quantitative models are appropriate; if not, causal loop diagrams are the better choice.
Figure 5. Ossimitz: 4 steps of modelling in system dynamics4.
It is not necessary to start with causal loop diagrams first and use stock and flow diagrams afterwards. I wanted to introduce both possibilities in my project. One reason was that causal loop diagrams can be used in other subjects as well for presentations of qualitative data. For maths lessons, on the other hand, stock and flow diagrams are of major importance because they deal with quantitative data.

PHASE 4: Stock and Flow Diagrams

The number of mobile phone owners was used as an example to introduce stock and flow diagrams. The term "space"5 and its meaning for logistic growth models were explained as well. The following diagrams were worked out by the students (Figure 6):
Figure 6. Growth of plants (student: Silvia K.).
This diagram shows a successful translation of a causal loop diagram (Figure 2) into a stock and flow diagram. In Figure 2 the students included more factors that had an influence on the model. Not all of them can be used for the stock and flow diagram; only those that are stocks and flows, and therefore quantitative, were taken. The diagram above is the simplest version. It could be extended a bit and it would still be possible to solve it mathematically. But for the discussion the diagram was really good, because it prompted discussions about how many different factors are really relevant for the purpose at hand. Logistic growth processes were introduced using the example of the growth of the number of mobile phone owners6. Afterwards the students gathered in groups and worked on the following problem.

EXAMPLE: FISH POPULATION IN A POND
A fish population increases by 20% annually. Initially there are 10 000 fish in the pond. Model the logistic population growth given that it approaches a limit of capacity of 60 000 (due to limited food stock, limited space in the pond, enemies reducing the population, disease, etc.).

Tasks:
1. Use VENSIM to present the growth process in a causal loop diagram.
2. Use VENSIM to present the growth process in a stock and flow diagram.
3. Use Excel to simulate the growth process.
4. Change the annual growth rate to 100%, 130%, 150% and 190%. Comment briefly on your results.
5. Politicians decide to catch 5000 fish each year to sell them on the local fish market. They start the harvest when the maximum capacity is reached. Use a growth factor of 50%. Comment on any long term trends (does the population die out, does the population still reach the maximum capacity, …).
6. Alter the amount of fish harvested each year and interpret your results. When does the population die out?
7. Change the initial amount of fish and comment on your result.
8. Find out when a harvest of 8000 fish per year has to be started so that the population remains stable and doesn't die out.
9. Modify your model: the harvest shall start when a certain minimum amount of fish is exceeded.
10. Change the parameters of your latest model and interpret the changes they cause. Alternatively you can work out another logistic growth model.
Students' Solutions: As a first step, causal loop diagrams were produced in VENSIM to present the growth process (Figure 7):
Figure 7. Factors affecting the population of fish in a pond (student: Silvia K.).
After the causal loop diagram the students worked in EXCEL. The term "space" is of major importance for getting logistic growth:

space = 1 − (fish population / maximum capacity)
growth per unit time = space × fish population × growth factor
amount end = amount beginning + growth per unit time

Here one can clearly see that the closer the fish population is to the maximum capacity, the closer to 0 the factor "space" in the calculation, and the slower the growth per unit time. This produces the typical curve of a logistic growth function and is easier than working with differential equations. Using this definition one can run the following simulation (Table 1):

Table 1. Logistic growth of a fish population7

Time initially: 0; time step: 0.5 years; initial amount of fish: 10000; maximum capacity: 60000; growth factor (GF): 0.2
Time | Amount beginning | Space | GF | Growth per unit time | Amount end
0.5 | 10000 | 0.83333333 | 0.2 | 1666.66667 | 10833.3333
1 | 10833.33333 | 0.81944444 | 0.2 | 1775.46296 | 11721.0648
1.5 | 11721.06481 | 0.80464892 | 0.2 | 1886.26843 | 12664.199
2 | 12664.19903 | 0.78893002 | 0.2 | 1998.23335 | 13663.3157
2.5 | 13663.3157 | 0.77227807 | 0.2 | 2110.37582 | 14718.5036
3 | 14718.50361 | 0.75469161 | 0.2 | 2221.58623 | 15829.2967
3.5 | 15829.29673 | 0.73617839 | 0.2 | 2330.63723 | 16994.6153
4 | 16994.61534 | 0.71675641 | 0.2 | 2436.1999 | 18212.7153
… | … | … | … | … | …
37.5 | 59850.77042 | 0.00248716 | 0.2 | 29.7716853 | 59865.6563
38 | 59865.65626 | 0.00223906 | 0.2 | 26.8085875 | 59879.0606
38.5 | 59879.06055 | 0.00201566 | 0.2 | 24.1391351 | 59891.1301
39 | 59891.13012 | 0.0018145 | 0.2 | 21.7344672 | 59901.9974
39.5 | 59901.99735 | 0.00163338 | 0.2 | 19.5685143 | 59911.7816
40 | 59911.78161 | 0.00147031 | 0.2 | 17.6177363 | 59920.5905
40.5 | 59920.59048 | 0.00132349 | 0.2 | 15.8608847 | 59928.5209
Using the table one can plot the resulting graph for this growth process (Figure 8):

Figure 8. Logistic growth of a fish population (number of fish vs. time).
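The run in Table 1 can also be reproduced outside Excel. The following Python sketch is not part of the original chapter; the function and parameter names are mine, and the optional harvest term is one plausible way to experiment with tasks 5–8 above.

```python
# A sketch (not from the chapter) of the Excel run in Table 1. The "space"
# factor throttles growth as the stock nears the maximum capacity. The
# optional harvest argument is my own addition for tasks 5-8; it removes a
# fixed number of fish per year, scaled to the length of one time step.
def logistic_run(initial, capacity, growth_factor, dt=0.5, t_end=40.5,
                 harvest_per_year=0.0):
    rows, t, amount = [], 0.0, initial
    while t < t_end:
        space = 1 - amount / capacity
        growth = space * amount * growth_factor   # growth per unit time
        amount = amount + growth - harvest_per_year * dt
        t += dt
        rows.append((t, round(amount, 4)))
    return rows

rows = logistic_run(10000, 60000, 0.2)
print(rows[0])    # (0.5, 10833.3333)  - first row of Table 1
print(rows[-1])   # (40.5, 59928.5209) - last row of Table 1
```

Changing the parameters is a one-line edit: a growth factor of 3.9 produces the oscillating overshoot of Figure 9 below, and 2.9 the chaotic behaviour mentioned in note 8.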
Using this graph the logistic growth process can be clearly seen. At first it resembles exponential growth; then the curve shows a point of inflexion and moves
towards the maximum capacity. This limit is approached because of the definition of the "space". Changing parameters leads to different outcomes here as well: a modification of the growth factor and time step, for example, changes the results tremendously, whereas changes to the initial amount of fish or a different maximum capacity do not. A change of growth factor to 3.9 results in a periodic curve that first goes above the maximum capacity, then back and forth until it approaches the maximum capacity (Figure 9). This seems quite unrealistic at first. In fact it is more realistic than the growth process shown in the previous diagram: at first there are more fish in the pond than theoretically possible, because they can survive a certain time period without enough food or space. After a while quite a few fish die, so the number of fish in the pond decreases. This trend is then stopped and the curve approaches the maximum capacity again, exceeding it slightly, etc.
Figure 9. Population of fish modelled using growth factor 3.9: oscillating behaviour8.
PHASE 5: Results

Finally the students had to present their results and discuss them with the rest of the class. Together they came up with interesting solutions and improvements.

DIDACTICS AND FURTHER DETAILS
There are some points that came to my mind which are important for the successful realization of such a project in class. They are summarized below.
Aims:
– Concerning content: introduction of logistic growth processes without the use of differential equations.
– Concerning didactics: motivation of students, making them participate more frequently in class; use of teamwork as a helpful tool to make them work on examples together.
Time required: 6 lessons

Conditions: Participation of the whole class was obligatory. As a result there was a mixture of gifted and interested students on the one hand and those who had trouble in maths lessons on the other. Each group had to organize a laptop beforehand, so that we could stay in class. The class had no knowledge of the software used for the project; they only knew how to use EXCEL. Functions describing linear and exponential growth processes had been taught earlier in class that year. Each of the phases discussed earlier in this essay will now be revisited with a focus on didactical aspects.

Work Process in Class

Phase 1: Introduction of Growth Processes

The growth of a population of elephants was used to start the project day off9. Brainstorming led the class into modelling and different aspects of the modelling process which were of major importance later that day.

Phase 2: Causal Loop Diagrams

Causal loop diagrams are one possibility to present the factors of a system and their relations. In this context there is no difference between qualitative and quantitative factors. Mathematical symbols like "+" and "-" are used, but other mathematical operations like equations are not required. Therefore they are used to present qualitative rather than quantitative data. Arrows are used to represent the relations between the different factors. Above these arrows a "+" can be shown. This means that the relation is of type "the more, the more" or "the less, the less". For example: the more often a couple quarrels, the more frequently the husband goes to the pub; the more frequently he goes to the pub, the more often they quarrel, and so on and so forth. Such relations are called positive. Above the arrows a "-" can also be shown. This means "the more, the less", e.g. the lower the temperature, the more the radiator works. Such relations are called negative. Frequently more than two factors are presented in a causal loop diagram. If more than two factors are involved, loops can occur where the factors influence
themselves. If all arrows involved in such a loop show a "+", the resulting diagram shows a positive or escalating loop. In case of only "-" above the arrows between the factors in the system, the diagram shows a negative or stabilizing loop. If there are different signs involved, the number of "-" is relevant: an even number of minuses above the arrows in a loop implies a positive or escalating loop, whereas an odd number implies a negative or stabilizing loop.

A Few Examples

Example 1: positive (= escalating) loop (Figure 10). Ossimitz uses a couple that quarrels to explain this situation: the more frequently the husband goes to a pub, the more hostile the wife is to him; the more hostile she is, the more often her husband goes to the pub, etc., until they probably get divorced in the end. This is a good example describing an escalating loop with only two factors influencing each other10:

Figure 10. Escalating loop.
This diagram can also be read in the other direction: the less often he refuses to talk to her, the less hostile the wife will be, the less the husband will be in the pub, etc. This is also an escalating loop (Figure 11). But this time the result is hopefully not that the couple gets divorced. Still the diagram remains unchanged, because a "+" only describes the connection between the factors.
Figure 11. Escalating loop.
As a matter of fact, systems with two factors related by "the more, the more" or "the less, the less" result in escalating loops. Systems where, on the other hand, two
factors are connected by "the more, the less" or "the less, the more" result in stabilizing loops.

Example 2: negative (= stabilizing) loop (Figure 12). The higher the room temperature, the less the radiators work; the lower the room temperature, the more the radiators work, etc.11 This leads to a stabilizing loop, such that the room temperature can be stabilized at a certain level:
Figure 12. Stabilizing loop.
If more factors are involved, the following causal loop diagram is possible.

Example 3: more factors involved in a causal loop diagram (Figure 13). The expansion of public transport in relation to ticket prices and to the use of public transport is presented in the following diagram. An expansion of public transport implies higher ticket prices. That leads to decreasing use of public transport, which in turn reduces the expansion of the public transport system. So the single factors of the system influence other factors and, in the end, themselves. Such systems cause stabilizing loops12. In reality there are more factors responsible for public transport, which is a source for discussions in class. The simplified system can be presented as follows13:
Figure 13. Public transport.
Summarizing the main facts about causal loop diagrams, one can say that they are used to present qualitative data. Two or more factors can be related. The more factors one includes, the closer to reality the model, and the more difficult it becomes to "transform" the diagram into a stock and flow diagram. An even number of "-" signs on the arrows of a loop gives an escalating or positive loop, whereas an odd number gives a stabilizing or negative loop.

Phase 4: Stock and Flow Diagrams

Stock and flow diagrams include mathematical operations and are used for quantitative data. Again, the more factors one includes in a system, the closer to reality the model; at the same time it gets more difficult to carry out the calculations. The relations between the factors of a system are not only given by "+" and "-" as in a causal loop diagram; one must distinguish between stocks and flows. There are many ways to model data: the most qualitative one is a verbal description, followed by causal loop diagrams. The stock and flow diagram includes quantitative data and leads to mathematical problem solving.
Figure 14. Four ways of modelling by Ossimitz14.
This diagram (Figure 14) is a bit confusing to me, because one gets the impression that there is a certain order when modelling, starting with a verbal description and leading to equations in the end. Such a "transformation" is not necessary in my opinion. It is only important that one decides which model is to be worked out before the process starts. If the data given are quantitative I would go for the stock and flow diagram. In the case of qualitative data a causal loop diagram is sufficient; it allows the use of more factors and is therefore closer to the real situation. Still I used both ways to set up models because I wanted to show my students how to plot causal loop diagrams. In my opinion mathematics can act as a useful tool for other subjects like economics. For growth processes in general I would suggest using stock and flow diagrams immediately, without plotting causal loop diagrams first. This saves the time spent trying to "transform" the diagrams. I did not discuss this problem in detail in the course of the project. We worked on the difference between qualitative and quantitative data, and in this context the use of causal loop and stock and flow diagrams was discussed (Figure 15).
Figure 15. Relation between qualitative and quantitative models (from qualitative to quantitative: verbal description, causal loop diagram, stock and flow diagram, equations).
Stock and flow diagrams have a special notation:
1. Stock (= box variable): shows the stock at a certain time (e.g. the size of a certain population); stocks are changed at certain time points only. Symbol: a box, labelled e.g. "population".
2. Flow: shows the change of stock per time interval; the amount added to or taken away from the stock. Example label: "number of births per annum".
3. Auxiliary variable: a constant factor describing the change per time interval. Example label: "constant".
4. Arrow: shows that there is a relation between two factors.
5. Cloud: each system has limits; influences from outside the system are presented using the cloud symbol.
This notation was new for the students but did not cause any problems. A stock is changed by the flow. It is possible that more than one flow is connected to one and the same stock. Take money in a bank account as an example: the money in the account is the stock. It can be changed by more than one flow: interest compounded per unit time, extra money put into or taken out of the account, etc. It is important to point out the difference between flow and flow rate. The flow describes the absolute amount of change per unit time (e.g. the amount of water that flows into a bathtub per minute). The flow rate is the rate at which the water flows into the bathtub (e.g. 20 l/min). Flows are easier to compare and can be used for different time steps. Flows can be continuous as well as discrete. That is why they are really helpful for modelling growth processes. A few examples might be useful.

Example 1: Simple population model. A certain population is changed by the number of births and deaths per year. This situation can be modelled using a stock and flow diagram as follows (Figure 16):

Figure 16. Simple population model.
The stock in this example is the population. Two flows have an influence on the stock: an increase by the number of births per annum and a decrease by the number of deaths per annum. Both flows are influenced by an auxiliary variable. The number of births per year is related to the birth rate and also to the population at the beginning of each year. The same is valid for the number of deaths per year: this flow is influenced by the auxiliary variable death rate as well as by the population at the beginning of each year. The arrows show the relations between the single factors. It can be clearly seen that each factor can be described by a numerical value. This is characteristic for stock and flow diagrams.
These five variables describe the dynamic system above. The flow variables are linked to the world surrounding this model; that is why the arrows are connected to the cloud symbol. A model like the one above can be used to describe simple animal populations. For human beings it might be too simple, because there are many different factors interacting, which make the model more complex. Still it is quite easy to adapt the model so that it can be used for human beings: one can add two more flows, namely immigration and emigration. The following model results:
Figure 17. Population model including immigration and emigration.
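For readers who want to see the arithmetic behind Figure 17, here is a minimal sketch, not from the chapter, of one time step of the model. All numbers are invented, and treating immigration and emigration as fixed annual amounts rather than rate-driven flows is my simplification.

```python
# One step of the stock and flow model in Figure 17. The stock (population)
# is changed by four flows. Example values are invented; modelling
# immigration and emigration as fixed annual amounts (rather than
# rate-driven flows, as the diagram's auxiliaries suggest) is a
# simplification for the sketch.
def step(population, birth_rate, death_rate, immigration, emigration):
    births = birth_rate * population    # flow in, driven by an auxiliary rate
    deaths = death_rate * population    # flow out, driven by an auxiliary rate
    return population + births - deaths + immigration - emigration

pop = 1_000_000.0
for _ in range(5):   # five one-year steps
    pop = step(pop, birth_rate=0.012, death_rate=0.009,
               immigration=20_000, emigration=15_000)
print(round(pop))    # population after five years
```

The same pattern (stock plus signed flows per time step) is exactly what the students later implemented in EXCEL for the logistic models.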
This example (Figure 17) shows why I suggested using a stock and flow diagram immediately rather than trying to "transform" a causal loop diagram into one. The problem is obvious: a causal loop diagram can easily take many factors into account, which then makes it impossible to work out equations describing the situation. Causal loop diagrams tend to get extremely large and complex quite easily.

Example 2: Stock and flow diagrams can show interesting details that change the meaning completely. Ossimitz shows with the following example how a situation can change completely when the limits of a system are changed only slightly (Figure 18):
Figure 18. Crude oil production I15.
This stock and flow diagram shows six stocks. Unknown sources of crude oil are connected to the limits of the system. This means that, theoretically, there is an inflow which increases the amount of crude oil produced. One can easily see from the diagram that there is enough crude oil and that there is no problem concerning the amount of oil needed and the resources. Reality is slightly different, as the following diagram shows (Figure 19): a little change, but a massive effect.
Figure 19. Crude oil production II.
A comparison of Figures 18 and 19 shows that changes to single factors of a system can change the whole system completely. There is no flow from the surroundings into the system; at best this inflow is negligibly small compared to the amount of crude oil that will be needed in the future. The first diagram might be helpful for crude oil producers, but in the long run the second diagram is definitely more realistic. This model offers a useful connection to other subjects and can be used in maths lessons to introduce cross-curricular teaching. Another example that is useful for maths lessons is the money in the bank account that I have used earlier already (Figures 20 and 21):
Figure 20. Causal loop diagram.

Figure 21. Stock and flow diagram.
The money on the bank account example was used to introduce causal loop and stock and flow diagrams in my project at school because it is simple and shows the basics very well. The software used was VENSIM PLE. It is free to download and easily installed, and therefore a good choice for lessons. VENSIM PLE can also be used to simulate various runs. Nevertheless I used EXCEL to work out the simulations, because I wanted to avoid differential equations, which were definitely not necessary for modelling logistic growth processes in class. There were two main growth processes we worked on in this project: first the growth of the number of owners of mobile phones, and secondly the growth of a population of fish in a pond. The students realized the connection between these two growth processes quite quickly, and also that other growth models, like linear and exponential growth, are not really realistic in this context.
Figure 22. Number of mobile phone owners16.
Using system dynamics one can use the stock and flow diagram above (Figure 22) to present the quantitative relations between stock (number of mobile phone owners) and flow (number of mobile phones bought per unit time). An advantage of this process is that the time step can be chosen individually. "Space" is important for the logistic growth model. This is the factor that brings the curve infinitely close to a certain maximum capacity: the bigger the number of mobile phone owners, and the closer this number is to the maximum capacity, the smaller the factor "space". This gives the curve the typical shape of logistic growth without the use of differential equations:

space = 1 − (number of mobile phone owners / maximum capacity)
growth per unit time = growth factor × space × number of mobile phone owners
amount new = amount old + growth

Using these simple equations one can easily describe the connections of the factors using a stock and flow diagram. Then the data can be modelled and simulated (Figure 23 and Table 2).
Table 2. Number of mobile phone owners

Initial time: 1990; time step: 1 year; initial amount: 10000; maximum capacity: 500000; growth factor (GF): 0.9

Time | Amount old | Space | Growth | Amount new
1990 | 10000 | 0.98 | 8820 | 18820
1991 | 18820 | 0.96236 | 16300.4537 | 35120.4537
1992 | 35120.45368 | 0.929759093 | 29388.205 | 64508.6587
1993 | 64508.65871 | 0.870982683 | 50567.3322 | 115075.991
1994 | 115075.9909 | 0.769848018 | 79731.9212 | 194807.912
1995 | 194807.912 | 0.610384176 | 107016.9 | 301824.812
1996 | 301824.8122 | 0.396350376 | 107665.54 | 409490.352
1997 | 409490.3521 | 0.181019296 | 66713.0897 | 476203.442
1998 | 476203.4418 | 0.047593116 | 20397.6053 | 496601.047
1999 | 496601.047 | 0.006797906 | 3038.26247 | 499639.31
2000 | 499639.3095 | 0.000721381 | 324.387258 | 499963.697
2001 | 499963.6968 | 7.26064E-05 | 32.6705292 | 499996.367
2002 | 499996.3673 | 7.26539E-06 | 3.26940143 | 499999.637
2003 | 499999.6367 | 7.26586E-07 | 0.32696366 | 499999.964
2004 | 499999.9637 | 7.26591E-08 | 0.0326966 | 499999.996
2005 | 499999.9964 | 7.26592E-09 | 0.00326966 | 500000
2006 | 499999.9996 | 7.26592E-10 | 0.00032697 | 500000
2007 | 500000 | 7.26592E-11 | 3.2697E-05 | 500000
Figure 23. Number of mobile phone owners using GF 0.9 (number of mobile phone owners vs. time).
One advantage of using EXCEL to model logistic growth processes in class is that students are familiar with the program. Furthermore, parameters can easily be changed and one can see the results and the changes in the graph immediately. There are no differential equations involved; simple equations in combination with the factor "space" make it easy for students to follow the modelling process. One of my students' results (Table 3):

Table 3. Logistic growth of the number of mobile phone owners (Irene A.)

Time step: 1 year; initial amount: 10000; max. capacity: 500000; growth factor: 0.5

Time | Mobiles beg | Space | Growth per year | Mobiles end
1 | 10000 | 0.98 | 4900 | 14900
2 | 14900 | 0.9702 | 7227.99 | 22127.99
3 | 22127.99 | 0.95574402 | 10574.34706 | 32702.33706
4 | 32702.33706 | 0.934595326 | 15281.72568 | 47984.06274
5 | 47984.06274 | 0.904031875 | 21689.56109 | 69673.62383
… | … | … | … | …
20 | 499095.9783 | 0.001808043 | 451.1936186 | 499547.1719
21 | 499547.1719 | 0.000905656 | 226.2090113 | 499773.3809
22 | 499773.3809 | 0.000453238 | 113.2582027 | 499886.6391
23 | 499886.6391 | 0.000226722 | 56.6676069 | 499943.3067
24 | 499943.3067 | 0.000113387 | 28.34344002 | 499971.6501
25 | 499971.6501 | 5.66997E-05 | 14.17413042 | 499985.8243
After the introduction of causal loop and stock and flow diagrams I used this example of the logistic growth of the number of owners of mobile phones to show my students how to set up a model using EXCEL. The term "space" was new for them but easy to understand and to apply to the problem of the fish population growing in a pond. Finally I want to point out some details concerning didactics which I found important when working on the project.

DIDACTICS
The final question enabled the students to work as far as their ability, motivation and interest allowed. The interesting thing was that most of
them were sceptical at the beginning because of this "open" way of working in maths lessons. It was unusual for them to have more than one "correct" way of solving a problem in maths lessons, and one could not always decide whether a result was right or wrong. In the end they quite liked the idea of working at their own speed, trying out things and discussing results as a group. The atmosphere in class was relaxed but they really worked on their tasks, changing parameters, drawing different diagrams, and working on the various realistic growth processes that came to their minds. They were in favour of the causal loop and stock and flow diagrams. In my thesis I found that the group size that makes students work most efficiently is a small group of 3 to 5 students. In the end there were some groups of three and many pairs working on their models. It really went very well. The results were also discussed in bigger groups, and the passive class started working as a team. Students got a different kind of access to mathematics and worked together as a team, trying to solve problems, discussing and improving various strategies to achieve their results. Another really positive aspect was that they could work without pressure. Some of them worked on causal loop diagrams half the time, others on the most complex simulation runs. It was fabulous to see them so active.

CONCLUSION
The main aim was to teach the class how to model logistic growth processes without the use of differential equations. A side effect was that they also got to know how to present qualitative data using causal loop diagrams and quantitative data using stock and flow diagrams. Stock and flow diagrams were then used for the growth processes, because here the differentiation between stocks and flows is important. They learned the difference between stocks, flows, flow rates and auxiliary variables, what a system is in general, and the appropriate symbols to use in stock and flow diagrams. Another side effect was the introduction of VENSIM PLE as software. Students were really happy about this new tool, which seems to be helpful not only in mathematics. Some of them used VENSIM to run the simulations and got similar results as when using EXCEL; the simulations would have been possible with VENSIM PLE, too. But I wanted no differential equations when working on the project, because my students did not know them from their lessons. So the term "space" and the model of the mobile phone owners were used to introduce logistic growth with EXCEL. The students got an insight into the process of modelling real situations and will hopefully be a bit more critical in future when it comes to interpreting results.

NOTES
1. Causal loop diagrams were not absolutely necessary for the project, but they fitted into the context and can be useful for students in the future.
2. Translation of the word "Freiraum" (= free room or free space) used by Dr. Ossimitz.
3. VENSIM is free software and easy to handle, so excellent for use in class.
4. See G. Ossimitz, F. Schlöglhofer: Untersuchung vernetzter Systeme (Lehrplankommentar). In: H. Bürger et al. (Hg.): Mathematik AHS Oberstufe Kommentar. Wien: Österreichischer Bundesverlag; S. 198–211.
5. Translation of the word "Freiraum" (= free room or free space) used by Dr. Ossimitz; see pg. 1.
6. See later.
7. The table shows another advantage of system dynamics compared to differential equations: the time steps can be chosen other than 1.
8. A growth factor of 2.9 implies chaotic behaviour.
9. F. Vester: Unsere Welt, ein vernetztes System; Deutscher Taschenbuchverlag; München; 2. Auflage; 1985; S. 50.
10. Vgl. G. Ossimitz: Entwicklung systemischen Denkens; aaO. S. 66.
11. Vgl. ebenda S. 66.
12. As the number of "-" on top of the arrows is an odd number (see before!).
13. H. Buerger, R. Fischer, G. Malle: Mathematik Oberstufe 3; Wien: Hölder-Pichler-Tempsky Verlag; 1. Auflage; 1991; S. 288.
14. Nach G. Ossimitz, F. Schlöglhofer: Untersuchung vernetzter Systeme (Lehrplankommentar). In: H. Bürger et al. (Hg.): Mathematik AHS Oberstufe Kommentar. Wien: Österreichischer Bundesverlag; S. 198–211.
15. G. Ossimitz, Ch. Lapp: Das Metanoia Prinzip; S. 133.
16. G. Ossimitz, Ch. Lapp: Das Metanoia Prinzip (Eine Einführung in systemgerechtes Denken und Handeln); Hildesheim/Berlin: Verlag Franzbecker; 2006; S. 166.
REFERENCES/BIBLIOGRAPHY

Kubicek, A. (2008). Dynamische Prozesse im Mathematikunterricht (= Modelling dynamic processes in maths lessons). Thesis.
Ossimitz, G., & Lapp, Ch. (2006). Das Metanoia Prinzip: Eine Einführung in systemgerechtes Denken und Handeln. Hildesheim/Berlin: Verlag Franzbecker.
Ossimitz, G. (2000). Entwicklung systemischen Denkens (Bd. 1). Wien: Profil Verlag.
Vester, F. (1985). Unsere Welt, ein vernetztes System (2. Auflage). München: Deutscher Taschenbuchverlag.
Buerger, H., Fischer, R., & Malle, G. (1991). Mathematik Oberstufe 3 (1. Auflage). Wien: Hölder-Pichler-Tempsky Verlag.
Ossimitz, G., & Schlöglhofer, F. Untersuchung vernetzter Systeme (Lehrplankommentar). In H. Bürger et al. (Hg.), Mathematik AHS Oberstufe Kommentar. Wien: Österreichischer Bundesverlag.
(A list of software products can be found on the homepage of Günther Ossimitz: www.uniklu.ac.at/gossimitz/home.php)
Astrid Kubicek Linz
JIM LEAHY
12. TEACHING ASPECTS OF SCHOOL GEOMETRY USING THE POPULAR GAMES RUGBY AND SNOOKER
INTRODUCTION
It is generally accepted now that all cultures engage in mathematical activities of some kind. Bishop (1991), in his study Mathematical Enculturation: A Cultural Perspective on Mathematical Education, proposes six 'mathematical activities' that are more or less universal, viz. counting, measuring, locating, designing, playing and explaining. This predisposition towards 'playing' offers an opportunity that mathematics teachers can exploit, since games of all kinds develop from structured play, e.g. by adding rules and objectives. Indeed man's preoccupation with games and puzzles of all kinds is well documented and is ripe for exploitation in a pedagogical sense in mathematics classrooms. This paper looks at some interesting geometry problems that arise in two different types of well known games: Rugby (a field game) and Snooker (a table game). These problems are developed in a teacher-friendly and teacher-ready way that makes them available for use in any upper secondary mathematics classroom by mathematics teachers looking for interesting contexts and applications for their students. An added bonus from the point of view of a busy secondary school teacher is that the problems are self-contained from the point of view of contextual knowledge.

CONVERTING A TRY IN RUGBY
In the game of rugby union a try, scored when the ball is touched down over the goal line, is worth 5 points. A free kick called a conversion, worth two further points, is then awarded and can be taken from any point on a line directly out from where the ball was touched down, i.e., on a line perpendicular to the goal line at the point where the ball was touched down. To claim the two points the ball must be kicked between the goal posts and over the crossbar. Assuming distance and height are not a problem, from how far out should the kicker attempt the conversion for the best chance of success? Two solutions follow; one using geometry and the other using calculus.
Investigation Using Geometry
Figure 1. Ball position for kick from K.
Let the goal posts be located at A and B and let the try be scored at C. Let K be the position from which the conversion is attempted, where ∠BCK = 90° and the line through A and B is the goal line (see Figure 1). Clearly the best position for K is where the angle ∠AKB is a maximum. Intuitively one would think that the farther out K is from C the greater the angle, but this is actually not the case.
Figure 2. Ball position for kick from K from narrow angle AKB.
If K is close to C (Figure 2) it is easy to see that ∠AKB is very small, and at C it would in fact be zero. On the other hand, as K moves very far from C (Figure 3) the lines AK and BK become closer to being parallel, and are in fact parallel when |CK| → ∞, so that again ∠AKB → 0. So the maximum possible angle must occur somewhere in between.
Figure 3. Ball position for kick from K from wider angle AKB.
The angle ∠AKB varying as K moves along the line through C suggests that the circle and the angle in a segment might be useful. So draw a circle through A, B and K (Figure 4).
Figure 4. Circle through points A,B and K.
A circle through A, B and K will generally intersect CK at another point K′ (Figure 4). In this case ∠AKB = ∠AK′B, since they are angles in the same segment. Since ∠AKB increases from 0 as K moves out from C, reaches a maximum and then fades away to zero, this suggests that the maximum position lies between K′ and K. It is probably safe enough to assume that there is only one maximum position (although that as yet cannot be guaranteed). The only circle that gives one position as K moves is the circle through A and B that touches CK (Figure 5).
Figure 5. Possible ball locations K,K’ and K’’ .
Teaching Note

The above discussion and outcomes can be motivated and 'discovered' by students working individually or in small groups using a dynamic geometry package such as GeoGebra or Cabri Geometry.

Proof

We now prove that this is the maximum position. Let K′ be a point between C and K and K″ a point on CK extended (Figure 5). It is sufficient to show that ∠AKB > ∠AK′B and ∠AKB > ∠AK″B.

Join BK′, meeting the circle at D. Then

∠AKB = ∠ADB (angles in the same segment of the circle)
= ∠AK′D + ∠K′AD (the exterior angle equals the sum of the opposite interior angles)
⇒ ∠AKB > ∠AK′D = ∠AK′B.

Similarly, let BK″ meet the circle again at D′. Then

∠AKB = ∠AD′B (same segment)
= ∠AK″D′ + ∠K″AD′ (exterior angle equal to the sum of the opposite interior angles)
= ∠AK″B + ∠K″AD′ > ∠AK″B.
Thus K is the required position. Now let AB = a, BC = b and CK = x. Since CK touches the circle,

CK² = CB·CA, i.e. x² = b(a + b), so x = √(b(a + b)),

which gives a formula for the distance x. AB is actually 5.6 m. So, for example, if b = 20 m then x = √(20(5.6 + 20)) = 22.63 m (2 d.p.), and if b = 30 m you get 32.68 m (2 d.p.). So in practice the kicker should come out a bit more than the distance BC.
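Before moving to the calculus solution, the formula can be sanity-checked numerically. The following Python sketch is mine, not part of the chapter; it scans kicking distances and confirms that the angle ∠AKB peaks at x = √(b(a + b)).

```python
import math

def angle_AKB(a, b, x):
    """Angle subtended by the posts at distance x out from C.

    a = width of the posts |AB|, b = distance |BC| from the near post
    to the touch-down point, x = |CK| = how far out the kick is taken.
    """
    return math.atan((a + b) / x) - math.atan(b / x)

a, b = 5.6, 20.0
best_x = math.sqrt(b * (a + b))          # the formula derived above
print(round(best_x, 2))                  # 22.63

# Brute-force scan over x from 1 m to 100 m in 1 cm steps
xs = [x / 100 for x in range(100, 10001)]
numeric_best = max(xs, key=lambda x: angle_AKB(a, b, x))
print(round(numeric_best, 2))            # agrees with best_x
```

Students could run the same scan in GeoGebra or a spreadsheet; the point is that the numeric maximum lands exactly where the tangent-circle argument predicts.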
A Different Approach (Using Calculus)

Figure 6. Using calculus interpretation.

Let K be an arbitrary point on the perpendicular at C to AB. Let ∠AKB = θ and ∠BKC = φ (Figure 6). Let AB = a, BC = b and CK = x. Then

tan(θ + φ) = (a + b)/x and tan φ = b/x.

Expanding the left-hand side,

(tan θ + tan φ)/(1 − tan θ tan φ) = (a + b)/x.

Let tan θ = t. So

(t + b/x)/(1 − tb/x) = (a + b)/x
⇒ x(t + b/x) = (a + b)(1 − tb/x)
⇒ tx + b = (a + b) − t(b/x)(a + b)
⇒ t[x + b(a + b)/x] = (a + b) − b = a
⇒ t = a/(x + b(a + b)/x) = ax/(x² + b(a + b)).

Differentiating t with respect to x yields

dt/dx = ([x² + b(a + b)]a − ax·2x)/[x² + b(a + b)]² = (ab(a + b) − ax²)/[x² + b(a + b)]².

Since the denominator on the right-hand side is positive,

dt/dx = 0 if and only if ab(a + b) − ax² = 0
⇒ ab(a + b) = ax²
⇒ b(a + b) = x², since a ≠ 0.

Since x > 0, x = √(b(a + b)).

To show this turning point is a maximum, note:

x < √(b(a + b)) ⇒ x² < b(a + b) (since x > 0) ⇒ ax² < ab(a + b) ⇒ ab(a + b) − ax² > 0 ⇒ dt/dx > 0.

Similarly x > √(b(a + b)) ⇒ dt/dx < 0, i.e., t = tan θ is a maximum at x = √(b(a + b)).

⇒ θ is a maximum at x = √(b(a + b)), since θ increases when tan θ increases.

ANGLES IN SNOOKER
The game of Pool, popular among young people, and similar games like Snooker and Billiards lend themselves to interesting uses of elementary geometry. The main objective in these games is to strike a billiard ball with another ball in order to direct it into one of the pockets at the side of the table. Problems arise when the ball to be struck is hidden behind another ball, so that the cue ball must be deflected off the side of the table into the path of the ball to be struck. The cue ball may need to be deflected off more than one side depending on the configuration of the balls on the table. Any text in transformation geometry would prove useful during this phase, e.g. Jeger (1968). The simplest case is illustrated in Figure 7. The sides of the table are labelled a, b, c, d. C is the cue ball and O is the ball to be struck by the cue ball. The other balls are not shown in the diagram. The cue ball C must strike the side a at some point P so that when deflected it strikes the object ball O. Now in physics there is a law of reflection which states that, under ideal conditions, when an object is reflected the angle of incidence is equal to the angle of reflection.
Figure 7. Snooker shot off one side cushion.
Figure 8. Law of reflection.
In Figure 8 the angle of incidence is ∠APC and the angle of reflection is ∠BPC, where PC ⊥ DE, so that ∠APC = ∠BPC. This implies that ∠APD = ∠BPE, which is more useful for our purpose.
Figure 9. Construction to locate position of P.
The problem now is to find P given that ∠CPA = ∠BPO (Figure 9). Extend CP to meet, at Oa, the perpendicular from O to side a, which meets AB at E. Then

∠OaPB = ∠CPA = ∠BPO
⇒ ΔOaPE ≡ ΔOPE
⇒ OaE = OE,

i.e., Oa can be thought of as the reflection of O in side a. This then gives another method to find P. Simply find Oa first and then join C to Oa, meeting AB at the required point P. In practice you can imagine a virtual ball at Oa which must be struck by the cue ball C. This will result in striking ball O. An alternative analysis is as follows. Draw CF ⊥ AB. Then ΔCFP is similar to ΔOEP, so that

FP/PE = CF/OE,

i.e., FE is divided in the ratio CF : OE. This method is not practical in actual play, but for teaching purposes it does introduce the relationships between similar triangles.

Exercise: In Figure 9, if CF = 96 cm, OE = 75 cm and FE = 228 cm, calculate the distance FP.

Teaching Note

Professional snooker players often make shots using multiple cushions. 'What if' considerations may now be employed by students to extend the analysis in interesting ways, e.g. consider similar shots involving 2, 3 or more cushions. Other questions may occur to students that can be investigated. For example students might come to conjecture that the cue ball travels the least possible distance in all such cases. Is this true? Can you prove it? These situations are dealt with in the following sections for the benefit of the mathematics teacher.

Two Cushion Shots

Now consider the case where the cue ball must be reflected off two sides a and b (Figure 10).
Figure 10. Snooker shot off two side cushions.
To analyse this case we reduce it to two cases of reflection off one side. First imagine the cue ball is at P1 (yet to be determined), needing to strike ball O by reflecting off side b. As in section 1 above, find the reflection Ob of O in side b. Then joining P1 to the virtual ball at Ob gives the point P2. But of course you do not know point P1 on side a. However, the problem is now reduced to reflecting the cue ball C off side a to strike the virtual ball at Ob, and again this has been solved in section 1. So find the image Oa by reflecting the virtual ball Ob in side a. Then joining C to Oa gives the point P1, which in turn will reflect to the point P2 and thence to O. To construct the path of the cue ball C, therefore, first find Ob, the reflection of O in side b, then find Oa, the reflection of Ob in side a. Join C to Oa, meeting side a at P1; then join P1 to Ob, meeting side b at P2; finally, joining P2 to C completes the path.

Exercise: Prove CP1 // OP2.

n-Cushion Shots

The method of the preceding section can now be extended to deal with reflections off any number of sides. The path for reflections off three sides a, b and c in turn is shown in Figure 11. Find Oc, Ob and Oa in turn; then join C to Oa giving P1, P1 to Ob giving P2 and P2 to Oc giving P3. A computational sketch of this mirror construction follows Figure 11.

Exercise: Sketch the path for reflections in turn off sides a, b, c, and d.
Figure 11. Snooker shot off three side cushions.
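As promised above, here is a computational sketch, mine rather than the chapter's, of the mirror construction for an axis-aligned table. The coordinates, table size and cushion encoding are invented for illustration.

```python
# Mirror construction for multi-cushion shots on an axis-aligned table:
# reflect the object ball in each cushion in reverse order, then intersect
# straight lines to recover the bounce points P1, P2, ...
# A cushion is encoded by the coordinate it fixes, e.g. ("y", 0.0) is the
# cushion along the line y = 0.

def reflect(p, cushion):
    axis, c = cushion
    x, y = p
    return (2 * c - x, y) if axis == "x" else (x, 2 * c - y)

def bounce_points(cue, obj, cushions):
    """Return the points where the cue ball meets each cushion in turn."""
    images = [obj]                        # mirror obj in cushions, last first
    for cushion in reversed(cushions):
        images.append(reflect(images[-1], cushion))
    images.reverse()                      # images[i] = aiming target at step i
    points, start = [], cue
    for cushion, target in zip(cushions, images[:-1]):
        axis, c = cushion
        (x0, y0), (x1, y1) = start, target
        # Intersect the straight segment start -> target with the cushion line
        t = (c - x0) / (x1 - x0) if axis == "x" else (c - y0) / (y1 - y0)
        hit = (x0 + t * (x1 - x0), y0 + t * (y1 - y0))
        points.append(hit)
        start = hit
    return points

# One-cushion example on a 2 m x 1 m table: bounce off the bottom cushion
print(bounce_points(cue=(0.3, 0.4), obj=(1.5, 0.6), cushions=[("y", 0.0)]))
# -> [(0.78, 0.0)], the point P found by aiming at the virtual ball (1.5, -0.6)
```

Passing two or three cushions in order reproduces the P1, P2, P3 construction of Figures 10 and 11, and students can check each bounce against a GeoGebra drawing.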
Shortest Path (Cue Ball Travels Least Possible Distance)

An interesting aspect of the problem is that in any given situation the cue ball C travels the least possible distance to reach the object ball O. In Figure 12 let P′ be any point on side a other than P. Join CP′, OP′ and OaP′.
Figure 12. Shortest path calculations.
From section 1,

ΔOaPE ≡ ΔOPE ⇒ OaP = OP.

Similarly, OaP′ = OP′. So

CP + PO = CP + OaP < CP′ + OaP′ (triangle inequality) = CP′ + P′O,

i.e., CP + PO < CP′ + P′O.

Further Investigations

Students will find that GeoGebra is a very suitable vehicle for pursuing these and other geometric investigations.
1. Sketch the path for reflecting off sides a and c in turn.
2. Sketch the path for reflecting off sides a, c, a, and c, in turn.
3. Try sketching the path that will return the cue ball to its starting position (ignoring O) under different sequences of reflections.
4. Investigate whether it is always possible to construct a path from C to O under specific given initial conditions.

FINAL REMARKS
These problems and investigations offer real opportunities for student engagement in the mathematics classroom. I have no doubt that mathematics teachers and students who use them will find interesting extensions and a lot of enjoyment working with them. This approach exploits students' cultural propensity to play and, after a fashion, engage in mathematical activity, together with the motivational power of context for teaching and learning mathematics. By working through the investigations and problems students are invited to marshal their resources, investigate, experiment, specialize and generalize, conjecture and prove, and in other ways experience mathematics as mathematicians do. And there were surprises – an optimum angle for the kicker, and a path of least possible length for the cue ball!

REFERENCES

Bishop, A. J. (1991). Mathematical enculturation: A cultural perspective on mathematical education. Dordrecht: Kluwer Academic Publishers.
Jeger, M. (1968). Transformation geometry. London: Allen and Unwin.
Jim Leahy Department of Mathematics and Statistics University of Limerick
JUERGEN MAASZ
13. INCREASING TURNOVER? STREAMLINING WORKING CONDITIONS? A POSSIBLE WAY TO OPTIMIZE PRODUCTION PROCESSES AS A TOPIC IN MATHEMATICS LESSONS
INTRODUCTION
Mathematics is applied in various ways in both everyday life and the working world. In this chapter, I aim to bring parts of the working world into the classroom by means of Multi-Moment Recording. More precisely, the students should become acquainted with Multi-Moment Recording as a tool to understand and record production processes more effectively. The main focus of this chapter is not on deducing the formula used, but simply on using it. For teachers who do not like this degree of reality in a mathematics lesson there are two helpful excurses, one on checking its plausibility and practicality, and one on its mathematical origin. Furthermore, I plan on simulating and optimizing a simple production process (production of paper planes) in order to learn about mathematics as a means of presenting and communicating facts – aspects that are rarely addressed at school.

OUTLINE: SUGGESTED LESSON
I will start this contribution with a short outline of the proposed teaching sequence and some preliminary remarks.

Preliminary Remarks
First preliminary remark: Since it has often been shown that learning is facilitated by active participation in class, I would like to suggest a lesson that might seem somewhat unusual at first glance. The following description is meant to provide stimulus and support for the actual implementation of the suggested lesson. My proposal includes hints on teaching methods that I think are appropriate. This does not mean that I know better than you how to teach: you, as a teacher, are the best expert on your own teaching. I just want to show how this unusual proposal could actually work. The second preliminary remark refers to the question concerning the appropriateness of the topic for the classroom. Should the optimization of production processes be addressed at school? Wouldn't such a topic mean meddling with social or trade-union related affairs? The answer to the first question is a clear YES,
because practical relevance is a basic requirement in all current school curricula. The second question relates back to a crucial issue of lessons which are close to reality: one of the major curricular objectives concerns the development of students' ability to judge, that is, the ability to form a personal opinion based on critical reflection and the ability to justify these opinions successfully. Effective instruction meets these objectives. In other words, "effective" does not mean that the teacher summarizes his or her opinion in a few words in order to keep this non-mathematical part of the mathematics lesson very short; the teacher helps the students learn to make their own decisions, using mathematics to make those decisions more rational. The third preliminary remark concerns an objection frequently voiced by teachers, saying that instruction close to real life is useful but time-consuming and hardly practical; it does not fit into day-to-day school life, which is dominated by time pressure and other curricular demands. However, traditional mathematics lessons, which dedicate much time to operative work including practicing calculation methods, have proven ineffective. That is, fostering active project work during lessons is time well spent indeed.

OUTLINE OF THE LESSON
Presenting the development of industrial production, i.e., the transition from handicraft to manufacture and on to current car production by means of industrial robots and production islands, is a good lesson opener and motivation. A film such as Charlie Chaplin's 'Modern Times', which presents assembly line work from a very specific perspective, could complement such a lesson opener. If an interdisciplinary approach in combination with (economic) history is adopted, the students will get an insight into the development of productivity. At the same time, students will realize that today assembling a car is impossible for an individual: mass production of automobiles is impossible without division of labour, and specializations are fundamental prerequisites. The second step is a paper plane construction contest. Who can produce the best paper plane? Who is fastest? How long does it take? Of course, other simple objects can be produced as well. Paper planes, however, are particularly suitable because they require very little material (a piece of waste paper or spare photocopies, which are easily available in schools), they are easy to fold (no scissors, glue or other material to fix component parts are required) and students can discuss the best folding instructions. Google offers 12600 sites related to paper plane construction, one of which, for example, shows a youtube video: http://www.tippsundtricks24.de/heimwerken/do-it-yourself/anleitung-papierflieger-saebelzahntiger-basteln.html. Furthermore, there is a simple quality check for every paper plane: a test flight! At this point it seems useful to note that students should decide on a code of conduct, for example, how to cooperate with others, how to use the finished products, and where in the classroom to establish the designated zone to test the paper planes. After evaluating the results, the teacher sets an additional assignment: How can we all together produce the largest number of paper planes in the shortest possible time? Imagine we want to produce and sell them. The knowledge gained in the initial phase of the lesson suggests that the key to success is division of labour. However, another important precondition needs to be
addressed first. What might an optimal design look like? What is optimal under the given circumstances? First, we need to set appropriate criteria. Such criteria might include:
1. The paper plane needs to be capable of flying.
2. It should look good.
3. Folding it should be easy.
4. The production requires only folding but not gluing.
From this, the first conclusions can be drawn, which can in turn serve as arguments for such criteria. For example, crumpled paper is not effective because it neither looks nice nor will the plane have the desired flight qualities. How do the students in the class find the optimal design? Let's organize a competition. Groups of three develop a proposal. All proposals are gathered, presented in class and discussed. Finally, students try to reach a consensus and jointly agree on one proposal (however, nobody is allowed to vote for their own proposal). In order to illustrate the suggested lesson, I have decided to describe the following part of the lesson by means of a specific model of a paper plane, using the internet, which offers a large number of step-by-step instructions as well as videos on youtube, for example this one: http://www.youtube.com/watch?v=y5ebnviXegc&feature=player_embedded My understanding is that the paper plane model 'arrow' meets the four criteria mentioned above. Ultimately, however, I have chosen this model because it requires a relatively small number of work steps. The video, which runs 1 minute and 26 seconds, is considerably shorter than other videos, which usually run for three minutes. This new criterion (length of the video) is mentioned here to indicate that considerations taken in the initial phases of the lesson might need revision in the course of the project. Let's consider the chosen paper plane in more detail (Figure 1):
Figure 1. Screenshot from http://www.youtube.com/watch?v=y5ebnviXegc&feature=player_embedded
The third step is a test run: How many paper planes of this type can we produce together in five or ten minutes? The test run probably brings about a problem of quality control: Do the folded paper planes actually resemble the model? At this point we have reached the question of improving the speed and quality of the production process. That is, we are back to organization and division of labour.

Let's do a simple test in class. After having agreed on a model and construction plan, we fold as many planes as possible – for about ten to fifteen minutes. Then we plan the division of labour: every student performs one fold and hands the paper over to the next station, where the next fold is made – until the plane is finished at the last workstation. Then another test run and we're done! In order to plan the division of labour, a precise analysis of the instructions is needed. I will demonstrate this using the 'arrow' model. We'll watch the video carefully, take a piece of paper and fold along with the video.
1) Take a piece of paper.
2) Fold down the centre.
3) Make sure the crease is straight and sharp.
4) Open it out again.
5) Fold in the top left hand corner to the centre line.
Figure 2. Screenshot from http://www.youtube.com/watch?v=y5ebnviXegc&feature=player_embedded
6) Fold in the second top corner to the centre line (Figure 2).
7) Fold the first corner again.
Figure 3. Screenshot from http://www.youtube.com/watch?v=y5ebnviXegc&feature=player_embedded
8) Fold the second corner again.
9) Fold the two sides again along the centre line (Figure 3).
10) Indicate web width.
11) Fold the left wing (Figure 4).
12) Fold the right wing.
Figure 4. Screenshot from http://www.youtube.com/watch?v=y5ebnviXegc&feature=player_embedded
13) Fold the edge of the left wing.
14) Fold the edge of the right wing.
15) Flight test.

Why does this plane fly while others don't – or at least not as well as this design? At this stage, I would like to point out that interdisciplinary and cross-curricular lessons in cooperation with physics might be useful. We could, for example, investigate why we should fold the front section of the paper more than once. The lesson is now beginning to get exciting. Once the division of labour has been discussed and organized by the students (who is sitting where, doing what?), another five-minute production phase follows. Can we produce a larger number of airworthy planes?

In step four, the optimization of production based on division of labour begins. How can we become even faster and more efficient? This question is raised in every real assembly hall every day. We look for answers and find many of them on the internet. As we are talking about mathematics lessons here, I have chosen those approaches which are strongly connected to mathematics. (Motivation or pressure would be examples of alternative approaches.) I am going to focus on Multi Moment Recording here, a process to investigate how long every work step takes in relation to the total labour time. Multi Moment Recording is a sampling procedure to determine the frequency of occurrence of predefined phenomena. A number of momentary observations of work systems are collected without involving the observed person in an active way, for example, by asking for information or interrupting them in other ways. The organization manual of the German Federal Ministry of the Interior describes the Multi Moment Method (http://www.orghandbuch.de/nn_414926/OrganisationsHandbuch/DE/6__MethodenTechniken/61__Erhebungstechniken/616__Multimomentaufnahme/multimomentaufnahme-node.html?__nnn=true). I translate and paraphrase it in the following way:

The Multi Moment Method is a statistical method to find out how often different parts of a working process occur. This is done by observing the working process after defining and separating single work steps. If the work in progress is observed several times (often enough), this gives an exact image of what the working process is. This is what you need in order to optimize it.

This organisation manual provides a process description which can be used for a research design in the classroom based on the Multi Moment Method to investigate the production of paper planes. The following plan could be implemented. The class is divided into several groups, most of which produce paper planes. Two groups conduct Multi Moment Recordings of the ongoing production by describing in detail the various work steps. Then, the students create an observation plan, define the number of observations and plan the observation tour, i.e., a list of times at which the various work stations will be observed. What does that mean in our example, i.e., the production of 'arrow' paper planes? I have identified and listed a total of 15 work processes. It seems obvious to take
these 15 processes as a starting point. One question that arises fairly quickly is how to record the handing over of the unfinished planes. Let's assume that the workstations are arranged in a line (the tables should be arranged in a row, if possible). In this way, the planes can readily be handed over to the adjacent workstation. The usefulness of this arrangement becomes obvious when the distances between two tables are increased so that transporting a plane to the nearest workstation takes at least 10 seconds: 14 transports would amount to 140 seconds, or 2 minutes and 20 seconds of transit time. This is significantly longer than the time needed to fold a plane. We would like to keep the experiment rather simple and assume a transit time of 0 seconds: all students working at the assembly line simply hand the plane over to the next student.

However, another problem will soon become apparent: not all processes take equally long. The station that takes longest will cause a blockage. Which one is it? Are the differences so great that we need to provide intermediate storage facilities or a different plan of work steps? We are going to address this problem by recording the video time:

Process number | Time in seconds
1) Take a piece of paper | 6
2) Fold down the centre | 7
3) Make sure the crease is straight and sharp | 3
4) Open it out again | 2
5) Fold in the top left hand corner to the centre line | 6
6) Fold in the second top corner to the centre line | 7
7) Fold the first corner again | 6
8) Fold the second corner again | 9
9) Fold the two sides again along the centre line | 6
10) Indicate web width | 1
11) Fold the left wing | 6
12) Fold the right wing | 8
13) Fold the edge of the left wing | 5
14) Fold the edge of the right wing | 5
15) Flight test | 10 (?)
Total (processes 1–14): 77 seconds
plus 10 seconds of testing time for the trial run. The remaining time, adding up to the total length of the video, is used to present the current state of production. Now it's about time for a moment of reflection. The various work steps obviously take different amounts of time. Steps 4 and 10, for example, are relatively short. In addition, even similar work steps differ in the required amount of time, which can also be due to measurement error (imprecise observation). I will, therefore, merge some work steps in order to improve the conditions for the first attempt at Multi Moment Recording. The merging of individual work steps aims to approximate equal production times for every work station.
Station (new) | Processes (old) | Duration
I | 1 to 5 | 24 seconds
II | 6 to 8 | 22 seconds
III | 9 to 12 | 21 seconds
IV | 13 to 15 | 20 seconds
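Readers who want to experiment can check this grouping – and generate alternatives – with a few lines of code. The following is my own sketch (not part of the original lesson); the greedy rule is only one simple heuristic for this small line-balancing problem.

```python
step_times = [6, 7, 3, 2, 6, 7, 6, 9, 6, 1, 6, 8, 5, 5, 10]  # seconds, from the video

def station_totals(groups):
    """Sum the step times for each station, given 1-based inclusive step ranges."""
    return [sum(step_times[a - 1:b]) for a, b in groups]

print(station_totals([(1, 5), (6, 8), (9, 12), (13, 15)]))  # -> [24, 22, 21, 20]

def greedy_grouping(target):
    """Merge consecutive steps until adding the next one would exceed `target` seconds."""
    groups, start, total = [], 1, 0
    for i, t in enumerate(step_times, start=1):
        if total + t > target and total > 0:
            groups.append((start, i - 1))
            start, total = i, 0
        total += t
    groups.append((start, len(step_times)))
    return groups

print(greedy_grouping(24))  # happens to reproduce exactly the grouping in the table
```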
In this first attempt I have merged subsequent processes in order to have work phases of approximately equal length. I hope that the differences remain small and suggest that students have a first trial run. This corresponds with the standard procedure of modelling: the model assumptions can easily be modified or refined in a second attempt (Siller 2008).

For step five we need more information about the method and the procedure of Multi Moment Recording. We have four stations and plan to measure the proportion of working time at each station in relation to the total working time. This relative proportion should be determined with precision e with a probability of 95% (that is, a confidence interval of 95% with a range of 2e). How many measurements (observations) do we need? The following formula has been found:
n = z² · p · q / e²   (with q = 1 − p)

(source: http://bios-bremerhaven.de/cms/upload/Dokumente/Erfa-PBE/mma_methode.pdf)
At this point we have arrived at a very important decision:
– If we teach as usual, we now start to analyze this formula and go into a statistics lesson.
– If we behave more like people in the real world do, we make some plausibility tests with this formula and use it afterwards.
– If we do what most people in the real world do, we simply believe that this formula is correct and use it.
I ask you to choose the third way: teaching real-world mathematics should include this way, too. This is not an argument for always doing so. My didactical argument is that students should also learn at school how mathematics is actually worked with in the real world. This will give them a better view of mathematics itself and of the relation between mathematics and the world around them.
Excursion 1: Some Hints About the Mathematical Background

For all teachers who prefer to go into a statistics lesson, I will now give some hints about the background of this formula. Where has this formula been taken from and what does it consist of? It is a formula used in probability calculations, or more precisely the normal distribution. Let's take a specific work process or a work station (I will use the term work process from now on). Suppose we want to determine the current percentage of time that a specific work process takes. How many times should we observe the process in order to be 95% confident that the estimated (sample) proportion is within e percentage points of the true proportion? To do so, we make random observations of the construction process of n different planes and check whether what we observe is the process we have defined as our focus or a different one. Let's translate the variables:
– p … is the true proportion of time a specific work process takes in relation to the whole production (= the probability of success in a single trial); q = 1 − p
– n … is the total number of observations (number of repeated trials)
– X … is the random variable that measures the number of times out of n observations we find the paper plane in this particular process
– The random variable X is a binomial random variable and has a binomial distribution with parameters n and p: X ~ B(n, p)
– X is approximately normally distributed; this means an approximation to B(n, p) is given by the normal distribution with mean value E(X) = n·p = μ and standard deviation σ(X) = √(n·p·q)
Confidence intervals can be calculated for the true proportion. The underlying distribution is binomial. We estimate p (the area under the normal distribution curve) with 95% confidence (a 95% confidence interval). For ρ = 0.95 it follows that α = 1 − ρ = 0.05, so we get the area

1 − α/2 = 0.975
For a value of 0.975 the z-score is equal to 1.96. 1.96 is the approximate value of the 97.5 percentile point of the normal distribution. 95% of the area under a normal curve lies within roughly 1.96 standard deviations of the mean. This number is therefore used in the construction of approximately 95% confidence intervals.
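For teachers who want students to verify this value rather than look it up, a short check is possible with Python's standard library (a sketch; statistics.NormalDist requires Python 3.8 or later):

```python
from statistics import NormalDist

rho = 0.95                               # desired confidence level
alpha = 1 - rho                          # 0.05
z = NormalDist().inv_cdf(1 - alpha / 2)  # 97.5th percentile of the standard normal
print(z)                                 # -> 1.959963..., i.e. about 1.96
```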
To form an estimated proportion, we take the random variable X and divide it by n, the number of trials (observations). In this way we derive an approximation for p. The estimated proportion X/n is itself a random variable.

For this proportion, the mean is
E(X/n) = (1/n)·E(X) = (1/n)·n·p = p, denoted as μ̂ = p.

For this proportion, the standard deviation is
σ(X/n) = (1/n)·σ(X) = (1/n)·√(n·p·q) = √(p·q/n), denoted as σ̂ = √(p·q/n),
where we use σ(c·X) = c·σ(X) for c ≥ 0.

The estimated proportion X/n therefore follows, approximately, a normal distribution for proportions:
X/n ~ N(p, p·q/n)
For a confidence level of ρ = 95% the confidence interval has the form

μ̂ − σ̂·z ≤ X/n ≤ μ̂ + σ̂·z

with

P(μ̂ − σ̂·z ≤ X/n ≤ μ̂ + σ̂·z) = Φ(z) − Φ(−z) = 2·Φ(z) − 1 = 1 − α = ρ.

The range of the interval is 2·σ̂·z. If the desired precision is e (half the range), then σ̂·z = e.
Solving for n gives us an equation for the sample size:
σ̂·z = e
√(p·q/n)·z = e
√(p·q/n) = e/z
p·q/n = e²/z²
n = z²·p·q / e²

For a confidence level of 95%, z = 1.96, so

n = 1.96²·p·q / e²
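As a minimal sketch (my own illustration, not part of the source), the formula can be wrapped in a small function:

```python
def sample_size(p, e, z=1.96):
    """Observations needed to estimate a proportion p to within +/- e at ~95% confidence."""
    q = 1 - p
    return z ** 2 * p * q / e ** 2

print(sample_size(0.5, 0.05))  # worst case p = 0.5: about 384.2 observations
```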
This tells us how many times we should observe the process: with n observations we can be 95% confident that the estimated (sample) proportion is within e percentage points of the true proportion of time that a specific work process takes.

Excursion 2: Some Ideas About the Plausibility of the Formula

Did we find a suitable formula? How can we examine the formula in more detail? We will study the formula with a software program. As we are working with percentages, we take p, the true proportion of time a specific work process takes in relation to the whole production, times 100: 100·p = x.
We plot the graph to see how n changes as p increases from 0 to 1, i.e., as x increases from 0 to 100 (we know, of course, that less than 0% or more than 100% would not make sense). On the first axis: percentage x; on the second axis: number of observations n.
We obtain a parabola with maximum value at p = 50%. If a work process takes very much or very little of the time, the number of needed observations shrinks to zero (p = 0% or 100%). This is obviously correct: if a work process takes none or all of the time, we don't need any observations. What effect does our claim for more or less accuracy of measurement have? For example, in our video we measured a time of 6 seconds for work process number 5. 6 out of 87 seconds is 6.9%, so p = 0.069. For accuracy of measurement we claim a 5% margin of error (or 10% or 1%), that is, e = 0.05 (or 0.1 or 0.01). Now we use these numbers in our formula:
n = 1.96² · 0.069 · (1 − 0.069) / 0.05² = 98.7 (for accuracy of measurement of 5%)
n = 1.96² · 0.069 · (1 − 0.069) / 0.1² = 24.68
(that is, 25 for accuracy of measurement of 10%)
n = 1.96² · 0.069 · (1 − 0.069) / 0.01² = 2468 (for accuracy of measurement of 1%)
We plot the graph again: we choose p = 0.069 and let e vary. On the first axis: accuracy of measurement e; on the second axis: number of observations n.
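The three computations above can be reproduced directly (a quick check using only the formula already derived):

```python
z, p = 1.96, 0.069
for e in (0.05, 0.1, 0.01):
    n = z ** 2 * p * (1 - p) / e ** 2
    print(f"e = {e}: n = {n:.1f}")
# e = 0.05: n = 98.7
# e = 0.1:  n = 24.7
# e = 0.01: n = 2467.8
```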
Again, our graph seems plausible: if we ask for greater accuracy, the number of needed observations grows – without bound. We end our excursion with the conclusion that our formula seems accurate.

ad 3) Back to Step Five: What now? Almost 2500 observations in one lesson – that's impossible. Even 25 observations, which allow a precision of only 10%, are quite a lot. It seems we have to plan more precisely! How long does an observation take? We have to consider what is to be observed and in which manner. One group of students should plan observations of the work stations. If, for example, there are 26 students in a class and six groups of four students each work at the four stations, 2 students remain to make observations from a suitable position. One of the two observes and dictates (for example, "group 6, station III") and the other
one takes notes. In a class of 28 students, two teams can make observations independently of each other. If more teams observe simultaneously, higher precision can be achieved. Furthermore, the interval between two observations should ideally be long enough to allow the completion of one paper plane. However, as the brevity of one lesson does not allow this, we will choose a shorter interval.

At the end of step five we make a first attempt: paper plane production and observations. This means 10 minutes of production with an observation every 10 seconds (i.e., 6 times per minute), 6 × 10 in total, which makes 60 observations, then discussion of results. That's the plan. In reality, however, there will always be complications and errors, which will have to be dealt with as they arise. For the next step we obtain the following measurement report:

Number of measurement | Group | Station | Work process
1 | 1 | I | 1
2 | 2 | I | 2
3 | 3 | II | 6
4 | 4 | II | 7
… | … | … | …
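Before trying this in class, the whole procedure can also be rehearsed in a small simulation. The following sketch is my own illustration (the number of lines, the random phase offsets and the observation times are assumptions): it samples six production lines every 10 seconds and tallies how often each station is observed, which should come out close to each station's share of the 87-second cycle (e.g. 24/87 ≈ 28% for station I).

```python
import random

durations = {"I": 24, "II": 22, "III": 21, "IV": 20}  # station times in seconds
cycle = sum(durations.values())                        # 87 s per plane

def station_at(t):
    """Station occupied at second t of a plane's cycle (stations run in order)."""
    t %= cycle
    for name, d in durations.items():
        if t < d:
            return name
        t -= d

offsets = [random.uniform(0, cycle) for _ in range(6)]  # six lines, random phases
counts = {name: 0 for name in durations}
for tick in range(0, 600, 10):                          # 10 minutes, every 10 s
    for off in offsets:
        counts[station_at(tick + off)] += 1

total = sum(counts.values())
for name, c in counts.items():
    print(name, round(100 * c / total, 1), "%")         # compare with 24/87, 22/87, ...
```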
In step six, the results of the first measurement will be analyzed. For the purpose of this chapter I have compiled the following table and chosen the values in such a way that they can be used effectively in the next step.

Work process number | Duration in seconds (video) | Percent (video) | Percent (measured)
1 Take a piece of paper | 6 | 8 | 7
2 Fold down the centre | 7 | 8 | 8
3 Sharpening the crease | 3 | 3 | 2
4 Re-opening | 2 | 2 | 2
5 Fold the first corner | 6 | 7 | 6
6 Fold the second corner | 7 | 8 | 8
7 Fold the first corner again | 6 | 7 | 9
8 Fold the second corner again | 9 | 10 | 10
9 Fold up the two sections | 6 | 7 | 8
10 Determine web width | 1 | 1 | 3
11 Fold the left wing | 6 | 7 | 8
12 Fold the right wing | 8 | 9 | 8
13 Fold the edge of the left wing | 5 | 6 | 6
14 Fold the edge of the right wing | 5 | 6 | 6
15 Flight test (not shown in the video) | 10 | 11 | 8
TOTAL | 87 | 100 | 100
What does the table indicate? The results correspond approximately to the values obtained from the video as well as to the expected values. Their meaning is in the detail. Step 10, for example, accounts for three percent instead of one. Such seemingly small differences can have severe effects on production. The measurement should be as precise as possible because the results will have implications for work organization and improvement of the output. If such measurement and reorganisation in real-life production can lead to an increase in profit of one or more percent, it is well worth the measuring effort.

There are a number of options for the remaining lesson. One option is to end the lesson with the conclusion that the division of labour obviously results in an increase in profit; production becomes faster and more efficient. To increase the efficiency it is important to know exactly how long each work process takes. One suitable method for finding out the exact length of each working step in the process makes use of statistics and is called Multi Moment Recording.

If the students decide on yet another step (step seven), the question of how this instrument is used in practice to determine labour time could serve as the motivation for this step. The starting point for such considerations, using internet research (discussion forums), could be the question of how to deal with breaks during work. In our example, breaks have been ignored so far; everyone works as fast as possible, right? However, everyone knows that in real life breaks are essential – and there is often disagreement about when and how long breaks should be. I would like to include these considerations in our suggested lesson. One possible option is to take a 30 (or 10) second break after completion of every single paper plane. Thus, work process number 16 is a 30-second break. This will probably be reflected in another round of measurement.

Let's check the paper plane production in class. What's going to happen? As before, the work processes are measured again – this time including breaks. One thing is crucial: the workers whose work processes are measured can influence the measurement results by intentionally taking (or not taking) a break exactly while being observed. However, this kind of intentional influence is possible only if the time of measurement is known or predictable. What can be done if this form of influence is to be avoided? After all, the purpose of this measurement is to increase the efficiency of production.

Do you think students will consider random measurement? This idea does not seem too far-fetched – it is one of many opportunities for students to discover and try out ideas for themselves. How can this idea be realized? Obviously, the moments of measurement play an important role. If measurement takes place at irregular intervals, chance comes into play: the moments of measurement are determined by a random generator (such as a die). If initially observations were made every 10 seconds, you can try the following this time: throw a die and alternately add and subtract the scores to and from 10, respectively. Furthermore, we agree to take the value 10 every time the die shows a score of 6. In this way we obtain a sequence of numbers such as, for example, 12, 8, 15, 11, … Now the class can discuss, try out and decide. Finding a series of random measurement moments can be exciting. Will the measurement of work processes be effective this time? Will the measurements be accurate? And to what extent will attempts to influence the process actually be restricted?
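One possible reading of this dice rule, as a sketch (the chapter's own example sequence suggests the rule was applied somewhat loosely):

```python
import random

def observation_intervals(count):
    """Alternately add/subtract the die score to/from 10 s; a 6 counts as the value 10."""
    intervals, add = [], True
    for _ in range(count):
        score = random.randint(1, 6)
        value = 10 if score == 6 else score
        intervals.append(10 + value if add else 10 - value)
        add = not add
    return intervals

print(observation_intervals(6))  # e.g. [12, 8, 15, 9, 13, 7]
```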
The practice of real-life measurement in manufacturing shows that random variables are indeed used for this purpose. This provides a further argument for reflection and discussion in class.

In the final, eighth step, the entire teaching sequence should be reflected upon. What have we learned? Which questions remain unanswered? What follows from it? In any case, students have gained a brief insight into the real working world. The paper plane production has given the students some understanding of the historical development of the production of goods, ranging from manual production to manufacturing and assembly line production supported and optimized by mathematics.

Perhaps students or parents have had critical questions concerning this interdisciplinary and cross-curricular mathematics lesson: Is this really mathematics? There are hardly any calculations! My answer is YES! Mathematics is much more comprehensive than just solving a given arithmetic problem; it is a means of describing and shaping the world. Mathematics can help to organize and communicate interrelations (see R. Fischer). I consider it essential that mathematics lessons at school address these socially relevant aspects of mathematics. The application of mathematics in social contexts clearly shows that it is not at all neutral and value-free, particularly with respect to the selection of aspects to be modelled, the setting of goals, and the application of the results (Maaß 1990 and 2008).

Maybe the reflection in class takes a different course and there is disagreement on whether a production process should be optimized in the first place. From my point of view, increases in productivity – in the number and quality of manufactured goods per time unit – are the basis for growing social prosperity and wealth. How this prosperity is distributed and who benefits from it is a different question. The evaluation of this increase in productivity will depend on whether or not somebody benefits from it. Those who believe that this kind of progress can only be achieved at the expense of personal health, due to the high intensity of labour, the kind of work itself or staff reductions resulting in unemployment, will certainly view this increase in productivity negatively.

REFERENCES

http://en.wikipedia.org/wiki/Methods-time_measurement
http://www.rsscse.org.uk/ts/
Fischer, R. (2007). Technology, mathematics and consciousness of society. In U. Gellert & E. Jablonka (Hrsg.), Mathematisation and demathematisation: Social, philosophical and educational ramifications (pp. 67–80). Rotterdam: Sense Publishers.
Fischer, R. (2006). Materialisierung und Organisation. Zur kulturellen Bedeutung der Mathematik. Wien/München: Profil.
Maasz, J. (1990). Mathematische Technologie = sozialverträgliche Technologie? Zur mathematischen Modellierung der gesellschaftlichen "Wirklichkeit" und ihren Folgen. In R. Tschiedel (Hrsg.), Die technische Konstruktion der gesellschaftlichen Wirklichkeit. München: Profil-Verlag.
Maasz, J. (2008). Manipulated by mathematics? Some answers that might be useful for teachers. In V. Seabright, et al. (Eds.), Crossing borders – Research, reflection and practice in adults learning mathematics. Belfast/Limerick.
Siller, H.-St. (2008). Modellbilden – eine zentrale Leitidee der Mathematik. Aachen: Shaker Verlag.
A. Univ. Prof. Univ. Doz. Dr. Juergen Maasz
Universitaet Linz
Institut fuer Didaktik der Mathematik
Altenberger Str. 69
A-4040 Linz
JUERGEN MAASZ AND HANS-STEFAN SILLER
14. MATHEMATICS AND EGGS
Does this Topic Make Sense in Education?
INTRODUCTION
Eating "Easter eggs" is something many young people like very much. When we enjoyed eating eggs at Easter in the year 2010, we started to think about eggs from a mathematical point of view: eggs are very simple objects, but it is not simple to calculate their measures of interest. What is the volume and the surface area of a given egg? This is not typical mathematics lesson content in Austrian schools.

Our first approach to calculating eggs was a "scientific" one. We went to the library and found a very nice book by Hortsch (1990) with a lot of formulas. We tried to understand them, and then we used them. This is the content of the first part of this chapter. The second part of the chapter gives a much easier approach leading to a more practical solution. We started with the old and simple idea of Cavalieri: we divided the egg into several slices and made a model of the volume and the surface area. Each slice is treated as approximately a cylinder, and a tower of several cylinders approximates the egg. With the help of a spreadsheet, we think that younger students should be able to find a good approximation for the volume and the surface area.

Our first question is: Why should students be motivated (and not forced) to do this? Thinking about motivation led us to some research on the internet about the biggest eggs and other events involving eggs. This is the third part of our chapter, answering didactical questions around our proposal to learn modelling by looking at objects of daily life, like eggs.

LEARNING MATHEMATICS AND MODELLING BY CALCULATING EGGS – SOME ARGUMENTS FROM MATHEMATICS EDUCATION
Modelling of daily life objects is a topic which can be a motivating and fascinating access point to mathematics education. For this reason this chapter presents one such possibility: how an object which is common to all students in school, the egg, can be an item for an exciting discussion in schools. Based on two mathematical definitions a figure is constructed; from this, descriptions in polar coordinates and as implicit Cartesian equations are developed. This stepwise modelling cycle shows how modelling of daily objects can be done in class while satisfying the demands of the curricula.

The concept of modelling for education has been discussed for a long time. It is a basic concept in all parts of science and in particular in mathematics. This concept is a well accepted fundamental idea (cf. Siller, 2008); in this chapter the preliminary definition
of Schweiger (1992) is used: "A fundamental idea is a bundle of activities, strategies or techniques, which
1. can be shown in the historical development of mathematics,
2. are sustainable to order curricular concepts vertically,
3. are ideas for the question of what mathematics is, and for communicating about mathematics,
4. allow mathematical education to be more flexible and clear,
5. have a corresponding linguistic or activity-based archetype in language and thinking."

Therefore it is not remarkable that the concept of modelling can be found in many different curricula all over the world. For example, in the mathematics curriculum for Austrian grammar schools one can find a lot of references to it (cf. BMUKK, 2004). Interpreting those references, modelling can be seen as a process-related competence. That means:
– translating the area or the situation to be modelled into mathematical ideas, structures and relations,
– working with a given or constructed mathematical model,
– interpreting and testing results.
These process-related competencies have been described by many people, e.g. Pollak (1977), Müller and Wittmann (1984), Schupp (1987), Blum (1985). With regard to all the developments in modelling, Blum and Leiß (2007) have designed a modelling cycle which adopts a more cognitive point of view.
Real-life problems, such as problems related to the environment, sports or traffic, are often a starting point for calculations and applications of mathematics. But before using mathematics in such fields it is necessary that the problem is well understood. This demands a lot of time and dedication, because it is necessary to translate the problem from reality to mathematics and back to reality. Therefore models are used as an adequate description of the given situation. Modelling problems based on students' experiences means creating a picture of reality which allows us to describe complex procedures in a common way. Creating such an image has to observe two directions, as Krauthausen (2003) suggests:
– using knowledge for developing mathematical ideas,
– developing knowledge about reality by its reliance on mathematics.
If such problems are discussed in mathematics education it is possible that students will be more motivated towards mathematics. But there are many other arguments why such problems should be discussed. They
– help students to understand and to cope with situations in their everyday life and environment,
– help students to achieve necessary qualifications, like translating from reality to mathematics,
– help students to get a clear and straightforward picture of mathematics, so that they are able to recognize that this subject is necessary for living,
– motivate students to think about mathematics in an in-depth manner, so that they can recall important concepts even if they were taught a long time ago.
If a teacher takes the listed points into account, he or she will be able to find many interesting topics to discuss with students. As a useful example we want to present a problem in the realm of students' experiences by observing an egg.
The teaching and learning of mathematics works much better if students are intrinsically motivated to learn. Real-world problems should bring this type of motivation into the classroom. So we start with a little collection of real-world questions concerning eggs. You will find the answers at the end of this chapter, presented as examples for school.

An easy approach is to look at eggs made of chocolate. If we get an egg that seems to be as big as a hen's egg – how much chocolate is it made of? Is it cheaper to buy a normal chocolate bar if we have to pay for it? What is the price of the same type of chocolate in different forms? If teachers and students open their ears and eyes for "egg" questions they will find many of them. Here is one we heard in the news. A German company produces "Children Surprise Eggs" (Figure 1). Inside each of these eggs is a hidden toy, the surprise. The shell consists of chocolate.
Figure 1. “Children Surprise Egg”.
How much chocolate is the shell made of? If we compare the costs of chocolate and toys: Is it a good idea to buy such surprise eggs? A lot of real-world problems concerning eggs that are good for motivation in class can be found in daily media, like the internet. For example: What is the biggest egg in the world? (Figure 2) Where can you find it? Is there any mathematics involved in its construction?
Figure 2. Biggest Easter-egg 2008 (http://www.vol.at/news/tp:vol:special_ostern_aktuell/artikel/das-groesste-osterei-der-welt/cn/news-20080312-03572171)
Another interesting question might be the following: What is the biggest egg made of glass? (Figure 3) What is its weight? Is it solid?
Figure 3. Biggest egg made of glass (http://www.joska.com/news/news_ostern_2008)

THE EGG – STARTING POINTS FOR CALCULATIONS
If we have a closer look at an egg, we will see that its shape is very harmonious and impressive. Considering a hen's egg, it is obvious that the shape of all (hen's) eggs is essentially the same. Because of this fascinating shape, we tried to think about a method to describe the shape of such an egg with mathematical means. Searching the literature we found material by Münger (1894), Schmidt (1907), Malina (1907), Wieleitner (1908), Loria (1911), Timmerding (1928) and Hortsch (1990). The book by Hortsch is a very interesting summary of the most important results on 'egg-curves'. He also finds a new way of describing egg-curves by experimenting with known parts of 'egg-curves'. The approaches used by these authors to derive the 'egg-curves' are fascinating. But none of the above listed authors has thought
about a way to create such a curve using elementary mathematical methods. The ways in which these authors describe such curves are not suitable for mathematics education in schools. So we thought about a way to find such curves with the help of concepts well known in education. Our first starting point is a quotation found in Hortsch (1990): "The located ovals were the (astonishing) results of analytical-geometrical problems inside of circles." Another place to start is a definition of 'egg-curves' found by Schmidt (1907) and presented in Hortsch (1990).

Definition 1

Schmidt (1907) writes: "An 'egg-curve' can be found as the geometrical position of the base point of all perpendiculars to secants, cut from the intersection-points of the abscissa with the bisectrix, which divide (obtuse) angles between the secants and parallel lines in the intersection-points of secants with the circle circumference in halves. The calculated formula is r = 2·a·cos²ϕ or (x² + y²)³ = 4·a²·x⁴."

In education the role of technology is more and more important. Different systems, like computer algebra systems (CAS), dynamical geometry software (DGS) or spreadsheets, are now commonly used in education. With the help of technology it is possible to produce a picture of the given definition immediately. In the first part we use a DGS, because with its help it is possible to draw a dynamic picture of the given definition (Figure 4). First of all we construct one point P of such an egg-curve as it is given in the definition.
Figure 4. Translating the definition of Schmidt to the DGS.
According to the given instruction we first construct a circle (centre and radius arbitrary) and then the secant from C to A (points arbitrary). After that, a parallel line to the x-axis through the point A (= intersection point of secant and circle) is drawn, and the line that bisects the angle CAD can be determined, which crosses the x-axis.
So we get the point S. Now we are able to draw the perpendicular to the secant through S. The intersection point of the secant and the perpendicular is called P and is a point of the 'egg-curve'. We activate the "Trace on" function and use the dynamic aspect of the construction. By moving A along the circle, the 'egg-curve' is drawn as Schmidt described it. This can be seen in the following figure (Figure 5):
Figure 5. Egg-curve constructed with the DGS.
Now we have to find a way to derive the formulas r = 2·a·cos²ϕ and (x² + y²)³ = 4·a²·x⁴ mentioned above. Let us start with the following figure (Figure 6), which is the picture constructed through the given definition:
Figure 6. Initial situation for calculating the equations of the egg-curve.
We know (from the construction) that the triangle CPS is right-angled at P. Furthermore, since S lies on the angle bisector, its distances from the secant and from the parallel line are equal, and the triangle CAB is also right-angled, because it is inscribed in a semicircle. This is shown in Figure 7 (where the real 'egg-curve' is shown as a dashed line).
Figure 7. Similarity of the triangles.
From the known points C (0, 0), A (x, y), and B (2·a, 0) and through the instructions provided, the coordinates of the points S and P are calculated. Only a little bit of vector analysis is necessary, and the calculation itself can be done with a CAS. First of all we have to define the points and the direction vector of the bisecting line w.
Now we calculate the equation of the normal form of the bisection line, cut it with the x-axis and define the intersection-point S.
Then we calculate the intersection point P of the secant and the perpendicular through S.
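The CAS screenshots are not reproduced here. As a stand-in, the following numerical sketch (my own; it assumes one reading of the construction, with the parallel ray at A pointing in the positive x-direction) carries out the same three steps and checks the result against r = 2·a·cos²ϕ:

```python
from math import cos, sin, pi

a = 1.0  # circle radius; C = (0, 0), B = (2a, 0)
for phi in (pi / 12, pi / 6, pi / 4, pi / 3):
    A = (2 * a * cos(phi) ** 2, 2 * a * cos(phi) * sin(phi))  # secant meets the circle
    ac = (-cos(phi), -sin(phi))          # unit vector at A towards C
    ad = (1.0, 0.0)                      # unit vector along the parallel, towards D
    w = (ac[0] + ad[0], ac[1] + ad[1])   # direction of the angle bisector at A
    t = -A[1] / w[1]                     # parameter where the bisector meets the x-axis
    S = (A[0] + t * w[0], 0.0)
    r = S[0] * cos(phi)                  # |CP| = projection of S onto the secant
    print(f"phi = {phi:.3f}: r = {r:.6f}  vs  2a*cos^2(phi) = {2 * a * cos(phi) ** 2:.6f}")
```

For every ϕ the two values agree, which supports the formula derived below.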
Now all the important elements for finding the 'egg-curve' are calculated. Let's have a closer look at Figure 7 again. It is easy to recognize that there are two similar triangles: triangle CPS and triangle CAB. The distance CP is r, and the radius of the circle is a, so the distance CB = 2·a. The other two distances which are needed are CA and CS; the construction shows that CS = CA. In a next step the similarity theorem (of triangles) can be applied:

CP : CA = CS : CB, that is r : CA = CA : (2·a)
Transforming this equation produces r·2·a = CA². By using the characteristic of the right-angled triangle CAB and calling ϕ the angle ACB, we know that cos ϕ = CA / CB = CA / (2·a). Transforming this equation yields CA = 2·a·cos ϕ. Inserting this connection in the equation above yields another equation, r·2·a = 4·a²·cos²ϕ. Simplifying this equation by cancelling common terms gives the result:

r = 2·a·cos²ϕ

By substituting r = √(x² + y²) and cos ϕ = x / √(x² + y²) it is possible to get the implicit Cartesian form mentioned in the definition:

(x² + y²)³ = 4·a²·x⁴
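A quick numerical plausibility check of this substitution (my own sketch):

```python
from math import cos, sin, pi, isclose

a = 1.5
for k in range(1, 6):
    phi = k * pi / 12
    r = 2 * a * cos(phi) ** 2            # polar form of the egg-curve
    x, y = r * cos(phi), r * sin(phi)    # corresponding Cartesian point
    print(isclose((x * x + y * y) ** 3, 4 * a * a * x ** 4))  # True each time
```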
As we have seen, the 'egg-curve' is modelled through elementary mathematical methods. By using technology, teachers and students get the chance to explore such calculations through the pivotal concept of modelling.

Definition 2

Another approach for constructing an egg-curve is formulated by Münger (1894). He writes: "Given is a circle with radius a and a point C on the circumference. CP1 is an arbitrary position vector, P1Q1 the perpendicular to the x-axis, Q1P the perpendicular to the vector. While rotating the position vector around C, point P describes an egg-curve. The equation of this curve is r = a·cos²ϕ, or in Cartesian form (x² + y²)³ = a²·x⁴."

As stated in the construction instructions, a circle (radius arbitrary) and a point C on the circumference of the circle are constructed. Then an arbitrary point P1 on the circumference of the circle is constructed. The perpendicular to the x-axis is constructed through P1. The intersection point with the x-axis, Q1, can be found.
After that the perpendicular to the secant CP1 through Q1 is constructed. All these facts are shown in Figure 8:
Figure 8. Translating the definition of Münger to the DGS.
If the point P1 is moved around the circle, P will move along the ‘egg-curve’. It will be again easier to see if the “Trace on” option is activated (Figure 9).
Figure 9. Egg-curve constructed with the DGS.
The formula given by Münger can be derived in a similar way as the previous one. The most important fact to be seen here is that in this picture two right-angled triangles, CPQ1 and CP1A, exist. Those triangles are similar (Figure 10).
Figure 10. Similarity of the triangles.
The coordinates of the points can be found mentally – without any calculation: C (0, 0), P1 (x, y), A (2·a, 0), Q1 (x, 0). If the distance CP is called r, the coordinates of P are not needed; otherwise they can be calculated analytically. For the sake of completeness we write them down: P = (r·cos ϕ, r·sin ϕ). Using the fact that both triangles are similar, the following equation is obvious:

CP : CP1 = CQ1 : CA, that is r : CP1 = x : (2·a)
Through elementary transformation, using the mathematical fact for the right-angled triangle CP1A that CP1 = 2·a·cos ϕ and substituting this term, the following equation is found:

2·a·x·cos ϕ = 2·a·r

This result yields r = x·cos ϕ. Because of the assumption (in the calculation) that x is part of our circle – it is the x-coordinate of P1 – it can be substituted by x = a·cos ϕ, with a as the radius of the starting circle. So the formula of Münger (r = a·cos²ϕ) is found in polar coordinates. If the implicit Cartesian form is desired, another substitution has to be done. The result in this case is:

(x² + y²)³ = a²·x⁴
CALCULATING EGGS WITH COMPUTERS
Our first approach to calculating the volume of an egg is guided by our academic education. If we try to find a solution for a mathematical problem that we cannot remember or have not seen before, we walk into a library and start looking for literature such as that of Narushin (2005) or Zheng et al. (2009). In the first part of this chapter we explained how useful this approach can be – as in many other situations. We found a lot of formulas for calculating eggs that have been proven over time. We tried to understand these formulas and to use them. Our mathematical knowledge was trained enough to search for these formulas (we knew what we were looking for and could decide which book or article would help us) and to understand the literature. Reflecting on the approach outlined above, we feel it follows the typical and traditional academic way of solving problems. This is great and useful – but it needs a lot of mathematical training to start it and to bring it to a successful conclusion.

Continuing this reflection, we started to look for an easier way. We want to describe this way now, starting with a basic idea going back to Bonaventura Cavalieri (1598–1647). He said that we can find the volume of an object if we divide it into slices and add up their volumes. If the slices become thinner and thinner, we enter a typical process that we know from analysis – the limit of this process should be the exact volume. Indeed this is true under certain conditions.

Thinking about this we made a decision. We want to calculate the volume of an egg more or less exactly, but we do not want to go into an infinite process. In simple words: we want to answer the questions without analysis this time. How exact does the volume of a hen's egg or a chocolate egg need to be – is an accuracy corresponding to 1 g of chocolate good enough, or perhaps 1 mg? We will reach this grade of exactness after a while even with a finite process. Hence we divide the egg into some (or many) slices, and we know that we can count the number of slices. We can use a computer to add the volumes of the slices. When this basic idea is clear (or found by the students themselves), the rest of the work is fun, with a little bit of a trial-and-error strategy. We will show this now.
Figure 11. Taking a photograph.
We would like to have a general idea about the volume at the beginning. Our first step is a simple method to estimate it: we take the chocolate egg and put it under water (Figure 11). We look at the scale on the wall of the container and read off a volume of about 70 cm³. How is it possible to estimate a slice of an egg? It looks like a cylinder, but we know that we make a small error by saying that. Let us have a look at Figure 12:
Figure 12. A picture of an “egg-slicer”.
Do you know an egg slicer like the one you see in the picture? All slices together have the same volume as the egg had before we used the slicer. Now we have to find the volume of one slice and, in the next step, the volume of the sum of all slices. A typical slice somewhere in the middle is similar to a cylinder. The volume of a cylinder is V = r²·π·h (r is the radius of the base circle; h the height of the cylinder). Now we have to find r and h. If we cut the egg with a knife into halves and put one half on a sheet of paper, we can take a pencil and draw a line around that half. By adding some coordinates, a picture like Figure 13 can be drawn:
Figure 13. Profile of an egg.
We decide to start with six slices along the x-axis, which is scaled in centimetres. Each slice should have a height of 1 cm. What is the radius r? We take a close look at
one slice. We select the second one, from 1 cm to 2 cm. Looking at its profile, we see something like Figure 14:
Figure 14. Finding one slice.
Looking at this draft we see at least two lines that could be the radius, one on the left side and one on the right side. In the left portion of the egg the left radius is always shorter, whereas in the right portion the right one is shorter. Now we are facing an important point, a good chance for good ideas or inventions by the students. Students should learn to invent or discover new mathematics, and we propose to give them the chance to learn or practise this here. What are possible ideas? We show two main ways, both well known: one is known from the introduction of integrals (upper sum and lower sum), the other is known as interpolation.

In the first approach we take the bigger and the smaller radius and estimate their difference. As the slices become thinner, the difference shrinks until we have reached the grade of exactness we want. For the second idea we take the radius at the point in the middle between the left side and the right side; in this case x is 1.5 cm. If the slices become thinner, the radius in the middle will come closer and closer to the border radii. Again it is a question of the grade of exactness we want to have. Now we will explain both ideas with concrete numbers and results. We name the left radius r and the right radius R. Figure 15 shows the situation.
Figure 15. Looking for r and R.
Looking for concrete numbers for the length of the radius at x = 0 to 6 cm, we come back to Figure 13. A spreadsheet is used to calculate the volumes of the cylinders and to sum them up (Table 1):

Table 1. Calculating first results

x | y (measured) | Smaller radius | Volume of cylinder | Bigger radius | Volume of cylinder
0.0 | 0.0 | 0.0 | 0.00 | 1.7 | 9.08
1.0 | 1.7 | 1.7 | 9.08 | 2.1 | 13.85
2.0 | 2.1 | 2.1 | 13.85 | 2.3 | 16.62
3.0 | 2.3 | 2.3 | 16.62 | 2.3 | 16.62
4.0 | 2.3 | 2.0 | 12.57 | 2.3 | 16.62
5.0 | 2.0 | 0.0 | 0.00 | 2.0 | 12.57
6.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.00
Sum | | | 52.12 | | 85.36
Arithmetic mean | | | | | 68.74
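The spreadsheet calculation behind Table 1 can be mirrored in a few lines of code (a sketch using the measured radii from the table):

```python
from math import pi

y = [0.0, 1.7, 2.1, 2.3, 2.3, 2.0, 0.0]  # measured radius (cm) at x = 0, 1, ..., 6
h = 1.0                                   # slice height in cm

lower = sum(min(y[i], y[i + 1]) ** 2 * pi * h for i in range(6))  # smaller radii
upper = sum(max(y[i], y[i + 1]) ** 2 * pi * h for i in range(6))  # bigger radii
print(round(lower, 2), round(upper, 2), round((lower + upper) / 2, 2))
# -> 52.12 85.36 68.74, as in Table 1
```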
The result is really surprising! We made a first trial with little effort and little exactness, and we got a result that is near the empirical result we obtained with the water displacement measurement at the beginning (68.74 is near 70.00). Are we ready now? No – not at all! The way we got the lengths of the radii is not very exact. We made a draft on paper and used a simple ruler to measure the distance from the x-axis to the outline of the egg. If we did this very well, the length is perhaps correct to within a small measurement error – let us estimate 1 mm. What would happen if the real length were 1 mm more in each case? (see Tables 2 and 3)

Table 2. Influence of inexact measurement 1: plus 1 mm

x | y | Smaller radius | Volume of cylinder | Bigger radius | Volume of cylinder
0.0 | 0.0 | 0.0 | 0.00 | 1.8 | 10.18
1.0 | 1.8 | 1.8 | 10.18 | 2.2 | 15.21
2.0 | 2.2 | 2.2 | 15.21 | 2.4 | 18.10
3.0 | 2.4 | 2.4 | 18.10 | 2.4 | 18.10
4.0 | 2.4 | 2.1 | 13.85 | 2.4 | 18.10
5.0 | 2.1 | 0.0 | 0.00 | 2.1 | 13.85
6.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.00
Sum | | | 57.33 | | 93.53
Arithmetic mean | | | | | 75.43
Table 3. Influence of inexact measurement 2: minus 1 mm

x | y | Smaller radius | Volume of cylinder | Bigger radius | Volume of cylinder
0.0 | 0.0 | 0.0 | 0.00 | 1.6 | 8.04
1.0 | 1.6 | 1.6 | 8.04 | 2.0 | 12.57
2.0 | 2.0 | 2.0 | 12.57 | 2.2 | 15.21
3.0 | 2.2 | 2.2 | 15.21 | 2.2 | 15.21
4.0 | 2.2 | 1.9 | 11.34 | 2.2 | 15.21
5.0 | 1.9 | 0.0 | 0.00 | 1.9 | 11.34
6.0 | 0.0 | 0.0 | 0.00 | 0.0 | 0.00
Sum | | | 47.16 | | 77.57
Arithmetic mean | | | | | 62.36
What is the result of our experiment with small errors? The length which was measured is about 20 mm. We estimated an error of 1 mm, which means about 5 percent. The result varies from 68.74 up to 75.43 (plus 9.7 percent) or down to 62.36 (minus 9 percent). The drawn outline itself is about 1 mm thick. So we are really in danger of getting wrong results because we cannot measure the length exactly. One conclusion of this reflection is that it is not useful to make the slices thinner before the method of length measurement is improved. This is similar to many situations in physics: without exact measurement it is often useless to think about more calculation.

What about the second option we proposed – following the idea of interpolation? We take the arithmetic middle of r and R and measure the length at these points on the x-axis. The results are shown in Table 4.

Table 4. Calculation through interpolation

x | y | Cylinder volume
0.5 | 1.2 | 4.52
1.5 | 1.9 | 11.34
2.5 | 2.2 | 15.21
3.5 | 2.4 | 18.10
4.5 | 2.2 | 15.21
5.5 | 1.6 | 8.04
Sum | | 72.41
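The same check for the interpolation idea (a sketch with the midpoint radii from Table 4):

```python
from math import pi

y_mid = [1.2, 1.9, 2.2, 2.4, 2.2, 1.6]  # radii measured at x = 0.5, 1.5, ..., 5.5 cm
volume = sum(r * r * pi * 1.0 for r in y_mid)
print(round(volume, 2))  # -> 72.41, as in Table 4
```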
We are satisfied to see that the empirical results are nearly the same as the results of the calculation. Having reached this result, we think we have shown how Cavalieri's idea and a spreadsheet can be used to reproduce the empirical results. Many further steps are possible in order to be more precise and to go further, such as calculating the surface area. But we think it is not necessary to explain this here, because teachers know what to do and how to do it.
EPILOGUE
Looking at objects from our daily life through the lens of mathematics can produce interesting and motivating (mathematical) results. The degree of complexity is not an essential criterion for discussing objects from the realm of students' experiences. It is necessary to show students that mathematics provides methods and instruments to analyse objects in daily life. For this reason Austrian and German mathematics educators founded the ISTRON group in 1990. This group aims to look at problems in daily life and to show teachers how they can improve their teaching by implementing such examples. A more detailed look at mathematical aspects of an egg can be found in the article by Siller, Maaß & Fuchs (2009). All in all, it is necessary that adequate problems are prepared for education. Problems which are already known in research should be adapted and reconstructed for education. The fields of educational research in this area should be expanded and strengthened.

SOME LAST REMARKS AND HINTS
For the sake of completeness we want to give hints and solutions for the possible examples listed in this chapter that are meant to motivate students:

1. "Children Surprise Egg" (Figure 1): The old version of this egg is similar to the egg of a chicken. The mass of the chocolate is about 20 g, with a volume of about 15 cm³. The EU has decided that the toys inside these eggs are so small that little children are in danger of swallowing them. The next generation of the eggs is bigger: the mass of chocolate will be about 100 g, and the estimated size of the egg is 12.3 cm for the height and 8.3 cm for the diameter. The volume of this egg can be calculated as V = 72.26 cm³ (cf. Siller, Maaß & Fuchs, 2009, p. 106).

2. Biggest Easter egg 2008 (Figure 2): It has a surface of about 130 m².

3. World's biggest glass egg (Figure 3): The volume of this egg is about 0.15 m³. The information on the internet says that this egg weighs about 20 kg. If it were made of solid glass without air inside, it would weigh about 375 kg (density of glass: 2.5 g/cm³). The shell would have a thickness of about 0.58 cm.

REFERENCES

Blum, W. (1985). Anwendungsorientierter Mathematikunterricht in der didaktischen Diskussion. Math. Semesterber, 32(2), 195–232.
Blum, W., & Leiss, D. (2007). How do students and teachers deal with mathematical modelling problems? The example "Filling up". In Haines, et al. (Eds.), Mathematical modelling (ICTMA 12): Education, engineering and economics. Chichester: Horwood Publishing.
BMUKK (2004). AHS-Lehrplan Mathematik (Oberstufe). Wien: Bundesministerium für Unterricht und Kultur. Retrieved from http://www.bmukk.gv.at/schulen/unterricht/lp/lp_ahs_oberstufe.xml
Hortsch, W. (1990). Alte und neue Eiformeln in der Geschichte der Mathematik. München: Selbstverlag Hortsch.
Krauthausen, G., & Scherer, P. (2003). Einführung in die Mathematikdidaktik. Heidelberg: Spektrum.
Loria, G. (1911). Ebene Kurven I und II. Berlin.
Malina, J. (1907). Über Sternenbahnen und Kurven mit mehreren Brennpunkten. Wien.
Müller, G., & Wittmann, E. Ch. (1984). Der Mathematikunterricht in der Primarstufe. Braunschweig/Wiesbaden: Vieweg.
Münger, F. (1894). Die eiförmigen Kurven. Dissertation, Universität Bern, Bern.
Narushin, V. G. (2005). Egg geometry calculation using the measurements of length and breadth. Poultry Science, 84, 482–484.
Pollak, H. O. (1977). The interaction between mathematics and other school subjects (including integrated courses). In Proceedings of the third international congress on mathematical education (pp. 255–264). Karlsruhe.
Schmidt, C. H. L. (1907). Über einige Kurven höherer Ordnung. Zeitschrift für mathem. u. naturwiss. Unterricht, 38. Jg., p. 485.
Schweiger, F. (1992). Fundamentale Ideen – Eine geistesgeschichtliche Studie zur Mathematikdidaktik. JMD, 13(2/3), 199–214.
Schupp, H. (1987). Applied mathematics instruction in the lower secondary level: Between traditional and new approaches. In B. Werner, et al. (Hrsg.), Applications and modelling in learning and teaching mathematics (Vol. 37). Chichester: Horwood.
Siller, H.-St. (2008). Modellbilden – eine zentrale Leitidee der Mathematik. In K. Fuchs (Hrsg.), Schriften zur Didaktik der Mathematik und Informatik. Aachen: Shaker Verlag.
Siller, H.-St., Maaß, J., & Fuchs, K. J. Wie aus einem alltäglichen Gegenstand viele mathematische Modellierungen entstehen – Das Ei als Thema des Mathematikunterrichts. In H.-St. Siller, J. Maaß (Hrsg.), ISTRON-Materialien für einen realitätsbezogenen Mathematikunterricht. Hildesheim: Franzbecker (to appear).
Timmerding, H. E. (1928). Zeichnerische Geometrie. Leipzig: Akad. Verlagsgesellschaft.
Wieleitner, H. (1908). Spezielle ebene Kurven. Leipzig: Sammlung Schubert.
Zhou, P., Zheng, W., Zhao, C., Shen, C., & Sun, G. (2009). Egg volume and surface area calculations based on machine vision. Computer and Computing Technologies in Agriculture II, 3, 1647–1653.
Juergen Maasz
IDM, University of Linz

Hans-Stefan Siller
IFFB – Dept. for Mathematics and Informatics Education, University of Salzburg
THOMAS SCHILLER
15. DIGITAL IMAGES
Filters and Edge Detection
INTRODUCTION
Interdisciplinary work based on examples with reference to reality is essential to improve the teaching of mathematics. Based on such examples, pupils learn the proper use of mathematical methods, and their motivation to learn increases. I have therefore chosen to explain important fields of the topic "digital image processing" in this chapter, which can be dealt with in an interdisciplinary and project-based way in mathematics and informatics. Pupils are increasingly confronted with digital images, e.g. by using their modern smartphones with integrated cameras. The quality of images, especially of digital photos, may need to be improved during or after taking the photo so that the objects in the image can be recognized better, e.g. by adjusting the contrast.

Principles of digital image processing have been prepared for the classroom in this chapter. You will see that you don't need many special prerequisites in class; often simple knowledge of spreadsheets is all that is required. Among the prepared examples is the use of linear filtering, which can, for example, be used for smoothing images, and a short explanation of how to find the positions of possible edges in an image. Edges – rapid transitions between light and dark (or differently coloured) areas – play an important role in the automatic recognition of objects in an image, as they define objects. By working with these examples, pupils can see that the basic ideas behind processing image data via the values representing the colour or brightness of individual pixels are relatively simple, but that in order to handle complex reality and obtain useful results, much more work is required.

IMAGES IN INFORMATICS
In practice there are many types of digital raster images, such as photos, colour and greyscale images, screenshots, fax documents, radar images, ultrasound images etc. Raster images are usually rectangular and made of regularly arranged elements, the pixels (short for picture elements); different types of raster images differ mainly in the values stored in the pixels. In addition to raster images, there are vector graphics. [Burger et al., 2006, p. 5] This article only deals with raster images, so I will not talk in detail about vector graphics.

One can interpret a recorded image as a two-dimensional matrix of numbers. A digital image I is, considered more formally, a two-dimensional function of
non-negative integer coordinates to a set of image values, therefore I: N x N → V (N … natural numbers; V … set of pixel values). In this way images can be represented, stored, processed, compressed or transmitted using computers. It does not matter how an image actually came about; we simply understand it as numeric data. [Burger et al., 2006, p. 10]
Figures 1 and 2. Part of an image (magnified) and the related greyscale function (cf. figure 2.5 in [Burger et al., 2006, p. 11]).
The values of individual pixels are binary words of length k; a pixel can therefore assume 2^k different values. k is frequently called the "depth" of the image. The exact coding of the pixel values depends, for example, on the type of the image (RGB colour image, binary image, greyscale image, etc.). [Burger et al., 2006, p. 12] Greyscale images consist of only one channel, representing the image brightness. Since the stored values represent the intensity of light energy, which cannot be negative, normally only non-negative values are used. Image data in greyscale images therefore usually consist of integer values from the interval [0, 2^k - 1], e.g. intensity values from the interval [0, 255] at a depth of k = 8 bits. In this case 256 different grey levels are possible, where usually 0 represents a black pixel (the pixel with minimum brightness) and 255 a white one (the pixel with maximum brightness). [Burger et al., 2006, p. 12]

FILTERS IN DIGITAL IMAGE PROCESSING
Many effects, such as sharpening or smoothing an image, can be realized with the help of filters. [Burger et al., 2006, p. 89]

Smoothing in a Spreadsheet

Regions of an image with locally strong intensity changes, i.e. large differences between neighbouring pixels, are perceived as sharp. In contrast, the image looks blurry or fuzzy where the brightness changes only a little. If one wants to smooth an image, one can replace each pixel value by the average of the neighbouring pixels: one creates a new image I' from the source image I by taking each pixel value I'(u, v) as the arithmetic mean of the pixel value I(u, v) and its 8 neighbouring pixels. [Burger et al., 2006, p. 89f]
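Outside the spreadsheet, the same smoothing can be sketched in a few lines of code. Here is a minimal sketch in Python, assuming the greyscale image is given as a list of lists of integer values in [0, 255]; function and variable names are illustrative, and the fringe pixels simply keep their original values (a point discussed below):

```python
# Minimal sketch of the 3 x 3 mean smoothing described above.
def smooth(image):
    height, width = len(image), len(image[0])
    # Work on a copy: the fringe pixels keep their original values,
    # just as in the spreadsheet example below.
    result = [row[:] for row in image]
    for v in range(1, height - 1):
        for u in range(1, width - 1):
            # Arithmetic mean of the pixel and its 8 neighbours,
            # rounded back to an integer grey value.
            s = sum(image[v + dv][u + du]
                    for dv in (-1, 0, 1)
                    for du in (-1, 0, 1))
            result[v][u] = round(s / 9)
    return result
```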
Let's try to implement the smoothing of a greyscale image in a spreadsheet. There are several ways to import pixel values into a spreadsheet; after manipulating the values, they can be exported and viewed as an image again. Figure 3 shows the values of the individual pixels of the source image (with 256 grey levels). Figure 4 shows the pixel values obtained by calculating the arithmetic means.
Figure 3. Screenshot with individual pixels of the source image.
Figure 4. Screenshot with pixel values formed by the calculation of the arithmetic mean values.
The filter process using arithmetic mean values can be implemented relatively quickly in a spreadsheet using the built-in mean function (such as AVERAGE). For cell B2, for example, this requires writing the name of the averaging function and the range of cells from A1 to C3 used for calculating the mean. Of course, you can also enter the sum of the values (and the subsequent division) without using a predefined mean function, e.g. with the help of a sum function (such as SUM) or through repeated additions. Whichever way is chosen, the references to the other cells have to be relative, because the formula afterwards has to be copied into (almost) all other cells representing pixel values, and each copy should access the neighbouring pixels from its own cell's point of view when calculating its mean value. Since only integer pixel values are allowed, the result then has to be converted into an integer, e.g. using the function INT or ROUND.
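As a minimal sketch, with the source values on a worksheet named 'source image (greyscale)' (the name used in the formula view later in this chapter), the formula in cell B2 of the result sheet could read:

=ROUND(AVERAGE('source image (greyscale)'!A1:C3);0)

Depending on the spreadsheet and its locale, the argument separator may be "," instead of ";", and the functions may carry localized names (e.g. RUNDEN and MITTELWERT in German versions).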
Figure 5 shows how the table with the results from Figure 4 looks in formula view:
Figure 5. Figure 4 in formula view (Ausgangsbild = source image, Graustufen = greyscale, ganze Zahl = integer number, Mittelwert = mean value).
As you can see in Figure 5, the mean values on the fringes of the image are not calculated; the original image values are used instead. But why? The pupils will find out by implementing the filtering on their own: either they will try to create the formula for calculating the mean value in cell A1 and fail because there is no value to the left of cell A1, or they will create the formula in another cell that does not lie on a fringe of the image. In that case they will face problems when copying the formula to the fringes of the image, in the form of an error message (Figures 6, 7 and 8):
Figure 6. Error messages on the fringes (Bezug = reference).
Figure 7. Error messages on the fringes.
Figure 8. Error messages on the fringes.
As can be seen e.g. in Figures 4 and 5, the formulae are not written in the same worksheet that holds the pixel values of the source image. The old values are therefore not changed; instead a new image with new values, based on the pixel values of the old image, is created, as described in the theory above. The fact that in each case a new image, i.e. a new number matrix, is created, and the manipulations are not performed directly in the source matrix, is essential for such filter processes.
Pupils will realize that this is the only correct way when they try the wrong one. They will fail, because the spreadsheet shows a circular reference warning if iterative calculation of formulas is turned off. In our case this means that while calculating the value of a cell, this very cell is being accessed. This cannot work properly, because at the time the value is calculated, the value of this cell changes; due to the direct influence of the cell's value on the calculation, the cell would be updated again taking the new value into account. This could (at least in theory) continue endlessly, but makes no sense in our case. The circular reference warning can be avoided by activating iterative calculation (and, in addition, setting for example the desired number of iterations), as described in the spreadsheet's help. It should be noted that this is an appropriate example for introducing the topic of "recursion and iteration" (both in mathematics and in computer science education), which I will not describe in detail here.1 The problem of the circular reference when working with filters and "accidentally" using the source image should be discussed in class, because if such a filter is implemented not in a spreadsheet but in (any) programming language, there will be no circular reference warning to draw one's attention to the error. The value of each pixel is calculated and changed only once, so no problems occur while executing the program, although wrong pixel values are used. Such an error is not easy to find if one is not warned that it occurred. In our case one does not have to search for it in the obvious core of the programmed method in which the filtering takes place, but before that, at the point where no new (empty) image has been created and the source image itself is set up for editing.
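Translated into a program, the "accidental" use of the source image might look as follows; a minimal sketch in the spirit of the smoothing function above (names illustrative), which runs without any warning and silently produces distorted values:

```python
# WRONG: filtering in place. When the mean for pixel (u, v) is
# computed, its left and upper neighbours have already been
# overwritten with filtered values, so the result is distorted,
# and, unlike in the spreadsheet, no warning is raised.
def smooth_in_place(image):
    for v in range(1, len(image) - 1):
        for u in range(1, len(image[0]) - 1):
            s = sum(image[v + dv][u + du]
                    for dv in (-1, 0, 1)
                    for du in (-1, 0, 1))
            image[v][u] = round(s / 9)  # overwrites the source!
    return image
```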
Figures 9 and 10. Before and after smoothing in a spreadsheet.
Figure 9 shows a section of the source image before smoothing in a spreadsheet; Figure 10 shows the corresponding image after smoothing. You can clearly see (especially on the concrete slabs) that the smoothing has blurred the transitions between the different shades of grey, and the image seems to be out of focus. Another important aspect: in this example the pixel values on the fringes of the image were taken over from the source image. Depending on the function of the filter, e.g. differentiation2, that usually does not make sense. The calculation of the arithmetic means in the presented smoothing could instead rely on just those neighbouring pixels that are available: in the corners only four (instead of 9) pixels can be used, and on the fringes (excluding the corners) the remaining six pixels. With other filters, too, the treatment of the pixels on the fringes of the image
has to be considered separately. In doing so, it is up to the pupils' creativity to handle these special cases in a way that makes sense. In this article I demonstrate the effect of (linear) filters on different examples. It is particularly interesting how the image itself changes, i.e. the large interior of the image, not the (thin) fringes.3 Pupils can later develop new creative filters themselves and visualize them on the computer to observe the effects they have achieved. Once a new creative filter has been devised, pupils can still add whatever treatment of the fringe pixels is necessary.

Filters in General

The smoothing filter with local averaging discussed above includes all the elements that are typical of a filter; it belongs to the group of linear filters. In general, filters use several pixels of the source image to calculate each new pixel value. The size of the filter, or more precisely the region of the filter, determines how many pixels are used to calculate a new pixel value. The smoothing filter introduced above had a size of 3 x 3 pixels. Filters of sizes 5 x 5, 7 x 7 or even 21 x 21 would also be possible and would produce a stronger smoothing. A filter need not necessarily have a square shape; with a round filter, the filter effect would occur uniformly in all image directions. Additionally, a different weighting of the pixel values involved is possible, so that more distant pixels count less strongly than those lying next to the pixel whose value is being calculated. The filter size theoretically doesn't have to be finite, and the region certainly doesn't have to include the original pixel. Because of the great number of possibilities, filters are classified systematically; they are divided, for example, into linear and nonlinear filters, which can be seen from the mathematical expression for calculating a new pixel value. [Burger et al., 2006, p. 90f] For linear filters, the pixel values are combined in a linear form, e.g. by a weighted sum. In the smoothing example each of the 9 pixel values was allocated a weight of one ninth and the weighted values were added up. [Burger et al., 2006, p. 91] When the AVERAGE function is used, pupils may not be directly aware of this. Therefore, in the example above, one can enter the 9 summands and the subsequent division in the spreadsheet individually instead of using the AVERAGE function. If the division by 9 is regarded as a multiplication by one ninth and the brackets are multiplied out, pupils can see immediately that each summand, i.e. each pixel value, is weighted by one ninth before the values are added into a new one. Depending on the choice of the individual weights, many different filters with completely different behaviour are possible. Linear filters can be represented by matrices in which the weights are given. The matrix for the smoothing filter of size 3 x 3 is
1/9  1/9  1/9
1/9  1/9  1/9
1/9  1/9  1/9

[Burger et al., 2006, p. 91]
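In code, a general linear filter of this kind can be sketched as follows; a minimal sketch assuming a square weight matrix of odd size, anticipating the application rule described in the next paragraph (all names are illustrative, and the fringe pixels again simply keep their old values):

```python
def linear_filter(image, weights):
    # 'weights' is a square matrix of odd size, e.g. the 3 x 3
    # smoothing matrix with all entries 1/9.
    k = len(weights) // 2  # half-width of the filter region
    height, width = len(image), len(image[0])
    result = [row[:] for row in image]  # fringes keep old values
    for v in range(k, height - k):
        for u in range(k, width - k):
            # Weighted sum over the filter region centred on (u, v).
            s = sum(weights[dv + k][du + k] * image[v + dv][u + du]
                    for dv in range(-k, k + 1)
                    for du in range(-k, k + 1))
            # Clamp to the admissible grey range before storing.
            result[v][u] = min(255, max(0, round(s)))
    return result

# Example: the 3 x 3 mean filter as a weight matrix.
mean3 = [[1/9] * 3 for _ in range(3)]
```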
Filters are applied to an image by positioning the centre of the filter matrix over the pixel of the source image currently under consideration, multiplying the pixel values by their respective weights in the filter lying above the image, and adding these products. The result of this calculation is then inserted at the corresponding position in the resulting image. [Burger et al., 2006, p. 92f]

Sharpening an Image

Sharpening an image is possible in the same way as smoothing; the only difference is the matrix used for the filter operation. For sharpening, the matrix given in [Pehla, 2003] is used.
Figures 11 and 12. Before and after sharpening.
Figure 12 is obtained by applying the above filter matrix to Figure 11.

Pupils' Experiments (further (linear) filters)

Linear filters offer the chance to experiment. Starting from a commonly known filter, pupils can make well-directed changes to individual entries in the filter matrix. They can even design their own new matrix and consider the impact of the new entries. In experiments one should first use simpler (greyscale) images, such as the little smiley in Figures 1 and 2. The image should, especially if it is small, be spatially extended by adding additional (empty) fringes. This prevents the situation where all the pixels of a (too small) image lie on the fringes and no filtering happens at all, so that no effects would be visible. The pupils can carry out experiments guided by specific questions. For example, they should consider what will happen if they use a filter matrix consisting only of zeros. The pupils should not try to find the answer4 just by experimenting, but
by considering beforehand what will happen, and then confirming or refuting their conjecture by trying it out. Because of the considerations required, the filter process is internalized more easily, and the experimentation does not degenerate into wild, unstructured and thoughtless tinkering.

EDGE DETECTION
What is it, What is it Needed for and How Can it be Done?

In human vision edges play an important role: figures can be reconstructed as a whole from a few striking lines. Edges can be roughly described by the fact that, at an edge, the intensity of the image changes greatly within a small neighbourhood and along a distinct direction. The stronger the change of intensity, the stronger the indication of an edge at the observed position. The strength of the intensity change corresponds to the first derivative, which is thus one approach to determining the strength of an edge. [Burger et al., 2006, p. 117] So the steepness of the greyscale function represents the strength of an edge. The steepness of the greyscale function is nothing but the magnitude of the gradient, and the direction of the edge is perpendicular to the direction of the gradient. [Tönnies, 2005, p. 174f] Since edges also separate objects from the background, knowledge of edges can be useful, for example, in the detection of objects (e.g. characters (OCR), faces). As mentioned above, a greyscale image can be interpreted as a function I(u, v). In order not to overwhelm pupils with two-dimensional differentiation, it is useful to look at the whole process first in a one-dimensional way, restricted to a single image line.5 One might draw a function which holds a high value (e.g. 250) relatively constantly, then at some point falls relatively suddenly (e.g. to 10) within a short interval, and afterwards remains relatively constant again at the value to which it has just fallen. In addition, the derivative function is needed. One can then see clearly that the derivative function remains approximately at the value 0 wherever the function itself is relatively constant, and only at the relatively rapid drop from 250 to 10 does it suddenly show a large negative deflection. So at the position where the observed image line changes from an almost perfect white to an almost perfect black, i.e. at the position of an edge, the derivative function has a remarkably low value. At a change from a darker to a brighter area, however, the deflection would be positive, because the greyscale function increases. Such swings in the derivative function can thus be interpreted as positions of edges. What the pupils will notice is that the images to be analyzed are not given by continuous functions, so the functions cannot be differentiated. But the derivatives can be approximated by differences: the derivative in the x-direction can be approximated by the difference between the grey value of the pixel after and the grey value of the pixel before the pixel currently under consideration. This can be achieved by convolution with the matrix
Dx = (-1  0  1).
[Tönnies, 2005, p. 175]
The differentiation in the y-direction can be approximated with the help of the matrix
Dy = (-1  0  1)^T (the same mask written as a column).
[Tönnies, 2005, p. 175]
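In code, this difference-based detection of vertical edges can be sketched in a few lines; a minimal sketch assuming, as before, a greyscale image given as a list of lists of integers (all names are illustrative). Negative differences are caught here with the absolute value, a point discussed in detail in the example below:

```python
def edges_x(image):
    height, width = len(image), len(image[0])
    result = [[0] * width for _ in range(height)]
    for v in range(height):
        for u in range(1, width - 1):
            # Difference of the pixel after and the pixel before,
            # i.e. the effect of the mask Dx = (-1 0 1).
            d = image[v][u + 1] - image[v][u - 1]
            # Catch negative values with the absolute value so that
            # light-to-dark edges are displayed too (see below).
            result[v][u] = abs(d)  # always within [0, 255]
        # The fringe columns (u = 0 and u = width - 1) stay 0.
    return result
```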
It is important to note that the background in differential calculus is not strictly required. In class it is enough if the teacher explains that big changes of the values between closely spaced pixels are indications of edges. As mentioned above, according to [Burger et al., 2006, p. 117] an edge is a place of great change in intensity. This basic idea behind edge detection can therefore be implemented in the classroom much earlier, because the necessary differences are suitable for the lower grades as well.

Example: Differentiation in the x-direction and Catching Inadmissible Values

Let's look at filtering with the matrix Dx in a practical way, using the following image consisting of three different grey values as a source (Figure 13):
Figure 13. Test image with three different grey values.
Differentiation in the x-direction provides us with the following result (Figure 14):
Figure 14. Result of differentiation in x-direction (clipping to the range 0 to 255).
At first glance, the result corresponds to our expectations: at a change from a dark shade of grey to a light one, the difference is significant, so the edge pixels are displayed. The strengths of the edges can also be seen in the resulting image,
because when grey changes to white the difference is not as great as from black to white, so these edge pixels are plotted darker. This follows automatically, because the difference between grey and white is not as great as that between black and white. At second glance, we notice that edge transitions from lighter to darker shades of grey do not occur in the resulting image. This is not surprising: in the transformation of the matrix of pixel values into an image used here, values lying outside the interval from 0 to 255 were clipped to that range, since they are not valid grey values. Subtracting the value of a bright pixel from that of a dark one gives a negative value; due to the formation of differences we leave the value range of 0 to 255 that is admissible for greyscale images. Pupils should consider at this point what range of values is generated by differentiation of this kind. This is the basic requirement for finding a reasonable solution that also plots the edge transitions from light to dark areas in the resulting image. The answer is easily found: the most negative number arises when we subtract the largest number (255) from the smallest (0), i.e. -255; the largest positive number is obtained by subtracting nothing (0) from the largest number (255), so it remains 255. The new range after a single differentiation of the kind discussed here is thus the interval from -255 to +255. Clipping negative values by setting them to 0 makes no sense in this case, because then these existing edges would not be displayed. Negative values must be caught in a different way, so that all possible edge pixels are found and the permitted range of values is not left. Negative values can, for example, be caught by taking the absolute value (see [TU-Chemnitz, 2008, p. 8]). Let's examine that (Figure 15):
Figure 15. As Figure 14, but with negative values caught by the absolute value.
We actually find all the edges we were searching for.

PUPILS' EXPERIMENTS WITH A GENERAL SPREADSHEET
Implementation and its Problems

As mentioned above, it makes sense to let the pupils experiment with filters independently. They can determine the entries of a filter matrix themselves and quickly test the effects of the filter procedure on an image. For this they can create a spreadsheet with three tables: one with the (greyscale) pixel values of the source image, one for entering the filter matrix, and one for the pixel values of the resulting image.
To make the document suitable for experiments, it makes sense to keep it as general as possible. It is therefore recommended to use a filter matrix that is as large as possible, e.g. of size 21 x 21, as can be seen in Figure 16. If a smaller filter matrix is required, the outer cells are simply filled with zeros, and the entire spreadsheet can then be used essentially unchanged. In Figure 16 you can see two matrices. The reason is that it is often useful to divide all matrix entries by a certain value: if we calculate the mean, for example, each pixel has to be weighted with 1/(number of entries). Also for the sake of clarity, it is useful to factor out such common divisors. Therefore, in my file the divisor by which all the values are divided is entered between the two matrices. The right matrix is filled with the (not yet divided) entries; the left matrix contains formulas that take the values of the right matrix and divide them by the divisor. In a classroom implementation, make sure the column widths are selected appropriately, so that (for clarity) the whole matrix fits on the screen and there is not too much space between entries with few digits, which would tear the matrix apart optically. But the real challenge for the pupils when creating such a generally usable spreadsheet is not the handling of the matrix, but the actual filter procedure. The pixels on the fringes of the image should simply keep their values from the source image; other boundary treatments can of course be introduced later. For simplicity, one starts with the largest fixed filter matrix, in my case of size 21 x 21, even if one later works only with, say, a 7 x 7 matrix and the remaining matrix entries are set to 0. This means that, in my case, a fringe of width 10 pixels is excluded from the actual filtering, even though for a 3 x 3 matrix a width of one pixel would be enough. As the fringes of the image are negligible here, this does not matter; but if one wants, one can work out a more accurate dynamic determination of the fringes, e.g. with the help of appropriate IF conditions. The actual image filtering starts at the top left cell that no longer belongs to the fringes. Here a (long) formula has to be entered in which the new pixel value is calculated. This formula can then be copied to the other cells representing pixels to be filtered, and we get the desired result. But for several reasons this is not as easy as it sounds. The formula for the filter process with a 21 x 21 matrix is very long: after all, a sum of 21² = 441 products, each multiplying one matrix entry by one pixel value, has to be built.
Figure 16. Screenshot of worksheet with the filter matrix.
For matrices of size 3 x 3 this would be accomplished relatively quickly, but in a formula referring to 882 different cells (matrix entries and pixel values) the probability of typing or clicking errors is very high. However, since in building the formula one (almost) always accesses neighbouring cells (iterating through the matrix), one can implement a small utility program that creates the formula; the formula produced by this program can then be pasted into the cell in the spreadsheet. To write such a utility program quickly, it is important to get an idea of the formula and begin to write it down: =RUNDEN('source image (greyscale)'!A1*'filter matrix'!$A$1+'source image (greyscale)'!A2*'filter matrix'!$A$2+'source image (greyscale)'!A3*'filter matrix'!$A$3+'source image (greyscale)'!A4*'filter matrix'!$A$4+'source image […] (greyscale)'!A20*'filter matrix'!$A$20+'source image (greyscale)'!A21*'filter matrix'!$A$21+'source image (greyscale)'!B1*'filter matrix'!$B$1+'source image (greyscale)'!B2*'filter matrix'!$B$2+'source image […];0) The distinction between absolute and relative references is very important. While the references to the surrounding pixels have to change when the formula is copied into the cells of neighbouring pixels, the entries of the filter matrix always remain in the same place and therefore must be referenced absolutely; the references to the surrounding pixels, however, must be relative. If you then try to copy the formula produced by the utility program into the spreadsheet, you may face another problem: for example, you get an error message saying that the maximum number of characters for a formula (of the order of 8192) was exceeded. Here you must reflect on how to shorten the formula. A possible solution is to shorten the names of the worksheets, because "source image (greyscale)" or "filter matrix" already take up a lot of space. But how long may the new short names of the worksheets be? Can the names even be made so short that the formula fits within the maximum allowable number of characters? Here arithmetic provides the answer. As already mentioned, there are 882 references to other cells. Of these, 441 refer to cells of pixel values and are therefore relative references. For the chosen image size, column names of 2 characters (letters) are needed to address the entire width of the image, and 3 characters (digits) for the rows; such a reference therefore needs (without the worksheet name) up to 5 characters. With the 441 references to the cells of the filter matrix it looks a bit different: the row numbers need only 2 characters (digits), and 1 more character (letter) suffices for the column name of a 21 x 21 matrix. Since these references must be absolute, 2 more characters (dollar signs) per reference are added, so again up to 5 characters per reference (without the worksheet name). Under these assumptions, up to 882 · 5 = 4410 characters are needed for the references alone, so about 8192 - 4410 = 3782 characters are left for the names of the worksheets. Considering that each of the 882 references carries a worksheet name, there remain about 3782 / 882 ≈ 4 characters per reference for the name of a worksheet. Therefore
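Such a utility program needs only a few lines. Here is a minimal sketch in Python, assuming the shortened worksheet names introduced below ("src" for the source values, "m" for the filter matrix); the helper for the column letters and all other names are illustrative:

```python
# Prints the spreadsheet formula for one result cell, assuming the
# 21 x 21 filter matrix lies in m!A1:U21 and the target cell's
# 21 x 21 neighbourhood of source pixels starts at src!A1.

def col_name(n):
    # 1 -> A, 2 -> B, ..., 27 -> AA, ... (spreadsheet column letters)
    name = ""
    while n > 0:
        n, r = divmod(n - 1, 26)
        name = chr(ord("A") + r) + name
    return name

SIZE = 21
terms = []
for i in range(1, SIZE + 1):        # columns of the filter region
    for j in range(1, SIZE + 1):    # rows of the filter region
        pixel = f"src!{col_name(i)}{j}"    # relative reference
        weight = f"m!${col_name(i)}${j}"   # absolute reference
        terms.append(f"{pixel}*{weight}")
# RUNDEN is the German ROUND, as in the formula shown above.
print("=RUNDEN(" + "+".join(terms) + ";0)")
```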
I have chosen the name "src" (source) for the worksheet with the pixel values of the source image and "m" for the worksheet with the filter matrix; for consistency, I changed the name of the worksheet of the resulting image to "dest". If the utility program that prints the formula is adapted accordingly, it is now easy to insert the formula into the spreadsheet. Another possible solution would be to reduce the number of references by reducing the size of the filter matrix, but this allows fewer experiments, since comparisons between smaller and larger matrices can be interesting. As mentioned earlier, you can paste the formula created by the utility program into the cell located at the very top left that is no longer on the fringes of the image, and then copy it to the other cells. But that sounds easier than it is: after all, the formula consists of several thousand characters that have to be copied into over 50000 cells! Depending on your computer, resource problems are unavoidable, and it will take a lot of patience. I can only recommend not copying the formula into all the other cells at once, but proceeding in stages, and saving the file after each stage of copying. I also recommend keeping the image size rather small. I created my file for images of size 335 x 224; originally I had planned an image size about four times as large, but I gave up for lack of time, as copying the formulas and saving the document took very long on my computer. My file ended up at over 100 MB, and opening it on my computer took about 10 minutes; from this you can estimate the resources needed for an image four times as large! But these, after all, are also important experiences for pupils. The construction or use of such a spreadsheet demonstrates the complexity of an implementation of simple basic principles. Similar to possible timing studies when filtering with a self-implemented program, pupils can see that the basic principles work, although significant improvements may be needed to use them sensibly in real-time applications.

Example to Test the General Spreadsheet

Here is a little test of the general filter procedure implemented in a spreadsheet. As the source image I used the following greyscale image (Figure 17):
Figure 17. Test image (greyscale).
I filled the (right) filter matrix entirely with 1s and set the divisor to the number of filter entries, i.e. 21² = 441. Thus the mean of the values lying under the filter (that is, over the image) is calculated. The smoothing effect mentioned above can be seen well in the resulting image (Figure 18):
Figure 18. Resulting image by smoothing using a filter matrix of size 21 x 21.
Due to the size of the filter matrix, the smoothing effect is so strong that the source image is no longer really recognizable. Because of the differences in brightness you can still see very rough structures, such as the bright area in the lower left zone of the image containing some flowers, or the bright area in the upper zone, where the concrete slabs, lighter than the underlying plants, lie. The fringes of the image are also striking: these are now 10 pixels wide, and this raw area is clearly visible. In all previous examples, where the untreated fringes were only one pixel wide, this went unnoticed because of their thinness. But the very thickness of the fringes could now provide an extra incentive to deal with boundary treatment in detail.

Comments

With such a spreadsheet pupils can experiment with different filters, not only with different filter entries but also with different sizes of the matrix. In addition, experience is gained with the available resources, which is important, because many pupils believe that resource problems are solved automatically by the next computer generation, which is normally not true: it usually depends on the design of the algorithms, which can save a great deal of resources. By working in a spreadsheet, as presented in this article, the filter procedure can be implemented in general, not only for linear filters, in the mathematics classroom (without the computer science teacher). The mathematics teacher needs only a ready-made program (e.g. implemented by a computer science teacher) to import the pixel values into a spreadsheet and convert the new pixel
values back into an image. Thus they can work through the examples presented in this article without the help of a computer science teacher or deeper knowledge of computer science.

NOTES

1. Suggestions for literature: [Weigand, 1989], [Weigand, 1993].
2. See the section "EDGE DETECTION".
3. Different suggestions for the treatment of image fringes can be found in [Burger et al., 2006, p. 113] (e.g. keeping the original pixel values, or assuming that the image continues beyond the fringes). In [Burger et al., 2006, p. 91] there is the hint that the treatment of the image fringes can require more effort than the large inner part of the image.
4. Since each pixel is weighted by zero, i.e. multiplied by zero, and the sum of zeros is again zero, the result is (apart from the untreated fringes) a completely black image; the image is thus deleted.
5. The route via the one-dimensional case is also taken, e.g., in [Burger et al., 2006, p. 118ff].
REFERENCES

Burger, W., & Burge, M. J. (2006). Digitale Bildverarbeitung: Eine Einführung mit Java und ImageJ (2., überarbeitete Auflage). Berlin, Heidelberg: Springer-Verlag. ISBN 978-3-540-30940-6 (Print), 978-3-540-30941-3 (Online). (Also available in English: Burger, W., & Burge, M. J. (2008). Digital image processing: An algorithmic introduction using Java. Springer. ISBN 978-1-84628-379-6.)
Pehla, M. (2003). Beispiel aus dem Tutorial "Bildverarbeitung mit Java", 25.03.2003. Retrieved October 19, 2008, from http://www.dj-tecgen.com/downloads/Beispiele/Beispiel5.java
Tönnies, K. D. (2005). Grundlagen der Bildverarbeitung. Pearson Studium. ISBN 3-8273-7155-4.
TU-Chemnitz (2008). Kapitel 1: Farbräume – Warum nachts alle Katzen grau sind. Retrieved December 13, 2008, from http://www-user.tu-chemnitz.de/~niko/biber/tutorial/tutorial.pdf
Weigand, H. G. (1989). Zum Verständnis von Iterationen im Mathematikunterricht. Dissertation. Bad Salzdetfurth: Verlag Franzbecker. ISBN 3-88120-183-1.
Weigand, H. G. (1993). Zur Didaktik des Folgenbegriffs (überarbeitete Habilitationsschrift). Mannheim: BI. ISBN 3-411-16221-X.
Thomas Schiller
Linz (Austria)
HANS-STEFAN SILLER
16. MODELLING AND TECHNOLOGY

Modelling in Mathematics Education Meets New Challenges
INTRODUCTION
Mathematics is a part of our daily life. Many (problem-)solving techniques were discovered through mathematical discussion or through mathematical activities involving real objects. Since the end of the 1970s educational researchers have been looking at real-life situations that can be solved by mathematical methods and applied in mathematics education. For this reason it is necessary to show what real-life problems can offer mathematics education. Hence it is also necessary to think about didactically reflected and approved principles for education, so that real-life situations can be implemented successfully in mathematics education. In thinking about this subject, we have to ask ourselves certain questions: "Where can I find mathematics in our daily life? Which (particular) educational principle allows teaching it? How will mathematics be needed in the future?" Looking at various literature sources, certain accepted proposals can be found (see Siller, 2008). Very important skill requirements for students are knowing and understanding basic competencies in mathematics, such as
– representing, representing mathematically,
– calculating, operating,
– arguing, reasoning,
– interpreting,
– creative working.
These aims are not new! They can be found, for example, in Wittmann (1974), who describes general educational objectives and basic techniques that help to create the knowledge and understanding of basic mathematical principles and basic mental activities, as Winter (1972, pp. 11–12) has stated. Following these formulated targets, consequences for mathematics education arise. Maaß & Schlöglmann (1992) have formulated central postulates that allow the implementation of real-life problems to be combined with the learning of basic mathematical principles:
– Mathematics education should convey an extensive and balanced picture of mathematics itself. All aspects of the subject (historic, systematic as well as algorithmic) are important for students, so that they achieve a well-founded picture of mathematics: concerning reality, as a particular science, and regarding its applicability to certain problems and its effect on society.
– Learning for life! Mathematics education should help students to be able to solve (simple) problems of daily life without studying the subject itself. Mathematics should be seen as a construction of reality.
– Learning for a profession! Mathematics education should prepare students, to a certain extent, for higher education.
These requirements raise a lot of issues for dedicated teachers. They need guidelines which they can follow and which are prepared for practical use. The idea of mathematical modelling is an adequate one for such purposes. Other reasons for taking the idea of mathematical modelling into account are arguments of general education, such as
– development of personality,
– exploration of the environment,
– equal participation in (all aspects of) society,
– establishment of rules and measures.
Following such arguments, (mathematical) modelling can be seen as a powerful method for mathematics education. It can also be found in many passages of curricula. For example, the Austrian curriculum for Mathematics (2004) states: "Mathematics in education should contribute to students being able to take on their accountability for lifelong learning. This can happen through an education in analytical-coherent thinking and through engagement with mathematical content which has a necessary, fundamental impact in many areas of life. Acquiring these competencies should help students to know the multifaceted aspects of mathematics and its contribution to several different topics. The description of structures and processes in real life with the help of mathematics allows the understanding of coherences and the solving of problems through a deepened resulting access, which should be a central aim of mathematics education. […] An application-oriented context points out the usability of mathematics in different areas of life and motivates students to gain new knowledge and skills. Integration of the several inner-mathematical topics should be striven for in mathematics and through adequate interdisciplinary teaching sequences. The minimal realization is broaching the issue of application-oriented contexts in selected mathematical topics; the maximal realization is the constant addressing of application-oriented problems, and discussion and reflection of the modelling cycle regarding its advantages or constraints."
The aims of modelling can be realized if a certain schedule, designed by Blum (2005), is observed (Figure 1).
Figure 1. Cycle for mathematical modelling by Blum (2005).
MODELLING WITH THE HELP OF TECHNOLOGY
Using computers in education makes it possible to discuss problems taken from the students' life-world. Motivation in mathematics education benefits, because students recognize that the subject is very important in their everyday life. If students can be motivated in this way, it becomes easy to discuss and teach the necessary basic or advanced mathematical content. By using technology, difficult (mathematical) procedures in the modelling cycle can be carried out by the computer. Sometimes the use of technology is even indispensable, especially for
– computationally intensive or deterministic activities,
– working with, structuring or evaluating large data sets,
– visualizing processes and results,
– experimental work.
By using technology in education it is possible to teach traditional content in a manner different from conventional methods, and it becomes very easy to take up new content for education. The centre of education should be the discussion of open, process-oriented examples. They are characterized by the following points. Open process-oriented problems are examples which
– are real applications, e.g. betting in sports (Siller & Maaß, 2009), not dressed-up exercises for mathematical calculations,
– develop out of situations that are thoroughly analyzed and discussed,
– can contain irrelevant information that must be eliminated, or missing information that must be found, so that students are able to discuss them,
– cannot be solved at first sight; the solution method differs from problem to problem,
– need competencies not only in mathematics; other competencies are also necessary for a successful treatment,
– motivate students to participate,
– provoke and open new questions for further, as well as alternative, solutions.
The teacher takes on a new role in this setting: he becomes a kind of tutor who advises and guides students, and the students are able to discover the essential things on their own. Major goals of education will be met if technology is used in such a way (cf. Möhringer, 2006):
– Pedagogical aims: with the help of modelling cycles it is possible to connect skills in problem solving and argumentation. Students learn application competencies in elementary or complex situations.
– Psychological aims: modelling supports the comprehension and the memory of mathematical content.
– Cultural aims: modelling supports a balanced picture of mathematics as a science and its impact on culture and society (cf. Maaß, 2005a & Maaß, 2005b).
– Pragmatic aims: modelling of problems helps us to understand, cope with and evaluate known situations.
The integration of technology into the modelling cycle can be helpful by leading to an intensive use of technology in education. How it could be implemented can be seen in Figure 2.
Figure 2. Extended modelling cycle – regarding technology.
The "technology world" describes the "world" in which problems are solved with the help of technology. This could be a concept of modelling in mathematics as well as in an interdisciplinary context with computer science education. This extension meets the idea of Dörfler & Blum (1989, p. 184), who state: "With the help of computers which are used as mathematical aids it is possible to escape routine calculation and mechanical drawing, which can be a big advantage in particular for the increasing orientation towards applications. Because it is possible to carry out all common methods taught in school with a computer, mathematics education meets a new challenge and (scientific) mathematics educators have to answer new questions."

Example

Technology enables students to reflect on certain modelling activities which take place in traditional tasks. With the help of dynamic graphical software, traditional solutions can be checked and visualized in a modern way. I want to show one possibility using a traditional example from the area of optimization:

The owner of a power-drink company is looking at the costs of producing the tins in which the soft drink is sold. He recognizes that the costs are rising to a level which he is not able to fund. So he is looking for alternatives and asks you, a student of mathematics, for a competitive solution. Think about an answer which you could give to the owner of the company!

First step – Constructing a suitable model for the given situation. The given situation is a "real situation" from reality: the owner of a certain company wants to optimize the production costs for tins. First of all the students will look at the volume of a typical tin. For this they have to go to a supermarket and look at the tins of a certain soft drink. For this
example I want to determine the volume of such a tin as V = 250 ml. After that the students have to think about the situation and realize that the surface of such tins should be minimized.

Second step – Reducing the parameters. Students will recognize that the form of the tin is too complicated for an elementary calculation: the seam at the top and the recess at the bottom are too difficult for an easy description. So they should decide to idealize the tin as a cylinder, with no seams, no recesses and no tear-open tab at the top. With the help of this idealized tin the surface can be optimized with the help of a mathematical model.

Third step – Constructing a mathematical model. Using their knowledge, the students describe the surface of the tin by a function in two variables, S(r, h) = 2r²π + 2rπh. But minimizing a function of two variables with elementary methods will not succeed, so the constraint 250 = r²πh is needed. Substituting h in terms of r into the function S(r, h) yields a function in one variable, S(r), which can be handled easily.

Fourth step – Solving the problem. The students calculate the minimum of the function with the help of differential calculus and with the help of technology. The solution found can be visualized in a graph (Figure 3).
Figure 3. Graphical solution.
The minimum can be found graphically at the cross position, which is marked by the ellipse. With the help of dynamic graphical software, a tool was developed which is able to calculate the solution and show it in one step. The student has to know the software well; then he is able to implement this problem with the help of such software. Students get the possibility to reflect on their mathematical
solutions and are able to visualize them in a way which shows very impressively the results of such an optimization (Figure 4):
Figure 4. Visualizing the solution.
We know that the gradient of the tangent is 0 when the minimal surface area is reached (Figure 5). Therefore the tangent at the point P is sketched and the minimal surface area can be determined; it is found at the point P (3.41, 4.39). The same value can be found by a calculation.
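Such a calculation can also be handed over to a computer algebra system, in keeping with the role of technology in this chapter. Here is a minimal sketch using the Python library sympy; the choice of sympy and all names are my illustration, not part of the original example:

```python
# Minimal sketch: minimal surface of the idealized tin, V = 250 ml.
import sympy as sp

r = sp.symbols("r", positive=True)
V = 250
h = V / (sp.pi * r**2)                    # constraint: 250 = r^2*pi*h
S = 2 * sp.pi * r**2 + 2 * sp.pi * r * h  # surface S(r) after substitution

r_min = sp.solve(sp.diff(S, r), r)[0]     # solve S'(r) = 0
print(sp.N(r_min), sp.N(h.subs(r, r_min)))
# r ≈ 3.41 cm, h ≈ 6.83 cm, i.e. h = 2r
```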
Figure 5. The result of the optimization.
Fifth step – Interpreting and justifying the solution. From a detailed view of the results for the radius and the height of the tin it is easy to see that the minimal surface is reached when the height equals two times the radius (h = 2r), which can be shown through a simple analogous calculation; let us call it a pseudo-proof.
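For completeness, the calculation behind this pseudo-proof can be written out for a general volume V (this derivation is my addition and follows the steps above): from the constraint V = r²πh we get h = V/(r²π) and thus S(r) = 2r²π + 2V/r. Setting S'(r) = 4rπ - 2V/r² = 0 gives V = 2r³π, and therefore h = V/(r²π) = 2r³π/(r²π) = 2r. The optimal height is always twice the radius, independently of the chosen volume.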
Sixth step – Validating the solution. Students could search for such tins, for instance in a supermarket, and they will find them quite easily. My hypothesis on finding surface-optimized tins in reality is the following: "If the design is not important to the producers, the surface is optimized. If the design, or maybe a company logo, is important, then the tins are not optimized." I have not proved this hypothesis, but if we have a look at tins in the supermarket it becomes plausible: tins in which drinks are kept are not surface-optimized; tins in which foods are kept, like tins for vegetables, especially (sweet) corn, are surface-optimized, because the design is not important to the buyer. If we have a look at the pictured corn tin (see Figure 6), my conjecture appears reasonable.
Figure 6. Corn-tin.
Advancements in the model. As we have seen, the situation of modelling a tin and calculating the optimal surface area for a given volume has been extremely idealized. As a first modelling cycle this is a proper approach, but students will surely
ask what they have to do if other parameters, like the seam, should be considered. For this purpose a computer algebra system should be used, because it is difficult to implement such conditions with the help of dynamic graphical software. For a better understanding of real tins it is even necessary to discuss them with seams, which was done by Deuber (2005).

CONCLUSION
By integrating technology into education, routine jobs like differentiating or integrating functions or manipulating mathematical terms are taken over by the machine, and the focus on conducting routine calculations is diminished. The importance of calculation technique, a typical attribute of conventional education, loses ground to the importance of interpreting and reasoning. By including real-life problems, through the aspect of modelling, these aspects can be reinforced with the help of technology. So the role of modelling in education is strengthened, and students can become aware of the enormous prominence of modelling in mathematics education.

REFERENCES

Blum, W. (2005). Modellieren im Unterricht mit der "Tanken"-Aufgabe. In Mathematik lehren, H. 128 (pp. 18–21). Seelze: Friedrich Verlag.
Austrian curriculum for Mathematics (2004). AHS-Lehrplan Mathematik. Wien: BMUKK. Retrieved from http://www.bmukk.gv.at
Deuber, R. (2005). Lebenszyklus einer Weissblechdose – Vom Feinblech zur versandfertigen Dose. Baden. Retrieved from http://www.swisseduc.ch/chemie/wbd/docs/modul2.pdf
Dörfler, W., & Blum, W. (1989). Bericht über die Arbeitsgruppe "Auswirkungen auf die Schule". In J. Maaß & W. Schlöglmann (Eds.), Mathematik als Technologie? – Wechselwirkungen zwischen Mathematik, Neuen Technologien, Aus- und Weiterbildung (pp. 174–189). Weinheim: Deutscher Studienverlag.
Maaß, J., & Schlöglmann, W. (1992). Mathematik als Technologie – Konsequenzen für den Mathematikunterricht. In mathematica didactica, 15. Jg., Bd. 2, pp. 38–58.
Maaß, K. (2005a). Modellieren im Mathematikunterricht der S I. In JMD, 26(2), pp. 114–142. Wiesbaden: Teubner.
Maaß, K. (2005b). Stau – eine Aufgabe für alle Jahrgänge! In PM, 47(3), pp. 8–13. Köln: Aulis Verlag Deubner.
Möhringer, J. (2006). Bildungstheoretische und entwicklungsadäquate Grundlagen als Kriterien für die Gestaltung von Mathematikunterricht am Gymnasium. Dissertation, LMU München.
Siller, H.-St. (2008). Modellbilden – eine zentrale Leitidee der Mathematik. Schriften zur Didaktik der Mathematik und Informatik an der Universität Salzburg. Aachen: Shaker Verlag.
Siller, H.-St., & Maaß, J. (2009). Fußball EM mit Sportwetten. In A. Brinkmann & R. Oldenburg (Eds.), ISTRON – Anwendung und Modellbildung im Mathematikunterricht (pp. 95–113). Hildesheim: Franzbecker.
Winter, H. (1972). Über den Nutzen der Mengenlehre für den Arithmetikunterricht in der Grundschule. In Die Schulwarte, 25, Heft 9/10, pp. 10–40.
Wittmann, E. Ch. (1974). Grundfragen des Mathematikunterrichts. Wiesbaden: Vieweg.
Hans-Stefan Siller
IFFB – Dept. for Mathematics and Informatics Education
University of Salzburg
LIST OF CONTRIBUTORS
Manfred Borovcnik
[email protected]
Ramesh Kapadia
[email protected]
Astrid Brinkmann
[email protected]
Klaus Brinkmann
[email protected]
Tim Brophy
[email protected]
Jean Charpin
[email protected]
Simone Göttlich
[email protected]
Thorsten Sickenberger
[email protected]
Günter Graumann
[email protected]
Ailish Hannigan
[email protected]
Herbert Henning
[email protected]
Benjamin John
[email protected]
Patrick Johnson
[email protected]
John O’Donoghue
[email protected]
Astrid Kubicek
[email protected]
Jim Leahy
[email protected]
Juergen Maasz
[email protected]
Hans-Stefan Siller
[email protected]
Thomas Schiller
[email protected]