SOME APPLICATIONS OF STATISTICS TO METEOROLOGY

HANS A. PANOFSKY
Evan Pugh Research Professor of Atmospheric Sciences
The Pennsylvania State University

GLENN W. BRIER
Chief, Meteorological Statistics Section
U.S. Weather Bureau

College of Earth and Mineral Sciences
The Pennsylvania State University
University Park, Pennsylvania
Foreword

Statistics have always been used in connection with weather. The claim that rain often follows the appearance of a halo, or rising or falling pressure, is really based on a statistical association between two types of phenomena. Sometimes, no physical explanations can be found for the relationships. One purpose of statistics is to develop objective means to summarize weather information, to find relationships between the data, and to draw conclusions from them. Some of the methods used to accomplish this are of recent development, and have been made practical by the availability of weather information on punched cards, and by the rapid manipulation of information by electronic computers. Partly as a result, statistical techniques of map analysis and forecasting are beginning to replace subjective methods. Further, increased applications of statistical techniques are being found in meteorological research.
This book is designed to introduce the meteorologist to the statistical techniques most often applied in the field. Some of the more advanced techniques are treated only briefly; in such cases, emphasis is placed not so much on details of procedure as on an explanation of the meaning of the techniques, and of the assumptions on which the techniques are based.
The authors are indebted to the following for permission to reproduce tabular material and other information from their books and pamphlets: to Professor Sir Ronald A. Fisher and to Messrs. Oliver and Boyd, Ltd., Edinburgh, for permission to print Table 14, abridged from Table IV of "Statistical Tables for Biological, Agricultural, and Medical Research," and Table 11, abridged from Table III of the same publication; to Frank Wilcoxon and the American Cyanamid Company for permission to print Table 17, abridged from the pamphlet "Some Rapid Approximate Statistical Procedures"; and to the Iowa State College Press for permission to print Table 19 from the publication "Statistical Methods."
The authors also gratefully acknowledge the contributions to the book by many of their colleagues, who read and criticized sections of the manuscript.

HANS A. PANOFSKY
GLENN W. BRIER

University Park, Pa.
January 1958
Contents

                                                                       Page
Introduction .......................................................... iii
Foreword .............................................................. iv
Contents .............................................................. v
Tables ................................................................ viii

Chapter
   I. FREQUENCY DISTRIBUTIONS ......................................... 1
        Introduction .................................................. 1
        The Frequency Distribution .................................... 3
        The Histogram ................................................. 6
        The Cumulative Frequency Distribution ......................... 7
        The Frequency Distribution in Relation to Probability ......... 9
        The Probability Histogram ..................................... 13
        Frequency Distributions of Vectors ............................ 13
        Averages ...................................................... 16
        Computation of the Mean ....................................... 18
        Evaluation of Median and Mode ................................. 20
        Statistics Versus Parameters .................................. 21
        Wind Averages ................................................. 21
        Degree Days ................................................... 24
        Measures of Variability ....................................... 25
        The Mean Deviation ............................................ 26
        The Standard Deviation ........................................ 26
        Variance ...................................................... 28
        Skewness ...................................................... 29
        Kurtosis ...................................................... 30

  II. THEORETICAL FREQUENCY DISTRIBUTIONS ............................. 32
        Introduction .................................................. 32
        The Binomial Distribution ..................................... 33
        The Normal Distribution ....................................... 35
        The Poisson Distribution ...................................... 39
        Transformations ............................................... 40

 III. SAMPLING THEORY ................................................. 46
        Introduction .................................................. 46
        Binomial Testing Generally .................................... 51
        More Than Two Categories ...................................... 53
        Significance of Means ......................................... 58
        Difference of Means of Independent Samples .................... 63
        Difference of Means of Paired Samples ......................... 64
        Comparison of Means by Rank Methods ........................... 66
        Analysis of Variance .......................................... 69

  IV. RELATIONSHIP BETWEEN TWO VARIATES ............................... 80
        Joint Distribution of Two Variates ............................ 80
        Correlation (Numerical Variates) .............................. 82
        Linear Correlation Coefficients ............................... 87
        Scatter Diagrams and Line of Regression ....................... 88
        Linear Prediction ............................................. 90
        Significance of Linear Correlation Coefficients ............... 92
        Curvilinear Regression ........................................ 96
        Correlation of a Numerical Variate with a Dichotomous Variate . 98
        Relations between Qualitative Variates ........................ 101

   V. ANALYSIS OF RELATIONSHIPS BETWEEN MORE THAN TWO VARIABLES ....... 105
        Graphical Representation ...................................... 105
        Three-Dimensional Regression .................................. 109
        Linear Multiple Correlation ................................... 111
        Significance of Linear Multiple Correlation Coefficients ...... 112
        "Automatic" Correlation ....................................... 113
        Partial Correlation ........................................... 114
        Regression with More than Three Variables ..................... 116
        Extension to Nonlinear Prediction Equations ................... 117
        The Discriminant Function ..................................... 118
        Graphical Estimation .......................................... 122
        The Multiple Variate Chart .................................... 123
        Probability Forecasts ......................................... 124

  VI. TIME SERIES ..................................................... 126
        Introduction .................................................. 126
        Reality of Regular Cycles ..................................... 127
        Analysis of the Periodic Fluctuations ......................... 128
        Harmonic Functions ............................................ 134
        Elimination of the Regular Cycles ............................. 135
        Statistical Analysis of the Irregular Cycles .................. 136
        Autocorrelation Functions ..................................... 138
        Spectrum Analysis ............................................. 140
        Sampling Fluctuations of Spectral Estimates ................... 145
        Purpose of Spectrum Analysis .................................. 146
        Smoothing and Filtering ....................................... 147
        Correlation of Time Series .................................... 153
        Cross Spectrum Analysis ....................................... 155
        The Superposed Epoch Method ................................... 159

 VII. SPACE VARIATION OF METEOROLOGICAL VARIABLES ..................... 162
        Objective Analysis ............................................ 162
        Factor Analysis in Meteorology ................................ 166
        Space Smoothing ............................................... 170

VIII. STATISTICAL WEATHER FORECASTING ................................. 174
        Introduction .................................................. 174
        Selection of Predictors ....................................... 175
        Classification of Statistical Forecasting Methods ............. 178
        Multiple Linear Regression .................................... 178
        Successive Graphical Regression ............................... 180
        Stratification ................................................ 183
        Residual Method ............................................... 186
        Mixed Methods ................................................. 188
        Conclusion .................................................... 189

  IX. FORECAST VERIFICATION ........................................... 191
        Introduction .................................................. 191
        Purposes of Verification ...................................... 192
        Fundamental Criteria to be Satisfied .......................... 195
        Verification Methods and Scores ............................... 198
        Control Forecasts for Comparison .............................. 204
        Significance of Forecasts ..................................... 206
        Pitfalls of Verification ...................................... 206

Appendix .............................................................. 209
        Chapter I ..................................................... 209
        Chapter III ................................................... 210
        Chapter IV .................................................... 212
        Chapter V ..................................................... 214
        Chapter VI .................................................... 217
        Chapter VII ................................................... 219

Index ................................................................. 220
Tables

Table                              Title                                Page
  1  Three Types of Frequency Distribution ............................ 4
  2  Cumulative Frequency Distribution from Table 1, No. 1 ............ 7
  3  Two-way Frequency Distribution of Wind Speed and Direction ....... 14
  4  Computation of the Mean by the Short Method ...................... 19
  5  Short Method Computation of X̄ and s .............................. 28
  6  Binomial Distribution for p = 1/2, N = 10 ........................ 34
  7  Table of the Error Function Defined by
       Erf(τ) = (1/√(2π)) ∫₀^τ e^(−τ′²/2) dτ′ ......................... 38
  8  Ordinates of Normal Curve Y(τ) = (1/√(2π)) e^(−τ²/2) ............. 39
  9  Computations for Transforming Harrisburg Precipitation ........... 44
 10  Probabilities for Plotting the Fitted Gamma Distribution ......... 44
 11  Limiting Values of Chi-Square .................................... 55
 12  Forecast Contingency Table ....................................... 56
 13  No-Relation or No-Skill Table .................................... 57
 14  Distribution of "Student's" t, Given the Limiting Probability .... 59
 15  Determination of t from Differences .............................. 65
 16  Rank Method ...................................................... 66
 17  Significance Limits, Paired Samples, Rank Method ................. 67
 18  Analysis of Variance Table ....................................... 72
 19  Limiting Values of F ............................................. 76
 20  Analysis of Variance, Complex Cases .............................. 79
 21  Hypothetical Joint Probability Distribution, Wind Speed and
       Temperature in Winter (Probability in %) ....................... 82
 22  Computation of Line of Regression and of Linear Correlation
       Coefficient by the Short Formula ............................... 86
 23  Analysis of Variance Table for Testing the Significance of a
       Regression Relationship ........................................ 89
 24  Analysis of Variance for Testing the Significance of the
       Correlation Coefficient ........................................ 93
 25  Relation Between Wind Direction and Temperature at State
       College, Pa., January 1943-1952 ................................ 97
 26  Dew Point and Fog Frequency ...................................... 99
 27  Relation Between Meridional Flow and Weather ..................... 101
 28  No-Relation Table Corresponding to Table 27 ...................... 101
 29  Relation Between Thunderstorms and Pressure Tendency ............. 103
 30  Graphical Representation of Relation Between Three Variables ..... 108
 31  Linear Regression Applied to the Data of Table 30 ................ 111
 32  Analysis of Variance Table for Significance Test of Multiple
       Correlation Coefficient ........................................ 113
 33  Multiplicands (2/N) sin (360° it/12) and (2/N) cos (360° it/12)
       for Harmonic Analysis of 12 Observations ....................... 132
 34  Data Showing Temperatures as Function of Time .................... 133
 35  Filtering of Time Series ......................................... 148
 36  Lag Correlation Coefficients Between Geostrophic Winds at
       60° N and 40° N ................................................ 154
 37  Lag (Days) of Winds at 40° N After Winds at 60° N ................ 157
 38  Coherence Between Winds at 60° N and 40° N ....................... 158
 39  Limiting Values of Coherence ..................................... 158
 40  Weighting Schemes for Space Smoothing ............................ 171
 41  Contingency Table for Precipitation Forecasts .................... 198
 42  Per Cent of Time Each Forecast Event Occurred for a Particular
       Category ....................................................... 199
 43  Per Cent of Time Each Observed Category was Correctly Forecast ... 200
 44  Example of Forecasts Stated in Terms of Probability .............. 202
 45  Verification of a Series of 85 Forecasts Expressed in Terms of
       the Probability of Rain ........................................ 203
CHAPTER I

FREQUENCY DISTRIBUTIONS
1. Introduction. Statistical methods are used in meteorology primarily to analyze past weather data and draw conclusions about the behavior of similar quantities in the future. More rarely, they are used to test the significance of physical experiments, or to evaluate the efficiency of forecasters or of forecasting procedures.

In forecasting by statistical methods we may distinguish two time scales: for periods of longer than perhaps five days, we deal with climatology; for less than five days, with synoptic forecasting. In both these fields statistics play an important role: in climatology, the meteorological data are summarized statistically according to space and time, without any reference to the conditions immediately preceding. For the purpose of forecasting over shorter periods, the emphasis is placed on relationships between the meteorological variables at a certain time, and similar or different variables at a previous time.
In the case of short-range forecasting, considerable progress has been made by physical methods which are based on the solution of the basic equations of meteorology with the aid of electronic computers. Although these methods appear promising, they can predict only the large-scale behavior of air flow, temperature, pressure and moisture. The weather at specific stations over short periods of time still needs to be related to the large-scale variables by statistical methods. Further, dynamic methods of weather prediction need, for input, smoothed and interpolated weather information; statistical procedures dictate the best methods of determining such information from observed weather phenomena.

Since at present the physical equations cannot be handled rigorously, some meteorologists have used statistical procedures even for the purpose of relating the properties of large-scale atmospheric variables at successive time periods. Although these methods are not restricted by our inability to consider the
physical equations accurately, they suffer from the possible inaccuracy of the statistical relations, as will be seen later.

In applying statistical methods to meteorological data, there are three steps to be performed:

1. Organize the data. This may be called the organization step. The observational network that supports the science of meteorology spews forth data to a degree of magnitude and persistence that staggers the mind. Such data may be numerical, e.g., temperature in degrees Fahrenheit or Centigrade, or precipitation in inches or centimeters; or the data may be qualitative (i.e., nonnumerical), for example, state of the sky as overcast, broken, scattered, or clear. Obviously, special techniques are required to organize and summarize such data.

2. Find relationships among the data. This may be called the analysis step. The synoptic-dynamic meteorologist is constantly questioning the "why" of physical processes in the atmosphere. For example, he may want to know why there is an ...

3. Test the significance of the results. This may be called the testing step. The separation of real effects from accidental variations is the major objective of many statistical investigations. The modern theory of statistics has been very successful in developing objective tests that provide a valid estimate of the probability that certain statistical relationships could have been obtained as a result of chance and therefore have no real basis in fact. However, the standard significance tests are likely to be misleading when applied to meteorological data, since the underlying assumptions upon which the tests are based may not be obeyed. Modification of the standard procedures to take care of these difficulties can be made sometimes, but often the simplest and most practical solution is to test the reliability of the statistical result on an independent set of meteorological data.
The methods described in this book are arranged in such a way that they can be applied directly to tabulated data. It should be pointed out that much labor can be saved by making use of punched cards. Especially steps 1 and 2 described above can be performed efficiently with the aid of punched cards.

Although many of the computations described here can be done longhand or by desk calculator, the extension of some of the procedures, which are outlined only briefly, requires electronic computers in order that results be available sufficiently rapidly to be useful.
2. The Frequency Distribution. The frequency distribution is used to organize data so that their characteristics may be easily and quickly summarized. Suppose, for example, we had a set of data consisting of 59 temperatures, arranged by date, say, from January 1 through February 28. The name that statisticians give to the element or thing being analyzed is "the variate" (by definition, the thing that varies), and it is denoted by X, Y, or some such capital letter. In our example, temperature is the variate; we shall use the capital letter X to symbolize temperature.

It is difficult to work with data in their raw form; finding the lowest and highest temperatures, or the value of temperature around which the other temperatures cluster, would be quite laborious. If, however, the data are rearranged in order of increasing or decreasing temperature, the highest, lowest, and middle values fairly jump from the paper. The lowest value is referred to as having the value of rank 1, the next of rank 2, and so forth to the last, or rank 59. If two temperatures are the same, they are given an average rank. For example, if the 5th and 6th lowest temperatures are 30°F, their rank would be 5.5.

We can further organize the data by separating the actual temperatures into numerical classes; this is a summarizing technique that will let us deal later with classes of temperature rather than individual temperature observations. The interval of temperature (or whatever the variate may be) over which a class extends is called the "class interval." For our purposes, the terms "class" and "class interval" are synonymous, although the latter term is more commonly used. Selection of the size of the class interval is arbitrary, but certain guiding principles, discussed below, should be observed.
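The average-rank rule for ties is easy to mechanize. The following is a minimal illustrative sketch in the Python language, added here as an example only; the function name is invented for the illustration.

# Illustrative sketch only: ranking a sample from lowest to highest,
# with tied values assigned the average of the ranks they would occupy.
def ranks_with_ties(values):
    """Return the rank of each value (1 = lowest), averaging ties."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j over the run of equal values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1   # positions i..j are 0-based
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks

# The two 30s are the 5th and 6th lowest values, so each gets rank 5.5:
print(ranks_with_ties([30, 25, 30, 28, 31, 24, 22]))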
Three kinds of class intervals may be used:

1. Class intervals that are numerical and equal in size.
2. Class intervals that are numerical and unequal in size.
3. Class intervals that are not numerical but are expressed in words.

The three kinds are discussed below.

1. Class intervals that are numerical and equal in size. This is the most satisfactory type (for later computation) and should be used if possible. In Table 1, No. 1, the 59 temperatures are arranged in a frequency distribution consisting of 6 class intervals. In selecting class intervals, the following principles will serve as a guide:
TABLE 1
Three Types of Frequency Distribution

No. 1: Temperature, °F

Class Interval      f      %
16.0-19.9           3      5
20.0-23.9           9     15
24.0-27.9          12     20
28.0-31.9          15     25
32.0-35.9          12     21
36.0-39.9           8     14
Σf = N             59    100

No. 2: Daily Rainfall Amount, Inches

Class Interval      f      %
0                  33     49
Trace (<.005)      16     24
.01-.03             7     10
.04-.10             5      7
.11-.50             4      6
.51-5.0             3      4
Σf = N             68    100

No. 3: Cloudiness

Class Interval      f      %
Clear              29     37
Scattered          18     23
Broken             13     16
Overcast           19     24
Σf = N             79    100
First, the use of too few class intervals obscures important details in the frequency distribution; the use of too many class intervals fails to organize and summarize the data as well as they might be, and risks the chance of class intervals that have zero frequency. As a rule of thumb, let the number of class intervals equal 5 times the logarithm of the number of observations (e.g., 5 log N; the letter N is used in statistics to denote the number of observations or cases being dealt with).

Second, class intervals must not overlap, since an observation can belong to only one class.

Third, if possible, no class interval should have all observations near one end of the interval. Usually it is not possible to do this exactly. In analyzing a frequency distribution, it is assumed (insofar as the mathematics is concerned) that all the observations in a class interval lie at the interval's mid-point, the "class mark." This is known as the "mid-value" assumption; we shall discuss later how valid it is.

The upper and lower limits of the class intervals are called the "tally limits," and are written to the same number of decimal places as the data are given. Thus, the tally limits in Table 1, No. 1 are 16.0 (not simply 16), 19.9, 20.0, 23.9, and so on. "Mathematical limits," on the other hand, lie midway between the upper tally limit of a class and the lower tally limit of the next higher class. These mathematical limits are used in preference to tally limits for statistical computations. A mathematical limit is always written to one more decimal place than the data given, and a particular mathematical limit is at once the upper limit of one class and the lower limit of the next higher class. Mathematical limits in Table 1, No. 1 are 19.95, 23.95, 27.95, and so on.
The size of the class interval is the distance between successive mathematical limits, and is always denoted by the letter i. In Table 1, No. 1, for example, i = 4.

After the class intervals have been chosen, the 59 observations are tallied into the proper class intervals. The frequency in each class interval ("frequency" in statistical language means number of occurrences) is listed under the column heading f in Table 1, and for each class interval is simply the total of the tallies. Obviously, Σf = N; that is, the sum of the frequencies of all the classes equals the total number of observations. The frequency distributions of many meteorological variables have peaks near the center. Conspicuous exceptions are rainfall and cloudiness.

2. Class intervals that are numerical but unequal in size. Table 1, No. 2, is a frequency distribution of rainfall amounts. If equal class intervals were used, the frequencies in the classes would be ridiculously disproportionate due to the many times that no rainfall or low rainfall occurs. In practice, however, the use of equal class intervals has many advantages and should be followed if at all possible. Guiding principles in selecting unequal class interval sizes are the same as for equal class intervals.

3. Class intervals that are not numerical but expressed in words. Table 1, No. 3, is a frequency distribution of cloudiness based on word description. Although numerical computations cannot be made on the distribution in this form, the 79 observations are conveniently summarized for quick interpretation. However, this kind of frequency distribution can be made numerical by defining clear, scattered, broken and overcast in numerical terms, or by expressing the frequency in each class interval as a percentage of the total observations.
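The tallying procedure, together with the 5 log N rule of thumb, can be sketched in Python as follows. This is an illustration added here, not a procedure from the text, and the function name is invented; the data are arbitrary sample values.

import math

# Illustrative sketch only: tally observations into equal class
# intervals, choosing roughly 5 log10(N) classes as suggested above.
def frequency_distribution(data):
    n = len(data)
    n_classes = max(1, round(5 * math.log10(n)))
    lo, hi = min(data), max(data)
    width = (hi - lo) / n_classes or 1.0          # guard against zero range
    freqs = [0] * n_classes
    for x in data:
        k = min(int((x - lo) / width), n_classes - 1)  # top value joins last class
        freqs[k] += 1
    limits = [lo + k * width for k in range(n_classes + 1)]
    return limits, freqs

limits, freqs = frequency_distribution([31.2, 18.4, 25.0, 29.9, 33.1, 22.7,
                                        27.5, 30.8, 35.6, 24.3, 28.8, 26.1])
for k, f in enumerate(freqs):
    print(f"{limits[k]:6.2f} - {limits[k + 1]:6.2f}   f = {f}")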
3. The Histogram. The picture of a frequency distribution is usually drawn by means of a histogram or frequency polygon. Figure 1 is the histogram that describes the frequency distribution given in Table 1, No. 1. The abscissa represents values of the variate,
[Figure 1. Histogram from Table 1, No. 1. The abscissa shows temperature (°F) with mathematical limits at 19.95, 23.95, 27.95, 31.95, and 35.95; the ordinate shows frequency f.]
in this case temperature. The ordinate represents the class frequencies as listed in the f-column. The histogram consists of adjacent rectangular boxes whose bases extend between successive mathematical limits and whose heights are equal to the frequencies within each class.

The area of each box is thus equal to f × i, the product of its height and base. So, areas are proportional to frequencies, but only if the class intervals are of equal length; if box A in Figure 1 is twice as large as box B, there are twice as many observations in the class interval from 27.95 to 31.95 as there are in the class interval 31.95 to 35.95. Further, since Σf = N, then

    Σfi = iΣf = Ni

Thus the total area of the histogram is equal to Ni.

The frequency polygon is a graph constructed by plotting the class frequencies at the mid-points of the class intervals (see points a, b, c, d, e, and f in Figure 1) and connecting the points by straight lines. This type of graphical presentation has the disadvantage that areas are no longer proportional to frequencies.
4. The Cumulative Frequency Distribution. We may want to know the distribution of a meteorological element in terms of "more than" or "less than" a certain value. A cumulative frequency distribution is used to determine this. Table 2 shows the cumulative frequency distribution made from the 59 temperatures that were used in Table 1, No. 1.
TABLE 2
Cumulative Frequency Distribution from Table 1, No. 1

Class Interval
of Temperature      f     UL      CF    CF(%)
16.0-19.9           3    19.95     3      5
20.0-23.9           9    23.95    12     20
24.0-27.9          12    27.95    24     41
28.0-31.9          15    31.95    39     66
32.0-35.9          12    35.95    51     87
36.0-39.9           8    39.95    59    100
The first column in Table 2 lists the class intervals of the variate; the second column lists class frequencies (thus far, Table 2 is identical with Table 1, No. 1); the third column, headed UL, lists the upper mathematical limits of the classes; the fourth column, headed CF (for "cumulative frequencies"), lists frequencies accumulated from top towards bottom in the f-column, that is, 3 + 9 = 12, 3 + 9 + 12 = 24, and so on; the fifth column, headed CF(%), lists the frequencies in the CF-column as percentages of the total observations.

Table 2 can be used directly to answer such questions as: "How many, or what percent of, the 59 temperatures are below 31.95°?" The answer, from Table 2, is 39, or 66%. To determine, further, how many temperatures are above 31.95°, we simply subtract the number of cases below 31.95° from the total observations.

Suppose we wanted to know the number of temperatures below a value, for example 33°, that does not happen to be one of the mathematical limits. Or, inversely, suppose we wanted to know that value of temperature below which lies a certain percentage of the temperatures. This can be found by linear interpolation in the CF- or CF(%)-columns of Table 2, or the answer may be read directly from the pictorial representation of a cumulative frequency distribution shown in Figure 2. That figure is called
[Figure 2. Ogive from Table 2. The abscissa shows temperature (°F) from 16.0 to 40.0; the ordinate shows CF(%). The median, 29.4°F, corresponds to CF = 50%.]
an "ogive." The ogive shows, for example, the percentage frequency of temperatures below 22°F. The figure gives about 12%. We could also use linear interpolation on Table 2. In that case we would find (remembering that mathematical limits must be used in all computations):

    CF(%) = 5% + 15%(22 − 19.95)/4 = 12.7%

The values of the variate below which lie certain fractions of the total observations are given special names in statistics; the value below which lie k% of the cases is called the k-th percentile (percentile meaning "hundredth"); the value below which lie j/30 of the observations is called the j-th trentile (trentile meaning "thirtieth"). Similarly, quartiles, quintiles and deciles refer to positions of fourths, fifths, and tenths. Some statisticians use the "-ile" words to mean certain fractions of the total number of observations, rather than precise points in the distribution. The former meaning, however, will be used exclusively here. Any -ile point can be read directly from an ogive.
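The linear interpolation used in the 22°F example above can be expressed compactly. The following Python sketch is an illustration added here, applying the same arithmetic to the Table 2 values; the function name is invented.

# Illustrative sketch only: linear interpolation in a cumulative
# frequency distribution, as in the 22 deg F example above.
# UL = upper mathematical limits, CFP = cumulative per cent (Table 2).
UL  = [19.95, 23.95, 27.95, 31.95, 35.95, 39.95]
CFP = [5, 20, 41, 66, 87, 100]

def percent_below(x, i=4.0):
    """Per cent of the 59 observations lying below the value x."""
    if x <= UL[0] - i:
        return 0.0
    for k, ul in enumerate(UL):
        if x <= ul:
            prev_cf = CFP[k - 1] if k > 0 else 0.0
            prev_ul = UL[k - 1] if k > 0 else UL[0] - i
            return prev_cf + (CFP[k] - prev_cf) * (x - prev_ul) / i
    return 100.0

print(percent_below(22.0))   # about 12.7, as computed above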
5. The Frequency Distribution in Relation to Probability. If f is the frequency of a special event and N the number of all events, the ratio f/N is called the probability of that event. Probability can be expressed as a fraction or a percentage; obviously, it cannot exceed 1 (100%) or be less than zero.
Probability can be defined a priori, from physical or mathematical considerations, or from experimental or observational data. We shall call the former mathematical probability, and the latter empirical probability.

Mathematical probability is based on a hypothetical model which may not have an exact counterpart in the universe. An example of such a model is a completely unbiased coin. Such an ideal coin has the property that heads and tails are equally likely. If we spun such a coin a very large number of times, the relative frequency of heads would approach 1/2. For any real coin, this would probably not be true due to some irregularities in the features of the coin.
Or again, such a model might be a perfect die. In this case each face would be equally likely to appear on top. We would say then that the mathematical probability of throwing a six was 1/6 since, as we throw the die a large number of times, the relative frequency of getting a six would be 1/6. Note that the mathematical probability is the limiting value of f/N for large N in the case of the hypothetical ideal model.
The probability of an event can also be determined empirically, either through experimentation or by analyzing past events. It is not necessary that a large N be used to obtain these empirical probability estimates, but they will differ greatly from one period or sample to another, especially if N is small. Thus, in the example of the ideal coin discussed above, in which the probability of a head is p = 1/2, successive groups of tosses will show fluctuations in the percentage of heads even though the physical conditions remain constant. Thus, it is important to distinguish the empirical probability (EP) computed from the sample, from the hypothetical probability p. A particular sample can provide only an estimate of p, and this estimate may be good or bad according to the size of N, as well as other factors discussed later.
As another example, the empirical probability of drawing an ace from a deck of cards can be determined by drawing a single card many times (replacing the withdrawn card after each trial) and observing how frequently an ace appears. If this were done, we would find an ace appearing somewhat more or less than 1/13 of the time. This discrepancy from the mathematical probability 1/13 might be due to the insufficient number of experiments, or a physical non-uniformity in the cards. If, however, ideal and constant conditions could be maintained for consecutive experiments, the empirical probability would be expected to approach the mathematical probability as the number of experiments approaches infinity.
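This behavior, empirical probabilities fluctuating about the mathematical probability and settling down only as N grows, is easy to demonstrate by simulation. The following Python sketch is an illustration added here, drawing cards with replacement from an idealized deck.

import random

# Illustrative sketch only: empirical probability of drawing an ace
# (mathematical probability 4/52 = 1/13), estimated from samples of
# increasing size. Small samples fluctuate widely about 1/13.
random.seed(1)
for n in (50, 500, 5000, 50000):
    aces = sum(1 for _ in range(n) if random.randrange(52) < 4)
    print(f"N = {n:6d}   EP = {aces / n:.4f}   (1/13 = {1 / 13:.4f})")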
It is evident that, in most situations, the meteorologist cannot "experiment" in the laboratory sense. He cannot make the weather satisfy his arbitrary conditions, nor can he run a re-trial of the weather if it fails to work the way he wants it to. In collecting data, the meteorologist is the ultimate passivist, waiting for the weather to happen, observing and recording it, and hoping that it can be dissected to give him the data he wants. The concept of mathematical probability is usually irrelevant in meteorology. On the other hand, empirical probability is quite meaningful, particularly if large amounts of data are available. Furthermore, if conditions are such as to make empirical probabilities computed from past data equivalent to
the empirical probabilities that can be eventually computed from future data, then the meteorologist has a basis for prediction; it is in this sense that empirical probabilities can be used in forecasting. Two tacit assumptions are being made, however, and they must be recognized to avoid later calamity:

First, that the empirical probabilities are based on a sufficiently large number of observations to make the probabilities reliable, that is, truly representative of the data. In general, the sample should extend over periods of at least ten years to avoid peaks and troughs of oscillations of that order of magnitude; such oscillations seem to be quite characteristic of meteorological data, and are, as yet, unpredictable.

Second, that future empirical probabilities will not differ markedly from past empirical probabilities. Since all meteorological variables have long period trends superimposed on the shorter oscillations, empirical probabilities determined from the past record cannot be expected to stay valid indefinitely.
Empirical probabilities can most obviously be obtained by counting. This procedure is simple when punched-card equipment is available, in which case the machines do the counting of data in any category desired. Meteorological data at many stations in the U. S. A. have been and are being punched on cards which are stored in Asheville, N. C. Empirical probabilities and other information can be obtained from the National Weather Record Center, Grove Arcade Building, Asheville, N. C.
If empirical probabilities are to be determined from data available locally, the counting procedure may be quite laborious. In that case, the frequency distributions furnish a shortcut.

Let us return to Table 1. In the %-column are listed the percent of the total observations in each class interval. Thus, in Table 1, No. 1, 3/59 = 5%, 9/59 = 15%, and so on. And these percentages also express for each class interval the empirical probability that the variate will lie in that class interval. Empirical probability and percentage frequency are here synonymous. If we were to pick at random one temperature from the 59 values summarized in Table 1, No. 1, the probability is 5% that its value is between and including 16.0 and 19.9, 15% that its value is between and including 20.0 and 23.9, and so on. Furthermore, if these empirical probabilities are reliable (according to assump-
tion 1 above), and if they are constant with time (according to assumption 2 above), then the probability is 5% or .05 that a future temperature is between and including 16.0 and 19.9, 15% or .15 that a future temperature is between and including 20.0 and 23.9, and so on.

In Table 2 the CF(%)-column lists, precisely, the empirical probabilities on a cumulative basis. Here, too, empirical probability and percentage frequency are synonymous.

The principles discussed above are illustrated by the following example: We consider the minimum temperatures at La Guardia Field, in March, between 1932 and 1947. The empirical probability of a temperature less than 20°F was 0% in the years 1945-47, but an estimate of the EP of a temperature less than 20°F for the whole period based only on these three years would have been incorrect; the actual EP for a similarly low temperature during the 15 year period was 7%. Probabilities for temperatures below freezing turned out to be 34% for 1945-47 and 43% for the whole 15 years. Although this change could be due to random variations, it can be shown to be due to a real trend in this instance.

Given the probabilities, mathematical or empirical, of two independent events, the probability that both events will occur is the product of the individual probabilities. For example, the probability of throwing a three on a die is 1/6. Therefore, the probability of throwing threes on two dice is (1/6)² or 1/36. In that case, we can assume that the two dice behave independently.
In meteorology, the assumption of independence is usually not justified. For example, the probability of a temperature below freezing in October is 1/10. What is the probability of freezing temperatures on two successive October days? If independence could be assumed, this probability would be 1/100. Actually, the occurrence of a freezing temperature indicates the existence of a certain weather regime which is likely to last another day. Hence the probability of two successive days of frost is much greater than 1/100. The precise answer can be determined only empirically.

If the probabilities of two mutually exclusive events are given, the probability that either one or the other occurs is the sum of the probabilities. For example, if the probability of rain is 20%,
and the probability of no rain is 80%, the probability of either rain or no rain is, of course, 100%. Or, if the probability of a ceiling below 500 feet is 5%, and the probability of a ceiling between 500 and 1000 feet is 7%, then the probability of a ceiling in either of these categories, that is, a ceiling below 1000 feet, is 12%.
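The frost example shows why joint probabilities of successive weather events must be counted from the record rather than multiplied. The following Python sketch is an illustration added here; the 0/1 series is hypothetical, not actual data.

# Illustrative sketch only: estimating the probability of frost on two
# successive days from a daily record (1 = frost), rather than assuming
# independence. The series below is hypothetical.
frost = [0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0]

p_frost = sum(frost) / len(frost)
pairs = list(zip(frost, frost[1:]))
p_both = sum(1 for a, b in pairs if a and b) / len(pairs)

print(f"P(frost)              = {p_frost:.2f}")
print(f"P(frost, then frost)  = {p_both:.2f}")
print(f"independence predicts   {p_frost ** 2:.2f}")

Because frosty days cluster, the counted probability of two successive frost days comes out much larger than the product of the single-day probabilities, just as the text describes.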
6. The Probability Histogram. This type of histogram is analogous to the ordinary histogram shown in Figure 1, except that the height of the boxes is not class frequency but class frequency divided by N, the total observations. The area of each box is (f/N)i, and the total area of the histogram is

    Σ(f/N)i = (i/N)Σf = iN/N = i

In this histogram, probability is proportional to area.

A slight variation of this is the "probability density" histogram, wherein the height of each box is plotted as p = f/(Ni). Thus, the area of each box is [f/(Ni)]i = f/N, and the total area of the histogram is

    Σ[f/(Ni)]i = (Σf)/N = N/N = 1

In this histogram, probability is equal to area.

Before leaving the subject of frequency in relation to empirical probability, one further point is mentioned. The question, "What is the probability of a temperature of 40°?", is completely meaningless. Only the probability of a temperature in a given range can be determined. In a probability histogram, probability is expressed by the area between two values of the variate (that is, within a certain range) and not by the ordinate erected at a single value of the variate.
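The bookkeeping just described can be checked with a few lines of Python, applied to the Table 1, No. 1 frequencies. This sketch is an illustration added here, not part of the text.

# Illustrative sketch only: probability and probability-density heights
# for the Table 1, No. 1 distribution (i = 4, N = 59).
f = [3, 9, 12, 15, 12, 8]
N, i = sum(f), 4.0

prob_heights    = [fk / N for fk in f]        # probability histogram
density_heights = [fk / (N * i) for fk in f]  # probability density

print(sum(h * i for h in prob_heights))       # total area = i = 4.0
print(sum(h * i for h in density_heights))    # total area = 1.0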
7. Frequency Distributions of Vectors. The frequency distributions described so far have dealt with a single variate, such as temperature, rainfall, cloudiness, and so on. In meteorology it is often required to organize, summarize
and analyze data on a vector quantity. A vector quantity, say wind, for example, is usually described by both magnitude and direction. Of course, if we are interested only in wind speeds, then the methods already described will apply. If, however, we want to analyze both speed and direction, a "frequency wind rose" can be constructed which gives the frequency distribution simultaneously of wind speed and direction.

First, equal intervals of wind speed are selected and their limits written down the left side of the computing page, just as though a simple frequency distribution of wind speed were to be constructed. Along the top, groups of wind directions are written; if the winds are just given to eight points, each group may consist of only one wind direction. Horizontal and vertical lines are drawn along the boundaries of the classes, dividing the sheet into m1m2 boxes, where m1 is the number of wind direction groups and m2 is the number of wind speed groups. All the observations are tallied in one of these boxes according to wind speed and direction, and the number of tallies in each box is determined and written into the box. Since wind direction has little meaning for the lowest wind speed group, the number of all the low speed groups can be combined regardless of wind direction.

When the frequencies of columns and rows are totaled separately, frequency distributions of wind direction and speed are obtained respectively. Table 3 shows such a two-way distribution.
TABLE 3
Two-way Frequency Distribution of Wind Speed and Direction

[Only the marginal totals of this table are legible here.]

Totals by direction:
  S 35, SW 51, W 50, NW 138, N 76, NE 29, E 47, SE 19

Totals by speed class (mph):
  4.0-6.9: 4;  7.0-9.9: 64;  10.0-12.9: 78;  13.0-15.9: 87;
  16.0-18.9: 63;  19.0-21.9: 58;  22.0-24.9: 37;  25.0-27.9: 17;
  28.0-30.9: 22;  31.0-33.9: 6;  34.0-36.9: 3;  37.0-39.9: 5;
  40.0-42.9: 1

  Grand total N = 445
To represent the distribution graphically, the azimuth is chosen to correspond with the wind direction, and the radial scale is labeled with wind speed. The frequencies are then entered in a direction according to the wind direction (or mid-point of the group of wind directions), at a distance from the center corresponding to the mid-point of the class interval. The number of light winds or calms may be written in the center. The frequencies are then isoplethed, with the exception of the number at the center. It was not stated whether the frequencies should be plotted at the vector end points or at points along directions from which the wind is coming; either convention is used.
Figure 3 shows a frequency wind rose for the winds at La Guardia Field in March, 1932 to 1947 (1934 missing).
[Figure 3. Frequency Wind Rose (above) and Standard Wind Rose (below) for March Winds, La Guardia Field. In the standard rose, line thickness distinguishes strong, moderate, and weak winds; the legend gives the radial scale in per cent occurrence (1% and 10%).]
Such a wind rose is useful for many practical purposes, for example, design of airports. In general, runways should be built to avoid strong cross-winds.

A more common, but less detailed, method of obtaining frequency distributions of wind vectors is based on dividing the wind speeds into two or more groups. For example, we might define weak winds as those with speeds less than 10 mph, moderate winds as those from 10 to 20 mph, and strong winds as those of 20 mph and faster. Each group is assigned a line of different thickness. The length of each line section is made proportional to the probability of the particular speed group in the particular direction. These probabilities are easily determined from a two-way frequency distribution such as given in Table 3. The data of Table 3 can be put in the following form:
            S      SW     W      NW     N      NE     E      SE
weak       6.1%   8.1%   5.4%    0     3.8%   2.9%   4.1%   0.4%
moderate   0.5%   1.6%   5.0%   9.0%  10.1%   3.6%   5.8%   3.1%
strong     1.3%   1.8%   0.9%  22.1%   3.2%    0     0.7%   0.7%
The corresponding wind rose is given at the bottom of Figure 3. Instead of lines of unequal thickness, different hatching can be used. Again, the number of calms can be indicated in the center. Wind roses as shown at the bottom of Figure 3 show somewhat less detail than the frequency wind rose above; but the preponderance of strong NW winds, as well as the absence of light NW winds, is equally clear from this picture. Also, the secondary maximum for east winds shows up. The advantage of this type of wind rose is its objectivity.
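The conversion from a two-way tally to the percentage form above is a simple normalization. The following Python sketch is an illustration added here; the cell counts are hypothetical, not those of Table 3.

# Illustrative sketch only: converting a two-way tally of wind speed
# group vs. direction into per cent of all observations, as in the
# table above. The counts here are hypothetical.
counts = {                      # direction: (weak, moderate, strong)
    "N": (17, 45, 14), "NE": (13, 16, 0), "E": (18, 26, 3),
    "SE": (2, 14, 3), "S": (27, 2, 6), "SW": (36, 7, 8),
    "W": (24, 22, 4), "NW": (0, 40, 98),
}
total = sum(sum(triple) for triple in counts.values())

for direction, triple in counts.items():
    pct = [100 * c / total for c in triple]
    print(f"{direction:>2}: weak {pct[0]:4.1f}%  moderate {pct[1]:4.1f}%  "
          f"strong {pct[2]:4.1f}%")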
8. Averages. In order to summarize data further, we next try to determine the values of the variate that lie near the middle of the frequency distribution. Such central or average values can be used to describe the usual behavior of the variate. Three measures of central tendency are commonly used to summarize statistical information: means, medians and modes. The mean, obtained by addition of all the variables in a sample and division by the number of cases in the sample, is by far the most common. Yet, for some meteorological parameters, the mean does not constitute a "typical" value and is therefore of questionable use. An example of this is daily rainfall amounts. The mean daily rainfall at many places in the United States, for example, is between 0.10 and 0.20 inches, yet the most common amount of rainfall is much smaller; the large average is produced by relatively rare days of rainfall amounts greater than 1/2 inch. For other variables, such as pressure, wind speed and temperature, the mean is quite adequate, and is used almost exclusively.

Of the measures of central tendency, the mean is most affected by extreme observations, and is least useful for variables for which extreme deviations from "typical" values occur relatively frequently and occur in one direction only. The mode, defined as the most probable value of the variable, is not at all influenced by extremes; the median, the halfway point in the frequency distribution, is influenced by the number but not the value of extreme members of the frequency distribution.

A special kind of mean in meteorological usage is referred to as a "normal." This is usually the mean over many years for a given day of the year or a given month of the year.¹ Daily normals are smoothed by some process, because simple means have a tendency to show irregular oscillations which are, presumably, accidental. For example, a 30-year average temperature for March 4 may be lower than similar means for both March 3 and March 5. The smoothing can be accomplished, for example, by computing weekly means and plotting them at the mid-point of the corresponding week. Monthly normals usually need no smoothing of this type, because the number of individual observations averaged is extremely large.

¹ Not all meteorologists agree on the advisability of smoothing such "singularities," as they are called, believing them to be significant.

In the construction of normals, the attempt is made to average the data at neighboring stations over the same periods. If one station has a shorter record than stations in the neighborhood, or if some data are missing, the mean computed from this short record can be corrected by the use of the information in the environs. For example, if this short period was cooler than the complete period at the neighboring station, the result obtained from the short period at the new station is increased correspondingly.

It is not at all obvious that the best normal is the one computed
over the longest period of time. It is probably fair to say, at least for many practical purposes, that a normal should be the best estimate for the next value of the quantity averaged. For example, a normal mean annual temperature should be a good estimate of next year's mean temperature. Now, temperatures have been rising gradually over the last hundred years. Hence, a normal temperature based on a hundred years is considerably colder than most annual temperatures experienced in the middle of the twentieth century. A study by one of the authors has shown that a mean temperature over 15 years gives the best estimate for next year's mean, and should therefore be preferable to normals computed over longer periods. A similar result was found for precipitation.

Some frequently used means are averages over only two values. The best known quantities of this type are the mean daily temperature (usually an average between minimum and maximum) and the earth's mean distance from the sun (also an average between minimum and maximum, which also happens to be the semimajor axis of the earth's orbit).
9. Computation of the Mean. The most obvious way of computing a mean consists simply in adding all the data and dividing the sum by the number of data. In fact, this procedure is generally used when a computing machine is available.

When all the observations are close to a central value, the labor can be shortened somewhat. For example, all the pressures in a given sample may be between 990 mb and 1033 mb, or between 691 mb and 719 mb. In situations such as this, it is advantageous to choose an arbitrary origin, A, somewhere close to the data, preferably a round number such as 1000, or 700. The mean can then be computed by:

    X̄ = A + (Σδ)/N

where δ denotes the differences of the individual observations from the arbitrary origin. These differences often contain one or two fewer digits than the original values. The procedure outlined here is particularly useful in the case of pressure, density and potential temperature; the variation of all of these quantities is considerably smaller than their absolute value. Some statisticians prefer to choose A in such a way that all the deviations are positive; for example, they would have used 690 mb or 990 mb in the above examples.
When the mean need not be completely accurate, and a frequency distribution is available with equal class intervals, the so-called "short method" can be used. This method saves a great deal of time and eliminates the necessity of an adding or computing machine even with large samples, but is not completely precise. The formula for the mean is:

    X̄ = A + i(Σfd)/N
Here the summation extends over the different classes only, and therefore has much fewer terms than the original formula. i is the class interval, as before, and A the arbitrary origin, chosen at the class mark of a group, preferably the most populated group. d represents the quantity (Xc − A)/i, where Xc is the class mark of each group. The quantity d is either a positive or negative integer; it represents the number of groups a given group is distant from the group containing the arbitrary origin.
Table 4 gives an example of the calculation of the mean by the short method for the frequency distribution of temperature given in Table 1, No. 1. The arbitrary origin was chosen as the center of the third group, 25.95.
TABLE 4
Computation of the Mean by the Short Method

Class Interval, °F    Xc, °F     f      d     fd
16.0-19.9             17.95      3     −2     −6
20.0-23.9             21.95      9     −1     −9
24.0-27.9             25.95     12      0      0
28.0-31.9             29.95     15      1     15
32.0-35.9             33.95     12      2     24
36.0-39.9             37.95      8      3     24
                           N = 59        Σfd = 48

    X̄ = A + i(Σfd)/N = 25.95 + 4(48)/59 = 29.2°F
The short method gives the exactly correct answer only if the mean of each group is equal to the central value of each group. If all observations were situated at the upper or lower limit of each group, a maximum error of i/2 in the mean might ensue. To some extent, this error can be controlled by choosing the exact class limits before the frequency distribution is started. However, if the slope of the frequency polygon is large, the mean of each group does not lie at its center. For example, if the frequencies are increasing rapidly, the mean of the groups is above the center, and vice versa.
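The short-method computation of Table 4 can be verified mechanically. The following Python sketch is an illustration added here, reproducing the Table 4 arithmetic.

# Illustrative sketch only: the short method applied to Table 4.
# A = class mark of the third group, i = class interval.
A, i = 25.95, 4.0
f = [3, 9, 12, 15, 12, 8]
d = [-2, -1, 0, 1, 2, 3]          # groups away from the origin group

N = sum(f)
sum_fd = sum(fk * dk for fk, dk in zip(f, d))
mean = A + i * sum_fd / N

print(N, sum_fd, round(mean, 1))   # 59 48 29.2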
10. Evaluation of Median and Mode. The evaluation of the median follows from Section 4. It proceeds best by linear interpolation of a cumulative frequency distribution, or by reading the variate corresponding to the 50% ordinate from an ogive. For example, the median temperature as obtained from the ogive in Figure 2 is approximately 30°F. We could also have interpolated linearly from the data in Table 2:

    Median (CF = 50%) = 27.95° + (4/25)(50 − 41) = 29.4°F
The mode, the most probable value of the variate, can be defined strictly only by a limiting process: the mode is a maximum of a curve which results when more and more cases are added to the given sample and the class intervals are made smaller and smaller. This process presumes the existence of a hypothetical infinite population, out of which the given sample constitutes a sample drawn "at random." Actually, no such population can be defined when the sample consists of successive values of meteorological variables, so that this definition is somewhat arbitrary. The addition of later or earlier data will lead to smoother curves, but the original sample is not a random sample, but a very special one, taken from the large group. Sometimes the mode is defined as a maximum of a smooth r-urve which is drawn through the frequency distribution, either by eye, or by the assumption of a certain mathematical formula for the curve, which is then fitted by some mathematical procedure (see Chapter II). The mode is not only the absolute maximum of the limitii1g
curve; any point on this curve y = f(x) which satisfies the relations

dy/dx = 0,    d²y/dx² < 0

is called a mode. These conditions are satisfied for secondary maxima as well. Thus, we may have bimodal, trimodal, etc., distributions. It is not always easy to determine whether certain maxima in a frequency distribution should be counted as modes, or whether they should be ascribed to chance fluctuations.
Since the exact value of the modes can be determined only with some arbitrariness, it is usually sufficient just to designate the modal interval or intervals. As a matter of fact, the use of the mode in meteorology is not common excepting in the case of wind direction.
11. Statistics Versus Parameters.

Quantities like the mean, and many others to be taken up later, are computed from a sample of data, small or large. Such quantities are themselves called "statistics." Now suppose we had all data of a certain type, for example, all 7 a.m. temperatures at New York in January. The total of all such temperatures we call the "population." Now, whereas the characteristics of a sample are called "statistics," the corresponding characteristics of the "population" are called "parameters." Normally, only statistics are known, and parameters must be estimated.
It has become the custom to denote "statistics" by Roman letters, and parameters by Greek letters. For example, a sample mean is usually denoted by X̄, or perhaps by M. The population mean is denoted by μ. Similarly, the standard deviation of a sample, to be defined later, is written s. The corresponding parameter is denoted by σ.
12. Wind Averages.

Again, wind has to be treated separately because two components are necessary to define it completely. Both a mean and a mode are in common use. The mean wind or resultant wind is defined in analogy with an ordinary mean: V̄ = ΣV⃗/N, where the arrow indicates a horizontal vector. The common designation
"resultant" wind for this quantity is not quite correct according to usage in mechanics; this term should be reserved for ΣV⃗. The mean wind is normally computed by components. The components of the mean wind along any orthogonal directions are equal to the simple means of the individual wind components along the same directions. The directions along which the components are taken are most commonly chosen as the west-east and south-north directions. When the wind is given to 8 points only, the computation becomes particularly simple. If Rx is the component of the resultant wind in the west-east direction, and Ry the component in the south-north direction, these components are given by:
Rx = [ΣW − ΣE + .707 (ΣSW + ΣNW) − .707 (ΣSE + ΣNE)] / N

Ry = [ΣS − ΣN + .707 (ΣSW + ΣSE) − .707 (ΣNE + ΣNW)] / N
where N in the denominators is the number of observations. The speed of the resultant wind equals the square root of the sum of the squares of the components. The direction can be determined from the condition that its tangent is Rx/Ry. In these equations, W stands for the speed of each individual west wind, SE for the magnitude of each individual southeast wind, and so forth. The summations extend over the individual observations and, hence, computation is tedious. If a frequency wind rose is available, the same expressions may be modified by writing ΣfE for ΣE, ΣfSW for ΣSW, etc., where now the summations extend only over the groups corresponding to the particular wind direction. Application of this technique to the frequency wind rose plotted in Figure 3, for example, leads to the result: ΣfS = 474 mph, ΣfSW = 731 mph, ΣfW = 915 mph, ΣfNW = 3075 mph, ΣfN = 1100 mph, ΣfNE = 382 mph, ΣfE = 562 mph, and ΣfSE = 296 mph.
Hence, Rx = 5.76 mph and Ry = −5.26 mph, and the resultant wind blows from 313° at 7.8 mph. Note that the total number of winds in the denominator includes calms.
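A minimal Python sketch of the whole resultant-wind computation (not part of the original text). The wind-rose sums are those quoted above; the total count N = 445, including calms, is an assumption chosen because it reproduces the quoted results:

    import math

    sums = {"N": 1100, "NE": 382, "E": 562, "SE": 296,
            "S": 474, "SW": 731, "W": 915, "NW": 3075}   # sums of f*speed, mph
    N = 445                                              # assumed observation count

    rx = (sums["W"] - sums["E"]
          + .707 * (sums["SW"] + sums["NW"])
          - .707 * (sums["SE"] + sums["NE"])) / N        # ~ 5.76 mph
    ry = (sums["S"] - sums["N"]
          + .707 * (sums["SW"] + sums["SE"])
          - .707 * (sums["NE"] + sums["NW"])) / N        # ~ -5.26 mph

    speed = math.hypot(rx, ry)                           # ~ 7.8 mph
    # Direction from which the wind blows, degrees clockwise from north:
    direction = math.degrees(math.atan2(-rx, -ry)) % 360 # ~ 313 degrees

    mean_speed = sum(sums.values()) / N                  # ~ 17 mph
    persistence = speed / mean_speed                     # ~ 0.46, the "persistence" defined below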
It is clear that the speed of the resultant wind is smaller than the average speed of the individual winds. In particular, if the wind blows half the time from one direction and half the time from the other, or if it is equally likely to blow from all directions with equal strength, the resultant wind speed is zero.
t"]1 ;..
. '.
-:
I
I
I.
l :
!
.,
••
i ~.
I;
.. tl ; .
j ':
,..,
~
I 1~ ' I :
~ .' .• I q
I
1
Resultant winds are commonly, used in-theoretical work. ;'i For example, the mean divergence in an area equq.ls the·divergence. of. the ~resultant wind in that area. • Also,.· the geostrophic wind computed from mean maps is a resultant ·wind,! iand could. be compared with observed resultant•wind.; :, ';,; .. ,.,;• i · ,· :,,,,;r;; -1
The prevailing wind direction, on the other hand, is of more interest to climatologists; it is the most probable wind direction or the modal wind direction. Just as there are bimodal frequency distributions of such quantities as temperature, there may be several prevailing winds; the only condition for a prevailing wind direction is that it is more common than the wind directions on either side. For example, due to the passage of a cold front, the prevailing wind direction in the first part of a day may be SW, and in the second part NW; or a summary for January may show that both north and south are more common than any other directions, perhaps due to the effect of a north-south mountain range.

In records in which wind direction is given to 16 points, there is a certain reluctance of observers to record intermediate directions, such as NNW or WSW. As a result, there appear frequency maxima in the wind rose in all principal directions. Clearly, these cannot all be treated as prevailing directions.
Together with the prevailing wind direction, the mean speed of the wind, regardless of direction, is usually given. It is computed in the same way as the mean of any quantity, either by adding all the speeds and dividing by the number of observations, or by the short method. Clearly, the mean speed of the wind rather than the speed of the resultant wind determines mean evaporation, the cooling power, or the mean strength of surface-layer turbulence, because these phenomena are concerned only with the strength of the wind, not the direction from which it blows.
Climatologists have found it convenient to define "persistence" by the ratio:
P = Speed of the Resultant Wind / Mean Wind Speed
If the wind always blows from the same direction, the persistence is 1; if it is equally likely from all directions, or blows half the time from one direction and half the time from the opposite, it is 0. It is clear from these properties why the quantity P measures the persistence. In the example above, the mean wind speed comes out to 17 mph, and the persistence 47%. In general, a statement of the resultant wind or the prevailing wind direction without a statement of the persistence of that wind means very little.
13. Degree Days. In the case of temperature data, the number of degree days has been widely used as a figure summarizing considerable information. The number of degree days is given by:

N = Σ (65 − t)
where t is the mean of minimum and maximum temperature on specific days in degrees Fahrenheit, and the summation extends over all days with temperatures under 65°F. For example, if the mean daily temperatures in a week are 50°, 70°, 60°, 55°, 60°, 62°, 50°, the day with an average temperature of 70° would be omitted in the calculation. The total number of degree days would then be 53. The usefulness of degree days is due to the fact that the number of degree days is proportional to the amount of fuel needed for home heating in a given period, neglecting, of course, the effect of wind. Thus, for example, fuel dealers supply their customers with additional fuel at regular degree day intervals. The colder the weather, the shorter the period corresponding to a given number of degree days. Note that the number of degree days cannot be computed from mean temperature, since temperatures above 65° may enter the mean. As another example of the use of degree days, consider the problem of determining whether the installation of insulation in a given home has cut fuel consumption. It may be necessary to compare the consumption in a mild winter with that in a hard winter. Then, the fuel consumption in each season, divided by the number of degree days in the season, takes into account any change in the kind of winter; and two such ratios in different winters can be compared to show whether the fuel consumption has actually been reduced.
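The degree-day bookkeeping for the quoted week is a one-line calculation. A minimal sketch (hypothetical code, not from the text):

    # Degree days for a week of mean daily temperatures; days at or above
    # 65 deg F contribute nothing.
    daily_means = [50, 70, 60, 55, 60, 62, 50]
    degree_days = sum(65 - t for t in daily_means if t < 65)   # 15+5+10+5+3+15 = 53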
14. Measures of Variability.
In order to characterize a meteorological variable, an average is often not sufficient. For example, the annual average temperatures at San Francisco and New York are nearly the same; yet the climate at the two places is quite different due to the much greater variability of temperature at New York. As another example, a contractor would like to base his bid on the number of days lost due to rain. But the mean number of days lost in a certain season is not sufficient information on which to calculate the risk; the variation of rain days from one year to the next is equally important.
The simplest measure of variability is the range, the difference between the largest and smallest value in a set of observations. For example, the diurnal range of temperature is often used as a measure of variability. The magnitude of the range depends in a systematic fashion on the number of observations analyzed; for example, the range of minimum temperatures in a long record will generally exceed that in a short one, so that the range must be corrected for the number of observations in order to arrive at a valid measure of variability. Such corrections have been provided for some frequency distributions used in actual practice.
Another objection to the use of the range is its instability. Since it is based on two observations only, there are large variations from sample to sample. Thus, one January may have a range in the minimum temperatures of 20°F, another of 30°F. However, if the range is defined as the difference between certain types of averages, this difficulty largely disappears. For example, the annual "range" of temperature, defined as the difference between the mean temperature of the warmest month and the mean temperature of the coldest month, is often used to describe the annual variability of temperature. Similar measures are used for other variables.
Of the measures of variability which have no large systematic dependence on the length of the record, the mean deviation and the standard deviation are the most important. To be precise, these two quantities depend to a slight extent on the size of the sample, being proportional to √((N − 1)/N), even if the "population" stays the same, that is, if there are no systematic trends. This condition is not actually satisfied, so that even the mean and standard deviation of meteorological data vary in an irregular fashion with the length of the record.
15. The Mean Deviation. The mean or average deviation is defined by:

AD = Σ|x| / N

where x = X − X̄. The main advantage of the mean deviation is the ease with which it can be computed. It can best be evaluated from the expression defining it. The chief disadvantage of the mean deviation is the absolute value sign in its definition, which makes mathematical development difficult. Thus, for example, no "short" method produces much saving of labor. Another difficulty is that the average deviation is a "poor" estimate of variability. More observations are required to estimate it reliably than some other measures of variability, for instance, the standard deviation.
16. The Standard Deviation. The standard deviation is defined by:¹

s = √(Σx² / N)

Even though it is more difficult to compute from its definition than the mean deviation, several short cuts can be used so that, in practice, its evaluation takes less time than that of the mean deviation if extreme accuracy is not required. Since the standard deviation lends itself easily to further theoretical treatment, it is used extensively in the theory and practice of correlation, significance, turbulence, etc. Compared to the mean deviation, the standard deviation places greater emphasis on large deviations from the mean; the value of the standard deviation is affected very little if data near the mean are changed.

¹ For reasons explained in the following section, some authors give this as s = √(Σx²/(N − 1)).

For the purpose of computation with the aid of a computing machine, the expression defining the standard deviation can be modified to:

s = √(ΣX²/N − (X̄)²)

(This equation is proved in the Appendix.) Since the standard deviation measures the degree of variability or "dispersion" of a variate, it is independent of the position of the zero point of that variate. (This statement is also proved in the Appendix.) Hence, in order to avoid large numbers in the computation, an arbitrary origin A may be used:
s = √( Σ(X − A)²/N − (Σ(X − A)/N)² )
For greatest ease of computation, A is usually chosen so as to make X − A as small as possible, yet positive for the whole range of X.
If complete accuracy is not required, the standard deviation can be obtained by the short method, which starts from an equal-interval frequency distribution. The short method formula for the standard deviation is:

s = i √( Σfd²/N − (Σfd/N)² )

where, as before, d = (Xc − A)/i, and the summation extends over the different groups.
The short method assumes that all the observations in a given class interval are located at the center of this interval. However, in many distributions, the greatest frequency is near the center of the distribution; therefore, in a given class below the center of the distribution, the observations tend to congregate at the upper end of the class, whereas the observations above the center tend to lie on the low side of each class. Therefore, the short method will systematically overestimate the standard deviation in such distributions. This effect can be compensated for by Sheppard's correction:

s = √(s₀² − i²/12)

Here s₀ is the observed standard deviation, i the size of the class interval, and s the corrected standard deviation. Table 5 shows the computation of s by the short method from the data of Table 3. Note that Σfd occurs both in the computations of the mean and the standard deviation.
TABLE 5
Short Method Computation of X̄ and s

X                 f     d     fd    fd²
16.0 - 19.9       3    -2     -6     12
20.0 - 23.9       9    -1     -9      9
24.0 - 27.9      12     0      0      0
28.0 - 31.9      15     1     15     15
32.0 - 35.9      12     2     24     48
36.0 - 39.9       8     3     24     72
                 59           48    156

X̄ = 25.95 + 4 × (48/59) = 29.2 (as in Table 4)

s = 4 √( 156/59 − (48/59)² ) = 5.64
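A minimal sketch of the short-method standard deviation from the Table 5 columns, with Sheppard's correction applied afterward (the uncorrected value matches the table's 5.64 up to rounding):

    import math

    freqs = [3, 9, 12, 15, 12, 8]
    d     = [-2, -1, 0, 1, 2, 3]
    i     = 4.0
    N     = sum(freqs)                                     # 59

    sum_fd  = sum(f * dd for f, dd in zip(freqs, d))       # 48
    sum_fd2 = sum(f * dd * dd for f, dd in zip(freqs, d))  # 156

    s0 = i * math.sqrt(sum_fd2 / N - (sum_fd / N) ** 2)    # ~ 5.63 (5.64 in Table 5)
    s  = math.sqrt(s0 ** 2 - i ** 2 / 12)                  # Sheppard's correction: ~ 5.52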
As a check on the computation of s, approximately ⅔ of the observations should lie between X̄ + s and X̄ − s. In the example above, about 39 cases should lie between 23.6° and 34.8°; this is approximately correct, as can be judged by the frequency distribution. The statistic .67449 s is called the probable error, PE. If the frequency distribution were "normal" (see next chapter), exactly half the observations would lie between X̄ + PE and X̄ − PE.
17. Variance. The square of the standard deviation is known as the variance. Thus, if σ is the standard deviation of the parent population, the variance σ² of the population is given by:

σ² = Σ(X − μ)² / N
where μ is the mean of the population and the sum extends over the whole population. Since the mean μ is not known, the sample mean X̄ is substituted in the formula above. However, this produces a slight bias in our estimate of the population variance. An unbiased estimate σ̂² of the population variance is given by:

σ̂² = Σ(X − X̄)²/(N − 1) = Σx²/(N − 1) = Ns²/(N − 1)

where s is the standard deviation computed from the sample as defined in the previous section. The caret (^) is used over σ to emphasize that σ̂², as computed from the sample, is only an estimate of the true variance and not the actual population variance.
The general concept of variance is important in connection with methods that have been developed to analyze the factors which produce the variance of a variate. These methods will be described later.
18. Skewness.
A frequency distribution is said to be positively skew if the mean is greater than the mode, negatively skew if the mean is less than the mode. A coefficient of skewness should be dimensionless, and can be defined by:

sk = (X̄ − Mode) / s
Since the mode is difficult to estimate, and approximately X̄ − Mode = 3 (X̄ − Median), a more convenient form is:

sk = 3 (X̄ − Median) / s
The quantity Σx³ also has the correct sign for a coefficient of skewness. Therefore, another coefficient of skewness has also been defined as:

sk = Σx³ / (N s³)
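The three skewness measures can be computed side by side. A minimal sketch (assuming the population form of s, with N in the denominator, as in Section 16):

    import statistics as st

    def skewness_coefficients(xs, mode=None):
        # Median-based and moment-based skewness; mode-based if a mode is supplied.
        n = len(xs)
        mean = st.mean(xs)
        s = st.pstdev(xs)            # standard deviation with N in the denominator
        sk_median = 3 * (mean - st.median(xs)) / s
        sk_moment = sum((x - mean) ** 3 for x in xs) / (n * s ** 3)
        sk_mode = (mean - mode) / s if mode is not None else None
        return sk_mode, sk_median, sk_moment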
FIGURE 4
Frequency Distribution of Wind Speed at La Guardia Field, 1932-1947
(histogram of frequency f against wind speed in knots, in 5-knot classes from 0-4.9 to 40-44.9)
Figure 4 shows a typical positively skew frequency distribution, that of wind speed. The mean wind speed is usually greater than the most probable wind speed, due to the influence of the relatively few wind speeds of large magnitude. Figure 4 shows the distribution of wind speed at New York, March 1932-47 (1934 omitted). The mean wind speed is 16.7 mph; the mode can be estimated as 12.7 mph. Frequency distributions will generally be skew when there is a physical cut-off close to the observed range of the observations. In the case of wind speed, for example, the speed cannot be negative; hence, wind speed distributions are positively skew. Similarly, rain cannot be negative; hence, frequency distributions of daily rainfall are extremely skew. However, annual rainfall amounts in reasonably humid climates are fairly symmetrical; this is because the cut-off of zero rainfall is far outside the range of existing observations.
19. Kurtosis. Two frequency distributions may have the same mean, dispersion, and skewness, but may differ in "kurtosis." One distribution may have relatively few cases near the center, so that the histogram appears flat (low kurtosis), or most of the observations may lie near the center (high kurtosis). The quantity Σx⁴/(N s⁴) is proportional to kurtosis, and is defined as the coefficient of kurtosis.
A case of particularly low kurtosis might be the frequency distribution of cloudiness. This might be a symmetric distribution, yet with most observations far from the center, since skies tend to be either nearly clear or nearly overcast.

Suggestions for Additional Reading.

On Frequency Distributions, especially of winds: Brooks, C. E. P. and Carruthers, N., Handbook of Statistical Methods in Meteorology, Chapters 3 and 11, Her Majesty's Stationery Office, London, 1953.

On Probability: Uspensky, J. V., Introduction to Mathematical Probability, McGraw-Hill Book Co., Inc., New York, 1937.
CHAPTER II

Theoretical Frequency Distributions
1. Introduction. In Chapter I, the probability density histogram was mentioned. This histogram was based on the variation of the quantity f/Ni as a function of the variate, say, X. Each box in the histogram had a height of f/Ni and a width of i. The entire area of this histogram was Σ if/Ni = 1. Now suppose that more and more data are added. Then, as N increases, the class interval can be made successively smaller without affecting the area. As N approaches infinity (that is, we eventually include all possible data of the same general type), and as i approaches zero, the ratio f/Ni tends to p, a finite limit, provided that the area has been held constant and equal to 1. Graphically, this means that, as the width of the boxes in the histogram approaches zero, the step-like top of the histogram becomes a smooth curve. This curve is called the "probability curve." It is the limit of the frequency distribution of a variate as N is increased indefinitely. The quantity p is called the probability density.
The question, "What is the probability of finding an exact value X?", has no meaning. If we ask, "What is the probability of the temperature being 32°F?", we really mean, "What is the probability of finding 32°F within the error of measurement?", perhaps that the temperature falls between 31.9°F and 32.1°F. Probability is not given by the height of the ordinate at a point, but by an area under the curve between values of the variate. Mathematically, the probability that the variate X lies between X₁ and X₂ is:

P = ∫ from X₁ to X₂ of p dX
Quite frequently, it is desirable to fit the probability histogram, or the probability curve, by a mathematical equation. Of these, the most widely applied is the equation describing the "normal" distribution, which fits a large variety of probability distributions found in actual practice.
The reason for its general usefulness can best be understood by a prior study of the binomial distribution which, incidentally, has many important statistical applications in its own right.
2. The Binomial Distribution. Consider N coins, biased in such a way that the probability of a head turning up is p. The binomial distribution gives the probability that m heads and N − m tails will show up when all coins are spun independently. The binomial distribution can be applied to all situations where two alternatives are possible. First we shall discuss the simple case when there are 10 coins and p = 0.5 (unbiased coins). The probability of each combination of heads and tails is given by the number of ways in which that particular combination can occur, divided by the total number of all combinations. Since each coin may show either heads or tails, the total number of combinations is 2¹⁰ = 1024. The complete distribution is given in Table 6.
The general formula for the probability of m heads in the case of N biased coins is:

p(m) = [N! / (m! (N − m)!)] pᵐ (1 − p)^(N−m)
Note that, in this case, there is no difference between probability and probability density, since the class interval is 1. The mean of the binomial distribution is Np. For example, if the probability of a head is .8, and 10 coins are spun, the mean of the distribution will be at 8 heads and 2 tails. The symmetry
TABLE 6
Binomial Distribution for p = ½, N = 10

Number of Heads    Probability
0                  1/1024
1                  10/1024
2                  45/1024
3                  120/1024
4                  210/1024
5                  252/1024
6                  210/1024
7                  120/1024
8                  45/1024
9                  10/1024
10                 1/1024
of the distribution exists only for unbiased coins, i.e., when p = 1 − p = ½. The standard deviation of the binomial distribution is √(N p (1 − p)).
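Table 6 and the general formula are easy to reproduce. A minimal sketch using Python's exact binomial coefficients:

    from math import comb

    def binomial_pmf(m, n=10, p=0.5):
        # Probability of m heads among n independent coins, head probability p.
        return comb(n, m) * p ** m * (1 - p) ** (n - m)

    # Reproduces Table 6: binomial_pmf(5) == 252/1024, binomial_pmf(0) == 1/1024.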
As mentioned before, the binomial distribution is quite useful before passage to the limit. It is applicable whenever statements concerning two alternatives only are required. One type of problem comes up frequently in research. The question is to determine qualitatively the influence of one variable on another. Then, the cases are often divided into two groups: those for which the influence seems to be in one direction, and those where the influence seems to be in the other. If there were no relation at all, the groups should be of equal size. If the groups are of unequal size, the binomial distribution gives a clue as to the probability that the two types of influences are really equally probable, and that the different size of the observed groups is just due to random variation between experiments. If the probability is relatively high, there is a good chance that the indicated relation is not real. This problem will be discussed more fully in Chapter III. Another application of the binomial distribution arises when
we are interested in only two types of meteorological phenomena,
for example, rain and no rain, cloudy and clear, temperatures below freezing and temperatures above freezing. In these cases we can often determine the "bias," p, from the past record and make statements concerning probability of occurrence. For example, the records at a town in Florida might indicate that in
July thunderstorms occurred on the average every third day. Then p = ⅓. Assuming that the occurrence of a thunderstorm on a given day is independent of whether a thunderstorm occurred the previous day, we can determine the probability that, for example, there will be a thunderstorm every day for a week, (⅓)⁷; or that there will be no thunderstorms for a week, (⅔)⁷; or the probability of any other number of thunderstorms.
3. The Normal Distribution.
The binomial distribution as defined in the previous section will spread out indefinitely as the number of coins is increased. In order to prevent the spreading out, the standard deviation has to be held constant during the limiting process. This can be accomplished by the change of variables:

τ = (X − X̄) / σ
Here, σ is used for the standard deviation instead of s, because we are dealing with all possible data. As N is increased indefinitely, the probability that an observation lies in the interval dτ is:

p dτ = (1/√(2π)) e^(−τ²/2) dτ
This formula is normalized, that is, the total area under the p curve is unity. Hence, areas under the p curve can be identified with probabilities.
In general, we are interested in the probability that an observation is located between X and X + dX. Hence, we have to change variables again and obtain:

p dX = [1/(σ√(2π))] e^(−(X − X̄)²/(2σ²)) dX
Here p is the probability density, that is, the probability per unit X-interval. Since the distribution as a function of τ depends neither on the mean nor the standard deviation of the particular population, it can be tabulated and used for all cases where the use of the normal distribution is appropriate. Quite in general, τ is a convenient variable to use; it measures the number of standard deviations by which an observation at point X exceeds the mean. For example, a point for which τ = 1.6 is located 1.6σ above the mean. A binomial distribution approaches a normal distribution most rapidly when p = ½. In that case, the distribution can be taken as normal when N > 25. For an asymmetric distribution, Np(1 − p) should exceed 9 before a normal distribution fits well.
From the manner of the derivation of the normal distribution, one might expect that a variate is distributed normally if its variation may be considered to be caused by an infinite number of independent small influences superimposed on the mean, which are equally likely to add to or subtract from the variate. On the average, the negative and positive influences cancel, so that the mean is also the most probable value; the probability that a large majority of these influences are of the same sign is small, and hence the likelihood of large deviations from the mean is small. The distribution of errors of measurement is often normal, because the various factors producing the error may be considered independent. The distributions of certain meteorological variables, such as temperature, pressure, etc., also seem to be nearly normal at many places. This is somewhat surprising, since the factors influencing the variation of meteorological variables are not independent. For example, a certain factor may reduce the temperature of the air below zero centigrade; this causes precipitation to fall as snow which, in turn, produces further cooling by its high albedo. If this type of process were important, one would expect to find relatively more cases far from the mean than are expected if the distribution were normal. The opposite type of interdependence may also be illustrated.
A factor may cause warming of air; this raises the evaporation and the cloudiness, which in turn produces a lower temperature. This kind of dependence, which might be called a "compensatory" dependence, would produce a larger fraction of cases near the normal than would be expected from the normal distribution. Hence, there are both "cumulative" and "compensatory" relations between the various factors influencing the variation of meteorological data. In other words, feedback may be positive or negative. Since temperature and pressure are often distributed normally, these two kinds of feedback must be about equally important. Figure 5 shows the "normal distribution." Certain properties of the normal distribution should be remembered:
FIGURE 5
The Normal Distribution
(the curve p = (1/√(2π)) e^(−τ²/2) plotted against τ, centered on the mean μ)
1. The distance from the mean to the point of inflection equals the standard deviation, σ.

2. 68% of the area lies between X̄ + σ and X̄ − σ.

3. The probable error, defined by the condition that half the area under the normal curve lies between the limits X̄ + PE and X̄ − PE, is equal to .67449 σ. The probable error is not used much in modern statistical practice. (Note that the numerical values occurring in items 2 and 3 are both accidentally near ⅔.)

4. 95% of the area under the curve is enclosed by approximately X̄ + 2σ and X̄ − 2σ, or between X̄ + 3 PE and X̄ − 3 PE. In other words, the probability that τ is either greater than 2 or less than −2 is about 5%.

5. The error function, Erf(τ), is defined by:

Erf(τ) = (1/√(2π)) ∫ from 0 to τ of e^(−τ²/2) dτ
TABLE 7
Table of the Error Function Defined by Erf(τ) = (1/√(2π)) ∫ from 0 to τ of e^(−τ²/2) dτ

τ      Erf(τ)      τ      Erf(τ)      τ      Erf(τ)
0      .000        1.0    .341        2.0    .477
.1     .040        1.1    .364        2.1    .482
.2     .079        1.2    .385        2.2    .486
.3     .118        1.3    .403        2.3    .4893
.4     .155        1.4    .419        2.4    .4918
.5     .192        1.5    .433        2.5    .4938
.6     .226        1.6    .445        2.7    .4965
.7     .258        1.7    .455        3.0    .49865
.8     .288        1.8    .464        3.5    .49977
.9     .316        1.9    .471        4.0    .49997
1.0    .341        2.0    .477
This table permits computation of the probabilities of the occurrence of arbitrary ranges of τ. For example, as seen from the table, the probability is 34% that τ will lie between 0 and 1; hence, on account of symmetry, the probability is 68% that τ will lie between −1 and 1, a fact stated above in the form that X lies between X̄ + σ and X̄ − σ. Or the probability of τ being greater than 4 is .5000 − .49997 = .003%.
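The book's Erf(τ) is the area under the standard normal curve from 0 to τ, which differs from the mathematical error function only by a change of scale. A minimal sketch of the relation:

    from math import erf, sqrt

    def erf_table(tau):
        # The tabulated Erf(tau) equals erf(tau / sqrt(2)) / 2.
        return erf(tau / sqrt(2)) / 2

    # erf_table(1.0) -> 0.3413..., erf_table(2.0) -> 0.4772... (compare Table 7).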
If a frequency distribution gives the appearance of a normal distribution, it may be desired to plot the particular normal curve best fitting the sample. The "best fitting" normal distribution is commonly defined as that distribution which has the same area, mean, and standard deviation as the sample. Since the area of the sample is Ni, and the area under the theoretical curve is unity, the theoretical function has to be multiplied by Ni in order to have the same area as the sample:

Ni p(X) = [Ni/(σ√(2π))] e^(−(X − X̄)²/(2σ²))

The mean and standard deviation of the curve and the sample will be the same provided that X̄ in the above formula is computed from the sample, and that σ = s√(N/(N − 1)).
In order to fit a normal curve to a sample, the following procedure may thus be used: at the center of each class interval, τ is computed from τ = (X − X̄)/σ. The normal curve gives e^(−τ²/2) as a function of τ (see Table 8). This result has to be multiplied by Ni/(σ√(2π)), and can then be plotted as ordinate as a function of X.
TABLE 8
Ordinates of the Normal Curve, Y(τ) = e^(−τ²/2)

τ      Y(τ)       τ      Y(τ)
0      1.000      1.0    .607
.1     .995       1.1    .546
.2     .980       1.2    .487
.3     .956       1.3    .430
.4     .923       1.4    .375
.5     .882       1.5    .325
.6     .835       1.6    .278
.7     .783       1.7    .236
.8     .726       1.8    .198
.9     .667       1.9    .164
1.0    .607       2.0    .135
The fitted curve may be used to estimate the probabilities of unusual conditions. This is, however, an uncertain procedure, for a frequency distribution may be nearly normal in the observed range, but deviate strongly from the "normal" far from the mean.
The main importance of the normal distribution is that it is fundamental in the further development of many statistical methods. It is, therefore, important to remember that distributions of certain meteorological variables are definitely not normal, such as rainfall amounts per day, wind speeds, and cloud amounts.
4. The Poisson Distribution.

The Poisson distribution is the limit of the binomial distribution when p, the probability of an event, is small, provided the expected number of events is constant. It gives the probability
p(x) that 0, 1, 2, ..., x unlikely events will occur in a given period. The formula for the probability p(x) that a rare event will occur x times in a given period is:

p(x) = μˣ e^(−μ) / x!

where x must be an integer and μ is the expected number of events in the period. It turns out that μ is both the mean and the variance of the frequency distribution and is estimated from the sample of data by computing X̄, the average number of times the event occurred in periods of the same length in the past. μ (or X̄) is assumed to be of order 1. For example, an average of 2 days of rain in the summer have been reported at Los Angeles. What is the probability that 1 case of rain will occur this summer? In this case, μ = 2, x = 1, and the probability is only 27%, exactly the same as the probability for rain to occur twice. The complete probability distribution in this case is given below. (Remember that any number to the zero power is one, and 0! = 1.)
Probability for Number of Events When an Average of Two Events Occurred Previously in Same Period

No. of Occurrences    Probability, %
0                     14
1                     27
2                     27
3                     18
4                     9
5                     3
6                     1
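A minimal sketch of the Poisson probabilities behind this table (μ = 2). The exact values are 13.5, 27.1, 27.1, 18.0, 9.0, 3.6, and 1.2 per cent; the table's entry for 5 occurrences rounds down where the code below rounds up:

    from math import exp, factorial

    def poisson_pmf(x, mu=2.0):
        # Probability of exactly x rare events when mu are expected.
        return mu ** x * exp(-mu) / factorial(x)

    probs = [round(100 * poisson_pmf(x)) for x in range(7)]   # [14, 27, 27, 18, 9, 4, 1]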
The Poisson distribution was derived under the assumption that the events occurred independently of each other and, of course, that the probability did not change with time. The second of these has been discussed already. In general, the condition of independence is rarely satisfied for meteorological data. If a synoptic "regime" is favorable for an unlikely event to occur (such as a very low temperature), the regime often persists long enough for the same type of event to occur again.
5. Transformations. It can be shown that any function of a variate or random variable is again a variate. From this it follows that any frequency
distribution can be transformed into any required form by a suitable transformation or functional relationship. Such transformations are usually made to overcome some difficulty in handling the original data, or for some physical reason. An example of the first would arise where the data follow some complicated skewed distribution, and further analysis must be performed, the theory for which is only known for the normal distribution. The original data may then be transformed to a normal distribution and analysis continued on the basis of known theory. An example of the second situation is the physical principle that if the velocities of parcels of equal mass are distributed in a normal distribution, their kinetic energies are distributed in a gamma distribution. Another physical example would be the transformation of a distribution of temperature to a distribution of saturated vapor pressure. This would be accomplished using the vapor pressure tables or by a functional relationship between saturated vapor pressure and temperature. We shall only discuss the first type of transformation, namely, to transform data from one distribution to another of known form or to a form which has certain prescribed properties. This is to be done without knowing the mathematical form of the transformation.

The essential feature of the transformation of a variate from one distribution to a variate with a distribution of prescribed form is that the probability of being less than a given value of the variate shall be the same as the probability of being less than the corresponding value of the transformed variate. This, then, is an equiprobability transformation which can be accomplished most readily by employing the cumulative distribution or ogive discussed previously. The method is best illustrated by Figure 6.
FIGURE 6
(Equiprobability transformation: the ogive of the precipitation variate X on the left and the standard normal ogive on the t(X) scale on the right, both referred to a common cumulative probability scale from 0 to 1.0; the broken line is the empirical ogive, the smooth line the fitted gamma ogive.)
We illustrate the transformation of the variate X, which is precipitation, to t(X), which has a normal distribution with zero mean and unit standard deviation. The procedure is to draw the cumulative distribution or ogive of the variate X referred to the probability scale on the left and shown on the left in Figure 6. This may be done using probability estimates from the sample data or by using the statistics of an assumed distribution obtained from the sample. The ogive used here is that based on probability estimates obtained directly from the sample and is shown as the broken line on the left in Figure 6. On the right-hand side of the figure the standard normal distribution is plotted on the scale t(X) and on the same probability scale used for X. If we now choose any value X₁, say from a sample of X's, enter the X-scale at X₁, proceed vertically to the X-ogive, then horizontally at the same probability to the t(X) ogive, and finally vertically again to the t(X) scale, we find the value t(X₁). This is the normally transformed value of X₁. Other values from a sample may be transformed in the same way. Hence, in general, a sample from the distribution F(X) may be transformed to a sample from the distribution G[t(X)]. It may be noted that although the transforming function t is not known, the transformation may be made as exact as desired if the mathematical forms of the two cumulative distributions are known and probability tables are available. If tables for both distributions are available, the transformation can be performed without the aid of graphs. A special example of this type will now be discussed.
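The graphical procedure of Figure 6 can be carried out numerically with an inverse-normal function. A minimal sketch (a hypothetical helper, not from the text), using the empirical plotting positions m/(N + 1) described in the Harrisburg example below; the t(X) column of Table 9 instead uses the fitted gamma probabilities, so its values differ slightly:

    from statistics import NormalDist

    def empirical_normal_scores(sample):
        # Equiprobability transform: rank each value, estimate its cumulative
        # probability as m/(N + 1), and read off the standard normal variate.
        n = len(sample)
        nd = NormalDist()        # mean 0, standard deviation 1
        return [(x, nd.inv_cdf(m / (n + 1)))
                for m, x in enumerate(sorted(sample), start=1)]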
The gamma distribution has been found useful in fitting and transforming distributions of various meteorological variates which are restricted to positive values, such as precipitation, vapor pressure, evaporation, and precipitable water. The gamma distribution is defined by the frequency function:

f(X) = [1/(β^γ Γ(γ))] X^(γ−1) e^(−X/β)        (β > 0, γ > 0)

where β and γ are parameters, and Γ(γ) is the gamma function of γ and is also equal to (γ − 1)!. If we define A by

(1)    A = ln X̄ − (1/N) Σ ln X

(ln X is the natural logarithm of X), optimal estimates of γ and β are given respectively by:
(2)    γ̂ = [1 + √(1 + 4A/3)] / (4A)

(3)    β̂ = X̄ / γ̂
The mean of the gamma distribution is βγ and its standard deviation is β√γ. Pearson's "Tables of the Incomplete Γ-function"¹ give the probabilities for various values of p = γ − 1 and u = X/(β√γ). We may now give an example of the transformation of a gamma variate to normality.
The problem is to transform the Harrisburg April precipitation data for the period 1921-1950 to normality. The first step in the procedure is to fit the gamma distribution to the data series. Table 9 will facilitate the computations. In order to compare the fitted ogive with the empirical ogive, we first obtain the latter. This is done by tabulating the data in increasing order of magnitude, so that the mth value is m − 1 values from the lowest.² The optimal estimates of the empirical
cumulative probabilities turn out to be m/(N + 1), where N is the sample size, or 30 in this instance. These probabilities are then plotted against the precipitation values to obtain the broken-line empirical ogive of Figure 6. This ogive has already served to illustrate the empirical transformation discussed above. To obtain the gamma distribution it is only necessary to tabulate the natural logarithms shown in Table 9. The average of these is 0.9967, and ln X̄ = 1.1195. Substituting these into equation (1) gives A = 0.1228. Substituting A into equation (2) gives the optimal estimate γ̂ = 4.27. Finally, substituting γ̂ in equation (3) gives the optimal estimate β̂ = 0.717. These may now be employed with Pearson's Tables to obtain the fitted ogive.
Since Pearson's p = γ − 1, we may tabulate the probability F for a few conveniently placed values of u as in Table 10, by first determining X from the u's by X = u(β√γ). Drawing a smooth curve through the plotted values of X and F gives

¹ Reissue 1951, Cambridge University Press.
² m is the same as "rank" defined on Page 3.
TABLE 9
Computations for Transforming Harrisburg Precipitation

Year    X       m    m/(N+1)    ln X      F       t(X)
1946    0.57    1    .032       -.5621    .005    -2.58
1932    1.14    2    .065        .1310    .056    -1.59
1922    1.43    3    .097        .3577    .108    -1.24
1942    1.54    4    .129        .4318    .130    -1.13
1950    1.66    5    .161        .5068    .161     -.99
1923    1.74    6    .194        .5539    .180     -.92
1941    1.76    7    .226        .5653    .186     -.89
1926    1.86    8    .258        .6206    .213     -.80
1931    2.01    9    .290        .6981    .256     -.66
1925    2.06   10    .323        .7227    .270     -.61
1938    2.19   11    .355        .7839    .310     -.50
1930    2.48   12    .387        .9083    .400     -.25
1935    2.69   13    .419        .9895    .460     -.10
1939    2.79   14    .452       1.0260    .490     -.03
1934    2.80   15    .484       1.0296    .493     -.02
1927    2.82   16    .516       1.0367    .498     -.01
1947    3.13   17    .548       1.1410    .580      .20
1949    3.20   18    .581       1.1632    .598      .25
1943    3.28   19    .613       1.1878    .616      .29
1921    3.29   20    .645       1.1909    .619      .30
1936    3.61   21    .677       1.2837    .691      .50
1933    3.94   22    .710       1.3712    .758      .70
1948    3.97   23    .742       1.3788    .761      .71
1945    4.09   24    .774       1.4085    .782      .78
1937    4.58   25    .806       1.5217    .852     1.05
1940    4.70   26    .839       1.5476    .867     1.11
1924    5.15   27    .871       1.6390    .907     1.32
1944    5.17   28    .903       1.6429    .909     1.33
1929    6.03   29    .935       1.7967    .958     1.73
1928    6.22   30    .968       1.8278    .964     1.80

Total  91.90                   29.9006
X̄ = 91.90/30 = 3.0633          (1/N) Σ ln X = 0.9967
ln X̄ = 1.1195
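As a cross-check on the arithmetic of Table 9, the fit can be recomputed from the precipitation column. A minimal sketch; equation (2) is taken here to be Thom's closed-form approximation (an assumption, since the printed equation is not fully legible), which reproduces A = 0.1228 and yields γ̂ ≈ 4.2 and β̂ ≈ 0.72, close to the chapter's 4.27 and 0.717:

    import math

    xs = [0.57, 1.14, 1.43, 1.54, 1.66, 1.74, 1.76, 1.86, 2.01, 2.06,
          2.19, 2.48, 2.69, 2.79, 2.80, 2.82, 3.13, 3.20, 3.28, 3.29,
          3.61, 3.94, 3.97, 4.09, 4.58, 4.70, 5.15, 5.17, 6.03, 6.22]

    n = len(xs)
    mean = sum(xs) / n                                    # 3.0633
    mean_ln = sum(math.log(x) for x in xs) / n            # 0.9967
    a = math.log(mean) - mean_ln                          # equation (1): 0.1228
    gamma_hat = (1 + math.sqrt(1 + 4 * a / 3)) / (4 * a)  # equation (2): ~ 4.2
    beta_hat = mean / gamma_hat                           # equation (3): ~ 0.72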
TABLE 10
Probabilities for Plotting the Fitted Gamma Distribution

u       X        F
.4      .592     .006
.6      .889     .025
.8     1.185     .063
1.0    1.481     .119
1.2    1.777     .192
1.5    2.222     .319
2.3    3.407     .648
2.8    4.147     .792
3.3    4.888     .885
3.8    5.629     .939
4.3    6.369     .969
the gamma ogive of Figure 6. Note the good agreement with the empirical ogive. It will usually be desirable to plot the gamma ogive on a larger scale than Figure 6, so that probabilities may be more easily read from the curve.
From the fitted ogive we may now read the probabilities corresponding to the sample values. These are listed as F in Table 9. Since the transformation is one of equiprobability, the transformed values t(X) may be obtained by referring the F-values to a standard error-function or normal probability table. Any normal table may be used, but the inverse normal table of "The Kelley Statistical Tables" is especially convenient because it gives probability as the argument. We may also use Table 7. Given F, we take Erf(τ) as .500 − F, or F − .500, choosing the positive alternative. From Table 7, we get τ; t(X) equals τ when F > .500, and −τ when F < .500. The t(X) of Table 9 will be a sample from a normal population with μ = 0 and σ = 1.
There are also a number of standard transformations employed either to attain approximate normality or to stabilize variance and satisfy other requirements for further statistical analysis. The most common of these are √X and log X. Both of these have been used extensively in statistical analysis and have known mathematical properties. The √X transformation may often be used on data of the gamma distribution type to produce data which are roughly normally distributed or which are suitable for regression analysis and analysis of variance. The transformation log X has been used extensively by hydrologists for transforming discharge data to approximate normality. It is also used by statisticians for correcting deficiencies in data similar to those for which the √X is effective, when such deficiencies are greater. The cube root of X also has been employed, but this transformation has no theoretical basis and is not recommended. The cube and higher roots give the impression of effective transformation through simple scale reduction when, in effect, there may be little real normalizing.
Suggestions for Additional Reading.

On Theoretical Frequency Distributions: Brooks, C. E. P. and Carruthers, N., Handbook of Statistical Methods in Meteorology, Chapters 6-8, Her Majesty's Stationery Office, London, 1953.
CHAPTER III

Sampling Theory
1. Introduction. In order to analyze the behavior of a meteorological variable, it is necessary to have available a body of data concerning that variable. Further, as the amount of information increases, we will become increasingly confident that our analysis of the variable is reliable and that it will not be changed by the addition of more and more data. Now man, however, is an inherently lazy animal, ever anxious to achieve his ends with the least exertion. Thus, we are inevitably led to ask: "How few data can we use and still arrive at the correct conclusions?" At least a partial answer to this question is found in the theory of "sampling." When
we sample an apple pie, we cut a piece, taste it, and conclude: "This pie is good," or "this pie needs more sugar." Of course, neither of these statements is based on fact, since we know nothing about the apple pie as a whole, but only about the sample of the pie we have eaten. What we mean, of course, when we say "this pie is good" is that the sample of the pie was good, and since all parts of the pie probably taste the same, it is reasonable to presume that the rest of the pie is good also. Thus, we infer the characteristics of the whole pie, in particular the uneaten remainder, from the characteristics of the sample piece we have tested. Actually, a slice of an apple pie is a very poor sample of the pie in the statistical sense, for the oven may have slanted, and all the sugar may have drifted into the sample we have tasted. Our conclusions about the rest of the pie from the sample would be incorrect. We should have collected a "random" sample of pie by picking a spoonful from various parts of the pie, blindfolded if possible. The taste of all these pieces together would then lead to a good estimate of the taste of the whole pie. Unfortunately, meteorological samples are often as bad as one single large slice of pie is of a whole pie; particularly, we are apt to pick a continuous period as our sample. Conclusions from
such a sample as to the properties of similar events are likely to be incorrect.
In summary, we should first think of a "population," the characteristics of which are to be estimated; then pick a "random" sample which is designed in such a way that each item in the population has an equal chance to be selected. In practice, this is often impossible, since the population includes future variables, and it would be difficult indeed to include future data in a sample.
Let us now consider a practical sample. Suppose that we observe that the 31 daily average temperatures at New York in January 1951 are below 35°F. Can we conclude that, in the past, the daily average temperatures were below 35°F in January, and that daily average temperatures in January in the future will be below 35°F? We would be entitled to such a conclusion only if we had good reason to believe that the sample of 31 temperatures of January 1951 represented reliably the limitless mass of past and future January temperatures, a condition probably not satisfied.
The construction of "random" samples of meteorological parameters presents severe difficulties. Consider, for example, a population of minimum temperatures in July at New York City covering 100 years, perhaps from 1850 to 1950. If all the data were available, a random sample could be constructed easily; all the temperatures could be written on pieces of paper, thrown into the proverbial hat, shaken up, and a number of individual pieces could be drawn out of the hat. These temperatures would constitute a random sample.
Unfortunately, in practice our samples cannot be selected in this manner. Actual meteorological samples are commonly constructed from successive or nearly successive data. For example, a sample of 62 July minimum temperatures at New York might consist of the July minima for 1948 and 1949. The question then arises whether this sample has similar properties as a random sample of temperatures from 1850 to 1949. Almost certainly, the answer is, "No." In the first place, there has been a general upward trend of temperature from 1850 to 1949; for this reason alone, the given sample is more likely to contain high temperatures than a random sample. In the second place, weather occurs in "regimes"; two successive months of July may be "unusually"
hot. This again leads to a sample with non-random properties. Sampling theory implies that each individual is drawn out of a population without regard to the previous individual, i.e., successive values of a sample drawn at random are independent of each other. In meteorological samples, this is rarely the case. If the temperature is high today, it is likely to be high tomorrow. It is clear that the sample from 1948-49 would lead to incorrect inferences regarding the behavior of all July minima in New York from 1850-1949, if the inference is based on random sampling theory.

Meteorological variates are unique functions of space and time. The temperature at the thermometer level at La Guardia Airport at 7:30 a.m. EST, March 13 is 33.2°F. No other temperature at any other place is entirely "homogeneous" with this value. The 62 July temperatures at New York, strictly speaking, permit nothing but statements about those 62 days at New York. It is true that the individuality of a poodle can also be regarded as a function of space and time, but all the poodles in a sample can be brought very close to each other, so that systematic variation of the poodles in space can be neglected compared to the random variation of their habits and properties.

However, in spite of the fact that often the results of the classical theory of random sampling cannot be applied with validity to meteorological problems, there are other cases where the application of the theory is quite useful and reliable. Some examples of these are found in problems of forecast verification and in the examination of the statistical relationships between two or more variates that are related because of a definite physical background. Therefore, a detailed discussion of the results of sampling theory should be helpful in pointing out what conclusions can be drawn from random samples as well as the limitations that are often not recognized in meteorological applications.

Summarizing thus far, variations from sample to sample are of two kinds:

1. Random variations. Their behavior can be described fairly well in mathematical terms, and their effect on results obtained from samples can be estimated.

2. Systematic variations. These are due to non-random sampling, and their effect is not easily estimated. In selecting
the sample, we should try to avoid non-random variations. If, however, as is often the case in meteorological statistics, non-random errors cannot be completely avoided, then their existence must be acknowledged, and their possible bearing on the conclusions drawn from the sample must be taken into account.
The further discussion deals only with effects of variations of the first kind on statistics computed from samples.
1. The Null Hypothesis.

The purpose of sampling theory is to provide methods that permit one to draw some valid conclusions regarding a population, given only the data in a random sample. The investigator is usually concerned with showing that the population has some property or characteristic which has been hypothesized, and it is to be determined from the sample whether the data are or are not in agreement with this hypothesis. However, the test is made in what appears to be an indirect way, by setting up the so-called null hypothesis, denoted by H0, that the population does not have the property in question. This is necessary because a hypothesis to be tested quantitatively must be exact, and H0 fulfills this requirement. For example, the statement that "A and B have the same value of variable x" is a precise statement, but "x and y are related in some way" is vague and ambiguous. After H0 has been formulated carefully, statistical theory is used to determine the strength of the evidence in favor of rejecting the null hypothesis. The word rejection here means that, if the null hypothesis were correct, the sample almost certainly would have come out differently from the way it did come out. Once the null hypothesis has been rejected, we are reasonably certain (but never completely certain) that the population does have the property in question.

We shall illustrate this rather abstract discussion by means of a concrete example. Two forecasters, A and B, make 10 forecasts in 10 independent situations. Because of previous forecasting experience, it is anticipated that A will do better than B. Forecaster A is better than B in all 10 forecasts. The problem is to show that A is "significantly" better than B. Common sense would indicate that he is. In statistical terms, we first define a population out of which the given ten forecasts constitute a random sample. This population would be the total of all forecasts which the two forecasters could make. A is "significantly" better than B if, in the population of all forecasts, A would still
make more correct forecasts than B. But can we be sure of this from a sample of only 10 forecasts? To test this, we first set up the null hypothesis H0 that A and B are equally good. What are the chances that two equally good forecasters would produce the observed lopsided distribution? If the probability is very small, we can reject the null hypothesis and conclude that forecaster A is significantly better than B.

We have here a situation quite analogous to the binomial coin situation discussed in Chapter II. Under hypothesis H0, the probability is equal that A will be better or that B will be better; p, the probability of A's score being higher, would be 1/2. A and B act just like unbiased coins. What was the probability of getting 10 heads and no tails from unbiased coins? It was 1/1024, or roughly 0.1%. Similarly, the probability of getting a better score from A in 10 out of 10 attempts would be only 0.1% if A and B were really equally good. Since this probability is small, we might now reject the null hypothesis (that A and B are equal) and postulate that A is significantly better than B. Note that we are willing to take a 0.1% chance of being wrong in this conclusion. That is, 0.1% of the time (in the long run) we would conclude that A was significantly better than B even though they were in fact equal.

In the previous example, we took a chance of one in a thousand of being wrong. This is sometimes stated by saying that A is significantly better than B on the 0.1% level. In most cases, we are willing to reject the null hypothesis when the probability is as low as 0.1%. But what would we do if this probability had been 1%, 5%, or 10%? The choice of the critical probability is somewhat arbitrary and depends to some extent on a subjective decision. However, in any case, the rejection limit should be set before the sample is analyzed. 1% and 5% limits are commonly used. The exact permissible chance for error can occasionally be determined quantitatively, if the cost of a wrong decision either way can be gauged. Usually, however, this is not the case, at least quantitatively, so that the choice depends on the degree of certainty the statistician believes he needs in order to make a definite statement. In any case, the level of significance should always be stated when a statistical conclusion about a population is published.

In the example discussed above, we were able to reach a fairly definite conclusion about the population by rejecting the null
hypothesis. Now suppose we took two other forecasters, C and D, and found that C was superior in 6 out of 10 forecasts. Obviously, one could not reject the null hypothesis that C and D really had equal ability. But does this prove they had equal ability? In other words, could we say with certainty that if they made a million forecasts, they would split 50-50? The answer is in the negative. If we cannot reject the null hypothesis, we do not prove that the null hypothesis is correct. All we can state is that the two forecasters are not significantly different from each other. They may be different in their ability, but we can prove this only by additional information.
Finally, we proceed to two more forecasters, E and F, where E has been considered the better forecaster. E makes better forecasts than F in 8 out of 10 forecasts. Is E significantly better than F on the 1% level? That is, is there a 1% chance or less that these results would have been obtained if the forecasters were really equal? In this case, common sense might be misleading, for the difference in ability appears to be substantial.
Again, the null hypothesis H0 is that the forecasters are really equal; that is, p = 1/2. We then ask, what is the probability that, with 10 unbiased coins, 8 or more turn up heads. This will be 45 + 10 + 1 = 56 out of 1024, or slightly more than 5%. Thus the probability is more than 5% that, out of two equal forecasters, the one specified in advance will beat the other 8 or more times in 10. If we stated that E was better than F, the chances are better than 5% of being wrong. We have stated initially that we will not accept as large a chance of error. Hence, we cannot reject the null hypothesis, and cannot pass judgment on the inferiority of F. Incidentally, whereas a ratio of 8 : 2 is not significantly different from 50-50, a ratio of 80 : 20 is highly significant. This might make it appear that by adding data one necessarily will reach a significant result; actually, however, starting from a score of 8 : 2, one might find that the ratio approaches closer to 50-50 as more forecasts are made. Adding information is no guarantee of reaching significant results, since, possibly, there is fundamentally no significant difference.
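These tail probabilities are simple to verify directly. The following is a minimal Python sketch (an addition for illustration, not part of the original text), using only the standard library:

```python
from math import comb

def binom_tail(n, k, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p): the chance that the forecaster
    named in advance wins k or more of n comparisons under H0."""
    return sum(comb(n, j) * p**j * (1 - p)**(n - j) for j in range(k, n + 1))

print(binom_tail(10, 10))  # A beats B 10 times out of 10: 1/1024, about 0.1%
print(binom_tail(10, 8))   # E beats F 8 or more times: 56/1024, about 5.5%
```

Since 56/1024 exceeds the 5% limit set in advance, the computation confirms that the null hypothesis for E and F cannot be rejected.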
2. Binomial Testing Generally.
Whenever one tests a hypothesis regarding two alternatives (when the variable is "dichotomous," having only two values), the binomial distribution can be used as the basis of the test. In the example of the previous section, the binomial distribution could be used because there were only two alternatives: in each forecast, either one or the other forecaster was better. (Cases of ties are not counted; they are treated as though those forecasts had not been made.) The magnitude of the scores was not required, nor any assumption regarding the frequency distribution of forecast scores. A test based on no special frequency distribution is called a "nonparametric" test, and the binomial test has this property. Practically the most important assumption underlying binomial as well as other nonparametric tests is the condition that the samples were drawn "at random," with no relation between successive variates.
If we had used the scores, as well as just the statement of which forecaster was better, we might have come out with a more "powerful" test. For example, in the case where E beat F in 8 out of 10 cases, significance could not be proved by the binomial test; had we taken the scores into account (which might have shown that in the two cases in which F was superior, the difference was very small), a significant difference might have been demonstrated. But, as soon as actual scores are analyzed, it is necessary to make an assumption about the frequency distribution of the scores. Such tests will be discussed later in this chapter.

To return to the use of the binomial test, let us consider the following hypothetical example: the Chamber of Commerce of Southern California claims that smog occurs on the average one day out of five. One hundred summer days are selected at random, and smog occurs on 35 days. Are we permitted to conclude that the frequency of occurrence has been understated deliberately, and that it is really higher?

Apparently, this is a binomial situation, since there are just two alternatives, smog and no smog, so that a binomial test is proper, provided the data were selected independently. This means, particularly, that the 100 days in question are not successive days, since weather on one day is not independent of the weather of the previous day. For example, if smog is observed on a given day, the probability is high that it will again occur the next day.

Our null hypothesis is that the probability of smog is 1/5. The question we ask is, what is the probability of getting 35 days of
smog or more out of 100, with a probability of 1/5? If the probability is low, we reject the null hypothesis and suggest that the claim of only 1/5 probability was wrong.
This problem could be solved, in principle, by working out the binomial probabilities, for p = 1/5, of getting 35, 36, 37, etc., cases of smog out of 100. The task of doing this is, however, forbidding. Instead, we remember that, for large N, the binomial distribution approaches the normal distribution. The rule of thumb states that a normal distribution is an adequate approximation if Np(1 − p) exceeds 9. This condition is here satisfied. Thus, under H0, the probability distribution of smog days should be approximately normal, with a mean of Np = 20 and a standard deviation of √(Np(1 − p)) = 4. Now the question is, what is the probability of getting 35 or more from a normal distribution with a mean of 20 and a standard deviation of 4? From Table 7 in Chapter II, we see that the probability of getting a distance of more than three standard deviations from the mean is less than 1%. If we use again the 1% limit as our criterion, we must reject the null hypothesis, and must conclude that the Chamber of Commerce has underestimated the probability of smog. Again, we take a chance of less than 1% of being wrong.
The approximation of the binomial distribution by the normal distribution greatly simplifies the calculations when N is large enough. For most accurate results, however, the test just described should be modified slightly (the effect of the modification is usually small): whereas the normal distribution is continuous, the binomial distribution gives probabilities only at discrete points. In order to allow for this, we consider the points in the binomial distribution as midpoints of class intervals. For example, 35 in the binomial distribution represents the class interval 34.5-35.5 for the normal distribution. Hence we now ask: What is the probability of finding a value of 34.5 or larger from 100 observations with a probability of smog of 20%? Apparently, this small correction has no important effect on the conclusion that the probability of smog is significantly greater than 20%.
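In Python, the corrected computation might look as follows (a sketch added for illustration, not from the original text; NormalDist from the standard library supplies the normal curve):

```python
from math import sqrt
from statistics import NormalDist

N, p = 100, 1/5                  # 100 independent days; claimed smog probability
mu = N * p                       # 20
sigma = sqrt(N * p * (1 - p))    # 4; Np(1-p) = 16 exceeds 9, so the
                                 # normal approximation is adequate

# Continuity correction: the binomial value 35 stands for the class
# interval 34.5-35.5, so we ask for P(X >= 34.5) on the normal curve.
print(1 - NormalDist(mu, sigma).cdf(34.5))   # about 0.00015, far below 1%
```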
3. More Than Two Categories.
When data are divided into more than two groups, the "chi-square" (χ²) test is useful. For example, in connection with the
problem of forecast verification, frequencies are reported in a number of groups greater than two. It is then required to discuss the influence of random variations on the forecasts. The hypothesis may be set up that a forecaster really had no skill whatever, for even forecasts made without any skill whatever may produce results which appear fairly good. The no-skill hypothesis can be tested by use of the chi-square method.

The chi-square method is nonparametric. It requires that the data are given in terms of the number of cases in certain categories. A given sample contains N individuals distributed over m categories. The number of individuals in each category is O₁, O₂, O₃, O₄, etc. We shall make the null hypothesis that this sample belongs to a population in which the number of individuals is H₁, H₂, H₃, H₄, etc. It is important that the O's and H's are numbers of individuals, not probabilities. We now form the quantity, chi-square, defined by:
$$\chi^2 = \frac{(O_1 - H_1)^2}{H_1} + \frac{(O_2 - H_2)^2}{H_2} + \frac{(O_3 - H_3)^2}{H_3} + \frac{(O_4 - H_4)^2}{H_4} + \cdots = \sum_{i=1}^{m} \frac{(O_i - H_i)^2}{H_i}$$
Here m is the number of categories. The theoretical distribution of χ² is well known and is tabulated in Table 11. Thus, if the hypothetical distribution is far different from the given distribution, chi-square will be large. When chi-square is larger than certain limits, the null hypothesis is rejected, and the observed distribution is "significantly" different from the hypothetical distribution. As before, the level of rejection is reached when the probability of obtaining the sample chi-square from the hypothetical population is less than 5% or 1%.

These limiting values of chi-square depend on the "number of degrees of freedom," n. This is the number of categories, m, minus the number of restrictions. A very common restriction consists in the condition that the total number of individuals in the hypothetical distribution has to be the same as that in the sample distribution. If there are five categories, and the only restriction is the given total, the number of cases in the first four categories of the hypothetical distribution is almost unrestricted, whereas the number in the last group is fixed by the given total. There are four degrees of freedom, a number equal to 5, the number of categories, minus 1, the number of restrictions.
TABLE 11
Limiting Values of Chi-Square*

n        10%       5%        1%        0.1%
1        2.71      3.84      6.64      10.83
2        4.61      5.99      9.21      13.82
3        6.25      7.82      11.34     16.27
4        7.78      9.49      13.28     18.46
5        9.24      11.07     15.09     20.52
6        10.64     12.59     16.81     22.46
7        12.02     14.07     18.48     24.32
8        13.36     15.51     20.09     26.12
9        14.68     16.92     21.67     27.88
10       15.99     18.31     23.21     29.59
11       17.28     19.68     24.72     31.26
12       18.55     21.03     26.22     32.91
13       19.81     22.36     27.69     34.53
14       21.06     23.68     29.14     36.12
15       22.31     25.00     30.58     37.70
16       23.54     26.30     32.00     39.25
18       25.99     28.87     34.81     42.31
20       28.41     31.41     37.57     45.32
23       32.01     35.17     41.64     49.73
26       35.56     38.88     45.64     54.05
30       40.26     43.77     50.89     59.70

For n > 30, √(2χ²) − √(2n − 1) may be treated as a normal variate with unit standard deviation.

*Abridged from Table IV of "Statistical Tables for Biological, Agricultural and Medical Research" by Fisher and Yates, Oliver and Boyd Ltd., by permission of the authors and publishers.
An important condition on the propriety of using χ² is that for no category should H be less than 5.

It is also advisable, especially when n is small, to make a correction similar to that suggested for the binomial testing. The χ² distribution is continuous, while the distribution of frequencies is discrete. This correction, due to F. Yates, involves reducing each value of the absolute discrepancy |O − H| by ½ before squaring.

As an example, we shall consider a typical problem of forecast verification. We are forecasting 5-day total precipitation in three categories: light, moderate and heavy precipitation, abbreviated by L, M and H. We make 200 forecasts. The results of our forecasts are summarized by Table 12. Successful forecasts appear in the diagonal from upper left to lower right.
TABLE 12
Forecast Contingency Table

                  Observation
Forecast      L      M      H     Total
L            35     25     10       70
M            20     30     15       65
H            10     15     40       65
Total        65     70     65      200
We might now make the null hypothesis that the forecaster really had no skill, and that his apparent ability is accidental. If he had no skill, what would have been the resulting verification contingency table? In that case, forecasts and verification would have been independent. Then, the probability of getting into any one box would have been the product of the probability of getting into the given column and the probability of getting into the given row. The probability of getting into a column is given by the ratio of column total to grand total; that of getting into a row, by the ratio of row total to grand total. Thus, the probability of any forecast falling into the first row is 70/200 = .35. Therefore, if forecasts and verification were unrelated, the probability of getting into the upper left-hand box would be 70/200 × 65/200 = .114.

Now, we are not so much interested in the probability of a forecast falling into a given box under the condition of no skill, but in the expected number in that box. This is given by the product of the probability of reaching that box and the total number of forecasts. This comes out as (70 × 65)/200 = 23. In general, the no-skill number expected in each box is given by the product of row total times column total, divided by grand total. The complete no-skill table corresponding to Table 12 is shown as Table 13. Note that the column and row totals are the same as those in the original contingency table.

The null hypothesis to be tested, that there really was no skill, is then represented by the contingency Table 13. We can then compute chi-square by forming
$$\chi^2 = \sum \frac{(O - H)^2}{H}$$
the sum extending over the nine boxes. We find:
$$\chi^2 = \frac{(35-23)^2}{23} + \frac{(25-24)^2}{24} + \frac{(10-23)^2}{23} + \frac{(20-21)^2}{21} + \frac{(30-23)^2}{23} + \frac{(15-21)^2}{21} + \frac{(10-21)^2}{21} + \frac{(15-23)^2}{23} + \frac{(40-21)^2}{21} = 43$$
There are altogether nine categories. The number of individuals in each row or column is fixed. This may seem to give six restrictions; but actually, when the sums of the three rows and of two columns are given, the third column has a given value, since the grand total is determined by the three row sums. Hence, there are only five independent restrictions, and the number of degrees of freedom is 9 − 5 = 4.
TABLE 13
No-Relation or No-Skill Contingency Table

                  Observation
Forecast      L      M      H     Total
L            23     24     23       70
M            21     23     21       65
H            21     23     21       65
Total        65     70     65      200
This can be seen in another way: after numbers have been inserted in the four boxes to the upper left of the verification table, the numbers in the other five follow from the conditions on the sums of rows and columns. The first four numbers could have been chosen arbitrarily (as long as their sum does not exceed 200, the total number of individuals); hence, there are four degrees of freedom. Returning to Table 11, we see that the 1% limit for chi-square with 4 degrees of freedom is 13.28. The observed value of chi-square is much larger than this; hence, the probability is considerably less than 1% that a no-skill forecast would have led to the given verification table (Table 12).
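The arithmetic above is compact enough to check by machine. A minimal Python sketch (an illustration, not part of the original text) that reproduces the no-skill expectations of Table 13 and the chi-square for Table 12:

```python
# Rows are forecasts L, M, H; columns are observations L, M, H (Table 12).
observed = [[35, 25, 10],
            [20, 30, 15],
            [10, 15, 40]]

grand = sum(map(sum, observed))                     # 200
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# No-skill expectation: row total times column total over grand total.
expected = [[r * c / grand for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))

# About 43, far above 13.28, the 1% limit of Table 11 for 4 degrees
# of freedom, so the no-skill hypothesis is rejected.
print(round(chi2, 1))
```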
Frequently, the chi-square method is used to test whether a given frequency distribution differs significantly from normal. For each class, the observed frequency is subtracted from the frequency expected by the normal distribution, squared, and divided by the normal distribution frequency. These ratios are
then added, and the result is called chi-square. Since the normal distribution drops to very low values at large deviations from the mean, the fractions there are poorly determined; therefore the values falling into several extreme classes are combined. The number of degrees of freedom should be N − 3, where N is the number of classes after the combination of extreme groups. There are three restrictions, because the normal distribution fitted to the sample has the same area, mean and standard deviation as the sample. Again, if chi-square comes out larger than the 5% or the 1% limit, the observed distribution differs more from the normal than could be accounted for by random scattering. If chi-square is smaller, the population from which this sample has been taken may be normal.

There are other means of testing the normality of an observed frequency distribution. Most of these, however, are applied only to a special property. For example, the normal distribution has zero skewness. In a manner similar to the methods discussed previously, it is possible to determine whether a given frequency distribution has a skewness significantly different from zero. Also, the normal distribution has a coefficient of kurtosis of 3, and the difference between the kurtosis of an observed distribution and the normal distribution may be significant.
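A sketch of this goodness-of-fit test in Python (an addition for illustration, not from the original; the function and argument names are ours, and the interior class boundaries are assumed to have been chosen so that, after combining extreme groups, no expected frequency is too small):

```python
from statistics import NormalDist, mean, pstdev

def chi2_normality(data, boundaries):
    """Chi-square between an observed frequency distribution and the
    normal curve fitted to the sample. `boundaries` are the interior
    class limits; the two end classes are open-ended. Returns chi-square
    and the degrees of freedom (number of classes minus 3, since the
    fitted normal matches the sample's area, mean, and standard
    deviation)."""
    n = len(data)
    fitted = NormalDist(mean(data), pstdev(data))
    edges = [float("-inf")] + list(boundaries) + [float("inf")]
    chi2 = 0.0
    for lo, hi in zip(edges, edges[1:]):
        obs = sum(1 for x in data if lo <= x < hi)
        exp = n * (fitted.cdf(hi) - fitted.cdf(lo))
        chi2 += (obs - exp) ** 2 / exp
    return chi2, len(edges) - 1 - 3
```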
4. Significance of Means.

The mean of a sample is the best estimate we can make of the mean of the population from which the sample was taken. The question arises, how good is this estimate? This question can be answered on the basis of parametric tests only.

Consider now many samples, each of size N, drawn from the same population. Let us assume that the population is nearly normal; that is, that the variate is distributed according to the "bell-shaped" curve, or nearly so. Suppose we compute the mean X̄ of each sample and construct a frequency distribution of these means. The dispersion of the distribution of means will depend on N according to the formula

$$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{N}}$$

where σ_X̄ is the standard deviation of the distribution of sample means, and σ is the standard deviation of the original population from which samples of size N were drawn. (This expression is proved in the Appendix.) The distribution of means tends to normality as N increases, even if the distribution of the variate is far from normal. This statement is called the "Central Limit Theorem."
The significance of a deviation of a sample mean from a hypothetical population mean can then be judged by referring to Table 14 for the probability of a deviation this large or larger. Figure 7 illustrates this question qualitatively. It shows the distribution of means, centered at μ, with standard deviation σ_X̄ = σ/√N. Its area is 1, so that fractional areas represent probabilities.
TABLE 14
Distribution of "Student's" t, Given the Limiting Probability*

n         .1        .05       .01       .001
1        6.314    12.706    63.657   636.619
2        2.920     4.303     9.925    31.598
3        2.353     3.182     5.841    12.941
4        2.132     2.776     4.604     8.610
5        2.015     2.571     4.032     6.859
6        1.943     2.447     3.707     5.959
7        1.895     2.365     3.499     5.405
8        1.860     2.306     3.355     5.041
9        1.833     2.262     3.250     4.781
10       1.812     2.228     3.169     4.587
11       1.796     2.201     3.106     4.437
12       1.782     2.179     3.055     4.318
13       1.771     2.160     3.012     4.221
14       1.761     2.145     2.977     4.140
15       1.753     2.131     2.947     4.073
16       1.746     2.120     2.921     4.015
17       1.740     2.110     2.898     3.965
18       1.734     2.101     2.878     3.922
19       1.729     2.093     2.861     3.883
20       1.725     2.086     2.845     3.850
21       1.721     2.080     2.831     3.819
22       1.717     2.074     2.819     3.792
23       1.714     2.069     2.807     3.767
24       1.711     2.064     2.797     3.745
25       1.708     2.060     2.787     3.725
26       1.706     2.056     2.779     3.707
27       1.703     2.052     2.771     3.690
28       1.701     2.048     2.763     3.674
29       1.699     2.045     2.756     3.659
30       1.697     2.042     2.750     3.646
40       1.684     2.021     2.704     3.551
60       1.671     2.000     2.660     3.460
120      1.658     1.980     2.617     3.373
∞        1.645     1.960     2.576     3.291

*Abridged from Table III of Fisher and Yates' "Statistical Tables for Biological, Agricultural, and Medical Research," published by Oliver and Boyd Ltd., Edinburgh, by permission of the authors and publishers.
FIGURE 7 Frequency Distributions of Sample Means.
The mean of the given sample, X̄, has been entered in this diagram. The shaded areas give the probability that a sample has a mean differing from μ by |X̄ − μ| or more. For example, if |X̄ − μ| were exactly equal to σ_X̄, and the distribution of means nearly normal, then this probability would be 100% − 2(34%) = 32%. Similarly, if X̄ − μ were equal to 2σ_X̄, the shaded area would amount to 5% of the total area. Or, the probability of obtaining a sample mean differing by 2σ_X̄ or more from the population mean is 5%.

Note that we have worked here with both ends of the normal distribution, whereas in earlier examples we used only one end. This depends on whether our tests concern only one extreme or both. In the case of smog in Southern California, we were interested only in large percentages of occurrence. Hence, we used the upper end of the curve only. Such a test is called a "one-tailed" test. If our hypotheses may involve large deviations from the mean in either direction, we deal with "two-tailed" tests.

If the probability that X̄ comes from a population with mean μ becomes very small, the sample mean is said to be significantly different from μ. The probability at which a difference becomes significant depends on the degree of certainty with which
the conclusion has to be correct. Some investigators will call a difference significant as soon as the probability drops to 5%; that is, as soon as the mean of the hypothetical population differs from the mean of the sample by 2σ_X̄ or more. Others require a difference of 2.6σ_X̄, where the probability drops to 1%.
In the discussion above, it was assumed that the standard deviation σ of the population was known. However, in most practical circumstances this is not the case, and it is necessary to estimate σ from a relatively small sample of data. The proper estimate for the standard deviation of the sample means is shown in the Appendix to be equal to s/√(N − 1). The departure of a sample mean X̄ from a hypothetical mean μ can then be tested by computing the ratio:

$$t = \frac{\bar{X} - \mu}{s}\,\sqrt{N - 1}$$
which is known as "Student's" t. The probability distribution of t depends on the sample size N, and is not normal except approximately for N larger than 30. The distribution is given in Table 14. Note that in this table the argument is called "degrees of freedom," n, and is not the number of observations. In testing the mean of a single sample of size N, the proper "degrees of freedom" value to use is n = N − 1.
As an application of this rather abstract discussion, we consider the following problem. It is desired to compare two forecasters as to their forecasting ability. Each forecaster makes 65 forecasts under the same conditions. One of the forecasters, A, has an average score of 63; the other, B, of 64.5, according to some scheme of verification. The question is, is one forecaster significantly different from the other? In other words, if the forecasters made many more forecasts, would B still score higher than A?
In order to determine this, we investigate the scores in the individual forecasts. For each forecast, we subtract the score of forecaster A from that of forecaster B. We call these 65 differences, D. The D's are sometimes negative, sometimes positive; for some forecasts, A's forecast was better, for some, B's. The differences, D, now take the place of the variate X in the theoretical discussion. The mean value of D is, of course, 64.5 − 63,
that is, 1.5. We now compute the standard deviation of the D's and find it equal to 15. We then test the hypothesis that, had each forecaster made an infinite number of forecasts, the scores would have come out the same; that is, D̄ would come out as zero. In other words, the hypothetical population of the D's has a mean of zero. What are the chances that a sample with a mean differing from zero by 1.5 or more will be obtained by the accidental variations inherent in random sampling? In this case,
$$t = \frac{\bar{D}}{s}\,\sqrt{N - 1} = 0.80$$

From Table 14, the probability for a t this large or larger is greater than 10%. Hence, the "true" mean, that is, the population mean, may quite possibly be zero; the hypothesis cannot be rejected. The two forecasters may be equally good.

However, this test does not prove that the two forecasters are equally good. In order to conclude more definitely whether one of the two forecasters is better than the other, it would be necessary to have them make more and more forecasts, until a "significant" result is reached. If the difference between the mean scores becomes smaller as the sample size is increased, it becomes likely that the two forecasters are practically equal. If the difference and s stay about the same, the difference will become "significant," since t increases with increasing N. Actually, even a "significant" difference between the two forecast score averages does not prove that one forecaster is better than the other; it only makes it extremely unlikely that the two forecasters are equally good.

An application of this theory, similar at first sight, but differing in the kind of variates, concerns the testing of differences of mean temperatures at 0400 GMT and 1600 GMT at 700 mb, at 8 eastern radiosonde stations observed on 13 successive days. There were 101 temperature differences (three observations were missing), with a mean of 2°C and a standard deviation of 5°C. The 1600 GMT temperatures appear significantly larger than the 0400 GMT temperatures. This conclusion would be based on the result that it is very unlikely that a population with equal mean 1600 GMT temperatures and 0400 GMT temperatures would lead to samples with mean differences of two degrees or larger. But what is meant by population: the temperatures at many more places, or those observed at the same place over a very long period? Our sample can hardly be regarded as drawn
at random out of either population. Both the region and the specific time may be quite "unusual," and probably are. The statement that the difference of two degrees is significantly different from zero is based on the concept of random samples, a concept which is not applicable to this problem.
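For the forecaster comparison, the arithmetic of the t ratio is a one-liner; the following Python sketch (an addition, not part of the original text) uses the figures quoted above:

```python
from math import sqrt

def student_t(sample_mean, hyp_mean, s, n):
    """The ratio t = (Xbar - mu) * sqrt(N - 1) / s, where s is the standard
    deviation computed from the sample itself; enter Table 14 with
    N - 1 degrees of freedom."""
    return (sample_mean - hyp_mean) / s * sqrt(n - 1)

# 65 score differences D with mean 1.5 and standard deviation 15;
# H0: the population of D's has mean zero.
t = student_t(1.5, 0.0, 15.0, 65)
print(round(t, 2))   # 0.80, well below 1.671, the 10% value of Table 14
                     # near 60 degrees of freedom, so H0 cannot be rejected
```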
5. Difference of Means of Independent Samples.
Again, we consider a hypothetical, infinite population, distributed normally with standard deviation σ. What is the chance that, in drawing two samples from this population, with numbers of members N₁ and N₂, the means X̄₁ and X̄₂ will differ by some specified amount? In order to find out, we draw pairs of samples out of the hypothetical population, one of each pair containing N₁ members, the other N₂. We form the difference of the means of each pair and make a frequency distribution of these differences. This distribution will be normally distributed, with a mean of zero and a standard deviation

$$\sigma_d = \sigma \sqrt{\frac{1}{N_1} + \frac{1}{N_2}}$$

provided the two samples are independent of each other.
In practical applications, the standard deviation of d = X̄₁ − X̄₂ must be estimated from the samples by the formula:

$$s_d = \sqrt{\frac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2} \left( \frac{1}{N_1} + \frac{1}{N_2} \right)}$$

where s₁ and s₂ are the observed standard deviations of the two samples with numbers N₁ and N₂, respectively. The ratio

$$t = \frac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{N_1 s_1^2 + N_2 s_2^2}{N_1 + N_2 - 2} \left( \dfrac{1}{N_1} + \dfrac{1}{N_2} \right)}}$$
is then distributed as the t distribution, Table 14. The table should be entered with N₁ + N₂ − 2 degrees of freedom.
The technique outlined here can be applied to samples of meteorological variates for which a population cannot be defined; in fact, it may serve to prove the non-existence of a population. For example, we may choose two samples of July minimum temperatures at New York, one sample 1948-49, and another, perhaps, 1932-33. If the means of these two samples differ from each other so that t is more than 2.8 (see Table 14, the 1% limit),
it is unreasonable to assume that they have been drawn at random out of a common population. Actually, the value of t was 6. This means the two samples are extremely unlikely to be random samples drawn from the same population; or, a population can hardly exist of which the given data constitute two random samples.
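The pooled formula translates directly into code. A Python sketch (an illustration, not from the original; the sample figures in the example call are hypothetical, since the text reports only the final value of t):

```python
from math import sqrt

def t_two_samples(mean1, s1, n1, mean2, s2, n2):
    """t for the difference of two independent sample means, using the
    pooled estimate s_d given above; enter Table 14 with n1 + n2 - 2
    degrees of freedom."""
    pooled = (n1 * s1**2 + n2 * s2**2) / (n1 + n2 - 2)
    return (mean1 - mean2) / sqrt(pooled * (1 / n1 + 1 / n2))

# Hypothetical July-minima comparison: two samples of 62 days each.
print(t_two_samples(71.0, 4.0, 62, 67.0, 4.5, 62))   # about 5.2; beyond 2.8,
                                                     # the 1% limit, so no
                                                     # common population
```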
6. Nonparametric Methods.

The significance of differences between samples can also be tested by a number of nonparametric tests. Many of these have the advantage of ease of computation, and of no assumptions regarding the frequency distributions. In some of the techniques the order of the data must have meaning; in others, not even this is required. For example, a variate consisting of clear, overcast, and precipitation would always be admissible, because these attributes are listed in the order of worsening weather. On the other hand, clear, overcast, rain and snow would not possess a natural order. Some of the nonparametric methods, particularly the rank methods to be discussed presently, would not be adequate for such variates.

If nonparametric tests have so many advantages, why do we not always use them? The reason is that, if we have numerical variables which have well-behaved distributions, we throw away information by using nonparametric tests. This may mean that we may be able to prove a significant difference between two or more samples by a parametric technique, but are unable to do so by nonparametric statistics. We say that parametric methods have greater power than nonparametric methods.
7. Testing Difference of Means by Rank Methods.

Rank is extremely easily determined, and its use has increased in recent years. All one has to do is to put the variates in the order of their magnitude, and assign 1 to the first, 2 to the second, 3 to the third, and so forth. If the same value of the variate occurs several times, the average rank is given to all. For example, if the fifth, sixth, and seventh lowest temperatures are all 33°F, all are given rank 6. If only the fifth and sixth have the same temperature, the rank assigned is 5½. In Table 15 is given a short series of wind speeds and the corresponding ranks.

First, we shall consider two samples of paired data, as for example, the forecasts that two forecasters, A and B, made on the same 10 days of September.
TABLE 15
Wind Speeds and Corresponding Ranks

Wind Speed, mph     Rank
5                    1
12                   4.5
17                   9
6                    2
20                   10
12                   4.5
16                   7
7                    3
16                   7
22                   11
16                   7
First, form the 10 differences between the scores, B minus A, just as in the algebraic method discussed before. But, instead of computing the mean of these differences, rank the differences without regard to sign. Any zero differences resulting from tied scores are discarded before the ranking is begun. Then the ranks are assigned positive or negative signs according to whether the differences are positive or negative. If forecaster B is much better than A, most rank differences would have a positive sign; moreover, any "negative" ranks would be small.

Table 16 gives an example of how such rank differences might look when A appears to be better than B. B scored higher only on Sept 4 and Sept 6, and then not by very much. Are A and B significantly different? To decide, we total the rank differences for the fewest cases of the same sign.
TABLE 16
Rank Differences of Forecasts, Sept 1 through Sept 10

[Of the ten signed rank differences, only those for Sept 4 and Sept 6 are positive, with ranks +2 and +4; the remaining eight are negative.]
These are the plus-two and plus-four cases, which add up to 6. Note that such a total would be large in numerical value if the forecasters were fairly even. Table 17 gives the information required to decide whether this number 6 is significant:
TABLE 17
Significance Limits, Paired Samples, Rank Method¹

Number of Pairs     Limiting Absolute Value of Total
                    5% Limit     1% Limit
6                   0            0
8                   4            0
10                  8            3
15                  25           16
20                  52           38
25                  89           68

¹Abridged from Table II of "Some Rapid Approximate Statistical Procedures," by permission of Frank Wilcoxon and American Cyanamid Company.
Apparently, the value 6 for 10 forecasts falls between the 5% and 1% limits. The probability of getting a total as low as or lower than 6 is greater than 1%, but less than 5%. Hence, in spite of the apparent superiority of forecaster A, the significance of his higher scores is doubtful. Note that no set of differences could have been significant with 6 or fewer forecasts.
If more than 25 pairs of data are to be tested, the limits can be calculated from:

$$\text{5\% limit:}\quad \frac{N(N+1)}{4} - 1.96\sqrt{\frac{N(N+1)(2N+1)}{24}}$$

$$\text{1\% limit:}\quad \frac{N(N+1)}{4} - 2.576\sqrt{\frac{N(N+1)(2N+1)}{24}}$$
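The whole rank procedure, including the averaging of tied ranks, fits into a short Python sketch (an addition for illustration; the ten differences in the example are hypothetical score differences constructed to match the totals of Table 16):

```python
from math import sqrt

def ranks_with_ties(values):
    """Ranks in order of magnitude; tied values all receive the average
    of the ranks they would otherwise occupy, as in Table 15."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in order[i:j + 1]:
            ranks[k] = (i + j) / 2 + 1      # ranks are 1-based
        i = j + 1
    return ranks

def rank_total(differences):
    """Total of the rank differences carrying the less frequent sign;
    zero differences are discarded before ranking."""
    d = [x for x in differences if x != 0]
    r = ranks_with_ties([abs(x) for x in d])
    plus = sum(rk for rk, x in zip(r, d) if x > 0)
    minus = sum(rk for rk, x in zip(r, d) if x < 0)
    return min(plus, minus)

# Only two positive differences, whose ranks total 2 + 4 = 6 (Table 16).
print(rank_total([-10, -9, -3, 2, -8, 4, -7, -6, -5, -1]))   # 6.0

def limits_over_25(n):
    """5% and 1% limits for more than 25 pairs, from the formulas above."""
    mean, sd = n * (n + 1) / 4, sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return mean - 1.96 * sd, mean - 2.576 * sd

print(limits_over_25(30))   # approximately (137.2, 107.3)
```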
8. Analysis of Variance.

In previous sections we have seen how it is possible to test whether the mean X̄ of a single sample differs significantly from some hypothetical population mean μ. The discussion of this test led to the t distribution, which also permitted the comparison of the means of two samples, X̄₁ and X̄₂. It now seems natural to ask whether it is possible to test if three or more sample means are likely to have come from the same hypothetical population with mean μ. The solution of this problem, approached from a different point of view than previously used, has led to a general technique called the analysis of variance.

The purpose of analysis of variance is to determine how specific sources of variation contribute to the total variance of a quantity, and to test whether the effect of a particular factor is real or likely to have arisen as a result of random errors. This is accomplished by an arithmetical procedure which will be illustrated by the following numerical example.
TABLE 18
Analysis of Variance

Data (deviations from group means in parentheses)

Group 1:    4 (0)      1 (-3)     8 (4)      3 (-1)
Group 2:   10 (4)      5 (-1)     3 (-3)     6 (0)
Group 3:    2 (-4.5)   9 (2.5)    8 (1.5)    7 (0.5)

Computations

Source                  SS    df    MS    Estimate of
Total (T)               95    11
Between Groups (B)      14     2     7    σ² + mσ_G²
Within Groups (W)       81     9     9    σ²
Table 18 presents some data obtained in three groups, with four observations in each group, along with some computations derived from these data. Now the question to be answered is whether there is any evidence to support the hypothesis that the means of these three groups are really different. The means X̄ of the three groups are 4, 6 and 6.5, respectively. The grand mean of the sample is, therefore, 5.5.
The figures in parentheses next to the original data give the deviations d of the individual observations X from the mean X̄ of the particular group.
It is possible to represent any particular observation in terms of the general mean X̿, the deviation d of the particular observation from its group mean X̄, and the deviation g of the group mean from the general mean. Symbolically, we have:

$$X = \bar{\bar{X}} + (X - \bar{X}) + (\bar{X} - \bar{\bar{X}})$$
In general, one can divide up the variation of x = X − X̿ into two types of fluctuations: random, or sampling, fluctuations, and "systematic" fluctuations. As has been pointed out at the beginning of this section, the function of analysis of variance is to find the systematic variations between means, and to determine whether they are real or, in statistical language, whether they are significant. The quantities g above certainly are influenced by systematic variations between group means; but they are also, to some extent, influenced by random fluctuations, for we saw earlier in this chapter that the mean of a sample has random fluctuations, which are smaller than the fluctuations of the original observations. The quantities d, on the other hand, are entirely produced by fluctuations within the sample, not at all by systematic differences between the means.

Now, we shall transfer the X̿ above to the left of the equation, square both sides, and add. The result is:

$$\sum (X - \bar{\bar{X}})^2 = \sum x^2 = m \sum_g g^2 + \sum d^2$$

Here m is the number of observations per group, and the g under the summation sign means summation over groups. Summation without subscript means summation over all observations. Symbolically, we write this equation:

$$SS_T = SS_B + SS_W$$
The notation SS is to be read: sum of squares. Subscript T means "total," B "between groups," and W "within groups." The equation can be read: the total sum of squares equals the sum of squares within groups plus the sum of squares between groups. It should be remembered that SS_B not only contains systematic effects between groups, but also depends to some extent on random fluctuations within groups.
All the sums of squares are usually computed in analysis of variance. The computation can be accomplished without obtaining first all the values of x, d and g. In fact, we have the identities:
$$SS_T = \sum X^2 - \frac{\left(\sum X\right)^2}{N}$$

$$SS_W = \sum_g \left[\, \sum{}' X^2 - \frac{\left(\sum{}' X\right)^2}{m} \right]$$

$$SS_B = \frac{1}{m} \sum_g \left(\sum{}' X\right)^2 - \frac{\left(\sum X\right)^2}{N}$$

Here Σ′ means summation of quantities within a group.
As a check, we can use the fact that the sum of SS_W and SS_B must equal SS_T. The quantity (ΣX)²/N, common to many computations in analysis of variance, is called "the correction for the mean."
With the numbers in Table 18 we find, for the sums of squares:

$$SS_T = 4^2 + 1^2 + 8^2 + 3^2 + 10^2 + 5^2 + 3^2 + 6^2 + 2^2 + 9^2 + 8^2 + 7^2 - \frac{66^2}{12} = 95$$

$$SS_W = \left(4^2 + 1^2 + 8^2 + 3^2 - \frac{16^2}{4}\right) + \left(10^2 + 5^2 + 3^2 + 6^2 - \frac{24^2}{4}\right) + \left(2^2 + 9^2 + 8^2 + 7^2 - \frac{26^2}{4}\right) = 81$$

$$SS_B = \frac{1}{4}\left(16^2 + 24^2 + 26^2\right) - \frac{66^2}{12} = 14$$
Note that SS_T equals the sum of the other two sums of squares.

With each SS there is associated a number of degrees of freedom, abbreviated by df. The x's have N − 1 degrees of freedom, since there are N of them and they must add up to zero. Hence, N − 1 df are associated with SS_T.
In each group, there are m values of d. The sum of all the d's in each group equals zero, as seen in Table 18, where the d's are given in parentheses. In each group, the d's therefore have m − 1 degrees of freedom. Let there be p groups (three in the table). The total number of degrees of freedom associated with the d's, and therefore with SS_W, is p(m − 1). Finally, there are p values
of g, which also add up to zero. Therefore, the number of degrees of freedom associated with SS_B is p − 1. Note that the sum of the df's associated with SS_W and SS_B is pm − 1 = N − 1; in words, the df associated with SS_T equal the sum of the df associated with SS_W and SS_B. In Table 18 there are 11 df associated with the x's, 9 with the d's, and 2 with the g's.

Earlier we have seen that the ratio Σx²/(N − 1) is a good estimate
of the population variance. Obviously, this ratio is of the form of an SS divided by the corresponding df. Such ratios from now on will be defined as "mean squares" and abbreviated by MS in Table 18; values of MS are computed for all SS. In general, mean squares are optimum estimates of some population parameter. In order to see which, we make some assumptions about the character of the observations. We will assume that all observations in the population, not just the ones in the sample, can be represented by:

$$X - \bar{\bar{X}} = G + E$$

Here, G is a function only of the group or category; it is not to be confused with the g above, which also depends in part on the random variation within groups. The standard deviation of the G's will be denoted by σ_G. The quantities E are assumed to be random variables with a constant variance, σ². This variance is sometimes called the "error" variance. In words, the equation says that x is the sum of a random effect and a systematic effect.
Now, the mean square associated with SS_W is influenced only by the E's, and is, in fact, the optimum estimate of the error variance, σ². Also, the MS associated with the g's, MS_B, turns out to be an estimate of σ² + mσ_G². These quantities are also shown in Table 18. Now, if the systematic effect is important, that is, if σ_G is large, MS_B must be much larger than MS_W. In the example of Table 18, σ_G² obviously cannot amount to anything, since MS_B is actually less than MS_W. This is not surprising, since the numbers in Table 18 were selected from a table of random numbers. However, we often run into the situation where MS_B is considerably larger than MS_W. Then the question arises whether it is significantly larger. If it is, σ_G² must be significantly different from zero, and there must be some real, systematic difference between the groups. This test is made by
forming the ratio of mean squares, the larger in the numerator. Such ratios are called F, and their limiting values are tabulated as functions of the df's of numerator and denominator (see Table 19). If F is sufficiently large, the probability that it would arise by chance when σ_G² is zero becomes negligible, and we conclude that σ_G² differs significantly from zero.
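The bookkeeping of such a one-way analysis is easily scripted. A Python sketch (not part of the original text), run on the Table 18 data:

```python
def one_way_anova(groups):
    """Sums of squares, mean squares, and F for equal-sized groups,
    computed with the identities used for Table 18."""
    all_x = [x for g in groups for x in g]
    n, p, m = len(all_x), len(groups), len(groups[0])
    correction = sum(all_x) ** 2 / n                 # correction for the mean
    ss_t = sum(x * x for x in all_x) - correction
    ss_b = sum(sum(g) ** 2 for g in groups) / m - correction
    ss_w = ss_t - ss_b
    ms_b, ms_w = ss_b / (p - 1), ss_w / (p * (m - 1))
    return ss_t, ss_b, ss_w, ms_b / ms_w             # compare F with Table 19

groups = [[4, 1, 8, 3], [10, 5, 3, 6], [2, 9, 8, 7]]   # Table 18 data
ss_t, ss_b, ss_w, f = one_way_anova(groups)
print(ss_t, ss_b, ss_w, round(f, 2))   # 95.0 14.0 81.0 0.78; MS_B < MS_W,
                                       # so no evidence of systematic effects
```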
The technique of analysis of variance is applicable to much more complex situations than the one illustrated here. For example, it is not necessary for the number of observations in each group to be the same, and the number of groups may be much larger. In the case of two groups, the F-test gives the same result as the t-test, because t² = F when F is the ratio of the MS between groups to the MS within groups. For a more general and much more complete discussion of the analysis of variance, the reader is urged to refer to one of the numerous available textbooks in statistics. Although usually the basic computations for the analysis of variance are quite straightforward and described adequately in most texts, the problem of interpretation may become difficult for more complex situations, since it depends upon the purpose of the analysis and the assumptions made in setting up the experimental model. Before proceeding with a complicated analysis, it is advisable to talk to someone with experience in analysis of variance problems and experimental design.
It is in the field of experimental design that one of the greatest contributions has been made by the analysis of variance techniques. A simple application of these principles to a meteorological investigation is described below. In this experiment, three forecasters each made a series of two forecasts for six occasions. The first forecast for each occasion was made with incomplete data; that is, some observational data usually available to the meteorologist at forecast time were withheld. After this initial forecast was completed, additional data were supplied to the forecaster, and he made a revised forecast. One purpose of the experiment was to test whether the use of the additional data tended to improve the forecasts and, if so, by how much. The forecasts, made under controlled conditions, were then compared with the observations, and a score was derived for each forecast. These scores, and the analysis of variance which partitions the total SS according to the various factors involved in this experiment, are shown in Table 20.
TABLE 19
Limiting Values of F
5% (Roman Type) and 1% (Italics) Points for the Distribution of F*

[The table gives, for each combination of n1 degrees of freedom (for the greater mean square, across the top: 1, 2, 3, ..., 200, 500, infinity) and n2 degrees of freedom (for the lesser mean square, down the side: 1, 2, 3, ..., infinity), the value of F exceeded with probability 5% (roman type) and, beneath it, the value exceeded with probability 1% (italics).]

*This table is reproduced from G. W. Snedecor's "Statistical Methods" by permission of the author and his publishers. Calculated by G. W. Snedecor from Table VI of R. A. Fisher's "Statistical Methods for Research Workers."
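Where the printed table is not at hand, the same limiting values can be generated directly. A minimal sketch follows, assuming SciPy is available (scipy.stats is not part of the original text); the degree-of-freedom pairs are ones used in the examples of this chapter:

    from scipy.stats import f

    # Limiting values of F, as in Table 19: n1 = df for the greater mean
    # square, n2 = df for the lesser mean square.
    for n1, n2 in [(5, 27), (2, 27), (1, 18)]:
        f05 = f.ppf(0.95, n1, n2)   # the 5% point (roman type)
        f01 = f.ppf(0.99, n1, n2)   # the 1% point (italics)
        print(n1, n2, round(f05, 2), round(f01, 2))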
TABLE 20
Analysis of Variance, Complex Cases

Original Scores

[For each of the six periods (A-F), the table gives the score of each forecaster (I, II, III) for the forecast made with incomplete data (I) and for the revised forecast made with complete data (C). Period totals: 361, 177, 364, 255, 157, 218; grand total 1532.]

Auxiliary Sums

Data Amount    Forecaster I    II          III         Total
Incomplete     236 (6)         253 (6)     255 (6)     744 (18)
Complete       234 (6)         239 (6)     315 (6)     788 (18)
TOTAL          470 (12)        492 (12)    570 (12)    1532 (36)

(Figures in parentheses are numbers of observations.)

Analysis of Variance

Source             SS       df    MS        Estimates of       F
Total (T)          9038.9   35
Data Amount (D)    53.8     1     53.8      σ² + 18σD²         (.79)
Forecasters (F)    460.2    2     230.1     σ² + 12σF²         3.40
Periods (P)        6695.5   5     1339.1    σ² + 6σP²          19.75
Residual           1829.4   27    67.8      σ²
The value of SST is entered on the first line of the "analysis of variance" section of Table 20. We shall show next how the remaining sums of squares for this table are computed. The sum of squares between forecasts made using incomplete and complete data (SSD) can be determined from the section of "auxiliary sums" in Table 20. In this section the sums of particular groups are shown, as well as the numbers of observations. Note that the subscript D means "due to differences in available data." For example, the total score for the six forecasts made by forecaster I with incomplete data is 236. The total score for the 18 forecasts made with incomplete data is 744, as shown in the table. Likewise, the total score for the 12 forecasts made by forecaster I is 470, for II is 492, and for III is 570. The value of SSD is found from:

SSD = (744² + 788²)/18 − C = 53.8

where C is the correction for the mean, C = 1532²/36 = 65,195.1. The value for SSF (between forecasters) is found by:

SSF = (470² + 492² + 570²)/12 − C = 460.2
Finally, SSP (between periods) is:

SSP = (361² + 177² + 364² + 255² + 157² + 218²)/6 − C = 6695.5
Note that the figure in the denominator is, in each case, the number of observations that contribute to the quantities that are squared in the numerator. These three quantities are computed in a manner analogous to SSB in the simple first example.
In the last line of the SS column, we have entered a residual, obtained by subtracting the sum of the three last values of SS from the total SS.
Again, the MS column is computed from the SS column by dividing each SS by the proper degrees of freedom. For example, since there are six different periods, the mean score of which is fixed, the periods have five degrees of freedom. The df of the residual is obtained by subtracting the df of the main effects from the total df, 35.
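These sums of squares, mean squares, and F ratios can be verified with a short routine. A minimal sketch in Python (the code is only a check on the arithmetic; every total is quoted from Table 20, and the total SS is taken as given there rather than recomputed from the individual scores):

    totals_D = [744, 788]                      # by data amount (18 scores each)
    totals_F = [470, 492, 570]                 # by forecaster (12 scores each)
    totals_P = [361, 177, 364, 255, 157, 218]  # by period (6 scores each)
    N, grand = 36, 1532
    C = grand**2 / N                           # correction for the mean
    SS_T = 9038.9                              # total SS, from Table 20
    SS_D = sum(t**2 for t in totals_D) / 18 - C    # -> 53.8
    SS_F = sum(t**2 for t in totals_F) / 12 - C    # -> 460.2
    SS_P = sum(t**2 for t in totals_P) / 6 - C     # -> 6695.5
    SS_R = SS_T - SS_D - SS_F - SS_P               # residual -> 1829.4
    MS_R = SS_R / 27
    for ss, df in [(SS_D, 1), (SS_F, 2), (SS_P, 5)]:
        print(round((ss / df) / MS_R, 2))      # F ratios: 0.79, 3.40, 19.76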
Next, we must investigate what population parameter each MS estimates. In order to do this we make the following hypothetical model. Let x, the deviation of each score from the general average score, be given by:

x = D + F + P + E
Here, we assume that D depends on the data amount only, F on the forecasters only, and P is different for each period but independent of the other variables. E is again a random variable, representing the random errors.
Now, this expression for x is not at all general. In particular, we have no quantity which depends jointly on forecasters and periods. This means that we assume that there is no "interaction" between forecasters and periods. Such an interaction would mean that one forecaster is good in one period, another in another period. Also, there is no interaction between data amount and forecasters. That is, all forecasters are assumed to be affected in a similar manner by the data amount. These assumptions are not necessarily correct, and a more general model to test the interactions could be formulated, but this would lead to the discussion of questions which are beyond the scope of this book. Suffice it to say here that some additional tests, not described here, indicate that the interactions are not important, and that the model defined by the above equation is satisfactory for this example. Given the model, the various mean squares are estimates of the quantities given in the next column, where σD² is the variance of D, σF² the variance of F, and σP² the variance of P. Again, σ² is the error variance. The last column, headed F, gives the ratio of each other MS to the residual MS, the latter being an estimate of the error variance.
The most significant F ratio is that determined from the MS for the periods. The 1% limit for 5 and 27 df (Table 19) is 3.79, compared to the observed F of 19.75. This means that σP² is significantly different from zero.
In words, the different periods are of significantly different forecasting difficulty. To some extent, this could have been expected by inspection of the original table, for all forecasters did well in the first and third, and poorly in the second and fifth, periods.
Next, the F for the MS of the forecasters is almost exactly on the 5% limit for 2 and 27 df. This means that the chances are 1 in 20 that random sampling variations could produce differences in average scores as great as or greater than those observed in this sample if the forecasters were really equal in ability. On the basis of this result alone, we should probably be hesitant in deciding that there are significant differences between forecasters. Finally, σD² is certainly not significantly different from zero. The average difference between the scores with and without additional data is actually less than what would be expected by chance.
9. Summary.
In summary, sampling variations are extremely important, and are frequently overlooked by scientists. One always deals with samples and attempts to draw conclusions regarding a population. Much of statistics is devoted to finding ways of estimating the reliability of statements concerning the population from knowledge of the sample. Most of these techniques are based on the assumption that the items in the sample are drawn "at random" and are independent of each other. In meteorology, these assumptions are usually not satisfied, and conclusions from sampling theory are often misleading. In many cases tests on independent samples are required in order to ascertain the reliability of a result obtained from a given sample.
Suggestions for Additional Reading
On Analysis of Variance: Bennett, Carl A., and Franklin, Norman L., Statistical Analysis in Chemistry and the Chemical Industry, John Wiley and Sons, Inc., New York, 1954. Also: Snedecor, George Waddel, Statistical Methods Applied to Experiments in Agriculture and Biology, Fifth Edition, Iowa State College Press, Ames, Iowa, 1956. Also: Eisenhart, C., "The Assumptions Underlying the Analysis of Variance," Biometrics, Vol. 3, 1947.
On Sampling: Fisher, R. A., "Uncertain Inference," Proceedings American Academy of Sciences, Vol. 71, 1936.
On Non-parametric Methods: Siegel, Sidney, Non-parametric Methods for the Behavioral Sciences, McGraw-Hill Book Co., Inc., New York, 1956. Also: Fraser, D. A. S., Non-parametric Methods in Statistics, John Wiley and Sons, Inc., New York, 1957.
CHAPTER IV

Analysis of the Relationship Between Two Variates
1. Frequency Distribution of Two Variates. In Chapter I we discussed the two-way frequency distribution used to summarize a vector quantity. In that chapter (see Table 3), wind was the meteorological variable; the two-way distribution was used since we were interested in summarizing both the direction and speed characteristics of the wind. In effect, we thus dealt with not one, but two variates: wind direction and wind speed. And these variates were related in such a manner that for each observed wind direction there was also an observed wind speed. Now suppose we were organizing data on two variates, say rainfall and temperature, that were not as closely related as wind direction and wind speed but still occurred in such a way that for each rainfall observation there corresponded a temperature observation. (Clearly, it would be through analyzing such data that we might determine if and how rainfall varies with temperature.) It is evident that, just as in the example of Table 3, a two-way or "joint" frequency distribution is a means of organizing the data. How do we proceed? Let the two variates be X1 and X2; the subscripts 1 and 2 after a symbol, for example s1 and s2, refer to X1 and X2 respectively. First, class intervals are defined for X1 and X2, according to the principles outlined in Chapter I. The class intervals of X1 are written across the top, with the smallest to the left; those of X2 are written along the left margin, with the smallest values at the bottom. Again, horizontal and vertical lines separating the various classes form a large number of boxes in which the individual observations can be tallied and counted. The number of tallies in each box, the frequency, is usually entered at the center of the box, and the individual tallies are erased after they have been checked. If both variables are non-numerical, a two-way frequency distribution is called a "contingency table."
In order to make the distribution easier to visualize, the frequencies can be isoplethed. So far nothing has been said as to whether the classes are of equal size, or whether one of the variates is given qualitatively. In principle, double frequency distributions can be constructed for any of these types of variates; however, the isopleths are misleading in the case of class intervals of varying size, because the frequencies in large classes are relatively too high. Still, such a diagram is a convenient way of summarizing the information. Also, short methods of computing the mean and standard deviation can be used only when class intervals are equal.
When the majority of the frequencies are situated near a line from the lower left to the upper right, that is, when large values of X1 occur with large values of X2, the two variables are said to be positively correlated; if large values of X1 are associated with small values of X2, the variates are said to be negatively correlated. The correlation is perfect when all observations fall on a straight line whose slope is neither 0 nor infinity; no correlation is indicated when the isopleths of frequency form, roughly, ellipses with principal axes horizontal and vertical. When the variates are slightly correlated, the isopleths should form ellipses with slanting axes. The greater the ratio of major to minor axis, the better the correlation. The concept of correlation will be treated more quantitatively later in this chapter.
As before, the frequencies in a joint frequency distribution may be divided by the total number of cases, N, and converted into empirical probabilities. Such a diagram can be used for estimating "joint probabilities." As a practical example, a heating engineer may require the probability of the occurrence of either a temperature less than 0°F with any wind speed, or a temperature less than 5°F and a wind faster than 15 mph, or a temperature less than 10°F and a wind faster than 30 mph. The answer is found from the percentage frequency (PF) diagram by adding the probabilities in the corresponding boxes. Table 21 shows a slight positive correlation (low temperatures are associated with light winds), probably because radiative cooling is most effective with little wind.
TABLE 21
Hypothetical Joint Probability Distribution, Wind Speed and Temperature in Winter (Probability in %)

[The body of the table gives the percentage probability in each box. The rows are temperature classes (degrees F), from 36-40 at the top down to 6-10; the columns are wind-speed classes (mph): 1-5, 6-10, 11-15, 16-20, 21-25, 26-30, 31-35, and 36 and over.]
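The engineer's probability is simply the sum of the entries in the qualifying boxes. A minimal sketch in Python, with made-up entries standing in for the cells of Table 21 (NumPy assumed available; none of these numbers come from the original table):

    import numpy as np

    # Hypothetical joint probabilities (percent). Rows: temperature classes
    # below 0F, 0-4F, 5-9F, 10-14F; columns: wind-speed classes 0-15,
    # 16-30, over 30 mph. Entries are illustrative only.
    P = np.array([[1, 1, 0],
                  [2, 1, 1],
                  [3, 2, 1],
                  [4, 2, 1]])

    # P(T < 0, any wind) + P(T < 5, wind > 15) + P(T < 10, wind > 30):
    prob = P[0, :].sum() + P[1, 1:].sum() + P[2, 2:].sum()
    print(prob, "percent")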
Two numerical variables can of course also be represented by a simple dot diagram rather than by a two-way frequency distribution. That is, given a table of two simultaneously observed variables X1 and X2, we can lay out a coordinate system with X1 as abscissa and X2 as ordinate. Then, we simply plot a point for each combination of X1 and X2. Again, we can see at a glance whether the variables are positively or negatively correlated by whether the points generally fall along a line from lower left to upper right or vice versa. The advantage of a point graph over a two-way frequency distribution is that it requires relatively few observations before a relation can be recognized; as a disadvantage, in the case of many observations, the estimate of joint probabilities from point graphs is rather tedious, as it involves counting a large number of individual points.
2. Regression (Numerical Variates). Usually, it is desirable to consider one variable, say X1, as the independent variable, and another, X2, as the dependent variable. For meteorological situations this might mean that we know X1 and would like to forecast X2. We then refer to X1 as the predictor (that which predicts) and X2 as the predictand (that which is predicted). The problem is to find that relationship between X1 and X2 which yields values of X2 with the least error. The simplest case is that in which the two-way frequency distribution or dot diagram indicates a linear relationship between X1 and X2, and therefore an equation of the form X2 = a + bX1 will fit the data well. Such a straight line may be drawn by eye in such a way as to pass as closely as possible through the means of the different columns (if X1 is plotted horizontally and X2 vertically); or a vertical line might be drawn through the mean of X1, dividing the diagram into two equal parts. The centers of gravity of these areas can then be connected.
The most satisfactory method of determining the best forecast line, even though a little more cumbersome than the two methods outlined above, consists in the application of the theory of least squares to the determination of the coefficients in the linear equation. A straight line fitted by this method is known as a line of regression. By definition, the sum of the squares of the deviations of the individual values of X2 from those predicted by the line is a minimum. The reasons why this method, at least theoretically, is preferable to other methods are as follows:
1. It is the most probable linear relationship between X1 and X2 if the deviations of the individual observations from the line, in the X2 direction, are assumed to be normally distributed, and the scatter about the line is the same at all values of X1. (This statement is proved in the Appendix.)
2. The scatter about the line of regression is less than that about any other straight line.
Let the line of regression used for the estimate of X2 from X1 (technically, the line of regression of X2 on X1) be given by:

X2' = a + b2.1X1

(Here X2' denotes values of X2 given by the line of regression, whereas X2 stands for actual observations of the predictand.) By definition, the quantity Q = Σ(X2 − X2')² must be a minimum. This leads to the two conditions ∂Q/∂a = 0 and ∂Q/∂b2.1 = 0. These conditions yield the two so-called "normal" equations for a and b2.1:

aN + b2.1ΣX1 = ΣX2
aΣX1 + b2.1ΣX1² = ΣX1X2

Hence,

a = X̄2 − b2.1X̄1   and   b2.1 = (NΣX1X2 − ΣX1ΣX2) / (NΣX1² − (ΣX1)²)

The line of regression can thus be written:

(X2' − X̄2) = [(NΣX1X2 − ΣX1ΣX2) / (NΣX1² − (ΣX1)²)] (X1 − X̄1)
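A minimal sketch of this fit in Python, with illustrative data (NumPy assumed available; the numbers are not from the text):

    import numpy as np

    # Illustrative data; X1 is the predictor, X2 the predictand.
    X1 = np.array([3.0, 5.0, 7.0, 9.0, 11.0])
    X2 = np.array([8.0, 7.0, 5.0, 4.0, 2.0])
    N = len(X1)

    # Slope and intercept from the normal equations.
    b21 = (N * (X1 * X2).sum() - X1.sum() * X2.sum()) \
        / (N * (X1 ** 2).sum() - X1.sum() ** 2)
    a = X2.mean() - b21 * X1.mean()
    print(a, b21)   # the fitted line passes through (X1bar, X2bar)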
Note that the line of regression passes through the point (X̄1, X̄2), a most important property of regression lines. The slope of the line of regression can be computed directly from the above formula. Perhaps a slightly more convenient form for machine computation is:

b2.1 = (ΣX1X2/N − X̄1X̄2) / (ΣX1²/N − X̄1²)
Here and in other formulae a bar over a quantity means an average over that quantity, that is, the sum of all such quantities divided by the total number. An important property of b2.1 is that the answer will be the same no matter what the origin of X1 or X2. In many practical problems, the number of places carried can be greatly reduced by subtracting a constant from both X1 and X2. These constants are usually chosen as round numbers, such that the remainders are always positive. For example, if X1 is the height of the 700-mb surface and ranges from 9700 ft to 10,100 ft, the constant 9700 ft might be subtracted from all the heights before the computation is started. If the mean of X1 is subtracted from all the X1's, and the mean of X2 from all the X2's, the equation for b2.1 takes the simple form:
b2.1 = Σx1x2 / Σx1²

where x1 = X1 − X̄1 and x2 = X2 − X̄2. When a computing machine is not available, a short method can be devised based on the numbers on a two-way scatter diagram. This method is less accurate, because it is assumed that all the observations counted in a given box fall exactly at the center of that box. The formula for the computation of b2.1 by the short method is:

b2.1 = [Σf12d1d2 − (Σf1d1)(Σf2d2)/N] / [Σf1d1² − (Σf1d1)²/N]
Here, f1 and f2 are the numbers of observations in the classes of X1 and X2, respectively, and f12 is the number of observations in each box. The procedure used in the evaluation of this formula is illustrated in Table 22.
First, a two-way frequency distribution is constructed with constant class intervals for each variate. Next, rows and columns are totaled, yielding f2 and f1 respectively. These are followed by d2 and d1, which are again positive or negative integers measuring the number of classes a given interval lies above the arbitrary origin. Note that the negative values of d2 are at the bottom of the table, the positive values at the top. The row and column headed by d2 and d1 are then followed by rows and columns of f2d2, f2d2², f1d1, and f1d1². All these rows and columns are totaled. The column f2d2² is not needed for the computation of b2.1, but is required for the determination of s2, which is normally computed along with the line of regression.
In the lower left corner of each box is written the product d1d2 for the particular box. These products are multiplied by the values of f12 in the same box and entered in the upper right-hand corners. The resulting values of f12d1d2 are totaled along the rows and the sums added. The other sums occurring in the formula for b2.1 are found directly from the rows and columns at the end of the table. Table 22 shows the line of regression determined by the short method. The quantity b2.1 does not come out in absolute units, but in units representing the ratio of X2 to X1. For example, if X1 is pressure and X2 temperature, b2.1 may be in degrees per mb. To construct the line of regression in Table 22, one may proceed as follows: first, find the point representing X̄1, X̄2 (shown by a cross). Since the slope is −.63, proceed from this mean point 10 units to the right and 6.3 units down, always using the scales of the table. This locates another point. The straight line passing through both crosses is the line of regression.
In meteorology there is usually no doubt which of the two variables is the predictor and which the predictand. In fact, normally the predictor is based on an observation or observations made at an earlier time than the predictand. It is, therefore, of only academic interest to study the line of regression of X1 on X2, a line which would be used to estimate X1 from X2 and is represented by the equation:

X1' = a' + b1.2X2, where b1.2 = Σx1x2 / Σx2²
TABLE 22
Computation of Line of Regression and of Linear Correlation Coefficient by the Short Formula

[The body of the table is a two-way frequency distribution of X2 (class intervals of width 5, ranging from −10 to +19) against X1 (class intervals of width 4, ranging from 0 to 23), with the coded deviations d1 and d2, the products f12d1d2 in each box, and the rows and columns f1, f1d1, f1d1², f2, f2d2, f2d2² as described in the text. The totals are:]

N = 51,  Σf1d1 = 14,  Σf1d1² = 78,  Σf2d2 = 35,  Σf2d2² = 101,  Σf12d1d2 = −28

b2.1 = {[−28 − (14)(35)/51] / [78 − (14)²/51]} × 5/4 = −.63

s1² = 16[78/51 − (14/51)²] = 23.4,  s1 = 4.84
s2² = 25[101/51 − (35/51)²] = 37.7,  s2 = 6.14
s2.1 = √(s2² − b2.1²s1²) = 5.3

r12 = b2.1 s1/s2 = −.50

also, r12 = [−28 − (14)(35)/51] / √{[78 − (14)²/51][101 − (35)²/51]} = −.50
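The short-method arithmetic of Table 22 can be verified from the totals shown there; a minimal sketch in Python (only the totals are used, since the individual boxes are not reproduced here):

    # Totals from Table 22: N = 51; class intervals 4 (X1) and 5 (X2).
    N = 51
    S12, S1, S2 = -28, 14, 35        # sums of f12*d1*d2, f1*d1, f2*d2
    Q1, Q2 = 78, 101                 # sums of f1*d1**2, f2*d2**2
    c1, c2 = 4, 5

    b_coded = (S12 - S1 * S2 / N) / (Q1 - S1**2 / N)
    b21 = b_coded * c2 / c1                        # -> about -0.63
    s1 = c1 * (Q1 / N - (S1 / N) ** 2) ** 0.5      # -> about 4.8 (table: 4.84)
    s2 = c2 * (Q2 / N - (S2 / N) ** 2) ** 0.5      # -> about 6.14
    r12 = b21 * s1 / s2                            # -> about -0.50
    print(round(b21, 2), round(s1, 2), round(s2, 2), round(r12, 2))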
If this equation, which would be used for estimating X1 from X2, is written in the form:

(X2 − X̄2) = (1/b1.2)(X1' − X̄1)
and compared with the equation given previously for estimating X2 from X1, namely,

(X2' − X̄2) = b2.1(X1 − X̄1)

it will be found that b2.1 and 1/b1.2 are not equal, except in the case where X1 and X2 are perfectly correlated. In fact, 1/b1.2 is larger than b2.1, the slope of X2 on X1. The formulae for the slopes work in such a way that the deviations of the variables from their means are forecast conservatively.
Occasionally it is not required to forecast one variable from another, but to find the best linear relationship between the variables. In these cases it is assumed that there should be an exact linear relationship between the two variables, but observational errors and other uncertainties produce considerable scatter in both variables. The line of best fit lies somewhere between the two lines of regression, depending on the relative errors of the two variables, which are usually not known.
3. Significance of the Regression Coefficients.
The quantities a and b2.1 are statistics, derived from a sample, not parameters of a population. In practice, one would like to know how representative a line of regression derived from a given sample may be for a future sample. Statistical theory shows how good the quantities a and b2.1 are for estimating the corresponding population parameters, α and β. In general, a and b2.1 are the more reliable, the more observations there are, and the smaller the scatter of the points about the line of regression. Unfortunately, statistical theory derives its results under the assumption that there is no relation between successive pairs of observations.
Such an assumption is usually not justified with meteorological data. Weather tends to run in regimes; that is, certain relationships may hold for a while and other relationships hold at another time. For example, a period may be characterized by warm lows and cold highs, giving a good relation between pressure and temperature with a negative slope. In another period, we might find warm highs and cold lows, yielding a positive slope of the regression line. From statistical theory, one might have concluded from the first coefficient that the "population" had a negative coefficient, β. This conclusion might have been quite wrong, because the condition of sampling at random from a population was not justified. To summarize, the regression coefficients vary from one sample to another, and their variation cannot be predicted by normal statistical theory; it is usually advisable to study several samples and see how much the regression coefficients vary between them.
4. Scatter About a Line of Regression. When X2 is to be forecast from a line of regression involving X1, it is not sufficient to know only the form of the equation; it is desirable to know also how good a forecast such an equation can produce.
The "badness of fit" can be measured by the scatter, s2.1, defined by: S2,1
=
where lhe summation extends over the values of the sample analyzed. The scatter can be computed from the formula: S2.1
= Vs22 - b2.1 2 St 2
which is derived by substituting the equation for the line of regression into the expression defining the scatter, as shown in the Appendix. If lines are drawn parallel to the line of regression at distances equal to s2.1 above and below the line, measured in the X2 direction, about 68% of the observations should fall between the two lines (Table 22). Thus, the scatter is related to the line of regression in an analysis of two variates as the standard deviation is related to the mean in an analysis of one variate, assuming that in both cases the variates are normally distributed.
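For the example of Table 22, the scatter follows directly from the quantities already computed there; a brief check in Python:

    # Scatter about the regression line of Table 22, from the values found
    # there: s2**2 = 37.7, s1**2 = 23.4, b21 = -0.63.
    s2_sq, s1_sq, b21 = 37.7, 23.4, -0.63
    s21 = (s2_sq - b21**2 * s1_sq) ** 0.5
    print(round(s21, 1))   # about 5.3; roughly 68% of points lie within +/- s21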
The goodness of fit of the regression line can be tested by means of the analysis of variance. The variance to be accounted for is s2² for the variable X2. The appropriate SS, which is Ns2², is entered in the SS column of Table 23. The df is N − 1, since the mean X̄2 has been computed from the data, using up one df. The SS accounted for by b2.1, the slope of the regression line, is Nb2.1²s1², and one df has been used up in determining this slope. The residual SS remaining for the deviations from the fitted line is, by subtraction, Ns2² − Nb2.1²s1² = Ns2.1², while the df is N − 2, since two df have been used in determining the regression equation. If there were only two observations, the regression line would pass through the two points and there would be no residuals.
TABLE 23
Analysis of Variance Table for Testing the Significance of a Regression Relationship

Source                          SS           df      MS
Total                           Ns2²         N−1     Ns2²/(N−1)
Slope of regression line        Nb2.1²s1²    1       Nb2.1²s1²
Deviations from regression      Ns2.1²       N−2     Ns2.1²/(N−2)

F = (N−2)b2.1²s1²/s2.1², or its reciprocal
The values for the MS column are found by dividing the SS by the df, and F is the ratio of the greater MS to the smaller MS. The significance level is determined from Table 19. However, such a test is valid only if the N observation pairs are independent of each other. Note that the "residual" SS is proportional to the scatter, s2.1², and the SS due to the regression to b2.1².
From the magnitude of the scatter, it would be tempting to infer the accuracy of forecasts made by using the regression equation for future samples. Again, this is possible by use of
statistical theory, provided the sample from which the regression equation was derived, as well as the future sample, are "random" samples derived from the same population. That is, the assumption is that the "regime" did not change between samples, and that tomorrow's weather is independent of today's weather. Even under statistically ideal conditions, the result is not simple; since a and b2.1 themselves are subject to sampling error, the error of a prediction of X2 would not be uniform over the whole range of X1. At the extremes of X1, the error would be largest because of the large influence of a possibly incorrect slope of the line; in the center, the error would be least. In most meteorological practice, the safest way to ascertain the standard error of estimate of a given regression equation is to test it on independent samples. Of course, in a very qualitative way, the larger the scatter, the less reliable will be a forecast based on a given regression line.
5. Linear Correlation. We have seen how the relationships between two variables can be expressed by linear regression. However, the regression coefficients and the measure of the "badness of fit" depend upon the units of measurement used, and it would be desirable to have some measure of association between two variables that is nondimensional, i.e., does not depend upon any arbitrary choice of units by which the original variables were measured. The linear correlation coefficient r12 does just this, and will be defined by:

r12² = 1 − s2.1²/s2²
Since s2.1² is at most equal to s2², the square of the correlation coefficient, r12², can vary from 0 to 1. If s2.1 vanishes, r12² = 1, and the correlation is perfect. r12² will vanish when s2 = s2.1, which is possible only when there is no linear relation between the two variates. As defined earlier, s2² measures the "variance" of variate X2. Similarly, s2.1² can be considered as the portion of the variance which is not explained by the linear association between the variates X2 and X1. Thus, we may call the quantity s2² − s2.1² the "variance of X2 explained by its association with X1." Therefore, r12², the square of the correlation coefficient, measures the ratio of the variance of X2, explained by the linear association of X2 with X1, to the total variance of X2.
We see from the formula used in the computation of s2.1 that the square of the correlation coefficient is also given by:

r12² = b2.1²s1²/s2²

We now define the coefficient of linear correlation by:

r12 = b2.1 s1/s2
Even though the formula for the correlation coefficient would permit either a plus or a minus sign in this expression, this sign is appropriate, since we know that the sign of b2.1 is the sign of the correlation between X2 and X1.
.
Thus, n2, the coefficient of linear correlation is. the square root . ·. of the ratio of explained to total variance of X2, and has the sign. of the slope of the lines of regression. Since r1~ can vary from 0 to 1, r12 may vary from -1 to'l. Thus,'for example, a linear. , .. correlation coefficient of -.70 means that\ the: correlation 'is negative (large values of X1 aSsociated with smalt values of X2), · and that a little less than half the variance of X2 is accotinte~Jor L by the variation of Xt. , ... ,-~ -~' ,., -.i ,. ,';,.,,,, ,.,,•. · \\'
·~
I
'
1,
•
•
,
' •' I ,
•
••
, • • '
,
• ~
The correlation coefficient r12 is symmetrical in X1 and X2, as can be seen when b2.1 is eliminated from the formula for r12 in terms of the original values of the variates X1 and X2. In that case the "product moment" formula for the coefficient of linear correlation results:

r12 = (ΣX1X2/N − X̄1X̄2) / (s1s2)
This symmetry indicates that the coefficient of correlation does not distinguish between the dependent and the independent variable. It permits conclusions as to the existence of a linear relationship between two variables, but does not indicate which variable causes the variation of the other. In fact, a relation between two variables may exist because the variation of both variables is influenced by the variation of a third variable. The numerator in the product moment formula has many uses of its own; it is called the "covariance" between X2 and X1. The coefficient of correlation is usually computed by computing
machine through a transformation of the product moment formula:

r12 = [ΣX1X2 − (ΣX1)(ΣX2)/N] / √{[ΣX1² − (ΣX1)²/N][ΣX2² − (ΣX2)²/N]}

Again, the result is not affected if constants are first subtracted from both variables. Also, if a two-way frequency table is available, a short method can be used:

r12 = [Σf12d1d2 − (Σf1d1)(Σf2d2)/N] / √{[Σf1d1² − (Σf1d1)²/N][Σf2d2² − (Σf2d2)²/N]}
The evaluation of the various terms in this equation was discussed already in connection with the computation of the slope of the lines of regression. Of course, once the slope of the line of regression has been determined, r12 can be found by a much less laborious procedure, since r12 = b2.1 s1/s2.
6. Significance of Linear Correlation Coefficients. When small samples of meteorological variables are correlated, a correlation may be found accidentally where really no correlation exists. In order to be reasonably certain that an observed correlation is real, one usually makes the null hypothesis that the correlation between the two populations is zero. One then determines the probability that the observed correlation between the samples could be due to accidental sampling fluctuations. Under the assumption that the populations are normal with zero correlation and pairs of data selected at random, it is possible to determine the frequency distribution of the correlation coefficients of all possible samples containing N pairs of observations. When N is large, this distribution is approximately normal, with a mean of zero and a standard deviation, σr, equal to 1/√(N−1).
Hence, if a correlation coefficient exceeds 2.6σr in absolute value, the probability of its originating from uncorrelated populations is less than 1%, and it may be regarded as significant. For example, a correlation coefficient of 0.43 is just at the boundary of significance by the 1% criterion for 38 pairs. The formula for σr, even for large N, is correct only when the correlation between the populations is near zero. If the population correlation is large, the distribution of sample correlation
coefficients about the population coefficient is not normal but definitely asymmetrical. For example, if the populations are correlated with a coefficient of 0.90, no sample correlation can exceed 1.0, but samples with correlations as low as 0.70 may occasionally be found. For this reason, the interpretation of σr as the standard error of the correlation coefficient, or of 0.67σr as the probable error of the correlation coefficient, has no clear meaning. Whereas the distribution of sample correlation coefficients is not normal when the population coefficient differs from zero, the distribution of

z = ½[ln (1 + r) − ln (1 − r)]

is normal. The symbol r in this formula is the same as r12 elsewhere. This transformation can be used for the purpose of studying the sampling variation of correlation coefficients in general, for the standard error of z is practically independent of the population correlation and is given by 1/√(N−3). When two or more correlation coefficients have been determined from different samples, z can be used to combine the several values or to test whether one correlation differs significantly from another.
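A minimal sketch of both significance aids in Python (the coefficients and sample sizes in the z comparison are illustrative, not taken from the text):

    import math

    # sigma_r applies only when the population correlation is near zero;
    # the z-transform (standard error 1/sqrt(N - 3)) handles the general case.
    def sigma_r(N):
        return 1.0 / math.sqrt(N - 1)

    def fisher_z(r):
        return 0.5 * (math.log(1 + r) - math.log(1 - r))

    print(round(2.6 * sigma_r(38), 2))           # 1% bound for 38 pairs: 0.43

    # Comparing two sample coefficients (illustrative values):
    z1, z2 = fisher_z(0.60), fisher_z(0.30)
    se = math.sqrt(1 / (50 - 3) + 1 / (40 - 3))  # samples of 50 and 40 pairs
    print(round((z1 - z2) / se, 2))              # normal deviate, about 1.75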
For testing the significance of an individual correlation coefficient, the method of the analysis of variance can also be used. Table 24 shows how the values are entered and the test made. Note that the value of F in this table can be shown to be the same as that in Table 23. If the total variance to be accounted for is considered as unity, the portion accounted for by the linear correlation is r², and the residual remaining is 1 − r².
TABLE 24
Analysis of Variance for Testing the Significance of the Correlation Coefficient (r used here for r12)

Source         SS*      df      MS*
Total          1        N−1
Correlation    r²       1       r²
Residual       1−r²     N−2     (1−r²)/(N−2)

F = r²(N−2)/(1−r²), or its reciprocal

*Strictly, the SS and MS columns should be multiplied by Ns2², but this factor is often omitted because it cancels out in F.
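The F test of Table 24 reduces to a one-line computation; a minimal sketch in Python, reproducing the two worked examples that follow:

    # F for a correlation r from N pairs, as in Table 24.
    def F_for_r(r, N):
        return r**2 * (N - 2) / (1.0 - r**2)

    print(round(F_for_r(-0.70, 20), 2))   # 17.29, highly significant
    F = F_for_r(0.20, 20)                 # 0.75 < 1, so take the reciprocal
    print(round(1.0 / F, 2))              # 1.33, clearly not significant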
For example, if a coefficient of −0.70 is obtained from 20 pairs of observations, we have

F = 0.70²(18)/(1 − 0.70²) = 17.294

According to Table 19, this is highly significant. On the other hand, a correlation of 0.20 from 20 pairs gives

F = 0.20²(18)/(1 − 0.20²) = .750

Since this is less than unity, we must take the reciprocal 1/.750 = 1.33 and look this up in the F table with 1 df for the smaller MS and 18 df for the greater MS. According to Table 19, this is clearly not significant. We can expect a correlation between −0.20 and 0.20 from 20 pairs of observations to occur quite often due to chance alone.

The use of the analysis of variance procedure to test the significance of a correlation is recommended because it can easily be generalized to more complex situations. However, as will be seen presently, there are several reasons why application of standard statistical theory to meteorological correlation coefficients often leads to incorrect conclusions. Statements regarding the "true" or population correlation based on a definite correlation in a sample can be especially misleading. In the first place, statements concerning significance depend on the possibility of defining the populations. To some extent, this depends on the existence of a "permanent" relation between the variates. For example, the vertical pressure gradient and density have a "permanent relation"; if both are measured separately, errors of measurement may easily produce scattering and reduce the correlation coefficient. The double scatter diagram in this case will not change much when more and more cases are added to the original sample. Often, with meteorological correlations, there is no such permanent relationship. For example, if we correlate surface pressure with surface temperature for a period of pronounced cold outbreaks, we will get a negative correlation coefficient which, according to the criteria above, may be significant. Yet, as we add more data, the regime changes. The joint probability distribution does not approach a fixed form; the correlation
coefficient may actually go to zero or become positive. The significance statements were meaningless because the sample could not be considered as a random sample. One should, therefore, never be misled by the statement that, for example, a correlation coefficient is significant at the 1% level. This statement is based on the assumption that additional data will have the same characteristics as the information used in the sample. Correlation coefficients themselves show trends and oscillations in meteorological data which are as yet unpredictable.
A second difficulty is this: since successive data are usually positively correlated, we have effectively less information from a given sample than statistical theory assumes we do. For example, we may have temperatures for the first 30 days of January. Since the temperature on a given day depends to some extent on the temperature the day before, and even the temperature the day before that, we do not deal with 30 independent numbers. Approximately, each observation is independent of the observation three days previously; therefore we actually have about 10 independent observations. This means that in the theoretical formulae for σr we should replace N by approximately N/3. This will make apparently significant correlations less significant; for example, the coefficient of .43 found from 38 pairs of data will no longer be significant, since we have only about 13 independent bits of information.
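A rough adjustment along these lines, in Python (the factor of three is the approximation suggested above):

    import math

    # Replace N by the effective number of independent observations, N/3.
    N = 38
    N_eff = round(N / 3)                   # about 13
    bound = 2.6 / math.sqrt(N_eff - 1)     # 1% bound with the reduced sample
    print(N_eff, round(bound, 2))          # 13, 0.75: a coefficient of .43
                                           # is no longer significant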
A third difficulty in judging the significance of correlation coefficients has to do with the fact that one often computes more than one coefficient and only picks out the larger ones for testing. For example, one correlates sunspots with daily temperature. Since nobody knows how long it takes sunspots to influence temperature, correlations are computed with lags from 1 to 10 days. Suppose that all the coefficients are small, except that the coefficient for a lag of 4 days, computed from 102 independent pairs (only every third day was used), is .3. In this case, σr is .1, and the observed coefficient differs by 3σr from zero. Hence the probability that a single correlation of .3 will be observed when there really is no correlation is less than 1%. But this is really no longer the question. The question is: what is the probability of finding one coefficient in 10 which will be as large as .3 when there really is no correlation? This probability is about ten times as high, and hence the possibility of there really being no correlation can no longer be rejected. Also, research workers tend to report only
their largest coefficients; hence, again, one cannot use the simple computations of the probability that a single correlation coefficient reaches a certain size if the populations were uncorrelated. In summary, significance levels of correlation coefficients reported in the literature must be considered with great caution, particularly when there is no clear physical basis for the relationship. Finally, if a correlation coefficient is "significant" in the statistical sense, the relation implied may not be particularly useful in estimating X2 from X1. Consider, for example, the situation where a correlation is 0.5, which is apparently half way between no correlation and perfect correlation. Now suppose that the standard deviation of X2 (which can be thought of as the scatter about the mean) is 10 units. The scatter about the line of regression is given by s2.1 = s2√(1 − r12²). This comes out as 8.7 units. In a sense, then, the relationship with a correlation of 0.5 has reduced the scatter only by about 13%.
7. Curvilinear Regression. Occasionally a joint frequency distribution indicates that the relation between the two variates is definitely not linear. In that case the line of regression is not the best relation between the variates, and the coefficient of linear correlation is lower than the degree of relationship between the variates might lead one to expect. In Table 25 a double scatter diagram is shown where the linear coefficient of correlation cannot be used to describe the relationship adequately. There are three ways in which a valid relationship between the variates might be discovered: the general form of the relation might be assumed and the coefficients determined by least squares; or horizontal lines may be drawn through the mean in each column and the resultant step function used to express the relation; finally, a line might be drawn by eye. The first of these methods is time-consuming, but preferable when there is a theoretical reason to expect a certain form of the relationship. Also, especially when the curvature is slight, a parabolic relation of the form X2' = AX1² + BX1 + C generally fits well. However, this technique is not so commonly used with two-way frequency distributions as with time series, and will be discussed
TABLE 25
Relation between Wind Direction and Temperature at State College, Pa., January 1943-1952*

[Two-way frequency distribution of temperature, in classes of 10 Degrees F from -10 - -1 to 50 - 59, against wind direction, 285 observations in all.]

Wind Directions . . . . . . . . .    SW-SSE    SE-ENE    NE-NNW    NW-WSW
Mean Temperatures . . . . . . .       30.0      25.0      24.5      25.0

Variance about Means (s2.1²) . . . . . 137.3 deg.²
Total Variance (s2²) . . . . . . . . . 143.1 deg.²
Correlation Ratio Squared (η2.1²) . .   .041
(Percentage of variance accounted for is 4.1%)

*The step function indicates means of columns.
in that connection. The second procedure is much simpler, but the result less satisfying. It lends itself, however, to the definition of an easily computed correlation ratio. The correlation ratio η is defined by:
η2.1² = 1 − s2.1²/s2²
Here s2² measures the total variance of X2, whereas s2.1² measures the variance about the step function. This can be computed by first determining the variance in each column, and then computing the weighted mean of these variances, giving each column variance a weight equal to the number of observations in the column. Just as r12² measures the fraction of variance of X2 accounted for by the linear relationship between X2 and X1, so η2.1² measures the fraction of variance of X2 accounted for by X1
through the step function relationship. In Table 25 the square of the correlation ratio is only .041. This means that only 4.1% of the temperature variance is accounted for by the wind direction; the remaining 95.9% is due to other causes. Although the coefficient is very small in this case, the relation is probably real, since the variation of mean temperature with wind direction indicated in Table 25 agrees with synoptic experience and physical reasoning. The relationship between the correlation ratio and the analysis of variance should be noticed here. Using the earlier notation for the total sum of squares (SST), the between-column sum of squares (SSB), and SSW for the residuals within groups, the correlation ratio is found from:
η² = (SST − SSW)/SST = SSB/SST
There are 285 observations in the example, which means a total of 284 df for the analysis of variance test. There are four columns, with 3 df between columns, leaving 281 df for the variation within columns. The MS for between columns is 1661/3 = 553, and the MS within is 136. The ratio F = 553/136 = 4.07 is significant at the 1% level according to Table 19. Of course, this assumes that the 285 observations are independent. If the observations were made close to one another (either in space or time), the effective df is less and the conclusion of significance may be invalid.
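The correlation ratio and the F ratio can be recomputed from the summary values quoted for Table 25; a minimal sketch (the small difference from the printed F = 4.07 reflects rounding in the printed sums of squares):

    # Correlation ratio and one-way analysis of variance from the
    # summary values of Table 25.
    n_obs, n_groups = 285, 4
    var_total, var_within = 143.1, 137.3            # deg^2
    eta_sq = (var_total - var_within) / var_total   # about .041

    ss_between = n_obs * (var_total - var_within)
    ss_within = n_obs * var_within
    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_obs - n_groups)
    print(round(eta_sq, 3), round(ms_between / ms_within, 2))  # 0.041, about 4.0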
It is interesting to note that the method of "column mean regression" just described requires only that the dependent variate is in numerical form. The independent variable may be known qualitatively only.
8. Correlation of a Numerical Variate with a Dichotomous Variate.

The word dichotomous means that a variable can have only two alternatives. Examples are quite frequent in meteorology: fog or no fog, rain or no rain, pressure rise or pressure fall, etc. Sometimes one variable may be numerical, the other dichotomous. For example, we may have some dew points in the evening,
and we should like to find whether fog would be expected the next morning. Our information may look like Table 26.
TABLE 26
Dew Point and Fog Frequency

Dew Point     Fog    No Fog    Total
26-30          2        1        3
31-35          4        4        8
36-40          5        6       11
41-45          2       10       12
46-50          1        8        9
51-55          1        3        4
56-60          0        3        3
Totals        15       35       50
This table was constructed by making separate frequency distributions for the fog and no-fog cases. There appears to be some relation: the fog cases are concentrated around dew points of 35°, the no-fog cases around 43°. What is the correlation? Is it significant? This problem is the same as asking whether there is a significant difference between X̄1, the mean dew point for fog, and X̄2, the mean dew point for no fog. Here the t-test is appropriate, where t is computed from the formula given earlier. The correlation may be expressed in terms of the biserial correlation coefficient, which has had limited application in other sciences. It is defined by:

r_bis = (X̄2 − X̄1) p q / (z s)

Here s is the standard deviation for all the data. Thus, X̄1 is 37.7°, X̄2 is 43.9°, and s, computed from the last column, is 7.75°. p and q are the empirical probabilities of fog and no fog, respectively, or .30 and .70. z is the ordinate of the normal curve at which the total area under the curve is divided into 30% on one side and 70% on the other.
We can find z as follows: The 30%-70% division means that 20% of the area is found between the mean and the ordinate required. The error function of Table 7, Chapter II, measures this area. The abscissa corresponding to this area is found from
Table 7: r = 0.5. With this value of r, we find the corresponding ordinate, z, from Table 8; z = 0.35. Hence, finally, the biserial coefficient is given by:

r_bis = (43.9 − 37.7)(.30)(.70) / (7.75 × .35) = .48
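The whole computation can be checked directly from Table 26; the sketch below uses class midpoints (28, 33, ..., 58) for the dew-point intervals and the normal ordinate from the Python standard library instead of Tables 7 and 8:

    from statistics import NormalDist

    # Biserial correlation for Table 26, using class midpoints.
    midpoints = [28, 33, 38, 43, 48, 53, 58]
    fog = [2, 4, 5, 2, 1, 1, 0]
    no_fog = [1, 4, 6, 10, 8, 3, 3]
    total = [a + b for a, b in zip(fog, no_fog)]

    n = sum(total)
    p = sum(fog) / n                          # .30
    q = 1.0 - p                               # .70

    def mean(weights):
        return sum(m * w for m, w in zip(midpoints, weights)) / sum(weights)

    x1, x2 = mean(fog), mean(no_fog)          # 37.7 and 43.9
    mu = mean(total)
    s = (sum(w * (m - mu) ** 2 for m, w in zip(midpoints, total)) / n) ** 0.5

    z = NormalDist().pdf(NormalDist().inv_cdf(p))  # ordinate at the p:q split
    print(round((x2 - x1) * p * q / (z * s), 2))   # .48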
Is this significantly different from zero? Assuming that the population coefficient is zero is equivalent to assuming that μ1 = μ2. The test for the significance of the difference between the means, discussed in Chapter III, gives a significant value when applied to these data. Thus, the correlation coefficient is significant. Again, this conclusion hinges on the necessity of being able to find a population of which this particular sample can be considered a random sample. If the data above have been collected in September-October, 1951, no relation at all may be found in September-October, 1953. Again, the "significance" is dubious.
9. Relations between Qualitative Variables.

The methods for finding relationships for non-numerical variables (which will be called "attributes") require only that the variables are summarized according to categories. Hence, they could be applied to numerical data as well if one is willing to divide them into categories (such as above and below normal), and thereby throw away information. Whereas in the previous sections we first found measures of relationship between two variables and then discussed their significance, in this section we will first find out how to tell whether two attributes are related, and then define quantities which measure the strength of the relationship. The first step is to arrange the two attributes in a contingency table, a table quite analogous to a two-way scatter diagram, but with categories usually defined in qualitative terms. Thus, Table 27 shows the relation between the south-north geostrophic wind component and the weather type. Next, we construct a table which would have resulted if there were no relation between meridional wind component and weather. Proceeding exactly as in Chapter III, we compute for each box the quantity: row total times column total divided by grand total. The result is given in Table 28.
TABLE 27
Relation Between Meridional Flow and Weather

                               South-North Wind
Weather Type         Southerly   No Meridional   Northerly   Total
                                   Component
Clear-scattered           5            25            20        50
Broken-overcast          10            10            10        30
Precipitation            10             5             5        20
Total                    25            40            35       100

TABLE 28
No-Relation Table Corresponding to Table 27

                               South-North Wind
Weather Type         Southerly   No Meridional   Northerly   Total
                                   Component
Clear-scattered          12            20            18        50
Broken-overcast           8            12            10        30
Precipitation             5             8             7        20
Total                    25            40            35       100
Next, we ask, is Table 28 significantly different from Table 27? To answer this we make the null hypothesis that the no-relation table represents the population, and the observed table is just a chance deviation from the hypothesis. To test the null hypothesis, we compute χ² as:

χ² = (5 − 12)²/12 + (25 − 20)²/20 + (20 − 18)²/18 + (10 − 8)²/8 + (10 − 12)²/12 + (10 − 10)²/10 + (10 − 5)²/5 + (5 − 8)²/8 + (5 − 7)²/7
   = 4.1 + 1.2 + .2 + .5 + .3 + 0 + 5.0 + 1.1 + .6 = 13.0
The number of degrees of freedom is 4. The 1% limit for χ² with 4 degrees of freedom is 13.28. Hence, the probability of
getting a relation between the two variates as good as that indicated in Table 27 would be just about 1% if H0 were correct, that is, if weather and meridional flow were independent. Since a 1% probability lies just on the limit, we probably should not reject the null hypothesis. This may be true especially if the data were taken off successive maps, so that successive pairs of data were not independent, a condition required for the chi-square test. Effectively, this would increase the probability of getting the observed χ² by chance. As most forecasters realize, of course, the statistician would have been too cautious in this case, since a relation between meridional wind at 500 mb and weather really does exist. However, these data were insufficient to prove this point within the limit of error permitted. Additional information is needed before the relation can be claimed to be significant. If we want to measure the degree of the relationship, we can define a coefficient of contingency which is given by:

C = √[χ² / (N + χ²)]

where N is the total number of observations, and χ² is defined relative to the no-relation table, just as before. In this case, the coefficient of contingency is √[13.0/(100 + 13.0)] = .34.
Of course, since χ² was not significantly different from zero, neither is this coefficient. One does not need to test the significance of contingency coefficients, since their computation involves the prior determination of χ², the significance of which is easily tested.
If a contingency table were exactly equal to the corresponding no-relation table, the contingency coefficient would of course be zero. Unfortunately, the coefficient is not unity for a perfect relation between the variables. The upper limit of the coefficient depends on the size of the table: in particular, if the number of rows and columns is equal, and given by S, the maximum value of the coefficient is √[(S − 1)/S]. In this case, S = 3, and the contingency coefficient for a perfect relation would have been .817. Since the upper limit of contingency coefficients depends on the size of the contingency table, it appears that contingency coefficients should only be compared with each other when they
have been evaluated from equally constructed contingency tables. Further, contingency coefficients should not be compared with ordinary linear correlation coefficients. But contingency coefficients have the great advantage that they are able to measure relationships which are far from linear, and where the variates may have arbitrary frequency distributions.
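A short computational sketch for Tables 27 and 28 (the no-relation entries are taken as printed; the term-by-term rounding used in the text gives 13.0 rather than 13.1):

    # Chi-square and contingency coefficient for Tables 27 and 28.
    observed = [[5, 25, 20], [10, 10, 10], [10, 5, 5]]
    expected = [[12, 20, 18], [8, 12, 10], [5, 8, 7]]   # Table 28

    n = sum(map(sum, observed))
    chi_sq = sum((o - e) ** 2 / e
                 for orow, erow in zip(observed, expected)
                 for o, e in zip(orow, erow))
    contingency = (chi_sq / (n + chi_sq)) ** 0.5
    print(round(chi_sq, 1), round(contingency, 2))      # 13.1, 0.34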
Several statistics have been suggested to measure the relationship between two variables, both of which are dichotomous. The best known of these quantities is the tetrachoric correlation coefficient. It is computed from the numbers in a two-by-two contingency table:

                                  X1
                      Alternative 1   Alternative 2
X2  Alternative 1           a               b
    Alternative 2           c               d

where a, b, c, and d represent the number of cases in each group. The formula for the tetrachoric correlation coefficient is:

r_t = sin[ (π/2) (√(ad) − √(bc)) / (√(ad) + √(bc)) ]
For example, consider the following table:
TABLE 29
Relation Between Thunderstorms and Pressure Tendency

                    Thunder    No Thunder
Pressure Rise           2          33
Pressure Fall          24          25

Then

r_t = sin[ (π/2) (√50 − √792) / (√50 + √792) ] = −sin 53.8° = −.807
The quantity r_t varies from −1 to 1, and behaves in a manner similar to an ordinary linear correlation coefficient, giving zero for no relation. However, the exact numerical value of r_t is not completely comparable with that of an ordinary linear correlation coefficient. The sign depends in a rather arbitrary manner on the arrangement of the contingency table. It is positive when the number of cases in the diagonal from upper left to lower right exceeds that in the other diagonal.
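A minimal sketch of the tetrachoric formula, applied to the Table 29 counts:

    import math

    # Tetrachoric correlation from a 2-by-2 table arranged a, b / c, d.
    def tetrachoric(a, b, c, d):
        ad, bc = math.sqrt(a * d), math.sqrt(b * c)
        return math.sin(math.pi / 2 * (ad - bc) / (ad + bc))

    print(round(tetrachoric(2, 33, 24, 25), 3))   # -0.807, as in the text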
There also exists a formula for the standard deviation of the theoretical distribution of r_t which can be used to test whether a given value of r_t is significant; however, in practice, the significance of a relation in a two-by-two contingency table can be tested most easily by applying χ² with one degree of freedom in comparison with a no-relation table.
Suggestions for Additional Reading. On Regression: Ezekiel, Mordecai, Methods of Correlation Analysis, John Wiley and Sons, Inc., New York, 1941. Also: Quenouille, M. H., Associated Measurements, Butterworth, London, 1952. Also: Eisenhart, C., "The Interpretation of Certain Regression Methods and their Use in Biological and Industrial Research," The Annals of Mathematical Statistics, Vol. 10, No. 2, 1939.
CHAPTER V

Analysis of Relationships Between More Than Two Variables
We sometimes wish to analyze the behavior of a meteorological variable which varies in accord with two (or more) other variables. In forecasting, in particular, X3 might be an element to be predicted on the basis of known values of X1 and X2. To do this we would necessarily begin with a study of the relationship between X3, X1, and X2 in past data.
1. Graphical Representation.
A graphical presentation of the relationship between the three variables can be obtained as follows:

A diagram with X1 as abscissa and X2 as ordinate is constructed. At the points defined by X1 and X2, values of X3 are written in. The values of X3 are then isoplethed. Frequently, the values of X3 do not appear to fall into any regular pattern, and the isopleths appear quite erratic. This situation can often be improved by forming averages in areas of certain sizes on the diagram. These areas may or may not overlap. The greater the apparent randomness of the X3 values, the larger the areas within which the means are determined.
The mean for each area should be plotted at the center of gravity of the observations used to form this mean. This is particularly important in the areas far from the center, where the center of gravity is likely not to be situated in the geometrical center of the area, but between it and the center of the whole graph.
Finally, the areas need not all be of the same size. Since observations tend to thin out toward the edges of the diagram,
it is often desirable to increase the area size as one proceeds away from the center; in that way the number of observations in each area can remain fairly large, so that the average in the area remains representative. In addition to writing the mean into the center of gravity in each area, it is also helpful to enter the number of cases on which the mean is based. Further, the standard deviation in each area is useful. Both of these quantities aid in the determination of the representativeness of the mean (a short computational sketch of this box-averaging step is given after the next paragraph). Even frequency distributions can be constructed for each area and shown by a small histogram, to see if there are any systematic changes of frequency distributions of the predictand as dependent on changes of the predictors.

Instead of considering X1 and X2 as mathematical abscissa and ordinate, it is also possible to start by selecting class intervals of X1 and X2, and begin as though a simple double frequency distribution between X1 and X2 is to be constructed. However, instead of counting the number of cases in each box, values of X3 are written in each box, and averaged within each. These averages can then be isoplethed. This procedure can be used also when the independent variables are not given in numerical terms, but by a description.

A special problem arises when the dependent variate is not given in quantitative terms. It may be, for example, the occurrence or non-occurrence of rain, or it may be the statement that an airport was closed or open for contact or instrument flight. In that case, it is best to use a code symbol for each possibility; for example, a positive sign (+) for rain and a negative sign (−) for no rain. These code symbols are entered in the individual boxes instead of numerical figures. After the tallying is completed, the fraction of times a given event occurred in a given box can then be entered in per cent, and these values isoplethed. For example, the isopleths of the fraction of occurrences of rain may be interpreted as isopleths of probability of rain.

Another special case of meteorological importance is the case wherein the wind vector is one of the independent variables. If both the wind speed and direction are important, the wind actually constitutes two independent variables. In many ways the wind direction and speed can be treated very much like any other two independent variables; only, the isopleths of the
dependent variable are usually drawn on polar coordinates rather than Cartesian coordinates. In general, polar coordinates are preferable when wind direction is one of the independent variables, because 0° is the same as 360°, and the isopleths must be continuous at zero. Also, the slopes of the isopleths should be continuous at 0°.
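The box-averaging step mentioned above can be sketched as follows; the class widths dx1 and dx2 are illustrative choices, not values from the text:

    from collections import defaultdict

    # Mean, count, and standard deviation of X3 in each (X1, X2) box:
    # the quantities one would write into the diagram.
    def box_summaries(x1, x2, x3, dx1=10.0, dx2=10.0):
        boxes = defaultdict(list)
        for a, b, c in zip(x1, x2, x3):
            boxes[(int(a // dx1), int(b // dx2))].append(c)
        summaries = {}
        for key, vals in boxes.items():
            n = len(vals)
            m = sum(vals) / n
            sd = (sum((v - m) ** 2 for v in vals) / n) ** 0.5
            summaries[key] = (m, n, sd)
        return summaries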
",, ••·
'~· J.• _h>.• •1
It is difficult to show graphically the simultaneous variation of more than three variables. Frequently, this is accomplished by using the three-variable technique just outlined, and stratifying the fourth variable according to different categories. For example, separate three-variable diagrams may be made for cyclonic, anticyclonic, and essentially straight isobar curvature. The most common type of stratification is that according to season. Others in general use are stratification according to time of day, according to positive or negative lapse rate, etc.
If the graphical representation of three numerical variables gives isopleths that are fairly evenly spaced straight lines, methods of linear regression analysis can be used which are outlined later in this chapter. It is surprising how often the curvature on an isopleth diagram is produced by relatively few observations, so that linear isopleths actually fit the individual observations as well as the curved isopleths. Probably, the adequacy of linear methods can be judged best from graphs of X3 separately plotted as a function of X1 and X2; if both of these graphs show negligible curvature, linear regression methods are probably adequate. These methods have the advantage that the relation between the three variables can be established objectively; moreover, experience shows that the linear equations derived objectively are usually more stable than the relationship implied by the curved isopleths. In other words, a linear relationship between three variables derived from one set of data is likely to fit another set, provided sufficient data are used. Of course, if wind direction is one of the independent variables, linear relationships cannot be satisfactory.

As an example of some of these statements, consider Table 30. Here the problem is to present graphically the relationship between the minimum temperature at New York, the minimum temperature at New York the day before, and the south-north wind component at New York the day before. For brevity, these variables will be referred to as X3, X1 and X2, respectively.
TABLE 30
Graphical Representation of Relation Between Three Variables

[Two-way chart with classes of X1 as one coordinate and classes of X2 (such as +5 - +14.9 and +15 - +24.9) as the other; mean values of X3, for example 33.3, (29.0), (36.0), and (42.0), are entered in the boxes, and isopleths are drawn by eye. Entries in parentheses are based on only one or two observations.]
The numbers in the boxes were obtained by averaging the individual observations of X3 in each box. Only three months of data (all March) were entered; probably these ninety-three sets of observations are somewhat fewer than one should have to establish the relation between three variables. Parentheses around the numbers in a box indicate that the number is based on one or two observations only, and is therefore not reliable. The isopleths were drawn subjectively for the values in the boxes. Comparison of the isopleths in Table 30 with those drawn from other March periods showed up considerable differences; yet isopleths obtained from the linear theory were much more stable from one group of March periods to another. Physically, this result seems reasonable; for the larger the south-north wind component, the higher the temperature the next day (other things
being equal); and the relationship between the minimum temperatures on two successive days should be linear.
2. Three-Dimensional Regression.
The equation for the linear relation best suited to forecasts of X3 from X1 and X2 can again be obtained by the method of least squares. If X3′ denotes the values expected by the equation, and X3 are the observed values, the three-dimensional linear least square relation has the property that Σ(X3 − X3′)² is a minimum. Since a linear equation in three dimensions is the equation for a plane, the three-dimensional least square equation is called the "plane of regression," in analogy with the line of regression in two dimensions. This plane, then, has the property that the scatter about it is less than that about any other plane. Also, it is the most probable plane if the observations are assumed to be normally distributed, with constant standard deviation. We shall write the equation for the plane of regression in the form:

X3′ = a3.12 + b31.2 X1 + b32.1 X2
where a3.12 and the b's are to be determined from the data in the sample. The subscripts of the coefficient a give first the predictand (in this case X3), then a period, and finally the two predictors. The subscripts of the b's give first the predictand, then the particular variable of which this b is the coefficient, then a period, and finally the other independent variable or variables also occurring in the equation. Thus, for example, b63.1245 is the coefficient of X3 in the equation that predicts X6 from the variables X1 through X5.
The coefficients b31.2 and b32.1 can be obtained by solving simultaneously the three "normal" equations:

ΣX3 = N a3.12 + b31.2 ΣX1 + b32.1 ΣX2
ΣX1X3 = a3.12 ΣX1 + b31.2 ΣX1² + b32.1 ΣX1X2
ΣX2X3 = a3.12 ΣX2 + b31.2 ΣX1X2 + b32.1 ΣX2²
For the data in Table 30, the equation for the plane of regression turns out to be:

X3′ = 0.524 X1 + 0.095 X2 + 15.86
The first of the three normal equations above implies that the "plane of regression" passes through the means of X1, X2, and X3. In other words, when X1 is equal to its mean, and X2 is equal to its mean, then the best estimate of X3 is that it equals its mean as well. This property is common to least square solutions with any number of variables, and makes it possible to reduce the number of normal equations by one. Instead of the three normal equations above, we may write the two equations:
b31.2 Σx1² + b32.1 Σx1x2 = Σx1x3
b31.2 Σx1x2 + b32.1 Σx2² = Σx2x3

Here, as before, lower case x implies a deviation from the mean. The mean products which have to be calculated from the observations are all variances or covariances. They can be calculated, in general, from the identity:

(1/N) Σ xixj = (1/N) Σ XiXj − X̄i X̄j
where i and j may be any integers. The variances and covariances can also be calculated by the "short method." The two equations above have the following solutions for the slopes, b31.2 and b32.1:
b31.2 = (s3/s1) (r31 − r32 r12) / (1 − r12²)
b32.1 = (s3/s2) (r32 − r31 r12) / (1 − r12²)
where the r's are ordinary linear correlation coefficients between the variables indicated by the subscripts. These expressions show that a predictor may be correlated with the predictand, and yet its knowledge does not contribute to the forecast; for example, r31 may be equal to r32 r12, in which case b31.2 will be zero. More will be said about this later.
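These formulas translate directly into a short routine; the input values below are invented to illustrate the case r31 = r32 r12 just mentioned, and are not taken from the text:

    # Slopes of the plane of regression from the simple correlations.
    def plane_slopes(r31, r32, r12, s1, s2, s3):
        denom = 1.0 - r12 ** 2
        b31_2 = (s3 / s1) * (r31 - r32 * r12) / denom
        b32_1 = (s3 / s2) * (r32 - r31 * r12) / denom
        return b31_2, b32_1

    # With r31 exactly equal to r32 * r12, the first slope vanishes:
    print(plane_slopes(0.24, 0.40, 0.60, 1.0, 1.0, 1.0))   # (0.0, 0.4)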
If we make use of the condition that the plane of regression passes through the means, we can write the equation for the plane of regression also:

X3′ − X̄3 = b31.2 (X1 − X̄1) + b32.1 (X2 − X̄2)
If one wishes to obtain a set of linear isopleths of X3 from the equation for the plane of regression, one only needs to give X3′ constant values as demanded by the particular isopleth. Then, the equation for the plane of regression becomes a family of straight lines that are parallel and equally spaced; these can easily be plotted on a diagram with coordinates X1 and X2. Table 31 shows these linear isopleths obtained from the same
data as Table 30. As mentioned before, it turns out that these linear isopleths are quite similar in different samples of three March periods for the example discussed above.

TABLE 31
Linear Regression Applied to the Data of Table 30

[The same two-way chart as Table 30, with classes of X2 such as −25 - −15.1, −15 - −5.1, −5 - +4.9, and +15 - +24.9, but with the isopleths now drawn from the linear regression equation.]
3. Linear Multiple Correlation.
The linear multiple correlation coefficient, R3.21, measures how well the plane of regression fits the observations. If future data were taken entirely from the same population as the data in the sample, the coefficient R3.21 also is related to the likely accuracy of forecasts of X3 made by using the linear regression equation. The subscripts of the multiple correlation coefficient are always arranged in the following order: first, the predictand variable, then a period, followed by the subscripts for all the predictor variables. The linear coefficient of multiple correlation is given by:

R3.21² = (s3² − s3.21²) / s3²
Here s3.21² is defined by Σ(X3′ − X3)²/N. It measures the part of the variance of X3 that is not related to variation of X1 and X2. Hence R3.21² measures the percentage of the total variance of X3 accounted for by the variation of X1 and X2. The multiple correlation coefficient is related to the simple linear correlation coefficients by:

R3.21² = (r31² + r32² − 2 r31 r32 r12) / (1 − r12²)

(This expression is proved in the Appendix.) In the example above R3.21 = 0.58 and R3.21² = 0.34. That is, 34% of the variance in minimum temperature is accounted for by the variation of the previous day's minimum temperature and wind direction. Note that the sign of R is meaningless.
In a general way, the magnitude of the multiple correlation coefficient is an indicator of how good forecasts can be made by the plane of regression in future samples. However, relationships between variables occur in regimes, and a true test of the generality of the relation can be made only on the basis of independent samples. Another point is that the error to be expected from a given linear prediction equation depends on the deviation of the predictors from the mean; since the slopes are subject to statistical errors, the estimates become less precise when the values of one or both predictors are far from the normal.
4. Significance of Linear Multiple Correlation Coefficients.

If a sample is drawn at random out of populations of three variables, X1, X2, and X3, there is a chance of getting a large multiple correlation coefficient even though the populations are not correlated. For example, with 30 triplets of observations, the probability is 1% that uncorrelated populations will yield samples with correlation coefficients of .54 or larger. With more data in the samples, the chance of getting sizable correlation coefficients out of uncorrelated populations becomes smaller.
However, as stated before, meteorological samples are not random samples, and the fluctuations of meteorological correlation coefficients have not been studied sufficiently by theory or
experiment. It seems likely that if multiple correlation coefficients were computed between the same variables over several periods, quite large fluctuations might be expected.
The coefficients in the regression equations also will fluctuate from period to period, especially if there is no exact physical basis for the relationships implied. But it seems reasonable to suppose that, the longer the record on which the regression equation or the multiple correlation coefficient is based, the greater their stability.
When the observations can be regarded as "random samples" from populations, a formal test of the significance of the multiple correlation coefficient can be made by the analysis of variance technique, as shown in Table 32. In this table p is the number of predictors (called independent variables) and N is the number
TABLE 32
Analysis of Variance Table for Significance Test of Multiple Correlation Coefficient*

Source                    SS          df              MS
Regression equation       R²           p             R²/p
Residual               1 − R²      N − 1 − p    (1 − R²)/(N − 1 − p)
Total                     1         N − 1

F = R² (N − 1 − p) / [p (1 − R²)]

*Again, all quantities in the SS and MS columns should be multiplied by Ns².
of sets of observations. In the example discussed earlier, N = 93, since daily data for three March months were used. However, in this case the test would be of doubtful validity since the 93 March days are not independent.
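Table 32 reduces to a single F ratio; a minimal sketch with the chapter's values R² = 0.34, N = 93, p = 2 (remembering the warning that the formal test is dubious here):

    # F ratio for testing a multiple correlation coefficient (Table 32).
    def f_ratio(r2, n, p):
        return (r2 / p) / ((1.0 - r2) / (n - 1 - p))

    print(round(f_ratio(0.34, 93, 2), 1))   # about 23.2 on (2, 90) df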
5. "Automatic" Correlation.
Occasionally, a variable X3 may be correlated positively with both X1 and X2. Yet the multiple correlation coefficient R3.21 is no larger than the simple correlation coefficient r31. In other words, the probable error of estimate of X3 from X1 and X2 is no
smaller than the probable error of estimate of X3 from X1 alone. This will happen when there is no "real" correlation between X3 and X2. The observed linear correlation coefficient between X3 and X2 is entirely due to the fact that X1 is correlated with both X2 and X3. In such a case, we get an "automatic" correlation between two variables just because each of them is related to a third. For example, there exists an apparent relation between stability and relative humidity in the lower troposphere. The reason for this, however, is that both stability and relative humidity are correlated with the surface temperature. Actually, there is little intrinsic relation between the two.
If the correlation coefficient between X1 and X2 is r12, and between X1 and X3 is r13, then the "automatic" correlation coefficient between X2 and X3 is r12 r13. Only when the observed correlation coefficient between X2 and X3 differs considerably from the automatic coefficient will the additional knowledge of X2 improve a forecast of X3 based on X1. Thus, if one predictor has been selected as a basis for a linear forecast of a predictand, the question of whether a second predictor should be added will depend on whether the correlation between this second predictor and the predictand differs considerably from the automatic correlation between them. Note that the condition for an automatic correlation is just the condition that will make one of the b's in the equation for the plane of regression equal to zero.
In the example above, the correlation coefficient between minimum temperature and wind direction the day before is only slightly greater than the automatic correlation. Why? Because yesterday's temperature is correlated both with yesterday's wind and today's temperature, and this leads to an automatic correlation between wind one day and temperature the next. In our example, the automatic correlation coefficient is 0.22, and the actual correlation coefficient is 0.33. As a result, only slightly more variance of minimum temperature is accounted for by the addition of the wind factor to the temperature factor. The variance explained by the temperature alone is 32%, while the variance explained by temperature and wind together is 34%.

6. Partial Correlation.

The partial correlation coefficient measures the "real" correlation between two variables after the influence of other variables
has been eliminated. It might be expected that such a coefficient would be proportional to the difference between total correlation and automatic correlation. A partial correlation coefficient is always expressed by a small r followed by two subscripts, then a dot, followed by more subscripts. The subscripts before the period designate the two variables correlated; the other subscripts denote the variables the effects of which are to be eliminated. For example, r13.245 is the coefficient of partial correlation between variables X1 and X3, after eliminating the influences of X2, X4, and X5.
Theoretically, the effect of the variation of certain variables on two given variables, say X2 and X3, may be eliminated by choosing the sample for fixed values of the extraneous variables. For example, we might study the relation between stability and relative humidity, all at the same temperature. However, we would probably have difficulties in collecting sufficient data under such a restriction; and the difficulty increases when the influences of more than one variable are to be eliminated.

The effect of one variable on a second can be eliminated by considering only the deviations of the second from the line of regression of the second variable on the first. In particular, if we want to find the partial correlation coefficient between X3 and X2, with the effect of variations of X1 eliminated, we proceed as follows: First, we determine the regression equations, X3 on X1 and X2 on X1. We then form the deviations of X3 and X2 from these lines of regression and correlate them with each other. In practice, the formula for the partial correlation coefficient between X2 and X3, holding X1 constant, is:

r23.1 = (r23 − r12 r13) / [√(1 − r12²) √(1 − r13²)]

Similarly,

r13.2 = (r13 − r12 r23) / [√(1 − r12²) √(1 − r23²)]
(These expressions are proved in the Appendix.) Note the partial correlation is zero when r23 = r13 r12, that is, when the correlation coefficient between X2 and X3 is just equal to the automatic coefficient. In the example of Table 31, the coefficient of partial correlation between minimum temperature and the wind the previous day, holding
temperature the previous day constant, is only 0.15. The total correlation is 0.33, a misleadingly high value.
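The quoted value can be recovered approximately from the figures given in this chapter; in the sketch below r13 is taken as √0.32 (the variance explained by temperature alone) and r12 is back-solved from the automatic correlation 0.22, both assumptions forced by the quoted numbers rather than stated outright in the text:

    import math

    # Partial correlation between wind (2) and next-day temperature (3),
    # holding the previous day's temperature (1) constant.
    r13 = math.sqrt(0.32)        # persistence correlation
    r12 = 0.22 / r13             # chosen so that the automatic r12*r13 = 0.22
    r23 = 0.33                   # observed total correlation

    r23_1 = (r23 - r12 * r13) / (math.sqrt(1 - r12**2) * math.sqrt(1 - r13**2))
    print(round(r23_1, 2))       # about 0.14, close to the 0.15 quoted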
7. Extension to More than Three Variables.

The linear techniques discussed in this chapter can be extended to additional predictors. Of course, the computational work grows rapidly, perhaps with the square of the number of predictors. Nevertheless, electronic computers have made such extension possible. In practice, linear prediction methods have been used in meteorology with as many as 22 predictors. However, in order to reduce the number of variables in the regression equations, much effort has been exerted toward the development of methods to determine the "best" combination of predictors. This subject is discussed in Chapters VII and VIII under the headings of "Factor Analysis" and "Selection of Predictors." If one makes use of the fact that all linear regression equations pass through the means of all variables, the number of normal equations equals the number of predictors. If Xn is the predictand, and X1, X2, X3 ... Xk are the k predictors, the normal equations for the unknown coefficients b take the following symmetrical form (the notation for the coefficients has been abbreviated here for convenience):

Σx1² b1 + Σx1x2 b2 + Σx1x3 b3 + Σx1x4 b4 + · · · = Σx1xn
Σx2x1 b1 + Σx2² b2 + Σx2x3 b3 + Σx2x4 b4 + · · · = Σx2xn
Σx3x1 b1 + Σx3x2 b2 + Σx3² b3 + Σx3x4 b4 + · · · = Σx3xn
. . . . . . . . . . . . . . . . . . . . . . . . .
The prediction equation itself is written:

Xn′ − X̄n = b1(X1 − X̄1) + b2(X2 − X̄2) + b3(X3 − X̄3) + b4(X4 − X̄4) + · · ·
All the normal equations can be abbreviated by:

Σ (i = 1, ..., k) bi Σxixj = Σxnxj,   for each j = 1, 2, ..., k

(This expression is proved in the Appendix.) Many techniques are available for the solutions for the
coefficients, as well as for the computation of the additional variance which each additional predictor explains. Again, the variance accounted for by all the predictors can be expressed in terms of a multiple correlation coefficient, defined by:

Rn.1234...² = (sn² − sn.1234...²) / sn²
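In matrix form the normal equations are solved in a few lines; the sketch below uses numpy and invented random data, not any data set from the text:

    import numpy as np

    # Solve the normal equations for k predictors in covariance form.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 3))                 # k = 3 predictors
    y = 0.5 * X[:, 0] - 0.2 * X[:, 1] + rng.normal(size=200)

    Xd = X - X.mean(axis=0)                       # deviations x_i
    yd = y - y.mean()
    b = np.linalg.solve(Xd.T @ Xd, Xd.T @ yd)     # the normal equations
    r_sq = 1.0 - np.sum((yd - Xd @ b) ** 2) / np.sum(yd ** 2)
    print(b.round(2), round(float(r_sq), 2))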
However, as the number of variables increases, it becomes more and more difficult to evaluate the "significance" of the results. Even if the samples could be considered to be selected at random, the chance would be 1%, for example, of finding a multiple correlation of .61 or larger when there is no correlation between the populations, with five variables and thirty values of each. In general, multiple correlation coefficients based on four or more variables possess little stability. In a three-year period, for example, R4.321 might be 0.40 and in the next three-year period it might be 0.80. Similarly, multiple regression equations based on many variables may reduce the scatter in a given sample, but they are not likely to be stable on future samples. It is, therefore, doubtful whether increasing indefinitely the number of variables for development of forecasting equations is desirable. It is true, of course, that the longer the record, the greater the stability of multiple correlation coefficients and regression equation coefficients for a fixed number of variables. Rarely, however, will the record be long enough for computation of stable coefficients when many independent variables are involved. Only variables should be added which bear a physical relationship to the predictand and, more specifically, a relationship which has not already been implied by another variable. For example, if temperature and vapor pressure have been included as predictors of thunderstorm probability, there is little point in adding relative humidity. The fact that a given set of data can be fitted arbitrarily well by a sufficient number of variables has led to the mistaken conclusion that if we only had a sufficient number of dependent variables, the forecasting problem would be solved, simply by using high order regression. This subject is treated further in Chapter VIII.
,;)A·.~i.; ,J,:i,~:~:::;~~~i~;;::. 8. Extension to Nonlinear Prediction Equations •.. 1 ·r •;i·.,1( :.·l. f"':.. ··· • }. c
·
.
·
'· · · ·•r
·
• ,
• , I
· '•:·:1.
~
••• ,
,
I
... ;
• \:.'
~~ ,~
\, ,
~ .. ~'
It is quite simple, in principle, to extend the theory of linear regression to nonlinear regression. With the exception of objective analysis (Chapter VII), however, not much work has been done on this subject. The reason is that rarely, when the variables are quantitative and well behaved, will nonlinear terms in the equation aid materially in the explanation of the variance of the predictand. Suppose, however, we would like to add some nonlinear terms to our prediction equation. How should we proceed? For example, we should like to admit a term proportional to X1². In that case, we simply invent a new predictor, say X7 (assuming that we have so far considered six predictors), and let X7 be X1². We then proceed with the normal equations just as outlined above. Or, we may wish to include a sinusoidal term in the regression equation; we simply define a new variable, say X8, by the sine of our predictor, no matter whether the same predictor has already been included in linear or other form.
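In practice this amounts to augmenting the list of predictors before solving the same normal equations; a minimal sketch with invented data:

    import numpy as np

    # Invent new predictors X1**2 and sin(X1) and fit them linearly.
    rng = np.random.default_rng(1)
    x1 = rng.uniform(0, 10, size=150)
    y = 2.0 + 0.3 * x1**2 + np.sin(x1) + rng.normal(scale=0.5, size=150)

    A = np.column_stack([np.ones_like(x1), x1, x1**2, np.sin(x1)])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    print(coef.round(2))   # close to [2.0, 0.0, 0.3, 1.0]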
9. The Discriminant Function.

The technique of multiple linear regression can be used even though the predictand is non-numerical, such as the occurrence or non-occurrence of an event like a thundershower. In this case the numerical predictors of temperature, pressure, humidity, etc., are used to determine a linear "discriminant function," which as judged from the data will most successfully discriminate between the two groups of events. For example, in the case of two numerical predictors, X1 and X2, we define a "discriminant function" L by:

L = b0 + b1X1 + b2X2
L is to have the property that the straight line in the X1, X2 plane given by L = 0 best discriminates between the two alternatives. The coefficients b1 and b2 are chosen in such a way as to maximize the quantity

[L̄(1) − L̄(2)]² / sL²

because we would like L to be as different as possible for the two groups. Here L̄(1) is the mean value of L for the first group of cases, and L̄(2) is the mean value of L for the second group. sL is the standard deviation of L estimated from pooling the sums of squares computed within each group. The constant b0 is determined so that the relation L = 0 defines the point of separa-
tion between the two groups of events. In particular, if L > 0, we predict that the event is of group 1, and if L < 0, we predict that the event falls in group 2.

If there are N1 cases in group 1 where the event has occurred, and N2 cases in group 2 where the event did not occur, we define a predictand which takes the value N2/(N1 + N2) for all cases in group 1 and −N1/(N1 + N2) for all cases in group 2. In this way the average value of the predictand is zero, since

N1 [N2/(N1 + N2)] − N2 [N1/(N1 + N2)] = 0

This is for mathematical and computational convenience and means that in equations to be used later we need not worry about a correction term for the mean. Thus, the sum of squares of the predictand is given by:

N1 [N2/(N1 + N2)]² + N2 [N1/(N1 + N2)]² = N1N2/(N1 + N2)

If the mean values of the predictors are X̄1 and X̄2 for group 1, and X̄1′ and X̄2′ for group 2, the mean differences (group 1 − group 2) will be represented by d1 = X̄1 − X̄1′ and d2 = X̄2 − X̄2′. The coefficients of the discriminant function can then be determined from the solution of the equations:

b1 x1x1 + b2 x1x2 = N1N2 d1 / (N1 + N2)²
b1 x1x2 + b2 x2x2 = N1N2 d2 / (N1 + N2)²

where the mean products x1x1, x1x2, and x2x2 have been defined previously and are computed from the entire N1 + N2 cases, ignoring the group distinction. The similarity of these equations to those used in the usual multiple regression analysis is easily seen, and the extension to three or more variables is straightforward.

The application of this method will be illustrated by using some data from a study¹ relating the occurrence of precipitation during a 24-hour period at Albany, New York, to a measure of vertical velocity X2 and a measure of the dew point depression X1.

¹ "Forecasting Precipitation Occurrence from Prognostic Charts of Vertical Velocity," J. M. Sassman and R. A. Allen, manuscript, August 195-.
FIGURE 8
Discriminant function theory used to separate occurrence and non-occurrence of rain as a function of vertical motion and dew point depression. [Scatter diagram with X1 as abscissa and X2 as ordinate; precipitation cases are plotted as x, no-precipitation cases as dots. The solid line is L = −0.2590 − 0.005163 X1 + 0.006910 X2 = 0; a dashed line shows the subjectively drawn isopleth of 50 per cent probability.]
Data for 91 days were available, giving N1 = 28 precipitation events and N2 = 63 cases of no precipitation. The following quantities were computed from the original data:

X̄1 = 19.29,   X̄1′ = 27.52,   d1 = −8.23
X̄2 = 79.89,   X̄2′ = 30.05,   d2 = 49.84

x1x1 = 1883.8,   x2x2 = 14,517.9,   x1x2 = −1129.3
The equations for solution are thus:

b1 (1883.8) − b2 (1129.3) = [(28)(63)/91²] (−8.23)
−b1 (1129.3) + b2 (14,517.9) = [(28)(63)/91²] (49.84)

from which b1 = −0.0005163, b2 = 0.0006911, and

b0 = ½ [0.0005163 (19.29 + 27.52) − 0.0006911 (79.89 + 30.05)] = −0.02590

Thus, multiplying through by 10 for convenience (which does not move the line L = 0), the discriminant function is:

10L = −0.2590 − 0.005163 X1 + 0.006910 X2
To be used as a prediction equation, the known values of X1 and X2 would be substituted; if L > 0, the forecast would be precipitation, while if L < 0, no precipitation would be predicted. For example, if X1 = 10 and X2 = 50, we have 10L = 0.03488, so that precipitation would be expected.

For this example, the results are shown graphically in Figure 8. The discriminant function is the straight line L which tends to separate the 28 precipitation cases, plotted as x, from the 63 cases of no precipitation, plotted as (·). This line can be interpreted as the isopleth of 50% probability of precipitation. To use this chart in prediction, all points falling above and to the left of line L would be forecast as precipitation, and all points falling on the other side of the line would be forecast as no precipitation. The method of discriminant analysis has been greatly expanded to include, first, the possibility of numerous predictors and, second, many categories of predictand. This technique, called
"multiple discriminant analysis" has become practical due to the availability of electronic computers and has proved a powerful tool for all kinds of statistical forecasts.
10. Graphical Regression.

Statistical prediction can also be made by the technique of "graphical regression," whether or not the predictors or predictand are numerical. This method, which has been successfully employed, has been discussed earlier in this chapter under the heading of "Graphical Representation." Here, isopleths of a predictand X3 were drawn as a function of the predictors X1 and X2, in order to make the relationship between the three variables apparent. The same isopleths can also be used for purposes of prediction. Whenever a given combination of predictors defines a point not on an isopleth, the predictand can be determined by interpolation by eye. In connection with the discriminant function example discussed in the previous section, it is interesting to note that the dashed line in Figure 8 was drawn subjectively by eye as an isopleth of 50 per cent probability of precipitation before the discriminant function line was determined. Actually, on the original graph (not shown here) the 50 per cent isopleth was drawn as a straight line but appears as a curved line on Figure 8 because of a logarithmic transformation that was made on X2. The graphical analysis made on the original chart also gave isopleths of 10 per cent probability, 20 per cent, ... 90 per cent, etc., whereas only the 50 per cent line is given easily from discriminant theory. This appears to be one of the advantages of the graphical method over the objective computations of the discriminant function method. Of course, the linear discriminant function method could be used to provide estimates of probability of the predictand for different values of the predictors, but in meteorological applications it is doubtful whether this would be worth the trouble involved, especially since the assumptions made in the discriminant function theory, such as normality, equality of variances, etc., are not valid in many cases. A weakness of the graphical regression method is its subjectivity. Meteorologists draw isopleths subjectively, and often will not agree whether certain rapid changes of curvature have any meaning. In general, it has been found that the simpler the
isopleth pattern, the better the chance that it will be stable and can be used successfully with future data.
On the other hand, graphical regression can be applied without assuming the mathematical character of the relationship between the variables, whereas the mathematical theory of regression requires such an assumption. Further, graphical regression is a much more rapid process, unless machines are available. More will be said about this in Chapter VIII.
11. The Multiple Correlation Ratio.
Even in the case of graphical regression it is possible to define a multiple correlation index. Such a coefficient would measure how well the isopleths of the dependent variable fit the observations of the same variable. Again, the correlation ratio is defined by:

η3.21² = (s3² − s3.21²) / s3²
where the scatter is again defined by:

s3.21² = Σ(X3 − X3′)² / N
As before, X3 denotes the observations of the dependent variable and X3′ the value of that variable at the same X1 and X2, as determined by the isopleths.
Even though a definition of a correlation coefficient of this nature is possible in principle, its value is questionable, since it depends too much on the manner in which the isopleths are drawn. The more smoothing, the smaller the coefficient. A definition more analogous to the two-dimensional curvilinear coefficient would be based on a scatter defined by the variation of X3 about its mean in every box. For each individual box, a frequency distribution could be constructed and the variance measured in each. The square of the scatter would be the weighted mean of these individual variances, weighted by the number of observations in each box. Such coefficients, however, will tend to overestimate the degree of association between the variables. The value of η² may fall quite considerably as additional data are used from new samples.
12. Probability Forecasts.

When forecasts are of a qualitative nature, and probabilities are forecast, the error of the probability is usually not the criterion by which the forecast will be judged. Frequently, a forecaster will be forced to make a categorical statement whether it will rain or not, though the isopleth diagram gives him probability only. When the probability is higher than 80% or less than 20%, there is usually (but not always) no doubt what forecast to make. In a general way, forecast methods are poor when the probabilities tend to come out near the climatic mean most of the time (not necessarily 50%-50%). For example, if the climatic probability of a thunderstorm is 20% on June evenings, a forecast method which will predict this probability a large portion of the time will not be of great usefulness. In general, the use of a probability forecast should depend on the monetary loss resulting from a wrong forecast. For example, a movie company may schedule an outdoor scene; if rain is forecast and does not occur, only the day's wages are lost; if no rain is forecast, but rain does occur, expensive sets may be ruined. Clearly, errors in the two directions are not of equal monetary value. Unfortunately, a forecaster generally does not know to what use his forecast will be put. Eventually, the public may be able to make direct application of forecasts of the probability of meteorological events to their specific problems. Even now, public utilities, baseball clubs, movie studios, and other enterprises that are affected by weather phenomena should be able to make use of probability forecasts. A good example of such use was the famous snow storm in New York City in December 1949. The local forecaster informed the public utilities that the probability was 25% that extremely heavy snow would fall. Based on this forecast, the railroads and other utilities took special precautions. The public was misled, however, because the forecaster had to give them a definite categorical forecast. Apparently, in the event of extreme weather conditions, even relatively small probabilities are worth knowing. In the case of the New York snow storm, the probability estimate was entirely subjective. Based on a reliable diagram of isopleths of probabilities, the probability could be estimated without guessing. Limited experience with actual broadcasts of probability on
the West Coast of the USA has been quite satisfactory. The main difficulty has been a confusion of probability with odds. For example, a probability of 50% has been mistakenly interpreted to mean that the odds were 50 for one event against 100 for the other. If odds were added to the statement of probability, this difficulty could be avoided.
Suggestions for Additional Reading. On Regression: Ezekiel, Mordecai, Methods of Correlation Analysis, John Wiley and Sons, Inc., New York, 1941. Also: Quenouille, M. H., Associated Measurements, Butterworth, London, 1952. Also: Eisenhart, C., "The Interpretation of Certain Regression Methods and their Use in Biological and Industrial Research," The Annals of Mathematical Statistics, Vol. 10, No. 2, 1939.
CHAPTER VI
1. Introduction.
A time series is a list of values of a variate according to time. Normally, the time interval between observations of the variate (called the data interval) is constant. The interval in the case of meteorological time series may extend from small fractions of a second (for studies of turbulence) to thousands of years (for the study of climatic fluctuations). The purposes of the statistical analysis of time series are:

1. To understand the basic properties of the time series: its variability, and the characteristics of its periodic and irregular oscillations. This understanding helps in the primary purpose:

2. To predict the behavior of the time series in the future.
In particular, we will meet the term "stationary" time series. This means that, although the detailed fluctuations of the time series may be quite irregular, certain statistical properties will remain fixed from one period to another (over the entire series). The graph of a time series might look like Figure 9. Note that the appearance of the graph would be similar no matter whether the abscissa is measured in seconds, days, or decades.
FIGURE 9. Typical Meteorological Time Series (pressure plotted against time).
In analyzing time series, we note first that the oscillation of a meteorological variate with time can be described as the sum of several oscillations. Certain oscillations will be regular; others will be irregular. The temperature variation at a station, for example, can be studied as the sum of at least three distinct oscillations: the regular diurnal oscillation, the regular annual oscillation, and irregular oscillations. The intent of time series analysis in meteorology is, first, to separate the time series into its component regular and irregular oscillations and, second, to analyze these components individually.
2. Isolation of Regular Cycles.
The annual cycle of a meteorological element can best be studied after computation of the mean values of the element for each month, or each season. These means, plotted against time, will exhibit the nature of the annual variation.
FIGURE 10. Annual Temperature Variation at New York, 1871-1949. (Monthly mean temperature, °F, January through December.)
Figure 10 shows the monthly variation of surface temperature at New York. In the same way, the average daily temperature for each day of the year might also be used, but such a series is likely to show accidental fluctuations which are due to short-period irregular oscillations.
The annual and diurnal variations of meteorological elements have a tendency to interfere with each other. Thus, the nature of the annual variation of temperature depends on the hour of the day at which the observations are taken; and the diurnal variation of temperature depends on the month or the season. For this reason, diurnal variation is usually studied on the basis of observations during one month only or on averages over entire years; and the annual variation is analyzed on the basis of observations taken at the same hour of the day, or averaged over all hours of the day.
3. Analysis of the Periodic Fluctuations.
The type of analysis most commonly applied to the periodic variations of the meteorological parameters is "harmonic analysis." Such an analysis helps in the physical understanding of the regular fluctuations. According to mathematical principles, any function which is given at every point in the interval can be represented by an infinite series of sine and cosine functions. This series is called a Fourier Series, and the method of finding the functions, Fourier Analysis. In the case of meteorological data, observations exist only at discrete points, not continuously. Thus, temperature observations may be made every hour, or temperature means may be given for every month. Equal spacing of the observations will be assumed in the following discussion. Now, if only a finite number of points exists in the interval to be analyzed, a finite number of sines and cosines will be able to account for all the observations. For example, if temperatures are given for each of the 12 months, five sine terms, six cosine terms and the mean are sufficient to describe the annual variation completely. The determination of a finite sum of sine and cosine terms is called "harmonic analysis." The first "harmonic" (or fundamental) has a period equal to the total period studied (one year in the example above). The second harmonic has a period equal to half the fundamental period, the third harmonic a period of one-third of the fundamental, up to the sixth harmonic, which would have a period of one-sixth of the fundamental period. In general, if the number of observations is N, the number of harmonics equals N/2. The different harmonics are isolated so that each can be
treated as an independent entity; each may have a different physical cause. For example, the first harmonic of the pressure cycle may be due to the diurnal heating by the sun; the second harmonic may be caused by the sun's tide-producing force. Often it is impossible to account for the complete variation at once, but the individual harmonics can be explained. Also, harmonic analysis puts the temperature or pressure fluctuations into a convenient form to serve as boundary conditions in the solution of certain differential equations. For example, the vertical variation of the daily temperature cycle can be expressed as a solution of the equation of heat conduction, provided that the temperature variation at the surface is given in terms of harmonic analysis.

However, each harmonic does not necessarily have a distinct physical meaning. For example, many harmonics are required to account for the annual temperature variation in India. This will happen wherever the periodic function is not of a sinusoidal character; in that case harmonic analysis just provides a mathematical representation equivalent to the periodic function.
It is not always required to determine all N/2 harmonics; in fact, usually the first two, or at most three, harmonics describe the variation of the periodic function sufficiently well. We shall see later that this is very different in the case of non-periodic functions. Let the variate X(t) be given by the following series:

X = X̄ + A1 sin(360° t / P) + B1 cos(360° t / P) + A2 sin(360°·2t / P) + B2 cos(360°·2t / P) + ...

As mentioned before, there are N/2 − 1 sine and N/2 cosine terms. Hence, we may write the complete series (A_N/2 is always zero):

X = X̄ + Σ (i = 1 to N/2) [A_i sin(360° i t / P) + B_i cos(360° i t / P)]

In other words, the time series equals the mean plus the sum of the harmonics. In these expressions, P is the "fundamental period" or the total period of the periodic function. P is not
always equal to N. If observations are made every two hours for a day, N = 12 but P = 24 hours. (Note that P has units of time, whereas N is a pure number.) The quantity i will be called the "number of the harmonic" and is an integer between 1 and N/2. A final point to be noted is that the units of t and P must be the same; in the example above, where time is measured in months, P = 12. The first two terms of the series go through a complete cycle in one fundamental period. The third and fourth terms vary twice as rapidly, completing a cycle in half the fundamental period, and so forth. The last term varies most rapidly, having a period of 2P/N. If the twelve monthly temperatures are given, the last harmonic has a period of two months. If any shorter periods exist, they can be found only from more frequent observations.

Harmonic analysis starts with finding the A's and B's in the series above, all of which can be computed independently of each other. The formulae for these coefficients are:

A_i = (2/N) Σ [X sin(360° i t / P)];    B_i = (2/N) Σ [X cos(360° i t / P)]
where i may have any integer value from 1 to N/2 − 1. The summation in these formulae extends over the N observations. (These expressions are proved in the Appendix.) The A for the last harmonic is zero, and the B follows the above formula, but divided by 2. Several shortcuts are available to compute the A's and B's; however, with modern hand-computing machines, the coefficients are easily computed from the formulae as written. The coefficients A_i and B_i will come out the same no matter what origin of X is used. The arithmetic can therefore be greatly simplified by subtracting some integer value close to X̄ from all the values of X before commencing the analysis. In that case, slide rule accuracy is often sufficient for the computation of the quantities to be summed. The origin of time also is immaterial; changing the origin will change A_i and B_i, but does not affect the amplitude of the final harmonic C_i. It just must be remembered to apply the results to the same origin in time from which time was measured originally. For example, if January is defined as t = 1, February as t = 2, etc., the origin (t = 0) would be the middle of December.
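In modern terms, these coefficient formulas can be sketched in a few lines of Python. The function name and the twelve monthly values below are illustrative assumptions, not part of the original text:

import math

def harmonic_coefficients(x, P):
    """A_i and B_i for i = 1 ... N/2, from N equally spaced observations
    x with t = 1, 2, ..., N, using the formulas in the text."""
    N = len(x)
    A, B = {}, {}
    for i in range(1, N // 2 + 1):
        A[i] = 2 / N * sum(xt * math.sin(math.radians(360 * i * t / P))
                           for t, xt in enumerate(x, start=1))
        B[i] = 2 / N * sum(xt * math.cos(math.radians(360 * i * t / P))
                           for t, xt in enumerate(x, start=1))
    # Last harmonic: A_{N/2} is zero, and its B must be divided by 2,
    # as noted in the text.
    A[N // 2] = 0.0
    B[N // 2] /= 2
    return A, B

# Hypothetical twelve monthly means, t = 1 for January, P = 12 months:
A, B = harmonic_coefficients([32, 33, 41, 52, 63, 72, 77, 75, 68, 57, 46, 36], 12)
print(A[1], B[1])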
It is usually convenient to make separate lists of all the sines and cosines needed for the computation, multiplied by 2/N or 1/N. Such a list is given in Table 33 for the case of 12 observations (6 harmonics). The cosines and sines are multiplied by 2/N for all harmonics excepting the sixth, in which case 1/N is used instead. Given such a table, the observations are written below each other on a strip of paper and multiplied through by each column, one by one, and the products added vertically. The sums give all the coefficients. Actually, the data have yielded 12 quantities determined from the 12 observations: the 11 coefficients and the mean.
The computation of the A's and B's is not, however, the final step in the analysis. The result is more easily interpreted if the sines and cosines belonging to the same harmonic are combined into a single term. In general,

A_i sin(360° i t / P) + B_i cos(360° i t / P) = C_i cos[(360° i / P)(t − t_i)]

where C_i = √(A_i² + B_i²) is the amplitude of the i'th harmonic, and t_i = [P / (360° i)] arc tan (A_i/B_i) is the time at which the i'th harmonic has a maximum. Thus, if C_i and t_i are known, the i'th harmonic is easily sketched. In the computation of t_i the difficulty is encountered that arc tan (A_i/B_i) has two values between 0° and 360°. The correct solution can be derived from the additional relation that t_i = [P / (360° i)] arc sin (A_i/C_i). (These relations are proved in the Appendix.)
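The quadrant ambiguity described here can also be resolved numerically with a two-argument arc tangent; the following sketch is a modern convenience, not the book's procedure, and its names are assumed:

import math

def amplitude_and_phase(Ai, Bi, P, i):
    """Combine A_i sin + B_i cos into C_i cos[(360 i / P)(t - t_i)]."""
    Ci = math.hypot(Ai, Bi)                        # C_i = sqrt(A_i^2 + B_i^2)
    phi = math.degrees(math.atan2(Ai, Bi)) % 360   # picks the correct quadrant
    ti = P * phi / (360.0 * i)                     # time of the maximum
    return Ci, ti

# First harmonic of the New York hourly example below (A1 = 2.27, B1 = 1.16):
print(amplitude_and_phase(2.27, 1.16, P=24, i=1))  # roughly (2.55, 4.2)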
After having computed the coefficients for the harmonics, it is desirable to sketch them, add them up for each value of t, and add this sum to X̄. If no mistake has been made, the sum of the mean and the harmonics would add up to the original time series, X(t). Even when only the first two or three harmonics have been computed, their sum when added to the mean will often resemble the original time series. In such a case, the question may be asked: "Should I compute any more harmonics?"
TABLE 33
Multiplicands for Harmonic Analysis of 12 Observations,
(2/N) sin(360° i t / P) and (2/N) cos(360° i t / P)

Obs.
No.     A1      A2      A3      A4      A5      B1      B2      B3      B4      B5      B6
 1    .0833   .1443   .1667   .1443   .0833   .1443   .0833   .0000  -.0833  -.1443  -.0833
 2    .1443   .1443   .0000  -.1443  -.1443   .0833  -.0833  -.1667  -.0833   .0833   .0833
 3    .1667   .0000  -.1667   .0000   .1667   .0000  -.1667   .0000   .1667   .0000  -.0833
 4    .1443  -.1443   .0000   .1443  -.1443  -.0833  -.0833   .1667  -.0833  -.0833   .0833
 5    .0833  -.1443   .1667  -.1443   .0833  -.1443   .0833   .0000  -.0833   .1443  -.0833
 6    .0000   .0000   .0000   .0000   .0000  -.1667   .1667  -.1667   .1667  -.1667   .0833
 7   -.0833   .1443  -.1667   .1443  -.0833  -.1443   .0833   .0000  -.0833   .1443  -.0833
 8   -.1443   .1443   .0000  -.1443   .1443  -.0833  -.0833   .1667  -.0833  -.0833   .0833
 9   -.1667   .0000   .1667   .0000  -.1667   .0000  -.1667   .0000   .1667   .0000  -.0833
10   -.1443  -.1443   .0000   .1443   .1443   .0833  -.0833  -.1667  -.0833   .0833   .0833
11   -.0833  -.1443  -.1667  -.1443  -.0833   .1443   .0833   .0000  -.0833  -.1443  -.0833
12    .0000   .0000   .0000   .0000   .0000   .1667   .1667   .1667   .1667   .1667   .0833
Statistically we may ask instead: "What fraction of the variance of X is accounted for by the first few harmonics?" If this fraction is substantial, no additional computations need be made. Fortunately, the equation for the variance accounted for by a single harmonic, i, is simple: it is C_i²/2, except for the last harmonic, where it is C_i². If we form the ratio of this quantity to the total variance, s_x², we have the required fraction. Since the harmonics are all uncorrelated, no two harmonics can explain the same part of the variance of X. In other words, the variances explained by the different harmonics can be added. If the first harmonic accounts for 30%, the second for 50%, and the third for 15% of the variance, the three harmonics combined explain 95% of the variation of X, and additional harmonics are unimportant.

Here is an illustrative example of harmonic analysis applied to the time series consisting of average hourly temperatures at New York, January 1951, given in Table 34.
TABLE 34
Showing Temperatures as Function of Time

Time          Temperature      Time          Temperature
 1 p m           38.8           1 a m           35.5
 2 p m           39.0           2 a m           35.2
 3 p m           40.9           3 a m           34.8
 4 p m           41.2           4 a m           34.8
 5 p m           38.9           5 a m           34.6
 6 p m           38.1           6 a m           34.4
 7 p m           37.9           7 a m           34.5
 8 p m           37.3           8 a m           34.6
 9 p m           36.9           9 a m           35.3
10 p m           36.3          10 a m           35.9
11 p m           35.9          11 a m           36.7
Midnight         35.8          Noon             37.7
Table 33 cannot be used in this case, since we deal with 24 observations. After making up a table like Table 33 for 24 observations (N = 24, P = 24 hours), we find:

A1 = 2.27,  B1 = 1.16,  A2 = .84,  B2 = .09

C1 = √(A1² + B1²) = 2.55,    C2 = √(A2² + B2²) = .85

t1 = (1/15) arc sin (2.27/2.55) = (1/15) arc tan (2.27/1.16) = 4.2 hrs

t2 = (1/30) arc sin (.84/.85) = (1/30) arc tan (.84/.09) = 2.8 hrs
Variance accounted for by 1st harmonic (s² = 3.78):

C1² / (2s²) = 6.50 / 7.56 = 0.86 (86%)

By 2nd harmonic:

C2² / (2s²) = 0.72 / 7.56 = 0.09 (9%)
As might have been expected, the first harmonic is vastly more important than the second. Together, the 1st and 2nd harmonic account for 95% of total variation. No additional harmonics need be computed.
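As a numerical check, the following sketch recomputes the example from the Table 34 values, with t = 1 taken at 1 p.m., the first entry of the table; it reproduces the quoted coefficients and variance fractions to within rounding of the data:

import math

# Hourly temperatures from Table 34; t = 1 at 1 p.m., ..., t = 24 at noon.
temps = [38.8, 39.0, 40.9, 41.2, 38.9, 38.1, 37.9, 37.3, 36.9, 36.3, 35.9, 35.8,
         35.5, 35.2, 34.8, 34.8, 34.6, 34.4, 34.5, 34.6, 35.3, 35.9, 36.7, 37.7]
N = P = 24
mean = sum(temps) / N
var = sum((x - mean) ** 2 for x in temps) / N           # s^2, about 3.8

for i in (1, 2):
    A = 2 / N * sum(x * math.sin(math.radians(360 * i * t / P))
                    for t, x in enumerate(temps, 1))
    B = 2 / N * sum(x * math.cos(math.radians(360 * i * t / P))
                    for t, x in enumerate(temps, 1))
    C = math.hypot(A, B)
    print(f"i={i}: A={A:.2f}  B={B:.2f}  C={C:.2f}  "
          f"variance fraction={C * C / (2 * var):.0%}")
# i=1: A=2.27  B=1.17  C=2.55  variance fraction=86%
# i=2: A=0.85  B=0.09  C=0.85  variance fraction=10%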
4. Orthogonal Functions.
In the past section, we have expanded a periodic function in terms of sines and cosines. Why did we not expand in terms of some other kinds of periodic function, for example, tangents? The reason is that sines and cosines have the very important property of "orthogonality." This means that the average values of products like

sin(360° i t / P) sin(360° j t / P),

when averaged over the fundamental period, P, are zero unless i = j. Further, the average of

sin(360° i t / P) cos(360° j t / P)
over P is zero for all i and j. The consequence of this orthogonality is that the coefficients of the harmonics above can all be determined independently; in order to find, for example, A1 and B1, we do not have to solve simultaneously all 12 equations with 12 A's and B's (see Appendix). There is another consequence of the orthogonality of sines and cosines. Suppose we wanted to fit a single sine-cosine curve with period P to a time series in such a way as to make the sum of the squares of the deviations between sine curve and observations a minimum; because of the orthogonality, the coefficients of such a least-squares fit are exactly the A1 and B1 found by harmonic analysis.
Orthogonal functions are not used only to fit periodic time series. We shall meet orthogonal functions again in connection with space series later. Here, orthogonal functions are defined more generally: a set of functions f_n(x, y, ...) is orthogonal if the average value (averaged over some specified range of x, y, etc.) of the product f_n(x, y ...) f_m(x, y ...) is zero, unless n = m.¹ If n = m, we deal with the average of f_n²(x, y ...) which obviously is not zero, but is generally a known quantity. In fact, the functions are often chosen in such a way that the sum of the squares of the functions is unity; in that case, the functions are called "orthogonal and normal" or, in short, "orthonormal." The function √(2/N) sin(360° t / P), for example, is orthonormal.
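These orthogonality relations are easy to verify numerically; the sketch below is illustrative only, averaging the products over one fundamental period for N = 12:

import math

N = P = 12   # twelve equally spaced observations, t = 1 ... 12

def avg_product(f, g):
    return sum(f(t) * g(t) for t in range(1, N + 1)) / N

sin_i = lambda i: (lambda t: math.sin(math.radians(360 * i * t / P)))
cos_j = lambda j: (lambda t: math.cos(math.radians(360 * j * t / P)))

print(avg_product(sin_i(2), sin_i(3)))  # ~0: different harmonics
print(avg_product(sin_i(2), cos_j(3)))  # ~0: sine against cosine, any i, j
print(avg_product(sin_i(2), sin_i(2)))  # 0.5: i = j, so the factor sqrt(2/N)
                                        # makes the sum of squares unity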
5. Elimination of the Regular Cycles.
After isolating and analyzing the regular oscillations of a time series, the next step is to eliminate them or "subtract them out" of the data in some fashion so as to investigate the remainder. This elimination may be done in several ways:
1. If the period of the regular cycle is shorter than the presumed periods of the irregular oscillations, we may use either of two methods:
a. Use only observations at the same point of the cycle. For example, we should use only temperature observations made at noon and thus avoid the complications introduced by the diurnal temperature cycle, which has a period of one day.
b. Use the average of all observations over a complete regular cycle. For example, we could eliminate the influence of the diurnal temperature cycle by working only with average daily temperatures.
¹This definition is sometimes generalized to: the average of g(x, y) f_n f_m is zero, where g(x, y) is an arbitrary "weight function."
2. If the period of the regular cycle is long compared to the periods of the irregular oscillations, we can express each observation as a deviation from its mean or normal. For example, if a time series consisted of average monthly temperatures, each average monthly temperature would be expressed as a difference between it and the normal average monthly temperature. From the point of view of forecasting, we are interested only in deviations from the "normal." We know it will be hot in July, but the real question is: will the next July be hotter than normal?
6. Isolation and Analysis of the Irregular Cycles.
The remainder of the original time series after regular oscillations have been eliminated is called an "oscillatory time series." It presumably exhibits no particular regularity and no apparent cycle. No matter whether the time scale is seconds, days or years, it is generally made up of several types of oscillations:
1. Short-period fluctuations that are of such small scale as to go through one half or more periods between adjacent data observations. These cycles cannot be studied because the data are not frequent enough. Their effect can be largely eliminated by an averaging technique such as "running means" or "moving totals." If, for example, the series consisted of 200 observations, G1, G2, ..., G200, then we might replace the series by another series consisting of the terms
(G1 + G2 + G3)/3,  (G2 + G3 + G4)/3,  ......,  (G198 + G199 + G200)/3
which would be considerably smoother (a short sketch of this operation follows this list). This subject is discussed later in detail under the heading of "smoothing and filtering of time series."
2. There may be a slow, gradual change of the variate over the whole interval under investigation. Such a gradual change is called a trend. This trend never persists indefinitely, but is rather a part of oscillations with periods long compared to the record.
3. Irregular fluctuations of an intermediate scale.
The trend may be isolated and studied by the method of least squares. In the simplest case, the trend is essentially linear. Then a straight line can be fitted by the method of least squares.
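A three-term running mean of the kind shown in item 1 can be written in a few lines; the values below are made up for illustration:

def running_mean3(g):
    """Three-term overlapping (running) means, as in the formula above."""
    return [(g[k] + g[k + 1] + g[k + 2]) / 3 for k in range(len(g) - 2)]

print(running_mean3([3, 7, 4, 8, 12, 9]))  # -> [4.67, 6.33, 8.0, 9.67]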
The method has been outlined already in Chapter IV. The slope of the line of regression is given by the formula:

m = [ mean(Xt) − X̄ t̄ ] / [ mean(t²) − (t̄)² ]

Here a bar, or "mean," stands for an average over the series.
If the slope has been computed, the equation for the trend line can be written down because it passes through the point X̄, t̄. Thus, the equation is: X − X̄ = m(t − t̄). The computations can be simplified somewhat by choosing the origin of time at t̄. Then the slope formula simplifies to:

m = mean(Xt) / mean(t²)
Since observations are equally spaced, the denominator in these equations reduces to (N² − 1)/12, where N is the number of observations (as shown in the Appendix). Before the trend is computed, great care must be taken to make the time series homogeneous. As an example of the influence of heterogeneous data, the temperatures at New York City indicated a trend of 3°F per 100 years in the last 80 years. But in 1911 the U.S. Weather Bureau Office was moved to a skyscraper where the temperatures averaged 1°F colder than at the lower weather station. This was discovered by fitting separate lines of regression to the data before and after 1911. In 1911 a jump of −1°F occurred. The real trend, indicated by the two separate lines of regression, was 4°F per 100 years. This number would also have been obtained if the temperatures were first homogenized, for example, by adding 1°F to all the temperatures after 1911.
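The centered-origin slope formula, and the two-segment fit used to detect such a jump, might be sketched as follows; the series here is artificial, constructed to contain a jump of −1 in the middle:

def trend_slope(x):
    """Least-squares slope with the origin of time at the center of the
    series, where the formula reduces to mean(Xt)/mean(t^2) and
    mean(t^2) = (N^2 - 1)/12 for unit time steps."""
    N = len(x)
    t0 = (N + 1) / 2.0                       # center of t = 1 ... N
    num = sum(xv * (t - t0) for t, xv in enumerate(x, 1)) / N
    return num / ((N * N - 1) / 12.0)

series = [50.0 + 0.02 * k for k in range(40)] + \
         [49.0 + 0.02 * k for k in range(40, 80)]
print(trend_slope(series))                    # ~0.001: biased by the jump
print(trend_slope(series[:40]), trend_slope(series[40:]))  # ~0.02 each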
Trends should not be used as forecasts. They describe only the behavior of the variable in the past, and may stop at any time. Also, the cause of trends is not necessarily altogether due to a true change of weather. In many cities, such as New York, the gradual warming is due partly to the growth of the city.
When the trend shows definite curvature, a parabola is sometimes fitted by least squares. Let the equation for the parabola be:

X = a + bt + ct²
Then a, b, and c can be found as the solution of the equations:
ΣX = Na + bΣt + cΣt²
ΣXt = aΣt + bΣt² + cΣt³
ΣXt² = aΣt² + bΣt³ + cΣt⁴
Again, the algebra can be simplified by choosing the origin of time at the center of the series. In that case, Σt and Σt³ vanish. Just like linear trend lines, parabolic trend lines cannot be extrapolated accurately. In fact, the author once predicted that the enrollment at a certain small college should go to zero in 1947, and become negative thereafter. This prediction was based on a parabolic trend.
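With the origin of time at the center, the normal equations decouple and the parabola can be fitted without solving a full three-equation system; the following sketch, with hypothetical data, implements exactly this simplification:

def fit_parabola(x):
    """Fit X = a + b t + c t^2 by least squares, with the time origin at
    the center of the series so that sums of odd powers of t vanish."""
    N = len(x)
    t0 = (N + 1) / 2.0
    t = [k - t0 for k in range(1, N + 1)]
    S2 = sum(tv ** 2 for tv in t)
    S4 = sum(tv ** 4 for tv in t)
    SX = sum(x)
    SXt = sum(xv * tv for xv, tv in zip(x, t))
    SXt2 = sum(xv * tv ** 2 for xv, tv in zip(x, t))
    b = SXt / S2
    # Remaining pair:  SX = N a + c S2 ;  SXt2 = a S2 + c S4
    c = (N * SXt2 - S2 * SX) / (N * S4 - S2 ** 2)
    a = (SX - c * S2) / N
    return a, b, c

print(fit_parabola([2.1, 1.2, 0.9, 1.1, 1.8, 3.2]))  # recovers a curved trend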
7. Autocorrelation Functions.
Autocorrelation means correlation with itself. The autocorrelation coefficient of a time series for lag L is given by:

r_L = mean[ (X_t − X̄)(X_{t+L} − X̄) ] / (s1 s2)

In other words, correlation coefficients are computed between the time series and the same series displaced in time; such coefficients make up an autocorrelation function. In practice the denominator may simply be written s_x²,
since s1 and s2 are essentially the same and equal to s_x. The interval L is called the lag. If the lag is zero, the autocorrelation coefficient is one. If the lag is small, meteorological autocorrelation coefficients are normally still positive. In other words, meteorological data normally have persistence. That is, if the temperature is above normal today, it will usually be above normal tomorrow. As the lag increases, the autocorrelation becomes smaller and may even become negative. This would mean, for example, that if the temperature today is above normal, it is likely to be below normal after a certain interval. It is clear that such information has some forecasting value. A correlogram shows the autocorrelation coefficients as a function of the lag L. Figure 11 shows a typical correlogram.
FIGURE 11. Typical Autocorrelation Function. (Autocorrelation coefficient plotted against lag L, in days.)
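A correlogram of the kind shown in Figure 11 can be computed directly from the definition of r_L; the sketch below uses the overall mean and variance in place of s1 and s2, as suggested above, and the series is made up:

def autocorrelation(x, max_lag):
    """Autocorrelation coefficients for lags 0 ... max_lag."""
    N = len(x)
    mean = sum(x) / N
    var = sum((v - mean) ** 2 for v in x) / N
    r = []
    for L in range(max_lag + 1):
        cov = sum((x[t] - mean) * (x[t + L] - mean)
                  for t in range(N - L)) / (N - L)
        r.append(cov / var)
    return r

# A persistent series gives r_0 = 1 and slowly decaying coefficients:
print(autocorrelation([1, 2, 4, 5, 4, 3, 1, 0, 1, 3, 5, 4, 2, 1, 2, 4], 5))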
If we constructed the correlogram of an ideally sinusoidal time series with period P, the correlogram would show a cosine curve having the same period and an amplitude of unity (i.e., perfect correlation for a lag of one period, correlation −1 for a lag of half a period). Originally, correlograms were used to estimate important periods in the time series. From the schematic figure above, the conclusion would have been drawn that the time series had a significant period of six days. Now, one concludes that the spectrum of the time series probably has a general maximum near six days.
In general, the autocorrelogram shows fluctuations with the same kinds of periods as the original time series, excepting that all the fluctuations have been put in phase so that they reach a maximum at zero lag.
The fact that autocorrelation coefficients are different from zero means that many of the statistical tests are inapplicable, since independent data are assumed. On the other hand, auto-
correlation coefficients are useful in the prediction of future values of the time series. Since the autocorrelation coefficients do not drop to zero immediately for lag L, it seems reasonable to ask how well the temperature, or other variate, tomorrow can be forecast as a function of the temperature today, yesterday and so forth. As a matter of fact, quite a good forecasting method can be obtained in many cases by just writing the regression equation for the variate on one day as a function of the same variate the day before. The result is the "normal-persistence" method of forecasting; the forecast for the next day will always be closer to the normal, and in the same direction from the normal. It seems reasonable to think that inclusion of values of the variable at an earlier time than, say, the day before, in the forecasting scheme, will improve the forecasting method further, especially since the autocorrelation functions often do not drop to zero rapidly. However, attempts in this direction have always met with failure. It is always possible to derive, from a given set of data, an equation connecting the value of a variate at a given time with the values at previous times. But, unfortunately, such equations seem to hold only during the periods from which they have been derived, and break down in test periods. Clearly, the forecast of a given variate must depend not only on the same variate previously but on other variates as well.
8. Spectrum Analysis.
Spectrum analysis is done, in practice, by applying a type of harmonic analysis to the autocorrelation function. However, the spectrum can best be defined first in terms of the principles of harmonic analysis set forth in connection with the analysis of regular cycles. Suppose that we are given a record of some meteorological scalar variable with N observations taken at time interval Δt. Let the overall length of the record, NΔt, be denoted by P. Now we subject this record to harmonic analysis, choosing P as the fundamental period or the period of the first harmonic. We could, in principle, compute N/2 harmonics, the last of which has a period of 2Δt. The contribution of each harmonic (except the last) to the variance is given, as before, by C_i²/2, where i is the number of the harmonic. The spectrum now is defined as a
graph of C_i²/2 as a function of i. Figure 12 shows a schematic view of such a graph. The meaning of the spectrum is that it shows the contribution of each harmonic to the total variance.
FIGURE 12. Spectrum (Schematic) by Direct Harmonic Analysis.
A spectrum of a time series is analogous to an optical spectrum. An optical spectrum shows the contribution of different wavelengths or frequencies to the energy of a given light source. The spectrum of a time series shows the contributions of oscillations with various frequencies to the variance of a time series.
The area under the "variance" spectrum is the sum of the contributions of the individual frequencies to the variance, which must equal the total variance. Sometimes we work with a "normalized" spectrum, which has unit area and is derived by dividing all ordinates of the original spectrum by the variance.
As mentioned previously, the quantity i is the number of the harmonic. It is also called the "frequency," for it measures the number of complete cycles the i'th harmonic executes in time P. For example, if P is 300 days, the 4th harmonic undergoes four complete cycles in a period of 300 days. Note that the period of the i'th harmonic is inversely proportional to i, being P/i. In the above example, the 4th harmonic has a period of 75 days.
Unfortunately, if all N/2 harmonics are computed, and half the square of their amplitude is plotted as a function of the fre-
quency, the points scatter wildly. In fact, if the spectra of two different portions of the same stationary time series are derived, the individual points of the spectrum will look quite different. However, the smooth underlying spectrum (drawn by a line in Figure 12) will be the same. The figure attempts to show the kind of scatter encountered but, in an actual spectrum, the scatter about the true spectrum has a much larger amplitude. In particular, outstanding amplitudes may be found by accident at particular values of i. At one time this was believed to imply exact periodicities at these frequencies; now, individual sharp peaks in irregularly oscillating time series are recognized to be accidental.
Spectrum analysis does not attempt to determine the individual amplitudes of each harmonic, but is aimed rather at the determination of the smoothed spectrum, which is the same for different portions of the same stationary time series. In fact, we are really not considering the "spectrum" of the given short time series. Rather, we consider the spectrum of an infinitely long stationary time series, of which the given time series is a short random sample. For the long series, we can define a smooth spectrum by a suitable limiting process; our problem is to estimate this smooth spectrum on the basis of the given short series. There are at least three ways of determining such a smoothed spectrum:
1. Determine the smoothed spectrum by smoothing the amplitudes (C_i²/2) of all the N/2 individual harmonics by some algebraic smoothing process, an equivalent operation to drawing a smooth line through a number of points.
2. Harmonic analysis of the autocorrelogram.
3. Use of a mechanical or electronic harmonic analyzer.
The first of these methods is too time-consuming, and the last is beyond the scope of this book. Hence, the remaining discussion will concern the technique of computing smooth spectra with the aid of autocorrelation functions. If we use the autocorrelation functions themselves, we obtain normalized spectra (area 1). If we use autocovariances instead (correlation coefficients in which the variance in the denominator has been omitted), the result is a spectrum as shown in Figure 12, the area of which is the variance.
In the case of B_0 and B_m, the coefficients resulting from the formula have to be divided by 2. If B_i is plotted as a function of i/(2mΔt), the resulting curve is a smoothed version of the normalized spectrum of the original series, C_i²/(2s_x²); the smaller m is compared to the original number of observations N, the more smoothing is obtained. But, when m becomes small, little will be known concerning the importance of long periods, since 2mΔt is the fundamental period. Further, small m decreases the resolution of the technique (it may smooth the spectrum too much) and may hide significant characteristics of the "population" spectrum.
The above formula, however, does not give the best estimate of the smoothed spectrum function. In effect, it computes weighted means of the unsmoothed spectral estimates as are shown schematically in Figure 12. Unfortunately, some of the weights are negative; this results in greatly distorted spectral estimates whenever there are rapid fluctuations in the true spectrum. To overcome this drawback, John W. Tukey suggests improving the spectral estimates by forming the smoothed estimate

S_i = .25 B_{i−1} + .50 B_i + .25 B_{i+1}

To summarize, a good estimate of the spectrum is computed as follows:
1. Autocorrelation coefficients are formed for lags 0 to m.
2. A harmonic analysis of these coefficients is carried out.
3. The coefficients are smoothed by a weighted moving average.
The following figures illustrate the determination of the spectrum of vertical velocity obtained from 391 observations of vertical velocity at ten-second intervals. Figure 13 gives autocovariances from lag 0 to lag 30.

FIGURE 13. Autocovariance of Vertical Velocity. (Autocovariances, m²/sec², plotted against lag, 0 to 300 seconds.)

Since the mean vertical velocity during the period was zero, the autocovariance for lag zero equals the square of the standard deviation, or the variance, of the vertical velocity. No periodicities are particularly obvious in the autocovariances. Had they been computed up to higher lags, it would have become apparent that major oscillations occur with periods of about 200 to 300 seconds. Figure 14 shows the spectrum obtained by harmonic analysis of the serial products of Figure 13 and subsequent smoothing. The spectrum shows that most of the energy of the variance of vertical velocity is produced by eddies with periods of the order of 200 seconds (three cycles in 10 minutes). There are no important gaps in the spectrum. Further, though maximum energy is contributed by the longer periods, the total energy contributed by eddies with periods less than 3 minutes is by no means negligible, as can be judged by the area under the spectral curve to the right of 3 cycles per 10 minutes.
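The cosine-transform step itself falls in a garbled stretch of the original, so the sketch below follows the standard Tukey procedure that the three steps above summarize; the exact formula of the original is assumed rather than quoted, and the two end estimates are handled crudely:

import math
import random

def tukey_spectrum(x, m):
    """Smoothed spectrum estimates from autocovariances for lags 0 to m
    (step 1), a cosine transform giving raw estimates B_0 ... B_m with the
    end values halved (step 2), and .25/.50/.25 smoothing (step 3)."""
    N = len(x)
    mean = sum(x) / N
    R = [sum((x[t] - mean) * (x[t + L] - mean) for t in range(N - L)) / (N - L)
         for L in range(m + 1)]
    B = []
    for i in range(m + 1):
        s = R[0] + R[m] * math.cos(math.pi * i) \
            + 2 * sum(R[k] * math.cos(math.pi * i * k / m) for k in range(1, m))
        B.append(s / m)
    B[0] /= 2.0
    B[m] /= 2.0
    # Hanning smoothing; the clamped ends are a simplification.
    return [0.25 * B[max(i - 1, 0)] + 0.5 * B[i] + 0.25 * B[min(i + 1, m)]
            for i in range(m + 1)]          # estimate i belongs to frequency i/(2 m dt)

# Example: a noisy oscillation of period 20 steps peaks near i = 3,
# since the period of estimate i is 2*m*dt/i = 60/i here.
xs = [math.sin(2 * math.pi * t / 20) + random.gauss(0, 0.5) for t in range(400)]
print(max(range(31), key=lambda i: tukey_spectrum(xs, 30)[i]))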
FIGURE 14. Spectrum of Vertical Velocity. (Spectral estimates plotted against frequency, 0 to 30 cycles per 10 minutes.)
9. Sampling Fluctuations of Spectral Estimates.
Suppose we consider a given time series as a sample from an infinitely long stationary time series. We shall call this long series the "population."
Also, suppose that the spectrum of the population were known. Then different samples would have different spectra. If the samples are drawn at random, and the distribution of the variate is normal, the sample spectrum estimate at a given period is distributed about the corresponding population spectrum approximately according to the distribution of chi square divided by the number of degrees of freedom. The number of degrees of freedom in this case is (2N − m/2)/m, where m is the number of lags
and N the original number of observations. In the example, n, the number of degrees of freedom, was (782 − 15)/30, or about 26. From Table 11, Chapter III, the 5% limit of chi square is 38.88, and therefore the 5% limit of chi square/n is 1.50. This means the following: suppose the population had a spectral intensity of .0054 at 9 cycles/10 minutes (see Figure 14). Then the probability is 5% that a sample may show an estimate of 1.50 × .0054 = .0081, or larger. The actual estimate at that point was .0068, considerably less than this 5% limit of .0081. Hence, the observed .0068 is not significantly different from the hypothetical .0054. This was the reason why a smooth curve could be drawn to represent all observed points sufficiently well, even the apparent peak at 9 cycles. Without the theory of sampling fluctuations, one might have supposed that the spectrum had a "significant" period of 10/9 minute (9 cycles per 10 minutes). Actually, the peak at this point is probably due to sampling fluctuations. In general, the theory shows the sampling fluctuations are large, and that peaks and troughs are significant only if they are extreme, or if several adjacent points are all relatively high or low. For if 1 point has 26 degrees of freedom, 4 points have together 104, and their average is subject to much less sampling variation. Also, peaks become more significant for large values of N/m.
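The arithmetic of this significance check can be traced in a few lines; the chi-square 5% point for 26 degrees of freedom (38.88) is taken from the book's Table 11, and the remaining numbers are those quoted above:

n = (2 * 391 - 30 // 2) / 30          # degrees of freedom (2N - m/2)/m, ~26
limit_ratio = 38.88 / round(n)        # 5% limit of chi square / n, ~1.50

population_value = 0.0054             # hypothetical spectrum at 9 cycles/10 min
observed = 0.0068
print(round(n), limit_ratio)                        # 26, ~1.50
print(observed < limit_ratio * population_value)    # True: not significant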
10. Purpose of Spectrum Analysis.
Since spectrum analysis can locate only the general shape of the spectral curve, but cannot pinpoint particular cycles, one might ask whether there is any use in making it. There are several reasons for statistical spectroscopy.
1. One may need the spectrum to understand the physics underlying the variation of a time series. In the short period variations, at daytime, often two distinct maxima appear in the spectrum, one due to mechanical turbulence, one due to heat convection. Again, theory might predict the instability of waves with certain periods. A maximum should appear at such periods.
2. Significant maxima and minima are important for forecasting, even if they are not sharp peaks or troughs, for they indicate the likelihood or unlikelihood of variations with certain average periods. If there exists a gap in the spectrum near four days, then repetitions of weather phenomena with periods perhaps between 3½ and 4½ days are rare.
3. Cloud seeders and other personnel engaged in weather control claim they can change the spectrum significantly. This can be tested by Tukey's spectrum fluctuation theory.
4. The extent of the spectrum shows how quickly instruments have to respond to measure such things as kinetic energy of the air or, more generally, the variability of any quantity, and what the error is likely to be when the instrument reacts slowly.
5. The question may be raised over what interval of time winds should be averaged in order to be stable. It is the practice to average winds over one or two minutes. Now, it appears that the spectrum of the wind variance has a great deal of energy near periods of two to three minutes, implying gusts every two or three minutes. This means that a one-minute wind average now and a one-minute wind average a minute later may come out quite differently. In fact, it appears that winds near the surface should be averaged for at least 30 minutes before really stable estimates can be expected.
11. Smoothing and Filtering.
As explained above, it is possible to describe the spectral content of a time series by statistical means. If the original series contains some frequencies or periods which are of no interest in the study at hand, these waves may be reduced in amplitude by statistical filtering, whereby the spectrum of the original series is altered. For example, a time series of wind measurements made by a cup anemometer may contain fluctuations having a period of only a few minutes. In a study of wind changes resulting from the movement of synoptic-scale storms, a knowledge of these short-term fluctuations would not be important, and their appearance in the data would unnecessarily complicate the problem. Therefore, it would be desirable to smooth these wind data to eliminate short-period variations. Smoothing is a form of filtering which produces a time series in which the spectral components at high frequencies are reduced. In terms of electrical engineering, this type of filter is called a low-pass filter, since low frequency (long-period) waves are barely affected by the smoothing. A smoothed time series value is merely an estimate of what the value in the series would be if undesired high frequencies were not present.
It is also possible to filter out low frequencies, leaving only higher frequency waves in the series. This type of filtering of time series will be called "high-pass filtering." It is also possible to filter out both high and low frequencies, with the effect of leaving only medium frequencies in the resulting time series. Such filtering will be called "band-pass filtering." Finally, it is actually possible to design statistical filters which will amplify high frequency waves in a time series in such a way as to partially reverse the effects of a previous smoothing of the same series. This latter process is called "desmoothing" or "inverse smoothing."
Statistical filters consist of a series of weights (usually fractional values) which are cumulatively multiplied by consecutive values in a time series to obtain the filtered variable. The simplest statistical filter, or filtering function as it may be called, is the equally-weighted running mean, which is computed by adding n consecutive time series values and dividing the sum by n. These means are computed for data centered on each value in the series, so there is considerable overlap in the values used in computing adjacent running means. Hence, this type of mean is often called the overlapping mean. In the case of the overlapping mean, the filter "weights" are all equal to 1/n. As will be explained later, it is preferable not to have the weights equal, but to have them decrease smoothly and symmetrically outward from the central weight, in spite of the increased computational labor involved in using this type of filter.
TABLE 35
Filtering of Time Series

Time Series    Filtering Function    Filtered Values
    16              .06 ×
    15              .25 ×
    12              .38 ×                 13.5
    13              .25 ×                 14.3
    17              .06 ×                 16.5
    20                                    18.4
    18
    21
    23
An example of filtering is given in Table 35. The series of weights in the center are multiplied by consecutive time series values. The product on the right is the filtered variable for the time of the series value multiplied by the central or principal weight of the filter, which is 0.38 in this case. For example, 18.4 is the filtered value for the time series value of 20 and is the cumulative product: (.06 × 13) + (.25 × 17) + (.38 × 20) + (.25 × 18) + (.06 × 21). The filtering function is moved down the time series one data interval after the computation of each filtered value before the next value is computed.
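The sliding multiplication just described, applied to the Table 35 values, reproduces the filtered column; the sketch below is a direct transcription of that procedure (the function name is assumed):

def filter_series(x, weights):
    """Slide a symmetrical weighting function along the series; each output
    is the cumulative product-sum centered on one series value."""
    h = len(weights) // 2
    return [sum(w * x[k + j - h] for j, w in enumerate(weights))
            for k in range(h, len(x) - h)]

series = [16, 15, 12, 13, 17, 20, 18, 21, 23]      # Table 35
weights = [.06, .25, .38, .25, .06]
print([round(v, 1) for v in filter_series(series, weights)])
# -> [13.5, 14.3, 16.5, 18.4, 19.5]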
vi>""
' .. '
:1
,.
'•f:
:·.•.1),-"f
"''""A .
~·
l~t~;·•'¥ .. ~.~;_""
1
is altered by the filtering. This information is expressed by the filter's frequency response, which is the ratio of the amplitude of an oscillation of a given frequency after filtering to the original amplitude before filtering. This ratio varies with frequency. For example, the frequency response of a smoothing function varies from near unity at low frequencies to zero at higher frequencies. The frequency response of any symmetrical discrete-valued smoothing or filtering function is given by the following equation:

R(f) = w_0 + 2 Σ (k = 1 to n) w_k cos(2kπfΔt)

where R(f) = the frequency response, f = frequency, w_k = the kth weight numbered outward from the central weight w_0, and Δt = the data interval, the time between successive observations in the time series.

The equally-weighted running mean type of smoothing is so common that the following approximate general formula will be useful for computing its frequency response:

R(f) = sin(πfT) / (πfT)

where T is the interval over which the time series values are averaged, and f is measured in cycles per time in the same units as T. A graph of this function is given in Figure 15. Notice that for some frequencies the response is negative, which means that the amplitudes of these waves are not only reduced but their polarities are also reversed, namely, maxima are turned into minima and vice versa. This property of equally-weighted running means can be avoided by using smoothing functions which have weights which decrease outward in each direction from the central weight in proportion to the ordinates of a normal probability curve. The frequency response of a smoothing function having discrete weights proportional to a normal curve with a standard deviation of σ is given approximately by the following equation:

R(f) = e^(−2π²σ²f²)

Thus, the frequency response of the normal curve smoothing function has the shape of a normal curve also. Consequently, the response of this filter decreases slowly and smoothly with increasing frequency. It is impractical to design statistical
FIGURE 15. Frequency Response Function of Ordinary Mean. (Response plotted against frequency in cycles per interval T, the time over which the running mean is computed.)
filters which have a "sharp cut-off," namely, a rapid decrease in response at some specified frequency. Statistical filters which, in theory, have sharp cut-off will never give this ideal response in practice. Smoothing functions can be designed which have approximately the shape of the normal curve by making the weights proportional to the binomial coefficients. As explained in an earlier chapter, these coefficients are computed by the following equation:

C_m = N! / [m! (N − m)!]

These coefficients converge on the shape of the normal curve as N increases. With N = 4, the binomial coefficients for m = 0, 1, 2, 3, and 4 are 1, 4, 6, 4 and 1, and the smoothing function weights are 1/16, 4/16, 6/16, 4/16, and 1/16, or 0.0625, 0.250, 0.375, 0.250, and 0.0625. An example of the use of this filter is shown in Table 35. In general, the frequency response of a binomial smoothing function is:

R(f) = cos^N(πfΔt)
which, in practice, approaches the frequency response of the normal curve smoothing function. As N increases, the number of coefficients also increases, but the number of weights of non-negligible magnitude in the final filter does not increase as rapidly. In order to obtain a binomial filter having many weights, it may be necessary to take a very large value of N whose coefficients may be inconveniently large or not even tabulated. In these cases it may be more convenient to make the weights of the filter proportional to the normal curve, and use the corresponding equation to compute the frequency response.

Statistical high-pass filtering can be accomplished by subtracting smoothed values from the original series. This operation leaves only high frequencies in the remaining time series. If these values are smoothed again slightly, only intermediate frequencies will remain, and the process is a band-pass filter. The frequency response of the band-pass filter is merely the difference between the frequency responses of the two smoothing functions used. The frequency response of the high-pass filter is one minus the frequency response of the smoothing function used in computing the smoothed values subtracted from the original values in obtaining the filtered series.

A common type of smoothing produced by instruments is "exponential" smoothing. It is called exponential smoothing because the contribution of various values in the time series to the smoothed output value decreases exponentially in the direction of past time. Future values in the time series do not contribute to the smoothed output. All physical instruments with constant lag coefficients (time constants) smooth exponentially. A mercury-in-glass thermometer has a nearly constant lag coefficient and is an example of an instrument which smooths exponentially. The frequency response of an exponential filter is:
R(f) = 1 / √(1 + 4π²f²λ²)
where λ is the time constant in the same units of time as the reciprocal of frequency f. Because the weighting of this type of filter is not symmetrical about the time of the filtered variable, as was the case with all the other filters discussed so far, this filter will shift the phase of waves in the output series in respect to the
phase of waves of the same frequency in the original series. This phase shift is the following function of frequency:

ψ = tan⁻¹(−2πfλ)
where ψ is the phase shift. The angle ψ is always a negative angle between zero and 90 degrees. Thus, the filtered waves always lag behind the original waves. In the above equations, Δt does not appear because time series smoothed exponentially by physical instruments are continuous and not discrete-valued.
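The response formulas above can be compared numerically; the sketch below evaluates the general symmetrical-filter formula for the binomial weights of Table 35 and checks it against cos^4(πfΔt), along with the running-mean and exponential responses (all names assumed):

import math

def response_symmetric(weights, f, dt=1.0):
    """R(f) = w0 + 2 * sum_k w_k cos(2 k pi f dt) for a symmetrical filter."""
    h = len(weights) // 2
    return weights[h] + 2 * sum(weights[h + k] * math.cos(2 * k * math.pi * f * dt)
                                for k in range(1, h + 1))

def response_running_mean(f, T):
    return math.sin(math.pi * f * T) / (math.pi * f * T)

def response_exponential(f, lam):
    return 1.0 / math.sqrt(1.0 + 4 * math.pi ** 2 * f ** 2 * lam ** 2)

w = [.0625, .25, .375, .25, .0625]        # binomial weights, N = 4
for f in (0.05, 0.25, 0.5):               # cycles per data interval
    print(f, response_symmetric(w, f), math.cos(math.pi * f) ** 4)
print(response_running_mean(0.1, T=5), response_exponential(0.1, lam=2.0))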
All meteorological instruments smooth data to a greater or lesser degree. A meteorologist, knowing the frequency response of the exponential filter, can correct the computed spectrum of a time series of observations taken by an instrument so as to estimate the true spectrum which would be obtained by analysis of observations of a perfect instrument which does not smooth at all. Furthermore, it is possible by means of inverse smoothing or desmoothing to obtain an estimate of the true series itself, not just its spectrum. An inverse smoothing function for reversing exponential smoothing is derived by considering the differential equation for the reading of an instrument with constant lag coefficient. This equation is:
X − X' = λ (dX'/dt)

where X is the true value of the variable measured and X' is the reading of the instrument which performs the smoothing. If this equation is solved for X, one obtains:

X = X' + λ (dX'/dt)

The derivative in this equation could be determined graphically from a plot of the time series. However, it would probably be more accurate to compute the derivative by the method of finite differences. Before the latter method is used, the continuous exponentially-smoothed values must first be read off at evenly-spaced times to form a discrete-valued time series. When these values are obtained, the following finite difference form of the differential equation may be used to desmooth the series:

X_t = X'_t + λ (X'_{t+Δt} − X'_{t−Δt}) / (2Δt)

where X_t and X'_t are the values of the desmoothed and smoothed series at time t.
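Assuming the centered-difference form given above (the exact difference scheme of the original is lost at a page break), desmoothing can be sketched as follows; the readings and time constant are hypothetical:

def desmooth(xs, lam, dt):
    """Reverse exponential smoothing: X_t = Xs_t + lam*(Xs_{t+dt} - Xs_{t-dt})/(2 dt).
    End points are dropped because the centered difference needs both
    neighbors."""
    return [xs[t] + lam * (xs[t + 1] - xs[t - 1]) / (2 * dt)
            for t in range(1, len(xs) - 1)]

# Smoothed thermometer readings sampled every 10 s, time constant 30 s:
readings = [20.0, 20.4, 21.1, 21.9, 22.4, 22.6]
print(desmooth(readings, lam=30.0, dt=10.0))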
12. Correlation of Time Series.
Given two time series, X(t) and Y(t), observed simultaneously for a period P at a constant time interval Δt. The problem is to determine whether these time series are related to each other, and to find the character of the relationship.
One would be tempted to start by computing simple linear correlation coefficients between X(t) and Y(t) by the usual formula:

r = [ mean(XY) − X̄ Ȳ ] / (s_x s_y)
As pointed out before, it is difficult to judge the significance of such coefficients, because standard tests of significance are based on the hypothesis that the different observations of X, as well as the different observations of Y, are independent of each other. The very fact that autocorrelation functions exist shows that this assumption is not satisfied. Nevertheless, such "cross correlations" between X and Y have often been computed; occasionally, quite large coefficients have been found to exist for a time, only to be replaced by negligible coefficients when more observations are available. Still, cross correlations between time series often give a correct picture of a more permanent relationship, particularly when the coefficients are large and based on a known physical mechanism responsible for the relation. We will see later that we can test the significance of the relations by the technique of cross spectrum analysis, which allows for autocorrelation in each time series. Sometimes we may suspect that the best correlation is not obtained if we correlate simultaneous values of X and Y. For
example, we may have a theory that particles emitted from the sun produce a heating of the stratosphere four days later. In that case, we would correlate an index of solar particle emission at time t with the temperature at, say, 50 mb, at time t plus four days. Such correlations are called lag-cross correlations. In practice, the word "lag" is often omitted, so that the term cross correlation between two time series also includes correlation coefficients between the series with different lags. One difficulty with this is, of course, that if many lags are tried without a firm theory as to why one particular lag is best, the chances are good that a large correlation will be found at a certain lag by accident. In the case of correlations between solar effects and weather, the sign of the lag is usually known, since it is difficult to see how terrestrial phenomena can influence solar phenomena. In other cases, the direction of the lag for best correlation is not known, but is, in fact, the principal unknown in the problem. In that case, one would compute correlation coefficients both with negative and positive lags.
Let r_L be given by:

r_L = mean[ (X_t − X̄)(Y_{t+L} − Ȳ) ] / (s_x s_y)
If there are N observations, there will be N − L terms in each cross correlation. Although the means and standard deviations of X and Y are slightly different for different lags (since different observations are being included), these differences are often ignored, and constant means and standard deviations are used, computed from the complete series X and Y. The numerator of the expression is called the "cross covariance."
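The lag cross correlation just defined can be computed for a range of positive and negative lags and the extremum located; in the sketch below the second (artificial) series is the first displaced by one step, so the largest coefficient appears at L = 1:

def lag_cross_correlation(x, y, L):
    """r_L between x at time t and y at time t + L (L may be negative).
    Constant means and standard deviations from the complete series are
    used, as the text suggests."""
    N = len(x)
    mx, my = sum(x) / N, sum(y) / N
    sx = (sum((v - mx) ** 2 for v in x) / N) ** 0.5
    sy = (sum((v - my) ** 2 for v in y) / N) ** 0.5
    pairs = [(x[t], y[t + L]) for t in range(N) if 0 <= t + L < N]
    cov = sum((a - mx) * (b - my) for a, b in pairs) / len(pairs)
    return cov / (sx * sy)

x = [1, 3, 4, 3, 1, 0, 1, 3, 4, 3, 1, 0]
y = [0, 1, 3, 4, 3, 1, 0, 1, 3, 4, 3, 1]   # x displaced by one step
print(max(range(-3, 4), key=lambda L: lag_cross_correlation(x, y, L)))  # 1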
TABLE 36
Lag Correlation Coefficients Between Geostrophic Winds at 60° N and 40° N¹

Lag, days ........    -3     -2     -1      0      1      2      3
Correlation ......   -.33   -.45   -.54   -.60   -.59   -.53   -.45

¹ Lag positive if 60° precedes 40°.
Table 36 shows a series of cross correlation coefficients between mean geostrophic wind speeds at 500 mb (averaged around the hemisphere) at latitudes 60°N and 40°N, computed from …-day averages covering about three years. From previous studies it was well known that there is a negative correlation between winds at these latitudes, that fast winds at 60°N tend to coincide with slow winds at 40°N. The question is: is there any tendency for a lag relationship between these two latitudes? In other words, is a decreasing wind at 60°N likely to be followed or preceded by a speeding up at 40°N?
If the effect were simultaneous at both latitudes, the distribution of correlation coefficients would be symmetrical about lag zero. Actually the coefficient at lag plus 1 is about as large numerically as that with lag zero, and larger numerically than that with lag −1. We conclude tentatively that there is a tendency for effects at 60°N to precede those at 40°N. In this case, additional samples showed similar results, confirming the reality of this conclusion.
13. Cross Spectrum Analysis.

When two time series appear to be correlated, the question may be asked whether this correlation is due to a correlation between high frequency components or low frequency components. Or, perhaps, two time series may appear to be uncorrelated because, really, the low frequency components are negatively correlated and the high frequency components positively correlated. A cross spectrum consists of two components, the cospectrum and the quadrature spectrum. The cospectrum measures the contribution of oscillations of different frequencies to the total cross covariance at lag zero between two time series. The quadrature spectrum will be discussed later.
The cospectrum is computed by first averaging the cross covariances at lag L and lag −L. For example, the cross covariance at lag 3 days is averaged with that at lag −3 days. These averages are then tabulated as a function of L. The series of these mean cross covariances is subjected to a type of Fourier analysis, in which the same formulae are employed as those already given in connection with spectrum analysis; instead of multiplication of the autocovariances by different cosines and summing, the cross covariances are treated in the same way. Even the final smoothing with weights .25, .50, .25 remains the same.
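A sketch of this computation, assuming the cross covariances have already been found for lags −m to m; the normalizing constants of the Fourier step are an assumption, taken from the usual lag-correlation spectrum formulae rather than quoted from the text:

```python
import numpy as np

def cospectrum(cross_cov, m):
    """Cospectrum estimates from the symmetric part of the cross
    covariances (average of lags L and -L), cosine-transformed as in
    ordinary spectrum analysis and then smoothed with weights
    .25, .50, .25.  cross_cov maps each lag -m..m to its value."""
    c = np.array([(cross_cov[L] + cross_cov[-L]) / 2.0
                  for L in range(m + 1)])
    raw = np.empty(m + 1)
    for i in range(m + 1):
        L = np.arange(1, m)
        raw[i] = (c[0]
                  + 2.0 * np.sum(c[L] * np.cos(np.pi * L * i / m))
                  + c[m] * np.cos(np.pi * i)) / m
    co = raw.copy()
    co[1:-1] = 0.25 * raw[:-2] + 0.50 * raw[1:-1] + 0.25 * raw[2:]
    return co
```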
FIGURE 16
Cospectrum, Westerly Indices 40° N and 60° N. (Ordinate: cospectrum; abscissa: period, days.)
The cospectrum computed in this manner from the 500 mb winds already described in connection with Table 36 is shown in Figure 16. Again, all the cospectrum estimates have been divided by the period in order that the area under the curve, when plotted on a logarithmic scale of period, represent the total cross covariance. The figure shows that the largest contributions to the cross covariance between winds at the two latitudes come from periods of the order of 25 days. Incidentally, all cospectrum estimates were negative or close to zero, consistent with the fact that the total covariance was negative. Apparently, high-frequency (short-period) oscillations contribute little to the relationship between the winds at 40°N and 60°N. The cospectrum allows only for the simultaneous relations between two time series. If we are interested in lag relations, we would start with differences between cross covariances at lag L and −L, since any lags will show up in an asymmetry of the cross correlations about lag zero. In the cospectrum analysis, we averaged quantities at lags L and −L, thus averaging out the effect of lag. The quadrature spectrum measures the contribution of the different harmonics to the total covariance of the series obtained
when all the harmonics of time series X are delayed by a quarter period but the Y series remains unchanged. The quadrature spectrum is computed as follows: first subtract the cross covariances at lag −L from the cross covariances at lag L, and divide by 2. These quantities are functions of L, and will be denoted by D(L). Then quadrature estimates are obtained by:

    Q(i) = \frac{1}{m} \sum_{L=1}^{m} \left[ D(L) \sin\left( \frac{\pi L i}{m} \right) \right]
followed by smoothing with weights .25, .50, .25. Here m is the number of positive lags used, not counting zero. The importance of the quadrature spectrum is that the ratio of it to the cospectrum is a measure of the relative phase of the harmonics of X(t) and Y(t). Quantitatively:
    \frac{\pi\, i\, T}{m} = \tan^{-1}\left[ \frac{Q(i)}{Co(i)} \right]

where Co(i) is the cospectrum at frequency i, and T is the lag.
TABLE 37
Lag (Days) of Winds at 40° N After Winds at 60° N

Period of Harmonic ......   60     30     20     15
Lag .....................  32.2   16.7   10.8    7.6
Table 37 shows the lag of the winds at 40°N after the winds at 60°N, showing that the lags are somewhat greater than half the period of the corresponding harmonic. If there were a simultaneous negative correlation, all the lags would have been exactly half a period.
Finally, the question may be asked: how good is the relationship between the two variables at various periods? This is answered by use of the coherence, defined by:

    CH(i) = \frac{ Co^2(i) + Q^2(i) }{ S_X(i)\, S_Y(i) }

where S_X(i) and S_Y(i) are the spectrum estimates of X and Y at frequency i.

The coherence can vary from 0 to 1, and is analogous to the
square of a correlation coefficient, except that the coherence is a function of frequency. Table 38 gives the values of coherence between the geostrophic winds at 60°N and 40°N. Apparently, the coherence is small for oscillations with short period, large for long-period fluctuations.
TABLE 38
Coherence Between Winds at 60° N and 40° N

Period, days ......   60    30    20    15    10     5
Coherence .........  .33   .38   .41   .31   .23   .10
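A sketch combining the quadrature-spectrum and coherence computations described above; it assumes the cospectrum has already been computed (for instance with the earlier sketch) and that ordinary spectrum estimates of X and Y are available at the same frequencies:

```python
import numpy as np

def quadrature_and_coherence(cross_cov, co_spec, sx_spec, sy_spec, m):
    """Quadrature spectrum Q(i) from the antisymmetric part of the
    cross covariances, smoothed with weights .25, .50, .25, and the
    coherence CH(i) = (Co^2 + Q^2) / (Sx * Sy).  co_spec, sx_spec and
    sy_spec are arrays of length m + 1 over the same frequencies."""
    d = np.array([(cross_cov[L] - cross_cov[-L]) / 2.0
                  for L in range(m + 1)])          # D(L)
    raw = np.empty(m + 1)
    for i in range(m + 1):
        L = np.arange(1, m + 1)
        raw[i] = np.sum(d[L] * np.sin(np.pi * L * i / m)) / m
    q = raw.copy()
    q[1:-1] = 0.25 * raw[:-2] + 0.50 * raw[1:-1] + 0.25 * raw[2:]
    coherence = (co_spec**2 + q**2) / (sx_spec * sy_spec)
    return q, coherence
```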
Now, what are the chances of finding a coherence equal to or greater than a certain value β when the coherence is really zero? Evidently, the coherence should be more reliable as the ratio of observations to the number of lags used increases. The approximate formula for the limiting coherence at the probability level p is (after Goodman):

    \beta = \sqrt{ 1 - p^{\,1/(df - 1)} }

where df is the number of degrees of freedom, defined as in connection with spectral analysis by (2N − m/2)/m. Table 39 shows the coherences at the 1% and the 5% limits for various degrees of freedom.
TABLE 39
Limiting Values of Coherence

Degrees of freedom ......    4    10    20    40
1% limit ................  .89   .63   .46   .33
5% limit ................  .80   .53   .38   .27
This table states, for example, that the chances are 1 in 20 that a coherence of .38 or better will be found by accident when there are 20 degrees of freedom. In the case of the observations above, there were about 40 degrees of freedom, so that the low-frequency coherences were significantly different from zero. If the coherence is significantly different from zero, the total correlation must also differ from zero significantly.
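A minimal sketch of the limiting-coherence formula; the exponent 1/(df − 1) is an assumed reading, chosen because it reproduces the entries of Table 39:

```python
def limiting_coherence(p, df):
    """Limiting coherence at probability level p for df degrees of
    freedom, per the Goodman-type formula quoted above."""
    return (1.0 - p ** (1.0 / (df - 1.0))) ** 0.5

# For example, limiting_coherence(0.05, 20) gives about .38,
# matching the 5% entry of Table 39 for 20 degrees of freedom.
```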
14. The Superposed Epoch Method.
The relationship between two time series X(t) and Y(t) is often studied by a technique known as the "superposed epoch" method. This method of examining the data is especially useful if one of the series is comprised of discrete events, such as the occurrence or non-occurrence of a magnetic storm, the day of central meridional passage of a spot group on the sun's disk, etc. From such a series, a number of "key" dates are selected, representing the occasions when the event in question occurred. If the key date is denoted as D0, then the sequence of days following can be denoted as D1, D2, D3 ... Dn. Average values of the other variable are then obtained for each day of the sequence. Thus, if k key days (D0) have been selected (and therefore k sequences), the corresponding values of Y for the D0 days are tabulated and the averages determined. Likewise, for day D1 the k values of variable Y are averaged, and this procedure is repeated for D2, D3, ... Dn. Of course, in some cases we may be interested in the behavior of variable Y for m days before D0, in which case we obtain the sequence of averages corresponding to D-m, D-m+1, ... D-2, D-1. In general, we end with a series of average values of Y denoted by:

    \bar{Y}_{-m}, \bar{Y}_{-m+1}, \ldots, \bar{Y}_{-1}, \bar{Y}_0, \bar{Y}_1, \ldots, \bar{Y}_n
Each Ȳ is based on k observations, and Ȳ0 is the average value of the quantity Y for the key days D0 which were selected on the basis of the X(t) series. The Ȳ sequence is then plotted and the variation in the sequence studied in relation to some a priori hypothesis. For example, physical theory may suggest that the value of Y should decrease before D0, and increase for a period of 5 to 10 days after D0. However, a quantitative evaluation of the results may be difficult unless provision has been made to test their significance by comparison with results from another independent sample or with some random distribution.
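A sketch of the averaging step of the superposed epoch method; the handling of windows that run off the ends of the record is an assumption:

```python
import numpy as np

def superposed_epoch(y, key_indices, n_before, n_after):
    """Average the series y over windows centered on each key date.
    Returns the offsets -n_before..n_after and the mean of y at each
    offset; windows extending past either end of y are skipped."""
    y = np.asarray(y, float)
    offsets = np.arange(-n_before, n_after + 1)
    rows = [y[k - n_before: k + n_after + 1]
            for k in key_indices
            if k - n_before >= 0 and k + n_after < len(y)]
    return offsets, np.mean(rows, axis=0)
```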
An application of this method is discussed in the following numerical example. The question under consideration is whether an increase in solar activity, represented by the daily sunspot number, is followed by a change in the amount of cirrus cloud cover. Twenty key days D0 were chosen, one from each November in a 20-year period. D0 was chosen as a day on which solar activity began to rise and continued to rise for 10 days or more. The second time series used, the Y series, consisted of the total number of reports of cirrus clouds observed on a daily basis from
a network of stations in the southwestern United States. These were summarized for the twenty Novembers according to D0, D1, D2, ... etc. up to D10. The results are shown by curve A in Figure 17.
FIGURE 17
Number of Reports of Cirrus Cloud Cover on Key Date and 10 Days Following. (Two curves, A and B; ordinate: number of cirrus reports.)
If the variations represented by curve A are real and due to a solar effect, one would expect a corresponding curve determined from another sample of data to show approximately the same variations. The curve B in Figure 17 resulted when the data for
the months of January for the same 20-year period were treated in the same way. Inspection of the curves suggests very little relationship between the two. The linear correlation coefficient is 0.25. The significance of this correlation was judged by producing a series of "nonsense" correlations based on a set of key dates chosen at random. The same basic cloud data were used as when the key dates were selected on the basis of the solar data. Twenty artificial correlations between "nonsense" Y and observed X were determined and ranked in order as follows: −0.46, −0.46, −0.36, −0.24, −0.23, −0.16, −0.13, 0.01, 0.02, 0.11, 0.17, 0.20, 0.22, 0.22, 0.23, 0.23, 0.30, 0.69, 0.72, 0.72.
Since 4, or 20%, of these correlations produced at random are larger than 0.25, we conclude that no significant relationship has been demonstrated between the solar index used and cirrus cloud cover. If any such relationship exists, apparently it is quite small, and much more data would be needed to show it.
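A sketch of this randomized significance check, reusing the superposed_epoch sketch above; the number of trials, the random-date scheme, and the names are assumptions:

```python
import numpy as np

def nonsense_correlations(y, n_keys, n_before, n_after,
                          observed_curve, n_trials=20, seed=0):
    """Correlate the observed epoch curve with curves built from
    randomly chosen key dates, as a crude significance check.
    Returns the trial correlations, ranked in order."""
    rng = np.random.default_rng(seed)
    corrs = []
    for _ in range(n_trials):
        keys = rng.integers(n_before, len(y) - n_after, size=n_keys)
        _, curve = superposed_epoch(y, keys, n_before, n_after)
        corrs.append(np.corrcoef(curve, observed_curve)[0, 1])
    return sorted(corrs)
```

The fraction of random correlations exceeding the observed one then plays the role of the 20% figure quoted above.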
Suggestions for Additional Reading.

On Spectrum Analysis: Tukey, John W., "The Sampling Theory of Power Spectrum Estimates," Symposium on Applications of Autocorrelation Analysis to Physical Problems, Woods Hole, Mass., June 13-14, 1949, ONR, NAVEXOS-P-735.

On Smoothing and Filtering: Holloway, J. Leith, Jr., "Smoothing and Filtering of Time Series and Space Fields," Advances in Geophysics, Vol. 4, Academic Press, New York, 1958.
On Time Series: Kendall, M. G., "Contributions to the Study of Oscillatory Time-Series," Occasional Papers IX, National Institute of Economic and Social Research, Cambridge, England, Cambridge University Press, 1946.
Space Variation of Meteorological Variables
CHAPTER VII
1. Objective Analysis.

The variation of meteorological variables in space is, in some ways, similar to that in time. For example, there is a correlation between variables at adjacent points; there is such a thing as a space-spectrum; and, at a fixed time, the variables are uniquely determined by the space coordinates. But there are also some important differences: space is three-dimensional, although many problems are treated in two dimensions; observations are usually not spaced equally in space as they are in time; and there are no exact periodic fluctuations in space except, of course, that the variables repeat themselves after one earth's circumference.

Weather maps are two-dimensional representations of meteorological variables. The data on these maps are usually subjected to a process called "subjective" analysis, where lines (called isopleths) are drawn through points having equal values of a particular meteorological variable. Some of the purposes of these analyses are the following:

1. Meteorological data must be smoothed. This is because actual observations reflect instrumental and human error as well as motions which are on so small a scale that their exact character cannot be determined. These errors and small-scale motions must be eliminated.

2. In order to proceed with numerical prediction, or with some of the modern statistical forecasting techniques, observations must first be available at fixed grid points. Weather map analysis interpolates in two dimensions and not only allows the estimate of the variable itself at the grid points, but also of its derivatives with respect to space coordinates.
The simplest type of expressions which may be fitted are polynomials in the coordinates, x and y. Polynomials up to the third order have been used in actual practice, although quadratic equations seem to be sufficient for most purposes. For this discussion, the meteorological variable will be assumed to be the contour height h; other variables could be treated similarly.
Let it be required to find a quadratic expression in an area centered at a grid point where the best value of h is sought. The equation for the quadratic is:

    h = \bar{h} + a_1 x + a_2 x^2 + a_3 xy + a_4 y + a_5 y^2
The problem is to determine the values of the coefficients a. If there were exactly six observations of h, we could write the equation for h above for the location of each of the observations, and solve precisely for each a as well as for h̄. The quadratic equation would then fit the observations accurately. There would be no smoothing of errors or of turbulence, but interpolation for values at particular grid points would be possible. In practice, some smoothing is desirable, just as the human analyst smoothes the observed height field while drawing his isolines. This can be accomplished by making use of more than six heights. Since only six observations are required to determine the six unknowns, the problem is, in a sense, over-determined. We have several ways of getting at the unknown coefficients, and the problem is to find the best set. This is done by the method of least squares. We will require that the quantity
    Q = \sum_{obs} \left( h - \bar{h} - a_1 x - a_2 x^2 - a_3 xy - a_4 y - a_5 y^2 \right)^2

be a minimum. Hence the six partial derivatives, \partial Q / \partial \bar{h}, \partial Q / \partial a_1, and so forth, must all be zero. This yields the six "normal" equations,
which then can be solved uniquely for the six unknown coefficients. This process is, of course, rather tedious, especially when it has to be repeated for many grid points and must be done in a limited time. Hence, objective analysis is usually a process for high-speed electronic digital computers.
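A sketch of the least-squares fit at one grid point, assuming local coordinates centered on that grid point; numpy's lstsq is used here in place of forming the normal equations by hand:

```python
import numpy as np

def fit_quadratic_surface(x, y, h):
    """Least-squares fit of h = hbar + a1*x + a2*x^2 + a3*x*y
    + a4*y + a5*y^2 to observations at points (x, y); at least six
    observations are needed.  Returns the coefficients and the
    fitted value at the grid point, taken as the origin (0, 0)."""
    x, y, h = (np.asarray(v, float) for v in (x, y, h))
    A = np.column_stack([np.ones_like(x), x, x**2, x * y, y, y**2])
    coef, *_ = np.linalg.lstsq(A, h, rcond=None)
    return coef, coef[0]     # coef[0] is the smoothed height at x = y = 0
```

Using more than six observations, as the text recommends, makes the solution a smoothing fit rather than an exact interpolation.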
In general, the degree of smoothing depends on the ratio of observations used to the number of coefficients. In the case of heights, probably not much smoothing is desirable so that, perhaps, eight observations would be satisfactory to determine the six coefficients. Winds would have to be smoothed more. Subjective analysts use the fact that the wind is usually not far from geostrophic. To what extent can objective analysis simulate this procedure? The method simply consists in minimizing the following quantity:
    Q = \sum_{obs} (h - h')^2 + c \sum_{obs} (u - u'_{gs})^2 + c \sum_{obs} (v - v'_{gs})^2
where primes denote the variable as given by the quadratic. The quantity c is a weight which depends on the relative reliability of heights and winds; u and v are the west-east and south-north wind components, and the subscript gs denotes geostrophic. The procedures outlined so far are quite satisfactory for an area like the United States where observations aloft are sufficiently close together that the eight or more observations required to determine the height value at a grid point are reasonably near to that point. However, over the oceans one might have to pick tremendous areas before sufficient observations are available. The areas may be so large that the observations at the boundaries should have little influence on the value at the center of an area. Yet, the value at the center must be found and can be estimated only from distant observations. Further, special difficulties arise from the fact that observations tend to fall along lines, representing tracks of planes or ships. In order to get around this difficulty, an analysis of height can be based on forecast height in addition to observed height and wind. In a sense, this is analogous to the analyst's custom of looking at the last map before the analysis is completed. The last map's features to the west, being based on land observations, may have been much better defined than the same features after drifting out over the ocean. Thus, a forecast based on the last map puts information from well-covered areas into sparsely settled regions or oceanic areas. In that case the quantity to be minimized is:
    Q = \sum_{obs} (h - h')^2 + c \sum_{obs} (u - u'_{gs})^2 + c \sum_{obs} (v - v'_{gs})^2 + b \sum (h_f - h')^2
where the subscript f stands for forecast and b is a weight which can be determined from a study of the reliability of a forecast. The last summation might extend over all grid points in the area covered. If there are still not enough data available at grid points, climatological normals can be incorporated into the analysis procedure in the same manner as forecast values.
Of course, an objective analysis determined by solving for the coefficients of a quadratic is not the only procedure possible. Cubics have been tried, but the additional computation required (10 coefficients) does not generally result in an improved analysis. Even linear equations may be sufficient in the neighborhood of stations. An entirely different method of analysis starts from a preliminary map (probably a forecast from 12 hours before) and, as observations come in, they are used to modify the heights at the grid points according to some complicated system in which the weights decrease with increasing distance from the observations. In general, the experience with objective analysis so far has been encouraging, since numerical forecasts based on it are as good as or better than those based on subjective analysis. Further, when observations come in late and are compared with an analysis prepared earlier, they seem to favor the objective rather than the subjective analysis.
Finally, it might be argued that one unique function of the subjective analysis is to recognize and eliminate bad observations. This can be done also with the various schemes of objective analysis. For example, suppose that a quadratic surface has been fitted, and the residuals from this surface (observed minus computed) have been determined at each station where observations were made. If the residual at any one station exceeds a prearranged tolerance, that station's observation is assumed incorrect, and the analysis is repeated without that observation. It might be argued that this technique would find only the large errors, and leave small inaccuracies that only a subjective analysis can correct. The validity of this argument is weakened by the fact that quite often different analysts will not agree with
each other. Under those conditions, the objective smoothing techniques built into the methods of objective analysis are probably at least as suitable as the subjective analyst's opinion. In summary, objective analysis can do the equivalent of any type of subjective analysis in which the analyst can rationalize his procedure.
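A sketch of the rejection scheme just described, reusing fit_quadratic_surface from the earlier sketch; rejecting one worst offender per pass, rather than all offenders at once, is an assumption:

```python
import numpy as np

def analyze_with_rejection(x, y, h, tolerance):
    """Fit the quadratic surface, discard the observation whose
    residual most exceeds the tolerance, and refit, until all
    remaining residuals are within bounds.  Returns a boolean mask
    of the observations retained."""
    x, y, h = (np.asarray(v, float) for v in (x, y, h))
    keep = np.ones(len(h), dtype=bool)
    while keep.sum() > 6:
        coef, _ = fit_quadratic_surface(x[keep], y[keep], h[keep])
        A = np.column_stack([np.ones(keep.sum()), x[keep], x[keep]**2,
                             x[keep] * y[keep], y[keep], y[keep]**2])
        resid = np.abs(h[keep] - A @ coef)
        if resid.max() <= tolerance:
            break
        worst = np.flatnonzero(keep)[np.argmax(resid)]
        keep[worst] = False
    return keep
```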
2. Factor Analysis in Meteorology.

The weather of past and present consists of a tremendous amount of information. This is so vast that it is completely hopeless to base statistical forecasting procedures on all observations, even on all observations at a fixed level and at a fixed time of a definite day. Actually, it would not be necessary or desirable to make use of all observations on a given map, since observations at different points are generally correlated with each other. Factor analysis attempts to find a relatively small number of independent quantities which convey as much of the original information as possible without redundancy. Such "factors," interesting in themselves and useful in particular research studies, can then be used as predictors in some statistical forecast procedures. Ordinary harmonic analysis can be regarded as a form of factor analysis. Consider, for example, the heights at 500 mb at latitude 40°N, listed according to longitude every five degrees (72 observations in all). If a harmonic analysis is made with the circumference of this latitude as the fundamental wave length, and the first eight harmonics are computed, the 16 resulting coefficients (the mean height, and the seven A's and eight B's) are "factors" which convey essentially the same information as the original 72 heights.
Further, since none of the sines and cosines are correlated with each other (another way of stating orthogonality), the 16 coefficients do not give any superfluous information. Of course, the 16 coefficients give no information at all about very small-scale features of the pressure pattern, since the 8th harmonic (the highest computed) has a wave length of 45° of longitude. However, in many problems the smaller-scale features should not be considered anyway, or cannot be determined for lack of sufficiently close observations. In any case, the 16 harmonics would account for most of the variance of the original information.
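A sketch of extracting such harmonic "factors" from heights sampled at equal intervals around a latitude circle; the names and the conventional Fourier normalization are illustrative:

```python
import numpy as np

def harmonic_factors(heights, n_harmonics=8):
    """Mean height plus Fourier coefficients A_k, B_k of heights
    sampled at equal longitude intervals around a latitude circle
    (e.g. 72 values every 5 degrees)."""
    z = np.asarray(heights, float)
    n = len(z)
    lon = 2.0 * np.pi * np.arange(n) / n
    mean = z.mean()
    A = np.array([2.0 / n * np.sum(z * np.cos(k * lon))
                  for k in range(1, n_harmonics + 1)])
    B = np.array([2.0 / n * np.sum(z * np.sin(k * lon))
                  for k in range(1, n_harmonics + 1)])
    return mean, A, B
```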
"
f
•
•
•
.
'•, ,'
'
'
• '.
,•., .
'
'li;r,
,• •'
:·> ·.. ' .- ".d' ; ... ' '. _::' ·. ,· /' '.(~-~~:~ :. ,../:.
.-
Harmonic analysis in the manner discussed so far has been applied in one dimension only. The world-wide or hemispheric pattern of meteorological variables can be fitted by a technique called "spherical harmonic analysis." Spherical harmonics are a class of orthogonal functions; that is, the mean product of any one of these multiplied by any other, and weighted by some weighting function, will vanish. As a function of longitude, spherical harmonics vary sinusoidally. The coefficients of the sines and cosines vary in a specified manner with latitude, in such a way that the behavior at the pole can be taken care of properly. Since spherical harmonics are orthogonal, each coefficient of each harmonic can be determined independently.
Spherical harmonics have been used to represent mean world-wide pressure and temperature fields. Such representations are particularly useful in the study of atmospheric tides, in which each harmonic can be treated separately, and the resultant motions can be added later.
For the representation of contour patterns of meteorological variables over sections of a hemisphere, e.g., for the U. S. and Canada, harmonic analysis in its various forms is not particularly useful. The reason is that sinusoidal functions are, of course, periodic, and it would not be appropriate to use them in representing the height pattern at 500 mb over the United States since we definitely do not have the same average height at the southern and the northern boundaries. Actually, there exists a gradual trend, and many harmonics may be needed just to fit this trend. It would be much better to represent the observed height pattern by a series of different orthogonal functions such as the "Tchebycheff Polynomials" which have been adapted to the meteorological problem. There exists a whole set of such polynomials, which we will denote by Pn(x,y). The subscript n is any integer. Thus, we have P1, P2, P3, and so forth. Each of these polynomials describes a simple height pattern (or the pattern of any other variable). Not all of them are functions of both x and y. Some are functions of x only, some of y only, and some of both x and y. In general, the larger the value of n, the more complex the pattern. The important property of the Pn's is that they are orthogonal;
namely, the average value of the product of any two different polynomials over the area to be fitted is zero. The procedure used is to fit a sum of such polynomials, each multiplied by a weighting factor a_n, to a given height field. If we are interested only in broadly smoothed fields, we need only a few polynomials; if great detail is required, we may use as many as 20 or more polynomials to fit an area of the size of the United States. The fitting process is made relatively easy because the polynomials are orthogonal. Given a height field, with observations preferably at grid points (the observations are not made at grid points, but interpolation for grid points is made by subjective or objective analysis), we now wish to represent the height field by a series of the form:
    h(x,y) = \bar{h} + a_1 P_1(x,y) + a_2 P_2(x,y) + a_3 P_3(x,y) + \cdots
Here, the a's are to be determined from the observations. For example, let it be required to find a3 (any other a can be determined by analogy). To do this, multiply both sides of the equation by P3(x,y), and average over all the grid points. Due to the orthogonality, all but the square term on the right disappears, and we can solve for a3:
    a_3 = \frac{ \overline{h(x,y)\, P_3(x,y)} }{ \overline{P_3^2(x,y)} } \qquad \text{and, in general,} \qquad a_n = \frac{ \overline{h(x,y)\, P_n(x,y)} }{ \overline{P_n^2(x,y)} }

where the bars denote averages over all grid points.
The quantities Pn(x,y) have been tabulated as functions of x and y. The denominator is unity if the polynomials are not only orthogonal but normalized as well. The a's also have the property of giving the best fit of the height field to the polynomials in terms of least squares; i.e., no other set of coefficients leads to a representation which fits the observations better. (This statement is proved in the Appendix.) The fitting of the observations has accomplished the following: instead of perhaps 80 data points initially, the height field is now given by the a's. For example, if 20 polynomials have been fitted, the 80 original data points have been replaced by 20 new variables which convey most of the information of the original observations. It is then possible to use these 20 a's ("factors," as they are called)
instead of the original data in a method of predicting the development of the height field. In order to find out how much of the variance of the original fields is accounted for by the polynomials, it is necessary to determine the value of \sum [h(x,y) - \bar{h}]^2 / N or, in other words,
the variance of the height field.
Assuming now that the functions are orthogonal as well as normalized, we find simply:

    \overline{(h - \bar{h})^2} = a_1^2 + a_2^2 + a_3^2 + \cdots

Hence, the sum of the squares of the coefficients shows how much of the original variance has been explained by the weighted sum of the polynomials.
One would like to find orthogonal functions such that the fewest number of them explains most of the variance of the variable fitted. Are the Tchebycheff Polynomials the "best" set of functions in this sense? The answer is that they are not. If we fit a set of such polynomials to a large number of maps, we may find that the a's are not independent of each other. For example, whenever a3 is small, a6 may also be small. Or, whenever a2 is large, a1 is small. In other words, the a's are correlated with each other when considered as functions of time. This means that we could explain the same amount of variance with fewer a's if we could find functions having the property that their coefficients, when fitted to a weather map, are uncorrelated with each other. Such functions exist, and in 1956 practical computational methods for determining them were developed and applications made to a set of weather maps. These "empirical orthogonal functions," as they are called, can account for as much as approximately 90% of the variance of the observations of the pressure height field in the United States with only eight coefficients. A similar portion of the variance of the northern hemisphere map can be accounted for by 30 coefficients. The main difficulty with empirical functions is that although a given set of them may well summarize most of the variation of some meteorological variable for a given period of time, there is no guarantee that the same set of functions will do equally well for another set of maps.
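A sketch of how such empirical orthogonal functions can be computed, by an eigen-decomposition of the covariance matrix of map anomalies; this modern formulation is assumed equivalent in principle, not quoted from the 1956 procedure itself:

```python
import numpy as np

def empirical_orthogonal_functions(maps, n_factors):
    """Empirical orthogonal functions from a sample of maps
    (rows = times, columns = grid points): eigenvectors of the
    covariance matrix, ordered by the variance they explain."""
    X = maps - maps.mean(axis=0)              # anomaly maps
    cov = X.T @ X / (len(X) - 1)
    vals, vecs = np.linalg.eigh(cov)          # ascending eigenvalues
    order = np.argsort(vals)[::-1][:n_factors]
    eofs = vecs[:, order].T                   # the orthogonal patterns
    coeffs = X @ eofs.T                       # mutually uncorrelated a's
    frac = vals[order] / vals.sum()           # variance explained by each
    return eofs, coeffs, frac
```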
Given the coefficients of the orthogonal functions, one would relate them to the variables to be predicted. Now, it is not obvious that a whole map has to be fitted in order to make predictions for a particular place. As a matter of fact, it is reasonable to assume that the meteorological conditions at one place are influenced most strongly by the distribution of the data close to the station, and more weakly by conditions further away. In fact, this sort of result has been derived by the methods of dynamic meteorology. Therefore, an alternative method, discussed in connection with statistical forecasting, has been found successful for finding optimum predictors. On the other hand, the factors are certainly useful in summarizing large-scale meteorological information, and can be used in problems such as establishing relationships between solar and terrestrial variables.
3. Space Smoothing.

Meteorological quantities which vary in two dimensions, such as pressure, may be smoothed by an extension of the methods of Chapter VI if the data are evenly spaced. Meteorological observations, of course, are rarely made at equally-spaced points, but artificial data can be obtained at regularly-spaced grid points by interpolation from analyzed maps of the variable. The grid used may be latitude and longitude intersections or some arbitrarily-chosen rectangular or triangular mesh superimposed on the map. Such grids are used extensively in meteorology now, especially in extended forecasting and numerical prediction. The simplest form of space smoothing is the Fjørtoft method, which consists of taking the mean of four values at the corners of each square. The mean thus obtained is the space-smoothed value at the center of the square. The Fjørtoft smoothing is usually done on pressure or contour height data in conjunction with numerical prediction by dynamic methods. The computation of the Fjørtoft space mean is the first step in the calculation of the vorticity, and is also used to determine a geostrophic flow. According to dynamical theory, the vorticity can be advected along these smoothed isobars or height contours with the smoothed geostrophic wind. Another advantage of the Fjørtoft space smoothing is that it can be done conveniently by graphical addition on a light table. In spite of its convenience the Fjørtoft space smoothing has
some disadvantages, especially when the purpose of the space mean is not so much numerical prediction as the elimination of small-scale features. For example, the four values averaged do not eliminate completely the effects of random error in the data or very small-scale but intense systems such as hurricanes on pressure maps. If the center of a hurricane occurs near a grid point, the smoothed map will be unduly affected by this disturbance, which should have been almost completely smoothed out because of its small scale. Also, if graphical Fjørtoft smoothing is used on a hurricane map, the storm will appear as four low centers in the smoothed pattern. These difficulties can be overcome by using space smoothing functions with at least ten weights which decrease outward in proportion to the bivariate normal distribution. Having the central weight greater than those at the edges eliminates polarity reversals (lows appearing as highs and vice versa) similar to what occurs in one-dimensional smoothing with equally-weighted running means.
For example, a 21-weight space smoothing function with weights proportional to the ordinates of the bivariate normal distribution is shown in Table 40. This space smoothing function is designed for use with uniformly spaced grid point data, but it could be used on data at latitude and longitude intersections at low latitudes where the meridians are nearly parallel. In this case, lows or highs having mean separations (low to next low or high to next high) equal to seven grid intervals will be reduced to about one-half of their original amplitude (value at high center minus value at center of low) by this space smoothing function. Large-scale features will be less affected, and smaller-scale material in the space pattern will be virtually filtered out. On the right side of Table 40 are shown the weights for a smoothing function which would duplicate Fjørtoft smoothing
TABLE 40
Weighting Schemes for Space Smoothing

  Bivariate Normal Weighting Scheme       Fjørtoft Weighting Scheme

        .01  .01  .01
   .01  .05  .11  .05  .01                     0    .25     0
   .01  .11  .24  .11  .01                    .25     0    .25
   .01  .05  .11  .05  .01                     0    .25     0
        .01  .01  .01
FIGURE 18
Comparison of Different Types of Space Smoothing. Top: Original Map; Center: Fjørtoft Smoothing; Bottom: Bivariate Normal Smoothing.
while using the same grid as the bivariate normal smoothing function on the left. If the grid interval were made equal to six degrees, the smoothing would be identical with that which Fjørtoft recommends. The bivariate normal and Fjørtoft smoothing functions shown in Table 40 are designed so they will both give about the same reduction in amplitude to medium and long-wave meteorological patterns. The effect on short waves is slightly different, as is illustrated in Figure 18, where an unsmoothed 500 mb chart is shown above the same chart smoothed by the two smoothing functions in Table 40. The grid length used was six degrees of latitude and longitude in this smoothing, and the convergence of the longitudes at high latitudes was ignored. Notice that the chart smoothed by the bivariate normal smoothing function appears almost identical but very slightly smoother than that smoothed by the Fjørtoft method, for reasons elaborated above. This implies that Fjørtoft smoothing, in spite of its theoretical imperfections, is generally more useful in meteorological upper-air analysis due to the ease of its computation.
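A sketch applying either weighting scheme of Table 40 to a gridded field by direct convolution; leaving border points unsmoothed is an assumption:

```python
import numpy as np

FJORTOFT = np.array([[0.00, 0.25, 0.00],
                     [0.25, 0.00, 0.25],
                     [0.00, 0.25, 0.00]])

def smooth(field, weights):
    """Apply a space-smoothing weighting scheme (e.g. from Table 40)
    to a 2-D grid of values; points where the stencil does not fit
    within the grid are left unchanged."""
    field = np.asarray(field, float)
    out = field.copy()
    r, c = (s // 2 for s in weights.shape)
    for i in range(r, field.shape[0] - r):
        for j in range(c, field.shape[1] - c):
            window = field[i - r:i + r + 1, j - c:j + c + 1]
            out[i, j] = np.sum(window * weights)
    return out
```

The 21-weight bivariate normal scheme would be supplied as a 5 × 5 array with zeros in the four corners.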
Suggestions for Additional Reading.
On Factor Analysis: Sellers, W. D., "A Statistical Dynamic Approach to Weather Predictions," Scientific Report No. 1, Project AF 19(604)-1566, AFCRC-TN-57-471, AD 117231.
On Smoothing and Filtering: Holloway, J. Leith, Jr., "Smoothing and Filtering of Time Series and Space Fields," Advances in Geophysics, Vol. 4, Academic Press, New York, 1958.
Statistical Weather Forecasting
CHAPTER VIII
1. Introduction.

Methods of statistical weather forecasting are based on the principle of making predictive inferences about future weather from statistics of past weather. In practice, we wish to forecast a meteorological variable from observations of the same or other meteorological variables at a previous time. Usually, the mathematical form of the relationships is not known, and such relationships must be established empirically. As opposed to most methods of weather forecasting, statistical forecasting methods are to a large extent objective, being completely objective as far as the forecaster is concerned. Given values of the predictors, the best forecast of the predictand is uniquely determined by the method. However, some degree of subjectivity and judgment enters the formulation of the procedure to be used in the forecast. Often, a forecast method is based on isopleths drawn by eye, or on subjective grouping of data. It is, therefore, not quite correct to designate statistical forecasts as "objective" forecasts. They will be as nearly objective as possible without knowledge of the exact physical relationships between predictors and predictand. Also, statistical methods will make the most objective use of past weather information. Sometimes, objectivity in the design of a forecast procedure can be increased only by considerably increased complexity of the method.

Three steps are followed in the formulation of a statistical forecast method:

1. The relation of the predictand to any number of predictors must be investigated, and the predictors most advantageous for the method must be selected.
2. Convenient rules, graphs, or equations must be developed. These should be in a convenient form for use in future forecasts.
3. The reliability of the rules, graphs and equations must be tested on a new set of observations of the predictors as well as the predictand.

It has even been suggested that the available information be divided into three not necessarily equal portions and treated as follows: develop the method from portion 1, modify it on the basis of information in portion 2, and test the modified method on portion 3. The operation of this procedure in particular instances is explained later.
2. Selection of Predictors.

Whenever possible, predictors should be selected on the basis of physical reasoning. For example, rainfall probability is known to be related to vertical velocity and humidity. The vertical motion is often not available, but is known from theory to be correlated with the south-north wind component or with the temperature advection aloft. Hence, some humidity parameter and some parameter related to temperature advection would be useful for precipitation forecasts.
Physical reasoning by itself, however, is not sufficient for the choice of predictors. In the example above, should we use wet-bulb depression, relative humidity, or dew point depression? At what level? Frequently questions of this type are answered by the statistics of the predictors and the predictand observed in the past. Just how the "best" predictors are selected depends on the particular forecast method and will be discussed later.
In Chapter VII we discussed "factor analysis," where it was shown that it was possible to describe the complex pressure pattern over the United States or even the whole Northern Hemisphere by relatively few "factors." Should we use all the factors? The answer seems to be in the negative. The "factors" of pressure in the U.S.A. will depend, for example, on the pressure in San Francisco as much as on the pressure in Pittsburgh. Now, if our predictand is pressure in New York, the Pittsburgh pressure may be much more important than San Francisco pressure. The factors, which were determined without regard to any predictand, would probably not be the best predictors here. Presumably, different predictors are "best" for each predictand.
Another important question involves the number of predictors. Ideally, perhaps, the more predictors, the better the prediction. However, the development of methods based on a large number of predictors would presuppose an immense volume of statistical information. In practice, one is always limited to a definite number of observations of each predictor, denoted by N. Further, as mentioned before, many of these observations have to be saved for modification and testing. Hence, there may actually be n values of each predictor available for the initial design of the method. Now, with n observations of each predictor, we can immediately see that the number of predictors must be less than n. For if we had n predictors and only n observations of each, we could derive n linear equations which would exactly fit each observation of the developmental sample. But it would probably not fit future samples well. The reason for this is that we have fitted the developmental sample too well; we have explained with our predictors variations in the predictand caused by observational error, by small-scale or short-period fluctuations, or by variables not included in our set of n predictors. In general, the number of predictors must be much smaller than the number of usable observations of each predictor. Just how much smaller depends on the complexity of the method of prediction. If prediction proceeds by means of a linear equation, relatively many predictors can be brought in. The addition of quadratic or other nonlinear terms means that additional coefficients have to be determined from the same data. Since, effectively, the number of coefficients well determined from a given number n of observing periods is constant, this means that the number of predictors has to be reduced correspondingly.
In terms of the isopleths in a graphical regression scheme, this means that, the simpler the structure of the lines, the more predictors can be used. The effect of increasing the number of predictors or of adding greater detail or wiggles into the isopleths is the same; the developmental data are fitted, but the forecasting system tends to be less stable. Finally, the number of predictors to be used with a given number of observations, n, is limited by the fact that the n observations are usually not independent. If a given day is fair and warm with a SW wind, the next day is likely to have the same characteristics. Although we have two sets of observations,
they both give essentially the same information, and should be counted only as a single set. It is sometimes assumed that n/3, or even only n/7, observation groups are actually independent.
From all this, it is difficult to state the optimum number of predictors, given n sets of observations. This number depends on the characteristics of the variables as well as on the kind of relationship to be established. As an example, 1000 sets of observations on successive days distributed over 10 winter seasons might yield stable regression coefficients in a simple linear regression scheme with as many as eight predictors. But, with the same number of predictands fitted to predictors graphically, the usable number of predictors might drop sharply. These numbers, however, are by no means certain. Testing on independent data is likely to indicate whether too many predictors were used. In most statistical forecast techniques, the predictors are picked at the same places for every forecast. These places may be observing stations, or they may be grid points at which the variables are known through some type of meteorological analysis. Such techniques might be called fixed-point techniques.
In some respects, the so-called trajectory techniques are physically more satisfactory. In these methods, one first determines the present location of the air which will be at the forecast station at the forecast time. One then relates the difference between the predictand at the end and at the beginning of the trajectory to other variables which, for physical reasons, are expected to modify the air. For example, the difference between the reciprocal of visibility at New York (a measure of air impurity) and the reciprocal of the visibility at the beginning of the trajectory would depend on the sources of air pollution close to New York and the stability of the air. Both of these quantities could be related objectively to easily measured variables. The principal difficulty with the trajectory method is that air trajectories must first be forecast objectively. This can be done, but not without error. For example, one might assume that the pressure systems move from west to east without change of speed, and without deepening or filling, and that the wind bears a fixed relation to the isobar spacing and direction. Another possibility would be to determine the speed by some objective procedure based on steering of the surface systems by the upper-air flow.
In any case, the construction of the trajectories is rather tedious, and usually not too accurate, if the procedure is objective. Hence, trajectory methods have been used only rarely in statistical forecasting. Of course, eventually, it may be possible to get accurate forecasts of trajectories by physical or large-scale statistical methods, which then could be used to forecast local properties of the air, such as temperature, visibility, or humidity.
3. Classification of Statistical Forecasting Methods.

There is a great variety of methods of statistical forecasting. For the purpose of the discussion in this book, they will be classified as:

1. Linear regression or multiple discriminant analysis.
2. Successive graphical regression.
3. Method of stratification.
4. Residual method.
5. Mixed methods.
The mixed methods utilize procedures of two or more of the preceding techniques.
4. Multiple Linear Regression.

Multiple linear regression has already been discussed in detail in Chapter V. Its popularity since 1950 is largely due to its objectivity coupled with the availability of electronic computers. Further, experiments have indicated that the inclusion of nonlinear terms into the regression procedures has not increased the forecasting accuracy significantly. Multiple linear regression has been particularly useful in forecasting the field of a numerical variable, such as pressure or temperature, over a large area, such as the U.S.A. or the northern hemisphere. The accuracy of such predictions compares favorably with predictions based on dynamic considerations alone or those based on subjective synoptic techniques. The statistical regression techniques have the advantage over dynamic methods that no severe physical restrictions have to be made before the techniques can be applied; on the other hand, there is no guarantee that the coefficients derived will be stable.
Ideally, the statistically determined coefficients of a regression equation should shed some light on the physics of the atmospheric processes which lead to movement and development of synoptic patterns. Further, physical and statistical methods, when perfected much beyond their present state, should actually lead to the same set of equations.
The predictors used in the linear regression scheme could be the factors discussed in Chapter VII. The difficulty with the factors is that they were picked without reference to a specific predictand. The availability of electronic computers makes the following "trial-and-error" method practical and leads to "better" predictors: first, divide the observations into three groups. The first group, perhaps the largest, will be called the developmental group. The values of the predictands in this group are correlated linearly with the values of each predictor in turn. For example, if we wish to predict the temperature at New York, we find correlation coefficients between it and temperatures, pressures, and perhaps other variables at various stations 24 hours ago. The predictor with the highest correlation is picked, and a simple, single-predictor regression equation is formed. This single predictor accounts for a fraction r² of the variance of the predictand. Next, an additional predictor is picked, which accounts for more additional predictand variance than any other predictor. With these two predictors, a linear regression equation is formed.
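A sketch of this screening procedure; the variance-explained criterion and the names are illustrative, and the stopping rule (a fixed number of predictors) is an assumption:

```python
import numpy as np

def forward_selection(X, y, n_predictors):
    """Screening regression: repeatedly add the candidate predictor
    that most increases the variance explained by a least-squares
    linear fit.  X has one column per candidate predictor."""
    chosen = []
    remaining = list(range(X.shape[1]))
    def explained(cols):
        A = np.column_stack([np.ones(len(y))] + [X[:, c] for c in cols])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        return 1.0 - resid.var() / y.var()
    for _ in range(n_predictors):
        best = max(remaining, key=lambda c: explained(chosen + [c]))
        chosen.append(best)
        remaining.remove(best)
    return chosen
```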
5. Successive Graphical Regression.

In some ways, the procedure followed in this method resembles the bracketing in a tennis match. For example, if eight predictors are to be used, they are first combined in pairs. The results from this study are combined in the "semi-finals"; the results from the semi-finals are analyzed in the "final match." Suppose, for example, we have many observations of eight predictors (X1, X2, ..., X8) and one predictand, X9.
FIGURE 19
Illustration of Multiple Graphical Regression (Quarter Final). Four panels of isopleths: rainfall probability X9.12 as a function of wind speed and direction (polar coordinates); X9.34 as a function of pressure (mb) and pressure tendency (mb/3 hr); X9.56 as a function of the S-N wind component and contour curvature at 700 mb; X9.78 as a function of temperature and dew point at 700 mb.
FIGURE 20
Illustration of Multiple Graphical Regression (Semi-final above; Final below). Panels of isopleths: rainfall probability X9.1234 as a function of X9.12 and X9.34; X9.5678 as a function of X9.56 and X9.78; and X9.12345678 as a function of X9.1234 and X9.5678.
First, four graphs are constructed, with the independent variables as abscissas and ordinates. For example, one graph would contain X5 as abscissa and X6 as ordinate (see Figures 19 and 20 as examples). Note that polar coordinates are used when wind direction is a variable.
Into each of these four diagrams, values of X9 are tabulated. These are smoothed by averaging, if necessary, and isoplethed. These isopleths on the four different diagrams may be regarded as leading to four separate forecasts of X9 by graphical regression.
We shall denote by X9.34 that value of X9 which would be forecast as a function of X3 and X4 alone using one of these isopleth graphs. Similarly, X9.78 would be the value of X9 predicted from the graph having X7 as abscissa and X8 as ordinate. Next, the "semi-final" graphs are constructed. These are two diagrams, one with X9.12 as abscissa and X9.34 as ordinate, and the other with X9.56 as abscissa and X9.78 as ordinate. Note that these quantities are now known functions of the eight independent variables. Into each of the graphs the observed values of X9 are written as functions of the new independent variables, X9.12, etc. These values of X9 are again isoplethed on both diagrams. So, for example, on the first diagram, the isopleths would give forecasts based on X1, X2, X3 and X4 alone. The forecasts based on these isopleths are then denoted by X9.1234; a similar forecast from the other semi-final graph is denoted by X9.5678. The final graph would have X9.1234 as abscissa, and X9.5678 as ordinate, with observed values of X9 again written in and isoplethed. These isopleths will give the final forecast of X9. The forecaster would then use the seven diagrams in the following manner: given the eight predictors, he will go into the first (quarter-final) diagrams and read from the isopleths X9.12, X9.34, X9.56 and X9.78. With these values, he enters the two semi-final diagrams and finds from the isopleths X9.1234 and X9.5678. With these he enters the final diagram and arrives at his forecast. In actual practice this can be done quite rapidly. If the number of independent variables is odd rather than even, some of the variables can pass into an advanced round, "by default," so to speak.
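A numerical stand-in for one isopleth diagram is a table of mean predictand values over a two-dimensional grid of the two predictors; the sketch below (bin counts, defaults, and names are assumptions) can be composed in the quarter-final, semi-final, and final pattern just described:

```python
import numpy as np

def binned_surface(u, v, z, n_bins=5):
    """Mean of the predictand z in each cell of a (u, v) grid, plus
    a lookup function returning that mean for new (u, v) pairs;
    empty cells default to the overall mean of z."""
    ue = np.linspace(u.min(), u.max(), n_bins + 1)
    ve = np.linspace(v.min(), v.max(), n_bins + 1)
    iu = np.clip(np.digitize(u, ue) - 1, 0, n_bins - 1)
    iv = np.clip(np.digitize(v, ve) - 1, 0, n_bins - 1)
    table = np.full((n_bins, n_bins), z.mean())
    for a in range(n_bins):
        for b in range(n_bins):
            sel = (iu == a) & (iv == b)
            if sel.any():
                table[a, b] = z[sel].mean()
    def lookup(unew, vnew):
        a = np.clip(np.digitize(unew, ue) - 1, 0, n_bins - 1)
        b = np.clip(np.digitize(vnew, ve) - 1, 0, n_bins - 1)
        return table[a, b]
    return lookup

# "Quarter final": f12 = binned_surface(X1, X2, X9), f34, f56, f78;
# "semi final":    f1234 = binned_surface(f12(X1, X2), f34(X3, X4), X9);
# the "final" diagram combines f1234 and f5678 the same way.
```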
The technique of successive graphical regression has been applied to as many as 16 predictors. Again, with so many predictors, one runs into the danger of overfitting the developmental sample. As pointed out before, graphical regression is the subjective equivalent of nonlinear regression if the predictand is numerical, and of discriminant analysis if qualitative. It has the advantage of being fast, and of not requiring electronic computers, even when applied to many predictors. Further, no special form of the
equations has to be assumed. However, it is not so easy to gauge the improvement of the forecast when one adds additional predictors.
The selection of predictors in this method again should be based primarily on physical considerations and synoptic experience. Whenever the isopleths on a given diagram are nearly horizontal or vertical, one of the predictors has no effect on the predictand. Also, if the predictand scatters wildly on a diagram, with little systematic trend, neither of the predictors on this diagram is useful. In either of these cases, certain predictors can be eliminated from the graphical scheme.
'
I
'
Graphical regression is possible, though awkward, when the predictors are given in descriptive, nonnumerical terms. In those cases some type of stratification is preferable.
6. Stratification. The stratification method consists of selecting class intervals or groups for each of a number of predictors, and testing which combination of groups is favorable to certain properties of the predictand. Since there may be more than ten independent variables, each having five or more class intervals, the number of combinations may be very large. Unless the number of observations is unusually large, certain combinations may occur only rarely and with insufficient frequency to determine the effect of that particular combination on the dependent variable. One might think that such combinations are actually so rare that it is not particularly important to know how the predictand will react to them. However, just these rare combinations may precede important and rare phenomena, such as hurricanes.
An efficient way of handling the grouping problem is to use punched card machines. From the analysis of frequency distributions of the individual variables, suitable classes of each are selected. Each is assigned a code number. Then, for a given period, the code for the class of the predictand and the code numbers for all the predictors are punched on the same card.
Now, suppose that all the information has been placed on the cards, and we are interested in certain combinations of the predictors. Let us suppose that we may be interested in all
combinations of a pressure fall greater than 1 mb per three hours, a NE wind, and increasing cloudiness. The punched-card machine will immediately select all dates with that combination of variables. Then, if we are interested in rainfall probability, we would count the fraction of times rain occurred under those circumstances. If it occurred sixteen times out of a total of twenty-five cases, we would say that the probability of rainfall for the specified combination of predictors was 16/25 or 64%. Of course, if the combination of predictors occurred only five times during the developmental period, the probability will have little stability for any future situations. Attempts have been made to relate the reliability of a probability estimate to the number of occasions in which a certain combination of categories occurs. Unfortunately, such studies must rest on the assumption that the situations leading to the same combination of meteorological variables are independent. This is not altogether correct. Such combinations will often occur on successive days, or at least in the same season, under the same regime. Another regime may give a similar combination of predictors, yet the predictand may be different. Again, the reliability of the probabilities can be estimated only from additional information, although the qualitative statement is certainly correct that the larger the number of observations used to compute them, the more reliable are the probabilities.
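The select-and-count step is easy to state in code. Below is a hedged sketch, not the original machine procedure: the "punched cards" are Python dictionaries with hypothetical field names, and the records are randomly generated for illustration.

```python
import random

random.seed(0)
# Hypothetical punched cards: one record per observation time, each
# carrying coded classes of the predictors and the observed predictand.
cards = [
    {"pressure_fall": random.choice([0.2, 0.8, 1.5, 2.5]),  # mb per 3 hr
     "wind_dir": random.choice(["NE", "SE", "SW", "NW"]),
     "clouds": random.choice(["increasing", "steady", "decreasing"]),
     "rain": random.random() < 0.4}
    for _ in range(500)
]

def rain_probability(cards, match):
    """Select all cards with the given combination of predictor
    classes and count the fraction of them on which rain occurred."""
    chosen = [c for c in cards if match(c)]
    hits = sum(c["rain"] for c in chosen)
    return (hits / len(chosen) if chosen else None), len(chosen)

prob, n_cases = rain_probability(
    cards,
    lambda c: c["pressure_fall"] > 1.0
    and c["wind_dir"] == "NE"
    and c["clouds"] == "increasing",
)
# Had rain followed 16 of 25 selected cases, prob would be 0.64; with
# only five selected cases the estimate would have little stability.
```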
A difficulty of the stratification method arises from the initial selection of the size of the groups, which has to be fixed prior to the punching of the cards. If the number of variables is large, the number of groups should be small, perhaps no more than three or four per variable. Only then will there be sufficient cases with the same combinations of variables. Also, the results are commonly indecisive. With a given combination, it may rain a little less than half the time; or an airport might have been closed, instrument or contact, equally often for a certain set of values of predictands. It usually takes a considerable amount of experimentation to find that set of independent variables which leads to the least ambiguous results. It is of advantage to code as many variables as possible, and then find out by experimentation the combination of variables which leads to the best results. In fact, the number of variables used in the forecasting process does not have to be constant. For example, it might be found that 700-mb wind direction, and the position relative to a rain area of the upper-level contour line over the forecast station, give decisive rain probabilities with the exception of certain ranges of the variables. Under these conditions another variable, perhaps surface pressure tendency, may have to be consulted.
In general, even if a large number of variables have been punched on cards, it is presumably a mistake to investigate combinations of more than three or four variables at a time. If certain combinations of a set of variables lead to unambiguous forecasts, and a few combinations to doubtful results, additional variables should be investigated in the ambiguous ranges.
Stratification is possible also, of course, without punched cards. However, in that case it tends to be tedious, particularly with a large number of observations. Further, much meteorological information already exists on cards in the Weather Records Center in Asheville, N. C. Nevertheless, statistical forecasting techniques are often developed on the basis of data read directly from weather charts, and simple stratification by means of two-way frequency distributions, contingency tables, or other types of tabulation may be preferable when the amount of data is not prohibitive.
A method has been proposed to make use of contingency tables directly in statistical forecasting. The procedure works as follows: first, contingency tables are constructed for the predictand and each predictor separately. In other words, there will be a number of contingency tables equal to the number of predictors. Corresponding to each contingency table, a no-relation table is formed by multiplying row total by column total and dividing by grand total (as explained in Chapter III). If a predictor is useless, the contingency table and the no-relation table would be nearly identical.
The usefulness of each predictor can be found from "contingency ratios." These are simply the ratios of the numbers in the contingency table to the numbers in the corresponding boxes in the no-relation table. Wherever these ratios are much greater than unity, the relationship between predictor and predictand is much better than a chance one. Therefore, given that a predictor falls into a given class, we predict that the predictand falls into a class for which the contingency ratio is much larger than one. If all the ratios are near one, the predictor is useless. Due to
large chance fluctuations, it is difficult to say by how much contingency ratios must exceed unity before they are useful. Attempts to estimate the significance of contingency ratios are usually based on assumptions of independence of successive observations, which may not be valid.

So far, we have only considered a single predictor and a single table of contingency ratios. Suppose now we had several tables of contingency ratios, each based on the same predictand but different predictors. The tables can be useful in combination as follows: We are given the class interval of each predictor. Each predictor defines contingency ratios for the various classes of predictand. That class of predictand should be forecast which gives high contingency ratios for all predictors. In practice, for each predictor, a different predictand class may have the highest ratio. In order to pick a single predictand class, we choose that class which has the highest product of contingency ratios for all predictors. In a sense this means that we determine the predictand class most probable for the given combination of predictors.

Actually, the selection of the largest product implies that all the predictors are independent of each other. If they are not, large products may be produced by accident. For example, let one predictor of thunderstorms be dew point, another wet bulb. When the dew point is in a high class, the contingency ratio for occurrence of thunderstorms may be 2.0. The wet bulb will also be high; again, the contingency ratio for the occurrence of thunderstorms may be 2.0. The product is 4.0, implying that thunderstorms are four times as likely as chance under these conditions. Clearly, this conclusion is false; we have used essentially the same information twice, and thus overestimated the power of our method. In practice, we would never use dew point and wet bulb as separate predictors; but rarely will the actual predictors be truly independent of each other. Therefore, it is difficult to determine how many contingency tables to use with this method; again, independent samples supply the only useful criterion.
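As a sketch of the arithmetic (with invented counts, not data from the text), the contingency ratios and their product combination might be computed as follows in Python:

```python
import numpy as np

def contingency_ratios(table):
    """Divide each cell of a contingency table by the corresponding
    cell of the no-relation table (row total x column total / grand
    total). Ratios well above one mark a better-than-chance relation."""
    table = np.asarray(table, dtype=float)
    no_relation = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
    return table / no_relation

# Illustrative counts only. Rows: predictor class (low, high);
# columns: predictand class (no thunderstorm, thunderstorm).
ratios_dew_point = contingency_ratios([[45, 5], [20, 30]])
ratios_wet_bulb = contingency_ratios([[42, 8], [23, 27]])

# Given both predictors in their "high" class, multiply the ratios and
# forecast the class with the largest product. The product rule assumes
# independent predictors; dew point and wet bulb are far from it, so
# the product overstates the odds, exactly as the text warns.
product = ratios_dew_point[1] * ratios_wet_bulb[1]
forecast_class = int(product.argmax())
```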
7. Residual Method. The residual method consists of correlating errors of forecasts based on one variable with another variable. Thus, the second variable is used to correct forecasts based on the first. The process can be continued so that a third variable is used to correct a forecast based on the first two. For example, suppose there are three predictors, X1, X2, X3, and one predictand, X4, plotted as in Figure 21. First X4 is plotted as a function of X1. A line of regression or a smooth curve is fitted to this graph, designated as Graph A. The deviations of X4 from the smooth curve are next plotted against X2.
[Figure 21. Residual Method. Three panels: minimum temperature X4 as a function of the temperature the previous day; first correction to minimum temperature (°F) as a function of wind speed; second correction to minimum temperature (°F) as a function of low cloudiness.]
Again, a smooth curve is drawn, as in Graph B. Finally, the deviations of the observed X4 from the values of X4 on the smooth curve are plotted as a function of X3 on Graph C, and a smooth curve is drawn. Now, given values of X1, X2 and X3, a forecast of X4 can be made as follows: first, Graph A gives an estimate of X4; next, Graph B yields a correction to this value based on knowledge of X2; and, finally, Graph C adds another correction as a function of X3. Even though this method can be extended to many variables, it has not been used much in practice. Occasionally it is used as part of a mixed method. The residual method has one very obvious advantage: it shows immediately whether an additional variable improves a forecast based on other variables. The same method could be modified to apply to residuals from graphical regression isopleths, and to correlate these with new variables.
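A numerical stand-in for the three graphs, assuming Python with numpy, synthetic data, and low-order polynomial fits in place of hand-drawn smooth curves (all choices illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
X1 = rng.normal(50, 10, n)        # e.g., temperature the previous day
X2 = rng.uniform(0, 40, n)        # e.g., wind speed
X3 = rng.uniform(0, 1, n)         # e.g., low cloudiness
X4 = 0.7 * X1 - 0.2 * X2 + 8 * X3 + rng.normal(0, 2, n)  # predictand

def fit_curve(x, y, deg=1):
    """Stand-in for a hand-fitted smooth curve: a low-order
    polynomial returning a callable."""
    return np.poly1d(np.polyfit(x, y, deg))

g_a = fit_curve(X1, X4)           # Graph A: X4 against X1
r1 = X4 - g_a(X1)                 # residuals after Graph A
g_b = fit_curve(X2, r1)           # Graph B: first residuals against X2
r2 = r1 - g_b(X2)                 # residuals after Graph B
g_c = fit_curve(X3, r2)           # Graph C: second residuals against X3

def forecast(x1, x2, x3):
    """Base estimate from Graph A plus the two successive corrections."""
    return g_a(x1) + g_b(x2) + g_c(x3)

# The drop in residual variance at each stage shows at once whether
# the added variable improves the forecast: the method's advantage.
print(r1.var(), r2.var(), (r2 - g_c(X3)).var())
```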
8. Mixed Methods. In practice, meteorologists use whatever technique happens to be convenient for a given problem, and may use several techniques in the same problem. When the criterion is given in words only, stratification is preferred; with quantitative data, graphical or numerical regression is more suitable.

In reality, most techniques of statistical prediction used are really mixed methods. For, no matter what other procedures are applied, the observations are usually stratified first. Observations made in all seasons, or during all hours of the day, are rarely analyzed together, since the various relations differ with season and time of day. Also, the fact that different methods are used for different locations can be considered as a form of stratification.

Three of the techniques discussed here were used, for example, in a statistical forecasting system developed for the direction of motion of surface pressure centers. By graphical correlation (isopleths), an estimate of the direction of motion is made from the direction normal for that type of storm (Bowie and Weightman chart) and the direction of motion in the last six hours. Corrections to this forecast are made as a function of the direction of the isallobaric gradient. Stratification is also required; different graphs are used depending on whether the storm has recurved or not.
9. Conclusion.
Statistical methods have proven quite successful for the forecasts of elements as widely different as minimum temperature and the flying conditions at airports. Statistical methods often perform better than subjective forecasts at the same station. Even when the statistical technique does not by itself lead to a more accurate forecast than subjective techniques, forecasters find that a statistical technique frequently improves their subjective forecast. The techniques are then used as objective forecast "aids," rather than as objective forecast "methods."
One obvious advantage of a statistical method is that a new forecaster at a station can use it as well as anyone else, since he will not have to get used to the local peculiarities of the station (the methods take these into account). This points up a disadvantage of statistical methods: unless the statistical methods are derived under very general conditions (as very few are), they are valid only at the stations for which they are designed. Therefore, separate research must be done at every station for which a statistical technique is to be set up. For this reason, statistical techniques have thus far been developed only for important centers of population or aviation.
It appears now that statistical methods will keep their importance even after the electronic computer methods, which are based on the physical equations, have been perfected. For these techniques are used only to predict the general synoptic situations, not detailed weather elements at particular stations. The latter will remain the province of statistical methods for many years to come.
Whether statistical methods will retain their usefulness in forecasting the large-scale fields of flow and temperature is still a matter of conjecture. This depends, first, on the ability of the dynamic meteorologists to construct realistic dynamic models that make full and efficient use of the observations of the atmosphere and are not limited by important physical restrictions; and, second, on the ability of the statistical meteorologists to select predictors and relationships that contribute information regarding the predictand independent of the dynamic model. Many meteorologists believe that ultimately the best forecasts will result from a combination of the two methods. Ideally, however, both techniques should lead to the same relationships.
Suggestions for Additional Reading. Gringorten, Irving I., "Methods of Objective Weather Forecasting," Advances in Geophysics, Vol. 2, Academic Press, New York, 1955.
CHAPTER

VERIFICATION OF WEATHER FORECASTS*
1. Introduction.
Verification of weather forecasts has been a controversial subject for more than seventy years and has affected nearly the entire field of meteorology. This chapter will discuss some of the important reasons for the controversy and attempt to show that much of the existing confusion disappears when a careful analysis is made of the objectives of forecasting and verification. A number of verification systems that have been used will be described, but it is beyond the scope of this discussion to make a complete survey of verification practices or of the literature on the subject.
Definition of the Problem. Verification is usually understood to mean the entire process of comparing the predicted weather with the actual weather, utilizing the data so obtained to produce one or more indices or scores, and then interpreting these scores by comparing them with some standard depending upon the purpose to be served by the verification. In the discussion which follows, it will be assumed that both forecasts and observations have been expressed objectively, so that no element of judgment enters into the comparison of forecast with observation, and it will be assumed further that errors of observation are unimportant. These assumptions are discussed in a subsequent section. The selection and interpretation of an index or score which will meet the objectives of the verification study usually constitute the most difficult part of the problem; as will be shown, practical considerations often require that the score fulfill a number of requirements and furnish information on a number of different characteristics of the forecasts. Selection of arbitrary scores intended to measure a number of parameters usually leads to difficulty in the last step, interpreting the score.
*Adapted from "Verification of Weather Forecasts," by Glenn W. Brier and Roger A. Allen, Compendium of Meteorology, pp. 841-848. Published by the American Meteorological Society through the support of the Geophysics Research Division of the Air Force Cambridge Research Center.
2. Purposes of Verification. One of the earliest purposes of forecast verification was to justify the existence of the newly organized national weather services, and thus the question of verification immediately became a football to be kicked around by the supporters and opponents of the national weather services. Some claimed that the official synoptic forecasts had little or no practical value and produced verification figures intended to prove their point. Others claimed that the weather forecasts could not be subjected to rigorous statistical tests or that the tests that had been performed had no meaning. Today the value of the national weather services is so widely accepted that this particular purpose of verification is no longer of much importance. However, the effects of this controversy still show their influence by often preventing a realistic attempt to solve some of the problems of forecast verification, since some meteorologists object a priori to any scoring system that does not produce high enough figures to make the forecasts appear favorable in the eyes of the public or government appropriating bodies.
Economic Purposes of Verification. Since the sole purpose of practical forecasting is economy of life, labor, and dollars, it would seem that one of the chief purposes of verification is to determine the economic value of the forecasts which have been made. But such an evaluation, especially for the whole economy, is difficult or impossible, since the uses and users of forecasts are so diverse and ramified. Forecasts that may have considerable value for one user may have little or no utility for another user, or may actually have a negative value if their accuracy is low. The measurement of the economic value of the forecast is thus a separate project for each user, and this is usually impracticable or impossible because the individual has neither the essential economic data nor the facilities to make such an evaluation. Thus a farmer may feel that an accurate weather forecast enabled him to make some saving in seed cost and planting labor, but he may be unable to estimate the cost of his own labor, and the total saving might depend upon the value of the harvested crop, which is, of course, unknown until harvest time. Sometimes such economic evaluation of the forecasts is possible, the necessary data being found in the cost accounting and financial structure of the particular business. However, since evaluation in economic terms is usually impossible, it is most often desirable to determine the reliability of weather forecasts by measuring their accuracy.
Administrative Purposes of Verification. One of the most useful purposes of verification is to determine the relative ability of different forecasters. In a weather organization the relative ranks of forecasters may be desired for a number of reasons, such as selecting promising personnel for further training in forecasting, selecting able men for difficult or specialized forecasting assignments, selecting research personnel, or rating the efficiency of the forecaster. The verification scheme adopted may apply only to the regular official forecasts made by the individual or to a special practice-forecast program. Such a program may involve people not assigned to regular forecasting and thus provide a basis for discovering hidden talent. In the classroom a practice-forecast verification program is used to measure the progress of student forecasters and can assist the professor in vocational guidance. A student showing low forecasting ability but high mathematical ability might be encouraged to go into theoretical meteorology rather than into practical forecasting. Numerous other examples of administrative purposes of verification could be given.
Verification procedures, when applied to the official forecasts of a weather station, can make a valuable contribution to the control of the quality of the output of the station. In industry it has been found that scientific sampling procedures are necessary to maintain uniformity in the manufacturing process. If the forecasts made at a particular station over a period of time appear to fall below the standards of accuracy that have previously been attained, administrative action to investigate the reasons for the falling standards might be desirable. There is also a further advantage that the mere existence of a checking scheme, even if imperfect, tends to keep the forecasters more alert and interested in maintaining the accuracy of forecasts.
Scientific Purposes of Verification. One of the goals of scientific meteorology is to be able to predict precisely the state of the atmosphere at any time in the future. This goal is a difficult one to attain, but considerable progress has been made in the past several decades in understanding the physics of the atmosphere. Sometimes the question is raised as to whether there has been any increase in the accuracy of weather forecasts over a period of time. The question asked may be quite general, such as whether temperature or precipitation forecasts made by meteorologists in general are more accurate than they were fifty years ago. Sometimes the question may be directed at a specific group of forecasters using special methods on an experimental basis. In both these cases verification statistics can be used to provide information on the trend in forecast accuracy, although the technical difficulties in obtaining accurate statistics may be great. When some new advance in meteorological theory which bears on forecasting is proposed, it may be possible to compare the verification scores of experimental forecasts made using the supposed advance with forecasts made without the new theory. In this case verification is equivalent to testing the validity of a scientific hypothesis.
Another scientific purpose of forecast verification is the analysis of the forecast errors to determine their nature and possible cause. This is, in the opinion of some, the most important and most fruitful objective of verification, since it is more susceptible of scientific treatment than some of the other purposes. A search may be made for indicators of forecasting difficulty which will help to locate the synoptic situations under which forecasts are most likely to be wrong. It is commonly thought, for example, that situations where weather changes are taking place rapidly are more difficult to forecast than other situations, but it should be noted in passing that the literature does not reveal any precise studies to support this view. Likewise, although this is a widespread belief, it remains to be shown by verification figures that the forecasting accuracy for some element such as precipitation occurrence is lower for cases in which deepening or filling of a low is poorly forecast than for cases in which the deepening or filling is accurately forecast. Such verification as this can be used to discover the weaknesses of forecasting systems in order to decide where research emphasis is needed.
3. Fundamental Criteria to be Satisfied.

Terminology of Forecasts. One essential for satisfactory verification is objectivity, which requires that the forecasts be explicitly stated, either numerically or categorically, thus permitting no element of judgment to enter the comparison of forecast with subsequent observation. But the relation between the forecaster and the public, which depends upon the forecaster's terminology, is usually subjective in nature, since even objective terms and the actual weather are interpreted subjectively by many individuals. If every forecast is accompanied by a definition of the terms used, the forecaster sometimes objects on the basis that it hinders his freedom to express himself adequately. At this point it is easy to raise questions about the psychology of forecasters and of the public, and many arguments in forecast verification revolve around this point. Although these are practical and very important problems to those attempting to serve the public with weather information, they are in essence outside the field of forecast verification, and the goals of verification would be reached much sooner if this were more generally recognized. Only those forecasts which are expressed in objective terms can be satisfactorily verified as forecasts. The extent to which public forecasts satisfy the public is a different question which can be answered, it appears, only by public opinion polls, and the answer generally will contain little information about the agreement of forecast and actual weather.
Meteorologists themselves sometimes advocate subjective verification, particularly in the case of prognostic charts which attempt to portray the pattern of some weather element (such as pressure) rather than to specify the weather at individual points. In some cases this has led to the use of boards of experts who compare the prognostic charts with observed charts. The difficulty in objective comparison arises because of the unsatisfactory state of knowledge as to what are the important parameters of the prognostic pattern; or in other words, the forecaster is unable to specify objectively just what he is trying to represent with a prognostic chart. In effect, the forecaster is trying to verify something which cannot be observed, hence this situation does not meet the definition of verification. An extreme view denying the need for objectivity was expressed by a nineteenth-century meteorologist: "to make a proper verification of a weather forecast, and one
that shall be rigidly applicable to every case, it should be done by one thoroughly acquainted with the average conditions, and he should verify from the map on which the prediction was based and not from subsequent maps. If the verification is from a subsequent map, care should be taken to consider what abnormal conditions have occurred which could not have been anticipated."

Selection of Purposes of Verification. Before any satisfactory verification scheme is adopted it is necessary to determine the primary purpose or purposes to be served by the verification. This may appear to be obvious, but the history of the forecast verification controversy during the past half-century makes it clear that this fundamental point has been forgotten again and again. A scheme that is adequate for one purpose may be unsatisfactory for another, just as an automobile is suited for travel over the highway but is quite inefficient for flying through the air. Unfortunately, much time has been wasted and much confusion has resulted from attempts to devise a verification method or single score that will serve all purposes. The very nature of weather forecasts and verifications and the way they are used make one single or absolute standard of evaluation impossible. A set of forecasts has many different characteristics. Some of the forecasts are correct, and knowledge of the percentage which are correct may serve some purposes. Usually the weather occurrences in one part of the range will seem to be more important than those in other parts of the range, and this must be considered in specifying the purpose of the verification. Thus if the set of forecasts are forecasts of temperature in a citrus grove, the characteristic which is most important may be the percentage of forecasts of temperature below freezing which were actually followed by temperature below freezing, and the percentage of forecasts which were correct may be unimportant. The situation is in important respects analogous to measuring an object, for instance a table. The table has length, breadth, height, smoothness of the top, hardness of the top, number of legs, etc. A measure of any such characteristic is of no value unless a way of using the measurement has first been established.
In general, if each purpose of verification is exactly specified in advance, in the form of a hypothesis, not only will it be much easier to select verification scores to satisfy each purpose, but there will be no doubt as to what action is indicated by any numerical value which the verification score may assume. It will often be desirable to select the purpose in such a way that the result will either support or reject an hypothesis.

Specifications of a Scale of Goodness. Another essential criterion for satisfactory verification is that the verification scheme should influence the forecaster in no undesirable way. One of the greatest arguments raised against forecast verification is that forecasts which may be the "best" according to the accepted system of arbitrary scores may not be the most useful forecasts. Resolving this difficulty becomes a question of how to define "best." That there is no unique answer to this question can be seen by considering the following example, in which it is desired to verify some forecasts of minimum temperature. Suppose that on some particular occasion the probability of occurrence during the subsequent night of the various integral degrees of minimum temperature is actually known by a forecaster to be as follows:
[Table: probability of each integral degree of minimum temperature (°F); the distribution has its mode at 35°F, its median at 34°F, and its mean at 34.15°F.]
If the forecaster is required to state a single temperature figure in his forecast for verification purposes, what figure should he state on this particular occasion so as to maximize his score? There are a number of answers. If the verification scheme counts as hits only the occasions when the forecasts are exactly right, he should forecast 35°F, since this value has the greatest probability of occurrence. If he is being verified on the basis of mean absolute error, he should say 34°F, the median of the frequency distribution, since this will minimize the sum of the absolute deviations. If he is to be verified on the basis of the square or root-mean-square of the error, he should forecast 34.15°F, which is the mean value of the frequency distribution. Any number of such arbitrary scoring systems could be devised, and they will all influence the forecaster, at least to some extent, or in effect actually do part of the forecasting. The verification scheme may lead the forecaster to forecast something other than what he thinks will occur, for it is often easier to analyze the effect of different possible forecasts on the verification score than it is to analyze the weather situation. Some schemes are worse than
others in this respect, but it is impossible to devise a set of scores that is free of this defect. There is one possible exception to this, which will be discussed in the section on verification of probability statements.
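The minimum-temperature example can be checked numerically. The probabilities below are hypothetical (the original table is not reproduced here), but they are chosen to be consistent with the statistics quoted in the text: mode 35°F, median 34°F, mean 34.15°F.

```python
import numpy as np

temps = np.array([31, 32, 33, 34, 35, 36, 37])
probs = np.array([0.05, 0.10, 0.15, 0.25, 0.30, 0.10, 0.05])

def expected_score(forecast, loss):
    """Expected penalty of a single-valued forecast under a loss."""
    return float(np.sum(probs * loss(temps - forecast)))

candidates = np.arange(31.0, 37.001, 0.05)
best_hit = temps[probs.argmax()]                                     # 35: mode
best_mae = min(candidates, key=lambda f: expected_score(f, np.abs))  # 34: median
best_rmse = min(candidates,
                key=lambda f: expected_score(f, np.square))          # 34.15: mean
# Each scoring rule pushes the forecaster toward a different number,
# which is the sense in which the score does part of the forecasting.
```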
4. Verification Methods and Scores. It has been stated above that, in general, different verification statistics will be required for each different purpose. The discussion of various verification statistics in the following paragraphs can therefore only hint at the use which might be made of each of the scores, since so few types of scores appear ever to have been of real value.
Contingency-Table Summaries. When forecasts are made in categorical classes, a useful summary of the forecast and observed weather can be presented in the form of a contingency table. Such a table does not constitute a verification method in itself, but provides the basis from which a number of useful pertinent scores or indices can easily be obtained. An example is given in Table 41, where precipitation was forecast in three classes, heavy, moderate, or light, for thirty occasions. One of the greatest advances in forecast verification was made when the limits of the various classes were chosen in such a way that each category had an equal probability of occurrence, based on the past climatological record. Thus there is no incentive for a forecaster to choose one class in preference to another because of purely probability (climatological) considerations. This principle, with slight modification, has been used in the verification of the Extended Forecasts of the U.S. Weather Bureau, and during the last war the Army Air Forces devised a scheme in which thirty classes (called trentiles) were used, based on the frequency distribution of past records of the particular element being verified.

TABLE 41
Contingency Table for Precipitation Forecasts

                                Forecast Precipitation
Observed Precipitation     Heavy   Moderate   Light   Total
Heavy ..................     5        2         1       8
Moderate ...............     8        4         1      13
Light ..................     2        2         5       9
Total ..................    15        8         7      30

From this table a number of interesting verification statistics can be obtained. A comparison of the margins reveals whether the various categories were forecast with the same frequency as they occurred. Thus it is noted that, although heavy precipitation was forecast 15 times, it occurred only eight times, the opposite tendency being shown for moderate precipitation. This may be only a sampling difference, or it may be great enough to cause the forecaster to reject the hypothesis that he is able to distinguish the relative frequencies of the various classes. The relative frequency of occurrences of the various classes can also be compared with that expected on the basis of climatology.
Percentage Correct. From the contingency table a frequency distribution of errors can easily be obtained. In the example given, 14 forecasts are exactly right, 13 forecasts are wrong by one class, and three forecasts are wrong by two classes. A commonly used score is the per cent right, in this case 14/30 = 47 per cent. More useful information is provided by constructing two other tables, Tables 42 and 43.
TABLE 42
Per Cent of Time Each Forecast Event Occurred for a Particular Category

                         Forecast Class
Observed Class        Heavy   Moderate   Light
Heavy .............     33       25        14
Moderate ..........     53       50        14
Light .............     14       25        72
Total .............    100      100       100
The extent to which subsequent observations confirm the prediction when a certain event is forecast is shown by Table 42. The term postagreement has been suggested for this attribute of the forecasts, and the term prefigurance for the extent to which the forecasts give advance notice of the occurrence of a certain event (illustrated by Table 43). Thus it is seen that forecasts of heavy precipitation were followed by the occurrence of heavy precipitation 33 per cent
of the time while occurrences of heavy precipitation were correctly indicated in advance 62 per cent of the time.
TABLE 43
Per Cent of Time Each Observed Category was Correctly Forecast

                         Forecast Class
Observed Class        Heavy   Moderate   Light   Total
Heavy .............     62       25        13     100
Moderate ..........     61       31         8     100
Light .............     22       22        56     100
For some economic uses these two scores are probably the most important verification figures. In planning an operation which is influenced in an important way by heavy rain, for example the operation of a series of flood-control dams, it is potentially of value to know both the percentage of heavy-rain forecasts which are likely to be correct, that is, the reliahce to be placed on a heavy-rain forecast, and the percentage of heavy-rairi occurrences which are likely to be correctly forecast.
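For concreteness, the three summaries can be generated from the Table 41 counts with a few lines of Python (numpy assumed; the matrix transcribes Table 41 with observed classes as rows and forecast classes as columns):

```python
import numpy as np

# Table 41: rows = observed (heavy, moderate, light),
#           columns = forecast (heavy, moderate, light)
table = np.array([[5, 2, 1],
                  [8, 4, 1],
                  [2, 2, 5]])

percent_correct = 100 * np.trace(table) / table.sum()   # 14/30, about 47

# Table 42 (postagreement): outcomes of each forecast class,
# as per cent of the column totals.
postagreement = 100 * table / table.sum(axis=0, keepdims=True)

# Table 43 (prefigurance): how often each observed class was
# anticipated, as per cent of the row totals.
prefigurance = 100 * table / table.sum(axis=1, keepdims=True)

print(round(postagreement[0, 0]))   # ~33: heavy forecasts verified
print(round(prefigurance[0, 0]))    # ~62: heavy occurrences foretold
```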
Skill Score. The information contained in the contingency table is often combined into a single index S, called a skill score. It is defined by

S = \frac{R - E}{T - E}

where R is the number of correct forecasts, T is the total number of forecasts, and E is the number expected to be correct based on some standard such as chance, persistence, or climatology. This score has a value of unity when all forecasts are correct and has a value of zero when the number correct is equal to the expected number correct. It will be discussed at greater length in the section on comparison with control forecasts.
Average Error, Root-Mean-Square Error, Etc. When the values of the forecast element are expressed on a continuous numerical scale, as temperature usually is, it is often desirable to express a verification score in terms of an average absolute error or a root-mean-square error (RMSE). If in a series of N forecasts F_i represents the i-th forecast and O_i the corresponding observation, the average absolute error is given by

e_a = \frac{1}{N} \sum_{i=1}^{N} |F_i - O_i|

and the RMSE by

RMSE = \sqrt{\frac{\sum_{i=1}^{N} (F_i - O_i)^2}{N}}
One disadvantage of using either of these scores is that it gives the "cautious" forecaster an opportunity to hedge by playing the middle of the range. Thus a forecaster may feel that the minimum temperature during the night will be near 40°F if the cloud cover remains but will be near 30°F if the skies clear. If he is being verified on the basis of the RMSE, he might tend to choose 35°F in cases of doubt, even though he may be quite sure that the temperature will not be near 35°F.

These scores can also be used to verify prognostic pressure patterns by comparing the forecast and observed pressures or pressure gradients at a number of sample stations or points on the map.
Correlations. The correlation coefficient discussed earlier is a statistic which measures association between two sets of values, such as forecast and observed temperatures. When used as a verification score it has the advantage of being relatively free from the fault of influencing the forecaster in an undesirable way. However, it is insensitive to any bias or error in scale that a forecaster may have. Thus the Centigrade and Fahrenheit scales are perfectly correlated, but a person would not use one scale to verify forecasts made using the other scale. The correlation coefficient is also difficult to interpret for the purpose of making operating decisions and is subject to abuse of interpretation, such as attaching too much significance to the value of a coefficient obtained from a small sample. It is influenced by trends that exist in both series, so it is usually desirable to compute it by using departures from normal.
Verification of Probability Statements. There appears to be one situation where it is possible to devise a verification scheme that cannot influence the forecaster in any undesirable way. This is the case when the forecasts are expressed in terms of probability statements. Suppose that on each of N occasions an event can
occur in only one of r possible classes, and on one such occasion i, f_{i1}, f_{i2}, ..., f_{ir} represent the forecast probabilities that the event will occur in classes 1, 2, ..., r, respectively. If the r classes are chosen to be mutually exclusive and exhaustive,

\sum_{j=1}^{r} f_{ij} = 1 \quad \text{for each and every } i = 1, 2, \ldots, N.

A score P can be defined by

P = \frac{1}{N} \sum_{j=1}^{r} \sum_{i=1}^{N} (f_{ij} - E_{ij})^2

where E_{ij} takes the value 1 or 0 according to whether the event occurred in class j or not. For perfect forecasting this score will have a value of zero and for the worst possible forecasting a value of 2. Perfect forecasting is defined as "correctly forecasting the event to occur with a probability of unity or 100 per cent confidence." The worst possible forecast is defined as "stating a probability of unity or certainty for an event that did not materialize" (and also, of course, "stating a probability of zero for the event that did materialize"). It can be shown that if p_1, p_2, p_3, ..., p_r are the respective climatological probabilities of classes 1, 2, 3, ..., r, then in the absence of any forecasting skill the best values to choose for f_{ij} will be p_j for all N occasions. This will minimize the score P for constant values of f_{1j} = f_{2j} = \ldots = f_{Nj}, and the expected value of the score will be

E(P) = 1 - \sum_{j=1}^{r} p_j^2
To illustrate the procedure consider Table 44, which shows

TABLE 44
Example of Forecasts Stated in Terms of Probability

                    Rain                      No Rain
Occasion   Forecast      Observed    Forecast      Observed
           probability               probability
    1         0.7            0          0.3            1
    2         0.9            1          0.1            0
    3         0.8            1          0.2            0
    4         0.4            1          0.6            0
    5         0.2            0          0.8            1
    6         0.0            0          1.0            1
    7         0.0            0          1.0            1
    8         0.0            0          1.0            1
    9         0.0            0          1.0            1
   10         0.1            0          0.9            1
ten actual forecasts of "rain" or "no rain" for which a probability or confidence statement was made on each occasion. In Table 44, unity is placed in the "rain" column if the event is "rain." If the event is "no rain," unity is placed in the "no rain" column. The score P is therefore

P = \frac{1}{10} \{0.7^2 + 0.7^2 + 0.1^2 + 0.1^2 + 0.2^2 + 0.2^2 + 0.6^2 + 0.6^2 + 0.2^2 + 0.2^2 + 0.1^2 + 0.1^2\} = 0.19

Since rain occurred 3/10 of the time, the minimum (or best) score that could have been obtained by making the same forecast every day would be
P_{min} = 1 - (0.3^2 + 0.7^2) = 0.42

Thus the forecaster is encouraged to minimize his score by getting the forecasts exactly right and stating a probability of unity. If he cannot forecast perfectly, he is encouraged to state unbiased estimates of the probability of each possible event. On the other hand, with complete absence of knowledge or forecasting skill, he is encouraged to predict the climatological probabilities and not just forecast the most frequent class. When a series of forecasts has been made using probability statements, a study can also be made to determine whether the forecast probabilities are related to the relative frequency at which the events occur. An example of this type of comparison is shown in Table 45 (based on a more extended series of such forecasts), which suggests a relationship between the forecast and the observed probabilities, but which indicates that the forecaster should modify or adjust his scale to improve the forecasts.
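A short check of this arithmetic in Python (numpy assumed), using the Table 44 forecasts:

```python
import numpy as np

# Table 44: forecast rain probabilities and outcomes on ten occasions
f_rain = np.array([0.7, 0.9, 0.8, 0.4, 0.2, 0.0, 0.0, 0.0, 0.0, 0.1])
o_rain = np.array([0,   1,   1,   1,   0,   0,   0,   0,   0,   0])

# The two classes are exclusive and exhaustive, so the no-rain
# probabilities and outcomes are the complements.
f = np.stack([f_rain, 1.0 - f_rain])
o = np.stack([o_rain, 1 - o_rain])

P = np.sum((f - o) ** 2) / f_rain.size   # 0.19, as computed above

# Best constant forecast: the climatological probability of rain, 0.3
p = np.array([0.3, 0.7])
P_min = 1.0 - np.sum(p ** 2)             # 0.42
```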
TABLE 45
Verification of a Series of 85 Forecasts Expressed in Terms of the Probability of Rain

Forecast probability of rain          0.00-0.19   0.20-0.39   0.40-0.59   0.60-0.79   0.80-1.00
Observed proportion of rain cases

Arbitrary Scores for Special Purposes. An infinite number of scores can be devised for verification purposes, but it is beyond
the scope of this text to discuss many of them in detail. In particular cases, such as verifying the forecasts of cloud heights or visibilities, it may seem desirable to express errors in terms of per cent and compute a score such as

e_p = \frac{1}{N} \sum_{i=1}^{N} \frac{|F_i - O_i|}{O_i}

where F_i is the forecast value and O_i is the observed value. In other instances, verification of selected cases in which either forecast or observation was of particular importance may be required, or it may be desired to transform the forecasts into forecasts of change before verifying them. It is common, for example, to want to verify only cases when low visibility either is forecast or occurs, since high values are much less critical for airplane operations. It may be desired to verify only the cases when the ceiling changes across the critical value, either in the forecast or in the occurrence. With more intensive study of the decisions which are made on the basis of such forecasts and of the consequences of right or wrong decisions, it would be possible to devise verification scores based on the relative values of different forecasts, but as it is, practically all such specialized verification systems are arbitrary. This is not intended to discourage the use of such scores, but it should be emphasized that conclusions as to the relative values of two sets of forecasts will always be uncertain so long as the value of any individual forecast cannot be estimated. Innumerable other scores to be used for some special purpose could be mentioned.
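As a sketch, the per cent score e_p for, say, ceiling forecasts could be computed as follows (Python; the sample values are invented):

```python
import numpy as np

def percent_error_score(forecast, observed):
    """e_p: mean absolute error expressed as a fraction of the
    observed value, as suggested for cloud heights or visibilities."""
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.mean(np.abs(forecast - observed) / observed))

# e.g. three ceiling forecasts (ft) against observations
e_p = percent_error_score([200, 400, 800], [100, 400, 1000])
# (1.0 + 0.0 + 0.2) / 3 = 0.4
```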
5. Control Forecasts for Comparison. Chance or Random Forecasts, Persistence, Climatology. Up to this point most of the discussion has been concerned with methods of comparing the forecast weather with the actual weather and obtaining some indices or scores. The final phase of the verification procedure is the interpretation of these scores, which is usually performed by comparing them with some standard. It has long been recognized that a figure such as the percentage of correct forecasts is often meaningless, since the same figure might be obtained by chance or by some scheme of forecasting requiring no meteorological knowledge. It has been suggested that forecast scores be compared with scores obtained using various "blind" forecasts such as random
forecasts, persistence forecasts, or climatological forecasts. Many arguments have arisen over the proper choice of the blind forecasts to use as a standard. There is no correct answer, of course, since the choice depends not only upon the use that is made of the forecasts but upon the purpose of the verification.
To illustrate these comparisons, let us return now to the skill score defined in the earlier section. The expected number of forecasts correct E_R, based on chance or "no-relation" forecasts for the data in Table 41, is computed by the following formula:

E_R = \frac{\sum_i R_i C_i}{T}

where R_i is the total of the i-th row, C_i is the total of the i-th column, and T is the grand total. In the example given, this is
E_R = \frac{8 \times 15 + 13 \times 8 + 9 \times 7}{30} = 9.6
so the skill score based upon the margins of the contingency table is

S_R = \frac{14 - 9.6}{30 - 9.6} = 0.22
If, in this same example, persistence forecasts had been made by predicting a continuation of the preceding precipitation, the expected number right would have been 12, so the skill score would be

S_P = \frac{14 - 12}{30 - 12} = 0.11
If the climatological frequencies of heavy, moderate, and light precipitation were P_H, P_M, and P_L, respectively, the expected number right E_C is given by the formula

E_C = T (P_H^2 + P_M^2 + P_L^2)

and

E_C = \frac{1}{3}(30) = 10

in the example, since the three classes were defined as equally likely. Generally, any score which may be devised can be computed for a set of control forecasts as well as for the actual forecasts, and a comparison can be made.
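All three control comparisons for Table 41 can be reproduced in a few lines of Python (numpy assumed; the persistence expectation of 12 is taken from the text rather than recomputed, since the day-to-day sequence of the sample is not given):

```python
import numpy as np

table = np.array([[5, 2, 1],     # Table 41: rows observed,
                  [8, 4, 1],     # columns forecast
                  [2, 2, 5]])
T = table.sum()                  # 30 forecasts
R = np.trace(table)              # 14 correct

def skill(correct, total, expected):
    return (correct - expected) / (total - expected)

E_R = (table.sum(axis=1) * table.sum(axis=0)).sum() / T  # chance: ~9.6
E_P = 12                          # persistence, as given in the text
E_C = T * 3 * (1 / 3) ** 2        # climatology, equal thirds: 10

print(skill(R, T, E_R))           # ~0.22
print(skill(R, T, E_P))           # ~0.11
print(skill(R, T, E_C))           # 0.20
```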
6. Significance of Forecasts. As we have seen, the no-relation or chance contingency table is useful for the computation of a skill score. It also lends itself to the computation of chi-square, which then can be used to test whether the actual score is significantly different from the no-relation score, provided that the various assumptions underlying the chi-square test are satisfied. In particular, it is assumed that the entries in the contingency table are independent and can be regarded as a random and representative sample of a population. Further, the number of observations in each box of the no-relation table should exceed four. For these reasons, it is presumably doubtful whether the test should be applied to the data summarized in Table 41. Should the assumptions be obeyed, chi-square could be computed, as usual, from the equation

\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}

where now E_i stands for the numbers in the no-relation table, and O_i represents the observations. As before, the number of degrees of freedom is (m - 1)^2, where m is the number of rows or columns in the tables. If we had applied the procedure to the data in Table 41, for example, we would have had four degrees of freedom. Chi-square would have been 7.9. The 5% limit (Table 11) with this number of df comes out as 9.5. Therefore, had the assumptions been obeyed, we would conclude that the apparent skill could have been obtained by chance. We would need additional forecasts, first, to make the chi-square test more adequate, and second, to see whether the skill indicated by the skill score is significant.
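Under the stated assumptions, the computation for Table 41 looks like this in Python (scipy is used only for the 5% critical value, and its presence is an assumption of this sketch):

```python
import numpy as np
from scipy.stats import chi2

table = np.array([[5, 2, 1],
                  [8, 4, 1],
                  [2, 2, 5]], dtype=float)

# No-relation (expected) table from the margins
expected = np.outer(table.sum(axis=1), table.sum(axis=0)) / table.sum()
chi_square = np.sum((table - expected) ** 2 / expected)   # ~7.9

df = (table.shape[0] - 1) ** 2                            # (m - 1)^2 = 4
critical_5pct = chi2.ppf(0.95, df)                        # ~9.5
# chi_square < critical_5pct: the apparent skill could have arisen
# by chance -- though several expected cells here are below four,
# so the test's assumptions are doubtful, as the text notes.
```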
It might be worth noting that the chi-square test only shows whether two contingency tables are significantly different from each other. It does not show which is "better." We can make this decision only on the basis of the skill score. If the skill score is negative, and chi-square is significant, the forecast is significantly worse than chance.

7. Pitfalls of Verification. There are many pitfalls of verification to entrap both the novice and the expert. One of the greatest dangers lies in attempts to compare the relative abilities of forecasters on the
basis of forecasts which are not comparable because of differences of location, season, time of day, or length of forecast period. The reason for this is that the degree of forecasting difficulty varies so much from one circumstance to another. A large sample of forecasts is needed to assure that the average weather has been approximately the same in the two sets of forecasts being compared. Even if the forecasts being compared are for the same event, there may be other factors to be considered, such as whether equal map facilities were available to each forecaster. Furthermore, as pointed out recently, there is the problem of finding valid significance tests for comparing forecasters or the determination of confidence limits of a forecast score.
Interpretation is made even more difficult when scores on two or more forecast weather elements are arbitrarily combined to form a single index to be used for comparison purposes. It may be true, for example, that there is some particular use of a forecast for which an error of 2° in temperature is equivalent to an error of 0.25 inches in a precipitation forecast, but it is certain that this (or any other) arbitrary weighting is not true in general. Those who must make a selection between forecasters based on verification scores may demand a single index to represent the verification of all forecasts made by each man, but since a logical combination of scores is in general impossible, it would be far better to consider each score separately. As pointed out in the section on Selection of Purposes of Verification, if it is decided ahead of time just what measures of accuracy are needed and what will be done with each, the need for combining scores could be largely eliminated.
Another practice that may lead to difficulty is the use of overlapping classes for verification purposes. Thus rules may be set up stating that a ceiling forecast of 200 ft will verify if the observation is between 100 and 400 ft, and that a forecast of 400 ft will verify if the observation is between 200 and 800 ft. Although such rules may seem reasonable, they tend to encourage forecasters to hedge by choosing the classes having the widest range. Also, the choice of rather wide tolerances in verification limits tends to give rather high and uniform percentage scores to all forecasters, thus failing to discriminate between the better and poorer forecasters. Since the use of a contingency table for comparing forecasts and observations presents the data in one of the most
useful forms, the practice of using overlapping classes is to be discouraged because such verification cannot be displayed in this way.

Suggestions for Additional Reading. Bleeker, W., "The Verification of Weather Forecasts," Meded. Ned. Meteor. Inst., (B) Deel 1, Nr. 2, 1946. Gringorten, I. I., "The Verification and Scoring of Weather Forecasts," Journal of the American Statistical Association, Vol. 46, 1951. Gringorten, I. I., "Tests of Significance in a Verification Program," Journal of Meteorology, Vol. 12, 1955.
o£
Holloway, j. L, Jr., and Woodbury, M. A., "Application I n"rormation Theory and Discriminant Function Analysis to Weather Fo1·ecasting and Forecast Verification," Technicai Report No. 1, Meteorological Statistics Project. Institute for Cooperative Research, University of Pennsylvania, February 1955.
Muller, R. H., "Verification of Short-Range Weather Forecasts (A Survey or the Literature)," Bull. Amer. Meteor. Soc., Vol. 25, 1944. Thompson, J. C. and Brier, G. W., "The Economic Utility of Weather Forecasts," Monthly Weather Review, Vol. 83, 1955. U. S. Army Air Forces, "Critique of Verification of Weathet Forecasts," Air Wea. Serv. Tech. Rep. No. 105-6, 1944.
Appendix

CHAPTER I

Proof that $s^2 = \overline{X^2} - (\bar{X})^2$.

$s^2$ is defined by:

$$s^2 = \frac{\sum (X - \bar{X})^2}{N}$$

Expanding, we get:

$$s^2 = \frac{\sum X^2}{N} - 2\bar{X}\,\frac{\sum X}{N} + (\bar{X})^2$$

Since $\dfrac{\sum X}{N} = \bar{X}$,

$$s^2 = \overline{X^2} - (\bar{X})^2$$
Proof that the origin of $X$ is immaterial in calculation of $s^2$.

Let $\xi$ be $X - A$; that is, $\xi$ is the variate counted from origin $A$. Then $\bar{\xi} = \bar{X} - A$, so that

$$\xi - \bar{\xi} = (X - A) - (\bar{X} - A) = X - \bar{X}$$

Hence

$$s_\xi^2 = \frac{\sum (\xi - \bar{\xi})^2}{N} = \frac{\sum (X - \bar{X})^2}{N} = s^2$$

and the variance is the same no matter what origin is used.
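Readers with access to electronic computers can check both results numerically; the following Python sketch (our illustration, with arbitrary sample values) computes $s^2$ from the definition, from the short-cut formula, and from the variate shifted to an arbitrary origin $A = 10$:

```python
import numpy as np

X = np.array([3.0, 7.0, 8.0, 12.0, 15.0])
N = len(X)

s2_def = ((X - X.mean()) ** 2).sum() / N      # definition of s^2
s2_short = (X ** 2).mean() - X.mean() ** 2    # mean square minus squared mean
xi = X - 10.0                                 # variate counted from origin A = 10
s2_shift = (xi ** 2).mean() - xi.mean() ** 2  # same computation from the new origin

print(s2_def, s2_short, s2_shift)             # all three agree
```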
CHAPTER III

Proof that $\sigma_{\bar{X}} = \dfrac{\sigma}{\sqrt{N}}$.

Consider a population of X-variates. Select random groups of two and add each two values of $X$. Consider the standard deviations of these sums. Let $x_1 = X_1 - \bar{X}_1$, $x_2 = X_2 - \bar{X}_2$. Then

$$\sigma^2_{X_1 + X_2} = \sigma_1^2 + \sigma_2^2 + 2\,\overline{x_1 x_2}$$

If drawings of successive numbers were independent, the last term is zero (see correlation, Chapter IV). Thus,

$$\sigma^2_{X_1 + X_2} = \sigma_1^2 + \sigma_2^2$$

By induction:

$$\sigma^2_{\Sigma X} = \Sigma\,\sigma^2$$

Since all X's have the same $\sigma$, $\sigma^2_{\Sigma X} = N\sigma^2$ and $\sigma_{\Sigma X} = \sigma\sqrt{N}$. Hence

$$\sigma_{\bar{X}} = \frac{\sigma_{\Sigma X}}{N} = \frac{\sigma}{\sqrt{N}}$$

since the standard deviation of a variate times a constant equals the standard deviation times that constant.
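A sampling experiment (ours; the population and sample size are arbitrary) illustrates the result: means of samples of size $N$ scatter with a standard deviation close to $\sigma/\sqrt{N}$.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, N, trials = 2.0, 25, 100_000

# Draw many independent samples of size N and record each sample mean.
samples = rng.normal(0.0, sigma, size=(trials, N))
means = samples.mean(axis=1)

print(means.std())         # observed spread of the sample means
print(sigma / np.sqrt(N))  # theoretical value sigma / sqrt(N) = 0.4
```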
Proof that the sample standard deviation $s$ is related to the best unbiased estimate $\hat{\sigma}^2$ of the variance of the population by

$$s^2 = \frac{N-1}{N}\,\hat{\sigma}^2$$

Let $X_1, X_2, \ldots, X_N$ be a random sample of $N$ independent values selected from a population with mean zero and standard deviation $\sigma$. Let the sample mean be $\bar{X}$. An unbiased estimate of $\sigma^2$ would be given by

$$\hat{\sigma}^2 = \frac{\sum X_i^2}{N}$$

but in actual practice we compute the statistic $s^2$ using

$$s^2 = \frac{\sum (X - \bar{X})^2}{N}$$

Thus, $s^2$ can be written as

$$s^2 = \frac{\sum X_i^2}{N} - \frac{(X_1 + X_2 + \cdots + X_N)^2}{N^2} = \frac{\sum X_i^2}{N} - \frac{\sum X_i^2}{N^2} - \frac{\sum_{i \neq j} X_i X_j}{N^2}$$

where the last sum is to be carried out over all combinations of $X_i$ and $X_j$ for which $i$ is not equal to $j$. On the average, the final term involving products of $X_i$ with $X_j$ vanishes, since the X's are independent. Since, as above, $\overline{\left(\sum X_i^2 / N\right)} = \sigma^2$, we have

$$\overline{s^2} = \sigma^2 - \frac{\sigma^2}{N} = \frac{N-1}{N}\,\sigma^2$$
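The factor $(N-1)/N$ can likewise be exhibited by simulation (our sketch; parameters arbitrary): averaged over many samples, $s^2$ falls short of $\sigma^2$ by exactly this factor.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, N, trials = 3.0, 5, 200_000

X = rng.normal(0.0, sigma, size=(trials, N))
s2 = X.var(axis=1)  # divides by N, i.e. the statistic s^2 in the text

print(s2.mean())                 # average of s^2 over many samples
print((N - 1) / N * sigma ** 2)  # theoretical (N-1)/N * sigma^2 = 7.2
```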
CHAPTER IV

Proof that the least square estimate is the maximum likelihood estimate with normally distributed predictands and uniform scatter.

Let $X_2' = \alpha + \beta X_1$ be the line which is the most likely to have produced a given set of observations $X_1$ and $X_2$. The probability of getting a particular value of ${}_1X_2$ for a given $X_1$ is

$$P_1 = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{({}_1X_2 - {}_1X_2')^2}{2\sigma^2}}$$

where ${}_1X_2'$ is the estimated value of ${}_1X_2$. Similarly, the probability of getting a point ${}_2X_2$ is

$$P_2 = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{({}_2X_2 - {}_2X_2')^2}{2\sigma^2}}$$

If all points are picked independently, the probability of getting $N$ points is the product of the individual probabilities:

$$P({}_1X_2,\ {}_2X_2,\ {}_3X_2, \ldots) = \frac{1}{(\sqrt{2\pi}\,\sigma)^N}\, e^{-\frac{\sum ({}_iX_2 - {}_iX_2')^2}{2\sigma^2}}$$

Maximizing this probability is equivalent to minimizing $\sum ({}_iX_2 - {}_iX_2')^2$. Hence, the method of least squares produces the line giving the highest probability of obtaining the observed points $X_1$ and $X_2$.
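The equivalence can be illustrated numerically (our sketch, with synthetic data): over a grid of candidate lines, the Gaussian likelihood peaks exactly where the sum of squared deviations is smallest.

```python
import numpy as np

rng = np.random.default_rng(2)
x = np.linspace(0, 10, 40)
y = 1.5 + 0.8 * x + rng.normal(0, 1.0, x.size)  # synthetic observations

# Evaluate the sum of squared errors over a grid of candidate (alpha, beta).
alphas = np.linspace(0, 3, 61)
betas = np.linspace(0, 2, 61)
A, B = np.meshgrid(alphas, betas)
sse = ((y - (A[..., None] + B[..., None] * x)) ** 2).sum(axis=-1)

# With fixed sigma, log-likelihood = const - sse / (2 sigma^2), so the
# argmax of the likelihood is the argmin of the sum of squared errors.
i, j = np.unravel_index(sse.argmin(), sse.shape)
print("least-squares (and maximum-likelihood) line:",
      f"alpha = {A[i, j]:.2f}, beta = {B[i, j]:.2f}")
```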
Proof of relations between regression equation, scatter, and correlation.

Let the scatter $s_{2.1}$ be defined by:

$$s_{2.1}^2 = \frac{\sum (x_2 - x_2')^2}{N}$$

where all variates are measured from their means. Now $x_2'$ is given by $b_{2.1} x_1$, and $b_{2.1} = r\,\dfrac{s_2}{s_1}$. Hence,

$$s_{2.1}^2 = \overline{\left(x_2 - r\,\frac{s_2}{s_1}\,x_1\right)^2}$$

Expanding, we find

$$s_{2.1}^2 = s_2^2 - 2r\,\frac{s_2}{s_1}\,\overline{x_1 x_2} + r^2\,\frac{s_2^2}{s_1^2}\,s_1^2$$

Since $\overline{x_1 x_2} = r\,s_1 s_2$,

$$s_{2.1}^2 = s_2^2\,(1 - r^2)$$
CHAPTER V

Derivation of normal equations for $k$ predictors $X_1, X_2, X_3, \ldots, X_k$ and predictand $X_n$.

Let all variates be measured from their respective means. Then the prediction equation is:

$$x_n' = \sum_{i=1}^{k} b_i x_i$$

The condition of least squares is:

$$\sum_{\text{obs}} \left[\left(\sum_{i=1}^{k} b_i x_i\right) - x_n\right]^2 = \text{minimum}$$

The derivative of this quantity with respect to any coefficient must vanish. Let us differentiate with respect to $b_j$, where $b_j$ may be any $b$ (the coefficient of $b_j$ is $x_j$). Using the chain rule of differentiation, we get

$$2 \sum_{\text{obs}} \left[\sum_{i=1}^{k} (b_i x_i x_j) - x_n x_j\right] = 0$$

Exchanging summations in the first term, and remembering that $b_i$ is the same for all observations:

$$\sum_{i=1}^{k} b_i \sum_{\text{obs}} x_i x_j = \sum_{\text{obs}} x_n x_j$$

Dividing all observations by $N$, we find

$$\sum_{i=1}^{k} b_i\, \overline{x_i x_j} = \overline{x_n x_j}$$

which represents $k$ normal equations, each of which contains a different integer for $j$. This equation stands for the several equations in the text.
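The normal equations are well suited to machine computation; the sketch below (ours, with synthetic data) forms the matrix of covariances $\overline{x_i x_j}$, solves for the $b_i$, and checks against a library least-squares routine.

```python
import numpy as np

rng = np.random.default_rng(4)
N, k = 300, 3
X = rng.normal(size=(N, k))                   # k predictors
xn = X @ np.array([0.5, -1.2, 2.0]) + rng.normal(size=N)

X = X - X.mean(axis=0)                        # measure all variates
xn = xn - xn.mean()                           # from their means

M = (X.T @ X) / N          # matrix of covariances, mean of x_i x_j
v = (X.T @ xn) / N         # right-hand sides, mean of x_n x_j
b = np.linalg.solve(M, v)  # solve the k normal equations

print(b)
print(np.linalg.lstsq(X, xn, rcond=None)[0])  # same coefficients
```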
Derivation of equation for partial linear correlation coefficient.

Let all variates be counted from their means. Required: $r_{32.1}$. Let

$$d_3 = x_3 - b_{3.1}\, x_1, \qquad d_2 = x_2 - b_{2.1}\, x_1$$

$d_3$ and $d_2$ have means of zero because the means of $x_1$, $x_2$, and $x_3$ are zero. The partial correlation coefficient, $r_{32.1}$, is the linear correlation coefficient between $d_3$ and $d_2$. From the product-moment formula:

$$r_{32.1} = \frac{\overline{d_3 d_2}}{s_{d_3} s_{d_2}} = \frac{\overline{(x_3 - b_{3.1} x_1)(x_2 - b_{2.1} x_1)}}{\sqrt{\overline{(x_3 - b_{3.1} x_1)^2}\; \overline{(x_2 - b_{2.1} x_1)^2}}}$$

Now, we may use:

$$b_{3.1} = r_{31}\,\frac{s_3}{s_1}, \qquad b_{2.1} = r_{21}\,\frac{s_2}{s_1}, \qquad \overline{x_1 x_2} = r_{12}\, s_1 s_2, \text{ etc.}$$

Substituting:

$$r_{32.1} = \frac{r_{32} - r_{31} r_{12}}{\sqrt{(1 - r_{31}^2)(1 - r_{12}^2)}}$$
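The closed form can be checked against the defining residual correlation (our sketch, synthetic data):

```python
import numpy as np

rng = np.random.default_rng(5)
x1 = rng.normal(size=1000)
x2 = 0.7 * x1 + rng.normal(size=1000)
x3 = 0.4 * x1 + 0.5 * x2 + rng.normal(size=1000)

r = np.corrcoef([x1, x2, x3])
r12, r13, r23 = r[0, 1], r[0, 2], r[1, 2]

# Closed form: partial correlation of x3 and x2 with x1 held fixed.
r32_1 = (r23 - r13 * r12) / np.sqrt((1 - r13 ** 2) * (1 - r12 ** 2))

# Direct route: correlate the residuals d3 and d2 after regressing on x1.
d3 = (x3 - x3.mean()) - np.polyfit(x1, x3, 1)[0] * (x1 - x1.mean())
d2 = (x2 - x2.mean()) - np.polyfit(x1, x2, 1)[0] * (x1 - x1.mean())
print(r32_1, np.corrcoef(d3, d2)[0, 1])   # the two agree
```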
Derivation for equation of multiple correlation coefficient.

$R_{3.12}^2$ is defined by:

$$R_{3.12}^2 = 1 - \frac{s_{3.12}^2}{s_3^2}$$

Let all variables be measured from their means. Then,

$$s_{3.12}^2 = \overline{(x_3 - b_{31.2}\, x_1 - b_{32.1}\, x_2)^2}$$

Substituting and expanding,

$$R_{3.12}^2 = \frac{-b_{31.2}^2 s_1^2 - b_{32.1}^2 s_2^2 - 2 b_{31.2} b_{32.1}\, \overline{x_1 x_2} + 2 b_{31.2}\, \overline{x_1 x_3} + 2 b_{32.1}\, \overline{x_3 x_2}}{s_3^2}$$

Now,

$$b_{31.2} = \frac{s_3}{s_1}\cdot\frac{r_{31} - r_{32} r_{12}}{1 - r_{12}^2}, \qquad b_{32.1} = \frac{s_3}{s_2}\cdot\frac{r_{32} - r_{31} r_{12}}{1 - r_{12}^2}$$

and $\overline{x_1 x_3} = r_{13}\, s_1 s_3$, $\overline{x_2 x_3} = r_{23}\, s_2 s_3$, $\overline{x_1 x_2} = r_{12}\, s_1 s_2$. Substituting,

$$R_{3.12}^2 = \frac{r_{31}^2 + r_{32}^2 - 2\, r_{31} r_{32} r_{12}}{1 - r_{12}^2}$$
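Again, the result can be verified against a direct two-predictor regression (our sketch, synthetic data):

```python
import numpy as np

rng = np.random.default_rng(6)
x1 = rng.normal(size=2000)
x2 = 0.5 * x1 + rng.normal(size=2000)
x3 = 0.8 * x1 - 0.6 * x2 + rng.normal(size=2000)

r = np.corrcoef([x1, x2, x3])
r12, r13, r23 = r[0, 1], r[0, 2], r[1, 2]

# Closed form for the multiple correlation of x3 on x1 and x2.
R2 = (r13 ** 2 + r23 ** 2 - 2 * r13 * r23 * r12) / (1 - r12 ** 2)

# Direct computation: 1 - residual variance / total variance.
A = np.column_stack([x1 - x1.mean(), x2 - x2.mean()])
b = np.linalg.lstsq(A, x3 - x3.mean(), rcond=None)[0]
resid = (x3 - x3.mean()) - A @ b
print(R2, 1 - resid.var() / x3.var())   # the two agree
```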
CHAPTER VI

Derivations of the expressions for the coefficients in harmonic analyses.

Sines and cosines have the following property: given $N$ equally spaced observations at intervals $\Delta t$ covering the period $P = N\,\Delta t$, the average value of

$$\sin\left(\frac{2\pi i t}{P}\right)\sin\left(\frac{2\pi j t}{P}\right)$$

is zero unless $i = j$. If $i = j$, we must determine the average of $\sin^2\left(\frac{2\pi i t}{P}\right)$. The fact that the average is 1/2 can be seen easily since

$$\sin^2\left(\frac{2\pi i t}{P}\right) = \frac{1}{2} - \frac{1}{2}\cos\left(\frac{4\pi i t}{P}\right)$$

Since $\cos\left(\frac{4\pi i t}{P}\right)$ averages out to zero, the average of $\sin^2\left(\frac{2\pi i t}{P}\right)$ is 1/2. (The corresponding average for the cosines is 1/2 when $i < N/2$ and 1 when $i = N/2$.) Also, the average of $\sin\left(\frac{2\pi i t}{P}\right)\cos\left(\frac{2\pi j t}{P}\right)$ over the period $P$ is zero as long as $i$ and $j$ are integers $\leq N/2$.

Now, let a time series be expanded in terms of sines and cosines:

$$X = \bar{X} + \sum_i \left[A_i \sin\left(\frac{2\pi i t}{P}\right)\right] + \sum_i \left[B_i \cos\left(\frac{2\pi i t}{P}\right)\right]$$

Multiply both sides by $\sin\left(\frac{2\pi i t}{P}\right)$ and average over all $N$ times of observation. All terms on the right drop out except the one with coefficient $A_i$.
Hence,

$$A_i = \frac{2}{N} \sum_t \left[X \sin\left(\frac{2\pi i t}{P}\right)\right]$$

Similarly, multiplying both sides of the series by $\cos\left(\frac{2\pi i t}{P}\right)$, we get:

$$B_i = \frac{2}{N} \sum_t \left[X \cos\left(\frac{2\pi i t}{P}\right)\right]$$
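Applied to a synthetic record (our sketch), these formulas recover the harmonic coefficients and reproduce the series exactly:

```python
import numpy as np

N = 24                                   # e.g. hourly observations over one day
t = np.arange(N)
P = N                                    # period P = N * dt, with dt = 1
X = 10 + 3 * np.sin(2 * np.pi * t / P + 1.0) + 1.5 * np.cos(4 * np.pi * t / P)

k = N // 2
A = np.array([2 / N * np.sum(X * np.sin(2 * np.pi * i * t / P))
              for i in range(1, k + 1)])
B = np.array([2 / N * np.sum(X * np.cos(2 * np.pi * i * t / P))
              for i in range(1, k + 1)])

# Reconstruct the series from the mean and the harmonics (the i = N/2 cosine
# coefficient needs the factor 1, not 2, per the averages derived above).
B[-1] /= 2
recon = X.mean() + sum(A[i - 1] * np.sin(2 * np.pi * i * t / P) +
                       B[i - 1] * np.cos(2 * np.pi * i * t / P)
                       for i in range(1, k + 1))
print(np.allclose(X, recon))             # True
```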
Proof of relations between $A_i$, $B_i$, $C_i$ and $t_i$.

Let

$$X = C_i \cos\left(\frac{2\pi i}{P}(t - t_i)\right)$$

Expanding the cosine of a difference:

$$X = C_i \sin\left(\frac{2\pi i t}{P}\right)\sin\left(\frac{2\pi i t_i}{P}\right) + C_i \cos\left(\frac{2\pi i t}{P}\right)\cos\left(\frac{2\pi i t_i}{P}\right)$$

If

$$X = A_i \sin\left(\frac{2\pi i t}{P}\right) + B_i \cos\left(\frac{2\pi i t}{P}\right)$$

the two expressions can be correct for all $t$ only if

$$A_i = C_i \sin\left(\frac{2\pi i t_i}{P}\right), \qquad B_i = C_i \cos\left(\frac{2\pi i t_i}{P}\right)$$

Hence,

$$A_i^2 + B_i^2 = C_i^2, \qquad \frac{A_i}{B_i} = \tan\left(\frac{2\pi i t_i}{P}\right)$$
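In practice these relations convert a fitted pair $(A_i, B_i)$ into an amplitude $C_i$ and a time of maximum $t_i$; a short sketch (ours; the coefficient values are hypothetical):

```python
import numpy as np

A1, B1 = 2.0, 1.5                          # fitted sine and cosine coefficients
C1 = np.hypot(A1, B1)                      # amplitude C_1 = sqrt(A_1^2 + B_1^2)
P = 24.0                                   # period of the fundamental
t1 = P / (2 * np.pi) * np.arctan2(A1, B1)  # time of maximum, harmonic i = 1
print(C1, t1)
```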
Derivation of denominator of trend line.

The variance of the variable $t$ is:

$$\frac{\sum_1^N t^2}{N} - \left(\frac{\sum_1^N t}{N}\right)^2$$

Now, let $t$ be represented by successive integers. The sum of the squares of the integers is given by $\dfrac{N(N+1)(2N+1)}{6}$, and the sum of the integers by $\dfrac{N(N+1)}{2}$. Hence,

$$\text{variance} = \frac{(N+1)(2N+1)}{6} - \left(\frac{N+1}{2}\right)^2 = \frac{N^2 - 1}{12}$$
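For example (our check), with $N = 50$ successive integers:

```python
import numpy as np

N = 50
t = np.arange(1, N + 1)            # successive integers 1 .. N
print(t.var(), (N ** 2 - 1) / 12)  # both give 208.25
```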
CHAPTER VII

Proof that for expansion of a series in orthogonal functions, the coefficients give the "best" fit in the sense of least squares.

Let

$$f = \sum_{i=1}^{k} a_i P_i$$

where the $P_i$'s are orthogonal. The problem is to find the $a$'s by least squares. $f$ and the $P_i$ may be functions of $x$, $y$, $z$, $t$, or any combination thereof. The functions must be orthogonal in the domain of observations. The condition for least squares is

$$\sum_{\text{obs}} \left[\sum_{i=1}^{k} (a_i P_i) - f\right]^2 = \text{minimum}$$

The derivative of this with respect to any $a$ must be zero. Let us differentiate with respect to one particular $a$, say $a_j$. From the chain rule

$$2 \sum_{\text{obs}} \left[\sum_{i=1}^{k} (a_i P_i P_j) - f P_j\right] = 0$$

Remembering that $a_i$ is constant over the domain of observations, and exchanging summations, we find:

$$\sum_{i=1}^{k} \left[a_i \sum_{\text{obs}} (P_i P_j)\right] = \sum_{\text{obs}} (f P_j)$$

Dividing by $N$:

$$\sum_{i=1}^{k} a_i\, \overline{P_i P_j} = \overline{f P_j}$$

According to the orthogonality condition, $\overline{P_i P_j} = 0$ unless $i = j$. Hence, only one term is left on the left side:

$$a_j\, \overline{P_j^2} = \overline{f P_j}$$

or,

$$a_j = \frac{\overline{f P_j}}{\overline{P_j^2}}$$

which is the same formula as derived in the text. Hence, this formula satisfies the conditions of a least square fit.
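To illustrate (our sketch), one can construct functions orthogonal over the observation points by a QR factorization and confirm that the projection formula reproduces the least-squares coefficients:

```python
import numpy as np

rng = np.random.default_rng(7)
x = np.linspace(-1, 1, 50)

# Make three functions orthogonal over the observation points
# (Gram-Schmidt on 1, x, x^2, via a QR factorization).
P = np.linalg.qr(np.column_stack([np.ones_like(x), x, x ** 2]))[0]

f = 2.0 - 1.0 * x + 0.5 * x ** 2 + 0.05 * rng.normal(size=x.size)

# Projection formula: each coefficient from its own average, independently.
a_proj = (f[:, None] * P).mean(axis=0) / (P ** 2).mean(axis=0)

# Full least-squares fit of f on the three functions simultaneously.
a_lsq = np.linalg.lstsq(P, f, rcond=None)[0]

print(np.allclose(a_proj, a_lsq))   # True: the projections are the LSQ fit
```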