This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
) qt(
+
whence it can be shown that they all, as well as their sum-the generating function of Pm,n-satisfy the same equation of the second order
Zn+2- tzn+l + pq(t2- 1)Zn
=
0.
FUNDAMENTAL LIMIT THEOREMS Setting t
= ei",
307
one of the characteristic roots will be given by .
e
...
(1-2pq)>u-4pq(l-3pq)"2 + ...
for small u, while the other root tends to
0
be reached in the same way as in Examples
as
u-+
1 and 2,
0.
The final conclusion can now
pages
297 and 301.
References P. LAPLACE: "Thoorie analytique des probabiliMs," Oeuvres VII, pp. 309ff. S. PoissoN: "Recherches sur la probabilite des jugements en matiere criminelle et en matiere civile," pp.
246ff.,
Paris,
1837.
P. TsHEBYSHEFF: "Sur deux th�oremes relatifs aux probabilit�s," Oeuvres
481-491.
2,
pp.
A. MARKOFF: The Law of Large Numbers and the Method of the Least Squa1·es (Russian), Proc. Phys.-Math. Soc. Kazan, .Jl"e-z• --- : Sur les racines de }'equation e� dx"
1898. =
0,
Bull. Acad. Sci. St. Petersbourg,
9,
1898. --- :
"D�monstration du second thooreme limite du calcul des probabilit� par la
m�thode des moments," St. Petersbourg, --- :
1913. 1912.
"Wahrscheinlichkeitsrechnung," Leipzig,
A. LIAPOUNOFF: Sur une proposition de la thoorie des probabilit�, Bull. Acad. Sci. St. Petersbourg,
13, 1900.
---: Nouvelle forme du thooreme sur la limite de probabilite, M�m. Acad. Sci. St. Petersbourg,
12, 1901.
G. P6LYA: Ueber den zentralen Grenzwertsatz der Wahrscheinlichkeitsrechnung, Math. Z., 8, 1920. J. W. LINDEBERG: Eine neue Herleitung des Exponcntialgesetzes in der Wahrschein lichkeitsrechnung, Math. Z.,
15, 1922.
P. LEVY: "Calcul des probabilit�," Paris,
1925.
S. BERNSTEIN: Sur !'extension du thooreme limite du calcul des probabilites aux sommes de quantit� d�pendantes, Math. Ann., A.
KoLMOGOROFF:
Eine
97, 1927.
Verallgemeinerung des Laplace-Liapounoffschen Satzes,
Bull.·Acad. Sci. U.S.S.R., p.
959, 1931.
---: Uebe� die Grenzwertsii.tze der Wahrscheinlichkeitsrechnung, Bull. Acad. Sci.
. U.S.S.R., p. 363, 1933. A. KHINTcHiNE: "Asymptotische Gesetze der WahrscheinlichkeitsrechnutJ.g," Berlin,
1933.
CHAPTER XV
NORMAL DISTRIBUTION IN TWO DIMENSIONS. LIMIT THEOREM FOR SUMS OF INDEPENDENT VECTORS. ORIGIN OF NORMAL CORRELATION 1. The concept of normal distribution can easily be extended to two and more variables.
Since the extension to more than two variables
does not involve new ideas, we shall confine ourselves to the case of two-dimensional normal distribution. Two variables,
x, y, are said to be normally distributed if for them
the density of probability has the form (j'l'
where cp
ax2 + 2bxy + cy2 + 2dx + 2ey + f
=
is a quadratic function of x, y becoming positive and infinitely large together with lxl + IYI· This requirement is fulfilled if, and only if,
ax2 + 2bxy + cy2 is a positive quadratic form.
The necessary and sufficient conditions
for this are:
a> 0;
ac-b2
=A > 0.
Since A> 0 (even a milder requirement A ;;6 0 suffices), cbnstants
xo, y0
can be found so that cp
a(x-xo)2 + 2b(x-Xo)(y-Yo) + c(y - Yo)2 + g
=
identically in
x, y.
presented thus:
It follows that the density of probability tr"' may be
4
The expression in the right member depends on six parameters K; a, b, c; x0, yo. But the requirement
reduces the number of independent parameters to five.
We can take
a, b, c; xo, Yo for independent parameters and determine K by the condition
J_ ..,f_"',.e-,.(s-so)t-'Jb(z-so)(V:"•l-c(u-1/o>•dxdy
K
.
308
=
1
'
SEc. 2)
309
NORMAL DISTRIBUTION IN TWO DIMENSIONS
which, by introducing new variables
-
�= x can be exhibited thus
.. ..
f1 = Y
Xo,
- Yo
f_ .. J_ .. e-af._26�"'�"d�d., = 1.
K
To evaluate this and similar double integrals we observe that the positive quadratic form
a�2+ 2b�., + c.,2 can be presented in infinitely many ways as a sum of two squares
a�2+ 2b�., + c.,2 = (a�+ p.,)2+ ('Y�+ a .,)2, whence
b and
(aa
- P'Y)2
=
=
a{j+ 'Ya
1:1.
By changing the signs of a and {j if necessary, we can always suppose
a8
- P'Y
=
+Vil.
Now we take u =a�+ Pfli
for new variables of integration.
v
=
'Y�+ a.,
Since the Jacobian of u, v with respect
to ��.,is Vil, the Jacobian of�� f1 with respect to u, v will be by the known rules
f .. f .. -oo
-oo
e-afL-26�•'1"d�fl =
1
__
Vil
Thus
K'lf' Vil
=1
'
J
oo
-oo
f
oo
e-u•-v•dudv
-oo
=
1/Vil and, __!____ . Vil
Vil K=-· 7f'
That is, the general expression for the density of probability in two dimensional normal distribution is
vac 7f'
-
b2e- (..-zo)L- (..-zol
.
2. Parameters Xo, Yo represent the mean values of variables x, y. To prove this, let us consider E(x
-
Xo) =
.V:J_'".. J_ .. ..
(x
-
Xo)e-a<..-zolL-26(..-zol(u-uol-c(u-llol'dxdy.
310
INTRODUCTION TO MATHEMATICAL PROBABILIT.Y
[CHAP. XV
To evaluate the double integral, we can express x and y through new variables
u,
v introduced in the preceding section. X - Xo =
and E(x - Xo) =
au - {3v, v'6
Y - Yo
1 J .... f .... (au
_/A
rvA -
whence
We have
--yu
+ v'6
av
----,=---
=
-'
- {3v)e-u._v•dudv = 01
_
E(x) = xo,
and similarly E(y) =Yo·
3. Having found the meaning of xo, Yo we may consider instead of x, y, variables x - Xo1 y - Yo whose mean values 0. Denoting these new variables by x, y again the expression of the density of probability for =
x, y will be:
It contains only three parameters, a, b, c. To find the intrinsic meaning of a, b, c let us consider the mathematical expectation of (x + 'Ay)2 where}.. is an arbitrary constant. E(x + 'Ay)2 =
or, introducing
u,
We have
yxJ_:J_..
,.
(x + 'Ay)2e-a:z:2-'Jb:z:u-cu•dxdy,
v defined as in Sec. 1 as new variables of integration,
But
a{j + -ya
=
b,
whence E(x2)
+ 2'AE(xy) +
c
b
2A
2A.
'A2E(y2) = - - 2}..-
+
a
}.,_2-1
2A
and since}.. is arbitrary E(x2) =
c
2A'
E(y2)
a =
-·
2A·
SEc.
4]
NORMAL DISTRIBUTION IN TWO DIMENSIONS
On the other hand, if of x,
y
crt, cr21
and
311
r are respectively standard deviations
and their correlation coefficient, we have
E(x2) =
E(y2)
E(xy) = rcrtcr21
crf1
=4
Hence
and
or
Finally, a=
1 , 2crf(1 - r2)
b
=
�= With these values for
a,
r 2crtcr2(1 - r2)1 1 . 2crtcr2V1 - r2
b, c, and
.y'K the
density of probability can
be presented as follows: " 1 e 2(1 :r•>[ (�)" -2� �+ (�) ] 21r1Tt1T2�
and the probability for a point x,
y to
belong to a given domain D will be
expressed by the double integral
1 21rcrt1T2� extended over D.
4. Curves
1 2(1 - r2)
If
e
l
2T=- !!.. (!!.. . 2(1-r•) [(=-)· .. - ... ...+ ...) ] dxdy
__
.
(D)
2- 2r=._ JL ( y )2] = l = [(=-) ITt ITt ITt 0'2 +
const.
are evidently similar and similarly placed ellipses with the common center at the origin. equal probability. of
l
(ellipse
l)
is
For obvious reasons they are called ellipses of
The area of an ellipse corresponding to a given value
312
[CHAP. XV
INTRODUCTION TO MATHEMATICAL PROBABILITY
whence the area of an infinitesimal ring between ellipses l and l + dl has the expression - r2dl.
2?ru1u2y1
The infinitesimal probability for a point x, y to lie. in that ring is expressed by
e 1dl.
Finally, by integrating this expression between limits l1 and l2 > l 1, we find as the expression of the probability for x,yto belong to the ring between
two ellipses l1 and l2.
If l1
=
0 and l2
1
l,
=
- e-1
gives the probability for x, yto belong to the ellipse l. If n numbers l, l1, l2, . . . ln-1 are determined by the conditions
1 - -1 e
=
-1 - -1 • e e
-Z. - -1• e e
=
=
·
the whole plane is divided into n +
·
1
·
=
e-1·-•
-
e-1·-•
=
1 --, n+l
regions of equal probability:
namely, the interior of the ellipse l, rings between l, l1; lt, l2; ••• ln-2, ln-l and, finally, part of the plane outside of the ellipse ln_1• 5. To find the distribution function of the variable x (without any regard toy), we must take forD the domain - oo <X< t;
-oo
As the integral
1
f' f,. 2?ru1.yr;=T22f'-..
2?r1T11T2� 1
=
e
-
oo
-
1_ __ 2{1-r')
' • [(!!.. -!!. ) 'J .. , ) +(1 -r ) ( =.. , ,,.
oo
dxdy
=
.
' e 2"•'dx
f-.. oo
·
"'
z!
r e 2<1 - '>dz
we see that the probability of the inequality X< t is expressed by
1
---
IT1
-yl2;
f'
-
e 2
.,.
Similarly, the probability of the inequality y< t
=
' "''
1 1Ttyl2; ---
J
e -2.. •'dx�
..
is
1 ---
IT2�
Thus, if two variables means
313
NORMAL DISTRIBUTION IN TWO DIMENSIONS
SEc. 6)
=
x,
f'
•
e
u -2Viody.
-oo
y
are normally distributed with their
0, each one of them taken separately has a normal distribution
of probability with the common mean 0 and the respective standard
y
�r1 and �r2• Variables x and are not independent except when For if they were independent the probability of the point
deviations r = x,
0.
y belonging to an infinitesimal rectangle t <X< t + dt;
would be
whereas it is
e
1
27riTliT2�
-2(1
� r•) [ (f.)"-2rf, ·;;+(;;)"]dtdT'
and these expressions are different unless r
=
Thus, except for r
0.
= 0,
normally distributed variables are necessarily dependent in the sense of the theory of probability. "correlated variables."
Dependent variables are often called
In particular, variables are said to be in "normal
correlation" when they are normally distributed.
6. The probability of simultaneous inequalities
y
X<X< X', is represented by the repeated integral
while
1 ITl�
---
is the probability that (Chap. XII, Sec.
10)
x
X'-�dx LX e
:wa •
will be contained between X and X'.
the ratio
X'e -�dxf� ,.e ! fx r x· -� U2V27r(1 - r2) Jx e :Wo•(
1
[ �z Jdy
r ) u- •
2
can be considered as the probability of the inequality
y
Hence
314
INTRODUCTION TO MATHEMATICAL PROBABILITY
it being known that
x is
contained between X and X'.
[CHAP. XV
Considering X' as
variable and converging to X the above ratio evidently tends to the limit
1 crs.yr-=Ti.y2;
' J
-
1
[
'"x]"dy
e 2,,•(1 r•) !1-r;; ..
which can be considered as the distribution function of fixed value X.
Hence,
y
for
x =X
y
when x has a
has a normal distribution with the
standard deviation
and the mean
cr
y = r sx. 0"1
Interpreted geometrically, "line of regression" of y on
this equation represents the so-called
x.
In a similar way, we conclude that for
y =Y
the distribution of
x
is normal with the standard deviation
r2
crtv'1 and the mean
X = rcrty, 0"2
This equation �epresents the line of regression of .
x
on
y.
LIMIT THEOREM FOR SUMS OF INDEPENDENT VECTORS
7. So far normal distribution in two dimensions has been considered abstractly without indication of its natural origin.
One-dimensional
normal distribution may be considered as a limiting case of probability distributions of sums of independent variables.
In the same manner
two-dimensional normal distribution or normal correlation appears as a
limit of probability distributions of sums of independent vectors. Two series of stochastic variables X11 Xs1
, X,.
Yt, Ys,
.
y,.
define n stochastic vectors Vt1 v2 , . . . v,. so that x,, y, represent com ponents of v, on two fixed coordinate axes. If
E(x,) =a,
E(y,) = b,,
SEc.
8)
315
NORMAL DISTRIBUTION IN TWO DIMENSIONS
the vector
a, with the components
b, is called the mean value of v,.
a,,
Evidently the mean value of v=
Vt + v2 +
·
·
·
+ v,.
is represented by the vector
a = a1 + a2 + and that of
+ a,.
v- a is a vanishing vector.
Without loss of generality
we may assume at the outset that i
E(x,) = E(y,) = 0; in which case
1, 2, .. .
=
n
,
v,. are said to be inde E(v) = 0. Vectors Vt , vz, x,, y, are independent of the rest of the variables •
•
.
pendent if variables
Xf, Yi where j
�
i.
In what follows we shall deal exclusively with independent vectors.
8. As before, let x,., y,. be components of the vector
v,.(k = 1, 2,
.
.
)
.
n .
Then
X = Xt + X2 + Y Yt + Y2 +
+ x.. + y,.
=
will be the components of the sum
v
=
Vt + v2 +
·
·
+ v,..
·
If
E(x,.) = E(y,.) = 0 E(y�) = c,., E(x:) = b,., then
E(X) 0, bt + bs + E(X2) E(Y2) = Ct + Cs + E(XY) = dt + d2 +
E(Y) = 0
=
·
=
•
·
·
+ b,. + c,.
·
·
··
=
=
B,. Cn
+ d,. = r,. v'B.. y'C"
·
because if variables
j � i,
x, and y 1 being independent.
Let us introduce instead of variables
x,., y,.(k
variables
�,.
x,. =
__ ,
VB..
y,.
.,,.=--=
yC,.
=
1, 2,
.
.
•
n)
new
316
[CHAP. XV
INTRODUCTION TO MATHEMATICAL PROBABILITY
and correspondingly
X
u=
8=--,
VB,.
instead of X, Y.
y
--
VC:.
We shall have:
E(��o)= E(fl�o) = 0 E(�:)
.,;,
!:;
�:
E(fl:)=
, and
E(s)= E(u) = 0 E(s2) = E(u2)= 1 E(su) = r,.. The quantity
� 1.
r,.,
the correlation coefficient of s and
u, is in absolute value
We define
q,(u, v) = E[ei] as the characteristic function of the vector s,
q,(O, v)
u.
Evidently
q,(u, 0)
are respectively the characteristic functions of 8 and
ei(ua+Vor)
= ei(uf•+••ll>
•
ei(uf:r+v.,,)
•
•
•
and
u. Since
ei(ufn+v.,n)
and the factors in the right-hand member represent independent varia bles, we shall have
q,(u, v) = E(ei
9. For what follows it is very important to investigate the behavior when n increases indefinitely while u, v do not exceed an
q,(u, v)
of
arbitrary but fixed number l in absolute value. Let
and
!t+!z+ · Bl g1 + U2 + c•.. ·
If
"'"
and fin tend to
0
as
n --+ co,
·
+!.. =
"'"
+g.. =
fin•
we shall have
(1) provided
lui�
l,
lvl
� l
SEc.
9)
and
n
NORMAL DISTRIBUTION IN TWO DIMENSIONS
317
is so large as to make
l(wi + 711) Since ei
1 +
- �(u�,.
i(u�,. + V1/�o)
<
+
1.
V7/�o)2 +
+
8. (u�,. + V7/,.)a.I 6
IBI < 1,
we shall have:
18'1 < 1. On the other hand,
1
- �2 b 2Bn
2dk
UV 2VJ[;.C,.
_
_
c
:4-v2
2Cn
Ci 2di bi --u•---ue--•• = e
2Bn
2yB.Cn
2Cn
+
IB"I < 1 and so
Furthermore,
because
Em)
=
!"n
<
w !,
Also
[E(u�,. + V7/�o)2)2
<
[E(u�,. + V7/�o)2)f � Elu�,. + V'7�ola
. Elu�,. + V7/�ol3 � 4la
(�l �i} +
Taking into account these various inequalities, we may write
INTRODUCTION TO MATHEMATICAL PROBABILITY
318
[CHAP. XV
where
Finally,
and
Let P denote the probability of simultaneous inequalities
as was stated.
10. Theorem.
Provided r,. remains less than a fixed number a < 1 in absolute value and the above introduced quantities "'"' .,,. tend to 0 as n --+ oo , P can be expressed as p
=
1-(11-2rntT+ T') i"iTI -e 2(1-r.•> dtd-r + A,. 27ry1 - r! lo To 1
where A,. tends to 0 uniformly in to, t1; -ro, -r1. If, in addition, r,. itself tends to the limit r(lrl to
Proof.
<
)P will tend uniformly 1
a. In trying to extend Liapounoff's proof to the present case
we introduce an auxiliary quantity II defined as II
=
iTl -(!=!)I ( 1 i" -(!.=:!)I e " du e " dv h'T lo TO
E -
Using the inequality
1 i __ .. e-'"dt
.y;.:J:
)
·
• < e-..
2
for
i" - (!.=:!). .[Tl - ( !=!) . .e " du e " dv - 1 1 1h'T lo TO
·
X> 01
one can easily derive the following inequalities:
(2)
·
<
I
<
�(e -(";•)' + e -('"�·)' +
e
-t'�")' + e -to�")')
SEc.
10]
319
NORMAL DISTRIBUTION IN TWO DIMENSIONS
if
(3) and
(4) _!_ h2�
i"e -( ; ) du iO e-("�")'dv < ! (e-( ; )' '
"
to
'
n
·
2
T
+
+
"
"
e-
( ;")
if at least one of the inequalities (3) is not fulfilled. of II, P and from (2) and (4) it follows that
n
)
e - (" ; " ' + '
+
e-e·;")'
)
From the definition
IP- III < !E e -(" ; ")' + e -('·;·)' + e -(Tl;")' + e-e·;")' . But referring to (1) and setting
(
)
we have by virtue of the developments in Chap. XIV, Sec. 3,
8 e -(�)' IP - III < 2a,.(l) + hv'2 + y; hl
(5)
b. Replacing derivative of
II
t1, -r1
by variable quantities t, t and -r, we get
with respect to
d 2II 1 --E-e -(' " •)' dtd-r h2�
(
_
-r
·
and taking the second
-(�)). "
On the other hand
-te -e'-,.•)' -cT-,.")' �
h2 -
=
�2
J .. f.. e -oo
whence (6)
d2II dtdT
_!_ A_ ":t'lf 2
=
Here we substitute
For all real
f f oo
-oo
cp(u, v)
u, v
oo
-oo
=
-oo
4 .. -�<
..i. (u, v)dudv. e -¥(u'+••}e- i( tu+Tvl .,
e-i(u'+2r.uv+v•> + g(u, v).
lg(u, v)l ;;:;; If
lui ;;:;; l, lvl ;;:;; l,
where
l
·+··>e-i(tu-tTvlei
2.
is an arbitrarily fixed number, and n is large
enough, we have
lg(u, v)l ;;:;; a,. ( l) .
320
INTRODUCTION TO MATHEMATICAL PROBABILITY
[CHAP.
XV
Hence, the double integral
1 4-n-2
I
fe
-�(u'+•') 4
e-i(lu+rv>g(u, v)dudv
extended over the region outside of the square
in absolute value.
lui �
l,
lvl �
lui �
l,
lvl �
The same double integral extended over the square
l is less than
·
l2 2a ,.( l) 'II"
in absolute value.
lis less than
Thus, referring to
(6)
and
Now
1�1 < 1 and
Hence
and (hl)•
IR'I
l2 a,.(l) + < r2
e -4 h2 ---,;_2 + 4-n-(1
By transformation to new variables �
= u + r,.v;
-
a2)t"
Smc.
10)
32 1
NORMAL DISTRIBUTION IN TWO DIMENSIONS
the foregoing double integral becomes
�r,.I I oo
1
A- - / 1 ":t:lf2 v
2
_
- ..
oo
-oo
6
_.!�a-b,o 2
2
•
it�+i T-Iro ' 6 '\11-r.• d�d11 =
so that finally
d2J1 dtdT
_
1
=
2'11' �
=
1
1-)( C•-2rn1T+T1) 2(1-rn• 6 --
2'11' -vr=r!
- -1-(11-2 roiT+TI) 6 2(1-rn• ) + R',
Integrating this expression with respect to t and T between limits to, t1
and To1 T11 we get:
(7)
II
1 =
f'•
27ry1 - r�J,.
i
"16 -2(1�r.•)(t•-2rntT+,.•) dtdT
...
+
P
where
Hence combining inequality (5) with
p
1 =
27ry1 - r�J,.
where
IA.I
<
[ + :.
(t,-
2
f1t
i
lo)(n- To) +
(7)
and (8),
T1 � (t•-2rntr+,.•J 6 2(1 r.•) dtdT
...
+ .1..,
]a.(!)+�}'{��+
(t1 - to)(TI -
}
To) + hv'2 +
(ti - to)(n
- ;o)h2•
471'(1 - a2)2
Considering to, t1; To1 TI as variable and denoting an arbitrarily large
number by
L,
we shall assume at first that the rectangle D
is completely contained in the square
Q: lui� L.
Then, taking h
1.1... 1
<
=
z-� we shall have
( + �:z2)a.. (Z) + -�(�z-; + 4L2J 2
l6
+
0l-� +
L2Z-l �· '11'(1 - a2)2
322
INTRODUCTION TO MATHEMATICAL PROBABILITY
e,
Given an arbitrary positive number
le -�(�z-� After that, since
no(e) so
+ 4£2
a,.(l) ---+
0 as
+
n---+
that
a,.(l) for n >
)
no(e).
<
�
v'2l- � +
n
no(e);
>
c. To prove that Tt
large as to have
L2z-t � � <
11'(1 - a2)2 l)
e.
we can find a number
(2 + 4£2l2)-l ----;-
Finally, we shall have <
E
� .. tends to 0 uniformly in any Q with an arbitrarily large side 2L.
that is,
..
contained in the square
ro,
l so
(for a fixed
oo
�� .. � as soon as
we take
[CHAP. XV
�
rectangle D
tends to 0 uniformly no matter what are
to, t1;
we observe that the integral
ff -2(l�rn')(t•-2rntr+r')dtdr e -
1
21rv11
,.�
Q
extended over the area outside of Accordingly, we take
L
becomes infinitesimal as
L
---+ oo
.
so large as to make this integral < e/2 (no matter
what n is) and in addition to have L-1 <
e/4.
The number L selected
according to these requirements will be kept fixed. Let D' represent that part of D which is inside that the point let
J'
and
J"
s, u
shall be contained in D' or D", respectively.
ff -2(1�r.•)!t'-2rntr+r'}dtdr e r�
extended over D' and D", respectively. > 0 a number
n0(e) can
no(e).
=
P' + P";
whence
IP no(e).
<
E
Now
P
for n >
By what has been proved, given
be found so that
IP'- J'l for n >
Also,
be the integrals
1 211'V1 E
Q, the remaining part or
Let P' and P" denote the probabilities
parts (if there are any) being D".
- Jl
<
J E
=
J' + J",
+ P" + J"
Since by Tshebysheff's lemma (Chap. X, Sec.
probability of either one of the inequalities
1)
the
Smc. 11)
323
NORMAL DISTRIBUTION IN TWO DIMENSIONS
lsi> L is less than
1/L,
or
lui > L
we shall have
P"
<
�
<
L
�. 2
Also,
J" < �� 2 whence
jP- Jj for
n > no(e);
2e
<
that is, the difference
tends to 0 uniformly, no matter what
to, t1; ro,
Tt
are.
Finally, the last statement of the theorem appears as almost evident and does not require an elaborate proof.
11. The theorem just proved concerns the asymptotic behavior of the probability P of simultaneous inequalities
which, due to the definition of sand
tov'B,. � x1 + x2 + rov'C: � Y1 + Y2 +
u,
·
'
·
are equivalent to the inequalities ·
+ Xn < trv'B..
' ' + y,. <
1"1VC:.
From the geometrical standpoint the above domain of tangle.
s,
u
is a rec
But the theorem can be extended to the case of any given
domain R for the point
s,
It is hardly necessary to enter into details
u.
of the proof based on the definition of a double integral.
It suffices to
state the theorem itself:
Fundamental Theorem. The probability for the point (s, u ) to be located in a given domain R can be represented, for large n, by the integral
1 27ry1- r!
ffe
-2(1 � r.•>(t•-2r.t.-+.-•>dtd.,.
extended over R, with an error which tends uniformly to 0 as n becomes infinite, provided w,.--+01
.,,.--+0,
324
INTRODUCTION TO MATHEMATICAL PROBABILITY
while for all
[CHAP. XV
n
lr.. l
In less precise terms we may say that under very general conditions the probability distribution of the components of a vector which is the sum of a great many independent vectors will be nearly normal. The first rigorous proof of the limit theorem for sums of independent vectors was published by S. Bernstein in 1926.
Like the proof developed
here it proceeds on the same lines as Liapounoff's proof for sums of independent variables.
Moreover, Bernstein has shown that the limit
theorem may hold even in case of dependent vectors when certain addi tional conditions are fulfilled.
12. A good illustration of the fundamental theorem is afforded by E, F, G. For the sake of simplicity we shall assume that probabilities of E, F, G are
series of independent trials with three alternatives,
p, q, rin
Naturally
all trials.
p+q+r=l. In the usual way, we associate with these trials triads of variables
(i
=
1, 2, 3, . . . )
so that
x, 1 y, = 1 z, = 1 =
or 0 according as
E
occurs or fails at the ith trial;
or 0 according as F occurs or fails at the ith trial; or 0 according as G occurs or fails at the ith trial.
Evidently
E(x,) = E(xl) = p E(y,) E(yn = q =
so that vectors v, with components
�. have their means
x,
-
p,
'I'Ji
= y,
-
q
The independence of trials involves the inde
0.
=
pendence of vectors
=
Vt1 V21
•
•
•
v,..
Hence we can apply the preceding
considerations to the vector v
= Vt +v2 +
·
·
·
+ v,.
with the components
X .= �1 + �2 + y = '171 +'172 +
+ �.. +TJ,..
We have
B,. = E(X2) = np(1 - p);
C,.
=
E(Y2)
=
nq(1 - q).
SEc. 12]
NORMAL DISTRIBUTION IN TWO DIMENSIONS
325
Moreover,
E(�•"''>) =E(x(IJ;)- pq =-pq and
E(XY) = r,.VJi:v'C,. = -npq whence r,. =
pq ypq(1 - p)(1 - q)
The quantities denoted by /J;,, gk in Sec. 9 are in our case
fk =El�kl3 = p(1 - p)3+ (1 - p)pa Uk = El'llkl3 = q(1 - q)3+ (1 - q)q3• Hence
- q(1 - q)3+ (1 - q)q8 ' "'" nlq1(1 - q)t
p(1 - p)3+ (1 - p)p3 ' "'" = nlp1(1 - p)i and the conditions w,. ___..
are satisfied.
o,
'lin___.. 0
The fundamental theorem, therefore, can be applied.
If k, l, mare the respective frequencies of events E, F, Ginn trials, the
quantities X and Y represent the discrepancies }. =k
-
p. = l- nq.
np,
Introducing the third discrepancy
v=m- nr we shall have
>.+p.+v=O so that
v
is determined when >. and p. are given.
The last two quantities,
however, may have various values depending on chance.
Concerning them the following statement follows from the fundamental theorem:
The probability that discrepancies >., p. in n trials shall Theorem. simultaneously satisfy the inequalities
tends uniformly, with indefinitely increasing n, to the limit
1
Cll
li
'hypqr ao
{J1
{Jo
(
l a" + (JI +
e 2P --
-
-
11
'Y•)
r dadfl
-
326
INTRODUCTION TO MATHEMATICAL PROBABILITY
where, to have symmetrical notation,
[CHAP. XV
is a vm·iable defined by
'Y
a+ {3+ 'Y=
0.
On account of _symmetry, perfectly similar statements can be made in regard to any two pairs of discrepancies >., p.,
v.
Since the fundamental theorem and its proof can be extended without any difficulty to vectors of more than two dimensions, we shall have in the case of trials with more than three alternatives a result perfectly analogous to the last theorem .
Theorem. Each of n independent trials admits of k alternatives E1 , E2, Ek the probabilities and the f1·equencies of which respectively are P1, P2, . . . Pk and m1, m2, ...mk. The probability that the discrep . k - 1) should satisfy simultaneously the ancies m; - np;(i = 1, 2, . . inequalities •
•
•
arv'n
<
m; - np;
<
{3;-vn
tends uniformly, with indefinitely increasing n, to the limit
fllt J Pk
1 k-1
a "'
2 VPtP2 ' ' ' (2?r)-where
. .
1
k
1<2
fP•-• -2�1 Pi e dttdt2 ... dtk-1 Jo
.
a k-l
tk= -(tt+t2+ ...+tk-1).
From this theorem, by resorting to the definition of a multiple integral, we may deduce an important corollary: Let Pn
denote the p1'0bability of the
inequality
(m2 - np2)2 (mt - np1)2 (mk - npk)2 � x2. + ...+ + np2 npk np1 Then, as n tends to infinity Pn tends to the limit
1
f Pk
k-1
2 VP1P2 (2?r)--
·
·
·
' ' '
f -!(�+�+ 2
e
Pt
Pt
.
P•
. .
+�)
dt1dt2 ' ' ' dtk-1
where the integration is extended over the (k - 1) dimensional ellipsoid t t t = l + � + ...+ f � x2. P1 P2 Pk
({'
It is easy to see that the determinant of the quadratic form
(k - 1)
(PtP2
variables is
·
·
·
Pk)-1
1 (2?r)-2
ff
•
•
•
f - !<••'+••'+ e
in
Hence, by a proper linear trans
•
formation the above integral reduces to
--,k;:-· --;1
({'
2
. . .
•-• dv dv 1 2 +• •)
·
·
·
dvk-1
SEc.
13)
327
NORMAL DISTRIBUTION IN TWO DIMENSIONS 2
of integration being v¥ + v: + + vl_1 � x • But this multiple integral, as will be shown in Chap. XVI, Sec. 1, can be the
domain
·
·
·
reduced to a simple integral
Thus
.
hm P,.
_
=
The probability
(A)
Q,.
=
X (k ) L
1 k-3 1 22r -2
e
-.!u• 2 uk-2du.
o
1 - P,. of the opposite inequality
.. -_n-=-p-=1)'- 2 + (ms- nps)2 + (.:.m_l_ np2 np1
tends to the limit
and for large
n
we have an approximate formula
Q.
�
(� - l)f -l..
!='! 22r -2
e
u'-'du,
x
but the degree of approximation remains unknown.
In practice, to
test whether the observed deviations of frequencies from their expected
values are significant, the value of the sum (A), say x2, is found; then
by the above approximate formula the probability that the sum (A) will be greater than
x2
is computed.
If this probability is very small, then
the obtained system of deviations is significantly different from what could be expected as a result of chance alone.
The lack of information
as to the error incurred by using an approximate expression of
Q,. renders
the application of this "x2-test" devised by Pearson somewhat dubious.
HYPOTHETICAL EXPLANATION OF EMPIRICALLY VERIFIED' CASES OF NoRMAL CoRRELATION 13. Normal distribution in two di:rp.ensions plays an important part in target practice.
It is generally assumed on the basis of varied evidence
collected in actual target practice that points of a target hit by projectiles are scattered in a manner suggesting normal distribution.
By referring
INTRODUCTION TO MA'I'HEMATICAL PROBABILIT Y
328
[CHAP. XV
points hit by projectiles to a fixed coordinate system on the target, it is possible from their coordinates to find approximately (provided the number of shots is large) the elements of ellipses of equal probability. Dividing the surface of the target into regions of equal probabilities as described in Sec. 4, and counting the actual number of hits in each region, the resulting numbers in many reported instances are nearly equal.
That and the agreement with other criteria are generally con
sidered as evidence in favor of assuming the probability in target practice to be normally distributed. Two-dimensional normal distribution or normal correlation has been found to exist between measurable attributes, such as the length of the body and weight of living organisms.
Attributes like statures of parents
and their descendants, according to Galton, again show evidence of normal correlation. Facing such a variety of facts pointing to the existence of normal correlation, one is tempted to account for it by some more or less plausible hypothesis.
It is generally assumed that deviations of two magnitudes
from their mean values are caused by the combined action of a great many independent causes, each affecting both magnitudes in a very small degree.
Clearly, the resulting deviations under such circumstances may
be regarded as components of the sum of a great many independent vectors.
Then, to explain the existence of normal correlation, reference
is made to the fundamental theorem in Sec. 11. Problems for Solution 1. Let p denote the probability that two normally distributed variables (with means
=
O) will have values of opposite signs.
lation coefficient
r 2. Variables
Show that between p and the corre
r the following relation holds:
x,
=
cos p7r.
y (with the means O) are normally distributed. x, y to be located in an ellipse =
Show that the
probability for the point
z2
x
y
y2
.,.1
1Tt
.,.2
2 -2r--+2
=
l
is greater than the probability corresponding to any other domain of the same area.
3. Three dice colored in white, red, and blue are tossed simultaneously n times. Let X and Y represent the total number of points on pairs: white, red and white, blue. Show that the probability of simultaneous inequalities
7n + toy'ii;, <X< 7n + ttV¥n; tends to the limit
asn-> oo.
7n + ToVij.;; < Y < 7n + ny'ij;,
NORMAL DISTRIBUTION IN TWO DIMENSIONS 4. Three dice, white, red, and blue, are tossed simultaneously
n times.
329 If k and
l
are frequencies of 10 points on pairs: white, red; red, blue; show that the probability of simultaneous inequalities
n
+
12
{11n < l < n + / 11 n "�144 12 "''\Jl44
tends to the limit
asn-+ oo,
5. Two players, A and B, take part in a game arranged as follows: Each ti� e one
ball is taken from an urn containing 8 white, 6 black, and 1 red ball; if this ball is white, black,
A and B both gain $1; A loses $2, B loses $4;
$4, B gains $16. Let a,. and a,. be the sums gained by A and B after n games. A gains
red,
Show that the probability
of simultaneous inequalities
TO
for very large
�
n will be approximately equal to
Note that the probability of the inequality
s,.a,. < 0 is about 0.13-not very small
so that it is not very unlikely that the luck will be with one player and against another.
6. Concentric circles C,, C1 , Ca, ...in unlimited numbers are described about Points P,, P1, Pa, . ..are taken at random on these circles. Let R be the end point of the vector representing the sum of vectors OP1, OP1, OPa, H r,, r1, ra, ...are radii of C,, Ct, Ca, and the condition the origin 0.
•
.,a 1
+ .,a+ I
(r� + r: +
•
. , , +.,a " + r!)il
·
•
•
as
-+0
·
n-+
oo
is fulfilled, show that the probability that R will lie within the circle described with the radius
p
for large
about the origin will be very nearly equal to
n.
1
-
e rt•+r•'+
pi ·
·
·
+""'
References A. BRAVAis: Analyse matMmatique sur les probabilit� des erreurs de situation d'un point, Mem.
pr�enUs par div. sav. a l'Acad. Roy .Sci. l'Imt. France, 9, 8 1 46. Ann. l'Bcole polyt.
Ca.M.ScaoLs: Thoorie des erreurs dans le plan et dans l'espace,
Delft, 2, 1886. K. PEARSON: Mathematical Conttibutions to the Theory of Evolution, Philos. Tram . , 187A, 1896.
330
INTRODUCTION TO MATHEMATICAL PROBABILITY
[Cuu. XV
S. BERNSTEIN: Sur l'extension du theoreme limite du calcul des probabilites aux sommes des quantites dependantes, Math. Ann., 97, 1927.
A.
KHINTCHINE:
BegrUndung
der
Normalkorrelation nach
der Lindebergschen
Methode, Nachr. Ver. Forschungsinstitute Univ. Moskau, 1, 1928.
G. CABTELNuovo: "Calcolo delle probabilitA," vol. 2, Bologna, 1928. V. RoMANOVBKY: Sur une extension du theoreme de Liapounoff sur la limite de prob abilite, Bull. l' A cad. Sci. U.S.S.R., No. 2, 1929.
CHAPTER XVI
DISTRIBUTION OF CERTAIN FUNCTIONS OF NORMALLY DISTRIBUTED VARIABLES 1. In modern statistics much emphasis is laid upon distributions of certain functions involving normally distributed variables.
Such dis
tributions are considered as a basis for various "tests of significance" for small samples, that is, when the number of observed data is small. Some of the most important cases of this kind will be considered in this chapter.
x1, x2, . . . x,.
Independent variables
Problem 1.
distributed about their common mean deviation
u.
=
are normally
0 with the same standard
Find the distribution function of the sum of their squares 8
X�
=
+
X�
+
·
•
•
x!.
+
The inequality
Solution.
X� < t being equivalent to
the distribution function
F;(t)
1
=---=
uy2-n-
v'i J
e
v'i
-ve < x;
_=:
2"'dx
=
F;(t)
=
1
uy'2;
--
0
Sot -�2"'u e
_
! 2du
for
t
t
for
o
< 0.
Hence, the characteristic function of any one of the variables x! is
IPk(t)
=
1
uy'2; --
l,.
e
-
( 2"' _!_ ) u u -
t't
_
! 2du
1
--
o
and that of their sum !p (t)
=
1
(
1
(uv'2)" 2u2
--
(
1
-uv'2 2u2
=
)
. -
)_!
it
• -!!·
�t
;:;:; 0
2
Consequently, the distribution function of 8 is expressed by
2
x�, xi,
INTRODUCTION TO MATHEMATICAL PROBABILITY
332
and it remains to transform this integral. distributed over the interval
(0, + aJ )
[CHAP. XVI
To this end, imagine a variable
with the density
Its characteristic function is
(
1 20'2
- /n
2
(O'y ..:;)-n - - it
)-�
and since the distribution function is given a priori, we must have for
t � 0 <·
rr
r
-
2
r·.- ;;.,;;;-·du Jo
�
const. +
<·
'{;'-"f •. ( ; �·:/•· -00 w
20'2
- w 2
Hence
The constant must be
=
member vanishes fort
=
F(t)
0 since F(t) as well as the integral in 0. The final expression is therefore:
1 =
(D'.y2)nr (;) F(t)
=
fo'e -;;.,u�-1du
for
the right
t � 0
0.
for
0
t � 0.
The probability of the inequality Xl
+
X�
+
·
•
•
+
X� <
t,
on the other hand, can be expressed directly as a multiple integral
extended over the volume of the n-dimensional sphere S Xl
+
X�
+
•
•
•
+
X� <
t.
By equating both expressions of F(t), we obtain an important transforma tion,
SEc. 2)
DISTRIBUTION OF CERTAIN FUNCTIONS
II··· f
<1>
:a:••+:a:s•+ 21r• •
e
.
•
+.:..•
dxtdXs
·•·
dx..
333
=
--L' (�) o n
11'2
=
If
F(xf + x� +
· ··
_...!._ !.'_1
e 211'u2 du.
r
+ x=) is an arbitrary function of =
tt
x¥ + x� +
·
·
·
+ x!,
the integral
1
{uvl21r)n
If I ·
·
·
e-
:a:··+:a:··+ ... +z.· F(xi + 2u•
+ x!)dxtdX2
·
·
·
·
·
dxn
·
extended over the whole n-dimensional space represents the mathematical
F(u). On the other hand, the distribution function of u being known the same multiple integral will be equal to expectation of
1
{uV2)"r Taking in particular
{2)
II
.
..
f
u
=
(i)
f Jo
1, F(u)
oo
0
=
n-2
u
e -21r•F(u)u_2_du. ..
e""J.i, we get the formula
� <:a:••+ ... +z.•>+.. v:a:••+ ... +z.•dxtdxs ... dxn e-
=
n
11'2
r(�)Lo
oo
=--
_!+,.ul �
e 2
u 2du1
which will be used later.
2. Problem 2.
Variables
Xt1 X21
Denoting their arithmetic mean by
8
=
Xt + Xs +
•
••
Xn are defined + Xn
n
find the distribution function of the sum 2;
Solution.
=
(xt
- 8)2
+ (xs
- 8)2 ·· · +
The probability of the inequality 2; < t
is expressed by the multiple integral
as in Prob. 1.
I
8•
+ (x,. - )2
334
[CHAP. XVI
INTRODUCTION TO MATHEMATICAL PROBABILITY
1
F(t)
=
y'2; (u 2?r) "
If
·
·
·
f
"'•'+"'•'+ 2cr• •
e
•
•
+zn'
dx,dx2
·
·
·
dx,.
extended over the volume of the n-dimensional ellipsoid
(x1 - 8) 2 + (x2 - 8)2 +
Let
·
·
+ (x,. - 8)2 < t.
·
whence = 0
U1 + U2 + ' ' ' + U,. and
xl + x: +
Taking
u,, u2,
Jacobian J of
=
J
1 1 1
.
.
·
.
·
+ x�
·
1t,._,,
u\ + u: +
=
·
+ u� +
·
·
and 8 for new variables, we must first find the
x,, x2, . . . x,.
with respect to
u 1, u2, . . . Un-1, 8.
1
0
0
0
1
1
0
0
. . 0
0
1
0
0
1
0
1
0
. . 0
0
0
1
0
1
0
0
0
1
1
0
0
0
1
1
-1
-1
-1
. . -1
n
0
0
0
0
In the new variables the expression for
F(t)
F(t)
=
n82.
It is
will be
u•'+ . • 2;, . +u. d8du,du2 -;:: -'"'+ ... f f f : (u 2?r)" e
•
•
. du _ ,. ,
and the domain of integration in the space of the new variables is defined by
ul + u � +
-oo<8
·
·
+ u �1 + (u1 + u2 +
·
·
·
+ u,._1)2 < t.
·
After performing the integration with res pect to 81 we get
Vn F(t) (uy'2;)n-l The quadratic form lfJ =
If
ul + u� +
·
·
·
·
·
·
f - Ut'+u••+ e
•
2.-'
. . +un•
+ u !_1 + (u, + U2 +
can be represented as a sum of the squares of variables
u,, u2,
. u,._,: lfJ =
The Jacobian
vl + v� +
·
·
·
du,du2
·
·
·
(n - 1)
+ v!_1•
·
·
·
du,._1,
+ Un-1) 2 linear forms in
SEc. 2)
335
DISTRIBUTION OF CERTAIN FUNCTIONS
o (vl, V21 Vn-1 ) o(ul, U2, ... Un-1) •
•
•
is the square root of the determinant of the form If', which is the same as the determinant of linear forms
! 8��' = 2u1+u2
+
2ou1 ! Olf' = u1+ 2u2 + 2 OU2
!� 2 O Un-1
+un-.t +Un-1
=u1+u2+
Now, in general p times
xu . . lh1 . .
1 1
= (X
1)P-1(X+ p
-
-
1)
111 . . so that the determinant of If' is
=n, whence
o(vl, V21 o(u1, U21 and
•
•
• • Vn-1) = Vn • • Un-1)
o(u1, U21 • • • Un-1) = o(V11 V21 • • • Vn-l)
Therefore, taking
v1, v2, • •
1
yn:
Vn-1 for new variables, F(t) can be expressed
as follows
F(t) =
1
(uyl2; 27!-)n-1
If
· · ·
f
e
·
vt•+vo•+ . . . +••-•' :z.r•
dv1dv2
·
·
·
dvn-1
where the integral is extended over the volume of the sphere
vf + v� +
·
·
·
+v�_1
<
t.
This multiple integral is exactly of the type considered in the preceding problem, and it can be reduced to a simple integral as follows
336
INTRODUCTION TO MATHEMATICAL PROBABILITY
[CHAP. XVI
Mter substitution, the final expression of F(t) is F(t)
1
(
=
(crv/2)n-tr n; F(t)
=
3. Problem 3. Variables As in Prob. 2, we set
)
1
0
J:' - "2 'u--2 du u
for
t>O
o
for
Xt, x2,
•
t � 0. •
X1 + X2 +
8
n-3
e
•
Xn
•
•
i
Xn
+
'
=
n
are defined as in Prob. 1.
1, 2, . .
=
.
n
and introduce the quantity E
=
�u12 + u22 + .
. . +
n
un2 ,
What is the distribution function of the ratio
8 E
or, which is the same, the probability F(t) of the inequality
8
< tE?
First, assuming t to be positive, let us find the probability
Solution.
1/>(t) of the inequality
or ull1
+ u22
+
... + un2
< =
n82
t2
-·
This probability can be presented in the form 1/>(t)
=
n � cr ( 2Jr)n
where the multiple integral
'¥(8) in which
=
II
·
I
L..
-2"-q,(8)d8
·
·
0
ut•+u•'+ 2u• ·
e
ns• e
+u.•
du1du2
•
·
•
dun-l
SEC. 3]
DISTRIBUTION OF CERTAIN FUNCTIONS
337
is extended over the domain
Proceeding in exactly the same manner as in Prob. 2, we can transform
'IJI(s)
into
'lt(s)
If
=
· · ·
Vn 1
f
e
dv1dv2 · · · dv,._1
2u•
-···+···+
.
.
.
+··-··
extended over the sphere
in the space of the variables v1, v2, .
Vn-1·
For this multiple integral
we can substitute a simple integral n-1
ns'
n-1
n-1
a
du =
r(�)J."•-,:'u•;' ��J.'•and thus reduce
'lt(s)
;;.,•-•d!
to the form
:Ms. n-1
"'(·)
=
n-2
a
·· -
:::,.-'d,.
After substitution we can express cf>(t) as a repeated integral
cf>'(t)
=
338
INTRODUCTION TO MATHEMATICAL PROBABILITY
[CHAP. XVI
whence
Now
q,(+
so that C
=
1
00
)
=
.. J- '"(1 + z2)i
2-
dz
0;
v;rr(n--1 ) r(�)
=
and
q,(t)
=
1
-
r v;rr(n(�); 1) J'- '"(1 + z2)2 dz
n
Such is the probability of the inequality
� tE.
8
The probability F(t) of the inequality
will be
1
- ,P(t)
8
<
tE
or
F(t)
=
v;rrr(;) (n; 1) J-' '"(1
but this is established only for positive t. for negative tas well .
Fortbeing negative 8
<
-TE
is entirely equivalent to -8 >
and its probability is evidently
TE
dz n' + z2)2
However, this result holds -T the inequality
=
SEc.
4)
339
DISTRIBUTION VF CERTAIN FUNCTIONS
But
which permits of writing the preceding expression for
(n)�r(n; 1) i r
F(
-
)
r
=
.
2
F( -r)
as
follows·
n oo
(1
+
z2) -Zdz
=
T
Thus, no matter whether
t
is positive or negative, the distribution func
tion of the ratio 8 E
or the probability of the inequality
s
by F(t)
The distribution of the quotient 8/E was discovered by a British statistician who wrote under the pseudonym "Student," and it is com monly referred to as "Student's distribution."
The first rigorous proof
was published by R. A. Fisher.
4. Problem 4.
Variables x,
y are in normal correlation.
A sample of
x., Yn is taken and the" correla corresponding pairs, Xt, Yti x2, Y2i tion coefficient of the sample" is found by the formula
n
.
p
=
•
•
2:(x; - s)(y; - s') y2:(x; - 8)2 2:(y ; - 8 )
--�====���==�� ' 2 •
where, for the sake of abbreviation, 8
=
+ Xn Xt + X2 + ' ����----�---
n
.
, 8
Yt + Y2 +
·
=
n
+ Yn,
INTRODUCTION TO MATHEMATICAL PROBABILITY
340
[CHAP.
XVI
Find the distribution function of p, that is, the probability P of the
R for a given R( -1 < R < 1). Solution. Since the expression of p is homogeneous in x1, xz, x ,.; uz 1. Also without loss of general y,. we can assume u1 y1, y2, 0 Denoting by r ity the expectations of x and y may be supposed the correlation coefficient of x and y, the density of probability in the inequality p <
•
•
•
=
•
.
.
=
=
.
two-dimensional distribution will be:
1
1-(z1+y•- 2rzy) -2(1-r•)
=-.-::---=-. e
r2)l
27r(1 -
·
Hence the required probability will be expressed by the multiple integral
P
!!J f
1 =
·
·
·
(27r)"(1 - r2)2
f
e
-20�r•>dx1
·
·
·
dx,.dy1
·
·
·
dy,.
extended over the 2n-dimensional domain
2;(x, - 8)(y, - 81)
(3)
<
Ry2;(x, - 8)2
·
2;(y1 - 8')2
and
(4) 1, 2, ... n), respectively, by �x,, �y,, x,, y,(i we can write P thus: Replacing
P while
(3)
=
=
<\;,.}:)iJ J
and
(4)
·
·
·
J -4"'
dx1
e
·
·
dx,.dy1
·
·
·
·
dy,.
still hold but with the new notation for the variables.
Let us set now Xi- 8
=
Y•- 81
Ui,
v,,
=
then
U1 + Ug + ' ' ' + U,. Introducing 81 81;
u1, U21
find as in Sec. 2
•
•
•
=
0,
Un-1
v1 + ;
V11 Vz1
•
Vz •
+ •
·
·
·
+ v,.
=
n82
+
n812 - 2nr8s' +
0.
Vn-1 as new variables, we
where
1/1
=
};ul + };vl -
2r };u.v,
4)
SEc.
341
DISTRIBUTION OF CERTAIN FUNCTIONS
and the domain of integration is defined by the inequalities -
s
oo <
< oo;
l:u,v,
<
s'
-oo <
< oo
Ryl:ul· l:v[.
Now by the same linear transformation the quadratic forms
l:v[
n
(each containing
-
1
l:u[,
independent variables) can be transformed
into n-1
n-1
�wl,
� zil
i=I
i=1
""' 2.
at the same time n
n-1
�u.-v,
�WiZi·
=
i=1
Proceeding as in Sec.
2
and noting that
J f.,. e oo
-oo
we find that
rn2)-1 2
i=1
n (•'+•'' - 2 rss') 2
--
n-1
p
=
(1
-
(2'11')
where
dsds'
2'11' ' n�
=
-oo
If
· · ·
x =
J
1
e
-nx 2
dw1 · · · dwn-1dZ1 ·
·
•
dzn-1
l:wl + l:zl - 2r l:WiZi
and the domain of integration in the space of
2n
-
2 dimensions is defined
by
l:wiZi
<
Rv l:w� · l:zf.
We shall integrate now in regard to variables system of values
w1, w2,
transformation
Z1 Z2 Zn-1
. . . Wn-1·
z1, z2, . . . Zn-1 for a fixed
To this end we use an orthogonal
=
C1,1� 1 + C1,2�2 + ' ' ' + C1,n-1�n-1 C2,1�1 + C2,2�2 + ' ' ' + C2,n-l�n-1
=
Cn-1,1�1 + Cn-1,2�2 + ' ' ' + Cn-1,n-1�n-1
=
in which the elements of the first column are
C;,l
=
� VW +
Wi .
,
•
+ W�1
Wi W
342
INTRODUCTION TO MATHEMATICAL PROBABILITY
Defining
��, �2,
•
•
[CHAP. XVI
�n-1 by
•
WI = C1,1�1 + C1,2�2 + W2 = C2,1�1 + C2,2�2 + we shall have � � =
w, �2 =
=
orthogonal transformations
+ C1,n-l�n-l + C2,n-l�n-l
�n-1
By the properties of
0.
=
so that for a fixed system of values integration in the space of variables
w1, w2, r1, r 2, •
•
•
•
Wn-1 the domain of
rn-1 will be
•
(5) Thus we must first evaluate the integral
J = If
·
· · f e- Hl"•'+
..
.
+!"·-•'>- ll"•'+rwt•dt 1dt2
If rl < 0 no restriction is imposed upon r2,
r� + · · · + r�-�
(�
>
2
•
•
- 1
.
•
•
dtn-1·
•
tn-li if rl
)n
Consequently the result of integration in regard to t2,
•
•
•
>
o, then
tn-1 can be
presented thus:
where the inner integral is extended over the domain
r� + · · · + r��
(�2 )n - 1
<
Making use of formula (1), Sec. 1, the expression of
and cis a constant.
J reduces to r'w' -
J = ce
2
-
n-2 211' 2
r(
n
l.. r•• r for' ( )
;2
o
- 1-
e 2
+ rw ,
dtl
o
1 -.-1 R
)l
••
--
e 2vn-adv.
This has to be multiplied by
1 --(1 (21!')n-l
-
n-1
_!:Ew•'
r2 ) 2 e 2
dw1 · · · dwn-1
and integrated over the whole space of the variables The resulting expression for P will be
w1, w2,
Wn-i·
SEc.
4]
343
DISTRIBUTION ,OF CERTAIN FUNCTIONS
P - const. -
n-1 n-2 2'11'-2 (1- r2)-2
(2'11')"-lf where
M
=
fo
..
(n ; 2)If e
_!r,•+rwi• ds1 2
J:r.(�-1)1 0
e
-� 2v"-3dv •
Now we differentiate in regard toR, reverse the order of integrations, and make use of formula
Sec. 1; the resulting value of dP/dR will
(2),
then be expressed as a double integral
n-4 n-1 -2(1- 1'2)_2 _(1- R2)_2 _
1
dP dR
'II'
=
or
dP �
) n-2 (n-1 Lo Lo 2,_3r ( 2 r 2 ) n-1
=
'"
00
t +u')+Rrtu e -!(' 2
(tu)" -2dtdu,
n-4
2 f .. f .. e- H'+ ' + tu (1 - 1'2)_2_(1- R2)__ t u l Rr (tu)n-2dtdu d�-� � Jo . '
since
r
)
)(
(
n-1 n-2 r --2 2
=
y; ,. r(n-2). 2 _3
In the double integral we make transformation to new variables E, 11
defined by
E
t u
= _,
11
=
tu.
The Jacobian oft, u in regard to E, 11, beinw 72E-1, we have
.. .. e -�<•t +u•> +Rrtu(tu)n-2dtdu
So So
- 2 1 r(n -
- 1)
f Jo
..
and so, finally,
dp dR -
In case r
=
=
2
E
(
�
=
i
So .. So
..
e -�
E-ldE 1 E- 1 r "- - R
)
n-1
'II'
r(n - 1)
n-4
�(1- r2)_2_(1- R2)_2 _
:
(E+t'-Rr) 1/n-2dTJ
Lo
E
=
dt Jf.. o (cht- Rr)"
1'
dt (cht- Rr)" 1•
oo
0, that is, when the variables x, y are uncorrelated, we have
a very simple expression of P:
344
INTRODUCTION TO MATHEMATICAL PROBABILITY
p
In case
=
n-2-1) (v;r(n; 2) f r
[CHAP.
XVI
n-4
R
-1
(1 - p2)_2_dp .
r :;6 0 the integral
f"' Jo (cht can still be found in finite form.
"' Lc
whence
f "' J o (cht
dt - Rr)"
dt ht - Rr
1
=
=
V1
- 2)
We have, in fact,
1
_
[ -2 7r
R2r2
dn-2
r-
(n
dt - Rr)" 1
I dRn-2
{[1
.
+arc sm
]
(Rr) ,
[
. - R2r2)-! 27r + a.rc sm (R1·)
]}
'
and so
P =A
J_R1 (1 - p2)n;':;:�2{[1 -
where
A When
n
=
p2r2]-
(1 1r(n- 3)1
r-
{�
+arc sin
(rp)
]}
dp,
n-1
- r2) -2-.
is an even number, this integral appears in a very simple finite
form, but in case of an odd type appear.
n
certain integrals of a rather complicated
Besides, the behavior of P for somewhat large
be easily grasped by using this integral expression for P.
n
cannot
6. Fisher, who was first to discover the rigorous distribution of the
correlation coefficient, called attention to the fact that, setting
thz
=
2:(x, - s) (y, - s') y2:(x, - s)22:(y; - s')2
,
the distribution of z will be nearly normal even for comparatively small values of
n.
Let us set
p Instead of
_
n
thR = w, th!
- 2
- -7r-
"' f"'
J .. Jo _
=
r; then P can be expressed thus:
chzdtdz . (chtchzch! - sh!shz)" 1
t it is convenient to introduce a new variable chtchzch! - sh!shz = r-1ch(z - !).
r
so that
SEc. 5]
345
DISTRIBUTION OJ<' CERTAIN FUNCTIONS
chz) l J-"' .. (chr [chCz - mn
Then
P
=
dz
n- 2 'lrV2
where
P for all values of
z
ch(z + r) - 2chzchr
I
fl-r-1(1 - -r)n-2d-r Jo v'1 - p-r
ch(w + r) 2chwchr
< =
under consideration.
Now
and
since
f l-r-1(1 - -r)n-2d-r < f \-1(1 - -r)n-2(1 + p-r)d-r • )o )o v'1 - p-r for 0 < p < 1 as can be easily verified. Consequently dz p (n - 2)r(n - 1) "' chz l -
( ) - !) J .. chr [ch(z - r)]n-t [ 1 ch(w r) •
_
y'21rr(n
-
+
·
+
]
8
0<8<1.
2chwchr 2n - 1 ;
As to the integral in this formula, its approximate expression, omitting
f..
terms of the higher order, is:
-oo
e
_2n-a<•-t>• 4
t
dz- hr e 2n-3
Thus for somewhat large
n
_2n-a<.. -r>• 4
•
the required value of P can be found with
the help of a simple approximate formula. ·
The various distributions dealt with in this chapter are undoubtedly
of great value when applied to variables which have normal or nearly normal distribution. be doubted.
Whether they are always used legitimately can
At least the "onus probandi" that the "populations" with
which they deal are even approximately normal restswith the statisticians.
1. Show that
Problems for Solution
l+C �- -(n) L n -
n2
lim- -,. oo
n-->
2 2 r 2
0
n 8
nu n 1 --
2u2
du
' ::11 v'2rJl
=---=
HtNT: Liapounoff's theorem and Prob. 1, page 332.
oo
8
-
1
2
1
du
346
[ CHAP. XVI
INTRODUCTION TO MATHEMATICAL PROBABILITY
2. With the same assumptions and notations as in Prob.
3, page 336, show that the
distribution function of the quotient
Xi- 8 -E
;
i =
1, 2,
. . .
n
is
F(l)
{·T)n 2 r (· - n- 1r.. ( ; ) .y;;-=I
-
_
V?r(n-1)r F(t) =
1
t
if
>
�;
It is worthy of notice that for
ltl
if
�
Vn"=1
..
-
n = 4 the
F(t)
=
0
t<- � .
if
distribution is uniform.t
3. In two series of observations, samples
x., x2,
. . . Xn and y., Y21
•
•
•
Yn• from
the same normally distributed population (or of the same normally distributed vari able) are obtained.
8 =
X1
Denoting for brevity
+ X2 + ' n E
=
' ' + Xn
�(; ;;} +
�(Xi - 8 ) 2
find the distribution function of the quotient
F(t) =
r
8' =
Y +� Yn' 1 + Y2 + --.... .::_--'--.::_--'-...:.. ·
+
8 - 81 E
n'-
-
·
�(Yi - 81 ) 2) , Ans.
("Student").
(n + n'2 -1) 2) J' n+ r( .. (1 + 2
- r v ,.
·
�
dT n+n
•
'1'2
)
•
-1
2
References "STUDENT": The Standard Error of a Mean, Biometrica, vol. 6, 1908. R. A. FisHER: Frequency Distribution of the Values of the Correlation Coefficient in Samples from Indefinitely Large Population, Biometrica, vol.
---:On
10, 1915.
the Probable Error of a Coefficient of Correlation Deduced from a Small
Sample, Metron, vol.
1, 1920.
D. JACKSON: On the Mathematical Theory of Small Samples, Ame1·. Math. Monthly, vol. 1
42, 1935.
CoMMANDER LHOSTE: Memorial de l'artillerie fran9aise, pp.
245 and 1027, 1925.
APPENDIX I Let f(x) be a function with a f'(x) in an interval (a, b) where a and b > a are
1. Euler's Summation Formula. continuous derivative
arbitrary real numbers.
The notation n ;:i!b
!J(n)
n>a
n
will be used to designate the sum extended over all integers which are >a and �b. It is an important problem to devise means for the approxi mate evaluation of the above sum when it contains a considerable number of terms..
Let [x], as usual, denote the largest integer contained in a real number x, so that x [x] + 0 where 0, so-called "fractional part" of x, satisfies the inequalities =
0�0<1. Considered as functions of a continuous variable discontinuities for integral values of
x.
x, both [x] and
0 have
The function
[x] - x + t is likewise discontinuous for integral values of x. Besides, it is a periodic function of x with the period 1; that is, we have p(x + 1) p(x) for any real x. With this notation adopted we have the following p(x)
=
!
-
0
=
=
important formula: nSb
(1)
tJ(n) J)<x)dx + p(b)f(b) - p(a)f(a) - J>(x)f'(x)dx =
n>a
which is known as "Euler's summation formula."
Proof.
Let k be the least integer
>a and l the greatest integer �b.
The sum in the left member of (1) is, by definition,
f(k) + f(k + 1) +
.
. . + f(l)
and we must show that this is equal to the right member. we write first 347
To this end
348
INTRODUCTION TO MATHEMATICAL PROBABILITY
(H f.abp(x)f'(x)dx f."a p(x)f'(x)dx J(z bp(x)f'(x)dx J i-A: i 1p(x)f'(x)dx. j s:+1p(x)f'(x)dx LH1(j- �)f'(x)dx _f(j) �(j 1) fiH1f(x)dx J i=l-1
+
=
Next, since
� ..,
+
is an integer,
+
X+
=
+
+
=
+
and
i=l-1 i 1 n=l-1 ) f( f( ) �i + p(x)f'(x)dx _ k t l - � f(n) Lf(x)dx. +
=
1=A:
n=A:+1
On the other hand,
i"p(x)f'(x)dx i"(k- - x �)f'(x)dx _f�)- p(a)f(a) if(x)dx .[bp(x)f'(x)dx .[b(z- x �)f'(x)dx _!�) p(b)f(b) .[bf(x)dx, 1
=
+
+
=
+
+
=
+
=
+
so that finally
J.>(x)f'(x)dx
=
-f(k) - f(k + 1)
-
·
+
f(l) p(b)f(b) - p(a)f(a) J)<x)dx; ·
·
+
-
+
whence
n:S:b a J)<x)dx p(b)f(b) - p(a)f(a) - J:bp(x)f'(x)dx, n>�f(n) +
=
which completes the proof of Euler's formula.
Corollary 1.
The integral
JCzp(z)dz u(x) =
represents a continuous and periodic function of
u(x
+ 1)
If 0 �X� 1,
x
with the period
- u(x) J:+1p(z)dz Jo1p(z)dz JC1(i - z)dz =
=
=
1.
=
0.
For
349
APPENDIX I
11(x)
=
f'"o (2l - z)dz = x(l - x) 2
J
and in general
O'(X) where
0 is
a fractional part of
x.
0 �
=
0(1
- 0} 2
Hence, for every real
O'(X) � j.
Supposing thatf"(x) exists and is continuous in parts, we get
x
(a, b) and integrating by
J>(x)f'(x}dx O'(b)f'(b} - q(a)f'(a) - J:bO'(x)f"(x)dx, =
which leads to another form of Euler's formula:
S:b "'£f(n) J)<x}dx + p(b)f(b) - p(a)f(a) - O"(b)f'(b) +
n
n>a
=
+ Corollary 2.
If
11(a)j'(a) + J:O'(x)f"(x)dx.
f(x) is defined for all x � a and possesses a continuous (a, + oo); if, besides, the integral J: p(x)f'(x}dx
derivative throughout the interval oo
exists, then for a variable limit
(2}
n
b
we have
S:b
"'£f(n)
n >a
= C+
jf(b}db + p(b)f(b) + !"'p(x)f'(x)dx
where C is a constant with respect to It suffices to substitute for
b.
J:p(x)f'(x}dx the difference
£ p(x)f'(x)dx - h p(x)f'(x)dx oo
oo
and separate the terms depending upon
2. Stirling's Formula.
b
from those involving a.
Factorials increase with extreme rapidity
and their exact computation soon becomes practically impossible.
The
question then naturally arises of finding . a convenient approximate
350
INTRODUCTION TO MATHEMATICAL PROBABILITY
expression for large factorials, which question is answered by a celebrated formula usually known as "Stirling's formula," although, in the main, it was established by de Moivre in connection with problems on proba bility.
De Moivre did not establish the relation to the number 11" =
3.14159 ...
of the constant involved in his formula; it was done by Stirling.
(2)
In formula
it suffices to take
a
=
�.
J(x)
= log
x,
and replace b
by an arbitrary integer n to arrive at the remarkable expression log
(1
·
2
·
3
·
·
·
( �)
+ n+
n) = C
log n - n+
i ..p(x�dx
For the sake of brevity we shall set
where C is a constant.
f ..p(x)dx. Jn X
w (n) =
Now
f ..p(x)dx Jn X
=
and
f k+1p(x)dx Jk x
=
=
i
f"+1p(x)dx + "+2p(x)dx + Jn X n+l X
r-p(u)du = f ' p(u)du f1p(u)du = Jo u + k +J; u +k Jo u + k f' Ci - u)du + f1(!- u)du = ! f 1 (1 - 2u)2du . 2 j o(k+u)(k+1-u) J; u+k Jo u+k
Hence
where
F (u) ,.
=
1
� (k +u)(k + 1- u)' k=n
Since
(k +u)(k + 1 - u) it follows that for 0 < u < � (k (k Thus for 0 <
+ +
u
u)(k + 1 - u) u)(k + 1 - u) <
�
> <
=
k(k +1) +u - u2,
k(k +1) (k +!)2
<
(k +!)(k +t).
351
APPENDIX I
..
1 1 Fn(u) < � k(k + 1) =n �
A:=n
..
Fn(u)
>
1
�
� (k + i)(k +f) A:-n
1 =
n +
f
Making use of these limits, we find that
w(n)
<
w(n)
>
1 r• -2u) 2du = 12n 2n)o 1 1 r• -2 2d u) u = 12(n+ !)' 2n + 1)o
1 (1
(1
and consequently can set
( )
w n
where
=
1 12(n + 8)
0 < 8
<
i.
Accordi!lgly log
(1 . 2·3 .
·
( + �)
· n) = C +
n
lo n g - n +
1 12(n + IJ)
•
The constant C 4epends in a remarkable way on the number
'II'.
To show this we start from the well-known expression for 'II' due to Wallis:
�
2
=
l im
2n . .i.i . . . 2n-1 2n+1 133 5
�).
(� . �
n--+
co
which follows from the infinite product sin by taking
x2 x=x 1'11'
(
1
- '11' 9 2
)( - �)(1 �) .
2
4'11'2
. .
x='II' /2. Since
2 2 4 4 · · · ···
I 3 3 5
2 2·4·6 · · 2n 1 2n 2n · = · · ··· 13 5 2n - 1 2n + 1 (2n-1) 2n + 1
[
·
]
we get from Wallis' formula
y;;:
=
li m
[
1 2·4·6 ..· 2n __ · · 5 ·· (2n - 1) Vn , 13 ·
]
n-+co,
On the other hand,
2·4·6 ··· 2n=2n·1·2·3··· n 1·2·3·· ·2n 1 . 3 . 5 . . . (2n - 1) = n .1 . . 2 2 3 . . . n'
352
INTRODUCTION TO MATHEMATICAL PROBABILITY
so that ;= v1r _
. =hm
{
}
22"(1·2 ·3 ...n)2 1 ·-, 1·2·3 . . ·2n Vn
n--+
Q)
or, taking logarithms log v; = lim [2n log 2 + 2 log (1·2·3 · . n) - log (1·2·3 ··· 2n) - ! log n ] .
But, neglecting infinitesimals,
log (1 ·2 ·3 ··· n ) = C + (n + !) log n - n log (1 2·3 ··· 2n) = C + (2n + !) log 2n 2n ·
whence
-
lim [2n log 2 + 2 log (1·2·3 ··· n ) - log (1·2·3 ·· 2n) - ! log n] = C - ! log 2. Thus logy.;;: = C - ! log 2, and finally (3)
log (1·2·3 · · n ) =log y27r +
C =log
(n + �) +
y27r
log n - n +
1
12(n + 8);
This is equivalent to two inequalities
;
e12 +6 <
1 .2 .3 . . . n
�n"e-n
�
< el n
which show that for indefinitely increasing n lim
1·2·3 ·· · n
�n"e-n
=l.
This result is commonly known as Stirling's formula. For a finite n we have 1 .2 .3 . . . n = �n "e" . where 1
The expression
12(n + !)
<
n) "'(
<
1 12n
'
ew(n)
0 < 8 <
1
2'
353
APPENDIX I
1 2 3 n for large n in the sense that the ratio of both is near to 1; that is, the relative error is
is thus an approximate value of the factorial small. large
n,
·
·
On the contrary, the absolute error will be arbitrarily large for but this is irrelevant when Stirling's approximation is applied
to quotients of factorials. In this connection it is useful to derive two further inequalities. Let
m
<
n;
we have, then, k=n-1
Fm(u) - F,.(u)
�
=
k=m
u
and further, supposing 0 <
1 (k+ u)(k+ 1 - u)i
Y2,
<
k=n-1
Fm(u) - F,.(u)
<
>
1 m
1 n
k(k
�
1 1 (k + !) k+ t) =m+!-n+f
k=m k= n-1
Fm(u)- F,.(u)
� 1)
�
�
k=m
Hence,
w(m) - w(n)
<
1 1 12m - 12n'
w(m) - w(n)
>
1 12(m+ !)
1
12(n +
i)
and, if lis a third arbitrary positive integer,
w(m) + w(l) - w(n)
<
w(m) + w(l) - w(n)
>
1 1 1 + 12l- 12n 12m 1 1 + 12(l + !) 12(m + !)
3. Some Definite Integrals.
1
12 (n + !)
·
The value of the important definite
integral
fo .. e-''dt can be found in various ways.
One of the simplest is the following: Let
J = n
in general where
n is
fo .. e-t't"dt
an arbitrary integer � 0.
can easily establish the recurrence relation
J,.
=
n-1 --Jn-2i 2
Integrating by parts one
354
INTRODUCTION TO MATHEMATICAL PROBABILITY
whence
J2m
=
J2m+l
=
1 ·3 ·
5
·
.
·
2m
1·2· 3 ·
(2m
- 1)
Jo
m
2
On the other hand,
J..+t + 2'XJ.. + )..,21 .._1
=
.fo .. e-••tn-1(t + X)2dt,
which shows that
Jn+l + 2'XJ.. + X2Jn-l > 0 for all real
X.
Hence, the roots of the polynomial in the left member are
imaginary, and this implies
J! Taking n for
J2m
2 ·4
1·
3 ·5
=
·
6
2m and n
J2m+l,
and ·
·
=
<
Jn+IJ n-1•
2m + 1 and using the preceding expression
we find . 1
·2m
1) y4 m + 2
(2m-
<
Jo
2 ·4·6 ·· ·2m
<
1 3 ·5 ·
··· (2m-
_1_. 1) .yTni
But . lim
m-
,.
1 ·3
· 5
·
hence
Jo Here substituting
t
1 -- 1) ym -
2 4·6···2m
=
=
yau,
(2m -
·
·
Jo .. e-••dt
where
a
=
-� v '��' ''
!y';.
is a positive parameter, we get
r.. .d J o e--a" u
=
1 r; 2\ja'
As a generalization of the last integral we may consider the following one:
v
=
..
Jro e-au• cos budu .
The simplest way to find the value of this integral is to take the derivative
dV •
-
=
-
L 0
.. e--ou•
sin
bu ·udtt
.
and transform the right member by partial integration.
The result is
APPENDIX I
dV db
-=
or
b -V
2a
b• d(Veiii)
=
whence
V
=
(V),_o
=
0,.
b•
=
Ce -44.
To determine the constant C, take C
355
b
=
0; then
f e-au'du Jo ..
=
f!, 2\ja
!
so that finally
f e-au• cos budu Jo ..
=
f!.e -�. 2\ja
!
The equivalent form of this integral is as follows:
APPENDIX II
METHOD OF MOMENTS AND ITS APPLICATIONS To prove the fundamental limit theorem
1. Introductory Remarks.
Tshebysheff devised an ingenious method, known as the "method of moments," which later was completed and simplified by one of the most prominent
among
Tshebysheff's
disciples,
the
late
Markoff.
The
simplicity and elegance inherent in this method of moments make it advisable to present in this Appendix a brief exposition of it. The distribution of a mass spread over a given interval characterized by a never decreasing function
rp(x),
(a,
b) may be
defined in
(a,
b)
and varying from rp(a) = 0 to rp(b) = mo, where mo is the total mass con tained in (a, b). Since rp(x) is never decreasing, for any particular point
xo,
both the limits lim lim
rp(xo - E) rp(xo + E)
exist when a positive number
E
= =
tends to
rp(xo - 0) rp(xo + 0) Evidently
0.
rp(xo - 0) � rp(xo) � rp(xo + 0). If
rp(xo - 0) then
x0
=
rp(Xo + 0)
is a "point of continuity" of
rp(Xo + 0) Xo
is a point of discontinuity of
rp(x).
In case
rp(xo - 0),
>
rp(x),
rp(Xo),
=
and the positive difference
rp(xo + 0) - rp(xo - 0) may be considered as a mass concentrated at the point x0• In all cases rp(x0 - 0) is the total mass on the segment (a, xo) excluding the end point Xo, whereas rp(xo + O) is the mass spread over the same segment including the point
Xo.
The points of discontinuity, if there are any, form an enumerable set, whence it follows that in any part of the interval (a, b) there are points of continuity. If for any sufficiently small positive
rp(xo + E)
>
E
rp(xo
-
)
E ,
xo is called a "point of increase" of rp(x). There is at least one point of increase and there might be infinitely many. For instance, if 856
357
APPENDIX II
IP(x) IP(x) then
c
0
for
mo
for
=
=
is the only point of increase.
1.0(x) every point of the interval
On the other hand, for
x-a b-a
mo-
=
(a, b)
a�x�c C <X� b,
is a point of increase.
In case of a
finite number of points of increase the whole mass is concentrated in these points and the distribution function
IP(x)
is a step function with a
finite number of steps. Stieltjes' integrals
J:d1,0(x)
=
mo,
represent respectively the whole mass mo and its moments about the origin of the order 1, 2, . . . i. When the distribution function IP(x) is given, moments mo, mt, m2, . . . mi (provided they exist) are deter mined.
If, however, these moments are given and are known to originate
in a certain distribution of a mass over
(a, b),
the question may be raised
with what error the mass spread over an interval by these data?
(a, x) can be determined
In other words, given mo, m1, m2, . . . mi, what are the
precise upper and lower bounds of a mass spread over an interval
(a,
x)?
Such is the question raised by Tshebysheff in a short but important article "Sur les valeurs limites des integrates"
(1874).1
The results contained
in this article, including very remarkable inequalities which indeed are of fundamental importance, are given without proof. The first proof of these results and the complete solution of the question raised by Tsheby sheff was given by Markoff in his eminent thesis "On some applications of algebraic continued fractions"
(St.
Petersburg,
1884),
written in
Russian and therefore comparatively little known. Suppose that Pi is the limit of the error with which we can evaluate the (a, x) or, which is almost the same, the
mass belonging to the interval value of
IP(x),
when moments mo, m1, m2, . . . mi are given.
If, with i
tending to infinity, Pi tends to 0 for any given x, then the distribution function IP(x) will be completely determined by giving all the moments
One case of this kind, that in which mo
1
=
1,
m2,.
( 2_k_-_1...!_) , 1 _3 _. _5 _· ---=:---'-
__· =
Jour. Liouville, Ser. 2, T. XIX, 1874.
2k
APPENDIX II METHOD OF MOMENTS AND ITS APPLICATIONS 1. Introductory Remarks. To prove the fundamental limit theorem Tshebysheff devised an ingenious method, known as the "method of moments," which later was completed and simplified by one of the most prominent among Tshebysheff's disciples, the late Markoff. The simplicity and elegance inherent in this method of moments make it advisable to present in this Appendix a brief exposition of it. The distribution of a mass spread over a given interval (a, b) may be characterized by a never decreasing function cp(x), defined in (a, b) and varying from cp(a) = 0 to cp(b) = mo, where mo is the total mass con tained in (a, b). Since cp(x) is never decreasing, for any particular point xo, both the limits lim cp(xo - e) = cp(xo 0) lim cp(xo + E) = cp(xo + 0) -
exist when a positive number
cp(xo
-
E
tends to 0.
Evidently
0) ;;;;; cp(xo) ;;;;; cp(xo + 0).
If
cp(xo - 0)
=
cp(xo + 0)
then x0 is a "point of continuity" of cp(x).
cp(xo + 0)
>
cp(xo
=
cp(xo),
In case -
0),
x0 is a point of discontinuity of cp(x), and the positive difference cp(xo + 0) - cp(Xo
-
0)
may be considered as a mass concentrated at the point x0• In all cases cp(x0 - 0) is the total mass on the segment (a, xo) excluding the end point x0, whereas cp(x0 + 0) is the mass spread over the same segment including the point Xo. The points of discontinuity, if there are any, form an enumerable set, whence it follows that in any part of the interval (a, b) there are points of continuity. If for any sufficiently small positive E
cp(xo + E) > cp(xo
-
)
e ,
x0 is called a "point of increase" of cp(x). There is at least one point of increase and there might be infinitely many. For instance, if 356
357
APPENDIX II
IP(x)
=
IP(x)
=
for
0
a�x�c C <X� b,
for
mo
then c is the only point of increase. On the other hand, for IP(X) every point of the interval
=
x-a 1no;-b
-a
(a, b) is a point of increase. In case of
a
finite number of points of increase the whole mass is concentrated in these points and the distribution function IP(x) is a step function with a finite number of steps. Stieltjes' integrals
represent respectively the whole mass mo and its moments about the
origin of the order
1, 2, ...i.
When the distribution function IP(X)
is given, ·moments mo, m1, m2, . . . m. (provided they exist) are deter mined. If, however, these moments are given and are known to originate in a certain distribution of a mass over
(a, b), the question may be raised (a, x) can be determined
with what error the mass spread over an interval by these data?
In other words, given mo, m1, m2, . . . m., what are the
precise upper and lower bounds of a mass spread over an interval
(a, x)?
Such is the question raised by Tshebysheff in a short but important article 11 Sur
les valeurs limites des integrales"
{1874)
.
1
The results contained
in this article, including very remarkable inequalities which indeed are of fundamental importance, are given without proof. The first proof of these results and the complete solution of the question raised by Tsheby sheff was given by Markoff in his eminent thesis of algebraic continued fractions"
11 On
(St. Petersburg,
some applications
1884), written in
Russian and therefore comparatively little known. Suppose that P• is the limit of the error with which we can evaluate the mass belonging to the interval
(a, x) or, which is almost the same, the
value of IP(x), when moments mo, m1, m2, . . . m, are given. If, with i tending to infinity, P• tends to 0 for any given x, then the distribution
function IP(x) will be completely determined by giving all the moments
One case of this kind, that in which mo 1
=
1,
m2,.
=
5_ . · �<2::.:k :....::._ 1_· .::.3_ . · .::. . 1:!.), --= � 2k
Jour. Liouville, Ser. 2, T. XIX, 1874.
358
INTRODUCTION TO MATHEMATICAL PROBABILITY
'
was considered by Tshebysheff in a later paper, "Sur deux theoremes relatifs aux probabilites"
(1887) 1
devoted to the application of his
method to the proof of the limit theorem under certain rather general conditions.
The success of this proof is due to the fact that moments,
as given above, uniquely determine the normal distribution
IP(x)
=
-1y;
J '"
e-"'du
-oo
of the mass 1 over the infinite interval
(-
co,
+co).
After these preliminary remarks and before proceeding to an orderly exposition of the method of moments, it is advisable to devote a few pages to continued fractions associated with power series, for continued frac tions are the natural tools in questions of the kind we shall consider.
2. Continued Fractions Associated with Power Series.
q,(z)
=
At + A, f!%1 za
• +
Aa zaa
Let
+ . . . i
be a power series arranged according to decreasing powers of smallest exponent
at
is positive.
z where the
We consider this power series from a
purely formal point of view merely as a means to form a sequence of rational fractions
and we need not be concerned about its convergence. Evidently
1/ ,P(z)
can again be expanded into power series, arranged
according to decreasing powers of non-negative powers of
z,
z.
Let its integral part, containing
be denoted by
qt(z),
and let the fractional part
Bt + B2 + Ba + zll•
zil'
containing negative powers of
z,
·
·
·
be denoted by
1
,P(z)
zfJ•
=
- t/Jt(z),
qt(Z) - tPt(Z),
In the same way 1
tPt(Z) can be represented thus:
1
t/Jt(z) 1
=
q2(z) - t/J2(z)
Oeuvres compl�tes de P. L. Tshebysheff, Tome 2, p. 482.
so that
·
359
APPENDIX II
where q2() z is a polynomial and
cP2(z)
Ct C2 Ca z-r• + -r z • + z-r• +
=
a power series containing only negative powers of z.
Further, we shall
have
1 q,2(z)
=
qa(z) - c/Ja(z)
with a certain polynomial qa() z and a power series
z c/Ja()
=
D2 Dt Da ' z • + zh + ' z • + . .
containing negative powers of ,z and so on.
.
Thus we are led to consider a
continued fraction (finite or infinite)
1 ql-1 q2- qa-
(1)
.!
•
associated with q,() z in the sense that the formal expansion of
1
-
1 z q, - q,,()
into a power series will reproduce exactly 9() z .
(1)
The continued fraction
is again considered from a purely formal standpoint as a mere abbre
viation of the sequence of its convergents
The polynomials
Pt, P2, Pa, .. . Qt, Q2, Qa, .. . can be found step by step by the recurrence relations
(2 )
p, Q, Pt
Ql
= = = =
}
q.Pi-l- p._2 i q,Q._t - Qi-2 1, Po = 0 Qo = 1 qt,
=
2,
3, 4,
360
INTRODUCTION TO MATHEMATICAL PROBABILITY
from which the following identical relation follows:
(3) showing that all fractions
�
are
irreducible.
P,(z) Q,(z)
Evidently degrees
of consecutive denominators of
convergents form an increasing sequence and the degree of least i.
Q;(z) is at
Since
=
1
P;(q,+l - 4>>+I(z)) - P;-1 Q;(qi+1 - 4>>+1(z)) - Q'-1
=
=
P;+l - P;lf>;+J(z) Q'+1 - Q;lf>;+1(z)
we can write
q,(z)
=
P1+1 - P,q,,+l(z) Q;+1 - Q;lf>;+J(z)
in the sense that the formal development of the right-hand member is identical with
q,(z).
By virtue of relation
p,
4> (z) Q, The degree of
(3)
1 =
Q,(Q>+I - Q;l/>i+1)"
Q; being �� and that of Qi+1 being �i+1, the expansion of Q;(Q;+1 - Q;lf>i+l)
in a .series of descending powers of z begins with the power
z>-tH·1+�.
Hence,
p,
q,(z) - Q, and, since
M
=
z>.M->-1+1 +
�i+J ?; �; + 1, the expansion of q,(z) - p, Q;
begins with a term of the order characterizes the convergents
2�, + 1 in 1/z at least. This property P;/Q; completely. For let P/Q be a
rational fraction whose denominator is of the nth degree and such that in the expansion of
q,(z) - p
Q
361
APPENDIX II
the lowest term is of the order 2n + 1 in 1 Iz at least. Then PIQ coincides with one of the convergents to the continued fraction (1). Let i be determined by the condition
Then 4> (
z
)
P; - Q; P
4J(z) -Q
M zX.+X<+�
=
=
+
·
·
·
N + z 2n+ l
whence in the expansion of
the lowest term will be of degree degree of
2n + 1 or>.; +
>-•+t in
1/z.
Hence, the
PQ;- P;Q in
z is
not greater than both the numbers
>.;-n-1
and
which are both negative while
is a polynomial.
Hence, identically, PQ;- P;Q
=
0
or
which proves the statement. 3. Continued Fraction Associated with
ib d'P(x)X
Let
·
a Z-
IP(X)
be a never
decreasing function characterizing the distribution of a mass over an interval (a, b). The moments of this distribution up to the moment of the order 2n are represented by integrals mo
=
.C d'P(x),
m1
=
J:bxd'P(x), m2
=
.Cx2d1P(X),
•
·
•
m2n
=
J:bx2nd'P(x).
INTRODUCTION TO MATHEMATICAL PROBABILITY
362 Let
If rp(x) has not less than
n + 1 points of increase, we must have Lll > o, . . . Ll,. > o,
.do > 0,
and conversely, if these inequalities are satisfied, rp(x) has at least points of increase. To prove this, consider the quadratic form
in
n + 1 variables to, t1, ,P
=
.
.
•
n + 1
t,.. Evidently
};mi+;t,t;
(i, j
=
0, 1, 2, . ..n)
so that � .. is the determinant of ,P and �o, Ll 1, � .. -1 its principal minors. The form ,P cannot vanish unless to 0. t1 t,. For if x 0, we must have also � is a point of increase and q, •
•
•
=
=
·
·
·
=
=
=
=
. J:e+•(to + t1x + f-• for an arbitrary positive
(to + t11/ +
'
•
•
E1
·
·
+ t,.x")2drp(x)
·
=
0
whence by the mean value theorem
J:e+•drp(X) e-e
+ t,.f/")2
=
0 (� -
E
< 1/ < � + E)
or
to + tlf/ + . . . + t,..,.. = 0 because
J:e+•d�p(x) e-e Letting
E
>
0.
converge to O, we conclude to + lt � +
at any point of increase.
·
·
·
+ t,.�"
=
Since there are at least
0
n + 1 points of increase
the equation to + t1x + would have at least
·
·
·
+ t,.x"
=
0
n + 1 roots and that necessitates to
=
t1
=
·
·
·
=
t,.
=
0.
363
APPENDIX II
Hence, the quadratic form !/>, which is never negative, can vanish only if all its variables vanish; that is, !/> is a definite positive form.
determinant � .. and all its principal minors �n-t, �..-2, positive, which proves the first statement.
•
•
Its
�0 must be
•
Suppose the conditions
At> 0, ... A,.> 0
Ao> 0, satisfied and let
have 8 < n+ 1 points of increase.
cp(x)
Then the
integral representing !/> reduces to a finite sum
+ t..��)2 + + t..��)2 + P2(to + t1�2 + + + p ( to + h�. + + t,.�:)2 ·
·
denoting by p1, p 2 ,
•
•
•
·
·
·
,
·
•
·
p. masses concentrated in the 8 points of
increase �1, �2, �.. Now, since 8� n constants t0, t1, all zero, can be determined by the system of equations •
·
·
•
+ t..�� to + h�t + . . + to + tt�2 . . . + t..��
to + tt�• +
·
·
+ t..�:
·
=
=
=
•
•
•
t,., not
0 0 0.
Thus !/> vanishes when not all variables vanish; hence, its determinant
�..
=
0, contrary to hypothesis.
From now on we shall assume that increase.
cp(x)
has at least n +
1
points of
The integral
rb dcp(x) Ja Z- X can be expanded into a formal power series of
rb dcp(x) Ja z X -
=
1/z,
thus
m2n mo mt m2 ... + 2 +t + + + 2 + z3 zn z z
and this power series can be converted into a continued fraction as explained in Sec. 2. Let
Pt P 2 Q/ Q21 be the first n +
1
•
P,. P..+t ' Q,. Qn+J
convergents to that continued fraction.
degrees of their denominators are, respectively,
I say that the
1, 2, 3, ... n+ 1.
Since these degrees form an increasing sequence, it suffices to show that there exists a convergent with the denominator of a given degree
8�n+l. This convergent PjQ is completely determined by the condition that in a formal expansion of the difference
INTRODUCTION TO MATHEMATICAL PROBABILITY
364
ibdlp(X)X
_
1/z, terms involving 1/z, 1/z2,
into a power series of absent.
!:_ Q
a Z-
•
•
•
1/z2" are
This is the same as to say that in the expansion of
ibdfP(X)X- P(z)
Q(z)
a Z-
1/z, 1/z2,
there are no terms involving
•
•
•
1jz•.
The preceding expres
sion can be written thus:
ibQ(x)dfP(x) ibQ(z)- Q(x)dlp(x)- P(z) - X - X +
a
a
Z
Z
=
__!._ z•+l
+
Since
i bQ(z)- XQ(x)dfP(x)- P(z) a
is a polynomial in
z,
it must vanish identically.
P(z}
(4) To determine
Z-
That gives
ibQ(z)- XQ(x)dfP(X).
=
a
Z-
Q(z) we must express the conditions that in the expansion of
ibQ�)�fP;X) terms in 8
1/z, 1/z2,
•
•
•
ljz•
vanish.
These conditions are equivalent to
relations
which in turn amount to the single requirement that
(6) 8(x) of degree � 8- 1. Q(z) of degree 8 satisfying con ditions (5), and P(z) is determined by equation (4), then P(z)/Q(z) is a
for an arbitrary polynomial
Conversely, if there exists a polynomial
convergent whose denominator is of degree
ibdfP(X)
_
z- x
lacks the terms in
1/z, 1/z2,
•
•
•
1/z2•.
8.
P(z) Q(z)
For then the expansion of
365
APPENDIX II
Let
Q(z) Then equations
=
lo + l1z + l,z2 +
·
·
+ l,_1z•-1 + z•.
·
(5) become
molo + m1l1 + m2l2 + m1lo + m2l1 + mal2 +
·
·
·
·
·
·
+ m,_1l,_, + m, + m,la-1 + m,+l
ma-1lo + m.l1 + ma+1l2 + . . . + m2a-2la-1 + m2a-1
=
=
=
0 0 0.
This system of linear equations determines completely the coefficients lo, l1, . . . la-1 since its determinant �•-1 > 0. The existence of a convergent with the denominator of degree 8�n+l being established, it follows that the denominator of the sth convergent P,jQ, is exactly of degree 8. The denominator Q. is determined, except for a constant factor, and can be presented in the form:
Q,
c =
-
�a-1
1 z z2 mo m1m2 m1 m2ma
•
•
·
•
• •
•
•
•
z• m, ma+1
A remarkable result follows from equation (6) by taking Q 8
=
Q,.; namely, if
(7) while
(8 � n). In the general relation
Q.
=
q.Qa-1 - Q,_ll
the polynomial q, must be of the first degree
q,
=
a,z + fJ,,
which shows that the continued fraction associated with
ibdrp(x)
z-x
=
Q, and
366
INTRODUCTION TO MATHEMATICAL PROBABILITY
has the form 1
a1Z + �1-
1
a:aZ + ,.,R 2-
1
aaz +�a---:---=-
The next question is, how to determine the constants plying both members of the equation
a,
and �..
Multi
(s � 2) Q,_2drp(z), (7), we get
by
integrating between limits
0
=
a
and b, and taking into account
a.J:zQ,_1Q,_zdrp(z) - .CQL2drp(z).
On the other hand, the highest terms in
Q,_l and Q,_2 are
Hence,
zQ,_z
.
1
=
-Q,_1 + 1/1 ao-1
where ![I is a polynomial of degree �s
-
2.
Referring to equation (6),
we have
and consequently
(8)
a, aa-1
J:bQ;_2drp(z)
=
J:bQ;_ldrp(z).
mz,.; how m1, a, can be found? Evidently a1 1/mo. Fur thermore, Qo = 1 and Q1 is completely determined given mo and m1. Relation (8) determines a2, and Q2 will be completely determined given mo, m1, m2, m8. The same relation again determines a8, and Q3 will be determined giyen mo, m1, ... m5. Proceeding in the same way, we conclude that, given mo, m1, m2, ... m2,., all the polynomials' Suppose that the following moments are given: mo,
many of the coefficients
as
well as constants
•
=
.
.
367
APPENDIX II
can be determined.
It is important to note that all these constants are
positive. Proceeding in a similar manner, the following expression can be found
{3,
=
J:b zQ!-1d111(z).
-a,
(b Ja Q:_ldiP(Z)
It follows that constants
f311 f3s1
•
•
•
f3n For if B
are determined by our data, but not f3n+l·
=
n + 1, the integral
J:bzQ�diP(z) can be expressed as a linear function of mo, coefficients.
m1,
•
•
•
m2n+l with known
But msn+l is not included among our data; hence, f3n+1 ·
cannot be determined.
4.. Properties of Polynomials Q,.
Q,(z)
=
0
Theorem.
Roots of the equat{on
(s ;;!! n)
are real, simple, ancl. contained within the interval (a, b). Proof. Let Q,(z) change its sign r < B times when z passes through points z1, z2, Zr contained strictly within (a, b). Setting •
•
•
8(z)
=
(z - Zt)(Z - Z2)
•
•
•
(z - Zr)
the product
B(z)Q,(z) does not change its sign when
z increases from a to b.
J:8(z)Q,(z)d!l'(z)
=
However,
0,
and this necessitates that
B(z)Q,(z) or
Q,(z) vanishes in all points of increase of !II(Z). But this is impossible, n + 1 points of increase,· whereas the degree B of Q, does not exceed n. Consequently, Q,(z) changes its sign in the interval (a, b) exactly 8 times and has all its roots real, simple, and located within (a, b).
since by hypothesis there are at lea,st
It follows from this theorem that the convergent
Pn
Q,.
368
INTRODUCTION TO MATHEMATICAL PROBABILITY
can be resolved into a sum of simple fractions as follows:
P,.(z) =�+�+ . Q,.(z) z - Zt z - Zs
(9)
. z,. are
. +� z - z,.
.
roots of the equation
Q,.(z)
=
0
and in general
,. P�(zr.). A = Q,.(zr.) The right member of coefficient of
1/z"'
can be expanded into power series of
(9)
being
1/z,
the
"
�A..z�1•
a=l
By the property of convergents we must have the following equations: n
�A..
=
mo
=
m1
a�l "
�A ..z..
a=l n
�A..z!n-1 = msn-l•
a-1
These equations can be condensed into one, n
�A..T(z..) = J:T(z)drp(z)
(10)
a=l
which should hold for any polynomial Let us take for
T(z)
T(z)
of degree � 2n
-
1.
a polynomial of degree 2n - 2:
T(z) -
[ (z
Q,.(z) z..)Q�(z..)
-
]2
•
Then
T(z..)
=
1,
T(zfl)
·=
0
and consequently, by virtue of equation
A.. = Thus constants At, A2,
fb[ (z .
.
.
.
(10),
]2
Q,.(z) drp(z) > 0. z.. )Q�(z..)
_
A,.
if
are. all positive, which shows that
.
P,.(zr.)
369
APPENDIX II
has the same sign as
Q�(zr.).
Now in the sequence
Q�(zt), Q�(z2), ...Q�(z,.) any two consecutive terms are of opposite signs.
The same being true of
the sequence
P,.(zt), P,.(z2), ...P,.(z,.), P,.(z)
it follows that the roots of
are all simple, real, and located in the
intervals
(Zt1 Z2); (z2, Za); . ..(Zn-11 z,.). Finally, we shall prove the following theorem:
Theorem.
For any real x
Q�(x)Qn-t(X) - Q�_1(x)Q,.(x) is a positive number. Proof. From the relations
Q.(z) Q.(x)
=
=
(a.z + {3.)Qs-t(Z) - Q._2(z) (a.x + {3.)Q,_t(x) - Qa-2(x)
it follows that
Q.(z)Q.-t(X) - Q.(x)Qa-t(Z) =
z-x
a.Q._1(z)Q._1(x) + +
whence, takings
Q .-t (z) Q .-2 (x)
-
Q. _t (x)Qa- 2(z)
z-x
1, 2, 3, ... nand adding results,
=
•=1
It suffices now to take z
=
x to arrive at the identity. n
Q�(x)Q,._t(X) - Q:,_1(x)Q,.(x) Since
Q0
=
1 and
a.
>
=
! a.Q._ (x) 1
2•
•-1
0, it is evident that
Q�(x)Q,._t (X) - Q:,_1(x)Q,.(x)
>
0
for every real x.
6. Equivalent Point Distributions.
If the whole mass can be con
centrated in a finite number of points so as to produce the same l first moments as a given distribution, we have an "equivalent point distribu-
.
·370
INTRODUCTION TO MATHEMATICAL PROBABILITY
tion" in respect to the l first moments.
In what follows we shall suppose
that the whole mass is spread over an infinite interval
- oo,
oo
and that
the given moments, originating in a distribution with at least n + 1 points of increase, are
The question is: Is it possible to find an equivalent point distribution where the whole mass is concentrated inn + 1 points?
Let the unknown
points be ��. ��. . . . �..+1 and the masses concentrated in them
At, A2, ... A,.+l· Evidently the question will be answered in the affirmative if the system of 2n + 1 equations
n+l
�
Aa a=l n+l Aa�a a=l n+l Aae: a=l
�
(A)
�
=
=
=
mo m1 ms
n+l
� AaE!"
=
a=l
m2,.
can be satisfied by real numbers E1, � � • . . • �..+Ji A1, A2, • • • An+11 the last n + 1 numbers being positive. The number of unknowns being greater by one unit than the number of equations, we can introduce the additional requirement that one of the numbers ��. ��. . . . �..+1 should be equal to a given real number v.
(A) may be replaced by
The system
the single requirement that the equation
n+l (11)
� AaT(�a) f_"'.., T(x)dlf'(X) =
a-1
shall hold for any polynomial
T(x) of degree �2n. Let Q(x) be the �n+l and let B(x) be e2,
polynomial of degree n + 1 having roots��. an arbitrary polynomial of degree �n
-
1.
(11) to
T(x)
=
B(x)Q(x).
•
.
•
Then we can apply equation
APPENDIX II
Since
QU..)
(12)
= 0, we shall have J_"'.., 8(x)Q(x)d�P(x)
for an arbitrary polynomial see that requirement
(12)
371
0
=
8(x) of degree �n - 1. Presently we shall Q(v) 0 determines Q(x), save
=
together with
for a constant factor if
Q,.(v) ¢ 0. Dividing
Q(x) by Q,.(x), we have identically Q(x)
where
=
(Xx + p.)Q,.(x) + R,.-I(x)
R,._I(x) is a polynomial of degree �n - 2,
-
1.
If
8(x) is an arbi
trary polynomial of degree �n
(Xx + p.)8(x) will be of degree �n
- 1.
Hence
.f<xx + p.)8(x)Q,.(x)diP(x) = 0 by
(6),
and
(12)
shows that
Lb8(x)R,._I(x)d�P(X)
=
0
8(x) of degree �n - 2. The last require R,.-I(x) differs from Q,._I(x) by a constant factor. Since the highest coefficient in Q(x) is arbitrary, we can set
for an arbitrary polynomial ment shows that
Rn-J(x)
=
-Q,._I(x).
In the equation
Q(x)
=
(Xx + p.)Q,.(x) - Q,._I(x)
X and p.. Multiplying both members by Q,._1(x)dfP(X) and integrating between - oo and oo, we get
it remains to determine constants
X _"'..,xQ,._lQndfP(X)
J
or
But
=
J_"',. Q!...1d!p(X)
372
INTRODUCTION TO MATHEMATICAL PROBABILITY
whence The equation
Q(v) (an+tV p,)Q,.(v) - Q,._t(V) serves to determine p, if Qn(v) The final expression of Q(x) will be Q(x) (an+t(X - v) QQ:(;)>)Q,.(x) - Q,._1(x). 0
+
=
=
¢ 0.
+
=
Owing to recurrence relations
Q2 (a2X f32)Q1 - Qo; Qs (asx f3s)Q2 - Q1; Q,. (a,.X (3,.)Qn-l - Qn-21 it is evident that Q, Q,., Q,._J, . Qt, Qo 1 +
=
+
=
·
+
=
·
x
in a Sturm series. For = oo only permanences.
x
=
·
1
- oo, it contains n + variations and for It follows that the equation
=
Q(x)
1
=
0
v.
has exactly n + distinct real roots and among them Thus, if the is solvable, the numbers h, , determined as problem + are roots of
Q(x)
�2
=
•
•
•
�n l
0.
Furthermore, all unknowns Aa will be positive. it follows that
(11)
In fact, from equation
a J_.. [<x _ ��)b'(�a)rd�(x) > o. .. Now we must show that constants Aa can actually be determined so A
=
to satisfy equations (A).
as
To this end let
_ P(x) f_.... Q(x) - Q(z) d�(z) [an+t(X- v) + Q,.Q,.(tv(v))JPn(X)- Pn-t(x). Then Q(x) J_.... :�z�- P(x) J_..,. Q�)��;z) =
X_ Z
=
=
(12),
and, on account of the expansion of the right member into power series of lacks the terms in Hence, the expan sion of
1/x
1/x, 1/x2, 1/x". f .... d�(z) P(x) _ x- z Q(x) •
_
•
•
373
APPENDIX II
lacks the terms in
P(x) Q(x)
1/x, 1/x2, =
mo
+
x
•
•
mt
•
1/x2"+1;
+ ... +
x2
that is, m2,. + x2"+1
On the other hand, resolving in simple fractions,
P(x) Q(x)
�
=
+
�
n
A +t . X - �n+l
+ ... +
X - �2
X - �1
1/x (A).
Expanding the right member into power series of
and comparing
with the preceding expansion, we obtain the system
By the previous
remark all constants A .. are positive.
Thus, there exists a point distribu
tion in which masses concentrated in
mo,
m1, . . . m2,..
+
n
1
points produce moments
One of these points v may be taken arbitrarily, with
the condition
Q,. (v)
¢
0
being observed, however.
6. Tshebysheff's Inequalities.
In a note referred to in the introduc
tion Tshebysheff made known certain inequalities of the utmost impor tance for the theory we are concerned with. proof of them was given by Markoff in
The first very ingenious
1884
and, by a remarkable
coincidence, the same proof was rediscovered almost at the same time by Stieltjes.
A few years later, Stieltjes found another totally different
proof; and it is this second proof that we shall follow. Let
-
oo,
IP(x)
oo.
be a distribution function of a mass spread over the interval
Supposing that a moment of the order i,
J_..
x'diP(x)
=
m;,
..
exists, we shall show first that lim
when
l
tends to
+ oo
.
Similarly
= =
0 0
For
J, .. xidiP(x) or
li(mo - IP(l)) lim liiP( -l)
f, .. diP(X)
!1;; li
=
li[IP(
+ oo
) - IP(Z)]
374
INTRODUCTION TO MATHEMATICAL PROBABILITY
or
Now both integrals
converge to 0 as ately.
l tends to + oo; whence both statements follow immedi
Integrating by parts, we have
J:xidrp(x) = li[rp(l)- mo] - i.fo1[rp(x) - m0]xi-1dx fo xidrp(x) = ( -1)i-llirp(-l)- iJ-01xi-1rp(x)dx, -l
whence, letting
l converge to + oo, '
=
m;
f .. xidrp(x) =-iJro .. [rp(x)- mo]xi-ldx- if 0 -�
xi-lrp(x)dx.
-�
If the same mass
mo, with the same moment m;, is spread according to 1/t(x), we shall have
the law characterized by the function m;
�
J_.... xidl/t(x) = -ifo ..[l/t(x) - mo]xi-1dx- if� .. xi-11/t(x)dx,
whence
J_....xi-l[rp(x)- 1/t(x)]dx = 0.
(13)
Suppose the moments
rp(x) are known. Provided rp(x) + 1 points of increase, there exists an equivalent point
of the distribution characterized by ha.<; at least
n
distribution, defined in Sec. 5 and characterized by the step function
1/t(x) which can be defined as follows: 1/t(x) = 0 1/t(x) =A1 1/t(x) =At + A2
for
+ An + An+t
for
1/t(x) =A1 + A2 + 1/t(x) =At + A2 + ·
provided roots
��, �2,
•
•
•
·
·
·
·
·
for for
for
-00
<X< ��
�� �X< �2 �2 �X< �3 �n �X< �n+l �n+l �X< + 00 1
�n+l of the equation Q(x) = 0 are arranged
in an increa.<�ing order of magnitude.
APPENDIX II
Equation
(13)
will hold for
same, the equation
J_
(14)
"',..
i
375
1, 2, 3, ... 2n
=
B(x)[cp(x) - ift(x)]dx B(x)
will hold for an arbitrary polynomial function
h(x)
=
=
or, which is the
0
of degree �2n
- 1.
The
cp(x) - ift(x)
in general has ordinary discontinuities.
We can prove now that
h(x),
if
not identically equal to 0 at all points of continuity, changes its sign at least 2n times.1 Suppose, on the c ontrary, that it changes sign r < 2n times; namely, at the points
Taking
B(x) equation
(14)
=
a1) (x - a2)
(x -
·
·
·
(x -
a.) ,
will be satisfied, while the integrand
B(x)h(x), if not 0, will be of the same sign, for example, positive. point of continuity of since
h(x)
h(x).
If �
changes sign at a;.
numbers a11 a21
a;
=
(i
=
1, 2,
.
.
.
r
)
Let � be any
h(a1)
then
=
0
If �does not coincide with any one of the
a., then for an arbitrarily small positive
E
we must
E
above a
have
(H• f- B(x)h(x)dx J •
=
0.
But by continuity
O(x)h(x) remains in the interval (� -
certain positive number unless
E,
�+
h(�)
)
E
=
for sufficiently small
0.
h(x) does not vanish cp(x) and if;(x) do not differ
Thus, if
at all points of continuity (in which case
essentially), it must change sign at least 2n times. the change of sign can occur. - oo,
Let us see now where
In the intervals n l, + �1 and �+
oo
1A function f(x) is said to change sign once in (a, b) if in this interval there exists a point or points c such that, for instance, f(x) � 0 in (a, c) and f(x) � 0 in (c, b), equality signs not holding throughout the respective intervals. The change of sign occurs n times if (a, b) can be divided in n intervals in which f(x) changes sign once.
376
INTRODUCTION TO MATHEMATICAL PROBABILITY
�(x) - 1/l(x)
evidently cannot change sign. Within each of the intervals
there can be at most one change of sign, since
1/l(x)
remains constant
there, and �(x) can only increase. The sign may change also at the points of discontinuity of 1/l(x); that is, at the points �1, �2, ... �n+1· Altogether, �(x) - 1/l(x) cannot change sign more than 2n + 1 times and not less than 2n times. Since 1/l(x) = 0 so far as positive
e,
< �1 and
x
�(�t - E) Also
1/l(x)
�(�1 - e)
is not negative for
we must have
=
mo for
x
> �..+1 and
�(�n+1 + e)
1/1(�1 -
�(x) ;a;
E
) � 0.
mo, so that
- 1/1(�..+1 +
E
) ;a; 0.
At first let us suppose
In this case
�(x) - 1/l(x) must change sign an odd number of times;that is, 1 times. Since this cannot happen more than 2n + 1 ' number of times �(x) - 1/l(x) changes its sign must be exactly
not less than 2n + times, the
2n + 1.
These changes occur once within each interval �i--1, �.
. �.. +1· When the change of sign and in each of the points �1, �2, occurs in the interval (�i-1, �.)where 1/l(x) remains constant, because �(x) never decreases, we must have for sufficiently small e •
•
(15) But the sign changes in passing the point �.;therefore,
(16) The equalities
�(�t - e) - 1/1(�1 - E)
=
O,
cannot both hold for all sufficiently small be a change of sign at �1 and not be greater than 2n -
�..+1,
1 which
e.
For then there would not
so that the number of changes would is impossible. Therefore, let
and Then there will be exactly 2n changes of sign: one in each of the intervals
�i-1, �-
377
APPENDIX II
and in each of the points
�2, �a, .
. �n+l·
The inequalities (15) and
(16) would hold fori � 2, but
foO(�l
-
) - !f(h
E
for all sufficiently small Now let
-
)
E
0,
=
E.
and for all sufficiently small positive of sign: In each of the points
Then there will be exactly 2n changes
E.
�1, �2,
•
•
•
�n and in each of the n intervals
The inequalities (15) and (16) will again hold fori � n, but
tp(�n+l - E) - !f(�n+l
for all sufficiently small
E.
foO(�n+l + E) - !f(�n+l + E)
and
) <0
- E
Letting
E
converge to
0,
=
0
we shall have
tp(�i - 0) � !f(�i - 0) tp(�i + 0) � !f(�i + 0) fori
=
1, 2, 3, . . . n + 1 in all cases.
Then, since
tp(�.) � tp(�i - 0); we shall have also
tp(�.) � lf(�i - 0) tp(�,) � !f(�i + 0) or, taking into consideration the definition of the function
!f(x)
i-1
tfJ
( ) �.
> =
� P(M � Q'(�z) l=l i
These are the inequalities to which Tshebysheff's name is justly attached.
For a particular root
�.
=
v
they can be written thus:
�P(�z)
tp(v) � �Q'(�z) (17)
�I
378
INTRODUCTION TO MATHEMATICAL PROBABILITY
with the evident meaning of the extent of summations.
Another, less
explicit, form of the same inequalities is
cp(v) !;; if;(v - 0) cp(v) � if;(v + 0).
(18) As to
P(x) and Q(x), they can be taken in the form: P(x) =[an+J(X - v)Q,.(v) + Q,_J(v)]P,.(x) - Q,.(v)Pn-l(x) Q(x) =[an+l(x - v)Q,.(v) + Q,_J(v)]Q,.(x) - Q,.(v)Q,_J(x).
Thus far we have assumed that
v was different from any root of the
equation
Q,.(x) =0, but all the results hold, even if
v approaches a root � of or tends to ( either h , one root of �n Q(x) +l) - oo or + oo, while the Q,.(x) remaining n roots approach the n roots x1, x2, . . . x,. of the equation
To prove this, we note first that when a variable
Q,.(x) =0. If
�1 tends to negative infinity, it is easy to see that P(�1) Q'(�J)
tends to
0.
In this case the other quotients
P(�,) Q'(�,) tend respectively to
P,.(x1) P,.(x2) ' ' Q�(x1) Q�(x2) If
�ntl tends to positive infinity the quotients P(�,). 'l=1, 2, . ' Q (�z)
.
.
n
approach respectively
P,.(x,), l - 1, 2, 3, Q�(x,), _
while
. . . n,
APPENDIX II
tends to
0.
Now take
tive number
E
v
� =
converge to
0.
-
E
and
v
379
� =
+
E
in
(17)
and let the posi
Taking into account the preceding remarks,
we find in the limit
whence again
(" cp "') cp
(�)
> =
�P..(x,) � Q�(x,) zz<e
< =
�P..(x,). � Q�(x,) "'' ;:;;e
(17)
But these inequalities follow directly from
by taking
v
=
� .
Since
t/l(v + 0) - t/l(v - 0) it follows from inequalities
(18)
=
P(v) ' Q (v)
that
0 � cp(v) - t/l(v - 0) �
P(v) ' Q (v)
·
On the other hand, one easily finds that
P(v) ' Q (v)
=
1 an+1Q,.(v)2 + Q�(v)Qn-l(v) - Q�1(v)Qn(v)
•
But referring to the end of Sec. 4,
..
Q�(v)Q,._l(v) - Q�_1(v)Q,.(v)
=
� a,Q,_l(v)2,
•=1
whence
an+1Q,.(v)2 + Q�(v)Qn-l(v) - Q�_l(v)Q,.(v)
=
Q�+l(v)Q,.(v) - Q�(v)Qn+l(v).
Finally,
1
0 � cp(v) - t/l(v - O) � Q�+1(v)Q,.(v) - Q�(v)Qn+l(v) If
cp1(v)
is another distribution function with the same moments
380
INTRODUCTION TO MATHEMATIOAL PROBABILITY
we shall have also
and as a consequence,
( 19 ) Here for brevity we use the notation
-a very important inequality. x ..(v) =
1
Q�+l(v)Q,.(v) - Q�(v)Q�1(v)"
7. Application to Normal Distribution.
An important particular
case is that of a normal distribution characterized by the function
In this case it is easy to give an explicit expression of the polynomials
Q,.(x).
Let
H,.(x)
dne-"1 e"•--· dx"
=
Integrating by parts, one can prove that for l � n - 1
J_"".,e-"•x1H,.(x)dx Hence, one may conclude that factor.
Let
Q,.(x)
Q,.(x) To determine
c,.,
=
=
0.
differs from
H,.(x)
by a constant
c,.H,.(x).
we may use the relation
H,.(x)
= - 2xHn- l (x)
which can readily be established.
- 2(n- 1)H-2(x) Introducing polynomials
relation becomes
Q,.(x) Hence, n C n C- --2
Since
Ho(x)
=
=
Qo(x)
c,.
1
n a,. = -,., Cn-1
2n - 2'
= 1, we have
IXl
1
co
= - = mo
= 1; also
1
=
1 C
-z:! �
fJn
=
0.
Q,.,
this
APPENDIX II
whence
The knowledge of
Ct =- �.
381
Co and Ct together with the relation
Cn-2 c = ,. 2n - 2 allows determination of all members of the sequence
c2, ca, C4, •
The final expressions are as follows;
1 2"' · 1 · 3 · 5 · · (2m- 1) -1 C2m+l = +1 2"' · 2 · 4 · 6 • · • 2m' C2"' =
H,.(x), Hn-t(x), H,._2(x) and owing to H,.(x) is an even or odd polynomial, according as n is even or
From the above relation between the fact that
odd, one finds
H2m(O) = (-2)"' · 1 · 3 · 5 · · · (2m- 1), while another relation
H�(x) = -2nHn-t(X), following from the definition of
H,.(x),
gives
H�,.._1(0) = (-2)"' · 1 3 · 5 · · · (2m- 1). ·
These preliminaries being established, we shall prove now that
1 (v) = CnCn-rt(H:..rl(v)H,.(v) - H�(v)Hn+l(v))
x ..
attains its maximum for
v =0.
Let
O(v) =H:..r1(v)H,.(v) - H�(v)Hn+I(v). Then, taking into account the differential equation for polynomials
H,.(v): H::(v) =2vH�(v) - 2nH,.(v) we find that
dO = 2v0- 2H,.(v)Hn+t(v). dv On the other hand,
( )t!:._ H,.(v)
0 =-Hn+l V
and denoting roots of the polynomial
dV Hn+l(V)' Hn+I(v) in general
by �'
38 2
INTRODUCTION TO MATHEMATICAL PROBABILITY
Consequently
Again
and so
�� = 2Hn+l(v) 2� �:�n) (v � �)2 = - �+� �v) 2� (v � �) 2• Roots of the polynomial
Hn+l(x) being symmetrically located with
respect to 0, we have:
and finally
Hence
dO dv -
>
dO dv
v < 0;
if
O
-<
that is,O(v) attains its maximum forv for
v =
0.
H�m+l (0),
v>O
if
O
= 0 and :Xn(v) attains its maximum C2m, c2m+li H2m(O),
Referring to the above expressions of we find that
2 4·6··· 2m 3· 5 · 7 ··· ( 2m + 1) 2·4·6··· 2m :X2m+l(O) = • 3· 5 · 7 ··· ( 2m + 1) X2m(O) =
354, we find the inequality
In Appendix I, page
·
2·4·6 .. · 2m 1 .y; -: :----.---: -"'"F.'====:=� < 2 5;:;·- ··---= 1) ( 2m 1:;--· 3;oV 4m + 2 -
-
·
whence
•
-
2·4·6···2m 5 7 ( 2m + 1)
��=-����= <
3 Thus, in all cases
·
·
·
·
·
� 4m + 2
.
383
APPENDIX 11
whence, by virtue of inequality
(19),
I1P1(v) - tp(v)l Thus any distribution function mo
=
1,
m21o
-
1
=
1,01(v)
m2,.
0,
=
with the moments
1·3·5 (2k - 1) -------:2=-,. ---'---__.:.
corresponding to
1,0(v) differs from
tp(v)
1
=
J
v
� -
by less than
�2:·
<
(k
� n)
e-""du oo
Since this quantity tends to 0 when n increases indefinitely, we have the following theorem proved for the first time by Tshebysheff: The system of infinitely many equations
(2k- 1) k
=
1, 2, 3, . . .
uniquely determines a never decreasing function
tp(x) such that 1,0(-
oo
)
=
0;
namely,
8. Tshebysheff-Markoff's Fundamental Theorem. depending upon a parameter
�,
1 F(x, �)
When a mass
is distributed according to the law characterized by a function
=
we say that the distribution is variable.
Notwithstanding the variability of distribution, it may happen that its moments remain constant.
If they are equal to moments of normal
distribution with density
� then by the preceding theorem we have rigorously
F(x, �) no matter what � is.
1
=
r:_ V 'lr _
f"'-
e-""du oo
384
INTRODUCTION TO MATHEMATICAL PROBABILITY
Generally moments of a variable distribution are themselves variable.
>..
Suppose that each one of them, when instance
oo
),
tends to a certain limit (for
tends to the corresponding moment of normal distribution.
One can foresee that under such circumstances
1p(x)
F(x, >..)
"' e-u•du. J v;
will tend to
1
=
--
-oo
In fact, the following fundamental theorem holds:
If, for a variable distribution characterized
Fundamental Theorem.
by the function F(x, >..) , lim
for any fixed
J_
k
..
=
xkdF(x, >..)
=
�I-'" e-"'•xkdx;
..
0, 1,
lim
>..�
oo
..
2,
3,
F(v, >..)
..., then =
1 _ ,r: V 'll"
J
"
>..�
e-"'"dx;
oo
-oo
uniformly in v. Proof.
be
2n +
Let
1 moments corresponding to a normal distribution.
They
allow formation of the polynomials
Qo(x), Q1(x), ... Qn(x) and the function designated in Sec. 6 by
and
Q(x)
1/l(x).
Similar entities cor
responding to the variable distribution will be specified by an asterisk. Since as
and since .!ln > O, we shall have
.1! for sufficiently large
>...
Then
> 0
F(x, >..)
will have not less than
n+1
points of increase and the whole theory can be applied to variable dis tribution.
In particular, we shall have
0 ;;! 1p(v) - 1/l(v
-
0) ;;! Xn(v)
(20) 0 ;;! F(v, >..) - 1/l*(v - 0) ;;! x!(v). Now
Q �(x) (s
=
0,
1,
2, ... n)
and
Q*(x)
depend rationally upon
385
APPENDIX II
mt(k
=
0,
1,
. 2n);
2,
hence, without any difficulty one can see that
Q!(x) --+ Q,(x); s 0, 1, 2, ... n Q*(x) --+ Q(x) =
as >.. --+ oo ; whence,
x:(v)
--+ x.. (v).
Again
1/t*(v
-
O)
--+
1/t(v
0)
-
as >.. --+ oo.
A few explanations are necessary to prove this.
Q,.(v) ¢
Then the polynomial
0.
�1
�2
<
<
Q(x)
�3
will have
' ' '
<
<
n +
At first let
1 roots
�n+l·
Since the roots of an algebraic equation vary continuously with its coefficients, it is evident that for sufficiently large >.. the equation
Q*(x)
=
0
will haven + 1 roots:
n and
<
��
<
et will tend to �" as >.. --+ oo. 1/t(v - 0). If Q,.(v) - oo
or +
· ·
<
·
<
�:...1
In this case, it is evident that
1/t*(v
-
O)
o, it may happen that nor �:...1 tends as >.. --+ oo, while the other roots tend to the
will tend to
respectively to
n
oo
=
roots
of the equation
Q,.(x) But the terms in tend to
0,
1/t*(v
-
0)
=
0.
corresponding to infinitely increasing roots
and again
1/t*(v
-
0)
--+
1/t(v
-
0).
Now
x.. (v) <
�;
n
.
Consequently, given an arbitrary positive number large as to have
x .. (v) <
�;n
..
<
E.
E1
we can select n so
386
INTRODUCTION TO MATHEMATICAL PROBABILITY
Having selected
in this �anner, we shall keep it fixed. Then by the
n
preceding remarks a number L can be found so that
x*(v)
<
.J"in
< e
1¥-(v - 0) - !f*(v - O)l for
X > L.
Combining this with inequalities
IF(v, X) - .,o(v)l for
X
> L.
<
E
< 3e
And this proves the convergence of To show that the equation
fixed arbitrary v.
lim
F(v, X)
1 =
c. V'lr _
holds uniformly for a variable to P6lya. Since
.,o(-
oo
)
=
function, one can determine
.,o(x) - .,o(x)
1 Next, because
.,o(x)
� 1
we find
(20),
s·
<
a2
for k
=
large
X
<
. . .
0, 1, 2,
<
.
n
v we can follow a very simple reasoning due 1 and .,o(x) is an increasing two numbers ao and an so that 0, .,o( + oo )
=
� < �
.,o(ao)
�
for
<
- .,o(an)
for
an.
X�
ao
(ao, an) can be and an points
so that
-
1.
By the preceding result, for all sufficiently
1
- F(a,., X)
<
�
and
k Now consider the interval
0 �
for a
e-"''dx
is a continuous function, the interval
an-1
.,o(v)
-oo
subdivided into partial intervals by inserting between
al
to
F(v, X)
F(v, X)
(<
oo,
�;
=
ao).
1, 2, ...
Here for v �
0 <
.,o(v)
and
IF(v, X) - .,o(v)l
n
<
e.
<
�
- 1.
ao
387
APPENDIX II
v belonging to the interval (a,., + oo )
For
0 �
1 - F(v, X)
<
E "2'
0 < 1 -.cp(v) <
E
2,
whence again
IF(v, X) - cp(v)l
E.
<
Finally, let
(k
=
. .. n - 1).
0, 1, 2,
Then
F(v, A) - cp(v) � F(ak, A) - cp(ak+l) [F(ak, A) - cp(ak)] + (cp(ak) - cp(ak+l)] a v a , v F( , A) - cp( ) � F( k+l A) - cp( k) [F(ak+ll X) - cp(ak+l)] + (cp(ak+l) - cp(a�o)]. =
=
=
=
But
whence -E <
F(v, A) - cp(v)
Thus, given E, there exists a number such that
L(E)
IF(v, A) - cp(v)l for
A
< E.
depending upon
E
alone and
< E
> L (E) no matter what value is attributed to
v.
The fundamental theorem with reference to probability can be stated as
follows:
Let s,. be a stochastic variable depending upon a variable positive integer 1, 2, 3, n. If the mathematical expectation E(s:) for any fixed k tends, as n increases indefinitely, to the corresponding expectation =
E(x")
=
1 -- .. x"e-"'"dx .v;;: - ..
J
of a normally distributed variable, then the probability of the inequality Sn < V
tends to the limit
and that uniformly in v.
_1.v;;:_J.
- ..
e-"'"dx
388
INTRODUCTION TO MATHEMATICAL PROBABILITY
In very many cases it is much easier to make sure that the conditions of this theorem are fulfilled and then, in one stroke, to pass to the limit theorem for probability, than to attack the problem directly.
APPLICATION TO SUMS OF INDEPENDENT VARIABLES 9. Let z1, Z2, za, ...be independent variables whose number can be increased indefinitely.
Without losing anything in generality, we may
suppose from the beginning E(zk)
k
0;
=
=
1, 2, 3, ....
We assume the existence of E(z&) for all k
=
1, 2, 3, ....
=
bk
Also, we assume for some positive o the
existence of absolute moments
k
=
1, 2, 3, ..
.
.
Liapounoff's theorem, with which we dealt at length in Chap.XIV, states that the probability of the inequality Z1
+ Z2 + ' ' ' + Zn
<
t,
v'2B: where
B
..
=
b1 + b2 + · · · + b,.
tends uniformly to the limit
1 --
..Ji as n
---+
oo
J
' e-"''dx
-co
, provided
p.r"i-6>
+
p.�'J.+6)
+
.
6
. . +
p.�2+6>
---+
0.
B!+2 Liapounoff's result in regard to generality of conditions surpassed by far what had been established before by Tshebysheff and Markoff, whose proofs were based on the fundamental result derived in the preceding sec Since Liapounoff's conditions do not require the existence of
tion.
moments in an infinite number, it seemed that the method of moments was not powerful enough to establish the limit theorem in such a general form. ·
Nevertheless, by resorting to an ingenious artifice, of which we
made use in Chap. X, Sec. 8, Markoff finally succeeded in proving the limit theorem by the method of moments to the same degree of generality as did Liapounoff.
389
APPENDIX II
Markoff's artifice consists in associating with the variable
x, andy,. defined as follows:
variables
z, two new
Let N be a positive number which in the course of proof will be selected so as to tend to infinity together with
n.
Then
if if Evidently
z,., x,, y,. are connected by the relation z,. x, +y,. =
whence
E(x,) + E(y,.)
(21)
=
0.
Moreover
(22)
Elx�ol*'
+
EIYki2H
=
Elz,.l*'
I'L2+11,
=
x, and y,. x,. is bounded, mathematical expectations E(xt)
as one can see immediately from the definition of Since
exist for all integer exponents l
l, 2, 3, . . . and fork
=
=
1, 2, 3,
In the following we shall use the notations
IE(xDI
cL11; l 1, 2, 3, . . . c�2> + c�2> + . . . + c�2> B� 1'�2+1> + 1'�2+8> + . . . + 1'�2+6) c... =
=
=
=
Not to obscure the essential steps of the reasoning we shall first establish a few preliminary results.
Lemma 1.
Let
q,
represent the probability thaty, � . < c.. + q.. . q1 + q2 + ·
·
·
=
Proof. only if
lz�ol
Let
IP�<(x)
N*'
be the distribution function of
> N, the probability
q, is not greater than
��:diJ'k(X)
+
0; then
z,.
JN"' d!p�o(X).
On the other hand,
J::lxi*'diJ',.(X)
+ fN"'ixi2+'diJ',.(X) � 1'�2+3>,
Since y,. � 0
390
INTRODUCTION TO MATHEMATICAL PROBABILITY
But
J:
- lxl2+6d
"' N +6 +6 k(x N Ixl2 d
J
�J :
- d
.J:,"'a
whence
The inequality to be proved follows immediately.
The following inequality holds:
Lemma 2.
Proof.
From
EIYkl2+6 � J.lk2+6) which is a consequence of the second equation
E(y�) � The first equation
(22)
it follows that
(2 J.! +6) 6 .
N
(22)
gives
Taking the sum fork
= 1, 2, 3,
.
.
.
n,
Bn ;:;;;; B� ;:;;;; Bn
we get
- ��'
whence
Lemma 3.
For e ;:;;;; 3, c<•> 1
+
2 +
c<•>
·
·
·
+
Bn2 Proof. inequalities
This
inequality
n
c<•>
=
<
e
follows
( ) N2 Bn
e
2
- . _ z
,
immediately
from
the
evident
391
APPENDIX II
Lemma 4.
The following inequality holds cp> + c�ll + ... + c�ll < = B:
Proof.
( C,. )t· N2H
Since
E(x,.) + E(y,.) = 0,
we have
c�ll = IE(x,.)l = IE(y,.)l � Ely,.l. On the other hand, by virtue of Schwarz's inequality
[EIYtl + EIY21 + ... + Ely,l)2 �
n
+ q,.)
� E(yl) � B".;:,, k=l
whence the statement follows immediately. If the variable integer
N
should be subject to the requirements that
both the ratios
c.. N2+'
N2 B,
and
should tend to 0 when n increases indefinitely, then the preceding lemmas would
give
three
important
corollaries. But
before
stating
these
corollaries we must ascertain the possibility of selecting N as required.
It suffices to take
N = (B,.C,.) 4+'. 1
Then
N2 C,. = N2+B B,.
C,.
=
( )._2 B!+�
--. 4+'
0
by virtue of Liapouno:ff's condition . Also
will tend to 0.
By selecting
corollaries:
Corollary 1.
tends to 0 as n --.
The
N in this
manner we can state the following
sum
ql + q2 + .. . + q,. oo
•
392
INTRODUCTION TO MATHEMATICAL PROBABILITY
Corollary 2.
The ratio
B'.. B,. tends to 1. Corollary 3.
The ratio 4•> + c�•> +
·
·
•
+ c�>
B! tends to 0 for all positive integer exponents e except e = 2. 10. Let F,.(t) and q,,.(t) represent, respectively, the probabilities of the inequalities
.
+ Z2 + ··· + Z,. -v'2B.: X1 + X2 + ' ' ' + X,. -v'2B.: Z1
t
<
< t.
By repeating the reasoning developed in Chap. X, Sec. 8, we find that
IF .. (t) - q,,.(t)l � ql + q2 + ... +
q,..
Hence, lim by Corollary 1.
(F ,.(t) - q,,.(t))
=
0
n-+
co
n-+
co,
as
It suffices therefore to show
' V1r J 1
e-"'"dx
q,,.(t)-+---=
as
-oo
and that can be done by the method of moments. theorem
(Xt
)
· · · + x,. m V2B.:
+ X2 +
=
�
�adpt
m! ·
·
·
By the polynomial
AI
Ba,fl,
.
!!!
•
.
l\
�
22B,.2
where the summation extends over all systems of positive integers
a � p � ··· � A satisfying
the condition
a+P+ ··· + A=m and Sa,fl,
. . . ll
denotes a symmetrical function of letters Xt,
determined by one of its terms
xyx{ .. . x�
x2, . . . x,.
393
APPENDIX II
if l represents the number of integers x
1
, x2,
•
•
•
x,.
+
+
E(X1 X2
·
·
+
·
V2B,; where
Ga,IJ,
...
x,.)m
=
•
•
X.
•
Since variables
� m! �a!{J! ..
;.. is obtained by replacing powers of variables by mathe
matical expectations of these powers.
-
It is almost evident that
c�a> c
c�a) + c�a) + IG a,{J, ... ;..I :::;; m a B,.2
a, {J,
are independent, we have
·
·
{3
·
+ c�>
B,2
B,.2
c1X> + c�> +
Now if not all the exponents
a, {J,
•
•
•
X are
only when m is even), by virtue of Corollary
tends to 0.
+
+
'
·
V2B.: if m is
2 (which is possible 3 the right member as well as =
Hence
E(X1 X2
+ x ,.
)
m �
O
odd.
But for even m we have
(23)
+
+ ..
E(Xl X2 V2B.:
Let us consider now
+c�>-> .
(m being even)
where summation extends over all systems of positive integers
x�l'� ... �"' satisfying the condition m
+w=2
394
INTRODUCTION TO MATHEMATICAL PROBABILITY
and H�o.,,., ... ., is a symmetric function of its term
(c�2>)"(c�2>)� l being the number of subscripts X, H>.,,.,
•
.
m
Bn2
.
;S;
., -
(Ct<2>)>.
c�2>, c�2>, . . . c�2> determined by (c12>)"',
•
p.,
+ (c<22 >)>. + .
.
Apparently
w.
B" n
B"' n Besides
and
if not all subscripts X, p.,
But by Corollary
•
•
w
•
are equal to 1.
It follows that
2 B' _.!!
�
B..
and evidently H1,1,
•
•
1
G2,2,
= (;)I -
•
G2,2,
Hence
2·
;! · 2 �
•
•
1 .
B ..
1
(i}
and this in connection with (23) shows that for an even m
( +x. �. + "J �
E "'
2
•
·
Finally, no matter whether the exponent m is odd or even, we have
( + +Inn' ' ' + )
• X1 IliD E
X2
v2B,. _
Xn m
=
_1_
� V1T _
.. J .. -
X me- .,.dX.
APPENDIX II
395
Tshebysheff-Markoff's fundamental theorem can be applied directly and leads to the result: lim t/>n (t) uniformly in
t. t.
r:. V'll" _
f'
e-"'•dx oo
-
On the other hand, as has been established before, lim
uniformly in
1
=
[F,.(t) - q,,.(t)]
Hence, finally lim
F,.(t)
1
=
r:. V'll" _
f' -
=
0
e-"•dx oo
uniformly in t. And this is the fundamental limit tJ?.eorem with Liapounoff's condi tions now proved by the method of moments. Markoff, is simple enough and of high elegance.
This proof, due to However, preliminary
considerations which underlie the proof of the fundamental theorem, though simple and elegant also, are rather long.
Nevertheless, we must
bear in mind that they are not only useful in connection with the theory of probability, but they have great importance in other fields of analysis.
APPENDIX III ON A GAUSSIAN PROBLEM 1. In a letter to Laplace dated January 30, 1812,1 Gauss mentions a difficult problem in probability for which he could not find a perfectly satisfactory solution.
We quote from his letter:
Je me rappelle pourtant d'un probleme curieux duquel je me suis occup� il y a
12 ans,
mais lequel je n'ai pas r�ussi alors a r�soudre a rna satisfaction.
Peut
�tre daignerez-vous en occuper quelques moments: dans ce cas je suis sur que vous La voici: Soit M une quantit� inconnue
trouverez une solution plus complete. entre les limites 0 et
1 pour laquelle toutes les valeurs sont ou �galement probabies
ou plus ou moins selon une loi donn�e: qu'on Ia suppose convertie en une fraction continue 1
M=-, + a
1 a"+
Quelle est la probabilite qu'en s'arr�tant dans le developpement a un terme fini > a
Ia fraction suivante
1 __ +
a
1
-
a
soit entre les limites 0 et x?
+ .
Je la designe par P(n, x) et j'ai en supposant toutes
les valeurs egalement probables P(O, x) = x. P(1, x) est une fonction transcendante d�pendant de la fonction
1+!2 +!+ ... +! 3
X
que Euler nomme in�xplicable et sur laquelle je viens de donner plusieurs re cherches dans un memoire pr�sente a notre Soci�te des Sciences qui sera bienMt imprim�. intraitable.
Mais pour le cas ou n est plus grand, Ia valeur exacte de P(n, x) semble Cependant j'ai trouve par des raisonnements tres simples que pour
n infinie P(n, x) =
log
1 Gauss' Werke, X, 1, p. 371. 396
(1+ x) log 2
397
APPENDIX III
Mais les efforts que j'ai fait lors de mes recherches pour assigner
P
(
pour une valeur tres grande de
x)
n,
n,
_
log (1 + x) log 2
mais pas infinie, ont ete infructueux.
The problem itself and the main difficulty in its solution are clearly indicated in this passage. The problem is difficult indeed, and no satisfactory solution was offered before 1928, when Professor R. 0. Kuzmin succeeded in solving it in a very remarkable and elegant way.
2. AnalytiCal Expression for P,.(x).
We shall use the notation
Pn(x) for the probability which Gauss designated by P(n, x).
The first question that presents itself is how to express P,.(x) in a proper analytical
form.
15(vl, v2, ... v,.,
Let
x) be an interval whose end points are
represented by two continued fractions:
!_
1
and
+V2 +
VJ
.
. +
.
1 Vn
with positive integer incomplete quotients positive number � 1. systems of integers
1 · +-
v,.
+X
v1, v2,
v,.,
while x is a
Two such intervals corresponding to two
v1, v2, . . .v,.
vL v�, . . . v�
and
that is, do not have common inner points.
different
do not overlap;
For, if they had a common
inner point represented by an irrational number N (which we can always suppose), we should have for some positive x' < 1 and·x" < 1
N
1 =
-
1
1
=
VJ +-
V2 + +
But that is impossible unless
1
+ +
v,.+
' x
v�
v1, v�
=
1
+�
v�
-
·
=
v2,
v�
=
1
v�
+ x"
'
v,..
A number M being selected at random between 0 and 1 and converted into a continued fraction
M
1 =
-
1
VJ +-
V2 + ·
.
1 + -v,. +�
if the quantity� turns out to be contained between 0 and x < 1, M must belong to one (and only one) of the intervals
15(vl, v2, ...v,., x) cor n positive integers
responding to one of all the possible systems of
v1, v2,
•
•
•
v,..
Since M has a uniform distribution of probability and
398
INTRODUCTION TO MATHEMATICAL PROBABILITY
[:
ll(tlt, v2,
since the length of the interval
(-1)" Vt
.
..
x) is
v,,
1
!_ v.+ -
- -1 tit+-
. .
t12 + .
1 + , V,. +X
[� �
.
1 · +-
v,.
]
the required probability P,.(x) will be expressed by the sum
P.(x)
: ��
-1)•
..
1 Vt+ 1 V";+
+
+
1 +-- Vn+X
.
1 . +Vn
extended over all systems of positive integers Vt,
·v2,
v,.
]
In general
let p,
=
Q.•
!_
(i
1
tit+-
t12 +
.
=
1,
2, ... n)
1
· +-
tli
be a convergent to the continued fraction
!_
1 Vt+-
V2 + ·
.
1 . +-· tin
Then the above expression for P,.(x) can be exhibited in a more convenient form:
P,.(x)
(1)
[
P,.+ xP,._t (-1),. Q, + xQ,._t
=
_
]
P,. · Q,.
By the very definition of P,.(x) we must have P,.(1)
=
1; hence the
important relation
�
(2)
1 -1 Q,.(Q,. + Q..-t) .
This result can also be established directly by resorting to the original expression of P,.(1) and performing summation first with respect to Vt, then with respect to Relation
(2)
v2,
etc.
can be interpreted as follows: Let
length of an interval
ll(vt, t12,
•
•
•
�(l
v,, x).
=
ll
in general be the
Then
1
summation being extended over the (enumerable) set of intervals ll.
APPENDIX III
3. The Derivative of P,.(x).
399
In attempting to show that
P,.(x)
tends uniformly to a limit function as n � oo it is easier to begin with its derivative
p,.(x).
Series
(2)
obtained by f ormal derivation of interval
(0, 1).
is uniformly convergent in the
For
whence
2
1
< Q,.(Q,. + Q..-t) (Q,. + xQ,.-1)2 and the series
2
� Q,.(Q is convergent.
..
- 2 + Q ..-t)
Hence
dP,.(x) �
=
p,.(x)
=
1 � ' � (Q,. + xQ,.-1)2
Since
we have
and, performing summation with respect to
v1, v2, .
.
v..
whence '
p,.(x) =
..
�Pn-1 ��n=l
(
11,.
)
1 1 + x (v,. +
x) 2
.
Vn-t for constant
400
INTRODUCTION TO MATHEMATICAL PROBABILITY
or else ..
Pn(x)
(3)
=
� Pn- {v ! x) (v : x)2 v=1
-an important recurrence relation which permits determining com pletely the sequence of functions
P1(x), P2(x), . .
.
po(x) = 1. 4. Discussion of a More General Recurrence Relation. In discussing relation (3) the fact that po(x) = 1 is of no consequence. We may start starting with
with any function sequence
fo(x) subject to some natural limitations, and form a JI(x), f2(x), fa(x),
by means of the recurrence relation ..
fn(x)
(4)
=
�fn- {v ! x) (v � x)2. •=1
The following properties of fn(x) follow easily from this relation:
a. If
fo(x)
a =
1 +
x
then n =
1, 2, 3, . . .
For ..
fi(x)
=a
� (v ! x- v + x1 + 1)
=
•=1
whence the general statement follows immediately.
b. If
1 then
� x � fo(x) � 1 !
x
1
�
x
.401
APPENDIX III
follows from
(a)
and equation (4) itself.
As a corollary we have: Let
Mn
and
mn
be the precise upper and
lower bounds of
(I + x)fn(X) in the interval 0 �
x � 1.
(n =
Then
Mo E1;; M1 E1;; M2 E1;; mo � m1 � m2 � c.
·
·
•
We have
(
.. {1 � {1 )dx f = � J o fn-1 v Jo n(x v�l
1 dx + x (v + x)2 =
)
()
1 du f .. J 1 /n-1 u u 2
=
d.
2, ... )
0, 1,
=
{1 J o fn-1(x)dx
=
f1 Jo fo(x)dx.
The following relations can easily be established by mathematical
induction:
) ) )
1 � .f n + xPn-1 ( In X) = � JO Qn + XQn-1 (Qn + XQn- )2 1 1 xP n n-1 + � f ( ) 2n X = �� Qn + XQn-1 (Qn + XQn-1)2 1 n + x Pn-1 �� � 2n Qn + fan(X) XQn-1 (Qn + XQn-1)2'
(p (p (p
n
=
Let us suppose now that the function 0 �
fo(x)
defined in the interval
X � 1 P.o be x)fo(x)l.
possesses a derivative everywhere in this interval and let
an upper
bound of
Then by
property
lf�(x)l while M (b)
lfn(x)l � M;
is an upper bound of 1 (1 +
l/2n(x)l � M;
l/an(x)l � M,
The function fn(x) represented by the series
fn(X) where
u
=
�fo(u)(Qn +
stands for
Pn + xPn-1 Qn + xQn-t'
�Qn-t)2
402
INTRODUCTION TO MATHEMATICAL PROBABILITY
has a derivative; for the series obtained by a formal differentiation
J�(x) �J�(u)(Q,. �;J:_1)4 - 2�/o(u)(Q,. :�Q,._1)3 ts uniformly convergent and represents /�(x). Now Qn-1 (Q,. + xQn-1)3 < _!_ Q� =
a.nd
Hence
12�fo(u)(Q,. !;Q,._1)31
<
4M
�Q..(Q,. � Q,. 1) _
=
4M
(2). On the other hand, the inequality Q,.(Q,. + Qn-1) (v,.Qn-1 + Qn-2)[(v,. + l)Qn-1 + Q,._2] > > 2Q,._1(Qn-1 + Qn-2) holding for n � 2 together with an evident inequality Q1(Q1 + Qo) � 2
by virtue of
=
shows that
(n �
2).
Thus
(Q,. + xQn-1)' > Q!. Q! > Q,.(Q,. t Q,._1) . Q,.(Q,. t Q,._l) > > 2n-2Q,.(Q,. + Qn-t}
and consequently
P.o ( -1)" J,'( '� ) u (Q,. xQn-1)4I 2n-2" � +
0
<
Hence, we may·conclude that P.1
is an upper bound of in (d), we find that
1/�(x)l.
=
2��2 + 4M
Similarly, starting with the second equation
P.t
=
:.2 + 4M
APPENDIX III l!�.. (x)l,
is an upper bound of
43 0
and so forth.
In general, the recurrence
relation
P.k
=
P.k-t + 4M 211-2
(k=1 23 '
'
'
. .
·
)
determines upper bounds of
IJ�(x)j, IJ�..(x)j, IJ� ..(x)j, It is easy to see that in general
P.o P.k < 2k(n-2) + 1
_
4M 2-(n-2)
so that for sufficiently large n
P.k < 5M. 6. Main Inequalities.
Let
�Po(x) = fo(x)
-
mo
1 + x'
Then
J,.(x)
-
1
mo
+
X= !f'n(x) =
=
1 � � IP o(u)(Q,. + xQ,.-1)2
>
1� 1 2 � 1Po(u)Q ,. Q,. + Q,._ ( t)'
Since the intervals � defined at the end of Sec. 2 do not overlap and cover completely the whole interval (, 0 1), we may write:
l
1
=
f1
2Jo IPo(x)dx
=
1� f 2 � J (8)1Po(x)dx
=
1� 1 2 � 1Po(u l)Q,. Q,. + Q,._1)' (
the latter part following from the mean value theorem and u1 being a number contained within the interval �.
f,.(x) and, since both
1 u
!f'o(u) Consequently,
mo
+X
-
By subtraction we find
1�
1
l > 2 � [ o ( u) - !f'o(ut) 1P ]Q,.(Q,. + Q,._1)
and Ut belong to the same interval �. -
!f'(Ut)
>
-
Q ��n-l)
Q,.( ,.
>
P.o +mo. 2"
404
INTRODUCTION TO MATHEMATICAL PROBABILITY
and a fortiori
It follows that
(5) In a similar way, considering the function
t/lo(x)
=
-fo(x) � l+x
and setting
we shall have
whence
M1 �Mo-l+2-"(�o&o+Mo).
(6) Further, from
(5)
and
(6)
M1- m1 �Mo- mo+2-"+l(�o�o+Mo)- l- l1. But
l+Z1
=
i log
2 (Mo- mo) ·
=
(1 - k)(Mo- mo);
k
< 0.66,
so that finally
M1- m1 Starting with j
..
<
k(Mo- mo)+2-n+l(llo+Mo).
(x), /2 (x), ...instead of /o(x), in a similar way we find ..
M2- m2 Ma-ma
<
M,.- m,.
<
<
k(M1- m1) + 2-n+l(#£1+M1) k(M2- m2)+2-"+1(#'2+M2) k(M-1- m,.-1)+2-"+l(lln-1+M,_J).
From these inequalities it follows that
M,.- m,.
<
(Mo-mo)k"+2-n+1 [�Lok"-1+lllk"-2+ +Mokn-1+M1kn-2+ ·
·
·
•
+ lln l+ •
•
+M-J).
Without losing anything in generality, we may suppose that positive function. Then *
M1,
m1
are used here with the same meaning as M ,.,,
m,.,
in Sec. 4.
/o(x) is ·a
405
APPENDIX III /J-k <
at least for sufficiently large
(7)
Mn - mn .
<
n.
(Mo-
(k = 1, 2, 3, ... )
5Mo
Owing to these inequalities we shall have mo)
kn + (2k)n-l + (1 JJ-o
6Mo · k)2n-l
_
This inequality shows that sequences
Mo
$;;:;
mo �
approach a common limit a. the value of this limit. and
n
M1!?;; M2!?;; m1 � m2 �
·
·
·
The following method can be used to find
Let N be an arbitrary sufficiently large integer
the integer defined by
n2 � N <
(n + 1)2•
Then
and therefore
mn < f ·) < Mn 1 + X = N(X = 1 + x" The last inequality permits presenting /N(x) thus:
(8)
IOI
< 1,
whence
18'1 and, because
Mn - mn
< 1,
ultimately becomes as small as we please in
absolute value,
a log
Equation
(8)
2 =
Jo1!o(x)dx.
shows clearly that the sequence of functions
fo(x), h(x), !2(x), ... defined by the recurrence relation function
(4)
approaches uniformly the limit
406
INTRODUCTION TO MATHEMATICAL PROBABILITY
where
1
a=
lo
! 2fo !o(x)dx.
6. Solution of the Gaussian Problem. 0 and
ing considerations to the case mo =
1,
P.o =
It suffices to apply the preced In this case M0 2
fo(x) = po(x) = 1. 1 -2
a=
log
=
I
·
Consequently,
PN(x) = (1 0 and
<
where n =
As N
t
[ .yN]. 1 to find
pN(t) �
+
=
;)
log
+
2
+
(
o kn
(1
- � . 2n a) ;
(1
log
+
2
t)
+
(
x kn
+
(1 - 3k)2n-a .'
)
oo
PN(t)
lpN (t)
< 1
It suffices to integrate this expression between limits
log
as stated by Gauss.
IOI
�
log
1�� ;- t)
Moreover,
_
log
(1 log
+
2
t)
for sufficiently large, but finite N.
<
lXI
l t(kn
+
3
(1 - k)2n
)
3
< t .
TABLE OF THE PROBABILITY INTEGRAL >(zl
z
0.00 0.01 0.02 0.03 0.04 0.05 0.06 0.07 0.08 0.09 0.10 0.11 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.20 0.21 0.22 0.23 0.24 0.25 0.26 0.27 0.28 0.29 0.30 0.31 0.32 0.33 0.34 0.35 0.36 0.37 0.38 0.39 0.40 0.41 0.42 0.43 0.44 0.45 0.46 0.47 0.48 0.49 0.50 0.51 0.52 0.53 0.54 0 55 0.56 0.57 0.58 0.59 0.60 0.61 0.62 0.63 0.64
I
=
�
I ,P(z)
0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359 0.0398 0.0438 0.0478 0.0517 0.0557 0.0596 0.0636 0.0675 0.0714 0.0753 0.0793 0.0832 0.0871 0.0910 0.0948 0.0987 0.1026 0.1064 0.1103 0.1141 0.1179 0.1217 0.1255 0.1293 0.1331 0.1368 0.1406 0.1443 0.1480 0.1517 0.1554 0.1591 0.1628 0.1664 0.1700 0.1736 0.1772 0.1808 0.1844 0.1879 0.1915 0.1950 0.1985 0.2019 0.2054 0.2088 0.2123 0.2157 0.2190 0.2224 0.2257 0.2291 0.2324 0.2357 0.2389
z
0.65 0.66 0.67 0.68 0.69 0.70 0.71 0.72 0.73 0.74 0.75 0.76 0. 77 0.78 0.79 0.80 0.81 0.82 0.83 0.84 0.85 0.86 0.87 0.88 0.89 0.90 0.91 0.92 0.93 0.94 0.95 0.96 0.97 0.98 0.99 1.00 1.01 1.02 1.03 1.04 1.05 1.06 1.07 1.08 1.09 1.10 1.11 1.12 1.13 1.14 1.15 1.16 1.17 1.18 1.19 1.20 1.21 1.22 1.23 1.24 1.25 1.26 1.27 1.28 1.29
1 L.
--.
e-!•'dt
0
z
>(•)
0.2422 0.2454 0.2486 0.2517 0.2549 0.2580 0.2611 0.2642 0.2673 0.2703 0.2734 0.2764 0.2794 0.2823 0.2852 0.2881 0.2910 0.2939 0.2967 0.2995 0.3023 0.3051 0.3078 0.3106 0.3133 0.3159 0.3186 0.3212 0.3238 0.3264 0.3289 0.3315 0.3340 0.3365 0.3389 0.3413 0.3438 0.3461 0. 3485 0.3508 0.3531 0.3554 0.3577 0.3599 0.3621 0.3643 0.3665 0.3686 0.3708 0.3729 0 3749 0.3770 0.3790 0.3810 0.3830 0.3849 0.3869 0.3888 0.3907 0.3925 0.3944 0.3962 0.3980 0.3997 0.4015
1.30 1.31 1.32 1. 33 1.34 1.35 1.36 1.37 1.38 1.39 1.40 1.41 1.42 1.43 1.44 1.45 1.46 1.47 1.48 1.49 1.50 1.51 1.52 1.53 1.54 1.55 1.56 1.57 1.58 1.59 1.60 1.61 1.62 1.63 1.64 1.65 1.66 1.67 1.68 1.69 1.70 1.71 1.72 1.73 1.74 1.75 1.76 1.77 1.78 1.79 1.80 1.81 1.82 1.83 1.84 1.85 1.86 1.87 1.88 1.89 1.90 1.91 1.92 1.93 1.94
407
>(<)
0.4032 0.4049 0.4066 0.4082 0.4099 0.4115 0.4131 0.4147 0.4162 0.4177 0.4192 0.4207 0.4222 0.4236 0.4251 0.4265 0.4279 0.4292 ().4306 0.4319 0.4332 0.4345 0.4357 0.4370 0.4382 0.4394 0.4406 0.4418 0.4429 0.4441 0.4452 0.4463 0.4474 0.4484 0.4495 0.4505 0.4515 0.4525 0.4535 0.4545 0.4554 0.4564 0.4573 0.4582 0.4591 0.4599 0.4608 0.4616 0.4625 0.4633 0.4641 0.4649 0.4656 0.4664 0.4671 0.4678 0.4686 0.4693 0.4699 0.4706 0.4713 0.4719 0.4726 0.4732 0.4738
z
1.95 1.96 1.97 1.98 1.99 2.00 2.02 2.04 2.06 2.08 2.10 2.12 2.14 2.16 2.18 2.20 2.22 2.24 2.26 2.28 2.30 2.32 2.34 2.36 2.38 2.40 2.42 2.44 2.46 2.48 2.50 2.52 2.54 2.56 2.58 2.60 2.62 2.64 2.66 2.68 2.70 2.72 2.74 2.76 2.78 2.80 2.82 2.84 2.86 2.88 2.90 2.92 2.94 2.96 2.98 3.00 3.20 3.40 3.60 3.80 4.00 4.50 5.00
>(,, 0.4744 0.4750 0.4756 0.4761 0.4767 0.4772 0.4783 0.4793 0.4803 0.4812 0.4821 0.4830 0.4838 0.4846 0.4854 0.4861 0.4868 0.4875 0.4881 0.4887 0.4893 0.4898 0.4904 0.4909 0.4913 0.4918 0.4922 0.4927 0.4931 0.4934 0.4938 0.4941 0.4945 0.4948 0.4951 0.4953 0.4956 0.4959 0.4961 0.4963 0.4965 0.4967 0.4969 0.4971 0.4973 0.4974 0.4976 0.4977 0.4979 0.4980 0.4981 0.4982 0.4984 0.4985 0.4986 0.49865 0.49931 0.49966 0.499841 0.499928 0.499968 0.499997 0.499997
INDEX Distribution, determination of, 271
A
equivalent point, 369 general concept of, 263
Arrangements, 18
normal (Gaussian), 243 Poisson's, 279
B
"Student's," 339 Distribution
Bayes' formula (theorem), 61
function
of
probability,
239, 263
Bernoulli criterion, 5
Divergence coefficient, empirical, 212
Bernoulli theorem, 96
Lexis' case, 214
Bernoulli trials, 45
Poisson's case, 214
. Bernstein, 8., inequality, 205 Bertrand's paradox, 251
theoretical, 212
Buffon's·needle problem, 113, 251
Tschuprow's theorem, 216
Barbier's solution of, 253 E
c Elementary errors, hypothesis of, 296 Ellipses of equal probability, 311, 328
Cantelli's theorem, 101 Cauchy's distribution, 243, 275
Estimation of error term, 295
Characteristic function, composition of,
Euler's summation formula,
201,
Event.s, compound, 29
of distribution, 240, 264
contingent, 3
Coefficient, correlation, 339
dependent, 33
divergence, 212, 214, 216
equally likely, 4, 5, 7
Combinations, 18
exhaustive, 6
Compound probability, theorem of, 31
future, 65
Continued fractions, 358, 361, 396
incompatible, 37
Markoff's method of, 52
independent, 32, 33
Continuous variables, 235 Correlation,
177,
303, 347
275
normal
(see
Normal
relation) Correlation coefficient,
mutually exclusive, 6, 27
cor
distribution of,
opposite, 29 Expectation, mathematical, 161 of a product, 171
339
oi a sum, 165 D
Difference equations, ordinary, 75, 78 partial, 84 Dispersion, definition, 172 of sums, 173 Distribution, Cauchy's, 243, 275 characteristic function of, 264
F Factorials, 349 Fourier theorem, 241 French lottery, 19, 108 Frequency, 96 Fundamental lemma
(see Limit theorem) (see Tshebysheff-
Fundamental theorem Markoff theorem)
of correlation coefficient, 339
409
INTRODUCTION TO MATHEMATICAL PROBABILITY
410
Markoff's theorem, for simple chains, 301
G
Markoff-Tshebysheff
probabilities,
of
function
( see
Mathematical expectation, definition of,
Gaussian problem, 396 Generating
theorem
Tshebysheff-Markoff theorem)
Gaussian distribution, 243
161 of a product, 171
47, 78, 85, 89, 93, 94
of a sum, 165 Mathematical probability, definition of,
H
6 Moments, absolute, 240, 264
Hermite polynomials, 72 Hypothesis of elementary errors, 296
inequalities for, 264 method of (Markoff's), 356.tf.
I
N
Independence, definition of, 32, 33
Normal correlation, 313
K
origin of, 327
J{hintchine (see Law of large numbers) I{olmogoroff ( see Law of large numbers;
Normal distribution, Gaussian, 243 two-dimensional, 308
Strong law of large numbers)
p
L PeP.rson's "x2-test," 327
Lagrange's series, 84, 150 Laplace-Liapounoff ( see Limit theorem) Laplace's problem, 255 of
large
numbers, generalization
by Markoff, 191 for identical variables
Point, of continuity, 261, 356 of increase, 262, 356
Laurent's series, 87, 148 Law
Permutations, 18
(Khintchine),
195
Poisson series, 182, 293 Poisson's case, 214 Poisson's distribution, 279 Poisson's formula, 137 Poisson's theorem, 208, 294
Kolmogoroff's lemma, 201
Polynomials, Hermite ( sec Hermite)
theorem, 185
Probability, approximate evaluation of,
Tshebysheff's lemma, 182
by Markoff's method, 52
Law of repeated logarithm, 204
compound, 29, 31
Law of succession, 69
conditional, 33
Lexis' case, 214 Liapounoff condition ( see Limit theorem) Liapounoff inequality, 26� Limit theorem, Bernoullian case, 131
definition (classical) of, 6 total, 27, 28 Probability integral, 128 table of, 407
for smns of independent vectors, 318,
323, 325, 326
R
fundamental lemma, 284 Laplace-Liapounoff, 284 Line of regression, 314
Relative frequency, 96 Runs, problem of, 77
Lottery, French ( see French lottery)
s
M Marbe's problem, 231 Markoff's theorem,
191
infinite dispersion,
Simple chains, 74, 223, 297 Markoff's theorem for, 301 Standard deviation, 173
411
INDEX Stieltjes' integrals, 261
Tschuprow
Stirling's formula, 349
Tshebysheff-Markoff
Stochastic variables, 161 Strong law of large numbers (Kolmo
(see Divergence coefficient) theorem,
mental, 304, 384 application, 388 Tshebysheff's inequalities, 373
goroff), 202 "Student's" distribution, 339
Tshebysheff's inequality, 204 Tshebysheff's lemma, 182
T
Tshebysheff's problem, 199
v
Table of probability integral, 407 Tests of significance, 331 Total probability, theorem of, 27, 28 Trials, dependent, independent, repeated,
44, 45
Variables, continuous, 235 independent, 171 stochastic, 161 Vectors
(see Limit theorem)
funda