Group Invariance in Statistical Inference
Group Invariance in Statistical Inference

Narayan C. Giri
University of Montreal, Canada

World Scientific
Singapore • New Jersey • London • Hong Kong
Published by World Scientific Publishing Co. Pte. Ltd., P O Box 128, Farrer Road, Singapore 912805. USA office: Suite 1B, 1060 Main Street, River Edge, NJ 07661. UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE.
GROUP INVARIANCE IN STATISTICAL INFERENCE

Copyright © 1996 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.
For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.
ISBN 981-02-1875-3
Printed in Singapore.
To NILIMA NABANITA NANDIAN
CONTENTS

Chapter 0. GROUP INVARIANCE
  0.0. Introduction
  0.1. Examples

Chapter 1. MATRICES, GROUPS AND JACOBIANS
  1.0. Introduction
  1.1. Matrices
    1.1.1. Characteristic Roots and Vectors
    1.1.2. Factorization of Matrices
    1.1.3. Partitioned Matrices
  1.2. Groups
  1.3. Homomorphism, Isomorphism and Direct Product
  1.4. Topological Transitive Groups
  1.5. Jacobians

Chapter 2. INVARIANCE
  2.0. Introduction
  2.1. Invariance of Distributions
    2.1.1. Transformation of Variable in Abstract Integral
  2.2. Invariance of Testing Problems
  2.3. Invariance of Statistical Tests and Maximal Invariant
  2.4. Some Examples of Maximal Invariants
  2.5. Distribution of Maximal Invariant
    2.5.1. Existence of an Invariant Probability Measure on O(p) (Group of p × p Orthogonal Matrices)
  2.6. Applications
  2.7. The Distribution of a Maximal Invariant in the General Case
  2.8. An Important Multivariate Distribution
  2.9. Almost Invariance, Sufficiency and Invariance
  2.10. Invariance, Type D and E Regions

Chapter 3. EQUIVARIANT ESTIMATION IN CURVED MODELS
  3.1. Best Equivariant Estimation of μ with Σ Known
    3.1.1. Maximum Likelihood Estimators
  3.2. A Particular Case
    3.2.1. An Application
    3.2.2. Maximum Likelihood Estimator
  3.3. Best Equivariant Estimation in Curved Covariance Models
    3.3.1. Characterization of Equivariant Estimators of Σ
    3.3.2. Characterization of Equivariant Estimators of θ

Chapter 4. SOME BEST INVARIANT TESTS IN MULTINORMALS
  4.0. Introduction
  4.1. Tests of Mean Vector
  4.2. The Classification Problem (Two Populations)
  4.3. Tests of Multiple Correlation
  4.4. Tests of Multiple Correlation with Partial Information

Chapter 5. SOME MINIMAX TESTS IN MULTINORMALS
  5.0. Introduction
  5.1. Locally Minimax Tests
  5.2. Asymptotically Minimax Tests
  5.3. Minimax Tests
    5.3.1. Hotelling's T² Test
    5.3.2. R²-Test
    5.3.3. ε-Minimax Test (Linnik, 1966)

Chapter 6. LOCALLY MINIMAX TESTS IN SYMMETRICAL DISTRIBUTIONS
  6.0. Introduction
  6.1. Elliptically Symmetric Distributions
  6.2. Locally Minimax Tests in E_p(μ, Σ)
  6.3. Examples

Chapter 7. TYPE D AND E REGIONS
Chapter 0

GROUP INVARIANCE

0.0. Introduction

One of the unpleasant facts of statistical problems is that they are often too big or too difficult to admit of practical solutions. Statistical decisions are made on the basis of sample observations. Sample observations often contain information which is not relevant to the making of the statistical decision. Some simplification is introduced by characterizing the decision rules in terms of the minimal sufficient statistic, which discards the part of the sample observations that is of no value for any decision making concerning the parameter, thereby reducing the dimension of the sample space to that of the minimal sufficient statistic. This, however, does not reduce the dimension of the parametric space. By introducing the group invariance principle and restricting attention to invariant decision rules, a reduction of the dimension of the parametric space is possible. In view of the fact that sufficiency and group invariance are both successful in reducing the dimension of statistical problems, one is naturally interested in knowing whether both principles can be used simultaneously and, if so, in what order. Hall, Wijsman and Ghosh (1965) have shown that under certain conditions this reduction can be carried out by using both principles simultaneously, and that the order in which they are used is immaterial in such cases. However, one can avoid verifying these conditions by replacing the sample space by the space of the sufficient statistic and then using group invariance on the space of the sufficient statistic. In this monograph we treat only multivariate problems, where the reduction in dimension is very significant.

In what follows we use the term invariance to indicate group invariance. In statistics the term invariance is used in the mathematical sense, to denote a property that remains unchanged (invariant) under a group of transformations. In actual practice many statistical problems possess such a property. As in other branches of the applied sciences, it is a generally accepted principle in statistics that if a problem with a unique solution is invariant under a group of transformations, then the solution should also be invariant under it. This notion has an old origin in the statistical sciences. Apart from this natural justification for the use of invariant decision rules, the unpublished work of Hunt and Stein towards the end of the Second World War has given this principle strong support as to its applicability and meaningfulness in proving various optimum properties, such as minimaxity and admissibility, of statistical decision rules. Although a great deal has been written concerning this principle in statistical inference, no great amount of literature exists concerning the problem of discerning whether or not a given statistical problem is actually invariant under a certain group of transformations. Brillinger (1963) gave necessary and sufficient conditions that a statistical problem must satisfy in order that it be invariant under a fairly large class of groups of transformations including Lie groups. In this monograph we treat invariance in the framework of statistical decision rules only.

De Finetti (1964), in his theory of exchangeability, treats invariance of the distribution of sample observations under finite permutations. It provides a crucial link between his theory of subjective probability and the frequency approach to probability. The classical statistical methods take as basic a family of distributions; the true distribution of the sample observations is an unknown member of this family about which statistical inference is required. According to De Finetti's approach no probability is unknown. If x₁, x₂, … are the outcomes of a sequence of experiments conducted under similar conditions, subjective uncertainty is expressed directly by ascribing to the corresponding random variables X₁, X₂, … a known joint distribution. When some of the X's are observed, predictive inference about the others is made by conditioning the original distribution on the observations. De Finetti has shown that these approaches are equivalent when the subjectivist's joint distribution is invariant under finite permutations.

Two other related principles, known in the literature, are the weak invariance and the strong invariance principles. The weak invariance principle is used
to demonstrate the sufficiency of the classical assumptions associated with the weak convergence to stable laws (Billingsley, 1968). This is popularly known as Donsker's theorem (Donsker, 1951). Let X₁, X₂, … be independently distributed random variables with the same mean zero and the same variance σ², and let

\[ S_j=\sum_{i=1}^{j}X_i,\qquad X_n\!\left(\tfrac{j}{n}\right)=\frac{S_j}{\sigma\sqrt{n}},\quad j=1,\dots,n\,. \]

Donsker proved that {X_n(t)} converges weakly to Brownian motion. The strong invariance principle has been introduced to prove strong convergence results (Tusnady, 1977). Here the term invariance is used in the sense that if X₁, X₂, … are independently distributed random variables with the same mean 0 and the same variance σ², and if h is a continuous function on [0, 1], then the limiting distribution of h(X_i) does not depend on any other property of X_i.

0.1. Examples

We now give an example to show how the solution to a statistical problem can be obtained through direct application of group theoretic results.
Example 0.1.1. Let X_α = (X_{α1}, …, X_{αp})′, α = 1, …, N (> p), be independently and identically distributed p-variate normal vectors with the same mean μ = (μ₁, …, μ_p)′ and the same positive definite covariance matrix Σ. The parametric space Ω is the space of all (μ, Σ). The problem of testing H₀ : μ = 0 against the alternatives H₁ : μ ≠ 0 remains unchanged (invariant) under the full linear group G_l(p) of p × p nonsingular matrices g transforming each X_i → gX_i, i = 1, …, N. Let

\[ \bar X=\frac{1}{N}\sum_{\alpha=1}^{N}X_\alpha,\qquad S=\sum_{\alpha=1}^{N}(X_\alpha-\bar X)(X_\alpha-\bar X)'\,. \tag{0.1} \]

It is well-known (Giri, 1977) that (X̄, S) is sufficient for (μ, Σ). The induced transformation on the space of (X̄, S) is given by

\[ (\bar X,S)\to(g\bar X,\,gSg'),\qquad g\in G_l(p)\,. \tag{0.2} \]

Since this transformation permits arbitrary changes of X̄, S, and any reasonable statistical test procedure should not depend on any such arbitrary change by g, we conclude that a reasonable statistical test procedure should depend on (X̄, S) only through

\[ T^2=N(N-1)\,\bar X'S^{-1}\bar X\,. \tag{0.3} \]
It is well-known (Giri, 1977) that the distribution of T² is given by

\[ f_{T^2}(t^2\mid\delta^2)=\sum_{j=0}^{\infty}e^{-\delta^2/2}\,\frac{(\delta^2/2)^j}{j!}\;\frac{\Gamma\!\left(\tfrac{N}{2}+j\right)}{\Gamma\!\left(\tfrac{N-p}{2}\right)\Gamma\!\left(\tfrac{p}{2}+j\right)}\;\frac{1}{N-1}\left(\frac{t^2}{N-1}\right)^{\frac{p}{2}+j-1}\left(1+\frac{t^2}{N-1}\right)^{-\left(\frac{N}{2}+j\right)},\quad t^2>0\,, \tag{0.4} \]

where δ² = Nμ′Σ⁻¹μ. Under H₀, δ² = 0 and under H₁, δ² > 0.
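The series in (0.4) is rarely needed for computation: under H₀ the statistic T²(N − p)/((N − 1)p) has the central F distribution with (p, N − p) degrees of freedom, which is how the test is carried out in practice. The following Python sketch (our own illustration, not part of the original text; the function name and the use of NumPy/SciPy are our choices) computes T² of (0.3) and its p-value through this relation.

```python
import numpy as np
from scipy import stats

def hotelling_t2_test(X):
    """T^2 of (0.3) for H0: mu = 0, with X an N x p data matrix (N > p)."""
    N, p = X.shape
    xbar = X.mean(axis=0)
    # S is the matrix of corrected sums of squares and products, as in (0.1).
    D = X - xbar
    S = D.T @ D
    t2 = N * (N - 1) * xbar @ np.linalg.solve(S, xbar)
    # Under H0, T^2 (N - p) / ((N - 1) p) ~ F(p, N - p).
    f_stat = t2 * (N - p) / ((N - 1) * p)
    return t2, stats.f.sf(f_stat, p, N - p)

rng = np.random.default_rng(1)
t2, p_value = hotelling_t2_test(rng.normal(size=(20, 3)))
```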
Applying the Neyman-Pearson lemma we conclude from (0.4) that the uniformly most powerful test based on T² of H₀ against H₁ is the well-known Hotelling's T² test, which rejects H₀ for large values of T².

Note. In this problem the dimension of Ω is p + p(p + 1)/2 = p(p + 3)/2, which is also the dimension of the space of (X̄, S). For the distribution of T² the parameter δ² is a scalar quantity.

One main reason for the intuitive appeal that, for an invariant problem with a unique solution, the solution should be invariant, is probably the belief that there should be a unique way of analysing a collection of statistical data. As a word of caution we should point out that in cases where the use of an invariant decision rule conflicts violently with the desire to make a correct decision or to have a smaller risk, it must be abandoned. We give below one such example, which is due to Charles Stein as reported by Lehmann (1959, p. 338).
Example 0.1.2. Let X = (X₁, …, X_p)′, Y = (Y₁, …, Y_p)′ be independently distributed normal p-vectors with the same mean 0 and positive definite covariance matrices Σ, δΣ respectively, where δ is an unknown scalar constant. Consider the problem of testing H₀ : δ = 1 against H₁ : δ > 1. This problem is invariant under G_l(p), transforming X → gX, Y → gY, g ∈ G_l(p). Since this group is transitive (see Chapter 1) on the space of values of (X, Y) with probability one, the uniformly most powerful invariant test of level α under G_l(p) is the trivial test φ(X, Y) = α, which rejects H₀ with constant probability α for all values (x, y) of (X, Y). Hence the maximum power that can be achieved over the alternatives H₁ by any invariant test under G_l(p) is also α. But the test which rejects H₀ whenever

\[ \frac{\sum_{i=1}^{p}Y_i^2}{\sum_{i=1}^{p}X_i^2}\ \ge\ C\,, \tag{0.5} \]

where the constant C depends on the level α, has strictly increasing power β(δ) whose minimum over the set δ ≥ δ₁ > 1 is β(δ₁) > β(1) = α. For more discussions and results refer to Giri (1983a, 1983b).
Exercises

1. Let X₁, …, X_n be independently and identically distributed normal random variables with the same mean θ and the same unknown variance σ², and let H₀ : θ = 0 and H₁ : θ ≠ 0.
(a) Find the largest group of transformations which leaves the problem of testing H₀ against H₁ invariant.
(b) Using the group theoretic notion, show that the two-sided Student t-test is uniformly most powerful among all tests based on t.

2. Univariate General Linear Hypothesis. Let X₁, …, X_n be independently distributed normal random variables with E(X_i) = θ_i, Var(X_i) = σ², i = 1, …, n. Let Ω be the linear coordinate space of dimension n and let Π_{H₀} and Π_{H₁} be two linear subspaces of Ω such that dim Π_{H₀} = k and dim Π_{H₁} = l, l > k. Consider the problem of testing H₀ : θ = (θ₁, …, θ_n)′ ∈ Π_{H₀} against the alternatives H₁ : θ ∈ Π_{H₁}.
(a) Find the largest group of transformations which leaves the problem invariant.
(b) Using the group theoretic notions, show that the usual F-test is uniformly most powerful for testing H₀ against H₁.

3. Let X₁, …, X_n be independently distributed normal random variables with the same mean θ and the same variance σ². For testing H₀ : σ² = σ₀² against H₁ : σ² = σ₁² < σ₀², find the uniformly most powerful invariant test using the group theoretic notions.
References

1. P. Billingsley, Convergence of Probability Measures, Wiley, New York, 1968.
2. D. Brillinger, Necessary and sufficient conditions for a statistical problem to be invariant under a Lie group, Ann. Math. Statist. 34 (1963), 492-500.
3. B. De Finetti, Studies in Subjective Probability, H. E. Kyburg and H. E. Smokler, eds., Wiley, New York, (1964), 93-158.
4. M. Donsker, An invariance principle for certain probability limit theorems, Mem. Amer. Math. Soc. 6 (1951).
5. N. Giri, Hunt-Stein Theorem, Encyclopedia of Statistical Sciences, Vol. 3, Kotz-Johnson, eds., Wiley, New York, 1983a, 686-689.
6. N. Giri, Invariance Concepts in Statistics, Encyclopedia of Statistical Sciences, Vol. 4, Kotz-Johnson, eds., Wiley, New York, 1983b, 219-224.
7. N. Giri, Multivariate Statistical Inference, Academic Press, New York, 1977.
8. W. J. Hall, R. A. Wijsman and J. K. Ghosh, The relationship between sufficiency and invariance with application in sequential analysis, Ann. Math. Statist. 36 (1965), 575-614.
9. E. L. Lehmann, Testing Statistical Hypotheses, Wiley, New York, 1959.
10. G. Tusnady, in Recent Developments in Statistics, J. R. Barra, F. Brodeau, G. Romier and B. Van Cutsem, eds., North-Holland, Amsterdam, (1977), 289-300.
Chapter 1

MATRICES, GROUPS AND JACOBIANS

1.0. Introduction

The study of group invariance requires knowledge of matrix algebra and group theory. We present here some basic results on matrices, groups and Jacobians without proofs. Proofs can be obtained from Giri (1993, 1996) or any textbook on these topics.

1.1. Matrices

A p × q matrix C = (c_ij) is a rectangular array of real numbers c_ij, written as

\[ C=\begin{pmatrix}c_{11}&\cdots&c_{1q}\\ \vdots&&\vdots\\ c_{p1}&\cdots&c_{pq}\end{pmatrix}, \tag{1.1} \]
where c_ij is the element in the ith row and the jth column. The transpose C′ of C is the q × p matrix obtained by interchanging the rows and the columns of C. If q = p, C is called a square matrix of order p. If q = 1, C is a p × 1 column vector, and if p = 1, C is a 1 × q row vector. A square matrix C is symmetric if C = C′. A square matrix C = (c_ij) of order p is a diagonal matrix D(c₁₁, …, c_pp) with diagonal elements c₁₁, …, c_pp if all off-diagonal elements of C are zero.

A diagonal matrix with unit diagonal elements is an identity matrix and is denoted by I. Sometimes it will be necessary to write it as I_k to denote an identity matrix of order k.
A square matrix C = (c_ij) of order p is a lower triangular matrix if c_ij = 0 for j > i. The determinant of a lower triangular matrix is det C = ∏_{i=1}^p c_ii. We shall also write det C = |C| for convenience. A square matrix C = (c_ij) of order p is an upper triangular matrix if c_ij = 0 for i > j, and again det C = ∏_{i=1}^p c_ii.

A square matrix C of order p is nonsingular if det C ≠ 0. If det C = 0 then C is a singular matrix. A nonsingular matrix C of order p is orthogonal if CC′ = C′C = I. The inverse of a nonsingular matrix C of order p is the unique matrix C⁻¹ such that CC⁻¹ = C⁻¹C = I. From this it follows that det C⁻¹ = (det C)⁻¹.

A square matrix C = (c_ij) of order p, or the associated quadratic form x′Cx = Σ_i Σ_j c_ij x_i x_j, is positive definite if x′Cx > 0 for all x = (x₁, …, x_p)′ ≠ 0. If C is positive definite, then C⁻¹ is positive definite and, for any nonsingular matrix A of order p, ACA′ is also positive definite.
1.1.1. Characteristic roots and vectors

The characteristic roots of a square matrix C of order p are given by the roots of the characteristic equation

\[ \det(C-\lambda I)=0\,, \tag{1.2} \]

where λ is real. As det(θCθ′ − λI) = det(C − λI) for any orthogonal matrix θ of order p, the characteristic roots of C remain invariant under the transformation C → θCθ′. A vector x = (x₁, …, x_p)′ ≠ 0 satisfying

\[ (C-\lambda I)x=0 \tag{1.3} \]

is a characteristic vector of C corresponding to its characteristic root λ. If x is a characteristic vector of C corresponding to its characteristic root λ, then any scalar multiple ax, a ≠ 0, is also a characteristic vector of C corresponding to λ.

Some Results on Characteristic Roots and Vectors

1. The characteristic roots of a real symmetric matrix are real.
2. The characteristic vectors corresponding to distinct characteristic roots of a symmetric matrix are orthogonal.
3. The characteristic roots of a symmetric positive definite matrix C are all positive.
4. Given any real square symmetric matrix C of order p, there exists an orthogonal matrix θ of order p such that θCθ′ is a diagonal matrix D(λ₁, …, λ_p), where λ₁, …, λ_p are the characteristic roots of C. Hence det C = ∏_{i=1}^p λ_i and tr C = Σ_{i=1}^p λ_i. Note that tr C is the sum of the diagonal elements of C.
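Result 4 is easy to check numerically. The sketch below is an illustration we add (using NumPy's `eigh` routine for symmetric matrices): it diagonalizes a random symmetric C and confirms det C = ∏ λ_i and tr C = Σ λ_i.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4))
C = A + A.T                      # a real symmetric matrix of order p = 4

lam, theta = np.linalg.eigh(C)   # columns of theta are orthonormal characteristic vectors
# theta' C theta is diagonal with the characteristic roots on the diagonal.
assert np.allclose(theta.T @ C @ theta, np.diag(lam), atol=1e-10)
assert np.isclose(np.linalg.det(C), np.prod(lam))
assert np.isclose(np.trace(C), lam.sum())
```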
1.1.2. Factorization of matrices

In the sequel we shall frequently use the following factorizations of matrices.

For every positive definite matrix C of order p there exists a nonsingular matrix A of order p such that C = AA′ and, hence, there exists a nonsingular matrix B (B = A⁻¹) of order p such that BCB′ = I.

Given a symmetric nonsingular matrix C of order p, there exists a nonsingular matrix A of order p such that

\[ ACA'=\begin{pmatrix}I&0\\0&-I\end{pmatrix}, \tag{1.4} \]

where the order of I is equal to the number of positive characteristic roots of C and the order of −I is equal to the number of negative characteristic roots of C.

Given a symmetric positive definite matrix C of order p, there exists a nonsingular lower triangular matrix A (an upper triangular matrix B) of the same order p such that

\[ C=AA'=B'B\,. \tag{1.5} \]

Cholesky Decomposition. For every positive definite matrix C there exists a unique lower triangular matrix D with positive diagonal elements such that C = DD′.
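NumPy exposes the Cholesky factor directly. A small check (our own illustration) that the factor is lower triangular with positive diagonal elements and reproduces C:

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.normal(size=(5, 5))
C = A @ A.T + 5 * np.eye(5)      # positive definite by construction

D = np.linalg.cholesky(C)        # the unique lower triangular factor
assert np.allclose(D, np.tril(D))          # lower triangular
assert np.all(np.diag(D) > 0)              # positive diagonal elements
assert np.allclose(D @ D.T, C)             # C = D D'
```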
Partitioned matrices
Let C — ( c y ) be a p x q matrix and let C be partitioned into submatrices
Cij as C = (^
n
\Cai where C
u
-
^ C22 /
(cij)(i = l , . . . , m ; j = l,...,n)\C
12
- (ftj)(» = l , . . . , m ; jf =
10
n+
Group Invariance in Statistical
l,...,q);C
Inference
= ( c y ) ( t = m+
2 l
l,...,p; j = l,...,n);C
2 2
=
=
m + l j . . , ,p; 3' = J l + 1 , . . . . , jjj. For any square matrix 1
C — f^ ' \C2i where Cn,C 2
\ C22 /
are square matrices and C 2 is nonsingular
2
2
(1) det ( C ) = det ( C
2 2
) det ( C n
CaC^Cji),
(2) C is positive definite if and only if C ,C U
C22-.Cn — C\ C C \ 2
Let C
-
1
22
22
- C iG^C2
or equivalently
n
are positive definite.
2
= B be similarly partitioned into submatrices B ,i,j i:i
= 1,2.
Then 1
C^, — Bu - B12B22B21, C^C
12
C22 — B22 -
= -B B^.
B iB 'Bi , 2
ll
2
(1.6)
l2
1.2. Groups

A group G is a set with an operation τ satisfying the following axioms.

A₁. For any two elements a, b ∈ G, aτb ∈ G.
A₂. For any three elements a, b, c ∈ G, (aτb)τc = aτ(bτc).
A₃. There exists an element e (the identity element) such that for all a ∈ G, aτe = a.
A₄. For any a ∈ G there exists an element a⁻¹ (the inverse element) such that aτa⁻¹ = e.

In what follows we will write, for convenience of notation, aτb = ab; the reader should not confuse this with the arithmetic product ab. A group G is abelian if for any pair of elements a, b belonging to G, ab = ba. A non-empty subset H of G is a subgroup if the restriction of the group operation τ to H satisfies the axioms A₁, …, A₄.

Examples of Groups
Example 1.1. Let X be a set and G be the set of all one-to-one mappings g : X → X; that is, g(x) = g(y), x, y ∈ X, implies x = y, and for every y ∈ X there exists x ∈ X such that y = g(x). With the group operation defined by g₁g₂(x) = g₁(g₂(x)), g₁, g₂ ∈ G, G forms a permutation group.
Example 1.2. The additive group of real numbers is the set of all reals with the group operation ab = a + b. The multiplicative group of nonzero reals is the set of all nonzero reals with the group operation ab = a multiplied by b.

Example 1.3. Let X be a linear space of dimension n. Define, for x₀ ∈ X, g_{x₀}(x) = x + x₀, x ∈ X. The set of all {g_{x₀}} forms an additive abelian group and is called the translation group.

Example 1.4. Let X be a linear space of dimension n and let G_l(n) be the set of all nonsingular linear transformations of X onto X. G_l(n), with matrix multiplication as the group operation, is called the full linear group.

Example 1.5. The affine group is the set of pairs (g, x), x ∈ X, g ∈ G_l(n), with the group operation defined by (g₁, x₁)(g₂, x₂) = (g₁g₂, g₁x₂ + x₁), with g₁, g₂ ∈ G_l(n) and x₁, x₂ ∈ X. The identity element is (1, 0) and the inverse of (g, x) is (g⁻¹, −g⁻¹x).

Example 1.6. The set of all nonsingular lower triangular matrices of order p, with the usual matrix multiplication as the group operation, forms a group G_T(p) (with the identity matrix as the unit element). The set of all nonsingular upper triangular matrices of order p forms the group G_UT(p), again with the identity matrix as the unit element.

Example 1.7. The set of all orthogonal matrices θ of order p, with matrix multiplication as the group operation, forms a group O(p).
k
0(p).
= X be a strictly increasing
sequence of linear subspaces of X and let G be a subgroup of Gi(p) such that g G Gi{p) if and only if gXi = Xi, i = 1 , . . . , k. T h e group G is the group of nonsingular linear transformations on X to X which leaves Xi invariant. Choose a basis x^\ ... , x $ , t = 1 , . . . , k for Xi-Xi-i
with n, — dim (X,) -
dim (3t,_i) and XQ —
(null set). Then g G G can be written as
9 =
/ 9(U) 9(12) 0 9(22) \
0
0
fl(lJfc) \
9<2fc)
(1.7)
12
Group /nuariance in Statistical
inference
where g/*^ is a block of Bj rows and rij columns for i < j . I f n; = 1 for all =
i,G
GUT-
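A quick computational illustration of Example 1.6 (our own sketch, in Python): products and inverses of nonsingular lower triangular matrices stay lower triangular, which is exactly the closure needed for G_T(p) to be a group.

```python
import numpy as np

rng = np.random.default_rng(6)

def random_lower_triangular(p):
    g = np.tril(rng.normal(size=(p, p)))
    np.fill_diagonal(g, rng.uniform(1.0, 2.0, size=p))  # nonzero diagonal => nonsingular
    return g

g, h = random_lower_triangular(4), random_lower_triangular(4)
assert np.allclose(g @ h, np.tril(g @ h))                        # closure under the operation
assert np.allclose(np.linalg.inv(g), np.tril(np.linalg.inv(g)))  # closure under inversion
assert np.isclose(np.linalg.det(g), np.diag(g).prod())           # det = product of diagonals
```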
1.3. Homomorphism, Isomorphism and Direct Product

Let G, H be groups. A mapping f of G into H is called a homomorphism if, for g₁, g₂ ∈ G,

\[ f(g_1g_2)=f(g_1)*f(g_2)\,, \tag{1.8} \]

where * is the group operation in H. If e is the identity element in G, then f(e) is the identity element in H and f(g⁻¹) = [f(g)]⁻¹. If, in addition, f is a one-to-one mapping, then f is called an isomorphism.

The Cartesian product G × H of groups G and H, with the operation (g₁, h₁)(g₂, h₂) = (g₁g₂, h₁h₂), g₁, g₂ ∈ G, h₁, h₂ ∈ H, is called the direct product of G and H.

A subgroup H of G is a normal subgroup of G if for all h ∈ H and g ∈ G, ghg⁻¹ ∈ H, or equivalently if gHg⁻¹ = H.

Example 1.9. The group G_T⁺(p) of lower triangular matrices of order p with positive diagonal elements is a normal subgroup of G_T(p). Any subgroup of an abelian group is normal.

Let G be a group and let H be a subgroup of G. The set G/H of all sets of the form gH, g ∈ G, is called the quotient group of G modulo H, with group operation defined by: for g₁, g₂ ∈ G, (g₁H)(g₂H) is the set of elements obtained by multiplying all elements of g₁H with all elements of g₂H. The identity element is H and (gH)⁻¹ = g⁻¹H.

1.4. Topological Transitive Groups

Let X be a set and A a collection of subsets of X. (X, A) is called a topological space if A satisfies the following axioms:

TA₁: X ∈ A.
TA₂: The union of an arbitrary subfamily of A belongs to A.
TA₃: The intersection of a finite subfamily of A belongs to A.

The elements of A are called open sets. If, in addition, the topological space (X, A) satisfies: for x, y ∈ X, x ≠ y, there exist two open sets A, B belonging to A such that A ∩ B = ∅, x ∈ A and y ∈ B, then (X, A) is a Hausdorff space.

If a group G has a topology τ defined on it such that under τ the mappings ab from G × G into G and a⁻¹ from G into G (a, b ∈ G) are continuous, then (G, τ) is a topological group. A compact group is a topological group which is a compact space. A locally compact group is a topological group which is a locally compact space. A compact space is a Hausdorff space X such that every open covering of X can be reduced to a finite subcovering. A locally compact space is a Hausdorff space such that every point has at least one compact neighborhood.

Let X be a set. The group G operates from the left on X if there exists a function on G × X into X, whose value at (g, x) is denoted by gx, such that (i) ex = x for all x ∈ X, where e is the identity element of G; (ii) g₂(g₁x) = (g₂g₁)x. This implies that each g ∈ G is one-to-one on X into X.

Let G operate from the left on X. G operates transitively on X if for every x₁, x₂ ∈ X there exists a g ∈ G such that gx₁ = x₂.

Example 1.10. Let X be a linear space. The full linear group operates transitively on X − {0}.

Example 1.11. The group O(n) of n × n orthogonal matrices operates transitively on the set of all n × p (n ≥ p) real matrices x satisfying x′x = I_p.

Example 1.12. The linear group G_l(p) acts transitively on the set of all p × p positive definite matrices s, transferring s to gsg′, g ∈ G_l(p).

1.5. Jacobians

Let X₁, …, X_n be a sequence of continuous random variables with pdf f_{X₁,…,X_n}(x₁, …, x_n) and let Y_i = g_i(X₁, …, X_n), i = 1, …, n, be a set of continuous one-to-one transformations of X₁, …, X_n. Assume that the functions g₁, …, g_n have continuous partial derivatives with respect to x₁, …, x_n. Denote the inverse functions by X_i = h_i(Y₁, …, Y_n), i = 1, …, n. Let J be the determinant of the n × n matrix of partial derivatives

\[ J=\det\begin{pmatrix}\dfrac{\partial X_1}{\partial Y_1}&\cdots&\dfrac{\partial X_1}{\partial Y_n}\\ \vdots&&\vdots\\ \dfrac{\partial X_n}{\partial Y_1}&\cdots&\dfrac{\partial X_n}{\partial Y_n}\end{pmatrix}. \tag{1.9} \]

J is called the Jacobian of the transformation X₁, …, X_n to Y₁, …, Y_n. The pdf of Y₁, …, Y_n is given by

\[ f_{Y_1,\dots,Y_n}(y_1,\dots,y_n)=f_{X_1,\dots,X_n}\big(h_1(y_1,\dots,y_n),\dots,h_n(y_1,\dots,y_n)\big)\,|J|\,, \]

where |J| is the absolute value of J. We now state some results on Jacobians without proof. We refer to Giri (1993), Olkin (1962) and Roy (1957) for further results and proofs.

Some Results on Jacobians

Let X = (X₁, …, X_p)′, Y = (Y₁, …, Y_p)′ ∈ E^p. The Jacobian of the transformation X → Y = gX, g ∈ G_l(p), is (det g)⁻¹.

Let X, Y be p × n matrices. The Jacobian of the transformation X → Y = gX, g ∈ G_l(p), is (det g)⁻ⁿ.

Let X, Y be p × n matrices and let A ∈ G_l(p), B ∈ G_l(n). The Jacobian of the transformation X → Y = AXB⁻¹ is (det A⁻¹)ⁿ (det B)^p.

Let g, h ∈ G_T(p). The Jacobian of the transformation g → hg is ∏_{i=1}^p (h_ii)^{−i}, where h = (h_ij), and the Jacobian of the transformation g → gh is ∏_{i=1}^p (h_ii)^{−(p−i+1)}.

Let g, h ∈ G_UT(p). The Jacobian of the transformation g → hg is ∏_{i=1}^p (h_ii)^{−(p−i+1)}, where h = (h_ij), and the Jacobian of the transformation g → gh is ∏_{i=1}^p (h_ii)^{−i}.

Let G_BT(p) be the multiplicative group of p × p lower triangular nonsingular matrices in the block form, i.e. for g ∈ G_BT(p),

\[ g=\begin{pmatrix} g_{(11)}&0&\cdots&0\\ g_{(21)}&g_{(22)}&\cdots&0\\ \vdots&&&\vdots\\ g_{(k1)}&g_{(k2)}&\cdots&g_{(kk)} \end{pmatrix}, \]

where the g_{(ii)} are submatrices of order p_i × p_i such that Σ_{i=1}^k p_i = p. For g, h ∈ G_BT(p) the Jacobian of the transformation g → hg is ∏_{i=1}^k [det h_{(ii)}]^{−σ_i}, where σ_i = Σ_{j=1}^i p_j, and the Jacobian of the transformation g → gh is ∏_{i=1}^k [det h_{(ii)}]^{−(p−σ_{i−1})}, with σ₀ = 0.

Let G_BUT(p) be the group of nonsingular upper triangular matrices of order p in the block form, i.e. for g ∈ G_BUT(p),

\[ g=\begin{pmatrix} g_{(11)}&g_{(12)}&\cdots&g_{(1k)}\\ 0&g_{(22)}&\cdots&g_{(2k)}\\ \vdots&&&\vdots\\ 0&0&\cdots&g_{(kk)} \end{pmatrix}, \]

where the g_{(ii)} are submatrices of order p_i satisfying Σ_{i=1}^k p_i = p. For g, h ∈ G_BUT(p) the Jacobian of the transformation g → gh is ∏_{i=1}^k [det h_{(ii)}]^{−σ_i} and that of g → hg is ∏_{i=1}^k [det h_{(ii)}]^{−(p−σ_{i−1})}, with σ_i = Σ_{j=1}^i p_j, σ₀ = 0.

Let S be a symmetric positive definite matrix of order p. The Jacobian of the transformation S → gSg′, g ∈ G_l(p), is [det g]^{−(p+1)}, and that of S → S⁻¹ is (det S)^{p+1}.

Let S be a symmetric positive definite matrix of order p and let g be the unique lower triangular matrix such that S = gg′. The Jacobian of the transformation g → S = gg′ is 2^{−p} ∏_{i=1}^p (g_ii)^{−(p−i+1)}, with g = (g_ij).

Exercises

1. Show that for any nonsingular symmetric matrix A of order p and non-null p-vectors x, y:
(a) (A + xy′)⁻¹ = A⁻¹ − (A⁻¹xy′A⁻¹)/(1 + y′A⁻¹x);
(b) x′(A + xx′)⁻¹x = (x′A⁻¹x)/(1 + x′A⁻¹x);
(c) |A + xx′| = |A|(1 + x′A⁻¹x).

2. Let A, B be p × q and q × p matrices respectively. Show that |I_p + AB| = |I_q + BA|.

3. Show that for any lower triangular matrix C the diagonal elements are its characteristic roots.

4. Let X be the set of all n × p real matrices X satisfying X′X = I_p. Show that the group O(n) of orthogonal matrices θ of order n, which transform X → θX, acts transitively on X.

5. Let S be the set of all symmetric positive definite matrices s of order p. Show that G_l(p), which transforms s → gsg′, g ∈ G_l(p), acts transitively on S.

References

1. N. Giri, Multivariate Statistical Inference, Academic Press, New York, 1977.
2. N. Giri, Introduction to Probability and Statistics, 2nd Edition (Expanded), Marcel Dekker, New York, 1993.
3. N. Giri, Multivariate Statistical Analysis, Marcel Dekker, New York, 1996.
4. I. Olkin, Note on the Jacobian of certain transformations useful in multivariate analysis, Biometrika, 40 (1962), 43-46.
5. S. N. Roy, Some Aspects of Multivariate Analysis, Wiley, New York, 1957.
Chapter 2

INVARIANCE

2.0. Introduction

Invariance is a mathematical term for symmetry. In practice many statistical problems involving testing of hypotheses exhibit symmetries, which impose additional restrictions on the choice of appropriate statistical tests: the tests must exhibit the same kind of symmetry as is present in the problem. In this chapter we shall consider the principle of invariance for statistical testing problems only. For additional references the reader is referred to Lehmann (1959), Ferguson (1967) and Nachbin (1965).

2.1. Invariance of Distributions
Let (X, A) be a measure space and Ω the set of points θ, that is, Ω = {θ}. Consider a function P on Ω to the set of probability measures on (X, A) whose value at θ is P_θ. Let G be a group of transformations operating from the left on X such that, for g ∈ G,

\[ g:\mathcal X\to\mathcal X\ \text{is }(\mathcal A,\mathcal A)\text{ measurable}; \tag{2.1} \]

\[ \bar g:\Omega\to\Omega \tag{2.2} \]

is such that if X has distribution P_θ, then gX, X ∈ X, has distribution P_{θ*}, where θ* = ḡθ ∈ Ω. All transformations considered in connection with invariance will be taken for granted as one-to-one from X onto X. An equivalent way of stating (2.2) is as follows:
Invariance
P {gX e
e A) = P- (X g6
€A),Ae
A,
I7
(2.2a)
or 1
P {g- A)
=
9
P {A)
(2.2b)
se
This can also be written as
Pt(B) = P »{gB), s
If all Pe are distinct, that is, 81 / 82 Pe,
Be A PH*
t
n
e
n
9*
S a
homomorphism.
The condition (2.2a) is often known as the condition of invariance of the distribution. Very often in statistical problems there exists a measure A such that Pa is absolutely continuous with respect to A for all 8 S fi so that p$ is the corresponding probability density function with respect to the measure A. I n other words, we can write
(2.3)
Also in great many cases of interest, it is possible to choose the measure A such that it is invariant under G, viz
\(A) = \[gA)
for all A G A,
geG.
(2.4)
D e f i n i t i o n 2 . 1 . 1 . A measure A satisfying (2.4) is left invariant. In such cases the condition of invariance of distribution under G reduces to Pgs(gx) = pe(x)
for alia; e X, g e G.
(2.5)
The general theory of invariant measures in a large class of topological groups was first given by Haar (1933).
However, the results we need were
all known by the end of the nineteenth century. I n the terminology of Haar, Ps(A) is called a positive integral of p«, not necessarily a probability density function. T h e basic result is that for a large class of topological groups there is a left invariant measure, positive on open sets and finite on compact sets and to within multiplication by a positive constant this measure is unique. Haar proved this result for locally compact topological groups.
T h e definition of
right invariant positive integrals, viz. Pe{A) = Pge(Ag) is analagous to that for left invariant positive integrals.
18
Group Invariance in Statistical
Inference
Because of the poineering works of Haar such invariant measures are called invariant (left or right) Haar measures. A rigorous presentation of Haar measure will not be attempted here. T h e reader is referred to Nachbin (1965). It has been shown that such invariant measures do exist, and from a left invariant Haar measure one can construct the right invariant Haar measure. We need also the following measure-theoritic notions for further developments. Let G be a locally compact group and let B be the <7-algebra of compact subsets of G. D e f i n i t i o n 2.1.1. (G,B)
(Relatively left invariant measure). A measure v on
is relatively left invariant with left multiplier x{g) i f f satisfies v{gB) = x(9)v(B),
forallBeB.
(2.5a) +
T h e multiplier x(t») is a continuous homomorphisim from G —» R . is relatively left invariant with left multiplier \(g) -l
x(ff )f(^ff) '
s
a
then \ ( g
- 1
If v
) = l/x(ff) and
e
l ft invariant measure.
D e f i n i t i o n 2.1.2.
A group G acts topologically on the left of x if the
mapping T(g, x) : x x G —» x
1S
continuous.
D e f i n i t i o n 2.1.3. (Proper G-space). Let the group G acts topologically on the left of x
a
n
d
let h be a mapping on G x \ —* X
x
X given by
Hg>x) = {gx,x) for g G G,x C C x
x
€ X-
T h e group G acts properly on \
if for every compact
J
X> / » " { G ) is compact.
T h e space x is called a proper G-space fG acts properly on \ . A n equivalent concept of the proper G-space is the Cartan G-space. D e f i n i t i o n 2.1.4 (Cartan G-space). Let G acts topologically on \ . Then X is called a Cartan G-space if for every x e x there exists a neighborhood V of a such that (V, V ) = {g e G\{gV) n V # 4>} has a compact closure. Wijsman (1967, 1978,1985, 1986) demonstrate the verification of the condition of Cartan G-space and proper actions in a number of multivariate testing problems on covariance structures. E x a m p l e 2.1.1. Let G be a subgroup of the permutation group of a finite s e t * - For anysubset A of x define A( A) = number of points in A. T h e m e a s u r e
Invariance
19
A is an invariant measure under G and is unique upto a positive constant. It is known as counting measure. Example
2.1.2.
L e t x be the Euclidean n space and G the group of
translations defined by i
3
e %g t
e G implying
Xl
g (x)
=x
Xl
Clearly G acts transitively on x-
+
xi.
T h e n-dimensional Lebesgue measure A is
invariant under G. Again A is unique upto a positive multiplicative constant. E x a m p l e 2.1.3. left.
Consider the group Gi(n)
operating on itself from the
Since an element g operating from the left on Gi(n)
Jacobian with respect to this linear operation. 2
space of dimension n . dimension n x n b y
is linear, it has a
Obviously, Gi(n)
is a vector
L e t us make the convention to display a matrix of
column vectors of dimension n x 1, by writing the first
column horizontally. T h e n the second is likewise and so forth. I n this manner a matrix of dimension nxn let g = (gi ),x }
— (x j),y z
n
represents a point in R
— (y ).
. L e t g,x,y
e Gi(n)
and
If we display these matrices by their column
t]
vectors, the relationship y — gx gives us the coordinates of the point
as
Vij =
y^9ikXkj k=l
in terms of the coordinates of {^111 • • ' * ^ l n > " ' • ! X i, n
. . . , Xjin) -
The Jacobian is thus the determinant of the n (9 0
0 g
\0
0
... ...
2
2
x n
0^ 0
9>
where each 0 indicates null matrix of dimension nxn. n
the transformation y = gx is (detg} . dX(y)
matrix
Hence the Jacobian of
Let us now define for y e -
(det T / ) "
Gt{n)
20
Group Invariance in Statistical Inference
Then n
dX(gy) =
(det g) Y[dVij (det(gy))"
(det 3/)"
= dX(y) Hence A is an invariant measure on G ; ( n ) under (?/(n). It is also unique upto a positive multiplicative constant. E x a m p l e 2.1.4. Let G be a group o f n x n
nonsingular lower triangular
matrices with positive diagonal elements and let G operate from the left on itself. We will verify that the Jacobian of the transformation
Q-*hg for
G, is I l i C " ) ' - Thus an invariant measure on G under G which is 1
unique upto a positive multiplicative constant, is
dX{g) =
il>,
n?=i(j?»)' To show that the Jacobian is n r = i ( ^ « ) * > ' kij = 0 for i <j. T h e n
e t
9 -
(fij),^ =
with g
tj
=
This defines a transformation (fti)-(Cy). We can write the Jacobian of this transformation as the determinant of the matrix /dC 3C„ \ dCu dg &921 n
lL
dC
21
dC
2l
dgu
BC \
nn
dgn
dC
21
dgnn
dC
nn
$921
&9nn 1
/nuartonce
21
which is a [n(rc + 1 ) ] / 2 x [ n ( f i - r l ) ] / 2 lower triangular matrix with the diagonal elements -
> ro.
h ,k kk
In other words h%i will appear once as a diagonal element, /122 will appear twice as a diagonal element and finally h 1
will appear n times as a diagonal
nn 1
element. Hence the Jacobian is I I I L i C " ) ' E x a m p l e 2.1.5.
L e t S be the space of all positive definite symmetric
matrices S of dimension n x n . and let Gi(rc) operate on 5 as S^gSg',
SeS,
g 6 G,{n).
(2.6) n+1
To show that the Jacobian of this transformation is (det g) , matrix obtained from the nxn the j t h row; M^C)
let i ? y be the
identity matrix by interchanging the i t h and
be the matrix obtained from the n x n identity matrix by
multiplying the ith row by any non-zero constant C and let A y be the matrix obtained from the nxn
identity matrix by adding the j t h row to the i t h . We
need only show this for the matrices Eij,Mi(C)
and A y . T h i s can be easily
verified by the reader; for example, d e t ( M , ( C ) ) = C and Mi{C)SMi(C) obtained from S = ( 5 y ) by multiplying Sa by C that the Jacobian is C ^ " "
1
= C"
+ 1
2
is
and S y for i ^ j by C so
.
Let us define the measure X on 5 by
d
X
{
b
}
-
(detS)(«W
Clearly, d\(S)
=
dX(gSg'),
so that X is an invariant measure on S under the transformation (2.6). Here also X is unique upto a positive multiplicative constant. T h e group G;(p)
acts transitively on S.
E x a m p l e 2.1.6. L e t H be a subgroup of G. T h e group G acts transitively on the space x = G/H
(the quotient group of H) where group action satisfies
9i(gh)
= 9i9Hi
for
9i e G ,
g€G.
When the group G acts transitively on x, there is a one to one correspondence between x and the quotient space G/H
of any subgroup H of G.
22
Group Invariance in Statistical
Inference
Transformation of variable in abstract integral
2.1.1.
Let (X, A,n)
be a measure space and let (y, C) be a measurable space.
Suppose
Let us define the measure v
(y,C) by 1
v(C) = i{>p- (C)),
CEC.
t
T h e o r e m 2 . 1 . 1 . If f is a real measurable function g[tp(x))
and if f is integrable,
(2.7) on X such that f{x) —
then
I }{x)dti{x)
=
f
Jx
g(
Jx =
f g{y)dv{y) Jy
(2.8)
For a proof of this theorem, the reader is referred to L e h m a n n [(1959), p. 38], E x a m p l e 2.1.6. ( W i s h a r t d i s t r i b u t i o n ) . Let X\,...,
X
n
be n indepen-
dent and identically distributed normal p-vectors with mean 0 and covariance matrix £ (positive definite). Assume n > p so that
is positive definite almost everywhere. T h e joint probability density function of X ...
,X
1}
n
is given by
2
1
2
1
p(x ,...,xJ-(27r)-"P/ (det£- )"/ xexp|-itrS- ^^;| 1
.
(2.9)
Using Theorem 2.1.1, we get for any measurable set A in the space of S
P(S
1
2
e A) — ( 2 J T ) - ' ^ J (det £
s - X^Xixl
= (2^)-^
e A exp
_
1
)
n
/
2
j-itr
(det £
_
1
)
f[dx,, ' i=l
S^sj
n
/
2
exp j - i
(2.10)
tr
dm(s)
Invariance
23
where m is the measure corresponding to the measure v of Theorem 2.1.1. Let us define the measure m' by
To find the distribution of S, which is popularly called Wishart distribution ( W ( S , J I ) ) with parameter E and degrees of freedom n, it is sufficient to find dm'.
T o do this let us first observe the following:
(i) Since E is positive definite symmetric, there exists a € Gi{p)
such that
£ — aa'. (ii) Let
As a
1
,
Xi s
are independently normally distributed with mean vector 0 and
covariance matrix I (identity matrix), 5 is distributed as W(I,n).
P .(S aa
n
6 A) = ( 2 7 r } - ^
2
l
/ JatA
(det(aa')- s))
x exp | - ^ t r ( a a ' )
P (aSa' r
n
€ A) = (27r)- ^
2
/
a a
' ( S e A ) = Pj{aSa
s|
(det((c.a')
x exp j - i t r ( a a ' ) Since P
- 1
l
Thus
n/2
dm'(s);
_1
s))
/a
s | x dm*{a
J
5a
')
€ A ) for all measurable sets A in the space of
5 , we must have for all g £ G J ( J I ) dm'(s)
t
= dm (gsg').
(2.12)
By example 2.1.5 we get
d
m
{ S )
where C is a positive constant.
= (det
'
24
Group Invariance in Statistical
Inference
T h u s the probability density function of the Wishart random variable is given by
1
n
2
W ( E , n ) = C(det £ - } ' ( d e t s )
( n
p
- -
, ) / 2
exp j - ^
tr£
_
1
sj
(2. 13)
where C is the universal constant depending on E . T o evaluate C let us first observe that C is a function of n and p. L e t us write it as C = C , . n p
Since C ^ n
p
is a constant for all values of £ , we can, in particular, take E = J to evaluate C . np
Write Sn
S12
\$2l where S
u
S22
is (p — 1) X (p — 1). Make the transformations
£11 = Tn, S12 — T12, S22 - SuSiiSu
= T22 = Z ( s a y ) ,
which has a Jacobian equal to unity. Using the fact that
\S\ = \Su\ \S22 - S iSi, Si2\
,
l
2
and
J C , |5| n
l n
p
p
1
- - »^exp|-^tr5jds= 1
forallp.n,
we get
1
J
2
C , \ \^-"- >' e^p^~tis^ds n p S
kiil
C
= -.pj
| n
" "
p
p
1 ) / 2
x y"{,)(- -i)/
*
/
e x p
s
{~5 'i
2 s
exp|-itr
2 e x p
1
i"i '
|_I |
5 l 2
z
}
d l l
d 2
»2
S l l
Jd
5 l l
/nuoriance
x /
|sn|
I
C,
m
^
n P
t n
p
" ~
2 j r
l ) / 2
exp
da
j( -iV2 (n- +n/2 P
2
P
r
C
n,(p-1) C
j
n 2
Since C , i = 2- ' /T{n/2),
^
l / 2
2
exp
J da n •
we have
n
1
7 r
( n
11
f n - p + l \ \
",(p- i)|5n|
25
-p(p-l)/22-' /
2 J r
-p(p-l)/2 -np/2 2
2.2. I n v a r i a n c e o f T e s t i n g P r o b l e m s We have seen in the previous section that for the invariance of the statistical distribution, transformation g of g g G must satisfy g9 g fi for 9 g fl.
D e f i n i t i o n 2.2.1. ( I n v a r i a n c e o f p a r a m e t r i c s p a c e ) . T h e parametric space fi remains invariant under a group of transformation G : X (i) for g g G, the induced mapping j on fi satisfies g9e.il
1
A , if
for all # € fi;
(ii) for any 8' £ fi there exists 0 g fi such that g$ = 8'. An equivalent way of writing (i) and (ii) is
fffi = It may be remarked that if P
fi.
(2.14)
for different values of 8 g fi are distinct then
g
g is one-to-one. T h e following theorem will assert that the set of all induced transformations g,g g G also forms a group.
T h e o r e m 2.2.1. Letg g ly
The transformations
g gi 2
2
be two transformations
and g^
1
which leave fi invariant.
defined by
0201 (x) = 02(9i (aO) _. ffi
(2-15) 9iW=x
26
Group /rtuariance in Statistical
Inference
for all x £ X leave SI invariant
and g gi
l
— fftffi, (Si ) — (ffi)
2
1
P r o o f . We know that if the distribution of X is Pg, 0 G fl, the distribution of gX is Pgd.gi € ft. Hence the distribution of g2gi{X) 1
and
leaves fi invariant, g~2g~\6 €
ft.
T h u s g g\ 2
is P g ^ e , as
obviously,
9i gi = 92fli • Similarly the reader may find it instructive to verify the other assertion.
•
L e t us now consider the problem of testing the hypothesis Ho : 6 G Slu
0
against the alternatives Hi : 6 £
ftjj,,
where ft#
0
and Sin,
are two disjoint
subsets of SI. Let G be a group of transformations which operate from the left on X satisfying conditions (i) and (ii) above and (2.14). I n what follows we will assume that g £ G is measurable which ensures that whenever X is a random variable then gX is also a random variable.
Definition 2.2.2. of testing Ho : 8 £ SIH
( I n v a r i a n c e o f s t a t i s t i c a l p r o b l e m ) . T h e problem U
against the alternative H i : 0 <S Sin, remains invariant
under a group of transformations G operating from the left on X if (i) (or g € G, A € A, P (A)
=
e
(ii)
=
ttH ,gSl a
Hl
= Sl
Hl
P- (gA), sa
for all g G G .
2.3. I n v a r i a n c e of S t a t i s t i c a l T e s t s a n d M a x i m a l I n v a r i a n t Let (X,A,
A) be a measure space.
Suppose pg,8
G SI, is the probability
density function of the random variable X G X with respect to the measure A.
Consider a group G operating on X from the left and suppose that the
measure A is invariant with respect to G , that is, \{gA)
= \(A)
for a l l A e
A ? € G .
Let [y,C) be a measurable space and suppose T ( X ) is a measurable mapping from X into X.
Definition 2.3.1.
( I n v a r i a n t f u n c t i o n ) . T ( X ) is an invariant function
on X under G operating from the left on X if T ( X ) = T(gX) G,X€X.
for all g G
Invariance
D e f i n i t i o n 2.3.2.
( M a x i m a l i n v a r i a n t ) . T(X)
function on X under G if T(X) X.
is a maximal invariant
is invariant under G and T(X)
Y £ X implies that there exists g £ G such that Y = If a problem of testing HQ : 8 e f l #
0
27
= T{Y)
for all
gX.
against the alternatives H 8 lt
£ fijjr
remains invariant under the group transformations G, it is natural to restrict attention to statistical tests which are also invariant under G. Let v>(X), X £ X be a statistical test for testing Ho against H i , that is,
a
when x is the observed sample point. If f is invariant under G
then ip must satisfy
xeX.geG
A useful characterization of invariant test f(X) variant T(X)
in terms of the maximal in-
on X is given in the following theorem.
T h e o r e m 2.3.1. Let T(X) ip(X)
(2.16)
is invariant
be a maximal invariant
on X under G. A test
under the group G if and only if there exists a function
for which
h
h(T{X)).
P r o o f . Let
Obviously,
g£G,x£X.
Conversely, if
— T(y);x,y
£ X then there exists
a g £ G such that y — gx and therefore tp(x) — f[y).
•
In general if ip is invariant and measureable, then
but h
may not be a Borel measurable function. However if the range of the maximal invariant T(X)
is Euclidean and T is Borel measurable then Ai — {A : A £
A and gA — A for all g £ G) is the smallest f-field with respect to which T is measurable and ip is invariant if and only if ip(x) — h(T(x))
where h is a Borel
measurable function. T h i s result is essentially due to Blackwell (1956). Let G be the group of induced transformations on f! (induced by the group G). Let v(8),8
€ fi, be a maximal invariant on fi under G.
T h e o r e m 2.3.2.
The distribution
depends on f! only through
of T(X)
[or any function
of
T(X))
i/{9).
P r o o f . Suppose v{9i) = v(8' ),8\,8 2
i
£ fi. T h e n there exists a g £ G such
that f?2 — § # i . Now for any measurable set C
28
Group Invariance in Statistical
Inference
P (T(X)
£ C) = P {T{gX))
tl
G C)
9i
= P- (T(X)
E C )
g9l
• 2.4. S o m e E x a m p l e s of M a x i m a l I n v a r i a n t s E x a m p l e 2 . 4 . 1 . L e t X = {Xu • - • . * n ) ' £ * and let G be the group of translations gc{X)
= (Xi + C,..
.,X
a
+ C)',-oo
A maximal invariant on X under G is T{X) Obviously, it is invariant. Suppose x — (x\,... T(x)
Writing C — y
= T(y).
n
— x
n
< C < oo. = (X\,—X ,..
— X ).
, £ „ ) ' , y = ( j / i , . . . ,y )'
and let
n
n
n
we get yi = Xi + C for i — 1 , . . . , n.
Another maximal invariant is the redundant (X\ — X,...,
X
n
— X) where
X = i 51™ X j . It may be further observed that if n — 1 there is no maximal invariant. T h e whole space forms a single orbit in the sense that given any two points in the space there exists C G G transforming one point into the other. In other words G acts transitively on X. 1
E x a m p l e 2.4.2. Let A be a Euclidean n-space, and let G be the group of scale changes, that is, j? G G, a
9 (X) a
= (aX ,...,aX )', 1
n
a > 0.
E x a m p l e 2.4.3. Let X = X x X where Xi is the p-dimensional Euclidean space and X is the set of all p x p symmetric positive definite matrices. Let XeX S€X . Write Y
2
2
u
2
X -
(X i),...,X ))', 1
( t
where X is d; x 1 and 5 is a\ x d; such that £ i U < * i = P- Let G be the group of all nonsingular lower triangular pxp matrices in the block form, S G G [t)
{ i i )
Invariance (9(H) 3(21)
0 3(22)
\9(ki)
9(k2)
0
29
>
0
9 = •••
9{kk)/
operating as (X,S)->(gX,gSg'). Let
f{X,S) =
{Rl...,Rl),
where £ ^ - % S ® %
*= 1 . - . *
Obviously, R* > 0. L e t us show that Rj,...,
(2-17)
if £ is a maximal invariant on X
under G. We shall prove it for the case k — 2, T h e general case can be proved in a similar fashion. Let us first observe the following: (i) I f X -* gX,
S - » 5g' then
- * 9 ( n ) X i ) , 5(u) -t 9(ii)S(n)0( ). (
U
Thus ( i f j . f i f + R%) is invariant under G. (ii) A " ' 5
_ 1
X - XJ'JJSJ"!,^!) + ( X (5
- 5(ai)5(n)^"(i))' 1
( 2 2 )
- S(21)5 ^ 5(i2))" (X(2) (
Now, suppose that for X,Y (x ' (
( 2 )
1 1
5-! x )
)
e X\\ S,T
1)
.
2
,x'5- x)-(r ' (
1 1
T-; y l
where K, T are similarly partitioned as X, S. exists a.g eG
l )
G A*
1
( 1 )
S(21)S^ \ X( )
such that X = gY, S = gTg'.
1
[ 1 )
,y'r- y),
We must now show that there
L e t us first choose 0
g = ' 3(21)
3(22)
with 3(11) = 3(22) ~ (S(22) ~ 5 ( 2 ) S ^ ) S 2 ) ) " " 1
9(12) = 9(22)5'(21)S " (
11)
(
.
(2.18)
)
( 1
2
30
Group Invariance in Statistical
Inference
T h e n gSg' — I. Similarly choose
h
{
1
1
)
0
h = ( \fy l)
/l(22)
2
such that hTh! = I. Since
implies we conclude that there exists an orthogonal matrix 0\
or order
that Since l
X'S-'X = s-
T
Y'T~ Y,
1
= s's,
1
=
h%
(fl(ii)-^(i))'(»(ii)^(i)) = (fc(u)%))'(tyii)%))> then 2
I|S(21)-^(1) + 9 ( 2 2 | A T ( 2 ) | | = 11/1(211^(1) + ^(22)^(2) IP so that there exists an orthogonal matrix 0
of order d
2
2
x d
2
such that
(9(21)X(i) + 9 ( 2 2 ) X ( ) = 02(/l( i)y"(i) + / l ( 2 2 ) V ( 2 ) ) . 2 )
2
Now let r
=
-
f 0,
0
\
I
o
J
o
2
T h e n , obviously, X = =
g^ThY aY,
where a £ G and gSg' =
rhTh'V,
Invariance 31 so that aTo'
S =
Hence ( R J . f l J + i i j ) and equivalently (J?J, Ji^) *
s
a
maxima! invariant on x
under G. Note that if G is the group of p x p nonsingular upper triangular matrices in the block form, that is, if g € G ( 9{ii)
S(i2)
'•'
9(i*) \
0
9(22)
•'•
9(2*)
0
0
•••
9 = V where
9{kk)/
is di x d{, then a maximal invariant on X under G is (ir,...,n)
defined by / % )
Y^T* =
%,]
• • •
(Xti ,,..,X }' )
-l
(X(i),...
m
,X )i w
= 1 , . . . ,k
2.5. D i s t r i b u t i o n o f M a x i m a l I n v a r i a n t Let (X,A,X)
be a measure space and suppose pg,9 £ 0 , is a probability
density function with respect to A. Consider a group G operating from the left on X and suppose that A is a left invariant measure, \{A) Let [yX)
g £ G,
= X(gA),
A £ A.
be a measurable space and suppose that T{X)
is a measurable
mapping such that T is a maximal invariant on X under G. measure on {y,C)
induced by A, such that X'(C)
1
= X(T- (C)),
CeC
Also let Pi be the probability measure on (y,C)
Pt{C)
=
j JT-i(c)
defined by
Pe(x)dX(x)
!
L e t A* be a
32
Group /n variance in Statistical
so that P
2
Inference
has a density function p" with respect A*. Hence
/
Assumption.
= j
p'{T)d\'{T)
p(x)dX{x),C€C
There exists an invariant probability measure p on the
group G under G. L e m m a 2.5.1.
The density function
p* of the maximal
invariant
T with
respect to A* is given by
P*(T) = J (gx)dp(g).
(2.19)
P
P r o o f . Let / be any measurable function on (y,C}-
j f(T)p'(T)d\-(T)
= J
Then
f(T(x))p(x)d\'x).
However,
j f(T(x))
J
p{g(x))dft(g)dX(x)
= jj = J
HT(x))p(g(x))d\(x)du(g) J
nnx^pigix^dHgix^g)
= j
f(T{y)) {y)d\{y). P
T h e n , since the function /
p(g(x))dfi{g)
JG
is invariant, we have p*{T)
= / p(g(x))M9) Ja
•
Invariance
2.5.1. Existence of an invaraint probability measure (group of p x p orthogonal matrices).
on
33
O(p)
Let X = ( X i , . . . , X ) be a p x p random matrix where the column vecP
tors Xi
(p-vectors) are independent
and are identically distributed normal
(p-vectors) with mean vector 0 and covariance matrix / (identity). Apply the Gram-Schmidt orthogonalization process to the columns of X to obtain the random orthogonal matrix Y = (¥%,....,
Yi-
Define Y = T(X)
Y ),
where
p
g
,
and let G = 0(p).
i =
i ...,p. t
Obviously,
gY =
(gY ...,gY ) u
v
=
T(gX ...,gX )
=
T(X;,...,X;),
u
p
where X;=gXi,
i =
l,.... . P
Since X ; , X " have the same distribution and Xi,...,X*
are independently
normally distributed with mean 0 and covariance matrix / , we obtain Y and gY have the same distribution for each g £ 0(p).
T h u s the probability measure
of Y defined by P[Y is invariant under 2.6.
eC)
= P(X
L
eT~ (C))
0{p),
Applications
Example
2.6.1.
(Non-central Wishart
distribution).
Let X
=
( X ! , . . . , X ) (each X * is a p-vector) be a p x n matrix (n > p) where each n
column Xi is normally distributed with mean vector / i ; and covariance matrix / and is independent of the remaining columns of X . T h e probability density function of X with respect to the pn-dimensional
i&'t$
=
(
2 n
-)np/2
e x p
{~i
t t x
x
Lebesgue measure is
' - | * i W * ' + t*avi'j
(2-19)
34
Group Invariance in Statistical
In/erence
where fi = ( f t , . , . . . , U p ) . It is well known that n
iml
is a maximal invariant under the transformation X -
re
XT,
0{n),
where O ( n ) is the group of orthogonal matrices of dimension n x n . Let p(s, p.) be the probability density function of 5 at the parameter point p. with respect to the induced measure dX'{S)
as defined in L e m m a 2.5.1. T o find the
probability density function of 5 with respect to d we first find dX'{s).
From
(2.13) (
p(s,0)dX'(s) By
p
= C(det s ) " - -
, ) / 2
e x p | - | t r s J da,
(2.20)
(2.18)
(s,o)=
f (2rr,o)d^(r) Join)
P
P
d
-
fH/ „., *
(r)
0
(2ff)»f/
a
rexpi-^trsl , \ 2 P
where / i is the invariant probability measure on O ( n ) . Hence, n
2
dA*(s) = ( 2 T ) " / C ( d e t »f*+-W*it
.
Again by (2.18)
p { a , u ) = exp | -1 tr(s 4- J V ) J jf
= ^ P | - g tr(s
+ W
^ exp j - j t r acT**' J
0j p),
dp{T)
Invariance
35
where h(x,fi)=
j
exp J — - t r a d ? a ' \
Jot,n)
2
I
du(T)
J
It is evident that h(x,u)
=
h(xT,p.
where T, * e 0{n). T h u s we can write h(x,p)
as
The probability density function of S with respect to the Lebesgue measure ds is given by n p
2
p(s,^)dA*(s) = p ( s , ^ ) ( 2 7 r ) / C ( d e t
(
1
/2
s) "-''- > ds.
This is called the probability density function of the noncentral Wishart distribution with non-centrality parameter fia'. E x a m p l e 2.6.2.
( N o n - c e n t r a l d i s t r i b u t i o n of t h e c h a r a c t e r i s t i c
r o o t s ) . Let S of dimension p x p be distributed as central Wishart random variable with parameter £ and degrees of freedom n > p. ••• > R
p
Let Ri
> R
2
>
> 0 be the roots of characteristic equation det(S — XI) — 0, and
let P denote a diagonal matrix with diagonal elements R i , . . . ,R . P
Denote by
p(i?, S ) the probability density function of R at the parameter point E with respect to induce measure dX'(R)
of L e m m a 2.5.1. We know [see, for example,
Giri(1977)] /p
p(R,I)dX-(R)
= C
1
v(n-p-l)/2
{U^J
.
p
.
exp|4X>j
*ll(&-Rj)dR
where dR stands for the Lebesgue measure and C\ is the universal constant. It may be checked that R is a maximal invariant under the transformation
s
r s r ,
r e
o(p).
By (2.18)
p(R,
I) = C
2
(j[
R^j
exp | - \ J £ Ri |
^ dp(T).
36
Group Invariance
in Statistical
Inference
So dX'(R)
=
^-'[[(R -R )dR. 1
]
Again by (2.18) / p 1
p ( R , E ) = (det S " ) " /
2
\
(n-p-l)/2
(n^)
(n-p-l)/2
exp J --tr x / Jotv) I 2 'O(p)
0- TRr'\ l
da(T) J
where t9 is a diagonal matrix whose diagonal elements are the characteristic roots of £ . T h e non-central probability density function of R depends on £ only through its characteristic roots. 2.7. T h e D i s t r i b u t i o n of a M a x i m a l I n v a r i a n t i n t h e General Case Invariant tests depend on the observations only through the maximal invariant. T o find the optimum test we need to find the explicit form of the maximal invariant statistic and its distribution. For many multivariate testing problems it is not always convenient to find the explicit form of the maximal invaraint. Stein (1956) gave the following representation of the ratio of probability densities of a maximal invariant with respect to a group G of transformations g leaving the testing problem invariant. T h e o r e m 2.7.1. Let G be a group operating from the left on a topological space (X, A)
and \
a measure
in x which is left invariant
not necessarily
transitive on X).
densities pi,p
with respect to A such that
2
Pi(A)
Assume
=
under G (G is
that there are two given probability
J P2(x)d\( ), A
x
Invariance
37
for A € A and Pi and P% are absolutely continuous.
Suppose T : X —* X is a
maximat invariant
when X has
Pi.
and p" is the distribution
Then under certain
of T(X}
distribution
conditions
(2.21)
where p, is a left invariant
Haar measure on G. An often used alternative
form
of (2.21) is given by
(2.21a)
where f
is the probability density with respect to a relatively invariant
with multiplier
measure
X(g).
Stein (1956) gave the statements of Theorem 2.7.1. without stating explicitly the conditions under which (2.21) holds. However this representation was used by G i r i (1961, 1964, 1965, 1971) and Schwartz (1967). Schwartz (1967) also gave a set of conditions (rather complicated) which must be satisfied for Stein's representation to be valid. Wijsman (1967, 1969) gave a sufficient condition for (2.21) using the concept of Cartan G-space.
K o e h n (1970) gave
a generalization of the results of W i j s m a n (1967). Bonder (1976) gave some conditions for (2.21) through topological arguments. Anderson (1982) has obtained some results for the validity of (2.21) in terms of the proper action of the group. Wijsman (1985) studied the properness of several groups of transformations used in several multivariate testing problems. Subsequently W i j s m a n (1986) investigated the global cross-section technique for factorization of measures and applied it to the representation of the ratio of densities of a maximal invariant.
2.8. A n I m p o r t a n t M u l t i v a r i a t e D i s t r i b u t i o n Let Xi,...,
XN be N independently, identically distributed p-dimensional
normal random variables with mean £ = ( £ , . . . , £ p ) ' and covariance matrix £ (positive definite). Write N x
=
w E
x
N ' -
s
=
jy x -x%x -xf. i
i
i
38
It
Group Invariance in Statistical
is well-known that
X,S
Inference
are independent
in distribution and X
has
p-dimensional normal distribution with mean £ and covariance matrix (1/jV) E and S has central Wishart distribution with parameter E and degrees of freedom n — N — 1. Throughout this section we will assume that N > p so that S is positive definite almost everywhere. Write for any p-vector b,
where fofl is the d; x 1 subvector of b such that £j
d* — p and for any p x p
matrix A A(n)
•••
A(
l f c
)
N
'An) Am
,A i)
•••
( A
where
A
A
•••
(n)
=
j
(kk)
is a di x dj submatrix of A. L e t us define Ri,...
,R
k
and
ft,...,5*
by t
i
E ^ = ^w2r7,^ , t = l,2,...,fc. W
T h e o r e m 2.8.1. TTte joint probability density function given by (for Ri > 0, £ * = i fl^ < y
m
« w -
r
(
(
*
w
/
2
*
x T j £
n
2
t
)
/ /
of R% . ..,R
k
\
W l - E ^ J
/ J V - <5,_i
dj J L & \
[(W- )/2]-l P
k
is
Invariance where a
— S i = i dj
^ o — 0 and tp(a, b : x) is the confluent
w
a
39
kypergeometric
series given by a +
a{a + 1 ) x X
+
b
2
b(b + 1) 2!
3
a(a + l ) ( a + 2) x +
6(6 + l)(b + 2) 3!"
+
" ' '
Furthermore, the marginal probability density function of Ri,...,Rj,j
< k
can be obtained from (2.22) by replacing A; by j and p by YJ[=I hiproof.
We will first prove the theorem for the case k — 2 and then use
this result for the case k — 3. T h e proof for the general case will follow from these cases. For the case k = 2 consider the random variables W = 5(22) - 5 ( 2 1 ) 5 ^ , 5 ( 1 2 ) ,
f
=
%,).
It is well-known [see, for example, Stein (1956), or G i r i ( l 9 7 7 ) , p. 154] that (i) V is distributed as Wishart W(N
- 1, E ( ) ) ; U
(ii) W is distributed as Wishart W{N
- 1 - di, E 2 ) -
S^ijSmjE^));
[ 2
(hi) the conditional distribution of L given V is multivariate normal with mean £(21)2^,V
- 1
/
2
and covariance matrix (£(22, - E(21)S^ i S(12)}®/ ( , 1
)
(
where ® is the tensor product of two matrices £ ( 2 2 )
—
1
^(21)^(11)^(12) and the
di x d, identity matrix I , • a
(iv) W is independent of Let us now define W\, W
2
W (l 2
+ W ) = N(X L
{2)
(L,V). by
- S
m
S ^
t
i
)
x ( X 2 ) - 5 ( 2 i , 5 ^ , A" (
T ( S ( 1 )
m
- 5
( 3
i 5 )
(
1 )
5(i2 ))
1
(2.24)
).
Since X is independent of 5 , the conditional distribution oi\fNX^2)
given 5 ( j n
and X ( i ) is normal with mean \/~N{£(2) + ^ < 2 i ) ^ ( i i ) £ ( i ) ) and covariance matrix
40
Group Invariance in Statistical
1 1
£{22> ~ E { 8 i > E r i i j E ( i 4 > of y/NXt2)
given 5
( l l
Inference
thereupon follows that the conditional distribution
, and y/NX^f
is independent of the conditional distri-
bution of
given Sni)
and Xay.
5 ( ~ ! ) ^ { i ) give
11
s
Furthermore, the conditional distribution of V W S ( i 2 ) a
(n)
n
(
i
i s
normal with mean V W E ( a i ) S ^ j j ^ i ) and
covariance matrix
Hence the conditional distribution of Viv" ( x <
given 5 ( H )
a n
2 )
-
(i +
W M
_
1
/
2
s
d -^(i) ' normal with mean
(C(2) - S
( 2
1
i)S^i^(i)) (i +
2
Wi)- ?
and covariance matrix £ ( 2 2 ) - ^(21) £ ( i i ) £ ( 1 2 ) -
From Giri (1977) it follows that the conditional distribution of W% given
X^
and S ( u ) is p(W \X ,S ) 2
(1)
=
tn)
xl [Hi
1
+ Wi)" )
/xl- dl
dl
,
(2.25)
where Xn($) denotes the noncentral chisquare random variable with noncentrality parameter 6 and with n degrees of freedom and Xn denotes the central chisquare random variable with n degrees of fredom. distribution depends on Xh-t and Sti\) probability density of WuWz p{W ,Wi) 2
Since this conditional
only through Wi,
we get the joint
as =
2 X 2
l
(6 (l
+ Wi)- )/x .
2
N
d l
.^
(2.26)
and furthermore it is well known that the marginal probability density function of Wi is P(Wi)
2
= X (Si)/x dl
N
ai
•
(2.27)
Invariance
41
Hence, we have
jSI
= 0
fl +
1
r(* + /?)r(V)
i f f t ^ ^ H *
2
JJ3
j !
(2.28)
From L e m m a 2.8.1. below,
W
2
= {(Ri + R ){1 2
- R i - Ra)" 1
x (1 + H j . { l -
Hi)" )"
1
~Hi(l-
1
Ri)~ } (2.29)
1
1
=
fi (l-fli-fi )" . 2
2
From (2.28) and (2.29) the probability density function of Ri
and R
2
is
given by
4I
1
nriW-Jh*
1
x R$ - BI*'- (I-R T
x exp | - - ( f t +6 ) 2
2 x n ^ i ( J V -
f
f
l
+ rfefiji
_ 0 , f ; i i U , )
(2.30)
We now consider the case k = 3. Write W
3
-
1
( N X ' S " : ? - N%S^X{ y)
(l +
3
It is easy to conclude that 5(33, - 5 [ 3 2 j S
[22]
S
l23
j
•1
42
Group /n variance in Statistical
Inference
is distributed as Wishart da,E(33) - E[33]E^jE(23]j ,
w(N-l-d-!-
and is independent of S( 2) and S[ -• 2
Furthermore, the conditional distribution
S2
of \ / i V X ( 3 ) given S[ ]<X[2] is normal with mean 22
( f o ) - E|32]S|2 (-?[2] - £[2])) 2J
and covariance matrix E
S
E(33) - [32]Ep2] [23] . and is independent of the conditional distribution of ^ [M] \~22) W S
S
X
given S[ 2] and X p ] , which is normal with mean Z
and covariance matrix NX
m [~2 2) m S
l
X
S
S
T h u s the conditional distribution of W given (Wi,W ),
S
S
( [33] - (32] [22j [23]) 3
given Sp2j and X [ j or, equivalently 2
is
2
viW^W^Wy)
X%$41
=
+ WiftWi
+ W + 2
WiWsT ) 1
Ixfu-dy-it-di-
(2-31)
Hence the joint probability density function of - — ^
p(Wi,W ,W ) 2
3
,W, W 2
3
is given by
5
'-
XN-di-di-dz
X.N-d,-d?
Aiv-fl,
From L e m m a 2.8.1. below, W =R (1-R )- , 1
1
1
l
1
W
=
W
= {{R, +R
a
3
Ri(l-R -R )- , l
2
2
+ R )(l 3
- R , - R
x (1 - R, - R^-'Xl
- R, -
= R (.1-Rl-R -R )- . 1
3
3
a
2
-
^ p
1
- (R, +
fl ) 2
R) 2
(2.33)
Invariance
From (2.32)-(2.33)
43
the joint probability density function of i f , , i f 2 , # 3 is
given by
i=l
V
i=l
x n ^ i t i V - ^ f ; ^ ) .
(2.34)
Proceeding exactly in the above fashion we get (2.22) for general k. Since the marginal distribution of JEui is normal with mean &a and covariance matrix S[iij it follows that given the joint probability density function of
i f i f ^ ,
the marginal distribution of i f , , . . . , i f j , 1 < j < p can be obtained from (2.22) by replacing k by j .
•
The following lemma will show that there is one-to-one correspondence between ( f i i , . . . , i f t ) and (if J , . . . , i f £ ) as defined earlier in this chapter. L e m m a 2.8.1. 1
NX'(S
+ NXX')- X
NX'S^X
=
1+
NX'S-i-X
P r o o f . Let 1
{S + NXX )-
1
=S-'
+A.
Hence, 1
I + NXX'S-
+(S
+ NXX')A
=
I.
So we get (S + NXX')-
1
= S"
1
1
- N(S +
NXX'^XX'S'
Now NX'(S
+ NXX')-^
= NX'S^X
- NX'(S
+
NXXT'XiX'S^X).
44
Group Invariance
in Statistical
Inference
Therefore, 1
NX' NX>
1+NX s
S~ X
lx
Note that
5
=i
j=i
V
J=I
2.9. A l m o s t I n v a r i a n c e , S u f f i c i e n c y a n d I n v a r i a n c e D e f i n i t i o n 2.9.1. Let ( A " , A , P ) be a measure space. A test ih{X),X
G
A" is said to be equivalent to an invariant test with respect to the group of transformations G, if there exists an invariant test ip(x) with respect to G such that ib(x) =
y{x),
for all x G X except possibly for a subset TV of A" of probability measure zero. D e f i n i t i o n 2.9.2. A l m o s t i n v a r i a n c e . A test ip{x), x G X, is said to be almost invariant with respect to the group of transformations G, if
where N
g
is a subset of x depending on g of probability measure zero.
It is tempting to conjecture that if tp(x) is almost invariant then it is equivalent to an invariant test i>{x). Obviously, any ifi(x) which is equivaleant to an invariant test is almost invariant. For, take N
-1
g
— N U g N.
If x £ N , g
then
x & N, and gx $ N so that tp(x) = y(gx)
= ih(gx) = ii>{x).
Conversely, if G is countable or finite then given an almost invariant test
g
= {x G x '• f{x)
# f{gx)}
1J N
g
,
and AT has probability measure zero.
T h e uncountable G presents difficulties. Such examples are not rare where an almost invariant test can be different from an invariant test.
Invariance
Let G be a group of transformations operating on X and let, A,B,
45
be the
d-field of subsets of X and G respectively such that for any A £ A the set of pairs (x, g) for which gx £ A is measurable Ax
B. Suppose further that there
exists a tr-finite measure v on G such that for all B £ B,v(B) v{Bg)
= 0 implies
= 0 for all g £ G . T h e n any measurable almost invariant test function
with respect to G is equivalent to an invariant test under G. For a proof of this fact the reader is referred to Lehmann (1959). — 0 imply v(Bg)
requirement that for all g £ G and B £ B,v(B) satisfied in particular when f{B)
— f{Bg)
The
= 0 is
for all g £ G and B £ B. Such a
right invariant measure exists for a large number of groups. Let A\ = {A : A £ A and gA — A for all g £ G } and let .4 be the sufficient S
o--field of X.
Let G be a group, leaving a testing problem invariant, be such
that gA„ = A
s
for each g £ G . If we first reduce the problem by sufficiency,
getting the measure space (X, A , P ) , and subsequently by invariance then we s
arrive at the measure space (X,A i,
P) where A i
s
A
s
s
is the class of all invariant
measurable sets. T h e following theorem which is essentially due to Stein
and Cox provides an answer to our questions under certain conditions. T h e proof is given by Hall, Wijsman and Ghosh (1965). T h e o r e m 2.9.1. Let [X, A, P ) be a measure space and let G be a group of measurable transformations
leaving P invariant.
for P on (X, A) such that gA almost invariant
A
3
s
—A
s
Then A
s
s
be a sufficient
subfield
for each g £ G . Suppose that for
measurable function
probability measure zero.
Let A
each
is sufficient for P on
(X,A).
Thus under conditions of the theorem if we reduce by sufficiency and invariance then the order in which it is done is immaterial. 2.10. I n v a r i a n c e , T y p e D a n d E R e g i o n s . The notion of a type D or type E region is due to Issacson (1951). Kiefer (1958) showed that the P-test of the univariate general linear hypothesis possesses this property. Suppose, for a parametric space ft — {(#,i?) : 6 £ ft',n € H] with associated distributions, with ft' a Euclidean set, that every test function
which, for each n, is twice continuously
differentiable in the components of 0 at 0 = 0, an interior point of ft'. Let Q be the class of locally strictly unbiased level a tests of Ho alternatives H\:0^O. similar and that
O u r assumption on
:
a
$ — 0 against the
implies that all tests in Q
a
are
46
Group Invariance in Statistical
for all 4> in Let
Inference
Q. a
be the matrix of second derivates of 0$(9, v) with respect to the
components of $ (which is also popularly known as the Gaussian curvature) at 6 = 0. Define AM
= det(/V"))-
Assumption. A ^ ( T J ) > 0 for all v E H and for at least one d>' 6 Definition 2.10.1.
Q. a
( T y p e E a n d T y p e D T e s t s ) . A test 4>' in Q
a
is
said to be of type E if A^(n)-max^ Q A^(n) e
=
for all V € H. If i f is a single point, d>~ is said to be of type D. Lehmann (1959a) showed that, in finding regions which are of type D, invariance could be invoked in the manner of Hunt-Stein theorem. (Theorem 5.OA); and that this could also be done for type E regions (if they exist) provided that one works with a group which operates on H as the identity. Suppose that our problem is invariant under a group of transformations for G which Hunt-Stein theorem holds and which acts trivially on fi', such that for g e G
If (fig is the test function defined by tpg(x) = <j>(gx), then a trivial computation shows that A* (n) = A ^ o n ) 9
and hence A(n) =
A{gr))
where A(n) = m a x ^ ^ A ^ n ) . Also, if 0 is better than 4>' in the sense of either type D or type E, then tpg is clearly better than
Invariance
47
Exercises 1. Let Xi,...,
X
n
be independently and identically distributed normal random
variables with mean a and variance (unknown) IT . Using Theorem 2.7.1. 2
find the U M P invariant test of Ho : p, = 0 against the alternative Hi : u ^ 0 with respect to the group of transformations which transform each Xi —* cXi,c±Q. 2. Let {Pg,8 e 11} be a family of distributions on (X,A) density p(-j8)
such that P$ has a
with respect to some a-finite measure u. For testing Ho • 0 €
flrj against Hj : 9 G Hi with 12
n
Pi = 0 (
0
n u
'l
s e t
).
t
n
e
likelihood ratio test
rejects Ho whenever X(x) is less than a constant, where, sup 9€.n
p(x\9)
0
X(x) =
sup 9
p(x|S)
e flj u Hi
Let the problem of testing r/o against i i ] be invariant under a group G of transformations g transforming x —* g ( x ) , g 6 G, i G A". Assume that p(x\9)=p(9x\g8)X(9) for some multiplier X. Show that A(x) is an invariant function. 3. Prove (2.8) and (2.25). 4. Prove Theorem 2.9.1.
References 1. S. A, Anderson, Distribution
of maximal
invariants
using
quotient
measures,
Ann. Statist. 10, pp. 955-961 (1982). 2. D. Blacltwell, On a class of probability spaces, Proc. Berkeley Symp. Math. Statist. Probability. 3rd, Univ. of California Press, Berkeley, California, 1956. 3. J . V . Bonder, Borel
cross-sections
pp. 866-877 (1976). 4. T . Ferguson, Mathematical 5. N. Giri, On the likelihood
Statisties,
and maximal
invariants,
Ann. Statist. 4,
Academic Press, 1969.
ratio test of a normal
multivariate
testing
problems;
testing
problems,
Ph.D. Thesis, Stanford University, California, 1961. 6. N. Giri, On the likelihood
ratio test of a normal multivariate
Ann. Math. Statist. 35, pp. 181-189 (1964). 7. N. Giri, On the likelihood ratio test of a normal
multivariate
testing
problem-II,
Ann. Math. Statist. 36, pp. 1061-1065 (1965). 8. N. Giri, On the distribution of a multivariate statistic, Sankhya, 33, pp. 207-210 (1971).
48
Group Invariance in Statistical
9. N. Giri, Multivariate
Inference
Statistical
10. S, L , Isaacson, On the theory specifying
Inference,
Academic Press, N . Y . , 1977.
of unbiased
tests of simple
the values of two or more parameters,
statistical
hypotheses
Ann, Math. Statist. 22, pp. 217¬
234 (1951). 11. A. Haar, Der Massbegriff
in der theorie
der kontinvjierchen
gruppen,
Annals of
between
sufficiency
Mathematics, 34 (1933), pp. 147-169. 12. W. J . Hail, R. A. Wijsman, and J . K . Ghosh, The relationship and invariance
with application
in sequential
analysis,
Ann. Math. Statist. 36,
pp. 575-614 (1965). 13. P. R. Halmos, Finite
Dimensional
Vector Space, D. Van Nostrand Company,
Inc. Princeton, 1958. 14. J . Kiefer, On the nonrandomized
symmetrical
optimatity
and randomized
nonoptimality
of
designs, Ann. Math. Statist. 29, pp. 675-699 (1958).
15. U. Koehn, Global cross sections
and the densities
of maximal
invariants,
Ann.
Math. Statist. 41, pp. 2045-2056 (1970). 16. E . L . Lehmann, Testing of Statistical Hypotheseses, John Wiley, N . Y . , 1959. 17. E . L . Lehmann, Optimum invariant tests, Ann. Math. Statist. 30, pp. 881-884
(1959a). 18. L , Loomis, An Introduction
to Abstract
Hermonic
Analysis,
D. Van Nostrand
Company, Inc., Princeton, 1948. 19. L . Nachbin, The Haar Integral.,
D, Van Nostrand Company, Inc., Princeton,
1965. 20. R. Schwartz, Locally mimimax tests, Ann. Math. Statist., 38, pp 340-359, 1967. 21. C . Stein, Multivariate Analysis-I, Technical Report 42, Department of Statistics,
Standford University, 1956. 22. R. A. Wijsman, Cross-Section of orbits and their application
to densities
of max-
imal invariants, Proc. Fifth. Berk. Symp. Math. Statist. Prob. 1. pp. 389-400 (1967) University of California, Berkeley. 23. R. A. Wijsman, General proof of termination sequential
probability
with probability
ratio test based on multivariate
one of
observation,
invariant
Ann. Math.
Statist. 38, pp. 8-24 (1969). 24. R. A. Wijsman, Distribution
of maximal
invariants,
using global and local cross-
sections, Preprint, Univ. of Illinois at Urban a-Champaign, 1978. 25. R. A. Wijsman, Proper action in steps, with application
imal invariants,
26. R. A. Wijsman, Global cross sections distribution
to density
ratios of max-
Ann. Statist. 13, pp. 395-402 (1985).
of maximal
invariants,
as a tool for factorization
of measures and
Sankhya A, 48, pp. 1-42 (1986).
Chapter 3 EQUIVARIANT ESTIMATION IN C U R V E D MODELS
Let ( x , A ) be a measure space and f) be the set of points 8. {Ps,8
Denote by
£ Si], the set of probability distributions on (x, A ) . L e t G be a group
of transformations operating from the left on \ such that p £ G , g ; x —* X is one to one onto (bijective). Let G be the corresponding group of induced transformations g on ft. Assume (a) For 0i e 0, i = 1,2;9i ? t 9 , P , £ 2
(b) P { A ) - P- e(gA), f l
g
e
P„ B
A £ A,p € G,g e G.
Let A(0) be a maximal invariant on ft under G , and let f!" = {0\9 £ fi.with A() = A }
(3.1)
0
where Ao is known. We assume x to be the space of minimal sufficient statistic for 8. D e f i n i t i o n 3 . 1 . ( E q u i v a r i a n t E s t i m a t o r ) A point estimator 8(X),X X, a mapping from x —' ft, is equivariant if 8(gX) N o t e 1.
= g8(X),g
£
£ G.
For notational convenience we are taking G to the group of
transformations on e(X).
N o t e 2. T h e unique maximum likelihood estimator is equivariant.
50
Group Invariance
L e t T(X),X
in Statistical
Inference
£ X' be a maximal invariant on x under G. Since the distridepends on 0 € Q only through A(0), given X($) =
bution of T(X)
is an ancillary statistic.
A T(X) 0
A n ancillary statistic is defined to be a part of the
minimal sufficient statistic X whose marginal distribution is parameter free. In this chapter we consider the approach of finding the best equivariant estimator in models admitting an ancillary statistic. Such models are assumed to be generated as an orbit under the induced group G on fi. T h e ancillary statistic is realized as the maximal invariant on x under G.
A model which
admits an ancillary statistic is referred to as a curved model. Models of this nature are not uncommon in statistics. Fisher (1925) considered the problem of estimating the mean of a Normal population with variance a
2
when the
coefficient of variation is known. T h e motivation behind this was based on the empirically observed fact that a standard deviation a often becomes large proportionally to a corresponding mean p so that the coefficient of variation
a/p
remains constant. This fact is often present in mutually correlated multivariate data. In the case of multivariate data, no well-accepted measure of variation between a mean vector p and a covariance matrix E is available. K a r i y a , Giri and Perron(1988) suggested the following multivariate version of the coefficent of variation, (i) A =
-1
fl'SS /!
1
(ii) v = S "
^ where E = E ^ E
1
'
2
1
with S /
2
as the unique lower triangular matrix with positive diagonal elements. I n recent years Cox and Hinkley (1977), Efron (1978), A m a r i (1982 a,b) among other have reconsidered the problem of estimating p, when the coefficent of variation is known in the context of curved models. 3.1. B e s t E q u i v a r i a n t E s t i m a t i o n of fi w i t h A K n o w n Let Xi,...,X (n > p) be independently and identically distributed JV (p, E ) . We want to estimate a under the loss function n
L{n,d)
1
= (d-n)'-L- {d-p)
p
(3.2)
l
when A = a'T,~ p, — A (known). L e t 0
n
nx = Y,x>,s T h e n (X,S)
=
- x)(Xi
is minimal sufficient for {u, E ) , JUX
of S as N (y/nu, p
7i
-xy.
is distributed independently
S ) and S is distributed as Wishart Wp(n - 1, E ) with n — 1
degrees of freedom and parameter E . Under the loss function (3.2) this problem remains invariant under the group of transformations G j ( p ) of pxp
nonsingular
Equivariant
matrices g transforming (X,S)
—» (gX ,gSg').
Estimation
in Curved Models
51
T h e corresponding group G of
induced transformations g on the parametric space ft transforms 8 = (p., £) —* g8 — (pp, o £ p ' ) . A maximal invariant invariant on the space of (\/nX, T
2
= n(n-
S) is
lJX'S-'X
(3.3)
and the corresponding maximal invariant on the parametric space ft is A = 1
(i'T,~ u.
Since the distribution of T
A, given A — A o T
2
2
depends on the parameters only through
is an ancillary statistic. Since the loss function L[p,d)
invariant under the group G i ( p ) of transformations g transforming {X,S) {gX,gSg')
and ( « , S ) —• (gp,gT,g')
R(8, l) (
we get for any equivariant estimator p. of
,
=
E (p-p)'^- (p-p,)) e
,
= E (gjl
1
- ga)'{g £,g')~ {gii.
e
-
gp))
= E (p(gX)
- 9${gWT\mm
-
gp)
= Eg (a{X)
- n)'{gZ^y\^{gX)
-
gp)
e
e
g
= *(5M), for g € Gt{p),g
is —»
(3-4)
is the induced transformation on E corresponding to g on x-
Since G | ( p ) acts transitively on E we conclude from (3.4) that the risk
R(8,fi)
of any equivariant estimator ji is a constant for all 8 € ft. Taking A — 1, which we can do without any loss of generality and using 0
the fact that R(8, ii) is a constant we can choose p.. = e = ( 1 , 0 , . • . , Q) , E = I. To find the best equivariant estimator which minimizes R(8,p)
among all
equivariant estimators p. satisfying ji(gX,gSg')=gp(X,S) we need to characterize p.. Let Gr{p)
be the subgroup of G;(p( containing all
p x p lower triangular matrices with positive diagonal elements.
Since 5 is
positive definite with probability one we can write, S — W W ' , W £
Grip)-
Let v = w
lY Q
~'
2
2
where Y = •JnX, \\ V \\ = V'V = T /(n
=
WT\
- 1).
T h e o r e m 3.1. If p, is an equivariant estimator of p, under Gi(p) p(Y,S)
=
k(U)WQ
then (3.5)
52
Group Invariance
in Statistical
Inference
2
where k is a measurable
ofU = T j(n
— 1).
P r o o f . Since p is equivariant under G i ( p ) we get for g € gp(Y,S) _1
Replace Y by W Y,
= p.(gY,gSg').
(3.6)
g by W and S by I in (3.6) to obtain p(Y,S)
Let O be an pxp
Oi(p)
= Wp(V,I).
(3.7)
orthogonal matrix with Q = V/ || V \\ as its first column.
Then fi(V,I)
= p(00'V,00')
=
OH{VUeJ).
Since the columns of O except the first one are arbitrary as far as they are orthogonal to Q it is easy to claim that the components of p.(\fUe,I) the first component p (VU
except
e, / } are zero. Hence
L
p{V,I)
= Qji {y/UeJ).
•
1
T h e following theorem gives the best equivariant estimator (BEE) of p. T h e o r e m 3 . 2 . Under the loss function
(3.2) the unique BEE of p isp = k
(If) WQ where k(U) = E(Q'W'e\U)/E(Q'W'WQ\U).
Proof.
(3.8)
B y Theorem 3.1 the risk function of an equivariant estimator p,
given Ao — 1, is R(6,p)^R((e I),,l) 1
= E(k(U)WQ
- e)'(k(U)WQ
-
e).
Using the fact that U is ancillary, a unique BEE is obtained as p =
k(u)WQ
where k(u) minimizes the conditional risk given U E{((k(U)WQ
- e)(k{U)WQ
T h i s implies k(U) is given by (3.8).
-
e)')\U\. •
Equivariant
Estimation
in Curved Models
53
3.1.1. Maximum likelihood estimators T h e maximum likelihood estimators ft and £ of p and £ respectively under the restriction An = I are obtained by maximizing
-|log
(det £ ) - |
tr 5 S "
1
- | j £ - M J ' E
-
1
^ - M ) -
(3.9)
with respect to p, £ where 7 is the Lagrange multiplier. Maximizing (3.9) we obtain p—~—, n + 7
S - - 5 + 71
7
M ' ( 7 + C
I
.
(310)
- l
Using the condition J t ' E / i = 1 we get .
(U-^U(4
+ 5U)\
^ (3.11)
The maximum likelihood estimator (mle) p is clearly equivariant and hence it is dominated by jl of Theorem 3.2 for any p. I n the univariate case the mle is
(3.12) 71
4
Thus (3.12) is the natural extension of (3.11).
Hinkley (1977) investigated
some properties of this model associated with the Fisher information. A m a r i (1982 a, 6) proposed through a geometric approach what he called the dual mle which is also equivariant.
3.2. A Particular Case
As a particular case of A constant we consider now the case £ = 2
C
is known. L e t Y = y/nX,W
for this problem, and
= tr S. T h e n (Y,W)
J when
is a sufficient statistic
i distributed independently of Y as X ( - i ) s
n
p
We are interested here to estimate /t with respect to the loss function
(3.13)
54
Group Invariance
in Statistical
Inference
T h i s problem remains invariant with respect to the group G = R+ x
0(p),R+
being the multiplicative group of positive reals and O(p) being the multiplicative group of p x p orthogonal matrices transforming Xi —t bTXi , t = 1 , . . .
,n
d -4. bTd,
^gf) where (6,T) € G with b e R+,T
a ~
n,
g O ( p ) . T h e transformation induced by G on
(Y, W) is given by 2
(Y,W) T h e theorem
below
— (!>rY, f> W).
gives the representation of an equivariant estimator
under G . T h e o r e m 3.3.
An estimator d(Y, W)
exists a measurable function
h ;R
d{Y,W)
for all [Y,W)£R? Proof. h (^w~j" Y
x
+
is equivariant
=
h{^p}Y
R. +
If h is a measurable function from if+ t
n
e
n
if and only if there
—* R such that
—* R and d[Y, W) —
clearly d is equivariant under G . Conversely if d is equiv-
ariant under G , then d(Y,W) for all T £ 0{p),Y generality that Y'Y
p
€ R ,b
(VY = b r d { — ,
¥
W\ )
(3.14)
> 0 , W > 0. We may assume without any loss of
> 0.
L e t Y, W be fixed and
-ft) where di is the first component of the p-vector d. p x p matrix such that Y'A = (\\Y
||,0,...,0)
Let A € O(p) be a fixed
Equivariant
Estimation
Let A be partitioned as A = ( A J . A J ) where A, T = {Ai,A B)
with B e 0(p-l)
2
d(Y,W)
=
in Curved Models
1
\\Y\\~ Y.
55
Now choose
and b = \\Y\\. From (3.14) we get
= d
^(l,0...0).^)r.
1
+ \\Y\\A Bd ((l 0...Q) ~) 3
a
t
.
t
(3.15)
Since (3.15) holds for any choice of B € 0(p — 1) we must have then
d(Y,w)
= d ((i,o...o).^)r.
•
l
It may be easily verified that a maximal invariant under G in the space of l
sufficient statistic V = W~ Y'Y
and a corresponding maximal invariant in
the parametric space is
As the group acts transitively on the parametric space the risk function (using (3.13)) fl(M)
=
£„(£(M))
of any equivariant estimator d is constant.
Hence we can take p = po
=
[C, 0 . . . 0)'. T h u s the risk of any equivariant estimator d can be written as
fl(po,d)
=
(L (p ,k
y ) )
0
= Eu (E(L(p ,h(^-} 0
r))
n
V
= *) •
To find a B E E we need to find the function ho satisfying Ep (L(p ,h (V)Y\V Q
for all h : R
+
0
= v)<
0
Ep {L{po,h{v)y)\V 0
= v)
—* R measurable functions and for all values v of V. Since
EpampoM^YW
= v)
2
= h (v)Epo(Y'Y\V
=v)~
2h(v)Epo(Y \V l
= v) + l
where K = ( y . . . , y ) \ we get l t
p
EMY \V Ep (Y'Y\V i
k
o
{
v
)
-
0
= v) = v)-
f
3 ( 3
. '
6
1
1 6 )
56
Group Invariance in Statistical T h e o r e m 3.4.
Inference
The BEE da(x ..
.,X„;C)
lt
= d (Y,W)
is given by
0
d {Y,W) 0
' ™ r f j n y + i + l ) (n&
nC
h i^ ) \
2
r
/
V
^
2
+i+l v
(3.17) t
rti"P+j+i)/ BC V
h
hir+W
\
2
J
.
where t — y ( l + ti) —1 P r o o f . T h e joint probability density function of Y and W under the assumption fi — fi
0
is 2
1
[ exp{-(C /2)(j/y - 2y5gi + n + I K ) } ^ ' " - ' " 2
/(y>) = {
1
2
2"p/ (C )""p/2(r(i))pr((n - l)p/2)
, if w > 0
0, otherwise. Hence the joint probability density function of Y, V is given by (• /
y
„ „ „j (
2
exp{-(C /2[((l+ i Q W i , + 2 ^
=
I
2
( C
2
) - ^ \ r ^ m ( n -
w
]}
l)p/2)
0, otherwise.
Using (3.16) we get (3.17). C o r o l l a r y . If m - p(n - l ) / 2 is an integer, the B E E is given by 2
n <" do(Y,W)= ^h (v) T
=
0
with <j(t) = p(t)/w(t)
g(t)X,
where m+ m+l
u(t) =
1
m + l
y ,
3
m + l \ /'nC \ Id r ( | p + i)
3
...
(3.18)
Equivariant
P r o o f . Let Y
k
Estimation
in Curved Models
57
be distributed as %%• T h e n c E
./
y n
^_o r(fc/2 + ) t t
-k
a
2
^ ~
l
r(fc/2)
f
Q
>
^ " '
Hence with m as integer t E i Y z ^ M - n C H m ^ y t h (v)
= v ^ C
0
=
/ v
2
^ £ 3=0
2
)exp{-nC (/2}(^)
2
SC £(V
m 1
)/£(V
m + 1 2
2
2
noncentral Xp( C t).
4
),
(3.19)
n(
2
where Vj is distributed as noncentral X ,+2( ~' i)^ n
J
2
Hence letting V = ^ ( t f )
a
and ^2 is distributed as
° d taking r as an integer we
get
Prom (3.19) and (3.20) we get (3.18).
•
N o t e 1. g(t) is a continuous function of f, and
2
lirn
S
(f) = TiC /p,
(-.0+ hmg(t)
and g(t) > 0 for all t. T h u s when Y'Y N o t e 2. We can also write d
Q
=
g { l ) < l ,
is large the B E E is less than X.
= (1 - ^ ) X where r(t))/t) = 1 - g(t).
This
form is very popular in the literature. Perron and G i r i (1990) have shown that g{t) is a strictly decreasing function of t and T(V) is strictly increasing in v. T h e result that g(t) is strictly decreasing in f tells what one may intuitively do if he has an idea of the true value of C and observe many large values concentrated. Normally one is suspicious of their effects on the sample mean and they have the tendency to shrink the sample mean towards the origin. T h a t is what our estimator does. T h e result that T(V) is strictly increasing in v relates the B B E of the mean for C known with the class of minimax estimators of the mean for C unknown. Efron and Morris (1977) have shown that a necessary condition
58
Group Invariance
in Statistical
Inference
for an equivariant estimator of the form g{t)X
to be minimax is g(t) —* 1 as
t —* 1, SO our estimator fails to be minimax if we do not know the value of C. O n the other hand Efron and Morris (1977) have shown that an estimator of the form d = (1 — 0 < T(V) <(p-
Z^p-)X
is minimax if (i) r is an increasing function, (ii)
2 ) / ( n - 1) + 2 for all v e (0, oo). T h u s our estimator satisfies
(i) but fails to satisfy (ii). So a truncated version of our estimator could be a compromise solution between the best, when one knows the values of C and the worst, one can do by using an incorrect value of C. 3.2.1.
An application
T h e following interesting application of this model is given by Kent, Briden, and Mardia (1983). T h e natural remanent magnetization ( N R M ) in rocks is known to have, in general, originated in one or more relatively short time intervals during rock-forming or metamorphic events during which N R M is frozen in by falling temperature, grain growth, etc. T h e N R M acquired during each such event is a single vector magnetization parallel to the then-prevailing geometric field and is called a component of N R M . B y thermal, alternating fields or chemical demagnetization in stages these components can be identified. Resistance to these treatments is known as "stability of remanence".
A t each stage of
the demagnetization treatment one measures the remanent magnetization as a vector in 3-dimensional space. These observations are represented by vectors Xi,...,X
3
in fi . T h e y considered the model given by j j j — a , +0i+ei
n
where
c»i denotes the true magnetization at the ith step, 0i represents the model error, and ei represents the measurement error. They assumed that 0, and e< 2
are independent, p\ is distributed as N {0,T {c>i)I),
and e is distributed as
3
z
2
A M 0 > c r ( a , ) / ) . T h e Q are assumed to possess some specific structures, like ;
collinearity etc., which one attemps to determine. Sometimes the magnitude 2
of model error is harder to ascertain and one reasonably assumes r (a) 2
In practice <J {a) is allowed to depend on a ; and plausible model for 2
which fits many data reasonably well is cr (a)
— a{a'o)
Maximum likelihood estimator
T h e likelihood function of xi,...
x
n
with C known is given by
L(xi,...,x \p) n
=
C ^ ) - -
/
2
e x p | - ^ - ( , + j ' ,-2^'M + V (
/
J
2
c (a)
+ b with a > 0, b > 0.
When a'a large, b is essentially 0 and a is unknown.
3.2.2.
= 0.
M
) } •
Equivariant
Estimation
in Curved Models
59
Thus the mle p of p (if it exists) is given by 2
[{np/C )(p'p)
-w-y'y
+ 2 \ / n i / A ] £ = Vnp'py
(3.21)
•
If the E q . (3.21) in ft has a solution it must be colinear with y. L e t p = ky be a solution. From (3.21) we obtain 2
2
k[(np/C )y'y)k
+ Jn~y'yk - {y'y +w)] = 0.
Two nonzero solutions of A; are 1
-1-(1 + ^ ( ^ ) ) 1
_
iJkpIC
/
2
- 1 + (1 +
2
2
'
^ ( ^ ) )
1
/
2
2
~
2^lp/C
To find the value of k which maximizes the likelihood we compute the matrix of mixed derivatives 2
a ( - logL) dp'dp
kHy'y)'
and assert that this matrix should be positive definite. T h e characteristic roots of this matrix are given by + 2npk
2
3
k y'y
'
A
-
2
x/nC
2
k y'y
If k = ki, then Ai < 0 and A < 0. B u t if k = k , then A, > 0, A > 0. 2
Hence the mle p. — d-i(xi,...
2
,x ,C)
2
is given by (corresponding to positive
n
characteristic roots) 2
d {x ,...x ,C) 1
1
n
=
(1 + 4 p / C ( ) 2p
1 / 2
- 1
2
C X.
(3.22)
Since the maximum likelihood estimator is equivariant and it differs from the B E E do the mle d\ is inadmissible. T h e risk function of do depends on C . Perron and Giri (1990) computed the relative efficiency of do when compared with d]_, the James-Stein estimator (d ), 2
the positive part of the James-Stein
estimator (d ), and the sample mean X ( d j ) for different values of C, n, and p. 3
They have concluded that when the sample size n increases for a given p and C the relative efficiency of do when compared with di, i = 1,.., 4 does not change significantly. T h i s phenomenon changes markedly when C varies. When C is small, dp is markedly superior to others. O n the other hand, when C is large
60
Group Invariance in Statistical
Inference
all five estimators are more or less similar. These conclusions are not exact as the risk of d ,di
are evaluated by simulation. Nevertheless, it give us sufficient
0
indication that for small values of C the use of B E E is clearly advantageous. 3.3. B e s t E q u i v a r i a n t E s t i m a t i o n in C u r v e d C o v a r i a n c e M o d e l s Let Xi,...
,X (n
> p > 2) be independently and identically distributed
n
p-dimensional normal vectors with mean vector p and positive definite convariance matrix S . Let E and S =
(X / i
£ — 1 £ll
- X)(X
a
- Xf
a
(
P-IV ^12
be partitioned as
I
.5—1
i Sn
p - i V S\2 I
We are interested to find the B E E of 0 — E ^ ^ i on the basis of n sample observations Xi,...,x„ 2
coefficent p
when one knows the value of the multiple correlation
— E ^ E ^ s E ^ E21.
2
If the value of p
is significant, one would
naturally be interested to estimate 0 for the prediction purpose and also to estimate £22 to ascertain the variability of the prediction variables. L e t Gi(p) 0{p)
be the full linear group of p x p nonsingular matrices and let
be the multiplicative group of p x p orthogonal matrices. L e t Hi be the
subgroup of Gi(p) defined by /
' An
P-1 0
\
0
h22
Hi = {ft € Gi(p): ft =
and let H
2
p
be the additive group in R .
of H i and H .
Define G = Hi © H
2
2
The
T h e corresponding transformation on the sufficient statistic is given by
(X,S)
2
T h e transformation g = (hi,k )
the direct sum
2
€ G transforms H .
transformation g — (fti,fta) € G transforms X{ - t fti^i + h , 3
t = l,...,n,
( M , E ) - * (ftiM + h a . f t i S f t i ) .
- * (fti-A + / i , ftiSftJ). A maximal invariant in the space of (X, S) under G is 2
R? =
S12S22S21 2
and a corresponding maximal invariant in the parametric space of (p, E ) is p .
Equivariant
3.3.1.
Characterization
Let S
Estimation
in Curved Models
61
of equivariant estimators of £
be the space of all p x p positive definite matrices and let G £ ( p ) be
p
the group of pxp lower triangular matrices with positive diagonal elements. A n equivariant estimator d{X, S) of £ with respect to the group of transformations p
G is a measurable function d(X, S) on S xR
to S
p
d(hX for all S £ S ,h
£ Hi,
p
+ £, hSh') = hd(X, p
and X,£
€ R'
satisfying
p
S)ti
From this definition it is easy to
remark that if d is equivariant with respect to G then d(X,S) P
all X £ R ,S
£ S. p
= d(0,S)
for
T h u s without any loss of generality we can assume that d
is a mapping from S
p
to S . p
Furthermore, if u is a function mapping S
p
into
another space Y (say) then d* is an equivariant estimator of u ( £ ) if and only if d* — u • d for some equivariant estimator d of £ . Let
and let G be the group of induced transformations corresponding to G on 0 . P
L e m m a 3 . 1 . G acts transitively on 0 . P
Proof.
It is sufficient to show that there exists a g
— ( A , £ ) £ G with
ft £ Hi, ^ £ RP such that 1
1 p .0
(/i/t + £> A£A') =
If p = 0, i.e., £ p # 0, choose A
1 2
n
= 0, we take ha = E
1
/
2
u
,
A
2 2
1
= £,,
= F £^3
P 0 0 /
p 1 0 1 / 2
2
\
j
/
1 p-2
N (3.23)
/
, f = - f t p to obtain (3.23). I f
, where T is a (p - 1) x (p - 1)
orthogonal matrix such that £n
1 / 2
E
1 2
£ 2
1 / a 2
r = 0 , 0,
0),
and £ = - A / / to obtain (3.23). T h e following theorem gives a characterization of the equivariant estimator
d(S) of S .
62
Group Invariance
in Statistical
Inference
T h e o r e m 3 . 5 . An estimator d ofT. is equivariant the
a
{
if and only if it admits
decomposition
S
,
~
2
{a^WR-iSu
1
R~ a tR)S S,- S 2 22
21
1
1
+
1
C(R){S22-S S- S ) 2l
l2
(3.24) where C{R)
> 0 and /f,\
a A
[
K
)
_ ( a-ii{R) - \a (R)
a (R) a (R) 12
2l
is a 2 x 2 positive definite matrix.
if and only if a Note. aij
l n
ai a 2
2
Furthermore
d i di d 2 Y
22
2
2
2 2
a \ = f> • 2
T h e d;j are submatrices of d as partitioned in (3.24) and aij
=
(R). Proof.
T h e sufficiency part of the proof is computational.
in verifying d(hsh')
— hd(S)h'
1
1
2
It can be obtained in a straightforward way from the compu-
2
tations presented in the necessary part. To prove the necessary part we observe that
P
- ( R
FI
*)•
>°<
and d satisfies Ii
0
o
r
for all T € 0(p - 2). T h i s implies that
o \ _ /MR)
P 0 with C(R)
I t consists
for all h g H, S e Sp and d^, d^d^o d i —
I-) P
{
2
0
o C(R)I _ P
2
> 0.
In general, S has a unique decomposition of the form S =
Sn
S\
Sn
S)
l2
22
_(Ti \ 0
0 \ / l T j \U
V \(T[ V » J U
0 T
2
Equina Hani Estimation
with 7\ G G £ ( 1 ) , T
1
in Curved Models
>
e G+(p - 1}, and U = T - S T,- .
2
2
63
Without any loss of
21
generality we may assume that U ^ 0. Corresponding to U there exists a B G 1
0 ( p - 1) such that U'B = {R, 0, . . .0) with R = \\U\\=
1
( S ^ S^S^
5 )=
>
2 1
l
0. F o r p > 2, 5 is not uniquely determined but its first column is
R~ U.
Using such a B we have the decomposition 1 U
W \ _(\ !„.,) ~ {0
O W P Bj \ 0
0 J_) P
2
\ (l \0
0 B'
and :
« - ( ? i)-(J S)(T £)(! *°)(» i) (
n
Of
an(fi)
R-'autRJl/'
V TJ
0
A o
13
p - i - W )
2
fi-'a^ffi)^!
R- a 2(R}S - S- 'S 2 2
2 l
l
+ C(R){S
l
-S iS^S )
22
2
l2
which proves the necessary part of the theorem. 3.3.2.
•
Characterization of equivariant estimators of 0
T h e following theorem gives a characterization of the equivariant estimator of 0. T h e o r e m 3 . 6 . If d* is an equivariant
estimator
of 0 then d*(S)
has the
form 1
1
d*(S) = R' a(R)S S , 22
where a : R
,
2
—* R.
+
P r o o f . Define u:S
p
1
^ RP'
by
«(£) = £ = £ - % ! . If a" is equivariant then d*(S) = (R-^WSnS^Su
+ C(R)(S
2
= (T (R- a (R)UU' 2
l
+ (1 -
22
1
l
= R- a{R)S - S i 2
2
2
l
-
S^S^S ))~ S^R-'a^R) 12
r
+ CfflHCV! -
22
= R~ (a (R)
22
•
2
v
UU )T^S R- a {R) 2l
1
21
1
R )(C(R))- a (R)S - S 21
2 2
21
•
64
Group Invariance in Statistical
Inference
T h e risk function of an equivariant estimator d" of 0 is given by R(0,d*)
=
Ex(L((3,d')) 1
= E {S^(R- a(R)S S -. L
n
- 0}'S {R~
l i l
- 2R- a{R)S-[ S 0
z
l
a(R)S
22
1
= E {a?{R)
T h e o r e m . 3.7.
2
l
l
S
22
+ S^0'S 0}
12
-
2l
.
22
0)} (3.25)
T h e best equivariant estimator of 0 given p, under the
loss function L is given by l
l
R- a*(R)S - S , 2 2
(3.26)
2
where £
a'(R)
= r p
r
2
(
n ± i
£ r Proof. a*(R)
+
l
r ( ^ - M )
)
( V V ) 7 I T
1
(3.27)
. 2
2
(2=1 +
m
) (2 )
m
p
/ m ! r (2=1 + m)
r
From (3.25), the minimum of R(0,d*)
— E^S^
+
^ _
1
Si20R~ \R).
is attained when a(R)
=
Since the problem is invariant and d" is equiv-
ariant we may assume, without any loss of generality, that
with C(p) = \ \
M
,1)
f
£ j . To evaluate o*(fi) we write 5
2 2
= T T ' , T e G + ( p - l ) , r = (%)
5
2 1
= RTW,
S
u
0 < B < l , B
r
1
6
fl"" ,
= W W .
T h e joint probability density function of (R, W,T)
1
P-i
is ( G i r i (1977))
i
i=2j=l P
1 *
e
x
p
{
~ 2 ( i -
2 P
)
(
W
i
"
+
A straightforward computation gives (3.26).
* » -
2
T
P ^
W
_
l
^ n i=i
fe)
• (3.28) •
Equivariant
Estimation
in Curved Models
65
L e m m a 3 . 2 . Let 0 > 0, 0 < 7 < l , m be an integer. Then ^Vja
+ m + l) T(0 + T{a
1=0
l)^
+1)
Proof.
^ r ( a + m + l)r(^ + l ) j i=0
d" (=1
-0 = (1 - 7 ) ^ ) ^ ( 1
+ „)•+«-» ( l -
^
• T h e o r e m 3.8. / / m —
—p) is an integer then
a ' M ^ - ^ - l p y m, 1 - r
i=0
1 -
V
r V
P r o o f . Follows trivially from L e m m a (3.2).
Note.
If p
2
is such that terms of order ( r p )
2
and higher can be ne2
glected, the best equivariant estimator of 0 is approximately equal to p (n — 1) l
1
(p-ir S S i. 22
2
T h e mle of 0 is
SS. 22
2S
66
Group Invariance
in Statistical
Inference
Exercises 1. Let X
a
= (Xai,-
-,X ,)',a
= 1,...,N(
aj
> p) be independently and iden-
tically distributed p-variate normal random vectors with common mean fi and positive definite common covariance matrix £ and let
show that if & = A V E
- 1
l
/ * is known, then NX'(S
+ NXX')~ X
is an
ancillary statistic. 2. L e t X
a
= (X i,...,X y,a a
= 1 , . . . ,2V( > p) be independently and identi-
ap
cally distributed p-variate normal random vectors with mean 0 and positive definite covariance matrix £ and let E
S
_ (
(12) \
\£(21)
with £
( l l
) : lxl.
Given the value of p
2
-
£
( 1 2
,£ ~ (
2 )
£(
2 1 1
/ £ i i find the
ancillary statistic. 3. (Basu (1955)). If T is a boundedly complete sufficient statistic for the family of distributions {Pg,9
g ft}, then any ancillary statistic is independent of
T. 4. F i n d the conditions under which the maximum likelihood estimator is equivariant.
References 1. S. Amari, Differential
information
geometry
of curved
exponential
families
- curvatures
and
loss, Ann. Statist. 10, 357-385, (1982a).
2. S. Amari, Geometrical
theory of asymptotic
ancillary
and conditional
inference,
Biometrika, 69, 1-17 (1982b). 3. D. Basu, On statistics
independent
of sufficient
statistic,
Sankhya 20, 223-226
(1955). 4. D. R. Cox and D. V, Hinkley, Theoretical Statistics, Champman and Hill, London, (1977). 5. B. Efron, The geometry of exponential families, Ann. Statist., 6, 262-376, (1978). 6. B. Efron and C, Morris, Stein's
estimation
rule and its competitors,
An
empirical
approach, J . Amer. Statist. Assoc. 68, 117-130, (1973). 7. B. Efron and C . Morris, Stein's paradox in statistics, Sci. Ann., 239, 119-127, (1977).
Equivariant
8. R. A. Fisher, Theory
of statistical
estimation,
Estimation in Curved Models
67
Proc. Cambridge Phil. Soc.
22,
700-725, (1925). 9. N. Giri, Multivariate
Statistical
10. D. V. Hinkley, Conditional of variation,
Inference,
inference
Academic Press, N . Y . , 1977.
about a normal mean with known
coefficent
Biometrika, 64, 105-108, (1977).
11. T . Kariya, N. Giri, and F . Perron, Equivariant l
of Npifi, E ) with u"Z~ u
V
estimator
of a mean vector
2
= 1 or £" V
= c or £ = a (u'fi)I,
J.
/i
Multivariate
Anal. 27, 270-283 (1988). 12. J . Kent, J . Briden, and K . Mardia, Linear tivariate
data as applied to progressive
and planer structure
demagnetization
in ordered
of palaemagnetic
mulrema-
nence, Geophys. J . Roy. Astronom. Soc. 75, 593-662, (1983). 13. F . Perron and N. Giri, On the best equivariant normal population,
14. F . Perron and N. Giri, Best equivariant 40, 46-55 (1992).
estimator
of mean of a
multivariate
3. Multivariate Analysis, 32, 1-16 (1990). estimation
in Curved
Covariance
Models,
Chapter 4 SOME B E S T INVARIANT T E S T S IN MULTINORMALS
4.0. I n t r o d u c t i o n In Chapter 3 we have dealt with some applications of invariance in statistical estimation. We discuss in this chapter various testing problems concerning means of multivariate normal distributions. Testing problems concerning discriminant coefficents, as they are somewhat related to mean problems will also be cousidered. This chapter will also include testing problems concerning multiple correlation coefficient and a related problem concerning multiple correlation with partial informations. We will be concerned with invariant tests only and we will take a different approach to derive tests for these problems. Rather than deriving the likelihood ratio tests and studying their optimum properties we will look for a group under which the testing problem remain invariant and then find tests based on the maximal invariant under the group. A justification of this approach is as follows: If a testing problem is invariant under a group, the likelihood ratio test, under a mild regularity condition, depends on the observations only through the maximal invariant in the sample space under the group {Lehmann, 1959 p. 252). We find then the optimum invariant test using the above approach, the likelihood ratio test can be no better, since it is an invariant test.
4.1. T e s t s of M e a n V e c t o r Let X = (Xi,...,X ) be normally distributed with mean £ — E(X) = (£i> — l i p ) and positive definite covariance matrix £ — E(X — u) (X — p). Its pdf is given by p
1
68
Some Best Invariant
fx(x)
2
= {2 r)-"/ (det
1
j-itrE" ^
exp
7
Tests in Mattinormals
- 0 & -
tf
} •
69
(4-1)
In what follows we denote 3: a p-dimensional linear vector space, and X', the dual space of X. T h e uniqueness of nonmal distribution follows from the following two facts: (a) T h e distribution of X is completely determined by the family of distributions olB'X,
9 £ X'.
(b) T h e mean and the variance of a normal variable completely determines its distribution. For relevant results of univariate and multivariate normal distribution we refer to Giri (1996). We shall denote a p-variate normal with pdf (4.1) as JV (f, S ) . p
On the basis of observations x" N {l, p
— (x ,...,
x)
ai
, a — I,. ..,N
ap
from
E ) we are interested in testing the following statistical problems.
P r o b l e m 1, T o test the null hypothesis HQ: £ = 0 against the alternatives 2
Hi: £ j£ 0 when g, S are unknown. T h i s is known as Hotelling's T
problem.
P r o b l e m 2. T o test the null hypothesis HQ: £i — • • — £ , — against the p
alternatives H i : £ # 0 when £, £ are unknown and p > p i . P r o b l e m 3 . T o test the null hypothesis Ha : £i — * * > — £ alternatives H i : & = • - * = ^ Let X
-
jj^X",
P l
against the
= 0 when p, £ are unknown and pi + p
S = E^(X
(minimal) for (£, £ ) and \/NX
a
- X)(X
a
- X)'.
(X,S)
2
< p.
is sufficient
is distributed independently of S as N (y/ntl,
E)
p
and S has Wishart distribution Wj,(JV—1, E ) with JI = J V - 1 degrees of freedom and parameter E . T h e pd/ of S is given by n
W (n, p
2
: i : :
f X{detE)' / (dets) £ } = < if i positive definite, s
1
1
F exp{-|trE- a} (4.2)
s
I 0 otherwise. Problem 1 remains invariant under the general linear group G ; ( p ) transfering (X, S) -* (gX, gSg'), g G Gi(p).
From E x a m p l e 2.4.3 with di = p
R =NX'(S
+
1
nXX')- X
l
NX'S- X
(4.3)
l+NX'S-tX
is a maximal invariant in the space of (X, S) under G ( p ) and 0 < R < 1. (
70
Group Invariance in Statistical
Inference
T h e likelihood ratio test of HQ depends on the statistic ( G i r i (1996)) T
2
= N(N 2
which is known as Hotelling's T T
2
-^X'S^X
(4-4)
statistic. Since R is a one-to-one function of
we will use the test based on R. From Theorem 2.8.1 the pdf of i i depends
only on 6 — i V ^ ' E ' ^ and is given by 1
M
>
1
r(p/2)r((7v- )/2)
'
P
It may be verified that 8i is a maximal invariant under the induced group (5j(p) = Gi(p)
in the parametric space of (£, E ) . Since £ is positive definite
tfi = 0 if and only if p = 0 and 6 > 0 for p ^ 0. F o r invariant tests with respect to G;(p) Problem 1 reduces to testing HQ: 6 = 0 against the alternatives Hi6 > 0. From (4.5) the pdf of R under HQ is given by r ( a
Mn) =
J—rM(i-Tjtl*-*&-*
r ( f ) r ( V ) if 0 < r < 1, 0 otherwise,
( 1
which is a central beta with parameters ( | , ^j^)-
4.
6 }
'
From (4.5) the distribution
of R has monotone likelihood ratio in i i (see Lehmann, 1959). T h u s the test which rejects HQ for large values of R is uniformly most powerful invariant (UMPI)
under the full linear group for testing HQ against H i .
Since R ^ constant, implies that T
2
^ constant we get the following theo-
rem. T h e o r e m 4.1.1. whenever T
2
For Problem
1, Hotelling's
T
2
test which rejects HQ
^ C, the constant C depends on the level a of the test, is
for testing HQ against
UMPI
Hi.
The group Gt{p) induces on the space of {X, S) the transformations (X,S)^(gX,gSg'),
9
€G,(p).
It is easy to conclude that any statistical test
a function of
{X,S}
whose power function depends on ( / i , £ ) only through 6, is almost invariant under G j ( p ) . Since
Some Best Invariant
E ^(X S) ll
=
<
Tests in Multinomials
71
E . _ . . .ct>(X,S) g 1)l
=
g lEg l
E 2
thus = 0
E^((X,S)-
Using the fact that the joint probability density function of
( X , S) is complete, we conclude that
=
1
4'(gX,gSg•}
for all g 6 <3J(P)> except possibly for a set of measure zero. A s there exists a left invariant measure (Example 2.1.3) on G ( p ) , which is also right invariant, (
any almost invariant function is invariant ( L e h m a n n , 1959). Hence we obtain the following Theorem for Problem 1.
T h e o r e m 4.1.2.
Among
pending on 6, Hotelling's
2
T
all tests of HQ: £ — 0 with power function test is
de-
UMP:
P r o b l e m 2. Let Ti be the group of translations such that t i , g T i translates the last p — pi components of each X", a — 1 , . . . , N and let Gi be the subgroup of Gi{p)
such that g 6 G\ has the following form g=
(m
o
\9{21) 9(22) where g^ j U
is the p
x
x p
x
upper left-hand corner submatrix of g. T h i s problem
is invariant under the affine group ( G ^ T i ) such that, for g £ Gi,tj
€ Ii,
(g, ( i ) transform a
a
X ^gX
+ t
u
a =
l,..,,N.
A maximal invariant in the space of (X,S)
under ( 0 i , T i ) is Bi,
R ^ N X ^ S ^ + N X ^ X ' ^ r ' X ^
where (4.7)
as defined in E x a m p l e 2.4.3 with di == Jh- A corresponding maximal invariant in the parametric space of (£, E ) under ( G i . T i ) is #i = N ^ E ^ . ^ J J . variant test this problem reduces to testing H Hi:
&i > 0. From Theorem 2.8.1 the pdf of R
t
X
For in-
: §i = 0 against the alternatives
0
is given by
72
Group Invariance
in Statistical
Inference
From (4.8) it follows that the distribution of R\ possesses a monotone likelihood ratio in i%. Hence we get the following Theorem. T h e o r e m 4 . 1 . 3 . For problem 2 the test which rejects Ho for large values of Ri is UMPI
under (Gi,Ti)
for testing Ho against
H\.
N o t e : A s in Problem 1 this U M P I test is also the likelihood ratio test for Problem 2.
Using the arguments of Theorem 4.1.2 we can prove from
Theorem 4.1.3 that among all tests of Ho with power function depending on Si, the test which rejects Ho for large values of Ri is U M P for Problem 2. P r o b l e m 3.
Let T
be the translation group which translates the last
2
a
p — pi — P2 components of each X , of Gi(p) such that g e G
a = 1 , . . . , N and let G
0 \ 0 1 3(32) ff(33] /
\9(31)
is pi x p and <7( ] is p2 x p . t
affine group (G2,T )
22
a
a
X ^gX
(G ,T ) 2
2
2
and t
2
T h i s problem is invariant under the
transforming
2
where g £ G
be the subgroup
(9iu) 0 9(21) 3(22)
9 =
where
2
has the form
2
2
+ t
2l
a =
l,..,,N
€ T . A maximal invariant in the space of (X, S) under 2
is C f l M b ) ) where
1
R l + R - NXf2](Si22] + N ? [ ] A ' [ 2 ] ) " X [ 2 | 2
J
2
as defined in Example 2.4.3 with di = pi, d = p . 2
2
A corresponding maximal invariant in the parametric space of ( £ , E ) is (61,62), defined by,
^=^(1)^-^(1).
Under the hypothesis H ,$1 6
= 0, 6 = 0 and under the alternatives Hi, fij = 2
0, 6 > 0. From Theorem 2.3.1 the joint pdf of Ri, R 2
2
is given by
Some Best Invariant Tests in Multinomials
m
, _
f l
'
2 >
73
n | N ) r t l p i j r c ^ r t ^ p - p ! - ^ ) )
From G i r i (1977) the likelihood ratio test of this problem rejects Ho whenever
where the constant c depends on the level a of the test and under Ho, Z is distributed as central beta with parameters ( | ( i V — pi — p ) , \p?)2
From (4.9) it follows that the likelihood ratio test is not U M P I . However for fixed p, the likelihood ratio test is approximately optimum as the sample size N is large (Wald, 1943). T h u s if p is not large, it seems likely that the sample size commonly occurring in practice will be fairly large enough for this result to hold. However, if the dimension p is large, it might be that the sample size N must be extremely large for this result to apply. We shall now show that the likelihood ratio test is not locally best invariant ( L B I ) as S - * 0. T h e L B I test rejects H whenever 2
0
Ri +
N
~
P
l
R
> c
2
(4.11)
where the constant c depends on the level a of the test. which rejects Ho whenever R, + R
Hotelling's T
2
test
^ c, the constant c depends on the level O:
2
of the test, does not coincide with the L B I test, and hence it is locally worse for Problem 3. D e f i n i t i o n 4 . 1 . 1 . ( L B I test). For testing H
: 9 € £IH„,
a
an invariant test
4>* of level a is L B I if there exists an open neighborhood fii of fl/jo such that
Eebt?) >
E {4>\ e
0
e&
-fijio,
(4.12)
where 0 is any other invariant test of level a. T h e o r e m 4 . 1 . 4 . For testing H
0
0, the test which rejects HQ whenever
: h*i = 0, tf = 0 against H\ : S i = 0,6 2
N
R\ + ~?'
R
2
depends on the level a of the test, is LBI as 6 — * 0. 2
^ c, where the constant
>
2
c
74
Group Invariance
Proof.
in Statistical
Since (R\,R2)
Inference
is a maximal invariant under the affine group
( G 2 , r ) , the ratio of densities of (R ,R ) 2
1
to that under HQ, is
under Hi,
2
given by, m , ^ ) f(fi,f \0)
=
1
4
+
(
2
' _ 2 V
1
+
f
i
+
i v ^ P2
f
2
> )
+
o
(
,
2
)
as 6 —* 0. Hence the power of any invariant test d> is given by 2
E ($)=a-rE ^ $6 S2
s
D
(-l
2
+ R i + ^ ^ R ^ j
as 62 —* 0, which is maximized by taking $ = ^ c.
+
o(6 ) 2
N
1 whenever Ri + ~
Pl
R n 2
Asymtotically Best Invariant ( A B I ) Test Let # be an invariant test for testing H
0
: $1 = 0, 6 = 0 against Hi : S\ = 0, 2
6 = A with associated critical region fi = { ( f i . f a ) ! U{f\,ft)
^ c „ } satisfying
2
P « ( f i ) = a; 0
P (R)
= 1 - exp{-H(A)(l +o
Hl
2
f!-"- lff? = expWAJlGfAi + +
fifA)^,^)]} fi(f f ;A);
(4.12)
l7 2
where J B ( f i , f ; A ) — o ( H ( A ) ) as A —> 00 and 0 < ci < R(X) < c 2
2
< 00. Then
$ is an A B I test for testing HQ against H1 as A —1 00. T h e o r e m 4 . 1 . 5 . For testing HQ : S\ = 62 = 0 against H i : t>\ — 0 , £ — 2
A > 0, Hotelling's
test which rejects H
0
whenever Ri + R
2
^ c, the constant c
depending on the level a of the test, is ABI as A —* 0 0 . P r o o f . Since _,
,
.
,
a
a(a +1)
= exp{ar(l + o ( l ) }
2
x
as x —< 0 0 ,
we get from (4.9) T^j^j
=
e
x
^
f
" i " M(l
+ B(f f2-,\))}; u
(4.13)
where S ( f i , f ; A ) - o ( l ) as A - » 00. It follows from (4.2) that the pdf of U = R\ + R satisfies 2
2
Some Best Invariant Testa in Multinomials P(U(R ,R ) l
=
2
R
1
+ R
2
< c
a
75
)
= exp { ^ ( c « - 1 ) 0 + o ( l ) ) }
•
(4-14)
Thus, from (4.12) with H(X) = ~{c„ - 1), we conclude that Hotelling's T
2
test
is A B I .
•
4.2. T h e C l a s s i f i c a t i o n P r o b l e m ( T w o P o p u l a t i o n s ) Given two different p-dimensional populations Pi,P probability density functions pi,p
2
• . ., x Y P
characterized by the
2
respectively; and an observation x — [x\,
the classification problem deals with classifying it to one of the two
populations. We assume for certainty that it belongs to one of the two populations. Denote by Oj, t = 1,2 that x comes from Pi and let us denote, for simplicity, the states of nature Pi simply by i. L e t L ( j ,
be the loss incurred
by taking the action a; when j is the true state of nature having property that L(i,a<) - 0, L(i,a,j)
= ij > 0, j j£ i = 1,2,
L e t (x, 1—T) be the a priori probability distribution on the states of nature. The posterior probability distribution ( £ , 1 — £ ) , given the observation x, is given by _ 5
rrp^x)
(
«Pi(*J + {
l
The posterior expected loss of taking action a
l
and for taking action a
•
(4.16)
+ (1 - JT)P2(Z)
is given by
2
irpi{x)
+ (1 - iv)p {x) 2
(4.17)
'
From (4.16)-(4.17) it follows that we take action a
if
2
TC Pl(x)
,
2
x p i ( x ) + (1 - 7 r ) p ( i )
(1-^)C1P (X) 2
npi(x)
2
+ (1 -
w)p (x) 2
and action a j if 7rc pi(a;) 2
npi(x)
'
is given by
(1 - 7T)CiP2(l) Kpi(x)
,
1
>
+ (1 - n)p (x) 2
and randomize when equality holds.
{1 - ir)c (x) lP2
xp^x)
+ (1 - ?r)p2(i)
^
76
Group Invariance
in Statistical
Inference
(4.18) and (4.19) can be simplified as, take action a
if
> .,
take action a ,
if
< JiT
2
and randomize when
.— = K (say), (4.20)
— K.
L e t us now specialize the case of two p-variate normal populations with different means but the same covariance matrix. Assume Pi : W „ ( « , E ) , p :A. (M,S), 2
where a =
...,^ )', £ -
p
(ft,. ,-,£p)' e i i
p
p
and £ is positive definite.
Hence pa(«)
1
= exp - exp
[f> - O'S" ^
(s'E^O
Using (4.20) we take action a
2
-0
HYX-H*
4- | ( { ' E
_ 1
£ -
- f*)]
}
pE'V)
if
and take action a , otherwise. T h e linear functional T — £
- 1
x
( p . — 4) is called
Fisher's discriminant function. In practice E , £, ft are usually unknown, and we consider estimation and testing problems for T — ( r i , . . . , T ) . p
a
Let X , a
let Y ,
r
a — 1 , . . . , Ni be a random sample of size A i from N (£, p
E ) and
r
a — 1 , . . . , A7 be a sample of size A7 from A ( p , E ) . Invariance and 2
2
p
sifficiency considerations lead us to consider the statistic
s = Y, x {
a
-
+ J2( ° - )( - )'. Y
=
ya
y
a=l
a=l
where N,X
Y
a
X , A ^ y = J T ^ Y° for the statistical inferences of V.
We consider two testing problems about T. P r o b l e m 4. To test the hypothesis H alternatives Hi T ^ O .
0
:r
p
i
+
i = ••- —r
p
= 0 against the
Some Best Invariant
P r o b l e m 5 . T o test the hypothesis H
:T
0
alternatives Hi : r
p
i
+
P
2
i = ••• = T
+
Testa in Multinomials
= ••• = T
P 1 + 1
p
77
— 0 against the
= 0.
p
Without any loss of generality we restate our setup in the following canonical form, where X = {Xi,...
,X )
is distributed as J V ( 7 j , E ) and S is dis-
P
p
E ) and T — E J J . Since E is _ 1
tributed, independently of X, as Wishart W (n, p
positive definite, T = 0 if and only if n - 0. T h u s the problem of testing the hypothesis T = 0 against T ^ 0 is equivalent to Problem 1 considered earlier in this chapter. P r o b l e m 4, It remains invariant under the group G of p x p nonsingular matrices g of the form 0
=(m
)
9
\9{21)
(4.21)
9(22) }
where 9{\\) is of order p i , operating as (X S;T S) }
1
^
}
(gX^gSg'^g')- ?^^')
where X, jf are the mean and the sample covariance matrix based on N observations on X.
B y E x a m p l e 2.4.3 a maximal invariant in the space of the with di = pi, d% = P2
sufficient statistic (X, S) under G is given by (Ri^R^), and p — pi + P 2 - T h e joint pdf of ( r ? i , R ) 2
is given by (4.4) where the expres-
sions for 6i, 6 in terms of T, E are given by 2
Si + s
= AT'sr,
2
Nr'» M (x r r 22r
S = 2
with £
2
2
= (E 2) - E ( 2 i ) E j ( 2
_ n )
E(
(2)
1 2 )
)
_ 1
.
(4-22)
l
{2)
Under the hypothesis H 61 > 0, 0
02 — 0 and under the alternatives HiSi > 0, i = 1 , 2 . From ( 4 . 4 ) it follows that the probability density function of Ri under Ho is given by
m\$i)
= r(ip,)r(i(A/-p,)) x(fi)^
p ,
1
x exp | - ^ ! } 4> and Ri is sufficient for
W
" (l-fi)^ "
p l )
"
1
QjT, | M J
Ifi^l
,
(4.23)
t\.
Giri (1964) has shown that the likelihood ratio test for this problem rejects Ho whenever * =
i
T T 7 r ^
( 4
-
2 4 )
78
Group Invariance
in Statistical
Inference
where the constant c depends on the level a of the test and under H Z 0
central beta distribution with parameter {^{N
has a
- p ) , \p2) and is independent
oiRi. To prove that the likelihood ratio test is uniformly most powerful similar we first check that the family of pdf UXH%)-h
^0}
(4.25)
is boundedly complete. D e f i n i t i o n 4.2.1.
(Boundedly complete).
A family of distributions
{Ps(x)
'• 6 E fi} of a random variable X or the corresponding family of pdfs
{ps(x)
: 6 e fi} is boundedly complete if E h(X) 6
= J
k(x)dP (x)
= j
h{x)p {x)dx
s
s
= 0 for all 6 6 fi and for any real valued bounded function h(X), h(X)
implies that
= 0 almost everywhere with respect to each of the measures Pf.
We also say AT is a complete statistic or X is complete. L e m m a 4.2.1. plete.
The family of pdfs { / ( f i ^ i ) : 6\ > 0} is boundedly com-
P r o o f . For any real-valued bounded function h(R]} (4.23)) E- (h{Ri)) Sl
= j
Hnmr^dfi
= ex {-i*- } P
1
x j-
f = exp <
l - 1 ^
Jo
1
I^ )
r
l
of Ri we get (using
Some Best Invariant
Tests in MultxnoTmals
79
where 1
N
h*(r )=h{r\)fi™- (l-r )u -Ky-\ 1
l
g ( £ { P - P i ) , 5 ( * - P i ) + j)
a 3
B ( i ( J V - p,), i
P
+ j ) B ( i ( J V - p), i ( p -
l
and B ( p , g) is the beta function. Hence E g h(Ri)
E
-I
))
= 0 implies that
J
/
P l
^Cn)(n) ^i=0-
(4.26)
Since the right-hand side of (4.26) is a polynomial in 6\, (4.26) implies that all its coefncents must be zero. In other words
/ Jo
h*(f\)f{dfi
=0,
j =0,1,2,....
(4.27)
Let
+
where h"
-
and ft* denote the positive and negative parts of h'. Hence we get
from (4.27)
Jo for j=
Jo
0 , 1 , . . . ; which implies that h*+(f )
=
1
for all ^ h*(Ri)
h-(f ) l
except possibly on a set of measure 0.
= 0 almost everywhere {Pg (Ri) 1
Hence we conclude that
: Si ^ 0} which, in turn, implies that
h ( f i i ) = 0 almost everywhere. Since Ri
•
is complete, it follows; from L e h m a n n (1959), p. 34; that any
invariant test 4>(Ri,R ) 2
of level a has Neyman Structure with respect to
Ri,
i.e., E (
ll
2
i
=
c. l
To find the uniformly most powerful test among all similar invariant tests we need the ratio R R =
fH (fl,f \Rl
= fi)
}H {T\,T \Ri
= f%)
l
a
2
2
80
Group Invariance
where / ^ ( f i f f ^ - f i i
in Statistical =
r
Inference
i ) denotes the conditional pdf of (RuRg)
given Ri =
f i under Hi,i = 0,1. From (4.9) we get
H - exp | - i | ( l - f x ) }
~ P i ) , \(P ~ P i ) ;
2
-
From (4.28), using the fact that Z is independent of R± when H
0
(4-28) is true, we
get the following theorem: T h e o r e m 4.2.1. For Problem 4, the likelihood ratio test given in (4.24) is uniformly
most powerful invariant
similar.
P r o b l e m 5. It remains invariant under the group G of p x p nonsingular matrices g of the form
9 =
/ V > 3(21) \3(31)
0
0
3(22) 3(32)
\ 0 3(33)/
with 3 ( H ) : pi x p2, 9(22) : P2 x p2 submatrices of g. A maximal invariant in the space of (X,S)
(as in Problem 4) under G is (Ri, R , R3) as defined in 2
Example 2.4.3 with d\ = p i , d = p and d$ = p - pi - p2. B y Theorem 2.8.1 2
2
the joint probability density function of Ri, R , Ri is given by 2
1
x
f
i""- ^i»-
l F
i(r^-W-»J-l
x f l - n - ^ - f a j i ^ - p ' "
'=1
l,
J= l
1
>>J
where
"» - E J' p
and
CTQ
=
° ' P3 = p - p i - p a •
J
Some Beat Invariant
&1 = i V f E j i i j l ^ i ) + S s
( 1 2
|r
Tests in Multinomials
81
) + £(18)1(3) ) ' S j ~ , j
i 2
r
* ( ( i o ( i ) + S(i2)r ) + £(i3]r )), f !
s
r
( 3
s
A + 6 = f (n) (i> + ci2)r > + s i3)r V \ ( 2 1 ) ( l ) + £(22)^(2) + £(231^(3) / 2
( 2
S
/
j E(ii)r + £ VE(2i)r ) + £ (
x
( 1 )
, =
Under H , 0
W
r
( 3 )
;
,
1
( E
3 3
(
r -i-£ r ] r j + £(23)r(
( 1 3 |
( 1
S
(
- i ' '
s
r
1
2 2
( 3 )
j l 3 )
( 3 )
( 2
- ( ^ ; ) ' E
r
i
2 2
3 )
\ ^ J
. ( ^ ) , r
(
3
,
61 > 0, 6 = 0, 6 = 0, and under H i , Si > 0, S > 0, 6 = 0. From 2
3
2
(4.29) it follows that Ri is sufficient for S\ under H . 0
likelihood ratio test of this problem rejects H 1
Z =
~
R
1 -
i
;
S
a
0
3
From G i r i (1964) the
whenever $ c
tii
(4.30)
where the constant c depends on the level a of the test and under H distributed independently of H i as central beta with parameter (^(N
0
Z is
—p — x
J>2). \V2)Let
3
be any invariant level a test of H
0
against H i . From
Lemma 4.2.1 H i is boundedly complete. T h u s <j> has Neyman Structure with respect to R . x
Since the conditional distribution of ( H i , R , R3) given Ri does 2
not depend on c\, the condition that the level a test <j> has Neyman structure, that is, H
H D
(0(H ,H2,H )|HI)-Q 1
3
reduces the problem to that of testing the simple hypothesis 6 = 0 against the 2
alternatives 6% > 0 on each surface Ri — f\.
In this conditional situation the
most powerful level a invariant test of S = 0 against the simple alternative 2
S — 6 >0\s 2
2
given by,
rejectH
0
whenever
Q(JV -
P l
), ip
z ;
^f 6%j 2
^ c,
(4.31)
where the constant c depends on a. Since tf. = 2
(l-Hi)(l-Z)
and Z is independent of H i , we get the following theorem. T h e o r e m 4 . 2 . 2 . For Problem 5 the likelihood ratio test given in (4.30) is uniformly most powerful invariant
similar.
82
Group Invariance in Statistical
Inference
4.3. T e s t of M u l t i p l e C o r r e l a t i o n a
Let X ,
a = 1,
,N beN
independent normal p-vectors with common mean
fi and common positive definite covariance matrix E and let NX S = £% (X =1
a
a
-X)(X -X)'.
=
a
E%-,X ,
A s usual we assume JV > p s o that 5 is positive
definite with probability one. Partition E and S as
\E(2i]
E(22) / '
\ 5(21)
S( ) / 2 2
where E ( j , 5 ( ] are of order p— 1. L e t 2 2
2 2
_
2
s
(i3)E
=
"
E
( 3 2 )
( 2 1 )
E (u)
2
the positive square root of p
is called the population multiple correlation
coefficent. 2
P r o b l e m 6. To test H
;p
0
= 0 against the alternative Hi : p
problem is invariant under the affine group (G,T),
2
> 0. T h i s
operating as
(X, 5; p, E ) - » ( g X + t, gSg'; gp +1,
gXg')
where g g G i s a p x p nonsingular matrix of the form g=f
«") 0
9(22)
where p ( ) is of order p - 1 and T is the group of translations t of components 22 a
of each X . (G,T)
A maximal invariant in the space of {X,S)
under the affine group
is r
2
_ ^
2 2
5
> ( 22)5(21) 5
(H)
which is popularly called the square of sample multiple correlation coefficent. 2
D i s t r i b u t i o n of R . To find the distribution of R 2
5
R 1
S
(12)S[! ) (21) 5( ) - (i2)£ S 2
2
R
5
U
( Z 2 )
From Giri (1977) 5(u) -
5(i2)5, L£ 2
( 2 1 )
( 2 1
)
2
we first observe that
Some Best Invariant
is distributed as X N2
w P
t
'
2
Tests in Multinomials
83
N - p degrees of freedom and is independent of
n
2
S(i2). 5( 2). T h u s R /(l
- R ) is distributed as
2
S
1
• ( )5 2 5(22) 12
2
*N-p
E
(11) -
{2
|
S
S
(12) 2] (21) [2
But the conditional distribution of S
( 12) 5(22)
\/E(ii} - S
{ 1 2 )
S
2
( 2 2 )
S(
2 l )
given 5 ( ) , is (p — l)-variate normal with mean 2 2
^{22)^(22)^2) ^ E ( u ) - E(i2)S 2)S(21) (2
2
2
and covariance / . Hence R /(l 1
2
- R ) is distributed as |
/E( )S 5(22)E S ) pl21^( )^22)^ )^(2in 1 2
( 2 2 |
( 2 2 )
2 2
X%-
X
p
v
*
p
E
(
1
1
( 2 1
( 2 2
-EdajE^Efai)
)
where Xj.(A) denotes a noncentral chisquare random variable with noncentrality parameter A and k degrees of freedom. Also s
(i2)E E
2 2 )
S
S
| 2 2 |
S(
1 2
)
S
(12) (22) (21>
Xw—v Since
is distributed as chisquare S
S
S
(12) (~22) (21)
E(u) - E ( 2
5(
( 2 2 )
I 2 )
E (
2 1
E
p ( 2 1
)
2
1 -P
2
'
2
we conclude that R /(l
- R ) is distributed as 1
2
,2 I _P_
2
\
Using the fact that a noncentral chisquare random variable XmW represented as xt +2K> 1
w
n
e
r
e
can be
K is a Poisson random variable with parameter
^, we conclude that ^ ( r r ^ - i )
84
Group Invariance
in Statistical
is distributed as X - p _
1 + 2 K
Inference
- , where the conditional distribution of K given x ? v - i 2
2
is Poisson with parameter \[p l[l
P(K
2
- P )]XN-I-
2
Let A/2 - i p / ( l - p ) . Then
= k) = w
1
25< - >r(i(Ar_i))A;!
_
r(§(iV-i)+*)A* r ( | ( N - l ) + fc) ,
_
k
2
)
i
(
_ i ,
N
{
4
fc!r<±(iv-i)) with lb = 0, 1 , . . . . Thus
X p
is distributed as
-
1 + 2 K
lo U 1 M I 1 U U . c u d o
—T-
(4.34)
t—
2
N-
2
1 - B
X
where K is a negative binomial with pdf given in (4.33). Simple calculations yield
-
(N
2
p)R
2
(p~l)(l-R ) has central F distribution with parameter (p - 1, N - p) when p (4.33)-(4.34) the noncentral distribution of R (j f R
'
{ r
>-
r
2)l(N- -3) P
( r
2
— 0. From
is given by
2 l( -3) )
2
P
( 1
_
ftHtr-l)
r(i(N-i))r(i(iv-p)) , f . W W ( j ( f f - i ) x
fe
^(i( -D P
+
l
+ i)
.....
)-
( 4
'
3 5 )
It may be checked that a corresponding maximal invariant in the parametric 2
space is p . T h e o r e m 4.3.1. For problem 6 the test which rejects H
0
whenever R
the constant c depends on the level a of the test, is uniformly invariant.
most
2
^ C,
powerful
Some Beat Invariant
Tests in Multinomials
85
P r o o f . From (4.36)
2
~
(A )T^(Jv-i^)r(j(p-i)) 2
^
i!r(i(p-i) + i)r (i(7v-i))
2
2
which for a given value of p
is an increasing function of r .
'
Using Neyman-
Pearson L e m m a we get the theorem.
• 2
As in Theorem 4.1.2 we can show that among all tests of Ho : p against Hi : p
2
> 0 with power function depending only on p
2
= 0
2
the fi -test is
UMP. 4.4. T e s t of M u l t i p l e C o r r e l a t i o n w i t h P a r t i a l I n f o r m a t i o n Let X be a normally distributed p-dimensional random column vector with a
mean p and positive definite covariance matrix S , and let X ,
a — 1 , . . . ,7V
[N > p) be a random sample of size N from this distribution. We partition X as X = (Xi,X^,X'^Y
where X\ is one-dimensional, X^
is pi-dimensional
and X ( ) is p2-dimensional and 1 + p i + P 2 = p. Let p\ and p denote the multi2
ple correlatron coefncents of X\ with X[ )
and with (X'^X'^y,
2
respectively.
2
Denote by p\ = p — p\. We consider here the following two testing problems: 2
= 0 against the alternatives Hi\;
p\ — 0,
2
= 0 against the alternatives H \:
p\ = 0,
P r o b l e m 7. To test H m : p p\ = X > 0. P r o b l e m 8. T o test H o-
p
2
pl =
2
X>0.
Let NX
0
= X^,X ,
S = S^^A-"-X)(X°-X)',
6
(fJ
denote the i-vector
consisting of the first i components of a vector b and C[;j denote the i x i upper left submatrix of a matrix c. Partitions 5 and E as
(
5n
5
( 1 2
)
S(
5 ] 5(32 )
5(21]
\
5 ) I , 5(33)/
| 2 2
5(31|
1 3 )
[ 2 3
E
—
/ En
E(i2j
I S(2i) \^(31)
E(
E(
3 2
)
E( 3) ) E( )
E 22) (
1 3
2
3 3
where 5 ( ) and E ( 2 ) are each of dimension pi x p i ; 5 ( ) and E ( j are each 2 2
2
3 3
3 3
of dimension P2 x p . Define Pi = E ( i ) E ^ E( /E , 2
1
2
2
p =^+pI
2
=
2 )
2 1 1
1 1
22
{E, E, 3 )fg ; 12)
1
)
\
(32)
^ ^ " ' ( E d ^ ^ l ' / S n , ^(33) /
86
Group Invariance in Statistical
Rl
- £(i2)S
( 2 2 )
Inference
S< )/\Sii , 2 I
4 +fl = c% ^ ) ( f p jjgj ) 2
3)
_ 1
•
}
It is easy to verify that the translation group transforming (X,S
;
i
i
, E ) ^ ( X + &,S;p + 6 , E )
leaves the present problem invariant and, along with the full linear group G of p x p nonsingular matrices g
o
(gu 9= \ where ffii : 1 X 1, 9 {
2 2 )
0 0
: pi x p
u
o \
ff(23) 0 9(32) 9(33) / g : l33)
p
2
(4.36)
x p , generates a group which 2
leaves the present problem invariant. T h e action of these transformations is to a
a
reduce the problem to that where p, = 0 and S = £% X X '
is sufficient for
=1
E , where N has been reduced by one from what it was originally. We now treat later formulation considering X", a = 1 , . . . , TV ^ p > 2, to have a common zero mean vector.
We therefore consider the group G of transformations g
operating as (S,Z)^(gSg\gU) for the invariance of the problem. A maximal invariant in the sample space under G is {Ri,R )
as defined in (4.11).
2
Since S is positive definite with
probability 1 (as N ^ p ^ 2 by assumption): R, 2
> 0, R
> 0 and Ri + R
2
2
=
the squared sample multiple correlation coefficent between the first and the
remaining p - 1 components of the random vector X. A corresponding maximal invariant in the parametric space under G is (p\,p\). density function of (R^R ) 2
/a(n,f ) - m 2
2
N/2
- P r (l
T h e joint probability
is given by (see G i r i (1979))
N p
~ fi - f )^ - ~V 2
TJfo)**-*
x
" A S S
« w t * + A > —
1
9
,
1
•
< 4
'
3 7 )
Some Best Invariant Tests in Multinormals where
% = 1- E
ff
w
P)
p
a
i = E J'
?
i
t
h
=
=
1
'
~
f)/Wi-i. .2
7.
and i f is the normalizing constant. B y straightforward computations the likelihood ratio test of Hio when fl — { ( « , E ) : E 1 3 — 0 } rejects J?io whenever f ^ c ,
(4.38)
where the constant c depends on the size a of the test and under HM, RI has a central beta distribution with parameter ( i p i , ± ( N - p i ) ) , and the likelihood ratio test of H
when fl = { ( p , S ) : S
20
^
1
1
2
= 0 } rejects H Q whenever 2
' ^ ~ ^ 4 1 - rj
,
C
(4-39)
where the constant c depends on the size a of the test and under H o the 2
corresponding random variable Z is distributed independently of R\ as central beta with parameter (^(JV — p i — ps), \p )2
T h e o r e m 4 . 4 . 1 . For problem 7 the likelihood ratio test given in (4.38) is UMP invariant.
Proof. Under ff
u
Under H
i0
f
t
= 1 , i = 0,1,2.
2
Hence a
P . = A, p\ = 0, % = 1, 71 = 1 - \ a
2
2
2
_
i=0
A
= 0, §i = 0, i = 1,2.
= 1 - A,
81 = 4fiA and t? = 0. T h u s /fln(n,r ) _ ^
2
j-w/2
(801
7
2
- 1 - A, al = 0,
87
88
Group /ntrarinnce in Statistical
Inference
Using the Neyman-Pearson L e m m a and (4.40) we get the theorem.
•
T h e o r e m 4 . 4 . 2 . For Problem 8 the likelihood ratio test is UMP
invariant
b
among all tests <j>(R R.2) ^sed on R\,R\
satisfying
u
Proof.
Under
p\ = 0, p\'=
1, 7a = 1 - A, &{ - 0,
- 1
a\8 = A, h = 0 and 0 = 4 f A ( l 2
A, % = 1 , %
fiA) .
2
Hence f
T
=
f
fn A 2\ l)/fH ( 2\n) 3
20
r
r
fH A l,?2)/fnio( l,r2) 1
0 0
.ra(JV-p )+t)/
/
1
— From (4.41) / j y fi).
a j
(2i)!
4f A 2
\ l - f
( f | r i ) has a monotone likelihood ratio in f 2
1
2
A
/
(
4
4
1
)
= (1 — z)(l —
Now using Lehmann (1959) we get the theorem.
•
Exercises 1. Prove Equations (4.3) and (4.5). 2. Prove (4.10). 3. Show that JrJ^(A) is distributed as X +2K m
w
n
e
r
e
i f is a Poisson random
variable with parameter j A . 4. Prove (4.31). 5. Let 7ri, ?r be two p-variate normal populations with means p\ and / i and 2
2
the same positive definite covariance matrix S . L e t X — {Xi,... distributed according to ir\ or ir and let b = ( & i , . . . ,b )' 2
p
,X )' P
be
be a real vector.
Show that ,
[E lb'X)-E2(b X)\
2
1
var(b'X) is maximum for all b if 6 — E
-
1
(pi — p ) 2
where Ei denote the expectation
under Tij. 6. (Giri, 1994a). Let X
= {X ,...,
a
X )',
al
a = 1 , . . . , N(>
op
p) be indepen-
dently identically distributed p = 2p!-variate normal random vectors with mean p = (/*!..- • tUp)' Let X = jjX^X^
a n <
*
c o m m
o n covariance matrix E (positive definite).
S = ^ , ( X
a
- X)(X
a
- X)'.
Write
Some Best Invariant Tests m Multinomials _ / S ( u )
E
£(ia)\
s =
(5(11)
X = ( X i , . • • Xp) — where S / y j and Sufy (a)
w
2
T
2)
p
= N(X
( 1
where T
2
%2}V
,
are pi x pi for all t, j and X ( i ) = ( X i , • • • , X , ) ' p
Show that the likelihood ratio test of H {fi' ,p[ )\
89
: M(i) =
a
p ( j , with p
=
2
= ( M I I - - I M P I ) ' rejects r Y for large values of
( 1 }
0
) - X(2))'(5
_1
+ 5(22) - 5(i2) - 5(2i)) (-A"[i) - X
( 1 1 )
2
is distributed as x ( S ) / ' x % P 1
P 1
with 6
2
-
( 2 |
/ V ( M ( I ) ~ r*(2])'
_
• ( E ( i i ) + E 22) - E independent. (
( 2 1 1
)
- E 2 ) ) ' ( M ( I ) - "(2)) and X p , ( - ) . X w ( 1
a
r
e
P l
(b) Show that the above test is U M P I . 7. ( G i r i , 19946). In problem 5 let T with r (a)
( 1 )
= (r
1 [
E " V -
(Ti
r
2
p
J' =
(T^j,Tfaf
...,r „)'. j
Show that the likelihood ratio test of HQ : T ( i ) = r ( 2 ) rejects Ho for small values of Z _
,
1
1 + i V ( X ( i ) + X ( ) ) ( S ( n ) + 5(22) + 5(21) + 5 ( i ) ) ~ ( X ( i ) + X(2)) 2
2
l +
NX'S^X
where Z is distributed as central beta with parameter {^{N — pi),
\pi)
under Hn. (b) Show that it is U M P I similar. (c)
F i n d the likelihood ratio test of H
0
: T ^ , = A F ( ) when A is known. 2
8. ( G i r i , 1994c). I n problem 6 write
( where E (
3 1
£(ii)
E(
1 2
E(2i)
E
E(3l)
E(
j
{ 2 2
3 3
E(
1 3
) \
/S(it)
)
E(23) I ,
)
E(
3 2
U
2 2
( 1 2
)
5 = I 5(2i)
5(2 )
\ -S"<31)
5(32)
)/
) and 5 ( ) are 1 x 1, E (
5
S(
S
2
1 3 |
( 2 3 |
5(33)
) and 5( 2) are pi x pi and E 2
( 3 3
j and
5(33) are pi x pi with 2pi = p — 1. (a)
Show that the likelihood ratio test of H
0
:E
( 1 2
) = E(
1 3
) rejects H
for
a
large values of R\
= (5(12)-5(i ))(5( ) + S ( ) - S ( 3 2 ) -5(23)) 3
2 2
3 3
1
(5(21) ~ 5 ( ) ) / 5 ( ) , 3 !
U
90
Group Invariance in Statistical
Inference
the probability density function of R\ is given by _
( 1 - ^ - 0 ( 1 - ^ ( ^ - 3 )
r(i(iv-i))r(i(iv- )) (^) (^?> "- r (^(Jv-i) + j) Pl
J
Pl+J
1)
3
mhpi+j)
j=0 where p
2
= i V ( £ ( 1 2 ) - 2(13)1(2(22) -I- £ { 3 ) - £ 3
x (£(2i) -
£
( 3
i))/£
( 1 1
( 3 2
) - E(23])~
l
).
(b) Show that no optimum invariant test exists for this problem.
References
1. N, Giri, Locally Minimax Tests of Multiple Correlations, Canad. J . Statist. 7, 53-60 (1979).
2. N. Giri, Multivariate Statistical Inference, Academic Press, N.Y., 1977.
3. N. Giri, On the Likelihood Ratio Test of a Normal Multivariate Testing Problem, Ann. Math. Statist. 35, 181-189 (1964). 4. N. Giri, On a Multivariate Testing Problem, Calcutta Stat. Assoc. Bull. 11, 55-60 (1962).
5. N. Giri, On the Likelihood Ratio Test of A Multivariate Testing Problem II, Ann. Math. Statist. 36, 1061-1065 (1965).
6. N. Giri, Admissibility of Some Minimax Tests of Multivariate Mean, Rapport de Research, Mathematics and Statistics, University of Montreal, 1994a. 7. N. Giri, On a Test of Discriminant Coefficients, Rapport de Research, Mathematics and Statistics, University of Montreal, 1994b.
8. N. Giri, On Tests of Multiple Correlations with Partial Informations, Rapport de Research, Mathematics and Statistics, Univerity of Montreal, 1994c.
9. N. Giri, Multivariate Statistical Analysis, Marcel Dekker, N. Y . , 1996. 10. J . K . Ghosh, Invariance in Testing and Estimation, Tech. report no Math-Stat 2/672. Indian Statistical Institute, Calcutta, India, 1967.
11. E . L . Lehmann, Testing Statistical Hypotheses, Wiley, N.Y., 1959.
12. A. Wald, Tests of Statistical Hypotheses Concerning Parameters when the Number of Observations is Large, Trans. Amer. Math. Soc, 54, 426-482 (1959).
Chapter 5 SOME MINIMAX T E S T S I N MULTINORMALES
5.0. I n t r o d u c t i o n T h e invariance principle, restricting its attention to invariant tests only, allows us to consider a subclass of the class of all available tests. Naturally a question arises, under what conditions, an optimum invariant test is also optimum among the class of all tests if such can at all be achieved. A powerful support for this comes from the celebrated unpublished work of Hunt and Stein, popularly known as the Hunt-Stein theorem, who towards the end of Second World War proved that under certain conditions on the transformation group G , there exists an invariant test of level a which is also minimax, i.e. minimizes the maximum error of second kind (1-power) among all tests. Though many proofs of this theorem have now appeared in the literature, the version of this theorem which appeared in Lehmann (1959) is probably close in spirit to that originally developed by Hunt and Stein. P i t t m a n (1939) gave intuitive reasons for the use of best invariant procedure in hypothesis testing problems concerning location and scale parameters. Wald (1939) had the idea that for certain nonsequential location parameter estimation problems under certain restrictions on the group there exists an invariant estimator which is minimax. Peisakoff (1950) in his P h . D . thesis pointed out that there seems to be a locuna in Wald's proof and he gave a general development of the theory of minimax decision procedures invariant under transformation group. Kiefer (1957) proved an analogue of the Hunt-Stein theorem for the continuous and
91
92
Group Invariance
in Statistical
Inference
discrete sequential decision problems and extended this theorem to other decision problems. Wesler (1959) generalized for modified minimax tests based on slices of the parametric space. It is well-known that for statistical inference problems we can, without any loss of generality, characterize statistical tests as functions of sufficient statistic instead of sample observations. Such a characterization introduces considerable simplifications to the sample space without loosing any information concerning the problem at hand. Though such a characterization in terms of maximal invariant is too strong a result to expect, the Hunt-Stein theorem has made considerable contribution towards that direction.
T h e Hunt-Stein theorem
gives conditions on the transformation groups such that given any test for the problem of testing H : 8 € f l / / against the alternatives K : $ € fl/f, with fi// n fi/f a null set, there exists an invariant test sup E (f>> eefi„ S
such that
sup E yj, eer2„
(5.1)
e
inf E d> < inf E il>. fen* ~ eefi e
(5.2)
e
K
In other words, V behaves at least as good as any in the worst possible cases. We shall present only the statements of this theorem. For a detailed discussion and a proof the reader is referred to Lehmann (1959) or Ghosh (1967). L e t V — {Pg,0
e f!} be a dominated family of distributions on the E u -
clidean space (X,A),
dominated by a cr-finite measure p. Let G be the group
of transformations, operating from the left on X, leave fi invariant. T h e o r e m 5.0.1.
(Hunt-Stein
Theorem).
Let B be a a-field of subsets of
G such that for any A E A, the set of pairs (x,g) and for any B € B, g G G , Bg e B. Suppose distribution
functions
v
n
with gx £ A is in A x B
that there exists a sequence of
on ( G , B) which is asymptotically
right invariant
in
the sense that for any g € G , B € B \im^\v (Bg) n
- v»(B)\
= 0.
(5.3)
1
T h e n , given any test <j>, there exists a test V which is almost invariant and satisfies conditions (5.1) and (5.2). It is a remarkable feature of this theorem that its assumptions have nothing to do with the statistical aspects of the problem and they involve only the group G . However, for the problem of admissibility of statistical tests the situation is more complicated. If G is a finite or a locally compact group the best invariant test is admissible. For other groups the nature of V plays a dominant role.
Some Minimax
Test in Muttinormates
93
T h e proof of Theorem 5.0.1 is straightforward if G is a finite group. L e t m denote the number of elements of G. We define
As observed in Chapter 2, invariant measures exist for many groups and they are essentially unique. B u t frequently they are not finite and as a result they cannot be taken as a probability measure. We have shown in Chapter 2 that on the group 0{p)
of orthogonal matrices
of order p an invariant probability measure exists and this group satisfies the conditions of the Hunt-Stein theorem. T h e group GT(P) of nonsingular lower triangular matrices of order p also satisfies the conditions of this theorem (see Lehmann, 1959, p. 345). 5.1. L o c a l l y M i n i m a x T e s t s Let (X,A)
be a measurable space. For each point (5, JJ) in the parametric
space f!, where 6 > 0 and ij is of arbitrary dimension and its range may depend on i5, suppose that p{-; 6,n) is a probability density function on (3Z,A)
with
respect to some u-finite measure p . We are interested in testing at level a (0 < a < 1) the hypothesis HQ : 6 — 0 against the alternative Hi : 6 = A, where A is a positive specified canstant and in giving a sufficient condition for a test to be locally minimax in the sense of (5.7) below. T h i s is a local theory in sense that p(x;A, JJ) is close to p(x;o, 7?) when A is small. Obviously, then, every test of level a would be locally minimax in the sense of trivial criterion obtained by not substracting a in the numerator and the denominator of (5.7). It may be remarked that our method of proof of (5.7) consists merely of considering local power behaviour with sufficient accuracy to obtain an approximate version of the classical result that a Bayes procedure with constant risk is minimax. A result of this type can be proved under various possible types of conditions of which we choose a form which is more convenient in many applications and stating other possible generalizations and simplifications as remarks.
Throughout this section expressions like o(A), o(/i(A)) are to be
interpreted as A —> 0. For fixed a , 0 < a < 1 we shall consider critical regions of the form (5.4) where V is bounded and positive and has a continuous distribution function for each {S,Tj), equicontinuous in (6,7)} for 6 < some 6$ and which satisfies
94
Group Invariance
in Statistical
Inference
P , (R)
= a +
(5.5) x v
where q(X,n) o(l).
= o{h(X))
k(X)+q(X, ) V
uniformly in v with h(X) > 0 for A > 0 and h(X) =
We shall be concerned with a priori
probability density functions £<J,A
and t]\,x on sets 6 = 0, 8 = A respectively, for which
{ f r o i f ^
=
where 0 < Ci < r(A) < c B(x,X)
= o(h(X))
1 +
WW)
+ r(Xmm
+
B( , X
A)
(5.6)
< oo for A sufficiently small and p(A) = o ( l ) ,
2
uniformly in x.
T h e o r e m 5.1.1.
(Locally
Minimax
Tests) If the region R satisfies
and for sufficiently small X there exist £o,i. and fa^ satisfying
(5.5)
(5.6) then R is
locally minimax of level a for testing Ho '• 8 = 0 against Hi : 8 = X as A —* 0, that is to say,
lim i - o Sup0
tu/iere Q
Q
A
e
Q
a
i n f , PUR)-o i n f , Px, {4>\ rejects i f } n
=
t _
« -
o
0
is t/ie class of all level a tests
P r o o f . Let r\ = [2 + h(X){g(X)
+ c
a
r(A)}]
- 1
.
Then —p-
= 1 + h{X) \g(X) + c
r(X)}.
a
(5.8)
Using (5.7) and (5.8) the Bayes critical region relative to the a priori distribution £ \ = (1 - T ) £ O , X + n A
and ( 0 — 1 ) loss is given by
£
M
Some M i n i m a l Teat in Multino-rmales
95
Define for any subset A
Pi,*(A)
=
J
Px,vW)
Vx=R W
—
= B
X
X
6 , A W ,
B\,
- R .
(5.10)
Using the fact that B(x,X) sup
-o(A)
h{\)
and the continuity assumption on the distribution function of U we get Plx(Vx Also for U
x
= V
k
+ W )=o(X).
(5.11)
x
or Pi\x(Ux)
= PS,x(Ux)[l
+ Pim)}]
-
(5-12)
Let r* (A) = {1 x
- n) ^, £4 A
+
n ( l - pfc(jf)).
Using (5.9), (5.10) and (5.11) the integrated Bayes risk relative to $\ is given by r'x(Bx) = rl(R)
+ (1 - n ) [ P ' , ( ^ ) - P * ( V ) ] 0
x
0
A
A
,
+ n[f r,A(^)-F *, (-y- )] 1
A
A
= r ( R ) 4- (1 - 2 r ) [ P * , ( ^ ) - P * ( V ) ] A
+
A
0
Q
A
A
^O%(VX + W A ) O ( M A ) )
= r ( R ) + o(/ (A)). A
(5.13)
l
If (5.7) were false one could by (5.5) find a family of tests { 0 > J of level a such that d> has power function cv + g(X, ij) on the set 6 — A with x
lim sup [inf g(A, JJ) - h(\)]/h(\)
> 0.
T h e integrated risk r' of ^> with respect to £ \ would then satisfy x
lim s u K ( i i ) - r , ] / h ( A ) > 0 , P
A—0
contradicting (5.13).
•
96
Group invariance
in Statistical
Inference
Remarks. (1) Let the set {6 = 0} be a single point and the set {8 = A } be a convex finite dimensional Euclidean set where in each component $
of TJ is
00(A))- If 1
P ? ' ^ = l + h(X) U(x) + T, p(ar;0,Jj) where s
if
6i(*) OiifA) m + B{x, A , TJ)
(5.14)
a y are bounded and s u p , J 3 ( X , A , J J ) = o ( / i ( A ) ) , and if there X i
exists any
satisfying (5.6), then the degenerate measure f ?
A
which
assigns all its measure to the mean of £I,A also satisfies (5.6). (2) T h e assumption on B can be weakened to Px,„{\B(x,
A ) | < e k(X)} - » 0
as A - * 0 uniformly in n for each e > 0. If the
i are independent of
A the uniformity of the last condition is unnecessary. T h e boundedness of U and the equicontinuity of the distribution of U can be similarly weakened. (3) T h e conclusion of Theorem 5.1.1 also holds if Q„ is modified to include every family {<j>x} of tests of level a + o(h{X)). consider the optimality of the family {Ux} replacing R by Rx -
{x : U {x) x
> c ,x}, a
One can similarly
rather than single U by
where P „{Rx} Qt
= a-
qx(v)
with qxiv) — o(h{X)). (4) We can refine (5.7) by including one or more error terms. Specifically one may be interested to know if a level a critical region R which satisfies (5.7) with inf Px, {R) n
= a + ciA + o(A),
as A - > 0
also satisfies mUPxMRl-a-dX
l i m
x
o sup^
e
Q
O
=
inf, Px, {<j> rejects H } v
0
^
- a - c,X
In the setting of (5.14) this involves two moments of the a priori
distribution
£ I , A rather than just one in Theorem 5.1.1. A s further refinements are invoked more moments are brought in. The theory of locally minimax test as developed above and the theory of asymptotically minimax test (far in distance from the null hypothesis) to be developed later in this chapter serve two purposes. F i r s t the obvious point of demonstrating such properties for their own sake. B u t well known valid doubts
Some Minimax
Test in Multinormales
97
have been raised as to meaningfulness of such properties. Secondly, then, and in our opinion more important, these properties can give an indication of what to look for in the way of genuine minimax or admissibility property of certain tests, even though the later do not follow from the local or the asymptotic properties.
2
E x a m p l e 5 . 1 . 1 . ( T - t e s t ) . Consider Problem 1 of Chapter 4. L e t X i , . . . , XN
be independently and identically distributed J V ( p , E ) random vectors. p
Write NX
=
X„ 5 -
For testing H
0
£ f (X; -
- X)'.
Let 6 =
> 0.
: 8 = 0 against the alternatives Hi : 8 > 0 the Hotelling's
test which rejects H
Q
whenever R -
NX'{S
l
+ NXX')~ X
T
2
> c, where c is
chosen to yield level a, is U M P I (Theorem 4.1.1). N o t e . For notational convenience we are writing R\ as R. L e t 6 — A > 0 (specified).
We are concerned here to find the locally
minimax test of Hn against Hj : S = A as A - > 0 in the sense of (5.7).
We
assume that N > p, since it is easily shown that the denominator of (5.7) is zero in the degenerate case N < p. In our search for locally minimax test as A —> 0 we may restrict attention to the space of sufficient statistic ( X , S). T h e general linear group G)(p) of nousingular matrix of order p operating as ( £ , s ; p , E ) — . (gx,gsg';gp,
gT,g')
leaves this problem invariant. However, as discussed earlier (see also James and Stein, 1960, p. 376), the Hunt-Stein theorem cannot be applied to the group G ( p ) , p > 2. However this theorem does apply to the subgroup G j - ( p ) (
of nonsingular lower triangular matrices of order p. T h u s , for each A, there is a level a test which is almost invariant and hence for this problem which is invariant under Grip)
(see Lehmann, 1959, p. 225) and which minimizes,
among all level a tests, the minimum power under H i . From the local point of view, the denominator of (5.7) remains unchanged by the restriction to GT invariant tests and for any level a test d> there is a G j invariant level a test 4>' for which the expression i n f , P x , , [4>' rejects H ) a
is at least as large, so
that a procedure which is locally minimax among GT invariant level a tests, is locally minimax among all level a tests. I n the place of one-dimensional maximal invariant R under G j ( p ) we obtain a p-dimensional maximal invariant (Ri,...,R ) p
with k = p and di — 1 for all i, satisfying
as defined in Theorem 2.8.1
98
Group Invariance
J2
in Statistical
Rj = NX{ {S A
Inference
+ N X ^ r ' X ^
M
, i = 1,.. - ,p
(5.16)
i=] with R, > 0, £ ] ?
= l
flj - J? < 1 and the corresponding maximal invariant on
the parametric space of ( / * , £ ) is ( ^
6 ) (Theorem 2.8.1) where P
(5.17)
with Sf_i
— 5.
(rn,...,%)',
T h e nuisance parameter in this reduced setup is n =
with jji = 6,/6 > 0, V _ ^
TJJ = 1 and under H
= 1
Theorem 2.8.1 the Lebesgue density of Ri,...
i n , . . . , r ) : r , > 0, p
,R
P
0
i) = 0.
From
on the set
JPfl <
1
is given by
/(ri,...,r„) =
x exp
yj , /JV - « + 1
i
nSi
2'
2
(5.18)
We now verify the validity of Theorem 2.8.1 for U — Yl^=i ^ - s i preceding (5.5) for the locally minimax tests are obvious. fi(A) = b\ with 6 a positive constant. Of course P\,„(R)
n c e
those
In (5.5) we take
does not depend on
77. From (5.18) we get with r = ( r i , . . . r ) ' , p
/o,,(r)
+
2
where B(r,\,-n)
— o(A) uniformly in r , n as A —* 0.
satisfied by letting £
B(J,X,V)
(5-19)
T h e equation (5.6) is
give measure one to the single point TJ — 0 while
0 | A
gives measure one to the single point rj = TP* = (TJ", . . . L
T?; - (N - j)~ (N
- j + l r v ^ o v - p),
where j = 1,..., p
Some Minimax
n
so that Y,i>3 * + (N - j the following theorem.
+ l}rj; = £ for all j.
T h e o r e m 5.1.2. For every p,N,a{N 2
T
Tesi in MultinoTmales
99
From Theorem 5.1.1 we get
> p) Hotelling's
T
2
test based on
or equivalently on R is locally minimax for testing HQ : 6 — 0 against the
alternatives Hi : 6 = X as X —* 0. E x a m p l e 5.1.3. Consider Problem 2 of Chapter 4. From E x a m p l e 5.1.2 it follows that the maximal invariant in the sample space is ( J ? i , . . . , R )
with
P1
Ri
a n
=
t
d
n
e
corresponding maximal invariant in the parametric space Under Ho, Si = 0 and under Hi 6i — X.
is ( f i i , . . . ,(J ,) with 8\ = Yii' 6j. P
Now following E x a m p l e 5.1.2 we prove Theorem 5.1.3. T h e o r e m 5.1.3. For every pi,N,a large values
of R\
the UMPi
is locally minimax
test which rejects Ho for
for testing Ho : &i — 0 against
the
alternative Hi : 6\ — X as X —* 0. E x a m p l e 5.1.4. Consider Problem 3 of Chapter 4. We are concerned here to find the locally minimax test of Ho : Si = 0, S = 0 against the alternatives 2
Hi
: 8i = 0, 6
2
— X > 0. A s usual we assume that N > p the determinator
of (5.7) is not zero.
To find the locally minimax test we restrict attention
to the space of sufficient statistic (Xi (X, S) —* (gX + t ,
gSg'), g € G , t
2
2
S).
T h e group (G2,T )
operating as
2
€ T , which leaves the problem invariant,
2
2
does not satisfy the condition of the Hunt-Stein theorem. However this theorem holds for the subgroup (G {p\ T
+ p ) , T ). 2
A maximal invariant in the sample
2
space under this subgroup is Ri....,
R
Pl
+
P
I
such that
l
^R^NX'u
(SM + NXMX )- X , {i]
with Ri >0,Ri=
Rj, Ri+R
2
i = l,...,pi
[{l
= E^Li"
2
maximal invariant in the parametric space is 61,...,
R,
+ P
2
(5.20)
and the corresponding
3
6
Pl
+ P 2
such that
(5.21)
with Si > 0,6
P
= E j L i 6j, h + 6 = 2
5,. Under H
0
Si = 0 for all i and
under Hi Si — 0 , i — 1 , . . . , p i , fij = 6 = \ > 0. T h e nuisance parameter in this
100
Croup Invariance
in Statistical
Inference w
reduced set up is fj = ( r f i , . . . , Vp,+piY
' * h if, = ^ . Under H n - 0 and under 0
Hi, ifc = 0 i = l . . . - , p i , ifc > 0, « = pi + 1 Equation (5.19) in this case reduces to
- l + f,-!/o,„(r)
M
P i + Pa with E f ^ + i * =
L
+ £(r,n,A)
^ r j . + f/V-.j + l ) ^
2 (5.22)
as A —» 0 with B(r, JJ, A) = o(A) uniformly in r, n. T h e set £
2
= 0, 6i = 0 is a single point n = 0, so £O,A assigns measure
one to the single point ij = 0. convex p -dimensional
Since the set {Si = 0, 6
2
— A (fixed)} Is a
Euclidean set wherein each component jj; is o(/i(A)),
2
any probability measure
can be replaced by the degenerate measure £ £ n
which assigns measure one to the mean ( 0 , 0 , * 7 p , + i > - • • < p,+
P2
A
° f &,*)•
Hence from (5.22)
/ I
fx, (r)
£ux(dv)
n
fo.n(r)
=I
+ ^ - i + f i + "E *(l>+Cw-i+i>%J
+ B(r,A,7j)
= 1 + -
\iiAdn)
- l + f , + L
£ J = P 1 +
^ ( ^ i t f + t W - j '
V
,
>
+ l h n
J
/
+B(r,A)
J
(5.23)
where B{r, A) — o(/i(A)} uniformly in r . Let il* be the rejection region defined by Rk = {X : U(X)
= J * , + kR
2
> C }
(5.24)
Q
where fc is chosen such that (5.23) is reduced to yield (5.6) and the constant C
a
depends on the level a of the test for the chosen fc. Choose (jy-j-i)...(/V-
P
l
-p ) 3
7
(A -Pi) P2(iV - p , - P 2 + 1)
Some M i n i m a l Test in Muttinormales
j = pi + 1, . . . , p i + p n
_
2
101
- 1,
(^-Pi)
(
5
2
5
1
so that N - p ,
V M ^ - J + l l ^ ^—^-, J =
P i + 1,
- - ,pl + P 2 •
T h e test 0* with rejection region
JT = j * :
U(X) = % +
— a satisfies (5.6) as A —> 0.
with Pa,\{R*)
structure (5.24) must have k —
> C„|
Furthermore any region R& of
to satisfy (5.6) for some
Theorem 4.1.4, for any invariant region R*, Px (R*) Hotelling's T
2
From
depends only on A and
tV
hence from (5.5) q{\,r))
(5.26)
— 0 and thus tp* is L B I for this problem as A —* 0.
test based on Ri + R
2
does not coincide with 4>' and hence it
is locally worse. It is easy to verify that the power function of Hotelling's test which depends only on 03, 6\ being zero, has positive derivative everywhere, in particular, at 8
2
— 0. T h u s from (5.5), with R — R*, h(A) > 0. Hence we
get the following theorem. Theorem 5.1.4. For Problem 3 the L B I test 0* is locally minimax as A —* 0. R e m a r k s . Hotelling's T
2
test and the likelihood ratio test are also invari-
ant under (G , T ) and therefore thier power functions are functions of 6 only. 2
2
2
From the above theorem it follows that neither of these two tests maximises the derivative of the power function at S = 0. So Hotelling's T 2
2
test and the
likelihood ratio test are not locally minimax for this problem. E x a m p l e 5.1.5.
Consider Problem 6 of Chapter 4.
T h e affine group
( G , T), transforming ( X , S; p, S )
(gX + t, gSg'; ga + t,
where g € G, t € T leaves the problem of testing H^p
gBg') 2
— 0 again H\ : p
2
=
A > 0 (specified) invariant. However the subgroup ( G T ( P ) , T ) , where Gj-(p) is the multiplicative group of nonsingular lower triangular matrices of order p whose first column contains only zeros expect for the first element, satisfies Hunt-Stein conditions. T h e action of the translation group T is to reduce the
102
Group Invariance in Statistical Inference a
mean a to zero and S — E a = i X (X°)'
is sufficient where N has been reduced
by one from what it was originally. We treat the latter formulation considering 1
A T , . . . ,AT
W
to have zero mean and positive definite covariance matrix E and
consider only the subgroup Gx{p) sample space under Gx{p)
for invariance. A maximal invariant in the
is R=
_ S )[i]
(R , 5 ~
[l2
• - -, Rp)', where
2
(
S
2 ) [ i ]
( 2 ] ) [ i
] i = 2,..
.,p
(5.27)
5(H)
i=i
where C[i] denotes the upper left-hand corner submatrix of C of order i and tyi] denotes the i-vector consisting of the first i components of a vector 6 and
A
\ (21)
3(22)7
where 5(22) is a square matrix of order (p— 1). A s usual we assume that N
>p
which implies that 5 is positive definite with probability one. Hence Ri > 0, R
2
= U = ^2*- Rj
S
2
coefficient).
(R
2
is the squared sample multiple correlation
T h e corresponding maximal invariant in the parametric space is
A = (6 ,...,S Y, 2
1-
where
P
£
S
(5.28)
E( ) U
with 6, > 0, p.2 _= EV?P= 2 > 2
S
('2)['] (22)[i1 <31)[i]
5 > i-2 5
L
e
t
T h e joint distribution of J ? , . . . , R 2
p
is (see Giri and Kiefer, 1964) i
( l - X ) ^ ( l - T , ^ r i ) nHr-Dr(l(N-p+
x fi
I)) ( l + E
P =
2
l
N
-
-
l
)
rjiCl - A ) /
+ i)^ - r(l{N-i N
p
7 i
- 1])
+2))
+2]
J=2
ft,=D
/3 =D P
\j=2
r ( i ( j v - i + 2) +
ft;
4T- (l-A)/ J
7 j
(l+7r-
10
3=2 L
(5.29)
Some Minimax Test in Multinormates
where 7_,- = 1 - A J j ^ l f t , ttj = Sjfy.
103
means 0 if
T h e expression 1 / ( 1 + TT^-
6j - 0. From (5.29) we obtain
+ B ( r , A , » ) , (5.30) /
/o,o(r)
2
3=2
\ i>J
where £J(j", A,T?) — o ( A ) uniformly in r, 77. From Theorem 4.3.1 it follows that the assumption of Theorem 5.1.1 are satisfied with U = £ letting £ i j + 2)
- 1
P =
2
-^j = R
2
a
i
m
M-M = bX, b > 0. A s in example 5.1.3
give measure one to n" = ( ) ) § , . ... ,1(2} where jjj — (JV — j + l j ' ( W — -
| A
_ 1
( p - l ) J V ( i V - p + 1), j = 2 , . . . , p - l we get (5.6). Hence we prove
the following theorem. T h e o r e m 5 . 1 . 5 . F o r ever?/ p,N
and a the R? test which rejects H
large values of the squared sample multiple correlation minimax for testing HQ against Hi : p Remarks.
2
coefficient R
2
0
for
is locally
= A —> 0.
It is not a coincidence that (5.19) and (5.29) have the same
form. (5.18) involves the ratio of noncentral to central chisquare distribution while (5.29) involves similar ratio with random noncentrality parameter. T h e first order terms in the expressions which involve only mathematical expectations of these quantities correspond each other.
E x a m p l e 5.1.5.
Consider problems 7 and 8 of Sec. 4.3.
T h e group G
as defined there does not satisfy the condition of the Hunt-Stein theorem. However this theorem does apply to the solvable subgroup GT of p x p lower triangular matrices g of the form (9u 0 9 =
0
0 0
(5.31)
322 V 0
\
Spp/
92p From E x a m p l e 5.1.4 a maximal invariant in the sample space under GT is R — ( i ? , . . . , R ) ' and the corresponding maximal invariant in the parametric 2
p
space is A = (62,...,6 )'. P
From (5.29) R has a single distribution under
HIQ and H o and it has a distribution which depends continuously on the p i 2
dimension parameter Fix under Hi\ under H , 2X
where
and on the p-dimension parameter
r
2A
104
Group Invariance in Statistical
T
IX
Inference
= { A :tf<> 0, t = 2 , . . . , p i + l ,
Si=0,i
= pi
PI+I
+ 2,...,p,
= pf = A } , ;=2
r > = { A : tfi = 0, t = 2 , . . . , p i + l ,
6i>0,
2
+ 2,...,p,
g
ft
= 4
i — pi
= A>.
(5.32)
i= +2 P 1
Let
with
2
tfi/p , 0,
2
if p > 0 if p = 0 . 2
Because of the compactness of the reduced parameter space { 0 } and and the continuity of
r
fx,n( )
r
1A
in n we conclude from Wald (1950) that every
minimax test for the reduced problem in terms of the maximal invariant R is Bayes. Thus any test based on R with constant power on TJX is minimax for problem 7 if and only if it is Bayes. From (5.29) as A ~* 0 we get (for testing H
10
against
Hi\) PI+I
j=2
,i>j
+ B(r,A,ij)
(5.33)
where S ( r , A, 17) — o(A) uniformly in r and r>. It is obvious that the assumption of Theorem 5.1.1 is satisfied with PI+I
U=
J^Rj
= Ri,
h(\)
=
b\
j=2 2
where b > 0. T h e set p — 0 is a single point n = 0, so £rjj assigns measure one to the point T? — 0. T h e set Fix is a convex p\-dimensional component r>i — o(h(\)).
set wherein each
So for a Bayes solution any probability measure t\\,x
can be replaced by the degenerate measure £ * which assigns the probability A
0
0
one t o m e a n r i - (J? ,. . . , J 7 ° , , ) o f £ i * . C h o o s i n g £ * +1
one to the point whose j t h coordinate is
= (N-j
A
which assigns probability
+
l)~ (N-j+2)- p2 (N-
1
1
1
Some Mintmax Test in Multinormates
105
j = 2 , . . . , p i + l we get (5.6) from (5.33). T h e condition (5.5)
p + l){N-pi),
follows from Theorem 4.4.1. Hence we prove the following theorem. 7 the likelihood ratio test of Hit
T h e o r e m 5 . 1 . 5 . For Problem
H\\ as X —> 0.
(4.38), is locally minimax against the alternatives In problem 8 as X —> 0 (for testing H Q against 2
A,„(r)
=
.V A
l
/o,T,(r)
- i + n + Y,
r
defined in
i(S'K+(
H x) 2
i V
- i + H.. 2
1
+5(r,A,7i),
(5.34) where B ( r , A, JJ) — o(/i(A)) uniformly i n r, TJ. Since the set I ^ A is a p2-dimensional Euclidean set wherein each component r/, — o(h(X)),
using the same
argument as in problem 7 we can write
/ I
r
h,n( )^o,x(dv)
-1
+ f, +
YI >(£ r
w
+
(
w
-
3 +
2)%
j=pi+i
=
1 +
- i + f,+ 2 ^(Sf+c^-j'+iM
JVA
+ B(r,A) (5.35)
where B ( r , A ) « + 2
=
o(/i(A)) uniformly in r and £ I A assigns measures one
to
<)'•
L e t iifc be the rejection region, given by, Rk = {x:
U{x) = n + kf
2
> c}
(5.36)
a
where k is chosen such that (5.36) is reduced to yield (5.5) and c
Q
depends on
the level of significance a of the test for the chosen k. Now choosing
o
(N-p + 2)(N-p+l)
p
( N - p + 2)p
2
„
106
Group Invariance in Statistical
Inference
we conclude that the test
B* =
U(x) = ft + ^ — - ' 2
> c
a
j
with P O , A ( W * ) = Q satisfies (5.6) as A - » 0. Moreover any region R
k
form (5.36) must have k =
in order to satisfy (5.36) for some
of the From
(4.37) it is easy to conclude that the test 4>' is L B I for problem 8 as A —» 0. T h e 2
2
iE -test which rejects H20 whenever r
— f i +f
> c
2
a
does not coincide with 0*
and is locally worse. From Sec. 4.3 it is evident that the power function of R? test depends only on p 2
at p
2
and has positive derivatives everywhere in particular
= 0. From (5.5) with R — R*, h{\)
> 0. So we get the following theorem.
T h e o r e m 5.1.6. For Problem 8 the LBI test
N
l
f 1 + ~f
f
2
> c
a
where c
a
against the alternatives
whenever
depends on the size a of the test, is locally minimax as A —* 0.
R e m a r k s . In order for an invariant test of H n against H x to be minimax, 2
it has to be minimax among all invariant tests also.
2
However, since for an 2
invariant test the power function is constant on each contour p
= p\ = A,
"minimax" simply means "most powerful". T h e rejection region of the most powerful test is obtained from (5.34), from which it is easy to conclude that [/A,7j('")//o,n(r)] depends non-trivially on A so that no test can be most powerful for every value of a. 5.2. A s y m p t o t i c a l l y M i n i m a x T e s t s We treat here the setting of (5.1) when A —> 00. o(H(\))
Expressions like o ( l ) ,
are to be interpreted as A - * 0 0 . We shall be concerned here in max-
imizing a probability of error which tends to zero. T h e reader, familiar with the large sample theory, may recall that in this setting it is difficult to compare directly approximations to such small probabilities for different families of tests and one instead compares their logarithms. While our considerations are asymptotic in a sense not involving sample sizes, we encounter the same difficulty which accounts for the form (5.39) below. Assume that the region R={x:U(x)>c } a
satisfies in place of (5.5)
(5.37)
Some M i n i m a l Test in Mu Kin urinates
= = 1 - e x p { - i r ( A ) (1 + o ( l ) ) }
P (R) x>n
107
(5.38)
where # ( A ) — i co as A —» co and o ( l ) term is uniform in TJ. Suppose that p(x,A,7j)^ (d7i) liA
= e x p { f f ( A ) [ G ( A ) + R(\)U{x)}
/
+ B(af,A)}
(5.39)
p{x,0,7j)^ , (dj;) 0
A
where sup^ | B ( a : , A ) | = o{H(X)),
and 0 < c j < R(X) < c
regularity assumption is that c
2
< oo.
One other
is a point of increase from the left of the
Q
distribution function of U when 6 — 0 uniformly in n, i.e. infPo„(E/ > c
a
- e ) > a
(5.40)
for every e > 0. T h e o r e m 5 . 2 . 1 . / / U satisfies (5.38)-(5.40) and for sufficiently there exist £ , A and £ i 0
( A
totically logarithmically
large A
satisfying
(5.39), then the rejection region R is asymp-
minimax
of level a for testing Ho -6 = 0 against the
alternatives Hi : 6 — A as A - » oo, i.e., l
i
inf,[-log(l-P ,,{*})]
m
A
A - » sup^
A
e
Q
=
1
i n f , [ - log(l - P , , ( 0 A rejects /Jo))]
a
a
P r o o f . Assume that (5.41) does not hold. T h e n there exists an e > 0 and an unbounded sequence V of values A with corresponding tests d>\ in Q
whose
a
critical region satisfies P , {R}
> 1 - e x p { - # (A) (1 + 5e)}
X V
(5.42)
for all IJ. There are two possible cases, (5.43) and (5.46). If A € T and - 1 - ( 7 ( A ) < R(X)
c
a
+ 2e,
consider the a priori distribution £A (see Theorem 5.1.1) given by
(5.43) and r
A
satisfying r /(l A
n
) = e x p { i f ( A ) (1 + 4e)} .
Using (5.42) and (5.44) the integrated risk of any Bayes procedure £f
(5.44) A
must
satisfy rl(Bx)
< rl(M
< (1 - r )a
+ r e x p { - J / ( A ) (1 + 5e)}
-
+ exp{-e H(A)}].
x
(1 -
)\a
Tx
A
(5.45)
108
Group Invariance in Statistical
Inference
From (5.39) a Bayes critical region is B
= {x : U(x) + B(x,X)/R(X)H(X)
x
> [-(1 + 4e) - G{X)]/R(X)}
.
Hence, if X is so large that B{x,X) sup
H(X)R(X)
we conclude from (5.43) that B D{x: x
U(x) > c
a
-
e/c } =
T h e assumption (5.40) implies that Po {B' } !V
B' (sa.y).
2
x
> a+t! with e' > 0, contradicting
x
(5.45) for large A. On the other hand if A e V and -l-G(A)>R C A
consider the a priori distribution
Q
+2e,
(5.46)
given by £* j , and T
a
satisfying
r / ( l - r ) = exp{>7(A)(l + e)}. A
(5.47)
A
Using (5.39) a Bayes critical region is B
A
= {x : U(x) + B(x, X)/R(X)H(X)
> [-(1 + e) - G(X))/R(X)}
Thus, if s u p I R(X^BIX) I */2c2 we conclude from (5.46) that B\ that, by (5.37) and (5.47), T
1
r*(Bx)
> nexp{-H(X)[l
+
. C r7, so
o(l)}}
-(l-r )exp{>Y(A)(c- (l))}. A
0
(5.48)
Since r*(B ) x
< r*(4>x) < (1 - T ) Q + r e x p { - r T ( A ) ( l + 5e)} A
x
= ( l - r ) [ o + exp{-4e A
H(X)}],
we contradict (5.48) for sufficiently large A.
•
2
E x a m p l e 5.2.1. ( T - t e s t ) In Problem 1 of Chapter 4 let U(X)
= £ J % .
Since 0 ( a , 6 ; x ) = e x p { x ( l + o ( l ) ) } as x -* co we get, using (5.18),
+ J= l
i>j
J
(5.49)
Some M i n i m a l Test in Multinormo/es
with Sup
\B(r,T), A)I — o ( l ) as A —• 0 0 . From (4.5) putting n
p
109
= 1, 7)i — 0,
i < p, in (5.49) the density of U being independent of n we see that P „{U
< c}
X
as A —• oo. £i
a
= exp{(
C a
Hence (5.38) is satisfied with H(X)
assign measure one to the point Vi =
i A
- 1)[1 + o ( l ) ] } = | ( 1 — c ). a
" — Vp-i
assign measure one to (0,0) we get (5.39).
(5.50) Now letting A
— 0 Vp — 1
N
A
5
Finally (5.40) is trivial.
£O,A From
Theorem 5.2.1 we get Theorem 5.2.2. T h e o r e m 5.2.2. For every p,N,a,
Hotelling's
7
T
test is
asymptotically
minimax for testing HQ .6 — 0 against H\ : 6 — A as A —> co. From Sec. (5.1) we conclude that no critical region of the form £ ai R{ > c other than Hotelling's would have been locally minimax, many regions of this form are asymptotically minimax. T h e o r e m 5 . 2 . 3 . c < 1 and 1 — a
x
< a
< •• • < a ,
2
then the critical
p
region
{ £ tti Ri > c } is asymptotically minimax among tests of same size as A —> 0 0 . P r o o f . T h e maximum of £ V f j 2~2i>j Vi subject to £ at r\
— c, T2 — • • • — r
— 0.
p
;
Ojr< < c is achieved
Hence the integration />,,()") over a small
region near that point yields (5.50) with c„ replaced by c.
Since a ,'s are }
nondecreasing in j it follows from (5.49) that £I,A can be chosen to yield (5.39) with U — Yl{aiRi.
Again (5.40) is trivial.
E x a m p l e 5.2.2.
Consider again Problem 2 of Chapter 4. From E x a m -
ple 5.2.1 with p = pi, R = (R ..., u
=
e x
R )', Pl
r, = (n,,...
-^E^E*
P S~
with sup,.^ \B(r,\,n)\
•
,n )' Pl
we get
(5.51)
( l + B(r,A,>)))
— o ( l ) as A -> 00. Using the same argument of the
above theorem we now prove the following. T h e o r e m 5.2.3. For every p i , N, a the UMPI imax for testing HQ : 61 — 0 against Hi :6 = \as\—* E x a m p l e 5.2.3.
test is asymptotically
min-
00.
In Problem 3 of Chapter 4 we want to find the asymp-
totically minimax test of H
Q
: 6~i — 62 — 0 against the alternatives H
x
: 61 — 0,
110
Croup Invariance in Statistical
62 = A as A —» co.
Inference
In the notations of Examples 5.1.4 and 5.2.1 we get,
i - E 1 J > i +*a = E V v ni = 0, i = 1 , . . . , p i + P 2 under H 1
r
r
f
+
P
2
T
T
= (n, -,Tn+toY* v = ( * h , • • • . * , + » ) ' . and rj, = 0, i = 1 , . . , , p i under H\. Hence
0
PI+PJ r
- l + fi + £
/o,n(r)
(5.52)
i£»K
= o ( l ) as A - t 00. From (4.9) when 5a = A
where sup,.,, \B(r,A,n)|
00,
61 = 0 we get /(r,,f |A) 2
= exp
/Cl.falO) where s u p , . ^ \B(f ,f ,\)\ t
j
^[-l
+ f i + f ] ( l + B(fi , f . A ) ) j
= o ( l ) as A -* 0.
2
(5.53)
2
2
Letting fa in (5.39) assign
measure one to the single point n — 0 we get from (5.52)
/
A,,(rKi.x(dij)
- l + f,+
r, 2 *
£ >=Pi + l
.-?.A))|
i>j
(5.54) Let Rk be the rejection region of size a , based on R\,R R
k
= {x : U(x) = R +kR > t
2
given by
c}
2
(5.55)
a
where k is chosen such that (5.54) is reduced to yield (5.39) and R chosen k satisfies (5.38) and (5.40). Now letting 1/1 = • • • = T ) -7
P l + P )
k
P l + p l
for the _i
= 0,
= 1 we see that (5.54) is reduced to yield (5.39) with U{x) = Ri +
R. 2
From (5.53) P(U(x)
exp j
Hence Hotelling's test which rejects fi
fa 0
- 1)(1 + o ( l ) ) j .
(5.56)
for large values of R\ + R
2
satisfies
(5.38) with H(X) = f (1 - c' ). T h e fact that Hotelling's test satisfies (3.39) is a
trivial. Since the coefficient of r
p i +
i in the expression inside the brackets in the
exponent of (5.54) is one, any rejection region R
k
must have k = 1 to satisfy
(5.39) for some £ \ \ . From Theorem 5.2.1 a region R t
k
with k /
1 and which
Some Minimax Teat in Multinormales
111
satisfies (5.38) and (5.40) cannot be minimax as A —• co. Using Theorem 4.1.5 we now prove the following:
T h e o r e m 5.2.4. Hotelling's
test which is asymptotically
best invariant
minimax for testing HQ I fli = &2 = 0 against H± : 6
asymptotically
T
is
— 0,
oj = A as A —* oo.
R e m a r k s . From Theorem 5.2.3 it is obvious that there are other asymptotically minimax tests, not of the form Rt, for this problem. It is easy to see that P {R Xin
l
1
+
P a
A
'" (
ft
~/l*fi *' 5
1) = l - e x { - ( l o g A - l o g ( ( l - c ) ( l + c ( l ) ) ) } ,
2
i
+
Thus the fact 1
2
(l-c)- R >
c
1
P
^ r
1
*
2
-
C
o
)
=1
e
~
x
p
{ ~
~
C
o
)
(
1
+
0
(
1
)
)
that the likelihood ratio test which rejects HQ whenever
and the locally minimax test satisfy (5.40) and (5.38) is obvious
in both cases. B u t from Theorem 5.2.4 these two tests are not assymptotically minimax for this problem. 5.3. M i n i m a x T e s t s 2
In this section we discuss the genuine minimax character of Hotelling's T
and the test based on the square of the sample multiple correlation coefficient 2
R.
These problems have remained elusive even in the simplest case of p — 2 or 2
3 dimensions. In the case of Hotelling's T character of T
2
test Semika (1940) proved the U M P
test among all level a tests with power functions depending 1
only on & — Np'H~ p.
2
In 1956 Stein proved the admissibility character of T
test by a method which could not be used to prove the admissibility character 2
of R - t e s t . G i r i , Kiefer and Stein (1964a) attacked the problem of minimax 2
property of Hotelling's T - t e s t among all level a tests and proved its minimax propery for the very special case of p — 2, N — 3 by solving a Fredholm integral equation of first kind which is transformed into an "overdetermined" linear differential equation of first order. Linnik, Pliss and Salaeveski (1969) extended this result to N — 4 and p — 2 using a slightly more complicated argument to construct an overdetermined boundary problem with linear differential operator. Later Salaveski (1969) extended this result to the general case of p = 2.
112
Group /nvuriaiice in Statistical
5.3.1. Hotelling's
T
2
Inference
test
In the setting of Examples 5.1.1 and 5.2.1 we give first the details of the proof of the minimax property of Hotelling's T
2
test as developed by
G i r i , Kiefer and Stein (1964a), then give the method of proof by Lunnik, Pliss and Salaeveski (1966). In this setting a maximal invariant under G j f p ) is R =
(R ,... R )' 1
}
with
P
= NX'(S
+ NXXy^X
and the corre-
sponding maximal invariant in the parametric space is A = (6i,..., £
p
St = N p ' £
- 1
S )' with p
u = S. From (5.18) R has a single distribution under H
0
and
its distribution under Hi : 6 = X (fixed) depends continuously on a p - 1 dimensional parameter T = | A
= (6
6 )':
1
P
Thus there is no U M P I test under Gr(p) / (r) A
as the pdf of
«5,>0,
=
as it was under Gt{p).
Let us write
as given in (5.18). Because of the compactness of the r
reduced parametric spaces { 0 } and T and the continuity of f&( )
i n
A we
conclude from Wald (1950) that every minimax test for the reduced problem in terms of R is Bayes.
In particular Hotelling's T
2
test which rejects He,
whenever U = £ J R\ > c, which is also G r - i n variant and has a constant power 1
function on each contour Np'T,~ p H\
= X, maximizes the minimum power over
if and only if there is a probability measure £ on T such that for some
constant Ci
'r / o t »
\m
1 = ]
*t
(5.57)
<
according as = > c
1
K< J
except possibly for a set of measure zero where c depends on the specified level of significance a and c i may depend on c and the specified X. A n examination of the integrand of (5.57) allows us to replace it by its equivalent
r
r Jo\ )
« d A ) = c,
if
Y > - e . ,
(5.58)
Obviously (5.57) implies (5.58). O n the other hand, if there are a £ and a constant C[ for which (5.58) is satisfied and if r* = ( r j , . . . , r ' ) ' is such that £ i r j = c' > c, then writing / = ^
and r"
= ft*
we conclude that
Some Minimax
Test in Multinormales
p
because of the form of / and the fact that c'/c > 1 and £
r
r** = f E i t
113
=
c
-
This and similar argument for the case c' < c show that (5.58) implies (5.57). Of course we do not assert that the left-hand side of (5.58) still depends only on 2%n if £
p
n / c.
The computation is somewhat simplified by the fact that for fixed c and A we can at this point compute the unique value of c i for which (5.58) can possibly be satisfied. Let R = (Ri,...,
Rp-i}'
and let f&{f/u)
be the conditional pdf of R given
p
£ = i Ri — it with respect to the Lebesgue measure, which is continuous in f and u with U = E i
1 r
> 0 and
« < it < 1.
Denote by / " ( u ) the pdf of
which is continuous for 0 < u < 1 and vanishes elsewhere and
which depends on A only through 8. T h e n (5.58) can be written as
j /AtfMtfdA) =
fo(f\c)
(5.59)
1
/;*<<=)
if Ti > 0, E i
_
1
T
i
<
c
an
- The
fi ^
integral in (5.59), being a probability
mixture of probability densities is itself a probability density in f as is
fo(f\c).
Hence the expression inside square brakets of (5.59) is equal to one. From (4.5)
r(f) .
C
r( V )
^ )
(
a
_
c
i ^ -
)
P
- ,
2
0
' ^ , E
;
| ) .
{ 5
.
6 0 )
Using (5.60) we rewrite (5.58) as JV-i + 1 1 r$-\*,. * A
J r
I
1
i>3
2
> >=1
'2' 2 i
M
'
, / J V /j
cA
'12'!'
2 (5.61)
r
c
for all r with r j > 0, E i i = - Write T i for the unit (p - l)-simplex
r! = {(A,...,/3 ) ft>o, X > = i } p
:
114
Group Invariance in Statistical
Inference
where /3; — 6i/X. Let U —
7 — cA and £* be the probability measure on
T i associated with £ on T. Since
J r
>
I )=1
i>j
i
— t]{6A), (5.61) reduces to
v
fe=l
'
= * ( f , f ; i , )
( - >
for all ( i i , . . . , f ) with (h
a
'i = 7
p
i
m
i i > 0 and hence, by analyticity for all
t ) with g j f f t = 7P
From (5.62) it is evident that such £*, if it exists, depends on c and A only through their product 7 . Note that for p — 1, T i is a single point but the dependence on 7 in other cases is genuine. C a s e p = 2, N = 3. S o l u t i o n of G i r i , K i e f e r a n d S t e i n Since 0 ( | , —
(1 + x)exp(^x),
jf [i + ( 7 - -
from (5.62) we obtain
ft)]
= ^ " ^ ( f , i ; |(5.63)
with fi = 7 - f , A = 1 - / 3 . We could now try to solve (5.63) for by using . the theory of Meijer transform with kernal ^ ( 1 , 4121;. 1hx). Instead we expand both 2' sides of (5.63) as power series in f . Let 2
2
2
Jo
be the ith moment of /3 . From (5.63) we obtain 2
(a) (b)
l+t-*m=B, -(2r -
1) p _ ! + r
(2r + 7 W
-
7^+1 = B
T(--)r(i)J
v
y
where B = e~^ d>(^, 1; £ 7 ) . We could now try to show that the sequence { p , } given by (5.64) satisfies the classical necessary and sufficient condition for it to be the moment sequence of a probability measure on [0,1] or, equivalently, that the Laplace transform GO
3
=0
Some Minimal
Test in Muttinormales
115
is completely monotone on [0, oo), but we have been unable to proceed successfully in this way
Instead, we shall obtain a function m (x)
which we
1
prove below to be the Lebesgue density dt]*{x)jdx
of an absolutely continuous
probability measure f* satisfying (5.64) and hence (5.63). T h e generating function CO
of the sequence {fa}
satisfies a differential equation which is obtained by mul-
tiplying (5.64) (b) by f
_ 1
and summing from 1 to oo:
2
2
2t (l-t)V '(t)-(f -7t + J
= B t [ ( l - t r
1
= S ( ( l - t ) "
1
/
2
/
7
M t )
- l ] + 7 [ t ( l - / 0 - l ] 2
- ( - 7 .
(5.65)
T h i s is solved by treatment of the corresponding homogeneous equation and by variation of parameter to yield f J
m n
'
(l-i)
l
/
a
j
0
M
2
zl [3T#-Ift*
2
i 1
2T (l-T) /2
B
]
d
2T(1-T)J
T
'
(5.66) the integration being understood to start from the origin along the negative real axis of the complex plane. T h e constant of integration has been chosen to make VJ continuous at 0 with VJ(0) = I , and (5.66) defines a single-valued function on the complex plane minus a cut along the real axis from 1 to oo. The analyticity of ip on this region can easily be demonstrated by considering the integral of ip on a closed curve about 0 avoiding 0 and the cut, making the inversion w = | , shrinking the path down to the cut 0 < w < 1 and using (5.67) below. Now, if there existed an absolutely continuous £* whose suitably regular derivative m
7
satisfied
j m ( a ; ) / ( l - tx)dx = i>(t), Jo
(5.67)
7
we could obtain m
y
m^(x)
by using the inversion formula = (2irix)
_ 1
limty(x
- 1
+ it) - T / > ( X
-1
- ie)].
(5.68)
116
Group Invariance in Statistical
Inference
However, there is nothing in the theory of the Stieltjes transform which tells us that an m (x)
satisfying (5.68) does satisfy (5.67) and, hence (5.63), so we use
y
(5.68) as a formal device to obtain an m
7
which we shall then prove satisfied
(5.63). From (5.66) and (5.68) we obtain, for 0 < x < 1,
m-,(x) —
B 2
2rr x^ (l
1 2
- x) '
\ J
2
v>l
| l + u
0
+ B 3 2
(1 +
u) '
Ja
1- «
J (5.69)
In order to prove that a%*{x) — m ( x ) dx satisfies (5.63) with £" a probability 7
measure we must prove that
(a)
m ( i ) > 0 for almost all x, 0 <x
(b) f m {x)dx 0
(c)
<1,
T
1
(d) p
r
= l
1
fii = —j
xm-f(x) T
0
x m {x) 1
(5.70) dx satisfies (5.64) (a) dx satisfies (5.64) (b) for all r > 1.
3
2
T h e first condition follows from (5.69) and the fact that B > 1 and u / ( l + 3
2
u) < (1 + w) -' for u > 0. T h e condition (d) follows from the fact that m ( x ) 7
as defined in (5.69) satisfies the differential equation
w! (x) + m ( i ) | 7
+ (l-2x)/2x(l-x)
1/2
3
2
= B/2irx (l-x) < ,
(5.71)
so that an integration by parts yields, for T > 1,
( r + l K - r ^
r
_ , =
/ [(r + l ) x Jo
r
r
l
- rx - \m (x) y
= / Jo = u (l T
dx
dx -
7
/ 2 ) + 7 ii /2 T+l
-
«r_i/2
+ B r ^ + ^/2Tr^r! which is (5.64) (b). T o prove (5.70) (b) and (e) we need the following identities involving confluent hypergeometric function
T h e materials presented here can
Some Minimax
Test in Multinormates
117
be found, for example, in Erdelyi (1953), Chapter 6 or Stater (1960).
The
confluent hypergeometric function has the following integral representation
0 K 6; x) =
T(a + j)T(b)/r{a)T(b
+ j) j \ JS>
5=0
when 6 > a > 0. T h e associated solution ip to the hypergeometric equation has the representation
if a > 0. We shall use the fact the general definition of ip, as used in what follows when a — 0, satisfies iP(0,b;x)
= 1.
(5.74)
T h e functions T/J and satisfy the following differential properties, identities and integrals.
~(a,b;x)=
(j-l)b+l;x)
+ 0(o,6;i),
(5.75)
^ / > ( a , 6; x ) - $ ( a , 6; x) - ^ ( a , 6 + 1; x),
(5.76)
^(a,b;x) = i - ' o K a - H l W a + l . i ; ! ) - ^ , ! ; ! ) ] , (a - b + l ) 0 ( o , 6; x) - a 0 ( a + 1,6; x) + (b - l ) 0 ( a , b~l;x)
= 0,
6 0 ( a , 6 ; i ) - bd>(a - 1 , 6 ; a ; ) - x 0 ( o , 6 + l ; i ) = 0 , ih(a, 6; x) - ai/>(a + 1, b; x) - ip{a, b - 1; x) = 0, ( 6 - a K > ( a , 6 ; x ) -xih(a,b
r
Jo
x
e~ (x
l
+ l;ar) + ^ - ( o - l , f c ; « ) = 0 ,
+ y)- [ - , l ; i
r,l;y
(5.77) (5.78) (5.79) (5.80) (5.81)
for y > 0 , (5.82)
118
Group Invariance
J
in Statistical
Inference
e - * ( s + y)-vQ,l;z)
g
(5.83)
Using (5.73) , in terms of hypergeometric functions, for 0 < x < 1, the m , 7
defined in (5.69), can be written as
=
h[,'^./.{^ / '(^,1,7/2 j
° r r ? r 2TT[X(1 E/2
jf
l
v- e-<"t
2
dv-v^e^^dv
J
„ . ( e " ^ f i i : T / 2 ) ^ i , i ; 7 ( i - » ) / 2 ) ( \2 /
r(|)v-(|,l;7/2)}.
(5.84)
We now prove (5.70) (b) to establish that m-,(x) is an honest probability density function. From (5.73), (5.72) and (5.82) we get
/
VTTt—^*(i.i;7(i-»)/?)#
Jo
2ir x f l - i ) J =
f Jo
VT7T~ r ^ ( l , 1 ; 7 * / 2 ) tte 2ir[x(l - l ) ] a
7o
2 , r [ * ( l - * ) ] » 7o
v
dt 1
J
(5.85)
Some M i n i m a l Test in Mull in orm ales
119
From (5.85), (5.84) and (5.72) we get 4 e„i
fr i
r(|)
h
z
dx
= 2 ^ , l ; ^ ( i , l ; , )
- 0 ( I , l ; ^ ( ^ , l ;
S
where 2z = 7 . We now show that tf'(z)-#(*)=0,
(5.86)
from which it follows that H{z)=ce
l
for some constant c. B y direct evaluation in terms of elementary integrals when 7 — 0 we get (using (5.85)) / m (x) dx = 1; Jo hence, (5.70) (b) follows from (5.S6). T o prove (5.86), we use (5.75) and (5.76), a
which yield H\z)-H(z)
=
+ # / | , l ; » \ ^ Q
>
1
i
5
^
- 2 Ki i; ^G' 2;s ) + ^G' 2; ^G ,M ) 1
1
T o this expression we now add the following four left-hand side expressions, each of which equals zero i> U .
1
1 ~ n H ^2; 2
; *
(N
2* U
^
2
2- ^
T
2 \2
+ 0
-,l;z
1 1 . /3 -,2;* - - t f -,2;* -tf -,l;z 2"V 2 1 . (Z
-01
1 . ^3
- -
T
2 \2
to obtain H' — H — 0, as desired.
120
Group Invariance in Statistical Inference We now verify (5.70) (c). Prom (5.72) and (5.79), with a = §, b = 1 we
obtain 1
(1+7?/) e™'
2
dy = M -,l; /2 7
+70
1
-,2;7/2
/4 (5.88)
Using (5.70) (6), which we have just proved, and (5.85) and (5.88) we rewrite (5.70) (c) as 1-1 AI
jf u + 7(1 - $}] M * ) *
|, mt/aJJ
1-
(l + 7 » ) 2T[»(1- )]I/2
1 ; 7
/
W
2
M
, ?
V
- e ^ r 1 7
U
11; /2
g ^(l,l;7g/2) 2^(1 -
,
y
0(|,l; /2)
I
7
f Jo +
1
7 ^ ( 1 , 1 ; 7J//2) 1
2T[J/(1 — J f ) ] '
l
,
f 7
+
f
r ( | ) f ( f . 1,7/2)
=
)\dy
7
0
1
0(1,1; Tg/g) 2^^(1-2/)]!^
(l + 73/)e^
»
2
2ff[v(l - J/)]V2
dy
2
A
l„/3 1_ 2 r ( - ] ^ ( - , l ; / 2 ) - -2 r [V- 2] ^ [ - , l ; 7 / 2
(5.89)
7
Using (5.81), with a = 1, 6 = 0 and (5.72), (5.73), (5.83) we get f
1
/„
7g^(l,l;7»/2)
, _
2 ^ ( 1 - , ) ] ^ ^ - y
1
f
M0,0;7y/2)-»(l,O;7y/2)]
, d
0
»
1
= 1 -
f _ dy Jo * [3/(1 - » ) ] ^ Jo a
= l - j T ( l + = 1+ r
r V Q ^ 7 * / 2 ) e - ^
(|) ^(I^^A-^d-HT/s)
/2. (5.90)
Some Afinimai Test in Mv.lttnorma.les
121
T h u s (5.89) and (5.90) imply (5.31) and, hence (5.70) ( c ) . C a s e p = 2, N = 4. S o l u t i o n o f L i n n i k , P l i s s a n d S a l a e v s k i From (5.62) the problem consists of showing (with 0i — /J) on the interval 0 < 8 < 1 there exists a probability density function £{/3) for which
= 0(2,l; /2).
(5.91)
7
W i t h a view to solving this problem we now consider the equation | i (7 - t l ) ( l -
J
j *
e
=
"<'-«/ ^2, | ; M / 2 ^ { |
l
8)/2)mW
4l<\M^-0)/2jmdf3.
(5.92)
Using the fact that
we rewrite (5.92) as
16-^0(2,1; -
/
1
- ^ / 2
e
[
1
7 -10
^ ) [ 1 +
+
7
_
7
^
K
(
A
)
(
f
- d d -
/
/3)K(W
J
(
5
9
3
)
Jo From (5.93) we get
2
2
l
/" e - ^ [ ( r - 2r W~ Jo - (7 + irW }t:(0)d0 +1
+ (1 + 7 + - 0,
7
r +
2
r
2r )8
for r = 1 , 2 , . . . .
(5.94)
Let
r=l
be an arbitrary function represented by a series with radius of convergence greater than one. Multiplying the r t h equation in (5.94) by o _ i and taking r
the sum over r from 1 to 00 we get
122
Group Invariance in Statistical
Inference
f1 3
2
2
(2/3 - 2 / 3 ) / " + (-5/3 + 60
s:
! - l +
=
o 0 - ^ + f 0
J LUMP) Jo
2
- f&
2
3
+ /3 )/' 7
) ,
d0 (5.95)
= 0 where L{f)
denotes the differential form inside the square brakets. Applying
the green formula to L(f)
and choosing a small e > 0 we get (5.96)
where the adjoint form L*(£) is given by 2
L * ( 0 = 20 (0
- 1)C + 0{-Z
+ 60 + 7/3 - 7 / 3 K ' 2
(5.97) and the bilinear form L[f, £] is given by
m% = ^ ( ! - ^ ) ( ^ - / ' o - \0+7^ (i - mi n • 2
2
2
(5.98)
Using the thoery of Frobenius (see Ince, 1926} we conclude that CC
CO
(5.99) £=0 with tto / 0, Co / 0; are two fundamental solution of L*(Q — 0 on the interval (0,1)- Similarly fc
£ n = £ 96(1 - /J) ,
jia - (1 - /3)"5 £ fefcd - 0)
1
k=0
(5.100)
k=0
with 3o ^ 0, fto # 0 are also two fundamental solutions of the same equation on the interval (0,1).
Consequently any arbitrary solution of the equation
L*(£) — 0 is integrable on [0,1] and we assert that there also exists a solution of the equation L*(£) — 0 for which h m L [ M ] | ; - < = 0.
(5.101)
Some Minimax
Test in Mullinormales
123
It can also be shown that any linear combination of the solutions in (5.99) (similarly for solutions in (5.100)) satisfies (5.101) provided that £(•) = £ i ( - ) . 2
T h u s £ - £12 is a solution of (5.95) and hence of (5.91). I n the following paragraph we show that £ i does not vanish in the interval 2
(0,1). Write 0 = 1-*,
*(ar)=A5*yf & i ( l - 4 -
(5.102)
The equation L"(t\) — 0 becomes 2
2x{\ - x)8" + ( - 1 + 4x--,x Substituting u(x) = 6'(x)j8(x)
+ -yx )S' + ^(1 + jx)$ z
= 0.
(5.103)
we obtain the Riccati equation
l-4s
+
7
a
-
7
s ' _
3+^7*
2x{l-x)
v
4x(l-x)
1
Let us recall that 8(x) — E t ^ o ^ ) * ' • > 1*1 < !• Assume that 9{x)
'
vanishes
at some point in the interval (0,1). Since 0(0) = 1, among the roots of the equation ${x) = 0 there will be a least root. L e t us denote it by XQ. the function 0(x) £?'(0) —
is analytic on the circle \x\ < x$.
| which implies that u(0) = — | .
Then
From (5.103) we get
Hence u(x) cannot vanish in the
interval (0, Xrj), since at any point where it vanishes we would necessarily have u'(x) > 0 whereas (5.104) shows that at this point u'(x)
< 0. T h e fact that
the function u(x) has a negative value implies that it decreases unboundedly as we approach the point XQ from the left (u' / contradicts (5.104). A s a result £ = £
i 2
0 since
which in turn
does not vanish in the interval (0,1).
L e t us now return to (5.91), which, as we have seen, is satisfied by £ — £
1 2
.
B y the method of Laplace transforms it is easy to obtain the relation
b l
/ x - {l Jo
-xy-^ei
r(t)r(c - b) r(c)
*tt>(a,b;px)
-4>(a,c;p + q),
x))
0
(5.105)
Letting t\ = 7 T , and multiplying both sides of (5.91) by r
5 (1 — r )
' and
integrating with respect to r from 0 to 1 we obtain with a = 2, 6 = | , c = 1,
p = 7£72,? =
7(l-W2,
| y 2 , l ; ^ W
=p ( H ; ^ )
?
( W .
(5.106)
124
Group Invariance in Statistical
Inference
Now to obtain (5.91) it is sufficient to make use of the homogeneity of the above relation and to normalize the function £12. 5.3.2.
R?-test
Consider the setting of Example 5.1.5 for the problem of testing H against the alternatives Hi
: p
2
=
A > 0.
alternatives p
2
> 0 the R -test
:p
2
—0
We have shown in Chapter 4
that among all tests based on sufficient statistic (X,S) 2
0
of HQ against the
is U M P invariant under the affine group ( G , T )
of transformations ( 9 , t ) , g e G , t e T operating as (X,S)
—• (gX + t,
gSg')
where g is a nousingular matrix of order p of the form a
=
0
(%m
with 3 ( H ) of order 1 and t is a p-vector. For p > 2 this result does not imply the minimax property of
2
R -test
as the group ( G , T) does not satisfy the condition of the Hunt-Stein theorem.
As discussed in Example 5.1.5 we assume the mean to be zero and
consider only the subgroup G r ( p ) for invariance. T h i s subgroup satisfies the conditions of the Hunt-Stein theorem. is R =
(Rg,...
,Rp)'
A maximal invariant under GT-(P)
with a single distribution under Ho but with a dis-
tribution which depends continuously on a (p — 2)-dimensional parameter A
— (6 , • • • ,S Y 2
P
r
function f\, { ) 7)
with £ J
= A
6j
—p
2
— A under St-
T h e Lebesgue density
°^ ^ under Hi is given in (5.29). Because of the compactness
of the reduced parameter spaces {0} and
r =
6 y,
6i>o,
p
r
and the continuity of fx, { ) v
m
£ 4 = A]
(5.107)
1" '* follows that every minimax test for the 2
reduced problem in terms of R is Bayes. In particular, the Jf -test which has 2
a constant power on the contour p
= A and which is also G (p) T
invariant,
maximizes the minimum power over H i if and only if there is a probability measure on Y such that for some constant cj
according as
Some Minimax
Test in Multinormales
125
except possibly for a set of measure zero. A n examination of the integrand in (5.108} allows us to replace it by the equivalent
/ |McktfA3««i Obviously (5.108) implies (5.109).
i f X > , = c-
(5-109)
O n the other hand if there is a £ and a
constant c\ for which (5.109) is satisfied and if f = (r~2>---.*p)' f
Y% i
7
= c > c, then writing / =
and r' = cf/c'
/(f) = / ( c V / c ) > / ( r - ) =
6
a
in (5.29) 7 " ( l — A) — 1 — - J2j>i jfai
n
d
t
h
a
t
i
u
c
n
t
n
a
t
we conclude that
C l
=
because of the form of / and the fact that c'/c > 1 and ^ j j r * 1
s
ti > °-
T
n
i
c
s
- Note that ^
similar
argument for the case c' < c show that (5.108) implies (5.109). T h e remaining computations in this section is somewhat simplified by the fact that for fixed c and A we can at this point compute the unique value of C\ for which (5.109) can possibly be satisfied. L e t R = (R2,...,
Rp-i)'
and write /x,Tj(r|i/) for the version of conditional
Lebesgue density of R given that E 2 Ri = U = u which is continuous in f and _
for Ti > 0 and E i
1
r« < w < 1 and zero elsewhere. Also write
Lebesgue density of R
(u) for the
— E 2 Ri = U which is continuous for 0 < u < 1 and
2
vanishes elsewhere and depends on A only through S - £ 2 ^ , - T h e n (5.109) can be written as
/ for fi > 0 and E S P
1
r
/ ,,(r|c)£(dA) A
ci
/c*(0
75(c) J
c
> < - T h e integral in (5.110), being a probability mixture
of probability densities, is itself a probability density in f, as is / o o ( r | c ) . Hence the expression in square brackets equals one. From (4.35) with 0 < c < 1
=
M
)
q - A ) ^ - »
r(^-i»
, _ ( p
_
3 )
r((jv- )/2)r((p-i)/2)
1
P
\(N-l),
1(JV-1);±(P-1)
; C
i { N
_ _ p
2 )
' A)
where F(a, 6; c; x ) is the ordinary 2^1 hypergeometric series given by
(5.111)
126
Group Invariance in Statistical
F(a,b;c;x)
Inference
= 2^ - -—* r(«)r(6)r(c + r) r=0
E
(g)
.
where we write ( a )
r
(fc) _
r
r
(c)
r)/T(a).
= T(a +
r
(5.112)
r
From (5.110) the value of
a
which
satisfies (5.109) is given by (i-A)tt"-'>r
l
"
,,„_ ,
.
3
1
r(|(/v- ))r(i(p-i))
(
N
- _„ P
J
P
FQ(JV
- 1), i ( J V - 1); | ( p -
1);CA)
.
(5.113)
From (5.113) and (5.29) the condition (5.109) becomes
£ [ i
+
£ r ( i - A ) / 7 - t ) i
i
03=o *
r ( i ( J v - j + l) +
i l r ( | ( i v - i
=
FQ(/V-1),
+
l
ft-)
r 4
r j
rci(N-U)
0 =a p
(l-A)/
7 j
( l + 7r-')
ift
i))(2/3 )! U + £ ^ ( ( 1 - * ) / • > ; - i ) J
"
j
-(N-l);
;
(5.114)
i(p-l);cA)
for all T with r, > 0 and £ J ri = c. C a s e p — 3 , N — 4 ( N — 3 if m e a n k n o w n ) . S o l u t i o n of G i r i a n d K i e f e r In this case (5.113) can be written as . + 2n)
>*3 *3
V(l-4)(i-^A)
+
r g a ( l - A ) ( 2 n + l ) ( 2 n + 3) a
(l-«a)(l-r A)l a
Ta 83 (l-fc)(l-r A) 2
(5.115)
One could presumably solve for £ by using the theory Meijer transforms with kernal F ( § , §; \\x).
2
Instead they proceeded as for Hotelling's T
problem,
128
Group Invariance in Statistical
Inference s
measure on [0,1], or, equivalently that the Laplace transform £ ™ ltj(-t) lj\
is
completely monotone on [0, oo}. Unfortunately we have been unable to proceed successfully in this way. We now obtain a function m {x), t
which we then prove
to be the Lebesgue density d £ * ( x ) / d x of an absolutely continuous probability measure C satisfying (5.118) and hence (5.115). T h a t proof does not rely on the somewhat heuristic development which follows, but we nevertheless sketch that development to give an idea of where the 111,(1) of (5.123) comes from. T h e generating function CC
3=0
of the sequence {fij}
satisfies a differential equation, which is obtained from n
l
(5.118) by multiplying it by t ~
and summing with respect to n from 1 to 00, l
2(1 - t)(t - z)
-
2
- 2zt + z)(t) (5.119)
1
tf,(l-t) i
-1-zf" .
Solving (5.119) by treatment of the corresponding homogeneous equation and by variation of parameter, we get
k U l - r ) (r-
m=
W=T)\
Cr-*)l (1-*)*
+
1
dr.
2[T(1-T)(T-Z)]1
(5.120)
T h e constant of integration has been chosen to make
m (x) f
Llo we could obtain m
c
m (x) z
(1
—
z
satisfied
dx = 0 ( f ) ,
(5.121)
tx)
by using the simple inversion formula = -
. lim
- 1
- ie)
(5.122)
27TIX E l O
Since there is nothing in the theory of Stieltjes transforms which tells us that an m
z
satisfying (5.122) does satisfy (5.121), we use (5.122) a formal device to
Some Minimax Test in Multinormales
obtain m
E
129
which we shall prove satisfies (5.115). From (5.120) and (5.122) we
obtain, for 0 < x < 1,
(
«•(«)=
(1-**)* 1
;
M
)
,
f
n
k
f
X
/
(1 + u)(z + u p
du
[u(l + u ) ( z + i*)]*
- 2 (1 + 1 i ) i ( z + « ) { B , Q , ( z ) + c,} (say).
(5.123)
2jr(a:(l - as))i We can evaluate c
c
by making the change of variables V = (1 + u)
1
and using
(5.126) below. We obtain
and
Q , - 2 ( l - z } -
1
+ (l-z)"t
[ l - ( l - ^ ) - ^ log
' ( l - z ^ i + f l - z ) ^
l - ( l - z ) i
. ( 1 - « ) * - ( ! - * ) *
l + (l-z)*.
Now to show that £*(da;) = m (x) I
'
dx satisfies (5.115) with £* a probability
measure we must show that
(a) m , ( i ) > 0 for allmost all x, 0 < x < 1; (b) fi m (x)
dx = l;
z
(c) /ti =
1 1 1 1 , ( 1 ) da; satisfies (5.118a);
(d) u
x
n
=
n
m,(a;) da; satisfies (5.118b) for TI > 1.
(5.124)
130
Group Invariance
in Statistical
Inference
T h e first condition will follow from (5.123) and the positivity of B
and c ,
z
for 0 < z < 1. T h e former is obvious. To prove the positivity of c ,
we first
z
note that
j
this is seen by comparing the two power series, the coefficient of z [(f j j / j f ] I)2/j{j
+
3
being
and (j + 1), and the ratio of the former to the latter being n";=i(» + i ) > i . T h u s we have B
>l-2.
M
Substituting this lower bound into
the expression for c and writing u = 1 — 2 the resulting lower bound for c z
z
has
a power series in u (convergent for \u\ < 1) whose constant term is zero and 1
2
whose coefficent of v> for j > 1 is [(j + i ) " - T (j
+ ±)/r(j)r(j + 2
Using the logarithmic convexity of the T-function, i.e. V (j+^) the coefficient of u> for j > 1 is > (j + | )
-
1
+1)].
< !T(j)r(i + l ) ,
1
- (j + l ) " > 0 Hence c
z
> 0 for
0 < z < 1. T o prove (5.124)(d) we note that m , ( i ) defined by (5.118) satisfies the differential equation 2
1
1 - 2x + zx m' (x) x
+ 7ro,(i)
= B /2nxHl
- x)%{\ - zx),
z
x(l
- x)(l
(5.125)
— zx)
so that an integration by parts yields, n > 1 - z ( n + 2)M-v+I + (1 + z)(n n+1
=
/ {-z(n Jo
=
[ x (l-(l Jo
= - ) -
f
i
n
l
+
2
l
l
n
n
+ ( 1 + z)(n + l)x
-
n
zu
1
-nx ~ }
m {x) z
dx
2
+ z)x +
_
n^ -i n
+ 2)x
n
+ l)p« -
zx )m' (x)dx t
n + 1
+ B
t
—
^
|
,
which is (5.118)(b). T h e proofs of (5.124) (b) and (c) depends on the following identities involving hypergeometric functions F{a, b; c; x) which has the following integral representation when Re(c) > Re(b) > 0: F(a,b;c;x)
=
r
(
&
)
^ _
6
)
f
- t )
M
( \
- tx)~*dt.
(5.126)
We will also use the representation 2 ^ ( i l ; ^ ; x )
=
1
o g ( l ± f )
(5.127)
Some Minimax Test in Multinormales
131
and the identities =
F(a,b;c;x) ( c - o - 1 ) F(a,b;c\x)
+aF(a
- ( c - 1) F{a,b;c lim
(5.128)
F(b,a;c;x),
- l;x)
+
l,b;c;x)
= 0,
(5.129)
[TicT'F^b-c^x)
=
(a)
( f t U i ^ M A *-(„ + „ + i (JI + 1)! +
l
)
6
+
B
+
1
;
„
+
2
;x) (5.130)
for 7i = 0 , 1 , 2 , . . . , r ( l + A + n ) r ( l + y + /Q
= F \ l + A, ~
- v; 1 + X +
- A, 1 + V;1 + J/ + ii; 1 - * j
- i - A, ~ +1/; 1 + « + ./*;! — as
+ F\ \ + X, \ - u; \ + X -t- fi\x \ f [
- j
Q
+ A , | - i / ; l + A + as)
F ( | - A,|+
W
C
+ y +
- ;£.j;
6
F(ffl,6;c;x) = (1 - x ) - " - F ( c - a , c - b;c; x).
(5.131)
We now prove (5.124) (b) and (c). F r o m (5.123), using (5.126) and (5.127) we obtain
-(I-48
— F
F ^ , i ; l ; ^
+
(
3 1 - , - ; 2; 1 - z] x F\ 2 2
l
r / 2 )
-
| ; 1; 1 -
^
1 1 2 2
l + (i-z)5(i-zx) (1 — log 3 X 1 1 i 2ir(l - j ) t i i ( l - i ] l \ 1 — 6
I 1 zi) ? -
(5.132)
132
Group Invariance
in Statistical
Inference
The first expression in square brackets in ( 5 . 1 3 2 ) varishes, as is easily seen from the power series of F in ( 5 . 1 1 1 ) .
Using ( 5 . 1 2 7 ) , the integral in ( 5 . 1 3 2 }
can be written as y.
_ J
^ ( l "
3
) ^
1
zT
(1 -
f
dx
2n + l Jo
(l-x)i(l-zx)"
V
71=0 1
°° °°
z, m( (^l )\ m ^oo m
2 * H
r a
(m!) v
m=0
'
2
(n + m ) ! < l - J t ) "
Z -
B
!(n+|) v
n=0
3'
(i - f )
m=0
=
(
1 1
_
z
)
-
1
+
'
l
(
_
1
2
V
z
)
rl-l r ' _ — i i ^
- i ;
1
= ( 1 - z } " + ( T T / 4 ) (1 -
;
m
+
_
( i - 2 - o *
0
i r * * ( j ,
| ; 2 ; 1 - z ) .
(5.133)
From ( 5 . 1 3 2 ) and ( 5 . 1 3 3 ) we get
|
m,(«)(fc = ( 7 r / 2 J F f-i,i l;4ri " 2 ' 2 " " / T
F ,
!
- P
rii;l
V2'2
1.1,*.-.)]
Hence, from ( 5 . 1 2 9 ) and ( 5 . 1 3 3 ) we obtain from
(5.134)
- F ( - ^ - ^ I ^ ) ] } .
(5.135)
Some Minimax Test in Multinormales
Using (5.129) with a = b =
133
c = 1 + e and (5.130) with n = 0 and
c = e —* 0, (5.132) reduces to 1 1 1 1 3 1 - - , - ; l ; z ] F ( - , - ; 2 ; l - z ) + ( / 2 ) F (-, 2; 1-z 2'2' 2'2'
mF[
)F[
E
1 1 -;2;* 2'2' (5.136)
Now, by (5.131), the expression (5.133) equals one if we have
-/•(
- i , 4 ; l ;
Z
)
+
(
2
/ 2 ) F ( i , i ;
2
;
= 0.
.
(5.134) T h e expression inside the square brackets is easily seen to be zero by computing n
the coefficent of z . T h u s we prove (5.124)(b). We now verify (5.124) (c). T h e integrand in (5.132) is unaltered by multiplication by x, and in place of (5.133) we obtain (1 — Z)~ /2 1
z)i]
— Z~ /3+[K/4Z(1 1
—
F ( - | , | ; 2 ; 1 - z). T h e analogue of (5.134) is
pi
—
I x m (x) Jo
dx
2
^i
;2;*)
+
kin-
2
(7T/4)(1-Z )/Z
-[(l-z)*/3.]F|
'3 (
n ^ l ^ i - z
1 3 ;
2 2
;
3
2'2
:
(5.135)
1
T o verify (5.124) (c) we then have to prove the following identity (by (5.118)
ril +
x F
3
3
2'2'
F
,1:1:1-* 2
<x/4)[(l-*) /*]
1 3 - - , - ; 2 ; l - z 2'2'
.
(5.136)
134
Group Invariance in Statistical Inference
From (5.131) with c = 1, a = b = § , then (5.129) with a = ±, 6 = - J ,
c = 1,
and then (5.130) with A = /i = 0, f = 1 we can rewrite (5.136) as
(4/3ir){l + 2»)
+ (3z/2)F(i,-|;2i *)
| , | ; 2 ; 1 - * ) + (4/3TT)(1 -
i
i
1
3
2'
2'
2'
2
z)
(5.137) Using (5.129) with a = - f , b = § , c = 1 + «r, and then (5.130) with n = 0, c — « - * 0, the expression inside the square brackets in the last term of (5.137) can be simplified to \ z F ( — I , ^ ; 2 ; r ) . T h u s we are faced with the problem of establishing the following identity
4/TT = Fl
1 3 1 1 — - , - ; 2; z j F\ - , - ; 1; 1 — z 2 2 2'2'
(5.138) which finally by (5.126) reduces to
-^ F G'-^ ;2;1 - 2 ) +((1 - £)/2 - 1)F (^ ;2;1 - 3 ) • (5.139) T h e expression inside the square brackets in (5.139) has a power series in 1 - z, the value of which is seen to be zero by computing the coeificents of various power of 1 - z. Hence we prove (5.124) (c).
Some Minimal Test in Mul(inormales 135 5.3.3.
E-Minimax
test
(Linnik, 1966)
Consider the setting of Sec. 5.3.1. 6 = 0 against alternatives Hj : 8
2
for Hotelling's T
2
lest for testing H
> 0 at level Q(T(0,1). T h e Hotelling T
2
0
:
test
0jv is £ - m i n i m a x if sup inf E (
inf
8
E (
(5.139)
N
for all JV > / V (e) where 8 = (ft, E ) and 0 runs through all tests of level < a. 0
For N - * oo, A = ^ , 0
< tp < 2(log N ) * , (5.26) has the approximate
solution
which is a pdf on the simplex I V If we substitute (5.140) in the left side of (5.62) we obtain a discrepancy with its right side which is of the order of 0 ( J V £
- 1 + £
) , for e > 0.
T h u s , if
a = ON satisfies the condition 1
0(exp{-logN)V*)
< a < 1 - Ofexpf-logfV) ^)
(5.141)
and we have £
1
e x p ( - ( l o 7 V } ^ ) < 8 < (logfiV)) /*
(5.142)
g
then sup inf E () e
inf
for e > 0 and hence the Hotelling T
2
1
5
Eg(4> ) = O ^ l / i V " ) N
test is e-minimax for testing H
a
against
i f , : 8 > 0. Exercises 1. Prove in details (5.65). 2. Prove (5.72) and (5.73). 3. Prove (5.119) and (5.120). 4. Prove (5.127) and (5.130). 5. Prove that in Problem 8 of Chapter 4 no invariant test under ( G T - , ! ^ ) is minimax for testing Ho against Hi for every choice of A.
136
Group Invariance in Statistical
Inference
References 1. M. Behara, and N. Giri, Locally and asymptotically minimax test of some multivariate decision problems, Archiv der Mathmatik, 4, 436-441, (1971). 2. A. Erdelyi, Higher Transcedenial Functions, McGraw-Hill, N . Y . , 1953. 3. J . K. Ghosh, fnvariance in Testing and Estimation, Publication no S M 67/2, Indian Statistical Institute, Calculta India, (1967). 4. N. Giri, Multivariate Statistical Inference, Academic Press, N . Y . , (1977). 5. N. Giri, Locally and asymptotically minimax tests of a multivariate problem., Ann. Math. Statist. 39, 171-178 (1968). 6. N. Giri, and J . Kiefer, Local and asymptotic minimax properties of multivariate test, Ann. Math. Statist. 39, 21-35 (1964). 7. N. Giri, and J . Kiefer, Minimal character of R -test in the simplest case, Ann, Math. Statist, 34, 1475-1490 (1964). 8. N. Giri, J . Kiefer, and C . Stein, Minimax properties of T -test in the simplest case, Ann. Math. Statist. 34, 1524-1535 (1963). 9. E . L . Ince, Ordinary Differential Equations, Longmans, Green and Co., London (1926). 10. W, James, and C . Stein, Estimation with quadratic loss., Proc. Fourth Berkeley Symp. Math. Statist. Prob. 1, 361-379 (1960). 11. E . L . Lehmann, Testing Statistical Hypotheses, Wiley, N . Y . (1959). 12. Ju. V. Linnik, V. A. Pliss and Salaevski, On Hotelling's Test, Doklady, 168, 719-722 (1966). 13. J . Kiefer, Invariance, minimax sequential estimation, and continuous time processes, Ann. Math. Statist. 28, 573-601, (1957). 14. Ju. V. Linnik, Appoximately minimax detection of a vector signal on a Gaussian background, Soviet Math. Doklady, 7, 966-968, (1966). 15. M. P. Peisakoff, Transformation Parameters, Ph. D. thesis, Princeton University (1950). 16. E . J . G . Pittman, Tests of hypotheses concerning location and scale parameters, Biometrika 31, 20G-215 (1939). 17. O. V. Salaevski, Minimal character of Hotelling's T test, Soviet Math. Doklady 3, 733-735 (1968). 18. J . B. Semika, An optimum poroperty of two statistical test, Biometrika 33, 70-80 (1941). 19. L . J . Slater, Confluent Hypergeometric Functions, Cambridge University Press (1960). 20. C . Stein, The admissibility of Hotelling's T test, Ann. Math. Statist, 27, 616-623 (1956). 21. A. Wald, Contributions to the theory of statistical estimation and testing hypotheses, Ann. Math. Statist 10, 299-320, (1939). 22. A. Wald, Statistical Decision Functions, Wiley, N.Y. (1950). 23. O. Wesler, fnvariance theory and a modified principle, Ann. Math. Statist. 30, 1-20 (1959). 2
2
2
2
Chapter 6 L O C A L L Y MINIMAX T E S T S I N S Y M M E T R I C A L DISTRIBUTIONS
6.0.
Introduction
In multivariate analysis the role of multivariate normal distribution is of utmost importance for the obvious reason that many results relating to the univariate normal distribution have been successfully extended to the multivariate normal distribution.
However, in actual practice, the assumption of
multinormality does not always hold and the verification of multinormality in a given set of data is, often, very cumbersome, if not impossible. Very often, the optimum statistical procedures derived under the assumption of multivariate normal remain optimum when the underlying distribution is a member of a family of elliptically symmetric distributions.
6.1. E l l i p t i c a l l y S y m m e t r i c D i s t r i b u t i o n s D e f i n i t i o n 6 . 1 . 1 . (Elliptically Symmetric Distributions (Univariate)). A random vector X = (X\, • • - , X )' ?
with values in E
p
is said to have a distribu-
tion belonging to the family of univariate elliptically symmetric distributions with location parameter (i = (pi,...
,/i )' e E p
p
and scale matrix £
(positive
definite) if its probability density function can be expressed as a function of 1
the quadratic form (x - n)''S~ (x
fx(x)
= |E|-
1 / 2
- p) and is given by
ff((x -
- ft)),
137
i e
p
£ ,
(6.1)
138
Group Invariance
in Statistical
Inference
where q is a function on [0,oo) satisfying J o(y'y)dy
= 1
p
for y e
E.
We shall denote a family of elliptically symmetric distributions by E (n, p
£).
It may be verified that E{X) where a — p the class E {u p
t
family E (ti,
- 1
E{{X
= u, l
- n)'%- (X
cov(X)
= oS
- f*)). In other words all distributions in
S ) have the same mean and the same correlation matrix. T h e S ) contains a class of probability densities whose contours of equal
p
density have the same elliptical shape as the multivariate normal but it contains also long-tailed and short-tailed distributions relative to multivariate normal. T h i s family of distributions satisfies most properties of the multivariate normal. We refer to Giri (1996) for these results. D e f i n i t i o n 6 . 1 . 2 . (Multivariate Elliptically Symmetric Distribution). A n x p random matrix x = (x ) ij
where Xi — (Xn,...,
X )'
=
(x ,---,x y l
n
is said to have a multivariate elliptically symmetric
ip
distribution with the same location parameter p = ( p i , . . . , p } ' and the same p
scale matrix S (positive definite) if its probability density function is given by
f (x) x
p
with Xi e E ,i
2
= \L\-^ q
r g f o
-
tf^fa-tM
,
(6.2)
= 1 , . . . , n and q is a function on [0, oo) of the sum of quadratic
forms {ii — p ) ' S
- 1
( i i — p ) , i — 1,.... ,H.
We have modified the definition 6.1.2. for statistical applications.
Ellip-
tically symmetric distributions are becoming increasingly popular because of their frequent use in filtering and stochastic control, random signal input, stock market data analysis and robustness studies of multivariate normal statistical procedures. T h e family E (u, p
S ) includes, among others, the muiltivariate nor-
mal, the multivariate Cauchy, the Pearsonian type I I and type I V , the Laplance distributions. T h e family of spherically symmetric distributions which includes the multivariate-t distribution, the contaminated normal and the compound normal distributions is a subfamily of i £ ( p , £ ) . p
Locally Minimax Tests in Symmetrical
Write f (x) x
Distributions
139
in (6.2) as (6.3)
where 8 = {p, E ) e Q,x
e x and * is a measurable function from \ m
Let us assume that y is a nonempty open subset of R
o
n
t
o
3^-
and is independent of
0 and q is fixed integrable function from y to [0, oo) independently of 8. L e t G be a group of transformations which leaves the problem of testing H
0
:8 £ i l / / against Bi : 8 £ flff, invariant. 0
Assume that \ is
a
C a r t a n G-space and the function i/j satisfies *(gx\8)
for ail i
=
gV(x\8)
(6.4)
€ x> ff £ G , # G SI and g € G where (5 is the induced group of
transformations on * corresponding to g £ G on x
a r |
d G acts transitively on
the range space y of *£. Using (2.21a) the ratio i i of the probability density function of the maximal invariant T(X)
under G on x; with 8\ 6 QH, , #o £ QH > > given by s
0
where 6(g) is the Jacobian of the inverse transformation x —> gx. T h e o r e m 6 . 1 . 1 . / / G acts transitively is independent Proof.
of
then R
0
Since G acts transitively on y
element k(x,8j
on the range space y
of q for all 8\ 6 tiff,, do £ fln , for any j o £ J there exists an
: y ) of G such that 0
(6.6) Using the invariance property of p and replacing g by gh(x,8i
: j / ) we get 0
140
Group Invariance in Statistical
where A
r
Inference
(-) is the right modular function of ft. Hence
Q{8 )8{h{x,8 0
: y )-i)A (h(x,6>
0
0
r
: y ))
0
0
is independent of g.
•
6.2. L o c a l l y M i n i m a x T e s t s i n E (fi, p
E)
In recent years considerable attentions are being focussed on the study of robustness of commonly used test procedures concerning the parameters of multivariate normal populations in the family of symmetric distributions. For an up-to-date reference we refer to K a r i y a and Sinha (1988).
T h e criterion
mostly used in such studies is the locally best invariant ( L B I ) property of the test procedure. G i r i (1988) gave the following formulation of the locally minimax test in E (ii,
£ ) . Let F be the space of values of the maximal invariant
p
Z — T(X)
and let the distribution of T(X)
with 6 > 0 and 77 g W.
depend on the parameter (<5,77),
For each (u, 77) in the parametric space of the distribu-
tion of Z suppose that f(z
: $,tf) is the probability density function of Z with
respect to some r/-finite measure. Suppose the problem of testing Hg : 8 e f l / /
0
against Hi .8 e f i / / , reduces to that of testing H0 : 8 - 0 against Hi := A > 0, in the presence of the nuisance parameter IJ, in terms of Z. We are concerned with the local theory in the sense that f(z
: A,n) is close to f{z
: 0,17) when A
is small for all q in (6.2). Throughout this chapter notations like o ( l ) ,
o(h(X))
are to be interpreted a A —* 0 for an q in (6.2). For each a , 0 < a < 1 we now consider a critical region of the form R' - [U{x)
= U(T(x))
> C]
(6.7)
a
where U is bounded, positive and has a continous distribution for (8, r/), equicontinuous in (S,ri) with 6 < S for any q (6.2) and R' 0
P , „ ( f l * ) = cv, P ( f T ) = a + h{\) 0
M
satisfies
+ rfA,/,)
(6.8)
for any q in (6.2) where r(A,ri) - o(/i(A)) uniformly in 77 with h{\) > 0 and h(X) = o ( l ) . Without any loss of generality we shall assume h(X) — bX with
£>> 0. Remarks. (1) T h e assumption P „(R*) 0t
of U under H
0
= a for any q implies that the distribution
is independent of the choice of q. T h e test with critical
region R' satisfying the assumption is called null robust.
Locally Minimax Teats in Symmetrical Distributions (2)
T h e assumption P\^(R")
— a + h(X) + r(X,ri)
the distribution of U under Hi
141
for any q implies that
is independent of the choice of q as
X —> 0. T h e test with critical region R" satisfying the assumption is called locally non-null as X —• 0. L e t fo, £x be the probability measures on the sets {S = 0 } , {6 —
X},
respectively, satisfying
/
f(z
/
:
X,T,)Udv) = 1 + h(X)[g{X) + r{X)u] + B{z, X) n
0
for any q in (6.2) where 0 < Ci < r(X) g(X)
(6.9)
f(z:Q, )( (dri)
— o ( l ) and B(z,X)
< c
< oo for X sufficiently small,
2
= o(X) uniformly in z.
Note. If the set 8 = 0 is a single point, £ assigns measure 1 to that point. In 0
this case we obtain
j
=
t
(
Since G acts transitively on the range space y of
6
1
0
)
the right-hand side of
(6.10) is independent of q. T h e o r e m 6.2.1.
If R* satisfies (6.8) and for sufficiently
small X, there
(6.10), then R* is locally minimax for testing HQ :
exist £\ and £o satisfying
5 = 0 against the alternative Hi : 6 = X for any q in (6.2) as X —* 0, i.e. l
i
iaU,Px, (R*)
m
v
A-.osup^
i e Q a
for any q in (6.2), where Q
a
iaf„Px, {4x v
- a
=
rejects
H} 0
j
,
f i
^
- a
is the class of test d>\ of level a.
P r o o f . Let n
1
= (2 + /i(A)[9(A) + C* 7-(A)])- . Q
A Bayes critical region for (0, 1) losses with respect to the a priori T\Z\ + (1 - TX)£O is given by
£J—
142
Group Invariance
B y (6.10) B (z) x
Inference
holds for any q in (6.2). Write B (z)
x
and W
in Statistical
= B,
x
V
x
x
= R* -
B
x
= B\ — R". Since s u p J 5 ( z ) / 7 i ( A ) | = o ( l ) and the distribution of U A
is continuous, letting
KM)
= /
PxMUdv)
we get
Since, with A — Vj, or p&t^)=%f,4!
(i+o(ft(A))
writing r J ( A ) - (1 - r ) P J ( A ) + r ( l A
i A
A
f&fA))
we get the integrated Bayes risk with respect to f j as r J ( B > ) - (1 - r O P o , x ( S , ) + r ( l A
r (R*)
=(1
x
P&pjJ).
- r )P %(J?) + rx(l - P ; , ( P ' ) ) • A
0
A
Hence for any q in (6.2) r ( B ) = rl(R-) A
A
+
+ (1 - n ) t * f o T O "
T (P ' a
= rl(R')
1
I A
(V )-P - ,(W,)) A
1
I
+ (1 - 2 ) ( P * ( H M - P * ( V ) ) n
0
A
0
= ri(il*) + (MA)).
A
A
(6.13)
0
If (6.11) is false for all q in (6.2), then by (6.8) we can find a family of tests { has power function cv + r(A, n) on the set {6 = A} x
for all q in (6.2) satisfying lim
sup (inf[r(A,jj) -h(X)]/h(X) A —> 0 "
Hence, the integrated Bayes risk r lim
A
> 0.
of d>\ with respect to
then satisfies
sup ( r ; ( P ' ) - r ) / M A ) > 0 A— 0
for all q in (6.2) contradicting (6.13)
A
•
Locally Minimax
Tests in Symmetrical
Distributions
143
6.3. E x a m p l e s E x a m p l e s 6.3.1.
L e t X = (Xij)
= (X[,.
.. X' ), t
X[ = (X' ,...
n
,X ),
tl
ip
i — 1 , . . . , n be a n x p (TI > p) matrix with probability density function
f (x)
2
m-^ q(Y(xi
=
x
~
rf'E-^-JK))
(6.14)
i-i where g is from [0, co) to [0, oo), ft = (pi,...
p
,p )'
€ E
p
and E is a p x p positive
definite matrix. We shall assume that q is thrice contunuously differentiable. Write for any b = (h,...,b )'
= {b' ,b' )',
p
(lfa+1,..
{1)
.»6p)',6p] = (bi,...,bi)'
with b i) -
(2)
{
and for any pxp
(f>i,... , 6 ) ' , b ) P l
[2
=
matrix
A=(a )=(f* iS
3
where A{
3
are pi x p
}
\ A
2
A
1
22
submatrices of A with pi -I- p
2
— p- We shall denote by
/an,...,ai,\
\aa,...,
On/
We are interested here in the following three problems. P r o b l e m 1. To test HIQ : p = 0 against H\\ ; p ^ 0 when E is unknown. P r o b l e m 2. T o test H
: 2I)
M(i) = 0 against H \ 2
• P(i) ^ 0 when p ( ) , E 2
are unknown. P r o b l e m 3 . To test H Q : p — 0 against H31 : p ; i j = 0, p ( ) / Z
2
0 when E
is unknown. T h e normal analogues of these problems have been considered in Chapter 4. P r o b l e m 1. Let Gi(p)
be the multiplicative group of p >c p nonsingular matrices g.
Problem 1 remains invariant under G;(p) transforming (X,S;p,-i:)^(gX,gSg';gp,gT g') l
where X = ^
=
l
X i , S = Z? (Xi-X)(Xi-X}'.
space of ( X , S) is T
=l
2
= nX'S^X'
A maximal invariant in the
or equivalently R = nX'(S
1
+ nXX')~ X
=
144
T
2
Group Invariance
/ ( l + T
2
) .
in Statistical
Inference
A corresponding maximal invariant in the parametric space under
Gi(p) is 6 = T t y t ' E ~ V - I f Q is convex and nonincreasing, then the Hotelling
T
2
test which rejects H\o whenever i i > C is uniformly most powerful invariant with respect to G ( p ) for testing i f ;
1 0
against H\i
( K a r i y a (1981}}. A s stated in
Chapter 5, the group Gi(p) does not satisfy the conditions of the H u n t - S t e i n theorm. However this theorem does apply for the subgroup GT(P)
ofpxp
nonsingular lower triangular matrices with p > 2. From E x a m p l e 5.1.1. maximal invariant under GT{P) is [R\,... on A — (6\,...,Sf,)'. under Hu
T h e ratio
: S = A and H
L
R
of probability densities of
: 6 = 0 is given by (with g = (gij) € GT =
0
2
j
the
,i?p)', whose distribution depends
(
(Ri,...,Rp)'
GT{P))
i)/2
Pi(9z)fl(9 i) "- d9
R = ~
—
/
^
(6.15)
Po(sx)I](^)
( n
-
0 / 2
dS
where dg = Pi(x)
Jldgij,
= q(tr x'x - 2nxp + A),
Po(x) = ? ( t r x'x). Since x'x > 0, there exists a g E GT such that gx'xg' = I,
Jn~gx = y = (VR ...,
v^p)'.
U
Hence f R
q(ti{ g'
- 2A'gy +
9
tyf[(fi)f**&*B
=
(6.16)
/
( n
9(tr(ss'))n(^) -
i ) / 2
dff
Under the assumption that q is thrice differentiable, we expand q as q(ti(gg' - 2A'gy + A)) - 1(^(99'))
+ ( - 2 t r ( A ' g y ) + \)q™(ti
1
gg )
2
+ ^ 2 t r ( A ' g y ) + A ) ( t r (gg')) + h-2ti(A'g ) y
3
3
+ X) q (z)
(6.17)
Locally Minimal Tests in Symmetrical
Distributions
where z = ti{gg') + (1 - a ) ( - 2 t r ( A ' y ) + A), 0 < c < 1 and
145
-
9
Since the measures
9 % * < ^ ) > j l ^ ^ ^ d f l , fc = 1.2,3 i=i are invariant under the change of sign of g to -g,
/
tr(AW"(tr(S9'))
/
9iW
t
)
(tr(
fiftfe**-*** (
9
we conclude that
1
'))n(4) ' "
9
i
l
/
2
= °' r
f
^
0
(618)
for i j£ i, j m . Now letting D =
/
g
(tr(
'))n(s?i)
f f 0
( u
-°
/ 2
^,
from (6.15)-(6.18) we get
R=l
+ ±
+ 4
/
+ 4 6
!
g^(gg'))f[(gir-^dg
MA'
/ /
0 i
,))
(-2tr(A'
2
2
9
? s /
?
( >(tr(
)
+
5 f f
A)V
3 1
'))n(^)
3
2
| n
-'
( )n(S ,) i i
T h e first integral in (6.19) is a finite constant Q\.
| n
) / 2
-
d
S
l ) / 3
^ (6.19)
T o evaluate the second
integral in (6.19) we first note that
tr{L\'gy)
=
Y r)
12
(6.20)
d
From (6.17) and (6.19) the second integral in (6.18) reduces to
/
|5>£M-I
^(gg'))f[(gl^-^dg.
(6.21)
146
Group Invariance in Statistical
Inference
T o evaluate the above integral we need to evaluate the following two integrals.
JOT
i=i (6.22)
h =
2
I
( 2
t r
g 3 '( (s3'))f[(s j
2 1
(
l
} ''-
i , / 2
^
Define L = tr(ffff')
- gtJL, i = l,...,pi £
ep i=flfn,i/ '
*=
+
L
fip+p-i+i = 9i+i,if <
i,--.,P-i;
»= 2,...,p-2;
and ST= /
(2,
g (tr(9s'))rfff
Since
is a spherical density of gLtf, L and e = ( e i , . . . , e ( p
p + 1
j y ) ' are independent 2
and e has a Dirichlet distribution £ ( 1 / 2 , . . . , 1 / 2 ) . From K a r i y a and E a t o n (1977) the probability density function of e is given by
P
(e) = r ( f c ^ ) / [ r ( i / 2 ) i ^ ' /
2
p(p+l)/2-l
/
p(p+l)/2-l
i=.
V
i
1/2-1
J
(6.24)
Using (6.22) and (6.23) we get
,
,
Jk -
(n - • + 1 ) — ,
.NM
_ WW J - — 2
(6.25)
Locally Minimax
Teats in Symmetrical Distributions
147
where
M = E[
/ P TJ(
E I
)(«-0/a
\i=l
C = p(p + l ) / 4 + 1/2 £ ( n - i ) .
(6.26)
Let JJ = ( i j i , . . . , % , ) ' , r?i — ^i/A. From (6.18) we get
» =
X-+ A2?
|j>
j&+| J
L + ( n - j +
)%] j j + (6.27)
where B(y,ri,\)
and 77. T h u s , from (6.27), with
— o(A) uniformly in
the equation (6.9) is satisfied by letting £0 give measure one to the single point ?7 = 0 while £
A
gives measure one to the single point 77 — n" — ( j j j , . . • , t j j )
whose j t h corrdinate Wj satisfies V'j = (" " j ) " V
" 3 + i r V ^ f f t " P ) . / = 1, -
(6-28)
so that + l ) n | - njp.
Since G;(p) and G (p) T
(6.29)
satisfy the condition of Theorem 6.1.1., R does not
depend on q. T h e first equation in (6.8) follows from the following lemma. L e m m a 6 . 3 . 1 . Let Y = (Y ... u
annxp
,Y )', n
random matrix with spherically
f (y) Y
Y; = {Yn,..., symmetric
= 9(tr yy') = q ( g ,i=l
a n d Jet
tr
Y )' ip
f
i = 1 , . . . , « be
probability density
] ,
function
(6.30)
148
Group Invariance in Statistical
where Y ) , Y ( L
p>k.
( 2 )
Inference
are of dimensions
The distribution
of Y
(Y ' Y( ))
( 1 |
P r o o f . Since n > p, Y'Y
k x p, (n - k) x v respectively and (
2 )
Y
2
n - k >
does not depend on q.
(1)
is positive definite with probability one. Hence
there exists a p x p symmetric positive definite matrix A such that Y'Y Transform Y to U, given by Y = UA.
— AA.
A s the Jacobian of the transformation
n
Y — U is
\A\ , n
/r/,A(«,«) = «(tr a a ' ) M . T h u s (7 has the uniform distribution and 17 is independent of A.
Partition
U as
where t /
is k x p and (7(2) is (n - k) x p. From Y = C M we get Y
( 1 )
( { )
= t7#j>4,
i = 1, 2. T h u s
Since Y ( ( Y ' Y ( ) ) 1 1
1
2 )
_ 1
2
Y j j is a function of 17 alone, it has a completely specified {
distribution.
•
C o r o l l a r y 6.3.1. If T
2
fc-
is distributed as Hotelling's
(central i.e. 6 = 0). Proof.
-1
Since Y ( j ( Y j ' j Y ( 2 ] ) Y j ' j has a completely specified distribution 1
2
1
for all q in (6.30), we can, without any loss of generality, assume that Y i , . . . , Y„ are independently distributed N (0,I).
T h i s implies that Y / j l ( 3 ) =
p
E J L Y / Y / has Wishart distribution W (n 2
Y
Y
p
Y
lY
h
(i)( (2) m)~ {i)
a
s
Hotelling's T
2
distribution. 2
From this it follows that when 6 = 0, T Yi, 5 =
Y
Y
~ )( i
Y
h
~ Y
a
s
2
- 1,1) independently of Y i . Hence • 1
= nY'S' ?,
Hotelling's T
2
with Y = l/n
£"
=1
distribution for all q in
(6.30). T o establish the second equation in (6.8) we proceed as follows.
Using
(2.21) we get for any g £ t?/(p), (which we will write as Gi) 2
fT*{t \6) 2
fT2(t \0)
J g(tT( g'-2a'gy Gi
9
L
2
+
S))\gg'\^' dg 2
<j(tr gg')\g0«-»)/ dg
(
J
Locally Minimax
Testa in Symmetrical
Distributions
149
where y = ( < / r , 0 \ . . . , 0 ) ' , a - ( \ A , 0 , . . . , 0 ) ' are p-vectors and T = f / ( l + f ) . 2
2
After a Taylor expansion of q, the numerator of (6.31) can be written as iin
p,
{n
/ q(ti99')\99'\ - dg JG,
p)n
+ 6 j <,<»>(tr99')\99'\ - dg JG, ,
2
+ 26r f q^(t gg )gl\gg'\^->'y dg JG,
+ o(6)
T
where g — (jJy). Using (6.18) we get the ratio in (6.13) as l + 6(k + CT) + B(y,t),6) where B(y,r),6) Si/S,i
— o(6) uniformly in y,v
(6.32)
and 77 — ( i j i , . . . ,i7 )', with r/i — p
= 1 , . . . , p and k,c are positive constants. Hence for testing H
against the alternative H'
n
: S — A the Hotelling's T R* = {U(X)
where c
:6 = 0
l0
a
2
test with critical region.
= R>c }
(6.33)
a
is a positive constant so that R* has the size a, satisfies P^tRn^a
+ bX + giKr,)
(6.34)
with b > 0 and <7(A,r/) = o(Z>A). T h u s we get the following theorem. T h e o r e m 6 . 3 . 1 . For testing Hm : n = 0 against the alternative H'
n
A specified > 0 Hotelling's of distribution
T
2
test is locally minimax
:6—
as X —* 0 for the family
(6.14).
R e m a r k 1, Since G j ( p ) satisfies the condition of Theorem 6.1.1., the ratio (6.31) is independent of q. From Corollary 6.3.1. (6.34) it follows that Hotelling's T Hia against
2
and equations (6.31) and
test is locally best invariant for testing
: 6 — X as A —* 0 for the family of distributions (6.14). I f q
is a nonincreasing convex function then Hotelling's T
2
test is uniformly most
powerful invariant.
P r o b l e m 2. Write X = ( X i j , X ) ) with (
p,+P2=
( 2
: n x p\,X( ) 2
: n x P2, and
p.
Since the problem is invariant under translations of the last pa components of each Xj, j — 1 , . . . , n , this can be reduced to that of testing if o : p ^ j — 0 2
against H21 • P(i)
0 in terms of X^
only whose marginal probability density
function is given by /^^(^Di-ISiir^gftrSr/^ij-ep'^y^^-ep^))
150
Group Invariance in Statistical
where E i j is pi
x
P
l
Inference
left-hand cornered submatrix of £ , e
— (1,...,1)
JI-vector and g(tr v) =
/ q(tr(v + JR"F1
ww'))dw,
and q is a function on [0, oo) to [0, oo) satisfying (6.1a). Now, using the results R
of problem 1 with p = p , R = Ri L
T h e o r m 6.3.2.
i
w
e
h
a
v
e
t
h
e
following theorem.
For testing H20 • P(i) = 0 against H'
21
: 6\ — np'^y
S V ( i ) = X {specified) > 0 the test which rejects H20 for large values of n
Rj — nX' (S (1)
1
+r$( y£fa)~ Jt(i)
{n)
l
= nXl S( \ X /{l l)
l )
+
{l)
nXl S-\ X , ) 1)
)
l )
is locally minimax as X —t 0 for the family of distributions
(6.34).
R e m a r k 2. T h e locally best invariant property and the uniformly most powerful invariant property under the assumption of nonincreasing convex property of q follows from the results of problem 1. P r o b l e m 3. T h e invariance property of this problem has been discussed in Chapter 4. T h e problem of testing i f
3 0
against H i 3
remains invariant under
the multiplicative group of transformations G of p x p nonsingular matrices g, 9 m )
9 = ( \9{2i)
° 9(32)
where gin) is p i X P i , transforming (X, S) -» (gX, gSg').
A maximal invariant
in the sample space under G is (Hi,.S3) where pi
p
Ri=2^Ri,
Ri
Ri+R~
fc=l
2
= R =
Y, i=l
A corresponding maximal invariant in the parametric space is S\, 62 where
i=l P i-l
For invariant tests under G the problem is reduced to testing i i against the alternatives H
3l
Ri and R 2
3 0
; 5 = 0 2
: b\ > 0, when it is given that 61 = 0, in terms of
Locally Minimax Tests in Symmetrical
Distributions
151
As stated in Chapter 5 the group G does not satisfy the conditions of the H u n t - S t e i n theorem. However this theorem does apply for the subgroup GT(P)
of p x p nonsingular lower triangular matrices. From problem 1 the
maximal invariant under GT{P) is (Ri,...,Rp)' continuously on A =
whose distribution depends
with 6i > 0 for all i. Under H
, • - •, VS )' P
30
: 6i — 0,
i = l , . . .p and under i/31 : 6i — 0, i — 1 , . . . , pi- T h e ratio R of the probability densities of (Ri,...,
under H'
R )' p
: 6 = A > 0 and under H Q '• $2 = 0, when
31
3
2
it is given that 6\ — 0, is given by f
/
q(tv(g '
- 2A{ (g
9
2j
+
{2im)
R = —
S ( 3 2 ) J
{n
, ) + A)} 1
— /
?(tr
J G
where g e G (p)
(6-35)
gg')\{{gir-^dg 1
T
— GT, y — (T/R~I, • • • ,VRy)'
T
,)l2
]\[gl) - dg
( 2
and
•-*•>-(a a
)
with ( » i i j ff(22) both lower triangular matrices. As before we expand 9 in the numerator of (6.35) as 9(tr Sff') + (~2v + \)qM(ti 9
| 2 )
(tr
gg') + {-2v
+ A)
3 ,
3 0
') + (-2f + A)V (^)
2
(6-36)
where 2 = tr gg' + (1 - a ) ( - 2 f + A), 0 < a < 1,
" = tr A { ( 9 ) 3 / ) + 3(22)2/(2)) 2)
[21
(1
Using (6.18) the integration of the second term in (6.36) with respect to the measure n
^
-
6
^
3 7
( - >
1
gives A Q I where
W
01= /
2
in
iy2
q (K99')f[(9 ,) - dg.
JG
T
1
T o integrate the third term in (6.36) with respect to the measure given in (6.37) we first observe that (by (6.18))
152
Group Invariance
/
in Statistical
2
(tr(A;
= /
Inference
2 ) S ( 2 u y o )
(
2
)%' Htrffs0n(5 J
E
9
( 2 t
(tr
S f f
l n
-
, ) / 2
^
')n^)
( n
"
i ) / a d
f (6.38)
and
/ JG
(tt(A;
2 ) f f ( 2 2 )
i
2
( ( 2 )
v r
)%( >(trS3')n(s i
r
=/
[)=Pi + l
( n
1
} -
0 / 2
rfs
ff
E >Ei 4 + T
2
-
+
V i
I
)
j>i
(6.39) 1
Denote K=
f JG
q^[tigg')dg, T
£ = tr gg', (6.40)
D=
(
f
2MN
9(tr
gg')f[{glY-^dg,
^ ( E i r
n+ D 1>J we get where e-s are defined in (6.23). From (6.35)-(6.40)
+ B{y,r,, A) where
B(y,n,
(6.41)
A) — o(A) uniformly in j / , T> (independently
6.1.1). T h e set {A = 0} is a simple point 17 = 0. So the £ assigns measure 1 to the single point rj = 0.
of 0
q
by Theorem
in Theorem 6.2.1.
T h e set {<5 — A} is a convex 2
P2 = (p — pi) dimensional Euclidean set wherein each component
77; =
0(h(X}).
Locally Minimax
Tests in Symmetrical
Distributions
Any probability measure £ \ can be replaced by degenerate measure
153
which
assigns measure 1 to the mean 77*, i — pi + 1 , . . . , p of £ j . Hence
p
+ B(y,ri,\)
f
where B(y,Tj, A) = o(h{\))
uniformly in y,n.
g
4
2
j
Consider the rejection region
RK = {X : U(X) = fi + Kf
> c)
2
(6.43)
0
where A" is a constant such that (6.42) is reduced to yield (6.9) and C„ depends on the level of significance a of the test for the chosen K, independently of q (by L e m m a 6.3.1.) Now choose
„ _ V j
(n-j-!)••• (n-j
+ l)
(n-p) --(n-p
I
n-pi
\
+ 2) {(n-p+Vpi)
,_ 3
'
P l
so that
Hence we can conclude that the test with rejection region R' = ^x :U(x)=f with P(, (R')
+
1
7
^^-r
2
>C J
(6.45)
0
= a satisfies (6.9) as A —> 0. Furthermore any region RK of the
TTI
form (6.43) must have K =
to satisfy (6.9) as A - * 0 for some
t\.
It can be shown that ( G i r i (1988)) ffi, R (?i, f 1 J?M )//fi, 2
Dl\ where D
l7
ati, a
(fi,
2
2
P2
\
2
\
3a
P2
(6.46)
are positive constants and B{f'1,^2,62) — 0(62) uniformly in
f j , f . From this and T h e o r m 6.1.1 2
f \H )
it follows that for testing H
3D
against J f
3 1
the test with critical region R' is locally best invariant as 6 —> 0. Hotelling's 2
154
Group Invariance in Statistical
Inference
whenever f i + f 2 > constant does not coincide with the
test which rejects H
3Q
L B I test and hence it is locally worse. From (6.32) it follows that Hotelling's test whose power depends only on 6 = 6 , has positive derivative at 6 2
= 0.
2
T h u s , for the critical region of Hotelling's test the value of h(\)
with 6% = A i n
(6.8) is positive. T h e first equation in (6.8) follows from L e m m a 6.3.1. Hence we get the following theorem.
T h e o r e m 6.2.2. rejects H
For testing H
30
whenever fi + ^jf-rt
30
the family of distributions
against H
: £ ) = A the test,
3l
which
( 2
> constant, is locally minimax
as A —* 0 for
(6.2).
E x a m p l e 6 . 3 . 2 . L e t X = (X )
= {X\
I}
X )' n
X[ = (X*
X )',i
=
ip
1,..., TI be an n x p random matrix with probability density function given in (6.2). L e t X = J
Xt, S = J J ( X i - X)(X
- X)'.
t
we shall write
C
-4(ll| Ajai) 4(31)
-4(12) A 2) 4(32) (2
For any p x p matrix A
4(i \ -4(23) I .4(33) / 3 )
(6-47)
with 4 ( n ) : 1 x 1, -4{ 2) I P i x p i , 4 ( , : p2 x p ; so that pi + p a = p — 1. Define 2
S
S
S
3 3
S
Pi = ( 1 2 ) ( 2 2 ) ( 2 1 ) / ( I 1 ) ' p
2
2
2
= p + p -(E,i2),E,i3,)'(^
(E
a a
,,E
( 1 3 )
)'/E
( 1 1 )
.
We shall consider the following three testing problems: (see Sec. 4.4) P r o b l e m 4.
T o test H
0
2
: p
2
= 0 against Hi : p
> 0
when p, E are
unkown. P r o b l e m 5. T o test H
: (? = 0 against H
0
: pi > 0, p\ = 0
2
when p , E
are unknown. P r o b l e m 6. To test H
0
:p
2
= 0 against H
3
: p\ — 0, p\ > 0 when p, E
are unknown. These problems remain invariant under the group of translations transforming ( * , S ; i » , S ) - » ( * + 6 , S ; / £ + t,E).
p
6eR .
Locally Minimax Testa in Symmetrical
Distributions
155
T h e effiect of these transformations is to reduce these problems to corresponding ones where p = Q and 5 = ^27=1 XiX[.
T h u s we reduce n by 1. We treat
now the latter formulation with p = 0 and assume that n > p > 2 to assert that S is positive definite with probability one. Let G be the full linear group of p x p nonsingular matrices g of the form
g = ( g_(11)   0       0      )
    ( 0       g_(22)   0      )   (6.48)
    ( 0       g_(32)   g_(33) )

with g_(11) : 1 × 1, g_(22) : p₁ × p₁, g_(33) : p₂ × p₂. For the first problem p₂ = 0 and p₁ = p − 1; for the other two problems p₁ > 0, p₂ > 0 with p₁ + p₂ = p − 1. The problems above remain invariant under G operating as

(S; Σ) → (gSg′; gΣg′),  g ∈ G.
For Problem 4 a maximal invariant in the space of S under G (with p₂ = 0) is

R² = S_(12) S_(22)^{−1} S_(21) / S_(11),

and a corresponding maximal invariant in the parametric space of Σ under the induced group (with p₂ = 0) is

ρ² = Σ_(12) Σ_(22)^{−1} Σ_(21) / Σ_(11).   (6.49)
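The invariance can be checked numerically. The following sketch (ours) verifies that R² is unchanged when S is replaced by gSg′ for a g of the form (6.48) with p₂ = 0:

```python
import numpy as np

rng = np.random.default_rng(1)
p = 5                                         # with p2 = 0, so p1 = p - 1
A = rng.standard_normal((p, p))
S = A @ A.T                                   # a positive definite "S"

def r_sq(S):
    s12 = S[0:1, 1:]                          # S_(12)
    return (s12 @ np.linalg.solve(S[1:, 1:], s12.T)).item() / S[0, 0]

g = np.zeros((p, p))                          # g of the form (6.48), p2 = 0
g[0, 0] = rng.uniform(0.5, 2.0)               # g_(11), a nonzero scalar
g[1:, 1:] = rng.standard_normal((p - 1, p - 1))   # g_(22), nonsingular a.s.

assert abs(r_sq(S) - r_sq(g @ S @ g.T)) < 1e-8    # R^2 unchanged under S -> gSg'
```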
For Problems 5 and 6 a maximal invariant under G is (R₁², R₂²), where

R₁² = S_(12) S_(22)^{−1} S_(21) / S_(11),   (6.50)

R₁² + R₂² = (S_(12), S_(13)) ( S_(22)  S_(23) ; S_(32)  S_(33) )^{−1} (S_(12), S_(13))′ / S_(11).   (6.51)

A corresponding maximal invariant in the parametric space of Σ is (ρ₁², ρ₂²), where

ρ₁² = Σ_(12) Σ_(22)^{−1} Σ_(21) / Σ_(11),
ρ₁² + ρ₂² = (Σ_(12), Σ_(13)) ( Σ_(22)  Σ_(23) ; Σ_(32)  Σ_(33) )^{−1} (Σ_(12), Σ_(13))′ / Σ_(11).
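A short sketch (ours; the function name is hypothetical) computing the sample invariants (6.50) and (6.51) from a positive definite S:

```python
import numpy as np

def sample_invariants(S, p1):
    """(R1^2, R2^2) of (6.50)-(6.51); blocks of sizes 1, p1, p2 as in (6.47)."""
    s11 = S[0, 0]
    s12 = S[0:1, 1:1 + p1]
    r1_sq = (s12 @ np.linalg.solve(S[1:1 + p1, 1:1 + p1], s12.T)).item() / s11
    s1_rest = S[0:1, 1:]
    total = (s1_rest @ np.linalg.solve(S[1:, 1:], s1_rest.T)).item() / s11
    return r1_sq, total - r1_sq       # (6.51) gives R1^2 + R2^2 = total
```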
Problem 4. The invariance of this problem has been discussed in Sec. 4.4. The group G does not satisfy the conditions of the Hunt-Stein theorem. The subgroup G_T of G, given by

G_T = { g = ( g_(11)  0       0      )
            ( 0      g_(22)   0      )
            ( 0      g_(32)   g_(33) ) },

where g_(22) (p₁ × p₁) and g_(33) (p₂ × p₂) are nonsingular lower triangular matrices, satisfies the conditions of the theorem. (It may be remarked that we are using the same notation g for the lower triangular matrices of G_T as for the elements of G.) In place of R², the maximal invariant in the sample space under G_T is the (p − 1)-dimensional vector R = (R₂, ..., R_p)′ defined in Example 5.1.5, with

R₁² = Σ_{j=2}^{p₁+1} R_j,  R₁² + R₂² = Σ_{j=2}^{p} R_j.
The corresponding maximal invariant in the parametric space under G_T is Δ = (δ₂², ..., δ_p²)′, as defined in Example 5.1.5, with

ρ₁² = Σ_{i=2}^{p₁+1} δ_i²,  ρ² = ρ₁² + ρ₂² = Σ_{i=2}^{p} δ_i².

For testing H₀ : ρ² = 0 against the specified alternative H₁λ : ρ² = λ, the ratio R of the probability density function of R under H₁λ to that under H₀ (using Theorem 2.7.1) is
R = f_R(r | H₁λ) / f_R(r | H₀)
  = (1 − λ)^{−n/2} M^{−1} ∫_{G_T} q( tr[ (1 − λ)^{−1} g_(11)² − 2 g_(11) δ y′ g′_(22) + g_(22) C g′_(22) ] ) ν(dg),   (6.53)

where δ = (δ₂, ..., δ_p)′, M = ∫_{G_T} q(tr gg′) ν(dg), y = (√r₂, ..., √r_p)′, and C is the matrix defined below.
Since the (p − 1) × (p − 1) matrix

C = (1 − ρ²)^{−1} (I − δδ′)

is positive definite, there exists a (p − 1) × (p − 1) lower triangular nonsingular matrix T such that TCT′ = I. Define γ_i, a_i, 2 ≤ i ≤ p, by

γ₁ = 1,  γ_i = 1 − Σ_{j=2}^{i} δ_j²,  a_i = δ_i γ_p^{1/2} / (γ_i γ_{i−1})^{1/2}.

Obviously γ_p = 1 − ρ². Let a = (a₂, ..., a_p)′; then a = Tδ. Since Cδ = δ (so also C^{−1}δ = δ) and TCT′ = I, we get

a = Tδ = TCδ = (T′)^{−1}δ,
a′a = δ′(T′T)δ = δ′C^{−1}δ = δ′δ = ρ²,

which agrees with δ′Cδ = (1 − ρ²)^{−1}{δ′δ − (δ′δ)(δ′δ)} = ρ². Since

|C| = (1 − ρ²)^{−(p−2)},

we can write (6.53) as

(1 − λ)^{n/2} M R = ∫_{G_T} q( tr( g_(11)² − 2 g_(11) a y′ g′_(22) + g_(22) g′_(22) ) ) ν(dg).
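The identities |C| = (1 − ρ²)^{−(p−2)} and a′a = ρ² can be confirmed numerically. In the sketch below (ours), T is obtained from the Cholesky factor of C, so that TCT′ = I:

```python
import numpy as np

rng = np.random.default_rng(2)
pm1 = 4                                   # p - 1
delta = rng.uniform(0, 0.4, pm1)          # components delta_2, ..., delta_p
rho_sq = float(delta @ delta)             # rho^2 = delta'delta (kept < 1 here)

C = (np.eye(pm1) - np.outer(delta, delta)) / (1.0 - rho_sq)
# TCT' = I for T = L^{-1}, where C = LL' is the lower triangular Cholesky factor
T = np.linalg.inv(np.linalg.cholesky(C))
a = T @ delta

assert abs(np.linalg.det(C) - (1 - rho_sq) ** (-(pm1 - 1))) < 1e-8  # |C|
assert abs(a @ a - rho_sq) < 1e-8                                   # a'a = rho^2
```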
Expanding the integrand in (6.53) as

q(tr gg′) + q^{(1)}(tr gg′) ( −2 tr( g_(11) a y′ g′_(22) ) ) + ½ q^{(2)}(z) ( −2 tr( g_(11) a y′ g′_(22) ) )²,   (6.54)

where z = tr gg′ + (1 − θ)( −2 tr g_(11) a y′ g′_(22) ) with 0 < θ < 1.
Since, with g = (g_{ij}),

tr g_(11) a y′ g′_(22) = g_(11) { Σ_{j=2}^{p} a_j g_{jj} r_j^{1/2} + Σ_{i>j} a_i g_{ij} r_j^{1/2} },

and the measure q^{(1)}(tr gg′) ν(dg) is invariant under the change of sign of g to −g, we conclude, as in (6.18), that

∫_{G_T} [ tr g_(11) a y′ g′_(22) ] q^{(1)}(tr gg′) ν(dg) = 0,
∫_{G_T} g_{ij} g_{lh} q^{(1)}(tr gg′) ν(dg) = 0  if (i, j) ≠ (l, h).   (6.55)
Let L = tr gg′ and define the moments

e₁ = M^{−1} ∫_{G_T} g_(11)² q(L) ν(dg),  e_i = M^{−1} ∫_{G_T} g_{ii}² q(L) ν(dg),  i = 2, ..., p,

with analogous moments for the off-diagonal elements g_{ij}, i > j. Now proceeding as in Problem 3 we get

R = 1 + nλ { Σ_{j=2}^{p} η_j ( ⋯ ) + Σ_{i>j} ( ⋯ ) } + B(y, λ, η),   (6.56)

where η_j = δ_j²/λ, j = 2, ..., p, and B(y, λ, η) = o(λ) uniformly in y and η. Furthermore, from Theorem 6.1.1, R does not depend on q. Since {λ = 0} is the single point η = 0, ξ₀ assigns measure 1 to the point η = 0. The set {ρ² = λ} is a convex (p − 1)-dimensional Euclidean set wherein each
component is O(λ). Thus any probability measure ξ_λ can be replaced by the degenerate measure which assigns measure 1 to the mean η*_i, i = 2, ..., p, of ξ_λ. Choosing

η*_j = n(n − p + 1)(n − j + 1)^{−1}(n − j + 2)^{−1}(p − 1)^{−1},  j = 2, ..., p,   (6.57)

we see that (6.9) is satisfied with U = Σ_{j=2}^{p} R_j = R².
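The weights (6.57) sum exactly to one: since (n − j + 1)^{−1}(n − j + 2)^{−1} = (n − j + 1)^{−1} − (n − j + 2)^{−1}, the sum over j = 2, ..., p telescopes to (p − 1)/(n(n − p + 1)), cancelling the leading factor. A quick exact check (ours):

```python
from fractions import Fraction

n, p = 12, 5                                   # arbitrary values with n > p
eta = [Fraction(n * (n - p + 1), (n - j + 1) * (n - j + 2) * (p - 1))
       for j in range(2, p + 1)]               # the weights eta_j* of (6.57)
assert sum(eta) == 1                           # they sum to one exactly
```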
Using Theorem 2.7.1 we get, as in Giri (1988), the corresponding integral representation of f_{R²}(r² | H₁)/f_{R²}(r² | H₀) over the group G defined in (6.48) with p₂ = 0; it involves the factor (1 − ρ²)^{n/2}, the matrix (1 − ρ²)(I − δδ′), the constant M = ∫_G q(tr gg′) μ(dg), and a vector u = (u₁, ..., u_{p−1})′ satisfying u′u = r².
If ρ² is small, the power against the alternative H₁ of the test which rejects H₀ whenever R² > C_α is given by α + h(ρ²) + o(ρ²), where h(ρ²) > 0 for ρ² > 0 and h(ρ²) = o(1). The null robustness of the test follows from the fact that T(X) = R² satisfies T(X) = T(Xh) for every h ∈ G_T (Kariya (1981b)). Hence we get the following theorem:
Theorem 6.2.3. For testing H₀ against H₁ : ρ² = λ, the level-α test which rejects H₀ whenever R² > C_α is locally minimax as λ → 0.
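The null robustness can be seen directly in simulation: multiplying a Gaussian sample by a random positive radius produces a heavier-tailed member of the elliptically symmetric family and is a transformation X → Xh with h = sI ∈ G_T, so R² is unchanged realization by realization. A sketch (ours):

```python
import numpy as np

rng = np.random.default_rng(4)
n, p = 30, 4

def r_sq(X):
    S = X.T @ X                               # mu = 0 formulation: S = sum_i X_i X_i'
    s12 = S[0:1, 1:]
    return (s12 @ np.linalg.solve(S[1:, 1:], s12.T)).item() / S[0, 0]

Z = rng.standard_normal((n, p))               # Gaussian member of the family (6.2)
s = 1.0 / np.sqrt(rng.gamma(3.0, 1.0 / 3.0))  # random radius: heavier-tailed member
assert abs(r_sq(Z) - r_sq(s * Z)) < 1e-9      # same R^2, hence same null distribution
```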
Problem 5. Here η_j = 0, j = p₁ + 2, ..., p. For testing H₀ against H′ : ρ₁² = λ we can write, using (6.55),

R = 1 + nλ { Σ_{j=2}^{p₁+1} η_j ( ⋯ ) + Σ_{i>j} ( ⋯ ) } + B(r, λ, η).   (6.58)
Now letting ξ₀ give measure one to η = 0 and ξ_λ give measure one to the means η*₂, ..., η*_{p₁+1}, where

η*_j = n(n − p + 1)(n − j + 1)^{−1}(n − j + 2)^{−1} p₁^{−1},  j = 2, ..., p₁ + 1,   (6.59)

we conclude that (6.9) is satisfied with U = R₁² = Σ_{j=2}^{p₁+1} R_j, and (6.8) follows as in Theorem 6.2.3.

Theorem 6.2.4. For testing H₀ against H′ : ρ₁² = λ, the level-α test which rejects H₀ whenever R₁² > C_α, where the constant C_α depends on the level α of the test, is locally minimax as λ → 0.

Problem 6. Here η_j = 0, j = 2, ..., p₁ + 1. For testing H₀ against H₃ : ρ₂² = λ we can write, as λ → 0,

R = 1 + nλ { Σ_{j=p₁+2}^{p} η_j ( ⋯ ) } + B(r, λ, η),   (6.60)
where B(r, λ, η) = o(λ) uniformly in r and η. Let us now consider a rejection region of the form

R_K = { y : U(y) = R₁² + K R₂² > c_α },   (6.61)

where K is chosen such that (6.61) is reduced to yield (6.9), and c_α depends on the level α of the test for the chosen K. Let ξ₀ give measure one to η = 0 and ξ_λ give measure one to the single point (0, ..., 0, η*_{p₁+2}, ..., η*_p), where

η*_j = n(n − p + 1)(n − j + 1)^{−1}(n − j + 2)^{−1}(p − 1)^{−1},  j = p₁ + 2, ..., p,

and let

R = { y : U(y) = R₁² + ((n − p)/(n − p₁)) R₂² > C_α },
with p₁ + p₂ = p − 1. Observe that the rejection region R, with P_{ξ₀}(R) = α, satisfies (6.9). Furthermore, for the invariant region R, P_{λ,ξ_λ}(R) depends only on λ; hence, from (6.8), r(λ, R) = 0. Since R is locally best invariant as λ → 0, the test which rejects H₀ whenever R² > C does not coincide with R and hence is locally worse. From Problem 4, the power of the test which rejects H₀ whenever R² > C depends only on λ and has positive derivative at λ = 0. Thus from (6.8) with R* = R we conclude that h(λ) > 0. Hence we have

Theorem 6.2.5. For testing H₀ against H₃′ : ρ₂² = λ > 0, the test with critical region R is locally minimax as λ → 0.
Exercises

1. Let X = (X₁, ..., X_p)′ be a random vector with probability density function (6.1). Show that E(X) = μ and cov(X) = αΣ, with α = p^{−1} E{(X − μ)′ Σ^{−1} (X − μ)}.

2. Let X be an n × p random matrix with probability density function (6.2). Find the maximum likelihood estimators of μ and Σ.

References

1. N. Giri, Locally minimax tests for multiple correlations, Canad. J. Statist., pp. 53–60 (1979).
2. N. Giri, Locally minimax tests in symmetrical distributions, Ann. Inst. Stat. Math., pp. 381–394 (1988).
3. N. Giri, Multivariate Statistical Analysis, Marcel Dekker, N.Y. (1996).
4. T. Kariya and M. Eaton, Robust tests of spherical symmetry, Ann. Statist., pp. 206–215 (1977).
5. T. Kariya, A robustness property of Hotelling's T²-problem, Ann. Statist., pp. 208–215 (1981).
6. T. Kariya, Robustness of multivariate tests, Ann. Statist., pp. 1267–1275 (1981b).
7. T. Kariya and B. Sinha, Non-null and optimality robustness of some tests, Ann. Statist., pp. 1182–1197 (1985).
8. T. Kariya and B. Sinha, Robustness of Statistical Tests, Academic Press, N.Y. (1988).
Chapter 7
TYPE D AND E REGIONS

The notion of a type D or E region is due to Isaacson (1951). Kiefer (1958) showed that the usual F-test of the univariate general linear hypothesis possesses this property. Lehmann (1959) showed that, in finding type D regions, invariance could be invoked in the manner of the Hunt-Stein theorem; and this could also be done for type E regions (if they exist) provided that one works with a group which operates as the identity on the nuisance parameter set H of the testing problem.

Suppose, for a parameter set Ω = {(θ, η) : θ ∈ Θ, η ∈ H} with associated distributions, with Θ a Euclidean set, that every test function φ has a power function β_φ(θ, η) which, for each η, is twice continuously differentiable in the components of θ at θ = 0, an interior point of Θ. Let Q_α be the class of locally strictly unbiased level-α tests of H₀ : θ = 0 against H₁ : θ ≠ 0. Our assumption on β_φ implies that all tests in Q_α are similar and that ∂β_φ/∂θ_i |_{θ=0} = 0 for all φ in Q_α. Let Δ_φ(η) be the determinant of the matrix B_φ(η) of second derivatives of β_φ(θ, η) with respect to the components of θ (the Gaussian curvature) at θ = 0. Suppose the parametrization is such that Δ_{φ′}(η) > 0 for all η for at least one φ′ in Q_α.

Definition 7.1. Type E test. A test φ′ is said to be of type E if φ′ ∈ Q_α and Δ_{φ′}(η) = max_{φ ∈ Q_α} Δ_φ(η) for all η.

Definition 7.2. Type D test. A type E test φ′ is said to be of type D if the nuisance parameter set H is a single point.

In the problems treated earlier in Chapter 4 it seems doubtful that type E regions exist (in terms of Lehmann's development, H is not left invariant by many transformations). We introduce here two possible optimality criteria in the same spirit as the type D and E criteria, which will always be fulfilled by some test under minimum regularity assumptions. Let

Δ(η) = max_{φ ∈ Q_α} Δ_φ(η).   (7.1)

Definition 7.3. Type D_A test. A test φ′ is of type D_A if

max_η [ Δ(η) − Δ_{φ′}(η) ] = min_{φ ∈ Q_α} max_η [ Δ(η) − Δ_φ(η) ].   (7.2)

Definition 7.4. Type D_M test. A test φ″ is of type D_M if

max_η [ Δ(η)/Δ_{φ″}(η) ] = min_{φ ∈ Q_α} max_η [ Δ(η)/Δ_φ(η) ].   (7.3)
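To make the two criteria concrete, here is a toy computation (entirely ours, with made-up curvature functions Δ_φ(η) on a finite nuisance grid):

```python
import numpy as np

etas = np.linspace(0.5, 2.0, 4)                       # a toy nuisance grid H
# hypothetical Gaussian curvatures Delta_phi(eta) for three candidate tests:
deltas = {"phi1": 1.0 + 0.2 * etas,
          "phi2": 1.2 - 0.1 * etas,
          "phi3": np.full_like(etas, 1.1)}

envelope = np.max(np.vstack(list(deltas.values())), axis=0)   # Delta(eta), (7.1)

add_regret = {k: np.max(envelope - d) for k, d in deltas.items()}    # type D_A
mult_regret = {k: np.max(envelope / d) for k, d in deltas.items()}   # type D_M

print(min(add_regret, key=add_regret.get), min(mult_regret, key=mult_regret.get))
```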
These two criteria resemble stringency and regret criteria employed elsewhere in statistics; the subscripts "A" and "M" stand for the "additive" and "multiplicative" regret principles. The possession of these properties is invariant under the product of any transformations on Θ (acting trivially on H) of the same general type as those for which type D regions retain their property, and an arbitrary one-to-one transformation on H (acting trivially on Θ), but, of course, not under more general transformations on Ω. Obviously, a type E test automatically satisfies these weaker criteria.

Let us now assume that the testing problem is invariant under a group G of transformations for which the Hunt-Stein theorem holds and which acts trivially on Θ; that is,

g(θ, η) = (θ, gη),  g ∈ G.   (7.4)

If φg is the test function defined by φg(x) = φ(gx), then a trivial computation shows that

β_{φg}(θ, η) = β_φ(θ, gη).   (7.5)

It is easy to verify that Δ_{φg}(η) = Δ_φ(gη), which implies Δ(η) = Δ(gη). Furthermore, if φ is better than φ′ in the sense of either of the above criteria, then φg is better than φ′g. All of the requirements of Lehmann (1959) are easily seen to be satisfied, so that we can conclude that there is an almost invariant (hence, in our problems, invariant) test which is of type D_A or D_M. This differs from the way in which invariance is used on page 883 of Lehmann (1959), where it is used to reduce Q_α rather than H as here. If the group G is transitive on H, then Δ(η) is constant, as is Δ_φ(η) for an invariant φ; thus, if φ* maximizes Δ_φ over all invariant tests, φ* is of type D_A or D_M among all tests of level α. To verify these optimality properties we need the following lemma.

Lemma 7.1. Let L be a class of non-negative definite symmetric matrices of order m and suppose J is a fixed nonsingular member of L. If tr J^{−1}B is maximized (over B in L) by B = J, then det B is also maximized by J. Conversely, if L is convex and J maximizes det B, then tr J^{−1}B is maximized by B = J.

Proof. Write J^{−1}B = A. If A = I maximizes tr A, we get (det A)^{1/m} ≤ tr A/m ≤ 1 = det I, so that A = I also maximizes det A. Conversely, if J maximizes det A, then B = J maximizes tr A; for if tr A > tr I for some B in L, then det(aA + (1 − a)I) > 1 for a small and positive, contradicting the maximality of the determinant at J.
Remarks. The usefulness of this lemma lies in the fact that the generalized Neyman-Pearson lemma allows us to maximize tr Q B_φ, for fixed Q, more easily than to maximize Δ_φ = det B_φ among similar level-α tests. We can find, for each Q, a φ_Q which maximizes tr Q B_φ; a φ* which maximizes Δ_φ is then obtained by finding a φ_Q for which B_{φ_Q} = Q^{−1}.
In problems of the type which we consider here, the reduction by invariance under a group G which is transitive on H often results in a reduced problem wherein the maximal invariant is a vector R = (R₁, ..., R_m)′ whose distribution depends only on Δ = (δ₁, ..., δ_m)′, where δ_i = θ_i², θ = (θ₁, ..., θ_m)′, and such that the density f_Δ of R with respect to a σ-finite measure μ is of the form

f_Δ(r) = f₀(r) { 1 + Σ_{i=1}^{m} δ_i ( h_i + Σ_j a_{ij} r_j ) + Q(r, Δ) },   (7.6)

where the h_i and a_{ij} are constants and Q(r, Δ) = o(Σ_i δ_i) as Δ → 0, and we can differentiate under the integral sign to obtain, for an invariant test φ,

β_φ(θ, η) = α + Σ_i δ_i { h_i α + Σ_j a_{ij} ∫ r_j φ(r) f₀(r) μ(dr) } + o(Σ_i δ_i)   (7.7)

as Δ → 0. According to Giri and Kiefer (1964) this is called the symmetric reduced regular case (SRR).
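Differentiating (7.7) twice with respect to θ_i shows that the ith diagonal entry of B_φ is 2{h_i α + E₀(Σ_j a_{ij} R_j φ(R))}, which can be estimated by Monte Carlo. The sketch below is ours; the choices of f₀, h_i, a_{ij} and the test φ are artificial stand-ins:

```python
import numpy as np

rng = np.random.default_rng(6)
m = 3
h = -np.ones(m)                                   # hypothetical constants h_i
a = np.triu(np.ones((m, m)))                      # hypothetical constants a_ij

R = rng.exponential(1.0, size=(100_000, m))       # stand-in draws from f_0
phi = (R.sum(axis=1) > np.quantile(R.sum(axis=1), 0.95)).astype(float)
alpha = phi.mean()                                # level of the test under f_0

diag_B = 2 * (h * alpha + (R @ a.T * phi[:, None]).mean(axis=0))
print(diag_B)                                     # estimated diagonal of B_phi
```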
Theorem 7.1. In the SRR case, an invariant test φ* of level α is of type D_A (or D_M) among invariant tests if and only if

φ* = 1 precisely when Σ_i q_i Σ_j a_{ij} r_j > c,   (7.8)

where c is a constant and

q_i^{−1} = const { h_i α + E₀( Σ_j a_{ij} R_j φ*(R) ) }.

Proof. In the SRR case, every invariant φ has a diagonal B_φ whose ith diagonal entry, by (7.7), is 2{ h_i α + E₀( Σ_j a_{ij} R_j φ(R) ) }. By the generalized Neyman-Pearson lemma, tr Q B_φ is maximized over such φ by a φ* of the form (7.8). Hence, by Lemma 7.1, we get the theorem.
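The characterization in (7.8) is a fixed point: the weights q_i are defined through expectations taken under the test that they themselves determine. A crude iterative search (ours, reusing the artificial stand-ins for f₀, h_i and a_{ij} from above) illustrates one way to approximate such a φ*:

```python
import numpy as np

rng = np.random.default_rng(7)
m, alpha = 3, 0.05
h = -np.ones(m)
a = np.triu(np.ones((m, m)))                       # hypothetical SRR constants
R = rng.exponential(1.0, size=(100_000, m))        # stand-in draws from f_0
lin = R @ a.T                                      # lin[k, i] = sum_j a_ij R_kj

q = np.ones(m)
for _ in range(50):                                # fixed-point iteration for q
    stat = lin @ q                                 # sum_i q_i sum_j a_ij r_j
    c = np.quantile(stat, 1 - alpha)               # cut-off keeping level alpha
    phi = (stat > c).astype(float)
    q_new = 1.0 / np.abs(h * alpha + (lin * phi[:, None]).mean(axis=0))
    q_new /= q_new.sum()                           # the constant is immaterial
    if np.allclose(q_new, q, atol=1e-6):
        break
    q = q_new
print(q, c)
```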
Example 7.1. (T²-test.) Consider once again Problem 1 of Chapter 4. Let θ = N^{1/2} F^{−1} μ, where F is the unique member of G_T with positive diagonal elements satisfying F F′ = Σ. Then Θ is Euclidean p-space, G_T operates transitively on H = {positive definite Σ} but trivially on Θ, and we have the SRR case. We thus have (7.6) with h_i = −1 and

a_{ij} = { 1, if i > j;  0, if i < j;  N − j + 1, if i = j }.   (7.9)

Hotelling's T² test has a power function which depends only on Σ_i δ_i, so that, with the above parametrization for θ, B_{T²} is a multiple of the identity. Hotelling's critical region is of the form Σ_j r_j > c. But, when all the q_i are equal, the critical region corresponding to (7.8) is of the form Σ_j (N + p − 1 − 2j) r_j > c, which is not Hotelling's region if p > 1. Hence we get the following theorem.
Theorem 7.2. For 0 < α < 1 < p < N, Hotelling's T² test is not of type D among G_T-invariant tests, and hence is not of type D_A or D_M (nor of type E) among all tests.
Remarks. The actual computation of a φ* of type D appears difficult, due to the fact that we need to evaluate an integral of the form (7.10) for h = 0 or 1 for various choices of the c_i and c. When α is close to zero or 1, we can carry out approximate computations as follows. As α → 1 the complement R̄ of the critical region becomes a simplex with one corner at 0. When p = 2, write ρ = 1 − α and consider critical regions of level α of the form b y₁ + y₂ > c, where 0 < L^{−1} < b < L, with L fixed but large (this keeps R̄ close to the origin). From (7.10) we get

ρ = E₀{1 − φ(R)} = (N − 2) c² / 2b^{1/2} + o(c²)  as c → 0,

and similarly E₀{(1 − φ(R)) R₁} = (N − 2) c³ b^{−3/2}/8 + o(c³), while E(R_i) = 1/N. From (5.18) we obtain, for the power near H₀, an expansion in δ₁ and δ₂ whose coefficients depend on b and ρ, the o(ρ) and o(Σ δ_i) terms being uniform in Δ and ρ, respectively. The product Δ_φ of the coefficients of δ₁ and δ₂ is maximized when b = (N + 1)/(N − 1) + o(1) as ρ → 0; with more care, one can obtain further terms in an expansion in ρ for the type D choice of b. The argument is completed by showing that b < L^{−1} implies that R̄ lies in a strip so close to the r₁-axis as to make E₀{(1 − φ(R)) R₂} too small to yield a φ as good as that with b = (N + 1)/(N − 1), with a similar argument if b > L. When ρ is very close to 0, all choices of b > 0 give substantially the same power, α + (δ₁ + δ₂)/2 + o(ρ²), so that the relative departure from being type D of Hotelling's T² test, or of any other critical region of the form b R₁ + R₂ > c with b fixed and positive, approaches 0 as α → 1. However, we do not know how great the departure of Δ_{T²} from Δ can be for arbitrary α. We can similarly treat the case p > 2 and also the case α → 0. One can treat similarly other problems considered in Chapter 4.

References

1. N. Giri and J. Kiefer, Local and Asymptotic Minimax Properties of Multivariate Tests, Ann. Math. Statist. 35, 21–35 (1964).
2. S. L. Isaacson, On the Theory of Unbiased Tests of Simple Statistical Hypothesis Specifying the Values of Two or More Parameters, Ann. Math. Statist. 22, 217–234 (1951).
3. J. Kiefer, On the Nonrandomized Optimality and Randomized Nonoptimality of Symmetrical Designs, Ann. Math. Statist. 29, 675–699 (1958).
4. E. L. Lehmann, Optimum Invariant Tests, Ann. Math. Statist. 30, 881–884 (1959).