Sampling Theory and Methods
S. Sampatb
CRC Press Boca Raton London
New York Washington, D.C.
...
889 downloads
4479 Views
4MB Size
Report
This content was uploaded by our users and we assume good faith they have the permission to share this book. If you own the copyright to this book and it is wrongfully on our website, we offer a simple DMCA procedure to remove your content from our site. Start by pressing the button below!
Report copyright / DMCA form
Sampling Theory and Methods
S. Sampatb
CRC Press Boca Raton London
New York Washington, D.C.
Mumbai Calcutta
S. Sampath Department of Statistics Loyola College. ChennaJ-600 034. India
Library of Congress Cataloging-in-Publication Data:
A catalog record for this book is available from the Library of Congress.
All rights reserved. No part of this publication may be reproduced. stored in a r~trieval system or transmitted in any form or by any means, electronic, mechanical. photocopying. or otherwise, without the prior permission of the copyright owner.
This book contains information obtained from authentic and highly regarded sources. Reprinted material is quoted with permission, and sources are indicated. Reasonable efforts have been made to publish reliable data and information. but the author and the publisher cannot assume responsibility for the validity of all materials or for the consequences of their use.
Neither this book nor any part may be reproduced or transmitted in any form or by any means. electronic or mechanical, including photocopying, microfilming, and recording, or by any information storage or retrieval system, without prior permission in writing from the publisher.
Exclusive distribution in North America only by CRC Press LLC
Direct all inquiries to CRC Press LLC, 2000 N.W. Corporate Blvd., Boca Raton, Florida 33431. E-mail: orders @crcpress.com
Copyright@ 2001 Narosa Publishing House, New Delhi-110 017, India
No claim to original U.S. Government works International Standard Book Number 0-8493-0980-8 Printed in India.
Dedicated to my parents
Preface This book is an outcome of nearly two decades of my teaching experience both at the gmduate and postgraduate level in Loyola College (Autonomous), Chennai 600 034, during which I came across numerous books and research articles on "Sample Surveys". I have made an attempt to present the theoretical aspects of "Sample Surveys" in a lucid fonn for the benefit of both undergraduate and post graduate students of Statistics. The first chapter of the book introduces to the reader basic concepts of Sampling Theory which are essential to understand the later chapters. Some numerical examples are also presented to help the readers to have clear understanding of the concepts. Simple random sampling design is dealt with in detail in the second chapter. Several solved examples which consider various competing estimators for the population total are also included in the same chapter. The third is devoted to systematic sampling schemes. Various systematic sampling schemes like, linear, circular, balanced. modified systematic sampling and their performances under different superpapulation models are alSo discussed. In the fourth chapter several unequal probability sampling-estimating strategies are presented. Probability Proportional to Size Sampling With and Without Replacement are considered with appropriate estimators. In addition to them Midzuno sampling scheme and Random group Method are also included. presented with Stratified sampling, allocation problems and related issues full details in the fifth chapter. Many interesting solved problems are.also added. In the sixth and seventh chapters the use of auxiliary information in ratio and regression estimation are discussed. Results related to the properties of ratio and regre~ion estimators under super-population models are also given. Cluster sampling and Multistage sampling are presented in the eighth chapter. The results presented in under two stage sampling are general in nature. In the ninth chapter, non-sampling errors, randomised response techniques and related topics are discussed. Some recent developments in Sainple surveys namely, Estimation of distribution functions, Adaptive sampling schemes, Randomised response methods for quantitative data are presented in the tenth chapter.
are
Many solved theoretical problems are incorporated into almost all the chapters which will help the readers acquire necessary skills to solve problems of theoretical nature on their own. I am indebted to the authorities of Loyola College for providing me the necessary faciliti~s to successfully complete this work. I also wish to thank Dr.P.Chandrasekar. Department of Statistics, Loyola College, for his help during proof correcti~n. I wish to place on record the excellent work done by the Production Department of Narosa Publishing House in fonnatting the 1nanuscript
S.Sampath
Contents
Chapter 1 1.1 1.2 1.3
Chapter 2 2.1 2.2 2.3
Chapter3 3.1 3.2 3.3 3.4 3.5 3.6 3.7 3.8
Chapter4 4.1 4.2 4.3 4.4 4.5 4.6
ChapterS 5.1 5.2 5.3 5.4
Chapter6 6.1 6.2 6.3 6.4 6.5 6.6 6.7
Preliminaries Basic Definitions Estimation of Population Total Problems and Solutions
1 3 8
Equal Probability Sampling Simple Random Sampling Estimation of Total Problems and Solutions
10 11
16
Systematic Sampling Schemes Introduction Linear Systematic Sampling Schemes for Populations with Linear Trend Autocorrelated Populations Estimation of Variance Circular Systematic Sampling Systematic Sampling in Two Dimensions Problems and Solutions
29 29 34 39 42 43 44 47
Unequal Probability Sampling PPSWR Sampling Method PPSWOR Sampling Method Random Group Method Midzuno scheme PPS Systematic Scheme Problems and Solutions
55 60 63 67 70 71
Stratified Sampling Introduction Sample Size Allocation Comparision with Other Schemes Problems and Solutions
76 79 86 89
Use of Auxiliary Information Introduction Ratio Estimation Unbiased Ratio Type Estimators Almost Unbiased Ratio Estimators Jackknife Ratio Estimator Bound for Bias Product Estimation
97 97 100 102 104 105 106
x
Contents
6.8 6.9 6.10 6.11
Chapter 7 7.1 12
7.3 7.4 7.5 7.6
Chapter 8 8.1 8.2 8.3
Chapter 9 9.1 9.2 9.3
Chapter 10 10.1 10.2 10.3
Two Phase Sampling Use of Multi-auxiliary Information Ratio Estimation in Stratified Sampling Problems and Solutions
108 113 115 117
Regression Estimation Introduction Difference Estimation Double Sampling in Difference Estimation Multivariate Difference Estimator Inference under Super Population Models Problems and Solutions
122 124 125 126 129 137
Multistage Sampling Introduction Estimation under Cluster Sampling Multistage Sampling
140 140 143
Non-sampling Errors Incomplete Surveys Randomised Response Methods Observational Errors
152 158 161
Recent Developments Adaptive Sampling Estimation of Distribution Functions Randomised Response Methods for Quantitative Data
165 171 174
References
179
Index
183
Chapter 1
Preliminaries 1.1 Basic Definitions Definition 1.1 "Finite Population" A finite population is nothing but a set containing finite number of distinguishable elements. The elements of a finite population will be entities possessing panicular characteristics in which a sampler would be interested and they will be referred to as population units. For example, in an agricultural study where one is interested in finding the total yield, a collection of fields or a collection of plots may be defined as population. In a socio-economic study, population units may be defined as a group of individuals, streets or villages. Definition 1.2 "Population Size" The number of elements in a finite population is called population size. Usually it is denoted by Nand it is always a known finite number. With each unit in a population of size. N, a number from 1 through N is assigned. These numbers are called labels of the units and they remain unchanged throughout the study. The values of the population units with respect to the characteristic y under study will be denoted by Y1 , Y2 , ... , YN. Here Y; denotes the value of the unit bearing label i with respect to the variable y. Defmition 1.3 "Parameter" Any real valued function of the population values is called parameter. 1 N 1 N For example, the population mean Y = S2 = Y]2 and
L,r; , --IIli -
N i=l
N-1 i=l
population range R = Max {X; }- Min {X; } are parameters.
Definition 1.4 "Sample" A sample is nothing but a subset of the population S. Usually it is denoted by s. The number of elements in a sample s is denoted by n(s) and it is referred to as sample size. Definition 1.5 "ProbabHity SampUng" Choosing a subset o( the population according to a probability sampling design is called probability sampling.
2
Sampling Theory and Methods
Generally a sample is drawn to estimate the parameters whose values are not known.
Definition 1.6 "Statistic" Any· real valued function is called statistic, if it depends on Yt, Y2, .... YN only through s. A statistic when used to estimate a parameter is referred to as estimator.
·Definition 1. 7 "Sampling Design" Let .Q be the collection of all subsets of S and P(s) be a probability distribution defined on .Q. The probability distribution {P(s),se .Q} is called sampling design. A sampling design assigns probability of selecting a subset s as sample. For example, let .Q be the collection of all (:] possible subsets of size n of the populationS. The probability distribution P(s)=
jCf 0
if n(s) = n otherwise
is a sampling design. This design assigns probabilities
(Nl-1 II
for all subsets of
I
size n for being selected as sample and zero for all other subsets of S. It is pertinent to note that the definition of sample as a subset of S does not allow repetition. of units in the sample more than once. That is, the sample will always contain distinct units. Alternatively one can also define a sequence whose elements are members of S as a sample, in which case the sample will not necessarily contain distinct units.
Definition 1.8 "Bias" Let P<s) .be a sampling design defined on .Q. An estimator T(s) is unbiased for the parameter 8 with respect to the sampling design P(s) if
Ep[T(s)] =
L T(s)P(s) =8.
seD
The difference Ep[T(s)]-8 is called the bias of T(s) in estimating 8 with respect to the design P(s). It is to be noted that an estimator which is unbiased with respect to a sampling design P(s) is not necessarily unbiased with respect to some other design Q(s).
Definition 1.9 "Mean Square Error" Mean square error of the estimator T(s) in estimating 8 with respect to the design P(s) is defined as MSE
= L[T(s) -8] 2 P(s) seD
Preliminaries
3
If Ep[T(s)] =8 then the mean square error reduces to variance. Given a parameter8. one. can propose a number of estimators. For example, to estimate the population mean one can use either sample mean or sample median or any other reasonable sample quantity. Hence one requires some criteria to choose an estimator. In sample surveys, we use either the bias or the mean square error or both of them to evaluate the performance of an estimator. Since the bias gives the weighted average of the difference between the estimator and parameter and the mean square error gives weighted squared difference of the estimator and the parameter. it is always better to choose an estimator which has smaller bias (if possible unbiased) and lesser mean square error. The following theorem gives the relationship between the bias and mean square error of an estimator. ~
Theorem 1.1 Under the sampling design P(s), any statistic T(s) satisfies the "'
....
....
~
....
....
relation MSE( P : T) = V p (T) + [ B p (T)]- where Vp (T) and B p (T) are variance and bias ofthe statistic T(s) under the sampling design P(s). ~
~
.,
Proof MSE(T: P) = E p[T(s) -8]-
= ~)f<s) -8] 2 P(s) .~.Q
= ~)f(s)- E p (T(s)) + E p (T(s))- 8] 2 P(s) .~.Q
= ~)f<s)-Ep(T(s))] 2 P(s)+[Ep(T(s))-81 2 se.Q
Hence the proof. • As mentioned earlier, the performance of an estimator is evaluated on the basis of its bias and mean square error of the estimator. Another way to assess the performance of a sampling design is the use of its entropy. Definition 1.10 "Entropy" Entropy of the sampling design P(s) is defined as, e =- LP(s)lnP(s) .~.Q
Since the entropy is a measure of information corresponding to the given sampling design, we prefer a sampling design having maximum entropy.
1.2 Estimation of Population Total In order to introduce the most popular Horvitz-Thompson estimator for the population total we give the following definitions.
4
Sampling Theory and Methods
Definition 1.11 "Inclusion indicators" Let s 3 i denote the event that the sample s contains the unit i . The random variables I if s 1 i. IS. iS. N l·(s)= { · ' 0 otherwise are called inclusion indicators. Definition 1.12 "Inclusion Probabilities" The first and second order inclusion probabilities corresponding to the sampling design P(s) are defined as TC;
=L P(s). rcij = L P(s) ni.j
s3i
where the sum
L
extends over all s containing i and the sum
L
extends
ni.j
Hi
over all s containing both i and j.
Theorem 1.2 For any sampling design PCs). (a)£p[/i(s)]=rc;.i=l.2..... N (b) E P [I; (s) I J (s)] = rc iJ , i, j
= l. 2..... N
Proof (a) Let .Q 1 be the collection of all subsets of S containing the unit with label i and !22 = .Q -.Qt. Ep[l;(s)]= L
l;(s)P(s)+ Ll;(s)P(s)
seD 1
= L
seD,
1P(s)+ L
.teD1
OP(s)
.fE.CJ,
= LP(s) ni
= TC; (b) Let .Q1 be the collection of all subsets of S containing the units with labels ; and} and .Q 2 = .Q -!2 1.
Notethat. l;(s)I 1·(J)= {
1 ifse.Q 1 .
0 otherw1se
Therefore Ep[/;(s)lj(s)]= L
l;(s)J 1 (s)P(s)+ Ll;(s)J 1 (s)P(s)
seD 1
.fED2
= LP(s) seD 1
= LP(s) Hi.j
= ";J
Hence the proof. • N
Theorem 1.3 For any sampling design P(s). E p[n(s)] =
L"; i=l
Preliminaries
5
Proof For any sampling design. we know that. N
n(s) =
L I;
(s)
i=l
Taking expectation on both sides, we get N
Ep[n(s)]= L,Ep[l;(s)] i=l N
= :Llr; i=l
Hence the proof. •
Theorem 1.4 (a) For i =I. 2•...• N. V p[/; (s)] = lr; (1-Jr;). (b) Fori. j =I, 2..... N.cov p[/; (s).l j (s)] = lr;j -Jr;lr j Proof of this theorem is straight forward and hence left as an exercise.
Theorem 1.5 Under any sampling design. satisfying P[n(s) = n] =I for all s. N
(a) n= ~Jr · ~ 1
N
(b) £.. ~[Jr·Jr· -Jr IJ.. ]=Jr·(l-Jr·) I J I I
i=J
)~I
Proof (a) Proof of this part follows from Theorem 10.3 N
(b) Since for every s. P[n(s) = n] =I, we have
LI
j
(s) = n -I; (s)
Hence by Theorem 1.4, we write Jr;(l-Jr;) =Vp[/;(s)] =cov p[l;(s),l;(s)] N
=cov p[l;(s),n-
L
I 1 (s)]
j~l
N
=-l:cov p[l;(s),l j\S)] j~i
N
= I[lr;Jr j -Jrij] Hence the proof. • Using the first order inclusion probabilities, Horvitz and Thompson ( 1952) constructed an unbiased estimator for the population total. Their estimator for the population total is YHT = ~ Y; . The following theorem proves that the £.. Jr· .. ies
1
6
Sampling Theon and Methods
above estimator is unbiased for the population total and also it g1ves the variance.
Theorem 1.6 The Horvitz-Thompson estimator
yHT
=
L .!!._ is unbiased for IE.~
11 - - +2 LN LN Y-Y [rc -rc;rc i LN Y· 2 [t-rc;] 1C 1 TC·TC. I
1=J
I=J 1=J l<j
I
1
I
1
l
the population total and its variance is I
TC-
_.
Proof The estimator yHT can be written as N
.. YHT
y.1 f;(s) = ""' £..Ji=l 1C;
Taking expectations on both sides we get ..
Ep[YHT]
N ~Y·
=£..J- TC; = y 1
i=l rc 1
Therefore yHT is unbiased for the population total. Consider the difference ..
y.
N
YHT -Y =
=
L -~ i=l
~;
N
y.
2,1
N
l;(s)-
L
Y;
i=l
[/;
(s) -rc; J
i=l 1C;
Squaring both the sides and taking expectations we get
]2
N [ Y; .. Ep[YHT -Y] 2 = ""' £..J -_ i=I rc 1
Ep[/;(s)-rc;] 2 N
N
y.
yj
1 -Ep[/;(s)-rc;][lj(s)-rc ) +22,2,1
i=l j=l 1C j 1C j
i<j
y. yi L -Y- 12J Vp[/;(s)]+2L,L,-covp(/;(s).J 1 (s)) N [
=
N
N
1
i=J
1
TC·
TC·TC· i=J j=J 1 1
I
i<j
[1-rc;]
-_LN y. 2 - - +2 LN LN y.y. I
i=J
1C. I
I
i=J j=J i<j
1
[rcij
-rc;rc 1 ]
1C ·TC . I
1
Hence the proof. •
Remark The variance of Horvitz-Thompson estimator can also be expressed in the follo~ing form
Preliminaries
LL ;i -,/y ]2 N
N [
i=J j=J i<j
[lr;lr j -Jrtj
7
I
)
I
Proof for Remark From the previous theorem. we have .,[1-Jr 1 ] + 2~ ~ Y; Yj V p [Y~HT] =LN Yt -
f f
Jr·
i=J
=
i=J j=J i<j
I
L[y2] N
[Jrij -lr;lr j] Jr·lr . I
N
N
N
)
;2 L[lr;lrj -Jr;j]+ I~Y;Yj 1=1 j=J Jrl j=l
i=J
[Jr .. -Jr·lr.] J
I)Jr·Jrl I
)
j~
j~
J
N N Y; yj I N N [ Y/ r] =- ~ ~ - + - [lr·lr. -Jr .. ]-~ ~--[Jr·Jr. -Jr .. ] 1) j I ~ ~ Jr. Jr I) ) I 2 2~~ . 2
1=J j=J
1=J j=J j:;:.i
Jr j
Jrl
j:;:.l .
=z-II --I N N ( f;
YJ ]
lr;
i=l j=l j:;:.i
N N [ L Y; =L
)
2
[lr;lrj-Jrij]
1r j
y
_ _j_
i=l j=l 1r; i<j
I
]2
[lr;lr j -Jrij 1
1r j
Hence the proof. • The above form of the variance of the Horvitz-Thompson estimator is known as Yates-Grundy form. It helps us to get an unbiased estimator of the variance of Horvitz-Thompson Estimator very easily. Consider any design yielding positive second order inclusion probabilities for all pairs of units in the population. For any such design. an unbiased estimator of the variance given above is
I I[!L-~]2( ies jes i<j
1r
1
Jr·
1
lr;lr J -Jr;j Jr 1..
J
1
The Horvitz-Thompson estimator. its variance and also the estimated variance can be used for any sampling design yielding positive first order inclusion probabilities for all the population units and positive second order inclusion probabilities for all pairs of units. It is pertinent to note that the estimated variance is likely to take negative values. A set of sufficient conditions for the non-negativity of the estimated variance expression given above are. for every pairofunitsiandj. Jr;Jrj -lr;j ~0 (i.j=1,2 ..... N).
8
Sampling Theory and Methods
Note For estimating population total. apart from Horvitz-Thompson estimator several other estimators are also available in literature and some of them are presented in later chapters at appropriate places.
1.3 Problems and Solutions Problem 1.1 Show that a necessary and sufficient condition for unbiased estimation of a finite population total is that the first order inclusion probabilities must be positive for all units in the population. Solution When the first order inclusion probabilities are positive for all units, one can use Horvitz-Thompson estimator as an unbiased estimator for the population total. When the first order inclusion probability is zero for a particular unit say, the unit with label i expected value of any statistic under the given sampling design will be free from Y;, its value with respect to the variable under study. Hence the first order inclusion probability must be positive for all units in the population. • Problem 1.2 Derive cov p ( Y. X) where Y and X are H
..
V p[Z]
(I. I)
N N [ z. z j ]:! ~~ = k.J L rc~ -~ [rc;rc J -rei}] i=J j=J i<j
J
I
N N [ X } ]:! (TC·TC · -TC··) + ~ N N [ y ]2 (TC·TC · -TC··) =~ ~ !..!_ __ ~ !i_ __j_ k.JL TC· TC· J k.JL TC· TC· J I
i=J j=J
I)
I
J
I
i=J j=J
I
~j
kj
LNN[ L -y
+2
i=l j=l i<j
1
1C;
y][. X] i
--·
1C J
-
X·1
1C;
J
---
[!C;TC j -TC;j]
1C j
Comparing ( 1.1) and ( 1.2) we get
..
][X; J]
~~n~[ Y; YJ X cov p(X,Y) = LL -. --. - . - - . [rc;rc J -rciJ] i=J j=J i<j
Hence the solution. •
I)
J
1C I
1C J
1C I
1C J
( 1.2)
Preliminaries
9
Exercises I .I
Let S be a tinite population contammg 5 units. Calculate all first and second order inclusion probabilities under the sampling design 0.2 for s = {2.3,4}, s = {2.5} P(s)= { 0.3 fors={l,3,5},s={l.4} 0
1.2
1.3
otherwise
From a population containing five units with values 4,7,11,17 and 23, four units are drawn with the help of the design 0.20 if n(s) = 5 P(s) = { 0 otherwise Compare the bias and mean square error of sample mean and median in estimating the population mean. List all possible values of n(s) under the sampling design given in Problem N
1.1 and verify the relation E p[n(s)] =
L1r; i=l
1.4 Check the validity of the statement "Under any sampling design the sum of first order inclusion probabilities is always equal to sample size". 1.5 Check the validity of the statement "Horvitz-Thompson estimator can be used under any sampling design to obtain an unbiased estimate for the population total".
Chapter 2
Eq~al
Probability Sampling
2.1 Simple Random Sampling This is one of the simplest and oldest methods of drawing a sample of size n from a population containing N units. Let n be the collection of all 2N subsets of S. The probability sampling design P(sj=
(N)n
1
if n(s) = n
0 otherwise is known as simple random sampling design. In the above design each one of the
fNJ
lll
possible sets of size n is given
equal probability for being selected as sample. The above design can be implemented by following the unit drawing procedure described below: Choose n random numbers from 1 through N without replacement. Units corresponding to the numbers selected as above will constitute the sample. Now we shall prove this sampling mechanism implements the sampling design defined above. Consider an arbitrary subset s of the population S whose members are i" i 2, i 3, ... , i,.. The probability of selecting the units i1. i 2, i3, ... , i,. in the order i 1 -+i2 -+i3 -+ ... -+i,. is 1 1 1 ------NN-IN-2
1
N-(n-1)
Since the number of ways in which these n units can be realized is n!, the probability of obtaining the sets as sample is n!-1 _I_ _!__, 1 N N-1 N-2 N-(n-1)
which reduces on simplification to[:f' Therefore we infer that the sampling. mechanism described above will implement the simple random sampling design.
Equal Probability Sampling
ll
2.2 Estimation of Total The following theorem gives the first and second order inclusion probabilities under simple random sampling.
Theorem 2.1 Under simple random sampling, (a) For i . .
(b) For r, 1 =1, 2, ... , N,
tc;j
=1, 2, ... , N , 7t; =~ N
n(n-1)
=--N(N -1)
Proof By definition
=~P(s) =~(NJ-1
1C;
S)l
Since there are
( Il N~
N
l n-1
n
I
-1
n
S)l
[ N-1~I subsets n-1)
with i as an element, the above sum reduces to
n which is equal to-
N
Again by definition, we have
=~-P(s) =~.(NJ-1 Since there are rN-2] subsets with i and j as elements, we get =(N-2INJ-1 1Cij
.ur.J
ur.J n
n-2
n(n-1)
1tij
n-2
n
N(N -1)
Hence the proof. • The following theorem gives an unbiased estimator for the population total and also its variance.
Theorem 2.2 Under simple random sampling,
Y.m
= N L,Y; is unbiased for
n IES .
. . V[Y. ]--N2(N-n)S.2 . the popu 1auon tota1 and.Its vanance 1s srs Nn y
where
N
s
2 y
1-~[Y· =N-1£.,. I
-fl 2
i=l
Proof We have seen in Chapter 1. for any sampling design A
yfff
~
y.
=£.,.-' ies 1t;
is unbiased for the population total with- variance
(2.1)
12 Sampling Theory and Methods y ]2 :L :L !!_ __L N
N [
1C·
i=l J=l i<j
(1C;1C 1 -rei})
(2.2)
1C·
J
I
n
=N
By Theorem 2.1, we have 7t;
n(n-1)
= N(N _ 1)
and rcij
Substituting these values in (2.1) we notice that
N""'
(2.3)
y liT = - ""-" Y; A
n.IE.S
is unbiased for the population total. Note that nn(n-1) 1C·1C. -1C·· = - -
.,
I
J
N2
I)
N(N-1) ·rt~
~
N [
n N -n N N(N-1)
=
N-.,
(2.4)
]2 (Y; -Y1 ) 2n(N-n)
Therefore by (2.2) V(Yfff) = L L -., i=l J=l n-
N(N -1)
i<j
N N
=
N-n LL(Y;-Yi)2 n(N -1 ) i=l j=l
(2.S)
i<j N
We know that
N
N
N
N
L L,a;1 = L,a;; + 2L L,aij , if a;1 =a i=l j=l
i=l
ji.
i=l j=l i<j
Using the above identity in the right hand side of (2.S), we get
VCYfff)=
=
N-n
2 n(N - 1)
N-n
2n(N -1)
{±±
i=l
{2ffr? -2ffr;r1} . I j= . I
I=
. I . I
I=
j=
= N -n {±NYl-NHj2} n(N -1)
i=l
N 2 = N(N -n) L (Y; - f)2 = N (N -n)
n(N -1)
i=l
Nn
s;
(2.6)
Hence the proof. • The following theorem gives an unbiased estimator for the variance obtained in Theorem 2.1.
Equal Probability Sampling A
A
Theorem 2.3 An unbiased estimator of V(Y_rr.S') is v(Ysr.S') = where
s;
A
A
Nn
.,
s; -
s -~ .
is the sample analogue of
0
N 2 (N-n)
13
2
2
Proof Smce V(Ysr.r) = E(Y.S'r.r)- Y , "2 we have E(Y.S'n)
=V(Y.S'n) + Y 2 = N A
The sample analogue of
2 (N -n)
Nn
s; s; n~
1 L[Y;-
=
is
2
Sy + Y
2
Y] 2
iE.S'
(2.7) ~
1
whereY =- LY; n. IE.S'
2 ~2 } = -1- { ~ ~Y; -nY n-1 iE.S'
=1- {
n-1
~ r? -n[r.S';..S' J} ~ N2 IE.S'
Taking expectations on both sides we get
E<s:)= n~l H~r? 1 [
. .Imp 1.IeS , Th IS
n
]-n&f2J}
= n-1
N~ 1=1
n [ = n-1
N~
=__!!_[ n-1
N-1 s;-N-n s; N · Nn ·
-__!!_[ n-1
n-1 s2\' ] n -
N
1 N
Y-12 -n{N-n S 2 +f 2 } ]<using(2.7)) Nn Y y.2 _ { N- n S 2 + y2 }
Nn
1
N 2 (N -n) S 2v ] Nn ·
Y
J
J
=s2·,.
=[N
(2.8)
2 (N -n)Js2
Nn
y
Hence the proof. •
TMorem 2.4 Let (X;, Y;) be the values with respect to the two variables x and y associated with the unit having label i. i = 1, 2, ...• N . If
X = N LX; . n IE.S' .
14 Sampling Theory and Methods
y = N ~ Y; n ~
N
=-1 -~ (X, -X){Y; -Y), then under simple random
and Sry
N-1~
ie.s
t=:=I
•
A
[N (NNn-n)J S xy · 2
A
sampling, cov(X, Y) = ...
....
Proof
....
....
....
....
cov(X,Y)=E[X-E(X)][Y-E(Y)] A
A
A
A
= E[X Y]- E(X)E(Y)
=
J(:1 &..l )
2
I,x,I,r,]-xr IE.S
IE.S
~
2
~
-- ( N ] E ~I Y· X·I + ~I Y· X J· - XY n .IE.S .I,JE.S . i~j
n n(n-1) ( ]2{-LY;X; + LY;X N . N(N -1) . .
N = n N
N
N
1=1
I~}
N
N
N
j
}
-XY
(2.9)
N
It is well known that LY; L Xi = LY; X; + L LY; X j i=l N
Hence
j=l
i=l N
i~ j
i=l
N
N 2 Y X-~ Y·X·I = ~ ~ Y·X J· ~I ~~I i=l
i=l
(2.10)
i~ j
Substituting (2.10) in (2.9) we get
2 [~]fx;Y; +[ n(n-_n ](N 2f cov(X,Y>=[NJ n N N(N
=[N]2[~][tN
n
=N
2
X- fr;x;]-rx
1)
i=l
n(n-1) ] f X;Y;
N(N-l)
i=l
N
(N -n) _l_L(X; -X)(Y; Nn N -1 i=l
+[
i=l
n(n-1) ](N2f X ]-N2f X
N(N-1)
-Y)
Hence the proof. • 1
A
A
Theorem2.5Undersimplerandomsampling sry =--~(X; -X){Y; -Y) n-1~ IE.S
Equal Probability Sampling 1
N
is unbiased for S , =-~(X -X)(Y· I N - 1 £..J I x.'
-Y) where
1
X = - ' X;
n £..J
15
and
~~
i~
1 ~ Y=-IY;. n. lEI
AA] 1 [ LX;Y; -nXY Proof sx:'. = n-l lEI
=1- ['x.y. £..J I I n- 1
_ _!!_, 1£..J .,£..J x.~y. I
n-
iEI
IE.f
iEI
1 ~ ~ y.x. +£..J£..J _..!_ 'y.x. =-- 'x.y. I I I I . n-1 £..J .1E1 I I n £..J iE1 1:;:. j iEI jE1
=-·-
n- 1
n-1' I I .£. y.x. IZ
. lEI
~ ~ y.x. _..!_ £..J£..J 1 I . IZ . lEI 1EI
;:;:. j
Taking expectations on both sides. we get N
n-1 n ~
1
n(n-1)
E[sxvl = - - ---£..J Y;X;-
n-1
N
n
N N
~~
£..J £..J Y;X ·
nN(N-l)i=lj=l
1=l
1
j-:;:.1
1
1
., __
] N [ N-Yx-'Y·X· N =-'Y·X·~ I I N ~ I I N(N -1) 1=1
.
1=1
]f -[_!!_]v x
=[-
1 1+ Y·X · N N(N-1) i=l I I
=-
1-
[f
Y; X; -
NY X]
N-1
Hence the proof. •
N -1 i=l
A
~
Remark 2.1 If Y
Yr
=....!...!_ N
...
then under simple random sampling Y is unbiased for
the population mean and its variance is N - n S ;-. Nn
16 Sampling Theory and Methods This remark follows from Theorem 2.2.
2.3 Problems and Solutions Probkm 2.1 After the decision to take a simple random sample had been made. it was realised that Y1
the value of unit with label I would be· unusually low
and YN the value of the unit with label N would be unusually high. In such situations, it is decided to use the estimator
..
Y + C if the sample contains YN but not Y1 .:..
Y
A
= Y - C if the sample contains Y1 but not YN
.
Y for all other samples where the constant C is positive and predetermined. Show that the estimator
Y• is unbiased and its variance is ~. N-n S,. 2C V(Y ) = - [ - 2· ---(YN -Yt-nC)]
N
n
N-1
Also prove that V(Y.)
!l 1 ={s I n(s) =n, s contains I but not N}, !l 2 = {s I n(s) = n,s contains N but not 1} and !l3 = !ln -!ll -!22 It is to be noted that the number of subsets m respectively
rN- J.[N2 n-1
2 ] and n-1
[N]_jNl n
!l~o!l2
2 ].
n-1
Under simple random sampling
E(Y.)=
L y• (N)-1
seDn
n
=(N)-l{ I [r +C]+ I [r -c]+ Ir} n
seD1
,\'E D2
sell3
=r:rL~n r+Cr:~~J-~c~~J} =(N)-l I,r =Y n
seOn
(refer the remark 2.1)
and !l3 are
Equal Probability Sampling
17
Y• is unoiased for the population mean. The variance is V(f.)=. 2, [r• -Y ] 2 (NJ-I (by definition)
Therefore the estimator oftheestimator
r•
selln
"
=(NJ-I( L [Y. +c-r ] 2+I [r -c-r] 2+I [r -r ] 2J se~
"
se~
=.(NJ-1[ ~. [A
Y-Y
N-; +rN-; l
Note that
n- (
n-
N
=
~~
]2 +~' [AY-Y ]2 +~' [AY-Y]2
2n(N -n) N(N-1)
. Funher it may be noted that all the
" members of .0 1contain the unit with label 1, (
(N-2·] n-1
N-Jl of them contain the units n-2
with labels j (j = 2, 3, ... , N -I) and none of them contain the unit with label N. Therefore
~)r-Y]= sellt
L r- Ir
sell}
.fElll
=.!_[(N-2\,I +(N-3~y. -(N-2\,] n n-1 J n-2 )f::2 n-1 J 1 (N-2~ Y + n -1 :I,r. N-1 }- (N-2}' =n n-1 N- 2 . 2 n-1 1
1
1
1•
1
Proceeding in the same way, we get
~)Y -Y) =.!_(N-2~YN + n=l rrj}-(N-2r n n-1 N 2 · 2 n-1 n
(2.12)
1•
se.a•2
It can be seen that
~= CJ n
N-n N(N-1)
Using (210)-(2.13) in (2.9) we get
(2.13)
18 Sampling Theory and Methods .!..,. -n [S;· ---(YN 2C . V(Y ) =NN n N-1
-Yt -nC) ]
(2.14)
which is the required result.
V(Yl=
N;n[ s!]
(2.15)
Therefore
~. )0 N-1
(comparing (2.14) and (2.15))
:::) YN - Y1 > nC (when Cis positive)
=> O
:yl]
Hence the solution. •
Problem 2.2 Given the information in problem 2.1, an alternative plan is to include both Y1 and Y8 in every sample, drawing a sample of size 2 from the
.
units with labels 2. 3, .... 7, when N=8 and n=4. Let
.
Y;
be the mean of those 2
~ Y1 + 6f., + Y8 units selected. Show that the estimator Y' = is unbiased for the 8
.
9V(f.,) . . . popu1auon mean wnh vartance - . 16
.
..:. Y1 +6Y.,. So lut1on Y'= 8
+Y
8
= Y1
+Y + [6] - -1L Y· 8
8
8 2 .
I
IES
Taking expectations on both the sides we get
E[Y']=[Yt~Yg HiH~I;Y;] where I; = 1 if iE s = 0 otherwise Since E[/;] =~.we get from (2.16) 6
Hence the solution. •
(2.16)
Equal Probability Sampling
19
P.,.obkm 2.3 Show that when N = 3. n = 2 in simple random sampling, the
estimator
..!_y1 +..!_Y2if s:;:{l,2} 2
y• =
2
..!_ Yt 2
+~ Y3 if s ={1.3} 3
..!_y2 +.!.Y3 if s={2,3}
2 3 is unbiased for the population mean and V(Y.)>V(Y)
if Y3 [3Y2 -3Y1 -Y3 ]>0
Solution By definition
If
~.
""' ~. 1 (1 1 1 1 2 1 1 } ~[Y ]3 = J )l2Yt +2Y2 +2Yt +JY3 +2Y2 +JY3
E[Y ] =
=
1.s1 .s,
(
2
J
=( Hence V[
~ Jrt + r 2 + Y3 }=i'
y• is unbiased for the population
y•] = ~ { [ ~ Yt + ~ Y2
r
+ +[
-[Yt +Y; +Y3
~ Yt + ~ Y3
r
r~ ~ n
mean.
+ [ y2 + Y3
1212 2 21 1 18 18 27 18 - 9 We know that under simple random sampling,
= -Yt +-Y2 +-Y3 --YtY., --Y2Y3
~[Y·- Y] 2
1 - (3)(2) 3-1 ~
V[Y]- 3 - 2
(refer remark 2.1)
I
t=l
=I~ [
Y? +Yf + Yl-Y1Y2 -Y1Y3 -Y2 Y3 ]
Therefore
v[YJ-v[Y"J ={ 5~]rl
+(;4]r YJ -[534]rtYJ 2
Using the above difference we get, ~
~.
V[Y]-V[Y ]>0 ::::) Y3[3Y2 -3Yt -Y3]> 0 Hence the solution. This example helps us to understand that under certain conditions, one can find estimators better than conventional estimators. •
20 Sampling Theory and Methods
Probkm 2.4 A simple random sample of size n = n1 + n2 with mean y is drawn from a finite p!Jpulation, and a simple random subsample of size n 1 ts drawn from it with mean Yt . S~ow that (a) V[y 1 -
Y2] =
s;[1..+J..] nt
n2
where Y2 is the mean of the remaining n2
units in the sample, (b) V[yt-Y1=S 2
[2__.!_]
Y n1
n
(c) cov(y. y1 - y) = 0
Solution Since y 1 is based on a subsample, (2.17) V(yt) = Et V2(Yt )+ V1E2 (Yt) where £ 1 is the unconditional expectation and £ 2 the conditional expectation with respect to the subsample. Similarly V1 is the unconditional variance and V2 is the conditional variance with respect to the subsample . It may be noted that E 2 lYt] = y and V2[y1] =
n-nnnt s; (refer Remark 2.1 ). 1
·
_ N -n 2 _ n-n 1 2 Therefore V1£2[y 1] = - - S y and Et V2£Yt1 = S y. Nn
nn 1
Substituting these expressions in (2.17) and doing the necessary simplification we get .V[yl] =
(.!.- 2... 1~; n
N [·
(2.18)
Further cov(y. Yt) = E [y Yt]- E [Y]E l:Yt]
= Et E2 [y Y"t1- YEt £2[ Yt1 = Et[yyt]-YY =V[y]
=[NNn-n]s;· We know that cov(y. y 1 - y) = cov(y, Yt )- cov(y, Y> = V[y]- V[y] (using 2.19) =0 This proves (c). • Note that V[y1 - Y1 = V[y1]+ V[y]- 2 cov (y. y 1) = V[yt]+ V[Y]- 2V[y] (using 2.19) = V[y1]- V[y]
(2.19)
Equal Probability Sampling
21
(using (b))
= :; [
n:n~l ]s;
S;. (since n2 = n- n1 ) = nl +n2 s; =[-·-+-•-]s; nln2 . nl n2 =.~-1n2 nl
This proves (a). •
Probkm 2.5 Suppose from a sample of n units selected with simple random sampling a subsample of n' units is selected with simple random sampling, duplicated and added to the original sample. Derive the expected value and the approximate sampling variance of y' , the sample mean based on the n + n' units. For what value of the fraction
~ n
does the efficiency of
Y'
compared to
that of y attains its minimum value? Solution Denote by y o, the mean of the subsample. The sample mean based on
n + n' units can be written as 0 -. nv+n' V y = . " n+n' Since )" is based on the subsample, E[y'] = E 1E 2 [y'], where E2 being expectation w.r.t. the subsample and E1 the original sample.
[ny+n'y0 ]_nEtE2(Y)+n'EtE2(yo) Therefiore E( -y 1 = EE __ 1 2 n+n' n+n' ----~--__..;.__;;;,
=nY+n'Y = f n+n' Hence the combined mean is unbiased for the population mean.
22 Sampling Theory and Methods
V[y'] = E 1V2[y' ]+ V1E2[y']
v[ nv. n+n' +n'
0
=EtV2 =
]
1 2 {EtV2(n'yc')+Vt(ny+n'y)} (n +n') 1
=
(n+n')
=[
=[
=
yo +VtE2 [ny+n' ......;;__~ n+n'
]
[
n-~· s~]+(n+n') 2 V1 (Y)}
2 {Et[n' 2
n'
2
]
nn
[n-n' s; nn' ·
n
n+n'
-
N -n ]+ . Nn
n:n. n:,~· s; ] + 2
n'
]
n+n'
[ 1
N-n Nn
s2y s2
s;
1 ] S;+., --n'
n
·
Y
(approximately)
n
n'
n ]2[..!._.!. =[ 1+n' n'
n
n
(2.20)
=
By Remark 2.1, V(y)
=[NNn-n]s~· =s;n
Therefore by (2.20) and (2.21 ), the efficiency of
r
(2.21)
y' as compared to y is
n'
1+3-
E =[ I+ :
n
=
1+39 II+ 9
p
n'
where 9
=-;;-.
Using calculus methods. it can be seen that E attains maximum at Therefore the value of
!t.. n
for which the efficiency attains maximum is
a=~.
~ .• 3
3
Equal Probability Sampling
23
Probkm 2.6 Let Y; be the ith sample observation (i = 1, 2..... N) in simple random sampling. Find the variance of y 1 and the covariance of Y; and .v 1 ( i ~ j ). Using these results derive the variance of the sample mean.
Solution Claim : In simple random sampling, the probability of drawing the unit with label r(r 1, 2, ... , N) in the ith draw is same as the probability of drawing the unit with label r in the first draw.
=
Proof of the Claim The probability of drawing the unit with label r in the first draw is - 1 . N The probability of drawing the unit with label r in the ith draw is
[I-~] [I- N~l] [I- N~J{I- N-1i+2] [I-N-1 i+l]
which on simplification reduces to - 1 . Hence the claim.
N
Proceeding in the same way it can be seen that the probability of selecting the units with labels r and s in the ith and jth draws is same as the probability of drawing them in the first and second draws.
Y; can take any one of the N values Y1 , Y2 , ••• , YN
Therefore, we infer that with equal probabilities - 1
and the product y; y i
N
Y1 Y2. Yt Y3 •..., YN-1 YN with probabilities
can take the values
1 N(N -1)
Hence we have 1 N E[y;]=N LY;
(i)
(2.22)
i=l
N
E[y~] =_I """ y;2 I N£.. I
(ii)
(2.23)
i=l
l (iii)
N
N
E[y;Yjl= N(N-l)LLY;Yj
(2.24)
i=l j=l
i"'}
From (2.22), we have E[y;] =f. Therefore V[Y; ]
1 N 2 =-LYl-Y N . ]=
(refer (2.23))
I
N
=-1 ~[Y· -f]2 = N-1 s2 N~ 1 N y j=l
Using (2.24) and (2.25) we get
(2.25)
24 Sampling Theory and Methods N
,V
1
L,L,r,r1 -r 2
cov(y;.y 1 )=
N(N -1) •=l J=l l~j
-
y2-
1
- (N-1)
N
1 ~ y2 N(N-1)~ lc /c=l
(2.26)
=--1-[fr/c2 -Nr2]=--s; N(N -1) lc=l
N
We know that
LYi
1 n [ V[y]=Vn i=l
]
1
= 2
n
n
L
n
V(y;
j:;:l
i=l j=l i<j
=-1-[n(N- 1) s;+2n(n-n(-s; n2 N · 2 N
=
n
)+2L,L,cov(y;.Y j)
11 (using(2.25)and JJ
(2.26))
[N-n]s; Nn ·
Hence the solution. • Sv
Probkm 2. 7 If the value of the population coefficient of variation C = ~ is y
known at the estimation stage, is it possible to improve upon the estimator y . the usual sample mean based on a sample of n units selected using simple random sampling? If so, give the improved estimator and obtain its efficiency. by comparing its mean square error with V (y) . Solution Consider the estimator Y.t =ly where A. is a constant. The mean square error of the estimator y1 is MSE( YA. )=E[A. y-Yf =E[A.(y- Y)+(l-l)Y]2
Equal Probability Sampling
25
=l2 E(y- Y) 2 +(A. -1) 2 f 2 +2l(l-DYEC:Y- Y) =
..t2V(y)+(l-1) 2 Y2 (2.27)
Using differential calculus methods. it can be seen that the above mean square error is minimum when
r =[I+ ;,.n r
A= [I + N;, n C 2
(2.28)
Therefore, the population mean can be estimated more precisely by using the estimator
y~
c2
N
y
whenever the value of C is known. Substituting (2.28) in (2.27) we get the minimum mean square error
M* =[N -n s;J[l+ N -n c2]-t Nn
·
Nn 0
Therefore the relative efficiency of the improved estimator to y is
[I+ NN~nC2
yl
when compared
r
It may be noted that the above expression will always assume a value less than one.•
Remark 2.2 We have pointed out, a simple random sample of size n is obtained by drawing n random numbers one by one without replacing and considering the units with the corresponding labels. If the random numbers are drawn with replacement and the units corresponding to the drawn numbers is treated as sample, we obtain what is known as a "Simple Random Sampling With Replacement " sample (SRSWR). Problem 2.8 Show that in simple random sampling with replacement (a) the sample mean y is unbiased for the population mean (b) V(y)=[N-l]s2 Nn
Y
Solution If Yi, i =I. 2, ... , N
is the value of the unit drawn in the ith draw then
Yi can-take any one of theN values Yi with probabilities Therefore E(yi) =
N 1 I,ri= Y N . I J=
~. N
(2.29)
26 Sampling Theory and Methods
In the same way. we get N
E(v:!) -'
=~
~
l .v =-~ y: N N.£..,; 1
Y2 1
I
1=1
l
Hence V(y;) =N
1=1
L Y[., -Y_.., =--S; N-l .., N
N
1=1
-
(2.30)
Since draws are independent cov(y;, )'j) = 0, we get
E(y) = E[.!_
i
n .
I=
=.!.nr n
Y;] 1
(using (2.29))
=Y and
._ [I~ ] [I ~ ] I
V(y) =V -;; ~ Y;
=
N -I 2 N -I 2 2" .£..,; V(y;) =2(n)NS y = Nn S_v
n
t=l
n
t=l
Hence the solution. •
Probkm 2.9 A simple random sample of size 3 is drawn from a population of size N with replacement. As an estimator of Y we take y'. the unweighted mean over the different units in the sample. Show that the average variance of y' is (2N -l)(N -l)S;
6N 2 Solution The sample drawn will contain I.i or 3 different units. Let P~o P 2. and P3 be the probabilities of the sample containing I,2 and 3 different units respectively. N
P1 =
L
P (selecting rth unit in all the three draws)
r=l
=N-I _I _I =-I-
NN N
N2
N
P2 =
L
P (selecting rth unit in draw I and a unit different from rth unit in
r=l N
the second and third draws) +
L
P (selecting the rth unit in draw 2 and
r=l
a unit different from rth unit in the first and third draws) + N
L
P (selecting the rth unit in draw 3 and a unit different from rth unit in
r=l
the first and second draws).
Equal Probability Sampling
27
1 N-l N-1 l N-1 N-l 1 N-1 N-1 = N------- + N ------- + N ------NNN NNN NNN
=
3(N -1)
N2 N
P3
=L
P (selecting rth unit in draw 1, a unit different from rth unit in the
r=l
second draw and
sele~ting
in the third draw a unit different from
units drawn in the fi.rst two draws) =
=
N(N -1)(N- 2) N3
(N -1)(N -2)
N2 We know that the variance of the sample mean based on n distinct units is N-n 2 --Sy. Nn Therefore th~ average variance of y' is (
N - 1 S 2 J-1- + ( N y N2
N - 2 S; ) 3( N - 1) + ( 2N · N2
which -on simplification reduces to
N- 3S2 3N y
l
(N-l)(N-2) N3
•
(2N - 1)( N - 1) S 2
6N
2
y•
Hence the solution. •
Exercises 2.1 2.2
2.3
2.4
2.5
Derive V(s;) and cov(x,s.;) in simple random without replacement under usual notations. Let v denote the number of distinct units in a simple random sample drawn with replacement. Show that the sample mean based on the v distinct units is also unbiased for the population mean and derive its variance. Suggest an unbiased estimator for the population proportion under simple random sampling without replacement and derive its variance and also obtain an estimator for the variance. Suppose in a population of N units. NP units are known to have value zero. Obtain the relative efficiency of selecting n units from N units with simple random sampling with replacement as compared to selection of n units from the N- NP non-zero units with simple random sampling with replacement in estimating the population mean. A sample of size n is drawn from a population having N units by simple random sampling. A subsample of n 1 units is drawn from the n units by simple random sampling . Let y1 denote the mean based on n 1 units and
y2 the mean based on
n-n 1 units. Show that wy1 +(1-w)y2 is unbiased
28 Sampling Theory and Methods for the population mean and derive its variance. Also derive the optimum value of w for which the variance attains minimum and the resulting estimator.
Chapter3
Systematic Sampling Schemes 3.1 Introduction In this chapter, a collection of sampling schemes called systematic sampling schemes which have several practical advantages are considered. In these schemes, instead of selecting n units at random, the sample units are decided by a single number choseri at random. Consider a finite population of size N, the units of which are identified by the labels I. 2, ... ,Nand ordered in ascending order according to their labels. Unless otherwise mentioned. it is assumed that the population size N is expressible as product of the sample size nand some positive integer k, whtch is known as the reciprocal of the sampling fraction or sampling interval. In the following section we shall describe the most popular linear systematic sampling scheme abbreviated as LSS.
3.2 Linear Systematic Sampling A Linear Systematic Sample (LSS) of size n is drawn by using the following procedure: Draw at random a number less than or equal to k, say r. Starting from the rth unit in the population, every kth unit is selected till a sample of size n is obtained. For example, when N=24, n=6 and k=4, the four possible linear systematic samples are : Sample Number Random Start Sampled units 1, 5, 9, 13 1 1 2 2, 6, 10, 14 2 3 3, 7, 11, 15 3 4 4, 8, 12, 16 4 The linear systematic sampling scheme described above can be regarded as dividing the population of N units into k mutually exclusive and exhaustive groups {S 1, S2 , ... , Sk} of n units each and choosing one of them at random where the units in the rth group are given by Sr ={r,r+k, ... ,r+(n-l)k},r= 1.2, ... ,k
----------------------------------------
30 Sampling Theory and Methods The following theorem gives an unbiased estimator for the population total and its variance under LSS.
Theorem 3.1 An unbiased eslimator for the population total Y under LSS N n corresponding to the random start r is given by YLSS = - Lyr+(j-l)k and its n . A
j=l
V(YLSs) = ~
variance is
lc
.Lrr,- Y]
2 where
A
Y,
IS
the value of Y LSS
r=l
corresponding to the random start r. Proof Note that the estimator YLSS
Y,, r =1, 2, ... , k with equal probabilities
can take any one of the k values
..!.. . k
Therefore
A
E[YLSs] =
k I,r,A(l) lc
1
=k
k N n L-;; ~ Yr+(j-l)k.
r=l
r=l
N 1c
J=l N
n
= nk 2, 2: Yr+(j-1)/c =LY;
(3.1)
i=l
r=l j=l A
Hence Yus is unbiased for the population total Y.
YLSS
Since the estimator
can take any one of the k values
Y,, r =I. 2, ... , k
with
..!.. and it is unbiased for Y,
equal probabilities
k
A
A
V(YLSs) = E[YLSs - Y]
2
=~)r,- r1 2 (~) r=l
k
=
~ _L[r,- rf
<3.2)
r=l
Hence the proof. • Apart from operational convenience. the linear systematic sampling has an added advantage over simple random sampling namely, the simple expansion estimator defined in the above theorem is more precise than the corresponding estimator in simple random sampling for populations exhibiting linear trend. That is, if the values Y1, Y2 , ... , YN of the units with labels l, 2, ... , N are modeled by Y; =a+ f3 i, i = 1, 2, ... , N then systematic sampling is more efficient than simple random sampling when the simple expansion estimator is used for estimating the population total. This is proved in the following theorem. Before
S)'stematic Sampling Schemes
31
the theorem is stated. we shall give a frequently used identuy meant for populations possessing linear trend. Identity For populations modeled by Y, =a+ f3i, i = l, 2..... N
Y, - Y= Nf{r - (k ;
I)]
(3.3)
.
where Y, is as defined in Theorem 3.1.
Proof: Note that when Y; =a+ {3i. i = 1. 2..... N . we have .. N n N n . Y, =-L,Yr+(J-I)k =-:L{a+{J(r+(j-l)k.]}
n ]= . I
=
n j= . I
~ [ na + fJnr + fj[k n(n2-l)l]
= N[a + {Jr -1- {J[k (n
-l)]l
2
(3.4)
j
N
and
Y = LY; i=l
= t.[a +
./li]
= Na+
/3[ N(~ +I)
]
(3.5)
Using (3.3) and (3.4) we get
Y,- y = N[a+ fJr+ .13[ k(n2-1) ]-a- .13[ nk2+1]] =Nfl[r+ nk-k~nk-l] = Nfl[r- (k ;
I)]
Hence the identity given in (3.3) holds good for all r, r =I. 2, .... k . • Theorem 3.2 For populations possessing linear trend, V(YLSS) < V(Ysrs) where
YLSS and Ysrs are the conventional expansion estimators under linear systematic sampling and simple random sampling, respectively. Proof We know that under simple random sampling 2 N V(Y )= N (N -n) ~ ~[Y.·- Y]2 (3.6) srs Nn N- l
{:t '
a: 1· = I. 2 , .... N . t h.en 1' =a+ fJa (N + l) N ote that 1"f Y..; =a+ fl'• 2 Therefore • for i = I. 2, ...• N
(3.7)
32 Sampling Theory and Methods [Y1
-..,
-n·:
. N+l [ a+P<-a-P(
2
)~
2
:pt-('';1 )]' =P2[;2 + (N:I)2 Hence
f[Y;
-i(N+l)]
-f]2 =/32[N(N+1)(2N+l) + N(N+1) 2 _ N(N+1) 2 ]
6
i=l
=pi[ N(N +1~N
4
-I)]
2
Substituting this in (3.6) we get
V(y
)= N 2 ( N -
srs
Nn
n) __.!.._
a 2[ N ( N + 1)( N - 1)]
N - 1 ~
12
2a2
N ~ (k - 1)(nk + 1) 12 On using identity'given in (3.3), we get
(
=
tif,-
Y]2: ~N2p2[r-
.
usmg
(k; I)
N
=n
k)
(3.8)
r
(k:1)2 -
=N-2 p2t.[r2 +
i(k
+I)]
= N2/32[k(k + 1)(2k + 1) + k(k + 1) 2 _ k(k + 1) 2 ]
6
4
2
N2/32k(k2 -1)
=-....;...._.....;..._ _
12 N2 /32(k 2 1) Therefore V(YLSs) = 12 -
(3.9)
Thus using (3.8) and (3.9) we get. V(Y
srs
)-V(Y LSS
)=N 2 {3 2 (k-1)(nk+1-k-1) 12
= N 2f3 2k(n- 1)(k - 1) 12 Since the right hand side of the above expression is positive for all values of n greater than one, the result follows. •
33
Systematic Sampling Schemes
Yates Corrected Estimator In Theorem 3.2. it has been proved that linear systematic sampling is more precise than simple random sampling in the presence of linear trend. Yates (1948) suggested an estimator that coincides with the population mean under linear systematic sampling for populations possessing linear trend. The details are furnished below: When the rth group S r is drawn as sample. the first and last units in the sample are corrected by the weights At and
~
respectively (that is. instead of
using Yr and Yr+(n-t)A: in the estimator, the corrected values namely At Yr and A2 Yr+(n-t)A: will be used) and the sample mean is taken as an estimator for the
population mean. where the weights At and A2 are selected so that the corrected mean coincides with the population mean in the presence of linear trend. That is. the corrected mean
Yc
=.!.[At Yr + I Yr+(j-l)A: +A2Yr+(n-t)A:]
n
. 2
j=
is equated. to the population mean to get
Y after substituting.
Y; = a+ {3i. i = 1, 2, ... , N
.!.[.1.1(a+ P..l+ ~[a+ /J[r+(j -l)kJ]+l =a+ {j(N + 1) n
J= 2
2
(3.10)
A2 [a+ {j[r+ (n -1)k1]
Comparing the coefficients of
a in (3.10) we get
1 -[At +A2 +n-2] = 1
n Therefore At + A2 = 2 Again comparing the coefficient of f3 in (3.10) we get 1[
..,
;; Atr+A 2 [r+(n-l)k]+(n-~)r+
Cn -l)(n- 2) 2
k
(3.11)
]
N +1
=2-
2[At r + A2[r + (n -1)k] + (n- 2)r + (n - 1)(n- 2) k ]= n(N + 1)
2 [2Atr + 2(2 -At )(n -1)k + 2(n- 2)r + (n -1)(n- 2)k ]= n(N + 1)
(using (3.11)) . n(2r- k -1) Solvmg for At we get At= 1 +_;,---~
(3.12)
2(n -1)k
Using (3.12) in (3.13), we find that n(2r- k -1)
(3.13)
A2=1----2(n -1)k
When the above obtained values of At and estimator, we get
~
are used in the Yates corrected
34 Sampling Theor_.; and Merhods -
1
r
(:!r-k-1)
Yc =-lYr + [Yr- Yr+(n-l)k] N 2(n-l)k
]
(3.14)
Therefore the estimator ,A
[
Yc = Yr +
(2r- k - 1) ] [Yr - Yr+
estimates the population total without any error. Since the estimator coincides with the parameter value, it has mean square error zero.
3.3 Schemes for Populations with Linear Trend In the previous section, we have seen a method in which the corrected expansion estimator coincides with i:he population total in the presence of linear trend. However, instead of correcting the estimator, many have suggested alternative sampling schemes which are best suitable for populations with linear trend. Three such schemes are presented in this section.
(i) Centered Systematic Sampling (Madow, 1953) As in the case of linear systematic sampling, in centered systematic sampling also the population units are divided into k groups S 1, S 2 , ... , S k of n units each, where S r = { r. r + k, ... , r + ( n - 1)k }, r = 1, 2, ... , k . If the sampling interval k is odd then the middlemost group namely Su+l)/ 2 is selected as sample with probability one. On the other hand, one of the middlemost groups, namely Sk 12 or Su+2)/2 will be randomly selected as
sample. To estimate the population total, one can use the expansion estimator as in the case of linear systematic sampling. If Ycss is the estimator of the population total under centered systematic sampling, then (i) when k is odd, Ycss = Yck+l)/ 2 with probability one A
(ii)
when k
•
IS
A
even, Ycss =
{yk'"-
with probability 1/2
A
Y(k+2)/2
with probability l/2
A
It may be noted that in both the cases Ycss is not unbiased for Y. However, for populations with linear trend. it has same desirable properties as shown in the following theorem.
Theorem 3.3 For populations satisfying
Y; =a+ f3i, i = 1, 2, ... , N ,
(i) when k is odd, Ycss = Y and MSE(Ycss) = 0 and •
A
f3 2
(ii) when k is even, E(Ycss) = Y and MSE(Ycss) = -
4
Proof For populations with linear trend, we·have seen in (3.3)
Systematic Sampling Schemes
35
- Y- N/3 [ r- (k+l)l , r- l. "'-· ... , k Y,-
2
Therefore
.-
(I) Y(k+ll/2- Y
. -
(u) ykl:!
and
, •••
,j
=N/3. [(k+l) 2 -
-Y=N/3
A
llll) Y(k+ 2>12 -Y
.
[k2-
(k + 2
(k+l)] 2
=0
( 3.15)
1)] =- N/32
[k+2
= N/3 - - 2
(k+l),J
2
(3.16)
N/3
=2
(3.17)
Hence when k is odd MSE('Ycss) = [ Y< k+l)/2 - Y ] 2 = 0
•
A
1 [A
and when k 1s even MSE(Ycss) = 2
Yk 11
-
Y
(By (3.15))
]2 +21 [AY(k+2)/2- Y ]2
=_!_[N:!/32 +N2/32] 2 4 4 = N2/32 4 Thus we have proved the theorem. • The centered systematic sampling described above is devoid of randomisation. Hence the results based on a centered systematic .sample are likely to be unreliable panicularly when the assumption regarding the presence of linear trend is violated. Hence it is d~sirable to develop a sampling method free from such limitation. In the following pan of this section, one such scheme developed by Sethi( 1962) is presented. (ii) Balanced Systematic Sampling (Sethi, 1962) Under Balanced Systematic Sampling (BSS}, the population units are divided
into!!.. groups (assuming the sample size n is even) of 2k units each and a pair of 2 units equidistant from the end points are selected from each group. This is achieved by using the following procedure: A random number r is selected from 1 to k and units with labels r and 2k - r + 1 will be selected from the first group and thereafter from the remaining
!!.. - 1 groups, the corresponding pairs of elements will be selected. 2
For example.
6 when N=24 and n=6, the populatton units are divided into - = 3 groups of 2 (2)(4)=8 units each as follows: 1 9 17
2 10 18
3 11 19
4 12 20
5 13 21
6 14 22
7 15
23
8 16 24
36 Sampling Theory and Methods The four possible balanced systematic samples are listed below: s 1 ={1,9.17,8,6,24}, s 2 ={2,10.18,7,15.23}, s 3 = {3,11,19, 6,14, 2~}. s 4 = {5,13, 21, 4,12, 20}
Thus the balanced systematic sample of size n corresponding to the random start r is given by the units with labels {r + 2 jk ,2(j + l)k - r + 1}, j = 0, 1, 2, ... , ~- 1 2 When the sample size n is odd, the balanced systematic sample of size n corresponding to the random start r is given by the units with labels
{r + 2jk,2(j + l)k- r + 1}U {r + (n -1)k }, j = 0,1, 2•... , n- 3 2 T/Norem 3.4 Under balanced systematic sampling, the conventional expansion estimator is unbiased for the population total. ProofCase 1 "n even" . The expansion estimator YBL corresponding to the random start r can take any one of the k values N (n-2)/2 .. (r) ~ Y BL =- . L.)Yr+2jk + Y2(j+l)k-r+l ], r = 1, 2, ···• k n j=O with equal probabilities
I_ . k
k
1 ~ .. (r)
..
Therefore E(Yst> =k .L.JYBL r=l 1 k
=k
N (n-2)/2
L-n L r=l
N = nk
L L
N
I
ri
{Yr+2jk
+ Y2(j+l)k-r+d
j=O k
csince us r = s and s r ns, = ,
i=l
.
+ Y2(j+l)k-r+l}
k (n-2)/2
r=l
=
{Yr+2jk
j=O
for r '* t >
r=l
Hence YBL is unbiased for the population total Y. Case 2 "n odd" In this case, YBL can take any one of the k values
Y~2 = Nn [("f[~r+2jk +
Y2(j+l)k-r+l ]+ Yr+(n-l)k] • r
j=O
with equal probabilities
I_ . k
= 1, 2, ... , k
Systematic Sampling Schemes
37
k
1 ~ '(r)
Therefore E(YBL) = k LJ YBL A
r=l I
k
=k L r=l
N{(n-3)/2
-n
L j=O
,
[Yr+2jk
+ Y2(j+l)k-r+d + Yr+(n-l)k ~
J
k
= Y (since Usr =Sand srns, ='for r'*t) r=l A
Hence in this case also, Y BL is unbiased for the population total. • Thus from the above theorem we infer that the conventional estimator is unbiased for the population total in balanced systematic sampling. It may also be noted that the variance of the estimator is k
A
V(YsL)
1~
A(r) 2 =k LJ[YBL -f)
r=l
where
Yj2
is as defined in the previous theorem.
Theorem 3.5 When Y; = a+ {Ji, i = 1,2,...• N • 0
when n is even
v(YBL) = { f3 2k 2 (k 2 -1) 12
. when n 1sodd
Proof For r = I. 2 •...• k • when n is even
N (n-2)/2 ~ YBL =- LJ[Yr+2jk + y2(j+l)k-r+l] n j=O A(r)
N (n-2)/2 =+ /3(r+ 2jk) +a+ /3[2(j + I)k- r+ I]} n j=O N (n-2)/2 ={2a + f3[r + 2jk + 2(} + l)k- r +I]} n j=O
Lfa
L
=!!._{!!_{2a+/3(2k+I)}+4/Jc n(n- 2) } n 2 8
=N[a+ Jl(~ +I)] =Y A(r)
Thus we have YBL
=Y
for all r = I, 2 •...• k
Therefore V (YBL) = 0 . For r = I, 2; .•.• k • when n is odd
38 Sampling Theory and Methods v{(n-3)/2
Y~'i} =-'
~
n
[Yr+2jk
+ Y:!(j+l)k-r+l] + Yr+(n-llk
}
.=0 }-
L[2a + /3(4jk + 2k + 1)] +{a+ f3[r + (n -1)k} }
N {(n-3)/2
=n
~
'=0 J
<[na+P{
(n -1}(2k + 1) (n -l)(n- 3)k k }] -----+ -+r+(n-1)
2
2
=!!_[na + /3Jl (n -l)(nk + k + 1) + r
l]
(3.18)
n 2 J Funher we know that for populations having linear trend Y = Na + f3 (nk)(nk
+ 1)
(3.19)
2 Using (3.18) and (3.19) we get, for r = I, 2, .... k
[r- k+1] 2
y -Y = f3N BL
n
Squaring both sides and summing with respect to k we get
~ ±~!2 -rf = p2:2 ±[r- k ;1]2 n k
r=l
=
r=l
/3 2 N 2 r~ r2 + k(k+1) 2 _ 2 k+1 k(k+1) n 2k
£...J
L..r= 1
4
-2
]
2
= f3 2 N 2 [ k ( k + 1)( 2k + 1) _ k ( k + 1l ] n2k 6 4
f3 2_ (k _ 2 -_ l)k_ 2 =..;....._ 12 Hence the theorem. •
(iii) Modified Systematic Sampling (Singh, Jindal and Garg, 1965) The modified systematic sampling is another scheme meant for populations exhibiting linear trend. A sample of size n is drawn by selecting pairs of units equidistant from both the ends of the population in a systematic m~nner. The details are furnished below. As in the case of linear and balanced systematic sampling here also a random number r is selected from 1 to k. When the sample size n is ev~n. the sample corresponding to the random stan r ( r = 1, 2, ... , k ) is given by the set of units with labels
s r = { r + jk, N - r - jk + 1}, j = 0, 1, ... , n ..:.. 2 2
Systematic Sampling Schemes
39
When the sample size n is odd, the sample corresponding to the random stan r ( r = 1, 2•...• k ) is given by the set of units with labels •
•
k
s r = { r + jk. N- r- jk + 1} U {r +
(n-l}k
.
n-3
, J = 0, I, ..., -
r=l 2 2 For example, when N= 16, n=4 and k=4 the four possible modified systematic samples are s 1 = {1, 5, 12, 16}, s 2 = { 2, 6, 11, 15} • S 3 = {3, 7• 10, 14}, s 4 ={4, 8, 9, 13} It is interesting to note that the theorems which we proved in the previ,.,us section for balanced systematic sampling are true even under modified systematic sampling.
TINorem 3.6 Under Modified Systematic sampling. the conventional· expansion estimator is unbiased for the population total.
Proof of this theorem is left as exercise. Tlteorem 3.7 Under Modified Systematic sampling. 0 if n is even V (YMOD)= { f3 2 k 2 (k 2 -1} . . ..;...._----•f n1sodd 12
..
when Y; =a+ f3i, •i = 1, 2, ...• N • where YMOD is the conventional estimator under Modified Systematic Sampling. Proof of this theorem is also left as ~n exercise.
3.4 Autocorrelated Populations Generally, it is reasonable to expect the units which are nearer to each other possess similar characteristics. This property can be represented using a statistical model assuming that the observations Y; and Yi are correlated, the correlation being a function of the distance between the labels of the units which decreases as the distance increases. Such population's are called autocorrelated populations and the graph of the correlation coefficient p,. between observations separated by u units. as a function of u is called "correlogram''. Because of the finite nature of the population the correlogram will not be smooth and it will be difficult to study the relative performances of. various sampling schemes for a single finite population. But it is easy on the average over a series of finite populations drawn from an infinite super population to which the model applies. A model which is suitable for populations possessing autocorrelatedness is described below. Model The population values are assumed to be the realized values of N random variables satisfying the following conditions :
40 Sampling Theory and Methods
~
and EM [Y; - ,u][Yi+" - ,LL] = Pua- where Pu 2: Pv whenever u < v. The subscript M is used to denote the fact that the expectations are with respect to the superpopulation model .The following theorem gives the average variance of the expansion estimator under simple random sampling and linear systematic sampling.
Theorem 3.8 Under the super population model described above A
EMV(Y.m)=
-1)N 2
a2 (k
[
nk
N-1
L(N-u}pu
r[ i 1-
nk
l
2
V(Y
srs
)=N 2 (N-n) Nn
nk-1
•
]
L(nk-u)pu +
nk(k -1) u=l
2k [ n(k-1) ProoifWeknowthat
]
N(N -1) u=l
a 2 (k -l)N 2
A
EMV(Yf..Ss)=
2
1-
n-l
~(n-u}pku 1
] }
f[Y·-Yi
N - 1 £..J
I
(3.20)
i=l
N
N
Note that L[Y; - ,u] 2 = L[Y;- Y + Y- ,u] 2 i=l
i=l N
= L[Y; -
Y] 2 + N[Y- .ul 2
i=l N
·N
Therefore L[Y;-
Y] 2
i=l
= L[Y;- ,u] 2
N[Y- ,u] 2
i=l
= ~[Y;- ~tf- N N
=
-
f[Y;-
!'] 2 -
[
~[
r=l
N
f[Y;- 2t[Y !'] 2
1N =---~ ,u] 2 N £..J [Y·I i=l
2
+
1 - !LUYi- !L] ]
1<1
1=l
N
]2
~~[Y;- !L]
N-IN-u
~ £..J ~ [Y·I -,u][Y.·I+U -,u] N £..J
--
u=l i=l
Taking expectations on both the sides with respect to the model we obtain
EM[~)Y;-Y] 2 ]=(N-1)0" 2 [1- N(N2 ~(N-u)pu] i=l
Using (3.20) and (3.21) we get
-1) u=l
(3.21)
Systematic Sampling Schemes E V M
y ( srs)
= N 2 (N- n)a 2 (N -1) [ 1_ Nn(N -1)
N-1 (N
2 N(N -1)
~
41
]
_ u) Pu
N-1 2 =a 2 (k -l)N 2 [ 1 _ ~ (N _ u) ] nk N(N -1) .L..i Pu u=l
Thus we have obtained the average variance of the conventional estimator under simple· random sampling with respect to the autocorrelated model described earlier. le
We know that
V(YL.Ss) = ~ L[Yr- Y] 2 r=l
=:
2
le
A
I[r;. -"f]2
. r=l
..:. 1 n I lc ..:. where Yr =- Lyr+()-l)le and Y =- Lyr n J= . I k r= I n
1e
N
~
-2
~~
-2
Note that .L..i[Y;- Y] = .L..i.L..i[Yr+<j-1\lc - Y] i=l
r=l j=l le
=
n
LL[Yr+(j-l)le - Yr-+ Yr- f]2 A
A
r=l j=l
len
A
~~
= .L.J .L.J [Y.,.+(j-l)le r=i j=l le
A
~-
Therefore .L.J[Yr-
-2 Y]
r=l
1 N
~
=- .L.J[Y;n i=l
-
-2
leA
~--2
Yr] + n .L.J [Yr -
Y]
r=l
1 1e n -2 ~~ Y] - - .L.J .L.J[Yr+(j-l)le n r=l j=l
(3.23)
On using !).21) we get
(3~24)
Using (3.21) and (3.24) in (3.23) we obtain
42 Sampling Theory and Methods k ~..:.
' .,]1il-
If.
-2
EM ~[Y, -Y]
2
=-~N-na-
n
r=l
N(N
L
(n-l)ka 2 n
{
1-
_
N-l } ~ ~(N-u)pu
-
l) u=l
2
~ } ~(n-u)pku
n(n-1) u=l
2 { } 2 N 2k n-1 = ~ (k-1)--L(N-u)pu +-X,(n-u)pku n nk u=l n u=l
=
a 2 (k _ 1) {
2
1-
n
nk(k -1)
nk
2k
n-1
u=l
n(k -1)
u=l
L(nk-u)pu+
L(n-u)pku
}
Substituting this in (3.22) we get the average variance of the conventional estimator in linear systematic sampling. • A comparison between these two average variances is given in Chapter S.
3.5 Estimation of Variance In Linear systematic sampling the second order inclusion probabilities are not positive for all pairs of units in the population. This makes unbiased estimation of the variance of the estimator impossible. In the absence of a proper estimate for the variance, several ad hoc procedures are being followed to estimate the variance of the conventional estimator. One of the methods is to treat the systematic sample as a simple random sample of size n units and estimate the variance by N 2 (N - n)
1
" [ ~
Nn
n -1
i=l
-] 2
£..J Y;- y
where Y; is they-value of the ith unit in the sample and y is the sample mean. It may be· noted that the above estimator is not unbiased for the variance of conventional estimator under linear systematic sampling. The second approach is to treat systematic sampling as a process of grouping the N population units into.!: groups of 2k units each and selecting two units from 2 each group in a systematic manner. In this case the population total can be estimated by 2 n/2 NY2i + Y2i-l (3.22) n i=l 2 Assuming that the two units have been selected with simple random sampling without replacement from the 2k units in the ith group, an unbiased estimator of the variance of the ith term in the brackets on the right hand side of (3.22) will be given by
L
(k; l) { Y2i - Y2i-l 2
r
Systematic Sampling Schemes
43
Hence an unbiased estimator of the variance of the estimator given in (3.22) is N2 N - n 2
~{ Y2i + Y!i-1 } 2
Nn i=l 2 An alternative variance estimator based on the same principles as. those considered above. which takes into account successive differences of sample values is given by n-1
N2N-nL Nn i=l
{y i+I-Yi }2 2(n -1)
Singh and Singh ( 1977) proposed a new type of systematic sampling which facilitates the estimation of variance under certain conditions. The scheme suggested by them is described below. (i} Select a random number from 1 toN (ii) Starting with r select u continuous unitS and the remaining n- u = v units with interval d, where u(iess than or equal to n) and dare predetermined. They have proved that if u+vd is less than or equal to N then the above sampling scheme will yield distinct units and the second order inclusion probabilities are positive if (a) dis less than or equal to u and (b) u+vd is greater than or equal to (Nn.)+ 1. When these two conditions are satisfied it is possible to estimate the variance of the conventional estimator. They have observed that in situations where usual systematic sampling performs better than simple ran~om sampling the suggested procedure also leads to similar results and for some situations, it provides better results than even linear systematic sampling.
3.6 Circular Systematic Sampling It has been pointed out in the beginning of this chapter, the population size is a multiple of the sample size. However in practice this requirement will not be satisfied always. Some survey practitioners will try to take the sampling interval k as the integer nearest to N/n. When this is followed, some times we may not get a sample of the desired size. For example, when N=20, n=3 and k=7, the random start 7 yields units with labels 7 and 14 as sample whereas the desired sample size is 3. In some cases, some units will never appear in the sample thereby the estimation of the population total (mean) becomes impossible. For example, when N=30, n=7 and k=4, the units with labels 29 and 30 will never appear as sampled units. These problems can be overcome by adopting a method, known as Circular Systematic Sampling (CSS) by Lahiri(1952). This method consists in choosing ~ random. start from 1 to N and selecting the unit corresponding to the random start and thereafter every kth unit in a cyclical manner till a sample of size n units is obtained, k being the integer nearest to Nln. That is,· if r is the number selected at random from 1 to N. the sample consists of the u·nits corresponding to the numbers r + jk if r + jk .s N and
44 Sampling Theory and Methods
r + jk- N if r + jk > N for j = 0, l. ... , (n -1) .It is to be noted that. if the sampling interval is taken as the integer closest to N/n, it is not always possible to get a sample of the given size as shown in the following example. Let N=l5, n=6 and k=3. The sample corresponding to the random start 3 has only five distinct elements namely 3,6,9,12,15. Motivated by this, Kunte(l978) suggested the use of the largest integer smaller than or equal to Nln to avoid the above mentioned difficulty. The following theorem due to Kunte( 1978) gives a necessary and sufficient condition under which one can always obtain samples having n distinct elements for any n less than or equal toN. Theorem 3.9 A necessary and sufficient condition for all elements of s(r. n) the
sample to be distinct for all r S. N and n S N • is that N and k are relatively coprime , where s(r. n) {i 1 , i 2 , .... in}. Here i j = [(j -l)k + r] mod N. with
=
the convention that 0 is identified with N. Proof Suftidency Suppose Nand k are relatively coprime and there exists rand n such that two elements of s(r. n) are equal. Without loss of generality assume that i 1 and i j+l =[jk + r] mod N. where j < n S. N. This contradicts the fact that k and N are coprimes. Necessity
Suppose for all r S. N, and n :S N • all elements of s(r, n) are distinct and Nand k are not coprimes. Let gcd(k,N)=a. with k = b.a- N = c.a , where band care both smaller than N. For any r let us take n ~ c + 1 . Then ic+l =[ck + r] mod N -
= [cb.a + r] mod N =[b.N + r] mod N = r= i 1 This again contradicts our assumption that all elements of s( r, n ) are distinct. Hence the theorem. • Under circular systematic sampling, the conventional expansion estimator is unbiased for the population total (whenever Nand k are relatively coprime) and N n 1 2 its variance is given by Y] where Yc; = !!_ LY~. yj being theyN •= . I n ]= .I
L[Yci -
value of the jth unit in the circular systematic sampling corresponding to the random start i.
3.7 Systematic Sampling in Two Dimensions The linear systematic sampling can~ extended for two dimensions populations in a straightforward manner. Here it is assumed that the nmlcl population units are arranged in the form of ml rows , each containing nk units and it is planned
Systematic Sampling Schemes
45
to select a systematic sample of mn units. The following procedure is adopted to draw a sample of size mn. Two random numbers rand s are independently chosen from 1 to l and I to k respectively. Then the sample of size nm is obtained by using the units with coordinates r + (i -I)l, s +(j -l)k. i =I. 2 .... , m and j =I. 2•... , n. For example. when m=3. 1=3, n=3 and k=4. the units corresponding to the random starts 2 and 3 are those placed against the coordinates (2.3). (5.3). (8.3). (2.7), (5.7). (8.7). (2.11 ). (5, 11) and (8.II ). Refer the Diagram 3.1 A systematic sample selected in this manner is called aligned sample.
Theorem 3.10 An unbiased estimator for the population total corresponding to NM m n • . . the random starts rands 1s g1ven by YTD = - L Lyr+(i-l)l.s+(j-l)k . I . I j= nm I= A
Proof Note that the estimator YTD can take any one of the lk values NM m n L Lyr·Hi-l)l.s+(j-l)k. r =I. 2.... ,/, s =I, 2, ... , k . I . I j= nm I= with equal probabilities values _I . kl I k I NM m n Therefore E(YTo>= 'L'L---LLYr+(i-l)l,s+(j-l)k =Y r=l .f=l lk nm i=l j=l A
Hence the proof. •
Remark It may be noted that the variance of the above estimator is
:l
I
k
LL[Yrs- Y] 2 where Yrs is the expansion estimator defined in the above r=l s=l theorem corresponding to the random starts r and s.
Theorem 3.11 An aligned sample of size n drawn from a population consisting of n 2 k 2 units has the same precision as a simple random sample of size n 2 when the population values are represented by the relation Yij = i + j. i = 1. 2 ..... nk; j = 1. 2, .... nk if the expansion estimator is used for estimating the population total.
Proof The variance of the expansion estimator based on a sample of size n 2 drawn from a population containing n 2 k 2 units is given by nknk ~ 44 22 ~~[f.. - f]2 I V(Y ) n k (n k -n ) ~ ~ IJ 2 2 2 2 2 srs n k - 1 i=l J=l n k n
=
nk nk Note that Y =--~~f .. 2 2 ~~ I) n k i=l j=l _
I
(3.23)
46 Sampling Theory tUU1 Methods 1
n/c nJc
=22LL[i+j] n k
=
i=l j=l
n 2~ 2 [{ (nk)(nk:nk +I)}+ { (nk)(nA:;(nA: + 1)}]
=nk+1 Therefore [Y;1 - Y] 2 = [i + j- (nk + 1)] 2 Summing both sides with respect to i and j from 1 to nk we get nJc nJc
~~[Y;j
nJc nJc
-f] 2
=~~[i 2 + j 2 +2ij]-
i=l i=l
i=l i=l 11/cnk
2(nk + 1)
LI
(i + j) + n 2k 2 (nk + 1) 2
i=l j=l
=2{ (nA:)(nA: +~)(2nk + 1)} + 2{ (nA:)(~ + 1)} 2 _ 4{ (nk + l)(nk~nk}(nk +I)}+ (n 2 k 2 }(nk + 1) 2 n 2 k 2 (n 2 k 2 -1)
=-----6
(3.24}
Substituting (3.24) in (3.23) we get n 2 k 2 (n 2 k 2 -n 2 )
A
V(Ysn>=
n
Yrv
n 2 k 2 (n 2 k 2 -1}
2 2
n k -l
n 4 k 4 (n 2 k 2 -1)
=
Since
2
1
6 is unbiased for the· population total, we have
6 (3.25}
2 V(Yro) = E(Yro - Y) .. 2 2 =E(Yro> - Y A
A
= E(Yr0 >2 - n 4 k 4 (nk + 1} 2 Corresponding to the random starts rands we have 2k2
Yro
=n
2
n
n
n
~ ~ Yr+(i-l)l.s+(j-:)k i=l J=l
n
n
1c 2 ~~[r + (i -1)k ;..a J=l
=
n
n
+ s +(j -l)k]
=k 2 ~~[r+ s +(i + j}k- 2k] i=l j=l
(3.26)
Systemtltic Sampling Schemes II
47
II
= k 2 [n 2 (r + s) + 2k ~Li- 2n 2k] l=l j=l
.::k 2 n 2 [r + s + k(n -l}]
(3.27)
Therefore lc
Efirol 2
lc
= LL£Yro1 2 ~ r=l s=l k = k2n 4 [
2t.~r 2 +2t.~rs+4k(n-l)t.t. r+k•(n-1) 2] 2k[k(k + 1}(2k + 1}
--~----~---+
=k2n4
2 k 2 (k +1) 2
6
+
4
4k(n-l) k(k+I} +k4(n-1}2
....,
.
= k 4 ,. 4 {
(3.28}
Substituting (3.28} in (3.26} we get and simplifying the expression we get ,. n4k4(n2k2 -I) V(Yro) = _ ____;_6_ __ Hence the proof. • The aligned systematic sampling described above is sometimes referred to as square grid sampling by some authors. For two dimensional populations an alternative sampling scheme known as "unaligned scheme" is also available. The details are furnished below. Two independent sets of random integers of sizes n and m, namely Ur.i2 , ••• j 11 }and {j1.h .... ,j11 } are draw~ from I to land 1 to k respectively. Then the units selected for the sample are those having the coordinates {il + rl, j r+l }, (i2 + rl, j r+l +k ), ...... ,{i 11 + rl, j r+l + (n -l}k. r = 0,1, ... (m-1) For example, the diagram 3.2 shows an unaligned sample of size nine when m=3, 1=3, n=3 and k=4; .in this example i 1 = 2,i 2 = 3,i3 =I, h = 2, h = 1 and iJ=3.
The two sampling schemes described for two dimensional populations have been compared by Quenouille ( 1949) and Das ( 1950). A review of systematic sampling in one or more dimensions is found in Bellhouse (1988).
3.8 Problems and Solutions Problem 3.1 Derive the average variances of the expansion estimators under
linear systematic sampling and simple Yl, Y2, ... , YN are random variables satisfying
random
sampling
assuming
48 Sampling Theory and Methods
l
2
3
4
s
6
7
8
9
lO
ll
l2
II
12
l 2
•
•
•
•
•
•
•
•
•
3 4
s 6 7 8
9
Fig. 3.1 Aligned Sample of size 9
I
2
3
4
s
6
7
8
9
•
l
•
2
•
3
•
4
s
•
•
6
•
7 8
9
10
•
•
Fig. 3.2 Unaligned sample of size 9
Systematic Sampling Schemes 49 EM(Y;)=a+bi, VM(Y;)=o- 2 ,i=l.2..... Nand covM(Y;.Yj)=O.i~j.
Solution The variables Y; · s can be written as Y; = a+ bi + e; • i ~ 1. 2, ... , N
where e; 's satisfy
.,
.
.
EM (e;) = 0, VM (e;) = o--. i = 1,2, ... , Nand cov M (e;, e j) = 0 for l ~ J.
. [b(N + Under the above descnbed model Y =a+ 2
Therefore
r, -f =h+{i-
1)] +-I L e N
j
N J=l
;n }+•- ~ ~·i}
Squaring and summing with respect to i from I to N we obtain EM f[Y; -f]2 =b2f[;2 -(N + l)i+(N + 1)2 ]+ No-2 + o-2- 20'2 i=l
4
i=l
N
(3.28)
N
We know that under simple random sampling V(Y SRS
) = N2(N -n} 1 Nn (N -1}
[f=: [~·'
2
-f]2]
1 EM Therefore EM [V (fSRS )] = N (N -n) _ Nn (N -1}
[±
[Y; -
Y] 2 ]
(3.29}
i=l
Substituting (3.28} in (3.29) and doing the necessary simplifications we get the average variance of the expansion estimator as EM [V(YsRs)] =[(nk + l)(k -l)lb2 + N(k -l)o-2
12n
~
We have seen in Section 3.2
.. N n N Yr -Y =-I,rr+(j-l)k-I,r; n j= · l · l •= =N n
n
N
J=l
i=l
L ~+b(r+(j -1}k +er+(]-l)k 1}-l: {a+bi+e;} (substituting the model}
=Nb[
f
r-(k+l}] N ~ 2 +-;~er+(j-l)k-~es J=l
Squaring both the sides we get
s=l
50 Sampling Theory and Methods .,
2
n
[y _ Y]2 = N2b2 [ r-(k + l) ] - +!:!.__ ~ r
2
.
_
2 £.Jer+(]-l)k n j=l
2
2N { L e s2 --;;~e N
n
s=l
J=l
r
r+(j-l)k
L es
}{ N
}
s=l
Taking expectations on both the sides we get
EM!Y,
-Y)2 =N2b2[r-(~+l) +( ~2 }2 +Na2 -2Na2 = N 2b 2[r-(k2 +
1)] 2 + N(k -1)<1 2 (usmg . N=nk)
(3.30)
We know that under systematic sampling A
V(YI.Ss)
~ 2 =-;;1 ~[Y, -Y] A
(3.31)
r=l
Using (3.30) in (3.31) we get the average variance of the expansion estimator under linear systematic sampling as
EM [V(fLSs J) =N2[ k:;l ]+N(k -1)2 Hence the solution. •
Problem 3.2 Let s r denote the set of labels included in a circular systematic
ins, for which r + jk S Nand s, be the complement set for which r + jk S N . If the end
sample
of size nand
n 1(r) be
the
number
corrections are applied to the sample mean unit with smallest label,
.!__ x to n
of labels
y, by giving the weight .!..+xto the n
the unit with largest label and
.!.. to n
the
remaining (n- 2) units in the sample so that the resulting estimator reduces to the population mean when there is a perfect linear trend Y; = a + bi , then show that
x=
2r+(n -l)k -(N + 1) ·r ( ) 1 n1 r 2(n-l)k
=n
2r+(n-l)k -(3N + 1)+ 2Nn 1(r) .f --------------1 n 1(r) < n 2n(N -k)
Solution Case (1) n 1(r) = n In this case the corrected sample meari can be Written as
Systematic Sampling Schemes 5 t
_=[1
Yc
I
I~ +- ~Yr~j-l)k +-Yr+(n-l)k n . 1 n
1 -+xjYr
n
j=
When we use the model Y; reduces to (a+br)+[b(n-1)
:a + bi .
the above estimator on simplification
~]+xb(n-l)k
(3.32)
We know that when Y; =a+ bi. the population mean is given by (nk + 1) b + a+ ---
2 Comparing the coefficients of b in (3.32) and (3.33) we get
r+[(n-1) ~]+
xk(n-1) = nkl+
(3.33)
1
. 2r+(n-l)k-(N+1) Solvmg for x we get x = _....;..._....;...___~ 2(n-1)k Case (2) n 1(r) < n
In this case the n labels included in the sample can be grouped into two categories. In the first category, we have those labels for which r + (j -1)k > N and in the second category, we include those labels for which r + ( j -1) S N . Note that the smallest label is nothing but the first label in the group 2, namely the one which exceeds N for the first time when we make circular systematic selection and the largest label is the last label in group 1, namely the one satisfying the inequality r + (j -1)k <. N . Therefore the corrected sample can be expressed as Yc
=;1
~Yr+(j-l)k
llt(r)-l
1 ] ;-x] Yr+{n tr)-l)k + [ ;+x Yr+n tr)k-N
[1
1
1
1
n LYr+(j-l)k-N
+n
j=n1(r)+2
i = £Yr+n1(r)k-N
-Yr+(n1(r)-l)t
]x+n
~)
2., Yr+(j-l)k + j=l
Substituting the model values we get Y,.+n 1(r)k-N -Yr+(n1(r)-l)k =bk{l-n) Note that
n LYr+(j-l)k j=n1(r)+l
(3.34)
(3.35)
52 Sampling Theory and Methods
=_!_[i Yr+cj-l)k -bN[n-nl(r)]] n
.
j=
l
k N =a+br+b(n-l)--b-[n-n 1 (r)] (3.36) 2 n Using (3.35) and (3.36} in (3.34) we get _ k n-n 1 (r) Yc =bk(l-n}x+a+br+b(n-1}--bN (3.37) 2 n Comparing the coefficient of b in (3.37} with the same in the population mean we get nk+I k r ]N -=k(l-n)x+r+(n-1)--Ln-nl(r}2 2 n Solving the above equation for X we get the required expression. • Problem 3.3 Show that the sample mean coincides with the population mean in
the presence of linear trend when a purposive sample of size 2 in which the first and last population units are. included. Compare its average mean and square error of the sample mean in centered systematic sampling assuming k is odd with the help of the super-population model Y; =a+bi+e;.i=l.2•...• N where e;'ssatisfy EM(e;)=O.EM<ef>=a 2 and EM(e;ej)=O.i~j
Solution Note that. if Y; =a +bi+,i = 1,2, ... ,N. the sample mean above mentioned purposive s~pling 5cheme is ft +YN Yp = 2 a+b+a+bN =----2 N+l =a+b-2 The right hand side of the above expression is nothing but the population mean in the presence of linear trend. Under the super-population model described in the problem we have (I)
Yp
=a+{N;l]+ ~ +eN)
(2)
Yc
=a+{N+l]+..!.. ~ ek+l and 2 n ~~ -+(j-l)k
(e 1
}=
2
(3} Y =a+JN+l]+.!.. ±ej 2 N i=t
1
where Yc is the sample mean under centered systematic sampling. Since both the sampling scheme considered above are purposive sampling schemes, we have
Systematic Sampling Schemes 53
=..!_(2a2)+~Na2 _ _!_( 20'2) 4
N
N2
N-2
.,
=--u2N
EM[MSE(yc)]=EM[
Yc-Y ]2
=EM[(..!..* e(k+D
n ~ j=l
.
--+U-l)k ~
[1]2 EM ~e(k+l) n
= -
n
.
j=
EM
l
{t
2
e (k+l> +(
=[~r na +[~ 2
2
r
2
N .
J=l
[1]2 EM ~eJ-[2] 11
. --+(J-l)k
J=l
-(.!.)tej]
·-l)k J
11a2
+N
.
J=
l
Nn
}{±•J} J~
-[~n]na2
N -n ., =--a-
Nn It can be seen that EM [ MSE(yp) ]-EM [MSE(Yc) ]~ 0. Hence we infer that
among these two purposive sampling schemes. centered systematic sampling is more efficient than the two point scheme. •
Exercises 3.1
Derive average mean squared errors under balanced systematic and modified systematic sampling schemes under the super-population model described in Problem 3.4 and compare them. 3.2 Derive the variance of the conventional estimators in simple random sampling, Linear systematic sampling and stratified sampling assuming
Yi =a+ bi + ci 2 and compare them.
54 Sampling Theory and Methods 3.3
Develop corrected estimator for the population mean in the presence of parabolic trend assuming the sample size is odd under Linear systematic sampling.(Hint: The first, middle most and last units in the sample can be corrected so that the corrected estimator coincides with population mean.) 3.4 Assuming the population values are independently distributed with same mean and same variance. show that the mean square error of the estimate based on a systematic sample is increased on the average by applying the end corrections to the extent
k2(k2 -I)
6(n-1) 2
a2.
3.5 Derive the variance of conventional estimator for the population mean "under circular systematic sampling_assuming the presence of linear trend.
Chapter 4
.Unequal Probability Sampling 4.1 PPSWR Sampling Method In simple random sampling and systematic sampling selection probabilities are equal for all units in the population. These schemes do not take into account the inherent variation in the values of population units. Therefore. these methods are likely to yield results which are not totally reliable, particularly when the units have significant variation with respect to their values. For such populations, one can think of other sampling schemes, provided additional infonnation about a suitable variable is available for all the units in the population. These type of variables are called 'size' variables. In this section, we shall present a method which makes use of the size information. Let X; and Y; be the values of the size variable and the study variable for the ith population unit, i = 1. 2, ... , N . Here it is assumed that all the N values X 1 , X 2 , .•. , X N are known. A sample of size n is obtained in the form of n units with replacement draws, where in each draw the probability of selecting the unit i, namely P; is proportional to its size X; . A sample obtained in this manner is known as Probability Proportional to Size With Replacement (PPSWR) sample. It is to be noted that P;
oc
X;
~
P; = kX; .Summing both sides we get
L P; = k LX;. Therefore k = -l and hence P; = -X·' fori= I. 2....• N. N
N
X X In order to select units with these probabilities, one can use either ··cumulative Total Method" or ';Lahiri's Method". These methods are described below. i=t
i=t
Cumulative Total Method Obtain the cumulative totals T1 = X 1 • T2 = X 1 + X 2 , T3 = X 1 + X 2 + X 3 TN =Xt +X2 + ... +XN A random number R is drawn fro~ 1 to N. If T;_1 < R S T; , then the ith unit is selected. It can be seen that in this method, the probability of selecting the ith
56 Sampling Theory and Methods unit in a given draw is T, - T;-t which is nothing but X; . This procedure is
X
X
repeated n times to get a sample of size n. Note Sometimes X; 'scan take even non-integral values. In such cases a new set
of values can be derived by multiplying each X; by a suitable power of 10 and the resulting values can be used in the selection process.
Labiri's Method The cumulative total method described above is very difficult to implement for populations having larger number of units. In such cases one can employ the Lahiri's method of PPS selection which is described below. Let M be an integer greater than or equal to the maximum of the sizes X 1, X 2 , ••• , X N . The following two steps are executed to select one unit. Step 1 Draw a random number say i from 1 to N. Step l Draw a random number R from 1 toM. If R S X;, then the ith unit is selected. If R >X;, then steps 1 and 2 are repeated till we get one unit. Theorem 4.1 The probability of selecting the ith unit in the first effective draw
is X; , in Lahiri's method of PPS selection. X Proof Note that a draw becomes ineffective when the number drawn in step 1 is i and that drawn in step 2 exceeds X;. Therefore the probability of a draw becoming ineffective is
il=f-1 [M-X;] =1-i=l
N
M
X
M
This implies the probability of selecting the ith unit in the first effective draw is
[:~ H:~ ]A+[:~ ]A
2
+. .
=[:~ Jll-A[' =[:~Il-l+;
r=[; J
Hence the proof. • Thus from the above theorem we infer that Lahiri's method of PPS selection yields desired selection probabilities. The following theorem gives an unbiased estimator for the population total under PPSWR and also its variance estimator.
UneqUill Probability Sampling
51
Theorem 4.2 Let Y; be the )•-value of the unit drawn in the ith draw and P; be the corresponding selection probability, i =I, 2, ... , n. Then an unbiased n
L y; n P;
·.ypps =..!.
estimator for population total is
(4.1)
i=l
and its variance is V[Y pps]
=..!.
f {.!!._- ~ Y
P;
n i=l
2
P;
(4.2)
J
Y· y. Proof Since the ratio - ' can take any one of the N values - 1 Pi PJ
,
j = 1, 2, ... , N
with respective probabilities P; ,
,
Hence
Y; l ~[yj] -. J= £..J P. pj = y p, j=l 1
ET.fpps l
=E[! I )'~ ]=[!I E[~J] n i=l p,
n i=l
p,
=[( ~ }r] =Y n
Therefore Ypps
=_!_
L 1:!.. is unbiased for the population total under PPSWR.
n •=I P; Since the draws are independent.
[1~
\'•]
V[Ypps] = V - £..J .:..L n i=l P; A
=_I
f, v[lL] Pi
!12 i=t
=[~ i,E[~-YJ 2 ] n
i=l
p,
r
Note that, the quantity [ ;: - Y
y. [ p~ - Y
(4.3)
can take any one of the N values
]2 with respective probabilities Pi, j =1, 2, ... , N
Therefore
.J. . Lp,Y;.- r] = f[y~PJ -r] 2
j=l
2
Pi
(4.4)
58 Sampling Theory and Methods A
Using (4.4) in (4.3) we get V[Ypps]
=-I,{-' - 12 I N
y.
n
p.
t=l
Y
'
J
P,
Hence the proof. • A
The following theorem gives an unbiased estimator of V[Y pps] . A
Theorem 4.3 An unbiased estimator of V[Ypps] is
1 ~{Y;
A
v[Ypps1=
A}2
~ --Y
n(n -1) i=l
P;
P;
Proof By Theorem 4.2,
f {21.- r}
V[Y pps] =.!.
n i=l
P;
N
2
=.!. L Y; n
2
P;
- y2
(4.5)
P;
i=l
If v[f pps] is an unbiased estimator of V[Y pps] then
E{ v[Ypps] }= V[Y pps] 0
A
A2
(4.6)
2
SmceV[Yppsl=E[Yppsl-Y , we have A
E { v[Y pps] Hence Note that
2
}
A2
.,
= E[Y pps]- Y-
A2
(using (4.6))
A
Y =E[Ypps -v(Ypps)]
(4.7)
E{: i Y;~} =I~~
(4.8)
i=J
p,
i=l
,I
Using (4.6),(4.7) and (4.8) in (4.5) we get· A
E { v[Ypps]
}
1 { 1 ~ \'~} 1 A~ A = -E - ~ .:.....!._ - - E[YP-;u- \'lYpps )]
n
n
n i=l P; n
~
2
y· 1 Therefore --E~[Yppsl}= E-.,~ ~--Yppsj n - ~ - n n t=l Pi .. n-1
{ 1
A
E{ v[Ypps ] }-- E{ (
1
n
A.,
~ YT
[ 1 n i=l P;
{ [ii=J
=E
1
n( n- I)
Y; P;
J'!'SllJ!J
1 y: n
-1) - ~-2 - -
ypps \~]l i ;
f
59
Unequal Probability Sampling Y· .2, P· n(n-I). I
Hence
~
1
-
(
II
I=1
Y pps
2
J
A
is an unbiased estimator of V[Y pps] . •
I
A meaningful equal probability sampling competitor to the PPSWR sampling scheme is simple random sampling with replacement. The following theorem gives an estimate for the gain due to PPSWR when compared to simple random sampling with replacement. Theorem 4.4 An unbiased estimator of the gain due to PPSWR sampling as
compared to SRSWR
is~ i[N __I_.] y~
P1 n i=l Proof We know that under SRSWR, V[Y swr
]=N
2
P1
N
(N-l)
Nn
~[Y·-f]2 N - 1 £...J I I
i=l
-r 2 } =~{Nfr? . n I=
(4.9)
I
N
Note that, under PPSWR, unbiased estimators of the quantities Ir? and Y 2 i=l II
2
1 ~ Y;
A
A2
.
are- £...J-.- and E[Ypps -v(Ypps}J respectively. n i=l P; A
Therefore, by (4.9), unbiased estimator of V[Yswr] under PPSWR is 1 A A2 } 1 II Y;2 - { NI--nYpps +-v(Ypps) n p·I i=l n2
(4.10) A
Already we have seen in Theorem 4.I, an unbiased estimator of V[Y pps] is
[~ l
y2 ] y 1 n(n-1) ~ -p~- pps 1=1
I
Subtracting (4.11) from (4.10), one can estimate the gain due to PPSWR as
n12
t[N-;.] ~:
Hence the proof. •
( 4.11)
60 Sampling Theory and Methods 4.2 PPSWOR Sampling Method When probability proportional to size selection is made in each draws without replacing the units drawn. we get a Probability Proportional to Size Sample Without Replacement (PPSWOR). Since the selection probabilities changes from draw to draw, we must device suitable estimators takingjnto account this aspect. In this section, we shall discuss three estimators suitable for PPSWOR.
Desraj Ordered estimator Let t1 =l.L,t2 = Y1 + y 2 0- PI ),t3 = Y1 + Y2 + YJ (1- P1- P2), ... , P1 P2 P3 tn = Y1 + Y2 + ... + Yn-1 + Yn (1- P1 - ... - Pn-1·) Pn where Y; and P;, i = 1. 2, ... , n are as defined in Section 4.1.The Desraj ordered estimator for the population total is defined as 1 n YoR =- L,t; n . I A
•=
l Y0 R =n i=l
Lt; II
A
Theorem 4.5 Under PPSWOR,
is unbiased for the population
total and an unbiased estimator of its variance is n
L (t; -
1
vCYDR)=
n
r) 2
where
t =.!.. L,t; n i=l
n(n - 1) i=l
y y. Proof Note that the ratio - 1 can take any one of the N values - 1 , j= 1,2, .... N PI pj
with respective probabilities Pi . Therefore
J11:.]=f;
....,LPr
r=l
r
Pr
= y Hence t 1 is unbiased for the population total. For r=l,2, ... ,n, E[tr]=E1E2[tr li1,i2, ... ,ir-d where £ 2 is the conditional expectation after fixing the units with labels i 1, i2 ,... , ir-l for the first (r -1) draws. Now E[t r] can be written as
E[t,)=E.[ Y1, +Y12 + ... +Y1,_, +(1-P1,-P;, - ... -P1,_1 )
E{;:
(4.12)
lit,i2,... ,i,_t } ]
Unequal Probability Sampling
v
Note that conditionally, the ratio
.;.L can
61
take any one of the N- r -1 values
Pr
yi , j = 1, 2, ...• N,
Pi
* i 1, i 2 ,... , 1,_1 with probabilities
pj (1-P;1 -P;2
- ••• -P;,_,)
N
· '2 · , ... , 'r-1 · } There tiore E{ Yr Pr I· '•,
=
I
p.J (1-P.·,, -P.·12 - ... -P.·'r-1 )
j=l
(4.13) Substituting (4.13) in (4.12) we get
E(t,) = £ 1 Y,· I
+ Y,· 2 + .... + Y; r-1 + j=l ;~:;, .i2 •..... ,;,_,
=Y
Thus we have , for r = I, 2. ~ .. , n • E[t r] = Y . I n
Therefore E(YoR>=- LE(t;) n. I
•=
I =-nY =Y n
Hence
YDR =!
n
L t;
n .
is unbiased for the population total.
~-
'=A
A.,
")
(4.14) Weknowthat V(YoR>=E(YJ3R)-YNote that E&,ts ]= EtE2[t,t_, I it. i2 , ... ,is-l] (assuming without any loss of generality r<s) =E1[t,E 2 (t_, lit.i2, ... ,(,_t)1 = £ 1[t,Y]
= YE 1[t,] =Y2
E{ n(n-1) I :f,r,t_,} = Y2
(4.15)
r;~:s
Substituting (4.15) in (4.14) we get A
1 n
V(YoR)=E..:. { L,t; }
n .
•= 1
2
-E{
I
n 2 } Ltrts -Y n(n-1) rills
62 Sampling Theory arui. Methods ( 1 n 2 [ 1 =Et +--
~"2Ir l r=l
=
Elr
tt
r~s
1
n 1 'r; - - t 2 J where n(n -1) L -1
£{ n(n
)
ll(n-l)Lrsr
1
r=l
=
":!
]n
1
11
J
1 n t =-
I,r;
n I= . 1
i(tr- r)2}
1 -1) r=l
Hence the proof. • Murthy's Ordered Estimator The Desraj ordered estimator depends upon the order in which the units are drawn. Murthy ( 1957) obtained tht: unordered estimator corresponding to Desraj ordered estimator. For the sake of simplicity, we shall restrict to samples of size 2 only. Suppose y 1 and y 2 are the values of the units selt"'cted in the first and second draws and p 1 and p 2 the corresponding initial selection probabilities. The ordered estimator is y0.2) DR
=..!_[l.L + Yt + Y2 (1- Pt >] 2 Pt P2 1 r1 + Pt =-1
Y2 ] Yt +-(1- Pt) 2 L Pt P2 On the other hand, if the same two units are m the other order then the corresponding ordered estimator is 1+ p rbi =..!_[ 2 P2 1>
2
Y2 +1'!_(1- P2 Pt
>] P(1. 2) == Pt P2 0- Pt)
and
P(2,1) = Pt p 2 .The ordered estimator based on the ordered estimators (1- P2)
yf 1. 2 J
Their
and
corresponding
selection
probabilities
rb'i) is given by y _
y0.2) + y<2.1)
DR DR M - P(1,2) + P(2,1)
(1-p2)1.!_+(1-pt) Y2 Pt P2 =---~----__..;;;._ 2- Pt- P2
are
DR
Unequal Probability Sampling An unbiased estimator of V(YM (1- PI- P2)(1- PI
)
63
Is
)~1.- P2)
[11.-
(2- PI - P'!. )-
PI
Y2 ]
2
P2
Horvitz-Thompson Estimator To estimate the population total one can use the Horvitz-Thompson estimator provided the inclusion probabilities are available. In the case of PPSWOR explicit expressions are not available for inclusion probabilities. With the help of computers one can li~t all possible outcomes when n draws are made and hence calculate the inclusion probabilities. In the following sections some unequal probability sampling schemes yielding samples of distinct units are presented.
4.3 Random group method The random group method is due to Rao,Hartley and Cochran ( 1962). This method makes use of the size infonnation and always yields sample containing distinct units. In this method, the population is randomly divided into n mutually exclusive and exhaustive groups of sizes N 1, N 2 , •••• N n and one unit is drawn from ~ach group with probabilities proportional to size of the units in that group. Here the group sizes N 1, N 2 •••.• N n are predetermined constants. An unbiased estimator of the population total is n
~
Yi YRHC = .£....,-, A
P; where Yi is they-value of the unit drawn from the ith random group and Pi lS the selection probability of the unit drawn from the ith random group. i=l
I
•
Let Yij and X ij be the y and x value of the jth unit in the ith random group for a given partition. Then Yi can take any one of N; values Yij, j = 1,2.... , Ni and X··
pj can take any one of the N i values _.....;,1;......_, j N-
= I, 2, ... , N i; i = l, 2, .... n
!,xij j=l
n
Theorem 4.6 The estimator YRHC =
L4 i=l
is unbiased for population total Y.
Pi
(4.16) Proof E[YRHcl=E1E2[YRHC IGI,G2·····Gnl where £ 2 is the conditional expectation taken with respect to a given partitioning of the population and E 1 is the overall expectation.
64 Sampling Theory and Methods Note that E2['YRHC 1Gt,G2•····Gnl= Ez{i Y';. 1Gt,G2·····Gn} i=( pI
(4.17)
y y. Since the ratio ~ can take any one of the N; values -....;;11- , j = 1,2, ... , N i
Pi
xij N·
~X-·I] L j=l
x ..
with respective probabilities -N-._..;11~, we have
f,x;i j=l
E2(
x I]..
y~ IGi )= f ;~. PI j=l
(4.18)
N·
~ x I].. L
ij
N·
~ x I].. L
j=l
j=l
Substituting (4.18) in (4.17) we get n
N·
E2['YRHC 1Gt,G2, ... ,Gnl = I ! r i j i=l j=l
=Y n
Therefore, by (4.16), YRHC =
L )' ~ is unbiased for the population total under i=t
Pi
random group method. • The following theorem gives the variance of the estimator YRHC Theorem 4.7 The variance of the estimator YRHC is n
I,N;(Ni -1) i=t
N(N -1)
N
~[Yr f=t Pr -
2 r]
p
r
Proof
V[:YRHC] = E1 V2[:YRHC I Gt ,G2, ... ,Gn] + V1E2[YRHC I Gt,G2, ... ,Gnl
(4.19)
We have seen in Theorem 4.6, E2£'YRHC 1Gt,G2, ... ,Gnl = Y Therefore V1E2[YRHC I Gt ,G2, ... ,Gn] = 0 Since draws are made independently in different groups,
(4.20)
Unequal Probability Sampling
65
1Gt,G2 ..... Gn1= V2{t y~ 1Gt,G2·····Gn}
V2[:YRHc
PI
i=l
t V2( y~ IG;)
=
PI
i=l
:L! [y..;~ -T; ]2 P;i N·
n
=
i=J j=l
(4.21)
I)
N;
X;i
where p .. = -~- and T; = N· IJ
!x;~c
LX
The right hand side of the above
ilc .
lc=l
k=l
expression is obtained by applying Theorem 4.1 with n=1.
Claim
y.] L [Y·-P, -Y ]2 P; =L.. [Y·-P -~ P N
N
1 .
. I= 1
I<]
1 • 1
2
P;Pj
1
Proof of the claim n
n
n
n
a·· if a lj.. =a·· .. + 2""' =""'a We know that ""'""'a·· fl ~ lj ~ II ~~ lj i=l
i<j
f[XL-
y1 ] P;P;
i=l J=l
Therefore
ff[Y;i=l j=l
P;
2
y1 ] P;Pi
=
pj
i=l
2
P;
P;
+2f[XLi<j
P;
The above expression can be written as
2""' ~ ~ p. p. - 2""' ""' y. y . =2""' -p. N
N
N y.2
~""' i=J j=J
P;
2
I
~~
}
I
N [ y.
N
}
I
i=J j=J
P.]2 P; P1
2
N [ y. N y. yj ""' 2 ""' Therefore 2~-1-- 2Y = 2~ ~- I=J
Hence
P;
i<j
}
I
Y~ ] P;P1 -Y] P; = ±[Y~f[Y~ .. P, pl .IP, 2
2
I=
~ i<j
I<]
Thus we have proved the claim. Making use of(4.22) in (4.21) we get
1 I
y.
2
y1 ] P;P1
pi
]2 p. p.
_J
p. }
I
}
(4.22)
66 Sampling Theory and Methods
can take any one of the values
i
=
i=l
-l)f[.!!:__y]2 Pr
Nj(N; N(N-1) r=l Pr
(4.24)
Substituting (4.20) and (4.24) in (4.19) we get the required result. • Remark When the groups are all of the same size, then N Nl =N1 = ..... =Nn = n ~
N(N -n)
In such cases we have £...JNi (N; -1) =- - - n i=l
Substituting this in (4.24), we obtain VCYRHC)
= N(N -n)
f[.!!.- r]2
Pr
nN(N -I) r=l Pr
=
(N- n) (N -1)
A
V(Ypps)
(refer Theorem 4.2)
From this we infer that random group method is better than probability proportional to size with replacement whenever the groups are of the same size. The following theorem gives an unbiased estimator of VCYRHC). A
Theorem 4.8 An unbiased estimator of V ( YRHC) is
Unequal Probability Sampling
61
n
LN2-N 1=1
A
v( yRHC ) =
N2
r~ y,:. YA } i ~ , - RHC LN l l i=l pi p i I
n
-
.
i=l
p;
where and P; are as defined in Theorem 4.6 and Theorem 4.2. Proof From Theorem 4.7 we have ,2
-" -'Y j P,. where ).. =L N(N
=A.L
N [ Y
A
V(YRHC)
n N · tN -1) 1
r=1
Pr
y2
N
i=l
.J
-1)
i
=AL-t--r2l {
r=1
'
(4.25)
j
r
Using the argument given in Theorem 4.6 it can be seen that
E{t ~? .}= f ~2 i=l
p,p
r=1
(4.26)
r
If v[YRHC] is an unbiased estimator of for V[YRHC] then
}= V[YRHC]
E{ v[YRHC]
(4.27)
A A2 2 Funher V[Y RHC] = E[YRHC]- Y 2
A
2
A
Hence Y = E[YRHC- v(YRHC )] Using (4.26),(4.27) and (4.28) in (4.25) we get
E{ v[YRHC1
}=
(4.28)
A.E{i: PIP ·'·.?, +v(YRHc>-YiHc} i=l
Solving for v(YRHC), we get as estimator of V(YRHC). A
v('fRHc)=
1 A
rn
2
--1£...--. 1-A.
I~
i=l
)';
P; p
~
2
lI
-YRHCJ
Hence the proof. •
4.4 Midzuno Scheme This is another unequal probability sampling scheme due to Midzuno ( 1952). Let X be an unbiased estimator of the population total X of the size variable x A
68 Sampling Theory and Methods under stmple random sampling. That is.
X = !:!... L Xi n
. The Midzuno sampling
IES
design is defined as
P(s) =
(4.29)
0 otherwise The above sampling design can be implemented by using the following sampling method. To draw a sample of size n. one unit is drawn by probability proportional to size method and from the remaining (N -1) units a simple random sample of size (n -1) will be drawn. Now we shall prove that the above sampling method will implement the sampling design defined in (4.29). Let s = {i 1, i 2 , ... , in} . The probability of getting the sets as ~ample is n
P(s) =
L
(4.30)
P(A;,)
r=l
where P( A; r) is the probability of obtaining the set s as sample with r selected
X·
in the first draw. It is to be noted that P(A;,) = -'-' ( X Therefore by (4.30), P(s) =
N-1
-1
)
n-1
[~ :; ] (N~I) IES
n I
(using X= LX;) ies
From this, we infer that the sampling scheme ·described above implements the sampling design defined in (4.29). The following theorem gives the first order inclusion probabilities corresponding to the Midzuno sampling design. TINorem 4.9 Under Midzuno sampling design the first order inclusion probabilities are N-n X· n-1 TC; =-~+--,i=1,2, ... ,N N-1 X N-1
Proof By definition
TC; =
L P(s) sJi
69
Unequal Probability Sampling -1
~
=N )
L.!.
(•
'" X
=(N)-r !:!.__1 LL x, n X
n
(4.31)
Hi 1e.r
Note that ( 1) the number of subsets of size n containing the label i is (
N-l)
and
\ n-1
(2) the number of subsets of size n containing the label j along with label i is 2 ( N- ). n-2
Therefore, by (4.31),
TC;
=-1 (N-1)-I[(N-IJyi +(N-2}X -X;)] X
n-1
n-1
r
n-2
=1 - (N-I\J'-I[(N-2v N-n}(X; X n-1 n-2 n -1
)l N-n X; n-1 .= 1, 2•... , N =---+--,, N-1 X N-1
+X)]
Hence the proof. • The following theorem gives the second order inclusion probabilities under Midzuno scheme.
Theorem 4.10 Under Midzuno sampling design, the second order inclusion probabilities are (N - n)(n -1) X; + X J (n -1)(n- 2) TC .. = +----'1 (N-1)(N-2) X (N-l)CN-2) Proof By definition,
rr;j
= L P(s) sJi,j
=!:!..._1 IIx, n X
. .. S)I,]IES
+ lJ (n
Note that the number of subsets of size n containing the labels i and j is
( N-2) n-2
and the number of subsets of size n containing the label k along with labels i andj is
(N-3) . _n-3
70 Sampling Theory and Methods
Therefore TC·=-1 1 '1 X I N -I )
l
[(N-2'(x +X·)+rN-3\fX-X·-X·)lJ :
n-2 ;
I
1
I
, n-3
1
n-1
~ c~}C~:~ ~~; }<x, +x +x)]
=
=
1
(N- n)(n -1) Xi +X J (n -l)(n- 2) +-----(N-l)(N-2) X (N-1)(N-2)
Hence the proof. • Thus we have derived the first and second order inclusion probabilities under Midzuno sampling scheme. These expressions can be used in the HorvitzThompsen estimator to estimate the population total and derive the variance of the estimator. Midzuno sampling design is one in which the Yates-Grundy estimator of variance is non-negative.
4.5 PPS Systematic Scheme As in the case of cumulative total method, in probability proportional to size systematic sampling, with each unit a number of numbers equal to its size are associated and the units corresponding to a sample of numbers drawn systematically will be selected as sample. That is, in sampling n units with this procedure, the cumulative totals T;, i = 1.2.... , N, are determined and the units corresponding to the numbers {r + jk}, j = 0, 1, 2, ... , (n -1) are selected, where k = T = ~ and r is a random number from 1 to k. This procedure is known as
n n pps systematic sampling. The unit U i is included in the sample, if T;_ 1 < r + jk S T; for some value of j =0.1. 2, ... , (n -1). Since the random number, which determines the sample. is selected from 1 to k and since X; of the numbers are favourable for inclusion of the ith unit in a sample, the nX·
probability rr; of inclusion of the ith population unit is - - ' provided k < Xi . X It is to be noted that if
~ n
taken as the integer nearest to
is not an integer, the sampling interval k can be
~and in this case the actual sample size differs
n from the required sample size. This difficulty can be overcome by selecting the sample in a circular fashion after choosing the random start from 1 to X instead of from 1 to k. Hartley and Rao( 1962) have considered pps systematic procedure when the units are arranged at random and derived approximate expressions for the variance and estimated variance which are given below.
Unequal Probability Sampling
vdHR > =!_
flL_!J__ r] P, 1
~
[1-ln -l)P;]
P,
n •=i
~~ n n.
1
~ n
[
v· ..
., [ V·
~~ 1-n(p;+P;·)+n~p;-: ...:...!... ___,_ n-(n-1) i=l i'
v(YHR)= .,
71
]2]
It can be shown that even when the units are arranged at random pps systematic sampling is more efficient than ppswr sampling.
4.6 Problems and Solutions Problem 4.1 Derive the variance of Desraj ordered estimator when the sample size is two. . Solution When n = 2 , the estimator YDR can be written as 1
~
yDR = 2 (t I + t 2) = .!_r Yt(1
+PtPt )+ Y2l'
1- Pt
)11
P2 ~ Note that the above estimator can take the values
2L
v ( [ 'i
l
1- P, ]~ · · 1 2 1+ f} + y [ T P.. ~· '· 1 = . . .... N :,· ~ 1· j
P,.p. _' 1 1- P;
with respective probabilities
1ff[ (1+P.·J
A2 Therefore E[YDR1=~~ Y; 4 i=t i=l
.'
P,
(1-P.·]~ 2 _ P,.p.,
+ yi - . '
1
1- P;
p1
i~j
n
Using the identity.
n
n
n
i=l
i~ j
LLaii =La;;+ Laij , we can write •=I j=l
E[YBR1 =.!.
ff[r;(l + I+ ri(1- f1 ]~2~ 1ff[ (1+P;) (1-P;)~ P;
4 i=t i=l
P;
--~~
4 .•. 1 I=
P;Pi
)
1=
1- P;
pi
Y.·- +Y.·I P.,· I P.,·
On simplification the first and second terms reduce to N Y. 2 ( 1 P. ) 2 N N y2
.!.L; 4
i=l
.+;_ P; (1- P;)
N
2
+.!.u-LP/1Li+.!.r2-L 4
i=l
j=l pj
2
1'; 2
1-PI Y. 2
i i=l (1- P;)
72 Sampling Theorv and Methods
y2
N
and
L
respectively.
1
t=l (1-
P;)
Therefore 1 p + ;) ; =_!_I vo'vR) P,·O-P N y2
<
4.
2
N
4
1·)
I=I
N y2
+~[1- L/~ 2 12,-'-· -2Y 2 .
I= 1
.
)= 1
P1-
N
y2
N
4L ;P; ) +2YL',Y;P; i=l (1-
Using
f£... Ylp. =f[ Y;p. - r] J
i=l
i=l
2
i=l
P; + Y 2 in the above expression and simplifying
I
the resulting expression we get
~ -Yl [1-.!.2 f P/]_!_2 f[ j~ P;
V( YoR) =
2
i=l
f[
2
Y;. - Y] P; 2 4;~ P;
P; _.!_
Hence the solution. • Problem 4.2 A finite population of size N is divided randomly into n groups of equal sizes (assuming the population is a multiple of sample size) and one unit is drawn from each group randomly. Suggest an unbiased estimator for the population total and derive its variance. Compare the resulting variance with the variance of conventional estimator under simple random sampling. Solution We know that probability proportional to size sampling reduces to equal probability sampling if the units are of the same size. Therefore, the sampling scheme described in the given problem can be viewed as a particular case of random group method. Hence the results given under the random group method can be used for the sampling scheme given above by taking X 1 =X~ = .... = X N = X o (say). In this case. we have
p .. IJ
=
X··
=
I)
N;
X
0
N;Xo
L,x;k k=l
1 2, .. " N" n · ;:l=1, 2.... ,n =-=-,]=1,
N;
N
1 X0 X; andP.· = - = - - =N NX 0 X 1
Under this set up, by Theorem 4.6 an unbiased estimator of the population total n
N"""
b YA . . IS given y RHC =-£..JYi · . I n I=
Unequal Probability Sampling Further
f[~ -r] r=l
2
73
P, =±[NY, - Nf] 2 ~
r
r=l
N
= LN[Yr-
Y] 2
r=l
Substituting this expression in the variance expression given under the Remark (stated below Theorem 4.7) we get 2 )= N (N-n)
V(Y
N- 1
Nn
RHC
N
___!_ ~[Y·
f:t
-f]2
1
It may be noted that the variance expression given above is nothing but the variance of the expansion estimator under simple random sampling and in fact n
. YA N~ . the esttmator RHC = - ~ y; ts nothing but the expansion estimator under n i=l simple random sampling. •
Probkm 4.3 Show that the Yates-Grundy estimator is non-negative under Midzuno sampling design.
Solution We have seen in Chapter 1, a set of sufficient conditions for the nonnegativity of Yates-Grundy estimator are given by TC;TC j -TCij ~ o. i. j = 1, 2, ... , N; i ~ j Using the expressions given in Theorems 4.9 and 4.10 we have tc;tcrtcij =
[:~~; + ~~~] [ :~~; + ~~~] (N- n)(n -1) X; +X j (N-l)(N-2) X
=
(n -l)(n- 2) (N-l)(N-2l
[(N -n ')2 X;~ j + (N -n)(n-1) X; +X j [ - l___I_J] N-1)
=(N-n) 2 N -1
X*-
N-1
X;Xj +[ (N-n)(n-1) X
2
(N -1) 2 (N -2)
X ·
N-1
N-2
+ n-1 [n-1 _ n-2] N-1 N-1 N-2 [l-
X;+Xj]] . X (N -n)(n-1)
+......;..-~-----
(N -1) 2 (N- 2)
Since the right hand side of the above expression is non-negative, we conc.lude that the Yates-Grundy estimator is always nonnegative. •
74 Sampling Theorv and Methods ~
Probkm 4.4
n
Derive the bias and mean square error of-=; LYi under 1=1
probability proportional to size sampling with replacement.
Solution Bias of the estimator is given by N n
B=-I,E
1=1
N n N
=-LLyipi -NY n I= . I
. I
j=
N N =-n L[Yi- Y]Pi n ]= . I N
= NL[Yj- Y]Pj j=l
N n Consider the difference n .
L )'; - Y"= -Nn L. (Y; - Y) n
1=l
-
•=I
Squaring both the sides and taking expectation on both the sides we get the mean square error as
M= :: [t.E(y f> 1-
N2[n
2
N
+ 2tE(y,- fXYr N
=~ tt~(Y1 ·-Y) 2 P1 +2~0
f>]
]
N2 [
N = - L(Y;f) 2 P; ] n I= . I
Cross product terms become zero because units are drawn independently one by one with replacement. •
Exercises Derive the first and second order inclusion probabilities in PPSWOR when n=2. 4.2 Derive the necessary and sufficient condition for the variance estimator to be non-negative in PPSWOR when n=2 4.3 Suppose the units in a population are grouped on the basis of equality of their sizes and that each such group has at least n units. Then a sample of n units is chosen with ppswr from the whole population and repeated units are replaced by units selected with srswor from the respective groups.
4.1
Unequal Probability Sampling
75
Suggest an unbiased estimator of the population mean and derive its variance and compare with the usual P PSWR. 4.4 If in a s~ple of three units. drawn with PPSWR, only two units are distinct. show that the estimators
_!_[ Y1 3 Pt
+ Y:?. + Y! + Y:?. ] P2
P1
+ P:?.
.
Y1
1- (1- P1 )
are unbiased for the population total.
3
+
.V2 3 1- (1- P2)
Chapter 5
Stratified Sampling 5.1 Introduction In simple random sampling, it has been seen that the precision of the standard estimator of the population total (mean) depends on two aspects, namely. the sample size and the variability of the character under study (refer the expression (2.6) given in Chapter 2). Therefore in order to get an estimator with increased precision one can increase the sample size. However considerations of cost limit the size of the sample. The other possible way to estimate the population total (mean) with greater precision is to divide the population into several groups each of which is more homogeneous than the entire population and draw sample of predetermined size from each of these groups. The groups into which the population is divided are called strata and drawing sample from each of the strata is called stratified sampling. In stratified sampling, samples are drawn independently from different strata and it is not necessary to use the same sampling design in all strata. Depending on the nature of the strata, different sampling designs can be used in different strata. For example, in the absence of suitable size information, simple random sampling can be used in few strata, whereas probability proportional to size sampling can be used in the remaining strata when size information is available in those strata. Notations N :Population size :Number of strata in the population N h :Number of units in the stratum h. h 1, 2, ... , L Yhj : y-value of the jth unit in the stratum h, j = 1, 2, .... N h; h = 1, 2, .... L
=
Sample size corresponding to the stratum h, h = 1, 2, ... , L : Stratum total of the stratum h. h = 1, 2, ...• L
nh :
Yh
Yh : Stratum mean of the stratum h, h = 1, 2, ... , L Yhj :
y-value ofthejth sampled unit in the stratum h, j = 1, 2, .... N h ; h = 1, 2, ... , L
yh : Stratum sample mean of the stratum
h, h = 1, 2, .... L
77
Stratified Sampling 2 1 ~ -2 2 l ~ -2 sh =--L[Yhj -Yh] . sh = L[Yhj- Yh] N II -1 j=l n h - 1 J=l
The following theorems help us to identify unbiased estimator for the population total under different sampling designs and also to obtain their variances. Theorem 5.1 If Yh, h = 1, 2, ...• L is unbiased for the stratum total Yh of the L
stratum , then an unbiased estimator for the population total Y is Ysr = L,rh h=l L
and its variance is V(Ys1 ) =
L V(Yh) h=l
Proof Since Yh is unbiased for the stratum total Yh of the stratum h. we have E(Yh)=·Yh ,h=l,2, .... L L
Therefore E(Ys
1)
= L,E(Yh) h=l L
= L,rh =Y h=l L
Hence Ysr = L,rh is unbiased for the population total. h=l Since samples are drawn independently from different strata, cov(fh. Yk) = 0 A
forh
~
A
k. L
Therefore
L
L
V(Ys1 )= I,V(fh)+2I,I,cov(fh,l\) h=l h=lh
= L,v h=l Hence the proof. • Corollllry 5.1 If v(Yh) is unbiased for V(Yh) , h = 1, 2, .... L, then an unbiased L
estimator of V(Ysr) is v(Ys1 ) =
L v(Yh)
h=l Proof of this corollary is straight forward and hence omitted.
78 Sampling Theory and Methods
Corollary 5.2 If simple random sampling is used in all the L strata. then an
unbiased estimator of the population total is Yst
±t
=
Nh
y hj .
h=l nh j=l
Proof We know that when a simple random sample of size n is drawn from a
L
Y; is unbiased for the population total. population containing N units, N . n IES
Therefore N h
I
Yhj
is unbiased for the stratum total Yh , h =1. 2, ... , L. Hence
nh j=l
LN
~
L
we conclude that
_h
L
. y hj is unbiased for the population total Y (refer
h=l nh j=l
Theorem 5.1). •
A
Therefore by Theorem 5.1 , V ( Yst )
~N~(Nh-nh).,
Sh
=£..J
N hnh
h=l
Hence the proof. • Corollary 5.4 If simple random sampling is used in all the L strata. then an unbiased estimator of V(Ys 1 ) considered in Corollary 5.3 is L N1It ( N h - nh) ~
A
v(Yst )
=£..J
2 5 11
N hnh
h=l
Proof We know that under simple random sampling s 2 is unbiased for S 2 (refer Theorem 2.4). Therefore an unbiased estimator of V(Yh) considered in
Corollary 5.3 is
.,
v( Yh)
2
=
N h (N h - nh) " 5
Nhnh
h
Hence by Corollary 5.1, an unbiased estimator of V (Yst ) is A
v(Yst)
~ N~(Nh
=£..J h=l
Hence the proof. •
-nh)
N hnh
2
sh
Stratified Sampling
79
5.2 Sample Size Allocation Once the sample size n is fixed next arises the question of deciding the sample size nh meant for the stratum h,_h = l. 2, ...• L. In this section some solutions are given assuming that simple random sampling is used in all the L strata. Two popular allocation techniques are (i) Proportional allocation tii) Neyman allocation.
Proportional Allocation Under proportional allocation the number of units to be sampled from the stratum h is made proportional to the stratum s1ze. That 1s, nh oc N h , h = I, 2.... , L ~nh
=kNh where k is the constant of proportionality. Summing both the sides of the above expression we obtain L
L
Lnh=kLNh h=l
h=l
~n=kN
n N n Therefore nh =- N h , h = 1, 2.... , L N The following theorem gives an unbiased estimator for the population total and its variance under proportional allocation. ~k=
Theorem 5.2 Under proportional allocation , an unbiased estimator for the N L ~
LL
population total is ~tr =n h=l and its variance is A
y hj
j=l
N 2 (N -n) L N A
V(Ysr>=
~
£..J-S 1~ "
.,
Nn h=l N Proof We know that under proportional allocation n nh =-Nh ,h=l,2.... ,L N Substituting these values in the expressions given in Corollaries 5.2 and 5.3, we get the required results after simplification. •
The above discussion gives the sample sizes under proportional allocation when the total sample size is known in advance and it does not take into account the cost involved under the allocation. Normally cost will always be a constraint in the organisation of any sample survey. Therefore it is of interest to consider proportional allocation for a given cost. Let ch , h = 1, 2, ... , L be the cost of
80 Sampling Theory and Methods collecting information from a unit in stratum h. (These costs can differ substantially between strata. For example, information from large establishments can be obtained cheaply if we mail them questionnaire, whereas small establishments may have to be personally contacted in order to get reliable data). Therefore the total cost of the survey can be taken as L
C=Co+ Lchnh
{5.1)
h=l
where C 0 is the fixed cost. When the sample size nh is proportional to the stratum size, we have nh =leN h • h = 1, 2, ...• L • (5.2) where k is the constant of proportionality. Summing both the sides of (5.2) with respect to h after multiplying by ch • we get L
L
Lchnh = LchNh h=l
h=l
Using (5.1) in the above expression we get
~
C-C0
C-C0 =k LJchNh => k =-L_ _..;:._ h=l ~ N
LJch h h=l
Therefore the proportional allocation ior a given cost is given by
C-C0
nh =
L
(5.3)
Nh
LchNh h=l
Summing both sides with respect to h • h size as
n= C-Co N L
=1, 2, ... , L
we get the total sample
(5.4)
LchNh h=l
Under the above allocation the variance of the estimator defined in Theorem 5.1 is (5.5)
Optimum Allocation The proportional allocations described above do not take into account any factor other than strata sizes. They completely ignore the internal structure of strata like within stratum variability etc., and hence it is desirable to consider an allocation scheme which takes into account these aspects. In this section two allocation schemes which minimise the variance of the estimator are considered. Since minimum variance is an optimal property, these allocations are called
Stratified Sampling
81
"Optimum allocations''. Note that under simple random sampling the variance of L
Yst
= L Yh
can be expressed as
h=l
~ N;(Nh -nh)
A
V(Ys1 )= ~
N hnh
h=l
2 2
L
=
.!
Sh
L
L Nhsh- LNhsl
nh h=l Global minimisation of the above variance with respect to n 1 , n 2 , ...• n L does not yield ~ non-trivial solution (see what happens when the first order partial derivatives with respect to n 1 ,n 2 •... ,11L are equated to zero). Therefore in order to get non-trivial solutions for n 1 , n 2 ,... ,nL, we resort to conditional minimisation. Two standard conditional minimisation techniques are (i) Minimising the variance for a given cost and (ii) Minimising the variance for a given sample size. The solution given by the latter will be referred to as "Neyman optimum allocation" and the former allocation will be referred to as "Cost optimum allocation". The expressions for the sample sizes under the two types of allocations mentioned above are derived below. h=l
(i) NeymaD optimum allocation As mentioned above under Neyman allocation, the variance of the estimator will be minimised by fixing the total sample size. That is, we need the values of n1 •!'2 ,... ,nL which minimise 2 2
L
L
L Nhsh - LNhsi· h=l
nh
h=l L
subject to the condition
L n1r = n . h=l
To solve the above problem consider the function
L N2s2 h
~=L h=l
:
h
L
:LNhS~+A. h=l
JL
L.nh-n
lh=l
}
(5.6)
where A. is the Lagrangian multiplier. Differentiating the above function partially with respect to nh and equating the derivatives to zero we get
N2s2 h2 h
+A.::<>, ,h=I,2, ... ,L
nh
(5.7) Differentiating the function zero we get
c1»
with respect to A. and equating the derivative to
82 Sampling Theory and Methods L
l.nh =n
(5.8)
h=l
Summing both the sides of (5.7) with respect to h from 1 to L, we get L
L
LNhSh
L ~
~ nh
=
h=l
.[i.
L,NhSh h=l
~
n = --:/"i.=A.~
L
LNhSh Therefore
.fi. = h=l
(5.9)
n
Using (5.9) in (5.7) we get nh =
NhSh
L
n
(5.10)
L,N,sh h=l
The expression given in (5.10) can be used to calculate the sample sizes for different strata. It can be seen that the matrix of second order partial derivatives is positive definite for the values satisfying (5.7) and (5.8). Therefore we conclude that the values yielded by (5 .1 0) minimise the variance of the estimator for the given sample size. Under the above allocation the variance of the estimator reduces to
.!. { LL n
h=l
N S }
h h
nh
L -·LNhS~
(5.11)
h=l A
This expression is obtained by using (5.10) m V(Yst) under simple random sampling. (ii) Cost Optimum Allocation
Under cost-optimum allocation the sample sizes are determined by minimising the variance of the estimator by fixing the total cost of the survey. As in the case of proportional allocation for a given cost, the total cost of the survey can be L
taken as C
=C0 + L,chnh where
C 0 is the fixed cost and ch is the cost per
h=l
unit in stratum h, h = 1, 2, ... , L .
Define
;=
±
N:s:- ±N.s:
h=l
h
h=l
+'A.{co + ±chnh -c}
(5.12)
h=l
Differentiating the above function partially with respect to nh and equating the derivatives to zero we get
Stratified Sampling
NhSh nh = r:; ~ "'A.vch
83
(5.13)
Differentiating the function zero we get
~
with respect to A. and equating the derivative to
L
C0 + .L,chnh =C h=l
L
(5.14)
.L,chnh =C-C0 h=l
Summing both the sides of (5.7) with respect to h from Ito L, we get L
LNhShfc;
L
= h=l
L chnh
r:;
"'A.
h=l L
_L,Nhshfc; ..[i. =.:.:..h=-=-·--L
.L,chnh h=l L
_L,Nhshfc; = h=l
(using (5.14))
Using this expression in (5.13) we get
NS
h h
(C- Co)
rc; nh =.......;;L_ _ _ _ _ ,h =I, 2, ... , L
(5.15)
_L,Nhshfc; h=l
The expression given above gives the optimum allocation of the sample for a given cost. It can be shown that the matrix of the second order partial derivatives is positive definite. Hence we conclude that this allocation minimises the variance for a given cost. The expression given in (5.15) leads to the following conclusions. In a given stratum. take a large sample if : ( 1) the stratum size is larger; (2) the stratum has more internal variation with respect to the variable under study; (3) sampling is cheaper in the stratum.
84 Sampling Theory and Methods
Summing both sides of the equation (5.15) with respect to h from 1 co L, we get the total sample size under the cost-optimum allocation as
(5.16)
The variance of the conventional estimator under the cost-optimum allocation reduces to
{t.NhSh~r L
LNhS~
-
C-C0
(5.17)
h=l A
This expression is obtained by using (5.15) m V(Yst) under simple random sampling. The following theorem compares the variance of conventional estimator under simple random sampling, proportional allocation for a given sample size and optimum allocation for a given sample size.
Theorem 5.3 Let V ran , V prop and Vopt be the variances of the usual estimators under simple random sampling, proportional allocation and optimum allocation for a given sample size. If N h is large then vran ~ v prop ~ vopt
Proof We know that under simple random sampling, the variance of the conventional estimator for the population total is 2
= N (N -n)
V ran
s2
(5.18)
Nn
L N
Note that (N
-l)S 2
=L~[Yhj- ¥']2 h=l j=l
L N
L
h=l j=l
h=l
L N
L
~~ - 2 ~ -2 = .4J2)Yhj- Yh 1 + £..J N h[Yh- Y1
= L~(Nh h=l j=l
L
Therefore s 2
-1)S~ + LNh[~ -f1 2 h=l
L
='L whs ~ + 'L wh [~ - v12 h=l
(5.19)
h=l
(5.20)
Stratified Sampling
85
Nh where wh = - . N This is obtained by using the fac_tthat N h and hence N is large. Using (5.20) in (5.18) we get vran
=N 2 c;n-n)
[±whs; + ±wh[Yh -f1 2 ] h=l
h=l
N 2 (N-n)[L =Vprop +--N-n-I,wh[Yh
-f] 2 ]
(ByTheorem5.2)
h=l
Therefore Vran
~ V prop
(*)
By expression (5.11) we have Vopr
=~ {±Nhsh}
2 -
h=l
ThereforeVprop -Vopt
= =
±N.s~ h=l
N 2 (N-n) L
Lwhs~
Nn
lz=l
{L
1 - - I,NhSh n lr-=1
}2
r
L
+ LNhs: h=l
(N: t. N.s; -! {t.N•S• t.N•Sl +
n)
2 = (N- n) ~ £...JNhSh
n
h=l
N 2 -2 1~ - - S where S =-£...JNhSh n N h=l
L
=!!.. LNh(Sh
n
-S) 2
(5.21)
h=l
Therefore vprop ~ vapr The result follows from(*) and (**).•
(**) L
Note From (5.21)wehave vprop -Vopt= !!..LNh(Sh
n
~
N £...J N h ( S h - S) Therefore V ran = Vopt +-; h=l
-S) 2
h=l
2+ ·[~ - - Y]- 2] £...J Wh [Yh h=l
This expression leads to the conclusion that as we change from simple random sampling to optimum allocation with fixed sample size. considerable amount of precision can be gained by forming the strata such that variance between means and variances are large.
86 Sampling Theory and Methods
5.3 Comparison with Other Schemes (1) Comparison under populations with linear trend Suppose that the population i~ divided (assuming that N nk, n and k being integers) into n strata where the stratum h contains units with labels Gh h - 1)k + j. j l. 2•...• k}. h I. 2•...• n and one unit is selected from each stratum randomly to get a sample of size n.
=
={(
=
=
Under the above stratification-sampling scheme an unbiased estimator for the II
population total is given by Yo=~ LYhl
II
=k LYhl
h=l
h=l
=
=
This estimator is derived from Corollary 5 .2. by taking L n, N h k, nh On applying Corollary 5.3. we obtain the yariance of the above estimator as
Lk II
h=l
2
k
(k -1) 1 L[Yhj- f,.]2 k k -1 j=l /c.
II
which reduces to k LL[Yhj- Yh ] 2
(5.22)
h=l j=l
When the population values are modeled by the relation. Y; =a+bi.i 1. 2•...• N under the stratification scheme described earlier. we have Yhii =a +b[(h-t)k + j]
=
Therefore
-
1
rh =-I {a +b[
k J= . I
=a +b{(h -l)k +
~
yhj -
Y,.
=1.
=b{ j -
k(~;
l)}
(k; 1)}
Squaring both the sides and summing with respect to j from 1 to k. we get
~ ~[Yhj -
- 2 Yh]
~ { 1. 2 + (k +4 1) =bf::
2
·}
- (k + l)J
= b2J k(k + •><2k + •> + k(k + •> 2_k 2(k + •>2}
l
6
4
2
= b2 k(k 2 -1) 12 Using this in (5.22) we get the variance of the estimator Yo as A
Stratified Sampling
87
(5.23) Already we have seen in Chapter 3. for populations exhibiting linear trend •
.,
.,
V(Y ) =b2 n-k-(k -l)(nk + 1) srs 12
(5.24)
=b2 n 2 k 2 (k 2 -
D (5.25) 12 Denoting the variances given in (5.23).(5.24) and (5.25) by Vst, Vran, Vsys, we V(Y
)
LSS
obtain Vst S Vran S V.rys . From the above inequality, we conclude that the stratification -estimation scheme described in this section is better than both simple random sampling and systematic sampling for populations exhibiting linear trend. (2) Comparison under Autocorrelated trend Assuming that the N population values are the realized values of N random
variables having a joint distribution such that EM[Y;]=,u .EM[Y; -,u] 2 =a 2 and
=
EM [Y; - ,u][Yi+u - ,U] p 11 a 2 where Pu 2:!: Pv whenever u < v, we have proved that (Theorem 3.8) 0'2(k -1)N2 [
A
EMV(Ysrs>=
nk
1-
2
N-1
L(N-u)p 11
]
(5.26)
N(N -1) u=l
A
The expected variance of the estimator Yo under the above model is given by EM[V(fo)]=EM {k
=
~~[Yhj-
f.f} 2
a1(k -l)N2 [1-
nJc
(5.27)
~ (k- u)p
k(k - 1) "'u=l
]
(5.2~)
II
(refer Theorem 3.21) Define
"-1
~
L(j)= . f(j-u)p 11 ,j=2.3•... J(J- 1) u=l 2
- 1)N Then E MV(Ysrs ) =a (k nk 2
and
EM V(Y0 ) =a (k ~ l)N
2
(5.29)
2
[1- L(nlc)]
[1- L(k)]
(5.30) (5.31)
Thus in order to prove EM V(Y0 ) S EM V(Ysrs) it is enough to show that L(nk) S L(k) (5.32) Consider the difference
88 Sampling Theory and Methods L(j)- L(j +
I
n=
. _; ](}
-1)
L
(5.33)
If S stands for the summation term in the right hand side of (5.33), grouping together the terms equidistant from the beginning and end, S can be written as m
S = L[2m + 1- 2u][pu - P:!m+l-u 1
if j=2m is even
u=l m
S=
L [2m+ 2- 2u ][p
u -
P2m+2-u 1 if j=2m+ 1 is odd
u=l Since P; ~ Pi+l for all i, every term in S is non-negative. Therefore S is nonnegative. Hence we conclude that L is a non-decreasing function. Therefore L(nk) ~ L(k). This leads to the conclusion that the average variance of the conventional estimator under simple random sampling is larger than the average variance of the estimator Yo introduced in this section. However no such general result can be proved about the efficiency of systematic sampling relative to simple random sampling or stratified sampling unless further restrictions are imposed on the correlations p u . The following theorem is due to Cochran (1946).
Theorem 5.4 If Pi~ Pi+l ~O.i = 1. 2.... , N -2
.al = Pi+2 + P; -2Pi+l ~0.
and i = l, 2, ... , N -2 then EM[V(YLSs )] ~ EM[V(fo)] S EM[V(~,rs)] .
.,
..
.
Furthermore, unless () F = 0, i = 1, 2•.... N- 3.£M [V(YLSS )] S EM [V (YLSS )] Proof As
a? ~O.wehave
Pi+2+P;-2Pi+l ~O.i=l.2, ... ,N-2
By induction it can be shown that Pi+c+l - Pi+c ~ Pi+c- Pi for any integer c. Hence for any integer a,c>O we have i+c+l a+c-1 LPi+c+l- Pi+c i=a
~
L Pi+c- P; i=a
which gives Pa+2c + Pa- 2Pa-t-c ~ 0 Consider the difference EM [V(Yo)]-EM [V(YLSs )]
2a 2 (k l)N 2 2-
Nnk (k -1) nk-1
(5.34)
=
[nk-1 n-1 k-1 ] L(nk-u)pu -k 2 L(n-U)Pku -nL(k-u)pu (5.35) u=l
u=l k
n-1
Note that L (nk- u)pu = L L[nk- (i + jk)]P;+ jk u=l i=l j=O
u=l
Stratified Sampling k-1 n-1
89
n-1
=LL[nk -(i+ jk)]P;-..Jk + L
j=l
k-1 n-1
n-1
= LL(n- j)(k-i)Pi+jk +k L,
j=l k-1 n-2
k-1
LLip jk+i +n L
i=l
(5.36) k-1 n-i
k-1 n-2
Since
L L i(n- j- i)p jk+i = L L i=l j=O
i(n- j) p jk-(k-i)
i=l j=l k-1 n-i
(5.37)
= LL(k- i)(n- j)p jk-i i=l j=l
The expression inside square braces of (5.35) can be written
as
k-1 n-i
k-1 n-i
n-1
LL(k -i)(n- j)p jk+i +LL(k- i)(n- j)p jk-i -k(k -l)L,p jk i=l j=l
i=l j=l
j=l
k-1 n-i
which is equal to
L L (k - i)(n- j)[p jk+i + P jk-i - 2p jk] · i=l j=l
By (5.34) this is clearly non-negative. Therefore EM [V(fo)] S EM [V(YLSs )]. Further from (5.38), it can be seen that the above inequality will be strict if and only if ();
=0, i = 1, 2, ...• N -1.
Hence the proof. •
5.4 Problems and Solutions Nt Problem 5.1 A sampler has two strata with relative sizes W1 =-and N N2 W2 = - . He believes that S1 •S2 can be taken as equal. For a given cost
N
C = c1n1 + c 2n 2 • show that (assuming N his large)
[
Vprop]= Vopt
[W1c1 +W2c2]
[wl~ +W2~r
Solution When Nh is large, V(Ysr) =
±{N~ h=l
N
h -nh N hnh
}s~'
90 Sampling Theory and Methods
t, {Ni[ n~
=
-
~.]}si
_~ N: 5 z -~-
(5.38)
h
h-I nh
For the given cost, under proponional allocation we have nh
=
CNh ciNI +c2N2
,h=1,2
This expression is obtained from (5.3) by taking C0 Substituting these values in (5.38) we get N 12s[ CNI
=
Vprop
and
=
L 2.
N?s? CN2
+
c1N 1 +c 2 N 2
=0
c1N 1 +c 2 N 2
=c1N1 +c2N 2 { N 12S12+ N 225 22} C
N1
N2
= ciNI +c2N2 s2N
(S.39)
c
The above expression is obtained by taking N 1 + N 2 = N and S 1 = S 2 • For the given cost, under optimum allocation we have CNhSh nh
=
r
:;-;
r::- , h =I, 2
N 1SpJCI +N2S 2 "1c2
=
=
This expression is obtained from (5.15) by taking C0 0 and L 2. The variance of standard estimator can be obtained from (5.38) by substituting the sample size values given above. It turns out to be CN1S1 CN2S2
F. ,jc; + N 22s22 NISI ,Tc I,Jc; + N 2S2 ,jc;
v - N 2s 2 opt -
I I N IS l..rc I + N 2S 2
r c= [N1s1..Tc1 +N2Sz,lc;1 C [NISrvc I+ N2S2"/c2]
= [Nisi,Tcl +cN2S2~J
2
S
2
Therefore by (5.39) and (5.40) we get V prop _ Vopt
[w1c1 + W2 c 2 )
-~~F. +W2~r
Hence the solution. •
(S.40)
Stratified Sampling
91
hobMm 5.2 With two strata, a sampler would like to have n1 = n 2 for administrative convenience, instead of using the values given by the Neyman allocation. If V and Vopt ~enote the variances given by the n 1 = n 2 and the
Neyman allocations, respectively, show that the fractional increase in variance V-Vopt
--~
[r-1]
2
n1
where r =-as given by the Neyman allocation. Assume
= -
r+ 1
vopt
n2
that N 1and N 2 is large.
= =-n2 . Substituting
Solutioa Under equal allocation we have n 1 n 2 (5.38) we get (with n
-[2]f
=2)
2 2] i.NI2sl2 +N2S2
v- ;
this in
(5.41)
Under Neyman allocation we have
NS 1 1
n1 =
n and N 1S 1 +N 2S 2 Substituting these values in (5.38) we get Vopt
= ..!_ [NISI + N 2S 2 )2
(5.42)
n
By the definition of r. we have r
= N 1s1 . Using this in V and Vopt
given in
N2S2
(5.41) and (5.42) we get 2 2 2 ., V =-N 2 S2 (r- +I)
(5.43)
n
(5.44)
V -V r Therefore
op V opt
'
=
-1)-.!. Nisi [3:.N:fS:f
(r+l)2]
..;;;....-------------= .!. N f S i (r + 1) 2 n
n
(r-1)2
=--(r+ 1)2
Hence the solution. • ProbMm 5.3
If there are two strata and if ~ is the ratio of the actual
nl
to
n~
the Neyman optimum nl , show that whatever be the values of N 1,N 2 ,S 1 and n2
92 Sampting Theory and Methods . f Vapr S 2 • the rauo o --
never Iess t han
. IS
v
4cP 2 when N 1 and N:! are large. (1+<0)
Here Vopt is the variance of usual estimator under Neyman optimum allocation and V is the variance under actual allocation. Nl2 ., N; ., =-SC + -=-s:r and by (5.42),
. Solutzon By (5.8), V
nl
nl
= ...!_[NISI + N 2 S 2 ]2
V opt
n
1
Therefore
.,
-[NISI +N2S2]-
V opt
v
=n
s
N2 2 I I
2 N;s - 2
+
(5.45)
nl n2 Under Neyman allocation, nl NISI (refer Problem 5.2) n2 N2S2
- = ~
=
N 2S.,nl
-
(5.46)
N1Sin2
1{1 Vopt
-;;
}2
N2S2
+
v-= -+ 1
N1S1
N2
s;
2 -
n1 Nf S12n2 The above expression is obtained by dividing both the numerator and denominator of (5.45) by N12S12 . Substituting the value of ; given in (5.45) in the above expression. we get
!{l+f ::}'
v.., --=-;...__ _;;,;::;.. ._ V
1
-+<& nl 1 -(nl
n
2
n~
-2-
nl n2 .,
+n2~)
(5.47)
:...:..:....----
·2
nl + n2f'
Replacing n by n 1 + n 2 and (nl + n 2<&) 2 by (nl - n 2;) 2 + 4n 1n 2; • the above ratio can be expressed as
Vopt -
-
V
(nl - n2;)2 + 4nln2; W kn h _ e ow t at 2 2 . (nl - n 2;) + n1n2 (1 +";)
Stratified Sampling
44»
vopt
conclude that - - ~
v
(1+<0)
2 .
93
Hence the solution. • L
Probkm 5.4 If the cost function is of the form C
=C0 + Lt h ..r;;; , where C0 h=l
and t h are known numbers. show that in order to minimize the variance of the
estimator for fixed total cost
Solution
nh
must be proportional to {
N::l}
2
3
To find the desired values of nh, we must minimize the function <0 =
±{N:[f--f-]}s~ + A.{co + ±rh_J;;;- c} h=l
h
h
h=l
where A. is the Lagrangian multiplier. Differentiating partially the above function with respect to nh and equating the derivatives to zero we get
N~S~ 2
nh
th
c- =O. h =I, 2•...• L
+
2-ynh
(5.48) Differentiating partially with respect to A. and setting the derivative equal to zero we get L
C =Co+
"'J:/h..r;;; h=l
From (5.48) we have,
(5.49)
94 Sampling Theory and Methods
±{Nis;} h=l
C-Co=
lh 113
113
(using(5.49))
·
A.
±{N~S~ 1
113
A.
1/3
r
h=l th =--~--~----
C-Co Substituting this value in (5.48) we get
(C-C0 ) 2 [N::~ n• = ±[N:s~ h=l
r
]1/3 th
th
It can be seen that for these values of nh , the matrix of second order partial derivatives becomes positive definite. Therefore we conclude that the above v3Jues of nh minimize the variance of the estimator for a given cost. •
ProbZ.m 5.5 In a population consisting of a linear trend, show that a systematic sample is less precise than stratified random sample with strata of size 2/c and two units per stratum if n > 4k + 2 when the first stratum contains first set of 2k k+l units, second stratum contains second set of 2k units in the population and so on. Solution We know that under systematic sampling in the presence of linear trend, the variance of the usual estimator is ..
V[Yus 1=
N2/J2(k2 -1) 12
(5.50)
Under the stratification scheme described above. we have L = ~. N h = 2k for 2 h = l, 2, ... ,~and the labels of units included in the stratum h are given by 2
Gh = {2(h -l)k + j, j = 1, 2, ... , 2k ~ h = 1, 2, ... , ~ 2
Therefore we have -
Hence
Yh
Yhj
=a+ /3{2(h-l)k + j}, j = l, 2, ... , 2k, h = 1. 2, ...• ~
2
1 2k
= -Ia+fJ{2(h-l)k+ j} 2/c ]= .I
=a+
II[
2(h-1)k+
2\+l]
Stratified Sampling
YIV
95
-f. =J'[j- 2/c2+1]
Squaring and summing we get, .
s2 =
~ a2[ 1·- 2k+l] 2
I
2k-I~./J
h
2
j=
= /32 k(2k +I) 6 ., k(2k +I)
2
Since N h = 2k,nh = 2 and S h = {3estimator .as
V31
6
. . , we obtam the vanance of the
±{N:[-I __I ]}s~
=
h=l
nh
Nh
= /32 k 2 (k -I)n(2k + 1)
(5.5I) 6 Comparing (5.50) and (5.5I) we infer that systematic sampling is less precise than stratified sampling if n > 4 k + 2 . Hence the solution. • k+I Probkm 5.6 Suggest an unbiased estimator for the population total under stratified sampling when ppswr sampling is used in all the L strata and also derive its variance. Solution Let X hj be the value of the size variable corresponding to the jth unit
=I, 2, ... , N h; h =1. 2, ... , L, X h be the stratum total of the size variable corresponding to the stratum h and Phj = X hj . If Phj is the P-value of xh in the stratum h, j
the jth sampled in the stratum h, then an unbiased estimator of the population total is Ypt
=±-I-~ YhJ
.This estimator is constructed by using the fact
h=l nh j::al Phj
that _I_
t
y hj is unbiased for the stratum total Yh of the study variable Y and
nh i=l Phj
hence by Theorem 5.I , Ypt = .
L-I ~ L L
Yh· _IJ
is unbiased for the population
h=l nh i=1 Phi
total. The variance of the above estimator is
VfYpr
)=±-I~{yhj h=1 nh j::al
Hence the solution. •
phj
-
rh}2
phj
(refer Theorem 4.2)
96 Sampling Theory and Methods
Exercises 5.1
Derive the variance of the estimator considered in Problem 6.6 under proportional allocation. That is. the sample size is made proportional to stratum size X h rather than the number of units in the stratum h. 5. 2 A random sample of size n is selected from a population containing N units and the sample units are allocated L strata on the basis of information collected about them. Denoting by n" the number of sample units failing L
in stratum h, derive the variance of
LN h=l
h
f,.
(note that N h is known).
N
N
5.3 A population is divided into L strata, stratum h containing N h units from which n 11 , h = 1, 2, ... , L are to be taken into the sample. The following procedure is used. One unit is selected with pp to x from the entire population. If the unit comes from the stratum h, a simple random sample of further nh ·-1 units is taken from theN h -1 units that remain. From the other strata simple random samples of specified sizes are taken. Show that L
LNhYh under usual notations
hZI
is an unbiased estimator of
~.
I,Nhxh h=l
5.4 For the sampling scheme in which the population is split at random into substrata containing N;, i = 1~ 2, ... , n units, and one unit is selected with pp to x from each substratum, suggest an unbiased estimator for the population total and derive its variance. (Compare this with Random group method described in Chapter 4 ).
Chapter6
Use of Auxiliary Information 6.1 Introduction So far we have seen many sampling-estimating strategies in which the knowledge of the variable under study, y, alone is directly used during the estimation stage. However in many situations the variable under study, y, will be closely related to an auxiliary variable x and information pertaining to it for all the units in the population is either readily available or can be easily collected. In such situations, it is desirable to consider estimators of the population total Y that use the data on x which are more efficient than the conventional ones. Two such methods are (i) ratio methods and (ii) regression methods. In the following sections we shall discuss "Ratio estimation".
6.2 Ratio Estimation Let Y and X be unbiased for the population totals Y and X of the study and auxiliary variable respectively. The ratio estimator of the population total is defined as (6.1)
For example, if Y is the number of teak trees in a geographical region and x is ... its area in acres, the ratio ~ is an estimator of the number of teak trees per acre X ... of a region in the population. The product of ~ with X, the total area in acres X would provide an estimator of Y. the total nu~ber of teak trees in the population. The estimator proposed above is meant for any sampling design yielding unbiased estimators for the population totals Y and X. Let P(s) be any sampling design. It may be noted that Ep[YR]= LYRP(s) = sEll
=
L[ ~(s)
sdl X(s)
L[ ~(s) f
lvP(s)
sEll X(s)
}(s)
98 Sampling Theory and Methods Since the right hand side of the above expression is not equal to Y. the ratio estimator is biased for Yunder the given sampling design. The following theorem gives the approximate bias and mean square error of the ratio estimator.
TMorem 6.1 The approximate bias and mean square error of the ratio estimator
_ scYR> =r{[V~) and
]-[cov1~·f> ]}
MSECYR> = y2{[ v:~>
HV~~)
A
]- 2 [ cov~~,f) ]}
A
Y-Y X-X Proof Let e0 = - and e1 = - y X
It may be not
E(e0 ) = E[ Y; y] = 0
{6.2)
=!=[X; X]= 0
(6.3)
(ii) E(e 1)
(iii)
ECeJ l = E[
f; y
r
=
v:~)
(6.4)
2
(iv) E(ef)=E[X-X] - V(X)
(v)
E(eoel) = E[ (f- y~
(6.5)
x2
x
-X)]
= cov:;: i)
(6.6)
Assume that the sample size is large enough so that I e0 k 1 and Ie1 k 1 . This is equivalent to assuming that for all possible samples 0 <X< 2X and ... ... ... ... y 0
= f(l+eo)(1-el +ef - ... ) = Y(1+e 0 -e 1 +ef -e0 e1+ ... ) Using (6.2) and (6.3) and ignoring terms of degree greater than two we get 2
A
E[YR-Y]:YE[e1 -eoed =
r{V(~)- cov(f, X)} (using (6.4),(6.5) and (6.6)) x-
YX
Proceeding as above we get (ignoring terms of degree greater than two)
Use of auxiliary information
99
Hence the proof. •
CoroliDry 6.1 Under simple random sampling, (i)
..
LY; ies
YR=LX;X X· iEs
'
s;
Sxy} N 2 (N-n){s; .. ... - ·+ - - 2 (m)MSE(YR)= XY y2 x2 Nn N 2 (N-n) 2 .. SY Proof We know that under simple random sampling V(Y} = Nn N 2 (N -n) S"'. Substituting these -; Nn x Nn expression in the results available in Theorem 6.1, we get the required expressions. • The following theorem gives the condition under which the ratio estimator will be more efficient than the conventional expansion estimator. . y .. TMorem 6.2 The ratio estimator YR = ....- X is more efficient than the X .. V(X)
=N
2 (N -n)
S
2
..
..
..
and cov(Y, X)=
1
c
s,
s
expansion estimator Y if p >-~where C y =-=-, C x = x and p is the X Y 2 Cy coefficient of correlation. Proof V('Y) > MSE('YR)
>.!_.!_Sx
S ;ry
2XS y
100 Sampling Theory and Methods
Hence the proof. • Estimated mean square error under simple random sampling Note that N
N
~
2~
-,,)Y; - RX; 1 = ..L..[Y; - Y + Y - RX; 12 i=l
i=l
N
= L[Y; -Y+RX
-RX;] 2
(sinceR= y)
X
i=l N
N
~
N
2~
-2
~
-2
-
-
= ..L..[Y; -Y] +R ..L..[X; -X] -2R..L..[Y; -Y][X; -X] i=l
i=l
i=l
Dividing both the sides by (N -1) we get 1
N
~
2·
2
2 2
..L.. [Y; - RX; ] = S y + R S JC
-
2RS rJ
(6.10)
(N -1) i=l
Substituting this in the expression for the mean square error, we get an equivalent expression as 2
MSE(YR>= N (N-n) Nn
1
N
L[f; -RX;] 2 (replacing(6.10))
(N -I) i=l
Therefore a reasonable estimate for the mean square error of the ratio estimate is 2
N (N -n) Nn
~f.. """' I _ ie.r , 1. - RX... 1· 12 where R... --==:--~ x.
L[v
1 (n -1) . ~$
"""'
ie.r
I
The ratio estimator considered in this section is not unbiased for the population total (mean). In the following section of this chapter, few ratio type unbiased estimators meant for simple random sampling are presented.
6.3 Unbiased Ratio Type Estimators Already we have seen that under simple random sampling, the ratio estimator
LY; X . As an alternative to this, it is reasonable to take
takes the form iE.r
..L.. [xY; ]x Y. .·v ,,.. =[ Nn ]~. • 1
as estimator of the population total. Like the ratio
IE$
estimator, the above estimator is also biased for the population total. The ... following theorem gives an expression for the bias of YRo.
Tlu!onm 6.3 The bias of the estimator YR0 is
B(YRQ) = -[N -1]Szx
Use of auxiliary infomuJtion 101 N
where S
zx
=-1 -~[Z· -Z][X· -X],Z· =~ N-1~ ~-1
z~
Proof Taking ..
YRo
'
'
X·
'
'
= Y; . i = 1. 2, ...• N. the estimator YRo can be written as
X,
N~
-
=-~Z;X
n IE$ . .. _
=ZX where
..
N~
Z=-~Z;
n IE$ .
._
The bias of YRo is B(YRo ) = E [YRo - Y]
= E(ZX)-Y N
= ZX -Y where Z =
LZ;
(6.11)
i=l
N
2
Weknowthatcov(Z,X)=N (N-n) 1 L[Z;-Z][X;-X] Nn N -1 i=l
= N2(N -n)
1{fz;X; -z x}
N -1 i=l
Nn
= N 2 (N -n) Nn
1 {Nf -N Z X} using Z; = Y; N -1 X;
2
= N (N- n) 1 B(Y ) (using (6.11) Nn N -1 Ro
...
Therefore the bias of the estimator Y.Rtl is
B(YR ) = 0
Nn(N - 1) cov(Z, X) N 2 (N-n)
(6.12)
.. .. N 2 (N-n) We know that under simple random sampling , cov(f, X) = S xy Nn Making use of this result in (6.12) we get the required expression. •
The above theorem helps us to get an unbiased estimator for the population total as shown below. We have seen in Chapter 2 that
sJCy
1-LCX;- X)(Y;=n -1.
f)
is unbiased for
IIE.r
N
Sry
= __!__
L (X; - X )(f; - Y) . Therefore an unbiased estimator for the bias ,
N -1....I
given in Theorem 6.3 is
102 Sampling Theory and Methods
B(YRo) = -<X;-X) n-1 k.. lES
=
=
-(N-1){~ ~..:..} .£JZ;X; -nZ X) n-1
.
IES
-(N-1){~ ~..:..} .£J Y;- nZ X) n-1
~s
= -n(N-1) [Y
f.. (using Z; = - ' ) X· '
-Z X]
n-1 It may be observed that. if b is an unbiased estimator of the bias of the estimator T (which is meant for estimating the parameter 9) then T-b is unbiased for the parameter 9. Therefore YRO - B(YRO) is an unbiased estimator of the population A
total. That is, YRO +
n(N -1)
n-1
~
~..:..
· [Y - Z X] is unbiased for the population total Y.
Thus we have obtained an exactly unbiased ratio-type estimator by considering the mean of the ratios of Y; to X; ( instead of the ratio of sum of
Y; to sum of X; ) to form the estimator and correcting for the bias. The above estimator is due to Hartley and Rao ( 1954 ). In the following section another corrected estimator is presented.
6.4 Almost Unbiased Ratio Estimator Suppose a sample of size n is drawn in the form of m independent sub-samples of the same size, selected according to the same sampling design and Y; and
X;, i =I. 2, ...,mare unbiased estimates of the population totals
Y and X based
on the m subsamples. The following two estimates can be considered for Y: A
A
Y1
y
=-:-X
(6.13)
X
1 m
where
Y=-LY; m. I •=
A
and
1 m
X =-LX; m. I •=
and
(6.14)
Use of auxiliary information 103 where
f..
r; =--:1-.
X· Under the usual' assumptiol16 (stated in the proof of Theorem 6.1 ), the bias of the estimator Y1 is A
B1 =Y[RV(X)-cov(X,Y)]. =
(byTheorem6.1)
r[Rv{~ Ix;}-cov{~ ~X;.~ ~r;}] r=l
r=l
r=l
mr
1 A A A] = 2YLtRV(X;)-cov(X;.Y;)
m
i=l
1
m
=-r LB(r;) m
i=l
where B( r; ) = Y[ RV (X; ) - cov( X; , Y; )1
(6.15)
A
and the bias of the estimator Ym is A
Bm=B(Ym) 1 m
(6.16)
=-I,B(r;) m. l
•=
Comparing (6.15) and (6.16) we get mBl = Bm
(6.17)
This shows that the bias of the estimator Ym is m times that of be seen that Bm -
Bt = E[Ym -
Y]- E[ft
-
Y1 • Further it can
Y]
= E£Ym- ft1 Therefore E[Ym- Y1] = (m -1)81. H ence
[fm- fd .IS an un b"1ased estimator . of
B
1•
m-1 Thus after correcting the estimator Y1 for its bias, we get an unbiased estimator for the population total A
YAu = Yt- £Ym- ft1 = [mft- Ym1
(6.18) m-1 m-1 Since the estimator given above is obtained by correcting only the approximate bias (not the exact bias), it is known as "Almost Unbiased Ratio-Type Estimator".
104 Sampling Theory and Methods
.6.5 Jackknife Ratio Estimator As in the previous section, here also it is assumed that a simple random sample of size n is selected in the form of m independent subsamples of k units each.
Y
A
Let Y1 =-A X
..
X y
1 m
Y=- I,r;
where
m~ 1
i(i)
YRi --X (i) X where
y
and
i
and
L i; . Further denote
1 m =-
m;~
by
are unbiased estimators of X and Y
YRi
obtained after omitting the ith subsample. That is,
.
is the ratio estimate
.
computed after omitting the ith subsample. Combining Y1 and YRi , Quenouille (1956) suggested the estimator ..
..
m
m-1~
..
Ya =mY1 ---~YR; m
.
•=
(6.19)
1
The above estimator is popularly known as Jackknife ratio estimator. In the following theorem it is proved that the above estimator 1s also approximately unbiased. A
Theorem 6.4 The estimator
A
m-1 m A
Ya =mY1 ---LYR; m .1
1s approximately
•=
unbiased. A
m-1~A
A
Proof The bias of the estimator Ya =mY1 ---~YR; is m .1
•=
B(YQ) = E(YQ)- y m-1
A
~
m ] -Y =E [ mY1 ---LYR; m .1
•=
m-1
A
A
m ] +(m-1)Y =E [ m(Y1 -Y)---LYR; m . 1
•=
m-1
m = E [ m(Y1 -Y)---I,(YR; -f)] A
m
=
..
m-1
A
•=. 1
m
~
A
mB(Y1)---~B(YR;)
m
.
•=
(6.20)
1 A
We have seen in the previous section that the bias of the estimator Y1 is_ A
B(Y1)
1 m =-I, B(r;) A
m. 1
•=
Use
of auxiliary infornultion
105
where B(r;) is as defined in (6.15). Since the subsamples are drawn independently and they are of the same size, B(f;) = B0 (constant) for i = 1. 2, ... , m. 1 Hence B(Y1 ) = - mBo A
m2
Bo m
=-
(6.21) 1
m
--I, X j m-1 .
Note that for each i,
1 I m -1 . m
A
and
)=1
A
Yj
are unbiased for the
)=1
-;
-; population totals X and Y respectively. Therefore by Theorem 6.1, the bias of the estimator YRi is B('YRi) = Y[RV(X (il)- cov(X(i). y
=Y
Rv[ ~ ml.f,xi ]= 1
~ fxi ].[ ~ ml.fri J J
J-cov( [ ml.
~
]= 1
]= 1
~
~
(in -1)80
(m-1) 2
=
Bo
(6.22)
(m-1)
Substituting (6.21) and (6.22) in (6.20) we get • B0 m-1 B0 B
6.6 Bound for Bias In the last three sections. we have seen unbiased ratio-type estimators. In this section, an upper bound is presented for the bias of the ratio estimator. We know that the bias of the ratio estimator YR is B[YR)=·E[YR]-Y
and cov(YR.X)=E[YRX]-E[YR]E[X]
106 Sampling Theory and Methods
=, ~
X ]-E[fR )E[i] ...
...
=X E[Y]-E[YR ]X
= XY -E[YR]X ... = -XB [YR 1 (using (6.22)) Therefore
cor(YR, X)< SD(YR )SD(X) =-X ~YR]
~SD(YR)SD(X)S XIB[YR11
IB[YR~
SD(X) ... S - - SD(YR) X The above bound is due to Hartley and Ross(1954). Hence
6.7 Product Estimation We have proved under simple random sampling, the ratio estimator is more precise than the expansion estimator when the variables x and y have high positive correlation. In fact, it is not difficult to see under any sampling design, ...
..
. . ...
I [C(X)] .. SD(f) where C(f) = and 2 C(f) Y
YR is more efficient than Y if p(X, f)>- - -...-
C(X) = SD(X) . This shows that if the correlation between x andy is negative, X the ratio estimator will not be precise than the conventional estimator. For such situations, Murthy( 1964) suggested another method of estimation, which is ... ... expected to be more efficient than Y in situations, where YR turns out to be less efficient than Y. In this method, termed "Product method of estimation, the population total is estimated by using the estimator ... .. y .. Yp =-:-X (6.24) X ... Since the estimator uses the product YX rather than the ratio ~ , it is known as X product estimator. The following theorem gives the exact bias and approximate mean square error of the ratio estimator.
Theorem 6.5 The exact bias and the approximate mean square error of the product estimator are given by ...... B[Yp]= cov(X,Y) X
I 07
Use of auxiliary information and
Proof Using the notations anC:l assumptions introduced in Theorem 6.1, the ... estimator Yp can be written as
yP = y (1 + eo )(1 + el ) =YO+ e0 + e1 + e0 + eJ) Therefore Yp - Y = Y[eo + e1 + eo + e1] Taking expectation on both the sides of (6.25) we get
Yp- Y =YE [eoed
(6.25)
(since E(eo) =E(e 1) =0)
=Y[ cov~i)] ~coveY, X)
(6.26) X Squaring and taking expectation on both the sides of (6.25) and ignoring terms of degree greater than two, we get the approximate mean square error
E[Yp- Y]
2
= Y 2 E[e5 +ef + 2eoel 1
=r2{ [v:~>
Hv~~) ]+{cov~i>]}
= V(Y)+ R 2 V(X)+ 2R coveY, X)
(6.27)
Hence the proof. • ... The following theorem gives the condition under which the estimator Yp
will be more efficient than Y . Theorem 6.6 The product estimator Yp is more efficient than Y if
. .
p(X' f)<
1[C(X)] C(f)
-2
Proof Left as an exercise.
Theorem 6.7 Under simple random sampling
unbiased for the population total. Proof We know that
. . [N (N-n)]
under
. [N (N-n)](sry)
Yp +
simple
2
Nn.
random
X
is
sampling
2
cov(Y, X)=
Nn
Sry .Therefore the true bias of the _product estimator
108 Sampling Theory and Meflrods ] . 1e ran dom samp1"mg ts . - [ N 2 (Nunder s1mp Nn n)
s s·mce s.cy ts. unb"1ased X. XV
for S ry under simple random sampling, an unbiased estimator of the bias of the & . the prod uct . . - [ N 2 (N- n)] -s xy . Th ere,ore prod uct esttmator ts a,ter ad.JUSllng Nn X &,
• & • b.1as we get estimator ,or tts
y'"'
p
+ [N2(N Nn
-n)](sry) X
. as un b.1ased est1mator of
the population total. It may be noted that the above estimator is not an exact unbiased estimator. •
6.8 Two Phase Sampling The ratio and product estimators introduced in this chapter assume the knowledge of the population total X of the auxiliary variable x. However there are some situations where the population total of the auxiliary variable will not be known in advance. In such cases. two-phase sampling can be used for getting ratio or product estimator. In two phase sampling, a sample of size n' is selected initially by using a suitable sampling design and the population total X is estimated and then a sample of size n is selected to estimate the population totals of the study and auxiliary variables. The second phase sample can be either a subsample of the first phase sample or it can be directly drawn from the given population. The sampling designs used in the first and second phases need not be the same. Depending on the situation, different sampling designs can also be used. Generally, two-phase sampling is recommended only when the cost of conducting first phase survey is more economical when compared to that of the second phase. Let
XD
be an unbiased estimator of X based on the first phase sample and
X. Y be unbiased estimators of X.
Y based on the second phase sample. Then the ratio and product estimators based on two-phase sampling are A
A
y
A
YRD =--::-X D
(6.29)
X
and ......
... YX (6.30) Ypo=X The following theorems give the approximate bias and mean square error of the ratio and product estimator under different cases of two-phase sampling. Theorem 6.8 (i) When the samples are drawn independently in the two phases of sampling the approximate bias of the ratio estimator is
~YRol = r{[v~~)
]-[cov~,f)
]}
Use of auxiLiary information
I 09
(ii) When the second phase sample is a subsample of the first phase sample. the
H
approximate bias of the ratio estimator is
mYRol=Y{[ V~~) ]-[ cov~.Yl ]-[ cov(:~Xol covc~i ol ]} When the samples are drawn independently in the two phases of
Proof
sampling cov(X, X 0 ) =0
(6.31)
cov(f, Xo)=O
(6.32)
Y-Y y
X-X
Xd-X
X
X
Let e0 = - , e1 = - - and ed = ---==---It may be noted that (i)
E(eo) =,
Y; y] = 0 ,
(ii)
E(e 1 ) = , i ; X]= 0
&l
(iii) E(ed )= Jxd-X] X . =0 (iv) E(e 0., )= E[Y-Y] -Y2
(v) E(e 1 ) .. ) E(
(vn
.. E
(vn)
IX
V(Y) -f2 2
Jx-X] 2 V(X) ., Jx 0 -X] V(X 0 > = &;Ol X = (vi) E(ed) = &;Ol X = X2
"""i"2
J (f- Y)(X- X)]
)
eoel = &;Ol
(eoed) =
( . ) E(
2
YX
.
=
cov(f, X)
YX
E[ YX = YX
)_J<X-X)(Xo-X>]_cov(X,X 0 )
eled - &;Ol
YX
-
YX
The ratio estimator can be expressed in terms of e0 , e1, ed as follows :
YR =Y(I+e0 )(1+e1)- 1(1+ed) = Y(l+e 0 )(1+ed)(l-e 1 +ef - ........ ) = Y( I +eo + ed + eoed )(1- e1 + e12
.,
-
e13 + ........ )
= Y(l- e1 + ej +eo- eoe 1 + ed - e1ed + eoed) (ignoring terms of degree greater than two) (6.33) This implies YRD - Y = Y(e 0 - e1 + ed + e[ - e0 e1 - e1ed + e0 ed) Taking expectations on both the sides of (6.33) and using expressions above we get when the samples are drawn independently drawn, the approximate bias
"""X"2 -
of the ratio estimator as B[YRD] = Y{[V(X >] [cov(X, XY f)]} •
•
A
II 0 SampLing Theory and Methods
The bias of the ratio estimator when the second phase sample is a subsample of the first phase sample can be obtained by taking expecWions on both the sides of(6.33) as
~YRol =r{[ V~~) ] - [ cOv~,Y) ]-[ cov(;~Xo)
Hcovc~io)
]}
Hence the proof. •
Theorem 6.9 (i) When the samples are drawn independently in the two phases of sampling, the approximate mean square error of the ratio estimator is
MSElYRol= y2{[
HH
v;~> v~~) v~:> ]-{ cov~~·Y> ]}
(ii) When the second phase sample is a subsample of the first phase sample, the
approximate mean square error of the ratio estimator is
MSElYRol=Y2{ [ v~~)H v~~) Hv~~>]-{ cov~,f>]
_{ cov(; ~X 0 )] + { cov(~Xo)] } Proof of this theorem is left as exercise.
Theorem 6.10 (i) When the samples are drawn independently, the approximate bias of the product estimator in two phase sampling is
H r{[V~:>H cov~~·
~fpo)=r{[v~~) cov~~·f>]} (ii) When the second phase sample is a subsample of the first phase sample, the
approximate bias of the product estimator is
B[Ypo] =
Yl ]-[cov(:~Xo) ]-[ cov(~i 0 ) ]}
Proof: Proof of this theorem is left as exercise.
Theorem 6.11 (i) When the samples are· drawn independently in the two phases of sampling, he approximate mean square error of the product estimator is
MSElfpo]=r2{[ v:~>
Hv~~) Hv~:>]+{cov~,Y)
]}
(ii) When the second phase sample is a subsample of the first phase sample, the approximate mean square error of the product estimator is given by
Ill
Use of auxiliary infomuJtion
MSET.Y Po
1= r 2 {
V(Y) + V(X) + V(X o> +
x2
y2
x2
+{cov(X,f) _ cov(X.X 0 ) + cov(Y,Xo>]}
x2
XY
XY
Proof of this theorem is left as exercise. The theorems stated above are quite general in nature and they are applicable for any sampling design. Now we shall develop the approximate bias and mean square error of the ratio estimator under the two cases of two-phase sampling when simple random sampling is used in the two-phases of sampling. Towards this we observe the following : Let s' and s be the samples obtained in the two-phases of sampling, where in the first phase a large sample of size n' is drawn to estimate the population total X and in the second phase a sample of size n is drawn to estimate the population totals X and Y both by using simple random sampling. Here n is assumed to be small when compared to n '.
NL . X=-N~ X·' andY=-NL X·' n . n .
Let X 0 = X· n, . . ' A
A
A
IE.f
IE.f
IE.f
Note that when s is a subsample of s'. N
(i) E(Xo>=
~X;
(6.36)
i=l
(ii) E(i) = EtEII[i Is')= EtEII[.;.
=E[N (N -n·)~x{ ~ )]=x vcid =[N Js; 2
1
(iiil
~Xi Is']
>
2
(6.37)
(6.38)
(iv) V(X) = E1V11 [X I s']+V1 E11£X Is']
= E1 [ N
2 (n'-n)] 2
n'n
A
s" +V1 (Xd)
;,.">]s;
= [ N2 (N
(6.39)
Cv> vcYd >=[N 2 ]s~
(6.40)
112 Sampling Theory and Methods (vi) cov(X.Xd>=E [XXd]-E[X]E[Xdl A
A
I
= E 1 E 11 [XXd Is ]-X
2
... ... 2 =E 1 [xdxd 1-x
=[N2 (N;,">Js.~ (vii) cov('f, X d)= [N 2
(6.41)
(N- n)Js xv Nn
(6.42)
·
(viii) cov(f, X)= E[XY]- XY
...... = E 1 E 11 [ XY I s']- XY = E1[N 2 =
Yd
]-xr
[N 2 (n'~n>Js xv + E1 [x dyd ]- XY nn ·
= [N 2
(:~:) ~_., +cov(Xd, Yd)
=[N2
(n'~n>Jsxv +[N2 (N-~')]sxv
= Here
(:~:) s'_.,+YdXd
nn
·
Nn
·
[N 2 (Nn)Js Nn ry
and s~ are the analogues of
Y and
(6.43)
s ry respectively based on the
samples'. When the samples are drawn independently, the results derived in simple random sampling can be used directly without any difficulty. The following theorem gives the approximate bias and mean square error of the ratio estimator when simple random sampling is used in both the phases. Theorem 6.12 When simple random sampling is used in both the phases of sampling and the samples are drawn independently
;,"> ]r(c..
B[fRol =[N 2 (N
-C_.,]
and
MSE[Y
RD
]=[N2 (N-n>Jy2rc +C -2C 1+[N2 (N-n')]y2c Nn ~ yy .u ry Nn' .u
s2
s;
sxy
whereC.u =-L,Cyy =-and Cry= XY. x2 . y2 TMorem 6.13 When simple random sampling is used in both the phases of sampling and the second phase sample is a sub sample of the first phase sample
Use of auxiliary information [
A
B{YRol= N
2
113
] r (n'-n)] n'n YLC.u -Cxy
and
N [(N-Nnn)](c·· +C.u -2C..., l+[N
MSE[YRo) = 2 Y2
vv
2 (N -n'
··J
Nn'
>][c
r'\1
-C.u]
-J
The above theorems can be proved by applying the expressions given in (6.36)(6.43) along with Theorems 6.8- 6.11.
6.9 Use Of Multi-Auxiliary Information There are many situations in which in addition to the study variable, information on several related auxiliary variables will be available. In such situations, the ratio estimator can be extended in several ways. In this section. one straight forward extension due to Olkin ( 1958) is considered. Let X; be unbiased for X;, i = I. 2, ... , k the population total of the ith auxiliary variable and Y be unbiased for Y , the population total of the study variable. Olkin (1958) suggested a composite estimator of the form
YR/c
±w;[ ~- Jx;
=
(6.44)
X,
i=l
lc
where W1, W2 , ...• W1c are predetermined constants satisfying
L W; =I. i=l
Note that if k=2, the above estimator reduces to y y YRl =W1 -::-X I+ W2 -A-X 2 A
A
A
(6.45)
x2
x1
where W1 +W2 =I. The following theorem gives the approximate bias and mean square error of the estimator YRl . In order to make the expressions compact, the following notations are used.
v. _V(Y) v _V(X 1 > v o- - 2- , 1 2 • 2
x1
Y
C
_ cov(f,X2) C
02 -
YX 2
,
_V(Xz) -
x2
2
c
_cov(Y,X 1)
• 01-
YX
1
•
_ cov(X 1,X 2 ) 12 -
X IX 2 A
Theorem 6.14 The approximate bias and mean square error of YRl are
B(YR 2 ) = Y{V2 - C02 + Wt(C02 - C01 ·- ~2 )} and
114 Sampling Theory and Methods
Y-Y Proof Let eo = - y - . ... The estimator YR2 can be written as
YR2 = W1Y(l + e0 )(I+ e 1)- 1 + W2 Y(1 + e0 )(1 + e 2 )-1 =f{Wt(l+eo)(l-el +ef2- ... )+} w2 (I +eo )(l-e2 +e2 - ... ) = r{wl (1-el +ef -.eo -eoel + ... )+
}
(l-W1)(l-e 2 +ei +eo -e0 e 2 + ... ) = r{(l-e2 +el +eo ~eoe2 ... )+ w. (1-e, +ef -.eo -eoel···} -l+e 2 -e2 +eo +eoe 2 ... )
... Therefore YR2
(eo -e2 +e22 -eoe2 ... )+ } (6.46) ., ., W1(e 2 -e1 +ej -e0 e1 -ei +e0e 2... ) Taking expectations on both the sides after ignoring terms of degree greater than two,.we get the approximate bias (6.47) B(YR2)=Y{V2 -Co2 +Wa(Co2 -Coa-V2)} Squaring both the sides of (6.46) and taking expectation, on ignoring terms of degree greater than two, we get the approximate mean square error -
Y = Y{
MSE(YR2)=Y2{Vo.+V2 -2Co2 +Wa2(V2 +Va -2Ca2)-} 2Wa (Coa +V2 -Co2 -Ca2> Hence the proof. •
(6.48)
RemtU'Ic Note that the mean square error given in (6.48) attains minimum if
w, = Coa +V2 -Co2 -C12 v2 +Va-2Ca2 The minimum mean square error of the estimator (6.49) in (6.48) is
(6 .49)
YR2
obtained by substituting
y2{vo +V2 -2Co2- (Coa +V2 -Co2 -Ca2>2} V2 + v1 - 2c12
(6.50)
It is pertinent to note that the denominator of the expression is nothing but the variance of the difference e 1 - e 2 • Therefore the minimum mean square error given in (6.50) is always less than or equal to Y.2{vo + V2- 2Co2} which is nothing but the approximate mean square error of the ratio estimator based on the auxiliary variable x 2 . Therefore we infer by using the additional auxiliary variable the efficiency of the ratio estimator can be increased. It is to be noted that the optimum value of W1 given in (6.49) requires the knowledge of some
115
Use of auxiliary infomuztion
parametric values which in general will not be known in advance. The usual practice is using their estimated values.
6.10 Ratio Estimation in Stratified Sampling When the sample is selected in the form of a stratified sample, the ratio estimator can be constructed in two different ways. Let Yh and Xh. h = l, 2, ...• L be unbiased for the population totals Yh and X h • hth stratum totals of the study and auxiliary variables respectively. Using these estimates, the population total can be estimated by using any one of the following estimates : L ... (6.51) YRs = ~L xh h=l XL
L
L
I,rh X
YRC- h=l L A
-
(6.52)
I,xh h=l
... The estimates YRS and YRC are known as separate ratio estimator and combined ratio estimator respectively. The separate estimator can be used to estimate the population total only when the true stratum total X h of the auxiliary variable is known for all strata.
Theorem 6.15 The approximate bias and mean square error of the separate ratio L {[V(X ~ estimator are B[YRs] = I,rh il=l Xh •
A
and MSE[YRs 1=
>] - [cov(X h , Yh >]}
± !{[V(Y~ >]+[V(X~
X hyh
yh
>]-2[cov(X h• yh >]} h=l Yh Xh XhYh Proof Bias of the estimator under consideration is
B[YRs] = E[YRs ]-Y =
~£{[~: xh ]-rh} ~
h)] _[ cov(X h. Yh)]} (using Theorem 6.1) __ ~ yh {[ V ( X 2 h=l Xh XhYh Hence the proof. • The mean square error of the separate ratio estimator is ...
...
MSE[YRs] = E[YRs -Y]
2
116 Sampling Theory and Methods
=~E{[ =
;: Xh ]-Yh
r
±rh2{[V()~~i]+[V(X~)J-{cov(Xh,t\>]} Yh
h=l
(6.54)
X h yh
Xh
The above mean square error is only an approximate expression and it is obtained by applying Theorem 6.1 under the assumptions stated in the same theorem. L
L
The combined ratio estimator is constructed by using L i h and l',rh as h=l h=l estimates for the population totals X and Y respectively. Therefore the approximate bias and mean square error are L L L V co I,ih•Lyh h=l y h•l h=l B[YRc1=Y~:.;;;.:....A
XY
X2
(6.55) L
V
L
l',cov(Xh• Yh) 2 h=l
XY
(6.56)
respectively. The expressions given in (6.53)-(6.56) are applicable for any sampling design. In particular, if simple random sampling is used in all the L strata then they
~
A
reduce to (i)B[YRS 1= .LJ Yh h•l A
~
2
(ii)MSE[YRS 1= ~ Yh h=l
N:(Nh -nh)
{C .u1a - C zyh} and
Nhnh
N:(Nh -nh) Nhnh
{C .ah + C yyh - 2C zyh}
117
Use of auxiliary infomuztion
6.11 Problems and Solutions l'roiMrrt 6.1 Consider the" estimator
Ya =
r[;
r
which m!uces to the ratio
estimator when a = 1 and the conventional expansion estimator Y if a = 0. Derive the approximate bias and mean square error of the above estimator and also the minimum mean square error with respect to a (Shrivastava, 1967). Solution Using the notations introduced in Section 6.2, the estimator
Ya =
r[;
r
can be written as
Ya = y (1 + eo )(1 + el ) -a
=Y(l+eo){l-ael + a(a2+1) e,2 - ..
J
=Y{ l-ae 1 + a(a+l) e12 +eo -ae0 e1 + ...... .} 2 Therefore Ya-Y= Y e0 -ae 1 + a(a+l) e 12 +eo -ae0 e1 + ...} 2 A
{
(6.57)
Taking expectation on both the sides after ignoring terms of degree greater than two, we get the approximate bias as 1) B=Y { a(a+ E(e 12 )-aE(eoe1) } 2
=f{a(a+l)V(X) acov(X,f)} 2 x2 XY
(6 .58 )
Squaring and taking expectations on both the sides of (6.57) after ignoring terms of degree greater than two, we get the approximate mean _square error as M =[E(e5)+a 2 E(ef)-2aE(eoet>1
= r 2 {V(f) +a 2 V(X) -2a cov(f. X)} y2 x2 XY
(6 .59)
By employing the usual calculus methods, we note that the above mean square error is minimum if a=~ cov(X. f)
Y
XY
(6 .60)
Substituting (6.60) in (6.59), we get the mtmmum mean square error as V(f)(l- p 2 ) where p is the correlation coefficient between
Y and X. •
118 Sampling Theory and Methods A
...
A
A
X
Probkm 6.2 Consider the estimator Ya = aY +(1-a)Y- which reduces to X the product and conventi
A
X
Ya =aY +(1-a)Y X A
A
A
=ay+(l +eo)+(l-a)Y(l +eo) (1 +e 1) = Y +(I +eo)+[a + (1-a) O+e1 >] =Y[1+e 1 -ae 1 +eo+eoe 1 -aeoe 1] Therefore fa-Y= Y[eo +(I +a)(el + eoe1 >] Taking expectations on both sides. we get the bias of the estimator f)] 8 = y (1-a) [cov(X, XY Squaring both the sides of (6.61) and taking expectations after ignoring terms of degree greater than two, we get the approximate mean square error as
M = y2{V(f) + ( 1 -a)2 V(X) + Z(l-a) cov(X, f)} y2 X2 XY It can be seen that the above mean square error is minimum if a=
I+[~ I"ov~~· Y>]
and the minimum mean square error is V ( f)(l- p 2 ) where A
correlation coefficient between
p as the
A
Y and X . •
Probkm 6.3 Derive the approximate bias and mean square error of the estimator af
~
and also find the minimum mean square error. X Solution The given estimator can be W..ilten as A
A
X
Ya =aY-A X =aY(l + e0 )(1 + e 1)- 1 =af(l+e0 )(1-e 1 +e12 - ... )
.,
=aY(l-e 1 +ei +eo -e0e 1 + .. ) A
2
Therefore Ya-Y= (a-l)Y -aY(eo -e 1 +e 1 +eo -eoe 1 + ... ) Taking expectations on both sides of the above expression we get the bias of the estimator as
119
Use of auxiliary information (a-l)Y-ar{'vci) _ covci.Y) ~ x2 XY J
Proceeding in the usual way we get the mean square errt>r of the estimator as 1 2 +a 2y2{V(f) M=(a-) - + V(X) - 2 cov(f, X)} y2 x2 XY _ 2a(a -nr 2{V(X) _ cov(X. f)}
x2
XY
Minimising the above mean square error with respect to a and substituting the optimum value in the mean square error expression we get the minimum mean square error. • Problem 6.4 Letx1 and x be samples means in two phase sampling when samples are drawn independently and simple random sampling is used in both the phases of sampling. Show that the estimator -. _ n(N- 1) (- _) y =rx1 + y- rx N(n -1) is unbiased for the population mean where y is the sample mean of the study variable based on the second phase sample. Solution Since the samples are drawn without replacement and independently of each other in the two phases of sampling we have
E[y*]=E[r]E[X't1+ n(N-I) E(y-rx) N(n-1)
Note that E[rx] = cov(r,x)+ E[r]E[x] N
N-n 1 ~ -= - - - ~(R; -R)(X; -X)+R X Nn N-1. 1
,.
N
=
N-n - (n-1)N-1~ Y.· Y+ R X where R =-~R;.R; =-'
n( N
-
n( N
1)
Therefore we have E(y*) =
- 1)
N i=t
n(N-1) N(n-1) -
··
N(n-1) n(N-1)
--
X;
--
(Y - R X) +R X '
=f Hence the solution. • Problem 6.5 Derive an unbiased ratio-type estimator based on xw, the mean of say w distinct units in two phase sampling when independent samples are drawn in the two phases of sampling using simple random sampling. Solution Let r xw be an estimator of the population mean Y where r is as
defined in the last problem. Note that E[rxw1 = E{E(rxw I w)}
120 Sampling Theory and Methods
Since each units in a particular subset s w l containing
w
distinct units 1 is given
equal chance for being included in the sample, we get E[r I w] = rw . Therefore E[r xwl = E[rwxw] =cov(rw,Xw) + E[rw]Elxwl
= E { N-w
1
Nw N-1
f
-
£..t (R; - R )(X;
-} + -R X (refer the next problem)
- X)
i=t
where the expectation in the right hand side of the above expression is with respect to the distribution of w. E[rxwl=E{N-w_1-}r-E{N(w- 1)}R X Nw N-1 w(N-1)
Hence the bias of r xw is B(r xw I w) = E(r xw I w)- Y
)}(f -R
=- E{N(w- 1
X)
w(N-1)
)}
Further we know that E[)'- r x] = { N(n - 1 n(N -1)
Therefore B(r xw> = -E{N(w-l)}{n(N -l)}(y- rx) w(N-1)
=- E{
N(n-1)
n(w-1)}(-y-rx __ ) w(n-1)
Hence an unbiased estimator of the population mean is
_
_
Yw = rxw +
{n(w-1)} __ _ w(n-1)
(y- r x)
•
Problsm 6.6 Under the notations used in problem 6.5 derive (a) E(xw). (b) V(xw) and (c) cov(iw• .Y> Solution
(a) E(xw)= EE(xw I w)
=E(X)
=X (b) V(xw> = EV(xw I w)+VE(xw I w)
=
E{~- ~ }s;+V(X)
(c) We can write E(y Is w) = Yw Therefore cov(iw• y) = E(xwy)- E(xw)E(y)
= EE(xwyl sw)- E(xw)E(yl sw)
Use of au.x1liary information
121
=cov(xw. Yw>
Hence the solution. •
Exercises 6.1
Derive under simple random sampling the approximate bias and mean
..
square error of the estimator YRS
.. s2
=Y
; .
Sx
6.2 Derive the approximate bias and minimum mean square error of the A
estimator Y=_!:_[aX + (1- a )X] and compare the minimum mean square X error with the mean square error of the Linear regression estimator. 6.3
Let :x• be the mean of distinct units when samples are drawn independently in two phase sampling. Derive the approximate bias and mean square error of the estimator
f'• = Xy :x* assuming simple random
sampling is used in both the phases of sampling. 6.4 Derive the minimum square error
y• = Y+ b(x* -
a sample of n .. n N-1 .. .. B[RnX1=--E[x(Rn -Rt>1 N n-1
v
the
estimator
x) under the notations explained in 6.3.
6.5 Show that for
I~ y·
where R 1 = :_ and R~1 =- L X n i=l A
of
A
-'. Y; and
units
selecting
X; being
usmg
srswor,
y and x values of the ith
X;
drawn unit. 6.6 Let C x be the coefficient of variation of x. Derive the condition under
YX + C:c
is more efficient than the usual ratio X-Cx estimator. assuming simple random sampling is used. which the estimator
Chapter7
Regression Estimation 7.1 Introduction Like ratio estimation, regression estimation is another method of estimation of a finite population total using the knowledge of an auxiliary variable x which is closely related to the study variable y. The regression estimator is developed below. We know that. when the variables x andy are linearly related, the least squares
b= si
estimates of the slope and intercept are respectively
a=Y- bX .
and
SJC
The Y can be expressed as y = L,r; + L,r; where ies
s= s- s .
(7.1)
ies
Once the sample is observed, the first term in the right hand side becomes fully A
Sxy
known. Using the least squares estimates b =-and s2
a= Y- bX A
A
each
JC
unobserved y value can be estimated by
Y; =a+bX;.ie s Summing both the sides over
s = S - s , we get
L,r; = +hLX; ies
ies
ies
=
r = L,r; + +brNX -nX1 ies
Regression Estimation
123
=nY +
under simple random sampling is N 2 ( N - n) S; (1- p 2 )
.:,._
Proof Define e0
=
Y-Y
.:.,_
,e1
Y
=
X-X
Nn
,e2
X
Sxv-
s
= ·
..ty
2
.e3
Sxy
=
3
Sx-Sx 2
Sx
= =
It is to be noted that E(e;) O,i 0, I, 2,3. The regression estimator can be expressed as
YLR =NY+ Nb[NX -nX] = Y(l+e 0 )+
S xvO +e.,)
i
-
[X- X (l+et)]
SX (I+ e.3)
XSxy
1
=f(l+eo)-~(l+e2)(l+e3)-
e1
Sx 1 Sxv =f(l+e0 )-XB(l+e 2 )(l+e 3 )- e1 where B=2 Sx Assuming I e; I< l, i = 0, 1. 2, 3 , the above expression can be modified as
2
A
4
Yu -Y =Yeo -XBe 1(1+e2)(t-e 3 +e 3 -e 3 + ... ) =Yeo - XB(e 1 - e1 e3 + e 1e 2 ) (ignoring terms of degree greater than two) Squaring both the sides and taking expectations we get E('YLR- Y) 2 = Y2 E(eJ> +X 2 B 2E(ef)- 2XYBE(eoet) =N2(N-n)[S2+B2s2-2BS ] Nn Y x xy =
N2 N ) S2 S ( -n [S 2 +--2..si-2-2.Sxv1 Nn Y s4 s2 . X
X
si,]
= N 2 (N-n) 52 [ 1Nn Y s2s2 X
=· N2 (N-n) S2(l-p2) Nn Y
y
124
Sampling Theory and Methods
Hence the proof. • ~
Tuorem 7.2 Under simple random sampling,
MSE(YR] > MSE(YLR]
.
V[~rrs]
~
> MSE[fLR]
and
.,
Proof Since -1 < p < 1 , we have (I - p - ) < I Therefore N 2 (N- n)
Nn
s;op ·
2) <
N 1 (N- n) Nn
s;
Hence V[Ysrs] > MSE[YLR] . Consider the difference 2 +R 2 S 2 -2RS MSE[YR ]-MSE[YLR ]N 2 (N-n>{S -S y2 +S y2 p 2 } Nn y .x xy
=N 2 (N -
Nn
n) { S; p 2
= N2 (N -n) Nn
·
+ R 2 S; - 2RS xy }
[S~S~ +R2 s; -2RS ] s2s2 xy y
.%
= N 2 ( N - n) ( S xy Nn
-
RS .x )
2
s2 .%
Since the right hand side of the above expression is always non-negative, the result follows. •
7.2 DilTerence Estimation The ratio estimator which is obtained by multiplying the conventional estimator
Y by the factor ~
is an alternative to the estimator Y. Here we shall examine X the possibility of improving upon Y by considering the estimator obtained by adding Ywith constant times the difference X - X whose expected value is zero. That is, as an estimator of f. we take YoR = Y +A.( X- X) (7.3) where A. is a predetermined value. Since the above estimator depends on the difference X -
X
rather than the ratio
~
X
, it is tenned as "Difference
estimator". The difference estimator is u~biased for the population total f and its vanance ts A
A
V(YoR)=E[YDR -f]
2
= E[(Y- f)+ A.(X- X)] 2 = E(Y- f) 2 +A.2 E(X- X) 2 -2A. E(X- X)(f- f) = V(Y)+ A.2 V(X)- 2A.cov(X, f)
(7.4)
Regression Estimation
125
The above expression for variance is applicable for any sampling design yielding unbiased estimator for Y and X. It can be seen that the above variance is minimum if
...... l= cov(X, f) (7.5) XY and the resulting minimum variance is V(f)(l- p 2 (.X, f)] where p(X, f) is the coefficient of correlation between
X and Y. It is interesting to note that, when
simple random sampling is used, the optimum value of A. is
s
~ and the
s:x
minimum variance happens to be N 2 ( N - n) S; (1- p 2 ) which is nothing but Nn · the approximate mean square error of the linear regression estimator. It is peninent to note that the optimum value of A. depends on Sxy which in general will not be known. Normally in such situations, survey practitioners use unbiased estimators for unknown quantities. The .value derived from the optimal choice ... happens to be the least squares estimate. Therefore the estimator YDR reduces to the linear regression estimator. It is also to be noted that the difference estimator ~
reduces to the ratio estimator when A.
=~ . X
7.3 Double Sampling in Difference Estimation As in the case of ratio estimation, here also one can employ double sampling method to estimate the population total Y whenever the population total X of the auxiliary variable is not known. The difference estimator for the population total under double sampling is defined as ...
r
00 =Y+A.
(7.6>
where X D is an unbiased estimator of the population total X based on the first phase sample. Evidently the difference estimator is unbiased for the population total in both the cases of double sampling. ~
...
Note that V(Y00 ) = E[Y00 - Y]
2
= E[(Y- Y)+ A.(Xd- X)] 2 A
A
A
=E[(Y- Y)+A.[(Xd- X}-(X- X)]]
2
= V(f)+A.2 [V(X)+ V(X d )-2cov(X, Xd )] (7.7) - 2A.[cov(f, Xd)- cov(X, f)] When the samples are drawn independently, the above variance reduces to
V[f00 ]=V(f)+ A. 2 [V(X) + V(Xd )] - 2A.cov(f, id) The following theorem gives the variance of the difference estimator in double sampling when the samples are drawn independently in two phases of sampling using simple random sampling.
126
Sampling Theory and Methods
Theorem 7.3 When the sampies are drawn independently in the two phases of sampling using simple random sampling the variance of the difference estimator is
V(Y 00 >= N 2 [s; +12 (/ + f')S; -2A. I Sxyl where f
=-N-n and f' = N-n' .
Here n' and n are the sample s1zes Nn Nn' corresponding to the first and second phases of sampling. Further the minimum variance of the difference estimator in this case is
N
2I s y2[1- /+/' I P 2]
where p is the correlation coefficient between x and y. Proof Using the results stated in Section 6.8 in the variance expression available in (7. 7) we get V(Y _ N 2 (N-rr) 5 2 ;..2 N 2 (N-n') S2+ N 2 (N-n) s2 DD) Nn Nn ' .t Nn ;c ·,. +
.,
_ 2A. N ... (N -n) S . Nn
-~
= N 2 [s; +A-2 (/ + f')S.~- 2A. I S;cy] Differentiating the above variance expression partially with respect to A.
(7.8) and
f S;cy2 .Substituting this value in f+f'sJC (7.8) and simplifying the resulting expression we get the minimum variance
equating the derivative to zero, we get A.=
N2 I s2[1y
f P2] /+/'
It is to be noted that the second order derivative is always positive. Hence the proof. • Theorem 7.4 When the second phase sample is a subsample of the first phase sample and simple random sampling is used in both the phases of sampling the variance of the difference estimator is
V(Y oo) = N 2 [/ S; + A.2 ( / - f')S; + 2A.(/'- /)S;cy]. The minimum variance of the difference estimator in this case is N2 s;[f' P2 + 10 _ p2)l
where f and f' are as defined in Theorem 7.3 Proof of this theorem is left as an exercise.
7A Multivariate DitTerence Estimation
Regression Estimation
127
When information about more than one auxiliary variable is known, the difference estimator defined in Section 7.3 can be extended in a straight forward manner. Let Y, X1 and X! be unbia5ed estimators for the population totals Y , X 1 and X! of the study variable)', the auxiliary variables x 1 and x 2 respectively. The difference estimator of the population total Y is defined as
y02 = y + 8 I (X I - XI ) + 8 2 (X 2 - i
(7.9)
2)
where the constants 8 1 and 82 are predetermined. A
The estimator Y02 is unbiased for the population total and its variance is ...
..
..
..
V(Y02 )=E{(f-f)+8 1(X 1 -X 1)+8 2 (X 2 -X 2 )]
2
.. 2 .. 2 .. ... . . =V(Y) + 8 1 V(X 1) + 82 V(X 2)- 28 1 cov(f, X 1)
-28 2 cov(f. :i 2 ) + 28 182 cov(Xt, X2) Denote by V0 =V(Y),V1 =V(X 1),V2 =V(X 2 ) A
A
A
A
A
A
C01 =cov(Y,X 1),C02 =cov(Y,X 2 ),C 12 =cov(X 1,X 2 ) Differentiating the variance expression partially with respect to 8 1 and 8 2 and equating the derivatives to zero, we get the following equations (7.10) V181 +C1282 =Cot (7.11) C1281 + V282 =Co2 Solving these two equations, we obtain C01 V2 - C12C02 (7.12) 8t= ., V1V282 =
cr2
Co2V1- C12Cot
(7.13)
v1v2 -c122
Substituting these values in the variance expression, we get after simplification V(f)[1- R~
.~ ...t•• ..t2
(7.14)
] A
where R
y ...t•• ..t2
A
is the multiple correlation between Y and X 1 . X 2 . Since the
multiple correlation between
Yand X1. X2
is always greater than the
correlation between Yand X1 and that of between Yand X2 , we infer that the use of additional auxiliary information will always increase the efficiency of the estimator. However, it should be noted that the values of 8 1 and 8 2 given in (7.12) and (7.13) depend on C01 and C02 which in general will not be known. The following theorem proves that whenever b1 and b2 are used in place of 8 1 and 8 2 given in (7 .12) and (7 .13 ), the resulting estimator will have mean square error that is approximately equal to the minimum variance given in (7.14), where
128
Sampling Theory and Methods where c 01 and
coz are
unbiased estimators for C01 and C02 respectively.
TINorem 7.5 The approximate mean square error of the estimator A.
A
YD2 =Y + bt (X t - X t ) + b2 (X 2 - X 2) is same as that of the difference estimator defined in (7 .9), where b 1 as defined in (7.15) and (7.16) respectively. Proof Let
eo
A
A
Y-Y X1 -Xt =y ' el = X t
e _ ' 2-
~d
b1 are
X2 -X 2 X2
e'= cot -Cot ,e"= coz -Co2 Cot Coz It can be seen that
_ CotO+e'}Vz -Ct2CozO+e") b1-
=B
VtVz- Cfz CotVze'-CtzCoze"
(7 .17)
t+--~~~=-~-
vlv2
-c?2
Similarly it be seen that b
2
= Bz +
CozVte"-CtzCote' . V1V2
Using (7.17) and (7.18), the estimaror
YA•D2 -- YA +
(7.18)
2 -C12
r; 2 can be written as
[B I + CotVze'-CtzCoze"] X 2 ( I-
XA
vtvz -Ctz
I)+
XA [B 2 + CozVte''-Cl2Cote']<x 2 2- 2) V1V2 ~ C12 A
(7.19)
A
Replacing Y ,X 1 and X 2 by YO+e 0 ),X 1(1+e 1) and X 2 (1+e 2 ) in (7.19) we obtain
yA•D2
-
y
= Yeo + [ 8 t + Cot Vze'-CtzCoze"] (-X t e t ) + 2 V1V2 -C 12
. [B
z+
CozVte"-CtzCote'] 2
v1v2 - c 12
(-Xze2)
Squaring both the sides and ignoring terms of degree greater than two, we obtain
E[Y;z- Y] 2 =Y 2 E(e5) + Bf XfE(ef) + BfX f E(ei)-
Regression Estii1Ultion 2
~
..,
~
129
~
=V(Y)+B1 V(Xt)+B!V(X2)-
cov(f, X1 )- 282 cov(f, X2) + 28 18 2 cov(X 1, X2) Substituting the values given in (7. 12) and (7 ,13) in the above expression. we get 2B1
the approximate mean square error off~ 2 as V(f)[l-R y2. ,.rz ] . Hence the proof. • Thus in the last few sections of this chapter. we constructed the linear regression estimator for the population total by using the fact that the variables x and y are linearly related and extended this to cover the case of having more than one auxiliary variable. In the following section. the problem of identifying the optimal sampling-estimating strategy with the help of super population models is considered.
.r,
7.5 Inference under Super-population Models In the super-population approach. the population values are assumed to be the realised values of N independent random variables. In this section, we shall assume that the population values are the realised values of N independent random variables Y1, Y2 , ... , YN where Y; has mean bx; and variance a 2 v(x;). The function v(.) is known, v(x) ~ 0 for x ~ 0. The constants b and a 2 are unknown. Let ~ denote the joint probability law.
Implied estimator Consider any estimator f for the population total T which can be uniquely expressed in the fonn f = L,Y; +bL,X; where b .does not s
i
A
depend on the unobserved y-values. Here b is referred to as the implied estimator of the parameter b . For example, under simple random sampling, the ratio estimator can be expressed as
fR
=
LY;
N
X,
i=t
t .L,x; =L,r; .r
s
LY;
+~~X;. Therefore, the _LX; .r .r
implied estimator of b corresponding to the ratio estimator is
L,r; Ls . It is to be X·I
s
noted that the implied estimator does not depend on unobserved y' s. In the following theorem we shall prove that of two statistics f and f 0 , the one whose implied estimator for b is better is. the better estimator for T.
Theorem 7.6 For any sampling design P, if estimators estimators
b and b0 for A
E~ [b- b]
2
b which satisfy Ao 2
S E~ [b - b]
f
and
f0
have implied (7.21)
130 Sampling Theory and Methods for each s such that P(s)>O, then
MSE(P: T) ~ MSE(P: f 0 ) (7.22) If for some subset. say s 0 of~ with P(s 0 ) > 0, the inequality in (7.21) is strict. the strict inequality holds in (7.22). Proof Under the super-population model described earlier.
ri P(s)(T- T)
MSE(P: T) =E;
2]
(7.23)
s
and
(7.24) s
MSE(P: T) ~ MSE(P: f
Therefore
0)
~ A 2 ~ Ao 2 E; [~ P(s)(T- T) ] ~ E; [~ P(s)(T - T) ]
s
(7.25)
s
= LY; +bLX;
Consider f-T
- LY;- LY;
s
s
=b'LX;-
s
s
Ir;
s
s
s
s
Squaring both the sides and taking expectations, we obtain 2
A
E;[T-T] =
~
[~X;]
2
A
E;(b-b)
2
~
+E;[~(Y; -bX;)]
s
2
s (7.26)
Since
b is independent of unobserved
y's and
E; (b- b)= 0,
E;[b-b]L(Y; -bX;)]=E;[b-b]LE;( Y; -bX;)]=O Further E;
ri
(Y; - bX i )] 2
=
L E; (Y; - bX
i )2
(7.27)
(since y s'are independent)
s
=a 2 'Lv(X;)
(7.28)
Substituting (7.27) and (7.28) in (7.26) we get A
E;[T-T]
2
~
= [~X;] 2 E;(b-b) 2 +a A
2~
~v(X;)
(7.29)
Proceeding in the same way, we obtain
Ao 2 E;[T -T] =
~
[~X;]
s
2
Ao 2 E;(b -b) +a 2~ ~v(X;) s
If E; [b- b f ~ E; [b 0 - b] 2 for each s such that P(s)>O, then
(7.30}
131
Regression Estimation
=> E~ [T- T] 2 ] S E~ [T 0 - T] 2 for each s such that P(s)>O. Hence the proof follows from (7.25) • Thus we conclude that the estimator which has the better implied estimator is better. In order to make the study more deeper under the super-population approach, we give the following definition. Model Unbiasedness An estimator f is said to be model unbiased or <: unbiased if for each s,
f-
LY; is an unbiased predictor of the unobserved sum s
(7.31) It may be seen that the above definition is equivalent to E~ [T- T] = 0.
Theorem 7. 7 An estimator is model unbiased if and only if the implied estimator is unbiased for b Proof We know that
f- T = (b- b) LX; s
L (f; - bX; )]
- [
(refer (7 .25))
s
Taking expectations on both the sides we get
E~ff -T )= l',X;E~ (b -b) s
From this we infer that E~ (T- T) = 0
¢::>
E~ (b- b)= 0.
Hence the proof. • Note A design unbiased estimator is not necessarily model unbiased and in the same way a model unbiased estimator is not necessarily design unbiased. The following theorem gives the Best Linear Model unbiased estimator for the population total.
Theorem 7.8 For any sampling design P. let
f
be a linear estimator satisfying """* E~ (T- T) = 0 for ever s such that P(s)>O. then MSE(P: T ) S MSE(P: T) A
A
~X·~· ~II
.. • _ ~ ... ~ ... s v( X; ) where T - "'-' Y; + b ~X; • b = .....-.-~-
s
s
L
xl
s v(X i)
Proof Since we consider only estimators that can be expressed in the form
132 Sampling Theory and Methods
f
=
:Lr; +bLX; s
s
it is easily seen that f is a lin~ function of the observed y 's if and only if b is a linear function of the observed y's. Therefore by Gauss-Markov Theorem and Theorem 7 .6, the proof follows. • The above theorem helps us to derive the Best Linear Unbiased estimators (with respect to the model) for different choices of v(.). Case (1): v(x) = 1 I I ~X·Y·
A.
b
1
s
=
.f
2
LXI; Case (2): v(x) =x
A.
b
LX)·~
= :Lxj2
(7.32)
1
s
:Lr; = 2 = 2,~ :Lx; I I LX·Y·
X;
s
.r
X·I
S
(7.33)
s
Case (3): v(x) =x 2
A.
~ X;f; ~ 2
b =
s
X;
L
x'!_I
1 ~ Y; =-~-
(7.34)
n s X;
2
s X;
Thus under the three cases the Best Linear Unbiased Estimators are
:Lx;Y;
f, =I_r, + s
I.
2 X;
:Lr;
I_x, .f2 =I_r, + s
.r
s
..!. L n
s
i
X;
I_x, and s
f3 =I_r, + s
s
Y; LX; respectively. It is interesting to note that the estimator
X;
_
f2
is
s
nothing but the ratio estimator. This proves the ratio estimator is the Best Linear Unbiased Estimator for the population total when v(x) = x. Now we shall state and prove a lemma which will be used later to prove an interesting property regarding the estimator
T3 = Lf; + ..!.n L s
s
Y; LX; .
X;
_
s
1Am11111 7.1 If 0 S bt S b2 S ... S bm and if Ct S c2 S ... S em c1 +c2 + ... +em ;;?; 0 then btcl +b2c2 + ... +bmcm ~ 0
satisfies
133
Regression Estimation Proof Let lc denote the greatest integer i for which c; S 0. Then lc m b1c 1 +b 2 c 2 ~ ... +b"'c'!' = 1',h;c; + Lb;c; ,.....
i=k+l
lc
~b~ci,c;+blc+l i=l
m
I,c; i=k+l
m
m
i=l
i=k+l
~ b1c I,c; +( blc+l - b1c ) I,c; ~0
Hence the proof. •
.
.
XLY·
The following theorem proves T3 is better than T pps = n
__.!_
S
X·I
for any
fixed size sampling design under a wide class of variance functions. N
TMorem 7.9 If Max{nX 1 .nX 2 .... , nX N} S
LX;
and v(x) is non-increasing
i=l
in x then for any sampling plan P for which P(s)>O only if n(s)=n then A
A
MSE(P: Tpps) s; MSE( P: T3).
Proof In order to prove the given result, it is enough to show that A
E~ [bpps -b)
where
bpps
2
A
s; E~ [b3 -b)
2
is the implied estimator of b corresponding to the estimator
f pps •
To identify the implied estimator corresponding to the estimator f pps , it can be written as
.!. L 21.._ I_Y;
~ n X, T pps = ~ Y; + .r ~ s s ~X;
...
~ ~X; 'i
s
Therefore the implied estimator corresponding to the estimator X~ Y;
~
-~--~Y;
n
"'
b pps = _
X·
S I S _,;;....,'t"==--...:;...-
~X;
_
-
Y·a· L -nX· I
s
I
'
f pps
is
134 Sampling Theory and Methods X-nX
where
ai
I
=
x.
1
I
[~ Y-a· r:x; - 1 2
2
A
Hence E.; [b pps
b] = E.;
-
b..A
~~ ~[Y; - E", (Y; )]] "L~ nx-
= EJ:
S
2
I
.,
.,
=a-~ arv(X;)
~ S
n2
Further note that b3 - b = .!_ n
(7.35)
X~I
L X·Y; - b S
I
1~ 1 =- ~ X[Y;n s ;
E.; (Y; )]
Squaring both the sides and taking expectations, we get 1 ~
2
A
2
E.; [b3- b] =
n
1
~2 E.; [Y;- E.; (Y; )] .r X;
2
= a 2 ~ v(X;) n2 ~ x.2 S
(7.36)
I
From (7.35) and (7.36) we get E .;
[b
-b]2-E pps
.;
[b -b]2=a2~(af-1)v(X;) 3
From the definition of a; . we get 2
2
a; -1 ~a i -1 , we note that
n
X2 ;
s
a? ~a J :::::> X; \'(X;) 2
~
X-I conditions
2 ~
v(Xj) 2
(7.37)
S X j . Therefore whenever
. Further under the
given
X·J
L [af - 1] ~ 0 . Therefore by the Lemma 7.1 the proof follows. • s A
A
Thus we have proved the estimator T3 is better than T pps • It is to be noted A
that the estimator T pps is not model unbiased even though it is linear which makes the above theorem meaningful. Thus we have identified the optimal estimators under the given super-population model for different choices of the variance function appearing in the super-population model. Now we shall come to the problem of identifying the ideal sampling plan with respect to the ·given super-population model.
Regression Estimation
135
If we fix the sample .iize, then the problem of identifying an optimal sampling design becomes straight forward. Let Sn denote the set of all subsets of size n of the population S and Pn be the collection of all sampling designs P for which P(s)>O only when s is in S n . Since for P in Pn. MSE(P: T) = E~
lL P(s)(T-
T) 2 ]
(7.37)
·'
clearly the optimal sampling plan is one which selects with certainty a subset s which minimises E~ (T - T) 2 • Some insight can be gained when the quantity to be minimised is expressed in the form E~[T-T] 2 = [LX;] 2 E~(b-b) 2 +0' 2 Lv(X;)
(7.38)
From (7 .38) we understand that the sampler has two objectives namely (I) to choose a sample which will afford a good estimate of the expected value of the total of the non-sampled values. That is. to choose a sample s so that
[LX;] 2 E~ (b -b) 2 is small and (2) to observe those units whose values of y s
have greatest variance so that only the sum of the least variable values must be predicted. That is to choose s so that
a2
L
v( X;
) is small.
For a wide class of variance functions. if the optimal estimator defined in Theorem 7.8 is to be used. then determination of the optimal sample is quite simple. As it is shown in the following theorem. the best sample to observe consists of those units having the largest x values. Let s • be any sample of n units for which Maxs"
LX; =LX; s
and p • be a sampling plan which
s*
entails selecting s • with cenainty. That is. P*(s=s*)= 1. Theorem 7.10 If
v(~)
is non-increasing then MSE(P* :f*)5.MSE(P:f*>
X for any sampling plan P in Pn where
f* =
LY; +b*'Lx;. and S·
S
Proof We know that MSE(P: T) = E~
lL P(s)(T-
T) 2 ]
s
and
E~[T-T] A
2
~
= [.LJX;] 2 E~(b-b) 2 +0' 2~ Llv(X;)
·'
A
s
136 Sampling Theory and Methods
L X;Y; v(X;)
A
Note that (b- b)=
.r 2 ~X·.
b
~v(;·) S I
-
L
X;[Y -bX;] v(X·) I
S
-
x'f
L v(;·) S
I
Since y's are independent, squaring both the sides and taking expectations we get
a2
A
.,
E~;(b-b)-
=
L[
X; ] 2 v(X; ) v( X;) ---=s~--~--
[~v~!.J
Therefore
Clearly the expression in side the square brackets is minimum for s=s*. Since the sampling plan P* selects the sample s* with certainty we hav~ * ..... ..... MSE(P : T ) S MSE(P : T ) . Hence the proof. The following theorem proves the sampling plan P* is optimal for use with A
A
the estimator T3 and T pps under rather wealc conditions.
x2
Theorem 7.11 For any Pin Pre and any v(.) for which both v(X) and - v(X)
are non-decreasing N
(i) MSE(P* :fppa> S MSE(P: fpps) if Max{nX t•nX 2•···•nX N} S
LX; i=l
*
A
A
(ii) MSE(P : T3) S MSE(P: T3)
Proof From (7 .35) we have
Regression EstimD.tinn
L
a2 al v( X 1 ) X - nX 1· where a· = n2 s X I Xi
2
A
E [b b] .; pps -
Therefore E.; [b pps - b] = (
L
l
.,
a-~v(X;)
1
2
A
137
]2 n2 LX;
7
X?
2
[X - nX; ]
-
s 2 2~ a 2 ~ v( X; ) 2 E.;[Tpps-Tl =a ~v(X;)+2 [X-nX;] 2 ~ "i n s X; A
Hence
~
A
MSE(P:Tpps>= ~P(s) {a
s
2~ ~v(X;) i
a 2 ~ v(X;) 2 +2 [X -nX;] } 2 ~
n s X; " It can be seen that the above expression will be minimum when s=s*. Since P* is the sampling plan which selects s• with certainty •
A
A
MSE(P : T ppa) S MSE(P: T pps >. This completes the first pan.
a 2 ~ v(X ·) 2 WehaveseeninTheorem7.9.E.;[b3 -b] = ; 2 ~ n s X; A
Therefore E.; [T3 - T] 2 =
(1: a: L, X; ]
2
n
i
Hence
L,
v(X;> + a 2 v(X i) s X; i
MSE(P:T3 )= LP(s) {a 2 Lv(X;) +[LX; ] ~
2
a:}:; v(X;)1
X; Clearly under the assumptions stated in the theorem. the right hand side of the above expression is minimum if s=s•. Since the sampling plan P* yields the set s* as sample with probability one, •
"i
A
n
"i
s
A
MSE(P :T3) SMSECP:T3) Hence the proof. • The results discussed in this section are due to Royall ( 1971 ).
7.6 Problems and Solutions Problem 7.1 Derive the Best Linear Unbiased Estimator for the population total ·under the super-population model E.;[Y;]=a+bi,i=l, 2, ... , N, V.;[Y;]=a 2v(i);cov.;[Y;,Yj]=O,i;t;j
where ~ is the joint probability law of Y1, f2, ... , YN error when a=O.
• Also
find its mean square
138 Sampling Theory and Methods
By Theorem 7.8. the Best Linear Unbiased Estimator for the
Solution
population total is gtven by f =
a=
L Y + (N -
v(i)
bL X; where
s
L-y.' -bALi-
v(i)
s
n )a+
1
~ iY, ~ i ~ v(i) ~ v(i)
A
and b = s
.f
·2
.f
:L ;(i) I
Lv:i) s
I
s
.f
A.
When a=O, the estimator reduces to T
~ Y; ~ i ~ v(i) ~ v(l)
.f[
s
I
v -
s
.
]2
v;i>
A·~ A* =~ ~ Y; + b ~ X; , b = s
s
L iY; .c
v(i) •2
I
~
.f
Consider
I/~;
A.
(b -b)=
'
.f
·2
-b
L.:(i) L llY; -bi] .f
v(i) -
-
.f
·2
L; s Since y's are independent. squaring both the sides and taking expectations we get
A•
E~(b
2
2 a 2 I[~] v(i) v(1)
-b) = --=·c;...__~-
[~:(:J
(*)
= (b*
-b)Li- I,£Y;- Et; (Y; )] s
1
Squaring and taking expectation with respect to- the model • we get Et;(T* -T) 2 =E~(b* -b) 2 [Li] 2 + s
La s
2 v(i)
(**)
Regression Estimation
Using (*) in(**) we get
E~(t•
-T)2=[
.,
a~
1£Li11 +
I-'·- J 2
-
.f
139
I,a1v(i) .f
s v(i)
Hence by definition of mean square error, we get
.,
Exercises 7.1
...
Show that the mean square error of T 0 derived in the above .problem is minimum in P,. under the sampling design P0 (s), where
1if s=s* . 0 otherwise where s* is the set containing the units with labels N - n + 1• N- n + 2 , P0 (s)= {
· · non-decreasmg · m · '· and ~ v(i) 1s · non...•N, prov1'ded the tiunction v ( l') 1s l
increasing in i. Here P,. is the class of sampling designs yielding samples of size n. 7.2 Extend the difference estimator considered in this chapter to p-auxiliary variables case and show that the resulting minimum mean square error is 2
7.3
s 2~· ( 1-
2 Ro.l23 . hemu I.t1p Ie corre Iat10n . ... p ) , where R0 123 1st · ... p coefficient. assuming simple random sampling is used. Derive the Best Linear Unbiased Estimator for the population total under the super-population model,
N (N -n) Nn
·
.,
E~ [Y;] =a +bi +ct. i
cov~ [Y; , YJ ] = 0. i
-:¢:
?
= 1. 2..... N, V~ [Y;] =a-;
j •
where ~ is the joint probability law of Y1 , Y2 , •••• YN . Also derive square error of the estimator.
~e
mean
ChapterS
Multistage sampling 8.1 Introduction So far we have seen a number of sampling methods wherein a sample of units to be investigated are taken directly from the given population. While this is convenient in small scale surveys. it is not so in large scale surveys. The main reason being that no usable list describing the population units to be considered generally exists to select the sample. Even if such a list is available. it would be economically viable to base the enquiry on a simple random sample or systematic sample, because this would force the interviewer to visit almost each and every part of the population. Therefore it becomes necessary to select clusters of units rather than units directly from the given population. One way of selecting the sample would be to secure a list of clusters, take a probability sample of clusters and observe every unit in the sample. This is called single· stage cluster sampling. For example to estimate the total yield of wheat in a district during a given season. instead of treating individual fields as sampling units, one can treat clusters of neighbouring fields as sampling units and instead of selecting a sample of fields one can select clusters of fields. Sometimes instead of observing every field within each cluster, one can select samples of fields within each cluster. This is called two-stage sampling since now the sample is selected in two stages- first the cluster of fields (called first stage or primary stage units) and then the fields within the clusters. This is also called Subsampling. Generally, subsampling is done independently from all the selected primary units.
8.2 Estimation Under Cluster Sampling Suppose the population is divided into N clusters where the ith cluster contains N;, (i =1. 2, .... N) units. Let Yij (j =1. 2..... N;; i =1. 2.... , N) be the y-value of N· the jth unit in the ith cluster and Y; = Yij . That is, Y; (i =1, 2, .... N) stands
f, j=l
for the total of all the units in the cluster i. Suppose a cluster sample of n clusters is drawn by using a sampling design with first order inclusion probabilities 1C; , ( i = 1, 2, .... N) and second order inclusion probabilities 1Cij, i ~ j . An unbiased estimator for the population total Y of all the units in the population
Multistage Sampling N
namely, Y =
141
N·
L !, YiJ is given by i=l j=l
.. = """ y. ycl £.,;-' .
IE.f
1C
(8.1)
I
and its variance is
.
f
~[I-TC; J ~ f
V(Yc1) = L Yt
~ + 2£.,; £.,; Y; Yj
i=l
i=l j=l i<j
I
lr
TCij -TCiTC J
rr -ir I
l
. j
t"8.2)
J
The expressions given in (8.1) and (8.2) can be used for estimatmg the population total and to get its variance under any sampling design for which the first and second order inclusion probabilities are known. In particular. when simple random sampling is used to get a sample of 11 clusters, then an unbiased estimator of the population total is given by N""' (8.3) Ycls =- £.,; Y; n A
lES
and its variance is A
V[Ycl.r ] =
where
[
N""(N ..,
-n)
Nn
]
.,
S~
(8.4)
·
-2 . s;' = -1 - LN [Y;- Y]
-
N -1
i=l
In the same manner, for all sampling designs appropriate unbiased estimators can be constructed and their vanances can also be obtained. It is interesting to note that the number of units in each cluster can be taken as a measure of size and cluster sampling can be performed with the help of probability proportional to size scheme. Let Y; be the total of the ith sampled cluster and P; be the selection probability of the unit selected iri the ith draw. i = 1. 2, .... n when a probability proportional to size sample of size n is drawn with replacement. Note that, when the number of units in each cluster is regarded as size, the selection probability of the rth unit in the population is given by
Pr
N
N ..
=~.r=
No
1,2, ... ,N, where N 0
=L N;
. In this case an unbiased estimator
1=1
of the population total is given by A
yelp
1 ~[)';] =£.,; n •=I p·I
(8.5)
and its variance is
..
I
y. ) L -' -Y~ n P; J
V(Yc~,)=-
N {
i=l
.:!
P,
(8.6)
142 Sampling Theory and Methods
Thus we have understood that no new principles are involved in constructmg estimators when a probability sample of clusters is taken. A problem to be considered is the optimum size of cluster. No general solution is available for this problem. However, when ctusters are of the same size and simple random sampling is used. partial answer is provided to this problem. The following tlleorem gives the variance of the estimator under simple random sampling in terms of inuacluster correlation. when the clusters contain same number of units.
TMorem 8.1 When the clusters contain M units and a cluster sample of n clusters is drawn using simple random sampling, the variance of the estimator considered in (8.3) is V[Y
cis
l=[N2(N-tz)]NM -1 Nn N -1
S~[I+(M -l) ] ~ p
where p is the inuacluster correlation given by N
2
,W
L !,[Y;
1 -
YHYu, -
~Jd
p=
Y]
.,
=
y
.Y=--
(M -l)(NM -l)S;
NM
Proof We have seen in (8.4) V[Y
·1 =[N2(N-
cls
Nn
n>]_!_ tt~[Y·-'
f]2
(8.7)
N -1
Note that
N t;.[Y;
N[M
-i? = ~ ~[Y;j=
12
f1j
f[f[riJ -YJ f ~)Y;j ~ 2] +
i=l
j=l
2
Y][Y;k -
Y]
i=l j
=
(8.8)
Substituting (8.8) in (8.7) we get the required result. • From the above theorem we infer that the variance expression obtained depends on the number of clusters in the sample, the variance
S: ,
the size of the
cluster M and the intracluster correlation coefficient.
. (NM -1) • M , the vanance . . given . . Th eorem 8.1 Remark Smce expression m N-1
can be written as
.Wultistage Sampling
V[Ydsl= A
143
[N 2 CN-nl] MS~[l.,-(M-llp] -, Nn
(8.9)
-
Under the conditions stated in Theorem 8.1. if instead of sampling of clusters. a s1mple random sample of nM elements be taken directly from the popuiation wh1ch contams NM clements. V[Y
·''·'
]=ll(NM> 2(NM -nM)]s~ "" M -Nn
-\
=[N2(N-n>]Ms~ Nn
(8.10)
·
Comparing (8.9) with (8.10) we note that (8.11) V[Ycls 1= V(~m )[I+ tM -l)p] Since p is generally positive (because clusters are usually formed by putting
together geographically contiguous elements). we infer from (8.11) cluster :;ampling will give a higher variance than sampling elemt:nts directly from the population. But it should be remembered that cluster sampling will be more economical when compared to simple random sampling. However, if p 1s negative, both the cost and the efficiency point to the use of cluster sampling.
8.3 Multistage Sampling If the population contains very large number of units, we resort to sampling in several stages. For the first stage, we define a new population whose units are dusters of the original units. The clusters used for the first stage of sampling are called primary stage units (psu). For example when the population is a collection of individuals living in a city. the psu may be taken as streets. Each psu selected in the first stage may be considered as a smaller population from which we select a certain number of smaller units. namely secondary stage units (ssu). Unless stated otherwise. the second-stage sampling in each psu is carried out independently.
1. Two-stage sampling with simple random sampling in both the stages As in the cluster of sampling. let Yij U = I. 2..... N i; i
= I, 2.... , N) denote the yN-
-f
value of the jth unit in the ith cluster (psu), ( Y;_ = - 1 N;
Y;i mean per second-
j=l
I N stage unit of the ith primary stage unit and ~- = N. Y;. the population mean.
L
I
j=l
Assume that a sample of n primary stage units are selected using simple random sampling in the first stage and a subsamplc of size n; is drawn from the ith
144 Sampling Theory• and Merhods
primary stage sampled unit i e 1. 1 being the set of indices of the sampled primary stage units. The following theorem gives an unbiased estimator for the population total under two-stage sampling and also its variance.
Theorem 8.2 An unbiased estimator of the population total Y is given by
Yms = N n
LN;Y;. }
1•
being the mean of the units sampled from the ith sampled
IE}
N .,1 12 2l l' -} WI +N [ - - N -] sPsuandltsvanancels V(Ynu )=-~ N N~ [ - -N ~ 1 b 4
• • •
n i=l
n·I
N.
where S 2 · WI
2 =M - - 1 ~ L [ y .. - Y: 1
1
I
Proof
I.
I]
n
N
and S 2 b
j=l
E(Yms) = £1 £2 CY,u). where
·
I
=-N - 1~ [ Y: - f ~ 1-
I.
..
]2
i=l
£1and £2 are the overall and conditional
expectation with respect to subsampling respectively.
E(Yms>=E1E2[: LN;Y;.] IE}
=
£1[: LN;£2(v;. )] IE}
=
£1[: ~N,f;.] IE}
=
£1 [Nn L Y;] . IE
J
N:
=Y ,. The variance of Y,,u is given by (8.12)
Multistage SampLmg A
Therefore V1E 2 ( Ym.~ )
.,r 1 1 ] S h., -l--rz
(8.13)
N
., [ l .,
N-
I
I
N~ - - ) =-~ V.,(Y nl.f .. ·IE 1 I n; N ; n-., £.... A
Further
N
=:
N-., n
A
1=1
ll-
.,
s-."''
---y-.
I 1 ., w1 V
.., [ l
.'V
E 1V..,(Y . , N ~ N~1 - ms )=--~
ll;
I
r
l }" 1 N ~~-; =-IN,.---N; n
(8.14)
"[
N
145
n;
i=l
Using (8.13) and (8.14) in (8.12), we get the required result. • The following theorem g1ves an unbiased estimator of the variance given in the above Theorem.
Theortm 8.3 An unbiased esumator of V ( Y11u ) is
v(Yms)=: LNl[~- ~- k~., +N 2 [~- ~]s; s2
where Yij
b
1
1
IE}
j
n·
1 -~[N·v· -~ ~ N·v. ] 2 and ='. '· n £..,. '· '· n -I£....
s2· w1
~}
~}
1-~[)'··- y· f ='· '1 n· _ L
1 j~
I
•
being the value of the jth sampled unit from the ith sampled psu.
Proof We have )-;· -~ ~ ~[N £.... N·v-1 £.... N-:..¥ -~[~ £.... Nv. ] =~ £.... I
n
iEJ
consider
r-1.
I.
2
2
1EJ
1-1
I .I
!
n
iEJ
iEJ
2
(8.15)
J
ErL,N?'Yll= £ 1£ 2 L,N,2 vpj1 _IE}
1...1EJ
_.
=
E,lr L,1N?V1Ci'; 1+ Nl[E1
=
£,[L{N,2f-1 -f}s;; +NFY;~ l] ln,
iEJ
i
I }s2 +..!:_ f y.2] =..!:.[f £.... £.... N2f_l _ _ N
Further
I
i=l
•
ln;
N;
WI
N i=l
I
(8.16)
146 Sampling Theory and Methods
.,_.,
=n-r-+n
., I I I,Nrl--[-n - N-I lJSb+[ N N;
2 '
II
.,·
N
'
i=l
II;
Hence by (8.15) we have E
[L [N·v·- - -I L N·v- ]-., ] 1- I.
IE}
n
1- I
.,
n- I
=(n-l)Sb-+-N
IE}
f] .,
~;
---r-··
[LN ., ( 1
1
N~ I n·
N
1
i=l
1 ., ] ~H
• 1
Therefore
E[N 2 (tn_!__ Nl l~IL[N;Y;. _ _!_ I,N;Y;,1 2 ]=V(Ym.,) n. 1 1 11
\
. IE
IE
-[fNp-l'~--~.~~~.;J n N, r i=l ·
1. 1. - Since [N I,N;2 ( n iEJ
nl
Nl
1
}~·i]· is unbiased for [fN?(-~~~1. - -1. 1~,:.;]. r w~ Nl
1=1
conclude that
~[N·y-·
N2(_!_ __ 1 J-1I L. N n n- iEJ
I
I.
__!_n ~L. N-v.
1- I.
iEJ
]2 +
[Nn ~L. N~(-1 _ _I }:!·]is I
iEJ
n;
N
"-1
i
unbiased for V (Y,u) . • Remark If all the first stage units have same number of second stage units, say M and the same number of second stages units is sampled from every sampled primary stage unit, then Y,ns and V(Ym.f) take the following forms: (Here it is assumed that m is the second stage sample size)
Multistage Sampling
L-
. y~ =NM -. ms
(l)
147
-V·1.
n
1EJ
l---
I ,., ., ., ( I N 2M 2 ( i . I '; . , 6,~ T"N-M- - - n N b m M) 11 N N .., I~ .., y I _.., -. Y where S 1~. =- £.... S ,~.,. and sb- = - •
A
..
(ul
V
L-
N-1.
N i=l
_.,
It is to be noted that S;;
t=l
.,
sb
=-
M
I.
=.., ==_. M
.,
where S 17.- is as defined in Theorem 8.2.
Optimum values of n and m Now we shall find the values of n and m. namely the sample sizes to be used in the first and second stages of sampling. assuming the conditions stated m Remark 8.2. Naturally these values depend on the type of cost function. If travel between primary stage units is not a major component. then the total cost of the survey can be taken as (8.17) C=c 1n+c 2nm The above cost function contains two components. The first component is proportional to the number of psu · s to be sampled whereas the second component is proportional to the second stage units to be sampled from each sampled psu. Under the above set up, it is possible to find the optimum values of n and m for which V(Ym.f) ts minimum for a given cost. Towards this. we
consider the function
vo'm.f) + .A.[cln + Czllm- C] 1)-2 .. t'\.., (1 +.A.[c 1n+c 2 nm-C] ---1Sb IS;;.+N-M---} =N 2nM 2 \_m ,n N) M.
L=
"II
(8.18)
Differentiating the above function partially with respect to n and equating the derivative to zero we get I \S,~ I s; (---J--+-c +czm=0 --M nl
n2
m
2
n =
1
( I I 'Lz S-z +!--~ \111 M I w b
(8.19)
c 1 +c 2 m Again differentiating partially with respect to m and equating the derivative to zero we get -
s2
w
.., 2 m-n ..,
+c.,-
=0
s~.
Combining (8.19) and (8.20) we get
(8.20)
148 Sampling Theon· and Merirods
sl +( 2---1 . ~,~ \m M ) c 1 + c:!m
1
= 5,11 c.-,m.;..
m=g
(8.21)
This value can be substituted in ( 8.17) and a best solution for n can be found easily. The expression given in ( 8.21) indicates that m is directly proportional to S"" . This implies. the number of second stage units to be taken from a primary stage unit should be large if the variability with respect to y is large within primary stage units. Similarly if the cost per secondary unit c 2 is small or the cost per primary units c 1 is large, m should be large.
2. Two-stage Sampling Under Unequal Probability Sampling The results presented above are applicable for the case of using simple random sampling in both the stages of sampling. In this section, we shall discuss some results which are quite general in nature and can be applied for any sampling design with known inclusion probabilities. Here it is assumed that all the second stage units in the population arc labelled using running numbers from I to N 0 N
where N 0 =
LN
r.
N, being the number of secondary units in the rth psu.
r=l
If a unit i belongs to the 11h psu. r=r(i). then the unit will be included in the samples if I. the rth psu will be included in the first sample: and 2. the unit i will be selected in subsampling. provided that case I happened.
Denoting the probability of case I by rc
2 by
1r / 1 •
t
and the conditional probability in
we have the following formula for the overall inclusion probability
Multistage Samplin1:
rc; = rc: rc ,11 • i = I. 2..... N n . r= 11 i)
149
(8.12)
N
where
N0 = LN, r=l
For example: If simple random sampling is used in both the stages of sampling. rc;
=.!!_.!!.!.__. r =I, 2, .... N. i =I. 2..... N 0 ; r =r(i)
( 8.13)
N N,
Here it is assumed that a sample of 11 psu's are selected in the first stage and a subsample of size n, is drawn from the rth sampled psu. Now we shall consider second-order inclusion probabilities in multistage sampling. For two units i and j belonging to rth and sth psu, respectively, we may write
rc ij where
=rc:, rc ;~1 i. j =I, 2•... , N 0 ; r =r(i), s =s( j)
( 8.14)
rr:r denotes the probability of simultaneous inclusion of the rth and sth
psu in the first stage. and
TC;~1 denotes the conditional probabi1ity of
simultaneous inclusion of the units i and j in subsampling. provided that the rth and sth psu have been selected in the tirst stage. If r :;c s . the subsampling concerning the unit is independent from one concerning the unitj. Therefore _1 (') ') Tr;jII =reiII rc jII and rc;1 = ""n Tr;II rc jll·f 1 r l '*- S(J . If r
=s, then we have
rc:, = rc: ~nd rc ij
=rc: rc 61 if r(i) =r(j).
Using these first and second order inclusion probabilities. one can easily construct unbiased estimator for the population total. The total over the rth psu will be denoted by (with respect to the variable yl T n·, r = I. 2.... , N .The set of indices belonging to the psu selected in the first stage of sampling will be denoted by J c {I, 2..... M} and identified with the first stage sample. The sample of second stage units yielded by subsampling in the rth psu will be denoted by s, . r = 1. 2, .... N . Consequently the sample of second stage units will be given by s =
Us, . Under
this notation. we have
rei
rc:
=P( r e J) and 1C61 =P(i e s I r(i) e J) .
An unbiased estimator of the population total Y is given by
. Ly.....!...
y=
iE.f
TC·1
The above estimator can also be expressed as
(8.15)
150 Sampling Theor;• and Methods
'Y,
~ rrll
=L
tEsr
1 1Cr
rei
1
where fry is the estimator of Try based on subsampling . That is,
'y,
=l. 2.... , N
Try = ~ ~· r A
,
(8.16)
ies rr i
The following theorem gives the variance of the above unbiased estimator for the population total.
Theorem 8.4 In two stage sampling N A
A
V(Y)=EI[YI -Y] ~
,rn.
where Y1 = ~ 1
2 T
L
E II (Try· - T '!' )
r=l
I lrr
A
(8.17)
and the expectations E 1 and E 11 refer to the first -stage
rei rr r
sampling and sub-sampling respectively. Proof Fixing the first stage sample J. we have Eu(Y)= ~Eu A
'
[
rei
1f n· ]
f n· = '~~-=Y 1 A
rei lrr
lrr
Furthermore, since subsampling is carried out in each sampled first stage unit independently, we have
E II ('Y - Y) 2 = E II [ y - E II ( 9)] 2 +
=
A
.,
2.
E 1/ [Tn· - T1'\' ] -
rei
(lr r )-
·
[EII ( y- Y)f
·1
.,
-
A
+ [Y1 -
n-
.,
Applying the well-known relation £[.]= £ 1 £ 11 [.] to both the sides of the above expression, we get the required result. •
Exercises 8.1
Suggest an unbiased estimator for the population total assuming simple random sampling is used in the first stage and ppswr sampling is used in the second stage and derive its variance. 8.2 Suppose a population consists of N primary stage units out of which n are selected so that the probability that a sample s of size n is selected. is proportional to sample total of size variable of the primary stage units. Suppose further that for the ith psu there is an estimator T; (based on
Multistage Sampling
8.3
151
sampling at the second stage and subsequent stages> of the total Y, of the primary stage unit. Suggest an unbiased esumator of the population total usmg ~·sand denve its vanance. In a two-stage des1gn one subumt is selected wnh pp to x from the enure population. If_ this happens to come from the ith psu. a without replacement random sample of m: - I subunits is taken from the M; -1 that remain in the psu. From the other N -I psu 's a without replacement random sample of N -l psu's is taken. Subsampling of the selected p.nc's is n
~M·v· ~ 1-1 without replacement simple random. Show that ...;.i=....;l~-- is an unbiased n
~M·x· ~ I I i=l
. f y esumator o - . X
Chapter9
Non-sampling Errors 9.1 Incomplete Surveys In many large scale surveys. data cannot always be obtained from all the sampled units due to various reasons like the selected respondent may not be available at home and even if present may refuse to co-operate with the investigator etc. In such cases the available data returns are incomplete and some times. this kind of incompleteness called Non-response is so large as to completely vitiate the results. In this section some techniques meant for removing biases arising from incomplete data are presented.
Hansen and Hurwitz Technique Hansen and Hurwitz (1946) suggested a solution for obtaining unbiased estimates in mail surveys in the presence of non-response. In their method, questionnaires are mailed to all the respondents included in a sample and a list of non-respondents is prepared after the deadline is over. Then a subsample is drawn from the set of nonrespondents and a direct interview is conducted with the selected respondents and the necessary information is collected. The parameter concerned are estimated by combining the data obtained from the two pans of the survey. Assume that the population is divided into two groups. those who will respond at the first attempt belong to the response class, and those who will not respond called non-response class. Let N 1 and N 2 be the number of units in the population that belong to the response class and the non-response class respectively (N 1 + N 2 = N). Let n1 be the number of units responding in a simple random sample of size n drawn from the population and n 2 be the number of units not responding in the sample. We may regard the sample of n 1 respondents as a simple random sample from the response chiss anu the sample of n 2 as a simple random sample from the non-response class. Let h2 denote the size of the subsample from n 2 non-respondents to be interviewed and
f
=n2 . Unbiased estimators of N 1 and h2
N 2 are given by
Non-Sampling Errors Nn 1
A
Nn-,
A
N 1 =--· and N., =--n n Let yh2 denote the mean of h1 observations in the subsample and
Yw =
ntY't +n2Y'112
153 (9.1)
(9.2)
n The following theorem proves the above estimator is unbiased for the population mean and gives its variance. . _ n1Yt+ll2Yin TINonm 9.1 The esumator y w = - is unbiased for the population n
mean and its variance is
-
V(yw)
where S
r
1 I I~ 2 N., s ~ =( -;;N +(/-I) N ;
1 is the analogue of
S 2 based on the non-response class.
Proof E(y~.)=EE[y"'"ln,,n2]
=
4
ntY'r+n2Yin ] - I n 1. , n-,n
L.
=Elf] (since
yh:
is unbiased for the mean of non-response class)
=Y Therefore the estimator is unbiased. V(yw)=VE[yw ln,,n2]+EV[yw ln,,n2]
(9.3)
=V[y]+EV[yw ln,,n2]
_ [ntY't +n2Y1r-, ] Note that V[yw ln,,n2]= V n - ln,,n2
n.,
"
n2
-
:-::...(/ -l)s.;
si is the sample analogue of Sf. Hence£V[Yw ln1,n2J= (/ si I n2] = (/: l)
where
-1)1:;
=l/-1> s~ N 2 n
- N
sf{;] (9.4)
154 Sampling Theory and Methods -
( 1
·
~
t -,.,
Note that V( vn) = I - - -
-
N
n
(9.5)
Substitwing (9.4) and (9.5) in (9)) we get the required result. • The cost involved in the above technique contains three components (i) the overhead cost C 0 , (2) the cost of collecting. processing per unit in the response class C 1 and
(3)
the cost of interviewing and processing information per unit in
the non-response class C 2 . Thus. it is reasonable to consider a cost function of the form C = C 0 n + C 1n 1 + C ~n::!.. Since n 1 and n::!. are random quantities with N., N expectataons n - 1 and n-- respectively, the average cost funcuon as Nf N C' =
~ [C 0 N + C 1N 1 + C 2
values of
n; ]. The following theorem gives the optimum
f and n for a given variance.
Theorem 9.2 The values of
f and n for which the average cost is minimum
c:! s2- N2s11 when V[yw1 = V0 are given by
f=
N
sf(co+N~1 )
and
2
S2+N2(/-1)S2 N n = --.,---~--
( V+~N Ij
Proof of this theorem is left as an exercise. The problem of incomplete surveys has received the attention of many including El-Bardy( 1956), De Ienius ( 1955). Kish and Hess ( 1959), Bartholomew( 1961) and Srinath( 1971 ). Deming's Model of the effects of caD-backs Deming (1953) developed a mathematical model to study in detail the consequences of different call-back policies. Here the population is divided into r classes according to the probability that the respondent will be found at home. Let wij = probability that respondent in the jth class will be reached on or
before the ith call . Pi =proportion of population falling in the jth class. mean for the jth class and wij
Y1 =
G;., =variance for the jth class. Here it is assumed that
is positive for all classes. If
Yij
is the mean for those in class j. who were
155
Non-Sampling Errors
reached on or before the ith call. it is assumed that E(YiJ]
= Y1 .
The true
r
L [p
pcpulation mean for the item is ~ =
Y
1 1] .
j=l
Suppose a simple random sample of size n is drawn. After i calls, the sample is divided into ( r+ I) classes: in the first class and interviewed: in the second and interviewed; and so on. The (r+ I )st class consists of all those not interviewed yet. The numbers falling in these ( r+ I) classes are distributed according to the multinomial r
[w;IPI +w;2P2 + ... +w;,Pr +(1- LwijPj)]"o j=l
.
where n0 is the initial size of the sample. Therefore n; follows Binomial r
distribution with parameters n 0 and
L wii p
1 .
For fixed n, . the number of
j=l
mterviews
nij \
j
=I. 2..... r)
follows multinomial with probabilities
r
I
W;jPj
j=l
Therefore E[ n ij I n i ]
n; wij p J __;;,___;;._ = -,
.L
WijPJ
j=l
If
y,
is the sample mean obtained after i calls. r
E[y,
In;]=
E[
-
L
nij Yij
j=l
ll;
r
LwiJpiYJ 1=1
-,
=~-----Y r
LWiJP j j=l
Since the above expected value does not depend on n; , the overall expectation of
Y;
is also
Y; . Therefore the estimator 1s biased for the populati9n mean
The bias of the esumator
E[y;] = Y,- y
Y;
is given by
Y.
156 Sampling Theory and Methods r
L L
p J YJ
W;j
-
r
r-l
-
-
:"
WijP,I
I
r
-
w .. p . y . -
,
I)
J
r
1-
I}
j=l
]
I
w .. p . IJ
}
-y . I
j=l
L.
j=l
[ 1-
=
i
W;j
J- '
pj
~ w" p ' (f.,
j=l
~
r
~ ~wijPj
IJ
J
-f.I
IJ
1 )
j=l
j=l
where Y;' is the mean of the units not interviewed yet. The conditional variance of the estimator
v.
~ [ N ij ~ --[Y;j -
-
V[y; In; I=
N - n· I
'
'
N; -1 ";
•
after i calls is
2 N 'k - 1 ., ] • Y;] + ' S;j Nij
N;
j=l
I
N;;
L[ -1
1
where S ;] = Nij
meaning. Taking
¥;1 ] 2
Yijk -
.
The quantities N iJ, N; etc. have the usual
k=l
v..
a,·, =:....JL and .
N,
IJ
N· -n· 1 V[y-·ln]= I I N' 1' i -
If we further assume
Nij-1 ., ., _ s~IJ =a-:-:1}' V[y·I In·] can be written as .. I N
L [r-.. -r.]-+a - ., .. } r
{a !I..
2
I}
I
IJ
II; j=l
a;7 = a}
and ignore terms of order
._!,,.it can be seen that n-
r
.!
L {a ij [f;j - Y; ] + a J} 2
MSElY;l=(!- J-J=-1- - ,- - - -
L
WijP j
j=l
Deming has also considered the problem of determining optimum .number of call-backs for the given sample size and cost of the survey. For related results one can refer to Deming ( 1946). Politz-Simmons Technique Politz and Simmons ( 1949,1950) developed a technique to reduce the bias due to incomplete surveys without successive call-backs. Their method is described below:
Non-Sampling Errors
157
The interviewer makes only one call during a specific time on six weekdays. If the respondent is at home. the required information is collected and he is asked how many times in the preceding five days he was at home at the time of visit. This data is used to estimate the probability of the respondent's availability. If the respondent states that he was at home t nights out of five, the ratio
'+6 1 is taken as an estimator of the frequency
7t
with which he is at home
during interviewing hours. The results from the first call are sorted intQ six groups according to the values oft (0.1,2,3.4,5). Let nr be the number of interviews obtained from the ith group and
y1
the mean based on them. The Politz-Simmons esumate of the
L 6ntYt5
~
population mean is Yp.f
_ r=O t -
+l .
5
L 6nt
In this approach, the fact dtat the first call
t=O t +I
results are unduly weighted with persons who are at home most of the time is recognised. Since a person who is at home. on the average. a proportion 1t of the time has a relative chance 7t of appearing in the sample. his response should receive a weight _!_.The quantity 1C
~ t+l
is used as an estimate of _!_.Thus Yps 1C
is less biased than the sample mean from the first call, but because the estimator happens to be weighted mean.
1t
has greater variance
Let the population be divided into classes, -people in the jth class being at home J of the time. Note that the kth group will contain persons from various
rc
classes. That is. persons at home t nights out of the preceding tive belong to various classes. Let n jc. y jt be the number and the mean for those in class j and
±t
6njrYjt
group r. Then the estimator
~ Y_,,.~
~
can be wntten as Y
ps
--
tt
r=O j=l
t+1
. If n 0
6n_;1
r=O j=l t
+l
1s the initial size of sample (response plus not-at-homes) and n j is the number from class j who are interviewed, the following assumptions are made. n· (l) - 1 . is a binomial estimate of p 1 1
rc
no
<2>
E[n 11 1n 1 J=n 1
r5 )rcjo-rc 1 >5-t
.'l
(3) E[y jt] = Yj for any j and t
!58 Sampling Theory and Merhods
It can be shown that under the above assumptions ;
E(I-6_, = i [ l - n-It, + r=l
t
1
_16]
1
:rj
r
= n0
L
p 1 ( 1- ( 1-
1r 1 ) 6 ]
j=l
Since E[y jt] = Y1 for any j and t. we have
~
_t=l
E[Y p.f] = ~,-------
LPj[l-(1-~r_; i=l
This
~haws
)6]
.
that the estimator Yp.{ is biased for Y . However, in practice the
amount bias is likely to be small when compared to call-back surveys. The variance of the estimator is quite complicated. For more details. one can refer to the original paper.
9.2 Randomised Response Methods In many sample surveys involving human populations, it is very difficult to get answers which are truthful and in some cases the respondents fail to co-operate. It is mainly due to sensitivity of certain questions which are likely to affect the privacy of respondents. To overcome this limitation, Warner ( 1965) has designed a technique to encourage co-operation and truthful answering. Suppose members of a group A in a population have a socially unacceptable character and we are interested in estimating the proportion 1t A of the persons belonging to A. Assume that a simple random sample of size n is drawn with replacement from the given population. Wa.rner's Method Each selected respondent is given a random device which results in one of the two statements "I belong to group A " and "I do not belong to A ". The respondent is asked to conduct the. experiment unobserved by the investigator and report only "yes" or '"no'' according to the outcome of the experiment . He does not repon the outcome of the experiment. If n 1 persons in the sample report ••yes" answer and n 2 = n - n 1 report "no" answer then an unbiased
estimator of 6. the probability of ··yes" answer is given by8w =~. It is to be n noted that
159
Non-Sampling Errors
8w =Prc ..1 +0-P)!l-Jr.\l
(9.6)
where P is the probability of getting the statement ··1 belong to group A" and (1- P) is the probability of getting the other statement. It is assumed that P is known.
Hence
an
unbiased
estimator
of
the
parameter
It' .1
ts
8w -(1-P) 1 . , P :1:--. Smce n 1 has binomial distribution with parameters 2P-l 2 nand 8w, It' A w
=
V(lt'AW)
=--";..;;.._-.,-
(9.7)
(2?-l)-
=lt'A(l-Jr.4.)+
P(l-P)
n
n(2P -1)-
.,
(9.8)
Here it is assumed that the respondent is truthful. The first term. on the right hand side of (9.8) is the usual binomial variance that would be obtained when all the respondents are willing to answer truthfully and a direct question is presented to each respondent included in the sample. The second term represents a sizable addition due to the random device. It is to be 1
noted that the above method is not useful when P = -
.2
and for P = 1 . the
method reduces to direct questioning. Simmons Randomised Response Model -In order to enhance the confidence of the respondent in the anonymity provided by the randomised response method. Simmons suggested that one of the statements referred to is a non-sensitive attribute. say Y. unrelated to the sensitive attribute A. In some cases the respondent would get one of the following two statements with probabilities P and n - P). 1. I belong to group Y 2. I belong to group A In this case the statement 1 would not embarrass the respondent. If lt'y is the proportion in the population with the attribute Y and it is known then the proportion It' A caJl be estimated unbiasedly. Note that the probability of getting
the yes answer is
8 s =Pit' A
+ (1- P)lt'y . If
Bs
is the proportion of yes answers
in the sample of size n. then an unbiased estimator of
.. It' AS
=
8 5 -(1-P)lt'y p
and its variance is .. ) -_........;:;. 8 s_ (1-8 V( lt'As _s ) nP 2 ~
It' A
ts
160 Samplmg Theory and Methods
When
is unknown. the method
1ry
~an
be altered to facilitate estimauon of both
Try and rr A . Here. the sample is drawn in the form of two independent samples
of sizes n1 and n 2 again with replac~ment and with probabilities P1 and P2 for getting the sensitive statements in the tirst and second samples respectively. The same unrelated question is presented in with probabilities I - P1 and I - P2 in the first and second samples respectively. If 8 1 and 8 2 are the respective probabilities of "yes" answer. then we have 81 = Plrr A + 0- PI )Try 8 2 = P2 rr A+ (1- P2 )Try Solving these two expressions. we get -
- P281- P182 p'!. -Pl
-
- (I - p2 )81 -(I - pl )82
'"Y-
'"A-
pl -P'!.
Let n' 1 and n' 2 be the number of yes answers in the first and second samples I
I
respectively. Since ~ and n 2 are unbiased for 8 1 and 8 2 respectively. an nl
n2
unbiased estimator of rr A is given by
. _ o - P2 >B1 - o- P1 lB 2
rrAs-
where
P1 -P2 .. n' 1 .. n'., 8 1 = - and 8., nl
-
=-. Since n2
11'1
and n'2 are independent and
binomially distributed with parameters (11 1,8 1) and (n 2 .82). the variance of
it .4 s is found to be nl n1 V(iAs)=------~------------~----=-------
( pl - p2 )-
Folsom's Model with two unrelated characteristics Folsom et al ( 1973) developed an unrelated-question model with two nonsensitive characteristics. y 1 and y 2 in addition to the sensitive character A. Assume that the non-sensitive proportions rr yl and rr y'!. are unknown.
Two
independent simple random samples with replacement of sizes n 1 and n 2 are drawn. Each respondent in both the samples answer a direct question on a nonsensitive topic and also one of two questions selected by a randomised device. The following table given in the next page describes the scheme.
Non-Sampling Errors
Technique used wtth respondents Randomised Response (RR)
Sampie I Question A Quesuon f;
Direct Response (DR)
Question
Y2
161
Sample 2 Question A Question Y2 Question
Y;
In both samples let the sensitive question be asked with the probability P. and for i
= l, 2, A.r ( A.f ) be the probability of a "yes" answer to the question selected by
RR( DR) in the ith sample. Then
AI = p 1t' A +(I -
p )1t'yJ
A.2 = p 1t'.4 + (l- p
)1t'Y2
(9.9) (9.10)
A,d
=
1ryl
(9.11)
A;
= Kr:
{9.12)
Let i[ , i2
.if
and i~ denote the usual unbiased estimators of
A.r . A.2 . A.f
and A.~ respectively. given by the corresponding sample proportions. Then from (9.9) and (9.12) we get an unbiased estimator as -ir- (]P)it.,1 .. AJ . (9.13) JrA(l)= p Using (9.10) and (9.11) we get another unbiased estimator as .. A.~ J?A(2)=-
..d
- 0 - P)A.
I (9.14) p Variances of the estimators defined in (9.13) and (9.14) can be obtained easily. This is left as an exercise. In addition to these three Randomised Response methods, several other schemes are available. For details one can refer to Chaudhuri and Mukerjee ( 1988).
9.3 Observational Errors So far in all our discussions, it has been assumed that each unit in the population us attached a fixed value known as the true value of the unit with respect to the character under study and whenever a population is included in the sample, its value of y is observed. However~ this assumption is an over simplification of the problem and actual experience does not support this assumption. There are plenty of examples to show that error of measurements of responses are present when a survey is carried out. In this section we shall consider this problem and devise methods for the measurement of these errors to plan the survey as meticulously as possible. Let us assume that M interviewers are available for the survey. The response xijk obtamed by interviewer on unit j assumed to be a randcm variable with E2[xijk j =X ij
and
V2[xijk]
= S;J. The average of responses obtamed by
162 Sampling Theon• and Methods N
~
-
interviewer i on all the N units in the population ts X,- = L
-
x,J
N
and the
j=l
-
M -
~X 1
average obtamed by all the M interviewers would be X = ~ 1=1
,w
•
This value
can be taken as the expected value of the survey, whereas the true value is Y the population mean based on all the umts in the population. The difference X - Y is called the response bias. The response obtained from a sampled unit depends on the person who observes the unit. Therefore it is desirable to allocate the sample interviewer (selected out of the M available) to the sample units (selected out of theN units in the population). Now consider the situation. in which a simple random sample
n= ~
units is selected from the population of N units and assigned to an m interviewer selected at random from the population of N units and assigned to an interviewer selected at random from the M available for the study. Another independent sample of size is selected and assigned to another interviewer selected at random from the M. In this process m such subsamples of size ii are selected and assigned to the M interviewers. The following theorem gives an unbiased estimator of X under the above scheme. of
n
Theorem 9.3 Under the sampling scheme described above. an unbiased I Ill 1 n X·· 11k . h . . gtven . by X =- ~ - h ~ 1 esttmator o f X ts ~ .t; w ere x; ~--=- ts t e samp e m 1=1 n 1=1 n
=-=
A
mean provided by the ith selecuon of the interviewer. Proof If a unit is selected at random from the population containing N units and an interviewer is chosen at random from the M and assigned to the selected unit. the expected value of the response x;jk will be X . It is because for a given interviewer i and for a given unit j. E 2 [x;jk] =X ij . This implies for a tixed i, 1111 N_E:dxijk] =-LX ;1 .· Therefore E[x;jk] =--LX ij =X . This implies that
N.
j= 1
MN.
}=1
m
L
1 the sample mean 'X; is unbiased for X. Hence E[i] =E[x;] =X . Hence m.I= I the proof. •
Non-Sampling Errors Theorem 9.4
[I 1]
]
_ Vlxl V(x)=--+ - - - C where V[.t] = MN
n
m
l
and C =
n
M
M
163
N
L L E[x,1k - Xl
2
r=l J=l N
_
_
L L E[x;jk -X ][x;j'k -X I
MN(N -1). 1 ., t= j<J
Proof of this theorem follows from routine algebra and hence left as an exercise. The variance of the sample mean based on a survey employing interviewers has two components. One is the variability of all responses over all units to all interviewers and the other is the covariance between responses obtained from different units within interviewer assignments. If advance estimates of these two components are available. one can determine from the variance gtven tn Theorem 9.4 the optimum number of interviewers to employ for the collection of data. Let c 1 be the cost per unit in the sample and c., be the cost per interviewer, so that the total cost of the surveyor is c1 = c0 + c 1n + c 2 m (9.15) The values of n and m can be found by minimising V(x) for a given cost with the help of the method of Lagrangian multtpliers. Setting the partial derivatives of V(i) +A. (c 0 + c 1n + c 2 m- Cr) with respect to n and m equal to zero. we get
kl = V(x).,- C and n-
c
A.c.,=.,
~
!!: = n
(9.16)
m
v~
.Jvcx>-C
rc
(9.17)
The actual values of n and m are obtatned by substituting the ratio gtven in (9.17) in the cost function defined in (9.15). Since the covariance component C and the vartance V depend on the number of intervtewers used and the size of the asstgnment. the solution obtained should be used for getting an idea of the magnitudes involved. Thus we have seen the manner in which resources can be allocated towards the reduction of sampling errors (as provided by n) and nonsampling errors (interviewer errors). The following theorem gives unbiased estimates of C. V(x) and V(:t) under the sampling scheme described in this section.
Theorem 9.5 Under the sampling scheme described tn this section. unbiased ., ., ., 2 s- -s;. ., s--sw estimates of C. V(x) and VCi) are gtven by C = b • V(x) = s~. +-b_ _ n
n
164 Sampling Theorv and Methods m
/PI
- = V(x)
and
~[h L X;- -·, xj- were
l m(mm
: .Sw =
J) 1=1
1 ~r---)1, =--L n{.t; -x m-1
and
!=I
n
~ ~[ _1 L ~ Xijlc m(n -I) . 1 . 1 I= J=
sb,
-
-X;
12
Proof of this theorem is left as an exercise.
Exercises 9.1 9.~
9.3
Extend Warner"s method to the case of estimating two proportions. Find the mean square error of the estimator rrab = a8 + b where 8 is as defined in Warner's model. Find the minimum mean square error of the estimator suggested in 9.2 and offeryourcommen~.
9.4 Obtain an unbiased estimator for the sensitive proportion under Warner's method assuming, the probability of a respomient being untruthful is L and derive the variance of the estimator.
Chapter 10
Recent Developments 10.1 Adaptive Sampling It has been an untiring endeavour of researchers in sampling theory to seek estimators with increased precision. In the earlier chapters we have seen a variety of sampling estimating strategies which use the information of a suitable auxiliary (size) variable either in the sampling design or in the estimator. There are very few sampling schemes which use the knowledge of study variable in the selection stage. Recently Thompson ( 1990) introduced sampling schemes which directly use the knowledge of study variable in the selection process. In this section details of his sampling schemes are presented. Quite often we encounter surveys where the investigator gathers information regarding the number of individuals having some specific characteristics. As an example one can think of a survey involving endangered species in which observers record data regarding the number of individuals of the species seen or heard at locations within a study area. In such surveys frequently zero abundance is encountered. In those cases, whenever substantial abundance is seen, exploration in nearby locations is likely to yield additional clusters of abundance. These kinds of patterns are encountered along with others, from whales to insects, from trees to lichens and so on. Generally, in sample surveys, survey practitioners decide their sampling strategy before they actually begin data collection. However, functioning in this predetermined manner may not be effective always. For example, in epidemiological studies of contagious diseases, whenever an infected individual is encountered, it is highly likely that neighbouring individuals will reveal a higher than expected incidence rate. In such situations, field workers may not like to stick to their original sampling plan. They will be interested in departing from the preselected sample plan and add nearby or closely associated units to the sample. Keeping these points in mind, Thompson (1990) suggested a new sampling scheme. In the sampling scheme suggested by him namely, "adaptive sampling", an initial sample of predetermined size is drawn according to a conventional sampling design. The values of sampled units are scrutinized. Whenever the observed value of a selected unit satisfies a given condition of interest, additional units are added to the sample from the neighbourhood of that unit. The basic idea of the design is illustrated in Figures 1Oa and lOb. Figure
[Fig. 10(a) Initial sample of 10 units]
[Fig. 10(b) Final sample after neighbouring units are included]
Figure 10(a) shows an initial sample of 10 units. Whenever one or more of the units is found to satisfy a given condition, the adjacent neighbouring units to the left, right, top and bottom are added to the sample. When this process is completed, the sample consists of 45 units, as shown in Figure 10(b). It is pertinent to note that the neighbourhood of a unit may be defined in many ways other than spatial proximity. The formal definition of adaptive sampling is presented below: in adaptive sampling an initial set of units is selected by some probability sampling procedure, and whenever the variable of interest of a selected unit satisfies a given criterion, additional units in the neighbourhood of that unit are added to the sample. The criterion for selection of additional neighbouring units can be framed in several ways, depending on the nature of the study. For example, it can be taken as a set $C$ containing a given range of values of the variable of interest; unit $i$ is said to satisfy the condition if $Y_i \in C$. In particular, a unit satisfies the condition if the variable of interest $Y_i$ is greater than or equal to some constant $c$, that is, $C = \{x : x \ge c\}$. Here it is assumed that the initial sample is a simple random sample of $n$ units selected either with or without replacement. To introduce appropriate estimators under the adaptive sampling scheme, we need the following definitions.
Neighbourhood of a unit For any unit $U_i$ in the population, the neighbourhood of $U_i$ is defined as a collection of units which includes $U_i$, with the property that if unit $U_j$ is in the neighbourhood of unit $U_i$, then unit $U_i$ is in the neighbourhood of unit $U_j$. These neighbourhoods do not depend on the population values.
Cluster The collection of all units that are observed under the design as a result of the initial selection of unit $U_i$ is termed a cluster. Note that such a collection may consist of the union of several neighbourhoods.
Network A set of units is known as a network if the selection in the initial sample of any unit in the set will result in the inclusion in the final sample of all units in that set. It is convenient to consider any unit not satisfying the condition as a network of size one, so that the given y-values may be uniquely partitioned into networks.
Edge unit A population unit is said to be an edge unit if it does not satisfy the condition but is in the neighbourhood of one that satisfies the condition.
Notations
$n_1$ : size of the initial sample
$\psi_k$ : network which contains the unit $U_k$
$m_k$ : number of units in the network to which unit $k$ belongs
$a_i$ : total number of units in networks of which unit $i$ is an edge unit.
Selection procedure and related properties As mentioned earlier, an initial sample of $n_1$ units is drawn using SRSWR or SRSWOR. When a selected unit satisfies the condition, all units within its neighbourhood are added to the sample and observed; units satisfying the condition among these additions have their neighbourhoods included as well, and so on. Denote by $m_i$ the number of units in the network to which unit $i$ belongs and by $a_i$ the total number of units in networks of which unit $i$ is an edge unit. Note that if unit $i$ satisfies the criterion $C$ then $a_i = 0$, whereas if unit $i$ does not satisfy the condition then $m_i = 1$. Unit $i$ will be selected in a given draw if either one of the $m_i$ units in its network or one of the $a_i$ units for which it is an edge unit is drawn in the initial sample. Hence the probability of selecting unit $i$ in a given draw is
$$p_i = \frac{m_i + a_i}{N}.$$
The number of ways of choosing $n_1$ units out of $N$ is $\binom{N}{n_1}$. Let $B_i$ be the subset of population units containing the units in the network that contains unit $i$ together with the units for which unit $i$ is an edge unit. Clearly $n(B_i) = m_i + a_i$. A sample not containing unit $i$ can be drawn by considering the set $S - B_i$, which contains $N - m_i - a_i$ units. Hence, under SRSWOR, the probability of not including unit $i$ in the sample is $\binom{N - m_i - a_i}{n_1}\big/\binom{N}{n_1}$, and therefore the probability of including unit $i$ in the sample is
$$\alpha_i = 1 - \binom{N - m_i - a_i}{n_1}\Big/\binom{N}{n_1}.$$
When the initial sample is selected using SRSWR, the probability that unit $i$ is included in the sample is $\alpha_i = 1 - (1 - p_i)^{n_1}$. Since some of the $a_i$ may not be known, neither the draw-by-draw probability $p_i$ nor the inclusion probability $\alpha_i$ can be determined.
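The bookkeeping behind $m_i$, $a_i$ and $p_i$ is easy to mechanise. The following Python sketch uses a hypothetical one-dimensional population in which the neighbourhood of a unit consists of its left and right neighbours and the condition is $y_i \ge c$; it assumes the whole population is visible, which is of course only possible in a simulation:

y = [0, 3, 4, 0, 0, 0, 5, 0, 2, 0]   # hypothetical y-values
c = 2                                # condition: y_i >= c
N = len(y)
satisfies = [yi >= c for yi in y]

# Networks: maximal runs of adjacent satisfying units; every
# non-satisfying unit forms a network of size one.
network_of = list(range(N))
for i in range(1, N):
    if satisfies[i] and satisfies[i - 1]:
        network_of[i] = network_of[i - 1]

# m_i: size of the network containing unit i
m = [sum(1 for j in range(N) if network_of[j] == network_of[i])
     if satisfies[i] else 1 for i in range(N)]

# a_i: total number of units in networks of which i is an edge unit
a = [0] * N
for i in range(N):
    if not satisfies[i]:
        for j in (i - 1, i + 1):
            if 0 <= j < N and satisfies[j]:
                a[i] += m[j]

# Draw-by-draw selection probability p_i = (m_i + a_i) / N
p = [(m[i] + a[i]) / N for i in range(N)]

In this toy population the networks are {1, 2}, {6} and {8}; for example, unit 7 is an edge unit of two networks and has $a_7 = 2$.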
Estimators under adaptive sampling
Classical estimators such as the sample mean are not unbiased under adaptive sampling. We now describe some estimators suitable for adaptive sampling.
(i) The initial sample mean
If the initial sample in the adaptive design is selected either by SRSWR or SRSWOR, the mean of the initial observations is unbiased for the population mean. However, this estimator completely ignores all observations in the sample other than those initially selected.
(ii) Modified Hansen-Hurwitz estimator
In conventional sampling, the Hansen-Hurwitz estimator, in which each y-value is divided by the number of times the unit is selected, is an unbiased estimator of the population mean. However, in adaptive sampling, selection probabilities are not known for every unit in the sample. An unbiased estimator can be formed by modifying the Hansen-Hurwitz estimator to make use of observations not satisfying the condition only when they are selected in the initial sample. Let $\psi_k$ denote the network which contains the unit $U_k$ and $m_k$ the number of units in that network. Let $\bar{y}_k^*$ be the average of the observations in the network that includes the $k$th unit of the initial sample, that is,
$$\bar{y}_k^* = \frac{1}{m_k}\sum_{j \in \psi_k} y_j.$$
A modified Hansen-Hurwitz estimator can be defined by using $\bar{y}_k^*$ as
$$\bar{y}_{HH}^* = \frac{1}{n_1}\sum_{k=1}^{n_1} \bar{y}_k^*.$$
Theorem 10.1 The estimator $\bar{y}_{HH}^* = \frac{1}{n_1}\sum_{k=1}^{n_1}\bar{y}_k^*$ is unbiased for the population mean.
Proof Case 1: SRSWOR is used to select the initial sample. Let $z_i$ indicate the number of times the $i$th unit of the population appears in the estimator. The random variable $z_i$ has a hypergeometric distribution when the initial sample is selected by SRSWOR, with $E[z_i] = \frac{n_1 m_i}{N}$. Now
$$\bar{y}_{HH}^* = \frac{1}{n_1}\sum_{k=1}^{n_1}\bar{y}_k^* = \frac{1}{n_1}\sum_{k=1}^{n_1}\frac{1}{m_k}\sum_{j \in \psi_k} y_j = \frac{1}{n_1}\sum_{k=1}^{N}\frac{z_k y_k}{m_k}.$$
Taking expectation on both sides we get the required result.
Case 2: SRSWR is used to select the initial sample. Let $z_i$, as in Case 1, indicate the number of times the $i$th unit of the population appears in the estimator. It is to be noted that $z_i$ is nothing but the number of times the network including unit $i$ is represented in the sample. Note that
$$P(z_i) = \binom{n_1}{z_i}\left(\frac{m_i}{N}\right)^{z_i}\left(1 - \frac{m_i}{N}\right)^{n_1 - z_i},$$
so that $E[z_i] = \frac{n_1 m_i}{N}$. Expressing the estimator $\bar{y}_{HH}^* = \frac{1}{n_1}\sum_{k=1}^{n_1}\bar{y}_k^*$ in the form considered in Case 1 and taking expectations we get the required result. •
The following theorem gives the variance of the estimator $\bar{y}_{HH}^* = \frac{1}{n_1}\sum_{k=1}^{n_1}\bar{y}_k^*$ in the two cases of simple random sampling.
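Continuing the toy population from the earlier sketch, the modified Hansen-Hurwitz estimator is only a few lines (the initial SRSWOR sample is simulated here):

import random

random.seed(1)
n1 = 3
initial = random.sample(range(N), n1)   # initial SRSWOR sample

def network_mean(k):
    # ybar*_k: average of y over the network containing unit k
    # (a network of size one if unit k fails the condition)
    if satisfies[k]:
        members = [j for j in range(N) if network_of[j] == network_of[k]]
    else:
        members = [k]
    return sum(y[j] for j in members) / len(members)

ybar_hh = sum(network_mean(k) for k in initial) / n1
# Averaging ybar_hh over repeated initial samples reproduces the
# population mean sum(y) / N, in line with Theorem 10.1.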
Theorem 10.2 (a) If the initial sample is selected by SRSWOR, the variance of $\bar{y}_{HH}^* = \frac{1}{n_1}\sum_{k=1}^{n_1}\bar{y}_k^*$ is given by
$$V(\bar{y}_{HH}^*) = \frac{N - n_1}{N n_1}\,\frac{1}{N-1}\sum_{i=1}^{N}[\bar{y}_i^* - \mu]^2.$$
(b) If the initial sample is selected by SRSWR, the variance of $\bar{y}_{HH}^*$ is given by
$$V(\bar{y}_{HH}^*) = \frac{1}{N n_1}\sum_{i=1}^{N}[\bar{y}_i^* - \mu]^2,$$
where $\bar{y}_i^*$ is the average of the observations in the network that includes unit $i$ and $\mu = \frac{1}{N}\sum_{i=1}^{N} Y_i$.
Proof Taking $\bar{y}_i^*$ as the variable of interest and applying the results available under the non-adaptive simple random sampling schemes, the desired expressions can be obtained. •
(iii) Modified Horvitz-Thompson estimator
We know that knowledge of the first-order inclusion probabilities $\pi_i$ can be used to construct the Horvitz-Thompson estimator for estimating the population total. With adaptive designs, the inclusion probabilities are not known for all units included in the sample, and hence the estimator cannot be used to estimate the total unbiasedly. An unbiased estimator can be formed by modifying the Horvitz-Thompson estimator to make use of observations not satisfying the condition only when they are included in the initial sample. In this case, the probability that a unit is used in the estimator can be computed, even though its actual probability of inclusion in the sample may be unknown. Define the indicator variable, for $k = 1, 2, \ldots, N$,
$$J_k = \begin{cases} 0 & \text{if the } k\text{th unit does not satisfy the condition and was not included in the initial sample}\\ 1 & \text{otherwise.}\end{cases}$$
The modified estimator is
$$\bar{y}_{HT}^* = \frac{1}{N}\sum_{k=1}^{\nu}\frac{y_k J_k}{\alpha_k}$$
where $\nu$ is the number of distinct units in the sample and $\alpha_k$ is the probability that unit $k$ is included in the estimator. It can be seen that, whether unit $i$ satisfies the condition or not, the probability of including it in the estimator is $\alpha_i = 1 - \binom{N - m_i}{n_1}\big/\binom{N}{n_1}$. The following theorem gives the variance of the estimator $\bar{y}_{HT}^*$.
Theorem 10.3 The estimator $\bar{y}_{HT}^*$ is unbiased for the population mean and its variance is
$$V(\bar{y}_{HT}^*) = \frac{1}{N^2}\sum_{j=1}^{D}\sum_{h=1}^{D} y_j^* y_h^*\left[\frac{\pi_{jh} - \pi_j \pi_h}{\pi_j \pi_h}\right]$$
where $D$ is the number of networks in the population, $y_j^*$ is the total of the $y$-values in network $j$, $\pi_j$ is the probability that the initial sample contains at least one unit of network $j$, and $\pi_{jh}$ is the probability that the initial sample contains at least one unit in each of the networks $j$ and $h$.
Proof of this theorem is straightforward and hence omitted. The results presented above are due to Thompson, and more about adaptive sampling is available in Thompson (1990, 1991a, 1991b).
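Under SRSWOR the probability $\alpha_k$ that unit $k$ enters the estimator depends only on its network size $m_k$, so it can be computed exactly; the sketch below continues the previous ones (edge units contribute only if they were drawn initially, which is what the indicator $J_k$ enforces):

from math import comb

def alpha(k):
    # Probability that at least one unit of the network containing k
    # appears in the initial SRSWOR sample of size n1
    return 1 - comb(N - m[k], n1) / comb(N, n1)

# Final sample: initial units plus the full networks of those that
# satisfy the condition (edge units met along the way have J_k = 0
# unless they were themselves drawn initially).
final = set(initial)
for k in initial:
    if satisfies[k]:
        final |= {j for j in range(N) if network_of[j] == network_of[k]}

used = [k for k in final if satisfies[k] or k in initial]   # units with J_k = 1
ybar_ht = sum(y[k] / alpha(k) for k in used) / N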
10.2 Estimation of Distribution Function
The problems of estimating finite population totals, means and ratios of the survey variables are widely discussed in the sample survey literature, but the estimation of the finite population distribution function has not received that much attention. Estimation of the distribution function is often an important objective, because sometimes it is necessary to identify subgroups in the population whose values for particular variables lie below or above the population average, median, quantiles or any other given value. The estimation of the finite population distribution function has received the attention of Chambers and Dunstan (1986) and Rao, Kovar and Mantel (1990). In this section their contributions are presented.
Population distribution function Let $\Delta(x)$ be the step function
$$\Delta(x) = \begin{cases} 1 & \text{if } x \ge 0\\ 0 & \text{otherwise.}\end{cases}$$
Let $Y_1, Y_2, \ldots, Y_N$ be the values of the $N$ units in the population with respect to the variable $y$. The finite population distribution function of $y$ is defined as
$$F_N(t) = \frac{1}{N}\sum_{i=1}^{N}\Delta(t - Y_i), \quad t \in R. \qquad (10.1)$$
We know that the Horvitz-Thompson estimator for a finite population total $Y$ is given by
$$\hat{Y}_{HT} = \sum_{i \in s}\frac{Y_i}{\pi_i} \qquad (10.2)$$
where the $\pi_i$'s are the inclusion probabilities corresponding to the sampling design $P(s)$ used to choose the sample. Hence the Horvitz-Thompson estimators of $N$ and $\sum_{i=1}^{N}\Delta(t - Y_i)$ are $\sum_{i\in s}\frac{1}{\pi_i}$ and $\sum_{i\in s}\frac{\Delta(t - Y_i)}{\pi_i}$ respectively, and a design-based estimator of $F_N(t)$ is
$$\hat{F}_N(t) = \frac{\displaystyle\sum_{i\in s}\Delta(t - Y_i)/\pi_i}{\displaystyle\sum_{i\in s}1/\pi_i}.$$
Note that $\hat{F}_N(t)$ reduces to the ordinary sample empirical distribution function when the inclusion probabilities are all equal, and it is design unbiased under any sampling design satisfying $\sum_{i\in s}\frac{1}{\pi_i} = N$.
Kuk (1988) compared the performance of the estimator $\hat{F}_N(t)$ with those of $\hat{F}_L(t)$ and $\hat{F}_R(t)$, where
$$\hat{F}_L(t) = \frac{1}{N}\sum_{i\in s}\frac{\Delta(t - Y_i)}{\pi_i},$$
$$\hat{F}_R(t) = 1 - \hat{S}_R(t), \qquad \hat{S}_R(t) = \frac{1}{N}\sum_{i\in s}\frac{\Delta(Y_i - t)}{\pi_i}.$$
It may be noted that $\hat{S}_R(t)$ estimates the proportion of units in the population whose values exceed the given value $t$. It is interesting to note that $\hat{F}_L(t)$ is not necessarily equal to $\hat{F}_R(t)$. Further, it can easily be seen that both $\hat{F}_L(t)$ and $\hat{F}_R(t)$ are unbiased for $F_N(t)$ under all sampling designs for which $\pi_i > 0$ for every $i = 1, 2, \ldots, N$. Even though both of them are unbiased for $F_N(t)$, they lack the most important property of being distribution functions. On the other hand, $\hat{F}_N(t)$, even though by nature a distribution function, is not unbiased. The following theorem gives the mean square errors of $\hat{F}_L(t)$ and $\hat{F}_R(t)$ and the approximate mean square error of $\hat{F}_N(t)$.
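The three estimators are direct to compute once the sampled values and their inclusion probabilities are on hand; a Python sketch with hypothetical inputs:

y_s  = [12.0, 7.5, 20.1, 15.3]     # sampled y-values (hypothetical)
pi_s = [0.25, 0.10, 0.40, 0.25]    # their inclusion probabilities
Npop = 20                          # population size

def delta(u):
    # Step function of (10.1): 1 if u >= 0, else 0
    return 1.0 if u >= 0 else 0.0

def F_hat(t):
    # F^_N(t): a genuine distribution function, but not unbiased
    num = sum(delta(t - yi) / pi for yi, pi in zip(y_s, pi_s))
    den = sum(1.0 / pi for pi in pi_s)
    return num / den

def F_L(t):
    # Unbiased, but may fall outside [0, 1]
    return sum(delta(t - yi) / pi for yi, pi in zip(y_s, pi_s)) / Npop

def F_R(t):
    # One minus the estimated proportion of values at or above t
    return 1.0 - sum(delta(yi - t) / pi for yi, pi in zip(y_s, pi_s)) / Npop

print(F_hat(15.0), F_L(15.0), F_R(15.0))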
Theorem 10.4 (a) The mean square error of $\hat{F}_L(t)$ is
$$\frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N}\frac{\pi_{ij} - \pi_i\pi_j}{\pi_i\pi_j}\,\Delta(t - Y_i)\Delta(t - Y_j).$$
(b) The mean square error of $\hat{F}_R(t)$ is
$$\frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N}\frac{\pi_{ij} - \pi_i\pi_j}{\pi_i\pi_j}\,\Delta(Y_i - t)\Delta(Y_j - t).$$
(c) The approximate mean square error of $\hat{F}_N(t)$ is
$$\frac{1}{N^2}\sum_{i=1}^{N}\sum_{j=1}^{N}\frac{\pi_{ij} - \pi_i\pi_j}{\pi_i\pi_j}\left(\Delta(t - Y_i) - F_N(t)\right)\left(\Delta(t - Y_j) - F_N(t)\right).$$
Proof of this theorem is straightforward and hence left as an exercise.
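When the full population and the joint inclusion probabilities are known, as in a design-comparison simulation, part (a) of the theorem can be evaluated directly; a minimal sketch reusing delta from the previous fragment (pi2[i][j] denotes $\pi_{ij}$, with pi2[i][i] equal to $\pi_i$; all inputs hypothetical):

def mse_FL(t, Y_all, pi, pi2):
    # Theorem 10.4(a); parts (b) and (c) differ only in the
    # delta-terms entering the double sum.
    Np = len(Y_all)
    total = 0.0
    for i in range(Np):
        for j in range(Np):
            cij = (pi2[i][j] - pi[i] * pi[j]) / (pi[i] * pi[j])
            total += cij * delta(t - Y_all[i]) * delta(t - Y_all[j])
    return total / Np ** 2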
Remarks (1) It can be seen that $MSE(\hat{F}_R(t)) \le MSE(\hat{F}_L(t))$ if
$$\sum_{i=1}^{N} b_i \ge 2\sum_{i=1}^{N} b_i\,\Delta(Y_i - t), \quad\text{where } b_i = \sum_{j=1}^{N}\frac{\pi_{ij} - \pi_i\pi_j}{\pi_i\pi_j}.$$
(2) The results mentioned above are due to Kuk (1988) and more details can be obtained from the original paper.
Rao et al. (1990) suggested difference and ratio estimators for the population distribution function which use the knowledge of auxiliary information. The design-based ratio and difference estimators of $F_N(t)$ are obtained from standard results for totals or means, treating $\Delta(t - Y_i)$ and $\Delta(t - RX_i)$ as $y$ and $x$ variables respectively, where
$$\hat{R} = \frac{\displaystyle\sum_{i\in s} Y_i/\pi_i}{\displaystyle\sum_{i\in s} X_i/\pi_i}$$
is the customary design-based estimator of the population ratio $R = \frac{Y}{X}$. The ratio estimator of the population distribution function is given by
$$\hat{F}_r(t) = \frac{\displaystyle\sum_{i\in s}\Delta(t - Y_i)/\pi_i}{\displaystyle\sum_{i\in s}\Delta(t - \hat{R}X_i)/\pi_i}\cdot\frac{1}{N}\sum_{i=1}^{N}\Delta(t - \hat{R}X_i),$$
which reduces to $\hat{F}_N(t)$ when $Y_i$ is proportional to $X_i$ for all $i$; in that case the variance is zero. This suggests that $\hat{F}_r(t)$ could lead to considerable gains in efficiency over $\hat{F}_N(t)$ when $Y_i$ is approximately proportional to $X_i$. The difference estimator with the same desirable property is given by
$$\hat{F}_d(t) = \frac{1}{N}\left[\sum_{i\in s}\frac{\Delta(t - Y_i)}{\pi_i} + \left\{\sum_{i=1}^{N}\Delta(t - \hat{R}X_i) - \sum_{i\in s}\frac{\Delta(t - \hat{R}X_i)}{\pi_i}\right\}\right].$$
Using the data of Chambers and Dunstan (1986), the performance of the above two estimators was studied by Rao et al. (1990). They found that the difference estimator is less biased than the ratio estimator for smaller values of $F_N(t)$.
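A sketch of $\hat{F}_r(t)$ and $\hat{F}_d(t)$, continuing the previous distribution-function fragment; it additionally needs the auxiliary values of the sampled units and of the whole population (all hypothetical):

x_s = [6.0, 3.0, 11.0, 8.0]                  # auxiliary values of sampled units
X_all = [float(5 + j) for j in range(Npop)]  # X known for every population unit

R_hat = (sum(yi / pi for yi, pi in zip(y_s, pi_s))
         / sum(xi / pi for xi, pi in zip(x_s, pi_s)))

def F_r(t):
    # Ratio estimator (Rao, Kovar and Mantel, 1990); the denominator
    # can vanish for extreme t, where the estimator is undefined.
    num = sum(delta(t - yi) / pi for yi, pi in zip(y_s, pi_s))
    den = sum(delta(t - R_hat * xi) / pi for xi, pi in zip(x_s, pi_s))
    known = sum(delta(t - R_hat * xk) for xk in X_all) / Npop
    return (num / den) * known

def F_d(t):
    # Difference estimator: HT estimate plus a correction based on the
    # known population total of delta(t - R_hat * X)
    ht_y = sum(delta(t - yi) / pi for yi, pi in zip(y_s, pi_s))
    ht_x = sum(delta(t - R_hat * xi) / pi for xi, pi in zip(x_s, pi_s))
    tot_x = sum(delta(t - R_hat * xk) for xk in X_all)
    return (ht_y + tot_x - ht_x) / Npop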
They also found that $\hat{F}_d(t)$ is more precise than $\hat{F}_r(t)$ and $\hat{F}_N(t)$. The presence of $\hat{R}$ in $\hat{F}_r(t)$ and $\hat{F}_d(t)$ creates difficulties in evaluating (analytically) the exact bias and the mean square errors of these estimators. Invoking the results of Randles (1982), they obtained the approximate design variances of $\hat{F}_r(t)$ and $\hat{F}_d(t)$, which are given below:
$$V[\hat{F}_d(t)] = \frac{1}{N^2}\,V\big[\Delta(t - Y_i) - \Delta(t - RX_i)\big]$$
$$V[\hat{F}_r(t)] = \frac{1}{N^2}\,V\left[\Delta(t - Y_i) - \frac{F_Y(t)}{F_X(t/R)}\,\Delta(t - RX_i)\right]$$
where, for a generic variable $Y_i$,
$$V(Y_i) = \sum_{i=1}^{N}\sum_{\substack{j=1\\ i<j}}^{N}(\pi_i\pi_j - \pi_{ij})\left[\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right]^2.$$
The estimated variances of $\hat{F}_r(t)$ and $\hat{F}_d(t)$ are
$$v[\hat{F}_d(t)] = \frac{1}{N^2}\,v\big[\Delta(t - Y_i) - \Delta(t - \hat{R}X_i)\big]$$
$$v[\hat{F}_r(t)] = \frac{1}{N^2}\,v\left[\Delta(t - Y_i) - \frac{\hat{F}_Y(t)}{\hat{F}_X(t/\hat{R})}\,\Delta(t - \hat{R}X_i)\right]$$
where
$$v(Y_i) = \sum_{i\in s}\sum_{\substack{j\in s\\ i<j}}\frac{\pi_i\pi_j - \pi_{ij}}{\pi_{ij}}\left[\frac{Y_i}{\pi_i} - \frac{Y_j}{\pi_j}\right]^2$$
and $\hat{F}_X(t)$ and $\hat{F}_Y(t)$ are the customary estimates of $F_X(t)$ and $F_Y(t)$ respectively.
10.3 Randomised Response Method for Quantitative Data
In the last chapter, we saw several randomised response methods which are meant for estimating the proportion of units in a population possessing a sensitive character. In this section, a randomised response method meant for dealing with quantitative data, as developed by Eriksson (1973a, b), is presented. This problem arises when one is interested in estimating, for instance, earnings from illegal or clandestine activities, expenses towards gambling, consumption of alcohol and so on; these are some examples where people prefer not to reveal their exact status. Let $Y_1, Y_2, \ldots, Y_N$ be the unknown values of the $N$ units labelled $i = 1, 2, \ldots, N$ with respect to the sensitive study variable $y$. To estimate the population total $Y$, Eriksson (1973a, b) suggested the following procedure.
A sample of the desired size is drawn using the sampling design $P(s)$. Let $X_1, X_2, \ldots, X_M$ be predetermined real numbers supposed to cover the anticipated range of the unknown population values $Y_1, Y_2, \ldots, Y_N$. The quantities $q_j$, $j = 1, 2, \ldots, M$, are suitably chosen non-negative proper fractions and $C$ is a rightly chosen positive proper fraction such that
$$C + \sum_{j=1}^{M} q_j = 1.$$
Each respondent included in the sample is asked to conduct a random experiment independently $k(>1)$ times to produce random observations $Z_{ir}$, $r = 1, 2, \ldots, k$, where
$$Z_{ir} = \begin{cases} Y_i & \text{with probability } C\\ X_j & \text{with probability } q_j,\ j = 1, 2, \ldots, M.\end{cases}$$
A corresponding device is used independently for every sampled individual so that the values $Z_{ir}$, $r = 1, 2, \ldots, k$, for $i \in s$ are generated. For theoretical purposes, the random vectors $Z_r = (Z_{1r}, Z_{2r}, \ldots, Z_{Nr})$, $r = 1, 2, \ldots, k$, are supposed to be defined for every unit in the population. Let $Z = (Z_1, Z_2, \ldots, Z_k)$. Denote by $E_R$, $V_R$ and $C_R$ the operations of taking expectation, variance and covariance with respect to the randomisation technique employed to yield the $Z_{ir}$ values. Let
$$\bar{Z}_i = \frac{1}{k}\sum_{r=1}^{k} Z_{ir} \quad\text{and}\quad \mu_x = \frac{1}{1-C}\sum_{j=1}^{M} q_j X_j.$$
Note that
$$E_R[Z_{ir}] = CY_i + \sum_{j=1}^{M} q_j X_j = CY_i + (1-C)\mu_x. \qquad (10.3)$$
Hence $E_R[\bar{Z}_i] = CY_i + (1-C)\mu_x$, $i = 1, 2, \ldots, N$. Therefore an estimator of $Y_i$ is given by
$$\hat{Y}_i = \frac{\bar{Z}_i - (1-C)\mu_x}{C}. \qquad (10.4)$$
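A simulation of the device clarifies (10.3) and (10.4); every number below (C, the $q_j$, the $X_j$ and the true response) is hypothetical:

import random

random.seed(7)
C = 0.5
X = [0.0, 50.0, 100.0]   # predetermined values X_1, ..., X_M
q = [0.2, 0.2, 0.1]      # chosen so that C + sum(q) = 1
mu_x = sum(qj * xj for qj, xj in zip(q, X)) / (1 - C)

def respond(y_true, k=10):
    # One respondent performs the randomisation k times: report the
    # true value with probability C, otherwise X_j with probability q_j.
    zs = []
    for _ in range(k):
        if random.random() < C:
            zs.append(y_true)
        else:
            v, cum = random.random() * (1 - C), 0.0
            for qj, xj in zip(q, X):
                cum += qj
                if v < cum:
                    zs.append(xj)
                    break
    return zs

zs = respond(y_true=80.0)
z_bar = sum(zs) / len(zs)
y_hat = (z_bar - (1 - C) * mu_x) / C    # estimator (10.4) of Y_i

Averaged over many repetitions of the device, y_hat is centred at the true value 80, while any single reported Z_ir reveals nothing certain about the respondent.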
A general estimator for the population total, together with its variance, is given in the theorem furnished below.
Theorem 10.5 An unbiased estimator for the population total is given by
$$e(s, Z) = a_s + \sum_{i\in s} b_{si}\hat{Y}_i,$$
where $a_s$ and $b_{si}$ are free of $Y_1, Y_2, \ldots, Y_N$ and satisfy
$$\sum_{s} a_s P(s) = 0 \quad\text{and}\quad \sum_{s \ni i} b_{si} P(s) = 1, \quad i = 1, 2, \ldots, N.$$
The variance of $e(s, Z)$ is given by
$$V_P[e(s, Y)] + \frac{1-C}{kC^2}\sum_{i=1}^{N}\sigma_i^2\sum_{s \ni i} b_{si}^2 P(s).$$
Here $\sum_s$ denotes the sum over all samples and $\sigma_i^2$ is as defined in the proof below.
Proof Take $E_P$, $V_P$ and $C_P$ as the operators of expectation, variance and covariance with respect to the design. Assuming commutativity, we write $E_{PR} = E_P E_R = E_R E_P = E_{RP}$ and $V_{PR} = V_{RP}$ to indicate the operators of expectation and variance with respect to randomisation followed by sampling, or vice versa. Taking expectation of the estimator $e(s, Z)$ with respect to the randomisation, we get
$$E_R[e(s, Z)] = a_s + \sum_{i\in s} b_{si} E_R[\hat{Y}_i] = a_s + \sum_{i\in s} b_{si} Y_i.$$
Again taking expectation with respect to the sampling design, we note that $E_P E_R[e(s, Z)] = Y$. The variance of $e = e(s, Z)$ can be written as
$$V_{PR}(e) = V_P E_R[e] + E_P V_R[e]. \qquad (10.5)$$
Denoting $\sigma_x^2 = \frac{1}{1-C}\sum_{j=1}^{M} q_j (X_j - \mu_x)^2$ and $\sigma_i^2 = \sigma_x^2 + C(Y_i - \mu_x)^2$, we write
$$V_R[Z_{ir}] = (1-C)\left\{\sigma_x^2 + C(Y_i - \mu_x)^2\right\} = (1-C)\sigma_i^2, \quad i = 1, 2, \ldots, N;\ r = 1, 2, \ldots, k.$$
Therefore
$$V_R[e] = \frac{1}{kC^2}\sum_{i\in s} b_{si}^2 V_R(Z_{ir}) = \frac{1-C}{kC^2}\sum_{i\in s} b_{si}^2 \sigma_i^2. \qquad (10.6)$$
Hence the proof. •
Note The second term on the right-hand side of (10.5), namely $E_P V_R[e]$, shows how the variance increases (efficiency is lost) when one uses the randomised response method rather than a direct survey. Under designs yielding positive first-order inclusion probabilities for all units and positive second-order inclusion probabilities for all pairs of units, an unbiased estimator for the above variance can be found easily, in particular when $a_s = 0$, as shown below.
When $a_s = 0$, the variance of the estimator with respect to the sampling design can be written as
$$V_P[e(s, Y)] = \sum_{i=1}^{N} c_i Y_i^2 + \sum_{i=1}^{N}\sum_{\substack{j=1\\ j\ne i}}^{N} d_{ij} Y_i Y_j.$$
Denote by
$$v(s, Y) = \sum_{i\in s} f_{si} Y_i^2 + \sum_{\substack{i,j\in s\\ j\ne i}} g_{sij} Y_i Y_j$$
where the $f_{si}$'s and $g_{sij}$'s are quantities free of $Y$ satisfying $E_P[v(s, Y)] = V_P[e(s, Y)]$. Hence $v(s, Z)$ satisfies
$$E_{PR}[v(s, Z)] = E_P[E_R v(s, Z)] = E_P[v(s, Y)] = V_P[e(s, Y)].$$
Further, if
$$s_{zi}^2 = \frac{1}{k-1}\sum_{r=1}^{k}[Z_{ir} - \bar{Z}_i]^2,$$
then $E_R[s_{zi}^2] = V_R[Z_{ir}]$, $r = 1, 2, \ldots, k$. Hence
$$E_R\left[\frac{1}{kC^2}\sum_{i\in s} b_{si}^2 s_{zi}^2\right] = \frac{1}{kC^2}\sum_{i\in s} b_{si}^2 V_R(Z_{ir}) = V_R(e).$$
Taking expectation with respect to the sampling design, we have
$$E_{PR}\left[\frac{1}{kC^2}\sum_{i\in s} b_{si}^2 s_{zi}^2\right] = E_P[V_R(e)].$$
As a result of the above discussion, we have
$$E_{PR}\left[v(s, Z) + \frac{1}{kC^2}\sum_{i\in s} b_{si}^2 s_{zi}^2\right] = V_{PR}(e).$$
Therefore $v(s, Z) + \frac{1}{kC^2}\sum_{i\in s} b_{si}^2 s_{zi}^2$ is an unbiased estimator for $V_{PR}(e)$. For more details about randomised response methods, one can refer to the monograph by Chaudhuri and Mukerjee (1988).
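As a concrete instance, take SRSWOR of size $n$ with $a_s = 0$ and $b_{si} = N/n$ (the expansion estimator); continuing the device sketch above, the unbiased variance estimator then has the familiar design part plus the randomisation correction:

Nbig, n, k = 100, 4, 10
true_y = [60.0, 75.0, 90.0, 40.0]           # unknown in practice
device = [respond(yt, k) for yt in true_y]  # k responses per sampled person

y_hats, s2_z = [], []
for zs in device:
    zb = sum(zs) / k
    y_hats.append((zb - (1 - C) * mu_x) / C)                # Y^_i from (10.4)
    s2_z.append(sum((z - zb) ** 2 for z in zs) / (k - 1))   # s_zi^2

e_total = (Nbig / n) * sum(y_hats)          # e(s, Z) with b_si = N / n

# v(s, Z): usual SRSWOR variance estimator of the expansion estimator,
# evaluated at the Y^_i, plus the correction (1/(k C^2)) sum b^2 s_zi^2.
ybar = sum(y_hats) / n
s2 = sum((yh - ybar) ** 2 for yh in y_hats) / (n - 1)
v_design = Nbig * (Nbig - n) * s2 / n
v_total = v_design + sum((Nbig / n) ** 2 * s2i for s2i in s2_z) / (k * C ** 2)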
References
1. Bartholomew, D.J. (1961): A method of allowing for "not-at-homes" bias in sample surveys, App. Stat., 10, 52-59.
2. Chambers, R.L. and Dunstan, R. (1986): Estimating distribution functions from survey data, Biometrika, 73, 3, 597-604.
3. Cochran, W.G. (1946): Relative accuracy of systematic and stratified random samples for a certain class of populations, Ann. Math. Stat., 17, 164-177.
4. Dalenius, T. (1955): The problem of not-at-homes, Statistisk Tidskrift, 4, 208-211.
5. Deming, W.E. (1953): On a probability mechanism to obtain an economic balance between the resulting error of response and bias of non-response, J. Amer. Stat. Assoc., 48, 743-772.
6. Das, A.C. (1950): Two-dimensional systematic sampling and the associated stratified and random sampling, Sankhya, 10, 95-108.
7. El-Badry, M.A. (1956): A sampling procedure for mailed questionnaires, J. Amer. Stat. Assoc., 51, 209-227.
8. Eriksson, S. (1973a): Randomised interviews for sensitive questions, Ph.D. thesis, University of Gothenburg.
9. Eriksson, S. (1973b): A new model for RR, Internat. Statist. Rev., 41, 101-113.
10. Folsom, R.E., Greenberg, B.G., Horvitz, D.G. and Abernathy, J.R. (1973): The two alternate questions RR model for human surveys, J. Amer. Stat. Assoc., 68, 525-530.
11. Hansen, M.H. and Hurwitz, W.N. (1946): The problem of nonresponse in sample surveys, J. Amer. Stat. Assoc., 41, 517-529.
12. Hartley, H.O. and Rao, J.N.K. (1962): Sampling with unequal probabilities and without replacement, Ann. Math. Stat., 33, 350-374.
13. Hartley, H.O. and Ross, A. (1954): Unbiased ratio type estimators, Nature, 174, 270-271.
14. Horvitz, D.G. and Thompson, D.J. (1952): A generalisation of sampling without replacement from a finite universe, J. Amer. Stat. Assoc., 47, 663-685.
15. Kish, L. and Hess, I. (1959): A replacement procedure for reducing the bias of non-response, The American Statistician, 13, 4, 17-19.
16. Kuk, A.Y.C. (1988): Estimation of distribution functions and medians under sampling with unequal probabilities, Biometrika, 75, 1, 97-103.
17. Kunte, S. (1978): A note on circular systematic sampling design, Sankhya C, 40, 72-73.
18. Madow, W.G. (1953): On the theory of systematic sampling III, Ann. Math. Stat., 24, 101-106.
19. Midzuno, H. (1952): On the sampling design with probability proportional to sum of sizes, Ann. Inst. Stat. Math., 3, 99-107.
20. Murthy, M.N. (1957): Ordered and unordered estimates in sampling without replacement, Sankhya, 18, 379-390.
21. Murthy, M.N. (1964): Product method of estimation, Sankhya, 26A, 69-74.
22. Olkin, I. (1958): Multivariate ratio estimation for finite populations, Biometrika, 45, 154-165.
23. Politz, A.N. and Simmons, W.R. (1949, 1950): An attempt to get the "not at home" into the sample without callbacks, J. Amer. Stat. Assoc., 44, 9-31 and 45, 136-137.
24. Quenouille, M.H. (1949): Problems in plane sampling, Ann. Math. Stat., 20, 355-375.
25. Quenouille, M.H. (1956): Notes on bias in estimation, Biometrika, 43, 353-360.
26. Rao, J.N.K., Hartley, H.O. and Cochran, W.G. (1962): A simple procedure of unequal probability sampling without replacement, Jour. Roy. Stat. Soc., B24, 482-491.
27. Rao, J.N.K., Kovar, J.G. and Mantel, H.J. (1990): On estimating distribution functions and quantiles from survey data using auxiliary information, Biometrika, 77, 2, 365-375.
28. Royall, R.M. (1970): On finite population sampling theory under certain linear regression models, Biometrika, 57, 377-387.
29. Sethi, V.K. (1965): On optimum pairing of units, Sankhya B, 27, 315-320.
30. Singh, D., Jindal, K.K. and Garg, J.N. (1968): On modified systematic sampling, Biometrika, 55, 541-546.
31. Singh, D. and Singh, P. (1977): New systematic sampling, Jour. Stat. Plann. Inference, 1, 163-179.
32. Srinath, K.P. (1971): Multiphase sampling in nonresponse problems, J. Amer. Stat. Assoc., 66, 583-586.
33. Srivastava, S.K. (1967): An estimator using auxiliary information, Calcutta Statist. Assoc. Bull., 16, 121-132.
34. Thompson, S.K. (1990): Adaptive cluster sampling, J. Amer. Stat. Assoc., 85, 1050-1059.
35. Thompson, S.K. (1991a): Stratified adaptive cluster sampling, Biometrika, 78, 389-397.
36. Thompson, S.K. (1991b): Adaptive cluster sampling: designs with primary and secondary units, Biometrics, 47, 1103-1115.
37. Warner, S.L. (1965): Randomised response: a survey technique for eliminating evasive answer bias, J. Amer. Stat. Assoc., 60, 63-69.
38. Yates, F. (1948): Systematic sampling, Phil. Trans. Roy. Soc. London, A 241, 345-371.
Books
1. Chaudhuri, A. and Mukerjee, R. (1988): Randomised Response: Theory and Techniques, Marcel Dekker Inc.
2. Cochran, W.G. (1977): Sampling Techniques, Wiley Eastern Limited.
Ref"erences
181
3. Des Raj and Chandhok, P. (1998): Sampling Theory, Narosa Publishing House, New Delhi.
4. Hajek, J. (1981): Sampling from a Finite Population, Marcel Dekker Inc.
5. Konijn, H.S. (1973): Statistical Theory of Sample Survey Design and Analysis, North-Holland Publishing Company.
6. Murthy, M.N. (1967): Sampling Theory and Methods, Statistical Publishing Society, Calcutta.
7. Sukhatme, P.V., Sukhatme, B.V., Sukhatme, S. and Asok, C. (1984): Sampling Theory of Surveys with Applications, Iowa State University Press and Indian Society of Agricultural Statistics, New Delhi.
Index

adaptive sampling, 165-171; almost unbiased ratio type estimator, 104, 105; autocorrelated populations, 39, 87; auxiliary information, 97-121; balanced systematic sampling, 35-37; Bartholomew, 154; Bellhouse, 47; bias, 2; bound for bias, 105; centered systematic sampling, 34; Chambers, 171, 173; Chaudhuri, 161; circular systematic sampling, 43, 44; cluster sampling, 140; Cochran, 63, 88; cost optimum allocation, 82; cumulative total method, 55; Dalenius, 154; Das, 47; Deming's model, 154; Desraj ordered estimator, 60; difference estimator, 124-126; distribution function, 171; Dunstan, 171, 173; edge unit, 167; El-Badry, 152; entropy, 3; Eriksson, 174; finite population, 1; Folsom's model, 160; Garg, 38; Gauss-Markov, 132; Hansen and Hurwitz, 152; Hartley, 63, 70, 102, 106; Hess, 154; Horvitz-Thompson, 3, 6, 8, 63; implied estimator, 129; inclusion indicators, 4; inclusion probabilities, 4, 5; incomplete surveys, 152; Jindal, 38; Kish, 154; Kovar, 171, 173; Kuk, 172; Kunte, 44; Lagrangian multipliers, 81, 93; Lahiri, 43, 56; linear systematic sampling, 29-32; Madow, 34; Mantel, 171, 173; mean squared error, 1, 3; Midzuno, 67-70; model unbiasedness, 131; modified Hansen-Hurwitz estimator, 168; modified Horvitz-Thompson estimator, 170; modified systematic sampling, 38, 39; multi-auxiliary information, 113; multistage sampling, 140-150; Murthy's unordered estimator, 62; neighbourhood, 167; network, 167; Neyman optimum allocation, 81; non-sampling errors, 152-164; observational errors, 161; Olkin, 113; parameter, 1, 3; Politz-Simmons technique, 156; population size, 1; pps systematic scheme, 70; ppswor, 60; ppswr, 55; probability sampling, 1; product estimation, 106-108; proportional allocation, 79; Quenouille, 47; random group method, 63; randomised response, 158-161, 174; Rao, 4, 63, 70, 102, 171, 173; ratio estimator, 97-105; regression estimation, 122-124; Ross, 106; Royall, 137; sample, 1; sample size allocation, 79-85; sampling design, 2, 3, 4, 5; sampling in two dimensions, 44, 45; Sarndal, 16; Sethi, 35; Simmons, 159; simple random sampling, 10-28; Singh, 38; Srinath, 154; srswr, 25; statistic, 2; stratified sampling, 76-96, 115; super-population model, 129; systematic sampling, 29-54; Thompson, 165, 171; two phase sampling, 108-112; two stage sampling, 140-150; unbiased ratio type estimators, 100; unbiasedness, 2; unequal probability sampling, 55; Warner, 158; Yates, 33, 73; Yates-Grundy, 7