8(&*) < 35 for all s ^ n* = max (w2, n3, w4). (6.13) By the fact that &(A*) is UMP at level ocs and ^ SJ8(A*) = aa, < 0. (6.14) one has ^[oJs(Af)\RSfhs]-£$s{A£)\BS}hs] From now on, one simply repeats the arguments employed after relationship (5.40) in order to conclude that SQ (os(Af) — $ $ (fis(Af) < 4:8 for all s ^ n*,
say.
However, this contradicts (6.6) and completes the proof of the theorem. ∎

To Theorem 6.1, there is the following corollary.

COROLLARY 6.1 Under assumptions (A1-4), the sequence of tests $\{\phi_n(\Delta_n)\}$ defined by (3.2) and (3.3) is asymptotically locally most powerful for testing $H_1\colon \theta \in \bar\omega$ against $A\colon \theta \in \omega$ within the class of all sequences of tests $\{\lambda_n\}$ satisfying (6.1) when the parameter values are restricted to those $\theta$s in $\omega$ for which $\{n^{1/2}(\theta - \theta_0)\}$ remains bounded from above. That is, within the class of sequences of tests just described, $\{\phi_n(\Delta_n)\}$ satisfies (3.1) when the parameter values are restricted to those $\theta$s in $\omega$ for which $\{n^{1/2}(\theta - \theta_0)\}$ remains bounded from above.

For the symmetric case, we have the following results, which are established in a way entirely analogous to the one employed for proving Theorem 6.1. Namely,

THEOREM 6.2 Under assumptions (A1-5), (A5'), the sequence of tests $\{\phi'_n(\Delta_n)\}$ defined by (3.20) and (3.21) is AUMP for testing $H_1'\colon \theta \in \bar\omega' = \{\theta \in \Theta;\ \theta \ge \theta_0\} = \omega \cup \{\theta_0\}$ against $A'\colon \theta \in \omega'$ within the class of all sequences of tests $\{\lambda_n\}$ such that
$$\limsup\,[\sup(\mathcal{E}_\theta \lambda_n;\ \theta \in \bar\omega')] \le \alpha. \qquad (6.15)$$
One-sided against one-sided alternatives
127
COROLLARY 6.2 Under assumptions (A1-4), the sequence of tests $\{\phi'_n(\Delta_n)\}$ defined by (3.20) and (3.21) is asymptotically locally most powerful for testing $H_1'\colon \theta \in \bar\omega'$ against $A'\colon \theta \in \omega'$ within the class of all sequences of tests $\{\lambda_n\}$ satisfying (6.15) when the parameter values are restricted to those $\theta$s in $\omega'$ for which $\{n^{1/2}(\theta - \theta_0)\}$ remains bounded from below. That is, within the class of sequences of tests just described, $\{\phi'_n(\Delta_n)\}$ satisfies (3.1) when the parameter values are restricted to those $\theta$s in $\omega'$ for which $\{n^{1/2}(\theta - \theta_0)\}$ remains bounded from below.

REMARK 6.1 For the explicit form of the tests in the case of the concrete examples that we have been dealing with, the reader is referred to Section 4.
Exercises 1. Verify condition (A5)' in connection with Example 1.3. 2. Verify the equivalence asserted in relation (4.4). 3. In connection with Example 4.4, show that an MLE of 0, is a root of the equation 0
= 0,
where
a0 = - - £ -2^-1-* n
j=I
«! = - £ (Xf.! + ZJ) - 1 and a2 = a0. n
j=l
5. Some statistical applications: asymptotic efficiency of estimates

Summary

In this chapter, we apply some of the basic results obtained in Chapter 2 to the problem of the asymptotic efficiency of a sequence of estimates of the unknown parameter $\theta \in \Theta$. The asymptotic efficiency adopted in this chapter is the one proposed by Wolfowitz [1] (see Definition 1.1). The main results obtained herein establish a certain upper bound for the limiting probability of concentration of the various estimates under consideration. All these results (Theorems 4.1, 5.1 and 5.2), except for one (Theorem 4.2), are established for the case that $\Theta$ is an open subset of $R$. In Sections 6 and 7, we consider the asymptotic efficiency from the classical point of view and show that standard results for both the one-dimensional and the multidimensional case (see Theorems 6.1 and 7.1) are obtained either as special cases of, or are closely related to, the main results mentioned above.
1 W-efficiency - preliminaries

The classical approach of proving asymptotic efficiency (a.eff.) of estimates has been geared towards showing that an MLE is a.eff. More specifically, under suitable regularity conditions, an MLE, properly normalized, is asymptotically normal with mean zero and variance the inverse of Fisher's information number. One then considers the class of all estimates which, properly normalized, are asymptotically normal with mean zero, and calls the one with the smallest variance (of the limiting normal distribution) an a.eff. estimate, if such an estimate exists. Again, under appropriate conditions, an MLE is shown to be
a.eff. except possibly on a subset of the parameter space whose Lebesgue measure is zero (see, e.g. LeCam [1], [3] and Bahadur [1]). However, an example has been constructed by Hodges (see LeCam [1]), where on this exceptional set the variance of the limiting normal distribution of an estimate is actually smaller than the corresponding variance of an MLE. Such estimates are known as superefficient estimates.

Several legitimate objections have been raised against the classical approach just outlined (see, e.g. Wolfowitz [1]). The existence of superefficient estimates is disturbing enough; but what is entirely unnecessary is to confine ourselves only to those estimates which, properly normalized, are asymptotically normal. Of course, this is necessary if a.eff. is to be judged on the basis of the variance. However, there is no reason that this should be the criterion of a.eff. Arguing along these lines, Wolfowitz [1] was led to introducing an alternative, more general and quite reasonable criterion of a.eff. For the formulation of this criterion, some additional notation is required.

As usual, $X_0, X_1, \ldots, X_n$ are $n+1$ observations from the Markov process under consideration and $\Theta$ is an open subset of $R$. Let $\mathcal{C}^*$ be a class (to be specified later on) of sequences of estimates (of $\theta$) $\{T_n\} = \{T_n(X_0, X_1, \ldots, X_n)\}$, and let $\{V_n\} = \{V_n(X_0, X_1, \ldots, X_n)\}$ be a sequence of estimates not necessarily belonging in $\mathcal{C}^*$. Also for each $n$ and $\theta$, consider two classes of intervals in $R$ (to be specified later on) of the following form.
$$I_n(\theta) = \{\text{certain intervals } i_n(\theta; t_1, t_2);\ t_1, t_2 > 0\}, \quad \theta \in \Theta. \qquad (1.1)$$
$$J_n(\theta; T) = \{\text{certain intervals } j_n(\theta; t_1, t_2, T);\ t_1, t_2 > 0\}, \quad \theta \in \Theta. \qquad (1.2)$$
The presence of $T$ in the definition of $J_n(\theta; T)$ indicates that the $j$-intervals also depend on a quantity associated with the sequence $\{T_n\}$.

DEFINITION 1.1 Let $\{T_n\}$, $\{V_n\}$ and $\mathcal{C}^*$ be as above and for each $n$ and each $\theta$, let $I_n(\theta)$ and $J_n(\theta; T)$ be defined by (1.1) and (1.2), respectively. Suppose that $\lim P_\theta[V_n \in i_n(\theta; t_1, t_2)]$ exists for
all $t_1, t_2 > 0$ and every $\theta \in \Theta$. Then the sequence of estimates $\{V_n\}$ is said to be asymptotically efficient in Wolfowitz sense (W-efficient, for short) with respect to the class $\mathcal{C}^*$ if the inequality
$$\limsup P_\theta[T_n \in j_n(\theta; t_1, t_2, T)] \le \lim P_\theta[V_n \in i_n(\theta; t_1, t_2)] \qquad (1.3)$$
holds for all $\{T_n\}$ in $\mathcal{C}^*$, every $t_1, t_2 > 0$ and all (or almost all with respect to Lebesgue measure $l$ in $R$, a.a. $[l]$) $\theta$ in $\Theta$.

The optimality of $\{V_n\}$ consists in that for every $t_1, t_2 > 0$ and all (a.a. $[l]$) $\theta \in \Theta$, asymptotically, $V_n$ is concentrated in $i_n(\theta; t_1, t_2)$ at least as much as any $T_n$, such that $\{T_n\} \in \mathcal{C}^*$, is concentrated in $j_n(\theta; t_1, t_2, T)$.

Actually, Wolfowitz' programme was to show W-efficiency of the MLE in the i.i.d. case. More specifically, what he does is this. He takes $\Theta$ to be a finite interval with or without its endpoints and defines the class $\mathcal{C}^*$, to be denoted here by $\mathcal{C}_W$, as follows
$$\mathcal{C}_W = \{\{T_n\};\ P_\theta[n^{1/2}(T_n - \theta) \le x] \to F_T(x; \theta), \text{ a d.f., uniformly in } R \times \Theta\}.$$
Of course, the presence of $T$ in the limiting d.f. simply indicates that this limiting d.f. depends on $\{T_n\}$. For each $\theta \in \Theta$, let $l_T(\theta)$, $u_T(\theta)$ be the 'smallest' and 'largest' median of $F_T(\cdot; \theta)$ and define $I_n(\theta)$, $J_n(\theta; T)$ as follows:
$$I_n(\theta) = \{i_n(\theta; t_1, t_2);\ i_n(\theta; t_1, t_2) = [\theta - t_1 n^{-1/2},\ \theta + t_2 n^{-1/2}],\ t_1, t_2 > 0\},$$
$$J_n(\theta; T) = \{j_n(\theta; t_1, t_2, T);\ j_n(\theta; t_1, t_2, T) = [\theta + (w_T(\theta) - t_1) n^{-1/2},\ \theta + (W_T(\theta) + t_2) n^{-1/2}],\ t_1, t_2 > 0\},$$
where $w_T$ and $W_T$ are certain specified functions defined on $\Theta$ and such that $l_T(\theta) \le w_T(\theta) \le W_T(\theta) \le u_T(\theta)$, $\theta \in \Theta$.

With the above notation and under suitable regularity conditions, Wolfowitz shows that (1.3) holds true (with $\limsup$ replaced by $\lim$) for all but countably many $\theta$s when $V_n$ is equal to the MLE. Prominent among his assumptions is that
$$P_\theta[n^{1/2}(V_n - \theta) \le x] \to \Phi(x; \theta) \quad \text{uniformly in } R \times \Theta,$$
where $\Phi(\cdot; \theta)$ is the d.f. of $N(0, 1/I(\theta))$, $I(\theta)$ being Fisher's information number.
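The limiting probabilities of concentration compared in (1.3) can be made concrete in a simple case. The following is a minimal sketch, assuming i.i.d. observations from $N(\theta, 1)$ (an illustrative model that is not taken from the text), in which the sample mean is the MLE with limiting variance 1 and the sample median has limiting variance $\pi/2$. The function names are hypothetical.

```python
import math


def std_normal_cdf(x: float) -> float:
    """Distribution function of N(0, 1), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def limiting_concentration(t1: float, t2: float, asymptotic_variance: float) -> float:
    """Limit of P[theta - t1/sqrt(n) <= T_n <= theta + t2/sqrt(n)] when
    sqrt(n) * (T_n - theta) is asymptotically N(0, asymptotic_variance)."""
    s = math.sqrt(asymptotic_variance)
    return std_normal_cdf(t2 / s) - std_normal_cdf(-t1 / s)


# Assumed model: i.i.d. N(theta, 1), so Fisher's information number is 1.
t1, t2 = 1.0, 2.0
mean_limit = limiting_concentration(t1, t2, 1.0)            # sample mean (the MLE)
median_limit = limiting_concentration(t1, t2, math.pi / 2)  # sample median
print(f"mean: {mean_limit:.4f}, median: {median_limit:.4f}")
```

In this sketch the sample mean is concentrated in $i_n(\theta; t_1, t_2)$ at least as much as the sample median for every choice of $t_1, t_2 > 0$, which is the kind of comparison that (1.3) formalizes.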
Kaufman [1] has generalized Wolfowitz' result for the case that $\Theta$ is $k$-dimensional.

Now the problem of establishing W-efficiency can be split into two parts. First one concentrates on determining an upper bound, $B(\theta; t_1, t_2)$, say, such that
$$\limsup P_\theta[T_n \in j_n(\theta; t_1, t_2, T)] \le B(\theta; t_1, t_2) \text{ for all } \{T_n\} \in \mathcal{C}^*, \text{ every } t_1, t_2 > 0 \text{ and all (a.a. } [l]) \ \theta \in \Theta, \qquad (1.4)$$
and then one attempts to find a sequence of estimates $\{V_n\}$ for which the upper bound is attained, in the sense that
$$P_\theta[V_n \in i_n(\theta; t_1, t_2)] \to B(\theta; t_1, t_2) \text{ for all } t_1, t_2 > 0 \text{ and every } \theta \in \Theta, \qquad (1.5)$$
where $i_n(\theta; t_1, t_2) \in I_n(\theta)$ for a certain specification of $I_n(\theta)$. In other words, one would use inequality (1.4) in much the same way that the Cramér-Rao inequality is employed (only in reversed order). However, the optimality of $\{V_n\}$ is not defined by (1.5) but rather by (1.3), the reason being that (1.3) may be fulfilled for some $\{V_n\}$ while (1.5) may fail to occur.

Schmetterer [1], Roussas [5] and Pfanzagl [1] have modified Wolfowitz' assumptions and have established inequality (1.3) for several specifications of the class $\mathcal{C}^*$. The first two of these authors also established their results in a Markov process framework.

Our objective here is to show that assumptions (A1-4) are sufficient for establishing inequality (1.4) for certain specifications of the classes $\mathcal{C}^*$ and $J_n(\theta; T)$. Accordingly, we shall not concern ourselves with the problem of proving (1.5) for some specific $\{V_n\}$.

2 Some lemmas

The lemmas below are needed in the proof of the results in the remainder of this chapter and occasionally elsewhere, too.

DEFINITION 2.1 For each $n$, let $f_n$ and $f$ be functions defined on $E \subset R^k$ into $R$. We say that $\{f_n\}$ converges continuously to $f$ in $E$ if for each $x \in E$, $f_n(x_n) \to f(x)$ whenever $x_n \to x$.

The following result then holds true.

LEMMA 2.1 If $\{f_n\}$ converges continuously to $f$ in $E$, then $f$ is continuous on $E$.
Proof In order to establish continuity at an arbitrary $x \in E$, it suffices to show that $f(x_n) \to f(x)$ whenever $x_n \to x$. For contradiction, suppose that $f$ is not continuous at $x$. Then there exists a sequence $\{x_n\}$ with $x_n \to x$ such that
$$|f(x_n) - f(x)| > \varepsilon \text{ for some } \varepsilon > 0. \qquad (2.1)$$
Now from the assumption of continuous convergence it follows that $f_n(x) \to f(x)$ for every $x \in E$. (Just take $\{x_n\} = \{x\}$.) Then for each fixed $n$, there exists an integer $m_n$ such that
$$|f_{m_n}(x_n) - f(x_n)| < \tfrac{1}{2}\varepsilon. \qquad (2.2)$$
Clearly, the $m_n$s can be chosen so that $m_n < m_{n'}$ if $n < n'$. Thus $\{m_n\} \subseteq \{n\}$. Setting $y_{m_n} = x_n$, we have that $y_{m_n} \to x$ and relations (2.1) and (2.2) imply
$$|f(y_{m_n}) - f(x)| > \varepsilon, \qquad |f_{m_n}(y_{m_n}) - f(y_{m_n})| < \tfrac{1}{2}\varepsilon. \qquad (2.3)$$
From (2.3), it follows that $|f_{m_n}(y_{m_n}) - f(x)| > \tfrac{1}{2}\varepsilon$, which contradicts the assumption of the lemma. The proof is completed. ∎

The concepts of continuous convergence and uniform convergence are related as in the following lemma.

LEMMA 2.2 (i) If $\{f_n\}$ converges to $f$ uniformly on $E$, then the convergence is continuous, provided $f$ is continuous on $E$.
(ii) If $E$ is compact, then continuous convergence of $\{f_n\}$ to $f$ implies uniform convergence.
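The difference between pointwise and continuous convergence in Definition 2.1, and its link with uniform convergence in Lemma 2.2, can be seen numerically. The sketch below uses two hypothetical sequences on the compact set $E = [0, 1]$ that are not taken from the text: $f_n(x) = nx e^{-nx}$, which tends to 0 pointwise but not continuously, and $g_n(x) = x + 1/n$, which tends to $x$ uniformly. The supremum is only approximated on a grid.

```python
import math


def f_n(n: int, x: float) -> float:
    """Hypothetical example: f_n(x) = n x exp(-n x); the pointwise limit on [0, 1] is 0."""
    return n * x * math.exp(-n * x)


def g_n(n: int, x: float) -> float:
    """Hypothetical example: g_n(x) = x + 1/n, which tends to x uniformly."""
    return x + 1.0 / n


def sup_distance(fn, limit, n: int, grid_size: int = 10001) -> float:
    """Approximate the sup over [0, 1] of |fn(n, x) - limit(x)| on an even grid."""
    points = (i / (grid_size - 1) for i in range(grid_size))
    return max(abs(fn(n, x) - limit(x)) for x in points)


for n in (10, 100, 1000):
    along_sequence = f_n(n, 1.0 / n)  # x_n = 1/n -> 0, yet f_n(x_n) stays at exp(-1)
    sup_f = sup_distance(f_n, lambda x: 0.0, n)
    sup_g = sup_distance(g_n, lambda x: x, n)
    print(n, round(along_sequence, 4), round(sup_f, 4), round(sup_g, 4))
```

Along $x_n = 1/n \to 0$ the values $f_n(x_n)$ do not approach the limit value 0, so the convergence of $\{f_n\}$ is not continuous, and in agreement with Lemma 2.2(ii) the approximate sup distance does not shrink. For $\{g_n\}$ the sup distance is $1/n$, and the convergence is both uniform and continuous.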
Proof (i) Let $x, x_n \in E$ for all $n$ be such that $x_n \to x$. Then we have
$$|f_n(x_n) - f(x)| \le |f_n(x_n) - f(x_n)| + |f(x_n) - f(x)| \to 0$$
by continuity of $f$ and uniform convergence of $\{f_n\}$ to $f$.

(ii) In the first place, $f$ is continuous on $E$ by Lemma 2.1. For contradiction, assume that $\{f_n\}$ does not converge uniformly to $f$. Then there exists a subsequence $\{m\} \subseteq \{n\}$ and a sequence $\{x_m\}$ with $x_m \in E$ for all $m$, such that
$$|f_m(x_m) - f(x_m)| > \varepsilon \text{ for some } \varepsilon > 0. \qquad (2.4)$$
By the compactness of $E$, there exists a subsequence $\{x_r\} \subseteq \{x_m\}$ such that $x_r \to x \in E$. Replacing $m$ by $r$ in (2.4), we have
$$\varepsilon < |f_r(x_r) - f(x_r)| \le |f_r(x_r) - f(x)| + |f(x) - f(x_r)|. \qquad (2.5)$$
But $|f(x) - f(x_r)| < \tfrac{1}{2}\varepsilon$ for all sufficiently large $r$, by the continuity of $f$. Hence from (2.5), one has $|f_r(x_r) - f(x)| \ge \tfrac{1}{2}\varepsilon$ for all sufficiently large $r$. However, this contradicts the assumption of continuous convergence. The proof is completed. ∎

The lemma stated below generalizes the dominated convergence theorem and its proof can be found in Loève [1], pp. 162-3 and also Pratt [1].

LEMMA 2.3 For each $n$, let $h_n$, $g_n$, $G_n$ and $h$, $g$, $G$ be $\mathcal{B}^k$-measurable, real-valued functions defined on $R^k$ such that
(i) $h_n(x) \to h(x)$, $g_n(x) \to g(x)$, $G_n(x) \to G(x)$,
(ii) $g_n(x) \le h_n(x) \le G_n(x)$ for all $n$ and $x \in R^k$, and
(iii) $\int g_n \, dl^k \to \int g \, dl^k$, $\int G_n \, dl^k \to \int G \, dl^k$ with $\int g \, dl^k$ and $\int G \, dl^k$ finite.
Then $\int h_n \, dl^k \to \int h \, dl^k$ and $\int h \, dl^k$ is finite, where $l^k$ is the Lebesgue measure on $\mathcal{B}^k$.

A slight variation of the following result is proved in Bahadur.

LEMMA 2.4 Let $\{f_n\}$ be a sequence of measurable functions defined on an open subset $E$ of $R^k$ into $R$ and let $f_n(x) \to 0$, $x \in E$, and $|f_n(x)| \le M \ (< \infty)$ for all $n$ and $x \in E$. Then for any sequence $\{x_n\}$ with $x_n \to 0$ there exists a subsequence $\{m\} \subseteq \{n\}$ and an $l^k$-null subset of $E$ (both of which may depend on $\{x_n\}$) such that for all $x \in E$ outside this null set, one has $f_m(x + x_m) \to 0$ pointwise.

Proof For any $x_n \to 0$, consider the integral
$$\int |f_n(x + x_n)| \, d\Phi^{(k)}(x),$$
where $\Phi^{(k)}$ is the d.f. of $N(0, I)$ ($I$ is the $k \times k$ unit matrix), and perform the transformation $x + x_n = y$. We have then
$$\int |f_n(x + x_n)| \, d\Phi^{(k)}(x) = \int |f_n(x + x_n)| (2\pi)^{-k/2} \exp(-\tfrac{1}{2} x'x) \, dl^k(x) = \int |f_n(y)| (2\pi)^{-k/2} \exp[-\tfrac{1}{2}(y - x_n)'(y - x_n)] \, dl^k(y).$$
On the basis of our assumptions, the conditions of Lemma 2.3 are fulfilled with
$$h_n(y) = |f_n(y)| (2\pi)^{-k/2} \exp[-\tfrac{1}{2}(y - x_n)'(y - x_n)], \quad g_n(y) = 0, \quad G_n(y) = (2\pi)^{-k/2} M \exp[-\tfrac{1}{2}(y - x_n)'(y - x_n)]$$
and
$$h(y) = g(y) = 0, \quad G(y) = (2\pi)^{-k/2} M \exp(-\tfrac{1}{2} y'y).$$
Therefore
$$\int |f_n(y)| (2\pi)^{-k/2} \exp[-\tfrac{1}{2}(y - x_n)'(y - x_n)] \, dl^k(y) \to 0,$$
so that
$$\int |f_n(x + x_n)| \, d\Phi^{(k)}(x) \to 0.$$
Therefore the Markov inequality implies that $f_n(x + x_n) \to 0$ in $\Phi^{(k)}$-measure. Hence there exists a subsequence $\{m\} \subseteq \{n\}$ and a $\Phi^{(k)}$-null set (both of which may depend on $\{x_n\}$) such that for all $x \in E$ outside this null set, one has $f_m(x + x_m) \to 0$ pointwise. Since $\Phi^{(k)} \approx l^k$ (the two measures are mutually absolutely continuous), one has finally that for all $x \in E$ outside the before-mentioned $l^k$-null set, $f_m(x + x_m) \to 0$ pointwise, as was to be seen. ∎

The lemma to be presented below follows from Lemma 2.4 above (for $k = 1$ and $E = R$) and is discussed in Pfanzagl [1].
LEMMA 2.5 Let $\{f_n\}$ be a sequence of $\mathcal{B}$-measurable, real-valued functions defined on $R$ and being such that
$$\liminf f_n(x) \ge 0, \quad x \in R,$$
and
$$|f_n(x)| \le M \ (< \infty)$$
for all $n$ and $x \in R$. Then for any sequence $\{x_n\}$ with $x_n \to 0$ there exists an $l$-null set (which may depend on $\{x_n\}$) outside of which one has
$$\limsup f_n(x + x_n) \ge 0.$$

Proof Let $f_n^* = \inf(f_n, 0)$. Then the sequence $\{f_n^*\}$ satisfies the requirement of Lemma 2.4. Therefore for $x_n \to 0$, there exists a subsequence $\{m\} \subseteq \{n\}$ and an $l$-null set (both of which may depend on $\{x_n\}$) outside of which one has $f_m^*(x + x_m) \to 0$ pointwise. From the definition of $f_n^*$, one has
$$f_n(x + x_n) \ge f_n^*(x + x_n) \text{ for all } n \text{ and } x \in R.$$
Therefore
$$\limsup f_n(x + x_n) \ge \lim f_m^*(x + x_m) = 0,$$
as was to be shown. ∎

In closing this section, we would like to point out that Lemmas 2.1, 2.2 and 2.4 are valid in a more general set-up than the one considered here (see Schmetterer [1], Section 1), but their present formulation is sufficient for our purposes.
3 A representation theorem

In the present section, we consider a class of sequences of estimates (of $\theta$) defined by (3.1) below such that when each estimate of a given sequence in the class under consideration is properly normalized, the resulting sequence of their distributions converges (weakly) to a probability measure $\mathcal{L}(\theta)$, say. Then it is shown (see Theorem 3.1) that $\mathcal{L}(\theta)$ assumes a certain representation as the convolution of two specified probability measures. This result is employed in the next section in connection with the asymptotic efficiency of estimates and, of course, is of considerable interest in its own right.
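The effect of such a convolution representation on concentration can be checked numerically in a toy case. The following is a minimal sketch, assuming $k = 1$, a normal factor $N(0, 1)$, and a hypothetical second factor that places mass 1/2 at each of the points $\pm c$; neither the second factor nor the function names come from the text. Both the convolution and the normal factor have median 0, so the intervals compared are matched at the median.

```python
import math


def Phi(x: float) -> float:
    """Distribution function of N(0, 1)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_mass(lo: float, hi: float) -> float:
    """P(lo < Z <= hi) for Z ~ N(0, 1)."""
    return Phi(hi) - Phi(lo)


def convolution_mass(lo: float, hi: float, c: float) -> float:
    """P(lo < Z + W <= hi) with Z ~ N(0, 1) independent of W,
    where W = +c or -c with probability 1/2 each (hypothetical second factor)."""
    return 0.5 * (normal_mass(lo - c, hi - c) + normal_mass(lo + c, hi + c))


t1, t2 = 1.0, 2.0
for c in (0.0, 0.5, 1.0, 2.0):
    print(c, round(convolution_mass(-t1, t2, c), 4), round(normal_mass(-t1, t2), 4))
```

For every $c$ tried, the convolution puts no more mass on $(-t_1, t_2]$ than the normal factor alone does, with equality at $c = 0$. This is the one-dimensional shape of the upper bounds derived from the representation theorem in the next section.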
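For an arbitrary $\theta \in \Theta$ and any $h \in R^k$, set $\theta_n = \theta + h n^{-1/2}$, so that $\theta_n \in \Theta$ for sufficiently large $n$. Define the class $\mathcal{C}_H$ of sequences of estimates of $\theta$, $\{T_n\}$, as follows
$$\mathcal{C}_H = \{\{T_n\};\ \mathcal{L}[n^{1/2}(T_n - \theta_n) \mid P_{\theta_n}] \Rightarrow \mathcal{L}(\theta), \text{ a probability measure}\}, \qquad (3.1)$$
where $\mathcal{L}(\theta)$, in general, depends on $\{T_n\}$.

The probability measure $\mathcal{L}(\theta)$ can be represented as the convolution of two probability measures. This result is due to Hájek [2]. However, the proof to be presented below is based on an idea of Bickel [1]; see also Roussas and Soms [9]. Finally, an alternative proof can be obtained from some general results of LeCam [7]. The precise formulation of the theorem is as follows.

THEOREM 3.1 Consider the class $\mathcal{C}_H$ defined by (3.1). Then one has that
$$\mathcal{L}(\theta) = \mathcal{L}_1(\theta) * \mathcal{L}_2(\theta),$$
where $\mathcal{L}_1(\theta) = N(0, \Gamma^{-1}(\theta))$ and $\mathcal{L}_2(\theta)$ is defined by (3.15) below.

Proof Since $\theta$ is kept fixed throughout the proof, we shall omit it from our notation for simplicity. Thus we will write $\mathcal{L}$, $\mathcal{L}_1$, $\mathcal{L}_2$, $\Gamma$, etc. instead of $\mathcal{L}(\theta)$, $\mathcal{L}_1(\theta)$, $\mathcal{L}_2(\theta)$, $\Gamma(\theta)$, etc. Now let $\Delta_n^* = \Delta_n^*(\theta)$ be the truncated version of $\Delta_n$ defined by (1.5) in Chapter 3 and consider the sequence
$$\mathcal{L}_n^* = \mathcal{L}_n^*(\theta) = \mathcal{L}[(n^{1/2}(T_n - \theta), \Delta_n^*) \mid P_\theta].$$
Then by the weak compactness theorem there exists $\{m\} \subseteq \{n\}$ such that $\mathcal{L}_m^* \Rightarrow \mathcal{L}^*$, a measure. Then the marginal measures $\mathcal{L}[m^{1/2}(T_m - \theta) \mid P_\theta]$ and $\mathcal{L}(\Delta_m^* \mid P_\theta)$ of $\mathcal{L}[(m^{1/2}(T_m - \theta), \Delta_m^*) \mid P_\theta]$ converge (weakly) to the corresponding marginal measures of $\mathcal{L}^*$. These latter marginal measures are then probability measures since both $\mathcal{L}[m^{1/2}(T_m - \theta) \mid P_\theta]$ and $\mathcal{L}(\Delta_m^* \mid P_\theta)$ converge to probability measures. It follows that $\mathcal{L}^*$ itself is a probability measure. Setting $(T, \Delta)$ for the identity mapping on $R^k \times R^k$, we have then that $\mathcal{L}[(T, \Delta) \mid \mathcal{L}^*] = \mathcal{L}^*$, so that
$$\mathcal{L}[(m^{1/2}(T_m - \theta), \Delta_m^*) \mid P_\theta] \Rightarrow \mathcal{L}[(T, \Delta) \mid \mathcal{L}^*]. \qquad (3.2)$$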
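It is shown in Lemma 3.1 below that
$$\mathcal{E}_\theta \exp[iu'm^{1/2}(T_m - \theta) + \Lambda_m] - \mathcal{E}_\theta \exp[iu'm^{1/2}(T_m - \theta) + h'\Delta_m^* - \tfrac{1}{2}h'\Gamma h] \to 0 \qquad (3.3)$$
and
$$\mathcal{E}_\theta \exp[iu'm^{1/2}(T_m - \theta) + h'\Delta_m^* - \tfrac{1}{2}h'\Gamma h] \to \mathcal{E}_{\mathcal{L}^*} \exp(iu'T + h'\Delta - \tfrac{1}{2}h'\Gamma h). \qquad (3.4)$$
Therefore by setting
$$\phi(u, h) = \mathcal{E}_{\mathcal{L}^*} \exp(iu'T + ih'\Delta) \qquad (3.5)$$
and replacing $h$ by zero, we obtain by means of (3.4) that
$$\mathcal{E}_\theta \exp[iu'm^{1/2}(T_m - \theta)] \to \mathcal{E}_{\mathcal{L}^*} \exp iu'T = \mathcal{E}_{\mathcal{L}} \exp iu'T = \phi(u, 0). \qquad (3.6)$$
Also from (3.1) and (3.5), it follows that
$$\mathcal{E}_{\theta_m} \exp[iu'm^{1/2}(T_m - \theta_m)] \to \phi(u, 0). \qquad (3.7)$$
Next
$$\mathcal{E}_{\theta_m} \exp[iu'm^{1/2}(T_m - \theta_m)] = \mathcal{E}_\theta \exp[iu'm^{1/2}(T_m - \theta) - iu'h + \Lambda_m] = \exp(-iu'h) \, \mathcal{E}_\theta \exp[iu'm^{1/2}(T_m - \theta) + \Lambda_m]$$
and this last expression converges to $\exp(-iu'h) \, \mathcal{E}_{\mathcal{L}^*} \exp(iu'T + h'\Delta - \tfrac{1}{2}h'\Gamma h)$ by means of (3.3) and (3.4). That is,
$$\mathcal{E}_{\theta_m} \exp[iu'm^{1/2}(T_m - \theta_m)] \to \exp(-iu'h) \, \mathcal{E}_{\mathcal{L}^*} \exp(iu'T + h'\Delta - \tfrac{1}{2}h'\Gamma h). \qquad (3.8)$$
From (3.7) and (3.8), we obtain
$$\phi(u, 0) = \exp(-iu'h) \, \mathcal{E}_{\mathcal{L}^*} \exp(iu'T + h'\Delta - \tfrac{1}{2}h'\Gamma h). \qquad (3.9)$$
It is shown in Lemma 3.2 below that in (3.9) we may replace $h$ by $ih$. By doing so, we obtain
$$\phi(u, 0) = e^{u'h} \, \mathcal{E}_{\mathcal{L}^*} \exp(iu'T + ih'\Delta + \tfrac{1}{2}h'\Gamma h). \qquad (3.10)$$
By means of (3.5), relation (3.10) may also be written as follows
$$\phi(u, 0) = (\exp u'h) \, \phi(u, h) \exp(\tfrac{1}{2}h'\Gamma h),$$
so that
$$\phi(u, h) = \phi(u, 0) \exp(-u'h) \exp(-\tfrac{1}{2}h'\Gamma h). \qquad (3.11)$$
Next, it is easily seen that
$$-u'h - \tfrac{1}{2}h'\Gamma h = -\tfrac{1}{2}(h + \Gamma^{-1}u)'\Gamma(h + \Gamma^{-1}u) + \tfrac{1}{2}u'\Gamma^{-1}u,$$
so that (3.11) becomes
$$\phi(u, h) = \phi(u, 0) \exp(\tfrac{1}{2}u'\Gamma^{-1}u) \exp[-\tfrac{1}{2}(h + \Gamma^{-1}u)'\Gamma(h + \Gamma^{-1}u)]. \qquad (3.12)$$
Setting successively $h = 0$ and $h = -\Gamma^{-1}u$ in (3.12), we obtain
$$\phi(u, 0) = \exp(-\tfrac{1}{2}u'\Gamma^{-1}u) \, [\phi(u, 0) \exp(\tfrac{1}{2}u'\Gamma^{-1}u)] \qquad (3.13)$$
and
$$\phi(u, -\Gamma^{-1}u) = \phi(u, 0) \exp(\tfrac{1}{2}u'\Gamma^{-1}u), \qquad (3.14)$$
respectively. From (3.5) and (3.14), it follows that $\phi(u, 0) \exp(\tfrac{1}{2}u'\Gamma^{-1}u)$ is a characteristic function (under $\mathcal{L}^*$), namely, the characteristic function of the random vector $T - \Gamma^{-1}\Delta$. Define $\mathcal{L}_2$ by
$$\mathcal{L}_2 = \mathcal{L}(T - \Gamma^{-1}\Delta \mid \mathcal{L}^*). \qquad (3.15)$$
Also $\exp(-\tfrac{1}{2}u'\Gamma^{-1}u)$ is a characteristic function (under $\mathcal{L}^*$), namely, the characteristic function of the random vector $\Gamma^{-1}\Delta$ which is distributed as $N(0, \Gamma^{-1})$. Set $\mathcal{L}_1 = N(0, \Gamma^{-1})$. Then from relation (3.13) and the composition theorem (see, e.g. Loève [1], p. 193), it follows that $\mathcal{L} = \mathcal{L}_1 * \mathcal{L}_2$, as was to be seen. ∎

We now proceed to establish the lemmas used in the proof of the theorem.

LEMMA 3.1 With the notation employed in the theorem, one has that
(i) $\mathcal{E}_\theta \exp[iu'm^{1/2}(T_m - \theta) + \Lambda_m] - \mathcal{E}_\theta \exp[iu'm^{1/2}(T_m - \theta) + h'\Delta_m^* - \tfrac{1}{2}h'\Gamma h] \to 0$ and
(ii) $\mathcal{E}_\theta \exp[iu'm^{1/2}(T_m - \theta) + h'\Delta_m^* - \tfrac{1}{2}h'\Gamma h] \to \mathcal{E}_{\mathcal{L}^*} \exp(iu'T + h'\Delta - \tfrac{1}{2}h'\Gamma h)$.

Proof (i) For simplicity, set
$$U_n = \exp \Lambda_n, \qquad V_n = \exp(h'\Delta_n^* - \tfrac{1}{2}h'\Gamma h)$$
and
$$Q^* = N(-\tfrac{1}{2}h'\Gamma h, \ h'\Gamma h).$$
Then the convergence $\mathcal{L}(\Lambda_n \mid P_\theta) \Rightarrow \mathcal{L}(\Lambda \mid Q^*)$ implies that
$$\mathcal{L}(U_n \mid P_\theta) \Rightarrow \mathcal{L}(\exp \Lambda \mid Q^*),$$
whereas $\mathcal{E}_\theta U_n = \mathcal{E}_{Q^*} \exp \Lambda = 1$. Also the convergence
$$\mathcal{L}(h'\Delta_n^* - \tfrac{1}{2}h'\Gamma h \mid P_\theta) \Rightarrow \mathcal{L}(\Lambda \mid Q^*)$$
implies that
$$\mathcal{L}(V_n \mid P_\theta) \Rightarrow \mathcal{L}(\exp \Lambda \mid Q^*).$$
Furthermore, $\mathcal{E}_\theta \exp h'\Delta_n^* \to \mathcal{E}_Q \exp h'\Delta = \exp \tfrac{1}{2}h'\Gamma h$ by (3.8) in Chapter 3, where $Q = N(0, \Gamma)$. Thus $\mathcal{E}_\theta V_n \to 1$, so that Lemma 2.1 in Chapter 3 applies and gives that $U_n$, $V_n$, $n \ge 1$, are uniformly integrable. Next $U_n - V_n \to 0$ in $P_\theta$-probability by arguing as in (3.5) of Chapter 3 and using the fact that $\Lambda_n - h'\Delta_n^* \to -\tfrac{1}{2}h'\Gamma h$ in $P_\theta$-probability. Then, by Lemma 2.2 in Chapter 3, it follows that $|U_n - V_n|$, $n \ge 1$, are uniformly integrable and Corollary 2.1 of the same chapter gives that $\mathcal{E}_\theta |U_n - V_n| \to 0$. The proof of (i) is then completed by observing that the left-hand side of (i) is bounded in absolute value by $\mathcal{E}_\theta |U_n - V_n|$.

(ii) Clearly, it suffices to show that
$$\mathcal{E}_\theta \exp[iu'm^{1/2}(T_m - \theta) + h'\Delta_m^*] \to \mathcal{E}_{\mathcal{L}^*} \exp(iu'T + h'\Delta). \qquad (3.16)$$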
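Set $A_m = (|h'\Delta_m^*| > c)$. Then one has that
$$|\mathcal{E}_\theta \exp[iu'm^{1/2}(T_m - \theta) + h'\Delta_m^*] - \mathcal{E}_{\mathcal{L}^*} \exp(iu'T + h'\Delta)| \le \int_{A_m} \exp h'\Delta_m^* \, dP_\theta + \int_{(|h'\Delta| > c)} \exp h'\Delta \, d\mathcal{L}^* + \left| \int_{A_m^c} \exp[iu'm^{1/2}(T_m - \theta) + h'\Delta_m^*] \, dP_\theta - \int_{(|h'\Delta| \le c)} \exp(iu'T + h'\Delta) \, d\mathcal{L}^* \right|. \qquad (3.17)$$
As was seen in the proof of (i), $\mathcal{E}_\theta \exp(h'\Delta_m^*) \to \exp(\tfrac{1}{2}h'\Gamma h)$. Also $\mathcal{L}(h'\Delta_m^* \mid P_\theta) \Rightarrow \mathcal{L}(h'\Delta \mid Q)$ implies that
$$\mathcal{L}(\exp h'\Delta_m^* \mid P_\theta) \Rightarrow \mathcal{L}(\exp h'\Delta \mid Q).$$
Therefore Lemma 2.1 in Chapter 3 gives that $\exp h'\Delta_m^*$, for all $m$, are uniformly integrable. Thus one may choose $c$ sufficiently large, so that
$$\int_{A_m} \exp h'\Delta_m^* \, dP_\theta < \varepsilon \quad \text{and} \quad \int_{(|h'\Delta| > c)} \exp h'\Delta \, d\mathcal{L}^* = \int_{(|h'\Delta| > c)} \exp h'\Delta \, dQ < \varepsilon. \qquad (3.18)$$
Furthermore
$$\int_{A_m^c} \exp[iu'm^{1/2}(T_m - \theta) + h'\Delta_m^*] \, dP_\theta \to \int_{(|h'\Delta| \le c)} \exp(iu'T + h'\Delta) \, d\mathcal{L}^* \qquad (3.19)$$
by the fact that $\mathcal{L}_m^* \Rightarrow \mathcal{L}^*$ and $\mathcal{L}^*(R^k \times B) = Q(B) = 0$, where $B = \{\Delta \in R^k;\ |h'\Delta| = c\}$. Then relations (3.17)-(3.19) complete the proof of (3.16) and hence that of (ii). ∎

LEMMA 3.2 With the notation employed in the theorem, consider the expectation $\mathcal{E}_{\mathcal{L}^*} \exp(iu'T + h'\Delta)$ as a function of $h$, call it $g(h)$, where $h = (h_1, \ldots, h_k)'$, $h_j \in R$, $j = 1, \ldots, k$. Then, for $j = 1, \ldots, k$, $g(h)$ is analytic in the $j$th coordinate $h_j$ when the first $j - 1$ coordinates $h_r$, $r = 1, \ldots, j - 1$, are any complex numbers and the last $k - j$ coordinates $h_r$, $r = j + 1, \ldots, k$, are any real numbers.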
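Proof By setting $\tilde h = (h_2, \ldots, h_k)'$, $\Delta = (\Delta_1, \ldots, \Delta_k)'$ and $\tilde\Delta = (\Delta_2, \ldots, \Delta_k)'$, one has that
$$\mathcal{E}_{\mathcal{L}^*} \exp(iu'T + h'\Delta) = \mathcal{E}_{\mathcal{L}^*} \exp[(iu'T + \tilde h'\tilde\Delta) + h_1\Delta_1] = \mathcal{E}_{\mathcal{L}^*} \Big[ \exp(iu'T + \tilde h'\tilde\Delta) \sum_{j=0}^{\infty} \frac{(h_1\Delta_1)^j}{j!} \Big].$$
Now at this point we observe that for $n \ge 0$, one has
$$\Big| \exp(iu'T + \tilde h'\tilde\Delta) \sum_{j=0}^{n} \frac{(h_1\Delta_1)^j}{j!} \Big| \le \exp \tilde h'\tilde\Delta \sum_{j=0}^{n} \frac{|h_1\Delta_1|^j}{j!} \le \exp(\tilde h'\tilde\Delta + |h_1\Delta_1|) \qquad (3.20)$$
and this last expression is, clearly, $Q$-integrable and hence $\mathcal{L}^*$-integrable. Then by the dominated convergence theorem, we get
$$\mathcal{E}_{\mathcal{L}^*} \exp(iu'T + h'\Delta) = \sum_{j=0}^{\infty} \Big\{ \frac{1}{j!} \mathcal{E}_{\mathcal{L}^*} [\exp(iu'T + \tilde h'\tilde\Delta) \Delta_1^j] \Big\} h_1^j,$$
which shows that $g(h) = g(h_1, \ldots, h_k)$ is analytic in $h_1$ when the other coordinates are kept fixed. Thus, $h_1$ may be replaced by a complex variable. Making this replacement and working as above, we have that the left-hand side in (3.20) is
$$\Big| \exp(iu'T + ih_{12}\Delta_1 + h_{11}\Delta_1 + \hat h'\hat\Delta) \sum_{j=0}^{n} \frac{(h_2\Delta_2)^j}{j!} \Big|,$$
where $h_1 = h_{11} + ih_{12}$, $\hat h = (h_3, \ldots, h_k)'$ and $\hat\Delta = (\Delta_3, \ldots, \Delta_k)'$, whereas the last bound on the right-hand side of the same relationship is equal to $\exp(h_{11}\Delta_1 + \hat h'\hat\Delta + |h_2\Delta_2|)$, which is $Q$-integrable and hence $\mathcal{L}^*$-integrable. Therefore $h_2$ may be replaced by a complex variable. In a similar fashion each one of the remaining coordinates may be replaced by complex variables and this completes the proof of the lemma. ∎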
4 W-efficiency of estimates: upper bounds via Theorem 3.1

The main purpose of the present section is that of establishing the inequalities in Theorems 4.1 and 4.2 for two specifications of a class of sequences of estimates (of $\theta$). The first specification, which is due to Schmetterer [1], is given in (4.8) and is restricted to the case that $\theta$ is real-valued. The result associated with it is presented as Theorem 4.1. A certain multiparameter version of this result is presented as Theorem 4.2. The corresponding class of sequences of estimates is given by (4.11) and is the multiparameter analogue of the class $\mathcal{C}_S$ given by (4.8). For the proof of the above-mentioned theorems, one needs an auxiliary result which is formulated as Proposition 4.1.

For $a \in R^k$ with $a \ne 0$ and $v \in R$, consider the half space
$$H_a(v) = \{x \in R^k;\ a'x \le v\}.$$
Denote by $L$ the d.f. corresponding to a probability measure $\mathcal{L}$ in $R^k$, set $Q_1 = N(0, \Gamma^{-1})$ and let $\Phi(x; \Gamma^{-1})$ be the corresponding d.f. Then it is clear that there exists $v^* \in R$ such that
$$\int_{H_a(v)} dL(x) = \int_{H_a(v^*)} d\Phi(x; \Gamma^{-1}) = \int_{H_a(v^*)} dQ_1. \qquad (4.1)$$
Next, denote by $\mu$ and $G$ the probability measure and the d.f., respectively, corresponding to the characteristic function $\phi(u, 0) \exp(\tfrac{1}{2}u'\Gamma^{-1}u)$ and let $(X, Y)$ be two random vectors distributed as $Q_1 \times \mu$. Also let $Q_2 \times \mu$ be an alternative distribution of $(X, Y)$, where $Q_2$ is the measure induced by $\Phi(x + \lambda\Gamma^{-1}a; \Gamma^{-1})$, and consider the problem of testing the hypothesis that $(X, Y)$ are distributed as $Q_1 \times \mu$ against the alternative that they are distributed as $Q_2 \times \mu$. If $f(x; \Gamma^{-1})$ is the density of $Q_1$ with respect to the $k$-dimensional Lebesgue measure $l^k$, it follows that $f(x; \Gamma^{-1})$ is also the density of $Q_1 \times \mu$ with respect to $l^k \times \mu$. This is so because for any $B_1, B_2 \in \mathcal{B}^k$, one has that
f
x
x Bt).
JB
Similarly, $f(x + \lambda\Gamma^{-1}a; \Gamma^{-1})$ is the density of $Q_2 \times \mu$ with respect to $l^k \times \mu$. Then for the hypothesis testing problem above the most powerful test rejects the hypothesis when
$$\frac{f(x + \lambda\Gamma^{-1}a; \Gamma^{-1})}{f(x; \Gamma^{-1})} > c,$$
which is equivalent to $a'x < c^*$ after some simplifications, provided $\lambda > 0$. Thus the test which rejects the hypothesis in question whenever $a'X \le v^*$ is at least as powerful as the test which rejects it whenever $a'(X + Y) \le v$. These two tests have the same level of significance because
d[O(x; T-1) x G(y)]
d(Qx x fl) = f
J Ha(v*)
J HO(V*)
=f
dO(a;r-i)=f
J HO(V*)
dL(x),
J Ha(v)
the last equality being true because of (4.1), and f
d(Q1x/i)=
J [a'(x+y)
f
d[*(a?;r-i)
J [a'(x+y)^v]
= f
dL(«) = f
dL(a:).
The second equality from the right-hand side above is true because the characteristic function of the sum X + Y of the independent random vectors X and Y is equal to exp (— |wT~%)
f
f d(Q1xju,)^
I
d(Q1x/^).
(4.2)
But
f
d[O(a: + AT^a; T"1) x G(y)]
d(Q2 x /i) = J
-JT
[ a'(2!+2/-Ar- 'o
=J That is,
f
^
d(Q2x/^)=|
dL(«). di(2).
(4.3)
Also f
d(Q2 x pL) = f
JHaiv*)
J (a'x^v*)
=f In the last expression above, we set A =
_± , s > 0. Then
so that
dO^jr- 1 )^ f
f J [o'fe-r-'flXi;*]
Thus
f
dO^T- 1 ).
J (a'^'y*+s)
dO^jT- 1 ).
d(g2x/0=f
(4.4)
On the other hand, for the above choice of A, relation (4.3) becomes as follows f
d(0 a x/O=f
dL(z).
(4.5)
Then relations (4.2), (4.4) and (4.5) give d
f
J (a'x
di(a:)
(5 > 0).
(4.6)
J (a'x^v*+s)
For 5 < 0, we arrive at f
dO^jT" 1 ) ^ f
J (a'x<'y*+s)
dL(x)
(4.6')
J (a'x^v+s)
by considering as alternative distribution of (X, Y) the distribution Q2 x /i, where Q'2 is the measure induced by <$>(x — Ar- 1 ^; F"1), A=
^ - , s < 0. Since also
dQ)(x; F"1) =
dL(x) by
(4.1), considering the two cases that $s > 0$ and $s < 0$, relations (4.6) and (4.6') provide easily the following inequality
$$\int_{H_a(v+s) \triangle H_a(v)} dL(x) \le \int_{H_a(v^*+s) \triangle H_a(v^*)} d\Phi(x; \Gamma^{-1}) \qquad (s \in R).$$
Thus we have established the following result.

PROPOSITION 4.1 For $a \in R^k$ with $a \ne 0$ and $\alpha \in R$, consider the half space $H_a(\alpha) = \{x \in R^k;\ a'x \le \alpha\}$ and let $L(x)$ and $\Phi(x; \Gamma^{-1})$ be the d.f.s of the (probability) measures $\mathcal{L}$ (in $R^k$) and $N(0, \Gamma^{-1})$, respectively. Furthermore, for any $v \in R$, let $v^*$ be defined by
$$\int_{H_a(v)} dL(x) = \int_{H_a(v^*)} d\Phi(x; \Gamma^{-1}).$$
Then for any $v, s \in R$, one has that
$$\int_{H_a(v+s) \triangle H_a(v)} dL(x) \le \int_{H_a(v^*+s) \triangle H_a(v^*)} d\Phi(x; \Gamma^{-1}). \qquad (4.7)$$
We are now going to derive the main results in Schmetterer [1] and Roussas [5] as special cases of (4.7). To this end, suppose that $\Theta$ is an open subset of $R$ (i.e. $k = 1$) and define the class $\mathcal{C}_S$ of sequences of estimates (of $\theta$) by
$$\mathcal{C}_S = \{\{T_n\};\ P_\theta[n^{1/2}(T_n - \theta) \le x] \to F_T(x; \theta), \text{ a d.f., continuously in } \Theta \text{ for each fixed } x \in R \text{ and also continuously in } R \text{ for each fixed } \theta \in \Theta\}. \qquad (4.8)$$
From (4.8), we have that for each $\theta \in \Theta$,
$$P_\theta[n^{1/2}(T_n - \theta) \le x] \to F_T(x; \theta)$$
continuously in $R$. Therefore, by Lemma 2.1, $F_T(\cdot; \theta)$ is continuous and hence the 'smallest' and 'largest' median $l_T(\theta)$ and $u_T(\theta)$, respectively, are such that $F_T(l_T; \theta) = F_T(u_T; \theta) = \tfrac{1}{2}$. Then consider the following definition of the class $J_n(\theta; T)$ mentioned in (1.2).
$$J_n(\theta; T) = \{j_n(\theta; t_1, t_2, T);\ j_n(\theta; t_1, t_2, T) = (\theta + (l_T(\theta) - t_1)n^{-1/2},\ \theta + (u_T(\theta) + t_2)n^{-1/2}),\ t_1, t_2 > 0\}. \qquad (4.9)$$
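Finally, for each $\theta \in \Theta$, let $B(\theta; t_1, t_2)$ be defined by
$$B(\theta; t_1, t_2) = \Phi[t_2\sigma(\theta)] - \Phi[-t_1\sigma(\theta)], \quad t_1, t_2 > 0. \qquad (4.10)$$
Then one has the following result.

THEOREM 4.1 Let $\mathcal{C}_S$ be defined by (4.8). Then for every $\{T_n\} \in \mathcal{C}_S$, one has
$$\lim P_\theta[T_n \in j_n(\theta; t_1, t_2, T)] \le B(\theta; t_1, t_2)$$
for all $t_1, t_2 > 0$ and every $\theta \in \Theta$.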
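To see what the bound (4.10) looks like numerically, one needs a concrete $\Gamma(\theta)$. The sketch below assumes, purely for illustration, the stationary Gaussian autoregression $X_j = \theta X_{j-1} + e_j$ with i.i.d. $N(0, 1)$ errors and $|\theta| < 1$, for which the information per observation is $1/(1 - \theta^2)$, and it takes $\sigma^2(\theta) = \Gamma(\theta)$ as in (5.3). The model and the function names are assumptions of this sketch, not part of the text.

```python
import math


def Phi(x: float) -> float:
    """Distribution function of N(0, 1)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gamma_ar1(theta: float) -> float:
    """Information for the assumed model X_j = theta * X_{j-1} + e_j,
    e_j i.i.d. N(0, 1), with |theta| < 1."""
    if not -1.0 < theta < 1.0:
        raise ValueError("theta must lie in (-1, 1)")
    return 1.0 / (1.0 - theta * theta)


def upper_bound_B(theta: float, t1: float, t2: float, gamma=gamma_ar1) -> float:
    """B(theta; t1, t2) = Phi[t2 * sigma(theta)] - Phi[-t1 * sigma(theta)],
    with sigma(theta) the square root of gamma(theta)."""
    sigma = math.sqrt(gamma(theta))
    return Phi(t2 * sigma) - Phi(-t1 * sigma)


for theta in (0.0, 0.5, 0.9):
    print(theta, round(upper_bound_B(theta, 1.0, 1.0), 4))
```

Under these assumptions the bound increases with $|\theta|$: the more information each observation carries, the more probability an estimate can place, in the limit, on an interval of width of order $n^{-1/2}$ around $\theta$.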
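Proof In the first place, it is clear that $\mathcal{C}_S \subseteq \mathcal{C}_H$ and $F_T(\cdot; \theta) = L(\cdot; \theta)$. Next
$$P_\theta[n^{1/2}(T_n - \theta) \le l_T(\theta)] \to \tfrac{1}{2} = \int_{H_1[l_T(\theta)]} dL(x; \theta) = \int_{H_1(0)} d\Phi[x; \Gamma^{-1}(\theta)]$$
and similarly
$$P_\theta[n^{1/2}(T_n - \theta) \le u_T(\theta)] \to \tfrac{1}{2} = \int_{H_1[u_T(\theta)]} dL(x; \theta) = \int_{H_1(0)} d\Phi[x; \Gamma^{-1}(\theta)].$$
Therefore for the present case, if we take $v$ equal to $l_T(\theta)$ and $u_T(\theta)$ in Proposition 4.1, then the corresponding $v^*$s are equal to 0. By also taking $s$ to be equal to $-t_1$ and $t_2$, relation (4.7) gives, respectively,
$$\int_{H_1[l_T(\theta) - t_1] \triangle H_1[l_T(\theta)]} dL(x; \theta) \le \int_{H_1(-t_1) \triangle H_1(0)} d\Phi[x; \Gamma^{-1}(\theta)]$$
and
$$\int_{H_1[u_T(\theta) + t_2] \triangle H_1[u_T(\theta)]} dL(x; \theta) \le \int_{H_1(t_2) \triangle H_1(0)} d\Phi[x; \Gamma^{-1}(\theta)].$$
Equivalently,
$$\lim P_\theta[l_T(\theta) - t_1 < n^{1/2}(T_n - \theta) \le l_T(\theta)] \le \Phi(0) - \Phi[-t_1\sigma(\theta)]$$
and
$$\lim P_\theta[u_T(\theta) < n^{1/2}(T_n - \theta) \le u_T(\theta) + t_2] \le \Phi[t_2\sigma(\theta)] - \Phi(0).$$
Since $F_T(u_T; \theta) - F_T(l_T; \theta) = 0$, no limiting mass lies between $l_T(\theta)$ and $u_T(\theta)$. Hence
$$\lim P_\theta[T_n \in j_n(\theta; t_1, t_2, T)] = \lim P_\theta[l_T(\theta) - t_1 < n^{1/2}(T_n - \theta) < u_T(\theta) + t_2] \le \Phi[t_2\sigma(\theta)] - \Phi[-t_1\sigma(\theta)],$$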
as was to be seen. ∎

Now we do not assume that $k = 1$. Then by interpreting the various inequalities below in the coordinatewise sense, we consider the class $\mathcal{C}_R$ of sequences of estimates defined as in (4.8), namely
$$\mathcal{C}_R = \{\{T_n\};\ P_\theta[n^{1/2}(T_n - \theta) \le x] \to F_T(x; \theta), \text{ a d.f., continuously in } \Theta \text{ for each fixed } x \in R^k \text{ and also continuously in } R^k \text{ for each fixed } \theta \in \Theta\}. \qquad (4.11)$$
Then for each $h \in R^k$, it is clear that the d.f. of $n^{1/2}h'(T_n - \theta)$, under $P_\theta$, converges to a d.f. $F_T(x; \theta, h)$, say, continuously in $\Theta$ for each fixed $x \in R$ and also continuously in $R$ for each fixed $\theta \in \Theta$. Furthermore, the d.f. $F_T(\cdot; \theta, h)$ is continuous, so that the 'smallest' and 'largest' median $l_T(\theta, h)$ and $u_T(\theta, h)$, respectively, are such that $F_T(l_T; \theta, h) = F_T(u_T; \theta, h) = \tfrac{1}{2}$. It is also clear that $\mathcal{C}_R \subseteq \mathcal{C}_H$. Now working as in the proof of the previous theorem, one obtains the following result.

THEOREM 4.2 Let $\mathcal{C}_R$ be defined by (4.11) and for each $h \in R^k$, let $l_T(\theta, h)$ and $u_T(\theta, h)$ be as above. Then for each $h \in R^k$, one has
$$\lim P_\theta[-t_1 + l_T(\theta, h) < n^{1/2}h'(T_n - \theta) < t_2 + u_T(\theta, h)] \le \Phi[t_2\sigma^{-1}(\theta, h)] - \Phi[-t_1\sigma^{-1}(\theta, h)]$$
for all $t_1, t_2 > 0$ and every $\theta \in \Theta$, where $\sigma^2(\theta, h) = h'\Gamma^{-1}(\theta)h$.
5 W-efficiency of estimates: upper bounds

In this section, we establish upper bounds for two additional specifications of the class $\mathcal{C}^*$ for the case that $\Theta$ is an open subset of $R$. For the proof of the relevant theorems, we shall require the relations (5.1) and (5.2) below. In order to formulate them, let $\theta \in \Theta$, let $0 \ne h \in R$ and set $\theta_n = \theta + hn^{-1/2}$. Then from Theorems 4.3 and 4.5 in Chapter 2, it follows immediately that for $x \in R$,
(6 2)
-
2
where
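$$\sigma^2(\theta) = \Gamma(\theta) = 4\mathcal{E}_\theta[\dot\phi_1(\theta)]^2 \qquad (5.3)$$
and recall the log-likelihood ratio $\Lambda(\theta, \theta_n)$. In the sequel, we shall write $\Lambda_n$ rather than $\Lambda(\theta, \theta_n)$ for the sake of simplicity.

In the definition of the class $\mathcal{C}_S$, Pfanzagl [1] removes the requirement of continuous convergence and assumes instead that the limiting distribution has zero as its median. More precisely, he defines the class $\mathcal{C}^*$, to be denoted here by $\mathcal{C}_P$, as follows.
$$\mathcal{C}_P = \{\{T_n\};\ \mathcal{L}[n^{1/2}(T_n - \theta) \mid P_\theta] \Rightarrow \mathcal{L}_{T,\theta}, \text{ a probability measure, such that } \mathcal{L}_{T,\theta}((-\infty, 0]) \ge \tfrac{1}{2} \text{ and } \mathcal{L}_{T,\theta}([0, \infty)) \ge \tfrac{1}{2}\}.$$
Thus if $0 \in [l_T(\theta), u_T(\theta)]$ for all $\theta \in \Theta$, then $\mathcal{C}_P \supseteq \mathcal{C}_S$.

The requirement that 0 be a median for $\mathcal{L}_{T,\theta}$ for all $\theta \in \Theta$ is a very natural one, as the following argument shows. Suppose that 0 is not a median for $\mathcal{L}_{T,\theta}$ for some $\theta$ and let $l_T(\theta) > 0$. Choose a continuity point $x = x(\theta, T)$ of $\mathcal{L}_{T,\theta}$ such that $0 < x < l_T(\theta)$. Then
$$\tfrac{1}{2} > \mathcal{L}_{T,\theta}((-\infty, x]) = \lim P_\theta[n^{1/2}(T_n - \theta) \le x] \ge \limsup P_\theta[n^{1/2}(T_n - \theta) \le 0].$$
In other words, $\limsup P_\theta(T_n \le \theta) < \tfrac{1}{2}$ and in a similar fashion, $\limsup P_\theta(T_n \ge \theta) < \tfrac{1}{2}$ if $u_T(\theta) < 0$. Such an asymptotic behaviour of an estimate of $\theta$ perhaps would be judged by many as not being satisfactory.

For the proof of the first theorem in this chapter, the following lemma presented in Pfanzagl [1] is required.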
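LEMMA 5.1 Let $\{T_n\}$ be a sequence of estimates such that
$$\mathcal{L}[n^{1/2}(T_n - \theta) \mid P_\theta] \Rightarrow \mathcal{L}_{T,\theta}, \text{ a probability measure},$$
$$\liminf P_\theta[n^{1/2}(T_n - \theta) > x] \ge \tfrac{1}{2} \text{ for all } x < 0$$
and
$$\liminf P_\theta[n^{1/2}(T_n - \theta) < x] \ge \tfrac{1}{2} \text{ for all } x > 0.$$
Then, under (5.1) and (5.2), one has
$$\limsup P_\theta[T_n \in j_n(\theta; t_1, t_2)] \le B(\theta; t_1, t_2) \text{ for all } t_1, t_2 > 0 \text{ and a.a. } [l] \ \theta \in \Theta; \qquad (5.5)$$
the quantity $B(\theta; t_1, t_2)$ is given by (4.10) and $j_n(\theta; t_1, t_2) \in J_n(\theta)$, where $J_n(\theta)$ is defined by (5.6) below.
$$J_n(\theta) = \{j_n(\theta; t_1, t_2);\ j_n(\theta; t_1, t_2) = (\theta - t_1 n^{-1/2},\ \theta + t_2 n^{-1/2}),\ t_1, t_2 > 0\}. \qquad (5.6)$$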
Proof For each fixed $x < 0$, define $\{f_n(x; \cdot)\}$ as follows
$$f_n(x; \theta) = P_\theta[n^{1/2}(T_n - \theta) > x] - \tfrac{1}{2}, \quad \theta \in \Theta.$$
Then, clearly, the conditions of Lemma 2.5 are satisfied. Thus, if for $r \in R$ we consider the sequence $x_n(r) = rn^{-1/2}$, it follows that there exists an $l$-null set which depends on $\{x_n(r)\}$ and perhaps also on $x$, call it $N(x, r)$, such that $\limsup f_n(x; \theta + rn^{-1/2}) \ge 0$ for all $\theta \in N^c(x, r)$. Let $R_0$ stand for the set of rational numbers in $R$ and set
$$N_1 = \bigcup \, [N(x, r);\ x \in (-\infty, 0) \cap R_0,\ r \in (0, \infty) \cap R_0].$$
Then $l(N_1) = 0$ and $\limsup f_n(x; \theta + rn^{-1/2}) \ge 0$ for all $\theta \in N_1^c$, $x \in (-\infty, 0) \cap R_0$ and $r \in (0, \infty) \cap R_0$, or equivalently,
$$\limsup P[n^{1/2}(T_n - \theta) > x + r \mid P_{\theta + rn^{-1/2}}] \ge \tfrac{1}{2} \text{ for all } \theta \in N_1^c,\ x \in (-\infty, 0) \cap R_0 \text{ and } r \in (0, \infty) \cap R_0. \qquad (5.7)$$
Set $s$ for the left-hand side of (5.7). Then there exists a subsequence $\{m\} \subseteq \{n\}$ such that
$$\lim P[m^{1/2}(T_m - \theta) > x + r \mid P_{\theta + rm^{-1/2}}] = s.$$
Applying (3.2) with #m = d + rm~i and 5 > 0, we get
Asymptotic efficiency of estimates
so that for all sufficiently large $m$, $m \ge m_1$, say, one has

$$\cdots < P_{\theta_m}[m^{1/2}(T_m-\theta) \ge x + r] \quad \text{for all } \theta \in N_1^c,\ x \in (-\infty, 0) \cap R_0 \text{ and } r \in (0, \infty) \cap R_0. \tag{5.8}$$

At this point, an ingenious use of the Neyman-Pearson fundamental lemma, due to Wald [1], provides the following inequality

$$\cdots < P_\theta[m^{1/2}(T_m-\theta) \ge x + r] \quad \text{for all } \theta \in N_1^c,\ x \in (-\infty, 0) \cap R_0 \text{ and } r \in (0, \infty) \cap R_0. \tag{5.9}$$

Namely, define $C_m$ and $D_m$ by

Then for each $m \ge m_1$, $C_m$ and $D_m$ are alternative critical regions for testing the simple hypothesis $H$: the underlying probability measure is $P_{m,\theta}$ against the simple alternative $A$: the underlying probability measure is $P_{m,\theta_m}$. Since $C_m$ is optimal, by the Neyman-Pearson fundamental lemma, and its power $P_{\theta_m}(C_m)$ is less than $P_{\theta_m}(D_m)$, the power of $D_m$, by (5.8), the level of $C_m$ must also be less than the level of $D_m$. That is, $P_\theta(C_m) < P_\theta(D_m)$, or, equivalently,

$$\cdots < P_\theta[m^{1/2}(T_m-\theta) \ge x + r],$$

which is inequality (5.9). In inequality (5.9), we first take the limits as $m \to \infty$ and utilize (5.1) and then let $\delta \to 1$. One has then

$$\limsup P_\theta[n^{1/2}(T_n-\theta) \ge x + r] \ge \Phi(-r\sigma) \quad \text{for all } \theta \in N_1^c,\ x \in (-\infty, 0) \cap R_0 \text{ and } r \in (0, \infty) \cap R_0. \tag{5.10}$$

Next, for each fixed $x > 0$, define $\{f_n(x; \cdot)\}$ by
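Then once again the conditions of Lemma 2.5 are fulfilled and, working as above with $\theta_m$ now defined by $\theta_m = \theta - r m^{-1/2}$, one obtains

$$\limsup P_\theta[n^{1/2}(T_n-\theta) \le x - r] \ge \Phi(-r\sigma) \quad \text{for all } \theta \in N_2^c \text{ with } l(N_2) = 0,\ x \in (0, \infty) \cap R_0 \text{ and } r \in (0, \infty) \cap R_0. \tag{5.11}$$

Now, for $\varepsilon > 0$, let $\gamma = \gamma(\varepsilon, \theta) > 0$ be defined by

and set $\eta = \eta(\varepsilon, \theta) = 2\gamma\sigma$. Then for any $x_1, x_2$ such that $|x_1 - x_2| \le 2\gamma$ (equivalently, $|x_1\sigma - x_2\sigma| \le \eta$), we clearly have

$$|\Phi(x_1\sigma) - \Phi(x_2\sigma)| \le \varepsilon. \tag{5.12}$$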
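Relation (5.12) is a uniform continuity statement for $\Phi$, and it can be checked numerically. The sketch below assumes one admissible choice of $\gamma$, namely the solution of $\Phi(\gamma\sigma) - \Phi(-\gamma\sigma) = \varepsilon$; this particular choice is an assumption made for illustration and need not coincide with the definition used in the text. It works because the largest increment of $\Phi$ over an interval of length $\eta = 2\gamma\sigma$ is attained on the interval centred at zero. The function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal law; STD_NORMAL.cdf plays the role of Phi


def gamma_for(eps: float, sigma: float) -> float:
    """Assumed choice of gamma: solves Phi(gamma*sigma) - Phi(-gamma*sigma) = eps."""
    if not 0.0 < eps < 1.0 or sigma <= 0.0:
        raise ValueError("need 0 < eps < 1 and sigma > 0")
    return STD_NORMAL.inv_cdf((1.0 + eps) / 2.0) / sigma


def max_increment(eps: float, sigma: float, half_width: float = 6.0, steps: int = 2401) -> float:
    """Largest |Phi(x2*sigma) - Phi(x1*sigma)| over a grid of x1 with x2 = x1 + 2*gamma."""
    gamma = gamma_for(eps, sigma)
    worst = 0.0
    for i in range(steps):
        x1 = -half_width + 2.0 * half_width * i / (steps - 1)
        x2 = x1 + 2.0 * gamma
        worst = max(worst, abs(STD_NORMAL.cdf(x2 * sigma) - STD_NORMAL.cdf(x1 * sigma)))
    return worst


if __name__ == "__main__":
    print(gamma_for(0.1, 2.0), max_increment(0.1, 2.0))
```

On the grid, the largest increment stays at or below $\varepsilon$, as (5.12) requires for pairs $x_1, x_2$ at the maximal allowed distance $2\gamma$.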
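Let $t > 0$ be arbitrary and choose

$$r \in (t, t + 2\gamma) \cap R_0, \quad x \in (t - r, 0) \cap R_0.$$

Then $|r - t| \le 2\gamma$, or equivalently, $|(-t)\sigma - (-r)\sigma| \le \eta$, and hence relation (5.12) implies

$$-\varepsilon \le \Phi(-r\sigma) - \Phi(-t\sigma) \le \varepsilon. \tag{5.13}$$

Thus for $\theta \in N_1^c$, relation (5.10) gives

$$\limsup P_\theta[n^{1/2}(T_n-\theta) \ge x + r] \ge \Phi(-r\sigma). \tag{5.14}$$

Since also $x + r > t$ by the choice of $x$, one has

$$\limsup P_\theta[n^{1/2}(T_n-\theta) \ge t] \ge \limsup P_\theta[n^{1/2}(T_n-\theta) \ge x + r] \ge \Phi(-r\sigma) \ge \Phi(-t\sigma) - \varepsilon$$

on account of (5.14) and (5.13). Since this is true for every $\varepsilon > 0$, one concludes that

$$\limsup P_\theta[n^{1/2}(T_n-\theta) \ge t] \ge \Phi(-t\sigma) \quad \text{for all } t > 0 \text{ and all } \theta \in N_1^c,$$

or equivalently,

$$\liminf P_\theta[n^{1/2}(T_n-\theta) < t] \le \Phi(t\sigma) \quad \text{for all } t > 0 \text{ and all } \theta \in N_1^c.$$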
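Finally, replacing $t$ by $t_2$, one has

$$\liminf P_\theta[n^{1/2}(T_n-\theta) < t_2] \le \Phi(t_2\sigma) \quad \text{for all } t_2 > 0 \text{ and all } \theta \in N_1^c. \tag{5.15}$$

For an arbitrary $t > 0$, choose

$$r \in (t, t + 2\gamma) \cap R_0 \quad \text{and} \quad x \in (0, r - t) \cap R_0.$$

Then (5.12) holds true again. For $\theta \in N_2^c$, relation (5.11) implies

$$\limsup P_\theta[n^{1/2}(T_n-\theta) \le x - r] \ge \Phi(-r\sigma). \tag{5.16}$$

Since also $x - r < -t$ by the choice of $x$, one has

$$\limsup P_\theta[n^{1/2}(T_n-\theta) \le -t] \ge \limsup P_\theta[n^{1/2}(T_n-\theta) \le x - r] \ge \Phi(-r\sigma) \ge \Phi(-t\sigma) - \varepsilon$$

on account of (5.16) and (5.13). From this, it follows as above that

$$\limsup P_\theta[n^{1/2}(T_n-\theta) \le -t] \ge \Phi(-t\sigma) \quad \text{for all } t > 0 \text{ and all } \theta \in N_2^c.$$

Upon replacing $t$ by $t_1$, one has

$$\limsup P_\theta[n^{1/2}(T_n-\theta) \le -t_1] \ge \Phi(-t_1\sigma) \quad \text{for all } t_1 > 0 \text{ and all } \theta \in N_2^c. \tag{5.17}$$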
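Now for any $t_1, t_2 > 0$, let $-t_1', t_2'$ be continuity points of $\mathscr{L}_{T,\theta}$ such that $-t_1' < -t_1$, $t_2' > t_2$. Then

$$\liminf P_\theta[-t_1 < n^{1/2}(T_n-\theta) < t_2] \le \lim P_\theta[-t_1' < n^{1/2}(T_n-\theta) < t_2']$$
$$= \lim P_\theta[n^{1/2}(T_n-\theta) < t_2'] - \lim P_\theta[n^{1/2}(T_n-\theta) \le -t_1']$$
$$= \liminf P_\theta[n^{1/2}(T_n-\theta) < t_2'] - \limsup P_\theta[n^{1/2}(T_n-\theta) \le -t_1']$$

and, by (5.15) and (5.17), this is bounded above by

$$\Phi(t_2'\sigma) - \Phi(-t_1'\sigma).$$

Letting now $-t_1' \to -t_1$ and $t_2' \to t_2$ through continuity points of $\mathscr{L}_{T,\theta}$, we get that for all $\theta \in N_0^c$ and all $t_1, t_2 > 0$,

$$\liminf P_\theta[-t_1 < n^{1/2}(T_n-\theta) < t_2] \le \Phi(t_2\sigma) - \Phi(-t_1\sigma) = B(\theta; t_1, t_2),$$

as was to be shown. ∎

Now we can formulate the first main result in this section.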
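THEOREM 5.1 Let $\mathscr{C}_P$, $j_n(\theta; t_1, t_2)$ and $B(\theta; t_1, t_2)$ be defined by (5.4), (5.6) and (4.10), respectively. Then one has

$$\limsup P_\theta[T_n \in j_n(\theta; t_1, t_2)] \le B(\theta; t_1, t_2)$$

for all $t_1, t_2 > 0$ and a.a. $[l]$ $\theta \in \Theta$.

Proof First we show that the assumptions of Lemma 5.1 are satisfied. Suppose $x < 0$. Then for each $\theta \in \Theta$, there exists $y = y(T, \theta)$ in $[x, 0)$ such that $\mathscr{L}_{T,\theta}(\{y\}) = 0$. Thus

$$P_\theta[n^{1/2}(T_n-\theta) \ge x] \ge P_\theta[n^{1/2}(T_n-\theta) \ge y]$$

and hence

$$\liminf P_\theta[n^{1/2}(T_n-\theta) \ge x] \ge \liminf P_\theta[n^{1/2}(T_n-\theta) \ge y] = \lim P_\theta[n^{1/2}(T_n-\theta) \ge y] = \mathscr{L}_{T,\theta}([y, \infty)) \ge \mathscr{L}_{T,\theta}([0, \infty)) \ge \tfrac12.$$

That is,

$$\liminf P_\theta[n^{1/2}(T_n-\theta) \ge x] \ge \tfrac12 \quad \text{for all } x < 0,$$

and in a similar fashion

$$\liminf P_\theta[n^{1/2}(T_n-\theta) \le x] \ge \tfrac12 \quad \text{for all } x > 0.$$

Next we proceed as follows. For $\varepsilon > 0$, define $\gamma$ and $\eta$ as in the proof of Lemma 5.1. Let $t_1, t_2 > 0$ be arbitrary and choose $s_1 \in (t_1, t_1 + 2\gamma]$, $s_2 \in (t_2, t_2 + 2\gamma]$, so that

Then $s_1 - t_1 \le 2\gamma$, $s_2 - t_2 \le 2\gamma$, or equivalently, $s_1\sigma - t_1\sigma \le \eta$, $s_2\sigma - t_2\sigma \le \eta$.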
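From the above choice of $s_1, s_2$, it follows that

$$P_\theta[T_n \in j_n(\theta; t_1, t_2)] \le P_\theta[T_n \in j_n(\theta; s_1, s_2)]$$

and hence

$$\limsup P_\theta[T_n \in j_n(\theta; t_1, t_2)] \le \lim P_\theta[T_n \in j_n(\theta; s_1, s_2)]. \tag{5.18}$$

By means of (5.5), the right-hand side of (5.18) is bounded by $B(\theta; s_1, s_2)$ a.e. $[l]$. Thus one has

$$\limsup P_\theta[T_n \in j_n(\theta; t_1, t_2)] \le B(\theta; s_1, s_2) \quad \text{a.e. } [l]. \tag{5.19}$$

But

$$s_1\sigma - t_1\sigma = |(-t_1)\sigma - (-s_1)\sigma| \le \eta, \quad s_2\sigma - t_2\sigma \le \eta$$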
imply $|\Phi(-t_1\sigma) - \Phi(-s_1\sigma)| \le \varepsilon$, $|\Phi(s_2\sigma) - \Phi(t_2\sigma)| \le \varepsilon$, and hence $-\Phi(-s_1\sigma) \le -\Phi(-t_1\sigma) + \varepsilon$, $\Phi(s_2\sigma) \le \Phi(t_2\sigma) + \varepsilon$. Therefore, by (5.19),

$$\limsup P_\theta[T_n \in j_n(\theta; t_1, t_2)] \le B(\theta; t_1, t_2) + 2\varepsilon \quad \text{a.e. } [l].$$

Finally, letting $\varepsilon \to 0$, we obtain the desired result. ∎

In Theorems 4.1, 4.2 and 5.1, it is a basic condition that $\{\mathscr{L}[n^{1/2}(T_n-\theta)\mid P_\theta]\}$ converges (weakly) to a probability measure. However, this condition is a rather arbitrary restriction. Other criteria for defining a class of estimates could be used as well. Such an alternative criterion used by Pfanzagl [1] is median-unbiasedness. That is, the class $\mathscr{C}^*$, to be denoted here by $\mathscr{C}_{P'}$, is the following one.

$$\mathscr{C}_{P'} = \{\{T_n\};\ P_\theta(T_n \ge \theta) \ge \tfrac12,\ P_\theta(T_n \le \theta) \ge \tfrac12 \text{ for all } \theta \in \Theta \text{ and all } n\}. \tag{5.20}$$

Now we can prove the second main result in this section.

THEOREM 5.2 Let $\mathscr{C}_{P'}$, $J_n(\theta)$ and $B(\theta; t_1, t_2)$ be defined by (5.20), (5.6) and (4.10), respectively. Then one has

$$\limsup P_\theta[T_n \in j_n(\theta; t_1, t_2)] \le B(\theta; t_1, t_2)$$

for all $t_1, t_2 > 0$ and all $\theta \in \Theta$.

Proof The proof is based on the same ideas as those used in Theorem 5.1, namely the utilization of (5.1) and (5.2) and the use of the Neyman-Pearson fundamental lemma. More precisely, let $\delta, h > 0$ be arbitrary and set $\theta_n = \theta + h n^{-1/2}$. Then the assumption of median-unbiasedness implies that for all $n$, one has

$$P_{\theta_n}(T_n \ge \theta_n) \ge \tfrac12, \quad \text{or, equivalently,} \quad P_{\theta_n}(T_n < \theta_n) \le \tfrac12. \tag{5.21}$$
Next, applying (5.2) with $x = (1 + \delta)h\sigma$, one has

Therefore, for all sufficiently large $n$, $n \ge n_1$, say, we have

(5.22)

Utilizing relations (5.21) and (5.22) and also the Neyman-Pearson fundamental lemma in the same way it was done in the course of the proof of Theorem 5.1, one has that for all sufficiently large $n$, $n \ge n_2$, say, and all $\theta \in \Theta$,

$$P_\theta(T_n < \theta_n) < P_\theta[\,\cdots\,]$$

Letting first $n \to \infty$ and then $\delta \to 0$ and replacing $h$ by $t_2$ in the resulting limits, one has

$$\limsup P_\theta[n^{1/2}(T_n-\theta) < t_2] \le \Phi(t_2\sigma) \quad (t_2 > 0). \tag{5.23}$$

Next, taking $\theta_n = \theta - h n^{-1/2}$, $h > 0$, and $x = -\delta h\sigma$, $\delta > 0$, we obtain in a similar way the following result

$$\Phi(-t_1\sigma) \le \liminf P_\theta[n^{1/2}(T_n-\theta) < -t_1] \quad (t_1 > 0). \tag{5.24}$$

Therefore, by means of (5.23) and (5.24), one has

$$\limsup P_\theta[T_n \in j_n(\theta; t_1, t_2)] = \limsup P_\theta[-t_1 < n^{1/2}(T_n-\theta) < t_2]$$
$$\le \limsup P_\theta[n^{1/2}(T_n-\theta) < t_2] - \liminf P_\theta[n^{1/2}(T_n-\theta) < -t_1]$$
$$\le \Phi(t_2\sigma) - \Phi(-t_1\sigma) = B(\theta; t_1, t_2),$$

as was to be shown. ∎

This section is closed with a remark and two examples.

REMARK 5.1 It is to be pointed out that, as follows from Theorems 4.1, 4.2, 5.1 and 5.2, the variance $\sigma^2(\theta)$ plays the same role as Fisher's information number $I(\theta)$ under the standard conditions (of pointwise differentiability, etc.).

The following example shows that a sequence of estimates $\{V_n\}$ which is optimal with respect to $\mathscr{C}^*$ need not belong to $\mathscr{C}^*$.
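EXAMPLE 5.1 Refer to Example 3.2 in Chapter 2. Then, with $S_n^2 = n^{-1}\sum_{j=1}^{n}(X_j - \mu)^2$, it follows that

$$\mathscr{L}[n^{1/2}(S_n^2 - \theta)\mid P_\theta] \Rightarrow N(0, 2\theta^2)$$

(see, e.g. Cramér [1], Sec. 28.3). Thus the upper bound $B(\theta; t_1, t_2)$ given by (4.10) is attained for $\{V_n\} = \{S_n^2\}$ in the sense of (1.5). Accordingly, $\{S_n^2\}$ is optimal with respect to all $\mathscr{C}_S$, $\mathscr{C}_P$ and $\mathscr{C}_{P'}$. However, $\{S_n^2\} \notin \mathscr{C}_{P'}$, because $P_\theta(S_n^2 \ge \theta) = P(\chi_n^2 \ge n)$ and therefore $P_\theta(S_n^2 \ge \theta) \ge \tfrac12$ and $P_\theta(S_n^2 \le \theta) \ge \tfrac12$ cannot be simultaneously true. On the other hand, it is clear that $\{S_n^2\} \in \mathscr{C}_P$. Furthermore, $\{S_n^2\} \in \mathscr{C}_S$ because

$$S_n^2 - \theta = n^{-1}\sum_{j=1}^{n} Y_j, \quad \text{where } Y_j = (X_j - \mu)^2 - \theta \quad (j = 1, \ldots, n).$$

Next, $Y_1 = \theta(Z - 1)$ with $Z$ being $\chi_1^2$, so that $\mathscr{E}Y_1 = 0$, $\sigma^2(Y_1) = 2\theta^2$ and $\mathscr{E}Y_1^3 = 8\theta^3$ (since $\mathscr{E}Z = 1$, $\sigma^2(Z) = 2$ and $\mathscr{E}Z^3 = 15$). Let $F_n$ be the d.f. of the standardized quantity $n^{1/2}(S_n^2 - \theta)/(2\theta^2)^{1/2}$. Then the Berry-Esseen theorem (see Loève [1], p. 288) gives a bound on $|F_n(x) - \Phi(x)|$, involving a numerical constant $c$, which holds uniformly in $\theta$. It follows that

$$\mathscr{L}[n^{1/2}(S_n^2 - \theta)\mid P_\theta] \Rightarrow N(0, 2\theta^2) \quad \text{uniformly in } \theta.$$

This result, together with Lemma 2.2, shows that this last convergence is continuous, as is required in the defining relation (4.8).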
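The claim in Example 5.1 that $P(\chi_n^2 \ge n)$ prevents median-unbiasedness can be checked directly. The sketch below is restricted to even $n = 2k$ (an assumption made only to keep the computation elementary), where the chi-square tail has the closed form $P(\chi_{2k}^2 \ge x) = e^{-x/2}\sum_{j=0}^{k-1}(x/2)^j/j!$; at $x = n$ this is a Poisson-type sum with mean $k$. The function name is hypothetical.

```python
import math


def prob_chi2_at_least_n(n: int) -> float:
    """P(chi^2_n >= n) for even n = 2k, via the finite sum exp(-k) * sum_{j<k} k^j / j!."""
    if n <= 0 or n % 2 != 0:
        raise ValueError("this sketch handles positive even n only")
    k = n // 2
    term = math.exp(-k)  # the j = 0 term
    total = term
    for j in range(1, k):
        term *= k / j  # turns k^(j-1)/(j-1)! * exp(-k) into k^j/j! * exp(-k)
        total += term
    return total


if __name__ == "__main__":
    for n in (2, 4, 10, 50, 200):
        print(n, prob_chi2_at_least_n(n))
```

For each of these $n$ the probability is strictly below $\tfrac12$, so $P_\theta(S_n^2 \ge \theta) \ge \tfrac12$ fails, although the values approach $\tfrac12$ as $n$ grows, in line with the asymptotic normality of $S_n^2$.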
EXAMPLE 5.2 Refer to Example 3.3 in Chapter 2. Here $\sigma^2(\theta) = 1$, as was seen in Example 1.3. Furthermore, if $V_n$ stands for the sample median, it was seen in Example 4.3 that

$$\mathscr{L}[n^{1/2}(V_n - \theta)\mid P_\theta] \Rightarrow N(0, 1).$$

It follows that the upper bound $B(\theta; t_1, t_2)$ is attained again for the sequence $\{V_n\}$. It is then immediate that $\{V_n\}$ is optimal within $\mathscr{C}_P$ and $\mathscr{C}_{P'}$, and also within $\mathscr{C}_S$ (see also the exercise at the end of this chapter).

The references Weiss and Wolfowitz [1], [2] and Hájek [3] are of related interest.

6 Asymptotic efficiency of estimates: the classical approach

The classical set-up for investigating asymptotic efficiency of estimates is the following. First the class $\mathscr{C}^*$ is specified by

$$\mathscr{C}^* = \{\{T_n\};\ \mathscr{L}[n^{1/2}(T_n-\theta)\mid P_\theta] \Rightarrow N(0, \sigma_T^2(\theta)) \text{ for all } \theta \in \Theta\}.$$

Next it is shown that, under suitable regularity conditions, one has for all $\{T_n\} \in \mathscr{C}^*$

$$\sigma_T^2(\theta) \ge I^{-1}(\theta) \quad \text{for all (a.a. } [l]) \ \theta \in \Theta. \tag{6.1}$$
Then any sequence of estimates $\{V_n\}$ in $\mathscr{C}^*$ for which $\sigma_V^2(\theta) = I^{-1}(\theta)$ for all $\theta \in \Theta$ is said to be asymptotically efficient. Under appropriate conditions, a sequence of MLEs enjoys this property. In connection with the classical approach, the reader may wish to consult, e.g. Bahadur [1] and LeCam [2]. Another approach is presented in Rao [2]. In particular, for the Markov case, existence, consistency and asymptotic normality of an MLE are studied in Billingsley [1] and references cited there. A set of conditions for consistency of an MLE, different from the ones used in the reference last cited, is presented in Roussas [3]. These conditions suffice when we restrict ourselves to compact subsets of $\Theta$. Also of relevant interest is a paper by Gänssler [1]. Finally, conditions for asymptotic normality of an MLE weaker than those used in Billingsley are given in Roussas [4].
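Inequality (6.1) can be illustrated by simulation. The sketch below assumes i.i.d. $N(\theta, 1)$ observations with $\theta = 0$, a model chosen here purely for illustration and not one of the examples of the text; for it $I(\theta) = 1$, the sample mean has $\sigma_T^2 = 1 = I^{-1}(\theta)$ and the sample median has $\sigma_T^2 = \pi/2$. The function names, seed, sample size and number of replications are all hypothetical choices.

```python
import math
import random
import statistics


def scaled_variance(estimator, n: int, reps: int, rng: random.Random) -> float:
    """Monte Carlo estimate of Var[n^{1/2} (T_n - theta)] at theta = 0 for N(0, 1) data."""
    values = []
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        values.append(math.sqrt(n) * estimator(sample))
    return statistics.pvariance(values)


def compare(seed: int = 12345, n: int = 101, reps: int = 2000) -> tuple:
    """Return the estimated scaled variances of the sample mean and the sample median."""
    rng = random.Random(seed)
    var_mean = scaled_variance(statistics.fmean, n, reps, rng)
    var_median = scaled_variance(statistics.median, n, reps, rng)
    return var_mean, var_median


if __name__ == "__main__":
    var_mean, var_median = compare()
    print("mean:", var_mean, "median:", var_median, "pi/2:", math.pi / 2)
```

Under these assumptions the first value should be close to $I^{-1}(\theta) = 1$ and the second close to $\pi/2$, so both respect (6.1) and only the sample mean is asymptotically efficient in this model.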
What we intend to do in the present section is to show that the classical result (6.1) follows from theorems obtained in Sections 4 and 5 for two important specifications of the class $\mathscr{C}^*$. In this context, then, classical efficiency is a special case of W-efficiency. Define two classes of sequences of estimates as follows. The first is

$$\mathscr{C}_C = \{\{T_n\};\ P_\theta[n^{1/2}(T_n-\theta) \le x] \to \Phi_T(x; \theta) \text{ for all } x \in R \text{ and } \theta \in \Theta, \text{ where } \Phi_T(\cdot\,; \theta) \text{ is the d.f. of } N(0, \sigma_T^2(\theta))\} \tag{6.2}$$

and the second consists of all $\{T_n\}$ such that

$$P_\theta[n^{1/2}(T_n-\theta) \le x] \to \Phi_T(x; \theta) \text{ continuously in } \theta \text{ for each } x \in R, \text{ where } \Phi_T(\cdot\,; \theta) \text{ is as above}. \tag{6.3}$$

Then the following theorem holds true.

THEOREM 6.1 With the two classes defined by (6.2) and (6.3), respectively, one has (i)