JOURNAL OF OPTIMIZATION THEORY AND APPLICATIONS: Vol. 71, No. 1, OCTOBER 1991
A Basic Searchlight Game
V. J. BASTON¹ AND F. A. BOSTOCK²
Communicated by M. Pachter
Abstract. A searchlight game is a two-person zero-sum dynamic game of the pursuit-evasion type in which at least one of the two players has a searchlight. A searchlight can be flashed a given number of times within a fixed time period and the objective is to catch the opponent in the region illuminated by the flash. Olsder and Papavassilopoulos instituted the study of these games and, in this paper, we supplement their results, obtaining a closed formula for the value and optimal strategies for the players in their basic game.
Key Words. Two-person games, zero-sum games, dynamic games, searchlight games.
1. Introduction
Searchlight games are a class of two-person zero-sum dynamic games of the pursuit-evasion type and have aroused considerable interest in recent years. In a searchlight game, at least one of the two players has a searchlight which can be flashed a certain number of times within a given time period. A flash of the searchlight illuminates a region of known shape, and the objective of a player with a searchlight is to catch his opponent in this region at the time of the flash. However, when a player flashes his searchlight, he automatically discloses his position to his opponent, wherever his opponent is. Thus, if both players have a searchlight and a player does not catch his opponent in the region illuminated by the flash, his position can be more vulnerable, as he has provided his opponent with potentially valuable information; such information is called dynamic since it depends on the actions of the players.
¹Lecturer, Faculty of Mathematical Studies, University of Southampton, Southampton, England.
²Lecturer, Faculty of Mathematical Studies, University of Southampton, Southampton, England.
0022-3239/91/1000-0047$06.50/0 © 1991 Plenum Publishing Corporation
As is pointed out in Ref. 1, Section 5.4, it is conceptually very complicated to define mixed and/or behavior strategies for games which proceed continuously in time. It is therefore not surprising that a number of investigations into searchlight games assume that play takes place in discrete time and that the players are constrained to move in a finite state space; i.e., the players move in a network with a finite number of nodes. In particular, Olsder and Papavassilopoulos (Refs. 2 and 3) have considered the case where the nodes are positioned on the circumference of a circle. Other authors have analyzed games which are essentially of searchlight type. For instance, Baston and Bostock (Ref. 4) investigated the (continuous) problem of a helicopter with $k$ bombs trying to destroy a submarine in a narrow channel, but it could equally well have been formulated as a searchlight game. Similar comments apply to other games considered by Baston and Bostock (Ref. 5), Lee (Refs. 6 and 7), and Bernhard, Colomb, and Papavassilopoulos (Ref. 8).

The main purpose of this paper is to supplement the results of Olsder and Papavassilopoulos. In Ref. 9, they show that the optimal strategies of what they term the basic game can be found via a certain matrix game. By adopting a different approach, we obtain explicit optimal strategies for the players in this basic game and a closed formula for its value. The particularly simple nature of our results enables us to deduce the values of some of the other games they consider.
2. Basic Game

As its name implies, the basic game is a building block for many other searchlight games. It is a zero-sum game played on $n$ equally spaced points on the circumference of a circle by two players called Pursuer and Evader. We will usually think of the points as being labelled $0, 1, \ldots, n-1$ in clockwise order (so that $n-1$ is adjacent to 0) but, for convenience of notation, we will sometimes denote the point $i$ by a number $j$ satisfying $j \equiv i \pmod{n}$. Thus, for example, we often use $-1$ to represent the point $n-1$. Only Pursuer has a searchlight and he can flash it just once; the flash illuminates Pursuer's position together with the two points adjacent to it. Play terminates when the flash is used or at a fixed time $T$ (known to both players), whichever comes first. Play is assumed to take place in discrete time and, at the initial instant of time $t = 0$, Pursuer is at the point 0 while Evader is at the point $d$; this information is again known to both players. If $d = 0$, 1, or $n-1$, the game is trivial since Pursuer can achieve his objective by flashing immediately. Thus we may assume $n \ge 4$ and, taking advantage of symmetry, we may further assume that $2 \le d \le \lfloor n/2 \rfloor$, where $\lfloor x \rfloor$ denotes the greatest integer less than or equal to $x$. At each instant of time, a player can move one position clockwise or one position anticlockwise or loiter. Once play starts, the players receive no information about their opponent's new positions (unless Pursuer flashes), even when they are at the same node. The payoff to Pursuer is taken to be one if Evader is caught by the flash and zero otherwise.

Suppose $T < \lfloor n/2 \rfloor - 1$. Then Evader can at each instant of time move toward the point $\lfloor n/2 \rfloor$ and, if he reaches it, loiter there. Whatever Pursuer does, he certainly cannot narrow the gap between himself and Evader to at most one until Evader reaches $\lfloor n/2 \rfloor$. However, since Pursuer cannot reach $\lfloor n/2 \rfloor - 1$ or $\lfloor n/2 \rfloor + 1$ by the time $T$, it is clear that Evader can avoid capture. Thus the value of the game is zero when $T < \lfloor n/2 \rfloor - 1$. Olsder and Papavassilopoulos (Ref. 9) have already shown that the value of the game for $T \ge \lfloor n/2 \rfloor$ is independent of $T$, so only the cases $T = \lfloor n/2 \rfloor - 1$ and $T = \lfloor n/2 \rfloor$ remain to be considered. However, we do not need to use their result since it will follow from our analysis. The bulk of the paper deals with the game where $T \ge \lfloor n/2 \rfloor$, and we shall denote this game by $F(n, d)$, our previous comment justifying the omission of $T$ from the notation. Our results on $F(n, d)$ enable us to deduce the values of many of the games when $T = \lfloor n/2 \rfloor - 1$; the outstanding cases are then dealt with separately.
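The zero-value argument for $T < \lfloor n/2 \rfloor - 1$ can be checked mechanically on small instances. The following Python sketch is an illustration only and not part of the original analysis: the function names are hypothetical, and it is assumed that a clockwise move adds 1 to a point label modulo $n$.

```python
def illuminated(p, n):
    """Points lit by a flash at p: p itself and its two neighbours on the n-cycle."""
    return {(p - 1) % n, p % n, (p + 1) % n}


def pursuer_positions(t, n):
    """Points Pursuer can occupy at time t when he starts at point 0."""
    return {k % n for k in range(-t, t + 1)}


def evader_position(d, t, n):
    """Evader heads clockwise from d toward floor(n/2) and loiters once there."""
    return min(d + t, n // 2)


def evader_is_safe(n, d, T):
    """True if no flash at any time t <= T can catch the Evader described above."""
    for t in range(T + 1):
        e = evader_position(d, t, n)
        if any(e in illuminated(p, n) for p in pursuer_positions(t, n)):
            return False
    return True
```

For example, `evader_is_safe(9, 3, 2)` holds because $T = 2 < \lfloor 9/2 \rfloor - 1$, whereas with $T = 3$ a flash from point 3 at time 3 reaches the point 4, so this particular Evader strategy no longer guarantees escape.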
3. Pursuer Optimal Strategies

We now obtain Pursuer strategies which in Section 4 will be shown to be optimal in $F(n, d)$.
Lemma 3.1. Pursuer can obtain an expected payoff of at least $3/n$ in $F(n, d)$.

Proof. Given a point $P$, Pursuer can clearly get to $P$ in a time of at most $\lfloor n/2 \rfloor$. Thus Pursuer can choose a point $P$ at random from $0, 1, \ldots, n-1$ and flash the searchlight at $P$ at time $\lfloor n/2 \rfloor$. Wherever Evader is at time $\lfloor n/2 \rfloor$, he is clearly caught with probability $3/n$, and the proof is complete. □

The strategy for Pursuer in the proof of Lemma 3.1 is extremely naive and takes no account of the fact that he knows Evader's initial position. Thus we might expect that Pursuer can do much better than $3/n$ in general. However, we shall see that Pursuer can make only limited use of the information and, in a number of instances, cannot use it to increase his expectation at all.
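The counting behind Lemma 3.1 can be illustrated with a short Python sketch. It is a hedged illustration with hypothetical function names, using exact rational arithmetic: a flash at a uniformly chosen point lights any fixed Evader position with probability $3/n$, and every point is within $\lfloor n/2 \rfloor$ steps of Pursuer's start.

```python
from fractions import Fraction


def capture_probability(flash_points, evader_point, n):
    """Probability that a uniformly chosen flash point lights evader_point."""
    points = list(flash_points)
    hits = sum(1 for p in points if (evader_point - p) % n in (0, 1, n - 1))
    return Fraction(hits, len(points))


def uniform_flash_guarantee(n):
    """Worst case, over Evader positions, of flashing at a uniform random point."""
    return min(capture_probability(range(n), e, n) for e in range(n))


def steps_needed(p, n):
    """Fewest moves for Pursuer to travel from point 0 to point p on the n-cycle."""
    return min(p % n, (-p) % n)
```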
Suppose that $n$ is odd and not divisible by 3, and let $s$ be an integer satisfying $0 \le s \le \lfloor n/3 \rfloor - 1$. Denote by $\sigma(s)$ the Pursuer strategy whereby he chooses a point at random from
$$\{3u + (n-3)/2 : u = -\lfloor n/3 \rfloor + s + 1, \ldots, 0, 1, \ldots, s\}$$
and flashes his searchlight there at time $(n-3)/2$; since $3 + (n-3)/2 = n - (n-3)/2$, he is clearly able to get to his chosen point by time $(n-3)/2$.

When $n = 6m + 1$, the only point that has no chance of being caught by $\sigma(s)$ is $3s + 3m + 1$. However, if Evader starts at $3s$ or $3s + 1$, he cannot get to the point $3s + 3m + 1$ at time $(n-3)/2$. Hence, if Evader starts at either of these two points, he is caught with probability $1/\lfloor n/3 \rfloor$ by $\sigma(s)$.

When $n = 6m + 5$, the only points that have no chance of being caught by $\sigma(s)$ are $3s + 3m + 3$ and $3s + 3m + 4$. If Evader starts at $3s + 1$, he cannot get to either of these points at time $(n-3)/2$, with the result that he is caught with probability $1/\lfloor n/3 \rfloor$ by $\sigma(s)$.

When $n = 6m + 4$, suppose that Evader starts at $3s + a$, where $a = 0$ or 1. It is easy to verify that Pursuer can choose a point at random from
$$\{3u + 3m + a : u = -2m + s, \ldots, 0, 1, \ldots, s\}$$
and flash his searchlight there at time $3m + 1$. The only point that has no chance of being caught by this strategy is $3s + 3m + a + 2$; however, Evader cannot get to this point at time $3m + 1$, so Pursuer can ensure himself of an expectation of at least $1/\lfloor n/3 \rfloor$.

Since our Pursuer strategies tell Pursuer to flash at time $\lfloor n/2 \rfloor - 1$, we have thus proved the following lemma.

Lemma 3.2. Provided $T \ge \lfloor n/2 \rfloor - 1$, Pursuer can obtain an expectation of at least $1/\lfloor n/3 \rfloor$ when:
(i) $n = 6m + 1$ and $d = 3s$, $s = 1, \ldots, m$, or $d = 3s + 1$, $s = 1, \ldots, m-1$;
(ii) $n = 6m + 5$ and $d = 3s + 1$, $s = 1, \ldots, m$;
(iii) $n = 6m + 4$ and $d = 3s$ or $3s + 1$, $s = 1, \ldots, m$.
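For odd $n$ not divisible by 3, the strategy $\sigma(s)$ can be examined numerically. The Python sketch below is illustrative only: the function names are hypothetical and clockwise motion is assumed to add 1 modulo $n$. It lists the flash points of $\sigma(s)$ and computes the smallest capture probability over all points that Evader can occupy at time $(n-3)/2$ when he starts at $d$.

```python
from fractions import Fraction


def sigma_flash_points(n, s):
    """Flash points of sigma(s); n is assumed odd and not divisible by 3."""
    t = (n - 3) // 2
    return [(3 * u + t) % n for u in range(-(n // 3) + s + 1, s + 1)]


def evader_reachable(d, t, n):
    """Points Evader can occupy at time t when he starts at d."""
    return {(d + k) % n for k in range(-t, t + 1)}


def sigma_guarantee(n, s, d):
    """Worst-case capture probability of sigma(s) against a start at d."""
    t = (n - 3) // 2
    pts = sigma_flash_points(n, s)
    worst = min(
        sum(1 for p in pts if (e - p) % n in (0, 1, n - 1))
        for e in evader_reachable(d, t, n)
    )
    return Fraction(worst, len(pts))
```

With $n = 13$ and $s = 1$, the starts $d = 3$ and $d = 4$ give $1/4 = 1/\lfloor n/3 \rfloor$, while the start $d = 5$ gives 0 because Evader can then reach the uncovered point 10; this matches the restriction on $d$ in Lemma 3.2.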
The next result concludes the analysis needed for our discussion of Pursuer strategies in $F(n, d)$.

Lemma 3.3. Provided $T \ge \lfloor n/2 \rfloor - 1$, Pursuer can obtain an expectation of at least $2/(4m + 1)$ when $n = 6m + 2$ and $d = 3s + 1$, $s = 1, \ldots, m$.
Proof. It is easy to check that Pursuer can choose a point at random from
$$\bigcup_{u=1}^{2m-s} \{3(s-m+u)-1,\ 3(s-m+u)\} \;\cup\; \bigcup_{v=0}^{s-1} \{3(s+m-v)-1,\ 3(s+m-v)\} \;\cup\; \{3s+3m+2\},$$
and flash his searchlight there at time $3m$. Notice that every point other than $3s + 3m + 2$ is caught with probability $2/(4m + 1)$. Since Evader cannot get to $3s + 3m + 2$ at time $3m$ when he starts at $3s + 1$, the result now follows. □
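The coverage claim in the proof of Lemma 3.3 can be illustrated as follows. This Python sketch is a check under stated assumptions rather than the authors' construction verbatim: the function names are hypothetical, and the first union is assumed to run over $u = 1, \ldots, 2m - s$, which is the index range that yields exactly $4m + 1$ distinct flash points.

```python
from fractions import Fraction


def lemma_3_3_flash_points(m, s):
    """Flash points for n = 6m + 2, d = 3s + 1 (index ranges are an assumption)."""
    n = 6 * m + 2
    pts = []
    for u in range(1, 2 * m - s + 1):      # assumed range u = 1, ..., 2m - s
        pts += [3 * (s - m + u) - 1, 3 * (s - m + u)]
    for v in range(s):                     # v = 0, ..., s - 1
        pts += [3 * (s + m - v) - 1, 3 * (s + m - v)]
    pts.append(3 * s + 3 * m + 2)
    return [p % n for p in pts]


def coverage_counts(pts, n):
    """For each point of the cycle, the number of flash points that light it."""
    counts = [0] * n
    for p in pts:
        for e in (-1, 0, 1):
            counts[(p + e) % n] += 1
    return counts


def lemma_3_3_guarantee(m, s):
    """Worst-case capture probability over points Evader can reach by time 3m."""
    n, d, t = 6 * m + 2, 3 * s + 1, 3 * m
    pts = lemma_3_3_flash_points(m, s)
    counts = coverage_counts(pts, n)
    reachable = {(d + k) % n for k in range(-t, t + 1)}
    return Fraction(min(counts[e] for e in reachable), len(pts))
```

For $m = s = 1$ (so $n = 8$, $d = 4$) the flash points are $\{0, 2, 3, 5, 6\}$ and the guarantee is $2/5$.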
4. Evader Optimal Strategies

The optimal strategies for Evader are not so straightforward as those for Pursuer, and we need to introduce some notation before we can describe them. After a time $\lfloor n/2 \rfloor - 2$, Evader can get to any of the points $d - \lfloor n/2 \rfloor + 2 + w$, for $w = 0, 1, \ldots, 2\lfloor n/2 \rfloor - 4$, and the Evader strategies that we give involve choosing a point from a particular subset of these points. Thus, throughout this and the following section, we shall employ the following notation. Let
$$A = \lfloor (2\lfloor n/2 \rfloor - d)/3 \rfloor, \qquad B = \lfloor d/3 \rfloor,$$
$$S_1 = \{d - \lfloor n/2 \rfloor + 2 + 3u : u = 0, 1, \ldots, A-1\},$$
$$S_2 = \{d + \lfloor n/2 \rfloor - 2 - 3v : v = 0, 1, \ldots, B-1\},$$
and
$$S = S_1 \cup S_2 \cup \{\lfloor n/2 \rfloor\}. \tag{1}$$
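Note that $S_1$ and $S_2$ are disjoint. For $0 \le \alpha \le 1/(A+B)$, let $P_\alpha$ denote the probability distribution on $S$ given by
$$P_\alpha(s) = \alpha, \quad \text{if } s \in S_1 \cup S_2, \tag{2a}$$
$$P_\alpha(\lfloor n/2 \rfloor) = 1 - \alpha(A + B). \tag{2b}$$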
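The quantities $A$, $B$, $S_1$, $S_2$ and the distribution of (2a) and (2b) are easy to tabulate. The Python sketch below is illustrative, with hypothetical function names; points are kept as plain integers, so a negative label stands for the corresponding point modulo $n$, and the parameter `alpha` is assumed to satisfy $0 \le \alpha(A + B) \le 1$.

```python
from fractions import Fraction


def evader_sets(n, d):
    """Return (A, B, S1, S2, h) for the basic game, where h = floor(n/2)."""
    h = n // 2
    A = (2 * h - d) // 3
    B = d // 3
    S1 = [d - h + 2 + 3 * u for u in range(A)]
    S2 = [d + h - 2 - 3 * v for v in range(B)]
    return A, B, S1, S2, h


def p_alpha(n, d, alpha):
    """Distribution on S: mass alpha on each point of S1 and S2, the rest on h."""
    A, B, S1, S2, h = evader_sets(n, d)
    alpha = Fraction(alpha)
    if alpha < 0 or alpha * (A + B) > 1:
        raise ValueError("alpha must satisfy 0 <= alpha * (A + B) <= 1")
    dist = {x: alpha for x in S1 + S2}
    dist[h] = 1 - alpha * (A + B)
    return dist
```

For $n = 13$ and $d = 5$ this gives $A = 2$, $B = 1$, $S_1 = \{1, 4\}$, $S_2 = \{9\}$, and with $\alpha = 3/13$ the point $\lfloor n/2 \rfloor = 6$ receives probability $4/13$.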
Evader can get to a point $d - \lfloor n/2 \rfloor + 2 + 3u$ of $S_1$ at time $\lfloor n/2 \rfloor - 2$ by adopting the following procedure, which will be denoted by $X(d - \lfloor n/2 \rfloor + 2 + 3u)$. If $u$ is odd, move clockwise for $\lfloor 3u/2 \rfloor$ units, loiter for one unit, and then move anticlockwise for $\lfloor n/2 \rfloor - 3 - \lfloor 3u/2 \rfloor$ units. If $u$ is even, move clockwise for $3u/2$ units and then move anticlockwise for $\lfloor n/2 \rfloor - 2 - 3u/2$ units.
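The procedure $X$ can be written out as an explicit trajectory. The following Python sketch is illustrative only: the function name is hypothetical, clockwise is assumed to be the direction of increasing labels, and positions are left unreduced modulo $n$. It returns Evader's position at each time $t = 0, 1, \ldots, \lfloor n/2 \rfloor - 2$.

```python
def x_procedure(n, d, u):
    """Positions at times 0..floor(n/2)-2 under X(d - floor(n/2) + 2 + 3u)."""
    h = n // 2
    clockwise = (3 * u) // 2            # floor(3u/2) clockwise steps
    moves = [1] * clockwise
    if u % 2 == 1:
        moves.append(0)                 # odd u: loiter for one unit
    # The remaining time, up to floor(n/2) - 2, is spent moving anticlockwise.
    moves += [-1] * (h - 2 - len(moves))
    path = [d]
    for step in moves:
        path.append(path[-1] + step)
    return path
```

For instance, `x_procedure(13, 5, 1)` moves one step clockwise, loiters once, then moves anticlockwise twice, ending at $5 - 6 + 2 + 3 = 4$.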
Evader can get to a point $d + \lfloor n/2 \rfloor - 2 - 3v$ of $S_2$ at time $\lfloor n/2 \rfloor - 2$ by adopting the following procedure, which will be denoted by $Y(d + \lfloor n/2 \rfloor - 2 - 3v)$. For the first $\lfloor n/2 \rfloor - d$ units, move clockwise, loiter at $\lfloor n/2 \rfloor$ for the following $3v$ units, and then move clockwise for the next $d - 2 - 3v$ units. We will also need the procedure $L$, whereby Evader moves clockwise for $\lfloor n/2 \rfloor - d$ units and then loiters at $\lfloor n/2 \rfloor$ for $d - 2$ units. The following lemma will be used extensively.

Lemma 4.1. If Pursuer flashes the searchlight at a time less than $\lfloor n/2 \rfloor - 1$, he is unsuccessful against $L$ and is successful against at most one member of $X \cup Y$, where $X = \{X(s) : s \in S_1\}$ and $Y = \{Y(s) : s \in S_2\}$.
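Proof. Let Pursuer flash the searchlight at time $T$, where $T < \lfloor n/2 \rfloor - 1$. Then he flashes at one of the points $-T + i$, for $i = 0, 1, \ldots, 2T$. Thus Evader can only be caught if, at time $T$, he is at one of the points of $C_1 \cup C_2$, where
$$C_1 = \{i + 1 : -1 \le i \le T\}, \qquad C_2 = \{n - i - 1 : 0 \le i \le T\}.$$

(i) If Evader adopts $L$, then his position is given by
$d + t$, for $0 \le t \le \lfloor n/2 \rfloor - d$,
$\lfloor n/2 \rfloor$, for $\lfloor n/2 \rfloor - d \le t \le \lfloor n/2 \rfloor - 2$.
Now, for $0 \le t \le \lfloor n/2 \rfloor - d$,
$$t + 1 < t + d \le \lfloor n/2 \rfloor \le n - \lfloor n/2 \rfloor \le n - t - d < n - t - 1;$$
while, for $\lfloor n/2 \rfloor - d \le t \le \lfloor n/2 \rfloor - 2$,
$$t + 1 < \lfloor n/2 \rfloor \le n - \lfloor n/2 \rfloor \le n - t - 2 < n - t - 1.$$
Thus, under $L$, Evader is not at a point of $C_1 \cup C_2$ at time $T$, and it follows that Pursuer is unsuccessful against $L$.

(ii) Suppose that Pursuer catches Evader when he adopts $X(d - \lfloor n/2 \rfloor + 2 + 3u)$. Under $X(d - \lfloor n/2 \rfloor + 2 + 3u)$, the position of Evader is given by:
(a) $d + t$, for $0 \le t \le \lfloor 3u/2 \rfloor$;
(b) $d + \lfloor 3u/2 \rfloor$, for $t = \lfloor (3u+1)/2 \rfloor$;
(c) $d + \lfloor 3u/2 \rfloor - t + \lfloor (3u+1)/2 \rfloor = d + 3u - t$, for $\lfloor (3u+1)/2 \rfloor \le t \le \lfloor n/2 \rfloor - 2$.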
Now, for $0 \le t \le \lfloor (3u+1)/2 \rfloor$,
$$1 + t < 2 + t \le d + t \le d + (3u+1)/2 \le d + 3u + 1 - t \le d + 2\lfloor n/2 \rfloor - d - 2 - t < n - 1 - t;$$
i.e.,
$$1 + t < d + t < n - 1 - t.$$
Thus Pursuer must catch Evader later than $\lfloor (3u+1)/2 \rfloor$. Hence Evader is caught at the point $d + 3u - T$, and the searchlight must have been flashed at one of the points $d + 3u - T + \epsilon$ for $\epsilon = -1, 0, 1$. By the same argument, Pursuer would have to flash the searchlight at one of the points $d + 3u_1 - T + \epsilon$ for $\epsilon = -1, 0, 1$ to catch Evader when he adopts $X(d - \lfloor n/2 \rfloor + 2 + 3u_1)$ with $u_1 \ne u$. Hence Pursuer can catch at most one member of $X$.

(iii) Suppose that Pursuer catches Evader when he adopts $Y(d + \lfloor n/2 \rfloor - 2 - 3v)$. Under $Y(d + \lfloor n/2 \rfloor - 2 - 3v)$, the position of Evader is given by:
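($\alpha$) $d + t$, for $0 \le t \le \lfloor n/2 \rfloor - d$;
($\beta$) $\lfloor n/2 \rfloor$, for $\lfloor n/2 \rfloor - d \le t \le \lfloor n/2 \rfloor - d + 3v$;
($\gamma$) $\lfloor n/2 \rfloor + t - \lfloor n/2 \rfloor + d - 3v = t + d - 3v$, for $\lfloor n/2 \rfloor - d + 3v \le t \le \lfloor n/2 \rfloor - 2$.

Now, for $0 \le t \le \lfloor n/2 \rfloor - d$,
$$t + 1 \;\cdots\; \le n - t - d + 3B - 3 \le n - t - 3.$$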
(iv) Notice that, for $\epsilon = -1, 0, 1$ and $v = 0, 1, \ldots, B - 1$, we have
$$n > T + d - 3v + \epsilon \ge T + d - 3B + 3 - 1 \ge T + 2.$$
Since Pursuer can only flash at the points of $C_1 \cup C_2$, we see that he has to flash at one of the points of $\{i : n - T - 1 \le i \le T + d + 1\} \subseteq C_2$ to catch a member of $Y$. Further, for $\epsilon = -1, 0, 1$ and $u = 0, 1, \ldots, A - 1$, the points $d + 3u - T - \epsilon$ that lie in $C_2$ are contained in $\{i : n + d - T - 1 \le i \le n\}$. Hence a flash that catches a member of $X$ and a member of $Y$ has to be a point of
$$\{i : n + d - T - 1 \le i \le n\} \cap \{i : n - T - 1 \le i \le T + d + 1\}.$$
However, this set is empty, since $T + d + 1 < n + d - T - 1$ when $T \le \lfloor n/2 \rfloor - 2$. Thus Pursuer cannot catch both a member of $X$ and a member of $Y$, and the proof is complete. □

We now prove a number of lemmas which tell us the value of the game $F(n, d)$ in all cases; the corresponding Evader optimal strategies will be given in the course of the proofs.

Lemma 4.2. If $\lfloor n/3 \rfloor$ is even, Evader can restrict Pursuer to an expected payoff of at most $1/\lfloor n/3 \rfloor$.

Proof. Suppose that $\lfloor n/3 \rfloor$ is even. Then we can assume $n = 6m + b$, where $0 \le b \le 2$. Taking $d = 3s + a$, where $0 \le$ [...]
$$A = 2m - s, \qquad B = s,$$
so Evader can choose a point $x$ of $S$ by using the probability distribution $P_\alpha$ with $\alpha = 1/(2m)$. From (2), it is clear that $x$ is in $S_1 \cup S_2$. If $x$ is in $S_1$, Evader adopts the procedure $X(x)$, followed by a move anticlockwise at time $\lfloor n/2 \rfloor - 1$, after which time he loiters at the point $x - 1$. If $x$ is in $S_2$, Evader adopts the procedure $Y(x)$ and then loiters at $x$. By Lemma 4.1, Pursuer has a probability of at most $1/(2m) = 1/\lfloor n/3 \rfloor$ if he flashes at a time less than $\lfloor n/2 \rfloor - 1$. However, at a time greater than or equal to $\lfloor n/2 \rfloor - 1$, Evader is at one of the $2m$ points of
$$W = \{d - \lfloor n/2 \rfloor + 1 + 3u : u = 0, 1, \ldots, 2m - s - 1\} \cup \{d + \lfloor n/2 \rfloor - 2 - 3v : v = 0, 1, \ldots, s - 1\}.$$
Since
$$d - \lfloor n/2 \rfloor + 1 + 3(2m - s - 1) = 3m + a - \lfloor b/2 \rfloor - 2 \le (d + \lfloor n/2 \rfloor - 2 - 3(s - 1)) - 3,$$
$$d + \lfloor n/2 \rfloor - 2 = d + 3m + \lfloor b/2 \rfloor - 2 = 6m + b + d - 3m - b + \lfloor b/2 \rfloor - 2 \le n + (d - \lfloor n/2 \rfloor + 1) - 3,$$
a flash can cover at most one point of $W$. Thus Pursuer has an expectation of at most $1/\lfloor n/3 \rfloor$ if he flashes at time $\lfloor n/2 \rfloor - 1$ or later, and the result follows. □

Corollary 4.1. The game $F(n, d)$ has value $1/\lfloor n/3 \rfloor$ when:
(i) $n = 6m$ and $2 \le d \le \lfloor n/2 \rfloor$;
(ii) $n = 6m + 1$ and $d = 3s$, $s = 1, \ldots, m$, or $d = 3s + 1$, $s = 1, \ldots, m - 1$.

Proof. It is immediate from Lemmas 3.1, 3.2, and 4.2. □
We shall need the following definition.

Definition 4.1. Evader is said to disperse if he chooses to move clockwise, anticlockwise, or loiter with equal probability.

Lemma 4.3. The value of $F(n, d)$ is $3/n$ when $n = 6m + 1$ and $d = 3s + 2$, $s = 0, 1, \ldots, m - 1$.

Proof. Let $n = 6m + 1$ and $d = 3s + 2$. Then,
$$A = 2m - s - 1, \qquad B = s.$$
First of all, Evader chooses a point $x$ of $S$ by using the probability distribution $P_\alpha$, where $\alpha = 3/(6m + 1)$. If $x$ is in $S_1$, he employs $X(x)$, moves anticlockwise at time $\lfloor n/2 \rfloor - 1$, disperses at time $\lfloor n/2 \rfloor$, and loiters thereafter. If $x$ is in $S_2$, he employs $Y(x)$, moves clockwise at time $\lfloor n/2 \rfloor - 1$, disperses at time $\lfloor n/2 \rfloor$, and loiters thereafter. If $x = \lfloor n/2 \rfloor$, he employs $L$; then, with probability $3/4$, he moves clockwise at time $\lfloor n/2 \rfloor - 1$, disperses at $\lfloor n/2 \rfloor$, and loiters thereafter; and, with probability $1/4$, he loiters at time $\lfloor n/2 \rfloor - 1$, moves anticlockwise at $\lfloor n/2 \rfloor$, and loiters thereafter.

By Lemma 4.1, Pursuer can have an expectation of at most $3/(6m + 1)$ if he flashes at a time less than $\lfloor n/2 \rfloor - 1$. At time $\lfloor n/2 \rfloor - 1$, Evader is at one of the points of
$$W = \{3s + 3 - 3m + 3u : u = 0, 1, \ldots, 2m - s - 2\} \cup \{3s + 3m + 1 - 3v : v = 0, 1, \ldots, s - 1\} \cup \{3m, 3m + 1\}.$$
Now,
$$3m + 3s + 1 = n - 3m + 3s, \qquad 3s + 3 - 3m + 3(2m - s - 2) = 3m - 3, \qquad 3s + 3m + 1 - 3(s - 1) = 3m + 4,$$
so Pursuer cannot catch more than one of the points of $W$ with a flash at time $\lfloor n/2 \rfloor - 1$, since he cannot reach either $3m$ or $3m + 1$ at that time. Hence a flash at time $\lfloor n/2 \rfloor - 1$ can give Pursuer an expectation of at most $3/(6m + 1)$. However, at time $\lfloor n/2 \rfloor$ and subsequently, Evader has spread himself, in the sense that his strategy means that, for any point $x$, the probability that Evader is at $x$ is $1/(6m + 1)$. Hence Pursuer has an expectation of $3/(6m + 1)$ if he flashes at a time greater than or equal to $\lfloor n/2 \rfloor$. The result now follows by Lemma 3.1. □

Lemma 4.4. The value of $F(n, d)$ is $3/n$ when $n = 6m + 2$ and $d = 3s$ or $3s + 2$.

Proof.
Let $n = 6m + 2$ and $d = 3s + a$, where $a = 0$ or 2; then,
$$A = 2m - s, \qquad B = s.$$
First of all, Evader chooses a point $x$ of $S$ by using the probability distribution $P_\alpha$, where $\alpha = 3/(6m + 2)$. To continue with the Evader strategy, we now divide the analysis into the two cases (i) $a = 0$ and (ii) $a = 2$.

(i) Let $a = 0$. If $x$ is in $S_1$, Evader adopts the procedure $X(x)$, moves anticlockwise at time $3m$, disperses at time $3m + 1$, and loiters thereafter. If $x$ is in $S_2$, he employs $Y(x)$, loiters at time $3m$, disperses at time $3m + 1$, and loiters thereafter. If $x = 3m + 1$, he employs $L$, moves anticlockwise at time $3m$; then, with equal probability, he moves anticlockwise or loiters at time $3m + 1$; after time $3m + 1$, he loiters.

(ii) Let $a = 2$. If $x$ is in $S_1$, Evader employs $X(x)$, loiters at time $3m$, disperses at time $3m + 1$, and loiters thereafter. If $x$ is in $S_2$, Evader employs $Y(x)$, moves clockwise at time $3m$, disperses at time $3m + 1$, and loiters thereafter. If $x = 3m + 1$, he employs $L$, moves clockwise at time $3m$; then, with equal probability, he moves clockwise or loiters at time $3m + 1$; after time $3m + 1$, he loiters.

In both cases, Lemma 4.1 tells us that Pursuer has an expectation of at most $3/(6m + 2)$ if he flashes at a time less than $3m$. At time $3m$, Evader is at one of the points of
$$W_0 = \{3s - 3m + 3u : u = 0, 1, \ldots, A - 1\} \cup \{3s + 3m - 1 - 3v : v = 0, 1, \ldots, B - 1\} \cup \{3m\},$$
when $a = 0$, and at one of the points of
$$W_2 = \{3s - 3m + 3 + 3u : u = 0, 1, \ldots, A - 1\} \cup \{3s + 3m + 2 - 3v : v = 0, 1, \ldots, B - 1\} \cup \{3m + 2\},$$
when $a = 2$. It is easy to check that the only point at which a flash can catch more than one member of $W_a$ is $3m + 1$. However, Pursuer cannot reach $3m + 1$ at time $3m$; so, if he flashes, he has an expectation of at most $3/(6m + 2)$. Finally, when $t \ge 3m + 1$, Evader under both (i) and (ii) has an equal probability of being at any of the points, so Pursuer has an expectation of $3/n$ if he flashes after time $3m$. The result now follows by Lemma 3.1. □

Lemma 4.5. The value of $F(n, d)$ is $2/(4m + 1)$ when $n = 6m + 2$ and $d = 3s + 1$, $s = 1, \ldots, m$.

Proof.
Let $n = 6m + 2$ and $d = 3s + 1$; then,
$$A = 2m - s, \qquad B = s.$$
Consider the following strategy for Evader. He chooses a point $x$ of $S$ by means of the probability distribution $P_\alpha$, where $\alpha = 2/(4m + 1)$. If $x$ is in $S_1$, he adopts the procedure $X(x)$; then, with equal probability, he loiters or moves anticlockwise at time $3m$; after time $3m$, he loiters. If $x$ is in $S_2$, he adopts the procedure $Y(x)$; then, with equal probability, he loiters or moves clockwise at time $3m$; after time $3m$, he loiters. If $x = 3m + 1$, he adopts $L$ and subsequently loiters. By Lemma 4.1, Pursuer has an expectation of at most $2/(4m + 1)$ if he flashes before $3m$. However, at time $3m$ and subsequently, Evader is at one of the points of
$$\{3t + 1 : t = 0, 1, \ldots, 2m\} \cup \{3t + 2 : t = 0, 1, \ldots, m - 1\} \cup \{3t : t = m + 2, \ldots, 2m\},$$
and each point is equally likely. Since Pursuer can cover at most two of these points, he has an expectation of at most $2/(4m + 1)$, and the result follows from Lemma 3.3. □

Lemma 4.6. The value of $F(n, d)$ is $3/n$ when $n = 6m + 3$ and $2 \le d \le \lfloor n/2 \rfloor$.

Proof.
Let $n = 6m + 3$ and $d = 3s + a$, where $0 \le a \le 2$; then,
$$A = 2m - s, \qquad B = s.$$
Consider the following Evader strategy. He chooses a point $x$ by means of the probability distribution $P_\alpha$, where $\alpha = 1/(2m + 1)$. If $x$ is in $S_1$, he adopts the procedure $X(x)$, moves anticlockwise at time $3m$, and loiters thereafter. If $x$ is in $S_2$, Evader adopts the procedure $Y(x)$, moves clockwise at time $3m$, and loiters thereafter. If $x = 3m + 1$, he adopts the procedure $L$, moves to $3m + a$ at time $3m$, and loiters thereafter. By Lemma 4.1, Pursuer has an expectation of at most $1/(2m + 1)$ if he flashes before time $3m$. However, at time $3m$ and subsequently, Evader is at one of the points $\{3t + a : t = 0, 1, \ldots, 2m\}$, and each point is equally likely. Since Pursuer can cover at most one of these points with his flash, his expectation is at most $1/(2m + 1)$, and the result now follows by Lemma 3.1. □

Lemma 4.7. The value of $F(n, d)$ is $1/\lfloor n/3 \rfloor$ when:
(i) $n = 6m + 4$ and $d = 3s$ or $3s + 1$, $s = 1, \ldots, m$;
(ii) $n = 6m + 5$ and $d = 3s + 1$, $s = 1, \ldots, m$.

Proof.
Note that, in all the cases,
$$A = 2m - s + 1, \qquad B = s.$$
Consider the following strategy for Evader. He chooses a number $x$ in $S$ by means of the probability distribution $P_\alpha$, where $\alpha = 1/(2m + 1)$. He adopts the procedure $X(x)$ or $Y(x)$ according to $x \in S_1$ or $x \in S_2$, respectively, and then loiters thereafter. By Lemma 4.1, Pursuer has an expectation of at most $1/(2m + 1)$ if he flashes before time $3m + 1$. However, at time $3m + 1$ and subsequently, Evader is at one of the points
$$\{3(s - m + u) + a : u = 0, 1, \ldots, 2m - s\} \cup \{3(s + m - v) + a : v = 0, 1, \ldots, s - 1\},$$
where $a = d - 3s$, and each point is equally likely. It is easy to verify that a flash can cover at most one of these points, so the result now follows by Lemma 3.2. □

Lemma 4.8. The value of $F(n, d)$ is $3/n$ when $n = 6m + 4$ and $d = 3s + 2$, $s = 0, 1, \ldots, m$.

Proof.
Note that
$$A = 2m - s, \qquad B = s.$$
Consider the following strategy for Evader. He chooses a number $x$ in $S$ by means of the probability distribution $P_\alpha$, where $\alpha = 3/n$, and then adopts the procedure $X(x)$, $Y(x)$, or $L$ according to $x \in S_1$, $x \in S_2$, or $x = \lfloor n/2 \rfloor$, respectively.

If $x \in S_1$, he chooses with equal probability one of the following three courses of action:
(a) loiter at time $3m + 1$ and at all subsequent times;
(b) move anticlockwise at time $3m + 1$ and loiter thereafter;
(c) move anticlockwise at times $3m + 1$ and $3m + 2$ and loiter thereafter.

If $x \in S_2$, he chooses with equal probability one of the following three courses of action:
(d) move clockwise at time $3m + 1$ and loiter thereafter;
(e) loiter at time $3m + 1$, move anticlockwise at time $3m + 2$, and loiter thereafter;
(f) loiter at time $3m + 1$ and at all subsequent times.

If $x = 3m + 2$, he chooses with equal probability one of the following four courses of action:
(g) move clockwise at time $3m + 1$ and loiter thereafter;
(h) move anticlockwise at times $3m + 1$ and $3m + 2$ and loiter thereafter;
(i) loiter at time $3m + 1$, move anticlockwise at time $3m + 2$, and loiter thereafter;
(j) loiter at time $3m + 1$ and at all subsequent times.

By Lemma 4.1, Pursuer has an expectation of at most $3/n$ if he flashes before $3m + 1$. At time $3m + 1$, each of the points
$$\{3s + 1 - 3m + 3u : u = 0, 1, \ldots, 2m - s - 1\} \cup \{3m + 2\} \cup \{3s + 2 + 3m - 3v : v = 0, 1, \ldots, s - 1\}$$
has a probability of $2/n$ of being Evader's position, while each of the points
$$\{3s + 2 - 3m + 3u : u = 0, 1, \ldots, 2m - s - 1\} \cup \{3m + 1, 3m + 3\} \cup \{3s + 3 + 3m - 3v : v = 0, 1, \ldots, s - 1\}$$
has a probability of $1/n$ of being Evader's position. Since Pursuer cannot be at the point $3m + 2$ at this time, it is easy to check that Pursuer has an expectation of at most $3/n$ if he flashes at time $3m + 1$. However, at time $3m + 2$ and subsequently, Evader has an equal chance of being at each of the points $0, 1, \ldots, n - 1$, so Pursuer can again expect at most $3/n$. The lemma now follows by Lemma 3.1. □

Lemma 4.9. The value of $F(n, d)$ is $3/n$ when $n = 6m + 5$ and $d = 3s$, $s = 1, \ldots, m$.

Proof.
Note that
$$A = 2m + 1 - s, \qquad B = s,$$
and consider the following strategy for Evader. He chooses a number $x$ in $S$ by means of the probability distribution $P_\alpha$, where $\alpha = 3/n$, and then adopts the procedure $X(x)$, $Y(x)$, or $L$ according to $x \in S_1$, $x \in S_2$, or $x = \lfloor n/2 \rfloor$, respectively. If $x \in S_1$, he moves anticlockwise at time $3m + 1$, disperses at time $3m + 2$, and loiters subsequently. If $x \in S_2$, he moves clockwise at time $3m + 1$, disperses at time $3m + 2$, and loiters subsequently. If $x = \lfloor n/2 \rfloor$, he loiters at time $3m + 1$, with equal probability moves anticlockwise or loiters at time $3m + 2$, and then loiters subsequently.

If Pursuer flashes at a time before $3m + 1$, he has an expectation of at most $3/n$ by Lemma 4.1. At time $3m + 1$, each of the points
$$\{3s - 3m + 3u - 1 : u = 0, 1, \ldots, 2m - s\} \cup \{3s + 3m - 3v + 1 : v = 0, 1, \ldots, s - 1\}$$
has a probability of $3/n$ of being Evader's position, while $3m + 2$ has probability $2/n$ of being Evader's position. Since Pursuer cannot get to $3m + 3$ at time $3m + 1$, he can get at most one of these points with a flash at time $3m + 1$. Thus Pursuer has an expectation of at most $3/n$ if he flashes at time $3m + 1$. However, at time $3m + 2$ and subsequently, Evader has an equal chance of being at each of the points $0, 1, \ldots, n - 1$, so again Pursuer can expect at most $3/n$. The lemma now follows by Lemma 3.1. □
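The position distributions claimed in the proof of Lemma 4.9 can be reproduced numerically. The Python sketch below is an illustration with hypothetical function names; it assumes that clockwise adds 1 modulo $n$ and that a move "at time $t$" is reflected in the position at time $t$. It starts from the end points of $X(x)$, $Y(x)$, and $L$, and applies the prescribed moves.

```python
from collections import defaultdict
from fractions import Fraction


def lemma_4_9_distributions(m, s):
    """Evader's position distributions at times 3m+1 and 3m+2 (n = 6m+5, d = 3s)."""
    n, d = 6 * m + 5, 3 * s
    h = n // 2                                   # equals 3m + 2
    A, B = (2 * h - d) // 3, d // 3
    alpha = Fraction(3, n)
    at_first = defaultdict(Fraction)             # time 3m + 1
    at_second = defaultdict(Fraction)            # time 3m + 2
    starts = [(d - h + 2 + 3 * u, -1) for u in range(A)]    # S1: step anticlockwise
    starts += [(d + h - 2 - 3 * v, +1) for v in range(B)]   # S2: step clockwise
    for x, step in starts:
        y = x + step
        at_first[y % n] += alpha
        for k in (-1, 0, 1):                     # disperse at time 3m + 2
            at_second[(y + k) % n] += alpha / 3
    rest = 1 - alpha * (A + B)                   # mass on floor(n/2), reached via L
    at_first[h] += rest                          # loiter at time 3m + 1
    for k in (-1, 0):                            # then anticlockwise or loiter
        at_second[(h + k) % n] += rest / 2
    return n, dict(at_first), dict(at_second)


def best_flash_payoff(dist, n, t):
    """Largest capture probability over flash points Pursuer can reach by time t."""
    return max(
        sum(dist.get((p + e) % n, Fraction(0)) for e in (-1, 0, 1))
        for p in range(-t, t + 1)
    )
```

For $m = 2$, $s = 1$ (so $n = 17$), the distribution at time $3m + 2$ is uniform on the 17 points and the best reachable flash at time $3m + 1$ yields $3/17$.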
Lemma 4.10. The value of $F(n, d)$ is $3/n$ when $n = 6m + 5$ and $d = 3s + 2$, $s = 0, 1, \ldots, m$.

Proof. Note that
$$A = 2m - s, \qquad B = s.$$
Consider the following strategy for Evader. He chooses a number $x$ in $S$ by means of the probability distribution $P_\alpha$, where $\alpha = 3/n$, and then adopts the procedure $X(x)$, $Y(x)$, or $L$ according to $x \in S_1$, $x \in S_2$, or $x = \lfloor n/2 \rfloor$, respectively. If $x \in S_1$, he moves anticlockwise at time $3m + 1$, disperses at time $3m + 2$, and loiters thereafter. If $x \in S_2$, he moves clockwise at time $3m + 1$, disperses at time $3m + 2$, and loiters thereafter. If $x = \lfloor n/2 \rfloor$, he chooses with equal probability one of the following five courses of action:
(i) move clockwise at times $3m + 1$ and $3m + 2$, then loiter thereafter;
(ii) move clockwise at time $3m + 1$ and loiter thereafter;
(iii) loiter at time $3m + 1$ and subsequently;
(iv) move anticlockwise at time $3m + 1$ and loiter thereafter;
(v) move anticlockwise at times $3m + 1$ and $3m + 2$, then loiter thereafter.

By Lemma 4.1, Pursuer has an expectation of at most $3/n$ if he flashes before time $3m + 1$. At time $3m + 1$, each of the points
$$\{3s - 3m + 3u + 1 : u = 0, 1, \ldots, 2m - s - 1\} \cup \{3s + 3m - 3v + 3 : v = 0, 1, \ldots, s - 1\}$$
has a probability of $3/n$ of being Evader's position, while each of the points $\{3m + 1, 3m + 3\}$ has probability $2/n$ and the point $3m + 2$ has probability $1/n$ of being Evader's position. Since Pursuer cannot reach $3m + 2$ at time $3m + 1$, it is easy to check that he has an expectation of at most $3/n$ if he flashes at time $3m + 1$. However, at time $3m + 2$ and subsequently, each of the points $0, 1, \ldots, n - 1$ has probability $1/n$ of being Evader's position, so Pursuer has an expectation of $3/n$ if he flashes at time $3m + 2$ or later. The lemma now follows by Lemma 3.1. □

We now collect the results in this section into a single theorem.

Theorem 4.1. The value of $F(n, d)$, where $2 \le d \le \lfloor n/2 \rfloor$, is:
$1/\lfloor n/3 \rfloor$, if $n = 3m + 1$ and $d = 3s$ or $3s + 1$, or if $n = 6m + 5$ and $d = 3s + 1$;
$2/(4m + 1)$, if $n = 6m + 2$ and $d = 3s + 1$;
$3/n$, otherwise.
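The case analysis of Theorem 4.1 translates directly into a small lookup function. The Python sketch below is an illustrative transcription with a hypothetical function name; the residues of $n$ and $d$ are used in place of the parameters $m$ and $s$, and exact fractions are returned.

```python
from fractions import Fraction


def value_long_game(n, d):
    """Value of F(n, d) as stated in Theorem 4.1 (requires 2 <= d <= floor(n/2))."""
    if n < 4 or not 2 <= d <= n // 2:
        raise ValueError("expects n >= 4 and 2 <= d <= floor(n/2)")
    if n % 6 == 2 and d % 3 == 1:
        m = (n - 2) // 6
        return Fraction(2, 4 * m + 1)
    if (n % 3 == 1 and d % 3 in (0, 1)) or (n % 6 == 5 and d % 3 == 1):
        return Fraction(1, n // 3)
    return Fraction(3, n)
```

For example, `value_long_game(13, 3)` returns $1/4$, `value_long_game(13, 5)` returns $3/13$, and `value_long_game(8, 4)` returns $2/5$. In every case the value is at least the $3/n$ guaranteed by Lemma 3.1.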
5. Basic Game when $T = \lfloor n/2 \rfloor - 1$
Denote the basic game when $T = \lfloor n/2 \rfloor - 1$ by $O(n, d)$. Since every Pursuer strategy in $O(n, d)$ can be used as a Pursuer strategy in $F(n, d)$, the value of $F(n, d)$ is greater than or equal to the value of $O(n, d)$. Thus, from Lemmas 3.2 and 3.3 and Theorem 4.1, we immediately have the following lemma.

Lemma 5.1. The value of $O(n, d)$ is $1/\lfloor n/3 \rfloor$ when:
(i) $n = 6m + 1$ and $d = 3s$, $s = 1, \ldots, m$, or $d = 3s + 1$, $s = 1, \ldots, m - 1$;
(ii) $n = 6m + 4$ and $d = 3s$ or $3s + 1$, $s = 1, \ldots, m$;
(iii) $n = 6m + 5$ and $d = 3s + 1$, $s = 1, \ldots, m$.
Furthermore, the value of $O(6m + 2, 3s + 1)$ is $2/(4m + 1)$ for $s = 1, \ldots, m$.
When $n = 3m$, it is easy to check that Pursuer can choose one of the points $\{\lfloor n/2 \rfloor - 1 - 3u : u = 0, 1, \ldots, m - 1\}$ at random, go to it, and flash the searchlight at time $\lfloor n/2 \rfloor - 1$; wherever Evader is at time $\lfloor n/2 \rfloor - 1$, he is caught with probability $1/m = 3/n$. Since the value of $F(3m, d)$ is $3/n$ by Theorem 4.1, we therefore have the following lemma.

Lemma 5.2. The value of $O(n, d)$ is $3/n$ when $n$ is a multiple of 3.
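The Pursuer strategy used for Lemma 5.2 can be checked directly for small multiples of 3. The Python sketch below is illustrative, with hypothetical function names: it lists the flash points, confirms that each is reachable from point 0 by time $\lfloor n/2 \rfloor - 1$, and computes the worst-case capture probability over all Evader positions.

```python
from fractions import Fraction


def lemma_5_2_flash_points(n):
    """Flash points floor(n/2) - 1 - 3u, u = 0..n/3 - 1, for n a multiple of 3."""
    if n % 3 != 0:
        raise ValueError("n must be a multiple of 3")
    h = n // 2
    return [(h - 1 - 3 * u) % n for u in range(n // 3)]


def worst_case_capture(points, n):
    """Smallest probability, over Evader positions, of being lit by a random flash."""
    counts = [0] * n
    for p in points:
        for e in (-1, 0, 1):
            counts[(p + e) % n] += 1
    return Fraction(min(counts), len(points))


def all_reachable(points, n, t):
    """True if Pursuer, starting at point 0, can be at every listed point by time t."""
    return all(min(p % n, (-p) % n) <= t for p in points)
```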
When $n$ is not divisible by 3, say $n = 3m + a$, where $a = 1$ or 2, Pursuer can choose a point at random from
$$\{\lfloor n/2 \rfloor - 1 - 3u : u = 0, 1, \ldots, m - 1\} \cup \{\lfloor n/2 \rfloor + 2\},$$
go to it, and flash at time $\lfloor n/2 \rfloor - 1$. It is easy to verify that, wherever Evader is at time $\lfloor n/2 \rfloor - 1$, he is caught with probability at least $1/(m + 1)$. We have therefore proved the following lemma.

Lemma 5.3. When $n$ is not divisible by 3, Pursuer can ensure himself of an expectation of at least $1/\lfloor (n + 3)/3 \rfloor$ in $O(n, d)$.

Lemma 5.3 will enable us to deal with all but one of the cases not covered in Lemmas 5.1 and 5.2. For the outstanding case, we need the following result.

Lemma 5.4. If $n = 6m + 4$, Pursuer can ensure himself of an expectation of at least $6/(2n + 1)$ in $O(n, d)$.
Proof. Let Pursuer choose a point at random from
$$\{3m + 1 - 3u : u = 0, 1, \ldots, 2m\} \cup \{3m - 3u : u = 0, 1, \ldots, 2m\} \cup \{3m + 3\},$$
go to it, and flash at time $3m + 1$. Wherever Evader is, he is clearly caught with probability at least $2/(4m + 3) = 6/(2n + 1)$, and the result follows. □

Lemma 5.5. The value of $O(6m + 4, 3s + 2)$ is $6/(2n + 1)$ for $m = 1, 2, \ldots$ and $s = 0, 1, \ldots, m$.

Proof. Let Evader choose a point $x$ in $S$ by means of the probability distribution $P_\alpha$, where $\alpha = 2/(4m + 3)$. If $x \in S_1$, he chooses the procedure $X(x)$ and then, at time $\lfloor n/2 \rfloor - 1$, loiters or moves anticlockwise with equal probability. If $x \in S_2$, he chooses the procedure $Y(x)$ and then, at time $\lfloor n/2 \rfloor - 1$, loiters or moves clockwise with equal probability. If $x = \lfloor n/2 \rfloor$, he adopts $L$ and then disperses at time $\lfloor n/2 \rfloor - 1$. By Lemma 4.1, Pursuer has an expectation of at most $2/(4m + 3)$ if he flashes before $\lfloor n/2 \rfloor - 1$. Now, at time $\lfloor n/2 \rfloor - 1$, Evader is at one of the points of
$$\{3u + 2 : u = 0, 1, \ldots, 2m\} \cup \{3(u + m) : u = 1, \ldots, m + 1\} \cup \{3u + 1 : u = 0, 1, \ldots, m\},$$
and each point is equally probable. Since Pursuer cannot be at the point $3m + 2$ at time $3m + 1$, Evader is caught with probability $2/(4m + 3)$ if Pursuer flashes at time $\lfloor n/2 \rfloor - 1$. The result now follows by Lemma 5.4. □

Lemma 5.6. The value of $O(n, d)$ is $1/\lfloor (n + 3)/3 \rfloor$ when:
(i) $n = 6m + 1$ and $d = 3s + 2$, $s = 0, 1, \ldots, m - 1$;
(ii) $n = 6m + 5$ and $d = 3s + 2$, $s = 0, 1, \ldots, m$;
(iii) $n = 6m + 5$ and $d = 3s$, $s = 1, \ldots, m$.

Proof. Let Evader choose a point $x$ from $S$ by means of the probability distribution $P_\alpha$, where $\alpha = 1/\lfloor (n + 3)/3 \rfloor$. Notice that $\lfloor n/2 \rfloor$ is chosen with probability $2/\lfloor (n + 3)/3 \rfloor$ in cases (i) and (ii) and with probability $1/\lfloor (n + 3)/3 \rfloor$ in case (iii). If $x \in S_1$, he adopts $X(x)$; then, at time $\lfloor n/2 \rfloor - 1$, he moves anticlockwise. If $x \in S_2$, he adopts $Y(x)$; then, at time $\lfloor n/2 \rfloor - 1$, he moves clockwise. If $x = \lfloor n/2 \rfloor$, he adopts $L$; at time $\lfloor n/2 \rfloor - 1$, he loiters or moves clockwise with equal probability in cases (i) and (ii), but just loiters in case (iii).
By Lemma 4.1, Pursuer can expect at most $1/(2m+1)$ if he flashes before time $\lfloor n/2 \rfloor - 1$. However, at time $\lfloor n/2 \rfloor - 1$, Evader is at one of the points of $U \cup V \cup W$, where
$$U = \{d - \lfloor n/2 \rfloor + 1 + 3u : u = 0, 1, \ldots, A-1\}, \qquad V = \{d + \lfloor n/2 \rfloor - 1 - 3v : v = 0, 1, \ldots, B-1\},$$
$$W = \begin{cases} \{\lfloor n/2 \rfloor,\ \lfloor n/2 \rfloor + 1\}, & \text{cases (i) and (ii)}, \\ \{\lfloor n/2 \rfloor\}, & \text{case (iii)}; \end{cases}$$
furthermore, each point of $U \cup V \cup W$ is equally likely. Noticing that, at time $\lfloor n/2 \rfloor - 1$, Pursuer cannot be at either of the points $\lfloor n/2 \rfloor$ and $\lfloor n/2 \rfloor + 1$, it is easy to verify that Pursuer can catch at most one of the points of $U \cup V \cup W$ with a flash at time $\lfloor n/2 \rfloor - 1$. The result now follows by Lemma 5.3. □
Lemma 5.7. The value of $O(6m+2, d)$ is $1/\lfloor (n+3)/3 \rfloor$ when:
(i) $d = 3s$, $s = 1, \ldots, m$;
(ii) $d = 3s+2$, $s = 0, 1, \ldots, m-1$.
Proof. Let Evader choose a point $x$ from $S$ by means of the probability distribution $P_a$, where $a = 1/\lfloor (n+3)/3 \rfloor$. Notice that $\lfloor n/2 \rfloor$ is chosen with probability $a$. If $x \in S_1$, he adopts $X(x)$; then, at time $\lfloor n/2 \rfloor - 1$, he moves anticlockwise in case (i) and loiters in case (ii). If $x \in S_2$, he adopts $Y(x)$; then, at time $\lfloor n/2 \rfloor - 1$, he loiters in case (i) and moves clockwise in case (ii). If $x = \lfloor n/2 \rfloor$, he adopts $L$; then, at time $\lfloor n/2 \rfloor - 1$, he moves anticlockwise in case (i) and moves clockwise in case (ii).
By Lemma 4.1, Evader is caught with probability at most $1/\lfloor (n+3)/3 \rfloor$ if Pursuer flashes before $\lfloor n/2 \rfloor - 1$. However, at time $\lfloor n/2 \rfloor - 1$, Evader is at one of the points of
$$\{3u : u = 0, 1, \ldots, m\} \cup \{3(m+u)+2 : u = 0, 1, \ldots, m-1\},$$
and each point is equally probable. Since Pursuer cannot be at the point $3m+1$ at time $3m$, he can catch Evader with probability at most $1/(2m+1)$ if he flashes at time $\lfloor n/2 \rfloor - 1$. The result now follows by Lemma 5.3. □
We now collect the results of this section into a single theorem.
Theorem 5.1. The value of $O(n, d)$, where $2 \le d \le \lfloor n/2 \rfloor$, is:
(i) $1/\lfloor n/3 \rfloor$, when $n = 3m$, or when $n = 3m+1$ and $d = 3s$ or $3s+1$, or when $n = 6m+5$ and $d = 3s+1$;
(ii) $1/\lfloor (n+3)/3 \rfloor$, when $n = 3m+2$ and $d = 3s$ or $3s+2$, or when $n = 6m+1$ and $d = 3s+2$;
(iii) $6/(2n-1)$, when $n = 6m+2$ and $d = 3s+1$;
(iv) $6/(2n+1)$, when $n = 6m+4$ and $d = 3s+2$.
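The case analysis of Theorem 5.1 can be transcribed into a small lookup, which makes it easy to confirm that the four cases are mutually exclusive and cover every admissible pair $(n, d)$. The following is a minimal sketch, not part of the original paper: the function names `searchlight_value` and `min_coverage` are hypothetical, exact rationals are used so that values can be compared with the lemmas, and `min_coverage` additionally assumes (this is not restated in this section) that the positions are the points $0, 1, \ldots, n-1$ of a cycle and that a flash illuminates the flash point and its two neighbours.

```python
from fractions import Fraction


def searchlight_value(n: int, d: int) -> Fraction:
    """Value of the basic game as tabulated in Theorem 5.1."""
    if not 2 <= d <= n // 2:
        raise ValueError("Theorem 5.1 is stated for 2 <= d <= floor(n/2)")
    n3, n6, d3 = n % 3, n % 6, d % 3
    # Case (i): n = 3m; or n = 3m+1 with d = 3s or 3s+1; or n = 6m+5 with d = 3s+1.
    if n3 == 0 or (n3 == 1 and d3 in (0, 1)) or (n6 == 5 and d3 == 1):
        return Fraction(1, n // 3)
    # Case (ii): n = 3m+2 with d = 3s or 3s+2; or n = 6m+1 with d = 3s+2.
    if (n3 == 2 and d3 in (0, 2)) or (n6 == 1 and d3 == 2):
        return Fraction(1, (n + 3) // 3)
    # Case (iii): n = 6m+2 with d = 3s+1.
    if n6 == 2 and d3 == 1:
        return Fraction(6, 2 * n - 1)
    # Case (iv): n = 6m+4 with d = 3s+2.
    if n6 == 4 and d3 == 2:
        return Fraction(6, 2 * n + 1)
    raise AssertionError("unreachable if the four cases are exhaustive")


def min_coverage(m: int) -> int:
    """Fewest candidate flash points (from the proof of Lemma 5.4) that
    illuminate any single position, for n = 6m + 4.

    ASSUMPTION: positions live on a cycle of n points and a flash lights
    the flash point together with its two neighbours.
    """
    n = 6 * m + 4
    flash_points = (
        [(3 * m + 1 - 3 * u) % n for u in range(2 * m + 1)]
        + [(3 * m - 3 * u) % n for u in range(2 * m + 1)]
        + [(3 * m + 3) % n]
    )
    # The 4m + 3 candidate points are distinct modulo n.
    assert len(set(flash_points)) == 4 * m + 3
    return min(
        sum(1 for p in flash_points if (p - x) % n in (0, 1, n - 1))
        for x in range(n)
    )
```

For example, `searchlight_value(10, 2)` falls under case (iv) and returns `Fraction(2, 7)`, which is $2/(4m+3)$ with $m = 1$. Under the stated illumination assumption, `min_coverage(1)` returns 2, matching the "at least $2/(4m+3)$" bound in the proof of Lemma 5.4.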
6. Conclusions
We have obtained explicit optimal strategies for the players in Olsder and Papavassilopoulos's basic searchlight game in addition to finding a closed formula for its value. Evader's strategy can be thought of as spreading himself over as many points as possible, while Pursuer's strategy tells him always to flash at a given time. Knowing these results should considerably reduce the computation required to obtain results for generalizations of the basic game. Indeed, in a number of instances, the solutions follow easily from them.
Notice that, when $n$ is divisible by 3, our results show that Pursuer does not benefit from knowing Evader's initial position. Thus, for this case, it is easy to see that $1 - (1 - 3/n)^k$ is the value of the generalization in which Pursuer has $k$ flashes at his disposal and $T \ge k(\lfloor n/2 \rfloor - 1)$. Notice that, for $n$ divisible by 3, $1 - (1 - 3/n)^k$ is also the value when Pursuer has $k$ flashes at his disposal, $T \ge (k-1)(\lfloor n/2 \rfloor - 1)$, and Evader's initial position is chosen using the uniform distribution (of course Evader would know his and Pursuer's initial positions, but Pursuer would only know his own). Under the same restrictions on $T$, the corresponding games where Pursuer must flash $k$ times in time $T$ (rather than just having $k$ flashes at his disposal) clearly have the same values.
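The $k$-flash formula above is easy to evaluate exactly. The sketch below is illustrative only: the function name `k_flash_value` is hypothetical, the time-horizon condition on $T$ is not modelled, and the reading in the comments (each flash misses with probability $1 - 3/n$, so $k$ misses compound) is one interpretation of why the formula has this shape rather than a statement taken from the paper.

```python
from fractions import Fraction


def k_flash_value(n: int, k: int) -> Fraction:
    """Evaluate 1 - (1 - 3/n)**k exactly, for n divisible by 3.

    The restriction on the horizon T stated in the text is assumed
    to hold and is not checked here.
    """
    if n <= 0 or n % 3 != 0:
        raise ValueError("the formula is stated only for n divisible by 3")
    if k < 0:
        raise ValueError("k must be a non-negative number of flashes")
    single_flash = Fraction(3, n)      # value 3/n of the one-flash game
    miss_all = (1 - single_flash) ** k  # interpretation: all k flashes miss
    return 1 - miss_all
```

With $n = 9$, one flash gives `k_flash_value(9, 1) == Fraction(1, 3)`, and two flashes give `k_flash_value(9, 2) == Fraction(5, 9)`, since $1 - (2/3)^2 = 5/9$.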
References
1. BASAR, T., and OLSDER, G. J., Dynamic Noncooperative Game Theory, Academic Press, London, England, 1982.
2. OLSDER, G. J., and PAPAVASSILOPOULOS, G. P., About When to Use a Searchlight, Journal of Mathematical Analysis and Applications, Vol. 136, pp. 466-478, 1988.
3. OLSDER, G. J., and PAPAVASSILOPOULOS, G. P., A Markov Chain Game with Dynamic Information, Journal of Optimization Theory and Applications, Vol. 59, pp. 467-486, 1988.
4. BASTON, V. J., and BOSTOCK, F. A., A One-Dimensional Helicopter-Submarine Game, Naval Research Logistics, Vol. 36, pp. 479-490, 1989.
5. BASTON, V. J., and BOSTOCK, F. A., An Evasion Game with Barriers, SIAM Journal on Control and Optimization, Vol. 26, pp. 1099-1105, 1988.
6. LEE, K. T., A Firing Game with Time Lag, Journal of Optimization Theory and Applications, Vol. 41, pp. 547-558, 1983.
7. LEE, K. T., An Evasion Game with a Destination, Journal of Optimization Theory and Applications, Vol. 46, pp. 359-372, 1985.
8. BERNHARD, P., COLOMB, A. L., and PAPAVASSILOPOULOS, G. P., Rabbit and Hunter Game: Two Discrete Stochastic Formulations, Computers and Mathematics with Applications, Vol. 13, pp. 205-225, 1987.
9. OLSDER, G. J., and PAPAVASSILOPOULOS, G. P., On a Finite State Space Pursuit-Evasion Game with Dynamic Information, Proceedings of the Conference on Decision and Control, Athens, Greece, 1986.