dist(c, H) = (1/‖Φ‖) |Φ(c) − sup Φ(C)| = (1/‖Φ‖) {sup Φ(C) − Φ(c)}   (c ∈ C),

whence

dist(C, H) = inf_{c∈C} dist(c, H) = (1/‖Φ‖) inf_{c∈C} {sup Φ(C) − Φ(c)} = 0.

2° ⇒ 1°. Assume 2° and (1.42) (in the case of (1.43), replacing Φ by −Φ, we arrive at the case (1.42)), and let ε > 0. Then, by (1.58), there exist c_ε ∈ C and h_ε ∈ H such that ‖c_ε − h_ε‖ < ε/‖Φ‖. Hence

Φ(h_ε) − Φ(c_ε) = Φ(h_ε − c_ε) ≤ ‖Φ‖ ‖h_ε − c_ε‖ < ε,

and thus Φ(c_ε) > Φ(h_ε) − ε = d − ε, which proves (1.44). □
1.1 Some preliminaries from convex analysis
Corollary 1.2. Let X be a normed linear space. For a ball B = B(x, r) ⊆ X and a hyperplane H of (1.41), the following statements are equivalent:
1°. H quasi-supports the ball B.
2°. We have (1.58) with C = B, and

H ∩ int B = ∅.  (1.59)

3°. We have

dist(x, H) = r.  (1.60)

Proof. By the above, in order to prove the equivalence 1° ⇔ 2°, it will be enough to show that (1.42) (with C = B) ⇔ (1.59).
Suppose first that (1.60) holds, and let ε > 0. Then there exists y ∈ H such that ‖x − y‖ < r + ε, and for

z := (ε/(r + ε)) x + (r/(r + ε)) y  (1.61)

we have ‖x − z‖ = (r/(r + ε)) ‖x − y‖ < r, so z ∈ B, and ‖z − y‖ = (ε/(r + ε)) ‖x − y‖ < ε, whence, since ε > 0 was arbitrary, it follows that we have dist(H, B) = 0, i.e., (1.58) of 2°. On the other hand, if there existed an element y ∈ H ∩ int B, then we would have dist(x, H) ≤ ‖x − y‖ < r, violating the assumption (1.60); hence, we also have (1.59) of 2°. □

Lemma 1.6. Let X be a normed linear space, x ∈ X, and r > 0. For any Φ ∈ X* with ‖Φ‖ = 1, the hyperplane H ⊆ X defined by
H = {y ∈ X | Φ(y − x) = r} = {y ∈ X | Φ(y) = Φ(x) + r}  (1.62)
quasi-supports the ball B = B(x, r), and conversely, for each quasi-support hyperplane H of the ball B there exists a unique Φ ∈ X* with ‖Φ‖ = 1 such that we have (1.62).

Proof. Let H be a hyperplane of the form (1.62), with ‖Φ‖ = 1. Then, by Lemma 1.5, dist(x, H) = |Φ(x) − [Φ(x) + r]| = r, and hence, by Corollary 1.2, H quasi-supports the ball B. Conversely, let H = {y ∈ X | Φ₁(y) = d₁} be a quasi-support hyperplane of the ball B. Then, for Φ₂ := Φ₁/‖Φ₁‖, d₂ := d₁/‖Φ₁‖, we have ‖Φ₂‖ = 1 and H = {y ∈ X | Φ₂(y) = d₂}. Since H quasi-supports the ball B, we obtain, by Lemma 1.5 and Corollary 1.2, |Φ₂(x) − d₂| = dist(x, H) = r. Hence, for Φ := sign{d₂ − Φ₂(x)} Φ₂ we have ‖Φ‖ = 1 and Φ(y − x) = |d₂ − Φ₂(x)| = r (y ∈ H), so H ⊆ {y ∈ X | Φ(y − x) = r}. Since both sets in this inclusion are hyperplanes, they must coincide, so H is of the form (1.62), with ‖Φ‖ = 1. To prove uniqueness, assume that we also have H = {y ∈ X | Φ′(y − x) = r}, where Φ′ ∈ X*, ‖Φ′‖ = 1, Φ′ ≠ Φ. Then

H ⊆ {y ∈ X | (Φ − Φ′)(y) = (Φ − Φ′)(x)}.

Since Φ′ ≠ Φ, both sets in this inclusion are hyperplanes, so they must coincide. Hence, since x belongs to the second set, it follows that x ∈ H, and therefore, by (1.62), r = 0, in contradiction to the assumption that r > 0. Consequently, we have Φ′ = Φ. □

Remark 1.4. If for a functional Φ ∈ X* the hyperplane H defined by (1.62) quasi-supports the ball B(x, r), then, necessarily, ‖Φ‖ = 1, since by (1.60) and Lemma 1.5 we have
r = dist(x, H) = |Φ(x) − [Φ(x) + r]| / ‖Φ‖ = r / ‖Φ‖.

Definition 1.2. We shall say that a closed half-space V (respectively, an open half-space U) quasi-supports a subset C of a locally convex space X if the hyperplane bd V (respectively, bd U) quasi-supports the set C.

Lemma 1.7. A closed half-space V quasi-supports a set C if and only if it has one (and only one) of the forms

V₁ = {y ∈ X | Φ(y) ≥ sup Φ(C)}  (1.63)

or

V₂ = {y ∈ X | Φ(y) ≤ sup Φ(C)},  (1.64)
and an open half-space U quasi-supports a set C if and only if it has one (and only one) of the forms

U₁ = {y ∈ X | Φ(y) > sup Φ(C)}  (1.65)

or

U₂ = {y ∈ X | Φ(y) < sup Φ(C)},  (1.66)
where Φ ∈ X*\{0}, sup Φ(C) ∈ R. Also, conversely, every closed half-space V of the form (1.63) or (1.64), and every open half-space U of the form (1.65) or (1.66), where Φ ∈ X*\{0}, sup Φ(C) ∈ R, quasi-supports the set C.

Proof. This follows from Definition 1.2 and Corollary 1.1, since for V = V₁ or V = V₂ of (1.63) or (1.64), respectively, and U = U₁ or U = U₂ of (1.65) or (1.66), respectively, we have

bd V = bd U = {y ∈ X | Φ(y) = sup Φ(C)}. □

Corollary 1.3. Every closed (respectively, open) half-space V (respectively, U) quasi-supporting C and not containing C (respectively, int C) can be written in the form (1.63) (respectively, (1.65)), where Φ ∈ X*\{0}, sup Φ(C) ∈ R; conversely, every closed (respectively, open) half-space V (respectively, U) of the form (1.63) (respectively, (1.65)), where Φ ∈ X*\{0}, sup Φ(C) ∈ R, quasi-supports the set C and does not contain C (respectively, int C).

Proof. This follows from Lemma 1.7, since for all Φ ∈ X*\{0} we have C ⊆ V₂ and int C ⊆ U₂ (by Lemma 1.8 below). □

Lemma 1.5 implies the following useful formula for the distance to a closed half-space, respectively an open half-space:

Corollary 1.4. Let Φ ∈ X*\{0}, d ∈ R, and

V_{Φ,d} := {y ∈ X | Φ(y) ≥ d},  U_{Φ,d} := {y ∈ X | Φ(y) > d}.  (1.67)
Then, for any x₀ ∉ V_{Φ,d}, we have

dist(x₀, V_{Φ,d}) = dist(x₀, U_{Φ,d}) = (d − Φ(x₀)) / ‖Φ‖.  (1.68)

Proof. By H_{Φ,d} ⊆ V_{Φ,d} and Lemma 1.5, we have

dist(x₀, V_{Φ,d}) = dist(x₀, H_{Φ,d}) = |Φ(x₀) − d| / ‖Φ‖.  (1.69)

But since x₀ ∉ V_{Φ,d}, we have Φ(x₀) < d, whence by (1.69), we obtain (1.68). □
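Formula (1.68) is easy to check numerically. The sketch below (with illustrative numbers of my own, not from the text) compares the distance predicted by (1.68) in R² with the distance to the orthogonal projection of x₀ onto the bounding hyperplane {y | Φ(y) = d}:

```python
import numpy as np

# Illustrative data (my own choices): Phi(y) = 3*y1 + 4*y2, so ||Phi|| = 5.
phi = np.array([3.0, 4.0])
d = 10.0
x0 = np.array([0.0, 0.0])         # Phi(x0) = 0 < d, so x0 lies outside V_{Phi,d}

# Distance predicted by (1.68): (d - Phi(x0)) / ||Phi||.
dist_formula = (d - phi @ x0) / np.linalg.norm(phi)

# Independent check: the nearest point of V_{Phi,d} to x0 is the orthogonal
# projection of x0 onto the bounding hyperplane {y : Phi(y) = d}.
proj = x0 + (d - phi @ x0) / (phi @ phi) * phi
dist_proj = np.linalg.norm(proj - x0)
```

Here the projection lands exactly on the hyperplane, and both computed distances agree with (1.68).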
Remark 1.5. If d = +∞, then for the sets defined by (1.29) and (1.67) we have H_{Φ,d} = V_{Φ,d} = U_{Φ,d} = ∅. Hence, formulas (1.54) and (1.68) remain valid also for d = +∞, by the convention (1.51).

The following result was used in the proof of Corollary 1.3:

Lemma 1.8. Let C be a nonempty open subset of a locally convex space X and Φ ∈ X*\{0}. Then

inf Φ(C) < Φ(c) < sup Φ(C)   (c ∈ C).  (1.70)
Proof. If sup Φ(C) = +∞, then, since Φ(c) ∈ R (c ∈ C), the second inequality of (1.70) is obvious. Assume now that sup Φ(C) < +∞, and suppose that there exists x₀ ∈ C with Φ(x₀) = sup Φ(C). Choose u ∈ X with Φ(u) > 0. Since C is open, we have x_ε := x₀ + εu ∈ C for some ε > 0, whence

Φ(x_ε) = Φ(x₀) + εΦ(u) > sup Φ(C),  (1.71)

in contradiction to x_ε ∈ C. This proves the second inequality of (1.70). The proof of the first inequality is similar. □

Remark 1.6. (a) Lemma 1.8 admits (by Lemma 1.4) the following geometric interpretation: if C is a nonempty open subset of a locally convex space X, then C has no support hyperplane.
(b) Actually, as shown by the above proof, one can replace in Lemma 1.8 "open" by "linearly open," i.e., such that C = core C, where

core C := {c ∈ C | ∀x ∈ X, ∃ε > 0, ∀η ∈ [−ε, +ε], ηx + (1 − η)c ∈ C}.  (1.72)

We shall also use the following lemma.

Lemma 1.9. Let X be a locally convex space, f : X → R̄ a convex function (see (1.93) below), and G a subset of X satisfying the "Slater condition"

inf f(X) < sup f(G) < +∞.  (1.73)
Furthermore, let us consider the sets

A := A_{sup f(G)}(f) = {y ∈ X | f(y) < sup f(G)},  (1.74)
S := S_{sup f(G)}(f) = {y ∈ X | f(y) ≤ sup f(G)}.  (1.75)

Then A ≠ ∅ and

S ⊆ cl A.  (1.76)
Proof. Note first that by (1.73), we have A ≠ ∅. Let x ∈ S be arbitrary. Taking any x₀ ∈ A (by A ≠ ∅), let

x_n := (1/n) x₀ + (1 − 1/n) x   (n = 1, 2, ...).  (1.77)

Then, since f is convex and x₀ ∈ A, x ∈ S, we have

f(x_n) ≤ (1/n) f(x₀) + (1 − 1/n) f(x) < sup f(G),

so x_n ∈ A (n = 1, 2, ...). Also, clearly, x_n → x, which, since x ∈ S was arbitrary, proves (1.76). □
Remark 1.7. (a) For any lower semicontinuous function f we have

cl A ⊆ S,  (1.78)

and hence, by Lemma 1.9, if f is convex and lower semicontinuous and satisfies (1.73), then

S = cl A.  (1.79)

Indeed, if f is lower semicontinuous, then S is closed and A ⊆ S, whence cl A ⊆ S.
(b) If f is convex and upper semicontinuous and satisfies (1.73), then

int S = A.  (1.80)
Indeed, since f is upper semicontinuous, A is open, so we have the inclusion ⊇ in (1.80). In the opposite direction, let x ∈ int S and, by (1.73), let x₀ ∈ A, so f(x₀) < sup f(G). We may assume that x ≠ x₀ (since otherwise x ∈ A and we are done). Let

y_n := −(1/n) x₀ + (1 + 1/n) x   (n = 1, 2, ...).  (1.81)

Then, since x ∈ int S and int S is open, for sufficiently large n we have y_n ∈ int S ⊆ S, so f(y_n) ≤ sup f(G). But by (1.81), we have x = (n/(n+1)) y_n + (1/(n+1)) x₀, whence, since f is convex, we obtain

f(x) ≤ (n/(n+1)) f(y_n) + (1/(n+1)) f(x₀) < sup f(G),

so x ∈ A, which proves the inclusion ⊆ in (1.80), and hence the equality (1.80).

The polar set of a set C ⊆ X is the subset of X* defined by

C° := {Φ ∈ X* | Φ(c) ≤ 1 (c ∈ C)},  (1.82)
and the bipolar of C is C°° := (C°)°. The classical "bipolar theorem" states that, for any subset C of a locally convex space X, we have C°° = cl co (C ∪ {0}); hence, a set C containing 0 is closed and convex if and only if C°° = C.

Since in optimization theory it is useful to work with functions having values in the extended real line R̄ = [−∞, +∞], or in the extended positive half-line R̄₊ = [0, +∞], it is necessary to give a precise meaning to expressions like ∞ − ∞ and 0 × ∞. We recall (see, e.g., Moreau [164]) that the usual addition + on R = (−∞, +∞) admits two natural extensions to R̄, the upper addition ∔ and the lower addition ⨥, defined by

a ∔ b = a ⨥ b = a + b  if either R ∩ {a, b} ≠ ∅ or a = b = ±∞,  (1.83)
a ∔ b = +∞, a ⨥ b = −∞  if a = −b = ±∞.  (1.84)

We shall use the notation

∑_{i=1}^{m} a_i = a₁ ∔ ⋯ ∔ a_m,  (1.85)

and, as usual, if R ∩ {a, b} ≠ ∅, we shall denote ∔ and ⨥ simply by +. According to the above, subtraction − on R̄ will mean either ∔(−·) or ⨥(−·). We shall use the well-known calculus rules for ∔ and ⨥ on R̄ developed by Moreau; the proofs can be found, e.g., in [164]. For example (see, e.g., [164, p. 115, Proposition], or [254, Lemma 8.3]),

−(a ∔ b) = (−a) ⨥ (−b)   (a, b ∈ R̄).  (1.86)

It is also well known (see, e.g., [164, p. 119, Proposition], or [254, Lemma 8.2]) that

a ∔ b ≥ c ⇔ a ≥ c ⨥ (−b)   (a, b, c ∈ R̄);  (1.87)

hence, in particular (see, e.g., Moreau [164, p. 120, Corollary]),

a ∔ b ≥ 0 ⇔ a ≥ −b   (a, b ∈ R̄).  (1.88)

Let us also mention the following relation between the upper and lower addition and the order on R̄ (see, e.g., [164, p. 119, Lemma]), which we shall use in later chapters:

a ∔ (b ⨥ c) ≥ (a ∔ b) ⨥ c   (a, b, c ∈ R̄).  (1.89)

It is well known, too (see, e.g., Moreau [164, p. 122, Corollary] or [254, Lemma 8.3]), that for any set X, any function f : X → R̄, and any a ∈ R̄ we have

inf_{x∈X} (f(x) ∔ a) = inf f(X) ∔ a,  (1.90)
sup_{x∈X} (f(x) ⨥ a) = sup f(X) ⨥ a.  (1.91)
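Moreau's upper and lower additions can be made concrete in a few lines of code. The sketch below is my own illustration (not from the text), using IEEE floating-point infinities for ±∞; the two operations differ only in how they resolve the conflicting case (+∞) + (−∞):

```python
import math

INF = math.inf

def upper_add(a, b):
    """Moreau's upper addition: resolves a = -b = +-inf to +inf, cf. (1.83)-(1.84)."""
    if a == -b and math.isinf(a):
        return INF
    return a + b

def lower_add(a, b):
    """Moreau's lower addition: resolves a = -b = +-inf to -inf."""
    if a == -b and math.isinf(a):
        return -INF
    return a + b
```

With these definitions one can verify the calculus rules mechanically, e.g. that −(a ∔ b) = (−a) ⨥ (−b) and that a ∔ b ≥ 0 exactly when a ≥ −b, over all sign patterns of infinities.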
We shall also use the extension to the cases a = 0 and b ∈ {−∞, +∞}, respectively a ∈ {−∞, +∞} and b = 0, of the usual multiplication ×, defined by the conventions

0 × (+∞) = (+∞) × 0 = +∞,  0 × (−∞) = (−∞) × 0 = 0.  (1.92)
We recall that if X is a linear space, a function f : X → R̄ is said to be
(a) convex if

f(λx₁ + (1 − λ)x₂) ≤ λf(x₁) + (1 − λ)f(x₂)   (x₁, x₂ ∈ X, 0 < λ < 1);  (1.93)

(b) concave if the function −f is convex;
(c) proper if f is not identically +∞, and f(x) > −∞ for all x ∈ X;
(d) sublinear if it is convex and positively homogeneous, i.e.,

f(ax) = af(x)   (x ∈ X, a ∈ R, a > 0).  (1.94)
It is well known and easy to show that a function f : X → R̄ is convex if and only if the set epi f ⊆ X × R (defined by (1.21)) is convex.

Conjugate functions will later be a basic tool for defining dual optimization problems. Given a locally convex space X, the (Fenchel) conjugate function of a function f : X → R̄ is the function f* : X* → R̄ defined by

f*(Φ) = sup_{x∈X} {Φ(x) − f(x)}   (Φ ∈ X*).  (1.95)
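As a numerical illustration (my own, not from the text), in X = R the conjugate (1.95) can be approximated by maximizing over a fine grid; for f(x) = x²/2 one recovers the classical self-conjugacy f*(Φ) = Φ²/2:

```python
import numpy as np

# Grid approximation of f*(phi) = sup_x {phi*x - f(x)} for f(x) = x^2/2.
xs = np.linspace(-10.0, 10.0, 200001)
f = 0.5 * xs**2

def conjugate(phi):
    # Maximize the affine-minus-f expression over the grid.
    return np.max(phi * xs - f)

approx = conjugate(3.0)   # exact value would be 3.0**2 / 2 = 4.5
```

The grid maximum sits at x = Φ, so the approximation is accurate as long as Φ keeps the maximizer inside the grid; the Fenchel inequality Φ(x) ≤ f(x) ∔ f*(Φ) can also be spot-checked this way.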
The function f* is also called the convex conjugate of f (since f* is a convex function, for any f : X → R̄), while the concave conjugate f^⊙ : X* → R̄ of f is defined by

f^⊙(Φ) = inf_{x∈X} {Φ(x) − f(x)}   (Φ ∈ X*);  (1.96)

however, in the sequel we shall consider mainly the convex conjugate f*, and therefore we shall omit the adjective "convex." The biconjugate of any function f : X → R̄ is the function f** : X → R̄ defined by

f**(x) = sup_{Φ∈X*} {Φ(x) − f*(Φ)} = sup_{Φ∈X*} {Φ(x) − sup_{y∈X} {Φ(y) − f(y)}}
       = sup_{Φ∈X*} inf_{y∈X} {f(y) − Φ(y) + Φ(x)}   (x ∈ X).  (1.97)
When X* is endowed with the weak* topology σ(X*, X), its conjugate space coincides with X, and hence f** = (f*)*. By (1.97), we have

f** ≤ f   (f ∈ R̄^X);  (1.98)
also, f = f** if and only if

f(z₀) = sup_{Φ∈X*} inf_{y∈X} {f(y) − Φ(y) + Φ(z₀)}   (z₀ ∈ X).  (1.99)
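A grid-based sketch (my own illustration) shows the biconjugate at work: for a nonconvex double-well function, f** stays below f, as in (1.98), and approximates the lower semicontinuous convex hull of f given by the Fenchel-Moreau theorem recalled below:

```python
import numpy as np

# Double well f(x) = (x^2 - 1)^2 on a grid; its lsc convex hull vanishes on [-1, 1].
xs = np.linspace(-2.0, 2.0, 401)
f = (xs**2 - 1.0)**2

# Conjugate (1.95) and biconjugate (1.97), both restricted to grids.
phis = np.linspace(-40.0, 40.0, 801)
fstar = np.array([np.max(p * xs - f) for p in phis])
fstarstar = np.array([np.max(phis * x - fstar) for x in xs])
```

On the grid points one gets f** ≤ f exactly (each grid conjugate already dominates the corresponding affine minorants), and f**(0) = 0 since the chord between the two wells at x = ±1 is flat.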
For any function f : X → R̄ on a locally convex space X we shall denote by f̄_co the lower semicontinuous convex hull of f, i.e., the greatest lower semicontinuous convex minorant of f. We have the following classical theorem of Fenchel–Moreau (see, e.g., Ekeland and Temam [54, Ch. 1, Section 4, Proposition 4.1], Ioffe and Tikhomirov [111, Ch. 3, Section 3.3, Theorem 1 and Corollary 1], or Barbu and Precupanu [14, Ch. 2, Corollaries 1.5 and 1.6]):

Theorem 1.4. Let X be a locally convex space. We have

f** = f̄_co   (f ∈ R̄^X).  (1.100)

Hence, for a function f : X → R̄, we have f = f** (or, equivalently, f = f̄_co) if and only if f is the supremum of a set of continuous affine functions.

For any function f : X → R̄, where X is a linear space, the ("effective") domain of f is the set

dom f := {x ∈ X | f(x) < +∞}.  (1.101)
We recall that if X is a locally convex space, then the support set supp f of any function f : X → R̄ is the subset of X* defined by

supp f := {Φ ∈ X* | Φ ≤ f}.  (1.102)

Also, the (X*, R)-support set Supp f of any function f : X → R̄ is the subset of X* × R defined by

Supp f := {(Φ, d) ∈ X* × R | Φ − d ≤ f}.  (1.103)

Clearly, we have the following relations between Supp f and supp f:

(Φ, d) ∈ Supp f ⇔ Φ ≤ f + d ⇔ Φ ∈ supp(f + d).  (1.104)
For any f, h : X → R̄ we have the equivalence

f ≤ h ⇔ epi h ⊆ epi f;  (1.105)

indeed, we have f ≤ h if and only if there exist no x ∈ X and d ∈ R such that f(x) > d ≥ h(x). Furthermore,

epi f* = Supp f;  (1.106)
indeed, for any Φ ∈ X* and d ∈ R we have the equivalences

f*(Φ) ≤ d ⇔ sup_{y∈X} {Φ(y) − f(y)} ≤ d ⇔ Φ − f ≤ d ⇔ Φ − d ≤ f.  (1.107)

Consequently, for any functions f, h : X → R̄ satisfying f = f**, h = h**, we have the equivalence

f ≤ h ⇔ Supp f ⊆ Supp h;  (1.108)

indeed, by f = f**, h = h**, (1.105), and (1.106),

f ≤ h ⇔ f* ≥ h* ⇔ epi f* ⊆ epi h* ⇔ Supp f ⊆ Supp h.

By (1.95), we have the so-called "Fenchel inequality"

Φ(x) ≤ f(x) ∔ f*(Φ)   (x ∈ X, Φ ∈ X*).  (1.109)
If X is a locally convex space, the subdifferential of a function f : X → R̄ at a point z₀ ∈ X is the subset ∂f(z₀) of X* defined by

∂f(z₀) := {Φ ∈ X* | Φ(x) − Φ(z₀) + f(z₀) ≤ f(x)  (x ∈ X)}.  (1.110)

We have (see, e.g., Ekeland and Temam [54, Ch. 1, formula (5.3)]) the implication

∂f(z₀) ≠ ∅ ⇒ f(z₀) = f**(z₀).  (1.111)
If f(z₀) ∈ R, then by (1.110) and (1.95), we have Φ₀ ∈ ∂f(z₀) if and only if the equality

Φ₀(z₀) = f(z₀) + f*(Φ₀)  (1.112)

holds. Using (1.112), one deduces easily (see, e.g., Ekeland and Temam [54, Ch. 1, Corollary 5.2]) that for any function f : X → R̄ we have the implication

Φ₀ ∈ ∂f(z₀) ⇒ z₀ ∈ ∂f*(Φ₀),  (1.113)

and if f(z₀) = f**(z₀) (in particular, if ∂f(z₀) ≠ ∅), then

Φ₀ ∈ ∂f(z₀) ⇔ z₀ ∈ ∂f*(Φ₀).  (1.114)
The subdifferential at a point z₀ ∈ X with f(z₀) ∈ R can also be expressed as

∂f(z₀) = {Φ ∈ X* | f(z₀) − Φ(z₀) = min_{x∈X} {f(x) − Φ(x)}};  (1.115)

indeed,

Φ(x) − Φ(z₀) ≤ f(x) − f(z₀) (x ∈ X) ⇔ f(z₀) − Φ(z₀) ≤ f(x) − Φ(x) (x ∈ X).
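The characterization (1.115) gives a direct numerical membership test for subgradients. The sketch below (my own illustration, X = R, grid-based) tests Φ ∈ ∂f(0) for f = |·|, whose subdifferential at 0 is the interval [−1, 1]:

```python
import numpy as np

# Test membership in the subdifferential of f = |.| at z0 = 0, using (1.115):
# Phi is a subgradient at 0 iff f(0) - Phi(0) = min_x {f(x) - Phi(x)} = 0.
xs = np.linspace(-5.0, 5.0, 1001)
f = np.abs(xs)

def is_subgradient_at_0(phi, tol=1e-12):
    return abs(np.min(f - phi * xs)) <= tol
```

For |Φ| ≤ 1 the minimum of |x| − Φx is 0 (attained at x = 0), while for |Φ| > 1 the difference runs off to negative values along one of the rays, so the test fails.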
A function f : X → R̄ is said to be subdifferentiable on a subset A of X if f(A) ⊆ R and ∂f(z₀) ≠ ∅ for each z₀ ∈ A. As an example of subdifferentials, let us note that if f : X → R̄ is any function satisfying f(0) = 0, then, by (1.110), we have

∂f(0) = {Φ ∈ X* | Φ ≤ f} = supp f.  (1.116)

In particular, if X is a normed linear space and

f(x) = ‖x‖   (x ∈ X),  (1.117)

then

∂f(0) = B_{X*},  (1.118)
where B_{X*} := {Φ ∈ X* | ‖Φ‖ ≤ 1} is the unit ball of X*.

Remark 1.8. Obviously, the supremum in (1.99) is attained for some Φ₀ ∈ X* if and only if Φ₀ ∈ ∂f(z₀). Hence, in general, the supremum in (1.99) need not be attained (e.g., take any proper lower semicontinuous convex function f such that ∂f(z₀) = ∅ for some z₀ ∈ X), but if f is convex and continuous, then it is attained for some Φ₀ ∈ X* (e.g., by Theorem 1.13 below, applied to a singleton G = {z₀}).

We have the following classical theorem of Moreau–Rockafellar (see, e.g., Holmes [106, p. 25]):

Theorem 1.5. If X is a locally convex space and f, h : X → R̄ are convex functions such that one of them is continuous at some point of dom f ∩ dom h, then

∂(f + h)(x₀) = ∂f(x₀) + ∂h(x₀)   (x₀ ∈ X).  (1.119)
Remark 1.9. It is easy to see that here the + signs are just the usual sums.

We recall that if X is a linear space and f : X → R̄ is a convex function, then for any x₀ ∈ X with f(x₀) ∈ R and any x ∈ X, the limit

f′(x₀; x) := lim_{t↓0} (f(x₀ + tx) − f(x₀)) / t  (1.120)

exists in R̄, and it is called the directional derivative of f at x₀ in the direction x. By a theorem of Moreau and Pshenichnyi (see, e.g., Holmes [106, p. 27] or Laurent [129, Theorem 6.4.8]), if X is a locally convex space and f : X → R̄ is a convex function that is finite and continuous at x₀, then

f′(x₀; x) = max_{Φ∈∂f(x₀)} Φ(x)   (x ∈ X).  (1.121)
If C is any subset of a set X, the indicator function χ_C of C is defined by

χ_C(x) := 0 (x ∈ C),  χ_C(x) := +∞ (x ∈ X\C).  (1.122)

The normal cone to a subset C of a locally convex space X, at a point c₀ ∈ C, is the subset of the conjugate space X* defined by

N(C; c₀) = {Φ ∈ X* | Φ(c₀) = max Φ(C)}.  (1.123)
Note that always 0 ∈ N(C; c₀), so N(C; c₀) ≠ ∅. As another example of subdifferentials, let us mention that for any convex subset C of a locally convex space X we have

∂χ_C(c₀) = N(C; c₀)   (c₀ ∈ C).  (1.124)
If c₀ ∉ C, then ∂χ_C(c₀) = ∅; more generally, for any proper function f : X → R̄, if f(z₀) = +∞, then by (1.110), ∂f(z₀) = ∅. The extended normal cone to a set C at a point x₀ ∈ X is the subset of X* defined by

N̄(C; x₀) = {Φ ∈ X* | Φ(x₀) = max Φ(C)}.  (1.125)

In particular, for x₀ ∈ C we have, clearly, N̄(C; x₀) = N(C; x₀). Let us recall how the normal cones to the level sets S_{f(x₀)}(f) of (1.22) can be expressed with the aid of subdifferentials.

Theorem 1.6 (see, e.g., Ioffe and Tikhomirov [111, p. 217, Proposition 2]). Let X be a locally convex space, and f : X → R̄ a proper convex function, continuous at a point x₀ ∈ X, such that inf f(X) < f(x₀) < +∞. Then f is subdifferentiable at x₀ and

N(S_{f(x₀)}(f); x₀) = ∪_{λ≥0} λ ∂f(x₀).  (1.126)
For any ε > 0, the set of all ε-normal directions, or briefly, the ε-normal set, to a set C at a point c₀ ∈ C is the subset of X* defined by

N_ε(C; c₀) = {Φ ∈ X* | Φ(c₀) ≥ sup Φ(C) − ε}.  (1.127)

Obviously,

∩_{ε>0} N_ε(C; c₀) = N(C; c₀).  (1.128)
For any ε > 0, the set of all extended ε-normal directions, or briefly, the extended ε-normal set, to a set C at a point x₀ ∈ X is defined by

N̄_ε(C; x₀) = {Φ ∈ X* | Φ(x₀) ≥ sup Φ(C) − ε}.  (1.129)

In particular, for x₀ ∈ C we have, clearly, N̄_ε(C; x₀) = N_ε(C; x₀). For any ε > 0, the ε-subdifferential of a function f : X → R̄ at a point z₀ ∈ X with f(z₀) ∈ R is the subset ∂_ε f(z₀) of X* defined by

∂_ε f(z₀) := {Φ ∈ X* | Φ(x) − Φ(z₀) ≤ f(x) − f(z₀) + ε  (x ∈ X)},  (1.130)
which is always nonempty. Clearly,

∩_{ε>0} ∂_ε f(z₀) = ∂_0 f(z₀) = ∂f(z₀).  (1.131)
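Definition (1.130) is also easy to test numerically. In the sketch below (an illustration of my own), f(x) = x² and z₀ = 0, where a direct computation gives ∂_ε f(0) = [−2√ε, 2√ε]:

```python
import numpy as np

# eps-subdifferential test at z0 = 0 for f(x) = x^2, using (1.130):
# Phi belongs to it iff Phi*x <= x^2 + eps for all x, i.e. |Phi| <= 2*sqrt(eps).
xs = np.linspace(-50.0, 50.0, 100001)
f = xs**2
eps = 0.25   # then the eps-subdifferential at 0 should be [-1, 1]

def in_eps_subdiff_at_0(phi, tol=1e-9):
    return bool(np.all(phi * xs <= f + eps + tol))
```

Unlike ∂f(0) = {0}, the ε-subdifferential is a whole interval even at a smooth minimum, which is exactly what makes it useful in the ε-normal set formula of Theorem 1.7 below.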
For the ε-normal sets to S_{f(x₀)}(f), we have the following theorem:

Theorem 1.7 (see Hiriart-Urruty and Lemaréchal [104, Ch. XI, Corollary 3.6.2]). Let X = Rⁿ, f : X → R a finite convex function, x₀ ∈ X such that inf f(X) < f(x₀), and ε > 0. Then

N_ε(S_{f(x₀)}(f); x₀) = ∪_{λ>0} λ ∂_{ε/λ} f(x₀).  (1.132)
We recall that if X is a linear space, a function f : X → R̄ is said to be quasi-convex if

f(λx₁ + (1 − λ)x₂) ≤ max{f(x₁), f(x₂)}   (x₁, x₂ ∈ X, 0 < λ < 1);  (1.133)
it is well known and easy to see that this happens if and only if all level sets S_d(f) (d ∈ R) of (1.22) are convex or, equivalently, all level sets A_d(f) of (1.23) are convex. Clearly, every convex function is quasi-convex, but the converse is not true. A function f : X → R̄ is said to be quasi-concave if the function −f is quasi-convex.

For any function f : X → R̄ on a linear space X we shall denote by f_q the quasi-convex hull of f, that is, the greatest quasi-convex minorant of f (i.e., the greatest quasi-convex function majorized by f). When X is a locally convex space, a function f : X → R̄ is quasi-convex and lower semicontinuous if and only if all level sets S_d(f) (d ∈ R) are closed and convex. For any function f : X → R̄ on a locally convex space X we shall denote by f̄_q the lower semicontinuous quasi-convex hull of f, i.e., the greatest lower semicontinuous quasi-convex minorant of f. We recall that for any function f : X → R̄ we have (e.g., by (1.153) below, applied to the polarity of (1.189) below)

f̄_q(x) = inf_{d∈R, x ∈ cl co S_d(f)} d = inf_{d∈R, x ∈ cl co A_d(f)} d
       = sup_{Φ∈X*} sup_{d∈R, Φ(x)>d} inf_{y∈X, Φ(y)>d} f(y)
       = sup_{Φ∈X*} inf_{y∈X, Φ(y)>Φ(x)−1} f(y)   (x ∈ X).  (1.134)
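In X = R the hull formula becomes completely concrete: up to positive scaling the only functionals are Φ = ±1, so the sup-inf expression reduces to running minima from the left and from the right. The sketch below (my own grid illustration, not from the text) computes the quasi-convex hull of a double-well function this way:

```python
import numpy as np

# 1-D quasi-convex hull on a grid:
#   f_q(x) = max( inf_{y <= x} f(y), inf_{y >= x} f(y) ),
# i.e. the max of a left running minimum and a right running minimum.
xs = np.linspace(-2.0, 2.0, 401)
f = (xs**2 - 1.0)**2                               # double well, not quasi-convex

left_min = np.minimum.accumulate(f)                # inf over y <= x
right_min = np.minimum.accumulate(f[::-1])[::-1]   # inf over y >= x
fq = np.maximum(left_min, right_min)
```

The result is the largest "decrease-then-increase" minorant of f: it lies below f, vanishes on the whole plateau between the two wells, and its level sets are intervals, as quasi-convexity requires.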
When X is a locally convex space, a function f : X → R̄ is said to be evenly quasi-convex if all level sets S_d(f) (d ∈ R) of (1.22) are evenly convex. For any function f : X → R̄ we shall denote by f_eq the evenly quasi-convex hull of f, i.e., the greatest evenly quasi-convex minorant of f. We recall that (e.g., by (1.153) below, applied to the polarity of (1.191) below) for any function f : X → R̄ we have

f_eq(x) = inf_{d∈R, x ∈ eco S_d(f)} d = inf_{d∈R, x ∈ eco A_d(f)} d
        = sup_{Φ∈X*} inf_{y∈X, Φ(y)≥Φ(x)} f(y)   (x ∈ X).  (1.135)
A function f : X → R̄ on a locally convex space X is said to be evenly quasi-coaffine if all level sets S_d(f) (d ∈ R) of (1.22) are evenly coaffine. For any function f : X → R̄ we shall denote by f_qca the evenly quasi-coaffine hull of f, i.e., the greatest evenly quasi-coaffine minorant of f. We recall that (e.g., by (1.153) below, applied to the polarity of (1.193) below) for any function f : X → R̄ we have

f_qca(x) = sup_{Φ∈X*} sup_{d∈R, Φ(x)=d} inf_{y∈X, Φ(y)=d} f(y) = sup_{Φ∈X*} inf_{y∈X, Φ(y)=Φ(x)} f(y)   (x ∈ X).  (1.136)
Finally, let us recall the following so-called minimax theorem (actually, an inf-sup theorem) of Sion–Kneser–Fan (Sion [261, Theorem 4.2']):

Theorem 1.8. Let M be a set, N a compact topological space, and f : M × N → R a finite-valued function that is "concavelike" on M (i.e., for every x₁, x₂ ∈ M and 0 < η < 1 there exists x ∈ M such that f(x, y) ≥ ηf(x₁, y) + (1 − η)f(x₂, y) for all y ∈ N), "convexlike" on N (i.e., for every y₁, y₂ ∈ N and 0 < η < 1 there exists y ∈ N such that f(x, y) ≤ ηf(x, y₁) + (1 − η)f(x, y₂) for all x ∈ M), and such that f(x, ·) is lower semicontinuous on N for each x ∈ M. Then

sup_{x∈M} inf_{y∈N} f(x, y) = inf_{y∈N} sup_{x∈M} f(x, y).  (1.137)
We shall also use the following "inf-sup theorem" of Moreau (see [162, Corollary]):

Theorem 1.9. Let C be a convex subset of a linear space E, and D a weakly compact convex subset of a locally convex space F. Furthermore, let φ : C × D → (−∞, +∞] be a mapping such that for each y ∈ D the function φ(·, y) is concave on C, and for each x ∈ C the function φ(x, ·) is convex and lower semicontinuous on D. Then

sup_{x∈C} min_{y∈D} φ(x, y) = min_{y∈D} sup_{x∈C} φ(x, y).  (1.138)
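Both inf-sup theorems can be sanity-checked on grids when a saddle point lies on the grid. The sketch below is my own toy example (not from the text): φ(x, y) = xy + y² is linear (hence concave) in x on C = [0, 1] and convex in y on D = [−1, 1], with a saddle at (0, 0):

```python
import numpy as np

# Grid evaluation of phi(x, y) = x*y + y^2, concave in x, convex in y.
X = np.linspace(0.0, 1.0, 11)
Y = np.linspace(-1.0, 1.0, 21)
P = X[:, None] * Y[None, :] + Y[None, :]**2   # P[i, j] = phi(X[i], Y[j])

sup_min = np.max(np.min(P, axis=1))   # sup over x of min over y
min_sup = np.min(np.max(P, axis=0))   # min over y of sup over x
```

Because the saddle point (0, 0) belongs to both grids, the two iterated optima coincide (both equal 0); for an arbitrary function with no saddle point on the grid, only the inequality sup min ≤ min sup would be guaranteed.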
1.2 Some preliminaries from abstract convex analysis

Let us present now some elements of abstract convex analysis, which will be used in the sequel; some additional elements will be given in Chapter 9. Proofs can be found in [254]. There are two very useful tools for defining the dual problem to a primal optimization problem. The first one is the concept of a polarity between families of subsets, which gives a connection between subsets of a set X and subsets of another set W. Namely, if X and W are two arbitrary sets (which we shall assume nonempty, without any special mention), a mapping Δ : 2^X → 2^W (where 2^X denotes the family of all subsets of X) is called a polarity if for any index set I we have
Δ(∪_{i∈I} C_i) = ∩_{i∈I} Δ(C_i)   ({C_i}_{i∈I} ⊆ 2^X),  (1.139)

or, equivalently,

Δ(C) = ∩_{c∈C} Δ({c})   (C ⊆ X),  (1.140)

with the usual conventions

∪_{i∈∅} C_i = ∅,  ∩_{i∈∅} Δ(C_i) = W.  (1.141)
Remark 1.10. (a) In the above, for a function f : X → Y and a set C ⊆ X we have set, as usual, f(C) := {f(c) | c ∈ C} (thus, for example, in Lemma 1.8, inf Φ(C) = inf_{c∈C} Φ(c) and sup Φ(C) = sup_{c∈C} Φ(c)); this should lead to no confusion with the fact that for a polarity Δ : 2^X → 2^W and a set C ⊆ X, Δ(C) is the set ∩_{c∈C} Δ({c}) (by (1.140)), since Δ is defined only at subsets of X, not at elements c ∈ X.
(b) In our previous papers, as well as in [254], we have used (following Evers and van Maaren [65]) the term "duality" instead of polarity. However, here we shall adopt the term "polarity" (which is also used by several authors; see, e.g., Pickert [179]), in order to avoid overlapping with subsequent terms like "theorem of weak duality," "theorem of strong duality," etc.

There are many natural examples of polarities. For example, if X is a locally convex space and W = X*, the conjugate space of X, then the mapping Δ : 2^X → 2^{X*} defined by

Δ(C) = C°   (C ⊆ X),  (1.142)
with C° of (1.82), is a polarity. Clearly, every polarity Δ is antitone (i.e., C₁ ⊆ C₂ implies Δ(C₂) ⊆ Δ(C₁)). The dual of a polarity Δ, i.e., the mapping Δ′ : 2^W → 2^X defined by

Δ′(S) := {x ∈ X | S ⊆ Δ({x})}   (S ⊆ W),  (1.143)

is again a polarity, and we have the equivalence

S ⊆ Δ(C) ⇔ C ⊆ Δ′(S)   (C ⊆ X, S ⊆ W),  (1.144)

whence Δ″ := (Δ′)′ = Δ. For any C ⊆ X, the set

Δ′Δ(C) := Δ′(Δ(C)) ⊆ X  (1.145)

is called the Δ′Δ-convex hull of C. The mapping Δ′Δ : 2^X → 2^X is a "hull operator," i.e., for any set C ⊆ X we have

C ⊆ Δ′Δ(C),  (1.146)
Δ′ΔΔ′Δ(C) = Δ′Δ(C),  (1.147)
C ⊆ C′ ⇒ Δ′Δ(C) ⊆ Δ′Δ(C′).  (1.148)
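Properties (1.144) and (1.146)-(1.148) hold for any polarity induced by a relation between two finite sets, which makes them easy to verify by brute force. The toy model below is my own (the sets X, W and the relation R are arbitrary choices, not from the text):

```python
from itertools import chain, combinations

# Any relation R between X and W induces a polarity
#   Delta(C)  = {w in W : (c, w) in R for all c in C},
# whose dual is
#   Delta'(S) = {x in X : (x, w) in R for all w in S}.
X = {0, 1, 2, 3}
W = {'a', 'b', 'c'}
R = {(0, 'a'), (0, 'b'), (1, 'a'), (2, 'b'), (2, 'c'), (3, 'c')}

def delta(C):
    return frozenset(w for w in W if all((c, w) in R for c in C))

def delta_dual(S):
    return frozenset(x for x in X if all((x, w) in R for w in S))

def hull(C):
    # The Delta'Delta-convex hull of C, cf. (1.145).
    return delta_dual(delta(C))

def subsets(s):
    s = list(s)
    return [frozenset(t)
            for t in chain.from_iterable(combinations(s, r) for r in range(len(s) + 1))]
```

Enumerating all subsets confirms the Galois-connection equivalence (1.144) and the three hull-operator laws; in this finite setting "Δ′Δ-convex" simply means being an intersection of the columns Δ′({w}).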
A set C ⊆ X is called Δ′Δ-convex if C = Δ′Δ(C). This happens if and only if for each x ∈ X\C there exists w = w_x ∈ W such that

C ⊆ Δ′({w}),  x ∈ X\Δ′({w}),  (1.149)

that is, C and each outside point x ∈ X\C can be "separated" by a set of the form Δ′({w}), where w ∈ W, or equivalently, C is an intersection of a family of subsets of X of the form Δ′({w}), where w ∈ W. By (1.143) applied to S = {w}, we have

Δ′({w}) = {x ∈ X | w ∈ Δ({x})}   (w ∈ W),  (1.150)

so C ⊆ X is Δ′Δ-convex if and only if for each x ∈ X\C there exists w = w_x ∈ W such that

w ∈ ∩_{c∈C} Δ({c}) = Δ(C),  w ∉ Δ({x}).  (1.151)
Remark 1.11. Instead of the term "Δ′Δ-convex" of the language of "abstract convex analysis," sometimes the language of general topology has also been used in the literature. Namely, (1.146)–(1.148) mean that Δ′Δ is a "Moore–Smith closure operator," and thus the Δ′Δ-convexity of C can also be expressed by saying that C is "Moore–Smith closed" for the operator Δ′Δ. However, in the sequel we shall use only the language of abstract convex analysis, since it will be convenient for applications, e.g., in dealing with "Δ′Δ-quasi-convex" functions.

A function f : X → R̄ is called Δ′Δ-quasi-convex if all level sets S_d(f) (d ∈ R) of (1.22) are Δ′Δ-convex, that is, if for each d ∈ R and x ∈ X\S_d(f) there exists w = w_{d,x} ∈ W such that

S_d(f) ⊆ Δ′({w}),  x ∈ X\Δ′({w}).  (1.152)

For any function f : X → R̄ we shall denote by f_{q(Δ′Δ)} the Δ′Δ-quasi-convex hull (i.e., the greatest Δ′Δ-quasi-convex minorant) of f. We have (see, e.g., [254, p. 301, formulas (8.265) and (8.262)])

f_{q(Δ′Δ)}(x) = inf_{d∈R, x ∈ Δ′Δ(S_d(f))} d = sup_{w∈W, x ∈ X\Δ′({w})} inf_{y ∈ X\Δ′({w})} f(y)   (x ∈ X).  (1.153)
In the sequel we shall be interested in polarities for the case in which X is a locally convex space and W = X*\{0} or W = (X*\{0}) × R. For the first case, let G be a subset of X. We mention now some special polarities Δ^i = Δ_G^i : 2^X → 2^{X*\{0}} (i = 1, 2, 3, 4), depending on G.
(1) Let us first consider the polarity Δ = Δ_G^1 : 2^X → 2^{X*\{0}} defined by

Δ_G^1(C) := {Φ ∈ X*\{0} | Φ(c) < sup Φ(G) (c ∈ C)}   (C ⊆ X).  (1.154)

For this polarity we have, by (1.150),

(Δ_G^1)′({Φ}) = {x ∈ X | Φ(x) < sup Φ(G)}   (Φ ∈ X*\{0}).  (1.155)
Lemma 1.10. (a) For any set G the polarity Δ = Δ_G^1 satisfies

Δ^1_{{g}}({g}) = ∅   (g ∈ G),  (1.156)
(X*\{0})\Δ_G^1(G) = {Φ ∈ X*\{0} | ∃g ∈ G, Φ(g) = sup Φ(G)}.  (1.157)

(b) The set G is (Δ_G^1)′Δ_G^1-convex if and only if for each x ∈ X\G there exists Φ = Φ_x ∈ X*\{0} such that

Φ(g) < sup Φ(G) ≤ Φ(x)   (g ∈ G).  (1.158)

Hence, if G is (Δ_G^1)′Δ_G^1-convex, then it is evenly convex.
(c) A function f : X → R̄ is (Δ_G^1)′Δ_G^1-quasi-convex if and only if for each d ∈ R and x ∈ X\S_d(f) there exists Φ = Φ_{d,x} ∈ X*\{0} such that

Φ(y) < sup Φ(G) ≤ Φ(x)   (y ∈ S_d(f)).  (1.159)

Consequently, if f is (Δ_G^1)′Δ_G^1-quasi-convex, then it is evenly quasi-convex.
Consequently, if f is (A^jYA^-quasi-convex, then it is evenly quasi-convex. Proof (a) By (1.154), A|^j({g}) = {O e X*\{0}| <^(g) < <^(g)] = 0 (g e G). Also, by (1.154) we have C A ^ ( G ) = {O G X * \ { 0 } | 3g e G, 4)(g) > supO(G)}, whence (1.157). (b) G is (A^)^AJ^-convex if and only if for A = A^, (1.149) holds with C = G, w =
(C c X).
(1.160)
(O e X*\{0}).
(1.161)
For this polarity we have, by (1.150), (AlY(W)
= {xeX\ 0(x) < sup cI>(G)}
Lemma 1.11. (a) For any set G the polarity Δ = Δ_G^2 satisfies

Δ^2_{{g}}({g}) = X*\{0}   (g ∈ G),  (1.162)
Δ_G^2(G) = X*\{0},  (1.163)
(Δ_G^2)′Δ_G^2(G) = cl co G.  (1.164)

Consequently, G is (Δ_G^2)′Δ_G^2-convex if and only if it is closed and convex.
(b) A function f : X → R̄ is (Δ_G^2)′Δ_G^2-quasi-convex if and only if for each d ∈ R and x ∈ X\S_d(f) there exists Φ = Φ_{d,x} ∈ X*\{0} such that

sup Φ(S_d(f)) ≤ sup Φ(G) < Φ(x).  (1.165)

Consequently, if f is (Δ_G^2)′Δ_G^2-quasi-convex, then it is lower semicontinuous and quasi-convex.
Proof. (a) By (1.160), we have Δ^2_{{g}}({g}) = {Φ ∈ X*\{0} | Φ(g) ≤ Φ(g)} = X*\{0} (g ∈ G) and Δ_G^2(G) = {Φ ∈ X*\{0} | sup Φ(G) ≤ sup Φ(G)} = X*\{0}. By (1.163) and the expression of cl co G given in [254], formula (2.131), we obtain

(Δ_G^2)′Δ_G^2(G) = (Δ_G^2)′(X*\{0}) = {x ∈ X | Φ(x) ≤ sup Φ(G) (Φ ∈ X*\{0})} = cl co G,

which proves (1.164). Consequently, G is (Δ_G^2)′Δ_G^2-convex if and only if G = (Δ_G^2)′Δ_G^2(G) = cl co G.
(b) A function f : X → R̄ is (Δ_G^2)′Δ_G^2-quasi-convex if and only if for each d ∈ R and x ∈ X\S_d(f) there exists Φ = Φ_{d,x} ∈ X*\{0} satisfying (1.152) for Δ = Δ_G^2, that is, (1.165). Hence, if this condition is satisfied, then each level set S_d(f) (d ∈ R) is closed and convex, that is, f is lower semicontinuous and quasi-convex. □

(3) Let us consider now the polarity Δ = Δ_G^3 : 2^X → 2^{X*\{0}} defined by

Δ_G^3(C) := {Φ ∈ X*\{0} | sup Φ(G) ∉ Φ(C)}   (C ⊆ X).  (1.166)
For the polarity Δ = Δ_G^3 we have

(Δ_G^3)′({Φ}) = {x ∈ X | Φ(x) ≠ sup Φ(G)}   (Φ ∈ X*\{0}).  (1.167)
Lemma 1.12. (a) For any set G the polarity Δ = Δ_G^3 satisfies

Δ^3_{{g}}({g}) = ∅   (g ∈ G),  (1.168)
Δ_G^1(C) ⊆ Δ_G^3(C)   (C ⊆ X),  (1.169)
Δ_G^1(G) = Δ_G^3(G),  (1.170)
(Δ_G^1)′({Φ}) ⊆ (Δ_G^3)′({Φ})   (Φ ∈ X*\{0}).  (1.171)

(b) G is (Δ_G^3)′Δ_G^3-convex if and only if for each x ∈ X\G there exists Φ = Φ_x ∈ X*\{0} such that

Φ(g) < sup Φ(G) = Φ(x)   (g ∈ G).  (1.172)

Consequently, if G is (Δ_G^3)′Δ_G^3-convex, then it is (Δ_G^1)′Δ_G^1-convex.
(c) A function f : X → R̄ is (Δ_G^3)′Δ_G^3-quasi-convex if and only if for each d ∈ R and x ∈ X\S_d(f) there exists Φ = Φ_{d,x} ∈ X*\{0} such that

Φ(x) = sup Φ(G) ∉ Φ(S_d(f)).  (1.173)

Consequently, if f is (Δ_G^3)′Δ_G^3-quasi-convex, then it is evenly quasi-coaffine.

Proof. (a) By (1.166), we have Δ^3_{{g}}({g}) = {Φ ∈ X*\{0} | Φ(g) ∉ {Φ(g)}} = ∅ (g ∈ G); the relations (1.169)–(1.171) follow directly from (1.154), (1.166), (1.155), and (1.167).
(b) By (1.167), condition (1.172) means that we have

G ⊆ (Δ_G^3)′({Φ}),  x ∈ X\(Δ_G^3)′({Φ}),

that is, for Δ = Δ_G^3 we have (1.149) with C = G, w = Φ, which yields the first assertion. Furthermore, by (1.171) and (1.170), we have

G ⊆ (Δ_G^1)′Δ_G^1(G) = (Δ_G^1)′Δ_G^3(G) ⊆ (Δ_G^3)′Δ_G^3(G),  (1.174)

whence the second assertion follows.
(c) A function f : X → R̄ is (Δ_G^3)′Δ_G^3-quasi-convex if and only if for each d ∈ R and x ∈ X\S_d(f) there exists Φ = Φ_{d,x} ∈ X*\{0} satisfying (1.152) for Δ = Δ_G^3, w = Φ, that is, (1.173). Hence, by the definition of evenly quasi-coaffine functions, we obtain the second statement. □

Corollary 1.5. If a set G ⊆ X is (Δ_G^3)′Δ_G^3-convex and

Δ_G^3(G) = X*\{0},  (1.175)

then we have

G = {x ∈ X | Φ(x) < sup Φ(G) (Φ ∈ X*\{0})}.  (1.176)

Proof. By (1.166), we have (1.175) if and only if for each Φ ∈ X*\{0} we have Φ(g) < sup Φ(G) (g ∈ G), that is, if and only if

G ⊆ {x ∈ X | Φ(x) < sup Φ(G) (Φ ∈ X*\{0})}.  (1.177)

If also G is (Δ_G^3)′Δ_G^3-convex, then by Lemma 1.12(b), we have the opposite inclusion as well, and hence the equality (1.176). □

Remark 1.12. By (1.176), the set G is an intersection of open half-spaces; i.e., it is evenly convex.

Corollary 1.6. For a set G ⊆ X, let us consider the following statements:
1°. G is (Δ_G^3)′Δ_G^3-convex and

Δ_G^1(G) = X*\{0}.  (1.178)

2°. We have

(Δ_G^1)′(X*\{0}) = G.  (1.179)

3°. We have (1.176).
4°. G is (Δ_G^1)′Δ_G^1-convex, and we have (1.178).
Then

1° ⇒ 2° ⇔ 3° ⇔ 4°.  (1.180)
1.2 Some preliminaries from abstract convex analysis
Proof. 1° ⇒ 3°. By Δ^5_G(G) = Δ^4_G(G) (see (1.170)), formula (1.178) implies (1.175), which, by Corollary 1.5, implies (1.176).

2° ⇔ 3°. By (1.143) and (1.154), we have

(Δ^5_G)'(X*\{0}) = {x ∈ X | Φ(x) < sup Φ(G) (Φ ∈ X*\{0})},   (1.181)

whence the equivalence 2° ⇔ 3° follows.

3° ⇒ 4°. If (1.176) holds, then we have (1.177), and for each x ∈ ∁G there exists Φ = Φ_x ∈ X*\{0} satisfying (1.172); hence, by Lemma 1.12(b), G is (Δ^5_G)'Δ^5_G-convex, and by (1.177) and (1.170) we also obtain (1.178). □

(3) Let us consider now the polarity Δ = Δ^6_G : 2^X → 2^{X*\{0}} defined by

Δ^6_G(C) := {Φ ∈ X*\{0} | Φ(C) ⊆ Φ(G)}   (C ⊆ X).   (1.182)

For this polarity we have

(Δ^6_G)'({Φ}) = {x ∈ X | Φ(x) ∈ Φ(G)}   (Φ ∈ X*\{0}).   (1.183)
Lemma 1.13. (a) For any set G ⊆ X the polarity Δ = Δ^6_G satisfies

Δ^6_G({g}) = X*\{0}   (g ∈ G),   (1.184)

Δ^6_G(C) ⊆ Δ^5_G(C)   (C ⊆ X),   (1.185)

Δ^6_G(G) = X*\{0},   (1.186)

(Δ^6_G)'Δ^6_G(G) = {x ∈ X | Φ(x) ∈ Φ(G) (Φ ∈ X*\{0})}.   (1.187)

Consequently, the set G is (Δ^6_G)'Δ^6_G-convex if and only if it is evenly coaffine.

(b) A function f : X → R̄ is (Δ^6_G)'Δ^6_G-quasi-convex if and only if for each d ∈ R and x ∈ ∁S_d(f) there exists Φ = Φ_{d,x} ∈ X*\{0} such that

Φ(S_d(f)) ⊆ Φ(G),   Φ(x) ∉ Φ(G).   (1.188)

Consequently, if f is (Δ^6_G)'Δ^6_G-quasi-convex, then it is evenly quasi-coaffine.

Proof. (a) By (1.182), we have Δ^6_G({g}) = {Φ ∈ X*\{0} | Φ(g) ∈ Φ(G)} = X*\{0} (g ∈ G) and Δ^6_G(G) = {Φ ∈ X*\{0} | Φ(G) ⊆ Φ(G)} = X*\{0}. Also, formula (1.185) is obvious from (1.182) and (1.166). Next, by (1.186) and (1.183), we obtain (1.187). Furthermore, by (1.187), G is (Δ^6_G)'Δ^6_G-convex if and only if for each x ∈ ∁G there exists Φ = Φ_x ∈ X*\{0} such that Φ(x) ∉ Φ(G), i.e., if and only if G is evenly coaffine.

(b) A function f : X → R̄ is (Δ^6_G)'Δ^6_G-quasi-convex if and only if for each d ∈ R and x ∈ ∁S_d(f) there exists Φ = Φ_{d,x} ∈ X*\{0} satisfying (1.152) for Δ = Δ^6_G, w = Φ, that is, (1.188). Hence, by the definition of evenly quasi-coaffine functions, we obtain the last statement. □
Finally, let us mention now some special polarities Δ : 2^X → 2^{(X*\{0})×R} that do not depend on a subset G of X.

(1) For the polarity Δ^{11} : 2^X → 2^{(X*\{0})×R} defined by

Δ^{11}(C) := {(Φ, d) ∈ (X*\{0}) × R | sup Φ(C) ≤ d}   (C ⊆ X),   (1.189)

we have

(Δ^{11})'Δ^{11}(C) = co C   (C ∈ 2^X),   f_{q((Δ^{11})'Δ^{11})} = f_q   (f ∈ R̄^X).   (1.190)

(2) For the polarity Δ^{12} : 2^X → 2^{(X*\{0})×R} defined by

Δ^{12}(C) := {(Φ, d) ∈ (X*\{0}) × R | Φ(c) < d (c ∈ C)}   (C ⊆ X),   (1.191)

we have

(Δ^{12})'Δ^{12}(C) = eco C   (C ∈ 2^X),   f_{q((Δ^{12})'Δ^{12})} = f_{eq}   (f ∈ R̄^X).   (1.192)

(3) For the polarity Δ^{13} : 2^X → 2^{(X*\{0})×R} defined by

Δ^{13}(C) := {(Φ, d) ∈ (X*\{0}) × R | Φ(c) ≠ d (c ∈ C)}   (C ⊆ X),   (1.193)

we have

(Δ^{13})'Δ^{13}(C) = eca C   (C ∈ 2^X),   f_{q((Δ^{13})'Δ^{13})} = f_{qca}   (f ∈ R̄^X).   (1.194)

The above expressions of co C, eco C, eca C and f_q, f_{eq}, f_{qca} with the aid of the polarities Δ^{11}, Δ^{12}, and Δ^{13} depend on two parameters, Φ ∈ X*\{0} and d ∈ R. However, the sets C ⊆ X with 0 ∈ C and the functions f : X → R̄ satisfying

f(0) = inf f(X\{0})   (1.195)

admit expressions of co C, eco C, and f_q, f_{eq} with the aid of simpler polarities, depending only on one parameter Φ ∈ X*\{0}.

(4) For the polarity Δ^{01} : 2^X → 2^{X*\{0}} defined by

Δ^{01}(C) := {Φ ∈ X*\{0} | sup Φ(C) ≤ 1}   (C ⊆ X),   (1.196)

we have

co C = (Δ^{01})'Δ^{01}(C)   (C ⊆ X, 0 ∈ C),   (1.197)

f_q(x) = f_{q((Δ^{01})'Δ^{01})}(x)   (f ∈ R̄^X, f(0) = inf f(X\{0}), x ∈ X\{0}).   (1.198)

Note that for any C ⊆ X the set Δ^{01}(C) ∪ {0} ⊆ X* (with Δ^{01} of (1.196)) is the usual polar C° of C (see (1.82)).
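The identity Δ^{01}(C) ∪ {0} = C° can be checked numerically; the following is a minimal sketch (not from the text) in which C is the square [−1, 1]² in R², functionals are represented by vectors, and sup Φ(C) is evaluated over the vertices of C, which suffices for linear functionals.

```python
import itertools

# Treat a linear functional Phi on R^2 as a vector p: Phi(x) = p . x.
def phi(p, x):
    return p[0] * x[0] + p[1] * x[1]

# Vertices of C = [-1, 1]^2; sup Phi(C) is attained at a vertex.
C_vertices = list(itertools.product([-1.0, 1.0], repeat=2))

def sup_over_C(p):
    return max(phi(p, v) for v in C_vertices)

# Delta^01(C) = {Phi != 0 | sup Phi(C) <= 1}; adding 0 gives the polar C°.
def in_polar(p):
    return sup_over_C(p) <= 1.0

# For the unit square, C° is the diamond {p : |p1| + |p2| <= 1}.
samples = [(0.5, 0.4), (0.5, 0.6), (1.0, 0.0), (0.7, -0.2), (0.8, 0.3)]
results = {p: in_polar(p) for p in samples}
```

For the square, sup Φ(C) = |p₁| + |p₂|, so the membership tests above recover the expected diamond shape of the polar.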
(5) For the polarity Δ^{02} : 2^X → 2^{X*\{0}} defined by

Δ^{02}(C) := {Φ ∈ X*\{0} | Φ(c) < 1 (c ∈ C)}   (C ⊆ X),   (1.199)

we have

eco C = (Δ^{02})'Δ^{02}(C)   (C ⊆ X, 0 ∈ C),   (1.200)

f_{eq}(x) = f_{q((Δ^{02})'Δ^{02})}(x)   (f ∈ R̄^X, f(0) = inf f(X\{0}), x ∈ X\{0}).   (1.201)
The second important tool for defining dual problems to a primal optimization problem, which gives a connection between functions on a set X and functions on another set W, is the following generalization of the conjugate (1.95): given two sets X and W and a ("coupling") function φ : X × W → R̄, the Fenchel–Moreau conjugate function of a function f : X → R̄ (with respect to φ) is the function f^{c(φ)} : W → R̄ defined by

f^{c(φ)}(w) := sup_{y∈X} {φ(y, w) ∔ −f(y)}   (w ∈ W),   (1.202)

where ∔ denotes the lower addition on R̄ (see (1.83), (1.84)).

Theorem 1.10. For a mapping c : f ∈ R̄^X → f^c ∈ R̄^W, where R̄^X denotes the set of all functions f : X → R̄, there exists a coupling function φ : X × W → R̄ such that f^c = f^{c(φ)} (of (1.202)) for all f ∈ R̄^X if and only if c satisfies the following two conditions: for any index set I (including the empty set ∅, with the usual conventions inf ∅ = +∞ and sup ∅ = −∞),

(inf_{i∈I} f_i)^c = sup_{i∈I} f_i^c   ({f_i}_{i∈I} ⊆ R̄^X),   (1.203)

(a ∔ f)^c = −a ∔ f^c   (f ∈ R̄^X, a ∈ R̄);   (1.204)

moreover, φ is uniquely determined by c.

Proof. See [237] or [254], Chapter 8. □
Any mapping c : f ∈ R̄^X → f^c ∈ R̄^W satisfying (1.203) and (1.204) is called ([237], [254]) a conjugation. In the sequel we shall use only the case W ⊆ R̄^X, i.e., where W is a set of functions w : X → R̄, and φ : X × W → R̄ is the "natural coupling function" associated with W, defined by

φ(x, w) := w(x)   (x ∈ X, w ∈ W),   (1.205)

which is apparently a particular case, but in fact, it turns out to be equivalent (from the point of view of conjugations): given two sets X and W ⊆ R̄^X, the Fenchel–Moreau conjugate function of a function f : X → R̄ (with respect to W) is the function f* : W → R̄ defined by
f*(w) := sup_{y∈X} {w(y) ∔ −f(y)}   (w ∈ W).   (1.206)
The Fenchel–Moreau biconjugate of a function f : X → R̄ (with respect to W) is the function f** : X → R̄ defined by

f**(x) := sup_{w∈W} {w(x) ∔ −f*(w)}   (x ∈ X).   (1.207)

By (1.207), (1.86), and (1.90), for any function f : X → R̄ we have

f**(x) = sup_{w∈W} {w(x) ∔ −f*(w)} = sup_{w∈W} {w(x) ∔ −sup_{y∈X} [w(y) ∔ −f(y)]}
= sup_{w∈W} {w(x) ∔ inf_{y∈X} −[w(y) ∔ −f(y)]}
= sup_{w∈W} inf_{y∈X} {[f(y) ∔ −w(y)] ∔ w(x)}   (x ∈ X),   (1.208)

whence

f** ≤ f   (f ∈ R̄^X).   (1.209)
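On finite samples, the conjugate (1.206) and the biconjugate (1.207) can be computed directly. The sketch below is illustrative only: the grids are arbitrary, W is taken to be finitely many linear functions w(x) = ax, and all values are finite, so the lower addition ∔ reduces to ordinary addition; it exhibits the inequality f** ≤ f of (1.209).

```python
# Fenchel-Moreau conjugation with respect to a finite set W of
# elementary functions w: X -> R, on a finite sample of X (a sketch;
# the grids and the choice W = linear functions are illustrative only).
X = [i / 10.0 for i in range(-30, 31)]          # sample of the real line
W = [lambda x, a=a: a * x for a in
     [i / 10.0 for i in range(-30, 31)]]        # w(x) = a*x, slopes a

f = lambda x: x * x                              # a convex objective

def conj(f):                                     # f*(w) = sup_y {w(y) - f(y)}
    return {i: max(w(y) - f(y) for y in X) for i, w in enumerate(W)}

def biconj(f):                                   # f**(x) = sup_w {w(x) - f*(w)}
    fstar = conj(f)
    return {x: max(W[i](x) - fstar[i] for i in range(len(W))) for x in X}

fss = biconj(f)
# The general inequality f** <= f of (1.209) holds at every sample point.
gap = max(fss[x] - f(x) for x in X)
```

Since f(x) = x² is already convex and the slope grid contains the supporting slopes, f** coincides with f at many grid points, while in general only f** ≤ f is guaranteed.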
Remark 1.13. (a) Actually, a more precise notation would have to specify also that f* and f** are understood with respect to the given W ⊆ R̄^X; however, the above notation will lead to no confusion, since W will always be clear from the context.

(b) Let us show the equivalence of the above two concepts. Clearly, in the particular case of W ⊆ R̄^X and φ of (1.205), formula (1.202) reduces to (1.206). In the converse direction, given any pair of sets (X, W) and any coupling function φ : X × W → R̄, for each w ∈ W one can define a function w̃ = w̃_φ : X → R̄ by

w̃(x) := φ(x, w)   (x ∈ X),   (1.210)

and hence a set W̃ = W̃_φ ⊆ R̄^X by

W̃ := {w̃ | w ∈ W} = {φ(·, w) | w ∈ W}.   (1.211)

The mapping w ∈ W → w̃ ∈ W̃ defined by (1.210) need not be one-to-one. Indeed, for example, if X is a locally convex space, W = X* × R, and φ : X × W → R̄ is the coupling function defined by

φ(x, (Φ, d)) := −χ_{{y∈X | Φ(y)≥d}}(x)   (x ∈ X, Φ ∈ X*, d ∈ R),   (1.212)

then, by (1.202),

f^{c(φ)}(Φ, d) = sup_{x∈X} {−χ_{{y∈X | Φ(y)≥d}}(x) ∔ −f(x)} = sup_{y∈X, Φ(y)≥d} (−f(y)) = −inf_{y∈X, Φ(y)≥d} f(y)   (Φ ∈ X*, d ∈ R),   (1.213)
which is (modulo an inessential additive term +d) the so-called quasi-conjugate of f, in the sense of Greenberg and Pierskalla [95], which plays an important role in duality for quasi-convex optimization; then, since

−χ_{{y∈X | μΦ(y)≥μd}} = −χ_{{y∈X | Φ(y)≥d}}   (μ > 0),

we have (μw)~ = w̃ for all w = (Φ, d) ∈ W = X* × R, μ > 0, so the mapping w → w̃ is not one-to-one. Nevertheless, for any coupling function φ : X × W → R̄ we have the implication

w₁, w₂ ∈ W, w̃₁ = w̃₂ ⇒ f^{c(φ)}(w₁) = f^{c(φ)}(w₂),   (1.214)

since

sup_{x∈X} {φ(x, w₁) ∔ −f(x)} = sup_{x∈X} {w̃₁(x) ∔ −f(x)} = sup_{x∈X} {w̃₂(x) ∔ −f(x)} = sup_{x∈X} {φ(x, w₂) ∔ −f(x)}.

Hence, one can uniquely define a conjugation f ∈ R̄^X → f* ∈ R̄^W̃ by

f*(w̃) := f^{c(φ)}(w)   (w ∈ W).   (1.215)

(c) For W = X* × R, it is convenient to denote the quasi-conjugate (1.213) of f, in the sense of Greenberg and Pierskalla, mentioned above, by f^q_d. The second quasi-conjugate of f is the function (f^q_d)'_d : X → R̄ defined [95] by

(f^q_d)'_d(x) := −inf_{Φ∈X*, Φ(x)≥d} f^q_d(Φ)   (x ∈ X),   (1.216)

and the normalized second quasi-conjugate of f is the function f^{qq} : X → R̄ defined [95] by

f^{qq} := sup_{d∈R} (f^q_d)'_d.   (1.217)

It is well known and easy to see that for any function f : X → R̄ we have

f*(Φ) = sup_{d∈R} {d + f^q_d(Φ)}   (Φ ∈ X*),   f ≥ f^{qq} ≥ f**,   (1.218)

where f*, f** are the Fenchel conjugates (1.95), (1.97). Corresponding to (1.100), we have

f^{qq} = f_{eq}   (f ∈ R̄^X),   (1.219)
with f_{eq} being the evenly quasi-convex hull (1.135) of f.

(d) There are also other "conjugates" of a similar form, useful for duality in convex and quasi-convex optimization, that are particular cases of the Fenchel–Moreau conjugates f^{c(φ)} (for suitable coupling functions φ), for example, the "pseudoconjugates" defined by

f^p_d(Φ) := −inf_{x∈X, Φ(x)=d} f(x)   (Φ ∈ X*, d ∈ R),   (1.220)

and the "semiconjugates" defined by

f^s_d(Φ) := −inf_{x∈X, Φ(x)>d−1} f(x)   (Φ ∈ X*, d ∈ R),   (1.221)

for which one introduces the second conjugates (f^p_d)'_d, (f^s_d)'_d and the normalized second conjugates f^{pp}, f^{ss}, similarly to (1.216) and (1.217), respectively (mutatis mutandis). We have

f^{ss} = f_q   (f ∈ R̄^X),   (1.222)
with f_q being the lower semicontinuous quasi-convex hull (1.134) of f.

Let us return now to the more general case in which X and W are two arbitrary sets. For any polarity Δ : 2^X → 2^W the conjugation of type Lau associated with Δ is the mapping L(Δ) : R̄^X → R̄^W defined by

f^{L(Δ)}(w) := −inf_{x∈∁Δ'({w})} f(x)   (f ∈ R̄^X, w ∈ W).   (1.223)

One can show (see [254, p. 279, Theorem 8.14]) that the mapping c = L(Δ) : R̄^X → R̄^W satisfies (1.203), (1.204) (i.e., it is a conjugation), and that the mapping c(φ) : R̄^X → R̄^W defined by (1.202) is a conjugation of type Lau if and only if φ takes only the values 0 or −∞, i.e., if and only if φ = −χ_C for some subset C of X × W. If X and W are two sets, C is a subset of X, and Δ : 2^X → 2^W is a polarity, then for the "representation function" p_C : X → {−∞, +∞} defined by

p_C(x) := −∞ if x ∈ C,   +∞ if x ∈ ∁C,   (1.224)

we have

(p_C)^{L(Δ)} = p_{Δ(C)}.   (1.225)

For any polarity Δ : 2^X → 2^W, the dual of L(Δ) : R̄^X → R̄^W is the mapping L(Δ)' : R̄^W → R̄^X defined by

g^{L(Δ)'}(x) := −inf_{w∈W, x∈∁Δ'({w})} g(w)   (g ∈ R̄^W, x ∈ X).   (1.226)
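The conjugation of type Lau (1.223) and its dual (1.226) can be illustrated on finite sets X and W, with the polarity generated by a relation ρ ⊆ X × W so that Δ'({w}) = {x | (x, w) ∈ ρ} and w ∈ Δ({x}) iff x ∈ Δ'({w}). All data below are hypothetical, chosen only to exhibit the hull inequality f^{L(Δ)L(Δ)'} ≤ f.

```python
# Conjugation of type Lau (1.223) and its dual (1.226) on finite sets,
# with a polarity generated by a relation rho on X x W (illustrative
# sketch: X, W, rho, and f below are arbitrary small examples).
X = ['x1', 'x2', 'x3']
W = ['w1', 'w2']
rho = {('x1', 'w1'), ('x2', 'w1'), ('x2', 'w2')}  # x in Delta'({w}) iff (x, w) in rho

INF = float('inf')
f = {'x1': 0.0, 'x2': 2.0, 'x3': -1.0}

def lau_conj(f):
    # f^{L(Delta)}(w) = -inf{ f(x) : x in the complement of Delta'({w}) };
    # an empty infimum is +inf, so the conjugate value is -inf there.
    return {w: -min([f[x] for x in X if (x, w) not in rho], default=INF)
            for w in W}

def lau_conj_dual(g):
    # g^{L(Delta)'}(x) = -inf{ g(w) : w with x not in Delta'({w}) }.
    return {x: -min([g[w] for w in W if (x, w) not in rho], default=INF)
            for x in X}

second = lau_conj_dual(lau_conj(f))     # f^{L(Delta)L(Delta)'}
hull_leq_f = all(second[x] <= f[x] for x in X)
```

The second conjugate acts as a hull operation: it never exceeds f, and it agrees with f at points that the polarity "separates" well, here at x3.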
The dual L(Δ)' of L(Δ) is again a conjugation of type Lau (namely, L(Δ)' = L(Δ'), with Δ' of (1.143)), and we have (L(Δ)')' = L(Δ). For any f : X → R̄, the function (f^{L(Δ)})^{L(Δ)'} : X → R̄ is denoted by f^{L(Δ)L(Δ)'}. By (1.226) and (1.223), we have

f^{L(Δ)L(Δ)'} = f_{q(Δ'Δ)}   (f ∈ R̄^X),   (1.227)

with f_{q(Δ'Δ)} of (1.153). In particular, for f = p_C of (1.224) (where C ⊆ X is any set), we have

(p_C)^{L(Δ)L(Δ)'} = p_{Δ'Δ(C)}.   (1.228)

For the polarities Δ = Δ^{11} of (1.189), Δ = Δ^{12} of (1.191), and Δ = Δ^{13} of (1.193) we have, by (1.223),

f^{L(Δ^{11})}(Φ, d) = −inf_{x∈X, Φ(x)>d} f(x)   (f ∈ R̄^X, (Φ, d) ∈ (X*\{0}) × R),   (1.229)

f^{L(Δ^{12})}(Φ, d) = −inf_{x∈X, Φ(x)≥d} f(x)   (f ∈ R̄^X, (Φ, d) ∈ (X*\{0}) × R),   (1.230)

f^{L(Δ^{13})}(Φ, d) = −inf_{x∈X, Φ(x)=d} f(x)   (f ∈ R̄^X, (Φ, d) ∈ (X*\{0}) × R).   (1.231)

If Δ : 2^X → 2^W is a polarity, f ∈ R̄^X, and z₀ ∈ X, f(z₀) > −∞, the subdifferential of f at z₀ with respect to the conjugation of type Lau L(Δ) is the subset ∂^{L(Δ)}f(z₀) of W defined by

∂^{L(Δ)}f(z₀) := {w₀ ∈ W | z₀ ∈ ∁Δ'({w₀}), f(z₀) = −f^{L(Δ)}(w₀)}
= {w₀ ∈ W | z₀ ∈ ∁Δ'({w₀}), f(z₀) = min_{x∈∁Δ'({w₀})} f(x)};   (1.232)

here the assumption f(z₀) > −∞ is essential.
1.3 Duality for best approximation by elements of convex sets

In this section we shall present some duality results and some methods of obtaining them, for best approximation in normed linear spaces by elements of convex sets. These will serve as a basis of comparison with the nonconvex duality results of Chapters 2 and 5 and with the methods of obtaining them. We recall that if G is a subset of a normed linear space X, any g₀ ∈ G for which the inf in (1.50) (with C = G) is attained, i.e., such that

‖x₀ − g₀‖ = inf_{g∈G} ‖x₀ − g‖,   (1.233)
or, equivalently, such that

‖x₀ − g₀‖ ≤ ‖x₀ − g‖   (g ∈ G),   (1.234)

is called an element of best approximation of (or a nearest point to) x₀ in G (see Figure 1.3).

Figure 1.3.

We shall denote by P_G(x₀) the set of all nearest points to x₀ in G, that is,

P_G(x₀) := {g₀ ∈ G | ‖x₀ − g₀‖ = inf_{g∈G} ‖x₀ − g‖}.   (1.235)

We shall denote by max (respectively, min) any sup (respectively, inf) that is attained. Thus, in (1.233) and (1.235) one can replace inf by min. Clearly, P_G(g₀) = {g₀} for all g₀ ∈ G. In finite-dimensional spaces X, if G ⊆ X is closed, then P_G(x₀) ≠ ∅ for all x₀ ∈ X (see, e.g., [210]), but as we shall see below, in infinite-dimensional normed linear spaces X we may have P_G(x₀) = ∅, that is, elements of best approximation of x₀ need not exist, even for closed sets G with "very good" geometric properties. One can also express P_G(x₀) with the aid of the (closed) ball

B(x₀, d) := {y ∈ X | ‖x₀ − y‖ ≤ d},   (1.236)

with center x₀ and radius d = dist(x₀, G), namely,

P_G(x₀) = G ∩ B(x₀, dist(x₀, G)).   (1.237)
We shall be concerned with the following two main problems:

(1) Find convenient formulae for dist(x₀, G).

(2) Give characterizations of elements of best approximation, i.e., necessary and sufficient conditions in order that an element g₀ ∈ G satisfy (1.233) (that is, in order that g₀ ∈ P_G(x₀)).

For these problems, "duality" means simply their study with the aid of the elements of the conjugate space X*. Indeed, this is quite natural in the light of the results of the next section, since best approximation by convex sets is a particular case of convex optimization, namely, it is the infimization of the convex function (1.264) on convex sets, and since the function (1.264) has very good properties (it is finite and continuous). We have the following basic formula for the distance to a convex set.
Theorem 1.11. Let X be a normed linear space, G a convex subset of X, and x₀ ∈ ∁Ḡ. Then

dist(x₀, G) = max_{Φ∈X*, ‖Φ‖=1} {Φ(x₀) − sup Φ(G)}.   (1.238)

In other words, we have

dist(x₀, G) ≥ Φ(x₀) − sup Φ(G)   (Φ ∈ X*, ‖Φ‖ = 1),   (1.239)

and there exists Φ₀ ∈ X* such that

‖Φ₀‖ = 1,   (1.240)

dist(x₀, G) = Φ₀(x₀) − sup Φ₀(G).   (1.241)

Proof. We have (1.239), since

‖x₀ − g‖ ≥ Φ(x₀ − g) ≥ Φ(x₀) − sup Φ(G)   (g ∈ G, Φ ∈ X*, ‖Φ‖ = 1).

Furthermore, since x₀ ∈ ∁Ḡ, we have dist(x₀, G) > 0. Let

A := {y ∈ X | ‖x₀ − y‖ < dist(x₀, G)} = int B(x₀, dist(x₀, G)).   (1.242)

Then A is a nonempty open convex set, and G ∩ A = ∅. Hence, by the separation theorem, there exists Φ₀ ∈ X*\{0} such that

sup Φ₀(G) ≤ inf Φ₀(A);   (1.243)

we may assume without loss of generality (dividing by ‖Φ₀‖, if necessary) that ‖Φ₀‖ = 1. We have

Φ₀(x₀) − sup Φ₀(G) > 0;   (1.244)

indeed, otherwise, from (1.243) we would obtain Φ₀(x₀) ≤ inf Φ₀(A), in contradiction to x₀ ∈ A. Let us consider the hyperplane

H₀ := {y ∈ X | Φ₀(y) = sup Φ₀(G)}.   (1.245)

By Lemma 1.5, (1.244), and ‖Φ₀‖ = 1, we have

dist(x₀, H₀) = Φ₀(x₀) − sup Φ₀(G) = inf_{g∈G} {Φ₀(x₀) − Φ₀(g)} ≤ inf_{g∈G} ‖x₀ − g‖ = dist(x₀, G).

If dist(x₀, H₀) < dist(x₀, G), then there exists h₀ ∈ H₀ such that ‖x₀ − h₀‖ < dist(x₀, G), so h₀ ∈ A. Hence, using also that h₀ ∈ H₀ and formula (1.243), we obtain

Φ₀(h₀) = sup Φ₀(G) ≤ inf Φ₀(A) ≤ Φ₀(h₀),

so Φ₀(h₀) = inf Φ₀(A), which contradicts Lemma 1.8. Therefore, we must have

Φ₀(x₀) − sup Φ₀(G) = dist(x₀, H₀) = dist(x₀, G).   (1.246) □
Remark 1.14. (a) For various particular classes of convex sets G, e.g., for convex cones G, linear subspaces G, or finite-dimensional convex sets G, formula (1.238) for dist(x₀, G) takes simpler forms (see, e.g., [210, 211] and the references therein).

(b) From (1.238) it follows that for x₀ ∈ ∁Ḡ we have

dist(x₀, G) = max_{Φ∈X*, ‖Φ‖=1, sup Φ(G)<Φ(x₀)} {Φ(x₀) − sup Φ(G)} = max_{Φ∈X*, ‖Φ‖=1, sup Φ(G)≤Φ(x₀)} {Φ(x₀) − sup Φ(G)};   (1.247)

indeed, for any Φ ∈ X* with ‖Φ‖ = 1, sup Φ(G) ≥ Φ(x₀), we have Φ(x₀) − sup Φ(G) ≤ 0 < dist(x₀, G).
Now we shall deduce from Theorem 1.11 some other duality formulas for the distance to a convex set, and we shall give for them some geometric interpretations.

Corollary 1.7. Let X be a normed linear space, G a convex subset of X, and x₀ ∈ ∁Ḡ. Then

dist(x₀, G) = max_{Φ∈X*, ‖Φ‖=1, sup Φ(G)<Φ(x₀)} |Φ(x₀) − sup Φ(G)| = max_{Φ∈X*, ‖Φ‖=1, sup Φ(G)≤Φ(x₀)} |Φ(x₀) − sup Φ(G)|.   (1.248)

Proof. By (1.247), we have the inequalities ≤ in (1.248). On the other hand, for any Φ ∈ X* with ‖Φ‖ = 1, sup Φ(G) ≤ Φ(x₀), we have |Φ(x₀) − sup Φ(G)| = Φ(x₀) − sup Φ(G), whence by (1.239),

dist(x₀, G) ≥ sup_{Φ∈X*, ‖Φ‖=1, sup Φ(G)≤Φ(x₀)} |Φ(x₀) − sup Φ(G)|,

and hence, finally, we obtain (1.248). □
Remark 1.15. (a) One cannot omit in (1.248) the conditions sup Φ(G) < Φ(x₀) and sup Φ(G) ≤ Φ(x₀); for instance, with Φ₀(y) = y₁ (y = (y₁, y₂) ∈ X) one can have dist(x₀, G) = 1 and |Φ₀(x₀) − sup Φ₀(G)| = |0 − d| = d > 1, so (1.247), with sup Φ(G) ≤ Φ(x₀) or sup Φ(G) < Φ(x₀) omitted, fails (in this example, sup Φ₀(G) = d > 0 = Φ₀(x₀)).

(b) Geometrically, the first equality in (1.248) means that

dist(x₀, G) = max_{H∈𝓗_{G,x₀}} dist(x₀, H) = max_{Φ∈X*\{0}, sup Φ(G)<Φ(x₀)} inf_{y∈X, Φ(y)=sup Φ(G)} ‖x₀ − y‖,   (1.249)

where 𝓗_{G,x₀} denotes the set of all hyperplanes that quasi-support the set G and that strictly separate G and x₀. Indeed, it is enough to consider the hyperplanes

H = {y ∈ X | Φ(y) = sup Φ(G)}   (Φ ∈ X*, ‖Φ‖ = 1, sup Φ(G) < Φ(x₀));   (1.250)

the second equality in (1.248) has a similar interpretation, with "strictly separate" replaced by "separate" and sup Φ(G) < Φ(x₀) replaced by sup Φ(G) ≤ Φ(x₀).

(c) Under the assumptions of Theorem 1.11 we also have

dist(x₀, G) = max_{Φ∈X*, ‖Φ‖=1, sup Φ(G)≤inf Φ(A)} |Φ(x₀) − sup Φ(G)|,   (1.251)
with A of (1.242). Indeed, in the proof of Theorem 1.11 we have shown that there exists Φ₀ ∈ X* satisfying (1.240) and (1.243), and that for any such Φ₀ we have (1.246), which yields (1.251).

Corollary 1.8. Let X be a normed linear space, G a convex subset of X, and x₀ ∈ ∁Ḡ. Then

dist(x₀, G) = max_{Φ∈X*, ‖Φ‖=1, d∈R, sup Φ(G)≤d<Φ(x₀)} |Φ(x₀) − d| = max_{Φ∈X*, ‖Φ‖=1, d∈R, sup Φ(G)≤d≤Φ(x₀)} |Φ(x₀) − d|.   (1.252)

Proof. For any Φ ∈ X* with ‖Φ‖ = 1, sup Φ(G) ≤ Φ(x₀), we have

max_{d∈R, sup Φ(G)≤d<Φ(x₀)} |Φ(x₀) − d| = max_{d∈R, sup Φ(G)≤d≤Φ(x₀)} |Φ(x₀) − d| = Φ(x₀) − sup Φ(G),

whence by (1.248) and (1.239) we obtain (1.252). □
max dist(xo, H) max
inf
^£X*,deR VGX supO(G)<J
lUo —jIL
(1.253)
44
1. Preliminaries
XQ^
(a)
(b) Figure 1.4.
where HQ ^^ denotes the set of all hyperplanes that strictly separate G and XQ. Indeed, it is enough to consider the hyperplanes H =
{y€X\^(y)=d} (O G Z * , | | 0 | | = \,d e /?,sup(D(G)
the second equality in (1.252) has a similar interpretation, with "strictly separate" replaced by "separate" and sup 0(G) < d < 0(xo) replaced by sup 0(G) < d < O(jco) (see Figures 1.4a and 1.4b). (b) The reduction principle: The usefulness of formulas (1.249) and (1.253) consists in the fact that they reduce the computation of the distance from a convex set to the computation of the distance from a hyperplane, and that there exists a simple formula for the computation of the distance to a hyperplane, namely, Lemma 1.5 (which is very convenient for applications in various concrete spaces, since for these spaces the general form of continuous linear functions O e Z* is well known and simple). This basic idea, which we shall call the reduction principle, will be applied later also to nonconvex approximation and will be extended to convex and nonconvex optimization. (c) One can give some geometric consequences of the above results, using halfspaces instead of hyperplanes. The usefulness of those distance formulas consists again in the "reduction principle": they reduce the computation of the distance from a convex set to the computation of the distance from a half-space, and there exists a simple formula for the computation of the distance from a half-space, namely. Corollary 1.4. Duality results for the distance, such as Theorem 1.11, can be used to derive characterizations of nearest points, e.g., the following. Theorem 1.12. Let X be a normed linear space, G a convex subset ofX, and XQ e CG. For an element go e G, the following statements are equivalent: WgoeVcixo)2°. There exists OQ e X* satisfying (1.240) and Oo(xo) - sup Oo(G) = llxo - goll.
(1.255)
1.3 Duality for best approximation by elements of convex sets
45
3°. There exists <J>o e X* satisfying (1.240) and «I>o(xo -g)>
llxo - ^oll
{g e G).
(1.256)
4°. There exists o € X* satisfying (1.240) and cI>o(go) = max cDo(G),
(1.257)
o(^o-^o) = 11x0-foil.
(1-258)
Moreover, one can take the same OQ in statements 2°, 3°, and 4°. Proof. V => 2°. If 1° holds, then by ||jco - ^oll = dist(xo, G) and (1.238) we have 2°. T ^ 3 M f 2 ° holds, then ^oUo - ^) > ^o(-^o) - sup cDo(G) = \\XQ - goII
(g e G).
3° = ^ 4 M f 3° holds, then ^o(go) - ^o(g) = ^oUo - g) - ^o(^o - go) > 11^0 - goII - ^oUo - go) > 0 Iko - goII < ^o(^o - go) < 11-^0 - goII •
(g € G),
4° => l M f 4 ° holds, then lUo - goll = ^oUo - go) < ^oUo - g) < 11-^0 - g\\ that is, go eVcixo).
{g ^ G), •
Remark 1.17. (a) Any function ^o ^ ^* satisfying (1.240) and (1.258) is called a "maximal function" of the element jco — go- The usefulness of Theorem 1.12 for applications in various concrete normed linear spaces is due to the fact that for these spaces the general form of maximal functions of the elements of the space is well known and simple (see, e.g., [210] and the references therein). (b) For various particular classes of convex sets G, e.g., for convex cones G, hnear subspaces G, or finite-dimensional convex sets G, Theorem 1.12 takes simpler forms, which yield some results on characterizations of best approximations, for example the classical theorem of S. Bernstein on the characterization of polynomials of best approximation of degree < AZ of continuous functions on a closed interval [a, Z?], in terms of "points of alternation" of XQ — go, i.e., points at which •^0 — go takes the value |JCO — goll with alternating signs (see, e.g., [210, 211] and the references therein). (c) Theorem 1.12 admits some geometric consequences, such as the following characterization of nearest points: For XQ 6 CG we have go G PG(-^O) if and only if there exists a hyperplane HQ that supports G at go, separates G and XQ, and satisfies llxo-goll =dist(JCO,//o).
(1.259)
46
1. Preliminaries
Indeed, if ^o ^ ^G(-^O), then the hyperplane //Q := {y e X\ ^o(y) = supOo(G)}, with
max
{cD(jco) - supcD(G)}.
(1.260)
Proof. The inequality > in (1.260) is obvious from (1.239). On the other hand, if Oo G X* is as in 4° of Theorem 1.12, then ||Oo|| = 1, OQ e A^(G; go) and we have dist(xo, G) = \\xo - goII = ^o(-^o - go) = ^o(-^o) - sup Oo(G), whence (1.260), with the max attained at 4>o.
•
1.4 Duality for convex and quasi-convex infimization In this section we shall present some duality results and some methods of obtaining them, for infimization of convex and quasi-convex functions on convex sets in locally convex spaces (we consider also the latter, since for the validity of some results only the quasi-convexity of functions is needed, i.e., the convexity of their level sets, rather than their convexity, i.e., the convexity of their epigraphs). These will serve as a basis of comparison with the nonconvex duality results of Chapters 3, 4, 6, and 7 and with the methods of obtaining them. Although the infimization of quasi-convex functions is, actually, nonconvex infimization, we shall include it in this section (rather than devoting to it a separate chapter), since quasi-convexity belongs to the field of "generalized convexity." In the first part we shall present some elements of the theory of "unperturbational dual problems," i.e., of dual problems defined without using perturbations. In the second part we shall present some elements of the theory of "perturbational dual problems," i.e., of dual problems defined with the aid of perturbations of the primal problem, and we shall show that various unperturbational dual problems can be deduced from the perturbational theory, in a unified way, for suitable choices of particular perturbations.
1.4 Duality for convex and quasi-convex infimization

In this section we shall present some duality results and some methods of obtaining them, for infimization of convex and quasi-convex functions on convex sets in locally convex spaces (we consider also the latter, since for the validity of some results only the quasi-convexity of functions is needed, i.e., the convexity of their level sets, rather than their convexity, i.e., the convexity of their epigraphs). These will serve as a basis of comparison with the nonconvex duality results of Chapters 3, 4, 6, and 7 and with the methods of obtaining them. Although the infimization of quasi-convex functions is, actually, nonconvex infimization, we shall include it in this section (rather than devoting to it a separate chapter), since quasi-convexity belongs to the field of "generalized convexity." In the first part we shall present some elements of the theory of "unperturbational dual problems," i.e., of dual problems defined without using perturbations. In the second part we shall present some elements of the theory of "perturbational dual problems," i.e., of dual problems defined with the aid of perturbations of the primal problem, and we shall show that various unperturbational dual problems can be deduced from the perturbational theory, in a unified way, for suitable choices of particular perturbations.
a = inf/(G),
(1.261)
where the "constraint set" G is a subset of a locally convex space X and f: X ^^ R is a function, called "the objective function." When G = X, problem (P) is called "unconstrained." Any go e G for which the inf in (1.261) is attained, i.e., such that /(go) = inf/(G),
(1.262)
is called a (global) "optimal solution" of problem (P). The set of all optimal solutions will be denoted by Scif), that is, Scif)
'= {go e G\ f(go) = inf/(G)};
(1.263)
naturally, one can also write min instead of inf in (1.262) and (1.263). If G is a convex set and / is a convex (respectively a quasi-convex) function, then (P) of (1.261) is called a problem of convex (respectively, quasi-convex) infimization. As has been observed above, best approximation may be regarded as a particular case of infimization, by taking Z to be a normed linear space, xo ^ X, and / : X -> R the convex function fiy):=\\xo-y\\
(yeX);
(1.264)
indeed, then inf/(G) = dist(jco, G),
(1.265)
and the optimal solutions go ^ ^ of problem (P), for this case, are the elements of best approximation of XQ by G. Therefore, it is natural that many results on infimization can be applied to the particular case of best approximation. Moreover, in the converse direction, although the extension from the particular function / of (1.264) to a function / : X ^^ P on a locally convex space Z is a rather big step, it turns out that many results and methods of the theory of best approximation can be extended to results on the infimization of functions. For example, since for the function (1.264) we have Sdif) = [yeX\
\\xo - y\\
(d e P),
(1.266)
the balls Bixo, d) of (1.236) will be replaced by the level sets 5 j ( / ) of (1.22). Also, if X is a normed linear space, JCQ G X, and / is the function (1.264), then for any go e Xv^Q have (see, e.g., [212, Lemma 4.1]) a/(^o) = [^e
X*| cD(xo - go) = \\xo - goW, ll^ll < 1};
(1.267)
48
1. Preliminaries
clearly, when go 7^ XQ, one can take ||0|| = 1 in (1.267). Therefore, the "maximal functions" of Remark 1.17(a) will be replaced by the elements of 9/(go)In the present section and the next one, we shall be concerned with the following two main problems for convex (respectively, quasi-convex) infimization: (1) Find convenient formulae for inf/(G). (2) Give characterizations of optimal solutions, i.e., necessary and sufficient conditions in order that an element go e G satisfy (1.262) (that is, in order that In the duality results for best approximation, the distance dist(;co, G) is expressed by equalities involving the continuous linear functions 4> G X*, such as formula (1.238) in Theorem 1.11. In the theory of duality for the more general case of infimization of convex functions, instead of such equalities there appears a "dual problem" of supremization of a "dual objective function" on a "dual constraint set," and the infimum occurring in the primal problem (P) of (1.261) is not necessarily equal to the supremum in the dual problem; they are equal only under some conditions on the primal variables G and / , but those are satisfied by the particular function (1.264) on a normed linear space X (for example, / of (1.264) is convex and continuous). This explains why in Section 1.3 on best approximation by convex sets no "dual problem" has appeared explicitly. There is also another essential difference between the duality theories for best approximation by convex sets and for convex infimization. 
Namely, while the results of Section 1.3 have been proved by arguments that work directly in X and X* (e.g., separation theorems for convex subsets of X), problem {P) of (1.261) requires one to find the "lowest point" of the graph (or, equivalently, of the epigraph) of the restriction / | G of the convex function / to the convex set G c X (see Figure 1.5, in which X = R^, endowed with its natural topology); therefore, duality results for (1.261) are often obtained by applying separation theorems for the epigraph (1.21) of / , which is a subset of X x /?, by functions in (X x /?)* = X* x R (thus, instead of working with functions on X, one works with functions "one floor higher"); this is made possible by some useful connections between / and epi / (for example, it is well known that / is a convex function if and only if epi / is a convex set). There are two main types of dual problems to any (convex or nonconvex) constrained primal optimization (infimization or supremization) problem: "Lagrangian dual problems" and "surrogate dual problems." Roughly speaking, a Lagrangian dual problem, say, to (1.261), is an optimization problem whose objective function is defined by replacing the primal constraint set G by the whole space X, at the price of adding a "penalty term" to the objective function / (in order to compensate the "violation of the constraints"), and a surrogate dual problem to (1.261) is an optimization problem whose objective function is defined with the aid of the same objective function / , but replacing the constraint set G by a family of "surrogate constraint sets" (usually related in some way to G). We have the following basic theorem of Lagrangian duality. Theorem 1.13. Let X be a locally convex space, G a convex subset of X, and f: X ^ R a proper convex function that is continuous at some point ofGH dom /
1.4 Duality for convex and quasi-convex infimization
49
//
Figure 1.5. {i.e., finite and continuous at some point ofG). Then inf/(G) = max inf [f{y) - 0 ( j ) + inf cD(G)}.
(1.268)
4>GX* yeX
Proofi Clearly, for any G and / we have inf/(G) > inf {/(g) - (D(g) + inf cD(G)} geG
> i n f { / ( j ) - < I . ( y ) + inf
(^ e X*),
yeX
whence the so-called "duality inequality" inf/(G) > sup inf [f{y) - <^{y) -f- inf 0(G)}.
(1.269)
Let us prove now the opposite inequality and the attainment of the sup. Observe first that if inf/(G) = - o o , then, by (1.269), we have the equality (1.268), with the max being —oo, attained at all O e X*. Hence, we may assume that inf/(G) > —oo. Let M := {(g, ri)eXxR\geG,r]<
inf/(G)}.
(1.270)
Then M and e p i / are nonempty convex sets, with i n t e p i / = {(x,d) e X x ^\ fM < d) ^ & (since / has a point of continuity). Also, M H intepi / = 0; indeed, otherwise, if (g, r]) e M Hint epi / , then r] < inf/(G) < f{g) < rj, which is impossible. Hence, there exists (by Theorem 1.1) (^, fi) e (X x R)* = X* x /? separating epi / from M, i.e., such that sup ( ^ , /X)(X, d) < inf (Vl/, ^M)(g, T]), (x,d)eepif i8,r])eM
which implies that
(1.271)
50
1. Preliminaries ^(x) + MJ < ^(g) + M inf/(G)
(U, d) e epi / , g G G ) .
(1.272)
Clearly, ^ ^ 0. Also, /L6 / 0, since otherwise ^ ( x ) < vl/(g) for all x e dom / and g e G, that is, ^ separates d o m / from all points g e G, which is impossible by the "only if" part of Theorem 1.1 (indeed, since by our assumption / is a proper convex function that is continuous at some point go e G Ci dom / , we have go G G n intdom/ 7^ 0). Moreover, /x < 0 (since otherwise, taking (x,d) e e p i / with d -^ +00 in (1.272), we would arrive at a contradiction). Hence, dividing by ~fi (> 0) and taking d = f(x) in (1.272), and O o : = - i v i / (6X*\{0}),
(1.273)
we obtain Oo(x) - fix) < cDo(g) - inf/(G)
(X e d o m / g e G),
(1.274)
(x G d o m / g G G).
(1.275)
whence inf/(G) < fix)-
cDo(x) + cDo(g)
Consequently, inf/(G) < fix) - (Do(x) + infcDo(G)
(x e X),
which together with (1.269), proves (1.268) (with the max attained for O = OQ)- D Remark 1.18. (a) This proof illustrates the "epigraphic methods" of proofs in convex analysis, mentioned above. Later we shall also give another proof of Theorem 1.13, deducing it as a particular case of more general results. (b) Any condition involving the primal constraint set that ensures that (weak or strong) duality holds, e.g., the condition of Theorem 1.13 that / should be continuous at some point of G fl dom / , is called a constraint qualification. There are also other constraint qualifications that ensure the validity of the strong duality formula (1.268), e.g., the following condition discovered by Attouch and Brezis ([7, Corollary 23]): X is a Banach space, G is a closed convex subset ofX,f\X^^Risa lower semicontinuous proper convex function, fl«JU^>o/x(G —dom/) = X (in particular, the latter equality is satisfied when dom f = X). For some other constraint quaUfications see, e.g., Hiriart-Urruty and Lemarechal [104]. (c) In particular, when X is a normed linear space and / is the function (1.264), formula (1.268) means that dist(jco, G) = max inf {||xo-JC|| - CD(JC) + inf 0 ( G ) } . OGX*
(1.276)
xeX
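For intuition, the distance formula (1.276) can be checked numerically in a one-dimensional toy case; the set G = [1, 2], the point x₀ = 0, and the grids below are invented for illustration (they are not data from the text), and the grid search is only a sketch of the max-inf structure:

```python
import numpy as np

# Illustrative check of dist(x0, G) = max_phi inf_x { |x0 - x| - phi*x + inf phi(G) }
# in X = R with G = [1, 2] and x0 = 0 (so dist(x0, G) = 1).
x0 = 0.0
G = np.linspace(1.0, 2.0, 201)        # the constraint set G = [1, 2]
xs = np.linspace(-5.0, 5.0, 2001)     # grid standing in for "inf over x in X"
phis = np.linspace(-1.0, 1.0, 401)    # |phi| <= 1; for |phi| > 1 the inner inf is -infinity

def dual_value(phi):
    # inf_x { |x0 - x| - phi*x } + inf_{g in G} phi*g
    return np.min(np.abs(x0 - xs) - phi * xs) + np.min(phi * G)

beta = max(dual_value(phi) for phi in phis)
alpha = np.min(np.abs(x0 - G))        # primal value dist(x0, G)
print(alpha, beta)                    # both close to 1.0
```

The maximizing functional here is Φ = 1, the unit functional separating x₀ from G, in line with the statement that the max in (1.268) is attained.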
1.4 Duality for convex and quasi-convex infimization

Theorem 1.13 suggests that the left-hand and right-hand sides of the duality formula (1.268) be split into two optimization "problems," namely, the initial primal problem (P) of (1.261) and the "dual problem"

(D)  \[ \beta = \sup \lambda(X^*) = \sup_{\Phi\in X^*} \lambda(\Phi), \tag{1.277} \]
where

\[ \lambda(\Phi) := \inf_{y\in X}\,\{\, f(y) - \Phi(y) + \inf \Phi(G) \,\} \quad (\Phi \in X^*); \tag{1.278} \]

this is a Lagrangian dual problem in the sense mentioned above, with the penalty terms

\[ \pi_\Phi(y) := -\Phi(y) + \inf \Phi(G) \quad (y \in X); \tag{1.279} \]

in other words, the Lagrangian dual problem (1.277), (1.278) "penalizes," via the term (1.279) added to the primal objective function f (with lower addition), the fact that the initial constraint set G is replaced by the whole space X. The "dual constraint set" is the whole conjugate space X* (so (D) is an unconstrained supremization problem), and the "dual objective function" is λ: X* → R̄ of (1.278). The numbers α and β are called the (optimal) values of problems (P) and (D), respectively. With the above notation for the Lagrangian dual problem (1.277), (1.278), the "duality inequality" (1.269) can be written as

\[ \alpha \ge \beta. \tag{1.280} \]
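To see the primal and dual values α and β side by side, here is a hypothetical one-dimensional instance (f(x) = x², G = [1, 2], both made up for illustration); the grids stand in for the infima and suprema, and both values come out equal to 1, so there is no duality gap:

```python
import numpy as np

# Toy instance of the dual problem (1.277)-(1.278): f(x) = x**2, G = [1, 2] in X = R,
# so the primal value is alpha = 1 (attained at x = 1).
xs = np.linspace(-4.0, 4.0, 4001)       # grid standing in for "inf over y in X"
G = np.linspace(1.0, 2.0, 201)

def lam(phi):
    # lambda(phi) = inf_y { f(y) - phi*y + inf phi(G) }
    return np.min(xs**2 - phi * xs) + np.min(phi * G)

phis = np.linspace(-4.0, 4.0, 801)
beta = max(lam(phi) for phi in phis)    # dual value, attained at phi = 2
alpha = np.min(G**2)                    # primal value
print(alpha, beta)                      # both equal 1 (up to grid error)
```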
For any primal-dual pair {(P), (D)} of optimization problems, when α = β (that is, when the values of (P) and (D) coincide), one says that weak duality holds, or that there is no duality gap; when α > β, one says that there is a duality gap. If we have α = β and the dual problem (D) has an optimal solution, that is, if the value of (D) is attained for some Φ₀ ∈ X*, then one says that strong duality holds (see, e.g., Theorems 1.11 and 1.13). Besides the use of constraint qualifications, another method of getting rid of a possible duality gap of a primal-dual pair {(P), (D)} of optimization problems is to replace the dual problem (D) by a new dual problem

(D′)  \[ \beta' = \sup \lambda'(X^*), \tag{1.281} \]

for which α = β′ (possibly without assuming any constraint qualification). For the case of Lagrangian dual problems, one way of doing this is that of replacing the Lagrangian (1.287) by an "augmented Lagrangian" L′: X × X* → R̄. To this end, a useful tool is provided by abstract convex analysis (for some details, see, e.g., [254, Section 0.8a]).

Let us return now to the Lagrangian duality result (1.268). Applying formula (1.268) to a hyperplane G = H = {y ∈ X | Φ₀(y) = d₀}, where Φ₀ ∈ X*\{0}, d₀ ∈ R, and observing that for any Φ ∈ X* we have

\[ \inf \Phi(H) = \begin{cases} \eta d_0 & \text{if } \Phi = \eta\Phi_0,\ \eta \in R, \\ -\infty & \text{if } \Phi \notin R\Phi_0, \end{cases} \tag{1.282} \]

we obtain the following result, which gives a formula for the infimum of a function on a hyperplane:
1. Preliminaries
Corollary 1.10. Let X be a locally convex space, H = {x ∈ X | Φ₀(x) = d₀}, where Φ₀ ∈ X*\{0}, d₀ ∈ R, and f: X → R̄ a proper convex function that is continuous at some point of H ∩ dom f. Then

\[ \inf_{\substack{x\in X\\ \Phi_0(x)=d_0}} f(x) = \max_{\eta\in R}\, \inf_{y\in X}\,\{\, f(y) - \eta\Phi_0(y) + \eta d_0 \,\}. \tag{1.283} \]
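As a sanity check of (1.283), consider a made-up example in X = R²: f(y) = y₁² + y₂², Φ₀(y) = y₁ + y₂, d₀ = 2 (none of this data is from the text). The constrained infimum over the hyperplane and the max-inf on the right-hand side can both be approximated on a grid, and both equal 2:

```python
import numpy as np

# Formula (1.283) for f(y) = y1^2 + y2^2, Phi0(y) = y1 + y2, d0 = 2 in R^2.
# The constrained minimum is 2, attained at y = (1, 1).
d0 = 2.0
g = np.linspace(-3.0, 3.0, 601)
Y1, Y2 = np.meshgrid(g, g)

def inner(eta):
    # inf_y { f(y) - eta*Phi0(y) + eta*d0 }  (closed form: -eta**2/2 + eta*d0)
    return np.min(Y1**2 + Y2**2 - eta * (Y1 + Y2) + eta * d0)

etas = np.linspace(-4.0, 4.0, 801)
dual = max(inner(eta) for eta in etas)          # max attained near eta = 2

# primal: minimize f on the hyperplane y1 + y2 = 2 (parametrized as y = (t, 2 - t))
t = np.linspace(-3.0, 3.0, 6001)
primal = np.min(t**2 + (2.0 - t)**2)
print(primal, dual)                             # both equal 2 (up to grid error)
```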
Remark 1.19. One can obtain max_{η≥0} instead of max_{η∈R} in the right-hand side of (1.283) by taking, instead of the hyperplane H = {x ∈ X | Φ₀(x) = d₀}, the closed half-space

\[ D := \{x \in X \mid \Phi_0(x) \ge d_0\}, \tag{1.284} \]

where Φ₀ and d₀ are as above (so H = bd D, the boundary of D), and assuming that f is a proper convex function that is continuous at a point of D ∩ dom f. Indeed, we have, for any Φ ∈ X*,

\[ \inf \Phi(D) = \begin{cases} \eta d_0 & \text{if } \eta \ge 0,\ \Phi = \eta\Phi_0, \\ -\infty & \text{otherwise}, \end{cases} \tag{1.285} \]

whence by (1.268) (with G = D), we obtain

\[ \inf_{\substack{x\in X\\ \Phi_0(x)\ge d_0}} f(x) = \max_{\eta\ge 0}\, \inf_{y\in X}\,\{\, f(y) - \eta\Phi_0(y) + \eta d_0 \,\}. \tag{1.286} \]
The following is a useful tool for the study of the Lagrangian dual problem (1.277) to (P) of (1.261): the function L: X × X* → R̄ defined by

\[ L(x, \Phi) := f(x) - \Phi(x) + \inf \Phi(G) \quad (x \in X,\ \Phi \in X^*) \tag{1.287} \]

is called the Lagrangian function, or simply the Lagrangian, associated with the primal-dual pair {(P), (D)}, or with the dual problem (D). Thus, by (1.278), (1.287), and (1.277),

\[ \lambda(\Phi) = \inf_{y\in X} L(y, \Phi) \quad (\Phi \in X^*), \tag{1.288} \]

\[ \beta = \sup_{\Phi\in X^*}\, \inf_{y\in X} L(y, \Phi); \tag{1.289} \]

therefore, conversely, (D) of (1.277) may be called the dual problem associated with the Lagrangian function (1.287). Duality results for inf f(G), such as Theorem 1.13, can be used to derive characterizations of optimal solutions of convex optimization problems, e.g., the following one, due to Pshenichnyi and Rockafellar (see, e.g., [106, p. 30]):

Theorem 1.14. Let X be a locally convex space, G a convex subset of X, and f: X → R̄ a convex function that is continuous at some point of G. Then for an element g₀ ∈ G the following statements are equivalent:

1°. g₀ ∈ S_G(f) (i.e., f(g₀) = min f(G)).
2°. There exists Φ₀ ∈ X* such that

\[ \Phi_0 \in \partial f(g_0), \tag{1.290} \]

\[ \Phi_0(g_0) = \min \Phi_0(G). \tag{1.291} \]

Proof. If f(g₀) = min f(G), then by Theorem 1.13 there exists Φ₀ ∈ X* such that

\[ f(g_0) = \inf_{x\in X}\,\{\, f(x) - \Phi_0(x) + \inf \Phi_0(G) \,\}, \tag{1.292} \]

whence

\[ f(g_0) \le f(x) - \Phi_0(x) + \inf \Phi_0(G) \quad (x \in X). \tag{1.293} \]

Taking x = g₀ in (1.293), we obtain 0 ≤ −Φ₀(g₀) + inf Φ₀(G), which, since g₀ ∈ G, yields (1.291). Furthermore, by (1.293), we have

\[ f(g_0) - \Phi_0(g_0) \le f(g_0) + \sup\,(-\Phi_0)(G) = f(g_0) - \inf \Phi_0(G) \le f(x) - \Phi_0(x) \quad (x \in X), \]

so (1.290) holds. Conversely, assume 2°. Then by (1.290), we have

\[ f(g_0) - \Phi_0(g_0) = \inf_{x\in X}\,\{\, f(x) - \Phi_0(x) \,\}, \]

which, together with (1.291), yields (1.292). Hence, by Theorem 1.13 and g₀ ∈ G, we obtain f(g₀) = min f(G). □

Remark 1.20. (a) Let us also mention a more classical proof of Theorem 1.14, based on the fact that 2° is equivalent to

\[ -N(G; g_0) \cap \partial f(g_0) \ne \emptyset. \tag{1.294} \]

By the definition (1.122) of χ_G, 1° can also be written in the form (f + χ_G)(g₀) = min (f + χ_G)(G), and, by the definition (1.110) of the subdifferential, this equality holds if and only if 0 ∈ ∂(f + χ_G)(g₀). But since dom χ_G = G, by Theorem 1.5 and formula (1.124) we have

\[ \partial(f + \chi_G)(g_0) = \partial f(g_0) + \partial \chi_G(g_0) = \partial f(g_0) + N(G; g_0), \]

so 1° is equivalent to 0 ∈ ∂f(g₀) + N(G; g₀), that is, to (1.294).

(b) In the particular case that X is a normed linear space and f is the function (1.264), from Theorem 1.14 one obtains again Theorem 1.12 on the characterization of the elements of best approximation, by using the subdifferential formula (1.267). In the case that optimal solutions exist, Theorem 1.14 permits the following sharpening of the basic Lagrangian duality formula (1.268):
Corollary 1.11. Let X be a locally convex space, G a convex subset of X, and f: X → R̄ a proper convex function that is continuous at some point of G ∩ dom f. If problem (P) has a solution, say g₀, then

\[ \min f(G) = f(g_0) = \max_{\Phi\in N(G;\,g_0)}\, \inf_{x\in X}\,\{\, f(x) + \Phi(x) - \Phi(g_0) \,\}. \tag{1.295} \]

Proof. The inequality ≥ in (1.295), with max replaced by sup, is obvious (for Φ ∈ N(G; g₀) and x ∈ G we have Φ(x) − Φ(g₀) ≤ 0). On the other hand, if Φ₀ ∈ X* is as in Theorem 1.14, then −Φ₀ ∈ N(G; g₀), and we have

\[ \inf f(G) = f(g_0) = \inf_{x\in X}\,\{\, f(x) - \Phi_0(x) + \Phi_0(g_0) \,\}, \]

whence (1.295), with the max attained at Φ = −Φ₀. □
We have the following simultaneous characterization of primal and dual solutions, and of strong Lagrangian duality:

Proposition 1.2. Let X be a locally convex space, G a convex subset of X, g₀ ∈ G, f: X → R̄ a function, and Φ₀ ∈ X*. The following statements are equivalent:

1°. We have (1.292).

2°. g₀ is a solution of problem (P) (of (1.261)), Φ₀ is a solution of the dual problem (D) (of (1.277)), and we have strong duality (i.e., α = β, with β being attained).

Proof. If 1° holds, then, by the duality inequality (1.280), we have

\[ \alpha = \inf f(G) \le f(g_0) = \inf_{x\in X}\,\{\, f(x) - \Phi_0(x) + \inf \Phi_0(G) \,\} = \lambda(\Phi_0) \le \beta \le \alpha, \]

which yields 2°. Conversely, if 2° holds, then

\[ f(g_0) = \min f(G) = \alpha = \beta = \inf_{x\in X}\,\{\, f(x) - \Phi_0(x) + \inf \Phi_0(G) \,\}. \] □
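Theorem 1.14 has a familiar geometric reading for projections. Taking the smooth variant f(x) = ½‖x − x₀‖² of the distance function in X = R² (so ∂f(g₀) is the single gradient g₀ − x₀) and a made-up box G (both choices are illustrative assumptions, not from the text), conditions (1.290)-(1.291) reduce to the classical projection characterization ⟨x₀ − g₀, g − g₀⟩ ≤ 0 for all g ∈ G:

```python
import numpy as np

# Illustration of Theorem 1.14 for f(x) = 0.5*||x - x0||**2 in R^2, where the
# subdifferential at g0 is the single gradient {g0 - x0}.  G is a made-up box
# [0,1] x [0,1]; projecting x0 onto it gives g0, and Phi0 := g0 - x0 must then
# satisfy Phi0(g0) = min Phi0(G), i.e. condition (1.291).
x0 = np.array([2.0, 0.5])
g0 = np.clip(x0, 0.0, 1.0)              # projection onto the box: (1.0, 0.5)
phi0 = g0 - x0                          # the element of the subdifferential (1.290)

# check (1.291) on a grid over G
g = np.linspace(0.0, 1.0, 101)
G1, G2 = np.meshgrid(g, g)
values = phi0[0] * G1 + phi0[1] * G2    # Phi0 evaluated over G
print(np.isclose(phi0 @ g0, values.min()))   # True: Phi0(g0) = min Phi0(G)
```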
In the preceding, the primal constraint set G has been an arbitrary convex subset of a locally convex space X. One can consider some more structured ways of expressing the primal constraint sets G ⊆ X. For example, one can take

\[ G = \{x \in X \mid u(x) \in T\}, \tag{1.296} \]

where Z is a set, T is a subset of Z, u: X → Z is a mapping, and f: X → R̄ is a function. It is convenient to introduce the following terminology:

Definition 1.3. (a) Any triple (X, Z, u), consisting of two sets X, Z and a mapping u: X → Z, is called a system. Given a system (X, Z, u), any subset T of Z is called a target set.

(b) A linear system is a triple (X, Z, u) consisting of two locally convex spaces X, Z and a continuous linear mapping u: X → Z.

(c) A convex system is a triple (X, Z, u) consisting of a locally convex space X, a partially ordered locally convex space Z = (Z, ≤), and a "convex mapping" u: X → Z, i.e., such that

\[ u(cx_1 + (1-c)x_2) \le c\,u(x_1) + (1-c)\,u(x_2) \quad (x_1, x_2 \in X,\ 0 \le c \le 1). \tag{1.297} \]
Remark 1.21. One can also consider (see, e.g., Combari, Laghdir, and Thibault [33]) convex systems (X, Z ∪ {+∞}, u) in which u: X → Z ∪ {+∞} is a proper convex function (in the sense (1.297)), but in the sequel we shall limit ourselves to the classical case u: X → Z.

Definition 1.4. For a linear (respectively, a convex) system (X, Z, u), a target set T ⊆ Z, and a continuous linear (respectively, a convex) function f: X → R, the primal infimization problem

(P)  \[ \alpha = \inf_{\substack{x\in X\\ u(x)\in T}} f(x) \tag{1.298} \]

is called a linear (respectively, a convex) programming problem.

Remark 1.22. (a) In (1.283), the primal constraint set G = H = {x ∈ X | Φ₀(x) = d₀} may be regarded as being "structured" in the sense of (1.296), namely, by taking Z = R, u = Φ₀, and T = {d₀} (a singleton). A similar remark is valid also for the constraint set (1.284) occurring in (1.286), by taking Z = R, u = Φ₀, and T = {d ∈ R | d ≥ d₀}.

(b) Problem (1.298) is equivalent to problem (P) of (1.261). Indeed, given X, Z, u, T, and f as above, the programming problem (1.298) is nothing but the optimization problem (1.261) with G = {x ∈ X | u(x) ∈ T} := u⁻¹(T). Conversely, every optimization problem (1.261) can be written in the form of a programming problem (1.298), for example by taking Z = X, u = I_X, the identity operator in X (i.e., u(x) = x for all x ∈ X), and T = G. Naturally, a constraint set G of an optimization problem (1.261) can be written in different ways in the form (1.296), which give rise to different dual problems. Note that the advantage of (1.296), (1.298) is that one can also use the properties of T and u.

(c) One can combine problems (1.261) and (1.298), considering the infimization problem

(P)  \[ \alpha = \inf_{\substack{x\in G\\ u(x)\in T}} f(x), \tag{1.299} \]

where X, Z, u, T, and f are as above, and G is a subset of X, called an "abstract constraint set." However, we shall not pursue this direction.

Let us give now some results of unperturbational Lagrangian duality for problems with structured primal constraint sets in the sense of (1.296), which will be used later. Some other unperturbational Lagrangian duality theorems will be deduced in Section 1.4.2 from more general results on perturbational duality. The following result holds for arbitrary extended-real-valued functions.

Proposition 1.3. Let X be a linear space, and let f, l₁, . . . ,
lₘ: X → R̄ be m + 1 functions. Then

\[ \inf_{\substack{y\in X\\ l_i(y)<0\ (i=1,\dots,m)}} f(y) \;\ge\; \inf_{\substack{y\in X\\ l_i(y)\le 0\ (i=1,\dots,m)}} f(y) \;\ge\; \sup_{\eta\in R_+^m}\, \inf_{y\in X}\Big\{ f(y) \dotplus \sum_{i=1}^m \eta_i l_i(y) \Big\}, \tag{1.300} \]

with the upper addition ∔ of (1.84) and the upper multiplication × of (1.92), which we have denoted simply by ×, that is,

\[ \sum_{i=1}^m \eta_i l_i(y) := \sum_{i=1}^m \eta_i \times l_i(y) \quad (y \in X,\ \eta \in R_+^m); \tag{1.301} \]

here, as well as throughout the sequel, Σ means upper addition and R₊ = [0, +∞).

Proof. By (1.301), for any y ∈ X we have

\[ f(y) \dotplus \chi_{\{x\in X \mid l_i(x)\le 0\ (i=1,\dots,m)\}}(y) = \sup_{\eta\in R_+^m}\Big\{ f(y) \dotplus \sum_{i=1}^m \eta_i l_i(y) \Big\} \tag{1.302} \]

(both sides equal +∞ if max_{1≤i≤m} l_i(y) > 0). Consequently, by (1.302) and the well-known inequality inf sup ≥ sup inf,

\[ \inf_{\substack{y\in X\\ l_i(y)<0\ (i=1,\dots,m)}} f(y) \ge \inf_{\substack{y\in X\\ l_i(y)\le 0\ (i=1,\dots,m)}} f(y) = \inf_{y\in X}\Big\{ f(y) \dotplus \chi_{\{x\in X\mid l_i(x)\le 0\ (i=1,\dots,m)\}}(y) \Big\} = \inf_{y\in X}\, \sup_{\eta\in R_+^m}\Big\{ f(y) \dotplus \sum_{i=1}^m \eta_i l_i(y) \Big\} \ge \sup_{\eta\in R_+^m}\, \inf_{y\in X}\Big\{ f(y) \dotplus \sum_{i=1}^m \eta_i l_i(y) \Big\}. \] □
Remark 1.23. The first two terms of (1.300), that is,

(P_<)  \[ \alpha = \inf_{\substack{y\in X\\ l_i(y)<0\ (i=1,\dots,m)}} f(y), \tag{1.303} \]

respectively

(P_≤)  \[ \alpha = \inf_{\substack{y\in X\\ l_i(y)\le 0\ (i=1,\dots,m)}} f(y), \tag{1.304} \]

are "structured" primal convex programming problems in the sense (1.298), with Z = R^m, u: X → R^m defined by

\[ u(y) = (l_1(y), \dots, l_m(y)) \quad (y \in X), \tag{1.305} \]

and T = {z ∈ R^m | z < 0}, respectively T = {z ∈ R^m | z ≤ 0}. By the right-hand sides of (1.310) and (1.326), one can define the Lagrangian dual problem to each of these primal problems by

(D)  \[ \beta = \sup \lambda(R_+^m), \tag{1.306} \]

where
\[ \lambda(\eta) := \inf_{y\in X}\Big\{ f(y) + \sum_{i=1}^m \eta_i l_i(y) \Big\} \quad (\eta = (\eta_1, \dots, \eta_m) \in R_+^m), \tag{1.307} \]

that is, β of (1.315), and it is natural to associate to each of the pairs {(P_<), (D)} and {(P_≤), (D)} the Lagrangian (function)

\[ L(y, \eta) = f(y) + \sum_{i=1}^m \eta_i l_i(y) \quad (y \in X,\ \eta = (\eta_1, \dots, \eta_m) \in R_+^m). \tag{1.308} \]
In the next result of strong duality, the assumptions on the functions f and l₁, . . . , lₘ are very general.

Theorem 1.15. Let X be a linear space, and let f, l₁, . . . , lₘ: X → R̄ be m + 1 convex functions. If the "Slater constraint qualification"

\[ (\operatorname{dom} f) \cap \{y \in X \mid l_i(y) < 0\ (i = 1, \dots, m)\} \ne \emptyset \tag{1.309} \]

is satisfied, then

\[ \inf_{\substack{y\in X\\ l_i(y)<0\ (i=1,\dots,m)}} f(y) = \inf_{\substack{y\in X\\ l_i(y)\le 0\ (i=1,\dots,m)}} f(y) = \max_{\eta\in R_+^m}\, \inf_{y\in X}\Big\{ f(y) + \sum_{i=1}^m \eta_i l_i(y) \Big\}. \tag{1.310} \]
Proof. For simplicity, let us set

\[ I_m := \{1, \dots, m\}. \tag{1.311} \]

For the first equality of (1.310), it is enough to prove the inequality ≤. By (1.309), there exists y₀ ∈ dom f such that

\[ l_i(y_0) < 0 \quad (i \in I_m). \tag{1.312} \]

Let y ∈ X be such that l_i(y) ≤ 0 (i ∈ I_m). Let us denote the left-hand side of (1.310) by α, and put

\[ y_n = y + \tfrac{1}{n}(y_0 - y) \quad (n = 1, 2, \dots). \tag{1.313} \]

Then, by (1.313) and the convexity of the l_i, we have

\[ l_i(y_n) \le \big(1 - \tfrac{1}{n}\big)\, l_i(y) + \tfrac{1}{n}\, l_i(y_0) < 0 \quad (i \in I_m;\ n = 1, 2, \dots), \]

whence, by (1.313) and the convexity of f,

\[ \alpha = \inf_{\substack{y\in X\\ l_i(y)<0\ (i\in I_m)}} f(y) \le f(y_n) \le \big(1 - \tfrac{1}{n}\big) f(y) + \tfrac{1}{n}\, f(y_0) \quad (n = 1, 2, \dots). \tag{1.314} \]

We shall show that α ≤ f(y), which, since y ∈ X with l_i(y) ≤ 0 (i = 1, . . . , m) was arbitrary, will prove the first equality of (1.310). If f(y) = +∞, we are done. If f(y) = −∞, then by (1.314) and y₀ ∈ dom f, we have α = −∞ = f(y). Assume now that f(y) ∈ R. If f(y₀) = −∞, then by (1.314), α = −∞ ≤ f(y). Finally, if f(y₀) ∈ R, then by (1.314),

\[ \alpha - \tfrac{1}{n}\, f(y_0) \le \big(1 - \tfrac{1}{n}\big) f(y) \quad (n = 1, 2, \dots), \]

whence, passing to the limit as n → +∞, we obtain α ≤ f(y).

Let us prove now the second equality of (1.310). Put

\[ \beta := \sup_{\eta\in R_+^m}\, \inf_{y\in X}\Big\{ f(y) + \sum_{i=1}^m \eta_i l_i(y) \Big\}. \tag{1.315} \]

By Proposition 1.3, we have α ≥ β. Hence, if α = −∞, then β = −∞, and therefore, for any η ∈ R₊^m,

\[ \alpha = -\infty = \inf_{y\in X}\Big\{ f(y) + \sum_{i=1}^m \eta_i l_i(y) \Big\}, \]

and we are done. On the other hand, by the Slater condition (1.312), we have α ≤ f(y₀) < +∞, so it remains to consider the case α ∈ R. Define v: R^m → R̄ by

\[ v(z) := \inf_{\substack{y\in X\\ l_i(y)\le z_i\ (i\in I_m)}} f(y) \quad (z = (z_1, \dots, z_m) \in R^m). \tag{1.316} \]

Then v is convex (see, e.g., Ekeland and Temam [54, Ch. 3, Lemma 5.2]) and decreasing (i.e., v(z′) ≥ v(z″) for z′, z″ ∈ R^m, z′ ≤ z″) for the natural (product) order on R^m. Moreover, by (1.312), there exists ε > 0 such that l_i(y₀) ≤ −ε for all i ∈ I_m. Let z₀ := −(ε, . . . , ε) ∈ R^m. Then for any z ∈ R^m with z₀ ≤ z ≤ 0, we have

\[ -\infty < \alpha = v(0) \le v(z) \le v(z_0) = \inf_{\substack{y\in X\\ l_i(y)\le -\varepsilon\ (i\in I_m)}} f(y) \le f(y_0) < +\infty, \]

so v is finite valued on [−ε, 0]^m. Hence, since v is convex, it cannot take the value −∞, and thus it is proper (recall that v(0) = α ∈ R, so v ≢ +∞). Furthermore, since v is decreasing, for any z ∈ R^m with z₀ ≤ z, that is, any z ∈ [−ε, +∞)^m, we have v(z) ≤ v(z₀) < +∞, and hence 0 ∈ int(dom v). Consequently, v is subdifferentiable at 0 (see, e.g., Ekeland and Temam [54, Ch. 1, Proposition 5.2]), so there exists η⁰ ∈ R^m such that

\[ v(z) - v(0) \ge -\sum_{i=1}^m \eta_i^0 z_i \quad (z \in R^m), \tag{1.317} \]

or equivalently,

\[ \inf_{\substack{y\in X\\ l_i(y)\le z_i\ (i\in I_m)}} f(y) + \sum_{i=1}^m \eta_i^0 z_i \ge \alpha \quad (z \in R^m). \tag{1.318} \]

We claim that η⁰ ∈ R₊^m. Indeed, let j ∈ I_m and define a sequence {zⁿ} ⊆ R^m by

\[ z_i^n := \begin{cases} 0 & \text{if } i \ne j, \\ n & \text{if } i = j \end{cases} \quad (i \in I_m;\ n = 1, 2, \dots); \tag{1.319} \]

then, from (1.312) and (1.319) we have l_i(y₀) ≤ z_i^n (i ∈ I_m), whence, by (1.312), (1.316), (1.317), and (1.319),

\[ +\infty > f(y_0) - v(0) \ge \inf_{\substack{y\in X\\ l_i(y)\le z_i^n\ (i\in I_m)}} f(y) - v(0) = v(z^n) - v(0) \ge -\eta_j^0\, n \quad (n = 1, 2, \dots), \]

and therefore η_j⁰ ≥ 0, proving the claim. Let us prove now that

\[ \alpha \le \inf_{y\in X}\Big\{ f(y) + \sum_{i=1}^m \eta_i^0 l_i(y) \Big\}. \tag{1.320} \]

Let y ∈ X. If there exists j ∈ I_m such that l_j(y) = +∞, or if f(y) = +∞, then, by the rules (1.84) and (1.92) for ∔ and ×, we have f(y) + Σ_{i=1}^m η_i⁰ l_i(y) = +∞ ≥ α. Assume now that l_i(y) < +∞ (i ∈ I_m) and f(y) < +∞, and let

\[ J := \{\, i \in I_m \mid l_i(y) = -\infty \,\}. \tag{1.321} \]

Define a sequence {yⁿ} ⊆ R^m by

\[ y_i^n := \begin{cases} l_i(y) & \text{if } i \in I_m \setminus J, \\ -n & \text{if } i \in J \end{cases} \quad (n = 1, 2, \dots). \tag{1.322} \]

Then l_i(y) ≤ y_i^n (i ∈ I_m; n = 1, 2, . . .), whence, by (1.318) and (1.322),

\[ \alpha \le f(y) + \sum_{i=1}^m \eta_i^0 y_i^n = f(y) + \sum_{i\in I_m\setminus J} \eta_i^0 l_i(y) - n \sum_{i\in J} \eta_i^0. \tag{1.323} \]

If there existed j ∈ J such that η_j⁰ > 0, then, passing to the limit in (1.323) for n → +∞, we would obtain a contradiction with α > −∞. Therefore, η_i⁰ = 0 for all i ∈ J, and hence, by (1.92) and (1.323), we have (1.320). Consequently,

\[ \beta \ge \inf_{y\in X}\Big\{ f(y) + \sum_{i=1}^m \eta_i^0 l_i(y) \Big\} \ge \alpha \ge \beta, \]

which yields the second equality in (1.310), with the max attained at η = η⁰. □
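For a concrete instance of Theorem 1.15 (invented for illustration, not from the text): take f(y) = e^y and the single constraint l₁(y) = 1 − y in X = R, so the Slater condition (1.309) holds (l₁(2) = −1 < 0). Both sides of (1.310) equal e, with the max attained at η = e:

```python
import numpy as np

# Toy check of (1.310): f(y) = exp(y), one constraint l1(y) = 1 - y <= 0.
# Slater holds (l1(2) = -1 < 0); the primal value is e (at y = 1), and the dual
# function eta -> inf_y {exp(y) + eta*(1 - y)} equals 2*eta - eta*log(eta),
# maximized at eta = e with value e.
ys = np.linspace(-10.0, 10.0, 20001)          # grid standing in for X = R
f = np.exp(ys)
l1 = 1.0 - ys

primal = np.min(f[l1 <= 1e-9])                # inf over the feasible set (tiny grid tolerance)

def lam(eta):
    return np.min(f + eta * l1)               # inf_y { f(y) + eta*l1(y) }

etas = np.linspace(0.0, 10.0, 2001)
dual = max(lam(eta) for eta in etas)
print(primal, dual)                           # both close to e = 2.718...
```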
Remark 1.24. (a) For a locally convex space X and finite continuous convex functions l₁, . . . , lₘ: X → R, Theorem 1.15 is the particular case Z = R^m of Corollary 1.12(b) below.

(b) Applying Theorem 1.15 to a locally convex space X, m = 1, and the affine function l₁ = −Φ₀ + d₀, where Φ₀ ∈ X*\{0} and d₀ ∈ R, we obtain that if f: X → R̄ is a convex function such that

\[ (\operatorname{dom} f) \cap \{y \in X \mid \Phi_0(y) > d_0\} \ne \emptyset, \tag{1.324} \]

then

\[ \inf_{\substack{y\in X\\ \Phi_0(y)>d_0}} f(y) = \inf_{\substack{y\in X\\ \Phi_0(y)\ge d_0}} f(y) = \max_{\eta\ge 0}\, \inf_{y\in X}\,\{\, f(y) - \eta\Phi_0(y) + \eta d_0 \,\}. \tag{1.325} \]

The second equality of (1.325) is nothing other than (1.286), but now it has been proved under different assumptions.

Under some additional restrictions, but without assuming any constraint qualification, one can obtain a formula of the above type, with max replaced by sup. Indeed, we have the following result:

Theorem 1.16. Let K be a compact convex set in a topological linear space X, and let f, l₁, . . . , lₘ: K → R be m + 1 finite-valued lower semicontinuous convex functions. Then

\[ \inf_{\substack{y\in K\\ l_i(y)\le 0\ (i=1,\dots,m)}} f(y) = \sup_{\eta\in R_+^m}\, \inf_{y\in K}\Big\{ f(y) + \sum_{i=1}^m \eta_i l_i(y) \Big\}. \tag{1.326} \]

Proof. For each y ∈ K with l_i(y) ≤ 0 (i = 1, . . . , m) we have

\[ f(y) = \sup_{\eta\in R_+^m}\Big\{ f(y) + \sum_{i=1}^m \eta_i l_i(y) \Big\}, \]

whence

\[ \inf_{\substack{y\in K\\ l_i(y)\le 0\ (i=1,\dots,m)}} f(y) = \inf_{y\in K}\, \sup_{\eta\in R_+^m}\Big\{ f(y) + \sum_{i=1}^m \eta_i l_i(y) \Big\}. \tag{1.327} \]

Hence, by (1.327) and the minimax Theorem 1.8, we obtain (1.326). □
Let us pass now to (unperturbational) surrogate duality for quasi-convex infimization. A surrogate dual problem to a primal infimization problem (P) of (1.261) is a supremization problem whose objective function is defined with the aid of the same objective function f, but replacing the constraint set G by a family of "surrogate constraint sets" (usually related in some way to G). We have the following result of (strong) surrogate duality for quasi-convex problems (P) of (1.261):

Theorem 1.17. Let X be a locally convex space, G a convex subset of X, f: X → R̄ an upper semicontinuous quasi-convex function for which the constraint set G is "essential," that is,

\[ \inf f(X) < \inf f(G) < +\infty, \tag{1.328} \]

and x₀ an element of X such that

\[ f(x_0) < \inf f(G). \tag{1.329} \]

Then

\[ \inf f(G) = \max_{\substack{\Phi\in X^*\setminus\{0\}\\ \sup\Phi(G)<\Phi(x_0)}}\; \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y). \tag{1.330} \]
Proof. Since f is upper semicontinuous, by (1.329) and Lemma 1.1 we have f(x₀) < inf f(G) = inf f(cl G), whence x₀ ∉ cl G. Hence, since cl G is convex, by the strict separation theorem there exists Φ ∈ X*\{0} such that

\[ \sup \Phi(G) = \sup \Phi(\operatorname{cl} G) < \Phi(x_0). \tag{1.331} \]

Consider any Φ ∈ X*\{0} satisfying (1.331), and any ε > 0. Take g = g_ε ∈ G such that

\[ f(g) \le \inf f(G) + \varepsilon, \tag{1.332} \]

and define a function φ: [0, 1] → R by

\[ \varphi(\vartheta) := \Phi(\vartheta x_0 + (1-\vartheta)g) = \vartheta\,\Phi(x_0) + (1-\vartheta)\,\Phi(g) \quad (0 \le \vartheta \le 1). \tag{1.333} \]

Then φ is continuous on [0, 1], and by (1.331) we have φ(0) = Φ(g) ≤ sup Φ(G) < Φ(x₀) = φ(1), so there exists ϑ₀ ∈ [0, 1) such that

\[ \varphi(\vartheta_0) = \sup \Phi(G). \tag{1.334} \]

Put

\[ y_0 := \vartheta_0 x_0 + (1 - \vartheta_0)\, g. \tag{1.335} \]

Then, by (1.333) and (1.334),

\[ \Phi(y_0) = \Phi(\vartheta_0 x_0 + (1-\vartheta_0)g) = \varphi(\vartheta_0) = \sup \Phi(G); \tag{1.336} \]

furthermore, by (1.335), the quasi-convexity of f, and (1.329), (1.332), we get

\[ f(y_0) \le \max\{f(x_0), f(g)\} \le \max\{\inf f(G),\, \inf f(G) + \varepsilon\} = \inf f(G) + \varepsilon. \tag{1.337} \]

From (1.336) and (1.337) it follows that

\[ \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y) \le \inf f(G) + \varepsilon, \]

whence, since Φ ∈ X*\{0} satisfying (1.331) and ε > 0 have been arbitrary, it follows that

\[ \inf f(G) \ge \sup_{\substack{\Phi\in X^*\setminus\{0\}\\ \sup\Phi(G)<\Phi(x_0)}}\; \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y). \tag{1.338} \]

On the other hand, by (1.328) and since f is upper semicontinuous and quasi-convex, the set

\[ A := \{y \in X \mid f(y) < \inf f(G)\} \tag{1.339} \]

is nonempty, open, and convex; furthermore, clearly, A ∩ G = ∅. Hence, by the separation theorem, there exists Φ₀ ∈ X*\{0} such that

\[ \sup \Phi_0(G) \le \inf \Phi_0(A). \tag{1.340} \]

Then inf Φ₀(A) < Φ₀(x₀) (since by (1.329) we have x₀ ∈ A, and so Lemma 1.8 applies), whence by (1.340), sup Φ₀(G) < Φ₀(x₀). Hence, by (1.338), we obtain

\[ \inf f(G) \ge \inf_{\substack{y\in X\\ \Phi_0(y)=\sup\Phi_0(G)}} f(y). \tag{1.341} \]

Let us show that in (1.341) equality holds, which will complete the proof. If not, then there exists y₀ ∈ X with Φ₀(y₀) = sup Φ₀(G) such that inf f(G) > f(y₀) (so y₀ ∈ A). Thus, the hyperplane

\[ H := \{y \in X \mid \Phi_0(y) = \sup \Phi_0(G)\} \tag{1.342} \]

contains y₀, and hence in the open neighborhood A of y₀ there exists y₁ ∈ A such that Φ₀(y₁) < sup Φ₀(G); indeed, one can take

\[ y_1 := \tfrac{1}{1-\mu}\, y_0 - \tfrac{\mu}{1-\mu}\, x_0, \]

with μ > 0 sufficiently small, since then y₁ is sufficiently near to y₀ (so y₁ ∈ A), and

\[ \Phi_0(y_1) = \tfrac{1}{1-\mu}\,\Phi_0(y_0) - \tfrac{\mu}{1-\mu}\,\Phi_0(x_0) = \tfrac{1}{1-\mu}\sup\Phi_0(G) - \tfrac{\mu}{1-\mu}\,\Phi_0(x_0) < \sup\Phi_0(G). \]

But this contradicts (1.340). □
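A finite-dimensional sketch of Theorem 1.17 (the data are invented for illustration): in X = R² take f(y) = ‖y‖, G = {y : y₁ ≥ 1}, and x₀ = 0, so (1.328)-(1.329) hold and inf f(G) = 1. The functional Φ(y) = −y₁ satisfies sup Φ(G) = −1 < 0 = Φ(x₀), and the infimum of f over the surrogate hyperplane {y : Φ(y) = sup Φ(G)} = {y : y₁ = 1} already attains inf f(G):

```python
import numpy as np

# Surrogate duality (1.330) in a made-up example: X = R^2, f(y) = ||y||,
# G = {y : y1 >= 1}, x0 = 0, so f(x0) = 0 < 1 = inf f(G).
g = np.arange(-4.0, 4.25, 0.25)           # exact binary grid containing 0.0 and 1.0
Y1, Y2 = np.meshgrid(g, g)
f = np.hypot(Y1, Y2)                      # f(y) = ||y||

primal = f[Y1 >= 1.0].min()               # inf f(G) = 1, attained at (1, 0)

# Phi(y) := -y1 has sup Phi(G) = -1 < 0 = Phi(x0); the surrogate problem
# minimizes f over the hyperplane {y : Phi(y) = sup Phi(G)} = {y : y1 = 1}
surrogate = f[Y1 == 1.0].min()
print(primal, surrogate)                  # both equal 1.0
```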
Remark 1.25. (a) Geometrically, Theorem 1.17 means that under the assumptions (1.328) and (1.329) we have

\[ \inf f(G) = \max_{H\in\mathcal{H}_{G,x_0}} \inf f(H), \tag{1.343} \]

where 𝓗_{G,x₀} denotes the set of all hyperplanes that quasi-support the set G and that strictly separate G and x₀ (see Lemma 1.4); thus, (1.343) reduces the computation of inf f(G) to that of inf f(H), for H ∈ 𝓗_{G,x₀}, so it may be called a "hyperplane theorem" of surrogate duality; note also that (1.343) generalizes the distance formula (1.249). In other words, Theorem 1.17 gives the following extension to quasi-convex optimization of the "reduction principle" of Remark 1.16(b): it permits one to apply any formula known for inf f(H) to the computation of inf f(G).

(b) The above proof of Theorem 1.17 shows that in (1.330) it is enough to take the max over the set

\[ \{\Phi \in X^*\setminus\{0\} \mid \sup \Phi(G) \le \inf \Phi(A)\} \tag{1.344} \]

(where A is defined by (1.339)), which is contained in the set

\[ \{\Phi \in X^*\setminus\{0\} \mid \sup \Phi(G) < \Phi(x_0)\} \tag{1.345} \]

occurring in (1.330). On the other hand, in (1.330) one can take the max over the larger set

\[ \{\Phi \in X^*\setminus\{0\} \mid \sup \Phi(G) \le \Phi(x_0)\}, \tag{1.346} \]

as follows by slightly modifying the above proof (namely, replacing the sign < by ≤ in (1.331), and ϑ₀ ∈ [0, 1) by ϑ₀ ∈ [0, 1] in (1.334)-(1.336)). Therefore, it is natural to ask whether one can further enlarge the set (1.346), e.g., to the "barrier cone" of G, defined by

\[ G^b := \{\Phi \in X^*\setminus\{0\} \mid \sup \Phi(G) < +\infty\}, \tag{1.347} \]

i.e., whether

\[ \inf f(G) = \max_{\Phi\in G^b}\; \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y). \tag{1.348} \]
However, the answer is negative, even when G is a closed convex set and f is a finite continuous convex function on a finite-dimensional space X, as shown by the following example: In X = R, let

\[ G = \{x \in X \mid -2 \le x \le -1\}, \qquad x_0 = -3, \tag{1.349} \]

\[ f(x) = x \quad (x \in X), \tag{1.350} \]

\[ \Phi_0(x) = x \quad (x \in X). \tag{1.351} \]

Then sup Φ₀(G) = −1 < +∞ (so Φ₀ ∈ G^b) and f(x₀) = −3 < −2 = inf f(G), but

\[ \{x \in X \mid \Phi_0(x) = \sup \Phi_0(G)\} = \{x \in X \mid x = -1\} = \{-1\}, \]

whence

\[ \inf f(G) = -2 < \inf_{\substack{x\in X\\ \Phi_0(x)=\sup\Phi_0(G)}} f(x) = f(-1) = -1, \]
and hence (1.348) does not hold. Thus, it is necessary to obtain a smaller set than G^b in the right-hand side of the duality formulas involving the hyperplanes {x ∈ X | Φ(x) = sup Φ(G)}, and this was accomplished above by the assumptions (1.328) (i.e., that the constraint set G is essential) and (1.329). Actually, in Chapter 3 we shall see that under certain assumptions the right-hand side of (1.348) gives an expression for sup f(G) (so it cannot be an expression for inf f(G) when f is not constant on G).

(c) A fixed element x₀ ∈ X satisfying (1.329) arises quite naturally in some optimization problems. For example, as has been observed above, best approximation may be regarded as a particular case of infimization, by taking X to be a normed linear space, x₀ ∈ X, and f: X → R̄ the convex function (1.264). Indeed, then, by (1.265), we have 0 < dist(x₀, G) (or equivalently, x₀ ∉ cl G) if and only if (1.329) holds. Thus, Theorem 1.17, in the geometric form (1.343), is an extension of Remark 1.15(b).

Lagrangian and surrogate duality theorems are closely related, as shown by the following observations:

Remark 1.26. (a) The substitution method: Let us recall the following method of deducing Lagrangian duality results from surrogate duality results, introduced in [214], which we shall call the substitution method: Given a set G ⊆ X and a function f: X → R̄, if we have a surrogate duality result for inf f(G), expressed using, e.g., inf f(H), for H belonging to a family of hyperplanes or closed half-spaces or open half-spaces (see the above reduction principle), and if we know a Lagrangian duality formula for inf f(H), then by substituting it in the surrogate duality result for inf f(G), we can obtain a Lagrangian duality result for inf f(G). Usually, it is convenient to assume that f is convex, since then the known Lagrangian duality theorems may be applicable. For example, following [214, pp. 247-248], let us show that using this substitution method, the surrogate duality result of Theorem 1.17, together with the Lagrangian duality formula for inf f(H) given in Corollary 1.10 (with Φ₀ = Φ, d₀ = sup Φ(G)), imply Theorem 1.13. Indeed, assume first that the constraint set G is not essential, that is, inf f(X) = inf f(G). Then

\[ \inf f(X) = \inf f(G) \ge \sup_{\Phi\in X^*}\, \inf_{y\in X}\,\{\, f(y) - \Phi(y) + \inf \Phi(G) \,\} \ge \inf f(X) \]

(the last inequality follows by taking Φ = 0), which implies (1.268). On the other hand, assume now that the constraint set G is "essential," i.e., that we have (1.328), so there exists x₀ ∈ X satisfying (1.329). In order to handle this case, note that by Remark 1.25(b), and since −inf(−Φ(G)) = sup Φ(G) < Φ(x₀) is equivalent to inf(−Φ)(G) > (−Φ)(x₀), we can write (1.330) in the equivalent form
\[ \inf f(G) = \max_{\substack{\Phi\in X^*\setminus\{0\}\\ \Phi(x_0)<\inf\Phi(G)}}\; \inf_{\substack{y\in X\\ \Phi(y)=\inf\Phi(G)}} f(y). \tag{1.352} \]
Now formula (1.352) and Corollary 1.10 (with Φ₀ = Φ, d₀ = inf Φ(G)) imply

\[ \inf f(G) = \max_{\substack{\Phi\in X^*\setminus\{0\}\\ \Phi(x_0)<\inf\Phi(G)}}\; \inf_{\substack{y\in X\\ \Phi(y)=\inf\Phi(G)}} f(y) = \max_{\substack{\Phi\in X^*\setminus\{0\}\\ \Phi(x_0)<\inf\Phi(G)}}\; \max_{\eta\in R}\, \inf_{y\in X}\,\{\, f(y) - \eta\Phi(y) + \eta \inf \Phi(G) \,\}. \tag{1.353} \]

Hence, there exist Φ₀ ∈ X*\{0} with Φ₀(x₀) < inf Φ₀(G) and η₀ ∈ R such that

\[ \inf f(G) = \inf_{y\in X}\,\{\, f(y) - \eta_0\Phi_0(y) + \eta_0 \inf \Phi_0(G) \,\}. \tag{1.354} \]

We claim that η₀ > 0. Indeed, by (1.354) and (1.329), we have

\[ \inf f(G) \le f(x_0) - \eta_0\Phi_0(x_0) + \eta_0 \inf \Phi_0(G) < \inf f(G) - \eta_0\Phi_0(x_0) + \eta_0 \inf \Phi_0(G), \]

whence

\[ \eta_0\{ -\Phi_0(x_0) + \inf \Phi_0(G) \} = -\eta_0\Phi_0(x_0) + \eta_0 \inf \Phi_0(G) > 0. \]

This inequality, together with Φ₀(x₀) < inf Φ₀(G), implies that η₀ > 0, which proves the claim. Now, by η₀ > 0, we have η₀ inf Φ₀(G) = inf η₀Φ₀(G), and hence by (1.354),

\[ \inf f(G) = \inf_{y\in X}\,\{\, f(y) - \eta_0\Phi_0(y) + \inf \eta_0\Phi_0(G) \,\}, \]

which together with (1.269), implies (1.268) (with the max attained for Φ = η₀Φ₀), completing the proof of Theorem 1.13. Note that although in the case of Lagrangian duality for convex optimization the direct method of proof is simpler than the above substitution method, it will turn out that in some other cases, like that of reverse convex optimization, the situation is different (see Chapter 7).

(b) In the converse direction, one can show (see [214, p. 247]) that Theorem 1.13 of Lagrangian duality and the "duality inequality" ≥ in Theorem 1.17 of surrogate duality (which is the simpler part of Theorem 1.17) imply Theorem 1.17.

Now we shall show that, replacing hyperplanes by closed half-spaces, one can also give corresponding "half-space theorems" of surrogate duality, involving the whole barrier cone G^b (defined by (1.347)) in the right-hand side. Indeed, we have the following half-space theorem of weak duality.

Theorem 1.18. Let X be a locally convex space, G a subset of X with G^b ≠ ∅, and f: X → R̄ a function. The following statements are equivalent:
1°. We have
\[ \inf f(G) = \sup_{\Phi\in X^*\setminus\{0\}}\; \inf_{\substack{y\in X\\ \Phi(y)\le\sup\Phi(G)}} f(y) = \sup_{\Phi\in G^b}\; \inf_{\substack{y\in X\\ \Phi(y)\le\sup\Phi(G)}} f(y). \tag{1.355} \]

2°. For each d ∈ R, d < inf f(G), there exists Φ_d ∈ X*\{0} such that

\[ \sup \Phi_d(G) < \Phi_d(x) \quad (x \in A_d(f)). \tag{1.356} \]

3°. For each d ∈ R, d < inf f(G), there exists Φ_d ∈ X*\{0} such that

\[ \sup \Phi_d(G) < \Phi_d(x) \quad (x \in S_d(f)). \tag{1.357} \]
Proof. The second equality in (1.355) is obvious, since G^b ≠ ∅ and since for Φ ∈ X*\G^b we have {y ∈ X | Φ(y) ≤ sup Φ(G)} = X. Rather than giving a direct proof of Theorem 1.18, as in [231], we shall show how the result can be deduced from a more general duality theorem of Chapter 3. Define a polarity Δ = Δ_G: 2^X → 2^{X*\{0}} by

\[ \Delta_G(C) := \{\, \Phi \in X^*\setminus\{0\} \mid \Phi(c) > \sup \Phi(G)\ (c \in C) \,\} \quad (C \subseteq X). \tag{1.358} \]

For this polarity we have, by (1.150),

\[ (\Delta_G)'(\{\Phi\}) = \{\, x \in X \mid \Phi(x) > \sup \Phi(G) \,\} \quad (\Phi \in X^*\setminus\{0\}). \tag{1.359} \]

Then 1° is nothing but (3.56) below, and we shall show that conditions 2° and 3° above are equivalent to conditions 2° and 3° of Chapter 3, Theorem 3.3, respectively, for W = X*\{0}, Δ = Δ_G, and α = inf f(G). Indeed, first, for each d ∈ R, d > inf f(G), each g ∈ G such that d > f(g) ≥ inf f(G), and each Φ ∈ X*\{0}, we have g ∈ A_d(f) ∩ ∁(Δ_G)′({Φ}) (note: equivalently, one can observe directly that we always have

\[ \inf f(G) \ge \beta_\Delta := \sup_{\Phi\in X^*\setminus\{0\}}\, \inf f\big(\complement(\Delta_G)'(\{\Phi\})\big) = \sup_{\Phi\in X^*\setminus\{0\}}\; \inf_{\substack{x\in X\\ \Phi(x)\le\sup\Phi(G)}} f(x), \tag{1.360} \]

because G ⊆ {x ∈ X | Φ(x) ≤ sup Φ(G)} for each Φ ∈ X*\{0}). Furthermore, for each d ∈ R and Φ_d ∈ X*\{0}, (1.356) holds if and only if

\[ A_d(f) \cap \complement(\Delta_G)'(\{\Phi_d\}) = A_d(f) \cap \{\, x \in X \mid \Phi_d(x) \le \sup \Phi_d(G) \,\} = \emptyset. \]

Hence, the assertion on condition 2° follows. The proof for condition 3° is similar. □

Remark 1.27. (a) The assumption G^b ≠ ∅ implies that G ≠ X (since otherwise, for each Φ ∈ X*\{0} we would have sup Φ(G) = sup Φ(X) = +∞,
so G^b = ∅), and if G is convex, then the converse is also true (take any x ∉ cl G, and apply the strict separation theorem).

(b) Geometrically, formula (1.355) means that

\[ \inf f(G) = \sup_{\Phi\in G^b}\, \inf f\big(V_\Phi^{\sup\Phi(G)}\big), \tag{1.362} \]

where V_Φ^d is as in (1.30), with d = sup Φ(G), i.e., the smallest closed half-space determined by Φ that contains G; note that if Φ ∈ G^b, then V_Φ^{sup Φ(G)} ≠ X.

(c) If G^b ≠ ∅ and (1.355) holds, then we also have

\[ \inf f(G) = \sup_{\substack{\Phi\in G^b,\ d\in R\\ \sup\Phi(G)\le d}}\; \inf_{\substack{x\in X\\ \Phi(x)\le d}} f(x), \tag{1.363} \]

or, geometrically,

\[ \inf f(G) = \sup_{\substack{V\in\mathcal{V}\\ G\subseteq V}}\, \inf f(V), \tag{1.364} \]

where 𝒱 denotes the set of all closed half-spaces in X. Indeed, by (1.362) we have the inequality ≤ in (1.364), and on the other hand, for any set V with G ⊆ V we have inf f(G) ≥ inf f(V).

We have the following result of strong duality.

Theorem 1.19. Let X be a locally convex space, G a subset of X with G^b ≠ ∅, and f: X → R̄ a function. The following statements are equivalent:

1°. We have

\[ \inf f(G) = \max_{\Phi\in G^b}\; \inf_{\substack{y\in X\\ \Phi(y)\le\sup\Phi(G)}} f(y). \tag{1.365} \]
2°. There exists Φ_α ∈ X*\{0} such that

\[ \sup \Phi_\alpha(G) \le \Phi_\alpha(x) \quad (x \in A_\alpha(f)), \tag{1.366} \]

where α = inf f(G).

Proof. Assume 1°. If α = −∞ (so (1.365) holds by (1.360)), then (1.366) is vacuously satisfied for any Φ_α ∈ X*\{0}. If α > −∞, then, by (1.365), there exists Φ_α ∈ G^b such that

\[ \alpha = \inf f(G) = \inf_{\substack{y\in X\\ \Phi_\alpha(y)\le\sup\Phi_\alpha(G)}} f(y), \tag{1.367} \]

whence, by Chapter 3, Lemma 3.4(a) (applied to the half-space {y ∈ X | Φ_α(y) ≤ sup Φ_α(G)}), we obtain (1.366).

Conversely, assume now 2°. If α = −∞ (so (1.366) is vacuously satisfied for any Φ_α ∈ X*\{0}), then by (1.360) we have (1.365). If α > −∞, then by 2° and Chapter 3, Lemma 3.4(a), we have

\[ \inf_{\substack{y\in X\\ \Phi_\alpha(y)\le\sup\Phi_\alpha(G)}} f(y) \ge \alpha = \inf f(G), \]

whence by (1.360), we obtain (1.365). □
As an application of Theorem 1.19, let us give now a sufficient condition for strong duality.

Theorem 1.20. If X is a locally convex space, G a convex subset of X with G^b ≠ ∅, and f: X → R̄ a function such that A_α(f) is nonempty, convex, and open, where α = inf f(G) (in particular, if f is upper semicontinuous and quasi-convex), then we have (1.365).

Proof. Since G and A_α(f) are nonempty convex subsets of X with A_α(f) open, and since G ∩ A_α(f) = ∅, by the separation theorem there exists Φ ∈ X*\{0} such that

\[ \sup \Phi(G) \le \inf \Phi(A_\alpha(f)) \le \Phi(x) \quad (x \in A_\alpha(f)), \]

where the last inequality holds by Lemma 1.8. Hence, by Theorem 1.19, we have (1.365). □

We also have the following result on surrogate duality formulas having inf_{y∈X, Φ(y)≤sup Φ(G)} f(y) in the right-hand side (half-space theorems):

Proposition 1.4. Let X be a locally convex space, G a subset of X, f: X → R̄ a function satisfying (1.328), and x₀ an element of X satisfying (1.329). The following statements are equivalent:

1°. We have (1.355).

2°. We have

\[ \inf f(G) = \sup_{\substack{\Phi\in G^b\\ \sup\Phi(G)<\Phi(x_0)}}\; \inf_{\substack{y\in X\\ \Phi(y)\le\sup\Phi(G)}} f(y). \tag{1.368} \]
Also, a similar equivalence holds for the corresponding strong duality equalities (i.e., for (1.365) and sup replaced by max in (1.368)).

Proof. By (1.329), we have

\[ \sup_{\substack{\Phi\in G^b\\ \Phi(x_0)\le\sup\Phi(G)}}\; \inf_{\substack{y\in X\\ \Phi(y)\le\sup\Phi(G)}} f(y) \le f(x_0) < \inf f(G), \]

and hence if 1° holds, then we have 2°. On the other hand, by G ⊆ {y ∈ X | Φ(y) ≤ sup Φ(G)} (Φ ∈ X*), we have

\[ \inf f(G) \ge \sup_{\Phi\in G^b}\; \inf_{\substack{y\in X\\ \Phi(y)\le\sup\Phi(G)}} f(y) \ge \sup_{\substack{\Phi\in G^b\\ \sup\Phi(G)<\Phi(x_0)}}\; \inf_{\substack{y\in X\\ \Phi(y)\le\sup\Phi(G)}} f(y), \]

and hence if 2° holds, then we have 1°. The proof for the corresponding strong duality equalities is similar. □
Remark 1.28. Besides the close connection between Lagrangian and surrogate duality theorems, mentioned in Remark 1.26(a), for any function $f\colon X\to\overline R$ we have the following obvious relation between the Lagrangian dual value $\beta_{\mathrm{Lagr}}$ (see, e.g., (1.268)) and the surrogate dual value $\beta_{\mathrm{surr}}$ of type (1.355):
$$\inf f(G)\ge\beta_{\mathrm{surr}}:=\sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{y\in X\\ \Phi(y)<\sup\Phi(G)}}f(y)\ge\sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{g\in G}\{f(g)+\Phi(g)\dotplus-\sup\Phi(G)\}\ge\sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{y\in X}\{f(y)+\Phi(y)\dotplus-\sup\Phi(G)\}=\beta_{\mathrm{Lagr}};\tag{1.369}$$
thus, in particular, the equality $\alpha=\beta_{\mathrm{surr}}$ holds for a larger class of problems (1.261) than the equality $\alpha=\beta_{\mathrm{Lagr}}$. However, surrogate duality $\alpha=\beta_{\mathrm{surr}}$ is useful even for some convex problems for which we have $\alpha=\beta_{\mathrm{Lagr}}$ (whence also $\alpha=\beta_{\mathrm{surr}}$), as shown, for example, by the problem of best approximation (see Remark 1.25(c)). Furthermore, surrogate dual problems are also convenient for computations.

Replacing the closed half-spaces $\{x\in X\mid \Phi(x)\le\sup\Phi(G)\}$ by the sets $\{x\in X\mid \Phi(x)\in\Phi(G)\}$, one obtains the following result:

Theorem 1.21. Let $X$ be a locally convex space, $G$ a subset of $X$, and $f\colon X\to\overline R$ a function. The following statements are equivalent:
1°. We have
$$\inf f(G)=\sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{y\in X\\ \Phi(y)\in\Phi(G)}}f(y).\tag{1.370}$$
2°. For each $d\in R$, $d<\inf f(G)$, there exists $\Phi_d\in X^*\setminus\{0\}$ such that
$$A_d(f)\cap\{x\in X\mid \Phi_d(x)\in\Phi_d(G)\}=\emptyset.\tag{1.371}$$
3°. For each $d\in R$, $d<\inf f(G)$, there exists $\Phi_d\in X^*\setminus\{0\}$ such that
$$S_d(f)\cap\{x\in X\mid \Phi_d(x)\in\Phi_d(G)\}=\emptyset.\tag{1.372}$$
Proof. One can proceed similarly to the above proof of Theorem 1.18, defining a polarity $\Delta=\Delta^G\colon 2^X\to 2^{X^*\setminus\{0\}}$ by
$$\Delta^G(C):=\{\Phi\in X^*\setminus\{0\}\mid \Phi(C)\cap\Phi(G)=\emptyset\}\qquad(C\subset X),\tag{1.373}$$
and observing that for this polarity we have, by (1.150),
$$(\Delta^G)'(\{\Phi\})=\{x\in X\mid \Phi(x)\notin\Phi(G)\}\qquad(\Phi\in X^*\setminus\{0\}),\tag{1.374}$$
$$\inf f(G)\ge\beta^G:=\sup_{\Phi\in X^*\setminus\{0\}}\inf f\bigl(\complement(\Delta^G)'(\{\Phi\})\bigr)=\sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{x\in X\\ \Phi(x)\in\Phi(G)}}f(x).\tag{1.375}$$ □
Remark 1.29. Formula (1.370) suggests splitting the left-hand and right-hand sides of this duality formula into two optimization "problems," namely, the initial primal problem (P) of (1.261) and the "dual problem"
$$(D)\qquad \beta=\sup_{\Phi\in X^*\setminus\{0\}}\lambda(\Phi),\tag{1.376}$$
where
$$\lambda(\Phi)=\inf_{\substack{y\in X\\ \Phi(y)\in\Phi(G)}}f(y)\qquad(\Phi\in X^*\setminus\{0\});\tag{1.377}$$
this is a surrogate dual problem in the sense mentioned above, with the "surrogate constraint sets"
$$\Omega_{G,\Phi}:=\{y\in X\mid \Phi(y)\in\Phi(G)\}\qquad(\Phi\in X^*\setminus\{0\});\tag{1.378}$$
in other words,
$$\lambda(\Phi)=\inf f(\Omega_{G,\Phi})\qquad(\Phi\in X^*\setminus\{0\}).\tag{1.379}$$
Naturally, similar remarks can be made also for the preceding surrogate duality formulas. The next result of strong duality corresponds to Theorem 1.19.

Theorem 1.22. Let $X$ be a locally convex space, $G$ a subset of $X$, and $f\colon X\to\overline R$ a function. The following statements are equivalent:
1°. We have
$$\inf f(G)=\max_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{y\in X\\ \Phi(y)\in\Phi(G)}}f(y).\tag{1.380}$$
2°. There exists $\Phi_\alpha\in X^*\setminus\{0\}$ such that
$$A_\alpha(f)\cap\{x\in X\mid \Phi_\alpha(x)\in\Phi_\alpha(G)\}=\emptyset.\tag{1.381}$$
Proof. The proof is similar to the above proof of Theorem 1.19, using now the set $\Omega=\{y\in X\mid \Phi_\alpha(y)\in\Phi_\alpha(G)\}$. □
1.4 Duality for convex and quasi-convex infimization
We also have the following result corresponding to Proposition 1.4.

Proposition 1.5. Let $X$ be a locally convex space, $G$ a subset of $X$, $f\colon X\to\overline R$ a function satisfying (1.328), and $x_0$ an element of $X$ satisfying (1.329). The following statements are equivalent:
1°. We have (1.370).
2°. We have
$$\inf f(G)=\sup_{\substack{\Phi\in X^*\setminus\{0\}\\ \Phi(x_0)\in\Phi(G)}}\ \inf_{\substack{y\in X\\ \Phi(y)\in\Phi(G)}}f(y).\tag{1.382}$$
Also, a similar equivalence holds for the corresponding strong duality equalities (i.e., for (1.380) and sup replaced by max in (1.382)).

Proof. The proof is similar to that of Proposition 1.4, observing that we have
$$\sup_{\substack{\Phi\in X^*\setminus\{0\}\\ \Phi(x_0)\in\Phi(G)}}\ \inf_{\substack{y\in X\\ \Phi(y)\in\Phi(G)}}f(y)\le f(x_0)\le\inf f(G),$$
and using the obvious inclusion $G\subset\{y\in X\mid \Phi(y)\in\Phi(G)\}$. □
1.4.2 Perturbational theory

The theory of perturbational dual problems, of which we shall present some elements in this section, is a convenient tool to handle duality both for problems (1.261) with general constraint sets $G$ and for problems (1.298) with the structured constraint sets (1.296). Let us consider the primal infimization problem (P) of (1.261). Clearly,
$$\inf f(G)=\inf\bar f(X),\tag{1.383}$$
where $\bar f\colon X\to\overline R$ is the function defined by
$$\bar f(x):=f(x)+\chi_G(x)=\begin{cases}f(x)&\text{if }x\in G,\\ +\infty&\text{if }x\notin G.\end{cases}\tag{1.384}$$
Thus, problem (1.261) and the primal problem
$$(\bar P)\qquad \bar\alpha=\inf\bar f(X)\tag{1.385}$$
have the same value. Moreover, if $f|_G\not\equiv+\infty$, which we shall assume in the sequel without any special mention, then problems (P) and $(\bar P)$ have the same optimal solutions; indeed, if $g_0\in G$, $f(g_0)=\inf f(G)$, then $\bar f(g_0)=f(g_0)+\chi_G(g_0)=\inf f(G)=\inf\bar f(X)$, and conversely, if $x_0\in X$ and $\bar f(x_0)=\inf\bar f(X)$, then $f(x_0)+\chi_G(x_0)=\bar f(x_0)=\inf\bar f(X)=\inf f(G)<+\infty$, whence $x_0\in G$ and $f(x_0)=\inf f(G)$. Furthermore, if $G$ is convex and $f\colon X\to\overline R$ is a function such that $f|_G$ is convex (respectively, quasi-convex), then $\bar f$ is convex (respectively, quasi-convex)
on the whole space $X$. Therefore, we shall assume from the beginning that we are given an unconstrained primal infimization problem
$$(P)\qquad \alpha=\inf\phi(X),\tag{1.386}$$
where $\phi\colon X\to\overline R$ is a function, and then, taking in particular $\phi=f+\chi_G$ and a suitable perturbation $p$, the duality theory for (P) of (1.386) will yield a duality theory for (P) of (1.261). A classical way of defining a dual problem to the primal infimization problem (P) of (1.386) is to embed it into a family of "perturbed" infimization problems, as follows. Let $Z$ be a locally convex space (called the set of "perturbations" or of "parameters"), and $p\colon X\times Z\to\overline R$ a function (called a "perturbation function," or "parameterization"), such that
$$p(x,0)=\phi(x)\qquad(x\in X),\tag{1.387}$$
so (P) of (1.386) is nothing other than
$$(P)\qquad \alpha=\inf_{x\in X}p(x,0).\tag{1.388}$$
With the aid of this perturbation function, (P) is embedded into the family of "perturbed" (or "parameterized") infimization problems
$$(P_z)\qquad v(z):=\inf_{x\in X}p(x,z)\qquad(z\in Z);\tag{1.389}$$
indeed, then $(P)=(P_0)$ and
$$\alpha=v(0).\tag{1.390}$$
One defines the Lagrangian dual problem associated with the perturbation function $p$ (or relative to the parameterization $(Z,p)$) as the unconstrained supremization problem
$$(D)\qquad \beta:=\sup\lambda(Z^*),\tag{1.391}$$
where $\lambda\colon Z^*\to\overline R$ is the dual objective function defined by
$$\lambda(\Psi):=\inf_{x\in X}\Bigl\{\inf_{z\in Z}\{p(x,z)-\Psi(z)\}\Bigr\}\qquad(\Psi\in Z^*).\tag{1.392}$$
By the canonical identification of $X^*\times Z^*$ with $(X\times Z)^*$, given by (1.28), we can write
$$\lambda(\Psi)=-\sup_{x\in X}\Bigl\{-\inf_{z\in Z}\{p(x,z)-\Psi(z)\}\Bigr\}=-\sup_{x\in X}\sup_{z\in Z}\{\Psi(z)-p(x,z)\}=-\sup_{(x,z)\in X\times Z}\{(0,\Psi)(x,z)-p(x,z)\}=-p^*(0,\Psi)\qquad(\Psi\in Z^*),\tag{1.393}$$
and thus (D) is nothing other than the problem of supremization of the concave upper semicontinuous function $\lambda$:
$$(D)\qquad \beta=\sup_{\Psi\in Z^*}\lambda(\Psi)=\sup_{\Psi\in Z^*}\{-p^*(0,\Psi)\}.\tag{1.394}$$
One says that weak duality holds if $\alpha=\beta$, and strong duality holds if $\alpha=\beta$ and the sup in the dual problem (D) of (1.391) is a max, i.e., it is attained for some $\Psi_0\in Z^*$ (in other words, if (D) has an optimal solution $\Psi_0$). There are known various sufficient conditions for achieving strong duality, of which we shall mention only the following one:

Proposition 1.6 (see, e.g., Ekeland and Temam [54, Ch. III, Propositions 2.3 and 2.2]). With the above notation, assume that $\alpha$ is finite, $p$ is convex, and there exists an element $x_0\in X$ such that the function $z\mapsto p(x_0,z)$ is finite and continuous at $z=0$. Then
$$\inf\phi(X)=\max\lambda(Z^*);\tag{1.395}$$
i.e., we have $\alpha=\beta$ and the sup in (1.391) is attained for some $\Psi_0\in Z^*$.

The following is a useful tool for the study of the Lagrangian dual problem (1.391), (1.392) associated with $p$: the function $L\colon X\times Z^*\to\overline R$ defined by
$$L(x,\Psi):=\inf_{z\in Z}\{p(x,z)-\Psi(z)\}\qquad(x\in X,\ \Psi\in Z^*)\tag{1.396}$$
is called the Lagrangian function, or simply the Lagrangian, associated with $p$. Thus, considering the partial functions
$$p_x(z):=p(x,z)\qquad(x\in X,\ z\in Z),\tag{1.397}$$
we have
$$L(x,\Psi)=\inf_{z\in Z}\{p_x(z)-\Psi(z)\}=-p_x^*(\Psi)\qquad(x\in X,\ \Psi\in Z^*).\tag{1.398}$$
By (1.392) and (1.396),
$$\lambda(\Psi)=\inf_{x\in X}L(x,\Psi)\qquad(\Psi\in Z^*),\tag{1.399}$$
and hence by (1.391),
$$\beta=\sup_{\Psi\in Z^*}\inf_{x\in X}L(x,\Psi).\tag{1.400}$$
On the other hand, by (1.387), (1.397), and (1.398),
$$\phi(x)=p_x(0)\ge p_x^{**}(0)=\sup_{\Psi\in Z^*}\{\Psi(0)+L(x,\Psi)\}=\sup_{\Psi\in Z^*}L(x,\Psi)\qquad(x\in X),\tag{1.401}$$
and hence by the inequality $\inf\sup\ge\sup\inf$ and (1.400), we obtain the "duality inequality"
$$\alpha=\inf\phi(X)\ge\inf_{x\in X}\sup_{\Psi\in Z^*}L(x,\Psi)\ge\sup_{\Psi\in Z^*}\inf_{x\in X}L(x,\Psi)=\beta.\tag{1.402}$$
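To make the scheme (1.387)–(1.402) concrete, here is a minimal numeric sketch (our own toy instance, not from the text): $X=Z=R$, $\phi(x)=x^2+\chi_{[1,+\infty)}(x)$, with the perturbation $p(x,z)=x^2$ for $x\ge 1-z$ and $+\infty$ otherwise, so that $p(x,0)=\phi(x)$. For this instance the Lagrangian (1.396) and dual objective (1.399) can be written in closed form.

```python
# Toy instance of the perturbational scheme (1.386)-(1.402); all data chosen
# by us for illustration: X = Z = R, phi(x) = x^2 + indicator(x >= 1),
# p(x, z) = x^2 if x >= 1 - z, +infinity otherwise.
ALPHA = 1.0  # primal value: inf phi = 1, attained at x = 1

NEG_INF = float("-inf")

def lagrangian(x, psi):
    # L(x, psi) = inf_z {p(x, z) - psi*z}.  The feasible z form [1 - x, +inf),
    # so the inf is finite only for psi <= 0, attained at z = 1 - x:
    return x**2 - psi * (1 - x) if psi <= 0 else NEG_INF

def dual(psi):
    # lambda(psi) = inf_x L(x, psi); for psi <= 0 the quadratic
    # x^2 + psi*x - psi is minimized at x = -psi/2:
    return -psi**2 / 4 - psi if psi <= 0 else NEG_INF

psis = [-6 + k / 10000 for k in range(60001)]   # grid on (-6, 0]
beta = max(dual(psi) for psi in psis)           # sup_psi inf_x L(x, psi)
print(ALPHA, beta)
```

Here the grid maximum is reached at $\Psi=-2$ with $\lambda(-2)=1=\alpha$, so the duality inequality (1.402) holds with equality (and the sup in (1.400) is attained) for this instance.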
Actually, one is interested in obtaining conditions for "weak duality," i.e., the equality
$$\alpha=\inf_{x\in X}\sup_{\Psi\in Z^*}L(x,\Psi)=\sup_{\Psi\in Z^*}\inf_{x\in X}L(x,\Psi)=\beta,\tag{1.403}$$
or strong duality, i.e., (1.403) with the second sup of (1.403) attained for some $\Psi_0\in Z^*$; in general, it is convenient to use, to this end, some minimax theorems, such as Theorems 1.8, 1.9. If in addition $p_x(0)=p_x^{**}(0)$ ($x\in X$) (e.g., if for each $x\in X$ the partial function $p_x$ of (1.397) is proper, convex, and lower semicontinuous), then, similarly to (1.401), there follows
$$\phi(x)=p_x(0)=p_x^{**}(0)=\sup_{\Psi\in Z^*}\{\Psi(0)+L(x,\Psi)\}=\sup_{\Psi\in Z^*}L(x,\Psi)\qquad(x\in X),\tag{1.404}$$
and thus in this case,
$$\alpha=\inf_{x\in X}\sup_{\Psi\in Z^*}L(x,\Psi).\tag{1.405}$$
Remark 1.30. (a) If $p_x=p_x^{**}$ ($x\in X$) (i.e., if for each $x\in X$ the partial function $p_x$ of (1.397) is proper, convex, and lower semicontinuous), then for all $x\in X$ and $z\in Z$ we have
$$p(x,z)=p_x(z)=p_x^{**}(z)=\sup_{\Psi\in Z^*}\{\Psi(z)-p_x^*(\Psi)\}=\sup_{\Psi\in Z^*}\{\Psi(z)+L(x,\Psi)\},\tag{1.406}$$
which expresses $p$ with the aid of $L$.
(b) It is well known and easy to show (see, e.g., Ekeland and Temam [54, Ch. III, Lemma 2.1 and Remark 2.1]) that if $X$ and $Z$ are linear spaces and $p\colon X\times Z\to\overline R$ is convex, then so is the "(optimal) value function" (also called "marginal function") $v\colon Z\to\overline R$ (where $v$ stands for "value") defined by (1.389); also, by (1.387) and (1.389), we have $\alpha=v(0)$. There are many duality results involving the value function $v$. For example, note that by (1.392) and (1.389),
$$\lambda(\Psi)=\inf_{x\in X}\inf_{z\in Z}\{p(x,z)-\Psi(z)\}=\inf_{z\in Z}\Bigl\{\inf_{x\in X}p(x,z)-\Psi(z)\Bigr\}=\inf_{z\in Z}\{v(z)-\Psi(z)\}=-v^*(\Psi)\qquad(\Psi\in Z^*).\tag{1.407}$$
Also, by (1.394) and (1.407) we have
$$\beta=\sup_{\Psi\in Z^*}\lambda(\Psi)=\sup_{\Psi\in Z^*}\{\Psi(0)-v^*(\Psi)\}=v^{**}(0);\tag{1.408}$$
hence weak duality $\alpha=\beta$ holds if and only if $v(0)=v^{**}(0)$.
By (1.401) and (1.399), we have
$$\phi(x)\ge L(x,\Psi)\ge\lambda(\Psi)\qquad(x\in X,\ \Psi\in Z^*).\tag{1.409}$$
A pair $(x_0,\Psi_0)\in X\times Z^*$ is called a saddle point of $L$ if
$$L(x,\Psi_0)\ge L(x_0,\Psi_0)\ge L(x_0,\Psi)\qquad(x\in X,\ \Psi\in Z^*).\tag{1.410}$$
When $p_x(0)=p_x^{**}(0)$ ($x\in X$), by (1.404) and (1.399) condition (1.410) is equivalent to
$$\phi(x_0)=L(x_0,\Psi_0)=\lambda(\Psi_0).\tag{1.411}$$
Theorem 1.23. If (1.405) holds, then for a pair $(x_0,\Psi_0)\in X\times Z^*$ the following statements are equivalent:
1°. $x_0\in X$ is a solution of the primal problem (P) of (1.386), $\Psi_0\in Z^*$ is a solution of the dual problem (D) of (1.391), and we have
$$\min\phi(X)=\max\lambda(Z^*).\tag{1.412}$$
2°. $(x_0,\Psi_0)$ is a saddle point of the Lagrangian $L$.

Proof. See, e.g., [185, Theorem 2] or [54, Ch. III, Proposition 3.1]. □
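As a numeric illustration of the saddle-point condition (1.410)–(1.411) (a sketch on a toy Lagrangian of our own choosing: $L(x,\Psi)=x^2-\Psi(1-x)$ on $X=R$ with $\Psi\le0$, arising from the constraint $x\ge1$):

```python
# Check that (x0, psi0) = (1, -2) is a saddle point of the toy Lagrangian
# L(x, psi) = x^2 - psi*(1 - x) in the sense of (1.410):
#   L(x, psi0) >= L(x0, psi0) >= L(x0, psi)  for all x and all psi <= 0.
def L(x, psi):
    return x**2 - psi * (1 - x)

x0, psi0 = 1.0, -2.0
xs = [-5 + k / 1000 for k in range(10001)]
psis = [-10 + k / 1000 for k in range(10001)]   # grid on [-10, 0]

assert all(L(x, psi0) >= L(x0, psi0) - 1e-12 for x in xs)      # x0 minimizes L(., psi0)
assert all(L(x0, psi) <= L(x0, psi0) + 1e-12 for psi in psis)  # psi0 maximizes L(x0, .)
print(L(x0, psi0))  # common value, cf. (1.411)
```

The common value $L(x_0,\Psi_0)=1$ equals both the primal and the dual optimal value for this instance, as (1.411) predicts.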
One can show (see, e.g., [185]) that this duality theory is symmetric; i.e., one can embed the dual problem (D) of (1.391) into a family of perturbed problems that generates, as the dual problem to (D), the initial problem (P). The above scheme encompasses as particular cases many known unperturbational dual problems to convex infimization problems. For example, given a linear system $(X,Z,u)$ (see Definition 1.3(b)), let $f\colon X\to\overline R$ and $h\colon Z\to\overline R$ be two convex functions, and let us consider the primal infimization problem
$$(P)\qquad \alpha=\inf_{x\in X}\{f(x)+h(u(x))\}.\tag{1.413}$$
In what follows, for simplicity, when dealing with the composition of two functions, we shall omit the symbol of composition $\circ$ between them; thus, instead of $(h\circ u)(x)$ and $(\Psi\circ u)(x)$ we shall write $hu(x)$ and $\Psi u(x)$, respectively. For problem (P) of (1.413), let
$$\phi=f+hu\ (=f+(h\circ u))\tag{1.414}$$
and let us define the perturbation function $p\colon X\times Z\to\overline R$ by
$$p(x,z):=f(x)+h(u(x)-z)\qquad(x\in X,\ z\in Z),\tag{1.415}$$
which satisfies (1.387). Then, by (1.398), (1.397), (1.415), (1.86), and (1.91), we have
$$-L(x,\Psi)=p_x^*(\Psi)=\sup_{z\in Z}\{\Psi(z)-p(x,z)\}=\sup_{z\in Z}\{-\Psi(u(x)-z)+\Psi(u(x))-f(x)\dotplus-h(u(x)-z)\}$$
$$=-f(x)+\Psi(u(x))\dotplus\sup_{z\in Z}\{-\Psi(u(x)-z)\dotplus-h(u(x)-z)\}=-f(x)+\Psi(u(x))\dotplus h^*(-\Psi)\qquad(x\in X,\ \Psi\in Z^*),\tag{1.416}$$
and hence by (1.399) and (1.391), the dual objective function $\lambda$ and the dual problem (D) are
$$\lambda(\Psi)=\inf_{x\in X}L(x,\Psi)=\inf_{x\in X}\{f(x)-\Psi u(x)\dotplus-h^*(-\Psi)\}=-f^*(\Psi u)\dotplus-h^*(-\Psi)\qquad(\Psi\in Z^*),\tag{1.417}$$
$$(D)\qquad \beta=\sup_{\Psi\in Z^*}\{-f^*(\Psi u)\dotplus-h^*(-\Psi)\}.\tag{1.418}$$
Given a convex system $(X,Z,u)$ (see Definition 1.3(c)) and $f$, $h$, (P), $\phi$, and $p$ as above, the dual objective function $\lambda$ and the dual problem (D) are still (1.417) and (1.418), respectively, but with $\Psi u\in\overline R^X$ instead of $\Psi u\in X^*$ (because we apply the Fenchel–Moreau conjugation (1.206) with $W=\overline R^X$ instead of $W=X^*$).

Remark 1.31. For any (not necessarily linear) mapping $u\colon X\to Z$ and any $\Psi\in Z^*$, $\Psi\circ u=\Psi u$ is denoted by $u^*(\Psi)$, where $u^*\colon Z^*\to X^*$ is the "adjoint" of $u$; so $u^*$ is defined by
$$u^*(\Psi)(x):=\Psi(u(x))\qquad(x\in X,\ \Psi\in Z^*);\tag{1.419}$$
however, in the present chapter we shall not use this notation, in order to avoid confusion with the Fenchel and Fenchel–Moreau conjugate functions.

From the above we obtain the following result, whose part (a) is a classical theorem of Fenchel–Rockafellar:
(1.420)
vi/ez*
(b) Let iX, Z,u) be a convex system, where Z = iZ, <) is a partially ordered locally convex space, and let f: X -^ R, h: Z ^y R be two convex functions for which there exists an element XQ G dom / such that h is finite and continuous at w(xo) and h is increasing ii.e., Z\, Zi G Z, Z\ < Z2 ^ hiz\) < hizi))- Then we x X have (1.420), with ^u e R and Fenchel-Moreau conjugation f*:R -> R. Proof (a) Let (X, Z, w) be a linear system. By /(JCQ) < +00, /z(w(xo)) < +00, we have inf;cGX {fix) 4- hiuix))} < +00. Furthermore, since u is linear and / , h are
1.4 Duality for convex and quasi-convex infimization
77
convex, 0 of (1.414) and p of (1.415) (which satisfies (1.387)) are convex as well. Finally, since h is continuous at w(xo), the function Pxo' z^
p(xo, z) = f(xo) 4- hiu(xo) - z)
(1.421)
is finite and continuous at z = 0. Hence, by Proposition 1.6 and (1.417), we obtain (1.420). (b) Let (X, Z, u) be a convex system. Then infj^ex {fM + h(u(x))} < -\~oo (as in part (a)). Let us observe now that for any increasing convex function h: X ^^ R, the function hu is convex; indeed, since u is convex and h is increasing, for any x\,X2 ^ X and 0 < c < 1 we have h(u(cxi + (1 — c)x2)) < h(cu(xi) -f- (1 — c)u(x2)), which, since h is convex, is < ch(u(x\)) + (1 - c)h(u(x2)). Also, 0 of (1.414) is convex (since so are its summands), whence so is p of (1.415), and as in part (a), the function (1.421) is finite and continuous at z = 0. Hence, by Proposition 1.6 and (1.417), we obtain (1.420). D Remark 1.32. In the particular case that Z = X is a locally convex space and u = Ix, the identity operator in X (i.e., u(x) — x for all x e X), Theorem 1.24 (a) yields the following classical result (see, e.g., [183, 185]) on the problem of the infimization of the (upper) sum / -j- /z of two convex functions, that is, the problem (P)
a= inf {fix)+
hix)}:
(1.422)
xeX
If X is a locally convex space and f,h\ X -> R are two convex functions for which there exists an element XQ e dom / such that h isfiniteand continuous at XQ, then inf {fix) + hix)} = max {-/*(^) + -h\-^)}. jceX
(1.423)
^eX*
However, this result does not imply directly Theorem 1.24 (a) above when applied to the convex functions f: X -^ R and hu\ X ^> R, since its assumption is that hu is continuous at XQ, while in Theorem 1.24 (a) it is assumed only that h is continuous at w(xo). Note that formula (1.423) is symmetric in / and /z, since max {-/z*(vl/) + -/*(-vI/)} = max {-/*(^) -h
-h\-^)}.
Hence instead of assuming in the above result that h isfiniteand continuous at some xo € dom / , we may assume that f is finite and continuous at some XQ G dom/i. Let us give now an application to the primal "programming" problem (1.298), in the particular case that Z = (Z, <) is a partially ordered locally convex space and r c Z is the negative cone
78
1. Preliminaries T:={zeZ\z<0}.
(1.424)
i.e., to the problem (P)
a=
inf fix).
xeX u(x)<0
(1.425)
Corollary 1.12. (a) Let (X, Z, u) be a linear system, where Z — (Z, <) is a partially ordered locally convex space, and let f: X -> R be a convex function for which there exists an element XQ G dom / satisfying M(JCO) G i n t r ,
(1.426)
with T of (1.424) (this is called the ''Slater condition'' or the ''Slater constraint qualification "). Then we have inf f{x) = max inf {/(jc) + ^(w(jc))}, xeX u(x)<0
(1.427)
^eZlxeX ^
where Z ; :={vl/ G Z*|vl/ > 0 } ,
(1.428)
with ^ >0 meaning that ^(z) > Ofor all z > 0. (b) Let (X, Z,u) be a convex system, where Z = (Z, <) is a partially ordered locally convex space. Then for f and T as in (a), we have (1.427), with Z^ of (1.428). Proof (a) Since u(xo) e intT, we have u(X) n T 7^ 0 andxr(«(-^o)) < +C)0, and there exists an open neighborhood V of w(jco) such that u(V) C T, whence XT is continuous at u(xo). Hence, by Theorem 1.24 (a) above, applied to the convex functions / and h:=XT,
(1.429)
we have inf [fix) + XTiuix))} = max [-f*i^u) xeX
+ -xH-"^)}-
(1-430)
VJ/GZ*
But for any ^ G Z* we have -f*i^u)
= -sup {i^u)ix)
- fix)} = inf [fix) - ^iuix))},
xeX
(1.431)
^e^
- X ? ( - ^ ) = - sup {-vl/(z) - XTiz)} = inf vl/(r),
(1.432)
zeZ
which, together with (1.430), yield inf fix) = max inf {/(JC) - ^iuix))
xeX u(x)eT
^eZ*xeX
= max inf ^feZ*xeX
{/(JC)
+ ^iuix))
+ inf vl/(r)} + - sup ^ ( 7 ) } .
(1.433)
1.4 Duality for convex and quasi-convex infimization
79
(the last equality of (1.433) follows from the first one, replacing ^ by — ^ and using thatinf(-vl/(r)) = - s u p ^ ( r ) ) . Now^, by (1.424), we have
^(T)
=
(-00,0] [0,+00) 0 R
ifO
(1.434)
whence
{
0
if vi/ F 7*
which, together with (1.424) and (1.433), yields (1.427). (b) We can write formula (1.298) in the form a =
inf
xeX u{x)eT
f(x)
= inf {f(x)
xeX
+ XT(U(X))},
(1.436)
where XT denotes the indicator function of the set T. As in the proof of part (a), XT is finite and continuous at u(xo). Also, since T is convex, so is its indicator function XT- Furthermore, by (1.424), XT is increasing; indeed, if z\ G T, then XT(ZI) =0 < xr(^2)forallz2 ^ Z, while if zi ^ T a n d z i < Z2,thenz2 ^ T (since otherwise Z\ < Z2 < 0, contradicting z\ ^ T), whence XT(Z\) = Xrizi) = + o o . Consequently, by Theorem 1.24 (b) above, (1.420) holds with h : = x r . whence by (1.431), (1.432), and (1.435), we obtain (1.427). D Remark 1.33. (a) As shown by the above proof, we have the following extension of Corollary 1.12 (a): Let (X, Z, u) be a linear system, T a convex subset ofZ such that u(X) C\T ^ &, and f: X -^ R a convex function, for which there exists an element XQ E dom / satisfying the Slater condition (1.426). Then we have (1.433). In the particular case that Z = X is a locally convex space and u = Ix, the identity operator in X, this yields the following result: if X is a locally convex space, G a convex subset of X, and f:X -^ R a convex function for which there exists an element XQ e dom / such that jcoGintG,
(1.437)
then i n f / ( G ) = max inf {/(jc) - ^(x) + inf ^ ( G ) } ^eX*xeX = max inf {/(x) + vI/(jc) + - s u p ^ ( G ) } . ^eX*xeX
(1.438)
Note that (1.438) is nothing other than formula (1.268), which has been stated in Theorem 1.13 under a different assumption, namely, that f: X -^ R is SL proper convex function that is continuous at some point XQ e G H dom / . However, this
80
1. Preliminaries
fact is also a consequence of Remark 1.32 above and formulas (1.431), (1.432) for u = Ix, since for h = XG "^^ have domh — G. (b) Let us observe that for any convex subset T of Z, [+00
ifu{x)-z^T
= X.-'(r+z)W
(X6Z,ZGZ).
(1.439)
Hence, the perturbation function (1.415) is now p{x, z) := f{x)
+ XT{U{X) -
/•/ \ I
z)
r \
\ fi^)
i f w ( ^ ) e r + Z,
r^ AAC\\
(c) When (X, Z, u) is a convex system, where Z = (Z, <) is a partially ordered locally convex space, by the proof of Theorem 1.24 (b) the function ^u\ X -^ R occurring in (1.427) is convex for each vj/ G Z;j.; that is, {^u){Zl)
c Conv(X),
(1.441)
where Conv(X) denotes the set of all convex functions w. X ^^ R. Also, for T of (1.424), formula (1.440) becomes
pix,z)=U_}'^
^!";"!^^'
(1.442)
and thus the family ( P J of infimization problems of (1.389) becomes (P,)
v{z) = inf fix) xeX
iz e Z).
(1.443)
U{X)
The perturbations/?: XxZ -^ ^ o f problem (1.425), given by (1.387), (1.389), where the objective function is perturbed, are called horizontal perturbations, while the perturbation p defined by (1.442), (1.443), in which only the constraint set is perturbed, but the objective function is left unchanged, is called a vertical perturbation (see, e.g., Laurent [129]). Let us pass now to (perturbational) surrogate duality. We shall consider again the primal infimization problem (P) of (1.386), embedded into a family of perturbed optimization problems (1.389), with the aid of a perturbation p: XxZ ^^ R satisfying (1.387). Following Crouzeix [34], one defines the quasi-convex dual problem associated with the perturbation function p (or relative to the parameterization (Z, p)) as the unconstrained supremization problem (Aurr)
AuiT '= SUpAsurr(Z*),
where Asurr: Z* -> /? is the dual objective function defined by
(1.444)
1.4 Duality for convex and quasi-convex infimization Asurr(^) :=
inf
p U , z)
(^ € Z*).
81
(1.445)
{x,z)eX^Z vl/(z)>0
The main difference between the functions X of (1.392) and Xsurr of (1.445) is that in (1.392) the "penalty term" —^(z) is added to the objective function p(., z) of {Pz) of (1.389), for each z e Z, while in (1.445), ^(z) is used to form the new surrogate constraint sets {(;c, z) G X X Z| ^(z) > 0}
(^ G Z*);
(1.446)
thus, the quasi-convex dual problem is a surrogate dual problem. There exists a theory of quasi-convex dual problems analogous to the theory of Lagrangian dual problems. The role of Fenchel conjugates /*, /** for Lagrangian duality is played by the Greenberg-Pierskalla quasi-conjugates / J , /^^ (defined by (1.213), (1.217)) for the above surrogate dual objective function: Corresponding to (1.393), (1.407), and (1.408), we have now Xsurr(^) = -PI„0)(^, fturr =
^ ) = -v'o(^)
sup Asurr(^) -
sup
(^ iuf
G Z*),
(1-447)
i;(z) = SUp ( - ^ ^ ^ ( Z * ) = ^>^>^ (0),
(1.448)
where JCQ G X is arbitrary, and v is the (optimal) value function (1.389). Hence ^surr of (1.447) is always quasi-concave and weak* upper semicontinuous, and the inequalities (1.369) remain valid in this general case (by (1.218)). Furthermore, by (1.390) and (1.448), weak surrogate duality a = ^surr holds if and only if v(0) = vyy{0), or equivalently (by (1.219)), v{0) = i;eq(0). Corresponding to (1.396), the following is a useful tool for the study of the quasi-convex dual problem associated with p: the function Lsurr- X x Z* ^^ R defined by ^surr(^,^):=
inf
zeZ *(z)>0
p(jc,z)
(jc G X, vl/G Z*)
(1.449)
is called the quasi-convex Lagrangian associated with p. By (1.445) and (1.449) we have Asurr(^) = inf Lsurr(-^, ^ ) xeX
( ^ ^ Z*),
(1.450)
and hence by (1.444), the surrogate dual value is Psun = sup inf Lsurr(-^, ^ ) .
(1-451)
One can compute [240] that for the structured primal infimization problem (P) of (1.436), with any system (X, Z, u) and any target set 7, we have T
/ ^ vl/>| -
""^ ' ^
1 -f^^^ + X{x'eX\^(u{x'))>mf^{T)}(x)
if mf^(T)
G
^{T),
1 fix) + X{x'exinu(x'))>infnT)}(x) if inf^(r) ^ vi,(r), ^''^^^^
82
1. Preliminaries
and hence the surrogate dual value is Aurr = sup mini vT/c7*
I
inf
f{x),
inf
xeX ^(u(x))e^iT)
f(x)\.
xeX ^(u(x))>\nf^{T)
(1.453)
J
As an application to a particular case, let (X, Z, u) be a convex system, where Z = ( Z , < ) is a partially ordered locally convex space, and let us consider the structured primal programming problem (1.298), where T is the negative cone (1.424) in Z, i.e., the problem (1.425). We shall show now that in this case, for the perturbation (1.440), that is, for p(x, z) = fix) + Xu-HT+z)M = fM4-X{x'ex\uix')
(xeX^zeZ),
(1.454)
the surrogate Lagrangian (1.449), that is, ^surr(-^, - ^ ) =
inf {fix) zeZ ^(z)<0
= fix)+
+
X{x'eX\u(x')
inf X{x'ex\u(x')
(1.455)
^\z)<0
becomes /
(r
yjj^- lf(^^-^X{x'ex\^(u(x'))
ifx € X , 0 <
VI/GZ*,
and hence the surrogate dual value is Aurr-
sup
inf
fix).
(1.457)
One can deduce this from (1.452), but here is a direct proof: If ^ > 0, then ^zez,nz)
< 0}.
(1.458)
Indeed, the inclusion c is obvious; conversely, if x^ e X, ^^iuix')) < 0, so there exists z 6 Z, z > 0, such that VI/(M(JC')) = ^ ( - z ) , then for z' = w(xO + z we have w(jcO < z\ ^(zO = 0. Hence by (1.458), i^C X{x'eX\u(x')
= X{x'eX\^(u(x'))<0}ix)
ix G X),
which proves the part 0 < ^ G Z* of (1.456). On the other hand, if vl/ 2^ 0, then there exists z! ^ Z with z! > 0, vl/(z0 < 0. Then for any jc G X and for /JL > 0 sufficiently large, the element z = M(X) + /xz^ satisfies uix) < z and ^(z) = ^iuix)) + /x^(zO < 0, whence the second inf in (1.455) is 0, which proves the part 0 ^ vi/ G Z* of (1.456). Finally, by (1.450) and (1.456), we obtain (1.457). By (1.457) we have now, similarly to (1.369),
1.4 Duality for convex and quasi-convex infimization
83
a = inf fix) > fi',,„ xeX u(x)<0
:= max
inf
f{x) > max inf {/(jc) + ^(u(x))}
VI/(M(JC))<0
:=
fii^^,.
(1.459)
^
Hence if there holds strong Lagrangian duality (1.427), i.e., a = Pl^^^, then we also have strong surrogate duality a = yS^^j^; clearly, a similar statement holds also for the corresponding weak Lagrangian and weak surrogate dualities. Consequently, from Corollary 1.12(b) we obtain the following: Corollary 1.13. Let {X, Z, u) be a convex system, where Z = (Z, <) is a partially ordered locally convex space, and let f \ X -^ R be a convex function for which there exists an element XQ G dom / satisfying the ''Slater condition'' (1.426), with T of {\.424). Then we have inf /(jc) = max
inf
xeX
JceX
vi/ezi
M(JC)<0
^
f{x).
(1.460)
^{u{x))
Corollary 1.13 can be sharpened, as shown by the following classical theorem of surrogate duality for a quasi-convex programming problem (1.425), due to Luenberger ([136, Theorem 3]): Theorem 1.25. Let u: R^ ^ R^ be a convex mapping such that there exists XQ e R^ satisfying u{xo) <^ 0 {i.e., with all components of u(xo) being < 0) and let f: R" -> R be a quasi-convex function that is upper semicontinuous along lines {i.e., for every X\,X2 G /?", ^/{r]) := r]X\ + (1 — r])x2 is an upper semicontinuous function of r] for rj e [0, 1]). If inf xeR" f{x) is finite, then we have (1.460) with u{x)<0
X = R^ Z = R"^. Besides the above quasi-convex dual (1.444), (1.445) to the primal infimization problem (1.386), other surrogate dual problems are also used. For example, one defines [234] the 0-dual problem associated with the perturbation function p as the unconstrained supremization problem {Do)
^, :=supA,(Z*),
(1.461)
where XQ: Z* ^^ R is the dual objective function defined by Xe{^):=
inf
p{x,z)
(^ G Z*);
(1.462)
(x,z)eXxZ vl>(z)>-l
the only difference between the functions Asurr of (1.445) and XQ of (1.462) is that ^(z) > 0 is now replaced by ^(z) > —1. In the corresponding duality theory, the role of / * and / J is played by the semiconjugate / ^ defined by (1.221). It turns out that weak surrogate duality a = Po holds if and only if v{0) = v^^{0), or equivalently (by (1.222)), i;(0) = Uq(0).
84
1. Preliminaries
As another example, let us mention that one defines [234] the n-dual problem associated with the perturbation function p as the unconstrained supremization problem (D,)
yS, :=supX,(Z*),
(1.463)
where Xj^: Z"" ^^ R'l^ the dual objective function defined by A^(vl/) :=
inf
/7(x, z)
(^ e Z*);
(1.464)
{x,z)eXxZ vy(z)=0
the "surrogate Lagrangian" associated with
[{P),{DT^)}IS
L^{x, ^) := inf p(jc, z)
defined by
(x e X, ^ e Z*).
(1.465)
zeZ
The only difference between the functions Asurr of (1.445) and A,;,- of (1 -464), respectively the functions Lsurr and L^^, is that ^(z) > 0 is now replaced by ^(z) = 0. The role of / * and fj is played now by / J of (1.220). One can compute [240] that for the structured primal infimization problem (P) of (1.436), with any system (Z, Z, u) and any target set 7, we have Lj,(x,
^ ) = fix)
+ X{x'eX\^(uix'))e^iT)}(x),
(1.466)
and hence the corresponding surrogate dual value is Pn = sup
inf
vi/(=7*
^^^
fix).
(1.467)
xeX
^(uix))e^(T)
In the particular case that Z — X and u = Ix, formula (1.467) (with T = G c X) reduces to the right-hand side of (1.370) (where, clearly, sup(j>g;^*\{0} = sup4>ex*)One can deduce from (1.466) (see [240, p. 60]) that for the particular case of problem (1.425) and the perturbation (1.454) we have
LAx,-^)
=
fM + X{x'ex\^iuix'))
(1.468)
whence Pn=fisurr=
SUp 0<^eZ*
iuf xeX ^(w(;c))<0
/(x).
(1.469)
2 Worst Approximation
We recall that the deviation (or excess) of a set G (assumed nonempty in this chapter, without any special mention) from an element XQ in a normed linear space X is the number 8(G, XQ) > 0 defined by 5(G,xo):=sup||g-xo||,
(2.1)
geG
and any go e G for which this sup is attained, i.e., such that llgo--^oll = s u p llg-xoll,
(2.2)
geG
or equivalently, such that llgo-^oll> \\g-xo\\
(geG),
(2.3)
is called an element of worst approximation of (or SL farthest point to) XQ in G (see Figure 2.1). For any (nonempty) subset G of a normed linear space X, and XQ e X, v/e shall denote by ^G(-^O) the set all farthest points to JCQ in G, that is, ^G(XO) := {go e G\ \\go - xo\\ = sup \\g - xo\\}.
(2.4)
geC
Similarly to the case of best approximation, we may have JG(-^O) = 0, even for bounded subsets G of Z with "very good" geometric properties. Note that we could assume that G is closed and convex, since it is well known and easy to see that for any XQ e X and any set C c X we have
86
2. Worst Approximation coG
-^ Xo
Figure 2.1. (2.5)
sup ||c-xol| = sup \y - Joll ceC
JGCOC
We shall be concerned with the following tv^o main problems', (1) Find convenient formulas for 5(G, XQ). (2) Give characterizations of elements of worst approximation (i.e., necessary and sufficient conditions in order that an element go ^ G satisfy (2.2), that is, in order that ^0 ^ ^G(-^O)). We shall obtain duality results, using the elements O of the conjugate space X*.
2.1 The deviation of a set from an element We shall need the following lemma: Lemma 2.1. Let X be a normed linear space, G a subset ofX, and XQ e X. Then sup suplOC^-xo)| = sup | s u p O ( g - x o ) | . OGX*
11011 = 1
(2.6)
geG
l|0||=l
Proof. Let us first prove the lemma for JCQ = 0, i.e., that we have (2.7)
sup sup|cD(g)|= sup |supcl)(^)|. OeX* geC
OGX*
\\n=\
iioii=i
g^G
To this end, it will be sufficient to show that for any O e Z*\{0}, sup|cD(g)|=max{|supcD(g)|,|sup(-cD)(g)|}. geG
geG
(2.8)
geG
Using that \a\ = max {a, —a), the left-hand side of (2.8) is sup 10(g)I = supmax{0(g),-)(g)}. geG
geG
On the other hand.
geG
geG
(2.9)
2.1 The deviation of a set from an element
87
I supO(g)| = max{supO(g), — supcl)(g)} = max{supO(g), inf(—0)(g)}, geG
geG
geG
geC
8^G
|sup(-cD)(g)| = max{sup(-cD)(g), inf
geG
S^G
SO the right-hand side of (2.8) is max{|supO(g)|, | sup(-0)(g)|} geG
geG
= max{sup(D(g), mf(-^)(g), geG
sup(-0)(g), inf <^(g)]
S^G
g^Q
geG
= max {sup ^(g), sup(-cD)(g)}, geG
geG
which together with (2.9) proves (2.8), and hence (2.7). Now let jco G X be arbitrary. Then, by (2.7) applied to the set G - xo, we obtain (2.6). D Theorem 2.1. Let X be a normed linear space, G a subset ofX, and XQ G X. Then s u p | | g - x o i l = sup |supO(G)-cD(xo)|. geG
(2.10)
\\n=\ Proof. By Lemma 2.1 we have supllg-xoll = sup sup \(^(g-xo)\= geG
geG
sup s u p | 0 ( g - x o ) |
^eX*
OGX*
\\n=\
geG
\\n=\
= sup | s u p O ( g - x o ) | ,
geG
i.e., (2.10).
D
Remark 2.1. (a) When $G$ is unbounded and $x_0\in X$, formula (2.10) reduces to $+\infty = +\infty$. Indeed, if the right-hand side of (2.10) is $<+\infty$, then by the uniform boundedness principle, $G-x_0$ is bounded, and hence so is $G$, which proves our assertion. Thus, only the case of $G$ bounded is of interest.

(b) By Lemma 1.5, the main (i.e., the bounded) case of Theorem 2.1 admits the following geometric interpretation: If $G$ is bounded and $x_0\in X$, then
$$\sup_{g\in G}\|g-x_0\| = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \operatorname{dist}\bigl(x_0, H_{\Phi,\sup\Phi(G)}\bigr) = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}}\ \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} \|y-x_0\|, \qquad (2.11)$$
where $H_{\Phi,\sup\Phi(G)}$ is the hyperplane
$$H_{\Phi,\sup\Phi(G)} = \{y\in X\mid \Phi(y)=\sup\Phi(G)\}. \qquad (2.12)$$
Equivalently, by (1.49),
$$\sup_{g\in G}\|g-x_0\| = \sup_{H\in\mathcal{H}_G} \operatorname{dist}(x_0, H), \qquad (2.13)$$
where $\mathcal{H}_G$ denotes the collection of all hyperplanes in $X$ that quasi-support the set $G$ (see Figure 2.2). Thus, the reduction principle of Remark 1.16(b) now takes the following form: formula (2.13) reduces the computation of the deviation of a bounded set $G$ from $x_0$ to the computation of the distances to the hyperplanes $H\in\mathcal{H}_G$.

(c) For unbounded sets, formula (2.11) still holds (it reduces to $+\infty=+\infty$, as noted in (a) above), but (2.13) need not hold, as shown, e.g., by any closed affine subset $G$ of $X$; indeed, for such a set $G$, the left-hand side of (2.13) is $+\infty$, but since a hyperplane $H$ quasi-supports a closed affine set $G$ if and only if $H\supseteq G$ (and hence then $H$ supports $G$), the last term of (2.13) is $\le \operatorname{dist}(x_0,G) < +\infty$. The reason for this discrepancy is that for unbounded $G$ there exists $\Phi\in X^*$ such that $\sup\Phi(G)=+\infty$, whence the set $H_{\Phi,\sup\Phi(G)}$ of (2.12) is empty, and so is not a hyperplane.
Figure 2.2.
Corollary 2.1. Let $X$ be a normed linear space, $G$ a subset of $X$, and $x_0\in X$. Then
$$\sup_{g\in G}\|g-x_0\| = \sup_{\Phi\in X^*\setminus\{0\}} \frac{\sup\Phi(G)-\Phi(x_0)}{\|\Phi\|}. \qquad (2.14)$$
Proof. By Theorem 2.1 and its proof, it is enough to show that
$$\sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \bigl|\sup\Phi(G)\bigr| = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \sup\Phi(G). \qquad (2.15)$$
The inequality $\ge$ in (2.15) is obvious. In order to prove the opposite inequality, let $\|\Phi\|=1$. If $\sup\Phi(G)\ge 0$, then $|\sup\Phi(G)| = \sup\Phi(G)$. On the other hand, if $\sup\Phi(G)<0$, then for $\Phi_0 := -\Phi$ we have $\|\Phi_0\|=1$ and $|\sup\Phi(G)| = -\sup\Phi(G) = \inf\Phi_0(G) \le \sup\Phi_0(G)$, which proves the inequality $\le$ in (2.15). $\square$
Remark 2.2. Conversely, Corollary 2.1 implies Theorem 2.1. Indeed, by (2.14) we have
$$\sup_{g\in G}\|g-x_0\| \le \sup_{\Phi\in X^*\setminus\{0\}} \frac{\bigl|\sup\Phi(G)-\Phi(x_0)\bigr|}{\|\Phi\|}. \qquad (2.16)$$
On the other hand, since
$$\sup_{g\in G}\Phi(g-x_0) \le \sup_{g\in G}|\Phi(g-x_0)| \le \|\Phi\| \sup_{g\in G}\|g-x_0\| \qquad (\Phi\in X^*),$$
we obtain (applying this to both $\Phi$ and $-\Phi$)
$$\sup_{\Phi\in X^*\setminus\{0\}} \frac{\bigl|\sup\Phi(G)-\Phi(x_0)\bigr|}{\|\Phi\|} \le \sup_{g\in G}\|g-x_0\|,$$
which, together with (2.16), yields (2.10).

Corollary 2.2. Let $X$ be a normed linear space, $G$ a subset of $X$, and $x_0\in X$. Then
$$\sup_{g\in G}\|g-x_0\| = \sup_{\substack{(\Phi,d)\in(X^*\setminus\{0\})\times R\\ \sup\Phi(G)>d}} \frac{d-\Phi(x_0)}{\|\Phi\|} = \sup_{\substack{(\Phi,d)\in(X^*\setminus\{0\})\times R\\ \exists g\in G,\ \Phi(g)>d}} \frac{d-\Phi(x_0)}{\|\Phi\|}. \qquad (2.17)$$
Proof. Clearly,
$$\sup\Phi(G) = \sup\{d\in R\mid \sup\Phi(G)>d\} = \sup\{d\in R\mid \exists g\in G,\ \Phi(g)>d\} \qquad (\Phi\in X^*). \qquad (2.18)$$
Hence, by Corollary 2.1 and (2.18), we obtain
$$\sup_{g\in G}\|g-x_0\| = \sup_{\Phi\in X^*\setminus\{0\}} \frac{\sup\Phi(G)-\Phi(x_0)}{\|\Phi\|} = \sup_{\Phi\in X^*\setminus\{0\}}\ \sup_{\substack{d\in R\\ \sup\Phi(G)>d}} \frac{d-\Phi(x_0)}{\|\Phi\|}, \qquad (2.19)$$
which proves the first equality in (2.17). Finally, the proof of the second equality in (2.17) is similar, since $\sup\Phi(G)>d$ if and only if there exists $g\in G$ such that $\Phi(g)>d$. $\square$

Remark 2.3. (a) In the converse direction, Corollary 2.2 implies Corollary 2.1, which in turn implies Theorem 2.1. Indeed, this follows by starting with the first equality of (2.17) and writing formula (2.19) in the reverse order.

(b) Corollary 2.2 admits the following geometric interpretation: We have
$$\sup_{g\in G}\|g-x_0\| = \sup_{\substack{(\Phi,d)\in(X^*\setminus\{0\})\times R\\ \sup\Phi(G)>d}} \operatorname{dist}(x_0, U_{\Phi,d}) = \sup_{\substack{(\Phi,d)\in(X^*\setminus\{0\})\times R\\ \sup\Phi(G)>d}}\ \inf_{\substack{y\in X\\ \Phi(y)>d}} \|y-x_0\|$$
$$= \sup_{\substack{(\Phi,d)\in(X^*\setminus\{0\})\times R\\ \exists g\in G,\ \Phi(g)\ge d}} \operatorname{dist}(x_0, V_{\Phi,d}) = \sup_{\substack{(\Phi,d)\in(X^*\setminus\{0\})\times R\\ \exists g\in G,\ \Phi(g)\ge d}}\ \inf_{\substack{y\in X\\ \Phi(y)\ge d}} \|y-x_0\|, \qquad (2.20)$$
or equivalently,
$$\sup_{g\in G}\|g-x_0\| = \sup_{\substack{U\in\mathcal{U}\\ U\cap G\ne\emptyset}} \operatorname{dist}(x_0, U) = \sup_{\substack{V\in\mathcal{V}\\ V\cap G\ne\emptyset}} \operatorname{dist}(x_0, V), \qquad (2.21)$$
where $\mathcal{U}$ and $\mathcal{V}$ denote, respectively, the collection of all open half-spaces in $X$ and the collection of all closed half-spaces in $X$. Indeed, for the open half-space $U_{\Phi,d}$ and the closed half-space $V_{\Phi,d}$ of (1.67), we have
$$U_{\Phi,d}\cap G\ne\emptyset \iff \sup\Phi(G)>d, \qquad (2.22)$$
and, respectively,
$$V_{\Phi,d}\cap G\ne\emptyset \iff \exists g\in G,\ \Phi(g)\ge d. \qquad (2.23)$$
Hence, by (2.17) and Corollary 1.4, we obtain (2.21) (see Figures 2.3 (a) and (b)).

Figure 2.3.
Remark 2.4. Note also that for the hyperplane $H_{\Phi,d} = \{y\in X\mid \Phi(y)=d\}$ we have
$$H_{\Phi,d}\cap G\ne\emptyset \iff \exists g\in G,\ \Phi(g)=d, \qquad (2.24)$$
and
$$\sup_{g\in G}\|g-x_0\| = \sup_{\substack{(\Phi,d)\in(X^*\setminus\{0\})\times R\\ \exists g\in G,\ \Phi(g)=d}} \operatorname{dist}(x_0, H_{\Phi,d}) = \sup_{\substack{(\Phi,d)\in(X^*\setminus\{0\})\times R\\ \exists g\in G,\ \Phi(g)=d}}\ \inf_{\substack{y\in X\\ \Phi(y)=d}} \|y-x_0\|, \qquad (2.25)$$
or equivalently,
$$\sup_{g\in G}\|g-x_0\| = \sup_{\substack{H\in\mathcal{H}\\ H\cap G\ne\emptyset}} \operatorname{dist}(x_0, H), \qquad (2.26)$$
where $\mathcal{H}$ denotes the collection of all hyperplanes in $X$ (see Figure 2.4).
Figure 2.4.

One can also express the right-hand side of (2.10) as follows:

Proposition 2.1. If $G$ is a subset of $X$ and $x_0\in X$, then
$$\sup_{\substack{\Phi\in X^*\\ \|\Phi\|\le 1}} \bigl|\sup\Phi(G)-\Phi(x_0)\bigr| = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \bigl|\sup\Phi(G)-\Phi(x_0)\bigr|. \qquad (2.27)$$
Proof. By the above proof of Theorem 2.1, it is enough to consider the case $x_0=0$, i.e., to show that
$$\sup_{\substack{\Phi\in X^*\\ \|\Phi\|\le 1}} \bigl|\sup\Phi(G)\bigr| = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \bigl|\sup\Phi(G)\bigr|. \qquad (2.28)$$
But, for any $\Phi\in X^*$ and $0\le a\le 1$ we have $|\sup(a\Phi)(G)| = a\,|\sup\Phi(G)| \le |\sup\Phi(G)|$, whence, since each $\Phi\in X^*$ with $\|\Phi\|\le 1$ can be written as $\Phi = a\Phi_0$, with $\|\Phi_0\|=1$ and $0\le a\le 1$, we obtain (2.28). $\square$

In (2.13) one can replace hyperplanes by other sets, such as quasi-supporting closed or open half-spaces (see Figures 2.5a and 2.5b). Indeed, we have the following theorem:

Theorem 2.2. Let $X$ be a normed linear space, $G$ a subset of $X$, and $x_0\in X$. Then
$$\sup_{g\in G}\|g-x_0\| = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}}\ \inf_{\substack{y\in X\\ \Phi(y)>\sup\Phi(G)}} \|y-x_0\| = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}}\ \inf_{\substack{y\in X\\ \Phi(y)\ge\sup\Phi(G)}} \|y-x_0\|, \qquad (2.29)$$
or equivalently,
$$\sup_{g\in G}\|g-x_0\| = \sup_{V\in\mathcal{V}_G} \operatorname{dist}(x_0, V) = \sup_{U\in\mathcal{U}_G} \operatorname{dist}(x_0, U), \qquad (2.30)$$
where $\mathcal{V}_G$ (respectively, $\mathcal{U}_G$) denotes the collection of all closed (respectively, open) half-spaces that quasi-support $G$ and do not contain $G$ (respectively, $\operatorname{int} G$).
Figure 2.5.

Proof. We claim that for the set
$$C := \overline{\operatorname{co}}(x_0, G), \qquad (2.31)$$
we have
$$\sup_{c\in C}\|x_0-c\| = \sup_{g\in G}\|x_0-g\|. \qquad (2.32)$$
Indeed, since $G\subseteq C$, we have the inequality $\ge$ in (2.32). On the other hand, for any $\eta_0 x_0 + \sum_{i=1}^m \eta_i g_i \in \operatorname{co}(x_0, G)$, where $\eta_0, \eta_1, \ldots, \eta_m \ge 0$ and $\eta_0 + \sum_{i=1}^m \eta_i = 1$,
$$\Bigl\|x_0 - \Bigl(\eta_0 x_0 + \sum_{i=1}^m \eta_i g_i\Bigr)\Bigr\| \le \sum_{i=1}^m \eta_i \|x_0-g_i\| \le \sup_{g\in G}\|x_0-g\|,$$
whence, passing to the closed convex hull, we obtain the inequality $\le$ in (2.32), which proves the claim (2.32). Consequently, we may assume that $x_0\in G$ in (2.29). Then $\Phi(x_0) \le \sup\Phi(G)$ for all $\Phi\in X^*$, and $x_0\notin U := \{y\in X\mid \Phi(y)>\sup\Phi(G)\}$. Hence by Theorem 2.1 and Corollary 1.4, we obtain
$$\sup_{g\in G}\|g-x_0\| = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \bigl|\sup\Phi(G)-\Phi(x_0)\bigr| = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \bigl\{\sup\Phi(G)-\Phi(x_0)\bigr\} = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \operatorname{dist}(x_0, U),$$
which, by passage to $\overline{U} = \{y\in X\mid \Phi(y)\ge\sup\Phi(G)\}$, yields also (2.29). $\square$

Let us give now another formula for the deviation.
Theorem 2.3. Let $X$ be a normed linear space, $G$ a subset of $X$, and $x_0\in X$. Then
$$\sup_{g\in G}\|g-x_0\| = \begin{cases} \displaystyle\sup_{\substack{\Phi\in X^*\\ \exists g\in G,\ \Phi(g)>\Phi(x_0)+1}} \frac{1}{\|\Phi\|} & \text{if } G\ne\{x_0\}, \\[2ex] 0 & \text{if } G=\{x_0\}. \end{cases} \qquad (2.33)$$
Proof. We may assume that $x_0=0$ and $G\ne\{0\}$. By Corollary 2.2, it is enough to show that in this case,
$$\sup_{\substack{(\Phi,d)\in(X^*\setminus\{0\})\times R\\ \exists g\in G,\ \Phi(g)>d}} \frac{d}{\|\Phi\|} = \sup_{\substack{\Phi\in X^*\\ \exists g\in G,\ \Phi(g)>1}} \frac{1}{\|\Phi\|}. \qquad (2.34)$$
The inequality $\ge$ in (2.34) is obvious. Conversely, for any $(\Phi,d)\in(X^*\setminus\{0\})\times R$ with $d>0$ for which there exists $g\in G$ such that $\Phi(g)>d$, let
$$\Phi_0 := \tfrac{1}{d}\Phi. \qquad (2.35)$$
Then $\Phi_0(g) = \tfrac{1}{d}\Phi(g) > 1$ and $\tfrac{d}{\|\Phi\|} = \tfrac{1}{\|\Phi_0\|}$; since the pairs $(\Phi,d)$ with $d\le 0$ contribute no more than $0$ to the left-hand side of (2.34), while its right-hand side is $>0$ (because $G\ne\{0\}$), this proves the inequality $\le$ in (2.34), and hence the equality. $\square$

Remark 2.5. It is necessary to consider the two cases in (2.33) separately, since for $G=\{x_0\}$ the left-hand side of (2.33) is $0$ and the supremum on the right-hand side is $-\infty$. Of course, only the case $G\ne\{x_0\}$ is of interest.
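Formula (2.33) can also be illustrated numerically. The following sketch is not from the text; the finite set $G$ and the sampling grid are our own choices. For $\Phi = r\,(\cos t, \sin t)$ in the Euclidean plane with $x_0 = 0$, the constraint $\exists g\in G$, $\Phi(g)>1$ forces $r > 1/\max_g \langle u, g\rangle$ whenever this maximum is positive, so the achievable values of $1/\|\Phi\|$ in direction $u$ fill the interval $(0, \max_g\langle u,g\rangle)$; the supremum over directions then recovers $\sup_g\|g\|$.

```python
import math

# Illustrative check of Theorem 2.3 (formula (2.33)) for x0 = 0 in the
# Euclidean plane.  G is an arbitrary finite set of our own choosing.

G = [(3.0, 1.0), (-1.0, 2.0), (0.5, -2.5)]

deviation = max(math.hypot(*g) for g in G)   # left-hand side of (2.33)

best = 0.0
n = 3600
for k in range(n):
    t = 2 * math.pi * k / n
    u = (math.cos(t), math.sin(t))
    h = max(u[0] * g[0] + u[1] * g[1] for g in G)  # support value of G at u
    if h > 0:
        best = max(best, h)   # sup of attainable values 1/||Phi|| at this u

print(round(deviation, 4), round(best, 4))
```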
2.2 Characterizations and existence of farthest points

We shall first give some characterizations of farthest points to $x_0$ in $G$, where $G$ is a subset of $X$ and $x_0\in X$, i.e., some necessary and sufficient conditions in order that $g_0\in\mathcal{F}_G(x_0)$ (that is, $\|g_0-x_0\| = \sup_{g\in G}\|g-x_0\|$).

Theorem 2.4. Let $X$ be a normed linear space, $x_0\in X$, and $G$ a subset of $X$. For an element $g_0\in G$, the following statements are equivalent:

1°. $g_0\in\mathcal{F}_G(x_0)$.

2°. There exists $\Phi_0\in X^*$ such that
$$\|\Phi_0\| = 1, \qquad (2.36)$$
$$\Phi_0(g_0-x_0) = \sup_{g\in G}\|g-x_0\|. \qquad (2.37)$$

3°. There exists $\Phi_0\in X^*$ satisfying (2.36) and
$$|\Phi_0(g_0-x_0)| = \sup_{g\in G}\|g-x_0\|. \qquad (2.38)$$
Proof. Assume 1°. By a corollary of the Hahn-Banach theorem, we can choose $\Phi_0\in X^*$ satisfying (2.36) and
$$\Phi_0(g_0-x_0) = \|g_0-x_0\|. \qquad (2.39)$$
Then, by (2.39) and 1°, we obtain
$$\Phi_0(g_0-x_0) = \|g_0-x_0\| = \sup_{g\in G}\|g-x_0\|,$$
i.e., (2.37). Thus, 1° $\Rightarrow$ 2°. The implication 2° $\Rightarrow$ 3° is obvious. Finally, assume 3°. Then
$$\|g_0-x_0\| \ge |\Phi_0(g_0-x_0)| = \sup_{g\in G}\|g-x_0\|,$$
whence, since $g_0\in G$, we obtain 1°. $\square$
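In the Euclidean plane the functional of Theorem 2.4 can be written down explicitly. The sketch below is illustrative (the set $G$ and point $x_0$ are our own choices): for a farthest point $g_0$, the Hahn-Banach functional of the proof can be taken as $\Phi_0 = (g_0-x_0)/\|g_0-x_0\|$, and then $\Phi_0(g_0-x_0)$ equals the deviation, as in (2.36)-(2.37).

```python
import math

# Illustration of Theorem 2.4 in the Euclidean plane.

G = [(2.0, 1.0), (-1.0, 3.0), (0.0, -2.0)]
x0 = (0.5, 0.0)

g0 = max(G, key=lambda g: math.hypot(g[0] - x0[0], g[1] - x0[1]))
dev = max(math.hypot(g[0] - x0[0], g[1] - x0[1]) for g in G)

norm_g0 = math.hypot(g0[0] - x0[0], g0[1] - x0[1])
phi0 = ((g0[0] - x0[0]) / norm_g0, (g0[1] - x0[1]) / norm_g0)  # ||Phi0|| = 1

value = phi0[0] * (g0[0] - x0[0]) + phi0[1] * (g0[1] - x0[1])  # Phi0(g0 - x0)
print(g0, round(value, 4), round(dev, 4))
```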
Remark 2.6. (a) The equivalence 1° $\Leftrightarrow$ 2° of Theorem 2.4 admits the following geometric interpretation: For an element $g_0\in G$ we have $g_0\in\mathcal{F}_G(x_0)$ if and only if there exists a hyperplane $H_0$ that supports the ball $B = B(x_0, \sup_{g\in G}\|g-x_0\|)$ at $g_0$.

Indeed, let $r := \sup_{g\in G}\|g-x_0\|$. If $g_0\in\mathcal{F}_G(x_0)$ and $\Phi_0\in X^*$ is as in 2° of Theorem 2.4, then by Lemma 1.6, the hyperplane
$$H_0 := \{y\in X\mid \Phi_0(y) = \Phi_0(x_0)+r\} \qquad (2.40)$$
quasi-supports the ball $B$; also, by (2.37) and $g_0\in\mathcal{F}_G(x_0)$, we have $\|g_0-x_0\| = r$ and $\Phi_0(g_0-x_0) = \sup_{g\in G}\|g-x_0\| = r$, so $g_0\in B\cap H_0$. Conversely, if there exists a hyperplane $H_0$ that supports the ball $B = B(x_0, \sup_{g\in G}\|g-x_0\|)$ at $g_0$, then by Lemma 1.6, there exists a (unique) function $\Phi_0\in X^*$ with $\|\Phi_0\|=1$ such that we have (2.40). Then, since $g_0\in H_0$, we obtain (2.37), and hence, by Theorem 2.4, $g_0\in\mathcal{F}_G(x_0)$.

(b) For any $g_0\in\mathcal{F}_G(x_0)$ and any $\Phi_0\in X^*$ as in 2° of Theorem 2.4, we have (2.39),
$$\Phi_0(g-x_0) \le \|g_0-x_0\| \qquad (g\in G), \qquad (2.41)$$
and
$$\Phi_0(g_0) = \sup\Phi_0(G) \qquad (2.42)$$
(i.e., $\Phi_0\in N(G;g_0)$).

Indeed, by the implication 2° $\Rightarrow$ 1° above and (2.37), we have
$$\|g_0-x_0\| = \sup_{g\in G}\|g-x_0\| = \Phi_0(g_0-x_0),$$
so (2.39) holds. Also, by (2.36) and 1°, we have
$$\Phi_0(g-x_0) \le \|g-x_0\| \le \|g_0-x_0\| \qquad (g\in G),$$
i.e., (2.41). Furthermore, by (2.39) and (2.41),
$$\Phi_0(g_0) = \Phi_0(x_0) + \|g_0-x_0\| \ge \Phi_0(g) \qquad (g\in G),$$
whence, since $g_0\in G$, we obtain (2.42).

(c) For any $g_0\in\mathcal{F}_G(x_0)$ and any hyperplane $H_0$ as in (a) above, the sets $G$ and $B = B(x_0, \sup_{g\in G}\|g-x_0\|)$ lie in the same half-space $D_0 := \{y\in X\mid \Phi_0(y) \le \Phi_0(x_0) + \sup_{g\in G}\|g-x_0\|\}$, and $H_0$ supports the set $G$ at $g_0$; also, we have
$$g_0 \in P_{H_0}(x_0), \qquad (2.43)$$
i.e., $g_0$ is a nearest point to $x_0$ in $H_0$ (see Figure 2.6). Indeed, by (2.41) and $g_0\in\mathcal{F}_G(x_0)$, we have $G\subseteq D_0$; also, if $\|y-x_0\| \le \sup_{g\in G}\|g-x_0\|$, then by $\|\Phi_0\|=1$, we have $\Phi_0(y-x_0) \le \|y-x_0\| \le \sup_{g\in G}\|g-x_0\|$, so $B\subseteq D_0$. Furthermore, by (2.37) and (2.42), we have $\Phi_0(x_0) + \sup_{g\in G}\|g-x_0\| = \Phi_0(g_0) = \sup\Phi_0(G)$, which proves the second statement. Finally, $g_0\in H_0$ and, by (2.40) and (2.36),
$$\|g_0-x_0\| = \sup_{g\in G}\|g-x_0\| = \Phi_0(y-x_0) \le \|y-x_0\| \qquad (y\in H_0).$$
Theorem 2.5. Let $G$ be a subset of $X$, and $x_0\in X$. For an element $g_0\in G$ the following statements are equivalent:

1°. $g_0\in\mathcal{F}_G(x_0)$.

2°. There exists $\Phi_0\in X^*$ satisfying (2.36), (2.42), and
$$\bigl|\sup\Phi_0(G)-\Phi_0(x_0)\bigr| = \max_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \bigl|\sup\Phi(G)-\Phi(x_0)\bigr|. \qquad (2.44)$$
Proof. If $g_0\in\mathcal{F}_G(x_0)$, then by Theorem 2.4 and Remark 2.6, there exists $\Phi_0\in X^*$ satisfying (2.36), (2.38), and (2.42). Also, by (2.42), (2.38), and Theorem 2.1, we have
$$\bigl|\sup\Phi_0(G)-\Phi_0(x_0)\bigr| = |\Phi_0(g_0)-\Phi_0(x_0)| = \sup_{g\in G}\|g-x_0\| = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \bigl|\sup\Phi(G)-\Phi(x_0)\bigr|,$$
whence (2.44).
Conversely, assume that there exists $\Phi_0\in X^*$ satisfying (2.36), (2.42), and (2.44). Then by (2.42), (2.44), and Theorem 2.1,
$$|\Phi_0(g_0)-\Phi_0(x_0)| = \bigl|\sup\Phi_0(G)-\Phi_0(x_0)\bigr| = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \bigl|\sup\Phi(G)-\Phi(x_0)\bigr| = \sup_{g\in G}\|g-x_0\|,$$
so $\Phi_0$ satisfies (2.36) and (2.38). Hence, by Theorem 2.4, $g_0\in\mathcal{F}_G(x_0)$. $\square$
Remark 2.7. By Corollary 1.1 and Lemma 1.5, Theorem 2.5 admits the following geometric interpretation: for an element $g_0\in G$ we have $g_0\in\mathcal{F}_G(x_0)$ if and only if there exists $\Phi_0\in X^*$ with $\|\Phi_0\|=1$, such that the hyperplane
$$\{y\in X\mid \Phi_0(y)=\sup\Phi_0(G)\} \qquad (2.45)$$
supports $G$ at $g_0$ and
$$\operatorname{dist}\bigl(x_0, H_{\Phi_0,\sup\Phi_0(G)}\bigr) = \max_{H\in\mathcal{H}_G} \operatorname{dist}(x_0, H), \qquad (2.46)$$
where $\mathcal{H}_G$ is the collection of hyperplanes defined in Remark 2.1(b); or equivalently, there exists a hyperplane $H_0$ that supports $G$ at $g_0$ and such that
$$\operatorname{dist}(x_0, H_0) = \max_{H\in\mathcal{H}_G} \operatorname{dist}(x_0, H) \qquad (2.47)$$
(see Figure 2.7).
Figure 2.7.
None of the conditions (2.42), (2.44) can be omitted in Theorem 2.5, as shown by the following examples:

Example 2.1. Let $X = l^1$, $G = \{\frac{n-1}{n}e_n\mid n=1,2,\ldots\}$, where $\{e_n\}$ denotes the sequence of unit vectors in $l^1$, and let $x_0=0$. Then the function $\Phi_0\in X^*$ defined by
$$\Phi_0(y) = \sum_{n=1}^{\infty}\eta_n \qquad (y=\{\eta_n\}\in l^1) \qquad (2.48)$$
satisfies (2.36). Furthermore,
$$\bigl|\sup\Phi_0(G)-\Phi_0(x_0)\bigr| = \sup_n \Phi_0\Bigl(\frac{n-1}{n}e_n\Bigr) = \sup_n \frac{n-1}{n} = 1,$$
and by Theorem 2.1,
$$\sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \bigl|\sup\Phi(G)-\Phi(x_0)\bigr| = \sup_{g\in G}\|g-x_0\| = \sup_n \frac{n-1}{n} = 1,$$
so $\Phi_0$ satisfies (2.44). However, writing $g_n = \frac{n-1}{n}e_n$ $(n=1,2,\ldots)$, we have $G = \{g_n\mid n=1,2,\ldots\}$ and
$$\|g_n-x_0\| = \Bigl\|\frac{n-1}{n}e_n\Bigr\| = \frac{n-1}{n} < 1 \qquad (n=1,2,\ldots),$$
whence $\mathcal{F}_G(x_0)=\emptyset$. Also, $\Phi_0(g_n) = \frac{n-1}{n} < 1 = \sup\Phi_0(G)$ $(n=1,2,\ldots)$, so (2.42) does not hold, for any $g_0\in G$.
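The non-attainment in Example 2.1 is easy to see numerically. The truncation level below is our own choice and stands in for the infinite sequence.

```python
# Finite-dimensional illustration of Example 2.1: in l^1, with
# g_n = ((n-1)/n) e_n, every norm ||g_n|| = (n-1)/n stays strictly below the
# deviation sup_n ||g_n|| = 1, so no farthest point from x0 = 0 exists, even
# though Phi0 (summing the coordinates) has norm one and |sup Phi0(G)| = 1.

N = 10_000
norms = [(n - 1) / n for n in range(1, N + 1)]   # ||g_n|| = Phi0(g_n)

supremum = 1.0                                    # sup_n (n-1)/n
print(max(norms) < supremum)                      # the sup is not attained
print(supremum - max(norms) < 1e-3)               # ... yet it is approached
```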
Example 2.2. Let $X = R^2$, with the Euclidean norm, $G = \{(1,0), (1,1)\}$, and $x_0=0$. Then the function $\Phi_0\in X^*$ defined by
$$\Phi_0(y) = y_1 \qquad (y=(y_1,y_2)\in X) \qquad (2.49)$$
satisfies (2.36). Furthermore, for $g_0=(1,0)\in G$ we have $\Phi_0(g_0) = 1 = \sup\Phi_0(G)$, so $g_0$ satisfies (2.42). However,
$$\sup_{g\in G}\|g-x_0\| = \max(1,\sqrt{2}) = \sqrt{2} > 1 = \|g_0-x_0\|,$$
so $g_0\notin\mathcal{F}_G(x_0)$. Also, for the function $\Phi_1\in X^*$ defined by
$$\Phi_1(y) = \frac{\sqrt{2}}{2}(y_1+y_2) \qquad (y=(y_1,y_2)\in X), \qquad (2.50)$$
we have $\|\Phi_1\|=1$ and $\bigl|\sup\Phi_1(G)-\Phi_1(x_0)\bigr| = \sqrt{2} > 1 = \bigl|\sup\Phi_0(G)-\Phi_0(x_0)\bigr|$, so (2.44) is not satisfied.
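The numbers in Example 2.2 can be verified directly:

```python
import math

# Verifying Example 2.2: G = {(1,0), (1,1)}, x0 = 0, Euclidean norm on R^2.
G = [(1.0, 0.0), (1.0, 1.0)]

dev = max(math.hypot(*g) for g in G)                 # sup ||g - 0|| = sqrt(2)

phi0 = lambda y: y[0]                                 # (2.49), norm one
phi1 = lambda y: (y[0] + y[1]) * math.sqrt(2) / 2     # (2.50), norm one

val0 = max(phi0(g) for g in G)                        # sup Phi0(G) = 1
val1 = max(phi1(g) for g in G)                        # sup Phi1(G) = sqrt(2)

# Phi0 attains its sup on G at g0 = (1,0), yet g0 is not farthest (1 < sqrt 2),
# and Phi0 violates the maximality condition (2.44), since val0 < val1 = dev.
print(round(dev, 4), round(val0, 4), round(val1, 4))
```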
Theorem 2.6. Let $X$ be a normed linear space, $x_0\in X$, and $G$ a subset of $X$ such that $G\ne\{x_0\}$. For an element $g_0\in G$, the following statements are equivalent:

1°. $g_0\in\mathcal{F}_G(x_0)$.

2°. There exists $\Phi_0'\in X^*$ such that
$$\Phi_0'(g_0-x_0) = 1, \qquad (2.51)$$
$$\|g_0-x_0\| = \frac{1}{\|\Phi_0'\|}, \qquad (2.52)$$
$$\frac{1}{\|\Phi_0'\|} = \max_{\substack{\Phi\in X^*\\ \exists g\in G,\ \Phi(g)\ge\Phi(x_0)+1}} \frac{1}{\|\Phi\|}. \qquad (2.53)$$
If $G$ is weakly compact, these statements are equivalent to:

3°. There exists $\Phi_0'\in X^*$ satisfying (2.51), (2.52), and
$$\frac{1}{\|\Phi_0'\|} = \max_{\substack{\Phi\in X^*\\ \sup\Phi(G)\ge\Phi(x_0)+1}} \frac{1}{\|\Phi\|}. \qquad (2.54)$$
Proof. 2° $\Rightarrow$ 1°. We shall show that if $\Phi_0'\in X^*$ satisfies (2.51)-(2.53), then $\Phi_0 := \frac{1}{\|\Phi_0'\|}\Phi_0'$ satisfies (2.36), (2.42), and (2.44), whence $g_0\in\mathcal{F}_G(x_0)$ by Theorem 2.5. Indeed, (2.36) is obvious. Also, by (2.53), (2.33), (2.10), $\Phi_0'\ne 0$, and (2.51), we have
$$\frac{1}{\|\Phi_0'\|} = \sup_{g\in G}\|g-x_0\| = \sup_{\substack{\Phi\in X^*\\ \|\Phi\|=1}} \bigl|\sup\Phi(G)-\Phi(x_0)\bigr| \ge \frac{\bigl|\sup\Phi_0'(G)-\Phi_0'(x_0)\bigr|}{\|\Phi_0'\|} \ge \frac{\Phi_0'(g_0-x_0)}{\|\Phi_0'\|} = \frac{1}{\|\Phi_0'\|}, \qquad (2.55)$$
so equality holds throughout in (2.55), which yields (2.44). Finally, by (2.55) and $g_0\in G$, we obtain
$$\frac{\Phi_0'(g_0-x_0)}{\|\Phi_0'\|} = \frac{\bigl|\sup\Phi_0'(G)-\Phi_0'(x_0)\bigr|}{\|\Phi_0'\|} \ge \sup\Phi_0(G)-\Phi_0(x_0) \ge \Phi_0(g_0-x_0) = \frac{\Phi_0'(g_0-x_0)}{\|\Phi_0'\|},$$
whence equality holds throughout and we obtain (2.42).

1° $\Rightarrow$ 2°. We shall show that if $G\ne\{x_0\}$ and $\Phi_0\in X^*$ satisfies (2.36), (2.42), and (2.44), then $\sup\Phi_0(G)\ne\Phi_0(x_0)$ and $\Phi_0' := \frac{1}{\sup\Phi_0(G)-\Phi_0(x_0)}\Phi_0$ satisfies (2.51)-(2.53); by Theorem 2.5, such a $\Phi_0$ exists whenever $g_0\in\mathcal{F}_G(x_0)$. Indeed, by (2.44), (2.10), and $G\ne\{x_0\}$ we have $\bigl|\sup\Phi_0(G)-\Phi_0(x_0)\bigr| = \sup_{g\in G}\|g-x_0\| > 0$. Also, by (2.42), we have (2.51). Furthermore, by (2.36), the definition of $\Phi_0'$, and (2.42), we have
$$\frac{1}{\|\Phi_0'\|} = \bigl|\sup\Phi_0(G)-\Phi_0(x_0)\bigr| = |\Phi_0(g_0-x_0)| \le \|g_0-x_0\|,$$
and on the other hand, by (2.44), (2.10), and since $g_0\in G$,
$$\frac{1}{\|\Phi_0'\|} = \bigl|\sup\Phi_0(G)-\Phi_0(x_0)\bigr| = \sup_{g\in G}\|g-x_0\| \ge \|g_0-x_0\|, \qquad (2.56)$$
whence we obtain (2.52). Finally, by (2.56) and (2.33), we get
$$\frac{1}{\|\Phi_0'\|} = \sup_{g\in G}\|g-x_0\| = \max_{\substack{\Phi\in X^*\\ \exists g\in G,\ \Phi(g)\ge\Phi(x_0)+1}} \frac{1}{\|\Phi\|},$$
that is, (2.53).

2° $\Leftrightarrow$ 3°. This follows from the fact that if $G$ is weakly compact and $\Phi\in X^*$, then $\sup\Phi(G)$ is attained (see Lemma 1.3). $\square$

Now we shall study the existence of elements $g_0\in G$ for which the sup in the left-hand side of (2.10) is attained (i.e., of farthest points $g_0\in\mathcal{F}_G(x_0)$).

Definition 2.1. We shall call an optimal dual solution, or, briefly, optimal function (with respect to the pair $(G,x_0)$) any function $\Phi_0\in X^*$ with $\|\Phi_0\|=1$ at which the sup in the right-hand side of (2.10) is attained, i.e., satisfying (2.44).

Theorem 2.7. Let $G$ be a subset of a normed linear space $X$, and $x_0\in X$. The following statements are equivalent:

1°. $\mathcal{F}_G(x_0)\ne\emptyset$.

2°. There exists an optimal dual solution $\Phi_0\in X^*$ such that
$$\Phi_0 \text{ attains its supremum on } G. \qquad (2.57)$$
Proof. Condition (2.57) means that there exists $g_0\in G$ satisfying (2.42), so the result follows from Theorem 2.5. $\square$

Remark 2.8. By Lemma 1.5, for the hyperplane
$$H_0 := \{y\in X\mid \Phi_0(y)=\sup\Phi_0(G)\}, \qquad (2.58)$$
where $\Phi_0\in X^*$, $\|\Phi_0\|=1$, we have $\operatorname{dist}(x_0,H_0) = \bigl|\sup\Phi_0(G)-\Phi_0(x_0)\bigr|$. Hence, a function $\Phi_0\in X^*$ with $\|\Phi_0\|=1$ is optimal if and only if the hyperplane (2.58) satisfies (2.47); we shall call any such hyperplane an optimal hyperplane. Then Theorem 2.7 admits the following geometric interpretation: We have $\mathcal{F}_G(x_0)\ne\emptyset$ if and only if there exists an optimal hyperplane $H_0\in\mathcal{H}_G$ such that $H_0\cap G\ne\emptyset$.

In the particular case that $G$ is weakly compact, condition (2.57) can be omitted, as shown by the following:

Corollary 2.3. Let $G$ be a weakly compact subset of a normed linear space $X$, and $x_0\in X$. The following statements are equivalent:

1°. $\mathcal{F}_G(x_0)\ne\emptyset$.

2°. There exists an optimal dual solution $\Phi_0\in X^*$.
Proof. Since $G$ is weakly compact, every $\Phi_0\in X^*$ satisfies (2.57) (see Lemma 1.3), so the result follows from Theorem 2.7. $\square$

Corollary 2.3 is no longer true without the assumption of weak compactness, as shown by Example 2.1. Moreover, one can modify Example 2.1 to show that in Corollary 2.3 the assumption of weak compactness of $G$ cannot be replaced by the assumption of weak* compactness of $G$ when $X$ is a conjugate space:

Example 2.3. Let $X = l^1 = c_0^*$, $G = \{\frac{n-1}{n}e_n\mid n=1,2,\ldots\}\cup\{0\}$, and let $x_0=0$. Then $\frac{n-1}{n}e_n \to 0$ in the weak* topology, so $G$ is weak* compact. Furthermore, by the argument of Example 2.1, the function $\Phi_0\in X^*$ defined by (2.48) is optimal, but $\mathcal{F}_G(x_0)=\emptyset$.

Let us summarize the connections between existence of farthest points and existence of optimal dual solutions. To this end, we shall denote by $\mathcal{O}_G(x_0)$ the set of all optimal dual solutions with respect to the pair $(G,x_0)$.

Theorem 2.8. Let $G$ be a subset of a normed linear space $X$, and let $x_0\in X$. Then

(a) $\mathcal{F}_G(x_0)\ne\emptyset \Rightarrow \mathcal{O}_G(x_0)\ne\emptyset$;

(b) $\mathcal{O}_G(x_0)\ne\emptyset \not\Rightarrow \mathcal{F}_G(x_0)\ne\emptyset$;

(c) if $G$ is weakly compact, then $\mathcal{F}_G(x_0)\ne\emptyset \Leftrightarrow \mathcal{O}_G(x_0)\ne\emptyset$.

Proof. (a) is an obvious consequence of Theorem 2.5 (or of Theorem 2.7). (b) is shown by Examples 2.1 and 2.3. (c) is nothing else than Corollary 2.3. $\square$
3. Duality for Quasi-convex Supremization
Given a locally convex space $X$ with conjugate space $X^*$, a subset $G$ of $X$ and a quasi-convex function $f: X\to\overline{R}$, in this chapter we shall give duality results for the primal supremization problem
$$(P'_{G,f}) \qquad \alpha' = \alpha'_{G,f} = \sup f(G). \qquad (3.1)$$
Any $g_0\in G$ for which the sup in (3.1) is attained, i.e., such that
$$f(g_0) = \sup f(G), \qquad (3.2)$$
is called an optimal solution of problem $(P'_{G,f})$; these will be studied in Chapter 4. The set of all optimal solutions will be denoted by $\mathcal{M}_G(f)$, that is,
$$\mathcal{M}_G(f) := \{g_0\in G\mid f(g_0) = \sup f(G)\}; \qquad (3.3)$$
naturally, one can also write max instead of sup in (3.2) and (3.3). If $f$ is a quasi-convex function, then $(P'_{G,f})$ of (3.1) is called a problem of quasi-convex supremization. Taking $f' := -f$, which is a quasi-concave function, one can also write (3.1) as the infimization problem
$$-\alpha' = -\sup f(G) = \inf f'(G); \qquad (3.4)$$
thus, quasi-convex supremization is equivalent to quasi-concave infimization. However, here we shall consider only quasi-convex supremization. In contrast to the cases of convex and quasi-convex infimization (see Chapter 1, Section 1.4), it will turn out that for quasi-convex supremization the theory of surrogate duality is more developed (see Sections 3.1-3.3) than the theory of Lagrangian duality (see Section 3.4).
Our starting point for the study of surrogate duality will be the observation that worst approximation may be regarded as a particular case of supremization, by taking $X$ to be a normed linear space, $x_0\in X$, and $f: X\to\overline{R}$ the convex function (1.264); indeed, then
$$\sup f(G) = \delta(G, x_0), \qquad (3.5)$$
the deviation (2.1) of $G$ from $x_0$, and, for this case, the optimal solutions $g_0\in G$ of problem $(P'_{G,f})$ are the elements of worst approximation of $x_0$ by $G$. Although the extension from the particular function $f$ of (1.264) to a function $f: X\to\overline{R}$ on a locally convex space $X$ is a rather big step, it turns out that, similarly to the case of passing from best approximation by convex sets to convex infimization, many results and methods of the theory of worst approximation can be extended to results on the supremization of functions. Similarly to the fact that formula (1.249) on the distance to a convex set extends to the surrogate duality formula (1.330) on quasi-convex infimization, it is natural to expect that formula (2.11) on the deviation will extend, under certain assumptions on $G$ and $f$, to a formula like
$$\sup f(G) = \sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y), \qquad (3.6)$$
obtained formally by replacing in (2.11) the function $f$ of (1.264) by a function $f$ on a locally convex space $X$; this will be achieved in Section 3.1. Next, corresponding to formula (1.355) on infimization, one would like to replace the hyperplanes $\{y\in X\mid \Phi(y)=\sup\Phi(G)\}$ of (3.6) by other sets, e.g., closed half-spaces. Therefore, in Section 3.2, we shall consider "unconstrained surrogate dual problems" to problem $(P'_{G,f})$ of (3.1), defined as supremization problems of the form
$$\beta' = \sup\lambda'(X^*\setminus\{0\}), \qquad (3.7)$$
where $X^*\setminus\{0\}$ is the dual set (unconstrained), and $\lambda' = \lambda'_{G,f}: X^*\setminus\{0\}\to\overline{R}$ is a function (the dual objective function, depending on $G$ and $f$) of the form
$$\lambda'(\Phi) = \inf f(\Omega_{G,\Phi}) \qquad (\Phi\in X^*\setminus\{0\}), \qquad (3.8)$$
with $\{\Omega_{G,\Phi}\}_{\Phi\in X^*\setminus\{0\}}$ being a family of subsets of $X$ related in some way to $G$. The right-hand side of (3.6) is indeed of the form (3.7), with $\lambda'$ of the form (3.8), where the surrogate constraint sets $\Omega_{G,\Phi}$ are the hyperplanes
$$\Omega_{G,\Phi} = \{y\in X\mid \Phi(y)=\sup\Phi(G)\} \qquad (\Phi\in X^*\setminus\{0\}). \qquad (3.9)$$
Problem (3.7), with $\lambda'$ of (3.8), is an unperturbational dual problem to $(P'_{G,f})$, since it is defined directly, without using the method of embedding first $(P'_{G,f})$ into a family of perturbed primal problems, and it is a surrogate dual problem to $(P'_{G,f})$, since it replaces the primal constraint set $G$ of (3.1) by a family of "surrogate constraint sets" $\Omega_{G,\Phi}\subseteq X$ $(\Phi\in X^*\setminus\{0\})$ (while it keeps the primal objective function $f$ unchanged). Next, more generally, in view of further applications, given an arbitrary set $X$, a subset $G$ of $X$ and a function $f: X\to\overline{R}$, for the supremization problem $(P'_{G,f})$ of (3.1) we shall consider in Section 3.3 a "surrogate dual problem" of the form
$$\beta = \beta_{G,f} = \sup\lambda(W), \qquad (3.10)$$
where $W = W_{G,f}$ is a set (the dual constraint set) and $\lambda = \lambda_{G,f}: W\to\overline{R}$ is the function (the dual objective function) defined by
$$\lambda_{G,f}(w) = \inf f(\Omega_{G,w}) \qquad (w\in W), \qquad (3.11)$$
with $\{\Omega_{G,w}\}_{w\in W}$ being a family of subsets of $X$ related in some way to $G$. Then, taking $X$ to be a locally convex space, $W = X^*\setminus\{0\}$, and $\lambda = \lambda'$ of (3.11), problem (3.10) reduces to problem (3.7), (3.8). Furthermore, taking $X$ to be a locally convex space, $W\subseteq X^*\setminus\{0\}$ or $W\subseteq(X^*\setminus\{0\})\times R$, and $\lambda = \lambda'$ of (3.11), we shall obtain some useful unconstrained and "constrained" surrogate dual problems to problem $(P'_{G,f})$ of (3.1). Actually, instead of $\{\Omega_{G,w}\}_{w\in W}$, we shall find it more convenient to use the equivalent language of polarities $\Delta: 2^X\to 2^W$ (this will be explained in Section 3.2). In Section 3.4 we shall deal with Lagrangian dual problems to problem $(P'_{G,f})$ of (3.1). Finally, the general dual problem (3.10) will permit us to study (unconstrained and constrained) surrogate duality for more structured primal supremization problems (i.e., in which the primal constraint set $G$ is expressed in more structured ways), by considering suitable dual constraint sets $W$ and dual objective functions $\lambda = \lambda_{G,f}: W\to\overline{R}$ as in (3.11) (see Section 3.5).
3.1 Some hyperplane theorems of surrogate duality

In this section we shall give some hyperplane theorems of surrogate duality, generalizing the (equivalent) geometric forms (2.11), (2.13) of Chapter 2, Theorem 2.1. Let us first give a lemma, in a somewhat more general form than needed in the sequel. For a linear space $X$, we shall denote by $X^*$ the set of all linear (not necessarily continuous) functions $\Phi: X\to R$.

Lemma 3.1. Let $X$ be a linear space, $\Phi\in X^*\setminus\{0\}$, and $f: X\to\overline{R}$ a convex function, and let
$$\omega(d) := \inf_{\substack{y\in X\\ \Phi(y)=d}} f(y) \qquad (d\in R). \qquad (3.12)$$
If
$$\omega(d) > -\infty \qquad (d\in R), \qquad (3.13)$$
then $\omega$ is finite and convex, and hence continuous on $R$.
Proof. Since $\Phi\ne 0$, we have $\{y\in X\mid \Phi(y)=d\}\ne\emptyset$, so $\omega(d)<+\infty$ $(d\in R)$, whence by (3.13), $\omega(R)\subseteq R$. Let $d_1, d_2\in R$, $0\le\mu\le 1$, and $\varepsilon>0$. Then by (3.12) and $\omega(R)\subseteq R$, there exist $y_1', y_2'\in X$ with $\Phi(y_1')=d_1$, $\Phi(y_2')=d_2$, such that $f(y_i')\le\omega(d_i)+\varepsilon$ $(i=1,2)$. But then, since $\Phi$ is linear and $f$ is convex, we obtain
$$\omega(\mu d_1+(1-\mu)d_2) = \inf_{\substack{y\in X\\ \Phi(y)=\mu d_1+(1-\mu)d_2}} f(y) \le f(\mu y_1'+(1-\mu)y_2') \le \mu f(y_1')+(1-\mu)f(y_2') \le \mu\omega(d_1)+(1-\mu)\omega(d_2)+\varepsilon,$$
which, since $\varepsilon>0$ was arbitrary, proves that $\omega$ is convex; being finite and convex on $R$, $\omega$ is continuous on $R$. $\square$

Theorem 3.1. Let $X$ be a locally convex space and $G$ a nonempty subset of $X$.

(a) If $f: X\to\overline{R}$ is a lower semicontinuous quasi-convex function, then
$$\sup f(G) \le \sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y). \qquad (3.14)$$
(b) If either $G$ is bounded and $f: X\to\overline{R}$ is a convex function satisfying
$$\inf_{\substack{y\in X\\ \Phi(y)=d}} f(y) > -\infty \qquad (\Phi\in X^*\setminus\{0\},\ d\in R), \qquad (3.15)$$
or $G$ is weakly compact and $f: X\to\overline{R}$ is an arbitrary function, then
$$\sup f(G) \ge \sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y). \qquad (3.16)$$
(c) Consequently, if either $G$ is bounded and $f: X\to\overline{R}$ is a lower semicontinuous convex function satisfying (3.15), or $G$ is weakly compact and $f: X\to\overline{R}$ is a lower semicontinuous quasi-convex function, then we have the equality (3.6).

Proof. (a) Let $f: X\to\overline{R}$ be a lower semicontinuous quasi-convex function, and assume, a contrario, that
$$\sup f(G) > \sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y). \qquad (3.17)$$
f(y).
(3.18)
Then there exist go ^ G and s > 0 such that /(go)-^>
sup oexnio}
inf y^^
Hence,
$$f(g_0)-\varepsilon > \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y) \qquad (\Phi\in X^*\setminus\{0\}), \qquad (3.19)$$
and thus for any $\Phi\in X^*\setminus\{0\}$ there exists $y = y_\Phi\in X$ with
$$\Phi(y)=\sup\Phi(G), \qquad f(g_0)-\varepsilon > f(y). \qquad (3.20)$$
Case 1°: $f(g_0)<+\infty$. Let
$$S_{f(g_0)-\varepsilon}(f) := \{y\in X\mid f(y)\le f(g_0)-\varepsilon\}. \qquad (3.21)$$
Then, by (3.20), $S_{f(g_0)-\varepsilon}(f)\ne\emptyset$. Furthermore, since $f$ is a lower semicontinuous quasi-convex function, $S_{f(g_0)-\varepsilon}(f)$ is a closed convex set; also, since $f(g_0)<+\infty$, we have $g_0\notin S_{f(g_0)-\varepsilon}(f)$. Hence, by the strict separation theorem, there exists $\Phi_0\in X^*\setminus\{0\}$ such that
$$\Phi_0(g_0) > \sup\Phi_0\bigl(S_{f(g_0)-\varepsilon}(f)\bigr). \qquad (3.22)$$
We claim that
$$\inf_{\substack{y\in X\\ \Phi_0(y)=\sup\Phi_0(G)}} f(y) > f(g_0)-\varepsilon; \qquad (3.23)$$
indeed, otherwise, there would exist $y_0\in X$ with $\Phi_0(y_0)=\sup\Phi_0(G)$ such that $f(y_0)\le f(g_0)-\varepsilon$ (so $y_0\in S_{f(g_0)-\varepsilon}(f)$), whence by (3.22),
$$\Phi_0(y_0) = \sup\Phi_0(G) \ge \Phi_0(g_0) > \sup\Phi_0\bigl(S_{f(g_0)-\varepsilon}(f)\bigr) \ge \Phi_0(y_0),$$
a contradiction. But (3.23) contradicts (3.19), which proves (3.14) for case 1°.

Case 2°: $f(g_0)=+\infty$. Then by (3.18) there exists $d\in R$ such that
$$d > \sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y). \qquad (3.24)$$
Then by (3.24), $S_d(f)\ne\emptyset$, and by $f(g_0)=+\infty$, we have $g_0\notin S_d(f)$, so the above argument of case 1° yields (3.23) with $f(g_0)-\varepsilon$ replaced by $d$, which contradicts (3.24). This proves (3.14) for case 2°.

(b) Let $G\subseteq X$ be a (nonempty) bounded set (hence $\sup\Phi(G)\in R$ for all $\Phi\in X^*$), and $f: X\to\overline{R}$ a convex function satisfying (3.15). Then by Lemma 3.1, the function $\omega$ of (3.12) is continuous on $R$, for all $\Phi\in X^*\setminus\{0\}$. Assume now, a contrario, that
$$\sup f(G) < \sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y). \qquad (3.25)$$
Then there exists $\Phi_0\in X^*\setminus\{0\}$ such that
$$\sup f(G) < \inf_{\substack{y\in X\\ \Phi_0(y)=\sup\Phi_0(G)}} f(y) = \omega\bigl(\sup\Phi_0(G)\bigr). \qquad (3.26)$$
Choose a sequence $\{g_n\}\subseteq G$ such that $\Phi_0(g_n)\to\sup\Phi_0(G)$. Then
$$\omega\bigl(\Phi_0(g_n)\bigr) = \inf_{\substack{y\in X\\ \Phi_0(y)=\Phi_0(g_n)}} f(y) \le f(g_n) \le \sup f(G) \qquad (n=1,2,\ldots), \qquad (3.27)$$
whence
$$\omega\bigl(\sup\Phi_0(G)\bigr) = \lim_{n\to\infty}\omega\bigl(\Phi_0(g_n)\bigr) \le \sup f(G), \qquad (3.28)$$
in contradiction to (3.26). Thus, (3.25) cannot hold, which proves (3.16).

Assume now that $G$ is weakly compact and $f: X\to\overline{R}$ is an arbitrary function satisfying (3.25), and hence (3.26), for some $\Phi_0\in X^*\setminus\{0\}$. Then, since $G$ is weakly compact, there exists $g_0\in G$ such that $\Phi_0(g_0)=\sup\Phi_0(G)$ (see Lemma 1.3), whence
$$\omega\bigl(\sup\Phi_0(G)\bigr) = \omega\bigl(\Phi_0(g_0)\bigr) = \inf_{\substack{y\in X\\ \Phi_0(y)=\Phi_0(g_0)}} f(y) \le f(g_0) \le \sup f(G),$$
in contradiction to (3.26). Thus, (3.25) cannot hold, which proves (3.16).

Finally, part (c) is an obvious consequence of parts (a) and (b). $\square$
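Formula (3.6) can be tested numerically in a finite-dimensional instance of Theorem 3.1(c). The sketch below is not from the text; the set $G$, the convex quadratic $f$, and the sampling grid are our own choices. For $\Phi = u$ with $\|u\|_2=1$ and $c = \sup\Phi(G)$, Lagrange multipliers give the closed form $\inf\{y^{\mathsf T}Qy : \langle u,y\rangle=c\} = c^2/(u^{\mathsf T}Q^{-1}u)$ used in the loop.

```python
import math

# Sanity check of formula (3.6): for the compact set G in R^2 and the lower
# semicontinuous convex function f(y) = y1^2 + 2*y2^2 (Q = diag(1, 2)), the
# primal value sup f(G) should equal the surrogate dual value
#   sup_{Phi != 0} inf {f(y) : Phi(y) = sup Phi(G)}.

G = [(1.0, 0.0), (0.0, 1.0), (-0.5, -0.5)]

f = lambda y: y[0] ** 2 + 2 * y[1] ** 2
alpha = max(f(g) for g in G)                 # primal value sup f(G)

beta = -math.inf
n = 3600
for k in range(n):
    t = 2 * math.pi * k / n
    u = (math.cos(t), math.sin(t))
    c = max(u[0] * g[0] + u[1] * g[1] for g in G)   # sup Phi(G)
    quad = u[0] ** 2 * 1.0 + u[1] ** 2 * 0.5        # u' Q^{-1} u
    beta = max(beta, c ** 2 / quad)          # inf of f over the hyperplane

print(round(alpha, 4), round(beta, 4))
```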
Remark 3.1. (a) Formula (3.6) admits the following geometric interpretation:
$$\sup f(G) = \sup_{H\in\mathcal{H}_G} \inf f(H), \qquad (3.29)$$
where $\mathcal{H}_G$ denotes the family of all (closed) hyperplanes that quasi-support the set $G$. Thus, the reduction principle of Remark 2.1(b) extends now to the following form: formula (3.29) reduces the computation of $\sup f(G)$ to the computation of $\inf f(H)$ for the hyperplanes $H\in\mathcal{H}_G$.

(b) Theorem 3.1 remains valid if we also permit $\Phi=0$ in (3.6), since
$$\sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y) = \sup_{\Phi\in X^*}\ \inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y). \qquad (3.30)$$
Indeed, the inequality $\le$ in (3.30) is obvious. On the other hand, for each $\Phi\in X^*\setminus\{0\}$ and for $\Phi_0=0$ we have
$$\inf_{\substack{y\in X\\ \Phi(y)=\sup\Phi(G)}} f(y) \ge \inf f(X) = \inf_{\substack{y\in X\\ \Phi_0(y)=\sup\Phi_0(G)}} f(y),$$
whence we obtain the inequality $\ge$ in (3.30). However, formula (3.6) has the advantage that it is a "hyperplane theorem" (by (3.29)), while for $\Phi_0=0$ we have $\{y\in X\mid \Phi_0(y)=\sup\Phi_0(G)\}=X$, which is not a hyperplane.

(c) The assumption of boundedness of $G$ cannot be omitted in parts (b) and (c) of Theorem 3.1. Indeed, for example, if $X=G=R$ and $f(y)=1$ for all $y\in X$, then the right-hand sides of (3.16) and (3.6) are $+\infty$ (since $\sup\Phi(G)=+\infty$, so $\{y\in X\mid \Phi(y)=\sup\Phi(G)\}=\emptyset$, for all $\Phi\in X^*\setminus\{0\}$), but $\sup f(G)=1$.
In the particular case $G=\{x_0\}$ (a singleton, hence weakly compact), from Theorem 3.1 we obtain the following corollary:

Corollary 3.1. Let $X$ be a locally convex space, $x_0\in X$, and $f: X\to\overline{R}$ a lower semicontinuous quasi-convex function. Then
$$f(x_0) = \sup_{\Phi\in X^*\setminus\{0\}}\ \inf_{\substack{y\in X\\ \Phi(y)=\Phi(x_0)}} f(y). \qquad (3.31)$$
Remark 3.2. Formula (3.31) admits the geometric interpretation
$$f(x_0) = \sup_{\substack{H\in\mathcal{H}\\ x_0\in H}} \inf f(H), \qquad (3.32)$$
where $\mathcal{H}$ denotes the family of all hyperplanes in $X$.

The sup in (3.31) need not be attained, even when $f$ is finite, as shown by the following example:

Example 3.1. Let $B$ be a nonreflexive Banach space, let $X=B^*$, endowed with the weak* topology $\sigma(B^*,B)$, and let
$$f(y) = \|y\| \qquad (y\in X). \qquad (3.33)$$
Then $X$ is a locally convex space and $f$ is a finite lower semicontinuous convex function on $X$ (but it is not continuous at any $x_0\in X$). Hence, by Corollary 3.1, we have (3.31), which, since $X^* = (B^*, \sigma(B^*,B))^*$ can be identified with $B$, means that
$$\|x_0\| = \sup_{b\in B\setminus\{0\}}\ \inf_{\substack{x\in B^*\\ x(b)=x_0(b)}} \|x\| = \sup_{b\in B\setminus\{0\}} \operatorname{dist}(0, H_{b,x_0}) \qquad (x_0\in B^*), \qquad (3.34)$$
where
$$H_{b,x_0} = \{x\in B^*\mid x(b)=x_0(b)\}. \qquad (3.35)$$
Thus, for each $b\in B\setminus\{0\}$, $H_{b,x_0}$ is a hyperplane in $B^*$, and therefore, by Lemma 1.5, $\operatorname{dist}(0, H_{b,x_0}) = |x_0(b)|/\|b\|$. Consequently, (3.34) becomes the well-known formula
$$\|x_0\| = \sup_{b\in B\setminus\{0\}} \frac{|x_0(b)|}{\|b\|} \qquad (x_0\in B^*). \qquad (3.36)$$
But since B is nonreflexive, by a well-known theorem of R.C. James (see, e.g., [40], p. 63, part (7)), there exists XQ e B* for which the sup in (3.36), and hence in (3.34), is not attained. For the next hyperplane theorem we need some preparation.
Lemma 3.2. Let $X$ be a locally convex space, $\Phi\in X^*\setminus\{0\}$, and $f: X\to\overline{R}$ a function such that the (possibly empty) sets
$$\Omega_r = \{\Phi(y)\mid y\in X,\ f(y)\le r\} \qquad (r\in R) \qquad (3.37)$$
are closed in $R$. Then the function $\omega: R\to\overline{R}$ defined by (3.12) is lower semicontinuous.

Proof. Let $r\in R$, $\{z_n\}\subseteq S_r(\omega) = \{z\in R\mid \omega(z)\le r\}$, $z_n\to z_0$. Then, given $\varepsilon>0$, by $z_n\in S_r(\omega)$ and (3.12) there exist $y_n\in X$ with $\Phi(y_n)=z_n$ such that
$$f(y_n) \le \omega(z_n)+\varepsilon \le r+\varepsilon, \qquad (3.38)$$
so $z_n = \Phi(y_n)\in\Omega_{r+\varepsilon}$ $(n=1,2,\ldots)$. Since $\Omega_{r+\varepsilon}$ is closed, it follows that $z_0\in\Omega_{r+\varepsilon}$, i.e., there exists $y_0\in X$ with $\Phi(y_0)=z_0$ and $f(y_0)\le r+\varepsilon$, whence $\omega(z_0)\le r+\varepsilon$. Since $\varepsilon>0$ was arbitrary, we obtain $\omega(z_0)\le r$, that is, $z_0\in S_r(\omega)$. Thus each level set $S_r(\omega)$ is closed, and hence $\omega$ is lower semicontinuous. $\square$
3.2 Unconstrained surrogate dual problems for quasi-convex supremization

While in the preceding section we have been concerned with "hyperplane theorems" of surrogate duality, now we want to consider as well other types of surrogate duality results for supremization, e.g., "half-space theorems." To this end, as mentioned at the beginning of this chapter, we shall consider for the supremization problem $(P'_{G,f})$ of (3.1) a "surrogate dual problem" of the form (3.10), where $W = W_{G,f}$ is a set (the dual constraint set) and $\lambda = \lambda_{G,f}: W\to\overline{R}$ is the function (the dual objective
function) defined by (3.11), with $\{\Omega_{G,w}\}_{w\in W}$ being a family of subsets of $X$ related in some way to $G$. Our main tool will be the unifying framework of polarities $\Delta: 2^X\to 2^W$. We shall first give some results using arbitrary polarities, and then we shall apply them to various concrete polarities. Let us fix, for simplicity of notation, the set $G\subseteq X$, so that we can write $\Omega_w$ instead of $\Omega_{G,w}$. We then have the following basic lemma.

Lemma 3.3. Given two sets $X$, $W$ and a family $\{\Omega_w\}_{w\in W}$ of subsets of $X$, there exists a unique polarity $\Delta: 2^X\to 2^W$ satisfying
$$\Omega_w = \complement\Delta'(\{w\}) \qquad (w\in W), \qquad (3.39)$$
namely, the mapping defined by
$$\Delta(C) := \{w\in W\mid C\subseteq\complement\Omega_w\} = \{w\in W\mid C\cap\Omega_w=\emptyset\}. \qquad (3.40)$$
Proof. By (1.150) and (3.40), for any $w\in W$ we have
$$\complement\Delta'(\{w\}) = \{x\in X\mid w\notin\Delta(\{x\})\} = \{x\in X\mid x\in\Omega_w\} = \Omega_w,$$
which proves (3.39). Furthermore, if $\Delta$ is a polarity satisfying (3.39), then, by (1.144) and (3.39), we obtain
$$\Delta(C) = \{w\in W\mid C\subseteq\complement\Omega_w\},$$
that is, (3.40), which proves the uniqueness of $\Delta$. $\square$
Remark 3.3. (a) Using (3.39) and (1.144), the dual objective function $\lambda$ of (3.11) becomes
$$\lambda^\Delta(w) = \inf f\bigl(\complement\Delta'(\{w\})\bigr) = \inf_{\substack{x\in X\\ w\in\complement\Delta(\{x\})}} f(x) \qquad (w\in W), \qquad (3.41)$$
where $\Delta = \Delta_G: 2^X\to 2^W$ is a polarity (depending on $G$, but not on $f$). Then, by (3.10) and (3.41), the dual value (i.e., the value of the dual problem) becomes
$$\beta^\Delta = \sup_{w\in W} \inf f\bigl(\complement\Delta'(\{w\})\bigr) = \sup_{w\in W}\ \inf_{\substack{x\in X\\ w\in\complement\Delta(\{x\})}} f(x). \qquad (3.42)$$
Formulas (3.40) and (3.39) yield a one-to-one correspondence between families of subsets $\{\Omega_w\}_{w\in W}$ of $X$ and polarities $\Delta: 2^X\to 2^W$, so the two languages (3.11), (3.10) and (3.41), (3.42) are equivalent ways of expressing the dual objective function $\lambda^\Delta$ and the dual value $\beta^\Delta$. In the sequel we shall choose the language (3.41), (3.42), since this will allow us, by using (1.140), to express the results, e.g., on the relations between the primal and dual problems, in a more concise way.

(b) If there exists $w_0\in W$ such that $\complement\Delta'(\{w_0\})=\emptyset$, or, equivalently, $\Delta'(\{w_0\})=X$, then by (3.41), we have $\lambda^\Delta(w_0)=\inf\emptyset=+\infty$, and hence by (3.42), $\beta^\Delta=+\infty$. Thus,
$$\beta^\Delta<+\infty \;\Rightarrow\; \complement\Delta'(\{w\})\ne\emptyset,\ \text{i.e.,}\ \Delta'(\{w\})\ne X \qquad (w\in W). \qquad (3.43)$$
(c) In the particular case of Theorems 3.1 and 3.2, we have W = X*\{0}, and by (3.9) and (3.39), the surrogate constraint sets are CA\[^})
= {y eX\ ^(y) = sup 0(G)}
(O e X*\{0}),
(3.44)
where A = AG : 2^ ^ 2^*\<^^ is the polarity A^ of (1.166); also, the dual objective function (3.41) is A1(0)=
inf
f(y)
(cD6X*\{0}).
(3.45)
yeX )=supO(G)
We shall first give some necessary and sufficient conditions on G, / , and A, in order that a > )S^ or or < )6^ or a = )6^, where a e R is arbitrary, in terms of the level sets Sd(f) and Aj(/) of (1.22) and (1.23). Lemma 3.4. Let X be a set, Q <^ X, f: X ^ ^, and d e R. (a) We have inf/(^) > d
(3.46)
if and only if Ad(f)nQ
= &.
(3.47)
(h)If i n f / ( ^ ) > d,
(3.48)
SAf)nQ
(3.49)
then = id.
Proof (a) If yo e Ad{f) H ^ , then i n f / ( ^ ) < fiyo) < d. Conversely, if i n f / ( ^ ) < d, then ^ 7^ 0 and there exists >'o ^ ^ such that f(yo) < d, so
yoeAAf)nQ. (b) If there exists yo e S^if) n ^ , then i n f / ( ^ ) < f(yo) < d, so (3.48) cannot hold. n Proposition 3.1. Let X, W be two sets, f: X -^ R a function, A: 2^ -> 2^ a polarity, and a e R. The following statements are equivalent: 1°. We have a>^l
= sup inf /(CA^({it;})).
(3.50)
2°. We have Ad{f) n ZA\{W})
7^ 0
{w eW,deR,d>a).
(3.51)
7^ 0
{w eW,deR,d>oi).
(3.52)
3°. We have Sdif) n ZA\[W])
3.2 Unconstrained surrogate dual problems for quasi-convex supremization
111
Proof, r =4^ 2°. If r holds, then for each w; G W and J G /?, J > a, we have d > inf f(CA\{w})), whence by Lemma 3.4(a), we obtain (3.51). The implication 2° =^ 3° is obvious. 3° =» 1°. If 3° holds, then by Lemma 3.4 (b), we have X'^(w) = inf f(ZA\{w}))
(w e W,d e R,d > a),
whence ^^ = sup A,^(W) < infj>(^ d — ot.
•
Proposition 3.2. Let X, W be two sets, f: X ^ ^ a function, A: 2^ -> 2^ a polarity, and a G R. The following statements are equivalent: r. We have a
sup inf f(CA\{w})).
(3.53)
weW
2°. For each d e R, d < a, there exists w^ e W such that AAf)nCA'(lwd})^&.
(3.54)
3°. For each d e R, d < a, there exists Wd £ W such that SAf)nCA\{wd})
= &-
(3.55)
Proof r =^ 3°. If 1° holds and J G /?, J < a, then d
supinf/(CA'({K;})), weW
and hence there exists w^ e W such that d < inf / ( C A ' ( { M ; J } ) ) . Then by Lemma 3.4(b), we have (3.55). The implication 3° => 2° is obvious. 2° => 1°. Ifd and wj are as in 2°, then by Lemma 3.4(a), we have X^iWd) = mf
f(CA\{w,}))>d.
whence ^^ = sup A,^(W) > sup^^^ d = a.
•
Combining Propositions 3.1 and 3.2, we obtain the following result: Theorem 3.3. Let X, W be two sets, f: X ^^ R a function, A: 2^ -> 2 ^ a polarity, and a G R. The following statements are equivalent: 1°. We have a = Pl=
sup inf f(CA\[w}))\
(3.56)
weW
2°. We have (3.51) and for each d e R, d < a, there exists Wd e W satisfying (3.54). 3°. We have (3.52) and for each d e R, d < a, there exists Wd ^ W satisfying (3.55).
112
3. Duality for Quasi-convex Supremization
Now we shall give, for the case when a = a^ of (3.1), some convenient sufficient conditions in order that a' > yS^ or a' < ^^ or a' = yS^- ^^ ^^is end, let us first prove a lemma: Lemma 3.5. Let X and W be two sets, A: 2^ ^ following statements are equivalent: 1°. We have
2^ a polarity and XQ e X. The
H[xo]) = 0.
(3.57)
2°. We have xo ^ U^ew^\M).
(3.58)
Proof. If (3.58) does not hold, i.e., if there exists WQ ^ W such that XQ e A'({ifo}), then A({jco}) ^ AA'({u;o}) ^ w^o, so (3.57) does not hold. Conversely, if we do not have (3.57), i.e., if there exists WQ e A({jco}), then A\{wo}) 5 A'A({jco}) 3 JCo, so (3.58) does not hold. D Theorem 3.4. Let X and W be two sets, f: X ^ R a function, and AG : 2^ -)• 2 ^ (G C X) a family of polarities such that for any G C. X we have A{,}(lg}) = id
(geG).
(3.59)
inf/(CA^c({u;})) < supinf/(C A;^J({U;}))
(W
e W).
(3.60)
geG
Then, given G C. X, sup/(G)>)Sl^.
(3.61)
Moreover, if we have (3.59), (3.60) and f is A^^ Ac-quasi-convex, then sup/(G) = ^ l ^ .
(3.62)
Proof By (3.59) and Lemma 3.5, we have g e C Aj^}({w;}) (g e G,w e W), whence inf/(CA;^J({W;})) < f{g) (g eG,w e W). Therefore, by (3.60), inf/(CA^^({u;})) < supinf/(CA;^j({u;})) < sup/(G)
(w e W),
geG
and hence by (3.42), we obtain (3.61). Furthermore, if also / is Aj^Ac-quasiconvex, then by (3.111) below (applied to A = AG), we have sup / ( G ) =
sup
inf f(CA'a({w})) < sup inf f{CA'^({w})) = ^ 1 ^ ,
weCAciG)
whence by (3.61), we obtain (3.62).
ujeW
D
3.2 Unconstrained surrogate dual problems for quasi-convex supremization
113
Theorem 3.5. Let X and W be two sets, G a subset of X, f: X -> R a function, and A: 2^ -> 2 ^ a polarity. The following statements are equivalent, where a, = a'= sup f(G): r. We have
2°. We have
3°. We have
4°. We have
{a=)supfiG)
(3.63)
A(5,(/)) # 0
(d< a).
(3.64)
HAAf))
(d< a).
(3.65)
/ 0
Aa(f) c U^^wA\{w]).
(3.66)
Proof 4° ^ 1 ° . We have 4° if and only if for each d < a there exists Wd e W such that Ad(f) c A\{wd}), i.e., such that A j ( / ) n CA'({W;J}) = 0, which, by Proposition 3.2, is equivalent to sup / ( G ) < ^^. 2° =:^ 3°. If 2° holds, then since AAf) £ SAf), we have A(Aj(/)) 3 A(5^(/)) 7^ 0 (J < a). 3° => 4°. If w;j 6 A(Aj(/)) (J < a), then (d < a), Ad(f) C A'A(A^(/)) C A'({u;^}) C U^^wA\{w}) whence A«(/) = U^<«A^(/) c U^jew^'iM). 4° =^2°. If 4° holds, then for each d < a there exists Wd e W such that 5 j ( / ) c A J / ) c A'({it;j}), whence A{Sd(f)) 2 AA^({u;^}) 3 u;^.
D
Corollary 3.2. L^r Z, W be two sets, G a subset ofX,f:X^^R a function, and A: 2^ ^^ 2^ a polarity. Iffor each d < sup / ( G ) , the level set Sd(f) is A Q Acconvex (in particular, if f is A'Q AQ-quasi-convex), then (3.63) holds. Proof. The assumption that Sd{f) is AJ^Ac-convex {d < sup/(G)) means that for each d < sup / ( G ) and x e CSd(f) there exists w = Wd,x ^ W such that Sdif) c A'cd^dJ),
X e CA'a({^d,x})-
(3.67)
Hence, for each ^f < sup/(G) we have A j ( / ) c Sd(f) ^ Uu;evrA^(^({w;}), whence (3.66), and thus, by Theorem 3.5, implication 4° =^ 1°, there follows (3.63). D Combining theorems 3.4 and 3.5, we obtain the following corollary: Corollary 3.3. If we have (3.59), (3.60), and (3.66), then
sup f(G) = PI
(3.68)
114
3. Duality for Quasi-convex Supremization
Concerning simultaneous characterizations of optimal solutions of (P^) and of weak duality a^ = i^A, let us prove the following theorem: Theorem 3.6. Let X and W be two sets, G a subset ofX.f'.X^^R a function, and A: 2^ -> 2 ^ « polarity. For an element go e G and for a = ot^ = sup / ( G ) , the following statements are equivalent: r . We have go G Mdf) {i.e., /(go) = max / ( G ) ) and a = p^. T. We have AAf)
n CA\{W})
^0
(weW^deR,d>
/(go)),
(3.69)
and for each d e R, d < a, there exists w^ ^ W satisfying (3.54). 3°. We have SAf) n CA\{W})
#0
(weW,deR,d>
f(go)),
(3.70)
and for each d e R, d < a, there exists Wd ^ W satisfying (3.55). Proof r => 2°. If r holds, then f(go) = sup / ( G ) = a, and hence by Theorem 3.3, we have 2°. 2° ^ r. Assume 2°. Then by (3.69) and Proposition 3.1 (with a = /(go)), we have /(go) ^ P^- Furthermore, by the second condition of 2° and by Proposition 3.2, we have (3.53) with a = sup / ( G ) . Hence by go e G, we obtain Pl>
a = sup f(G) > f (go) > PI
Finally, the proof of the equivalence 1° <^ 3° is similar.
D
In the remainder of this section we shall assume, often without any special mention, that X is a locally convex space, with conjugate space X*, and G C. X, and we shall apply the preceding results to the special polarities A^ A^ of Chapter 1, Section 1.2. (1) By (1.155), for the polarity A = A^ : 2^ -> 2^*^^^^ of (1.154) the dual value (3.42) becomes fil^ =
sup
inf/(C(A^)'({0})) =
"
^^^*\^«}
sup
inf
f(x).
(3.71)
^^^*\{OUu)>s'upcI>(G)
Note also that for any 4> G X* we have supO(G) > —(X), since G 7^ 0. On the other hand, if there exists 4>o G X* such that sup Oo(G) = +00, then by (1.155) and a>o(x) G R, we have C(A]jy{{^o}) = {x e X| cDo(x) > supOo(G)} = 0. Hence by (3.71), ^^, < +00 =^ sup 0(G) e R(^
e X*).
(3.72)
3.2 Unconstrained surrogate dual problems for quasi-convex supremization
115
Theorem 3.7. Let X be a locally convex space, f: X ^^ R a function, and G a subset of X.
i^)If inf
fix)
< sup
inf
^
OU)>0(g)
f(x)
(O e X*\{0}),
(3.73)
then sup/(G) >
sup
inf
/(jc).
(3.74)
f(x)
(3.75)
^^^*\
(b) There holds sup/(G) <
sup
inf
if and only if for each d < sup / ( G ) there exists O = Oj e X*\{0} such that 0(y)<sup(D(G)
(yeSAf))
(3.76)
(by Lemma 1.10(c), this condition is satisfied, e.g., when f is (AQYA^j-quasiconvex). (c) If we have (3.73), then for each d < sup / ( G ) there exists O = Oj e Z*\{0} satisfying (3.76) if and only if sup/(G) =
sup
inf
fix).
(3.77)
^^^*\{OU(.)>s'upc|>(G)
Proof (a) This follows from Theorem 3.4 for A = A^ of (1.154), using (1.156), (1.155) and <^ix) >
C(A|^P^({0}) = {xeX\
X*\{0}).
(3.78)
Alternatively, one can also give the following direct proof: We have sup / ( G ) > f(g) >
inf
fix)
(^ E G, CD e X*\{0}),
xeX 0(^)>0(g)
whence by (3.73), sup/(G) >
sup
sup
inf
OeX*\{0} seG ^ ( / ) | i ( ^ )
fix)>
sup
inf
fix)-
eX*\{0} ^(,)>-,t.p 0(G)
(b) This follows from Theorem 3.5, equivalence 2° <^ 1°, applied to AJ. , since formulas (3.75) and (3.76) mean, respectively, (3.63) for A = AQ and O G Al^iSdf))(c) This follows by combining parts (a) and (b). •
116
3. Duality for Quasi-convex Supremization
Remark 3.4. (a) Theorem 3.7 is a "half-space theorem of surrogate duality," since for each O e Z*\{0} the surrogate constraint set Q<^ = ^G,
(3.79)
VeVc
where VG denotes the family of all closed half-spaces that quasi-support the set G and do not contain G (see Corollary 1.3). (b) Theorem 3.7 remains valid, with the same proof, if we replace in it, and in the definition of A^^; Z*\{0} by any subset W o/X*\{0}. (c) In formula (3.73) the inequality sign can be replaced by equality, since the opposite inequality always holds. A similar remark holds also for formulas (3.171) and (3.185) below. Corollary 3.4. Let X be a locally convex space, G C X, and f: X ^^ R a function such that for each d < sup / ( G ) the level set Sdif) is evenly convex {e.g., let f be evenly quasi-convex). Then we have (3.75). Proof For each d < a = sup / ( G ) there exists gd ^ G such that f(gd) > d, that is, gd G ZSdif). Hence, since Sdif) is evenly convex, there exists Oj € Z*\{0} such that <^d{y) < ^d(gd)
(y e Sdif)).
(3.80)
Then, by (3.80) and g^ G G, we have <^diy) < ^digd) < sup
iy e
Sdif)).
Consequently, by Theorem 3.7(b), we obtain (3.75).
D
Definition 3.1. For a locally convex space X and O e X*\{0}, a function / : X ^ R is called regular with respect to O if inf fix) = sup xeX (x)>d
inf
fix)
id e R).
(3.81)
j/^r, xeX j'<5^(^)>^'
For example, it is known (see [244], Remark 3.2) that if f: R" ^^ R and / is convex, then / is regular with respect to all O e iR"^)*. Using Theorem 3.7(a), let us prove the following: Corollary 3.5. Let X be a locally convex space, G a subset ofX such that supcD(G)G/?
(OGX*),
(3.82)
and f: X ^^ R a function that is regular with respect to all ^ G X*\{0}. Then we have (3.74).
3.2 Unconstrained surrogate dual problems for quasi-convex supremization
117
Proof. Let ^ e X*\{0}. Then by our assumption of regularity, we have (3.81) for d = sup 0(G) 6 R, i.e., inf
fix) =
sup
inf f(x).
(3.83)
But for each d' e R with d' < supO(G) there exists g = g^> e G such that d' < ^{g), whence [x e X\ ^(x) > d'] ^ [x e X\ 0(x) > 0(g)}. Consequently, sup
inf f{x)<
sup
inf
rl'czl?
X^X
oad
X^X
/(jc),
(3.84)
which, together with (3.83), yields (3.73). Hence, by Theorem 3.7(a), we get (3.74). D Combining Corollaries 3.4 and 3.5, we obtain the following: Corollary 3.6. Let X be a locally convex space, G a subset of X satisfying (3.82), and f:X—>R a function that is regular with respect to all O e Z*\{0} and such that for each d < sup f(G) the level set Sdif) is evenly convex (the latter condition is satisfied, e.g., when f is evenly quasi-convex). Then we have (3.77). (2) By (1.161), for the polarity A = A^ : 2^ -^ 2^*^^^^ of (1.160) the dual value (3.42) becomes Pl2=
sup inf/(C(A2.)^({CI>})) =
sup
inf
f(x).
(3.85)
Note that if there exists 4>o e X*\{0} such that sup Oo(G) = +oo, then by (1.161) and cDo(x) e R, we have C{Aly({^o]) = {x e X| CDO(JC) > supOo(G)} = 0. Hence by (3.85), P'
< +00 =^ sup 0(G) G /? (O G X*).
(3.86)
Lemma 3.6. (a) For any set G, i^l. < PI2. ^G
(3.87)
^G
(b) If X is a locally convex space, G c X, and f: X ^^ R is upper semicontinuous, then Pl^ =^'^2. ^G
(3.88)
^G
Proof (a) By the definitions, we have fi'^, =
sup
inf (jc)>sup4)(G)
fix) <
sup
inf
^^ ' 0(jc)>sup(G)
f(x) = ^^2 •
118
3. Duality for Quasi-convex Supremization (b) We have
{x eX\^(x)
>supO(G)} = {jc eX\
(O G X*\{0}),
and hence if f: X ^^ R is upper semicontinuous, then by Lemma 1.1,
inf
xeX
fix) =
inf
fix)
(O e X*\{0}),
xeX
which yields (3.88).
D
Let us observe now that by (3.87), any condition ensuring (3.75) ensures also sup/(G) <
sup
inf
fix).
(3.89)
Hence, for example, from Corollary 3.4 we have the following result: Corollary 3.7. Let X be a locally convex space, G c. X, and f: X -^ R a function such that for each d < sup / ( G ) the level set Sdif) is evenly convex ie.g., let f be evenly quasi-convex). Then we have (3.89). Theorem 3.8. Inequality (3.89) holds if and only if for each d < sup / ( G ) there exists O = O^ G Z*\{0} such that sup ^iSdif))
< sup0(G).
(3.90)
Proof. This follows from Theorem 3.5, equivalence 4° ^ 1°, applied to A^, since formulas (3.89) and (3.90) mean, respectively, (3.63) for A = A^ and CD € AliSdif)). D Remark 3.5. (a) Condition (3.90) is satisfied, in particular, if SAf)^G
(J < sup/(G)).
(3.91)
(b) If / is (A^)^A^-quasi-convex, then by Lemma 1.11 (b), the above condition involving (3.90) is satisfied, and hence by Theorem 3.8, we have (3.89). However, this follows also from Corollary 3.2 applied to A^, or from Corollary 3.7 above, since by Lemma 1.11 (b) every iA^YA^-quasi-convex function is lower semicontinuous and quasi-convex, and hence evenly quasi-convex. Corollary 3.8. Let X be a locally convex space, G a subset of X satisfying (3.82), and f: X ^ R an upper semicontinuous function that is regular with respect to all 0 G X*\{0}, and such that for each d < sup / ( G ) the level set Sdif) i^ evenly convex ithe latter condition is satisfied, e.g., when f is evenly quasi-convex). Then sup/(G) =
sup OGX*\(0I
inf ^^^
^fcA \l^)<|>(^)>sup(G)
fix).
(3.92)
3.2 Unconstrained surrogate dual problems for quasi-convex supremization
119
Proof. By Corollary 3.5, we have (3.74), whence by Lemma 3.6 (b), we obtain the inequality > in (3.92). On the other hand, by Corollary 3.7 we have the inequality < in (3.92), and hence equality. D Remark 3.6. Corollary 3.8 is a "half-space theorem of surrogate duality," since the surrogate constraint sets ^
^k = ',^p,„. ^^^ ndAinm)) ^
= sup
OeX*\{0}
M
ci>ex*\{0}^. ^^^ l>{x)=sup
/(X).
(3.93)
Note that if there exists OQ G X * \ { 0 } such that sup Oo(G) = +00, then by (1.167) and Oo(jc) e R, we have C(A^)'({Oo}) = [x e X| OO(JC) = sup(Do(G)} = 0. Hence by (3.93), ^'
< +00 => sup4>(G) eRi^e
X*).
(3.94)
Let us also note that by the definitions, we have P'., =
sup
inf
f(x) <
sup
inf
f(x) = p'.,.
(3.95)
We have the following theorem of surrogate duality, which should be compared with Theorem 3.7(a).
Theorem 3.9. IfG^Xandf.X^J inf
fix) < sup
xeX 0(jc)=sup(I)(G)
satisfy inf
f(x)
(O e X*\{0}),
(3.96)
g/xeX ^""^ it>{x)=^(g)
then sup/(G) >
sup
inf
fix).
(3.97)
Proof Formula (3.97) follows from Theorem 3.4 applied to A = A^ of (1.166), using (1.168), (1.167) and C(A^^p\{cD}) = {xeX\
(g G G, CD e Z*\{0}).
(3.98) D
Remark 3.7. A sufficient condition for the inequality (3.96) to hold for a given O G X*\{0} is the existence of an element g e G such that 0(g) = sup 0(G); hence in particular, (3.96) holds if G is weakly compact.
120
3. Duality for Quasi-convex Supremization
(4) By (1.183), for the polarity A = A^: 2^ ^ 2^*^'"! of (1.182) the dual value (3.42) becomes ^^. =
sup inf /(C(A^)'({cD})) =
sup
inf
f{x).
(3.99)
Let us also note that by the definitions, we have PI, =
sup
inf
fix) <
sup
f(x) = ^^2 •
inf
(3.100)
Theorem 3.10. Let X be a locally convex space, G a subset ofX, and f: X -^ R a function. We have sup/(G) <
sup
inf
fix)
(3.101)
if and only if for each d < sup / ( G ) there exists = j e X*\{0} such that <^(Sdif)) C cD(G).
(3.102)
Proof This follows from Theorem 3.5, equivalence 4° ^ 1°, appHed to A^ , since formulas (3.101) and (3.102) mean, respectively, (3.63) for A = A^ and ^ € AS(5^(/)). • Note that we always have inf
/(x)>sup
inf
fix)
(
(3.103)
but in order to obtain conditions for the inequality sup/(G) >
sup
inf
fix).
(3.104)
^^^^^^'UixfAo and the equality sup/(G)=
sup
inf
fix),
(3.105)
we cannot apply Theorem 3.4 to A = A^, because of (1.184). Similarly, we always have inf xeX
/(x)>sup Q^Q
inf
fix)
(O G X*\{0}).
(3.106)
xeX
but we cannot obtain conditions for the opposite inequality and the equality in Theorem 3.10 by applying Theorem 3.4 to A = A^, because of (1.162).
3.3 Constrained surrogate dual problems for quasi-convex supremization
121
3.3 Constrained surrogate dual problems for quasi-convex supremization In this section we shall consider "constrained surrogate dual problems" to problem (P^) of (3.1), defined as supremization problems of the form ^^ = sup X^(Wlj), where the dual constraint set WQ is a proper subset either of an arbitrary set W, or of (X*\{0}) X R, or of X*\{0}, depending on G, and the dual objective function is (3.11). For the families {^G,(
U/e/ A/ —> R, we
inf inf/(Ay) = inf/(U/,/A,),
(3.107)
iel
sup sup/(A/) = sup/(U/e/A/).
(3.108)
iel
Proof The inequality > in (3.107) is obvious. Conversely, for each /x > inf/(U/e/A/) there exists a^ e U/^/A/, whence a^ e A/^ for some /^ ^ L such that M > /(«/x) > inf/(A/^) > infinf/(A/), whence, since /x > inf/(U/^/A/) was arbitrary, we obtain (3.107). This formula implies (3.108), since -sup/(U,e/A,) = i n f ( - / ) ( U , e / A , ) = infinf(-/)(A,) iel
= inf ( - s u p / ( A / ) ) = - s u p sup/(A/). i^^
D
iel
The following general duality theorem will be applied to various special polarities A: 2^ ^ 2^x*\m^R and A: 2^ ^ 2^*\{0}^ Theorem 3.11. Let X be a set, W ^ ^^, A: 2^ -> 2^ a polarity, f \ X -^ ~R a A' A-quasi-convex function, and G C. X. Then sup / ( G ) = - i n f f^^^\CA(G)).
(3.109)
Proof Let us first observe that by (1.139) we have U,,C;(CA({^}))
= C(n,,aA({g})) =
CA(G).
(3.110)
Hence, since f: X ^^ Ris A^ A-quasi-convex, by (1.153), (1.144), Lemma 3.7, (3.110), and (1.223), we obtain
122
3. Duality for Quasi-convex Supremization sup / ( G ) = sup /q(A'A)(g) = sup g^G
geG
sup
sup
iuf
f(CA\{w}))
we{lA({g})
inf f(CA'{{w}))
u;eU,,G(CA({g}))
sup
i-f^^^\w))
= - i n f f^^^\CA(G)).
n
u;eCA(G)
Remark 3.9. (a) By (1.223) and (1.144), one can also write (3.109) in the form sup / ( G ) =
sup
inf/(CA'({W;})) =
.eCA(G)
sup -'^^^^^^
inf
/(JC),
(3.111)
u^etlix})
which expresses sup / ( G ) as a *'sup inf," similarly to the preceding duality formulas. (b) Theorem 3.11 gives explicidy the reladon between the constraint sets, and the reladon between the objective funcdons, of the primal problem (P^) and the dual problem. Indeed, by Theorem 3.11, if X is a set, V^ c ;^^, A: 2^ ^ 2 ^ is a polarity, f e R , and G c X, then the supremizadon problem (Z)A)
y^A = sup AA(CA(G)),
(3.112)
where AA(M;)
= -f^^^\w)
= inf f(CA\{w}))
(w e C A ( G ) ) ,
(3.113)
might be called the "(A-)dual problem" to (P') (of (3.1)), while the set C A ( G ) and the function X^ of (3.113) might be called the "(A-)dual constraint set" and the "(A-)dual objective function," respectively. However, it will be more convenient to consider, instead of (DA), the infimizadon problem (DA)
h
= inf ( - A A ( C A ( G ) ) ) = inf / ^ ^ ^ ^ ( C A ( G ) ) = -y^A,
(3.114)
with AA of (3.113), as the (A-)dual problem to (P^) (of (3.1)), since then we will obtain a symmetric duality between abstract quasi-convex supremization problems and infimization problems with an abstract reverse convex constraint set (see Chapter 6, Remark 6.15 (b)). (c) Formulas (3.112)-(3.114) are surrogate dual problems, with "surrogate constraint sets" CA'({U;}) (W G C A ( G ) ) , instead of the inidal constraint set G of (3.1). Note that each A\{w}) (w e W) is A'A-convex (since A'AA'({M;}) = A\{w})), so each CA^({W;}) in (3.113) is a reverse A'A-convex constraint set. Let us first apply Theorem 3.11 to the special polarities A^^A^^,A^^:2^-> 2(x*\{0})xR ^^^ ^01^ ^02. 2X ^ 2^*\{0} of Section 1.2. (1) For the polarity A = A^^ of (1.189), we obtain the following corollary of Theorem 3.11: Corollary 3.9. Let Xbea locally convex space, / : X ^^ R a lower semicontinuous quasi-convex function, and G C X. Then
3.3 Constrained surrogate dual problems for quasi-convex supremization sup / ( G ) -
sup
inf
f(y).
123 (3.115)
(j)e(x*\mxR y^^^^ sup ^{G)>d
^(v)>^
Proof. For the polarity A = A^^ of (1.189) we have (1.190), so / is (A^^YA^^quasi-convex if and only if it is lower semicontinuous and quasi-convex. Hence, applyingfomiula(3.111) to A — A'^ we obtain (3.115). D Corollary 3.10. Let X be a locally convex space, f: X ^^ R a lower semicontinuous quasi-convex function, and G c. X. Then sup / ( G ) =
sup inf f(U),
(3.116)
where U denotes the family of all open half-spaces in X. Proof The open half-spaces in X are the sets of the form U^^d = [y^X\^{y)>d]^
(3.117)
where (cD, d) e (X*\{0}) x R, and sup <^(G) > d if and only if G n ^o,^ / 0Hence, (3.115) is equivalent to (3.116). D Remark 3.10. Formula (3.116) is another instance of the reduction principle: it reduces the computation of sup / ( G ) to the computation of inf f(U), for all U e U
with una
^0,
(2) For the polarity A = A^^ of (1.191), we obtain the following corollary of Theorem 3.11, which should be compared with Corollary 3.6: Corollary 3.11. Let X be a locally convex space, f: X -> R an evenly quasiconvex function, and G ^ X. Then sup / ( G ) =
sup
inf
fiy)=
(O,J)6(X*\{0})x/? y\^ 3geG,
sup
inf
(ct),g)eX*xG
-v^^ ^(>')>^(g)
f{y),
(3.118)
and if G is weakly compact, then sup / ( G ) =
sup
inf
f{y)=
sup cD(G)>^
^(>)>^
sup
inf
f(y).
(3.119)
cD(>')>supcD(G)
Proof For the polarity A = A^^ ^f (1.191) we have (1.192), so / is (A^2y^i2_ quasi-convex if and only if it is evenly quasi-convex. Hence, applying formula (3.111) to A = A^^, we obtain the first equality of (3.118). The second equality of (3.118) always holds, since sup
inf f(y) — sup
(cI>,J)e(X*\{0})x/? y^^ 3geG,cI>(g)>^ ^(>')>^
sup
OeX"^ {g4)^GxR ^{g)>d
= sup sup sup
inf
f{y)
J^^ ^ '^iy)>d
inf f{y) =
eX* geG deR >'^^
sup
inf
f(y).
(^>,?)eX*xG ^^ >;^^^ ^ ^ ^(y)>^(8)
124
3. Duality for Quasi-convex Supremization
When G is weakly compact, the first equality of (3.119) follows from the first equality of (3.118), since sup (G) is attained for each O G X* (see Lemma 1.3). The second equality of (3.119) always holds, since sup
inf f(y) = sup
sup
sup
inf f(y) = sup
supO(G)>JO(j)>J
Corollary 3.12. Let X be a locally convex space, f:X convex function, and G c. X. Then sup / ( G ) =
sup
inf
fiy)-
•
0(>')>sup
-> R an evenly quasi-
inf / ( V ) ,
(3.120)
VeV
where V denotes the family of all closed half-spaces in X. Proof The proof is similar to that of Corollary 3.10, using the fact that the closed half-spaces in X are the sets of the form V^^d = [x
> J},
GX|0(JC)
(3.121)
where (
D
Remark 3.11. (a) In the particular case that / is a continuous quasi-convex function. Corollary 3.^2 follows from Corollary 3.10. Indeed, if ^ G ZY and ^ n G # 0, then U eV and L^ H G 7^ 0 (where U denotes the closure of U). Also, since / is upper semicontinuous, by Lemma 1.1 we have inf / ( ^ ) = inf fijj).
(3.122)
Hence, since / is also lower semicontinuous, from Corollary 3.10 we obtain sup / ( G ) =
sup
inf f{U) < sup
UeU
VeV
inf f{V) < sup / ( G ) ,
which yields (3.120) (indeed, for the last inequality observe that if g G V HG, then inf/(V)
^0.
(c) As an application to approximation, let us note that if Z is a normed Unear space, G c X, and XQ G Z , then, from Corollaries 3.10 and 3.12 applied to the finite continuous convex function f(y) = \\xo-y\\
(yeX)^
(3.123)
we obtain again formula (2.21) on the deviation of G from XQ. (3) For the polarity A = A^^ of (1.193), we obtain the following corollary of Theorem 3.11:
3.3 Constrained surrogate dual problems for quasi-convex supremization Corollary 3.13. Let X be a locally convex space, f:X coajfine function, and G c. X. Then sup / ( G ) = sup deR
sup
inf
J^^ ^ ^(y)=d
sup
125
-> R an evenly quasif{y)
inf
f(y).
(3.124)
Proof. For the polarity A = A^^ of (1.193) we have (1.194), so / is (A^^YA^^quasi-convex if and only if it is evenly quasi-coaffine. Hence, applying formula (3.111) to A = A^^, we obtain the first equality of (3.124). The proof of the second equality of (3.124) is similar to that of the second equality of (3.118), replacing everywhere X* by X*\{0} and the inequality sign > by equality. D Corollary 3.14. Let X be a locally convex space, f: X -^ R an evenly quasicoaffine function, and G C X. Then sup / ( G ) =
sup
inf/(//),
(3.125)
Hen
where H denotes the family of all (closed) hyperplanes in X. Proof The proof is similar to that of Corollary 3.10, using the fact that the hyperplanes in X are the sets of the form (1.29), where <J> G X*\{0} and deR. D Remark 3.12. Formula (3.125) is another version of the "reduction principle" (it reduces the computation of sup f(G) to the computation of'mi f (H), for all H eH with H f) G ^ &), which should be compared with the reduction principle (3.29) of Remark 3.1 (a). Note that in contrast to (3.29) (see Remarks 3.1 (c) and 2.1 (c)), formula (3.125) holds also for unbounded sets G. In the above duality results there are two parameters, O e Z* \ {0} and deR. For functions f: X ^^ R satisfying condition (1.195) of Section 1.2 we obtain duality results involving only one parameter, O e X* \ {0}. Theorem 3.12. Let X be a locally convex space, f: X -> R a lower semicontinuous quasi-convex function satisfying (1.195), and G ^ X, G ^ {0}. Then sup / ( G ) =
sup
inf
f(x).
(3.126)
sup (G)>1 ^ ^ ^ ^ > '
Proof By G 7^ {0} and (1.195) we have G\{0} / 0 and sup f(G\m
> inf /(X\{0}) = /(O),
whence, using that / is a lower semicontinuous quasi-convex function and (1.198), sup / ( G ) = sup / ( G \ { 0 } ) = sup / q ( G \ { 0 } ) = sup /q((A0iyA0i)(G\{0}).
126
3. Duality for Quasi-convex Supremization
Hence, by Theorem 3.11 applied to /q((Aoi)'AO') and G\{0}, and by (1.227) and ^L(A)L(AyL(A)^y.L(A)^^g obtain
sup / ( G ) = -inf/^^^"^(CAO^(G\{0})),
(3.127)
and thus by 0(0) = 0, (1.223) and (1.196), it follows that sup / ( G ) = -
inf
(-
OGX*\{0}\ supO(G)>l
inf
fix))=
.veC(AOi )'({(!>})
/
sup
inf
OGX*\{0} supO(G)>l
"i^,^ , '^^•'^^^
f(x).
D
Remark 3.13. The assumption G 7^ {0} cannot be omitted in Theorem 3.12. Indeed, for G = {0} we have sup / ( G ) = /(O), but {O e X*\{0}| supO(G) > 1} = 0, so the right-hand side of (3.126) is —00. Theorem 3.13. Let X be a locally convex space, f an evenly quasi-convex function satisfying (1.195), andG £X, G 7^ {0}. Then sup / ( G ) =
sup
inf
<1>€X*\{0}
^ ^ l
fix).
(3.128)
Proof The proof is similar to that of Theorem 3.12, using now (1.201) and (1.199). D Remark 3.14. (a) Similarly to Remark 3.13, the assumption G 7^ {0} cannot be omitted in Theorem 3.13. (b) As an application to approximation, let us note that Theorem 3.13 yields again Theorem 2.3. Indeed, we may assume that JCQ = 0 and G 7^ {0}. Then, by Theorem 3.13 appHed to the function f(y) = \\y\\
(yeX),
(3.129)
which satisfies (1.195), we have sup llgll = geG
sup
dist(0, {yeX\
0 ( j ) > 1}),
(3.130)
cDeX*\{0} 3geG,
whence by Corollary 1.4, we obtain (2.33) for XQ = 0. One can obtain duality theorems for sup / ( G ) for many other classes of functions f: X ^^ Rby choosing suitable polarities A such that / is A^ A-quasi-convex and applying Theorem 3.11. Indeed, let us give here an example of such a result. We recall that a set G c X is called R-evenly convex if it is the intersection of a family of open half-spaces whose closures do not contain 0, and a function / : X -> /? is called R-evenly quasi-convex if all Sdif) (d e R) are /^-evenly convex. Corollary 3.15. Let X be a locally convex space, f: X ^^ R an R-evenly quasiconvex function, and G C. X. Then sup / ( G ) =
sup
inf xeX ^(-^)>-l
fix).
(3.131)
3.4 Lagrangian duality for convex supremization
127
Proof. From the general form of open half-spaces, it follows that a set G c X is /^-evenly convex if and only if it is the intersection of a family of sets of the form [x e X\ 0(jc) < - 1 } , where O e X*. Hence, if we define a polarity A^"^: 2^ -^ 2^*\{0} b y
A^\C)
= {CD G X*\{0}| 0(c) < - 1 (c G C)}
(C c X),
(3.132)
then G is (A'^)'A^'^-convex if and only if it is /?-evenly convex, so / : X -> /? is (A^^)^A^'*-quasi-convex if and only if it is /?-evenly quasi-convex. Hence, formula (3.111) yields the result. D
3.4 Lagrangian duality for convex supremization 3.4,1 Unperturbational theory Theorem 3.14. Let X be a locally convex space, / : X ^- R a function, and G a subset of X. Then s u p / ( G ) > sup inf{/(j)-cD(y)-hsupO(G)}.
(3.133)
Moreover, if f is a proper lower semicontinuous convex function, then sup / ( G ) = sup inf [f{y) - (D(y) -h sup cD(G)}.
(3.134)
Proof Since G is nonempty, we have sup 0(G) > —oo (O e X*). Let O € X* and J G /?, J < sup 0(G). Then there exists ^' = g'^j e G such that O(g0 > ^. Consequently, sup / ( G ) > /(g^) > fig') - cD(g^) + J > inf {/(j) - 0(};) + J}, whence, since O e X* and J < sup 0(G) were arbitrary, we obtain (3.133). On the other hand, if / is a proper lower semicontinuous convex function, then by (1.99), we have fig) = sup {inf[/(j) - 0 ( j ) ] +
ig e G).
Hence by (3.133) and (3.135), we obtain (3.134).
(3.135) D
Remark 3.15. (a) If (3.16) holds, then for any O e X*\{0} we have sup/(G) >
inf
fiy)>
yeX a)(>0=sup(G)
>
inf
inf
fiy)
yeX 4>(v)>sup(G)
{/(^)-(D(^)} + supa)(G)
yeX <^(y)>sup
> inf [f(y)-^(y)} yeX
+sup ^{G),
(3.136)
128
3. Duality for Quasi-convex Supremization
whence, by Remark 3.1(b), sup/(G) > sup
inf
f(y)>
sup
(t>{y)=sup(G)
inf
f(y)
(p(y)>sup
> s u p [ i n f [ / ( ^ ) - 0 ( > ; ) ] + supO(G)j,
(3.137)
which implies some relations between Lagrangian duality (3.134) and hyperplane and half-space theorems of surrogate duality (3.6) and (3.77) (for example, in this case the Lagrangian duality equality (3.134) implies the surrogate duality equalities (3.6) and 0.11)). (b) If G is bounded, then by the "substitution method" described in Remark 1.26 (a), combining Theorem 3.1 (c) and formula (1.283), with 4>o, d^ replaced by O and sup 0(G) respectively (which is a Lagrangian duality formula for the infimum of / on a hyperplane), we obtain sup/(G) =
sup
maxinf {/(>;)-6>
(3.138)
from which one can deduce again the equality (3.134); however, the above direct method of proof is simpler. (c) In the particular case that X is a normed linear space and / is the finite continuous convex function (3.123), from (3.134) we obtain the following formula of Lagrangian duality for the deviation of a set G from JCQI sup \\g - xoW = g^G
sup
eX*\{0}
inf {||xo - y\\ - <^{y) + sup 0(G)}.
(3.139)
y^^
Let us show that (3.139) impUes Corollary 2.1, whence also Theorem 2.1 (by Remark 2.2). Indeed, by (3.139), we have sup ll^-xoll > i n f { | | x o - j | | + ^ ( x o - y ) - c D ( x o ) + s u p 0 ( G ) } geG
y^^
> -0(jco) + supO(G)
(O eZMlcDII = 1),
whence s u p | | g - x o | | > sup {-cD(xo) + supcD(G)}. geG
(3.140)
\m=\ In order to prove the opposite inequality, let g e G and e > 0. Choose O^ e X* with II ^ i = 1 such that
>
\\g-xo\\-£,
whence, since geG
and s > 0 were arbitrary, we obtain sup {-O(xo) + sup 0(G)} > sup \\g - xoll ^eX*
geG
\\n=\ which, together with (3.140), yields the equality (2.14).
3.4 Lagrangian duality for convex supremization
129
3.4.2 Perturbational theory

In this section we shall develop a perturbational theory of Lagrangian duality for convex supremization, by suitably modifying the one for quasi-convex infimization (see Chapter 1, Section 1.4.2). Assume that we are given a constrained primal supremization problem

α = sup f(G),    (3.141)

where G is a subset of a locally convex space X and f: X → R̄ is a function. Clearly,

sup f(G) = sup f̄(X),    (3.142)

where f̄: X → R̄ is the function defined by

f̄(x) := f(x) if x ∈ G,  −∞ if x ∈ ∁G.

Thus, problem (3.141) and the primal problem

(P̄)    α = sup f̄(X)    (3.143)

have the same value. Moreover, if f|_G ≢ −∞, which we shall assume in the sequel without any special mention, then problems (3.141) and (P̄) have the same optimal solutions; indeed, if g₀ ∈ G and f(g₀) = sup f(G), then f̄(g₀) = f(g₀) − χ_G(g₀) = sup f(G) = sup f̄(X) (where χ_G denotes the indicator function of G, equal to 0 on G and +∞ on ∁G), and conversely, if x₀ ∈ X and f̄(x₀) = sup f̄(X), then f̄(x₀) = sup f(G) > −∞, whence x₀ ∈ G and f(x₀) − χ_G(x₀) = f̄(x₀) = sup f̄(X), so f(x₀) = sup f(G). Therefore, we shall assume from the beginning that we are given an unconstrained primal supremization problem

(P)    α = sup φ(X),    (3.144)

and then, taking in particular φ = f − χ_G and a suitable perturbation function p, the duality theory for (P) of (3.144) will yield a duality theory for (3.141). We shall define a dual problem to the primal supremization problem (P) of (3.144) by embedding it into a family of "perturbed" supremization problems, as follows. Let Z be a locally convex space (called a space of "perturbations" or of "parameters"), and p: X × Z → R̄ a function (called a "perturbation function") such that

p(x, 0) = φ(x)    (x ∈ X),    (3.145)

so (P) of (3.144) is nothing other than

α = sup_{x∈X} p(x, 0);    (3.146)

thus, (P) is embedded into the family of supremization problems
(P_z)    v(z) := sup_{x∈X} p(x, z)    (z ∈ Z).    (3.147)

Let us define the Lagrangian dual problem associated with the perturbation function p as the unconstrained supremization problem

(D)    β := sup λ(Z*),    (3.148)

where λ: Z* → R̄ is the dual objective function defined by

λ(Ψ) := sup_{x∈X} {inf_{z∈Z} {p(x, z) − Ψ(z)}}    (Ψ ∈ Z*).    (3.149)

The function L: X × Z* → R̄ defined by

L(x, Ψ) := inf_{z∈Z} {p(x, z) − Ψ(z)}    (x ∈ X, Ψ ∈ Z*)    (3.150)

is called the Lagrangian function, or simply the Lagrangian, associated with p; note that this is the same as (1.396). Thus, considering the partial functions

p_x(z) := p(x, z)    (x ∈ X, z ∈ Z),    (3.151)

we have

L(x, Ψ) = inf_{z∈Z} {p_x(z) − Ψ(z)} = −p_x*(Ψ)    (x ∈ X, Ψ ∈ Z*).    (3.152)

By (3.145), (3.151), and (3.152),

φ(x) = p_x(0) ≥ p_x**(0) = sup_{Ψ∈Z*} {Ψ(0) + L(x, Ψ)} = sup_{Ψ∈Z*} L(x, Ψ)    (x ∈ X).    (3.153)

Furthermore, by (3.149) and (3.150),

λ(Ψ) = sup_{x∈X} L(x, Ψ)    (Ψ ∈ Z*),    (3.154)

and hence by (3.148),

β = sup_{Ψ∈Z*} sup_{x∈X} L(x, Ψ).    (3.155)

Thus, by (3.153) and (3.155),

α = sup φ(X) ≥ sup_{x∈X} sup_{Ψ∈Z*} L(x, Ψ) = β.    (3.156)

If, in addition, p_x(0) = p_x**(0) for all x ∈ X, then

φ(x) = p_x(0) = p_x**(0) = sup_{Ψ∈Z*} {Ψ(0) + L(x, Ψ)} = sup_{Ψ∈Z*} L(x, Ψ)    (x ∈ X),    (3.157)

and thus in this case

α = sup φ(X) = sup_{x∈X} sup_{Ψ∈Z*} L(x, Ψ) = β.    (3.158)
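To make the chain (3.153)–(3.158) concrete, here is a minimal 1-D numeric sketch of the perturbational scheme. Everything in it is an assumption chosen for illustration: X = Z = R, Ψ(z) = ψz, and the perturbation p(x, z) = φ(x) + z², for which each partial function p_x is convex and finite, so p_x(0) = p_x**(0) and the equality case (3.158) applies.

```python
# 1-D sketch of (3.145)-(3.158); grids stand in for X, Z, and Z*.
phi = lambda x: -(x - 1.0) ** 2                  # alpha = sup phi(X) = 0, at x = 1

xs = [k / 10 - 5 for k in range(101)]            # grid standing in for X
psis = [k / 10 - 5 for k in range(101)]          # grid standing in for Z*
zs = [k / 20 - 5 for k in range(201)]            # grid standing in for Z

def L(x, psi):
    # Lagrangian (3.150): L(x, psi) = inf_z {p(x, z) - psi*z} = phi(x) - psi^2/4.
    return min(phi(x) + z * z - psi * z for z in zs)

alpha = max(phi(x) for x in xs)                    # primal value (3.144)
beta = max(L(x, psi) for x in xs for psi in psis)  # dual value (3.155)
print(alpha, beta)   # weak duality (3.156): beta <= alpha; here both equal 0
```

Since each p_x here is convex and continuous, the biconjugacy hypothesis of (3.157) holds and α = β, which the brute-force computation reproduces.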
Remark 3.16. For the constrained primal supremization problem (3.141), let Z = X and φ = f − χ_G. Then the perturbation function p: X × X → R̄ defined by

p(x, z) := f(x + z) − χ_G(x) = { f(x + z) if x ∈ G,  −∞ if x ∉ G }    (3.159)

satisfies (3.145), and the perturbational dual (3.148) yields the unperturbational dual of Section 3.4.1. Indeed, by (3.152) and (3.159), for any x ∈ X and Ψ ∈ X* we have

L(x, Ψ) = inf_{z∈X} {p(x, z) − Ψ(z)}    (3.160)
 = inf_{z∈X} {f(x + z) − Ψ(x + z) + Ψ(x)} − χ_G(x)
 = inf_{x'∈X} {f(x') − Ψ(x')} + Ψ(x) − χ_G(x),    (3.161)

whence, by (3.155), we obtain

β = sup_{Ψ∈X*} sup_{x∈X} L(x, Ψ) = sup_{Ψ∈X*} inf_{x'∈X} {f(x') − Ψ(x') + sup Ψ(G)},

which is nothing other than the right-hand side of (3.134).
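Formula (3.134) can be sanity-checked numerically in one dimension. The sketch below uses the hypothetical choices X = R (so Ψ(x) = ψx), f(x) = x², and G = [−1, 2], discretized on grids; it is only an illustrative finite approximation of the two sides of (3.134).

```python
# Check sup f(G) = max_{Psi} inf_x { f(x) - Psi(x) + sup Psi(G) } for a 1-D example.
f = lambda x: x * x
G = [-1 + 3 * k / 400 for k in range(401)]        # grid on G = [-1, 2]
X = [-10 + 20 * k / 4000 for k in range(4001)]    # grid on a box standing in for X

lhs = max(f(g) for g in G)                        # sup f(G) = 4

def dual_value(psi):
    sup_psi_G = max(psi * g for g in G)           # sup Psi(G)
    return min(f(x) - psi * x for x in X) + sup_psi_G

rhs = max(dual_value(-6 + 12 * k / 1200) for k in range(1201))
print(lhs, rhs)   # both close to 4; the dual max is attained near psi = 4
```

The dual supremum is attained (here near ψ = 4), consistent with the "max" in (3.134).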
3.5 Duality for quasi-convex supremization over structured primal constraint sets

The primal constraint set G considered in the preceding sections of this chapter has been an arbitrary subset of a locally convex space X. Now we shall study some more structured ways of expressing the primal constraint sets G ⊂ X. In the present section we shall consider one of these ways, namely that of systems, and (surrogate and Lagrangian) duality for supremization in systems. We recall (see Chapter 1) that a system is a triple (X, Z, u), consisting of two sets X, Z and a mapping u: X → Z. Given a system (X, Z, u), a subset T of Z (called the "target set"), and a function f: X → R̄, we shall consider the primal supremization problem

α = α_{u⁻¹(T),f} = sup_{x∈X, u(x)∈T} f(x).    (3.162)

Remark 3.17. (a) If u(X) ∩ T = ∅, then u⁻¹(T) = {x ∈ X | u(x) ∈ T} = ∅, whence α = sup ∅ = −∞. Therefore, in the sequel we shall assume, without any special mention, that

u(X) ∩ T ≠ ∅.    (3.163)
3. Duality for Quasi-convex Supremization
(b) Problem (3.162) is equivalent to problem (P¹) of (3.1). Indeed, given a system (X, Z, u) and T, f as above, problem (3.162) is nothing other than (3.1) with

G = {x ∈ X | u(x) ∈ T} = u⁻¹(T) (≠ ∅).    (3.164)

Conversely, every problem (3.1) can be written in the form (3.162), by taking Z = X, u = I_X, the identity operator on X (i.e., u(x) = x for all x ∈ X), and T = G. However, in the study of the "mathematical programming problem" (3.162) one can also use the properties of T and u.

Now we shall assume that (X, Z, u) is a system in which X and Z are locally convex spaces, with conjugate spaces X* and Z*, T is a subset of Z, and f: X → R̄ is a function. There are several natural ways to introduce unconstrained dual problems to (3.162), which generalize the dual problems of the preceding sections.

(1) Let

W := u*(Z*)\{0} = {Ψu | Ψ ∈ Z*}\{0} (⊂ X*\{0}),    (3.165)

where u* is the adjoint operator of u (that is, u*(Ψ)(x) = Ψu(x) for all x ∈ X, Ψ ∈ Z*), and let Δ¹_{u,T}: 2^X → 2^{u*(Z*)\{0}} be the polarity defined by

Δ¹_{u,T}(C) := {u*(Ψ) ∈ u*(Z*) | u*(Ψ)(c) < sup u*(Ψ)(u⁻¹(T)) (c ∈ C)} ∩ (u*(Z*)\{0}) = Δ¹_{u⁻¹(T)}(C) ∩ (u*(Z*)\{0})    (C ⊂ X),    (3.166)

where Δ¹_{u⁻¹(T)}: 2^X → 2^{X*\{0}} is the polarity (1.154) (with G = u⁻¹(T)). Note that since

Ψu(u⁻¹(T)) = {Ψu(x) | u(x) ∈ T} = Ψ(u(X) ∩ T)    (Ψ ∈ Z*),    (3.167)

we have, for any set C ⊂ X,

Δ¹_{u,T}(C) = {Ψu | Ψ ∈ Z*, Ψ(u(c)) < sup Ψ(u(X) ∩ T) (c ∈ C)}.    (3.168)

Clearly, for the particular case X = Z, u = I_X (the identity operator on X), W = X*\{0}, and T = G, the polarity Δ¹_{u,T} of (3.168) reduces to Δ¹_G of (1.154). In the converse direction, given any (X, Z, u) and T as above, by (3.166) we have Δ¹_{u,T}(C) ⊂ Δ¹_{u⁻¹(T)}(C) (C ⊂ X). For the polarity Δ = Δ¹_{u,T} of (3.168), the dual objective function (3.41) and the dual value (3.42), respectively, become

λ¹_{u,T}(Ψu) = inf_{x∈X, Ψu(x)≥sup Ψ(u(X)∩T)} f(x)    (Ψ ∈ Z*, Ψu ≠ 0),    (3.169)

β¹_{u,T} = sup_{Ψ∈Z*, Ψu≠0} inf_{x∈X, Ψu(x)≥sup Ψ(u(X)∩T)} f(x).    (3.170)

Hence, by Remark 3.4 (b) (with W of (3.165)), we obtain the following generalization of Theorem 3.7 (c):
Theorem 3.15. Let (X, Z, u) be a system in which X and Z are locally convex spaces, let T be a subset of Z, and let f: X → R̄ be a function. If we have

inf_{x∈X, u(x)∈T} f(x) ≤ sup_{Ψ∈Z*, Ψu≠0} inf_{x∈X, Ψu(x)≥sup Ψ(u(X)∩T)} f(x),    (3.171)

then for each d < sup_{x∈X, u(x)∈T} f(x) there exists Ψ = Ψ_d ∈ Z* with Ψ_d u ≠ 0 such that

Ψu(y) < sup Ψ(u(X) ∩ T)    (y ∈ S_d(f))    (3.172)

if and only if

sup_{x∈X, u(x)∈T} f(x) = sup_{Ψ∈Z*, Ψu≠0} inf_{x∈X, Ψu(x)≥sup Ψ(u(X)∩T)} f(x).    (3.173)

One can define, similarly, polarities Δⁱ_{u,T}: 2^X → 2^{u*(Z*)\{0}} (i = 2, 3, 4) by

Δ²_{u,T}(C) := {Ψu ≠ 0 | Ψ ∈ Z*, sup Ψu(C) ≤ sup Ψ(u(X) ∩ T)}    (C ⊂ X),    (3.174)

Δ³_{u,T}(C) := {Ψu ≠ 0 | Ψ ∈ Z*, sup Ψ(u(X) ∩ T) ∉ Ψu(C)}    (C ⊂ X),    (3.175)

Δ⁴_{u,T}(C) := {Ψu ≠ 0 | Ψ ∈ Z*, Ψu(C) ⊂ Ψ(u(X) ∩ T)}    (C ⊂ X),    (3.176)
and one can obtain for them results corresponding to those of Section 3.2.

(2) Instead of Δ¹_{u,T}: 2^X → 2^{u*(Z*)\{0}} of (3.168), let us consider the polarity Δ'¹_{u,T}: 2^X → 2^{Z*\{0}} defined by

Δ'¹_{u,T}(C) := {Ψ ∈ Z*\{0} | Ψu(c) < sup Ψ(T) (c ∈ C)}    (C ⊂ X);    (3.177)

thus, the only difference between (3.168) and (3.177) is that sup Ψ(u(X) ∩ T) is replaced by sup Ψ(T). For Δ = Δ'¹_{u,T}, the dual objective function (3.41) and the dual value (3.42) become

λ'¹_{u,T}(Ψ) = inf_{x∈X, Ψu(x)≥sup Ψ(T)} f(x)    (Ψ ∈ Z*\{0}),    (3.178)

β'¹_{u,T} = sup_{Ψ∈Z*\{0}} inf_{x∈X, Ψu(x)≥sup Ψ(T)} f(x).    (3.179)

Again, Δ¹_G of (1.154) is the particular case X = Z, u = I_X (the identity operator on X), and T = G of the polarity Δ'¹_{u,T} of (3.177), but the converse direction no longer works, since we have only sup Ψ(u(X) ∩ T) ≤ sup Ψ(T), whence

β¹_{u,T} ≤ β'¹_{u,T},    (3.180)
with possibly strict inequality. Note that given a mapping u: X → Z, the family of polarities Δ¹_{u,T} depends on the subsets u⁻¹(T) of X, while the family of polarities Δ'¹_{u,T} depends on the subsets T of Z, so one needs some care when generalizing the expression Δ_{{g}}({g}) of (3.59) to the family Δ'¹_{u,T}. In order to obtain duality theorems using the polarities Δ'¹_{u,T} of (3.177), let us first give the following generalization of Theorem 3.4:

Theorem 3.16. Let (X, Z, u) be a system (so X and Z are two sets and u: X → Z is a mapping), T a subset of Z (satisfying (3.163)), W a set, f: X → R̄ a function, and Δ_{u,T}: 2^X → 2^W (T ⊂ Z) a family of polarities such that

Δ_{u,{u(x)}}({x}) = ∅    (x ∈ u⁻¹(T)),    (3.181)

inf f(∁Δ'_{u,T}({w})) ≤ sup_{x∈X, u(x)∈T} inf f(∁Δ'_{u,{u(x)}}({w}))    (w ∈ W).    (3.182)

Then, given T ⊂ Z, we have

sup_{x∈X, u(x)∈T} f(x) ≥ β'_{u,T} (= sup_{w∈W} inf f(∁Δ'_{u,T}({w}))).    (3.183)

Moreover, if we have (3.181), (3.182), and f is Δ'_{u,T}Δ_{u,T}-quasi-convex, then

sup_{x∈X, u(x)∈T} f(x) = β'_{u,T}.    (3.184)
Proof. By (3.181) and Lemma 3.5 applied to Δ = Δ_{u,{u(x)}}, we have

x ∈ ∁Δ'_{u,{u(x)}}({w})    (x ∈ u⁻¹(T), w ∈ W),

whence

inf f(∁Δ'_{u,{u(x)}}({w})) ≤ f(x)    (x ∈ u⁻¹(T), w ∈ W).

Therefore, by (3.182),

inf f(∁Δ'_{u,T}({w})) ≤ sup_{x∈X, u(x)∈T} inf f(∁Δ'_{u,{u(x)}}({w})) ≤ sup_{x∈X, u(x)∈T} f(x)    (w ∈ W),

and hence by (3.42) (with Δ = Δ_{u,T}), we obtain (3.183). Furthermore, if also f is Δ'_{u,T}Δ_{u,T}-quasi-convex, then by (3.111) (applied to Δ = Δ_{u,T}), we have

sup_{x∈X, u(x)∈T} f(x) = sup_{w∈∁Δ_{u,T}(u⁻¹(T))} inf f(∁Δ'_{u,T}({w})) ≤ sup_{w∈W} inf f(∁Δ'_{u,T}({w})) = β'_{u,T},

whence by (3.183), we obtain (3.184). ∎

Note that the family of polarities Δ'¹_{u,T} of (3.177) obviously satisfies (3.181). Hence, applying Theorems 3.16 (with W = Z*\{0}) and 3.5 to this family, we obtain the following generalization of Theorem 3.7:
Theorem 3.17. Let (X, Z, u) be a system in which X and Z are locally convex spaces, let T be a subset of Z, and let f: X → R̄ be a function.

(a) If

inf_{y∈X, Ψu(y)≥sup Ψ(T)} f(y) ≤ sup_{x∈X, u(x)∈T} inf_{y∈X, Ψu(y)≥Ψu(x)} f(y)    (Ψ ∈ Z*\{0}),    (3.185)

then

sup_{x∈X, u(x)∈T} f(x) ≥ sup_{Ψ∈Z*\{0}} inf_{x∈X, Ψu(x)≥sup Ψ(T)} f(x).    (3.186)

(b) The inequality

sup_{x∈X, u(x)∈T} f(x) ≤ sup_{Ψ∈Z*\{0}} inf_{x∈X, Ψu(x)≥sup Ψ(T)} f(x)    (3.187)

holds if and only if for each d < sup_{x∈X, u(x)∈T} f(x) there exists Ψ = Ψ_d ∈ Z*\{0} such that

Ψ(u(y)) < sup Ψ(T)    (y ∈ S_d(f)).    (3.188)

(c) If we have (3.185), then for each d < sup_{x∈X, u(x)∈T} f(x) there exists Ψ = Ψ_d ∈ Z*\{0} satisfying (3.188) if and only if

sup_{x∈X, u(x)∈T} f(x) = sup_{Ψ∈Z*\{0}} inf_{x∈X, Ψu(x)≥sup Ψ(T)} f(x).    (3.189)
Similarly, one can consider the polarities Δ'ⁱ_{u,T}: 2^X → 2^{Z*\{0}} (i = 2, 3, 4) defined by

Δ'²_{u,T}(C) := {Ψ ∈ Z*\{0} | sup Ψu(C) ≤ sup Ψ(T)}    (C ⊂ X),    (3.190)

Δ'³_{u,T}(C) := {Ψ ∈ Z*\{0} | sup Ψ(T) ∉ Ψu(C)}    (C ⊂ X),    (3.191)

Δ'⁴_{u,T}(C) := {Ψ ∈ Z*\{0} | Ψu(C) ⊂ Ψ(T)}    (C ⊂ X),    (3.192)

which are generalizations of the polarities (1.160), (1.166), and (1.182), respectively, and one can prove for them corresponding duality results.

Remark 3.18. Concerning Lagrangian duality for the primal supremization problem (3.162), where (X, Z, u) is a system, T is a subset of Z, and f: X → R̄ is a function, we make here only the following observation, without entering into details: similarly to the way formula (1.268) for infimization is extended to the Lagrangian duality formula (1.433), the natural extension to systems of formula (3.134) for supremization should be

sup_{x∈X, u(x)∈T} f(x) = max_{Ψ∈Z*} inf_{x∈X} {f(x) − Ψ(u(x)) + sup Ψ(T)}.    (3.193)
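The conjectured extension (3.193) can be probed numerically in one dimension. The sketch below uses hypothetical choices — X = Z = R, the linear mapping u(x) = 2x, the target set T = [0, 1], f(x) = x², and Ψ(t) = ψt — discretized on grids; it only illustrates that the two sides of (3.193) agree for this example.

```python
# 1-D check of sup_{u(x) in T} f(x) = max_{Psi} inf_x { f(x) - Psi(u(x)) + sup Psi(T) }.
f = lambda x: x * x
u = lambda x: 2.0 * x
T = [k / 200 for k in range(201)]                 # grid on T = [0, 1]
X = [k / 100 - 5 for k in range(1001)]            # grid standing in for X = R

lhs = max(f(x) for x in X if 0.0 <= u(x) <= 1.0)  # sup of f over u^{-1}(T) = [0, 1/2]

def dual(psi):
    return min(f(x) - psi * u(x) for x in X) + max(psi * t for t in T)

rhs = max(dual(k / 100 - 3) for k in range(601))
print(lhs, rhs)   # both close to 1/4; the dual max is attained near psi = 1/2
```

In this example the dual maximum is attained (near ψ = 1/2), consistent with the "max" in (3.193).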
Optimal Solutions for Quasi-convex Maximization

4.1 Maximum points of quasi-convex functions

Let X be a locally convex space, f: X → R̄ a function, G ⊂ X, and g₀ ∈ G. Clearly, if f(g₀) = +∞, then g₀ is an optimal solution of the primal supremization problem (P¹) (of (3.1)), i.e., f(g₀) = max f(G), and if f(g₀) = −∞ and f|_G ≢ −∞, then g₀ is not a maximum point of f on G. Therefore, the cases of interest are those in which

f(g₀) ∈ R.    (4.1)

Remark 4.1. From (1.22) it is obvious that g₀ ∈ G is an optimal solution of (P¹) if and only if

G ⊂ S_{f(g₀)}(f).    (4.2)
Theorem 4.1. Let X be a locally convex space, W a set, Δ: 2^X → 2^W a polarity, f a Δ'Δ-quasi-convex function, and G ⊂ X. For an element g₀ ∈ G the following statements are equivalent:
1°. f(g₀) = max f(G).
2°. We have

Δ(S_{f(g₀)}(f)) ⊂ Δ(G).    (4.3)

Proof. 1° ⟹ 2°. By Remark 4.1, if f(g₀) = max f(G), then for any set W and any polarity Δ: 2^X → 2^W we have (4.3) (since Δ is antitone).
2° ⟹ 1°. Since f is Δ'Δ-quasi-convex, we have Δ'Δ(S_{f(g₀)}(f)) = S_{f(g₀)}(f). Hence if 2° holds, then by (4.3) and since Δ' is antitone, we obtain G ⊂ Δ'Δ(G) ⊂ Δ'Δ(S_{f(g₀)}(f)) = S_{f(g₀)}(f), and thus by Remark 4.1, f(g₀) = max f(G). ∎

Corollary 4.1. Let X be a locally convex space, f: X → R̄ a lower semicontinuous quasi-convex function, and G ⊂ X. For an element g₀ ∈ G, the following statements are equivalent:
1°. f(g₀) = max f(G).
2°. We have (where Δ¹¹: 2^X → 2^{(X*\{0})×R} is the polarity (1.189))

Δ¹¹(S_{f(g₀)}(f)) ⊂ Δ¹¹(G).    (4.4)

Proof. By (1.190), f is lower semicontinuous quasi-convex if and only if it is (Δ¹¹)'Δ¹¹-quasi-convex, so the result follows from Theorem 4.1 applied to W = (X*\{0}) × R and Δ = Δ¹¹. ∎

Corollary 4.2. Let X be a locally convex space, f: X → R̄ an evenly quasi-convex function, and G ⊂ X. For an element g₀ ∈ G, the following statements are equivalent:
1°. f(g₀) = max f(G).
2°. We have (where Δ¹²: 2^X → 2^{(X*\{0})×R} is the polarity (1.191))

Δ¹²(S_{f(g₀)}(f)) ⊂ Δ¹²(G).    (4.5)

Proof. By (1.192), f is evenly quasi-convex if and only if it is (Δ¹²)'Δ¹²-quasi-convex, so the result follows from Theorem 4.1 applied to W = (X*\{0}) × R and Δ = Δ¹². ∎

Corollary 4.3. Let X be a locally convex space, f: X → R̄ a lower semicontinuous quasi-convex function, and G ⊂ X. For an element g₀ ∈ G with f(0) < f(g₀), the following statements are equivalent:
1°. f(g₀) = max f(G).
2°. We have (where Δ⁰¹: 2^X → 2^{X*\{0}} is the polarity (1.196))

Δ⁰¹(S_{f(g₀)}(f)) ⊂ Δ⁰¹(G).    (4.6)

Proof. 1° ⟹ 2°, by Remark 4.1 and since Δ⁰¹ is antitone.
2° ⟹ 1°. By f(0) < f(g₀), we have 0 ∈ S_{f(g₀)}(f). Hence if (4.6) holds, then since (Δ⁰¹)' is antitone, we obtain, by (1.197) and since f is a lower semicontinuous quasi-convex function,

G ⊂ (Δ⁰¹)'Δ⁰¹(G) ⊂ (Δ⁰¹)'Δ⁰¹(S_{f(g₀)}(f)) = c̄o S_{f(g₀)}(f) = S_{f(g₀)}(f). ∎

Remark 4.2. Condition (4.6) can also be written as S_{f(g₀)}(f)° ⊂ G°, an inclusion between the usual polar sets (1.82).
Corollary 4.4. Let X be a locally convex space, f: X → R̄ an evenly quasi-convex function, and G ⊂ X. For an element g₀ ∈ G with f(0) < f(g₀), the following statements are equivalent:
1°. f(g₀) = max f(G).
2°. We have (where Δ⁰²: 2^X → 2^{X*\{0}} is the polarity (1.199))

Δ⁰²(S_{f(g₀)}(f)) ⊂ Δ⁰²(G).    (4.7)

Proof. The proof is similar to the above proof of Corollary 4.3, using now (1.200). ∎

Now we shall give some subdifferential characterizations of maximum points. To this end, let us first introduce the following class of abstract quasi-convex functions:

Definition 4.1. Let X be a locally convex space. We shall say that a function f: X → R̄ is strongly evenly quasi-convex if all the strict level sets A_d(f) (d ∈ R) of (1.23) are evenly convex.

Remark 4.3. (a) Every strongly evenly quasi-convex function f: X → R̄ is evenly quasi-convex, since S_d(f) = ∩_{d'>d} A_{d'}(f) (d ∈ R) and since the family of all evenly convex sets is closed under intersections.
(b) Every upper semicontinuous quasi-convex function f: X → R̄ is strongly evenly quasi-convex (since each A_d(f) is open and convex, and hence evenly convex).

Proposition 4.1. Let X be a locally convex space, f: X → R̄ a strongly evenly quasi-convex function, and x₀ ∈ X such that f(x₀) ∈ R. Then

∂^{L(Δ¹²)}f(x₀) ≠ ∅,    (4.8)

where Δ¹²: 2^X → 2^{(X*\{0})×R} is the polarity (1.191).

Proof. Since f is strongly evenly quasi-convex, the set A_{f(x₀)}(f) is evenly convex. Hence, since x₀ ∉ A_{f(x₀)}(f), there exists Φ₀ ∈ X*\{0} such that

Φ₀(x) < Φ₀(x₀)    (x ∈ A_{f(x₀)}(f)).    (4.9)

Therefore, we have f(x) ≥ f(x₀) for all x ∈ X with Φ₀(x) ≥ Φ₀(x₀), so

f(x₀) = min_{x∈X, Φ₀(x₀)≤Φ₀(x)} f(x) = −f^{L(Δ¹²)}(Φ₀, Φ₀(x₀)),

and thus, by (1.232), (Φ₀, Φ₀(x₀)) ∈ ∂^{L(Δ¹²)}f(x₀). ∎
Theorem 4.2. Let X be a locally convex space, f: X → R̄ an upper semicontinuous quasi-convex function, and G ⊂ X. For an element g₀ ∈ G with f(g₀) ∈ R, the following statements are equivalent:
1°. f(g₀) = max f(G).
2°. We have ∅ ≠ ∂^{L(Δ¹²)}f(g₀) ⊂ [(X*\{0}) × R]\Δ¹²(G), and each (Φ, d) ∈ ∂^{L(Δ¹²)}f(g₀) is an optimal solution of the dual problem (D_{Δ¹²}) (of (3.114) for Δ = Δ¹²), i.e.,

f^{L(Δ¹²)}(Φ, d) = min f^{L(Δ¹²)}([(X*\{0}) × R]\Δ¹²(G))    ((Φ, d) ∈ ∂^{L(Δ¹²)}f(g₀)).    (4.10)

3°. There exists (Φ₀, d₀) ∈ ∂^{L(Δ¹²)}f(g₀) that is an optimal solution of the dual problem (D_{Δ¹²}), i.e., such that

f^{L(Δ¹²)}(Φ₀, d₀) = min f^{L(Δ¹²)}([(X*\{0}) × R]\Δ¹²(G)).    (4.11)

Proof. Observe that since f is strongly evenly quasi-convex (by Remark 4.3 (b)) and f(g₀) ∈ R, we have ∂^{L(Δ¹²)}f(g₀) ≠ ∅ (by Proposition 4.1).

1° ⟹ 2°. If 1° holds, then for any polarity Δ: 2^X → 2^{(X*\{0})×R} such that every upper semicontinuous quasi-convex function is Δ'Δ-quasi-convex and ∂^{L(Δ)}f(g₀) ≠ ∅ (hence, in particular, for Δ = Δ¹²) and any (Φ, d) ∈ ∂^{L(Δ)}f(g₀), we have

∂^{L(Δ)}f(g₀) ⊂ [(X*\{0}) × R]\Δ(G);    (4.12)

also, by (Φ, d) ∈ ∂^{L(Δ)}f(g₀), 1°, Theorem 3.11, and (4.12),

f^{L(Δ)}(Φ, d) = −f(g₀) = −max f(G) = min f^{L(Δ)}([(X*\{0}) × R]\Δ(G)).

The implication 2° ⟹ 3° is obvious.

3° ⟹ 1°. If 3° holds, even with Δ¹² replaced by any polarity Δ: 2^X → 2^{(X*\{0})×R} such that every upper semicontinuous quasi-convex function is Δ'Δ-quasi-convex, we have, by (Φ₀, d₀) ∈ ∂^{L(Δ)}f(g₀) ⊂ [(X*\{0}) × R]\Δ(G), (1.232), (4.11) (for Δ), Theorem 3.11, and g₀ ∈ G,

f(g₀) = −f^{L(Δ)}(Φ₀, d₀) = −min f^{L(Δ)}([(X*\{0}) × R]\Δ(G)) = max f(G). ∎
Proposition 4.2. Let X be a locally convex space, f: X → R̄ a strongly evenly quasi-convex function, and x₀ ∈ X such that

f(0) < f(x₀) < +∞.    (4.13)

Then

∂^{L(Δ⁰²)}f(x₀) ≠ ∅,    (4.14)

where Δ⁰² is the polarity (1.199).
Proof. By the above proof of Proposition 4.1, there exists Φ₀ ∈ X*\{0} satisfying (4.9), whence f(x) ≥ f(x₀) for all x ∈ X with Φ₀(x) ≥ Φ₀(x₀). But by (4.13) we have 0 ∈ A_{f(x₀)}(f), whence by (4.9), 0 = Φ₀(0) < Φ₀(x₀). Consequently,

f(x₀) = min_{x∈X, Φ₀(x)≥Φ₀(x₀)} f(x) = min_{x∈X, (1/Φ₀(x₀))Φ₀(x)≥1} f(x) = −f^{L(Δ⁰²)}((1/Φ₀(x₀))Φ₀),

and thus (1/Φ₀(x₀))Φ₀ ∈ ∂^{L(Δ⁰²)}f(x₀). ∎
Theorem 4.3. Let X be a locally convex space, f an upper semicontinuous quasi-convex function satisfying (1.195), and G ⊂ X such that f(0) < sup f(G). For an element g₀ ∈ G with f(g₀) ∈ R, the following statements are equivalent:
1°. f(g₀) = max f(G).
2°. We have ∅ ≠ ∂^{L(Δ⁰²)}f(g₀) ⊂ (X*\{0})\Δ⁰²(G), and each Φ ∈ ∂^{L(Δ⁰²)}f(g₀) is an optimal solution of the dual problem (D_{Δ⁰²}) (of (3.114) for Δ = Δ⁰²), i.e.,

f^{L(Δ⁰²)}(Φ) = min f^{L(Δ⁰²)}((X*\{0})\Δ⁰²(G))    (Φ ∈ ∂^{L(Δ⁰²)}f(g₀)).    (4.15)

3°. There exists Φ₀ ∈ ∂^{L(Δ⁰²)}f(g₀) that is an optimal solution of the dual problem (D_{Δ⁰²}), i.e., such that

f^{L(Δ⁰²)}(Φ₀) = min f^{L(Δ⁰²)}((X*\{0})\Δ⁰²(G)).    (4.16)
Proof. If 1° holds, then by our assumptions, f(0) < sup f(G) = f(g₀). Hence, since f is strongly evenly quasi-convex (by Remark 4.3 (b)) and f(g₀) ∈ R, we have ∂^{L(Δ⁰²)}f(g₀) ≠ ∅ (by Proposition 4.2). The remainder of the proof is similar to that of the above proof of Theorem 4.2, replacing (Φ, d), (Φ₀, d₀) ∈ (X*\{0}) × R, and Δ¹² by Φ, Φ₀ ∈ X*\{0}, and Δ⁰², respectively, and using Theorem 3.13. ∎

Remark 4.4. (a) If X is a locally convex space, Δ⁰²: 2^X → 2^{X*\{0}} is the polarity (1.199), f ∈ R̄^X, and x₀ ∈ X with f(x₀) ∈ R, then by (1.232) applied to Δ = Δ⁰² we have

∂^{L(Δ⁰²)}f(x₀) = {Φ₀ ∈ X*\{0} | Φ₀(x₀) ≥ 1, f(x₀) = −f^{L(Δ⁰²)}(Φ₀)}.    (4.17)

(b) In the particular case X = Rⁿ, Thach ([272], Definition 2.2 and the remarks made after it) has introduced a similar subdifferential, namely

∂^H f(x₀) = {Φ₀ ∈ X*\{0} | Φ₀(x₀) = 1, f(x₀) = −f^H(Φ₀)},    (4.18)

where f^H is the "quasi-conjugate" of f defined [272] by

f^H(Φ) := −inf_{x∈X, Φ(x)≥1} f(x)  if Φ ∈ X*\{0},   f^H(0) := −sup f(X);    (4.19)

thus, in fact,
∂^H f(x₀) = {Φ₀ ∈ X*\{0} | Φ₀(x₀) = 1, f(x₀) = −f^{L(Δ⁰²)}(Φ₀)}.    (4.20)
For this subdifferential, Thach has proved some results corresponding to Theorems 4.2 and 4.3 above (see [272], Theorems 2.6, 6.1, and Corollary 6.1 (ii)). If X is a locally convex space, then clearly, for any function f: X → R̄ we have ∂^H f(x₀) ⊂ ∂^{L(Δ⁰²)}f(x₀). Let us observe that if f: X → R̄ is "strictly increasing along segments starting from 0" (i.e., for each x ∈ X\{0} and 0 < η < 1 we have f(ηx) < f(x)), then ∂^{L(Δ⁰²)}f(x₀) = ∂^H f(x₀). Indeed, if Φ₀ ∈ ∂^{L(Δ⁰²)}f(x₀) and Φ₀(x₀) > 1, then 0 < 1/Φ₀(x₀) < 1, whence by (1.232) (for Δ = Δ⁰²), we obtain

f((1/Φ₀(x₀))x₀) < f(x₀) = min_{x∈X, Φ₀(x)≥1} f(x) ≤ f((1/Φ₀(x₀))x₀),
which is impossible. Therefore, Φ₀(x₀) = 1, so Φ₀ ∈ ∂^H f(x₀) (since by Φ₀ ≠ 0 we have f^H(Φ₀) = f^{L(Δ⁰²)}(Φ₀)), which proves our assertion. For example, if f: X → R̄ is "strongly quasi-convex" (i.e., for each x, y ∈ X with x ≠ y and each 0 < η < 1 we have f(ηx + (1 − η)y) < max {f(x), f(y)}) and if f(0) = min f(X) (in particular, if f satisfies (1.195)), then f is strictly increasing along segments starting from 0; indeed, for any x ∈ X\{0} and 0 < η < 1 we have f(ηx) = f(ηx + (1 − η)0) < max {f(x), f(0)} = f(x). Also, the function f of (3.129) on a normed linear space X is strictly increasing along segments starting from 0. Hence, in these cases, ∂^{L(Δ⁰²)}f(x₀) = ∂^H f(x₀). Another such case is given in (c) below.

(c) In the particular case that X is a normed linear space, x̄₀ ∈ X, and f is the function

f(y) = ||x̄₀ − y||    (y ∈ X),    (4.21)
for each x₀ ∈ X\{x̄₀} we have

∂^{L(Δ⁰²)}f(x₀) = {Φ₀ ∈ X* | Φ₀(x₀) = 1, (1 − Φ₀(x̄₀))/||Φ₀|| = ||x̄₀ − x₀||}.    (4.22)

Indeed, if x₀ ≠ x̄₀, then by (1.232) (for Δ = Δ⁰²) and Corollary 1.4, we have

∂^{L(Δ⁰²)}f(x₀) = {Φ₀ ∈ X* | Φ₀(x₀) ≥ 1, ||x̄₀ − x₀|| = dist(x̄₀, {x ∈ X | Φ₀(x) ≥ 1})}
 = {Φ₀ ∈ X* | Φ₀(x₀) ≥ 1, ||x̄₀ − x₀|| = (1 − Φ₀(x̄₀))/||Φ₀||}.
Now let Φ₀ ∈ ∂^{L(Δ⁰²)}f(x₀). If we had Φ₀(x₀) > 1, then, since f of (4.21) is strictly increasing along segments starting from 0 (see (b) above), we would obtain, as in (b),

f((1/Φ₀(x₀))x₀) < f(x₀) = min_{x∈X, Φ₀(x)≥1} f(x) ≤ f((1/Φ₀(x₀))x₀),
which is impossible. Thus Φ₀(x₀) = 1, which proves (4.22). Note that in this case Proposition 4.2 asserts that if

||x̄₀|| < ||x̄₀ − x₀||,    (4.23)

then ∂^{L(Δ⁰²)}f(x₀) ≠ ∅. This can also be seen directly, as follows: since ||x̄₀ − x₀|| ≠ 0 (by (4.23)), there exists Φ'₀ ∈ X*\{0} such that

Φ'₀(x₀ − x̄₀) = ||Φ'₀|| ||x₀ − x̄₀||    (4.24)

(by a corollary of the Hahn–Banach theorem). We claim that Φ'₀(x₀) > 0. Indeed, if Φ'₀(x₀) ≤ 0, then by (4.24) and (4.23) we would obtain

||Φ'₀|| ||x̄₀ − x₀|| = Φ'₀(x₀ − x̄₀) ≤ Φ'₀(−x̄₀) ≤ ||Φ'₀|| ||x̄₀|| < ||Φ'₀|| ||x̄₀ − x₀||,

which is impossible. Thus Φ'₀(x₀) > 0. Hence, by (4.24), for Φ₀ := (1/Φ'₀(x₀))Φ'₀ we have Φ₀(x₀) = 1 and

(1 − Φ₀(x̄₀))/||Φ₀|| = (Φ'₀(x₀) − Φ'₀(x̄₀))/||Φ'₀|| = Φ'₀(x₀ − x̄₀)/||Φ'₀|| = ||x̄₀ − x₀||,
that is, Φ₀ ∈ ∂^{L(Δ⁰²)}f(x₀) of (4.22), which proves our assertion.

(d) As an application to approximation, let us now give another proof of Theorem 2.6, using Theorem 4.3. Let us denote x₀ of Theorem 2.6 by x̄, and assume first that x̄ = 0. Note that for x̄₀ = 0 and x₀ = g₀ ≠ 0, (4.22) becomes

∂^{L(Δ⁰²)}f(g₀) = {Φ₀ ∈ X* | Φ₀(g₀) = 1, ||Φ₀|| = 1/||g₀||}.

Then by Theorem 4.3, equivalence 1° ⟺ 3°, applied to f of (3.129) (which satisfies (1.195) and f(0) < sup f(G), since G ≠ {0}), and by (4.22) (for x̄₀ = 0, x₀ = g₀ ≠ 0), Corollary 1.4, and (1.223) (for Δ = Δ⁰²), we obtain that g₀ ∈ G satisfies ||g₀|| = max_{g∈G} ||g|| if and only if there exists Φ'₀ ∈ X*\{0} such that Φ'₀(g₀) = 1, ||g₀|| = 1/||Φ'₀||, and

−1/||Φ'₀|| = f^{L(Δ⁰²)}(Φ'₀) = −dist(0, {y ∈ X | Φ'₀(y) ≥ 1})
 = min_{Φ∈X*\{0}, ∃g∈G: Φ(g)≥1} f^{L(Δ⁰²)}(Φ) = min_{Φ∈X*\{0}, ∃g∈G: Φ(g)≥1} [−dist(0, {y ∈ X | Φ(y) ≥ 1})] = min_{Φ∈X*\{0}, ∃g∈G: Φ(g)≥1} (−1/||Φ||),

i.e., the equivalence 1° ⟺ 2° of Theorem 2.6 for x̄ = 0. If G is weakly compact and Φ' ∈ X*\{0}, then sup Φ'(G) is attained (see Lemma 1.3), so 2° ⟺ 3° of Theorem 2.6 for x̄ = 0. Finally, if x̄ ∈ X is arbitrary, then applying the above to the set G_{x̄} = G − x̄, we obtain the desired conclusion.
4.2 Maximum points of continuous convex functions

In this section we shall assume that X is a locally convex space, f: X → R̄ is a function, and G is a subset of X such that (1.73) holds. The assumption (1.73) is quite natural, since if the first inequality is not satisfied, then sup f(G) = inf f(X) ≤ f(g) for all g ∈ G, whence each g ∈ G is an optimal solution of (P¹) (i.e., a maximum point of f on G), and if the second inequality of (1.73) is not satisfied, then no g₀ ∈ G satisfying (4.1) can be a maximum point of f on G.

Theorem 4.4. Let X be a locally convex space, f: X → R̄ a function, and G a subset of X satisfying (1.73). For an element g₀ ∈ G, consider the following statements:
1°. f(g₀) = max f(G).
2°. There exists Φ₀ ∈ X*\{0} such that

Φ₀(g₀) = max_{y∈X, f(y)≤sup f(G)} Φ₀(y).    (4.25)

(a) If f is upper semicontinuous and convex, then 1° ⟹ 2°.
(b) If f is continuous and convex, then 1° ⟺ 2°.

Proof. Let

A := {y ∈ X | f(y) < sup f(G)},  S := {y ∈ X | f(y) ≤ sup f(G)}.    (4.26)

Then by (1.73), A ≠ ∅.

(a) Assume that f is upper semicontinuous and convex, so A is an open convex set. If 1° holds, then g₀ ∉ A. Hence, by the separation theorem, there exists Φ₀ ∈ X*\{0} such that

sup Φ₀(A) ≤ Φ₀(g₀),    (4.27)

whence by Lemma 1.9,

sup Φ₀(S) ≤ sup Φ₀(Ā) = sup Φ₀(A) ≤ Φ₀(g₀).    (4.28)

On the other hand, by 1° we have g₀ ∈ S, whence Φ₀(g₀) ≤ sup Φ₀(S), which, together with (4.28), yields (4.25).

(b) Let f be continuous and convex, and assume 2°. If we had g₀ ∈ A, then, since A is an open convex set, by Lemma 1.8 and Remark 1.7 (a) we would obtain Φ₀(g₀) < sup Φ₀(A) = sup Φ₀(Ā) ≤ sup Φ₀(S), in contradiction to 2°. Hence g₀ ∉ A. On the other hand, g₀ ∈ G ⊂ S, so g₀ ∈ G ∩ (S\A), that is, f(g₀) = max f(G). ∎
Remark 4.5. (a) Theorem 4.4 (b) admits the following geometric interpretation: when (1.73) holds and f is continuous and convex, an element g₀ ∈ G satisfies f(g₀) = max f(G) if and only if there exists a hyperplane H₀ that supports the set S (defined in (4.26)) at g₀. Indeed, if f(g₀) = max f(G) and if Φ₀ ∈ X*\{0} is as in Theorem 4.4 (b), then the hyperplane

H₀ = {y ∈ X | Φ₀(y) = Φ₀(g₀)}    (4.29)

has the required properties. Conversely, every hyperplane that supports the set S is of the form (4.29), for some Φ₀ ∈ X*\{0} with sup Φ₀(S) ∈ R (by Chapter 1, Corollary 1.1), and hence if g₀ ∈ H₀, then we must have (4.25).

(b) For any g₀ ∈ G satisfying f(g₀) = max f(G) and any hyperplane H₀ as in (a) above, the sets G and S lie in the same half-space

D₀ := {y ∈ X | Φ₀(y) ≤ Φ₀(g₀)},    (4.30)

and H₀ supports the sets G and S at g₀. Indeed, since Φ₀(g) ≤ sup Φ₀(S) (g ∈ G) and Φ₀(y) ≤ sup Φ₀(S) (y ∈ S), we have G ⊂ D₀ and S ⊂ D₀. Furthermore, Φ₀(g₀) ≤ sup Φ₀(G) ≤ sup Φ₀(S), and by 2°, Φ₀(g₀) = sup Φ₀(S), whence Φ₀(g₀) = sup Φ₀(G) = sup Φ₀(S).

(c) For any g₀ ∈ G satisfying f(g₀) = max f(G) and any hyperplane H₀ as in (a) above, we have g₀ ∈ H₀ and

f(g₀) = min f(H₀).    (4.31)

Indeed, by Lemma 1.9, we have

H₀ = {y ∈ X | Φ₀(y) = sup Φ₀(A)},    (4.32)

and hence, since A is open, by Lemma 1.8 it follows that y ∉ A for all y ∈ H₀. Therefore, by f(g₀) = max f(G), we obtain

f(y) ≥ sup f(G) = f(g₀)    (y ∈ H₀).

(d) Using normal cones (see Section 1.3, formula (1.123)), condition 2° of Theorem 4.4 can be written as

N(S; g₀) ≠ {0}.    (4.33)

Note also that for any Φ₀ ∈ X*\{0} satisfying (4.25) we have

Φ₀(g₀) = max Φ₀(G),    (4.34)

that is, using normal cones,

N(S; g₀) ⊂ N(G; g₀).    (4.35)

Indeed, if N(S; g₀) = {0}, this is obvious; on the other hand, if N(S; g₀) ≠ {0}, then from g₀ ∈ G ⊂ S and (4.25) it follows that Φ₀(g₀) ≤ sup Φ₀(G) ≤ sup Φ₀(S) = Φ₀(g₀).
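As a concrete finite-dimensional illustration of Theorem 4.4 and conditions (4.25), (4.34), the sketch below works in R² with f(y) = ||y||² and a small hypothetical set G; for this f the sublevel set S is a disk, so the supporting functional at a maximum point g₀ can be taken to be g₀ itself (acting by the dot product). All of these choices are assumptions of the sketch.

```python
import math

# If g0 maximizes f(y) = ||y||^2 on G, then Phi0 = g0 supports
# S = {y : f(y) <= sup f(G)} (a disk) at g0, giving (4.25) and (4.34).
G = [(1.0, 0.0), (2.0, 1.0), (0.0, -1.5), (-1.0, 2.0)]
f = lambda y: y[0] ** 2 + y[1] ** 2
dot = lambda a, b: a[0] * b[0] + a[1] * b[1]

g0 = max(G, key=f)        # a maximum point of f on G
phi0 = g0                 # supporting functional at g0

max_on_G = max(dot(phi0, g) for g in G)                         # sup Phi0(G)
max_on_S = math.hypot(*phi0) * math.sqrt(max(f(g) for g in G))  # sup Phi0(S)
print(dot(phi0, g0), max_on_G, max_on_S)   # all three coincide
```

The three printed values coincide: Φ₀ attains its supremum over both G and S at g₀, exactly the situation described in (d) above.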
Let us consider now the particular case when G = {x₀}, a singleton. In this case, Theorem 4.4 (a) yields the following corollary:

Corollary 4.5. Let X be a locally convex space and f: X → R̄ an upper semicontinuous convex function. Then for each x₀ ∈ X satisfying

inf f(X) < f(x₀) < +∞    (4.36)

there exists Φ₀ ∈ X*\{0} such that

Φ₀(x₀) = max_{y∈X, f(y)≤f(x₀)} Φ₀(y).    (4.37)

Remark 4.6. (a) Geometrically, Corollary 4.5 means that if f: X → R̄ is an upper semicontinuous convex function, then for each x₀ ∈ X satisfying (4.36) there exists a hyperplane H₀ = {y ∈ X | Φ₀(y) = Φ₀(x₀)} that supports the level set S_{f(x₀)}(f) = {y ∈ X | f(y) ≤ f(x₀)} at x₀. Moreover, from Remark 4.5 (c) above it follows that for any such hyperplane H₀ we have x₀ ∈ H₀ and f(x₀) = min f(H₀).

(b) The assumption of upper semicontinuity is not necessary in Corollary 4.5, as shown by the following example: Let X be a normed linear space endowed with the weak topology σ(X, X*) and let f be the function

f(y) = ||y||    (y ∈ X)    (4.38)

(i.e., (3.123) with x₀ = 0). Then X is a locally convex space and f is a finite lower semicontinuous function on X that is not upper semicontinuous at any y₀ ∈ X, and (4.36) is equivalent to x₀ ≠ 0. Also, by a corollary of the Hahn–Banach theorem, for each x₀ ∈ X\{0} there exists Φ₀ ∈ X*\{0} (X* is the same both for the weak and for the norm topology on X) such that

Φ₀(x₀) = ||Φ₀|| ||x₀|| = max_{y∈X, ||y||≤||x₀||} Φ₀(y) = max_{y∈X, f(y)≤f(x₀)} Φ₀(y).
min
f(y).
(4.39)
yeX
Hence we have (3.31) with the sup being attained. Proof If xo satisfies (4.36), then (4.39) follows from the last part of Remark 4.6(a) above. On the other hand, if /(JCQ) = min/(X), then for each OQ e X*\{0} we have
4.2 Maximum points of continuous convex functions /(xo) >
inf
f{y)
147
> inf/(X) = /(XQ),
yeX
whence (4.39). Hence the last statement follows since the inequahty > in (3.31) always holds. D Remark 4.7. (a) Geometrically, Corollary 4.6 means that if f: X -^ R is an upper semicontinuous convex function, then for each JCQ G X there exists a hyperplane HQ with XQ G HQ such that f(xo)=mmf(Ho).
(4.40)
(b) If X is a normed linear space and / is the function (4.38), then condition (4.39) is equivalent, by Lemma 1.5, to II
II
•
lUoll =
II II
mm
l^o(-^o)l
IIJII =
y^^
,
11^0 II
^o(>')=^oUo)
and it is a well-known corollary of the Hahn-Banach theorem that such a function Oo e X*\{0} exists. In the case that / is also continuous, we have the following theorem: Theorem 4.5. Let X be a locally convex space, f: X ^^ R a continuous convex function, and G a subset ofX satisfying (1.73). For an element go e G the following statements are equivalent: P . / ( g o ) = max/(G). 2°. There exists OQ G X * \ { 0 } satisfying (4.34) and inf
/(>;)=
yeX
max
inf
f(y).
(4.41)
Proof 1° ^ 2 M f 1° holds, then by Theorem 4.4 and Remark 4.5(d), there exists Oo G X*\{0} satisfying supOo(G) = sup 00(5). Furthermore, by Remark 4.5(c), 1°, and Theorem 3.1 we have, for the hyperplane HQ defined by (4.29), inf
/(>;) = inf/(//o) = /(go) = m a x / ( G ) =
sup
inf
f(y).
2° =^ I M f 4)0 is as in 2°, then by (4.34), inf
fiy) < /(go).
yeX Oo(>0=supo(G)
Hence by Theorem 3.1 and (4.41), we obtain sup/(G) =
sup
inf
fiy)=
(v)=supO(G)
which, together with go e G, yields 1°.
inf
/(j)
Oo(v)=supo(G)
D
148
4. Optimal Solutions for Quasi-convex Maximization
Remark 4.8. Theorem 4.5 admits the following geometric interpretation: When / : X ^- R is 3. continuous convex function satisfying (1.73), an element go e G satisfies /(go) = max / ( G ) if and only if there exists a hyperplane HQ that supports G at go and such that mff(Ho)
= max i n f / ( / / ) ,
(4.42)
HeHc
where HG is the family of hyperplanes defined in Remark 3.1(a), that is, the family of all (closed) hyperplanes that quasi-support the set G. Let us consider now the set A^G ( / ) of all maximum points of / on G (see (3.3)) and the set Odf) of all functions
inf
f(y)=
o(v)=supOo(G)
max
inf
f(y)];
(4.43)
0(j)=:supO(G)
we shall call any element of Ocif) an optimal function (with respect to the pair (G, / ) ) . The optimal functions are nothing other than the optimal solutions of the dual problem (3.42), with W = X*\{0} and C A ' ( { 0 } ) of (3.44). Corollary 4.7. Let X be a locally convex space, / : X -> R a continuous convex function, and G a subset ofX satisfying (1.73). We have Mcif) 7^ 0 if and only if there exists an optimal function OQ G Ocif) such that Oo attains its supremum on G.
(4.44)
Proof Indeed, the condition means that there exist a function ^o G X*\{0} and an element go e G satisfying (4.41) and (4.34), so the result follows from Theorem 4.5. D Corollary 4.8. Let X be a locally convex space, / : X —> R a continuous convex function, and G a weakly compact subset ofX satisfying (1.73). We have Mcif) 7^ 0 if and only if there exists an optimal function OQ € Ocif)Proof Since G is weakly compact, every OQ G X* satisfies (4.44), so the result follows from Corollary 4.7. D Remark 4.9. As shown by Example 2.3 and the function (4.38), the sufficiency part of Corollary 4.8 is no longer true without the assumption of weak compactness, even in the particular case that X is a conjugate Banach space, / : X ^- /? is a finite continuous convex function, and G is a weak* compact subset of X. Let us summarize the connections between the existence of maximum points and of optimal functions with respect to the pair (G, / ) . Theorem 4.6. Let X be a locally convex space, f: X -^ R an upper semicontinuous convex function, and G a subset ofX satisfying (1.73).
(a)>fG(/)7^0^OG(/)7^0; (b)OG(/)#0^MG(/)^0; (c) if G is weakly compact, then Mcif)
7^ 0 ^ O G ( / ) 7^ 0.
4.3 Some basic subdifferential characterizations of maximum points Proof. Part (a) follows from the necessity part of Corollary 4.7. Part (b) is shown by Remark 4.9. Part (c) is Corollary 4.8.
149
D
4.3 Some basic subdifferential characterizations of maximum points

In Section 4.1 we have given some characterizations of maximum points of lower semicontinuous convex functions using abstract subdifferentials. In the present section we shall give some basic characterizations of maximum points of continuous, respectively lower semicontinuous, convex functions, with the aid of the usual subdifferentials and ε-subdifferentials.

We recall that a subset G of a normed linear space X is called proximinal if each x₀ ∈ X admits a best approximation in G, i.e., if P_G(x₀) ≠ ∅ for all x₀ ∈ X.

Lemma 4.1. Let X be a normed linear space, C a proximinal convex subset of X, and G a subset of X. We have the inclusion

G ⊆ C  (4.45)

if and only if

Ñ(C; x) ⊆ Ñ(G; x)  (x ∈ bd C),  (4.46)

where Ñ(C; x) denotes the "extended normal cone" of C at x ∈ X (see Chapter 1, formula (1.125)).

Proof. Necessity: Assume (4.45) and let x ∈ bd C, Φ₀ ∈ Ñ(C; x). Then Φ₀(c) ≤ Φ₀(x) (c ∈ C) and G ⊆ C, whence Φ₀(g) ≤ Φ₀(x) (g ∈ G), so Φ₀ ∈ Ñ(G; x) (actually, this part holds for any sets G, C ⊆ X with G ⊆ C and any x ∈ X).

Sufficiency: Assume that (4.45) does not hold, i.e., there exists g₀ ∈ G\C. Then, since C is a proximinal subset of X, there exists an element of best approximation x̄ ∈ C of g₀, i.e., such that ‖g₀ − x̄‖ = min_{c∈C} ‖g₀ − c‖ =: r₀ (> 0, since g₀ ∉ C, x̄ ∈ C); clearly, x̄ ∈ bd C. Let us consider the open ball O(g₀, r₀) = {y ∈ X | ‖g₀ − y‖ < r₀}. Since C is convex and O(g₀, r₀) is open and convex, and since O(g₀, r₀) ∩ C = ∅, by Chapter 1, Theorem 1.1 we can separate C and O(g₀, r₀); i.e., there exists Φ₀ ∈ X*\{0} such that

γ := sup Φ₀(C) ≤ inf Φ₀(O(g₀, r₀)).  (4.47)

Hence by Lemma 1.8,

Φ₀(c) ≤ γ < Φ₀(y)  (c ∈ C, y ∈ O(g₀, r₀)).  (4.48)

Then by x̄ ∈ C we have Φ₀(x̄) ≤ γ. But since x̄ ∈ bd O(g₀, r₀), by (4.48) we also have Φ₀(x̄) ≥ γ, so Φ₀(x̄) = γ. Therefore, Φ₀(c − x̄) ≤ 0 (c ∈ C),
that is, Φ₀ ∈ Ñ(C; x̄). On the other hand, g₀ ∈ G ∩ O(g₀, r₀), so by (4.48) we have Φ₀(g₀) > γ = Φ₀(x̄), whence Φ₀ ∉ Ñ(G; x̄), which contradicts (4.46). ∎

Theorem 4.7. Let X be a normed linear space, f: X → R a continuous convex function, and G a subset of X, and let g₀ ∈ G be such that the level set S_{f(g₀)}(f) = {x ∈ X | f(x) ≤ f(g₀)} is proximinal and

inf f(X) < f(g₀) < +∞.  (4.49)

The following statements are equivalent:
1°. f(g₀) = max f(G).
2°. We have

∂f(x) ⊆ Ñ(G; x)  (x ∈ X, f(x) = f(g₀)).  (4.50)

Proof. By the first part of Remark 4.1, condition 1° is equivalent to G ⊆ S_{f(g₀)}(f), which, in turn, by Lemma 4.1 applied to G and the proximinal convex set C = S_{f(g₀)}(f), is equivalent to

Ñ(S_{f(g₀)}(f); x) ⊆ Ñ(G; x)  (x ∈ bd S_{f(g₀)}(f)).  (4.51)

But by (4.49) and since f is a continuous convex function,

bd S_{f(g₀)}(f) = S_{f(g₀)}(f)\A_{f(g₀)}(f) = {x ∈ X | f(x) = f(g₀)}  (4.52)

(see Remark 1.7). Hence by Theorem 1.6, we have

Ñ(S_{f(g₀)}(f); x) = ∪_{η>0} η ∂f(x)  (x ∈ bd S_{f(g₀)}(f)).

Consequently, (4.51) is equivalent to (4.50). ∎
Replacing in (4.49) inf f(X) by inf f(G), one can replace in (4.50) the extended normal cones Ñ, considered for all x ∈ X with f(x) = f(g₀), by usual normal cones N, considered only for elements g ∈ G with f(g) = f(g₀). Namely, we have the following theorem:

Theorem 4.8. Let X be a locally convex space, f: X → R a lower semicontinuous convex function, and G a convex subset of X such that G ⊆ int dom f. For an element g₀ ∈ G satisfying

inf f(G) < f(g₀),  (4.53)

the following statements are equivalent:
1°. f(g₀) = max f(G).
2°. We have

∂f(g) ⊆ N(G; g)  (g ∈ G, f(g) = f(g₀)).  (4.54)
Proof. 1° ⇒ 2°. Assume 1° and let g ∈ G, f(g) = f(g₀), and Φ₀ ∈ ∂f(g). Then

Φ₀(y) − Φ₀(g) ≤ f(y) − f(g) = f(y) − f(g₀) ≤ 0  (y ∈ S_{f(g₀)}(f)).  (4.55)

But by 1°, G ⊆ S_{f(g₀)}(f), and hence by (4.55), Φ₀(g′) ≤ Φ₀(g) (g′ ∈ G), so Φ₀ ∈ N(G; g).

2° ⇒ 1°. Assume that 1° does not hold, so there exists g′ ∈ G such that f(g₀) < f(g′). Note that by (4.53), there exists g″ ∈ G such that f(g″) < f(g₀) < f(g′). Let

g_η := ηg′ + (1 − η)g″  (0 ≤ η ≤ 1).  (4.56)

Then, since f restricted to the segment [g″, g′] = {g_η | 0 ≤ η ≤ 1} is convex and continuous, there exists ḡ = η₀g′ + (1 − η₀)g″ ∈ G, 0 < η₀ < 1, such that f(ḡ) = f(g₀), and the directional derivative f′(ḡ; g′ − ḡ) must be > 0. Indeed, since ḡ − g′ = t(g″ − ḡ), where t = (1 − η₀)/η₀ > 0, by [184], Theorem 23.1 we have

f′(ḡ; ḡ − g′) = f′(ḡ; t(g″ − ḡ)) = t f′(ḡ; g″ − ḡ) ≤ t(f(g″) − f(ḡ)) < 0,

whence 0 ≤ f′(ḡ; g′ − ḡ) + f′(ḡ; ḡ − g′), and therefore f′(ḡ; g′ − ḡ) ≥ −f′(ḡ; ḡ − g′) > 0. Hence, since f′(ḡ; g′ − ḡ) = max_{Φ∈∂f(ḡ)} Φ(g′ − ḡ) (by (1.121)), it follows that there exists Φ₀ ∈ ∂f(ḡ) such that Φ₀(g′ − ḡ) > 0, i.e., such that Φ₀ ∉ N(G; ḡ). Thus, 2° does not hold. ∎

Remark 4.10. (a) The assumptions (4.49) and (4.53) in Theorems 4.7 and 4.8 cannot be removed. Indeed, for example, if f: X → R is differentiable and has a unique minimum on X at some g₀ ∈ int G, then conditions (4.50) and (4.54) are satisfied, since {x ∈ X | f(x) = f(g₀)} = {g₀} and ∂f(g₀) = {0} ⊆ N(G; g₀), but f(g₀) = min f(X) ≠ max f(G).

(b) In Theorem 4.8 one cannot replace (4.53) by the weaker assumption (4.49), as shown by the following example: Let X = R² with the Euclidean norm, f(x₁, x₂) = (1 − x₁) + x₂² (so f is convex and differentiable), and G = {0} × [−1, +1]. Then for g₀ = (0, 0) ∈ G we have (4.49) and (4.54), but not f(g₀) = max f(G). Indeed, if g = (0, g₂) ∈ G and f(g) = f(g₀) = 1, then 1 + g₂² = 1, whence g = 0, and ∂f(0) = {∇f(0)} = {(−1, 0)} ⊆ N(G; 0). On the other hand, f(g₀) ≠ max f(G) (since f(g₀) = f(0, 0) = 1 < g₂² + 1 = f(0, g₂) for all (0, g₂) ∈ G, g₂ ≠ 0). One can also see directly that (4.50) does not hold either: for x = (1, 1) (∉ G) we have f(x) = 1 = f(g₀) and ∂f(1, 1) = {∇f(1, 1)} = {(−1, 2)} (the gradient of f at (1, 1)), but (−1, 2) ∉ Ñ(G; (1, 1)), since for (0, 1) ∈ G we have (−1, 2)(0, 1) = 2 > (−1, 2)(1, 1) = 1.

By introducing a parameter ε, namely, by considering the ε-subdifferentials ∂_ε f(g₀) (ε ≥ 0) instead of the subdifferentials ∂f(g) (g ∈ G, f(g) = f(g₀)), and the ε-normal sets N_ε(G; g₀) instead of the normal cones N(G; g) (g ∈ G, f(g) = f(g₀)), one can transform the purely local conditions of (4.54) into global conditions. Indeed, we have the following:

Theorem 4.9. Let X be a locally convex space, f: X → R a proper lower semicontinuous convex function, and G a subset of X. For an element g₀ ∈ G, the following statements are equivalent:
1°. f(g₀) = max f(G).
2°. We have

∂_ε f(g₀) ⊆ N_ε(G; g₀)  (ε > 0).  (4.57)
Proof. 1° ⇒ 2°. Assume 1° and let ε > 0 and Φ₀ ∈ ∂_ε f(g₀). Then

0 ≥ f(g) − f(g₀) ≥ Φ₀(g − g₀) − ε  (g ∈ G),

whence Φ₀ ∈ N_ε(G; g₀).

2° ⇒ 1°. Let us first observe that if we have 2°, then by (1.131) and (1.128),

∂f(g₀) = ∩_{ε>0} ∂_ε f(g₀) ⊆ ∩_{ε>0} N_ε(G; g₀) = N(G; g₀),  (4.58)

so (4.57) holds also for ε = 0. Assume now that 1° does not hold, that is, using (3.134),

f(g₀) < sup f(G) = sup_{Φ∈X*} inf_{x∈X} {f(x) − Φ(x) + sup Φ(G)}.  (4.59)

Then there exists Φ₀ ∈ X* such that

f(g₀) < inf_{x∈X} {f(x) − Φ₀(x) + sup Φ₀(G)},  (4.60)

whence

sup_{x∈X} {Φ₀(x) − f(x)} < sup Φ₀(G) − f(g₀).

By (4.59) and (4.60), we have f(g₀) ∈ R. Let

ε := sup_{x∈X} {Φ₀(x) − f(x)} − Φ₀(g₀) + f(g₀).  (4.61)

Then by (4.61), we have ε ≥ 0 and

sup_{x∈X} {Φ₀(x) − f(x)} = Φ₀(g₀) − f(g₀) + ε,

so Φ₀ ∈ ∂_ε f(g₀). On the other hand, by the first inequality in (4.60),

sup Φ₀(G) > f(g₀) − inf_{x∈X} {f(x) − Φ₀(x)} = Φ₀(g₀) + ε,

so Φ₀ ∉ N_ε(G; g₀), which, since Φ₀ ∈ ∂_ε f(g₀) (and using (4.58) when ε = 0), shows that 2° does not hold. ∎
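The counterexample in Remark 4.10(b) above is easy to verify numerically. The following sketch is my own illustration, not part of the book; it checks that the gradient of f(x₁, x₂) = (1 − x₁) + x₂² at g₀ = (0, 0) lies in the normal cone of G = {0} × [−1, 1] at g₀, while f(g₀) is nevertheless not the maximum of f on G:

```python
# Hypothetical numeric check of the counterexample in Remark 4.10(b):
# f(x1, x2) = (1 - x1) + x2**2 on R^2, G = {0} x [-1, 1], g0 = (0, 0).
def f(x1, x2):
    return (1 - x1) + x2 ** 2

grad_f0 = (-1.0, 0.0)   # gradient of f at g0 = (0, 0): (-1, 2*x2) at x2 = 0

# (-1, 0) lies in the normal cone N(G; g0): <grad_f0, g - g0> <= 0 for g in G.
G_samples = [(0.0, t / 10) for t in range(-10, 11)]
normal_cone_ok = all(grad_f0[0] * g1 + grad_f0[1] * g2 <= 0 for g1, g2 in G_samples)

# ...yet f(g0) = 1 is not the maximum of f on G: f(0, 1) = 2.
print(normal_cone_ok, f(0, 0), max(f(g1, g2) for g1, g2 in G_samples))  # True 1 2.0
```

This confirms that without assumption (4.53), the subdifferential inclusion (4.54) can hold at g₀ even though g₀ is not a maximum point.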
5 Reverse Convex Best Approximation
The study of reverse convex best approximation, that is, of best approximation by complements of convex sets, is motivated, among others, by its connections with the famous unsolved problem whether in a Hilbert space every Chebyshev set (i.e., a set in which each x ∈ X has a unique element of best approximation) is necessarily convex. Namely, it is known (see the Notes and Remarks to Section 5.2) that if a Hilbert space X contains a Chebyshev set that is not convex, then X also contains a Chebyshev set that is the complement CG of an open bounded convex subset G (≠ ∅) of X. Geometrically, if G is a convex set with int G ≠ ∅ and x₀ ∈ int G, the problem of finding dist(x₀, CG) amounts to finding the greatest radius of an open ball with center x₀ contained in G (see Figure 5.1); clearly, when x₀ ∈ bd G, no such open ball exists, and we have dist(x₀, CG) = 0.

Figure 5.1.

We shall be concerned with the following two main problems:
(1) Find convenient formulas for dist(x₀, CG).
(2) Give characterizations of elements of (reverse convex) best approximation, i.e., necessary and sufficient conditions in order that an element z₀ ∈ X satisfy z₀ ∈ P_CG(x₀), that is, z₀ ∈ CG and ‖x₀ − z₀‖ = dist(x₀, CG).

We shall obtain duality results, using the elements Φ of the conjugate space X*.

Remark 5.1. If x₀ ∈ bd G, and hence in particular, if G is any subset of X with int G = ∅ and x₀ ∈ G, then

dist(x₀, CG) = 0.  (5.1)

Indeed, if x₀ ∈ bd G, then every ball with center x₀ intersects CG, whence dist(x₀, CG) = 0. This applies, in particular, if int G = ∅ (hence G ⊆ bd G) and x₀ ∈ G. Therefore, it is natural that in most of the subsequent results we shall assume that int G ≠ ∅.
5.1 The distance to the complement of a convex set

The following theorem gives an explicit formula for the distance to the complement of a convex set.

Theorem 5.1. Let X be a normed linear space, G a convex subset of X with int G ≠ ∅, and x₀ ∈ G. Then

dist(x₀, CG) = inf_{Φ∈X*, ‖Φ‖=1} {sup Φ(G) − Φ(x₀)}.  (5.2)

Proof. Let us first assume that x₀ = 0, so formula (5.2) becomes

dist(0, CG) = inf_{‖Φ‖=1} sup Φ(G).  (5.3)

Since int G ≠ ∅, for each z ∈ CG there exists Φ_z ∈ X* with ‖Φ_z‖ = 1 such that sup Φ_z(G) ≤ Φ_z(z) (by the separation theorem). Hence

‖z‖ ≥ Φ_z(z) ≥ sup Φ_z(G) ≥ inf_{‖Φ‖=1} sup Φ(G)  (z ∈ CG),

which yields

dist(0, CG) = inf_{z∈CG} ‖z‖ ≥ inf_{‖Φ‖=1} sup Φ(G).  (5.4)

On the other hand, since 0 ∈ G, for each Φ ∈ X* with ‖Φ‖ = 1 we have sup Φ(G) ≥ Φ(0) = 0 and

CG ⊇ {x ∈ X | Φ(x) > sup Φ(G)},  (5.5)

whence by Corollary 1.4,

dist(0, CG) ≤ dist(0, {x ∈ X | Φ(x) ≥ sup Φ(G)}) = sup Φ(G)  (‖Φ‖ = 1),

which, together with (5.4), yields (5.3).

Assume now that x₀ ∈ G is arbitrary. Then

dist(x₀, CG) = inf_{z∈CG} ‖x₀ − z‖ = dist(0, C(G − x₀)),  (5.6)

where G − x₀ is a convex set containing 0, with int(G − x₀) ≠ ∅. Hence by (5.3),

dist(0, C(G − x₀)) = inf_{‖Φ‖=1} sup Φ(G − x₀) = inf_{‖Φ‖=1} {sup Φ(G) − Φ(x₀)},  (5.7)

which, together with (5.6), yields (5.2). ∎
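Formula (5.2) can be checked numerically in a simple case. The following sketch is a hypothetical illustration, not part of the book: take X = R² and G the open Euclidean unit ball, so that dist(x₀, CG) = 1 − ‖x₀‖ for x₀ ∈ G, while sup Φ(G) = ‖Φ‖ for every functional Φ; sampling unit functionals Φ = ⟨u, ·⟩ approximates the dual infimum:

```python
import math

# Hypothetical illustration of Theorem 5.1: X = R^2, G the open unit ball.
# Primal value: dist(x0, CG) = 1 - ||x0||.  Dual value (right-hand side of
# (5.2)): inf over unit functionals of sup Phi(G) - Phi(x0) = 1 - <u, x0>.
def dual_distance(x0, samples=100_000):
    best = float("inf")
    for k in range(samples):
        t = 2 * math.pi * k / samples
        u = (math.cos(t), math.sin(t))                 # unit-norm functional
        best = min(best, 1.0 - (u[0] * x0[0] + u[1] * x0[1]))
    return best

x0 = (0.3, 0.4)                                        # ||x0|| = 0.5 < 1
primal = 1.0 - math.hypot(*x0)                         # dist(x0, CG)
dual = dual_distance(x0)                               # sampled dual infimum
print(primal, dual)                                    # both approximately 0.5
```

The sampled infimum is attained (up to discretization) at u = x₀/‖x₀‖, the direction of the nearest boundary point, matching the geometric picture behind the theorem.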
Remark 5.2. (a) If int G = ∅, the expression inf_{‖Φ‖=1}{sup Φ(G) − Φ(x₀)} may have any value d ≥ 0. Indeed, for example, if X = C([0, 1]) and G = the (convex) set of all algebraic polynomials of norm ≤ d, then int G = ∅, so dist(0, CG) = 0, but Ḡ = {x ∈ X | ‖x‖ ≤ d} (by the classical theorem of Weierstrass on the uniform approximation of continuous functions by polynomials), and hence sup Φ(G) = sup Φ(Ḡ) = d‖Φ‖ for all Φ ∈ X*, so inf_{‖Φ‖=1}{sup Φ(G) − Φ(0)} = d.

(b) If G is open, then by Lemma 1.8,

CG ⊇ {x ∈ X | Φ(x) ≥ sup Φ(G)}  (Φ ∈ X*\{0}),  (5.8)

and hence in particular, CG ⊇ {x ∈ X | Φ(x) = sup Φ(G)}.

(c) By Lemma 1.5, Theorem 5.1 admits the following geometric interpretation: if G is a convex subset of X such that int G ≠ ∅, and if x₀ ∈ G, then

dist(x₀, CG) = inf_{‖Φ‖=1} dist(x₀, H_{Φ, sup Φ(G)}) = inf_{‖Φ‖=1} inf_{y∈X, Φ(y)=sup Φ(G)} ‖x₀ − y‖,  (5.9)

where

H_{Φ, sup Φ(G)} = {y ∈ X | Φ(y) = sup Φ(G)}.  (5.10)

Equivalently, using also Corollary 1.1, this means that

dist(x₀, CG) = inf_{H∈H_G} dist(x₀, H),  (5.11)

where H_G denotes the collection of all hyperplanes that quasi-support G (see Figure 5.2); thus, this is another instance of the reduction principle (it reduces the computation of dist(x₀, CG) to the computation of dist(x₀, H) for H ∈ H_G).

(d) Formula (5.2) remains valid for any closed convex set G (possibly with int G = ∅) and any x₀ ∈ G, in a normed linear space X. Indeed, this follows by replacing in the above proof of Theorem 5.1 the separation theorem by the strict separation theorem.
Corollary 5.1. If G is a convex subset of X such that int G ≠ ∅ and if x₀ ∈ G, then

dist(x₀, CG) = inf_{(Φ,d)∈(X*\{0})×R, sup Φ(G)≤d} (d − Φ(x₀))/‖Φ‖ = inf_{(Φ,d)∈(X*\{0})×R, Φ(g)≤d (g∈G)} (d − Φ(x₀))/‖Φ‖.  (5.12)

Proof. Clearly,

sup Φ(G) = inf{d ∈ R | sup Φ(G) ≤ d}  (Φ ∈ X*).  (5.13)

Hence by Theorem 5.1 and (5.13), we obtain

dist(x₀, CG) = inf_{‖Φ‖=1} {sup Φ(G) − Φ(x₀)} = inf_{Φ∈X*\{0}} (sup Φ(G) − Φ(x₀))/‖Φ‖
= inf_{Φ∈X*\{0}} inf_{d∈R, sup Φ(G)≤d} (d − Φ(x₀))/‖Φ‖ = inf_{(Φ,d)∈(X*\{0})×R, sup Φ(G)≤d} (d − Φ(x₀))/‖Φ‖,  (5.14)

which proves the first equality in (5.12). Finally, the equality of dist(x₀, CG) with the third term of (5.12) follows similarly to (5.14), using that sup Φ(G) = inf{d ∈ R | Φ(g) ≤ d (g ∈ G)}. ∎
Remark 5.3. (a) Conversely, Corollary 5.1 implies Theorem 5.1. Indeed, this follows by starting with the first equality of (5.12) and writing formula (5.14) in the reverse order.

(b) Corollary 5.1 admits the following geometric interpretation: if G is a convex subset of X such that int G ≠ ∅ and if x₀ ∈ G, then

dist(x₀, CG) = inf_{(Φ,d), U_{Φ,d}∩G=∅} dist(x₀, U_{Φ,d}) = inf_{(Φ,d), V_{Φ,d}∩G=∅} dist(x₀, V_{Φ,d}),  (5.15)

where U_{Φ,d} and V_{Φ,d} are the half-spaces (1.67). Equivalently, this means that

dist(x₀, CG) = inf_{U∈U, U∩G=∅} dist(x₀, U) = inf_{V∈V, V∩G=∅} dist(x₀, V),  (5.16)

where U and V denote, respectively, the collection of all open half-spaces in X and the collection of all closed half-spaces in X (see Figures 5.3(a) and (b)).

Figure 5.3.

Indeed, for the open half-space U_{Φ,d} and the closed half-space V_{Φ,d} we have

U_{Φ,d} ∩ G = ∅ ⇔ Φ(g) ≤ d  (g ∈ G),  (5.17)

and, respectively,

V_{Φ,d} ∩ G = ∅ ⇔ Φ(g) < d  (g ∈ G).  (5.18)

Hence by (5.12) and Corollary 1.4, we obtain (5.16). Note that in (5.13), and hence in (5.12) too, one can replace sup Φ(G) ≤ d by sup Φ(G) < d.

(c) The half-spaces U and V in (5.16) can be replaced by hyperplanes; that is, we have

dist(x₀, CG) = inf_{H∈H, H∩G=∅} dist(x₀, H),  (5.19)
where H denotes the collection of all hyperplanes in X.

One can also express the right-hand side of (5.2) as follows.

Proposition 5.1. If G is a convex subset of X and if x₀ ∈ G, then

inf_{Φ∈X*, ‖Φ‖=1} {sup Φ(G) − Φ(x₀)} = inf_{Φ∈X*, ‖Φ‖≥1} {sup Φ(G) − Φ(x₀)} = inf_{Φ∈X*, ‖Φ‖>1} {sup Φ(G) − Φ(x₀)}.  (5.20)

Proof. Let us first assume that x₀ = 0, so formula (5.20) becomes

inf_{‖Φ‖=1} sup Φ(G) = inf_{‖Φ‖≥1} sup Φ(G) = inf_{‖Φ‖>1} sup Φ(G).  (5.21)

If G = X, then all members of (5.21) are equal to +∞. Assume now that G ≠ X. Then, since {Φ ∈ X* | ‖Φ‖ = 1} ⊆ {Φ ∈ X* | ‖Φ‖ ≥ 1}, we have

inf_{‖Φ‖≥1} sup Φ(G) ≤ inf_{‖Φ‖=1} sup Φ(G).  (5.22)

On the other hand, let Φ₀ ∈ X*, ‖Φ₀‖ ≥ 1, be arbitrary. Then 1/‖Φ₀‖ ≤ 1, and since 0 ∈ G, we have sup Φ₀(G) ≥ 0. Hence,

inf_{‖Φ‖=1} sup Φ(G) ≤ sup (Φ₀/‖Φ₀‖)(G) = (1/‖Φ₀‖) sup Φ₀(G) ≤ sup Φ₀(G),

and therefore, since Φ₀ ∈ X* with ‖Φ₀‖ ≥ 1 was arbitrary, we obtain, using also (5.22), that

inf_{‖Φ‖=1} sup Φ(G) = inf_{‖Φ‖≥1} sup Φ(G).  (5.23)

Furthermore, since G ≠ X and G is convex, by the strict separation theorem there exists Φ₀ ∈ X*\{0} such that sup Φ₀(G) < +∞. Let Φ₀ ∈ X* with ‖Φ₀‖ = 1, sup Φ₀(G) < +∞ be arbitrary and let μ > 1. Then ‖μΦ₀‖ > 1 and

inf_{‖Φ‖>1} sup Φ(G) ≤ sup (μΦ₀)(G) = μ sup Φ₀(G),

whence, using that μ > 1 and Φ₀ ∈ X* with ‖Φ₀‖ = 1 were arbitrary, we obtain

inf_{‖Φ‖>1} sup Φ(G) ≤ inf_{‖Φ‖=1} sup Φ(G).  (5.24)

By (5.23) and (5.24), it follows that

inf_{‖Φ‖=1} sup Φ(G) = inf_{‖Φ‖≥1} sup Φ(G) ≤ inf_{‖Φ‖>1} sup Φ(G) ≤ inf_{‖Φ‖=1} sup Φ(G),

which yields (5.21).

Assume now that x₀ ∈ G is arbitrary. Then G − x₀ is a convex subset of X containing 0, and hence, applying (5.21) to G − x₀ and using that sup Φ(G − x₀) = sup Φ(G) − Φ(x₀), we obtain (5.20). ∎
Remark 5.4. Combining Theorem 5.1 and Proposition 5.1, one obtains further expressions of dist(x₀, CG).

One can replace in (5.9) hyperplanes by other sets, such as quasi-supporting closed or open half-spaces (see Figures 5.4(a) and (b)). Indeed, we have the following theorem:

Theorem 5.2. Let X be a normed linear space, G a convex subset of X such that int G ≠ ∅, and x₀ ∈ G. Then

dist(x₀, CG) = inf_{Φ∈X*, ‖Φ‖=1} inf_{y∈X, Φ(y)≥sup Φ(G)} ‖y − x₀‖ = inf_{Φ∈X*, ‖Φ‖=1} inf_{y∈X, Φ(y)>sup Φ(G)} ‖y − x₀‖.  (5.25)

Proof. It will be enough to prove that

dist(x₀, CG) = inf_{‖Φ‖=1} dist(x₀, U_{Φ, sup Φ(G)}),  (5.26)

where U_{Φ, sup Φ(G)} is the open half-space

U_{Φ, sup Φ(G)} = {y ∈ X | Φ(y) > sup Φ(G)},  (5.27)

or equivalently,

dist(x₀, CG) = inf_{U∈U_G} dist(x₀, U),  (5.28)

where U_G denotes the collection of all open half-spaces that quasi-support G and do not contain int G. Since x₀ ∈ G, we have x₀ ∉ U_{Φ, sup Φ(G)}, whence by Corollary 1.4, dist(x₀, U_{Φ, sup Φ(G)}) = sup Φ(G) − Φ(x₀) for all Φ ∈ X* with ‖Φ‖ = 1. Hence by Theorem 5.1, we obtain (5.26). ∎
Figure 5.4.

Let us show now that in the case 0 ∈ G (G ≠ {0}), it is enough to consider d = 1 in Corollary 5.1 above.

Theorem 5.3. Let G be a convex subset of X with 0 ∈ G, and let x₀ ∈ G. If int G ≠ ∅, then

dist(x₀, CG) = inf_{Φ∈X*\{0}, sup Φ(G)≤1} (1 − Φ(x₀))/‖Φ‖ = inf_{Φ∈X*\{0}, Φ(g)≤1 (g∈G)} (1 − Φ(x₀))/‖Φ‖.  (5.29)

Proof. By (5.12), we have the inequalities ≤ in (5.29). On the other hand, since 0 ∈ G, for any Φ ∈ X*\{0} we have sup Φ(G) ≥ 0, and hence for any Φ ∈ X*\{0} and d ∈ R with sup Φ(G) < d we have d > 0. Then the function Φ′ = (1/d)Φ ∈ X*\{0} satisfies sup Φ′(G) ≤ 1. Also,

(d − Φ(x₀))/‖Φ‖ = (1 − Φ′(x₀))/‖Φ′‖,  (5.30)

which, by the last part of Remark 5.3(b), yields the inequalities ≥ in (5.29), and hence the equalities. ∎

Remark 5.5. By Corollary 1.4, one can also write the first equality of (5.29) in the following geometric form:

dist(x₀, CG) = inf_{Φ∈G°\{0}} dist(x₀, {y ∈ X | Φ(y) ≥ 1}),  (5.31)

where G° is the (usual) polar (1.82) (with C = G) of G.

In the case dim X < +∞, one can obtain more complete results. Indeed, let us first prove the following proposition:

Proposition 5.2. If G is a convex subset of a finite-dimensional normed linear space X, and x₀ ∈ G, then

dist(x₀, CG) = dist(x₀, CḠ).  (5.32)

Proof. Since G ⊆ Ḡ, we have CḠ ⊆ CG, whence

dist(x₀, CG) ≤ dist(x₀, CḠ).  (5.33)

Assume now that the inequality (5.33) is strict, so there exists ε > 0 such that

dist(x₀, CG) + 2ε < dist(x₀, CḠ).  (5.34)

Choose z ∈ CG such that ‖x₀ − z‖ ≤ dist(x₀, CG) + ε. Then z ∈ Ḡ (since if z ∈ CḠ, then dist(x₀, CḠ) ≤ ‖x₀ − z‖ ≤ dist(x₀, CG) + ε, which contradicts (5.34)). Hence, z ∈ Ḡ ∩ CG ⊆ bd G = bd Ḡ, where the last equality holds by dim X < +∞ and the convexity of G. Consequently, there exists y ∈ CḠ such that ‖z − y‖ ≤ ε. Then by (5.34), we obtain

‖x₀ − y‖ ≤ ‖x₀ − z‖ + ‖z − y‖ ≤ dist(x₀, CG) + 2ε < dist(x₀, CḠ),

in contradiction to y ∈ CḠ. ∎
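Proposition 5.2 is easy to probe numerically in the plane. The following sketch is a hypothetical illustration of mine, not part of the book: for G the open unit disc and Ḡ the closed unit disc in R², a brute-force grid estimate shows that the distances from x₀ to the two complements agree (both equal 1 − ‖x₀‖ up to grid resolution):

```python
import math

# Hypothetical finite-dimensional check of Proposition 5.2: X = R^2,
# G = open unit disc, cl(G) = closed unit disc.  Both dist(x0, CG) and
# dist(x0, C cl(G)) equal 1 - ||x0||; estimated over a grid on [-2, 2]^2.
def dist_to_complement(x0, closed_ball, n=400):
    best = float("inf")
    for i in range(n + 1):
        for j in range(n + 1):
            y = (-2 + 4 * i / n, -2 + 4 * j / n)
            r = math.hypot(*y)
            inside = (r <= 1) if closed_ball else (r < 1)  # y in the ball?
            if not inside:                                  # y in the complement
                best = min(best, math.hypot(y[0] - x0[0], y[1] - x0[1]))
    return best

x0 = (0.25, 0.25)
d_open = dist_to_complement(x0, closed_ball=False)
d_closed = dist_to_complement(x0, closed_ball=True)
print(d_open, d_closed, 1 - math.hypot(*x0))  # all approximately equal
```

As Remark 5.6(a) notes, this agreement is a genuinely finite-dimensional phenomenon and fails in C([0, 1]).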
Remark 5.6. (a) The assumption dim X < +∞ cannot be omitted in Proposition 5.2, as shown by Remark 5.2(a) above.

(b) One can also give the following alternative proof of Proposition 5.2: Since int Ḡ = int G (by dim X < +∞ and the convexity of G), we have, by (1.52) and Lemma 1.2,

dist(x₀, CG) = dist(x₀, C(int G)) = dist(x₀, C(int Ḡ)) = dist(x₀, CḠ).

Proposition 5.3. Let dim X < +∞, G a convex subset of X, and x₀ ∈ G. Then we have (5.2). If, in addition, 0 ∈ G, then we also have (5.29).

Proof. For the first part, by Theorem 5.1 and Remark 5.1, we have to prove that if int G = ∅, then the right-hand side of (5.2) is 0. Since dim X < +∞ and G is a convex set with int G = ∅, G is contained in some hyperplane {x ∈ X | Φ₀(x) = d}, with Φ₀ ∈ X*, ‖Φ₀‖ = 1, d ∈ R. Hence, since x₀ ∈ G, we have

0 ≤ inf_{‖Φ‖=1} {sup Φ(G) − Φ(x₀)} ≤ sup Φ₀(G) − Φ₀(x₀) = d − d = 0.

If, in addition, 0 ∈ G, then d = 0, and hence, since x₀ ∈ G (so Φ₀(x₀) = 0) and sup (μΦ₀)(G) = μ sup Φ₀(G) = 0 ≤ 1 for all μ > 0, we obtain

0 ≤ inf_{Φ∈X*\{0}, sup Φ(G)≤1} (1 − Φ(x₀))/‖Φ‖ ≤ inf_{μ>0} (1 − μΦ₀(x₀))/‖μΦ₀‖ = inf_{μ>0} 1/(‖Φ₀‖μ) = 0. ∎

Remark 5.7. Alternatively, the first part of Proposition 5.3 also follows from Proposition 5.2 and Remark 5.2(d) applied to Ḡ.
5.2 Characterizations and existence of elements of best approximation in complements of convex sets

We shall first give some characterizations of elements of best approximation of x₀ in CG, where G is a convex set and x₀ ∈ G, i.e., some necessary and sufficient conditions in order that z₀ ∈ P_CG(x₀) (that is, z₀ ∈ CG and ‖x₀ − z₀‖ = dist(x₀, CG)). To this end, we shall use the distance formula (5.2) of Theorem 5.1. Note that we do not need to consider the case of G closed (see Remark 5.2(d)), since in that case CG is open, so P_CG(x₀) = ∅.

Theorem 5.4. Let X be a normed linear space, G a convex subset of X with int G ≠ ∅, and let x₀ ∈ G. For an element z₀ ∈ CG, the following statements are equivalent:
1°. ‖x₀ − z₀‖ = dist(x₀, CG).
2°. We have

z₀ ∈ bd CG = bd G,  (5.35)

and there exists Φ₀ ∈ X* such that

sup Φ₀(G) − Φ₀(x₀) = inf_{Φ∈X*, ‖Φ‖=1} {sup Φ(G) − Φ(x₀)},  (5.36)

Φ₀(z₀ − x₀) = ‖x₀ − z₀‖.  (5.37)

3°. We have (5.35), and there exists Φ₀ ∈ X* satisfying (5.36) and

sup Φ₀(G) − Φ₀(x₀) = ‖x₀ − z₀‖.  (5.38)

Moreover, in 2° and 3° we may also assume that

‖Φ₀‖ = 1.  (5.39)

Proof. Assume 1°. Then we have (5.35). Since int G ≠ ∅ and z₀ ∈ CG, by the separation theorem there exists Φ₀ ∈ X* such that sup Φ₀(G) ≤ Φ₀(z₀); clearly, we may assume that ‖Φ₀‖ = 1. Hence by int G ≠ ∅, Theorem 5.1, and 1°, we obtain

‖x₀ − z₀‖ ≥ Φ₀(z₀ − x₀) ≥ sup Φ₀(G) − Φ₀(x₀) ≥ inf_{‖Φ‖=1} {sup Φ(G) − Φ(x₀)} = dist(x₀, CG) = ‖x₀ − z₀‖,

whence (5.36) and (5.37). Thus, 1° ⇒ 2°.

Assume now 2°. Then by (5.35), we have z₀ ∈ bd G ⊆ Ḡ, whence sup Φ₀(G) ≥ Φ₀(z₀), and hence, by (5.37), (5.36), and Theorem 5.1,

‖x₀ − z₀‖ = Φ₀(z₀ − x₀) ≤ sup Φ₀(G) − Φ₀(x₀) = inf_{‖Φ‖=1} {sup Φ(G) − Φ(x₀)} = dist(x₀, CG) ≤ ‖x₀ − z₀‖,

which yields (5.38). Thus, 2° ⇒ 3°.

Assume, finally, 3°. Then by (5.38), (5.36), and Theorem 5.1,

‖x₀ − z₀‖ = sup Φ₀(G) − Φ₀(x₀) = inf_{‖Φ‖=1} {sup Φ(G) − Φ(x₀)} = dist(x₀, CG),

so z₀ ∈ P_CG(x₀). Thus, 3° ⇒ 1°, which proves the equivalence of 1°, 2°, and 3°.

Finally, if we have 2°, then by (5.37), ‖Φ₀‖ ≥ 1 (since otherwise Φ₀(z₀ − x₀) ≤ ‖Φ₀‖ ‖z₀ − x₀‖ < ‖z₀ − x₀‖), so 1/‖Φ₀‖ ≤ 1. Hence by (5.36),

sup (Φ₀/‖Φ₀‖)(G) − (Φ₀/‖Φ₀‖)(x₀) = (1/‖Φ₀‖){sup Φ₀(G) − Φ₀(x₀)} = (1/‖Φ₀‖) inf_{‖Φ‖=1} {sup Φ(G) − Φ(x₀)} ≤ inf_{‖Φ‖=1} {sup Φ(G) − Φ(x₀)},

and therefore

sup (Φ₀/‖Φ₀‖)(G) − (Φ₀/‖Φ₀‖)(x₀) = inf_{‖Φ‖=1} {sup Φ(G) − Φ(x₀)},

which shows that (5.36) is satisfied also for Φ₀ replaced by Φ₀/‖Φ₀‖, i.e., that in 2° we may assume (5.39). Consequently, in 3°, too, we may assume (5.39) (because in the above proof of the implication 2° ⇒ 3° we have used the same Φ₀). ∎
Remark 5.8. When G is a bounded convex set, the equivalence 1° ⇔ 3° of Theorem 5.4 admits the following geometric interpretation: for an element z₀ ∈ CG we have z₀ ∈ P_CG(x₀) if and only if there exists Φ₀ ∈ X* with ‖Φ₀‖ = 1 such that the quasi-support hyperplane

H_{Φ₀, sup Φ₀(G)} = {y ∈ X | Φ₀(y) = sup Φ₀(G)}  (5.40)

satisfies

dist(x₀, H_{Φ₀, sup Φ₀(G)}) = inf_{H∈H_G} dist(x₀, H),  (5.41)

‖x₀ − z₀‖ = dist(x₀, H_{Φ₀, sup Φ₀(G)});  (5.42)

or, equivalently (by Corollary 1.1), for an element z₀ ∈ CG we have z₀ ∈ P_CG(x₀) if and only if there exists a hyperplane H₀ ∈ H_G satisfying

dist(x₀, H₀) = inf_{H∈H_G} dist(x₀, H),  (5.43)

‖x₀ − z₀‖ = dist(x₀, H₀)  (5.44)

(see Figure 5.5). Indeed, by x₀ ∈ G, ‖Φ₀‖ = 1, and Lemma 1.5,

dist(x₀, H_{Φ₀, sup Φ₀(G)}) = |Φ₀(x₀) − sup Φ₀(G)| = sup Φ₀(G) − Φ₀(x₀).  (5.45)

Figure 5.5.

Remark 5.9. (a) When G is unbounded, there exists (by the uniform boundedness principle) Φ₀ ∈ X* such that sup Φ₀(G) = +∞, so then H_{Φ₀, sup Φ₀(G)} = ∅; hence in this case, (5.41) does not hold (since its left-hand side is +∞, while its right-hand side is finite).

(b) The above proof of the implication 2° ⇒ 3° shows that for each pair z₀, Φ₀ as in 2° of Theorem 5.4, we have

Φ₀(z₀) = sup Φ₀(G).  (5.46)

(c) When G is a bounded convex set, (5.46) is equivalent to

z₀ ∈ H_{Φ₀, sup Φ₀(G)}.  (5.47)
Now we shall give some examples in the plane R² endowed with the Euclidean norm

‖x‖ := √(|x₁|² + |x₂|²)  (x = (x₁, x₂) ∈ R²),  (5.48)

showing that various parts of 2° and 3° above cannot be omitted.

Example 5.1. Let X = R² with the norm (5.48), G = {y ∈ R² | ‖y‖ < 1}, and x₀ = 0. Then for the element z₀ = (2, 0) ∈ int CG and the function Φ₀ ∈ (R²)* defined by

Φ₀(x) = x₁  (x = (x₁, x₂) ∈ R²),  (5.49)

we have ‖Φ₀‖ = 1 and sup Φ(G) = ‖Φ‖ (Φ ∈ (R²)*), whence sup Φ₀(G) − Φ₀(x₀) = 1 = inf_{Φ∈X*, ‖Φ‖=1} {sup Φ(G) − Φ(x₀)}, so Φ₀ satisfies (5.36); but ‖x₀ − z₀‖ = 2 ≠ 1 = dist(x₀, CG), so z₀ ∉ P_CG(x₀) (here (5.35) fails, since z₀ ∈ int CG).

Theorem 5.5. Let X be a normed linear space, G an open convex subset of X with 0 ∈ G, x₀ ∈ G, and z₀ ∈ CG. The following statements are equivalent:
1°. ‖x₀ − z₀‖ = dist(x₀, CG).
2°. There exists Ψ₀ ∈ X*\{0} satisfying

‖x₀ − z₀‖ = (1 − Ψ₀(x₀))/‖Ψ₀‖,  (5.50)

(1 − Ψ₀(x₀))/‖Ψ₀‖ = inf_{Φ∈X*\{0}, Φ(g)≤1 (g∈G)} (1 − Φ(x₀))/‖Φ‖.  (5.51)

3°. There exists Ψ₀ ∈ X*\{0} satisfying (5.50), (5.51), and

Ψ₀(g) < 1  (g ∈ G),  (5.52)

Ψ₀(z₀) = 1.  (5.53)
Proof. 1° ⇒ 3°. If 1° holds, then by Theorem 5.4 and Remark 5.9(b), there exists Φ₀ ∈ X*\{0} satisfying (5.39), (5.36), (5.38), and (5.46). Since G is an open set, by 0 ∈ G and (5.39) we have sup Φ₀(G) > 0. Let

Ψ₀ := (1/sup Φ₀(G)) Φ₀.  (5.54)

Then by (5.39), Ψ₀ ≠ 0. Furthermore, since G is an open set, by Lemma 1.8 we have (5.52). Also, by (5.46), we have (5.53). Hence, by 1°, Theorem 5.3, (5.52), and (5.53), we obtain

‖x₀ − z₀‖ = dist(x₀, CG) = inf_{Φ∈X*\{0}, Φ(g)≤1 (g∈G)} (1 − Φ(x₀))/‖Φ‖ = (1 − Ψ₀(x₀))/‖Ψ₀‖,

whence (5.50) and (5.51).

The implication 3° ⇒ 2° is obvious.

2° ⇒ 1°. Assume now that Ψ₀ ∈ X*\{0} satisfies (5.50) and (5.51). Then by (5.50), (5.51), and Theorem 5.3, we obtain

‖x₀ − z₀‖ = (1 − Ψ₀(x₀))/‖Ψ₀‖ = inf_{Φ∈X*\{0}, Φ(g)≤1 (g∈G)} (1 − Φ(x₀))/‖Φ‖ = dist(x₀, CG). ∎

Now we shall study the existence of elements z₀ ∈ CG for which the dist in the left-hand side of (5.2) is attained (i.e., of elements of best approximation z₀ ∈ P_CG(x₀)).

Definition 5.1. We shall call an optimal dual solution, or, briefly, optimal function (with respect to the pair (CG, x₀)), any function Φ₀ ∈ X* with ‖Φ₀‖ = 1 for which the inf in the right-hand side of (5.2) is attained (i.e., any Φ₀ ∈ X* satisfying (5.39) and (5.36)).

Theorem 5.6. Let X be a normed linear space, G a convex subset of X with int G ≠ ∅, and x₀ ∈ G. We have P_CG(x₀) ≠ ∅ if and only if there exists an optimal dual solution Φ₀ ∈ X* such that

CG ∩ {y ∈ X | ‖y − x₀‖ = sup Φ₀(G) − Φ₀(x₀)} ≠ ∅.  (5.55)

Proof. The condition means that there should exist z₀ ∈ CG and Φ₀ ∈ X* satisfying (5.39), (5.36), and (5.38), so the result follows from Theorem 5.4, implication 3° ⇒ 1°. ∎

Remark 5.10. By Corollary 1.1, Lemma 1.5, and x₀ ∈ G, when G is a bounded convex set, a function Φ₀ ∈ X* with ‖Φ₀‖ = 1 is an optimal dual solution if and only if the hyperplane H₀ = H_{Φ₀, sup Φ₀(G)} ∈ H_G defined by (5.40) satisfies (5.41); we shall call any such hyperplane an optimal hyperplane. Then Theorem 5.6 admits the following geometric interpretation: we have P_CG(x₀) ≠ ∅ if and only if there exists an optimal hyperplane H₀ ∈ H_G such that

CG ∩ H₀ ∩ bd B(x₀, dist(x₀, H₀)) ≠ ∅.  (5.56)

Since H₀ ∩ bd B(x₀, dist(x₀, H₀)) = P_{H₀}(x₀), condition (5.56) can also be written in the form

CG ∩ P_{H₀}(x₀) ≠ ∅.  (5.57)

Now we shall show that if X is reflexive and G is an open convex subset of X, then Theorem 5.6 and Remark 5.10 can be improved; namely, conditions (5.55)–(5.57) can be omitted.

Theorem 5.7. Let X be a reflexive Banach space, G an open convex subset of X, and x₀ ∈ G. We have P_CG(x₀) ≠ ∅ if and only if there exists an optimal dual solution Φ₀ ∈ X* (or, equivalently, an optimal hyperplane).

Proof. The necessity of the condition follows from Theorem 5.6. Conversely, assume now that there exists an optimal dual solution Φ₀ ∈ X* (so we have (5.39) and (5.36)), and let H₀ be the hyperplane defined by (5.40). Then, since X is reflexive, P_{H₀}(x₀) ≠ ∅. Let z₀ ∈ P_{H₀}(x₀). Then, since G is open, by Remark 5.2(b) we have z₀ ∈ H₀ ⊆ CG. Also, by Lemma 1.5 applied to the hyperplane (5.40), ‖z₀ − x₀‖ = dist(x₀, H₀) = |sup Φ₀(G) − Φ₀(x₀)|. But since x₀ ∈ G, we have sup Φ₀(G) − Φ₀(x₀) ≥ 0, so ‖z₀ − x₀‖ = sup Φ₀(G) − Φ₀(x₀). Consequently, by Theorem 5.4, implication 3° ⇒ 1°, we obtain z₀ ∈ P_CG(x₀). ∎

Finally, we shall summarize the connections between the existence of elements of best approximation and the existence of optimal dual solutions. To this end, we shall denote by A_CG(x₀) the set of all optimal dual solutions (with respect to the pair (CG, x₀)).

Theorem 5.8. Let X be a normed linear space, G a convex subset of X with int G ≠ ∅, and x₀ ∈ G. Then:
(a) P_CG(x₀) ≠ ∅ ⇒ A_CG(x₀) ≠ ∅;
(b) A_CG(x₀) ≠ ∅ ⇏ P_CG(x₀) ≠ ∅;
(c) if X is reflexive and G is open, then P_CG(x₀) ≠ ∅ ⇔ A_CG(x₀) ≠ ∅;
(d) it may happen that both P_CG(x₀) = ∅ and A_CG(x₀) = ∅, even in the Hilbert space l².

Proof. (a) is an obvious consequence of Theorem 5.4 (or of Theorem 5.6). (c) is nothing other than Theorem 5.7. Finally, (b) and (d) are proved by the following two examples, which complete the proof of Theorem 5.8:
Example 5.4. Let X = l¹, let

G = {y = (y_n) ∈ l¹ | Σ_{n=1}^∞ (n/(n+1)) |y_n| < 1},  (5.58)

and let x₀ = 0. Then clearly, G is convex, and

B(0, 1) = {y = (y_n) ∈ l¹ | Σ_{n=1}^∞ |y_n| < 1} ⊆ G ⊆ B(0, 2),  (5.59)

so CG ⊆ CB(0, 1) = {y ∈ l¹ | ‖y‖ ≥ 1}. Also, ((n+1)/n) e_n ∈ CG (n = 1, 2, ...), where e_n denotes the nth unit vector. Hence, by ‖((n+1)/n) e_n‖ = (n+1)/n → 1, we obtain

dist(x₀, CG) = 1.  (5.60)

Furthermore, by (5.58), for each y = (y_n) ∈ CG we have

‖x₀ − y‖ = ‖y‖ = Σ_{n=1}^∞ |y_n| > Σ_{n=1}^∞ (n/(n+1)) |y_n| ≥ 1,  (5.61)

whence by (5.60), it follows that P_CG(x₀) = ∅. However, the function Φ₀ ∈ X* defined by

Φ₀(y) := Σ_{n=1}^∞ (n/(n+1)) y_n  (y = (y_n) ∈ l¹)

satisfies ‖Φ₀‖ = 1, and by (5.60) and Theorem 5.1,

sup Φ₀(G) − Φ₀(x₀) = sup_{y∈G} Σ_{n=1}^∞ (n/(n+1)) y_n = 1 = dist(x₀, CG) = inf_{‖Φ‖=1} {sup Φ(G) − Φ(x₀)},

so Φ₀ ∈ A_CG(x₀).
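A small numeric sketch (my own illustration, not part of the book) makes the non-attainment in Example 5.4 tangible: the points z_n = ((n+1)/n) e_n lie in CG with l¹-norms (n+1)/n decreasing to 1, while every point of CG has norm strictly greater than 1.

```python
from fractions import Fraction

# Hypothetical check of Example 5.4 in l^1: G = {y : sum_n (n/(n+1))|y_n| < 1},
# x0 = 0.  The points z_n = ((n+1)/n) e_n lie in CG and ||z_n||_1 -> 1, so
# dist(0, CG) = 1, yet the infimum is not attained.
def z_n_norm(n):
    # ||((n+1)/n) e_n||_1: a single nonzero coordinate
    return (n + 1) / n

def z_n_in_CG(n):
    # z_n is in CG iff sum_k (k/(k+1)) |(z_n)_k| >= 1; exact rational arithmetic
    return Fraction(n, n + 1) * Fraction(n + 1, n) >= 1

norms = [z_n_norm(n) for n in (1, 10, 100, 1000)]
print(all(z_n_in_CG(n) for n in (1, 10, 100, 1000)))  # True
print(norms)  # approaching, but never reaching, the distance 1
```

Exact rational arithmetic is used for the membership test because the defining sum for z_n equals exactly 1, which floating point could misclassify.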
Example 5.5. Let X = l², let

G = {y = (y_n) ∈ l² | Σ_{n=1}^∞ (n²/(n+1)²) y_n² < 1},  (5.62)

and let x₀ = 0. Then clearly, G is convex, and we have again (5.59). Also, ((n+1)/n) e_n ∈ CG (n = 1, 2, ...), whence again, we have (5.60). Furthermore, by (5.62), for each y = (y_n) ∈ CG we have

‖x₀ − y‖² = ‖y‖² = Σ_{n=1}^∞ y_n² > Σ_{n=1}^∞ (n²/(n+1)²) y_n² ≥ 1,

whence by (5.60), it follows that P_CG(x₀) = ∅. Consequently, since X is reflexive, by Theorem 5.7 we have A_CG(x₀) = ∅. ∎
Unperturbational Duality for Reverse Convex Infimization
Given a locally convex space X, with conjugate space X*, a convex subset G of X, and a function f: X ^^ R, in this chapter we shall give some results of unperturbational duality for the primal "reverse convex infimization" problem ( n
= (PGJ)
«' = < j
= inf / ( C G ) .
(6.1)
Any zo ^ CG for which the inf in (6.1) is attained, i.e., such that /(zo) = min/(CG), is called an optimal solution of problem (P^); these will be studied in Chapter 7. Taking G' := CG, one can also write (6.1) as the infimization problem
a'-
^Mf{G').
However, now we shall obtain dual problems that are different from the "usual" dual problems to convex and quasi-convex infimization problems (see Chapter 1, Section 1.4). In contrast to the cases of convex and quasi-convex infimization, it will turn out that for reverse convex infimization the theory of surrogate duality is more developed (see Sections 6.1-6.3) than the theory of Lagrangian duality (see Section 6.4). Our starting point for the study of surrogate duality will be the observation that best approximation by reverse convex sets CG may be regarded as a particular case of reverse convex infimization, by taking X to be a normed linear space, XQ e X, and / : X -> /? the convex function (1.264); indeed, then inf/(CG) = dist(xo,CG),
(6.2)
170
6. Unperturbational Duality for Reverse Convex Infimization
and for this case, the optimal solutions zo e CG of problem (P^) are the elements of best approximation of XQ by CG. Although the extension from the particular function / of (1.264) to a function f:X -^ /? on a locally convex space X, is a rather big step, it turns out that, similarly to the case of passing from best approximation by convex sets to convex optimization, many results and methods of the theory of best approximation by reverse convex sets can be extended to results on the reverse convex infimization of functions. In analogy to the fact that formula (1.249) on the distance to a convex set extends to the surrogate duality formula (1.330) on quasi-convex infimization, it is natural to expect that formula (5.9) on the distance to a reverse convex set will extend, under certain assumptions on G and / , to a formula like inf/(CG)=
inf
OeX*\{0}
inf
f(y),
xeX
(6.3)
obtained formally by replacing in (5.9) the function f of (1.264) by a function f on a locally convex space X; this will be achieved in Section 6.1. Next, corresponding to formula (1.355) on infimization, one would like to replace the hyperplanes {y ∈ X : Φ(y) = sup Φ(G)} of (6.3) by other sets, e.g., closed half-spaces. Therefore, in Section 6.2, we shall consider "unconstrained surrogate dual problems" to problem (P′) of (6.1), defined as infimization problems of the form

β′ = inf λ′(X*\{0}),    (6.4)

where X*\{0} is the dual set (unconstrained), and λ′ = λ′_{G,f}: X*\{0} → ℝ̄ is a function (the dual objective function, depending on G and f) of the form

λ′(Φ) = inf f(Ω_{G,Φ})    (Φ ∈ X*\{0}),    (6.5)

with

{Ω_{G,Φ}}    (Φ ∈ X*\{0})    (6.6)

being a family of subsets of X, related in some way to G.
Problem (6.4), with λ′ of (6.5), is an unperturbational dual problem to (P′), since it is defined directly, without using the method of first embedding (P′) into a family of perturbed primal problems, and it is a surrogate dual problem to (P′), since it replaces the primal constraint set G of (6.1) by a family of "surrogate constraint sets" Ω_{G,Φ} ⊆ X (Φ ∈ X*\{0}) (while it keeps the primal objective function f unchanged). Next, more generally, in view of further applications, given an arbitrary set X, a subset G of X, and a function f: X → ℝ̄, for the infimization problem (P′) of (6.1) we shall consider in Section 6.3 a "surrogate dual problem" of the form

β′ = β′_{G,f} = inf λ′(W),    (6.7)
where W = W_{G,f} is a set (the dual constraint set) and λ′ = λ′_{G,f}: W → ℝ̄ is the function (the dual objective function) defined by

λ′_{G,f}(w) = inf f(Ω_{G,w})    (w ∈ W),    (6.8)

with {Ω_{G,w}}_{w∈W} being a family of subsets of X, related in some way to G. Then taking X to be a locally convex space, W = X*\{0}, and λ′ of (6.8), problem (6.7) reduces to problem (6.4). Furthermore, taking X to be a locally convex space, W ⊆ X*\{0} or W ⊆ (X*\{0}) × ℝ, and λ′ of (6.8), we shall obtain some useful unconstrained and "constrained" surrogate dual problems to problem (P′) of (6.1). Actually, as in Chapter 3, instead of {Ω_{G,w}}_{w∈W}, we shall find it more convenient to use the equivalent language of polarities Δ: 2^X → 2^W. In Section 6.4 we shall deal with unperturbational Lagrangian dual problems to problem (P′) of (6.1). Finally, the general dual problem (6.7) will permit us to study (unconstrained and constrained) surrogate duality for more structured primal reverse convex infimization problems (i.e., in which the primal constraint set G is expressed in more structured ways), by considering suitable dual constraint sets W and dual objective functions λ′ = λ′_{G,f}: W → ℝ̄ as in (6.8) (see Section 6.5).

Remark 6.1. This chapter is devoted to unperturbational duality results only, since until the present there exists no perturbational duality theory for reverse convex infimization corresponding to those for convex infimization (see Chapter 1, Section 1.4.2) and convex supremization (see Chapter 3, Section 3.4.2). Similar to (1.383), we have inf f(CG) = inf f̃(X), where f̃ = f + χ_{CG}; but the theory of Chapter 1 cannot be applied directly to this function f̃, since for a convex set G, in general χ_{CG} is not convex. Another attempt could be to note that

inf f(CG) = inf (f + χ_{CG})(X) = inf (f − (−χ_{CG}))(X),    (6.9)
and hence to develop a perturbational theory for infimization problems inf f̃(X), with f̃ of the form f̃ = f − h. In Chapter 8 we shall present a perturbational duality theory for such problems, but only when h is convex, so that theory cannot be applied to h = −χ_{CG}, where G is convex, i.e., to reverse convex infimization, since in general −χ_{CG} is not convex (however, note that it is quasi-convex when G is convex, since S_d(−χ_{CG}) is either G or X, for all d ∈ ℝ).
6.1 Some hyperplane theorems of surrogate duality

Let us start with a generalization of Chapter 5, Remark 5.1.

Remark 6.2. If G is a subset of a locally convex space X with int G = ∅, and f: X → ℝ̄ is an upper semicontinuous function, then

inf f(CG) = inf f(X).    (6.10)
Indeed, then by Lemmas 1.1 and 1.2, we have

inf f(CG) = inf f(cl(CG)) = inf f(C(int G)) = inf f(X).
We have the following hyperplane theorem of surrogate duality, generalizing the (equivalent) geometric form (5.9) of Chapter 5, Theorem 5.1.

Theorem 6.1. Let X be a locally convex space, G a convex subset of X, and f: X → ℝ̄ a function.

(a) If f is upper semicontinuous, then

inf f(CG) ≤ inf_{Φ∈X*\{0}} inf_{y∈X: Φ(y)=sup Φ(G)} f(y).    (6.11)
(b) If f is quasi-convex, int G ≠ ∅, and

inf f(G) ≤ inf f(CG),    (6.12)

then

inf f(CG) ≥ inf_{Φ∈X*\{0}} inf_{y∈X: Φ(y)=sup Φ(G)} f(y).    (6.13)
(c) If f is upper semicontinuous and quasi-convex, int G ≠ ∅, and if (6.12) holds, then (6.3) holds.

Proof. If G = X, then both sides of (6.11), (6.13), and (6.3) are +∞ (since inf ∅ = +∞). Thus, we may assume that G ≠ X.

(a) If int G = ∅, then (6.11) holds by Remark 6.2. If int G ≠ ∅, let Φ ∈ X*\{0} and

H := {y ∈ X : Φ(y) = sup Φ(G)}.    (6.14)

If sup Φ(G) = +∞, then H = ∅, whence inf f(CG) ≤ inf f(H) = +∞. If sup Φ(G) < +∞, then H ∩ int G = ∅, so H ⊆ C(int G) = cl(CG), whence, since f is upper semicontinuous, by Lemma 1.1,

inf f(CG) = inf f(cl(CG)) ≤ inf_{y∈X: Φ(y)=sup Φ(G)} f(y),

whence, since Φ ∈ X*\{0} with sup Φ(G) < +∞ was arbitrary, we obtain (6.11).

(b) Assume, a contrario, that

inf f(CG) < inf_{Φ∈X*\{0}} inf_{y∈X: Φ(y)=sup Φ(G)} f(y) =: d.    (6.15)

Then by (6.15) and (6.12), there exist x₀ ∈ CG and g₀ ∈ G such that

f(x₀) < d,    f(g₀) < d.    (6.16)

Since G is convex and int G ≠ ∅, by the separation theorem there exists Φ₀ ∈ X*\{0} such that

sup Φ₀(G) ≤ Φ₀(x₀).    (6.17)

But since the function φ: [0, 1] → ℝ defined by φ(η) := Φ₀(ηx₀ + (1 − η)g₀) is continuous, and φ(0) = Φ₀(g₀) ≤ sup Φ₀(G), φ(1) = Φ₀(x₀) ≥ sup Φ₀(G), there exists η₀ ∈ [0, 1] such that

Φ₀(η₀x₀ + (1 − η₀)g₀) = sup Φ₀(G).    (6.18)

Consequently, by the definition (6.15) of d, by (6.18), the quasi-convexity of f, and (6.16), we obtain

d ≤ inf_{y∈X: Φ₀(y)=sup Φ₀(G)} f(y) ≤ f(η₀x₀ + (1 − η₀)g₀) ≤ max{f(x₀), f(g₀)} < d,

which is impossible. This proves (6.13).

(c) This follows from (a) and (b).    □
Remark 6.3. (a) By inf ∅ = +∞, (6.3) is equivalent to

inf f(CG) = inf_{Φ∈X*\{0}: sup Φ(G)<+∞} inf_{y∈X: Φ(y)=sup Φ(G)} f(y).    (6.19)

Formula (6.19) admits the following geometric interpretation: We have

inf f(CG) = inf_{H∈ℋ_G} inf f(H),    (6.20)

where ℋ_G denotes the family of all hyperplanes in X that quasi-support the set G. This is another instance of the "reduction principle" (it permits one to reduce the computation of inf f(CG) to the computation of inf f(H), for all H ∈ ℋ_G).

(b) The first infimum in the right-hand sides of (6.3) and (6.20) need not be attained, even in the particular case that X is the Hilbert space ℓ² and f is a continuous convex function of the form

f(x) = ‖x₀ − x‖    (x ∈ X),    (6.21)

where x₀ ∈ G, as shown by Example 5.5.

(c) The condition Φ ≠ 0 in (6.3) cannot be omitted, unless inf f(CG) = inf f(X); indeed, for Φ₀ = 0 we have {y ∈ X : Φ₀(y) = sup Φ₀(G)} = X, so if we allow also Φ = 0 in the right-hand side of (6.3), then this right-hand side becomes inf f(X).

(d) The assumption (6.12) is equivalent to

inf f(G) = inf f(X).    (6.22)
Indeed, if (6.12) holds, then inf f(G) = min{inf f(G), inf f(CG)} = inf f(X), and conversely, the fact that (6.22) implies (6.12) is obvious.

(e) In the particular case of best approximation, i.e., when X is a normed linear space and f is the function (6.21), where x₀ ∈ G, we have inf f(G) = dist(x₀, G) = 0, so the assumption (6.12) is satisfied.

The assumption (6.12) in Theorem 6.1(c) cannot be omitted, as shown by the following example.

Example 6.1. Let X be a normed linear space,

G = {x ∈ X : Φ₀(x) > 1},    (6.23)

where Φ₀ ∈ X*\{0} (so G is an open half-space that does not contain 0), and f the function (6.21) with x₀ = 0, that is,

f(x) = ‖x‖    (x ∈ X).    (6.24)

Then

inf f(G) = inf_{x∈G} ‖x‖ = 1/‖Φ₀‖ > 0 = inf_{x∈CG} ‖x‖ = inf f(CG),

so (6.12) is not satisfied. Furthermore, if Φ ∈ X*\{0} with sup Φ(G) < +∞, then by (6.23), we must have Φ = ηΦ₀ for some η ∈ ℝ, η < 0, whence

sup Φ(G) = sup(ηΦ₀)(G) = −inf(−η)Φ₀(G) = η.

Consequently, for any such Φ we have

inf_{y∈X: Φ(y)=sup Φ(G)} f(y) = inf_{y∈X: ηΦ₀(y)=η} f(y) = inf_{y∈X: Φ₀(y)=1} ‖y‖ = 1/‖Φ₀‖ > 0 = inf f(CG),
so (6.3) does not hold. The same conclusions hold also for the closed half-space G = {x ∈ X : Φ₀(x) ≥ 1}.

However, in the case when X is a normed linear space and G is also bounded, with int G ≠ ∅, the assumption (6.12) and the quasi-convexity of f can be omitted. Indeed, we have

Theorem 6.2. Let X be a normed linear space, G ≠ X a bounded convex subset of X with int G ≠ ∅, and f: X → ℝ̄ an upper semicontinuous function. Then we have (6.3).

Proof. The inequality ≤ in (6.3) holds by Theorem 6.1(a). In order to prove the opposite inequality, let x ∈ CG. Then by Theorem 1.3, there exists Ψ ∈ X*\{0} satisfying Ψ(x) = sup Ψ(G). Then

inf_{Φ∈X*\{0}} inf_{y∈X: Φ(y)=sup Φ(G)} f(y) ≤ inf_{y∈X: Ψ(y)=sup Ψ(G)} f(y) ≤ f(x),

whence, since x ∈ CG was arbitrary, we obtain the inequality ≥ in (6.3) and hence the equality (6.3).    □
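The duality gap of Example 6.1 can be observed numerically. A minimal sketch of our own two-dimensional instance (names ours): X = ℝ², Φ₀ = (2, 0), G = {x : Φ₀·x > 1}, and f = ‖·‖; the primal value is 0, while every admissible hyperplane yields the dual value 1/‖Φ₀‖.

```python
import math

# Our finite-dimensional instance of Example 6.1: an unbounded (half-space)
# constraint set produces a genuine duality gap in formula (6.3).
phi0 = (2.0, 0.0)
norm_phi0 = math.hypot(*phi0)

# primal value: 0 lies in CG = {x : phi0 . x <= 1}, so inf f(CG) = 0
primal = 0.0

# dual value: sup Phi(G) is finite only for Phi = eta*phi0 with eta < 0, and
# every such Phi gives the same constraint set, the hyperplane {phi0 . y = 1};
# the infimum of ||y|| over that hyperplane is 1/||phi0||
dual = 1.0 / norm_phi0

print(primal, dual)   # 0.0 0.5 -- the gap predicted by Example 6.1
```

By contrast, Theorem 6.2 guarantees that no such gap can occur once G is a bounded convex body, as in the disk instance sketched after (6.3).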
6.2 Unconstrained surrogate dual problems for reverse convex infimization

While in the preceding section we have been concerned with "hyperplane theorems" of surrogate duality, now we want also to consider other types of surrogate dual results for reverse convex infimization, e.g., "half-space theorems." To this end, as mentioned at the beginning of this chapter, we shall consider for the infimization problem (P′) of (6.1) a "surrogate dual problem" of the form (6.7), where W = W_{G,f} is a set (the dual constraint set) and λ′ = λ′_{G,f}: W → ℝ̄ is the function (the dual objective function) defined by (6.8), with {Ω_{G,w}}_{w∈W} being a family of subsets of X related in some way to G.

Remark 6.4. In the sequel, when considering a reverse convex infimization problem (P′) = (P′_{G,f}), we shall assume, without any special mention, that G ≠ X (since for G = X we have α′ = inf f(CG) = inf f(∅) = +∞). As in Chapter 3, we shall express the surrogate duality results in the (equivalent) language of polarities Δ: 2^X → 2^W.

Remark 6.5. (a) Using (3.39) and (1.144), the dual objective function λ′ of (6.8) becomes

λ′(w) = inf f(CΔ′({w})) = inf_{x∈X: w∈CΔ({x})} f(x)    (w ∈ W),    (6.25)
where Δ = Δ_G: 2^X → 2^W is a polarity (depending on G, but not on f). Then by (6.7) and (6.25), the dual value (i.e., the value of the dual problem) becomes

β′ = inf_{w∈W} inf f(CΔ′({w})) = inf_{w∈W} inf_{x∈X: w∈CΔ({x})} f(x).    (6.26)
As has been observed in Remark 3.3(a), formulas (3.40) and (3.39) yield a one-to-one correspondence between families of subsets {Ω_w}_{w∈W} of X and polarities Δ: 2^X → 2^W, so the two languages (6.8), (6.7) and (6.25), (6.26) are equivalent ways of expressing the dual objective function λ′ and the dual value β′. In the sequel we shall choose the language (6.25), (6.26), since this will allow us, by using (1.140), to express the results, e.g., on the relations between the primal and dual problems, in a more concise way. Thus in particular, in this section we shall consider unconstrained surrogate dual problems (6.4) to (P′), with the dual objective function being of the form

λ′(Φ) = inf f(CΔ′({Φ})) = inf_{x∈X: Φ∈CΔ({x})} f(x)    (Φ ∈ X*\{0}),    (6.27)

where Δ = Δ_G: 2^X → 2^{X*\{0}} is a polarity (depending on G). Then by (6.4) and (6.27), the dual value (i.e., the value of the dual problem) will be
β′ = inf_{Φ∈X*\{0}} inf f(CΔ′({Φ})) = inf_{Φ∈X*\{0}} inf_{x∈X: Φ∈CΔ({x})} f(x).    (6.28)
(b) If there exists w₀ ∈ W such that CΔ′({w₀}) = ∅, then by (6.25), we have λ′(w₀) = inf ∅ = +∞. Consequently, by (6.26),

β′ = inf_{w∈G′} inf f(CΔ′({w})),    (6.29)

where

G′ := {w ∈ W : CΔ′({w}) ≠ ∅}.    (6.30)
(c) We have

β′ = inf_{w∈W} inf_{x∈X: w∈CΔ({x})} f(x) = inf_{w∈W} inf_{x∈dom f: w∈CΔ({x})} f(x).    (6.31)

Indeed, (6.31) follows from (6.26) and

inf_{x∈(C(dom f))∩CΔ′({w})} f(x) = +∞    (w ∈ W).    (6.32)
(d) In the particular case of Theorems 6.1 and 6.2, we have W = X*\{0}, and by (6.6) and (3.39), the surrogate constraint sets are

CΔ′({Φ}) = {y ∈ X : Φ(y) = sup Φ(G)}    (Φ ∈ X*\{0}),    (6.33)

where Δ = Δ_G: 2^X → 2^{X*\{0}} is the polarity of (1.166), and the dual objective function is

λ′(Φ) = inf_{y∈X: Φ(y)=sup Φ(G)} f(y)    (Φ ∈ X*\{0}).    (6.34)

Note that for the primal problems of (3.1) and (6.1), the surrogate constraint sets (3.44) and (6.33) are the same, so the corresponding dual objective functions coincide on X*\{0}, and the only difference between the dual values is that in (3.42) we take sup_{Φ∈X*\{0}}, while in (6.28) there occurs inf_{Φ∈X*\{0}}. Moreover, in the sequel we shall use the same special polarities Δ as those used in Section 3.2.

We shall first give some necessary and sufficient conditions on G, f, and Δ in order that α ≤ β′ or α ≥ β′ or α = β′, where α ∈ ℝ̄ is arbitrary, in terms of the level sets S_d(f) and A_d(f) of (1.22) and (1.23).

Proposition 6.1. Let X, W be two sets, f: X → ℝ̄ a function, Δ: 2^X → 2^W a polarity, and α ∈ ℝ̄. The following statements are equivalent:
1°. We have

α ≤ β′ = inf_{w∈W} inf f(CΔ′({w})).    (6.35)

2°. We have

A_d(f) ∩ CΔ′({w}) = ∅    (w ∈ W, d ∈ ℝ, d < α).    (6.36)

3°. We have

S_d(f) ∩ CΔ′({w}) = ∅    (w ∈ W, d ∈ ℝ, d < α).    (6.37)

4°. We have

A_α(f) ∩ CΔ′({w}) = ∅    (w ∈ W).    (6.38)
Proof. 2° ⇔ 1°. By Lemma 3.4, condition 2° is equivalent to

inf f(CΔ′({w})) ≥ d    (w ∈ W, d ∈ ℝ, d < α),    (6.39)

i.e., to inf f(CΔ′({w})) ≥ α (w ∈ W), which is equivalent to 1°. Finally, the equivalence 2° ⇔ 3° follows from the inclusions

A_d(f) ⊆ S_d(f) ⊆ A_{d′}(f)    (d, d′ ∈ ℝ, d < d′),    (6.40)

and the equivalence 2° ⇔ 4° follows from A_α(f) = ∪_{d∈ℝ: d<α} A_d(f).    □
Proposition 6.2. Let X, W be two sets, f: X → ℝ̄ a function, Δ: 2^X → 2^W a polarity, and α ∈ ℝ̄. The following statements are equivalent:

1°. We have

α ≥ β′ = inf_{w∈W} inf f(CΔ′({w})).    (6.41)

2°. For each d ∈ ℝ, d > α, there exists w_d ∈ W such that

A_d(f) ∩ CΔ′({w_d}) ≠ ∅.    (6.42)

3°. For each d ∈ ℝ, d > α, there exists w_d ∈ W such that

S_d(f) ∩ CΔ′({w_d}) ≠ ∅.    (6.43)
Proof. 1° ⇒ 2°. If 1° holds and d ∈ ℝ, d > α ≥ β′ = inf_{w∈W} inf f(CΔ′({w})), then there exists w_d ∈ W such that d > inf f(CΔ′({w_d})), whence by Lemma 3.4, we obtain (6.42). The implication 2° ⇒ 3° is obvious.

3° ⇒ 1°. If d ∈ ℝ, d > α, and w_d ∈ W satisfy (6.43), say, x_d ∈ S_d(f) ∩ CΔ′({w_d}), then

β′ = inf_{w∈W} inf f(CΔ′({w})) ≤ inf f(CΔ′({w_d})) ≤ f(x_d) ≤ d;

hence β′ ≤ inf_{d>α} d = α. On the other hand, if there exists no d ∈ ℝ such that d > α, then β′ ≤ +∞ = α.    □
Combining Propositions 6.1 and 6.2, we obtain the following result:
Theorem 6.3. Let X, W be two sets, f: X → ℝ̄ a function, Δ: 2^X → 2^W a polarity, and α ∈ ℝ̄. The following statements are equivalent:

1°. We have

α = β′ = inf_{w∈W} inf f(CΔ′({w})).    (6.44)

2°. We have (6.36), and for each d ∈ ℝ, d > α, there exists w_d ∈ W satisfying (6.42).

3°. We have (6.37), and for each d ∈ ℝ, d > α, there exists w_d ∈ W satisfying (6.43).

Let us give now, for the case α = α′ of (6.1), some convenient sufficient conditions in order that α′ ≤ β′ or α′ ≥ β′ or α′ = β′, involving only G and Δ, but not f. To this end, we shall need some preparation.

Lemma 6.1. Let X, W be two sets. Then for any polarity Δ: 2^X → 2^W and any set P ⊆ W,

∪_{w∈P} CΔ′({w}) = CΔ′(P).    (6.45)

Consequently, for β′ of (6.26) we have

β′ = inf f(CΔ′(W)).    (6.46)
Proof. By (1.140) (applied to Δ′),

∪_{w∈P} CΔ′({w}) = C(∩_{w∈P} Δ′({w})) = CΔ′(P),

which proves (6.45). Hence by (6.26), Lemma 3.7 of Chapter 3 applied to {A_i}_{i∈I} = {CΔ′({w})}_{w∈W}, and (6.45), we obtain

β′ = inf_{w∈W} inf f(CΔ′({w})) = inf f(∪_{w∈W} CΔ′({w})) = inf f(CΔ′(W)).    □
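Since Lemma 6.1 holds for arbitrary polarities on arbitrary sets, (6.45) and (6.46) can be verified exhaustively on a finite toy model. A sketch with our own choices of X, W, a relation ρ, and f, using the relation-induced polarity Δ(A) = {w : (x, w) ∈ ρ for all x ∈ A} and its dual Δ′:

```python
from itertools import combinations

# Toy finite model of a polarity (sets, relation rho, and f are illustrative
# choices of ours, not from the text).
X = {0, 1, 2, 3}
W = {'a', 'b', 'c'}
rho = {(0, 'a'), (0, 'b'), (1, 'a'), (2, 'b'), (2, 'c'), (3, 'c')}

def delta_prime(P):
    # Delta'(P) = {x : (x, w) in rho for all w in P}
    return {x for x in X if all((x, w) in rho for w in P)}

f = {0: 5.0, 1: 2.0, 2: 7.0, 3: 1.0}

# (6.45): the union of the sets C Delta'({w}), w in P, equals C Delta'(P)
ok = True
for r in range(len(W) + 1):
    for Pt in combinations(sorted(W), r):
        P = set(Pt)
        lhs = set().union(*(X - delta_prime({w}) for w in P)) if P else set()
        ok = ok and (lhs == X - delta_prime(P))
print(ok)   # True for every P in 2^W

# (6.46): iterating the infimum over w gives inf f over C Delta'(W)
beta_iterated = min(min((f[x] for x in X - delta_prime({w})),
                        default=float('inf')) for w in W)
beta_direct = min((f[x] for x in X - delta_prime(W)), default=float('inf'))
print(beta_iterated, beta_direct)   # 1.0 1.0
```

The empty-family case P = ∅ works out because Δ′(∅) = X, so both sides of (6.45) reduce to ∅.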
Corollary 6.1. Let X, W be two sets, G a subset of X, f: X → ℝ̄ a function, and Δ: 2^X → 2^W a polarity. If

G = Δ′(W),    (6.47)

then we have the "weak duality equality" α′ = β′, that is,

inf f(CG) = inf_{w∈W} inf f(CΔ′({w})).    (6.48)

Proof. If (6.47) holds, then CG = CΔ′(W), whence by (6.46), we obtain (6.48).    □
Proposition 6.3. Let X, W be two sets, G a subset of X, f: X → ℝ̄ a function, and Δ: 2^X → 2^W a polarity.

(a) If we have

G ⊆ Δ′(W),    (6.49)

then

α′ = inf f(CG) ≤ inf_{w∈W} inf f(CΔ′({w})) = β′.    (6.50)
(b) If X is a topological space, f: X → ℝ̄ is upper semicontinuous, and

int G ⊆ Δ′(W)    (6.51)

(where int G denotes the interior of G), then we have (6.50).

Proof. (a) If (6.49) holds, then CG ⊇ CΔ′(W), whence by (6.46), we obtain (6.50).

(b) If (6.51) holds, then by (1.20) and (6.51), cl(CG) = C(int G) ⊇ CΔ′(W). Hence if f is upper semicontinuous, then by (6.46) and inf f(CG) = inf f(cl(CG)) (see Lemma 1.1), we obtain (6.50).    □

Remark 6.6. (a) The following conditions are equivalent:

1°. We have (6.49).

2°. We have

W ⊆ Δ(G).    (6.52)

3°. We have

Δ′Δ(G) ⊆ Δ′(W).    (6.53)

Indeed, 1° ⇔ 2° by (1.144). Furthermore, 2° ⇒ 3°, since Δ′ is antitone. Finally, if 3° holds, then G ⊆ Δ′Δ(G) ⊆ Δ′(W), so 3° ⇒ 1°.

(b) If Δ: 2^X → 2^W is a polarity such that

Δ′(W) = ∅,    (6.54)
then (6.49) implies that G = ∅, for which (6.50) is trivial.

(c) If Δ: 2^X → 2^W is a polarity satisfying (6.54), then condition (6.51) implies that int G = ∅.

(d) It is well known and immediate (see, e.g., [254], p. 194, Remark 6.3(a)) that we have (6.54) if and only if the empty set ∅ is Δ′Δ-convex (i.e., for each x ∈ X there exists w ∈ W such that x ∉ Δ′({w})), or equivalently, Δ({x}) ≠ W (x ∈ X).

Proposition 6.4. Let X, W be two sets, G a subset of X, f: X → ℝ̄ a function, and Δ: 2^X → 2^W a polarity. If G is Δ′Δ-convex, then

G ⊇ Δ′(W),    (6.55)

and hence

α′ = inf f(CG) ≥ inf_{w∈W} inf f(CΔ′({w})) = β′.    (6.56)
Proof. By definition, G is Δ′Δ-convex if and only if

∀x ∈ CG, ∃w ∈ W: G ⊆ Δ′({w}), x ∈ CΔ′({w}).    (6.57)

Hence in particular, in this case

∀x ∈ CG, ∃w ∈ W: x ∈ CΔ′({w});

that is, we have, using also Lemma 6.1,

CG ⊆ ∪_{w∈W} CΔ′({w}) = CΔ′(W),    (6.58)

which is equivalent to (6.55). Also, clearly (6.58) implies (6.56).    □
Remark 6.7. If Δ: 2^X → 2^W is a polarity satisfying (6.54), then we have (6.56).

Theorem 6.4. Let X, W be two sets, G a subset of X, f: X → ℝ̄ a function, and Δ: 2^X → 2^W a polarity.

(a) If (6.49) holds and G is Δ′Δ-convex, then we have (6.48).

(b) If X is a topological space, f: X → ℝ̄ is an upper semicontinuous function, and Δ: 2^X → 2^W is a polarity such that

int G ⊆ Δ′(W) ⊆ G,    (6.59)

then we have (6.48).

Proof. (a) If (6.49) holds and G = Δ′Δ(G), then by Proposition 6.4, G = Δ′(W), and hence by Corollary 6.1, we obtain (6.48).

(b) This follows by combining Proposition 6.3(b) and the implication (6.55) ⇒ (6.56).    □

Now we shall give a sufficient condition for strong duality in terms of the limiting case d = α.

Theorem 6.5. Let X, W be two sets, G a subset of X, f: X → ℝ̄ a function, and Δ: 2^X → 2^W a polarity satisfying (6.38), with α = α′ = inf f(CG). If there exists w₀ ∈ W such that

S_α(f) ∩ CΔ′({w₀}) ≠ ∅,    (6.60)

then

inf f(CG) = min_{w∈W} inf f(CΔ′({w})) = inf f(CΔ′({w₀})).    (6.61)

Proof. If x₀ ∈ S_α(f) ∩ CΔ′({w₀}), then by (6.38) and Proposition 6.1, we obtain

α ≤ β′ = inf_{w∈W} inf f(CΔ′({w})) ≤ inf f(CΔ′({w₀})) ≤ f(x₀) ≤ α,    (6.62)

whence (6.61) (with the min being attained for w₀ ∈ W).    □
Remark 6.8. (a) By (6.38), for any x₀ ∈ S_α(f) ∩ CΔ′({w₀}) we have f(x₀) = α, i.e.,

S_α(f) ∩ CΔ′({w₀}) ⊆ S_α(f)\A_α(f).    (6.63)

Moreover, since α = α′ = inf f(CG), formula (6.61) shows that if we have (6.60), then every x₀ ∈ S_α(f) ∩ CΔ′({w₀}) is an optimal solution both of the primal problem (P′) of (6.1) and of the "surrogate primal problem"

α′_{CΔ′({w₀}),f} := inf f(CΔ′({w₀})).    (6.64)

(b) By the above proof, in Theorem 6.5 condition (6.38) can be replaced by any other condition ensuring that α = inf f(CG) ≤ β′, e.g., condition (6.49) or, when X is a topological space and f is upper semicontinuous, condition (6.51).

The condition of Theorem 6.5 is not necessary in order to have (6.61), as shown by the following example:

Example 6.2. Let X be a normed linear space, W = X*\{0},

G = {x ∈ X : ‖x‖ < 1},    (6.65)

f the function (6.24), and Δ the polarity Δ¹ defined by (1.160), so

CΔ′({Φ}) = {x ∈ X : Φ(x) > ‖Φ‖}    (Φ ∈ X*\{0}).    (6.66)
Then α = inf_{x∈CG} ‖x‖ = 1, A_α(f) = {x ∈ X : ‖x‖ < 1}, S_α(f) = {x ∈ X : ‖x‖ ≤ 1}, and inf_{x∈CΔ′({Φ})} ‖x‖ = 1 (Φ ∈ X*\{0}), so we have (6.38) and (6.61), but not (6.60) (since Φ(x) ≤ ‖Φ‖ ‖x‖ ≤ ‖Φ‖ for all Φ ∈ X*\{0}, x ∈ S_α(f)).

Concerning simultaneous characterizations of optimal solutions of (P′) and of weak duality α′ = β′, we can prove the following theorem:

Theorem 6.6. Let X, W be two sets, G a subset of X, f: X → ℝ̄ a function, and Δ: 2^X → 2^W a polarity. For an element x₀ ∈ CG and for α = α′ = inf f(CG), the following statements are equivalent:

1°. x₀ is an optimal solution of (P′) (i.e., f(x₀) = min f(CG)) and α = β′.

2°. We have

A_d(f) ∩ CΔ′({w}) = ∅    (w ∈ W, d ∈ ℝ, d < f(x₀)),    (6.67)

and for each d ∈ ℝ, d > α, there exists w_d ∈ W satisfying (6.42).

3°. We have

S_d(f) ∩ CΔ′({w}) = ∅    (w ∈ W, d ∈ ℝ, d < f(x₀)),    (6.68)

and for each d ∈ ℝ, d > α, there exists w_d ∈ W satisfying (6.43).
Proof. 1° ⇒ 2°. If 1° holds, then f(x₀) = inf f(CG) = α, and hence by Theorem 6.3, we have 2°.

2° ⇒ 1°. Assume 2°. Then by (6.67) and Proposition 6.1 (with α = f(x₀)), we have f(x₀) ≤ β′. Furthermore, by the second condition of 2° and by Proposition 6.2, we have (6.41) with α = inf f(CG). Hence by x₀ ∈ CG, we obtain

α = inf f(CG) ≤ f(x₀) ≤ β′ ≤ α,    (6.69)

whence 1°. Finally, the proof of the equivalence 1° ⇔ 3° is similar.    □
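Example 6.2 is easy to probe numerically in the plane. A minimal sketch of our own two-dimensional instance (names ours): G the open unit disk, f = ‖·‖, and surrogate sets CΔ′({Φ}) = {x : Φ·x > ‖Φ‖}; the surrogate infimum equals α = 1 but is approached from outside the closed ball, so (6.60) fails while (6.61) holds.

```python
import math, random

def in_surrogate(phi, x):
    # x in C Delta'({phi})  iff  phi . x > ||phi||
    return phi[0] * x[0] + phi[1] * x[1] > math.hypot(*phi)

random.seed(0)
phi = (3.0, 4.0)              # a nonzero functional with ||phi|| = 5
unit = (phi[0] / 5.0, phi[1] / 5.0)

# inf ||x|| over the surrogate set is 1 but never attained: points
# r * phi/||phi|| belong to the set exactly when r > 1
print(in_surrogate(phi, (1.000001 * unit[0], 1.000001 * unit[1])))  # True
print(in_surrogate(phi, (0.9 * unit[0], 0.9 * unit[1])))            # False

# (6.60) fails: by Cauchy-Schwarz no point of the closed unit ball
# S_alpha(f) satisfies phi . x > ||phi||
hits = 0
for _ in range(5000):
    a, r = random.uniform(0, 2 * math.pi), random.uniform(0, 1)
    if in_surrogate(phi, (r * math.cos(a), r * math.sin(a))):
        hits += 1
print(hits)   # 0
```

The random sampling only illustrates the Cauchy-Schwarz bound; the exact statement is the inequality chain Φ(x) ≤ ‖Φ‖‖x‖ ≤ ‖Φ‖ used in the text.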
In the remainder of this section we shall assume, without any special mention, that X is a locally convex space with conjugate space X*, and G ⊆ X (with G ≠ X), and we shall apply the preceding results to the special polarities Δ^i = Δ^i_G: 2^X → 2^{X*\{0}} (i = 1, 2, 3, 4) of Chapter 1, Section 1.2.

(1) For the polarity Δ¹: 2^X → 2^{X*\{0}} defined by (1.160), we have (1.161), and hence the dual objective function (6.27) and the dual value (6.28) become

λ′_{Δ¹}(Φ) = inf f(C(Δ¹)′({Φ})) = inf_{x∈X: Φ(x)>sup Φ(G)} f(x)    (Φ ∈ X*\{0}),    (6.70)
β′_{Δ¹} = inf_{Φ∈X*\{0}} inf f(C(Δ¹)′({Φ})) = inf_{Φ∈X*\{0}} inf_{x∈X: Φ(x)>sup Φ(G)} f(x).    (6.71)
Remark 6.9. (a) For Φ ∈ X*\{0} such that sup Φ(G) = +∞, we have {x ∈ X : Φ(x) > sup Φ(G)} = ∅, whence inf_{x∈X: Φ(x)>sup Φ(G)} f(x) = +∞. Therefore, the inf_{Φ∈X*\{0}} in (6.71) can be replaced by inf_{Φ∈G^b}, where G^b is the "barrier cone" (1.347) of G. A similar remark is valid also for some of the subsequent results, but for simplicity, in the sequel we shall use only inf_{Φ∈X*\{0}}.

(b) By (6.31) applied to Δ = Δ¹, we have

β′_{Δ¹} = inf_{Φ∈G^b} inf_{x∈dom f: Φ(x)>sup Φ(G)} f(x).    (6.72)
(c) By Proposition 6.3(a) applied to Δ = Δ¹, we have

inf f(CG) = α′ ≤ inf_{Φ∈X*\{0}} inf_{x∈X: Φ(x)>sup Φ(G)} f(x) = β′_{Δ¹}.    (6.73)
Theorem 6.7. Let X be a locally convex space, G a closed convex subset of X (with G ≠ X), and f: X → ℝ̄ a function. Then

inf f(CG) = inf_{Φ∈X*\{0}} inf_{x∈X: Φ(x)>sup Φ(G)} f(x).    (6.74)

Proof. By (1.164) and (1.163), we have

G = c̄o G = (Δ¹)′Δ¹(G) = (Δ¹)′(X*\{0}),

and hence by Corollary 6.1 for Δ_G = Δ¹, we obtain (6.74).    □
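The half-space formula (6.74) can also be sampled numerically. A minimal sketch of our own instance (names ours): X = ℝ², G = [−1, 1]² (closed and convex), f = ‖·‖; here sup Φ(G) is the support function of the square, and the infimum of ‖x‖ over the open half-space {x : Φ·x > sup Φ(G)} is the distance from the origin to its boundary hyperplane.

```python
import math

# Our instance of Theorem 6.7: G = [-1, 1]^2, f(x) = ||x||.
def sup_over_G(phi):
    # support function of the square [-1, 1]^2
    return abs(phi[0]) + abs(phi[1])

def inf_f_on_halfspace(phi):
    # inf of ||x|| over {x : phi . x > sup phi(G)} equals the distance
    # from 0 to the boundary hyperplane {x : phi . x = sup phi(G)}
    return sup_over_G(phi) / math.hypot(*phi)

# primal value: dist(0, CG) = 1 (nearest exterior points lie just beyond
# the edge midpoints of the square)
primal = 1.0

# dual value of (6.74) over sampled directions phi != 0
dual = min(inf_f_on_halfspace((math.cos(t), math.sin(t)))
           for t in [2 * math.pi * k / 720 for k in range(720)])
print(primal, dual)   # 1.0 1.0 -- the two sides of (6.74) agree
```

The minimizing directions are the coordinate axes, whose half-spaces touch the square along whole edges.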
Remark 6.10. Theorem 6.7 is a "half-space theorem of surrogate duality," since the surrogate constraint sets Ω_{G,Φ} = {x ∈ X : Φ(x) > sup Φ(G)} (Φ ∈ X*\{0}) of (6.74) are open half-spaces. For upper semicontinuous f, one may also work with the corresponding closed half-spaces:

Theorem 6.8. Let X be a locally convex space, G a closed convex subset of X (with G ≠ X), and f: X → ℝ̄ an upper semicontinuous function. Then

inf f(CG) = inf_{Φ∈X*\{0}} inf_{x∈X: Φ(x)≥sup Φ(G)} f(x) = inf_{Φ∈X*\{0}} inf_{y∈X: Φ(y)>sup Φ(G)} f(y).    (6.75)
Proof. The first equality holds by Corollary 6.2 below, and the second equality holds by Lemma 1.1.    □

Remark 6.11. Theorem 6.8, too, is a "half-space theorem of surrogate duality," since the surrogate constraint sets {x ∈ X : Φ(x) ≥ sup Φ(G)} (Φ ∈ X*\{0}) of (6.75) are closed half-spaces.

(2) For the polarity Δ³: 2^X → 2^{X*\{0}} defined by (1.162), we have (1.163), and hence the dual objective function (6.27) becomes

λ′_{Δ³}(Φ) = inf f(C(Δ³)′({Φ})) = inf_{x∈X: Φ(x)=sup Φ(G)} f(x)    (Φ ∈ X*\{0}),    (6.76)